US006714911B2

(12) United States Patent
     Waryas et al.

(10) Patent No.: US 6,714,911 B2
(45) Date of Patent: Mar. 30, 2004

(54) SPEECH TRANSCRIPTION AND ANALYSIS SYSTEM AND METHOD

(75) Inventors: Carol Waryas, San Antonio, TX (US); [additional inventor names illegible]; Laurie Labbe, San Antonio, TX (US)

(73) Assignee: Harcourt Assessment, Inc., San Antonio, TX (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 09/999,249
(22) Filed: Nov. 15, 2001

(65) Prior Publication Data
     US 2002/0120441 A1 Aug. 29, 2002

Related U.S. Application Data
(63) Continuation-in-part of application No. 09/769,776, filed on Jan. 25, 2001, which is a continuation-in-part of application No. 09/770,093, filed on Jan. 25, 2001.

(51) Int. Cl. G10L 21/06, G09B 21/00
(52) U.S. Cl. 704/271; 704/220
(58) Field of Search 704/503, 3, 276, 271, 270, 268, 267, 260, 256, 254, 249, 240, 231, 211, 200; 434/362, 185

(56) References Cited

U.S. PATENT DOCUMENTS
4,969,194 A    11/1990    Ezawa et al.
[number illegible]    [date illegible]    [name illegible] et al.
5,487,671 A    1/1996    [name illegible] et al.
5,562,453 A    10/1996    Wen
5,636,325 A    6/1997    Farrett
5,679,001 A    10/1997    Russell et al.
5,717,828 A    2/1998    Rothenberg
5,791,904 A    8/1998    Russell et al.
5,813,862 A    9/1998    Merzenich et al.
5,865,626 A    2/1999    Beattie et al.
(List continued on next page.)

FOREIGN PATENT DOCUMENTS
EP    0 360 909    4/1990
EP    0 504 927    9/1992
EP    1 089 246    4/2001
WO    99/13446     3/1999

OTHER PUBLICATIONS
Bernthal, John, et al. "Articulation and Phonological Disorders," 1998, Allyn & Bacon, 4th Edition, pp. 233-236, 292.*
(List continued on next page.)

Attorney, Agent, or Firm: Allen, Dyer, Doppelt, [remainder illegible]

(57) ABSTRACT

A transcription method uses a computerized process to prompt a student to produce at least one phoneme orally. Next a correct and at least one incorrect production of the phoneme are displayed. The therapist selects from among the productions based upon the student-produced phoneme. The system includes a processor and display to prompt a student to produce at least one phoneme orally and to display a correct and at least one incorrect production of the phoneme. The therapist then uses an input device in signal communication with the processor to select from among the displayed correct and incorrect productions based upon the student-produced phoneme, thus obviating the need for the therapist to enter the incorrect production symbol by symbol, unless it is desired to do so, or unless the actual production is not found among the displayed production selections.

37 Claims, 25 Drawing Sheets
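The selection step described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the names `Production`, `build_choices`, and `record_production` are hypothetical, and the sketch assumes productions are stored as IPA strings. It only shows the idea that the therapist picks one displayed production, and falls back to symbol-by-symbol entry when the actual production is not among the displayed options.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Production:
    """One displayed production option (hypothetical structure)."""
    ipa: str          # transcription shown to the therapist
    is_target: bool   # True for the correct production


def build_choices(target: str, incorrect: Sequence[str]) -> List[Production]:
    """Return the correct production followed by the incorrect ones."""
    return [Production(target, True)] + [Production(p, False) for p in incorrect]


def record_production(
    choices: Sequence[Production],
    selected_index: Optional[int] = None,
    manual_ipa: Optional[str] = None,
) -> Production:
    """Record what the student said.

    The therapist normally selects one of the displayed choices. Manual
    entry is the fallback for productions that are not displayed.
    """
    if selected_index is not None:
        if not 0 <= selected_index < len(choices):
            raise IndexError("selection is not among the displayed productions")
        return choices[selected_index]
    if manual_ipa:
        # A manually typed production counts as correct only if it equals the target.
        is_target = any(c.is_target and c.ipa == manual_ipa for c in choices)
        return Production(manual_ipa, is_target)
    raise ValueError("either a selection or a manual transcription is required")
```

A therapist interface built on this sketch would call `build_choices` once per stimulus and `record_production` once per response.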
[Representative drawing: the transcription flowchart reproduced as FIG. 6A, with steps from "enter student and therapist information" through "automatic analysis performed" (reference numerals 603 to 611 and 619 to 624).]
US 6,714,911 B2
Page 2

U.S. PATENT DOCUMENTS
5,927,988 A    7/1999     Jenkins et al.
6,009,397 A    12/1999    Siegel
6,019,607 A    2/2000     Jenkins et al.
6,030,226 A    2/2000     Hersh
6,055,498 A    4/2000     Neumeyer et al.
6,071,123 A    6/2000     Tallal et al.
6,077,085 A    6/2000     Parry et al.
6,113,393 A    9/2000     Neuhaus

OTHER PUBLICATIONS
Jackson, Peter, "Introduction to Expert Systems," 1999, Addison Wesley Longman Limited, 3rd Edition, pp. 207-210.*
Parrot Software User's Manual "Automatic Articulation Analysis 2000," Parrot Software, Inc.*
Shneiderman, John, "Designing the User Interface," 1998, Addison Wesley Longman Limited, 3rd Edition, pp. 82-83.*
American Speech-Language-Hearing Association, Technology 2000: Clinical Applications for Speech-Language Pathology, http://professional.asha.org/techiresources/tech2000/7.htm, pp. 1-7, 1996.
Additional Childes Tools, Childes Windows Tools, http://childes.psy.cmu.edu/html/wintools.html.
Sails, the Speech Assessment & Interactive Learning System (SAILS™) Using SAILS in Clinical Assessment and Treatment, http://www.propeller.net/react/sails2.htm, pp. 1-3.
GFTA-2: Goldman-Fristoe Test of Articulation-2, http://www.agsnet.com/templates/productviewip.asp?GroupID=a11750, pp. 1-3.
KLPA: Khan-Lewis Phonological Analysis, http://www.agsnet.com/templates/productviewip.asp?GroupID=a1820, pp. 1-2.
Bernthal, John E., and Bankson, Nicholas W. (Eds.), Articulation and Phonological Disorders, Fourth Edition, Chapter 9, Instrumentation in Clinical Phonology, by Julie J. Masterson, Steven H. Long, and Eugene H. Buder, 1998, pp. 378-406.
Masterson, Julie and Pagan, Frank, "Interactive System for Phonological Analysis User's Guide," pps 41, Harcourt Brace & Company, San Antonio, 1993.
Long, Steven H. and Fey, Marc E., "Computerized Profiling User's Manual," pps 119, Harcourt Brace & Company, San Antonio, 1993.
PictureGallery, http://www.psychcorp.com/catalogs/sla/sla014atpc.htm, pp. 1-2.
The Childes System, Child Language Data Exchange System, http://childes.psy.cmu.edu.

* cited by examiner
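U.S. Patent    Mar. 30, 2004    Sheet 1 of 25    US 6,714,911 B2

FIG. 1A. [Flowchart]
100  Open professional version of system
101  Provide access to database of records
102  Provide access to demographic data
103  Select problem speech sound
     Filter decision with "yes" and "no" branches: apply filter to limit set, or apply no filter
     Search database to create record set
108  Sort set of records into desired sequence
[Reference numerals 104 to 107 label the filter decision, the two filter branches, and the database search; the pairing of these numerals with boxes is not recoverable. The flow continues at an off-page connector.]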
U.S. Patent    Mar. 30, 2004    Sheet 2 of 25    US 6,714,911 B2

FIG. 1B. [Flowchart, continued from FIG. 1A; reference numerals 110 to 120 appear on the sheet]
Store/transmit (120)
Use personal version of system (110)
Select display style
Present record
Prompt student for pronunciation
Therapist scores pronunciation
Hear word pronounced? If so, broadcast word
Calculate aggregate score
Store current score
Calculate historical change
Calculate statistics
Present
U.S. Patent    Mar. 30, 2004    Sheet 4 of 25    US 6,714,911 B2

FIG. 3A. [Flowchart]
501  Select type of analysis
502  Present symbol to user
503  Prompt user to pronounce word/perform narration
504  Enter phonetic representation
505  Apply dialectical filter
507  Automatically categorize the error
508  If desired, display frequency spectrum

FIG. 3B. [Flowchart, continued from FIG. 3A]
509  If desired, broadcast correct pronunciation
511  Perform correlation of errors to make diagnosis
512  Issue report
513  Save error in database
514  Determine change over time
515  Issue report
516  Recommend therapeutic program
U.S. Patent    Mar. 30, 2004    Sheet 5 of 25    US 6,714,911 B2

FIG. 4. [Flowchart]
502'  Present symbol to user on display in communication with processor
503   Prompt user to pronounce word/perform narration
520   Enter phonetic representation into separate input device
521   Download entered phonetic representation into processor

FIG. 5. [Block diagram]
12  Student
10  System
    Operator input and storage device
11  Therapist
U.S. Patent    Mar. 30, 2004    Sheet 6 of 25    US 6,714,911 B2

FIG. 6A. [Flowchart]
601  Enter student and therapist information
602  Select type of administration

Phonemic profile branch:
603  Initiate phonemic profile
605  Present stimulus to student
     Present target and incorrect productions
608  Select among displayed options
609  Enter production in IPA
     Phonemic profile complete? (yes/no)
[Reference numerals 606 and 607 also appear in this branch; their pairing with boxes is not recoverable.]

Connected speech sample branch:
604  Initiate connected speech sample
619  Present stimulus to student
620  Determine intended target sentence
621  Enter target production
622  Conversion to IPA
623  Display target production
624  Edit to actual production

611  Automatic analysis performed
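U.S. Patent    Mar. 30, 2004    Sheet 7 of 25    US 6,714,911 B2

[Flowchart; figure label not legible. Reference numerals continue from FIG. 6A.]
612  Output analysis
613  Apply age and/or dialect filter
614  Output analysis with filter applied
     Prepare parent letter and treatment recommendations
     [Box text partly illegible:] stimuli, transcription, analysis
     Prepare report, letter, treatment recommendations
[Reference numerals 615, 617, and 618 label the last three boxes; the exact pairing is not recoverable.]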
U.S. Patent    Mar. 30, 2004    Sheet 10 of 25    US 6,714,911 B2

[Current Date]

Dear [Caregiver's Name]:

RE: [Client's First and Last Name]

[Client's First Name] was tested on [Date of Administration] with the Computerized Articulation and Phonology Evaluation System to see what sounds he/she [based on SEX] is able and unable to say. [Client's First Name] was shown, for example, a photograph of a shoe on the computer screen and was asked to tell me what the photo was. This was not a test to determine if [Client's First Name] knew the word, but rather how he/she [based on SEX] said it. The results of this evaluation indicate whether [Client's First Name] is able to say all the sounds that are expected at his/her [based on SEX] age. The evaluation also indicates whether [Client's First Name] is using sounds early for his/her [based on SEX] age [If the age filter is turned on and if the client is less than 10 years of age].

Here are [Client's First Name]'s results:

These are the sounds your child is able to say correctly, which are not expected at his/her [based on SEX] age: [If the age filter is turned on and if client is less than 10 years of age]
[For Example]
Sound    [column headings illegible; entries shown as check marks]

These are the sounds with which [Client's First Name] is having difficulty:
[For Example]
Sound    [column headings illegible; entries shown as check marks]

[All letters in the "Sound:" column should be in orthographic, small letters, not in CAPS]

Note: Results reflect application of the age filter [If age filter was used]
Results reflect application of the dialect filter [If dialect filter was used]
[Do not print any note text if filters were not used.]

[Print the following if there are no sounds with which the client is having difficulty:]
[Client]'s test results do not indicate that he/she [based on SEX] has difficulty with articulation and/or phonology at this time.

FIG. 9.
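The bracketed conditions in the letter template of FIG. 9 amount to a few small rules: pronouns depend on the client's sex, the "early sounds" passage prints only when the age filter is on and the client is under 10, filter notes print only for filters that were used, and the closing sentence prints only when no difficult sounds were found. The sketch below restates those rules under stated assumptions; the function names and the "F"/"M" coding of sex are hypothetical and do not come from the patent.

```python
from typing import List, Sequence, Tuple

# Hypothetical coding of the SEX field: (subject pronoun, possessive pronoun).
PRONOUNS = {"F": ("she", "her"), "M": ("he", "his")}


def pronouns(sex: str) -> Tuple[str, str]:
    """Resolve the template's he/she and his/her placeholders."""
    return PRONOUNS[sex.upper()]


def include_early_sounds(age_filter_on: bool, age_years: float) -> bool:
    """The 'sounds not expected at this age' passage needs both conditions."""
    return age_filter_on and age_years < 10


def filter_notes(age_filter_used: bool, dialect_filter_used: bool) -> List[str]:
    """Return the note lines; an empty list means no note text is printed."""
    notes = []
    if age_filter_used:
        notes.append("Results reflect application of the age filter")
    if dialect_filter_used:
        notes.append("Results reflect application of the dialect filter")
    return notes


def closing_sentence(first_name: str, sex: str, difficult_sounds: Sequence[str]) -> str:
    """Printed only when the client has no difficult sounds; otherwise empty."""
    if difficult_sounds:
        return ""
    subject, _ = pronouns(sex)
    return (
        f"{first_name}'s test results do not indicate that {subject} has "
        "difficulty with articulation and/or phonology at this time."
    )
```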
U.S. Patent    Mar. 30, 2004    Sheet 11 of 25    US 6,714,911 B2

Report Options
[Screen showing a tree of report choices; nesting below is inferred from the order of the items.]

Description of Client's Productions Reports
    Word Length Inventories
    Stress Pattern Inventories
    Word Shape Inventories
    Consonant Inventories
        Initial
            Consonants
                Count
                Count and Words
            Consonants By PVM Feature
                Single Feature
                    Place
                        Count
                        Count and Words
                    Voice
                    Manner
                    All
                        Count
                        Count and Words
                Two Features
                    Place-Manner
                        Count
                        Count and Words
                    Place-Voice
                    Voice-Place
                    Voice-Manner
                    Manner-Place
                    Manner-Voice
            Consonants By Nonlinear Feature
        Medial

Buttons: [illegible], < Back, Next >, Preview

FIG. 10.
U.S. Patent    Mar. 30, 2004    Sheet 12 of 25    US 6,714,911 B2

Treatment Suggestions for
[Client's First Name and Last Name]
[mm/dd/yyyy - Administration date]

This report is fairly general in its approach to final goal selection because there is considerable controversy in the field of speech-language pathology about articulation and phonology treatment. In addition, it is important to consider the whole client, in his or her environment, and in the context of other communication or learning needs and styles when determining the goals of treatment.

Be sure to select the option of an age comparison (this would have to be set in the Preferences) and/or a dialect filter (this would have to be set in the Demographics screen) if you choose to take into account those considerations in goal selection.

I. Word Shape Goals

Your client is having difficulty with the shape of words. The priority at this point in development should be to strengthen the basic word structures of language. The following is a list of word shapes to target in treatment:

[System displays word shapes with a percent match less than 60% from Comparison of Client's Production and Target Forms.]

CVCVC
CVC
CCVCC

When targeting these word shapes, it is advisable to use sounds that are in your client's current inventory. For example, if a child uses only CV syllables and [p], [m], and [n] word initially, you can create CVC words such as 'pop,' 'mop,' 'pan,' etc.

[System displays the following paragraph if no word shape has a match rate of less than 60%. Previous two paragraphs with the 60% or less match data are not printed.]

Your client shows adequate word shape development at the present time; however, there are many segmental substitutions. Treatment should focus on segments in all word positions. Consider three to four major sound classes varying in place, manner, and voicing.

Your client produced these sounds with a relatively high degree of accuracy in the noted positions:

[System displays consonant segments produced with at least 70% match in at least one word position from the segmental part of the Comparison of Client's Production and Target Forms.]

Initial: T (70%), C (95%)
Medial: T (100%)
Final: T (90%)
[Further example entries whose column is not recoverable: P (99%), P (76%), P (71%), G (85%), BL (75%)]

FIG. 11A.
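The two display rules stated in FIG. 11A (word shapes with a percent match below 60% become goals; consonant segments with at least 70% match in at least one word position are reported as accurate) can be sketched as simple filters. The data layout and function names below are assumptions made for illustration; the patent does not specify how the comparison results are stored.

```python
from typing import Dict, List


def word_shape_goals(match_by_shape: Dict[str, float], cutoff: float = 60.0) -> List[str]:
    """Word shapes whose percent match is strictly below the cutoff."""
    return [shape for shape, pct in match_by_shape.items() if pct < cutoff]


def accurate_segments(
    match_by_segment: Dict[str, Dict[str, float]], cutoff: float = 70.0
) -> Dict[str, Dict[str, float]]:
    """Segments with at least `cutoff` percent match in at least one position.

    Input maps segment -> {position: percent match}; only the qualifying
    positions are kept for each reported segment.
    """
    result: Dict[str, Dict[str, float]] = {}
    for segment, by_position in match_by_segment.items():
        kept = {pos: pct for pos, pct in by_position.items() if pct >= cutoff}
        if kept:
            result[segment] = kept
    return result


def shows_adequate_word_shapes(match_by_shape: Dict[str, float]) -> bool:
    """True when no word shape falls below 60%, which selects the alternate paragraph."""
    return not word_shape_goals(match_by_shape)
```

Note the asymmetry taken from the figure text: the word shape rule is "less than 60%", while the segment rule is "at least 70%", so a value exactly at the cutoff is excluded from the first list and included in the second.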
U.S. Patent    Mar. 30, 2004    Sheet 13 of 25    US 6,714,911 B2

[Print this paragraph if the 70% match rate does not apply. Previous paragraph and table are not printed.]

No segments meet the 70% criterion for match. For phonemes to utilize for word shape goals, select phonemes with the highest match below 70% (as identified in the analyses comparing client and target productions).

Your client seems to overuse the following phonemes. He/she should be encouraged to use sounds other than these:

[If results from Description of Client's Production indicate that one consonant segment occurs over 25 times in the word initial position, over 12 times in the medial position, or over 17 times in the final position. If none of the segments meet these criteria, omit this section including the previous paragraph.]

[Example table with Initial, Medial, and Final columns; example entries T, T, P]

[Print this on all IPE Level 1 Treatment Suggestions]
It is usually advisable at this stage of development to avoid targeting voiced stops in the final position unless your client already shows use of these sounds in the final position.

II. Segmental and/or Feature Goals

Your client might also benefit from treatment on one or two new sounds or features in the first period of treatment. The following is a list of sounds and features that were produced with less than 60% accuracy:

[From the segmental part of the Comparison of Client's Production and Target Forms:
a. List all target consonant segments and consonant sequences by position with percent match less than 60%.
b. List all place, voice, manner features with percent match less than 60%.
c. List all nonlinear features with percent match less than 60%.
**List all phonemes in IPA characters]

Segments
[Example table with Initial, Medial, and Final columns. Legible entries include k (50%) initial and k (55%) medial, with f, v, th, z, and sh at 0%, l and ch at 30%, and r at 0%; the remaining entries and their columns are illegible.]

FIG. 11B.
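The overuse rule printed in FIG. 11B is a per-position count threshold: a consonant segment is flagged when it occurs over 25 times word-initially, over 12 times medially, or over 17 times finally. A minimal sketch follows, assuming occurrence counts are available per position; the dictionary layout and the name `overused_segments` are hypothetical.

```python
from typing import Dict, List

# Thresholds as stated in the figure text ("over N times" is read as strictly greater).
OVERUSE_LIMITS = {"initial": 25, "medial": 12, "final": 17}


def overused_segments(counts: Dict[str, Dict[str, int]]) -> Dict[str, List[str]]:
    """Return overused segments per word position.

    `counts` maps position -> {segment: occurrences}. Positions with no
    overused segment are left out, so an empty result means the whole
    section of the report is omitted.
    """
    flagged: Dict[str, List[str]] = {}
    for position, limit in OVERUSE_LIMITS.items():
        segments = sorted(
            seg for seg, n in counts.get(position, {}).items() if n > limit
        )
        if segments:
            flagged[position] = segments
    return flagged
```

For example, a segment that occurs exactly 25 times in initial position is not flagged, because the figure says "over 25 times".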
U.S. Patent    Mar. 30, 2004    Sheet 14 of 25    US 6,714,911 B2

Place-Voice-Manner Features
[Example tables with Initial, Medial, and Final columns. Several percentages and some column placements are illegible; legible entries are listed.]

Place
Initial, Medial, and Final each list: Labiodental (0%), Dental (0%)
Further entries, column not recoverable: Palatal (25%), Palatal (34%), Velar (43%), Velar (percentage illegible)

Voice
Initial: (none)
Medial: Voiced (35%)
Final: Voiceless (59%)

Manner
Initial: Fricative (0%), Affricate (percentage illegible), Liquid (45%)
Medial: Fricative (0%), Affricate (percentage illegible), Liquid (45%)
Final: Fricative (9%), Affricate (percentage illegible), Liquid (percentage illegible)
Further entries, column not recoverable: Affricate Liquid (25%)

Nonlinear Features
Manner
[Feature entries such as Consonantal + appear in the Initial, Medial, and Final columns; the other feature names and the percentages are illegible.]

FIG. 11C.
U.S. Patent    Mar. 30, 2004    Sheet 15 of 25    US 6,714,911 B2

Oral Place
[Example table with Initial, Medial, and Final columns; several percentages are illegible.]
Initial: Labial Coronal Anterior + (15%)
Medial: Labial Coronal Anterior - (0%)
Final: Labial Coronal Anterior -+ (percentage illegible)
Further entry, column not recoverable: Labiodental + Dorsal Anterior - (10%)

Pharyngeal Place
[Example table with Initial, Medial, and Final columns; several percentages are illegible.]
Initial: Advanced Tongue Root + (58%)
Medial: Advanced Tongue Root - (13%)
Final: Advanced Tongue Root + (14%)
Further entries, column not recoverable: Advanced Tongue Root + (percentages partly illegible)

[For each sound listed under the Target mismatch sounds, system will check the Description of Client's Production results to see whether that sound was present or not. List those sounds that WERE used at least once in any position. If none of the sounds meet the criterion, omit this section including the following 3 paragraphs.]

The following target sounds were present in your client's phonetic inventory, which indicates that she or he is able to produce that sound or feature:

[Example table with Initial, Medial, and Final columns; example entries T, P]

If you choose to target these sounds in treatment, you might consider using minimal pairs in order to encourage further use of the sound or feature by providing a communicative incentive. Consult the Comparison of Client's Production and Target Forms Report to determine which sounds are used most often as substitutions for the target sound class and use those sounds in minimal pair contrasts.

Although the target sounds were present in the inventory, it is possible that additional drill and practice will be necessary to establish automaticity of production.

[For each sound listed under the Target mismatch sounds, system will check the Description of Client's Production results to see whether that sound was present or not. List those sounds that WERE NOT used at least once in any position. If none of the sounds meet the criterion, omit this section including the following 3 paragraphs.]

The following sounds were absent from your client's phonetic inventory:

Target

FIG. 11D.
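The two bracketed rules in FIG. 11D split the mismatched target sounds into those the client used at least once in any position and those never used. The sketch below shows that partition under the assumption that the client's productions are summarized as usage counts per sound and position; the function name and data layout are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def split_by_inventory(
    mismatched_targets: Iterable[str],
    usage_counts: Dict[str, Dict[str, int]],
) -> Tuple[List[str], List[str]]:
    """Partition mismatched targets into (present, absent).

    `usage_counts` maps sound -> {position: times produced}. A sound is
    present if it was used at least once in any position. An empty list
    means the corresponding report section is omitted.
    """
    present: List[str] = []
    absent: List[str] = []
    for sound in mismatched_targets:
        used = sum(usage_counts.get(sound, {}).values()) >= 1
        (present if used else absent).append(sound)
    return present, absent
```

Sounds in the first list are candidates for minimal-pair work, while sounds in the second list may first need to be taught, as the surrounding report text explains.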
U.S. Patent    Mar. 30, 2004    Sheet 16 of 25    US 6,714,911 B2

If you choose to target these sounds in treatment, you may first need to teach the client how to produce these sounds, and then work to establish automaticity. You may wish to conduct stimulability testing for these sounds and use the results to select the sounds to target first in treatment.

Target the sounds or features in word positions that the client already uses well (e.g., initial position if your client uses CV well). It is often advisable to address problematic sequences of sounds or features after sequences that show no mismatches.

[Print this on all IPE Level 1 Treatment Suggestions]
Choose sounds that differ in both place and manner and focus on sound class category (e.g., [v] and [s] as fricatives, rather than just [g], or [k] and [g] as velars, rather than just [k]) to establish a [illegible] basis for change.

Additionally, you may wish to consider whether other accurately produced sounds have features in common with the mismatched sounds. It may be possible to teach the new sounds by extension from the already produced sounds (e.g., from /t/ to /k/, from /s/ to /z/, etc.).

FIG. 11E.
U.S. Patent    Mar. 30, 2004    Sheet 17 of 25    US 6,714,911 B2

Treatment Suggestions for
[Client's First Name and Last Name]
[mm/dd/yyyy - Administration date]

This report is fairly general in its approach to final goal selection because there is considerable controversy in the field of speech-language pathology about articulation and phonology treatment. In addition, it is important to consider the whole client, in his or her environment, and in the context of other communication or learning needs and styles when determining the goals of treatment.

Be sure to select the option of an age comparison (this would have to be set in the Preferences) and/or a dialect filter (this would have to be set in the Demographics screen) if you choose to take into account those considerations in goal selection.

I. Word Shape Goals

Your client is having difficulty with more complex structures of words that involve consonant sequences (blends) and longer sequences of consonants and vowels. The following is a list of word shapes to target in treatment:

[System displays word shapes with a percent match less than 60% from Comparison of Client's Production and Target Forms.]

CVCVC
CVC
CCVCC

When targeting these word shapes, it is often advisable to use sounds that are in your client's current inventory.

[System displays the following paragraph if no word shape has a match rate of less than 60%. Previous two paragraphs with the 60% or less match data are not printed.]

Your client shows adequate word shape development at the present time; however, there are many segmental substitutions. Treatment should focus on segments in all word positions. Consider three to four major sound classes varying in place, manner, and voicing.

Your client produced these sounds with a relatively high degree of accuracy in the noted positions:

[System displays consonant segments produced with at least 70% match in at least one word position from the segmental part of the Comparison of Client's Production and Target Forms.]

Initial: T (79%), C (99%), P (100%)
Medial: T (99%), P (85%), G (70%), BL (85%)
Final: T (100%), P (100%)

FIG. 12A.
U.S. Patent    Mar. 30, 2004    Sheet 18 of 25    US 6,714,911 B2

[Print this paragraph if the 70% match rate does not apply. Previous paragraph and table are not printed.]

No segments meet the 70% criterion for match. For phonemes to utilize for word shape goals, select phonemes with the highest match below 70% (as identified in the analyses comparing client and target productions).

Your client seems to overuse the following phonemes. He/she should be encouraged to use sounds other than these:

[If results from Description of Client's Production indicate that one consonant segment occurs over 25 times in the word initial position, over 12 times in the medial position, or over 17 times in the final position. If none of the segments meet these criteria, omit this section including the previous paragraph.]

[Example table with Initial, Medial, and Final columns; entries illegible]

II. Segmental and/or Feature Goals

Your client also shows limitations in speech sound development. The following is a list of sounds and features that were produced with less than 60% accuracy:

[From the segmental part of the Comparison of Client's Production and Target Forms:
a. List all target consonant segments and consonant sequences by position with percent match less than 60%.
b. List all place, voice, and manner features with percent match less than 60%.
c. List all nonlinear features with percent match less than 60%.
**List all phonemes in IPA characters.]

1. Segments
[Example table with Initial, Medial, and Final columns. Legible entries include k (50%) and g (59%) initial, k (55%) medial, and f (0%) and z (0%) final, with v and th at 0%, ch and l at 30%, and r at 0%; the remaining entries and their columns are illegible.]

FIG. 12B.