Developing an automated assessment tool
for children’s oral reading
Leen Cleuren
March 5, 2007
Overview
• SPACE project
– General objectives
– Development of Chorec
– Development of an automated assessment tool
• Doctoral research project (DRP)
– General objectives
– Development of a new computerized reading test
battery
SPACE – General objectives
• Explore the benefits of speech recognition
technology for the assessment of word
decoding skills:
– Automated assessment
– No examiner bias
SPACE – General objectives
• Explore the benefits of speech recognition
and speech synthesis technology for the
development of a remedial reading tool:
– Individual practice possible
– Appropriate, individually adapted feedback
SPACE – General objectives
in other words...
• Looking for a reading tutor that can make a
diagnosis
Demands for the speech recognizer:
→ Tracking the child’s progress during reading
→ Accurately detecting and classifying oral reading
errors and strategies
SPACE – General objectives
in other words...
• Looking for a reading tutor that can read to
or read along with the child, intervene
whenever reading errors occur, and give
appropriate feedback
Demands for the speech synthesizer:
→ Natural-sounding and highly intelligible speech
→ Ability to give different kinds of feedback
SPACE – Development of Chorec
• ASR within the context of reading
assessment and instruction is a very
challenging task:
e.g.
– Articulatory competence varies among young
children
– Oral reading can be fraught with reading errors
SPACE – Development of Chorec
• To improve the speech recognizer's ability to
accurately detect reading errors:
– statistical characterization of reading behavior is
necessary
– a model that contains information on the nature and
prevalence of likely reading errors is needed
→ Chorec = Children’s Oral REading Corpus
= Database of recorded and annotated children’s oral
reading and oral reading errors and strategies
SPACE – Development of Chorec
• Recordings
– 256 regular elementary school children (grade 1-4)
– 150 children with known reading disabilities
(elementary school age)
– Words, pseudowords, stories
• Annotations
– Different annotation layers containing different
information
[Figure: example annotation layers for the word "strand": speech signal (2 microphones), reading strategy, and reading error, with the transcription "stAnt"]
SPACE – Development of Chorec
Classification of reading strategies
• correct reading:
– correct direct word recognition within the 1st trial
    To read: doos
    Child reads: doos
– repeating a directly recognized word once or more
    To read: doos
    Child reads: doos ... doos
– partially or completely (in)correctly spelling out
a word before correctly synthesizing it
    To read: doos
    Child reads: d...oo...s doos
    To read: doos
    Child reads: b...oo... doos
– incorrect direct word recognition in the first
trial but reading it correctly in the final trial
    To read: doos
    Child reads: boos ... doos
SPACE – Development of Chorec
Classification of reading strategies
• incorrect reading:
– incorrect direct word recognition within the 1st trial
    To read: doos
    Child reads: boos
– partially or completely (in)correctly spelling out a word and
incorrectly synthesizing or not synthesizing it at all
    To read: doos
    Child reads: b...oo...s boos
    Child reads: d...oo...s ...
– direct recognition of a word within the first trial but
‘correcting’ it wrongly on a second trial
    To read: doos
    Child reads: doos ... boos
– omission or insertion of a word
    To read: De doos staat op de tafel.
    Child reads: De doos staat op tafel.
    Child reads: De doos staat op de grote tafel.
– asking for a complete or partial prompt of a word
before carrying on reading it
SPACE – Development of Chorec
Classification of errors
• paragraph level:
– omission or repetition of a whole line or sentence
• sentence level:
– omission or repetition of a part of a sentence
– erroneous insertion of a word
– change of word order
– substitution of a word by a synonym or semantically related word
• word level:
– wrong decoding strategy
– wrong direct word recognition strategy
• grapheme level:
– sequential errors, substitution errors
– deletion errors, insertion errors
Girl, 1st grade, regular elementary school, 3+4 syl. words
SPACE – Development of an automated
assessment tool
• Work done by ESAT
• Demo
Overview
• SPACE project
– General objectives
– Development of Chorec
– Development of an automated assessment tool
• Doctoral research project (DRP)
– General objectives
– Development of a new computerized reading test
battery
DRP – General objectives
• Development of a new computerized test
battery for the assessment of children’s
word decoding skills
• Analysis of the quantitative and qualitative
development of reading errors and
strategies in elementary school children
with and without reading disabilities
DRP – Development of a test battery
• Achievements
• Research objectives
• Participants
• Data collection
• Speed-accuracy trade-off problem
DRP – Development of a test battery
Achievements
• Development of a computerized reading
tutor assessment platform (see demo)
• Development of a test battery to assess
elementary school children’s word
decoding skills
DRP – Development of a test battery
Achievements
• Word and pseudoword reading test (WRT +
PWRT)
– 3 lists of (pseudo)words: 1 syl., 2 syl., 3+4 syl.
• Words: oog, water, omdraaien
• Pseudowords: eem, ulen, ometuif
• Story reading test (SRT)
– 9 graded text stories: AVI 1 – AVI 9
DRP – Development of a test battery
Research objectives
• Standardization of the WRT and PWRT
– Speed-accuracy trade-off problem
• Reliability assessment and validation of the
WRT and PWRT
• Looking for an alternative measure to
capture reading fluency
DRP – Development of a test battery
Participants
• 256 regular elementary school children (grade 1-4)
– 124 boys
– 132 girls
• Mother tongue = Dutch
• No grade repetition or grade skipping
[Figure: bar chart of the number of boys and girls per grade (Gr 1 to Gr 4); bar values: 37, 31, 28, 36, 33, 31, 32, 28]
DRP – Development of a test battery
Data collection
• Questionnaire for parents and teachers
– Teacher:
• Reading instruction method used?
• Child’s AVI-level?
• Child’s school history?
• RD present?
– Parents:
• RD present?
• Child’s name, birth place, birth date, nationality, (previous) residence
• Languages spoken by the child
• Chorec audio recordings: children reading WRT, PWRT, SRT
• Administration of One-Minute-Test, Klepel, AVI-test
DRP – Speed-accuracy trade-off
Distribution of speed (2nd grade)
• No classification variable
→ Lognormal distribution:
– Mean: 106 ms
– Std. Dev.: 53 ms
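If the mean and standard deviation above are read as raw-scale summary statistics (an assumption; the slide does not say on which scale they were computed), the parameters of the underlying normal distribution of the log response times follow from the standard lognormal moment relations. A minimal sketch, where `lognormal_params` is a hypothetical helper name:

```python
import math

def lognormal_params(mean, sd):
    """Return (mu, sigma) of log(X) for a lognormal X with raw-scale mean and sd.

    Uses sigma^2 = ln(1 + sd^2 / mean^2) and mu = ln(mean) - sigma^2 / 2.
    Assumption: mean and sd describe X itself, not log(X).
    """
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)

# Values taken from the slide (2nd grade, no classification variable).
mu, sigma = lognormal_params(106.0, 53.0)
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")
```

Working on the log scale in this way would make the speed data approximately normal, which is usually what a subsequent analysis of variance expects.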
DRP – Speed-accuracy trade-off
Distribution of speed (2nd grade)
[Figure: distribution of speed by task (WRT vs. PWRT)]
p < 0.05
R² = 0.12
DRP – Speed-accuracy trade-off
Distribution of speed (2nd grade)
[Figure: distribution of speed by list (1LG, 2LG, 3+4LG, 1LGP, 2LGP, 3+4LGP)]
p < 0.05
R² = 0.42
Not significant: 1LG-2LG, 3+4LG-2LGP, 1LGP-2LG
DRP – Speed-accuracy trade-off
Distribution of speed (2nd grade)
[Figure: distribution of speed by sex (boys vs. girls)]
p > 0.05
DRP – Speed-accuracy trade-off
Distribution of speed (2nd grade)
• No interaction
between:
– Sex
– Task
p > 0.05
DRP – Speed-accuracy trade-off
Distribution of #correct (2nd grade)
• No classification variable
– Mean: 32
– Std. Dev.: 8
DRP – Speed-accuracy trade-off
Distribution of #correct (2nd grade)
[Figure: distribution of #correct by task (WRT vs. PWRT)]
p < 0.05
R² = 0.31
DRP – Speed-accuracy trade-off
Distribution of #correct (2nd grade)
[Figure: distribution of #correct by list (1LG, 2LG, 3+4LG, 1LGP, 2LGP, 3+4LGP)]
p < 0.05
R² = 0.61
Not significant: 1LG-2LG, 3+4LG-1LGP
DRP – Speed-accuracy trade-off
Distribution of #correct (2nd grade)
[Figure: distribution of #correct by sex (boys vs. girls)]
p < 0.05
R² = 0.02
DRP – Speed-accuracy trade-off
Distribution of #correct (2nd grade)
• No interaction
between:
– Sex
– Task
p > 0.05
DRP – Development of a test battery
Speed-accuracy trade-off
• WRT + PWRT: partially speed and partially
accuracy tests
– Speed test without time limit
– Accuracy is important too
r = -0.65
• 1LG: -0.29, 1LGP: -0.44
• 2LG: -0.45, 2LGP: -0.43
• 3+4LG: -0.59, 3+4LGP: 0.25
DRP – Development of a test battery
Speed-accuracy trade-off
• Speed-accuracy trade-off!
– Very fast with low accuracy
– Very slow with high accuracy
• To perform as well as possible, various strategies
are possible:
– Optimize speed
– Optimize accuracy
– Optimize both
→ We need a measure that captures both!
DRP – Development of a test battery
Speed-accuracy trade-off
Score 1: total response time / #correct
Score 2: total response time / (1 - %error)
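The two candidate scores can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, response time is taken in seconds, and "%error" is interpreted as the proportion of incorrectly read items (between 0 and 1), which the slide does not specify.

```python
def score_time_per_correct(total_response_time, n_correct):
    """Score 1: total response time divided by the number of correct items."""
    if n_correct <= 0:
        raise ValueError("Score 1 is undefined when no item was read correctly")
    return total_response_time / n_correct

def score_time_per_accuracy(total_response_time, error_proportion):
    """Score 2: total response time divided by (1 - error proportion).

    Assumption: error_proportion is a fraction in [0, 1), not a percentage.
    """
    if not 0.0 <= error_proportion < 1.0:
        raise ValueError("error_proportion must lie in [0, 1)")
    return total_response_time / (1.0 - error_proportion)

# Hypothetical child: 40 items, 32 read correctly, 60 seconds in total.
n_items, n_correct, total_time = 40, 32, 60.0
error_proportion = (n_items - n_correct) / n_items
print(score_time_per_correct(total_time, n_correct))
print(score_time_per_accuracy(total_time, error_proportion))
```

On either score, a lower value means better performance: a fast but inaccurate reader and a slow but accurate reader are both penalized relative to a reader who is fast and accurate.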
DRP – Development of a test battery
Speed-accuracy trade-off
• Alternative: item response models?
• We have information on reading
performance at the level of the word
[Figure: timeline for the words paard, clown and huis, marking stimulus presentation and start of utterance]
DRP – Development of a test battery
Speed-accuracy trade-off
[Figure: curve for 1 item, plotting p against θ]
p = chance to be correct
θ = skill level
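The slide does not name a specific item response model. As one common possibility, the one-parameter logistic (Rasch) model relates p to θ through a single item difficulty; the sketch below assumes that model, and both the function name `p_correct` and the `difficulty` parameter are illustrative and not taken from the slides.

```python
import math

def p_correct(theta, difficulty):
    """Rasch (1PL) model: chance of a correct response to one item.

    theta      -- skill level of the child
    difficulty -- difficulty of the item (assumed parameter)
    """
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

# Chance of reading one item of difficulty 0 correctly at several skill levels.
for theta in (-2.0, 0.0, 2.0):
    print(f"theta = {theta:+.1f}  p = {p_correct(theta, 0.0):.3f}")
```

In this model p rises monotonically with θ and equals 0.5 when the skill level matches the item difficulty, which gives the S-shaped curve typical of a single item.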
Thank you!
• Questions?