14th International GALA conference, Thessaloniki, 14-16 December 2007
Behavioural scales of language proficiency: insights from the use of the Common European Framework of Reference
Spiros Papageorgiou
University of Michigan
English Language Institute
Testing and Certification Division
www.lsa.umich.edu/eli
Outline
• Background
• Aims
• Data collection
• Data analysis
• Results
• Implications
Background
• Advent of the CEFR: increased interest in behavioural scales of language proficiency
• Using the CEFR scales: Problems
  - Designing test specifications (Alderson et al., 2006)
  - Measuring progression in grammar (Keddle, 2004)
  - Describing the construct of vocabulary (Huhta & Figueras, 2004)
  - Designing proficiency scales (Generalitat de Catalunya, 2006)
Background (2)
• Using the CEFR scales: Criticism
  - Equivalence of tests constructed for different purposes (Fulcher, 2004b; Weir, 2005)
  - Danger of viewing a test as invalid because it does not claim relevance to the CEFR (Fulcher, 2004a)
  - Progression in language proficiency not based on SLA research but on judgements by teachers (cf. North, 2000; North & Schneider, 1998)
Aims of the study
• Investigation of three research questions:
  - Can users of the CEFR rank-order the scaled descriptors in the way they appear in the 2001 volume?
  - If differences in scaling exist between the users of the CEFR and the 2001 volume, why does this happen?
  - Can training contribute to more successful scaling?
Data collection
• 12 users of the scales acting as judges in relating
two language examinations to the CEFR
• Data collected during Familiarisation sessions
described in the Manual for relating examinations to
the CEFR
• Part of a doctoral thesis at Lancaster University
(Papageorgiou, 2007) and a research project at
Trinity College London
• Task: sort descriptors into the six levels
Data collection (2)
Number of judges per administration and resulting ratings

Descriptors   N     Sept 2005   Sept 2005   Nov    Feb    July   Ratings
                    (1st)       (2nd)       2005   2006   2006
Speaking      30    12          12          10     11     -      1350
Writing       25    12          12          10     11     -      1125
Listening     19    12          12          10     11     -      855
Reading       20    12          12          10     11     11     1120
Global        30    12          12          10     11     -      1350
Total         124                                                5800
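The Ratings column appears to be the number of descriptors multiplied by the total number of judges across administrations. The sketch below checks that reading against the figures on the slide; the variable and function names are illustrative, and recording a "-" as 0 judges is an assumption.

```python
# Counts transcribed from the data-collection table on this slide.
# Each entry: (number of descriptors, judges per administration).
# Administration order: Sept 2005 (1st), Sept 2005 (2nd), Nov 2005,
# Feb 2006, July 2006. A "-" in the table is recorded here as 0 (assumption).
table = {
    "Speaking": (30, [12, 12, 10, 11, 0]),
    "Writing": (25, [12, 12, 10, 11, 0]),
    "Listening": (19, [12, 12, 10, 11, 0]),
    "Reading": (20, [12, 12, 10, 11, 11]),
    "Global": (30, [12, 12, 10, 11, 0]),
}


def ratings(n_descriptors, judges):
    """Assume each judge rates every descriptor once per administration."""
    return n_descriptors * sum(judges)


totals = {scale: ratings(n, judges) for scale, (n, judges) in table.items()}
total_descriptors = sum(n for n, _ in table.values())
total_ratings = sum(totals.values())

for scale, value in totals.items():
    print(f"{scale}: {value}")
print(f"Total descriptors: {total_descriptors}")  # 124, as in the Total row
print(f"Total ratings: {total_ratings}")          # 5800, as in the Total row
```

Under this reading every row and both totals on the slide are reproduced, which suggests the table was extracted correctly.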
Data analysis
• Analysis: FACETS Rasch computer program
• 3 facets: descriptors, raters, occasions
• Rank-ordering of elements of facets on a common scale
• Fit statistics (Bond and Fox, 2001; McNamara, 1996)
  - Overfit: too predictable pattern
  - Misfit: more than expected variance
• Acceptable range of fit statistics
  - Descriptors: .4-1.2 (Linacre & Wright, 1994)
  - Raters: .5-1.5 (Weigle, 1998)
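The acceptable ranges above can be applied mechanically to each element's fit statistic. The following is a minimal sketch, assuming the statistic is a mean-square value and that the range boundaries are inclusive; the function name, the dictionary and the example values are hypothetical and do not come from the study.

```python
# Acceptable ranges as cited on the slide.
FIT_RANGES = {
    "descriptor": (0.4, 1.2),  # Linacre & Wright (1994)
    "rater": (0.5, 1.5),       # Weigle (1998)
}


def classify_fit(mean_square, facet):
    """Label a fit value relative to the acceptable range for its facet.

    Below the range: "overfit" (pattern too predictable).
    Above the range: "misfit" (more variance than expected).
    Boundaries are treated as acceptable (assumption).
    """
    low, high = FIT_RANGES[facet]
    if mean_square < low:
        return "overfit"
    if mean_square > high:
        return "misfit"
    return "acceptable"


# Hypothetical values, for illustration only.
examples = [("descriptor", 0.3), ("descriptor", 1.3), ("rater", 1.3)]
for facet, value in examples:
    print(facet, value, classify_fit(value, facet))
```

Note that the same value (1.3 here) is flagged for a descriptor but accepted for a rater, because the slide cites a wider range for raters.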
Results: Writing Levels A1-B1
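[Figure: FACETS variable map on the logit scale for the Writing descriptors at levels A1-B1. Descriptors in the order they appear in the map: W11 (B1), W15 (B1), W12 (B1), W16 (B1), W4 (A2), W19 (A2), W6 (A2), W24 (A1), W25 (A1). Exact logit positions are not recoverable from the extracted text.]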
Results: Writing Levels B2-C2
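[Figure: FACETS variable map on the logit scale for the Writing descriptors at levels B2-C2. Descriptors in the order they appear in the map: W18 (C2), W1 (C2), W14 (C2), W23 (C2), W10 (C2), W2 (C1), W13 (C1), W21 (C1), W20 (C1), W3 (C2), W9 (C2), W17 (C1), W5 (B2), W22 (B2), W7 (B2), W8 (B2). Exact logit positions are not recoverable from the extracted text.]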
Results: Raters
[Figure: FACETS variable map on the logit scale for the 12 raters. Raters in the order they appear in the map: Claudia, Matt, Alice, George, Nicola, Andrew, Rita, Kate, Lora, Roseanne, Sally, Tim. Exact logit positions are not recoverable from the extracted text.]
Results: Occasions
[Figure: FACETS variable map on the logit scale for the occasions. Feb 06 is labelled on the 0 line of the extracted axis; the positions of Nov 05, Sept 05 1st and Sept 05 2nd are not recoverable from the extracted text.]
Results: Correlations
Correlations of scaling between the judges and the CEFR volume
Descriptors   Spearman
Speaking      .959
Writing       .946
Listening     .968
Reading       .975
Global        .980
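For readers unfamiliar with the statistic, a Spearman correlation compares two rank orders, here the judges' scaling of the descriptors and the levels in the CEFR volume. The sketch below is a minimal pure-Python version that gives tied values the average of their ranks; the tie handling, the function names and the example data are all assumptions made for illustration, not the procedure or data used in the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: CEFR levels coded 1 (A1) to 6 (C2) for eight descriptors,
# and invented logit measures standing in for the judges' scaling.
cefr_levels = [1, 1, 2, 3, 4, 5, 5, 6]
judge_measures = [-8.1, -7.4, -5.0, -2.2, 0.3, 3.9, 2.8, 5.6]
print(round(spearman(cefr_levels, judge_measures), 3))
```

A value close to 1, like those on the slide, means the two orderings agree almost everywhere; it does not by itself show that the cut-offs between adjacent levels are placed in the same way.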
Summary of results
• Trained judges perceived language ability as intended in the CEFR
• Almost identical scaling
• Cut-offs between B2-C1 and C1-C2 unclear
• Competences other than linguistic: misfitting descriptors
• Unclear and inconsistent wording resulted in level misplacement by the judges
• Mixed effect of training
Implications of findings
• Common understanding of the construct in the CEFR scales can be achieved, but
• How valid is it to claim that a test is linked to B2 instead of C1 and C1 instead of C2?
• How can sociolinguistic and strategic competences be tested in relation to the CEFR?
• Can SLA research help better understand these issues?
Contact details
Spiros Papageorgiou
University of Michigan English Language Institute
500 East Washington Street
Ann Arbor, MI
48104-2028
USA
[email protected]