Centre for Research in English Language Learning and Assessment
Language Testing in the Past: Three
lessons to be learned
LTF Nottingham 2013
Professor Cyril J. Weir AcSS
There is nothing new under the sun.
Is there a thing of which it is said,
'See, this is new'?
It has already been
in the ages before us.
[Ecclesiastes, Chapter 1,
verses 9-10]
News item in the Daily Telegraph on
6th December 2011:
Primary school pupils should be given an oral reading
test at the age of 11 to assess whether they can read
properly, according to one of the Government's most
senior education advisers… "If a child cannot read
fluently at age 11 they are going to have a problem at
secondary school," Sir Cyril Taylor told The Sunday
Telegraph. "In some cases, the English results in the
Key Stage 2 tests are overstating what a child is capable
of. Any teacher will tell you, if a child cannot read, they
can't learn.
What we need is a test where they read out loud —
then you can tell in seconds."
Death in the Cathedral
Following the murder of the Archbishop of Canterbury, the king
sought a way of placating an important, religious power elite.
Clergymen were accordingly allowed to claim that they were
outside the jurisdiction of the secular courts and could only be tried
for a felony in an ecclesiastical court, with the expectation of being
treated with far greater leniency than in a secular court, e.g. a
penance rather than hanging in a number of cases.
Initially, being tonsured and wearing ecclesiastical dress were taken
as sufficient proof of being a cleric.
The neck verse
But after 1351 a literacy test was required. As most of the people
who could read at this time were clerics, the ability to read
aloud a verse from the Bible was taken as proof of clericity.
Miserere mei, Deus: secundum magnam misericordiam
tuam et secundum multitudinem miserationum tuarum:
dele iniquitatem meam.
Psalm 51
(Have mercy upon me, O God, according to thy loving-kindness:
according unto the multitude of thy tender mercies blot out my
transgressions.)
Only abolished in England in 1706.
Testing oral proficiency
Spolsky writes: …pride of place for a direct
measure of oral language proficiency is
usually granted to the oral interview created
by the Foreign Service Institute (FSI) of the US
State Department. Developed originally
between 1952-56…
Spolsky, B. (1990: 158). Oral examination: an historical note.
Language Testing, 7 (2)
Trinity College 1938 (1877)
Assessment Scales
Glenn Fulcher traces the first attempt to assess
second language speaking to the work of the Rev.
George Fisher who became Headmaster of the
Greenwich Royal Hospital Schools in 1834.
“In order to improve and record academic
achievement, he instituted a ‘Scale Book’, which
recorded performance on a scale of 1 to 5 with
quarter intervals. A scale was created for French as
a second language, with typical speaking prompts
to which boys would be expected to respond at
each level. The Scale Book has not survived.”
Marker standardisation
Thorndike developed a standardized scale for the measurement of
quality in the handwriting of children and also one for the handwriting of
women in 1908 (Thorndike 1911, 1912). Instead of estimating a scale
based simply on connoisseurship as was often the case in the United
Kingdom, Thorndike took a large sample of student handwritten scripts
and used 200 teachers to rank these scripts in order.
From the data he created a scale upon which he placed each script. He
then provided a set of exemplar scripts at various levels to operationalise
a scale from an absolute zero base with scale points defined, and their
distances defined (1912: 295-299).
Teachers were asked to compare
their students’ scripts with the samples on the scale and identify the
closest match to give the level.
Reading into Writing
Robeson, F. E. (1913) in A Progressive Course of Precis
Writing quotes from the criteria for marking the précis
test set by the Oxford and Cambridge Schools
Examination Board:
…The object of the précis is that anyone who had not
time to read the original correspondence might, by
reading the précis, be put in possession of all the
leading features of what passed. The merits of such a
précis are (a) to include all that is important in the
correspondence, (b) to present this in a consecutive
and readable shape, expressed as distinctly as possible,
and as briefly as is compatible with distinctness.
Assessment Literacy
Knowing how a construct was measured in the
past provides us with a valuable
perspective when developing
'new' tests and marking
schemes or critiquing existing ones.
First Lesson
There is nothing new under the sun, but there
are lots of old things we don't know.
Ambrose Bierce, The Devil's Dictionary
Examination boards/test developers need to
preserve/write their own histories and archive
important documents for posterity…
The influence of
people and ideas
The washback of language teaching on testing
Changing priorities in the methods and content of
language teaching obtaining at various stages in the
C20th in the UK included:
• the Grammar-Translation or Traditional Method, based upon the
method used for the teaching of classical languages
• the direct method promoted in continental Europe for the formal
education system
• the oral method, Harold Palmer’s attempt to systematise direct
method teaching procedures and align them with emerging ideas
on structural and lexical progression
• the audio-lingual method
• the situational approach
• the communicative approach with its focus on the needs of
learners to use language for real-life communication, accompanied
by a sub-skills approach to teaching the four macro skills
C19th: Grammar the foundation of all
Not only was grammar viewed as the “gateway to
all of knowledge” in the C19th, it was thought to
“discipline the mind and the soul, at the same
time honing the intellectual and spiritual abilities
that would enable reading and speaking with
discernment” (Huntsman 1983 p 50) …
(Hillocks 2008 p311)
Language teaching at the end of the C19th
“The prime object of scholastic education is the training of the
mental faculties. Hence a youth is put to hard and dry studies,
often confessedly distasteful, though the whole of them may
be forgotten when he enters practical life. The mental training
is never forgotten; on the contrary, the powers so developed
increase in grasp and tenacity.
Training by the ear will never do this: it simply cultivates one
faculty, memory, and that only for a short time. It is always
found that children so trained are the most volatile, have not
power of application, and in after life seldom settle to any
definite pursuit.”
(R.W. Hiley 1887 Journal of Education Vol IX: 308)
The Reform Movement
Throughout the 19th century the Grammar Translation Method (GTM) tried to carve
out a role in the schools by modeling itself on the
classics, but it was not popular with some teachers,
and in the 1880s a number of language teachers and
academics in Europe instigated a Reform Movement
which, with the assistance of modern ideas from
phonetics, allowed for a new pedagogical approach
rooted in the spoken language.
Grudging acceptance of spoken language
Schools were being encouraged to include
modern languages with an oral component
towards the end of the C19th but
headmasters, according to Gilbert (1953: 3)
“…consented only because they thereby
satisfied utilitarian parents and because the
Modern Side enabled them to ‘shunt the
empties’ or transfer the dullards from classics
to modern languages.”
What to teach/test?
Stern argues that:
where aims are scholarly there will be an
emphasis on written and analytical skills;
where social objectives are dominant there
will be an emphasis on communication
especially oral.
• [In the C20th] the needs of the scholar were
superseded by the needs of the non-scholar for
practical everyday use of the language in a spoken form.
Stern, H. H. (1983). Fundamental concepts of language teaching. Oxford: Oxford University Press.
Henry Sweet
Henry Sweet’s (1899) The Practical Study of
Languages. A Guide for Teachers and Learners
is regarded by Howatt (1984:202) as one of the best
language teaching methodology books ever
written: “… unsurpassed in the history of linguistic …”
The papers in CPE 1913 correspond closely to the
chapters in his book
Stability and Innovation in the C20th: CPE
Reading aloud
Knowledge of grammar 1913-32
Use of English
British Council and testing in C20th
Stage 1 (1936 – 1959) Traditional Approaches to Language Testing
The UCLES-British Council Joint Committee for overseeing Cambridge
English exams
1945 The Diploma of English Studies (DES)
1954 Knowledge of English Form for screening applicants for UK Universities
by BC abroad
1958 OSI 210 Assessment of Competence in English Form
Stage 2 (1960 onwards) Psychometric-structuralist Approaches to Language Testing
1963-5 English Proficiency Test Battery (EPTB), (Davies PhD University of
Birmingham 1965)
Stage 3 (1971 onwards)
Communicative Approaches to Language Testing
ELTS Test (Carroll 1978, Munby 1978)
IELTS Test (Alderson, Clapham and Hargreaves)
APTIS (O’Sullivan)
Knowledge of English Form 1954
Lesson 2
People and ideas have an
important influence on English
language testing in the UK
Atlantic rift
Substantive differences grew between the UK
and the USA in their approaches to testing from
In the US the predominant focus was on scoring
validity, in particular the psychometric qualities
of a test.
In the UK we find a far greater concern with
construct validity: a concern with the how in the
US as against the what in the UK.
The wider context
One reason for the Atlantic rift can be found in
the differing socio-economic contexts prevailing
in Britain and the USA in the early C20th.
The compelling need to produce tests on an
industrial scale in the US strongly influenced
testing organizations in the direction of
objective multiple choice methods at a very
early stage.
Population explosion
Resnick (1982: 177,187) describes how in the
US “the need to identify those who had the
least probability of being able to carry on
normal work for their age, was stimulated by
the demographic explosion.
In 1870 there were about 80,000 students …
by 1910 there were 900,000.”
Needs of the military
Glenn Fulcher (1999: 390) identified the role
played by politics and war in the US and argues that
the crisis in the army in WW1 and later in
WW2 contributed to the spread in use of
objective test formats in intelligence tests.
Resnick (1982:182) records the successful
placement in appropriate jobs of 1.7 million
army recruits mobilised in 1917-18.
In short, it was the pressure of numbers that
drove US testing in the direction of
psychometrically driven tests, and the move
was not unwelcome to school authorities, as
standardised tests provided them with
accountability in schools.
Birth of MCQ
Advances in standardised testing (Thorndike
1908) coupled with the development of the
multiple choice question (MCQ) format by Kelly
(1915) in his Kansas Test of Silent Reading,
marked the beginning of large-scale testing in
the USA and provided the impetus for the birth
of psychometrics, spawning the testing
industry we now know.
First MCQ Question 1915
Below are the names of four animals. Draw a line around the
name of each animal that is useful on the farm:
cow tiger rat wolf
Samelson (op. cit.:123) concludes: “The
multiple choice test – efficient, quantitative,
objective, capable of sampling wide areas of
subject matter and easily generating data
for complicated statistical analysis – had
become the symbol or synonym of
American Education.”
Third Lesson
Spolsky (1990a:159) reminds us of the crucial
connection of developments in testing with:
“…external, non theoretical, institutional social
forces, that on deeper analysis, often turn out to be a
much more powerful explanation of actual practice…
A clearer view of the history of the field will emerge
once we are willing to look carefully at not just the
ideas that underlie it, but also the institutional, social
and economic situations in which they are realized.”