Are banked cloze items sensitive to discourse-level constraints?
Guo Li, School of Foreign Languages, Shenzhen University
A study of the validity of a new test format in the College English Test, Band 4 (CET4), a national English test for university students in mainland China.

1. Introduction

1.1 Types of cloze
* Fixed-ratio vs rational cloze
Fixed-ratio example: A bird watcher, in the course __①__ a day’s rambling, almost always finds __②__ species of woodpeckers. Sometimes they reveal __③__ whereabouts by piercing call notes, sometimes __④__ their steady hammering on tree trunks. (Key: of, several, their, by)
Rational example: ... In __①__, on the other hand, while individuals are important, people also try to do everything they can for the other group members. They do this because it is thought that if the __②__ succeeds then each member will succeed. So a person’s life away from work is still more relevant to his or her job is __③__. (Key: Japan, group, America)
* Constructed-response vs selected-response cloze (open-ended vs multiple-choice or banked cloze)
* Exact word scoring vs acceptable word scoring

1.2 Banked cloze in CET4
A new test format in the College English Test, Band 4

Old test format (before 2007): Vocabulary and structure
* We don’t know why so many people in that region like to wear dresses of such _____ colors.
A) low B) humble C) mild D) dull
(College English Test Band 4, June 2005)

Example of banked cloze
Some years ago I was offered a writing assignment that would require three months of travel through Europe. I had been abroad a couple of times, but I could hardly __1__ to know my way around the continent. Moreover, my knowledge of foreign languages was __2__ to a little college French. I hesitated. How could I, unable to speak the language, __3__ unfamiliar with local geography or transportation systems, set up __4__ and do research? It seemed impossible, and with considerable __5__ I sat down to write a letter begging off. Halfway through, a thought ran through my mind: you can’t learn if you don’t try. So I accepted the assignment. There were some bad __6__. But by the time I had finished the trip I was an experienced traveler. And ever since, I have never hesitated to head for even the most remote of places, without guides or even __7__ bookings, confident that somehow I will manage. The point is that the new, the different, is almost by definition __8__. But each time you try something, you learn, and as the learning piles up, the world opens to you. I’ve learned to ski at 40, and flown up the Rhine River in a __9__. And I know I’ll go on doing such things. It’s not because I’m braver or more daring than others. I’m not. But I’ll accept anxiety as another name for challenge and I believe I can __10__ wonders.
Word bank: totally, scary, moments, manufacture, constantly, claim, regret, limited, balloon, reduced, interviews, advanced, news, declare, accomplish
(College English Test Band 4, June 2008)

Purpose of the banked cloze component in CET4: “to assess students’ ability to understand and employ words at discourse level” (National College English Testing Committee, China. 2006. Syllabus for College English Test Band 4.)

2. Review of literature and necessity for the study

Contradictory conclusions concerning the construct validity of cloze
* Cloze can measure text-level processing ability (Ramanauskas, 1972; Chihara et al., 1977; Bachman, 1985; Jonz, 1991; Fotos, 1991; Chavez-Oller et al., 1994; Storey, 1997; Yamashita, 2003)
* Cloze cannot assess comprehension beyond sentence level (e.g., Alderson, 1979; 1980; Kibby, 1980; Markham, 1985)

Relevant research (on rational deletion and banked cloze)
Yamashita, Junko. 2003. Processes of taking a gap-filling test: comparison of skilled and less skilled EFL readers. Language Testing 20: 267-293.
* 12 Japanese EFL students
* Gap-filling test: key content words and cohesive devices were deleted
* Thinking aloud
* Information sources: text level (52%); sentence level (17%); clause level (10%); extra-textual (4%)
* A gap-filling test can be used to measure higher-order processing ability.

Relevant research (continued)
Two studies of banked cloze procedures by Chinese researchers.
Gao Xiaoying et al. (2008)
* Simultaneous introspection and immediate retrospection
* 18 Chinese university students
* Information sources: clause (47%); text level (23%); sentence (22%); extra-textual (8%)
* The proficient readers used a context-based reading model; the less proficient readers used word-based approaches.
Liu (2009)
* Similar to the above in research focus and design
* 6 second-year students in a Chinese university
* Information sources: local syntactic (36%); local semantic (29%); sentence (20%); text-level (16%)

Relevant research (continued)
The present study adopts an approach which shares similarities with Jonz (1991) and Chihara et al. (1977), whose scrambled-text research on standard fixed-ratio cloze examined the sensitivity of cloze items to disruptions of the normal order of texts.

Need for this study
* Little research has been conducted into the banked cloze procedure, and consequently there is little evidence as to whether it is sensitive to inter-sentential constraints, as the CET4 designers claim it to be.
* The previous research on banked or rational deletion cloze is mostly qualitative, involving only a small number of subjects.
* As a high-stakes national exam, CET4 exerts a powerful backwash effect upon English instruction and testing at tertiary level in China.

Research questions
1) Do subjects use text-level information in completing the banked cloze test?
2) What types of information (within-sentence or beyond-sentence) are used to answer each item?
3) Does English proficiency affect the use of text-level information?

3. Study One

Materials
Normal Test One
* Reading in Depth Section, College English Test Band 4, Dec, 2007
* 222 words long
* 10 content words (4 nouns, 2 verbs, 2 adjectives and 2 adverbs) deleted
Scrambled Test One
* Produced by scrambling the sentences in Normal Test One
* Intended to cause maximum disruption of any constraint on comprehension attributable to sentence sequence

Response and scoring
Students were required to select one word for each blank from the list of choices. Since no change in word form was necessary, the scoring was entirely objective.

Normal Test One
As war spreads to many corners of the globe, children sadly have been drawn into the center of conflicts. In Afghanistan, Bosnia, and Colombia, however, groups of children have been taking part in peace education __1__. The children, after learning to resolve conflicts, took on the __2__ of peacemakers. The Children’s Movement for Peace in Colombia was even nominated (提名) for the Nobel Peace Prize in 1998. Groups of children __3__ as peacemakers studied human rights and poverty issues in Colombia, eventually forming a group with five other schools in Bogota known as The Schools of Peace. The classroom __4__ opportunities for children to replace angry, violent behaviors with __5__, peaceful ones. It is in the classroom that caring and respect for each person empowers children to take a step __6__ toward becoming peacemakers. Fortunately, educators have access to many online resources that are __7__ useful when helping children along the path to peace. The Young Peacemakers Club, started in 1992, provides a Website with resources for teachers and __8__ on starting a Kindness Campaign. The World Centers of Compassion for Children International call attention to children’s rights and how to help the __9__ of war. Starting a Peacemakers’ Club is a praiseworthy venture for a class and one that could spread to other classrooms and ideally affect the culture of the __10__ school.
Word bank: acting, especially, projects, assuming, forward, respectively, comprehensive, images, role, cooperative, information, technology, entire, offers, victims

Subjects
165 second-year students, 100 male and 65 female, majoring in Business Administration, Architecture, Software Design, Electronic Engineering and Chemistry at Shenzhen University, P.R. China

Procedure
The subjects were randomly assigned to the two versions of the cloze: the odd rows (Group A1) took Normal Test One and the even rows (Group B1) completed Scrambled Test One.

Proficiency Exam
* Administered a week after the cloze tests
* Used to check the equivalence of the two groups
* Used to divide the subjects into 3 proficiency groups
* Components: Listening, Vocabulary and Structure, Reading, Translation and Essay Writing
* Reliability (Cronbach’s alpha): 0.78

Table 1 Proficiency Exam Scores for Groups A1 and B1
            N     Mean    SD      t      p
Group A1    87    63.03   12.03   .983   .327
Group B1    78    61.29   10.54

Table 2 Proficiency Exam Scores for Three Proficiency Groups
                  N     Mean    SD      F        p
High Group 1      58    74.17   5.93    267.44   .000
Middle Group 1    49    61.88   2.86
Low Group 1       58    50.53   6.64
Total             165   62.21   11.35
Post hoc tests: significant difference in each of the three pairs (.000).

Results and discussion
Question 1: Do subjects use text-level information in completing the banked cloze test?

Table 3 Mean Scores in Normal and Scrambled Conditions
            N     Mean   SD     t       p
Group A1    87    6.30   2.61   1.962   .052
Group B1    78    5.53   2.53

Question 2: What types of information are used to answer each item?
Table 4 Facility Index (IF) in Normal and Scrambled Conditions
Item           Normal (N=87)   Scrambled (N=78)   IF Difference   t      p
projects       .79             .62                +.17            2.52   .01
role           .74             .53                +.21            2.85   .01
acting         .66             .56                +.10            1.19   .23
offers         .92             .95                -.03            -.75   .46
cooperative    .36             .36                .00             -.04   .97
forward        .47             .37                +.10            1.22   .22
especially     .72             .72                .00             -.03   .98
information    .34             .29                +.05            .63    .53
victims        .77             .68                +.09            1.35   .18
entire         .53             .45                +.08            1.10   .28

Question 3: Does English proficiency affect their use of text-level information?
PL=.08 >.05; PH=.02 <.05; PH=.04 <.05; PL=.08 >.05; PM=.08 >.05

4. Study Two

Purpose
Extend the generalizability of the research; the same issues are explored.

Materials
Normal Test Two
* Reading in Depth Section, College English Test Band 4, June, 2008
* 242 words
* Chronological account, a different type of text from Test One
* 10 content words (4 verbs, 3 nouns, 2 adjectives and 1 adverb) deleted
Scrambled Test Two
* Produced by rearranging the sentences in Normal Test Two
* Inter-sentential connections were meant to be thoroughly disrupted

Subjects
141 second-year students, 108 male and 31 female, majoring in Information Technology, Civil Engineering and Maths at Shenzhen University, P.R. China

Procedure
The subjects were randomly assigned to the two versions of the cloze: the odd rows (Group A2) took Normal Test Two and the even rows (Group B2) completed Scrambled Test Two.

Proficiency Exam
* Administered a week after the cloze tests
* Applied as a criterion to evaluate the subjects’ English competence
* Components: Listening, Vocabulary and Structure, Reading, Translation and Essay Writing
* Reliability (Cronbach’s alpha): 0.77

Table 5 Proficiency Exam Scores for Groups A2 and B2
            N     Mean    SD      t      p
Group A2    69    67.00   11.07   .487   .627
Group B2    72    66.14   9.93

Table 6 Proficiency Exam Scores for Three Proficiency Groups
                  N     Mean    SD      F         p
High Group 2      54    77.04   4.89    255.947   .000
Middle Group 2    36    66.42   2.66
Low Group 2       51    55.57   5.91
Total             141   66.56   10.47
Post hoc tests: significant difference between each of the three pairs (.000).

Results and discussion
Question 1: Do subjects use text-level information in completing the banked cloze test?

Table 7 Mean Scores in Normal and Scrambled Conditions
            N     Mean   SD     t       p
Group A2    69    5.55   2.10   3.341   .001
Group B2    72    4.39   2.03

What accounts for the different significance levels in the two studies?
(Study One: M(Group A1) = 6.30; M(Group B1) = 5.53; t = 1.962, p > .05)
Text type? The cloze passage in Study One was largely an account of facts and did not exhibit much top-level constraint on the order in which propositions were presented, while the passage in Study Two was a chronological account of a writer’s experience and thus more sensitive to sentence reordering.

Question 2: Does English proficiency affect the use of text-level clues?
* High group: Diff = 1.49, PH = .003 < .05
* Middle group: Diff = 0.98, PM = .140 > .05
* Low group: Diff = 0.77, PL = .234 > .05
As subjects become more proficient in English, they are better able to benefit from discourse constraints ranging across sentence boundaries.

Question 3: What types of information are used to answer each item?

Table 8 Facility Index (IF) in Normal and Scrambled Conditions
Item          Normal (N=69)   Scrambled (N=72)   IF Difference   t       p
accomplish    .54             .53                +.01            .024    .981
limited       .84             .84                .00             .080    .937
scary         .25             .10                +.15            2.399   .018
totally       .74             .75                -.01            -.194   .846
interviews    .64             .44                +.20            2.412   .017
claim         .38             .27                +.11            1.304   .194
advanced      .58             .56                +.02            .216    .829
balloon       .81             .56                +.25            3.320   .001
moments       .52             .12                +.40            5.541   .000
regret        .26             .23                +.03            .384    .701

5. Conclusion
* There may be considerable variation in the effectiveness of banked cloze in assessing discourse-level abilities.
* Different cloze items carry different amounts or levels of information.
* More proficient test-takers have better access to discourse-level clues than less proficient ones.
* A well-ordered narrative/chronological text could function better than a loosely connected encyclopedic description which simply lists facts.
* Therefore it can be concluded that tester/researcher intervention is necessary for developing sound banked cloze tests.

Limitations
* Only two samples of banked cloze tests
* Convenience sampling of subjects
* Were the subjects in the two separate studies equivalent in their English competence?

            Group 1     Group 2
Test One    Normal      Scrambled
Test Two    Scrambled   Normal

Thank you very much!
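
Note on the group comparisons: Tables 1, 3, 5 and 7 report t values alongside N, mean and SD for each group. The following is a minimal sketch of how such a value can be recomputed from the summary figures alone, assuming an independent-samples Student's t-test with pooled variance. The slides do not state which t-test variant was used, so the pooled-variance choice and the helper name `pooled_t` are assumptions made here for illustration only.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Return (t, df) for two independent groups from summary statistics.

    Assumption: equal variances (pooled estimate). The source slides do not
    say which t-test variant was actually run.
    """
    df = n1 + n2 - 2
    # Pooled variance: each group's variance weighted by its degrees of freedom
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two group means
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Table 1: proficiency exam scores, Group A1 vs Group B1
t1, df1 = pooled_t(87, 63.03, 12.03, 78, 61.29, 10.54)
# Table 7: cloze scores, Group A2 (normal) vs Group B2 (scrambled)
t7, df7 = pooled_t(69, 5.55, 2.10, 72, 4.39, 2.03)

print(f"Table 1: t({df1}) = {t1:.3f}")
print(f"Table 7: t({df7}) = {t7:.3f}")
```

Under this assumption the recomputed values come out close to the reported .983 (Table 1) and 3.341 (Table 7). Small differences are expected because the tabled means and SDs are rounded to two decimals.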