Issues in Assessment
Mathematics Assessment and Intervention
Statistical Issues Related to Assessment
• Reliability
– Consistency
– The degree to which students’ results
remain consistent over replications of an
assessment procedure.
– Measurement example
Popham, J. W. (2011). Classroom assessment: What teachers need to know.
Boston: Pearson.
Nitko, A. J. & Brookhart, S. M. (2007). Educational assessment of students.
Upper Saddle River, NJ: Pearson.
Reliability Evidence
• Stability Reliability – test/retest
– The stability of results if no significant
event occurred between administrations
– Typically calculated using a correlation
– Can also use classification consistency
• Does the student receive the same
classification, such as proficient/not
proficient?
– Note – there is always some instability
between administrations
Reliability Evidence
• Alternative Form Reliability
– This is when you have multiple test forms
– Typically calculated using a correlation
Reliability Evidence
• Internal Consistency Reliability
– Do the items in an assessment function in
a consistent fashion?
– Do the items on the assessment measure
a single variable, such as Fraction
Addition or Shape Identification?
– More items generally make an assessment
more reliable
– There are specific statistical tests, such
as Cronbach’s alpha, that are used to
determine internal consistency
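One widely used internal-consistency statistic is Cronbach's alpha. This sketch computes it on hypothetical item-level results (rows are students, columns are items scored 1 = correct):

```python
# Cronbach's alpha sketch (all item-level data here is hypothetical).
# alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
from statistics import pvariance

scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 0, 0],
]
k = len(scores[0])  # number of items

# Variance of each item across students, and of students' total scores.
item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
total_var = pvariance([sum(row) for row in scores])

alpha = k / (k - 1) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values closer to 1 suggest the items function together as a measure of a single variable.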
Standard Error of Measurement
– Any assessment result is an estimate of a
student’s “true” score.
– The standard error of measurement is
the “band” wherein the student’s true
score likely lies.
– It is found by multiplying the standard
deviation of the assessment by the square
root of (1 − the reliability coefficient):
SEM = SD × √(1 − r)
– See graph on Nitko p. 77
– Parent Talk example from Popham (p. 70)
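The SEM calculation above can be sketched directly; the standard deviation and reliability coefficient below are hypothetical values:

```python
# Standard error of measurement: SEM = SD * sqrt(1 - reliability).
# The SD, reliability, and observed score are hypothetical.
import math

sd = 10.0           # standard deviation of the assessment
reliability = 0.91  # reliability coefficient

sem = sd * math.sqrt(1 - reliability)

observed = 75  # a hypothetical observed score
print(f"SEM = {sem:.1f}")
print(f"band for the true score: {observed - sem:.1f} to {observed + sem:.1f}")
```

A band of ±1 SEM around the observed score contains the true score about 68% of the time.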
• Validity refers to the inferences from
or use of assessment results.
– There is no such thing as a valid test,
only valid inferences based on the
results.
Curricular aim
– Example: multiplication fact fluency
Student’s inferred status
– Example: 2-minute timed multiplication test
Validity Evidence
• Content Validity
• Does the content of the test represent
the content of the curricular aim?
– Ex: What if the multiplication test used only the 1
and 5 facts?
• Does the curricular aim involve
different processes? If so, the
assessment must use those different
processes.
• Look at categorical concurrence, depth and
range of knowledge, and balance.
Validity Evidence
• Construct Validity
– Does the assessment measure the stated
construct?
• Make a hypothesis about how the construct
works and how student results should
illustrate that construct
• Gather data from the assessment
Validity Evidence
• Criterion Validity
– This is primarily related to tests that
purport to predict some result such as
SAT tests predicting college GPA.
Statistics related to assessment: Percentile
• The raw scores of the norming
population are put in order from
lowest to highest. They are then split
into 100 equal groups, called
PERCENTILES. Each student’s score
is then compared to the norming
scores to see where it falls.
Percentiles can only be used on a
norm-referenced test. Why?
Stanines: The percentile scale is divided into nine
segments, each of which represents a “standard nine.”
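A sketch of both ideas, using a hypothetical norming sample; the stanine cut points (4, 11, 23, 40, 60, 77, 89, 96) are the conventional "standard nine" percentile bands:

```python
# Percentile rank against a hypothetical norming sample, then stanine.
norm_scores = sorted([12, 15, 18, 20, 21, 23, 25, 27, 28, 30,
                      31, 33, 34, 36, 38, 40, 41, 43, 45, 48])

def percentile_rank(score, norms):
    """Percent of the norm group scoring below this score."""
    below = sum(s < score for s in norms)
    return 100 * below / len(norms)

def stanine(pr):
    """Map a percentile rank onto the nine 'standard nine' bands."""
    cuts = [4, 11, 23, 40, 60, 77, 89, 96]
    return 1 + sum(pr >= c for c in cuts)

pr = percentile_rank(34, norm_scores)
print(f"percentile rank: {pr:.0f}, stanine: {stanine(pr)}")
```

This also shows why percentiles require a norm group: the rank is meaningless without the norming scores to compare against.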
Statistics Related to Assessment Results
– Measures of Central Tendency
• Mean: the average (x̄)
• Mode: the most common
• Median: the middle number when the data is put
in order from least to greatest
– When should you use which measure?
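All three measures can be computed with Python's statistics module; the scores here are data set 1 from the box-and-whisker exercise:

```python
# Measures of central tendency on a small score set.
from statistics import mean, median, multimode

scores = [7, 8, 2, 10, 9, 9, 3, 5, 7, 8, 8, 10, 10, 6]

avg = mean(scores)         # arithmetic average
mid = median(scores)       # middle value of the ordered data
modes = multimode(scores)  # most common value(s); ties are possible

print(f"mean: {avg:.2f}, median: {mid}, mode(s): {modes}")
```

Note that this data set is bimodal (8 and 10 each appear three times), one reason "which measure?" matters.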
Box and Whisker Plot
• A box and whisker plot uses medians and
percentiles to describe data.
• Create a box and whisker plot with the
following data
– Data set 1: 7, 8, 2, 10, 9, 9, 3, 5, 7, 8, 8, 10, 10, 6
– Data set 2: 45, 60, 85, 95, 100, 50, 80, 90, 100,
95, 60, 25
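A box-and-whisker plot is built from a five-number summary. This sketch computes one for data set 1 using the "split the ordered data in half, take medians" rule:

```python
# Five-number summary for a box-and-whisker plot (data set 1 above).
from statistics import median

data = sorted([7, 8, 2, 10, 9, 9, 3, 5, 7, 8, 8, 10, 10, 6])

# Split the ordered data in half (excluding the middle value when
# the count is odd), then take the median of each half.
n = len(data)
lower, upper = data[: n // 2], data[(n + 1) // 2:]

summary = {
    "min": data[0],
    "Q1": median(lower),
    "median": median(data),
    "Q3": median(upper),
    "max": data[-1],
}
print(summary)
```

The box spans Q1 to Q3 with a line at the median; the whiskers reach the minimum and maximum.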
More Descriptive Statistics
• Measures of Variability
– Standard Deviation (SD): a measure of
how spread out the data are; roughly, the
average of how far each data point is from
the mean
– Range: difference between the lowest
data point and the highest data point
– Interquartile Range: rank order the data,
split it in half and in half again, subtract the
median of the bottom half from the median
of the top half
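A sketch of all three measures of variability, computed on data set 2 from the box-and-whisker exercise (the population standard deviation is shown):

```python
# Measures of variability: standard deviation, range, and IQR.
from statistics import pstdev, median

data = sorted([45, 60, 85, 95, 100, 50, 80, 90, 100, 95, 60, 25])

# Rank-ordered halves for the interquartile range.
n = len(data)
lower, upper = data[: n // 2], data[(n + 1) // 2:]

sd = pstdev(data)                        # spread around the mean
value_range = data[-1] - data[0]         # max minus min
iqr = median(upper) - median(lower)      # Q3 minus Q1

print(f"SD: {sd:.1f}, range: {value_range}, IQR: {iqr}")
```

The range is pulled wide by the single low score of 25, while the IQR describes only the middle half of the class.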
Norm-referenced assessments
• Norm-referenced tests compare a student’s
assessment results to other students (norm group)
who have taken the same test. Only the raw score
is used, which is then converted to a percentile.
– Examples
• Iowa Test of Basic Skills
• SAT 9
Criterion-referenced assessments
• Criterion-referenced assessments compare a
student’s assessment results with pre-established
criteria, such as the core curriculum. Results can be
reported as raw scores, percentages, or other
conversions of the score.
– Examples
• End of level tests
• Assessment bias refers to “qualities
of an assessment instrument that offend
or unfairly penalize a group of
students because of students’
gender, race, ethnicity,
socioeconomic status, religion, or
other such group-defining
characteristics” (Popham, p.111).
– Offensiveness
• Negative stereotypes presented
• Slurs
• Distress may influence test results
– Unfair penalization
• Content that, while not offensive,
disadvantages a student because of group
membership
• Think about experiences that some students
may have had while others may not have the
same types of opportunities.
• What about assessments in other languages?
• Does the fact that students of
different races perform differently
indicate bias?
• Disparate impact
Bias Detection in the Classroom
• Think seriously about the impact that
differing experiential backgrounds
will have on the way students respond
to your assessments.
Assessing Students with Disabilities and ELLs
• Must follow modifications and
accommodations on the IEP
• Accurately assess ELLs
Self Check p. 133
Evaluate your assessments
• In a small group, look at your own
assessments, evaluating them for
– Reliability
– Validity (the use of the assessments)
– Bias
Teacher responsibilities
When creating assessments
• Apply sound principles of assessment
• Craft assessment procedures that are free
from characteristics irrelevant to curricular
aims
• Present results in ways that encourage
appropriate interpretation
• Ensure that assessment materials do not
contain errors
Teacher responsibilities
When choosing assessments
• Use quality assessment materials
• Publication does not equal quality
When administering assessments
• Conduct the assessment professionally
• Accommodate students with disabilities
• Follow rules when administering
standardized tests
Teacher responsibilities
When scoring assessments
• Score responses accurately and fairly
• Provide feedback for learning
• Explain rubrics
• Review evaluations individually
• Correct your errors quickly
• Score and return results as quickly as
possible
Teacher responsibilities: Do No Harm?
• Mr. Allen is having his students score
each other's quizzes and then call out
the scores so he can plot them on the
board.
Do No Harm?
Students in Miss Ela's class are
discussing samples of anonymous
science lab notes to decide which
are great examples, which have
some good points, and which don't
tell the story of the lab at all well.
They are gradually developing
criteria for their own lab "learning
targets."
Do No Harm?
Pat's latest story is being read
aloud for the class to critique.
Like each of her classmates,
she's been asked to take notes
during this "peer assessment" so
that she can revise her work.
Do No Harm?
Students in Henry's basic writing class
are there because they have failed to
meet the state's writing proficiency
requirements. Henry tells students
that the year will consist of teaching
them to write. Competence at the end
will be all that matters.
Do No Harm?
Jeremy's teacher tells him that his
test scores have been so dismal so
far that no matter what he does
from then on he will fail the class.
Assessment Ethics: Confidentiality
• Who should have access to student
assessment results?
– Discuss: What about
• Students grading each other’s work
• Parents grading student work
• Student aides recording scores
• Faculty room discussions about students
• Public displays of student progress, e.g.,
charts, graphs, etc.
Assessment Type Presentation
Sign up for Presentation
Work with your group. The presentation should take
approximately one hour and 15 minutes of class time.
