 First text published by G. Allport in the 1930s
 First theoretical models date back to William James and
Sigmund Freud (both late 1800s)
 Considerable variability in explanations for personality
(biological/genetic models, psychodynamic, self-theory,
Common Elements:
 Stability (situations and time) – differentiated from mood
 Research on stability over the lifespan (greater as we age)
Affective, cognitive and behavioral components
Personality assessment
 Recent survey of practicing Ph.D.s, Psy.D.s, and Ed.D.s
revealed that only 32% use personality tests and only 43%
do treatment planning.
 De-emphasis in personality training occurred at the same
time as Mischel shock in 1968, so clinicians trained in the
late 1960s and 1970s did not value personality assessment
 Today, treatment planning based on assessments is
essential from both an ethical standpoint and for insurance
Objective assessments?
 How can personality assessment be more objective
 assess any biases and correct for them (lie, defensiveness)
 find a method to avoid such biases
 look for convergence with reports from others
 assess with low face valid instruments and look for consistent patterns (though
this only really addresses intentional faking)
 Personality assessment is used to further describe the client, just as a
diagnosis does (note that you would not say that depression is causing the
patient's behaviors, you merely use the term to summarize a cluster of
behaviors. The diagnosis itself also does not necessarily imply a causal
mechanism nor an explanation - those from different perspectives would
define it differently)
e.g., if someone is depressed it could be explained biologically,
cognitively, behaviorally, or even in psychodynamic terms
The structure of personality
 Personality involves stable patterns of behavior, affect, and
cognitions. So how stable is stable? (states vs. traits)
 Levels of analysis
 1. factors - groups of traits that show better global predictive
utility (e.g., Big 5 of N, E, O, A, C; The Big 3 of N, E, P; Big 2)
 2. traits - clusters of consistent individual behaviors
 3. habits - consistent (over time) individual behaviors
 4. single acts - individual behaviors
 All levels are used to predict future behavior with the top being the
most robust
 Consider this model when recommending or implementing change in
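One way to see why the higher levels predict more robustly is the Spearman-Brown formula, which gives the reliability of an aggregate of k parallel observations. The sketch below is an illustration only: the single-act reliability of .20 is a made-up value, and treating single acts as parallel measures is an assumption, not something stated in these notes.

```python
def spearman_brown(r_single, k):
    """Reliability of an aggregate of k parallel observations,
    each with reliability r_single."""
    return k * r_single / (1 + (k - 1) * r_single)

# Hypothetical reliability of one observed act.
r_single = 0.20
for k in (1, 5, 20):
    # More acts aggregated (act -> habit -> trait) means a more stable score.
    print(k, round(spearman_brown(r_single, k), 2))
```

With these made-up numbers, reliability rises steadily as more acts are aggregated, which is the sense in which traits and factors are more robust predictors than single acts.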
Predicting behavior
 Difficult to predict specific single behaviors from global
trends; (Epstein, 1983)
 For clinical evaluations, if the context of interest is
known, then you may want to trade off the generalizability
and give a specific prediction
 e.g., Pt.’s test scores indicate that he is generally impulsive. This
may be exacerbated when in the company of other individuals who
are also impulsive and when the individual is drinking, as alcohol
minimizes any inhibition processes that he might have. This
substantially increases the likelihood that he will act impulsively
Two key discussions
Read material in advance and know your MMPI
Scheduled discussion:
Should we use projective tests?
Are they tests or techniques?
Assessing Axis I and II
 Personality addresses both AXIS I and AXIS II disorders.
 What are some AXIS I disorders that might be related to personality
traits? e.g.,
 depression and NA/Neuroticism
 anxiety and NA/neuroticism
 impulse control disorders & extraversion/sensation seeking
 AXIS II personality disorders explicitly link up with personality
assessments (video & DSM-IV)
 Cluster A (odd): Paranoid, Schizoid, Schizotypal
 Cluster B (emotional): ASPD, Borderline, Histrionic, Narcissistic
 Cluster C (anxious): Avoidant, Dependent, Obsessive-Compulsive
 PD NOS – features of several Dx, but does not meet criteria for any one.
Selecting a test battery (see Beutler, 1995)
 What is the referral question?
Single most important determinant
 Are there any limiting factors with regard to the client?
 Context of the evaluation? (work, school, hospital, etc.)
 Follow up assessment relevant to trait findings (e.g., patients who
show impulse control problems should also be assessed for
potential for acting out violently)
 Problem focused or broad, multipurpose battery
Nomothetic (allows for normative evaluations) or ipsative
(allows for the evaluation of the individual) analysis
If using qualitative methods, consider:
 1. Method appropriateness – are there quantitative
methods that you could use instead?
 2. Openness – make clear the theoretical orientation that
undergirds the qualitative assessment
 3. Theoretical sensitivity – use qualitative methods that are
based on accepted theories not your own theories
4. Bracketing of expectation – you must explicitly state
where your conclusions depart from accepted theories
5. Responsibility – how were the qualitative methods
administered and interpreted
 6. Saturation/generalizability – when assessing traits,
sample from a large number and wide range of situations
7. verification of methods – cross-validate your methods
using other reports, other test material to see if it agrees
with your conclusions, do findings predict outcomes, etc.
 8. grounding – stay close to the data when making
interpretations (no big theoretical leaps)
 9. coherence – do all of the interpretations fit together to
make a coherent story
 10. believability/usefulness – does the use of the
qualitative method provide more info on the client, or just
raise more questions? Does it result in a believable
 11. Intelligibility – Is the report readable and jargon free?
MMPI (Hathaway & McKinley, 1943)
10 clinical scales and 3 validity scales
Empirical scale development with items selected
based on their ability to differentiate normals
from a target group (another clinical group with
similar symptoms was sometimes also
Clients should be 18 or older & read at a 6th grade level
Generally lower face validity (breaks with
tradition of items that clearly sample the domain
of interest); most relevant for clinical population
MMPI development
Item pool derived from psychological and
psychiatric reports, textbooks, previous scales,
Criterion group composition
Minnesota normals – 724 relatives and visitors of
patients at the U. of M. Hospitals, 265 recent high
school grads, 265 administration workers, and 254
medical patients
Clinical groups – 221 patients representing the major
psychiatric categories (excludes those with multiple
diagnoses, or questionable diagnoses)
Item analysis to identify those items
differentiating the clinical and normal groups
MMPI development – cont.
The items that could differentiate were then
cross validated with new groups of normals and
Later developed two non-clinical scales
M/F – initially developed to identify male homosexuals;
later augmented with broader items
Si – derived from an introversion/extraversion scale
and cross validated by predicting involvement in
college activities in a second sample (all female
college students)
Validity scales were either derived rationally (L
& K) or from baserates in the normal group (F)
Utility of the MMPI
Not considered a diagnostic inventory (as
was originally intended)
Ineffective at differential diagnosis (based
on how it was originally developed)
Numerical scale labels were intended to
further minimize the connection with a
specific diagnostic label
Some problems with MMPI
Method of determining the criterion group
The PIGs were not a truly random group
(relatives and friends of those in the hospital –
though largely the medical patients); convenient
Criterion and PIGs were largely from the
midwest, in the late 1930s/early 1940s
Utility of some of the scales as it matched
diagnostic concerns of that era, dated and
culture-specific item content, and
representativeness of the norm group.
MMPI vs. MMPI-2 (1989)
MMPI was the most widely used personality test
in all pops (though only validated for inpatient
adult samples)
MMPI validation and norm samples were ones of
convenience with limited variability on education
(M=8 years), coming from a rural background in
the midwest
Normative data collected in the 1930s
Clinical cut-off now defined by t-score of 65 vs.
70 on the MMPI
Advantages of updating the test
more representative norms (based on projected
census data)
relevance of the items
language employed for the items (both temporally
laden references like “drop the hanky”, and gender
biases in item content)
addition of new scales of relevance today
Uniform T-score transformation now used so that T-scores reflect percentile ranks that are the same
across all clinical scales
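For orientation, a linear T-score rescales a raw score to a mean of 50 and an SD of 10 in the norm group. The sketch below uses that linear formula with made-up norm values (the uniform T transformation itself is not reproduced here) and then applies the two clinical cutoffs mentioned above. Treating a score equal to the cutoff as elevated is an assumption.

```python
def linear_t(raw, norm_mean, norm_sd):
    """Linear T-score: mean 50, SD 10 in the norm group."""
    return 50 + 10 * (raw - norm_mean) / norm_sd

def is_elevated(t, cutoff):
    """True when the T-score reaches the clinical cutoff (assumed inclusive)."""
    return t >= cutoff

# Hypothetical raw score and norm values, for illustration only.
t = linear_t(raw=26, norm_mean=18, norm_sd=5)
print(t, is_elevated(t, 65), is_elevated(t, 70))
```

The same T-score can count as clinically elevated under the MMPI-2 cutoff of 65 but not under the original cutoff of 70.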
Disadvantages to all updates
over 20,000 published studies no longer
MMPI-2 must revalidate all of the scales
inability to make comparisons with
adolescent scores (MMPI-2 vs. MMPI-A)
Many of the new scales are very short and
lack appropriate psychometric properties
How often should we redevelop or renorm
the scale?
MMPI-2 (1989): 567 items
Norm group = 2,600 community based
1138 m & 1462 f, aged 18-85 (M = 41,
SD = 15.3), education 3 yrs - 20+, 61% married
median incomes $25-$35,000, 3% of m and
6% of f receiving mental health treatment
81% Caucasian, 12% A-A, 3% Hispanic, 3%
Native American, 1% Asian-American
Validity scales
 Assumption that the clinical population will not be able
to answer forthrightly
 Lie – naive or unsophisticated lying (low SES and education)
 K – less obvious defensiveness (high SES and education);
defensiveness is a component of all responding
 F – answering questions in such a way so as to be
different from 90% or more of the population (non-normative responses); see fake bad/fake good profiles
 F – K Index = can be used to indicate fake bad, with
larger numbers making it more likely (little evidence to
suggest that fake good can be detected); see p. 38
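A minimal sketch of the F – K index as described above: raw F minus raw K, with larger values pointing toward faking bad. The raw scores are hypothetical, and no cutoff is coded because these notes do not give one (see the text for recommended values).

```python
def f_minus_k(f_raw, k_raw):
    """F - K index from raw scores; larger values point toward faking bad."""
    return f_raw - k_raw

# Hypothetical raw scores for two protocols.
print(f_minus_k(f_raw=22, k_raw=8))   # high F, low K
print(f_minus_k(f_raw=4, k_raw=18))   # low F, high K
```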
Clinical Scales
1. Hs - exaggerated concerns re: physical
illness, or tendency to report symptoms
2. D - Clinical dep; unhappy & pessimistic
about the future
3. Hy - conversion reactions (substitute
illness for emotions)
4. Pd - History of delinquency, antisocial
behavior (non-conventional re: moral
5. Mf - prototypical gender identity
(military recruits, stewardesses,
homosexual males students)
6. Pa - paranoid symptoms (ideas of
reference, persecution, grandeur)
7. Pt - anxious, obsessive-compulsive,
guilt ridden, self-doubts
8. Sc - thought disorder, perceptual
abnormalities (various types of Schiz.)
9. Ma - exhibition of mania, elevated
mood, excessive activity, distractibility,
(possible manic-depression or BP II)
10. Si - college students scoring in the
extreme range on introversion - extra.
Costa & McCrae (1990) suggest that the
MMPI-2 won’t work in the normal pop., as
people don’t respond “passively” to items
New Validity Indexes
Basic validity comes from L, F, & K
VRIN (variable response inconsistency)
67 pairs of items that should be answered
similarly or in opposite directions. Client gets
a point for each inconsistent response.
A completely random response set results in
T scores of 96 for m and 98 for f (>80 inval.)
acquiescent responding T = 50
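The VRIN logic above (one point per inconsistent pair) can be sketched as follows. The item numbers and pairs are hypothetical placeholders, not actual MMPI-2 items, and the raw-to-T conversion is left out because it depends on the published norms.

```python
# Hypothetical pairs: (item_a, item_b, inconsistent_answers).
# inconsistent_answers is the (answer_a, answer_b) combination that earns a point.
PAIRS = [
    (1, 2, (True, False)),   # similar content: True/False is inconsistent
    (1, 2, (False, True)),   # similar content: False/True is inconsistent
    (3, 4, (True, True)),    # opposite content: True/True is inconsistent
]

def vrin_raw(responses, pairs=PAIRS):
    """Count the pairs answered in the inconsistent combination."""
    return sum(
        1 for a, b, bad in pairs
        if (responses[a], responses[b]) == bad
    )

consistent = {1: True, 2: True, 3: True, 4: False}
inconsistent = {1: True, 2: False, 3: True, 4: True}
print(vrin_raw(consistent), vrin_raw(inconsistent))
```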
New Validity – cont.
TRIN (true response inconsistency)
23 pairs of items that are opposite in content
either T/T or F/F to assess acquiescent or
non-acquiescent responding
larger raw scores = true responding while
smaller raw scores = false responding
raw scores should be between 6 and 12 in
order to consider the profile valid
Fb - back infrequency items for latter part
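The VRIN and TRIN cutoffs given above (VRIN T above 80 invalid; TRIN raw score expected between 6 and 12) can be combined into a simple consistency screen. This is a sketch of those two rules only; the function name and messages are made up, and Fb and the basic L, F, K scales are not included.

```python
def screen_consistency(vrin_t, trin_raw):
    """Return a list of reasons the protocol looks invalid (empty if none)."""
    problems = []
    if vrin_t > 80:
        problems.append("VRIN T above 80: inconsistent or random responding")
    if trin_raw > 12:
        problems.append("TRIN raw above 12: indiscriminate true responding")
    elif trin_raw < 6:
        problems.append("TRIN raw below 6: indiscriminate false responding")
    return problems

print(screen_consistency(vrin_t=55, trin_raw=9))    # within both limits
print(screen_consistency(vrin_t=96, trin_raw=14))   # fails both checks
```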
Coding the Profile
List scale # codes in order of their T-score
elevations (from highest to lowest)
usually only interpret 4 scale codes and order
does not matter
Welsh coding system involves adding
symbols to numerical scale codes
e.g., Scale:  L  F  K  1  2  3  4  5  6  7  8  9  0
      T:     57 75 43 69 88 75 94 52 81 75 79 59 65
Welsh: 4268371095 FLK
Codes (listed to the right)
** 100-109, * 90-99, “80-89, ‘70-79, +65-69, 60-64, /50-59, .:40-49, #30-39
Some coding forms use ! to denote scores of
110-119 and !! for 120 or greater
Underline identical T-scores (and list in
ascending order) as well as those within one
point of each other
e.g., 4*26”837’10+95/ F’L/K.:
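The coding steps above can be sketched in a few lines, using the example profile from this slide. Assumptions: the symbol "-" is used for 60-64 and ":" for 40-49, a band's symbol is written after the last scale in that band, empty bands are skipped (as in the example), and tied scales keep their profile order. Underlining of ties is not reproduced in plain text.

```python
# Elevation bands, highest first: (lower bound of band, symbol).
BANDS = [
    (120, "!!"), (110, "!"), (100, "**"), (90, "*"), (80, '"'),
    (70, "'"), (65, "+"), (60, "-"), (50, "/"), (40, ":"), (30, "#"),
]

def band_symbol(t):
    """Return the symbol for the band containing T-score t ('' below 30)."""
    for lower, symbol in BANDS:
        if t >= lower:
            return symbol
    return ""

def welsh_code(scores):
    """scores: dict of scale label -> T-score, given in profile order."""
    # sorted() is stable, so tied scales keep their profile order.
    ordered = sorted(scores.items(), key=lambda kv: -kv[1])
    out = []
    for i, (label, t) in enumerate(ordered):
        out.append(label)
        next_t = ordered[i + 1][1] if i + 1 < len(ordered) else None
        # Close the band after its last scale; empty bands are skipped.
        if next_t is None or band_symbol(next_t) != band_symbol(t):
            out.append(band_symbol(t))
    return "".join(out)

clinical = {"1": 69, "2": 88, "3": 75, "4": 94, "5": 52,
            "6": 81, "7": 75, "8": 79, "9": 59, "0": 65}
validity = {"L": 57, "F": 75, "K": 43}
print(welsh_code(clinical), welsh_code(validity))
```

Run on the example T-scores, this gives the same scale order as the hand-coded example (4, 2, 6, 8, 3, 7, 1, 0, 9, 5 and F, L, K).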
Code Types 2,3 and 4 point codes: 5 point diff
between lowest code T and T of highest scale
not in the code.
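The code-type rule above can be checked mechanically. This sketch reads "5 point diff" as "at least 5 T-score points" (an assumption) and reuses the example profile from the Welsh coding slide; the function name is made up.

```python
def defined_code_type(scores, n_points, min_gap=5):
    """Return the n-point code (e.g. '4-2') if it is defined, else None."""
    ordered = sorted(scores.items(), key=lambda kv: -kv[1])
    in_code, rest = ordered[:n_points], ordered[n_points:]
    if not rest:
        return None  # no scale left outside the code to compare against
    # Gap between the lowest scale in the code and the highest scale outside it.
    gap = in_code[-1][1] - rest[0][1]
    if gap < min_gap:
        return None
    return "-".join(label for label, _ in in_code)

clinical = {"1": 69, "2": 88, "3": 75, "4": 94, "5": 52,
            "6": 81, "7": 75, "8": 79, "9": 59, "0": 65}
for n in (2, 3, 4):
    print(n, defined_code_type(clinical, n))
```

For this profile only the 2-point code qualifies: scale 2 (T = 88) sits 7 points above scale 6 (T = 81), while the 3- and 4-point candidates are separated from the next scale by fewer than 5 points.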
MMPI-2 practice case: M.S.
Integrate the MMPI-2 data with the client
information (vs. laundry list). Note: profile valid.
e.g., profile 3-2/2-3 should revolve around the
discussion of depression and the manifestation of
symptoms (physical symptoms tend to be
How does this relate to M.S.?
Recent loss, seeing her physician, isolation
What does the 8 (or 2-3-8) tell you?
How might psychotic symptoms relate to M.S.?
Confusion from malnutrition, confusion as a result of
depression, her age re: dementia? All are possible
M.S. - continued
Include discussion of (or section on) prognosis,
recommendations, and diagnosis
Axis I: 296.24, Major depression, single episode, with
psychotic features
AXIS II: No diagnosis (or deferred)
AXIS III: Malnutrition, dehydration, poor hygiene &
personal care
AXIS IV: Death of spouse (Severity: extreme (acute
AXIS V: GAF: Current, 24; highest past year, 52
MMPI-2 with other pops.
MMPI was originally developed using Caucasian
groups of patients
Although some research has shown mean score
differences between majority and minority
groups, this is less relevant to the issue of
whether there is differential predictive validity
(few studies on this)
Hall, Bansal, & Lopez, 2000, have conducted a
meta-analysis of 30 years research on minority
groups and the MMPI (both versions)
Hall et al., 2000 - summary
 AA – first note that cultural identification moderates all
findings (cf. acculturation)
 Inconsistent findings re: mean differences, with F, 8, &
9 sometimes higher by approximately 5 T-score points
 Many matched-group studies of patients have found
no differences, though Ns were small (meaning what?)
 Generally no differences in predictive validity that
achieve statistical or clinical significance and any
differences can be attributed to SES and age
 MMPI-2 has representative norms
 Minimal information on the supplemental scales and
even less for the content scales
Hall et al., 2000 – sum cont
 Hispanics likewise show few differences from Caucasians
 Possible differences for scales 3 and 0, with Hispanics
scoring higher on 3 and lower on 0, but these effects
were small with minimal clinical or statistical sig.
 Much stronger effect for acculturation in this ethnic
 Few studies on Native Americans, but they show this
pop. to score slightly higher on most scales
 Few studies for Asian Americans, and they show slight
elevations for scales F, 2, & 8.
 Generally valid to use for these pops given appropriate
acculturation and understanding of the language
Other populations
 Given its original construction, there should be no
problems using the MMPI in medical settings
Medical problems do not necessarily result in higher scores (i.e.,
more distress)
 In substance abuse settings, no profile emerged to
detect substance abuse, but scale 4 was a good
predictor (see also the supplemental scales)
 We will discuss forensic applications later in the
semester (see chapter 13)
 MMPI-2 can be used in non-clinical settings to screen for
psychopathology, but there are some concerns.
False positives are more common
Has not been validated to predict success in other settings (e.g.,
jobs) which is true of most personality tests (predict interest)
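The concern about false positives in non-clinical settings follows from base rates. The sketch below applies Bayes' rule with made-up sensitivity, specificity, and base-rate values (none of these numbers come from the MMPI-2 literature) to show how the same test yields far more false positives when the condition is rare.

```python
def positive_predictive_value(base_rate, sensitivity, specificity):
    """Proportion of positive screens that are true cases."""
    true_pos = base_rate * sensitivity
    false_pos = (1 - base_rate) * (1 - specificity)
    return true_pos / (true_pos + false_pos)

# Hypothetical accuracy figures: same test, two settings.
clinic = positive_predictive_value(0.50, 0.80, 0.80)
community = positive_predictive_value(0.05, 0.80, 0.80)
print(round(clinic, 2), round(community, 2))
```

With a 50% base rate most positives are true cases; with a 5% base rate most positives are false alarms, even though the test itself has not changed.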
MMPI-A (1992)
 Do we need a different inventory for adolescents? Why?
Scales of concern?
M/F for adolescents may be less defined
Theoretically Pd is thought to be elevated, but actually it tends
to be lower
Personality is less stable overall so we need different norms to
better interpret scores and relevant items for this age group
 Valid for those aged 14-18 (for 18 y.o., the decision is
based on life circumstances; e.g. at home? working?)
Important to score on both adult and adolescent norms as there
can be substantial differences (T-score shifts of 15 points)
 478 items (some new some from the original inventory)
 written & auditory forms both in English and Spanish
 Includes all of the clinical, & some new supplemental & content
scales. So we use basically the same scales but different descriptors
(i.e., a high score on Hs will not mean exactly the same thing for
the MMPI-A; e.g., Pd equates more with acting out)
 Biggest change was with the F scale since it is a norm defined scale
(we need new norms)
 Norms: 805 boys & 815 girls aged 14-18 solicited randomly from
schools in 7 states. Represents the U.S. for SES and ethnicity (again
minimal diffs for ethnicity)
 Change from MMPI which had separate norms for different
adolescent age groups (now only one)
 F scale now has 2 parts: F1 = 1st part of test, F2 = 2nd part
MMPI-A: New scales
 New Supplemental scales:
 Alcohol/drug problem proneness (PRO) – empirically
derived to assess the likelihood of alcohol or other drug
problems. Items differentiate adolescents in tx from
those having other psychological problems
 Alcohol/drug problem acknowledgement (ACK) – face
valid items that reflect the admission of problems
 Immaturity (IMM) – reporting behaviors, attitudes, and
perceptions that reflect immaturity (e.g., poor impulse
control, judgment, and self-awareness). Items predict
academic problems and cognitive limitations.
 Check for diagnoses such as oppositional-defiant,
conduct disorder, and in adulthood ASPD
MMPI-A Psychometrics
 For the most part, the psychometric properties of the MMPI-A are
sound. The reliability values are lower than the MMPI-2 values, but
still within acceptable limits.
 Why might there be less temporal stability in the MMPI-A?
 General interpretative data from the MMPI-2 can be generalized to
the MMPI-A, but this data should be considered in light of the
client’s position in life (i.e., consider how the scores relate to school
life, problems with parents, need for independence, etc.)
 Note: no K-correction for clinical scales even though a
defensiveness score is calculated. So what are the clinical scale
implications for a high K?
MCMI-III (Millon, 1990)
 175 item scale assessing problematic personality styles and classic
psychiatric disorders (drawn from the DSM)
 In contrast to the MMPI, this scale was derived theoretically to
match the nosology (taxonomy) of the DSM to facilitate diagnosis
and intervention planning. Assumes that any assessment is theory
driven (vs. MMPI which tried to be atheoretical)
 The theory is grounded in evolutionary principles assessing 4
spheres: existence (from serendipity to an organized structure),
adaptation (survival), replication (reproductive styles that maximize
diversity), and abstraction (the emergence of competencies to
foster planning).
 Scored according to a polarity model. e.g., self vs. other orientation
(reproduction), pleasure vs. pain (existential, or aim of, existence)
 Illustration: Schizoid is marked by deficits in both pleasure and pain
as indicated by the lack of emotion and apathy
MCMI-III properties
 A brief inventory (175 items) that takes only 30 minutes to complete
 3 modifier scales that correspond to the validity scales
Disclosure = defensiveness
Desirability = favorable response set
Debasement = lying
 11 clinical personality patterns: schizoid, avoidant, depressive,
dependent, histrionic, narcissistic, antisocial, aggressive (sadistic),
compulsive, passive-aggressive, self-defeating
 3 scales denoting severe personality patterns: schizotypal,
borderline, paranoid
 7 clinical syndromes: anxiety, somatoform, bipolar, dysthymia,
alcohol dependence, drug dependence, PTSD
 3 severe syndromes: thought disorder, major depression, delusional
MCMI-III- continued
 Scales interpreted based on base rates for each dx and
it assumes that disorders are interconnected (consistent
with comorbidity data)
 Initial studies had classification rates of 90%, but follow-up studies have been much lower (50% or less)
 Validity data has been equivocal and the reliability data
is likewise lower than the MMPI-2 (these are related,
and both linked to number of items)
CPI (Harrison Gough)
 Developed at the same time as the MMPI and served as the
personality test for the normal population (MMPI for the clinical
pop.). Drew from a similar item pool.
 480 T/F questions (some overlap with MMPI and others are new)
 Emphasizes more positive/normal aspects of personality
 3 validity scales: well being (normals asked to fake bad), good
impression (normals asked to fake good), communality
(popular/obvious responding that may reflect defensiveness and
 15 general scales assessing a wide range of traits such as
intellectual efficiency, capacity for status, achievement via
 Grouped into 4 quadrants (factors): Norm favoring vs. norm
doubting and externalizing vs. internalizing
CPI - continued
 CPI was revised in 1986 with norms based on 13,000
males & females
 Most commonly used personality inventory overall
 It has been replaced by the NEO-PI as most common in
the last 15 years.
 Psychometrically sound (reliability and validity
coefficients are high and stable for different pops), but a
very long instrument.
 Also some question as to the need for validity scales in
the normal pop.
Burisch suggests this is unnecessary provided there is: 1) no reason to
lie, 2) knowledge of the construct(s), and 3) self-awareness.
NEO-PI (Costa & McCrae, 1985, 1992)
 Based on the empirically derived 5 factor model
 Assumption that 5 factors can represent all of normal personality
 Evaluated this model in a variety of contexts, with samples from all
over the world and in different languages
 Assumes that language is the best place to start examining how to
describe behavior (132 Eskimo words for “snow” indicates it is a
meaningful construct)
 Neuroticism (emotional stability), extraversion, openness to new
experience, agreeableness (quality of interactions) and
conscientiousness (dutiful, organized).
 5 factors have been recovered from other inventories like the
Myers-Briggs, 16PF, etc.
 Full version is 240 items and has 6 facets for each of the 5 factors
 Short form (NEO-FFI) has 60 items and provides factor scores only
 Norms are available for adults, college students and adolescents
(though minimal differences between the latter two groups)
 Strong psychometric properties including very stable retest
coefficients, internal reliability, and validated with other personality
 Can be used to predict job interests (though vocational inventories
such as the Strong Interest Inventory are better suited for this), but
they do not predict job success (same is true for interest
 Often used for intuitive purposes and not empirically validated
purposes (e.g., assume that a manager should be low on N and
high on C vs. empirically testing this assumption with current
Structure of affect and other issues
Big two (PA/NA) vs. 5 factor
Bipolarity of affect (vs. orthogonality)
Temporal question for what defines affect
vs. personality
Problem of temporal language (e.g., “at this
Measures of Affect
 Note: The EPI (Eysenck) likewise measures personality
(extraversion and neuroticism) in the normal population, and these
two factors are usually the first two to emerge in factor analysis.
 These factors correspond to the Big Two affect constructs (PA and NA)
 Note: most of these measures do not address validity of responding
 Nevertheless, research suggests that these scales tend to be fairly
accurate and reflect actuarial rates for affective disorders (5-9% of
adult women and 2-3% of adult men)
 BDI – published in 1961 and revised in ’74, ’78, and ’96.
 Among the most commonly used inventories, with comprehensive
manuals published in 1987, 1993, and 1996 (BDI-II)
 Normed for adolescents and adults aged 13 and older. 21 items with
items arranged in a Guttman approach (increasing order of severity)
 Suicide potential in items 2 and 9. For dx of Depression see
neurovegetative items
BDI - continued
 Internally consistent; retest reliabilities range from .48 to .86 for
periods ranging from several hours to four weeks
 Why are retest coefficients smaller?
 No way to correct for faked scores
 Validated extensively for use in clinical settings
 BDI-II validated on 500 outpatients drawn from across the country
and a student sample of 120
 1 week retest was .93 and coefficient alphas were .92 or higher
 Average BDI-II scores are 3 points higher than the original BDI
 BDI-II time frame for each item focuses on last two weeks to match
the DSM criteria
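Coefficient alpha, as reported above for the BDI-II, summarizes how consistently the items vary together. A minimal sketch of the standard formula follows; the ratings are hypothetical 0-3 responses, not BDI data.

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: list of per-item score lists, one score per respondent."""
    k = len(items)
    # Total score for each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(pvariance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Hypothetical 0-3 ratings: 3 items x 5 respondents.
items = [
    [0, 1, 2, 3, 1],
    [0, 1, 2, 3, 2],
    [1, 1, 2, 3, 1],
]
print(round(cronbach_alpha(items), 2))
```

Alpha depends only on responses from a single administration, which is one reason it can stay high even when retest coefficients for a changeable state such as depressed mood are lower.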
BAI (Beck & Steer, 1993)
21 item symptomatic inventory
Items rated on a 0-3 scale
Validated for use for inpatient (N = 1,086),
outpatient (N = 160) and college student
samples (N=65).
Shows convergent validity with other measures
of anxiety and some discriminant validity with
depression measures (though they are
correlated – sharing 10-25% variance)
Rapid self-report tool
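Shared variance is the squared correlation, so the 10-25% overlap with depression measures quoted above can be converted back to a correlation size. A small sketch of that arithmetic:

```python
import math

def r_from_shared_variance(shared):
    """Correlation magnitude implied by a proportion of shared variance (r squared)."""
    return math.sqrt(shared)

for shared in (0.10, 0.25):
    print(shared, round(r_from_shared_variance(shared), 2))
```

So 10-25% shared variance corresponds to correlations of roughly .32 to .50 between the anxiety and depression measures.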
CES-D (Radloff, 1977)
 Developed by NIMH for use as a screening tool in the
general population (also in college and geriatric pops)
 Optimal test for this purpose in this population
 20 Likert-type items focusing on the last week
 Better than the BDI-II at differentiating among those
experiencing lower levels of depression
 Internal consistency is high (.85 in general pop. and .90
in patient samples).
 Retest figures tend to be low (.48) but this is less
relevant for this construct
 A score of 16 is clinical cutoff and it assesses depressed
affect, positive affect, somatic activity, and interpersonal
MAACL-R (Zuckerman & Lubin, 1985)
 Originally published in 1965 and revised in ’85. (132
checklist type items)
 Normed on over 1500 adults, 400 adolescents (approx.
90% Caucasian, 10% Black)
 Scores for Anxiety, Depression, hostility, PA, and SS (the
latter has very poor internal reliability)
 A rapid assessment but not as good psychometrically
 Can be used to evaluate states or traits and reliability
figures are better (though not very high) for the latter
 Scales don’t correlate with social desirability and do converge
with MMPI ratings
Behavioral Assessments
 Assumption: behaviors can reflect cognitions and
emotions (e.g., FACS; Ekman & Friesen, 1978)
 Proliferation of behavioral assessments with limited
validity due to the assumption that behavior can be
easily defined and that it represents a meaningful
(typically underlying) construct e.g., sweating, pacing
 How to improve behavioral assessments?
Identify the actual behavior being assessed (lip
turned downward vs. sadness)
Habitual behaviors may indicate underlying condition
Acknowledge role of both traits and situations
Beh assessments – cont.
 Also influenced by factors such as social desirability
(varies depending if one is aware of the assessment)
 Difficult to organize and systematize behaviors (e.g.,
how does one smile equate with the absence of a frown
re: depression?)
Very inconsistent findings regarding the organization of
individual behaviors (even physical symptoms) via F.A.
 Why might self-report and behavioral assessments not
overlap? What does this mean?
 Recall behavioral reactivity phenomenon – change in
behavior as a function of its assessment
Physiological measures
“Some people want to fill the world with silly
physiological measures. And what's wrong with
that?” (McCartney et al., 1976)
Biofeedback – long history but very mixed findings
Plethysmography – changes in blood volume that
may relate to emotional changes
Pupillary responses – attraction and fear?
Polygraph – arousal related to lying?
Cognitive testing refresher
WAIS-III score interpretations for reports:
With regard to the index scores, which declines
the most with age?
Quick, it’s PS!
Which show the greatest decrements secondary
to organic dysfunction (trauma or disease)?
PS, WM, and PO: Depends on the area of the brain
that is damaged. If diffuse, then all three. If temporal
then WM, if more right hemisphere then PO.
Which is the best indicator of premorbid
VC (or subtests of vocabulary, similarities & info.)
Cognitive and personality functioning
What are meaningful ways to integrate these two
pieces of information?
What interpretations might one make for high IQ
individuals relative to low IQ individuals re: personality?
Overlap with maturity? Less complex presentations?
What PD is associated with extremist thinking (splitting),
inability to recognize subtleties?
Other implications?
Ease of use for clients, alternative test format, wider
range of responses (variability), alternative approach to
detecting pathology, difficult for client to identify socially
desirable or undesirable responding, theory based
Defensiveness strategies (see MMPI-2)?
Projective test/technique
MMPI/MMPI-2 is most frequently used test
in inpatient settings
Rorschach & TAT are not too far behind
Advantages of projectives?
Disadvantages of projectives?
Administration and scoring is generally less
standardized so reliability and validity are
Minimal criteria for a test
 Standardized administration
Rorschach has numerous administration procedures
(Beck, Klopfer, Exner, etc.)
 Standardized scoring
Rorschach has numerous scoring approaches (Beck,
Klopfer, Exner, etc.)
 Standard of comparison for interpretations (norm group)
Minimal information with regard to representative
Exner’s scoring system
Location – part of the blot
W, D, d, S, (WS)
How common is the location (normative comparisons
from manual)
Determinant – what led to response
Form, Color, FC or CF, Movement, etc.
Evaluate form quality (normative decision based on
manual of responses). Low F+% = psychosis/poor
reality contact
Content – focus on what specifically
Human or animal, whole or detail, nature, etc.
Populars – determines normative responding
Rorschach – Exner
Exner’s (1987) scoring system involves an
attempt to increase validity by objectifying the
scoring, increasing the number of responses
(14), and standardizing the administration
This has resulted in significant improvements in
the test’s reliability and validity
In a meta-analysis, Hiller et al. (1999) found the
Rorschach (using Exner’s scoring) to have larger
validity coefficients than the MMPI-2 for studies
using objective criterion variables
Other projective “tests”
 TAT (Thematic apperception test, Murray)
 Stimuli are less ambiguous than the ink blots
 Tell a story, though little standardization re: which pictures to be used,
scoring (typically a content analysis), etc.
 Used extensively with less literate pops like children (CAT), geriatric
pops (GAT), non-English speaking individuals, etc.
 Draw-a-figure test (figure drawings)
 Person, family, house, tree, etc. – all are interpreted as you
 Minimal standardization for scoring
 Sentence completion
 Sentence stems like “Mom is”, “Life”, etc. largely scored for a thematic
 Bender-Gestalt (the same test used for neuropsychological screens)
 Copying figures and making personality interpretations
Test or technique?
Review articles and come up with an
opinion. Come ready to debate/discuss.
On Tuesday.
Assessment of malingering
 What is malingering? What must it include?
 Intentional? Awareness? Personal gain?
 Very complex phenomenon that may change over time
 e.g., A lie (or lies) that become “real/true” for the individual over time,
or a truthful statement that becomes a lie.
 Most statements can’t be categorized as one or the other, and typically
involve aspects of both
 Berry et al. (1995) suggest that faking good and faking bad are distinct
constructs (not opposite ends of the same continuum)
 Harder to detect specific faking vs. general faking
 Content nonresponsivity (CNR) – random responding, all true or all false
 Content response faking (CRF) – fake good or bad; research suggests
that these may be independent dimensions (client may fake good on
some parts and fake bad on others)
 Should always be considered (in some form) when there are
contingencies for the patient
Classifications of Misrepresentation
 Are symptoms under conscious control? Are physical/psychological
symptoms motivated by internal or external gains?
 Factitious Disorders – intentional production of symptoms (feigning)
that are motivated by internal gains
 Motivation is to assume the “sick role” as there are no external
incentives for the behavior (e.g., economic gain, avoiding legal
responsibility, etc.)
 Somatoform disorder – unintentional (i.e., unconscious) production
of symptoms for internal gains
 Malingering – intentional production or exaggeration of symptoms
(i.e., conscious) motivated by external incentives
 Lack of cooperation during the evaluation, presence of ASPD,
discrepancy between self-reported data and objective findings,
medicolegal context for referral (e.g., attorney, police, etc.)
 Note: Exaggeration rather than fabrication makes differential very
Pros and Cons of Malingering Dx
 What are the costs of labeling someone a “malingerer”
 Questions all present and future clinical presentations
 What are the limits of our measures to make this differential?
 After weighing the strength of any claim of malingering (relatively
weak given the limits of our measures) and the costs of making an
erroneous judgment, we need to act very carefully
 Use converging, independent evidence to make any determinations
 e.g., objective inventories like the MMPI-2, strong contextual factors
(i.e., to provide the motive and baserates), interview, low probability
baserates for responding (e.g., incorrect on all options when this would
be well below chance responding), and response to the evaluator’s
feedback (e.g., “Actually, you’re doing quite well” – followed by
decrements in performance)
Mind of a murderer – the Bianchi tapes
 Identify the circumstances that could be seen as contingencies for
malingering (reinforcers for malingering)
 Why would that particular malingering behavior be manifested?
 How could client have obtained the information necessary to
provide the malingering profile? Any evidence that this information
was obtained?
 Any indications of malingering in his presentation? (Be objective)
 What are some reasons why he might not be malingering?
 Predict response sets in advance of testing (vs. scoring in hindsight)
 What pattern of responses do you predict for the Rorschach?
 What pattern of responses would you predict for the MMPI-2?
 What’s your call?
Measures of malingering – Berry et al
 The pasta strainer and photo copy machine “incident”
 MMPI-2: F, F-K (note: these two indices are not independent), VRIN
(random), TRIN (all true or all false), and Fb
 Also look for discrepancies between some of your subtle and obvious
supplemental scales (though this can also just assess sophistication in
 The D scale has also been used with some success, as the items
appear to reflect a less sophisticated (popular) view of mental illness
 MCMI – evaluates random responding, low frequency responding,
willingness to disclose information, debasement (willingness to
endorse psychological problems), and desirability (unwilling to
endorse psychological problems). Also as with the D scale of the
MMPI, the well-being scale can likewise assess psychopathology
Measures of malingering – 2 continued
 CPI (Gough, 1957) – intended to assess personality in the normal
 Has 3 validity scales: good impression (faking good), communality (items with
either very high or very low endorsement frequency that assesses random
responding), well-being (assesses fake bad)
 Basic personality inventory (BPI: Jackson, 1989) contains 12 scales each
with 20 T/F items. Research is limited on its utility for this.
 Deviation scale is comparable to the MMPI-2 F scale
 Personality assessment inventory (PAI: Morey, 1991) has 344 items
 4 validity scales: Inconsistency, infrequency, negative impression management
and positive impression management
 NEO-PI-R (Costa & McCrae, 1991) – no effective validity index, so should
not be used in this context
 16 PF also lacks adequate validity measures and should not be used
Measures to specifically detect malingering
 These measures should be administered when the referral question
specifically implicates malingering and/or when there are
substantial contingencies to suggest that malingering is likely
 Structured Interview of reported symptoms (SIRS)
 Has shown some promise, though it is susceptible to acquiescence and
false positives (claiming malingering when it is not)
 The M test is a 33 item T/F test with three scales: genuine
symptoms of schizophrenia, atypical attitudes not characteristic of
mental illness, and bizarre and unusual symptoms rarely found in
mental illness
 Showed some ability to differentiate patients from directed malingerers
and from suspected malingerers (Note: The problem with using the
latter criterion group as there is no definitive knowledge about those
Measures to specifically detect malinger. - 2
 Test battery approach including WAIS-III and the MMPI-2 – the
more tests administered, the harder it is to present a consistent
 This approach should use baserates for incorrect responses as the
primary means of classifying (see also TOMM)
 Provide response options (typically no more than two) such that a
chance correct criterion can be calculated (e.g., 50% for a two-option
version) – this should be no lower than 30% to avoid floor effects
 Track responses over at least 30 trials (the more the better as this
minimizes chance outcomes).
 Calculate the probabilities for deviations from .50 correct and apply it
to client’s correct response rate (i.e., what are the odds that they
would have missed as many as they did if they were truly guessing)
 Evaluate responsiveness to your feedback (e.g., “You’re actually not
doing that bad” vs. “Most people with your type of injury do better”)
 If less sophisticated malingering there will be an immediate and
relatively large response to your comments
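The below-chance logic described above can be sketched in Python. This is a minimal illustration, assuming a two-option forced-choice task with independent trials and a chance rate of .50. The function name `prob_at_most` and the example score (8 correct out of 30) are hypothetical and are not taken from any published test manual.

```python
from math import comb


def prob_at_most(correct: int, trials: int, chance: float = 0.5) -> float:
    """Probability of getting `correct` or fewer items right by pure guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Cumulative binomial probability: sum P(X = k) for k = 0..correct
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct + 1)
    )


# Hypothetical client: 8 correct out of 30 two-option trials.
p = prob_at_most(8, 30)
print(f"P(8 or fewer correct by guessing) = {p:.4f}")
```

A probability this small (under .01 in the hypothetical case) means that someone who was truly guessing would rarely miss that many items, which is the sense in which performance is "well below chance". It is one piece of converging evidence, not a determination on its own.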
Who is your client?
 Why is this question important in addressing the malingering issue?
 If the suspected malingerer is your client who is undergoing
therapy with you (or someone else) to whom is your obligation and
what are the costs/benefits of undertaking an evaluation of
 Does it help the therapeutic process? Focus on why one might be
deceptive to better understand client’s behavior
 If the “client” is the court, then to whom is your obligation and
what are the costs/benefits of undertaking an evaluation of
 Question now is to determine if client is being deceptive/evasive.
Assessing psychopathic personality
 Psychopathic personality = behavior characterized by remorseless
and callous disregard for others and a chronic antisocial lifestyle.
Thus, most ASPDs are not necessarily psychopathic.
 Drawing data from various sources (at least three)
 In person interview
 Testing
 Independent historical information (anything that is not self report – it
is important to note that other official records are not necessarily based
on anything other than self-report)
 Although all three of the above are important in order to provide
converging evidence, the test data will be the strongest tool in
court (due to its psychometric strengths)
Assessment (Meloy & Gacono, 1995)
 The Psychopathy checklist – revised (Hare, 1991) – 20 item test
with a 3-point (0–2) rating scale response format. Largely intended for
males (little data on females)
 To be completed by the clinician after a clinical interview and review of
historical data (includes descriptors falling under a single dimension of
psychopathy) e.g., impulsive, irresponsible, shallow emotions, etc.
 Items must be scored in a particular sequence, with more structured
items first, followed by the least structured items (with the former
contributing to the latter)
 Cutoff score of 30 or greater to define psychopathy, with higher scores
denoting more extreme presentations
 Adequate reliability and validity, though note the overlap between
some of the validity criteria and the info used to determine the score
(e.g., extent of criminal record is used for both)
Assessment (Meloy & Gacono, 1995) – p. 2
 The Rorschach – should still pursue the minimum number of
responses (14 or more) as suggested by Exner (1986)
 Include an assessment of defenses and object relations (both of which
appear to have modest reliability) that suggest more narcissism
(self-references), violations of boundaries, etc. in the psychopathic
personality (specific ratios from Exner’s scoring system are described)
 MMPI-2 – primary focus is on scale 4 (also content subscales drawn
from 4 – be cautious with the latter)
 If administering scale 4 alone, note that you will not have the benefit
of the K correction. Thus, scores will be suppressed.
 L and F will also predict psychopathy (tendency to be untruthful)
 Cognitive abilities (e.g., WAIS-III) are unrelated to the presence of
psychopathy, but may be informative as to the nature of the
presentation (e.g., level of sophistication, concordance with
traditional/normative concepts of intelligence, etc.)
Integrity testing
 Evaluating integrity as a trait, whereas such behavior may be
situation specific (e.g., someone who would not lie in interpersonal
settings might not hesitate to cheat on their taxes).
 Characterological view of integrity downplays situational factors
 Integrity is a very broad concept that can include diverse responses
(e.g., passive vs. active lying, cheating vs. theft, etc.)
 Early paper and pencil tests were validated with the polygraph
 Employed in low end entry jobs when people have to interact with
money (retail, financial services, etc.)
 Today, such tests attempt to predict a wide range of behaviors
including violations of work rules, fraud, absenteeism, etc.
Integrity testing – p. 2
 Overt integrity tests – evaluate beliefs about the incidence of theft
and other counterproductive behaviors, punitive attitudes towards
theft, endorsement of common rationalizations for theft, and direct
questions about one’s own involvement in such activities.
 Personality oriented measures – much broader than integrity tests
and tend to have lower face validity (e.g., high conscientiousness
on the NEO)
 Clinical measures like the MMPI – validity scales
 All are difficult to validate because the behavior we are trying to
predict goes largely undetected. So if a high test score is not followed
by detected misconduct, it could mean either a false positive or
someone who was simply not caught
The polygraph test
 Measures physiological arousal that is presumed to be associated
with lying. e.g., perspiration as indicated by galvanic skin response,
brain activity suggesting arousal, etc. to the question (not answer)
 Is this assumption reasonable?
 Confounds?
 Under what circumstances can lying not be associated with arousal?
Habituation effect from repeated lying?
Lack of awareness of the lying? (issue of conscious vs. unconscious)
 What is the best way to quantify arousal? Should we evaluate this
normatively or ipsatively?
 Control Question Test (CQT) – compares relevant questions to
control questions which are intended to elicit a strong physiological
response from innocent subjects (e.g., “Prior to 1993, did you ever
do anything that was illegal or dishonest?”)
 While innocent people know they didn’t commit the crime, they are
either uncertain or lying about the CQ. Guilty persons should not
respond as much to the CQ
The polygraph test – p. 2
 Criticisms of the CQT
 Difficult to develop good control questions that will produce similar
responses relative to relevant questions for innocent people. This
results in many false positives (Note: Bias for positive outcome is why
most of these tests have artificially high success rates in forensic
settings – most are guilty)
 CQ are designed for each individual, so standardization is compromised
 Direct Lie Control Test (DLCT) – if person answers a question
truthfully, they are asked the same question again and told to lie
this time (a known lie for comparison)
 Can be standardized and the power of the DLCT is from the instruction
(which is standardized) not the content of the question
 Can reduce the rate of false positives and generally does better than
the CQT
 Initially employed absolute standards for arousal = lying and this
was not at all effective
The polygraph test – p. 3
 The guilty knowledge test (GKT) – not designed to detect
deception, rather it tries to differentiate between those who have
knowledge about a particular event (crime) and those who do not
(the innocent)
 The concealed information test (CIT) – is similar to the above
approach and likewise tries to assess familiarity with specific
information as opposed to lying
 Both of these approaches have the advantage of asking the exact
same questions of all individuals and comparing responses both
within and between subjects
 Minimal data on these approaches, as the bulk of the research is on
the CQT
Does it work?
 Honts (1994) reviewed the literature on the effectiveness of the
polygraph and found that it does about as well as chance in
experimental settings. Most of the reviewed research uses the DLCT
 In real life and experimental settings, the majority of errors are
false negatives (saying someone is innocent when they are guilty)
 Most deceptive individuals (up to 95%) are misclassified
 Because the cost of a false positive (saying someone is guilty when
really they are innocent) is deemed to be higher in our legal
system, the cutoff scores (criteria) have been altered so
as to make false negatives more likely
 Why does it fail?
 If high arousal to control questions, then more difficult to discriminate
 Idiosyncratic responses to lying
Admissibility of the polygraph (Saxe & Ben-Shakhar, 1999)
 Courts have almost universally rejected the polygraph, though this
question has been and continues to be litigated extensively
 Courts are increasingly being made responsible for evaluating the
merits of test data, despite lacking the expertise to do so.
 Note: The literature has become increasingly discrepant in its view on
the polygraph (disagreement on its validity even in the scientific
community)
 What criteria should be used to evaluate this information and what
should we tell the courts?
 History
 Marston (1917) used a blood pressure cuff to determine truthfulness
(arousal) in a defendant (Frye), based on the assumption that while
truth requires little or no energy, lies do – rejected by the courts
History of the Polygraph
 Note the court’s use of the term “experimental” as “not well
established evidence”
 The Frye ruling adequately reflects the courts’ treatment of the
polygraph even today, though now based on the Federal Rules of
Evidence (FRE) which require that the evidence (polygraph or
otherwise) be relevant and that it aid the jury (i.e., be valid).
 Daubert (1993) was based on the FRE and highlights 4
considerations when ruling on evidence:
 Testability or falsifiability (see Popper and the method of science)
 Error rate
 Peer review and publication
 General acceptance
 This basically requires juries & judges to evaluate scientific issues
History of the Polygraph – p. 2
 In trials like Daubert, scientists with opposing views on the
polygraph present their views and the jury must decide on the
merits of their arguments
 Generally there has been no legal distinction between the concepts
of reliability and validity (you can see where this is going, since, from a
scientific standpoint, reliability limits validity)
 An additional problem with these concepts is that the data is
collected as a series of discrepancy scores and these are then
summed to reflect a qualitative assessment of truthful, deceptive,
and inconclusive. Thus, very different discrepancy readings might
still result in similar qualitative assessments.
 Two accepted approaches for reliability are:
 Test the same person twice on the same issue using the same
polygraph technique with 2 different testers
 Test the person once, but have the chart scored by two different
History of the Polygraph – p. 3
 The latter approach deals only with the error involved in chart
scoring and ignores (or equates) administration error
 The real issue is whether the procedure as a whole is reliable (e.g.,
the creation and administration of control questions), thereby
getting at internal reliability (do different parts of the test agree),
test retest reliability (different administrations of the test agree),
inter-rater reliability (different test administrators agree as to the
 Note: There are practical limitations to how often the “same” test could
be given to the same individual
 What little data exists on reliability focuses only on the between
examiners approach (inter-rater reliability), where reliability is
reasonable (not high). Thus, the reliability of the procedure as a
whole remains unevaluated (major limitation)
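Agreement between two examiners scoring the same charts is often summarized with a chance-corrected index. The sketch below uses Cohen's kappa, a standard statistic that is an assumption here, since these notes do not say which index the polygraph studies report. The chart calls (T = truthful, D = deceptive, I = inconclusive) are made-up data for illustration only.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Chance-corrected agreement between two raters on the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of charts")
    n = len(rater_a)
    # Proportion of charts on which the two examiners made the same call
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each examiner's marginal call rates
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical calls by two examiners on the same eight charts.
examiner_1 = ["T", "T", "D", "D", "I", "T", "D", "T"]
examiner_2 = ["T", "D", "D", "D", "I", "T", "T", "T"]
print(f"kappa = {cohens_kappa(examiner_1, examiner_2):.2f}")
```

Note that this only quantifies agreement in chart scoring. It says nothing about whether two examiners would construct and administer comparable control questions, which is the part of the procedure the notes identify as unevaluated.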
History of the Polygraph – p. 4
 Because the courts do not distinguish between reliability and
validity, the minimal reliability that does exist carries far more
weight than it should.
 Modern views of validity highlight the integrative component of
validity (recall Messick, 1995), though to evaluate it, it is necessary
to consider different aspects separately
 Different types of validity are more relevant depending on the
question at hand
 e.g., predictive validity for integrity testing in job placement/hiring, vs.
criterion validity being more relevant for determining truth/lying
 Construct validity gets at the theoretical issue of what is a lie. Is it a
situational phenomenon or a trait? Can it be represented by
physiological responding? Etc.
 No theory to explain why a stronger response should occur for lies vs.
History of the Polygraph – p. 5
 Similar physiological responses to lying appear to occur for
experiences such as surprise/novelty
 Note: For the CQT, questions about the crime are expected to be well
rehearsed for the criminal
 Thus, they have questionable construct validity (not necessarily
measuring what they propose to measure)
 Under-represents the construct of interest and over-represents
irrelevant constructs (surprise, stress, etc.)
 What criterion can be used?
 Outcome of a trial? If the case is dismissed?
 Do either of these assure that we know the client’s status re: lying?
 Note also that a true evaluation of the polygraph would mean that
the examiner only has access to the polygraph data (that’s never
the case).
History of the Polygraph – p. 6
 The criterion and predictor are rarely independent.
 e.g., if the polygraph is used to get a confession and the confession
helps get a conviction, then by definition, the polygraph is part of the
criterion (polygraphs are frequently used to get confessions)
 Experimental criteria for the polygraph generally lack external
validity (is lying in an experiment = to lying in a crime involving
yourself? That is, are all types of deception equal?), while real life
evaluations of the polygraph lack experimental rigor and control
(e.g., only a subset of them will ultimately have a clear outcome
regarding deception and this may not be representative of all
 The CQT assumes that you can create similar “control” questions.
 Do deceptions involving different types of crime result in the same
physiological response?
Issues in assessing alcohol/substance abuse
 Recognition of dual diagnosis (vs. assuming all other problems are merely
secondary to the addiction) – How can we address this?
 Timing of assessment remains an important concern as this can dramatically
alter the outcome- When is the optimal time to assess?
 Patterns of use/abuse and general categories (e.g., stimulants, sedatives,
etc.) of use may be important to assessment and intervention
 Also some drugs may be used to offset the deleterious effects of other drugs
 Context in which use typically occurs may help in identifying triggers and high
risk settings for potential relapse – Examples of assess & tx?
 Motivation for seeking treatment is likewise a critical component to evaluating
the patient – Why? How would you assess and tx differently?
 e.g., legal motivation, social/family pressure, work requirement, etc.
 May require different test features to identify those still using as opposed to
those who have used before but are not now using
 The outcome of research in this area varies greatly as a function of how use is
defined (any use, quantity/freq, problem behaviors, combos., etc.)
 May identify different pops (e.g., those with liver damage vs. those losing jobs)
Specific measures to assess alcohol and
drug abuse
 The MMPI-2 has 2 items (264 “I have used alcohol excessively” &
489 “I have a drug or alcohol problem”) that directly assess use,
but the small number of items limits their psychometric properties.
 These items each appear to identify very different groups
 Sensitivity (how well the test identifies those who abuse alcohol) of
approx. 80% for males and 75% for women
 Specificity (how well the test identifies those who do not abuse alcohol)
ranges from 53% to 95% for men and from 76% to 97% for women
(varying on the item and race of the respondent)
 Because the lifetime prevalence base rates for use in the population
are 8% for women and 16% for men, it is difficult to improve on
the base rate of non-use (84% or more)
 Other measures include the MAST and the CAGE – what do you
know about these?
 Both have problems identifying female substance abusers (they were
developed for and validated on, men)
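The base-rate point on this slide can be made concrete with a short calculation. This is a rough sketch using the figures quoted above for men (sensitivity of about 80%, specificity between 53% and 95%, base rate of 16%). Treating the lifetime prevalence as the base rate in the tested sample is a simplifying assumption, and the function name `screening_summary` is hypothetical.

```python
def screening_summary(sensitivity: float, specificity: float, base_rate: float):
    """Return (overall accuracy, positive predictive value) for a screen."""
    true_pos = sensitivity * base_rate                # abusers correctly flagged
    false_pos = (1 - specificity) * (1 - base_rate)   # non-abusers wrongly flagged
    true_neg = specificity * (1 - base_rate)          # non-abusers correctly cleared
    accuracy = true_pos + true_neg
    ppv = true_pos / (true_pos + false_pos)
    return accuracy, ppv


BASE_RATE_MEN = 0.16    # from the slide
SENSITIVITY_MEN = 0.80  # from the slide (approximate)

print(f"Call everyone a non-abuser: accuracy = {1 - BASE_RATE_MEN:.2f}")
for specificity in (0.53, 0.95):  # low and high ends of the range on the slide
    accuracy, ppv = screening_summary(SENSITIVITY_MEN, specificity, BASE_RATE_MEN)
    print(f"specificity {specificity:.2f}: accuracy = {accuracy:.2f}, PPV = {ppv:.2f}")
```

At the low end of the specificity range the item is less accurate overall than simply classifying everyone as a non-abuser, and only at the high end does it improve on that baseline, which is why a low base rate is hard to beat.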
Specific measures to assess alcohol and
drug abuse: MMPI-2 scales – p. 2
 MacAndrew Alcoholism scale – (from the MMPI-2) is best for identifying
white males who have a propensity for polydrug abuse. It has a sensitivity
of approx. 70-75% and 20% false negatives.
 Very high false positive rate for black males, little data on females and
adolescents, and lower hit rates for psychiatric patients
 Addiction Admission scale (also from the MMPI-2) – acknowledgment or
denial of substance abuse problems
 Low reliability
 Addiction Potential scale (also from the MMPI-2) – personality features
associated with use
 Low reliability
 MMPI-2 profiles associated with use: 2/4, 4/2, 2/7, 7/2, 9/4, 4/9,
 Just males: 1/2, 2/1
 Just females: 3/4, 4/3, 6/4, 4/6, 8/4, 4/8
 Code types account for 25-35% of alcoholics & they don’t differ on tx success
Issues in alcohol/drug assessment
 Is there any utility in identifying substance abusers who are
doing so covertly or who don’t believe they have a problem?
 Drawbacks: Treatment generally requires the client’s willing consent, so
why bother identifying anyone other than those who acknowledge use?
This is consistent with the most widely used model, AA.
 Some benefits: Accuracy of other diagnoses, as use can alter
presentation of other symptoms, it can make some medication
treatments undesirable due to interaction effects, it could bring a
problem to a higher level of awareness for the client, etc.
 Utility in administering a measure for some clients as it can serve as a
standard (vs. an opinion) to the lay person, that allows for a
normative evaluation
 * Research suggests that exposure to norms can not only help with
assessment, but also recognition of problem drinking
 Use, in and of itself is considered problem use for an
alcoholic from an AA perspective. What factors are relevant
from a CD perspective?
Legal/ethical issues in assessing children
 Three components of “consent” for testing (anyone)
 Knowledge – what will be done, why, and how
 Voluntariness – absence of coercion; a child alone can’t do this, but
they are usually asked for assent
 Competence – parents must be legally competent and must be the
child’s guardians to give consent for child
 Also you are ethically (though not legally) bound to tell the parents
of potential risks from testing (e.g., what test scores can be
used for – such as being grounds to deny entry to a special
education program)
 Child is not likely to be the one who asked for testing. So are they
the client? If not, who is?
 Legal issues abound for intelligence testing, but there have been
few precedents for personality assessment. Why?
Legal precedents
 Griggs v Duke Power Company (1971) – job testing
 Hobson v Hansen (1967) – racial disparity (problems with standardization &
norms; assessed present skills rather than innate ability)
 Larry P. v Riles (1972) – culturally biased IQ tests for EMR determination
 PASE v. Hannon (1980) – reversed the Larry P. decision based on the fact that
EMR determinations were based on more than just IQ testing (any thoughts on
the item by item review by the judge?)
 Lora v Board of Education City of New York – use of TAT, Rorschach, &
Bender-Gestalt to label minority children as emotionally disturbed (vague def. for latter)
 Note: Most personality tests are administered voluntarily. Test validation issues:
 Tests must be validated for the purpose for which they are being used
 Tests must be reliable for the pop being used, and appropriate norms must
exist for that pop.
 The tests must be capable of generating appropriate decisions for that pop
(i.e., validity)
 Note: many personality tests were developed for adults and co-opted for
children. Which of the above issues is most affected?
Demers (1986) on testing
 Although there are few legal challenges of personality tests, these
measures do tend to have more problems with reliability and
 Little to no evidence for gender or racial bias in personality testing
 Also, most personality tests are administered in a voluntary context
 Test validation issues:
Tests must be validated for the purpose for which they are
being used
Tests must be reliable for the pop being used, and appropriate
norms must exist for that pop.
The tests must be capable of generating appropriate decisions
for that pop (i.e., validity)
Providing feedback to clients
 APA requires that feedback be provided after testing, but it must be
in a form that they can understand (varies depending on the client)
 This can be best accomplished through an overview of the findings and
then a Q & A session.
 The feedback should provide a clear path to treatment goals
 Consider anything that is assessed as representing a continuum,
such that any characteristic will be shared by some portion of the
 Terminology such as unique and different can be substituted for
“abnormal”, “deviant”, or “pathological”
 Client need not agree with your feedback. Objections can be used
to clarify findings and as a starting point for the intervention
 Have client summarize info back to you
Providing feedback to clients - p.2
 Feedback should also include information on the tests themselves
(validity and reliability) in language that can be understood by the
 General psychometrics can be used to enhance the credibility of the
test e.g., “The MMPI has been used for over 50 years by clinicians and
it is one of the most widely used tests. Many research studies have
been done to show that it is pretty consistent in the scores it produces
and that it works pretty well at predicting behaviors.”
 This issue may be further complicated when giving feedback to
those with limited cognitive abilities, but a more detailed account
can be provided to those who have legal guardianship
Providing MMPI-2 feedback to clients
 Empirical evaluation of getting MMPI-2 feedback
 Compared college students who received MMPI-2 feedback with those
who received attention but no feedback
 The former showed increased self-esteem, immediately & after 2 weeks
 Decreased symptomatic distress, immediately and after 2 weeks
 Why would this occur?
 Nature of the client population? (higher functioning, therefore feedback
is likely to be generally positive?)
 Selective sampling? (Those seeking out personality evaluations are
wanting feedback and are more likely to construe it positively?)
 When initially meeting with clients and discussing the testing and
the eventual feedback you will be able to differentiate those who
will be most/least receptive to the feedback
 Highlights the importance of having the client arrive at the decision to
Things to note in your report
 Approach to testing and the consequences for reliability/validity
 Denial of problems, Evasiveness, Minimizes problems and conflicts,
resistant, hard working, honest, etc.
 Consequences for each of these approaches on all testing?
 Cognitive Functioning
 Impaired concentration, memory, reality testing, actual WAIS-III IQs,
ability to understand material
 Consequences for cognitive problems on other tests?
 Affect/mood/emotional control
 Depressed, dysphoric, flat, labile, manic, agitated, blocks strong affect,
feels threatened, poor control over emotions
 Consequences for mood problems on other tests?
Things to note in your report – p. 2
 Areas of conflict
 Need for control through autonomy, need for control, conformity to authority,
resentment towards authority, preoccupation with violence and anger, impulse
 Consequences for these traits on test results?
 Intra- and interpersonal coping strategies
 Social discomfort, suspicious, judgmental, dominant, submissive, self-confident,
aggressive when frustrated, rigid, task oriented, distant, aloof, etc.
 Consequences for these?
 Diagnostic impression
 All AXES, severity, remission (on the rise/fall), prognosis with regard to
likelihood and extent
 Recommendations
 Immediate therapy, individual, family, couples, or group, remove from stressful
situations, hospitalization needed, danger to self or others, need to expand
social relationships, learn to better express emotions, anger management, social
skills training, etc.
DSM-IV codes
 The parenthetical term “(provisional)” may follow a diagnosis to indicate
a significant degree of diagnostic uncertainty
 The phrase “rule out” is used to denote other diagnoses that should be
considered and that are still to be ruled out.
 The numeric code should follow the AXIS number and then the formal
name of the disorder should be listed.
 e.g., AXIS I: 295.40
Schizophreniform disorder (Provisional, rule out
Organic Delusional Disorder), with(out) good
prognostic features.
 Numeric codes from the DSM are matched to the ICD (International
Classification of Diseases) codes to allow for international compatibility.
 Recording procedures: e.g., Major Depressive Disorder
 AXIS I: 296.34 - 4th digit is either 2 (single episode) or 3 (multiple)
-5th digit is severity: 1 = mild, 2= moderate, 3 = severe without
psychotic features, 4= severe with psychotic features,
5= partial remission, 6= full remission
 4th and 5th digits typically apply to most recent or current episode
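The recording rule for Major Depressive Disorder can be written out as a small lookup. This is only a sketch of the digit scheme as listed on this slide, not a full DSM-IV coder: the function name `decode_mdd` is hypothetical, and any digit the slide does not list is rejected rather than guessed.

```python
# 4th digit: episode pattern; 5th digit: severity / remission status (per the slide)
EPISODE = {"2": "single episode", "3": "multiple episodes"}
SEVERITY = {
    "1": "mild",
    "2": "moderate",
    "3": "severe without psychotic features",
    "4": "severe with psychotic features",
    "5": "partial remission",
    "6": "full remission",
}


def decode_mdd(code: str) -> tuple[str, str]:
    """Split a 296.xx Major Depressive Disorder code into its two specifiers."""
    prefix, _, digits = code.partition(".")
    if prefix != "296" or len(digits) != 2:
        raise ValueError(f"not a 296.xx code: {code!r}")
    fourth, fifth = digits
    if fourth not in EPISODE or fifth not in SEVERITY:
        raise ValueError(f"digits not covered by the slide's scheme: {code!r}")
    return EPISODE[fourth], SEVERITY[fifth]


print(decode_mdd("296.34"))
```

For the example on the slide, 296.34 decodes to multiple episodes, severe with psychotic features.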
DSM-IV codes - continued
 Recording procedures: e.g., Bipolar I disorder
 AXIS I: 296.44
- 4th digit is 0 (single episode). For recurrent episodes,
it’s 4 if current or most recent episode is hypomanic or
manic, 5 if depressive, 6 if mixed, 7 if unspecified.
-5th digit is severity: 1 = mild, 2= moderate, 3 = severe
without psychotic features, 4= severe with psychotic
features, 5= partial remission, 6= full remission, 0 =
unspecified (except for hypomanic where 5th digit is
always a 0, and unspecified, where there is no 5th digit)
 For Bipolar II, the 4th digit coding is the same, but do not use the 5th
digit code as it is already specified as 9.

MMPI (Hathaway & McKinley, 1943)