Human Performance Studies
& the Assessment
of Telemedicine Programs
Elizabeth Krupinski, PhD
Presented at
The American Telemedicine Association Conference
April 18-21, 1999
Salt Lake City, UT
 The goal of this presentation is to acquaint
the reader with a few basic methods of
assessment for telemedicine applications.
 The emphasis is on assessment of the
diagnostic process and how it compares to
the traditional in-person patient visit.
 A handout of references is available and
references are made to other posters for
specific applications.
Rationale I
 There are numerous aspects of a
telemedicine program that need to be
evaluated in order to determine its utility and
eventual success.
 Some of the most important aspects are:
The diagnostic process (e.g., how good is the
clinician’s accuracy)
The technical design (e.g., what is the best
network design)
The economics (e.g., is it cost-effective)
Rationale II
 Perhaps of most concern to the patient and the
health care provider is the diagnostic process.
 What aspects are different from the
traditional in-person visit?
 Will these factors influence diagnostic
accuracy or the quality of the medical
advice given via telemedicine?
 Four major areas of investigation relating
to the assessment of the diagnostic process
are discussed:
 Comparison of existing technologies with
telemedicine technologies & diagnostic performance
 Workflow analysis
 User satisfaction analysis
 Communication analysis
A) Diagnostic Performance
 Some medical technologies have been
modified specifically for use in telemedicine
applications (e.g., the electronic stethoscope).
 For some applications, new technologies or
those from other areas have been put to use in
telemedicine (e.g., the digital camera for
teledermatology).
 In either case, the technology needs to be
assessed in terms of how it affects the
clinician’s diagnostic decision.
Objective Methods
Objective measures of observer performance are
preferred to assess diagnostic accuracy in telemedicine
and to compare it with diagnostic accuracy in the
traditional in-person setting.
 These investigations typically are
done in an experimental setting
and the results are generalized to
the clinical setting.
See “Assessment of diagnostic accuracy using a digital camera
for teledermatology” on Tues. in Teledermatology - Part II as an
example of an objective performance study.
The ROC Method
 The most widely used measure of diagnostic
performance is the Receiver Operating Characteristic
(ROC) method.
It is a criterion-free, parameter-free, distribution-independent measure
of diagnostic test performance.
It goes beyond simple accuracy, sensitivity and specificity since it
accounts for the trade-off between true-positive and false-positive
decisions across all decision thresholds.
The ROC curve provides a straightforward graphic representation of
performance. The area under the curve (Az) ranges from 0.5 = chance
performance to 1.0 = perfect performance.
Readers report their diagnostic decision and their confidence in the
decision. The confidence values are used to generate the ROC curves.
The ROC paradigm requires a “gold standard” (e.g., the in-person visit)
against which performance using two modalities or technologies is
compared. The “gold standard” is the “right” answer.
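As a rough illustration of how confidence ratings become an ROC curve, the sketch below sweeps a decision threshold over the ratings and computes the area under the resulting empirical curve with the trapezoid rule. Note that this is the nonparametric area, not the fitted Az value that dedicated ROC software reports. The function names, the 1-5 confidence scale and the example data are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def roc_points(confidence: Sequence[float],
               truth: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (false-positive fraction, true-positive fraction) points.

    confidence: reader's confidence that disease is present (higher = surer).
    truth:      gold-standard result for each case (True = disease present).
    """
    n_pos = sum(1 for t in truth if t)
    n_neg = len(truth) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    points = [(0.0, 0.0)]
    # Sweep the decision threshold from the strictest to the most lenient.
    for threshold in sorted(set(confidence), reverse=True):
        tp = sum(1 for c, t in zip(confidence, truth) if c >= threshold and t)
        fp = sum(1 for c, t in zip(confidence, truth) if c >= threshold and not t)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_area(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the empirical ROC curve (0.5 = chance, 1.0 = perfect)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical example: 8 cases rated on an assumed 1-5 confidence scale.
truth = [True, True, True, True, False, False, False, False]
camera_ratings = [5, 4, 4, 2, 3, 2, 1, 1]
print(trapezoid_area(roc_points(camera_ratings, truth)))
```

Running the same two functions on the ratings from each modality gives two areas that can then be compared.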
The ROC Curve
[Figure: ROC curves plotting True-Positive Fraction (vertical axis)
against False-Positive Fraction (horizontal axis).
Modality 1: Az = 0.65; Modality 2: Az = 0.85; Chance Line: Az = 0.5.]
Correlation & Concordance
 Quite often you just want to know if clinicians
make the same diagnosis in-person versus via
telemedicine.
 Contingency analysis can measure the degree of correlation between
in-person and telemedicine diagnoses. It is a variant of correlation analysis
that works on nominal data such as diagnostic categories.
 The kappa statistic is becoming a very popular statistic to measure
performance. It can be used to measure rates of agreement in diagnosis
for telemedicine versus in-person visits. Kappa also takes disagreement
rates and the agreement expected by chance into account, providing a more
accurate measure of performance than contingency analysis does. Like the
popular Chi-Square analysis, the kappa statistic is computed from a
contingency table of the two sets of diagnoses.
 For both analyses there is no “gold standard”. You are only concerned
with rates of agreement.
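A minimal sketch of an agreement analysis is shown below. It builds the contingency table of paired diagnoses and computes the unweighted (Cohen's) kappa as (observed agreement - chance agreement) / (1 - chance agreement). The function names and the diagnostic labels in the example are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def contingency_table(in_person: Sequence[str],
                      telemedicine: Sequence[str]) -> Dict[Tuple[str, str], int]:
    """Count how often each (in-person, telemedicine) diagnosis pair occurs."""
    if len(in_person) != len(telemedicine):
        raise ValueError("both lists must describe the same cases")
    return dict(Counter(zip(in_person, telemedicine)))


def cohens_kappa(in_person: Sequence[str],
                 telemedicine: Sequence[str]) -> float:
    """Unweighted Cohen's kappa for two sets of nominal diagnoses."""
    if len(in_person) != len(telemedicine) or not in_person:
        raise ValueError("need two non-empty lists of equal length")
    n = len(in_person)
    # Proportion of cases where the two diagnoses match.
    observed = sum(1 for a, b in zip(in_person, telemedicine) if a == b) / n
    # Agreement expected by chance, from the marginal totals.
    totals_a = Counter(in_person)
    totals_b = Counter(telemedicine)
    expected = sum(totals_a[c] * totals_b[c] for c in totals_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


# Hypothetical example: 10 cases diagnosed in person and via telemedicine.
in_person = ["nevus"] * 5 + ["eczema"] * 5
telemedicine = ["nevus"] * 4 + ["eczema"] + ["eczema"] * 4 + ["nevus"]
print(contingency_table(in_person, telemedicine))
print(cohens_kappa(in_person, telemedicine))
```

In this example the raw agreement is 8 of 10 cases, but half of that agreement would be expected by chance alone, so kappa is lower than the raw agreement rate.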
Subjective Methods
 Subjective methods can be used to assess
other factors that might impact on the diagnosis.
 For example, to judge the quality of a dermatologic image acquired with
a digital camera versus seeing the patient in person, the photograph could
be placed next to the patient and the dermatologist rates the
photo as better, same or worse than in-person.
 Photos taken in a variety of lighting conditions could be placed together
and the dermatologist must pick out the best one. This is known as a
forced-choice design.
 Dermatologists could rate the digital photos as excellent, good, fair or
poor in terms of various parameters such as overall quality, sharpness
and color. The percent of ratings in each category can be analyzed
independently, or the ratings can be correlated with each other, with
viewing time, and with the diagnostic decision.
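The rating analyses described above might be sketched as follows: one function tabulates the percent of ratings in each category, and another computes a Spearman rank correlation (with tied values given their average rank) between numerically coded ratings and viewing time. The numeric coding of the categories and all names here are assumptions.

```python
from collections import Counter
from math import sqrt
from typing import Dict, List, Sequence

CATEGORIES = ["excellent", "good", "fair", "poor"]


def rating_percentages(ratings: Sequence[str]) -> Dict[str, float]:
    """Percent of ratings falling in each quality category."""
    counts = Counter(ratings)
    unknown = set(counts) - set(CATEGORIES)
    if unknown or not ratings:
        raise ValueError("ratings must be a non-empty list of known categories")
    return {c: 100.0 * counts[c] / len(ratings) for c in CATEGORIES}


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length (at least 2 items)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0.0 or sy == 0.0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / (sx * sy)


# Hypothetical example: sharpness ratings coded poor=1 ... excellent=4,
# paired with viewing time in seconds for the same photos.
coded_ratings = [4, 3, 3, 2, 1]
viewing_seconds = [12.0, 15.0, 20.0, 31.0, 45.0]
print(rating_percentages(["excellent", "good", "good", "fair", "poor"]))
print(spearman(coded_ratings, viewing_seconds))
```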
B) Workflow Analysis
 It may be useful to assess the impact of telemedicine
on a clinic’s daily routine before and after
implementation of telemedicine.
 Studies of this sort are generally observational, but they
look at very specific aspects of workflow
or the working environment, so they often
involve objective measurements and can be
analyzed with traditional statistical methods.
 See for example the poster: “Assessing case turn-around times in
a university based telemedicine program”.
The Referring Site
 Some things to examine at the referring site
might be:
Who does the patient spend time with (e.g., MD, PA)?
Does the health care provider spend more/less time
with the patient if there is telemedicine?
How much time does it take to prepare a case and who
does the preparation?
Would the patient actually make the trip for an in-person visit if telemedicine were not available?
How would it impact on the patient’s routine if they had
to travel for an in-person visit?
The Consulting Site
 Some things to examine at the consulting
site might be:
Does the consultant’s workload increase with
telemedicine?
Is the consultant’s workday longer with telemedicine?
Are only certain sub-specialties seeing telemedicine
cases?
Does the consultant require more or less patient history
and other clinical information for a telemedicine case
than an in-person visit?
How far does the consultant have to go from their
office/department to get to the telemedicine clinic?
General Workflow Issues
 Some broader issues that can be examined
as well are:
How long does it take to generate a diagnostic report
and get it back to the referring physician compared to
an in-person specialty consult?
How long does it take to get a telemedicine
appointment compared to an in-person visit?
How many people are involved in setting up and
conducting a telemedicine session compared to an in-person visit?
How do all of these issues impact on business and
economic models?
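Timing questions like these lend themselves to a simple two-sample comparison. The sketch below computes Welch's t statistic and its approximate degrees of freedom for two sets of measurements, such as report turn-around times. It is one of several traditional methods that could be used, the data are invented for illustration, and a p-value would still have to be looked up in a t table or obtained from a statistics package (for example, `scipy.stats.ttest_ind` with `equal_var=False` performs the same test).

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(sample_a: Sequence[float],
            sample_b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two measurements")
    # Squared standard error of each sample mean.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0.0:
        raise ValueError("both samples are constant; t is undefined")
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical turn-around times (hours) for diagnostic reports.
in_person_hours = [72.0, 96.0, 48.0, 120.0, 80.0]
telemedicine_hours = [24.0, 30.0, 18.0, 40.0, 26.0]
t_stat, dof = welch_t(in_person_hours, telemedicine_hours)
print(t_stat, dof)
```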
C) Satisfaction Analyses
 User satisfaction is a very important aspect of
almost any endeavor and telemedicine is no
exception.
 Practically anybody involved in telemedicine can be
queried about their satisfaction, but the
three main groups typically looked at are: the patient, the
referring clinician and the consulting clinician.
 See for example the poster “Case volume, response times
and user satisfaction with a university-based teleradiology
Creating Surveys
 Creating a good survey is often harder than
one would imagine. Surveys should be:
Easy to read, using simple language appropriate for the
intended audience (e.g., at about the 8th grade level for
the general patient population).
Translated into other languages if appropriate.
Short and to the point. Surveys longer than 1 page are
less likely to be completed than a 1-page survey.
Focused on only the issues you are interested in. Do not
use a shot-in-the-dark approach and ask too many
questions hoping for something of interest to appear.
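One way to check a draft question against a target reading level is a readability formula. The sketch below applies the Flesch-Kincaid grade-level formula, 0.39 x (words per sentence) + 11.8 x (syllables per word) - 15.59. The syllable counter is a crude vowel-group heuristic chosen for illustration, so the result should be treated as a rough guide only.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Heuristic: a trailing silent "e" usually does not add a syllable.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate U.S. school grade level needed to read the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Hypothetical survey questions, one plain and one wordy.
plain = "Could you see the doctor?"
wordy = "Were the audiovisual characteristics of the teleconsultation satisfactory?"
print(flesch_kincaid_grade(plain))
print(flesch_kincaid_grade(wordy))
```

A question scoring well above the target grade is a candidate for rewording in simpler language.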
Survey Styles I
 There are many ways to ask questions. The
choice depends on what kind of information you
want to get and in what detail.
Fill-in-the-blank. This is appropriate for information that
changes a lot or has no set answer.
Patient name_____________________
Patient address____________________
Yes/No questions. This is very straightforward, but does not
allow for gradations in answers.
Were you satisfied with the telemedicine session? • Yes • No
Could you see the telemedicine doctor clearly? • Yes • No
Survey Styles II
 Likert scales. A Likert scale allows you to get a
range of responses that may better reflect subtle
differences in answers or opinions.
Were you satisfied with the telemedicine session?
• Very satisfied
• Satisfied
• Somewhat satisfied
• Somewhat unsatisfied
• Unsatisfied
• Very unsatisfied
 Open-ended questions. Open-ended questions can
yield much information, but can be difficult to
analyze because there is generally no structure to the
answers and people can interpret the questions
differently.
What did you like about the telemedicine session?
Analyzing the Surveys
 Survey return rates vary considerably.
 Return rates will be better if surveys are filled out
immediately after a telemedicine session while the
respondent is still at the telemedicine facility.
 Sampling methods may be useful if compliance is
low. For example, survey every fifth patient and put
the time into ensuring individual compliance.
 For most survey styles, the answers can be analyzed
with traditional statistical methods.
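As a minimal sketch of both ideas, the code below selects every fifth patient from a session list and summarizes Likert responses by coding them numerically. The 1-6 coding of the six satisfaction labels, the function names and the example data are assumptions made for illustration.

```python
from collections import Counter
from statistics import median
from typing import Dict, List, Sequence

# Assumed numeric coding of a six-point satisfaction scale.
LIKERT_SCORES = {
    "very satisfied": 6,
    "satisfied": 5,
    "somewhat satisfied": 4,
    "somewhat unsatisfied": 3,
    "unsatisfied": 2,
    "very unsatisfied": 1,
}


def every_kth(patients: Sequence, k: int = 5) -> List:
    """Systematic sample: the k-th, 2k-th, 3k-th ... patient in the list."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return list(patients[k - 1::k])


def summarize_likert(responses: Sequence[str]) -> Dict[str, object]:
    """Tabulate Likert answers and report simple summary figures."""
    if not responses:
        raise ValueError("no responses to summarize")
    scores = [LIKERT_SCORES[r.strip().lower()] for r in responses]
    counts = Counter(scores)
    return {
        "n": len(scores),
        "median_score": median(scores),
        # Share of respondents on the satisfied half of the scale.
        "percent_satisfied": 100.0 * sum(1 for s in scores if s >= 4) / len(scores),
        "counts": {label: counts[score] for label, score in LIKERT_SCORES.items()},
    }


# Hypothetical example: 20 numbered patients, survey every fifth one.
print(every_kth(list(range(1, 21)), k=5))
print(summarize_likert(["Very satisfied", "Satisfied", "Satisfied",
                        "Somewhat unsatisfied"]))
```

Because Likert answers are ordered categories, the median and the category percentages are usually safer summaries than a mean of the coded scores.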
D) Communication Analysis
 A very important part of the patient-clinician interaction is
communication.
 Telemedicine changes the ways that communication occurs
since the patient and clinician are no longer in the same place.
 Communication analysis is particularly important for real-time
telemedicine sessions.
 It is also important for store-and-forward sessions because the
patient is likely to have returned home by the time the consult
takes place and the referring clinician receives any feedback
from the consultant.
Methods I
 If the patient agrees, real-time sessions can be observed
live or can be video-taped for later analysis.
 Prior to the observations, you must decide what aspects
of the interaction are important. For example:
How often does the patient speak & for how long?
How often does the consultant speak & for how long?
How often does the referring clinician speak & for how long?
How many interruptions occur?
How much eye contact is made?
How often do questions have to be repeated?
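Once a session has been coded into speaking turns, several of these questions reduce to simple counting. The sketch below assumes a hypothetical coding scheme in which each turn is recorded as (speaker, start time, end time) in seconds. It tallies the number of turns and total speaking time per speaker, and counts an interruption whenever a different speaker starts before the previous turn has ended.

```python
from typing import Dict, List, Sequence, Tuple

Turn = Tuple[str, float, float]  # (speaker, start_seconds, end_seconds)


def summarize_turns(turns: Sequence[Turn]) -> Tuple[Dict[str, Dict[str, float]], int]:
    """Return per-speaker turn counts and speaking time, plus interruptions."""
    ordered: List[Turn] = sorted(turns, key=lambda turn: turn[1])
    stats: Dict[str, Dict[str, float]] = {}
    interruptions = 0
    previous = None
    for speaker, start, end in ordered:
        if end < start:
            raise ValueError("a turn cannot end before it starts")
        entry = stats.setdefault(speaker, {"turns": 0, "seconds": 0.0})
        entry["turns"] += 1
        entry["seconds"] += end - start
        # Overlap with the previous turn by another speaker = interruption.
        if previous is not None and speaker != previous[0] and start < previous[2]:
            interruptions += 1
        previous = (speaker, start, end)
    return stats, interruptions


# Hypothetical coded session (times in seconds from the start of the tape).
session = [
    ("consultant", 0.0, 10.0),
    ("patient", 8.0, 20.0),       # starts before the consultant finishes
    ("referring clinician", 20.0, 25.0),
    ("consultant", 25.0, 40.0),
]
per_speaker, interruption_count = summarize_turns(session)
print(per_speaker)
print(interruption_count)
```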
Methods II
 The field of psychology has spent years analyzing
non-verbal means of communication to try to
understand how people feel without actually asking
them. People may respond to questions positively
because they want to please the examiner. Body
language often reveals just the opposite. For example:
How does the patient/clinician sit? Slouched? Straight?
What are the arms doing? Crossed? Hanging to the side?
What are the hands doing? Folded? Tapping a pencil?
Where are the eyes? Toward the speaker? Roaming?
Facial expression - scowling or smiling?
 Traditional methods of experimentation,
observation, and analyses from the social
sciences, medical image perception,
communication theory, research design and
statistics are available to anyone to conduct
assessment investigations in telemedicine.
 A well-designed study or assessment protocol
can answer many questions and provide ways to
strengthen a program and improve the delivery
of care and user satisfaction.
 The keys to any successful assessment
protocol are:
Decide beforehand what the important questions are.
Design your study to answer those questions.
Know your study population and what types of
investigation or surveys they will tolerate.
Be prepared to change the protocol if necessary.
Utilize appropriate statistical methods whenever
possible.

Goal - University of Arizona