Focus on Circulation Snapshots
A Powerful Tool for Print Collection Assessment
Richard Entlich
Research and Assessment Librarian
Cornell University Library
ARL Assessment Conference, October 27, 2010
Libraries’ Customer Service Dilemma
• 21st century customer relationship management is all about personalization and customization
• Recommender systems
• Loyalty programs
• Affinity programs
• None of these programs can function without a lot of personal data
• Libraries have a big problem with personal data
Throwing Away Our “Customer” Data:
It’s What We Do
Cornell University Library practices on the collection, use, disclosure, maintenance and protection of personally-identifiable information
“The Library seeks to protect user privacy by purging borrowing records as soon as possible. In general, the link connecting a patron with a borrowed item is broken once the item is returned.”
ALA Code of Ethics
III. We protect each library user's right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired, or transmitted.
New York State Civil Practice Law and Rules
§ 4509. Library records, which contain names or other personally identifying details regarding the users of ... college and university libraries and library systems of this state, including ... records related to the circulation of library materials, ... shall be confidential and shall not be disclosed ...
Circulation Analysis without Personal Data
• Limited scope (what, but not who)
• Historical circulation counts tell us
• which items circulated and how often
• circulation within classes such as subject,
language, publication date, unit library
• average circulation levels
• Individual transaction archive records can reveal
• when circulation took place
• user data associated with circulation, if maintained
(often just a rudimentary status code)
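A minimal sketch of the "what, but not who" analysis described above, using only historical circulation counts. The item records, class codes, and the function name `circulation_by_class` are hypothetical and only illustrate the kind of summary that is possible without borrower data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical item records: (item ID, subject class, historical circulation total)
items = [
    ("i1", "QC", 12),
    ("i2", "QC", 0),
    ("i3", "PQ", 3),
    ("i4", "PQ", 5),
    ("i5", "Z", 1),
]

def circulation_by_class(records):
    """Return {class: (item count, total circulations, average per item)}."""
    counts = defaultdict(list)
    for _item_id, subject, circ_total in records:
        counts[subject].append(circ_total)
    return {s: (len(c), sum(c), mean(c)) for s, c in counts.items()}

summary = circulation_by_class(items)
for subject, (n_items, total, average) in sorted(summary.items()):
    print(subject, n_items, total, average)
```

The same grouping could be done on language, publication date, or unit library by swapping the field used as the key.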
Options for Getting a More Personal
Perspective on Usage
• Surveys, focus groups, interviews
• labor-intensive
• not quantitative or comprehensive
• self-selected participants
• subject to various observer and participant biases
• Grab it from the ILS, when it’s available
• create a “snapshot” of all circulating items
• opportunity to obtain borrower data
• unobtrusive and objective process
“Circulation Snapshot”
• A frozen moment in a
continuous stream of data
• A chance to bring the user into circulation analysis
• Profile the users of print
• Identify relationships between users and materials
• impact of characteristics like status, department, field of study,
and college affiliation on borrowing habits
• breakdown of subjects, languages, dates of publication by user
groups
Photo credit: jeff_golden http://www.flickr.com/photos/jeffanddayna/5067383625/
Doubts About Snapshots
• Are they valid data sources for analysis? (efficacy issues)
• A snapshot is just a blip in time
• A snapshot has rigid borders
• Is retention of borrower data acceptable practice? (“ethicacy” issue)
• A snapshot violates privacy
A Snapshot is a Random Slice of Time
Photo credit: InfiniteWorld
http://www.flickr.com/photos/infiniteworld/sets/72157601531269767/with/1161096313/
Print Collections Churn Slowly
Status of borrower     Average months out    # items    Cumulative % of items in circ
Faculty Study                        86.1       6757      4.3%
Academic Staff                       22.2      14818     13.7%
Non-academic Staff                   21.5      12764     21.9%
Faculty                              21.2      37032     45.4%
Graduate Carrel                      14.0       6315     49.5%
Internal                             11.3       2066     50.8%
Graduate School                      11.2      50831     83.2%
Other                                 9.8       1373     84.0%
Professional School                   4.0       2445     85.6%
Undergraduate                         2.6      16744     96.2%
ILL                                   1.2       2280     97.7%
Borrow Direct                         0.9       3609    100.0%
Total                                         157034
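The cumulative percentage column can be derived from the item counts alone. The sketch below recomputes it from the counts on this slide; the function name `cumulative_percentages` is illustrative, and rounding to one decimal place is an assumption about how the slide's figures were produced.

```python
# (status of borrower, number of items charged), in the slide's order:
# longest average time out first
items_by_status = [
    ("Faculty Study", 6757),
    ("Academic Staff", 14818),
    ("Non-academic Staff", 12764),
    ("Faculty", 37032),
    ("Graduate Carrel", 6315),
    ("Internal", 2066),
    ("Graduate School", 50831),
    ("Other", 1373),
    ("Professional School", 2445),
    ("Undergraduate", 16744),
    ("ILL", 2280),
    ("Borrow Direct", 3609),
]

def cumulative_percentages(rows):
    """Return (status, running share of all charged items in percent)."""
    total = sum(count for _, count in rows)
    running = 0
    result = []
    for status, count in rows:
        running += count
        result.append((status, round(100 * running / total, 1)))
    return result

for status, pct in cumulative_percentages(items_by_status):
    print(f"{status:<20} {pct:5.1f}%")
```

Read this way, roughly half of all charged items are held by the borrower groups whose loans average 11 months or longer.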
A Snapshot Shows Nothing Beyond its Borders
Photo credit: Andy Carvin
http://www.flickr.com/photos/andycarvin/1936753622/
• Two depts. have the same number of books in a subject area checked out, but one has 10x as many faculty
• The same number of books in two subject areas are checked out, but the library owns 100x as many in one as the other
The Need for Context isn’t Unique to
Snapshots
• All data has borders
• Much data needs to be considered relative to
other appropriate metrics
• Assumptions need to be checked to be sure
measures are meaningful and comparisons
are valid
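One way to supply the missing context is to divide raw snapshot counts by an appropriate denominator. The sketch below shows the arithmetic for a 10x difference in faculty size and a 100x difference in holdings. Every number in it is invented for illustration, and the function names are hypothetical.

```python
def per_capita(items_out: int, population: int) -> float:
    """Items charged per member of the borrowing group."""
    return items_out / population

def share_of_holdings(items_out: int, holdings: int) -> float:
    """Fraction of the library's holdings in a subject that is charged."""
    return items_out / holdings

# Two departments with the same raw count, but one has 10x as many faculty.
small_dept = per_capita(items_out=200, population=10)
large_dept = per_capita(items_out=200, population=100)

# Two subject areas with the same raw count, but holdings differ 100x.
small_subject = share_of_holdings(items_out=500, holdings=1_000)
large_subject = share_of_holdings(items_out=500, holdings=100_000)

print(small_dept, large_dept)        # items per faculty member
print(small_subject, large_subject)  # fraction of holdings charged
```

Identical raw counts turn into very different rates once the denominators are taken into account, which is the point of checking assumptions before comparing groups.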
A Snapshot Can Violate Someone’s Privacy
Photo credit: hjrosasq
http://www.flickr.com/photos/hjrosasq/2082311437/
Care and Forethought can Help Manage
Privacy Concerns
Photo credit: Vincent Diamante http://www.flickr.com/photos/sklathill/2255718951/
Finding Balance: Confidentiality vs. Analytical
Value
• One extreme—discard all unique identifiers
• Safest, but somewhat limits the opportunities for analysis
• Another extreme—retain unique identifiers
• Maximizes analyzability, but compromises confidentiality
• The middle ground—anonymize unique identifiers
• Balances risk and benefit
• Supports analysis of individual borrower behavior without
revealing identity
• e.g., We notice that Romance Studies faculty are borrowing
lots of physics books. Is it one borrower, or a bold new trend?
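The "one borrower, or a trend?" question can be answered with anonymized IDs alone, by comparing the number of loans with the number of distinct borrowers. A minimal sketch follows. The rows, the shortened ID strings, and the function name `loans_and_borrowers` are hypothetical.

```python
from collections import defaultdict

# Hypothetical snapshot rows: (anonymized ID, borrower department, subject)
loans = [
    ("a1f3", "Romance Studies", "Physics"),
    ("a1f3", "Romance Studies", "Physics"),
    ("a1f3", "Romance Studies", "Physics"),
    ("9c0e", "Romance Studies", "Literature"),
]

def loans_and_borrowers(rows):
    """Return {(department, subject): (loan count, distinct borrower count)}."""
    ids = defaultdict(list)
    for anon_id, department, subject in rows:
        ids[(department, subject)].append(anon_id)
    return {key: (len(v), len(set(v))) for key, v in ids.items()}

usage = loans_and_borrowers(loans)
print(usage[("Romance Studies", "Physics")])
```

In this invented data, three loans trace back to a single anonymized borrower, so the apparent trend is one person. Discarding identifiers entirely would make that distinction impossible.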
Anonymization Technique and Example
• Use a cryptographic one-way hash (e.g., MD5 or SHA-1)
• Characteristics
• irreversible
• unique input → unique output
• minor change to input → major change to output
Original user ID:
12345
ID after random transformation (extra security):
123&zQ?45
ID after transformation and SHA-1 hashing:
94D51D75B7AFBCD0F85D1844F06BE73C88B3AC1B
Photo credit: Taber Andrew Bain http://www.flickr.com/photos/andrewbain/3126113695/
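A minimal sketch of the technique using Python's standard `hashlib` module. The slide does not say how the random transformation is done, so inserting a secret string at a fixed position is an assumption made to match the example ID. The function names and default parameters are hypothetical.

```python
import hashlib

def transform(user_id: str, secret: str, position: int) -> str:
    """Insert a secret string into the ID at a fixed position (assumed scheme)."""
    return user_id[:position] + secret + user_id[position:]

def anonymize(user_id: str, secret: str = "&zQ?", position: int = 3) -> str:
    """Return the uppercase SHA-1 hex digest of the transformed ID."""
    transformed = transform(user_id, secret, position)
    return hashlib.sha1(transformed.encode("utf-8")).hexdigest().upper()

print(transform("12345", "&zQ?", 3))  # the transformed ID from the slide
print(anonymize("12345"))             # 40 hexadecimal characters
print(anonymize("12346"))             # a one-digit change gives an unrelated digest
```

Because user IDs are short and guessable, the transformation matters: without a secret ingredient, anyone could hash every possible ID and match the results. The secret should be kept apart from the snapshot or discarded once the IDs are converted.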
Basic Snapshot Recipe @ Cornell
• Query Voyager ILS for data on currently charged
books
• Obtain Human Resources (HR) data from the same point in time
• Join the Voyager and HR data on the user ID
• Merge in codebooks for HR codes
• Load resulting table into Excel
• Anonymize ID numbers and discard originals
• Create Pivot table
• Run queries as desired
• Note: Your recipe will probably differ
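The recipe above can be sketched end to end in a few lines. This version uses Python rather than Excel, which is a substitution made for illustration and not the process used at Cornell. All field names, sample rows, codes, and the salt value are hypothetical.

```python
import hashlib
from collections import Counter

# Hypothetical stand-ins for the ILS query result and the HR extract.
charged_items = [
    {"user_id": "1001", "item_id": "i1", "lc_class": "QC"},
    {"user_id": "1001", "item_id": "i2", "lc_class": "QC"},
    {"user_id": "1002", "item_id": "i3", "lc_class": "PQ"},
    {"user_id": "1003", "item_id": "i4", "lc_class": "QC"},
]
hr_records = {
    "1001": {"dept_code": "ROM", "status": "Faculty"},
    "1002": {"dept_code": "ROM", "status": "Graduate School"},
    "1003": {"dept_code": "PHY", "status": "Faculty"},
}
dept_codebook = {"ROM": "Romance Studies", "PHY": "Physics"}

def anonymize(user_id, salt):
    """One-way hash of the salted ID; the original ID is not kept."""
    return hashlib.sha1((salt + user_id).encode("utf-8")).hexdigest()

def build_snapshot(items, hr, codebook, salt):
    """Join items to HR data on user ID, expand codes, anonymize the ID."""
    rows = []
    for item in items:
        person = hr.get(item["user_id"], {})
        rows.append({
            "anon_id": anonymize(item["user_id"], salt),
            "item_id": item["item_id"],
            "lc_class": item["lc_class"],
            "department": codebook.get(person.get("dept_code"), "Unknown"),
            "status": person.get("status", "Unknown"),
        })
    return rows

def pivot(rows, row_key, col_key):
    """Count rows for each (row_key, col_key) pair, like a pivot table."""
    return Counter((r[row_key], r[col_key]) for r in rows)

snapshot = build_snapshot(charged_items, hr_records, dept_codebook, "example-salt")
counts = pivot(snapshot, "department", "lc_class")
for (department, lc_class), n in sorted(counts.items()):
    print(department, lc_class, n)
```

Changing the two keys passed to `pivot` gives other cuts of the same table, such as status by LC class.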
Ingredients in Snapshots @ Cornell
Table Source: Field Names
• Voyager Bibliographic Record: Bibliographic ID, Title, Author, Publisher, Publication Date, Publication Place, Language, ISBN, Recon flag
• Voyager Item Record: Item ID, Location, Reserve Location, Create Date, Historical Circulation Total, Historical Browses Total, Reserve Charges Total, Item Type, Lost/Missing Flag, Hold Flag
• Voyager Circulation Transaction Record: Charge Date, Current Due Date, Number of Renewals, Recall Date, Recall Due Date
• Voyager patron or patron barcode table: Institution ID, Patron Group, Special name (faculty study vs. carrel, ILL vs. Borrow Direct), Expired Flag
• PeopleSoft Human Resource database: Category, College code, Department code, Status, Affiliation
• Registrar's Office (long version of College code field): College Description
• Office of Human Resources (long version of Department code): Department description
Some Early Snapshot Applications at Cornell
• For Unit Library Review process
• Which depts/fields do each library's borrowers come from?
• Which libraries do members of affiliated depts/fields use?
• For Print Collection Usage Task Force review process
• LC class user analysis by department and graduate field
• Department/grad field usage breakdown by LC class
• Circulation time and renewals by patron status
• Other potential uses
• User breakdown by publication date (for off-site transfer
decision-making)
• Inform individual subject selectors about usage in their
domain
Somewhat Messy Process; Tasty Results
Questions?
Comments?
Acknowledgments:
Inspiration: Corey Murata and Hana Levay,
University of Washington Libraries
Help with snapshot development and
analysis at Cornell: Lydia Pettis, Pete Hoyt,
Joanne Leary
Photo credit: star5112 http://www.flickr.com/photos/johnjoh/4429337539/in/set-72157623515578367/
Camera shutter sound effect from http://www.pachd.com/sounds.html