Designing metadata for resource discovery
Deirdre Kiorgaard
Chair, Joint Steering Committee for the Development of RDA
ACOC seminar, October 2008
Cataloguing myths and legends
We can no longer catalogue everything
The catalogue has lost its central place
Brave new world?
The power of the search engine
Next generation catalogues
People have the power
“Will keyword searching and relevance ranking alone
suffice? Neither Google nor Microsoft seems to think so.
In their mass digitisation projects they are already reusing
the catalogue records created for the printed originals.”
(Danskin, 2006)
Google users
“Sure, Google is great. I use it everyday and there is a
good chance you do too, but their algorithms are not
perfect, and sometimes your results are not quite what you
were looking for. Well, that’s where people-powered
search comes in. Search results that have been provided
or filtered by humans. The idea is that if a person is
deciding what results you see rather than a computer, your
results will be closer to what you are looking for rather than
a big list of all possible related links.” (Gold, 2007)
The forgotten thrill of cataloguing
Social tagging
Cataloguing sites
New basics (1)
Decide what we want to provide access to
Keep in mind the ‘long tail’
“As Antiques Roadshow demonstrates each
week, you just never know what people
will value in the future.” McKinven (2002).
New basics (2)
Create (or source, etc.) the data
- copy cataloguing
- CiP data
- text scanned from the resources
- metadata from the creators of online resources, information from publishers
The difference between
and this
Paradise lost or paradise regained?
Navigation and relationships
Controlled forms of name
Preferred names for works
Carefully crafted subject vocabularies
The failure of the opac
“The OPAC has tended to favour an increase in the number
of access points over the effective presentation of the
relationships between resources. … It has been the failure
to exploit the navigational potential of this rich metadata
that has given the OPAC such a bad name.” Danskin,
RDA and relationships
Preferred titles for works and expressions
Links between the FRBR group 1 entities
Relationships among works, etc.
Relationships between works, etc., and their creators, etc.
Relationships between persons, families and corporate bodies
The (not so) secret life of catalogue data (1)
“metadata increasingly appears farther and
farther away from its original context”
Shreeves, Riley and Milewicz (2006).
Library catalogues
Shared library databases
Digitisation projects
Institutional repositories
The (not so) secret life of catalogue data (2)
The GLAM sector?
Galleries, libraries, archives and museums
The Internet
Catalogue records have jumped the fence
Leading to:
Services based on data aggregations
Sharing of library data with other sectors
Exposure of library data to the internet
Making data shareable (1)
Designing data for current and future services
Humanly understandable
Understandable outside of its original context
The “On a horse problem”
Understandable outside of its original language
Making data shareable (2)
Machine processable
Free from errors
Clean, consistent and appropriately granular
Use identifiers
“The successful use of information technologies used for
purposes of communication requires far more
standardization than human beings need for
interpretation and use.” Bade (2007)
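As one small illustration of "free from errors" and "use identifiers", the sketch below checks an ISBN-13 check digit before the identifier is relied on for matching. The function name is hypothetical and the check covers only the standard ISBN-13 weighting (alternating 1 and 3, total divisible by 10); it is a sketch, not a full validation routine.

```python
def is_valid_isbn13(raw):
    """Return True if raw is a 13-digit ISBN with a correct check digit.

    Hyphens and spaces are ignored. Hypothetical helper for illustration.
    """
    cleaned = raw.replace("-", "").replace(" ", "")
    if len(cleaned) != 13 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    # Weights alternate 1, 3, 1, 3, ... across all 13 digits.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(cleaned))
    return total % 10 == 0


# A mistyped final digit is caught by the machine, where a human reader
# would probably not notice it.
ok = is_valid_isbn13("978-0-306-40615-7")
mistyped = is_valid_isbn13("978-0-306-40615-8")
```

A check like this catches only transcription errors in the identifier itself; it says nothing about whether the identifier is attached to the right resource.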
RDA and sharing data
Whose standards?
“Standards are like toothbrushes; everyone
agrees they are a good idea, but nobody
wants to use anyone else’s.” Baca (2008)
Library standards
Digital library standards
Cultural institutions
Sharing standards (1)
Mappings and crosswalks
MARC Mappings
- MARC 21 to MODS
- MODS to MARC 21
- Dublin Core to MARC 21
- MARC 21 to Dublin Core
- Digital Geospatial Metadata to MARC
- MARC to Digital Geospatial Metadata
- MARC Character Sets to UCS/Unicode
- ONIX to MARC 21
U.S. National Level Requirements
- MARC 21 Authority
- MARC 21 Bibliographic
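A crosswalk of the kind listed above is, at its simplest, a lookup table from elements in one schema to elements in another. The sketch below is illustrative only: the mapping is a small hand-picked subset and not the published MARC 21 to Dublin Core crosswalk, the record shape (a list of tag, subfield code and value) is a simplifying assumption, and the sample record is made up.

```python
# Illustrative subset only: (MARC tag, subfield code) -> Dublin Core element.
CROSSWALK = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("650", "a"): "subject",
    ("020", "a"): "identifier",
}


def marc_to_dc(record):
    """Map a simplified MARC-like record to Dublin Core elements.

    record is a list of (tag, subfield_code, value) tuples, a hypothetical
    shape chosen for this sketch rather than a real MARC parser's output.
    """
    dc = {}
    for tag, code, value in record:
        element = CROSSWALK.get((tag, code))
        if element is None:
            # No target element in the table: the detail is dropped.
            continue
        dc.setdefault(element, []).append(value)
    return dc


# Made-up sample record for illustration.
sample = [
    ("245", "a", "An example title"),
    ("100", "a", "Example, Author"),
    ("260", "c", "2008"),
    ("300", "a", "xii, 200 p."),
]
result = marc_to_dc(sample)
```

The physical description (tag 300) has no target in this table and is silently dropped, which is the usual cost of mapping a richer schema onto a simpler one.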
Sharing standards (2)
Switching schemas and
“Crosswalks, derivatives, hub and spoke
models, and application profiles respond to
the need to identify common ground in the
complex landscape of resource
description. But these objects also imply an
unresolved tension between the need to
minimise proliferation of standards and the
need to create machine-processable
descriptions of resources.” (Godby, Smith
and Childress, 2008).
Achieving commonality
When choosing the standards to use within the library sector:
– use existing standards where they exist
– influence the development of existing standards to cover
any perceived gaps or to address any issues
When working with other communities:
– use elements from existing standards where needed,
rather than re-inventing the wheel
– use and/or develop common vocabularies wherever possible
– use or build upon common models and principles
– make our element sets available on the web
RDA and achieving commonality
Uses external vocabularies
Jointly develops new vocabularies
Draws on standards in related
Is built on common models and principles
Born free? (1)
Data sharing not new
Shared library databases
“… OCLC are trapped in an increasingly inappropriate
business model. A model based upon the value in the
creation and control of data. Increasingly, in this
interconnected world, the value is in making data openly
available and building services upon it. When people get
charged for one thing, but gain value from another, they will
become increasingly uncomfortable with the old status quo.”
(Wallis, 2007)
“One lesson we took away from the analysis was that the
prevailing opinion in the blogosphere is that data should be
free and open. The reality is that nearly every organization
has terms and conditions for data sharing” (Calhoun, 2008,
slide 7).
Born free? (2)
Data is not free to produce
Free versus unfettered access
The problem of invisibility
“His enthusiasm had screened out an enormous
array of people, organizations, and institutions
involved in this ‘direct’ touch. The university, the
library, publishers, editors, referees, authors, the
computer and infrastructure designers, the
cataloguers and library collection managers, …
had no place in his story. When they do their job
well, they do it more or less invisibly.” (Brown &
Duguid, 2008, pp. 5-6.)
Sharing in a commercial environment
Thank you
• Bibliography and list of images are available in the written presentation
