Ontologies: What, Why, and How?
Jon Corson-Rikert, Mann Library
Metadata Working Group
4/18/03
What problems are we trying to solve?
• Problems with content
  • Inconsistency
  • Incompatibility
  • Incompleteness
  • Unboundedness
• Need for Automation
  • Discovery
  • Filtering
  • Assembly
  • Interoperability
Why consider ontologies?
• Sharing common understanding of the structure of information among people or software agents
  • Codifying domain assumptions
    · Terminology
    · Relationships
  • Reuse of domain knowledge
• Improving information retrieval success
  • Augmenting or refining search terms
    · Preferred terminology
    · Discriminating among alternative meanings (e.g., WordNet)
  • Language translation
  • Bridging across domains
The Evolution of Knowledge Management
Pre-Web | Web | Semantic Web
Books, Magazines, Articles, … | Books, Magazines, Articles, Databases, Webpages | Defined Electronic Information Elements
Libraries/Archives/File systems | Libraries/Archives/File Systems/Websites | Electronic Repositories
Bibliographic Catalogues on Cards or Computers | Bibliographic Catalogues, Machine Index Catalogues | Machine Readable Metadata Repositories
Human Indexing | Human Indexing, Machine Indexing | Machine Indexing, Human Indexing
Human reading, checking and classifying | Statistical Analysis by Machines | Semantical Analysis by Machines
Bibliographies | Bibliographies/Output from Fulltext Search Engines | Knowledge based specialized webportals
Reviews
Knowledge Mining
Thesauri, Classification Schemes, Glossaries, Ontologies
Johannes Keizer, FAO
What is an ontology? - 1
A thesaurus on steroids
• Ordered terminology
• Prescribed relationships among terms
What is an ontology? - 2
A shallow classification of basic categories
• Defines categories, and hence terminology
• Defines rules
(Soergel 1999)
What is an ontology? - 3
In information science:
A characterization, through formal, explicit
knowledge, of the intended meanings and
relationships of a vocabulary of concepts
(Gruber 1993)
What is an ontology? - 4
A formal explicit description of concepts in a domain of
discourse (classes, or concepts),
with properties of each concept describing various
features and attributes of the concepts (slots, roles, or properties)
and restrictions on slots (facets).
An ontology together with a set of individual instances of
classes constitutes a knowledge base
(Ontology 101)
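The vocabulary of this definition (classes, slots, facets, instances, knowledge base) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Slot`, `OntologyClass`, `Instance`, and `violations`, and the `Crop` example data, are hypothetical and are not taken from Ontology 101 or from any ontology tool.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Slot:
    """A property of a class; the remaining fields are its facets."""
    name: str
    value_type: type                 # facet: allowed value type
    min_count: int = 0               # facet: minimum number of values
    max_count: Optional[int] = None  # facet: maximum number of values (None = unlimited)


@dataclass
class OntologyClass:
    """A concept in the domain, described by its slots."""
    name: str
    slots: Dict[str, Slot] = field(default_factory=dict)

    def add_slot(self, slot: Slot) -> None:
        self.slots[slot.name] = slot


@dataclass
class Instance:
    """An individual member of a class; each slot holds a list of values."""
    name: str
    cls: OntologyClass
    values: Dict[str, List[Any]] = field(default_factory=dict)


def violations(instance: Instance) -> List[str]:
    """Return the facet violations for one instance (empty list if valid)."""
    problems = []
    for slot in instance.cls.slots.values():
        vals = instance.values.get(slot.name, [])
        if len(vals) < slot.min_count:
            problems.append(f"{slot.name}: needs at least {slot.min_count} value(s)")
        if slot.max_count is not None and len(vals) > slot.max_count:
            problems.append(f"{slot.name}: allows at most {slot.max_count} value(s)")
        for v in vals:
            if not isinstance(v, slot.value_type):
                problems.append(f"{slot.name}: {v!r} is not a {slot.value_type.__name__}")
    for name in instance.values:
        if name not in instance.cls.slots:
            problems.append(f"{name}: not a slot of {instance.cls.name}")
    return problems


# Hypothetical example: one class, two instances.
crop = OntologyClass("Crop")
crop.add_slot(Slot("common_name", str, min_count=1, max_count=1))
crop.add_slot(Slot("pests", str))

maize = Instance("maize", crop, {"common_name": ["maize"], "pests": ["corn borer"]})
unnamed = Instance("unnamed", crop, {"pests": [42]})

# The ontology (crop) plus its instances forms a small knowledge base.
knowledge_base = [maize, unnamed]
for inst in knowledge_base:
    print(inst.name, violations(inst))
```

The first instance satisfies every facet; the second is reported for a missing required value and a value of the wrong type.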
Ontologies have …
• Concepts
• Relations between concepts
  • Synonyms
  • Class/subclass (broader/narrower; dog is to mammal)
  • Membership (“is a”: Spot is a dog)
  • Part/whole (hand is part of arm, car has fender)
  • Inverse (e.g., pest damages plant so plant is damaged by pest)
• Axioms (properties and attributes of concepts)
  • Definitions specifying both necessary and sufficient criteria for membership
  • Constraints such as domain and range, minimum or maximum number of values
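A minimal sketch of how the relation types on this slide might be stored, assuming a simple in-memory structure. The relation names (`subclass_of`, `instance_of`, `part_of`, `damages`, and their inverses) and the `Ontology` class are illustrative choices, not a standard vocabulary. Each statement is recorded together with its inverse, so "pest damages plant" also answers "what is the plant damaged by?".

```python
from collections import defaultdict

# Relation names paired with their inverses (illustrative names only).
INVERSES = {
    "subclass_of": "superclass_of",
    "instance_of": "has_instance",
    "part_of": "has_part",
    "damages": "damaged_by",
    "synonym_of": "synonym_of",  # symmetric: its own inverse
}
# Make the lookup work in both directions.
INVERSES.update({inv: rel for rel, inv in list(INVERSES.items())})


class Ontology:
    def __init__(self):
        # relation name -> subject -> set of objects
        self.facts = defaultdict(lambda: defaultdict(set))

    def add(self, subject, relation, obj):
        """Store a statement together with its inverse statement."""
        self.facts[relation][subject].add(obj)
        self.facts[INVERSES[relation]][obj].add(subject)

    def related(self, subject, relation):
        """Return everything linked to subject by the given relation."""
        return set(self.facts[relation][subject])


onto = Ontology()
onto.add("dog", "subclass_of", "mammal")   # class/subclass
onto.add("Spot", "instance_of", "dog")     # membership
onto.add("hand", "part_of", "arm")         # part/whole
onto.add("pest", "damages", "plant")       # relation with a named inverse

print(onto.related("plant", "damaged_by"))  # {'pest'}
```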
Ontologies will (eventually) support:
• Automatic classification and query
  • Where does a target word or phrase fit into the ontology
  • Locating a concept or a cluster of concepts based on a description and/or relationships
  • Vocabulary switching between domains
• Inference
  • Using relationships to determine, given A and B, what C might be and how you know it
  • Analysis to enhance navigation
• Consistency checking
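The inference item can be illustrated with the simplest case: given A ("Spot is a dog") and B ("a dog is a kind of mammal"), derive C ("Spot is a mammal") and keep the chain of facts that justifies it. The function name `infer_classes` and the sample data are assumptions made for this sketch; real reasoners handle far richer relationships.

```python
def infer_classes(individual, instance_of, subclass_of):
    """Return {class: list of facts justifying membership} for one individual.

    instance_of: dict mapping an individual to its directly asserted class
    subclass_of: dict mapping a class to its direct superclass
    """
    explanations = {}
    cls = instance_of.get(individual)
    if cls is None:
        return explanations
    chain = [f"{individual} is a {cls}"]
    explanations[cls] = list(chain)
    seen = {cls}
    while cls in subclass_of:
        parent = subclass_of[cls]
        if parent in seen:  # guard against a cyclic hierarchy
            break
        chain.append(f"{cls} is a kind of {parent}")
        explanations[parent] = list(chain)
        seen.add(parent)
        cls = parent
    return explanations


instance_of = {"Spot": "dog"}
subclass_of = {"dog": "mammal", "mammal": "animal"}

result = infer_classes("Spot", instance_of, subclass_of)
for cls, why in result.items():
    print(cls, "<-", "; ".join(why))
```

Each inferred class comes with its own explanation, which covers the "how you know it" part of the slide.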
From common data to common structure
• Controlled vocabulary
  • Very simple structure (nearly flat)
  • The terms are the data
• Taxonomy
  • Primarily to define position within a hierarchy – e.g., species
• Thesaurus
  • More options for relationships
  • Often leverages retrieval and organization of additional data
• Meta-thesaurus
  • A federation of similar thesaurus structures to allow bridging data across languages or across domains
• Ontology
  • Whatever can’t be done by the above
Typical thesaurus implementation
• A controlled vocabulary or thesaurus limited to the domain
• A set of separate database tables, each with predictable attributes
  • People
  • Departments
  • Resources
• Thesaurus cross-references this content for internal navigation
• Incoming keyword queries can provide a rich context of links to data tables
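One way such an implementation might look, sketched with Python's built-in `sqlite3` module. The schema is an assumption made for illustration: the table and column names (`terms`, `term_links`, `LABEL_COLUMN`) and the sample rows are hypothetical and do not describe the actual site's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people      (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE resources   (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE terms       (id INTEGER PRIMARY KEY, label TEXT UNIQUE);
-- One cross-reference table links a thesaurus term to a row in any content table.
CREATE TABLE term_links  (term_id INTEGER, table_name TEXT, row_id INTEGER);
""")

# Hypothetical sample content.
conn.execute("INSERT INTO people VALUES (1, 'A. Researcher')")
conn.execute("INSERT INTO departments VALUES (1, 'Plant Pathology')")
conn.execute("INSERT INTO resources VALUES (1, 'Guide to potato blight')")
conn.execute("INSERT INTO terms VALUES (1, 'blight')")
conn.executemany(
    "INSERT INTO term_links VALUES (1, ?, 1)",
    [("people",), ("departments",), ("resources",)],
)

# Fixed whitelist of content tables and the column used as a display label.
LABEL_COLUMN = {"people": "name", "departments": "name", "resources": "title"}


def context_for(keyword):
    """Return {table: [labels]} for every row cross-referenced to the keyword."""
    links = conn.execute(
        "SELECT l.table_name, l.row_id FROM term_links l "
        "JOIN terms t ON t.id = l.term_id WHERE t.label = ?",
        (keyword,),
    ).fetchall()
    context = {}
    for table, row_id in links:
        column = LABEL_COLUMN[table]  # only whitelisted table names reach the SQL below
        row = conn.execute(
            f"SELECT {column} FROM {table} WHERE id = ?", (row_id,)
        ).fetchone()
        context.setdefault(table, []).append(row[0])
    return context


print(context_for("blight"))
```

A single keyword returns linked people, departments, and resources at once, which is the "rich context of links" the last bullet describes.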
Website with thesaurus
[Diagram: Thesaurus connected with People, Publications, Projects, Orgs, Crops, Genes, and Queries]
http://mcknight.ccrp.cornell.edu
Thesaurus as leveraging agent
[Diagram: input query → refinement by thesaurus (with a 2nd and 3rd thesaurus also shown), then search against data warehouse]
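The refine-then-search flow on this slide could be sketched as follows. Everything here is hypothetical: the `PREFERRED` and `NARROWER` tables stand in for a thesaurus, `RECORDS` stands in for the data warehouse, and only a single thesaurus is modeled.

```python
# Hypothetical thesaurus: entry term -> preferred term, preferred term -> narrower terms.
PREFERRED = {"corn": "maize", "maize": "maize", "spuds": "potato", "potato": "potato"}
NARROWER = {"maize": ["sweet corn", "popcorn"], "potato": []}

# Hypothetical "data warehouse": record id -> indexed keywords.
RECORDS = {
    1: {"maize", "irrigation"},
    2: {"popcorn"},
    3: {"potato", "blight"},
}


def refine(query):
    """Map a raw query term to its preferred term plus its narrower terms."""
    term = query.strip().lower()
    preferred = PREFERRED.get(term, term)  # unknown terms pass through unchanged
    return [preferred] + NARROWER.get(preferred, [])


def search(query):
    """Refine the query with the thesaurus, then match against the records."""
    terms = set(refine(query))
    return sorted(rid for rid, keywords in RECORDS.items() if keywords & terms)


print(refine("Corn"))  # ['maize', 'sweet corn', 'popcorn']
print(search("Corn"))  # [1, 2]
```

A search for "Corn" finds records indexed under "maize" and "popcorn" even though neither record contains the user's original word.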
Gazetteer as leveraging agent
Scenario:
• User finds library record (e.g., book or photo) with place name reference (e.g., neighborhood in L.A.)
• Place name and desired action sent to gazetteer (e.g., find other photos in nearby L.A. neighborhoods using appropriate historical neighborhood names)
• Gazetteer matches incoming place name with coordinate footprint
• Other place names near footprint and in L.A. retrieved
• Records related to neighboring places returned to user
Requires:
• Structured data (place names, coordinates)
• Relationships (historical to modern names, neighborhoods to city)
• Functionality (coordinate-based spatial analysis)
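The scenario can be sketched in Python under simplifying assumptions: each footprint is reduced to a single point, nearness is a great-circle distance threshold, and the place names and coordinates below are invented placeholders rather than real gazetteer entries. The names `Place`, `lookup`, and `nearby_names` are hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Place:
    name: str
    city: str
    lat: float  # footprint simplified to one point (assumption)
    lon: float
    historical_names: List[str] = field(default_factory=list)


def distance_km(a, b):
    """Great-circle distance between two places (haversine formula)."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dphi = p2 - p1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def lookup(gazetteer, name):
    """Match an incoming place name, modern or historical, to a gazetteer entry."""
    for place in gazetteer:
        if name == place.name or name in place.historical_names:
            return place
    return None


def nearby_names(gazetteer, name, radius_km):
    """Names (modern and historical) of other places in the same city within the radius."""
    origin = lookup(gazetteer, name)
    if origin is None:
        return []
    names = []
    for place in gazetteer:
        if (place is not origin and place.city == origin.city
                and distance_km(origin, place) <= radius_km):
            names.extend([place.name] + place.historical_names)
    return names


# Invented placeholder data.
GAZETTEER = [
    Place("Neighborhood A", "L.A.", 34.05, -118.25),
    Place("Neighborhood B", "L.A.", 34.06, -118.26, ["Old Name B"]),
    Place("Neighborhood C", "L.A.", 34.30, -118.60),
    Place("Neighborhood D", "Elsewhere", 34.05, -118.25),
]

print(nearby_names(GAZETTEER, "Neighborhood A", 5.0))  # ['Neighborhood B', 'Old Name B']
```

The returned names, including historical ones, would then be used to retrieve related library records. The sketch covers all three requirements from the slide: structured data, relationships (historical names, neighborhood to city), and a spatial function.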
Agriculture Heritage Project
• Wide variety of content from diverse organizations
• Open-ended content
• Time and place as first-order variables
• Data likely to cluster by theme, time, and place
• Many areas with sparse data
• Need to appeal to diverse audiences
• Need to produce independently functional results
• Goal: transform flat archives into dynamic context of people,
places, and events
Approach
• Simple underlying content model
• Adaptive relationships among content
  • Sometimes very detailed
  • Often very general
• Approachable from any viewpoint
  • Time, space, originating organization, historical event, personalities, crops, thematic interests
• Capability for encapsulation and export as curricular units
The ABC Ontology Model
• A rich model incorporating time, place, and events as
well as information more traditionally encoded in
metadata
• Designed for exchange and interoperability as RDF/XML metadata
• A set of generalized classes and canonical relationships
among them
• An ontology framework independent of the data it
accompanies
ABC Ontology classes
Entity
  abstraction
    work
  actuality
    agent
    artifact
      manifestation
      item
  temporality
    action
    event
    situation
  time
  place
ABC Ontology diagrams - 1
Events precede or follow situations
[Diagram: EV0 (creation) → ST0 → EV1 (publication) → ST1 → EV2 (acquisition)]
ABC Ontology diagrams - 2
Most agents, actions, times, and places modify events
[Diagram: event chain EV0 (creation) → ST0 → EV1 (publication) → ST1 → EV2 (acquisition); events carry hasAction, hasAgent, inPlace, and atTime properties; actions AC0 (photo taken) and AC1 (photo published); agents AG0 (photographer) and AG1 (publishing house)]
ABC Ontology diagrams - 3
Manifestations exist in situations
[Diagram: nodes WK0 (Tulips), MN0 (color transparency original), MN1 (color print), MN2 (poster), and MN3 (Kodak archive), with the labels “the photo”, “the poster”, and “collection”; connected by hasRealization, contains, isPartOf, and instanceOf; shown against the event chain EV0 (creation) → ST0 → EV1 (publication) → ST1 → EV2 (acquisition)]
Complete ABC diagram
Source: http://metadata.net/harmony/cimi_modelling.htm
ABC class-property relationships
• Set of canonical relationships
• All bi-directional (inverses)
• Provide a domain of possible connections
• Serve as the basis for model traversal
ABC class-property relationships - 1
Entity-Entity: contains - isPartOf
Entity-Place: inPlace - isLocationOf
Actuality-Actuality: hasPhase - isPhaseOf
Actuality-Situation: inContext - isContextFor
Work-Manifestation: hasRealization - isRealizationOf
Manifestation-Item: hasCopy - isCopyOf
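Because every property has an inverse, a statement recorded in one direction can be traversed in the other. The sketch below uses the six property pairs listed on this slide; the `Graph` class, its methods, and the node identifiers (`WK0`, `MN0`, `IT0`) are illustrative assumptions and not part of the ABC model's published tooling. Class membership of the nodes is not checked here.

```python
# Property pairs from this slide: (domain class, range class, property, inverse).
CANONICAL = [
    ("Entity", "Entity", "contains", "isPartOf"),
    ("Entity", "Place", "inPlace", "isLocationOf"),
    ("Actuality", "Actuality", "hasPhase", "isPhaseOf"),
    ("Actuality", "Situation", "inContext", "isContextFor"),
    ("Work", "Manifestation", "hasRealization", "isRealizationOf"),
    ("Manifestation", "Item", "hasCopy", "isCopyOf"),
]

# Look up the inverse of any property, in either direction.
INVERSE = {}
for _domain, _range, prop, inv in CANONICAL:
    INVERSE[prop] = inv
    INVERSE[inv] = prop


class Graph:
    """Stores (subject, property, object) statements plus their inverses."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, prop, obj):
        if prop not in INVERSE:
            raise ValueError(f"{prop!r} is not a canonical property")
        self.triples.add((subject, prop, obj))
        self.triples.add((obj, INVERSE[prop], subject))

    def objects(self, subject, prop):
        """Objects linked to subject by one property."""
        return {o for s, p, o in self.triples if s == subject and p == prop}

    def reachable(self, start):
        """All nodes reachable from start by following any property."""
        seen, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for s, _p, o in self.triples:
                if s == node and o not in seen:
                    seen.add(o)
                    stack.append(o)
        return seen - {start}


g = Graph()
g.add("WK0", "hasRealization", "MN0")  # a work and one manifestation (illustrative ids)
g.add("MN0", "hasCopy", "IT0")         # a manifestation and one item

print(g.objects("MN0", "isRealizationOf"))  # {'WK0'}
print(sorted(g.reachable("IT0")))           # ['MN0', 'WK0']
```

Starting from the item, the traversal reaches the manifestation and the work even though both statements were entered in the opposite direction.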
ABC class-property relationships - 2
Temporality-Agent
Temporality-Actuality
Event-Action
Event-Agent
Situation-Event
hasParticipant - isParticipant
involves - isInvolvedIn
transforms - isTransformedBy
usesTool – usedAsToolIn
destroys - isDestroyedBy
hasResult - isResultOf
creates – isCreatedBy
hasAction – isActionOf
hasPresence – isPresentIn
precedes - isPrecededBy
follows - isFollowedBy
Work in progress
Demo of Agriculture Heritage site prototype
Is it worth it?
• It’s worth exploring
• Must be easier to build
• Useful to rethink typical site structure
• Not clear how to leverage all the potential power
• Need more use cases
• What does it mean for metadata?
References
• “Indirect geospatial referencing through place names in the digital library: Alexandria Digital Library experience with developing and implementing gazetteers,” Linda L. Hill, Zi Zheng, Proceedings of the American Society for Information Science Annual Meeting, Washington, D.C., Oct. 31-Nov. 4, 1999, pp. 57-69.
• “Ontology Development 101: A Guide to Creating Your First Ontology,” Natalya F. Noy, Deborah L. McGuinness, Stanford University, Stanford, CA 94305.
• “Science and the Semantic Web,” James Hendler, Science, vol. 299, 1/24/03.
• “The ABC Ontology and Model,” Carl Lagoze and Jane Hunter, Journal of Digital Information, volume 2, issue 2, November 2001.
• “The Rise of Ontologies or the Reinvention of Classification,” Dagobert Soergel, Journal of the American Society for Information Science, 50(12):1119-1120, 1999.
• “Toward Principles for the Design of Ontologies Used for Knowledge Sharing,” Thomas R. Gruber, Revision: August 23, 1993, Stanford Knowledge Systems Laboratory.
