Introduction to Digital Libraries
Week 4: RDF, Semantic Web, Linked Data
Old Dominion University
Department of Computer Science
CS 751/851 Spring 2010
Michael L. Nelson <[email protected]>
02/01/10
ODU CS 751/851 Spring 2010 Michael L. Nelson [email protected]
Resource Description
Framework
• “…RDF can also be used to represent
information about things that can be
identified on the Web, even when they
cannot be directly retrieved on the
Web.”
• HTML is for people, RDF is for
machines…
Quotes, figures, examples are from the W3C RDF Primer (see week 4 readings)
Eric Miller in HTML
Eric Miller in RDF
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
<contact:Person rdf:about="http://www.w3.org/People/EM/contact#me">
<contact:fullName>Eric Miller</contact:fullName>
<contact:mailbox rdf:resource="mailto:[email protected]"/>
<contact:personalTitle>Dr.</contact:personalTitle>
</contact:Person>
</rdf:RDF>
Viewing the World As Triples
• It is a well-established fact that the Va
Tech Hokies are much better than the
UVa Wahoos.
Subject: Hokies
Predicate: betterThan
Object: Wahoos
Triple = (Hokies, betterThan, Wahoos)
Once More, With URIs
Subject: http://www.hokiesports.com/
Predicate: info:taunts/betterThan
Object: http://en.wikipedia.org/wiki/Wahoos
RDF is Modeled as a Graph
[Subject] --Predicate--> [Object]
A More Complex Graph
The above graph can be serialized into a triples notation as:
<http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/creator> <http://www.example.org/staffid/85740>
<http://www.example.org/index.html> <http://www.example.org/terms/creation-date> "August 16, 1999"
<http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/language> "en"
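As a rough illustration of the triples notation, the sketch below holds the same three statements as Python tuples and prints them one per line. The helper names (`uri`, `literal`, `to_ntriples_line`) are assumptions made for this example and do not come from any RDF library; a real application would more likely use an existing RDF toolkit.

```python
# Hypothetical helpers; a minimal sketch, not a complete N-Triples writer.
DC = "http://purl.org/dc/elements/1.1/"
EX = "http://www.example.org/"

def uri(value):
    return ("uri", value)

def literal(value):
    return ("literal", value)

triples = [
    (uri(EX + "index.html"), uri(DC + "creator"), uri(EX + "staffid/85740")),
    (uri(EX + "index.html"), uri(EX + "terms/creation-date"), literal("August 16, 1999")),
    (uri(EX + "index.html"), uri(DC + "language"), literal("en")),
]

def term_to_text(term):
    kind, value = term
    if kind == "uri":
        return "<" + value + ">"
    # Literals are quoted; backslashes and double quotes are escaped.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'

def to_ntriples_line(triple):
    return " ".join(term_to_text(t) for t in triple) + " ."

lines = [to_ntriples_line(t) for t in triples]
for line in lines:
    print(line)
```

The point of the sketch is only that the graph is a set of (subject, predicate, object) tuples; the printed lines are one possible way to write that set down.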
The Model is Not the Serialization
This Model…
…can be expressed, or serialized,
in different natural languages:
Los Hokies son mejores que Wahoos
Hokies sind besser als Wahoos
Hokies zijn beter dan Wahoos
…
We will primarily see RDF serialized as “RDF/XML”, but be aware that many other
serializations exist: N3, Turtle, N-Triples, TriX, etc.
Using Prefixes for Brevity
This is verbose:
<http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/creator> <http://www.example.org/staffid/85740>
<http://www.example.org/index.html> <http://www.example.org/terms/creation-date> "August 16, 1999"
<http://www.example.org/index.html> <http://purl.org/dc/elements/1.1/language> "en"
Define:
prefix rdf:, namespace URI: http://www.w3.org/1999/02/22-rdf-syntax-ns#
prefix rdfs:, namespace URI: http://www.w3.org/2000/01/rdf-schema#
prefix dc:, namespace URI: http://purl.org/dc/elements/1.1/
prefix owl:, namespace URI: http://www.w3.org/2002/07/owl#
prefix ex:, namespace URI: http://www.example.org/ (or http://www.example.com/)
prefix xsd:, namespace URI: http://www.w3.org/2001/XMLSchema#
prefix exterms:, namespace URI: http://www.example.org/terms/
prefix exstaff:, namespace URI: http://www.example.org/staffid/
prefix ex2:, namespace URI: http://www.domain2.example.org/
And the above is more compactly expressed as:
ex:index.html dc:creator exstaff:85740
ex:index.html exterms:creation-date "August 16, 1999"
ex:index.html dc:language "en"
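Prefix expansion is plain string concatenation, which the following sketch tries to show. The `PREFIXES` table copies four of the namespace URIs defined above; the `expand` function name is an assumption for illustration.

```python
# Namespace URIs copied from the prefix definitions above.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://www.example.org/",
    "exterms": "http://www.example.org/terms/",
    "exstaff": "http://www.example.org/staffid/",
}

def expand(qname, prefixes=PREFIXES):
    """Expand 'prefix:local' into a full URI; an unknown prefix raises KeyError."""
    prefix, local = qname.split(":", 1)
    return prefixes[prefix] + local

compact = ("ex:index.html", "dc:creator", "exstaff:85740")
full = tuple(expand(term) for term in compact)
print(full)
```

The compact and verbose forms name exactly the same triple; the prefix is only a writing convenience.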
Reusing URIs
Used as both an object & subject
Literals, not URIs
Good Ol’ John Smith…
exstaff:85740 exterms:address exaddressid:85740
exaddressid:85740 exterms:street "1501 Grant Avenue"
exaddressid:85740 exterms:city "Bedford"
exaddressid:85740 exterms:state "Massachusetts"
exaddressid:85740 exterms:postalCode "01730"
…Now With a “Blank Node”
Seems like a neat optimization, right?
No. Don’t do it.
Like sharable libraries and relative URIs,
blank nodes will be the bane of your
existence. They are a good idea 3-4 times,
and a bad idea 29,480,247,598,797 times.
exstaff:85740 exterms:address ???
??? exterms:street "1501 Grant Avenue"
??? exterms:city "Bedford"
??? exterms:state "Massachusetts"
??? exterms:postalCode "01730"
How Old Is John?
27 (base 10)? 27 (base 8)? Something else?
URIs are best.
But if no URI, use typed literals.
<http://www.example.org/staffid/85740> <http://www.example.org/terms/age>
"27"^^<http://www.w3.org/2001/XMLSchema#integer>
Or:
exstaff:85740 exterms:age "27"^^xsd:integer
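To make the benefit of a typed literal concrete, here is a small sketch that splits the `"27"^^<datatype>` form and converts the lexical value. The `parse_typed_literal` function and the two-entry `CONVERTERS` table are assumptions for this example; only two XML Schema datatypes are handled, and anything else is returned as a plain string.

```python
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Illustrative subset only; unknown datatypes fall back to str.
CONVERTERS = {
    XSD + "integer": int,
    XSD + "decimal": Decimal,
}

def parse_typed_literal(text):
    """Split '"lexical"^^<datatype-uri>' and convert the lexical form."""
    lexical, _, datatype = text.partition("^^")
    lexical = lexical.strip('"')
    datatype = datatype.strip("<>")
    convert = CONVERTERS.get(datatype, str)
    return convert(lexical), datatype

age, datatype = parse_typed_literal('"27"^^<http://www.w3.org/2001/XMLSchema#integer>')
print(age + 1)  # arithmetic works because age is an int, not the string "27"
```

Without the datatype, a consumer has to guess whether "27" is decimal, octal, or just a label.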
Types Maintain Data Integrity
Good:
Bad:
RDF/XML
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:exterms="http://www.example.org/terms/">
<rdf:Description rdf:about="http://www.example.org/index.html">
<exterms:creation-date>August 16, 1999</exterms:creation-date>
</rdf:Description>
</rdf:RDF>
Subject: the rdf:about URI
Predicate: exterms:creation-date
Object: the literal "August 16, 1999"
Factoring Out the Subject
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:exterms="http://www.example.org/terms/">
<rdf:Description rdf:about="http://www.example.org/index.html">
<exterms:creation-date>August 16, 1999</exterms:creation-date>
<dc:language>en</dc:language>
<dc:creator rdf:resource="http://www.example.org/staffid/85740"/>
</rdf:Description>
</rdf:RDF>
The rdf:about URI is the subject of 3 triples.
URI as object in RDF/XML (rdf:resource;
cf. literal as object)
Different RDF/XML, Same Graph
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:exterms="http://www.example.org/terms/">
<rdf:Description rdf:about="http://www.example.org/index.html">
<exterms:creation-date>August 16, 1999</exterms:creation-date>
</rdf:Description>
<rdf:Description rdf:about="http://www.example.org/index.html">
<dc:language>en</dc:language>
</rdf:Description>
<rdf:Description rdf:about="http://www.example.org/index.html">
<dc:creator rdf:resource="http://www.example.org/staffid/85740"/>
</rdf:Description>
</rdf:RDF>
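The claim that both RDF/XML documents describe the same graph can be checked mechanically. The sketch below uses Python's standard `xml.etree.ElementTree` to pull triples out of flat `rdf:Description` blocks and compares the two sets. The `simple_triples` function is an assumption for illustration and handles only the constructs used in these two examples (no nesting, blank nodes, datatypes, or containers); it is not a general RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def simple_triples(rdfxml):
    """Extract triples from flat rdf:Description blocks only."""
    triples = set()
    root = ET.fromstring(rdfxml)
    for desc in root.findall("{%s}Description" % RDF_NS):
        subject = desc.get("{%s}about" % RDF_NS)
        for prop in desc:
            # ElementTree spells a qualified tag as '{namespace}local'.
            predicate = prop.tag[1:].replace("}", "", 1)
            resource = prop.get("{%s}resource" % RDF_NS)
            if resource is not None:
                obj = ("uri", resource)
            else:
                obj = ("literal", (prop.text or "").strip())
            triples.add((subject, predicate, obj))
    return triples

HEADER = ('<rdf:RDF xmlns:rdf="%s" xmlns:dc="http://purl.org/dc/elements/1.1/" '
          'xmlns:exterms="http://www.example.org/terms/">' % RDF_NS)
ABOUT = '<rdf:Description rdf:about="http://www.example.org/index.html">'
DATE = "<exterms:creation-date>August 16, 1999</exterms:creation-date>"
LANG = "<dc:language>en</dc:language>"
CREATOR = '<dc:creator rdf:resource="http://www.example.org/staffid/85740"/>'
END = "</rdf:Description>"

# One Description with three properties vs. three Descriptions with one each.
factored = HEADER + ABOUT + DATE + LANG + CREATOR + END + "</rdf:RDF>"
split = (HEADER + ABOUT + DATE + END + ABOUT + LANG + END
         + ABOUT + CREATOR + END + "</rdf:RDF>")

print(simple_triples(factored) == simple_triples(split))
```

Comparing sets rather than text is the point: the serialization may differ while the model stays the same.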
Blank Node in RDF/XML
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:exterms="http://example.org/stuff/1.0/">
<rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar">
<dc:title>RDF/XML Syntax Specification (Revised)</dc:title>
<exterms:editor rdf:nodeID="abc"/>
</rdf:Description>
internal ID for the
blank node; not a URI
<rdf:Description rdf:nodeID="abc">
<exterms:fullName>Dave Beckett</exterms:fullName>
<exterms:homePage rdf:resource="http://purl.org/net/dajobe/"/>
</rdf:Description>
</rdf:RDF>
Typed Literal in RDF/XML
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:exterms="http://www.example.org/terms/">
<rdf:Description rdf:about="http://www.example.org/index.html">
<exterms:creation-date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">1999-08-16</exterms:creation-date>
</rdf:Description>
</rdf:RDF>
More Syntactic Sugar…
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [<!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:sportex="http://www.exampleRatings.com/terms/">
<rdf:Description rdf:about="http://www.example.com/2002/04/products#item10245">
<sportex:ratingBy rdf:datatype="&xsd;string">Richard Roe</sportex:ratingBy>
<sportex:numberStars rdf:datatype="&xsd;integer">5</sportex:numberStars>
</rdf:Description>
</rdf:RDF>
RDF Containers
• Bag = unordered collection, possibly
with duplicates
• Seq = an ordered sequence, possibly
with duplicates
• Alt = an unordered list of alternatives
– “Virginia Tech”, “VPI&SU”, “Hokies”, etc.
Bag Graph
Blank Node
This triple declares the blank node to be a “bag”
Bag RDF/XML
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:s="http://example.org/students/vocab#">
<rdf:Description rdf:about="http://example.org/courses/6.001">
<s:students>
<rdf:Bag>
<rdf:li rdf:resource="http://example.org/students/Amy"/>
<rdf:li rdf:resource="http://example.org/students/Mohamed"/>
<rdf:li rdf:resource="http://example.org/students/Johann"/>
<rdf:li rdf:resource="http://example.org/students/Maria"/>
<rdf:li rdf:resource="http://example.org/students/Phuong"/>
</rdf:Bag>
</s:students>
</rdf:Description>
</rdf:RDF>
Change <rdf:Bag> to <rdf:Seq> to switch to an ordered list of students
Alt RDF/XML
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:s="http://example.org/packages/vocab#">
<rdf:Description rdf:about="http://example.org/packages/X11">
<s:DistributionSite>
<rdf:Alt>
<rdf:li rdf:resource="ftp://ftp.example.org"/>
<rdf:li rdf:resource="ftp://ftp1.example.org"/>
<rdf:li rdf:resource="ftp://ftp2.example.org"/>
</rdf:Alt>
</s:DistributionSite>
</rdf:Description>
</rdf:RDF>
These are unique but functionally equivalent,
so we use Alt instead of Bag or Seq.
Collection vs. Container
• A container is not closed: there could be more students in
http://example.org/courses/6.001 recorded in a graph that we just haven’t found yet.
– the “Open World Assumption”; see:
• http://en.wikipedia.org/wiki/Open_world_assumption
• http://www.mkbergman.com/852/the-open-world-assumption-elephant-in-the-room/
• A collection, by contrast, is closed: it is a way of, for example,
bounding the students in http://example.org/courses/6.001
Graph for Collection
Simple RDF/XML for Collection
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:s="http://example.org/students/vocab#">
<rdf:Description rdf:about="http://example.org/courses/6.001">
<s:students rdf:parseType="Collection">
<rdf:Description rdf:about="http://example.org/students/Amy"/>
<rdf:Description rdf:about="http://example.org/students/Mohamed"/>
<rdf:Description rdf:about="http://example.org/students/Johann"/>
</s:students>
</rdf:Description>
</rdf:RDF>
note: no <rdf:li> elements
RDF/XML Collection,
Long Version
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:s="http://example.org/students/vocab#">
<rdf:Description rdf:about="http://example.org/courses/6.001">
<s:students rdf:nodeID="sch1"/>
</rdf:Description>
<rdf:Description rdf:nodeID="sch1">
<rdf:first rdf:resource="http://example.org/students/Amy"/>
<rdf:rest rdf:nodeID="sch2"/>
</rdf:Description>
<rdf:Description rdf:nodeID="sch2">
<rdf:first rdf:resource="http://example.org/students/Mohamed"/>
<rdf:rest rdf:nodeID="sch3"/>
</rdf:Description>
<rdf:Description rdf:nodeID="sch3">
<rdf:first rdf:resource="http://example.org/students/Johann"/>
<rdf:rest rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
</rdf:Description>
</rdf:RDF>
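The long version is a linked list: each node has an rdf:first (the member) and an rdf:rest (the next node), and rdf:nil closes the list. A minimal sketch of walking that structure follows. The `_:sch1` style labels stand in for the blank nodes, and the `collection_members` function is an assumption for illustration; it expects a well-formed list and does no error handling.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
STUDENTS = "http://example.org/students/"

# Triples from the long version; "_:sch1" etc. label the blank nodes.
triples = [
    ("_:sch1", RDF + "first", STUDENTS + "Amy"),
    ("_:sch1", RDF + "rest", "_:sch2"),
    ("_:sch2", RDF + "first", STUDENTS + "Mohamed"),
    ("_:sch2", RDF + "rest", "_:sch3"),
    ("_:sch3", RDF + "first", STUDENTS + "Johann"),
    ("_:sch3", RDF + "rest", RDF + "nil"),
]

def collection_members(triples, head):
    """Follow rdf:first / rdf:rest from head until rdf:nil."""
    first = {s: o for s, p, o in triples if p == RDF + "first"}
    rest = {s: o for s, p, o in triples if p == RDF + "rest"}
    members = []
    node = head
    while node != RDF + "nil":
        members.append(first[node])
        node = rest[node]
    return members

members = collection_members(triples, "_:sch1")
print(members)
```

Because the list ends in rdf:nil, a consumer knows it has seen every member, which is what distinguishes a collection from a container.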
Reification
• Reification is the process of recording who made
an assertion, thereby providing provenance
• The statement “the Va Tech Hokies are
much better than the UVa Wahoos” is a
different statement than “Michael says ‘the
Va Tech Hokies are much better than the UVa
Wahoos’”
– my Wahoo friend Drew will attest to this difference!
Giving a URI to an RDF
Statement
http://www.hokiesports.com/ info:taunts/betterThan http://en.wikipedia.org/wiki/Wahoos
give this triple the URI: info:MLN/vt1
Reify the RDF statement with this quad of RDF statements:
info:MLN/vt1 rdf:type rdf:Statement
info:MLN/vt1 rdf:subject http://www.hokiesports.com/
info:MLN/vt1 rdf:predicate info:taunts/betterThan
info:MLN/vt1 rdf:object http://en.wikipedia.org/wiki/Wahoos
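The reification quad is mechanical enough to generate. The sketch below builds the four statements for any triple and then attaches a dc:creator statement to the statement's URI. The `reify` function name is an assumption for illustration; the creator URI is the one used in the RDF/XML example on the next slide.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

def reify(statement_uri, subject, predicate, obj):
    """Return the four triples that describe (subject, predicate, obj) as an rdf:Statement."""
    return [
        (statement_uri, RDF + "type", RDF + "Statement"),
        (statement_uri, RDF + "subject", subject),
        (statement_uri, RDF + "predicate", predicate),
        (statement_uri, RDF + "object", obj),
    ]

quad = reify(
    "info:MLN/vt1",
    "http://www.hokiesports.com/",
    "info:taunts/betterThan",
    "http://en.wikipedia.org/wiki/Wahoos",
)

# Provenance: a statement about the statement.
provenance = ("info:MLN/vt1", DC + "creator", "http://www.cs.odu.edu/~mln/")
for triple in quad + [provenance]:
    print(triple)
```

Note that the quad describes the original triple but does not assert it; the original triple still has to be stated separately.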
RDF/XML Reification (1)
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:taunts="info:taunts/"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<!-- note: a strict parser expects rdf:ID to be a plain XML name, resolved against the base URI -->
<rdf:Description rdf:about="http://www.hokiesports.com/">
<taunts:betterThan rdf:ID="info:MLN/vt1"
rdf:resource="http://en.wikipedia.org/wiki/Wahoos"/>
</rdf:Description>
<rdf:Description rdf:about="info:MLN/vt1">
<dc:creator rdf:resource="http://www.cs.odu.edu/~mln/"/>
</rdf:Description>
</rdf:RDF>
RDF/XML Reification (2)
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [<!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:exterms="http://www.example.com/terms/"
xml:base="http://www.example.com/2002/04/products">
<rdf:Description rdf:ID="item10245">
<exterms:weight rdf:ID="triple12345" rdf:datatype="&xsd;decimal">2.4</exterms:weight>
</rdf:Description>
<rdf:Description rdf:about="#triple12345">
<dc:creator rdf:resource="http://www.example.com/staffid/85740"/>
</rdf:Description>
</rdf:RDF>
http://www.example.com/2002/04/products#item10245 is the subject of the triple
(it identifies a tent sold by the company);
http://www.example.com/2002/04/products#triple12345 is the URI of the
RDF statement about the tent.
Editorial re: Reification
• The solution as presented in RDF is sort of
clunky; it would be better if the model had
originated as quads instead of triples:
(Michael, VT, betterThan, UVA)
• A different approach: Named Graphs
– “Named graphs, provenance and trust”, WWW
2004,
http://doi.acm.org/10.1145/1060745.1060835
DC & RDF/XML
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcterms="http://purl.org/dc/terms/">
<rdf:Description rdf:about="http://www.dlib.org/dlib/may98/05contents.html">
<dc:title>DLIB Magazine - The Magazine for Digital Library Research
- May 1998</dc:title>
<dc:description>D-LIB magazine is a monthly compilation of
contributed stories, commentary, and briefings.</dc:description>
<dc:contributor>Amy Friedlander</dc:contributor>
<dc:publisher>Corporation for National Research Initiatives</dc:publisher>
<dc:date>1998-01-05</dc:date>
<dc:type>electronic journal</dc:type>
<dc:subject>
<rdf:Bag>
<rdf:li>library use studies</rdf:li>
<rdf:li>magazines and newspapers</rdf:li>
</rdf:Bag>
</dc:subject>
<dc:format>text/html</dc:format>
<dc:identifier rdf:resource="urn:issn:1082-9873"/>
<dcterms:isPartOf rdf:resource="http://www.dlib.org"/>
</rdf:Description>
</rdf:RDF>
Semantic Web
• The current web is a web of documents
(.html, .pdf, .jpeg, etc.)
• The Semantic Web is (will be) a web of data
– there is a lot of data on the web, but it is not
marked up in a usable form
– instead, it is “trapped” in HTML tables & elements,
PDF files, images, videos, etc.
– in short, the documents are on the Web, but the
semantics of what those documents say are
largely not on the Web
Semantic Web Stack
from: http://en.wikipedia.org/wiki/Semantic_Web
Linked Data
• Linked Data is the portion of the
Semantic Web involved with publishing
the data to the Web
– as opposed to ontology definitions, like
RDFS and OWL
• http://en.wikipedia.org/wiki/RDF_Schema
• http://en.wikipedia.org/wiki/Web_Ontology_Language
TBL’s Original 2006 Note on
Linked Data
1. Use URIs as names for things
2. Use HTTP URIs so that people can
look up those names
3. When someone looks up a URI,
provide useful information, using the
standards (RDF, SPARQL)
4. Include links to other URIs, so that
they can discover more things
“Cool URIs for the
Semantic Web”
• Start with:
– http://www.example.com/
• the homepage of Example Inc.
– http://www.example.com/people/alice
• the homepage of Alice
– http://www.example.com/people/bob
• the homepage of Bob
Quotes, figures, examples are from the W3C Cool URIs for the Semantic Web (see week 4 readings)
HTTP Content Negotiation
Review
• Assume Alice’s home page is available in
English & German, and HTML & PDF
– valid URIs:
• example.com/people/alice.en.html
• example.com/people/alice.de.html
• example.com/people/alice.en.pdf
• example.com/people/alice.de.pdf
• Q: Which version of Alice to link to?
• A: Link to example.com/people/alice
and let the client & server work out which is
the best representation for the resource Alice.
CN in Action
Client sends:
GET /people/alice HTTP/1.1
Host: www.example.com
Accept: text/html, application/xhtml+xml
Accept-Language: en, de
Server responds (200 style):
HTTP/1.1 200 OK
Content-Type: text/html
Content-Language: en
Content-Location: http://www.example.com/people/alice.en.html
Or the server could 302 redirect:
HTTP/1.1 302 Found
Location: http://www.example.com/people/alice.en.html
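The server-side choice can be sketched in a few lines. The code below picks one of Alice's four variants from the Accept and Accept-Language headers. It is a simplified illustration under stated assumptions: the `VARIANTS` table, `parse_preferences`, and `negotiate` are names made up for this example, wildcards such as `*/*` are not handled, and real servers such as Apache use a more elaborate algorithm.

```python
# The four representations of Alice's home page, keyed by (media type, language).
VARIANTS = {
    ("text/html", "en"): "/people/alice.en.html",
    ("text/html", "de"): "/people/alice.de.html",
    ("application/pdf", "en"): "/people/alice.en.pdf",
    ("application/pdf", "de"): "/people/alice.de.pdf",
}

def parse_preferences(header):
    """Return header values ordered by q-value, highest first; ties keep header order."""
    prefs = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs.append((-q, position, parts[0]))
    return [value for _, _, value in sorted(prefs)]

def negotiate(accept, accept_language):
    """Return the path of the first variant matching the client's preferences."""
    for media_type in parse_preferences(accept):
        for language in parse_preferences(accept_language):
            if (media_type, language) in VARIANTS:
                return VARIANTS[(media_type, language)]
    return None  # a real server might answer 406 Not Acceptable here

chosen = negotiate("text/html, application/xhtml+xml", "en, de")
print(chosen)
```

The result can then be returned either directly with a Content-Location header (200 style) or as the Location of a redirect, as in the two responses above.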
Side Track: What is the range
of the http function?
• Simply put, does:
www.cs.odu.edu/~mln/galaxie/
identify:
– my car?
• implication: range includes real world object
– or a page about my car?
• implication: range includes only digital objects
(i.e., web pages, data streams, etc.)
Answer: HTTP can Identify
Real World Objects
• After much study, the W3C Technical
Architecture Group (TAG) has ruled that the
range of the http function includes real world
objects
– that is, the URI can identify my car, even if you can’t
(obviously) get my car over a network
– http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html
• Interesting read re: Semantic Web, Cool
URIs, & FRBR
– http://efoundations.typepad.com/efoundations/2009/02/httprange14-cool-uris-frbr.html
Summary of httpRange-14
a) If an "http" resource responds to a GET request with a
2xx response, then the resource identified by that URI
is an information resource;
(e.g., web page about car)
b) If an "http" resource responds to a GET request with a
303 (See Other) response, then the resource identified
by that URI could be any resource;
(e.g., actual car; aka non-information resource; no representation is available)
c) If an "http" resource responds to a GET request with a
4xx (error) response, then the nature of the resource
is unknown.
from: http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html
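The three rules reduce to a lookup on the response status code, sketched below. The `classify` function and its return strings are assumptions for illustration; status codes the summary does not mention are reported as not covered rather than guessed at.

```python
def classify(status_code):
    """Map a GET response status to what httpRange-14 lets a client conclude."""
    if 200 <= status_code <= 299:
        return "information resource"
    if status_code == 303:
        return "could be any resource"
    if 400 <= status_code <= 499:
        return "unknown"
    return "not covered by the summary"

for code in (200, 303, 404):
    print(code, classify(code))
```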
Two Rules for SW Cool URIs
implication: they don’t care for URI schemes like info:, tag:, etc.
since they can’t be dereferenced and thus while URIs, they are not
“on the Web”
1. Be on the Web.
Given only a URI, machines and people should be able to retrieve a
description about the resource identified by the URI from the Web. Such a
look-up mechanism is important to establish shared understanding of what
a URI identifies. Machines should get RDF data and humans should get a
readable representation, such as HTML.
The standard Web transfer protocol, HTTP, should be used.
2. Be unambiguous.
There should be no confusion between identifiers for Web documents
and identifiers for other resources. URIs are meant to identify only one of
them, so one URI can't stand for both a Web document and a real-world object.
Solution: More URIs
Alice, Alice, Alice, Alice
Alice as real world object. Or the
concept of Alice. Or the platonic
form of Alice. Or the FRBR “Work”
Alice. Or…
Can’t Touch This
(no representation available)
This URI will negotiate to either:
Description of Alice for machines
Description of Alice for humans
Or One Fewer Alice
You could implement this with a mod_rewrite rule that takes /id/(.*) and sends it to
/idRedirector.php?id=$1 (cf. “how we do it today”), which looks at the Accept: headers
and chooses to 303 to the URI that will eventually yield the appropriate representation.
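The redirector's decision logic might look like the following sketch. It is written in Python rather than PHP, it is not the author's script, and the `redirect_for` name is hypothetical. The /data/ and /doc/ targets follow the URI layout used later in the Discovery slide; the substring check on the Accept header is a deliberate simplification (no q-values).

```python
def redirect_for(path, accept):
    """Return (status, headers) for a request to /id/<name>."""
    if not path.startswith("/id/"):
        return 404, {}
    name = path[len("/id/"):]
    # Machines asking for RDF go to /data/, everyone else to /doc/.
    wants_rdf = "application/rdf+xml" in accept
    target = ("/data/" if wants_rdf else "/doc/") + name
    # Vary tells caches that the redirect depends on the Accept header.
    return 303, {"Location": target, "Vary": "Accept"}

print(redirect_for("/id/alice", "application/rdf+xml"))
print(redirect_for("/id/alice", "text/html"))
```

The 303 is what keeps /id/alice unambiguous: the client learns where a description lives without the server claiming that /id/alice itself is a document.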
Lack 303 Capability?
Use “#” Trick.
URI1 =
URI2 =
URI1 != URI2 even though the server has just 1 representation
for “about” (i.e., no mod_rewrite, no redirector script, etc.)
As per RFC 2396, the URI fragment (“#”) is applied to the representation
returned by the server; it is not sent to the server as part of the request URI.
The semantics of a fragment are defined by the MIME type, not RFC 2396
(i.e., foo.html#A and foo.rdf#A are “different”, but both fragments are applied
at the client side).
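The client-side handling of the fragment can be seen with Python's standard `urllib.parse.urldefrag`. The URI `http://www.example.com/about#alice` is an assumed example in the style of the Cool URIs document, and `request_target` is a hypothetical wrapper name.

```python
from urllib.parse import urldefrag

def request_target(uri):
    """Return (URI sent to the server, fragment kept by the client)."""
    without_fragment, fragment = urldefrag(uri)
    return without_fragment, fragment

sent, kept = request_target("http://www.example.com/about#alice")
print(sent)
print(kept)
```

Both the hash URI and the plain document URI cause the same request to reach the server, yet as identifiers they remain distinct.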
# Trick can be combined
with CN
200 style CN is built-in for
servers like Apache, so this
represents no burden for
typical deployment.
# Can Be Used as an
Optimization
• put all your people in /people.rdf
(machines) and /people.html (humans)
• have /people CN to either the RDF or
HTML version
• /people#alice identifies Alice as a non-information resource (and different from
information resources /people.rdf#alice
or /people.html#alice), same as if you
had used 303
# Today, 303 Tomorrow?
• OWL defines the sameAs relationship
– http://www.w3.org/2002/07/owl
• So if I had legacy /people#alice URIs and later
added 303 infrastructure, I could create
/people/id/alice and make this RDF
statement somewhere to equate the two:
/people#alice owl:sameAs /people/id/alice
• Warning: owl:sameAs is a powerful relationship; use wisely.
– consider one of these instead:
• rdfs:seeAlso --> “related”
• ore:similarTo --> “similar but not equivalent”
owl:sameAs
It is probably “safe” to
owl:sameAs these pubs, since
they are either mirrored from
the original dlib.org, or the
“same” via URL canonicalization
rules (e.g., dlib.org -->
www.dlib.org).
http://scholar.google.com/scholar?cluster=14125544926044564082
ore:similarTo
now consider the publications in cluster:
http://scholar.google.com/scholar?cluster=8346216435269208748
all of which are some representation of info:doi/10.1016/j.ipm.2005.03.012
“Co-authorship networks in the digital library research community”
if we said:
http://arxiv.org/abs/cs.DL/0502056 owl:sameAs info:doi/10.1016/j.ipm.2005.03.012
and
http://portal.acm.org/citation.cfm?id=1103329.1103340 owl:sameAs info:doi/10.1016/j.ipm.2005.03.012
then by transitivity, we’d have:
http://arxiv.org/abs/cs.DL/0502056 owl:sameAs http://portal.acm.org/citation.cfm?id=1103329.1103340
which is clearly not true.
use the weaker “ore:similarTo” in this case:
http://arxiv.org/abs/cs.DL/0502056 ore:similarTo info:doi/10.1016/j.ipm.2005.03.012
and
http://portal.acm.org/citation.cfm?id=1103329.1103340 ore:similarTo info:doi/10.1016/j.ipm.2005.03.012
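The transitivity problem is easy to reproduce. Since owl:sameAs is symmetric and transitive, a reasoner effectively merges every URI linked by it into one equivalence class. The sketch below does that merging for the two owl:sameAs statements above; `same_as_closure` is a hypothetical helper, not a real reasoner.

```python
ARXIV = "http://arxiv.org/abs/cs.DL/0502056"
ACM = "http://portal.acm.org/citation.cfm?id=1103329.1103340"
DOI = "info:doi/10.1016/j.ipm.2005.03.012"

def same_as_closure(pairs):
    """Group URIs into the equivalence classes implied by sameAs statements."""
    groups = []
    for a, b in pairs:
        # Every existing group that mentions a or b gets merged with {a, b}.
        touching = [g for g in groups if a in g or b in g]
        merged = {a, b}.union(*touching)
        groups = [g for g in groups if g not in touching] + [merged]
    return groups

groups = same_as_closure([(ARXIV, DOI), (ACM, DOI)])
print(groups)
```

Two statements that each look harmless end up placing the arXiv and ACM URIs in the same class, which is the unintended inference the slide warns about.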
rdfs:seeAlso
There is a relationship between:
“A comparison of techniques for estimating IDF values to generate
lexical signatures for the web”
http://dx.doi.org/10.1145/1458502.1458510
and
“Correlation of Term Count and Document Frequency for Google N-Grams”
http://dx.doi.org/10.1007/978-3-642-00958-7_58
but it is clearly not owl:sameAs, and is weaker than the ore:similarTo
relationship in the previous example. So we use rdfs:seeAlso
http://dx.doi.org/10.1145/1458502.1458510 rdfs:seeAlso http://dx.doi.org/10.1007/978-3-642-00958-7_58
The “HTTP 303 See Other” response has similar semantics to rdfs:seeAlso.
Discovery
• From /id/alice, we can use HTTP
CN to go to either /data/alice (RDF)
or /doc/alice (HTML)
• In HTML:
<link rel="bookmark" href="/id/alice/">
<link rel="alternate" href="/data/alice">
• In RDF:
/id/alice rdfs:isDefinedBy /data/alice
/data/alice rdfs:seeAlso /doc/alice
Who's Doing Linked
Data?
from: http://linkeddata.org/
DBpedia.org
• Extracts information from wikipedia.org
and re-exports it in RDF, which can be
queried by SPARQL
– in other words, now we can start to do
something with all that RDF…
– the following is just a demo of SPARQL;
see http://www.w3.org/TR/rdf-sparql-query/
for more.
Quick Detour:
SPARQL vs. SQL
Sample database (shown as triples, but imagine
in one table for SQL):
emps:e13954 HR:name 'Joe'
emps:e13954 HR:hire-date 2000-04-14
emps:e13954 HR:salary 48000
this & next 4 slides derived from: http://www.w3.org/2006/Talks/0301-melton-query-langs.pdf
Sample Queries (1)
SQL:
SELECT salary
FROM employees
WHERE emp_id = 'e13954'
SPARQL:
note: SPARQL does have a FROM clause
allowing a particular RDF graph URI, but
note open world vs. closed world assumptions
here -- all RDF is global
SELECT ?sal
WHERE { emps:e13954 HR:salary ?sal . }
Sample Queries (2)
SQL:
SELECT emp_id, salary
FROM employees
SPARQL:
SELECT ?id ?sal
WHERE { ?id HR:salary ?sal }
Sample Queries (3)
SQL:
SELECT hire_date
FROM employees
WHERE salary >= 21750
SPARQL:
SELECT ?hdate
WHERE { ?id HR:salary ?sal .
?id HR:hire-date ?hdate .
FILTER (?sal >= 21750) }
Sample Queries (4)
SQL:
SELECT v.hire_date
FROM emp_vars AS v,emp_consts AS c
WHERE v.salary >= 21750
AND v.emp_id = c.emp_id
SPARQL:
SELECT ?hdate
WHERE { ?id HR:salary ?sal .
?id HR:hire-date ?hdate .
FILTER (?sal >= 21750) }
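What the SPARQL query does can be imitated over a plain list of triples. The sketch below matches the two triple patterns, joins them on the shared ?id variable, and applies the salary filter. It uses the three sample triples from the start of this detour; the function name is an assumption for illustration, and a real SPARQL engine is far more general.

```python
from datetime import date

# The sample database, as (subject, predicate, object) tuples.
triples = [
    ("emps:e13954", "HR:name", "Joe"),
    ("emps:e13954", "HR:hire-date", date(2000, 4, 14)),
    ("emps:e13954", "HR:salary", 48000),
]

def hire_dates_with_salary_at_least(triples, minimum):
    # Pattern 1 plus the filter: ?id HR:salary ?sal, keeping ?sal >= minimum
    ids = {s for s, p, o in triples if p == "HR:salary" and o >= minimum}
    # Pattern 2, joined on ?id: ?id HR:hire-date ?hdate
    return [o for s, p, o in triples if p == "HR:hire-date" and s in ids]

print(hire_dates_with_salary_at_least(triples, 21750))
```

No table join is needed because there are no tables: both patterns run over the same pool of triples, and the shared variable does the joining.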
DJ Shadow
http://en.wikipedia.org/wiki/DJ_Shadow
http://dbpedia.org/resource/DJ_Shadow
DBpedia.org Prefixes
SPARQL:
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX : <http://dbpedia.org/resource/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
DBpedia.org SPARQL endpoint at: http://dbpedia.org/snorql/
Sample Query (1)
SELECT *
WHERE { ?subject skos:subject
<http://dbpedia.org/resource/Category:Trip_hop_musicians>.
} LIMIT 50
Sample Query (2)
SELECT ?hasValue
WHERE {
<http://dbpedia.org/resource/DJ_Shadow>
dbpedia2:abstract ?hasValue
}
Sample Query (3)
SELECT ?x ?y
WHERE {
?x dbpedia-owl:label
<http://dbpedia.org/resource/Quannum_Projects> .
<http://dbpedia.org/resource/Quannum_Projects> rdfs:label ?y
} LIMIT 50
Sample Query (4)
SELECT ?property ?hasValue ?isValueOf
WHERE {
{ <http://dbpedia.org/resource/DJ_Shadow> ?property ?hasValue }
UNION
{ ?isValueOf ?property <http://dbpedia.org/resource/DJ_Shadow> }
}
Semantic Web - The
"Shadow" Web?
• Will every .html file have an .rdf file
associated with it? or corresponding URI
fragments in a single .rdf file?
• Why not just incorporate the RDF into the
HTML file?
• RDFa -- embeddable RDF
– examples from: http://www.w3.org/TR/xhtml-rdfa-primer/
Adding Semantics to HTML
<div>
<h2>The trouble with Bob</h2>
<h3>Alice</h3>
...
</div>
<div xmlns:dc="http://purl.org/dc/elements/1.1/">
<h2 property="dc:title">The trouble with Bob</h2>
<h3 property="dc:creator">Alice</h3>
...
</div>
Many Triples in 1 File
<div xmlns:dc="http://purl.org/dc/elements/1.1/">
<div about="/alice/posts/trouble_with_bob">
<h2 property="dc:title">The trouble with Bob</h2>
<h3 property="dc:creator">Alice</h3>
...
</div>
<div about="/alice/posts/jos_barbecue">
<h2 property="dc:title">Jo's Barbecue</h2>
<h3 property="dc:creator">Eve</h3>
...
</div>
...
</div>
<div> & <span>
<div about="/alice/posts/trouble_with_bob">
<h2 property="dc:title">The trouble with Bob</h2>
The trouble with Bob is that he takes much better photos than I do:
<div about="http://example.com/bob/photos/sunset.jpg">
<img src="http://example.com/bob/photos/sunset.jpg" />
<span property="dc:title">Beautiful Sunset</span>
by <span property="dc:creator">Bob</span>.
</div>
</div>
<div> == block level segmentation
<span> == inlined segmentation
http://en.wikipedia.org/wiki/Span_and_div
Putting It All Together
<div xmlns:foaf="http://xmlns.com/foaf/0.1/" about="#me" rel="foaf:knows">
<ul>
<li typeof="foaf:Person">
<a property="foaf:name" rel="foaf:homepage" href="http://example.com/bob/">Bob</a>
</li>
<li typeof="foaf:Person">
<a property="foaf:name" rel="foaf:homepage" href="http://example.com/eve/">Eve</a>
</li>
<li typeof="foaf:Person">
<a property="foaf:name" rel="foaf:homepage" href="http://example.com/manu/">Manu</a>
</li>
</ul>
</div>
Corresponding Graph
Semantic Web Criticism
• Cory Doctorow, "Metacrap: Putting the torch
to seven straw-men of the meta-utopia"
– http://www.well.com/~doctorow/metacrap.htm
• Catherine C. Marshall, Frank Shipman,
"Which Semantic Web?"
– http://scholar.google.com/scholar?cluster=7716137251541241731