The Semantic Web
Stefan Decker
Information Sciences Institute
University of Southern California
Outline
• Semantic Web Overview
– Vision, Challenges, Rationales
• Semantic Web in SCEC
2
Semantic Web
• coined by Tim Berners-Lee (1997)
“The Semantic Web is an extension of the current
web in which information is given well-defined
meaning, better enabling computers and people to
work in cooperation.”
– T. Berners-Lee, J. Hendler, O. Lassila,
“The Semantic Web”, Scientific American, May 2001
3
Doctor’s appointment
“The Semantic Web”, Scientific American, May 2001
[Figure: agents cooperating to arrange Mom's treatment. Labels: Lucy's Agent, Pete's Agent, Physician's Agent (required treatment), Insurance Co., Rating, Provider sites (in-plan? close-by? specialist?), Schedule appointment, Driving schedule]
4
Means to Achieve the Vision
• Explicit Ontologies
– Needed to understand each other's data
(e.g., joint notion about what a schedule is)
• Web Services
– Required to actively interconnect systems
(automatically make an appointment)
5
Technical challenges
• Interoperability
– Inaccurate, incomplete, heterogeneous data
– Unreliable, ill-defined, evolving services
• Natural language processing, data mining
– make information explicit
• Human-computer interaction
– querying interfaces, visualization
• Scalability
– Subsecond performance
6
Social challenges
• Standardization is hard
– DublinCore
• Bogus or inaccurate metadata
– Physician rating, profile
• Competition and commoditization
• Economic incentive
– Chicken and egg
• Complexity: developers and users
7
Jump Starters
• Machine Readable Data:
– […].org (human-edited directory)
– […].org (Music encyclopedia)
– RSS (RDF Site Summary)
– […] (embedded metadata)
– CC/PP (Composite Capability/Preference Profiles)
– P3P (Platform for Privacy Preferences)
8
Jump Starters
• B2B Vocabulary Projects
– PapiNet.org: Vocabulary for Paper Industry
– BPMI.org: Vocabulary for exchanging Business Process Models
– XML-HR: Vocabularies for human resources (HR)
– DMTF (Distributed Management Task Force): Vocabularies for managing enterprises
– …
• Research Vocabulary Projects
– Gene Ontology Working Group
– Earth Sciences
– MathNet
– …
9
How do we get there?
• Research communities: DL, AI, DB, …
• Standards bodies: W3C, OMG, …
• Non-profit: US, EC, Japan
• Industry: IBM, Nokia, HP, Microsoft(?), ...
Business.semanticweb.org
10
Non-profit
• DARPA
– “DARPA Agent Markup Language”
– since Aug 2000
www.daml.org
• NSF
– Co-sponsored events (e.g., SWWS)
www.semanticweb.org/SWWS
– Further support in the loop
• European Commission
www.ontoweb.org
– “Semantic Web Technologies”, Framework 6
• Japan
www.net.intap.or.jp/INTAP/
– Interoperability Technology Association for
Information Processing, Japan (INTAP)
11
AI: “Add logic to the Web”
• Assertions, rules
• Agents
• Interoperability
– First-order logics
– Ontologies, description logics
– Logic programming, datalog
– Problem-solving methods
– …
Distributed knowledge base
12
DB: “Everything is syntax”
• Semistructured data
• Web services
• Interoperability
– Data integration
– Mediation, query rewriting
– Model management
– Conceptual modeling
Conglomerate of distributed heterogeneous (semistructured) databases
13
Many Previously Unknown
Communication Partners
14
Heterogeneous Data
• Too many data formats/languages
15
Step 1
• Define uniform, underlying syntax
– Lowest common denominator: labeled graphs
(semi-structured Data) -> RDF
Relational Database, table Person:
ID   F-name   L-name
1    Stefan   Decker
2    Birgit   Decker
Structured Text (e.g., vCard):
begin:vcard
fn:Stefan Decker
n:Decker;Stefan
end:vcard
Both as labeled graphs:
Person --row--> (ID: 1, F-name: Stefan, L-name: Decker)
Person --row--> (ID: 2, F-name: Birgit, L-name: Decker)
vcard1 --fn--> Stefan Decker
vcard1 --n--> Decker;Stefan
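The mapping of the relational table and the vCard onto one labeled graph can be sketched in Python. This is only an illustration under stated assumptions: the function names `rows_to_triples` and `vcard_to_triples`, the node identifiers `row1`, `row2`, and the use of plain tuples as graph edges are hypothetical choices, not something the slides or RDF prescribe.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def rows_to_triples(table: str, rows: List[Dict[str, str]]) -> List[Triple]:
    """Turn relational rows into labeled-graph edges."""
    triples: List[Triple] = []
    for i, row in enumerate(rows, start=1):
        node = f"row{i}"  # hypothetical identifier for the row node
        triples.append((table, "row", node))
        for column, value in row.items():
            triples.append((node, column, value))
    return triples


def vcard_to_triples(node: str, text: str) -> List[Triple]:
    """Turn 'key:value' vCard lines into edges, skipping begin/end markers."""
    triples: List[Triple] = []
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        key = key.strip().lower()
        if key in ("begin", "end"):
            continue
        triples.append((node, key, value.strip()))
    return triples


person_rows = [
    {"ID": "1", "F-name": "Stefan", "L-name": "Decker"},
    {"ID": "2", "F-name": "Birgit", "L-name": "Decker"},
]
vcard_text = """begin:vcard
fn:Stefan Decker
n:Decker;Stefan
end:vcard"""

# Both sources end up in one uniform edge list.
graph = rows_to_triples("Person", person_rows) + vcard_to_triples("vcard1", vcard_text)
for triple in graph:
    print(triple)
```

Once both sources are edge lists of the same shape, they can be stored and queried together, which is the point of choosing labeled graphs as the lowest common denominator.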
16
XML
• Containment, hierarchy
• Adjacency (A followed by B)
• Attributes (atomic values)
• Opaque reference (IDREF)
Good for serialization, poor for modeling relational semantics
17
Encoding of Information
“The Creator of the Resource http://www.w3.org/Home/Lassila is Ora Lassila”
http://www.w3.org/Home/Lassila --Creator--> Ora Lassila
Endless encoding possibilities in XML:
<Creator>
  <uri>http://www.w3.org/Home/Lassila</uri>
  <name>Ora Lassila</name>
</Creator>

<Document uri="http://www.w3.org/Home/Lassila">
  <Creator>Ora Lassila</Creator>
</Document>

<Document uri="http://www.w3.org/Home/Lassila" Creator="Ora Lassila"/>
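To make the cost of these encoding choices concrete, the following Python sketch parses the three XML variants with the standard `xml.etree.ElementTree` module. The reader functions `read_a`, `read_b`, and `read_c` are hypothetical names introduced here; the point is only that each encoding needs its own extraction logic to recover the same statement.

```python
import xml.etree.ElementTree as ET

ENCODING_A = """<Creator>
  <uri>http://www.w3.org/Home/Lassila</uri>
  <name>Ora Lassila</name>
</Creator>"""

ENCODING_B = """<Document uri="http://www.w3.org/Home/Lassila">
  <Creator>Ora Lassila</Creator>
</Document>"""

ENCODING_C = '<Document uri="http://www.w3.org/Home/Lassila" Creator="Ora Lassila"/>'


def read_a(xml_text):
    """Resource and value are child elements; the property is the root tag."""
    root = ET.fromstring(xml_text)
    return (root.findtext("uri"), root.tag, root.findtext("name"))


def read_b(xml_text):
    """Resource is an attribute; the property is a child element."""
    root = ET.fromstring(xml_text)
    child = root.find("Creator")
    return (root.get("uri"), child.tag, child.text)


def read_c(xml_text):
    """Resource and value are both attributes; the property is an attribute name."""
    root = ET.fromstring(xml_text)
    return (root.get("uri"), "Creator", root.get("Creator"))


statements = [read_a(ENCODING_A), read_b(ENCODING_B), read_c(ENCODING_C)]
for statement in statements:
    print(statement)
```

All three readers yield the same (resource, property, value) statement, but none of them works on the other two documents.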
18
Introduction to RDF
• RDF (Resource Description Framework)
– Beyond Machine readable to Machine understandable
• RDF unites a wide variety of stakeholders:
– Digital librarians, content-raters, privacy advocates,
B2B industries, AI...
– Significant (but less than XML) industrial momentum,
led by W3C
• RDF consists of two parts
– RDF Model (a set of triples)
– RDF Syntax (different XML serialization syntaxes)
• RDF Schema for definition of Vocabularies
(simple Ontologies) for RDF (and in RDF)
19
A Simple Example
• Describing Resources
– URIs: global OIDs, literals
– Binary relationships between objects
– Arcs (relationships) are first-class objects
– Blank (anonymous) nodes
• “Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila”
• Structure
– Resource (subject): http://www.w3.org/Home/Lassila
– Property (predicate): http://www.schema.org/#Creator
– Value (object): “Ora Lassila”
http://www.w3.org/Home/Lassila --s:Creator--> Ora Lassila
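The subject/predicate/object structure lends itself to a very small data model. The Python sketch below stores statements as tuples in a set and answers wildcard queries. The `match` function and the second statement (with the made-up property `http://www.schema.org/#Title`) are assumptions added for illustration and do not come from the slides.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

LASSILA = "http://www.w3.org/Home/Lassila"
CREATOR = "http://www.schema.org/#Creator"
TITLE = "http://www.schema.org/#Title"  # hypothetical property, for illustration only

graph: Set[Triple] = {
    (LASSILA, CREATOR, "Ora Lassila"),
    (LASSILA, TITLE, "Home page"),  # hypothetical second statement
}


def match(
    triples: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that agree with every given position; None is a wildcard."""
    for triple in sorted(triples):
        if s is not None and triple[0] != s:
            continue
        if p is not None and triple[1] != p:
            continue
        if o is not None and triple[2] != o:
            continue
        yield triple


# "Who is the creator of the resource?"
creators = [obj for _, _, obj in match(graph, s=LASSILA, p=CREATOR)]
print(creators)
```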
20
RDF
• Graph-based universal syntax
(Agent-) Applications
RDF Layer (single data format, query and storage system)
Scheduling Service
Insurance Ratings
Calendar
Semantics in a global, open environment?
21
Step 2: Ontologies
• What is an Ontology?
“An ontology is a specification of a conceptualization.”
Tom Gruber, 1993
• Ontologies are social contracts
– Agreed, explicit semantics
– Understandable to outsiders
– (Often) derived in a community process
• Ontologies require Knowledge Representation
– Is_a hierarchy, part of, attributes, axioms
22
RDF and Ontologies
• Idea: Define an Ontology Language by defining predefined nodes and arcs
• The Ontology Language itself is just an Ontology
• Ontologies are used to tag data from sources
23
Step 2: Layers on Top of RDF
[Figure: ontology layered on top of the data graph]
From an Ontology: Person --subClassOf--> LivingThing
Data: Person --row--> (ID: 1, F-name: Stefan, L-name: Decker)
Data: Person --row--> (ID: 2, F-name: Birgit, L-name: Decker)
Tim Berners-Lee:
“Axioms, Architecture and Aspirations”
W3C all-working group plenary Meeting
28 February 2001
24
W3C Semantic Web Activity
Working Groups
RDF Core
Web Ontology
Advanced development
• Annotation (Annotea)
• Access control
• Calendaring
• Collaboration
• Logic
• Rules
• Workflows
25
RDF Core Working Group
• Resource Description Framework (RDF)
• Goals
– Improve RDF abstract model and XML syntax
according to implementers' feedback
– Define precise semantics for RDF and RDF Schema
– Clarify ties with XML family
26
Web Ontology Working Group
• Standard definition language for ontologies
(conceptual models)
• Derived from Description Logics
– But partial mapping to Database and Datalog possible ->
(see Horrocks, Volz, Decker, Grosof: WWW2003)
• Extension of RDF Schema and DAML+OIL
– Class Expressions (Intersection, Union, Complement)
– XML Schema Datatypes
– Enumerations
– Property Restrictions
• Cardinality Constraints
• Value Restrictions
27
The Layer Cake
Research Phase
Standardization Phase
Recommendation Phase
Tim Berners-Lee:
“Axioms, Architecture and Aspirations”
W3C all-working group plenary Meeting
28 February 2001
28
SCEC/IT Architecture for a
Community Modeling Environment
29
Tasks within SCEC - CME
• Towards an Earth Sciences Ontology:
– Cataloging and Unification of Existing
Databases
• E.g., Fissures and Fault Activity Database
• Building a Mediation Environment
• Organizing a Community Process
• Enriching of Web Services and Grid
Infrastructure with Semantics
– Service Discovery and Match Making
30
Fault Activity Database
• Hand-Maintained within SCEC (Sue Perry)
• Re-engineering of the Database Schemata
<rdfs:Class rdf:about="&FAD_v1;AVG_RECURRENCE_INTERVAL"
    rdfs:label="AVG_RECURRENCE_INTERVAL">
  <a:_slot_constraints
      rdf:resource="&FAD_v1;SCFADsep_02_00106"/>
  <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<rdfs:Class rdf:about="&FAD_v1;AVG_SLIP_PER_EVENT"
    rdfs:label="AVG_SLIP_PER_EVENT">
  <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<rdfs:Class rdf:about="&FAD_v1;AVG_SLIP_PER_EVENT_METHOD"
    rdfs:label="AVG_SLIP_PER_EVENT_METHOD">
  <rdfs:subClassOf rdf:resource="&rdfs;Resource"/>
</rdfs:Class>
<rdf:Property rdf:about="&FAD_v1;CFM-A_coord_file_URL"
    a:maxCardinality="1"
    rdfs:label="CFM-A_coord_file_URL">
  <rdfs:domain rdf:resource="&FAD_v1;FAULT"/>
  <rdfs:range rdf:resource="&rdfs;Literal"/>
</rdf:Property>
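A schema of this shape can be inspected with a few lines of Python. The sketch below is a simplified stand-in, not the actual Fault Activity Database schema: the namespace `http://example.org/FAD_v1#` is a made-up placeholder for the `&FAD_v1;` entity, full URIs are written out instead of entities, and the `a:` attributes are omitted because their namespace is not shown on the slide.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
FAD = "http://example.org/FAD_v1#"  # hypothetical placeholder namespace

SCHEMA = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:about="{FAD}AVG_SLIP_PER_EVENT" rdfs:label="AVG_SLIP_PER_EVENT">
    <rdfs:subClassOf rdf:resource="{RDFS}Resource"/>
  </rdfs:Class>
  <rdf:Property rdf:about="{FAD}CFM-A_coord_file_URL" rdfs:label="CFM-A_coord_file_URL">
    <rdfs:domain rdf:resource="{FAD}FAULT"/>
    <rdfs:range rdf:resource="{RDFS}Literal"/>
  </rdf:Property>
</rdf:RDF>"""


def q(namespace: str, local: str) -> str:
    """Build the '{namespace}local' form that ElementTree uses for qualified names."""
    return f"{{{namespace}}}{local}"


root = ET.fromstring(SCHEMA)

# Collect the URI of every declared class.
classes = [el.get(q(RDF, "about")) for el in root.findall(q(RDFS, "Class"))]

# Map every property URI to its (domain, range) pair.
properties = {}
for el in root.findall(q(RDF, "Property")):
    domain = el.find(q(RDFS, "domain")).get(q(RDF, "resource"))
    range_ = el.find(q(RDFS, "range")).get(q(RDF, "resource"))
    properties[el.get(q(RDF, "about"))] = (domain, range_)

print(classes)
print(properties)
```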
31
Planned: Mediation Environment with RDF-based Rule Language
Applications
Mediation with RDF-based Rule Language
Fault Activity Database
Fissures
Grid Services
32
Motivation:
Why Rule Languages for the Web
• Plethora of data available
– Data needs to be adapted and combined
– “Time to Market”: Faster to write rules than code
– Data Transformation and Integration
• Logic specification, not programming
– Tabled evaluation/bottom-up evaluation
– Semi-structured data
– Multiple semantics (Relational Data, UML, ER,
TopicMaps, DAML+OIL, XML-Schema, special
purpose data models)
– Distributed, heterogeneous sources
33
What’s Wrong With Existing
Approaches?
• Built-in semantics (e.g. SiLRI, RQL, DQL)
– but: many RDF-based languages with different
semantics (DAML+OIL, RDF Schema,
UML/RDF, TopicMaps/RDF, DMTF, …)
– For each language a specialized query language?
34
TRIPLE: Language Overview
• Native support for
– Resources & namespaces
– Abbreviations
– Models (sets of RDF statements)
– Reification
• Rules with expressive bodies (full FOL syntax)
• Inspired by F-Logic:
– subject[predicate -> object] (“molecule”)
35
Language Description I
• Namespace and resource abbreviations:
– rdf := “http://www.w3.org/1999/02/22-rdf-syntax-ns#”.
– isa := rdf:subClassOf.
• Statements, triples, molecules:
– subject[predicate -> object]
– subject[p1 -> o1; p2 -> o2; ...]
– s1[p1 -> s2[p2 -> o]]
• Models, model expressions, parameterized models:
– s[p -> o]@m: “triple <s,p,o> in model m”
– s[p -> o]@(m1 op m2): op is model intersection, union, or difference
– s[p -> o]@sf(m1, X, Y): Skolem function
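One way to read model expressions is as set operations over sets of triples. The Python sketch below follows that reading. It is an interpretation for illustration, not TRIPLE's implementation, and the sample triples and the `holds` helper are hypothetical.

```python
from typing import FrozenSet, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)
Model = FrozenSet[Triple]

# Two hypothetical models (sets of RDF statements).
m1: Model = frozenset({
    ("ex:doc1", "dc:creator", "Stefan Decker"),
    ("ex:doc1", "dc:subject", "RDF"),
})
m2: Model = frozenset({
    ("ex:doc1", "dc:subject", "RDF"),
    ("ex:doc2", "dc:subject", "triples"),
})


def holds(s: str, p: str, o: str, model: Model) -> bool:
    """Rough analogue of s[p -> o]@model: is the triple in the model?"""
    return (s, p, o) in model


both = m1 & m2      # model intersection
either = m1 | m2    # model union
only_m1 = m1 - m2   # model difference

print(holds("ex:doc1", "dc:subject", "RDF", both))
print(holds("ex:doc1", "dc:creator", "Stefan Decker", both))
print(holds("ex:doc1", "dc:creator", "Stefan Decker", only_m1))
```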
36
Language Description II
• Reification:
– stefan[believes -> <Ora[isAuthorOf -> homepage]>]
• Logical formulae:
– usual logical connectives and quantifiers (e.g., AND, FORALL, EXISTS)
– all variables introduced via FORALL (or EXISTS)
• Clauses:
– facts: s[p1 -> o1; p2 -> o2; ...].
– rules: FORALL X s1[p1 -> X] <- s2[p2 -> X] AND ... .
• Model blocks:
– @model { clauses }
– FORALL Mdl @model(Mdl) { clauses }
37
Example: Dublin Core
namespace abbreviations:
dc := “http://purl.org/dc/elements/1.0/”.
db := “http://www-db.stanford.edu/”.
model block:
@db:documents {
  fact:
  db:d_01_01[
    dc:title -> TRIPLE;
    dc:creator -> “Stefan Decker”;
    dc:subject -> RDF;
    dc:subject -> triples; ... ].
  rule:
  FORALL N p(N)[rdf:type -> xyz:Person;
                xyz:name -> N] <-
    EXISTS D D[dc:creator -> N].
}
query (“find all names”):
FORALL N <- EXISTS P P[rdf:type -> xyz:Person;
                       xyz:name -> N]@db:documents.
result:
N = “Stefan Decker”
[Figure: graph of db:d_01_01 with dc:title TRIPLE, dc:creator “Stefan Decker”, dc:subject RDF and triples; derived node with name “Stefan Decker” and rdf:type Person]
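The effect of the rule and the query can be imitated in Python. This is a hand-written approximation rather than TRIPLE's evaluation strategy: the function names `apply_person_rule` and `find_names` are hypothetical, and the Skolem term p(N) is simply rendered as a string.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Facts from the model block, written as plain triples.
documents: Set[Triple] = {
    ("db:d_01_01", "dc:title", "TRIPLE"),
    ("db:d_01_01", "dc:creator", "Stefan Decker"),
    ("db:d_01_01", "dc:subject", "RDF"),
    ("db:d_01_01", "dc:subject", "triples"),
}


def apply_person_rule(model: Set[Triple]) -> Set[Triple]:
    """For every dc:creator value N, derive a person node p(N) with type and name."""
    derived = set(model)
    for _, predicate, name in model:
        if predicate == "dc:creator":
            person = f"p({name})"  # stand-in for the Skolem term p(N)
            derived.add((person, "rdf:type", "xyz:Person"))
            derived.add((person, "xyz:name", name))
    return derived


def find_names(model: Set[Triple]) -> List[str]:
    """Return the names of all nodes typed as xyz:Person."""
    persons = {s for s, p, o in model if p == "rdf:type" and o == "xyz:Person"}
    return sorted(o for s, p, o in model if p == "xyz:name" and s in persons)


names = find_names(apply_person_rule(documents))
print(names)
```

Without the rule, the same query over the raw facts finds nothing, because no xyz:Person node has been asserted.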
38
Example: Specification of RDF Schema Semantics
namespace abbreviations:
rdf := 'http://www.w3.org/...rdf-syntax-ns#'.
rdfs := 'http://www.w3.org/.../PR-rdf-schema-...#'.
resource abbreviations:
type := rdf:type.
subPropertyOf := rdfs:subPropertyOf.
subClassOf := rdfs:subClassOf.
model block:
FORALL Mdl @rdfschema(Mdl) {
  “copy” triples from Mdl:
  FORALL O,P,V O[P->V] <- O[P->V]@Mdl.
  Transitivity of subClassOf:
  FORALL O,V O[subClassOf->V] <- EXISTS W (O[subClassOf->W]
                                  AND W[subClassOf->V]).
  …
}
39
Example:
Cars Ontology with RDF Schema Semantics
@cars {
xyz:MotorVehicle[rdfs:subClassOf -> rdfs:Resource].
xyz:PassengerVehicle[rdfs:subClassOf -> xyz:MotorVehicle].
xyz:Truck[rdfs:subClassOf -> xyz:MotorVehicle].
xyz:Van[rdfs:subClassOf -> xyz:MotorVehicle].
xyz:MiniVan[
rdfs:subClassOf -> xyz:Van;
rdfs:subClassOf -> xyz:PassengerVehicle].
}
[Figure: class hierarchy with xyz:MotorVehicle at the top; xyz:Truck, xyz:Van, and xyz:PassengerVehicle below it; xyz:MiniVan below xyz:Van and xyz:PassengerVehicle]
FORALL X <- X[rdfs:subClassOf -> xyz:MotorVehicle]@cars.
X = xyz:Van
X = xyz:Truck
X = xyz:PassengerVehicle
FORALL X <- X[rdfs:subClassOf -> xyz:MotorVehicle]@rdfschema(cars).
X = xyz:Van
X = xyz:Truck
X = xyz:PassengerVehicle
X = xyz:MiniVan
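The difference between the two queries can be reproduced in Python by computing the transitive closure of subClassOf. This is a naive fixpoint loop written for illustration, not TRIPLE's evaluation strategy, and the names `rdfschema` and `subclasses_of` are hypothetical helpers that cover only the "copy" and transitivity rules.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)
SUBCLASS = "rdfs:subClassOf"

cars: Set[Triple] = {
    ("xyz:MotorVehicle", SUBCLASS, "rdfs:Resource"),
    ("xyz:PassengerVehicle", SUBCLASS, "xyz:MotorVehicle"),
    ("xyz:Truck", SUBCLASS, "xyz:MotorVehicle"),
    ("xyz:Van", SUBCLASS, "xyz:MotorVehicle"),
    ("xyz:MiniVan", SUBCLASS, "xyz:Van"),
    ("xyz:MiniVan", SUBCLASS, "xyz:PassengerVehicle"),
}


def rdfschema(model: Set[Triple]) -> Set[Triple]:
    """Copy the model, then add subClassOf triples until nothing new is derived."""
    closure = set(model)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for s, p, o in closure if p == SUBCLASS]
        for sub, mid in pairs:
            for mid2, sup in pairs:
                if mid == mid2 and (sub, SUBCLASS, sup) not in closure:
                    closure.add((sub, SUBCLASS, sup))
                    changed = True
    return closure


def subclasses_of(model: Set[Triple], cls: str) -> List[str]:
    """Answer 'X[rdfs:subClassOf -> cls]' against the given model."""
    return sorted(s for s, p, o in model if p == SUBCLASS and o == cls)


direct = subclasses_of(cars, "xyz:MotorVehicle")
inferred = subclasses_of(rdfschema(cars), "xyz:MotorVehicle")
print(direct)
print(inferred)
```

Only the second query returns xyz:MiniVan, because that answer depends on the derived triple rather than on an asserted one.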
40
Grid Computing and Web Services (ongoing)
• Matchmaking between Jobs and Resources
• Hard-Coded in Globus Toolkit
– Re-engineering using an ontology- and rule-based solution
– RDF and DMTF Vocabulary (www.dmtf.org)
<rdfs:Class rdf:ID="CIM_ComputerSystem">
  <rdfs:subClassOf rdf:resource="#CIM_System"/>
  <version><![CDATA["2.6.0"]]></version>
  <rdfs:comment parseType="Literal"><![CDATA["A class derived from System that is a special
collection of ManagedSystemElements. This collection provides compute
capabilities and serves as aggregation point to associate one or more of the
following elements: FileSystem, OperatingSystem, Processor and Memory
(Volatile and/or NonVolatile Storage)."]]></rdfs:comment>
  <rdfs:subClassOf>
    <daml:Restriction>
      <daml:toClass rdf:resource="#string"/>
41
Semantic Web and Earth Sciences
• Semantic Web field provides technologies
for explicit vocabularies and data mediation
• Standards-based, many resources available
– Editors, Rule Engines, APIs
• Effort feeds back into other domains
42