Markup Languages and the Semantic Web
Lecture Notes Prepared by
Jagdish S. Gangolly
Interdisciplinary Ph.D Program in Information Science
State University of New York at Albany
10/4/2015
Inf 722 Fall 2007 (Gangolly)
Markup Languages
• Knowledge assumed:
– HTML
• DTD (Document Type Definition)
• Tags
– Format (confusion between format and other tags)
– Structure (Too flexible, and so almost useless)
– Content (virtually none)
• Very poor in semantics
• Inability to exploit latent semantics
• Users at the mercy of browsers
• Inflexibility in adding new tags unless blessed by browsers
XML I
• SGML, the forerunner of HTML
– Too complex (the annotated SGML standard runs over
1,000 pages)
– Too flexible
– Little browser support
• XML
– Less complex and yet extensible
– Flexible in expressing semantics
– Browser support
XML II
• Separation of format, content, and structure tags
– Content: Schema
• Rich set of data types
• Easy to understand and implement
– Format: XSL (Extensible Stylesheet Language)
• Complex and no universal browser support
• Such support may not be crucial because XSLT (XSL
Transformations) can convert XML into HTML
– Structure: Subsumed in content and format
– Representing richer semantics than HTML allowed
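
As a rough illustration of the separation described above, the sketch below parses a small XML fragment whose tags describe content only, and produces HTML in a separate formatting step. The element names (catalog, product, name, price), the price value, and the function to_html are all made up for this example; in practice the formatting step would more likely be an XSLT stylesheet than hand-written Python.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content-only XML: the tags say what the data is, not how it looks.
CATALOG_XML = """
<catalog>
  <product number="38267">
    <name>Water Bottle</name>
    <price currency="USD">9.95</price>
  </product>
</catalog>
"""

def to_html(xml_text):
    """Formatting step: render each <product> as an HTML list item."""
    root = ET.fromstring(xml_text)
    items = []
    for product in root.findall("product"):
        name = escape(product.findtext("name"))
        price = escape(product.findtext("price"))
        items.append(f"<li><b>{name}</b>: {price}</li>")
    return "<ul>" + "".join(items) + "</ul>"

html_output = to_html(CATALOG_XML)
print(html_output)
```

The same XML could be given a different presentation by swapping only to_html, which is the point of keeping content and format apart.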
XML III
• Discipline enforced
• A Document Type Definition (DTD), required to specify the
grammar of HTML and SGML documents, obliged programmers to
learn one more language (EBNF, Extended Backus-Naur Form)
in which DTDs are represented; an XML Schema is itself
written in XML.
• Good browser support
• DOM (Document Object Model), SAX (Simple API for
XML), and Namespaces help machines communicate and,
to an extent, understand each other's data
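
The difference between the two APIs named above can be sketched with Python's standard library, which ships both a DOM implementation (xml.dom.minidom) and a SAX parser (xml.sax). The sample document and the ProductHandler class are assumptions made for this illustration only.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical sample document used only for this illustration.
SAMPLE = "<products><product>Water Bottle</product><product>Helmet</product></products>"

# DOM: the whole document is loaded into a tree that can then be navigated.
dom = xml.dom.minidom.parseString(SAMPLE)
dom_names = [node.firstChild.data for node in dom.getElementsByTagName("product")]

# SAX: the parser streams events to a handler; no tree is built.
class ProductHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []
        self._in_product = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "product":
            self._in_product = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it until the end tag.
        if self._in_product:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "product":
            self.names.append("".join(self._buffer))
            self._in_product = False

handler = ProductHandler()
xml.sax.parseString(SAMPLE.encode("utf-8"), handler)
sax_names = handler.names

print(dom_names, sax_names)
```

Both approaches recover the same product names; DOM is usually more convenient for small documents, while SAX avoids holding a large document in memory.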
Semantic Web
• "... is a mesh of information linked up in such a
way as to be easily processable by machines,
on a global scale."
(http://infomesh.net/2001/swintro/)
Motivation
• Need for interchangeability of information
(information sharing)
• Need for interchangeability, translatability,
uniformity of ontologies
• Need for improving precision in retrieval
• Need for web services based on understanding of
data as well as metadata
Semantic Web Components
– Data
• Structure
• Content
• Format
• Ontology
– Metadata
• Representation Languages
• Facility for metadata Interchange
Data
• Data (Semi-structured as well as structured)
• Structure Tags: XML-Schema
• Content Tags: XML-Schema
• Ontology: Ontology representation languages
Metadata I
• Representation languages based on First Order
Logic
• KIF-based Ontolingua
(http://www.ksl.stanford.edu/software/ontolingua/)
• Loom (http://www.isi.edu/isd/LOOM/LOOM-HOME.html)
• Frame-Logic
(http://www.cs.sunysb.edu/~kifer/dood/papers.html)
Metadata II
• Languages using standardised syntax
– Simple HTML Ontology Extensions (SHOE)
(http://www.cs.umd.edu/projects/plus/SHOE/)
– XML-based Ontology Exchange Language (XOL)
(http://www.ai.sri.com/pkarp/xol/)
– Ontology Markup Language (OML and CKML)
– Resource Description Framework Schema Language
(RDFS) (http://www.w3.org/TR/rdf-schema/)
– RiboWEB (http://wwwsmi.stanford.edu/projects/helix/riboweb/kb-pub.html)
Metadata III
– OIL (Ontology Interchange Language)
(http://www.ontoknowledge.org/oil/)
– DAML+OIL (http://www.daml.org)
– XFML+CAMEL (eXchangeable Faceted Metadata Language +
Compound term composition Algebraically-Motivated Expression
Language) (http://www.csi.forth.gr/~tzitzik/XFML+CAMEL/)
• Good sources of information:
– http://www.cs.umd.edu/users/hendler/sciam/walkthru.html
– http://www.w3.org/2001/sw/
Dublin Core
• Metadata ElementsISO 15836:2003
10/4/2015
Title
Format
Creator
Identifier
Subject
Source
Description
Publisher
Language
Relation
Contributor
Date
Type
Coverage
Rights
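
To make the element set more concrete, here is a minimal sketch that writes a few Dublin Core elements as XML using Python's standard library. The namespace URI is the one published for the Dublin Core element set; the wrapper element name metadata, the function make_dc_record, and the values chosen for type and language are assumptions for this example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

def make_dc_record(fields):
    """Build a wrapper element with one dc:* child per (element, value) pair."""
    record = ET.Element("metadata")  # wrapper name is an assumption
    for element, value in fields.items():
        child = ET.SubElement(record, f"{{{DC_NS}}}{element}")
        child.text = value
    return record

record = make_dc_record({
    "title": "Markup Languages and the Semantic Web",
    "creator": "Jagdish S. Gangolly",
    "type": "Text",      # assumed value
    "language": "en",    # assumed value
})
record_xml = ET.tostring(record, encoding="unicode")
print(record_xml)
```

All fifteen elements are optional and repeatable, so a record may carry only the few that apply.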
RDF
(http://www.xml.com/pub/a/2002/01/30/daml1.html)
• XML-based language that allows you to
define classes and properties
<!-- A class of resources -->
<rdfs:Class rdf:ID="Product">
  <rdfs:label>Product</rdfs:label>
  <rdfs:comment>An item sold by Super Sports Inc.</rdfs:comment>
</rdfs:Class>
<!-- A property of Product whose value is a literal -->
<rdf:Property rdf:ID="productNumber">
  <rdfs:label>Product Number</rdfs:label>
  <rdfs:domain rdf:resource="#Product"/>
  <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
</rdf:Property>
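
A schema like the one above is ordinary XML, so its class and property declarations can be read with any XML parser. The sketch below wraps the Product example in an rdf:RDF root with the two namespace declarations it needs (the wrapper is an addition, not part of the slide) and declares the property as rdf:Property, the term defined in the RDF vocabulary. It is a plain XML read of this one fragment, not a general RDF parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# The Product schema, wrapped in an rdf:RDF root so that it is a complete document.
SCHEMA = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:ID="Product">
    <rdfs:label>Product</rdfs:label>
  </rdfs:Class>
  <rdf:Property rdf:ID="productNumber">
    <rdfs:label>Product Number</rdfs:label>
    <rdfs:domain rdf:resource="#Product"/>
    <rdfs:range rdf:resource="{RDFS}Literal"/>
  </rdf:Property>
</rdf:RDF>
"""

root = ET.fromstring(SCHEMA)

# Collect the declared class names.
classes = [c.get(f"{{{RDF}}}ID") for c in root.findall(f"{{{RDFS}}}Class")]

# Map each property name to its (domain, range) pair.
properties = {}
for prop in root.findall(f"{{{RDF}}}Property"):
    domain = prop.find(f"{{{RDFS}}}domain").get(f"{{{RDF}}}resource")
    range_ = prop.find(f"{{{RDFS}}}range").get(f"{{{RDF}}}resource")
    properties[prop.get(f"{{{RDF}}}ID")] = (domain, range_)

print(classes, properties)
```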
RDF
• "there is a Person identified by
http://www.w3.org/People/EM/contact#me, whose
name is Eric Miller, whose email address is
em@w3.org, and whose title is Dr."
RDF
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
  <contact:Person rdf:about="http://www.w3.org/People/EM/contact#me">
    <contact:fullName>Eric Miller</contact:fullName>
    <contact:mailbox rdf:resource="mailto:em@w3.org"/>
    <contact:personalTitle>Dr.</contact:personalTitle>
  </contact:Person>
</rdf:RDF>
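
Each statement in the document above is a (subject, predicate, object) triple. The sketch below recovers those triples with Python's standard XML parser. It handles only the flat pattern used here (one typed node with rdf:about and simple property children) and is not a full RDF/XML parser; the helper names qname_to_uri and extract_triples are made up for this example, and the mailbox value follows the W3C RDF Primer version of this example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:contact="http://www.w3.org/2000/10/swap/pim/contact#">
  <contact:Person rdf:about="http://www.w3.org/People/EM/contact#me">
    <contact:fullName>Eric Miller</contact:fullName>
    <contact:mailbox rdf:resource="mailto:em@w3.org"/>
    <contact:personalTitle>Dr.</contact:personalTitle>
  </contact:Person>
</rdf:RDF>
"""

def qname_to_uri(tag):
    """Turn ElementTree's '{namespace}local' form into a plain URI string."""
    namespace, local = tag[1:].split("}")
    return namespace + local

def extract_triples(xml_text):
    triples = []
    root = ET.fromstring(xml_text)
    for node in root:  # each child of rdf:RDF describes one resource
        subject = node.get(f"{{{RDF}}}about")
        # A typed node such as contact:Person implies an rdf:type statement.
        triples.append((subject, RDF + "type", qname_to_uri(node.tag)))
        for prop in node:
            resource = prop.get(f"{{{RDF}}}resource")
            # The object is either another resource (a URI) or a literal string.
            value = resource if resource is not None else (prop.text or "")
            triples.append((subject, qname_to_uri(prop.tag), value))
    return triples

triples = extract_triples(DOC)
for triple in triples:
    print(triple)
```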
DAML+OIL I (http://www.xml.com/pub/a/2002/01/30/daml1.html)
• DAML+OIL also allows you to define instances of classes
and specify their properties
<Product rdf:ID="WaterBottle">
  <rdfs:label>Water Bottle</rdfs:label>
  <productNumber>38267</productNumber>
</Product>
• DAML+OIL allows datatyping
<daml:DatatypeProperty rdf:ID="productNumber">
  <rdfs:label>Product Number</rdfs:label>
  <rdfs:domain rdf:resource="#Product"/>
  <rdfs:range rdf:resource="http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"/>
</daml:DatatypeProperty>
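
Datatyping means a tool can reject a productNumber whose value is not a valid nonNegativeInteger. The check below is a small sketch of that idea in Python, based on the XML Schema lexical form for this type (an optional sign followed by digits, where a minus sign is only allowed before zero). The function name parse_non_negative_integer is an assumption, and this is not how any particular DAML+OIL tool performs validation.

```python
import re

# Lexical form of nonNegativeInteger: optional "+" and digits, or "-" before zeros only.
_NON_NEGATIVE_INTEGER = re.compile(r"\+?[0-9]+|-0+")

def parse_non_negative_integer(text):
    """Return the integer value, or raise ValueError if the literal is invalid."""
    candidate = text.strip()  # surrounding whitespace is ignored for this type
    if not _NON_NEGATIVE_INTEGER.fullmatch(candidate):
        raise ValueError(f"not a nonNegativeInteger: {text!r}")
    return int(candidate)

print(parse_non_negative_integer("38267"))
```

A value such as "-5" or "12.5" raises ValueError, whereas an untyped literal range would have accepted any string.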
DAML+OIL II
• Provides for uniqueness, equivalence, enumerations,
disjoint classes, disjoint unions of classes,
non-exclusive Boolean combinations of classes,
intersection of classes, sub-classing, property
restrictions
• Rich enough to model ontologies
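
Several of the constructs listed above have a simple set-theoretic reading if each class is viewed as the set of its instances. The sketch below uses Python sets to illustrate sub-classing, disjointness, and disjoint union. The class names and instances are hypothetical, and the closed, enumerated sets are a simplification: a DAML+OIL reasoner works from axioms and does not assume that the known instances are the only ones.

```python
# Hypothetical class extensions: each class is modelled as a set of instances.
products = {"WaterBottle", "Helmet", "GiftCard"}
physical = {"WaterBottle", "Helmet"}
digital = {"GiftCard"}

def is_subclass(sub, sup):
    """Sub-classing: every instance of sub is also an instance of sup."""
    return sub <= sup

def are_disjoint(a, b):
    """Disjoint classes share no instances."""
    return a.isdisjoint(b)

def is_disjoint_union(whole, parts):
    """True if the parts are pairwise disjoint and together cover the whole."""
    parts = list(parts)
    pairwise = all(
        are_disjoint(a, b)
        for i, a in enumerate(parts)
        for b in parts[i + 1:]
    )
    return pairwise and set().union(*parts) == whole

print(is_subclass(physical, products))                 # sub-classing
print(physical & products)                             # intersection of classes
print(is_disjoint_union(products, [physical, digital]))
```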
Semantic Web Stack of Expressive Power (Berners-Lee)
[Figure: Semantic Web stack diagram; image not available]
Semantic Web Stack of Expressive Power (Berners-Lee)
• URI (Uniform Resource Identifier)
– http://www.ietf.org/rfc/rfc2396.txt
• Unicode
– unicode.org
• XML
– http://www.w3.org/XML/
• RDF
– http://www.w3.org/RDF/
• RDF-S (RDF Schema)
– www.w3.org/TR/2000/CR-rdf-schema-20000327/
• SPARQL
– www.w3.org/TR/rdf-sparql-query/
• OWL (Web Ontology Language)
– http://www.w3.org/2004/OWL/
• RIF
– http://www.w3.org/TR/rif-core/
• Unifying Logic
• Proof
• Crypto
• Trust
Web Ontology Language (OWL) I
• OWL Lite supports those users primarily needing a
classification hierarchy and simple constraints.
• OWL DL supports those users who want the maximum
expressiveness while retaining computational completeness
(all conclusions are guaranteed to be computable) and
decidability (all computations will finish in finite time).
• OWL Full is meant for users who want maximum
expressiveness and the syntactic freedom of RDF with no
computational guarantees.
Source: http://www.w3.org/TR/owl-features/
Semantic Web: Readings
• “The Semantic Web In Breadth”, by Aaron Swartz
– http://logicerror.com/semanticWeb-long
• The Semantic Web: An Introduction
– http://infomesh.net/2001/swintro/