Introduction to Semantic Web
José Júlio Alferes
[email protected]
Hanoi, 26th March to 5th April, 2007
Course outline
The Semantic Web idea
Semi-structured data in the Web – XML
Describing Web resources in RDF
Ontologies and Introduction to Description Logics
Web Ontology Language: OWL
Rules and inferences in the Semantic Web
Hanoi, March/April 2007
• A great part of the course is based on the
introductory book:
A Semantic Web Primer
Grigoris Antoniou and Frank van Harmelen,
The MIT Press (2004), ISBN:0262012103
• All the standards, and much more material at:
• Lots of material available at REASE
Register and have a look!
Part 1: The Vision of a Semantic Web
Today’s Web
• Most of today’s Web content is designed and
appropriate for human consumption
– Even Web content that is generated automatically is
usually then processed and presented without the
original structural information (e.g. from databases)
• Typical Web usage of today needs people
– for seeking and making use of information,
– searching for and getting in touch with other people,
– reviewing catalogs of online stores and ordering
products by filling out forms, …
• A Web on machines, for human use
Keyword-Based Search Engines
• Except for keyword-based search engines,
the Web today is not well supported by
software tools, nor is it particularly well
suited to automatic processing
• The Web would not have been the huge
success it was, were it not for search engines
Problems of Keyword-Based
Search Engines
• Results are highly sensitive to vocabulary
• Results are single Web pages
• Human involvement is necessary to
interpret and combine results
• Results of Web searches are not readily
accessible by other software tools
Meaning for Web pages
• The meaning of Web pages is not readily machine-accessible: lack of semantics
• It is difficult to distinguish the meaning between
these two sentences:
I am a professor of computer science.
I am a professor of computer science,
you may think. Well, . . .
• Natural language processing cannot be the answer
on such a huge scale
The Semantic Web Approach
• Represent Web content in a form that is more easily machine-processable
• Use intelligent techniques to take advantage of these representations
• The Semantic Web will gradually evolve out of the
existing Web
– It is not a competitor to the current WWW
• A Web on machines, for use by both humans and machines
• Evolving into a Web of Data
Semantic Web Definition
• By T. Berners-Lee, J. Hendler, O. Lassila in Scientific American
– The Semantic Web is an extension of the current web in which
information is given well-defined meaning, better enabling
computers and people to work in cooperation.
• Definition at the Semantic Web Activity at W3C
– The Semantic Web provides a common framework that allows data
to be shared and reused across application, enterprise, and
community boundaries. It is a collaborative effort led by W3C with
participation from a large number of researchers and industrial
partners. It is based on the Resource Description Framework
(RDF), which integrates a variety of applications using XML for
syntax and URIs for naming.
The Semantic Web Impact –
Knowledge Management
• Knowledge management concerns itself with
acquiring, accessing, and maintaining knowledge
within an organization
• Key activity of large businesses: internal
knowledge as an intellectual asset
• It is particularly important for international,
geographically dispersed organizations
• Most information is currently available in a
weakly structured form (e.g. text, audio, video)
Limitations of Current Knowledge
Management Technologies
• Searching information
– Keyword-based search engines only
• Extracting information
– human involvement necessary for browsing, retrieving,
interpreting, combining
• Maintaining information
– inconsistencies in terminology, outdated information.
• Viewing information
– Impossible to define views (in the manner of databases)
on Web knowledge
Semantic Web Enabled Knowledge Management
• Knowledge will be organized in conceptual spaces
according to its meaning.
• Automated tools for maintenance and knowledge discovery
• Semantic query answering
• Query answering over several documents
• Defining who may view certain parts of
information (even parts of documents) will be possible
The Semantic Web Impact –
B2C Electronic Commerce
• A typical scenario: user visits one or several
online shops, browses their offers, selects
and orders products.
• Ideally humans would visit all, or all major
online stores, but this is too time-consuming
• Shopbots are a useful tool
Limitations of Shopbots
• They rely on wrappers: extensive
programming required
• Wrappers need to be reprogrammed when
an online store changes its outfit
• Wrappers extract information based on
textual analysis
– Error-prone
– Limited information extracted
Semantic Web Enabled B2C
Electronic Commerce
• Software agents that can interpret the
product information and the terms of service
– Pricing and product information, delivery and
privacy policies will be interpreted and
compared to the user requirements.
• Information about the reputation of shops
• Sophisticated shopping agents will be able
to conduct automated negotiations.
The Semantic Web Impact –
B2B Electronic Commerce
• Greatest economic promise
• Currently relies mostly on
– Isolated technology, understood only by experts
– Difficult to program and maintain, error-prone
– Each B2B communication requires separate programming
• Web appears to be perfect infrastructure
– But B2B not well supported by Web standards
Semantic Web Enabled B2B
Electronic Commerce
• Businesses enter partnerships without much overhead
• Differences in terminology will be resolved using
standard abstract domain models
• Data will be interchanged using translation services
• Auctioning, negotiations, and drafting contracts
will be carried out automatically (or semi-automatically) by software agents
The Semantic Web Impact – Data Integration
• Some applications require managing several
(heterogeneous) databases, e.g.
– after company mergers
– groups of companies with similar activities (e.g. hotel
– combination of administrative data – e-Government
– scientific data (e.g. in bioinformatics)
• Most public data is available on the Web
– And, most likely, proprietary data isn’t available
Data integration
• What is needed?
– Data available for machine processing
– Data possible to combine, on a Web scale
– Reasoning mechanisms over data
– A Web of Data
An example of integration
• Consider the following data on a bookstore
Data integration example
• Export data as relations
– Relations form a graph, where nodes refer to
either data or identifiers (URI)
Data integration example (cont)
• Data in another book store
Data integration example (cont)
• What if we had additional knowledge?
• E.g
– we could know that a:author is the same as f:auteur
– We could know that in both cases they represent
• In other words, we could have
– ontologies describing the concepts and relations
– mappings of our relations into the ontologies
Data integration example (cont)
• With this merged information, richer
queries are possible:
– The second bookstore can query what is the
web page of the author of the book
– Possibly can ask about other information on the
author, or translator via foaf
– The first bookstore can query information about
the translator
A Web of Data
Automatic data integration
• This process of merging knowledge is done easily by users
on the Web
• For doing it automatically, some more rigor is needed
Have data structured on the Web
Name relations in a standard manner
Have ontologies for describing general concepts
Refer to the ontologies when exporting relations
• The Semantic Web provides technologies to make all this possible
– For structuring data; defining relations; defining ontologies;
querying data; reasoning over data; …
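The merging step from the bookstore example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation names a:author and f:auteur come from the slides, while every other identifier and value (book:isbn-1, person:p1, a:homepage, f:traducteur, the example.org URL) is hypothetical.

```python
# Each bookstore exports its data as (subject, relation, object) triples.
# Apart from a:author and f:auteur, all names and values are hypothetical.
store_a = {
    ("book:isbn-1", "a:title", "Some Book"),
    ("book:isbn-1", "a:author", "person:p1"),
    ("person:p1", "a:homepage", "http://example.org/p1"),
}
store_f = {
    ("book:isbn-1", "f:titre", "Un livre"),
    ("book:isbn-1", "f:auteur", "person:p1"),
    ("book:isbn-1", "f:traducteur", "person:p2"),
}

# Additional knowledge: relations that are known to mean the same thing.
SAME_AS = {"f:auteur": "a:author"}


def merge(*graphs):
    """Union the graphs, renaming relations according to SAME_AS."""
    merged = set()
    for graph in graphs:
        for subject, relation, obj in graph:
            merged.add((subject, SAME_AS.get(relation, relation), obj))
    return merged


def objects(graph, subject, relation):
    """All objects reachable from subject via relation."""
    return {o for s, r, o in graph if s == subject and r == relation}


web_of_data = merge(store_a, store_f)
authors = objects(web_of_data, "book:isbn-1", "a:author")
# A query neither store could answer alone from the French data:
# the home page of the author of the book.
homepages = {h for a in authors for h in objects(web_of_data, a, "a:homepage")}
print(homepages)
```

Because both stores use the same identifier for the book, the union of the two triple sets already links their data; the SAME_AS table plays the role of a (very small) ontology mapping.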
Semantic Web Technologies
(Structure data)
Explicit Metadata
Logic and Inference
• Web content is currently formatted for
human readers rather than programs
• HTML is still the predominant language in
which Web pages are written
– Either directly by user, or using specific tools
• Vocabulary (and markup) describes
presentation, not content
An HTML Example
<h1>Agilitas Physiotherapy Centre</h1>
Welcome to the home page of the Agilitas Physiotherapy Centre. Do
you feel pain? Have you had an injury? Let our staff Lisa Davenport,
Kelly Townsend (our lovely secretary) and Steve Matthews take care
of your body and soul.
<h2>Consultation hours</h2>
Mon 11am - 7pm<br>
Tue 11am - 7pm<br>
Wed 3pm - 7pm<br>
Thu 11am - 7pm<br>
Fri 11am - 3pm<p>
But note that we do not offer consultation during the weeks of the
<a href=". . .">State Of Origin</a> games.
Problems with HTML
• Humans have no problem with this
– especially if it is previously processed by a browser
• Machines (software agents) do:
• How to distinguish therapists from the secretary?
• How to determine exact consultation hours?
– They would have to follow the link to the State
Of Origin games to find when they take place.
A Better (Structured) Representation
Agilitas Physiotherapy Centre
<therapist>Lisa Davenport</therapist>
<therapist>Steve Matthews</therapist>
<secretary>Kelly Townsend</secretary>
Explicit Metadata
• This representation is far more easily
processable by machines
• XML for (semi)structured data
• Metadata: data about data
– Metadata capture part of the meaning of data
• Semantic Web does not rely on text-based
manipulation, but rather on machine-processable metadata
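To see why the tagged representation is easier to process, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The therapist and secretary tags are from the slide; the wrapping company, companyName and staff elements are assumptions added only to make the fragment a well-formed document.

```python
import xml.etree.ElementTree as ET

# The <company>, <companyName> and <staff> wrappers are assumed for illustration.
doc = """
<company>
  <companyName>Agilitas Physiotherapy Centre</companyName>
  <staff>
    <therapist>Lisa Davenport</therapist>
    <therapist>Steve Matthews</therapist>
    <secretary>Kelly Townsend</secretary>
  </staff>
</company>
"""

root = ET.fromstring(doc)
# The tags, not the surrounding text, say who is a therapist and who is not.
therapists = [e.text for e in root.iter("therapist")]
secretaries = [e.text for e in root.iter("secretary")]
print("Therapists:", therapists)
print("Secretaries:", secretaries)
```

No text analysis is needed: the question "who are the therapists?" becomes a lookup by tag name.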
The term ontology originates from philosophy
• The study of the nature of existence
In computer science the term has a different meaning
• An ontology is an explicit and formal
specification of a conceptualization
Typical Components of Ontologies
• Terms denote important concepts (classes of
objects) of the domain
– e.g. professors, staff, students, courses, departments
• Relationships between these terms: typically class hierarchies
– a class C is a subclass of another class C' if every
object in C is also included in C'
– e.g. all professors are staff members
Further Components of Ontologies
Properties
– e.g. X teaches Y
Value restrictions
– e.g. only faculty members can teach courses
Disjointness statements
– e.g. faculty and general staff are disjoint
Logical relationships between objects
– e.g. every department must include at least 10 faculty members
Example of a Class Hierarchy
The Role of Ontologies in the Web
• Ontologies provide a shared understanding
of a domain: semantic interoperability
– overcome differences in terminology
– mappings between ontologies
• Remember the integration example?
• Ontologies are useful for the organization
and navigation of Web sites
The Role of Ontologies in Web Search
• Ontologies are useful for improving accuracy of
Web searches
– search engines can look for pages that refer to a precise
concept in an ontology
• Good example: GoPubMed
• Web searches can exploit
generalization/specialization information
– If a query fails to find a relevant document, the search
engine may suggest to the user a more general query.
– If too many answers are retrieved, the search engine
may suggest to the user some specializations.
Web Ontology Languages
RDF and RDF Schema
• RDF is a data model for objects and relations
between them, much like the graphs of the
integration example
• RDF Schema is a vocabulary description language
– Describes properties and classes of RDF resources
– Provides semantics for generalization hierarchies of
properties and classes
Web Ontology Languages (2)
• A richer ontology language, including
relations between classes (e.g., disjointness)
cardinality (e.g. “exactly one”)
richer typing of properties
characteristics of properties (e.g., symmetry)
Logic and Inference
• Logic studies the principles of reasoning
– Formal languages for expressing knowledge
– Well-understood formal semantics
• Declarative knowledge: we describe what holds
without caring about how it can be deduced
• Automated reasoners can deduce (infer)
conclusions from the given knowledge
An Inference Example
prof(X) → faculty(X)
faculty(X) → staff(X)
We can prove/deduce the following conclusions:
prof(X) → staff(X)
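The deduction above can be reproduced mechanically. The following is a minimal forward-chaining sketch in Python, not a full reasoner: it only handles rules with one premise and one conclusion over a single variable, and the individual "michael" is a hypothetical fact added to have something to reason about.

```python
# Rules of the form premise(X) -> conclusion(X), as on the slide.
RULES = [("prof", "faculty"), ("faculty", "staff")]


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, individual in list(derived):
                new_fact = (conclusion, individual)
                if predicate == premise and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


# Hypothetical fact: michael is a professor.
facts = {("prof", "michael")}
closure = forward_chain(facts, RULES)
print(sorted(closure))
```

Every individual that is a prof ends up as staff as well, which is exactly what the derived rule prof(X) → staff(X) states.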
Logic versus Ontologies
• The previous example involves knowledge
typically representable in ontologies
– Logic can be used to uncover ontological knowledge
that is implicitly given
– It can also help uncover unexpected relationships and inconsistencies
• Logic is more general than ontologies
– It can also be used by intelligent agents for making
decisions and selecting courses of action
Expressive Power versus
Computational Complexity
• The more expressive a logic is, the more
computationally expensive it becomes to draw conclusions
– Drawing certain conclusions may become impossible if
non-computability barriers are encountered.
• The previous examples involved rules of the form “If
conditions, then conclusion,” and only finitely
many objects
– This subset of logic is tractable and is supported by
efficient reasoning tools
Inference and Explanations
• Besides the conclusions deduced, the deductions
themselves may be of interest
• A proof/deduction for a conclusion can be seen as
an explanation for it
• In the context of the Semantic Web explanations
may be useful for trust!
– Provide users with explanation of the results (on
– Activities between automatic agents: create or validate
Typical Explanation Procedure
• Facts will typically be traced to some Web address
– The trust of the Web address will be verifiable
by agents
• Rules may be a part of a shared commerce
ontology or the policy of the online shop
Software Agents
• Software agents work autonomously and proactively
– They evolved out of object-oriented and component-based programming
• A personal agent on the Semantic Web will:
– receive some tasks and preferences from the person
– seek information from Web sources, communicate with
other agents
– compare information about user requirements and
preferences, make certain choices
– give answers to the user
Intelligent Personal Agents
[Figure: Personal Agent (present in Web browser), Search engine, Semantic Web infrastructure, WWW docs]
Semantic Web Technologies
• Metadata
– Identify and extract information from Web sources
• Ontologies
– Web searches, interpret retrieved information
– Communicate with other agents
• Logic
– Process retrieved information, draw conclusions,
provide explanations
A Layered Approach
• The development of the Semantic Web
proceeds in steps
– Each step building a layer on top of another
• Downward compatibility
• Upward partial understanding
The Semantic Web Layer Tower
Semantic Web Layers
• XML layer
– Syntactic basis for structured data
• RDF layer
– RDF basic data model for facts
– RDF Schema simple ontology language
• Ontology layer
– More expressive languages than RDF Schema
– Current Web standard: OWL
Semantic Web Layers (cont)
• Logic layer
– enhance ontology languages further
– application-specific declarative knowledge
• Proof layer
– Proof generation, exchange, validation
• Trust layer
– Digital signatures
– recommendations, rating agencies, …
The New Semantic Web Layer Tower
Course outline revisited
The Vision of the Semantic Web
XML and semi-structured data in the Web
Describing Web resources in RDF
Ontologies and Introduction to Description Logics
Web Ontology Language: OWL
Rules and inferences in the Semantic Web
Part 2: XML & friends
• XML: Extensible Markup Language
• Defined by WWW Consortium (W3C)
• It was designed as an extension of HTML
– Both XML and HTML derive from the former SGML
(Standard Generalized Markup Language)
– In XML, documents can have tags that provide additional
information about parts of the document
• E.g. <part>
<title>XML </title>
<slide> Introduction …</slide>
</part>
Introduction (cont)
• XML versus HTML
– Contrary to HTML, XML is extensible
• The user can add new types of tags and define later (or adhere to
some predefined set), separately, how to deal with those tags
(in particular, how to display them).
– In XML, the data, their syntax, and the way they are
displayed are all defined separately.
• In HTML, the syntax is fixed (the set of tags is predefined), and
the tags define simultaneously the data and its visualization.
• As we shall see, this allows for structuring the
information in a way that makes it much easier for
machine processing.
An HTML Example
<h2>A Semantic Web Primer</h2>
<i>by <b>G. Antoniou</b> and
<b>F. van Harmelen</b></i><br>
MIT Press 2004<br>
ISBN 0262012103
The Same Example in XML
<book>
<title>A Semantic Web Primer</title>
<author>G. Antoniou</author>
<author>F. van Harmelen</author>
<publisher>MIT Press</publisher>
<year>2004</year>
<ISBN>0262012103</ISBN>
</book>
Formatting in HTML and XML
• The HTML representation also provides the formatting
– The main use of an HTML document is to
display information: it must define formatting
• In XML the content is separated from the formatting
– The same information can be displayed in
different ways
– The way to display is defined separately
HTML versus XML: Similarities
• Both use tags (e.g. <h2> and </year>)
• Tags may be nested (tags within tags)
• Human users can read and interpret both
HTML and XML representations quite easily
… But how about machines?
Problems with Automated
Interpretation of HTML Documents
An intelligent agent trying to retrieve the names
of the authors of the book
• Authors’ names could appear immediately
after the title
– or immediately after the word by
• Are there two authors?
– Or just one, called “G. Antoniou and F. van Harmelen”?
Structural Information in XML
• HTML documents do not contain structural information
• XML is easier to access by machines because
– Every piece of information is described.
– Relations are also defined through the nesting structure
• E.g., the <author> tags appear within the <book> tags,
so they describe properties of the particular book.
Structural Information in XML (cont)
• A machine processing the XML document could
determine that:
– the author element refers to the enclosing book element
– by its nesting, rather than by proximity considerations
• XML allows the definition of constraints on values
– E.g. a year must be a number of four digits
• XML allows the definition of constraints over the structure
– E.g. a book may have several authors, but only one title
Another Example
<h2>Relationship matter-energy</h2>
<i> E = M × c² </i>
• In XML
<meaning>Relationship matter energy</meaning>
<leftside> E </leftside>
<rightside> M × c² </rightside>
Tags in XML
• Despite “talking” about completely different
things, both HTML documents use the same tags
– In fact, the tags are mainly for display
• In XML it is completely different
– XML tags are not fixed: tags are user-definable
– XML is a meta markup language: a language for
defining markup languages
• We will see DTDs and XML Schema
XML Vocabularies
• Web applications must agree on common
vocabularies to communicate and collaborate
• Communities and business sectors are defining
their specialized vocabularies
mathematics (MathML)
bioinformatics (BSML)
human resources (HRML)
The XML Language
An XML document consists of
• a prolog
• a number of elements
• an optional epilog
Prolog of an XML Document
The prolog consists of
• an XML declaration and
• an optional reference to external structuring documents
<?xml version="1.0" encoding="UTF-16"?>
<!DOCTYPE book SYSTEM "book.dtd">
• This last one refers to where the vocabulary
(defining the tags and structure) is defined.
XML Elements
• XML documents then contain elements
– There is always one root element
– Other elements are nested
• An element consists of:
– an opening tag
– the content
– a closing tag
<author>G. Antoniou</author>
XML Elements (more details)
• Tag names can be chosen almost freely.
• The first character must be a letter, an
underscore, or a colon
• No name may begin with the string “xml”
in any combination of cases
– E.g. no “Xml”, nor “xML”
Content of XML Elements
• Content may be text, other elements, or nothing
• E.g.
<lecturer>
<name>David Billington</name>
<phone> +61 − 7 − 3875 507 </phone>
</lecturer>
• If there is no content, then the element is called empty; it is
abbreviated as:
<lecturer/>
standing for <lecturer></lecturer>
XML Attributes
• An empty element is not necessarily meaningless
– It may have some properties in terms of attributes
• An attribute is a name-value pair inside the
opening tag of an element:
<lecturer name="David Billington"
phone="+61 − 7 − 3875 507"/>
XML Attributes: Another Example
<order orderNo="23456"
customer="John Smith"
date="October 15, 2002">
<item itemNo="a528" quantity="1"/>
<item itemNo="c817" quantity="3"/>
</order>
The Same Example without Attributes
<customer>John Smith</customer>
<date>October 15, 2002</date>
XML Elements vs Attributes
• Attributes can be replaced by elements
• When to use elements and when attributes is a matter of taste
– It is like the choice of using attributes or relations in relational databases
• But attributes cannot be nested
• Attributes guarantee that there is only one customer, one
number and one date for each order
– An element cannot have two attributes with the same name
• We will see that, via attributes, means are also provided for
guaranteeing that no two orders can have the same number
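The claim that attributes can be replaced by elements can be made concrete with a small conversion routine. This is a sketch using Python's standard xml.etree.ElementTree; the function name attributes_to_elements is made up for this illustration, and the order document is the one from the attribute example.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(element):
    """Return a copy of element in which every attribute becomes a child element."""
    copy = ET.Element(element.tag)
    copy.text = element.text
    for name, value in element.attrib.items():
        child = ET.SubElement(copy, name)
        child.text = value
    for child in element:
        copy.append(attributes_to_elements(child))
    return copy


order = ET.fromstring(
    '<order orderNo="23456" customer="John Smith" date="October 15, 2002">'
    '<item itemNo="a528" quantity="1"/>'
    '<item itemNo="c817" quantity="3"/>'
    "</order>"
)
converted = attributes_to_elements(order)
print(ET.tostring(converted, encoding="unicode"))
```

The reverse direction is not always possible: an element may have several children with the same name or nested content, whereas an attribute holds a single flat value.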
Further details of XML
Comments
– A piece of text that is to be ignored by the parser
<!-- This is a comment -->
CDATA sections
– For storing strings which have the form of tags
<![CDATA[<book> … </book>]]>
– Here <book> and </book> are just strings that appear
as text/data
Processing Instructions (PIs)
– Define procedural attachments, e.g. for displaying
<?stylesheet type="text/css" href="mystyle.css"?>
Well-Formed XML Documents
• Syntactically correct documents follow some rules
– Only one outermost element (called root element)
– Each element contains an opening and a corresponding
closing tag
– Tags may not overlap.
• E.g. the following is wrong:
<author><name>Lee Hong</author></name>
– Attributes within an element have unique names
– Element and tag names are permissible
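These rules are exactly what an XML parser checks. The sketch below uses Python's standard xml.etree.ElementTree, which rejects any document that is not well-formed; the helper name well_formedness_error and the sample strings other than the Lee Hong example are made up for illustration.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(text):
    """Return None if text is well-formed XML, else the parser's error message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        return str(err)


samples = {
    "ok": "<author><name>Lee Hong</name></author>",
    "overlapping tags": "<author><name>Lee Hong</author></name>",
    "two root elements": "<author/><author/>",
    "missing closing tag": "<author><name>Lee Hong</name>",
    "duplicate attribute": '<author id="a1" id="a2"/>',
}
for label, text in samples.items():
    print(label, "->", well_formedness_error(text))
```

Note that this only tests well-formedness. Whether the document also respects a DTD or schema (validity) is a separate check that ElementTree does not perform.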
XML Documents as trees
• An XML document can be represented as an
ordered labeled tree:
There is exactly one root
There are no cycles
Each non-root node has exactly one parent
Each node has a label.
The order of elements is important
… but the order of attributes is not important
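A short sketch of this tree view, using Python's standard xml.etree.ElementTree. The outline function is a hypothetical helper; the book document reuses the earlier example.

```python
import xml.etree.ElementTree as ET


def outline(element, depth=0):
    """List the labels of the tree in document order, indented by depth."""
    lines = ["  " * depth + element.tag]
    for child in element:  # children are visited in document order
        lines.extend(outline(child, depth + 1))
    return lines


book = ET.fromstring(
    "<book><title>A Semantic Web Primer</title>"
    "<author>G. Antoniou</author><author>F. van Harmelen</author></book>"
)
print("\n".join(outline(book)))

# Attribute order carries no meaning: both spellings yield the same attributes.
first = ET.fromstring('<lecturer name="David Billington" phone="3875 507"/>')
second = ET.fromstring('<lecturer phone="3875 507" name="David Billington"/>')
print(first.attrib == second.attrib)
```

Swapping the two author elements would give a different tree, whereas swapping the two attributes does not.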
XML document with an email message
<from name="Michael Maher"
address="[email protected]"/>
<to name="Grigoris Antoniou"
address="[email protected]"/>
<subject>Where is your draft?</subject>
Here is the text of my message…
Corresponding tree
Visualizing XML
• To display XML documents, formatting must be
associated with elements
• The simplest way to do this is by CSSs (Cascading
Style Sheets)
<?xml version="1.0" standalone="no"?>
<?xml-stylesheet type="text/css" href="book1.css"?>
• It separates data from formatting in a clear way.
Example usage of CSS
• File book1.css contains:
book {display: block; font-size: 25; font-weight: bold; text-align: center}
title {display: block; font-size: 25; font-weight: bold; text-align: center}
author {display: block; font-size: 15; font-weight: bold; text-align: center}
publisher {display: block; font-size: 10; text-align: center}
year {display: block; font-size: 10; text-align: center}
ISBN {display: block; font-size: 10; text-align: center}
• CSSs are very limited!
– We will see another possible way in the sequel
• Transform XML documents into HTML documents via a
transformation language (XSL)
Structuring XML Documents
• One can define the structure of the tags to
be used in XML documents. Define:
– what values an attribute may take
– which elements may or must occur within other
elements, etc.
• If such structuring information exists, the
document can be validated
– This is quite important, especially for sharing
Structuring XML Documents (cont)
• An XML document is valid if
– it is well-formed
– it respects the structuring information it uses
• There are two ways of defining the structure
of XML documents:
– DTDs (the simpler and more restricted way)
– XML Schema (offers extended possibilities)
Document Type Definition
• The structure of an XML document can be specified by a DTD
• A DTD defines the structure of the data allowed in the
XML document
– Which elements can appear
– Which attributes a tag can (or must) have
– Which sub-elements can (or must) occur inside each element, and
how many times
• DTDs are of no use for restricting types of data
– All values are represented as strings
• Syntax of a DTD in a nutshell
– <!ELEMENT element (subelements-specification) >
– <!ATTLIST element (attributes) >
Specifying elements
• A sub-element can be specified as:
– Structure of sub-elements inside it (where recursion is allowed)
– #PCDATA (parsed character data), i.e., strings
– EMPTY (no sub-element) or ANY (anything can come inside it)
• The specification of sub-elements can include a kind of regular expression
<!ELEMENT book (title, author+, publisher, ISBN?)>
• Syntax:
– “|” - alternatives
– “,” - sequence
– “+” - 1 or more occurrences
– “*” - 0 or more occurrences
– “?” - 0 or 1 occurrences
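Since a content model is essentially a regular expression over element names, it can be translated into an ordinary regex. The sketch below does this in Python for models built from names and the operators listed above; it deliberately ignores #PCDATA, EMPTY and ANY, and the function names are made up for this illustration.

```python
import re


def content_model_to_regex(model):
    """Translate a DTD content model into a regex over 'name name ... ' strings."""
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.\-]*|[()|,+*?]", model):
        if token == ",":
            continue  # sequence is plain concatenation in a regex
        if token in "()|+*?":
            parts.append(token)  # these operators mean the same in both notations
        else:
            # each element name matches itself followed by a separating space
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def children_match(model, children):
    """Check whether a sequence of child element names satisfies the model."""
    sequence = "".join(name + " " for name in children)
    return content_model_to_regex(model).fullmatch(sequence) is not None


BOOK = "(title, author+, publisher, ISBN?)"
print(children_match(BOOK, ["title", "author", "author", "publisher"]))
print(children_match(BOOK, ["title", "publisher"]))
```

The second call fails because author+ demands at least one author; adding a trailing ISBN to the first list would still be accepted because of the "?".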
Simple example
<lecturer>
<name>David Billington</name>
<phone> +61 − 7 − 3875 507 </phone>
</lecturer>
DTD for above element (and all lecturer elements):
<!ELEMENT lecturer (name,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
Attribute specification
• For each attribute, specify:
– Its name
– Its type:
• CDATA, just a string
• (v1| . . . |vn), an enumeration of all possible values
• ID (identifier) or IDREF (reference to an ID) or IDREFS
(multiple IDREFs)
– Additional restrictions
• mandatory (#REQUIRED)
• With a default value (value),
• Neither of the two above (#IMPLIED)
IDs and IDREFs
• An element can have at most one ID attribute
• The value of an ID attribute must be unique for the
whole XML document
– An attribute with type ID is an identifier of the element
• An attribute of type IDREF must contain a value
that exists in an ID attribute of another element in
the document
• An attribute of type IDREFS contains a list of (0
or more) values, each of which must exist in an ID
attribute in the document
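The ID/IDREF rules can be checked with a few lines of Python. This is only a sketch: a validating parser would read the attribute types from the DTD, whereas Python's standard xml.etree.ElementTree does not validate, so the attribute types are hard-coded here as assumptions, and the two test documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed attribute types (a validating parser would take these from the DTD).
ID_ATTRS = {"id"}
IDREF_ATTRS = {"mother", "father"}
IDREFS_ATTRS = {"children"}


def check_ids(root):
    """Return a list of violations of the ID uniqueness and IDREF existence rules."""
    errors = []
    seen = set()
    for elem in root.iter():
        for name in ID_ATTRS:
            if name in elem.attrib:
                value = elem.attrib[name]
                if value in seen:
                    errors.append(f"duplicate ID {value!r}")
                seen.add(value)
    for elem in root.iter():
        for name, value in elem.attrib.items():
            if name in IDREF_ATTRS:
                refs = [value]
            elif name in IDREFS_ATTRS:
                refs = value.split()  # IDREFS is a whitespace-separated list
            else:
                continue
            for ref in refs:
                if ref not in seen:
                    errors.append(f"{name}={ref!r} does not match any ID")
    return errors


good = ET.fromstring(
    '<family><person id="bob" mother="mary"/>'
    '<person id="mary" children="bob"/></family>'
)
bad = ET.fromstring(
    '<family><person id="bob" mother="alice"/><person id="bob"/></family>'
)
print(check_ids(good))
print(check_ids(bad))
```

IDs are collected in a first pass because a reference may point forward to an element that appears later in the document.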
An example DTD
<!ELEMENT family (person*)>
<!ELEMENT person (name)>
<!ELEMENT name (#PCDATA)>
<!ATTLIST person
id ID #REQUIRED
mother IDREF #IMPLIED
father IDREF #IMPLIED
children IDREFS #IMPLIED>
A document respecting the DTD
<?xml version="1.0" standalone="no"?>
<!DOCTYPE family SYSTEM "">
<family>
<person id="bob" mother="mary" father="peter">
<name>Bob Marley</name>
</person>
<person id="bridget" mother="mary">
<name>Bridget Jones</name>
</person>
<person id="mary" children="bob bridget">
<name>Mary Poppins</name>
</person>
<person id="peter" children="bob">
<name>Peter Marley</name>
</person>
</family>
A DTD for an Email Element
<!ELEMENT email (head,body)>
<!ELEMENT head (from,to+,cc*,subject)>
<!ATTLIST from name
A DTD for an Email Element (cont)
name CDATA
<!ELEMENT subject (#PCDATA)>
<!ELEMENT body (text,attachment*)>
<!ELEMENT attachment EMPTY>
<!ATTLIST attachment
(mime|binhex) "mime"
The email DTD explained
• A head element contains (in that order):
a from element
at least one to element
zero or more cc elements
a subject element
• In from, to, and cc elements
– the name attribute is not required
– the address attribute is always required
The email DTD explained (cont)
• A body element contains
– a text element
– possibly followed by a number of attachment
• The encoding attribute of an attachment
element must have either the value “mime”
or “binhex”
– “mime” is the default value
DTDs’ limitations
• It is not possible to specify types for data elements
– The only atomic type is string: no integer, float, etc.
• It is difficult to specify (unordered) sets of sub-elements
• The IDs and IDREFs are not typed
– If there are several elements with ID, it is not possible in the
IDREF to specify that it must be an ID of a particular type of element
• E.g., suppose you have a document with clients, products and orders, all
with IDs. It is not possible to say that the items of the order are
IDREFs only referring to IDs of products!
• The syntax of DTDs is different from that of XML
XML Schema
• Significantly richer language for defining
the structure of XML documents
• Its syntax is based on XML itself
– It is not necessary to write separate tools
• Reuse and refinement of schemas
– Expand or delete already existing schemas
• Sophisticated set of data types, compared to DTDs
XML Schema prologue
An XML schema is an element with an opening
tag like
• Structure of schema elements
– Element and attribute types using data types
Element Types
<element name="email"/>
<element name="head" minOccurs="1"
maxOccurs="1"/>
<element name="to" minOccurs="1"/>
• Cardinality constraints:
– minOccurs="x" (default value 1)
– maxOccurs="x" (default value 1)
• This generalizes the *,?,+ of DTDs
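How minOccurs and maxOccurs generalize the DTD operators can be shown with a small table. This is an illustrative Python sketch; the names DTD_TO_OCCURS and occurs_ok are made up, and math.inf stands in for the schema value "unbounded".

```python
import math

# Each DTD occurrence operator as a (minOccurs, maxOccurs) pair.
# The empty string means "no operator", i.e. exactly once.
DTD_TO_OCCURS = {
    "": (1, 1),
    "?": (0, 1),
    "*": (0, math.inf),
    "+": (1, math.inf),
}


def occurs_ok(count, min_occurs=1, max_occurs=1):
    """Check a number of occurrences against the cardinality constraints."""
    return min_occurs <= count <= max_occurs


print(occurs_ok(0, *DTD_TO_OCCURS["*"]))
print(occurs_ok(0, *DTD_TO_OCCURS["+"]))
# A bound that no DTD operator can express directly: between 1 and 2 times.
print(occurs_ok(2, 1, 2), occurs_ok(3, 1, 2))
```

The last line is the point of the generalization: arbitrary lower and upper bounds are available, not just the four combinations a DTD offers.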
Attribute Types
<attribute name="id" type="ID"
use="required"/>
<attribute name="speaks" type="Language"
use="default" value="en"/>
• Existence: use="x", where x may be optional or required
• Default value: use="x" value="...", where x
may be default or fixed
Data Types
• There is a variety of built-in data types
– Numerical data types: integer, Short etc.
– String types: string, ID, IDREF, CDATA etc.
– Date and time data types: time, Month etc.
• There are also user-defined data types
– simple data types, which cannot use elements
or attributes
– complex data types, which can use these
Data Types (cont)
• Complex data types are defined from
already existing data types by defining
some attributes (if any) and using:
– sequence, a sequence of existing data type
elements (order is important)
– all, a collection of elements that must appear
(order is not important)
– choice, a collection of elements, of which one
will be chosen
A Data Type Example
<complexType name="lecturerType">
<sequence>
<element name="firstname" type="string"
minOccurs="0" maxOccurs="unbounded"/>
<element name="lastname" type="string"/>
</sequence>
<attribute name="title" type="string" use="optional"/>
</complexType>
Data Type Extension Example
Already existing data types can be extended by new
elements or attributes. Example:
<complexType name="extendedLecturerType">
  <extension base="lecturerType">
    <sequence>
      <element name="email" type="string"
          minOccurs="0" maxOccurs="1"/>
    </sequence>
    <attribute name="rank" type="string" use="required"/>
  </extension>
</complexType>
Resulting Data Type
<complexType name="extendedLecturerType">
  <sequence>
    <element name="firstname" type="string"
        minOccurs="0" maxOccurs="unbounded"/>
    <element name="lastname" type="string"/>
    <element name="email" type="string"
        minOccurs="0" maxOccurs="1"/>
  </sequence>
  <attribute name="title" type="string" use="optional"/>
  <attribute name="rank" type="string" use="required"/>
</complexType>
Data Type Restriction
An existing data type may be restricted by adding
constraints on certain values
Restriction is not the opposite of extension
Restriction is not achieved by deleting elements or attributes
The following hierarchical relationship holds:
Instances of the restricted type are also instances of the
original type
They satisfy at least the constraints of the original type
Example of Data Type Restriction
<complexType name="restrictedLecturerType">
  <restriction base="lecturerType">
    <sequence>
      <element name="firstname" type="string"
          minOccurs="1" maxOccurs="2"/>
    </sequence>
    <attribute name="title" type="string" use="required"/>
  </restriction>
</complexType>
Restriction of Simple Data Types
<simpleType name="dayOfMonth">
  <restriction base="integer">
    <minInclusive value="1"/>
    <maxInclusive value="31"/>
  </restriction>
</simpleType>
Data Type Restriction: Enumeration
<simpleType name="dayOfWeek">
  <restriction base="string">
    <enumeration value="Mon"/>
    <enumeration value="Tue"/>
    <enumeration value="Wed"/>
    <enumeration value="Thu"/>
    <enumeration value="Fri"/>
    <enumeration value="Sat"/>
    <enumeration value="Sun"/>
  </restriction>
</simpleType>
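The two kinds of simple-type restriction (a value range and an enumeration) can be imitated in a few lines of Python. This is only a sketch of what a validator would check, not how an XML Schema processor works internally; the function names are invented for illustration.

```python
# Hypothetical helpers that mirror the dayOfMonth and dayOfWeek types.
DAYS_OF_WEEK = {"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"}


def is_day_of_month(text):
    """Restriction of integer: minInclusive 1, maxInclusive 31."""
    try:
        value = int(text)
    except ValueError:
        return False  # not even an instance of the base type
    return 1 <= value <= 31


def is_day_of_week(text):
    """Restriction of string by enumeration."""
    return text in DAYS_OF_WEEK
```

Every value accepted by a restricted type is also accepted by its base type, which is the hierarchical relationship that restriction is meant to guarantee.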
Email Example in XML Schema
<element name="email" type="emailType"/>
<complexType name="emailType">
  <sequence>
    <element name="head" type="headType"/>
    <element name="body" type="bodyType"/>
  </sequence>
</complexType>
Email Example in XML Schema
<complexType name="headType">
  <sequence>
    <element name="from" type="nameAddress"/>
    <element name="to" type="nameAddress"
        minOccurs="1" maxOccurs="unbounded"/>
    <element name="cc" type="nameAddress"
        minOccurs="0" maxOccurs="unbounded"/>
    <element name="subject" type="string"/>
  </sequence>
</complexType>
Email Example in XML Schema
<complexType name="nameAddress">
  <attribute name="name" type="string" use="optional"/>
  <attribute name="address"
      type="string" use="required"/>
</complexType>
• Similar (and simpler) for bodyType
An XML document may use more than one DTD
or schema
Since each structuring document was developed
independently, names of tags may clash
The solution is to use a different prefix for each
DTD or schema
Namespace Declarations
Namespaces are declared within an element and
can be used in that element and any of its children
(elements and attributes)
A namespace declaration has the form:
xmlns:prefix="location"
location is the address of the DTD or schema
If a prefix is not specified: xmlns="location" then
this location is used by default
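A small Python sketch shows how a parser sees prefixed and default namespaces. The two namespace URIs and the element names are placeholders invented for this sketch; they are not the ones used in the course.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs, assumed only for this example.
DOC = """
<staff xmlns="http://example.org/default"
       xmlns:gu="http://example.org/gu">
  <faculty name="John Smith"/>
  <gu:academicStaff gu:name="Mate Jones"/>
</staff>
"""

root = ET.fromstring(DOC)

# ElementTree expands every prefix to {namespace-uri}local-name.
tags = [child.tag for child in root]

# A prefixed attribute lives in the namespace of its prefix ...
gu_name = root[1].get("{http://example.org/gu}name")
# ... while an unprefixed attribute has no namespace at all.
plain_name = root[0].get("name")
```

The prefix itself is only a local abbreviation; what identifies a name is the pair of namespace location and local name.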
Namespaces Example
<faculty title="assistant professor"
name="John Smith"
department="Computer Science"/>
gu:name="Mate Jones"
gu:school="Information Technology"/>
Querying XML Documents
• In relational databases, parts of a database
can be selected and retrieved using SQL
– The same is necessary for XML documents
– Query languages: XQuery, XQL, Xcerpt, …
• The central concept of most XML query
languages is a path expression
– It specifies how a node (or a set of nodes) in the
tree representation of the XML document can
be reached (positional language)
• XPath is at the core of most XML query languages
• It is a language for addressing parts of an
XML document.
– It operates on the tree data model of XML
– It has a non-XML syntax
• In its simplest forms, XPath is quite similar
to addressing files in a (tree-based) folder structure
Types of Path Expressions
• Absolute (starting at the root of the tree)
– Syntactically they begin with the symbol /
– It refers to the root of the document (situated
one level above the root element of the document)
• Relative to a context node
• (Note that this is like addressing files in a folder)
An XML Example
<library location="Bremen">
  <author name="Henry Wise">
    <book title="Artificial Intelligence"/>
    <book title="Modern Web Services"/>
    <book title="Theory of Computation"/>
  </author>
  <author name="William Smart">
    <book title="Artificial Intelligence"/>
  </author>
  <author name="Cynthia Singleton">
    <book title="The Semantic Web"/>
    <book title="Browser Technology Revised"/>
  </author>
</library>
• A path expression is a sequence of steps separated by “/”
• The result of a path expression corresponds to the set of all
elements (and possibly attributes) whose path from the root
is exactly the one given:
– E.g. /library/author returns all author elements in the file
• The initial “/” denotes the root of the document
• “//” denotes any path between the root and the step that follows
– E.g. //book returns all book elements, independently of their path
from the root
• One can use “..” as in folders
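The basic path steps can be tried out with Python's standard library. Note that xml.etree.ElementTree supports only a subset of XPath and evaluates paths relative to an element, so "./author" evaluated at the library element plays the role of /library/author, and ".//book" the role of //book.

```python
import xml.etree.ElementTree as ET

LIBRARY = """
<library location="Bremen">
  <author name="Henry Wise">
    <book title="Artificial Intelligence"/>
    <book title="Modern Web Services"/>
    <book title="Theory of Computation"/>
  </author>
  <author name="William Smart">
    <book title="Artificial Intelligence"/>
  </author>
  <author name="Cynthia Singleton">
    <book title="The Semantic Web"/>
    <book title="Browser Technology Revised"/>
  </author>
</library>
"""

root = ET.fromstring(LIBRARY)          # root is the <library> element

authors = root.findall("./author")     # plays the role of /library/author
books = root.findall(".//book")        # plays the role of //book
author_names = [a.get("name") for a in authors]
```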
XPath (cont)
• Path expressions are evaluated left to right
– Each step is applied to the set of instances produced by the previous one
• One can access attributes using @
– E.g. //book/@title
• Returns all titles of books
• One can filter with conditions (between []) in any step of the path
– E.g. //author[@name="Henry Wise"]/book
• Returns all books of author(s) with that name
• A number inside the [] selects only the element at that position
– E.g. //author[1]/book
• Returns all books of the first author
• A name of an element (or attribute) inside the [] returns only
those elements which contain that element (or attribute)
– E.g. //author[book]/@name
• Returns the names of authors who have at least one book
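The three kinds of filter (attribute test, position, and existence of a child) are all within the XPath subset that Python's xml.etree.ElementTree accepts. Attribute selection with @ as the last step is not part of that subset, so the sketch reads attributes with get() instead.

```python
import xml.etree.ElementTree as ET

LIBRARY = """
<library location="Bremen">
  <author name="Henry Wise">
    <book title="Artificial Intelligence"/>
    <book title="Modern Web Services"/>
    <book title="Theory of Computation"/>
  </author>
  <author name="William Smart">
    <book title="Artificial Intelligence"/>
  </author>
  <author name="Cynthia Singleton">
    <book title="The Semantic Web"/>
    <book title="Browser Technology Revised"/>
  </author>
</library>
"""
root = ET.fromstring(LIBRARY)

# //author[@name="Henry Wise"]/book
by_henry = [b.get("title")
            for b in root.findall(".//author[@name='Henry Wise']/book")]

# //author[1]/book
first_author_books = [b.get("title")
                      for b in root.findall(".//author[1]/book")]

# //author[book]/@name
names_with_books = [a.get("name") for a in root.findall(".//author[book]")]

# //book/@title
all_titles = [b.get("title") for b in root.findall(".//book")]
```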
Functions in XPath
• XPath has some functions defined:
– E.g. function count() returns the number of elements, instead of the
elements themselves
• E.g. //author[count(book) > 2]
– Returns the authors with more than two books
– There are also functions to test the position of a node, sum values,
operate over strings, etc. Examples:
• sum(), contains(st1,st2), concat(st1,st2,...), position(), last(),
round(num), ...
• In conditions one may use the logical connectives and, or and not
• IDREFs can be dereferenced using the function id()
– E.g. //person/id(@mother)
• Returns all persons referenced by mother (i.e. all persons that
are mothers)
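Python's xml.etree.ElementTree does not implement XPath functions such as count(), so the sketch below expresses the same filter in plain Python. It is an equivalent formulation of the query, not an XPath evaluation; the shortened library document is assumed only for this example.

```python
import xml.etree.ElementTree as ET

LIBRARY = """
<library>
  <author name="Henry Wise">
    <book title="Artificial Intelligence"/>
    <book title="Modern Web Services"/>
    <book title="Theory of Computation"/>
  </author>
  <author name="William Smart">
    <book title="Artificial Intelligence"/>
  </author>
</library>
"""
root = ET.fromstring(LIBRARY)

# Same intent as //author[count(book) > 2]
prolific = [a.get("name")
            for a in root.iter("author")
            if len(a.findall("book")) > 2]

# Same intent as count(//book)
total_books = len(root.findall(".//book"))
```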
Recalling the family DTD
<!ELEMENT family (person*)>
<!ELEMENT person (name)>
<!ELEMENT name (#PCDATA)>
<!ATTLIST person id       ID     #REQUIRED
                 mother   IDREF  #IMPLIED
                 father   IDREF  #IMPLIED
                 children IDREFS #IMPLIED>
A document respecting the DTD
<?xml version="1.0" standalone="no"?>
<!DOCTYPE family SYSTEM "">
<family>
  <person id="bob" mother="mary" father="peter">
    <name>Bob Marley</name>
  </person>
  <person id="bridget" mother="mary">
    <name>Bridget Jones</name>
  </person>
  <person id="mary" children="bob bridget">
    <name>Mary Poppins</name>
  </person>
  <person id="peter" children="bob">
    <name>Peter Marley</name>
  </person>
</family>
More features of XPath
• Operator “|” for unions
– E.g. //person/id(@mother) | //person/id(@father)
• Returns all persons that are either mothers or fathers
• Remark: “|” cannot be nested inside other operators
• There is a function returning the relative position of a
node. E.g.
– [position()=last()] selects the last node
– [position() mod 2 =0] selects the even nodes
• There is much more in XPath…
– This was just an introduction
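Dereferencing IDREFs and taking unions can be imitated in Python. The standard library has no id() function, so the sketch builds its own index from id values to elements; the helper name deref is invented for illustration.

```python
import xml.etree.ElementTree as ET

FAMILY = """
<family>
  <person id="bob" mother="mary" father="peter"><name>Bob Marley</name></person>
  <person id="bridget" mother="mary"><name>Bridget Jones</name></person>
  <person id="mary" children="bob bridget"><name>Mary Poppins</name></person>
  <person id="peter" children="bob"><name>Peter Marley</name></person>
</family>
"""
root = ET.fromstring(FAMILY)

# Index that plays the role of the ID lookup behind id().
by_id = {p.get("id"): p for p in root.iter("person")}


def deref(attribute):
    """Imitate //person/id(@attribute): follow each IDREF to its target."""
    ids = {p.get(attribute) for p in root.iter("person") if p.get(attribute)}
    return {by_id[i].findtext("name") for i in ids}


mothers = deref("mother")
parents = deref("mother") | deref("father")    # the union operator "|"

# [position()=last()] is supported directly by ElementTree ...
last_person = root.findall("./person[last()]")[0].get("id")
# ... while [position() mod 2 = 0] is written here as a slice.
even_people = [p.get("id") for p in list(root)[1::2]]
```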
• Stylesheets are designed to store formatting options of a document
separately from the document itself (just like in CSS)
• XSL (extensible stylesheet language) was conceived for generating
HTML from XML.
• It contains:
– a transformation language (XSLT),
– and a formatting language
• XSLT is a general language for transforming XML documents
– It can transform XML into XML in another DTD, or XML into HTML
– It can be used independently of formatting concerns
• Transformations are defined based on templates
– Templates combine selection using XPath, with the construction of the
transformation results
XML example document
<?xml version="1.0" standalone="no"?>
<bank>
<client> <name>Luís</name>
<street>5 de Outubro</street> <town>Lisboa</town> </client>
<client> <name>Maria</name>
<street>1 de Maio</street> <town>Caparica</town> </client>
<account> <num>A-102</num> <branch>Caparica</branch> <balance>400</balance> </account>
<account> <num>A-101</num> <branch>Lisboa</branch> <balance>100</balance> </account>
<depositer> <name>Luís</name> <num>A-102</num> </depositer>
<depositer> <name>Maria</name> <num>A-101</num> </depositer>
</bank>
DTD of the example
<!DOCTYPE bank [
<!ELEMENT bank ( ( account | client | depositer)+)>
<!ELEMENT account (num, branch, balance)>
<!ELEMENT client (name, street, town)>
<!ELEMENT depositer (name,num)>
<!ELEMENT num (#PCDATA)>
<!ELEMENT branch (#PCDATA)>
<!ELEMENT balance (#PCDATA)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT town (#PCDATA)>
]>
XSLT Templates
• Example of an XSLT template with match and select
<xsl:template match="/bank/client">
  <xsl:value-of select="name"/>
</xsl:template>
<xsl:template match="*"/>
• The match attribute of xsl:template specifies an XPath pattern
• Elements of the XML document that match the pattern are processed according to
what is specified inside the xsl:template
– xsl:value-of selects specific values for output (in this case, the value of name)
• For elements that match no pattern:
– Their text content is output without being processed
– The templates are recursively applied to sub-elements
• The <xsl:template match="*"/> is used in this example to guarantee that all
other elements produce no output
• If an element is matched by more than one template, just one is used.
The choice is made by a complex preference criterion
Example of XML output
• The XSL file:
<xsl:template match="/bank/client">
  <myclient>
    <xsl:value-of select="name"/>
  </myclient>
</xsl:template>
<xsl:template match="*"/>
• Output produced:
<myclient> Luís </myclient>
<myclient> Maria </myclient>
Creating attributes in XSLT
• XSLT does not allow an xsl:value-of tag inside
another tag
– E.g. one cannot create an attribute for <myclient> using
xsl:value-of directly
– For this one can use xsl:attribute
• xsl:attribute adds attributes to an element. Example:
<xsl:template match="/bank/client">
  <myclient>
    <xsl:attribute name="id">
      <xsl:value-of select="name"/>
    </xsl:attribute>
    <xsl:value-of select="street"/>
  </myclient>
</xsl:template>
<xsl:template match="*"/>
• Output:
<myclient id="Luís">5 de Outubro</myclient>
<myclient id="Maria">1 de Maio</myclient>
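Python's standard library has no XSLT processor, so the following sketch only imitates by hand what the template for /bank/client does: select the clients, create a myclient element, set its id attribute from name and its content from street. The function name transform is invented for illustration.

```python
import xml.etree.ElementTree as ET

BANK = """
<bank>
  <client><name>Luís</name><street>5 de Outubro</street><town>Lisboa</town></client>
  <client><name>Maria</name><street>1 de Maio</street><town>Caparica</town></client>
</bank>
"""
root = ET.fromstring(BANK)


def transform(bank):
    """Hand-written imitation of the template with match="/bank/client"."""
    output = []
    for client in bank.findall("./client"):
        my_client = ET.Element("myclient")
        my_client.set("id", client.findtext("name"))   # xsl:attribute name="id"
        my_client.text = client.findtext("street")     # xsl:value-of select="street"
        output.append(ET.tostring(my_client, encoding="unicode"))
    return output


result = transform(root)
```

A real XSLT processor chooses templates by pattern matching and priority; the loop above hard-codes that choice for this one pattern.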
Structural Recursion
The action in a template may simply be that of recursively applying the
templates. E.g.
<xsl:template match="/bank">
  <xsl:apply-templates/>
</xsl:template>
<xsl:template match="/bank/client">
  <client> <xsl:value-of select="name"/> </client>
</xsl:template>
<xsl:template match="*"/>
Output:
<client> Luís </client>
<client> Maria </client>
Sorting in XSLT
• An xsl:sort inside a template sorts the elements selected
by the template’s pattern
– Sorting is done before other templates are applied
<xsl:template match="/bank">
  <xsl:apply-templates select="client">
    <xsl:sort select="name"/>
  </xsl:apply-templates>
</xsl:template>
<xsl:template match="client">
  <xsl:value-of select="name"/>
  <xsl:value-of select="street"/>
  <xsl:value-of select="town"/>
</xsl:template>
<xsl:template match="*"/>
Usage of XSLT
Here we only show a small subset of XSLT
There is much more to it!
XSLT can be used to move data and metadata
from one XML representation to another
It is chosen when applications that use different
DTDs or schemas need to communicate
XSLT can be used for machine processing of
content without any regard to displaying the
information for people to read.
But it can also be used to display data
XSLT generating HTML
<xsl:stylesheet xmlns:xsl="">
<xsl:template match = "/">
<xsl:template match = "/bank">
<p><b> Accounts: </b></p>
<xsl:template match = "/bank/account">
<p> The account <b> <xsl:value-of select="num" /> </b>
from the branch <xsl:value-of select="branch" />
has balance <i><xsl:value-of select="balance" /></i>.
Let’s see the result
• Higher level language to query XML documents
• Uses a syntax for … let … where … return …
– for ⇔ SQL from
– where ⇔ SQL where
– return ⇔ SQL select
– let has no equivalent in SQL (for temporary variables)
• The for part has XPath expressions and variables that are
assigned the results returned by the XPath expressions
• The where part imposes conditions upon the variables
• The return part specifies what is to be shown in the output
for each variable
XQuery Example
• Return the number of all accounts with a balance over
400, in elements with the tag <account400>
for $x in /bank/account
let $num := $x/num
where $x/balance > 400
return <account400> {$num} </account400>
• In this case neither the let nor the where part is necessary:
for $x in /bank/account[balance>400]
return <account400> {$x/num} </account400>
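The for / where / return pattern corresponds closely to a list comprehension. The sketch below is a Python analogue over a two-account bank document, not an XQuery evaluation; the function name accounts_over and its limit parameter are invented for illustration.

```python
import xml.etree.ElementTree as ET

BANK = """
<bank>
  <account><num>A-102</num><branch>Caparica</branch><balance>400</balance></account>
  <account><num>A-101</num><branch>Lisboa</branch><balance>100</balance></account>
</bank>
"""
root = ET.fromstring(BANK)


def accounts_over(bank, limit):
    """Numbers of the accounts whose balance is strictly above limit."""
    return [
        x.findtext("num")                          # return ... $x/num
        for x in bank.findall("./account")         # for $x in /bank/account
        if float(x.findtext("balance")) > limit    # where $x/balance > limit
    ]
```

With this data no balance is strictly over 400, so accounts_over(root, 400) is empty, while a lower limit such as 100 selects account A-102.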
A more complex example, with joins
• One can specify joins, similarly to SQL
for $a in /bank/account,
    $c in /bank/client,
    $d in /bank/depositer
where $a/num = $d/num
  and $c/name = $d/name
return <client-account> {$c} {$a} </client-account>
• Like XPath and XSLT, XQuery has many
more features, not shown here…
• Relevant information can be found from:
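A join over three variables can be written in Python as a comprehension with three nested loops and two equality conditions. This is an analogue of the query over a small bank document assumed for the example, not an XQuery evaluation.

```python
import xml.etree.ElementTree as ET

BANK = """
<bank>
  <client><name>Luís</name><street>5 de Outubro</street><town>Lisboa</town></client>
  <client><name>Maria</name><street>1 de Maio</street><town>Caparica</town></client>
  <account><num>A-102</num><branch>Caparica</branch><balance>400</balance></account>
  <account><num>A-101</num><branch>Lisboa</branch><balance>100</balance></account>
  <depositer><name>Luís</name><num>A-102</num></depositer>
  <depositer><name>Maria</name><num>A-101</num></depositer>
</bank>
"""
root = ET.fromstring(BANK)

client_accounts = [
    (c.findtext("name"), a.findtext("num"))
    for a in root.findall("./account")        # $a in /bank/account
    for c in root.findall("./client")         # $c in /bank/client
    for d in root.findall("./depositer")      # $d in /bank/depositer
    if a.findtext("num") == d.findtext("num")
    and c.findtext("name") == d.findtext("name")
]
```

The depositer elements act like a join table in a relational schema: they connect a client name to an account number.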
Conclusions on XML
• XML is a metalanguage that allows users to define markup
• XML separates content, structure and formatting
• XML is the “de facto” standard for the
representation and exchange of structured
information on the Web (and more!)
• XML is supported by query languages
• XML is supported by almost all web and database
Points for discussion in the remainder
• The nesting of tags does not have standard meaning
• The semantics of XML documents is not accessible to
machines, only to people
• Collaboration and exchange are supported only if there is
underlying shared understanding of the vocabulary
• XML is well-suited for close collaboration, where domain- or community-based vocabularies are used
– It is not so well-suited for global communication.
Part 3: Web Resources in RDF
Problems of XML for meaning of
• XML provides a uniform framework for
interchange of data and metadata between applications
• But it does not provide any means of talking about
the semantics (meaning) of data
• E.g.:
– there is no intended meaning associated with the
nesting of tags
– There is no (processable) intended meaning of each tag
• It is up to each application to interpret the nesting.
Nesting of Tags in XML
• Consider
<course name="Discrete Maths">
  <lecturer>David Billington</lecturer>
</course>
<lecturer name="David Billington">
  <teaches>Discrete Maths</teaches>
</lecturer>
• The nestings are opposite, but the meaning is the same.
• There should be a way of attributing meaning without
committing to a particular nesting
Basic Ideas of RDF
• Represent the meaning independently of the syntax
• Basic building block: object-attribute-value triple
– It is called a statement
– In the example, object is David, attribute is lectures, value is Discrete Maths
• Fundamental concepts in RDF are:
– Resources (like the object and value above)
– Properties (like the attribute above)
– Statements (the triple above)
• RDF has been given a syntax in XML
– This syntax inherits the benefits of XML
– Other syntactic representations of RDF are possible
– RDF should not be confused with its XML syntactical representation
• A resource is just any “thing”, an object we want
to refer to
– E.g. an author, a book, a place, a person, a hotel, etc.
• In RDF, every resource has a URI (Uniform
Resource Identifier)
• A URI can be
– a URL (Web address) or
– some other kind of unique identifier
• Properties describe relations between resources
– E.g. “written by”, “has age”, “has title”, etc.
• Each property is itself also a resource
– So properties are also identified by URIs
• Advantages of using URIs as identifiers:
– A global, worldwide, unique naming scheme
– Reduces the homonym problem of distributed data
– (Basically, with URIs everything is guaranteed to be uniquely
identified, by a key)
• Statements assert the properties of resources
– Relate resources via properties
• In RDF, a statement is an object-attribute-value triple
– It consists of a resource, a property, and a value
• They can be seen as binary predicates:
• Values can be resources or literals
– Literals are just atomic values (strings), that don’t need
to have a URI.
Representation of Statements
• A statement can be viewed as:
– A triple (object,Property,value)
– An arc connecting two nodes in a graph
– A piece of XML code, representing the triple
• Accordingly, an RDF document can be viewed as:
– A set of triples
– A graph (semantic net)
– An XML document with the triples represented
according to a given predefined syntax
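The triple view and the graph view can be held side by side in a small Python sketch. The example.org URIs and the property names are placeholders invented for illustration.

```python
from collections import defaultdict

# An RDF document seen as a set of (resource, property, value) triples.
# All URIs below are placeholders assumed for this sketch.
STAFF = "http://example.org/staff/949318"
COURSE = "http://example.org/course/CIT1111"

triples = {
    (STAFF, "name", "David Billington"),      # value is a literal
    (COURSE, "courseName", "Discrete Maths"),  # value is a literal
    (COURSE, "isTaughtBy", STAFF),             # value is another resource
}

# The same document seen as a directed graph with labelled arcs:
# each triple is an arc from the resource to the value.
graph = defaultdict(list)
for resource, prop, value in triples:
    graph[resource].append((prop, value))


def values(resource, prop):
    """All values reachable from resource through arcs labelled prop."""
    return sorted(v for p, v in graph.get(resource, []) if p == prop)
```

Because the third triple has a resource as its value, the graph can be followed from the course to the lecturer and on to the lecturer's name.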
Representing triples in a graph
[Figure: an arc labelled site-owner, from a Web page to the value David Billington]
• This piece of the graph is representing the triple:
– (,site-owner,David Billington)
– Or the predicate
site-owner(,David Billington)
• An RDF document can be seen as a directed graph with
labeled nodes and arcs
– from the resource (the subject of the statement)
– to the value (the object of the statement)
• Known in AI as a semantic net
An example RDF graph
[Figure: example RDF graph; recoverable node labels: David Billington, Archie Rock, 3456 398]
• In RDF it is possible to make statements about
statements. E.g.
– Grigoris believes that David Billington is the
creator of
• Such statements can be used to describe belief or
trust in other statements
• They amount to considering statements
themselves as resources that can then be
• For that one needs to assign a unique identifier to
each statement
Reifying Statements
• Introduce an auxiliary object (e.g. belief1)
• Relate it to each of the 3 parts of the original
statement through the properties subject,
predicate and object
• In the preceding example
– subject of belief1 is David Billington
– predicate of belief1 is creator
– object of belief1 is
• Now one can say, with a triple, that Grigoris believes belief1
RDF Data Types
• RDF allows to assign types to resources,
and literals
• E.g. for saying that the age is an integer:
(“David Billington”,,
Data Types Usage
• The ^^-notation denotes the type of a literal
• In practice, the used data typing scheme is the one
of XML Schema
– But the use of any externally defined data typing
scheme is allowed in RDF documents
• XML Schema predefines a large range of data types
– E.g. Booleans, integers, floating-point numbers, times,
dates, etc.
Limitations of RDF
• RDF uses only binary properties
– This is a restriction because often we use predicates with
more than 2 arguments
– Binary predicates can simulate these, but the
representation is not that natural
• Example: referee(X,Y,Z)
– X is the referee in a chess game between players Y and Z
– Can be represented by creating a new resource
chessGame, and having binary predicates
• referee(chessGame,X), player1(chessGame,Y), player2(chessGame,Z)
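The encoding of a ternary predicate by binary ones can be written down directly. In this sketch a statement is a Python tuple (resource, property, value); the function name to_binary is invented for illustration.

```python
def to_binary(game, referee, player1, player2):
    """Split referee(X,Y,Z) into three binary statements about one game."""
    return {
        (game, "referee", referee),
        (game, "player1", player1),
        (game, "player2", player2),
    }


triples = to_binary("chessGame", "X", "Y", "Z")
```

The auxiliary resource chessGame is what ties the three statements together; without it there would be no way to tell which referee belongs to which pair of players.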
Limitations of RDF (cont)
• Properties are viewed as special kinds of resources
– Properties can be used as the object in an
object-attribute-value triple (statement)
– They are defined independent of resources
• This possibility offers flexibility
– But it is unusual for modelling languages and
OO programming languages, and it can be
Critical view of Reification
• The mechanism is quite powerful
– But it appears misplaced in a simple language like RDF
• Making statements about statements introduces a
level of complexity that is usually not necessary
for a basic layer of the Semantic Web
• It may make sense in higher layers, providing
richer representation capabilities
– It is confusing, and complex, to have it in the basic
layer of RDF
XML Syntax Representation
• Graphs are easy to understand (and visualize) by humans
• But to be machine-accessible and processable,
another representation is in order
• For that, there is also the possibility of
representing RDF documents with XML
– But XML is not a part of the RDF data model
– It is just an(other) syntactical representation
– E.g. serialization of XML is irrelevant for RDF
• An RDF document is represented by an XML element with
the tag rdf:RDF
– The content of this element is a number of descriptions
• Each description, with tag rdf:Description, denotes a set
of statements, all about a same resource
• An XML element inside a description denotes a statement
– The tag of the XML element denotes the attribute, and the value
inside the element denotes the value of the statement
• The object resource in an rdf:Description may be one of
the following:
– an about attribute, referencing an existing resource
– an ID attribute, creating a new resource
– without a name, creating an anonymous resource
Example of XML representation of
David Billington
Namespaces for RDF
• In XML representation of RDF, namespace
mechanism is used
– For disambiguation
– Namespaces are expected to be RDF documents
defining resources that can be reused
– Large, distributed collections of knowledge
Example of University Courses
<rdf:Description rdf:about="949318">
  <uni:name>David Billington</uni:name>
  <uni:title>Associate Professor</uni:title>
  <uni:age rdf:datatype="&xsd;integer">27</uni:age>
</rdf:Description>
• Note here that an rdf:Description may define several statements
• Note the attribute rdf:datatype, for indicating the type of
the literal
RDF University Courses (cont)
<rdf:Description rdf:about="CIT1111">
  <uni:courseName>Discrete Maths</uni:courseName>
  <uni:isTaughtBy>David Billington</uni:isTaughtBy>
</rdf:Description>
<rdf:Description rdf:about="CIT2112">
  <uni:courseName>Programming III</uni:courseName>
  <uni:isTaughtBy>Michael Maher</uni:isTaughtBy>
</rdf:Description>
rdf:about versus rdf:ID
• An element rdf:Description has
– an rdf:about attribute indicating that the resource has
been “defined” elsewhere
– An rdf:ID attribute indicating that the resource is being defined here
• Formally, there is no such thing as “defining” an
object in one place and referring to it elsewhere
– But, for human readability, it is convenient to have a
defining location, while other locations state
“additional” properties
The rdf:resource Attribute
• In the example above, there is no formal relation
between the lecturer and the courses taught
– It only exists (for human reading) via the name of the lecturer
– The use of the same name may just be a coincidence for
a machine
• In RDF, we can state that two entities are the same
using the rdf:resource attribute
• The rdf:resource is used when the value of the
statement is another resource, rather than a literal
(as it was in the examples up to now)
rdf:resource Example
<rdf:Description rdf:about="CIT1111">
  <uni:isTaughtBy rdf:resource="949318"/>
</rdf:Description>
<rdf:Description rdf:about="949318">
  <uni:name>David Billington</uni:name>
  <uni:title>Associate Professor</uni:title>
</rdf:Description>
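To see that the XML is just one way of writing down triples, the sketch below reads descriptions of this shape with Python's standard library and lists the statements they denote. It is a deliberately minimal reader, not a conforming RDF/XML parser: it ignores rdf:ID, nesting, typed nodes, datatypes and URI resolution. The uni namespace URI is a placeholder invented for this sketch.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
UNI_NS = "http://example.org/uni-ns#"   # placeholder, assumed for this sketch

DOC = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:uni="{UNI_NS}">
  <rdf:Description rdf:about="CIT1111">
    <uni:isTaughtBy rdf:resource="949318"/>
  </rdf:Description>
  <rdf:Description rdf:about="949318">
    <uni:name>David Billington</uni:name>
    <uni:title>Associate Professor</uni:title>
  </rdf:Description>
</rdf:RDF>
"""


def to_triples(xml_text):
    """List (resource, property, value) for each property element."""
    triples = []
    root = ET.fromstring(xml_text)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            name = prop.tag.split("}", 1)[1]          # drop the namespace part
            target = prop.get(f"{{{RDF_NS}}}resource")
            # rdf:resource means the value is a resource, otherwise a literal.
            value = target if target is not None else prop.text
            triples.append((subject, name, value))
    return triples


triples = to_triples(DOC)
```

The value of isTaughtBy is now the identifier 949318 itself, so a program can follow it to the statements about that lecturer instead of comparing names.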
Externally Defined Resources
• One can refer to externally defined resources just by having
their URI in the rdf:about
• Eg. To refer to externally defined course CIT1111 use
as the value of rdf:about
• is the URI where the definition
of CIT1111 is to be found
• A description with an ID defines a fragment URI, which can
be used to reference, from outside, the defined description
Nested Descriptions
• To abbreviate the XML representation,
descriptions may be defined within other
descriptions, where they are used
• Although a description may be defined
within another description, its scope is global
– It is just a simplified notation
Nested Descriptions: Example
<rdf:Description rdf:about="CIT1111">
  <uni:courseName>Discrete Maths</uni:courseName>
  <uni:isTaughtBy>
    <rdf:Description rdf:ID="949318">
      <uni:name>David Billington</uni:name>
      <uni:title>Associate Professor</uni:title>
    </rdf:Description>
  </uni:isTaughtBy>
</rdf:Description>
• Despite being defined in a nested way, other courses, such as CIT3112,
can still refer to the new resource with ID 949318
Defining types of resources
<rdf:Description rdf:ID="CIT1111">
  <rdf:type rdf:resource=""/>
  <uni:courseName>Discrete Maths</uni:courseName>
  <uni:isTaughtBy rdf:resource="#949318"/>
</rdf:Description>
<rdf:Description rdf:ID="949318">
  <rdf:type rdf:resource=""/>
  <uni:name>David Billington</uni:name>
  <uni:title>Associate Professor</uni:title>
</rdf:Description>
Abbreviated Syntax
Simplification rules:
1. Childless property elements within description
elements may be directly replaced by XML attributes
2. For description elements with a typing element we can
use the name specified in the rdf:type element instead
of rdf:Description
These rules create syntactic variations of the same
RDF statement
They are equivalent according to the RDF data model,
although they have different XML syntax
Abbreviated Syntax Example
<rdf:Description rdf:ID="CIT1111">
  <rdf:type rdf:resource=""/>
  <uni:courseName>Discrete Maths</uni:courseName>
  <uni:isTaughtBy rdf:resource="#949318"/>
</rdf:Description>
Applying rule 1:
<rdf:Description rdf:ID="CIT1111"
    uni:courseName="Discrete Maths">
  <rdf:type rdf:resource=""/>
  <uni:isTaughtBy rdf:resource="#949318"/>
</rdf:Description>
Abbreviated Syntax Example
<rdf:Description rdf:ID="CIT1111"
    uni:courseName="Discrete Maths">
  <rdf:type rdf:resource=""/>
  <uni:isTaughtBy rdf:resource="#949318"/>
</rdf:Description>
Applying rule 2:
<uni:course rdf:ID="CIT1111"
    uni:courseName="Discrete Maths">
  <uni:isTaughtBy rdf:resource="#949318"/>
</uni:course>
Container Elements in RDF
• Container elements collect a number of
resources or attributes about which we want
to make statements as a whole
• E.g., we may wish to talk about the courses
given by a particular lecturer
• The content of container elements are
named rdf:_1, rdf:_2, etc, or rdf:li for
Types of Container Elements
• rdf:Bag an unordered container, allowing multiple occurrences
– E.g. members of the faculty board, documents in a folder
• rdf:Seq an ordered container, which may contain
multiple (repeated) occurrences
– E.g. modules of a course, items on an agenda, an
alphabetized list of staff members (order is imposed)
• rdf:Alt a set of alternatives
– E.g. the document home and mirrors, translations of a
document in various languages
Example for a Bag
<uni:lecturer rdf:ID="949352" uni:name="Grigoris Antoniou">
  <uni:coursesTaught>
    <rdf:Bag>
      <rdf:_1 rdf:resource="#CIT1112"/>
      <rdf:_2 rdf:resource="#CIT3116"/>
    </rdf:Bag>
  </uni:coursesTaught>
</uni:lecturer>
Example for Alternative
<uni:course rdf:ID="CIT1111"
    uni:courseName="Discrete Mathematics">
  <uni:lecturer>
    <rdf:Alt>
      <rdf:li rdf:resource="#949352"/>
      <rdf:li rdf:resource="#949318"/>
    </rdf:Alt>
  </uni:lecturer>
</uni:course>
Containers in RDF
• They can be seen as special kinds of properties, with
unnamed nodes.
• rdf:ID can be used to assign identifiers to those otherwise
unnamed nodes (in order to make it possibly usable
• The one of the previous example:
[Figure: graph of the previous example, for the course Discrete Mathematics]
Another syntax for RDF Collections
Shorthand syntax:
"Collection" value for the rdf:parseType attribute:
<rdf:Description rdf:about="#CIT2112">
  <uni:isTaughtBy rdf:parseType="Collection">
    <rdf:Description rdf:about="#949111"/>
    <rdf:Description rdf:about="#949352"/>
    <rdf:Description rdf:about="#949318"/>
  </uni:isTaughtBy>
</rdf:Description>
RDF Lists
• RDF also provides support for describing
groups containing only the specified
members, in the form of RDF collections
– list structure in the RDF graph
– constructed using a predefined collection
vocabulary: rdf:List, rdf:first, rdf:rest and rdf:nil
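The list structure behind a collection can be spelled out as triples. The sketch below builds the rdf:first / rdf:rest chain for a list of members; the node names _:n0, _:n1, ... stand for anonymous nodes and, like the function name rdf_list, are invented for illustration.

```python
def rdf_list(members):
    """Return (head, triples) for a closed list of the given members."""
    nodes = [f"_:n{i}" for i in range(len(members))]
    triples = []
    for i, member in enumerate(members):
        # Each node points to its member and to the rest of the list;
        # the last node points to rdf:nil, which closes the list.
        rest = nodes[i + 1] if i + 1 < len(members) else "rdf:nil"
        triples.append((nodes[i], "rdf:first", member))
        triples.append((nodes[i], "rdf:rest", rest))
    head = nodes[0] if nodes else "rdf:nil"
    return head, triples


head, triples = rdf_list(["#949111", "#949352", "#949318"])
```

The final rdf:nil is what makes the group closed: unlike a Bag, Seq or Alt, nobody can add a further member without changing an existing statement.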
Recall Reification
• Sometimes we wish to make statements
about other statements
• We must be able to refer to a statement
using an identifier
• RDF allows such reference through a
reification mechanism which turns a
statement into a resource
Reification represented in XML
• rdf:subject, rdf:predicate and rdf:object
allow access to the parts of a statement
• The ID of the statement can be used to refer
to it, as can be done for any description
• We write an rdf:Description if we don’t
want to talk about a statement further
• We write an rdf:Statement if we wish to
refer to a statement as reified
Reification Example
<rdf:Description rdf:about="#949352">
  <uni:name>Grigoris Antoniou</uni:name>
</rdf:Description>
reifies as
<rdf:Statement rdf:ID="StatementAbout949352">
<rdf:subject rdf:resource="#949352"/>
<rdf:predicate rdf:resource="
<rdf:object>Grigoris Antoniou</rdf:object>
</rdf:Statement>
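In terms of triples, reifying a statement means replacing it by statements about a new resource. The sketch below does this for the name statement of the example; the believes property and the reader identifier are hypothetical, added only to show a statement about the reified statement.

```python
def reify(statement_id, subject, predicate, obj):
    """Describe the statement (subject, predicate, obj) as a resource."""
    return {
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    }


reified = reify("#StatementAbout949352",
                "#949352", "uni:name", "Grigoris Antoniou")

# Hypothetical statement about the statement: it now has an identifier
# that can appear as the value of another triple.
belief = ("#someReader", "believes", "#StatementAbout949352")
```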
Part 4: RDF Schema
Basic Ideas of RDF Schema
• RDF is a universal language that lets users describe
resources in their own vocabularies
– It does not assume any meaning for the vocabulary used
– It does not assume, nor does it define, the semantics of any particular
application domain
• The user can do so in RDF Schema using:
– Classes and Properties
– Class Hierarchies and Inheritance
– Property Hierarchies
• Disclaimer: The relation between RDF and RDF Schema
is not the same as between XML and XML Schema
– The naming is a bit unfortunate
Classes and Instances
• We must distinguish between
– Concrete “things” (individual objects) in the
domain: Discrete Maths, David Billington etc.
– Sets of individuals sharing properties called
classes: lecturers, students, courses etc.
• Individual objects that belong to a class are
referred to as instances of that class
• As we’ve seen, the relationship between
instances and classes in RDF is made with rdf:type
Using Classes
• Like types, classes impose restrictions on
what can be stated in an RDF document
using the schema
– Disallow nonsense from being stated
– E.g., in programming languages A+1 where A
is an array
– E.g. or X teaches Y where X is a room number
Class Hierarchies
• Classes can be organized in hierarchies
– A is a subclass of B if every instance of A is
also an instance of B
– Equivalently B is a superclass of A
A subclass graph need not be a tree
– A class may have multiple superclasses
Class Hierarchy Example
Sub-classes inherit properties of super-classes. E.g.
Range restriction: Courses must be taught by academic staff
members only
Michael Maher is a professor
He inherits the ability to teach from the class of academic staff
This is done in RDF Schema by defining and fixing the
semantics of “is a subclass of”
It is not up to an application (RDF processing software) to
interpret “is a subclass of”
Note that in RDF, and also in XML, none of the semantics
was defined! Here this is no longer the case.
Property Hierarchies
• Hierarchical relationships may also be
defined for properties
– E.g., “is taught by” is a sub-property of “involves”
– If a course C is taught by an academic staff
member A, then C also involves A
• P is a subproperty of Q, if Q(x,y) is true
whenever P(x,y) is true
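The rule "Q(x,y) holds whenever P(x,y) holds" can be applied mechanically to a set of triples. The following is a naive sketch of that single inference rule, repeated until nothing new is derived; it is not a full RDF Schema reasoner, and the function name is invented for illustration.

```python
def apply_subproperties(triples):
    """Add (x, Q, y) for every (x, P, y) where P is a subproperty of Q."""
    inferred = set(triples)
    while True:
        sub = {(p, q) for p, rel, q in inferred if rel == "rdfs:subPropertyOf"}
        new = {(x, q, y)
               for x, p, y in inferred
               for p2, q in sub
               if p == p2}
        if new <= inferred:        # nothing new: fixpoint reached
            return inferred
        inferred |= new


facts = {
    ("isTaughtBy", "rdfs:subPropertyOf", "involves"),
    ("CIT1111", "isTaughtBy", "949318"),
}
closure = apply_subproperties(facts)
```

The converse is not derived: a course may involve a tutor without being taught by that tutor.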
RDF and RDF Schema Layers
• The schema is written in a formal language,
RDF Schema, that can express its ingredients:
– subClassOf, Class, Property, subPropertyOf,
Resource, etc.
• The RDF graph refers to the elements of the
schema for the semantics, via rdf:type
Layers Example
RDF Schema in RDF
The modelling primitives of RDF Schema are defined
using resources and properties
To declare that “lecturer” is a subclass of “academic staff member”:
This is done in RDF
Define resources lecturer, academicStaffMember, and
define property subClassOf
Write triple (lecturer,subClassOf,academicStaffMember)
The “difference” is that the meaning of subClassOf is not
up to the user. It is part of the RDF Schema language (and
Core Classes of RDF Schema
• rdfs:Resource, the class of all resources
• rdfs:Class, the class of all classes
• rdfs:Literal, the class of all literals
• rdf:Property, the class of all properties.
• rdf:Statement, the class of all reified statements
Core Properties
rdf:type, which relates a resource to its class
– The resource is declared to be an instance of that class
rdfs:subClassOf, which relates a class to one of
its superclasses
– All instances of a class are instances of its superclass
rdfs:subPropertyOf, which relates a property to one of
its superproperties
Core Properties (cont)
rdfs:domain, which specifies the domain of a
property P
– The class of those resources that may appear as
subjects in a triple with predicate P
– If the domain is not specified, then any resource can be
the subject
rdfs:range, which specifies the range of a
property P
The class of those resources that may appear as values
in a triple with predicate P
<rdfs:Class rdf:about="#lecturer">
  <rdfs:subClassOf rdf:resource="#staffMember"/>
</rdfs:Class>
<rdf:Property rdf:ID="phone">
  <rdfs:domain rdf:resource="#staffMember"/>
  <rdfs:range rdf:resource="…"/>
</rdf:Property>
Core Classes and Properties
• rdfs:subClassOf and rdfs:subPropertyOf are
transitive, by definition
• rdfs:Class is a subclass of rdfs:Resource
– Because every class is a resource
• rdfs:Resource is an instance of rdfs:Class
– rdfs:Resource is the class of all resources, so it is a class
• Every class is an instance of rdfs:Class
– For the same reason
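Because rdfs:subClassOf is transitive, all superclasses of a class can be collected by following subClassOf links repeatedly. The following is a minimal Python sketch, assuming the schema is given as plain (subclass, superclass) pairs. The function name and the sample class names are illustrative only and do not come from any RDF library.

```python
def superclasses(cls, sub_class_of):
    """Return every direct or indirect superclass of cls.

    sub_class_of is a set of (subclass, superclass) pairs, i.e. the
    rdfs:subClassOf statements of a schema.
    """
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in sub_class_of:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


# Hypothetical schema fragment in the spirit of the university example.
SCHEMA = {
    ("professor", "academicStaffMember"),
    ("academicStaffMember", "staffMember"),
    ("staffMember", "rdfs:Resource"),
}

print(sorted(superclasses("professor", SCHEMA)))
```

The `found` set also guards against cycles in the subclass graph, so the loop terminates even if two classes are declared subclasses of each other.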
[Figure: Subclass Hierarchy of Some Modeling Primitives of RDF]
[Figure: Instance Relationships of Some Modeling Primitives of RDFS]
[Figure: Instance Relationships of Some Core Properties of RDF and RDF Schema]
Reification and Containers
rdf:subject, relates a reified statement to its subject
rdf:predicate, relates a reified statement to its predicate
rdf:object, relates a reified statement to its object
rdf:Bag, the class of bags
rdf:Seq, the class of sequences
rdf:Alt, the class of alternatives
rdfs:Container, which is a superclass of all container
classes, including the three above
Utility Properties
rdfs:seeAlso relates a resource to another
resource that explains it
rdfs:isDefinedBy is a subproperty of
rdfs:seeAlso and relates a resource to the place
where its definition, typically an RDF schema, is found
rdfs:comment. Associates comments, typically
longer text, with a resource
rdfs:label. Associates a human-friendly label
(name) with a resource
University Lecturers Example
<rdfs:Class rdf:ID="lecturer">
  <rdfs:comment>
    The class of lecturers. All lecturers are
    academic staff members.
  </rdfs:comment>
  <rdfs:subClassOf rdf:resource="#academicStaffMember"/>
</rdfs:Class>
Courses and teachers
<rdfs:Class rdf:ID="course">
  <rdfs:comment>The class of courses</rdfs:comment>
</rdfs:Class>
<rdf:Property rdf:ID="isTaughtBy">
  <rdfs:comment>
    Inherits its domain ("course") and range ("lecturer")
    from its superproperty "involves"
  </rdfs:comment>
  <rdfs:subPropertyOf rdf:resource="#involves"/>
</rdf:Property>
Phone Property Example
<rdf:Property rdf:ID="phone">
  <rdfs:comment>
    It is a property of staff members
    and takes literals as values.
  </rdfs:comment>
  <rdfs:domain rdf:resource="#staffMember"/>
</rdf:Property>
Semantics of RDFS and RDF
Unlike XML, or RDF alone, RDF Schema has a
precise, well defined meaning
The semantics of RDF Schema can be defined by
translating its modelling primitives into first-order
logic, more precisely predicate logic with equality
– This readily provides a precise meaning
– It makes the semantics unambiguous and machine
accessible
– It provides a basis for reasoning support by automated
reasoners manipulating logical formulas
Defining the Semantics of RDFS
All language primitives in RDF and RDF Schema
are represented by constants in the logics
– Resource, Class, Property, subClassOf, etc.
Predefined predicates are used as a foundation for
expressing relationships between the constants
– Variable names begin with ?
– All axioms are implicitly universally quantified
Auxiliary Axiomatization of Lists
• Function symbols:
nil (empty list)
cons(x,l) (adds an element to the front of the list)
first(l) (returns the first element)
rest(l) (returns the rest of the list)
• Predicate symbols:
– item(x,l) (tests if an element occurs in the list)
– list(l) (tests whether l is a list)
• Lists are used to represent containers in RDF
Basic Predicates for the
• PropVal(P,R,V)
– A predicate with 3 arguments, which is used to
represent the RDF triple (R,P,V) – with
resource R, property P and value V
• Type(R,T)
– Specifies that the resource R has the type T
– Formally defined by:
Type(?r,?t) ↔ PropVal(type,?r,?t)
RDF Classes
• Constants: Class, Resource, Property, …
– All classes are instances of Class
– So the axiomatization includes the following facts
Type(Class,Class), Type(Resource,Class), Type(Property,Class), …
Axiomatizing Classes
• Resource is the most general class: every
class and every property is a resource. I.e.
Type(?p,Property) → Type(?p,Resource)
Type(?c,Class) → Type(?c,Resource)
• The predicate in an RDF statement must be
a property
PropVal(?p,?r,?v) → Type(?p,Property)
The type Property
• type is a property
• type can be applied to resources (domain) and has
a class as its value (range)
Type(?r,?c) → (Type(?r,Resource) ∧ Type(?c,Class))
The Auxiliary FuncProp Property
• P is a functional (key) property if, and only if,
– it is a property, and
– there are no x, y1 and y2 with P(x,y1), P(x,y2) and
y1 ≠ y2
Type(?p, FuncProp) ↔
(Type(?p, Property) ∧
∀?r ∀?v1 ∀?v2
(PropVal(?p,?r,?v1) ∧
PropVal(?p,?r,?v2) → ?v1 = ?v2))
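The second condition of the FuncProp axiom can be checked directly on a finite set of statements. The sketch below is only an illustration under stated assumptions: statements are Python tuples in the PropVal(P,R,V) argument order used above, and the property names and values are made up.

```python
from collections import defaultdict


def is_functional(prop, statements):
    """Check that no resource has two different values for prop.

    statements is an iterable of (property, resource, value) tuples,
    mirroring PropVal(P, R, V).
    """
    values = defaultdict(set)
    for p, r, v in statements:
        if p == prop:
            values[r].add(v)
    return all(len(vs) <= 1 for vs in values.values())


# Hypothetical data: one age per person, but two phone numbers for mary.
STATEMENTS = [
    ("age", "mary", "30"),
    ("age", "john", "41"),
    ("phone", "mary", "111"),
    ("phone", "mary", "222"),
]

print(is_functional("age", STATEMENTS))
print(is_functional("phone", STATEMENTS))
```

Note that this only tests the data at hand. The axiom itself is a statement about all models, so a reasoner would instead use it to conclude ?v1 = ?v2.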
Containers
• Containers are lists:
Type(?c,Container) → list(?c)
• Containers can be bags or sequences or alternatives:
Type(?c,Container) ↔
(Type(?c,Bag) ∨ Type(?c,Seq) ∨ Type(?c,Alt))
• Bags and sequences are disjoint:
¬(Type(?x,Bag) ∧ Type(?x,Seq))
Containers (cont)
• For every natural number n > 0, there is the selector
_n, which selects the nth element of a container
• The selector is a functional property:
Type(_n,FuncProp)
• The selector applies only to containers:
PropVal(_n,?c,?o) → Type(?c,Container)
Subclass
• subClassOf is a property:
Type(subClassOf,Property)
• If a class C is a subclass of a class C', then all
instances of C are also instances of C':
PropVal(subClassOf,?c,?c') ↔
(Type(?c,Class) ∧ Type(?c',Class) ∧
∀?x (Type(?x,?c) → Type(?x,?c')))
Subproperty
• P is a subproperty of P', if P'(x,y) is true whenever
P(x,y) is true:
PropVal(subPropertyOf,?p,?p') ↔
(Type(?p,Property) ∧ Type(?p',Property) ∧
∀?r ∀?v (PropVal(?p,?r,?v) →
PropVal(?p',?r,?v)))
Domain and Range
• If the domain of P is D, then for every P(x,y), x ∈ D
PropVal(domain,?p,?d) →
∀?x ∀?y (PropVal(?p,?x,?y) → Type(?x,?d))
• If the range of P is R, then for every P(x,y), y ∈ R
PropVal(range,?p,?r) →
∀?x ∀?y (PropVal(?p,?x,?y) → Type(?y,?r))
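Read from left to right, the axioms for subClassOf, subPropertyOf, domain and range act as inference rules. The sketch below applies them by naive forward chaining until nothing new is derived. It is an illustration under stated assumptions, not a complete RDFS reasoner: triples are written (subject, predicate, object) rather than in PropVal order, the vocabulary is unprefixed, and the sample data is hypothetical.

```python
def rdfs_closure(triples):
    """Saturate a set of (subject, predicate, object) triples with four rules."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            if p == "subClassOf":
                # Instances of the subclass s are instances of the superclass o.
                new |= {(x, "type", o) for x, q, c in facts if q == "type" and c == s}
            elif p == "subPropertyOf":
                # Whatever is related by s is also related by the superproperty o.
                new |= {(x, o, y) for x, q, y in facts if q == s}
            elif p == "domain":
                # Subjects of property s have type o.
                new |= {(x, "type", o) for x, q, _ in facts if q == s}
            elif p == "range":
                # Values of property s have type o.
                new |= {(y, "type", o) for _, q, y in facts if q == s}
        if new <= facts:
            return facts
        facts |= new


# Hypothetical data following the course example.
DATA = {
    ("isTaughtBy", "subPropertyOf", "involves"),
    ("involves", "domain", "course"),
    ("involves", "range", "academicStaffMember"),
    ("professor", "subClassOf", "academicStaffMember"),
    ("discreteMaths", "isTaughtBy", "michaelMaher"),
    ("michaelMaher", "type", "professor"),
}

for triple in sorted(rdfs_closure(DATA) - DATA):
    print(triple)
```

Repeating the loop until a fixpoint is reached is what lets type information travel along chains of subclasses and subproperties.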
Querying RDF
• The above provides a precise (formal) meaning to RDF and
RDF Schema
• This precise meaning allows for reasoning
• Querying may make use of the reasoning mechanisms
• Querying the XML representation of RDF is too low-level,
and doesn’t take advantage of reasoning
– There are various ways of syntactically representing an RDF
statement in XML
– Thus we would require several XQuery queries, e.g.
• //uni:lecturer/uni:title if uni:title is an element
• //uni:lecturer/@uni:title if uni:title is an attribute
– Both XML representations are equivalent!
• SPARQL is a query language for RDF
RDF Querying Example
<uni:lecturer rdf:ID="949352">
  <uni:name>Grigoris Antoniou</uni:name>
</uni:lecturer>
<uni:professor rdf:ID="949318">
  <uni:name>David Billington</uni:name>
</uni:professor>
<rdfs:Class rdf:about="#professor">
  <rdfs:subClassOf rdf:resource="#lecturer"/>
</rdfs:Class>
• A query for the names of all lecturers should return both Grigoris
Antoniou and David Billington
SPARQL Basic Queries
• Similar to SQL where:
– select specifies the number and order of retrieved data
– from is used to navigate through the data model
– where imposes constraints on possible solutions
• In SPARQL the from is not used, since the relation
(property) names are explicit in the where
• E.g. Retrieve all phone numbers of staff members:
select ?X ?Y
where {?X phone ?Y.}
• Here ?X and ?Y are variables, and ?X phone ?Y
represents a resource-property-value triple
SPARQL Basic Queries (cont)
• The where clause specifies an RDF pattern
– Constants can also be used in the pattern
– For example, list all courses taught by Grigoris
select ?C
where {?C isTaughtBy "Grigoris".}
• Several patterns can be specified. This allows for joins
– For example, list all courses taught by the lecturer(s) of “Discrete
Maths”
select distinct ?C
where {"Discrete Maths" isTaughtBy ?P.
?C isTaughtBy ?P.}
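To see how several patterns sharing a variable behave as a join, here is a small Python sketch of pattern matching over triples. It is not how a SPARQL engine is implemented. It assumes triples and patterns are tuples of strings, that variables start with "?", and it uses made-up resource names in place of the quoted strings above.

```python
def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy all triple patterns (a join)."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        extended = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its existing binding.
                if extended.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(rest, triples, extended)


# Hypothetical data.
DATA = [
    ("discreteMaths", "isTaughtBy", "david"),
    ("logic", "isTaughtBy", "david"),
    ("databases", "isTaughtBy", "grigoris"),
]

# Courses taught by the lecturer(s) of discreteMaths.
QUERY = [("discreteMaths", "isTaughtBy", "?P"), ("?C", "isTaughtBy", "?P")]

courses = sorted({b["?C"] for b in match(QUERY, DATA)})
print(courses)
```

Collecting the results in a set plays the role of `distinct`. The shared variable ?P is what ties the two patterns together.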
SPARQL Basic Queries (cont)
• Extra filtering conditions can be imposed
– For example, list all professors with salary greater than
3000
select ?P
where {?P hasSalary ?S. filter (?S > 3000).}
• Union of results can be performed
– For example, list all courses taught by either Grigoris or
Frank
select distinct ?C
where {{?C isTaughtBy "Grigoris"} UNION
{?C isTaughtBy "Frank"}}
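The effect of filter and UNION can be mimicked on a plain list of triples. This is a rough Python sketch only. The data, the salary figures and the helper name `subjects` are assumptions made for illustration.

```python
# Hypothetical data.
TRIPLES = [
    ("david", "hasSalary", 3500),
    ("grigoris", "hasSalary", 2800),
    ("logic", "isTaughtBy", "grigoris"),
    ("semanticWeb", "isTaughtBy", "frank"),
    ("databases", "isTaughtBy", "david"),
]


def subjects(prop, keep):
    """Subjects s of triples (s, prop, o) whose object o passes keep(o)."""
    return {s for s, p, o in TRIPLES if p == prop and keep(o)}


# {?P hasSalary ?S. filter (?S > 3000)}
well_paid = subjects("hasSalary", lambda salary: salary > 3000)

# {{?C isTaughtBy grigoris} UNION {?C isTaughtBy frank}}
taught = (subjects("isTaughtBy", lambda o: o == "grigoris")
          | subjects("isTaughtBy", lambda o: o == "frank"))

print(sorted(well_paid))
print(sorted(taught))
```

A filter discards solutions after matching, while a union merges the solutions of two alternative patterns, which is why set union is enough here.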
SPARQL Basic Queries (cont)
• from clauses may exist, but not like in SQL
• A from clause indicates the graph to be used.
– For example
select ?X ?Y
from <>
where {?X phone ?Y.}
• There is much more in SPARQL:
Result modifiers (like for sorting)
Ways of querying containers
Etc, etc…
Summary of RDF and RDFS
• RDF provides a foundation for representing and
processing metadata
– It has a graph-based data model
– It has an XML-based syntax to support syntactic
interoperability
• RDF has a decentralized philosophy and allows
incremental building of knowledge, and its sharing
and reuse
• RDF is domain-independent and RDF Schema
provides a mechanism for describing specific
domains
Summary of RDF and RDFS (cont)
• RDF Schema is a primitive ontology language
– It offers certain modeling primitives with fixed
meaning
• There exist query languages for RDF and RDFS
making use of the semantics, such as SPARQL
• There exist several tools for dealing with RDF and
RDF Schema, which also implement query languages
– Try Jena!
– Jena is a Java framework to construct Semantic Web
applications. It provides a programmatic environment
for RDF, RDFS, OWL and SPARQL, and includes a
rule-based inference engine.
RDF Schema is not the ultimate answer
• RDF Schema is quite primitive as a modeling language for
the Web
• Many desirable modeling primitives are missing
• We still need an ontology layer on top of RDF and RDF
Schema
• But before going into richer Web languages for
ontologies, it is better to look a bit more formally
at what ontologies are and how they can be
represented
Part 5: Ontologies and Description Logics
• Ontologies establish a formal specification of the
concepts used in representing knowledge
• Ontology: originates from philosophy as a branch
of metaphysics
– Ontologia studies the nature of existence
– Defines what exists and the relation between existing
concepts (in a given domain)
– Sought universal categories for classifying everything
that exists
An Ontology
• An ontology is a catalog of the types of things that are
assumed to exist in a domain.
• The types in an ontology represent the predicates, word
senses, or concept and relation types of the language when
used to discuss topics in the domain.
• Logic says nothing about anything, but the combination of
logic with an ontology provides a language that can
express relationships about the entities in the domain of
interest
• When writing logical formulas an ontology is implicitly or
explicitly assumed
– A logical formula can only be understood if one understands the
meaning of the predicates and objects involved
[Figure: Aristotle’s Ontology]
The Ontology
• Effort to define and categorize everything that
exists
• Agreeing on the ontology makes it possible to
understand the concepts
• Efforts to define a big ontology, defining all
concepts, still exist today:
– The Cyc (from Encyclopedia) ontology (over 100,000
concept types and over 1M axioms)
– Electronic Dictionary Research: 400,000 concept types
– WordNet: 166,000 English word senses
Cyc Ontology
[Figure: Cyc ontology diagram; surviving node labels: Represented Thing, Intangible Object, Intangible Stuff, Attribute value, Internal machine thing]
Small Ontologies
• Designed for specific applications
• How to make these coexist with big ontologies?
Domain-Specific Ontologies
• Medical domain:
– Cancer ontology from the National Cancer Institute in the United
States
• Cultural domain:
– Art and Architecture Thesaurus (AAT) with 125,000 terms in the
cultural domain
– Union List of Artist Names (ULAN), with 220,000 entries on
artists
– Iconclass vocabulary of 28,000 terms for describing cultural
images
• Geographical domain:
– Getty Thesaurus of Geographic Names (TGN), containing over 1
million entries
Ontologies and the Web
• In the Web, ontologies provide a shared
understanding of a domain
– It is crucial to deal with differences in terminology
• To understand data in the Web it is crucial that an
ontology exists
• To be able to automatically understand the data,
and use it in a distributed environment, it is crucial
that the ontology is:
– Explicitly defined
– Available in the Web
Defining an Ontology
• How to define a catalog of the types of things that
are assumed to exist in a domain?
– I.e. how to define an ontology for a given domain?
• What makes an ontology?
Entities in a taxonomy
Properties and relations
• Similar to ER models in databases
Main Stages in Ontology Development
1. Determine scope
2. Consider reuse
3. Enumerate terms
4. Define taxonomy
5. Define properties
6. Define facets
7. Define instances
8. Check for anomalies
Not a linear process!
Determine Scope
• There is no correct ontology of a specific
domain
– An ontology is an abstraction of a particular
domain, and there are always viable alternatives
• What is included in this abstraction should
be determined by
– the use to which the ontology will be put
– future extensions that are already anticipated
Determine Scope (cont)
• Basic questions to be answered at this stage
– What is the domain that the ontology will
cover?
– For what are we going to use the ontology?
– For what types of questions should the ontology
provide answers?
– Who will use and maintain the ontology?
Consider Reuse
• One rarely has to start from scratch when
defining an ontology
– In these web days, there is almost always an
ontology available that provides at least a
useful starting point for our own ontology
• With the Semantic Web, ontologies will
become even more widely available
Enumerate Terms
• Write down in an unstructured list all the relevant
terms that are expected to appear in the ontology
– Nouns form the basis for class names
– Verbs form the basis for property/predicate names
• Traditional knowledge engineering tools (e.g.
laddering and grid analysis) can be used to obtain
– the set of terms
– an initial structure for these terms
Define the Taxonomy
• Relevant terms must be organized in a
taxonomic is_a hierarchy
– Opinions differ on whether it is more
efficient/reliable to do this in a top-down or a
bottom-up fashion
• Ensure that the hierarchy is indeed a
taxonomy
– If A is a subclass of B, then every object of
type A must also be an object of type B
Define Properties
• Often interleaved with the previous step
• Attach properties to the highest class in the
hierarchy to which they apply:
– Inheritance applies to properties
• While attaching properties to classes, it makes
sense to immediately provide statements about the
domain and range of these properties
– Immediately define the domain of properties
Define Facets
• Define extra conditions over properties
– Cardinality restrictions
– Required values
– Relational characteristics
• symmetry, transitivity, inverse properties, functional
Define Instances
• Filling the ontologies with such instances is
a separate step
• Number of instances >> number of classes
• Thus populating an ontology with instances
is not done manually
– Retrieved from legacy data sources (DBs)
– Extracted automatically from a text corpus
Check for Anomalies
• Test whether the ontology is consistent
– For this, one must have a notion of consistency in the
language
• Examples of common inconsistencies
– incompatible domain and range definitions for
transitive, symmetric, or inverse properties
– cardinality properties
– requirements on property values can conflict with
domain and range restrictions
Protégé
• Java-based ontology editor
• It supports Protégé-Frames and OWL as
modeling languages
– Frames is based on Open Knowledge Base
Connectivity protocol (OKBC)
• It exports into various formats, including
(Semantic) Web formats
The newspaper example (part)
News Service
• Properties (slots)
Persons have names which are strings, phone number, etc
Employees (further) have salaries that are positive numbers
Editors are responsible for other employees
Articles have an author, which is an instance of Author, and possibly
various keywords
• Constraints
– Each article must have at least two keywords
– The salary of an editor should be greater than the salary of any employee
which the editor is responsible for
Languages for Ontologies
• In the early days of Artificial Intelligence, ontologies
were represented resorting to non-logic-based
formalisms
– Frame systems and semantic networks
• Graphical representation
– arguably easy to design
– but difficult to manage with complex pictures
– formal semantics, allowing for reasoning, was missing
Semantic Networks
• Nodes representing concepts (i.e. sets of classes of
individual objects)
• Links representing relationships
– IS_A relationship
– More complex relationships may have nodes
Logics for Semantic Networks
• Logic was used to describe the semantics of core
features of these networks
– Relying on unary predicates for describing sets of
individuals and binary predicates for relationships
between individuals
• Typical reasoning used in structure-based
representation does not require the full power of
1st order theorem provers
– Specialized reasoning techniques can be applied
From Frames to Description Logics
• Logical specialized languages for describing
• The name changed over time
– Terminological systems emphasizing that the language
is used to define a terminology
– Concept languages emphasizing the concept-forming
constructs of the languages
– Description Logics moving attention to the properties,
including decidability, complexity, expressivity, of the
languages
Description Logic ALC
• ALC is the smallest propositionally closed
Description Logic. Syntax:
– Atomic types:
• Concept names, which are unary predicates
• Role names, which are binary predicates
– Constructs
• ¬C (negation)
• C1 ⊓ C2 (conjunction)
• C1 ⊔ C2 (disjunction)
• ∃R.C (existential restriction)
• ∀R.C (universal restriction)
Semantics of ALC
• Semantics is based on interpretations (D^I, ·^I), where
D^I is the domain and ·^I maps:
– Each concept name A to A^I ⊆ D^I
• I.e. a concept denotes a set of individuals from the domain
(unary predicates)
– Each role name R to R^I ⊆ D^I × D^I
• I.e. a role denotes pairs of (binary relationships among)
individuals
• An interpretation is a model for concept C iff
C^I ≠ {}
• Semantics can also be given by translating to 1st
order logic
Negation, conjunction, disjunction
• ¬C denotes the set of all individuals in the domain that do
not belong to C. Formally
– (¬C)^I = D^I – C^I
– {x: ¬C(x)}
• C1 ⊔ C2 (resp. C1 ⊓ C2) is the set of all individuals that
belong to C1 or (resp. and) to C2
– (C1 ⊔ C2)^I = C1^I ∪ C2^I, resp. (C1 ⊓ C2)^I = C1^I ∩ C2^I
– {x: C1(x) ∨ C2(x)}, resp. {x: C1(x) ∧ C2(x)}
• Persons that are not female
– Person ⊓ ¬Female
• Male or Female individuals
– Male ⊔ Female
Quantified role restrictions
• Quantifiers are meant to characterize relationships
between concepts
• ∃R.C denotes the set of all individuals which relate
via R with at least one individual in concept C
– (∃R.C)^I = {d ∈ D^I | there is an e with (d,e) ∈ R^I and e ∈ C^I}
– {x | ∃y R(x,y) ∧ C(y)}
• Persons that have a female child
– Person ⊓ ∃hasChild.Female
Quantified role restrictions (cont)
• ∀R.C denotes the set of all individuals for which
all individuals to which they relate via R belong to
concept C
– (∀R.C)^I = {d ∈ D^I | for all e, (d,e) ∈ R^I implies e ∈ C^I}
– {x | ∀y R(x,y) → C(y)}
• Persons all of whose children are female
– Person ⊓ ∀hasChild.Female
• The link in the network above
– Parents have at least one child that is a person, and
there is no upper limit for children
– ∃hasChild.Person ⊓ ∀hasChild.Person
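The set-based semantics of the ALC constructs can be computed directly when the interpretation is finite. The following Python sketch assumes an encoding of concepts as nested tuples, which is chosen here only for illustration, and a made-up family domain.

```python
def extension(concept, domain, concepts, roles):
    """Return the set of individuals that concept denotes in a finite interpretation.

    A concept is a name (str) or a tuple: ("not", C), ("and", C1, C2),
    ("or", C1, C2), ("some", R, C) for an existential restriction,
    ("all", R, C) for a universal restriction.
    """
    if isinstance(concept, str):
        return set(concepts.get(concept, ()))
    op = concept[0]
    if op == "not":
        return domain - extension(concept[1], domain, concepts, roles)
    if op in ("and", "or"):
        left = extension(concept[1], domain, concepts, roles)
        right = extension(concept[2], domain, concepts, roles)
        return left & right if op == "and" else left | right
    if op in ("some", "all"):
        pairs = roles.get(concept[1], set())
        filler = extension(concept[2], domain, concepts, roles)
        result = set()
        for d in domain:
            successors = {e for x, e in pairs if x == d}
            if op == "some" and successors & filler:
                result.add(d)
            if op == "all" and successors <= filler:
                result.add(d)
        return result
    raise ValueError(f"unknown constructor: {op!r}")


# Hypothetical interpretation.
DOMAIN = {"ann", "bob", "carl", "dina"}
CONCEPTS = {"Person": DOMAIN, "Female": {"ann", "dina"}}
ROLES = {"hasChild": {("ann", "bob"), ("ann", "dina"), ("bob", "dina")}}

not_female = extension(("and", "Person", ("not", "Female")), DOMAIN, CONCEPTS, ROLES)
some_daughter = extension(("and", "Person", ("some", "hasChild", "Female")), DOMAIN, CONCEPTS, ROLES)
only_daughters = extension(("and", "Person", ("all", "hasChild", "Female")), DOMAIN, CONCEPTS, ROLES)
print(sorted(not_female), sorted(some_daughter), sorted(only_daughters))
```

In this interpretation carl and dina have no children, so they satisfy the universal restriction vacuously. This is why the network link above needs both the existential and the universal restriction.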
Elephant example
• Elephants that are grey mammals which have
a trunk
– Mammal ⊓ ∃bodyPart.Trunk ⊓ ∀color.Grey
• Elephants that are heavy mammals, except
for Dumbo elephants that are light
– Mammal ⊓
(∀weight.Heavy ⊔ (Dumbo ⊓ ∀weight.Light))
Reasoning tasks in DL
• What can we do with an ontology? What more does the logical
formalism bring?
• Reasoning tasks
– Concept satisfiability (is there any model for C?)
– Concept subsumption (does C1^I ⊆ C2^I hold for all I?)
C1 ⊑ C2
• Subsumption is important because from it one can compute
a concept hierarchy
• Specialized (decidable and efficient) proof techniques exist
for ALC, that do not employ the whole power needed for
1st order logic
– Based on tableau algorithms
Representing Knowledge with DL
• A DL Knowledge base is made of
– A TBox: Terminological (background) knowledge
• Defines concepts.
• E.g. Elephant ≐ Mammal ⊓ ∃bodyPart.Trunk
– An ABox: Knowledge about individuals, be it about concepts
or roles
• E.g.
dumbo: Elephant or
• Similar to e.g. databases, where there exists a
schema and an instance of a database.
General TBoxes
• T is a finite set of equations of the form
C1 ≐ C2
• I is a model of T if for all C1 ≐ C2 ∈ T, C1^I = C2^I
• Reasoning:
– Satisfiability: given C and T, find whether there is a
common model of C and of T
– Subsumption (C1 ⊑T C2): does C1^I ⊆ C2^I hold for all
models of T?
Acyclic TBoxes
• For decidability, TBoxes are often restricted to
A ≐ C
where A is a concept name (rather than
an arbitrary concept expression)
• Moreover, concept A does not appear in the
expression C, nor in the definition of any of the
concepts there (i.e. the definition is acyclic)
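One practical consequence of acyclicity is that defined names can be eliminated by replacing each one with its definition, a step usually called unfolding. The Python sketch below assumes concepts encoded as nested tuples and a TBox given as a dictionary from defined names to definitions. The name GreyElephant is invented for the example.

```python
def unfold(concept, tbox):
    """Replace every defined concept name by its definition, recursively.

    Terminates only for acyclic TBoxes, where no name depends on itself.
    """
    if isinstance(concept, str):
        return unfold(tbox[concept], tbox) if concept in tbox else concept
    op = concept[0]
    if op == "not":
        return ("not", unfold(concept[1], tbox))
    if op in ("and", "or"):
        return (op, unfold(concept[1], tbox), unfold(concept[2], tbox))
    if op in ("some", "all"):
        # The role name is kept; only the filler concept is unfolded.
        return (op, concept[1], unfold(concept[2], tbox))
    raise ValueError(f"unknown constructor: {op!r}")


# Hypothetical acyclic TBox.
TBOX = {
    "Elephant": ("and", "Mammal", ("some", "bodyPart", "Trunk")),
    "GreyElephant": ("and", "Elephant", ("all", "color", "Grey")),
}

print(unfold("GreyElephant", TBOX))
```

After unfolding, only undefined (primitive) names remain, so reasoning with respect to the TBox reduces to reasoning about a single concept expression.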
ABoxes
• An ABox defines a set of individuals, as instances of
concepts and roles
• It is a finite set of expressions of the form:
– a:C
– (a,b):R
where both a and b are names of individuals, C is a
concept and R a role
• I is a model of an ABox if it satisfies all its
expressions. It satisfies
– a:C iff a^I ∈ C^I
– (a,b):R iff (a^I,b^I) ∈ R^I
Reasoning with TBoxes and ABoxes
• Given a TBox T (defining concepts) and an
ABox A defining individuals
– Find whether there is a common model (i.e.
find out about consistency)
– Find whether a concept is subsumed by another
concept C1 ⊑T C2
– Find whether an individual belongs to a concept
(A,T |= a:C), i.e. whether a^I ∈ C^I for all models
of A and T
Inference under ALC
• Since the semantics of ALC can be defined in
terms of 1st order logics, clearly 1st order theorem
provers can be used for inference
• However, ALC only uses a small subset of 1st
order logics
– Only unary and binary predicates, with a very limited
use of quantifiers and connectives
• Inference and algorithms can be much simpler
– Tableau algorithms are used for ALC and most other
description logics
• ALC is also decidable, unlike 1st order logic
More expressive DLs
• The limited use of 1st order logics has its
advantages, but some obvious drawbacks:
Expressivity is also limited
• Some concept definitions are not possible to
define in ALC. E.g.
– An elephant has exactly 4 legs
• (expressing qualified number restrictions)
– Every mother has (at least) a child, and every son is the
child of a mother
• (inverse role definition)
– Elephants are animals
• (defining concepts without giving necessary and sufficient
conditions)
Extensions of ALC
• ALCN extends ALC with unqualified number restrictions
≤n R
≥n R
=n R
– Denote the individuals which relate via R to at most (resp. at least,
exactly) n individuals
– E.g. Person ⊓ (≥ 2 hasChild)
• Persons with at least two children
• The precise meaning is defined by (resp. for ≥ and =)
– (≤n R)^I = {d ∈ D^I | #{e | (d,e) ∈ R^I} ≤ n}
• It is possible to define the meaning in terms of 1st order
logic, with recourse to equality. E.g.
– ≥2 R is {x: ∃y ∃z, y ≠ z ∧ R(x,y) ∧ R(x,z)}
– ≤2 R is
{x: ∀y,z,w, (R(x,y) ∧ R(x,z) ∧ R(x,w)) → (y=z ∨ y=w ∨ z=w)}
Qualified number restriction
• ALCN can be further extended to include the more
expressive qualified number restrictions
(≤n R C)
(≥n R C) and
(=n R C)
– Denote the individuals which relate via R to at most (resp. at least,
exactly) n individuals of concept C
– E.g. Person ⊓ (≥ 2 hasChild Female)
• Persons with at least two female children
– E.g. Mammal ⊓ (=4 bodypart Leg)
• Mammals with 4 legs
• The precise meaning is defined by (resp. for ≥ and =)
– (≤n R C)^I = {d ∈ D^I | #{e | (d,e) ∈ R^I and e ∈ C^I} ≤ n}
• Again, it is possible to define the meaning in terms of 1st
order logic, with recourse to equality. E.g.
– (≥2 R C) is {x: ∃y ∃z, y ≠ z ∧ C(y) ∧ C(z) ∧ R(x,y) ∧ R(x,z)}
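In a finite interpretation, number restrictions amount to counting role successors. The sketch below covers both the unqualified and the qualified forms. The helper names and the sample data are assumptions made for illustration only.

```python
def count(d, role, filler=None):
    """Number of R-successors of d, restricted to filler if one is given."""
    successors = {e for x, e in role if x == d}
    return len(successors if filler is None else successors & filler)


def at_least(n, role, domain, filler=None):
    """Individuals with at least n successors (in filler, if given)."""
    return {d for d in domain if count(d, role, filler) >= n}


def at_most(n, role, domain, filler=None):
    """Individuals with at most n successors (in filler, if given)."""
    return {d for d in domain if count(d, role, filler) <= n}


def exactly(n, role, domain, filler=None):
    """Individuals with exactly n successors (in filler, if given)."""
    return {d for d in domain if count(d, role, filler) == n}


# Hypothetical interpretation.
DOMAIN = {"ann", "bob", "carl", "dina"}
HAS_CHILD = {("ann", "bob"), ("ann", "dina"), ("bob", "dina")}
FEMALE = {"ann", "dina"}

print(sorted(at_least(2, HAS_CHILD, DOMAIN)))          # unqualified: two or more children
print(sorted(at_least(2, HAS_CHILD, DOMAIN, FEMALE)))  # qualified: two or more female children
print(sorted(exactly(1, HAS_CHILD, DOMAIN, FEMALE)))   # qualified: exactly one female child
```

Successors are collected in a set, so counting is over distinct individuals, which matches the use of inequalities such as y ≠ z in the first-order translation.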
Further extensions
• Inverse relations
– R⁻ denotes the inverse of R: R⁻(x,y) = R(y,x)
• One-of constructs (nominals)
– {a1, …, an}, where the ai are individuals, denotes one of a1, …, an
• Statements of subsumption in TBoxes (rather than only
definitions)
• Role transitivity
– Trans(R) states that the role R is transitive
• SHOIN is the DL resulting from extending ALC with all the
above described extensions
– It is the underlying logic for the Semantic Web language OWL DL
– The less expressive language SHIF, without nominals, is the basis
for OWL-Lite
• From the W3C wine ontology
– Wine ⊑
PotableLiquid ⊓ (=1 hasMaker) ⊓ ∀hasMaker.Winery
• Wine is a potable liquid with exactly one maker, and the maker
must be a winery
– ∃hasColor⁻.Wine ⊑ {“white”, “rose”, “red”}
• Wines can be either white, rose or red.
– WhiteWine ≐ Wine ⊓ ∃hasColor.{“white”}
• White wines are exactly the wines with color white.
Bibliography on Ontologies and
Description Logics
• On Ontologies and Ontology Engineering:
Knowledge Representation: Logical, Philosophical,
and Computational Foundations
John F. Sowa
Brooks Cole Publishing Co., 2000.
• On Description Logics
The Description Logic Handbook: Theory,
Implementation, and Applications
F. Baader, D. Calvanese, D. McGuinness, D. Nardi,
and P. F. Patel-Schneider
Cambridge University Press, 2003
Part 6: Richer Ontologies in the Web
with OWL
Ontology Languages for the Web
• We’ve seen several logical languages for
describing ontologies
– More or less expressive
– Computable
– More or less complex (allowing more or less efficient
reasoning)
• We now need to choose among them, according to
compromises, and put the chosen one in a format compatible
with Web languages
– OWL is a (family of) such language(s)
Reasoning Support for OWL
• Semantics is a prerequisite for reasoning support
• Formal semantics and reasoning support are
usually provided by
– mapping an ontology language to a known logical
formalism
– using automated reasoners that already exist for those
formalisms
• OWL is (partially) mapped on a description logic,
and makes use of reasoners such as FaCT and
RACER
Limitations of RDF Schema
• Local scope of properties
– rdfs:range defines the range of a property (e.g. eats)
for all classes
– In RDF Schema we cannot declare range restrictions
that apply to some classes only
• E.g. we cannot say that cows eat only plants, while other
animals may eat meat, too
• Disjointness of classes
– Sometimes we wish to say that classes are disjoint (e.g.
male and female)
• Cardinality restrictions
– E.g. a person has exactly two parents, a course is taught
by at least one lecturer
Limitations of RDF Schema (cont)
• Boolean combinations of classes
– Sometimes we wish to build new classes by combining
other classes using union, intersection, and complement
– E.g. person is the disjoint union of the classes male
and female
• Special characteristics of properties
– Transitive property (like “greater than”)
– Unique property (like “is mother of”)
– A property is the inverse of another property (like
“eats” and “is eaten by”)
• We’ve seen description logics with none of these
limitations
Three Species of OWL
• W3C’s Web Ontology Working Group defined 3
different OWL sublanguages:
– OWL Full
• Most expressive language
– OWL DL
• Corresponding to DL SHOIN
– OWL Lite
• Corresponding to DL SHIF
• Each sublanguage is geared toward fulfilling
different aspects of the requirements
OWL Full
• It uses all the OWL language primitives
• It allows the combination of these primitives in arbitrary
ways with RDF and RDF Schema
• OWL Full is fully upward-compatible with RDF, both
syntactically and semantically
• OWL Full is so powerful that it is undecidable
– No efficient (or even complete) reasoning support
• Maybe it is a good idea not to stick to upward compatibility!
– Combining RDF Schema with logic leads to uncontrollable
computational properties
OWL DL
• OWL DL (Description Logic) is a sublanguage of
OWL Full that restricts application of the
constructors from OWL and from RDF
– Application of OWL’s constructors to each other is
disallowed
– It corresponds to the well studied description logic
SHOIN
• OWL DL allows for efficient reasoning support
• But one loses full compatibility with RDF:
– Not every RDF document is a legal OWL DL
document
– Every legal OWL DL document is a legal RDF
document
OWL Lite
• An even further restriction limits OWL DL to a
subset of the language constructors
– E.g., OWL Lite excludes enumerated classes,
disjointness statements, and arbitrary cardinality.
– It corresponds to description logics SHIF
• The advantage of OWL Lite is that it is a
language that is easier to
– grasp, for users
– implement more efficiently, for tool builders
• The disadvantage is the restricted expressivity
OWL Compatibility with RDF
• All varieties of OWL use RDF for their syntax
• Instances are declared as in RDF, using RDF
descriptions and typing information
• OWL constructors are specializations of their
RDF counterparts
OWL Syntactic Varieties
OWL builds on RDF and uses the XML-based
syntax of RDF
Other syntactic forms for OWL have also been
defined
– An alternative, more readable XML-based syntax
– An abstract syntax, much more compact and readable
than the XML languages
– A graphic syntax based on the conventions of UML
(Unified Modelling Language)
OWL XML/RDF Syntax: Header
<rdf:RDF
  xmlns:owl =""
  xmlns:rdf =""
  xmlns:xsd =" XMLSchema#">
• An OWL ontology may start with a
collection of assertions using
owl:Ontology element
<owl:Ontology rdf:about="">
  <rdfs:comment>An example OWL ontology</rdfs:comment>
  <rdfs:label>University Ontology</rdfs:label>
</owl:Ontology>
• owl:imports is transitive
Classes in OWL
• Classes are defined using owl:Class
– owl:Class is a subclass of rdfs:Class
• Disjointness is defined using owl:disjointWith
<owl:Class rdf:about="#associateProfessor">
Classes in OWL (cont)
• owl:equivalentClass defines equivalence of
classes
<owl:Class rdf:ID="faculty">
<owl:equivalentClass rdf:resource=
• owl:Thing is the most general class, which
contains everything
• owl:Nothing is the empty class
• In OWL there are two kinds of properties
– Object properties, which relate objects to other objects
• E.g. isTaughtBy, supervises
– Data type properties, which relate objects to
datatype values
• E.g. phone, title, age, etc.
Datatype Properties
• Using the layered architecture of the
Semantic Web, OWL adopts XML Schema
simple data types
<owl:DatatypeProperty rdf:ID="age">
<rdfs:range rdf:resource=
Object Properties
• User-defined data types in RDF
<owl:ObjectProperty rdf:ID="isTaughtBy">
  <rdfs:domain rdf:resource="#course"/>
  <rdfs:range rdf:resource="#academicStaffMember"/>
  <rdfs:subPropertyOf rdf:resource="#involves"/>
</owl:ObjectProperty>
Inverse and Equivalent Relations
• Inverse relations (or properties)
<owl:ObjectProperty rdf:ID="teaches">
<rdfs:range rdf:resource="#course"/>
<rdfs:domain rdf:resource="#academicStaffMember"/>
<owl:inverseOf rdf:resource="#isTaughtBy"/>
</owl:ObjectProperty>
• Equivalent properties
<owl:ObjectProperty rdf:ID="lecturesIn">
<owl:equivalentProperty rdf:resource="#teaches"/>
</owl:ObjectProperty>
Property Restrictions
• Similar to description logics, OWL can deal with
– Quantified restrictions
– Unqualified number restrictions
• Imposing that class C has some restriction (i.e.
must satisfy some conditions) is the same as
– Defining an (anonymous) class C’ collecting all objects
that satisfy the conditions
– Then, saying that C is a subclass of C’
• I.e. state that C ⊑ C’
Property Restrictions in OWL
A (restriction) class is defined through an
owl:Restriction element
This element contains an owl:onProperty element and
one or more restriction declarations
Restriction declarations can be number (cardinality)
restrictions, or restrictions for values and quantification:
– owl:allValuesFrom specifies universal quantification
– owl:hasValue specifies a specific value
– owl:someValuesFrom specifies existential quantification
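To make the three value restrictions concrete, the following minimal sketch evaluates them over one small, fixed interpretation. It is only an illustration: a real OWL reasoner works under the open-world assumption and does not simply inspect the stated facts. The individuals ("logic101", "ana", ...) and the helper function names are hypothetical.

```python
# Toy interpretation: isTaughtBy as a map from each course to its set of teachers.
# All individual names are hypothetical, chosen only for illustration.
is_taught_by = {
    "logic101": {"ana"},
    "db201": {"ana", "rui"},
    "intro100": set(),
}
full_professors = {"ana"}
courses = set(is_taught_by)

def all_values_from(prop, cls, individuals):
    """Universal restriction: every value of prop lies in cls (vacuously true if no values)."""
    return {x for x in individuals if prop.get(x, set()) <= cls}

def some_values_from(prop, cls, individuals):
    """Existential restriction: at least one value of prop lies in cls."""
    return {x for x in individuals if prop.get(x, set()) & cls}

def has_value(prop, value, individuals):
    """hasValue restriction: prop has the given individual as one of its values."""
    return {x for x in individuals if value in prop.get(x, set())}
```

Note that the course without any teacher satisfies the universal restriction but not the existential one, which is the usual reading of the two quantifiers.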
owl:allValuesFrom Example
<owl:Class rdf:about="#fullProfessorCourse">
  <rdfs:subClassOf><owl:Restriction>
    <owl:onProperty rdf:resource="#isTaughtBy"/>
    <owl:allValuesFrom rdf:resource="#fullProfessor"/>
  </owl:Restriction></rdfs:subClassOf>
</owl:Class>
• The meaning is
fullProfessorCourse ⊑ ∀isTaughtBy.fullProfessor
owl:someValuesFrom Example
<owl:Class rdf:about="#academicStaffMember">
  <rdfs:subClassOf><owl:Restriction>
    <owl:onProperty rdf:resource="#teaches"/>
    <owl:someValuesFrom rdf:resource="#undergraduateCourse"/>
  </owl:Restriction></rdfs:subClassOf>
</owl:Class>
• The meaning is
academicStaffMember ⊑ ∃teaches.undergraduateCourse
owl:hasValue Example
<owl:Class rdf:about="#mathCourse">
  <rdfs:subClassOf><owl:Restriction>
    <owl:onProperty rdf:resource="#isTaughtBy"/>
    <owl:hasValue rdf:resource="#949352"/>
  </owl:Restriction></rdfs:subClassOf>
</owl:Class>
• The meaning is (where {#949352} is a nominal)
mathCourse ⊑ ∃isTaughtBy.{#949352}
Unqualified Number Restrictions
OWL allows specifying a minimum and a maximum
number of values using owl:minCardinality and
owl:maxCardinality
It is possible to specify a precise number by using
the same minimum and maximum number
For convenience in these cases, OWL also offers
owl:cardinality
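owl:minCardinality Example
<owl:Class rdf:about="#course">
  <rdfs:subClassOf><owl:Restriction>
    <owl:onProperty rdf:resource="#isTaughtBy"/>
    <owl:minCardinality rdf:datatype="&xsd;nonNegativeInteger">1</owl:minCardinality>
  </owl:Restriction></rdfs:subClassOf>
</owl:Class>
• The meaning is
course ⊑ (≥ 1 isTaughtBy)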
Special OWL Properties
owl:TransitiveProperty (transitivity)
E.g. “has better grade than”, “is ancestor of”
owl:SymmetricProperty (symmetry)
E.g. “has same grade as”, “is sibling of”
owl:FunctionalProperty defines a property that
has at most one value for each object
E.g. “age”, “height”, “directSupervisor”
owl:InverseFunctionalProperty defines a
property for which two different objects cannot
have the same value
Special Properties Example
<owl:ObjectProperty rdf:ID="hasSameGradeAs">
<rdf:type rdf:resource="&owl;TransitiveProperty"/>
<rdf:type rdf:resource="&owl;SymmetricProperty"/>
<rdfs:domain rdf:resource="#student"/>
<rdfs:range rdf:resource="#student"/>
</owl:ObjectProperty>
Boolean Combinations
As in description logics, OWL allows combining classes
with Boolean operations (union, intersection, complement)
<owl:Class rdf:ID="peopleAtUni">
<owl:unionOf rdf:parseType="Collection">
<owl:Class rdf:about="#staffMember"/>
<owl:Class rdf:about="#student"/>
</owl:unionOf>
</owl:Class>
The new class is not a subclass of the union, but rather
equal to the union
We have stated an equivalence of classes, not restrictions
The meaning is:
peopleAtUni ≐ staffMember ⊔ student
Intersection Example
<owl:Class rdf:ID="facultyInCS">
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Class rdf:about="#faculty"/>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#belongsTo"/>
      <owl:hasValue rdf:resource="#CSDepartment"/>
    </owl:Restriction>
  </owl:intersectionOf>
</owl:Class>
• The meaning is
facultyInCS ≐ faculty ⊓ (∃belongsTo.{CSDepartment})
Example with Nesting of Operators
<owl:Class rdf:ID="adminStaff">
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Class rdf:about="#staffMember"/>
    <owl:Class><owl:complementOf><owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#faculty"/>
        <owl:Class rdf:about="#techSupportStaff"/>
      </owl:unionOf>
    </owl:Class></owl:complementOf></owl:Class>
  </owl:intersectionOf>
</owl:Class>
• The meaning is
adminStaff ≐ staffMember ⊓ ¬(faculty ⊔ techSupportStaff)
Enumerations with owl:oneOf
<owl:oneOf rdf:parseType="Collection">
<owl:Thing rdf:about="#Monday"/>
<owl:Thing rdf:about="#Tuesday"/>
<owl:Thing rdf:about="#Wednesday"/>
<owl:Thing rdf:about="#Thursday"/>
<owl:Thing rdf:about="#Friday"/>
<owl:Thing rdf:about="#Saturday"/>
<owl:Thing rdf:about="#Sunday"/>
</owl:oneOf>
Declaring Instances
Instances of classes (A-Boxes) are declared as in
RDF:
<rdf:Description rdf:ID="949352">
  <rdf:type rdf:resource="#academicStaffMember"/>
</rdf:Description>
<academicStaffMember rdf:ID="949352">
<uni:age rdf:datatype="&xsd;integer">
No Unique-Names Assumption
As in description logics, OWL does not adopt the
unique-names assumption of database systems
– Two instances having a different name or ID does not
imply that they are different individuals
Suppose we state that each course is taught by at
most one staff member, and that a given course is
taught by two staff members
An OWL reasoner does not flag an error
Instead it infers that the two resources are equal
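The inference described above can be sketched as follows: when a property is declared to have at most one value, two stated values for the same subject are merged into one equivalence class instead of being reported as a conflict. This is only a simplified illustration of the idea, not how an actual OWL reasoner is implemented; the function name and the course identifiers ("CIT1111", "CIT2000") are assumptions made for the example.

```python
def merge_by_functional_property(assertions):
    """assertions: (subject, value) pairs of a property with at most one value per subject.
    Returns a dict mapping every value to the representative of its equality class."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Use the smaller name as representative, for a deterministic result.
            parent[max(ra, rb)] = min(ra, rb)

    first_value = {}
    for subject, value in assertions:
        find(value)
        if subject in first_value:
            # Second value for the same subject: infer equality, do not flag an error.
            union(first_value[subject], value)
        else:
            first_value[subject] = value
    return {v: find(v) for v in parent}

# Hypothetical course IDs; the staff IDs are the ones used in the slides.
taught_by = [("CIT1111", "949318"), ("CIT1111", "949352"), ("CIT2000", "949111")]
same = merge_by_functional_property(taught_by)
```

Here 949318 and 949352 end up in the same class, while 949111 stays separate. An inconsistency would only arise if the two merged individuals were also explicitly asserted to be different.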
Explicit Inequalities in OWL
• To ensure that different individuals are
indeed recognized as such, we must
explicitly assert their inequality:
<lecturer rdf:about="949318">
<owl:differentFrom rdf:resource="949352"/>
</lecturer>
Distinct Objects
OWL provides a shorthand notation to assert the
pairwise inequality of all individuals in a given
list:
<owl:AllDifferent>
  <owl:distinctMembers rdf:parseType="Collection">
    <lecturer rdf:about="949318"/>
    <lecturer rdf:about="949352"/>
    <lecturer rdf:about="949111"/>
  </owl:distinctMembers>
</owl:AllDifferent>
Ontology Versioning
• owl:priorVersion indicates earlier
versions of the current ontology
– No formal meaning, can be exploited for
ontology management
• owl:versionInfo contains a string giving
information about the current
version, e.g. keywords
Ontology Versioning (cont)
owl:backwardCompatibleWith contains a
reference to another ontology
All identifiers from the previous version have the same
intended interpretations in the new version
Thus documents can be safely changed to commit to
the new version
owl:incompatibleWith states that the containing
ontology is a later version of the referenced
ontology but is not backward compatible with it
Restrictions in OWL DL
Vocabulary partitioning
– Any resource is allowed to be only a class, a data type,
a data type property, an object property, an individual, a
data value, or part of the built-in vocabulary, and not
more than one of these
Explicit typing
The partitioning of all resources must be stated
explicitly (e.g. a class must be declared if used in
conjunction with rdfs:subClassOf)
Restrictions in OWL DL (cont)
• Property Separation
– The set of object properties and data type
properties are disjoint
– Therefore the following can never be specified
for data type properties:
Restrictions in OWL DL (cont)
No transitive cardinality restrictions
No cardinality restrictions may be placed on transitive
properties
Restricted anonymous classes: Anonymous
classes are only allowed to occur as:
the domain and range of either owl:equivalentClass or
owl:disjointWith
the range (but not the domain) of rdfs:subClassOf
Restrictions in OWL Lite
• All restrictions of OWL DL, plus the following:
• owl:oneOf, owl:disjointWith, owl:unionOf,
owl:complementOf and owl:hasValue are not
allowed
• Cardinality statements (minimal, maximal, and
exact cardinality) can only be made on the values
0 or 1
• owl:equivalentClass statements can no longer be
made between anonymous classes but only
between class identifiers
Semantics of OWL
• The semantics of OWL can be determined by
means of the semantics of the corresponding
description logics
• Theorem provers for the description logics can be
used for reasoning in OWL
• It is also possible to capture some of OWL’s semantics in
terms of OWL itself
– See the book for the details of this characterization
An African Wildlife Ontology –
Class Hierarchy
• See details in the book “A Semantic Web Primer”
An African Wildlife Ontology –
Schematic Representation
Branches are parts of trees
An African Wildlife Ontology –
<owl:TransitiveProperty rdf:ID="is-part-of"/>
<owl:ObjectProperty rdf:ID="eats">
  <rdfs:domain rdf:resource="#animal"/>
</owl:ObjectProperty>
<owl:ObjectProperty rdf:ID="eaten-by">
  <owl:inverseOf rdf:resource="#eats"/>
</owl:ObjectProperty>
An African Wildlife Ontology –
Plants and Trees
<owl:Class rdf:ID="plant">
  <rdfs:comment>Plants are disjoint from animals.</rdfs:comment>
  <owl:disjointWith rdf:resource="#animal"/>
</owl:Class>
<owl:Class rdf:ID="tree">
  <rdfs:comment>Trees are a type of plant.</rdfs:comment>
  <rdfs:subClassOf rdf:resource="#plant"/>
</owl:Class>
An African Wildlife Ontology –
<owl:Class rdf:ID="branch">
  <rdfs:comment>Branches are parts of trees. </rdfs:comment>
  <rdfs:subClassOf><owl:Restriction>
    <owl:onProperty rdf:resource="#is-part-of"/>
    <owl:allValuesFrom rdf:resource="#tree"/>
  </owl:Restriction></rdfs:subClassOf>
</owl:Class>
An African Wildlife Ontology –
<owl:Class rdf:ID="leaf">
  <rdfs:comment>Leaves are parts of branches. </rdfs:comment>
  <rdfs:subClassOf><owl:Restriction>
    <owl:onProperty rdf:resource="#is-part-of"/>
    <owl:allValuesFrom rdf:resource="#branch"/>
  </owl:Restriction></rdfs:subClassOf>
</owl:Class>
An African Wildlife Ontology –
<owl:Class rdf:ID="carnivore">
  <rdfs:comment>Carnivores are exactly those animals
    that also eat animals.</rdfs:comment>
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Class rdf:about="#animal"/>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#eats"/>
      <owl:someValuesFrom rdf:resource="#animal"/>
    </owl:Restriction>
  </owl:intersectionOf>
</owl:Class>
An African Wildlife Ontology –
<owl:Class rdf:ID="herbivore">
Herbivores are exactly those animals
that eat only plants or parts of plants.
Try it out! See book for code.
An African Wildlife Ontology –
<owl:Class rdf:ID="giraffe">
  <rdfs:comment>Giraffes are herbivores, and they
    eat only leaves.</rdfs:comment>
  <rdfs:subClassOf rdf:resource="#herbivore"/>
  <rdfs:subClassOf><owl:Restriction>
    <owl:onProperty rdf:resource="#eats"/>
    <owl:allValuesFrom rdf:resource="#leaf"/>
  </owl:Restriction></rdfs:subClassOf>
</owl:Class>
An African Wildlife Ontology –
<owl:Class rdf:ID="lion">
  <rdfs:comment>Lions are animals that eat
    only herbivores.</rdfs:comment>
  <rdfs:subClassOf rdf:resource="#carnivore"/>
  <rdfs:subClassOf><owl:Restriction>
    <owl:onProperty rdf:resource="#eats"/>
    <owl:allValuesFrom rdf:resource="#herbivore"/>
  </owl:Restriction></rdfs:subClassOf>
</owl:Class>
An African Wildlife Ontology –
Tasty Plants
<owl:Class rdf:ID="tasty-plant">
<rdfs:comment>Plants eaten both by
herbivores and carnivores </rdfs:comment>
Left as exercise! See book for code.
Summary of OWL
OWL is the proposed standard for Web ontologies
OWL builds upon RDF and RDF Schema:
(XML-based) RDF syntax is used
Instances are defined using RDF descriptions
Most RDFS modelling primitives are used
Formal semantics and reasoning support is provided
through the mapping of OWL on description logics
While OWL is sufficiently rich to be used in practice,
extensions are in the making
They will provide further logical features, including rules
Part 7: Rules for the Semantic Web
Knowledge Representation
• The subjects presented so far were all related to
the representation of knowledge (in the Web)
• Knowledge Representation was studied in AI long
before the emergence of WWW
• Logic is still the foundation of KR
• There is much more in logics than what we have
used so far
– In particular, logic allows expressing rules that
OWL and RDF cannot express
Rules in Horn Logic
• Basis for Logic Programming
• A rule has the form: A1, . . ., An → B
– Ai and B are atomic formulas
• There are 2 ways of reading such a rule:
– Deductive rules: If A1,..., An are known to be
true, then B is also true
– Reactive rules: If the conditions A1,..., An are
true, then carry out the action B
Description Logics vs. Horn Logic
• Neither of them is a subset of the other
• It is impossible in OWL to state general conjunctions in the
– E.g. it is impossible to assert that persons who study and live in the
same city are “home students” in OWL
– This can be done easily using rules:
studies(X,Y), lives(X,Z), loc(Y,U), loc(Z,U) → homeStudent(X)
• Horn rules cannot assert disjunctions
– With Horn rules one cannot state that a person is either a man or a
woman
– This information is easily expressed in OWL using disjoint union
Need for Closed World Assumption
Example: An online vendor wants to give a
special discount if it is a customer’s birthday
Solution 1
R1: If birthday, then special discount
R2: If not birthday, then not special discount
• But what happens if a customer refuses to provide
his birthday due to privacy concerns?
Non-monotonic Rules
Solution 2
R1: If birthday, then special discount
R2’: If birthday is not known, then not
special discount
• Solves the problem but:
– The premise of rule R2' is not within the
expressive power of predicate logic
– We need a new kind of rule system
Monotonicity of Logic
• Classical Logic is monotonic
T |= F ⇒ T ∪ T’ |= F
• This is a basic property which makes sense for
mathematical knowledge and closed domains
• But it is not desirable when dealing with incomplete
information
– Such as incomplete information about the customer’s
birthday, in the example
Nonmonotonic Logic Programs
Logic Programming has long dealt with
the problem of nonmonotonic rules
They include, besides Horn rules, a
nonmonotonic default negation operator
– not A holds if “A is not known to be true” (or “there
is no evidence for A”)
The semantics of logic programs with default
negation is well studied, and implementations (of
reasoners) exist
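As a rough illustration of default negation, the birthday example (rules R1 and R2') can be evaluated over a set of known facts as below. This is a deliberately tiny sketch and not a general logic programming engine; the fact encoding and the customer name "ann" are assumptions made for the example.

```python
def conclusions(facts):
    """Evaluate R1 and R2' over ground facts such as ('birthday', 'ann').
    'not birthday(c)' is read as: birthday(c) is not among the known facts."""
    customers = {c for (_, c) in facts}
    derived = set()
    for c in customers:
        if ("birthday", c) in facts:
            derived.add(("discount", c))       # R1: birthday -> special discount
        else:
            derived.add(("no_discount", c))    # R2': birthday not known -> no discount
    return derived

before = conclusions({("customer", "ann")})
after = conclusions({("customer", "ann"), ("birthday", "ann")})
```

Adding the fact birthday(ann) removes the earlier conclusion no_discount(ann). The set of conclusions does not simply grow with the set of facts, which is exactly what nonmonotonic means.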
Exchange of Rules
• Exchange of rules across different applications
– E.g., an online store advertises its pricing, refund, and
privacy policies, expressed using rules
• The Semantic Web approach is to express the
knowledge in a machine-accessible way using one
of the Web languages we have already discussed
– So, one needs XML-like languages for representing
rules (“rule markup languages”)
Family Relations Example
• Facts in a database about relations:
mother(X,Y), X is the mother of Y
father(X,Y), X is the father of Y
male(X), X is male
female(X), X is female
• Inferred relation parent: A parent is either a father
or a mother
mother(X,Y) → parent(X,Y)
father(X,Y) → parent(X,Y)
More Relations in the Family
• male(X), parent(P,X), parent(P,Y), X≠Y → brother(X,Y)
• female(X), parent(P,X), parent(P,Y), X≠Y → sister(X,Y)
• brother(X,P), parent(P,Y) → uncle(X,Y)
• mother(X,P), parent(P,Y) → grandmother(X,Y)
• parent(X,Y) → ancestor(X,Y)
• ancestor(X,P), parent(P,Y) → ancestor(X,Y)
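The rules above can be evaluated bottom-up: start from the facts and keep applying rules until nothing new is derived. The sketch below does this for parent, grandmother and ancestor only. It is a naive fixpoint loop for illustration, and the family members are made up.

```python
def ancestors(parent_facts):
    """Naive bottom-up evaluation of
       parent(X,Y) -> ancestor(X,Y)
       ancestor(X,P), parent(P,Y) -> ancestor(X,Y)"""
    ancestor = set(parent_facts)
    changed = True
    while changed:
        changed = False
        for (x, p) in list(ancestor):
            for (p2, y) in parent_facts:
                if p == p2 and (x, y) not in ancestor:
                    ancestor.add((x, y))
                    changed = True
    return ancestor

# Hypothetical facts.
mother = {("maria", "joao")}
father = {("joao", "ana"), ("joao", "rui")}

# mother(X,Y) -> parent(X,Y) and father(X,Y) -> parent(X,Y)
parent = mother | father

# mother(X,P), parent(P,Y) -> grandmother(X,Y)
grandmother = {(x, y) for (x, p) in mother for (p2, y) in parent if p == p2}

ancestor = ancestors(parent)
```

The loop terminates because only finitely many pairs can be built from the constants in the facts, which is the usual argument for function-free rules.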
Monotonic Rules – Syntax
loyalCustomer(X), age(X) > 60 → discount(X)
• As in Logic Programming (e.g. Prolog), we
distinguish some ingredients of rules:
– variables, which are placeholders for values: X
– constants, which denote fixed values: 60
– predicates, which relate objects: loyalCustomer, >
– function symbols, which return a value for certain
arguments: age
Programs and Rules
• A (logic) program is a set of rules of the form
B1, . . . , Bn → A
• A, B1, ... , Bn are atomic formulas
• A is the head of the rule
• B1, ... , Bn are the premises (body of the rule)
– If the body is empty, the → can be omitted (and the rule is called a fact)
• The commas in the rule body stand for conjunction
• Variables may occur in A, B1, ... , Bn
– loyalCustomer(X), age(X) > 60 → discount(X)
– Implicitly universally quantified
• This means that variables occurring only in the body can be seen as existentially
quantified in the body
• A goal denotes a query G asked to a logic
program
• It has the form:
– B1, . . . , Bn →
– Or, more used in Prolog, ?- B1, . . . , Bn
• If n = 0 we have the empty goal □
Predicate Logic Semantics
Given a logic program P and a query
B1, . . . , Bn →
with the variables X1, ... , Xk we answer
positively if, and only if,
pl(P) |= ∃X1 . . . ∃Xk (B1 ∧ ... ∧ Bn) (1)
or equivalently, if
pl(P) ∪ {¬∃X1 . . . ∃Xk (B1 ∧ ... ∧ Bn)} is
unsatisfiable
Ground Witnesses
• So far we have focused on yes/no answers to
queries
• Suppose that we have the fact p(a) and the query
p(X) →
– The answer yes is correct but not satisfactory
• The appropriate answer is a substitution {X/a}
which gives an instantiation for X
• The constant a is called a ground witness
• Logic programming systems answer with (all)
substitutions that make the query true
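A minimal sketch of answering with substitutions rather than yes/no, for the special case of a query with a single unary atom. Besides the fact p(a) from the slide, the facts p(b) and q(c) are added here purely for illustration, and the function name is hypothetical.

```python
def answers(facts, predicate):
    """All substitutions for X that make the query predicate(X) true.
    facts is a set of (predicate, constant) pairs for unary predicates."""
    return [{"X": arg} for (pred, arg) in sorted(facts) if pred == predicate]

facts = {("p", "a"), ("p", "b"), ("q", "c")}
witnesses = answers(facts, "p")
```

An empty list plays the role of the answer "no", and each dictionary in the list names one ground witness.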
RDFS and Horn Logic
C subClassOf D
C(X) → D(X)
P subPropertyOf Q
P(X,Y) → Q(X,Y)
P has domain C
P(X,Y) → C(X)
P has range C
P(X,Y) → C(Y)
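Read as Horn rules, the RDFS primitives can be applied to a set of triples until no new triple is produced. The following sketch does that with a naive loop. The triple encoding, the function name, the course "CIT1111" and the axiom that academicStaffMember is a subclass of staffMember are assumptions made for this example.

```python
def rdfs_closure(triples):
    """Apply the four Horn-style RDFS rules until no new triple appears."""
    facts = set(triples)
    while True:
        new = set()
        for (s, p, o) in facts:
            for (s2, p2, o2) in facts:
                if p2 == "subClassOf" and p == "type" and o == s2:
                    new.add((s, "type", o2))    # C(X) -> D(X)
                if p2 == "subPropertyOf" and p == s2:
                    new.add((s, o2, o))         # P(X,Y) -> Q(X,Y)
                if p2 == "domain" and p == s2:
                    new.add((s, "type", o2))    # P(X,Y) -> C(X)
                if p2 == "range" and p == s2:
                    new.add((o, "type", o2))    # P(X,Y) -> C(Y)
        if new <= facts:
            return facts
        facts |= new

triples = {
    ("isTaughtBy", "subPropertyOf", "involves"),
    ("isTaughtBy", "domain", "course"),
    ("isTaughtBy", "range", "academicStaffMember"),
    ("academicStaffMember", "subClassOf", "staffMember"),  # assumed axiom
    ("CIT1111", "isTaughtBy", "949352"),                   # hypothetical course
}
closure = rdfs_closure(triples)
```

From the single instance triple, the closure derives the type of both resources, the more general property, and (through subClassOf) membership in the superclass.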
OWL in Horn Logic
C sameClassAs D
C(X) → D(X)
D(X) → C(X)
P samePropertyAs Q
P(X,Y) → Q(X,Y)
Q(X,Y) → P(X,Y)
OWL in Horn Logic (2)
P is transitive
P(X,Y), P(Y,Z) → P(X,Z)
Q is the inverse of P
Q(X,Y) → P(Y,X)
P(X,Y) → Q(Y,X)
P is functional
P(X,Y), P(X,Z) → Y=Z
OWL in Horn Logic (3)
(C1 ⊓ C2) subClassOf D
C1(X), C2(X) → D(X)
C subClassOf (D1 ⊓ D2)
C(X) → D1(X)
C(X) → D2(X)
OWL in Horn Logic (4)
(C1 C2) subClassOf D
C1(X)  D(X)
C2(X)  D(X)
C subClassOf (D1  D2)
Translation not possible!
This is a case were Horn Logic
is less expressive.
OWL in Horn Logic (5)
C subClassOf AllValuesFrom(P,D)
C(X), P(X,Y) → D(Y)
AllValuesFrom(P,D) subClassOf C
Again translation not possible!
OWL in Horn Logic (6)
C subClassOf SomeValuesFrom(P,D)
Translation not possible for this
quantifier either
SomeValuesFrom(P,D) subClassOf C
P(X,Y), D(Y) → C(X)
OWL in Horn Logic (7)
• MinCardinality cannot be translated due to
existential quantification
• MaxCardinality 1 may be translated if
equality is allowed
• Complement cannot be translated, in
general
The Language SWRL
• SWRL (Semantic Web Rule Language) combines OWL
DL with function-free Horn logic.
• It allows Horn-like rules to be combined with OWL DL
B1, . . . , Bn → A1, . . . , Am
– A1, . . . , Am, B1, . . . , Bn have one of the forms:
• C(x)
• P(x,y)
• sameAs(x,y)
• differentFrom(x,y)
where C is an OWL description, P is an OWL property, and x,y are
variables, OWL individuals or OWL data values.
Drawbacks of SWRL
• Main source of complexity:
– arbitrary OWL expressions, such as restrictions,
can appear in the head or body of a rule.
• Adds significant expressive power to OWL,
but causes undecidability!
– there is no inference engine that draws exactly
the same conclusions as the SWRL semantics.
SWRL Sublanguages
• SWRL adds the expressivity of DLs and function-free rules.
• One challenge: identify sublanguages of SWRL
with right balance between expressivity and
computational viability.
• A candidate: OWL DL + DL-safe rules
– every variable must appear in a non-description logic
atom in the rule body.
• More research work is being developed in finding
right balances between expressivity and
computational viability
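The DL-safety condition stated above is purely syntactic, so it can be checked mechanically. The sketch below assumes a simple encoding that is not taken from any SWRL tool: an atom is a tuple (predicate, arg, ...), variables are argument strings starting with an uppercase letter, and the set of DL predicates is supplied by the caller. All predicate names are hypothetical.

```python
def is_dl_safe(body, head, dl_predicates):
    """True if every variable of the rule occurs in some non-DL atom of the body."""
    def variables(atom):
        # Arguments starting with an uppercase letter are treated as variables.
        return {a for a in atom[1:] if a[:1].isupper()}

    safe_vars = set()
    for atom in body:
        if atom[0] not in dl_predicates:
            safe_vars |= variables(atom)

    all_vars = set()
    for atom in body + head:
        all_vars |= variables(atom)
    return all_vars <= safe_vars

dl_predicates = {"Person", "hasParent", "Child"}
head = [("Child", "X")]
unsafe_body = [("Person", "X"), ("hasParent", "X", "Y")]
# Adding non-DL atoms (here a hypothetical predicate 'known') makes the rule DL-safe.
safe_body = unsafe_body + [("known", "X"), ("known", "Y")]
```

Intuitively, the extra non-DL atoms force each variable to range only over explicitly named individuals, which is what restores decidability.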
Integration of nonmonotonic
• Full integration amounts to:
– Combine DL formulas with rules having no restrictions
– The vocabularies are the same
– Predicates can be defined either using rules or using DL
• This approach encounters several problems
– The base assumptions of DL and of non-monotonic
rules are quite different, and so mixing them so tightly
is not easy
Problems with integration
• Rule languages (e.g. Logic Programming) use some form
of closed world assumption (CWA)
– Assume negation by default
– This is crucial for reasoning with incomplete knowledge
• DL, being a subset of 1st order logic, has no closed world
assumption
– The world is kept open in 1st order logics (OWA)
– This is reasonable when defining concepts
– Mostly, the ontology is desirably monotonic
• What if a predicate is “defined” both using DL and using LP?
– Should its negation be assumed by default?
– Or should it be kept open?
– How exactly can one define what is CWA or OWA in this context?
• Consider the program P
wine(X) ← whiteWine(X)
nonWhiteWine(X) ← not whiteWine(X)
and the “corresponding” DL theory
WhiteWine ⊑ Wine
¬WhiteWine ⊑ nonWhiteWine
• P derives nonWhiteWine(esporão_tinto) whilst the
DL does not.
Modeling exceptions
• The following TBox is unsatisfiable
Bird ⊑ Flies
Penguin ⊑ Bird ⊓ ¬Flies
• The first assertion should be seen as
allowing exceptions
• This is easily dealt with by nonmonotonic rule
languages, e.g. logic programming, as we
have seen
Problems with integration (cont)
• DL uses classical negation while LP uses either
default or explicit negation
– Default negation is nonmonotonic
– As classical negation, explicit negation also does not
assume a complete world and is monotonic
– But classical negation and explicit negation are
different
– With classical negation it is not possible to deal with
Classical vs Explicit Negation
• Consider the program P
wine(X) ← whiteWine(X)
• and the DL theory
WhiteWine ⊑ Wine
coca_cola: ¬Wine
• The DL theory derives ¬WhiteWine(coca_cola) whilst P
does not.
– In logic programs, with explicit negation, contraposition of
implications is not possible/desired
– Note in this case, that contraposition would amount to assume that
no inconsistency is ever possible!
Problems with integration (cont)
• Decidability is dealt with differently:
– DL achieves decidability by enforcing restrictions on
the form of formulas and predicates of 1st order logics,
but still allowing for quantifiers and function symbols
• E.g. it is still possible to talk about an individual without
knowing who it is:
∃hasMaker.{esporão} ⊑ GoodWine
– LP achieves decidability by restricting the domain and
disallowing function symbols, but being more liberal in
the format of formulas and predicates
• E.g. it is still possible to express conjunctive formulas (e.g.
those corresponding to joins in relational algebra):
isBrother(X,Y) ← hasChild(Z,X), hasChild(Z,Y), X≠Y
Recent approaches to full integration
• Several recent (and in progress) approaches
attacking the problem of full integration of DL and
(nonmonotonic) rules:
– Hybrid MKNF [Motik and Rosati 2007, to appear]
• Based on interpreting rules as auto-epistemic formulas (cf.
previous comparison of LP and AEL)
• DL part is added as a 1st order theory, together with the rules
– Equilibrium Logics [Pearce et al. 2006]
– Open Answer Sets [Heymans et al. 2004]
Interaction without full integration
• Other approaches combine (DL) ontologies with
(nonmonotonic) rules without fully integrating them:
– Tight semantic integration
• Separate rule and ontology predicates
• Adapt existing semantics for rules in ontology layer
• Adopted e.g. in DL+log [Rosati 2006] and SWRL
– Semantic separation
• Deal with the ontology as an external oracle
• Adopted e.g. in dl-Programs [Eiter et al. 2005] (to be studied
• All this is ongoing research work!!
• A markup language for rules, in accordance with
the Semantic Web vision:
– Make rules machine-accessible.
• RuleML is an important standardization effort for
rule markup on the Web
– It may serve as a good basis for the upcoming W3C
standard for Rule Interchange Format – RIF
• Actually a family of rule markup languages,
corresponding to different kinds of rule languages:
– derivation rules, integrity constraints, reaction rules
RuleML (cont)
• Kernel: Datalog (function-free Horn logic)
– It also seems to be the kernel for RIF…
• XML based
– in the form of XML schemas
– DTDs for earlier versions
• Straightforward correspondence between
RuleML elements and rule components
Rule Components and RuleML
An Example
• The discount for a customer buying a
product is 7.5 percent if the customer is
premium and the product is luxury.
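One possible markup of this rule can be generated with a few lines of Python. The element names used below (Implies, head, body, And, Atom, Rel, Var, Ind) follow the naming of later RuleML releases as an assumption; the RuleML version used in the course may spell them differently. The helper function is hypothetical.

```python
import xml.etree.ElementTree as ET

def atom(rel, *terms):
    """Build an <Atom> element; each term is a ('Var', name) or ('Ind', value) pair."""
    element = ET.Element("Atom")
    ET.SubElement(element, "Rel").text = rel
    for tag, text in terms:
        ET.SubElement(element, tag).text = text
    return element

# discount(customer, product, 7.5 percent) if premium(customer) and luxury(product)
rule = ET.Element("Implies")
head = ET.SubElement(rule, "head")
head.append(atom("discount", ("Var", "customer"), ("Var", "product"), ("Ind", "7.5 percent")))
body = ET.SubElement(rule, "body")
conjunction = ET.SubElement(body, "And")
conjunction.append(atom("premium", ("Var", "customer")))
conjunction.append(atom("luxury", ("Var", "product")))

xml_text = ET.tostring(rule, encoding="unicode")
```

The point of the sketch is the one-to-one correspondence: one element for the rule, one each for head and body, one for the conjunction, and one per atom, predicate, variable and constant.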
• Horn logic is a subset of predicate logic that allows
efficient reasoning, orthogonal to description logics
• Horn logic is the basis of monotonic rules
• Nonmonotonic rules are needed to deal with incomplete
information
– Much work done in logic programming can now be used for
defining nonmonotonic rules for the Web
• Ways of combining rules and description logics are
needed, and are today a (hot!) topic of research
• The W3C is making efforts to come up with a standard
(kernel) language for rules in the Semantic Web
Part 8: Conclusions
The Semantic Web Vision
• The Semantic Web is a vision for making the Web
evolve into:
– A Web in machines for humans and machines
– A Web of Data
– A Web where information is given well-defined
meaning
• Languages, technologies and tools are needed to
realize this vision
– They have been, and are being, developed!
We addressed
All Fitting Together
Scenario: Bargaining among personal
software agents
• Each party represented by a software agent
• They commit to shared understanding of
terms: an ontology (e.g. in RDFS or OWL)
• Case facts, offers and decisions are
represented as RDF statements
All Fitting Together (cont)
• Information is exchanged in some XML-based language (or RDF-based language)
• Agent negotiation strategies are described in
a logical language
• Agents decide about next course of action
through inference, based on negotiation
strategy, case facts and previous offers and
Application Areas
• There are many potential application areas
Knowledge management
Data integration
B2C and B2B Electronic Commerce
any application that can make good use of knowledge in
the web, “better enabling computers and people to
work in cooperation”
• There is growing industrial interest in the Semantic
Web
The Semantic Web Today
• It is no longer only a vision
– But it is still a vision in lots of aspects
• It is a hot topic of research
– Incorporating interests from researchers in AI, Logics
and Knowledge Representation, Web Languages,
Databases, …
• It is also a hot topic in software development
• It will surely develop more in the near future!
The End
I hope you enjoyed the course, understood the basics
of Semantic Web concepts and languages, and
grasped its potential.
There is much more to the Semantic Web than could
fit in this course.
It leaves a lot of room for further studies…

Semantic Web Course