On Boosting Semantic Web Data Access
Li Ding
Department of Computer Science and Electrical Engineering,
University of Maryland Baltimore County
Advisor: Tim Finin
Date: Jan 19, 2005
@
2
Outline
Introduction
  Thesis statement
  Contributions to computer science
Research description
Research plan
Preliminary and planned work
  WOB-CORE: modeling the Semantic Web with its context
  Swoogle: digesting and searching the Semantic Web
  WOB: evaluating semantic web data quality
Summary
  Thesis schedule
@
1. Introduction



The Semantic Web in the Web
Motivation
Thesis statement
@
4
The Semantic Web in the Web
Agent World
Application
Inference
Translation
RDF Graph
Semantic Web Data Access
RDF/XML, N3, N-Triple, OWL/XML…
HTTP
Static RDF document
HTTP, SOAP
wrapper service
(Web) document
database
FIPA, SOAP,…
Agent &
Web Service
The Web
@
5
The growing semantic web data



More data ( from Swoogle Today, Jan 16, 2005 )
 335,858 RDF documents (v.s. Google 8,058,044,651)
 156,504 ontological terms (classes or properties)
 46,987,876 triples
Well populated ontology (organization adoption)
 Blog, News feed (e.g. rss)
 Personal homepage and social networking (e.g. foaf, bio)
 Digital library (e.g. dc, dcTerms),
 Copyright – creative commons (cc)
 Software configuration (trustix)
 Dictionary (e.g. wordnet)
 Scientific data ( e.g. CRISISCat - California Invasive Species
Information Catalog)
Potential semantic web data
 Bibliography
 CIA world fact book
@
6
Three challenges before utilizing semantic web data
Where does
George live ?
Web scale semantic web vocabulary and data access
1
Which `live’ ?
Ontology
dictionary
I mean ex:livesIn
2 Get it !
foo:George ex:livesIn ?x
Semantic Web
Data access
service
Quality of RDF graph
Joe
Rank? Trust ?
3
Which to believe?
foo:George ex:livesIn ex:Texas
source
foo:George ex:livesIn ex:TheWhiteHouse
source
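The three challenges on this slide (choose a term, query with a triple pattern, then compare conflicting answers by source) can be sketched in plain Python. This is only an illustration and not Swoogle's implementation: the `match` and `answers_by_source` helpers and the labels `source-A` and `source-B` are hypothetical stand-ins for the slide's unnamed `source` boxes.

```python
from collections import defaultdict

# Each statement is (subject, predicate, object, source).
# "source-A" and "source-B" are hypothetical names for the slide's sources.
STATEMENTS = [
    ("foo:George", "ex:livesIn", "ex:Texas", "source-A"),
    ("foo:George", "ex:livesIn", "ex:TheWhiteHouse", "source-B"),
    ("foo:George", "foaf:name", "George", "source-A"),
]

def match(pattern, statements):
    """Yield (bindings, source) for every statement matching a triple pattern.

    Pattern terms starting with '?' are variables; other terms must match exactly.
    """
    for s, p, o, source in statements:
        bindings = {}
        for term, value in zip(pattern, (s, p, o)):
            if term.startswith("?"):
                # A repeated variable must bind to the same value.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings, source

def answers_by_source(pattern, statements):
    """Group variable bindings by the source that asserted them."""
    grouped = defaultdict(list)
    for bindings, source in match(pattern, statements):
        grouped[source].append(bindings)
    return dict(grouped)

# Challenge 2: "foo:George ex:livesIn ?x"
result = answers_by_source(("foo:George", "ex:livesIn", "?x"), STATEMENTS)
```

Keeping the source next to each answer is what makes challenge 3 (which answer to believe) addressable later with rank or trust information.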
@
7
Motivation

The utility of semantic web data access depends on
three factors
Utility_SWDA = f(Availability, Accessibility, Quality)




Availability: how much semantic web data is available in the Web
Accessibility: how easily and effectively can users obtain the data they
want
Quality: how well can semantic web data satisfy users’ requirements
Applications


Spire: sharing scientific information using the Semantic Web
SemDis: discovering and evaluating semantic associations in the
Semantic Web
@


8
Sharing semantic web data published by different
sources throughout the Web
Spire is a distributed, interdisciplinary research project exploring
the use of semantic web technologies in support of science in
general and the field of ecoinformatics in particular.
How to search and
use these data ?
Darwin Core
publisher
[email protected]
Ecological Networks
creator
UMBC Tree Survey
Pacific Ecoinformatics
and Computational
Ecology Lab
creator
SF Tree Survey
California Invasive Species
Information Catalog
[email protected]
creator
[email protected]
creator
http://spire.umbc.edu/
NBII-CAIN
@


Terrorist Group
Discover complex semantic associations in SW.
Evaluate trustworthiness of discovered associations
Osama Bin Laden
listedIn
CIA Agent W
Step1
Osama Bin Laden
Collect semantic web
data from multiple
sources and merge a big
RDF graph.
memberOf
relatedTo
locatedIn
Al-Qaeda
Afghanistan
Mr. Y
ownedBy
Department of State
Organization B
9
Step 2
Discover paths from Mr.X
to Osama Bin Laden in
the big RDF graph.
Step 3
Agent K
Afghanistan
locatedIn
FOO News
Organization B
invests
basedIn
Kabul
CIA World Fact Book
locatedIn
US
Company A
Evaluate trustworthiness
of a discovered path with
provenance and trust
data
Kabul
isPresidentOf
Company A
NASDAQ
Mr.X
http://semdis.umbc.edu/
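The three steps on this slide (merge, discover paths, evaluate trust) can be sketched as below. This is a hedged illustration, not the SemDis algorithm: the toy statements, the source names, the trust values, and the choice to score a path as the product of its sources' trust values are all assumptions made for the example.

```python
from collections import deque

# Toy statements (subject, predicate, object, source); all names are hypothetical.
STATEMENTS = [
    ("ex:A", "ex:p", "ex:B", "src1"),
    ("ex:B", "ex:q", "ex:D", "src2"),
    ("ex:A", "ex:r", "ex:C", "src3"),
    ("ex:C", "ex:s", "ex:D", "src1"),
    ("ex:D", "ex:t", "ex:E", "src2"),
]

# Assumed trust in each source, in [0, 1].
SOURCE_TRUST = {"src1": 0.9, "src2": 0.5, "src3": 0.8}

def find_paths(statements, start, goal, max_len=4):
    """Steps 1 and 2: merge statements into one graph, then list simple paths."""
    outgoing = {}
    for s, p, o, src in statements:
        outgoing.setdefault(s, []).append((p, o, src))
    paths, queue = [], deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if node == goal and path:
            paths.append(path)
            continue
        if len(path) >= max_len:
            continue
        visited = {start} | {edge[2] for edge in path}
        for p, o, src in outgoing.get(node, []):
            if o not in visited:  # keep paths simple (no repeated nodes)
                queue.append((o, path + [(node, p, o, src)]))
    return paths

def path_trust(path, source_trust):
    """Step 3 (assumed scoring): product of the trust in each edge's source."""
    score = 1.0
    for _, _, _, src in path:
        score *= source_trust[src]
    return score

paths = find_paths(STATEMENTS, "ex:A", "ex:D")
ranked = sorted(paths, key=lambda p: path_trust(p, SOURCE_TRUST), reverse=True)
```

Because every edge keeps its source, provenance is still available when a discovered path has to be evaluated.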
@
10
Research overview
accessibility


Search URIrefs
Map URIrefs
Search Ontologies
Search RDF documents
Semantic web “hyperlink”



Semantic web vocabulary
Semantic web data access service

2. Swoogle: Digesting and Searching the Semantic Web
Utility


Discover SW
Digest SW metadata
Search & navigation
1. WOB-CORE: Modeling the Semantic Web and its context

3. WOB: Evaluating semantic web data quality

Quality of RDF graph

quality



Consistency
Importance
Trustworthiness


Concepts
Associations
Identify dimensions
Rank importance
Evaluate trustworthiness
@
11
Thesis Statement
Finding and evaluating information in the
large-scale Semantic Web is critical to user
adoption, but this need has not yet been met. We developed
the Web of Belief (WOB) ontology, the Swoogle data
access service, and data quality evaluation
mechanisms to address these issues. These
tools are proven to be effective in building
semantic web metadata and boosting web-scale
semantic web data access in
applications like SemDis and Spire.
@
12
Contributions to computer science


WOB is the first ontology that captures and collects the metadata
of the Semantic Web and its context
 RDF graph reference language
 Finer provenance model
Swoogle is one of the first data access services that digest and
search the web-scale semantic web.
 Adaptive semantic web discovery agent
 Semantic web metadata



RDF graph abstract
Ontology dictionary
Recognized more relations among resources and documents
Semantic web search and navigation model and service
One of the first works that investigate semantic web data quality
 Ranking the Semantic Web.




We identified multiple navigation models for ranking.
Evaluate RDF graph's trustworthiness.
@
2. Research Description



Modeling the Semantic Web with its context
Digesting and searching the Semantic Web
Evaluating semantic web data quality
@
14
The Semantic Web and its context
Agent World
legends
Person
trust
provenance
subClassOf
believes
Agent
trusts
creates
The RDF Graph World
RDF resource
uses
serializes
The Web defines
Ontology
RDF graph
RDF Document
subClassOf
Document
subClassOf
@
15
Modeling the Semantic Web and its context

Goals
  Identify concepts and associations
  Build an ontology in OWL semantics, especially
     RDF graph reference language
     Finer provenance
  Populate this ontology by rule-based translation
Principles
  Build simple, clear and minimal ontology
  Reuse existing ontology
  Show entity identity
  Be aware of inference tractability
Evaluation


Analytical comparison with other existing ontologies.
Satisfy applications (Swoogle, SemDis) requirements
@
16
Related works



WOB-core Ontology
 Meta-ontologies: RDF, OWL
 Popular ontologies: FOAF, DC
RDF graph reference
 Naïve approach: RDF test, OWL test
 RDF reification: RDF specification
 Named graphs (Carroll et al. 2004)
Provenance
 Digital library (e.g. Dublin Core)
 Database:



data provenance (Buneman, Khanna, & Tan 2001)
view maintenance (Cui, Widom, & Wiener 2000)
AI:


knowledge provenance (da Silva, McGuinness, & McCool 2003; Fox
& Huang 2003)
proof tracing, PML (da Silva, McGuinness, & Fikes 2004); TRELLIS (Gil
& Ratnakar 2002)
@
17
Web-scale semantic web data access model
agent
data access service
the Web
Discover RDF Docs
Compose
query
ask (term)
inform (term URIrefs)
Digest RDF Docs &Terms
Search Terms
ask (query)
Compose
Local
RDF graph
inform (doc URLs)
Search RDF Docs
Fetch docs
Query local
RDF graph
@
18
Digesting and searching the Semantic Web

Goals:
  Web-scale semantic web data access model
  Data access service
    Adaptive RDF document discovery
    Digest semantic web metadata
    Semantic web search and navigation model and service
Principles
  Scalable design
  Real world application
Evaluation



Statistical report on collected metadata, web service usage
Precision and recall of search result
Users’ satisfaction on search and navigation model
@
19
Related works: SW vs. Web IR vs. DB


SW vs. Web IR: vocabulary, data model, query
SW vs. DB: implicit data, query scale, vocabulary
@
20
Related works (cont’d)

Swoogle

Ontology based annotation systems








DAML ontology library
Schema Web
Semantic web central




W3C’s Ontaria (2004)
Semantic web instance databases

Meta-crawler
focused crawler
sw-crawler
Digest
DC
W3C’s Annotea
OWL & RDFS
Search & Navigation


Semantic web ontology browsers



CREAM (AIFB,2003)
Ontology repositories


Annotate proper reference & relations


SHOE (UMCP, 1997)
Ontobroker (AIFB, karlsruhe, 1998),
WebKB (Martin & Eklund, 1999),
QuizRDF (BT,2002)
Discovery

Annotate web documents




Web IR (TFIDF)
RDF database query
(e.g. RDQL, SPARQL)
Term navigation (e.g.
Ontaria, Hyperdaml)
Semantic web search
@
21
Evaluating semantic web data quality

Goals
 Investigate dimensions of semantic web data quality
 Evaluate semantic web data quality


Ranking RDF resources and RDF documents
Evaluating RDF graph trustworthiness
Trust and provenance based semantic web navigation model
Principles
 Semantic web data quality dimensions vary for different
granularity and/or background knowledge
Evaluation
 Analytical analysis and proofs over navigation models and trust
propagation models
 Simulation (Ding et al. 2004b) for quantifying convergence &
effectiveness
 Application (Spire, SemDis) users’ feedback



@
22
Related works

Data quality dimensions

Information science (Wang, Storey, & Firth 1995)
categorize data quality dimensions by domain interests





Integrity (Database)
User-satisfaction (Psychology)
Statistics (auditing methods)
Ontological world-modeling (Wand & Wang 1996)
Imperfect information


Taxonomy : (Smithson 1989) (Smets 1991) (Parsons 1996)
Computational models (Parsons & Hunter 1998)
 probabilistic theory,
 possibility theory,
 evidence (Dempster-Shafer) theory.
@
23
Related works (cont’d)

Ranking
 Complex network analysis (Newman 2003)
 Text document ranking
 Web page ranking:



PageRank (Page et al. 1998; Haveliwala 1999),
HITS (Kleinberg 1998)
Semantic ranking:


Ranking RDF resources: (Zhuge & Zheng 2003)
Ranking RDF documents: Swoogle (contributed by Tim Finin, Rong Pan, 2004)
Social network analysis
Trustworthiness
 Content analysis:




RDF graph difference (Berners-Lee & Connolly 2004).
Context analysis: semantic web trust layer

Information security (Hyvonen 2002)

Trust network (Golbeck, Parsia, & Hendler 2003; Richardson, Agrawal,
& Domingos 2003; Guha et al. 2004; Ding et al. 2004)

semantic web publishing (Carroll & Bizer 2004).

SWAD-Europe’s trust ontology (Arenas et al. 2004)
@
3. Research Plan

Research objectives and status
@
25
Research objectives and status
Phase  Objectives       Artifacts to produce
1      WOB              WOB-core ontology (w provenance)
                        RDF graph reference language
                        Provenance
                        translated WOB-core instances
2      Swoogle          adaptive discovery agent
                        semantic web metadata *
                        search and navigation services *
                        Swoogle statistics *
3      SW data quality  WOB-quality extension
                        navigation and ranking model
                        trust inference algorithms
                        trust based navigation model
4      Finalize         Dissertation
* This is a joint work with others.
Spiral research model: 1. Prototype; 2. Complete & revise prototype; 3. Evaluation & Justification
@
Preliminary and planned Work:
Web Of belief (WOB)




WOB-Core ontology
RDF graph reference
Provenance
Status and next step
@
27
WOB-core ontology
Agent World
Association
foaf:Agent
rdfs:subClassOf
foaf:Person
rdfs:domain
wob:connective
The RDF Graph World
rdfs:Resource
wob:Association
wob:RDFgraphRef
dc:source
rdfs:subPropertyOf
wob:source
The Web
foaf:Document
wob:RDFDocument
rdfs:subClassOf
rdfs:subPropertyOf
wob:creator
wob:isdefinedby
owl:Ontology
rdfs:subClassOf
wob:sourceDocument
@
28
RDF graph reference

Reference entire RDF graph



Reference the RDF graph from a document
Reference the RDF graph defined by usePattern
Reference partial RDF graph



Accept a set of triples
Reject a set of triples
Special cases


Referencing class instance
Wildcard: “John hasChild _:x”
@
29
RDF graph reference: an example
wob:RDFGraphRef
rdf:type
wob:RDFDocument
rdf:type
wob:sourceDocument
http://foo.com/ex1.rdf
wob:usePattern
wob:SimpleTriple
rdf:type
wob:subject
foo:George
wob:predicate
ex:livesIn
http://foo.com/ex1.rdf
ex:livesIn
ex:Texas
foo:George
wob:object
ex:Texas
foaf:mbox
[email protected]
@
30
Provenance in the Semantic Web
RDF Resource
Where
Whom
dc:source
dc:creator
dc:source
dc:creator
why
Definition
rdfs:isDefinedBy
RDF graph
RDF document


We differentiate the rdfs:range of provenance relation
The scope of provenance property


Minimum semantic element: the semantic will not be complete when any
triple is removed

Complete: the entire sub-tree

URI-complete: minimal sub-tree ends without blank nodes
dc:creator semantics



Class instance
Class/property definition
Document
@
31
Provenance of RDF graph
“A is sub class of B”
rdf:type
ex:A
rdfs:subClassOf
whom
Bob (said so)
rdf:type
owl:Class
why
implies
“A is sub class of C”
“C is sub class of B”
“Transitive rule”
ex:B
where
supports
“x is instance of both A and B”
whom
http://foo.com/example.owl
@
32
Provenance of RDF resource and RDF document
“A is sub class of B”
rdf:type
ex:A
owl:Class
rdfs:subClassOf
rdf:type
why
ex:B
Whom
(dc:creator)
where
(dc:source)
Bob (said so)
Whom
(dc:creator)
foo.com
Definition
(rdfs:isDefinedBy)
http://foo.com/example.owl
Whom
(dc:publisher)
where
(dc:source)
@
33
WOB-provenance
RDF Resource
• wob:creator
• dc:creator
• wob:sourceDocument
• wob:isDefinedBy
• rdfs:isDefinedBy
• dc:source
RDF Graph
TBD
TBD
wob:sourceDocument
Agent
• wob:creator
• dc:creator
RDF Document
rdfs:subClassOf
Website
Proof
• wob:sourceDocument
• dc:source
• wob:creator
• dc:creator
@
34
Status and next step

We have




Constructed WOB conceptualization
Proposed preliminary RDF graph reference language
Classified provenance in the Semantic Web
We will





Refine and evaluate WOB-core ontology
Complete RDF graph reference language
Add why-provenance
Populate WOB-core instances using rule based
translation
Evaluate WOB-core ontology
@
Preliminary and planned Work:
Swoogle




Discovery
Digest
Search and navigation
Status and next step
@
36
The role of Swoogle in the Semantic
Web
Software Agents, Applications
uses
uses
searches
Directory/Digest Service
Service Finder
digests
Semantic Web
Services
Data
Finder
Swoogle
digests
Semantic web data
RDF document
SW data service
(Web) document
database
@
37
Discovery - research

Crawlers




Google-crawler
Focused-crawler
Semantic-Web-crawler, e.g. scutter
RDF document word indicator

Keywords (positive list and negative list)





filetype: 10 positive, over 100 negative
url-pattern
content-pattern
Google cat-words (to refine Google query)
Revisiting URLs



The would-be RDF document
The out-of-date RDF document: changed, deleted
The redirected RDF document
@
38
Discovery – current status

Crawler performance




Google crawler is the best
Focused crawler needs to be improved
1/3 URLs are verified pure RDF documents
Embedded RDF graph.
                   RDF docs        Non-RDF docs    Undecided   TOTAL
Focused Crawler       1,465 (7%)     10,580 (52%)      8,292      20,337
google crawler      273,023 (36%)   369,371 (49%)    110,794     753,188
SW_crawler           61,870 (15%)   285,506 (70%)     57,709     405,085
TOTAL               336,358         665,457          176,795   1,178,610
Source: Swoogle (2005-Jan-05)
SELECT `discovered_by`, sum(isRDF), sum(1-isRDF), count(*) FROM `digest_url` WHERE 1 group by discovered_by
@
39
Digest -- research




RDF document annotation (joint work)
RDF graph abstract
Ontological term definition
Relations (joint work)



Document-term relation
Document-document relation
Term-term relation
@
40
RDF document annotation (joint work)

Document






RDF/OWL level





filetype (suffix of URL)
When/how discovered
Last modified time
Document hash
Crawling info
RDF Syntax
SW language
OWL species
Provenance (creator, publisher)
Ontology



Label
Version
Comment
@
41
RDF graph abstract

Possible models







Bag-of-word : literal, local name of resource
Bag-of-URI: URIrefs of non-blank RDF node
Triple: swangled triple digest (Mayfield & Finin 2003)
Ontological term: defined/referenced/populated
class/property
Namespace: used/defined namespace
Identity: identity of class instance
Possible methods


Document vector
Bloom filter (Bloom 1970)
@
42
Ontological term definition
Ontological C-P bond
• foaf:mbox
• foaf:name
Empirical C-P bond
• foaf:name
• dc:title
Term Definition
• rdfs:subClassOf -- foaf:Agent
• rdfs:label – “Person”
file1
foaf:mbox
foaf:name
rdfs:domain
rdfs:domain
file3
file2
rdf:type
rdf:type
owl:Class
foaf:Person
foaf:name
“Tim Finin”
dc:title
“Tim’s FOAF File”
rdfs:subClassOf
foaf:Agent
rdfs:label
“Person”
@
43
Relations: doc-term; doc-doc; term-term
swoogle:sameNamespace
rdfs:seeAlso
foaf:Document
swoogle:sameLocalname
rdfs:isDefinedBy
C-P bond, P-C bond
any RDF triple
wob:RDFDocument
swoogle:isUsedBy
swoogle:uses
rdfs:subClassOf
rdfs:Resource
swoogle:defines
wob:isDefinedBy
swoogle:populatesClass
swoogle:populatesProperty
swoogle:refersClass
swoogle:refersProperty
swoogle:definesClass
swoogle:definesProperty
owl:Ontology
owl:imports
owl:priorVersion
owl:backwardCompatibleWith
owl:incompatibleWith
swoogle:officialOnto
swoogle:extensionOnto
@
44
Search & Navigation -- research
The Semantic Web is not simply the Web

Search service



Document search – RDF document is not free text
Term search – URIref contains compound local
name
Navigation service



The RDF graph – Typed links
The web of RDF documents – Few hyperlinks
The social network of agents – trust & provenance
@
45
Semantic web search/navigation model
• Keywords+ Filters
1
Term Search
Resource
URIref
2
URL
6
sameNamespace
sameLocalname
ANY RDF PROPERTY
uses
3
isUsedBy
RDF Document
4 isDefinedBy
5
officialOnto
extensionOnto
rdfs:subClassOf
7
rdfs:seeAlso
rdfs:isDefinedBy
Document Search
defines
Ontology
OntologyProperty
• Keywords+ Filters
• SPARQL
• RDF graph
@
46
Status and next step

We have


Built an automatic semantic web discovery agent
Digested part of semantic web metadata




RDF document annotation
Relations: res-res; res-doc; doc-doc
Proposed semantic web search/navigation model with
prototype implementation
We will



Make the agent adaptive
Explore efficient RDF graph abstract
Provide a complete search/navigation service, esp.




Swoogle search with SPARQL search support
Ontology dictionary with user-friendly navigation interface
Complete Swoogle web service
Complete Swoogle statistics for quantitative evaluation
@
Preliminary and planned work:
Semantic Web Data Quality

Dimensions of semantic web data quality
Ranking RDF resources and RDF documents
Evaluate RDF graph trustworthiness
Trust based navigation

Status and next step



@
48
Dimensions of semantic web data quality
Term
RDF graph
RDF graph +RDFS/OWL SW metadata
SW metadata
+trust
weighted
directed graph
RDF graph
SW + Web +
agents
Importance
 centrality
 betweenness


rel-vagueness
RDF
Document
RDF
graph
graph
structure

definition closeness
 semantic consistency
 rel-completeness

Agent
SW + Web

Importance

Importance

credibility

Importance

credibility

credibility
More to consider

term correlation (C-P bond, P-C bond)
@
49
Ranking RDF documents and RDF resources

PageRank like navigation model


Background knowledge decides w(p) – how credits are
distributed along semantic paths from one node
Different context



RDF graph as weighted directed graph
RDF graph + RDFS/OWL
RDF graph + RDFS/OWL + WOB (semantic web metadata)
@
50
Navigation model 1: RDF graph
Named edge
RDF node


Let wg(e) be the frequency of named edges
in the given RDF graph
Given a node p, each edge e from p is
assigned weight wg(e), and w(p) is the
normalized value
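One reading of this weighting can be sketched as follows. It assumes that the frequency of a named edge is the number of triples in the graph that use that property, and that normalization is over the outgoing edges of each node; both are interpretations, and the sample triples are hypothetical.

```python
from collections import Counter, defaultdict

def edge_weights(triples):
    """Return weights[p][q]: the normalized weight of edges from node p to node q."""
    freq = Counter(p for _, p, _ in triples)  # wg(e) for each edge name
    outgoing = defaultdict(list)
    for s, p, o in triples:
        outgoing[s].append((p, o))
    weights = {}
    for s, edges in outgoing.items():
        total = sum(freq[p] for p, _ in edges)
        row = defaultdict(float)
        for p, o in edges:
            row[o] += freq[p] / total  # normalize over the edges leaving s
        weights[s] = dict(row)
    return weights

triples = [
    ("ex:a", "ex:knows", "ex:b"),
    ("ex:a", "ex:cites", "ex:c"),
    ("ex:b", "ex:knows", "ex:c"),
    ("ex:c", "ex:knows", "ex:a"),
]
w = edge_weights(triples)
```

Here `ex:knows` occurs three times and `ex:cites` once, so the edge from `ex:a` to `ex:b` receives three quarters of the credit leaving `ex:a`.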
@
51
Navigation model 2: RDF graph +RDFS/OWL
Meta Class
type
type
type*
Class
type
Individual
Property
Literal /
Resource
InverseFunctionalProperty



Individual => Property is made by reading triple
type* is valid in OWL-FULL semantics
Literals and non-instance resources are ignored
 Except owl:InverseFunctionalProperty is considered (OWL-FULL)
@
52
An example
rdf:type
rdfs:Class
foaf:Document
rdfs:Property
rdf:type
rdf:type
rdfs:subPropertyOf
rdfs:range
rdfs:subClassOf
wob:RDFDocument
wob:source
wob:sourceDocument
dc:title
rdf:type
http://foo.com/ex.owl
rdfs:label
a2
@
53
Navigation model 3: RDF graph +RDFS/OWL+WOB
Meta Class
type
Class
type
type
Property
Ontology
Individual
RDF Document


We assume Swoogle search/navigation services are used.
Rank RDF resources and RDF documents together
@
54
Evaluating trustworthiness


[Definition] A philosophical and context dependent concept.
Common interpretations are reliance, faith, and confidence.
Examples



“Is the triple (foo:George ex:livesIn foo:WhiteHouse) credible? ”
“Does foo:George (an instance of foaf:Person) always tell the
truth? ”
Related terms


Belief: Trustworthiness of an RDF graph (by individual agent)
Trust: Trustworthiness of an agent’s beliefs (by individual agent)




[KR] An agent’s belief (assertion)
[ML] A hypothesis of the other agents’ belief quality
[SNA] A context dependent inter-agent relation
Reputation: Social trustworthiness of an agent (by the public)
@
55
How a statement is justified as trustworthy
I believe that
I believe that
“Restaurants with good outlook are good”
“Foo has good outlook”;
“Good restaurants have good outlook”
“Foo has good outlook”;
deductive
abductive
Foo is a good
restaurant
prima facie (at first view)
No better alternative
conclusive (mimic)
inductive
I’ve been to Foo many times,
and the food was always good!
My friends (who have
similar taste as me ) said so.
@
56
Trust propagation in justification





Deductive – trustworthiness propagates from the premise w.r.t.
inference rule
 P -> Q,
tv(Q) = tv(P) *tv(P->Q)
Abductive – trustworthiness propagates from the consequence
w.r.t. trustworthiness of reversing inference rule
 P-> Q
tv(P) = tv(Q) * f ( tv(P->Q) ) Bayes
Inductive – trustworthiness is derived from past experiences
 Argumentation – logic coherence
 Knowledge similarity – statistic coherence
Conclusive – trustworthiness propagates from the other agents
through social trust relation
 Trust(A,B) tv(S,A) = tv(trust(A,B)) * tv(S,B)
 Recommendation
prima facie – blind trust
 Tv(S) = constant (normal reputation)
 Largest take all
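The multiplicative rules above can be written as small functions. This sketch covers only the formulas that the slide states explicitly; the function f in the abductive rule is left as a caller-supplied parameter because the slide does not define it, and the default prima facie constant of 0.5 is an assumption.

```python
def deductive(tv_p, tv_rule):
    """tv(Q) = tv(P) * tv(P->Q)."""
    return tv_p * tv_rule

def abductive(tv_q, tv_rule, f):
    """tv(P) = tv(Q) * f(tv(P->Q)); f is not specified on the slide."""
    return tv_q * f(tv_rule)

def conclusive(tv_trust_ab, tv_s_b):
    """tv(S, A) = tv(trust(A, B)) * tv(S, B)."""
    return tv_trust_ab * tv_s_b

def prima_facie(constant=0.5):
    """Blind trust: tv(S) is a constant (the value 0.5 is an assumption)."""
    return constant

# A trusts B at 0.8 and B holds statement S at 0.9.
tv_s_for_a = conclusive(0.8, 0.9)
# Premise P holds at 0.9 and the rule P -> Q at 0.5.
tv_q = deductive(0.9, 0.5)
```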
@
57
Evaluate RDF graph trustworthiness
4
RDF graph
(w ontology)
rdfs:subClassOf
owl:Class
rdfs:Class
foaf:Person
The given RDF Graph
S1
S2
foaf:Person rdf:type owl:Class
2
believes
(Conflict belief)
S3
foaf:Person rdf:type rdfs:Class
disbelieves
trusts
1
Agents
3
(social network)
Joe
foaf:knows
Mike
Remove the independence assumption by using more data
@
58
Trust and provenance aware navigation

Mechanism





Only pursue highly trusted
Shortest distance principle
Derive trustworthiness
using weighted consensus
No delegation
Complexity control


distance=1
a

small world
Initiator’s control
c
b
Search Branch – trust filter
Search Depth

distance=0
initiator
d
distance=2
e
domain-refer
f
g
h
refer-refer
@
59
Status and next step

We have





Revealed some dimensions of semantic web data quality
Proposed some ranking mechanisms based on different
navigation models and background knowledge
Proposed some trust evaluation mechanisms based on
different background knowledge
Proposed a trust based navigation model
We will


Consolidate semantic web data quality dimensions with
more formal description
Evaluate, justify and improve ranking and trust evaluation
mechanisms
@
Summary




[R] Thesis Statement
[R] Contributions to computer science
Research time table
Planned milestones
@
61
Thesis Statement
Finding and evaluating information in the
large-scale Semantic Web is critical to user
adoption, but this need has not yet been met. We developed
the Web of Belief (WOB) ontology, the Swoogle data
access service, and data quality evaluation
mechanisms to address these issues. These
tools are proven to be effective in building
semantic web metadata and boosting web-scale
semantic web data access in
applications like SemDis and Spire.
@
62
Contributions to computer science


WOB is the first ontology that captures and collects the metadata
of the Semantic Web and its context
 RDF graph reference language
 Finer provenance model
Swoogle is one of the first data access services that digest and
search the web-scale semantic web.
 Adaptive semantic web discovery agent
 Semantic web metadata



RDF graph abstract
Ontology dictionary
Recognized more relations among resources and documents
Semantic web search and navigation model and service
One of the first works that investigate semantic web data quality
 Ranking the Semantic Web.




We identified multiple navigation models for ranking.
Evaluate RDF graph's trustworthiness.
@
63
A tentative research time table
Phase  Objectives  Artifacts to produce              Status (%)  Time (months)
1      WOB         WOB-core ontology                 60          0.5
                   RDF graph reference language      30          1
                   Provenance                        50          0.5
                   translated WOB-core instances     0           1
                   Phase subtotal                                3
2      Swoogle     adaptive discovery agent          50          1
                   semantic web metadata *           50          1
                   search and navigation services *  30          2
                   Swoogle statistics *              30          1
                   Phase subtotal                                5
3      SW Quality  WOB-quality extension             20          1
                   navigation and ranking model      40          2
                   trust inference algorithms        50          2
                   trust based navigation model      80          1
                   Phase subtotal                                6
4      Finalize    Dissertation                                  4
TOTAL                                                            18
* This is a joint work with others.
@
64
Planned milestones



WOB-core ontology
 It covered all required meta-concepts in Spire and SemDis.
Swoogle
 It indexed all semantic web data needed by Spire and SemDis.
We are expecting millions of RDF documents to be indexed.
 It performed better than Google or other semantic web portals in
searching ontologies and URIrefs throughout the Web. We are
also looking forward to searching class-instance data.
Semantic web data quality
 RDF documents and RDF resources can be ranked reasonably
using semantic web metadata in WOB. We expect users to be
satisfied with Swoogle search precision.
 RDF graph trustworthiness can be evaluated reasonably by using
trust and provenance information in WOB.
@
65
Publications
Refereed Publications

Li Ding et al., "On Homeland Security and the Semantic Web: A Provenance and Trust
Aware Inference Framework", Proceedings of the AAAI Spring Symposium
on AI Technologies for Homeland Security, March 2005.

Li Ding et al., "How the Semantic Web is Being Used: An Analysis of FOAF",
Proceedings of the 38th International Conference on System Sciences,
January 2005.

Li Ding et al., "Analyzing Social Networks on the Semantic Web", IEEE Intelligent
Systems, January 2005.

Li Ding et al., "Swoogle: A Search and Metadata Engine for the Semantic Web",
Proceedings of the Thirteenth ACM Conference on Information and
Knowledge Management, November 2004.

Li Ding et al., "Modeling and Evaluating Trust Network Inference", Seventh
International Workshop on Trust in Agent Societies at AAMAS 2004, July 2004.

Li Ding et al., "Trust Based Knowledge Outsourcing for Semantic Web Agents",
Proceedings of the 2003 IEEE/WIC International Conference on Web
Intelligence, October 2003.

Youyong Zou et al., "Using Semantic web technology in Multi-Agent systems: a case study
in the TAGA Trading agent environment", Proceedings of the 5th International
Conference on Electronic Commerce, September 2003.
Non-Refereed Publications

Li Ding et al., "Weaving the Web of Belief into the Semantic Web", submitted to
WWW2004, May 2004.
@
66
Selected references














Berners-Lee, T., and Connolly, D. 2004. Delta: an ontology for the distribution of differences between rdf
graphs. http://www.w3.org/DesignIssues/Diff.
Bloom, B. H. 1970. Space/time trade-offs in hash coding with allowable errors. Commun. ACM
13(7):422–426.
Carroll, J. J.; Bizer, C.; Hayes, P.; and Stickler, P. 2004. Named graphs, provenance and trust. Technical
Report HPL-2004-57, HP Lab.
Cui, Y.; Widom, J.; and Wiener, J. L. 2000. Tracing the lineage of view data in a warehousing
environment. ACM Trans. on Database Systems 25(2):179–227.
da Silva, P. P.; McGuinness, D. L.; and Fikes, R. 2004. A proof markup language for semantic web
services. Technical Report KSL-04-01, Stanford.
da Silva, P. P.; McGuinness, D. L.; and McCool, R. 2003. Knowledge provenance infrastructure. Data
Engineering Bulletin 26(4):26–32.
Fox, M., and Huang, J. 2003. Knowledge provenance: An approach to modeling and maintaining the
evolution and validity of knowledge. Technical report, University of Toronto.
Gil, Y., and Ratnakar, V. 2002. Trusting information sources one citizen at a time. In Proceedings of
International Semantic Web Conference 2002, 162–176.
Golbeck, J.; Parsia, B.; and Hendler, J. 2003. Trust networks on the semantic web. In Proceedings of
Cooperative Intelligent Agents.
Grandison, T., and Sloman, M. 2000. A survey of trust in internet application. IEEE Communications
Surveys Tutorials (Fourth Quarter) 3(4).
Hunter, A., and Parsons, S., eds. 1998. Applications of Uncertainty Formalisms. Springer.
Hyvonen, E. 2002. The semantic web – the new internet of meanings. In Semantic Web Kick-Off in
Finland: Vision, Technologies,Research, and Applications.
Zhuge, H., and Zheng, P. 2003. Ranking semantic-linked network. In WWW 2003.
Josang, A. 1997. Prospectives for modelling trust in information security. In Proceedings of Australasian
Conference on Information Security and Privacy.
@
67
Selected references (cont’d)















Kahn, B. K.; Strong, D. M.; and Wang, R. Y. 2002. Information quality benchmarks: Product and service
performance. Communications of the ACM 45(4):184–192.
Kleinberg, J. 1998. Authoritative sources in a hyperlinked environment. In Proceedings of ACM-SIAM
Symposium on Discrete Algorithms.
Mayfield, J., and Finin, T. 2003. Information retrieval on the semantic web: Integrating inference and
retrieval. In Proceedings of the SIGIR 2003 Semantic Web Workshop.
McDermott, D. 2001. Why rdf’s reification doesn’t work. http://lists.w3.org/Archives/Public/wwwrdflogic/2001Apr/0066.
McKnight, D. H., and Chervany, N. L. 1996. The meanings of trust. MISRC Working Paper Series.
Newman, M. E. J. 2003. The structure and function of complex networks. SIAM Review 167–256.
Page, L.; Brin, S.; Motwani, R.; and Winograd, T. 1998. The pagerank citation ranking: Bringing order to
the web. Technical report, Stanford Digital Library Technologies Project.
Parsons, S., and Hunter, A. 1998. A review of uncertainty handling formalisms. In Applications of
Uncertainty Formalisms.
Parsons, S. 1996. Current approaches to handling imperfect information in data and knowledge bases.
Knowledge and Data Engineering 8(3).
Guha, R.; Kumar, R.; Raghavan, P.; and Tomkins, A. 2004. Propagation of trust and distrust. In
Proceedings of the 1st Workshop on Friend of a Friend, Social Networking and the Semantic Web.
Richardson, M.; Agrawal, R.; and Domingos, P. 2003. Trust management for the semantic web. In
Proceedings of the Second International Semantic Web Conference.
Smets, P. 1998. Probability, possibility, belief: Which and where. Quantified Representation of Uncertainty
and Imprecision 1:1–24.
Smithson, M. J., ed. 1989. Ignorance and Uncertainty: Emerging Paradigms. Springer Verlag.
Wand, Y., and Wang, R. Y. 1996. Anchoring data quality dimensions in ontological foundations.
Communications of the ACM 39(11):86–95.
Wang, R.; Storey, V.; and Firth, C. 1995. A framework for analysis of data quality research. IEEE
Transactions on Knowledge and Data Engineering 7(4):623–639.
@
68
Some ontologies and their QNames
QName
Name
URL
rdf
Resource Description Framework
http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs
Resource Description Framework schema
http://www.w3.org/2000/01/rdf-schema#
owl
Web Ontology Language
http://www.w3.org/2002/07/owl#
rss
RDF site summary
http://purl.org/rss/1.0/
foaf
Friend Of A Friend
http://xmlns.com/foaf/0.1/
dc
Dublin Core Elements
http://purl.org/dc/elements/1.1/
bio
A vocabulary for biographical information
http://vocab.org/bio/0.1/
cc
creative commons
http://web.resource.org/cc/
trustix
(used but not publicly defined)
http://www.trustix.net/schema/rdf/spi-0.0.1#
wordnet
Wordnet (Princeton U.)
http://xmlns.com/wordnet/1.6/
@