Web of Belief:
Modeling and using Trust and
Provenance in the Semantic Web
Department of Computer Science and Electrical Engineering
University of Maryland Baltimore County
Li Ding
Last updated: 10/3/2015
Outline

Introduction




Research description
Research plan
Preliminary Work




Thesis Statement
The Web Of Belief Framework
Evaluation
Contributions to computer science
Thesis Schedule
Motivation

The growing body of the Semantic Web
 Observations

Information



More data encoded in Semantic Web languages from many sources
Ontologies in various dialects
Information is managed in a two-layer mechanism in terms of
“Document, Ontology, Namespace, Term”




Physical layer: the web of semantic web documents
Logical layer: the RDF graph
More Semantic Web Tools
Driving forces


Industrial: Weblog, RSS, social network websites
Academic: research projects
Motivation (cont’d)

The Semantic Web has not achieved a real world “KB”
 Credibility & Consistency


Scalability



Facts are provided by many sources without guarantee
Data exists in vast amounts
Data is stored in an open and distributed context
Utility

Data is fragmented
 Bad URI references to resources and namespaces in the web of documents
 Lack of associations in the RDF graph
Motivation (cont’d)

Why provenance and trust

Important concepts borrowed from the human world



Keys to credibility assessment and justification




Multi-discipline origins: social, epistemology, psychology
The foundation of knowledge management and inference
Empirical heuristics: a complementary method for reasoning about
credibility directly when domain knowledge is absent.
Explicit representation of the justification trace.
Good heuristics for resolving inconsistency.
Keys to effectiveness and efficiency


Knowledge can be managed by Provenance besides Topic
Trust reduces search complexity
Thesis Statement
This dissertation shows that our Web of
Belief framework, a provenance- and trust-
aware inference framework, is critical and
effective in deriving answers with credibility
assessment and justification across the open,
distributed, and large-scale online knowledge
base provided by the Semantic Web.
Research Description
General Description
Goal: model and use provenance and trust in the SW
• to enable a credible “world KB”.
• to enable a trust layer in the Semantic Web
Representation
 Encode provenance and trust
 Represent SW as KB
Management
 acquisition & digest
 data access interface
 Inference space expansion
Inference
Hypothesis Test
 Trust network computation
 Statement credibility
 Justification
Ontology Dictionary
 Term definition
 Class tree
The Infrastructure of the Semantic Web
[Figure: layered service diagram. Layers, top to bottom: Applications; Reputation Service (web entity directory); Directory/Digest Service (SW service finder, SW data finder); Computing Services; Data Service (RDF document, SW data service, (Web) document, database). Edge labels: uses, searches, digests.]
Assumptions



Propositional knowledge (facts)
Uncertain knowledge with provenance
Open and distributed knowledge storage
Relationship to Other Work

Representation




Data access


Collaborative KB in open distributed context (DB)
Learning


Logical formalisms of agent model (AI)
Truth theory (Epistemology)
Provenance
Learning agent models: knowledge and behavior (social
learning & psychology)
Inference

Reason over uncertain knowledge (reasoning)
Logical Formalisms

Modal Logic -- logically formalize agent






Agent & action (McCarthy, 1969; Kanger-Porn-Lindahl)
Agent & belief and intention (Cohen and Levesque, 1990)
Agent & knowledge (Epistemic logic)
Agent & belief (Doxastic logic)
Agent & obligation (Deontic logic)
Other logical formalisms for trust and belief




Regan’s formal framework for belief and trust
Jøsang’s subjective logic
Abdul-Rahman’s social trust model
Jones and Firozabadi’s integrated logic model of trust
Epistemology
Learning Agent models

Objects to be learned



Domain Trust
Referral Trust
Methods


Histogram
Feedback based
Reason over uncertain knowledge

Quantitative approach

Certainty factors - Mycin (Shortliffe, 1976)



(obsolete heuristic), similar to Fuzzy approach
Possibility theory: Fuzzy logic (Zadeh, 1965; 1976)
Dempster-Shafer theory (Dempster, 1968; Shafer, 1976)



Subjective logic
Probabilistic theory: Bayesian network (Pearl, 1982)
Qualitative approach

Non-monotonic logic
Two level data access


Datalog
Logical level

RDF data access language (with provenance)




Quads
TriQL
SPARQL
Storage level

Centralized



triplestore
Kowari
Decentralized

Search engine?
Example walkthrough

Given a hypothesis/query in the form of a collection of
RDF statements with or without variables

Provenance





Where can I find them?
Where are the definitions for each term?
Belief( agent, fact): Who said or asserted so?
Justify( fact, fact):
Trust


Can I believe them and thus use them in decision making?
How do I trust the other agents?
Relationship to Other Work

Representation





Agent, knowledge
Provenance
Trust






Pattern extraction
Transitive closure
RDF storage


Probabilistic inference
Scalability

Metadata
RDF query language

Trust network inference
Credibility

Data access

Inference
Domain filter
Social filter
Semantic Web
Research Plan
Approach – the WOB framework

Representation

WOB ontology




Management




Model provenance and trust in the Semantic Web
Explicitly represent the Semantic Web
Represent SW as a KB in terms of “agent, statement, association”
Provenance aware data access language
Social network extraction and integration
Provenance and trust based knowledge base expansion
Inference

Hypothesis credibility assessment




Trust network inference
Provenance and trust based belief evaluation
Explicit justification
Ontology dictionary
Research Methodology


Identify real world problems with examples
Approach problems





Formalize problem
Position problem in literature, and find related work
Find issues to be resolved
Design and implement solutions
Evaluation methods



Statistics
Project application
Survey
Artifacts to be produced


[Data] Web Of Belief Ontology
[System] Swoogle metadata and search
service



[System] Ontology dictionary
[Data] Swoogle Statistics
[System] SemDis Trust layer


[Algorithm] Trust based belief evaluation
[Algorithm] Trust based knowledge expansion
Limitations

Limited to online Semantic Web documents
Preliminary Work
WebOfBelief Ontology

Ontology


Entity: Document, Statement, Reference, Agent,
Association


Sub-classes: trust, belief, justification, dependency
Facets



Provenance





Confidence (conditional probability)
Connective (semantics)
(Agent-document) Ownership/Authorship
(Agent-Reference) belief
(Reference-Reference) justification
(doc-doc) dependency
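A minimal sketch of how one such association record might be held in memory, assuming the four sub-classes and the confidence and connective facets listed above. The class name, field names, and the example identifiers are hypothetical and are not taken from the WOB ontology itself.

```python
from dataclasses import dataclass

# Hypothetical names: this class and its fields are illustrative only.
KINDS = {"trust", "belief", "justification", "dependency"}


@dataclass(frozen=True)
class Association:
    kind: str          # one of KINDS (the four sub-classes listed above)
    subject: str       # e.g. an agent, a reference, or a document
    obj: str           # e.g. a reference or a document
    connective: str    # semantics of the link, e.g. "wob:believe"
    confidence: float  # assumed to lie in [0, 1]

    def __post_init__(self):
        # Reject records that fall outside the assumed value ranges.
        if self.kind not in KINDS:
            raise ValueError(f"unknown association kind: {self.kind}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


# An (Agent-Reference) belief association with made-up identifiers.
belief = Association("belief", "agent:alice", "ref:stmt42", "wob:believe", 0.9)
```

Validating the range at construction time keeps malformed confidence values out of any later trust or belief computation.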
Logical Formalisms
Web Of Belief (WOB) Conceptual Framework (v0.92)
[Figure: WOB ontology diagram. Classes: Association (with Dependency, Justification, Belief, Trust), AssociationConnective, Reference, foaf:Document, foaf:Agent, rdf:Resource, rdf:Statement. Edge labels: confidence, connective, selects, contains, source, foaf:page, dc:creator. Value range shown: xsd:real [0,1]. Terms shown: wob:imports, wob:priorVersion, wob:support, wob:weaken, wob:cause, wob:imply, wob:believe, wob:disbelieve, wob:nonbelieve, wob:truthful, wob:wise, wob:knowledgeable, wob:cooperative.]
Data digest service

Support data access language
Credibility Assessment

Trust Network Inference
Given a trust network, how do we propagate trust so as
to evaluate trust between any two agents?
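One simple way to make the propagation question concrete is sketched below. It assumes direct trust values in [0, 1], multiplies them along a path, and takes the best path between two agents. This max-product rule is only one illustrative heuristic, not the method proposed in these slides, and the function name and sample network are hypothetical.

```python
import heapq


def propagate_trust(network, source, target):
    """Return the best multiplicative trust from source to target (0.0 if unreachable).

    network maps each agent to a dict {neighbor: direct trust in [0, 1]}.
    Because every weight is at most 1, trust never grows along a path, so a
    Dijkstra-style best-first search finds the maximum-product path.
    """
    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap via negated trust values
    while heap:
        negated, agent = heapq.heappop(heap)
        trust = -negated
        if agent == target:
            return trust
        if trust < best.get(agent, 0.0):
            continue  # stale queue entry
        for neighbor, weight in network.get(agent, {}).items():
            candidate = trust * weight
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return 0.0


# Made-up trust network for illustration.
network = {
    "alice": {"bob": 0.9, "carol": 0.5},
    "bob": {"dave": 0.8},
    "carol": {"dave": 0.9},
}
derived = propagate_trust(network, "alice", "dave")  # best path goes through bob
```

Other propagation rules (for example averaging over several paths, or the surfer and flow models named later in the deck) would replace the multiplication and maximum used here.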

Trust and provenance based statement
evaluation

Explicit Justification
Ontology dictionary?
Social network extraction and mapping
Application



Trust based belief evaluation
Trust and provenance aware inference
Hypothesis testing and justification
Evaluation


Validate derived trust relations: survey users
Validate performance of WOB inference


Compare results with and without trust & provenance
Validate application utility: customer report
Contributions

A practical framework that makes the Semantic Web
a KB


The Web of Belief Ontology
Semantic Web data digest service




Search and browse mechanisms for SW
Support of RDF data access language?

Inference

Judge information trustworthiness
The first work in characterizing the Semantic Web
trust and provenance aware distributed inference
Dissertation schedule

Measures



Size of data that can be handled
Size of trust network
Milestones


Half-way
finished
An outline of the Semantic Web
[Figure: overview diagram. Related areas: Trust, Semantic Web, P2P, Possibility Theory, Belief Theory. Representation: belief, trust; policy, rule. Inference: derive trust; belief fusion; justification. Components: SW intelligent user, SW user, SW services, Reputation service, Inference Service, SW service finder, SW data finder, SW digest, Digest/Search Service, SW data service, SW file, Rich Information Text, SW Composer. Other labels: heuristic search, flexible query, information protection, compose, the Semantic Web.]
An example
inference
Find Washington Population
Sorry, I don’t have it.
Do you want the US population?
disambiguation
SW digest
Which ‘Washington’ do you mean?
Associations
RDF reference
Belief. Who knows what?
How to refer to part of an RDF graph
Trusting provenance
Sure! The following SWDs/agents know that
Trust network
Trust network discovery
Uncertainty and Precision
•Credential based trust
•Reputation based trust
•Context/Role based trust
Trusting content
• consensus
• context axioms
Here is the certainty/trustworthiness of each unique answer
Justification
Oh yeah! Answer X is credible because it
comes from a government website
Rule represents hypothesis
Justification instantiates rule
Fill an RDF template
Show me the complete definition of class X
Expected Contributions

Framework
 Features for characterizing the Semantic Web
 A Web of Belief ontology to connect the Semantic Web




Association/ annotation
Query language or data access language?
Mechanisms
 Search/browse Semantic Web Document
 Judge information trustworthiness
Applications
 Swoogle
 SemDis
1. Web of Belief – represent the SW


Build an abstract view of the Semantic Web
Select features to characterize it





Overall features: timeline, category
Different levels: term, document, network
Different classes: Entity, Association
Different semantics: meta-ontology, domain ontology
Build web of belief ontology for explicit
representation
Ontology, Document, Namespace, and Term
Namespace
Local name
uses (n:1)
hasName (n:1)
Term
defines (m:n)
contains (m:n)
Document
defines (m:n)
Ontology
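The relations in this diagram can be illustrated with a small sketch. It assumes that a term's URI splits into a namespace and a local name at the last "#" or "/", and that a document keeps separate sets of the terms it defines and the terms it merely contains. The class names, field names, and the splitting rule are assumptions made for illustration, not the Swoogle implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Term:
    namespace: str   # Term uses Namespace (n:1)
    local_name: str  # Term hasName Local name (n:1)

    @property
    def uri(self):
        return self.namespace + self.local_name


def split_uri(uri):
    """Split a term URI at its last '#' or '/' (an assumed convention)."""
    cut = max(uri.rfind("#"), uri.rfind("/")) + 1
    return Term(uri[:cut], uri[cut:])


@dataclass
class Document:
    url: str
    defines: set = field(default_factory=set)   # Document defines Term (m:n)
    contains: set = field(default_factory=set)  # Document contains Term (m:n)


person = split_uri("http://xmlns.com/foaf/0.1/Person")
owl_class = split_uri("http://www.w3.org/2002/07/owl#Class")

# Hypothetical document URL, used only to show the two relation sets.
doc = Document("http://example.org/people.rdf")
doc.contains.add(person)
```

Keeping "defines" and "contains" apart lets a digest service distinguish documents that introduce a term from documents that only use it.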
Swoogle Search & Browse (1/3)
SWDB

sameLocalName
An abstract view of the Semantic Web
Network level
Semantic Web
Document level
Document
RDF Node level
RDF Node
doc-doc association
Node-node association
Document
SWD
SW database
node-doc association
RDF Node
Resource
NSWD
SW ontology
RDF Database
Java Source
Literal
class
ID
property
Non-ID
2. Swoogle – index service for SW


Even when knowledge is available online, a portal data
digest service is needed to facilitate data access
RDF digest



RDF query




Meta level (use RDF/OWL semantics)
Domain level (use domain semantics)
Document
Term
Literal (name, identifier)
Dictionaries


Term/Ontology dictionary
Web entity dictionary
Association Feature
Ontological annotation
rdf:type
Empirical c-p definition
rdf:type

node-node


Term-definition
class-property




P2
---
Ontological
Empirical
P1
o1
rdfs:domain
rdfs:range
P3
Ontological c-p definition
meta association, e.g. rdfs:subClassOf, rdfs:domain
node-doc



I
C
MetaC
resource, doc, #subject,#property,#object, #subject-type-X,
#X-type-object
Literal, doc, predicate
doc-doc


Meta association, e.g. owl:imports
Namespace co-occurrence
Story 1: Big RDF file & P2P

Facts



We found that WordNet has published its ontology in a 60M DAML file, which
Jena fails to load into memory.
Most people use ontologies as annotations for exported data (as Stefan Decker
argued at the WWW2004 Developer Day).
Querying RDF should be tractable (Ian Horrocks, Andy Seaborne), i.e. we
need to balance the tractability and the expressiveness of a query.



The query result for a graph pattern (with variables) can be of three types: a
subgraph, the variable bindings, or a maximal subgraph.
Provenance information mainly concerns agents (person, organization,
website), i.e. an agent’s belief.
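To illustrate the result types mentioned above, the following sketch evaluates a conjunctive graph pattern over a small in-memory set of triples and returns both the variable bindings and the matched subgraph. It is a naive matcher written for illustration, not a real RDF query engine; the function names and sample data are hypothetical, and variables are assumed to be strings starting with "?".

```python
def match_pattern(pattern, triple, binding):
    """Extend binding so that pattern matches triple; return None on mismatch."""
    extended = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # A variable must keep the value it was first bound to.
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended


def query(graph, patterns):
    """Return (variable bindings, matched subgraph) for a list of triple patterns."""
    results = [({}, frozenset())]
    for pattern in patterns:
        next_results = []
        for binding, used in results:
            for triple in graph:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_results.append((extended, used | {triple}))
        results = next_results
    bindings = [binding for binding, _ in results]
    subgraph = set().union(*(used for _, used in results))
    return bindings, subgraph


# Made-up triples for illustration.
graph = {
    ("ex:li", "foaf:knows", "ex:tim"),
    ("ex:tim", "foaf:name", "Tim"),
    ("ex:li", "foaf:name", "Li"),
}
patterns = [("?x", "foaf:knows", "?y"), ("?y", "foaf:name", "?n")]
bindings, subgraph = query(graph, patterns)
```

The nested loops make the cost grow quickly with the number of patterns, which is the tractability versus expressiveness trade-off the slide refers to.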
Question



Is it appropriate to say an RDF model is an RDF file? If not, how do we
describe a distributed RDF model?
Will there be any very big RDF files? Why?
Can we store RDF in small files distributed throughout the world?
3. SemDis: How to judge information
trustworthiness?

Granularity





Association




rdf:Statement
SWD
Information source (agent, website)
Topic
Social network (FOAF)
Belief, Authorship (foaf:maker)
Justification
Trust computation


Ranking
Network Consensus
Practice of Trust

Fields

Weblog





DBLP
FOAF
Google

Applications

Manipulate precision




FOAF
RSS
Online Social network



Disambiguation: specialize
knowledge
Privacy protection: generalize
knowledge
Manipulate completeness – fuse
knowledge
Algorithms



Trust propagation algorithm: surfer
model, flow model
Belief merging algorithm

Given a new statement

Reasoning: What is its trustworthiness
given opinions on it from some
information sources? (subjective logic,
fuzzy cognitive map)

Justification: How to find evidence to
support/weaken it? (web of belief
ontology for annotation)
Given a question

Search: effective/efficient in open
environment (rdf digest, bounded search
with trust heuristic)
Given online multi-networks

Social relations among information
sources (FOAF)

Ontological relations among topics (subtopic)

Web entity identification and mapping
Emergence model

How can these really affect Semantic Web
research?
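For the reasoning question on this slide (the trustworthiness of a new statement given opinions from several information sources), subjective logic offers a consensus operator that fuses two opinions expressed as (belief, disbelief, uncertainty) triples summing to 1. The sketch below follows the commonly cited form of that operator. The class name and sample values are illustrative, and it is not claimed to be the fusion method used in SemDis.

```python
from typing import NamedTuple


class Opinion(NamedTuple):
    belief: float
    disbelief: float
    uncertainty: float  # belief + disbelief + uncertainty is assumed to be 1


def consensus(a, b):
    """Fuse two independent opinions about the same statement."""
    k = a.uncertainty + b.uncertainty - a.uncertainty * b.uncertainty
    if k == 0:
        # Both opinions are fully certain; this simple form has no answer.
        raise ValueError("consensus is undefined for two dogmatic opinions")
    return Opinion(
        (a.belief * b.uncertainty + b.belief * a.uncertainty) / k,
        (a.disbelief * b.uncertainty + b.disbelief * a.uncertainty) / k,
        (a.uncertainty * b.uncertainty) / k,
    )


# Two made-up opinions on the same statement from different sources.
fused = consensus(Opinion(0.8, 0.1, 0.1), Opinion(0.6, 0.2, 0.2))
```

Fusing two uncertain opinions yields an opinion with lower uncertainty than either input, which matches the intuition that agreement among sources increases confidence.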
Story 2: Identity


Facts
 We found many social networks online, e.g. coauthor (DBLP),
knows (FOAF), colleague. Different networks adopt different
identities.
 Each of them might not be well connected, or might be quite small, but what if
we connected them?
 One identity may be shared by multiple persons, by mistake or by nature.
 Identity mapping is m:n.
Questions
 Can we determine the certainty of an identity?
 How do we map identities?
Story3: Knowledge Fusion

Fact



We can fuse person information from multiple FOAF files.
Some statements are confirmed by a lot of people.
We can build a model in which statements have multiple
provenances.
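The fusion idea above can be sketched as a merge that remembers, for each statement, the set of documents asserting it. The function name, file names, and triples below are made up for illustration; a real system would parse FOAF files rather than use literal tuples.

```python
from collections import defaultdict


def fuse(documents):
    """Map each triple to the set of source documents that assert it."""
    provenance = defaultdict(set)
    for source, triples in documents.items():
        for triple in triples:
            provenance[triple].add(source)
    return dict(provenance)


# Hypothetical FOAF-like statements from three sources.
documents = {
    "a.rdf": [("ex:li", "foaf:name", "Li Ding"), ("ex:li", "foaf:knows", "ex:tim")],
    "b.rdf": [("ex:li", "foaf:name", "Li Ding")],
    "c.rdf": [("ex:li", "foaf:name", "Li Ding"), ("ex:li", "foaf:age", "30")],
}
fused = fuse(documents)

# Number of independent sources confirming each statement.
support = {triple: len(sources) for triple, sources in fused.items()}
```

The per-statement source sets are what later allow a receiver to weigh a statement by how many, and which, providers confirm it.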
Questions


How can provenance information be used to assure the
receiver?
What if Dr. Joshi wants to determine his trust in the
ontology created by Dr. Amit Sheth?
Story 4: Justification Markup Language

Facts about distributed justification on the web (semantic web)




The justification on the web may not always be formalized.
Knowledge on the web could be objective (like a database) or subjective (like a joke or an estimate).
Knowledge on the semantic web is inherently inconsistent.
Determining what counts as adequate reasons is an obstacle to providing justification. This process
of reason giving can be viewed as argumentation in four major forms: inductive, deductive,
conclusive, and prima facie.




Question:



How to represent the mixture of human inference, statistical information, and logical inference?
Distributed justification: trust-based, case-based, logical-inference
Example: I will buy a new Honda Accord because





Inductive and deductive justification involve evidence and logical evaluation.
In a conclusive argument, reasons are analyzed by asking if another rational human would have the same belief
given the same reasons.
Prima facie argumentation is a process of giving several reasons for believing something and choosing the most
important one.
(1) [inductive] it is a good car because 90% of related online comments are positive;
(2) [deductive] it has better miles-per-gallon performance;
(3) [conclusive/mimic] I will buy the car since my friend (who has similar taste to mine) would like to buy it;
(4) [prima facie] among all factors that make me happy, buying a new car is the most important.
Solution


A formal language to express logic programming proof traces, e.g. PML
We also need an informative language to express human justification



Express relations between statements: support, causal, critique
Log decision process as a case for future sharing/recall/query.
Cite a case/used reason as proof of new justification