Infomaster: An Information Integration Tool
O. M. Duschka and M. R. Genesereth
Presentation by Cui Tao
Introduction

Huge amount of information online:
– Distribution: not every query can be answered by the data in a single database
• Fragmentation: horizontal, vertical
– Heterogeneity
• Notational heterogeneity: different access languages and protocols (parsing HTML, SQL, OQL, Z39.50)
• Conceptual heterogeneity: semantic mismatches
– Instability
Introduction
 Intelligent agents
– Search and find desired information
– Convert formats
– Translate between different contexts
– Etc…
– Not feasible yet
– Considerable research in ontologies and
natural language understanding is required
Introduction
 Infomaster: an information integration tool
– Provide integrated access
– Manage evolving information sources
– Add new information sources
– Remove outdated information sources
Architecture
Tested Application Areas
 Newspaper classifieds
– Provide a uniform search interface
– Gather corresponding classifieds from all
relevant newspapers
 Product catalogs
– Provide terminology translation
 Campus databases
Abstraction Hierarchy
Descriptions of Relationships
 Interface relations and site relations are described in terms of base relations
 Interface relation vs. base relation:
(Figure labels: Interface, Base)
Descriptions of Relationships
 Site relation vs. base relation:
(Figure labels: Site, Base, Base)
Query Processing
Example: BMWs built in 1996 that are for sale for a price below their average market value.
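To make the example query concrete, here is a minimal Python sketch that evaluates it over toy data. The record layout, the `market_value` lookup table, and all prices are assumptions made up for illustration; they are not taken from Infomaster or its sources.

```python
# Hypothetical classified ads (schema and values are illustrative assumptions).
ads = [
    {"make": "BMW", "model": "318i", "year": 1996, "price": 21000},
    {"make": "BMW", "model": "328i", "year": 1996, "price": 33000},
    {"make": "BMW", "model": "318i", "year": 1995, "price": 15000},
    {"make": "Audi", "model": "A4", "year": 1996, "price": 20000},
]

# Hypothetical average market values keyed by (make, model, year).
market_value = {
    ("BMW", "318i", 1996): 23000,
    ("BMW", "328i", 1996): 31000,
    ("BMW", "318i", 1995): 16000,
    ("Audi", "A4", 1996): 22000,
}


def bargain_bmws(ads, market_value, year=1996):
    """Return BMW ads from the given year priced below their average market value."""
    result = []
    for ad in ads:
        key = (ad["make"], ad["model"], ad["year"])
        if (
            ad["make"] == "BMW"
            and ad["year"] == year
            and key in market_value
            and ad["price"] < market_value[key]
        ):
            result.append(ad)
    return result


bargains = bargain_bmws(ads, market_value)
```

The query joins two kinds of information (ads and market values), which in practice would come from different sources.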
Reduction:
Interface relations Base relations
 Simple:
User’s query --- Interface relation --- Base relation
 Example rewritten query:
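As a rough illustration of the reduction step (not the slide's own example), the sketch below unfolds an interface relation into the base relations that define it. The relation names `car_for_sale`, `ad`, `make`, `year`, and `price`, and the convention that capitalized strings are variables, are assumptions chosen for this sketch.

```python
# Hypothetical definition: one interface relation described by a conjunction
# of base relations. Format: name -> (head variables, body atoms).
DEFINITIONS = {
    "car_for_sale": (
        ("Make", "Year", "Price"),
        [
            ("ad", ("Id",)),
            ("make", ("Id", "Make")),
            ("year", ("Id", "Year")),
            ("price", ("Id", "Price")),
        ],
    ),
}


def reduce_query(query, definitions):
    """Replace every interface atom in the query by the body of its definition."""
    reduced = []
    for i, (name, args) in enumerate(query):
        if name not in definitions:
            reduced.append((name, args))  # already a base relation
            continue
        head_vars, body = definitions[name]
        binding = dict(zip(head_vars, args))

        def substitute(term):
            if term in binding:
                return binding[term]  # head variable: use the query's argument
            if term[:1].isupper():
                return f"{term}_{i}"  # body-only variable: rename apart
            return term  # constant

        for body_name, body_args in body:
            reduced.append((body_name, tuple(substitute(t) for t in body_args)))
    return reduced


query = [("car_for_sale", ("bmw", "1996", "P"))]
reduced = reduce_query(query, DEFINITIONS)
```

Because interface relations are defined in terms of base relations, this step is plain substitution and needs no search.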
Abduction
Base relations Site relations
 Site relations are expressed in terms of
base relations, but not vice versa
 Query rewriting problem: answering queries using views
 Abduction: use a standard model
elimination theorem prover
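The flavor of "answering queries using views" can be sketched with a brute-force covering search: find minimal sets of site relations whose descriptions together mention every base relation the query needs. This is a much cruder stand-in for the model elimination prover; it ignores variables and join conditions entirely. The `bluebook` site and all the descriptions below are assumptions for illustration.

```python
from itertools import combinations

# Hypothetical descriptions: the base relations each site relation is
# described in terms of (variables and joins are ignored in this sketch).
SITE_DESCRIPTIONS = {
    "sjmn": {"make", "year", "price"},
    "sfc": {"make", "year", "price"},
    "bluebook": {"make", "year", "avg_value"},
}


def candidate_plans(query_relations, descriptions):
    """Return minimal site combinations covering all base relations in the query."""
    needed = set(query_relations)
    sites = sorted(descriptions)
    plans = []
    for size in range(1, len(sites) + 1):
        for combo in combinations(sites, size):
            covered = set().union(*(descriptions[s] for s in combo))
            # Skip combinations that merely extend an already found plan.
            is_minimal = not any(set(p) <= set(combo) for p in plans)
            if needed <= covered and is_minimal:
                plans.append(combo)
    return plans


plans = candidate_plans({"make", "year", "price", "avg_value"}, SITE_DESCRIPTIONS)
```

Each returned combination is one candidate plan; the answer to the query would be the union of the answers of all plans.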
Abduction
Base relations Site relations
: The set of all descriptions of the site relations
: A set of site relations
: The rewritten user query after the reduction step
Abduction
Base relations Site relations
 The example query plans:
Optimization
Assume: All ads in sjmn are in sfc
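A containment assumption like this one lets the planner discard redundant plans. The sketch below shows one simple way this could work, assuming plans are represented as tuples of site names: a plan is dropped when swapping the contained source for its container yields another plan already in the list. The plan representation and the `bluebook` source are hypothetical.

```python
def prune_plans(plans, contained_in):
    """Drop plans made redundant by source containment.

    contained_in maps a source to a source that holds all of its data,
    e.g. {"sjmn": "sfc"} for "all ads in sjmn are in sfc".
    """
    plan_sets = [frozenset(p) for p in plans]
    kept = []
    for plan in plan_sets:
        redundant = False
        for small, big in contained_in.items():
            if small in plan:
                replaced = (plan - {small}) | {big}
                # Redundant if the plan using the larger source also exists.
                if replaced != plan and replaced in plan_sets:
                    redundant = True
        if not redundant:
            kept.append(tuple(sorted(plan)))
    return kept


# Hypothetical plans: each combines an ad source with a market-value source.
plans = [("bluebook", "sfc"), ("bluebook", "sjmn")]
pruned = prune_plans(plans, {"sjmn": "sfc"})
```

Under the stated assumption, only the plan that uses sfc remains, so sjmn never has to be contacted for this query.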
Conclusions
 The first integration system:
– Arbitrary positive relational algebra user queries
– DB description
 Efficient optimization by using:
– Integrity constraints
– Local completeness information
 Flexible use of query planning:
– Expressive description language
– Constraints
– Background theories
Related Work
 Information Manifold project and SIMS project:
– Explore the use of description logics for describing information sources
 Occam project
– Use general AI planning techniques to generate
information gathering plans
 TSIMMIS project
– Use pattern matching techniques to match user queries against predefined queries