Semantic Adaptation of
Schema Mappings when
Schemas Evolve
Cong Yu
University of Michigan
Lucian Popa
IBM Almaden Research Center
VLDB’05, Trondheim, Norway – Sep 2, 2005
Schema Mappings
Schema S
Schema T
I
J
q’



q
Schema Mappings are logical, declarative, assertions that can
describe relationships between schemas.

enough semantics to guide run-time, instance-level, transformation

e.g., GLAV mappings (or tuple-generating dependencies)
They are key elements in two main areas in information integration:

Data Exchange/Translation

Query Answering/Rewriting (or Federation)
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
2
Schema Evolution and Mapping Adaptation



Schemas evolve over time … Mappings may become invalid !
A lot of effort goes into establishing mappings. How do we
reuse them ?
Mapping Adaptation Problem [VMP’03]


09/02/2005
Given:

mapping M from S to T,

changes/evolution of S to S’, or T to T’, or both,
Derive a “best” mapping M’ that:

is valid with respect to the new schemas, and

reflects the original mapping as much as possible
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
3
Prior Solution: Incremental Method
S
M
T
M1
move
elem
T1
M2
M3
add
elem
T2
delete
constraint
T3
rename
elem
…

[VMP’03] Incrementally adapts the mapping after each atomic
change in the schemas (source and/or target).

Efficient and intuitive, for one or few changes.

However, for non-incremental evolution, there are drawbacks …
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
4
M
S
T
Different
evolution paths
Mn
Tn

The new schema may be radically different



The method will ultimately be inefficient:


The list of changes may not be known.
Evolution path must be discovered  not necessarily unique
The algorithm must be applied at each atomic change
As we shall see, the resulting mapping may not be the expected
one.
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
5
Our Approach: Composition-Based
S
M
T
E
Can use schema
mapping tools (e.g.,
Clio) to construct E.
M’ = M ° E
T’


Evolution itself is described as a schema mapping.

Concise, declarative, and expressive description of evolution.

Enables efficiency and can deal with arbitrary evolution
The adapted mapping is then obtained via composition.


Formal semantics of adaptation.
At high level, this is part of the model management vision [Ber03].
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
6
Main Contributions

We study the interplay between schema evolution
and mapping composition

interesting in terms of both semantics and
implementation

We show that the composition-based approach for
mapping adaptation can be practical
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
7
Outline of the Rest of the Talk

Incremental Approach vs. Composition Approach


Example (showing why composition is important)
Composition: Semantics and Algorithm

Transformation semantics
 specialized, more suitable for schema evolution, also
more challenging

Optimization and Experiments

Compose only when necessary
(Some mapping formulas are unaffected by the change)

09/02/2005
Conclusion
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
8
Simplified Example
Source’
LineItem
li
s
p
o
qty


SuppPart
s
p
PartOrder
p
o
Target
m:
m
PotentialSupp
s
o
SuppPart (s, p) Λ PartOrder (p, o)
 PotentialSupp (s, o)
( GLAV mapping [Halevy01], or,
source-to-target tgd [FKMP03] )
The mapping m “exports” orders o and all their potential
suppliers s.
Schema evolution scenario:


Source
Data arrives in “long” tuples, each relating an order, a part and an
available supplier.
The mapping m must be adapted to use new schema Source’.
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
9
Incremental Approach [VMP’03]
Source’
LineItem
li
s
p
o
qty


Source
SuppPart
s
p
PartOrder
p
o
Target
m:
m
PotentialSupp
SuppPart (s, p) Λ PartOrder (p, o)
 PotentialSupp (s, o)
s
o
Pick a list of changes from Source to Source’ and rewrite mapping
after each change.
(1) Move element SuppPart/s to PartOrder/s:
SuppPart (p) Λ PartOrder (s, p, o)  PotentialSupp (s, o)


(2) Delete SuppPart/p and (3) delete SuppPart.
(4) Rename PartOrder to LineItem, (5) add LineItem/li and (6) add
LineItem/qty:
m’:
09/02/2005
LineItem (li, s, p, o, qty)  PotentialSupp (s, o)
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
10

Although small, our example already needs 6 schema changes.


For large schemas, this can become challenging
Furthermore, and somewhat surprisingly, the semantics of the
adapted mapping may not be the “expected” one !
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
11
Loss of Semantics
Source’
LineItem
li
s
p
o
qty
Source
SuppPart
s
p
PartOrder
p
o
Target
m
PotentialSupp
s
o
m: SuppPart (s, p) Λ PartOrder (p, o)
 PotentialSupp (s, o)
m’: LineItem (li, s, p, o, qty)
 PotentialSupp (s, o)

The original mapping m joins orders with suppliers

However, m’ loses relevant suppliers


It only pairs an order with a supplier provided they appear in the same
LineItem tuple
To retain the original semantics, we must look in different tuples !
m’’:
09/02/2005
LineItem (li, s, p, o, qty) Λ LineItem (li’, s’, p, o’, qty’)
 PotentialSupp (s’, o)
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
12


The incremental approach is a “mechanical” procedure that
makes local changes to the mapping.
A sequence of good local changes may not necessarily yield
the best global adaptation …
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
13
Mapping Composition Approach

We look at the evolution globally:

Describe evolution through a schema mapping Source’  Source.
Source’
LineItem
li
s
p
o
qty


Source
SuppPart
s
e1
p
PartOrder
p
e2
o
Target
m
PotentialSupp e :
1
s
e2 :
o
LineItem (l, s, p, o, q) -> SuppPart (s, p)
LineItem (l, s, p, o, q) -> PartOrder (p, o)
Define the adapted mapping to be a mapping Source’  Target,
equivalent (e.g., same data movement) to the sequence of the evolution
mapping and the original mapping.
The previous m’’ satisfies the conditions for {e1,e2} and {m}.
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
14

The composition approach is a more systematic approach,
with precise semantics, guaranteed to behave the “right”
way in all situations.

Although it may appear simple in the previous example,
mapping composition poses challenges …
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
15
Challenges in Composition Approach

Mapping language:

Must handle nesting and complex types (as in XML Schema)


(details in the paper)
Furthermore, the usual mapping languages (GLAV, tuplegenerating dependencies) are not closed under composition !


Recent extension that ensures composability: second-order tgds
[FKPT04].
Main idea: add functions to gain needed expressive power

Semantics and Algorithm

Efficiency/Scalability
09/02/2005
Next …
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
16
Composition:
Semantics and
Algorithm
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
17
Composition: Semantics

In mapping composition, we want to replace a sequence of
schema mappings with one that is “equivalent” and avoids
the middle schema.

What does “equivalent” mean ?

There are two semantics that we considered:

Relationship semantics


Transformation semantics

09/02/2005
More general
More suitable, specialized
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
18
Relationship Semantics

Mappings can be viewed as describing relationships between
instances over the two schemas
Rel (M12) = { (I1, I2) | (I1, I2) satisfies M12 }

Composition of relationships:
Rel (M12) ◦ Rel (M23) = { (I1, I3) | there is I2 such that
(I1, I2) satisfies M12 and (I2, I3) satisfies M23 }

[FKPT04, Melnik04] A mapping M13 is equivalent, to the
sequence of M12 and M23, under the relationship semantics, if:
Rel (M13) = Rel (M12) ◦ Rel (M23)
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
19
Example: Semantics and Algorithm
S1
Takes:
name
course
S2
Student:
sid
M
name
Enrolls:
sid
course

S3
E Takes’:
sid
name
course
Second-order tgd [FKPT04]
M:
Takes (n, c)  Student (F(n), n)
 Enrolls (F(n), c)
E:
Student (s, n)  Enrolls (s, c)
 Takes’ (s, n, c)
M13 correctly captures the equivalent
relationship between instances of S1 and S3.

Instances (and function F) can exist a priori.

A student n must be paired with a course c


even when c is listed under a different
student name n’,
provided the student id is the same:
“Unknown” student id
1. Substitution
M13:
Takes (n, c’)  Takes (n’, c)
 F(n) = F(n’)

Takes’ (F(n), n, c)
F(n) = F(n’)
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
20

However, if we assume that the function F is one-to-one, an
important simplification can be made …
M13:
Takes (n, c’)  Takes (n’, c)
 F(n) = F(n’)

Takes’ (F(n), n, c)
Equivalent relationship
2. Reduction
F(n) = F(n’)  n = n’
M’13: Takes (n, c’)  Takes (n, c)
 Takes’ (F(n), n, c)
3. Minimization
M’’13: Takes (n, c)
 Takes’ (F(n), n, c)
Equivalent transformation

We can always make this assumption, if mappings are meant to
describe transformations (i.e., generation of a target instance).

09/02/2005
F is a Skolem function assigning unique student ids: n  F(n)
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
21
Transformation Semantics

A mapping is a process (in the spirit of data exchange
[FKMP03]):
I2 = M12(I1)


Each mapping formula is a “generator” of target facts

Functions are one-to-one value generators
Theorem. Our composition algorithm produces the schema
mapping with the equivalent transformation semantics:
M13 (I1) = M23( M12(I1) )
(up to the renaming of nulls)
Advantage of transformation semantics, in adaptation:
 simpler and more intuitive formulas !
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
22
Composition Algorithm: Further Details


The substitution step is more complex than shown:

Must handle nesting

Generate parameterized rules for set types in the middle schema

Reuse some of the mapping-based query rewriting techniques [YP04]
Minimization:

Good: it simplifies formulas and generates intuitive mapping.
(all this is enabled by the transformation semantics)

Bad: it can be expensive (same as tableau minimization) …
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
23
Optimization
and Experimental
Results
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
24
Full Adaptation
Full adaptation  Compose “whole” schema mappings
(Compose all the formulas in the original mapping with all
the formulas in the evolution mapping)


09/02/2005
Inevitable when the schema evolution is drastic and affects most
of the original mapping (non-incremental evolution)
Inefficient when the changes are small and localized (incremental
evolution)
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
25
Compose Only When Necessary
Mapping Pruning:
1. Detect those parts (formulas) M’o of the original mapping Mo
that are affected by evolution.

Only M’o need to be adapted.
2. Only a subset M’e of the formulas in the evolution mapping Me
play a role in the composition with M’o

The rest are redundant (PTIME containment-like analysis, see paper)
3. Compose M’o with M’e
 Big performance gain for incremental evolution and overall.
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
26
Analysis of Evolution Scenarios
Results
based on
Clio
Benefits = 1 – adapted mappings / (blank-sheet mappings + missed mappings)
We also have synthetic scenarios that show scalability of Mapping
Pruning with increasing schema and mapping complexity
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
27
Conclusion


We studied:

Mapping composition techniques for mapping adaptation

Transformation semantics in the context of schema evolution
Designed and implemented a practical adaptation system


Mapping pruning (schema evolution specific)
To Do:

Optimization of composition in general

Improve performance of minimization
09/02/2005
Semantic Adaptation of Schema Mappings when Schemas Evolve - VLDB'05
28
Descargar

Document