Object Fusion in Geographic Information Systems
Catriel Beeri, Yaron Kanza, Eliyahu Safra, Yehoshua Sagiv
Hebrew University, Jerusalem, Israel
The Goal: Fusing Objects that Represent
the Same Real-World Entity
Example: three data sources that provide information about hotels in Tel-Aviv
• MAPI: the Survey of Israel
• MAPA: a commercial corporation
• MUNI: the municipality of Tel-Aviv
• MAPI: cadastral and building information
• MUNI: municipal information
• MAPA: tourist information
Attributes shown in the figure: polygon, points, hotel rank, "Is there a nearby parking lot?"
Each data source provides data that the other sources do not provide.
[Figure: the Radison Moria hotel as it appears in MAPI, MUNI, and MAPA]
Object fusion enables us to utilize the different perspectives of the data sources
Why Are Locations Used for Fusion?
• There are no global keys to identify objects that should
be fused
• Names cannot be used
– Change often
– May be missing
– May be in different languages
• It seems that locations are keys:
– Each spatial object includes location attributes
– In a “perfect world,” two objects that represent the same
entity have the same location
Why Is It Difficult to Use Locations?
• In real maps, locations are inaccurate
• An overlay of the three data sources about hotels in Tel-Aviv shows the problem: for example, the Basel Hotel has three different locations, one in each data source
Inaccuracy ⇒ Difficult to Use Locations
• It is difficult to distinguish between:
1. A pair of objects that represent close entities
2. A pair of objects that represent the same entity
• Partial coverage complicates the problem
Fusion methods
Assumptions
• There are only two data sources
• Each data source has at most one object for
each real-world entity – i.e., the matching is
one-to-one
Corresponding Objects
• Objects from two distinct sources that represent the same real-world entity
Fusion Sets
• A fusion algorithm creates two types of fusion sets:
– A set with a single object
– A set with a pair of objects, one from each data source
Confidence
• Our methods are heuristics, so they may produce incorrect fusion sets
• A confidence value between 0 and 1 is attached
to each fusion set
• It indicates the degree of certainty in the
correctness of the fusion set
[Figure: examples of fusion sets with low confidence and fusion sets with high confidence]
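The definitions of fusion sets and confidence above can be captured in a small data structure. The following is only an illustrative sketch: the class name `FusionSet`, the `(source name, object id)` representation of an object, and the validation messages are assumptions, not part of the original work.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical representation: an object is a (source name, object id) pair.
Member = Tuple[str, str]


@dataclass(frozen=True)
class FusionSet:
    members: Tuple[Member, ...]
    confidence: float

    def __post_init__(self) -> None:
        # A fusion set holds either a single object or a pair of objects.
        if len(self.members) not in (1, 2):
            raise ValueError("a fusion set holds one object or a pair")
        # A pair must take one object from each of the two data sources.
        if len(self.members) == 2 and self.members[0][0] == self.members[1][0]:
            raise ValueError("a pair must take one object from each data source")
        # The confidence value lies between 0 and 1.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie between 0 and 1")


pair = FusionSet(members=(("MAPI", "1"), ("MAPA", "a")), confidence=0.9)
single = FusionSet(members=(("MAPI", "2"),), confidence=0.4)
```

Rejecting malformed sets at construction time keeps the one-to-one matching assumption explicit in the data rather than in each algorithm.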
The Mutually-Nearest Method
• The result includes
– All mutually-nearest pairs
– All singletons, i.e., objects that are not part of any mutually-nearest pair
[Figure: input objects 1, a, and 2 → finding nearest objects → fusion sets]
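A minimal sketch of the mutually-nearest idea, assuming that objects are 2D points, that distance is Euclidean, and that two objects are mutually nearest when each is the closest object to the other among the objects of the opposite source. The function name, the dictionary representation, and the sample coordinates are illustrative assumptions; ties are broken arbitrarily by `min`.

```python
from math import dist


def mutually_nearest(source_a, source_b):
    """Return (pairs, singletons) for two dicts mapping object id -> (x, y)."""

    def nearest(point, candidates):
        # Id of the candidate closest to `point`, or None if there are none.
        return min(candidates, key=lambda k: dist(point, candidates[k]), default=None)

    nearest_in_b = {a: nearest(p, source_b) for a, p in source_a.items()}
    nearest_in_a = {b: nearest(p, source_a) for b, p in source_b.items()}

    # Keep a pair only when the choice is mutual.
    pairs = [(a, b) for a, b in nearest_in_b.items()
             if b is not None and nearest_in_a[b] == a]

    paired_a = {a for a, _ in pairs}
    paired_b = {b for _, b in pairs}
    # Every object that is not part of a pair becomes a singleton.
    singletons = ([a for a in source_a if a not in paired_a]
                  + [b for b in source_b if b not in paired_b])
    return pairs, singletons


# Objects 1 and 2 come from one source, object a from the other.
source_1 = {"1": (0.0, 0.0), "2": (5.0, 0.0)}
source_2 = {"a": (1.0, 0.0)}
pairs, singletons = mutually_nearest(source_1, source_2)
```

In this example both 1 and 2 have a as their nearest object, but a is nearest to 1 only, so the result is the pair (1, a) and the singleton 2.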
The Probabilistic Method
• An object from one dataset has a probability of
choosing an object from the other dataset
• The probability is inversely proportional to the
distance
• Confidence of a singleton: the probability that the object is not chosen by any object from the other dataset
• Confidence of a pair: the probability of the mutual choice
• A threshold value is used to discard fusion sets with low confidence
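The description above can be sketched as follows. This is a simplified illustration under several assumptions that the slides do not confirm: the choice probability is taken as exactly 1/distance normalized over the other dataset, the choices are treated as independent (so the mutual-choice probability is a product), object ids are unique across both sources, and the names `choice_probabilities`, `probabilistic_fusion`, and `eps` are hypothetical.

```python
from math import dist


def choice_probabilities(choosers, candidates, eps=1e-9):
    """P(chooser picks candidate), proportional to 1 / distance (assumed form)."""
    probs = {}
    for c, p in choosers.items():
        # eps avoids division by zero when two objects coincide.
        weights = {k: 1.0 / (dist(p, q) + eps) for k, q in candidates.items()}
        total = sum(weights.values())
        probs[c] = {k: w / total for k, w in weights.items()}
    return probs


def probabilistic_fusion(source_a, source_b, threshold):
    """Map each fusion set (a tuple of ids) to its confidence, above threshold."""
    a_to_b = choice_probabilities(source_a, source_b)
    b_to_a = choice_probabilities(source_b, source_a)
    confidence = {}
    # Pair: probability of the mutual choice (independence assumed).
    for a in source_a:
        for b in source_b:
            confidence[(a, b)] = a_to_b[a][b] * b_to_a[b][a]
    # Singleton: probability that no object of the other dataset chooses it.
    for a in source_a:
        not_chosen = 1.0
        for b in source_b:
            not_chosen *= 1.0 - b_to_a[b][a]
        confidence[(a,)] = not_chosen
    for b in source_b:
        not_chosen = 1.0
        for a in source_a:
            not_chosen *= 1.0 - a_to_b[a][b]
        confidence[(b,)] = not_chosen
    # Discard fusion sets with low confidence.
    return {s: c for s, c in confidence.items() if c >= threshold}


source_1 = {"1": (0.0, 0.0), "2": (5.0, 0.0)}
source_2 = {"a": (1.0, 0.0)}
fused = probabilistic_fusion(source_1, source_2, threshold=0.5)
```

With these sample points, a chooses 1 with probability of about 0.8 and 2 with about 0.2, so only the pair (1, a) and the singleton 2 pass the 0.5 threshold.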
Mutual Influences Between Probabilities
[Figure, Case I: objects 1, a, and 2; values shown: 0.3 and 0.2]
[Figure, Case II ("we expect"): objects 1, a, 2, and b; values shown: 0.8 and 0.05]
The Normalized-Weights Method
• Normalization captures mutual influence
• Iteration brings the weights to equilibrium
• Results are superior to those of the previous two methods, at the cost of only a small increase in computation time
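The slides do not spell out the normalization procedure, so the following is only a rough sketch of the general idea of "normalize, then iterate to equilibrium". It swaps in a Sinkhorn-style alternating row and column normalization over inverse-distance weights, which may differ from the authors' actual procedure (for instance, it has no notion of an unmatched object and only settles when both sources have the same number of objects). All names and parameters are hypothetical.

```python
from math import dist


def normalized_weights(source_a, source_b, iterations=100, tol=1e-9, eps=1e-9):
    """Alternate row and column normalization until the weights stop changing."""
    a_ids, b_ids = list(source_a), list(source_b)
    if not a_ids or not b_ids:
        return {}
    # Initial weights: inverse distance (an assumed starting point).
    w = [[1.0 / (dist(source_a[a], source_b[b]) + eps) for b in b_ids]
         for a in a_ids]
    for _ in range(iterations):
        previous = [row[:] for row in w]
        # Row step: each object of source_a spreads a total weight of 1.
        row_sums = [sum(row) for row in w]
        w = [[v / row_sums[i] for v in row] for i, row in enumerate(w)]
        # Column step: each object of source_b receives a total weight of 1.
        col_sums = [sum(row[j] for row in w) for j in range(len(b_ids))]
        w = [[row[j] / col_sums[j] for j in range(len(b_ids))] for row in w]
        change = max(abs(w[i][j] - previous[i][j])
                     for i in range(len(a_ids)) for j in range(len(b_ids)))
        if change < tol:  # equilibrium reached
            break
    return {(a, b): w[i][j]
            for i, a in enumerate(a_ids) for j, b in enumerate(b_ids)}


source_1 = {"1": (0.0, 0.0), "2": (4.0, 0.0)}
source_2 = {"a": (1.0, 0.0), "b": (5.0, 0.0)}
weights = normalized_weights(source_1, source_2)
```

Because each step rescales a row or column in light of the others, the weight of the pair (1, a) depends on how strongly 2 and b compete for the same partners, which is one way to express mutual influence.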
Measuring the Quality of the Result
E: entities in the world
R: fusion sets in the result
C: correct fusion sets in the result

Recall = |C| / |E| = (# correct sets in the result) / (# entities)
Precision = |C| / |R| = (# correct sets in the result) / (# all sets in the result)
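The two measures can be computed directly from sets. In this sketch the ground truth is assumed to be given as one correct fusion set per real-world entity, so that |E| is the number of ground-truth sets and C is the intersection of the result with the ground truth. The function name and the sample data are illustrative.

```python
def recall_precision(result_sets, true_sets):
    """Return (recall, precision) for collections of fusion sets.

    true_sets holds one correct fusion set per real-world entity (assumption),
    so len(true_sets) plays the role of |E|.
    """
    result = {frozenset(s) for s in result_sets}   # R
    truth = {frozenset(s) for s in true_sets}      # one set per entity in E
    correct = result & truth                       # C
    recall = len(correct) / len(truth) if truth else 0.0
    precision = len(correct) / len(result) if result else 0.0
    return recall, precision


# Three entities: one seen by both sources, two seen by one source each.
true_sets = [{"1", "a"}, {"2"}, {"b"}]
# The algorithm found one correct pair and one incorrect pair.
result_sets = [{"1", "a"}, {"2", "b"}]
recall, precision = recall_precision(result_sets, true_sets)
```

Here one of the three entities is recovered correctly and one of the two produced sets is correct, giving a recall of 1/3 and a precision of 1/2.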
A Case Study: Hotels in Tel-Aviv
                                          Recall   Precision
State of the art:
  Traditional nearest-neighbor method      0.48     0.56
Our three methods:
  Mutually nearest                         0.77     0.85
  Probabilistic                            0.80     0.80
  Normalized weights (best results)        0.85     0.90
All three methods perform much better
than the nearest-neighbor method
Extensive tests on
synthesized data are
described in the paper
Conclusions
The novelty of our approach is in developing efficient
methods that find fusion sets with high recall and
precision, using only the locations of objects.
Thank you!
You are invited to visit our poster
And our web site
http://gis.cs.huji.ac.il/