Federated Service Oriented
Information Management
Ahmet Sayar
[email protected]
Introduction


Aim: Develop a general, Grid-architecture-based approach to managing
distributed heterogeneous data, information and knowledge, provided by
different repositories and producers, in an efficient and robust manner.
Challenges in representing, transforming, integrating and displaying
data and information/knowledge for decision makers in scientific
application domains.


Methodology:


Create “Federated Service Oriented Information Management
architecture” for the GIS domain based on OGC (Open Geospatial
Consortium) specifications.
Determine the requirements for generalizing the architecture to
other domains (Chemistry).
Motivation

SOA based on Grid or Web Services

We use DIKW to describe the Data-Information-Knowledge-Wisdom hierarchy that we are attempting to support.

“Filter Services” are Information Sources:







A service inputs DIKW from other Grids or Services and outputs DIKW
– perhaps converting data to information etc.
Web Services are easy to extend and federate.
They are easy to publish, locate and bind.
Predictable input/output interfaces defined by metadata
A repository or sensor has or gets DIKW from "outside Grid";
it outputs DIKW; they are “just” filters whose output is Grid
compatible DIKW as messages or message streams
Information management through ASIS (Application Specific
Information System) framework in Science Domains.
Data and metadata concepts and formats
GIS – OGC (Motivation Domain) (1)



Geographic Information System (GIS) is a
system for creating and managing spatial data
and associated attributes.
OGC (Open Geospatial Consortium): its goal is to make geographic
information and services neutral and available across any network,
application, or platform.
Challenges (valid for any science domain)
- Distributed nature of geospatial data.
- Proprietary data formats and service methodologies.
- Lack of interoperable services.
- Assembling data from distributed sources.
- Format conversions.
- Amount of resources for geoprocessing.
GIS – OGC (Motivation Domain) (2)


GML: Geography Markup Language
WFS: Web Feature Server
- Provides vector data, such as rivers and state and city boundaries, in GML.
WCS: Web Coverage Server
- Provides coverage (raster) data: gridded data, pixel information.
WMS: Web Map Server
- Provides data as images (jpeg, svg, png, etc.). The supported formats are
  defined in its capabilities file.
WMS': Cascading Web Map Server
- Provides data in the form of layers in images. It is cascading because it
  provides the layers of other WMSs as if they were its own.
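As a rough illustration of how a client might ask a WMS for one of these images, the sketch below builds a GetMap request URL. It assumes the key-value-pair HTTP GET binding of WMS 1.1.1, which may differ from the SOAP-based invocation used elsewhere in this talk. The host name, the bounding box and the helper `build_getmap_url` are hypothetical.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layers, bbox, width, height,
                     srs="EPSG:4326", image_format="image/png"):
    """Return a WMS 1.1.1 GetMap URL using key-value-pair encoding."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),            # comma-separated layer names
        "STYLES": "",                          # empty value = default styles
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and bounding box, for illustration only.
url = build_getmap_url("http://wms.example.org/wms",
                       ["California:Faults"],
                       (-125.0, 32.0, -114.0, 42.0),
                       800, 600)
print(url)
```

A cascading WMS would accept the same request and forward it to whichever upstream WMS actually owns the layer.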
Information Management Arch
In GIS Domain (Sample Scenario)





Query: no standard; Filter specification; queries on vector data through the WFS using SQL.
Data encodings: GML, images.
Metadata: structured capability document in XML.
No event notification; WS-Context for asynchronous runs.
Registry: WRS, which we call MD.
[Figure: sample GIS scenario. A WFS serves vector data and a WCS serves raster data; WMS instances and a cascading WMS' (labeled CGL, Minnesota and NASA) exchange layers Data:a, Data:b and Data:c and feed interactive decision support tools, with MD holding the capability metadata. Each filter container comprises capabilities metadata, publishing, discovering, data handling, service interfaces and a core filtering module.]
From Raw Data to Information / Knowledge




Portable across
- Languages
- Operating systems
Raw Data -> GML (WFS in Filter - ASFS)
GML -> Map image (WMS in Filter - ASVS)
Each filter provides data in a consistent format.
Formats should be consistent with the system's data model, GML.
Any Data -> Common Data Model
Data model is XML-based hierarchical data.
[Figure: an ASFS filter followed by an ASVS filter. Each takes any data plus domain knowledge, applies data modeling around a structured-data core, and exposes capabilities metadata, publishing, discovering, data handling and service interfaces. The ASFS reads raw or any data from a database.]
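The "Any Data -> Common Data Model" step can be sketched as follows: a raw record is wrapped in a small XML hierarchy that borrows the GML 2 `Point`/`coordinates` elements. This is only a sketch under stated assumptions: the `Feature` wrapper element, the record fields and the function `record_to_feature` are hypothetical and are not taken from any OGC schema or from the system described here.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
ET.register_namespace("gml", GML_NS)


def record_to_feature(record):
    """Wrap one raw record (a dict) in a small GML-style feature element."""
    # "Feature" is a hypothetical wrapper, not an element from a real schema.
    feature = ET.Element("Feature", {"name": record["name"]})
    point = ET.SubElement(feature, f"{{{GML_NS}}}Point")
    coords = ET.SubElement(point, f"{{{GML_NS}}}coordinates")
    coords.text = f"{record['lon']},{record['lat']}"
    return feature


# Hypothetical raw record, e.g. one row read from a database.
raw = {"name": "station-1", "lon": -118.25, "lat": 34.05}
feature = record_to_feature(raw)
print(ET.tostring(feature, encoding="unicode"))
```

Once every source is expressed in the same hierarchical model, downstream filters only need to understand that one format.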
Interactive Decision
Support Tools
http://virtualsky.org (R. Williams et al.)
- Interactive query,
- Interactive display, movie and animation
- Integration to Application Science Simulations
Application Use Domains

ServoGrid Projects (GIS)
- Pattern Informatics (PI)
- GeoFest
- Virtual California (VC)
Los Alamos National Labs (LANL)
- IEISS (The Interdependent Energy Infrastructure Simulation System)
- Models infrastructure networks (e.g. electric power systems and natural
  gas pipelines) and simulates their physical behavior and the
  interdependencies between systems.
Chemistry and Astronomy (Future)
- CML (Chemical Markup Language) representation of molecules;
  VOTable (Virtual Observatory Table format).
Problem Recognition
[Figure: from raw data to decisions. Raw data held in many databases and formats (vector data, coverage data, netCDF, HDF5, JPEG images, bitmap data, binary data, XML data) becomes information (bar graphs, plots, images, statistics data) and, through interactive tools, knowledge and wisdom that support decisions.]
Problem Recognition -cont


Services like discovery and notification do not need to be made
application specific.
BUT if the domain changes, then the following change as well:
- choices,
- database requirements,
- data format,
- core service requirements,
- attributes, and
- metadata context.
What are the common concepts and characteristics for
- data,
- metadata,
- query language,
- services, and
- communication language,
in order to derive information/knowledge from the heterogeneous
data/information sources in any application domain?
Generalization of Service Oriented
Information Management Architecture

GIS has some specifications based on standards such as OGC and
ISO/TC 211, but many other domains do not.

GIS             ->  ASIS (Science Domain)
GML             ->  ASL (Representing)
WFS             ->  ASFS (Storing-Resource)
WMS             ->  ASVS (Displaying)
Capa.xml        ->  Metadata (Integrating)
SOAP over HTTP      (Communication Protocol)
Generalization - Overall Structure Solution

ASL: Application Specific Language. XML-based hierarchical data
representation format.
- Cross language, platform and operating system.
ASVS: Application Specific Visualization System.
- Last filter before the decision maker.
- Provides information/knowledge in human-readable formats.
ASFS: Application Specific Feature Service.
- Stores and provides the common data model (ASL).
- Treats binary and common data (in ASL) differently.
[Figure: an AS repository and an AS "sensor" feed an ASFS; generic AS tools and a user-defined AS service exchange messages using ASL; an ASVS sits before the display.]
ASFS and ASVS in SOA
Interfaces, querying, metadata and data model
ASFS
  Routine              Return type
  GetCapability        Capability file (XML)
  DescribeData         XML schema
  GetData              ASL

ASVS
  Routine              Return type
  GetCapability        Capability file (XML)
  GetVis               Images (svg, png, ...)
  GetDataInformation   HTML, text, XML
Each routine is published in the WSDL and invoked with a request that
follows a predefined schema and is placed in the SOAP body.
<request>
  <GetCapability/>
</request>
<SOAP:Envelope>
  <SOAP:Body>
    <request>
      <GetCapability/>
    </request>
  </SOAP:Body>
</SOAP:Envelope>
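A minimal sketch of how a client might assemble such a message, using only the Python standard library. The element names `request` and `GetCapability` come from the slide; the SOAP 1.1 envelope namespace and the helper `wrap_in_soap` are assumptions made for illustration, since the slide does not say which SOAP version is used.

```python
import xml.etree.ElementTree as ET

# Assumption: SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("SOAP", SOAP_NS)


def wrap_in_soap(routine):
    """Build <SOAP:Envelope><SOAP:Body><request><routine/>...</SOAP:Envelope>."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, "request")
    ET.SubElement(request, routine)  # e.g. GetCapability, GetData, GetVis
    return envelope


envelope = wrap_in_soap("GetCapability")
message = ET.tostring(envelope, encoding="unicode")
print(message)
```

The same wrapper would serve any routine in the table, because only the element placed inside `request` changes.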
Sample Capabilities File (too simplified) – GIS Domain

<?xml version='1.0' encoding="UTF-8" standalone="no" ?>
<!DOCTYPE WMT_MS_Capabilities SYSTEM "http://toro.ucs.indiana.edu:8086/xml/capabilities.dtd">
<Capabilities version="1.1.1" updateSequence="0">
  <Service>
    <Name>CGL_Mapping</Name>
    <Title>CGL_Mapping WMS</Title>
    <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple"
        xlink:href="http://toro.ucs.indiana.edu:8086/WMSServices.wsdl" />
    <ContactInformation>
    …..
    </ContactInformation>
  </Service>
  <Capability>
    <Request>
      <GetCapabilities>
        <Format>WMS_XML</Format>
        <DCPType><HTTP><Get>
          <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple"
              xlink:href="http://toro.ucs.indiana.edu:8086/WMSServices.wsdl" />
        </Get></HTTP></DCPType>
      </GetCapabilities>
      <GetMap>
        <Format>image/GIF</Format>
        <Format>image/PNG</Format>
        <DCPType><HTTP><Get>
          <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple"
              xlink:href="http://toro.ucs.indiana.edu:8086/WMSServices.wsdl" />
        </Get></HTTP></DCPType>
      </GetMap>
    </Request>
    <Layer>
      <Name>California:Faults</Name>
      <Title>California:Faults</Title>
      <SRS>EPSG:4326</SRS>
      <LatLonBoundingBox minx="-180" miny="-82" maxx="180" maxy="82" />
    </Layer>
  </Capability>
</Capabilities>
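To show how a client could use such a capabilities file, the sketch below parses a trimmed copy of it and extracts the GetMap formats and the layers on offer. The embedded XML is a shortened version of the sample (the DOCTYPE, the Service section and the GetCapabilities request are omitted), and the function name `summarize_capabilities` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample capabilities file, for illustration only.
CAPABILITIES_XML = """\
<Capabilities version="1.1.1">
  <Capability>
    <Request>
      <GetMap>
        <Format>image/GIF</Format>
        <Format>image/PNG</Format>
      </GetMap>
    </Request>
    <Layer>
      <Name>California:Faults</Name>
      <SRS>EPSG:4326</SRS>
      <LatLonBoundingBox minx="-180" miny="-82" maxx="180" maxy="82"/>
    </Layer>
  </Capability>
</Capabilities>
"""


def summarize_capabilities(xml_text):
    """Return (GetMap formats, {layer name: {'srs', 'bbox'}})."""
    root = ET.fromstring(xml_text)
    formats = [f.text for f in root.findall("./Capability/Request/GetMap/Format")]
    layers = {}
    for layer in root.iter("Layer"):
        box = layer.find("LatLonBoundingBox")
        bbox = tuple(float(box.get(k)) for k in ("minx", "miny", "maxx", "maxy"))
        layers[layer.findtext("Name")] = {"srs": layer.findtext("SRS"), "bbox": bbox}
    return formats, layers


formats, layers = summarize_capabilities(CAPABILITIES_XML)
print(formats)
print(layers)
```

A client can build its query interface from exactly this information: which layers exist, where they are valid, and which output formats can be requested.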
Sample Scenario for ASIS
[Figure: sample ASIS scenario. Several ASFS and ASVS filters serve data sets A to F. Each filter combines domain knowledge, data modeling and a structured-data core with capabilities metadata, publishing, discovering, data handling and service interfaces. Requests shown between filters: GetData(A), GetVis(A,E), GetVis(E). Interactive tools sit at the end of the chain. Callouts describe filters publishing their data through capability files, successive GetCapability requests and capability aggregation across chained filters, and client GetVis requests whose format is defined in a schema file.]
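The chaining idea in this scenario, where a filter advertises upstream data as if it were its own, can be sketched in a few lines. Everything here is hypothetical: the `Filter` class, the filter names and the assignment of data sets to filters are illustrative assumptions rather than the layout of the original figure.

```python
class Filter:
    """Hypothetical filter (ASFS or ASVS) that publishes data names in its capability."""

    def __init__(self, name, own_data=(), upstream=()):
        self.name = name
        self.own_data = set(own_data)
        self.upstream = list(upstream)

    def get_capability(self):
        """Return every data name served directly or through upstream filters."""
        available = set(self.own_data)
        for source in self.upstream:
            available |= source.get_capability()  # aggregate upstream capabilities
        return available

    def get_data(self, data_name):
        """Serve locally if possible, else forward to the first upstream filter listing it."""
        if data_name in self.own_data:
            return f"{data_name} from {self.name}"
        for source in self.upstream:
            if data_name in source.get_capability():
                return source.get_data(data_name)
        raise KeyError(data_name)


# Illustrative wiring only.
asfs_1 = Filter("ASFS-1", own_data={"A"})
asfs_2 = Filter("ASFS-2", own_data={"B", "C"})
asvs_1 = Filter("ASVS-1", upstream=[asfs_1, asfs_2])
asvs_2 = Filter("ASVS-2", own_data={"D"}, upstream=[asvs_1])

print(sorted(asvs_2.get_capability()))  # ['A', 'B', 'C', 'D']
print(asvs_2.get_data("A"))             # A from ASFS-1
```

A client only talks to the last filter in the chain; the aggregated capability hides how many filters sit behind it.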
Overall Structure Solution -cont







Common data (ASL) is kept in ASFS with query capability.
In a given domain every filter speaks ASL.
Filters (ASVS, ASFS) keep their metadata locally.
ASVS both visualize information and provide a way of navigating
ASFS and their underlying DBs.
ASVS can itself be federated and present an output interface.
Dynamic metadata update via MD services or P2P metadata exchange.
Utilizing data/information at the application level via filters
- ASFS provide ASL.
- ASVS provide human-readable information such as text, graphs
  (scalable vector (svg) or portable (png)) and images.
Filters have common ports and interfaces
- Enable chaining for more complex data and information creation.
- Filters are easily published, located and invoked over the Internet.
Applicability to
Different Science Domains

How well do the service definitions in the proposed architecture
match general science domains?

Domain      ASL            ASFS (filter)  ASVS (filter)             Metadata
GIS         GML            WFS            WMS                       capability.xml schema
Astronomy   VOTable, FITS  SkyNode        VOPlot, TopCat            VOResource
Chemistry   CML            NO             NO standard (JChemPaint)  NO
Research Issues (1)

Requirements for the domain metadata in the capability document
- What must the capabilities do and contain in order to federate filters?
Requirements for the ASL (such as CML, GML)
- What does the ASL need to have in order to federate the filters?
Concept of data (such as feature, coverage)
- Is a common representation possible? To what extent?
A common information management framework that can be applied to any domain
- Some instructions: for any field, what needs to be done.
Research Issues (2)



Application level data/information federation.
Integrating the system with application science simulations.
Creating interactive decision support tools utilizing integrated filter services.
- Tools for map animation, map movies, images.
- Interactive query support to get further information on the image
  and/or animation.
Enabling binding of services into pipelines, with or without human
intervention, through metadata.
Caching and load balancing to handle large scientific data in an
efficient and robust manner (application based).
Related Work
SRB (Storage Resource Broker)

SRB
- Uniform access to distributed heterogeneous data resources by attributes.
- Catalog service is MCAT (Metadata Catalog Service).
- Resource and data location transparency.
- Remote authentication and authorization; user groups.
- Not just for access, but also for transferring and replicating.
- Sample projects using SRB: BIRN and IVOA.
Summary
- There are other important digital library projects, as well as the NGAS
  (Next Generation Archive System) from ESO.
- We will research these important activities further, identify key
  architecture ideas and incorporate lessons.
- SRB can be leveraged in ASIS.
Related Work -Cont
OGSA-DAI

OGSA-DAI
- Open Grid Services Architecture Data Access and Integration.
- Access to heterogeneous data via common interfaces on the Grid.
- Catalog service is MCS (Metadata Catalog Service).
- OGSI-compliant Grid.
- Components are Grid services. Resources should be registered.
- Sample projects using OGSA-DAI: LEAD, MyGrid.
Summary
- OGSA-DAI emphasizes the database layer, whereas we are tackling the
  application specific DIKW.
- OGSA-DAI can be leveraged in ASIS.
Contributions




Instructions on how to build ASL and the metadata in the capability
document for the application sciences.
Instructions on how to build an application specific information system
(ASIS) that federates multiple filters speaking ASL.
Information grid (ASIS) formalization through capabilities metadata,
defining all the data/information sources as interacting Web Service
filters with standard metadata service ports.
Optimizing and enhancing distributed heterogeneous information management.
THANKS
[email protected]
Ahmet Sayar
APPENDIX
Literature Survey
OGSA-DAI
SRB
Discussions on SRB & Ogsa-DAI

SRB
- Monolithic: does too much
- MCAT dependent
- MCAT has limited support for application-level metadata
  - Needs different metadata for different domains, and extensions for applications
- Not standards based; not open source
- Does not handle data based on the DIKW hierarchy
OGSA-DAI
- At the data and database level
- MCS dependent
- MCS has limited support for application-level metadata
  - Needs different metadata for different domains, and extensions for applications
- For Grid applications: GGF standards
- Data only in relational and XML databases or ordinary files
- Does not handle data based on the DIKW hierarchy
Our Work Compared to SRB & Ogsa-DAI (1)

Each filter has its own metadata.
Distributed metadata handling
- Peer to peer
- Through MD services
SRB and OGSA-DAI provide heterogeneous data access and federation
through central metadata services
- SRB MCAT and OGSA-DAI MCS
Our main motivation is sharing, interpreting and extracting knowledge
from the data and information.
Their motivation is storing, accessing and updating heterogeneous data.
We leverage their power and usability in our federated service oriented
information management architecture.
They are not competitors but complements.
Our Work Compared to SRB & Ogsa-DAI (2)
[Figure: two stacks compared along the path from data access and query, through information/knowledge, to wisdom decisions. The SRB and OGSA-DAI stack (MCAT, MasterSRB, SRB agents, GDSReg, Ogsa-GDSF, Ogsa/GDS over resources R) leaves wisdom decisions, knowledge and information extraction to the user. Our stack (ASFS and ASVS filters over resources R, with interactive tools on top) delivers wisdom decisions as ready-to-use information and knowledge.]
SRB and OGSA-DAI
- Central data access abstraction: uniform access to heterogeneous data sources.
- Metadata: SRB/MCAT, OGSA-DAI/MCS.
- Both provide an extensible metadata architecture for different domains.
- SRB has a "zone" concept that addresses similar issues, but in a different way.
Our work
- Reusable components: filter services with specific ports and interfaces.
- Distributed DIKW abstraction.
- Metadata in the capability document.
- Metadata aggregators.
- New metadata for different domains.
- Smart data querying.
- Web Services based SOA (advantages).
Why are we different ?
Federated Service Oriented Information Management

SOA (Service Oriented Architecture)




Easy to extend
Reusable components
Cross platform and language.
XML based hierarchical data representation




Easy to access data – no command line




Easy data integration
Easy querying
Human readable information
Interactive tools
On the fly query creation.
Not only accessing data but also transforming it along its
path to end users.
Ports to integrate application simulations to application
specific information system (ASIS)

Integrating application simulation data/information with ASIS
outputs
An Example of Other Domains:
Astronomy Domain (IVOA Standards)


FS-1: VOPlot
- Integrating, interacting visualization tools.
FS-2: SkyNode
- ADQL-based SOAP interface returning VOTable-based results.
FS-3: SIA
- 2D sky projection, logically a grid of pixels encoded as a FITS image.
FS-4: SSA
- URL-based, returning a dataset "document" (VOTable).
Query: ADQL, an extension of SQL
Data encoding: VOTable, FITS
Metadata: UCD, VOResource
Event notification: VOEvent
Registry: VORegistry
Queryable data in: SSAP and SIAP, VOStore
[Figure: filter services FS-1 to FS-4, backed by databases and an MD registry, feed a portal for interactive decision support; each publishes its data through a capability document.]