The Virtual Observatory
Peter Fox
HAO/ESSL/NCAR
November 28, 2005
The Virtual Solar-Terrestrial Observatory
Outline
Virtual Observatories - history and definition(s), I*Ys
Some examples - within disciplines
When is a VO, not a VO?
Beyond disciplines, the emerging need
What is missing, i.e. the enabling technology
Challenges and interoperability
Examples: VSTO and SESDI
What’s ahead

Encyclopedia - we’ve made it!
Virtual observatory is a collection of integrated astronomical data
archives and software tools that utilize computer networks to create
an environment in which research can be conducted. Several
countries have initiated national virtual observatory programs that
will combine existing databases from ground-based and orbiting
observatories and make them easily accessible to researchers. As a
result, data from all the world's major observatories will be available
to all users and to the public. This is significant not only because of
the immense volume of astronomical data but also because the data
on stars and galaxies has been compiled from observations in a
variety of wavelengths: optical, radio, infrared, gamma ray, X-ray
and more. Each wavelength can provide different information about
a celestial event or object, but also requires a special expertise to
interpret. In a virtual observatory environment, all of this data is
integrated so that it can be synthesized and used in a given study.
http://www.encyclopedia.com/html/v1/virtobserv.asp

Yet more definitions
AVO: A virtual observatory (VO) is a collection of interoperating data
archives and software tools which utilize the internet to form a scientific
research environment in which astronomical research programs can be
conducted. In much the same way as a real observatory consists of
telescopes, each with a collection of unique astronomical instruments, the
VO consists of a collection of data centres each with unique collections of
astronomical data, software systems and processing capabilities.
From the Grid: virtual observatory - astronomical / solar / solar terrestrial
data repositories made accessible through grid and web services.
Workshop: A Virtual Observatory (VO) is a suite of software applications on
a set of computers that allows users to uniformly find, access, and use
resources (data, software, document, and image products and services
using these) from a collection of distributed product repositories and
service providers. A VO is a service that unites services and/or multiple
repositories.
VxOs
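The workshop definition (a service that unites multiple repositories behind one uniform way to find resources) can be sketched in a few lines. This is only an illustration under stated assumptions: the class names, the `find` method, and the example.org URLs are hypothetical and do not come from any real VO software.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Resource:
    """A pointer to a product held by some repository (hypothetical shape)."""
    repository: str
    name: str
    url: str


class Repository(Protocol):
    """Anything that can be searched by keyword; each provider implements this."""
    def find(self, keyword: str) -> list[Resource]: ...


class InMemoryRepository:
    """Stand-in for a real, remote product repository."""
    def __init__(self, label: str, holdings: dict[str, str]) -> None:
        self.label = label
        self.holdings = holdings  # product name -> access URL

    def find(self, keyword: str) -> list[Resource]:
        return [
            Resource(self.label, name, url)
            for name, url in self.holdings.items()
            if keyword.lower() in name.lower()
        ]


class VirtualObservatory:
    """One uniform search interface over many repositories."""
    def __init__(self, repositories: list[Repository]) -> None:
        self.repositories = repositories

    def find(self, keyword: str) -> list[Resource]:
        results: list[Resource] = []
        for repo in self.repositories:
            results.extend(repo.find(keyword))
        return results


# Invented holdings, for illustration only.
vo = VirtualObservatory([
    InMemoryRepository("archive-a", {"Neutral temperature": "http://example.org/a/tn"}),
    InMemoryRepository("archive-b", {"Neutral wind": "http://example.org/b/un",
                                     "Ion temperature": "http://example.org/b/ti"}),
])
hits = vo.find("temperature")  # one call, answers from both archives
```

The user asks one question and never needs to know which archive answered; the VO still returns pointers (URLs) so that the data stay with their providers.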

Virtual Observatories
Conceptual examples:
  In-situ: Virtual measurements
    - Related measurements
  Remote sensing: Virtual, integrative measurements
    - Data integration
Systems or frameworks?
Brokers, or data providers, or service providers?
  » VOTables, VOQueries, etc.: a syntax for exchange
Holding metadata? Who imposes the catalog, or vocabulary?

VOs and data providers
Not a VO:
  - When you hand off a user to another site
  - Only one dataset
  - When you do not deliver, or do not arrange for delivery of the data
  - When your curation role is not evident
Data provider (DP):
  - Acquire data and produce data products (static or dynamic).
  - Preserve data in usable forms.
  - Distribute data, and provide easy machine (API) and Internet browser access.
  - Support a communication mechanism: should support a standards-based messaging system (e.g., ftp, http, SOAP, XML).
  - Produce, document, and make easily available metadata for product finding and detailed data granule content description. Ideally, maintain a catalogue of detailed data availability information.
  - Assure the validity and quality of the data.
  - Document the validation process.
  - Provide quality information (flags).
  - Maintain careful versioning, including the processing history of a product.
  - Maintain an awareness of standards (such as community-accepted data models), and adhere to them as needed.
  - Provide software required to read and interpret the data; ideally the routines used by the PI science team should be available to all.

What should a VO do?
Make “standard” scientific research much more efficient.
  - Even the PI teams should want to use them.
  - Must improve on existing services (Mission and PI sites, etc.). VOs will not replace these, but will use them in new ways.
Enable new, global problems to be solved.
  - Rapidly gain integrated views from the solar origin to the terrestrial effects of an event.
  - Find data related to any particular observation.
  - (Ultimately) answer “higher-order” queries such as “Show me the data from cases where a large CME observed by SOHO was also observed in situ.”
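A “higher-order” query of this kind amounts to joining two event lists. The sketch below is a rough illustration only: the event records, the speed threshold used for “large”, and the allowed Sun-to-Earth delay window are all invented assumptions, not values from any catalog.

```python
from datetime import datetime, timedelta

# Hypothetical event lists; times and speeds are invented for illustration.
remote_cmes = [
    {"id": "cme-1", "time": datetime(2000, 7, 14, 10, 0), "speed_km_s": 1600},
    {"id": "cme-2", "time": datetime(2000, 8, 1, 6, 0), "speed_km_s": 400},
    {"id": "cme-3", "time": datetime(2000, 9, 12, 12, 0), "speed_km_s": 1500},
]
in_situ_events = [
    {"id": "icme-a", "time": datetime(2000, 7, 15, 14, 0)},
    {"id": "icme-b", "time": datetime(2000, 8, 3, 0, 0)},
]


def match_events(cmes, in_situ, min_speed=1000,
                 min_delay=timedelta(hours=12), max_delay=timedelta(days=5)):
    """Pair each large remote event with in-situ events in a plausible delay window.

    min_speed (km/s) stands in for "large"; the delay window is an assumed
    transit time, chosen only to make the example concrete.
    """
    pairs = []
    for cme in cmes:
        if cme["speed_km_s"] < min_speed:
            continue  # not "large" under the assumed threshold
        for event in in_situ:
            delay = event["time"] - cme["time"]
            if min_delay <= delay <= max_delay:
                pairs.append((cme["id"], event["id"]))
    return pairs


pairs = match_events(remote_cmes, in_situ_events)
```

With these invented lists only `cme-1` is paired with `icme-a`; a real VO would then return the data granules for both observations.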

What the NASA community wants
Provide coordinated discovery and access to data and service resources for a specific scientific discipline
  - Identify relevant data sources and appropriate repositories.
  - Allow queries that yield data granules or pointers to them.
  - Provide a user interface to access resources both through an API (or equivalent machine access) and a web browser application.
  - Handle a wide range of provider types, as needed.
Understand the data needs of its focus area:
  - Recruit potential new providers.
  - Provide support and "cookbooks" for easy incorporation of providers.
  - Help to assure high data quality and completeness of the product set.
  - Resolve issues of multiple versions of datasets.

More from NASA
Provide documentation for metadata:
  - Set standards for metadata and query items.
  - Assist providers, and review metadata.
  - Maintain a global knowledge of data availability.
  - Possibly maintain collection catalog metadata.
Provide an API or other means for the VxO to appear to others as a single provider.
Potentially provide value-added services (can be done by providers or elsewhere):
  - Data Subsetting:
  - Averaging of data
  - Filtering
  - Data Merging
  - Format Conversion
Provide access to event lists and ancillary data.
Collect statistical information and community comments to assess success.
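The value-added services named above (subsetting, averaging, filtering, merging, format conversion) are each small operations on a time series. A minimal sketch using only the standard library follows; the function names and the sample values are invented for illustration and do not describe any VxO's actual service.

```python
import csv
import io
from statistics import mean

# Invented (hour, value) samples standing in for two time series.
series_a = [(0, 10.0), (1, 12.0), (2, 14.0), (3, 16.0)]
series_b = [(1, 1.0), (2, 2.0), (4, 4.0)]


def subset(series, start, stop):
    """Subsetting: keep samples with start <= time < stop."""
    return [(t, v) for t, v in series if start <= t < stop]


def average(series, bin_width):
    """Averaging: mean value in consecutive time bins of the given width."""
    bins = {}
    for t, v in series:
        bins.setdefault(t // bin_width * bin_width, []).append(v)
    return [(t0, mean(values)) for t0, values in sorted(bins.items())]


def filter_valid(series, low, high):
    """Filtering: keep samples whose value lies in [low, high]."""
    return [(t, v) for t, v in series if low <= v <= high]


def merge(a, b):
    """Merging: join two series on their common time stamps."""
    b_by_time = dict(b)
    return [(t, va, b_by_time[t]) for t, va in a if t in b_by_time]


def to_csv(rows, header):
    """Format conversion: serialize rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


merged_csv = to_csv(merge(series_a, series_b), ["hour", "a", "b"])
```

Whether such functions run at the provider, in the VxO, or on the user's machine is exactly the "can be done by providers or elsewhere" choice noted on the slide.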

VSO and the ‘small box’

CEDAR

The Earth System Grid
[Architecture diagram: ESG sites (LBNL, ANL, NCAR, NERSC, LLNL, ORNL, ISI) and their components, grouped into data storage, security, metadata, transport, analysis and visualization, monitoring, and framework services. Components shown include gridFTP servers/clients, HRM, DISK, HPSS, MSS, MySQL, Xindice, RLS, MCS, GSI, CAS client/server, MyProxy client/server, GRAM, TOMCAT, AXIS, SLAMON daemons, OGSA-DAIS, auth metadata, THREDDS catalogs, LAS server, openDAPg servers, and NCL and CDAT openDAPg clients.]

Emerging needs
Interdisciplinary science and engineering (not just between adjacent fields)
Interdisciplinary data assimilation, integration
Web service workflow orchestration (beyond syntax)
Vortals as well as portals (specific to general)
Agency (NASA) and community efforts (eGY, IHY, IPY, IYPE)

ACOS at the MLSO
Near real-time data from Hawaii from a variety of solar instruments, as a valuable source for space weather, solar variability and basic solar physics

CISM
Goal: To create a physics-based numerical simulation model that describes the space environment from the Sun to the Earth.
The uses of space weather modeling:
  - A scientific tool for increased understanding of the complex space environment.
  - A specification and forecast tool for space weather prediction.
  - An educational tool for teaching about the space environment.

CEDARWEB
Community data archive, documents, and support.

User requirements
CEDAR
  - Search must return data (i.e. no null searches)
  - Search across instruments, models
  - Know about special time periods, campaigns, etc.
  - Allow selections based on (appropriate) geophysical conditions, e.g. Kp index
  - Usual format returned and in correct units
  - Must be able to easily re-create the search, access
  - Visual browsing
MLSO
  - Same as CEDAR!
  - Plus a choice of sampling interval, e.g. minutely, daily, average, best of the day, synoptic
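The "no null searches" requirement, together with selection on a geophysical condition such as Kp, can be met by offering users only the choices that still lead to data. The following is a minimal sketch under stated assumptions: the catalog rows, field names, and function names are invented and are not the CEDAR or MLSO catalog structure.

```python
# Invented catalog rows: (instrument, parameter, date, Kp index).
CATALOG = [
    ("fpi", "neutral_temperature", "2000-01-10", 2),
    ("fpi", "neutral_temperature", "2000-03-02", 6),
    ("fpi", "neutral_wind", "2000-03-02", 6),
    ("radar", "ion_temperature", "2000-02-15", 4),
]
FIELDS = ("instrument", "parameter", "date", "kp")


def search(max_kp=None, **selected):
    """Return catalog records matching the selections and an optional Kp ceiling."""
    rows = [dict(zip(FIELDS, row)) for row in CATALOG]
    rows = [r for r in rows if all(r[key] == value for key, value in selected.items())]
    if max_kp is not None:
        rows = [r for r in rows if r["kp"] <= max_kp]
    return rows


def available_options(field, max_kp=None, **selected):
    """Values of `field` that still lead to at least one record.

    Presenting only these values in the interface means no path the user
    can click through ends in an empty result.
    """
    return sorted({r[field] for r in search(max_kp=max_kp, **selected)})


quiet_time_parameters = available_options("parameter", instrument="fpi", max_kp=3)
```

Here a user who has chosen the (hypothetical) `fpi` instrument and quiet conditions is offered only `neutral_temperature`, because that is the only parameter with matching records.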

Challenges and interoperability
Semantic misunderstanding
  - E.g. sunspot number and variations in solar radiation: over 90% of researchers outside the sub-field of solar radiation think that sunspot number is a measure of solar radiation.
  - In reality: a sunspot number is a measure of the number of sunspots appearing on the visible solar surface; a sunspot is an indicator of the location of strong solar magnetic fields; strong magnetic fields are collectively known as solar activity; sunspots are observed to produce a localized decrease in the solar radiation output, etc.
  - How to ‘explain’ this to a computer?
Interfaces are built by computer scientists with syntax that often works within a discipline but rarely across disciplines
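One simple way to begin 'explaining' the sunspot example to a computer is to write each statement as a subject-predicate-object triple and let software query the relationships instead of guessing from names. This is a toy sketch: the predicate and term names are invented for illustration, and a real system would use an ontology language rather than Python tuples.

```python
# Each statement from the sunspot example, written as (subject, predicate, object).
# Term and predicate names are hypothetical.
TRIPLES = [
    ("sunspot_number", "is_measure_of", "count_of_visible_sunspots"),
    ("sunspot", "indicates_location_of", "strong_magnetic_field"),
    ("strong_magnetic_field", "is_part_of", "solar_activity"),
    ("sunspot", "locally_decreases", "solar_radiation"),
]


def objects(subject, predicate):
    """All objects asserted for a given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def is_measure_of(quantity, target):
    """True only if the knowledge base states that quantity measures target."""
    return target in objects(quantity, "is_measure_of")


# The common misconception is not supported by the stated facts.
misconception_holds = is_measure_of("sunspot_number", "solar_radiation")
```

The point of the encoding is that a request for "a measure of solar radiation" would not silently return sunspot numbers, because no such statement exists.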

Concept and user needs
Goal - find the right balance of data/model holdings, portals and client software that researchers can use without effort or interference, as if all the materials were available on their local computer.
The Virtual Solar-Terrestrial Observatory (VSTO) is a:
• distributed, scalable education and research environment for searching, integrating, and analyzing observational, experimental and model databases in the fields of solar, solar-terrestrial and space physics
VSTO comprises a:
• System-like framework which provides virtual access to specific data, model, tool and material archives containing items from a variety of space- and ground-based instruments and experiments, as well as individual and community modeling and software efforts bridging research and educational use

User needs
In discussions with data providers and users, the needs are clear:
“Fast access to ‘portable’ data, in a way that works with the tools we have; information must be easy to access, retrieve and work with.”
  - Few clicks, get what I want, whose tools? MY tools
Too often users (and data providers) have to deal with the organizational structure of the data sets, which varies significantly: data may be stored at one site in a small number of large files, while similar data may be stored at another site in a large number of relatively smaller files. There is an equally large problem with the range of metadata descriptions for the data. Users often want only subsets of the data and struggle to get them efficiently. One user expresses it as:
“(Please) solve the interface problem.”
  - Encapsulate more

What’s new in the VSTO?
• Datasets alone are not sufficient to build a virtual observatory: VSTO integrates tools, models, and data
• VSTO addresses the interface problem, effectively and scalably
• VSTO addresses the interdisciplinary metadata and ontology problem - bridging terminology and use of data across disciplines
• VSTO leverages the development of schema that adequately describe the syntax (name of a variable, its type, dimensions, etc. or the procedure name and argument list, etc.), semantics (what the variable physically is, its units, etc.) and pragmatics (or what the procedure does and returns, etc.) of the datasets and tools.
• VSTO provides a basis for a framework for building and distributing advanced data assimilation tools
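The distinction between syntax, semantics and pragmatics can be made concrete with a small description record. The sketch below is an assumption-laden illustration, not the VSTO schema: the class names, field names and example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableDescription:
    # Syntax: how the variable appears in a file.
    name: str
    dtype: str
    dimensions: tuple
    # Semantics: what the variable physically is.
    quantity: str
    units: str


@dataclass(frozen=True)
class ProcedureDescription:
    # Syntax: how the procedure is called.
    name: str
    arguments: tuple
    # Pragmatics: what the procedure does and returns.
    does: str
    returns: str


def compatible(a: VariableDescription, b: VariableDescription) -> bool:
    """Two variables describe the same thing if quantity and units agree,
    even when their names, types and dimensions differ."""
    return a.quantity == b.quantity and a.units == b.units


# Invented examples: two archives naming the same quantity differently.
tn_a = VariableDescription("TN", "float32", ("time",), "neutral temperature", "K")
tn_b = VariableDescription("temp_n", "float64", ("time", "altitude"), "neutral temperature", "K")
ti = VariableDescription("TI", "float32", ("time",), "ion temperature", "K")

plot_series = ProcedureDescription(
    "plot_time_series", ("times", "values"),
    does="draws values against time", returns="an image",
)
```

Syntax alone would treat `TN` and `temp_n` as unrelated; the semantic fields are what allow a tool to recognize them as the same measurement.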

Virtual Observatory: Need better glue
• Basic problem: schema are categorized rather than developed from an object model/class hierarchy, which significantly limits non-human use. However, they all form the basis to organize catalog interfaces for all types of data, images, etc.
• This limits data systems utilizing frameworks and prevents frameworks from truly interoperating (SOAP, WSDL only a start)
• Directories, e.g. NASA GCMD, CEDAR catalog, FITS (flat) keyword/value pairs, are being turned into ontologies (SWEET, VSTO)
• Markup languages, e.g. ESML, SPDML, ESG/ncML are excellent bases

Methodologies
Use-cases
User requirements
Semantics - ‘what does this mean’
Data integration
Ontologies
Rapid prototyping

HAO and SCD from NCAR, McGuinness Associates: Peter Fox, Don Middleton, Stan Solomon, Deborah McGuinness, Jose Garcia, Patrick West, Luca Cinquini, James Benedict, Tony Darnell
http://vsto.hao.ucar.edu/ and soon http://www.vsto.org/
Application domains - CEDAR, CISM, ACOS
Realms (ontologies):
  - Covers middle atmosphere to the Sun + SPDML
  - Mesh with Earth Realm (SWEET)
  - Mesh with GEON
[Diagram labels: VSTO; SWEET + SPDML; ACOS, CISM, CEDAR; use-cases and user requirements]

VSTO Use-case 1
UC1: Plot the observed/measured Neutral Temperature (Parameter) looking in the vertical direction for Millstone Hill Fabry-Perot interferometer (Instrument) from January 2000 to August 2000 (Temporal Domain) as a time series.
Precondition: portal application is authorized to access the backend data extraction and plotting service
1. User accesses the portal application
2. User goes through a series of views to select (in order) the desired observatory, instrument, record-type (kind of data), parameter, start and stop dates, and the plot type is inferred. At each step, the user selection determines the range of available options in the subsequent steps. NB, an alternate path is selection of start and stop dates, then instrument, etc.
3. The application validates the user request: verifying the logical correctness of the request, i.e. that Millstone Hill is an observatory that operates a type of instrument that measures neutral temperature (i.e. check that Millstone Hill <isA> observatory and check that the range of the measures property on the Millstone Hill Fabry-Perot Interferometer subsumes neutral temperature). Also, the application must verify that no necessary information is missing from the request.
4. The application processes the user request to locate the physical storage of the data, returning for example a URL-like expression: find Millstone Hill FPI data of the correct type (operating mode; defined by CEDAR KINDAT since the instrument has two operating modes) in the given time range (Millstone Hill FPI <hasKindofData> 1701 <intersects> TemporalDomain [January 2000, August 2000] )
5. The application plots the data in the specified plot type (a time series). This step involves extracting the data from records of one or more files, creating an aggregate array of data with independent variable time (of day or day+time depending on time range selected) and passing this to a procedure to create the resulting image.
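Steps 3 and 4 of the use-case can be sketched as follows. This is a hedged illustration, not the VSTO implementation: the `ONTOLOGY` dictionary, the `validate` and `locate` functions, and the "Neutral Wind" entry are assumptions made for the example; only the names Millstone Hill, the FPI, Neutral Temperature, the KINDAT value 1701 and the form of the URL-like expression come from the use-case text.

```python
from datetime import date

# A toy knowledge base; structure and the "Neutral Wind" entry are hypothetical.
ONTOLOGY = {
    "is_a": {"Millstone Hill": "Observatory"},
    "operates": {"Millstone Hill": {"Millstone Hill FPI"}},
    "measures": {"Millstone Hill FPI": {"Neutral Temperature", "Neutral Wind"}},
}
REQUIRED = ("observatory", "instrument", "parameter", "start", "stop")


def validate(request):
    """Step 3: return a list of problems; an empty list means the request is valid."""
    problems = [f"missing {key}" for key in REQUIRED if key not in request]
    if problems:
        return problems
    obs, inst, par = request["observatory"], request["instrument"], request["parameter"]
    if ONTOLOGY["is_a"].get(obs) != "Observatory":
        problems.append(f"{obs} is not an observatory")
    if inst not in ONTOLOGY["operates"].get(obs, set()):
        problems.append(f"{obs} does not operate {inst}")
    if par not in ONTOLOGY["measures"].get(inst, set()):
        problems.append(f"{inst} does not measure {par}")
    if request["start"] > request["stop"]:
        problems.append("start date is after stop date")
    return problems


def locate(request, kindat):
    """Step 4: build the URL-like expression that identifies the data."""
    return (f"{request['instrument']} <hasKindofData> {kindat} <intersects> "
            f"TemporalDomain [{request['start']:%B %Y}, {request['stop']:%B %Y}]")


request = {
    "observatory": "Millstone Hill",
    "instrument": "Millstone Hill FPI",
    "parameter": "Neutral Temperature",
    "start": date(2000, 1, 1),
    "stop": date(2000, 8, 31),
}
problems = validate(request)
expression = locate(request, 1701)
```

A request for a parameter the instrument does not measure is rejected before any data are touched, which is the practical benefit of checking the request against the ontology first.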

Demo

Have you heard these questions?
What do you mean by that?
What did you mean by that?
What does this mean?
How did you get this, please explain?
Does this also mean … ?
Doesn’t this contradict … ?
<Insert your own here>
Leads to:
  - Inference
  - Reasoning
  - Explanation
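Inference with explanation can be illustrated with a toy forward-chaining loop that records, for every derived fact, the two facts it was derived from. This is a sketch only: the class hierarchy is invented, the single rule (transitivity of `is_a`) is chosen for brevity, and a production system would use a real reasoner.

```python
# Stated facts; the hierarchy is invented for illustration.
FACTS = {
    ("Millstone Hill FPI", "is_a", "Fabry-Perot Interferometer"),
    ("Fabry-Perot Interferometer", "is_a", "Interferometer"),
    ("Interferometer", "is_a", "Optical Instrument"),
    ("Optical Instrument", "is_a", "Instrument"),
}


def infer_is_a(facts):
    """Forward-chain transitivity of is_a, recording why each new fact holds."""
    known = set(facts)
    why = {}  # derived fact -> the pair of premises it came from
    changed = True
    while changed:
        changed = False
        for (a, _, b) in list(known):
            for (c, _, d) in list(known):
                if b == c and (a, "is_a", d) not in known:
                    known.add((a, "is_a", d))
                    why[(a, "is_a", d)] = ((a, "is_a", b), (c, "is_a", d))
                    changed = True
    return known, why


def explain(fact, why, depth=0):
    """Answer 'How did you get this?' as an indented derivation tree."""
    suffix = "" if fact in why else "  (stated)"
    lines = ["  " * depth + " ".join(fact) + suffix]
    for premise in why.get(fact, ()):
        lines.extend(explain(premise, why, depth + 1))
    return lines


known, why = infer_is_a(FACTS)
derivation = explain(("Millstone Hill FPI", "is_a", "Instrument"), why)
```

Printing `derivation` line by line shows the conclusion at the top and the stated facts it rests on at the leaves; the intermediate steps may differ between runs, but the leaves are always the four stated facts.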

Paradigm shift for NASA?
From: Instrument based
To: Measurement based
Requires: ‘bridging the discipline data divide’
Overall vision for SESDI: To integrate information technology in support of advancing measurement-based processing systems for NASA by integrating existing diverse science discipline and mission-specific data sources.
[Diagram labels: SWEET, Volcano, Climate, SESDI]

Semantic connectors
[Diagram labels: SWEET; process-oriented semantic content represented in SWSL; articulation axioms]
The SESDI re-usable component interfaces. The stub on each end of the connector is based on the GEON Ontology-Data registration technology and contains articulated axioms derived from the knowledge gained in the unit-level data registration. Includes integrity checks, domain and range, etc.
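As a rough illustration of what a connector stub might do, the sketch below maps a term from one vocabulary to a term in another and applies a range check during translation. Everything here is hypothetical: the term names, the scale factor, the valid range and the `Axiom` structure are invented and do not describe the SESDI or GEON implementations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axiom:
    """A hypothetical articulation axiom linking two vocabularies."""
    source_term: str
    target_term: str
    scale: float        # multiply source values by this to get target units
    valid_range: tuple  # (low, high) allowed for the converted value


# Invented example linking a volcano-side term to a climate-side term.
AXIOMS = {
    "plume_height_km": Axiom("plume_height_km", "injection_altitude_m",
                             1000.0, (0.0, 60000.0)),
}


def translate(record):
    """Rewrite a record into the target vocabulary, with integrity checks."""
    translated = {}
    for term, value in record.items():
        axiom = AXIOMS.get(term)
        if axiom is None:
            raise KeyError(f"no articulation axiom for {term!r}")
        converted = value * axiom.scale
        low, high = axiom.valid_range
        if not low <= converted <= high:
            raise ValueError(f"{axiom.target_term}={converted} outside [{low}, {high}]")
        translated[axiom.target_term] = converted
    return translated


climate_side = translate({"plume_height_km": 12.5})
```

Terms without an axiom and values outside the declared range are rejected rather than passed through, which is the role the integrity checks play at each end of the connector.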

What’s ahead?
- Virtual Observatories provide both framework and data system elements; users are already confusing VOs and data providers
- Many VOs are noting the need for better glue, scalability, expandability, etc.
- Success (to date) in utilizing formal methods for interface specification and development using ontologies
- Success in breaking all of the free tools! Commercial tools are under consideration
- Challenges exist for reasoning and interface with scientific datatypes, e.g. complex spatial and temporal concepts
- For VSTO (and SESDI) - more use-cases, populate the interfaces and test for scalability and interoperation in production settings