Unidata THREDDS*: Moving Toward
Interoperable Data Systems
*THematic Real-time Environmental Distributed Data Services
Ben Domenico
October 2003
Sponsored by the National Science Foundation
http://www.nsf.gov
1
Topics
• Traditional Unidata Approach
– Mainly meteorological data
– Subscription system pushes data to user sites
– Unidata Program Center provides data
analysis tools for use on data at user sites
• THREDDS Enhancements
– Broader menu of Earth system data
– Local client access from remote servers
– Less arcane, more general and accessible
tools
– Integration of data and analysis tools into
educational modules and digital libraries
2
Unidata Community Today
• More than 160 institutions
– Includes over 100 academic departments
plus government agencies and private
sector research groups
– Does not count separate installations, e.g.
Spanish weather service IDD, US Weather
Service radar data system
• Interdisciplinary from the outset: 1996
survey showed over 2/3 of institutions
had some uses outside meteorology
(oceanography, hydrology, climatology, civil
engineering, environmental science…)
3
Community Impact Survey
• Over 21,000 college students per year use
Unidata tools and data in classrooms and
labs
• Nearly 4,000 women/minority students
• More than 1,800 faculty and research staff
• Over 55,000 K-12 students involved through
Unidata-connected university programs
• Informal education: in excess of 1 million hits
at Unidata-based university web sites per day
• 97% of community report being satisfied or
very satisfied
4
Principal Activities
of the Unidata Program Center
• Facilitating Data Access to a broad
spectrum of observations & forecasts (in near
real time)
• Providing Tools to visualize, analyze,
organize, receive, & share data at university
sites
• Supporting Faculty who use Unidata
systems at colleges & universities (most in
the U.S.)
• Building and Advocating for a Community
where data, tools, & best practices in
education/research are shared
5
Traditional Unidata Data Types
• Individual observations from weather
stations around the globe
• Satellite imagery
• Radar data from 160 NEXRAD radars
• Output from weather forecast model runs
at the National Centers for Environmental
Prediction
• Lightning strike data
• Measurements from sensors on
commercial aircraft
6
1 km Radar Image
7
IDD: The Community in Action
• The Internet-based system by which
universities acquire huge quantities of
weather data in near-real time (i.e. ASAP)
typifies Unidata’s community orientation.
• The system has no data center -- all tasks are
performed on the participants’ own (small)
computers.
• Currently the most used “advanced
application” on the Abilene network (2-3% in
terms of packets and bytes transferred)
8
Internet Data Distribution (IDD)
with Multiple Sources (Injecting 17 Gigabytes per Day)
[Diagram: several Source nodes inject data into a network of LDM relay nodes connected through the Internet]
Using LDM software for instant data relaying, ~160
institutions cooperate to acquire a wide range of real-time, global, atmospheric & oceanic observations, model
outputs, remotely sensed images..., in a coordinated
community effort.
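The relay idea on this slide can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the topology and node names are hypothetical, and nothing here uses or imitates the actual LDM software interface. It shows one product pushed from a source through relay nodes to leaf sites, with each node receiving it once.

```python
from collections import deque

# Hypothetical relay topology: each node lists the downstream nodes it feeds.
# Node names are illustrative and do not describe the real IDD topology.
TOPOLOGY = {
    "source": ["relay_a", "relay_b"],
    "relay_a": ["site_1", "site_2"],
    "relay_b": ["site_3"],
    "site_1": [],
    "site_2": [],
    "site_3": [],
}


def push_product(topology, origin, product):
    """Push one product from origin to every reachable node, breadth-first.

    Returns a dict mapping node name -> list of products received.
    """
    received = {node: [] for node in topology}
    queue = deque([origin])
    seen = {origin}
    while queue:
        node = queue.popleft()
        received[node].append(product)
        for downstream in topology[node]:
            if downstream not in seen:  # never deliver the same product twice
                seen.add(downstream)
                queue.append(downstream)
    return received


deliveries = push_product(TOPOLOGY, "source", "radar_scan_0001")
print(sorted(node for node, items in deliveries.items() if items))
```

Because every node forwards to its own downstream list, no central data center is needed, which is the point the IDD slides make.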
9
Typical Data Handling
at a Unidata Site
[Diagram: data arrive via the IDD (forecast model output, satellite imagery, weather station observations, radar data, lightning, aircraft, GPSmet, etc.). Decoders convert the local data into application-specific formats, which Unidata users running local analysis and display tools read through application-specific protocols.]
11
Thematic Data Servers
(combining IDD “push” with several
forms of “pull” and DL discovery)
[Diagram: local user applications (e.g., LAS, McIDAS, IDV, VGEE, IDL, MatLab...) reach thematic servers for hydrology data, geophysical data, and satellite images (each fed by, e.g., IDD) through client/server data access protocols (e.g., OpenDAP, ADDE, WCS, FTP). Discovery goes through DLESE, the Digital Library for Earth-System Education, via a DL interchange protocol.]
12
THREDDS
THematic Real-time Environmental Distributed Data Services
Connecting people, documents and data
People
Documents
Data
13
THREDDS Overview
• National Science Digital Library (NSDL)
“collections” project
• Integrating real-time environmental data
into
– Online educational materials
– Digital libraries (DLESE, NSDL)
• Two-year grant from NSF Division of
Undergraduate Education (DUE)
• Second generation under negotiation
• Led by Unidata Program Center (UPC)
14
THREDDS Data Providers
• University of Alabama Huntsville (Sara Graves, Rahul Ramachandran, Steve Tanner, Ken Keiser)
• ARM (Atmospheric Radiation Measurement, Chris Klaus)
• CDC, the Climate Diagnostics Center (Roland Schweitzer)
• COLA, Center for Ocean-Land-Atmosphere Studies (Joe Wielgosz)
• University of Florence (Stefano Nativi)
• GMU, George Mason University (Menas Kafatos and Ruixin Yang)
• IRI/LDEO, International Research Institute/Lamont Doherty Earth Observatory (Benno Blumenthal)
• ESG, the Earth System Grid (Luca Cinquini, NCAR/SCD)
• IRIS DMC, Incorporated Research Institutions for Seismology Data Management Center (Rob Casey)
• NCAR, the National Center for Atmospheric Research (Don Middleton)
• NCDC, the National Climatic Data Center (Ben Watkins)
• NGDC, National Geophysical Data Center (Ted Habermann)
• NOMADS, NOAA Operational Model Archive and Distribution System (Glenn Rutledge, NCDC)
• University of Oklahoma (Kelvin Droegemeier)
• PMEL, the Pacific Marine Environmental Laboratory (Steve Hankin)
• FNMOC, Fleet Numerical Meteorology and Oceanography Center (Phil Sharfstein)
• SSEC, the Space Science and Engineering Center, U. of Wisconsin-Madison (Steve Ackerman, Tom Whittaker)
• Unidata Community ADDE servers (Tom Yoksas, Unidata Program Center)
• CIESIN (Consortium for International Earth Science Information Network, Bob Downs)
• CUAHSI (Consortium of Universities for Advancement of Hydrologic Science, David Maidment)
• ESIG/NCAR (NCAR Environmental Societal Impacts Group, Bob Harriss)
• Earthscope (UCAR UNAVCO, Chuck Meertens)
• GEON (GEOsciences Network, Chaitan Baru, UCSD San Diego Supercomputer Center)
• ESRI GIS Community
15
THREDDS
Analysis/Display Tool Builders
• Data Discovery Toolkit and Foundry based on EDMI (Earth Data
Multimedia Instrument, New Media Studio, Bruce Caron).
• GDS, GrADS/DODS Server (COLA, Center for Ocean-Land-Atmosphere Studies,
Joe Wielgosz)
• IDV, Integrated Data Viewer (Unidata Program Center, Don Murray)
• INGRID (IRI/LDEO, International Research Institute/Lamont Doherty Earth
Observatory, Benno Blumenthal)
• LAS, Live Access Server (PMEL, the Pacific Marine Environmental
Laboratory, Steve Hankin)
• VGEE, Virtual Geophysical Exploration Environment (NCAR, DLESE, U. of
Illinois, Unidata, many collaborators)
• WXWISE Applets (SSEC, the Space Science and Engineering Center, U.
of Wisconsin-Madison, Tom Whittaker)
• ESRI GIS Clients (ESRI, Inc., Jack Dangermond, President)
• OGC Clients (Open GIS Consortium, David Schell, President)
• MyWorld (Northwestern educational GIS Client, Danny Edelson)
16
THREDDS Interoperability Partners
• ADDE, Abstract Data Distribution Environment (University of Wisconsin – Madison, Tom Yoksas)
• DIMES, DIstributed MEtadata System (George Mason University, Ruixin Yang)
• DODS/OPeNDAP/Aggregation Server, Distributed Oceanographic Data System/Open source Project for a Network Data Access Protocol (University of Rhode Island, Unidata, Ethan Davis)
• DLESE, Digital Library for Earth System Education (Rajul Pandya)
• ESML, Earth System Markup Language (University of Alabama-Huntsville, Rahul Ramachandran)
• ESRI, Environmental Systems Research Institute (various)
• GCMD, Global Change Master Directory (Gene Major)
• OGC and ISO Standards (University of Florence, Stefano Nativi)
• ADL Gazetteer Services (The University of California, Santa Barbara, Linda Hill and Michael Goodchild)
• DLESE Evaluation Services (The University of Colorado CIRES, Susan Buhr)
• DLESE Data Services (Tamara Ledley)
• DLESE Program Center, Digital Library for Earth System Education (Mary Marlino)
• ESRI (Jack Dangermond, President)
• OPeNDAP (The University of Rhode Island, Open source Project for a Network Data Access Protocol, formerly DODS, Peter Cornillon)
• LAITS (Laboratory for Advanced Information Technology and Standards, Liping Di, George Mason University)
• NSDL Evaluation Services (University of Colorado, Tamara Sumner)
• OGC (Open GIS Consortium, David Schell, President)
• SWEET (Semantic Web for Earth and Environmental Terminology, Rob Raskin)
17
Unidata’s Contributions
• A large, (inter)national, active, cooperative academic
user community
• Coordination of many disparate contributors
(universities, government agencies, digital libraries,
commercial vendors, standards bodies…)
• Reliable, automated, real-time data systems
• Platform-independent 5D visualization with HTML
document integration
• Basic inventory catalog generator and server
software
• Client-side catalog access modules
18
Funding Sources
• Unidata 2003/2008 (NSF Division of
Atmospheric Sciences)
• THREDDS NSDL Collections Grant (NSF
Division of Undergraduate Education)
• DODS/OPeNDAP (University of Rhode
Island subcontract on National Oceanographic
Partnership Program Grant and NASA
Earth Science Enterprise)
• NWS/COMET Case Studies (NOAA NWS)
19
The Web
People
Documents
Data
• Well-developed
connections
– Document references
– Embedded multimedia
– Embedded interactive
applets
• Powerful tools
– Google
– Dreamweaver
– Web-site management
tools
– Web services
21
Data Access Technologies
People
Documents
Data
• Web-based data interactions
with passive gif images -- most
analysis work done on remote
server
• Traditional Unidata IDD with
analysis on local clients
• Combinations with Web
browse and FTP delivery for
local analysis
• Client/server, e.g.,
DODS/OPeNDAP
• All lack sophisticated, text-based Web search/discovery
tools and coherent integration
23
THREDDS is the Bottom Line
People
Documents
Data
• Associate words of the science
with available datasets
• Create “compound” documents
pointing to datasets
• Connect analysis tools to
documents and datasets
• Wide range of compound documents
– Lists of datasets available on server with brief
description of dataset classes
– Online publications pointing to datasets illustrating
concepts
• Massive arsenal of Web and Digital Library
search/discovery tools can be applied to
compound documents
26
[Diagram: People, Documents, and Data linked by Discovery and Publication Tools, Discovery and Publication Services, Analysis and Visualization Tools, THREDDS Middleware, and Data Services]
29
Remote Catalog Query
and Data Access via
Local Analysis Tool
[Diagram: People, Documents, and Data linked by Discovery and Publication Tools, Discovery and Publication Services, Analysis and Visualization Tools, THREDDS Middleware, and Data Services. Annotations: user accesses remote catalog via local analysis tool; user accesses remote dataset via local analysis tool.]
30
Basic Compound Document
THREDDS Server Inventory Catalog
• Inventory list of
datasets on server
• Generated
automatically with
minimal human
input
• Viewed from within
analysis and display
application
• Can be harvested
for inclusion in
GCMD, DLESE,
NSDL for use by
module builders
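As an illustration of how a client or harvester might read such an inventory, here is a minimal Python sketch. It assumes the catalog is XML in the THREDDS InvCatalog 1.0 namespace, which may differ from what servers emitted at the time of this talk. The sample catalog, dataset names, and `urlPath` values are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the THREDDS InvCatalog 1.0 schema (assumed catalog version).
CATALOG_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Hypothetical catalog content, invented for illustration only.
SAMPLE_CATALOG = """<catalog name="Example inventory catalog"
    xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0">
  <dataset name="Model output">
    <dataset name="Example model run A" urlPath="model/run_a.nc"/>
    <dataset name="Example model run B" urlPath="model/run_b.nc"/>
  </dataset>
  <dataset name="Radar">
    <dataset name="Example radar mosaic" urlPath="radar/mosaic.nc"/>
  </dataset>
</catalog>
"""


def list_datasets(catalog_xml):
    """Return (name, urlPath) for every dataset element that has a urlPath.

    Container datasets without a urlPath are skipped; nesting is flattened.
    """
    root = ET.fromstring(catalog_xml)
    tag = f"{{{CATALOG_NS}}}dataset"
    return [
        (ds.get("name"), ds.get("urlPath"))
        for ds in root.iter(tag)
        if ds.get("urlPath") is not None
    ]


for name, path in list_datasets(SAMPLE_CATALOG):
    print(f"{name}: {path}")
```

The same flattened list is the kind of input a harvester could turn into discovery records for a digital library.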
31
Enhanced Metadata Catalog
32
Compound Publication: Educational
Module within Interactive Analysis Tool
• Discovery at
DLESE
• module at DPC
• VGEE tool at
Unidata
• Datasets at NCAR
• Lends itself well to
Web discovery
tools, DL
integration
• Can be:
– education module
– online scientific
publication
33
Browser-based Thin Client Access
• LDEO/IRI web site
publishes catalog of
datasets available on
server at UCAR
• Catalog resides and
is updated at UCAR
• Browsing of datasets
on UCAR server
from LDEO server
• Also enables
analysis and display
of datasets on UCAR
server using tools on
LDEO server
34
DLESE Search
35
Search Results
36
Interactive Lesson
37
Lesson with Data Tool Loading
38
Interactive Tool Loads
with Data from Server
39
Other Server Data is Accessible
40
Vis Tool with Concept Models
41
Stepwise creation of third-party
enhanced catalogs/case studies
• Begin with basic inventory catalog
• Crawler traverses datasets listed in basic
catalog and adds location “bounding box” to
location-enhanced catalog
• Gazetteer service examines location-enhanced
catalogs to create a catalog of datasets
associated with named region on Earth
• Evolve to “event” gazetteer with 5-dimensional
bounding box (e.g., model output datasets
related to “Storm of the Century” with vorticity
above a threshold – a distributed case study)
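The first three steps can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the dataset names, their extents, the gazetteer entry, and the helper functions are hypothetical, and the crawler is replaced by a lookup table instead of actually opening datasets. The 5-dimensional event gazetteer is not attempted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Longitude/latitude box in degrees (ignores antimeridian wrap-around)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other):
        return not (
            self.east < other.west or other.east < self.west
            or self.north < other.south or other.north < self.south
        )


# Step 1: basic inventory catalog (dataset names only; hypothetical).
BASIC_CATALOG = ["gulf_sst", "alps_precip", "plains_radar"]

# Stand-in for what a crawler would read from each dataset's coordinates.
DATASET_EXTENTS = {
    "gulf_sst": BoundingBox(-98.0, 18.0, -80.0, 31.0),
    "alps_precip": BoundingBox(5.0, 43.0, 17.0, 48.0),
    "plains_radar": BoundingBox(-105.0, 35.0, -95.0, 45.0),
}

# Hypothetical gazetteer: named region -> approximate bounding box.
GAZETTEER = {"Colorado": BoundingBox(-109.05, 37.0, -102.05, 41.0)}


def crawl(basic_catalog, extents):
    """Step 2: build a location-enhanced catalog (name -> bounding box)."""
    return {name: extents[name] for name in basic_catalog}


def datasets_for_region(enhanced_catalog, gazetteer, region):
    """Step 3: list datasets whose box overlaps the named region."""
    box = gazetteer[region]
    return sorted(
        name for name, extent in enhanced_catalog.items()
        if extent.intersects(box)
    )


enhanced = crawl(BASIC_CATALOG, DATASET_EXTENTS)
print(datasets_for_region(enhanced, GAZETTEER, "Colorado"))
```

Each stage only reads the output of the previous one, so a third party could run the crawler or the gazetteer without any change to the original data server.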
42
[Diagram: Digital Library Catalog System. Components shown: data servers (model output, weather obs) with THREDDS inventory catalog generators; ESML or NcML generator; data mining engine; event gazetteer; third-party THREDDS catalog servers; enhanced catalogs and a case study catalog; data catalog harvesting into the digital library.]
43
ISCCP Collection Metadata
44
Future Directions
• Standards-based web services approach
to providing both data and metadata
• Integrate GIS clients and servers into
THREDDS for access to societal impacts,
infrastructure, hydrology data, etc.
• Work with OGC and ISO to incorporate
emerging standard access protocols into
THREDDS
• Actively participate in future DLESE Data
Access Working Group and Data Services
workshops to create more compound
document educational modules.
45
THREDDS, GIS, DL Interoperability
[Diagram: THREDDS client applications reach THREDDS servers (satellite, radar, forecast model output, … datasets) via OGC or OPeNDAP, ADDE, FTP… protocols. GIS client applications reach GIS servers (demographic, infrastructure, societal impacts, … datasets) via OGC or proprietary GIS protocols. OpenGIS protocols (WMS, WFS, WCS) connect the two sides. Metadata crosswalks from both server groups feed Open Archives Initiative (OAI) Metadata Harvesting into Digital Library Discovery Systems.]
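For the OAI metadata harvesting step, a harvester issues plain HTTP requests whose query string names an OAI-PMH verb. The sketch below only builds such a request URL and does not contact any server. The base URL is a placeholder, and the function name is an assumption for this example.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc",
                         set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    metadata_prefix "oai_dc" asks for unqualified Dublin Core records;
    set_spec and from_date are optional selective-harvesting arguments.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    if from_date is not None:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Placeholder repository address, not a real THREDDS or DLESE endpoint.
url = oai_list_records_url("http://example.org/oai", from_date="2003-10-01")
print(url)
```

A digital library could run a request like this periodically to pick up only the records changed since its last harvest.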
46
Summary
• Universities have used Unidata tools to
acquire, analyze, and display real-time
atmospheric data for nearly 20 years
• THREDDS – along with related client/server
access and display technologies – makes an
even broader menu of Earth system data available to a
more diverse community of users
• THREDDS technologies enable the creation of
compound educational modules and
scientific publications with embedded pointers
to datasets and tools.
47
Data System Emphases
• Constant, real-time data streams
– Dozens of source classes
– ~10 products per second
– Up to 2 GB/hour
• Discovery centers
– GCMD (DIF metadata)
– DLESE (ADN metadata)
– NSDL (DC metadata)
• Forecast model output is central
– Future time
– Time relative to present
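The stream figures above can be turned into rough bandwidth numbers. This is back-of-the-envelope arithmetic only. It assumes decimal gigabytes and assumes that the ~10 products per second and the 2 GB/hour peak occur together, which the slide does not state.

```python
SECONDS_PER_HOUR = 3600

# Figures quoted on the slide; decimal GB (10**9 bytes) is an assumption.
peak_bytes_per_hour = 2 * 10**9
products_per_second = 10

peak_bytes_per_second = peak_bytes_per_hour / SECONDS_PER_HOUR
peak_megabits_per_second = peak_bytes_per_second * 8 / 10**6
# Average product size if both figures describe the same moment (assumption).
avg_product_bytes = peak_bytes_per_second / products_per_second

print(f"peak rate: {peak_bytes_per_second / 1000:.0f} kB/s "
      f"({peak_megabits_per_second:.2f} Mbit/s)")
print(f"implied average product size: {avg_product_bytes / 1000:.1f} kB")
```

Under those assumptions the peak stream is a few megabits per second, made of many small products, not a few large files.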
48
ADEPT/DLESE/NASA
ADN Metadata
• Title - the name of the resource
• URL or access information - the url to an online resource or access
information to a physical object
• Description - a narrative describing the content, purpose or goal of the
resource
• Subject - general topic areas that the resource is about or covers
• Technical requirements - information related to platform requirements,
browsers and plug-ins, etc.
• Resource type - an indication to the type of educational resource, such as
lab exercises and tutorials, etc.
• Audience - grade range of the resource
• Copyright - copyright statement and any other restricted usage or lack
thereof about the cataloged resource
• Cost - indication as to whether there is a cost associated with accessing or
using the resource
• Resource creator - contact information for the author or publisher of a
resource
• Resource cataloger - contact information for the cataloger of a resource
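To make the field list concrete, the sketch below models one catalog record as a Python dataclass and reports which fields are still empty. The class name, attribute names, and sample values are assumptions chosen to mirror the list above. They are not the element names of the actual ADN XML schema.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class ResourceRecord:
    """Illustrative record mirroring the ADN fields listed on the slide."""
    title: str
    url: str
    description: str
    subjects: list = field(default_factory=list)
    technical_requirements: str = ""
    resource_type: str = ""
    audience: str = ""
    copyright: str = ""
    cost: str = ""
    resource_creator: str = ""
    resource_cataloger: str = ""


def missing_fields(record):
    """Return the names of fields that are empty, in declaration order."""
    return [name for name, value in asdict(record).items() if not value]


# Hypothetical record; the URL is a placeholder.
record = ResourceRecord(
    title="Jet stream lesson",
    url="http://example.org/lesson",
    description="Students explore jet stream winds in forecast model output.",
    subjects=["meteorology"],
)
print(missing_fields(record))
```

A cataloging tool could run a check like this before submitting a record to a discovery center.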
49
Forecast Model Output:
Jet Stream Winds with Surface Temp
50
Integrated Analysis and Display
• Local analysis and display tools
• Datasets on distributed remote servers
• Client/server, web services access to
– Metadata
– Datasets
• Moving to Open GIS protocols
51
Integrated Data Visualization
Client
• 3D radar reflectivity from
NCAR server via DODS
protocol
• Visible 1K satellite image from
Wisconsin SSEC via ADDE
protocol
• Balloon sounding temperature
profile from local disk delivered
automatically in real-time via
IDD
• Different sources, protocols,
resolutions, time-scales
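For the DODS case, a client typically asks for part of a remote variable by appending a constraint expression to the dataset URL. The sketch below builds such a URL in the DAP2 style, where each dimension gets an inclusive `[start:stride:stop]` index range. The server address, dataset path, and variable name are hypothetical, and real clients may also need to percent-encode the brackets.

```python
def dap2_url(dataset_url, variable, slices, response="dods"):
    """Build a DAP2-style request URL with a hyperslab constraint.

    slices is a list of (start, stride, stop) index triples, one per
    dimension; stop is inclusive. response is the URL suffix, for example
    "dods" for binary data or "dds" for the dataset structure.
    """
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
    )
    return f"{dataset_url}.{response}?{variable}{hyperslab}"


# Hypothetical dataset and variable: first time step, every 2nd row 10..20.
url = dap2_url(
    "http://example.org/dods/radar.nc",
    "reflectivity",
    [(0, 1, 0), (10, 2, 20)],
)
print(url)
```

Because the subset is selected on the server, only the requested slice crosses the network to the local visualization client.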
53
Need Computer “Use” Metadata
• Knowledge of data structure is necessary
• Requires semantic information, e.g.,
– Standard metadata and data access protocols
– Standard quantities
– Standard units of measure
– Connection to controlled
vocabularies/ontologies
• XML markup languages
– NcML (NetCDF markup language)
– ESML (Earth Science markup language)
– GML (Geography markup language)
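As a small example of machine-readable "use" metadata, the sketch below reads the units attached to each variable in an NcML document. It assumes the NcML 2.2 namespace, which may postdate the version available at the time of this talk. The sample document and variable names are invented.

```python
import xml.etree.ElementTree as ET

# NcML 2.2 namespace (assumed version).
NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"

# Hypothetical NcML describing two variables and their units.
SAMPLE_NCML = """<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <dimension name="time" length="4"/>
  <variable name="temperature" shape="time" type="float">
    <attribute name="units" value="K"/>
    <attribute name="long_name" value="air temperature"/>
  </variable>
  <variable name="wind_speed" shape="time" type="float">
    <attribute name="units" value="m s-1"/>
  </variable>
</netcdf>
"""


def variable_units(ncml_text):
    """Return {variable name: units string} for variables that declare units."""
    root = ET.fromstring(ncml_text)
    units = {}
    for var in root.iter(f"{{{NCML_NS}}}variable"):
        for attr in var.findall(f"{{{NCML_NS}}}attribute"):
            if attr.get("name") == "units":
                units[var.get("name")] = attr.get("value")
    return units


print(variable_units(SAMPLE_NCML))
```

With units exposed this way, a tool can decide how to convert or label a quantity without a person reading the dataset documentation.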
54
More Information
• http://my.unidata.ucar.edu/
• http://www.unidata.ucar.edu/projects/THREDDS/
• [email protected]
55