Grid Workflow
Midwest Grid Workshop Module 6
Goals
 Enhance scientific productivity through:
   - Discovery and application of datasets and programs at petabyte scale
   - Enabling use of a worldwide data grid as a scientific workstation
Goals of using grids through scripting
 Provide an easy on-ramp to the grid
 Utilize massive resources with simple scripts
 Leverage multiple grids like a workstation
 Empower script-writers to empower end users
 Track and leverage provenance in the science process
Classes of Workflow Systems
 Earlier generation business workflow systems
   - Document management, forms processing, etc.
 Scientific laboratory management systems
   - LIMS, “wet lab” workflow
 Service-oriented workflow systems
   - BPEL, BPDL, Taverna/SCUFL, Triana
 Application-oriented workflow
   - Kepler, DAGman, P-Star, VisTrails, Karajan
 VDS: First-generation Virtual Data System
   - Pegasus, Virtual Data Language
 Pegasus/Wings
   - Pegasus with OWL/RDF workflow specification
 Swift workflow system
   - Karajan with typed and mapped VDL - SwiftScript
VDS – The Virtual Data System
 Introduced Virtual Data Language - VDL
   - A location-independent parallel language
 Several planners
   - Pegasus: main production planner
   - Euryale: experimental “just in time” planner
   - GADU/GNARE – user application planner (D. Sulakhe, Argonne)
 Provenance
   - Kickstart – app launcher and tracker
   - VDC – virtual data catalog
Virtual Data and Workflows
 The challenge is managing and organizing the vast computing and storage capabilities provided by Grids
 Workflow expresses computations in a form that can be readily mapped to Grids
 Virtual data keeps accurate track of data derivation methods and provenance
 Grid tools virtualize the location and caching of data, and recovery from failures
Virtual Data Origins: The Grid Physics Network
Enhance scientific productivity through:
 Discovery, application and management of data and processes at all scales
 Using a worldwide data grid as a scientific workstation
The key to this approach is Virtual Data – creating and managing datasets through workflow “recipes” and provenance recording. Virtual Data workflow abstracts Grid details.
Example Application: High Energy Physics Data Analysis
[Figure: a family tree of related analysis workflows, each varying parameters such as mass = 200; decay = bb, ZZ, or WW; stability = 1 or 3; LowPt = 20; HighPt = 10000; event = 8; plot = 1. Virtual data records how each derived dataset differs from its siblings.]
Work and slide by Rick Cavanaugh and Dimitri Bourilkov, University of Florida
The core essence: Basic data analysis programs
[Figure: a data analysis program reads an input file of raw detector data (e.g. CMS.ECal.2006.0405) together with parameters such as bins = 60, xmin = 40.5, and ymin = .003, and produces a plot.]
Expressing Workflow in VDL

Define a “function” wrapper for each application, with “formal arguments”:

TR grep (in a1, out a2) {
  argument stdin = ${a1};
  argument stdout = ${a2};
}
TR sort (in a1, out a2) {
  argument stdin = ${a1};
  argument stdout = ${a2};
}

Define a “call” to invoke each application, providing “actual” argument values:

DV grep (a1=@{in:file1}, a2=@{out:file2});
DV sort (a1=@{in:file2}, a2=@{out:file3});

Applications connect via output-to-input dependencies:
file1 → grep → file2 → sort → file3
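The TR/DV example above can be read as a small dependency-inference problem: the planner links derivations wherever one call's output file is another call's input. A minimal Python sketch of that inference (illustrative only; `infer_dependencies` and the data encoding are not part of VDS):

```python
# Sketch: infer job ordering from file inputs/outputs, the way a VDL
# planner chains DV declarations. Illustrative names, not the VDS API.

def infer_dependencies(derivations):
    """derivations: {name: (input_files, output_files)} as sets.
    Returns (producer, consumer) edges wherever an output feeds an input."""
    producer_of = {}
    for name, (_ins, outs) in derivations.items():
        for f in outs:
            producer_of[f] = name
    edges = []
    for name, (ins, _outs) in derivations.items():
        for f in ins:
            if f in producer_of:
                edges.append((producer_of[f], name))
    return edges

# The grep/sort example: file1 -> grep -> file2 -> sort -> file3
dvs = {
    "grep": ({"file1"}, {"file2"}),
    "sort": ({"file2"}, {"file3"}),
}
edges = infer_dependencies(dvs)
```

Because the chaining is inferred from file names, neither DV declaration has to mention the other: reordering the declarations yields the same graph.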
Executing VDL Workflows
 Workflow spec: a VDL program, stored in the Virtual Data catalog
 Create execution plan: the Virtual Data Workflow Generator produces an abstract workflow, which the Pegasus Planner turns into a DAG
 Grid workflow execution: DAGman & Condor-G run the DAG on the grid, including job planner and job cleanup steps
[Figure: the resulting plan shown as a large DAG.]
...and collecting Provenance
 Specify workflow: VDL is stored in the Virtual Data catalog; the Virtual Data Workflow Generator produces the abstract workflow
 Create and run DAG: the Pegasus Planner emits a DAGman script, executed by DAGman & Condor-G
 Grid workflow execution (on worker nodes): each application (e.g. grep reading file1 and writing file2, then sort producing file3) runs under a launcher that returns provenance data to the provenance collector
What must we “virtualize” to compute on the Grid?
 Location-independent computing: represent all workflow in abstract terms
 Declarations not tied to specific entities:
   - sites
   - file systems
   - schedulers
 Failures – automated retry for data server and execution site unavailability
Mapping the Science Process to Workflows
 Start with a single workflow
 Automate the generation of workflow for sets of files (datasets)
 Replicate workflow to explore many datasets
 Change parameters
 Change code – add new transformations
 Build new workflows
 Use provenance info
How does Workflow Relate to Provenance?
 Workflow – specifies what to do
 Provenance – tracks what was done
[Figure: a job moves from “What I Want to Do” through “What I Am Doing” to “What I Did” – states Waiting → Executable → Executing → Executed – driven by edit, schedule, and execution-environment events, all queryable.]
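The state progression above can be sketched as a tiny state machine whose transitions are logged as provenance records (a Python illustration under invented names, not the VDS provenance schema):

```python
# Sketch of the job lifecycle: Waiting -> Executable -> Executing ->
# Executed. Each transition is appended to a provenance log, so the
# "what was done" record falls out of executing the workflow.
STATES = ["Waiting", "Executable", "Executing", "Executed"]

class Job:
    def __init__(self, name):
        self.name = name
        self.state = "Waiting"      # "what I want to do"
        self.provenance = []        # "what I did", queryable afterwards

    def advance(self):
        i = STATES.index(self.state)
        if i + 1 < len(STATES):
            self.state = STATES[i + 1]
            self.provenance.append((self.name, self.state))
        return self.state

job = Job("grep")
while job.state != "Executed":
    job.advance()
```

Querying `job.provenance` after the run answers "what was done" without any extra bookkeeping by the user.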
Having interface definitions also
facilitates provenance tracking.
Functional MRI Analysis
[Figure: the AIRSN spatial-normalization workflow – input volumes 3a, 4a, 5a, 6a plus a reference flow through align_warp, reslice, softmean, slicer, and convert stages, producing atlas images atlas_x.jpg, atlas_y.jpg, atlas_z.jpg.]
Workflow courtesy James Dobson, Dartmouth Brain Imaging Center
LIGO Inspiral Search Application
The Inspiral workflow application is the work of Duncan Brown, Caltech; Scott Koranda, UW Milwaukee; the ISI Pegasus Team; and the LSC Inspiral group
Example Montage Workflow
 ~1200-node workflow, 7 levels
 Mosaic of M42 created on the TeraGrid using Pegasus
http://montage.ipac.caltech.edu/
Blasting for Protein Knowledge
BLAST comparison against the complete nr database for sequence similarity and function characterization
 Knowledge base: PUMA is an interface that lets researchers find information about a specific protein after it has been analyzed against the complete set of sequenced genomes (the nr file, approximately 3 million sequences)
 Analysis on the Grid: the analysis of the protein sequences occurs in the background in the grid environment. Millions of processes are started, since several tools are run to analyze each sequence, such as finding protein similarities (BLAST), protein family domain searches (BLOCKS), and structural characteristics of the protein.
FOAM: Fast Ocean/Atmosphere Model
250-member ensemble run on TeraGrid under VDS
[Figure: for each ensemble member 1..N – remote directory creation, then the FOAM run, then atmosphere (Atmos), ocean (Ocean), and coupler (Coupl) postprocessing; results are transferred to archival storage.]
Work of: Rob Jacob (FOAM), Veronika Nefedova (workflow design and execution)
TeraGrid and VDS speed up climate modelling
[Figure: climate visualization comparing a supercomputer run with TeraGrid runs using NMI and VDS.]
FOAM application by Rob Jacob, Argonne; VDS workflow by Veronika Nefedova, Argonne. Visualization courtesy Pat Behling and Yun Liu, UW Madison.
VDS
 Virtual Data System
 Virtual Data Language (VDL)
   - A language to express workflows
 Pegasus planner
   - Decides how the workflow will run
 Virtual Data Catalog (VDC)
   - Stores information about workflows
   - Stores provenance of data
Virtual Data Process
 Describe data derivation or analysis steps in a high-level workflow language (VDL)
 VDL is cataloged in a database for sharing by the community
 Grid workflows are generated from VDL
 Provenance of derived results is stored in the database for assessment or verification
Planning with Pegasus
The Pegasus Planner combines high-level application knowledge, resource information and configuration, and data location information to turn a DAX abstract workflow (generated from VDL) into a plan that can be submitted to the grid (e.g. Condor submit files).
Abstract to Concrete, Step 1: Workflow Reduction
[Figure: the abstract workflow – jobs A–F linked by files f.ip, f.a, f.b, f.c, f.d, f.e, f.out – shown before and after reduction.]
 File f.d already exists somewhere: reuse it
 Mark jobs D and B for deletion
 Delete Job D and Job B
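Assuming the reduction rule described above (reuse outputs that already exist, then drop jobs whose outputs nobody needs), a minimal Python version might look like this. `reduce_workflow` and the job encoding are invented for illustration, not the actual Pegasus code:

```python
def reduce_workflow(jobs, existing, final):
    """jobs: {name: (input_files, output_files)} as sets.
    existing: files already available (e.g. in a replica catalog).
    final: files the user ultimately wants.
    Returns the names of jobs that still need to run."""
    remaining = {n: (set(i), set(o)) for n, (i, o) in jobs.items()}
    # A job whose outputs all exist already need not run (job D).
    for name, (_ins, outs) in list(remaining.items()):
        if outs and outs <= existing:
            del remaining[name]
    # Repeatedly drop jobs whose outputs are neither consumed by a
    # surviving job nor requested as final results (job B).
    changed = True
    while changed:
        changed = False
        consumed = set(final)
        for ins, _outs in remaining.values():
            consumed |= ins
        for name in list(remaining):
            if not (remaining[name][1] & consumed):
                del remaining[name]
                changed = True
    return set(remaining)

# The slide's example: f.d exists, so D is pruned, and then B, whose
# only output f.b fed D, is pruned as well.
jobs = {
    "A": ({"f.ip"}, {"f.a"}),
    "B": ({"f.a"}, {"f.b"}),
    "C": ({"f.a"}, {"f.c"}),
    "D": ({"f.b"}, {"f.d"}),
    "E": ({"f.c"}, {"f.e"}),
    "F": ({"f.d", "f.e"}, {"f.out"}),
}
needed = reduce_workflow(jobs, existing={"f.ip", "f.d"}, final={"f.out"})
```

The loop matters: pruning D is what makes B prunable, so the "unused outputs" check must be repeated until it converges.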
Step 2: Site Selection & Addition of Data Stage-in Nodes
[Figure: three stages – the reduced workflow, the workflow after site selection, and the workflow with stage-in jobs SI(f.ip) and SI(f.d) added. Legend: unmapped job; job mapped to site A; job mapped to site B; stage-in job.]
Step 3: Addition of Data Stage-out Nodes
[Figure: the workflow with stage-in jobs, and the workflow with stage-out (SO) jobs added to move f.c and the final output f.op to the final output site. Legend adds: stage-out job.]
Step 4: Addition of Replica Registration Jobs
[Figure: the workflow with stage-out jobs, and the workflow with a registration job (Reg f.op) added that registers the generated data. Legend adds: registration job.]
Step 5: Addition of Job-Directory Creation
[Figure: the workflow with registration jobs, and the workflow with directory-creation (Dir) jobs added, one per execution site. Legend adds: make-dir job.]
Final Result of Abstract-to-Concrete Process
[Figure: the original abstract workflow (jobs A–F) side by side with the final concrete workflow – the reduced jobs plus stage-in, stage-out, registration, and directory-creation jobs, each mapped to a site.]
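Steps 2 through 5 amount to wrapping the reduced job list with auxiliary jobs. A hedged Python sketch of that assembly (the function and plan encoding are invented for illustration; the real planner is far more involved):

```python
def make_concrete(jobs, site_of, stage_in_files, final_outputs):
    """Sketch of steps 2-5: add per-site directory creation, stage-in
    jobs for files produced elsewhere, and stage-out plus registration
    jobs for final results. Returns a flat list of plan entries."""
    concrete = []
    for site in sorted(set(site_of.values())):
        concrete.append(("mkdir", site))        # Step 5: per-site work dir
    for f in sorted(stage_in_files):
        concrete.append(("stage_in", f))        # Step 2: move inputs in
    for name in jobs:
        concrete.append(("run", name, site_of[name]))  # mapped jobs
    for f in sorted(final_outputs):
        concrete.append(("stage_out", f))       # Step 3: move results out
        concrete.append(("register", f))        # Step 4: record the replica
    return concrete

# The reduced example workflow: A and C mapped to site A, E and F to
# site B; f.ip and the reused f.d must be staged in; f.out staged out.
plan = make_concrete(
    jobs=["A", "C", "E", "F"],
    site_of={"A": "siteA", "C": "siteA", "E": "siteB", "F": "siteB"},
    stage_in_files={"f.ip", "f.d"},
    final_outputs={"f.out"},
)
```

In the real system each auxiliary entry also carries dependency edges (a stage-in must precede the jobs that read the file); the sketch only shows which jobs get added.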
Swift System improves on VDS/VDL
 Clean separation of logical/physical concerns
   - XDTM specification of logical data structures
+ Concise specification of parallel programs
   - SwiftScript, with iteration, etc.
+ Efficient execution on distributed resources
   - Lightweight threading, dynamic provisioning, Grid interfaces, pipelining, load balancing
+ Rigorous provenance tracking and query (in design)
   - Virtual data schema & automated recording
 Improved usability and productivity
   - Demonstrated in numerous applications
AIRSN: an example program

(Run or) reorientRun (Run ir, string direction) {
  foreach Volume iv, i in ir.v {
    or.v[i] = reorient(iv, direction);
  }
}

(Run snr) functional ( Run r, NormAnat a, Air shrink ) {
  Run yroRun = reorientRun( r , "y" );
  Run roRun = reorientRun( yroRun , "x" );
  Volume std = roRun[0];
  Run rndr = random_select( roRun, 0.1 );
  AirVector rndAirVec = align_linearRun( rndr, std, 12, 1000, 1000, "81 3 3" );
  Run reslicedRndr = resliceRun( rndr, rndAirVec, "o", "k" );
  Volume meanRand = softmean( reslicedRndr, "y", "null" );
  Air mnQAAir = alignlinear( a.nHires, meanRand, 6, 1000, 4, "81 3 3" );
  Warp boldNormWarp = combinewarp( shrink, a.aWarp, mnQAAir );
  Run nr = reslice_warp_run( boldNormWarp, roRun );
  Volume meanAll = strictmean( nr, "y", "null" );
  Volume boldMask = binarize( meanAll, "y" );
  snr = gsmoothRun( nr, boldMask, "6 6 6" );
}
40
VDL/VDS Limitations
 Missing VDL language features
   - Data typing & data mapping
   - Iterators and control-flow constructs
 Run-time complexity in VDS
   - State explosion for data-parallel applications
   - Computation status hard to provide
   - Debugging information complex & distributed
 Performance
   - Still many runtime bottlenecks
The Messy Data Problem
 Scientific data is typically logically structured
   - E.g., hierarchical structure
   - Common to map functions over dataset members
   - Nested map operations can scale to millions of objects
The Messy Data Problem
 But physically “messy”: heterogeneous storage format and access protocol
   - Logically identical datasets can be stored in a textual file (e.g. CSV), spreadsheet, database, …
   - Data available from filesystem, DBMS, HTTP, WebDAV, …
   - Metadata encoded in directory and file names
 This hinders program development, composition, and execution
./Group23
total 58
drwxr-xr-x 4 yongzh users 2048 Nov 12 14:15 AA
drwxr-xr-x 4 yongzh users 2048 Nov 11 21:13 CH
drwxr-xr-x 4 yongzh users 2048 Nov 11 16:32 EC
./Group23/AA:
total 4
drwxr-xr-x 5 yongzh users 2048 Nov 5 12:41 04nov06aa
drwxr-xr-x 4 yongzh users 2048 Dec 6 12:24 11nov06aa
./Group23/AA/04nov06aa:
total 54
drwxr-xr-x 2 yongzh users 2048 Nov 5 12:52 ANATOMY
drwxr-xr-x 2 yongzh users 49152 Dec 5 11:40 FUNCTIONAL
./Group23/AA/04nov06aa/ANATOMY:
total 58500
-rw-r--r-- 1 yongzh users      348 Nov 5 12:29 coplanar.hdr
-rw-r--r-- 1 yongzh users 16777216 Nov 5 12:29 coplanar.img
./Group23/AA/04nov06aa/FUNCTIONAL:
total 196739
-rw-r--r-- 1 yongzh users    348 Nov 5 12:32 bold1_0001.hdr
-rw-r--r-- 1 yongzh users 409600 Nov 5 12:32 bold1_0001.img
-rw-r--r-- 1 yongzh users    348 Nov 5 12:32 bold1_0002.hdr
-rw-r--r-- 1 yongzh users 409600 Nov 5 12:32 bold1_0002.img
-rw-r--r-- 1 yongzh users    496 Nov 15 20:44 bold1_0002.mat
-rw-r--r-- 1 yongzh users    348 Nov 5 12:32 bold1_0003.hdr
-rw-r--r-- 1 yongzh users 409600 Nov 5 12:32 bold1_0003.img
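An XDTM-style mapper bridges this physical layout and the logical structure. As a rough Python analogy (the `map_functional` helper and its grouping scheme are assumptions for illustration, not Swift's mapper API), the `bold<run>_<volume>.hdr/.img` pairs in the listing above can be grouped into runs of volumes:

```python
import re

def map_functional(filenames):
    """Group bold<run>_<index>.hdr/.img pairs into a nested
    {run: {volume: {"hdr": ..., "img": ...}}} logical structure,
    recovering the metadata encoded in the file names."""
    runs = {}
    for name in filenames:
        m = re.match(r"bold(\d+)_(\d+)\.(hdr|img)$", name)
        if not m:
            continue  # skip .mat files and other metadata
        run, vol, ext = int(m.group(1)), int(m.group(2)), m.group(3)
        runs.setdefault(run, {}).setdefault(vol, {})[ext] = name
    return runs

files = ["bold1_0001.hdr", "bold1_0001.img",
         "bold1_0002.hdr", "bold1_0002.img", "bold1_0002.mat"]
vols = map_functional(files)
```

Once the mapper exists, a program can iterate over logical runs and volumes without ever mentioning the naming convention, which is exactly the separation SwiftScript's typed datasets aim for.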
SwiftScript
 Typed parallel programming notation [SIGMOD05, Springer06]
   - XDTM as data model and type system
   - Typed dataset and procedure definitions
 Scripting language
   - Implicit data parallelism
   - Program composition from procedures
   - Control constructs (foreach, if, while, …)
 Benefits: clean application logic; type checking; dataset selection and iteration
A Notation & System for Expressing and Executing Cleanly Typed Workflows on Messy Scientific Data [SIGMOD Record Sep05]
fMRI Type Definitions in SwiftScript

Simplified declarations of fMRI AIRSN (Spatial Normalization):

type Image {};
type Header {};
type Warp {};
type Air {};

type Volume {
  Image img;
  Header hdr;
}
type Run {
  Volume v[ ];
}
type Subject {
  Volume anat;
  Run run[ ];
}
type Group {
  Subject s[ ];
}
type Study {
  Group g[ ];
}
type AirVec {
  Air a[ ];
}
type NormAnat {
  Volume anat;
  Warp aWarp;
  Volume nHires;
}
AIRSN Program Definition

(Run or) reorientRun (Run ir, string direction) {
  foreach Volume iv, i in ir.v {
    or.v[i] = reorient(iv, direction);
  }
}

(Run snr) functional ( Run r, NormAnat a, Air shrink ) {
  Run yroRun = reorientRun( r , "y" );
  Run roRun = reorientRun( yroRun , "x" );
  Volume std = roRun[0];
  Run rndr = random_select( roRun, 0.1 );
  AirVector rndAirVec = align_linearRun( rndr, std, 12, 1000, 1000, "81 3 3" );
  Run reslicedRndr = resliceRun( rndr, rndAirVec, "o", "k" );
  Volume meanRand = softmean( reslicedRndr, "y", "null" );
  Air mnQAAir = alignlinear( a.nHires, meanRand, 6, 1000, 4, "81 3 3" );
  Warp boldNormWarp = combinewarp( shrink, a.aWarp, mnQAAir );
  Run nr = reslice_warp_run( boldNormWarp, roRun );
  Volume meanAll = strictmean( nr, "y", "null" );
  Volume boldMask = binarize( meanAll, "y" );
  snr = gsmoothRun( nr, boldMask, "6 6 6" );
}
SwiftScript Expressiveness
Lines of code with different workflow encodings:

fMRI Workflow | Shell Script | VDL  | Swift
ATLAS1        | 49           | 72   | 6
ATLAS2        | 97           | 135  | 10
FILM1         | 63           | 134  | 17
FEAT          | 84           | 191  | 13
AIRSN         | 215          | ~400 | 34

[Figure: the AIRSN workflow shown both compactly (reorientRun, random_select, alignlinearRun, resliceRun, softmean, alignlinear, combine_warp, reslice_warpRun, strictmean, binarize, gsmoothRun) and fully expanded into dozens of reorient, alignlinear, reslice, reslice_warp, and gsmooth node instances.]
Collaboration with James Dobson, Dartmouth [SIGMOD Record Sep05]
Swift Architecture
[Figure: a SwiftScript specification is compiled (consulting the Virtual Data Catalog) into an abstract computation; the execution engine (Karajan with the Swift runtime) schedules it onto virtual nodes. Swift runtime callouts handle status reporting to a provenance collector and provisioning via a dynamic resource provisioner (e.g. on Amazon EC2); launchers on worker nodes run each app (F1, F2), reading and writing files file1, file2, file3 and returning provenance data.]
Using Swift
[Figure: the swift command takes a SwiftScript program, a site list, and an app list; it launches apps a1 and a2 on worker nodes, moving data files f1, f2, f3 between them, while collecting workflow status, logs, and provenance data from each launcher.]
Swift uses Karajan Workflow Engine
 Fast, scalable threading model
 Suitable constructs for control flow
 Flexible task dependency model
   - “Futures” enable pipelining
 Flexible provider model allows use of different run-time environments
   - Job execution and data transfer
 Flow controlled to avoid resource overload
 Workflow client runs from a Java container
Java CoG Workflow, Gregor von Laszewski et al., Workflows for Science, 2007
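Futures-based pipelining can be illustrated in plain Python: each stage-2 task is chained to its own stage-1 result, so later stage-1 items keep running while earlier items have already entered stage 2 (an analogy to Karajan's model, not its API; the stage functions are invented):

```python
# Pipelining via futures: no barrier between stages. Each item of
# stage 2 is submitted as soon as that item's stage-1 future resolves,
# while the remaining stage-1 futures continue in the pool.
from concurrent.futures import ThreadPoolExecutor

def stage1(x):
    return x * 2        # stand-in for a first processing step

def stage2(y):
    return y + 1        # stand-in for a dependent second step

def run_pipelined(items):
    with ThreadPoolExecutor(max_workers=4) as pool:
        stage1_futures = [pool.submit(stage1, x) for x in items]
        # Chain stage 2 off each future individually rather than
        # waiting for the whole of stage 1 to finish.
        stage2_futures = [pool.submit(stage2, f.result())
                          for f in stage1_futures]
        return [f.result() for f in stage2_futures]

results = run_pipelined([1, 2, 3])
```

The same idea at grid scale is what cut the fMRI workflow's run time in the performance example later in this deck: stage boundaries stop being synchronization barriers.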
Application example: ACTIVAL – Neural activation validation
Identifies clusters of neural activity not likely to be active by random chance: switch the labels of the conditions for one or more participants; calculate the delta values in each voxel; re-calculate the reliability of delta in each voxel; and evaluate the clusters found. If the clusters in the data are greater than the majority of the clusters found in the permutations, the null hypothesis is refuted, indicating that the clusters of activity found in the experiment are not likely to be found by chance.
Work by S. Small and U. Hasson, UChicago.
SwiftScript Workflow ACTIVAL – Data types and utilities

type script {}
type fullBrainData {}
type fullBrainSpecs {}
type brainMeasurements {}
type precomputedPermutations {}
type brainDataset {}
type brainClusterTable {}
type brainDatasets { brainDataset b[]; }
type brainClusters { brainClusterTable c[]; }

// Procedure to run the "R" statistical package
(brainDataset t) bricRInvoke (script permutationScript, int iterationNo,
    brainMeasurements dataAll, precomputedPermutations dataPerm) {
  app { bricRInvoke @filename(permutationScript) iterationNo
        @filename(dataAll) @filename(dataPerm); }
}

// Procedure to run the AFNI clustering tool
(brainClusterTable v, brainDataset t) bricCluster (script clusterScript,
    int iterationNo, brainDataset randBrain, fullBrainData brainFile,
    fullBrainSpecs specFile) {
  app { bricPerlCluster @filename(clusterScript) iterationNo
        @filename(randBrain) @filename(brainFile)
        @filename(specFile); }
}

// Procedure to merge results based on statistical likelihoods
(brainClusterTable t) bricCentralize ( brainClusterTable bc[]) {
  app { bricCentralize @filenames(bc); }
}
ACTIVAL Workflow – Dataset iteration procedures

// Procedure to iterate over the data collection
(brainClusters randCluster, brainDatasets dsetReturn) brain_cluster
    (fullBrainData brainFile, fullBrainSpecs specFile)
{
  int sequence[] = [1:2000];
  brainMeasurements dataAll<fixed_mapper; file="obs.imit.all">;
  precomputedPermutations dataPerm<fixed_mapper; file="perm.matrix.11">;
  script randScript<fixed_mapper; file="script.obs.imit.tibi">;
  script clusterScript<fixed_mapper; file="surfclust.tibi">;
  brainDatasets randBrains<simple_mapper; prefix="rand.brain.set">;
  foreach int i in sequence {
    randBrains.b[i] = bricRInvoke(randScript, i, dataAll, dataPerm);
    brainDataset rBrain = randBrains.b[i];
    (randCluster.c[i], dsetReturn.b[i]) =
        bricCluster(clusterScript, i, rBrain, brainFile, specFile);
  }
}
ACTIVAL Workflow – Main Workflow Program

// Declare datasets
fullBrainData brainFile<fixed_mapper; file="colin_lh_mesh140_std.pial.asc">;
fullBrainSpecs specFile<fixed_mapper; file="colin_lh_mesh140_std.spec">;
brainDatasets randBrain<simple_mapper; prefix="rand.brain.set">;
brainClusters randCluster<simple_mapper; prefix="Tmean.4mm.perm",
    suffix="_ClstTable_r4.1_a2.0.1D">;
brainDatasets dsetReturn<simple_mapper; prefix="Tmean.4mm.perm",
    suffix="_Clustered_r4.1_a2.0.niml.dset">;
brainClusterTable clusterThresholdsTable<fixed_mapper; file="thresholds.table">;
brainDataset brainResult<fixed_mapper; file="brain.final.dset">;
brainDataset origBrain<fixed_mapper; file="brain.permutation.1">;
// Main program – executes the entire workflow
(randCluster, dsetReturn) = brain_cluster(brainFile, specFile);
clusterThresholdsTable = bricCentralize (randCluster.c);
brainResult = makebrain(origBrain,clusterThresholdsTable,brainFile,specFile);
Performance example: fMRI workflow
 4-stage workflow (subset of AIRSN): 476 jobs of <10 secs CPU each, 119 jobs per stage
 No pipelining: 24 minutes (idle uc-teragrid cluster, via GRAM to Torque)

Example Performance Optimizations
 Jobs pipelined between stages: 19 minutes
 With pipelining and clustering (up to 6 jobs clustered into one GRAM job): 8 minutes
 With pipelining and CPU provisioning: 2.2 minutes

Load Balancing
 Load balancing between UC-TeraPort (OSG), 260 jobs, and UC-TeraGrid (IA32), 216 jobs
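The clustering optimization above (up to 6 jobs per GRAM submission) works by amortizing per-job submission overhead across a batch. A trivial sketch of the batching step, with invented names:

```python
def cluster_jobs(jobs, max_per_cluster=6):
    """Group short jobs into batches so that each grid submission
    (e.g. one GRAM job) carries up to max_per_cluster jobs,
    amortizing the fixed per-submission cost. Sketch only, not the
    Swift implementation."""
    return [jobs[i:i + max_per_cluster]
            for i in range(0, len(jobs), max_per_cluster)]

# 14 short jobs become 3 submissions instead of 14.
batches = cluster_jobs([f"job{i}" for i in range(14)], max_per_cluster=6)
```

For jobs of under 10 seconds of CPU each, as in the fMRI example, the submission overhead dominates, which is why clustering alone cut the run from 19 to 8 minutes.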
Development Status
 Initial release is available for evaluation
 Performance measurement and tuning efforts active
 Adapting to OSG Grid info and site conventions
 Many applications in progress and under evaluation
   - Astrophysics, molecular dynamics, neuroscience, psychology, radiology
 Provisioning mechanism progressing
 Virtual data catalog re-integration starting ~April
 Collating language feedback – focus is on mapping
 Web site for docs, downloads and more info: www.ci.uchicago.edu/swift
Conclusion
 Swift is in the early stages of its development and its transition from the VDS virtual data language
 Application testing is underway in neuroscience, molecular dynamics, astrophysics, radiology, and other fields
   - Providing valuable feedback for language refinement and finalization
 SwiftScript is proving to be a productive language, while feedback from usage is still shaping it
   - Positive comments from VDL users – radiology in particular
 Ongoing performance evaluation and improvement is yielding exciting results
 The major initial focus is usability – good progress on improving time-to-get-started and on ease of debugging
Acknowledgements
 The Swift effort is supported by DOE (Argonne LDRD), NSF (i2u2, GriPhyN, iVDGL), NIH, and the UChicago Computation Institute
 Team
   - Ben Clifford, Ian Foster, Mihael Hategan, Veronika Nefedova, Tiberiu Stef-Praun, Mike Wilde, Yong Zhao
 Java CoG Kit
   - Mihael Hategan, Gregor von Laszewski, and many collaborators
 User-contributed workflows and Swift applications
   - ASCI Flash, I2U2, UC Human Neuroscience Lab, UCH Molecular Dynamics, UCH Radiology, caBIG
   - Ravi Madduri, Patrick McConnell, and the caGrid team of caBIG
Based on:
The Virtual Data System – a workflow toolkit for science applications
OSG Summer Grid Workshop, Lecture 8, June 29, 2006

Based on: Fast, Reliable, Loosely Coupled Parallel Computation
Tiberiu Stef-Praun
Computation Institute, University of Chicago & Argonne National Laboratory
[email protected]
www.ci.uchicago.edu/swift
Acknowledgements
The technologies and applications described here were made possible by the following projects and support:
 GriPhyN, iVDGL, the Globus Alliance and QuarkNet, supported by the National Science Foundation
 The Globus Alliance, PPDG, and QuarkNet, supported by the US Department of Energy, Office of Science
 Support was also provided by NVO, NIH, and SCEC