Welcome to CW 2008!!!
The Condor Project
(Established ‘85)
Distributed Computing research
performed by a team of ~35 faculty,
full-time staff and students who
 face software/middleware engineering challenges
in a UNIX/Linux/Windows/OS X environment,
 are involved in national and international collaborations,
 interact with users in academia and industry,
 maintain and support a distributed production
environment (more than 4000 CPUs at UW),
 and educate and train students.
“… Since the early days of mankind the
primary motivation for the establishment of
communities has been the idea that by being
part of an organized group the capabilities
of an individual are improved. The great
progress in the area of inter-computer
communication led to the development of
means by which stand-alone processing subsystems
can be integrated into multicomputer ‘communities’. …”
Miron Livny, “Study of Load Balancing Algorithms for
Decentralized Distributed Processing Systems,”
Ph.D. thesis, July 1983.
Main Threads of Activities
Distributed Computing Research – develop and
evaluate new concepts, frameworks and technologies
Keep Condor “flight worthy” and support our users
The Open Science Grid (OSG) – build and operate a
national High Throughput Computing infrastructure
The Grid Laboratory Of Wisconsin (GLOW) – build,
maintain and operate a distributed computing and
storage infrastructure on the UW campus.
The NSF Middleware Initiative – develop, build
and operate a national Build and Test facility
powered by Metronome
Future of
Grid Computing
Miron Livny
Computer Sciences Department
University of Wisconsin-Madison
The Talmud says in the name of Rabbi
“Since the destruction of the
Temple, prophecy has been
taken from prophets and
given to fools and children.”
(Baba Batra 12b)
The Grid Computing Movement
I believe that, as a movement, grid
computing has run its course.
No more an easy source of funding
No more an easy way to get the “troops”
No more an easy sell of software tools
No more an easy way to get your papers
published or your press releases posted
“The term “the Grid” was coined in the mid 1990s to denote a proposed
distributed computing infrastructure for advanced science and
engineering [27]. Considerable progress has since been made on the
construction of such an infrastructure (e.g., [10, 14, 36, 47]) but the term
“Grid” has also been conflated, at least in popular perception, to embrace
everything from advanced networking to artificial intelligence. One might
wonder if the term has any real substance and meaning. Is there really a
distinct “Grid problem” and hence a need for new “Grid technologies”? If so,
what is the nature of these technologies and what is their domain of
applicability? While numerous groups have interest in Grid concepts and
share, to a significant extent, a common vision of Grid architecture, we do not
see consensus on the answers to these questions.”
“The Anatomy of the Grid: Enabling Scalable Virtual Organizations,” Ian
Foster, Carl Kesselman and Steven Tuecke, 2001.
Distributed Computing
Distributed computing is here to stay
and to continue to evolve as processing,
storage and communication resources
get more powerful and cheaper
Big science is inherently distributed
Most scientific disciplines (and many
commercial sectors) depend on High
Throughput Computing (HTC) capabilities
Keynote 3: When All Computing Becomes Grid
Speaker: Prof. Daniel A. Reed
Chancellor’s Eminent Professor
Director, Renaissance Computing Institute
University of North Carolina at Chapel Hill
Scientific computing is moving rapidly from a world of “reliable,
secure parallel systems” to a world of distributed software, virtual
organizations and high-performance, though unreliable parallel and
distributed systems with few guarantees of availability and quality of
service. In addition, a tsunami of new experimental and computational
data poses equally vexing problems in analysis, transport, visualization
and collaboration. This transformation poses daunting scaling and
reliability challenges and necessitates new approaches to collaboration,
software development, performance measurement, system reliability
and coordination. This talk describes Renaissance approaches to
solving some of today’s most challenging scientific and societal
problems using Grids and parallel systems, supported by rich tools for
performance analysis, reliability assessment and workflow
As we return to the
fundamentals and stay
away from hype and the
technologies of the
moment, we will advance
the state of the art in
distributed computing
is Stronger
than Ever
[Charts: Downloads per month; Fractions per month]
Language Weaver Executive Summary
• Incorporated in 2002
– USC/ISI startup that commercializes statistical-based machine translation software
• Continuously improved language pair offering in
terms of language-pair coverage and translation
– More than 50 language pairs
– Center of excellence in Statistical Machine
Translation and Natural Language Processing
IT Needs
• The Language Weaver Machine Translation
systems are trained automatically on large
amounts of parallel data.
• Training/learning processes implement
workflows with hundreds of steps, which use
thousands of CPU hours and which generate
hundreds of gigabytes of data
• Robust/fast workflows are essential for rapid
experimentation cycles
Solution: Condor
• Condor-specific workflows adequately manage
thousands of atomic computational steps/day.
• Advantages:
– Robustness – good recovery from failures
– Well-balanced utilization of existing IT
The Road Ahead
Green Computing
Computing in the Clouds
“Launch and Leave” Computing
Turn-on of the LHC
Broader and larger community of contributors
More and bigger campus grids
Fetching work from “other” sources
Multi-Core nodes
Low latency and short jobs
Staging data through Storage Elements
Thank you for building such
a wonderful community

OSG Integration