High Productivity
Computing Systems Program
CASC
HPC Technology Update
Robert Graybill
March 24, 2005
10/3/2015 RBG
Outline
 High-End Computing University Research Activities
 HECURA Status
 High Productivity Computing Systems Program
 Phase II Update
 Vendor Teams
 Council on Competitiveness
 Productivity Team
 Phase III Concept
 Other Related Computing Technology Activities
HECURA – High-End Computing
University Research Activity
Strategy:
 Fund universities in high-end computing research targeting the Nation’s key long-term needs.
Implementation:
 Coordinated FY04 solicitation led by NSA (DARPA $2M per year)
   $1M sent to DOE
   $1M sent to NSF
 Participating agencies
   DARPA, DOE, NSF
 Status
   DOE FY04/05 – Fund FASTOS
   NSF FY04 – Fund Domain Specific Compilation Environment
   NSF FY05 – In process
Potential High-End Computing Research Areas
Hardware
   Microarchitecture
   Memory
   Interconnect
   Power, cooling, and packaging
   I/O and storage
Software
   Operating systems
   Languages, compilers, and libraries
   Tools & development environments
   Algorithms
Systems
   System architecture
   Reliability, availability, & serviceability (RAS)
   System modeling and performance analysis
   Programming models
HEC Research Area: Operating systems
   Lead Agency: Department of Energy / Office of Science
   Website: http://www.sc.doe.gov/grants/Fr04-13.html
HEC Research Area: Languages, compilers, and libraries
   Lead Agency: National Science Foundation
   Website: http://www.nsf.gov/pubs/2004/nsf04569/nsf04569.htm
Outline
 High End Computing University Research Activities
 HECURA Status
 High Productivity Computing Systems Program
 Phase II Update
 Vendor Teams
 Council on Competitiveness
 Productivity Team
 Phase III Concept
 Other Related Computing Technology Activities
High Productivity
Computing Systems
Goal:
 Provide a new generation of economically viable high productivity computing
systems for the national security and industrial user community (2010)
Impact:
 Performance (time-to-solution): speed up critical national security applications by a factor of 10X to 40X
 Programmability (idea-to-first-solution): reduce the cost and time of developing application solutions
 Portability (transparency): insulate research and operational application software from system specifics
 Robustness (reliability): apply all known techniques to protect against outside attacks, hardware faults, and programming errors
HPCS Program Focus Areas
Applications:
 Intelligence/surveillance, reconnaissance, cryptanalysis, weapons analysis, airborne contaminant
modeling and biotechnology
Fill the Critical Technology and Capability Gap
Today (late-1980s HPC technology) ... to ... Future (Quantum/Bio Computing)
HPCS Program Phases I - III
[Figure: HPCS program timeline, calendar years (CY) 02 through 10]
Phases:
 Phase I: Industry Concept Study (Funded Five)
 Phase II: R&D (Funded Three)
 Phase III: Full Scale Development (Fund up to Two)
Industry milestones: 1 through 7
Reviews shown: Concept Review, System Design Review, Technology Assessment Review, PDR, CDR
Productivity Assessment (MIT LL, DOE, DoD, NASA, NSF): Productivity Concepts & Metrics, Experimental Productivity Framework, Productivity Framework Baseline
Products: Early Pilot Platforms, Pilot Systems, HPCS Intermediate Products
Mission Partners: Procurement Decisions
Legend: Program Reviews, Critical Milestones, Program Procurements
Phase II Program Goals
 Phase II Overall Productivity Goals
   Execution (sustained performance) – 1 Petaflop/sec (scalable to greater than 4 Petaflop/sec). Reference: Functional Workflow 3
   Development – 10X over today’s systems. Reference: Functional Workflows 1, 2, 4, 5
 Productivity Framework
   Establish experimental baseline
   Evaluate emerging vendor execution and development productivity concepts
   Provide a solid reference for evaluation of vendors’ Phase III designs
   Provide technical basis for Mission Partner investment in Phase III
   Early adoption or phase-in of execution and development metrics by mission partners
 Subsystem Performance Indicators (Vendor-Generated Goals from Phase I)
   3.2 PB/sec bisection bandwidth
   64,000 GUPS (RandomAccess)
   6.5 PB/sec data streams bandwidth
   2+ PF/s Linpack
HPCchallenge
Documented and Validated Through Simulations, Experiments, Prototypes, and Analysis
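The GUPS (giga-updates per second) indicator above comes from the HPCchallenge RandomAccess test, which applies read-modify-write XOR updates to pseudo-randomly chosen table entries. The following is a minimal sketch of that access pattern only. It assumes a plain 64-bit xorshift generator in place of the benchmark's official random stream, and the names `xorshift64`, `random_access`, and `gups` are hypothetical. It is not the official benchmark, and timings from interpreted Python say nothing about HPCS-class hardware.

```python
import time

MASK64 = (1 << 64) - 1


def xorshift64(state):
    """One step of a 64-bit xorshift generator (assumed stand-in, not the official stream)."""
    state ^= (state << 13) & MASK64
    state ^= state >> 7
    state ^= (state << 17) & MASK64
    return state


def random_access(table, n_updates, seed=1):
    """Apply n_updates XOR updates to pseudo-random slots of table, in place.

    The table length is assumed to be a power of two so that masking
    selects a valid index.
    """
    index_mask = len(table) - 1
    state = seed
    for _ in range(n_updates):
        state = xorshift64(state)
        table[state & index_mask] ^= state
    return table


def gups(n_updates, seconds):
    """Giga-updates per second for n_updates completed in the given time."""
    return n_updates / seconds / 1e9


if __name__ == "__main__":
    table = list(range(1 << 16))
    n = 4 * len(table)
    start = time.perf_counter()
    random_access(table, n)
    elapsed = time.perf_counter() - start
    print(f"{gups(n, elapsed):.6f} GUPS (interpreted Python, illustration only)")
```

Because XOR is its own inverse, replaying the same update stream a second time restores the original table, which gives a simple correctness check for a sketch like this one.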
HPCS I/O Challenges
 1 Trillion files in a single file system
 32K file creates per second
 10K metadata operations per second
 Needed for Checkpoint/Restart files
 Streaming I/O at 30 GB/sec full duplex
 Needed for data capture
 Support for 30K nodes
 Future file systems need low-latency communication
An Envelope on HPCS Mission Partner Requirements
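A back-of-envelope check of what these targets imply can be sketched as follows. It assumes decimal prefixes (K = 1,000, GB = 10^9 bytes) and an even spread of streaming traffic across nodes; neither assumption is stated on the slide, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400

# Targets taken from the slide, with decimal prefixes assumed.
FILES_TOTAL = 1e12             # 1 trillion files in a single file system
CREATES_PER_SEC = 32e3         # 32K file creates per second
STREAM_BYTES_PER_SEC = 30e9    # 30 GB/sec streaming I/O (one direction)
NODES = 30e3                   # 30K nodes


def days_to_populate(files, creates_per_sec):
    """Days needed to create `files` files at a constant create rate."""
    return files / creates_per_sec / SECONDS_PER_DAY


def per_node_bandwidth(total_bytes_per_sec, nodes):
    """Bytes per second per node if traffic is spread evenly (assumption)."""
    return total_bytes_per_sec / nodes


if __name__ == "__main__":
    print(f"{days_to_populate(FILES_TOTAL, CREATES_PER_SEC):.1f} days to create all files")
    print(f"{per_node_bandwidth(STREAM_BYTES_PER_SEC, NODES) / 1e6:.1f} MB/sec per node")
```

Under these assumptions, filling a trillion-file file system at the peak create rate would take about 362 days, and 30 GB/sec spread over 30K nodes works out to about 1 MB/sec per node in each direction.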
Phase II Accomplishments
 Unified and mobilized broad government agency buy-in (vision, technical goals, funding, and active reviewers)
 Driving HPC vendor and industry users’ vision of high-end computing: “To out-compete … We must out-compute!”
 Completed Program Milestones 1 - 4
 SDR – Established credible technical baseline, assessed
program goals and identified challenges
 Technology Assessment Review
 Established “Productivity” as a key evaluation criterion, rather than only “Performance”, through HPCS Productivity Team efforts
 Released “execution time” HPCchallenge & in-the-large
applications benchmarks
 Completed early “Development Time” experiments
 Early commercial buy-in …… Parallel Matlab Announcement
 FY04 HEC-URA awards completed through DOE and NSF
 Developed Draft Phase III Strategy
HPCS System Architectures
Cray / Sun / IBM
Addressing Time-to-Solution
Experimental Codes
Large Multi-Module Codes
Porting Codes
Running Codes
Administration
R&D in New Languages
Chapel (Cray)
X10 (IBM)
Fortress (Sun)
HPCS Vendor Innovations
Non-Proprietary Version
 “Super Sized” scaled up HPC development environments, runtime software,
file I/O and streaming I/O to support 10k to 100K processors
 Intelligent continuous processing optimization (CPO)
 Application optimized configurable heterogeneous computing
 Workflow based productivity analysis
 High bandwidth module/cabinet interconnect fabric
 Capacitive proximity chip/module interconnect – Breaks bandwidth
cost/performance barriers
 Developed prototype high productivity languages
 On track for a 10X improvement in HPC productivity
HPCS Disruptive Technology Will Result in Revolutionary HPC Industry Products in 2010
HPCS Technology Has Already Impacted Vendors’ 2006/2007 Products
Near Term Meetings
 Petascale Applications Workshop
March 22-23 Chicago – Argonne National Lab
 Next HPCS Productivity Team/Task Group meeting
June 28-30, 2005 Reston, VA (General Productivity session
& individual team meetings)
 Second Annual Council on Competitiveness Conference -
HPC: Supercharging U.S. Innovation and Competitiveness
July 13, 2005 Washington, DC
 Milestone V Industry Reviews (Two days)
Week of July 25th (Sun, Cray) and August 2, 3, or 4 (IBM)
Standard Review plus special emphasis on Productivity
Outline
 High-End Computing University Research Activities
 HECURA Status
 High Productivity Computing Systems Program
 Phase II Update
 Vendor Teams
 Council on Competitiveness
 Productivity Team
 Phase III Concept
 Other Related Computing Technology Activities
HPC Industrial Users Survey:
Top-Level Findings
 High Performance Computing
Is Essential to Business
Survival
 Companies Are Realizing a
Range of Financial and
Business Benefits from Using
HPC
 Companies Are Failing to Use
HPC as Aggressively as They
Could Be
 Business and Technical
Barriers Are Inhibiting the Use
of Supercomputing
 Dramatically More Powerful and Easier-to-Use Computers Would Deliver Strategic, Competitive Benefits
Ideal Market for HPC
[Figure: the ideal market for HPC compared with the current market for HPC]
Axis labels: Number of Applications, Number of Tasks, Number of Users; Amount of Computing Power, Storage, & Capability (tick labels 1, 2, 4, 8, 64); # of Dollars
Region labels: Blue-Collar Computing; Blue-Collar HPC; Heroes; Current Market for HPC; DoD, NSF, DoE
Annotations: Easy Pickings; Competitive Necessity; Business ROI; Programmer Productivity; Increased Productivity Gains in Industry and Engineering; Increased Gains in Scientific Discovery
HPC ISV Phase I Survey:
Early Findings – Results in July 05
Application category      ISV packages
Biosciences               66
CAE                       112
Chemistry                 30
Climate                   2*
DCC&D                     1
EDA                       21
Financial                 7
General Science           105*
General Visualization     6
Geosciences               21*
Middleware                79
Weather                   3*
Unknown                   7
Grand Total               460
 So far we have identified 460 ISV packages that are supplied by 279 organizations.
 Some are middleware and some may be cut as we refine the data.
 Domestic/foreign sources will be identified.
 The issue is that very few of them will scale to peta-scale systems.
Productivity Framework
Productivity = Utility/Cost
\Psi = \frac{U}{C} = \frac{U(T)}{C_S + C_O + C_M}
[Figure: productivity framework flow, with the elements listed below]
 System Parameters (Examples)
   BW bytes/flop (Balance), Memory latency, Memory size, …
   Processor flop/cycle, Processor integer op/cycle, Bisection BW, …
   Size (ft^3), Power/rack, Facility operation, …
   Code size, Restart time (Reliability), Code optimization time, …
   Reliability, Portability
 Benchmarks: Kernel, Compact & Full
 Common Modeling Interface
 Actual System or Model
 Exe Time Experiments and Dev Time Experiments
 Productivity Metrics
 Work Flows
 Productivity (Utility/Cost)
 Utility → U(T): example curves of U versus T labeled Production and Constant
 Captures major elements that go into evaluating a system
 Builds on current HPC acquisition processes
Productivity Team
Sponsors: Bob Graybill (DARPA), Fred Johnson (DOE SC)
Mission Partners: NSA, NRO, DOE, HPCMO, NASA, NSF
Productivity Team Mgmt.: Jeremy Kepner (Lincoln), Bob Lucas (ISI)
Vendor Productivity POCs: David Mizell (Cray), Larry Votta (Sun), TBD (IBM)
Working groups (lead, then participating organizations with head counts in parentheses):
 Development Experiments. Lead: Vic Basili, UMD
   Cray(3), Sun(5), IBM(5), ARSC, UDel, Pitt, UCSB(2), UMD(8), MissSt, ISI(3), Vanderbilt(2), Lincoln(4), LLNL, MIT(2), MITRE, NSA(2), PSC, SDSC(2)
 Existing Code Analysis. Lead: Doug Post, LANL
   Cray(2), Sun(5), IBM(6), ARL, UMD, Oregon, MissSt, DOE, HPCMO, LANL(5), ISI, Vanderbilt(2), Lincoln(4), ANL, MITRE, NASA, ORNL(2), SAIC, Sandia, NSA
 Workflows, Models, Metrics. Lead: Jeremy Kepner, Lincoln
   Cray(4), Sun(7), IBM(6), ARL, UMD(4), Oregon, MissSt, LANL, ISI, Lincoln(4), MITRE, UMN, NASA(2), DOE
 Benchmarks. Lead: David Koester, MITRE
   Cray(2), Sun(6), IBM(3), UIUC(2), UMD(3), UTK(2), UNM, ERDC, GWU, HPCMO, ISI(2), LANL(3), LBL, Lincoln(4), MITRE, UMN, NSA(2), ORNL, OSU, Sandia, SDSC(3)
 High Prod. Lang. Systems. Rusty Lusk (ANL), Hans Zima (JPL), …
 Execution Time Models. Lead: Bob Lucas, ISI
   Cray(2), Sun, IBM, CalTech, UMD, UNM, ISI(3), LANL(2), SDSC, Lincoln(2), MITRE, UMN, ORNL, Sandia
 Test & Spec. Lead: Ashok Krishnamurthy, OSU
   Cray(2), Sun(3), IBM(2), NSA(2), UWisc, UCB, UNM, CodeSourcery, OSU(2), ISI, NRO(2), Instrumental, Lincoln(4), MITRE
Productivity Research Teams
[Figure: the productivity framework diagram (system parameters, benchmarks, Common Modeling Interface, actual system or model, execution and development time experiments, productivity metrics, work flows, productivity as utility/cost), labeled with the working groups below]
 Benchmark Working Group. Lead: David Koester, MITRE
 Test & Spec Working Group. Lead: Ashok Krishnamurthy, OSU
 Execution Time Working Group. Lead: Bob Lucas, USC ISI
 Workflows, Models & Metrics Working Group. Lead: Jeremy Kepner, Lincoln
 Existing Codes Working Group. Lead: Doug Post, LANL
 Development Time Working Group. Lead: Vic Basili, UMD
 High Productivity Language Systems Working Group. Lead: Hans Zima, JPL
Distributed Team Involving a Large Cross Section of the HPC Community
General Productivity Formula
\Psi = \frac{U}{C} = \frac{U(T)}{C_S + C_O + C_M}
Ψ = productivity [utility/$]
U = utility [user specified]
T = time to solution [time]
C = total cost [$]
C_S = software cost [$]
C_O = operation cost [$]
C_M = machine cost [$]
 Utility is the value the user places on getting a result at time T
[Figure: example utility curves U(T) versus T for four cases: Researcher, Enterprise, Production, and Constant]
 Software costs include time spent by users developing their codes
 Operating costs include admin time, electric and building costs
 Productivity formula is tailored by each user through use of functional work
flows
 Developing Large multi-module codes
 Developing Small Codes
 Running applications
 Porting codes
 Administration
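The productivity formula on this slide, utility at time T divided by the sum of software, operation, and machine costs, can be sketched in a few lines. The utility shapes below (`constant_utility`, `deadline_utility`, `decaying_utility`) are hypothetical illustrations: the slide names Researcher, Enterprise, Production, and Constant curves, but their exact shapes are not recoverable from the text, so none of these functions should be read as the program's definition.

```python
def productivity(utility, time_to_solution, software_cost, operating_cost, machine_cost):
    """Psi = U(T) / (C_S + C_O + C_M), in utility per dollar."""
    total_cost = software_cost + operating_cost + machine_cost
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    return utility(time_to_solution) / total_cost


def constant_utility(value):
    """Utility that does not depend on when the result arrives."""
    return lambda t: value


def deadline_utility(value, deadline):
    """Hypothetical step curve: full value up to a deadline, none afterwards."""
    return lambda t: value if t <= deadline else 0.0


def decaying_utility(value, half_life):
    """Hypothetical curve whose value halves every `half_life` time units."""
    return lambda t: value * 0.5 ** (t / half_life)


if __name__ == "__main__":
    # Illustrative numbers only: same costs and time to solution, different utility curves.
    costs = dict(software_cost=10.0, operating_cost=20.0, machine_cost=20.0)
    for name, u in [
        ("constant", constant_utility(100.0)),
        ("deadline at T=4", deadline_utility(100.0, 4.0)),
        ("half-life 2", decaying_utility(100.0, 2.0)),
    ]:
        print(name, productivity(u, time_to_solution=5.0, **costs))
```

Passing the utility as a function keeps the cost side fixed while each user supplies their own valuation of time to solution, which mirrors how the slide says the formula is tailored per user.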
Level 1 Functional Workflows
Enable Time-to-Solution Analysis
[Figure: Level 1 functional workflows]
Steps shown for the code-writing and code-running workflows: Formulate questions; Develop approach; Develop code; Production runs; V&V; Analyze results; Decide, hypothesize
(1) Writing Large Multi-Module Codes
(2) Writing Small Codes
(3) Running Codes
(4) Porting Code: Identify differences; Change code; Optimize
(5) Administration: Problem resolution; Resource management; Security management; HW/SW upgrade
 Mission Partners may create their own HPC usage scenarios from
these basic work flow elements
 Items in red represent the areas of highest HPC-specific interest
Small Code Level 2 Work Flow Example
Markov Model - Classroom (UCSB) Data
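[Figure: Markov model of the small-code workflow]
States: Formulate, Program, Compile, Debug, Test, Optimize, Run
Edge labels (transition probability / time), in the order they appear on the slide: 1.0 / 0s; 1.0 / 355s; 1.0 / 49s; .002 / 5s; .95 / 5s; .048 / 9s; 1.0 / 629s; .266 / 5s; 1.0 / 30s; .699 / 4s; .035 / 3s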
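A Markov workflow model of this kind yields an expected time to solution once each transition has a probability and a time. The sketch below shows the calculation on a small hypothetical chain. The states, probabilities, and times in `HYPOTHETICAL_EDGES` are invented for illustration and are not the UCSB classroom data, whose edge-to-label mapping is not recoverable here. The solver is a simple fixed-point iteration, chosen for brevity rather than taken from the source.

```python
# Hypothetical chain: state -> list of (next_state, probability, seconds on that edge).
HYPOTHETICAL_EDGES = {
    "program": [("compile", 1.0, 300.0)],
    "compile": [("test", 0.8, 10.0), ("program", 0.2, 10.0)],
    "test": [("done", 0.5, 20.0), ("program", 0.5, 20.0)],
}


def expected_time(edges, start, done, tol=1e-9, max_sweeps=100_000):
    """Expected seconds to reach `done` from `start`.

    Solves E[s] = sum over edges of p * (seconds + E[next]) with E[done] = 0
    by sweeping the equations repeatedly until the values stop changing.
    """
    expected = {state: 0.0 for state in edges}
    for out_edges in edges.values():
        for target, _, _ in out_edges:
            expected.setdefault(target, 0.0)
    expected[done] = 0.0

    for _ in range(max_sweeps):
        biggest_change = 0.0
        for state, out_edges in edges.items():
            new_value = sum(p * (secs + expected[target]) for target, p, secs in out_edges)
            biggest_change = max(biggest_change, abs(new_value - expected[state]))
            expected[state] = new_value
        if biggest_change < tol:
            return expected[start]
    raise RuntimeError("expected time did not converge")


if __name__ == "__main__":
    print(f"{expected_time(HYPOTHETICAL_EDGES, 'program', 'done'):.1f} seconds")
```

For this hypothetical chain the equations can also be solved by hand, giving 815 seconds from "program" to "done"; the loops back to "program" are what push the total well above the 330 seconds of a single error-free pass.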
HPCS Benchmark Spectrum
[Figure: spectrum of HPCS benchmarks, from system bounds to full applications]
 8 HPCchallenge Benchmarks
   Local: DGEMM, STREAM, RandomAccess, 1D FFT
   Global: Linpack, PTRANS, RandomAccess, 1D FFT
 (~40) Micro & Kernel Benchmarks
   HPCS Spanning Set of Kernels: Discrete Math, …; Graph Analysis, …; Linear Solvers, …; Signal Processing, …
 (~10) Compact Applications
   3 Scalable Compact Apps: Pattern Matching, Graph Analysis, Signal Processing
   3 Petascale/s Simulation (Compact) Applications
 9 Simulation Applications
   Current: UM2000, GAMESS, OVERFLOW, LBMHD/GTC, RFCTH, HYCOM
   Near-Future: NWChem, ALEGRA, CCSM
Other figure labels: Existing Applications, Emerging Applications, Future Applications; Intelligence, Reconnaissance, Simulation, …, I/O, Others; Classroom Experiment Codes; System Bounds, Execution Bounds, Execution Indicators, Execution and Development Indicators
The spectrum of benchmarks provides different views of the system
 HPCchallenge pushes spatial and temporal boundaries; sets performance bounds
 Applications drive system issues; set legacy code performance bounds
 Kernels and Compact Apps allow deeper analysis of execution and development time
HPCchallenge Bounds Performance
[Figure: HPCchallenge benchmarks and mission partner applications plotted by spatial locality (horizontal axis, low to high, 0.00 to 1.00) against temporal locality (vertical axis, low to high, 0.00 to 1.00)]
HPCS Challenge Points (HPCchallenge Benchmarks): HPL, PTRANS, STREAM, RandomAccess, FFT
Mission Partner Applications and other points shown: RFCTH2, Test3D, AVUS, CG, HYCOM, OOCore, Overflow, Gamess
http://icl.cs.utk.edu/hpcc/
 HPCchallenge
Pushes spatial and temporal boundaries
Defines architecture performance bounds
HPCchallenge Website
Kiviat Diagram Example — AMD Configurations
Not all TOP500 systems are created equal!
HPCS/Mission Partner
Productivity Team is
Providing an HPC System
Analysis Framework
Development Time Activities (1)
Victor R. Basili - Team Lead
 Created the infrastructure for conducting experimental studies in
the field of high performance computing program development
Designed and conducted Classroom studies
A Total of 7 HPC classes were studied and data from 15
assignments was collected and analyzed
Designed and conducted observational studies (study HPC experts working on small assignments)
2 observational studies have been conducted and analyzed
Designed and conducted case studies (study HPC experts working on real projects)
Conducted 2 case studies, 1 of which is complete
Developed a refined experimental design for experiments
in 2005
Development Time Activities (2)
 Developed a downloadable instrumentation package
Looking for expert volunteers to download and use the
package
 Built knowledge about how to conduct experiments in
the HPC environment
 Tested and evaluated data collection tools
Hackystat
Eclipse
 Developed new hypotheses
Developed and analyzed list of HPCS folklore
Developed and analyzed list of common HPCS
defects
Measuring Development Time
[Figure: development-time study types arranged by validity and cost]
 Real Applications: 2 case studies
 Small Projects: 2 observational studies
 Classroom Studies: 7 HPC classes studied (15 projects, ~100 students)
 HPC Center Tutorials: developed downloadable package
 New data collection tools (Hackystat, Eclipse)
Axes: Validity, Cost
 Developing a new methodology for conducting these tests
 Comparing programming models and languages
 Measuring: performance achieved, effort, and expertise
 Workflows: steps and time spent in each step
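Measuring "steps and time spent in each step" comes down to turning a timestamped activity record into per-step totals. The sketch below shows one way to do that, assuming a hypothetical log of `(start_seconds, step_name)` entries. This format is an assumption for illustration and is not the output format of Hackystat, Eclipse, or the team's instrumentation package.

```python
from collections import defaultdict


def time_per_step(events, end_time):
    """Total seconds spent in each workflow step.

    events: list of (start_seconds, step_name), sorted by start time.
    Each step is assumed to last until the next entry begins; the last
    step lasts until end_time.
    """
    totals = defaultdict(float)
    boundaries = events[1:] + [(end_time, None)]
    for (start, step), (next_start, _) in zip(events, boundaries):
        totals[step] += next_start - start
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical session: program, compile, test, then back to program.
    session = [
        (0, "program"),
        (355, "compile"),
        (360, "test"),
        (369, "program"),
        (500, "compile"),
    ]
    for step, seconds in time_per_step(session, end_time=505).items():
        print(f"{step}: {seconds:.0f}s")
```

Totals like these are the raw material for the workflow models: dividing the count of each observed transition by the number of visits to its source state gives estimated transition probabilities.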
Outline
 High-End Computing University Research Activities
 HECURA Status
 High Productivity Computing Systems Program
 Phase II Update
 Vendor Teams
 Council on Competitiveness
 Productivity Team
 Phase III Concept
 Other Related Computing Technology Activities
HPCS Draft Phase III Program
[Figure: draft HPCS Phase III program timeline, calendar years (CY) 02 through 11]
Phases:
 Phase I: Industry Concept Study (Funded Five)
 Phase II: R&D (Funded Three)
 Phase III: System Development & Demonstration
Industry milestones: 1 through 7
Reviews and events shown: Concept Review, System Design Review, Technology Assessment Review, HPLS Plan, PDR, SCR, CDR, DRR, Early Demo, Final Demo, SW Rel 1, SW Rel 2, SW Rel 3, SW Dev Unit, Deliver Units
Mission Partners: MP Peta-Scale Procurements, Mission Partner Peta-Scale Application Dev, Mission Partner Dev Commitment, Mission Partner System Commitment, MP Language Dev
Productivity Assessment (MIT LL, DOE, DoD, NASA, NSF)
Legend: Program Reviews, Critical Milestones, Program Procurements
Outline
 High-End Computing University Research Activities
 HECURA Status
 High Productivity Computing Systems Program
 Phase II Update
 Vendor Teams
 Council on Competitiveness
 Productivity Team
 Phase III Concept
 Other Related Computing Technology Activities
Related Technologies
Systems That Know What They’re Doing
 Intelligent Systems
 Architectures for Cognitive Information Processing (ACIP)
 High-End Application
Responsive Computing
 High Productivity Computing
Systems Program (HPCS)
 Mission Responsive
Architectures
 Polymorphous Computing
Architectures Program (PCA)
 Power Management
 Power Aware Computing and
Communications Program (PAC/C)
Figure labels: + HECURA; + OneSAF Objective System; + XPCA; Mission; Protocols; Micro Architectures; Vdd Scaling; Clock Gating; Compilers/OS; Algorithms