How to…
Various rules for how (not) to behave
Kathy Yelick
Derived from:
How to Give a Bad talk: The Ten Commandments
by David A. Patterson (slides by Rolf Riedi)
Twelve Ways to Fool the Masses: Scientific
Malpractice in High-Performance Computing
by David Bailey
1. Thou shalt not waste space
Poster board is expensive.
Your ideas are priceless.
My Space-Efficient Poster: Make sure to
cover all white space -- no borders, or
other separating between topics. Minimize
line spacing. Use the minimum font
legible visible from 1 foot away.
1. Thou shalt not waste space
Poster board is expensive.
Your ideas are priceless.
My Space-Efficient Poster:
• Make sure to cover all white space:
• no borders, or other separating
between topics.
• Minimize line spacing.
• Use the minimum font legible visible
from 1 foot away.
2. Thou shalt not be neat
Why vaste research time on prepare poster?
Ignore spell’g, grammer and legibilite.
2. Thou shalt not be neat
Why waste research time on preparing slides?
Ignore spelling, grammar and legibility.
Who cares what 30 people think?
3. Thou shalt not covet brevity
Do you want to promote the stereotype that
computer scientists can't write? Always use
complete sentences, never just key words. If
possible, use whole paragraphs to make sure
your visitors will have to stand by your poster for
a long time just to read the text.
3. Thou shalt not covet brevity
Use key words.
Don’t plan to read your poster.
4. Omit needless background
• Assume you will always be present
• No need for the poster to tell its own story
Don’t bother to label graphs – your memory is fine
Use inside lingo (e.g., Bassi, PSI, my laptop)
4. Omit needless background
• Write poster to be reused without you
• Critical information should be there
Label all axes on graphs (“Mflop/sec” not “speed”)
Use globally understood terminology (e.g., “IBM Power5 with
Federation switch” or “Pentium 4 with Gigabit
5. Covet Content over Structure
Just get the facts on the poster, don’t worry about
Humans can be trained to read right-to-left,
bottom-to-top, or any other order
Experience with foreign languages proves this
What we would do
with more time
Results on 8
Why this problem is Outline of our
planned solutions
6. Thou shalt not use color
use of color indicates
uncareful research.
It's also unfair to emphasize
some words over others.
Using color doesn’t mea
a fancy plotter
7. Thou shalt not illustrate
• Confucius
◊ ``A picture is a 1000 words,''
• but Dijkstra says
◊ ``Pictures are for weak minds.''
• Who are you going to believe?
Wisdom from the ages or
the person who first counted goto's?
8. Let the Poster Speak for Itself
• Do not stand near your poster
• Do not think about what you’re going to say to visitors
• If you worked in a team, let your partner do all the talking
9. Reuse, Recycle, Reclaim
• Once the paper is written, you can just glue the pages to the poster
board, right?
10. Do Not Plan Ahead
Why waste research time thinking about the
It could take an hour out of your several weeks of
project work.
How can you appear spontaneous if you plan
Don’t worry about presentation when you’re
collecting results
Don’t get any feedback on your results
Commandment X is most important.
• Even if you break the other nine, this one can
save you.
Twelve Ways to Fool the Masses:
Scientific Malpractice in High-Performance Computing
David H. Bailey
Lawrence Berkeley National Laboratory
Lessons From History
High standards of honesty and scientific rigor must be vigilantly
enforced within a field.
Rigorous peer review is essential.
Scientific research must be based on solid empirical data and
careful, objective analysis of that data.
Scientists must be willing to provide all details of the experimental
environment, so others can reproduce their results.
A “politically correct” conclusion is no excuse for poor scholarship.
Erudite-sounding technical terminology and fancy mathematical
formulas are no substitutes for sound reasoning.
Hype has no place in the scientific enterprise.
“There is a real world; its properties are not social constructs;
facts and evidence do matter.” – Sokal
History of Parallel Computing
1976-1986: Initial research studies and demos.
1986-1990: First large-scale systems deployed.
1990-1994: Successes over-hyped; faults ignored.
Shoddy measurement methods used.
Questionable performance claims made.
1994-1998: Numerous firms fail; agencies cut funds.
1998-2002: Reassessment.
2002-2006: Recovering? Or slipping again into hype?
Parallel System Performance Practices, circa
Performance results on small-sized parallel systems
were linearly scaled to full-sized systems.
Example: 8,192-CPU results were linearly scaled to
65,536-CPU results, simply by multiplying by 8.
Rationale: “We can’t afford a full-sized system.”
Sometimes this was done without any clear disclosure
in the paper or presentation.
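A minimal sketch of why multiplying by 8 is unsafe. It assumes, purely for illustration, that run time follows Amdahl's law with a small serial fraction; the slides do not state this model, and the serial fraction and function names below are hypothetical.

```python
def amdahl_speedup(processors: int, serial_fraction: float) -> float:
    """Speedup over one processor under Amdahl's law (an assumed model)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def extrapolation_error(small: int, full: int, serial_fraction: float) -> float:
    """Ratio of the linearly scaled claim to the modeled full-system speedup."""
    linear_claim = amdahl_speedup(small, serial_fraction) * (full / small)
    modeled = amdahl_speedup(full, serial_fraction)
    return linear_claim / modeled


if __name__ == "__main__":
    # Hypothetical serial fraction of 0.01%; a real code must be measured.
    s = 1.0e-4
    print(f"modeled speedup on  8,192 CPUs: {amdahl_speedup(8192, s):8.0f}")
    print(f"modeled speedup on 65,536 CPUs: {amdahl_speedup(65536, s):8.0f}")
    print(f"linear scaling overstates by:   {extrapolation_error(8192, 65536, s):8.2f}x")
```

Under this assumed model, the linear claim matches the modeled speedup only when the serial fraction is exactly zero. Any other value makes the extrapolated figure an overstatement.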
Parallel System Performance Practices, circa
Highly tuned programs were compared with
untuned implementations on other systems.
In comparisons with vector systems, often little or
no effort was made to tune the vector code.
This was the case even for comparisons with
SIMD parallel systems – here the SIMD code can
be directly converted to efficient vector code.
Parallel System Performance Practices, circa
Inefficient algorithms were employed, requiring
many more operations, in order to exhibit an
artificially high Mflop/s rate.
Some scientists employed explicit PDE schemes
for applications where implicit schemes were
known to be much better.
One paper described doing a discrete Fourier
transform by direct computation, rather than by
using an FFT ($8n^2$ operations rather than $5n \log_2 n$).
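The size of this inflation can be sketched directly from the two operation counts quoted above. The function names are illustrative, and the counts are taken as given from the slide rather than derived here.

```python
import math


def direct_dft_ops(n: int) -> float:
    """Operation count for a direct DFT, using the slide's 8n^2 figure."""
    return 8.0 * n * n


def fft_ops(n: int) -> float:
    """Operation count for an FFT, using the slide's 5n log2(n) figure."""
    return 5.0 * n * math.log2(n)


def inflation_factor(n: int) -> float:
    """How many times more operations the direct DFT performs than the FFT."""
    return direct_dft_ops(n) / fft_ops(n)


if __name__ == "__main__":
    for n in (1024, 65536, 1048576):
        print(f"n = {n:>8}: direct DFT does {inflation_factor(n):,.0f}x the work of an FFT")
```

If both methods took the same wall-clock time, the Mflop/s rate quoted for the direct method would be higher by this same factor, even though no more useful work is done.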
Parallel System Performance Practices, circa
Performance rates on 32-bit floating-point data on
one system were compared with rates on 64-bit
data on other systems.
Using 32-bit data instead of 64-bit data effectively
doubles data bandwidth, thus yielding artificially
high performance rates.
Some computations can be done safely with 32-bit
floating-point arithmetic, but most cannot.
Even 64-bit floating-point arithmetic is not enough
for some scientific applications – 128-bit is required.
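A small sketch of both effects, using only the Python standard library. The helper `to_float32` is a hypothetical name; it rounds a 64-bit value to 32-bit by packing it with `struct`.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Bytes moved per value: 32-bit data needs half the bandwidth of 64-bit data.
BYTES_PER_FLOAT32 = struct.calcsize("<f")
BYTES_PER_FLOAT64 = struct.calcsize("<d")

if __name__ == "__main__":
    big = 2.0 ** 24
    print("bytes per value:", BYTES_PER_FLOAT32, "vs", BYTES_PER_FLOAT64)
    # Adding 1 to 2^24 is exact in 64-bit arithmetic but is lost in 32-bit.
    print("64-bit: 2^24 + 1 changes the value:", big + 1.0 != big)
    print("32-bit: 2^24 + 1 changes the value:", to_float32(big + 1.0) != big)
```

The halved storage is where the bandwidth advantage comes from, and the lost increment is one example of why a 32-bit result is not automatically comparable to a 64-bit one.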
Parallel System Performance Practices, circa
In some cases, performance experiments reported in
published results were not actually performed.
Abstract of published paper:
“The current Connection Machine implementation runs at 300-800
Mflop/s on a full [64K] CM-2, or at the speed of a single processor of
a Cray-2 on 1/4 of a CM-2.”
Buried in text:
“This computation requires 568 iterations (taking 272 seconds) on a
16K Connection Machine.”
I.e., the computation was not run on a full 64K CM-2.
“In contrast, a Convex C210 requires 909 seconds to compute this
example. Experience indicates that for a wide range of problems, a
C210 is about 1/4 the speed of a single processor Cray-2, …”
I.e., the computation was not run on a Cray-2 at all – it was run on a
Convex system, and a very dubious conversion factor was used.
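The arithmetic behind the abstract's claim can be reconstructed from the figures quoted above. This is a sketch of how the numbers appear to fit together, not a statement of how the paper's authors computed them; the variable names are illustrative.

```python
# Figures quoted from the paper's text.
CM2_FULL_PROCESSORS = 64 * 1024      # size of a full CM-2
CM2_RUN_PROCESSORS = 16 * 1024       # size actually used for the run
CM2_RUN_SECONDS = 272.0              # measured on the 16K machine
CONVEX_C210_SECONDS = 909.0          # measured on the Convex C210
C210_TO_CRAY2_SPEED = 0.25           # the paper's rule-of-thumb conversion

# Fraction of a full CM-2 that was actually used.
cm2_fraction = CM2_RUN_PROCESSORS / CM2_FULL_PROCESSORS

# Cray-2 time implied by the conversion factor; this was never measured.
implied_cray2_seconds = CONVEX_C210_SECONDS * C210_TO_CRAY2_SPEED

if __name__ == "__main__":
    print(f"fraction of a full CM-2 used: {cm2_fraction}")
    print(f"measured time on 16K CM-2:    {CM2_RUN_SECONDS} s")
    print(f"implied (unmeasured) Cray-2:  {implied_cray2_seconds} s")
```

Only two timings in this chain were measured: the 16K CM-2 run and the Convex run. Both the full-machine rate and the Cray-2 comparison rest on scaling factors.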
Parallel System Performance Practices, circa
Scientists were just as guilty as commercial
vendors of questionable performance claims.
The examples in my files were written by
professional scientists and published in peer-reviewed journals and conference proceedings.
One example is from an award-winning paper.
Scientists in some cases accepted free computer
time or research funds from vendors, but did not
disclose this fact in their papers.
Scientists should be held to a higher standard than
vendor marketing personnel.
Performance Plot A
Data for Plot A
[Table columns: parallel system run time, vector system run time]
• In the last entry, the 3:11:50 figure is an estimate.
• The vector system code is “not optimized.”
• The vector system performance is better except for the last
(estimated) entry.
Performance Plot B
Facts for Plot B
32-bit performance rates on a parallel system are
compared with 64-bit performance on a vector system.
Parallel system results are linearly extrapolated to a full-sized system from a small system (only 1/8 size).
The vector version of code is “unvectorized.”
The vector system “curves” are straight lines – i.e., they
are linear extrapolations from a single data point.
It appears that of all points on four curves in this plot, at
most four points represent real timings.
Twelve Ways to Fool the Masses
1. Quote only 32-bit performance results, not 64-bit results.
2. Present performance figures for an inner kernel, and then represent these figures
as the performance of the entire application.
3. Quietly employ assembly code and other low-level language constructs.
4. Scale up the problem size with the number of processors, but omit any mention of
this fact.
5. Quote performance results projected to a full system.
6. Compare your results against scalar, unoptimized code on conventional systems.
7. When direct run time comparisons are required, compare with an old code on an
obsolete system.
8. If Mflop/s rates must be quoted, base the operation count on the parallel
implementation, not on the best sequential implementation.
9. Quote performance in terms of processor utilization, parallel speedups or Mflop/s
per dollar.
10. Mutilate the algorithm used in the parallel implementation to match the
11. Measure parallel run times on a dedicated system, but measure conventional run
times in a busy environment.
12. If all else fails, show pretty pictures and animated videos, and don't talk about
Twelve Ways: Basic Principles
Use well-understood, community-defined metrics.
Base performance rates on operation counts derived from the
best practical serial algorithms, not on schemes chosen just to
exhibit artificially high Mflop/s rates on a particular system.
Use comparable levels of tuning.
Provide full details of experimental environment, so that
performance results can be reproduced by others.
Disclose any details that might affect an objective
interpretation of the results.
Honesty and reproducibility should characterize all work.
Danger: We can fool ourselves, as well as others.
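One way to act on these principles is to store the disclosures alongside the number itself. The sketch below is an assumption about what such a record might contain; the class name, field names and example values are hypothetical, and the environment details come from Python's standard `platform`, `os` and `sys` modules.

```python
import os
import platform
import sys
from dataclasses import dataclass, field


def describe_environment() -> dict:
    """Collect basic details of the machine and interpreter used for a run."""
    return {
        "platform": platform.platform(),
        "machine": platform.machine(),
        "python_version": platform.python_version(),
        "cpu_count": os.cpu_count(),
        "command_line": list(sys.argv),
    }


@dataclass
class PerformanceReport:
    seconds: float                 # measured wall-clock time
    best_serial_operations: float  # op count of the best practical serial algorithm
    precision_bits: int            # 32 or 64, stated explicitly
    tuning_notes: str              # level of tuning applied, for fair comparison
    environment: dict = field(default_factory=describe_environment)

    def mflops(self) -> float:
        """Rate based on the best serial operation count, not the code actually run."""
        return self.best_serial_operations / self.seconds / 1.0e6


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    report = PerformanceReport(
        seconds=2.0,
        best_serial_operations=4.0e9,
        precision_bits=64,
        tuning_notes="compiler defaults, no hand tuning",
    )
    print(f"{report.mflops():.1f} Mflop/s at {report.precision_bits}-bit precision")
    print(report.environment)
```

Keeping precision, tuning level and environment in the same record as the rate makes it harder to quote the figure without its context.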
New York Times, 22 Sept 1991
Excerpts from NYT Article
“Rival supercomputer and work station
manufacturers are prone to hype, choosing the
performance figures that make their own systems
look better.”
“It’s not really to the point of widespread fraud, but
if people aren’t somewhat more circumspect, it
could give the field a bad name.”
Fast Forward to 2007:
Five New Ways to Fool the Masses
Dozens of runs are made, but only the best performance
figure is cited in the paper.
Runs are made on part of an otherwise idle system, but
this is not disclosed in the paper.
Performance rates are cited for a run with only one CPU
active per node.
Special hardware, operating system or compiler settings
are used that are not appropriate for real-world usage.
“Scalability” is defined as a successful execution on a
large number of CPUs, regardless of performance.
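The first item in this list has a simple remedy: report the spread of all runs rather than only the best one. A minimal sketch, using hypothetical timings and an illustrative function name:

```python
import statistics


def summarize_runs(seconds):
    """Summarize every timing collected, not just the fastest."""
    if not seconds:
        raise ValueError("at least one timing is required")
    return {
        "runs": len(seconds),
        "best": min(seconds),
        "median": statistics.median(seconds),
        "worst": max(seconds),
        "stdev": statistics.stdev(seconds) if len(seconds) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical run times in seconds, including one slow outlier.
    timings = [10.0, 12.0, 11.0, 30.0]
    for key, value in summarize_runs(timings).items():
        print(f"{key:>6}: {value}")
```

Quoting the median together with the best and worst times lets a reader judge how representative the headline figure is.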
Extra Slides
Example from Physics:
Measurements of Speed of Light
Why the discrepancy between pre-1945 and post-1945 values?
Probably due to biases and sloppy experimental methods.
Example from Psychology:
The “Blank Slate”
The “blank slate” paradigm (1920-1990):
The human mind at birth is a “blank slate.”
Heredity and biology play no significant role in human personality
– all behavioral traits are socially constructed.
Current consensus, based on latest research:
Humans at birth possess sophisticated facilities for language
acquisition, pattern recognition and social life.
Heredity, evolution and biology are major factors of personality
How did these scientists get it so wrong?
Sloppy experimental methodology and analysis.
Pervasive biases and wishful thinking.
Ref: Steven Pinker, The Blank Slate: The Modern Denial of Human Nature
Example from Anthropology:
The “Noble Savage”
Anthropologists, beginning with Margaret Mead in the 1930s, taught
that primitive societies (such as South Sea Islands) were idyllic:
• No violence, jealousy or warfare.
• Happy, uninhibited – no psychological problems or “hangups.”
Beginning in the 1980s, a new breed of anthropologists began to
reexamine these findings. They concluded:
• Most of these societies have murder rates several times higher
than those of large U.S. cities.
• Death rates from inter-tribe warfare exceed those of Western
societies by factors of 10 to 100.
• Complex, jealous taboos surround courtship and marriage, often
justifying the killing of non-virgin brides or suspected adulterers.
Why were the earlier studies so wrong?
Answer: “Anthropological malpractice” – Pinker
Postmodern Science Studies
These scholars study the social and political factors involved in
scientific discoveries. Some of these studies are interesting and
useful, but others are highly questionable:
Denials that science progresses towards fundamental truth.
Claims that scientific theories are “socially constructed.”
Politically charged rhetoric.
Gratuitous use of erudite-sounding technical jargon.
Significant misunderstandings of the mathematical and scientific topics
being addressed.
Application of arcane theories of math and physics into inappropriate
Reluctance to submit scholarship to rigorous outside review.
Ref: Fashionable Nonsense by Alan Sokal and Jean Bricmont
The Sokal Hoax
In 1996, Alan Sokal, a physicist at NYU, wrote a spoof of a postmodern
science article, entitled “Transgressing the Boundaries: Toward a
Transformative Hermeneutics of Quantum Gravity”:
Page after page of erudite-sounding nonsense.
Numerous references to arcane scientific theories, including quantum
mechanics, relativity, chaos theory, mathematical set theory, etc.
Frequent, approving quotes from leading postmodern science scholars.
Politically charged rhetoric.
Deliberately written so that “any mathematician or physicist would
realize that it was a spoof.”
In spite of these flaws, the article was accepted for publication in Social
Text, a leading postmodern journal. It appeared in a special issue
devoted to defending the science studies field against its detractors.
Excerpts from Sokal’s Article
Rather, [scientists] cling to the dogma … that there exists an external
world, whose properties are independent of any individual human being
and indeed of humanity as a whole; that these properties are encoded
in “eternal” physical laws; and that human beings can obtain reliable,
albeit imperfect and tentative, knowledge of these laws by hewing to
the “objective” procedures and epistemological strictures prescribed by
the (so-called) scientific method. [pg 217] Note: Sokal is deriding even
the most basic notions of scientific reality and common sense.
In this way the infinite-dimensional invariance group erodes the distinction
between the observer and observed; the π of Euclid and the G of
Newton, formerly thought to be constant and universal, are now
perceived in their ineluctable historicity; and the putative observer
becomes fatally de-centered, disconnected from any epistemic link to a
space-time point that can no longer be defined by geometry alone. [pg
222] Note: In addition to gratuitous usage of technical jargon, Sokal is
saying that π and G are not constants!
Excerpts from Other (Serious) Articles in
the Same Issue as Sokal’s Article
Most theoretical physicists, for example, sincerely believe that however partial our
collective knowledge may be, ... one day scientists shall find the necessary
correlation between wave and particle; the unified field theory of matter and
energy will transcend Heisenberg’s uncertainty principle. [Aronowitz, pg 181]
Note: A “unified field theory” will not do away with wave-particle duality and
Heisenberg’s uncertainty principle – these are inherent in quantum theory.
[P]assionate partisans of wave and matrix mechanics explanations for the
behavior of electrons were unable to reach agreement for decades.
[Aronowitz, pg 195] Note: Even Aronowitz’s history is wrong – wave and
matrix formulations of quantum mechanics were reconciled within weeks.
Once it is acknowledged that the West does not have a monopoly on all the good
scientific ideas in the world, or that reason, divorced from value, is not
everywhere and always a productive human principle, then we should expect
to see some self-modification of the universalist claims maintained on behalf
of empirical rationality. Only then can we begin to talk about different ways of
doing science, ways that downgrade methodology, experiment, and
manufacturing in favor of local environments, cultural values, and principles of
social justice. [Ross, pg 3-4] Note: Ross is advocating an extreme cultural
relativism for science, discarding much of our rational, empirical methodology.
2005: A Sokal-Like Hoax
in Computer Science
In early 2005, some MIT graduate students submitted two papers to the 9th
World Multi-Conference on Systemics, Cybernetics and Informatics
(WMSCI):
“Rooter: A Methodology for the Typical Unification of Access Points and
“The Influence of Probabilistic Methodologies on Networking”
These papers were completely generated by means of a computer
program, with reasonable sentence structures, but otherwise simply a
concatenation of computer science buzzwords, nonsensical charts and
graphs, and nonexistent references.
The first was accepted as a “non-reviewed” submission; the second was
rejected, but without referee reports or other explanation.
In neither case did either referees or the Program Committee note that
these papers are utter gibberish.
Abstracts of the Two Papers
Abstract of Paper #1:
Many physicists would agree that, had it not been for congestion
control, the evaluation of web browsers might never have
occurred. In fact, few hackers worldwide would disagree with
the essential unification of voice-over-IP and public-private key
pair. In order to solve this riddle, we confirm that SMPs can be
made stochastic, cacheable, and interposable.
Abstract of Paper #2:
In recent years, much research has been devoted to the
exploration of von Neumann machines; however, few
have deployed the study of simulated annealing. In fact,
few security experts would disagree with the investigation
of online algorithms [25]. STEEVE, our new system for
game-theoretic modalities, is the solution to all of these
Recent Example #1
In 2003 a prominent computer vendor (which is also involved in
the HPC world) submitted results on the SPEC benchmark:
Used a special command to enable “memory read bypass,”
which eliminates the need to wait for the snoop response
required in a multiprocessor configuration.
Used a special command to enable a maximum of eight
hardware pre-fetch streams and disable software-based prefetching.
Installed a special high-performance, single-threaded malloc
library, geared for speed rather than memory efficiency.
These settings are not appropriate for normal production usage,
and thus the resulting performance figures are unrealistic.
Recent Example #2
Recently a certain HPC vendor claimed, in a press release:
Discovery of a “proof” of Amdahl’s law.
“New” technology that is “provably optimal” by Amdahl’s law.
Several people in the HPC community responded, some rather
sharply, to these claims. The vendor has responded also.
Even if a firm or scientist has some good ideas, hype does not
help their cause, and may endanger the community’s credibility.
Peer-reviewed publications should accompany press
“Extraordinary claims require extraordinary evidence.”
– Carl Sagan
Grid Computing Projects
[email protected]
[email protected] sustains 35 Tflop/s on 2M+ systems
$1.7 \times 10^{21}$ flops over 3 years
Supernova Cosmology Infrastructure
[Thanks to W. Johnston, LBNL]
What the Grid Does Well
Providing national or international access to
important scientific datasets.
Providing a uniform scheme for remote system
access and user authentication.
Providing a high-performance parallel platform
for certain very loosely coupled computations.
Providing a high-capability platform for large
computations that can run on a single remote
system, chosen at run time.
Enabling new types of multi-disciplinary, multi-system, multi-dataset research.
What the Grid Doesn’t Do So Well
Scientific computations that require heavy
interprocessor communication.
Probably the majority of high-end scientific
computations are of this nature.
This doesn’t rule out such applications running
remotely on a single system connected to the grid.
Many classified or proprietary computations.
Current grid security and privacy are not
convincing for many of these users
This doesn’t rule out “internal grids” -- some have
been quite successful.
The Role of Good Benchmarks in
Combating Performance Abuse
Well-designed, rigorous, scalable performance
benchmark tests help bring order to the field.
Well-thought-out and well-enforced “ground rules”
are essential.
A rational scheme must be provided for calculating
performance rates.
A well-defined test must be included to validate the
correctness of the results.
A repository of results must be maintained.
Recent example: The HPCS benchmark suite.
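A toy harness showing how these ground rules might fit together: the operation count is fixed in advance, every result is validated, and no rate is reported if validation fails. This is a sketch under those assumptions, not the design of the HPCS suite or any other real benchmark; all names are hypothetical.

```python
import statistics
import time


def run_benchmark(kernel, validate, operation_count, repeats=5):
    """Time `kernel` several times; report a rate only if every result validates."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        result = kernel()
        elapsed = time.perf_counter() - start
        if not validate(result):
            raise ValueError("result failed validation; no performance rate reported")
        timings.append(elapsed)
    median_seconds = statistics.median(timings)
    # Guard against a zero reading from a coarse timer.
    rate = operation_count / median_seconds if median_seconds > 0 else float("inf")
    return {
        "timings": timings,          # all runs are kept for the results repository
        "median_seconds": median_seconds,
        "ops_per_second": rate,
    }


def sum_of_squares(n=1000):
    """Toy kernel: sum of i*i for i in 0..n-1."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    # Agreed in advance for this toy kernel: one multiply and one add per term.
    report = run_benchmark(sum_of_squares, lambda v: v == 332833500, 2 * 1000)
    print(report)
```

The operation count is passed in rather than measured, which mirrors the rule that the rate calculation is defined by the benchmark and not by the implementation under test.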
Lessons from History:
Back to the Future
High standards of honesty and scientific rigor must be vigilantly
enforced within the HPC field.
Rigorous peer review is essential.
Performance claims must be based on solid benchmark data and
open, objective analysis of that data.
Well-constructed, community-defined benchmarks are essential
to combat performance abuse.
Researchers must be willing to provide all details of the
experimental environment, so others can reproduce their results.
A “politically correct” conclusion is no excuse for poor scholarship.
Hype has no place in the scientific enterprise.
Danger: We can fool ourselves, as well as others.
