[HPGMG Forum] what is a "good" or a "better" TOP500 ranking?
fpetrin at us.ibm.com
Sun Nov 30 23:10:48 UTC 2014
Graph500 is mostly limited by local memory performance and by
network/communication layer performance at scale.
Every one-point increase in the problem scale doubles the data set size
and the degree of imbalance, roughly doubling the size of the largest
vertex. G500 implementations need to deal with potential network hot
spots and also steer the type of algorithm based on the status of the
exploration. We rely on non-blocking collectives to determine what to do
at run-time.
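To make the data set doubling concrete, here is a minimal sketch in Python. It assumes the Graph500 conventions of 2^SCALE vertices and a default edge factor of 16, and, as a labeled assumption, 16 bytes per edge in the raw edge list (two 64-bit vertex ids). The name `problem_size` is only illustrative. The sketch covers the data set size only; how fast the largest vertex grows depends on the graph generator and is not modeled here.

```python
EDGE_FACTOR = 16      # Graph500 default: edges per vertex
BYTES_PER_EDGE = 16   # assumption: two 64-bit vertex ids per edge


def problem_size(scale):
    """Return (vertices, edges, edge_list_bytes) for a given problem scale."""
    vertices = 1 << scale              # 2**scale vertices
    edges = EDGE_FACTOR * vertices     # edge count grows with the vertex count
    return vertices, edges, edges * BYTES_PER_EDGE


if __name__ == "__main__":
    for scale in (26, 27, 28):
        v, e, b = problem_size(scale)
        print(f"scale={scale} vertices={v} edges={e} edge list={b / 2**30:.0f} GiB")
```

Each step in scale doubles all three quantities, so memory footprint and communication volume grow together even before load imbalance is taken into account.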
Integer performance is one of the important parameters, but not the only
one. For example, the single-node performance of System K is about 3X
that of BGQ, but BGQ has better scaling.
It would be very helpful to have a benchmark that stresses the network
performance and the scalability of the overall system in a non-trivial
way.
Hope this helps.
Manager, High Performance Analytics Department
IBM TJ Watson Research Center
e-mail: fpetrin at us.ibm.com
From: Jeff Hammond <jeff.science at gmail.com>
To: Horst Simon <hdsimon at lbl.gov>
Cc: "hpgmg-forum at hpgmg.org" <hpgmg-forum at hpgmg.org>
Date: 11/30/2014 02:55 PM
Subject: Re: [HPGMG Forum] what is a "good" or a "better" TOP500
Sent by: "HPGMG-Forum" <hpgmg-forum-bounces at hpgmg.org>
We have proposed a Quantum Chemistry 500 because none of the existing
community benchmarks come close to representing the behavior of atomic
integrals, Hartree-Fock and related DFT methods. All these PDE-oriented
benchmarks are trivial by comparison.
On the other hand, HPL is actually a fantastic proxy for CCSD(T), which is
dominated by DGEMM and point-to-point bandwidth.
In our case, we decided that a single code was the wrong way to benchmark,
and have instead defined the physics and numerics scientists need, without
prescribing an implementation, since there are many already (and, unlike
many domains, these many different codes can reproduce the exact same
solution if the problem is sufficiently specified).
There's nothing wrong with having lots of different benchmarks, as long as
it's relatively easy to obtain and analyze the associated performance data.
And I agree that not very much critical thinking is going into some of
these benchmarks. It's rather ironic that Blue Gene/Q dominates the
Graph500, because it's not designed for this class of problems and has the
worst integer performance (relative to other capability) of any leadership
class machine, and Graph500 was supposed to be a non-numerical benchmark.
The near-perfect correlation between HPCG and STREAM is another failure in
the science of benchmarking.
Sent from my iPhone
> On Nov 30, 2014, at 10:34 AM, Horst Simon <hdsimon at lbl.gov> wrote:
> Like many of you, I returned from SC14 with many questions to think
> about, among others "what is a 'better' benchmark?" Here is a summary of
> at least three conversations that I had at SC14 between "M" (that is me)
> and "C", a colleague (a synthesis of several conversations).
> C: Well, I am really glad that HPCG (or HPGMG) is being developed; that
> will make the TOP500 more realistic.
> M: What do you mean by more realistic?
> C: It is well known that HPL is not a good benchmark to measure the
> performance of real systems. If we replace HPL with HPCG (or HPGMG) then
> we would not get such a distortion of the performance; for example, all
> those GPU-based systems would not be ranked as high.
> M: But why do you think that HPCG (or HPGMG) is better?
> C: ... long technical argument involving bisection bandwidth, mixture of
> long and short messages, real applications that don't solve dense linear
> systems, STREAM benchmark, etc.
> M: But why do you think this is "better"? What do you mean by "better"?
> I think that "better" should imply that any new benchmark would in some
> sense be a better approximation to the application workload. Can you
> prove this?
> ..... etc.
> What was really striking about these types of conversations was how
> little our community is thinking scientifically. If you want to do
> something better, then you first have to define what you are actually
> measuring. So how do we really measure the application performance of a
> petascale platform? I can think of many applications where HPCG (and
> HPGMG) is as irrelevant to the application as HPL is.
> HPGMG-Forum mailing list
> HPGMG-Forum at hpgmg.org