[HPGMG Forum] Performance Versatility slides from Monday

Sam Williams swwilliams at lbl.gov
Thu Nov 27 19:17:25 UTC 2014


It's all a matter of perspective and of what the figure was meant to show.

My view is that programmers should treat the NUMA node as the quantum for distributed-memory programming and hide architecture/technology choices like #cores, SMT, SIMD, etc. behind workshare (instead of SPMD) and SIMD pragmas.  Although there is perhaps as much as a factor of 3 difference today (BGQ node vs. GPU) in terms of power, I think there is convergence towards relatively beefy NUMA nodes.  However, the number of NUMA nodes per compute node can vary widely depending on procurement requirements and network scalability.

In terms of the figures, I wanted to be able to differentiate performance per socket from the scalability of the network.  One could convert the figures from DOF/s to DOF/J (DOF/kWh) or DOF/$ (assuming one knew the real cost).
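
As a rough sketch of that conversion (not taken from the slides): dividing a DOF/s rate by the power draw in Watts gives DOF/J, and multiplying by 3.6e6 J/kWh gives DOF/kWh. The function names and the 1e8 DOF/s, 100 W, and $2000 figures below are made-up placeholders, not measured HPGMG data. The per-dollar variant divides the rate by an assumed acquisition cost, so it yields (DOF/s)/$ rather than a lifetime DOF/$.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 1000 W * 3600 s


def dof_per_joule(dof_per_second, watts):
    """DOF/s divided by W (i.e. J/s) gives DOF/J."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return dof_per_second / watts


def dof_per_kwh(dof_per_second, watts):
    """Same energy normalization, expressed per kWh instead of per Joule."""
    return dof_per_joule(dof_per_second, watts) * JOULES_PER_KWH


def dof_rate_per_dollar(dof_per_second, dollars):
    """Throughput per dollar, (DOF/s)/$, for an assumed known cost."""
    if dollars <= 0:
        raise ValueError("cost must be positive")
    return dof_per_second / dollars


# Hypothetical inputs: one NUMA node sustaining 1e8 DOF/s at 100 W, costing $2000.
example_rate = 1.0e8
example_watts = 100.0
example_cost = 2000.0

print(dof_per_joule(example_rate, example_watts))
print(dof_per_kwh(example_rate, example_watts))
print(dof_rate_per_dollar(example_rate, example_cost))
```

The hard part is the inputs rather than the arithmetic: measured power and real cost are rarely available per NUMA node.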




On Nov 27, 2014, at 9:49 AM, Jed Brown <jed at jedbrown.org> wrote:

> "Bauer, Gregory H" <gbauer at illinois.edu> writes:
> 
>> Jed,
>> 
>> Thanks for the references and the conversation at the Joint Lab meeting.
>> 
>> Would you have the data for the HPGMG performance on different systems
>> plotted with respect to socket or node count rather than NUMA node count?
>> Or, is the intention to show the impact of NUMA itself along with the
>> network?
> 
> This is Sam's plot and he could easily generate that plot variant.  NUMA
> nodes are equivalent to sockets on most of these architectures.  I don't
> like normalizing by either because sockets also have huge variability in
> cost and power requirement.  I would rather normalize by dollars or
> Watts, but that is also controversial and it's hard to get that data.
> 
> Real costs are rarely passed directly to the users (or even to centers,
> in some cases), different centers subsidize costs differently, and there
> are many external factors, so it's entirely logical that a given user
> might choose an architecture that is not the most cost-effective (in
> acquisition cost, TCO, or Watts) for their application requirements.
> 
>> The peak flop rate for a single NUMA node differs substantially for the
>> systems being compared in the plot.
>> 
>> My only reason to bring this up, if it hasn't been brought up before, is
>> that you don't buy in terms of NUMA nodes and on most systems users get
>> charged for nodes.
>> -Greg
>> 
>> 
>> On 11/25/14, 10:34, "Jed Brown" <jed at jedbrown.org> wrote:
>> 
>>> http://59A2.org/files/20141124-Versatility.pdf
>> 
>>> 


