[HPGMG Forum] HPGMG release v0.1

Sam Williams swwilliams at lbl.gov
Mon Jun 9 16:40:21 UTC 2014


probably GB/s (28) vs bytes/cycle (18)

On Jun 9, 2014, at 9:37 AM, Mark Adams <mfadams at lbl.gov> wrote:

> 
> 
> 
> On Mon, Jun 9, 2014 at 9:19 AM, Jed Brown <jed at jedbrown.org> wrote:
> "Vitali A. Morozov" <morozov at anl.gov> writes:
> > and see that you provide STREAM-based memory bandwidth for some
> > architectures. I suggest specifying a particular benchmark, let us say
> > "triad", because the result of STREAM is benchmark-dependent.
> 
> Yes, I would prefer Triad.
> 
> > For BG/Q, I have measured 29.3 GB/s/node on "triad". For Cray XC30, I
> > have measured 48.6 GB/s/socket or 97.1 GB/s/node. This is slightly
> > better than the numbers you have reported.
> 
> I'll update the BG/Q number.  What code is needed to observe this?  (I
> think I've always heard 26-27 GB/s quoted and have not personally
> measured higher.)  It would be helpful to list this somewhere on the
> ALCF website.
> 
> 
> Hmm, the LLNL spreadsheet says 2.37 GFlop/s, which seems to be in line with 26-27 GB/s (3 vectors, I assume), but they list 17.82 GB/s (I think).  They divide this number by 18 to get bytes/cycle.
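
[Editor's sketch] The unit question above can be checked with a little arithmetic. The snippet below is a minimal sketch, assuming double-precision STREAM Triad (3 vectors of 8-byte values and 2 flops per element, with no write-allocate traffic counted) and a 1.6 GHz BG/Q core clock. The clock value and all function names are assumptions for illustration and are not taken from the thread or the LLNL spreadsheet.

```python
# Sketch: relate a STREAM Triad flop rate to bandwidth and bytes/cycle.
BYTES_PER_DOUBLE = 8
TRIAD_VECTORS = 3   # a[i] = b[i] + s * c[i]; assumption: write-allocate ignored
TRIAD_FLOPS = 2     # one multiply and one add per element

BGQ_CLOCK_GHZ = 1.6  # assumption: clock rate is not stated in the thread


def triad_bandwidth_gbs(gflops):
    """Bandwidth (GB/s) implied by a Triad flop rate (GFlop/s)."""
    bytes_per_flop = TRIAD_VECTORS * BYTES_PER_DOUBLE / TRIAD_FLOPS  # 12 bytes per flop
    return gflops * bytes_per_flop


def bytes_per_cycle(gbs, clock_ghz):
    """Convert GB/s to bytes per clock cycle at the given clock rate."""
    return gbs / clock_ghz


if __name__ == "__main__":
    bw = triad_bandwidth_gbs(2.37)
    print(f"implied bandwidth: {bw:.2f} GB/s")
    print(f"bytes per cycle:   {bytes_per_cycle(bw, BGQ_CLOCK_GHZ):.2f}")
```

Under these assumptions 2.37 GFlop/s corresponds to roughly 28 GB/s, or a little under 18 bytes/cycle, which would be consistent with the 17.82 figure being bytes/cycle rather than a bandwidth.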
> 
>  
> 
> 
> I assume that your 97 GB/s on XC30 was measured using E5-2697v2?  The numbers I used
> come from this page which quotes STREAM Triad at 89 GB/s.
> 
>   http://www.nersc.gov/users/computational-systems/edison/configuration/
> 
> > For Cray XC30, the flop rate is 518.4 GF per node. For Xeon E5-2697 v2 @
> > 2.7 GHz,
> 
> Edison uses E5-2695v2 (2.4 GHz), thus the somewhat lower number.
> 
> > each core can have 8 Flops/cycle - 4 way FMA - or 8 * 2.7 = 21.6
> > GFlops per core. 12 cores result in 259.2 GFlops per socket, 2 sockets
> > give 518.4 GFlops.
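
[Editor's sketch] The peak-rate arithmetic quoted above can be reproduced as follows. This is a minimal sketch that takes the thread's figure of 8 flops/cycle per core at face value. The function name is hypothetical, and the 12-core count for the E5-2695 v2 is an assumption, since the thread gives only its 2.4 GHz clock.

```python
# Sketch: theoretical peak flop rate per node from clock rate and core counts.
def peak_gflops_per_node(flops_per_cycle, clock_ghz, cores_per_socket, sockets):
    """Peak GFlop/s = flops/cycle * GHz * cores per socket * sockets."""
    return flops_per_cycle * clock_ghz * cores_per_socket * sockets


if __name__ == "__main__":
    # E5-2697 v2 @ 2.7 GHz, values as quoted in the thread
    print(f"{peak_gflops_per_node(8, 2.7, 12, 2):.1f} GFlop/s per node")
    # E5-2695 v2 @ 2.4 GHz (Edison); 12 cores per socket is an assumption
    print(f"{peak_gflops_per_node(8, 2.4, 12, 2):.1f} GFlop/s per node")
```

With the same per-core rate, the lower 2.4 GHz clock alone would account for a peak of about 460.8 GFlop/s per node instead of 518.4.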
> 
> 
