Hi Mark,

I'm adding Steve Langer, the first author of the paper, in case he has anything to add, as he's been more focused on tracking down latency issues.

Our results point to, but do not definitively show, that latency is the likely culprit for both the IMC and Laser packages.  The evidence is the low IPC, the low bandwidth utilization, and the high number of L2 misses per line fetched from memory.  A ratio above 1 means that multiple reads occurred to a line in quick succession, but it was not prefetched.  A ratio below 1 means that many lines were successfully prefetched.  We need to either look at other systems with better latency measurement capabilities or find the needle in the haystack of performance counters (or a combination of them) that gives us more definitive latency data.
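
In case it helps to make that ratio concrete, here is a rough sketch of how we read it.  The function names, argument names and counter values are made up for illustration; they are not real performance counter event names or measured data.

```python
def l2_misses_per_line(l2_misses: int, lines_from_memory: int) -> float:
    """Ratio of L2 misses to cache lines fetched from memory.

    Both arguments are hypothetical counter totals for one run.
    """
    if lines_from_memory <= 0:
        raise ValueError("lines_from_memory must be positive")
    return l2_misses / lines_from_memory


def interpret(ratio: float) -> str:
    """Map the ratio onto the reading described in the paragraph above."""
    if ratio > 1.0:
        return "above 1: repeated reads to lines that were not prefetched"
    if ratio < 1.0:
        return "below 1: many lines were successfully prefetched"
    return "exactly 1: one miss per line fetched"


if __name__ == "__main__":
    # Made-up counter values, for illustration only.
    ratio = l2_misses_per_line(l2_misses=3_000, lines_from_memory=2_000)
    print(ratio, interpret(ratio))
```

This only classifies the ratio; it says nothing about latency directly, which is the part we still can't measure well.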

Note that MCB does something, but not a lot more than IMC in Hydra.  This is likely due to the smaller problem size that is run for MCB: it has a higher cache hit rate, so latency is not as large a factor, though MCB is often latency bound.  We didn't think too hard about MCB for this work, though.

So, to answer your question: yes, there are other metrics worth thinking about, and latency is one of them.  We're not sure how to capture all of them well, which is why we left the data off the Kiviat plots but included Hydra IMC to show that we were missing something.


I see you are an author on a paper about Hydra (IMC).  In our modeling efforts we are concerned about MC, transport, etc. that might be poorly represented with HPGMG.  I would imagine that these methods use lots of streaming data but 'Hydra IMC' data on the spreadsheet does NOTHING!  Do you have any thoughts on this?  Is there a metric that you are not measuring here that should be and that we should think about?

