<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div>Brian,</div><div>George Michelogiannakis implemented a reproducible dot product (published 2 years ago at SC).  We demonstrated only a 3% slowdown over a conventional double-precision dot product.  Jim Demmel's postdoc implemented something similar last year and added it to ReproBLAS.</div><div><br></div><div>My recommendation is to just link against ReproBLAS, since it is a released, tuned, and professionally supported reproducible BLAS implementation.</div><div><br></div><div>-john</div><div><br>On Jul 31, 2015, at 5:04 PM, Brian Van Straalen &lt;<a href="mailto:bvstraalen@lbl.gov">bvstraalen@lbl.gov</a>&gt; wrote:<br><br></div><blockquote type="cite"><div><meta http-equiv="Content-Type" content="text/html charset=utf-8">Once you reach the bottom solver for full multigrid, you should not be in the network at all anymore.  You would be computing a dot product over a few hundred doubles in L1 memory.  I’m just guessing here, but I don’t think a reproducible dot product would be that costly to implement in software as a compensated sum or distillation.   There was a talk at ISC this year showing some results.  From DRAM, compensated summation costs about the same as an ordinary sum, and it gets relatively more expensive in faster memory levels.  The impact was never 50x, though. 
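<div class=""><br class=""></div><div class="">A minimal sketch of a compensated dot product of the kind mentioned above, assuming the error-free transformations TwoSum and TwoProduct (with Veltkamp splitting) in the style of the Dot2 algorithm. The names two_sum, split, two_prod and dot2 are illustrative only and come from neither HPGMG nor ReproBLAS. Compensation shrinks the order-dependent rounding error, but by itself it does not guarantee bitwise reproducibility under reordering.</div>
<pre class="">
```python
def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and s + err == a + b exactly."""
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err


def split(a):
    """Veltkamp split of a double into high and low halves (a == hi + lo)."""
    c = 134217729.0 * a  # 2**27 + 1; assumes no overflow in this product
    hi = c - (c - a)
    lo = a - hi
    return hi, lo


def two_prod(a, b):
    """Return (p, err) with p = fl(a * b) and p + err == a * b exactly."""
    p = a * b
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    err = a_lo * b_lo - (((p - a_hi * b_hi) - a_lo * b_hi) - a_hi * b_lo)
    return p, err


def dot2(x, y):
    """Compensated dot product: accumulate the rounding errors separately."""
    s = 0.0  # running sum of the rounded products
    c = 0.0  # running sum of the captured rounding errors
    for xi, yi in zip(x, y):
        p, ep = two_prod(xi, yi)
        s, es = two_sum(s, p)
        c = c + (es + ep)
    return s + c


if __name__ == "__main__":
    x = [1e16, 1.0, -1e16]
    y = [1.0, 1.0, 1.0]
    naive = 0.0
    for xi, yi in zip(x, y):
        naive += xi * yi
    print(naive)       # 0.0: the 1.0 is lost to rounding
    print(dot2(x, y))  # 1.0: the lost term is recovered from the error sum
```
</pre>
<div class="">For a few hundred doubles on the coarse grid, the extra arithmetic is a small constant factor per element rather than anything like the 50x of software quad precision.</div>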
<div class=""><br class=""><div class=""><br class=""><div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Jul 31, 2015, at 4:38 PM, Jeff Hammond &lt;<a href="mailto:jeff.science@gmail.com" class="">jeff.science@gmail.com</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">If by 128-bit floats you mean IEEE 754 quad precision implemented in software, then the associated dot product will run ~50x slower on conventional hardware (that is, hardware that does not support QP).<div class=""><br class=""></div><div class="">It should be possible to implement DDP or some form of compensated summation more efficiently.<br class=""><div class=""><br class=""></div><div class="">Jeff<br class=""><div class="gmail_extra"><br class=""><div class="gmail_quote">On Fri, Jul 31, 2015 at 4:18 PM, Brian Van Straalen <span dir="ltr" class="">&lt;<a href="mailto:bvstraalen@lbl.gov" target="_blank" class="">bvstraalen@lbl.gov</a>&gt;</span> wrote:<br class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word" class=""><div class=""><br class=""></div><div class="">I would think that we could probably implement a reproducible dot product in the Krylov code, since it only happens on the coarse grid, which should be small enough.</div><div class=""><br class=""></div><div class="">HPGMG uses max norms, so we should be OK for that part.</div><span class="HOEnZb"><font color="#888888" class=""><div class=""><br class=""></div><div class="">Brian</div></font></span><div class=""><div class="h5"><div class=""><br class=""></div><br class=""><div class=""><blockquote type="cite" class=""><div class="">On Jul 31, 2015, at 3:27 PM, Hoemmen, Mark &lt;<a href="mailto:mhoemme@sandia.gov" target="_blank" class="">mhoemme@sandia.gov</a>&gt; wrote:</div><br class=""><div class=""><br 
style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class=""><br style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class=""><span style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important" class="">On 7/31/15, 3:45 PM, "Jed Brown" <</span><a href="mailto:jed@jedbrown.org" style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" target="_blank" class="">jed@jedbrown.org</a><span style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important" class="">> wrote:</span><br style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class=""><br style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class=""><blockquote type="cite" 
style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class="">Brian Van Straalen &lt;<a href="mailto:bvstraalen@lbl.gov" target="_blank" class="">bvstraalen@lbl.gov</a>&gt; writes:<br class=""><blockquote type="cite" class="">The concern is not trivial.  I've spent some time re-reading the<br class="">Precimonious paper (<a href="http://eecs.berkeley.edu/~rubio/includes/sc13.pdf" target="_blank" class="">eecs.berkeley.edu/~rubio/includes/sc13.pdf</a>) and I realize<br class="">that it would not be hard to make a faster version of FMG using mixed<br class="">precision.  <br class=""></blockquote><br class="">Just a quick comment now.  I think there's not as much fat to trim as<br class="">you think.  In general, the precision needs to be as accurate as the<br class="">discretization.  Most flops occur on fine grids where the discretization<br class="">is more accurate than single precision.  I challenge you to speed up<br class="">HPGMG by more than, say, 15%, while maintaining order of accuracy on<br class="">fine grids.<br class=""><br class=""><blockquote type="cite" class="">There have been papers over the last few years using 4-byte AMG as a<br class="">preconditioner<span class=""> </span><br class=""></blockquote><br class="">So much fat already.  Then you have a Krylov method and full-accuracy<br class="">residuals, but HPGMG solves in the cost of a few residual evaluations.<br class="">Also, these low-accuracy preconditioners are usually used for problems<br class="">that are only modestly ill-conditioned.  
Try it with an operator with<br class="">condition number 10^{12} like you see in solid mechanics or geodynamics<br class="">and it doesn't look so hot any more.<br class=""></blockquote><br style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class=""><span style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important" class="">It could be fun to use such a tool to find out the best places to put</span><br style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class=""><span style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important" class="">128-bit floating-point arithmetic.  
That could help with some really hard</span><br style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class=""><span style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important" class="">problems, or at least avoid some reproducibility issues.</span><br style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class=""><br style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px" class=""><span style="font-family:Helvetica;font-size:14px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;float:none;display:inline!important" class="">mfh</span></div></blockquote></div><br class=""></div></div><span class=""><div class="">
<span style="border-collapse: separate; font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px;" class=""><div class=""><div class=""><font face="'Courier New'" class="">Brian Van Straalen         Lawrence Berkeley Lab</font></div><div class=""><font face="'Courier New'" class=""><a href="mailto:BVStraalen@lbl.gov" target="_blank" class="">BVStraalen@lbl.gov</a>         Computational Research</font></div><div class=""><font face="'Courier New'" class="">(510) 486-4976             Division (<a href="http://crd.lbl.gov/" target="_blank" class="">crd.lbl.gov</a>)</font></div></div><div class=""><br class=""></div><div class=""><br class=""></div></span><br class="">
</div>
<br class=""></span></div><br class="">_______________________________________________<br class="">
HPGMG-Forum mailing list<br class="">
<a href="mailto:HPGMG-Forum@hpgmg.org" class="">HPGMG-Forum@hpgmg.org</a><br class="">
<a href="https://hpgmg.org/lists/listinfo/hpgmg-forum" rel="noreferrer" target="_blank" class="">https://hpgmg.org/lists/listinfo/hpgmg-forum</a><br class="">
<br class=""></blockquote></div><br class=""><br clear="all" class=""><div class=""><br class=""></div>-- <br class=""><div class="gmail_signature">Jeff Hammond<br class=""><a href="mailto:jeff.science@gmail.com" target="_blank" class="">jeff.science@gmail.com</a><br class=""><a href="http://jeffhammond.github.io/" target="_blank" class="">http://jeffhammond.github.io/</a></div>
</div></div></div></div>
</div></blockquote></div><br class=""><div apple-content-edited="true" class="">
<span class="Apple-style-span" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: Helvetica; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: 2; text-align: auto; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px;  "><div class=""><div class=""><font class="Apple-style-span" face="'Courier New'">Brian Van Straalen         Lawrence Berkeley Lab</font></div><div class=""><font class="Apple-style-span" face="'Courier New'"><a href="mailto:BVStraalen@lbl.gov" class="">BVStraalen@lbl.gov</a>         Computational Research</font></div><div class=""><font class="Apple-style-span" face="'Courier New'">(510) 486-4976             Division (<a href="http://crd.lbl.gov" class="">crd.lbl.gov</a>)</font></div></div><div class=""><br class=""></div><div class=""><br class=""></div></span><br class="Apple-interchange-newline">
</div>
<br class=""></div></div></div></div></blockquote></body></html>