[HPGMG Forum] [EXTERNAL] Re: Acceptable rounding errors

Brian Van Straalen bvstraalen at lbl.gov
Sat Aug 1 03:27:40 UTC 2015


I would be more inclined to keep HPGMG a pure C reference implementation, using MPI with or without OpenMP.  The range of input values for the dot product is well characterized.  If we think we need it, it is not complicated to cite the relevant literature and provide the reference implementation in C (fast arbitrary-precision floating-point was my graduate project with Shewchuk ;-) ).  We don’t currently know anything about how sensitive the final residual is to the bottom-solve dot product.  External dependencies, however slight, make a benchmark less attractive.
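
For illustration, here is a minimal sketch of a compensated dot product in plain C with no external dependencies.  It follows the "Dot2" scheme of Ogita, Rump and Oishi, built from error-free transformations (Knuth's TwoSum and Dekker's TwoProduct).  The function names are made up for this example and nothing here is taken from HPGMG or ReproBLAS.  Compensation gives a result roughly as accurate as if it were computed in twice the working precision, but it does not by itself guarantee bitwise reproducibility across summation orders, and it has to be compiled without -ffast-math or the correction terms get optimized away.

```c
#include <stddef.h>

/* Error-free sum (Knuth's TwoSum): a + b == *s + *e exactly. */
static void two_sum(double a, double b, double *s, double *e) {
    double sum = a + b;
    double bv = sum - a;               /* the part of b that made it into sum */
    *e = (a - (sum - bv)) + (b - bv);  /* rounding error of a + b */
    *s = sum;
}

/* Veltkamp split: a == *hi + *lo, each half fitting in 26 bits. */
static void split(double a, double *hi, double *lo) {
    const double factor = 134217729.0; /* 2^27 + 1 */
    double c = factor * a;
    *hi = c - (c - a);
    *lo = a - *hi;
}

/* Error-free product (Dekker's TwoProduct): a * b == *p + *e exactly,
 * barring overflow or underflow in the intermediate products. */
static void two_prod(double a, double b, double *p, double *e) {
    double ah, al, bh, bl;
    double prod = a * b;
    split(a, &ah, &al);
    split(b, &bh, &bl);
    *e = al * bl - (((prod - ah * bh) - al * bh) - ah * bl);
    *p = prod;
}

/* Compensated dot product: accumulate the rounded result in `sum` and
 * the rounding errors of every product and addition in `corr`. */
double dot_compensated(const double *x, const double *y, size_t n) {
    double sum = 0.0, corr = 0.0;
    for (size_t i = 0; i < n; i++) {
        double p, ep, s, es;
        two_prod(x[i], y[i], &p, &ep);
        two_sum(sum, p, &s, &es);
        sum = s;
        corr += ep + es;
    }
    return sum + corr;
}
```

In the bottom solver this would replace the plain accumulation loop over the coarse-grid values.  On hardware with fused multiply-add, two_prod could instead use fma() from <math.h>.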

I suspect Jed is right and there is not much wiggle room in FMG for precision gaming.  I was wondering if there was a way to prove it and hence impose a reasonably tight error tolerance for an acceptable benchmark submission.  The nice thing about the 4th-order FV code is that almost any shortcut ends up breaking the 4th-order truncation error.
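
One way such a tolerance could be phrased, purely as a sketch: for a 4th-order scheme the max-norm error should drop by about 2^4 = 16 each time the grid spacing is halved, so a submission could be checked by looking at the error ratio between successive refinement levels.  The function name, the argument layout, and the idea of a caller-chosen threshold are assumptions made for this example, not an existing HPGMG acceptance rule.

```c
#include <stddef.h>

/* Hypothetical acceptance check.  err[0..nlevels-1] holds max-norm
 * errors from coarsest to finest grid, with the spacing halved at each
 * level.  Returns 1 if every refinement reduces the error by at least
 * min_ratio (close to 16 for a 4th-order method), 0 otherwise.
 * Non-positive or NaN errors on a refined level are rejected. */
int errors_show_expected_order(const double *err, size_t nlevels,
                               double min_ratio) {
    for (size_t i = 1; i < nlevels; i++) {
        if (!(err[i] > 0.0)) {
            return 0;
        }
        double ratio = err[i - 1] / err[i];
        if (!(ratio >= min_ratio)) {
            return 0;
        }
    }
    return 1;
}
```

A threshold slightly below 16 (say 15) leaves room for pre-asymptotic behavior on the coarser levels; a shortcut that degrades the scheme to 2nd order would show ratios near 4 and fail.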

Brian


> On Jul 31, 2015, at 6:41 PM, JOHN SHALF <jshalf at me.com> wrote:
> 
> Brian,
> George Michelogiannakis implemented a reproducible dot product (published 2 years ago at SC).  We demonstrated only a 3% slowdown over a conventional double-precision dot product.  Jim Demmel's postdoc implemented something similar last year and added it to ReproBLAS.
> 
> My recommendation is to just link against ReproBLAS since it is a released, tuned, and professionally supported reproducible BLAS implementation.
> 
> -john
> 
> On Jul 31, 2015, at 5:04 PM, Brian Van Straalen <bvstraalen at lbl.gov> wrote:
> 
>> Once you reach the bottom solver for full multigrid you should not be in the network at all anymore.  You would be computing a dot product over a few hundred doubles in L1 memory.  I’m just guessing here, but I don’t think a reproducible dot product would be that costly to implement in software as a compensated sum or distillation.  There was a talk at ISC this year showing some results.  From DRAM, compensated summation costs about the same as a plain sum, and it gets relatively more expensive in faster memory levels.  The impact was never 50x, though.
>> 
>> 
>> 
>>> On Jul 31, 2015, at 4:38 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
>>> 
>>> If by 128b floats, you mean IEEE754 quad precision implemented in SW, then the associated dot product will run ~50x slower on conventional hardware (that is, hardware that does not support QP).
>>> 
>>> It should be possible to implement DDP or some form of compensated summation more efficiently.
>>> 
>>> Jeff
>>> 
>>> On Fri, Jul 31, 2015 at 4:18 PM, Brian Van Straalen <bvstraalen at lbl.gov> wrote:
>>> 
>>> I would think that we could probably implement a reproducible dot product in the Krylov code, since it only happens on the coarse grid, which should be small enough.
>>> 
>>> HPGMG uses max norms, so we should be ok for that part.
>>> 
>>> Brian
>>> 
>>> 
>>>> On Jul 31, 2015, at 3:27 PM, Hoemmen, Mark <mhoemme at sandia.gov> wrote:
>>>> 
>>>> 
>>>> 
>>>> On 7/31/15, 3:45 PM, "Jed Brown" <jed at jedbrown.org> wrote:
>>>> 
>>>>> Brian Van Straalen <bvstraalen at lbl.gov> writes:
>>>>>> The concern is not trivial.  I've spent some time re-reading the
>>>>>> Precimonious paper (http://eecs.berkeley.edu/~rubio/includes/sc13.pdf)
>>>>>> and I realize
>>>>>> that it would not be hard to make a faster version of FMG using mixed
>>>>>> precision.
>>>>> 
>>>>> Just a quick comment now.  I think there's not as much fat to trim as
>>>>> you think.  In general, the precision needs to be as accurate as the
>>>>> discretization.  Most flops occur on fine grids where the discretization
>>>>> is more accurate than single precision.  I challenge you to speed up
>>>>> HPGMG by more than, say, 15%, while maintaining order of accuracy on
>>>>> fine grids.
>>>>> 
>>>>>> There have been papers over the last few years using 4-byte AMG as a
>>>>>> preconditioner
>>>>> 
>>>>> So much fat already.  Then you have a Krylov method and full-accuracy
>>>>> residuals, but HPGMG solves in the cost of a few residual evaluations.
>>>>> Also, these low-accuracy preconditioners are usually used for problems
>>>>> that are only modestly ill-conditioned.  Try it with an operator with
>>>>> condition number 10^{12} like you see in solid mechanics or geodynamics
>>>>> and it doesn't look so hot any more.
>>>> 
>>>> It could be fun to use such a tool to find out the best places to put
>>>> 128-bit floating-point arithmetic.  That could help with some really hard
>>>> problems, or at least avoid some reproducibility issues.
>>>> 
>>>> mfh
>>> 
>>> Brian Van Straalen         Lawrence Berkeley Lab
>>> BVStraalen at lbl.gov         Computational Research
>>> (510) 486-4976             Division (crd.lbl.gov)
>>> 
>>> _______________________________________________
>>> HPGMG-Forum mailing list
>>> HPGMG-Forum at hpgmg.org
>>> https://hpgmg.org/lists/listinfo/hpgmg-forum
>>> 
>>> --
>>> Jeff Hammond
>>> jeff.science at gmail.com
>>> http://jeffhammond.github.io/

Brian Van Straalen         Lawrence Berkeley Lab
BVStraalen at lbl.gov         Computational Research
(510) 486-4976             Division (crd.lbl.gov)
