[HPGMG Forum] [EXTERNAL] Re: Acceptable rounding errors

Jeff Hammond jeff.science at gmail.com
Sat Aug 1 00:21:32 UTC 2015


You said "we could probably implement a reproducible dot product".  As luck
would have it, I have actually implemented and extensively tested 128b
precision (which is what Mark proposed) dot products.  My ~50x number is
the measured difference between 64b and 128b dot products on IA.  The data
are invariant to GCC vs Intel compilers and Fortran vs C implementation in
my tests.
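
For concreteness, a minimal sketch of the two variants I compared, assuming
GCC's __float128 extension for the software quad type (illustrative only,
not my benchmark code):

  /* 64b dot product: plain double-precision accumulation. */
  double dot64(const double *x, const double *y, int n)
  {
      double s = 0.0;
      for (int i = 0; i < n; i++)
          s += x[i] * y[i];
      return s;
  }

  /* 128b dot product: accumulate in quad precision, which is emulated
   * in software on hardware without native QP support, then round to
   * double once at the end. */
  double dot128(const double *x, const double *y, int n)
  {
      __float128 s = 0.0;
      for (int i = 0; i < n; i++)
          s += (__float128)x[i] * (__float128)y[i];
      return (double)s;
  }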

I have also implemented Kahan summation for dot products, but haven't spent
enough time with the experiments to report performance data that I would
consider reliable enough for decision-making.  Looking at the code, I would
estimate it is more like ~10x slower than straight DP.
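
For reference, a minimal sketch of the Kahan-compensated variant
(illustrative, not the exact code I benchmarked):

  /* Kahan-compensated dot product: carry the rounding error of each
   * addition in c and fold it back in on the next iteration. */
  double dot_kahan(const double *x, const double *y, int n)
  {
      double s = 0.0, c = 0.0;
      for (int i = 0; i < n; i++) {
          double t = x[i] * y[i] - c;  /* apply the carried correction */
          double u = s + t;            /* low-order bits of t may be lost */
          c = (u - s) - t;             /* recover the lost bits */
          s = u;
      }
      return s;
  }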

Marat Dukhan has done some nice work on DDP (double-double precision) that
has been reported on other public email lists.
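
For those unfamiliar with it, DDP dot products are typically built from the
standard error-free transformations (this is essentially the Dot2 algorithm
of Ogita, Rump, and Oishi); a minimal sketch, assuming FMA hardware for the
exact product error:

  #include <math.h>  /* fma */

  /* TwoSum (Knuth): s + e == a + b exactly. */
  static void two_sum(double a, double b, double *s, double *e)
  {
      *s = a + b;
      double z = *s - a;
      *e = (a - (*s - z)) + (b - z);
  }

  /* TwoProd: p + e == a * b exactly, using a fused multiply-add. */
  static void two_prod(double a, double b, double *p, double *e)
  {
      *p = a * b;
      *e = fma(a, b, -(*p));
  }

  /* Double-double dot product: keep a double-precision head plus a
   * running tail of accumulated error terms; fold them at the end. */
  double dot_dd(const double *x, const double *y, int n)
  {
      double hi = 0.0, lo = 0.0;
      for (int i = 0; i < n; i++) {
          double p, ep, s, es;
          two_prod(x[i], y[i], &p, &ep);
          two_sum(hi, p, &s, &es);
          hi = s;
          lo += es + ep;
      }
      return hi + lo;
  }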

Anyway, I have no position in this debate.  I am merely trying to provide
measured data to inform the discussion you all are having.

Jeff

On Fri, Jul 31, 2015 at 5:04 PM, Brian Van Straalen <bvstraalen at lbl.gov>
wrote:

> Once you reach the bottom solver for full multigrid you should not be in
> the network at all anymore.  You would be computing a dot product over a
> few hundred doubles in L1 memory.  I'm just guessing here, but I don't think
> the cost of a reproducible dot product would be that bad to implement in
> software as a compensated sum or distillation.  There was a talk at ISC
> this year showing some results.  From DRAM, compensated summation costs
> about the same as an ordinary sum, and it gets relatively more expensive at
> faster memory levels.  The impact was never 50x, though.
>
>
>
> On Jul 31, 2015, at 4:38 PM, Jeff Hammond <jeff.science at gmail.com> wrote:
>
> If by 128b floats you mean IEEE 754 quad precision implemented in software,
> then the associated dot product will run ~50x slower on conventional
> hardware (that is, hardware that does not support QP natively).
>
> It should be possible to implement DDP or some form of compensated
> summation more efficiently.
>
> Jeff
>
> On Fri, Jul 31, 2015 at 4:18 PM, Brian Van Straalen <bvstraalen at lbl.gov>
> wrote:
>
>>
>> I would think that we could probably implement a reproducible dot product
>> in the Krylov code, since it only happens on the coarse grid, which should
>> be small enough.
>>
>> HPGMG uses max norms, so we should be ok for that part.
>>
>> Brian
>>
>>
>> On Jul 31, 2015, at 3:27 PM, Hoemmen, Mark <mhoemme at sandia.gov> wrote:
>>
>>
>>
>> On 7/31/15, 3:45 PM, "Jed Brown" <jed at jedbrown.org> wrote:
>>
>> Brian Van Straalen <bvstraalen at lbl.gov> writes:
>>
>> The concern is not trivial.  I've spent some time re-reading the
>> Precimonious paper (http://eecs.berkeley.edu/~rubio/includes/sc13.pdf)
>> and I realize that it would not be hard to make a faster version of FMG
>> using mixed precision.
>>
>>
>> Just a quick comment for now.  I think there's not as much fat to trim as
>> you think.  In general, the arithmetic precision needs to be at least as
>> accurate as the discretization.  Most flops occur on fine grids, where the
>> discretization is more accurate than single precision.  I challenge you to
>> speed up HPGMG by more than, say, 15% while maintaining order of accuracy
>> on fine grids.
>>
>> There have been papers over the last few years using single-precision
>> (4-byte) AMG as a preconditioner
>>
>>
>> So much fat already.  Then you have a Krylov method and full-accuracy
>> residuals, but HPGMG solves in the cost of a few residual evaluations.
>> Also, these low-accuracy preconditioners are usually used for problems
>> that are only modestly ill-conditioned.  Try it with an operator with
>> condition number 10^{12} like you see in solid mechanics or geodynamics
>> and it doesn't look so hot any more.
>>
>>
>> It could be fun to use such a tool to find out the best places to put
>> 128-bit floating-point arithmetic.  That could help with some really hard
>> problems, or at least avoid some reproducibility issues.
>>
>> mfh
>>
>>
>> Brian Van Straalen         Lawrence Berkeley Lab
>> BVStraalen at lbl.gov         Computational Research
>> (510) 486-4976             Division (crd.lbl.gov)


-- 
Jeff Hammond
jeff.science at gmail.com
http://jeffhammond.github.io/