In article <nk78ps$t2a$1...@gioia.aioe.org>,
Terje Mathisen <terje.m...@tmsw.no> wrote:
> Bonita Montero wrote:
> > Am 19.06.2016 um 00:08 schrieb Terje Mathisen:
> >
> >> In particular operations on NaN(s) intentionally breaks associativity,
> >> look at comparisons.
> >
> > I found a funny thing: When operating on two NaNs with the x87-FPU,
> > the lower 51 bits of the result is loaded with the larger value of
> > the lower 51 bits of both operands.
This is with a rounding precision of 53 bits, or perhaps after converting
the value in an x87 register back to a double in memory and looking at
it there. Normally, the rounding precision is 64 bits and it is
the lower 62 or 63 bits that are compared.

This is also only with quiet NaNs IIRC. The quiet bit gives another
bit to compare, and messes up the comparison if it is not set
consistently.
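
For anyone who wants to look at the bits themselves, here is a minimal C
sketch. All of the names (make_qnan, nan_payload, is_quiet, add_nans) are
made up for illustration. It assumes IEEE 754 binary64 doubles, where the
quiet bit is bit 51 and the payload is the 51 bits below it. The volatile
copies are only there to discourage constant folding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define QUIET_BIT    (UINT64_C(1) << 51)        /* top fraction bit */
#define PAYLOAD_MASK (QUIET_BIT - 1)            /* low 51 bits */
#define EXP_MASK     (UINT64_C(0x7ff) << 52)    /* all-ones exponent */

/* Reinterpret a double as its 64-bit pattern without aliasing problems. */
static uint64_t bits_of(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Build a positive quiet NaN carrying the given 51-bit payload. */
static double make_qnan(uint64_t payload)
{
    assert(payload <= PAYLOAD_MASK);
    uint64_t u = EXP_MASK | QUIET_BIT | payload;
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

static uint64_t nan_payload(double d)
{
    return bits_of(d) & PAYLOAD_MASK;
}

static int is_quiet(double d)
{
    return (bits_of(d) & QUIET_BIT) != 0;
}

/* Add two values at run time; volatile keeps the compiler from folding. */
static double add_nans(double x, double y)
{
    volatile double a = x, b = y;
    return a + b;
}
```

Calling nan_payload(add_nans(make_qnan(5), make_qnan(9))) and then swapping
the arguments shows which operand's payload survives. Which one that is
depends on whether the compiler used x87 or SSE for the addition, so no
particular value should be relied on.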
> Not funny:
>
> This is a very natural result of having an internal pathway that sorts
> the inputs into larger/smaller (ignoring sign), which is something you
> need if you are going to subtract.
>
> It doesn't matter for FMUL or FADD, except when an FADD has inputs with
> opposite signs and really was a FSUB.
>
> For FDIV however you need to skip that compare/swap network, but FDIV is
> an order of magnitude slower anyway.
This gives predictable results in some cases.
The FreeBSD math library promotes both args to long double for returning
combinations of NaNs to get predictable results in more cases. It
sometimes uses the following as a not very magic C expression for
mixing the NaN x with the NaN y:

    (x + (long double)0) + (y + (long double)0)

This adds 0 to ensure that the quiet bit is set consistently (provided
there are no compiler bugs and/or certain #pragmas are set to disallow
certain optimizations). It adds specifically (long double)0 to get
the operands in the x87 on x86 systems (provided the compiler does the
expected thing). This makes the final result depend only on the x87
hardware and not on the types of x and y or compiler optimizations or
bugs.
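
Wrapped in a function, the expression might look like the sketch below.
The name mix_nans is hypothetical and this is not a copy of the FreeBSD
source; it only shows the shape of the idiom for double arguments. The
assert documents the assumption that it is called only when at least one
argument is a NaN.

```c
#include <assert.h>
#include <math.h>

/*
 * Combine the NaN(s) in x and y into one NaN result.
 * Adding (long double)0 promotes each operand to long double, so that on
 * x86 the final addition is expected to be done by the x87, and quiets
 * any signaling NaN before that addition.
 */
static double mix_nans(double x, double y)
{
    assert(isnan(x) || isnan(y));
    return (double)((x + (long double)0) + (y + (long double)0));
}
```

A typical caller would be something like "if (isnan(x) || isnan(y)) return
mix_nans(x, y);" at the top of a two-argument function. Whether the compiler
really keeps both additions, and really uses the x87 for them, is exactly
the "provided the compiler does the expected thing" caveat above.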
As was noted in the original posting, x+y is not even commutative for
SSE. It is commutative for x87 since it is basically the bitwise
maximum with some complication for the sign bit. The promotion to
long double is partly to get this commutativity. With commutativity,
it does not matter if the compiler evaluates x+y in the order y+x.
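
The commutativity claim can be probed with a small C sketch like the one
below. The function names are invented for this example, and it assumes a
64-bit double. It compares the bit patterns of a+b and b+a, once in double
and once after promoting to long double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its 64-bit pattern. */
static uint64_t bits_of(double d)
{
    uint64_t u;
    assert(sizeof u == sizeof d);
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Nonzero if a+b and b+a give the same bits when added as doubles. */
static int add_commutes_double(double x, double y)
{
    volatile double a = x, b = y;
    double xy = a + b;
    double yx = b + a;
    return bits_of(xy) == bits_of(yx);
}

/* Same check, but with the operands promoted to long double first. */
static int add_commutes_promoted(double x, double y)
{
    volatile long double a = x, b = y;
    double xy = (double)(a + b);
    double yx = (double)(b + a);
    return bits_of(xy) == bits_of(yx);
}
```

Feeding both functions two quiet NaNs with different payloads is the
interesting case. On x86 with SSE arithmetic for double I would expect the
first function to report a difference and the second not to, but the
compiler is free to swap the operands itself, so the result is only a hint.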
Plain x+y with no compiler bugs would give consistent results if
x, y and the return type are all the same, even if x is signaling
and y is quiet. The magic expression forces x and y to be quiet
before the comparison (although that is not the natural x87 operation)
for the technical reason that it is too difficult to preserve the
non-quietness of x and y unless they have type long double.
This method also works on ia64 and sparc64 systems and was needed to
give consistent results on one or both of those.
Bruce