Dealing with libc math differences

162 views
Skip to first unread message

Erik Bray

unread,
Aug 10, 2016, 7:38:59 AM8/10/16
to sage-devel
Hi all,

Sorry if this has been discussed ad-infinitum before--I looked around
a bit but didn't find a definitive answer.

I have one (well at least one) test that's failing on Cygwin due to
tiny difference in the last digit of the result of log(3).

This leads to to several questions:

1) Is it worth investigating the reason for the difference?
2) Is it worth trying to provide a workaround for the difference?
3) Or should the test just be updated to ignore the last couple digits
of the result, and if so how (ellipses?)

Thanks,
Erik

Travis Scrimshaw

unread,
Aug 10, 2016, 8:39:03 AM8/10/16
to sage-devel
Hey Erik,
   It seems like what you want is # rel tol XYZ or # abs tol XYZ. For an example, see rings/real_double.pyx.

Best,
Travis

Thierry Dumont

unread,
Aug 10, 2016, 8:45:57 AM8/10/16
to sage-...@googlegroups.com
Hello,

What do you mean by log(3) ?

Is it log(3.) or log(RDF(3)) ?


With log(RDF(3)), this is classical; changing the library version, or
the compiler version often changes the way some expressions and
functions are computed. As floating point arithmetic is not associative,
this is classical. With log(3.) (RealField()), I don't know.
When we wrote the book "Calcul Mathématique avec Sage", we had this
problem: after some release (some months) some doctests of the chapter
on numerical algebra and floating point numbers failed. The only work
around we found was to truncate the last digit in the output, to try to
test what should remain constant.

Note that:

sage: log(RDF(3))
1.0986122886681098
sage: log(3.)
1.09861228866811

But:
sage: log(RDF(3)).n(prec=53)
1.09861228866811
sage: log(3.).n(prec=53)
1.09861228866811

sage: log(RDF(3))==log(3.)
True

as we also know that:
sage: RDF.prec()==RealField().prec()
True


So in this case it is only a problem of rounding output.


Good luck.

Yours
t.



tdumont.vcf

leif

unread,
Aug 10, 2016, 9:05:37 AM8/10/16
to sage-...@googlegroups.com
Erik Bray wrote:
> Hi all,
>
> Sorry if this has been discussed ad-infinitum before--I looked around
> a bit but didn't find a definitive answer.
>
> I have one (well at least one) test that's failing on Cygwin due to
> tiny difference in the last digit of the result of log(3).
>
> This leads to to several questions:
>
> 1) Is it worth investigating the reason for the difference?

I thought you had already tracked it down to "libc math" (= libm I guess).

Which log() btw.? logf(), log(), logl()? (Presumably not clog*().)


-leif

Jeroen Demeyer

unread,
Aug 10, 2016, 9:08:27 AM8/10/16
to sage-...@googlegroups.com
On 2016-08-10 13:38, Erik Bray wrote:
> 1) Is it worth investigating the reason for the difference?

No, but it is worth determining how bad the error is. In all cases, I
would say that an error of less than 1 ulp is totally acceptable. If the
error is 1 ulp or more for a basic function (like log) on a reasonable
non-pathological input (like 3.0), I would consider that an upstream
bug. Still, it would not necessarily be a problem for Sage if the error
is a few ulp.

> 3) Or should the test just be updated to ignore the last couple digits
> of the result, and if so how (ellipses?)

The best way is to use # (rel|abs) tol in the doctest.

Marc Mezzarobba

unread,
Aug 10, 2016, 11:01:17 AM8/10/16
to sage-...@googlegroups.com
Jeroen Demeyer wrote:
> If the error is 1 ulp or more for a basic function (like log) on a
> reasonable non-pathological input (like 3.0), I would consider that
> an upstream bug.

Quite a few implementations only provide accuracies of 3-4 ulp for speed
reasons (it may make it possible to use double precision internally for
intermediate operations that would otherwise require some kind of
extended precision).

--
Marc

Erik Bray

unread,
Aug 11, 2016, 6:24:31 AM8/11/16
to sage-devel
On Wed, Aug 10, 2016 at 2:45 PM, Thierry Dumont
<tdu...@math.univ-lyon1.fr> wrote:
> Le 10/08/2016 à 13:38, Erik Bray a écrit :
>> Hi all,
>>
>> Sorry if this has been discussed ad-infinitum before--I looked around
>> a bit but didn't find a definitive answer.
>>
>> I have one (well at least one) test that's failing on Cygwin due to
>> tiny difference in the last digit of the result of log(3).
>>
>> This leads to to several questions:
>>
>> 1) Is it worth investigating the reason for the difference?
>> 2) Is it worth trying to provide a workaround for the difference?
>> 3) Or should the test just be updated to ignore the last couple digits
>> of the result, and if so how (ellipses?)
>>
>> Thanks,
>> Erik
>>
> Hello,
>
> What do you mean by log(3) ?
>
> Is it log(3.) or log(RDF(3)) ?

Really what I meant was `float(log(3))` which I guess in Sage ends up
being same as `log(RDF(3))`

> With log(RDF(3)), this is classical; changing the library version, or
> the compiler version often changes the way some expressions and
> functions are computed. As floating point arithmetic is not associative,
> this is classical. With log(3.) (RealField()), I don't know.
> When we wrote the book "Calcul Mathématique avec Sage", we had this
> problem: after some release (some months) some doctests of the chapter
> on numerical algebra and floating point numbers failed. The only work
> around we found was to truncate the last digit in the output, to try to
> test what should remain constant.

I have:

sage: R = RealField(100)
sage: log(R(3))
1.0986122886681096913952452369

Okay. With Cygwin's libm:

sage: float(log(3))
1.0986122886681096

While on Linux I get:

sage: float(log(3))
1.0986122886681098

It looks like the latter result is properly rounded up (with an error
of 1ulp) while my result on Cygwin is just truncated.

Erik Bray

unread,
Aug 11, 2016, 6:24:48 AM8/11/16
to sage-devel
On Wed, Aug 10, 2016 at 3:08 PM, Jeroen Demeyer <jdem...@cage.ugent.be> wrote:
> On 2016-08-10 13:38, Erik Bray wrote:
>>
>> 1) Is it worth investigating the reason for the difference?
>
>
> No, but it is worth determining how bad the error is. In all cases, I would
> say that an error of less than 1 ulp is totally acceptable. If the error is
> 1 ulp or more for a basic function (like log) on a reasonable
> non-pathological input (like 3.0), I would consider that an upstream bug.
> Still, it would not necessarily be a problem for Sage if the error is a few
> ulp.

In this case the difference is just one ulp.

>> 3) Or should the test just be updated to ignore the last couple digits
>> of the result, and if so how (ellipses?)
>
>
> The best way is to use # (rel|abs) tol in the doctest.

Great, I found that shortly after posting. I think in this case it
should probably be fine?

leif

unread,
Aug 11, 2016, 7:15:32 AM8/11/16
to sage-...@googlegroups.com
FWIW, on Cygwin we used to use the Cephes library for a while. (No idea
whether it now replaced libm or parts of it got merged into libm
upstream, or none of that.)

But the result you get presumably also depends on whether libm was
compiled with -mfpmath=sse or -mfpmath=387 etc.


-leif



Dima Pasechnik

unread,
Aug 11, 2016, 8:17:00 AM8/11/16
to sage-devel
not sure about this; cygwin uses libm from something called newlib, which is mostly developed by Sun some 20+ years ago.
So it's pretty much non-optimised for particular hardware.
Although they keep adding more stuff to it even now, cf. e.g.

Dima

-leif



Erik Bray

unread,
Aug 11, 2016, 8:24:33 AM8/11/16
to sage-devel
On Thu, Aug 11, 2016 at 2:17 PM, Dima Pasechnik <dim...@gmail.com> wrote:
>> FWIW, on Cygwin we used to use the Cephes library for a while. (No idea
>> whether it now replaced libm or parts of it got merged into libm
>> upstream, or none of that.)
>>
>> But the result you get presumably also depends on whether libm was
>> compiled with -mfpmath=sse or -mfpmath=387 etc.
>>
> not sure about this; cygwin uses libm from something called newlib, which is
> mostly developed by Sun some 20+ years ago.
> So it's pretty much non-optimised for particular hardware.
> Although they keep adding more stuff to it even now, cf. e.g.
> https://sourceware.org/ml/newlib-cvs/2016-q1/msg00046.html

Yup, that's why I wrote "libc" at some point--just because newlib
incorporates both libc and libm so it's easy to think of them as part
of the same library. I have no idea how it is configured.

One major disadvantage to it is it is still missing most double
complex functions which creates a lot of hassle, though that's an
orthogonal issue.

leif

unread,
Aug 11, 2016, 9:15:31 AM8/11/16
to sage-...@googlegroups.com
Lacking (at least some) C99 functions (especially long double versions
of the gamma functions) was the reason to include and use Cephes on
Cygwin, but that was years ago. (It's no longer built on Cygwin since
about four years, cf. #9543.)


-leif


Jeroen Demeyer

unread,
Aug 12, 2016, 3:42:19 AM8/12/16
to sage-...@googlegroups.com
On 2016-08-11 12:24, Erik Bray wrote:
> In this case the difference is just one ulp.

To be more precise: the difference is actually about 0.59 ulp on Cygwin:

sage: R = RealField(256)
sage: log3_exact = R(3).log()
sage: log3_cygwin = RDF(1.0986122886681096)
sage: (R(log3_cygwin) - log3_exact)/log3_cygwin.ulp()
-0.5914650915268009

Erik Bray

unread,
Aug 12, 2016, 5:08:59 AM8/12/16
to sage-devel
Oh nice, I did't know you could do that.
In the above I meant the difference between the result I had on Linux
and on Cygwin which of course have the same floating point precision,
not the difference from the (closer to) exact value. Using the same
calculation as above the difference of the Linux result from exact is
~0.41 ulp.

All the more reason to just add a small tolerance on this particular
test. Thanks again!

leif

unread,
Aug 12, 2016, 5:24:21 AM8/12/16
to sage-...@googlegroups.com
Yeah, the doctest framework should automatically base the tolerance on
such comparisons... ;-) :P


-leif


Jeroen Demeyer

unread,
Aug 12, 2016, 5:40:38 AM8/12/16
to sage-...@googlegroups.com
On 2016-08-12 11:24, leif wrote:
> Yeah, the doctest framework should automatically base the tolerance on
> such comparisons... ;-) :P

I don't know how serious you use, but you have to keep in mind that the
doctest framework works with strings. The main problem is that you would
need to guess the precision in bits from some string representation.
That's not easy because RealField and RDF already print numbers differently.

Apart from this issue, we could support something like "# ulp tol 2" for
a tolerance of 2 ulp.

leif

unread,
Aug 12, 2016, 6:49:28 AM8/12/16
to sage-...@googlegroups.com
Jeroen Demeyer wrote:
> On 2016-08-12 11:24, leif wrote:
>> Yeah, the doctest framework should automatically base the tolerance on
>> such comparisons... ;-) :P
>
> I don't know how serious you use, but you have to keep in mind that the
> doctest framework works with strings. The main problem is that you would
> need to guess the precision in bits from some string representation.
> That's not easy because RealField and RDF already print numbers
> differently.

Not /that/ serious, as the mark-up was supposed to indicate. ;-)

While this may be trivial for toy examples, it'll get hard to impossible
for more complex ones, as you'd have to interpret the code doing the
computations and derive formulas from that, where you must not simplify
expressions in the first place to take into account precision of each
individual operation performed. (Also, thresholds and cut-offs in
algorithms are usually "hard-coded" rather than given by explicit
formulas one could interpret, i.e., these are in comments or hidden in
[hopefully] referenced papers.) You'd also need specifications for
externally defined functions.

But rather than trying that at test time, a tool could help /when
writing the tests/ (such that you could use the value(s) and tolerance
you get from that in the "expected" results), for "easy" cases at least.

In practice, most people just take *their* result (i.e., copy-paste what
they get on their machine, and probably add some guessed tolerance,
perhaps after some buildbot complained).

I think we had some discussion that in /examples/ the latter is more
appropriate than giving some "fake" (more) exact result.


> Apart from this issue, we could support something like "# ulp tol 2" for
> a tolerance of 2 ulp.

At least we should fix #21121 [1], which is distantly related.


-leif


[1] "Precision in doctest tolerance: x > x"
https://trac.sagemath.org/ticket/21121


Erik Bray

unread,
Aug 19, 2016, 5:00:30 AM8/19/16
to sage-devel
The difference in log(3) strikes again, but this time in a test where
it's harder to just set a tolerance. The test is for the algdep()
method of ComplexDoubleElements:
https://git.sagemath.org/sage.git/tree/src/sage/rings/complex_double.pyx#n2377

Because of the way RDF(sqrt(3)) is calculated it uses log(3). This
results in a different numeric result for "z". The test already has
"abs tol 1e-16" for "z" which is fine, but the algorithm from PARI is
sensitive to even that small difference and gives a totally different
polynomial result:

sage: z = (1/2)*(1 + RDF(sqrt(3)) *CDF.0); z
0.5 + 0.8660254037844386*I
sage: p = z.algdep(5); p
x^5 + x^2
sage: z^5 + z^2
-4.996003610813204e-16 - 2.220446049250313e-16*I

So the answer isn't wrong--it's still a good approximation. But the
test expects:

sage: z = (1/2)*(1 + RDF(sqrt(3)) *CDF.0); z
0.5 + 0.8660254037844387*I
sage: p = z.algdep(5); p
x^3 + 1

Not really sure what the best course of action would be for this test
then, other than to hard-code a value for z.

Erik Bray

unread,
Aug 19, 2016, 5:31:03 AM8/19/16
to sage-devel
To answer my own question, this is how I rewrote the test for now:

2389 sage: z = (1/2)*(1 + RDF(sqrt(3)) *CDF.0); z # abs tol 1e-16
2390 0.5 + 0.8660254037844387*I
2391 sage: p = z.algdep(5); p # random
2392 x^3 + 1
2393 sage: p.factor()
2394 (x + 1) * ... (x^2 - x + 1)
2395 sage: abs(z^2 - z + 1) < 1e-14
2396 True

This works out fine because the result I got still has (x^2 - x + 1)
as a factor. It just has one additional factor of x^2.
I'm not too happy with marking the result of z.algdep() as "random"
though. It's definitely deterministic, just sensitive. I think the
doctest module maybe needs an "# ignore" flag that is effectively the
same as "# random" without the implication that the output is truly
random. Just that the line should be run, but its output should not
be checked.

leif

unread,
Aug 20, 2016, 9:47:47 AM8/20/16
to sage-...@googlegroups.com
At least add a comment why the '# random' is there, e.g. something like
'# random -- exact output differs on Cygwin/depending on the precision
of log()...' (perhaps with a trac ticket number).


If there are more instances, we may consider '# optional -- cygwin' or
whatever (analogous to '# 32-bit' and '# 64-bit').


-leif


Erik Bray

unread,
Aug 22, 2016, 5:28:58 AM8/22/16
to sage-devel
Already done one better: https://trac.sagemath.org/ticket/21292
(added you to CC if you're interested but feel free to remove
otherwise)

> If there are more instances, we may consider '# optional -- cygwin' or
> whatever (analogous to '# 32-bit' and '# 64-bit').

Yes, having platform specific tests could be nice. I keep thinking of
adding something to this effect, but so far I've been able to avoid
the need, which I think is good. But if it comes to it, this will be
nice to have. If such a thing is implemented I think it should use
the existing (draft, but long in use in the wild) standard for
Environment Markers: https://www.python.org/dev/peps/pep-0496/

Best,
Erik
Reply all
Reply to author
Forward
0 new messages