doctest failures due to rounding errors on Solaris.

35 views
Skip to first unread message

Dr. David Kirkby

unread,
Dec 29, 2009, 6:29:42 PM12/29/09
to sage-...@googlegroups.com
I started to run the Sage 4.3 tests suite on 't2' some time back (It has been
going about 36 hours now. Perhaps it might finish before 4.3.1 comes out.)

There are a few failures, but these couple just came up, and look rather odd, as
this is only an error in the last decimal place, and so will be subject to
rounding errors with different floating point processors. They do not really
seem failures to me.


**********************************************************************
File "/rootpool2/local/kirkby/sage-4.3/devel/sage/sage/symbolic/pynac.pyx", line
1276:
sage: py_exp(float(1))
Expected:
2.7182818284590451
Got:
2.7182818284590455
**********************************************************************

**********************************************************************
File
"/rootpool2/local/kirkby/sage-4.3/devel/sage/sage/symbolic/constants_c.pyx",
line 195:
sage: float(e)
Expected:
2.7182818284590451
Got:
2.7182818284590455
**********************************************************************

William Stein

unread,
Dec 29, 2009, 11:15:54 PM12/29/09
to sage-devel
On Tue, Dec 29, 2009 at 3:29 PM, Dr. David Kirkby
<david....@onetel.net> wrote:
> I started to run the Sage 4.3 tests suite on 't2' some time back (It has been
> going about 36 hours now. Perhaps it might finish before 4.3.1 comes out.)
>
> There are a few failures, but these couple just came up, and look rather odd, as
> this is only an error in the last decimal place, and so will be subject to
> rounding errors with different floating point processors. They do not really
> seem failures to me.

They are important. Search sage-devel, sage-release, and sage-trac
for numerical noise" for hundreds (?) of examples of how to deal with
such issues. The main point is to change the doctest to

sage: py_exp(float(1))
2.7182818284590...

if by hand we've determined that the mistake is really due to
different floating point conventions.


>
>
> **********************************************************************
> File "/rootpool2/local/kirkby/sage-4.3/devel/sage/sage/symbolic/pynac.pyx", line
> 1276:
>     sage: py_exp(float(1))
> Expected:
>     2.7182818284590451
> Got:
>     2.7182818284590455
> **********************************************************************
>
>
>
> **********************************************************************
> File
> "/rootpool2/local/kirkby/sage-4.3/devel/sage/sage/symbolic/constants_c.pyx",
> line 195:
>     sage: float(e)
> Expected:
>     2.7182818284590451
> Got:
>     2.7182818284590455
> **********************************************************************
>

> --
> To post to this group, send an email to sage-...@googlegroups.com
> To unsubscribe from this group, send an email to sage-devel+...@googlegroups.com
> For more options, visit this group at http://groups.google.com/group/sage-devel
> URL: http://www.sagemath.org
>

--
William Stein
Associate Professor of Mathematics
University of Washington
http://wstein.org

Peter Jeremy

unread,
Dec 30, 2009, 12:59:57 AM12/30/09
to sage-...@googlegroups.com
On 2009-Dec-29 23:29:42 +0000, "Dr. David Kirkby" <david....@onetel.net> wrote:
>There are a few failures, but these couple just came up, and look
>rather odd, as this is only an error in the last decimal place, and
>so will be subject to rounding errors with different floating point
>processors. They do not really seem failures to me.

One of the claims for IEEE-754 arithmetic is that basic arithmetic
operations (+, -, *, /, sqrt) will give bit-for-bit identical
results on different implementations.

>**********************************************************************
>File "/rootpool2/local/kirkby/sage-4.3/devel/sage/sage/symbolic/pynac.pyx", line
>1276:
> sage: py_exp(float(1))
>Expected:
> 2.7182818284590451
>Got:
> 2.7182818284590455
>**********************************************************************

As a correctly rounded IEEE-754 double precision constant,
e=0x4005BF0A8B145769 (0x2.B7E1 5162 8AED 2), this should convert
to 2.7182818284590451 (rounded to "sufficient" accuracy).

2.7182818284590455 is 1ULP high - this may reflect a rounding error
in either the exp() or double_to_ascii() implementation.

--
Peter Jeremy

Robert Bradshaw

unread,
Dec 30, 2009, 3:03:56 AM12/30/09
to sage-...@googlegroups.com
On Dec 29, 2009, at 9:59 PM, Peter Jeremy wrote:

> On 2009-Dec-29 23:29:42 +0000, "Dr. David Kirkby" <david....@onetel.net
> > wrote:
>> There are a few failures, but these couple just came up, and look
>> rather odd, as this is only an error in the last decimal place, and
>> so will be subject to rounding errors with different floating point
>> processors. They do not really seem failures to me.
>
> One of the claims for IEEE-754 arithmetic is that basic arithmetic
> operations (+, -, *, /, sqrt) will give bit-for-bit identical
> results on different implementations.

I would have expected t2 to be IEEE-754 compliant, but maybe it isn't.

>> **********************************************************************
>> File "/rootpool2/local/kirkby/sage-4.3/devel/sage/sage/symbolic/

>> pynac.pyx", line
>> 1276:
>> sage: py_exp(float(1))
>> Expected:
>> 2.7182818284590451
>> Got:
>> 2.7182818284590455
>> **********************************************************************
>
> As a correctly rounded IEEE-754 double precision constant,
> e=0x4005BF0A8B145769 (0x2.B7E1 5162 8AED 2), this should convert
> to 2.7182818284590451 (rounded to "sufficient" accuracy).
>
> 2.7182818284590455 is 1ULP high - this may reflect a rounding error
> in either the exp() or double_to_ascii() implementation.

It's probably not double_to_ascii, as (on t2)

>>> float("2.7182818284590452353602874713526625")
2.7182818284590451

I bet Solaris has a completely different (and unfortunately not as
accurate) implementation of exp in its libm. It could also be a bug in
the compiler, as math.e is defined as

src/Include/pymath.h:#define Py_MATH_E 2.7182818284590452354

which has much less than 1 ulp accuracy. In any case, the correct fix
is as William mentioned to do place ...'s for the last digit.

- Robert

Dr. David Kirkby

unread,
Dec 30, 2009, 11:49:13 AM12/30/09
to sage-...@googlegroups.com

I might be mistaken, but I believe the SPARC processor users 64 bits for
floating point and the Intel chips use 80 internally, so it might be an issue
with the processor. I've got no idea how that fits in with IEEE-754.

Dave

Robert Bradshaw

unread,
Dec 30, 2009, 5:57:43 PM12/30/09
to sage-...@googlegroups.com

Yes, that could be the difference for exp. What's more worrisome is
that the C compiler incorrectly resolves the literal floating point
2.7182818284590452354 .

- Robert

Dr. David Kirkby

unread,
Dec 30, 2009, 7:38:49 PM12/30/09
to sage-...@googlegroups.com
Robert Bradshaw wrote:
>>>
>>>>> **********************************************************************
>>>>> File "/rootpool2/local/kirkby/sage-4.3/devel/sage/sage/symbolic/
>>>>> pynac.pyx", line
>>>>> 1276:
>>>>> sage: py_exp(float(1))
>>>>> Expected:
>>>>> 2.7182818284590451
>>>>> Got:
>>>>> 2.7182818284590455

> Yes, that could be the difference for exp. What's more worrisome is

> that the C compiler incorrectly resolves the literal floating point
> 2.7182818284590452354 .
>
> - Robert
>

Em, This is very odd. exp(1) gives a different result on SPARC if you build
with gcc or Sun Studio. GCC is correct, and Sun Studio is wrong. Yet Sage on
't2' was build with gcc, not Sun Studio.

On Open Solaris, using a Xeon processor, the two compilers give the same result.

Here's a noddy C program:

drkirkby@kestrel:~$ cat exp.c
#include <math.h>
#include <stdio.h>

int main() {
printf("%.16lf\n",exp(1.0));
}

This data below is from a SPARC (sun4u architecture) of mine. I've tried on 't2'
(sun4v) and get exactly the same results.

drkirkby@kestrel:~$ uname -a
SunOS kestrel 5.10 Generic_141444-09 sun4u sparc SUNW,UltraAX-i2
drkirkby@kestrel:~$ cat exp.c
#include <math.h>
#include <stdio.h>

int main() {
printf("%.16lf\n",exp(1.0));
}

drkirkby@kestrel:~$ gcc exp.c
drkirkby@kestrel:~$ ./a.out
2.7182818284590451
drkirkby@kestrel:~$ /opt/sunstudio12.1/bin/cc -lm exp.c
drkirkby@kestrel:~$ ./a.out
2.7182818284590455

Now, when I do this on Open Solaris, I get the same result with each compiler.

bash-3.2$ uname -a
SunOS hawk 5.11 snv_111b i86pc i386 i86pc
bash-3.2$ gcc exp.c
bash-3.2$ ./a.out
2.7182818284590451
bash-3.2$ /opt/sunstudio12.1/bin/cc -lm exp.c
bash-3.2$ ./a.out
2.7182818284590451

I also tried this on HP-UX, and get the same (correct) result with both the
native HP compiler and with gcc.

-bash-4.0$ uname -a
HP-UX hpbox B.11.11 U 9000/785 2016698240 unlimited-user license
-bash-4.0$ gcc exp.c
-bash-4.0$ ./a.out
2.7182818284590451
-bash-4.0$ cc -lm exp.c
-bash-4.0$ ./a.out
2.7182818284590451

IMHO, this should not be covered up by reducing the number of digits compared,
but recorded as a failure. Many packages have 'expected failures'. It would be
better if this was recorded as an 'expected failure'. I appreciate that might
not be how the doctests work now.

Dave

PS

It's interesting how both HP and Sun compilers need the '-lm' to link in the
maths library. GCC does that by default.

http://trac.sagemath.org/sage_trac/ticket/7067

is an open ticket which is caused by making the incorrect assumption that the C
compiler will automatically link in the maths library.

Robert Bradshaw

unread,
Dec 31, 2009, 5:08:46 AM12/31/09
to sage-...@googlegroups.com
On Dec 30, 2009, at 4:38 PM, Dr. David Kirkby wrote:

> Robert Bradshaw wrote:
>>>>
>>>>>> **********************************************************************
>>>>>> File "/rootpool2/local/kirkby/sage-4.3/devel/sage/sage/symbolic/
>>>>>> pynac.pyx", line
>>>>>> 1276:
>>>>>> sage: py_exp(float(1))
>>>>>> Expected:
>>>>>> 2.7182818284590451
>>>>>> Got:
>>>>>> 2.7182818284590455
>
>> Yes, that could be the difference for exp. What's more worrisome is
>> that the C compiler incorrectly resolves the literal floating point
>> 2.7182818284590452354 .
>>
>> - Robert
>>
>
> Em, This is very odd. exp(1) gives a different result on SPARC if
> you build
> with gcc or Sun Studio. GCC is correct, and Sun Studio is wrong. Yet
> Sage on
> 't2' was build with gcc, not Sun Studio.

I was actually unable to start sage on t2, so my experience is just
with the plain system Python. (Do you know what that was built with?)

Given the above, and the fact that we're looking at exp(1) and math.e,
I'm also starting to lean towards the fact that this should be correct
on every system, and we've uncovered a bug of some sort.

- Robert

Dr. David Kirkby

unread,
Dec 31, 2009, 7:31:12 AM12/31/09
to sage-...@googlegroups.com
Robert Bradshaw wrote:
> On Dec 30, 2009, at 4:38 PM, Dr. David Kirkby wrote:
>
>> Robert Bradshaw wrote:
>>>>>>> **********************************************************************
>>>>>>> File "/rootpool2/local/kirkby/sage-4.3/devel/sage/sage/symbolic/
>>>>>>> pynac.pyx", line
>>>>>>> 1276:
>>>>>>> sage: py_exp(float(1))
>>>>>>> Expected:
>>>>>>> 2.7182818284590451
>>>>>>> Got:
>>>>>>> 2.7182818284590455
>>> Yes, that could be the difference for exp. What's more worrisome is
>>> that the C compiler incorrectly resolves the literal floating point
>>> 2.7182818284590452354 .
>>>
>>> - Robert
>>>
>> Em, This is very odd. exp(1) gives a different result on SPARC if
>> you build
>> with gcc or Sun Studio. GCC is correct, and Sun Studio is wrong. Yet
>> Sage on
>> 't2' was build with gcc, not Sun Studio.
>
> I was actually unable to start sage on t2, so my experience is just
> with the plain system Python. (Do you know what that was built with?)

There's a semi-stable release of Sage at:

/rootpool2/local/kirkby/sage-4.3
)

I call it semi-stable, as I do occasionally make changes that might break it
temporarily, but I always put it back to a working Sage.

However, there is also

http://t2nb.math.washington.edu:8000/

which has Sage 4.3. That is a Solaris zone on the host t2.

If you want your own copy on 't2' which you can recompile easily, I could give
you one. I have a tar file on the zone 't2nb' which has both the source and
binaries of a complete stable version. I could copy that somewhere and make it
available if you want. (It's the tarball I used to copy Sage into the zone).

>> IMHO, this should not be covered up by reducing the number of digits
>> compared,
>> but recorded as a failure. Many packages have 'expected failures'.
>> It would be
>> better if this was recorded as an 'expected failure'. I appreciate
>> that might
>> not be how the doctests work now.
>
> Given the above, and the fact that we're looking at exp(1) and math.e,
> I'm also starting to lean towards the fact that this should be correct
> on every system, and we've uncovered a bug of some sort.
>
> - Robert

What I find odd is the fact that gcc gets it right with my noddy C program. But
when that gcc is used to build Sage, we find that Sage gets the same answer as
Sun Studio. One might suggest a different library might be used, but both gcc
and Sun Studio link against the same 3 libraries.

irkby@t2:[~] $ /opt/SUNWspro/bin/cc -lm exp.c
kirkby@t2:[~] $ ldd a.out
libm.so.2 => /lib/libm.so.2
libc.so.1 => /lib/libc.so.1
/platform/SUNW,T5240/lib/libc_psr.so.1
kirkby@t2:[~] $ gcc exp.c
kirkby@t2:[~] $ ldd a.out
libc.so.1 => /lib/libc.so.1
libm.so.2 => /lib/libm.so.2
/platform/SUNW,T5240/lib/libc_psr.so.1
kirkby@t2:[~] $

I do not know if there is any significance in the fact that ldd shows libm
before libc when built with the Sun compiler but swaps them when built with gcc.

There's no implementation of 'exp' in libc, as the following shows.

kirkby@t2:[~] $ /opt/SUNWspro/bin/cc exp.c
Undefined first referenced
symbol in file
exp exp.o
ld: fatal: Symbol referencing errors. No output written to a.out

I'm certainly against coving this up just now at least.

Dave

rjf

unread,
Dec 31, 2009, 12:14:19 PM12/31/09
to sage-devel
You guys could eliminate one (I suspect major) source of confusion by
not printing the numbers in decimal and reading them back in from
character strings.

You can read/write exact hexadecimal 64-bit floating-point binary
numbers.
or you can write them out as <sign>X <exact integer, written in any
base you choose> X 2^<exact integer>
without any error or possible confusion.

There are few people who will claim their numerical library functions
provide correctly rounded
results always. Read about the "table-maker's dilemma".
http://en.wikipedia.org/wiki/Rounding#The_table-maker.27s_dilemma

The fact that rounding of + and * is done right is irrelevant.
The fact that some computers are doing 80-bit intermediate
calculations vs 64-bit might be relevant.
But then if you want them all to agree, you can knee-cap the 80-bit
machines, so they will be the
same as 64-bit. Java originally required that.

Instead of flailing around switching Solaris compilers and what-have-
you
why not try figuring out whether the bug is printing or computing, or
even a bug at all.

RJF

Dr. David Kirkby

unread,
Dec 31, 2009, 2:15:29 PM12/31/09
to sage-...@googlegroups.com

The point you are missing is that we want to compare the output what Sage prints
to a human.

Comparing the bits of a floating point number would not help as a test suite. If
two systems have the same 64 bits to indicate the number, a bug in the code
which converts that to ASCII would not be detected. The tests compare ASCII
text, not numbers.

I would accept for some randomised testing, it might be better to bypaass the
binary to ASCII conversion, as it would allow for more tests to be conducted in
less time. But not in general would that be a good idea.

It's 20+ years since I done any assembly language programming - that was on an
386/387 chip, then later on a VAX - never SPARC. So I'd be at a loss really.
(BTW, after programming on a VAX, you realise how limited the x86 series was).

Dave

Gonzalo Tornaria

unread,
Dec 31, 2009, 2:50:05 PM12/31/09
to sage-...@googlegroups.com
On Wed, Dec 30, 2009 at 10:38 PM, Dr. David Kirkby
<david....@onetel.net> wrote:
> Em, This is very odd.  exp(1) gives a different result on SPARC if you build
> with gcc or Sun Studio. GCC is correct, and Sun Studio is wrong. Yet Sage on
> 't2' was build with gcc, not Sun Studio.

gcc is actually inlining exp(1.0) to its correct value. The exp() from
the sun library is incorrect. Try this program instead:

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
double x = 1.0;
if (argc>1)
x = atof(argv[1]);
printf("%.16lf\n",exp(x));
}

Best, Gonzalo

rjf

unread,
Dec 31, 2009, 4:01:50 PM12/31/09
to sage-devel

On Dec 31, 11:15 am, "Dr. David Kirkby" <david.kir...@onetel.net>
wrote:

> > RJF
>
> The point you are missing is that we want to compare the output what Sage prints
> to a human.
>

The point you are missing is that the following item, which presumably
could be printed by Sage,
is perfectly readable to a human:

6121026514868073 * 2^(-51).

It exactly dictates the bits in an IEEE double-float, and does not
require any conversion from binary
to decimal. It does not need rounding. This kind of representation
does not have any hidden unprinted digits. It does not ever need to
be longer because of delicate edge conditions of certain numbers.

It happens to evaluate to
APPROXIMATELY 2.718281828459045

a more accurate rendition of that same number, again in decimal is
2.7182818284590450907955982984....


although a more accurate rendition of exp(1) looks like this.
2.7182818284590452353602874713....

If you want to compare the results of 2 numerical computations for
exact identical bits, then I suggest you
look at the bits.

The fact that two different systems get slightly different answers is
not necessarily an indication of an error.

RJF

Dr. David Kirkby

unread,
Dec 31, 2009, 11:22:23 PM12/31/09
to sage-...@googlegroups.com
rjf wrote:
>
> On Dec 31, 11:15 am, "Dr. David Kirkby" <david.kir...@onetel.net>
> wrote:
>
>>> RJF
>> The point you are missing is that we want to compare the output what Sage prints
>> to a human.
>>
>
> The point you are missing is that the following item, which presumably
> could be printed by Sage,
> is perfectly readable to a human:
>
> 6121026514868073 * 2^(-51).
>
> It exactly dictates the bits in an IEEE double-float, and does not
> require any conversion from binary
> to decimal. It does not need rounding. This kind of representation
> does not have any hidden unprinted digits. It does not ever need to
> be longer because of delicate edge conditions of certain numbers.
>
> It happens to evaluate to
> APPROXIMATELY 2.718281828459045

Sure, Sage could print that. It would also be worth printing the sign bit, so we
could verify the values of

1) Sign bit
2) Significand
3) Exponent.

All of those could be correct. But there is still the software which does the
non-trivial task of converting that into the base 10 representation used by
humans. Then in additon to that, there is the software which takes a base 10
number, shows it with the Sage prompt, adding carriage returns etc where
necessary. All of these can go wrong.

I would think in an almost ideal world, the test would be done at a higher
level, using hardware/software which checked what the monitor actually
displayed. That's not quite as easy to do though.

Even better would be some way to scan the brain of the user to see what he/she
believes Sage is showing. Perhaps we use a font that is not very good, so
despite being displayed properly, it misunderstood.

Given most of time people want to see a base 10 representation of a number, and
not a base 2, base 16 or IEE 754 representation, I believe most testing should
be done at the base 10 level.

If there is a reason for testing the IEEE 754 representation as first choice,
then you have yet to convince me of it.


Dave


Tim Daly

unread,
Jan 1, 2010, 12:13:41 AM1/1/10
to sage-...@googlegroups.com, daly, axiom-developer@nongnu.org >> Axiom-Developer
Dave,

Axiom has the same issues.

My take on this is that what you check depends on the reason you are
checking.
If you are generating the output for human use (e.g. a table) then you
want decimal.
If you are generating the output for regression testing (e.g. checking
the answers on
multiple hardware) then you probably want Fateman's solution.

Tim


William Stein

unread,
Jan 1, 2010, 12:16:43 AM1/1/10
to sage-devel, daly, axiom-developer@nongnu.org >> Axiom-Developer

The output is used both for human use and for regression testing. Its
primary use is human -- it's an example in the Sage reference manual:

sage: float(e)
2.7182818284590451

This is something a user will look at when reading the documentation
for some function. It illustrates what happens when they convert the
symbolic constant e to float.

William

Tim Daly

unread,
Jan 1, 2010, 1:09:06 AM1/1/10
to sage-...@googlegroups.com, axiom-developer@nongnu.org >> Axiom-Developer, daly@axiom-developer.org >> daly
And therein lies the problem. We use a regression that does a comparison
of the
printed representation of the output of the run with a stored copy of
the output.

All of our regression tests were passing until I installed another,
unrelated program.
Suddenly about 30 regression tests started failing. It turns out that
the unrelated
program upgraded one of the system libraries. The net effect of that
change was
to cause the last digit in the output to "wobble" so that some of the
table values
differ in the nth place (20th, 30th, or thereabouts digit). This caused
the regression
comparisons to fail.

Common lisp will give you the exact bit pattern of the float and this value
does not wobble so the text comparison succeeds with both the old and
the new libraries against the bit pattern.

So I can tell you from experience that what you would like to do is not
going to succeed.

Our solution to the human vs regression problem is to include the stable
bit values in the actual compare and keep the human values in a latex
table. This is easy to do with literate input.

Tim

rjf

unread,
Jan 1, 2010, 1:10:25 AM1/1/10
to sage-devel
Assuring the correctness of the binary-to-decimal conversion of the
sage output routines can and should be done separately from testing
the exp() function.
That should be fairly obvious.

I am amazed by your insistence that this kind of decimal output should
be used in regression testing.
Certainly you can't be saying that your idea of regression testing is
for a human to compare two printed outputs, or even two outputs on a
screen!

So why not have the computer compare the ascii strings of digits on
output? Here's why:

In 2.718 .... 0451, the last "1" adds no information for a double-
float, and so a careful printing program that prints
the minimum number of digits necessary to reconstitute the number
could omit it.

Would that make one of the two equivalent "numbers" erroneous?

RJF


Peter Jeremy

unread,
Jan 1, 2010, 1:23:10 AM1/1/10
to sage-...@googlegroups.com
On 2009-Dec-31 17:50:05 -0200, Gonzalo Tornaria <torn...@math.utexas.edu> wrote:
>gcc is actually inlining exp(1.0) to its correct value. The exp() from
>the sun library is incorrect. Try this program instead:

FreeBSD libm is derived from Sun's libm and also gets exp(1) 1ULP high
on amd64 (on i386, exp() is written using i387 assembler) so I tend to
agree that the bug is in Sun's libm.

>#include <math.h>
>#include <stdio.h>
>#include <stdlib.h>
>
>int main(int argc, char **argv) {
> double x = 1.0;
> if (argc>1)
> x = atof(argv[1]);
> printf("%.16lf\n",exp(x));

^^ this needs to be at least 18 to see the problem.
>}

--
Peter Jeremy

William Stein

unread,
Jan 1, 2010, 2:00:11 AM1/1/10
to sage-devel, axiom-developer@nongnu.org >> Axiom-Developer, daly@axiom-developer.org >> daly

What I would like to do does succeed. The goal of the doctest suite
in Sage is:

(1) to provide examples for users that illustrate every function in Sage,

(2) assure that these examples run as claimed, in the sense that a
given input pasted into the Sage command line, results in the claimed
output (with certain well-defined assumptions about state).

You can view (1) as documentation for users, and view (2) as one form
of "regression testing". There are many types of regression tests.

Regarding your example, I would consider it a bug if the specific
examples of Sage's floating point output -- as claimed in the examples
in the official documentation -- are wrong in the presence of certain
standard system-wide shared libraries. That is to say, if the Sage
documentation says:

sage: foo(bar)
1.2345

but in fact in Sage one has

sage: foo(bar)
1.2351

then I would consider this misleading documentation, i.e., a bug.
Your statement above suggests that you *don't* consider this a bug at
all. In Sage, one possible solution (and it depends on context
whether this it the right solution or not) would be to change the
documentation to

sage: foo(bar) # last few digits numerically unstable
1.23...

The "..." in Python's doctest framework is a wildcard. The Sage
code hasn't been changed, but the documentation has been. The point
is that a read of the documentation for the function foo will see
explicitly -- right there in the documentation -- that there are
numerical noise issues on certain platforms. We have followed
exactly this approach in all of the Sage example docstests, which now
total over 110,000 lines of input, testing over 19,000 functions (or
over 80% of the functions in Sage).

-- William

>
> Our solution to the human vs regression problem is to include the stable
> bit values in the actual compare and keep the human values in a latex
> table. This is easy to do with literate input.
>
> Tim
>
>
>
>
>
>
>

William Stein

unread,
Jan 1, 2010, 2:00:27 AM1/1/10
to sage-devel, axiom-developer@nongnu.org >> Axiom-Developer, daly@axiom-developer.org >> daly

What I would like to do does succeed. The goal of the doctest suite
in Sage is:

sage: foo(bar)
1.2345

sage: foo(bar)
1.2351

80.9% of the functions in Sage).

-- William

>


> Our solution to the human vs regression problem is to include the stable
> bit values in the actual compare and keep the human values in a latex
> table. This is easy to do with literate input.
>
> Tim
>
>
>
>
>
>
>

Dr. David Kirkby

unread,
Jan 1, 2010, 4:56:16 AM1/1/10
to sage-...@googlegroups.com

Hi,

Thank you. Yes, your version forces a call to the library, which does give the
less accurate result (2.7182818284590455). The data below is on my own SPARC,
not 't2', but results should be the same.

drkirkby@kestrel:~$ gcc -lm exp.c
drkirkby@kestrel:~$ ./a.out
2.7182818284590451
drkirkby@kestrel:~$ gcc -lm exp2.c
drkirkby@kestrel:~$ ./a.out
2.7182818284590455
Da
I believe you have got to the bottom of this now.

I also noted that if I try to compile and link my version, without linking in
the maths library, so it compiles and links fine. But if I try your version
(which I called exp2.c), the link fails.

bash-3.2$ gcc exp.c
bash-3.2$ gcc exp2.c


Undefined first referenced
symbol in file

exp /var/tmp//cc2qaGNG.o
ld: fatal: symbol referencing errors. No output written to a.out
collect2: ld returned 1 exit status

Dave

Dr. David Kirkby

unread,
Jan 1, 2010, 6:56:02 AM1/1/10
to sage-...@googlegroups.com
Peter Jeremy wrote:
> On 2009-Dec-31 17:50:05 -0200, Gonzalo Tornaria <torn...@math.utexas.edu> wrote:
>> gcc is actually inlining exp(1.0) to its correct value. The exp() from
>> the sun library is incorrect. Try this program instead:
>
> FreeBSD libm is derived from Sun's libm and also gets exp(1) 1ULP high
> on amd64 (on i386, exp() is written using i387 assembler) so I tend to
> agree that the bug is in Sun's libm.

But if FreeBSD's libm for the 387 chips is based on Sun's library for a SPARC
processor one needs to be cautious before making that assumption. If FreeBSD's
implementation was based on Sun's library for the x86 processor, then I'd say
FreeBSD got something wrong, since exp(1.0) is computed accurately with Open
Solaris running on an Xeon processor.

Virtually all computations on reals are going to introduce some errors. The fact
the 387 can do some of those calculations in 80 bits means there is a reason for
having a more accurate library than on the SPARC processor, which is only 64-bits.

I do not know whether the SPARC processor has an instruction which computes
exp(x) directly. Finding such a simple bit of information in manuals which are
100's of pages long, is no easy task. At least some of the manuals are here:

http://www.sun.com/processors/documentation.html

The oldest processor listed is the IIi, and the newest the T1, but one can find
documents with the newer

http://www.sun.com/processors/UltraSPARC-T2/perspectives.xml#anchor1

Somewhere here I have a paper copy of the 80387 processors manual, but I can't
seem to find it.

I think Richard's point about there being a tradeoff between speed and accuracy
is important to keep in mind.

Based on the 64/80 bit difference, I can see a justification for using a more
accurate implementation of exp(x) on a x86 chip, than I can on a SPARC chip. If
it going to take you long to compute something to an extra bit of accuracy, only
to lose that bit as soon as you do any non-trivial computation with it, then one
would have to question whether it was worth spending the extra time in the first
place. Because the x86 processor works internally at 80 bits, it should be able
to preserve all 64 bits when it writes to RAM.

I'm not saying this is a bug, I'm not saying it is not. But the more I think
about it, taking into account what I know of the 80 vs 64 issue, and what
Richard said about the extra time needed to compute a result more accurately,
I'm inclined to think this is not a bug.

Dave

TimDaly

unread,
Jan 1, 2010, 5:08:15 PM1/1/10
to sage-devel
>     That is to say, if the Sage
> documentation says:
>
>    sage: foo(bar)
>    1.2345
>
> but in fact in Sage one has
>
>    sage: foo(bar)
>    1.2351
>
> then I would consider this misleading documentation, i.e., a bug.
> Your statement above suggests that you *don't* consider this a bug at
> all.   In Sage, one possible solution (and it depends on context
> whether this it the right solution or not) would be to change the
> documentation to
>
>    sage: foo(bar)     # last few digits numerically unstable
>    1.23...
>

The idea of documentation is that the user can see what to expect
when typing a command. Your result 1.23... achieves this goal.

The idea of regression test is to detect changes in results
that are incorrect. If you do a computation on a system and the
result differs in the trailing bits (due to a real bug) then
your regression test f() = 1,23... will succeed.

So representing the result f() = 1.23....
will not detect a bad computation if the new library has a bug
and is, at best, a weak regression test.

All non-bit floating point output involves a choice of rounding.
You have chosen to do rounding by truncation. This choice is not
obvious nor necessarily the best. See
http://www.diycalculator.com/popup-m-round.shtml

In any case, you're welcome to test any way you prefer.
I just thought I'd give you insights from Axiom's tests.
I'll let it rest.

Tim

Dr. David Kirkby

unread,
Jan 1, 2010, 5:18:47 PM1/1/10
to sage-...@googlegroups.com
TimDaly wrote:

> All non-bit floating point output involves a choice of rounding.
> You have chosen to do rounding by truncation. This choice is not
> obvious nor necessarily the best. See
> http://www.diycalculator.com/popup-m-round.shtml
>
> In any case, you're welcome to test any way you prefer.
> I just thought I'd give you insights from Axiom's tests.
> I'll let it rest.
>
> Tim

I must admit, it had crossed my mind that truncation like this was not optimal.
Even without reading the reference, I would have thought something like testing ifL

abs(expected - got) < some_tolerable_absolute_error

may have been a better choice.

Truncating is obviously going to be a problem if the expected number is
1.000000000000000 but you get 0.9999999999999999 Those two numbers are fairly
close in value, yet their digits are completely different.

Perhaps I'm missing something here.

Dave

William Stein

unread,
Jan 1, 2010, 5:38:47 PM1/1/10
to sage-devel

The test we're talking about right now are "doctests". There primary
purpose is to serve as *documentation*... that is actually correct.
As such, changing lines such as

sage: foo(bar) # last few digits numerically unstable
1.23...

to

sage: abs(foo(bar) - 1.23) < 0.00001
True

would be a step backwards. It would be a step backwards, because it
is harder to read and expresses less clearly what the documentation is
supposed to illustrate (which is how to use the function foo).

Test like your suggestion above would make perfect sense as unittests
or in the context of randomized testing. Those are great, there are
some in Sage, but they serve a different purpose than doctests.

William

Dr. David Kirkby

unread,
Jan 1, 2010, 10:49:39 PM1/1/10
to sage-...@googlegroups.com
William Stein wrote:
>
> Test like your suggestion above would make perfect sense as unittests
> or in the context of randomized testing. Those are great, there are
> some in Sage, but they serve a different purpose than doctests.
>
> William
>

As a matter of interest, I wrote a few lines of Mathematica, to write to a
binary file the value of the constant E and the machine precision number
Exp[1.0] on a SPARC with Mathematica 7. I then used the unix tool 'od' to dump
the hexadecimal values. Sure enough, Mathematica too shows a one bit difference
between E and Exp[1.0].


Mathematica 7.0 for Sun Solaris SPARC (64-bit)
Copyright 1988-2009 Wolfram Research, Inc.

In[1]:= BinaryWrite["exp1.dat",Exp[1.0],"Real64"]

Out[1]= exp1.dat

In[2]:= !od -t x2 exp1.dat
0000000 4005 bf0a 8b14 576a
0000010

In[2]:= BinaryWrite["E.dat",E,"Real64"]

Out[2]= E.dat

In[3]:= !od -t x2 E.dat
0000000 4005 bf0a 8b14 5769
0000010


Note how the last byte has changed from 69 to 6a.


There's probably a much better way of showing the data in Mathematica, but using
'od' was the easiest solution I could think of.

I also tried the same with the Solaris x86 version of Mathematica. No such
change occurs then. Apart from the order of the bytes are swapped (SPARC is big
endian, x86 is little endian), the Solaris x86 date for Exp[1.0] is the same as
for the constant E.


Dave

Peter Jeremy

unread,
Jan 2, 2010, 8:13:43 AM1/2/10
to sage-...@googlegroups.com
On 2010-Jan-01 11:56:02 +0000, "Dr. David Kirkby" <david....@onetel.net> wrote:
>But if FreeBSD's libm for the 387 chips is based on Sun's library for a SPARC
>processor one needs to be cautious before making that assumption. If FreeBSD's
>implementation was based on Sun's library for the x86 processor, then I'd say
>FreeBSD got something wrong, since exp(1.0) is computed accurately with Open
>Solaris running on an Xeon processor.

Sun's libm probably started on M68K. There are two distinct sets of
code: The i386 code uses i387 instructions to do:
x1 = x * log2(e)
exp(x) = 2^int(x1) * 2^fract(x1)
in 80-bit precision. The amd64 and SPARC code uses a semi-numerical
expansion in C, assuming a 64-bit FPU.

>Virtually all computations on reals are going to introduce some
>errors. The fact the 387 can do some of those calculations in 80 bits
>means there is a reason for having a more accurate library than on
>the SPARC processor, which is only 64-bits.

AFAIK, the original justification for including the 80-bit FP format
was to reduce the need for multi-precision or semi-numerical code to
provide accurate double-precision results in floating point libraries.

>I do not know whether the SPARC processor has an instruction which computes
>exp(x) directly.

It doesn't - the SPARC only provide the basic IEEE operations.

> Finding such a simple bit of information in manuals which are
>100's of pages long, is no easy task.

You want a SPARC Assembler Language Manual - googling turned up
suitable extracts.

>Based on the 64/80 bit difference, I can see a justification for
>using a more accurate implementation of exp(x) on a x86 chip, than I
>can on a SPARC chip.

The core of i386 exp() is 11 x87 instructions (and another 22
instructions to handle special cases and ensure the correct FPU mode).
The FreeBSD C version is 56 lines of C and 17 magic constants and
claims less than 1ULP error. The code in glibc-2.9 comprises 100
lines of C with a fallback to multi-precision code.

> If it going to take you long to compute something to an extra bit of
>accuracy, only to lose that bit as soon as you do any non-trivial
>computation with it, then one would have to question whether it was
>worth spending the extra time in the first place.

Calculating perfectly rounded results for transcendental functions is
between very hard and impractical without relying on (very slow)
multi-precision arithmetic. Most implementations allow a few ULP
error. Whether those few bits are critical is a numerical analysis job.

> Because the x86 processor works internally at 80 bits, it
>should be able to preserve all 64 bits when it writes to RAM.

There are some gotchas here. On an i387 in default (80-bit) mode, a
local double variable may be 80-bits if kept in a register but 64-bits
if written to memory. This means the precision of working variables
in a function can alternate between 64 and 80 bits depending on the
code, optimisation level and whim of the compiler. More precision is
not always good. This extra precision (and particularly the
arbitrariness of its existence) can wreak havoc in a function that
expects 'double' to be 64-bits.

>result more accurately, I'm inclined to think this is not a bug.

I tend to agree.

--
Peter Jeremy

Jason Grout

unread,
Jan 2, 2010, 4:41:29 PM1/2/10
to sage-...@googlegroups.com
Peter Jeremy wrote:

> There are some gotchas here. On an i387 in default (80-bit) mode, a
> local double variable may be 80-bits if kept in a register but 64-bits
> if written to memory. This means the precision of working variables
> in a function can alternate between 64 and 80 bits depending on the
> code, optimisation level and whim of the compiler. More precision is
> not always good. This extra precision (and particularly the
> arbitrariness of its existence) can wreak havoc in a function that
> expects 'double' to be 64-bits.

(If I recall the story correctly...) Steve Leon (a linear algebraist)
told of a piece of matlab code that was giving a certain result.
However, when a print statement was inserted in the code to print out an
intermediate result, the function would give a different answer (much
different, if I recall correctly). Apparently it was really
frustrating. They finally concluded that the print statement caused a
certain variable to be 64-bit instead of 80-bit, and that made all the
difference.

Thanks,

Jason

Robert Bradshaw

unread,
Jan 3, 2010, 1:45:30 AM1/3/10
to sage-...@googlegroups.com
On Jan 1, 2010, at 2:08 PM, TimDaly wrote:

>> That is to say, if the Sage
>> documentation says:
>>
>> sage: foo(bar)
>> 1.2345
>>
>> but in fact in Sage one has
>>
>> sage: foo(bar)
>> 1.2351
>>
>> then I would consider this misleading documentation, i.e., a bug.
>> Your statement above suggests that you *don't* consider this a bug at
>> all. In Sage, one possible solution (and it depends on context
>> whether this it the right solution or not) would be to change the
>> documentation to
>>
>> sage: foo(bar) # last few digits numerically unstable
>> 1.23...
>>
>
> The idea of documentation is that the user can see what to expect
> when typing a command. Your result 1.23... achieves this goal.

Yep. It also tells the user that there may be some numerical noise.

> The idea of regression test is to detect changes in results
> that are incorrect. If you do a computation on a system and the
> result differs in the trailing bits (due to a real bug) then
> your regression test f() = 1,23... will succeed.
>
> So representing the result f() = 1.23....
> will not detect a bad computation if the new library has a bug
> and is, at best, a weak regression test.

By adding the ... for the trailing digits, one is explicitly saying
that a change in the last couple of digits should not be considered
incorrect (e.g. it varies in inconsequential ways depending on the
platform), thus reducing the false positive rate for your regression
tests. In my mind this makes for a better regression test.

> All non-bit floating point output involves a choice of rounding.
> You have chosen to do rounding by truncation. This choice is not
> obvious nor necessarily the best.

It is both intuitive to read and standard use in the Python doctesting
framework, which makes it a nice fit for us.

> See http://www.diycalculator.com/popup-m-round.shtml

Given that most of the people on this thread don't actually use Sage,
I'd like to point out that both of these failing tests have to do with
the handling of Python floating point numbers, which are basically C
doubles and inherently have all the same advantages and disadvantages.
By default, when one works with floating point numbers in Sage MPFR
values are used, which (modulo MPFR bugs) are bit-for-bit identical on
all supported platforms and as accurate as possible (with your choice
of rounding mode) to the last bit. Of course this has a speed penalty,
which is why we try to make sure things work smoothly with python
floats as well.

- Robert

Reply all
Reply to author
Forward
0 new messages