
Python -- floating point arithmetic


david mainzer

Jul 7, 2010, 8:05:52 AM
to pytho...@python.org

Dear Python-User,


Today I created some slides about floating point arithmetic. I used an
example from

http://docs.python.org/tutorial/floatingpoint.html

so I started the Python shell on my Linux machine:

dm@maxwell $ python
Python 2.6.5 (release26-maint, May 25 2010, 12:37:06)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> sum = 0.0
>>> for i in range(10):
...     sum += 0.1
...
>>> sum
0.99999999999999989
>>>
But that looks a little bit wrong to me ... it should be a number greater
than 1.0, because 0.1 = 0.100000000000000005551115123125782702118158340454101562500000000000
in Python ... if I print it.

So I created an example program:

sum = 0.0
n = 10
d = 1.0 / n
print "%.60f" % ( d )
for i in range(n):
    print "%.60f" % ( sum )
    sum += d

print sum
print "%.60f" % ( sum )


-------- RESULTs ------
0.100000000000000005551115123125782702118158340454101562500000
0.000000000000000000000000000000000000000000000000000000000000
0.100000000000000005551115123125782702118158340454101562500000
0.200000000000000011102230246251565404236316680908203125000000
0.300000000000000044408920985006261616945266723632812500000000
0.400000000000000022204460492503130808472633361816406250000000
0.500000000000000000000000000000000000000000000000000000000000
0.599999999999999977795539507496869191527366638183593750000000
0.699999999999999955591079014993738383054733276367187500000000
0.799999999999999933386618522490607574582099914550781250000000
0.899999999999999911182158029987476766109466552734375000000000
1.0
0.999999999999999888977697537484345957636833190917968750000000

and the jump from 0.50000000000000*** to 0.59999999* looks wrong
to me ... did I make a mistake, or is there something wrong in the
representation of floating point numbers in Python?

My next question: why can I run

print "%.66f" % ( sum )

but not

print "%.67f" % ( sum )

Can anybody tell me how Python internally represents a float number?


Best and many thanks in advance,
David

Mark Dickinson

Jul 7, 2010, 8:55:55 AM
to
On Jul 7, 1:05 pm, david mainzer <d...@tu-clausthal.de> wrote:
> Dear Python-User,
>
> today i create some slides about floating point arithmetic. I used an
> example from
>
> http://docs.python.org/tutorial/floatingpoint.html
>
> so i start the python shell on my linux machine:
>
> dm@maxwell $ python
> Python 2.6.5 (release26-maint, May 25 2010, 12:37:06)
> [GCC 4.3.4] on linux2
> Type "help", "copyright", "credits" or "license" for more information.>>> >>> sum = 0.0
> >>> >>> for i in range(10):
>
> ...     sum += 0.1
> ...>>> >>> sum
> 0.99999999999999989
>
> But thats looks a little bit wrong for me ... i must be a number greater
> then 1.0 because 0.1 = 0.100000000000000005551115123125782702118158340454101562500000000000
> in python ... if i print it.

So you've identified one source of error here, namely that 0.1 isn't
exactly representable (and you're correct that the value stored
internally is actually a little greater than 0.1). But you're
forgetting about the other source of error in your example: when you
do 'sum += 0.1', the result typically isn't exactly representable, so
there's another rounding step going on. That rounding step might
produce a number that's smaller than the actual exact sum, and if
enough of your 'sum += 0.1' results are rounded down instead of up,
that would easily explain why the total is still less than 1.0.

Look at this more closely: you're adding

0.500000000000000000000000....

to

0.1000000000000000055511151231257827021181583404541015625

The *exact* result is, of course

0.6000000000000000055511151231257827021181583404541015625

However, that's not a number that can be exactly represented as a C
double (which is how Python stores floats internally). This value
falls between the two (consecutive) representable values:

0.59999999999999997779553950749686919152736663818359375

and

0.600000000000000088817841970012523233890533447265625

But of these two, the first is closer to the exact value than the
second, so that's the result that you get.

You can convince yourself of these results by using the fractions
module to do exact arithmetic:

>>> from fractions import Fraction
>>> tenth = Fraction.from_float(0.1)
>>> half = Fraction.from_float(0.5)
>>> point_six = Fraction.from_float(0.6) # 0.599999999999
>>> point_six_plus = Fraction.from_float(0.6 + 2**-53) # next float up, 0.6000000
>>> sum = tenth + half # exact value of the sum
>>> point_six < sum < point_six_plus # lies between point_six and point_six_plus
True
>>> sum - point_six < point_six_plus - sum # but it's closer to point_six
True


> my next question, why could i run
>
> print "%.66f" % ( sum )
>
> but not
>
> print "%.67f" % ( sum )

That's a historical artefact resulting from use of a fixed-length
buffer somewhere deep in Python's internals. This restriction is
removed in Python 2.7 and Python 3.x.

> can anybody tell me how python internal represent a float number??

In CPython, it's stored as a C double, which typically means in IEEE
754 binary64 format. (Of course, since it's a Python object, it's not
just storing the C double itself; it also has fields for the object
type and the reference count, so a Python float typically takes 16
bytes of memory on a 32-bit machine, and 24 bytes on a 64-bit
machine.)
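
For readers who want to poke at this themselves, here is a small sketch (not part of the original post; sys.getsizeof needs Python 2.6+, and the exact byte counts depend on the build) showing both the object size and the raw IEEE 754 bits of the underlying C double:

import struct
import sys

x = 0.1
# Size of the whole Python float object: the C double plus the object header.
print sys.getsizeof(x)                    # typically 16 on 32-bit builds, 24 on 64-bit
# The raw IEEE 754 binary64 bits of the C double itself (8 bytes).
print struct.pack('>d', x).encode('hex')  # '3fb999999999999a'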

--
Mark

Thomas Jollans

Jul 7, 2010, 9:08:07 AM
to pytho...@python.org
On 07/07/2010 02:05 PM, david mainzer wrote:
> today i create some slides about floating point arithmetic. I used an
> example from
>
> http://docs.python.org/tutorial/floatingpoint.html
>
> so i start the python shell on my linux machine:
>
> dm@maxwell $ python
> Python 2.6.5 (release26-maint, May 25 2010, 12:37:06)
> [GCC 4.3.4] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> sum = 0.0
> >>> for i in range(10):
> ...     sum += 0.1
> ...
> >>> sum
> 0.99999999999999989
> >>>
> But thats looks a little bit wrong for me ... i must be a number greater
> then 1.0 because 0.1 = 0.100000000000000005551115123125782702118158340454101562500000000000
> in python ... if i print it.

The simple fact of the matter is: floating point arithmetic isn't
accurate. This has nothing to do with Python, it's the fault of your
processor's floating point handling. It's good enough in most cases, but
you should never rely on a floating-point number to have exactly a
certain value. It won't work. To generalize your example a bit:

>>> def test_floating_product(a, b):
...     sum = 0
...     for _ in range(int(b)):
...         sum += a
...     return sum, a * int(b), sum == a * b
...
>>> test_floating_product(0.1, 1)
(0.1, 0.1, True)
>>> test_floating_product(0.1, 10)
(0.9999999999999999, 1.0, False)
>>> test_floating_product(0.2, 4)
(0.8, 0.8, True)
>>> test_floating_product(0.2, 5)
(1.0, 1.0, True)
>>> test_floating_product(0.2, 6)
(1.2, 1.2000000000000002, False)
>>>


> -------- RESULTs ------
> 0.100000000000000005551115123125782702118158340454101562500000
> 0.000000000000000000000000000000000000000000000000000000000000
> 0.100000000000000005551115123125782702118158340454101562500000
> 0.200000000000000011102230246251565404236316680908203125000000
> 0.300000000000000044408920985006261616945266723632812500000000
> 0.400000000000000022204460492503130808472633361816406250000000
> 0.500000000000000000000000000000000000000000000000000000000000
> 0.599999999999999977795539507496869191527366638183593750000000
> 0.699999999999999955591079014993738383054733276367187500000000
> 0.799999999999999933386618522490607574582099914550781250000000
> 0.899999999999999911182158029987476766109466552734375000000000
> 1.0
> 0.999999999999999888977697537484345957636833190917968750000000
>
> and the jump from 0.50000000000000*** to 0.59999999* looks wrong
> for me ... do i a mistake or is there something wrong in the
> representation of the floating points in python?

the difference is almost exactly 0.1, so that looks okay

>
> my next question, why could i run
>
> print "%.66f" % ( sum )
>
> but not
>
> print "%.67f" % ( sum )

I can run either... with Python 3.1. Using 2.6, I get a nice error message:

>>> "%.129f" % 0.1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: formatted float is too long (precision too large?)

There just isn't anything like 67 decimals of information available.
Having that information wouldn't help you a bit.

Basically, floating point numbers are stored in the format

N * (2 ** E)

with only a fixed number of bits for N, so some approximation is involved.
As E gets larger, the precision decreases. Rounding errors occur at the
last few decimals, in either direction, depending on the numbers.
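
A quick way to see N and E for a given value is math.frexp; a small sketch (not from the original message):

import math

for x in (0.1, 0.5, 0.6):
    m, e = math.frexp(x)    # x == m * 2**e, with 0.5 <= abs(m) < 1
    print x, m, e, m * 2**e == x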

Christian Heimes

Jul 7, 2010, 9:24:35 AM
to pytho...@python.org
> can anybody tell me how python internal represent a float number??

It's an IEEE 754 double precision float on all hardware platforms that
support IEEE 754 semantics. Python follows the C99 standards for double
and complex numbers.
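
The parameters of that double format can be checked at runtime; a quick sketch (not from the original message; sys.float_info needs Python 2.6+):

import sys

print sys.float_info.mant_dig   # 53 significand bits for IEEE 754 binary64
print sys.float_info.max_exp    # 1024
print sys.float_info.epsilon    # about 2.22e-16: the gap between 1.0 and the next float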

Christian

Philip Semanchuk

Jul 7, 2010, 10:05:22 AM
to python-list (General)

Yes, this might be a good time to review the dense but interesting
document, "What Every Computer Scientist Should Know About Floating-
Point Arithmetic":
http://docs.sun.com/source/806-3568/ncg_goldberg.html


Cheers
Philip

bart.c

Jul 7, 2010, 11:00:25 AM
to
david mainzer wrote:

I think the main problem is that, as sum gets bigger, the less significant
bits of the 0.1 representation fall off the end (enough that you're
effectively adding just under 0.1 instead of just over).

> can anybody tell me how python internal represent a float number??

Try "google ieee floating point". The problems aren't specific to Python.

--
Bartc

Nobody

Jul 7, 2010, 12:29:55 PM
to
On Wed, 07 Jul 2010 15:08:07 +0200, Thomas Jollans wrote:

> you should never rely on a floating-point number to have exactly a
> certain value.

"Never" is an overstatement. There are situations where you can rely
upon a floating-point number having exactly a certain value.

First, floating-point values are exact. They may be an approximation
to some other value, but they are still exact values, not some kind of
indeterminate quantum state. Specifically, a floating-point value is a
rational number whose denominator is a power of two.

Second, if the result of performing a primitive arithmetic operation
(addition, subtraction, multiplication, division, remainder) or the
square-root function on the equivalent rational values is exactly
representable as a floating-point number, then the result will be exactly
that value.
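
Both points are easy to check from Python; a small illustration (not part of the original post; float.as_integer_ratio needs Python 2.6+):

from fractions import Fraction

# Point one: a finite float is exactly a rational whose denominator is a power of two.
print (0.1).as_integer_ratio()    # (3602879701896397, 36028797018963968), i.e. n / 2**55
# Point two: when the exact result of an operation is representable, you get it exactly.
print 0.25 + 0.5 == 0.75                                  # True: all three are exact binary fractions
print Fraction.from_float(0.25 + 0.5) == Fraction(3, 4)   # True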

Third, if the result of performing a primitive arithmetic operation or the
square-root function on the equivalent rational values *isn't* exactly
representable as a floating-point number, then the floating-point result
will be obtained by rounding the exact value according to the FPU's
current rounding mode.

All of this is entirely deterministic, and follows relatively simple
rules. Even if the CPU has a built-in random number generator, it will
*not* be used to generate the least-significant bits of a floating-point
arithmetic result.

The second and third cases above assume that floating-point arithmetic
follows IEEE-754 (the second case is likely to be true even for systems
which don't strictly adhere to IEEE-754). This is true for most modern
architectures, provided that:

1. You aren't using Borland C, which forcibly "optimises" x/y to x*(1/y),
so 12/3 isn't exactly equal to 4, as 1/3 isn't exactly representable. Real
compilers won't use this sort of approximation unless specifically
instructed (e.g. -funsafe-math-optimizations for gcc).

2. You aren't using one of the early Pentium chips.

In spite of this, there are some "gotcha"s. E.g. on x86, results are
computed to 80-bit (long double) precision internally. These will be
truncated to 64 bits if stored in memory. Whether the truncation occurs is
largely up to the compiler, although it can be forced with -ffloat-store
with gcc.

More complex functions (trigonometric, etc) are only accurate to within a
given relative error (e.g. +/- the value of the least significant bit), as
it isn't always possible to determine the correct value for the least
significant bit for a given rounding mode (and even if it is theoretically
possible, there is no limit to the number of bits of precision which would
be required).


Ethan Furman

Jul 7, 2010, 1:32:50 PM
to pytho...@python.org
Nobody wrote:
> On Wed, 07 Jul 2010 15:08:07 +0200, Thomas Jollans wrote:
>
>> you should never rely on a floating-point number to have exactly a
>> certain value.
>
> "Never" is an overstatement. There are situations where you can rely
> upon a floating-point number having exactly a certain value.

It's not much of an overstatement. How many areas are there where you
need the number
0.100000000000000005551115123125782702118158340454101562500000000000?

If I'm looking for 0.1, I will *never* (except by accident ;) say

if var == 0.1:

it'll either be <= or >=.

By contrast, if I'm dealing with integers I can say if var == 4 because
I *know* that there are values that var can hold that are *exactly* 4.
Not 3.999999999817263 or 4.0000000019726.
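
The usual alternative to == is a tolerance test; a minimal sketch (not from the original post, and the tolerances are arbitrary choices that have to fit the problem):

def nearly_equal(a, b, rel_tol=1e-9, abs_tol=0.0):
    # True if a and b agree to within a relative or absolute tolerance.
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

total = 0.0
for i in range(10):
    total += 0.1
print total == 1.0              # False
print nearly_equal(total, 1.0)  # True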

~Ethan~

Raymond Hettinger

Jul 7, 2010, 2:08:18 PM
to
On Jul 7, 5:55 am, Mark Dickinson <dicki...@gmail.com> wrote:
> On Jul 7, 1:05 pm, david mainzer <d...@tu-clausthal.de> wrote:
>
>
>
> > Dear Python-User,
>
> > today i create some slides about floating point arithmetic. I used an
> > example from
>
> >http://docs.python.org/tutorial/floatingpoint.html
>
> > so i start the python shell on my linux machine:
>
> > dm@maxwell $ python
> > Python 2.6.5 (release26-maint, May 25 2010, 12:37:06)
> > [GCC 4.3.4] on linux2
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> sum = 0.0
> > >>> for i in range(10):
> > ...     sum += 0.1
> > ...
> > >>> sum
> > 0.99999999999999989
>
> > But thats looks a little bit wrong for me ... i must be a number greater
> > then 1.0 because 0.1 = 0.100000000000000005551115123125782702118158340454101562500000000000
> > in python ... if i print it.

[Mark Dickinson]


> So you've identified one source of error here, namely that 0.1 isn't
> exactly representable (and you're correct that the value stored
> internally is actually a little greater than 0.1).  But you're
> forgetting about the other source of error in your example: when you
> do 'sum += 0.1', the result typically isn't exactly representable, so
> there's another rounding step going on.  That rounding step might
> produce a number that's smaller than the actual exact sum, and if
> enough of your 'sum += 0.1' results are rounded down instead of up,
> that would easily explain why the total is still less than 1.0.

One key for understanding floating point mysteries is to look at the
actual binary sums rather than their approximate representation as a
decimal string. The hex() method can make it easier to visualize
Mark's explanation:

>>> s = 0.0
>>> for i in range(10):
...     s += 0.1
...     print s.hex(), repr(s)


0x1.999999999999ap-4 0.10000000000000001
0x1.999999999999ap-3 0.20000000000000001
0x1.3333333333334p-2 0.30000000000000004
0x1.999999999999ap-2 0.40000000000000002
0x1.0000000000000p-1 0.5
0x1.3333333333333p-1 0.59999999999999998
0x1.6666666666666p-1 0.69999999999999996
0x1.9999999999999p-1 0.79999999999999993
0x1.cccccccccccccp-1 0.89999999999999991
0x1.fffffffffffffp-1 0.99999999999999989

Having used hex() to understand representation error (how the binary
partial sums are displayed), you can use the fractions module to gain
a better understanding of rounding error introduced by each addition:

>>> from fractions import Fraction
>>> s = 0.0
>>> for i in range(10):
...     exact = Fraction.from_float(s) + Fraction.from_float(0.1)
...     s += 0.1
...     actual = Fraction.from_float(s)
...     error = actual - exact
...     print '%-35s%-35s\t%s' % (actual, exact, error)


3602879701896397/36028797018963968   3602879701896397/36028797018963968    0
3602879701896397/18014398509481984   3602879701896397/18014398509481984    0
1351079888211149/4503599627370496    10808639105689191/36028797018963968   1/36028797018963968
3602879701896397/9007199254740992    14411518807585589/36028797018963968   -1/36028797018963968
1/2                                  18014398509481985/36028797018963968   -1/36028797018963968
5404319552844595/9007199254740992    21617278211378381/36028797018963968   -1/36028797018963968
3152519739159347/4503599627370496    25220157913274777/36028797018963968   -1/36028797018963968
7205759403792793/9007199254740992    28823037615171173/36028797018963968   -1/36028797018963968
2026619832316723/2251799813685248    32425917317067569/36028797018963968   -1/36028797018963968
9007199254740991/9007199254740992    36028797018963965/36028797018963968   -1/36028797018963968

Hope this helps your slides,


Raymond

Wolfram Hinderer

Jul 7, 2010, 7:13:25 PM
to
On 7 Jul., 19:32, Ethan Furman <et...@stoneleaf.us> wrote:
> Nobody wrote:
> > On Wed, 07 Jul 2010 15:08:07 +0200, Thomas Jollans wrote:
>
> >> you should never rely on a floating-point number to have exactly a
> >> certain value.
>
> > "Never" is an overstatement. There are situations where you can rely
> > upon a floating-point number having exactly a certain value.
>
> It's not much of an overstatement.  How many areas are there where you
> need the number
> 0.100000000000000005551115123125782702118158340454101562500000000000?
>
> If I'm looking for 0.1, I will *never* (except by accident ;) say
>
> if var == 0.1:
>
> it'll either be <= or >=.

The following is an implementation of a well-known algorithm.
Why would you want to replace the floating point comparisons? With
what?

(This is toy-code.)

#####
from random import random

def min_cost_path(cost_right, cost_up):
    """ return minimal cost and its path in a rectangle - going up and
    right only """
    cost = dict()
    size_x, size_y = max(cost_right)

    #compute minimal cost
    cost[0, 0] = 0.0
    for x in range(size_x):
        cost[x + 1, 0] = cost[x, 0] + cost_right[x, 0]
    for y in range(size_y):
        cost[0, y + 1] = cost[0, y] + cost_up[0, y]
        for x in range(size_x):
            cost[x + 1, y + 1] = min(cost[x, y + 1] + cost_right[x, y + 1],
                                     cost[x + 1, y] + cost_up[x + 1, y])

    #compute path (reversed)
    x = size_x
    y = size_y
    path = []
    while x != 0 and y != 0:
        if x == 0:
            y -= 1
            path.append("u")
        elif y == 0:
            x -= 1
            path.append("r")
        elif cost[x - 1, y] + cost_right[x - 1, y] == cost[x, y]:  # fp compare
            x -= 1
            path.append("r")
        elif cost[x, y - 1] + cost_up[x, y - 1] == cost[x, y]:  # fp compare
            y -= 1
            path.append("u")
        else:
            raise ValueError

    return cost[size_x, size_y], "".join(reversed(path))


if __name__ == "__main__":
    size = 100
    cost_right = dict(((x, y), random()) for x in range(size) for y in range(size))
    cost_up = dict(((x, y), random()) for x in range(size) for y in range(size))
    print min_cost_path(cost_right, cost_up)

Zooko O'Whielacronx

Jul 7, 2010, 11:41:29 PM
to Python List
I'm starting to think that one should use Decimals by default and
reserve floats for special cases.

This is somewhat analogous to the way that Python provides
arbitrarily-big integers by default and Python programmers only use
old-fashioned fixed-size integers for special cases, such as
interoperation with external systems or highly optimized pieces (in
numpy or in native extension modules, for example).

Floats are confusing. I've studied them more than once over the years
but I still can't really predict confidently what floats will do in
various situations.

And most of the time (in my experience) the inputs and outputs to your
system and the literals in your code are actually decimal, so
converting them to float immediately introduces a lossy data
conversion before you've even done any computation. Decimal doesn't
have that problem.
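
For the example that started this thread, the difference looks like this (a quick sketch, not from the original message):

from decimal import Decimal

f = 0.0
d = Decimal('0.0')
for i in range(10):
    f += 0.1               # each step rounds to the nearest binary64 value
    d += Decimal('0.1')    # the literal '0.1' is stored and added exactly

print repr(f)   # 0.99999999999999989
print d         # 1.0
print d == 1    # True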

From now on I'll probably try to use Decimals exclusively in all my
new Python code and switch to floats only if I need to interoperate
with an external system that requires floats or I have some tight
inner loop that needs to be highly optimized.

Regards,

Zooko

David Cournapeau

Jul 8, 2010, 12:04:33 AM
to Zooko O'Whielacronx, Python List
On Thu, Jul 8, 2010 at 5:41 AM, Zooko O'Whielacronx <zo...@zooko.com> wrote:
> I'm starting to think that one should use Decimals by default and
> reserve floats for special cases.
>
> This is somewhat analogous to the way that Python provides
> arbitrarily-big integers by default and Python programmers only use
> old-fashioned fixed-size integers for special cases, such as
> interoperation with external systems or highly optimized pieces (in
> numpy or in native extension modules, for example).

I don't think it is analogous at all. Arbitrary-bit integers have a
simple tradeoff: you are willing to lose performance and memory for
bigger integer. If you leave performance aside, there is no downside
that I know of for using big int instead of "machine int". Since you
are using python, you already bought this kind of tradeoff anyway.

Decimal vs float is a different matter altogether: decimal has
downsides compared to float. First, there is the irreconcilable fact
that no matter how small your range is, it is impossible to represent
all (or even most) numbers exactly with finite memory - float and
decimal are two different solutions to this issue, with different
tradeoffs. Decimals are more "intuitive" than float for numbers that
can be represented as decimals - but most numbers cannot be represented
as (finite) decimals.

Except for some special usecases, you cannot expect to exactly
represent real numbers. Once you accept that fact, you can make a
decision on decimal, fraction, float or whatever format you see fit.

> And most of the time (in my experience) the inputs and outputs to your
> system and the literals in your code are actually decimal, so
> converting them to float immediately introduces a lossy data
> conversion before you've even done any computation. Decimal doesn't
> have that problem.

That's not true anymore once you start doing any computation, if by
decimal you mean finite decimal. And that will never be true once you
start using non-trivial computations (i.e. transcendental functions
like log, exp, etc.).

David

Zooko O'Whielacronx

Jul 8, 2010, 12:53:20 AM
to Python List
On Wed, Jul 7, 2010 at 10:04 PM, David Cournapeau <cour...@gmail.com> wrote:
>
> Decimal vs float is a different matter altogether: decimal has
> downsides compared to float. First, there is this irreconcilable fact
> that no matter how small your range is, it is impossible to represent
> exactly all (even most) numbers exactly with finite memory - float and
> decimal are two different solutions to this issue, with different
> tradeoffs. Decimal are more "intuitive" than float for numbers that
> can be represented as decimal - but most numbers cannot be represented
> as (finite) decimal.

This is not a downside of decimal as compared to float, since most
numbers also cannot be represented as float.

>> And most of the time (in my experience) the inputs and outputs to your
>> system and the literals in your code are actually decimal, so
>> converting them to float immediately introduces a lossy data
>> conversion before you've even done any computation. Decimal doesn't
>> have that problem.
>
> That's not true anymore once you start doing any computation, if by
> decimal you mean finite decimal.

I don't understand. I described two different problems: problem one is
that the inputs, outputs and literals of your program might be in a
different encoding (in my experience they have most commonly been in
decimal). Problem two is that your computations may be lossy. If the
inputs, outputs and literals of your program are decimal (as they have
been for most of my programs) then using decimal is better than using
float because of problem one. Neither has a strict advantage over the
other in terms of problem two.

(There is also problem zero, which is that floats are more confusing,
which is how this thread got started. Problem zero is probably the
most important problem for many cases.)

> And that will be never true once you
> start using non trivial computation (i.e. transcendental functions
> like log, exp, etc...).

I'm sorry, what will never be true? Are you saying that decimals have
a disadvantage compared to floats? If so, what is their disadvantage?

(And do math libraries like http://code.google.com/p/dmath/ help ?)

Regards,

Zooko

Steven D'Aprano

Jul 8, 2010, 3:32:24 AM
to
On Thu, 08 Jul 2010 06:04:33 +0200, David Cournapeau wrote:

> On Thu, Jul 8, 2010 at 5:41 AM, Zooko O'Whielacronx <zo...@zooko.com>
> wrote:
>> I'm starting to think that one should use Decimals by default and
>> reserve floats for special cases.
>>
>> This is somewhat analogous to the way that Python provides
>> arbitrarily-big integers by default and Python programmers only use
>> old-fashioned fixed-size integers for special cases, such as
>> interoperation with external systems or highly optimized pieces (in
>> numpy or in native extension modules, for example).
>
> I don't think it is analogous at all. Arbitrary-bit integers have a
> simple tradeoff: you are willing to lose performance and memory for
> bigger integer. If you leave performance aside, there is no downside
> that I know of for using big int instead of "machine int".

Well, sure, but that's like saying that if you leave performance aside,
there's no downside to Bubblesort instead of Quicksort.

However, I believe that in Python at least, the performance cost of
arbitrary-sized longs is quite minimal compared to the benefit, at least
for "reasonable" sized ints, and so the actual real-world cost of
unifying the int and long types is minimal.

On the other hand, until Decimal is re-written in C, it will always be
*significantly* slower than float.

$ python -m timeit "2.0/3.0"
1000000 loops, best of 3: 0.139 usec per loop
$ python -m timeit -s "from decimal import Decimal as D" "D(2)/D(3)"
1000 loops, best of 3: 549 usec per loop

That's three orders of magnitude difference in speed. That's HUGE, and
*alone* is enough to disqualify changing to Decimal as the default
floating point data type.

Perhaps in the future, if and when Decimal has a fast C implementation,
this can be re-thought.


> Since you are
> using python, you already bought this kind of tradeoff anyway.
>
> Decimal vs float is a different matter altogether: decimal has downsides
> compared to float. First, there is this irreconcilable fact that no
> matter how small your range is, it is impossible to represent exactly
> all (even most) numbers exactly with finite memory - float and decimal
> are two different solutions to this issue, with different tradeoffs.

Yes, but how is this a downside *compared* to float? In what way does
Decimal have downsides that float doesn't? Neither can represent
arbitrary real numbers exactly, but if anything float is *worse* compared
to Decimal for two reasons:

* Python floats are fixed to a single number of bits, while the size of
Decimals can be configured by the user;

* floats can represent sums of powers of two exactly, while Decimals can
represent sums of powers of ten exactly. Not only does that mean that any
number exactly representable as a float can also be exactly represented
as a Decimal, but Decimals can *additionally* represent exactly many
numbers of human interest that floats cannot.
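
A small illustration of that last point (not part of the original post; Decimal.from_float needs Python 2.7 or later):

from decimal import Decimal

# The value a float actually holds is a binary fraction, so a Decimal can hold it exactly...
print Decimal.from_float(0.1)   # 0.1000000000000000055511151231257827021181583404541015625
# ...while one tenth itself is exact as a Decimal but not as any float.
print Decimal('0.1')            # 0.1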


> Decimal are more "intuitive" than float for numbers that can be
> represented as decimal - but most numbers cannot be represented as
> (finite) decimal.

True, but any number that can't be, also can't be exactly represented as
a float either, so how does using float help?

[...]

>> And most of the time (in my experience) the inputs and outputs to your
>> system and the literals in your code are actually decimal, so
>> converting them to float immediately introduces a lossy data conversion
>> before you've even done any computation. Decimal doesn't have that
>> problem.
>
> That's not true anymore once you start doing any computation, if by
> decimal you mean finite decimal. And that will be never true once you
> start using non trivial computation (i.e. transcendental functions like
> log, exp, etc...).

But none of those complications are *inputs*, and, again, floats suffer
from exactly the same problem.

--
Steven

Adam Skutt

Jul 8, 2010, 6:58:27 AM
to
On Jul 8, 12:53 am, "Zooko O'Whielacronx" <zo...@zooko.com> wrote:
> I don't understand. I described two different problems: problem one is
> that the inputs, outputs and literals of your program might be in a
> different encoding (in my experience they have most commonly been in
> decimal). Problem two is that your computations may be lossy. If the
> inputs, outputs and literals of your program are decimal (as they have
> been for most of my programs) then using decimal is better than using
> float because of problem one. Neither has a strict advantage over the
> other in terms of problem two.

I can't think of any program I've ever written where the inputs are
actually intended to be decimal. Consider a simple video editing
program, and the user specifies a frame rate 23.976 fps. Is that what
they really wanted? No, they wanted 24000/1001 but didn't feel like
typing that. It should be instructive and telling that mplayer now
strongly prefers the rational expressions over the decimal ones. Just
because you haven't rounded your input in parsing it doesn't change
the fact that your input may have been rounded before it was presented to the
program, which is the reality of the matter more often than not.
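
The frame-rate example is easy to play with using the fractions module; a sketch (not from the original post):

from fractions import Fraction

ntsc = Fraction(24000, 1001)    # what the user actually means, about 23.976024 fps
typed = Fraction(23976, 1000)   # 23.976 taken literally
print ntsc, typed               # 24000/1001 2997/125
print float(typed - ntsc)       # about -2.4e-05 fps, enough to drift A/V sync over a long clip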

Even in fields where the numbers are decimal, like finance, the rules
are considerably more complicated than what you were taught in the
"schoolbook". This is one of the reasons IBM can still make a killing
selling mainframes, they got all of those rules right decades ago.

> > And that will be never true once you
> > start using non trivial computation (i.e. transcendental functions
> > like log, exp, etc...).
>
> I'm sorry, what will never be true? Are you saying that decimals have
> a disadvantage compared to floats? If so, what is their disadvantage?

He's saying that once you get past elementary operations, you quickly
run into irrational numbers, which you will not be representing
accurately. Moreover, in general, it's impossible to even round
operations involving transcendental functions to an arbitrary fixed-
precision; you may need effectively infinite precision in order to do
the computation.  In practice, this means the error induced by a lossy
input conversion (assuming you hadn't already lost information) is
entirely negligible compared to the inherent inability to do the
necessary calculations.

Mark Dickinson

Jul 8, 2010, 7:23:05 AM
to
On Jul 8, 11:58 am, Adam Skutt <ask...@gmail.com> wrote:
> accurately.  Moreover, in general, it's impossible to even round
> operations involving transcendental functions to an arbitrary fixed-
> precision, you may need effectively infinite precision in order to the
> computation.

Impossible? Can you explain what you mean by this? Doesn't the
decimal module do exactly that, giving correctly-rounded exp() and
log() results to arbitrary precision?

Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin


Type "help", "copyright", "credits" or "license" for more information.

>>> import decimal
>>> decimal.getcontext().prec = 1000
>>> decimal.Decimal(2).ln()
Decimal('0.6931471805599453094172321214581765680755001343602552541206800094933936219696947156058633269964186875420014810205706857336855202357581305570326707516350759619307275708283714351903070386238916734711233501153644979552391204751726815749320651555247341395258829504530070953263666426541042391578149520437404303855008019441706416715186447128399681717845469570262716310645461502572074024816377733896385506952606683411372738737229289564935470257626520988596932019650585547647033067936544325476327449512504060694381471046899465062201677204245245296126879465461931651746813926725041038025462596568691441928716082938031727143677826548775664850856740776484514644399404614226031930967354025744460703080960850474866385231381816767514386674766478908814371419854942315199735488037516586127535291661000710535582498794147295092931138971559982056543928717000721808576102523688921324497138932037843935308877482597017155910708823683627589842589185353024363421436706118923678919237231467232172053401649256872747782344535348')

--
Mark

David Mainzer

Jul 8, 2010, 8:16:36 AM
to


Thanks to all for your help :)

All your comments helped me and now I know how it works in Python!

Best
David

Adam Skutt

Jul 8, 2010, 9:00:48 AM
to
On Jul 8, 7:23 am, Mark Dickinson <dicki...@gmail.com> wrote:
> On Jul 8, 11:58 am, Adam Skutt <ask...@gmail.com> wrote:
>
> > accurately.  Moreover, in general, it's impossible to even round
> > operations involving transcendental functions to an arbitrary fixed-
> > precision, you may need effectively infinite precision in order to the
> > computation.
>
> Impossible?  Can you explain what you mean by this?  Doesn't the
> decimal module do exactly that, giving correctly-rounded exp() and
> log() results to arbitrary precision?
You run into the table-maker's dilemma: there's no way to know in
advance how many digits you need in order to have n bits of precision
in the result. For some computations, the number of bits required to
get the desired precision can quickly overwhelm the finite limitations
of your machine (e.g., you run out of RAM first or the time to compute
the answer is simply unacceptable).

Adam

Ethan Furman

Jul 8, 2010, 9:10:06 AM
to pytho...@python.org
Wolfram Hinderer wrote:
> On 7 Jul., 19:32, Ethan Furman <et...@stoneleaf.us> wrote:
>
>>Nobody wrote:
>>
>>>On Wed, 07 Jul 2010 15:08:07 +0200, Thomas Jollans wrote:
>>
>>>>you should never rely on a floating-point number to have exactly a
>>>>certain value.
>>
>>>"Never" is an overstatement. There are situations where you can rely
>>>upon a floating-point number having exactly a certain value.
>>
>>It's not much of an overstatement. How many areas are there where you
>>need the number
>>0.100000000000000005551115123125782702118158340454101562500000000000?
>>
>>If I'm looking for 0.1, I will *never* (except by accident ;) say
>>
>>if var == 0.1:
>>
>>it'll either be <= or >=.
>
>
> The following is an implementation of a well-known algorithm.
> Why would you want to replace the floating point comparisons? With
> what?

<snip code>

Interesting. I knew when I posted my above comment that I was ignoring
such situations. I cannot comment on the code itself as I am unaware of
the algorithm, and haven't studied numbers extensively (although I do
find them very interesting).

So while you've made your point, I believe mine still stands --
comparing floats using == to absolute numbers is almost always a mistake.

~Ethan~

Mark Dickinson

Jul 8, 2010, 9:22:39 AM
to
On Jul 8, 2:00 pm, Adam Skutt <ask...@gmail.com> wrote:
> On Jul 8, 7:23 am, Mark Dickinson <dicki...@gmail.com> wrote:
> > On Jul 8, 11:58 am, Adam Skutt <ask...@gmail.com> wrote:
>
> > > accurately.  Moreover, in general, it's impossible to even round
> > > operations involving transcendental functions to an arbitrary fixed-
> > > precision, you may need effectively infinite precision in order to the
> > > computation.
>
> > Impossible?  Can you explain what you mean by this?  Doesn't the
> > decimal module do exactly that, giving correctly-rounded exp() and
> > log() results to arbitrary precision?
>
> You run into the table-maker's dilemma: there's no way to know in
> advance how many digits you need in order to have n bits of precision
> in the result.

Sure. But it's a bit of a stretch to go from not knowing what
resources you'll need in advance to calling something 'impossible'. :)

> For some computations, the number of bits required to
> get the desired precision can quickly overwhelm the finite limitations
> of your machine (e.g., you run out of RAM first or the time to compute
> the answer is simply unacceptable).

Perhaps in theory. In practice, though, it's very rare to need to
increase precision more than once or twice beyond an initial first
guesstimate, and the amount of extra precision needed is small. That
increase is unlikely to cause problems unless you were operating right
up against your machine's limits in the first place.

--
Mark

Stefan Krah

Jul 8, 2010, 9:59:40 AM
to pytho...@python.org

Yes, this appears to be unsolved yet, see also:

http://www.cs.berkeley.edu/~wkahan/LOG10HAF.TXT

"Is it time to quit yet? That's the Table-Maker's Dilemma. No general
way exists to predict how many extra digits will have to be carried to
compute a transcendental expression and round it _correctly_ to some
preassigned number of digits. Even the fact (if true) that a finite
number of extra digits will ultimately suffice may be a deep theorem."


However, in practice, mpfr rounds correctly and seems to be doing fine.

In addition to this, I've been running at least 6 months of continuous
tests comparing cdecimal and decimal, and neither log() nor exp() poses
a problem.

pow() is trickier. Exact results have to be weeded out before
attempting the correction loop for correct rounding, and this is
complicated.


For example, in decimal this expression takes a long time (in cdecimal
the power function is not correctly rounded):


Decimal('100.0') ** Decimal('-557.71e-742888888')

Stefan Krah


Adam Skutt

Jul 8, 2010, 10:29:41 AM
to
I suspect your platitude isn't especially comforting for those who
need more computing capability than we can currently construct.
However, I wouldn't call the amount of extra needed precision "small"
for most transcendental functions, as it's frequently more than double
in the worst-case situations and increases non-linearly as the number
of desired digits increases.

Which almost brings us full circle to where I was originally pointing:
the "rounding" problem is inherent in the finite nature of a physical
computer, so you cannot make the rounding problem go away. As such,
talking about differences in rounding between decimal and binary
representations is somewhat of a corner case. Replacing "float" with
"decimal" won't get rid of the problems that floating-point brings to
the table in the first place. The issues that come up all have to do
with lack of human understanding of what the computer is doing. Take
even as something as innocent as equality between two floating-point
numbers: even exact rounding of all operations doesn't solve this
entirely common problem. Accordingly, once we explain why this
doesn't work, we frequently don't need the enhanced functionality
decimal provides and hopefully can make the determination on our own.

If you want to make elementary arithmetic (add, subtract, multiple,
divide) behave intuitively then you (arguably) want an arbitrary-
precision fractional/rational number class. After that, the right
solution is education.

Adam

Victor Eijkhout

Jul 8, 2010, 11:19:37 AM
to
Zooko O'Whielacronx <zo...@zooko.com> wrote:

> I'm starting to think that one should use Decimals by default and
> reserve floats for special cases.

Only if one has Power6 (or 7) which has hardware support for BCD.
Otherwise you will have slow applications.

Victor.
--
Victor Eijkhout -- eijkhout at tacc utexas edu

Mark Dickinson

Jul 8, 2010, 11:36:24 AM
to
On Jul 8, 3:29 pm, Adam Skutt <ask...@gmail.com> wrote:
> On Jul 8, 9:22 am, Mark Dickinson <dicki...@gmail.com> wrote:
> > On Jul 8, 2:00 pm, Adam Skutt <ask...@gmail.com> wrote:
> > > For some computations, the number of bits required to
> > > get the desired precision can quickly overwhelm the finite limitations
> > > of your machine (e.g., you run out of RAM first or the time to compute
> > > the answer is simply unacceptable).
>
> > Perhaps in theory.  In practice, though, it's very rare to need to
> > increase precision more than once or twice beyond an initial first
> > guesstimate, and the amount of extra precision needed is small.  That
> > increase is unlikely to cause problems unless you were operating right
> > up against your machine's limits in the first place.
>
> I suspect your platitude isn't especially comforting for those who
> need more computing capability than we can currently construct.
> However, I wouldn't call the amount of extra needed precision "small"

I think that's because we're talking at cross-purposes.

To clarify, suppose you want to compute some value (pi; log(2);
AGM(1, sqrt(2)); whatever...) to 1000 significant decimal places.
Then typically the algorithm (sometimes known as Ziv's onion-peeling
method) looks like:

(1) Compute an initial approximation to 1002 digits (say), with known
absolute error (given by a suitable error analysis); for the sake of
argument, let's say that you use enough intermediate precision to
guarantee an absolute error of < 0.6 ulps.

(2) Check to see whether that approximation unambiguously gives you
the correctly-rounded 1000 digits that you need.

(3) If not, increase the target precision (say by 3 digits) and try
again.

It's the precision increase in (3) that I was calling small, and
similarly it's step (3) that isn't usually needed more than once or
twice. (In general, for most functions and input values; I dare say
there are exceptions.)

Step (1) will often involve using significantly more than the target
precision for intermediate computations, depending very much on what
you happen to be trying to compute. IIUC, it's the extra precision in
step (1) that you don't want to call 'small', and I agree.

IOW, I'm saying that the extra precision required *due to the table-
maker's dilemma* generally isn't a concern.
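
To make the loop concrete, here is a rough sketch of that retry pattern using the decimal module (an illustration of the idea, not Mark's code; the agreement test below stands in for a real error analysis):

from decimal import Decimal, getcontext

def ln_to_digits(x, digits):
    # Ziv-style retry: evaluate with guard digits and accept the answer only
    # once two successively more precise tries round to the same value.
    guard = 3
    while True:
        getcontext().prec = digits + guard
        first = Decimal(x).ln()
        getcontext().prec = digits + guard + 3
        second = Decimal(x).ln()
        getcontext().prec = digits
        if +first == +second:   # unary + rounds to the target precision
            return +first
        guard += 3              # step (3): bump the precision and try again

print ln_to_digits(2, 50)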

I actually agree with much of what you've said. It was just the
"impossible" claim that went over the top (IMO). The MPFR library
amply demonstrates that computing many transcendental functions to
arbitrary precision, with correctly rounded results, is indeed
possible.

--
Mark

Giacomo Boffi

Jul 8, 2010, 11:52:13 AM
to
"Zooko O'Whielacronx" <zo...@zooko.com> writes:

> I'm starting to think that one should use Decimals by default and
> reserve floats for special cases.

Would you kindly lend me your Decimals ruler? I need to measure the
sides of the triangle whose area I have to compute.

Chris Rebert

Jul 8, 2010, 12:31:32 PM
to Giacomo Boffi, pytho...@python.org

If your ruler doesn't have a [second] set of marks for centimeters and
millimeters, that's really one cheap/cruddy ruler you're using.

Cheers,
Chris
--
Metrication!
http://blog.rebertia.com

Zooko O'Whielacronx

Jul 8, 2010, 12:38:24 PM
to Python List
On Thu, Jul 8, 2010 at 4:58 AM, Adam Skutt <ask...@gmail.com> wrote:
>
> I can't think of any program I've ever written where the inputs are
> actually intended to be decimal.  Consider a simple video editing
> program, and the user specifies a frame rate 23.976 fps.  Is that what
> they really wanted?  No, they wanted 24000/1001 but didn't feel like
> typing that.

Okay, so there was a lossy conversion from the user's intention
(24000/1001) to what they typed in (23.976).

>>> instr = '23.976'

Now as a programmer you have two choices:

1. accept what they typed in and losslessly store it in a decimal:

>>> from decimal import Decimal as D

>>> x = D(instr)
>>> print x
23.976

2. accept what they typed in and lossily convert it to a float:

>>> x = float(instr)
>>> print "%.60f" % (x,)
23.975999999999999090505298227071762084960937500000000000000000

option 2 introduces further "error" between what you have stored in
your program and what the user originally wanted and offers no
advantages except for speed, right?

>> I'm sorry, what will never be true? Are you saying that decimals have
>> a disadvantage compared to floats? If so, what is their disadvantage?
>
> He's saying that once you get past elementary operations, you quickly
> run into irrational numbers, which you will not be representing
> accurately.  Moreover, in general, it's impossible to even round
> operations involving transcendental functions to an arbitrary fixed-
> precision, you may need effectively infinite precision in order to the
> computation.  In practice, this means the error induced by a lossy
> input conversion (assuming you hadn't already lost information) is
> entirely negligible compared to inherent inability to do the necessary
> calculations.

But this is not a disadvantage of decimal compared to float is it?
These problems affect both representations. Although perhaps they
affect them differently, I'm not sure.

I think sometimes people conflate the fact that decimals can easily
have higher and more variable precision than floats with the fact that
decimals are capable of losslessly storing decimal values but floats
aren't.

Regards,

Zooko

Adam Skutt

Jul 8, 2010, 1:04:38 PM
to
On Jul 8, 11:36 am, Mark Dickinson <dicki...@gmail.com> wrote:
> I think that's because we're talking at cross-purposes.
>
> To clarify, suppose you want to compute some value (pi;  log(2);
> AGM(1, sqrt(2)); whatever...) to 1000 significant decimal places.
> Then typically the algorithm (sometimes known as Ziv's onion-peeling
> method) looks like:
>
> (1) Compute an initial approximation to 1002 digits (say), with known
> absolute error (given by a suitable error analysis); for the sake of
> argument, let's say that you use enough intermediate precision to
> guarantee an absolute error of < 0.6 ulps.
>
> (2) Check to see whether that approximation unambiguously gives you
> the correctly-rounded 1000 digits that you need.
>
> (3) If not, increase the target precision (say by 3 digits) and try
> again.
>
> It's the precision increase in (3) that I was calling small, and
> similarly it's step (3) that isn't usually needed more than once or
> twice.  (In general, for most functions and input values;  I dare say
> there are exceptions.)
>
> Step (1) will often involve using significantly more than the target
> precision for intermediate computations, depending very much on what
> you happen to be trying to compute.  IIUC, it's the extra precision in
> step (1) that you don't want to call 'small', and I agree.
>
> IOW, I'm saying that the extra precision required *due to the table-
> maker's dilemma* generally isn't a concern.
Yes, though I think attributing only the precision added in step 3 to
the table-maker's dilemma isn't entirely correct. While it'd be
certainly less of a dilemma if we could precompute the necessary
precision, it doesn't help us if the precision is generally
unbounded. As such, I think it's really two dilemmas for the price of
one.

> > I actually agree with much of what you've said.  It was just the
> "impossible" claim that went over the top (IMO).  The MPFR library
> amply demonstrates that computing many transcendental functions to
> arbitrary precision, with correctly rounded results, is indeed
> possible.

That's because you're latching onto that word instead of the whole
sentence in context and making a much bigger deal out of than is
appropriate. The fact that I may not be able to complete a given
calculation for an arbitrary precision is not something that can be
ignored. It's the same notional problem with arbitrary-precision
integers: is it better to run out of memory or overflow the
calculation? The answer, of course, is a trick question.

Adam

Mark Dickinson

Jul 8, 2010, 1:07:23 PM
to
On Jul 8, 2:59 pm, Stefan Krah <stefan-use...@bytereef.org> wrote:
> pow() is trickier. Exact results have to be weeded out before
> attempting the correction loop for correct rounding, and this is
> complicated.
>
> For example, in decimal this expression takes a long time (in cdecimal
> the power function is not correctly rounded):
>
> Decimal('100.0') ** Decimal('-557.71e-742888888')

Hmm. So it does. Luckily, this particular problem is easy to deal
with. Though I dare say that you have more up your sleeve. :)?

--
Mark

Adam Skutt

Jul 8, 2010, 1:22:29 PM
to
On Jul 8, 12:38 pm, "Zooko O'Whielacronx" <zo...@zooko.com> wrote:
> On Thu, Jul 8, 2010 at 4:58 AM, Adam Skutt <ask...@gmail.com> wrote:
>
> > I can't think of any program I've ever written where the inputs are
> > actually intended to be decimal.  Consider a simple video editing
> > program, and the user specifies a frame rate 23.976 fps.  Is that what
> > they really wanted?  No, they wanted 24000/1001 but didn't feel like
> > typing that.
>
> Okay, so there was a lossy conversion from the user's intention
> (24000/1001) to what they typed in (23.976).
>
> >>> instr = '23.976'
>
> Now as a programmer you have two choices:
>
> 1. accept what they typed in and losslessly store it in a decimal:
>
> >>> from decimal import Decimal as D
> >>> x = D(instr)
> >>> print x
>
> 23.976
>
> 2. accept what they typed in and lossily convert it to a float:
>
> >>> x = float(instr)
> >>> print "%.60f" % (x,)
>
> 23.975999999999999090505298227071762084960937500000000000000000
>
> option 2 introduces further "error" between what you have stored in
> your program and what the user originally wanted and offers no
> advantages except for speed, right?

No, you have a third choice, and it's the only right choice:
3. Convert the input to the user's desired behavior and behave
accordingly. Anything else, here, will result in A/V sync issues.
Which is really my point, just because we write '23.976' on the
command-line doesn't necessarily mean that's what we meant. Humans
are pretty lazy, and we write rational numbers as incomplete decimals
all of the time.

> But this is not a disadvantage of decimal compared to float is it?
> These problems affect both representations. Although perhaps they
> affect them differently, I'm not sure.
>
> I think sometimes people conflate the fact that decimals can easily
> have higher and more variable precision than floats with the fact that
> decimals are capable of losslessly storing decimal values but floats
> aren't.
>

No, it's not a specific disadvantage of decimal compared to float.
I'm not sure why David C. chose to phrase it in those specific terms,
though I'm not sure it matters all that much. What I believe is one
must understand that the underlying issues are fundamental, and the
only way to solve the issues is to educate programmers so they can
write code that behaves correctly in the face of rounding. And I do
believe you're correct that programmers frequently see one desirable
behavior of decimal FP over binary FP and therefore assume all the
badness must have gone away.

Adam

Stefan Krah

Jul 8, 2010, 2:00:14 PM
to pytho...@python.org
Adam Skutt <ask...@gmail.com> wrote:
> > > I actually agree with much of what you've said.  It was just the
> > "impossible" claim that went over the top (IMO).  The MPFR library
> > amply demonstrates that computing many transcendental functions to
> > arbitrary precision, with correctly rounded results, is indeed
> > possible.
> That's because you're latching onto that word instead of the whole
> sentence in context and making a much bigger deal out of than is
> appropriate. The fact that I may not be able to complete a given
> calculation for an arbitrary precision is not something that can be
> ignored. It's the same notional problem with arbitrary-precision
> integers: is it better to run out of memory or overflow the
> calculation? The answer, of course, is a trick question.

In the paper describing his strategy for correct rounding Ziv gives an
estimate for abnormal cases, which is very low.

This whole argument is a misunderstanding. Mark and I argue that correct
rounding is quite feasible in practice, you argue that you want guaranteed
execution times and memory usage. This is clear now, but was not so apparent
in the "impossible" paragraph that Mark responded to.


I think asking for strictly bounded resource usage is reasonable. In cdecimal
there is a switch to turn off correct rounding for exp() and log().

Stefan Krah

Stefan Krah

Jul 8, 2010, 2:11:34 PM
to pytho...@python.org

Not at the moment, but I'll keep trying. :)


Stefan Krah


Zooko O'Whielacronx

Jul 8, 2010, 3:15:08 PM
to Python List
On Thu, Jul 8, 2010 at 11:22 AM, Adam Skutt <ask...@gmail.com> wrote:
> On Jul 8, 12:38 pm, "Zooko O'Whielacronx" <zo...@zooko.com> wrote:
>> Now as a programmer you have two choices:

>> 1. accept what they typed in and losslessly store it in a decimal:

>> 2. accept what they typed in and lossily convert it to a float:

> No, you have a third choice, and it's the only right choice:


> 3. Convert the input to user's desired behavior and behave
> accordingly.  Anything else, here, will result in A/V sync issues.

Okay, please tell me about how this is done. I've never dealt with
this sort of issue directly.

As far as I can imagine, any way to implement option 3 will involve
accepting the user's input and storing it in memory somehow and then
computing on it. As far as I can tell, doing so with a decimal instead
of a float will never be worse for your outcome and will often be
better. But, please explain in more detail how this works.

Thanks!

Regards,

Zooko

Adam Skutt

Jul 8, 2010, 3:33:09 PM
to
On Jul 8, 2:00 pm, Stefan Krah <stefan-use...@bytereef.org> wrote:
>
> This whole argument is a misunderstanding. Mark and I argue that correct
> rounding is quite feasible in practice, you argue that you want guaranteed
> execution times and memory usage. This is clear now, but was not so apparent
> in the "impossible" paragraph that Mark responded to.
No, that's not what I'm arguing for, though such a feature is important.
I'm pointing out that the feasibility of correct rounding is entirely
dependent on what you're doing. For IEEE-754 double, it's feasible
for the elementary functions if you can tolerate intermediate
calculations that are more than twice as large as your double in the
corner cases. Certainly, for a single calculation, this is
acceptable, but at how many calculations is it no longer acceptable?

Adam

Wolfram Hinderer

Jul 8, 2010, 4:52:56 PM
to
On 8 Jul., 15:10, Ethan Furman <et...@stoneleaf.us> wrote:

> Interesting.  I knew when I posted my above comment that I was ignoring
> such situations.  I cannot comment on the code itself as I am unaware of
> the algorithm, and haven't studied numbers extensively (although I do
> find them very interesting).
>
> So while you've made your point, I believe mine still stands --
> comparing floats using == to absolute numbers is almost always a mistake.

JFTR, it works because a+b == a+b (while I don't think that a+b == b+a
holds for all a and b).

In general, I totally agree with you.

Paul Rubin

Jul 8, 2010, 5:44:37 PM
to
Wolfram Hinderer <wolfram....@googlemail.com> writes:
> JFTR, it works because a+b == a+b (while I don't think that a+b == b+a
> holds for all a and b).

I'm pretty sure IEEE 754 addition is always commutative (except maybe in
the presence of infinity or NaN and maybe not even then). It differs from
rational or real-number arithmetic in that it is not always associative.
You can have (a+b)+c != a+(b+c).
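
A quick demonstration of that non-associativity (not from the original post):

a, b, c = 1e16, -1e16, 1.0
print (a + b) + c   # 1.0
print a + (b + c)   # 0.0, because -1e16 + 1.0 rounds back to -1e16

x, y, z = 0.1, 0.2, 0.3
print (x + y) + z == x + (y + z)   # False: the two groupings round differently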

Mark Dickinson

Jul 8, 2010, 5:54:27 PM
to
On Jul 8, 9:52 pm, Wolfram Hinderer <wolfram.hinde...@googlemail.com>
wrote:

> JFTR, it works because a+b == a+b (while I don't think that a+b == b+a
> holds for all a and b).

Actually, that's one of the few identities that's safe. Well, for non-
NaN IEEE 754 floating-point, at any rate. And assuming that there's
no use of extended precision to compute intermediate results (in which
case one side could conceivably be computed with different precision
from the other; but that applies to a+b == a+b, too). And assuming
that no FPE signals occur and halt the program... (e.g., for overflow,
or from doing -inf + inf). Okay, maybe not that safe, then. :)

For NaNs, there are two problems: first, if exactly one of a and b is
a NaN then a+b and b+a will both be NaNs, so a + b == b + a will be
false, even though they're identical. Second, if a and b are *both*
NaNs, and you're feeling really fussy and care about signs and
payloads, then a + b and b + a won't even necessarily be bitwise
identical: a + b will have the sign and payload of a, while b + a has
the sign and payload of b. You can see something similar with the
Decimal module:

>>> Decimal('NaN123') + Decimal('-NaN456')
Decimal('NaN123')
>>> Decimal('-NaN456') + Decimal('NaN123')
Decimal('-NaN456')

Or more directly (just for fun), using the struct module to create
particular nans:

>>> import struct
>>> x = struct.unpack('<d', '\xff\xff\xff\xff\xff\xff\xff\xff')[0]
>>> y = struct.unpack('<d', '\xfe\xff\xff\xff\xff\xff\xff\xff')[0]
>>> x
nan
>>> y
nan
>>> struct.pack('<d', x+y) # identical to x
'\xff\xff\xff\xff\xff\xff\xff\xff'
>>> struct.pack('<d', y+x) # identical to y
'\xfe\xff\xff\xff\xff\xff\xff\xff'

Isn't floating-point a wonderful thing. :)

--
Mark

Tim Roberts

Jul 9, 2010, 3:16:16 AM
to
David Mainzer <d...@tu-clausthal.de> wrote:
>
>All your comments helped me and now i know how it works in python !

This has nothing to do with Python. What you're seeing is the way IEEE-754
floating point arithmetic works. You'd see these same results in any
programming language, including assembly language (assuming it hasn't
changed the rounding mode).
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.

Aahz

Jul 9, 2010, 11:41:57 AM
to
In article <mailman.426.1278606...@python.org>,

Chris Rebert <cl...@rebertia.com> wrote:
>On Thu, Jul 8, 2010 at 8:52 AM, Giacomo Boffi <giacom...@polimi.it> wrote:
>> "Zooko O'Whielacronx" <zo...@zooko.com> writes:
>>>
>>> I'm starting to think that one should use Decimals by default and
>>> reserve floats for special cases.
>>
>> would you kindly lend me your Decimals ruler? i need to measure the
>> sides of the triangle whose area i have to compute
>
>If your ruler doesn't have a [second] set of marks for centimeters and
>millimeters, that's really one cheap/cruddy ruler you're using.

Unless I'm badly mistaken, Giacomo was making a funny about Greek
geometers.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

"....Normal is what cuts off your sixth finger and your tail..." --Siobhan
