"Sometimes Floating Point Math is Perfect"

Lynn McGuire

Jul 12, 2017, 6:48:00 PM

"Sometimes Floating Point Math is Perfect"

https://randomascii.wordpress.com/2017/06/19/sometimes-floating-point-math-is-perfect/

Interesting. We moved to 64 bit doubles a couple of decades ago and
have never regretted it.

Lynn

Thomas Jahns

Jul 13, 2017, 4:15:00 AM

Not really applicable in the comp.lang.fortran forum: Fortran leaves enough
parts of REAL arithmetic unspecified that compilers are free to interpret
code in ways that give results different from the IEEE specifications. One
example is that the Intel compiler (when optimizing at all) will compute
expressions at compile time differently than at run time.

C and C++ have tighter rules, but that effectively also leaves less opportunity
for optimizations.

Thomas

bitrex

Jul 13, 2017, 8:23:35 AM

"If the two constants being added had been exact then there would only
have been one rounding in the calculation and the result would have
matched the literal on the right-hand side."

What does "exact" mean in this context? How is writing 98432341293.375 +
0.000244140625 more "exact" than 0.2 + 0.3?

Fred.Zwarts

Jul 13, 2017, 9:51:49 AM

"bitrex" wrote in message news:YCJ9B.187981$qL3....@fx34.iad...

Exact means that the decimal representation can be converted exactly to a
binary representation.
Since 10 is divisible by 5, 1/5 can be represented exactly in decimal
notation (0.2).
But since 2 is not divisible by 5, 1/5 cannot be represented exactly in
binary notation.
(Just as 1/3 cannot be represented exactly in decimal notation (because 3
is not a divisor of 10) but needs rounding: 0.3333333333.)
So, 0.2 cannot be represented exactly by a binary floating point number.
0.000244140625, however, can be represented exactly as a binary floating
point number.
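
A quick way to check this kind of exactness, sketched in Python on the assumption that its float is an IEEE 754 double (true on common platforms): convert the decimal text to a double, then compare the double's exact value with the exact decimal value. The helper name is_exact_in_binary is made up for this illustration.

```python
from fractions import Fraction

def is_exact_in_binary(decimal_text):
    """Return True if the decimal literal converts to a double with no rounding."""
    # Fraction(str) is the exact decimal value; Fraction(float) is the exact
    # value actually stored in the double.
    return Fraction(float(decimal_text)) == Fraction(decimal_text)

for text in ("0.2", "0.3", "0.000244140625", "98432341293.375"):
    print(text, is_exact_in_binary(text))

# 0.000244140625 is exactly 1/4096, i.e. 2**-12
print(Fraction(float("0.000244140625")) == Fraction(1, 4096))
```

The first two literals should report False and the last two True, which is the distinction the quoted sentence relies on.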

Ben Bacarisse

Jul 13, 2017, 10:05:04 AM

bitrex <bit...@de.lete.earthlink.net> writes:

> On 07/12/2017 06:47 PM, Lynn McGuire wrote:
>> "Sometimes Floating Point Math is Perfect"
>>
>> https://randomascii.wordpress.com/2017/06/19/sometimes-floating-point-math-is-perfect/
>>
>>
>> Interesting. We moved to 64 bit doubles a couple of decades ago and
>> have never regretted it.
>>
>> Lynn
>
> "If the two constants being added had been exact then there would only
> have been one rounding in the calculation and the result would have
> matched the literal on the right-hand side."
>
> What does "exact" mean in this context?

In this context it means... er... exact -- without any error.

> How is writing 98432341293.375 + 0.000244140625 more "exact" than 0.2
> + 0.3?

Neither the quoted paragraph nor the blog post uses the term "more
exact". What's more, the blog post does not compare 98432341293.375 +
0.000244140625 with 0.2 + 0.3, but with 0.1 + 0.2.

98432341293.375 is binary 1011011101011000001100101010100101101.011.
This can be represented exactly (without error) in a C double and, if
the C implementation conforms to the IEEE floating-point
recommendations, it will be exactly represented. Likewise,
0.000244140625 is exactly .000000000001 and the rules of IEEE arithmetic
say that the sum must be rounded to the nearest (binary) digit. In
fact, the sum can be represented exactly in a C double so "rounding to
the nearest binary digit" means, in this case, giving the exact answer.
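
The figures above can be checked mechanically. This is only a sketch in Python, assuming its float is an IEEE 754 double; the variable names are illustrative. It prints the binary expansion of the first constant and confirms that the floating-point sum equals the exact rational sum.

```python
from fractions import Fraction

a = 98432341293.375
b = 0.000244140625

# Binary digits of the integer part, and the three fraction bits (0.375 * 8 = 3).
int_bits = bin(int(a))[2:]
frac_bits = format(int((a - int(a)) * 8), "03b")
print(int_bits + "." + frac_bits)

# Exact rational sum of the decimal literals versus the double sum.
exact_sum = Fraction("98432341293.375") + Fraction("0.000244140625")
float_sum = a + b
print(Fraction(float_sum) == exact_sum)

# The reduced numerator is odd here, so its bit length is the number of
# significant bits the sum needs; a double offers 53.
significant_bits = Fraction(float_sum).numerator.bit_length()
print(significant_bits)
```

The sum needs 37 integer bits plus 12 fraction bits, which fits within the 53-bit significand of a double, so no rounding is required.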

With 0.1 + 0.2 there are three places where accuracy is lost. 0.1 cannot
be represented exactly in a binary double and neither can 0.2. Both
will be represented by the nearest possible floating-point number, but
neither is exact. Finally, the sum of those two closest-but-not-quite
numbers cannot be exactly represented either, giving a third loss of
accuracy.
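
Each of the three losses can be made visible with exact rational arithmetic. A minimal sketch in Python, again assuming float is an IEEE 754 double; the names err_a, err_b and err_add are illustrative only.

```python
from fractions import Fraction

# Rounding error when the literals 0.1 and 0.2 are converted to doubles.
err_a = Fraction(0.1) - Fraction(1, 10)
err_b = Fraction(0.2) - Fraction(1, 5)

# Rounding error of the addition itself: the stored result versus the
# exact sum of the two stored operands.
exact_sum_of_doubles = Fraction(0.1) + Fraction(0.2)
err_add = Fraction(0.1 + 0.2) - exact_sum_of_doubles

print("error in 0.1:", float(err_a))
print("error in 0.2:", float(err_b))
print("error in add:", float(err_add))
print("0.1 + 0.2 == 0.3:", 0.1 + 0.2 == 0.3)
```

All three errors should be nonzero, and the final comparison should print False.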

I leave your example, 0.2 + 0.3 for you to analyse yourself.

--
Ben.

bitrex

Jul 13, 2017, 10:11:09 AM

On 07/13/2017 10:04 AM, Ben Bacarisse wrote:

> In this context it means... er... exact -- without any error.
>
>> How is writing 98432341293.375 + 0.000244140625 more "exact" than 0.2
>> + 0.3?
>
> Neither the quoted paragraph nor the blog post uses the term "more
> exact". What's more, the blog post does not compare 98432341293.375 +
> 0.000244140625 with 0.2 + 0.3, but with 0.1 + 0.2.
>
> 98432341293.375 is binary 1011011101011000001100101010100101101.011.

Ah, naturally. Thanks for clearing that up

bitrex

Jul 13, 2017, 10:16:38 AM

On 07/13/2017 09:51 AM, Fred.Zwarts wrote:

> Exact means that the decimal representation can be converted exactly to
> a binary representation.
> Since 10 is divisible by 5, 1/5 can be represented exactly in decimal
> notation (0.2).
> But since 2 is not divisible by 5, 1/5 cannot be represented exactly in
> binary notation.
> (Just as 1/3 cannot be represented exactly in decimal notation
> (because 3 is not a divisor of 10) but needs rounding: 0.3333333333.)
> So, 0.2 cannot be represented exactly by a binary floating point number.
> 0.000244140625, however, can be represented exactly as a binary floating
> point number.

I thought it might be something like that; unfortunately my ability to
mentally do decimal to binary conversions on the fly tops out around 900
million

Siri Cruise

Jul 13, 2017, 11:08:38 AM

In article <YCJ9B.187981$qL3....@fx34.iad>,

Any real number of the form m*2^n, with m and n integers, can be represented
exactly in radix two, the near-universal radix of computer numbers, in about
log2 m + log2 n digits. These numbers are closed under addition, subtraction,
and multiplication, but not division.

Such a number can be represented exactly as a computer real number if the
representation has at least log2 m fraction bits and log2 n exponent bits.

0.2 = 2*10^-1 = 2*2^-1*5^-1 = (1*2^0) / (5*2^0), and the set is not closed
under this division. Same for 0.3.
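
The closure claim can be tried out with exact rationals. A small sketch in Python; is_dyadic is a hypothetical helper that tests whether a value has the form m*2^n by checking that its reduced denominator is a power of two. The last line assumes Python's float is an IEEE 754 double.

```python
from fractions import Fraction

def is_dyadic(q):
    """True if q equals m * 2**n for integers m and n."""
    d = Fraction(q).denominator
    return (d & (d - 1)) == 0  # power-of-two test on the reduced denominator

x = Fraction(3, 8)   # 3 * 2**-3
y = Fraction(5, 4)   # 5 * 2**-2

for label, value in (("x + y", x + y), ("x - y", x - y),
                     ("x * y", x * y), ("x / y", x / y)):
    print(label, "=", value, "dyadic:", is_dyadic(value))

# The double nearest to 0.2 is dyadic, but the real number 1/5 is not.
print(is_dyadic(0.2), is_dyadic(Fraction(1, 5)))
```

Here x / y is 3/10, whose denominator has a factor of 5, so the quotient leaves the set even though both operands are in it.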


bitrex

Jul 13, 2017, 12:50:57 PM

On 07/13/2017 11:08 AM, Siri Cruise wrote:

>> What does "exact" mean in this context? How is writing 98432341293.375 +
>> 0.000244140625 more "exact" than 0.2 + 0.3?
>
> Any real number of the form m*2^n, with m and n integers, can be represented
> exactly in radix two, the near-universal radix of computer numbers, in about
> log2 m + log2 n digits. These numbers are closed under addition, subtraction,
> and multiplication, but not division.
>
> Such a number can be represented exactly as a computer real number if the
> representation has at least log2 m fraction bits and log2 n exponent bits.
>
> 0.2 = 2*10^-1 = 2*2^-1*5^-1 = (1*2^0) / (5*2^0), and the set is not closed
> under this division. Same for 0.3.

Thanks, I was unsure because IIRC it wasn't stated explicitly in the
article that the author was talking about just the standard properties of
binary floating-point representations, and not something specific to
IEEE-754 (which I'm not terribly familiar with). I guess I thought that
paragraph (and the example numbers the author used) wasn't particularly
well-worded, but my educational background isn't in computer science.

Öö Tiib

Jul 13, 2017, 6:26:58 PM

It is all about how we handle precision. When we implement our calculation
algorithms naively, the accuracy of the result may be a lot less than
the precision of the floating-point type used. Wikipedia discusses this in
https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
for example.
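
As an illustration of that point, here is a sketch in Python (assuming float is an IEEE 754 double) comparing the textbook one-pass variance formula with Welford's online algorithm, one of the methods described in the linked article. The function names and the sample data are chosen for this example.

```python
def naive_variance(xs):
    """Textbook one-pass formula: (sum(x^2) - sum(x)^2 / n) / (n - 1)."""
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        total += x
        total_sq += x * x  # squares near 1e18 lose their low-order bits
    return (total_sq - total * total / n) / (n - 1)

def welford_variance(xs):
    """Welford's online algorithm: update a running mean and sum of squared deviations."""
    mean = 0.0
    m2 = 0.0
    for k, x in enumerate(xs, start=1):
        delta = x - mean
        mean += delta / k
        m2 += delta * (x - mean)
    return m2 / (len(xs) - 1)

# Small spread on top of a large offset; the exact sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print("naive:  ", naive_variance(data))
print("welford:", welford_variance(data))
```

Both functions use the same doubles, yet the naive version subtracts two nearly equal numbers of magnitude around 4e18 and can land far from 30 (it can even come out negative), while the Welford version stays at or very near 30.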

Chris M. Thomasson

Jul 13, 2017, 10:38:51 PM

Thank you for posting this Lynn. :)

Lynn McGuire

Jul 14, 2017, 2:39:50 PM

You are welcome.

Lynn