
[mildly off-topic] Valgrind and rounding errors


Martijn van Buul

Jun 7, 2019, 8:47:50 AM
Hello,

I know this is a bit off-topic (or is it?) but I'm hoping someone
over here can shed some light on this anyway:

I have a very simple program:

#include <cinttypes>
#include <iostream>

constexpr uint64_t operator "" _GHz(long double arg)
{
    return (uint64_t) (1E9L * arg);
}

int main()
{
    std::cout << 2.1_GHz << std::endl;
    return 0;
}

Yes, I'm aware that there is a potential truncation issue here, because
"2.1" has a terrible floating point representation. I also know how to
address it. However, I don't see any *UB* here: The result should always
be the same. It might be unexpected, but not unpredictable.

Except, it is:

[martijnb@tim ~]$ g++ ./wtf.cpp
[martijnb@tim ~]$ ./a.out
2099999999
[martijnb@tim ~]$ valgrind -q ./a.out
2100000000
[martijnb@tim ~]$ exit

I godbolted this, and I discovered that the 'constexpr' declaration is
being ignored when compiling without optimizations - the user-defined
literal is still being evaluated at runtime. However, I don't understand
why the truncation has a different result depending on whether it runs in
valgrind or not. Once I compile this with any level of optimisation, the
difference goes away.
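
(For reference, a minimal sketch of forcing the evaluation to happen at
translation time even at -O0, reusing the _GHz operator from the program
above; a constexpr variable's initializer must be a constant expression,
so the compiler has to fold it. The name 'freq' is arbitrary:

constexpr uint64_t freq = 2.1_GHz;   // evaluated by the compiler, not at runtime

int main()
{
    std::cout << freq << std::endl;  // prints the compiler's result, valgrind or not
    return 0;
}
)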

Any suggestions to avoid this? Am I hitting UB, and if so: why? I wish I
could avoid the "long double" argument here, but as far as I can tell I can't.

Martijn

PS: This reminds me of earlier issues a few years ago (where gcc was using the
80-bit floating point registers of Intel x87 FPUs for operations on 64-bit
doubles), but maybe I'm seeing ghosts. Also, that still doesn't quite
explain why valgrind makes a difference, and I would expect truncation
to have an effect in the *opposite* direction: it would make the
represented value *lower*, not *higher*.


--
Martijn van Buul - pi...@dohd.org

Öö Tiib

Jun 7, 2019, 9:51:53 AM
On Friday, 7 June 2019 15:47:50 UTC+3, Martijn van Buul wrote:
> However, I don't see any *UB* here: The result should always
> be the same. It might be unexpected, but not unpredictable.

I also do not see any UB here, but that does not mean the
result should always be the same.
Have you read David Goldberg's 28-year-old paper on the
subject?
Hmm ... Oracle has a copy up:
https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

> Any suggestions to avoid this?

You have to round when you don't want your results to be
(sometimes unpredictably; see the paper) truncated.
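
For instance, a sketch (illustrative only, not from the original post) of
a rounding variant of the literal; std::llround rounds to nearest instead
of truncating, so small representation errors in either direction no
longer flip the last digit:

#include <cinttypes>
#include <cmath>
#include <iostream>

// Rounding variant of the _GHz literal: llround rounds to the nearest
// integer, so 2.1_GHz gives 2100000000 even when the stored value of
// 2.1L falls slightly below or above the exact 2.1.  (Not constexpr:
// llround is not usable in constant expressions in C++17.)
uint64_t operator "" _GHz(long double arg)
{
    return (uint64_t) std::llround(1E9L * arg);
}

int main()
{
    std::cout << 2.1_GHz << std::endl;  // 2100000000
    return 0;
}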

Christian Gollwitzer

Jun 7, 2019, 2:56:49 PM
On 07.06.19 at 15:51, Öö Tiib wrote:
I don't think this is the problem. Floating point numbers are by
definition inaccurate, but this does not mean they are irreproducible.
If you perform the exact same operations twice, you'll get the same
(bit-exact) result. Across architectures it may vary, but even the IEEE
standard defines bit-exact addition, subtraction, multiplication and
square root. So even across architectures, with the same IEEE flags set
(rounding etc.), simple computations give bit-identical results.

Maybe valgrind has a bug, or it does not support the long double (80
bit) data type and emulates it with 64-bit double computations?

Christian

Öö Tiib

Jun 10, 2019, 3:24:47 AM
On Friday, 7 June 2019 21:56:49 UTC+3, Christian Gollwitzer wrote:
> On 07.06.19 at 15:51, Öö Tiib wrote:
> > On Friday, 7 June 2019 15:47:50 UTC+3, Martijn van Buul wrote:
> >> However, I don't see any *UB* here: The result should always
> >> be the same. It might be unexpected, but not unpredictable.
> >
> > I also do not see any UB here but that does not mean the
> > result should be always the same.
> > Have you read David Goldberg's 28 years old paper on the
> > subject?
> > Hmm ... Oracle has copy up:
> > https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
> >
> >> Any suggestions to avoid this?
> >
> > You have to round when you don't want your results to be
> > (sometimes unpredictably, See the paper) truncated.
>
>
> I don't think this is the problem. Floating point numbers are by
> definition inaccurate, but this does not mean they are irreproducible.
> If you perform the exact same operations twice, you'll get the same
> (bitexact) result. Across architectures it may vary, but even the IEEE
> standard defines bit-exact addition, subtraction, multiplication and
> square root. So even across architecture, with the same IEEE flags set
> (rounding etc.) simple computations give bit-identical results.

See the elaboration of that example (quoted from the paper):

#include <stdio.h>   /* added; needed for printf */

int main() {
    double q;
    q = 3.0/7.0;
    if (q == 3.0/7.0) printf("Equal\n");
    else printf("Not Equal\n");
    return 0;
}



Christian Gollwitzer

Jun 11, 2019, 1:30:18 AM
On 10.06.19 at 09:24, Öö Tiib wrote:
In this example, with long double intermediate precision, the compiler
possibly performs two different computations: q is computed as

round_to_double64(long double(3.0)/long double(7.0))

and this is compared to long double(3.0)/long double(7.0)

The result can differ. What can't happen (usually) is that the same
binary on the same computer prints sometimes "Equal" and sometimes "Not
Equal", depending on the phase of the moon.

Martijn runs the exact same assembly code directly and within valgrind,
i.e. the order of the operations performed should be identical. Now
valgrind is a CPU simulator and it turns out that it does not support
the long double type:

"Precision: There is no support for 80 bit arithmetic. Internally,
Valgrind represents all such "long double" numbers in 64 bits, and so
there may be some differences in results. Whether or not this is
critical remains to be seen. Note, the x86/amd64 fldt/fstpt instructions
(read/write 80-bit numbers) are correctly simulated, using conversions
to/from 64 bits, so that in-memory images of 80-bit numbers look correct
if anyone wants to see."

http://www.valgrind.org/docs/manual/manual-core.html#manual-core.limits

Christian

Fred. Zwarts

Jun 11, 2019, 3:09:24 AM
On 7 Jun 2019 at 20:56, Christian Gollwitzer wrote:
> On 07.06.19 at 15:51, Öö Tiib wrote:
...

>
> ... Floating point numbers are by
> definition inaccurate, but this does not mean they are irreproducible.

I think the term "inaccurate" is a bit inaccurate here. :-)
Floating point calculations are very accurate, often more accurate than
integer calculations. But floating point calculations are more difficult
to understand.
Compare a floating point type with a mantissa of N bits with an integer
type with N bits.
For each calculation where the integer calculation gives a good result,
the floating point calculation will give exactly the same good result.
Where the integer calculation fails to give an exact result (e.g.,
due to truncation as in 5/3), the floating point result (e.g. 5.0/3.0)
is more accurate (closer to 1.6666...) than the integer result.

Martijn van Buul

Jun 11, 2019, 4:01:32 AM
* Christian Gollwitzer:
> Martin runs the exact same assembly code directly and within valgrind,
> i.e. the order of the operations performed should be identical. Now
> valgrind is a CPU simulator and it turns out that it does not support
> the long double type:
>
> "Precision: There is no support for 80 bit arithmetic. Internally,
> Valgrind represents all such "long double" numbers in 64 bits, and so
> there may be some differences in results. Whether or not this is
> critical remains to be seen. Note, the x86/amd64 fldt/fstpt instructions
> (read/write 80-bit numbers) are correctly simulated, using conversions
> to/from 64 bits, so that in-memory images of 80-bit numbers look correct
> if anyone wants to see."

Ah, thank you, that's my problem alright. I hope this doesn't invalidate
the unit tests of this particular project (which are all
executed under valgrind); I hope no one uses 'long double'...
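
A quick canary that could go into such a test suite (my own sketch, not
from the thread): if long double arithmetic is silently degraded to 64
bits, adding LDBL_EPSILON (the 80-bit epsilon, about 1e-19 on x86) to
1.0L becomes invisible:

#include <cfloat>
#include <iostream>

int main()
{
    // volatile keeps the compiler from folding the sum at full precision
    // during translation; the addition must happen at runtime.
    volatile long double one    = 1.0L;
    volatile long double bumped = one + LDBL_EPSILON;
    std::cout << (bumped > one ? "full long double precision\n"
                               : "long double arithmetic is degraded\n");
    return 0;
}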

Öö Tiib

Jun 11, 2019, 8:17:00 AM
But can't the same binary return different results on different computers?

> Martin runs the exact same assembly code directly and within valgrind,
> i.e. the order of the operations performed should be identical. Now
> valgrind is a CPU simulator and it turns out that it does not support
> the long double type:
>
> "Precision: There is no support for 80 bit arithmetic. Internally,
> Valgrind represents all such "long double" numbers in 64 bits, and so
> there may be some differences in results. Whether or not this is
> critical remains to be seen. Note, the x86/amd64 fldt/fstpt instructions
> (read/write 80-bit numbers) are correctly simulated, using conversions
> to/from 64 bits, so that in-memory images of 80-bit numbers look correct
> if anyone wants to see."
>
> http://www.valgrind.org/docs/manual/manual-core.html#manual-core.limits

Oh. Good catch!

James Kuyper

Jun 11, 2019, 8:27:12 AM
It's not quite that simple. The fundamental problem is that there's only
a finite number of integer values between the lowest and highest values
representable by a given integer type, and all of those values
in-between are also representable.

The only case where the mathematical result of an operation on integers
can be within the representable range and yet not be an integer itself
is integer division. Even then, the result is still
precisely defined: the fractional part of the mathematical value is
discarded, so the result is the bracketing integer value that is closest
to zero.
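
A compact illustration (truncation toward zero has been required since
C99/C++11):

static_assert( 7 / 2 ==  3, "quotient truncates toward zero");
static_assert(-7 / 2 == -3, "not toward negative infinity");
static_assert( 7 % 2 ==  1, "");
static_assert(-7 % 2 == -1, "the remainder takes the sign of the dividend");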

Between any two consecutive representable floating point values, there
are infinitely many real numbers that cannot be represented in that type.

When the mathematical result of a calculation is a real number within
the range of values that are representable by a given floating point
type, there will be one representable number (or possibly two) closest
to that real number. Even the strictest standards generally
allow the actual result of such a calculation to be either that number,
the next representable number, or the previous representable number,
and do not require that the same one of those three options be
chosen each time the calculation is performed. This is true even if the
mathematical value is exactly representable.
This is because, in practice, it can often be difficult to do any better
than that for floating point mathematical calculations, particularly for
the results of transcendental function calls.

Juha Nieminen

Jun 12, 2019, 5:06:14 AM
Fred. Zwarts <F.Zw...@kvi.nl> wrote:
> On 7 Jun 2019 at 20:56, Christian Gollwitzer wrote:
>> On 07.06.19 at 15:51, Öö Tiib wrote:
>> ... Floating point numbers are by
>> definition inaccurate, but this does not mean they are irreproducible.
>
> I think the term "inaccurate" is a bit inaccurate here. :-)
> Floating point calculations are very accurate, often more than integer
> calculations. But floating point calculations are more difficult to
> understand.

I think there's a bit of ambiguity about what is meant by "accuracy" here.

Pretty much all modern hardware implements the IEEE standard when dealing
with floating point values, and AFAIK that standard defines very
meticulously and unambiguously what the result of every operation must
be with given input values, bit-by-bit. If you know these definitions,
you know *exactly* what the result of an operation, with given values,
will be, to the last bit of it. There's no fuzziness here.

Also, "inaccuracy" may lead one to believe that the result might vary
randomly even with the same input. Obviously that's not the case. The
result will always be the same.

But here the term is most probably used with the meaning "you can't
represent the result of many operations completely exactly, and there
will often be rounding errors." As an example 1.0 and 3.0 can both be
represented with complete accuracy, but the result of 1.0/3.0 cannot,
and will have a small rounding error (because 1/3 is not representable
completely accurately with floating point).
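
For instance (a sketch; the printed digits are approximate):

#include <cstdio>

int main()
{
    // 1.0/3.0 is the double nearest to one third; printing more digits
    // than a double carries exposes the rounding error.
    std::printf("%.20f\n", 1.0 / 3.0);  // roughly 0.33333333333333331483
    return 0;
}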

However, one could think of it as no different than operations applied to
ints not being always representable. If, for example, you multiply two
ints, and the result is too large, the end result will not be accurate,
for the mere reason that the variable type doesn't have enough bits to
represent it accurately. This is no different from floating point: It, too,
doesn't have enough bits to represent every possible value accurately.

It's just that when people say "floating point is inaccurate", it's often
said with some kind of implication that integer variables do not suffer
from this same problem. Even though they do.

*All* mathematical operations done on computers are inaccurate in some
cases, in one way or another. Pointing to floating point in particular is
a bit misleading.

James Kuyper

Jun 12, 2019, 8:49:54 AM
On 6/12/19 5:05 AM, Juha Nieminen wrote:
...
> Pretty much all modern hardware implement the IEEE standard when dealing
> with floating point values, and AFAIK that standard defines very
> meticulously and unambiguously what the result of every operation must
> be with given input values, bit-by-bit.

No, it does not. In general, it allows for either the representable
value closest to the mathematical value of an expression, or one of the
two representable values immediately preceding and immediately following
that value, and it does not require that the choice be consistent.

> If you know these definitions,
> you know *exactly* what the result of an operation, with given values,
> will be, to the last bit of it. There's no fuzziness here.

Yes there is. The fuzziness is only 1 ulp, but it is there.

> Also, "inaccuracy" may lead one to believe that the result might vary
> randomly even with the same input. Obviously that's not the case. The
> result will always be the same.

It's not required to be, not even by the IEEE standard.

> However, one could think of it as no different than operations applied to
> ints not being always representable. If, for example, you multiply two
> ints, and the result is too large, the end result will not be accurate,
> for the mere reason that the variable type doesn't have enough bits to
> represent it accurately. This is no different from floating point: It, too,
> doesn't have enough bits to represent every possible value accurately.

The key difference is that the only unrepresentable integer values are
those outside the range of representable integer values. An infinite
number of unrepresentable real numbers separate every single consecutive
pair of numbers that can be represented in floating point.
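
A small sketch of how wide that gap is around 1.0:

#include <cmath>
#include <cstdio>

int main()
{
    // The representable double immediately above 1.0 is 1.0 + DBL_EPSILON;
    // every real number strictly between the two has no double of its own.
    double next = std::nextafter(1.0, 2.0);
    std::printf("%.17g\n", next - 1.0);   // prints DBL_EPSILON, about 2.22e-16
    return 0;
}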

Keith Thompson

Jun 12, 2019, 6:48:55 PM
James Kuyper <james...@alumni.caltech.edu> writes:
> On 6/12/19 5:05 AM, Juha Nieminen wrote:
[...]
>> However, one could think of it as no different than operations applied to
>> ints not being always representable. If, for example, you multiply two
>> ints, and the result is too large, the end result will not be accurate,
>> for the mere reason that the variable type doesn't have enough bits to
>> represent it accurately. This is no different from floating point: It, too,
>> doesn't have enough bits to represent every possible value accurately.
>
> The key difference is that the only unrepresentable integer values are
> those outside the range of representable integer values. An infinite
> number of unrepresentable real numbers separate every single consecutive
> pair of numbers that can be represented in floating point.

Agreed.

One might argue that, for example, the fact that 4/3 yields 1 means
the result is inaccurate. The way I'd describe it is that 1 is the
correct result for that operation (which is integer division, not
real division). It does lose some information, but that's why we
have the % operator and the div, ldiv, and lldiv library functions.
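
A small example of that:

#include <cstdio>
#include <cstdlib>

int main()
{
    // 4/3 discards the fractional part; % and std::div recover what the
    // division "loses".
    std::div_t r = std::div(4, 3);
    std::printf("4/3 = %d, 4%%3 = %d\n", r.quot, r.rem);  // 4/3 = 1, 4%3 = 1
    return 0;
}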

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */

Martijn van Buul

Jun 14, 2019, 3:23:16 AM
* Keith Thompson:

> Agreed.
>
> One might argue that, for example, the fact that 4/3 yields 1 means
> the result is inaccurate.

But by that definition, 3/2 would also be inaccurate, while it isn't: it
can accurately be represented by a binary floating point.

Keith Thompson

Jun 14, 2019, 4:43:09 AM
Martijn van Buul <pi...@dohd.org> writes:
> * Keith Thompson:
>
>> Agreed.
>>
>> One might argue that, for example, the fact that 4/3 yields 1 means
>> the result is inaccurate.
>
> But by that definition, 3/2 would also be inaccurate, while it isn't: it
> can accurately be represented by a binary floating point.

To be clear, I wasn't making the argument that it's inaccurate; rather
the accurate result of 4/3 is 1, because it's an integer division, not a
floating-point division.

Similarly, the *accurate* result of 3/2 is 1 (and 3.0/2.0 is 1.5).

(The C standard describes integer division as "the algebraic quotient
with any fractional part discarded", but it could equivalently be
described without reference to discarding any fractional part.)

James Kuyper

Jun 14, 2019, 8:32:06 AM
On 6/14/19 3:22 AM, Martijn van Buul wrote:
> * Keith Thompson:
>
>> Agreed.
>>
>> One might argue that, for example, the fact that 4/3 yields 1 means
>> the result is inaccurate.
>
> But by that definition, 3/2 would also be inaccurate, while it isn't: it
> can accurately be represented by a binary floating point.

Yes, in C 3/2 is 1, and 1.0 can be represented by binary floating point
- which is not what you meant - but it is what Keith was talking about.
The mathematical value of 3/2 is exactly representable in floating point
on any system where FLT_RADIX is a multiple of 2 (the most common values
are 2, 10, and 16 - I've never heard of a real system where it wasn't a
multiple of 2), and on such systems 3.0/2.0 == 1.5.

You need to clearly distinguish several things:

1. The mathematical value of 4/3, which is a number that, when multiplied
by 3, gives 4.

2. The C value of 4/3, which is 1. When multiplied by 3, it gives 3.

3. The C value of 4.0/3.0, which is a rational number with a denominator
that is an integral power of FLT_RADIX. Unless FLT_RADIX is a multiple of
3 (I have actually heard of ternary systems, but they were highly
experimental, and I doubt that any such system hosts an implementation
of C), that value is NOT the same as the mathematical value - when
multiplied by 3 (exactly, as a mathematical operation) it does not give
exactly 4.0, though it does come close.

Robert Wessel

Jun 15, 2019, 1:58:17 PM
On Fri, 14 Jun 2019 08:31:33 -0400, James Kuyper
<james...@alumni.caltech.edu> wrote:

>On 6/14/19 3:22 AM, Martijn van Buul wrote:
>> * Keith Thompson:
>>
>>> Agreed.
>>>
>>> One might argue that, for example, the fact that 4/3 yields 1 means
>>> the result is inaccurate.
>>
>> But by that definition, 3/2 would also be inaccurate, while it isn't: it
>> can accurately be represented by a binary floating point.
>
>Yes, in C 3/2 is 1, and 1.0 can be represented by binary floating point
>- which is not what you meant - but it is what Keith was talking about.
>The mathematical value of 3/2 is exactly representable in floating point
>on any system where FLT_RADIX is a multiple of 2 (the most common values
>are 2, 10, and 16 - I've never heard of a real system where it wasn't a
>multiple of 2), and on such systems 3.0/2.0 == 1.5.


FSVO "real". Ternary FP has existed on at least emulated (ternary)
systems.

https://en.wikipedia.org/wiki/Ternac

FP was implemented (in software, I believe) on the balanced ternary
Soviet Setuns, but the exact format has never been clear to me (and
all of the documentation I've found is unhelpfully - at least to me -
in Russian); ternary FP seems at least a plausible choice for that,
though.

James Kuyper

Jun 17, 2019, 8:35:21 AM
On 6/15/19 1:58 PM, Robert Wessel wrote:
> On Fri, 14 Jun 2019 08:31:33 -0400, James Kuyper
> <james...@alumni.caltech.edu> wrote:
...
>> on any system where FLT_RADIX is a multiple of 2 (the most common values
>> are 2, 10, and 16 - I've never heard of a real system where it wasn't a
>> multiple of 2), and on such systems 3.0/2.0 == 1.5.
>
>
> FSVO "real". Ternary FP has existed on at least emulated (ternary)
> systems.

I mentioned ternary systems further down in the same message. I did not
include them in the above comment, because it was specifically about
FLT_RADIX, a feature of the C standard library, and I know of no
implementation of C targeting a ternary platform (which doesn't mean
that there isn't one).

Tim Rentsch

Jun 22, 2019, 8:14:21 AM
James Kuyper <james...@alumni.caltech.edu> writes:

> On 6/12/19 5:05 AM, Juha Nieminen wrote:
> ...
>
>> Pretty much all modern hardware implement the IEEE standard when
>> dealing with floating point values, and AFAIK that standard defines
>> very meticulously and unambiguously what the result of every
>> operation must be with given input values, bit-by-bit.
>
> No, it does not. In general, it allows for either the representable
> value closest to the mathematical value of an expression, or one of
> the two representable values immediately preceding and immediately
> following that value, and it does not require that the choice be
> consistent.

I don't think this is right. The people who did IEEE floating
point are keen on reproducibility, and it seems unlikely they
would allow this much slop in the results. Furthermore IEEE 754
spells out in detail several different rounding modes, and AFAICT
implementations of IEEE 754 are required to support all of them.
Supporting statements can be found in ISO C N1570, for example
footnote 204 in 7.6 p1

This header is designed to support the floating-point
exception status flags and directed-rounding control modes
required by IEC 60559 [...]

There would be little point to requiring, for example, both
round-to-nearest and round-towards-zero, if they could produce
identical results.
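
A sketch of the directed-rounding modes actually changing a result
(strictly, FENV_ACCESS or -frounding-math is needed before the compiler
guarantees this; volatile keeps the divisions at run time):

#include <cfenv>
#include <cstdio>

int main()
{
    volatile double num = 1.0, den = 3.0;

    std::fesetround(FE_TONEAREST);
    double nearest = num / den;

    std::fesetround(FE_UPWARD);
    double upward = num / den;           // one ulp above the nearest result

    std::fesetround(FE_TONEAREST);       // restore the default mode
    std::printf("%s\n", nearest == upward ? "same" : "different");  // different
    return 0;
}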

All the evidence I have seen suggests that the IEEE-specified
directed-rounding modes are both exact and required. I don't have
any citations from the requisite ISO document (as I have not got
a copy from which to give one), but if it were to allow the sort
of inexact results described then there should be some indication
of that amongst the sea of publicly available web documents,
and I haven't found any.