float_to_list vs io:format ~w


Viktor Söderqvist

28 June 2021, 5:23:17 am
to erlang-q...@erlang.org
Hi!

Apparently, float_to_binary/1 and float_to_list/1 return a different
textual representation than io:format("~w", [Float]) and the shell.

Eshell V11.1.8 (abort with ^G)
1> float_to_list(10.2).
"1.01999999999999992895e+01"
2> 1.01999999999999992895e+01.
10.2
3> io:format("~w~n", [10.2]).
10.2
ok
4> 1.01999999999999992895e+01 =:= 10.2.
true

Apparently, both are correct textual representations of the same
floating point value. Why don't they use the same algorithm?

Viktor
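For anyone who needs the short text as a string or binary rather than printed output, one possible workaround is to go through io_lib:format/2, or to use the formatting options of float_to_list/2. This is only a sketch: the module name float_str and its function names are made up for illustration, and only the io_lib-based helpers give the shortest round-trip form in general.

```erlang
-module(float_str).
-export([short_list/1, short_binary/1, fixed/2]).

%% Same text the shell and ~w print, returned as a flat string.
short_list(F) when is_float(F) ->
    lists:flatten(io_lib:format("~w", [F])).

%% Same text, returned as a binary.
short_binary(F) when is_float(F) ->
    iolist_to_binary(io_lib:format("~w", [F])).

%% Fixed number of decimals with trailing zeros trimmed.
%% Note: this rounds to Decimals places, so it may lose precision.
fixed(F, Decimals) when is_float(F) ->
    float_to_list(F, [{decimals, Decimals}, compact]).
```

With these helpers, float_str:short_list(10.2) and float_str:fixed(10.2, 4) both give "10.2", while float_to_list(10.2) keeps returning the long scientific form.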

zxq9

28 June 2021, 5:32:03 am
to erlang-q...@erlang.org
Floats are white lies but useful engineering approximations and reals
are a fake contrivance that imagines the universe to be infinitely smooth.

I'm being silly above (but seriously, reals are fake), but the bottom
line is that picking a "right" representation really isn't simple, and
I imagine the difference comes down to use case: formatting data for
humans vs. exporting data to other machines.

Humans:
"Practically speaking, the user intends 10.2 to be 10.2 even though that
is an impossible number to represent as a float."

Machines:
"The closest string representation of this value is literally X so it
will be represented in full."

My question is: What is the algorithm for the 10.2 -> "10.2"
representation? Is there some rounding rule? This is probably all
covered in some part of IEEE 754 I've never gotten around to reading.

-Craig
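On the question of how 10.2 becomes "10.2": the idea can be illustrated with a naive search that prints the float with more and more digits and stops at the first string that reads back as exactly the same float. This is a sketch of the idea only, not the algorithm io_lib or Ryu actually use (those generate the digits directly and handle edge cases this loop may miss). The module name naive_shortest is hypothetical, and the sketch assumes float_to_list/2 rounds correctly.

```erlang
-module(naive_shortest).
-export([shortest/1]).

%% Try 1..16 decimals in scientific notation (2..17 significant digits)
%% and return the first string that round-trips to the same float.
shortest(F) when is_float(F) ->
    shortest(F, 1).

shortest(F, Decimals) when Decimals < 16 ->
    S = float_to_list(F, [{scientific, Decimals}]),
    case list_to_float(S) =:= F of
        true  -> S;                          % short enough and exact
        false -> shortest(F, Decimals + 1)   % need one more digit
    end;
shortest(F, 16) ->
    %% 17 significant digits are always enough for a 64-bit float.
    float_to_list(F, [{scientific, 16}]).
```

For 10.2 the first attempt, "1.0e+01", reads back as 10.0 and is rejected; the second, "1.02e+01", reads back as 10.2 and is returned. The cost is one print and one parse per attempt, which is the kind of overhead the dedicated algorithms avoid.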

Thomas Depierre

28 June 2021, 5:33:56 am
to Viktor Söderqvist, Erlang Questions
Short answer: because the BIFs use libc's sprintf, which does not support the format that io:format produces.
Slightly more complex answer: they do not yet, but it is coming; see the PR at https://github.com/erlang/otp/pull/4719

Long answer:
The io:format format is known as "shortest round-trip conversion". It always gives the shortest string that still reads back as exactly the same float, which makes it better both for external data exchange (fewer bytes used) and for human readability.
Historically this has been a hard problem. We had no fast algorithm to generate that format, let alone an algorithm that does not need arbitrarily large integers. Because it was really complex and expensive, it is not implemented in the libcs anywhere, and it is not in the C spec. We only have %g which is... its own mess.

This changed a few years ago thanks to Ryu, an algorithm presented by Ulf Adams in 2018. https://github.com/ulfjack/ryu
I have been working for 9 months on bringing Ryu to OTP, which is slowly happening. Hopefully it lands before OTP 25.

Getting it into the libc will probably take a few decades, given how few development cycles they get for the amount of legacy they deal with.

Hope that answers the question. Always available if someone wants to dig deeper.
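For readers on a release where this work has landed (to my knowledge OTP 25 and later, where float_to_list/2 and float_to_binary/2 accept a `short` option), the three representations can be compared side by side. The module name float_text is made up for illustration, and the `short` option will fail with badarg on older releases.

```erlang
-module(float_text).
-export([compare/1]).

%% Returns the three textual forms discussed in this thread.
compare(F) when is_float(F) ->
    #{bif_default => float_to_list(F),            % sprintf-style, 20 decimals
      bif_short   => float_to_list(F, [short]),   % assumes OTP 25 or later
      io_lib_w    => lists:flatten(io_lib:format("~w", [F]))}.
```

Calling float_text:compare(10.2) should show the long scientific string under bif_default and "10.2" under the other two keys.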

Viktor Söderqvist

28 June 2021, 6:14:00 am
to Thomas Depierre, Erlang Questions
Thanks Thomas! That's a very useful answer.

I know I'd seen this topic somewhere but I couldn't find it. It was
probably your PR I'd seen.

Cheers,
Viktor

PS. @zxq9 Yes, IEEE floats are a mess. Posits[1] are an interesting
alternative, but I think it'll take five decades before they replace IEEE
floats, if ever.

[1]: https://www.youtube.com/watch?v=aP0Y1uAA-2Y

Eckard Brauer

28 June 2021, 6:50:08 am
to erlang-q...@erlang.org
0.2 (decimal) is an infinite (but periodic, in this case) fraction when
written in binary, and the internal representation is a binary one.
Representing an infinite expansion _exactly_ with a finite number of
digits is impossible, as is easy to see.

So to get exact representations, an alternative internal number
representation would be needed besides the existing binary ones
(e.g. BCD numbers). But even then, irrationals (pi, e, ...) couldn't be
represented exactly... so I fear we're still looking for the optimum.

E.
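The repeating binary expansion of 0.2 can be seen by taking the 64-bit float apart with the bit syntax. A small sketch (the module name float_bits is hypothetical; it assumes a normal, non-zero float, since it subtracts the usual exponent bias of 1023):

```erlang
-module(float_bits).
-export([fields/1]).

%% Split an IEEE 754 double into sign, unbiased exponent and the
%% 52 stored fraction bits (shown in hexadecimal).
fields(F) when is_float(F) ->
    <<Sign:1, Exp:11, Frac:52>> = <<F/float>>,
    {Sign, Exp - 1023, integer_to_list(Frac, 16)}.
```

float_bits:fields(0.2) gives {0, -3, "999999999999A"}: the pattern 1001 repeats until the 52 bits run out, and the final hex digit is rounded up from 9 to A. The stored value is therefore slightly above 0.2, which is why the long sprintf-style output shows extra non-zero digits.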
