Deprecate to_string() for floating point types?


u97...@gmail.com

Nov 8, 2015, 9:07:26 AM
to ISO C++ Standard - Discussion
(bikeshedding note to avoid unwanted content: topic is about a somewhat small detail in the standard and this may turn out to be bikeshedding for people with more significant responsibilities)

For starters, please think of an answer to the following question: what would you consider to be the string representation of the double value 1e-9?

C++11 added std::to_string(), which converts a numeric value to its string representation. Setting aside the historical ballast inherited from C, what would be reasonable behaviour for such a function? Given that numeric literals are written as text in source code, it is natural to expect that a numeric-to-string function would produce a string from which the original numeric value can be recovered without losing information. What to_string() actually does with floating point values, however, is not merely lose information: in practice it can wipe out the numeric information completely. That makes it a potential or, I would argue, likely cause of bugs, and a highly dubious design in general.

Examples:
1.
std::cout << std::to_string(1e-9);            // output: "0.000000"
std::cout << std::to_string(6.67408e-11);     // output: "0.000000"
std::cout << std::to_string(6.626070040e-34); // output: "0.000000"

All three values are certainly different, but their string representations are identical, so to_string() essentially wipes out the numeric information. It's also worth noting that the resulting string may even be longer than a valid and likely the most natural string representation (e.g. "1e-9"), even though to_string() has destroyed the information.

2.
std::cout << std::to_string(1.23e100); // output:
"1229999999999999945619502435678791882061496502770990950045684429327960298864608335541984218516600989160291306221939122973741400364055485571676274743695192965637069768948118175959863951770799435358111025735195134313314113829815221797071926323389168215764573082356023275727273837119288529943287157489664.000000"
std::cout << std::to_string(1.23e-300); // output: "0.000000"
std::cout << 1.23e300;   // output: "1.23e+300"
std::cout << 1.23e-300;  // output: "1.23e-300"

The two values passed to to_string() above are quite similar: both have the same numeric precision, and it would be fair to expect their string representations to be similar as well. In addition, the representation of 1.23e300 is impractical due to its length. Note also the contrast with the default representation in std::cout.

3.
std::cout << 1e-9;                 // output: "1e-09"
std::cout << std::to_string(1e-9); // output: "0.000000"

Fundamentally different behaviour (i.e. not only a precision or formatting difference) between default to-stream printing and to_string() is probably unexpected, as both lines essentially do (or should do) at least approximately the same thing: convert a double to a string representation.

With these examples, the current definition of std::to_string() for floating point types can fairly be considered fundamentally flawed and a likely cause of bugs, especially as the original(?) papers (N1803, N2408) did not mention this problem in no way, indicating that this behaviour wasn't even thought about. Changing the behaviour is a breaking change, so deprecating might be the best way to minimize the damage.


Example code:

#include <iostream>
#include <string>

int main()
{
    std::cout << std::to_string(1e-9) << '\n';
    std::cout << std::to_string(6.67408e-11) << '\n';
    std::cout << std::to_string(6.626070040e-34) << '\n';

    std::cout << std::to_string(1.23e300) << '\n';
    std::cout << std::to_string(1.23e-300) << '\n';
    std::cout << 1.23e300 << '\n';
    std::cout << 1.23e-300 << '\n';

    std::cout << 1e-9 << '\n';
    std::cout << std::to_string(1e-9) << '\n';

    return 0;
}

u97...@gmail.com

Nov 8, 2015, 9:17:15 AM
to ISO C++ Standard - Discussion, u97...@gmail.com
Corrections:
  • first line in example 2 should have 1.23e300 instead of 1.23e100
  • "did not mention this problem in no way" -> "did not mention this problem at all"

Nicol Bolas

Nov 8, 2015, 10:13:12 AM
to ISO C++ Standard - Discussion, u97...@gmail.com
I see no reason to deprecate these functions. Sure, it's not a completely backwards-compatible change, but fixing the functions to behave more reasonably is preferable to adding a bunch of new functions and telling people not to use the old ones.

Also, it's really hard to write code for which this change isn't backwards compatible, unless you're explicitly relying on a particular number of characters in the result.

Andrey Semashev

Nov 8, 2015, 4:54:13 PM
to std-dis...@isocpp.org
On 2015-11-08 17:07, u97...@gmail.com wrote:
> (bikeshedding note to avoid unwanted content
> <https://groups.google.com/a/isocpp.org/d/msg/std-discussion/A4fhAM0OlhY/hNV47qdgbzIJ>:
> topic is about a somewhat small detail in the standard and this may turn
> out to be bikeshedding for people with more significant responsibilities)
>
> For starters, please think an answer to the following question: what
> would you consider to be the string representation of double value 1e-9?

The standard defines the format of the string returned by to_string
rather clearly. In the case of double, the formatting is equivalent to
formatting with the %f conversion specifier of snprintf.

> Given
> that numeric literals are written as text in the source code, it's
> natural to expect that numeric_to_string() would produce such a string
> that the original numeric value could be obtained from the string
> without losing information.

I wouldn't say this is a fair expectation with regard to floating point
numbers. The standard doesn't give you that guarantee, even with
iostreams, and given how volatile FP numbers are the lossless roundtrip
cannot be implied.

> What, however, to_string() does with
> floating point values is that it does not only lose information, but it
> can in practice wipe out the numeric information completely making it
> potential or, I would argue, likely cause of bugs and be generally a
> highly dubious implementation.

I'd say every design that relies on FP<->string<->FP to be a lossless
roundtrip is likely broken, whether it employs to_string/stod or
iostreams or C equivalents. Even if you set formatting flags correctly
(i.e. infinite width and precision, locale-agnostic formatting, etc.),
the formatting and parsing process itself may introduce error. Then
there are denormal FP values, which may or may not be rounded to zero
during any arithmetic operations that are applied during formatting and
parsing.

> Examples:
> 1.
> |
> std::cout <<std::to_string(1e-9);// output: "0.000000"
> std::cout <<std::to_string(6.67408e-11);// output: "0.000000"
> std::cout <<std::to_string(6.626070040e-34)// output: "0.000000"
> |
>
> All values are certainly different but their string representations are
> identical. As such to_string() essentially wipes out the numeric
> information. It's also worth noting that the resulting string may even
> be longer than a valid and likely the most natural string presentation
> (e.g. "1e-9") even though to_string() has destroyed the information.

Again, the standard defines the format of the strings and it does not
allow for "1e-9" output. It does not make the function work incorrectly
or not useful. Personally, I do find the std::fixed style more readable,
even though it leads to information loss like in the above. That makes
to_string more useful for diagnostic purposes to me.

If you want a different format then you can use snprintf or iostreams.

> 2.
> |
> std::cout <<std::to_string(1.23e100);// output:
> "1229999999999999945619502435678791882061496502770990950045684429327960298864608335541984218516600989160291306221939122973741400364055485571676274743695192965637069768948118175959863951770799435358111025735195134313314113829815221797071926323389168215764573082356023275727273837119288529943287157489664.000000"
> std::cout <<std::to_string(1.23e-300);// output:"0.000000"
> std::cout <<1.23e300;// " output: "1.23e+300"
> std::cout <<1.23e-300;// " output: "1.23e-300"
> |
>
> Values in above are quite similar: both have the same numeric precision
> and it would be fair to expect that their string representations would
> also be similar. In addition the representation of 1.23e300 is
> impractical due to it's length. Note also the contrast to default
> representation in std::cout.

The 1.23e300 number is not representable exactly as a double value. You
get information loss either way, it's just less apparent in the last two
lines. I even suspect the first line is giving you a more accurate
representation of the number that you have in your code in run time.

> 3.
> |
> std::cout <<1e-9;// output: "1e-09"
> std::cout <<std::to_string(1e-9);// output: "0.000000"
> |
>
> Fundamentally different behaviour (i.e. not only precision or formatting
> difference) of default to-stream printing and to_string() is probably
> unexpected as both lines essentially does (or should do) at least
> approximately the same thing: convert double to string representation.

And so they do.

> With these examples the current definition of std::to_string() with
> floating points is fair to be considered fundamentally flawed and likely
> cause of bugs especially as the original(?) papers (N1803
> <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1803.html>,
> N2408
> <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2408.html>)
> did not mention this problem in no way indicating that this behaviour
> wasn't even though about. Changing the behaviour is a breaking change,
> so deprecating might be the best way to minimize the damages.

I don't think to_string is flawed, its behavior is well defined and
useful in certain contexts. I'm certainly not in favor of deprecating or
removing it.

FP numbers are tricky.

Howard Hinnant

Nov 8, 2015, 5:26:07 PM
to std-dis...@isocpp.org
On Nov 8, 2015, at 4:54 PM, Andrey Semashev <andrey....@gmail.com> wrote:
>
> I'd say every design that relies on FP<->string<->FP to be a lossless roundtrip is likely broken, whether it employs to_string/stod or iostreams or C equivalents. Even if you set formatting flags correctly (i.e. infinite width and precision, locale-agnostic formatting, etc.), the formatting and parsing process itself may introduce error.

std::hexfloat is pretty awesome, unless you’re a human… ;-)

Howard

u97...@gmail.com

Nov 8, 2015, 6:01:59 PM
to ISO C++ Standard - Discussion
@Nicol Bolas

I'm not a standard expert so whether deprecation or modification would be better should be evaluated by more experienced people. Do you have an example of a function for which the semantics had been changed in past?

@Andrey Semashev

I'm quite astounded that someone finds nothing wrong with a value-to-string mapping such as 1e-9 -> "0.000000". I'll clarify a few points, although disagreement will likely remain:

1. There was no claim that the behaviour mandated by the standard was not clearly defined, the argument was that the definition was error prone and dubious.
2. I see no reason why floating point values couldn't be represented as text in a non-lossy way (please feel free to prove me wrong). Your job is actually easy: it's enough to give one double value that can't be represented as text that maps exactly back to the original double value (let's ignore NaNs etc. for now). Take the value 1.23e300, which you said "is not representable exactly as a double value". True, but the relevant part is that if you have a double defined in the source as 1.23e300, converting it to the string "1.23e300", e.g. with the format "%g", and then back to double will give exactly the same double value as the original; whether or not the original value can be exactly represented as a double is irrelevant. And "%g" needs fewer than 10 chars instead of over 300.
3. Whether exactly non-lossy or not isn't the main point; values presented in the examples are not exceptional corner cases and still to_string() fails to convert them reasonably.

Yes, floating point numbers are tricky, and that's why they shouldn't be made any trickier by this kind of definition. The title of the referenced papers was "Simple numeric access", but it's not particularly simple if, instead of just using to_string(), one always has to check whether the given value is in the range where to_string() does not malfunction.

Andrey Semashev

Nov 8, 2015, 6:43:39 PM
to std-dis...@isocpp.org
On 2015-11-09 02:01, u97...@gmail.com wrote:
>
> I'm quite astounded that someone finds nothing wrong in value-to-string
> mapping such as 1e-9 -> "0.000000".

You're missing precision from the picture, while it plays the key role
in the task of representation. First, when you type 1e-9 in your source
code, that textual representation is transformed into double (let's
assume it's a 64-bit IEEE 754 FP number), which is luckily possible
without loss (i.e. the double's precision is enough to hold that
number). Next you convert that double to a string with a given format -
%f, which also has a precision. It so happens that the format is
restricted to 6 digits after the decimal point and thus is not capable
of representing the original number. As a result you get the closest
representable number. There's nothing wrong or new with it, it's how any
conversion works. You will get a similar result if you try to store
1e-2000 into a double - the number will truncate to 0.

> I'll clarify a few point, although
> disagreement will likely remain:
>
> 1. There was no claim that the behaviour mandated by the standard was
> not clearly defined, the argument was that the definition was error
> prone and dubious.

Like I said, FP numbers are tricky (read - error prone). Care must be
taken when working with them, including when you format them. The
to_string function is well defined (not dubious) and has valid uses,
some of which I mentioned in my previous reply.

> 2. I see no reason why floating point values couldn't be represented as
> text in non-lossy way (please feel free to show me wrong). Your job is
> actually easy, it's enough to give one double value that can't be
> presented as text which would map exactly to the original double value
> (let's ignore NaN's etc. for now).

It's not a question of a particular number but rather a reasonable
implementation that is allowed by the standard.

> For example the value 1.23e300 which
> you said "is not representable exactly as a double value": true, but the
> relevant part is that if you have a double defined in the source as
> 1.23e300, converting it to string "1.23e300" e.g. by format "%g" and
> then back to double will give exactly the same double value as the
> original; whether or not the original value can be exactly represented
> as double is irrelevant.

I'm sorry, but no. What you're seeing in your example is just two
computational errors cancelling each other. That's
implementation-specific behavior at best.

When programming math, you care about exactly the numbers that your program
operates on. If you write 1.23e300 and expect your program to operate on
that number then you have a bug. You may not care about the precision
error that occurs when your 1.23e300 is converted into double, but then
why do you care when the same error is exposed when you format the
double to a string?

> And "%g" needs less than 10 chars instead of
> over 300 hundred.

Different formats have different pros and cons. At this point you're
basically saying that you like %g more than %f.

> 3. Whether exactly non-lossy or not isn't the main point; values
> presented in the examples are not exceptional corner cases and still
> to_string() fails to convert them reasonably.

Like I said, I find %f a quite reasonable format for diagnostic purposes
(i.e. when a human is supposed to read those numbers). For other
purposes I'd choose a different format, which might be less suited for
humans and offer better other characteristics, such as conciseness and
range of representable values. But frankly, I find the latter cases
diminishingly rare.

Andrey Semashev

Nov 8, 2015, 6:44:45 PM
to std-dis...@isocpp.org
Hmm, I forgot about that flag. :)

Nicol Bolas

Nov 8, 2015, 8:17:04 PM
to ISO C++ Standard - Discussion

I suggest you take that up with pretty much every user of Collada or similar text-based technologies. Whatever your personal feelings on the matter may be, the facts on the ground are that round-tripping is important to many users.

Even if you set formatting flags correctly
(i.e. infinite width and precision, locale-agnostic formatting, etc.),
the formatting and parsing process itself may introduce error. Then
there are denormal FP values, which may or may not be rounded to zero
during any arithmetic operations that are applied during formatting and
parsing.

Yes, perfect round-tripping is not viable.

How about round-tripping that isn't ridiculously broken? It's one thing to have a value that loses a couple of digits of precision. It's quite another to lose the entire value.

We don't need the round trip to guarantee perfect precision reproduction. But that's no excuse for not improving what we have.

It should also be noted that most other languages have much better round-tripping precision than `to_string` provides.
 
> Examples:
> 1.
> |
> std::cout <<std::to_string(1e-9);// output: "0.000000"
> std::cout <<std::to_string(6.67408e-11);// output: "0.000000"
> std::cout <<std::to_string(6.626070040e-34)// output: "0.000000"
> |
>
> All values are certainly different but their string representations are
> identical. As such to_string() essentially wipes out the numeric
> information. It's also worth noting that the resulting string may even
> be longer than a valid and likely the most natural string presentation
> (e.g. "1e-9") even though to_string() has destroyed the information.

Again, the standard defines the format of the strings and it does not
allow for "1e-9" output. It does not make the function work incorrectly
or not useful. Personally, I do find the std::fixed style more readable,
even though it leads to information loss like in the above. That makes
to_string more useful for diagnostic purposes to me.

How useful that is for "diagnostic purposes" rather depends on what you are diagnosing, don't you think? I've been in situations in graphics work where getting only 2 digits of precision in text is not even close to acceptable even as a diagnostic.

Andrey Semashev

Nov 9, 2015, 4:04:34 AM
to std-dis...@isocpp.org
On 2015-11-09 04:17, Nicol Bolas wrote:
>
>
> On Sunday, November 8, 2015 at 4:54:13 PM UTC-5, Andrey Semashev wrote:
>
> I'd say every design that relies on FP<->string<->FP to be a lossless
> roundtrip is likely broken, whether it employs to_string/stod or
> iostreams or C equivalents.
>
>
> I suggest you take that up with pretty much every user of Collada or
> similar text-based technologies. Whatever your personal feelings on the
> matter may be, the facts on the ground are that round-tripping is
> /important/ to many users.

I'm not sure what exactly I have to take to those users. I don't think
the problems I'm pointing out will be a surprise for them.

I understand that the lossless roundtrip is desirable to some, and up to
some amount of error it is possible, given the required precautions are
taken. If you're willing to accept those errors, then go ahead. But you
have to be aware that you're losing something along the way, and that
something may affect the result of your program.

I say this design is likely broken because people (myself included)
typically assume no information loss across a (reliable) network channel
or serialization/deserialization roundtrip, where FP<->string<->FP
conversions are likely to happen. And that is a fair expectation, IMO -
loss should not happen in those areas.

> How about round-tripping that isn't ridiculously broken? It's one thing
> to have a value that loses a couple of digits of precision. It's quite
> another to lose /the entire value/.

No, it's really not. A loss is a loss, only with %f you have a constant
factor of error and in case of %g you get error that depends on the
value itself. There will always be the case when the representation
loses more than you're willing to accept.

> We don't need the round trip to guarantee perfect precision
> reproduction. But that's no excuse for not /improving/ what we have.

I didn't say I'm against progress. :) What kind of improvement do you
have in mind?

> How useful that is for "diagnostic purposes" rather depends on what you
> are diagnosing, don't you think? I've been in situations in graphics
> work where getting only 2 digits of precision in text is not even close
> to acceptable even as a diagnostic.

Fair enough. If you're dealing with numbers that tend to be very large
or very small, applying %f directly may be suboptimal. Thing is, %g is
also not a very good choice because this format is difficult to
comprehend. Can you immediately tell if 1.23847623e10 is greater than
1.23843754e11? I'd say an improved format would be a %g with a fixed
user-definable e so that the two numbers could be presented as
1.23847623e10 and 12.3843754e10.

Andrey Semashev

Nov 9, 2015, 4:09:57 AM
to std-dis...@isocpp.org
On 2015-11-09 12:04, Andrey Semashev wrote:
>
> I say this design is likely broken because people (myself included)
> typically assume no information loss across a (reliable) network channel
> or serialization/deserialization roundtrip, where FP<->string<->FP
> convertsions are likely to happen. And that is a fair expectation, IMO -
> loss should not happen in those areas.

I meant "serialization/deserialization to a file" above.

Nicol Bolas

Nov 9, 2015, 8:31:29 AM
to ISO C++ Standard - Discussion


On Monday, November 9, 2015 at 4:04:34 AM UTC-5, Andrey Semashev wrote:
On 2015-11-09 04:17, Nicol Bolas wrote:
>
>
> On Sunday, November 8, 2015 at 4:54:13 PM UTC-5, Andrey Semashev wrote:
>
>     I'd say every design that relies on FP<->string<->FP to be a lossless
>     roundtrip is likely broken, whether it employs to_string/stod or
>     iostreams or C equivalents.
>
>
> I suggest you take that up with pretty much every user of Collada or
> similar text-based technologies. Whatever your personal feelings on the
> matter may be, the facts on the ground are that round-tripping is
> /important/ to many users.

I'm not sure what exactly I have to take to those users. I don't think
the problems I'm pointing out will be a surprise for them.

You said such designs are "broken". I pointed out that a huge number of people rely on such designs, and thus by your standards are relying on "broken" designs. Whether you like it or not, other people don't consider them to be "broken".
> How about round-tripping that isn't ridiculously broken? It's one thing
> to have a value that loses a couple of digits of precision. It's quite
> another to lose /the entire value/.

No, it's really not.

Yes, it is. Outputting 0.000000 when I gave it a non-zero number is not acceptable. It is far worse than outputting 0.000001 when I gave it 0.0000008.
 
A loss is a loss, only with %f you have a constant
factor of error

I'm not sure what you mean by "factor of error". The amount of precision you lose with `%f` is certainly not constant with respect to the size of the number. If you output 1234500.0f, you will lose no digits of precision. If you output `0.00012345`, you will lose two decimal digits of precision.

That's not "constant".

and in case of %g you get error that depends on the
value itself.

... How does it depend on the value? That's the whole point of %g; it retains X digits of precision, regardless of the input number.

There will always be the case when the representation
loses more than you're willing to accept.

> We don't need the round trip to guarantee perfect precision
> reproduction. But that's no excuse for not /improving/ what we have.

I didn't say I'm against progress. :) What kind of improvement do you
have in mind?

Anything that makes the actual precision loss better. Preferably where the precision loss has a maximum of 1-2 digits of least significance.

> How useful that is for "diagnostic purposes" rather depends on what you
> are diagnosing, don't you think? I've been in situations in graphics
> work where getting only 2 digits of precision in text is not even close
> to acceptable even as a diagnostic.

Fair enough. If you're dealing with numbers that tend to be very large
or very small, applying %f directly may be suboptimal. Thing is, %g is
also not a very good choice because this format is difficult to
comprehend. Can you immediately tell if 1.23847623e10 is greater than
1.23843754e11?

Yes. And even if I were the sort who would say "no", I can still tell which is bigger. It just takes a bit longer.

Consider instead 0.000000 and 0.000000. Can you tell which is bigger? Nobody can, no matter how long you look at it.

Comprehension may take longer with %g, but it is at least possible.
 
I'd say an improved format would be a %g with a fixed
user-definable e so that the two numbers could be presented as
1.23847623e10 and 12.3843754e10.

This goes to the point of having a generic "to_string" function that takes as few arguments as possible. That is, what exactly is the point of such a function?

It's the default case. It's the one that works. It's the one that's quick, simple, effective, doesn't require a lot of fiddling around, and it almost always gives you the right answer.

Outputting 0.000000 for an input value that had many digits of significance is not "working". It is not "effective". And it certainly is not the "right answer".

If a user wants pretty formatted output so that the numbers are all aligned and easy to read, or all have some exponent of precision or something, we have tools for that. `to_string` doesn't need to be that. It needs to work. And it currently does not.

snprintf can be used for special case needs. `to_string` is for when you need the value in a string. And outputting 0.000000 for 0.00000011232 is not putting the value into a string.

Andrey Semashev

Nov 9, 2015, 9:33:48 AM
to std-dis...@isocpp.org
On 2015-11-09 16:31, Nicol Bolas wrote:
>
>
> On Monday, November 9, 2015 at 4:04:34 AM UTC-5, Andrey Semashev wrote:
>
> On 2015-11-09 04:17, Nicol Bolas wrote:
> >
> >
> > On Sunday, November 8, 2015 at 4:54:13 PM UTC-5, Andrey Semashev
> wrote:
> >
> > I'd say every design that relies on FP<->string<->FP to be a
> lossless
> > roundtrip is likely broken, whether it employs to_string/stod or
> > iostreams or C equivalents.
> >
> >
> > I suggest you take that up with pretty much every user of Collada or
> > similar text-based technologies. Whatever your personal feelings
> on the
> > matter may be, the facts on the ground are that round-tripping is
> > /important/ to many users.
>
> I'm not sure what exactly I have to take to those users. I don't think
> the problems I'm pointing out will be a surprise for them.
>
>
> You said such designs are "broken". I pointed out that a huge number of
> people /rely/ on such designs, and thus by your standards are relying on
> "broken" designs. Whether you like it or not, other people don't
> consider them to be "broken".

Well, people do all kinds of crazy stuff, it doesn't make this stuff any
less crazy. :)

> > How about round-tripping that isn't ridiculously broken? It's one
> thing
> > to have a value that loses a couple of digits of precision. It's
> quite
> > another to lose /the entire value/.
>
> No, it's really not.
>
>
> Yes, it is. Outputting 0.000000 when I gave it a non-zero number is not
> acceptable.

Well, we disagree on that. My point is that whatever format you choose,
unless it is 100% perfect, you will always have a case when something
gets rounded and you get not what you want.

> It is far worse than outputting 0.000001 when I gave it
> 0.0000008.

So is the problem in the rounding mode? What if the number is 0.0000001,
is it allowed to be formatted as 0.000000?

> A loss is a loss, only with %f you have a constant
> factor of error
>
>
> I'm not sure what you mean by "factor of error". The amount of error you
> lose with `%f` is certainly not constant with respect to the size of the
> number. If you output 1234500.0f, you will lose no digits of precision.
> If you output `0.00012345`, you will lose two decimal digits of precision.
>
> That's not "constant".

Maybe my wording was poor; English is not my native language, sorry.
What I mean is that whatever number you format with %f, you will get a
fixed precision up to the sixth digit after the decimal point. You lose
anything more precise than that. In other words, the represented number
is N±1e-6, and the representation error is constant. That is not the
case with %g, which only keeps a fixed portion of the most significant
digits and loses the rest (thus the amount of error depends on the number).

> Consider instead 0.000000 and 0.000000. Can you tell which is bigger?
> /Nobody/ can, no matter how long you look at it.

I can - the numbers are equal. And yes, this outcome, while probably
surprising to some, is perfectly logical. The same way '(int)1.2 ==
(int)1.3' holds true because you're comparing not the doubles but ints.

> Comprehension may take longer with %g, but it is at least /possible/.

You will have the same kind of problem with %g if you compare formatted
numbers 1e50+1 and 1e50.

Viacheslav Usov

Nov 9, 2015, 9:59:45 AM
to std-dis...@isocpp.org
On Sun, Nov 8, 2015 at 10:54 PM, Andrey Semashev <andrey....@gmail.com> wrote:

[...]

> I'd say every design that relies on FP<->string<->FP to be a lossless roundtrip is likely broken, whether it employs to_string/stod or iostreams or C equivalents.

Given that just about every floating point unit in the world operates in accordance with IEEE 754 [1], there is no good excuse for not having a standardized textual representation of floating point numbers that is lossless. At the very least, the C++ standard could require an implementation-defined format that is lossless when used within that particular implementation.

It would probably be more straightforward to extend [lex.fcon] with a format or formats that can represent either IEEE 754 or implementation-specific numbers exactly. The standard library converters could then be defined in terms of that format or formats.

Cheers,
V.

Nicol Bolas

Nov 9, 2015, 10:05:52 AM
to ISO C++ Standard - Discussion

Nobody is asking for 100% perfect. What we're asking for is something that preserves digits as much as is possible.

Outputting 0.000000 for a non-zero number is not preserving significant digits as much as possible.

> It is far worse than outputting 0.000001 when I gave it
> 0.0000008.

So is the problem in the rounding mode? What if the number is 0.0000001,
is it allowed to be formatted as 0.000000?

>     A loss is a loss, only with %f you have a constant
>     factor of error
>
>
> I'm not sure what you mean by "factor of error". The amount of error you
> lose with `%f` is certainly not constant with respect to the size of the
> number. If you output 1234500.0f, you will lose no digits of precision.
> If you output `0.00012345`, you will lose two decimal digits of precision.
>
> That's not "constant".

Maybe my wording was poor; English is not my native language, sorry.
What I mean is that whatever number you format with %f you will get a
fixed precision up to the sixth digit after the decimal point.

And that might be fine... if I were using a fixed-point number. I am instead using a floating-point number, the entire purpose of which is that it retains a specific number of significant digits regardless of where those digits sit relative to the decimal point, within the limits of the numeric precision of the floating-point type, of course.

If your `to_string` for floating-point acts like `to_string` for fixed-point, you're doing it wrong.
 
You lose
anything more precise than that. In other words, the represented number
is N±1e-6, and the representation error is constant. That is not the
case with %g, which only keeps a fixed portion of the most significant
digits and loses the rest (thus the amount of error depends on the number).

> Consider instead 0.000000 and 0.000000. Can you tell which is bigger?
> /Nobody/ can, no matter how long you look at it.

I can - the numbers are equal.

But one of them was in fact bigger than the other. Therefore, information has been lost.

Or to say it this way, if the floating point value of x is greater than the floating point value of y, then the to_string representation should preserve that information as much as is reasonable.
 
And yes, this outcome, while probably surprising to some, is
perfectly logical. In the same way, '(int)1.2 == (int)1.3' holds
true because you're comparing ints, not doubles.

> Comprehension may take longer with %g, but it is at least /possible/.

You will have the same kind of problem with %g if you compare formatted
numbers 1e50+1 and 1e50.

But that's only because you broke floating-point precision. Within the boundaries of floating point precision, %g will print the right answer.

Thus far, the only positive defense you've been able to make for the current behavior is about visual formatting. It looks nicer and more readable to some people.

Nicol Bolas

Nov 9, 2015, 10:09:37 AM11/9/15
to ISO C++ Standard - Discussion
On Monday, November 9, 2015 at 9:59:45 AM UTC-5, Viacheslav Usov wrote:
On Sun, Nov 8, 2015 at 10:54 PM, Andrey Semashev <andrey....@gmail.com> wrote:

[...]

> I'd say every design that relies on FP<->string<->FP to be a lossless roundtrip is likely broken, whether it employs to_string/stod or iostreams or C equivalents.

Given that just about every floating point unit in the world operates in accordance with IEEE 754 [1], there is no good excuse for not having a standardized textual representation of floating point numbers that is lossless. At the very least, the C++ standard could require an implementation-defined format that is lossless when used within that particular implementation.

No, it is not.

It is not possible to convert every binary IEEE-754 representable number into a decimal version that will convert back to exactly that float. Just as there are many decimal float numbers that do not convert into an exact binary IEEE-754 number.

The only way to do such perfect textual conversion is to turn it into a hex-float. And that's not what we're talking about. We want `to_string` to convert it into a readable decimal number.

Oh, and the C++ standard does not require `float` or any other floating-point type to be IEEE-754.

We're not asking for perfection. We're asking for not-stupidity, not losing precision for arbitrary reasons. Not losing precision where precision does not have to be lost.

Jean-Marc Bourguet

Nov 9, 2015, 10:24:48 AM11/9/15
to ISO C++ Standard - Discussion
On Monday, November 9, 2015 at 16:09:37 UTC+1, Nicol Bolas wrote:
On Monday, November 9, 2015 at 9:59:45 AM UTC-5, Viacheslav Usov wrote:
On Sun, Nov 8, 2015 at 10:54 PM, Andrey Semashev <andrey....@gmail.com> wrote:

[...]

> I'd say every design that relies on FP<->string<->FP to be a lossless roundtrip is likely broken, whether it employs to_string/stod or iostreams or C equivalents.

Given that just about every floating point unit in the world operates in accordance with IEEE 754 [1], there is no good excuse for not having a standardized textual representation of floating point numbers that is lossless. At the very least, the C++ standard could require an implementation-defined format that is lossless when used within that particular implementation.

No, it is not.

It is not possible to convert every binary IEEE-754 representable number into a decimal version that will convert back to exactly that float. Just as there are many decimal float numbers that do not convert into an exact binary IEEE-754 number.

Every non-infinite, non-NaN IEEE-754 number has an exact decimal representation. The algorithm to print the "best" decimal representation (i.e. one that gives 0.1 and not something like 0.099999999987689) that reads back as the same FP value is even known; see _How to print floating-point numbers accurately_ by G. Steele and J. White, http://dl.acm.org/citation.cfm?id=93559, and I'd be surprised if that algorithm has not been improved since. It does use multi-precision arithmetic, but it's already common for stdlib implementations to use multi-precision arithmetic when reading a number.

-- 
Jean-Marc

Viacheslav Usov

Nov 9, 2015, 10:30:42 AM11/9/15
to std-dis...@isocpp.org
On Mon, Nov 9, 2015 at 4:09 PM, Nicol Bolas <jmck...@gmail.com> wrote:

> It is not possible to convert every binary IEEE-754 representable number into a decimal version that will convert back to exactly that float.

You are disputing a statement that I never made.
 
> And that's not what we're talking about.

You are mistaken, because we are very obviously talking about that.

> We're asking for not-stupidity, not losing precision for arbitrary reasons.

Base 10 is entirely arbitrary.

Cheers,
V.

Andrey Semashev

Nov 9, 2015, 10:31:49 AM11/9/15
to std-dis...@isocpp.org
On 2015-11-09 18:05, Nicol Bolas wrote:
>
> Maybe my wording was poor; English is not my native language, sorry.
> What I mean is that whatever number you format with %f you will get a
> fixed precision up to 6-th digit after the decimal point.
>
>
> And that might be fine... if I were using a /fixed-point/ number. I am
> instead using a /floating-point/ number. The entire purpose of which is
> that it retains a specific number of significant digits, regardless of
> the location of its digits relative to the decimal point. Obviously
> within the limits of the numeric precision of the floating point type,
> of course.
>
> If your `to_string` for floating-point acts like `to_string` for
> fixed-point, /you're doing it wrong/.

The problem is that humans are used to reading numbers in fixed point
format.

> > Consider instead 0.000000 and 0.000000. Can you tell which is
> bigger?
> > /Nobody/ can, no matter how long you look at it.
>
> I can - the numbers are equal.
>
>
> But one of them was in fact bigger than the other. Therefore,
> information has been lost.

Exactly. And it will always be lost as long as FP to string conversion
is lossy.

> Or to say it this way, if the floating point value of x is greater than
> the floating point value of y, then the to_string representation should
> preserve that information as much as is reasonable.

You can't define 'reasonable'. I mean, I find the current %f behavior
reasonable.

> > Comprehension may take longer with %g, but it is at least
> /possible/.
>
> You will have the same kind of problem with %g if you compare formatted
> numbers 1e50+1 and 1e50.
>
>
> But that's only because you broke floating-point precision. Within the
> boundaries of floating point precision, %g will print the right answer.

Sorry, I miscalculated the numbers. Here's the correct example:

#include <cstdio>

int main()
{
    double n1 = (1e15) + 1.0;
    double n2 = 1e15;

    std::printf("%g\n%g\n%d\n", n1, n2, (int)(n1 == n2));

    return 0;
}

The output is:

1e+15
1e+15
0

Nicol Bolas

Nov 9, 2015, 10:53:35 AM11/9/15
to ISO C++ Standard - Discussion
On Monday, November 9, 2015 at 10:30:42 AM UTC-5, Viacheslav Usov wrote:
On Mon, Nov 9, 2015 at 4:09 PM, Nicol Bolas <jmck...@gmail.com> wrote:

> It is not possible to convert every binary IEEE-754 representable number into a decimal version that will convert back to exactly that float.

You are disputing a statement that I never made.
 
> And that's not what we're talking about.

You are mistaken, because we are very obviously talking about that.

No, we aren't. If we just wanted to turn an IEEE-754 number into a string of text to be read by another program and converted back into a float, we wouldn't bother to use decimal at all. We'd use Base64 or whatever to encode the binary bytes directly.

The implicit assumption with `to_string` is that a human ought to be able to read the results. Just like every other float-to-string functionality that other languages provide.
 
We're asking for not-stupidity, not losing precision for arbitrary reasons.

Base 10 is entirely arbitrary.

Then take it up with the world's school systems for teaching everyone base 10. That's what people know, so that's what `to_string` ought to output.

Nicol Bolas

Nov 9, 2015, 11:16:38 AM11/9/15
to ISO C++ Standard - Discussion

If unnecessarily losing precision in float-to-string conversions is "reasonable" to you, then I would say that your definition of "reasonable" doesn't coincide with those of others.

And there's substantial evidence for that:

Lua:

print(0.000000001)

Yields:

1e-9

Python:

print(0.000000001)

Yields

1e-9

C#:

using System;

public class Test
{
    public static void Main()
    {
        // your code goes here
        Console.WriteLine("{0}", 0.000000001);
    }
}

Yields:

1e-9

Java:

import java.io.*;

class Ideone
{
    public static void main (String[] args) throws java.lang.Exception
    {
        System.out.println(Float.toString(0.000000001f));
    }
}

Yields:

1e-9

C++:

#include <iostream>
#include <string>
using namespace std;

int main() {
    cout << to_string(0.000000001f) << "\n";
    return 0;
}

Yields:

0.000000

The rest of the world seems to have decided what "reasonable" means. And it's not what `to_string` does.
 
>      > Comprehension may take longer with %g, but it is at least
>     /possible/.
>
>     You will have the same kind of problem with %g if you compare formatted
>     numbers 1e50+1 and 1e50.
>
>
> But that's only because you broke floating-point precision. Within the
> boundaries of floating point precision, %g will print the right answer.

Sorry, I miscalculated the numbers.

OK, you're right; %g won't always give the right answer.

However, I just checked this for both Lua and Python. And neither of them give the right answer either. It seems that the rest of the world decided that %g's failings were more tolerable than %f's.

Thiago Macieira

Nov 9, 2015, 12:24:50 PM11/9/15
to std-dis...@isocpp.org
On Monday 09 November 2015 12:04:30 Andrey Semashev wrote:
> Thing is, %g is
> also not a very good choice because this format is difficult

And %g is also insufficient. You want %.19g so that you have enough digits to
make the round-trip conversion lossless.

Then you'll get bug reports from your users that you're writing too many
digits and that some imprecise FP numbers like 1.3 are actually shown as a
very long number close to 1.3 but different from it (same FP number though).

This is not a wild prediction on my part. It has already happened. I changed
the QVariant conversion of double to string to %.19g and we got several bug
reports about the too-precise number. Our conclusion: we'll import a whole new
library into QtCore that is better at converting doubles to string than
snprintf.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

Thiago Macieira

Nov 9, 2015, 12:32:30 PM11/9/15
to std-dis...@isocpp.org
On Monday 09 November 2015 15:59:43 Viacheslav Usov wrote:
> Given that just about every floating point unit in the world operates in
> accordance with IEEE 754 [1], there is no good excuse for not having a
> standardized textual representation of floating point numbers that is
> lossless.

Yes, there is. See these bug reports about using a too-precise number:

https://bugreports.qt.io/browse/QTBUG-47371
https://bugreports.qt.io/browse/QTBUG-47192
https://bugreports.qt.io/browse/QTBUG-47575

Those were caused by fixing
https://bugreports.qt.io/browse/QTBUG-42574

which asked for lossless conversion. Once we implemented the lossless
conversion, people started complaining that the FP number they were getting
was not the number they were expecting.

u97...@gmail.com

Nov 9, 2015, 12:40:28 PM11/9/15
to ISO C++ Standard - Discussion, u97...@gmail.com
@ Nicol Bolas

Don't let Andrey fool you with the nonsense ;) Of course %g will "fail" in that example, because its default precision is 6 significant digits. Using e.g. "%.16g" shows the whole value, and it can often, but not always, avoid showing redundant digits (see the example below).

Example:

#include <cstdio>
#include <iostream>

int main()
{
    char buffer[32];
    sprintf(buffer, "%f", 123.3);
    std::cout << buffer << '\n';
    sprintf(buffer, "%g", 123.3);
    std::cout << buffer << '\n';
    sprintf(buffer, "%.16g", 123.3);
    std::cout << buffer << '\n';
    sprintf(buffer, "%.19g", 123.3);
    std::cout << buffer << '\n';

    double n1 = (1e15) + 1.0;
    double n2 = 1e15;

    std::printf("%.19g\n%.19g", n1, n2);
}


Output
123.300000
123.3
123.3
123.2999999999999972
1000000000000001
1000000000000000




loic.act...@numericable.fr

Nov 9, 2015, 12:42:31 PM11/9/15
to Thiago Macieira, std-dis...@isocpp.org
The situation might be different: You are talking about a UI framework, that displays strings to the end user. std::to_string is a more general facility. And I think round-trip is something important in this situation.

---
Loïc



---- Original message ----
From: "Thiago Macieira" <thi...@macieira.org>
To: std-dis...@isocpp.org
Subject: Re: [std-discussion] Deprecate to_string() for floating point types?
Date: 09/11/2015 18:32:20 CET

Jean-Marc Bourguet

Nov 9, 2015, 12:46:12 PM11/9/15
to ISO C++ Standard - Discussion, thi...@macieira.org
Again, since at least Steele's paper it has been known how to produce strings which read back as the original yet are not surprising to the user. I'd not be surprised if Scheme or Java mandated their use, BTW. And IIRC Lawrence Crowl hinted in the numeric group that he wanted to have such a thing standardized as well; I don't remember having seen a paper.

-- 
Jean-Marc

Andrey Semashev

Nov 9, 2015, 12:46:53 PM11/9/15
to std-dis...@isocpp.org
On 2015-11-09 20:40, u97...@gmail.com wrote:
> @ Nicol Bolas
>
> Don't let Andrey fool you with the nonsense ;)

Fooling anyone is the opposite of what I was trying to do in this
discussion. Oh, well...

Viacheslav Usov

Nov 9, 2015, 12:48:52 PM11/9/15
to std-dis...@isocpp.org
On Mon, Nov 9, 2015 at 4:53 PM, Nicol Bolas <jmck...@gmail.com> wrote:

> No, we aren't.

I am talking about that. I am part of 'we'. Can you complete this syllogism?

> If we just wanted to turn an IEEE-754 number into a string of text to be read by another program and converted back into a float, we wouldn't bother to use decimal at all. We'd use Base64 or whatever to encode the binary bytes directly.

I would have no problem with that, if it were standardized and portable. The problem is that there is no standard way to represent floating point numbers as literals and strings exactly and portably. Which is extremely weird given that modern hardware de facto uses the same standard for floating point numbers.

> Then take it up with the world's school systems for teaching everyone base 10.

This is a non sequitur, because we are not talking about everyone.

> That's what people know, so that's what `to_string` ought to output.

People know a few other things, too. Feel free to apply your reasoning to this premise.

The need to represent floating point literals in C++ exactly is different from the need to represent floating point numbers as taught by the world's school system. And by the way, the subject of this thread is about deprecating to_string, so to_string can happily continue losing precision.

Cheers,
V.

Viacheslav Usov

Nov 9, 2015, 12:54:33 PM11/9/15
to std-dis...@isocpp.org
On Mon, Nov 9, 2015 at 6:46 PM, Jean-Marc Bourguet <jm.bo...@gmail.com> wrote:
Again, since at least Steele's paper it has been known how to produce strings which read back as the original yet are not surprising to the user. I'd not be surprised if Scheme or Java mandated their use, BTW. And IIRC Lawrence Crowl hinted in the numeric group that he wanted to have such a thing standardized as well; I don't remember having seen a paper.

I have not read the paper, but you mentioned certain limitations, such as no infinities or NaNs. There is no reason to have those limitations. Just for the record.

Cheers,
V.

Viacheslav Usov

Nov 9, 2015, 12:56:53 PM11/9/15
to std-dis...@isocpp.org
On Mon, Nov 9, 2015 at 6:32 PM, Thiago Macieira <thi...@macieira.org> wrote:
On Monday 09 November 2015 15:59:43 Viacheslav Usov wrote:
> Given that just about every floating point unit in the world operates in
> accordance with IEEE 754 [1], there is no good excuse for not having a
> standardized textual representation of floating point numbers that is
> lossless.

Yes, there is. See these bug reports about using a too-precise number:

I did not suggest making a breaking change anywhere.

Cheers,
V.

Nicol Bolas

Nov 9, 2015, 1:28:52 PM11/9/15
to ISO C++ Standard - Discussion


On Monday, November 9, 2015 at 12:48:52 PM UTC-5, Viacheslav Usov wrote:
On Mon, Nov 9, 2015 at 4:53 PM, Nicol Bolas <jmck...@gmail.com> wrote:
> That's what people know, so that's what `to_string` ought to output.

People know a few other things, too. Feel free to apply your reasoning to this premise.

The need to represent floating point literals in C++ exactly is different from the need to represent floating point numbers as taught by the world's school system.

Right. That's why I'm wondering why you brought it up in a discussion about `to_string`. The purpose of `to_string` is not to guarantee a full round-trip with no loss of precision. The purpose of `to_string` is to be a simple and effective way of making a numerical value printable and readable.
 
And by the way, the subject of this thread is about deprecating to_string, so to_string can happily continue losing precision.

No, the thread is about the problems of `to_string`, with the suggestion to deprecate it in favor of something else as a solution.

Nicol Bolas

Nov 9, 2015, 1:41:12 PM11/9/15
to ISO C++ Standard - Discussion


On Monday, November 9, 2015 at 12:24:50 PM UTC-5, Thiago Macieira wrote:
On Monday 09 November 2015 12:04:30 Andrey Semashev wrote:
> Thing is, %g is
> also not a very good choice because this format is difficult

And %g is also insufficient. You want %.19g so that you have enough digits to
make the round-trip conversion lossless.

Then you'll get bug reports from your users that you're writing too many
digits and that some imprecise FP numbers like 1.3 are actually shown as a
very long number close to 1.3 but different from it (same FP number though).

This is not a wild prediction on my part. It has already happened. I changed
the QVariant conversion of double to string to %.19g and we got several bug
reports about the too-precise number. Our conclusion: we'll import a whole new
library into QtCore that is better at converting doubles to string than
snprintf.

Hmm... That's unfortunate, but understandable.

So if we want to improve `to_string`, we need to come up with a way to format floats that is compact, while still representing the value.

I think this would best be done by making it a QOI issue, with the standard making it clear that the resulting string from `to_string` should have "at least X significant digits", and that the resulting string should be formatted so that it can be fed back through `stof`/`stod`/`stold`.

Matthew Woehlke

Nov 9, 2015, 2:17:24 PM11/9/15
to std-dis...@isocpp.org
On 2015-11-09 12:24, Thiago Macieira wrote:
> On Monday 09 November 2015 12:04:30 Andrey Semashev wrote:
>> Thing is, %g is
>> also not a very good choice because this format is difficult
>
> And %g is also insufficient. You want %.19g so that you have enough digits to
> make the round-trip conversion lossless.
>
> Then you'll get bug reports from your users that you're writing too many
> digits and that some imprecise FP numbers like 1.3 are actually shown as a
> very long number close to 1.3 but different from it (same FP number though).
>
> This is not a wild prediction on my part. It has already happened. I changed
> the QVariant conversion of double to string to %.19g and we got several bug
> reports about the too-precise number. Our conclusion: we'll import a whole new
> library into QtCore that is better at converting doubles to string than
> snprintf.

Heh. Please make that library available to inkscape, which has the same
problem ;-).

I think what we humans really want is the decimal floating-point (%g)
number that has as few digits as possible while not altering the binary
representation more than 1-2 of the lowermost bits of precision. Note
that that includes preferring "1.5" over "1.500000"; trailing zeros are
not always desirable.

--
Matthew

Thiago Macieira

Nov 9, 2015, 2:58:34 PM11/9/15
to loic.act...@numericable.fr, std-dis...@isocpp.org
On Monday 09 November 2015 18:42:28 loic.act...@numericable.fr wrote:
> The situation might be different: You are talking about a UI framework, that
> displays strings to the end user. std::to_string is a more general
> facility. And I think round-trip is something important in this situation.

No, I'm not. I'm talking about a change made in QtCore, inside the QVariant
code. There's nothing GUI about that, but it affected user-visible strings.

That's a very good parallel to std::to_string.

Thiago Macieira

Nov 9, 2015, 3:00:26 PM11/9/15
to Jean-Marc Bourguet, ISO C++ Standard - Discussion
On Monday 09 November 2015 09:46:11 Jean-Marc Bourguet wrote:
> Again, since at least Steele's paper it has been known how to produce strings
> which read back as the original yet are not surprising to the user. I'd
> not be surprised if Scheme or Java mandated their use, BTW. And IIRC
> Lawrence Crowl hinted in the numeric group that he wanted to have such a
> thing standardized as well; I don't remember having seen a paper.

By all means, let's require the short version of double conversion. Having to
import libdouble-conversion (https://github.com/google/double-conversion) is a
hack to work around problems in the C library.

Oh, and snprintf_l needs to be mandatory. IMHO the non-locale-enabled versions
of the printf and scanf functions need to be deprecated and their use warned
against.

Thiago Macieira

Nov 9, 2015, 6:53:12 PM11/9/15
to std-dis...@isocpp.org
On Monday 09 November 2015 10:41:12 Nicol Bolas wrote:
> So if we want to improve `to_string`, we need to come up with a way to
> format floats that is compact, while still representing the value.
>
> I think this would best be done by making it a QOI issue, with the standard
> making it clear that the resulting string from `to_string` should have "at
> least X significant digits", and that the resulting string should be
> formatted so that it can be fed back through stof/d/ld.

I disagree. There's a QoI aspect, but I wouldn't specify what you said. If we
want lossless, we should simply say "lossless" and let the implementation
decide how to achieve that.

For example,

using namespace std;
cout << to_string(0.3) << endl;

The output could be any of the following:

0.2999999999999999889
0.29999999999999999
0.3

All values are *correct* because they represent the same value
(0x1.3333333333333p-2, in hexfloat form).

The QoI is which one gets printed. We punt the problem of producing something
the user wants to see to the implementation.

Like I said, today, if you use snprintf_l with a precision of
std::numeric_limits<double>::max_digits10 or that+2, most implementations will
produce an unexpected result (the first two lines).

snprintf is nonsense because, for me, the output above would have used a comma
instead of a period for the decimal separator. That's not lossless.

Thiago Macieira

Nov 9, 2015, 6:54:00 PM11/9/15
to std-dis...@isocpp.org
On Monday 09 November 2015 14:17:11 Matthew Woehlke wrote:
> I think what we humans really want is the decimal floating-point (%g)
> number that has as few digits as possible while not altering the binary
> representation more than 1-2 of the lowermost bits of precision. Note
> that that includes preferring "1.5" over "1.500000"; trailing zeros are
> not always desirable.

Trailing zeroes are easy to fix and most implementations will get it right.
The problem is usually trailing nines. See the other email.

Thiago Macieira

Nov 9, 2015, 7:00:53 PM11/9/15
to std-dis...@isocpp.org
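On Monday 09 November 2015 18:56:52 Viacheslav Usov wrote:
> I did not suggest making a breaking change anywhere.

The OP did, when the thread started asking about changing the output of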
to_string().

My changes in Qt are arguably not breaking anything, since I added precision
and prevented loss of data. But it broke with expectations.

Richard Smith

Nov 9, 2015, 7:13:47 PM11/9/15
to std-dis...@isocpp.org, thi...@macieira.org
On Mon, Nov 9, 2015 at 9:46 AM, Jean-Marc Bourguet <jm.bo...@gmail.com> wrote:
Again, since at least Steele's paper it has been known how to produce strings which read back as the original yet are not surprising to the user. I'd not be surprised if Scheme or Java mandated their use, BTW. And IIRC Lawrence Crowl hinted in the numeric group that he wanted to have such a thing standardized as well; I don't remember having seen a paper.

Modern versions of Tcl use that approach by default. See http://www.tcl.tk/cgi-bin/tct/tip/132, in particular "a number shall be formatted using the smallest number of decimal digits required to distinguish it from the next floating point number above it and the next floating point number below", and this citation:

Guy L. Steele Jr. and Jon L. White, How to print floating-point numbers accurately. In Proceedings of the ACM Conference on Programming Language Design and Implementation, June 20-22 1990, pp. 112-126. [http://dl.acm.org/citation.cfm?doid=93542.93559]
 
It would be very useful to have an implementation of that algorithm in the standard library, whether or not we name it to_string.

Nicol Bolas

Nov 9, 2015, 8:06:06 PM11/9/15
to ISO C++ Standard - Discussion


On Monday, November 9, 2015 at 7:00:53 PM UTC-5, Thiago Macieira wrote:
On Monday 09 November 2015 18:56:52 Viacheslav Usov wrote:
> On Mon, Nov 9, 2015 at 6:32 PM, Thiago Macieira <thi...@macieira.org> wrote:
> > On Monday 09 November 2015 15:59:43 Viacheslav Usov wrote:
> > > Given that just about every floating point unit in the world operates in
> > > accordance with IEEE 754 [1], there is no good excuse for not having a
> > > standardized textual representation of floating point numbers that is
> > > lossless.
> >
> > Yes, there is. See these bug reports about using a too-precise number:
> I did not suggest making a breaking change anywhere.

The OP did, when the thread started asking about changing the output of
to_string().

Technically, the OP only wanted `to_string` deprecated because of the surprise of its poor round-tripping behavior. Personally, I don't see why we should deprecate the floating-point overloads instead of fixing them, even at the risk of making a string much longer than expected.

Matthew Woehlke

Nov 10, 2015, 10:39:30 AM11/10/15
to std-dis...@isocpp.org
On 2015-11-09 18:53, Thiago Macieira wrote:
> On Monday 09 November 2015 14:17:11 Matthew Woehlke wrote:
>> I think what we humans really want is the decimal floating-point (%g)
>> number that has as few digits as possible while not altering the binary
>> representation more than 1-2 of the lowermost bits of precision. Note
>> that that includes preferring "1.5" over "1.500000"; trailing zeros are
>> not always desirable.
>
> Trailing zeroes are easy to fix and most implementations will get it right.

...and which *isn't* currently done by to_string. My intent was merely
to point out that, if we change things, we may want to explicitly note
trailing zeros as one of the things we want to change. (Even if it
happens implicitly just in switching from '%f' to '%g'.)

--
Matthew

Viacheslav Usov

Nov 10, 2015, 12:11:22 PM11/10/15
to std-dis...@isocpp.org
On Mon, Nov 9, 2015 at 7:28 PM, Nicol Bolas <jmck...@gmail.com> wrote:

> That's why I'm wondering why you brought it up in a discussion about `to_string`.

This discussion was initiated by a report on the failure of to_string to serve a particular purpose. Whether to_string needs to serve that particular purpose is debatable, and if it does not, another mechanism serving that purpose is a valid subject in this discussion.

> The purpose of `to_string` is not to guarantee a full round-trip with no loss of precision.

The standard does not define the purpose of to_string. The standard implies that to_string, when applied to integers, does "guarantee a full round-trip with no loss of precision", your ideas about its purpose notwithstanding.

The current behaviour of to_string will probably have to be retained as a default for backward compatibility, but there is no reason why some selectable behaviours could not "guarantee a full round-trip with no loss of precision". There are in fact very good reasons, as stated earlier, why such behaviours are desirable and achievable.

> The purpose of `to_string` is to be a simple and effective way of making a numerical value printable and readable.

This is neither specified by the standard nor fundamentally incompatible with being able to "guarantee a full round-trip with no loss of precision".

> No, the thread is about...

You are contradicting yourself, because "the subject of this thread is about deprecating to_string..." is both what I wrote and what the subject of your message says.

Cheers,
V.

Nicol Bolas

Nov 10, 2015, 12:45:47 PM11/10/15
to ISO C++ Standard - Discussion


On Tuesday, November 10, 2015 at 12:11:22 PM UTC-5, Viacheslav Usov wrote:
On Mon, Nov 9, 2015 at 7:28 PM, Nicol Bolas <jmck...@gmail.com> wrote:

That's why I'm wondering why you brought it up in a discussion about `to_string`.

This discussion was initiated by a report on the failure of to_string to serve a particular purpose. Whether to_string needs to serve that particular purpose is debatable, and if it does not, another mechanism serving that purpose is a valid subject in this discussion.

`to_string` makes floats human readable. Therefore, anyone who is currently thinking about using `to_string` to make a float round-trippable through text would also have "human readable" as a requirement.

And therefore a suggestion for a function that round-trips through text but isn't human readable is not appropriate to this discussion.

> The purpose of `to_string` is not to guarantee a full round-trip with no loss of precision.

The standard does not define the purpose of to_string. The standard implies that to_string, when applied to integers, does "guarantee a full round-trip with no loss of precision", your ideas about its purpose notwithstanding.

Um, no. The standard explicitly states how `to_string` is currently implemented: as exactly equivalent to `snprintf("%f", float);`. And there is nothing about the behavior of `%f` that guarantees round-tripping.

So the standard does not imply anything of the sort at present. Users have a reasonable expectation that it should do that, but they wouldn't get that impression by reading the standard. They got it by reading the function's name and making assumptions about it. That is, there's a disconnect between what the user expects and what the standard says happens.

That disconnect is what this thread is all about.
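To make the disconnect concrete, here is a minimal sketch of the "%f" behaviour described above. The helper name `format_with_percent_f` is made up for illustration; it only mimics the `sprintf`-with-"%f" formatting that the standard wording refers to, and it assumes the default "C" locale.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Illustrative helper (not a standard function): formats a double with the
// "%f" conversion, i.e. fixed notation with six digits after the decimal point.
std::string format_with_percent_f(double value) {
    // The first call only measures the required length.
    int length = std::snprintf(nullptr, 0, "%f", value);
    if (length < 0) {
        return std::string();  // snprintf reported an error
    }
    // The second call writes the text plus the terminating null character.
    std::vector<char> buffer(static_cast<std::size_t>(length) + 1);
    std::snprintf(buffer.data(), buffer.size(), "%f", value);
    return std::string(buffer.data(), static_cast<std::size_t>(length));
}
```

With this formatting, 1e-9, 6.67408e-11 and 6.626070040e-34 all come out as "0.000000", so nothing about the original values can be recovered from the text.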

The current behaviour of to_string will probably have to be retained as a default for backward compatibility,

I don't believe that. I believe that we can enhance the current behavior of `to_string` without breaking the expectations of people on its output. Or at least, not very much.

but there is no reason why some of selectable behaviours could not "guarantee a full round-trip with no loss of precision". There are in fact very good reasons, as stated earlier, why such behaviours are desirable and achievable.

> The purpose of `to_string` is to be a simple and effective way of making a numerical value printable and readable.

This is neither specified by the standard

Of course not. The standard specifies behavior. Not design principles or guidelines. If you're looking for design ideals, the standard isn't where you look.

nor fundamentally incompatible with being able to "guarantee a full round-trip with no loss of precision"

 > No, the thread is about...

You are contradicting yourself, because "the subject of this thread is about deprecating to_string..." is both what I wrote and what the subject of your message says.

Right. Which is why the OP goes into intimate details of the problems of `to_string`, then spends a paragraph saying "so we should deprecate it". My response (and the response of others) is that we should fix it.
 

Cheers,
V.

Viacheslav Usov

Nov 10, 2015, 2:11:02 PM11/10/15
to std-dis...@isocpp.org
On Tue, Nov 10, 2015 at 6:45 PM, Nicol Bolas <jmck...@gmail.com> wrote:

> `to_string` makes floats human readable.

This repeats your earlier argument about its purpose, which, as I said, is quite simply baseless. This invalidates the rest of your argument.

to_string does what it does, and labelling its current behaviour as "human readable" is purely subjective. Even if I personally agree that "human readable" is a desirable property, it does not technically need to be a property of to_string or whatever other mechanism when it is used to represent floating point numbers exactly. I said this earlier, and your response to this was "right", so why are we discussing this again?
 
> Um, no. The standard explicitly states how `to_string` is currently implemented: as exactly equivalent to `snprintf("%f", float);`. And there is nothing about the behavior of `%f` that guarantees round-tripping.

I am really not willing to make this personal, but I just find myself repeatedly asking myself: why does Nicol Bolas keep saying "no" to something that I did not say when responding to my messages? I said, literally: " to_string, when applied to integers". How could you respond to this with "no" and then justify your response by saying something about floats?

The standard implies a certain behaviour of to_string with respect to integers, specifically that they can be converted to a string and back without any loss. The fact that the standard does not imply that when it comes to floats is quite nasty. I think we all agree about this and there is no need to repeat this over and over again.
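A small sketch of the difference, assuming a library where `to_string(double)` still follows the "%f" wording discussed in this thread. The two helper names are invented for the example.

```cpp
#include <string>

// Illustrative helper: true when an integer survives to_string followed by stoll.
bool integer_round_trips(long long value) {
    return std::stoll(std::to_string(value)) == value;
}

// Illustrative helper: true when a double survives to_string followed by stod.
bool double_round_trips(double value) {
    return std::stod(std::to_string(value)) == value;
}
```

Under that assumption, `integer_round_trips` holds for every `long long`, while `double_round_trips(1e-9)` fails because the intermediate string is "0.000000". Values such as 0.5 that fit in six decimals still round-trip.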

For me, the worst part of this is that this is very likely based on a thoroughly outdated idea that floating point numbers are intrinsically evil and one has to pretty much always assume some kind of back-stabbingness when dealing with them, so it is OK to lose every significant figure your data happen to have when you rely on standard library means.

That is not OK, and ought to be fixed. Which is not extremely hard because there are well established standards on floating point numbers. And if that is fixed, that should be fixed not only in string conversions, but also in literals, because losing precision just because your literals cannot specify numbers exactly is absurd.

Whether the string conversion part is fixed via to_string, extended to_string or some other thing is much less important for me.

Cheers,
V.

Nicol Bolas

Nov 10, 2015, 4:41:03 PM11/10/15
to ISO C++ Standard - Discussion


On Tuesday, November 10, 2015 at 2:11:02 PM UTC-5, Viacheslav Usov wrote:
On Tue, Nov 10, 2015 at 6:45 PM, Nicol Bolas <jmck...@gmail.com> wrote:

> `to_string` makes floats human readable.

This repeats your earlier argument about its purpose, which, as I said, is quite simply baseless. This invalidates the rest of your argument.

to_string does what it does, and labelling its current behaviour as "human readable" is purely subjective. Even if I personally agree that "human readable" is a desirable property, it does not technically need to be a property of to_string or whatever other mechanism when it is used to represent floating point numbers exactly. I said this earlier, and your response to this was "right", so why are we discussing this again?
 
 
> Um, no. The standard explicitly states how `to_string` is currently implemented: as exactly equivalent to `snprintf("%f", float);`. And there is nothing about the behavior of `%f` that guarantees round-tripping.

I am really not willing to make this personal, but I just find myself repeatedly asking myself: why does Nicol Bolas keep saying "no" to something that I did not say when responding to my messages? I said, literally: " to_string, when applied to integers". How could you respond to this with "no" and then justify your response by saying something about floats?

The standard implies a certain behaviour of to_string with respect to integers, specifically that they can be converted to a string and back without any loss. The fact that the standard does not imply that when it comes to floats is quite nasty. I think we all agree about this and there is no need to repeat this over and over again.

I want to make sure I understand your argument.

The standard does not explicitly state that `to_string` exists to make a number human readable. And therefore, any claim that `to_string` exists to make numbers human readable is "baseless". This despite the fact that the specified behavior makes the number human readable.

However, the standard implies that `to_string` is supposed to allow round tripping. This is because the standard mandated behavior of `to_string` for integers happens to allow round-tripping. This despite the fact that the same "standard mandated behavior of `to_string`" explicitly forbids round-tripping for floats.

So we should listen to what the standard says about the behavior, but only when it proves your point?

If you want to feel that a reasonable implementation of `to_string` should be allowed to use Base64 so long as `stof` can read a Base64-encoded float, I can't stop you.

As to the specific question of round-tripping for integers, I would say that this is coincidental. It's just that it's easy to round-trip between integers and a readable representation of them. Whereas with floats, there was no good, simple alternative that could be expressed as a form of `snprintf`.

Round tripping with `to_string` would be nice. But getting a readable string, and not Base64 or whatever, is an immovable goal of such a function.

For me, the worst part of this is that this is very likely based on a thoroughly outdated idea that floating point numbers are intrinsically evil and one has to pretty much always assume some kind of back-stabbingness when dealing with them, so it is OK to lose every significant figure your data happen to have when you rely on standard library means.

There is only one person on this thread claiming that the current behavior is OK. And his argument is mainly that he doesn't want to see exponential notation.

Thiago Macieira

Nov 10, 2015, 5:45:55 PM11/10/15
to std-dis...@isocpp.org
On Tuesday 10 November 2015 13:41:03 Nicol Bolas wrote:
> As to the specific question of round-tripping for integers, I would say
> that this is coincidental. It's just that it's easy to round-trip between
> integers and a readable representation of them. Whereas with floats, there
> was no good, simple alternative that could be expressed as a form of
> `snprintf`.

Thanks Nicol, I think you summarised it quite well. It seems the standard is
unintentionally contradictory (maybe not strictly speaking, but that's what
people read).

In my view, the best option would be to make it human readable *and* loss-less
for regular, finite numbers. We can do it in stages.

I'd start with the basic rules for round-tripping:

* it's ok for denormal numbers to get normalised, whenever possible
* all NaNs get converted to the same NaN representation
* all ±infinities get converted to the same ±inf
* negative zeroes get converted to positive zeroes

Then add one about locale:
* the output is locale-independent (period for decimal separation, no group
separators)

Then we choose one of the two rules about precision
* precision is reasonable, but not guaranteed to be lossless (that's %g)
* lossless (%.17g for IEEE 754 double-precision or with a better library)
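The two precision rules can be compared with a short sketch. It relies on the fact that `std::numeric_limits<double>::max_digits10` is 17 for IEEE 754 double precision, so 17 significant digits are enough to recover the value. The helper names `format_g` and `survives_round_trip` are invented for this example.

```cpp
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Illustrative helper: formats value with "%.*g" using the requested number
// of significant digits. Plain "%g" corresponds to 6 digits.
std::string format_g(double value, int significant_digits) {
    char buffer[64];  // output longer than the buffer would simply be truncated
    std::snprintf(buffer, sizeof buffer, "%.*g", significant_digits, value);
    return buffer;
}

// Illustrative helper: true when parsing the formatted text gives back
// exactly the same double.
bool survives_round_trip(double value, int significant_digits) {
    std::string text = format_g(value, significant_digits);
    return std::strtod(text.c_str(), nullptr) == value;
}
```

For example, `survives_round_trip(1.0 / 3.0, 6)` is false because "0.333333" is a different number, while the same call with 17 digits succeeds. Note that this sketch does not handle the NaN, negative zero or locale rules from the list; it only illustrates the precision choice.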

Viacheslav Usov

Nov 11, 2015, 5:45:34 AM11/11/15
to std-dis...@isocpp.org
On Tue, Nov 10, 2015 at 10:41 PM, Nicol Bolas <jmck...@gmail.com> wrote:
 
> The standard does not explicitly state that `to_string` exists to make a number human readable. And therefore, any claim that `to_string` exists to make numbers human readable is "baseless".

That, and "human readable" is not something defined formally. And one could just as easily say it is coincidental.

And no, my arguments for round-tripping are not substantially better than yours for readability. They both interpret what the standard does not say.
 
> If you want to feel that a reasonable implementation of `to_string` should be allowed to use Base64 so long as `stof` can read a Base64-encoded float, I can't stop you.

Again, I could accept that as a selectable behaviour of to_string, provided some strong portability guarantees. I would prefer something less opaque. I would also prefer never losing a bit of precision with the default to_string behaviour, but I could accept having to request a lossless behaviour explicitly or having to use a different mechanism for that.

Cheers,
V.

u97...@gmail.com

Nov 11, 2015, 2:06:23 PM11/11/15
to ISO C++ Standard - Discussion, u97...@gmail.com
Lots of posts here; I'll try to gather the points so far and add some thoughts.

Despite opposing views, I presume it's quite commonly accepted that the current implementation of to_string for floating point types is more or less substandard. But it's perhaps not so clear what such a function should do exactly if it were designed from scratch. Some options that have been mentioned:
1. Lossless conversion to string.
    -Simple in the sense that there is no need to think about "how inexact".
    -May print more characters than expected/desired by some users.
2. Use some compromise between number of characters and precision.
    -What is the compromise? Whatever it is, it will obviously not be appropriate for all needs: some want a conversion that looks nice, and some will find it faulty if different floating point values don't map to different strings.

With integer values, to_string() maps every integer to a different string, and these strings (i.e. the image of the conversion function on the set of integer values) can be mapped back to the original integers: a one-to-one correspondence. From this, one concrete question can be formulated as follows:
  • Should there be a to_string-like simple function that converts floating point values to string with the same one-to-one correspondence property as integers have with to_string()?
Precision and formatting depend on the use case, and it seems impossible for the standard to come up with a 'one-size-fits-all' solution. It can come up with a solution that works or is tolerable for some, maybe even for the majority (as was done with streams), but does it need to handle the problem with such an approach? If precision and "prettiness" issues are indeed so context dependent, is it even worth trying to give a general solution? That doesn't, however, mean that the standard shouldn't make it easy for the programmer to accomplish his/her ideal. And considering the title "Simple numeric access" of the papers, I think it is also worth pointing out that a too simple and restricted conversion function doesn't necessarily help the programmer, because once the simple version is learned and remembered, it still becomes necessary to learn the robust conversion function.

From this, an idea for evaluation: define overloads to allow arbitrary sprintf behaviour directly from to_string or its new version. For example (using double instead of a templated floating point type for clarity):

std::string new_to_string(double f, int_type precision = non_lossy_digit_precision_for_double);
std::string new_to_string(double f, format_option format);
// Usage
new_to_string(1.23456); // Use non-lossy conversion
new_to_string(1.23456, 3); // Use only 3 significant digits. Somewhat ambiguous though, could be understood as "convert both and return pair of strings".
new_to_string(1.23456, format_option('f', 6)); // Classic %f printing.
// Or perhaps format_option could be a string or implicitly constructible from string allowing the sprintf-style
new_to_string(1.23456, "%f");
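One way the idea could be sketched in today's C++ is shown below. Everything here is an assumption made for illustration: `new_to_string` is the hypothetical name from this post, the precision overload is taken to mean "%g"-style significant digits, the non-lossy default is taken to be `std::numeric_limits<double>::max_digits10`, and the format overload simply takes a printf-style string rather than a `format_option` type.

```cpp
#include <cstddef>
#include <cstdio>
#include <limits>
#include <string>
#include <vector>

// Hypothetical overload: formats one double with a caller-supplied
// printf-style format such as "%f". The format is trusted here; real code
// would have to validate it before handing it to snprintf.
std::string new_to_string(double value, const char* format) {
    int length = std::snprintf(nullptr, 0, format, value);  // measure first
    if (length < 0) {
        return std::string();  // snprintf reported an error
    }
    std::vector<char> buffer(static_cast<std::size_t>(length) + 1);
    std::snprintf(buffer.data(), buffer.size(), format, value);
    return std::string(buffer.data(), static_cast<std::size_t>(length));
}

// Hypothetical overload: "%g"-style output with the given number of
// significant digits. The default, max_digits10 (17 for IEEE 754 double),
// is enough for the text to convert back to the same value.
std::string new_to_string(double value,
                          int precision = std::numeric_limits<double>::max_digits10) {
    int length = std::snprintf(nullptr, 0, "%.*g", precision, value);
    if (length < 0) {
        return std::string();
    }
    std::vector<char> buffer(static_cast<std::size_t>(length) + 1);
    std::snprintf(buffer.data(), buffer.size(), "%.*g", precision, value);
    return std::string(buffer.data(), static_cast<std::size_t>(length));
}
```

With these definitions, `new_to_string(1.23456, 3)` gives "1.23" and `new_to_string(1.23456, "%f")` gives "1.234560". The default is lossless but shows the drawback listed under option 1: `new_to_string(0.1)` prints "0.10000000000000001", which is more characters than many users would expect.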


The relevant question still is:
  • Should to_string() for floating points be
  1. deprecated,
  2. changed to match the ideal version (after the ideal is defined and agreed on), or
  3. left as it is?
I wouldn't have started this thread if '3' looked like the correct option, but as said before, evaluating the pros and cons of 1 and 2 should be done by people more experienced than the OP. And if 1, should there be a new and renamed to_string() for floating point values, and what would it be called?

u97...@gmail.com

Feb 28, 2016, 8:13:20 AM2/28/16
to ISO C++ Standard - Discussion, u97...@gmail.com
To move the matter forward, this discussion has been continued in the 'Future Proposals' forum.