[Proposal] Deprecate default String.Chars implementation for Float


Bryan Enders

Dec 2, 2016, 9:39:44 AM
to elixir-lang-core
String.Chars.to_string/1 is only implemented by default for types that can be represented with unambiguous precision… with the exception of Float. The ambiguity with regard to representing tuples and maps (and by extension structs) would seem to be the primary reason for omitting a default implementation for those types. The same could be said of floats. A programmer attempting to interpolate 1_000.0 into a string might be quite surprised to discover that the string representation is "1.0e3". They might be even more surprised to discover that 12_123_123_123_123_123_123.0 is rounded when interpolated, yielding "1.2123123123123122e19". The protocol implementation must make formatting assumptions with regard to precision when it comes to floats. I would recommend the eventual deprecation of the default String.Chars implementation for Float.

Chris McCord

Dec 2, 2016, 9:51:24 AM
to elixir-l...@googlegroups.com
I am -1 on this one. I agree that explicit formatting and rounding will be required when you care about how the float is printed, but the utility of not having to format the float for interpolation/logging for such a core type is worth keeping the behavior, in my opinion.


Bryan Enders

Dec 2, 2016, 9:56:03 AM
to elixir-lang-core
Note: Unlike String.Chars.to_string/1, Kernel.inspect/1 is allowed to take liberties. Consequently, it can provide string representations of all types.

Bryan Enders

Dec 2, 2016, 10:08:30 AM
to elixir-lang-core
> the utility of not having to format the float for interpolation/logging for such a core type is worth keeping the behavior

Couldn’t the same be said of all the basic types, including tuples, maps, ranges, and processes (arguably the essential type of Erlang/Elixir)? Providing a convenient string representation that takes liberties with formatting is the purpose of Kernel.inspect/1, not String.Chars.to_string/1. Wouldn’t the most consistent behavior be to expect the programmer to inspect the float they wish to interpolate/log, just as they must with a tuple, map, range, struct, process, etc.?

José Valim

Dec 2, 2016, 10:09:17 AM
to elixir-l...@googlegroups.com
> String.Chars.to_string/1 is only implemented by default for types that can be represented with unambiguous precision… with the exception of Float

A floating-point number can be represented as a string with no ambiguity. There are many papers that explain how to do so quickly and accurately; the earliest and latest I am aware of are:

https://lists.nongnu.org/archive/html/gcl-devel/2012-10/pdfkieTlklRzN.pdf

What is not possible is the opposite: to get a string representing a number and represent that accurately as a float. So the ambiguity in your example is not from when the float is converted to a string but from when we parse the string in your Elixir source code and attempt to represent it as a float.
 
Therefore to_string is behaving as expected.
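
A minimal sketch of this point, using the literal from the original post. The second literal and the variable names `a`, `b`, and `text` are assumptions added for illustration. It shows that the information is lost when the source literal is parsed into a 64-bit float, and that the string produced by to_string/1 reads back as exactly the same float.

```elixir
# Two different decimal literals; they differ only in the last digit.
a = 12_123_123_123_123_123_123.0
b = 12_123_123_123_123_123_122.0

# At this magnitude adjacent 64-bit floats are 2048 apart, so both
# literals are parsed into the same float value.
IO.inspect(a == b)
# expected: true

# Converting the float to a string and reading it back loses nothing.
text = to_string(a)
IO.inspect(String.to_float(text) == a)
# expected: true
```

The exact text held in `text` is left unasserted here, since the chosen notation may vary between Elixir and OTP versions.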

José Valim
Skype: jv.ptec
Founder and Director of R&D


José Valim

Dec 2, 2016, 10:15:02 AM
to elixir-l...@googlegroups.com
Sorry, my earliest and latest references were wrong. Here are the papers I am aware of with the known algorithm names in parentheses:






Bryan Enders

Dec 2, 2016, 10:48:12 AM
to elixir-lang-core, jose....@plataformatec.com.br
That’s a good point, José. It does make unexpected assumptions with regard to formatting (scientific notation when it’s shortest, decimal otherwise), but I was incorrect in asserting that it makes assumptions with regard to precision. The String.Chars implementation for Date makes formatting assumptions, but at least they are backed by the ISO 8601 standard. Most programmers would not be surprised that interpolating ~D[1970-01-01] results in "1970-01-01". I do think they’d be surprised that interpolating 1000.0 would result in "1.0e3" while interpolating 1000.5 would result in "1000.5". The default implementation would seem to violate the Principle of Least Astonishment.

If providing obvious representations of basic types that can be represented obviously is not the purpose of String.Chars.to_string/1 (as opposed to Kernel.inspect/1), then shouldn’t there be default implementations for all the primitive types in Elixir?


José Valim

Dec 2, 2016, 11:02:59 AM
to Bryan Enders, elixir-lang-core
We can discuss improving the formatting for floats, but it is important to leave the Principle of Least Astonishment at the door, because different people are going to have different expectations of what is least surprising and we would be unable to reach an agreement.

We implement Kernel.to_string for types that have a meaning outside of the Elixir environment. If I show {1, 2, 3} to a non-programmer, it is unlikely they would understand what it means. Therefore, we don't implement the to_string protocol for tuples because we want the programmer to think about how that term should be formatted before printing it in a GUI or a web UI.

On the other hand, the formatting of floats is quite well known and understood. I can't recall exactly where I learned it but I was certainly aware of it by high school.



Bryan Enders

Dec 2, 2016, 11:31:38 AM
to elixir-lang-core, pinebo...@gmail.com, jose....@plataformatec.com.br
That makes sense. In that case, I retract my proposal. I do think we should discuss changing the formatting of floats, especially since (scientific) E notation is discouraged in general communication.

José Valim

Dec 2, 2016, 11:47:32 AM
to Bryan Enders, elixir-lang-core
The problem is that we cannot correctly encode a float beyond a certain number of decimal digits. For example, a float like 123_123_123_123_123_123_123 cannot correctly be shown as 123123123123123123 because there is no precision in the last digits. In such cases a sort of compaction, such as scientific notation, is necessary. If anyone is aware of a better format for floats, I am all ears.




Bryan Enders

Dec 2, 2016, 3:48:29 PM
to elixir-lang-core, pinebo...@gmail.com, jose....@plataformatec.com.br
The current behavior (shortest representation, whether E notation or decimal) is perfect for IO.inspect and Kernel.inspect. For Kernel.to_string I would use decimal representation unless the float is large enough that compaction occurs, in which case I would use E notation.

José Valim

Dec 2, 2016, 4:06:31 PM
to Bryan Enders, elixir-lang-core
I believe that's how it works today, or at least how it will work in Elixir v1.4 (i.e. it is what I see in my terminal).


Bryan Enders

Dec 2, 2016, 7:19:58 PM
to elixir-lang-core, pinebo...@gmail.com, jose....@plataformatec.com.br
v1.4 is using :io_lib_format.fwrite_g/1, which provides the ‘shortest, correctly rounded string that converts to Float when read back with list_to_float/1’. That means that 1_000.0 will be represented as "1.0e3" instead of "1000.0" since the former is shorter. That’s appropriate for Kernel.inspect/1 to be consumed by a programmer, but not for Kernel.to_string/1 for a GUI or web UI to be consumed by non-programmers.
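
For comparison, a minimal sketch of the explicit formatting a programmer can already opt into, assuming the Erlang BIF :erlang.float_to_binary/2 with its `decimals` option. The variable names `value`, `default`, and `fixed` are illustrative only. The default conversion is printed but not assumed, since its notation depends on the Elixir and OTP version in use.

```elixir
value = 1_000.0

# Default protocol conversion. On the versions discussed in this thread it
# picks the shortest representation, which may be E notation.
default = to_string(value)

# Explicit fixed-point formatting with exactly one decimal place.
fixed = :erlang.float_to_binary(value, decimals: 1)

IO.puts("default: #{default}")
IO.puts("fixed:   #{fixed}")
# the second line prints "fixed:   1000.0"
```

Whichever notation the default picks, `String.to_float(default)` should give back the original float, so the two outputs differ in presentation rather than in value.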

José Valim

Dec 3, 2016, 6:25:15 AM
to Bryan Enders, elixir-lang-core
Thanks for the example, I will investigate it.



