
exact strtod


already...@yahoo.com
Jul 28, 2018, 3:50:15 PM
A question for numeric experts: how many bits of precision does one need in intermediate calculations in order to guarantee an "exact" (i.e. correctly rounded) result from strtod(), assuming that the type 'double' is defined as IEEE-754 binary64?

It seems to me ~1400 bits are needed in the worst case, but I am not 100% sure.

Terje Mathisen
Jul 28, 2018, 5:18:21 PM
I have never seen any specification for strtod() which puts limits on
the number of decimal digits in the input; this means that you can
easily construct a string that will take arbitrarily many bits.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

already...@yahoo.com
Jul 28, 2018, 5:52:47 PM
I meant a not-completely-dumb implementation.
Something like this: you load up to N bits of the significand, and if there are more digits then you scan them for the presence of non-zeros. If at least one non-zero digit is found, you set the LS bit of the significand to 1. As for the exponent, you test whether it is obviously too big in either direction, and if it is then no further calculations are needed.
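[Editor's sketch: the scheme above, shown on the decimal side for clarity (the post applies the sticky trick to the binary significand, and as the rest of the thread establishes, a sticky bit alone cannot settle the hardest halfway cases). `truncate_with_sticky` is a hypothetical helper:]

```python
def truncate_with_sticky(digits: str, n: int):
    """Keep the first n significant digits of a decimal string; if any
    dropped digit is nonzero, record that in a sticky flag (the 'set the
    LS bit' trick described above)."""
    kept = digits[:n]
    sticky = any(d != "0" for d in digits[n:])
    return kept, sticky

kept, sticky = truncate_with_sticky("123450000000000000000001", 17)
assert (kept, sticky) == ("12345000000000000", True)   # trailing '1' was dropped
assert truncate_with_sticky("12300", 17) == ("12300", False)
```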

John Levine
Jul 28, 2018, 5:54:53 PM
>> A question to numeric experts: how many bits of precision one needs
>> in intermediate calculations in order to guarantee "exact" (i.e.
>> correctly rounded) result of strtod() assuming that the type 'double'
>> defined as IEEE-754 binary64?
>>
>> It seems to me ~1400 bits are needed in the worst case, but I am not
>> 100% sure.
>>
>I have never seen any specification for strtod() which puts limits on
>the number of decimal digits in the input, this means that you can
>easily construct a string that will take arbitrarily many bits.

Really? If the result is an IEEE double, I would assume that beyond
some point more input digits can make no difference. The question was
how much precision do you need to be sure that the output is the
closest value to the input?

--
Regards,
John Levine, jo...@iecc.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

kr...@berkeley.edu
Jul 28, 2018, 6:18:59 PM
On Saturday, July 28, 2018 at 12:50:15 PM UTC-7, already...@yahoo.com wrote:
> A question to numeric experts: how many bits of precision one needs in intermediate calculations in order to guarantee "exact" (i.e. correctly rounded) result of strtod() assuming that the type 'double' defined as IEEE-754 binary64?
>
> It seems to me ~1400 bits are needed in the worst case, but I am not 100% sure.

@inproceedings{Clinger:1990:RFP:93542.93557,
author = {Clinger, William D.},
title = {How to Read Floating Point Numbers Accurately},
booktitle = {Proceedings of the ACM SIGPLAN 1990 Conference on Programming Language Design and Implementation},
series = {PLDI '90},
year = {1990},
isbn = {0-89791-364-7},
location = {White Plains, New York, USA},
pages = {92--101},
numpages = {10},
url = {http://doi.acm.org/10.1145/93542.93557},
doi = {10.1145/93542.93557},
acmid = {93557},
publisher = {ACM},
address = {New York, NY, USA},
}

already...@yahoo.com
Jul 28, 2018, 7:21:33 PM
Thank you.
Unless better algorithms have been discovered since 1990, it sounds like Terje is right and in the worst case the required precision is unbounded.

already...@yahoo.com
Jul 28, 2018, 7:23:21 PM
On Sunday, July 29, 2018 at 12:54:53 AM UTC+3, John Levine wrote:
>
> Really? If the result is an IEEE double, I would assume that beyond
> some point more input digits can make no difference. The question was
> how much precision do you need to be sure that the output is the
> closest value to the input?
>

Yes.

Ivan Godard
Jul 28, 2018, 8:45:19 PM
Not really. If at any point you truncate the remaining digits of the
input and figure a double using the retained digits, and also figure
one using the retained digits incremented in the last retained
place, and the two doubles are the same, then the digits that were
truncated can have no impact on the result and need not be examined.

A circuit to do both at once would be similar to a carry-select adder,
but seems unlikely to be worthwhile.
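[Editor's sketch of this test in Python, using exact rationals as scaffolding; `truncation_suffices` is a hypothetical helper, and it relies on CPython's float() of a Fraction being correctly rounded:]

```python
from fractions import Fraction

def truncation_suffices(digits: str, exp10: int, n: int) -> bool:
    """Round the value of the first n digits, and of the first n digits
    incremented by one in the last retained place.  If both give the
    same double, the truncated tail cannot affect the result."""
    if len(digits) <= n:
        return True
    scale = Fraction(10) ** (exp10 + len(digits) - n)
    prefix = int(digits[:n])
    return float(prefix * scale) == float((prefix + 1) * scale)

# One extra unit in the 16th digit straddles two doubles here, so the
# tail would have to be examined:
assert not truncation_suffices("90071992547409925", 0, 16)

# When the test passes, truncating really is safe (rounding is monotone,
# so everything between the two endpoints rounds the same way):
digits = "123456789123456789123456789"
if truncation_suffices(digits, 0, 17):
    assert float(Fraction(int(digits))) == float(int(digits[:17]) * Fraction(10) ** 10)
```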

kr...@berkeley.edu
Jul 28, 2018, 9:06:25 PM
On Saturday, July 28, 2018 at 4:21:33 PM UTC-7, already...@yahoo.com wrote:
> Unless better algorithms were discovered since 1990, it sounds like Terje is right and in the worst cases a required precision is unbounded.

Haven't checked this, but it suggests <1200 bits for double precision.
https://arxiv.org/abs/1310.8121v1

Bruce Hoult
Jul 28, 2018, 11:01:05 PM
This paper is the "bible", widely cited, and states that unbounded precision is needed, but it's in fact not correct.

For Innaworks' "Alchemo" Java-to-Native compiler for pre-iPhone and Android phones I proved that only a bounded number of bits were required regardless of the length of the input string, and implemented a corresponding algorithm in 2007.

The key is that you can (notionally, and incrementally) find the decimal expansions of two adjacent IEEE double values, and once their digits diverge and the string of digits being converted lies between them, you only have to decide which one the input is closer to; you need not even parse the remaining decimal digits.

I don't recall offhand how many bits are required. It's a little less than 1400 bits. Somehow 168 bytes, 1344 bits, seems to ring a bell.

I wanted to publish the result at the time, and went so far as to contact Guy Steele (who wrote the corresponding binary-to-decimal paper), who agreed with my conclusions, but my boss was against publication, regarding the algorithm as some kind of competitive advantage on tiny devices. In total well under 1 KB of RAM is required, worst case, for perfect conversion of arbitrarily long input strings (assuming you're streaming the input, not storing it).

Bruce Hoult
Jul 28, 2018, 11:02:53 PM
Exactly.

Bruce Hoult
Jul 28, 2018, 11:15:44 PM
I found an email chain from December 2007. The work was done a few months before that, but 14 December was the Victoria University of Wellington Christmas BBQ. James Noble is the professor there (meaning head of department), and Lindsay Groves is another very senior guy. I see now that I got responses from both Guy Steele and Will Clinger:

From: br...@hoult.org
Subject: reading floats
Date: 14 December 2007 12:34:34 AM
To: k...@mcs.vuw.ac.nz

Hi James,

It was good to talk to you at the BBQ today.

I just did a quick check on the Java source. Double.valueOf(String)
and Double.parseDouble(String) call through to
sun.misc.FloatingDecimal.readJavaFormatString(String) to do the
parsing, and sun.misc.FloatingDecimal.doubleValue() to convert the
digits to a double.

https://openjdk.dev.java.net/source/browse/openjdk/jdk/trunk/jdk/src/share/classes/sun/misc/FloatingDecimal.java?rev=257&view=markup

The guts of it is that the parsing allocates a char[] for the mantissa
digits that is large enough to hold the entire input string, after
trimming whitespace but still including sign and exponent. Except for
the easy special cases pointed out by Clinger (and Gay), doubleValue()
then does in fact use a bignum (FDBigInt) of the entire input mantissa
string, and another the same size calculated from the fp double
estimate value in order to come up with an adjustment to the estimated
double value.

That is, the bignums used are in fact of unbounded size, depending on
the length of the mantissa in the input string (less any leading and
trailing zeroes). i.e. if you feed it a string consisting of a few
million digits of pi then it's going to use a few MB of additional
working space, quite needlessly.


Would you like me to give you anything before you contact Guy Steele?
It certainly wouldn't take long to whip up a page or so of the basic
motivations, idea, and advantages.

I think it is very interesting from a theoretical perspective that I'm
effectively showing that a Finite State Machine is adequate to produce
correctly rounded floats from an input stream, contrary to the claims
in the literature. For example, the first paragraph of Clinger's
paper:

"Consider the problem of converting decimal scientific notation
for a number into the best binary floating point approximation
to that number, for some fixed precision. This problem cannot be
solved using arithmetic of any fixed precision. Hence the IEEE
Standard for Binary Floating-Point Arithmetic does not require
the result of such a conversion to be the best approximation."

What I am claiming to show is that this is false, and that in fact a
fixed precision of only a few bits more than that required for correctly
rounded printing (as laid out in the Steele and White paper) is
sufficient. In fact for double precision IEEE floating point numbers a
handful of multiple precision integers of 36 words each (144 bytes,
1152 bits, total size of less than 1 KB) is always sufficient, even if
presented with extremely long input strings megabytes in size.

How to Read Floating Point Numbers Accurately
William D Clinger
http://portal.acm.org/citation.cfm?id=93557

How to Print Floating-Point Numbers Accurately
Guy L. Steele Jr. & Jon L White
http://portal.acm.org/citation.cfm?id=93559

Both in:
Proceedings of the ACM SIGPLAN '90 Conference on
Programming Language Design and Implementation.
White Plains, New York, June 20-22, 1990.

Best regards,
Bruce

----

I didn't mention the actual idea in the previous email, so here it is,
as applied to double precision IEEE floats:

1) convert the input number with the mantissa truncated to the first
17 - 19 digits into the nearest double precision number using any
existing technique or library. 17 is the minimum necessary to
guarantee 0.5 ULP precision in selecting that double, but 19 fits into
a 64 bit integer which may be handy (especially if the entire mantissa
is only 18 or 19 digits anyway).

2) if you didn't truncate the mantissa then you're done.

3) the maximum error relative to the truncated mantissa is 0.5 ULP.
The error caused by truncating the mantissa is also strictly less than
0.5 ULP. The correct answer is either the double you already have or
else its successor.

4) start converting the midpoint of your double and its successor to
decimal, using for example the Steele and White (FPP)^2 algorithm but
with the terminating condition being R becoming equal to zero
(implying an infinite number of trailing zeroes) rather than all that
mucking about being clever with low and high.

5) compare the generated digits to the digits in the original input
string. If you know that the string already passed fp parsing then
it's very simple. Just skip over any mixture of whitespace, signs,
zeroes, and decimal points to find the first non-zero digit. (special
case: input equal to zero). Then start scanning digits (ignoring
decimal points) and compare with the digits being generated in 4). Or
you can use information saved in the parsing stage. It depends on
whether you're using an existing library as a pure black box in step
1) or are able to modify it (or implement from a clean sheet).

6) if a digit matches then keep going. If a digit does not match then
stop immediately and you know which way to round. If the input number
terminates first then round down. If R in the generator loop becomes
zero first then round up. If both terminate at the same time then
round to even.

Note that there is no need to store the generated digits. Compare
them and discard.
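[Editor's sketch of steps 1)-6) in Python. Exact rationals and math.nextafter (Python 3.9+) deliberately stand in for the bounded digit-by-digit comparison, so this shows only the decision logic, not the constant-space property that is the point of the thread; `read_double` and its 17-digit cutoff are this sketch's assumptions:]

```python
import math
import struct
from fractions import Fraction

def _even_significand(x: float) -> bool:
    """Low bit of the binary64 encoding, for the ties-to-even case."""
    return struct.unpack("<Q", struct.pack("<d", x))[0] % 2 == 0

def read_double(digits: str, exp10: int) -> float:
    """Round int(digits) * 10**exp10 (positive, no leading zeros, no
    overflow) to the nearest double, following steps 1)-6)."""
    n = 17                                    # step 1: enough for a 0.5 ULP estimate
    e = exp10 + max(len(digits) - n, 0)
    x = float(int(digits[:n]) * Fraction(10) ** e)
    if len(digits) <= n:
        return x                              # step 2: nothing was truncated
    succ = math.nextafter(x, math.inf)        # step 3: the only other candidate
    mid = (Fraction(x) + Fraction(succ)) / 2  # step 4: exact midpoint
    v = int(digits) * Fraction(10) ** exp10   # steps 5-6, by exact comparison
    if v < mid:
        return x                              # input "terminates first": round down
    if v > mid:
        return succ                           # midpoint terminates first: round up
    return x if _even_significand(x) else succ  # exact tie: round to even

# Spot-check against CPython's own correctly rounded Fraction -> float path.
for s, e in [("123456789123456789123456789", -10),
             ("99999999999999999999999999999999", -5)]:
    assert read_double(s, e) == float(int(s) * Fraction(10) ** e)
```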

Long mantissas that are not close to halfway between two doubles will
be examined for only a very short distance.

Even in the worst case, the decimal representation of all doubles
terminates. The longest are the numbers midway between denorms. For
example, the smallest number that rounds to non-zero is the number
halfway between zero and the smallest denorm:

[Note: I have since realized that in fact Java gets this one wrong.
Due to "round to even" this should read as 0.0, and you need to
increase the 5 to a 6, or add e.g. "000000000000001" on the end to
make it go to 5E-324. Both my implementation and David Gay's "dtoa.c"
agree on this.]

0.0000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000024703282292062327208828439643411068
618252990130716238221279284125033775363510437593264991818081
799618989828234772285886546332835517796989819938739800539093
906315035659515570226392290858392449105184435931802849936536
152500319370457678249219365623669863658480757001585769269903
706311928279558551332927834338409351978015531246597263579574
622766465272827220056374006485499977096599470454020828166226
237857393450736339007967761930577506740176324673600968951340
535537458516661134223766678604162159680461914467291840300530
057530849048765391711386591646239524912623653881879636239373
280423891018672348497668235089863388587925628302755995657524
455507255189313690836254779186948667994968324049705821028513
185451396213837722826145437693412532098591327667236328125

That's 323 zeroes followed by 752 decimal digits, which are those of
5^1075.

python -c 'print "0."+"0"*323+str(5**1075)'

Inputting this to, for example, Double.parseDouble(String) correctly
produces the smallest denorm (5E-324, though Sun Java prints it with
spurious precision as 4.9E-324).

Changing the last digit from a 5 to a 4 correctly produces a result
of 0.
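[Editor's note: CPython's float() is correctly rounded (it is based on David Gay's conversion code), so the bracketed correction above is easy to check: the string is exactly 2^-1075, a tie that rounds to even (zero), and any nonzero tail pushes it up to the smallest denorm:]

```python
import math

# 323 zeros then the digits of 5**1075: exactly 2**-1075, the midpoint
# between 0.0 and the smallest denorm 2**-1074.
s = "0." + "0" * 323 + str(5 ** 1075)

assert float(s) == 0.0                           # exact tie: round-to-even gives 0.0
assert float(s + "1") == math.ldexp(1.0, -1074)  # above the tie: smallest denorm
assert math.ldexp(1.0, -1074) == 5e-324          # the value Java prints as 4.9E-324
```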

The midpoints between other denorms are odd multiples of this number.
They terminate at the same number of places after the decimal, but
have up to 53 bits worth of extra significant digits (fewer leading
zeroes).

Thus, even if presented with an input string consisting of millions of
mantissa digits, you can *always* stop comparing after generating and
examining no more than 770 digits.


I have implemented this technique in an upcoming release of our
"alcheMo" J2ME to BREW porting product. Both parsing and printing
floating point numbers are achieved not only with correct rounding,
but also with no consing (e.g. when appending or inserting to a
StringBuffer with sufficient free space).

I also intend to implement it in one or both of the Dylan compilers
maintained at http://www.opendylan.org, a group I have been a member
of since 1998. (I have also been captain of a team using Dylan to
enter the ICFP programming contest on a regular basis, including
gaining 2nd place twice and the Judge's prize twice).

I also intend to implement it as a modification to David Gay's
"dtoa.c", which appears to be the current industry standard, appearing
in everything from the iPhone up to mainframes.

In addition to the obvious speed advantages and space advantages of
not using unbounded bignums in extreme cases, this technique also
allows the use of stack-allocated bignums instead of heap-allocated
ones, which may further increase the speed, or perhaps even allow an
FSM implementation in hardware.


So, yes, I would be grateful if you could bring this to Guy's
attention and see what he thinks about it.

Cheers,
Bruce

------------------------------------------------------------------------------

from James Noble <k...@mcs.vuw.ac.nz>
to Guy Steele <Guy.S...@sun.com>,
cc Bruce Hoult <br...@hoult.org>,
Lindsay Groves <lin...@mcs.vuw.ac.nz>,
date Dec 14, 2007 4:30 PM
subject reading fixed precision floating point numbers

Hi Guy

so I'm writing to ask some advice about an interesting conversation
I had yesterday at our research group's Christmas party.

Bruce Hoult, a Wellington programmer, was asking for
advice about an algorithm he has developed and implemented to read
floating point numbers that requires only constant space (a little more
than the precision to which the number will be stored) rather than
working via arbitrary precision bignums.

I suggested he email you about this, but he thought you'd be more
likely to read emails from someone whom you know.

So, Bruce (and I) would be very interested in your opinion
of this work - whether you're aware of anything similar,
and whether you think e.g. somewhere like PLDI would
still be interested in work on this topic.

I've copied in Bruce's emails below, if you've any thoughts
about this, we'd love to know what they are.

cheers

James

------------------------------------------------------------------------------

from Guy Steele <Guy.S...@sun.com>
to James Noble <k...@mcs.vuw.ac.nz>,
cc Bruce Hoult <br...@hoult.org>,
Lindsay Groves <lin...@mcs.vuw.ac.nz>,
date Dec 15, 2007 7:15 AM
subject Re: reading fixed precision floating point numbers

Yes, this sounds VERY interesting.

High order bit: I would suggest getting Will Clinger to take
a look at this as well, because he has studied the "read" problem
more closely than I have (I focused on the "print" problem,
which is related but a little different). You can reach him at
wi...@ccs.neu.edu .

I have read through the notes below and I think I grasp most
of the idea: assuming you can solve the print problem with limited
memory, you do a read on a truncated version of the input
to produce a result x such that reading the entire input ought
to produce either x or nextafter(x); then you print the value
(x+nextafter(x))/2, presumably using one extra bit of precision
and compare the result to the entire original input digit string.

The two parts I am unsure of are:

(1) I don't quite see how you do the read on the truncated input
string using limited memory in the situation where the input
consists of a decimal point and many zeros before the first
nonzero digit. The description below says:

> 1) convert the input number with the mantissa truncated to the first
> 17 - 19 digits into the nearest double precision number using any
> existing technique or library. ...

It speaks of "mantissa" rather than "fraction", so I'm not sure
whether it means the first 17 - 19 digits of the input string
or the first 17 - 19 *significant* digits. If the former, then
the first 17 digits of 0.00000000000000000000123 are all
zero, so the trial input x is zero, and (x+nextafter(x))/2 is not
the correct value. If the latter, then scaling by a potentially
large factor of 10 is required.

(2) Similarly, I am not sure how the space required by the
printing part of the process can be limited in all cases.

But I am very eager to hear more details that would eliminate
my puzzlement or confusion. If this algorithm is indeed as claimed,
it would surely merit presentation at PLDI. And if he were to
provide a plug-compatible replacement for David Gay's work
with improved performance, that would be a *great* service to
the community.

--Guy

------------------------------------------------------------------------------

James Noble to Bruce, Lindsay



Hi Bruce

WOW!

so you should be pretty pleased with this..

I presume you'll email Will Clinger yourself?
but let me know how you go,
and obviously I'm keen to help out as I can
(as, I'm sure, is Lindsay)

James

--------------------------------------------------------------

Lindsay Groves to Bruce, James



Hi Bruce

James said:
>WOW!
>
>so you should be pretty pleased with this..

I agree. That's a very encouraging response.

>I presume you'll email Will Clinger yourself?
>but let me know how you go,
>and obviously I'm keen to help out as I can
>(as, I'm sure, is Lindsay)

Indeed I am.

I haven't read through your initial message, or the follow-up, in detail, but
will do over the next few days. Maybe we should get together soonish and talk
about where to go with this. It looks like the PLDI cycle is papers due in
November and conference in June, so we've missed 2008. We could aim to have
something really solid prepared for next November for PLDI 2009, or look for
something else earlier which is suitable.

Cheers
Lindsay

--------------------------------------------------------------


from Bruce Hoult <br...@hoult.org>
to Guy Steele <Guy.S...@sun.com>,
cc James Noble <k...@mcs.vuw.ac.nz>,
Lindsay Groves <lin...@mcs.vuw.ac.nz>,
date Dec 15, 2007 10:41 AM
subject Re: reading fixed precision floating point numbers

On Dec 15, 2007 7:15 AM, Guy Steele <Guy.S...@sun.com> wrote:
> Yes, this sounds VERY interesting.

Thank you.


> I have read through the notes below and I think I grasp most
> of the idea: assuming you can solve the print problem with limited
> memory, you do a read on a truncated version of the input
> to produce a result x such that reading the entire input ought
> to produce either x or nextafter(x); then you print the value
> (x+nextafter(x))/2, presumably using one extra bit of precision
> and compare the result to the entire original input digit string.

Nice summary.

The extra bit and the averaging are easy to come by, since when you
unpack x's mantissa into an integer of the same size as x you have a
lot of extra bits available. Also, approaching nextafter(x) from
below, you don't have to cope with the varying density of values above
and below exact 2^N values. (x+nextafter(x))/2 requires only
"f=(f<<1)+1; --e" (assuming e is already set to the same value as for
denorms in the case that x is zero, which is automatically true for
IEEE formats)
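[Editor's sketch: the f=(f<<1)+1; --e trick can be checked directly; struct and math.nextafter (Python 3.9+) are used only to expose the binary64 fields:]

```python
import math
import struct
from fractions import Fraction

def midpoint_via_shift(x: float) -> Fraction:
    """Compute (x + nextafter(x))/2 exactly via f=(f<<1)+1; --e:
    unpack the binary64 fields, double the significand, add one,
    and drop the exponent by one."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    exp_field = (bits >> 52) & 0x7FF
    frac = bits & ((1 << 52) - 1)
    if exp_field == 0:                    # zero or denorm: no hidden bit,
        f, e = frac, -1074                # exponent already at the minimum
    else:                                 # normal: restore the hidden 1
        f, e = (1 << 52) | frac, exp_field - 1075
    f = (f << 1) + 1                      # one extra bit of precision
    e -= 1
    return Fraction(f) * Fraction(2) ** e

# Agrees with the literal definition for normals, powers of two,
# denorms, and zero (the case called out in the parenthetical above).
for x in [1.0, 0.1, 2.0 ** 100, 5e-324, 0.0]:
    exact = (Fraction(x) + Fraction(math.nextafter(x, math.inf))) / 2
    assert midpoint_via_shift(x) == exact
```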


> The two parts I am unsure of are:
>
> (1) I don't quite see how you do the read on the truncated input
> string using limited memory in the situation where the input
> consists of a decimal point and many zeros before the first
> nonzero digit. The description below says:
>
> > 1) convert the input number with the mantissa truncated to the first
> > 17 - 19 digits into the nearest double precision number using any
> > existing technique or library. ...
>
> It speaks of "mantissa" rather than "fraction", so I'm not sure
> whether it means the first 17 - 19 digits of the input string
> or the first 17 - 19 *significant* digits.

Sorry, I just typed up those notes very quickly after leaving the BBQ
with James, so as to have *something* to show you, and the terminology
is a little sloppy.

I meant the first 17 significant digits.


> If the former, then
> the first 17 digits of 0.00000000000000000000123 are all
> zero, so the trial input x is zero, and (x+nextafter(x))/2 is not
> the correct value. If the latter, then scaling by a potentially
> large factor of 10 is required.

Right. In my current implementation I count the leading zeroes and
ignore them and later subtract the count from the exponent. In fact
you do that for anything you find after the decimal anyway e.g.
123.4567e10 is treated as 1234567e6.
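[Editor's sketch of the normalization described here (decimal point folded into the exponent, leading zeros counted off and subtracted); `normalize` is a hypothetical toy helper handling only plain positive literals with a lowercase 'e':]

```python
def normalize(s: str):
    """Reduce a decimal literal to (digits, exp10), with no decimal point
    and no leading zeros, so that the value is int(digits) * 10**exp10.
    E.g. '123.4567e10' -> ('1234567', 6), as in the example above."""
    mant, _, exp = s.partition("e")
    exp10 = int(exp) if exp else 0
    intpart, _, fracpart = mant.partition(".")
    exp10 -= len(fracpart)          # digits after the point lower the exponent
    digits = (intpart + fracpart).lstrip("0") or "0"
    return digits, exp10

assert normalize("123.4567e10") == ("1234567", 6)
assert normalize("0." + "0" * 20 + "123") == ("123", -23)
```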

Interestingly, the Clinger paper (and everything else I've seen in the
literature) assumes that you have *already* done the parsing -- and in
fact already converted the fraction into a binary integer and suitably
adjusted the exponent -- *before* you even get into the algorithm
being discussed.

I suspect this may be a reason that no one seems to have realized what
a valuable resource the original decimal string is for getting the
rounding right!


> (2) Similarly, I am not sure how the space required by the
> printing part of the process can be limited in all cases.

Simply because the machine floating point formats are of fixed
precision and so the bignums used by the algorithms published by
yourself and White naturally achieve only limited sizes. The largest
you have to cope with in printing is approximately the mantissa bits
shifted left by the exponent (i.e. 53 bits shifted left by 1023 places
for IEEE doubles), plus a little bit more.

These integers aren't small but they're not exactly large, at least
until someone invents a new FP format with a few more exponent bits
:-)
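[Editor's note: the sizes involved are easy to confirm; the largest finite double, as an exact integer, is the 53-bit significand shifted left by 971 places, i.e. 1024 bits (sys.float_info is just the standard way to get DBL_MAX in Python):]

```python
import sys

# DBL_MAX = (2**53 - 1) * 2**971: a full 53-bit significand shifted far
# left, so its exact integer value needs 53 + 971 = 1024 bits.
big = int(sys.float_info.max)
assert big == (2 ** 53 - 1) * 2 ** 971
assert big.bit_length() == 1024
```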


> But I am very eager to hear more details that would eliminate
> my puzzlement or confusion. If this algorithm is indeed as claimed,
> it would surely merit presentation at PLDI. And if he were to
> provide a plug-compatible replacement for David Gay's work
> with improved performance, that would be a *great* service to
> the community.

The time and space performance on "normal" inputs of the size produced
by printing floating point numbers would be essentially unchanged from
that of the existing implementation. The only gain would come from
being able to reimplement the bignum operations to avoid dynamic
allocation, as they now require only a fixed upper size. This is
likely to be fairly minor.

Time is expected to be quite significantly improved for the "stupidly
long" input corner cases, though I don't have numbers yet as I have so
far only implemented my algorithm on BREW mobile phones where I don't
have a conventional correctly rounded algorithm to compare against
(the one built into BREW itself is junk). I was more interested in
the space (including code space) aspects there so have made a number
of space vs speed tradeoffs such as calculating powers of ten from
scratch each time rather than using a large table, and strictly
limiting the number of operations available in my bignum library to
the bare minimum.

I expect to find that if I truncate at N digits then performance for,
say, N+1 and N+2 digits may be slightly slower than for the current
method, due to the generation of the first 17 digits in the printing
stage taking a finite amount of time while being essentially useless.
I see several possible responses to this:

a) it doesn't matter

b) if you truncate at a larger number of digits than strictly
necessary (e.g. 19) and pay careful attention to the error bounds then
in 99%+ of input cases it can be proved that the ignored digits can't
affect the result anyway, so you don't need to perform the "printing"
stage.

c) if the input fraction is less than or equal to N digits (19, say)
then don't truncate it, but if the fraction is more than N digits then
truncate it to 17. There will exist a value of N for which reading
the truncated number and then printing the first N digits takes less
time than reading the full number. In this case I expect my algorithm
to strictly dominate existing ones, due merely to avoiding dynamic
allocation for small inputs, but with big O improvements for large
inputs.


Just a word on big O: converting a long decimal number to a bignum
binary by the usual method of multiplying by 10 and adding the next
digit is O(N^2) in the number of input digits, so any conventional
algorithm is unavoidably no less than O(N^2).
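[Editor's sketch: the multiply-by-10 conversion mentioned here, written out, makes the quadratic cost visible — each step multiplies an ever-growing bignum by 10:]

```python
def decimal_to_int(digits: str) -> int:
    """The usual Horner-style conversion: multiply the accumulator by 10
    and add the next digit.  The accumulator grows by one digit per step,
    and each multiply touches all of it, so total work is O(N^2)."""
    value = 0
    for d in digits:
        value = value * 10 + (ord(d) - ord("0"))
    return value

assert decimal_to_int("98765432109876543210") == 98765432109876543210
```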

If you don't have to worry about malformed input then my modification
is O(1), since even for an input with millions of digits you can pick
the first significant digits of the fraction off the start of the
string (modulo leading zeroes), and the exponent off the end, convert
to the estimate x, and then generate no more than about 770 digits to
compare against the input string.

If you do have to worry about malformed input (and of course you do)
then the parsing is unavoidably O(N), and therefore so is my overall
algorithm.


Guy, thank you very much for your response!

Although I already have one working implementation of this (which will
be shipping to customers by the end of 2007) I know that I have quite
a lot of work still to do to put together a publishable paper,
especially at the PLDI level. What I think you have done is confirm
that it is worth putting in that effort, and I will further check that
with Will Clinger.

Best regards,
Bruce

-------------------------------------------------

from Bruce Hoult <br...@hoult.org>
to Will Clinger <wi...@ccs.neu.edu>,
cc Guy Steele <Guy.S...@sun.com>,
James Noble <k...@mcs.vuw.ac.nz>,
Lindsay Groves <lin...@mcs.vuw.ac.nz>,
date Dec 16, 2007 2:18 PM
subject new idea for reading floating point numbers



Hi Will,

I believe I've got a new wrinkle on reading correctly rounded floating
point numbers. It's been run past Guy Steele and he also thinks it's
new but that I should run it past you to make doubly sure.

The background is that I'm a programmer for a small company called
Innaworks in Wellington, NZ, where we are making development tools for
people creating programs for small mobile devices (read: games for
cellphones...). Our current product which has been publicly available
since June is marketed as a "porting tool" for making Java
MicroEdition programs run on non-Java phones such as BREW. Of course
what it really is is a Java-compatible compiler and runtime system.
I'm responsible for the garbage collector, exception handling system,
threading library and the lowest level Java libraries (Strings, arrays
and Vectors, hash tables) and their interactions and tuning with the
GC and threads etc.

Lately I've been implementing floating point support for an upcoming
release and found conversion to and from strings a slightly deeper
subject than I'd expected. I've been using the 1990 PLDI papers by
you and Steele&White as a base, but with a few modifications mostly
designed to decrease code size, but also runtime memory size.

At a BBQ on Thursday I happened to ask James Noble (Victoria
University of Wellington, NZ) whether he thought one of my ideas was
worth publishing, and if so, how and where. He was quite enthusiastic
about the idea but as it's not in his area asked me to jot down a few
notes for him to forward to Steele, who he knows. Although I've fully
implemented and tested the idea and it will be shipping to our
customers next month I hadn't yet written it up at all. So my notes
are a bit rough and stream-of-consciousness, unfortunately.

The aspect I'm attacking is the corner case when you are presented
with a string longer (perhaps vastly longer) than those needed to
guarantee that values are preserved on a round-trip from binary to
string and back to binary. As you know, digits hundreds of
significant digits into the string can sometimes still affect which
floating point number is the closest to the value represented by the
string, and so must be considered.

The best published technique I know of (yours, with small
modifications by David Gay) requires unbounded working storage,
dependent on the size of the input. As you say in the introductory
paragraph of your 1990 paper: "This problem cannot be solved using
arithmetic of any fixed precision."

My technique uses yours as a subroutine but enables me to determine
the closest floating point value using a bounded amount of working
storage, no matter how long the input string is. For IEEE doubles all
needed variables total to less than 1 KB.

The best short summary I have at the moment is due to Steele:

"Assuming you can solve the print problem with limited
memory, you do a read on a truncated version of the input
to produce a result x such that reading the entire input ought
to produce either x or nextafter(x); then you print the value
(x+nextafter(x))/2, presumably using one extra bit of precision
and compare the result to the entire original input digit string."

I would be very interested to hear what you think of this idea, and
whether it is in fact new. Certainly it is not present in Gay's
library, and is not in Sun's Java code either.

I have implemented it on BREW (e.g. Verizon) phones and it works well.

I'm intending to implement it in at least the Gwydion Dylan compiler
(probably the former Harlequin Dylan as well), and also as a
modification to Gay's widely used dtoa.c.

I've attached the correspondence of the last day or two between
myself, James Noble and Guy Steele below.

Sincerely,
Bruce Hoult

----------------------------------------------------

from Guy.S...@sun.com
to Will Clinger <wi...@ccs.neu.edu>,
cc James Noble <k...@mcs.vuw.ac.nz>,
Bruce Hoult <br...@hoult.org>,
Lindsay Groves <lin...@mcs.vuw.ac.nz>,
date Dec 16, 2007 3:49 PM
subject Re: new idea for reading floating point numbers

Will (and everyone),

Let me chime in here that my two uncertainties turned out to be
an uncertainty as to the meaning of "limited storage". First of all,
we can agree that, for the read problem, the length of the input
string is in principle truly unlimited. I think we can also agree
that the amount of internal storage required is in principle dependent
on the length of the significand of the internal (binary) representation.
The two open questions are then:
(a) For both the read and print problems, is the amount of internal storage
required dependent on the exponent range of the internal (binary)
representation?
(b) For the read problem, is the amount of internal storage required
dependent on the value of any (decimal) exponent (such as one
specified after the letter "E" in computerized scientific notation)?

I believe that Bruce has indicated that, for his algorithm, the answer
to (a) is "yes", and with that answer, my two uncertainties as stated
in my previous email (which Bruce forwarded in his last email) are
resolved. I am still unsure about (b); I think that Bruce intends that
the answer be "no"; it may well be true, but I am still a bit uncertain
about how that is achieved without inspecting the algorithm in detail.

--Guy

----------------------------------------------------


from William D Clinger <wi...@ccs.neu.edu>
to Guy....@sun.com,
wi...@ccs.neu.edu,
cc br...@hoult.org,
k...@mcs.vuw.ac.nz,
lin...@mcs.vuw.ac.nz,
date Dec 16, 2007 5:23 PM
subject Re: new idea for reading floating point numbers



I will respond later in more detail, but it may well be
a week before I will have time to address everything the
way I'd like to. So I'm giving you my quick reactions
tonight before I go to bed.

Guy wrote:
> The two open questions are then:
> (a) For both the read and print problems, is the amount of internal storage
> required dependent on the exponent range of the internal (binary)
> representation?
> (b) For the read problem, is the amount of internal storage required
> dependent on the value of any (decimal) exponent (such as one
> specified after the letter "E" in computerized scientific notation)?

The answer to (a) is yes, and I think the answer to (b)
is yes also. (See below.)

I think the key insight of Bruce's algorithm is that the
bounded dynamic range of real floating point representations,
together with realistic practical bounds that may reasonably
be imposed upon the exponent that follows the E, allow a
finite state solution. For example, it is reasonable for
practical implementations to restrict the exponent that
follows the E to 64 decimal places or fewer, which implies
some bound on the number of digits you have to pay attention
to before the E.

Getting back to (b): It seems to me that, without assuming
some bound on the exponent, you'd need an unbounded amount
of storage just to keep track of n for inputs of the form

1000...000e-1000...000
 ---------   ---------
 n zeroes     m zeroes

It also seems to me that you *have* to keep track of n for
such inputs, just so you can figure out whether the negative
exponent cranks the value back into the dynamic range of your
floating point representation.

Bruce, can you confirm that you are assuming some bound on
the number of digits in the decimal notation's exponent?

Bruce wrote:
> "Consider the problem of converting decimal scientific no-
> tation for a number into the best binary floating point ap-
> proximation to that number, for some fixed precision. This
> problem cannot be solved using arithmetic of any fixed pre-
> cision. Hence the IEEE Standard for Binary Floating-Point
> Arithmetic does not require the result of such a conversion
> to be the best approximation."
>
> What I am claiming to show is that this is false...

What I believe you are claiming to show is that the problem
can be solved using fixed precision when both the precision
and the dynamic range of the binary floating point representation
are bounded.

I really did prove the theorem, but both the statement of the
theorem and its proof point toward the only possible ways to
escape from the theorem. The critical weakness of the theorem
is that it assumes a definition of binary floating point with
unlimited dynamic range. As I explained in my paper (and gave
a completely impractical algorithm besides!), a finite state
machine suffices when the dynamic range of binary floating
point numbers is bounded, and that is of course true for all
practical floating point formats.

What is surprising about your algorithm is that it bears so
much resemblance to the impractical algorithm I described
only to dismiss out of hand.

Your algorithm definitely works (although I think there may
be a couple of minor bugs in your description of it), and I
agree that it should be possible to implement it using only
bounded storage. At the moment, I don't see any contradiction
between that conclusion and the results of my paper.

> Just a word on big O: converting a long decimal number to a bignum
> binary by the usual method of multiplying by 10 and adding the next
> digit is O(N^2) in the number of input digits, so any conventional
> algorithm is unavoidably no less than O(N^2).

It seems, however, that regarding the bignums as polynomials
and multiplying them by methods described in Knuth 4.3.3 or
4.6.4 would be O(N lg N lg lg N), so I don't think this is
a promising line of argument.

Congratulations on your discovery of an interesting and
important new algorithm!

Will

-------------------------------------------

James Noble to Bruce, Lindsay



> Congratulations on your discovery of an interesting and
> important new algorithm!

what he said!

James

---------------------------------------------

Lindsay Groves to Bruce, James



>> Congratulations on your discovery of an interesting and
>> important new algorithm!
>
>what he said!

Ditto!

That's an amazingly positive response.

Cheers
Lindsay

Bruce Hoult

unread,
Jul 28, 2018, 11:29:38 PM7/28/18
to
Sorry. As extracted from that long email chain, the size I found in 2007 was actually 144 bytes, 1152 bits. I believe the actual size needed is a few bits smaller, but that's rounded up to the next multiple of 32 bits (and coincidentally 64 and even 128 bits also).

Terje Mathisen

unread,
Jul 29, 2018, 4:48:22 AM7/29/18
to
John Levine wrote:
>>> A question to numeric experts: how many bits of precision one needs
>>> in intermediate calculations in order to guarantee "exact" (i.e.
>>> correctly rounded) result of strtod() assuming that the type 'double'
>>> defined as IEEE-754 binary64?
>>>
>>> It seems to me ~1400 bits are needed in the worst case, but I am not
>>> 100% sure.
>>>
>> I have never seen any specification for strtod() which puts limits on
>> the number of decimal digits in the input, this means that you can
>> easily construct a string that will take arbitrarily many bits.
>
> Really? If the result is an IEEE double, I would assume that beyond
> some point more input digits can make no difference. The question was
> how much precision do you need to be sure that the output is the
> closest value to the input?
>
Encode as decimal a value which is very close to a binary rounding point?

No, you are right: Since decimal contains a factor of two in each
decimal digit, you can never require more than 53 decimal digits to get
the same number of binary bits.

Terje Mathisen

unread,
Jul 29, 2018, 5:17:18 AM7/29/18
to
We can reuse the argument about double rounding here, i.e. when you have
M which is (2N plus 2 or 3) bits available in a higher precision format
and round to this, then subsequently round to N bits, you will _always_
get the same result as if you rounded to N directly. (This result does
depend on having a sticky bit though, which effectively emulates
arbitrary precision.)

This means that when the next decimal digit is more than about 110 bits
away, it should not be able to influence the final double result?

The key is whether it is possible to encode in decimal a value which
corresponds to .49999999/.5000000 in the last mantissa bit, without
terminating on one side or the other relatively quickly.

Bruce Hoult

unread,
Jul 29, 2018, 5:18:28 AM7/29/18
to
On Sunday, July 29, 2018 at 1:48:22 AM UTC-7, Terje Mathisen wrote:
> No, you are right: Since decimal contains a factor of two in each
> decimal digit, you can never require more than 53 decimal digits to get
> the same number of binary bits.

To print a binary number exactly in decimal?

That's correct for integers, but not for floating point, in which the significant binary digits can be shifted right by 1023 places.

For example, the smallest denorm, 0x0000_0000_0000_0001 = 2^-1074 = 5^1074 * 10^-1074, requires 751 decimal digits to print exactly (plus 323 leading zeros). The midpoint between it and zero, 2^-1075, similarly needs 752:

python -c 'print "0."+"0"*323+str(5**1075)'

0.0000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000024703282292062327208828439643411068
618252990130716238221279284125033775363510437593264991818081
799618989828234772285886546332835517796989819938739800539093
906315035659515570226392290858392449105184435931802849936536
152500319370457678249219365623669863658480757001585769269903
706311928279558551332927834338409351978015531246597263579574
622766465272827220056374006485499977096599470454020828166226
237857393450736339007967761930577506740176324673600968951340
535537458516661134223766678604162159680461914467291840300530
057530849048765391711386591646239524912623653881879636239373
280423891018672348497668235089863388587925628302755995657524
455507255189313690836254779186948667994968324049705821028513
185451396213837722826145437693412532098591327667236328125

However we never actually want to print the binary number exactly in decimal. What we want is to print the shortest decimal number that we can parse unambiguously back into the exact same bits in binary.

That never needs more than 19 significant decimal digits (for IEEE double).

53 decimal digits is far more than needed for unambiguous round-tripping, and far less than needed to print an IEEE double exactly.
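Those digit counts are easy to verify with integer arithmetic (Python 3 here; the one-liner above is Python 2). Since 2^-1074 = 5^1074 * 10^-1074, its significant digits are exactly the digits of 5^1074, and likewise the digits of 2^-1075 are those of 5^1075:

```python
# Exact decimal expansions of the smallest binary64 denorm (2**-1074)
# and of the midpoint below it (2**-1075): count the digits.
d1074 = str(5**1074)   # significant digits of 2**-1074
d1075 = str(5**1075)   # significant digits of 2**-1075 (what the one-liner prints)
print(len(d1074), 1074 - len(d1074))   # 751 significant digits, 323 leading zeros
print(len(d1075), 1075 - len(d1075))   # 752 significant digits, 323 leading zeros
```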

Terje Mathisen

unread,
Jul 29, 2018, 10:08:54 AM7/29/18
to
OK, thanks!

I did not want to print it exactly, I just proved to myself that any
binary fp value _can_ be converted to a finite number of decimal digits.

I do agree that the optimal solution for both binary FP to decimal and
the reverse is to have just enough digits to make the round-trip exact
for all fp values.

Fred J. Tydeman

unread,
Jul 29, 2018, 12:21:38 PM7/29/18
to
On Sat, 28 Jul 2018 19:50:13 UTC, already...@yahoo.com wrote:

> A question to numeric experts: how many bits of precision one needs in intermediate calculations in order to guarantee "exact" (i.e. correctly rounded) result of strtod() assuming that the type 'double' defined as IEEE-754 binary64?
>
> It seems to me ~1400 bits are needed in the worst case, but I am not 100% sure.

Around 1996, using continued fractions, I wrote a program to find hard
to convert numbers from decimal to IEEE-754 double with its 53 bits.
Here are the results.

1st column = # of decimal digits
2nd column = # of bits beyond 53 to do correct rounding
20 74
19 73
18 72
17 66
16 63
15 58
14 57
13 53
12 52
11 45
10 45
9 42
8 37
7 33
6 30
5 27
4 24
3 20
2 17
1 13

So, it is about 3.3*(# of decimal digits) + 11
The 3.3 is about log2(10)

--
Message sent VIA Followup and E-Mail --

--
---
Fred J. Tydeman Tydeman Consulting
tyd...@tybor.com Testing, numerics, programming
+1 (702) 608-6093 Vice-chair of PL22.11 (ANSI "C")
Sample C99+FPCE tests: http://www.tybor.com
Savers sleep well, investors eat well, spenders work forever.

Fred J. Tydeman

unread,
Jul 29, 2018, 12:23:28 PM7/29/18
to
On Sat, 28 Jul 2018 19:50:13 UTC, already...@yahoo.com wrote:

> A question to numeric experts: how many bits of precision one needs in intermediate calculations in order to guarantee "exact" (i.e. correctly rounded) result of strtod() assuming that the type 'double' defined as IEEE-754 binary64?
> It seems to me ~1400 bits are needed in the worst case, but I am not 100% sure.

already...@yahoo.com

unread,
Jul 29, 2018, 12:30:42 PM7/29/18
to
Either I don't understand your answer or you don't understand my question.

already...@yahoo.com

unread,
Jul 29, 2018, 4:18:15 PM7/29/18
to
I am not sure with which part of my post you disagree.

What I mean is something rather simple.
For example 4503599627370496.5000... where ... stands for an arbitrarily long string of 0s that may or may not be followed by a non-zero digit.

In this particular case the situation arises because ties are broken to nearest even. It wouldn't happen if the tie-break were "away from zero". I am not sure whether "away from zero" is completely immune to similar cases or not.
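This halfway case is easy to reproduce with CPython's float(), which performs correctly rounded conversion under round-to-nearest, ties-to-even:

```python
# 4503599627370496 = 2**52: from here upward, consecutive doubles are 1.0
# apart, so "...6.5" is an exact halfway point between two doubles.
x = float('4503599627370496.5')                     # exact tie -> round to even
y = float('4503599627370496.5' + '0' * 1000 + '1')  # one sticky digit -> round up
print(x)   # 4503599627370496.0
print(y)   # 4503599627370497.0
```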

Terje Mathisen

unread,
Jul 29, 2018, 4:29:30 PM7/29/18
to
Thanks Fred!

Your result makes a lot of sense, particularly the log2(10) scaling factor.

Did you keep a printout of those worst examples?

Terje

Terje Mathisen

unread,
Jul 29, 2018, 4:33:58 PM7/29/18
to
This is the sort of example I was thinking of when I stated that you
could need arbitrarily many bits, but in reality you don't:

You just need to treat all trailing digits (after the 0.5) as a single
sticky bit: Either you get a non-zero digit or you don't. You do need to
scan past all those zeroes but you don't need to remember or even count
them, much less expand the binary size of the preliminary conversion.
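That scan can be sketched in a few lines (illustrative only: `nsig` is an arbitrary cutoff, not the minimal bound the thread is asking about, and the `dropped` counter is kept only because a decimal exponent is eventually needed — for the sticky flag itself, no counting is required):

```python
def digits_with_sticky(s, nsig=40):
    """Accumulate the first nsig significant digits of a digit string
    exactly; collapse every later digit into a single sticky flag."""
    acc, seen, dropped, sticky = 0, 0, 0, False
    for ch in s:
        d = ord(ch) - ord('0')
        if seen == 0 and d == 0:
            continue                   # skip leading zeros
        if seen < nsig:
            acc = acc * 10 + d         # exact accumulation of the prefix
            seen += 1
        else:
            dropped += 1               # only needed for the decimal exponent
            sticky = sticky or d != 0  # any non-zero tail digit sets the flag
    return acc, dropped, sticky

print(digits_with_sticky('5' + '0' * 100))        # acc=5*10**39, 61 dropped, sticky False
print(digits_with_sticky('5' + '0' * 100 + '7'))  # acc=5*10**39, 62 dropped, sticky True
```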

Chris M. Thomasson

unread,
Jul 29, 2018, 4:56:30 PM7/29/18
to
On 7/29/2018 9:23 AM, Fred J. Tydeman wrote:
> On Sat, 28 Jul 2018 19:50:13 UTC, already...@yahoo.com wrote:
>
>> A question to numeric experts: how many bits of precision one needs in intermediate calculations in order to guarantee "exact" (i.e. correctly rounded) result of strtod() assuming that the type 'double' defined as IEEE-754 binary64?
>> It seems to me ~1400 bits are needed in the worst case, but I am not 100% sure.
>
> Around 1996, using continued fractions, I wrote a program to find hard
> to convert numbers from decimal to IEEE-754 double with its 53 bits.

Nice. We can use them to create a sufficient rational that represents a
real up to a desired level of precision. The convergents of continued
fractions are the target...

already...@yahoo.com

unread,
Jul 29, 2018, 4:57:49 PM7/29/18
to
That's what I wrote in my first answer to you. To which Ivan said "Not really".

Terje Mathisen

unread,
Jul 29, 2018, 5:19:58 PM7/29/18
to
> That's what I wrote in my first answer to you. To which Ivan said "Not really".
>
In that case we're in violent agreement. :-)

You can of course get an input which corresponds to 0.25, 0.125, 0.0625
etc, but after we scale the input by the relevant power of two we get
back to 0.500000...

MitchAlsup

unread,
Jul 29, 2018, 7:23:01 PM7/29/18
to
On Saturday, July 28, 2018 at 2:50:15 PM UTC-5, already...@yahoo.com wrote:
> A question to numeric experts: how many bits of precision one needs in intermediate calculations in order to guarantee "exact" (i.e. correctly rounded) result of strtod() assuming that the type 'double' defined as IEEE-754 binary64?
>
> It seems to me ~1400 bits are needed in the worst case, but I am not 100% sure.

In the realm of transcendental functions, J.M. Muller in chapter 10 of his book
"Elementary Functions: Algorithms and Implementation" has examples of several
well chosen operands that require 120-odd bits in order to get the rounding
correct.

In the realm of the 6 major IEEE functions (+-*/, SQRT, *+) you need an
intermediate accumulator of 212 bits, with the 106 bits out of the multiplier
hitting bits[159:53]. The augend has to be able to be routed to all 212 bits.

In the realm of argument reduction you also need ONLY 212 bits of intermediate
precision if you have to do everything in Double precision. If you are inside
the HW, you can deliver a correctly rounded argument with 116 bits of
intermediate precision (for which I have filed a patent, and may file a second).

None of the above tally the exponent or sign bits in the precision.

Fred J. Tydeman

unread,
Jul 30, 2018, 1:49:22 AM7/30/18
to
On Sun, 29 Jul 2018 20:29:27 UTC, Terje Mathisen <terje.m...@tmsw.no> wrote:

> Your result makes a lot of sense, particularly the log2(10) scaling factor.
>
> Did you keep a printout of those worst examples?

Yes. They are part of my FPCE Test Suite (used to check C compilers and libraries).

Fred J. Tydeman

unread,
Jul 30, 2018, 2:07:46 AM7/30/18
to
On Sun, 29 Jul 2018 16:30:40 UTC, already...@yahoo.com wrote:

> Either I don't understand your answer or you don't understand my question.

Consider the one digit number (the exponent is not part of the digit count): 5e125
As a 64-bit double, it is 5A07A2ECC414A03F.7FF6CA....
The .7FF... part is used to determine the rounding.
So, keeping less than 12 bits after the 53 bits of the double,
one would end up with ...03F.800... which would round to ...040 (which is wrong).
So, in this case, a one decimal digit number needs at least 12 bits after the 53 bits
to do correct rounding.
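This can be checked with exact integer arithmetic, since 5e125 = 5^126 * 2^125 is an integer (a sketch; `round_ne` is a hypothetical helper implementing round-to-nearest, ties-to-even, and the 11/12-bit comparison mirrors Fred's point about double rounding):

```python
def round_ne(n, drop):
    """Round non-negative integer n to nearest, discarding `drop` low bits,
    with ties broken to even."""
    q, r = divmod(n, 1 << drop)
    half = 1 << (drop - 1)
    if r > half or (r == half and q & 1):
        q += 1
    return q

n = 5**126 << 125            # exact integer value of 5e125
k = n.bit_length() - 53      # number of bits below the 53-bit significand

direct = round_ne(n, k)                      # single correct rounding to 53 bits
via12  = round_ne(round_ne(n, k - 12), 12)   # keep 53+12 bits first, then round
via11  = round_ne(round_ne(n, k - 11), 11)   # keep only 53+11 bits first

print(direct == via12)   # True:  12 extra bits suffice for this input
print(direct == via11)   # False: 11 extra bits double-round to the wrong value
```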

Terje Mathisen

unread,
Jul 30, 2018, 3:52:08 AM7/30/18
to
Fred J. Tydeman wrote:
> On Sun, 29 Jul 2018 16:30:40 UTC, already...@yahoo.com wrote:
>
>> Either I don't understand your answer or you don't understand my
>> question.
>
> Consider the one digit number (the exponent is not part of the digit
> count): 5e125 As a 64-bit double, it is 5A07A2ECC414A03F.7FF6CA....
> The .7FF... part is used to determine the rounding. So, keeping less
> than 12 bits after the 53 bits of the double, one would end up with
> ...03F.800... which would round to ...040 (which is wrong). So, in
> this case, a one decimal digit number needs at least 12 bits after
> the 53 bits to do correct rounding.

I would be willing to argue that you need to include the exponent as
well, but this is still a very nice example:

Extracting powers of two, this is really 5**126 * 2**125, so you have to
encode the 126th power of 5 as a binary value.

already...@yahoo.com

unread,
Jul 30, 2018, 3:58:08 AM7/30/18
to
But my question was not "how many bits are needed per decimal digit?", but "how many bits are needed in the worst case?".
Or, using terminology closer to yours: "[in the worst case] how many decimal digits, not counting leading zeros, have to be converted in the "standard" manner (i.e. Est(n+1)=Est(n)*10+dig(n+1) ) before we can safely switch handling of the input string to 'scan for a sticky digit'?"

Probably, in order to avoid ambiguity, I should formulate my question as an example implementation and store it on github for reference. Maybe I'll do it tonight or tomorrow night.


Fred J. Tydeman

unread,
Jul 30, 2018, 4:18:52 PM7/30/18
to
On Mon, 30 Jul 2018 07:58:06 UTC, already...@yahoo.com wrote:

> But my question was not "how many bits are needed per decimal digit?", but "how many bits are needed in the worst case?".
> Or, using terminology more close to yours, "[in worst case] how many decimal digits, not counting leading zeros, has to be converted in "standard" manner (i.e. Est(n+1)=Est(n)*10+dig(n+1) ), before we can safely switch a handling of input string to "scan for sticky digit" ?

Based upon IEEE-754, for binary64 format, converting from binary64 to decimal and back to binary64,
as long as one uses 17 (or more) significant decimal digits, one gets an identity with round to nearest.

And, conversions from decimal (for 20 or fewer significant digits) to binary64 will always be correctly rounded.
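The 17-digit round-trip is easy to spot-check against CPython's correctly rounded conversions (a quick randomized sketch, not a proof):

```python
import random, struct

random.seed(12345)
for _ in range(10000):
    # reinterpret a random 64-bit pattern as a double
    (x,) = struct.unpack('<d', struct.pack('<Q', random.getrandbits(64)))
    if x != x:                # skip NaNs, which never compare equal
        continue
    s = '%.17g' % x           # 17 significant decimal digits
    assert float(s) == x      # binary -> decimal -> binary identity
print('ok')
```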

already...@yahoo.com

unread,
Jul 30, 2018, 7:05:37 PM7/30/18
to
I am not interested in "20 or fewer".
I am interested in input strings of unlimited length. Or at least "practically unlimited" like 1,000,000 of decimal digits.

My feeling, confirmed by Bruce Hoult, was that exact conversion of such strings does *not* require unlimited, or even very high, precision.

already...@yahoo.com

unread,
Jul 30, 2018, 7:19:27 PM7/30/18
to
Done. The demo uses boost, so it's not as easy for everybody to play with as I would prefer. But coding it this way was easiest for me.

https://github.com/already5chosen/bin2ascii/blob/34575e29999b3611fd52e59d417318ec73be8c59/strtod/demo_strtod.cpp

The question is - what is the minimal sufficient value of STRTOD_NBITS ?

already...@yahoo.com

unread,
Aug 3, 2018, 10:56:33 AM8/3/18
to
On Sunday, July 29, 2018 at 4:06:25 AM UTC+3, kr...@berkeley.edu wrote:
> On Saturday, July 28, 2018 at 4:21:33 PM UTC-7, already...@yahoo.com wrote:
> > Unless better algorithms were discovered since 1990, it sounds like Terje is right and in the worst cases a required precision is unbounded.
>
> Haven't checked this but it suggests <1200 bits for double precision.
> https://arxiv.org/abs/1310.8121v1

Just now I realized who answered my question.
Recently I was sort of campaigning in this group against one of the principal decisions in your pet project, and even more so against the justification behind the decision
expressed in the docs.
So it's nice to know that you are lurking.


Bruce Hoult

unread,
Aug 3, 2018, 3:15:47 PM8/3/18
to
In google groups I see only "kr...@berkeley.edu" (which I noticed at the time) but I don't suppose there are a lot of names starting with "kr".

I imagine a lot of people here might be interested in the talk given in the office at lunchtime yesterday. The ancient history part duplicated talks already publicly available (e.g. https://www.youtube.com/watch?v=8n2HLp2gtYs), but not the more recent events.

kr...@berkeley.edu

unread,
Aug 4, 2018, 1:59:10 PM8/4/18
to
On Friday, August 3, 2018 at 7:56:33 AM UTC-7, already...@yahoo.com wrote:
> O
>
> Just now I realize, who answered my question.
> Recently I was sort of campaigning in this group against one of the principle decision in your pet and even more so against justification behind the decision
> expressed in the docs.
> So it's nice to know that you are lurking.

I used to participate actively when in grad school 25+ years ago, and still check in occasionally for nostalgia reasons or if someone points a post out to me, but difficult to justify spending time here given recent low signal-to-noise ratio.