
Precision in Tcl 8.0 (was Tcl 8.0: Loss of (tcl_)precision.)


John Ousterhout

Jun 27, 1997

A bunch of us here at SunScript have been talking about the precision
issues that George Howlett raised. On the one hand, what George suggests
is potentially losing information, which seems bad. On the other hand,
the results produced by George's suggestions are actually more correct
and intuitive in many cases: 2.4/3 really *is* 0.8, not 0.79999999999999993
as Tcl 8.0 currently prints.

So here is a compromise proposal for discussion. If you know people who
are experts in numerical issues, it would be great to get their feedback
to make sure that we're not missing some important issues.

1. Retain the full double-precision information in all internal
calculations, just as is done now. In other words, don't follow
George's suggestion to round every time a double value is stored
in an object. This way we don't lose information unnecessarily.

2. When converting double values to strings, use %.12g. The idea here
is to round, as George suggested. The choice of 12 digits instead of
the 15 digits that George suggested is a tradeoff between two things.
On the one hand, making the precision too small can cause information
to be lost. On the other hand, making it too large is also bad because
roundoff errors could accumulate in a calculation to the point where
rounding produces a number like 0.79999999999999993 again. For example,
try running the following procedure with an argument of 100:

proc p num {
    set x .1
    for {set i 1} {$i <= $num} {incr i} {
        set x [expr $x + .1]
        puts "After $i additions, sum is $x; rounded is [format %.15g $x]"
    }
}

After .1 is added about 60 times there is enough accumulated roundoff
error that the answer is no longer clean under %.15g formatting. Perhaps
12 digits is unnecessarily conservative and we could safely use 13 or
14....

3. When comparing floating-point values in the expr command with the
operators <, <=, ==, >=, and >, apply a fudge factor: a number
x is considered equal to a number y if abs((x-y)/y) < 0.5e-12.
This way the loop example that George gave will execute for the
desired number of iterations, but we only have to do the rounding
during comparison operations.
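
To illustrate the rule at the script level (this is just a sketch of the intended behavior; the real check would live inside expr's C code, and the proc name is made up):

proc fuzzyEqual {x y} {
    # Proposed rule: equal if the relative difference is below 0.5e-12.
    # (Guard against y == 0, which the rule as stated doesn't cover.)
    if {$y == 0} {
        return [expr {$x == 0}]
    }
    return [expr {abs(($x - $y) / $y) < 0.5e-12}]
}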

With this scheme I don't think we need to reinstate the tcl_precision
variable. If you want a more precise printout than %.12g, just don't
let the string conversion happen automatically: do it explicitly with,
for example, "format %.17g $x". Thus people doing serious floating-
point calculations can keep the full precision. The rounding during
comparisons shouldn't be an issue for serious floating-point programmers
because they shouldn't be doing floating-point comparisons in the first
place, no? I've never liked the tcl_precision variable because it is
global: if you have one package that expects one level of precision and
another package that expects another level, they can't be used in the
same application. So, I'd like to get rid of tcl_precision if possible.

Please respond to the newsgroup with comments. Can you think of any
holes in this scheme? Is there any way in which it is worse than either
Tcl 7.6 or Tcl 8.0?

If there is general agreement that this is a good idea, we'll see if we
can get it implemented before Tcl 8.0 goes final.

Viktor Dukhovni

Jun 27, 1997

In <5p0r01$au0$1...@engnews2.Eng.Sun.COM> ous...@tcl.eng.sun.com (John Ousterhout) writes:

>1. Retain the full double-precision information in all internal
> calculations, just as is done now. In other words, don't follow
> George's suggestion to round every time a double value is stored
> in an object. This way we don't lose information unnecessarily.

Yes, though incompatible, it is the only way to go.

There is no such thing as a 0.8 (or 2.4) 64-bit double.
Only rationals with sufficiently small power-of-2 denominators can be
represented exactly.

Applications which desire exact "fixed point" arithmetic should
multiply by the right power of 10 and use integer arithmetic!
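
For example, a price split done in integer cents instead of floating-point
dollars (a sketch with made-up numbers):

set priceCents 240                        ;# i.e. $2.40, held as an integer
set share [expr {$priceCents / 3}]        ;# exactly 80, no roundoff
puts [format {%d.%02d} [expr {$share / 100}] [expr {$share % 100}]]   ;# prints 0.80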

Use of floating point loop control variables is easily avoided
completely. Just precompute the correct integer loop count with the desired
rounding or truncation, and loop over ints instead.
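
For example (a sketch), instead of stepping a float variable from 0.0 to 1.0
by 0.1:

set start 0.0; set end 1.0; set step 0.1
set n [expr {round(($end - $start) / $step)}]   ;# 10, computed once
for {set i 0} {$i <= $n} {incr i} {
    set x [expr {$start + $i * $step}]          ;# the integer counter never drifts
    # ... use $x ...
}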

Applications which desire a particular floating point precision
should judiciously use "format" and their own comparison functions.

>2. When converting double values to strings, use %.12g. The idea here
> is to round, as George suggested. The choice of 12 digits instead of
> the 15 digits that George suggested is a tradeoff between two things.
> On the one hand, making the precision too small can cause information
> to be lost. On the other hand, making it too large is also bad because
> roundoff errors could accumulate in a calculation to the point where
> rounding produces a number like 0.79999999999999993 again. For example,
> try running the following procedure with an argument of 100:


This breaks:

while {<more floats>} {
    lappend floatlist $float
}
eval floatcmd $floatlist

The values are converted to strings before re-parsing by eval.
(This again points out the need for an "leval" which avoids
stringification and substitution).

There are probably more subtle cases where values are inadvertently
converted to strings.


>3. When comparing floating-point values in the expr command with the
> operators <, <=, ==, >=, and >, apply a fudge factor: a number
> x is considered equal to a number y if abs((x-y)/y) < 0.5e-12.
> This way the loop example that George gave will execute for the
> desired number of iterations, but we only have to do the rounding
> during comparison operations.

abs((x-y)/y) is not symmetric and likely to lead to floating point
overflows. You want: abs(x-y)/(abs(x)+abs(y)) and skip the division
when x == y == 0.0. Also deal with NaN.
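
A script-level sketch of that shape of test (NaN handling omitted; that
really has to happen down in the C code):

proc relEqual {x y {tol 0.5e-12}} {
    if {$x == $y} { return 1 }                 ;# also covers x == y == 0.0
    return [expr {abs($x - $y) / (abs($x) + abs($y)) < $tol}]
}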

Far worse under the new rules:

(2.4 / 3) == 0.8

But,

(2.4 / 3) - 0.8 != 0

This breaks the fundamental requirement that x == y iff x - y == 0.
In C and Perl both are false; why should either be true in Tcl?

--
Viktor.

Grant Reaber

Jun 27, 1997

In article <5p14f7$t...@yoyodyne.ics.uci.edu>,

No! 0 - 0 == -0.

What the scheme does break is the rather fundamental notion that a
number is either greater than zero, equal to zero, or less than zero,
and not more than one of those (substitute some other number for zero
if you like). Testing equality of floating point numbers is a
meaningful operation in IEEE. You just have to be careful. This
property can be restored if you do the fudge for all comparisons, so
that a number has to be a certain amount less than another number
before Tcl says it is less than that number. This behavior would also
be consistent with what would happen if the value accidentally got
stringified. But maybe that's what JO was suggesting. The reason I
thought not is that he said good programmers wouldn't be testing
equality anyway, as if testing inequality would be unaffected by the
change.

Since we're talking about it, let me bring up some other expr issues.
The new rand function is nice, but the random function looks pretty
weak. How hard would it be to look up a good RNG in a book and use
that? Also, I would like to see the semantics of expr operations made
more clear. At the very least, guarantee that all arithmetic is done
with 32 bits. How about overflow checking? Maybe provide unsigned
arithmetic. Or bigints. I've always thought that a language that
treats everything as a string should do bigint arithmetic. One of the
bad things about C is that its math is underspecified. A "high-level"
language like Tcl should remedy that.

Grant


Ron A. Zajac

Jun 28, 1997

John Ousterhout wrote:
>
> A bunch of us here at SunScript have been talking about the precision
> issues that George Howlett raised.<SNIP>
>
> <suggestions SNIPped>

>
> With this scheme I don't think we need to reinstate the tcl_precision
> variable....<SNIP>

...though you might want to create a "tcl_roundoff" variable (default value
= 12) to accommodate roundoff throttling....

--
Ron A. Zajac / NORTEL / 972-684-4887 esn444 / zajacATnortelDOTcom
These notions are mine, not NORTEL's!

Mark Diekhans

Jun 28, 1997

ous...@tcl.eng.sun.com (John Ousterhout) writes:

>2. When converting double values to strings, use %.12g. The idea here
> is to round, as George suggested. The choice of 12 digits instead of
> the 15 digits that George suggested is a tradeoff between two things.

IMHO, this creates a problem worse than the original one. With
tcl_precision, floating point numbers are implicitly rounded everywhere they
are used. With this approach, the implicit rounding happens at times that can
only be known if the Tcl programmer knows what C API is used to implement a
command that operates on floating point numbers. Commands implemented using
the string API will round floating point numbers while those using the Tcl_Obj
API will maintain the maximum precision.

While the Tcl_Objs provide a high-performance API, the string API is *much*
simpler to use. It will remain an important approach when the speed of
development is more important than the speed of execution.
Much of the existing Tcl code base will probably never be ported to the
Tcl_Obj API. If the value of a data item differs depending on whether its
internal representation is a string or a native type, it puts an unreasonable
burden on the programmer to understand the internal implementation of
the code they are using.

Mark

Steven Correll

Jun 29, 1997

I'm not a numerical analyst, and I don't even play one on TV, but I
have read the excellent paper "What Every Computer Scientist Should
Know About Floating-Point Arithmetic", originally published in March
1991 Computing Surveys, and currently part of the Sun Numerical
Computation Guide; and also its forthcoming appendix D (both available
at http://www.validgh.com). These explain why George's proposal to
round on every store is a bad idea (namely, because people who don't want
to have to think about floating point will be the first to suffer from the
counterintuitive effects of repeated rounding). I would recommend these
papers to anybody who is inclined to disagree with point (1).

I also have personal experience with a system which implements
something similar to (3). We used the sum of x and y in the denominator
because we were concerned about what would happen when x and y were
very different in magnitude. We took care to maintain the expected
identities by using the same "fudge factor" for all of the comparison
operators (e.g. "a <= b" implies "!(a > b)") and users have been
happy. There's also precedent for this in the APL language.
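
Concretely, one way to get that property is to route every operator through
a single tolerant three-way comparison, e.g. this sketch (written in Tcl here
just for concreteness; it is not the code from that system):

proc fcmp {a b {tol 0.5e-12}} {
    # Multiplying instead of dividing avoids any zero-denominator case.
    if {abs($a - $b) <= $tol * (abs($a) + abs($b))} { return 0 }
    return [expr {$a < $b ? -1 : 1}]
}
# ==  is  [fcmp $a $b] == 0,   <  is  [fcmp $a $b] < 0,
# <=  is  [fcmp $a $b] <= 0,   so "a <= b" is exactly !(a > b).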

I'm not sure about (2). There's no problem with rounding as a final
step before presenting data to an observer: but if a program ever does
something that causes data to be converted back and forth between
binary and string representations, (2) will cause it to encounter the
kind of repeated rounding that introduces surprising behavior.

In article <5p0r01$au0$1...@engnews2.Eng.Sun.COM>,
John Ousterhout <ous...@tcl.eng.sun.com> wrote:
>...


>1. Retain the full double-precision information in all internal
> calculations, just as is done now. In other words, don't follow
> George's suggestion to round every time a double value is stored
> in an object. This way we don't lose information unnecessarily.
>

>2. When converting double values to strings, use %.12g. The idea here

> is to round, as George suggested....


>
>3. When comparing floating-point values in the expr command with the
> operators <, <=, ==, >=, and >, apply a fudge factor: a number
> x is considered equal to a number y if abs((x-y)/y) < 0.5e-12.
> This way the loop example that George gave will execute for the
> desired number of iterations, but we only have to do the rounding
> during comparison operations.

--
Steven Correll == PO Box 66625, Scotts Valley, CA 95067 == s...@netcom.com

Robin Becker

Jun 29, 1997

In article <5p0r01$au0$1...@engnews2.Eng.Sun.COM>, John Ousterhout
<ous...@tcl.eng.sun.com> writes

>A bunch of us here at SunScript have been talking about the precision
>issues that George Howlett raised. On the one hand, what George suggests
>is potentially losing information, which seems bad. On the other hand,
>the results produced by George's suggestions are actually more correct
>and intuitive in many cases: 2.4/3 really *is* 0.8, not 0.79999999999999993
>as Tcl 8.0 currently prints.
>
......

>Please respond to the newsgroup with comments. Can you think of any
>holes in this scheme? Is there any way in which it is worse than either
>Tcl 7.6 or Tcl 8.0?
>
>If there is general agreement that this is a good idea, we'll see if we
>can get it implemented before Tcl 8.0 goes final.
I think I agree with the overall idea, but aren't people going to want
access to the fudge factor and/or the default printout precision? This
has all come about because of the dual nature of the float/double
objects. One thing I would miss about the proposed scheme is that it's
impossible to get at the original string version from within Tcl (or am
I wrong here?). If a number is specified as 4.000 this means something,
and I assume under rounding this will appear as 4. In finance, for
example, people have conventions about which exchange rates are quoted
to 2, 3, or 4 decimal places, etc.; bonds are often quoted in 64ths or
32nds, giving nice finite decimal ranges:

(examples) 1 % set s 4.0000
4.0000
(examples) 2 % string length $s
6
(examples) 3 % string length [expr 0+$s]
3
(examples) 4 % expr 0+$s
4.0

i.e. we still have the original string value until we do any expr
operation. I'm assuming the rounding would happen for all string ops, so
I would expect differences here; under the current scheme we already have
$s==[expr 0+$s], but the string representations are different.
--
Robin Becker

Malcolm Northcott

Jun 29, 1997

If you need to display or store a number with a particular
number of significant figures, use the "format" command,
otherwise the more precision the better.
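
For instance:

set rate [expr {2.4 / 3.0}]
puts [format %.4f $rate]     ;# a fixed number of decimals for display: 0.8000
puts [format %.17g $rate]    ;# full double precision when you really want it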

--
Malcolm Northcott m...@laplacian.com

Peter De voil

Jun 30, 1997

Hi,

Roger Hui implemented the idea of comparison tolerance in APL/J as:

#include <math.h>

extern double tolerance;   /* the global comparison tolerance */

/* x equals y if |x - y| does not exceed tolerance * max(|x|, |y|). */
int teq(double x, double y) {
    if (x == y) return 1;
    if ((0 < x) != (0 < y)) return 0;                  /* opposite signs: not equal */
    double mag = (0 < x) ? fmax(x, y) : -fmin(x, y);   /* the larger magnitude */
    if (isinf(mag))
        return x == y;
    return fabs(x - y) <= tolerance * mag;
}


Or, in English, "x=y if the magnitude of x-y does not exceed t times
the larger of the magnitudes of x and y".

The global tolerance can be manipulated in the same way as tcl_precision
used to be.

Yours,
PdeV.

John Ousterhout

Jun 30, 1997

In article <x7lo3ur...@osprey.grizzly.com>, Mark Diekhans <ma...@osprey.grizzly.com> writes:

|> ous...@tcl.eng.sun.com (John Ousterhout) writes:
|>
|> >2. When converting double values to strings, use %.12g. The idea here
|> > is to round, as George suggested. The choice of 12 digits instead of
|> > the 15 digits that George suggested is a tradeoff between two things.
|>
|> IMHO, this creates a problem worse than the original one. With
|> tcl_precision, floating point numbers are implicitly rounded everywhere they
|> are used. With this approach, the implicit rounding happens at times that can
|> only be known if the Tcl programmer knows what C API is used to implement a
|> command that operates on floating point numbers. Commands implemented using
|> the string API will round floating point numbers while those using the Tcl_Obj
|> API will maintain the maximum precision.

This problem may not be as bad as you think. If a floating point value is
passed to an old string-style command, it is true that the value will be
converted to a string, which will result in rounding. However, the internal
representation is not lost, so future operations on the same value will get
the full precision (as long as they are in object commands). Thus there
won't be repeated rounding unless a string command computes with the rounded
value and returns the result back to an object command. I suspect that most
computation on floating point values will be done in core commands, which all
use the object system.

Even in the worst case, we'll still have 12 digits of precision, which is
better than the 6 digits that Tcl calculations get today if they don't set
tcl_precision.

Mark Diekhans

Jun 30, 1997

Hi John,

ous...@tcl.eng.sun.com (John Ousterhout) writes:


>
> In article <x7lo3ur...@osprey.grizzly.com>, Mark Diekhans <ma...@osprey.grizzly.com> writes:
> |> IMHO, this creates a problem worse than the original one. With
> |> tcl_precision, floating point numbers are implicitly rounded everywhere they
> |> are used. With this approach, the implicit rounding happens at times that can
> |> only be known if the Tcl programmer knows what C API is used to implement a
> |> command that operates on floating point numbers. Commands implemented using
> |> the string API will round floating point numbers while those using the Tcl_Obj
> |> API will maintain the maximum precision.
>
> This problem may not be as bad as you think. If a floating point value is
> passed to an old string-style command, it is true that the value will be
> converted to a string, which will result in rounding. However, the internal
> representation is not lost, so future operations on the same value will get
> the full precision (as long as they are in object commands).

Unless it does an operation on the number and returns a number, or stores the
number in a data structure and later returns it.

> Even in the worst case, we'll still have 12 digits of precision, which is
> better than the 6 digits that Tcl calculations get today if they don't set
> tcl_precision.

Actually, I am not worried about the precision, but about the inconsistency of
the behavior of commands. It will not be intuitive or obvious when rounding
will occur.

Just my $0.02,
Mark

Frederic Bonnet

Jul 2, 1997

Hi John,

That sounds like a very acceptable compromise to me. In fact, it's quite
similar to what I said in a previous post, but with more precise proposals.

John Ousterhout wrote:
>
> 1. Retain the full double-precision information in all internal
> calculations, just as is done now. In other words, don't follow
> George's suggestion to round every time a double value is stored
> in an object. This way we don't lose information unnecessarily.

Agreed, else we would also lose the benefits of Tcl8's bytecode compiler and
object system.

> 2. When converting double values to strings, use %.12g. The idea here
> is to round, as George suggested. The choice of 12 digits instead of
> the 15 digits that George suggested is a tradeoff between two things.

> On the one hand, making the precision too small can cause information
> to be lost. On the other hand, making it too large is also bad because
> roundoff errors could accumulate in a calculation to the point where
> rounding produces a number like 0.79999999999999993 again. For example,
> try running the following procedure with an argument of 100:
>

> proc p num {
> set x .1
> for {set i 1} {$i <= $num} {incr i} {
> set x [expr $x + .1]
> puts "After $i additions, sum is $x; rounded is [format %.15g $x]"
> }
> }
>
> After .1 is added about 60 times there is enough accumulated roundoff
> error that the answer is no longer clean under %.15g formatting. Perhaps
> 12 digits is unnecessarily conservative and we could safely use 13 or
> 14....

Why not keep tcl_precision to specify the default string formatting? See below
for more on this point.

> 3. When comparing floating-point values in the expr command with the
> operators <, <=, ==, >=, and >, apply a fudge factor: a number
> x is considered equal to a number y if abs((x-y)/y) < 0.5e-12.
> This way the loop example that George gave will execute for the
> desired number of iterations, but we only have to do the rounding
> during comparison operations.

Agreed, this will allow a more intuitive behavior.

I suppose 0.5e-12 is related to the default %.12g output. Unfortunately,
while we can force a more precise string output with format and thus gain
better control over the precision, we can't gain control over this
"fudge factor" unless through an hypothetical tcl_fudge variable.

> With this scheme I don't think we need to reinstate the tcl_precision

> variable. If you want a more precise printout than %.12g, just don't
> let the string conversion happen automatically: do it explicitly with,
> for example, "format %.17g $x". Thus people doing serious floating-
> point calculations can keep the full precision. The rounding during
> comparisons shouldn't be an issue for serious floating-point programmers
> because they shouldn't be doing floating-point comparisons in the first
> place, no?

Well said ;-)

> I've never liked the tcl_precision variable because it is
> global: if you have one package that expects one level of precision and
> another package that expects another level, they can't be used in the
> same application. So, I'd like to get rid of tcl_precision if possible.

I agree with your point about tcl_precision being global. However, we still
need a way to control precision during comparisons, and setting the default
output without using format would be useful, too. So maybe we'll have to
keep the old tcl_precision. But we can also improve this (rather outdated)
feature.
About packages using different precisions, perhaps we could use the new
namespace feature introduced by Tcl8: each package could (should?) then define
its own tcl_precision variable and thus control the way calculations are
performed. After all, in Tcl7.6, a slave interpreter could use a different
value of tcl_precision than its master. So each namespace could define its own
or inherit its parent's. Granted, this is easier to say than to implement,
but it's only an idea.
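
Even without interpreter support, a package can already keep a precision to
itself by routing its output through a small helper proc, e.g. (just a sketch;
nothing in the core consults this variable):

namespace eval ::finance {
    variable precision 4
    proc fmt {x} {
        variable precision
        return [format %.${precision}g $x]
    }
}
puts [::finance::fmt [expr {2.4 / 3}]]   ;# prints 0.8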

About float comparisons, Peter De voil has posted a sample implementation
that seems very interesting, along with the same kind of comments I've made
about keeping tcl_precision.

I like this example since it carefully avoids any expensive operation.

See you, Fred
--
Frederic BONNET fbo...@irisa.fr
Ingenieur Ecole des Mines de Nantes/Ecole des Mines de Nantes Engineer
IRISA Rennes, France - Projet Solidor/Solidor Project
------------------------------------------------------------------------
Tcl: can't leave | "Theory may inform but Practice convinces."
$env(HOME) without it! | George BAIN

Paul Eggert

Jul 4, 1997

ous...@tcl.eng.sun.com (John Ousterhout) writes:

>2. When converting double values to strings, use %.12g.

Here's a much better proposal, originally made by Guy Steele and Jon White:

When converting a double value to a string,
convert to the smallest number of digits such that
converting back to double yields the original value again.

This proposal will keep both sides happy.

The people who don't like the fact that the nearest IEEE double to 1.1
converts to "1.1000000000000001" with %.17g will be happy, since that
double converts to "1.1" under this proposal. The people who don't
like loss of information will also be happy, since no information is
lost under this proposal.

This proposal is the best engineering compromise between the two camps.
It's better than the %.17g format used by Tcl 8.0 and GNU Emacs Lisp
(which generate useless digits that annoy people), and it's better than
the %.15g format used by Perl 5 (which loses information). It's also
better than %.12g or %.6g, which lose even more information than %.15g
does.

You might be surprised (I was!) that it wasn't commonly known how to
read and print floating point numbers accurately, in the sense
described above, until 1990; this is partly why many libraries do the wrong
thing in this area. Please see the following references for details.

William D Clinger, How to read floating-point numbers
accurately. SIGPLAN Notices 25, 6 (June 1990), 92-101.

Guy L Steele Jr and Jon L White, How to print floating-point
numbers accurately. SIGPLAN Notices 25, 6 (June 1990), 112-126.

Clinger's ideas are used by high-quality C libraries'
decimal-to-double converters, so you're halfway there;
all you need to do is implement Steele and White's ideas.
(I can give you a slow-but-portable prototype if you like.)
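
The brute-force form of the idea is easy to state: try successively more
digits until the string reads back as the original double. Here is a
Tcl-level sketch of that (the real converter belongs in C inside
Tcl_PrintDouble, and Steele and White show how to do it without the trial
loop):

proc shortestForm {x} {
    for {set p 1} {$p <= 17} {incr p} {
        set s [format %.${p}g $x]
        if {$s == $x} { return $s }    ;# round-trip test (exact == as in Tcl today)
    }
    return [format %.17g $x]
}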

By the way, this proposal is required by the IEEE Scheme standard,
and it works well in practice.

> The choice of 12 digits instead of
> the 15 digits that George suggested is a tradeoff between two things.
> On the one hand, making the precision too small can cause information
> to be lost. On the other hand, making it too large is also bad because
> roundoff errors could accumulate in a calculation to the point where
> rounding produces a number like 0.79999999999999993 again.

This argument conflates roundoff error with printing error.
Tcl cannot insulate the user from roundoff error -- no matter what
number of digits Tcl uses, some programs' roundoff errors will be
visible with that number of digits. It is unwise to try to hide
roundoff errors in this way.

However, Tcl _can_ do something about printing error -- see the Steele
and White paper.

George Howlett is complaining about printing error, not roundoff error,
so if you apply Steele and White's method you will address his concerns.

Grant Reaber

Jul 7, 1997

In article <5pi7q5$1qi$1...@shade.twinsun.com>,

Paul Eggert <egg...@twinsun.com> wrote:
>ous...@tcl.eng.sun.com (John Ousterhout) writes:
>
>>2. When converting double values to strings, use %.12g.
>
>Here's a much better proposal, originally made by Guy Steele and Jon White:
>
> When converting a double value to a string,
> convert to the smallest number of digits such that
> converting back to double yields the original value again.

Sounds like a good idea. Does this method always yield only one
string, though? If not, maybe it should be further specified so that
exactly one string results (pick the closest one to an even number or
something).

Grant


Martin Shepherd

Jul 9, 1997

ous...@tcl.eng.sun.com (John Ousterhout) writes:
> >2. When converting double values to strings, use %.12g.

If we have to have a single precision imposed like this, wouldn't it
make more sense to have it tuned to the host architecture by using
something like the following:

#include <float.h>
...
sprintf(buffer, "%.*g", DBL_DIG, a_double_variable);

(For those who don't know, DBL_DIG is a standard C constant defined in
float.h. It gives the number of decimal digits of precision of a
double-precision variable on the host architecture.)

Having said this, I am personally not at all convinced that imposing a
blanket precision is a workable solution. If the numbers being printed
originated as single-precision floats, then using a number tuned to
doubles just won't work. I encountered this very problem today when
porting a Tcl program (one that relied on tcl_precision) to
Tcl8.0b2. The underlying C program used floats throughout, so when I
replaced a call to Tcl_PrintDouble() with the above statement, the
imprecise numbers stayed as .xxx9999999 to 15 significant figures. On
realizing this, I changed the sprintf call to:

sprintf(buffer, "%.*g", FLT_DIG, my_float_variable);

This cured the problem, but it highlighted the fact that imposing
a blanket precision is a bad idea.

The only suggestion that I have seen that makes much sense is the
following:

egg...@twinsun.com (Paul Eggert) writes:
: Here's a much better proposal, originally made by Guy Steele and Jon White:


:
: When converting a double value to a string,
: convert to the smallest number of digits such that
: converting back to double yields the original value again.

:
: This proposal will keep both sides happy.

Martin Shepherd (m...@astro.caltech.edu)
