# Comparing rationals/floats

### Gcomnz

Apr 15, 2005, 7:18:51 PM
to Perl6 Language List
More questions stemming from cookbook work... Decimal Comparisons:

The most common recipe around for comparisons is to use sprintf to cut
the decimals to size and then compare strings. Seems ugly.

The non-stringification way to do it is usually along the lines of:

if (abs($value1 - $value2) < abs($value1 * epsilon))

(From Mastering Algorithms with Perl errata)
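Both recipes can be sketched as follows (Python here, purely for illustration; the epsilon value and the number of decimal places are arbitrary choices, not anything the cookbook prescribes):

```python
def approx_equal(a, b, rel_eps=1e-9):
    """The non-stringification recipe: true when the difference
    is small relative to the magnitude of the first value."""
    return abs(a - b) < abs(a * rel_eps)

def stringy_equal(a, b, places=6):
    """The sprintf recipe: format both values to a fixed number
    of decimal places and compare the resulting strings."""
    return ("%.*f" % (places, a)) == ("%.*f" % (places, b))

print((0.1 + 0.2) == 0.3)             # False: plain == sees the rounding error
print(approx_equal(0.1 + 0.2, 0.3))   # True
print(stringy_equal(0.1 + 0.2, 0.3))  # True
```

The string version really is uglier: it costs a formatting pass per comparison and its notion of "close" depends on the magnitude of the values, whereas the relative-epsilon test scales with them.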

I'm wondering though, if C<$value1 == $value2> is always wrong (or
almost always wrong) then should it be smarter and:

a. throw a warning
b. DWIM using overloaded operators (as in reduce precision then compare)
c. throw a warning but have other comparison operators just for this
case to make sure you know what you're doing

I'd vote for b., but I don't know enough about the problem domain to
know if that is safe, and realistically I just want to write the
cookbook entry rather than start a math-geniuses flame war ;-)

Which leads to another question: Are there $value.precision() and
$value.accuracy() methods available for decimals? I'd really rather
not do the string comparison if it can be avoided, maybe it's just the
purist in me saying "leave the numbers be" :-)

Apologies in advance if this is somewhere I missed. I did a lot of searching.

### Doug McNutt

Apr 15, 2005, 8:16:30 PM
to Perl6 Language List
At 16:18 -0700 4/15/05, gcomnz wrote:
>More questions stemming from cookbook work... Decimal Comparisons:
>
>The most common recipe around for comparisons is to use sprintf to cut
>the decimals to size and then compare strings. Seems ugly.
>
>The non-stringification way to do it is usually along the lines of:
>
>if (abs($value1 - $value2) < abs($value1 * epsilon))
>
>(From Mastering Algorithms with Perl errata)
>
>I'm wondering though, if C<$value1 == $value2> is always wrong (or
>almost always wrong) then should it be smarter and:
> SNIP

I have longed for an OO class that might be called "measurement". An object would include a float, a unit of measure, and an estimate of accuracy.

Mathematical operations would be overloaded so that the result of a calculation would appropriately handle propagation of the arguments' accuracies into the result. It might even do unit conversions but that's another subject. Coercion of a float into a measurement would be automatic with infinite precision assumed.

Given the new class it is easy to adjust comparison operators to calculate "within experimental error".
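A minimal sketch of such a class (Python, for illustration; the class and method names are invented here, and standard linear error propagation, in quadrature for independent uncertainties, is one possible choice of propagation rule):

```python
import math

class Measurement:
    """A float plus a unit of measure and an uncertainty estimate."""
    def __init__(self, value, unit="", uncertainty=0.0):
        self.value, self.unit, self.uncertainty = value, unit, uncertainty

    @staticmethod
    def _coerce(x):
        # A bare number becomes a measurement with infinite precision.
        return x if isinstance(x, Measurement) else Measurement(float(x))

    def __add__(self, other):
        other = self._coerce(other)
        # Independent absolute uncertainties add in quadrature.
        u = math.hypot(self.uncertainty, other.uncertainty)
        return Measurement(self.value + other.value, self.unit, u)

    def __mul__(self, other):
        other = self._coerce(other)
        v = self.value * other.value
        # For products, the *relative* uncertainties add in quadrature.
        ru = (math.hypot(self.uncertainty / self.value,
                         other.uncertainty / other.value)
              if self.value and other.value else 0.0)
        return Measurement(v, (self.unit + "*" + other.unit).strip("*"),
                           abs(v) * ru)

    def __eq__(self, other):
        # "Within experimental error": the intervals overlap.
        other = self._coerce(other)
        return abs(self.value - other.value) <= self.uncertainty + other.uncertainty

print(Measurement(9.81, "m/s^2", 0.02) == Measurement(9.80, "m/s^2", 0.02))
# True: the 0.01 difference is within the combined 0.04 uncertainty
```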

--

--> Life begins at ovulation. Ladies should endeavor to get every young life fertilized. <--

### Larry Wall

Apr 16, 2005, 8:33:56 PM
to Perl6 Language List
On Fri, Apr 15, 2005 at 04:18:51PM -0700, gcomnz wrote:
: More questions stemming from cookbook work... Decimal Comparisons:
:
: The most common recipe around for comparisons is to use sprintf to cut
: the decimals to size and then compare strings. Seems ugly.

Certainly, but I don't think the situation arises all that frequently
in practice.

: The non-stringification way to do it is usually along the lines of:
:
: if (abs($value1 - $value2) < abs($value1 * epsilon))
:
: (From Mastering Algorithms with Perl errata)

You should someday peruse the Ada documentation for this sort of stuff.
It's pretty enlightening (in the Zen sense of feeling smacked upside
the head) about how hard it is to guarantee any particular set of
"real" semantics across unknown machine architectures.

: I'm wondering though, if C<$value1 == $value2> is always wrong (or
: almost always wrong) then should it be smarter and:
:
: a. throw a warning
: b. DWIM using overloaded operators (as in reduce precision then compare)
: c. throw a warning but have other comparison operators just for this
: case to make sure you know what you're doing
:
: I'd vote for b., but I don't know enough about the problem domain to
: know if that is safe, and realistically I just want to write the
: cookbook entry rather than start a math-geniuses flame war ;-)

I think I'd vote for

d. Educate users that they almost never want to use == to compare
two numbers on the real number line, however they're represented.
e. Put some rough dwimmery into the ~~ operator instead.

The definition of "rough dwimmery" can be negotiated, but as a form
of definitional handwaving it's probably enough to write examples with.

But in the interests of speed, you probably want some way to tell ~~
how many bits of binary mantissa to pay attention to if the exponents
are equal, so you can do most of your work without a horrid conversion
to decimal, or less horribly, subtraction to find a number to compare
to your epsilon. But by the time you compare exponents, it's possible
that it's cheaper to go ahead and let the hardware do the subtraction
for you. I'll leave that up to the implementors.
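One way to read "pay attention to only so many mantissa bits" is to reinterpret each double's bit pattern as an integer and compare in units-in-the-last-place, which needs no conversion to decimal and no subtraction of magnitudes. This is a sketch of that idea in Python (the `ulp_key` mapping and the default of 8 ignored bits are choices made here, not a specification of ~~):

```python
import struct

def ulp_key(x):
    """Map a double's bit pattern to an integer that is monotonic in
    the float ordering (sign-magnitude bits -> biased lexical order)."""
    u = struct.unpack("<Q", struct.pack("<d", x))[0]
    return (u ^ 0xFFFFFFFFFFFFFFFF) if u >> 63 else (u | 1 << 63)

def fuzzy_eq(a, b, ignore_bits=8):
    """True when a and b are within 2**ignore_bits units in the last
    place, i.e. we disregard roughly the low ignore_bits of mantissa."""
    return abs(ulp_key(a) - ulp_key(b)) <= 1 << ignore_bits

x = 0.0
for _ in range(10):
    x += 0.1            # accumulates rounding error

print(x == 1.0)         # False: the sum is one ULP shy of 1.0
print(fuzzy_eq(x, 1.0)) # True: well within the 256-ULP slop
```

Because the test is in ULPs, the slop automatically scales with the exponent, so it behaves like a relative epsilon without ever computing one.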

My main point is that the default "slop" should be fairly liberal on
the assumption that the ordinary user has written a fairly sloppy
algorithm. I'd probably leave about the last 8 bits out of an
ordinary double comparison, but let the user change that pragmatically.

I can see arguments for making the slop a lot bigger than that.
On the other hand, you don't want to make it so big that different
integers start comparing as equal merely because they happen to be big.

On the other other hand, if you know you're storing only integer
values into your floaters, you can safely use == instead of ~~ because
they will always compare equal until you exceed the precision of
the floater.
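That crossover is concrete: an IEEE double carries 53 bits of mantissa, so every integer up to 2**53 is represented exactly and == is safe below that line (a quick Python illustration):

```python
# A double's 53-bit mantissa holds every integer up to 2**53 exactly;
# past that point, consecutive integers start to collide.
big = 2.0 ** 53

print(big == big + 1)   # True: 2**53 + 1 is not representable and
                        # rounds back down to 2**53
print(big - 1 == big)   # False: below 2**53 integers are still exact
print(float(10**15) == float(10**15 + 1))  # False: comfortably within 2**53
```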

On the other other other hand, we're gonna all be using 64-bit
computers in a few years, so there will be less demand for putting
large integers into floaters to get around 32-bit limitations.

'Course, then we'll be worrying about the precision of 128-bit floaters...

: Which leads to another question: Are there $value.precision() and
: $value.accuracy() methods available for decimals? I'd really rather
: not do the string comparison if it can be avoided, maybe it's just the
: purist in me saying "leave the numbers be" :-)

As the other reply pointed out, it'd be nice to have a data type
that supports these semantics, but they would certainly run slower
than the usual floating-point algorithms do. Historically the people
doing this sort of stuff would rather have the raw speed and just
sort of hand calculate the uncertainty in their heads, from the basic
multiply uncertainties.

But for models that are trying to be provably accurate rather than
fast, I could see a use for the smarter data type. For the rest of
us, leaving out about 8 bits for ~~ comparison lets us use algorithms
with a little bit of additive slop, and punishes us a bit when we
use algorithms with a lot of multiplicative slop. :-)

: Apologies in advance if this is somewhere I missed. I did a lot of searching.

It's not something we've paid much attention to, partly because I *did*