Apr 15, 2005, 7:18:51 PM

to Perl6 Language List

More questions stemming from cookbook work... Decimal Comparisons:

The most common recipe around for comparisons is to use sprintf to cut

the decimals to size and then compare strings. Seems ugly.

The non-stringification way to do it is usually along the lines of:

if (abs($value1 - $value2) < abs($value1 * epsilon))

(From Mastering Algorithms with Perl errata)
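
For anyone who wants to try the two recipes side by side, here is a minimal Perl 5 sketch. The helper names equal_by_sprintf and equal_by_epsilon are made up for illustration, and the number of decimals and the epsilon value are arbitrary assumptions, not recommendations.

```perl
use strict;
use warnings;

# Hypothetical helper: round both values to $places decimals with sprintf,
# then compare the resulting strings.
sub equal_by_sprintf {
    my ($x, $y, $places) = @_;
    return sprintf("%.*f", $places, $x) eq sprintf("%.*f", $places, $y);
}

# Hypothetical helper: the relative-epsilon test quoted above.
# The tolerance scales with $x only, so the test is not symmetric,
# and because of the strict "<" it is false when both values are 0.
sub equal_by_epsilon {
    my ($x, $y, $epsilon) = @_;
    return abs($x - $y) < abs($x * $epsilon);
}

my $sum = 0.1 + 0.2;
print "==      : ", ($sum == 0.3                        ? "equal" : "different"), "\n";
print "sprintf : ", (equal_by_sprintf($sum, 0.3, 10)    ? "equal" : "different"), "\n";
print "epsilon : ", (equal_by_epsilon($sum, 0.3, 1e-9)  ? "equal" : "different"), "\n";
```

The epsilon form avoids the round trip through strings, but the caller still has to pick an epsilon that suits the scale of the data.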

I'm wondering though, if C<$value1 == $value2> is always wrong (or

almost always wrong) then should it be smarter and:

a. throw a warning

b. DWIM using overloaded operators (as in reduce precision then compare)

c. throw a warning but have other comparison operators just for this

case to make sure you know what you're doing

I'd vote for b., but I don't know enough about the problem domain to

know if that is safe, and realistically I just want to write the

cookbook entry rather than start a math-geniuses flame war ;-)

Which leads to another question: Are there $value.precision() and

$value.accuracy() methods available for decimals? I'd really rather

not do the string comparison if it can be avoided, maybe it's just the

purist in me saying "leave the numbers be" :-)

Apologies in advance if this is somewhere I missed. I did a lot of searching.

Marcus Adair

Apr 15, 2005, 8:16:30 PM

to Perl6 Language List

At 16:18 -0700 4/15/05, gcomnz wrote:

>More questions stemming from cookbook work... Decimal Comparisons:

>

>The most common recipe around for comparisons is to use sprintf to cut

>the decimals to size and then compare strings. Seems ugly.

>

>The non-stringification way to do it is usually along the lines of:

>

>if (abs($value1 - $value2) < abs($value1 * epsilon))

>

>(From Mastering Algorithms with Perl errata)

>

>I'm wondering though, if C<$value1 == $value2> is always wrong (or

>almost always wrong) then should it be smarter and:

> SNIP

>Marcus Adair

I have longed for an OO class that might be called "measurement". An object would include a float, a unit of measure, and an estimate of accuracy.

Mathematical operations would be overloaded so that the result of a calculation would appropriately handle propagation of the arguments' accuracies into the result. It might even do unit conversions but that's another subject. Coercion of a float into a measurement would be automatic with infinite precision assumed.

Given the new class it is easy to adjust comparison operators to calculate "within experimental error".
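
A rough Perl 5 sketch of such a class, using the overload pragma, might look like the following. Everything here is an assumption made for illustration: the class name Measurement, the worst-case propagation rules (absolute errors add under addition and subtraction, first-order error under multiplication), and the choice to treat two measurements as equal when their error intervals overlap. Units of measure are left out entirely.

```perl
package Measurement;    # hypothetical class, not an existing module
use strict;
use warnings;
use overload
    '+'  => \&add,
    '-'  => \&subtract,
    '*'  => \&multiply,
    '==' => \&equals,
    '""' => \&to_string;

sub new {
    my ($class, $value, $error) = @_;
    $error = 0 unless defined $error;
    return bless { value => $value, error => abs($error) }, $class;
}

# A plain number is coerced to a measurement with zero error,
# i.e. infinite precision is assumed.
sub _lift {
    my ($x) = @_;
    return ref($x) eq 'Measurement' ? $x : Measurement->new($x, 0);
}

# Addition: absolute errors add (worst case).
sub add {
    my ($x, $y) = @_;
    $y = _lift($y);
    return Measurement->new($x->{value} + $y->{value},
                            $x->{error} + $y->{error});
}

# Subtraction: absolute errors still add; honour swapped operands.
sub subtract {
    my ($x, $y, $swapped) = @_;
    $y = _lift($y);
    ($x, $y) = ($y, $x) if $swapped;
    return Measurement->new($x->{value} - $y->{value},
                            $x->{error} + $y->{error});
}

# Multiplication: first-order error |x|*dy + |y|*dx.
sub multiply {
    my ($x, $y) = @_;
    $y = _lift($y);
    my $error = abs($x->{value}) * $y->{error}
              + abs($y->{value}) * $x->{error};
    return Measurement->new($x->{value} * $y->{value}, $error);
}

# "Within experimental error": the two error intervals overlap.
sub equals {
    my ($x, $y) = @_;
    $y = _lift($y);
    return abs($x->{value} - $y->{value}) <= $x->{error} + $y->{error};
}

sub to_string {
    my ($self) = @_;
    return "$self->{value} +/- $self->{error}";
}

package main;

my $length = Measurement->new(2.00, 0.01);
my $width  = Measurement->new(3.00, 0.02);
my $area   = $length * $width;
print "area = $area\n";
print "area agrees with 6.05 within error\n" if $area == 6.05;
```

A statistical treatment (adding errors in quadrature) would be just as defensible; the worst-case rules are used here only because they are the simplest to read.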

--

--> Life begins at ovulation. Ladies should endeavor to get every young life fertilized. <--

Apr 16, 2005, 8:33:56 PM

to Perl6 Language List

On Fri, Apr 15, 2005 at 04:18:51PM -0700, gcomnz wrote:

: More questions stemming from cookbook work... Decimal Comparisons:

:

: The most common recipe around for comparisons is to use sprintf to cut

: the decimals to size and then compare strings. Seems ugly.

Certainly, but I don't think the situation arises all that frequently

in practice.

: The non-stringification way to do it is usually along the lines of:

:

: if (abs($value1 - $value2) < abs($value1 * epsilon))

:

: (From Mastering Algorithms with Perl errata)

You should someday peruse the Ada documentation for this sort of stuff.

It's pretty enlightening (in the Zen sense of feeling smacked upside

the head) about how hard it is to guarantee any particular set of

"real" semantics across unknown machine architectures.

: I'm wondering though, if C<$value1 == $value2> is always wrong (or

: almost always wrong) then should it be smarter and:

:

: a. throw a warning

: b. DWIM using overloaded operators (as in reduce precision then compare)

: c. throw a warning but have other comparison operators just for this

: case to make sure you know what you're doing

:

: I'd vote for b., but I don't know enough about the problem domain to

: know if that is safe, and realistically I just want to write the

: cookbook entry rather than start a math-geniuses flame war ;-)

I think I'd vote for

d. Educate users that they almost never want to use == to compare

two numbers on the real number line, however they're represented.

e. Put some rough dwimmery into the ~~ operator instead.

The definition of "rough dwimmery" can be negotiated, but as a form

of definitional handwaving it's probably enough to write examples with.

I welcome input from people who know more about this than I do,

but in the interests of speed, you probably want some way to tell ~~

how many bits of binary mantissa to pay attention to if the exponents

are equal, so you can do most of your work without a horrid conversion

to decimal, or less horribly, subtraction to find a number to compare

to your epsilon. But by the time you compare exponents, it's possible

that it's cheaper to go ahead and let the hardware do the subtraction

for you. I'll leave that up to the implementors.

My main point is that the default "slop" should be fairly liberal on

the assumption that the ordinary user has written a fairly sloppy

algorithm. I'd probably leave about the last 8 bits out of an

ordinary double comparison, but let the user change that pragmatically.

I can see arguments for making the slop a lot bigger than that.

On the other hand, you don't want to make it so big that different

integers start comparing as equal merely because they happen to be big.
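
To make the "leave out the last 8 bits" idea concrete, here is a hedged Perl 5 sketch. It is not how ~~ is or will be defined; roughly_equal is a made-up helper, it assumes doubles with a 53-bit mantissa, and it approximates "ignore the low N bits" with a relative tolerance found by subtraction rather than by masking bits.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: treat two numbers as equal when they differ by no
# more than roughly 2**$slop_bits units in the last place of a 53-bit
# mantissa. The default slop of 8 bits follows the figure suggested above.
sub roughly_equal {
    my ($x, $y, $slop_bits) = @_;
    $slop_bits = 8 unless defined $slop_bits;
    return 1 if $x == $y;
    my $tolerance = max(abs($x), abs($y)) * 2 ** ($slop_bits - 53);
    return abs($x - $y) <= $tolerance ? 1 : 0;
}

print "0.1 + 0.2 vs 0.3     : ", roughly_equal(0.1 + 0.2, 0.3), "\n";
print "1000000 vs 1000001   : ", roughly_equal(1_000_000, 1_000_001), "\n";
# With 8 bits of slop, neighbouring integers near 2**46 are within tolerance,
# which is the "big integers compare equal" hazard mentioned above.
print "2**46 vs 2**46 + 1   : ", roughly_equal(2**46, 2**46 + 1), "\n";
```

Raising the slop makes sloppy algorithms more forgiving but pulls the point where distinct integers collide down towards smaller values.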

On the other other hand, if you know you're storing only integer

values into your floaters, you can safely use == instead of ~~ because

they will always compare equal until you exceed the precision of

the floater.

On the other other other hand, we're gonna all be using 64-bit

computers in a few years, so there will be less demand for putting

large integers into floaters to get around 32-bit limitations.

'Course, then we'll be worrying about the precision of 128-bit floaters...

: Which leads to another question: Are there $value.precision() and

: $value.accuracy() methods available for decimals? I'd really rather

: not do the string comparison if it can be avoided, maybe it's just the

: purist in me saying "leave the numbers be" :-)

As the other reply pointed out, it'd be nice to have a data type

that supports these semantics, but they would certainly run slower

than the usual floating-point algorithms do. Historically the people

doing this sort of stuff would rather have the raw speed and just

sort of hand calculate the uncertainty in their heads, from the basic

rule-of-thumb that additions add uncertainty, while multiplications

multiply uncertainties.

But for models that are trying to be provably accurate rather than

fast, I could see a use for the smarter data type. For the rest of

us, leaving out about 8 bits for ~~ comparison lets us use algorithms

with a little bit of additive slop, and punishes us a bit when we

use algorithms with a lot of multiplicative slop. :-)

: Apologies in advance if this is somewhere I missed. I did a lot of searching.

It's not something we've paid much attention to, partly because I *did*

read the Ada spec when it first came out. :-)

Larry
