clojure.core/max and NaN

bOR_

unread,

Oct 30, 2011, 5:02:23 AM10/30/11

to clo...@googlegroups.com

Hi all,

Ran into something unexpected with "max".

user> (sd-age-female 13)

[10 NaN 0.746555245613119]

user> (apply max (sd-age-female 13))

0.746555245613119

user>

I don't know what the mathematically correct way would be of max in response to a NaN, but I am surprised that it picks 0.746 above 10 when there's a NaN inbetween there.

Ben Smith-Mannschott

unread,

Oct 30, 2011, 10:36:23 AM10/30/11

to clo...@googlegroups.com

On Sun, Oct 30, 2011 at 10:02, bOR_ <boris....@gmail.com> wrote:
> Hi all,
> Ran into something unexpected with "max".
> user> (sd-age-female 13)
>
>
> [10 NaN 0.746555245613119]
>
>
> user> (apply max (sd-age-female 13))
>
> 0.746555245613119

TL;DR: Don't expect sensible answers when NaN is involved.

The implementation of max in Clojure core assumes that it's possible
to compare the things it is given and get consistent results.

clojure.core:
#
# (defn max
# "Returns the greatest of the nums."
# {:added "1.0"
# :inline-arities >1?
# :inline (nary-inline 'max)}
# ([x] x)
# ([x y] (. clojure.lang.Numbers (max x y)))
# ([x y & more]
# (reduce1 max (max x y) more)))

clojure.lang.Numbers:
#
# static public double max(double x, double y){
# if(x > y){
# return x;
# } else {
# return y;
# }
# }

This doesn't turn out to be the case with IEEE floating point numbers
because of the behavior of NaN, as mandated by the standard.

For every number that is not NaN, the world works as we expect:

a < b && b < c implies a < c
a < b implies !(a >= b)
a < b implies b > a
a == b implies b == a

But, adding NaN to the mix changes all that. NaN is like a great big
black hole for truth. All of these are false:

(NaN == NaN)
(NaN < b)
(NaN >= b)
...

http://java.sun.com/docs/books/jls/third_edition/html/typesValues.html#4.2.4
#
# An operation that overflows produces a signed infinity, an operation
# that underflows produces a denormalized value or a signed zero, and an
# operation that has no mathematically definite result produces NaN. All
# numeric operations with NaN as an operand produce NaN as a result. As
# has already been described, NaN is unordered, so a numeric comparison
# operation involving one or two NaNs returns false and any !=
# comparison involving NaN returns true, including x!=x when x is NaN.

So, in your example (max [10 NaN 0.74]) what happens is this:

- (max 10 NaN) discovers that (10 > NaN) -> false, therefore
NaN is chosen as the "max" of the first two is NaN.

- (max NaN 0.74) discovers that (NaN > 0.74) -> false, therefore
0.74 is the maximum.

// Ben

Mikhail Kryshen

unread,

Oct 30, 2011, 12:20:59 PM10/30/11

to clo...@googlegroups.com

Why does Clojure have it's own naive implementation of max for doubles
instead of using max from java.lang.Math which has necessary checks for
NaN and the positive and negative zeros?

--
Mikhail

Brian Goslinga

unread,

Oct 30, 2011, 10:45:57 AM10/30/11

to Clojure

As a work around, you can use max-key with an appropriately written
key function that returns negative infinity for NaN (if that is the
behavior you want).

Ben Smith-Mannschott

unread,

Nov 1, 2011, 12:14:37 PM11/1/11

to clo...@googlegroups.com

I've opened http://dev.clojure.org/jira/browse/CLJ-868

"The min and max functions in clojure.core behave unpredictably when
one or more of their arguments is Float/NaN or Double/NaN. This is
because the current implementation assumes that > provides a total
ordering, but this is not the case when NaN is added to the mix. This
is an unfortunate fact of life when dealing with IEEE floating point
numbers."

It seems to me that there are four approaches one might take to
address this.

1. Document the current undefined behavior of min and max in the
presence of NaN.
2. Define that min and max will return NaN if any argument is NaN.
3. Define that min and max will ignore any NaN arguments.
4. Define that min and max will throw an exception if any argument is NaN.

1 Document current behavior as undefined
----------------------------------------

This requires no changes in the implementation, but it doesn't strike
me as a desirable resolution. Why unnecessarily codify clearly
confusing behavior?

2 Make NaN contagious
---------------------

Define min and max to return NaN if and only if at least one of their
arguments is NaN. This seems most in keeping with the (admittedly
perverse) behavior of NaN as specified.

See JLS 4.2.4 Floating Point Operations:

# An operation that overflows produces a signed infinity, an operation
# that underflows produces a denormalized value or a signed zero, and

# an operation that has no mathematically definite result produces
# NaN. All numeric operations with NaN as an operand produce NaN as a
# result. As has already been described, NaN is unordered, so a
# numeric comparison operation involving one or two NaNs returns false
# and any != comparison involving NaN returns true, including x!=x
# when x is NaN.

3 Let min and max ignore NaN arguments
--------------------------------------

This means that (min a NaN b) would be exactly equivalent to (min a
b). It would further imply that (min NaN) would be equivalent to
(min), which currently throws an exception.

4 Let NaN cause min and max to throw an exception
-------------------------------------------------

Currently min and max throw an exception if given arguments that are
not Numeric. One might plausibly argue that NaN is not numeric.

--

I've attached a patch with test and implementation for variant 2 to
the issue.

// Ben

Brian Hurt

unread,

Nov 1, 2011, 1:43:01 PM11/1/11

to clo...@googlegroups.com

On Tue, Nov 1, 2011 at 12:14 PM, Ben Smith-Mannschott <bsmit...@gmail.com> wrote:

2 Make NaN contagious
---------------------

Define min and max to return NaN if and only if at least one of their
arguments is NaN. This seems most in keeping with the (admittedly
perverse) behavior of NaN as specified.

The behavior isn't perverse- there's a difference between numerical comparisons and ordering. To see why this is so, consider the case should NaN = NaN. For ordering (sorting, etc.), this makes perfect sense. Numerically, however- sqrt(-1) = NaN, and sqrt(-2) = NaN as well, so if NaN = NaN, then sqrt(-1) = sqrt(-2). It's easy to get from there to all sorts of fun conclusions, like 1 = 2, and so on. The IEEE spec implements (to the extent it can) *numerical* equality. What most people want is ordering (unless the want numerical equality).

Pretty much every language ever screws this up.

Brian

Michael

unread,

Nov 1, 2011, 4:00:26 PM11/1/11

to Clojure

On Nov 1, 12:14 pm, Ben Smith-Mannschott <bsmith.o...@gmail.com>
wrote:

> 3. Define that min and max will ignore any NaN arguments.

What is:

(min NaN NaN)

in this situation; ()?

Ben Smith-Mannschott

unread,

Nov 1, 2011, 5:23:54 PM11/1/11

to clo...@googlegroups.com

The part of the message you didn't quote implies that it would behave
as (min): throw an exception. The other option would be to return NaN.
When I sat down to actually write the patch to implement this variant,
I found that the latter behavior fell out of the implementation
naturally, so I went with that.

// ben

Alexander Taggart

unread,

Nov 1, 2011, 8:08:02 PM11/1/11

to clo...@googlegroups.com

See http://dev.clojure.org/jira/browse/CLJ-738 for clarity on this issue.

Reply all

Reply to author

Forward