This makes no sense, as the outcome of the operation is undefined and
should be NaN.
max(NaN,0.) = NaN
After researching, it appears the first outcome is accepted behavior,
and might be included in the revised IEEE 754 standard, which affects
not only Fortran. The discussion posted at
www.cs.berkeley.edu/~ejr/Projects/ieee754/meeting-minutes/02-11-21.html#minmax
suggests that "There is no mathematical reason to prefer one reason to
another."
But I think otherwise, for the following reason. Suppose the NaN is
produced by x/y, where x=0 came from an underflow and y=0 came from an
underflow. Then x/y would be a well-defined number that could be
positive or negative. The convention max(NaN,0.) = 0. is wrong at least
half the time.
I agree with you, but unfortunately there's a school of thought that
*sometimes* NaN means "no candidate value". By this logic, the maximum
of two or more values simply omits NaNs as contenders. Yuk.
P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
> This makes no sense, as the outcome of the operation is undefined and
> should be NaN.
> max(NaN,0.) = NaN
Why?
> After researching, it appears the first outcome is accepted behavior,
> and might be included in the revised IEEE 754 standard, which affects
> not only Fortran. The discussion posted at
> www.cs.berkeley.edu/~ejr/Projects/ieee754/meeting-minutes/02-11-21.html#minmax
> suggests that "There is no mathematical reason to prefer one reason to
> another."
Don't take this the wrong way, but the members of the IEEE 754
committee probably have much more experience than you (and
many of the people here in c.l.f) in floating-point mathematics.
If they came to the conclusion that
"There is no mathematical reason to prefer one reason to another."
then you may want to pay attention to them, and guard against
suspect comparisons.
> But I think otherwise, for the following reason. Suppose the NaN is
> produced by x/y, where x=0 came from an underflow and y=0 came from an
> underflow. Then x/y would be a well-defined number that could be
> positive or negative. The convention max(NaN,0.) = 0. is wrong at least
> half the time.
AFAIK, Fortran does not have hysteresis. It does not know
or care how you got to x = 0 and y = 0. All it tries to
do is evaluate x/y. This is a NaN.
Steven G. Kargl wrote:
<snip>
> AFAIK, Fortran does not have hysteresis. It does not know
> or care how you got to x = 0 and y = 0. All it tries to
> do is evaluate x/y. This is a NaN.
Note that this is true, *even if the x/y were an argument
of the max()*.
The compiler must evaluate the values of the arguments,
next associate the values with the dummy arguments,
next call the function.
"History is just one damn thing after another." :-)
--
Cheers!
Dan Nagle
Purple Sage Computing Solutions, Inc.
A standard-conforming Fortran processor is allowed to evaluate
max(x,y) as
if(x > y) then
max = x
else
max = y
endif
If x is NaN, then x > y is unordered (i.e., it is not true in IEEE
arithmetic).
The ELSE branch is taken, so max(NaN,0.0) = 0.0.
--Eric
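For concreteness, here is a minimal C sketch (mine, not from the post above) of the two branch orders; with IEEE comparisons, both x > y and x < y are false when x is NaN, so which operand survives depends entirely on how the branch is written:

#include <math.h>   /* NAN is C99 */
#include <stdio.h>

/* Every ordered comparison involving a NaN is false, so each
   version falls through to its "else" operand when x is NaN. */
static double max_gt(double x, double y) { return (x > y) ? x : y; }
static double max_lt(double x, double y) { return (x < y) ? y : x; }

int main(void) {
    printf("%g\n", max_gt(NAN, 0.0)); /* x > y is false -> 0   */
    printf("%g\n", max_lt(NAN, 0.0)); /* x < y is false -> nan */
    return 0;
}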
On a somewhat related question, a colleague and I have observed
evaluation of MAXVAL() or MINVAL() where certain versions of certain
compiler libraries ignore all array elements up to and including the
last NaN, while other libraries simply ignore all NaN values. We have
never observed NaN as a result. We haven't gathered visible support for
our request that a consistent position should be taken.
> The convention max(NaN,0.) = 0. is wrong at least
> half the time.
But by your argument, the convention Max(NaN,0.)=NaN would be wrong
the other half. That is probably why they concluded
"There is no mathematical reason to prefer one reason to another."
$.02 -Ron Shepard
Of course, NaN could always be a well-defined value, just as Inf can be
produced by an overflow of a finite number. Inf can represent a finite
number too large to fit in the exponent field (an inexact infinity so
to speak); NaN can represent any value, only the computer wasn't able
to figure out which one (an inexact NaN). Its meaning is not
restricted to being "not a real number", because it often results from
a combination of underflows and overflows that do not necessarily
correspond to exact zeros and exact infinities. In this sense,
max(NaN,0.) = NaN is always correct. Indeed, max(NaN,0.) = 0. is
occasionally correct. On the same basis one could suggest the sign of
NaN be negative, because this is occasionally correct.
1. z := Max{x, y} iff z <= x or z <= y
2. z := Max{x, y} iff z >= x and z >= y
Using the first definition, Max{5, NaN} = 5. Under the second
definition, Max{5, NaN} = NaN. There is no mathematical reason to
prefer one reason to another.
---------------------------------------------
I don't understand 1. - doesn't look like a correct Max definition to
me (even if the missing "z belongs to {x,y}" is added).
Jaroslav
Is that how max is defined by the standard? If not, if your
processor instead evaluates max(x,y) as
if(x < y) then
max = y
else
max = x
endif
the else-branch is still taken and now max(NaN,0.0) = NaN.
In this case your argument is in support of both results,
rendering the argument void.
--
Christer Ericson
http://realtimecollisiondetection.net/
And in any case, changing the order of the arguments would change the
result, in both cases. I do think it's imperative that max(a,b)==max(b,a).
Jan
Which ones do? Associativity - yes, I see that. But commutation?
Jan
For what it's worth, the C standard makes it clear that fmax should
produce 0 in this case.
In the footnote to 7.12.12.2:
"NaN arguments are treated as missing data: if one argument is a NaN and
the other numeric, then the fmax functions choose the numeric value. See
F.9.9.2."
Normative appendix F.9.9.2 says, of implementations that define
__STDC_IEC_559__,
"If just one argument is a NaN, the fmax functions return the other
argument (if both arguments are NaNs, the functions return a NaN)."
--
Simon.
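A short C99 check of the quoted behavior (my sketch, assuming an implementation that honors Annex F):

#include <math.h>
#include <stdio.h>

int main(void) {
    /* Per C99 F.9.9.2, a lone NaN argument is treated as missing data. */
    printf("%g\n", fmax(NAN, 0.0)); /* expected: 0   */
    printf("%g\n", fmax(0.0, NAN)); /* expected: 0   */
    printf("%g\n", fmax(NAN, NAN)); /* expected: nan */
    return 0;
}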
OK, I cannot state this for sure, as I don't have access to the IEEE
standard, but I was under the impression (perhaps you will now correct
me) that, if _mathematically_
a+b == (c + nearest(c,+1.))/2,
then you can have in FP
a+b == c
b+a == nearest(c,+1.)
and that more possibilities are allowed with more complex expressions
as a*b + c*d /= c*d + a*b
Jaroslav
I agree that this is a good idea, according to 'the principle of least
astonishment', but the problem here is that NaNs already break this in
various ways. I.e. they are defined (AFAIR) to compare false against any
real number, which does mean that _both_
if (a > NaN) {}
and
if (a <= NaN) {}
should be false, i.e. as soon as you allow NaNs the normal rules for
logical operations break down completely. :-(
At least on x86/x87 hardware any comparison against NaN will return
'UNORDERED', which is distinct from the set of 'greater_than', 'equal',
'less_than'.
Terje
--
- <Terje.M...@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
No, no - you forgot to check for orderability beforehand. You wouldn't
expect this to work for, e.g., spinors (not to mention complex numbers),
would you?
> At least on x86/x87 hardware any comparison against NaN will return
> 'UNORDERED', which is distinct from the set of 'greater_than', 'equal',
> 'less_than'.
And that is as it should be!
Jan
I don't see anything in the algorithmic description of floating-point addition
that allows this kind of asymmetry. You shift the mantissas based on the
difference in exponent towards the larger value, add, and renormalize. All
of these operations are symmetrical in the operands.
Jan
On the CDC Cyber 205, in a vector add the second operand was shifted
so that the exponents became equal. On a right shift, the shifted-off
bits were lost; on a left shift everything was retained. And if the
left shift was too large, a short-cut was made. On that machine, with
those operations, addition was not commutative.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
> On the CDC Cyber 205,... On that machine, with
> those operations, addition was not commutative.
Though I'd personally characterize the philosophy of the CDC machines of
that era as being that speed was more important than the last few bits
of accuracy. They counted on the 60-bit word to be enough to make up for
being "sloppy" about the last few bits.
I can't tell precisely what environment some posters in the thread are
talking about. I suspect different posters are talking about different
things without specifying so, which gets confusing. Some posters have
mentioned the IEEE standard, while others appear to be talking in more
general terms about any floating point implementation or about the
requirements of the Fortran (or C?) standard. I'm afraid that I can't
follow the result because I'm unsure what unstated assumptions apply to
each post. I can make guesses, but it is easy to guess incorrectly in
some cases. I'm occasionally left with the impression that some posters
might be a bit vague on the distinction, which might explain why I have
trouble telling exactly what they meant.
--
Richard Maine | Good judgment comes from experience;
email: my first.last at org.domain| experience comes from bad judgment.
org: nasa, domain: gov | -- Mark Twain
Well, it is unlikely that any of them have more experience than me in
practical numerical software engineering, though I am less than 1% of
the numerical analyst that Kahan is, to give just one example. Note
that I am not saying that some of them don't have comparable experience,
though I can't think of any offhand.
They are quite simply wrong. There IS a very strong mathematical
argument, and the reasons for making that statement are dogmatism, not
science. I give the reasons for the DECISION below, but that is a
consequent matter. It is the reason for making that STATEMENT I am
talking about above.
The background is that traditional design started by creating a fairly
precise mathematical model, and then deriving the operations to fit in
with that model. This maximises the possibilities of reasoning about
the behaviour of the program (e.g. validation, program proving, software
engineering, etc. etc.) at the expense of restricting flexibility.
The alternative approach is to start with the facilities, maximise
them for flexibility, and let the overall model lie where it falls.
This maximises the flexibility of the design, at the cost of making
static validation somewhere between harder and impossible.
One of Kahan's papers points out that IEEE 754 did the latter, and that
it was a deliberate deviation from the traditional approach.
Jumping aside, the need for missing values occurs almost entirely in
vector or other composite operations, and EVERY language that has
supported them needs BOTH semantics. In particular, the requirement
order for the operations in statistics is:
Count non-missing values in vector
Sum non-missing values in vector
Take minimum/maximum of non-missing values in vector
Take product of non-missing values in vector
Derived operations and more esoteric ones
Also, EVERY language needs BOTH semantics, according to context. For
example, in the following:
top = max(max(vector_A,vector_B))
top should be the maximum of the elements of vector_A and vector_B
where BOTH of a pair are non-missing. Look at any decent book on
statistics or good statistical package for ample evidence.
IEEE 754 NaNs are VERY clearly indications of 'invalid' values (though
even that has several interpretations, but the subtleties are irrelevant
to this). If they were to be treated as missing values, then it is
immediately clear that NaN+1.23 should be 1.23. No ifs or buts.
I have a paper on this somewhere, which I have circulated but not
published, if anyone is interested.
Regards,
Nick Maclaren.
> > On the CDC Cyber 205,... On that machine, with
> > those operations, addition was not commutative.
>
> Though I'd personally characterize the philosophy of the CDC machines of
> that era as being that speed was more important than the last few bits
> of accuracy. They counted on the 60-bit word to be enough to make up for
> being "sloppy" about the last few bits.
I did not follow the description in the previous post explaining how
addition was not commutative, but I can add that, if I remember
correctly, unlike the earlier Cyber machines, the Cyber 205 was a
byte-addressable machine with (I think) both 4-byte and 8-byte reals.
$.02 -Ron Shepard
Ron Shepard wrote:
<snip a bunch>
> the Cyber 205 was a
> byte-addressable machine with (I think) both 4-byte and 8-byte reals.
IIRC, the 205 was _bit_ addressable, with a 48-bit address.
Fortran 2003 (and C99, and F95 with TR 15580)
provide standard ways to do what you want, assuming that
NaN indicates a missing value. In Fortran's case,
since IEEE_IS_NAN is elemental, you can simply say for
a vector declared REAL X(N):
> Count non-missing values in vector
non_missing = count(.not.ieee_is_nan(x))
> Sum non-missing values in vector
total = sum(x, mask=.not.ieee_is_nan(x))
(and similarly for PRODUCT)
> Take minimum/maximum of non-missing values in vector
biggest = maxval(x, mask=.not.ieee_is_nan(x))
(and similarly for MINVAL)
You're always free to special-case things, as in
if(any(ieee_is_nan(x))) then
result = NaN
else
result = f(x) ! all non-NaN elements in X
endif
You can do the same thing (albeit more verbosely) in C99 with the
analogous library functions.
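For illustration, here is one way the C99 version might look (a hand-rolled sketch, since C has no intrinsic masked reductions; names are mine):

#include <math.h>
#include <stddef.h>

/* C analogue of COUNT(.NOT.ieee_is_nan(x)) */
size_t count_non_nan(const double *x, size_t n) {
    size_t k = 0;
    for (size_t i = 0; i < n; ++i)
        if (!isnan(x[i])) ++k;
    return k;
}

/* C analogue of SUM(x, mask=.NOT.ieee_is_nan(x)) */
double sum_non_nan(const double *x, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        if (!isnan(x[i])) s += x[i];
    return s;
}

MAXVAL/MINVAL analogues follow the same loop-with-test pattern.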
Some applications will want to silently ignore NaN elements, and
in other cases NaNs indicate serious errors that can be indicated
by NaN result values. Seems to me that the current C and Fortran
language standards offer basic support for both alternatives,
but the onus is on the programmer to anticipate the cases where
NaNs are possible and to take the appropriate action.
Things might get tricky if you're also testing/setting the various IEEE
exception flags or if X contains signaling NaNs. I don't know,
for example, whether using the mask functions above can change
the state of the INVALID flag or if the language standards even
address this issue.
--Eric
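One way to probe that flag question in C99 is with <fenv.h>; a sketch (whether a given library's fmax or isnan touches the flags is implementation-specific, and some compilers ignore the FENV_ACCESS pragma):

#include <fenv.h>
#include <math.h>
#include <stdio.h>

#pragma STDC FENV_ACCESS ON

int main(void) {
    feclearexcept(FE_INVALID);
    double r = fmax(NAN, 0.0); /* quiet NaN argument */
    printf("fmax(NaN,0) = %g, FE_INVALID %s\n",
           r, fetestexcept(FE_INVALID) ? "raised" : "clear");
    return 0;
}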
>After tracking down a bug in my Fortran program, I found that it
>assumed
>max(NaN,0.) = 0.
We thought long and hard about this and decided that least surprise
was for it to return NaN -- many programmers seem to think of max() as
the kind of arithmetic operator which should return NaN if any input
is a NaN. But no competing compiler did this, and it would hurt
performance. So...
FWIW this is a significant source of program errors. For example, the
NAS Parallel benchmarks print out SUCCESSFUL when an error causes the
result to be NaN. You see, they didn't think about this case when they
picked the order of their if() statements.
-- greg
employed by, not speaking for, QLogic/PathScale
[deleted comp.lang.c because I'm tired of them whining about
topicality and cross-posts.]
It was even bit addressable...
If you have a system with no means of detecting floating-point errors
other than looking for Inf and NaN in output (e.g. cygwin/g77) then
doing otherwise can conceal significant bugs.
Don't bet on it :-( Even Fortran doesn't specify what you imply,
and C99's facilities are so badly specified as to be effectively
useless (and even actively harmful).
|> Some applications will want to silently ignore NaN elements, and
|> in other cases NaNs indicate serious errors that can be indicated
|> by NaN result values. Seems to me that the current C and Fortran
|> language standards offer basic support for both alternatives,
|> but the onus is on the programmer to anticipate the cases where
|> NaNs are possible and to take the appropriate action.
Which is precisely what writing robust, portable code (a.k.a. software
engineering) is NOT about!
Regards,
Nick Maclaren.
> In article <1156789047.6...@75g2000cwc.googlegroups.com>,
> ejk...@yahoo.com writes:
> |>
> |> Fortran 2003 (and C99, and F95 with TR 15580)
> |> provide standard ways to do what you want, ...
> |>
> |> You can do the same thing (albeit more verbosely) in C99 with the
> |> analogous library functions.
>
> Don't bet on it :-( Even Fortran doesn't specify what you imply,
> and C99's facilities are so badly specified as to be effectively
> useless (and even actively harmful).
Easy on the hyperbole. C99 spells out quite clearly, in F.9.9.2,
that fmax must effectively ignore NaNs. I don't like it, but our
C99 implementation does exactly what's specified in the 1999
C Standard. And our test suite looks for that behavior too.
P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
No, that's not my point. I agree that THAT is clear.
NaN support and fmax/fmin are in the 'open' (mandatory) sections of the
standard, but footnotes 204/205 refer forward to Annex F. However,
Annex F applies only if an implementation sets __STDC_IEC_559__, and
makes little or no sense on many architectures and with many compiler
options (e.g. VAX/Alpha, zOS, ones with IEEE 754 representation but only
partial semantics, etc.).
And then there is the question of exactly how the "as if" rule and
pragma STDC FP_CONTRACT apply (and the default for THAT may be anything).
The only compiler that I have used that supported C99 Annex F continued
to set __STDC_IEC_559__ even when I specified optimisation options that
affect the numeric results (which is forbidden by IEEE 754). So a lot
of code will have __STDC_IEC_559__, but NOT the semantics it expects.
If, however, the compiler DIDN'T do that, things would be no better.
There is almost nothing in the C99 standard that says what the "IEEE 754"
parts of the standard that are not shielded by __STDC_IEC_559__ (e.g.
<fenv.h>, chunks of <math.h> and some others) mean if __STDC_IEC_559__
is not set. As you recall, this issue was raised repeatedly during the
standardisation process.
And, of course, fmax/fmin are among the most clearly specified parts of
C99's IEEE 754 support - many of the other aspects are MUCH less clear,
and the flag handling is unbelievable. But let's skip that, on the grounds
of good taste and public decency.
Regards,
Nick Maclaren.
> The background is that traditional design started by creating a fairly
> precise mathematical model, and then deriving the operations to fit in
> with that model. This maximises the possibilities of reasoning about
> the behaviour of the program (e.g. validation, program proving, software
> engineering, etc. etc.) at the expense of restricting flexibility.
>
> The alternative approach is to start with the facilities, maximise
> them for flexibility, and let the overall model lie where it falls.
> This maximises the flexibility of the design, at the cost of making
> static validation somewhere between harder and impossible.
>
> One of Kahan's papers points out that IEEE 754 did the latter, and that
> it was a deliberate deviation from the traditional approach.
And yet, it was IEEE 754 which gained widespread usage, and was
officially adopted in at least one, and if I understand this thread
correctly at least two, programming language standards. IYAM, this
demonstrates the gap between computing science in academe and the
practical art of writing programs that are supposed to be used by
humans.
Richard
You have made at least three major errors in that, I am afraid.
Firstly, I am older and more experienced than that, and am referring
to the days before the rise of computing science in academia. The
traditional design I referred to was and is used by practical software
engineers (though we weren't called that, then). Yeah, I know that
we are now an endangered species :-(
Secondly, IEEE 754 has NOT gained widespread acceptance, and almost all
"IEEE 754" systems use its format and not its semantics, or go in for
some simplification or variant of it. Many or most don't support
denormalised numbers or exceptions (as it specifies them) in some of
their modes (often their defaults), and some older and embedded systems
didn't/don't support NaNs or infinities.
Thirdly, only Java has attempted to include it as part of its specification,
and Kahan has written a diatribe against Java. C99 has some support, but
words fail me when describing how unusable it is. Fortran makes gestures,
but the 'support' is minimal, to put it mildly.
IEEE 754R is still in denial about the reasons for the near-total lack
of uptake in programming languages (after 22 years!) and the people
STILL believe that it is due to inertia. No way, Jose.
Regards,
Nick Maclaren.
> Secondly, IEEE 754 has NOT gained widespread acceptance, and almost all
> "IEEE 754" systems use its format and not its semantics, or go in for
> some simplification or variant of it.
You could say the same thing about Standard C, if you're sufficiently
vague about degree of conformance. The fact is that *most* architectures
in widespread use support a pretty good approximation to IEEE 754. If
you want to savor all the small ways they don't quite conform, then ask
Fred Tydeman to give you his list of grievances. But in real life they
hardly matter to the practicing programmer.
> Many or most don't support
> denormalised numbers
I called you on this in Berlin and you temporized. The statement
IME is simply untrue. And ME involves supporting Standard C99 and
C++ libraries on about a dozen popular architectures, with about
as many floating-point processors.
> or exceptions (as it specifies them)
I guess the parenthetic quibble gives you some wiggle room, but
I still have trouble with it. Our fenv.h tests seem to indicate
that the popular architectures *do* meet the requirements of
IEEE 754 in this area.
> in some of
> their modes (often their defaults),
Another potential quibble, which I still have trouble believing,
from direct experience.
> and some older and embedded systems
> didn't/don't support NaNs or infinities.
Now that's true. My former company Whitesmiths, Ltd. provided software
floating-point packages that blew off denormals, infinities, and NaNs,
with no complaints from our customers. But the last packages we shipped
went out the door in 1988. Since then, both hardware and software have
learned to take IEEE 754 a bit more seriously. My book "The Standard C
Library" worried considerably about infinities and NaNs back in 1992.
It looks pretty naive to me now, in many ways, but it got things mostly
right. And IMO it reflected the spirit of the times.
> Thirdly, only Java has attempted to include it as part of its
> specification,
> and Kahan has written a diatribe against Java.
IIRC, the diatribe did *not* complain that Java slavishly followed
IEEE 754 to its detriment. There was this little thing about favoring
Sun architecture over Intel, then patching things up in a half-assed
manner...
> C99 has some support, but
> words fail me when describing how unusable it is.
The detailed words always seem to fail you, but you're consistently
quick with the poisonous adjectives. In fact, C99 has two extensive
annexes (F and G) describing in exquisite detail how IEEE 754 and
Standard C should play together. My current company Dinkumware, Ltd.
has endeavored to conform completely to those annexes, and aside
from a few aggressively perverse tests in Tydeman's floating-point
suite we do just fine. While I don't always agree with some of the
more creative choices made in those annexes (particularly G), I had
little trouble following their guidance. Hence, I have to disagree
that C99 support for IEEE 754 is unusable, either to us or our
customers.
> Fortran makes gestures,
> but the 'support' is minimal, to put it mildly.
I'm glad you're capable of putting something mildly, from time to
time. Now if only you could put some of your jeremiads a bit more
accurately. Or even more precisely, so they're easier to criticize
in detail.
> IEEE 754R is still in denial about the reasons for the near-total lack
> of uptake in programming languages (after 22 years!) and the people
> STILL believe that it is due to inertia. No way, Jose.
From out here, IEEE 754R seems to be suffering more from internecine
warfare than from any denial about their effects on programming
languages. But I'm always willing to cut slack for any war
correspondent. YMMV.
> Since then, both hardware and software have learned to take
> IEEE 754 a bit more seriously. My book "The Standard C Library"
> worried considerably about infinities and NaNs back in 1992.
> It looks pretty naive to me now, in many ways, but it got
> things mostly right. And IMO it reflected the spirit of the
> times.
I'd be gratified if you'd say a few more words about how the
discussion in "The Standard C Library" treats floating-point
arithmetic (or just infinities and NaNs) naively. An important
part of my own understanding of floating-point and IEEE 754 has
come from reading that book, so I'm now wondering how naive I am
;-)
--
"This is a wonderful answer.
It's off-topic, it's incorrect, and it doesn't answer the question."
--Richard Heathfield
> "P.J. Plauger" <p...@dinkumware.com> writes:
>
>> Since then, both hardware and software have learned to take
>> IEEE 754 a bit more seriously. My book "The Standard C Library"
>> worried considerably about infinities and NaNs back in 1992.
>> It looks pretty naive to me now, in many ways, but it got
>> things mostly right. And IMO it reflected the spirit of the
>> times.
>
> I'd be gratified if you'd say a few more words about how the
> discussion in "The Standard C Library" treats floating-point
> arithmetic (or just infinities and NaNs) naively. An important
> part of my own understanding of floating-point and IEEE 754 has
> come from reading that book, so I'm now wondering how naive I am
> ;-)
Well, the topic of this thread is one example. I assumed that NaN
always meant, "Abandon hope, all meaning is lost." The idea never
occurred to me that it might simply mean, "Pay no attention to me,
just go compute a useful value from other arguments." (And I still
have trouble with that viewpoint.)
Similarly, I always took Inf to mean "unbounded". But C99 actually
expects you to compute many different angles for atan2, as if two
Inf values were usefully taken as equal sides of a triangle. (And
I still have trouble with that viewpoint.)
I was unaware of several of the subtle requirements of IEEE 754,
particularly regarding domain vs. range errors and Inf vs. NaN
return values at mathematical singularities.
The code I presented in that book often botched denormal arguments.
It's all too easy to blow away those tiny values by mixing them
unnecessarily with other values.
The code was also intentionally cavalier about preserving, and
generating, -0 results. (And I still think that -0 is of dubious
value.)
All I had back then were the Elefunt tests, which I translated
painfully from Fortran. With the much better testing technology
we've developed in Dinkumware, we've cleaned up quite a few
spots where we lost accuracy. We also eliminated a few spots
where we were trying too hard, to no avail.
Finally, our work in Dinkumware on the special math functions
drove home the importance of sensitivity, both in determining
how hard an implementer should try to get exactly the right
result and in how much a user should trust numeric
approximations to functions in their sensitive regions.
That's all.
The R language, mostly used for statistics work, has both NA and NaN.
NA when you don't know anything about the value, often used
in input data where no data exists. NaN usually as the result
of computations.
I believe the distinction could also make sense in other languages,
though I don't expect it to appear anytime soon.
-- glen
I suppose you could regard trap handlers as part of "IEEE exceptions".
I've never seen an implementation of IEEE traps.
--
J. Giles
"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare
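For what it's worth, glibc does expose a hook for enabling traps; a sketch (feenableexcept is a GNU extension, not standard C, and recovering from the resulting SIGFPE is another matter entirely):

#define _GNU_SOURCE
#include <fenv.h>   /* feenableexcept: GNU extension */
#include <stdio.h>

int main(void) {
    /* Unmask these exceptions: they now deliver SIGFPE
       instead of quietly producing Inf/NaN results. */
    feenableexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW);
    volatile double zero = 0.0;  /* volatile defeats constant folding */
    printf("%g\n", 1.0 / zero);  /* trap fires here */
    return 0;
}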
I hesitate to stick my head up in such august company, being so
nonaugust, but I was impressed by how easy it was to replicate
Kahan's Borda's mouthpiece example in C99-based ANS Forth
implementations (pfe and gforth) of complex mathematics:
http://www-personal.umich.edu/~williams/archive/forth/complex/borda.html
And furthermore how easy it was to get the right answers
generally for the complex elementary functions above and below
their principal value branch cuts:
http://www-personal.umich.edu/~williams/archive/forth/complex/complex-ieee-test.fs
The link just above is not comprehensible outside of a niche of
a niche. :-)
-- David
Don't worry - as in politics, it is often the least august people who
see most clearly.
|> but I was impressed by how easy it was to replicate
|> Kahan's Borda's mouthpiece example in C99-based ANS Forth
|> implementations (pfe and gforth) of complex mathematics:
|>
|> http://www-personal.umich.edu/~williams/archive/forth/complex/borda.html
|>
|> And furthermore how easy it was to get the right answers
|> generally for the complex elementary functions above and below
|> their principal value branch cuts:
|>
|> http://www-personal.umich.edu/~williams/archive/forth/complex/complex-ieee-test.fs
Well, yes, but one of the questions is whether that is useful. Now, I
quite agree that a numerical analyst as good as Kahan can use branch
cut information directly in real applications, and I date from the days
when we were taught to do so (though I doubt that I still can), but I
know of nobody under 45 who can.
My experience is that Bill Plauger is understating the case - the IEEE 754
handling of zeroes (and hence infinities) causes a huge number of errors
that would not have occurred in older arithmetics. People REALLY don't
expect the "gotchas" it introduces - and C99 makes that a lot worse.
|> The link just above is not comprehensible outside of a niche of
|> a niche. :-)
Very true. In a related niche, I have a simple test that shows up the
chaos with C99's complex division and overflow. It really is VERY
nasty :-(
Regards,
Nick Maclaren.
I have. I have even tested them. They didn't work, because few (if
any) modern systems support interrupt recovery by user code properly.
However, Plauger has mangled what I said so appallingly that I hope
that you don't think that I said what he implied I said.
What I said was:
Many or most don't support denormalised numbers or exceptions
(as it specifies them) in some of their modes (often their defaults),
and some older and embedded systems didn't/don't support NaNs or
infinities.
That statement is true and I stand by it. I did not say that any
modern general-purpose desktop and server systems don't support
IEEE 754 at all[*], though many do not support C99 - for good reasons,
including customer demand.
[*] zOS supports only a subset - see:
http://www-306.ibm.com/software/awdtools/czos/features/
Regards,
Nick Maclaren.
You cannot now say the same about C90, though you could up to about 1995.
You can, indeed, say the same about C99 - and a large part of the reason
for that is customer demand for C90 (for very good reasons).
Unfortunately, they do matter very much to anyone who is attempting to
write robust, portable code or is in the position of assisting users
who develop on one system and then find that their code doesn't work
on another. I have considerable experience with both.
|> > Many or most don't support
|> > denormalised numbers
|>
|> I called you on this in Berlin and you temporized. The statement
|> IME is simply untrue. And ME involves supporting Standard C99 and
|> C++ libraries on about a dozen popular architectures, with about
|> as many floating-point processors.
That is a falsehood. As you have done here, you quoted me out of
context, completely changing the meaning of what I said to something
that I did not say and was false, and I was not given a chance to
respond by the chairman. I wondered whether to object on a point
of order, but that seemed excessive.
|> > or exceptions (as it specifies them)
|>
|> I guess the parenthetic quibble gives you some wiggle room, but
|> I still have trouble with it. Our fenv.h tests seem to indicate
|> that the popular architectures *do* meet the requirements of
|> IEEE 754 in this area.
|>
|> > in some of
|> > their modes (often their defaults),
|>
|> Another potential quibble, which I still have trouble believing,
|> from direct experience.
For heaven's sake! Quoting individual phrases (not even clauses!) out
of context is a low political trick. I have requoted the full sentence
in another posting, and shall not do so here.
My statement was, however, based on detailed investigations of four
Unices on 4 totally separate architectures, and brief ones on a fair
number of others. In particular, it is true for at least AIX, IRIX
Solaris and Linux/gcc.
|> > Thirdly, only Java has attempted to include it as part of its
|> > specification,
|> > and Kahan has written a diatribe against Java.
|>
|> IIRC, the diatribe did *not* complain that Java slavishly followed
|> IEEE 754 to its detriment. There was this little thing about favoring
|> Sun architecture over Intel, then patching things up in a half-assed
|> manner...
Yes, and there was a much larger thing about how it was essential to
provide both the flags and the values in order to get reliable exception
handling.
|> > C99 has some support, but
|> > words fail me when describing how unusable it is.
|>
|> The detailed words always seem to fail you, but you're consistently
|> quick with the poisonous adjectives.
As you know, that is another falsehood. I wrote a great many detailed
descriptions of the problems for the SC22WG14 reflector, often including
solutions, and some of them were raised by the BSI as official National
Body comments. All were either ignored or responded to entirely by
ad hominem attacks, as you are doing here.
I have a fair number of very detailed documents describing the issues,
and often solutions to the problems, which have been widely circulated.
I posted one to the SC22WG21 list, which you managed to get ignored
without consideration at a meeting at which I was not present, apparently
by claiming that it was false. However, I should point out that I gave
an independent reference that my statements were correct (Python).
[ To anyone else: please Email me if you want copies, and tell me
what aspects you are interested in. I don't guarantee to be able to
find everything! ]
|> In fact, C99 has two extensive
|> annexes (F and G) describing in exquisite detail how IEEE 754 and
|> Standard C should play together.
Well, no, and that was one of the reasons that the BSI voted "no"
to C99 and many customers have explicitly specified C90. As I gave
enough reasons in a previous response in this thread, I shan't repeat
them.
|> My current company Dinkumware, Ltd.
|> has endeavored to conform completely to those annexes, and aside
|> from a few aggressively perverse tests in Tydeman's floating-point
|> suite we do just fine. While I don't always agree with some of the
|> more creative choices made in those annexes (particularly G), I had
|> little trouble following their guidance. Hence, I have to disagree
|> that C99 support for IEEE 754 is unusable, either to us or our
|> customers.
There is a difference between "unimplementable" and "unusable". I
never said that Annex F or G were unimplementable on systems with
hardware that supports IEEE 754 in at least one mode. They are
unusable for real applications that need robustness, portability
and efficiency (and, to some extent, any of those).
Enough, already. I am not going to respond to your attacks further.
If you ask me to justify what I have actually said, I may respond, but
that is unlikely if you misquote to the above extent.
Regards,
Nick Maclaren.
Can you post it, or a link to it if it's too large? I think a lot of
comp.lang.c folks would be interested (not sure about
comp.lang.fortran).
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
> ...
> |> I called you on this in Berlin and you temporized. The statement
> |> IME is simply untrue. And ME involves supporting Standard C99 and
> |> C++ libraries on about a dozen popular architectures, with about
> |> as many floating-point processors.
>
> That is a falsehood.
That's one.
> ...
> As you have done here, you quoted me out of
> context, completely changing the meaning of what I said to something
> that I did not say and was false, and I was not given a chance to
> respond by the chairman. I wondered whether to object on a point
> of order, but that seemed excessive.
You're responding now, by questioning my integrity and motives.
> ...
> |> Another potential quibble, which I still have trouble believing,
> |> from direct experience.
>
> For heaven's sake! Quoting individual phrases (not even clauses!) out
> of context is a low political trick.
That's two.
> ...
> |> > C99 has some support,
> but
> |> > words fail me when describing how unusable it is.
> |>
> |> The detailed words always seem to fail you, but you're consistently
> |> quick with the poisonous adjectives.
>
> As you know, that is another falsehood.
That's three strikes, and you're completely out of line. It's one
thing to accuse me of stating falsehoods, and of various other
tricks; it's quite another to say that I deliberately made a false
statement.
> ...
> Enough, already. I am not going to respond to your attacks further.
> If you ask me to justify what I have actually said, I may respond, but
> that is unlikely if you misquote to the above extent.
Not to worry. The conversation is over.
Indeed!
> My experience is that Bill Plauger is understating the case - the
> IEEE 754
> handling of zeroes (and hence infinities) causes a huge number of errors
> that would not have occurred in older arithmetics. People REALLY don't
> expect the "gotchas" it introduces - and C99 makes that a lot worse.
I really don't have much experience here.
> |> The link just above is not comprehensible outside of a niche of
> |> a niche. :-)
>
> Very true. In a related niche, I have a simple test that shows up the
> chaos with C99's complex division and overflow. It really is VERY
> nasty :-(
I'd be curious to see it. The stuff I was talking about doesn't
use the C99 complex math library at all, just the underlying
gcc support for IEEE 754.
-- David
Mine doesn't, either. Yes, it was designed with malice aforethought,
but I was checking whether the situation was really as bad as a look
at the code indicated it was. It shows what happens if you use the
'native' complex divide, the example in Annex G, or calculate the real
and imaginary parts separately. The results on several systems are
along the lines of:
0.00e+00 (1.000,1.000) (1.000,1.000) (1.000,1.000)
1.00e+307 (1.089,0.891) (1.089,0.891) (1.089,0.891)
2.00e+307 (1.154,0.769) (1.154,0.769) (1.154,0.769)
3.00e+307 (1.193,0.642) (1.193,0.642) (1.193,0.642)
4.00e+307 (1.207,0.517) (1.207,0.517) (1.207,0.517)
5.00e+307 (1.200,0.400) (1.200,0.400) (1.200,0.400)
6.00e+307 (1.176,0.294) (1.176,0.294) (1.176,0.294)
7.00e+307 (1.141,0.201) (inf,nan) (1.141,0.201)
8.00e+307 (inf,nan) (inf,nan) (inf,0.122)
9.00e+307 (nan,nan) (inf,nan) (nan,0.000)
1.00e+308 (nan,nan) (inf,nan) (nan,0.000)
1.10e+308 (nan,nan) (inf,nan) (nan,-0.000)
1.20e+308 (nan,nan) (inf,nan) (nan,-0.000)
1.30e+308 (0.000,-0.000) (inf,nan) (0.000,-0.000)
1.40e+308 (0.000,-0.000) (inf,nan) (0.000,-0.000)
1.50e+308 (0.000,-0.000) (inf,nan) (0.000,-0.000)
1.60e+308 (0.000,-0.000) (inf,nan) (0.000,-0.000)
1.70e+308 (0.000,-0.000) (nan,nan) (0.000,-0.000)
inf (nan,nan) (nan,nan) (0.000,-0.000)
inf (nan,nan) (nan,nan) (0.000,-0.000)
Note that the result should be decreasing, but becomes infinite before
it drops to zero. The example code does rather better, but the rule
that (inf,nan) is an infinity causes significant trouble - as I knew it
would, and was investigating. I don't know how much better is possible,
because this sort of issue is very nasty.
Regards,
Nick Maclaren.
#pragma STDC CX_LIMITED_RANGE OFF
#pragma STDC FP_CONTRACT OFF
#include <math.h>
#include <stdio.h>
#include <complex.h>
#ifndef TRUST_C99
#define creal(x) ((double)(x))
#define cimag(x) ((double)(-I*(x)))
#define INFINITY (1.0/0.0)
#define isfinite(x) (! isinf(x) && ! isnan(x))
extern double fmax(double,double);
#endif
/* Scaled complex division in the style of the C99 Annex G example:
   scale by logb/scalbn first so that c*c + d*d does not overflow,
   then patch up NaN results arising from the inf/0-type cases. */
double complex cdivd(double complex z, double complex w)
{
double a, b, c, d, logbw, denom, x, y;
int ilogbw = 0;
a = creal(z); b = cimag(z);
c = creal(w); d = cimag(w);
logbw = logb(fmax(fabs(c), fabs(d)));
if (isfinite(logbw)) {
ilogbw = (int)logbw;
c = scalbn(c, -ilogbw); d = scalbn(d, -ilogbw);
}
denom = c * c + d * d;
x = scalbn((a * c + b * d) / denom, -ilogbw);
y = scalbn((b * c - a * d) / denom, -ilogbw);
if (isnan(x) && isnan(y)) {
if ((denom == 0.0) &&
(!isnan(a) || !isnan(b))) {
x = copysign(INFINITY, c) * a;
y = copysign(INFINITY, c) * b;
}
else if ((isinf(a) || isinf(b)) &&
isfinite(c) && isfinite(d)) {
a = copysign(isinf(a) ? 1.0 : 0.0, a);
b = copysign(isinf(b) ? 1.0 : 0.0, b);
x = INFINITY * ( a * c + b * d );
y = INFINITY * ( b * c - a * d );
}
else if (isinf(logbw) &&
isfinite(a) && isfinite(b)) {
c = copysign(isinf(c) ? 1.0 : 0.0, c);
d = copysign(isinf(d) ? 1.0 : 0.0, d);
x = 0.0 * ( a * c + b * d );
y = 0.0 * ( b * c - a * d );
}
}
return x + I * y;
}
int main() {
int i;
double d, r, z;
double complex c1, c2;
for (i = 0; i < 20; ++i) {
d = i*0.1e308;
c1 = (1.0e308+I*1.0e308)/(1.0e308+I*d);
c2 = cdivd(1.0e308+I*1.0e308,1.0e308+I*d);
if (1.0e308 > d) {
r = d/1.0e308;
z = 1.0e308+d*r;
printf("%.2e (%.3f,%.3f) (%.3f,%.3f) (%.3f,%.3f)\n",d,
creal(c1),cimag(c1),creal(c2),cimag(c2),
(1.0e308+1.0e308*r)/z,(1.0e308-1.0e308*r)/z);
} else {
r = 1.0e308/d;
z = d+1.0e308*r;
printf("%.2e (%.3f,%.3f) (%.3f,%.3f) (%.3f,%.3f)\n",d,
creal(c1),cimag(c1),creal(c2),cimag(c2),
(1.0e308*r+1.0e308)/z,(1.0e308*r-1.0e308)/z);
}
}
return 0;
}
> > This makes no sense, as the outcome of the operation is undefined and
> > should be NaN.
> > max(NaN,0.) = NaN
> Why?
Well, because a NaN *could* be plus infinity, or a number too large to
be represented.
If one wants to *implement* NaNs *at all*, one's _reason_ for doing so
is because one wants computer arithmetic to produce accurate results -
rather than just plugging in the best representable number that fits,
and then giving a result that may not be valid.
So going to the trouble of bothering with NaNs, and then deciding that
treating them pessimistically is just too much bother in a few cases,
vitiates the entire enterprise!
John Savard
And you join the list of people who are willing to state on the
basis of a few minutes thought and no research whatsoever, that the
people behind the IEEE 754 standard are idiots.
-William Hughes
There is at least one good reason for the current standard behavior:
It maintains the maximum amount of information.
I.e. doing the opposite which would be to require max(...,NaN,...) to
always be NaN simply discards everything we know about the representable
numbers in the array, even in the case where the NaN simply means Not
Applicable, i.e. 'skip this value'.
What's needed in this case is a side channel which can provide the fact
that at least one of the inputs was NaN, since this is critical when
the NaN is a result of a series of previous calculations which have
blown up.
OTOH there is an equally good reason for requiring the opposite
behavior, i.e. max(...) is NaN if any input value is NaN: This would
obey 'the principle of least astonishment', i.e. in all other IEEE fp
operations any NaN input will propagate into the output to make sure
that this critical piece of information cannot be hidden.
Terje
--
- <Terje.M...@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
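The two behaviours described above sit side by side naturally in C; a sketch (the names are mine, and max_num only mirrors the spirit of 754R's MaxNum, not its exact specification):

#include <math.h>

/* "Missing value" flavour: a lone NaN is skipped, so information
   about the representable numbers survives. */
double max_num(double a, double b) {
    if (isnan(a)) return b;
    if (isnan(b)) return a;
    return (a < b) ? b : a;
}

/* "Error value" flavour: any NaN poisons the result, so the fact
   that an earlier calculation blew up cannot be hidden. */
double max_err(double a, double b) {
    return isunordered(a, b) ? NAN : ((a < b) ? b : a);
}

Both are symmetric in their arguments, unlike the naive comparison-based macro.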
I think this approach would only serve to confuse the issue.
Clearly, there are two current interpretations of the semantics of NaN:
one is that a NaN means some prior computation yielded a non-representable
and exceptional result, the other is that this data value should be ignored.
These different meanings imply different processing in certain situations,
but not in all: It appears to me that this applies in particular to all
reduction operators, with a correction to be applied wherever the count
of operands appears (e.g., for the average value reduction).
Given current facilities in Fortran, I think the reduction operators should
always obey the "exception" semantics - i.e., if ANY(ISNAN(input_data)) is
true, the result should be a NaN (perhaps define a new one?). My reason is
that the "ignore" semantics can be achieved easily by the user: instead of,
for example, writing
X = MAXVAL(input_data)
she can write
X = MAXVAL(input_data, MASK=.NOT.ISNAN(input_data))
Of course, if you define the operator semantics the other way, you can
get the exception semantics by writing
IF (ANY(ISNAN(input_data))) THEN
X = some_NaN
ELSE
X = MAXVAL(input_data)
ENDIF
so it perhaps boils down to frequency of use and to consistency of user
expectations.
The only other alternative would be to define some particular value - which
could be an IEEE 754 NaN - to signify "value not available", and to modify
the semantics of the reduction operators based on whether this value is
present or not. However, one still needs to define the behaviour of what to
do when _both_ an "exception" NaN _and_ an "ignore" NaN are present in the
input 8-). My choice: the "exception" NaN takes precedence.
Jan
You are wrong on three counts:
max/min are not part of IEEE 754, and are not even in the appendix;
they are proposed as part of IEEE 754R.
He did not state that they were idiots - merely misguided - and you
have no evidence that he has done no research.
I am pretty sure that he knows of my analysis of the matter (and
document on it), where I do explain why he is correct and the IEEE 754R
people are wrong. And, in THIS respect, I believe that I have more
experience than any of the people on that group.
OK?
Regards,
Nick Maclaren.
Sorry, Terje, but you have missed the point. By providing ways to lose
the NaN state quietly, you are rendering them useless as an error value.
50 years of experience shows that requiring programmers to check every
operation that might fail for possible failure simply does not work; if
checking isn't fail-safe, it is pointless.
As my document explains, there are numerous possible meanings of "not a
number", all of which imply different semantics. In particular, allowing
a missing value indication is one of the oldest and best, but is NOT what
IEEE 754 does and is NOT what IEEE 754R is proposing. The arguments
used for this behaviour by IEEE 754R have been copied from C99 and are
both false and known to be false.
Regards,
Nick Maclaren.
Correct. And, as I have pointed out to both C99 and IEEE 754R, but have
been ignored on purely dogmatic grounds, there are NO uses where ONLY the
"missing value" semantics are wanted, IEEE 754 doesn't support them in the
first place, and the max/min operations aren't the most important anyway.
There ARE uses where only the "error value" semantics are wanted.
In statistics, which is one of the claimed uses, missing value semantics
are wanted ONLY for specific reduction operations, and the ranking of
importance is:
Counting (i.e. number of non-NaNs)
Summation
Derivative operations (mean, variance etc.) |
Max/min | In some order
Multiplication |
Regards,
Nick Maclaren.
Those are all reductions.
> Multiplication |
Do you mean scaling? If so, you probably want to cover any functional
transform of the values.
Jan
Precisely.
|> > Multiplication |
|>
|> Do you mean scaling? If so, you probably want to cover any functional
|> transform of the values.
Sorry, I meant product. I.e. reduction by multiplication.
The basic rules with both error and missing values are:
If any operand is erroneous, or the result is mathematically undefined,
the result is erroneous. Missing/0.0 is erroneous, for example.
Otherwise, if it is a reduction, only the non-missing values are used.
If it is not a reduction and any operand is missing, the result is
missing.
In practice, "a reduction" isn't just operations that are reductions, but
specific calls to reduction functions that allow for missing values.
Regards,
Nick Maclaren.
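A minimal C sketch of those rules for a sum reduction, carrying "missing" as an explicit mask since a bare IEEE NaN cannot encode both meanings (the representation and names are mine):

#include <math.h>
#include <stddef.h>

/* Sum reduction under the rules above: a NaN datum means "error"
   and poisons the result; missing values are simply skipped. */
double sum_reduce(const double *x, const int *missing, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        if (missing[i]) continue;    /* reduction skips missing values */
        if (isnan(x[i])) return NAN; /* erroneous operand propagates   */
        s += x[i];
    }
    return s;
}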
Nick!
You always complain about people who quote you selectively! In this case
you have snipped the paragraph where I agree with you, specifically that
dropping NaN information is a _very_ surprising behavior.
I.e. we're in violent agreement, I was just trying to think of a
possible use for the defined (IMHO broken) specification.
Mea culpa. I apologise.
I did actually read what you said, and misunderstood it.
Regards,
Nick Maclaren.
A quibble.
> He did not state that they were idiots - merely misguided
More quibbling. He said that the proposed approach
"vitiates the entire enterprise". This is a lot stronger than
"misguided".
> - and you
> have no evidence that he has done no research.
Since he has no idea of why NAN's are used
and is incorrect about the reasons behind
the proposed behaviour of max(NAN,0.), I concluded he
has done no research.
>
> I am pretty sure that he knows of my analysis of the matter (and
> document on it),
I see no evidence of this other than he agrees in conclusion.
How do you account for his lack of knowledge if he is familiar
with your analysis?
>where I do explain why he is correct and the IEEE 754R
> people are wrong. And, in THIS respect, I believe that I have more
> experience than any of the people on that group.
>
> OK?
Beside the point. Your opinion is obviously informed and I have
not claimed otherwise. However, I strongly suspect that the
opinions of other IEEE 754R people are also informed.
-William Hughes
>> > Why?
One can see the problems that something can cause almost
immediately, and without doing any real "research".
I would say that the ones who produced that standard did
not fully examine the adverse consequences that their
actions could cause, even when they were pointed out to
them.
This applies to other standards as well; those who think
they can provide "what is necessary or appropriate" are
only fooling themselves, and often harming others.
--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hru...@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558
Yes, but one may not be able to see problems that a change
would cause without doing research. In this case it is clear that
setting max(NAN,0.)=0. will cause problems. It is less clear why
setting max(NAN,0.)=NAN will cause problems. It is not at all
clear which solution should be preferred. John Savard seems to
have based his conclusion only on the fact that setting
max(NAN,0.)=0. will cause problems.
> I would say that the ones who produced that standard did
> not fully examine the adverse consequences that their
> actions could cause, even when they were pointed out to
> them.
>
Perhaps, and perhaps not. Since we have little information
about the consequences of setting max(NAN,0.)=0 (and John
Savard appears to have none), how do you justify this statement?
> This applies to other standards as well; those who think
> they can provide "what is necessary or appropriate" are
> only fooling themselves, and often harming others.
>
I agree, but you have set up a straw man. Yes, those
who think they can provide "what is necessary or appropriate"
are only fooling themselves, but no, I don't automatically put
people who create standards in this category.
Normally, there is no "what is necessary or appropriate"
to be provided, tradeoffs and compromises must be made.
Despite this standards are very useful. Problems arise when
people believe that no compromises are necessary.
They then note of some standard that it
causes some specific problem and conclude that
the people who created the standard were incompetent because
they did not notice this.
This is very different from knowing the background of the
compromise and deciding that the wrong choice was made
(e.g. your complaint about speed being unduly emphasized at
the expense of accuracy)
-William Hughes
Note that the S and R languages used in statistics have both NA and NaN.
NA for data that should be ignored, usually data that didn't exist in
the input data set, such as someone not answering a survey question.
As an interpreted language it is fairly easy to do, though I believe
they use an IEEE NaN value with different values in the low order bits.
It might be an interesting feature to add to other languages and/or
hardware.
-- glen
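One plausible implementation route is a NaN payload; a C sketch (the payload value and helper names are illustrative inventions, not R's actual encoding):

#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Build a quiet NaN carrying a chosen payload in its low-order bits.
   The payload value 1954 is purely illustrative. */
static double make_na(void) {
    uint64_t bits = 0x7FF8000000000000ULL | 1954; /* quiet NaN + payload */
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Recognise "our" NA by the payload bits below the quiet bit. */
static int is_na(double d) {
    uint64_t bits;
    if (!isnan(d)) return 0;
    memcpy(&bits, &d, sizeof bits);
    return (bits & 0x0007FFFFFFFFFFFFULL) == 1954;
}

int main(void) {
    double na = make_na();
    printf("isnan=%d is_na=%d is_na(plain NaN)=%d\n",
           isnan(na) != 0, is_na(na), is_na(nan("")));
    return 0;
}

One caveat: ordinary arithmetic is not guaranteed to preserve NaN payloads, which is presumably why an interpreted language can police the distinction more easily than hardware can.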
> As my document explains, there are numerous possible meanings of "not a
> number", all of which imply different semantics. ...
Which document? I'd like to read it.
- Bob
Thank you. This is a very plausible explanation. Is there an
appropriate venue to express our concerns with this behavior? The
relevant people may not come across these posts. I gather that 754R is
not yet finished, so there may still be an opportunity to influence the
standard.
-Norbert
OK, no problem.
OTOH, having a standard which requires silent removal of NaNs _is_ a
problem. :-(
As I understand Nick's point, the problem is the conflation of two
meanings for NaN, so it wouldn't be at all surprising to me if there
were *no* definite right answer for max(NaN,0).
Now, I probably have less experience and knowledge in this area than
anyone who has so far contributed to this thread, but if I may be
indulged a little, what is wrong with...
max(QNaN,0) = 0
max(SNaN,0) = SNaN
I quite agree. C99 Annex F - just say "no".
Regards,
Nick Maclaren.
There IS a definite right answer, using the meaning of NaN that is implied
by IEEE 754, and it is NaN. To get the other answer, you need a meaning
of NaNs that is not currently in IEEE 754.
|> Now, I probably have less experience and knowledge in this area than
|> anyone who has so far contributed to this thread, but if I may be
|> indulged a little, what is wrong with...
|>
|> max(QNaN,0) = 0
|> max(SNaN,0) = SNaN
BAD idea. Sorry. Firstly, IEEE 754 requires max(SNaN,0) to raise the
invalid exception, secondly, that would imply that QNaN+0.0 = 0.0 and,
thirdly, the only languages that 'support' IEEE 754 use only QNaNs.
Regards,
Nick Maclaren.
Had the inspiration when doing data bases a while back, that as well as
null, void is also needed.
null=unknown quantity
void=no quantity
max(null,0)=null
max(void,0)=0
NaN appears like a null so max(nan,0)=nan ;-) curry shop ahoy!!
NaD would be Not a Datum
cheers.
jacko
754r, by the way, does not define any operations called max and min. It
has MinNum, MaxNum, etc. How languages expose those functions to users is
up to the language. Personally I would not call them 'min' or 'max'...
mfc
That would be a technically adequate political compromise if it had both
forms. It would still be a technical mistake, in context, but would at
least not be actively harmful.
I have never heard of one good reason that it does not do that.
Regards,
Nick Maclaren.
> There is at least one good reason for the current standard behavior:
>
> It maintains the maximum amount of information.
>...
>
> OTOH there is an equally good reason for requiring the opposite
> behavior, i.e. max(...) is NaN if any input value is NaN:
This is why 754R specifies two separate functions, max() and maxnum()
(and similarly for min(), maxmag() and minmag()).
Michel.
For an extra complication (for comp.lang.fortran), what should
MAX1(NaN,0) return?
Note that C, or at least C89, doesn't have max and min, but they are
commonly done using preprocessor macros. They are then dependent on
the result of the > or >= operator, and likely not symmetric.
#define MAX(a,b) (((a) > (b)) ? (a) : (b))
I don't expect MAX(NaN,0.) to be the same as MAX(0.,NaN)
-- glen
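A quick C check of that asymmetry (an illustrative sketch):

#include <math.h>
#include <stdio.h>

#define MAX(a,b) (((a) > (b)) ? (a) : (b))

int main(void) {
    printf("%g\n", MAX(NAN, 0.0)); /* NaN > 0 is false -> 0   */
    printf("%g\n", MAX(0.0, NAN)); /* 0 > NaN is false -> nan */
    return 0;
}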
> For an extra complication (for comp.lang.fortran), what should
> MAX1(NaN,0) return?
That's easy. It is illegal code because MAX1 requires default real
arguments, but 0 is integer. Presumably that's not what you actually
meant, but that's my answer to the literal question. :-)
I'm not going to get involved in the serious discussion in this thread.
I do have an opinion on the matter, but I think I'll avoid expressing
it.
--
Richard Maine | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle | -- Mark Twain
Well, if you want the literal question, I don't know that there
is a constant (PARAMETER) NaN, either.
In another post I went back to change to 0. before posting,
but I missed on this one. (And I almost wrote AMAX0, too.)
-- glen
Both of you should have participated in the 754R discussions, which are
winding down now -- but the official IEEE balloting will start later
this year, and perhaps you should join the IEEE SA (Standards
Association) if you have not done so already (deadline Sept 28), so you
can comment when the draft is published for review in a month or so.
The standard deliberately avoids assigning meaning to NaN payloads, but
from various discussions about the distinction between "missing data"
and "invalid data" it seems to me that defining a particular NaNcode
(other than the machine default) for "missing" would have been quite
valuable. I'm just afraid of bringing it up so late in the game...
Interestingly IBM's "High Level Assembler" does support defining two
kinds of quiet NaN in floating-point constants: (NAN) implies machine
default (double 0x7ff80000...) and (QNAN) implies 0x7ffc0000... but I
don't know of any software that exploits this.
Michel.
I do not know. In what way is returning NaN for max(NaN, number) slower
than returning "number"? I would state (in some pseudo assembly):
# max(a0, a1), input registers a0 and a1, output in a2
compare a0, a1
if not comparable, a2 = some NaN, return
if a0 < a1, a2 = a1, return
a2 = a0, return
returning "number" would be a bit slower in my opinion. And that is much
more the case if you are considering more than two arguments.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
>I do not know. In what way is returning NaN for max(NaN, number) slower
>than returning "number"?
You missed the part where I said that current compilers return
something based on the order of the arguments. Consistently returning
either the NaN or the number would be slower either way.
-- greg
I do not see that the compiler has any way of knowing
that something is NaN. It is only the ALU which has
this information at run time.
What Greg means is that current compilers implement max(a, b) as
if(a > b) then return a else return b.
If one of the two is a NaN and the other is a number, this means
that the question on whether NaN is returned or the number depends
on the order of the arguments. So if you want to ensure that the
function arguments commute you have to spend time, and that is the
case whether you want to return a NaN or just the other.
>>You missed the part where I said that current compilers return
>>something based on the order of the arguments. Returning either NaN or
>>number would be equally slower.
>
>I do not see that the compiler has any way of knowing
>that something is NaN. It is only the ALU which has
>this information at run time.
Yes. Perhaps you would understand if I clarified to "the current code
emitted by compilers returns something based on the order of the
arguments..."
I'm breaking a many-year taboo to respond to your post, Herman!
-- greg
How about:
if(a>b) then return a; else if (b>a) then return b; else return a;
It could be done as a #define in C or a statement function in Fortran.
Also, how about:
if(a<=b) return b; else return a;
-- glen
I don't see what advantages that has. Certainly it doesn't solve the
"problem" that the order of the arguments matters in cases of
non-comparable arguments (such as when one of the arguments is a NaN).
To solve that, you need something more like:
if(a>=b) then return a; else if (b>a) then return b; else return NaN;
- Brooks
--
The "bmoses-nospam" address is valid; no unmunging needed.
> To solve that, you need something more like:
> if(a>=b) then return a; else if (b>a) then return b; else return NaN;
One difference is that yours returns a if a==b, and mine returns b,
but that should be the same.
If neither a>b nor b>a is true, mine returns a. Assuming you don't
have a variable or parameter named NaN, how can you find out which
variable is NaN?
-- glen
I suppose one could do:
if(a>=b) then return a; else if (b>a) then return b;
else if (a==a) then return b; else return a;
If one has two distinct arguments for which arg==arg is false, then the
order of their specification still makes a difference, but otherwise
this should work.
Can you reduce that to:
if(a>=b) then return a; else if (b>a .OR. a==a) then return b;
else return a;
Two consecutive IFs that both return b seem redundant.
(Assuming the compiler doesn't optimize the a==a out.)
I originally thought it could work without an a==a test,
but I suppose not.
-- glen
max(a,b) = merge(a,b, a>b .or. isnan(a))
If A is a NaN, this returns A (hence, if both are NaNs, the
NaN returned is A). Otherwise, if B is a NaN, this returns B.
If neither is a NaN, it returns a value equal to the max of the
two.
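The same selection reads naturally in C too (a sketch; the function
name is made up, and isnan() here is C99's from <math.h>):

#include <math.h>

double max_giles(double a, double b)   /* hypothetical name */
{
    /* take a if it compares greater or is a NaN; otherwise b */
    return (a > b || isnan(a)) ? a : b;
}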
--
J. Giles
"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare
Certainly. However, the "a==a" test is rarely needed, and I figured
that in something like this it was better to avoid unnecessary tests
even at the expense of slightly more complicated code.
> (Assuming the compiler doesn't optimize the a==a out.)
> I originally thought it could work without an a==a test,
> but I suppose not.
I don't think it's possible -- there's a symmetry argument involved, in
the sense that one has to do something to break the symmetry between
the arguments, and a process of elimination reduces this to a requirement
to test
the properties of at least one argument independently, when they are
non-orderable and one wants symmetric results.
Arguably, what one is doing is imposing a (within-the-function) ordering
between NaN and real numbers, and then fixing up the greater/less-than
tests to respect that ordering. That is, one is implementing a standard
simple "if a is 'greater than' b return a else return b" MAX() function,
but using the more complicated logic "x is 'greater than' y if the
expression x>y yields 'true', or if x is NaN and y is non-NaN".
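As a C sketch of that fixed-up ordering (invented names; isnan() is
C99's):

#include <math.h>

/* x is 'greater than' y if x>y, or if x is NaN and y is non-NaN */
static int greater(double x, double y)
{
    return (x > y) || (isnan(x) && !isnan(y));
}

static double max_nan_high(double a, double b)
{
    return greater(a, b) ? a : b;
}

With this ordering both max_nan_high(NaN,0.) and max_nan_high(0.,NaN)
yield the NaN, so the result is symmetric in the arguments.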
What if a is a NaN and b not, and what if it is the other way around?
In both cases a is returned. Not very different from what I wrote, where
in both cases b is returned. Changing to >= makes no difference.
To return a NaN when one of the operands is NaN you need something like:
if(uncomparable(a, b)) return NaN;
else if(a > b) return a; else return b;
To return the non-NaN when there is one you need something like:
if(uncomparable(a, b)) if(isnan(a)) return b; else return a;
else if(a > b) return a; else return b;
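C99 spells the uncomparable() test isunordered(), in <math.h>, so both
variants can be written directly (a sketch; the function names are
invented):

#include <math.h>

double max_propagate(double a, double b)   /* NaN in -> NaN out */
{
    if (isunordered(a, b)) return NAN;
    return a > b ? a : b;
}

double max_skip_nan(double a, double b)    /* prefer the non-NaN */
{
    if (isunordered(a, b)) return isnan(a) ? b : a;
    return a > b ? a : b;
}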
>(Assuming the compiler doesn't optimize the a==a out.)
Compilers do not optimize a==a out when trying to do IEEE.
Most compilers have command-line switches to determine whether or not
they're trying to strictly do IEEE. Or how strictly.
-- g
So what's wrong with doing this on integers instead? (I.e. looking
directly at the IEEE bit patterns.)
If we're really worried about either getting rid of or preserving NaNs,
then it would seem possible that doing a manual test might be equally
fast or faster than a series of fp compares?
The bitwise encodings of NaNs are a form of infinity, right?
I guess one possible problem is -NaN as an input for max() and +NaN for
min()? :-(
Anyway, the proper solution is an explicit comparison operator which can
return all 6 possible result:
LESS_THAN, GREATER_THAN, EQUAL
A_NAN, B_NAN, BOTH_NAN
If we require the three first values to correspond to (1, 2, 3) and the
UNORDERED results to be A_NAN = 4, B_NAN = 8 and BOTH_NAN = 12, then
max(a,b) becomes
return (fpcompare(a,b) & (GREATER_THAN | A_NAN)) ? a : b;
Similarly, min(a,b) becomes
return (fpcompare(a,b) & (LESS_THAN | A_NAN)) ? a : b;
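A portable C sketch of such an fpcompare() (following the encoding
above; isnan() is C99's, and the signaling/quiet distinction is
ignored):

#include <math.h>

enum {
    LESS_THAN = 1, GREATER_THAN = 2, EQUAL = 3,  /* EQUAL = LESS|GREATER */
    A_NAN = 4, B_NAN = 8, BOTH_NAN = 12
};

int fpcompare(double a, double b)
{
    int r = 0;
    if (isnan(a)) r |= A_NAN;
    if (isnan(b)) r |= B_NAN;
    if (r) return r;
    if (a < b) return LESS_THAN;
    if (a > b) return GREATER_THAN;
    return EQUAL;   /* both bits set, so equal operands match either mask */
}

double fpmax(double a, double b)   /* the max() given above, renamed */
{
    return (fpcompare(a, b) & (GREATER_THAN | A_NAN)) ? a : b;
}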
Terje
--
- <Terje.M...@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
On x87 the best available fp comparison opcode cannot tell us which
operand was NaN, so the fastest code seems to be something like this:
inline double max(double a, double b)
{
__asm {
fld [b]
fld [a]
fucomi st,st(1) // a <=> b ? (PF is set if unordered)
jpo both_ok // Jump if no NaN values!
// At least one of the numbers was a NaN: Which?
fucomi st,st(0) // a <=> a ?
jpe return_a // a is NaN, so return it!
// Otherwise b was the NaN; CF=0 here, so (a) is kept below
both_ok:
fcmovb st,st(1) // Load (b) over (a) if (a < b)
return_a:
fstp st(1) // Store the result over (b) and pop
}
} // Leave the larger value on the fp stack
There is one branch even in the normal no NaN case, but this should be
correctly predicted in any critical, often-executed code path.
For an out-of-line function I would have jumped past the normal return
point to handle the unordered case.
Well, it isn't portable in any of the popular languages.
> If we're really worried about either getting rid of or preserving NaNs,
> then it would seem possible that doing a manual test might be equally
> fast or faster than a series of fp compares?
For processors with a separate floating point unit (separate registers)
it is probably slower, because of the cost of moving values between the
integer and floating-point registers as bit patterns. If you assume that
integer and floating point values are the same size and endianness, it
isn't so hard to do.
-- glen
One pair of distinct operands which test
as equal is +0 and -0. max(-0, -0) should return -0
but all other combinations max(+-0, +-0) should return +0.
--
Expel the plutarchs from Washington, DC this November.
Put us on a sensible orbit amongst the solar powers.
pmon...@cwi.nl Microsoft Research and CWI Home: Bellevue, WA
I agree that 'SHOULD' is the proper term here. It would indeed be 'The
Right Thing' to return the positive zero if there is one.
However, by definition, the distinction between +/-0 is disregarded by
the fp comparison opcode(s) I know about, which would make such behavior
compulsory ('MUST') rather expensive. :-(
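Done in software rather than in the comparison opcode, the fix-up is
straightforward, if not free (a sketch; the name is invented and
signbit() is C99's):

#include <math.h>

double max_zero_aware(double a, double b)
{
    if (a == b)                     /* also true for +0 == -0 */
        return signbit(a) ? b : a;  /* prefer the non-negative zero */
    return a > b ? a : b;
}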
> Peter L. Montgomery wrote:
> > One pair of distinct operands which test
> > as equal is +0 and -0. max(-0, -0) should return -0
> > but all other combinations max(+-0, +-0) should return +0.
>
> I agree that 'SHOULD' is the proper term here. It would indeed be 'The
> Right Thing' to return the positive zero if there is one.
>
> However, by definition, the distinction between +/-0 is disregarded by
> the fp comparison opcode(s) I know about, which would make such behavior
> compulsory ('MUST') rather expensive. :-(
I'll abstain on the question of what should happen, but I was going to
make a similar comment about "should" being an appropriate word choice.
In particular, it is not a "shall" - at least not in the Fortran
standard. There is probably even a school of interpretation which would
argue that it is a "shall not". I won't go that far, but it certainly is
not "shall". ("Shall" is the word for a requirement in standard-speak;
"must" is reserved to mean an inevitable condition).
> > >>I do not know. In what way is returning NaN for max(NaN, number) slower
> > >>than returning "number"?
> > >You missed the part where I said that current compilers return
> > >something based on the order of the arguments. Returning either NaN or
> > >number would be equally slower.
> > I do not see that the compiler has any way of knowing
> > that something is NaN. It is only the ALU which has
> > this information at run time.
>What Greg means is that current compilers implement max(a, b) as
> if(a > b) then return a else return b.
>If one of the two is a NaN and the other is a number, this means
>that the question on whether NaN is returned or the number depends
>on the order of the arguments. So if you want to ensure that the
>function arguments commute you have to spend time, and that is the
>case whether you want to return a NaN or just the other.
Most hardware has either a max operation or an absolute
value operation. Doing it as .5*(a+b + |a-b|) is likely
to be faster, as it does not involve an if operation. It
is also unlikely that max will be the last operation of
the current block, so it will not be a matter of a return.
And of course if the hardware has a max operator, do it by
using that operator. So max(a,b) should be an inlined
function, not a call, and not using a conditional.
1) It can overflow if both a and b are large, when
MAX should not overflow.
2) It can give the wrong answer in many cases for floating
point values, such as MAX(-1e20,1) where a+b and a-b
are likely both -1e20, so the result is 0 rather than 1
(see the sketch after this list). It will give the wrong
answer for fixed point values if either a+b or a-b overflows.
3) Newer hardware has a conditional load operation to
do conditional operations without a delay causing
branch.
4) C89 doesn't even have a MAX function (C99 adds fmax(),
which returns the non-NaN operand when just one argument
is a NaN); it is normally done with the preprocessor
and the conditional operator. Compilers may or may not
figure out that MAX was implied.
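A short demonstration of points 1) and 2) (hypothetical function name;
fabs() is from <math.h>):

#include <math.h>
#include <stdio.h>

static double max_avg(double a, double b)   /* the .5*(a+b + |a-b|) trick */
{
    return 0.5 * (a + b + fabs(a - b));
}

int main(void)
{
    /* 2): the 1 is absorbed; a+b and a-b both round to -1e20 */
    printf("%g\n", max_avg(-1e20, 1.0));     /* prints 0, not 1 */
    /* 1): a+b overflows although the true max is representable */
    printf("%g\n", max_avg(1e308, 1e308));   /* prints inf */
    return 0;
}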
-- glen
5) most CPUs currently sold (any which support SSE2 or IA-64 instruction
sets, among others) have a max instruction, and compilers try to
recognize code sequences which can use it. Your suggestion could not be
so recognized, in part because the simple instruction would not have the
over/underflow behavior you have introduced.
(snip)
> 5) most CPUs currently sold (any which support SSE2 or IA-64 instruction
> sets, among others) have a max instruction, and compilers try to
> recognize code sequences which can use it. Your suggestion could not be
> so recognized, in part because the simple instruction would not have the
> over/underflow behavior you have introduced.
According to one article, most CPUs sold are Intel 8051s. Among 32-bit
CPUs, it is ARM. I don't believe the 8051 has a MAX, and if it did, it
would be 8-bit fixed point.
I don't know how good C compilers are at recognizing MAX, especially
for more complicated expressions.
#define MAX(a,b) ((a>b)?(a):(b))
x=MAX(b*b,4*a*c);
comes out as:
x=((b*b>4*a*c)?(b*b):(4*a*c));
even worse with more complicated expressions, which the macro also
evaluates twice.
-- glen
>5) most CPUs currently sold (any which support SSE2 or IA-64 instruction
>sets, among others) have a max instruction,
If that instruction avoids this NaN problem, then it's worth noting
that compilers sure don't seem to be using it.
-- g
;; These versions of the min/max patterns are intentionally ignorant of
;; their behavior wrt -0.0 and NaN (via the commutative operand mark).
;; Since both the tree-level MAX_EXPR and the rtl-level SMAX operator
;; are undefined in this condition, we're certain this is correct.
and, where it designates the same instructions as "ieee" and
non-commutative:
;; min = (op1 < op2 ? op1 : op2)
;; max = (!(op1 < op2) ? op1 : op2)
;; Their operands are not commutative, and thus they may be used in the
;; presence of -0.0 and NaN.
Does this mean that gcc/gfortran intends to treat it as commutative only
for -ffast-math -msse? Further, that the min operation is easier to
optimize this way than the max?