Avoiding NaN and Inf on floating point division

ardi

unread,

Jan 4, 2014, 6:07:54 AM1/4/14

to

Hi,

Am I right supposing that if a floating point variable x is normal (not denormal/subnormal) it is guaranteed that for any non-NaN and non-Inf variable called y, the result y/x is guaranteed to be non-NaN and non-Inf?

If affirmative, I've two doubts about this. First, how efficient can one expect the isnormal() macro to be? I mean, should one expect it to be much slower than doing an equality comparison to zero (x==0.0) ? Or should the performance be similar?

Second, how could I "emulate" isnormal() on older systems that lack it? For example, if I compile on IRIX 6.2, which AFAIK lacks isnormal(), is there some workaround which would also guarantee me that the division doesn't generate NaN nor Inf?

Also, if the isnormal() macro can be slow, is there any other approach which would also give me the guarantee I'm asking for? Maybe comparing to some standard definition which holds the smallest normal value available for each data type? Are such definitions standardized in some way such that I can expect to find them in some standard header on most OSs/compilers? Would I be safe to test it this way rather than with the isnormal() macro?

Thanks a lot!

ardi

P.S: Yes, I realize a floating point division can result in a meaningless value even if it's non-NaN and non-Inf, because you might be dividing by a very small (yet normal) which should be zero but it isn't because of the math operations performed on it previously. But this is another problem. I'm asking for an scenario where you don't care if the result is meaningless or not, you just need to be sure it isn't NaN nor Inf.

jacob navia

unread,

Jan 4, 2014, 6:53:40 AM1/4/14

to

Le 04/01/2014 12:07, ardi a �crit :

> Am I right supposing that if a floating point variable x is normal (not denormal/subnormal)

it is guaranteed that for any non-NaN and non-Inf variable called y, the
result y/x is guaranteed

to be non-NaN and non-Inf?

No.

#include <stdio.h>
int main(void)
{
double a=1e300,b=1e-300,c;
c=a/b;
printf("%g\n",c);
}

ardi

unread,

Jan 4, 2014, 7:06:12 AM1/4/14

to

On Saturday, January 4, 2014 12:53:40 PM UTC+1, jacob navia wrote:
> Le 04/01/2014 12:07, ardi a ï¿½crit :

Ooops!!! I believe this means I forgot you can also get Inf from overflow... if a number is very big and a division turns it even larger, it can overflow, and then it becomes Inf even if the denominator is a normal value.

This effectively breaks my quest for "healthy divisions". I guess I'm back to my old arbitrary epsilon checking approach (i.e.: check the denominator for fabs(x)>epsilon for deciding whether the division can be performed or not, where epsilon is left as an exercise for the reader ;-)

Thanks,

ardi

Ben Bacarisse

unread,

Jan 4, 2014, 7:28:05 AM1/4/14

to

ardi <ardillas...@gmail.com> writes:

> Am I right supposing that if a floating point variable x is normal
> (not denormal/subnormal) it is guaranteed that for any non-NaN and
> non-Inf variable called y, the result y/x is guaranteed to be non-NaN
> and non-Inf?

No. Assuming what goes by the name of IEEE floating point, you will get
NaN when y == x == 0, and Inf from all sorts of values for x and y
(DBL_MAX/0.5 for example).

An excellent starting point is to search the web for Goldberg's paper
"What Every Computer Scientist Should Know About Floating-Point
Arithmetic". It will pay off the time spent in spades.

> If affirmative, I've two doubts about this. First, how efficient can
> one expect the isnormal() macro to be? I mean, should one expect it to
> be much slower than doing an equality comparison to zero (x==0.0) ? Or
> should the performance be similar?

I'd expect it to be fast. Probably not as fast as a test for zero, but
it can be done by simple bit testing.

However, you say "if affirmative" and the answer to your question is
"no" so maybe all the rest is moot.

> Second, how could I "emulate" isnormal() on older systems that lack
> it? For example, if I compile on IRIX 6.2, which AFAIK lacks
> isnormal(), is there some workaround which would also guarantee me
> that the division doesn't generate NaN nor Inf?

There are lots of ways. For example, IEEE double precision sub-normal
numbers have an absolute value less less than DBL_MIN (defined in
float.h). You can also test normality by looking at the bits. For
example, a sub-normal IEEE number has zero bits in the exponent field
and a non-zero fraction.

> Also, if the isnormal() macro can be slow, is there any other approach
> which would also give me the guarantee I'm asking for? Maybe comparing
> to some standard definition which holds the smallest normal value
> available for each data type?

The guarantee you want is that a division won't generate NaN or +/-Inf?
The simplest method is to do the division and test the result, but maybe
one or more of your systems generates a signal that you want to avoid?
I think you should a bit more about what you are trying to do.

It's generally easy to test if you'll get a NaN from the division of
non-NaN numbers (you only get NaN from 0/0 and the four signed cases of
Inf/Inf), but pre-testing for Inf is harder.

> Are such definitions standardized in
> some way such that I can expect to find them in some standard header
> on most OSs/compilers? Would I be safe to test it this way rather than
> with the isnormal() macro?

Your C library should have float.h and that should define FLT_MIN,
DBL_MIN and LDBL_MIN but I don't think that helps you directly.

<snip>
--
Ben.

Tim Prince

unread,

Jan 4, 2014, 7:46:55 AM1/4/14

to

On 1/4/2014 6:07 AM, ardi wrote:

> Second, how could I "emulate" isnormal() on older systems that lack it? For example, if I compile on IRIX 6.2, which AFAIK lacks isnormal(), is there some workaround which would also guarantee me that the division doesn't generate NaN nor Inf?
>
> Also, if the isnormal() macro can be slow, is there any other approach which would also give me the guarantee I'm asking for? Maybe comparing to some standard definition which holds the smallest normal value available for each data type? Are such definitions standardized in some way such that I can expect to find them in some standard header on most OSs/compilers? Would I be safe to test it this way rather than with the isnormal() macro?
>

Maybe you could simply edit the glibc or OpenBSD implementation into
your working copy of your headers, if you aren't willing to update your
compiler or run-time library.

http://ftp.cc.uoc.gr/mirrors/OpenBSD/src/lib/libc/gen/isnormal.c

Is your compiler so old that it doesn't implement inline functions?
That's the kind of background you need to answer your own question about
speed. Then you may need to use an old-fashioned macro (with its
concerns about double evaluation of expressions).

--
Tim Prince

James Kuyper

unread,

Jan 4, 2014, 11:35:01 AM1/4/14

to

On 01/04/2014 06:07 AM, ardi wrote:
> Hi,
>
> Am I right supposing that if a floating point variable x is normal
> (not denormal/subnormal) it is guaranteed that for any non-NaN and
> non-Inf variable called y, the result y/x is guaranteed to be non-NaN
> and non-Inf?

How could that be true? If the mathematical value of y/x were greater
than DBL_MAX, or smaller than -DBL_MAX, what do you expect the floating
point value of y/x to be? What you're really trying to do is prevent
floating point overflow, and a test for isnormal() is not sufficient.
You must also check whether fabs(x) > fabs(y)/DBL_MAX (assuming that x
and y are both doubles).

As far as the C standard is concerned, the accuracy of floating point
math is entirely implementation-defined, and it explicitly allows the
implementation-provided definition to be "the accuracy is unknown"
(5.2.4.2.2p6). Therefore, a fully conforming implementation of C is
allowed to implement math that is so inaccurate that DBL_MIN/DBL_MAX >
DBL_MAX. In practice, you wouldn't be able to sell such an
implementation to anyone who actually needed to perform floating point
math - but that issue is outside the scope of the standard.

However, if an implementation pre-#defines __STDC_IEC_559__, it is
required to conform to the requirements of Annex F (6.10.8.3p1), which
are based upon but not completely identical to the requirements of IEC
60559:1989, which in turn is essentially equivalent to IEEE 754:1985.
That implies fairly strict requirements on the accuracy; for the most
part, those requirements are as strict as they reasonably could be.

> If affirmative, I've two doubts about this. First, how efficient can
> one expect the isnormal() macro to be? I mean, should one expect it
> to be much slower than doing an equality comparison to zero (x==0.0)
> ? Or should the performance be similar?

The performance is inherently system-specific; for all I know there
might be floating point chips where isnormal() can be implemented by a
single floating point instruction; but at the very worst it shouldn't be
much more complicated than a few mask and shift operations on the bytes
of a copy of the argument.

> Second, how could I "emulate" isnormal() on older systems that lack
> it? For example, if I compile on IRIX 6.2, which AFAIK lacks
> isnormal(), is there some workaround which would also guarantee me
> that the division doesn't generate NaN nor Inf?

Find a precise definition of the floating point format implemented on
that machine (which might not fully conform to IEEE requirements), and
you can then implement isnormal() by performing a few mask and shift
operations on the individual bytes of the argument.

> Also, if the isnormal() macro can be slow, is there any other

> approach which would also give me the guarantee I'm asking for? ..

If you can find a alternative way of implementing the equivalent of
isnormal() that is significantly faster than calling the macro provided
by a given version of the C standard library, then you should NOT use
that alternative; what you should do is drop that version of the C
standard library and replace it with one that's better-implemented.

> ... Maybe

> comparing to some standard definition which holds the smallest normal
> value available for each data type?

Yes, that's what FLT_MIN, DBL_MIN, and LDBL_MIN are for.

> ... Are such definitions standardized

> in some way such that I can expect to find them in some standard

> header on most OSs/compilers? ...

Yes - the standard header is <float.h>.

> ... Would I be safe to test it this way

> rather than with the isnormal() macro?

It could be safe, if you handle correctly the possibility that the value
is a NaN. Keep in mind that all comparisons with a NaN fail, so
x>=DBL_MIN is not the same as !(x<DBL_MIN). If x is a NaN, the first
expression is false, while the second is true.
--
James Kuyper

Tim Prince

unread,

Jan 4, 2014, 12:09:28 PM1/4/14

to

On 1/4/2014 6:07 AM, ardi wrote:

> Am I right supposing that if a floating point variable x is normal (not denormal/subnormal) it is guaranteed that for any non-NaN and non-Inf variable called y, the result y/x is guaranteed to be non-NaN and non-Inf?
>

1/x is well-behaved when x is normal (only possible flag raised is
inexact). That is an important enough consideration to be part of
IEEE754 design, but not guaranteed in C without IEEE754 (the latter
being a reasonable expectation of a good quality platform, but there are
still exceptions). As others pointed out, your goal seems to be well
beyond that.

--
Tim Prince

Keith Thompson

unread,

Jan 4, 2014, 3:48:40 PM1/4/14

to

James Kuyper <james...@verizon.net> writes:
> On 01/04/2014 06:07 AM, ardi wrote:

[...]

>> Also, if the isnormal() macro can be slow, is there any other
>> approach which would also give me the guarantee I'm asking for? ..
>
> If you can find a alternative way of implementing the equivalent of
> isnormal() that is significantly faster than calling the macro provided
> by a given version of the C standard library, then you should NOT use
> that alternative; what you should do is drop that version of the C
> standard library and replace it with one that's better-implemented.

That's not always an option. What you should probably do in
that case is (a) consider carefully whether your faster version
is actually correct, and (b) contact the maintainers of your
implementation.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

glen herrmannsfeldt

unread,

Jan 4, 2014, 5:28:08 PM1/4/14

to

James Kuyper <james...@verizon.net> wrote:
> On 01/04/2014 06:07 AM, ardi wrote:
>> Am I right supposing that if a floating point variable x is normal
>> (not denormal/subnormal) it is guaranteed that for any non-NaN and
>> non-Inf variable called y, the result y/x is guaranteed to be non-NaN
>> and non-Inf?

> How could that be true? If the mathematical value of y/x were greater
> than DBL_MAX, or smaller than -DBL_MAX, what do you expect the floating
> point value of y/x to be? What you're really trying to do is prevent
> floating point overflow, and a test for isnormal() is not sufficient.
> You must also check whether fabs(x) > fabs(y)/DBL_MAX (assuming that x
> and y are both doubles).

> As far as the C standard is concerned, the accuracy of floating point
> math is entirely implementation-defined, and it explicitly allows the
> implementation-provided definition to be "the accuracy is unknown"
> (5.2.4.2.2p6). Therefore, a fully conforming implementation of C is
> allowed to implement math that is so inaccurate that DBL_MIN/DBL_MAX >
> DBL_MAX. In practice, you wouldn't be able to sell such an
> implementation to anyone who actually needed to perform floating point
> math - but that issue is outside the scope of the standard.

Yes, but it seems that it might not be so far off for rounding to allow
fabs(y)/(fabs(y)/DBL_MAX) to overflow, such that your test doesn't
guarantee no overflow.

-- glen

glen herrmannsfeldt

unread,

Jan 4, 2014, 5:32:53 PM1/4/14

to

Tim Prince <tpr...@computer.org> wrote:
> On 1/4/2014 6:07 AM, ardi wrote:

(snip)

> 1/x is well-behaved when x is normal (only possible flag raised is
> inexact). That is an important enough consideration to be part of
> IEEE754 design, but not guaranteed in C without IEEE754 (the latter
> being a reasonable expectation of a good quality platform,
> but there are still exceptions).

I haven't looked at IEEE754 in that much detail, but on many floating
point systems the exponent range is such that the smallest normal
floating point value will overflow on computing 1/x. If the exponent
range is symmetric, there is a factor of the base (2 or 10) to
consider.

> As others pointed out, your goal seems to be well beyond that.

-- glen

Tim Prince

unread,

Jan 4, 2014, 5:41:52 PM1/4/14

to

IEEE754 specifically requires an asymmetric range such that 1/TINY(x)
(Fortran) or 1/FLT_MIN, 1/DBL_MIN, ... don't overflow. At the other
end, of course, there are large numbers whose reciprocal is sub-normal
and will "flush-to-zero" with Intel default compiler options. As far as
I know, CPUs other than Intel(r) Xeon Phi(tm) introduced in the last 3
years support sub-normal numbers with reasonable efficiency (it was
decades to fulfill the promise made when the standard was instituted).

--
Tim Prince

glen herrmannsfeldt

unread,

Jan 4, 2014, 7:49:56 PM1/4/14

to

Tim Prince <tpr...@computer.org> wrote:

(snip, I wrote)

>> I haven't looked at IEEE754 in that much detail, but on many floating
>> point systems the exponent range is such that the smallest normal
>> floating point value will overflow on computing 1/x. If the exponent
>> range is symmetric, there is a factor of the base (2 or 10) to
>> consider.

> IEEE754 specifically requires an asymmetric range such that 1/TINY(x)
> (Fortran) or 1/FLT_MIN, 1/DBL_MIN, ... don't overflow. At the other
> end, of course, there are large numbers whose reciprocal is sub-normal
> and will "flush-to-zero" with Intel default compiler options. As far as
> I know, CPUs other than Intel(r) Xeon Phi(tm) introduced in the last 3
> years support sub-normal numbers with reasonable efficiency (it was
> decades to fulfill the promise made when the standard was instituted).

So that is where the change in bias came from. IEEE754 is pretty similar
to VAX (other than the byte ordering), but the bias is off by one.

If you do it in the obvious way, there is one more value of negative
exponent than positive exponent, and an additional factor of (almost)
the base between the largest and smallest significand.

But it tends to take a lot of extra hardware to do denormals fast.
For people doing floating point in FPGAs, where it is already pretty
inefficient to generate a floating point add/subtract unit.
(The barrel shifter for pre/post normalization is huge, compared
to the actual add/subtract.) Then a lot more logic to get denormals.

Otherwise, denormals give a fraction of additional bit of exponent
range for a large additional cost of logic. Better to add one more
exponent bit instead.

-- glen

James Kuyper

unread,

Jan 4, 2014, 10:15:07 PM1/4/14

to

That is a real problem, though a marginal one. The best solution to that
problem is to allow the overflow to happen, and test for it afterwards,
but the OP seems uninterested in that option.
--
James Kuyper

Malcolm McLean

unread,

Jan 5, 2014, 6:14:40 AM1/5/14

to

On Saturday, January 4, 2014 12:06:12 PM UTC, ardi wrote:
>
> Ooops!!! I believe this means I forgot you can also get Inf from overflow...
> if a number is very big and a division turns it even larger, it can overflow,
> and then it becomes Inf even if the denominator is a normal value.
>
> This effectively breaks my quest for "healthy divisions". I guess I'm back
> to my old arbitrary epsilon checking approach (i.e.: check the denominator
> for fabs(x)>epsilon for deciding whether the division can be performed or
> not, where epsilon is left as an exercise for the reader ;-)
>

What you can do is call the function frexp(). This will give you an exponent.
Then chuck out any numbers outside or a reasonable range. IEEE doubles have
11 bits of exponent so, -1024 to 1023. You're highly unlikely to need anything
bigger or smaller than +/- 100, it's got to be either corrupt data or an
intermediate in a calculation which was unstable and has lost precision.

christ...@cbau.wanadoo.co.uk

unread,

Jan 5, 2014, 4:09:24 PM1/5/14

to

On Saturday, January 4, 2014 4:35:01 PM UTC, James Kuyper wrote:

> The performance is inherently system-specific; for all I know there
> might be floating point chips where isnormal() can be implemented by a
> single floating point instruction; but at the very worst it shouldn't be
> much more complicated than a few mask and shift operations on the bytes
> of a copy of the argument.

For example for double, and IEEE 754 compatible implementation, you can check

(x - x == x - x) && fabs (x) >= DBL_MIN

which is not quite trivial, but not that difficult. (A Not-a-Number x fails both tests; If x is +/- infinity then x - x is NaN and x - x == x - x fails; for zero or denormalised x the test fabs (x) >= DBL_MIN fails. If the compiler optimises x - x to 0 (which is incorrect because of NaN and INF), or optimises expr == expr to 1 (which is incorrect if expr is NaN), then you have bigger problems.

Tim Prince

unread,

Jan 5, 2014, 5:24:53 PM1/5/14

to

This looks attractive from the point of view of not requiring integer
bit field examination of a memory copy of x. I have a concern about how
to prevent compiler optimization from breaking it, as that's the point
of leaving a standard intrinsic to the implementation to know how.