[llvm-dev] Should isnan be optimized out in fast-math mode?


Serge Pavlov via llvm-dev

Sep 8, 2021, 1:03:09 PM
to LLVM Developers, Clang Dev
Hi all,

One of the purposes of `llvm::isnan` was to help preserve the check made by `isnan` if fast-math mode is
specified (https://reviews.llvm.org/D104854). I'd like to describe the reason for that and propose adopting the behavior
implemented in that patch.

The option `-ffast-math` is often used when performance is important, as it allows a compiler to generate faster code.
This option itself is a collection of different optimization techniques, each having its own option. For this topic only the
option `-ffinite-math-only` is of interest. With it the compiler treats floating point numbers as mathematical real numbers,
so transformations like `0 * x -> 0` become valid.
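
To see why `0 * x -> 0` is not valid in general under IEEE 754 semantics, here is a minimal C sketch. The helper name `zero_times` is hypothetical, and the `volatile` qualifier is only there to discourage compile-time folding of the product.

```c
#include <math.h>

/* Hypothetical helper: multiplies x by zero at run time.
   The volatile qualifier discourages the compiler from folding the product. */
double zero_times(double x) {
    volatile double zero = 0.0;
    return zero * x;
}
```

Compiled without fast-math flags, `zero_times(3.0)` is `0.0`, while `zero_times(NAN)` and `zero_times(INFINITY)` are both NaN. Folding the product to `0` is therefore only correct when `x` is known to be finite.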

In the Clang documentation (https://clang.llvm.org/docs/UsersManual.html#cmdoption-ffast-math) this option is described as:

    "Allow floating-point optimizations that assume arguments and results are not NaNs or +-Inf."

The GCC documentation (https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html) is a bit more concrete:

    "Allow optimizations for floating-point arithmetic that assume that arguments and results are not NaNs or +-Infs."

**What is the issue?**

The C standard defines a macro `isnan`, which can be mapped to an intrinsic function provided by the compiler. For both
Clang and GCC it is `__builtin_isnan`. How should this function behave if `-ffinite-math-only` is specified? Should it perform a
real check, or may the compiler assume that it always returns false?

GCC optimizes out `isnan`. It follows from the viewpoint that (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50724#c1):

    "With -ffinite-math-only you are telling that there are no NaNs and thus GCC optimizes isnan (x) to 0."

Such treatment of `-ffinite-math-only` has significant drawbacks. In particular, it makes it impossible to check the validity of
data: a user cannot write

assert(!isnan(x));

because the compiler replaces the actual function call with its expected value. There are many complaints about this in the GCC bug
tracker. The suggested solutions are to use integer operations to make the check, to turn off `-ffinite-math-only` in some parts of the code, or to
ensure that the libc function is called. This clearly demonstrates that `isnan` is useless in this case, yet users need its functionality
and do not have a proper tool to make the required checks. A similar direction was criticized in LLVM as well (https://reviews.llvm.org/D18513#387418).

**Why is imposing restrictions on floating types bad?**

If `-ffinite-math-only` modifies properties of `double` type, several issues arise, for instance:
- What should `std::numeric_limits<double>::has_quiet_NaN()` return?
- What body should this function have if it is used in a program where some functions are compiled with `fast-math` and some without?
- Should the inliner prohibit inlining a function compiled with `fast-math` into a function compiled without it?
- Should `std::isnan(std::numeric_limits<float>::quiet_NaN())` be true?

If the type `double` cannot have a NaN value, it means that `double` and `double` under `-ffinite-math-only` are different types
(https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544641.html). Such an approach could solve these problems, but it is so expensive
that it has hardly any chance of being implemented.

**The solution**

Instead of modifying properties of floating point types, the effect of `-ffinite-math-only` can be expressed as a restriction on
operation usage. In fact, the Clang and GCC documentation already follow this approach. Fast-math flags in LLVM IR are also attributes
of instructions. The only question is whether `isnan` and similar functions are floating-point arithmetic.

From a practical viewpoint, treating non-computational functions as arithmetic does not add any advantage. If code uses `isnan`
extensively (and so could profit from its removal), it is likely not suitable for `-ffinite-math-only`. This interpretation, however, creates
the problems described above. So it is beneficial to consider `isnan` and similar functions as non-arithmetic.

**Why is it safe to leave `isnan`?**

The likely concern with this solution is deviation from GCC behavior. There are several reasons why this is not an issue.

1. -ffinite-math-only is an optimization option. A correct program compiled with -ffinite-math-only and without it should behave
   identically, if conditions for using -ffinite-math-only are fulfilled. So making the check cannot break functionality.
2. `isnan` is implemented by libc, which can map it to a compiler builtin or use its own implementation, depending on
   configuration options. `isnan` implemented in libc obviously always does the real check.
3. ICC and MSVC preserve `isnan` in fast-math mode.

The proposal is not to consider `isnan` and other such functions as arithmetic operations, and not to optimize them out
just because `-ffinite-math-only` is specified. Of course, there are cases when `isnan` may be optimized out; for instance,
`isnan(a + b)` may be folded if `-ffinite-math-only` is in effect, because the result of an arithmetic operation is assumed not to be NaN.
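
The distinction drawn in the proposal can be illustrated with two small C functions. The function names are hypothetical, and the comments describe the behavior the proposal argues for, not what any particular compiler version does.

```c
#include <math.h>

/* The argument is loaded from memory. A load is not an arithmetic
   operation, so under the proposal this check would be preserved even
   with -ffinite-math-only. */
int loaded_is_nan(const double *p) {
    return isnan(*p) != 0;
}

/* The argument is the result of an addition. With -ffinite-math-only the
   compiler may assume the sum is not NaN and fold the check to 0. */
int sum_is_nan(double a, double b) {
    return isnan(a + b) != 0;
}
```

Without fast-math flags both functions perform a real check; the proposal only changes what is permitted for the first one when `-ffinite-math-only` is in effect.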

What are your opinions?

Thanks,
--Serge

Chris Tetreault via llvm-dev

Sep 8, 2021, 2:04:36 PM
to Serge Pavlov, LLVM Developers, cfe...@lists.llvm.org

As a developer (who always reads the docs and generally makes good life choices), if I turn on -ffast-math, I want the compiler to produce the fastest possible floating point math code possible, floating point semantics be darned. Given this viewpoint, my opinion on this topic is that the compiler should do whatever it wants, given the constraints of the documented behavior of NaN. I think the clang docs for -ffast-math are pretty clear on this subject:

 

```

Enable fast-math mode. This option lets the compiler make aggressive, potentially-lossy assumptions about floating-point math. These include:

...

- Operands to floating-point operations are not equal to NaN and Inf ...

```

 

The compiler may assume that operands to floating point operations are not NaN or infinity. So:

 

- What should return `std::numeric_limits<double>::has_quiet_NaN()`? : It should return true if it would have returned true with fast math disabled. Clang is not required to pretend NaN doesn't exist, it's allowed to pretend arguments cannot be NaN if that is convenient.

- What body should have this function if it is used in a program where some functions are compiled with `fast-math` and some without? : This function should be allowed to act as if NaN exists in all cases.

- Should inlining of a function compiled with `fast-math` to a function compiled without it be prohibited in inliner? No. The author of the function that uses fast-math made their choices, and the user of that function should have vetted their dependencies better. In my view, this is no different than if somebody wrote `if (x == y/z) ...`; it's a bug on the user. It's not clang's fault that this code doesn't work as the author wanted.

- Should `std::isnan(std::numeric_limits<float>::quiet_NaN())` be true? : No. quiet_NaN() can return whatever it wants, but the call to std::isnan can be replaced with false since it may assume its argument is not NaN.

 

Of course, this all sounds fine and well, but the reality is that people don't read docs and don't make good life choices. They turn on fast math because they want it to reduce `x * 0` to `0`, and are surprised when their NaN handling code fails. This is unfortunate, but I don't think we should reduce the effectiveness of fast-math because of this human issue. Other flags exist for these users, and when they complain they should be told about them. Really this is an issue of poor developer discipline, and if we really want to solve this, perhaps some sort of "fast math sanitizer" can be created. It can statically analyze code and complain when it sees things like `if (isnan(foo))` not guarded by `__FAST_MATH__` with fast math enabled. Or, maybe the compiler can just issue a warning unconditionally in this case.
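
A guard of the kind mentioned above might look like the following C sketch. GCC and Clang define `__FAST_MATH__` under `-ffast-math` and set `__FINITE_MATH_ONLY__` to 1 under `-ffinite-math-only`; the names `NAN_CHECKS_MAY_BE_FOLDED` and `checked_isnan` are hypothetical, and the fallback assumes `float` is IEEE 754 binary32.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Detect the modes in which the compiler may fold isnan() to 0. */
#if defined(__FAST_MATH__) || (defined(__FINITE_MATH_ONLY__) && __FINITE_MATH_ONLY__)
#define NAN_CHECKS_MAY_BE_FOLDED 1
#else
#define NAN_CHECKS_MAY_BE_FOLDED 0
#endif

/* Hypothetical helper: uses isnan() when it is reliable and an integer
   bit test (assuming IEEE 754 binary32) otherwise. */
int checked_isnan(float x) {
#if NAN_CHECKS_MAY_BE_FOLDED
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (bits & UINT32_C(0x7FFFFFFF)) > UINT32_C(0x7F800000);
#else
    return isnan(x) != 0;
#endif
}
```

The same macro test could instead drive an `#error` or `#warning`, which is closer to the diagnostic idea described in the message.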

 

Thanks,

   Chris Tetreault

 


Joerg Sonnenberger via llvm-dev

Sep 8, 2021, 4:58:41 PM
to llvm...@lists.llvm.org, cfe...@lists.llvm.org
On Wed, Sep 08, 2021 at 06:04:08PM +0000, Chris Tetreault via llvm-dev wrote:
> As a developer (who always reads the docs and generally makes good life
> choices), if I turn on -ffast-math, I want the compiler to produce the
> fastest possible floating point math code possible, floating point
> semantics be darned. Given this viewpoint, my opinion on this topic is
> that the compiler should do whatever it wants, given the constraints of
> the documented behavior of NaN.

There is a huge difference between optimisations that assume NaN is not
present and breaking checks for them. I'm not convinced at all that
constant-folding isnan to false will actually speed up real world code.

Joerg
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Chris Tetreault via llvm-dev

Sep 8, 2021, 6:16:51 PM
to Joerg Sonnenberger, llvm...@lists.llvm.org, cfe...@lists.llvm.org
Maybe not, but it will simplify the implementation of clang, and eliminating even 1 instruction is technically a speedup. If the check is in an assert, then it would ideally be removed in a release build and not matter anyways. If the check is in a branch, then that's a whole branch that can get eliminated as dead code, which may be huge if it's deep in the hot render loop.

But really, "a check for NaN" is an operation, so by the documented behavior of -ffast-math, it should assume that it does not receive NaN as an argument. Absent a compelling use case, I think consistent behavior is a very valuable thing to have. By turning on fast math, as a developer you are saying "My code doesn't have NaN, so feel free to optimize assuming this". To then go ahead and have code that expects checks for NaN to work is kind of silly. If the user wants this behavior, they should pass -funsafe-math-optimizations (or whatever subset of the flags of fast math that they really wanted). After all, what is the point of checking for NaN if "you don't have NaN"?

Really, the problem is that `-ffast-math` is the flag that everybody knows about, so they use it and get upset when it doesn't do what they want. This is a problem of education, not something the compiler should be working around. Now, if we want to issue warnings about misuse of things like isnan or isfinite in the presence of fast math, then that would be great.

Thanks,
Chris Tetreault


Kevin Neal via llvm-dev

Sep 8, 2021, 6:18:54 PM
to llvm...@lists.llvm.org, Clang Dev
Constant folding away isnan() has already been mentioned as something that surprises people when it eliminates useful things like assert(!isnan(x)). This can be worked around by using integer operations, of course. But having isnan() ignore fast math flags will produce instructions that will frequently be faster than the integer operations.

Are fast math flags _required_ to make assumptions? Or simply _allowed_? The difference is key here.


James Y Knight via llvm-dev

Sep 8, 2021, 6:27:54 PM
to Serge Pavlov, LLVM Developers, Clang Dev
I expressed my strong support for this on the previous thread, but I'll just repost the most important piece...

I believe the proposed semantics from the Clang level ought to be:
  The -ffinite-math-only and -fno-signed-zeros options do not impact the ability to accurately load, store, copy, or pass or return such values from general function calls. They also do not impact any of the "non-computational" and "quiet-computational" IEEE-754 operations, which includes classification functions (fpclassify, signbit, isinf/isnan/etc), sign-modification (copysign, fabs, and negation `-(x)`), as well as the totalorder and totalordermag functions. Those correctly handle NaN, Inf, and signed zeros even when the flags are in effect. These flags do affect the behavior of other expressions and math standard-library calls, as well as comparison operations.

I would not expect this to have an actual negative impact on the performance benefit of those flags, since the optimization benefits mainly arise from comparisons and the general computation instructions which are unchanged.

In further support of this position, I note that the previous thread uncovered at least one vendor -- Apple (https://opensource.apple.com/source/Libm/Libm-2026/Source/Intel/math.h.auto.html) -- going out of their way to cause isnan and friends to function properly with -ffast-math enabled.


Chris Lattner via llvm-dev

Sep 8, 2021, 6:51:43 PM
to James Y Knight, LLVM Developers, Clang Dev
On Sep 8, 2021, at 3:27 PM, James Y Knight via llvm-dev <llvm...@lists.llvm.org> wrote:

> I expressed my strong support for this on the previous thread, but I'll just repost the most important piece...
>
> I believe the proposed semantics from the Clang level ought to be:
>   The -ffinite-math-only and -fno-signed-zeros options do not impact the ability to accurately load, store, copy, or pass or return such values from general function calls. They also do not impact any of the "non-computational" and "quiet-computational" IEEE-754 operations, which includes classification functions (fpclassify, signbit, isinf/isnan/etc), sign-modification (copysign, fabs, and negation `-(x)`), as well as the totalorder and totalordermag functions. Those correctly handle NaN, Inf, and signed zeros even when the flags are in effect. These flags do affect the behavior of other expressions and math standard-library calls, as well as comparison operations.

FWIW, I completely agree - these flags are about enabling optimizations that the presence of nans otherwise prohibits.  We shouldn’t take a literal interpretation of an old GCC manual, as that would not be useful.

If we converge on this definition, I think it should be documented.  This is a source of confusion that comes up periodically.

-Chris

Krzysztof Parzyszek via llvm-dev

Sep 9, 2021, 9:30:31 AM
to Chris Lattner, James Y Knight, LLVM Developers, cfe...@lists.llvm.org

If we say that the fast-math flags are “enabling optimizations that the presence of nans otherwise prohibits”, then there is no reason for clang to keep calls to “isnan” around, or to keep checks like “fpclassify(x) == it’s_a_nan” unfolded.  These are exactly the types of optimizations that the presence of NaNs would prohibit.

 

I understand the need for having some NaN-handling preserved in an otherwise finite-math code.  We already have fast-math-related attributes attached to each function in the LLVM IR, so we could introduce a source-level attribute for enabling/disabling these flags per function.

 

 

--

Krzysztof Parzyszek  kpar...@quicinc.com   AI tools development

 


Sanjay Patel via llvm-dev

Sep 9, 2021, 11:02:44 AM
to Krzysztof Parzyszek, LLVM Developers, cfe...@lists.llvm.org
Not sure which way to go, but I agree that we need to improve the docs/user experience either way.
Let's try to iron this out with an example (this is based on https://llvm.org/PR51775 ):

#include <math.h>
#include <stdlib.h>
int main() {
    const double d = strtod("1E+1000000", NULL);
    return d == HUGE_VAL;
}

What should this program return when compiled with -ffinite-math-only? Should this trigger a clang warning?

The proposed documentation text isn't clear to me. Should clang apply "nnan ninf" to the IR call for "strtod"?
"strtod" is not in the enumerated list of functions where we would block fast-math-flags, but it is a standard lib call, so "nnan ninf" would seem to apply...but we also don't want "-ffinite-math-only" to alter the ability to return an INF from a "general function call"?

Serge Pavlov via llvm-dev

Sep 9, 2021, 11:45:37 AM
to Sanjay Patel, LLVM Developers, cfe...@lists.llvm.org
On Thu, Sep 9, 2021 at 10:02 PM Sanjay Patel via llvm-dev <llvm...@lists.llvm.org> wrote:
> Not sure which way to go, but I agree that we need to improve the docs/user experience either way.
> Let's try to iron this out with an example (this is based on https://llvm.org/PR51775 ):
>
> #include <math.h>
> #include <stdlib.h>
> int main() {
>     const double d = strtod("1E+1000000", NULL);
>     return d == HUGE_VAL;
> }
>
> What should this program return when compiled with -ffinite-math-only? Should this trigger a clang warning?

The comparison `d == HUGE_VAL` is an arithmetic operation, so the requirements for using `-ffinite-math-only` are violated. Both compilers are right.

Serge Pavlov via llvm-dev

Sep 9, 2021, 12:10:31 PM
to Krzysztof Parzyszek, LLVM Developers, cfe...@lists.llvm.org
On Thu, Sep 9, 2021 at 8:30 PM Krzysztof Parzyszek via cfe-dev <cfe...@lists.llvm.org> wrote:

> If we say that the fast-math flags are “enabling optimizations that the presence of nans otherwise prohibits”, then there is no reason for clang to keep calls to “isnan” around, or to keep checks like “fpclassify(x) == it’s_a_nan” unfolded.  These are exactly the types of optimizations that the presence of NaNs would prohibit.


The transformation `x * 0 -> 0` is an optimization allowed in the absence of NaNs as arguments, because it produces a program that behaves identically under the given restrictions. Replacing `isnan(x + x)` is also an optimization under the same restrictions. Replacing `isnan(x)` in the general case is not, because we cannot assume that x cannot be a NaN.

 

> I understand the need for having some NaN-handling preserved in an otherwise finite-math code.  We already have fast-math-related attributes attached to each function in the LLVM IR, so we could introduce a source-level attribute for enabling/disabling these flags per function.


GCC allows using `#pragma GCC optimize ("finite-math-only")` or `#pragma GCC optimize ("no-finite-math-only")` to enable or disable the optimization on a per-function basis. Clang could support this pragma, or maybe `#pragma clang fp` could be extended to support similar functionality.
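
A rough sketch of the per-function GCC pragma usage follows. The function name `is_nan_checked` is hypothetical, the pragmas are restricted to GCC here because Clang does not implement `#pragma GCC optimize`, and whether the check actually survives `-ffinite-math-only` on a given GCC version is an assumption that has not been verified.

```c
#include <math.h>

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize ("no-finite-math-only")
#endif

/* Hypothetical helper: intended to keep a real NaN check even when the
   rest of the file is built with -ffinite-math-only (GCC only; not
   verified across GCC versions). */
int is_nan_checked(double x) {
    return isnan(x) != 0;
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options
#endif
```

Using `push_options` and `pop_options` limits the changed setting to the functions between the two pragmas.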

Krzysztof Parzyszek via llvm-dev

Sep 9, 2021, 12:29:32 PM
to Serge Pavlov, LLVM Developers, cfe...@lists.llvm.org

This goes back to what these options actually imply.  The interpretation that I favor is “this code will never see a NaN”, or “the program can assume that no floating point expression will evaluate to a NaN”.  The benefit of that is that it’s intuitively clear.  In that case “isnan(x)” is false, because x cannot be a NaN.  There is no distinction between “isnan(x+x)” and “isnan(x)”.  If the user wants to preserve “isnan(x)”, they can apply some pragma (which clang may actually have already).

 

To be honest, I’m not sure that I understand your argument.  Are you saying that under your interpretation we could optimize “isnan(x+x) -> false”, but not “isnan(x) -> false”?

 

 

--

Krzysztof Parzyszek  kpar...@quicinc.com   AI tools development

 

Serge Pavlov via llvm-dev

Sep 9, 2021, 12:53:25 PM
to Krzysztof Parzyszek, LLVM Developers, cfe...@lists.llvm.org
On Thu, Sep 9, 2021 at 11:29 PM Krzysztof Parzyszek <kpar...@quicinc.com> wrote:

> This goes back to what these options actually imply.  The interpretation that I favor is “this code will never see a NaN”, or “the program can assume that no floating point expression will evaluate to a NaN”.  The benefit of that is that it’s intuitively clear.  In that case “isnan(x)” is false, because x cannot be a NaN.  There is no distinction between “isnan(x+x)” and “isnan(x)”.  If the user wants to preserve “isnan(x)”, they can apply some pragma (which clang may actually have already).


The simplicity is only apparent. As the discussion on the GCC mailing list demonstrated (https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544641.html), this is actually an unpromising approach. From a practical viewpoint it is also a bad solution, as users cannot even check their assertions.
 

 

> To be honest, I’m not sure that I understand your argument.  Are you saying that under your interpretation we could optimize “isnan(x+x) -> false”, but not “isnan(x) -> false”?


The argument of `isnan(x+x)` is the result of an arithmetic operation. According to the meaning of -ffinite-math-only it cannot produce NaN, so this call can be optimized out. In the general case, the argument of `isnan(x)` may be, say, loaded from memory. A load is not an arithmetic operation, so nothing prevents a NaN from being loaded. Optimizing the call out is dangerous in this case.

Chris Tetreault via llvm-dev

Sep 9, 2021, 1:10:08 PM
to Serge Pavlov, Krzysztof Parzyszek, llvm...@lists.llvm.org, cfe...@lists.llvm.org

If the issue is that users want their asserts to fire, then they should be encouraged to only enable fast math in release builds.

Krzysztof Parzyszek via llvm-dev

Sep 9, 2021, 1:30:49 PM
to Serge Pavlov, LLVM Developers, cfe...@lists.llvm.org

> The simplicity is only apparent. As the discussion on the GCC mailing list demonstrated (https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544641.html), this is actually an unpromising approach. From a practical viewpoint it is also a bad solution, as users cannot even check their assertions.

 

The intent here is that users can preserve the NaN behavior by annotating the code with either attributes or pragmas.  I don’t think that the linked discussion actually shows that the “no NaNs ever” interpretation is any worse than the “arithmetic operations do not produce NaNs”. A large part of it was about what happens to `__builtin_nan`, but if your code explicitly produces NaNs and you compile it with finite-math, you shouldn’t expect anything meaningful.  IMO it’s much better to have a flag with a clarity of what it does, even if it leads to potentially unexpected results, than having an option whose description is open to interpretation.  At least the users will know what caused the issue, rather than wonder if they had found a compiler bug or not.

 

I agree that there may be issues with multiple definitions of functions compiled with different settings, although that is not strictly limited to FP flags.  There should be some unified approach to that, and I don’t know what the right thing to do is off the top of my head.

 

 

> The argument of `isnan(x+x)` is the result of an arithmetic operation. According to the meaning of -ffinite-math-only it cannot produce NaN, so this call can be optimized out. In the general case, the argument of `isnan(x)` may be, say, loaded from memory. A load is not an arithmetic operation, so nothing prevents a NaN from being loaded. Optimizing the call out is dangerous in this case.

 

`x` is not a load, it’s an expression.  Also, even in the presence of NaNs, x+0 preserves the value type (i.e. normal/subnormal/infinity/NaN), except signaling NaNs perhaps.  I’m not sure whether we even consider signaling NaNs, so let’s forget them for a moment.  If x+0 is a NaN iff x is a NaN, then the compiler should be able to rewrite x -> x+0 regardless of any flags.  But then, given that x+0 is now “arithmetic”, isnan(x+0) could become `false`.  This is fundamentally counterintuitive.

 

Furthermore, if we had `a = isnan(x)`, we couldn’t fold it to `false`, but if we had `a = isnan(x); b = isnan(x+x)`, then we could fold both to `false`.  This is, again, unintuitive.
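
The premise in this message, that `x + 0` keeps the value category of `x`, can be checked at run time with a small C sketch. The helper name `same_class_after_add_zero` is hypothetical; the `volatile` zero is only there to discourage folding, and the result assumes the code is built without fast-math flags.

```c
#include <math.h>

/* Hypothetical helper: returns 1 if x and x + 0 fall into the same
   fpclassify() category (zero, subnormal, normal, infinite, or NaN). */
int same_class_after_add_zero(double x) {
    volatile double zero = 0.0;
    return fpclassify(x) == fpclassify(x + zero);
}
```

If a compiler were free to rewrite `x` as `x + 0` and then treat the sum as arithmetic, `isnan(x)` and `isnan(x + 0)` could give different answers under `-ffinite-math-only`, which is the inconsistency the message objects to.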

Serge Pavlov via llvm-dev

Sep 9, 2021, 1:34:54 PM
to Chris Tetreault, llvm...@lists.llvm.org, cfe...@lists.llvm.org
Let me describe a real life example.

There is a realtime program that processes float values from a huge array. Calculations do not produce NaNs and do not expect them. Using -ffinite-math-only substantially speeds up the program, so it is highly desirable to use it. The problem is that the array contains NaNs; they mark elements that should not be processed.

An obvious solution is to check each element for NaN and process it only if it is not one. Currently there is no clean way to do so, only workarounds such as using integer arithmetic. The function `isnan` became useless. And there are many cases where users complain about this optimization.

Thanks,
--Serge

Serge Pavlov via llvm-dev

Sep 9, 2021, 1:52:24 PM
to Krzysztof Parzyszek, LLVM Developers, cfe...@lists.llvm.org
On Fri, Sep 10, 2021 at 12:30 AM Krzysztof Parzyszek <kpar...@quicinc.com> wrote:

> > The simplicity is only apparent. As the discussion on the GCC mailing list demonstrated (https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544641.html), this is actually an unpromising approach. From a practical viewpoint it is also a bad solution, as users cannot even check their assertions.
>
> The intent here is that users can preserve the NaN behavior by annotating the code with either attributes or pragmas.  I don’t think that the linked discussion actually shows that the “no NaNs ever” interpretation is any worse than the “arithmetic operations do not produce NaNs”. A large part of it was about what happens to `__builtin_nan`, but if your code explicitly produces NaNs and you compile it with finite-math, you shouldn’t expect anything meaningful.


The purpose of -ffinite-math-only was to make calculations faster by excluding corner cases when the user is sure that they do not occur. Why should it prohibit all operations on NaNs, like reading, writing and checking? Does prohibiting them make programs faster or otherwise better?
 

> IMO it’s much better to have a flag with a clarity of what it does, even if it leads to potentially unexpected results, than having an option whose description is open to interpretation.  At least the users will know what caused the issue, rather than wonder if they had found a compiler bug or not.


This solution seems overcomplicated: a new flag with a probably complex meaning. If the effect of -ffinite-math-only is limited to the cases where this restriction indeed gives benefits, it would be a solution without multiplying entities.

Chris Tetreault via llvm-dev

Sep 9, 2021, 2:03:17 PM
to Serge Pavlov, llvm...@lists.llvm.org, cfe...@lists.llvm.org

In this case, I think it’s perfectly reasonable to reinterpret_cast the floats to uint32_t, and then inspect the bit pattern. Since NaN is being used as a sentinel value, I assume it’s a known bit pattern, and not just any old NaN.
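
The bit-pattern inspection suggested here could look like the following C sketch, where `memcpy` plays the role of the cast. The sentinel value `0x7FC00000` (one particular quiet NaN in IEEE 754 binary32) and all names are hypothetical; the real program would use whatever pattern its data actually contains.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel: the bit pattern of one particular quiet NaN. */
#define SENTINEL_BITS UINT32_C(0x7FC00000)

/* Returns the object representation of a float as a 32-bit integer. */
uint32_t float_bits(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Builds the sentinel value from its bit pattern. */
float make_sentinel(void) {
    uint32_t bits = SENTINEL_BITS;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Matches only the exact sentinel pattern, not every NaN. */
int is_sentinel(float x) {
    return float_bits(x) == SENTINEL_BITS;
}
```

Unlike a general NaN test, this comparison treats NaNs with any other payload or sign as ordinary data, which is only appropriate if the producer of the array always writes the same pattern.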

 

I think it’s fine that fast-math renders isnan useless. As far as I know, the C++ standard wasn’t written to account for compilers providing fast-math flags. fast-math is itself a workaround for “IEEE floats do not behave like actual real numbers”, so working around a workaround seems reasonable to me.

Serge Pavlov via llvm-dev

Sep 9, 2021, 2:28:02 PM
to Chris Tetreault, llvm...@lists.llvm.org, cfe...@lists.llvm.org
On Fri, Sep 10, 2021 at 1:03 AM Chris Tetreault <ctet...@quicinc.com> wrote:

> In this case, I think it’s perfectly reasonable to reinterpret_cast the floats to uint32_t, and then inspect the bit pattern. Since NaN is being used as a sentinel value, I assume it’s a known bit pattern, and not just any old NaN.


The C standard defines a function to determine if a value is NaN. The fact that it does not work in this case demonstrates that the optimization is incorrect. Again, if isnan comes from the libc implementation, it will work, but if it is provided by the compiler, it does not. Users expect consistent behavior.

If NaNs are not prohibited at all in -ffinite-math-only mode, isnan must work as specified in the standard.

 

> I think it’s fine that fast-math renders isnan useless. As far as I know, the C++ standard wasn’t written to account for compilers providing fast-math flags. fast-math is itself a workaround for “IEEE floats do not behave like actual real numbers”, so working around a workaround seems reasonable to me.


I feel you are right with fast-math as a workaround, but the compiler is a practical tool and it must be convenient and suitable for a wide set of tasks. The situation when a user has to invent workarounds because some optimization changes semantics of a standard function is not good.

As for ffinite-math-only, it is actually more or less a safe mode. When we use integer division, we know that the divisor must not be zero. The case of ffinite-math-only is similar. 

Mehdi AMINI via llvm-dev

Sep 9, 2021, 2:30:24 PM
to Serge Pavlov, llvm...@lists.llvm.org, cfe...@lists.llvm.org
On Thu, Sep 9, 2021 at 10:34 AM Serge Pavlov via cfe-dev <cfe...@lists.llvm.org> wrote:
Let me describe a real life example.

There is a realtime program that processes float values from a huge array. Calculations do not produce NaNs and do not expect them. Using -ffinite-math-only substantially speeds up the program, so it is highly desirable to use it. The problem is that the array contains NaNs, they mark elements that should not be processed.

An obvious solution is to check an element for NaN, and if it is not, process it. Now there is no clean way to do so. Only workarounds, like using integer arithmetics. The function 'isnan' became useless. And there are many cases when users complain of this optimization.

I personally would separate the "pre-processing" of the input into a compilation unit that isn't compiled with -ffinite-math-only and isolate the perf-critical routines to be compiled with this flag if needed (I'd also like a sanitizer to have a build mode that validates that no NaNs are ever seen in these routines).

In general, Krzysztof's reasoning in this thread makes sense to me, in particular in terms of being consistent with how we treat isnan(x) vs isnan(x+0) for example.

-- 
Mehdi

Serge Pavlov via llvm-dev

Sep 9, 2021, 2:55:47 PM
to Mehdi AMINI, llvm...@lists.llvm.org, cfe...@lists.llvm.org
On Fri, Sep 10, 2021 at 1:29 AM Mehdi AMINI <joke...@gmail.com> wrote:


On Thu, Sep 9, 2021 at 10:34 AM Serge Pavlov via cfe-dev <cfe...@lists.llvm.org> wrote:
Let me describe a real life example.

There is a realtime program that processes float values from a huge array. Calculations do not produce NaNs and do not expect them. Using -ffinite-math-only substantially speeds up the program, so it is highly desirable to use it. The problem is that the array contains NaNs, they mark elements that should not be processed.

An obvious solution is to check an element for NaN, and if it is not, process it. Now there is no clean way to do so. Only workarounds, like using integer arithmetics. The function 'isnan' became useless. And there are many cases when users complain of this optimization.

I personally would separate the "pre-processing" of the input into a compilation unit that isn't compiled with -ffinite-math-only and isolate the perf-critical routines to be compiled with this flag if needed (I'd also like a sanitizer to have a build mode that validates that no NaNs are ever seen in these routines).

It could be a workaround. GCC supports '#pragma GCC optimize', which could be used to turn -ffinite-math-only on and off. In clang this pragma does not work, so the only option is separate translation units with subsequent linking, which is not possible in some cases, like in ML kernels.


In general, Krzysztof's reasoning in this thread makes sense to me, in particular in terms of being consistent with how we treat isnan(x) vs isnan(x+0) for example.


The key point here is what guarantees the user provides to the compiler when they specify -ffinite-math-only. If it is "NaNs can never be seen", then indeed, isnan may be optimized out. If it is "NaNs do not occur in arithmetic operations", then 'isnan' must be kept unless we know for sure that its argument cannot be a NaN. The choice should be based on practical needs IMHO. The second approach is more flexible and enables more use cases.
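To make the second reading concrete, here is a sketch of the sentinel use case described earlier in the thread. The name `sum_unmarked` is hypothetical. In a build without -ffinite-math-only the function skips NaN-marked elements. Under the first reading the compiler may fold the `std::isnan` call to `false`, so marked elements would be fed into the sum; under the second reading the check has to stay because nothing proves the loaded value is not a NaN.

```cpp
#include <cmath>
#include <cstddef>

// Sums the elements that are not marked with NaN.
inline float sum_unmarked(const float* data, std::size_t count) {
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        if (std::isnan(data[i]))
            continue;  // NaN marks "do not process"
        total += data[i];
    }
    return total;
}
```

Note that the arithmetic itself (`total += data[i]`) never sees a NaN as long as the check is kept, which is the guarantee the second reading asks the user to provide.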

Cranmer, Joshua via llvm-dev

Sep 9, 2021, 4:39:01 PM
to Mehdi AMINI, Serge Pavlov, llvm...@lists.llvm.org, cfe...@lists.llvm.org

(Speaking only for myself here, and mostly as someone who doesn’t typically write floating-point-heavy code).

 

The root issue we have here is that, as with many compiler extensions, fast-math flags end up creating a vaguely-defined variant of the C specification governed by the “obvious” semantics, and, as is the case with “obvious” semantics, there are several different “obvious” results.

 

Given the standard C taste for undefined behavior, it would seem to me that the most natural definition of -ffinite-math-only would be to say that any operation that produces NaN or infinity results is undefined behavior, or produces a poison value using LLVM’s somewhat tighter definition here [1]. This notably doesn’t give a clear answer on what to do with floating-point operations that don’t produce floating-point results (e.g., casts, comparison operators), and the volume of discussions on this point is I think indicative that there are multiple reasonable options here. Personally, I find the extension of the UB to cases that consume but do not produce floating-point values to be the most natural option.

 

It’s also the case that many users don’t like undefined behavior as a concept, in large part because it can be very difficult to work around in a few cases where it is desired to explicitly override the undefined behavior. For some of the more basic integer UB, clang already provides builtin overflow checking macros to handle the I-want-to-check-if-it-overflowed-without-UB case, for example. And if fast math flags are to create UB, then similar functionality to override the floating-point UB ought to be provided. Already, C provides a mechanism to twiddle floating-point behavior on a per-scope basis (e.g., #pragma STDC FENV_ACCESS, CX_LIMITED_RANGE, FP_CONTRACT). LLVM already supports these flags on a per-instruction basis, so it really shouldn’t be very difficult to have Clang support pragmas to twiddle fast-math flags like the existing C pragmas. And in this model, the -ffast-math and related flags are doing nothing more than setting the default values of these pragmas.

 

In that vein, I can imagine a user writing a program that would look something like this:

 

int some_hard_math_kernel(float *inputs, float *outputs, int N) {
  {
    #pragma clang fast_math off
    for (int i = 0; i < N; i++) {
      if (isinf(inputs[i]) || isnan(inputs[i]))
        return ILLEGAL_ARGUMENT;
    }
  }
  #pragma clang fast_math on
  // Do fancy math here…
  // and if we see isnan(x) here, even if it’s in a library routine [compiled with -ffast-math],
  // or maybe implied by some operation the compiler understands [say, complex multiplication]
  // it is optimized to false.
  return SUCCESS;
}

 

I can clearly see use cases where the programmer might wish to have the optimizer eliminate any isnan calls that are generated when -ffast-math is used, but like other UB, I think it is extremely beneficial to provide some way to explicitly opt-out of UB on a case-by-case basis.
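For comparison, the integer precedent mentioned above looks roughly like this. `__builtin_add_overflow` is a GCC/Clang builtin rather than standard C++, and `checked_add` is a hypothetical wrapper name; this is only a sketch of the opt-out pattern, not a proposal for the floating-point interface.

```cpp
#include <climits>

// Adds two ints. Returns true and stores the sum in *result when it fits;
// returns false when the mathematical sum does not fit in an int, without
// ever performing a signed addition that overflows.
inline bool checked_add(int lhs, int rhs, int* result) {
    return !__builtin_add_overflow(lhs, rhs, result);
}
```

A caller can then write `int r; if (!checked_add(INT_MAX, 1, &r)) { /* handle overflow */ }` and stay clear of the UB that a plain `INT_MAX + 1` would have.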

 

I would even go so far as to suggest that maybe the C standards committee should discuss how to handle at least the nsz/nnan/ninf parts of fast-math flags, given that very similar concepts seem to exist in all of the major C/C++ compilers.

 

[1] I fully expect any user who is knowledgeable about poison in LLVM—which admittedly is a fairly expert user—would expect poison to kick in most of the time C or C++ provides for undefined behavior, and potentially to rely on that expectation.

Chris Tetreault via llvm-dev

Sep 9, 2021, 4:55:16 PM
to Serge Pavlov, llvm...@lists.llvm.org, cfe...@lists.llvm.org

The point I was trying to make regarding the C++ standard is that fast-math is a non-standard language extension. If you enable it, you should expect the compiler to diverge from the language standard. I’m sure there’s precedent for this. If I write #pragma once at the top of my header, and include it twice back to back, the preprocessor won’t paste my header twice. Should #pragma once be removed because it breaks #include?

 

Now, you have a real-world example that uses NaN as a sentinel value. In your case, it would be nice if the compiler worked as you suggest. Now, suppose I have a “safe matrix multiply”:

 

```
std::optional<MyMatrixT> safeMul(const MyMatrixT & lhs, const MyMatrixT & rhs) {
  for (int i = 0; i < lhs.rows; ++i) {
    for (int j = 0; j < lhs.cols; ++j) {
      if (isnan(lhs[i][j])) {
        return {};
      }
    }
  }
  for (int i = 0; i < rhs.rows; ++i) {
    for (int j = 0; j < rhs.cols; ++j) {
      if (isnan(rhs[i][j])) {
        return {};
      }
    }
  }

  // do the multiply
}
```

 

In this case, if isnan(x) can be constant folded to false with fast-math enabled, then these two loops can be completely eliminated since they are empty and do nothing. If MyMatrixT is a 100 x 100 matrix, and/or safeMul is called in a hot loop, this could be huge. What should I do instead here?

 

Really, it would be much more consistent if we apply the clang documentation for fast-math “Operands to floating-point operations are not equal to NaN and Inf” literally, and not actually implement “Operands to floating-point operations are not equal to NaN and Inf, except in the case of isnan(), but only if the argument to isnan() is a value stored in a variable and not an expression”. As far as using isnan from the standard library compiled without fast-math vs a compiler builtin, I don’t think this is an issue. Really, enabling fast-math is basically telling the compiler “My code has no NaNs. I won’t try to do anything with them, and you should optimize assuming they aren’t there”. If a developer does their part, why should it matter to them that isnan() might work?

 

Thanks,

   Chris Tetreault

James Y Knight via llvm-dev

Sep 9, 2021, 4:57:12 PM
to Sanjay Patel, LLVM Developers, cfe...@lists.llvm.org
On Thu, Sep 9, 2021 at 11:02 AM Sanjay Patel <spa...@rotateright.com> wrote:
Not sure which way to go, but I agree that we need to improve the docs/user experience either way.
Let's try to iron this out with an example (this is based on https://llvm.org/PR51775 ):

#include <math.h>
#include <stdlib.h>
int main() {
    const double d = strtod("1E+1000000", NULL);

This should be covered by the "general function call" rule, is therefore unaffected by -ffinite-math-only, and may validly return inf.

    return d == HUGE_VAL;

For this comparison, however, the compiler can assume its operands are always finite. Thus, this comparison results in a poison value (in LLVM IR terminology).

What should this program return when compiled with -ffinite-math-only? Should this trigger a clang warning?

We could indeed emit a diagnostic (when -ffinite-math-only is in effect) to let you know that you are doing something guaranteed to be incorrect, by using a manifest constant INF, where you promised that you would not.
 
The proposed documentation text isn't clear to me. Should clang apply "nnan ninf" to the IR call for "strtod"?
"strtod" is not in the enumerated list of functions where we would block fast-math-flags, but it is a standard lib call, so "nnan ninf" would seem to apply...but we also don't want "-ffinite-math-only" to alter the ability to return an INF from a "general function call"?


The strtod function should be allowed to return inf/nan. There are two ways we could accomplish that:
1. We could specify in LLVM that nnan/ninf are meaningless to most function calls. In this case, Clang may continue emitting it everywhere, as is done today, including on strtod, but it would have no impact.
2. We could specify that clang should not emit nnan/ninf except on certain calls. In this case, Clang would not emit it on strtod.

I haven't thought about which option would be better. I've been trying to discuss the desired C-facing semantics first.
  

Richard Smith via llvm-dev

Sep 9, 2021, 8:59:20 PM
to Chris Tetreault, llvm...@lists.llvm.org, cfe...@lists.llvm.org
On Thu, 9 Sept 2021 at 13:55, Chris Tetreault via llvm-dev <llvm...@lists.llvm.org> wrote:

The point I was trying to make regarding the C++ standard is that fast-math is a non-standard language extension.


-ffinite-math-only does not need to be a non-standard language extension. Neither C nor C++ requires that floating-point types can represent infinity or NaN, and we could define this flag as meaning that there are (notionally) simply no such values in the relevant types. Of course, that's not actually consistent with what we currently do, nor with what GCC does.

Would it be reasonable to treat operations on Inf and NaN values as UB in this mode only if the same operation on a signaling NaN might signal? (Approximately, that'd mean we imagine these non-finite value encodings all encode sNaNs that are UB if they would signal.) That means the operations that ISO 60559 defines as non-computational or quiet-computational would be permitted to receive NaN and Inf as input and produce them as output, but that other computational operations would not.

Per ISO 60559, the quiet-computational operations that I think are relevant to us are: copy, negate, abs, copySign, and conversions between encodings (e.g., bitcast). The non-computational operations that I think are relevant to us are classification functions (including isNaN).
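As a rough illustration of that category, the sketch below exercises copy, negate, abs and copySign on one input and classifies the results. The function name `quiet_ops_keep_nan` is made up. It shows the ordinary IEEE behavior in a build without -ffinite-math-only; what these operations should be allowed to do under the flag is exactly the open question.

```cpp
#include <cmath>

// Returns true when x is a NaN and copy, negate, abs and copySign all
// pass it through as a NaN, with abs clearing and copySign setting the
// sign bit. Returns false for any non-NaN input.
inline bool quiet_ops_keep_nan(double x) {
    const double copied = x;                            // copy
    const double negated = -x;                          // negate
    const double magnitude = std::fabs(x);              // abs
    const double resigned = std::copysign(x, -1.0);     // copySign
    return std::isnan(copied) && std::isnan(negated) &&
           std::isnan(magnitude) && std::isnan(resigned) &&
           !std::signbit(magnitude) && std::signbit(resigned);
}
```

None of these operations signals on a NaN operand, which is why they would stay defined for NaN and Inf inputs under this proposal.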

James Y Knight via llvm-dev

Sep 10, 2021, 10:30:08 AM
to Richard Smith, llvm-dev, Clang Dev
On Thu, Sep 9, 2021, 8:59 PM Richard Smith via llvm-dev <llvm...@lists.llvm.org> wrote:
Would it be reasonable to treat operations on Inf and NaN values as UB in this mode only if the same operation on a signaling NaN might signal? (Approximately, that'd mean we imagine these non-finite value encodings all encode sNaNs that are UB if they would signal.) That means the operations that ISO 60559 defines as non-computational or quiet-computational would be permitted to receive NaN and Inf as input and produce them as output, but that other computational operations would not.

Per ISO 60559, the quiet-computational operations that I think are relevant to us are: copy, negate, abs, copySign, and conversions between encodings (e.g., bitcast). The non-computational operations that I think are relevant to us are classification functions (including isNaN).

I'm in favor. (Perhaps unsurprisingly, as this is precisely the proposal I made earlier, worded slightly differently. :)

Serge Pavlov via llvm-dev

Sep 10, 2021, 1:42:32 PM
to Chris Tetreault, llvm...@lists.llvm.org, cfe...@lists.llvm.org
If clang does not remove `__builtin_isnan` in `-ffinite-math-only` mode and a user wants calls to `isnan` to be optimized out, they can do it in literally a couple of lines:

#undef isnan
#define isnan(x) false

If clang optimizes out `__builtin_isnan` and a user wants to check if some float is NaN, they have no appropriate way for that, only hacks and kludges.

The approach in which -ffinite-math-only means "there are no NaNs" is too rigid: it prevents several coding techniques, does not provide additional optimization possibilities, and provokes user complaints.

Thanks,
--Serge


On Fri, Sep 10, 2021 at 11:28 PM Chris Tetreault <ctet...@quicinc.com> wrote:

I’m not super knowledgeable on the actual implementation of floating point math in clang, but on the surface this seems fine. My position is that we should provide no guarantees as to the behavior of code with NaN or infinity if fast-math is enabled. We can go with this behavior, but we shouldn’t tell users that they can rely on this behavior. Clang should have maximal freedom to optimize floating point math with fast-math, and any constraint we place potentially results in missed opportunities. Similarly we should feel free to change this implementation in the future, the goal not being stability for users who chose to rely on our implementation details. If users value reproducibility, they should not be using fast math.

 

The only thing I think we should guarantee is that casts work. I should be able to load some bytes from disk, cast the char array to a float array, and any NaNs that I loaded from disk should not be clobbered. After that, I should be able to cast an element of my float array back to another type and inspect the bit pattern (assuming I did not transform that element in the array in any other way after casting it from char) to support use cases like Serge’s. Any other operation should be fair game.

 

Thanks,

   Chris Tetreault

Chris Tetreault via llvm-dev

Sep 10, 2021, 2:25:11 PM
to Richard Smith, llvm...@lists.llvm.org, cfe...@lists.llvm.org

I’m not super knowledgeable on the actual implementation of floating point math in clang, but on the surface this seems fine. My position is that we should provide no guarantees as to the behavior of code with NaN or infinity if fast-math is enabled. We can go with this behavior, but we shouldn’t tell users that they can rely on this behavior. Clang should have maximal freedom to optimize floating point math with fast-math, and any constraint we place potentially results in missed opportunities. Similarly we should feel free to change this implementation in the future, the goal not being stability for users who chose to rely on our implementation details. If users value reproducibility, they should not be using fast math.

 

The only thing I think we should guarantee is that casts work. I should be able to load some bytes from disk, cast the char array to a float array, and any NaNs that I loaded from disk should not be clobbered. After that, I should be able to cast an element of my float array back to another type and inspect the bit pattern (assuming I did not transform that element in the array in any other way after casting it from char) to support use cases like Serge’s. Any other operation should be fair game.
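A sketch of the round trip this guarantee would cover, assuming a 32-bit `float`. The helper names `floats_from_bytes` and `bits_of` are hypothetical, and `std::memcpy` is used in place of pointer casts to keep the example free of aliasing problems.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Reinterpret a raw byte buffer (for example, one read from disk) as floats.
// Trailing bytes that do not fill a whole float are ignored.
inline std::vector<float> floats_from_bytes(const std::vector<unsigned char>& bytes) {
    std::vector<float> values(bytes.size() / sizeof(float));
    if (!values.empty())
        std::memcpy(values.data(), bytes.data(), values.size() * sizeof(float));
    return values;
}

// Read back the stored bit pattern of a single element.
inline std::uint32_t bits_of(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

If a buffer holds the bytes of a NaN with some payload, the guarantee amounts to `bits_of(floats_from_bytes(buffer)[0])` returning that same pattern, provided nothing else touched the element in between.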

 

Thanks,

   Chris Tetreault

 

Chris Tetreault via llvm-dev

Sep 10, 2021, 2:25:34 PM
to Serge Pavlov, llvm...@lists.llvm.org, cfe...@lists.llvm.org

I would argue that #undef’ing a macro provided by the compiler is a much worse kludge than static casting your float to an unsigned int. Additionally, you have to redefine isnan to whatever it was after your function (lest it pollute unrelated code that possibly isn’t even being compiled with fast math), which can’t be done portably as far as I know. Additionally, this requires you to be the author of safeMul. What if it’s in a dependency for which you don’t have the source? At that point, your only recourse is to open an issue with libProprietaryMatrixMath and hope your org is paying them enough to fast track a fix.

Serge Pavlov via llvm-dev

Sep 10, 2021, 2:26:43 PM
to Chris Tetreault, llvm...@lists.llvm.org, cfe...@lists.llvm.org
It should not be done in headers, of course. Redefining this macro in a source file that is compiled with -ffinite-math-only is free from the described drawbacks. Besides, the macro `isnan` is defined by libc, not the compiler, and IIRC it is defined as a macro to allow such manipulations.

The influence of libc on the behavior of `isnan` in -ffinite-math-only mode is also an argument against "there are no NaNs". It causes inconsistency in the behavior. Libc can provide its own implementation, which does not rely on the compiler's `__builtin_isnan`, and user code that uses `isnan` would work. But at some point the configuration script changes, or libc changes the macro, and your code stops working correctly, as happened after commit 767eadd78 in the llvm libcxx project. Keeping `isnan` would make changes in libc less harmful.

Thanks,
--Serge

Joerg Sonnenberger via llvm-dev

Sep 10, 2021, 2:39:34 PM