68 views

Skip to first unread message

Aug 26, 2014, 11:23:11 AM8/26/14

to duncan...@deepbluecap.com, llv...@cs.uiuc.edu

Hi Duncan,

Thank you for your comment to the bug 16257.

I am new to LLVM, so not all the aspects of LLVM's*"undef"*
seem clear to me yet.

I read and understood the examples from the LLVM documentation:

http://llvm.org/docs/LangRef.html#undefined-values

However, those examples do not cover all of the possible contexts where*"undef"*
can appear.

E.g., I can't realize why*"fmul undef, undef"* may not result
in* "undef"* whereas *"mul
undef, undef"* is always folded to *"undef"*. Seems I am
missing something important about
floats here.

The same applies to "fdiv".

*"fdiv undef, undef"* is not folded but *"div %X, undef"*
and *"div undef, %X"* are
folded to *"undef"* (SimplifyFDivInst function in
lib/Analysis/InstructionSimplify.cpp). Moreover, SimplifyFDivInst
does not take
into account whether signalling of SNaNs can be switched off or not
- it always
folds if one of the operands is *"undef"*.

Another mysterious thing for me is folding of*"mul %X, undef"*.

The result depends on whether %X is odd or even:

*"fadd undef,
undef"*.
*"Add" *gets folded to *"undef"* if one or both its
operands are *"undef"*.

Should*"fadd"* behave differently than integer *"add"*
too?

I've tried to google these questions, scanned StackOverflow, but didn't find any good articles / posts answering them. LLVM's*"undef"* is
covered quite poorly.

Duncan, do you know of any web resources explaining this in more detail?

Thank you in advance for any help.

Kind regards,

Oleg

Thank you for your comment to the bug 16257.

I am new to LLVM, so not all the aspects of LLVM's

I read and understood the examples from the LLVM documentation:

http://llvm.org/docs/LangRef.html#undefined-values

However, those examples do not cover all of the possible contexts where

E.g., I can't realize why

The same applies to "fdiv".

Another mysterious thing for me is folding of

The result depends on whether %X is odd or even:

- "undef" if %X is odd or equal to "undef";
- 0 otherwise.

Should

I've tried to google these questions, scanned StackOverflow, but didn't find any good articles / posts answering them. LLVM's

Duncan, do you know of any web resources explaining this in more detail?

Thank you in advance for any help.

Kind regards,

Oleg

Aug 26, 2014, 11:51:54 AM8/26/14

to Oleg Ranevskyy, llv...@cs.uiuc.edu

Hi Oleg,

On 26/08/14 21:20, Oleg Ranevskyy wrote:

> Hi Duncan,

>

> Thank you for your comment to the bug 16257.

>

> I am new to LLVM, so not all the aspects of LLVM's /"undef"/ seem clear to me yet.

> E.g., I can't realize why /"fmul undef, undef"/ may not result in/"undef"/

> whereas /"mul undef, undef"/ is always folded to /"undef"/. Seems I am missing

fold the mul to undef you have to prove that the output may be anything (any bit

pattern). For example it would be wrong to fold "mul 0, undef" to undef since

no matter what value you substitute for "undef" the output of the mul will

always be zero. Note that in something like "mul undef, undef", where undef

occurs more than once, you are allowed to assume that the two undefs are

independent of each other, uncorrelated, i.e. in your reasoning you can insert

any bit pattern for the first undef and any other bit pattern for the second undef.

Here is a proof that "mul undef, undef" is undef. Let R represent an arbitrary

bit pattern. We have to find two bit patterns X and Y such that "mul X, Y" is

equal to R. For example set X to 1 and Y to R. This ends the proof.

Now consider "fmul undef, undef". You could try to apply the same reasoning,

and maybe it is OK. Let R represent an arbitrary bit pattern. Set Y = R and

set X to be the bit pattern for the floating point number 1.0. Then "fmul X, Y"

is equal to R, proving that "fmul undef, undef" can be folded to "undef".

But... is this correct? Is it always true that "fmul 1.0, R" always has exactly

the same bits as R (this is not the same as the output comparing equal to R

using a floating point comparison)? After all, R might be something funky like

a signed zero, an infinity, or a NaN. I don't know if it is always true if

"fmul 1.0, R" always has the same bits as R, hopefully a floating point expert

can tell us.

You could always argue that you could choose "undef" to be a signalling NaN,

causing the program to have undefined behaviour at the point of the

multiplication, in which case you could do anything, but kindly instead just

fold the fmul to undef. I don't much like using snans like this though since

they can be made non-signalling by poking the processor AFAIK.

>

> The same applies to "fdiv".

> /"fdiv undef, undef"/ is not folded but /"div %X, undef"/ and /"div undef, %X"/

> are folded to /"undef"/ (SimplifyFDivInst function in

> lib/Analysis/InstructionSimplify.cpp).

Here is the argument for "div %X, undef -> undef". The undef value might be

zero (div %X, 0), so lets choose it to be zero. That means that the program has

undefined behaviour at this point, and so we could fold the div to "exit(1)" or

"launch nuclear missile". But instead we only fold it to undef.

You are wrong to say that "div undef, %X" is folded to "undef" by

InstructionSimplify, it is folded to zero. It would be wrong to fold to undef,

since (eg) if %X is 2 then you can't obtain an arbitrary bit pattern as the output.

Fdiv is harder than div because a floating point division by 0.0 has a defined

result, unlike for div.

Moreover, SimplifyFDivInst does not take

> into account whether signalling of SNaNs can be switched off or not - it always

> folds if one of the operands is /"undef"/.

This is either a mistake, or a decision that in LLVM IR snans are always

considered to be signalling.

>

> Another mysterious thing for me is folding of /"mul %X, undef"/.

> * 0 otherwise.

InstructionSimplify folds "mul %X, undef" to 0 always, since after all "undef"

could be zero, in which case the output is 0. It would be wrong to fold it to

undef, as you point out, and it isn't folded to undef.

>

> There is a similar bug 16258 about folding of /"fadd undef, undef"/. /"Add"

> /gets folded to /"undef"/ if one or both its operands are /"undef"/.

> Should /"fadd"/ behave differently than integer /"add"/ too?

It's the same problem. It is easy to show that "add undef, %X" can produce any

desired bit pattern as output, but it is less clear whether that is true for

fadd, unless you use the snan argument.

Ciao, Duncan.

>

> I've tried to google these questions, scanned StackOverflow, but didn't find any

> good articles / posts answering them. LLVM's /"undef"/ is covered quite poorly.

LLVM Developers mailing list

LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu

http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

On 26/08/14 21:20, Oleg Ranevskyy wrote:

> Hi Duncan,

>

> Thank you for your comment to the bug 16257.

>

> I read and understood the examples from the LLVM documentation:

> http://llvm.org/docs/LangRef.html#undefined-values

>

> However, those examples do not cover all of the possible contexts where

> /"undef"/ can appear.
> http://llvm.org/docs/LangRef.html#undefined-values

>

> However, those examples do not cover all of the possible contexts where

> E.g., I can't realize why /"fmul undef, undef"/ may not result in/"undef"/

> whereas /"mul undef, undef"/ is always folded to /"undef"/. Seems I am missing

> something important about floats here.

in the case of mul undef, undef, the two inputs may be anything. In order to
fold the mul to undef you have to prove that the output may be anything (any bit

pattern). For example it would be wrong to fold "mul 0, undef" to undef since

no matter what value you substitute for "undef" the output of the mul will

always be zero. Note that in something like "mul undef, undef", where undef

occurs more than once, you are allowed to assume that the two undefs are

independent of each other, uncorrelated, i.e. in your reasoning you can insert

any bit pattern for the first undef and any other bit pattern for the second undef.

Here is a proof that "mul undef, undef" is undef. Let R represent an arbitrary

bit pattern. We have to find two bit patterns X and Y such that "mul X, Y" is

equal to R. For example set X to 1 and Y to R. This ends the proof.

Now consider "fmul undef, undef". You could try to apply the same reasoning,

and maybe it is OK. Let R represent an arbitrary bit pattern. Set Y = R and

set X to be the bit pattern for the floating point number 1.0. Then "fmul X, Y"

is equal to R, proving that "fmul undef, undef" can be folded to "undef".

But... is this correct? Is it always true that "fmul 1.0, R" always has exactly

the same bits as R (this is not the same as the output comparing equal to R

using a floating point comparison)? After all, R might be something funky like

a signed zero, an infinity, or a NaN. I don't know if it is always true if

"fmul 1.0, R" always has the same bits as R, hopefully a floating point expert

can tell us.

You could always argue that you could choose "undef" to be a signalling NaN,

causing the program to have undefined behaviour at the point of the

multiplication, in which case you could do anything, but kindly instead just

fold the fmul to undef. I don't much like using snans like this though since

they can be made non-signalling by poking the processor AFAIK.

>

> The same applies to "fdiv".

> are folded to /"undef"/ (SimplifyFDivInst function in

> lib/Analysis/InstructionSimplify.cpp).

Here is the argument for "div %X, undef -> undef". The undef value might be

zero (div %X, 0), so lets choose it to be zero. That means that the program has

undefined behaviour at this point, and so we could fold the div to "exit(1)" or

"launch nuclear missile". But instead we only fold it to undef.

You are wrong to say that "div undef, %X" is folded to "undef" by

InstructionSimplify, it is folded to zero. It would be wrong to fold to undef,

since (eg) if %X is 2 then you can't obtain an arbitrary bit pattern as the output.

Fdiv is harder than div because a floating point division by 0.0 has a defined

result, unlike for div.

Moreover, SimplifyFDivInst does not take

> into account whether signalling of SNaNs can be switched off or not - it always

This is either a mistake, or a decision that in LLVM IR snans are always

considered to be signalling.

>

> Another mysterious thing for me is folding of /"mul %X, undef"/.

> The result depends on whether %X is odd or even:

>

> * "undef" if %X is odd or equal to "undef";
>

> * 0 otherwise.

InstructionSimplify folds "mul %X, undef" to 0 always, since after all "undef"

could be zero, in which case the output is 0. It would be wrong to fold it to

undef, as you point out, and it isn't folded to undef.

>

> There is a similar bug 16258 about folding of /"fadd undef, undef"/. /"Add"

> /gets folded to /"undef"/ if one or both its operands are /"undef"/.

> Should /"fadd"/ behave differently than integer /"add"/ too?

It's the same problem. It is easy to show that "add undef, %X" can produce any

desired bit pattern as output, but it is less clear whether that is true for

fadd, unless you use the snan argument.

Ciao, Duncan.

>

> I've tried to google these questions, scanned StackOverflow, but didn't find any

>

> Duncan, do you know of any web resources explaining this in more detail?

>

> Thank you in advance for any help.

>

> Kind regards,

> Oleg

_______________________________________________
> Duncan, do you know of any web resources explaining this in more detail?

>

> Thank you in advance for any help.

>

> Kind regards,

> Oleg

LLVM Developers mailing list

LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu

http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Aug 27, 2014, 5:51:22 AM8/27/14

to Duncan Sands, llv...@cs.uiuc.edu

Hi Duncan,

Thank you a lot for your time to provide that great and informative explanation.

Thank you a lot for your time to provide that great and informative explanation.

Now the "undef" logic makes much more
sense for me.

>>

My mistake. I meant to say "**f****div** undef, %X"
is folded to "undef" (not integer "div").

>>

Probably, the "Undefined Values" section of LLVM IR documentation
(http://llvm.org/docs/LangRef.html#undefined-values) needs a
correction.

For the statement "fdiv %X, undef" it states:

"because the undef is allowed to be an arbitrary value, we are allowed to assume that it could be zero.*******Since a
divide by zero has undefined behavior*******...".

This is not true for floats.

For the statement "fdiv %X, undef" it states:

"because the undef is allowed to be an arbitrary value, we are allowed to assume that it could be zero.

This is not true for floats.

>>

Yes, this seems to be an agreement to treat "undef" as a SNaN for
"fdiv".

The question is whether we can make the same assumption for other floating point operations, or "fdiv" needs a correction to prevent folding since signalling of SNaNs might be disabled.

The question is whether we can make the same assumption for other floating point operations, or "fdiv" needs a correction to prevent folding since signalling of SNaNs might be disabled.

>>

Sorry, I malformed this line and forgot to highlight that by "%X"
I meant a constant here. So, constant folding comes into play. The
result depends on the constant parity. E.g.:

mul i64 5, undef --> undef

mul i64 4, undef --> 0

I still have difficulty understanding why the first one is folded to "undef".

"Undef" could be zero, so the result is also zero. Besides, if "undef" isn't zero, it's impossible to get an arbitrary bit pattern anyway. The result will always be divisible by 5.

Would you be able to elaborate on this please?

Thank you!

Kind regards,

Oleg

mul i64 5, undef --> undef

mul i64 4, undef --> 0

I still have difficulty understanding why the first one is folded to "undef".

"Undef" could be zero, so the result is also zero. Besides, if "undef" isn't zero, it's impossible to get an arbitrary bit pattern anyway. The result will always be divisible by 5.

Would you be able to elaborate on this please?

Thank you!

Kind regards,

Oleg

Aug 27, 2014, 6:20:36 AM8/27/14

to Oleg Ranevskyy, llv...@cs.uiuc.edu

Hi Oleg,

> >> /This is either a mistake, or a decision that in LLVM IR snans are always

> considered to be signalling. /

it is any convenient value. For one fdiv optimization maybe it is convenient to

choose it to be the bit pattern of an snan, for some other fdiv optimization the

compiler can choose it to be 1.0 if that is convenient.

> The question is whether we can make the same assumption for other floating point

> operations, or "fdiv" needs a correction to prevent folding since signalling of

> SNaNs might be disabled.

You can assume that undef is an snan if you want in any floating point

operation. But what does that assumption buy you? If you are willing to assume

that the processor will trap on snans then it buys you a lot for any floating

point operation, but if you are not willing to assume that then you need to find

some other trick or give up on folding the operation to undef.

>

> >> /InstructionSimplify folds "mul %X, undef" to 0 always/

Since 5 and 2^64 are relatively prime, you can prove that given any i64 value R,

there exists an i64 value X such that mul i64 5, X is equal to R. Here's an

example using i3 arithmetic (possible values: 0, ..., 7).

mul i3 5, 0 -> 0

mul i3 5, 5 -> 1 (because 5 * 5 is 25; reduced modulo 8 that gives 1)

mul i3 5, 2 -> 2

mul i3 5, 7 -> 3

mul i3 5, 4 -> 4

mul i3 5, 1 -> 5

mul i3 5, 6 -> 6

mul i3 5, 3 -> 7

So you see that every possible i3 output is achievable. Thus it is correct to

fold "mul i3 5, undef" to undef. The same goes for any other number that is

relatively prime to 2, i.e. is an odd number.

Ciao, Duncan.

> >> /This is either a mistake, or a decision that in LLVM IR snans are always

> considered to be signalling. /

> Yes, this seems to be an agreement to treat "undef" as a SNaN for "fdiv".

"undef" is whatever bit pattern you want it to be, i.e. the compiler can assume
it is any convenient value. For one fdiv optimization maybe it is convenient to

choose it to be the bit pattern of an snan, for some other fdiv optimization the

compiler can choose it to be 1.0 if that is convenient.

> The question is whether we can make the same assumption for other floating point

> operations, or "fdiv" needs a correction to prevent folding since signalling of

> SNaNs might be disabled.

operation. But what does that assumption buy you? If you are willing to assume

that the processor will trap on snans then it buys you a lot for any floating

point operation, but if you are not willing to assume that then you need to find

some other trick or give up on folding the operation to undef.

>

> >> /InstructionSimplify folds "mul %X, undef" to 0 always/

> Sorry, I malformed this line and forgot to highlight that by "%X" I meant a

> constant here. So, constant folding comes into play. The result depends on the

> constant parity. E.g.:

> mul i64 5, undef --> undef

> mul i64 4, undef --> 0

> I still have difficulty understanding why the first one is folded to "undef".

> "Undef" could be zero, so the result is also zero. Besides, if "undef" isn't

> zero, it's impossible to get an arbitrary bit pattern anyway. The result will

> always be divisible by 5.

No, that's not true, because we are dealing with arithmetic modulo 2^64 here.
> constant here. So, constant folding comes into play. The result depends on the

> constant parity. E.g.:

> mul i64 5, undef --> undef

> mul i64 4, undef --> 0

> I still have difficulty understanding why the first one is folded to "undef".

> "Undef" could be zero, so the result is also zero. Besides, if "undef" isn't

> zero, it's impossible to get an arbitrary bit pattern anyway. The result will

> always be divisible by 5.

Since 5 and 2^64 are relatively prime, you can prove that given any i64 value R,

there exists an i64 value X such that mul i64 5, X is equal to R. Here's an

example using i3 arithmetic (possible values: 0, ..., 7).

mul i3 5, 0 -> 0

mul i3 5, 5 -> 1 (because 5 * 5 is 25; reduced modulo 8 that gives 1)

mul i3 5, 2 -> 2

mul i3 5, 7 -> 3

mul i3 5, 4 -> 4

mul i3 5, 1 -> 5

mul i3 5, 6 -> 6

mul i3 5, 3 -> 7

So you see that every possible i3 output is achievable. Thus it is correct to

fold "mul i3 5, undef" to undef. The same goes for any other number that is

relatively prime to 2, i.e. is an odd number.

Ciao, Duncan.

Aug 27, 2014, 8:21:40 AM8/27/14

to Duncan Sands, llv...@cs.uiuc.edu

Duncan,

On the one hand, we can assume undef to be an SNaN for the sake of

folding. InstructionSimplify makes such an assumption and folds "fdiv

undef, %X" and "fdiv %X, undef" to undef.

On the other hand, there are the requests to fold "fmul undef, undef" /

"fadd undef, undef" to undef as well. However, it was stated that such

folding is questionable as signalling of SNaNs can potentially be disabled.

Shouldn't fdiv / fmul / fadd be consistent as regards the assumptions

they make?

Could you please advise how we should proceed with the bugs 16257 and

16258 then?

Yes, this explanation is absolutely reasonable. Thanks!

> Hi Oleg,

>

>> >> /This is either a mistake, or a decision that in LLVM IR snans

>> are always

>> considered to be signalling. /

>> Yes, this seems to be an agreement to treat "undef" as a SNaN for

>> "fdiv".

>

> "undef" is whatever bit pattern you want it to be, i.e. the compiler

> can assume it is any convenient value. For one fdiv optimization

> maybe it is convenient to choose it to be the bit pattern of an snan,

> for some other fdiv optimization the compiler can choose it to be 1.0

> if that is convenient.

>

>> The question is whether we can make the same assumption for other

>> floating point

>> operations, or "fdiv" needs a correction to prevent folding since

>> signalling of

>> SNaNs might be disabled.

>

> You can assume that undef is an snan if you want in any floating point

> operation. But what does that assumption buy you? If you are willing

> to assume that the processor will trap on snans then it buys you a lot

> for any floating point operation, but if you are not willing to assume

> that then you need to find some other trick or give up on folding the

> operation to undef.

This is confusing me a bit.
>

>> >> /This is either a mistake, or a decision that in LLVM IR snans

>> are always

>> considered to be signalling. /

>> Yes, this seems to be an agreement to treat "undef" as a SNaN for

>> "fdiv".

>

> "undef" is whatever bit pattern you want it to be, i.e. the compiler

> can assume it is any convenient value. For one fdiv optimization

> maybe it is convenient to choose it to be the bit pattern of an snan,

> for some other fdiv optimization the compiler can choose it to be 1.0

> if that is convenient.

>

>> The question is whether we can make the same assumption for other

>> floating point

>> operations, or "fdiv" needs a correction to prevent folding since

>> signalling of

>> SNaNs might be disabled.

>

> You can assume that undef is an snan if you want in any floating point

> operation. But what does that assumption buy you? If you are willing

> to assume that the processor will trap on snans then it buys you a lot

> for any floating point operation, but if you are not willing to assume

> that then you need to find some other trick or give up on folding the

> operation to undef.

On the one hand, we can assume undef to be an SNaN for the sake of

folding. InstructionSimplify makes such an assumption and folds "fdiv

undef, %X" and "fdiv %X, undef" to undef.

On the other hand, there are the requests to fold "fmul undef, undef" /

"fadd undef, undef" to undef as well. However, it was stated that such

folding is questionable as signalling of SNaNs can potentially be disabled.

Shouldn't fdiv / fmul / fadd be consistent as regards the assumptions

they make?

Could you please advise how we should proceed with the bugs 16257 and

16258 then?

Aug 27, 2014, 9:36:50 AM8/27/14

to Oleg Ranevskyy, llv...@cs.uiuc.edu

Hi Oleg,

use, they should be consistent. Either snans can be assumed to be signalling,

in which case this information should be used everywhere, or they can't be

assumed to be signalling in which case the signalling argument can't be used

anywhere.

> Could you please advise how we should proceed with the bugs 16257 and 16258 then?

I think you should try to get LLVM floating point experts involved, to find out

their opinion about whether LLVM should really assume that snans always trap.

If they think it is fine to assume trapping, then you can fold any floating

point operation with an "undef" operand to "undef".

If they think it is no good, then the existing folds that use this need to be

removed or weakened, though maybe another argument can be found to justify them.

Ciao, Duncan.

in which case this information should be used everywhere, or they can't be

assumed to be signalling in which case the signalling argument can't be used

anywhere.

> Could you please advise how we should proceed with the bugs 16257 and 16258 then?

their opinion about whether LLVM should really assume that snans always trap.

If they think it is fine to assume trapping, then you can fold any floating

point operation with an "undef" operand to "undef".

If they think it is no good, then the existing folds that use this need to be

removed or weakened, though maybe another argument can be found to justify them.

Ciao, Duncan.

Aug 27, 2014, 1:09:38 PM8/27/14

to Duncan Sands, llv...@cs.uiuc.edu

On Aug 27, 2014, at 6:34 AM, Duncan Sands <duncan...@deepbluecap.com> wrote:I think you should try to get LLVM floating point experts involved, to find out their opinion about whether LLVM should really assume that snans always trap.

If they think it is fine to assume trapping, then you can fold any floating point operation with an "undef" operand to "undef".

If they think it is no good, then the existing folds that use this need to be removed or weakened, though maybe another argument can be found to justify them.

LLVM is used to target many platforms for which sNaNs do not trap, and indeed many platforms that do not have floating point exceptions at all.

—Owen

Aug 28, 2014, 3:08:14 AM8/28/14

to Owen Anderson, llv...@cs.uiuc.edu

Hi Owen,

On 27/08/14 19:06, Owen Anderson wrote:

>

>> On Aug 27, 2014, at 6:34 AM, Duncan Sands <duncan...@deepbluecap.com

thanks for the info. All of the floating point folds that rely on snans

trapping should be corrected then.

In the case of fadd, given that "fadd x, -0.0" is always equal to x (same bit

pattern), then "fadd x, undef" can be folded to "x" (currently it is folded to

undef, which is wrong). This implies that it is correct to fold "fadd undef,

undef" to undef. Actually is it true that "fadd x, -0.0" always has exactly the

same bits as x, or does it just compare equal to x when doing floating point

comparisons?

It would be better to fold "fadd x, undef" to a constant rather than to x, but

is this possible? For example, undef could be chosen to be a NaN, in which case

it is tempting to say that "fadd x, NaN" can be folded to NaN. The problem is

that there are lots of NaNs. Suppose we choose a particular one, wonder-NaN.

Is it then true that for any x, there exists some y (presumably a NaN) such that

"fadd x, y" has the same bit pattern as wonder-NaN?

Ciao, Duncan.

On 27/08/14 19:06, Owen Anderson wrote:

>

>> On Aug 27, 2014, at 6:34 AM, Duncan Sands <duncan...@deepbluecap.com

trapping should be corrected then.

In the case of fadd, given that "fadd x, -0.0" is always equal to x (same bit

pattern), then "fadd x, undef" can be folded to "x" (currently it is folded to

undef, which is wrong). This implies that it is correct to fold "fadd undef,

undef" to undef. Actually is it true that "fadd x, -0.0" always has exactly the

same bits as x, or does it just compare equal to x when doing floating point

comparisons?

It would be better to fold "fadd x, undef" to a constant rather than to x, but

is this possible? For example, undef could be chosen to be a NaN, in which case

it is tempting to say that "fadd x, NaN" can be folded to NaN. The problem is

that there are lots of NaNs. Suppose we choose a particular one, wonder-NaN.

Is it then true that for any x, there exists some y (presumably a NaN) such that

"fadd x, y" has the same bit pattern as wonder-NaN?

Ciao, Duncan.

Aug 28, 2014, 8:38:18 AM8/28/14

to Duncan Sands, llv...@cs.uiuc.edu

On Aug 28, 2014, at 3:03 AM, Duncan Sands <duncan...@deepbluecap.com> wrote:

> Hi Owen,

>

> On 27/08/14 19:06, Owen Anderson wrote:

>>

>>> On Aug 27, 2014, at 6:34 AM, Duncan Sands <duncan...@deepbluecap.com

>>> <mailto:duncan...@deepbluecap.com>> wrote:

>>>

>>> I think you should try to get LLVM floating point experts involved, to find

>>> out their opinion about whether LLVM should really assume that snans always trap.

>>>

>>> If they think it is fine to assume trapping, then you can fold any floating

>>> point operation with an "undef" operand to "undef".

>>>

>>> If they think it is no good, then the existing folds that use this need to be

>>> removed or weakened, though maybe another argument can be found to justify them.

>>

>> LLVM is used to target many platforms for which sNaNs do not trap, and indeed

>> many platforms that do not have floating point exceptions at all.

>

> thanks for the info. All of the floating point folds that rely on snans trapping should be corrected then.

>

> In the case of fadd, given that "fadd x, -0.0" is always equal to x (same bit pattern), then "fadd x, undef" can be folded to "x" (currently it is folded to undef, which is wrong). This implies that it is correct to fold "fadd undef, undef" to undef. Actually is it true that "fadd x, -0.0" always has exactly the same bits as x, or does it just compare equal to x when doing floating point comparisons?

(a) x is a signaling NaN on a platform that supports them.

(b) x is a quiet NaN on a platform that does not propagate NaN payloads (e.g. ARM with "default nan" bit set in fpscr).

(c) x is +0.0 and the rounding mode is round down.

– Steve

Aug 28, 2014, 8:42:11 AM8/28/14

to Duncan Sands, llv...@cs.uiuc.edu

> I think you should try to get LLVM floating point experts involved, to find out their opinion about whether LLVM should really assume that snans always trap.

– Steve

Aug 28, 2014, 11:00:40 AM8/28/14

to Stephen Canon, llv...@cs.uiuc.edu

Hi Stephen,

>> In the case of fadd, given that "fadd x, -0.0" is always equal to x (same bit pattern), then "fadd x, undef" can be folded to "x" (currently it is folded to undef, which is wrong). This implies that it is correct to fold "fadd undef, undef" to undef. Actually is it true that "fadd x, -0.0" always has exactly the same bits as x, or does it just compare equal to x when doing floating point comparisons?

>

> fadd x, –0.0 always has the same bit pattern as x, unless:

>

> (a) x is a signaling NaN on a platform that supports them.

because you get a trap?

> (b) x is a quiet NaN on a platform that does not propagate NaN payloads (e.g. ARM with "default nan" bit set in fpscr).

What do you get in this case?

> (c) x is +0.0 and the rounding mode is round down.

So far rounding modes were always ignored in LLVM AFAIK.

Note that LLVM folds "fadd x, -0.0" to x (in InstructionSimplify), which made me

think of exploiting this for the undef case.

Ciao, Duncan.

>> In the case of fadd, given that "fadd x, -0.0" is always equal to x (same bit pattern), then "fadd x, undef" can be folded to "x" (currently it is folded to undef, which is wrong). This implies that it is correct to fold "fadd undef, undef" to undef. Actually is it true that "fadd x, -0.0" always has exactly the same bits as x, or does it just compare equal to x when doing floating point comparisons?

>

> fadd x, –0.0 always has the same bit pattern as x, unless:

>

> (a) x is a signaling NaN on a platform that supports them.

> (b) x is a quiet NaN on a platform that does not propagate NaN payloads (e.g. ARM with "default nan" bit set in fpscr).

> (c) x is +0.0 and the rounding mode is round down.

Note that LLVM folds "fadd x, -0.0" to x (in InstructionSimplify), which made me

think of exploiting this for the undef case.

Ciao, Duncan.

Aug 28, 2014, 11:04:54 AM8/28/14

to Duncan Sands, llv...@cs.uiuc.edu

> On Aug 28, 2014, at 10:58 AM, Duncan Sands <duncan...@deepbluecap.com> wrote:

>

> Hi Stephen,

>

>>> In the case of fadd, given that "fadd x, -0.0" is always equal to x (same bit pattern), then "fadd x, undef" can be folded to "x" (currently it is folded to undef, which is wrong). This implies that it is correct to fold "fadd undef, undef" to undef. Actually is it true that "fadd x, -0.0" always has exactly the same bits as x, or does it just compare equal to x when doing floating point comparisons?

>>

>> fadd x, –0.0 always has the same bit pattern as x, unless:

>>

>> (a) x is a signaling NaN on a platform that supports them.

>

> because you get a trap?

Because you either get a trap (if the invalid exception is unmasked), or the invalid flag is set and the result is a quiet NaN (much more common).
>

> Hi Stephen,

>

>>> In the case of fadd, given that "fadd x, -0.0" is always equal to x (same bit pattern), then "fadd x, undef" can be folded to "x" (currently it is folded to undef, which is wrong). This implies that it is correct to fold "fadd undef, undef" to undef. Actually is it true that "fadd x, -0.0" always has exactly the same bits as x, or does it just compare equal to x when doing floating point comparisons?

>>

>> fadd x, –0.0 always has the same bit pattern as x, unless:

>>

>> (a) x is a signaling NaN on a platform that supports them.

>

> because you get a trap?

>> (b) x is a quiet NaN on a platform that does not propagate NaN payloads (e.g. ARM with "default nan" bit set in fpscr).

>

> What do you get in this case?

>> (c) x is +0.0 and the rounding mode is round down.

>

> So far rounding modes were always ignored in LLVM AFAIK.

– Steve

Aug 29, 2014, 4:44:53 AM8/29/14

to Stephen Canon, Duncan Sands, llv...@cs.uiuc.edu

Hi,

So, the result of "fadd x, -0.0" might have a bit pattern different from

the one of "x" depending on the value of "x" and the target.

If I get it right, the result does not necessarily compare equal to "x"

in floating point comparisons.

Does this mean that folding of the above fadd to "x" in

InstructionSimplify is incorrect?

Oleg

So, the result of "fadd x, -0.0" might have a bit pattern different from

the one of "x" depending on the value of "x" and the target.

If I get it right, the result does not necessarily compare equal to "x"

in floating point comparisons.

Does this mean that folding of the above fadd to "x" in

InstructionSimplify is incorrect?

Oleg

Aug 29, 2014, 1:57:08 PM8/29/14

to Oleg Ranevskyy, llv...@cs.uiuc.edu, Duncan Sands

LLVM does not (today) try to preserve rounding mode or sNaNs. The only remaining question is whether we should be trying to preserve NaN payloads.

—Owen

—Owen

Sep 1, 2014, 5:46:48 AM9/1/14

to Owen Anderson, Duncan Sands, llv...@cs.uiuc.edu

Hi,

Thank you for your comment, Owen.

My LLVM expertise is certainly not enough to make such decisions yet.

Duncan, do you have any comments on this or do you know anyone else who

can decide about preserving NaN payloads?

Thank you.

Oleg

Thank you for your comment, Owen.

My LLVM expertise is certainly not enough to make such decisions yet.

Duncan, do you have any comments on this or do you know anyone else who

can decide about preserving NaN payloads?

Thank you.

Oleg

Sep 1, 2014, 6:07:07 AM9/1/14

to Oleg Ranevskyy, Owen Anderson, llv...@cs.uiuc.edu

Hi Oleg,

On 01/09/14 15:42, Oleg Ranevskyy wrote:

> Hi,

>

> Thank you for your comment, Owen.

> My LLVM expertise is certainly not enough to make such decisions yet.

> Duncan, do you have any comments on this or do you know anyone else who can

> decide about preserving NaN payloads?

my take is that the first thing to do is to see what the IEEE standard says

about NaNs. Consider for example "fadd x, -0.0". Does the standard specify the

exact NaN bit pattern produced as output when a particular NaN x is input? Or

does it just say that the output is a NaN? If the standard doesn't care exactly

which NaN is output, I think it is reasonable for LLVM to assume it is whatever

NaN is most convenient for LLVM; in this case that means using x itself as the

output.

However this approach does implicitly mean that we may end up not folding

floating point operations completely deterministically: depending on the

optimization that kicks in, in one case we might fold to NaN A, and in some

different optimization we might fold the same expression to NaN B. I think this

is pretty reasonable, but it is something to be aware of.

Ciao, Duncan.

On 01/09/14 15:42, Oleg Ranevskyy wrote:

> Hi,

>

> Thank you for your comment, Owen.

> My LLVM expertise is certainly not enough to make such decisions yet.

> Duncan, do you have any comments on this or do you know anyone else who can

> decide about preserving NaN payloads?

about NaNs. Consider for example "fadd x, -0.0". Does the standard specify the

exact NaN bit pattern produced as output when a particular NaN x is input? Or

does it just say that the output is a NaN? If the standard doesn't care exactly

which NaN is output, I think it is reasonable for LLVM to assume it is whatever

NaN is most convenient for LLVM; in this case that means using x itself as the

output.

However this approach does implicitly mean that we may end up not folding

floating point operations completely deterministically: depending on the

optimization that kicks in, in one case we might fold to NaN A, and in some

different optimization we might fold the same expression to NaN B. I think this

is pretty reasonable, but it is something to be aware of.

Ciao, Duncan.

Sep 1, 2014, 8:50:15 AM9/1/14

to Duncan Sands, llv...@cs.uiuc.edu

Hi Duncan,

I looked through the IEEE standard and here is what I found:

**6.2 Operations with NaNs**

*"For an operation with quiet NaN inputs, other than maximum and
minimum operations, if a floating-point result is to be delivered
the result shall be a quiet NaN which should be one of the input
NaNs"*.

**6.2.3 NaN propagation**

*"An operation that propagates a NaN operand to its result and has
a single NaN as an input should produce a NaN with the payload of
the input NaN if representable in the destination format".*

Floating point add propagates a NaN. There is no conversion in the context of LLVM's fadd. So, if %x in "fadd %x, -0.0" is a NaN, the result is also a NaN with the same payload.

As regards "fadd %x, undef", where %x might be a NaN and undef might be chosen to be (probably some different) NaN, and a possibility to fold this to a constant (NaN), the standard says:

*"If two or more inputs are NaN, then the payload of the resulting
NaN should be identical to the payload of one of the input NaNs if
representable in the destination format. ***This standard does
not specify which of the input NaNs will provide the payload**".

Thus, this makes it possible to fold "fadd %x, undef" to a NaN. Is this right?

Oleg

I looked through the IEEE standard and here is what I found:

Floating point add propagates a NaN. There is no conversion in the context of LLVM's fadd. So, if %x in "fadd %x, -0.0" is a NaN, the result is also a NaN with the same payload.

As regards "fadd %x, undef", where %x might be a NaN and undef might be chosen to be (probably some different) NaN, and a possibility to fold this to a constant (NaN), the standard says:

Thus, this makes it possible to fold "fadd %x, undef" to a NaN. Is this right?

Oleg

Sep 10, 2014, 2:56:49 PM9/10/14

to Oleg Ranevskyy, llv...@cs.uiuc.edu

Hi Oleg,

On 01/09/14 18:46, Oleg Ranevskyy wrote:

> Hi Duncan,

>

> I looked through the IEEE standard and here is what I found:

>

> *6.2 Operations with NaNs*

> /"For an operation with quiet NaN inputs, other than maximum and minimum

>

> *6.2.3 NaN propagation*

> /"An operation that propagates a NaN operand to its result and has a single NaN

thanks for finding this out.

>

> Floating point add propagates a NaN. There is no conversion in the context of

> LLVM's fadd. So, if %x in "fadd %x, -0.0" is a NaN, the result is also a NaN

> with the same payload.

Yes, folding "fadd %x, -0.0" to "%x" is correct. This implies that "fadd undef,

undef" can be folded to "undef".

>

> As regards "fadd %x, undef", where %x might be a NaN and undef might be chosen

> to be (probably some different) NaN, and a possibility to fold this to a

> constant (NaN), the standard says:

> /"If two or more inputs are NaN, then the payload of the resulting NaN should be

> provide the payload*"/.

Ciao, Duncan.

>

> Oleg

>

> On 01.09.2014 10:04, Duncan Sands wrote:

>> Hi Oleg,

>>

>> On 01/09/14 15:42, Oleg Ranevskyy wrote:

>>> Hi,

>>>

>>> Thank you for your comment, Owen.

>>> My LLVM expertise is certainly not enough to make such decisions yet.

>>> Duncan, do you have any comments on this or do you know anyone else who can

>>> decide about preserving NaN payloads?

>>

>> my take is that the first thing to do is to see what the IEEE standard says

>> about NaNs. Consider for example "fadd x, -0.0". Does the standard specify

>> the exact NaN bit pattern produced as output when a particular NaN x is

>> input? Or does it just say that the output is a NaN? If the standard doesn't

>> care exactly which NaN is output, I think it is reasonable for LLVM to assume

>> it is whatever NaN is most convenient for LLVM; in this case that means using

>> x itself as the output.

>>

>> However this approach does implicitly mean that we may end up not folding

>> floating point operations completely deterministically: depending on the

>> optimization that kicks in, in one case we might fold to NaN A, and in some

>> different optimization we might fold the same expression to NaN B. I think

>> this is pretty reasonable, but it is something to be aware of.

>>

>> Ciao, Duncan.

>

On 01/09/14 18:46, Oleg Ranevskyy wrote:

> Hi Duncan,

>

> I looked through the IEEE standard and here is what I found:

>

> /"For an operation with quiet NaN inputs, other than maximum and minimum

> operations, if a floating-point result is to be delivered the result shall be a

> quiet NaN which should be one of the input NaNs"/.
>

> *6.2.3 NaN propagation*

> /"An operation that propagates a NaN operand to its result and has a single NaN

> as an input should produce a NaN with the payload of the input NaN if

> representable in the destination format"./
thanks for finding this out.

>

> Floating point add propagates a NaN. There is no conversion in the context of

> LLVM's fadd. So, if %x in "fadd %x, -0.0" is a NaN, the result is also a NaN

> with the same payload.

undef" can be folded to "undef".

>

> As regards "fadd %x, undef", where %x might be a NaN and undef might be chosen

> to be (probably some different) NaN, and a possibility to fold this to a

> constant (NaN), the standard says:

> identical to the payload of one of the input NaNs if representable in the

> destination format. *This standard does not specify which of the input NaNs will
> provide the payload*"/.

>

> Thus, this makes it possible to fold "fadd %x, undef" to a NaN. Is this right?

Yes, I agree.
> Thus, this makes it possible to fold "fadd %x, undef" to a NaN. Is this right?

Ciao, Duncan.

>

> Oleg

>

> On 01.09.2014 10:04, Duncan Sands wrote:

>> Hi Oleg,

>>

>> On 01/09/14 15:42, Oleg Ranevskyy wrote:

>>> Hi,

>>>

>>> Thank you for your comment, Owen.

>>> My LLVM expertise is certainly not enough to make such decisions yet.

>>> Duncan, do you have any comments on this or do you know anyone else who can

>>> decide about preserving NaN payloads?

>>

>> my take is that the first thing to do is to see what the IEEE standard says

>> about NaNs. Consider for example "fadd x, -0.0". Does the standard specify

>> the exact NaN bit pattern produced as output when a particular NaN x is

>> input? Or does it just say that the output is a NaN? If the standard doesn't

>> care exactly which NaN is output, I think it is reasonable for LLVM to assume

>> it is whatever NaN is most convenient for LLVM; in this case that means using

>> x itself as the output.

>>

>> However this approach does implicitly mean that we may end up not folding

>> floating point operations completely deterministically: depending on the

>> optimization that kicks in, in one case we might fold to NaN A, and in some

>> different optimization we might fold the same expression to NaN B. I think

>> this is pretty reasonable, but it is something to be aware of.

>>

>> Ciao, Duncan.

>

Sep 16, 2014, 1:36:25 PM9/16/14

to Duncan Sands, llv...@cs.uiuc.edu

Hi Duncan,

I reread everything we've discussed so far and would like to pay closer

attention to the the ARM's FPSCR register mentioned by Stephen.

It's really possible on ARM systems that floating point operations on

one or more qNaN operands return a NaN different from the operands. I.e.

operand NaN is not propagated. This happens when the "default NaN" flag

is set in the FPSCR (floating point status and control register). The

result in this case is some default NaN value.

This means "fadd %x, -0.0", which is currently folded to %x by

InstructionSimplify, might produce a different result if %x is a NaN.

This breaks the NaN propagation rules the IEEE standard establishes and

significantly reduces folding capabilities for the FP operations.

This also applies to "fadd undef, undef" and "fadd %x, undef". We can't

rely on getting an arbitrary NaN here on ARMs.

Would you be able to confirm this please?

Thank you in advance for your time!

Kind regards,

Oleg

I reread everything we've discussed so far and would like to pay closer

attention to the the ARM's FPSCR register mentioned by Stephen.

It's really possible on ARM systems that floating point operations on

one or more qNaN operands return a NaN different from the operands. I.e.

operand NaN is not propagated. This happens when the "default NaN" flag

is set in the FPSCR (floating point status and control register). The

result in this case is some default NaN value.

This means "fadd %x, -0.0", which is currently folded to %x by

InstructionSimplify, might produce a different result if %x is a NaN.

This breaks the NaN propagation rules the IEEE standard establishes and

significantly reduces folding capabilities for the FP operations.

This also applies to "fadd undef, undef" and "fadd %x, undef". We can't

rely on getting an arbitrary NaN here on ARMs.

Would you be able to confirm this please?

Thank you in advance for your time!

Kind regards,

Oleg

Sep 16, 2014, 1:44:42 PM9/16/14

to Oleg Ranevskyy, llv...@cs.uiuc.edu, Duncan Sands

As far as I know, LLVM does not try very hard to guarantee constant folded NaN payloads that match exactly what the target would generate.

—Owen

—Owen

Sep 16, 2014, 1:47:59 PM9/16/14

to Owen Anderson, Oleg Ranevskyy, llv...@cs.uiuc.edu

On 16/09/14 19:37, Owen Anderson wrote:

> As far as I know, LLVM does not try very hard to guarantee constant folded NaN payloads that match exactly what the target would generate.

I'm with Owen here. Unless ARM people object, I think it is reasonable to say
> As far as I know, LLVM does not try very hard to guarantee constant folded NaN payloads that match exactly what the target would generate.

that at the LLVM IR level we may assume that the IEEE rules are followed.

Ciao, Duncan.

Sep 16, 2014, 2:01:23 PM9/16/14

to Oleg Ranevskyy, Duncan Sands, llv...@cs.uiuc.edu

It’s important to keep in mind the following:

• The IEEE-754 nan propagation rules are “shoulds”, not “shalls”.

• If the default NaN bit is set in FPSCR, the arithmetic is already not supporting strict NaN propagation, regardless of what the compiler does.

– Steve

• The IEEE-754 nan propagation rules are “shoulds”, not “shalls”.

• If the default NaN bit is set in FPSCR, the arithmetic is already not supporting strict NaN propagation, regardless of what the compiler does.

– Steve

Sep 17, 2014, 12:52:49 PM9/17/14

to Duncan Sands, Owen Anderson, Stephen Canon, llv...@cs.uiuc.edu

Hi,

Thank you for all your helpful comments.

To sum up, below is the list of correct folding examples for fadd:

(1) fadd %x, -0.0 -> %x

(2) fadd undef, undef -> undef

(3) fadd %x, undef -> NaN (undef is a NaN which is propagated)

Looking through the code I found the "NoNaNs" flag accessed through an instance of the FastMathFlags class.

(2) and (3) should probably depend on it.

If the flag is set, (2) and (3) cannot be folded as there are no NaNs and we are not guaranteed to get an arbitrary bit pattern from fadd, right?

Other arithmetic FP operations (fsub, fmul, fdiv) also propagate NaNs. Thus, the same rules seem applicable to them as well:

---------------------------------------------------------------------

- fdiv:

(4) "fdiv %x, undef" is now folded to undef.

The code comment states this is done because undef might be a sNaN. We can't rely on sNaNs as they can either be masked or the platform might not have FP exceptions at all. Nevertheless, such folding is still correct due to the NaN propagation rules we found in the Standard - undef might be chosen to be a NaN and its payload will be propagated.

Moreover, this looks similar to (3) and can be folded to a NaN.*Is it worth doing?*

(5) fdiv undef, undef -> undef

---------------------------------------------------------------------

- fmul:

(6) fmul undef, undef -> undef

(7) fmul %x, undef -> NaN or undef (undef is a NaN, which is propagated)

---------------------------------------------------------------------

- fsub:

(8) fsub %x, -0.0 -> %x (if %x is not -0.0; works this way now)

(9) fsub %x, undef -> NaN or undef (undef is a NaN, which is propagated)

(10) fsub undef, undef -> undef

---------------------------------------------------------------------

I will be very thankful if you could review this final summary and share your thoughts.

Thank you.

P.S. Sorry for bothering you again and again.

Just want to make sure I clearly understand the subject in order to make correct code changes and to be able to help others with this in the future.

Kind regards,

Oleg

Thank you for all your helpful comments.

To sum up, below is the list of correct folding examples for fadd:

(1) fadd %x, -0.0 -> %x

(2) fadd undef, undef -> undef

(3) fadd %x, undef -> NaN (undef is a NaN which is propagated)

Looking through the code I found the "NoNaNs" flag accessed through an instance of the FastMathFlags class.

(2) and (3) should probably depend on it.

If the flag is set, (2) and (3) cannot be folded as there are no NaNs and we are not guaranteed to get an arbitrary bit pattern from fadd, right?

Other arithmetic FP operations (fsub, fmul, fdiv) also propagate NaNs. Thus, the same rules seem applicable to them as well:

---------------------------------------------------------------------

- fdiv:

(4) "fdiv %x, undef" is now folded to undef.

The code comment states this is done because undef might be a sNaN. We can't rely on sNaNs as they can either be masked or the platform might not have FP exceptions at all. Nevertheless, such folding is still correct due to the NaN propagation rules we found in the Standard - undef might be chosen to be a NaN and its payload will be propagated.

Moreover, this looks similar to (3) and can be folded to a NaN.

(5) fdiv undef, undef -> undef

---------------------------------------------------------------------

- fmul:

(6) fmul undef, undef -> undef

(7) fmul %x, undef -> NaN or undef (undef is a NaN, which is propagated)

---------------------------------------------------------------------

- fsub:

(8) fsub %x, -0.0 -> %x (if %x is not -0.0; works this way now)

(9) fsub %x, undef -> NaN or undef (undef is a NaN, which is propagated)

(10) fsub undef, undef -> undef

---------------------------------------------------------------------

I will be very thankful if you could review this final summary and share your thoughts.

Thank you.

P.S. Sorry for bothering you again and again.

Just want to make sure I clearly understand the subject in order to make correct code changes and to be able to help others with this in the future.

Kind regards,

Oleg

Sep 17, 2014, 1:00:17 PM9/17/14

to Oleg Ranevskyy, llv...@cs.uiuc.edu, Duncan Sands

On Sep 17, 2014, at 9:45 AM, Oleg Ranevskyy <llvm.ma...@gmail.com> wrote:Looking through the code I found the "NoNaNs" flag accessed through an instance of the FastMathFlags class.

(2) and (3) should probably depend on it.

If the flag is set, (2) and (3) cannot be folded as there are no NaNs and we are not guaranteed to get an arbitrary bit pattern from fadd, right?

The fast math flags are strictly relaxations of the normal semantics. It should always be safe to strip them, so the set of transforms possible with fast math flags should be a superset of those available without them.

—Owen

Sep 17, 2014, 1:13:40 PM9/17/14

to Oleg Ranevskyy, Owen Anderson, Stephen Canon, llv...@cs.uiuc.edu

Hi Oleg,

On 17/09/14 18:45, Oleg Ranevskyy wrote:

> Hi,

>

and (3) to undef. That's because (IIRC) the NoNans flag promises that no NaNs

will be used by the program. However "undef" could be a NaN, thus the promise

is broken, meaning the program is performing undefined behaviour, and you can do

whatever you want.

>

> Other arithmetic FP operations (fsub, fmul, fdiv) also propagate NaNs. Thus, the

> same rules seem applicable to them as well:

> ---------------------------------------------------------------------

> - fdiv:

> (4) "fdiv %x, undef" is now folded to undef.

But should be folded to NaN, not undef.

> The code comment states this is done because undef might be a sNaN. We

> can't rely on sNaNs as they can either be masked or the platform might not have

> FP exceptions at all. Nevertheless, such folding is still correct due to the NaN

> propagation rules we found in the Standard - undef might be chosen to be a NaN

> and its payload will be propagated.

> Moreover, this looks similar to (3) and can be folded to a NaN. /Is it

> worth doing?/

As the current folding to undef is wrong, it has to be fixed.

>

> (5) fdiv undef, undef -> undef

Yup.

> ---------------------------------------------------------------------

> - fmul:

> (6) fmul undef, undef -> undef

Yup.

> (7) fmul %x, undef -> NaN or undef (undef is a NaN, which is

> propagated)

Should be folded to NaN, not undef.

> ---------------------------------------------------------------------

> - fsub:

> (8) fsub %x, -0.0 -> %x (if %x is not -0.0; works this way

> now)

Should this be: fsub %x, +0.0 ?

> (9) fsub %x, undef -> NaN or undef (undef is a NaN, which is

> propagated)

Should fold to NaN not undef.

> (10) fsub undef, undef -> undef

Yup.

Ciao, Duncan.

On 17/09/14 18:45, Oleg Ranevskyy wrote:

> Hi,

>

> Thank you for all your helpful comments.

>

> To sum up, below is the list of correct folding examples for fadd:

> (1) fadd %x, -0.0 -> %x

> (2) fadd undef, undef -> undef

> (3) fadd %x, undef -> NaN (undef is a NaN which is propagated)

>

> Looking through the code I found the "NoNaNs" flag accessed through an instance

> of the FastMathFlags class.

> (2) and (3) should probably depend on it.

> If the flag is set, (2) and (3) cannot be folded as there are no NaNs and we are

> not guaranteed to get an arbitrary bit pattern from fadd, right?

I think it's exactly the other way round: if NoNans is set then you can fold (2)
>

> To sum up, below is the list of correct folding examples for fadd:

> (1) fadd %x, -0.0 -> %x

> (2) fadd undef, undef -> undef

> (3) fadd %x, undef -> NaN (undef is a NaN which is propagated)

>

> Looking through the code I found the "NoNaNs" flag accessed through an instance

> of the FastMathFlags class.

> (2) and (3) should probably depend on it.

> If the flag is set, (2) and (3) cannot be folded as there are no NaNs and we are

> not guaranteed to get an arbitrary bit pattern from fadd, right?

and (3) to undef. That's because (IIRC) the NoNans flag promises that no NaNs

will be used by the program. However "undef" could be a NaN, thus the promise

is broken, meaning the program is performing undefined behaviour, and you can do

whatever you want.

>

> Other arithmetic FP operations (fsub, fmul, fdiv) also propagate NaNs. Thus, the

> same rules seem applicable to them as well:

> ---------------------------------------------------------------------

> - fdiv:

> (4) "fdiv %x, undef" is now folded to undef.

> The code comment states this is done because undef might be a sNaN. We

> can't rely on sNaNs as they can either be masked or the platform might not have

> FP exceptions at all. Nevertheless, such folding is still correct due to the NaN

> propagation rules we found in the Standard - undef might be chosen to be a NaN

> and its payload will be propagated.

> worth doing?/

As the current folding to undef is wrong, it has to be fixed.

>

> (5) fdiv undef, undef -> undef

> ---------------------------------------------------------------------

> - fmul:

> (6) fmul undef, undef -> undef

> (7) fmul %x, undef -> NaN or undef (undef is a NaN, which is

> propagated)

> ---------------------------------------------------------------------

> - fsub:

> (8) fsub %x, -0.0 -> %x (if %x is not -0.0; works this way

> now)

> (9) fsub %x, undef -> NaN or undef (undef is a NaN, which is

> propagated)

> (10) fsub undef, undef -> undef

Ciao, Duncan.

Sep 22, 2014, 12:00:37 PM9/22/14

to Duncan Sands, llv...@cs.uiuc.edu

Hi Duncan,

On 17.09.2014 21:10, Duncan Sands wrote:

> Hi Oleg,

>

possible at all. But undef is still an arbitrary bit pattern that might

occasionally be the same as the one of a NaN. Thank you for the explanation.

Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded to undef,

whereas "fadd/fsub/fmul/fdiv %x, undef" is folded to either undef

(NoNaNs is set) or a NaN (NoNaNs is not set).

Oleg

fsub %x, +0.0 is also covered and always folded to %x.

The version with -0.0 is similar except it additionally checks if %x is

not -0.0.

On 17.09.2014 21:10, Duncan Sands wrote:

> Hi Oleg,

>

> On 17/09/14 18:45, Oleg Ranevskyy wrote:

>> Hi,

>>

>> Thank you for all your helpful comments.

>>

>> To sum up, below is the list of correct folding examples for fadd:

>> (1) fadd %x, -0.0 -> %x

>> (2) fadd undef, undef -> undef

>> (3) fadd %x, undef -> NaN (undef is a NaN which

>> is propagated)

>>

>> Looking through the code I found the "NoNaNs" flag accessed through

>> an instance

>> of the FastMathFlags class.

>> (2) and (3) should probably depend on it.

>> If the flag is set, (2) and (3) cannot be folded as there are no NaNs

>> and we are

>> not guaranteed to get an arbitrary bit pattern from fadd, right?

>

> I think it's exactly the other way round: if NoNans is set then you

> can fold (2) and (3) to undef. That's because (IIRC) the NoNans flag

> promises that no NaNs will be used by the program. However "undef"

> could be a NaN, thus the promise is broken, meaning the program is

> performing undefined behaviour, and you can do whatever you want.

Oh, I see the point now. I thought if NoNaNs was set then no NaNs were
>> Hi,

>>

>> Thank you for all your helpful comments.

>>

>> To sum up, below is the list of correct folding examples for fadd:

>> (1) fadd %x, -0.0 -> %x

>> (2) fadd undef, undef -> undef

>> (3) fadd %x, undef -> NaN (undef is a NaN which

>> is propagated)

>>

>> Looking through the code I found the "NoNaNs" flag accessed through

>> an instance

>> of the FastMathFlags class.

>> (2) and (3) should probably depend on it.

>> If the flag is set, (2) and (3) cannot be folded as there are no NaNs

>> and we are

>> not guaranteed to get an arbitrary bit pattern from fadd, right?

>

> I think it's exactly the other way round: if NoNans is set then you

> can fold (2) and (3) to undef. That's because (IIRC) the NoNans flag

> promises that no NaNs will be used by the program. However "undef"

> could be a NaN, thus the promise is broken, meaning the program is

> performing undefined behaviour, and you can do whatever you want.

possible at all. But undef is still an arbitrary bit pattern that might

occasionally be the same as the one of a NaN. Thank you for the explanation.

Thus, "fadd/fsub/fmul/fdiv undef, undef" can always be folded to undef,

whereas "fadd/fsub/fmul/fdiv %x, undef" is folded to either undef

(NoNaNs is set) or a NaN (NoNaNs is not set).

Oleg

The version with -0.0 is similar except it additionally checks if %x is

not -0.0.

Sep 23, 2014, 10:02:44 AM9/23/14

to Oleg Ranevskyy, llv...@cs.uiuc.edu

Hi Oleg,

On 22/09/14 17:56, Oleg Ranevskyy wrote:

> Hi Duncan,

>

for fmul and fdiv, the reasoning does depend on fmul %x, 1.0 always being equal

to %x (likewise: fdiv %x, 1.0 being equal to %x). Is this true?

Ciao, Duncan.

On 22/09/14 17:56, Oleg Ranevskyy wrote:

> Hi Duncan,

>

to %x (likewise: fdiv %x, 1.0 being equal to %x). Is this true?

Ciao, Duncan.

Sep 23, 2014, 11:37:51 AM9/23/14

to Duncan Sands, llv...@cs.uiuc.edu

Hi Duncan,

Do you mean that we can't apply "fmul/fdiv undef, undef" to undef
folding if "fmul/fdiv %x, 1.0" is not guaranteed to be %x?

If we choose one undef to have an arbitrary bit pattern and another undef = 1.0, we need a guarantee to get the bit pattern of the first undef. Do I get it right?

I checked the standard regarding "x*1.0 == x" and found that only "10.4 Literal meaning and value-changing optimizations" addresses this. I don't pretend to thoroughly understand this paragraph yet, but it seems to me that language standards are required to preserve the literal meaning of the source code. Applying the identity property x*1 is a part of this. Here is a quote from IEEE-754:

*"The following value-changing transformations, among others,
preserve the literal meaning of the source*

*code:*

*― Applying the identity property 0 + x when x is not zero and
is not a signaling NaN and the result*

*has the same exponent as x.*

*― Applying the identity property 1 × x when x is not a
signaling NaN and the result has the same*

*exponent as x."*

Maybe Owen or Stephen would be able to clarify this.

Thank you.

Oleg

If we choose one undef to have an arbitrary bit pattern and another undef = 1.0, we need a guarantee to get the bit pattern of the first undef. Do I get it right?

I checked the standard regarding "x*1.0 == x" and found that only "10.4 Literal meaning and value-changing optimizations" addresses this. I don't pretend to thoroughly understand this paragraph yet, but it seems to me that language standards are required to preserve the literal meaning of the source code. Applying the identity property x*1 is a part of this. Here is a quote from IEEE-754:

Thank you.

Oleg