[LLVMdev] The nsw story

Dan Gohman

Nov 29, 2011, 6:21:58 PM
to llvmdev@cs.uiuc.edu Mailing List
The old and popular tradition of "int" being the default all-purpose type
in C hit a snag with the arrival of 64-bit general-purpose architectures,
when, for reasons of memory usage and compatibility, "int" was fixed
at 32 bits. This meant it was no longer the same size as a pointer, the
size of an array index, or the widest native integer type. (Sequence of
events simplified for brevity.)

The C standard tries to help with a bunch of typedefs: int_fast32_t,
int_least32_t, intptr_t, int32_t, size_t, ptrdiff_t, and more. With all
these great typedefs, one could imagine that there's no reason anyone
should ever use plain "int" ever again. Unfortunately, juggling all these
typedefs can be impractical, for several reasons.

C doesn't have function overloading, so whenever one talks to the standard
library (e.g. to call "abs"), one must know what the underlying types are
(e.g. which of "abs", "absl", or "absll" is needed?). printf also has
a problem, and although the C standard actually tries to help there,
"%" PRIdLEAST32 and "%" PRIdFAST32 have yet to seriously challenge the
mindshare of "%d". And there are other issues.

"int" remains widely popular, even though it isn't quite the close fit
to hardware that it used to be. Consider this simple testcase, which
is representative of a broader issue, containing an innocent use of int:

void my_copy(char *dst, const char *src, int n) {
  for (int i = 0; i != n; ++i)
    dst[i] = src[i];
}

On LP64 platforms, this code has implicit sign-extensions to convert
the 32-bit int "i" to 64 bits in the array index expressions. Sign
extensions are relatively fast on most machines, but there's also
the cost of delaying the computations of the addresses for the memory
references, which is significant. And small loops turn small problems
into big problems. The result is around a 12% slowdown on the machine
I happen to be typing this on.

(I'm ignoring loop vectorization and memcpy pattern matching here,
to keep this example simple without losing too much generality.)
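
To make the cost concrete, here is a rough sketch of what the loop body
might look like in LLVM IR on an LP64 target (details vary by frontend
and optimization level, and the n == 0 guard is omitted); the sext
feeding the address computation is the extra work in question:

loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  %idx = sext i32 %i to i64                        ; the troublesome extension
  %src.addr = getelementptr inbounds i8* %src, i64 %idx
  %dst.addr = getelementptr inbounds i8* %dst, i64 %idx
  %byte = load i8* %src.addr
  store i8 %byte, i8* %dst.addr
  %i.next = add i32 %i, 1
  %done = icmp eq i32 %i.next, %n
  br i1 %done, label %exit, label %loop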

However, C compilers have a way to fix this problem, by repurposing
an ancient hack. Long ago, the C standard was designed to accommodate
machines that didn't use two's complement representations for signed
integers. Different representations lead to different behaviors on
overflow, and the C committee decided to make signed integer arithmetic
portable by declaring that overflow in a signed integer type gets
undefined behavior. Since then, two's complement has largely taken over,
but the C standard still has that rule.

Today, compilers use this rule to promote 32-bit "int" variables to 64
bits to eliminate these troublesome sign extensions. This is considered
valid because any time this would change the behavior of the program,
it would be an overflow in the narrower type, which the rule says is
undefined behavior.
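
Jumping ahead to the IR level for a moment, the promoted form of the
my_copy loop looks roughly like this: the induction variable is carried
as an i64 and the per-iteration sign extension disappears (again a
sketch, not exact compiler output):

loop:
  %i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
  %src.addr = getelementptr inbounds i8* %src, i64 %i
  %dst.addr = getelementptr inbounds i8* %dst, i64 %i
  %byte = load i8* %src.addr
  store i8 %byte, i8* %dst.addr
  %i.next = add i64 %i, 1
  %done = icmp eq i64 %i.next, %n.wide    ; %n.wide = sext of %n, hoisted out of the loop
  br i1 %done, label %exit, label %loop

The rewrite is justified only because a 32-bit "i" that wrapped past
INT_MAX would have been undefined behavior anyway.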

Now consider LLVM. In LLVM, "int" is lowered to "i32" (on LP64 targets),
and signedness is a property of operators, rather than types. For many
years, LLVM was unable to promote "int" variables, because it lacked
the necessary information. Then, I was assigned to fix this. I added
a flag named "nsw", which stands for No Signed Wrap, to be attached to
add instructions and others. It's not a very satisfying name, but it's
sufficiently unique and easy to find in LangRef. The purpose of the nsw
flag is to indicate instructions which are known to have no overflow.
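
In the IR, the flag is just a keyword on the instruction. A minimal
example:

  %x = add nsw i32 %a, %b    ; the optimizer may assume this never overflows
  %y = add i32 %a, %b        ; a plain add, which wraps in two's complement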

The nsw flag is simple on the surface, but it has complexities. One
is that nsw add is not an associative operation (unlike mathematical
integers, two's complement wrapping signed integers, and unsigned
integers, which are all associative). With nsw, ((a + b) + c) is not
equivalent to (a + (b + c)), because (b + c) might overflow, even if
((a + b) + c) has no overflow. This is inconvenient, but it's manageable.
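
A concrete example of the non-associativity, using a = -1, b = INT_MAX,
c = 1:

  %t1 = add nsw i32 %a, %b     ; -1 + INT_MAX: no overflow
  %t2 = add nsw i32 %t1, %c    ; (INT_MAX - 1) + 1: no overflow

  ; reassociated form:
  %u1 = add nsw i32 %b, %c     ; INT_MAX + 1 overflows, so %u1 is not well defined
  %u2 = add nsw i32 %a, %u1

Reassociating the nsw adds introduced an overflow that the original
expression didn't have.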

A much bigger complication is that an add with undefined behavior on
overflow is an instruction that potentially has side effects. Side effects
mean the compiler can't freely move such instructions around. This is
jarring, because add instructions are the kinds of things that otherwise
can be easily moved around -- they are cheap and safe. An example of this
is short-circuit elimination -- turning && into &. If the right operand
of the && contains an add, it will need to be speculatively executed,
and that isn't safe if it might have side effects.

The first solution for this was to define nsw add to return undef on
overflow, instead of invoking immediate undefined behavior. This way,
overflow can happen and it doesn't cause any harm unless the value
is actually used. With the code motion problem fixed, I proceeded to
implement nsw using this approach, and a sign-extension elimination
optimization on top of it.
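
The core rewrite that sign-extension elimination relies on is roughly:

  ; before: do the add in 32 bits, then extend the result
  %s   = add nsw i32 %a, %b
  %s64 = sext i32 %s to i64

  ; after: extend the operands and do the add in 64 bits
  %a64 = sext i32 %a to i64
  %b64 = sext i32 %b to i64
  %s64 = add nsw i64 %a64, %b64

which is only valid if the 32-bit add can be assumed not to overflow.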

However, I later realized that undef isn't actually expressive enough
for this purpose. Undef can produce any value, but it will still be some
specific value of its type (with fluctuation, but that's not important
here). When an integer add is promoted to a wider type, an expression in
the narrower type which would overflow is converted to an expression in
the wider type which returns a value which can't be expressed in the
narrower type. There is no possible value in the narrower type that
undef could take on which would produce the same behavior.

When I asked Chris what to do about this problem, he didn't understand
why I was worrying about it. Admittedly, it is a fairly obscure problem.
But at the time, I was of the opinion that we shouldn't sweep known
semantic problems under the rug. We don't have formal semantics, and we
don't do correctness proofs, but it still seemed that we should respond
when we do notice inconsistencies. I wanted to get to the bottom of it.

I set out to find some way to say that an nsw add which overflows doesn't
invoke immediate undefined behavior, but invokes undefined behavior only
when its result is used. It can't be just any use though, because I didn't
want to rule out the possibility of hoisting chains of instructions,
like a+b+c+d+e. LLVM doesn't usually speculate this aggressively today,
but there are plausible scenarios where it might want to in the future.

This led me to think about making nsw overflow return some kind
of poisoned value, like undef, only more powerful. The poison would
propagate down the def-use graph until it reaches an instruction that
is capable of triggering the undefined behavior.
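
A rough sketch of the intent, where @observe stands for anything with
externally visible effects (I/O, a volatile access, and so on):

  %x = add nsw i32 %a, %b       ; if this overflows, %x becomes poison
  %y = mul i32 %x, 3            ; poison flows through ordinary arithmetic...
  %z = sext i32 %y to i64       ; ...and through extensions
  call void @observe(i64 %z)    ; only an externally visible use like this
                                ; actually invokes the undefined behavior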

I called the poison value "trap", because of some superficial
similarity to C's "trap value" concept, although I now believe this to
be mistaken. It really is quite different. Also, it's unrelated to the
llvm.trap intrinsic. It should probably be renamed.

The presence of this poison value means that undefined behavior has been
invoked on a potentially speculative code path. The consequences of the
undefined behavior are deferred until that code path is actually used
on some non-speculative path.

This concept seemed to fit all the requirements. It allows for
arbitrary speculation, it's sufficient for sign-extension elimination,
and it's useful for a handful of other neat tricks as well. I wrote up
a description of this concept, and it's been in LangRef ever since. It
sticks out though, because it got pretty big and complex, especially
in view of its relative obscurity. Realistically speaking, it's
probably not fully watertight yet. But I'm not aware of any fundamental
flaws.

Perhaps the best way to understand the complexity is to imagine writing
a tool to detect this kind of undefined behavior. Detecting signed
overflow in C source is pretty easy: just insert checking code at each
signed operation. But as we've just seen, LLVM IR has different rules
from C. The undefined behavior of nsw doesn't matter until it's
actually observed.

So initially, one might imagine detecting trouble by adding a bit
to every Value to track whether it is a poison value. That's pretty
heavy-weight, but it's doable. It gets worse though, because poison
values can also be transmitted through memory. So one would also
have to add a flag to bytes of memory which are poisoned. But it gets
even worse, because poison values can be used as branch conditions,
so one would have to do advanced CFG analysis to prove whether branch
decisions based on poisoned values actually matter. And it gets worse
yet, because the control structure of the program may be dynamic, so
what one would really need to do is something like non-deterministic
execution. This can easily take exponential amounts of time and memory.

A natural reaction to this problem is to think that LLVM IR is so nice
and pretty that naturally there must be a nice and pretty solution. Here
are some alternatives that have been considered:

- Go back to using undef for overflow. There were no known real-world
bugs with this. It's just inconsistent.

- Define add nsw as a fully side-effecting operation, and accept the
limits on code motion that this implies. However, as LLVM starts doing
profile-guided optimizations, and starts thinking about more diverse
architectures, code speculation will likely become more important.

- Define add nsw as a fully side-effecting operation, and teach
optimization passes to strip nsw when moving code past control
boundaries. This is seen as suboptimal because it prevents subsequent
passes from making use of the nsw information. And, it's extra work
for optimization passes.

- Instead of trying to define dependence in LangRef, just say that if
changing the value returned from an overflowing add nsw would
affect the observable behavior of the program, then the behavior of
the program is undefined. This would reduce the amount of text in
LangRef, but it would be a weaker definition, and it would require
sign-extension optimizers and others to do significantly more work
to establish that their transformations are safe.

- Give up on nsw and have compilers emit warnings when they are unable
to perform some optimization due to their inability to exclude the
possibility of overflow. Obviously the warnings would not be on by
default, or even -Wall or probably even -Wextra, so -Werror users need
not revolt. Also, many people are faced with code that they cannot
modify for any number of reasons, and would not be able to make use
of such warnings. It's an interesting tradeoff, but it's unpopular.

Unfortunately, none of these are completely nice and pretty. Because of
this, and because most people don't care, nsw with all its conceptual
complexity has survived.

Dan

_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Paul Robinson

Nov 30, 2011, 3:21:11 PM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List
A few thoughts from someone on the sidelines; hope they're helpful.

On Tue, Nov 29, 2011 at 3:21 PM, Dan Gohman <goh...@apple.com> wrote:
The first solution for this was to define nsw add to return undef on
overflow, instead of invoking immediate undefined behavior. This way,
overflow can happen and it doesn't cause any harm unless the value
is actually used. With the code motion problem fixed, I proceeded to
implement nsw using this approach, and a sign-extension elimination
optimization on top of it.

However, I later realized that undef isn't actually expressive enough
for this purpose. Undef can produce any value, but it will still be some
specific value of its type (with fluctuation, but that's not important
here). When an integer add is promoted to a wider type, an expression in
the narrower type which would overflow is converted to an expression in
the wider type which returns a value which can't be expressed in the
narrower type. There is no possible value in the narrower type that
undef could take on which would produce the same behavior.

Part of the confusion seems to come from overloading "undefined."
The VAX architecture spec nicely dealt with the distinction between
unspecified results and unspecified behavior by consistently using
distinct terms: _unpredictable_ results, and _undefined_ behavior.
An operation producing an unpredictable result could produce any
bit-pattern that fit in the output operand; but it would not trap
or do anything else unfriendly.
An operation causing undefined behavior means anything could happen,
from no-effect to machine meltdown.

I always thought these were usefully distinct concepts, and the
terminology convention really clarified that.

I set out to find some way to say that an nsw add which overflows doesn't
invoke immediate undefined behavior, but invokes undefined behavior only
when its result is used. It can't be just any use though, because I didn't
want to rule out the possibility of hoisting chains of instructions,
like a+b+c+d+e. LLVM doesn't usually speculate this aggressively today,
but there are plausible scenarios where it might want to in the future.

This led me to think about making nsw overflow return some kind
of poisoned value, like undef, only more powerful. The poison would
propagate down the def-use graph until it reaches an instruction that
is capable of triggering the undefined behavior.

Reminds me of how Itanium does speculation, although the details are
getting fuzzy (it has been a while). Registers have an associated 
"NaT" (Not A Thing) flag, which gets set when something bad happens.
(Usually loading from nonexistent or protected memory. Can't remember
whether integer overflows can set this.)
Some operations are NaT-tolerant; any input NaT is simply propagated
to the output. Arithmetic, sign-extension, bit fiddling, and the like
are NaT-tolerant.
Some operations are NaT-intolerant; any input NaT causes a trap.
Storing to memory, pointer derefs, and I think compares are all
NaT-intolerant.
There's also a "check" instruction that exists solely to check
for NaT flags. That might be specific to memory traps; when you trap
on a load, you want to know what address trapped, and the check
instruction has a memory-address operand to provide that info.

So initially, one might imagine detecting trouble by adding a bit
to every Value to track whether it is a poison value. That's pretty
heavy-weight, but it's doable. It gets worse though, because poison
values can also be transmitted through memory. So one would also
have to add a flag to bytes of memory which are poisoned. But it gets
even worse, because poison values can be used as branch conditions,
so one would have to do advanced CFG analysis to prove whether branch
decisions based on poisoned values actually matter. And it gets worse
yet, because the control structure of the program may be dynamic, so
what one would really need to do is something like non-deterministic
execution. This can easily take exponential amounts of time and memory.

I suggest that storing to memory, and conditionals, and anything
else that seems to lead toward the really hairy situations, should
be cases where the nsw/poison/trap is "actually observed."
- Values in memory can be observable by other parts of the program, or
by other threads/processes.
- After you've made a control-flow decision, memory references in the 
successor blocks are part of the observable behavior of the program.

This is taking a broader view of "observable" than what's defined in the
C or C++ standards, which steadfastly ignore a variety of practical realities.
But I think it's a reasonable and defensible position, that still allows you
some scope for operating on these things in a sensible way.

Pogo

Dan Gohman

Nov 30, 2011, 6:45:43 PM
to Paul Robinson, llvmdev@cs.uiuc.edu Mailing List
On Nov 30, 2011, at 12:21 PM, Paul Robinson wrote:

> Part of the confusion seems to come from overloading "undefined."
> The VAX architecture spec nicely dealt with the distinction between
> unspecified results and unspecified behavior by consistently using
> distinct terms: _unpredictable_ results, and _undefined_ behavior.
> An operation producing an unpredictable result could produce any
> bit-pattern that fit in the output operand; but it would not trap
> or do anything else unfriendly.
> An operation causing undefined behavior means anything could happen,
> from no-effect to machine meltdown.
>
> I always thought these were usefully distinct concepts, and the
> terminology convention really clarified that.

LLVM IR does make these distinctions, though the language is a bit
subtle.

"Undefined behavior" and "behavior is undefined" mean anything could
happen. Demons may fly out your nose.

"undef" is an arbitrary fluctuating bit pattern. Also it causes
operations that use it to exhibit constrained fluctuation.

"Result is undefined" means an arbitrary bit pattern may be returned.
Sometimes this means "undef" and sometimes this means a fixed bit
pattern, though this inconsistency may be unintentional.

> I suggest that storing to memory, and conditionals, and anything
> else that seems to lead toward the really hairy situations, should
> be cases where the nsw/poison/trap is "actually observed."
> - Values in memory can be observable by other parts of the program, or
> by other threads/processes.
> - After you've made a control-flow decision, memory references in the
> successor blocks are part of the observable behavior of the program.
>
> This is taking a broader view of "observable" than what's defined in the
> C or C++ standards, which steadfastly ignore a variety of practical realities.
> But I think it's a reasonable and defensible position, that still allows you
> some scope for operating on these things in a sensible way.

Prohibiting poison values from propagating through memory would mean
that the reg2mem pass would no longer be a semantics-preserving pass.
Prohibiting it from propagating through control flow would mean that a
select instruction wouldn't be equivalent to a conditional branch and
a phi. These are problems that aren't really important for hardware
ISAs, but are important for LLVM IR.
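
For reference, the equivalence in question is roughly:

  %r = select i1 %c, i32 %x, i32 %y

versus

  br i1 %c, label %then, label %else
then:
  br label %join
else:
  br label %join
join:
  %r = phi i32 [ %x, %then ], [ %y, %else ]

If branching on a poison condition becomes an observation point while a
select on the same condition merely propagates poison (or vice versa),
these two forms stop being interchangeable.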

Chris Lattner

Dec 1, 2011, 1:49:49 AM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List

On Nov 29, 2011, at 3:21 PM, Dan Gohman wrote:

>
> A natural reaction to this problem is to think that LLVM IR is so nice
> and pretty that naturally there must be a nice and pretty solution. Here
> are some alternatives that have been considered:
>
> - Go back to using undef for overflow. There were no known real-world
> bugs with this. It's just inconsistent.

+1 for the simple and obvious answer.

-Chris

Dan Gohman

Dec 1, 2011, 11:46:18 AM
to Chris Lattner, llvmdev@cs.uiuc.edu Mailing List

On Nov 30, 2011, at 10:49 PM, Chris Lattner wrote:

>
> On Nov 29, 2011, at 3:21 PM, Dan Gohman wrote:
>
>>
>> A natural reaction to this problem is to think that LLVM IR is so nice
>> and pretty that naturally there must be a nice and pretty solution. Here
>> are some alternatives that have been considered:
>>
>> - Go back to using undef for overflow. There were no known real-world
>> bugs with this. It's just inconsistent.
>
> +1 for the simple and obvious answer.

To be clear, the main sign-extension elimination optimization is not valid under
the simple and obvious answer. Are you proposing to do it anyway?

Dan

Chris Lattner

Dec 1, 2011, 12:09:22 PM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List

On Dec 1, 2011, at 8:46 AM, Dan Gohman wrote:

>
> On Nov 30, 2011, at 10:49 PM, Chris Lattner wrote:
>
>>
>> On Nov 29, 2011, at 3:21 PM, Dan Gohman wrote:
>>
>>>
>>> A natural reaction to this problem is to think that LLVM IR is so nice
>>> and pretty that naturally there must be a nice and pretty solution. Here
>>> are some alternatives that have been considered:
>>>
>>> - Go back to using undef for overflow. There were no known real-world
>>> bugs with this. It's just inconsistent.
>>
>> +1 for the simple and obvious answer.
>
> To be clear, the main sign-extension elimination optimization is not valid under
> the simple and obvious answer. Are you proposing to do it anyway?

Please define "valid". We discussed this a very very long time ago, and I don't recall the details.

-Chris

Dan Gohman

Dec 1, 2011, 12:23:58 PM
to Chris Lattner, llvmdev@cs.uiuc.edu Mailing List

On Dec 1, 2011, at 9:09 AM, Chris Lattner wrote:

>
> On Dec 1, 2011, at 8:46 AM, Dan Gohman wrote:
>
>>
>> On Nov 30, 2011, at 10:49 PM, Chris Lattner wrote:
>>
>>>
>>> On Nov 29, 2011, at 3:21 PM, Dan Gohman wrote:
>>>
>>>>
>>>> A natural reaction to this problem is to think that LLVM IR is so nice
>>>> and pretty that naturally there must be a nice and pretty solution. Here
>>>> are some alternatives that have been considered:
>>>>
>>>> - Go back to using undef for overflow. There were no known real-world
>>>> bugs with this. It's just inconsistent.
>>>
>>> +1 for the simple and obvious answer.
>>
>> To be clear, the main sign-extension elimination optimization is not valid under
>> the simple and obvious answer. Are you proposing to do it anyway?
>
> Please define "valid". We discussed this a very very long time ago, and I don't recall the details.


int a = INT_MAX, b = 1;
long c = (long)(a + b);

What is the value of c, on an LP64 target?

By standard C rules, the add overflow invokes immediate undefined behavior of course.

If a and b are promoted to 64-bit, c is 0x0000000080000000.

In a world where signed add overflow returns undef, c could be any of
0x0000000000000000
0x0000000000000001
0x0000000000000002
...
0x000000007fffffff
or
0xffffffff80000000
0xffffffff80000001
0xffffffff80000002
...
0xffffffffffffffff
however it can't ever be 0x0000000080000000, because there's no 32-bit value
the undef could take which sign-extends into that 64-bit value.
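
In IR terms, the promotion is roughly this rewrite:

  ; before: 32-bit add, then extend
  %t = add nsw i32 %a, %b         ; INT_MAX + 1: returns undef under this proposal
  %c = sext i32 %t to i64         ; always the sign-extension of some 32-bit value

  ; after: extend the operands, do the add in 64 bits
  %a64 = sext i32 %a to i64
  %b64 = sext i32 %b to i64
  %c   = add nsw i64 %a64, %b64   ; yields 0x0000000080000000

and the promoted form produces a value the original form never could.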

Therefore, if signed add overflow returns undef, promoting such 32-bit variables to
64-bit variables is not a behavior-preserving transformation.

Dan

David A. Greene

Dec 1, 2011, 12:32:55 PM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List
Dan Gohman <goh...@apple.com> writes:

> The presence of this poison value means that undefined behavior has been
> invoked on a potentially speculative code path. The consequences of the
> undefined behavior are deferred until that code path is actually used
> on some non-speculative path.

This is exactly what was done on Itanium and its follow-ons to handle
speculative loads. It's a perfectly reasonable approach IMHO in its
basic formulation.

The complication you introduced is to go beyond the basic formulation
and consider certain kinds of uses "special" in that they don't trigger
undefined behavior. This leads to the analysis problems you described.

> - Go back to using undef for overflow. There were no known real-world
> bugs with this. It's just inconsistent.

I have always considered "undef" to mean "any possible value" so I don't
see a real problem with promoted integers containing a value that can't
fit in the original type size. The value in the original type size is
undetermined by the definition of "undef" so who cares what we choose as
the "real" value?

That said, there are a couple of times I ran into situations where LLVM
assumed "undef" somehow restricted the possible set of values it could
represent. As I recall I ran into this in instcombine and dagcombine.
I will have to go back and search for the details. In any event, I
think these things are bugs in LLVM.

So "undef" seems like a reasonable approach to me.

> - Define add nsw as a fully side-effecting operation, and accept the
> limits on code motion that this implies. However, as LLVM starts doing
> profile-guided optimizations, and starts thinking about more diverse
> architectures, code speculation will likely become more important.

It will indeed. I don't like this approach.

> - Define add nsw as a fully side-effecting operation, and teach
> optimization passes to strip nsw when moving code past control
> boundaries. This is seen as suboptimal because it prevents subsequent
> passes from making use of the nsw information. And, it's extra work
> for optimization passes.

Agreed.

> - Instead of trying to define dependence in LangRef, just say that if
> changing the value returned from an overflowing add nsw would
> affect the observable behavior of the program, then the behavior of
> the program is undefined.

Isn't that exactly what the C standard says? Now, LLVM handles more
than C so we need to consider that. Personally, I agree that this is a
less than satisfying solution.

Are you mainly worried about the promoted value being cast back to
the original type?

> - Give up on nsw and have compilers emit warnings when they are unable
> to perform some optimization due to their inability to exclude the
> possibility of overflow.

This would be very bad. We groan every time we see "unsigned" used as a
loop index because it seriously degrades dependence analysis and
vectorization for exactly this reason.

-Dave

David A. Greene

Dec 1, 2011, 12:35:04 PM
to Paul Robinson, llvmdev@cs.uiuc.edu Mailing List
Paul Robinson <pogo...@gmail.com> writes:

> Some operations are NaT-tolerant; any input NaT is simply propagated
> to the output. Arithmetic, sign-extension, bit fiddling, and the like
> are NaT-tolerant.

Oh yeah, I forgot about that. Forget all the stuff I said about "basic
formulation" and complicating that. :)

> I suggest that storing to memory, and conditionals, and anything
> else that seems to lead toward the really hairy situations, should
> be cases where the nsw/poison/trap is "actually observed."

Agreed.

> This is taking a broader view of "observable" than what's defined in the
> C or C++ standards, which steadfastly ignore a variety of practical realities.
> But I think it's a reasonable and defensible position, that still allows you
> some scope for operating on these things in a sensible way.

I like this approach a lot.

-Dave

David A. Greene

Dec 1, 2011, 12:37:17 PM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List
Dan Gohman <goh...@apple.com> writes:

> Prohibiting poison values from propagating through memory would mean
> that the reg2mem pass would no longer be a semantics-preserving pass.

Or it means you couldn't demote those values.

> Prohibiting it from propagating through control flow would mean that a
> select instruction wouldn't be equivalent to a conditional branch and
> a phi. These are problems that aren't really important for hardware
> ISAs, but are important for LLVM IR.

We could consider select an observation point and consider its use of
poisoned values undefined behavior.

-Dave

David A. Greene

Dec 1, 2011, 12:44:26 PM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List
Dan Gohman <goh...@apple.com> writes:

> In a world where signed add overflow returns undef, c could be any of
> 0x0000000000000000
> 0x0000000000000001
> 0x0000000000000002
> ...
> 0x000000007fffffff
> or
> 0xffffffff80000000
> 0xffffffff80000001
> 0xffffffff80000002
> ...
> 0xffffffffffffffff
> however it can't ever be 0x0000000080000000, because there's no 32-bit value
> the undef could take which sign-extends into that 64-bit value.

Is undef considered to be "one true value" which is just unknown? I
always considered it to be "any possible value." If the latter, there
is no problem with undef "changing values" as a result of various
operations, including sign extension.

> Therefore, if signed add overflow returns undef, promoting such 32-bit variables to
> 64-bit variables is not a behavior-preserving transformation.

Sure it is, if undef is allowed to "float freely" and change its value
on a whim. The behavior is, "this could be anything at any time."

If I'm understanding you correctly.

-Dave

Eli Friedman

Dec 1, 2011, 1:19:36 PM
to David A. Greene, llvmdev@cs.uiuc.edu Mailing List
On Thu, Dec 1, 2011 at 9:44 AM, David A. Greene <gre...@obbligato.org> wrote:
> Dan Gohman <goh...@apple.com> writes:
>
>> In a world where signed add overflow returns undef, c could be any of
>>   0x0000000000000000
>>   0x0000000000000001
>>   0x0000000000000002
>>   ...
>>   0x000000007fffffff
>> or
>>   0xffffffff80000000
>>   0xffffffff80000001
>>   0xffffffff80000002
>>   ...
>>   0xffffffffffffffff
>> however it can't ever be 0x0000000080000000, because there's no 32-bit value
>> the undef could take which sign-extends into that 64-bit value.
>
> Is undef considered to be "one true value" which is just unknown?  I
> always considered it to be "any possible value."

The current LLVM IR definition of undef is essentially "each time
undef is evaluated, it evaluates to an arbitrary value, but it must
evaluate to some specific value". SimplifyDemandedBits is an example
of a transformation which uses undef in a way that would be incorrect
under a model where undef is like the current trap value; take a look
at r133022 for a case where this matters.

-Eli

Paul Robinson

Dec 5, 2011, 2:55:24 PM
to David A. Greene, llvmdev@cs.uiuc.edu Mailing List
On Thu, Dec 1, 2011 at 9:37 AM, David A. Greene <gre...@obbligato.org> wrote:
Dan Gohman <goh...@apple.com> writes:

> Prohibiting poison values from propagating through memory would mean
> that the reg2mem pass would no longer be a semantics-preserving pass.

Or it means you couldn't demote those values.

If reg2mem is constructing spill slots, or address-not-taken locals, I could
see classifying those as okay-to-poison. You would not want statics or globals
or pointer-addressed memory to be poisonable, as those should be treated
as being observable locations.

So, it's not really "all of memory" that could be poisoned, only these
compiler-generated things.

Pogo

Dan Gohman

Dec 5, 2011, 5:22:09 PM
to Paul Robinson, llvmdev@cs.uiuc.edu Mailing List

This would mean that it's not valid to do store-to-load propagation
in the optimizer if the pointee might be a global variable. That
would mean the loss of real optimizations.

And it would still mean that non-volatile store instructions suddenly
can have side effects beyond just writing to their addressed memory.
That would be surprising.

I think Dave also suggested making select instructions be
observation points. That would mean that select instructions would
have side effects. Again, that would be surprising.

Dan

David A. Greene

Dec 5, 2011, 6:50:13 PM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List
Dan Gohman <goh...@apple.com> writes:

> And it would still mean that non-volatile store instructions suddenly
> can have side effects beyond just writing to their addressed memory.
> That would be surprising.

No more surprising than any hardware architecture that implements this
kind of poison tracking. You see the fault or undefined behavior at the
store. Yes, a compiler IR is not an ISA but we do get to define the IR
semantics. It's a contract where we get to specify all of the rules.
:)

If you're more worried about things like the above rather than the high
static analysis cost these various suggestions are trying to address,
that's simply a particular tradeoff you'd rather make. There's no rule
that says any particular semantic is "surprising."

We make the tradeoffs we want and implement the semantics based on that.

> I think Dave also suggested making select instructions be
> observation points. That would mean that select instructions would
> have side effects. Again, that would be surprising.

How so? It certainly is an observable point. A select is just a branch
by another name. If the program behavior was undefined in the presence
of a branch, why shouldn't it be undefined in the presence of a select?

-Dave

Paul Robinson

Dec 5, 2011, 7:44:33 PM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List
(If this thread is becoming tiresome, let me know. This newbie is trying to
understand some of what's going on; clearly you've thought about it way more
than I have, and I can understand if you want to stop thinking about it!)


On Mon, Dec 5, 2011 at 2:22 PM, Dan Gohman <goh...@apple.com> wrote:
On Dec 5, 2011, at 11:55 AM, Paul Robinson wrote:
>
> On Thu, Dec 1, 2011 at 9:37 AM, David A. Greene <gre...@obbligato.org> wrote:
> Dan Gohman <goh...@apple.com> writes:
>
> > Prohibiting poison values from propagating through memory would mean
> > that the reg2mem pass would no longer be a semantics-preserving pass.
>
> Or it means you couldn't demote those values.
>
> If reg2mem is constructing spill slots, or address-not-taken locals, I could
> see classifying those as okay-to-poison. You would not want statics or globals
> or pointer-addressed memory to be poisonable, as those should be treated
> as being observable locations.
>
> So, it's not really "all of memory" that could be poisoned, only these
> compiler-generated things.

This would mean that it's not valid to do store-to-load propagation
in the optimizer if the pointee might be a global variable. That
would mean the loss of real optimizations.

And it would still mean that non-volatile store instructions suddenly
can have side effects beyond just writing to their addressed memory.
That would be surprising.

I think Dave also suggested making select instructions be
observation points. That would mean that select instructions would
have side effects. Again, that would be surprising.

Dan

Back in your first message on this thread, you did say:

>The poison would
>propagate down the def-use graph until it reaches an instruction that
>is capable of triggering the undefined behavior.

So, somebody somewhere has to be capable of triggering the undefined
behavior. What instructions did you have in mind? If the store,
select, and compare instructions aren't appropriate, which ones are?

I don't think this next suggestion was on your list, so here goes:
Drawing from the Itanium example, we could define a new 'check' 
instruction, which would explicitly trigger the undefined behavior if its
operand is poisoned, or produce a known-safe value otherwise. Yes, the
IR would have to be sprinkled with these things; but only outputs of 'nsw'
kinds of operations would subsequently need a 'check' to sanitize them.

So, if a potentially poison value propagates down the def-use graph
until it reaches a store/select/compare, that instruction must be 
preceded by a 'check' that will sanitize the value.

(Or maybe we should call it the 'foodtaster' instruction, which protects
the important instructions from poison...)

Separating out the induce-undefined-behavior instruction means the
existing semantics of store/select/compare are unaffected, the main
consequences being that you can't move these guys past the 'check'
that feeds them. But that constraint would exist automatically anyway,
right? Because you couldn't move an instruction up prior to where its
input values are defined.

Clever/optimal placement of 'check' becomes a topic of interest, but
that's kind of the point of the exercise.

Pogo

me22

Dec 5, 2011, 8:50:48 PM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List
On Thu, Dec 1, 2011 at 09:23, Dan Gohman <goh...@apple.com> wrote:
>
> int a = INT_MAX, b = 1;
> long c = (long)(a + b);
>
> What is the value of c, on an LP64 target?
>
> If a and b are promoted to 64-bit, c is 0x0000000080000000.
>
> In a world where signed add overflow returns undef, c could be any of
>  0x0000000000000000
>  ...
>  0xffffffffffffffff
> however it can't ever be 0x0000000080000000, because there's no 32-bit value
> the undef could take which sign-extends into that 64-bit value.
>

But since the add nsw returns undef, then isn't c "sext i32 undef to
i64", and thus also undef?

I read it like this example from LangRef:
  %B = undef
  %C = xor %B, %B
Safe:
  %C = undef

%C is undef, so can be later (in a bitwise or, for example) assumed to
be -1, even though there's no possible choice of value for %B such
that %B xor %B gives -1.

Confused,
~ Scott

Eli Friedman

Dec 5, 2011, 9:23:21 PM
to me22, llvmdev@cs.uiuc.edu Mailing List
On Mon, Dec 5, 2011 at 5:50 PM, me22 <me2...@gmail.com> wrote:
> On Thu, Dec 1, 2011 at 09:23, Dan Gohman <goh...@apple.com> wrote:
>>
>> int a = INT_MAX, b = 1;
>> long c = (long)(a + b);
>>
>> What is the value of c, on an LP64 target?
>>
>> If a and b are promoted to 64-bit, c is 0x0000000080000000.
>>
>> In a world where signed add overflow returns undef, c could be any of
>>  0x0000000000000000
>>  ...
>>  0xffffffffffffffff
>> however it can't ever be 0x0000000080000000, because there's no 32-bit value
>> the undef could take which sign-extends into that 64-bit value.
>>
>
> But since the add nsw returns undef, then isn't c "sext i32 undef to
> i64"

Yes.

> and thus also undef?

No: it's some value between INT32_MIN and INT32_MAX. Think of undef
as a read from an arbitrary register: you can't predict what it will
contain, but you know it contains some representable value.

-Eli

Dan Gohman

Dec 5, 2011, 10:42:00 PM
to Paul Robinson, llvmdev@cs.uiuc.edu Mailing List

A call to printf (I/O), a volatile memory operation, etc. These things
can never be speculated. They are essentially the anchors of the system.

> I don't think this next suggestion was on your list, so here goes:
> Drawing from the Itanium example, we could define a new 'check'
> instruction, which would explicitly trigger the undefined behavior if its
> operand is poisoned, or produce a known-safe value otherwise. Yes, the
> IR would have to be sprinkled with these things; but only outputs of 'nsw'
> kinds of operations would subsequently need a 'check' to sanitize them.
>
> So, if a potentially poison value propagates down the def-use graph
> until it reaches a store/select/compare, that instruction must be
> preceded by a 'check' that will sanitize the value.
>
> (Or maybe we should call it the 'foodtaster' instruction, which protects
> the important instructions from poison...)
>
> Separating out the induce-undefined-behavior instruction means the
> existing semantics of store/select/compare are unaffected, the main
> consequences being that you can't move these guys past the 'check'
> that feeds them. But that constraint would exist automatically anyway,
> right? because you couldn't move an instruction up prior to where its
> input values are defined.
>
> Clever/optimal placement of 'check' becomes a topic of interest, but
> that's kind of the point of the exercise.

For example, suppose we want to convert the && to &, and the ?: to a
select, in this code:

if (a && (b ? (c + d) : e)) {

because we have a CPU architecture with poor branch prediction, or
because we want more ILP, or because of some other reason. Here's what the
LLVM IR for that might look like:

    %t0 = add nsw i32 %c, %d
    %t1 = select i1 %b, i32 %t0, i32 %e
    %t2 = icmp ne i32 %t1, 0
    %t3 = and i1 %a, %t2
    br i1 %t3, ...

The extra branching is gone, yay. But now we've put an add nsw out there
to be executed unconditionally. If we make the select an observation
point, we'd have introduced undefined behavior on a path that didn't
previously have it.

A foodtaster instruction doesn't really solve this problem, because
we'd have to put it between the add and the select, and it would be
just as problematic.

The current definition of nsw is an attempt to avoid arbitrary
limitations and keep the system as flexible as possible.

One could argue that aggressive speculation is a low-level optimization
better suited to CodeGen than the optimizer, as LLVM divides them, and
that perhaps the cost for providing this level of flexibility in the
optimizer is too high, but that's a different argument.

Dan Gohman

Dec 5, 2011, 10:44:48 PM
to me22, llvmdev@cs.uiuc.edu Mailing List

On Dec 5, 2011, at 5:50 PM, me22 wrote:

> On Thu, Dec 1, 2011 at 09:23, Dan Gohman <goh...@apple.com> wrote:
>>
>> int a = INT_MAX, b = 1;
>> long c = (long)(a + b);
>>
>> What is the value of c, on an LP64 target?
>>
>> If a and b are promoted to 64-bit, c is 0x0000000080000000.
>>
>> In a world where signed add overflow returns undef, c could be any of
>> 0x0000000000000000
>> ...
>> 0xffffffffffffffff
>> however it can't ever be 0x0000000080000000, because there's no 32-bit value
>> the undef could take which sign-extends into that 64-bit value.
>>
>
> But since the add nsw returns undef, then isn't c "sext i32 undef to
> i64", and thus also undef?

No; sext i32 undef to i64 returns an extraordinary thing. It's a
64-bit value where the low 32 bits are undef, and the high
32 bits are sign extension from whatever bit 31 happened to be.
And the low 32 bits in this new value also fluctuate, and the
high 32 bits in this value fluctuate with them, in concert with
bit 31 of the new value.

FWIW, you can also get non-bitwise fluctuation, such as (undef*3), which
is a value which fluctuates around multiples of 3. Or (undef*3)/5+1,
which fluctuates around, well, any value which could be computed by
that expression. You can build arbitrarily complex expression trees
around undef, and everything fluctuates according to the constraints of
operands and opcodes. It's quite magical.

>
> I read it like this example from LangRef:
> %B = undef
> %C = xor %B, %B
> Safe:
> %C = undef
>
> %C is undef, so can be later (in a bitwise or, for example) assumed to
> be -1, even though there's no possible choice of value for %B such
> that %B xor %B gives -1.

The reason this example works is that undef fluctuates, and further, it
causes things that use it to fluctuate (within their constraints, as I
discussed above). Since the xor reads %B twice, it could get two different
values, so %C can be an unconstrained undef.

Dan

Chris Lattner

Dec 6, 2011, 3:32:48 AM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List

On Dec 5, 2011, at 7:44 PM, Dan Gohman wrote:

>
> On Dec 5, 2011, at 5:50 PM, me22 wrote:
>
>> On Thu, Dec 1, 2011 at 09:23, Dan Gohman <goh...@apple.com> wrote:
>>>
>>> int a = INT_MAX, b = 1;
>>> long c = (long)(a + b);
>>>
>>> What is the value of c, on an LP64 target?
>>>
>>> If a and b are promoted to 64-bit, c is 0x0000000080000000.
>>>
>>> In a world where signed add overflow returns undef, c could be any of
>>> 0x0000000000000000
>>> ...
>>> 0xffffffffffffffff
>>> however it can't ever be 0x0000000080000000, because there's no 32-bit value
>>> the undef could take which sign-extends into that 64-bit value.
>>>
>>
>> But since the add nsw returns undef, then isn't c "sext i32 undef to
>> i64", and thus also undef?
>
> No; sext i32 undef to i64 returns an extraordinary thing. It's a
> 64-bit value where the low 32 bits are undef, and the high
> 32 bits are sign extension from whatever bit 31 happened to be.

More precisely, the high 33 bits are known to be the same.

> operands and opcodes. It's quite magical.

It's actually not that magical. It's also essential that we have this behavior; otherwise, normal bitfield code has "undefined" behavior and is misoptimized. This is why the result of "undef & 1" is known to have high bits zero.

-Chris

David A. Greene

Dec 6, 2011, 12:06:46 PM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List
Dan Gohman <goh...@apple.com> writes:

> For example, suppose we want to convert the && to &, and the ?: to a
> select, in this code:
>
> if (a && (b ? (c + d) : e)) {
>
> because we have a CPU architecture with poor branch prediction, or
> because we want more ILP, or because of some other reason. Here's what the
> LLVM IR for that might look like:
>
> %t0 = add nsw i32 %c, %d
> %t1 = select i1 %b, i32 %t0, i32 %e
> %t2 = icmp ne i32 %t1, 0
> %t3 = and i1 %a, %t2
> br i1 %t3, ...
>
> The extra branching is gone, yay. But now we've put an add nsw out there
> to be executed unconditionally. If we make the select an observation
> point, we'd have introduced undefined behavior on a path that didn't
> previously have it.

Unless the undefined behavior only triggered if the select actually
produced a poisoned result. Then it should have the same behavior as
the branch, no?

> A foodtaster instruction doesn't really solve this problem, because
> we'd have to put it between the add and the select, and it would be
> just as problematic.

Or you put it immediately after the select.

> One could argue that aggressive speculation is a low-level optimization
> better suited to CodeGen than the optimizer, as LLVM divides them, and
> that perhaps the cost for providing this level of flexibility in the
> optimizer is too high, but that's a different argument.

No, I think we want the flexibility. But I believe there are sane ways
to do this.

-Dave

David A. Greene

Dec 6, 2011, 12:08:38 PM
to Eli Friedman, llvmdev@cs.uiuc.edu Mailing List
Eli Friedman <eli.fr...@gmail.com> writes:

>> But since the add nsw returns undef, then isn't c "sext i32 undef to
>> i64"
>
> Yes.
>
>> and thus also undef?
>
> No: it's some value between INT32_MIN and INT32_MAX. Think of undef
> as a read from an arbitrary register: you can't predict what it will
> contain, but you know it contains some representable value.

How does undef not cover the above case? "some value between INT32_MIN
and INT32_MAX" is certainly "some representable value."

-Dave

David A. Greene

Dec 6, 2011, 12:10:58 PM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List
Dan Gohman <goh...@apple.com> writes:

> No; sext i32 undef to i64 returns an extraordinary thing. It's a
> 64-bit value where the low 32 bits are undef, and the high
> 32 bits are sign extension from whatever bit 31 happened to be.

And thus they are undef because the low 32 bits were/are undef.

> And the low 32 bits in this new value also fluctuate, and the
> high 31 bits in this value fluctuate with them in concert with
> bit 31 of the new value.

That's just ridiculous. Undef is undef. We don't know anything special
about it.

> FWIW, you can also get non-bitwise fluctuation, such as (undef*3), which
> is a value which fluctuates around multiples of 3. Or (undef*3)/5+1,
> which fluctuates around, well, any value which could be computed by
> that expression. You can build arbitrarily complex expression trees
> around undef, and everything fluctuates according to the constraints of
> operands and opcodes. It's quite magical.

This is silly. We can't count on undef having any special properties at
all. If we do, IMHO it's a bug.

-Dave

David A. Greene

Dec 6, 2011, 12:18:09 PM
to Chris Lattner, llvmdev@cs.uiuc.edu Mailing List
Chris Lattner <clat...@apple.com> writes:

>> No; sext i32 undef to i64 returns an extraordinary thing. It's a
>> 64-bit value where the low 32 bits are undef, and the high
>> 32 bits are sign extension from whatever bit 31 happened to be.
>
> More precisely, the high 33 bits are known to be the same.

I think that's a very risky assumption. The name "undef" does not
convey that at all.

>> operands and opcodes. It's quite magical.
>
> It's actually not that magical. It's also essential that we have this
> behavior, otherwise normal bitfield code has "undefined" behavior and
> is misoptimized. This is why the result of "undef & 1" is known to
> have high bits zero.

Well sure, because we know (x & 1) has the high bits set to zero for any
value of x. In the sext case, we know nothing about the high bits
because we don't know what bit 32 was/is.

Yes, we know the high bits are the same but it's surprising to me that
this would be represented as "undef." "undef" to me means I can assume
any value at all, not restricted to "all the bits are the same."

-Dave

Andrew Trick

Dec 6, 2011, 3:22:00 PM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List
The optimizations we're talking about are forms of safe
speculation. Doing that at IR level is fine.

Hardware support for NaT bits + check instructions have been popping
up on this message thread. Hardware speculation exists solely to
support unsafe speculation, in which certain operations need to be
reexecuted under certain conditions. It is not something we would ever
want to represent in IR. It isn't amenable to CFG+SSA form.

Runtime checking for undefined behavior would be implemented as
overflow checks, etc. by the front end. I don't think it's related to
unsafe speculation. In other words, I don't see the purpose of a
"check/foodtaster" instruction.

-Andy

Eli Friedman

Dec 6, 2011, 2:55:52 PM
to David A. Greene, llvmdev@cs.uiuc.edu Mailing List
On Tue, Dec 6, 2011 at 9:18 AM, David A. Greene <gre...@obbligato.org> wrote:
> Chris Lattner <clat...@apple.com> writes:
>
>>> No; sext i32 undef to i64 returns an extraordinary thing. It's a
>>> 64-bit value where the low 32 bits are undef, and the high
>>> 32 bits are sign extension from whatever bit 31 happened to be.
>>
>> More precisely, the high 33 bits are known to be the same.
>
> I think that's a very risky assumption.  The name "undef" does not
> convey that at all.
>
>>> operands and opcodes. It's quite magical.
>>
>> It's actually not that magical.  It's also essential that we have this
>> behavior, otherwise normal bitfield code has "undefined" behavior and
>> is misoptimized.  This is why the result of "undef & 1" is known to
>> have high bits zero.
>
> Well sure, because we know (x & 1) has the high bits set to zero for any
> value of x.  In the sext case, we know nothing about the high bits
> because we don't know what bit 32 was/is.
>
> Yes, we know the high bits are the same but it's surprising to me that
> this would be represented as "undef."

We don't represent that as undef in the IR; constant folding will fold
"sext i32 undef to i64" to zero, not undef.

-Eli

Andrew Trick

Dec 6, 2011, 5:53:14 PM
to David A. Greene, llvmdev@cs.uiuc.edu Mailing List
On Dec 6, 2011, at 2:31 PM, David A. Greene wrote:

> Andrew Trick <atr...@apple.com> writes:
>
>> The optimizations we're talking about are forms of safe
>> speculation. Doing that at IR level is fine.
>>
>> Hardware support for NaT bits + check instructions have been popping
>> up on this message thread. Hardware speculation exists solely to
>> support unsafe speculation, in which certain operations need to be
>> reexecuted under certain conditions. It is not something we would ever
>> want to represent in IR. It isn't amenable to CFG+SSA form.
>>
>> Runtime checking for undefined behavior would be implemented as
>> overflow checks, etc. by the front end. I don't think it's related to
>> unsafe speculation. In other words, I don't see the purpose of a
>> "check/foodtaster" instruction.
>

> The purpose of all of these discussions is to alleviate the
> computational explosion of static analysis that Dan described in the
> presence of nsw bits. The point is to reduce the search space for
> static checking. If we don't care about that, then all of this
> discussion is moot. :)
>
> -Dave

Right. I can see how a value-undefined-if-poison marker could yield more precise static analysis. I didn't mean to discourage that approach, although I'm not aware of any imminent problem that it solves.

-Andy

David A. Greene

Dec 6, 2011, 5:29:32 PM
to Eli Friedman, llvmdev@cs.uiuc.edu Mailing List
Eli Friedman <eli.fr...@gmail.com> writes:


>> Yes, we know the high bits are the same but it's surprising to me that
>> this would be represented as "undef."
>
> We don't represent that as undef in the IR; constant folding will fold
> "sext i32 undef to i64" to zero, not undef.

Ok, so why the fuss over sext?

-Dave

David A. Greene

Dec 6, 2011, 5:31:16 PM
to Andrew Trick, llvmdev@cs.uiuc.edu Mailing List
Andrew Trick <atr...@apple.com> writes:

> The optimizations we're talking about are forms of safe
> speculation. Doing that at IR level is fine.
>
> Hardware support for NaT bits + check instructions have been popping
> up on this message thread. Hardware speculation exists solely to
> support unsafe speculation, in which certain operations need to be
> reexecuted under certain conditions. It is not something we would ever
> want to represent in IR. It isn't ammenable to CFG+SSA form.
>
> Runtime checking for undefined behavior would be implemented as
> overflow checks, etc. by the front end. I don't think it's related to
> unsafe speculation. In other words, I don't see the purpose of a
> "check/foodtaster" instruction.

The purpose of all of these discussions is to alleviate the
computational explosion of static analysis that Dan described in the
presence of nsw bits. The point is to reduce the search space for
static checking. If we don't care about that, then all of this
discussion is moot. :)

-Dave

Paul Robinson

Dec 6, 2011, 1:07:53 PM
to David A. Greene, llvmdev@cs.uiuc.edu Mailing List
On Tue, Dec 6, 2011 at 9:06 AM, David A. Greene <gre...@obbligato.org> wrote:
Dan Gohman <goh...@apple.com> writes:

> For example, suppose we want to convert the && to &, and the ?: to a
> select, in this code:
>
> if (a && (b ? (c + d) : e)) {
>
> because we have a CPU architecture with poor branch prediction, or
> because we want more ILP, or because of some other reason. Here's what the
> LLVM IR for that might look like:
>
>    %t0 = add nsw i32 %c, %d
>    %t1 = select i1 %b, i32 %t0, i32 %e
>    %t2 = icmp ne i32 %t1, 0
>    %t3 = and i1 %a, %t2
>    br i1 %t3, ...
>
> The extra branching is gone, yay. But now we've put an add nsw out there
> to be executed unconditionally. If we make the select an observation
> point, we'd have introduced undefined behavior on a path that didn't
> previously have it.

Unless the undefined behavior only triggered if the select actually
produced a poisoned result.  Then it should have the same behavior as
the branch, no?

> A foodtaster instruction doesn't really solve this problem, because
> we'd have to put it between the add and the select, and it would be
> just as problematic.

Or you put it immediately after the select.

That was my thinking. The select is an observation point for its first operand,
but then merely propagates poison from the second or third, just like any
computational instruction would.
The icmp is an observation point for both inputs.

Pogo

Dan Gohman

Dec 6, 2011, 8:29:14 PM
to Paul Robinson, llvmdev@cs.uiuc.edu Mailing List

Put the observation point on the add, the select, the icmp, or even the
and, or between any of them, and it'll still be before the branch. That
means that the code will have unconditional undefined behavior when the
add overflows, which the original code did not have.

Dan

Paul Robinson

Dec 7, 2011, 2:19:57 PM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List
Sorry, you lost me there.  The behavior of the observation point isn't undefined
unless you took the overflow path in the first place. I don't see how that makes
it unconditionally undefined.

The discussion started based on a premise of propagating poison values past
all these kinds of places; now you don't want to let me move them past any
of these places? Clearly I don't follow how you think my suggestion diverges
unacceptably from your original sketch. Which might mean it's time to bail...

Pogo

Dan Gohman

Dec 7, 2011, 2:23:37 PM
to Chris Lattner, llvmdev@cs.uiuc.edu Mailing List
On Dec 6, 2011, at 12:32 AM, Chris Lattner wrote:
>
> On Dec 5, 2011, at 7:44 PM, Dan Gohman wrote:
>
>> operands and opcodes. It's quite magical.
>
> It's actually not that magical.

It's not magical from an LLVM perspective. But it is one
of the areas where the popular mental model of LLVM IR as an
assembly language breaks down. Undef is weirder than PHI nodes.

Dan

David A. Greene

Dec 12, 2011, 12:25:59 PM
to Dan Gohman, llvmdev@cs.uiuc.edu Mailing List
Dan Gohman <goh...@apple.com> writes:

>> Or you put it immediately after the select.
>>
>> That was my thinking. The select is an observation point for its first operand,
>> but then merely propagates poison from the second or third, just like any
>> computational instruction would.
>> The icmp is an observation point for both inputs.
>
> Put the observation point on the add, the select, the icmp, or even the
> and, or between any of them, and it'll still be before the branch. That
> means that the code will have unconditional undefined behavior when the
> add overflows, which the original code did not have.

So put it after the branch. Absolutely nothing prevents that. That's
what IA64 compilers have to do today.

-Dave
