Sequence point violations

Robbert Krebbers

unread,

Mar 12, 2013, 9:59:59 AM3/12/13

to

It is well known that multiple unsequenced writes to the same location,
and an unsequenced read after a write to the same location, like (x = 2)
+ (x = 3), lead to undefined behavior. Also, we all know that a compiler
is allowed to compile such programs into binary code that behaves badly.

I am wondering what kind of actually used compiler optimizations take
advantage of sequence point violations. Moreover, I am looking for
example programs with sequence point violations that when compiled with
an actual optimizing compiler (like gcc or clang) omit surprising results.

By googleing I am able to find many of such examples for integer
overflow, and other kinds of undefined behavior, but not for sequence
point violations.

Thanks!

Richard Kettlewell

unread,

Mar 12, 2013, 10:08:18 AM3/12/13

to

Robbert Krebbers <newsg...@robbertkrebbers.nl> writes:

> It is well known that multiple unsequenced writes to the same
> location, and an unsequenced read after a write to the same location,
> like (x = 2) + (x = 3), lead to undefined behavior. Also, we all know
> that a compiler is allowed to compile such programs into binary code
> that behaves badly.
>
> I am wondering what kind of actually used compiler optimizations take
> advantage of sequence point violations. Moreover, I am looking for
> example programs with sequence point violations that when compiled
> with an actual optimizing compiler (like gcc or clang) omit surprising
> results.

The example you give produces differing results on (the versions I have
of) GCC and Clang; presumably at least one of them must be surprising
(to someone who had not spotted the UB).

--
http://www.greenend.org.uk/rjk/

Keith Thompson

unread,

Mar 12, 2013, 11:12:21 AM3/12/13

to

I suggest that "surprising" could mean a result that's inconsistent with
any of the possible orderings of the operations, for example if

x = 0;
x = x++;

sets x to 3.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Shao Miller

unread,

Mar 12, 2013, 8:55:13 PM3/12/13

to

I'm not sure that I follow, here... Is there a supposition here that
all cases of undefined behaviour are related to or allow for
optimization opportunities?

For non-volatile 'x', one trivial optimization would be to omit one of
the assignments. :) Are you wondering if there is an implementation
which does so?

--
- Shao Miller
--
"Thank you for the kind words; those are the kind of words I like to hear.

Cheerily," -- Richard Harter

Robbert Krebbers

unread,

Mar 13, 2013, 3:51:50 AM3/13/13

to

On 03/12/2013 04:12 PM, Keith Thompson wrote:
> I suggest that "surprising" could mean a result that's inconsistent with
> any of the possible orderings of the operations

Yes, that is what I meant by surprising.

> x = 0;
> x = x++;
>
> sets x to 3.

With which compiler?

My gcc (4.7.2, linux x86_64) gives 1, and clang (3.0-6.1) gives 0. I
guess these two results can at least be explained by it delaying and
reordering writes, or not. But 3 is quite surprising.

Robbert Krebbers

unread,

Mar 13, 2013, 3:58:51 AM3/13/13

to

On 03/13/2013 01:55 AM, Shao Miller wrote:
> I'm not sure that I follow, here... Is there a supposition here that
> all cases of undefined behaviour are related to or allow for
> optimization opportunities?

Well, most cases of undefined behavior are as far as I see related to

a.) allow more aggressive optimization
b.) exclude programs that cannot be statically excluded (e.g.
syntactically, or by types), but which cannot be compiled "sensibly"
(null pointer dereferences, array out of bounds). Otherwise, a compiler
should include possibly expensive checks.
c.) make C more architecture independent

I doubt either (b.) or (c.) is relevant to the case of sequence point
violations.

James Kuyper

unread,

Mar 13, 2013, 6:17:13 AM3/13/13

to

On 03/13/2013 03:51 AM, Robbert Krebbers wrote:
> On 03/12/2013 04:12 PM, Keith Thompson wrote:
>> I suggest that "surprising" could mean a result that's inconsistent with
>> any of the possible orderings of the operations
> Yes, that is what I meant by surprising.
>
>> x = 0;
>> x = x++;
>>
>> sets x to 3.
> With which compiler?

He was giving an example of a result permitted by the C standard that
would, unlike 0 or 1, be considered surprising - he wasn't giving it as
an example of actual output using a real compiler.
--
James Kuyper

Keith Thompson

unread,

Mar 13, 2013, 1:03:26 PM3/13/13

to

Exactly.

Seebs

unread,

Mar 14, 2013, 1:56:47 PM3/14/13

to

On 2013-03-12, Robbert Krebbers <newsg...@robbertkrebbers.nl> wrote:
> It is well known that multiple unsequenced writes to the same location,
> and an unsequenced read after a write to the same location, like (x = 2)
> + (x = 3), lead to undefined behavior. Also, we all know that a compiler
> is allowed to compile such programs into binary code that behaves badly.

> I am wondering what kind of actually used compiler optimizations take
> advantage of sequence point violations. Moreover, I am looking for
> example programs with sequence point violations that when compiled with
> an actual optimizing compiler (like gcc or clang) omit surprising results.

I bet you'd find it more interesting if they *e*mitted surprising results. :)

I don't know what the code was exactly, but I once saw, on some strange
embedded system, code roughly to the effect of:

i = i++ % 4;

which produced a consistent-but-incomprehensible sequence of outputs; I seem
to recall something like 1, 3, 12, 257. I don't remember details anymore,
this was probably twenty years ago.

-s
--
Copyright 2013, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.

Joe keane

unread,

Mar 14, 2013, 4:43:08 PM3/14/13

to

In article <513f3465$0$43849$862e...@ngroups.net>,

Robbert Krebbers <newsg...@robbertkrebbers.nl> wrote:
>It is well known that multiple unsequenced writes to the same location,
>and an unsequenced read after a write to the same location, like (x = 2)
>+ (x = 3), lead to undefined behavior.

I tried some code:

% cat bar.c
int f(int **p, int **q)
{
return *(*p)++ + *(*q)++;
}
% cc -S -O bar.c
% cat bar.s
.file "bar.c"
.text
.globl f
.type f, @function
f:
pushl %ebp
movl %esp, %ebp
pushl %esi
pushl %ebx
movl 8(%ebp), %ebx
movl 12(%ebp), %esi
movl (%ebx), %edx
movl (%esi), %ecx
movl (%ecx), %eax
addl (%edx), %eax
addl $4, %edx
movl %edx, (%ebx)
addl $4, %ecx
movl %ecx, (%esi)
popl %ebx
popl %esi
leave
ret
.size f, .-f
.ident "GCC: (GNU) 4.1.3 20080704 prerelease (NetBSD nb2 20081120)"

But it's not what you want: the unoptimized code does the same thing.

Philip Guenther

unread,

Mar 15, 2013, 5:30:42 AM3/15/13

to

On Tuesday, March 12, 2013 6:59:59 AM UTC-7, Robbert Krebbers wrote:
> It is well known that multiple unsequenced writes to the same location,
> and an unsequenced read after a write to the same location, like (x = 2)
> + (x = 3), lead to undefined behavior. Also, we all know that a compiler
> is allowed to compile such programs into binary code that behaves badly.
>
> I am wondering what kind of actually used compiler optimizations take
> advantage of sequence point violations. Moreover, I am looking for
> example programs with sequence point violations that when compiled with
> an actual optimizing compiler (like gcc or clang) omit surprising results.

Take advantage relative to _what_? To say "this optimization in this compiler takes advantage on sequence point violations" you need to contrast that to what the compiler would have had to do if sequence point violations were _not_ undefined behavior. What could/would the standard say instead?

(To put it another way: in order for results to be surprising, you must have some expectations about what the results should be. What rules are you assuming a compiler would follow for the results to _not_ be surprising?)

One option would be to back down from this directly being undefined behavior to instead give all the "possibly modified lvalues" indeterminate values. Note that "possibly modified lvalues" can easily expand to "the entire system":
*p = *p++ + *p++;

p is modified twice between sequence points, so its value is indeterminate, which makes the assignment to *p undefined behavior. My sense is that this is already what most compilers achieve: few or no optimizations would be ruled out...but it's not much of a tightening from the existing wording and the results are not usefully improved, so no one is going to waste their time trying to formalize that rule.

At the other end, the standard could specify that the results must be "as if" evaluation was a strict left-to-right (or right-to-left?) walk of the parse tree. That's quite strict; would resulted affected by an optimization that conflicted with that be "surprising"?

Philip Guenther

James Kuyper

unread,

Mar 15, 2013, 8:02:46 AM3/15/13

to

If it weren't for the fact that the standard explicitly says that the
behavior is undefined in those cases, the rest of the standard would
leave the order of the side effects unspecified, but would allow nothing
worse than the consequences of that unspecified order.
For some expressions, for some values of the relevant variables, that
could end up allowing a sub-expression to have a value that renders the
behavior of a containing expression undefined, such as division by 0.
However, when that is not the case, an expression containing N different
side effects would have at most N! different possible results; fewer
than that if any pair of side effects is sequenced. Those are the
"unsurprising" results; the fact that the standard allows anything other
than those N! different results is surprising to most people who aren't
aware of the clause that says the behavior is undefined (and also to the
large number of people who don't properly understand what "undefined
behavior" really means).

However, for side-effects which can interact with each other, nothing
less than a strict sequence on those side-effects could be of any
practical use in most contexts. You could get most of the way there with
just a few changes: associate a sequence point with each assignment
operator; require that a++ impose the same sequencing as
(unnamed_temporary=a, a+=1, unnamed_temporary); require that ++a be
equivalent to a+=1, and similarly for --.
--
James Kuyper

James Kuyper

unread,

Mar 15, 2013, 8:02:46 AM3/15/13

to

On 03/15/2013 05:30 AM, Philip Guenther wrote:

christian.bau

unread,

Mar 15, 2013, 6:09:31 PM3/15/13

to

On Mar 12, 1:59 pm, Robbert Krebbers <newsgro...@robbertkrebbers.nl>
wrote:

> I am wondering what kind of actually used compiler optimizations take
> advantage of sequence point violations. Moreover, I am looking for
> example programs with sequence point violations that when compiled with
> an actual optimizing compiler (like gcc or clang) omit surprising results.

Here's a simple example:

An old and well-known method to evaluate an expression using the
minimum number of registers is to evaluate those sub-expressions first
that need more registers. If I want to evaluate (expr1) + (expr2), and
these two expressions need 2 resp. 3 registers to evaluate, then I
evaluate expr2 first using three registers, keep the result in one
register, use the other two to evaluate expr1, and add, so the total
evaluation used 3 registers. Evaluating expr1 first needs 4 registers:
One to hold the result of expr1, three to evaluate expr2.

Now let's say the expressions are (*p = something) + (*q =
somethingelse). If p and q are different pointers, then it doesn't
matter which side is evaluated first. If p and q are the same,
behaviour is undefined, and that allows the compiler to evaluate
whichever side it wants first, so the algorithm above can be used to
evaluate the whole expression using the minimal number of registers.
If behaviour was defined like in Java where the first assignment
happens first, and the second assignment happens second, the compiler
wouldn't have that freedom and might have to produce less good code if
the first subexpression needed more registers to evaluate than the
second.

Tim Rentsch

unread,

Mar 15, 2013, 10:32:58 PM3/15/13

to

James Kuyper <james...@verizon.net> writes:

> On 03/15/2013 05:30 AM, Philip Guenther wrote:

>> [snip]

>>
>> At the other end, the standard could specify that the results must be
>> "as if" evaluation was a strict left-to-right (or right-to-left?)
>> walk of the parse tree. That's quite strict; would resulted affected
>> by an optimization that conflicted with that be "surprising"?
>

> [if side effects occurred in some unspecified order] an expression

> containing N different side effects would have at most N! different

> possible results; [snip elaboration]

This is either sloppy language or sloppy thinking. An
expression like

i++ * i * i * i

has only one side effect, but surely has more than one or
two possible values.

Robbert Krebbers

unread,

Mar 16, 2013, 7:09:24 PM3/16/13

to

On 03/15/2013 01:02 PM, James Kuyper wrote:
> On 03/15/2013 05:30 AM, Philip Guenther wrote:
>> Take advantage relative to _what_? To say "this optimization in this
>> compiler takes advantage on sequence point violations" you need to
>> contrast that to what the compiler would have had to do if sequence
>> point violations were _not_ undefined behavior. What could/would the
>> standard say instead?
>>

>> [...]

>
> If it weren't for the fact that the standard explicitly says that the
> behavior is undefined in those cases, the rest of the standard would
> leave the order of the side effects unspecified, but would allow nothing
> worse than the consequences of that unspecified order.

Yes, that is exactly what I was relating it to.

Robbert Krebbers

unread,

Mar 16, 2013, 7:09:36 PM3/16/13

to

Thanks, these are the kind of optimizations I was looking for!

Nonetheless, I do not see why sequence point violations are necessary to
justify this. If the order of evaluation was just unspecified, this
would also have been possible.

Philip Guenther

unread,

Mar 16, 2013, 8:55:35 PM3/16/13

to

On Saturday, March 16, 2013 4:09:36 PM UTC-7, Robbert Krebbers wrote:
...

> Nonetheless, I do not see why sequence point violations are necessary to
> justify this. If the order of evaluation was just unspecified, this
> would also have been possible.

For whom (and how) would that be an improvement?

Philip Guenther

Wojtek Lerch

unread,

Mar 16, 2013, 9:25:27 PM3/16/13

to

On 16/03/2013 7:09 PM, Robbert Krebbers wrote:
> Nonetheless, I do not see why sequence point violations are necessary to
> justify this. If the order of evaluation was just unspecified, this
> would also have been possible.

Imagine that the values don't fit into a single register (for instance,
64-bit integers on a 32-bit machine) and each assignment needs to store
two registers to memory.

Richard Damon

unread,

Mar 16, 2013, 10:47:25 PM3/16/13

to

On 3/15/13 8:02 AM, James Kuyper wrote:

> If it weren't for the fact that the standard explicitly says that the
> behavior is undefined in those cases, the rest of the standard would
> leave the order of the side effects unspecified, but would allow nothing
> worse than the consequences of that unspecified order.

I believe part of the issue was that there exists (or existed) some
hardware which had requirement on code accessing memory, such that a
hardware fault might occur if two writes to the same memory occurred too
close together, or a write followed by a read to that same location too
close in time. On such a machine, the multiple side effects on the same
memory location didn't just give an make the order unspecified, but
could cause a program crash (a good example of an undefined behavior).
If the two modifications were due to two different pointer variables
which just happened to point to the same object, then the compiler might
not be able to detect that this might happen. Since this sort of machine
might also have a requirement that the program needed to wait for a
result was ready before accessing it (the need to insert some operations
that don't depend on it, or nop's), it could well push the store-backs
for two operations together, for example something like
i = ++*p + ++*q; might generate updates to *p and *q closer than allowed
if the compiler had to allow for p == q.

Tim Rentsch

unread,

Mar 17, 2013, 1:59:05 PM3/17/13

to

Robbert Krebbers <newsg...@robbertkrebbers.nl> writes:

> On 03/15/2013 11:09 PM, christian.bau wrote:
>> On Mar 12, 1:59 pm, Robbert Krebbers<newsgro...@robbertkrebbers.nl>
>> wrote:
>>
>>> > I am wondering what kind of actually used compiler optimizations
>>> > take advantage of sequence point violations. Moreover, I am
>>> > looking for example programs with sequence point violations that
>>> > when compiled with an actual optimizing compiler (like gcc or
>>> > clang) omit surprising results.

>> Here's a simple example: [..snip..]

>>
>
> Thanks, these are the kind of optimizations I was looking for!
>
> Nonetheless, I do not see why sequence point violations are necessary
> to justify this. If the order of evaluation was just unspecified, this
> would also have been possible.

The reasoning here is backwards. Expression semantics that make
order of evaluation unspecified offers essentially no benefit
over making side-effect conflicts be undefined behavior -- in
both cases any sensible developer will avoid such situations
like the plague. So there is no justification for making the
behavior in such cases be unspecified rather than undefined,
which allows a greater range of implementation choices.