Stephen Sprunk <ste...@sprunk.org> writes:
> On 16-Nov-13 19:28, Tim Rentsch wrote:
>> Stephen Sprunk <ste...@sprunk.org> writes:
>>> There are sequence points before and after each call to rand(),
>>> regardless of which order they're called in, so there is no danger of
>>> them interleaving.
>>
>> Having a sequence point before and after each common access does
>> not prevent undefined behavior if there is no other ordering
>> between those sequence points. For example
>>
>> int x;
>> ...
>>
>> (10, x = 1, 30) + (20, x = 2, 40)
>>
>> is still undefined behavior, despite there being a sequence point
>> both before and after each access to x.
>
> I trust that you are correct, but I don't understand why. I thought
> those sequence points would establish that x=1 and x=2 were ordered,
> even if that order is unspecified.

The rules for evaluation sequencing in C90 suffer from an
unfortunate choice of phrasing, which turned out to be misleading
in some cases. This confusion was noted fairly soon after C90 was
published; the Defect Report asking about it was submitted in
1993. Around the time C99 was being done, several attempts were
made to define a formal model for sequencing rules, to remove
ambiguities and define the rules more precisely. However, these
efforts did not result in any changes in C99. Finally, when C11
was being done, a suitable formal model was arrived at, and these
changes were incorporated into C11. It is worth reading what C11
says about sequencing relationships, especially if one is familiar
with how those rules are expressed in C90/C99: even though it
seems like the new rules are different, reportedly the semantics
of the C11 rules is what was intended all along for C90 and C99.

To look at the particular case, here is a diagrammatic view. The
"at sign" in '@x' means lvalue as opposed to rvalue. Statement
boundaries are semicolons.

                        -- @x --
                       /        \
          - 10 --> , --          = --> , -- 30 --
         /             \        /                \
        /               --- 1 --                  \
--> ; --                                           + --> ; --
        \               -- @x --                  /
         \             /        \                /
          - 20 --> , --          = --> , -- 40 --
                       \        /
                        --- 2 --

Here the lines show ordering for value computations, and the
arrows --> show sequence points, which also order storing of values
for operations going backwards from the -->. The undefined
behavior for this expression can be seen by the absence of a line
connecting the two assignment operators, both of which modify x.
In fact, since both are modifications, not only would there need
to be a line between the two assignments, but there would need to
be an arrow --> between them on that line, to prevent destructive
interference (i.e., undefined behavior). But there is no such line.

A sequence point imposes an ordering between two modifications
only when one modification is somewhere on the "forward line" and
the other is somewhere on the "backward line". (The rule for one
modification and one read access is a little more complicated,
involving operators as well as sequence points.)
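
For anyone who prefers to see the rule applied mechanically, here is
a rough sketch in C that encodes the first diagram as a graph and
asks whether two nodes are connected by a path that crosses a
sequence point. The node names, the NONE/LINE/ARROW encoding, and
the helper functions are all invented for illustration; nothing here
comes from the Standard, and it covers only the rule for two
modifications, not the more complicated rule for a modification and
a read.

```c
#include <assert.h>

/* Kind of connection between two nodes (illustrative encoding only). */
enum { NONE = 0, LINE = 1, ARROW = 2 };  /* ARROW: path crosses a sequence point */

/* One node per item in the diagram; _A is the upper branch, _B the lower. */
enum node {
    START, TEN, COMMA1_A, LVX_A, ONE, ASSIGN_A, COMMA2_A, THIRTY,
    TWENTY, COMMA1_B, LVX_B, TWO, ASSIGN_B, COMMA2_B, FORTY,
    PLUS, END, NNODES
};

int rel[NNODES][NNODES];  /* rel[i][j]: strongest known path from i to j */

static void edge(enum node from, enum node to, int kind)
{
    assert(kind == LINE || kind == ARROW);
    rel[from][to] = kind;
}

/* Kind of the path formed by following one path and then another. */
static int join(int a, int b)
{
    if (a == NONE || b == NONE)
        return NONE;
    return a > b ? a : b;
}

void build_diagram(void)
{
    int i, j, k;

    /* Upper branch: (10, x = 1, 30) */
    edge(START, TEN, LINE);
    edge(TEN, COMMA1_A, ARROW);
    edge(COMMA1_A, LVX_A, LINE);
    edge(COMMA1_A, ONE, LINE);
    edge(LVX_A, ASSIGN_A, LINE);
    edge(ONE, ASSIGN_A, LINE);
    edge(ASSIGN_A, COMMA2_A, ARROW);
    edge(COMMA2_A, THIRTY, LINE);
    edge(THIRTY, PLUS, LINE);

    /* Lower branch: (20, x = 2, 40) */
    edge(START, TWENTY, LINE);
    edge(TWENTY, COMMA1_B, ARROW);
    edge(COMMA1_B, LVX_B, LINE);
    edge(COMMA1_B, TWO, LINE);
    edge(LVX_B, ASSIGN_B, LINE);
    edge(TWO, ASSIGN_B, LINE);
    edge(ASSIGN_B, COMMA2_B, ARROW);
    edge(COMMA2_B, FORTY, LINE);
    edge(FORTY, PLUS, LINE);

    edge(PLUS, END, ARROW);

    /* Warshall-style closure: extend paths through each intermediate k. */
    for (k = 0; k < NNODES; k++)
        for (i = 0; i < NNODES; i++)
            for (j = 0; j < NNODES; j++) {
                int via = join(rel[i][k], rel[k][j]);
                if (via > rel[i][j])
                    rel[i][j] = via;
            }
}

/* Nonzero if some path between a and b (either way) crosses a sequence point. */
int separated_by_sequence_point(enum node a, enum node b)
{
    return rel[a][b] == ARROW || rel[b][a] == ARROW;
}
```

After build_diagram(), separated_by_sequence_point(ASSIGN_A, ASSIGN_B)
is 0, which is the missing line between the two assignments. By
contrast, ASSIGN_A and THIRTY are separated, because the path between
them crosses the second comma's sequence point.
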
Unfortunately the Standard expresses these rules by talking about
an interval "Between the previous and next sequence point". This
sounds like such intervals are uniquely determined (which they
aren't), or perhaps like there is an unspecified total ordering of
sequence points (which there isn't). What's intended is a partial
ordering determined by the operators in an expression (and in fact
two partial ordering relationships, one for 'value computations'
that involve only reading, and another for operations that store
into objects). How C90 and C99 express this is misleading at
times; it's more reliable (meaning more consistent with the C11
wording) to ask the question in terms of what the diagram looks
like.

>> What matters here is not the sequence points but the accesses
>> taking place inside a called function body. Evaluation of
>> function bodies does not overlap with the evaluation of other
>> expressions outside the called function (including expressions
>> in other called functions).
>>
>> So this
>>
>> int x;
>>
>> void set_x( int new_x ){ x = new_x; }
>>
>> ...
>>
>> (10, set_x( 1 ), 30) + (20, set_x( 2 ), 40)
>>
>> is defined (unspecified) behavior rather than undefined behavior,
>> despite the accesses to x being the same as in the assignment
>> example as far as sequence points go. [ADDED: probably I should
>> have mentioned that 'set_x(1) + set_x(2)' is also defined
>> (unspecified) behavior and not undefined behavior.]
>
> That seems strange to me. I thought the sequence points before and
> after the function calls here:
>
> set_x(1) + set_x(2)
>
> would be enough to establish that one call must finish before the other
> could start, even if it's unspecified which order they happen in. To
> interpret it that way doesn't require extra rules about non-overlap of
> function bodies.

The diagram for 'set_x(1) + set_x(2)' looks like this (simplifying
a bit):

         --> [set_x(1)] -- x = 1 --> ; --> [return] --
        /                                             \
--> ; --                                               + --> ; --
        \                                             /
         --> [set_x(2)] -- x = 2 --> ; --> [return] --

There are plenty of sequence points, but still no ordering line
between the two assignments. If all we have is the same rule for
sequence points that is used within expressions, this case is
still undefined behavior. I agree with what Richard Damon says,
that the more stringent rule for how function calls work cannot
be derived from what C90 or C99 says about sequencing (or at
least if it can I don't know how). Obviously it's what people
expect, but it doesn't really follow from the phrasing used in
C90 or C99.
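
As a concrete illustration of the function-call case, here is a
small sketch in C. The wrapper name sum_with_calls and the (void)
casts (which only silence compiler warnings about unused values) are
additions for illustration and are not part of the expression
discussed above. Because the order of the two calls is unspecified,
the only thing the sketch relies on afterwards is that x holds
either 1 or 2.

```c
#include <assert.h>

int x;

void set_x(int new_x) { x = new_x; }

/* Evaluates the function-call form of the expression and returns its value. */
int sum_with_calls(void)
{
    /* The two calls may run in either order, but they do not overlap. */
    int sum = ((void)10, set_x(1), 30) + ((void)20, set_x(2), 40);

    /* Unspecified which call ran last, so accept either stored value. */
    assert(x == 1 || x == 2);
    return sum;
}
```

The value of the expression is 30 + 40 whichever call runs first;
only the final value of x depends on the order. Replacing the calls
with plain assignments to x gives the earlier undefined case, so that
version is deliberately not shown as compilable code.
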
(Unfortunately there is no simple way to explain the rule for
function calls in terms of the diagrams. This difficulty may help
explain why the earlier attempts at defining formal models weren't
used in C99.)
I realize this answer may not be completely satisfactory, because it
doesn't directly respond to your intuition about how sequence points
work. The best I can think of to say is that other people have had
reactions much like yours, and it's taken the better part of 20
years to figure out how to say how evaluation sequencing is meant to
work (or not, in the cases where there is undefined behavior). The
good news is that now there is new phrasing in C11, and it is much
better at delineating the different cases unambiguously. So I hope
that either the explanation above or the new C11 description will
help bridge the gap.