Is this really unspecified behavior?

pon...@gmail.com

unread,

Nov 22, 2005, 12:20:16 AM11/22/05

to

C++03 5/4:
[Example:
i = v[i++]; // the behavior is unspecified
.
]

I wanna ask that if v is a std::vector which overloads operator[], is
this unspecified any more?

What I understand is that there's a sequence point at the entry and
exit of a function call, well, operator function certainly is a
function, so if we change that 'v' to a object of std::vector, v[i++]
becomes a function call, the side effect of which takes place before
the assignment operation, therefore 'i' gets a determinable value,
which is v[*old value of the i*], which of course is not unspecified.

That said, I wonder if this analysis is right, did I miss or
misunderstand anything?
If that is true, is this an evidence that some inconsistency, in some
extremely non-obvious way, exists between build-in operator and
operator function(informally known as 'overloaded operator').

Another question is:

What is the side-effect of 'i++' actually? Two options, first of which
is "fetch i from memory, add it by 1, write the new value back", second
is "write the new value stored previously somewhere into the storage of
'i'". Is the answer 'both' or 'either' or whatever? Plus, the words
below(also excerpted from [c++03;5/4])are really puzzling to me, can
anyone explain it please? Does it have anything to do with the two
questions I asked?

[C++03;5/4]"Between the previous and next sequence point a scalar
object shall have its stored value modified at most once by the
evaluation of an expression. Furthermore, the prior value shall be
accessed only to determine the value to be stored.The requirements of
this paragraph shall be met for each allowable ordering of the
subexpressions of a full expression; otherwise the behavior is
undefined."

Any help is appreciated;-)

---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std...@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]

kuy...@wizard.net

unread,

Nov 22, 2005, 10:02:42 AM11/22/05

to

pon...@gmail.com wrote:
> C++03 5/4:
> [Example:
> i = v[i++]; // the behavior is unspecified
> .
> ]
>
> I wanna ask that if v is a std::vector which overloads operator[], is
> this unspecified any more?
>
> What I understand is that there's a sequence point at the entry and
> exit of a function call, well, operator function certainly is a
> function, so if we change that 'v' to a object of std::vector, v[i++]
> becomes a function call, the side effect of which takes place before
> the assignment operation, therefore 'i' gets a determinable value,
> which is v[*old value of the i*], which of course is not unspecified.
>
> That said, I wonder if this analysis is right, did I miss or
> misunderstand anything?

Yes, that's correct. When operators are overloaded, it's actually just
shorthand for function calls, with all of the corresponding sequence
points.

> If that is true, is this an evidence that some inconsistency, in some
> extremely non-obvious way, exists between build-in operator and
> operator function(informally known as 'overloaded operator').

Yes, this is one of the several inconsistencies between them. Backwards
compatibility with C was an important objective during the development
of C++, which prevented complete consistency between built-in and
user-defined operators. Note: the phrase "overloaded operator" refers
to the operator that is being overloaded. The standard uses the term
"overloaded operator function" for the function that overloads the
overloaded operator, so there's nothing particularly informal about
that terminology.

> Another question is:
>
> What is the side-effect of 'i++' actually? Two options, first of which
> is "fetch i from memory, add it by 1, write the new value back", second
> is "write the new value stored previously somewhere into the storage of
> 'i'". Is the answer 'both' or 'either' or whatever?

Accessing the value of 'i' is not a side effect unless 'i' was declared
volatile. Otherwise, the only side effect of i++ is the writing of the
new value. The read of the previous value (if not volatile), and the
calculation of the new value, are the "main" effect. See 1.9p7 for the
exact definition of "side effect".

> Plus, the words
> below(also excerpted from [c++03;5/4])are really puzzling to me, can
> anyone explain it please? Does it have anything to do with the two
> questions I asked?
>
> [C++03;5/4]"Between the previous and next sequence point a scalar
> object shall have its stored value modified at most once by the
> evaluation of an expression.

This part is pretty straightforward. Several different operators can
modifiy the value of an object: all of the assignment operators, ++
and --. If two such operators modify the value of the same object
without an intervening sequence point, it's a violation of 5p4.

As a practical matter, this is because the absence of a sequence point
allows the implementor to rearrange the generated machine code, so that
there's no telling which of the two modifications will occur first, and
it may even be that the two modifications interfere with each other to
produce a result that's different from what would have happened if
either modification had been the only one.

> Furthermore, the prior value shall be
> accessed only to determine the value to be stored.The requirements of
> this paragraph shall be met for each allowable ordering of the
> subexpressions of a full expression; otherwise the behavior is
> undefined."

This one is much more difficult. If you read the current value of an
object, and then write to that same object, without an intervening
sequence point, then you can't read the value for any purpose other
than determining the value that is to be written.

Again, as a practical matter that's because the absences of a sequence
point allows the read and the write to be in any order, and it even
allows them to interfere with each other. It's allowed only if it's
inherently impossible to know what value to write, until you've
finished reading the value; that guarantees that the read and the write
have to be in the right order, and can't interfere with each other,
even in the absence of a sequence point.

However, there's a lot of confusion and argument about what constitutes
a use of the value for that purpose. The safest thing to do is to avoid
any construct that might be construed as using the previous value for
any other purpose.

Hyman Rosen

unread,

Nov 22, 2005, 8:54:04 PM11/22/05

to

kuy...@wizard.net wrote:
> Yes, this is one of the several inconsistencies between them. Backwards
> compatibility with C was an important objective during the development
> of C++, which prevented complete consistency between built-in and
> user-defined operators.

Specifying order of evaluation completely in C++ would
be completely backwards-compatible with C, though, so
that's no excuse here. 'i = v[i++];' can and should be
the same regardless of whether v is vector or built-in,
and that would not be inconsistent with C.

> However, there's a lot of confusion and argument about what constitutes
> a use of the value for that purpose. The safest thing to do is to avoid
> any construct that might be construed as using the previous value for
> any other purpose.

The right thing to do is to get rid of this stupidity from the
language once and for all and define the order of evaluation
completely, including when side effects happen, as strictly
left-to-right and operands before operation.

Andrew Koenig

unread,

Nov 23, 2005, 12:52:04 AM11/23/05

to

"Hyman Rosen" <hyr...@mail.com> wrote in message
news:E1EeZw5-...@chx400.switch.ch...
> kuy...@wizard.net wrote:

> Specifying order of evaluation completely in C++ would
> be completely backwards-compatible with C, though, so
> that's no excuse here.

The trouble is that if order of evaluation were completely specified in C++,
there might be C++ compilers that would be required to give different
results from existing C implementations for the same program (assuming, of
course, that the program was in the intersection of C and C++).

Greg Herlihy

unread,

Nov 23, 2005, 12:53:09 AM11/23/05

to

pon...@gmail.com wrote:
> C++03 5/4:
> [Example:
> i = v[i++]; // the behavior is unspecified
> .
> ]
>
> I wanna ask that if v is a std::vector which overloads operator[], is
> this unspecified any more?

> What I understand is that there's a sequence point at the entry and
> exit of a function call, well, operator function certainly is a
> function, so if we change that 'v' to a object of std::vector, v[i++]
> becomes a function call, the side effect of which takes place before
> the assignment operation, therefore 'i' gets a determinable value,
> which is v[*old value of the i*], which of course is not unspecified.
>
> That said, I wonder if this analysis is right, did I miss or
> misunderstand anything?

Your analysis is correct. With an overloaded operator[], the evaluation
of the expression i=v[i++] is no longer unspecified, but is, in fact,
defined.

> If that is true, is this an evidence that some inconsistency, in some
> extremely non-obvious way, exists between build-in operator and
> operator function(informally known as 'overloaded operator').

Absolutely. There is no requirement that an overloaded operator behave
at all like the built-in operator. For example, a program could
overload the assignment operator (=) to test for equality, and overload
the equality operator (==) to perform an assignment, for any
user-declared type. It would be perfectly legal for a program to do so,
though probably not a particularly good idea. At least not if one
values consistency. But rather than have the Standard restrict
overloaded operators in some arbitrary manner, it simply leaves their
implementation up to the programmer's own good judgement.

> Another question is:
>
> What is the side-effect of 'i++' actually? Two options, first of which
> is "fetch i from memory, add it by 1, write the new value back", second
> is "write the new value stored previously somewhere into the storage of
> 'i'". Is the answer 'both' or 'either' or whatever? Plus, the words
> below(also excerpted from [c++03;5/4])are really puzzling to me, can
> anyone explain it please? Does it have anything to do with the two
> questions I asked?

It depends on the context in which i++ appears. As a function
parameter, i++ must be evaluated before the function call is made.
Since the function must be passed the value of i before it is
incremented, the compiler must first copy i, increment it, and then
pass the copy of i to the function being called.

> [C++03;5/4]"Between the previous and next sequence point a scalar
> object shall have its stored value modified at most once by the
> evaluation of an expression. Furthermore, the prior value shall be
> accessed only to determine the value to be stored.The requirements of
> this paragraph shall be met for each allowable ordering of the
> subexpressions of a full expression; otherwise the behavior is
> undefined."

I believe the Standard has some examples to illustrate when result of
evaluating an expression is unspecified and when it is undefined.
Essentally, the result of evaluating i = v[i++] is unspecified because
i's value is accessed only once. The evaluation of i++ = v[i++] would
be undefined, since i's value is accessed more than once between
sequence points.

Greg

Hyman Rosen

unread,

Nov 23, 2005, 9:45:36 AM11/23/05

to

Greg Herlihy wrote:
> Essentally, the result of evaluating i = v[i++] is unspecified because
> i's value is accessed only once.

No, it's undefined in the built-in case, because i is modified
twice in the same expression (by the increment and by the
assignment) without an intervening sequence point.

Once again, I invite everyone in the newsgroup to notice how
no one understands the rules as they exist. It's asinine not
to have a defined order of evaluation, including side effects.

Edward Diener No Spam

unread,

Nov 23, 2005, 11:40:36 PM11/23/05

to

Hyman Rosen wrote:
> Greg Herlihy wrote:
>
>> Essentally, the result of evaluating i = v[i++] is unspecified because
>> i's value is accessed only once.
>
>
> No, it's undefined in the built-in case, because i is modified
> twice in the same expression (by the increment and by the
> assignment) without an intervening sequence point.
>
> Once again, I invite everyone in the newsgroup to notice how
> no one understands the rules as they exist.

Obviously not true except as hyperbole.

> It's asinine not
> to have a defined order of evaluation, including side effects.

I totally agree with you here. With all due respect to Mr. Stroustrup's
printed opinion about the importance of C++ maintaining compatibility
with the C language, I also feel that at some time in the future, and I
hope it is the near future, C++ should stop trying to maintain
compatibility with the C language and do the right things as far as its
own C++ language specification is concerned. This is just one of many
other areas, which have been mentioned in numerous other posts on these
NGs, where compatibility with the C language is holding C++ back from
advancing as a language of its own.

pon...@gmail.com

unread,

Nov 23, 2005, 11:45:31 PM11/23/05

to

Hyman Rosen wrote:
> No, it's undefined in the built-in case, because i is modified
> twice in the same expression (by the increment and by the
> assignment) without an intervening sequence point.

yeah, that's where my doubt lies exactly. The standard says that "i =
v[i++]" has unspecified behavior, but according to what you have said,
it should just be undefined behavior.
Is there anything wrong with the corresponding standard wording?

BTW. If what you said is true, similarly the expression " i =
i++"/"i=++i"(whatever) would have undefined behavior,too. It's just
that I couldn't believe this little simple expression has undefined
behavior, though, I think you were right anyway.

Plus, to me it seems very, very confusing to separate the concept of
"evaluation" from that of "side-effect", in particular, the evaluation
of an expression doesn't necessarily mean that the side-effect of this
evaluation takes place simultaneously. Sometimes this behavior is quite
puzzling, it's just counterintuitive.

kuy...@wizard.net

unread,

Nov 23, 2005, 11:40:23 PM11/23/05

to

"Andrew Koenig" wrote:
> "Hyman Rosen" <hyr...@mail.com> wrote in message
> news:E1EeZw5-...@chx400.switch.ch...

.

> > Specifying order of evaluation completely in C++ would
> > be completely backwards-compatible with C, though, so
> > that's no excuse here.
>
> The trouble is that if order of evaluation were completely specified in C++,
> there might be C++ compilers that would be required to give different
> results from existing C implementations for the same program (assuming, of
> course, that the program was in the intersection of C and C++).

As long as the specified order of evaluation under new rules was the
same as one of the permitted orders of evaluation under the current
rules, code which depends upon a different order of evaluation is (even
under the current rules) non-portable. There's only a limited degree to
which I care about what goes wrong with such code.

kuy...@wizard.net

unread,

Nov 23, 2005, 11:40:26 PM11/23/05

to

Hyman Rosen wrote:
> kuy...@wizard.net wrote:
> > Yes, this is one of the several inconsistencies between them. Backwards
> > compatibility with C was an important objective during the development
> > of C++, which prevented complete consistency between built-in and
> > user-defined operators.
>
> Specifying order of evaluation completely in C++ would
> be completely backwards-compatible with C, though, so
> that's no excuse here.

A less extreme possibility would have been to give built-in operators
the same exact sequence points they would have had if there were
actually user-defined operator overloads. That would produce almost the
same effect.

I wasn't involved, so I don't know what the actual reasons for this
decision were, but I suspect that it was considered desireable that
built-in operators would retain all of the opportunities for
optimization allowed by the C sequence point rules. However, you
couldn't easily fit operator overloads into such a scheme, without
giving them the same sequence points as the corresponding function
calls.

Razzer

unread,

Nov 24, 2005, 1:30:47 AM11/24/05

to

"Andrew Koenig" wrote:
> "Hyman Rosen" <hyr...@mail.com> wrote in message
> news:E1EeZw5-...@chx400.switch.ch...
> > kuy...@wizard.net wrote:
>
> > Specifying order of evaluation completely in C++ would
> > be completely backwards-compatible with C, though, so
> > that's no excuse here.
>
> The trouble is that if order of evaluation were completely specified in C++,
> there might be C++ compilers that would be required to give different
> results from existing C implementations for the same program (assuming, of
> course, that the program was in the intersection of C and C++).

Why's that? AFAICS, defining the order of evaluation in cases where it
is undefined in C should not have to worry about giving different
results since there is no set result in C. The only time, AFAIK, you
could not set a definate order of evaluation is the evaluation of
function arguments without potentially getting different results
between C and C++. However, I think the order of evaluation of
arguments causes such a minor set of problems in C++ that one could
leave it how it is while defining other areas of order of evaluation
and still get a substantial benefit.

Ron Natalie

unread,

Nov 24, 2005, 2:03:49 PM11/24/05

to

pon...@gmail.com wrote:
> Hyman Rosen wrote:

> yeah, that's where my doubt lies exactly. The standard says that "i =
> v[i++]" has unspecified behavior, but according to what you have said,
> it should just be undefined behavior.
> Is there anything wrong with the corresponding standard wording?
>

The order of evaluation is unspecified, but changing the value twice
between sequence points is undefined behavior. With built-in
operators there are no sequence points other than the end of the
full expression.

Fourth paragraph of Chapter 5 -- Expressions from the standard.

Except where noted, the order of evaluation of operands of individual
operators and subexpressions of individual
expressions, and the order in which side effects take place, is
unspecified.53) Between the previous

and next sequence point a scalar object shall have its stored value
modified at most once by the evaluation
of an expression. Furthermore, the prior value shall be accessed only to
determine the value to be stored.
The requirements of this paragraph shall be met for each allowable
ordering of the subexpressions of a full

expression; otherwise the behavior is undefined. [Example:

i = v[i++]; // the behavior is unspecified

i = 7, i++, i++; // i becomes 9
i = ++i + 1; // the behavior is unspecified
i = i + 1; // the value of i is incremented
—end example]

Andrew Koenig

unread,

Nov 24, 2005, 3:44:07 PM11/24/05

to

<kuy...@wizard.net> wrote in message
news:1132759176.3...@g43g2000cwa.googlegroups.com...

> As long as the specified order of evaluation under new rules was the
> same as one of the permitted orders of evaluation under the current
> rules, code which depends upon a different order of evaluation is (even
> under the current rules) non-portable. There's only a limited degree to
> which I care about what goes wrong with such code.

Evidently you're not a compiler vendor :-)

On several occasions I have heard representatives of compiler vendors
telling the standards committee that if the standard mandated behavior that
forced them to change their implementations in particular ways, they would
ship a nonconforming implementation rather than alienate their customers.
The unfortunate fact is that when people's programs change behavior from one
version of a compiler to another, they complain--and typically they don't
care whether the change was because their program relied on behavior that
isn't guaranteed.

I'm not saying that things should be this way--in fact, I wish they weren't.
But you can't ignore reality by wishing it away.

Andrew Koenig

unread,

Nov 24, 2005, 3:44:16 PM11/24/05

to

"Razzer" <coolm...@gmail.com> wrote in message
news:1132809799....@z14g2000cwz.googlegroups.com...

> Why's that? AFAICS, defining the order of evaluation in cases where it
> is undefined in C should not have to worry about giving different
> results since there is no set result in C.

Part of the point is to be able to translate a C++ expression into the
equivalent C expression without changing its meaning. That desire implies
that if a C expression is undefined, the corresponding C++ expression must
also be undefined.

Herb Sutter

unread,

Nov 24, 2005, 8:03:27 PM11/24/05

to

On Thu, 24 Nov 2005 04:40:36 GMT, eldiener_n...@earthlink.net
(Edward Diener No Spam) wrote:
>Hyman Rosen wrote:
>> Greg Herlihy wrote:
>>> Essentally, the result of evaluating i = v[i++] is unspecified because
>>> i's value is accessed only once.
>>
>> No, it's undefined in the built-in case, because i is modified
>> twice in the same expression (by the increment and by the
>> assignment) without an intervening sequence point.
>>
>> Once again, I invite everyone in the newsgroup to notice how
>> no one understands the rules as they exist.
>
>Obviously not true except as hyperbole.

Hyman's not exaggerating that much. In my experience at least, the great
majority of programmers aren't aware of the (lack of) rules, and the
minority who do understand the rules still tend to forget them
occasionally. I know it bites me every so often.

>> It's asinine not
>> to have a defined order of evaluation, including side effects.
>
>I totally agree with you here. With all due respect to Mr. Stroustrup's
>printed opinion about the importance of C++ maintaining compatibility
>with the C language, I also feel that at some time in the future, and I
>hope it is the near future, C++ should stop trying to maintain
>compatibility with the C language and do the right things as far as its
>own C++ language specification is concerned. This is just one of many
>other areas, which have been mentioned in numerous other posts on these
>NGs, where compatibility with the C language is holding C++ back from
>advancing as a language of its own.

I don't think you'd get as much push-back from Bjarne as you think. :-)

Evalution reordering is a well-known source of difficulty for programmers
and it has been raised about every other month for at least two decades.
But the reason why it isn't fixed in C++ (or C) has nothing to do with C
compatibility. The real reason is performance: When you talk about nailing
down the order of evaluation, the first howls of protest usually come from
the people who write code optimizers and who demand to know why on earth
you want to tie their hands like this and turn off optimization
opportunities they want to exploit.

BTW, there's a direct parallel (pardon the pun) between this issue and the
issue of instruction reordering, especially memory read/write reordering,
anywhere in the tool/hardware chain right down to the processor itself.
It's common wisdom in the hardware world that a sequentially consistent
memory model (to simplify a little, this means among other things that the
chip doesn't get the flexibility to reorder memory reads or writes and
must follow exactly what's in the source code) is nice, but nobody
actually ships it because it is believed to be too slow for practical use.
This is part of what I had in mind when I wrote:

Chip designers are under so much pressure to deliver ever-faster
CPUs that they’ll risk changing the meaning of your program,
and possibly break it, in order to make it run faster

And:

Two noteworthy examples in this respect are write reordering and
read reordering: Allowing a processor to reorder write operations
has consequences that are so surprising, and break so many
programmer expectations, that the feature generally has to be
turned off because it’s too difficult for programmers to reason
correctly about the meaning of their programs in the presence
of arbitrary write reordering. Reordering read operations can also
yield surprising visible effects, but that is more commonly left
enabled anyway because it isn’t quite as hard on programmers,

Note that evaluation reordering falls into the same category as read
reorder, including this final comment:

and the demands for performance cause designers of operating
systems and operating environments to compromise and choose
models that place a greater burden on programmers because that
is viewed as a lesser evil than giving up the optimization
opportunities.

-- "The Free Lunch Is Over"
http://www.gotw.ca/publications/concurrency-ddj.htm

Have a nice day,

Herb

---
Herb Sutter (www.gotw.ca) (www.pluralsight.com/blogs/hsutter)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)

Hyman Rosen

unread,

Nov 25, 2005, 1:56:40 PM11/25/05

to

Andrew Koenig wrote:
> Part of the point is to be able to translate a C++ expression into the
> equivalent C expression without changing its meaning.

Since when? Certainly there was a goal that C code should
carry forward into C++ with its meaning generally unchanged,
but why would anyone care about going the other way? And if
that is what you want, you have to avoid all sorts of C++
constructs anyway, so you would just avoid C-ambiguous
expressions as well.

Hyman Rosen

unread,

Nov 25, 2005, 1:56:07 PM11/25/05

to

Herb Sutter wrote:
> The real reason is performance: When you talk about nailing down
> the order of evaluation, the first howls of protest usually come
> from the people who write code optimizers

Except that this is a bogus reason. We want to specify the exact
meaning of language constructs, not their implementation. This
affects optimizers only in the rare cases of expressions which
are actually ambiguous; obviously most are not, and their evaluation
can be ordered in any way the compiler sees fit by the as-if rules.

It seems to me that such complaints amount to "I'm so smart that it
would be bad to constrain me, but I'm so stupid that I don't know
that I'm not constrained." I don't think that language semantics
should be driven by that kind of consideration.

Hyman Rosen

unread,

Nov 25, 2005, 1:56:16 PM11/25/05

to

Andrew Koenig wrote:
> But you can't ignore reality by wishing it away.

So that's why the committee decide to abandon two-phase
name lookup in templates?

Hyman Rosen

unread,

Nov 25, 2005, 1:56:43 PM11/25/05

to

Razzer wrote:
> However, I think the order of evaluation of arguments causes
> such a minor set of problems in C++ that one could leave it
> how it is

No, absolutely not. Leaving argument evaluation underspecified
is what caused the infamous 'f(auto_ptr, auto_ptr)' problem.
If we can define the order of initialization of class members
we can do the same for arguments. No half measures!

kuy...@wizard.net

unread,

Nov 25, 2005, 1:57:22 PM11/25/05

to

"Andrew Koenig" wrote:
> <kuy...@wizard.net> wrote in message
> news:1132759176.3...@g43g2000cwa.googlegroups.com...
>
> > As long as the specified order of evaluation under new rules was the
> > same as one of the permitted orders of evaluation under the current
> > rules, code which depends upon a different order of evaluation is (even
> > under the current rules) non-portable. There's only a limited degree to
> > which I care about what goes wrong with such code.
>
> Evidently you're not a compiler vendor :-)

.

> But you can't ignore reality by wishing it away.

I'm not wishing it away; all I said is that I don't care about such
code. Compiler vendors, as you've pointed out, do have to care about
such code if it's become a widely used idiom. In the context of this
discussion, I doubt that "i=v[i++];" is in that category. On the other
hand, there are probably some widely used idioms where the order of
evaluation is both important and (at best) unpsecified.

Herb Sutter

unread,

Nov 25, 2005, 8:11:38 PM11/25/05

to

On Fri, 25 Nov 2005 18:56:07 GMT, hyr...@mail.com (Hyman Rosen) wrote:
>It seems to me that such complaints amount to "I'm so smart that it
>would be bad to constrain me, but I'm so stupid that I don't know
>that I'm not constrained." I don't think that language semantics
>should be driven by that kind of consideration.

(BTW, that post was an incomplete one -- I reposted a completed version.)

I tend to agree, but in my other complete post I ask for data. Data is
probably hard to get and pin down because this stuff will vary a lot by
hardware architecture and/or the optimizer you test with, but there must
be papers about this.

It certainly is taken for granted among the hardware designer community
and the optimizer writer community alike that the ability to reorder work
is very important for performance.

Herb

---
Herb Sutter (www.gotw.ca) (www.pluralsight.com/blogs/hsutter)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)

---

Herb Sutter

unread,

Nov 25, 2005, 8:12:33 PM11/25/05

to

[I sent a partial version of this message too soon. This is the complete
version.]

There are to interrelated issues on this thread, and I'll try to address
both of them. They are:

1. sequence points
2. (lack of) rules on order of evaluation of function arguments

Both issues are around leaving the implementation latitude to reorder
work, and the tradeoff is between leaving the rules loose so that
optimizers can generate better code and tightening the rules so that
programmers have less of a burden to understand their programs.

I'll show below that exactly the same tradeoff comes in a third related
case of reordering not yet mentioned in this thread:

3. instruction reordering, especially memory read/write reordering

Note that #3 is of very current interest because memory access ordering
guarantees are a key part of the C++0x memory model for concurrency now
under development.

Coming in after Hyman and Edward, who were talking first about issue #1:

Edward Diener wrote:
>Hyman Rosen wrote:
>> Greg Herlihy wrote:
>>> Essentally, the result of evaluating i = v[i++] is unspecified because
>>> i's value is accessed only once.
>>
>> No, it's undefined in the built-in case, because i is modified
>> twice in the same expression (by the increment and by the
>> assignment) without an intervening sequence point.
>>
>> Once again, I invite everyone in the newsgroup to notice how
>> no one understands the rules as they exist.
>
>Obviously not true except as hyperbole.

Hyman is not exaggerating at all. There is perhaps no greater point of
misunderstanding of C than sequence points.

Virtually all C++ experts, including C++ committee members and including
myself until the last Santa Cruz meeting in fall 2002, think they
understand sequence points. We typically view them as an unfortunate and
complicated wart we inherited from C, but one that is at least well
understood in the C committee. That is not true.

My eyes were opened when I attended the fall 2002 C meeting, and saw a
detailed presentation of one person's theory of how sequence points
probably work, how they probably should work, and what we're still not
sure about. It was a revelation to me to learn that this was only the most
recent part of an ongoing series of discussions and debates within the C
committee about what exactly sequence points are, how they actually work,
and how they should work. Even if you know all the detailed reasons why "i
= i++;" is indeterminate, nobody should assume that the way C90 or C99
sequence points are specified is either well understood or that
implementations are consistent in the details.

Now we segue over to the related issue #2 of order of evaluation of
function arguments:

>> It's asinine not
>> to have a defined order of evaluation, including side effects.
>
>I totally agree with you here.

Me too. In my experience at least, most C/C++ programmers aren't aware of
the (lack of) rules around evaluation ordering, and the minority who do

understand the rules still tend to forget them occasionally. I know it
bites me every so often.

>With all due respect to Mr. Stroustrup's

>printed opinion about the importance of C++ maintaining compatibility
>with the C language, I also feel that at some time in the future, and I
>hope it is the near future, C++ should stop trying to maintain
>compatibility with the C language and do the right things as far as its
>own C++ language specification is concerned.

Aside: I don't think you'd get as much push-back from Bjarne as you think.
:-)

>This is just one of many

>other areas, which have been mentioned in numerous other posts on these
>NGs, where compatibility with the C language is holding C++ back from
>advancing as a language of its own.

Not exactly. Both sequence points and evalution reordering are a
well-known sources of difficulty for programmers, and these issues have
been raised about every other month for as long as C++ and C have existed.
But the reason for these relaxed rules, and why they aren't nailed down in
C++ (or C), actually has nothing to do with C compatibility or ineptitude
or any of the other usual suspects.

The real reason why C and C++ have this latitude is for performance. When

you talk about nailing down the order of evaluation, the first howls of

protest come, not from language designers or compiler front-end writers
(who would be only too happy to oblige, because who likes having flaky
corner cases?), but from the people who write code optimizers and who
demand to know 'why on earth you want to tie our hands like this and turn
off optimization opportunities we want to exploit for you -- do you
_really_ want your code to run slow?'

There is a direct relationship between these two issues and issues #3: the

issue of instruction reordering, especially memory read/write reordering,
anywhere in the tool/hardware chain right down to the processor itself.
It's common wisdom in the hardware world that a sequentially consistent

memory model is nice (to simplify a little, SC means that the hardware
must not reorder memory reads or writes and must follow exactly what's in
the source code), but nobody actually ships that because it is believed to
be too slow for practical use. It's quite a revelation to most people
doing lock-free concurrent programming for the first time that the memory
reads and writes they put into their code might not be respected at all,
if the processor (or the optimizer, or any other part of the chain)
decides it would rather do things in a different order than you asked for,
so sorry.

This is part of what I had in mind when I wrote (in "The Free Lunch Is
Over," http://www.gotw.ca/publications/concurrency-ddj.htm):

Chip designers are under so much pressure to deliver ever-faster
CPUs that they’ll risk changing the meaning of your program,
and possibly break it, in order to make it run faster

And:

Two noteworthy examples in this respect are write reordering and
read reordering: Allowing a processor to reorder write operations
has consequences that are so surprising, and break so many
programmer expectations, that the feature generally has to be
turned off because it’s too difficult for programmers to reason
correctly about the meaning of their programs in the presence
of arbitrary write reordering. Reordering read operations can also
yield surprising visible effects, but that is more commonly left
enabled anyway because it isn’t quite as hard on programmers,

Note that the latitude around sequence points and evaluation reordering
falls into the same category as read reordering, including this final
comment:

and the demands for performance cause designers of operating
systems and operating environments to compromise and choose
models that place a greater burden on programmers because that
is viewed as a lesser evil than giving up the optimization
opportunities.

That's why we have at least some read reordering in nearly all memory
models now in production, why we have argument evaluation reordering, and
why we have the latitude around sequence points in all its inglory. It is
believed to be necessary. No, I can't point to measurements, but I'm sure
someone can and I would be interested to see real data about how much
nailing each of these down would cost for standard optimizers on various
popular architectures (alas, that's fairly hard to measure).

Herb

---
Herb Sutter (www.gotw.ca) (www.pluralsight.com/blogs/hsutter)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)

---

Ron Natalie

unread,

Nov 26, 2005, 3:46:29 PM11/26/05

to

Hyman Rosen wrote:
> Andrew Koenig wrote:
>> Part of the point is to be able to translate a C++ expression into the
>> equivalent C expression without changing its meaning.
>
> Since when? Certainly there was a goal that C code should
> carry forward into C++ with its meaning generally unchanged,
> but why would anyone care about going the other way? And if
> that is what you want, you have to avoid all sorts of C++
> constructs anyway, so you would just avoid C-ambiguous
> expressions as well.
>

It's not true anyhow. C++ already has differences that
keep expressions from functioning the same. The difference
in typing of char literals for example.

David Abrahams

unread,

Nov 26, 2005, 7:57:08 PM11/26/05

to

r...@spamcop.net (Ron Natalie) writes:

> [Example:
> i = v[i++]; // the behavior is unspecified
> i = 7, i++, i++; // i becomes 9
> i = ++i + 1; // the behavior is unspecified
> i = i + 1; // the value of i is incremented
> —end example]

FWIW, examples are non-normative. If your argument holds up based on
the rest of the text, you're OK. Otherwise the example is broken and
needs to be fixed.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Krzysztof Zelechowski

unread,

Nov 26, 2005, 8:26:57 PM11/26/05

to

Uzytkownik "Hyman Rosen" <hyr...@mail.com> napisal w wiadomosci
news:2005112514443...@mscan6.ucar.edu...

> Andrew Koenig wrote:
>> Part of the point is to be able to translate a C++ expression into the
>> equivalent C expression without changing its meaning.
>
> Since when? Certainly there was a goal that C code should
> carry forward into C++ with its meaning generally unchanged,
> but why would anyone care about going the other way? And if
> that is what you want, you have to avoid all sorts of C++
> constructs anyway, so you would just avoid C-ambiguous
> expressions as well.
>

That is how the C++ front end works: it translates C++ to C. But there is
an easy workaround: create an automatic variable for each function argument;
assign the values to the local variables one by one creating sequence points
between them; call the function with the corresponding local variables.

Happy coding
Chris

David Abrahams

unread,

Nov 27, 2005, 1:35:41 AM11/27/05

to

Hyman, Edward, and Herb wrote:

>> It's asinine not to have a defined order of evaluation, including
>> side effects.
>
> I totally agree with you here.

Me too.

hsu...@gotw.ca (Herb Sutter) writes:

> The real reason why C and C++ have this latitude is for performance.

The other reason is backward compatibility. Even when order of
evaluation is unpredictable, many lines of code have been written that
depend on a particular evaluation order -- the one that happened to be
chosen by the compiler. If you force a specified evaluation order you
will break any such code that isn't already using the order
specified. For many vendors, that kind of breakage induces
unacceptable friction with the customer base.

Between this argument and the one about performance, there seem at
least to be plausible reasons for the status quo. Labelling it
"asinine" not to define an order of evaluation seems inappropriate.
As a practical matter, I'm unlikely to buy into an argument about
anything but the most obvious mistakes if it comes with language like
that because of what it says about the speaker's willingness to give
counterarguments their due.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---

Ron Natalie

unread,

Nov 27, 2005, 5:26:32 PM11/27/05

to

David Abrahams wrote:
> r...@spamcop.net (Ron Natalie) writes:
>
>> [Example:
>> i = v[i++]; // the behavior is unspecified
>> i = 7, i++, i++; // i becomes 9
>> i = ++i + 1; // the behavior is unspecified
>> i = i + 1; // the value of i is incremented
>> —end example]
>
> FWIW, examples are non-normative. If your argument holds up based on
> the rest of the text, you're OK. Otherwise the example is broken and
> needs to be fixed.
>

I know that, but the text that immediately preceeds the example
CLEARLY supports the point I was making and the examples are
consistent with what the preceding text you omitted.

Ron Natalie

unread,

Nov 27, 2005, 5:26:40 PM11/27/05

to

David Abrahams wrote:

> The other reason is backward compatibility. Even when order of
> evaluation is unpredictable, many lines of code have been written that
> depend on a particular evaluation order -- the one that happened to be
> chosen by the compiler. If you force a specified evaluation order you
> will break any such code that isn't already using the order
> specified. For many vendors, that kind of breakage induces
> unacceptable friction with the customer base.

That's spurious. The code is already broken. The compiler is free
to pick a different ordering at whim. If you recompile the
application, perhaps with a newer version of the compiler or
a different compiler, or perhaps just a different set of compiler
flags, the results will be different.

Ron Natalie

unread,

Nov 27, 2005, 5:26:42 PM11/27/05

to

Krzysztof Zelechowski wrote:

> That is how the C++ front end works: it translates C++ to C.

There's no such standard concept as a C++ front end. The language
does not define nor proposes any convention for a conversion of
C++ to C.

There are already a number of constructs in C++ that are DIFFERENT
from how they are implemented in C++.

The answer here is that C++ leaves the behavior undefined for
the same reason C does, the supposed advantage to prescribing
an ordered behavior doesn't override efficiency concerns.

David Abrahams

unread,

Nov 27, 2005, 7:13:51 PM11/27/05

to

Ron Natalie <r...@spamcop.net> writes:

> David Abrahams wrote:
>
>> The other reason is backward compatibility. Even when order of
>> evaluation is unpredictable, many lines of code have been written that
>> depend on a particular evaluation order -- the one that happened to be
>> chosen by the compiler. If you force a specified evaluation order you
>> will break any such code that isn't already using the order
>> specified. For many vendors, that kind of breakage induces
>> unacceptable friction with the customer base.
>
> That's spurious. The code is already broken.

No, the code is nonportable. It works in the context where it is
expected to. It does no good for a vendor to respond with "fix your
code; it is broken" if the important customers say, "fine, we'll go
with someone else who can provide stability across versions."

> The compiler is free to pick a different ordering at whim.

Often it is not. Compiler implementors constrain themselves beyond
what the standard specifies, in order to satisfy various customer
demands -- even unreasonable ones. Often they constrain themselves in
ways that provide "guarantees" they won't document, such as stable
evaluation order.

> If you recompile the application, perhaps with a newer version of
> the compiler

Unless the compiler implementor constrains himself to avoid changing
evaluation order. That *does* happen.

> or a different compiler, or perhaps just a different set of compiler
> flags, the results will be different.

All possibly true. It's also possible for customers to develop
expectations of behavior that isn't guaranteed by the standard or
documented by vendors (actually if you think hard about it I bet
you'll find you have some such expectations). Vendors sometimes will
choose to meet those expectations in order to stay in business.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---

Hyman Rosen

unread,

Nov 28, 2005, 12:29:24 AM11/28/05

to

David Abrahams wrote:
> Between this argument and the one about performance, there seem at
> least to be plausible reasons for the status quo. Labelling it
> "asinine" not to define an order of evaluation seems inappropriate.

This is where we start quoting Emerson.
"A foolish consistency is the hobgoblin of little minds."

> As a practical matter, I'm unlikely to buy into an argument about
> anything but the most obvious mistakes if it comes with language like
> that because of what it says about the speaker's willingness to give
> counterarguments their due.

Believe me, I'm giving those counterarguments *more* than
they're due.

We've got made-up customers who are so tied to a vendor
that they demand compatibility for undocumented and
unspecified features but who are still able to change
vendors if they don't get it.

We've got made-up optimizers who are so brilliant that
they mustn't be constrained for fear of slowing the code,
but who are at the same time so stupid that they can't
figure out the ordinary cases where order doesn't matter.

We've got a (probably) made-up Standard Committee who is
so responsive to compiler vendors that they won't agree to
specify formerly unspecifed behavior, but did standardize
two-phase name lookup in templates, which broke every
implementation in existence, half of whom still haven't
caught up years later.

P.J. Plauger

unread,

Nov 29, 2005, 1:15:46 AM11/29/05

to

"Hyman Rosen" <hyr...@mail.com> wrote in message

news:1twif.776$iZ3.758@trndny03...

> We've got made-up customers who are so tied to a vendor
> that they demand compatibility for undocumented and
> unspecified features but who are still able to change
> vendors if they don't get it.
>
> We've got made-up optimizers who are so brilliant that
> they mustn't be constrained for fear of slowing the code,
> but who are at the same time so stupid that they can't
> figure out the ordinary cases where order doesn't matter.
>
> We've got a (probably) made-up Standard Committee who is
> so responsive to compiler vendors that they won't agree to
> specify formerly unspecifed behavior, but did standardize
> two-phase name lookup in templates, which broke every
> implementation in existence, half of whom still haven't
> caught up years later.

Welcome to the wonderful world of software. Every one of the
things you cite above is true, contradictions and all.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

David Abrahams

unread,

Nov 29, 2005, 1:16:17 AM11/29/05

to

Hyman Rosen <hyr...@mail.com> writes:

> David Abrahams wrote:
>
>> Between this argument and the one about performance, there seem at
>> least to be plausible reasons for the status quo. Labelling it
>> "asinine" not to define an order of evaluation seems inappropriate.
>
> This is where we start quoting Emerson.
> "A foolish consistency is the hobgoblin of little minds."

Consistency with what?

>> As a practical matter, I'm unlikely to buy into an argument about
>> anything but the most obvious mistakes if it comes with language
>> like that because of what it says about the speaker's willingness
>> to give counterarguments their due.
>
> Believe me, I'm giving those counterarguments *more* than
> they're due.

The tone of the following belies that statement.

> We've got made-up customers who are so tied to a vendor
> that they demand compatibility for undocumented and
> unspecified features but who are still able to change
> vendors if they don't get it.
>
> We've got made-up optimizers who are so brilliant that
> they mustn't be constrained for fear of slowing the code,
> but who are at the same time so stupid that they can't
> figure out the ordinary cases where order doesn't matter.

Is it implausible to you that detecting the cases where expressions
must not be reordered would add complexity to an already gnarly area
in any good compiler?

> We've got a (probably) made-up Standard Committee who is
> so responsive to compiler vendors that they won't agree to
> specify formerly unspecifed behavior, but did standardize
> two-phase name lookup in templates, which broke every
> implementation in existence, half of whom still haven't
> caught up years later.

It's normal for those who have never been to a meeting to have no
respect for the committee and its process, and to come charging in
with the idea that one's own concerns represent those of the whole C++
developer community. Usually it doesn't take more than one meeting
for those people to wake up and realize that C++ serves the needs of a
much broader group than they thought and that the committee process is
much more thoughtful and less capricious than they assumed. I invite
you to come to the next meeting and prove yourself the exception to
that rule.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---

kanze

unread,

Nov 29, 2005, 1:16:25 AM11/29/05

to

Herb Sutter wrote:

> It certainly is taken for granted among the hardware designer
> community and the optimizer writer community alike that the
> ability to reorder work is very important for performance.

Is it? I don't know anyone in the hardware designer community,
so I can't speak for them, but the first people I heard
clamoring for a more defined order were experts in compiler
optimization techniques. They're the ones who told me that it
didn't make a difference.

There was a time that it did. Back when using Sethi-Ullman
numbers for register allocation was state of the art. The rule
in C dates from K&R C. Which dates from the days when
Sethi-Ullman numbers were state of the art. But optimization
technology has comme a long way since then, even in "everyday"
compilers.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

kanze

unread,

Nov 29, 2005, 1:16:33 AM11/29/05

to

Hyman Rosen wrote:
> David Abrahams wrote:
> > Between this argument and the one about performance, there
> > seem at least to be plausible reasons for the status quo.
> > Labelling it "asinine" not to define an order of evaluation
> > seems inappropriate.

> This is where we start quoting Emerson. "A foolish
> consistency is the hobgoblin of little minds."

> > As a practical matter, I'm unlikely to buy into an argument
> > about anything but the most obvious mistakes if it comes
> > with language like that because of what it says about the
> > speaker's willingness to give counterarguments their due.

> Believe me, I'm giving those counterarguments *more* than
> they're due.

> We've got made-up customers who are so tied to a vendor that
> they demand compatibility for undocumented and unspecified
> features but who are still able to change vendors if they
> don't get it.

I think you know my position on this, but...

The customers here are not made-up. All too often, I've seen
explications of how C/C++ works which carefully explain that
parameters are evaluated and pushed onto the stack from right to
left. Because, of course, that's the way most early PC
compilers did it.

I still think that we need to specify. The customers here are
not the problem of the standard's committee, anymore than were
the customers who wrote code depending on CFront's lifetime of
temporaries, or templates acting exactly like macros. Such
customers are a problem for the vendors, and the vendors will
handle it exactly like they handled the other two issues.
Depending on their attitudes and positions, they will offer
compiler options to support one behavior or the other, or they
will simply tell the customer where he can get off.

> We've got made-up optimizers who are so brilliant that they
> mustn't be constrained for fear of slowing the code, but who
> are at the same time so stupid that they can't figure out the
> ordinary cases where order doesn't matter.

The real experts in optimization technology (people like David
Chase, for example) argue in favor of defined behavior. They
seem to think that the problem is tractable. The fact that Java
has defined behavior, and regularly beats C++ in benchmarks,
would seem to indicate that it isn't a killer problem, at any
rate.

> We've got a (probably) made-up Standard Committee who is so
> responsive to compiler vendors that they won't agree to
> specify formerly unspecifed behavior, but did standardize
> two-phase name lookup in templates, which broke every
> implementation in existence, half of whom still haven't caught
> up years later.

:-)

Formally, templates didn't exist until the standard committee
invented them, so there weren't any implementations to break:-).
In practice, at least from what little I've been able to gather
(thanks to people like Gaby Dos Reis and David Vandevoorde
looking up the facts for me), two phase name lookup was
introduced into the committee drafts, or at least the discussion
papers, long before most compilers had any support for
templates. So I fear you'll have to blame this one on
irresponsible vendors, who preferred bringing out
implementations that they knew would be broken by the standard,
rather than waiting a little, or at least warning.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Hyman Rosen

unread,

Nov 29, 2005, 10:11:58 AM11/29/05

to

David Abrahams wrote:
> Consistency with what?

The past. The argument that we are so tied to backward
compatibility that we may not even specify what was
formerly unspecified.

> Is it implausible to you that detecting the cases where expressions
> must not be reordered would add complexity to an already gnarly area
> in any good compiler?

Yes, very implausible. After all, optimizers do deal
with improving code over multiple statements, and in
that case evaluation order is defined. Any compiler
text which speaks about optimization describes various
analyses used to determine when statements can be moved
around. It's very nearly fundamental to what optimizers
do.

> no respect for the committee

This has nothing to do with respect. One argument against
specifying evaluation order is that the committee is
reluctant to break implementations (see "consistency" above).
But it's clear that the standardization process in fact broke
implementations across the board in dozens of ways, and many
of those implementations still haven't caught up. That's not
a bad thing, that's a good thing. It demonstrates that the
committee is willing to break implementations for a good
cause, and that should be true as well for evaluation order.

wa...@stoner.com

unread,

Nov 29, 2005, 10:12:51 AM11/29/05

to

Hyman Rosen wrote:
> We want to specify the exact
> meaning of language constructs,

Very often that is true.

But which standard have you been reading? I'd argue that almost the
entire libraries section goes out of its way to specify as little of
the meaning as it can get away with, so that competing vendors can
actually compete.

Even the core language is full of cases where exact meanings could have
been specified, but were left implementation-defined, or undefined. My
gut feeling is that most of these cases are there specifically to give
vendors freedom (to optimize, to build a simpler compiler, ...).

Assuming vendors are responsive to their market, (sometimes this is a
bit of a stretch ;-), the market is more interested in speed than in
exact meaning. I see compilers offering 'restrict' or 'fast but sloppy
floating point math' as options (or even defaults). I don't see
options that enable a strict evaluation order.

We could have had a language that told us the value of 'b' - 'a'.
Instead we got a language where it is implementation-defined, and {
char v='a'; v++; } is undefined.

> not their implementation.

But with a language that lets you get this close to the metal, a great
deal of the implementation is observable (just not in the standard's
sense of observable behavior).

P.J. Plauger

unread,

Nov 29, 2005, 10:35:54 PM11/29/05

to

"Hyman Rosen" <hyr...@mail.com> wrote in message

news:200511291438....@horus.isnic.is...

>> no respect for the committee
>
> This has nothing to do with respect. One argument against
> specifying evaluation order is that the committee is
> reluctant to break implementations (see "consistency" above).
> But it's clear that the standardization process in fact broke
> implementations across the board in dozens of ways, and many
> of those implementations still haven't caught up. That's not
> a bad thing, that's a good thing. It demonstrates that the
> committee is willing to break implementations for a good
> cause, and that should be true as well for evaluation order.

It has *everything* to do with respect. You're second guessing
the committee on inadequate information and you think that's
okay. You don't let in the possibility that the committee
could make a number of decisions that may appear to be
capricious and/or contradictory to you, yet they all can have
defensible reasons.

In the particular case of the committee "breaking" implementations,
C++ had a once in a lifetime opportunity to create a new standard
language. It "broke" various past dialects of C++, and there was
more than one. But none of those were standardized. OTOH, the
C++ committee had rather less latitude to "break" C, for C has
been standardized since 1989.

Please note that I don't necessarily approve of all of the
breaks with popular prior art in C++. Nor do I think the
committee always handled backward compatibility with C as
I would like. Nor do I have an opinion about the importance
of the particular issue of preserving latitude to change order
of evaluation. But in every case I saw enough of the process by
which the C++ committee deliberated, made tradeoffs, and arrived
at final decisions that I believe they're deserving of respect.

I also believe Dave Abrahams was correct to make the comments he
did.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

Sandor Hojtsy

unread,

Nov 29, 2005, 10:30:55 PM11/29/05

to

You keep quoting the standard:

i = v[i++]; // the behavior is unspecified
i = 7, i++, i++; // i becomes 9
i = ++i + 1; // the behavior is unspecified
i = i + 1; // the value of i is incremented

Note that this contains an identified defect, it should correctly read:
i = v[i++]; // the behavior is undefined

i = 7, i++, i++; // i becomes 9

i = ++i + 1; // the behavior is undefined

i = i + 1; // the value of i is incremented

See http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#351

I have a question: are these expressions undefined, or well-formed?
i = i = 1;
i = ++i;
a = (i = 1) + (i = 1);
If I take the standard word by word, they do not modify the value of i
twice in an expression, because one of the assignments is not
modification - just reassigning the existing value.

Ben Hutchings

unread,

Nov 30, 2005, 10:06:48 AM11/30/05

to

Sandor Hojtsy <sandor...@gmail.com> wrote:
> You keep quoting the standard:
> i = v[i++]; // the behavior is unspecified
> i = 7, i++, i++; // i becomes 9
> i = ++i + 1; // the behavior is unspecified
> i = i + 1; // the value of i is incremented
> Note that this contains an identified defect, it should correctly read:
> i = v[i++]; // the behavior is undefined
> i = 7, i++, i++; // i becomes 9
> i = ++i + 1; // the behavior is undefined
> i = i + 1; // the value of i is incremented
> See http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#351
>
> I have a question: are these expressions undefined, or well-formed?

They're all well-formed; that's a syntactical property.

> i = i = 1;

Despite the lack of a sequence point I believe this may be defined due
to this wording in 5.17/1: "The result of the assignment operation is
the value stored in the left operand *after* the assignment has taken
place..." (my emphasis).

> i = ++i;
> a = (i = 1) + (i = 1);

In these cases, the order of the two modifications of i is undefined,
so they fall foul of 5/4.

> If I take the standard word by word, they do not modify the value of i
> twice in an expression, because one of the assignments is not
> modification - just reassigning the existing value.

All built-in assignment operators are considered to modify their left
hand side, whether its value changes or not.

--
Ben Hutchings
Horngren's Observation:
Among economists, the real world is often a special case.

Ron Natalie

unread,

Dec 1, 2005, 12:29:30 AM12/1/05

to

Ben Hutchings wrote:

> Despite the lack of a sequence point I believe this may be defined due
> to this wording in 5.17/1: "The result of the assignment operation is
> the value stored in the left operand *after* the assignment has taken
> place..." (my emphasis).

That doesn't imply there's a sequence point. The value of the
expression is the value that will end up there, but that doesn't
imply that the store as actually happened (a sequence point is
required for that).

Hyman Rosen

unread,

Dec 1, 2005, 12:29:21 AM12/1/05

to

P.J. Plauger wrote:
> It has *everything* to do with respect. You're second guessing
> the committee on inadequate information and you think that's
> okay.

No, I'm not. Has the committee ever received and rejected a
proposal to fix order of evaluation? If it has, I'm not aware
of it. The second guessing that's going on is people here on
the newsgroup (although maybe they're on the committee?) saying
that the committee would reject such a proposal because of vendor
objections to changing unspecified behavior and incompatibility
with C. I'm pointing out that the committee has seen fit to
cause such breakage in the past, so there is reason to think that
they might do so again, in a good cause.

Again, when I say that the committee went ahead and broke things
in the past, that's not disrespect. I think it's a *good* thing
that they did it, and I hope they do it again.

Ron Natalie

unread,

Dec 1, 2005, 12:28:21 AM12/1/05

to

Sandor Hojtsy wrote:
> You keep quoting the standard:

I quoted the normative part of the standard which is CORRECT.
Ignore the darned exmaples.

> I have a question: are these expressions undefined, or well-formed?
> i = i = 1;

Well-formed and undefined are not mutually exclusive. The above is
well-formed and it's behavior is undefined as are your other examples.

> i = ++i;
> a = (i = 1) + (i = 1);
> If I take the standard word by word, they do not modify the value of i
> twice in an expression, because one of the assignments is not
> modification - just reassigning the existing value.
>

Huh? The only one that you might make that argument for is
the first one. But as far as C++ is concerned even storing the
same value that already there is a modification, hence:
const int x = 5;
x = 5;
is ill-formed.

In the second case, you are not "reassigning an existing value"
as you don't know when the side effect is applied (it's not guaranteed
until the sequence point which may be AFTER the assignment).

Andrew Koenig

unread,

Dec 1, 2005, 11:09:32 PM12/1/05

to

"Hyman Rosen" <hyr...@mail.com> wrote in message

news:2005112514443...@mscan6.ucar.edu...
> Andrew Koenig wrote:

>> Part of the point is to be able to translate a C++ expression into the
>> equivalent C expression without changing its meaning.

> Since when? Certainly there was a goal that C code should
> carry forward into C++ with its meaning generally unchanged,
> but why would anyone care about going the other way?

The first C++ compiler compiled into C. Such a compiler is much easier to
implement if it can translate expressions that involve only C types into the
corresponding C expressions.

Andrew Koenig

unread,

Dec 1, 2005, 11:12:52 PM12/1/05

to

"Hyman Rosen" <hyr...@mail.com> wrote in message

news:E1Efenr-...@chx400.switch.ch...

> Andrew Koenig wrote:
>> But you can't ignore reality by wishing it away.

> So that's why the committee decide to abandon two-phase
> name lookup in templates?

Explain, please.

Hyman Rosen

unread,

Dec 2, 2005, 8:33:54 PM12/2/05

to

Andrew Koenig wrote:
>>So that's why the committee decide to abandon two-phase
>>name lookup in templates?
>
> Explain, please.

Existing compilers which implemented templates
didn't use two-phase name lookup. Many vendors
continue to ship nonconforming implementations.
Many users of conforming compilers are confused
and upset when they discover that their old code
doesn't work any more.

And yet, the committee standardized that approach
anyway. This demonstrates that the committee was
willing to override those objections in what it
deemed a good cause. The changes required to fix
order of implementation would be much less severe
than the changes required for two-phase lookup,
and no previously defined program behavior would
change. That is why I believe I can "wish away"
the reality that vendors might object to this.

Bob Bell

unread,

Dec 2, 2005, 8:48:54 PM12/2/05

to

kuy...@wizard.net wrote:
> "Andrew Koenig" wrote:
> > <kuy...@wizard.net> wrote in message
> > news:1132759176.3...@g43g2000cwa.googlegroups.com...
> >
> > > As long as the specified order of evaluation under new rules was the
> > > same as one of the permitted orders of evaluation under the current
> > > rules, code which depends upon a different order of evaluation is (even
> > > under the current rules) non-portable. There's only a limited degree to
> > > which I care about what goes wrong with such code.
> >
> > Evidently you're not a compiler vendor :-)
> .

> > But you can't ignore reality by wishing it away.
>

> I'm not wishing it away; all I said is that I don't care about such
> code. Compiler vendors, as you've pointed out, do have to care about
> such code if it's become a widely used idiom. In the context of this
> discussion, I doubt that "i=v[i++];" is in that category. On the other
> hand, there are probably some widely used idioms where the order of
> evaluation is both important and (at best) unpsecified.

I think it would be helpful for your cause to list those widely-used
idioms. Personally, I can't think of a widely-used idiom that depends
on a particular evaluation order, and to the best of my knowledge any
such idiom should merely be considered broken. However, if you know of
any such idioms, I'd be interested to hear about them.

Bob

kuy...@wizard.net

unread,

Dec 3, 2005, 2:46:59 AM12/3/05

to

Bob Bell wrote:
> kuy...@wizard.net wrote:
.

> > discussion, I doubt that "i=v[i++];" is in that category. On the other
> > hand, there are probably some widely used idioms where the order of
> > evaluation is both important and (at best) unpsecified.
>
> I think it would be helpful for your cause to list those widely-used
> idioms. Personally, I can't think of a widely-used idiom that depends
> on a particular evaluation order, and to the best of my knowledge any
> such idiom should merely be considered broken. However, if you know of
> any such idioms, I'd be interested to hear about them.

I wouldn't have used the phrase "probably", if I knew for certain that
any such idioms exist.

I think that such idioms probably exist, simply because from my
experience most C programmers are not experts in writing portable code
(too many of them arent' even experts in writing non-portable code).
When they write code that depends upon implementation-specific
behavior, they're not even aware of the fact. Order of evaluation of
sub-expressions is just one of many different implementation-specific
behaviors that people unwittingly become dependent on.

Dave Harris

unread,

Dec 4, 2005, 1:05:31 PM12/4/05

to

pon...@gmail.com () wrote (abridged):
> It's just that I couldn't believe this little simple expression
> has undefined behavior, though, I think you were right anyway.

If you think of i as being stored as two words that are manipulated
separately, then it becomes easier to see. You might get the high word of
the new value and the low word of the old value. For example:

i = 0x1000ffff;
i = i++;

might yield i == 0x1001ffff, which is different to (and much bigger
than) any value that you would consider "correct". If i is a pointer, this
could point to memory the application doesn't own, which could lead to a
hardware fault even if *i is not accessed. And ints can have trapping
values too.

-- Dave Harris, Nottingham, UK.

Ron Natalie

unread,

Dec 4, 2005, 5:42:02 PM12/4/05

to

Andrew Koenig wrote:
> "Hyman Rosen" <hyr...@mail.com> wrote in message
> news:2005112514443...@mscan6.ucar.edu...
>> Andrew Koenig wrote:
>
>>> Part of the point is to be able to translate a C++ expression into the
>>> equivalent C expression without changing its meaning.
>
>> Since when? Certainly there was a goal that C code should
>> carry forward into C++ with its meaning generally unchanged,
>> but why would anyone care about going the other way?
>
> The first C++ compiler compiled into C. Such a compiler is much easier to
> implement if it can translate expressions that involve only C types into the
> corresponding C expressions.
>

Well the standards community messed that up, because it is not true now
nor has it always been true that it was safe to just pass expressions
that use only C types to the C compiler.

Hyman Rosen

unread,

Dec 5, 2005, 11:29:58 PM12/5/05

to

kuy...@wizard.net wrote:
>>I think it would be helpful for your cause to list those widely-used
>>idioms. Personally, I can't think of a widely-used idiom that depends
>>on a particular evaluation order, and to the best of my knowledge any
>>such idiom should merely be considered broken. However, if you know of
>>any such idioms, I'd be interested to hear about them.
>
>
> I wouldn't have used the phrase "probably", if I knew for certain that
> any such idioms exist.

I don't know for sure, but one guess would be code like this,
cout << "xxx" << f() << "yyy" << g() << "zzz" << h() << "\n";
where the authors don't realize that the function calls can
happen in any order. The compiler which they use happens to give
them an order which works, but there may be some particular order
of evaluation which would be bad, and it silently lurks in the
code until one day it shows up because of some change to compiler
version.

I would say that the apparent semantics of this code (first print
this, then that, then that) are so strong that it takes unusual
mental effort to realize that the calls are not (necessarily) made
left-to-right.

Bo Persson

unread,

Dec 6, 2005, 10:33:39 PM12/6/05

to

"Hyman Rosen" <hyr...@mail.com> skrev i meddelandet
news:E1EjGBD-...@chx400.switch.ch...

> kuy...@wizard.net wrote:
>>>I think it would be helpful for your cause to list those
>>>widely-used
>>>idioms. Personally, I can't think of a widely-used idiom that
>>>depends
>>>on a particular evaluation order, and to the best of my knowledge
>>>any
>>>such idiom should merely be considered broken. However, if you know
>>>of
>>>any such idioms, I'd be interested to hear about them.
>>
>>
>> I wouldn't have used the phrase "probably", if I knew for certain
>> that
>> any such idioms exist.
>
> I don't know for sure, but one guess would be code like this,
> cout << "xxx" << f() << "yyy" << g() << "zzz" << h() << "\n";
> where the authors don't realize that the function calls can
> happen in any order.

If the functions really affect each other, this is *really* bad code.
Why don't we require a diagnostic

Warning: Terrible code - please rewrite!

instad of having the compilers make it work anyway?

Bo Persson

Andrew Koenig

unread,

Dec 6, 2005, 10:35:48 PM12/6/05

to

"Hyman Rosen" <hyr...@mail.com> wrote in message

news:200512021422....@horus.isnic.is...

> Andrew Koenig wrote:
>>>So that's why the committee decide to abandon two-phase
>>>name lookup in templates?

>> Explain, please.

> Existing compilers which implemented templates
> didn't use two-phase name lookup. Many vendors
> continue to ship nonconforming implementations.
> Many users of conforming compilers are confused
> and upset when they discover that their old code
> doesn't work any more.

> And yet, the committee standardized that approach
> anyway. This demonstrates that the committee was
> willing to override those objections in what it
> deemed a good cause. The changes required to fix
> order of implementation would be much less severe
> than the changes required for two-phase lookup,
> and no previously defined program behavior would
> change. That is why I believe I can "wish away"
> the reality that vendors might object to this.

This isn't an explanation.

You said that the committee decided "to abandon two-phase name lookup in
templates," and when I asked you for an explanation, you said that they
didn't abandon it.

So I guess you were being sarcastic, which is not a good way to get people
to take you seriously in a technical discussion.

Here are some facts that I think go a long way toward explaining the current
state of affairs.

The C++ standards committee had its organizational meeting literally the day
after the C89 standard was ratified. It was chartered to standardize C++
using two documents as its basis:

1) The newly ratified C standard;
2) The Annotated Reference Manual.

The C standard was quite explicit about the behavior of built-in operators
on values of built-in types. Moreover, the ARM was reasonably consistent
about deferring to C the definition of how such operators behave.

In contrast, the C standard was, of course, utterly silent about templates
and exceptions, and the ARM marked both those features as "experimental."

Accordingly, I find it completely unsurprising that the committee was much
more deferential to past usage in the case of order of evaluation than it
was in the case of templates, and equally unsurprising that vendors went
along with changes in template behavior where they would not have tolerated
changes in behavior of built-in operators on operands of built-in types.

Andrew Koenig

unread,

Dec 6, 2005, 10:35:44 PM12/6/05

to

"Hyman Rosen" <hyr...@mail.com> wrote in message

news:E1EjGBD-...@chx400.switch.ch...
> kuy...@wizard.net wrote:

>> I wouldn't have used the phrase "probably", if I knew for certain that
>> any such idioms exist.

> I don't know for sure, but one guess would be code like this,
> cout << "xxx" << f() << "yyy" << g() << "zzz" << h() << "\n";
> where the authors don't realize that the function calls can
> happen in any order. The compiler which they use happens to give
> them an order which works, but there may be some particular order
> of evaluation which would be bad, and it silently lurks in the
> code until one day it shows up because of some change to compiler
> version.
>
> I would say that the apparent semantics of this code (first print
> this, then that, then that) are so strong that it takes unusual
> mental effort to realize that the calls are not (necessarily) made
> left-to-right.

Under ordinary circumstances, the order in which the calls are made doesn't
matter. It matters only when the functions have side effects that interfere
with each other in some way.

You don't need order-of-evaluation guarantees to ensure that the various
components will be printed in the right sequence, even if they are evaluated
out of sequence.

Bob Bell

unread,

Dec 6, 2005, 10:37:31 PM12/6/05

to

Hyman Rosen wrote:
> kuy...@wizard.net wrote:
> >>I think it would be helpful for your cause to list those widely-used
> >>idioms. Personally, I can't think of a widely-used idiom that depends
> >>on a particular evaluation order, and to the best of my knowledge any
> >>such idiom should merely be considered broken. However, if you know of
> >>any such idioms, I'd be interested to hear about them.
> >
> >
> > I wouldn't have used the phrase "probably", if I knew for certain that
> > any such idioms exist.
>
> I don't know for sure, but one guess would be code like this,
> cout << "xxx" << f() << "yyy" << g() << "zzz" << h() << "\n";
> where the authors don't realize that the function calls can
> happen in any order. The compiler which they use happens to give
> them an order which works, but there may be some particular order
> of evaluation which would be bad, and it silently lurks in the
> code until one day it shows up because of some change to compiler
> version.

Or perhaps the author simply modifies the expression to push it beyond
some kind of complexity boundary, triggering the compiler to reorder
the expression(s). It doesn't even take a new version of the compiler.
It's hard to regard such code as anything but broken.

I was hoping for an example along the lines of "Compiler vendor ABC
states in their documentation that evaluation order of function
arguments is right to left, as if there are sequence points at the
commas, and here's an example of an idiom that exploits the evaluation
order that's common among users of ABC's compiler."

> I would say that the apparent semantics of this code (first print
> this, then that, then that) are so strong that it takes unusual
> mental effort to realize that the calls are not (necessarily) made
> left-to-right.

On the contrary, this code doesn't make me think the calls are made
left to right, only that the output will be serialized that way. But
then again, I understood (and got used to) the way C++ evaluates
expressions a long time ago.

Bob

ThosRTanner

unread,

Dec 6, 2005, 10:34:22 PM12/6/05

to

Hyman Rosen wrote:
> kuy...@wizard.net wrote:

> > I wouldn't have used the phrase "probably", if I knew for certain that
> > any such idioms exist.
>
> I don't know for sure, but one guess would be code like this,
> cout << "xxx" << f() << "yyy" << g() << "zzz" << h() << "\n";
> where the authors don't realize that the function calls can
> happen in any order. The compiler which they use happens to give
> them an order which works, but there may be some particular order
> of evaluation which would be bad, and it silently lurks in the
> code until one day it shows up because of some change to compiler
> version.
>
> I would say that the apparent semantics of this code (first print
> this, then that, then that) are so strong that it takes unusual
> mental effort to realize that the calls are not (necessarily) made
> left-to-right.

Isn't that one situation where it will work as expected, because it
"translates to":

operator<<(operator<<(operator<<(operator<<(cout, "\n"), h()), "zzz"),
g(), "yyy") etc etc

and there's a sequence point after every function call

kuy...@wizard.net

unread,

Dec 7, 2005, 12:28:11 AM12/7/05

to

"Bo Persson" wrote:
> "Hyman Rosen" <hyr...@mail.com> skrev i meddelandet
> news:E1EjGBD-...@chx400.switch.ch...

.
> > I don't know for sure, but one guess would be code like this,
> > cout << "xxx" << f() << "yyy" << g() << "zzz" << h() << "\n";
> > where the authors don't realize that the function calls can
> > happen in any order.
>
> If the functions really affect each other, this is *really* bad code.
> Why don't we require a diagnostic
>
> Warning: Terrible code - please rewrite!
>
> instad of having the compilers make it work anyway?

Of course it's bad code. The point was that people who write such code
would have a motive for pressuring implementors to continue making it
work as they expect it to. That's a real type of pressure that
implementors face.

kuy...@wizard.net

unread,

Dec 7, 2005, 12:28:22 AM12/7/05

to

ThosRTanner wrote:
> Hyman Rosen wrote:
> > kuy...@wizard.net wrote:
>
> > > I wouldn't have used the phrase "probably", if I knew for certain that
> > > any such idioms exist.
> >
> > I don't know for sure, but one guess would be code like this,
> > cout << "xxx" << f() << "yyy" << g() << "zzz" << h() << "\n";
> > where the authors don't realize that the function calls can
> > happen in any order. The compiler which they use happens to give
> > them an order which works, but there may be some particular order
> > of evaluation which would be bad, and it silently lurks in the
> > code until one day it shows up because of some change to compiler
> > version.
> >
> > I would say that the apparent semantics of this code (first print
> > this, then that, then that) are so strong that it takes unusual
> > mental effort to realize that the calls are not (necessarily) made
> > left-to-right.
> Isn't that one situation where it will work as expected, because it
> "translates to":
>
> operator<<(operator<<(operator<<(operator<<(cout, "\n"), h()), "zzz"),
> g(), "yyy") etc etc
>
> and there's a sequence point after every function call

Yes, the function calls introduce sequence points into that expression.
However, sequence points only enforce a connection between the
evalation of an expression and its side effects. They don't impose a
required ordering on the expressions.
Let me use a simplified version of the example given:

cout << f() << g();

This is equivalent to:

cout.operator<<(f()).operator<<(g()).

This expression involves four function calls:
A: f()
B: cout.operator<<()
C: g()
D: cout.operator<<(f()).operator<<()

Let t(X) be the time at which function X is evaluated. Since function
arguments must be evaluated before the function itself can be
evaluated, we have the following constraints:

t(A) < t(B)
t(C) < t(D)

Since the return value from a function can't be used until after the
function has been evaluated, we have one additional constraint:

t(B) < T(D)

There are exactly two orderings consistent with all of those
constraints:

t(A) < t(B) < t(C) < t(D)

t(C) < t(A) < t(B) < t(D)

and there's no other requirement in the standard that is violated by
either of those orderings. Therefore, if it matters whether f() is
called before or after the call to g(), you've got problems.

Hyman Rosen

unread,

Dec 7, 2005, 10:44:43 AM12/7/05

to

ThosRTanner wrote:
> Isn't that one situation where it will work as expected, because it
> "translates to":
> operator<<(operator<<(operator<<(operator<<(cout, "\n"), h()), "zzz"),
> g(), "yyy") etc etc
> and there's a sequence point after every function call

Your translation is correct, and there is a sequence point after
every call, but it still doesn't work as "expected". The compiler
is free to call f(), g(), and h() in any order before calling any
of the operator<<() methods if it wants to.

Hyman Rosen

unread,

Dec 7, 2005, 10:45:01 AM12/7/05

to

kuy...@wizard.net wrote:
> Of course it's bad code.

It's only bad code because it's written in a language
with bad semantics for it.

> The point was that people who write such code would have a
> motive for pressuring implementors to continue making it work
> as they expect it to. That's a real type of pressure that
> implementors face.

No, the point was that people who write such code don't realize
when an order dependency has crept in because their compiler
happens to pick an order that works for them. If that order were
defined, then their code would not be subject to accidental
breakage when the compiler decided to pick a different order.

Hyman Rosen

unread,

Dec 7, 2005, 10:52:09 AM12/7/05

to

Bob Bell wrote:
> It's hard to regard such code as anything but broken.

That's only because of the widespread reluctance to see
the behavior of the language as the problem rather than
the behavior of the code. Why do you perceive as broken
call( f(), g(), h() );
but not
f(); g(); h();
It's only because long experience has ingrained into you
that in the first expression the calls are necessarily
unordered, while in the second they are ordered. But that's
just an artifact of the language, and it can be changed,
just as the Java creators decided to do.

> On the contrary, this code doesn't make me think the calls are made
> left to right, only that the output will be serialized that way. But
> then again, I understood (and got used to) the way C++ evaluates
> expressions a long time ago.

Exactly. Now go read further responses, especially the one from
ThosRTanner, and notice that he apparently does not have your
level of understanding. Which is my point. In the mathematical
sense, ignorance about sequence points and order of evaluation
is "almost everywhere". Rather than feeling proud about our level
of understanding, we should be sorry that such a level is necessary,
and endeavor to do away with it. Then we could explain the now simple
behavior to everyone, and they would all understand.

Hyman Rosen

unread,

Dec 7, 2005, 10:52:20 AM12/7/05

to

Bo Persson wrote:
> If the functions really affect each other, this is *really* bad code.

No, the code is fine. It's the language that is bad.

> Why don't we require a diagnostic
> Warning: Terrible code - please rewrite!

Because the functions may be separately compiled,
so the compiler cannot know whether they interact.

> instad of having the compilers make it work anyway?

Because programming languages are expressions of what
we wish the computer to do, and having unspecified
constructs in the languages are useless towards that
end.

Hyman Rosen

unread,

Dec 7, 2005, 10:52:32 AM12/7/05

to

kuy...@wizard.net wrote:
> Yes, the function calls introduce sequence points into that expression.
> However

. 40 more lines of explanation
> you've got problems.

Good exposition. I hope this contributes to everyone's
understanding of why the current semantics are a mess and
order of evaluation should be properly defined! Wouldn't
it be nice to say that the calls are evaluated from left
to right, arguments before calls? So much simpler!

Hyman Rosen

unread,

Dec 7, 2005, 10:53:03 AM12/7/05

to

Andrew Koenig wrote:
> 1) The newly ratified C standard;

The newly ratified C standard itself had novelties and changes
to long-existing C behavior. For example, it changed the rules
governing the widening of smaller unsigned types to integral
types, and added stringification and token pasting to the
preprocessor. The even newer C99 standard adds hefty new syntax
to the language for initializations.

> Accordingly, I find it completely unsurprising that the committee was much
> more deferential to past usage in the case of order of evaluation than it
> was in the case of templates, and equally unsurprising that vendors went
> along with changes in template behavior where they would not have tolerated
> changes in behavior of built-in operators on operands of built-in types.

Since I wasn't there, I have no idea whether anyone actually
proposed changing evaluation order during the original process.
I don't even care. I am advocating that it should be changed
now. When I say this, others object that the committee will
not consider changes that impact vendors so much, so I point
out that the committee has impacted vendors before, and that
this would be a change that leaves formerly specified behavior
alone.

I apologize in advance, but I get a very strong sense that the
objections stem more form a reluctance to change the way C and
C++ have always behaved than from any other reason, even though
the old behavior is bad.

Hyman Rosen

unread,

Dec 7, 2005, 10:53:27 AM12/7/05

to

Andrew Koenig wrote:
> Under ordinary circumstances, the order in which the calls are made doesn't
> matter. It matters only when the functions have side effects that interfere
> with each other in some way.

Well, that's what I said. But often compilers actually have a
fixed order of evaluation, it's just that they don't tell anyone
what it is, and it's subject to change between versions or vendors.
So order dependencies might creep in and accidentally work, until
something changes.

> You don't need order-of-evaluation guarantees to ensure that the various
> components will be printed in the right sequence, even if they are evaluated
> out of sequence.

True but irrelevant. The OP asked for code where the order
of evaluation could be important. This is such code. It's
hard to come up with much more that isn't undefined behavior.

Andrew Koenig

unread,

Dec 8, 2005, 2:00:42 AM12/8/05

to

"Hyman Rosen" <hyr...@mail.com> wrote in message

news:200512071313....@horus.isnic.is...

> Well, that's what I said. But often compilers actually have a
> fixed order of evaluation, it's just that they don't tell anyone
> what it is, and it's subject to change between versions or vendors.
> So order dependencies might creep in and accidentally work, until
> something changes.

I haven't done a survey of compilers, but I have certainly encountered
compilers that do not have a fixed order of evaluation in the sense in which
I think you mean it. It is certainly true that most compilers will evaluate
a given expression in the same order every time they encounter it (assuming
that the types are the same), but that doesn't imply a fixed order of
evaluation.

Here's why. One common code-generation technique is to try to minimize the
total number of registers or temporary locations needed to evaluate an
expression. One well-known algorithm for doing that is that whenever there
is an operator with operands that can be evaluated in either order, the
compiler should try to evaluate first the operand that requires the most
registers.

A compiler that uses such an algorithm might well compile f()+g(x,y) by
evaluating g(x,y) first, then f(), then computing the sum, but nevertheless
might compile h()+f()+g(x,y) by evaluating h(), then f(), and finally
g(x,y). This would happen if the subexpression h()+f() required more
registers to evaluate than the subexpression g(x,y).

I would expect this kind of behavior to be common, especially among
optimizing compilers.

Herb Sutter

unread,

Dec 8, 2005, 2:02:14 AM12/8/05

to

hyr...@mail.com (Hyman Rosen) wrote:
>Bob Bell wrote:
>> It's hard to regard such code as anything but broken.
>
>That's only because of the widespread reluctance to see
>the behavior of the language as the problem rather than
>the behavior of the code. Why do you perceive as broken
> call( f(), g(), h() );
>but not
> f(); g(); h();
>It's only because long experience has ingrained into you
>that in the first expression the calls are necessarily
>unordered, while in the second they are ordered. But that's
>just an artifact of the language, and it can be changed,
>just as the Java creators decided to do.

I really think Hyman is making good points here, though he seems to be
somewhat of a lone voice crying out in the wilderness right now in this
thread.

Here's an example I posted a few days ago to a parallel discussion that's
going on within the C++ standards committee reflectors:

---
Here's a specific version of this pitfall (adapted from a similar example
in http://www.gotw.ca/gotw/012.htm):

// Case 1
cout << "x = " << itoa(42,buf,10) << ", y = " << itoa(43,buf,10);

Some compilers I just tried print 42 and 42, some 42 and 43, and others
that I don't have could print 43 and 42.

In fact, one of the compilers I tried gave me different results between
the above code and the following minor rearrangement (and it is or should
be easy to go from the above to the below during maintenance!):

// Case 2
cout << string("x = ") + itoa(42,buf,10) + ", y = " + itoa(43,buf,10);

YMMV, but to me it seems bad that there should be a pitfall like this with
straightforward maintenance of code, moving from Case 1 to Case 2 on the
same compiler and getting different results. But that's our status quo and
it still happened in my re-test this morning on a popular compiler.

I had actually forgotten about this GotW #12 example (it's been 8 years
since I first wrote about this particular one), and it's just another case
where the same old issue comes up.
---

The few responses on the committee reflector seemed to be inclined to view
this as stupid code, and 'don't do that.'

The reason I don't buy that answer is that if the language makes an idiom
natural, it should either make it work predictably or else provide guard
rails to help people avoid the problem. We don't do either, and that's
bad. Unfortunately, in an expert-friendly group of people, most people are
so used to avoiding the problem that they don't realize how serious this
category of problems is and why it's one of the major reasons people leave
C++ for other languages. There are few things more frustrating to
programmers than having naturally written code that compiles silently but
has unpredictable behavior.

Perhaps this category of pitfall is in the top-three list of reasons for
hair loss among C++ developers. :-)

>> On the contrary, this code doesn't make me think the calls are made
>> left to right, only that the output will be serialized that way. But
>> then again, I understood (and got used to) the way C++ evaluates
>> expressions a long time ago.
>
>Exactly. Now go read further responses, especially the one from
>ThosRTanner, and notice that he apparently does not have your
>level of understanding. Which is my point. In the mathematical
>sense, ignorance about sequence points and order of evaluation
>is "almost everywhere". Rather than feeling proud about our level
>of understanding, we should be sorry that such a level is necessary,

Hear, hear.

To steal a quote from a respected colleague of mine, which I included in
the committee reflector discussion (he was speaking about relaxed memory
models, but as I've pointed out earlier in this thread there are
similarities between that and relaxed evaluation ordering):

"Meta point: A programming model is a model for programming.
Semantics should enable efficient implementations, not expose
them."

>and endeavor to do away with it. Then we could explain the now simple
>behavior to everyone, and they would all understand.

Of course, for balance I should reiterate that before proposing any such
change we also need to quantify the costs of enforcing an execution
ordering (presumably left-to-right) w.r.t.:

a) how much optimization loss there actually is on popular
implementations and platforms (note this is likely to vary greatly by
application)

b) how much existing code may rely on a left-to-right ordering (if
there's lots, there may be pressure to preserve its meaning even if it's
currently relying on unspecified and nonportable behavior)

So far, I've been met with more or less deafening silence every time I ask
people for quantified data about how great these costs really are. The
most concrete information I know of, but which still needs measuring, is
that a) is likely to cost several _times_ the throughput on some standard
Spec microbenchmarks where we can get an order-of-magnitude perf gain by
doing things like choosing to stride the other way across arrays (i.e.,
the program's loop strides rows then columns, and we know we'll get better
cache behavior by reordering all or just chunks of the loop by striding
columns then rows). But we don't know the cost on typical app code (e.g.,
some smart people I know expect <10%, maybe <5%, but we need to measure).

I'm willing to do my part: I've started asking around internally here for
people to do measurements of a) and b) on a certain popular compiler and
the resulting performance difference for certain large C/C++ code bases.
Maybe in a few months I'll be ready to share some results. In the
meantime, I would strongly encourage other C++ vendors to do the same.

If we (the industry) can measure the impact and the results show that the
cost of a) and b) is not prohibitive, then the standards committee could
usefully have a discussion about whether to require left-to-right. But we
do need to measure first so that we can accurately understand the costs
vs. benefits and have a discussion based on data.

Herb

---
Herb Sutter (www.gotw.ca) (www.pluralsight.com/blogs/hsutter)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)

4zum...@gmail.com

unread,

Dec 8, 2005, 2:03:59 AM12/8/05

to

While this is order of function execution, I've had a number of people
suprised that:

cout << f() << f() << f() << endl;

Doesn't define the order in which the fs are executed.

Ron Natalie

unread,

Dec 8, 2005, 10:35:55 AM12/8/05

to

kuy...@wizard.net wrote:
>
> cout << f() << g();
>

> This expression involves four function calls:
> A: f()
> B: cout.operator<<()
> C: g()
> D: cout.operator<<(f()).operator<<()
>
> Let t(X) be the time at which function X is evaluated. Since function

>

> There are exactly two orderings consistent with all of those
> constraints:
>
> t(A) < t(B) < t(C) < t(D)
>
> t(C) < t(A) < t(B) < t(D)
>
> and there's no other requirement in the standard that is violated by
> either of those orderings. Therefore, if it matters whether f() is
> called before or after the call to g(), you've got problems.
>

Nope. There are more. There is nothing that requires A and
C to have any relative ordering.

A, C, B, D

is another allowable ordering.

Ron Natalie

unread,

Dec 8, 2005, 10:36:03 AM12/8/05

to

Andrew Koenig wrote:
> "Hyman Rosen" <hyr...@mail.com> wrote in message
> news:200512071313....@horus.isnic.is...
>
>> Well, that's what I said. But often compilers actually have a
>> fixed order of evaluation, it's just that they don't tell anyone
>> what it is, and it's subject to change between versions or vendors.
>> So order dependencies might creep in and accidentally work, until
>> something changes.
>
> I haven't done a survey of compilers, but I have certainly encountered
> compilers that do not have a fixed order of evaluation in the sense in which
> I think you mean it. It is certainly true that most compilers will evaluate
> a given expression in the same order every time they encounter it (assuming
> that the types are the same), but that doesn't imply a fixed order of
> evaluation

I don't know what you mean by fixed. While I've never seen a compiler
that evaluates the functions differently over different invocations of
the compiler, I've certainly seen ones that do things differently for
different optimization settings.

Further, forcing Left-to-Right or Right-to-Left evaluationn order
doesn't work unless you further require sequence points.

Ron Natalie

unread,

Dec 8, 2005, 10:36:45 AM12/8/05

to

Bo Persson wrote:
n any order.
>
> If the functions really affect each other, this is *really* bad code.
> Why don't we require a diagnostic
>
> Warning: Terrible code - please rewrite!

I suppose the above is sarcasm, but it's generally not possible to
tell that the functions have untoward effect.

All it takes is a write to the same ultimate output channel (which
may be unrelated C++ streams) to cause variability in observed behavior.

Hyman Rosen

unread,

Dec 8, 2005, 11:08:40 AM12/8/05

to

Ron Natalie wrote:
> Further, forcing Left-to-Right or Right-to-Left evaluationn order
> doesn't work unless you further require sequence points.

The term "sequence point" will be banished to heck.
Expressions will be evaluated left-to-right, operands
before operation. Evaluating an expression with side
effects will cause those side effects to happen. Very
simple, very clear.

Herb Sutter

unread,

Dec 8, 2005, 11:52:37 PM12/8/05

to

After sending this, I thought I should clarify something that on rereading
wasn't clear the way I first wrote it:

hsu...@gotw.ca (Herb Sutter) wrote:
> a) how much optimization loss there actually is on popular
>implementations and platforms (note this is likely to vary greatly by
>application)

[...]

>The
>most concrete information I know of, but which still needs measuring, is
>that a) is likely to cost several _times_ the throughput on some standard
>Spec microbenchmarks where we can get an order-of-magnitude perf gain by
>doing things like choosing to stride the other way across arrays (i.e.,
>the program's loop strides rows then columns, and we know we'll get better
>cache behavior by reordering all or just chunks of the loop by striding
>columns then rows). But we don't know the cost on typical app code (e.g.,
>some smart people I know expect <10%, maybe <5%, but we need to measure).

Specifically, the stride issue was specifically about the performance gain
from of a memory model latitude for reordering reads/writes, not latitude
for reordering expression evaluation.

Bo Persson

unread,

Dec 8, 2005, 11:54:02 PM12/8/05

to

"Ron Natalie" <r...@spamcop.net> skrev i meddelandet
news:43982d20$0$28446$9a6e...@news.newshosting.com...

> Bo Persson wrote:
> n any order.
>>
>> If the functions really affect each other, this is *really* bad
>> code. Why don't we require a diagnostic
>>
>> Warning: Terrible code - please rewrite!
>
> I suppose the above is sarcasm,

It sure is.

> but it's generally not possible to
> tell that the functions have untoward effect.
>
> All it takes is a write to the same ultimate output channel (which
> may be unrelated C++ streams) to cause variability in observed
> behavior.
>

Yes, but should we encourage that kind of coding, by defining its
meaning? I have never felt that I need to write code like

i = f(i++, i++);

so I don't think it is very productive to spend time specifying
exactly what it means.

In both C and C++ we already have ways of specifying a particular
evaluation order when needed. We just write the expressions in the
particular order, and put a semicolon between each. That's it!

I don't belive there is any advantage in supporting function calls
containing large expressions, with multiple interdependent side
effects. In my opinion this would encourage writing hard to understand
code, rather than good and readable code.

I would rather see compiler writers spend their time on implementing
what is already in the standard. That would be more useful to me.

Bo Persson

Andrei Alexandrescu (See Website For Email)

unread,

Dec 8, 2005, 11:56:20 PM12/8/05

to

Herb Sutter wrote:
> Of course, for balance I should reiterate that before proposing any such
> change we also need to quantify the costs of enforcing an execution
> ordering (presumably left-to-right) w.r.t.:
>

> a) how much optimization loss there actually is on popular
> implementations and platforms (note this is likely to vary greatly by
> application)
>

> b) how much existing code may rely on a left-to-right ordering (if
> there's lots, there may be pressure to preserve its meaning even if it's
> currently relying on unspecified and nonportable behavior)
>
> So far, I've been met with more or less deafening silence every time I ask

> people for quantified data about how great these costs really are. The

> most concrete information I know of, but which still needs measuring, is
> that a) is likely to cost several _times_ the throughput on some standard
> Spec microbenchmarks where we can get an order-of-magnitude perf gain by
> doing things like choosing to stride the other way across arrays (i.e.,
> the program's loop strides rows then columns, and we know we'll get better
> cache behavior by reordering all or just chunks of the loop by striding
> columns then rows). But we don't know the cost on typical app code (e.g.,
> some smart people I know expect <10%, maybe <5%, but we need to measure).

I think point (a) needs a tad of refining. The real question is (deep
breath, complicated sentence follows):

a) how much opportunity for optimization from a combination "programmer
willing to optimize" + "optimizing compiler" is lost?

I am emphasizing these two details because:

(1) A "programmer willing to optimize" who knows that right-to-left
evaluation is algorithmically better AND knows that left-to-right
evaluation is guaranteed will introduce named temporary to force
right-to-left evaluation. That is the best solution of all, better than
the programmer just leaving optimality at the whim of the compiler. It's
guaranteed. In contrast, the compiler may or may not detect the
opportunity by itself so we can't know whether the optimization will be
done at all.

(2) An "optimizing compiler" can break the left-to-right evaluation rule
if it detects a micro-optimal reordering that doesn't change the
semantics of the code.

For these reasons I believe that in reality there's not a lot of lost
optimality, much less than one might think at first sight. Advocates of
the status quo should (at best for their case) showcase code that the
compiler optimizes and that's too hard, or too suble, to optimize at
source level.

Andrei

kuy...@wizard.net

unread,

Dec 8, 2005, 11:52:46 PM12/8/05

to

Ron Natalie wrote:
> kuy...@wizard.net wrote:
> >
> > cout << f() << g();
> >
>
> > This expression involves four function calls:
> > A: f()
> > B: cout.operator<<()
> > C: g()
> > D: cout.operator<<(f()).operator<<()
> >
> > Let t(X) be the time at which function X is evaluated. Since function
>
> >
> > There are exactly two orderings consistent with all of those
> > constraints:
> >
> > t(A) < t(B) < t(C) < t(D)
> >
> > t(C) < t(A) < t(B) < t(D)
> >
> > and there's no other requirement in the standard that is violated by
> > either of those orderings. Therefore, if it matters whether f() is
> > called before or after the call to g(), you've got problems.
> >
>
> Nope. There are more. There is nothing that requires A and
> C to have any relative ordering.
>
> A, C, B, D

That case differs from my first case only in the relative order of B
and C, not in the relative order of A and C. The order of B and C could
matter, most plausibly by g() using cout, directly or indirectly. I
missed that case because I was concentrating about the possibility of
the relative order of A and C being important.

Francis Glassborow

unread,

Dec 9, 2005, 11:06:08 AM12/9/05

to

In article <Ir7L4...@beaver.cs.washington.edu>, "Andrei Alexandrescu
(See Website For Email)" <SeeWebsit...@moderncppdesign.com> writes

>(1) A "programmer willing to optimize" who knows that right-to-left
>evaluation is algorithmically better AND knows that left-to-right
>evaluation is guaranteed will introduce named temporary to force
>right-to-left evaluation. That is the best solution of all, better than
>the programmer just leaving optimality at the whim of the compiler.
>It's guaranteed. In contrast, the compiler may or may not detect the
>opportunity by itself so we can't know whether the optimization will be
>done at all.

I think this is a very good point (that I had not considered
previously). If the programmer knows the default evaluation order s/he
can write code to force a different order. If you do not know the
evaluation order and you believe it matters you have to write code to
enforce your preferred ordering.

However there is more than just order of evaluation, there is the issue
of order of side-effects. Should we go the whole way and force an
ordering on those?

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

John Potter

unread,

Dec 9, 2005, 11:59:23 PM12/9/05

to

On Fri, 9 Dec 2005 16:06:08 GMT, fra...@robinton.demon.co.uk (Francis
Glassborow) wrote:

> However there is more than just order of evaluation, there is the issue
> of order of side-effects. Should we go the whole way and force an
> ordering on those?

It is all or nothing. Here is some amusement.

#include <iostream>
int& inc (int& x) { return ++ x; }
int add (int x, int y) { return x + y; }
int main () {
// Undefined behavior.
int x(3);
std::cout << ++x + ++x + ++x << std::endl;
x = 3;
std::cout << ++x + (++x + ++x) << std::endl;
// Just unspecified now.
x = 3;
std::cout << inc(x) + inc(x) + inc(x) << std::endl;
x = 3;
std::cout << inc(x) + (inc(x) + inc(x)) << std::endl;
// Sequence points all over the place but still unspecified.
// When does the lvalue to rvalue conversion happen? This
// problem is unique to references.
x = 3;
std::cout << add(add(inc(x), inc(x)), inc(x)) << std::endl;
x = 3;
std::cout << add(inc(x), add(inc(x), inc(x))) << std::endl;
}

On one implementation, the output was 16 18 16 18 18 16.

Since optimization is part of the subject, -O9 which may totally break
the code produced 15 15 for the last two cases. Amusing.

John

Andrei Alexandrescu (See Website For Email)

unread,

Dec 9, 2005, 11:58:44 PM12/9/05

to

Francis Glassborow wrote:
> In article <Ir7L4...@beaver.cs.washington.edu>, "Andrei Alexandrescu
> (See Website For Email)" <SeeWebsit...@moderncppdesign.com> writes
>
>> (1) A "programmer willing to optimize" who knows that right-to-left
>> evaluation is algorithmically better AND knows that left-to-right
>> evaluation is guaranteed will introduce named temporary to force
>> right-to-left evaluation. That is the best solution of all, better
>> than the programmer just leaving optimality at the whim of the
>> compiler. It's guaranteed. In contrast, the compiler may or may not
>> detect the opportunity by itself so we can't know whether the
>> optimization will be done at all.
>
>
> I think this is a very good point (that I had not considered
> previously). If the programmer knows the default evaluation order s/he
> can write code to force a different order. If you do not know the
> evaluation order and you believe it matters you have to write code to
> enforce your preferred ordering.

There remain the "not-so-obvious" opportunities for optimization, such
as those that reuse registers etc. By my assertion number (2)
("optimizing compiler") I am clarifying that an optimizing compiler can
still evaluate things in the order they please al long as the
left-to-right semantics are unaffected. Because of (1) and (2), I am
speculating that an overwhelming majority of cases are covered, and we
needn't worry about the remaining exceedingly few cases in which there
would be a remanining exceedingly few cycles to be saved.

Andrei

Niklas Matthies

unread,

Dec 9, 2005, 11:57:16 PM12/9/05

to

On 2005-12-09 16:06, Francis Glassborow wrote:
> In article <Ir7L4...@beaver.cs.washington.edu>, "Andrei Alexandrescu
> (See Website For Email)" <SeeWebsit...@moderncppdesign.com> writes
>>(1) A "programmer willing to optimize" who knows that right-to-left
>>evaluation is algorithmically better AND knows that left-to-right
>>evaluation is guaranteed will introduce named temporary to force
>>right-to-left evaluation. That is the best solution of all, better than
>>the programmer just leaving optimality at the whim of the compiler.
>>It's guaranteed. In contrast, the compiler may or may not detect the
>>opportunity by itself so we can't know whether the optimization will be
>>done at all.
>
> I think this is a very good point (that I had not considered
> previously). If the programmer knows the default evaluation order s/he
> can write code to force a different order. If you do not know the
> evaluation order and you believe it matters you have to write code to
> enforce your preferred ordering.

It might be difficult to know for the programmer which order is more
efficient, in particular when the code targets different implementations,
or for example with inlined functions where a change in the function's
implementation can change the optimal evaluation order of the function
arguments at the particular call site. Usually the compiler knows much
better than the programmer.

But the point is right in that it's the order-independent code that
should require special handling by the programmer if necessary, not
the order-dependent code. If the order is going to be defined, it
would be nice if some language constructs would be provided to mark a
group of statements to be executable "in parallel", i.e. having no
order dependencies.

> However there is more than just order of evaluation, there is the
> issue of order of side-effects. Should we go the whole way and force
> an ordering on those?

Yes, because these cause the actual UB most of the time.
Also it would be confusing that (for example) whether

os << x++ << x;

has predictable behavior depends on whether the type of 'x' is a
built-in type or not (i.e. whether '++' is a function call or not).

-- Niklas Matthies

Andrei Alexandrescu (See Website For Email)

unread,

Dec 9, 2005, 11:58:23 PM12/9/05

to

Bo Persson wrote:
> "Ron Natalie" <r...@spamcop.net> skrev i meddelandet

>>but it's generally not possible to
>>tell that the functions have untoward effect.
>>
>>All it takes is a write to the same ultimate output channel (which
>>may be unrelated C++ streams) to cause variability in observed
>>behavior.
>>
>
>
> Yes, but should we encourage that kind of coding, by defining its
> meaning? I have never felt that I need to write code like
>
> i = f(i++, i++);
>
> so I don't think it is very productive to spend time specifying
> exactly what it means.

Maybe, however, at some point someon might write:

f(a++, b++);

where a and b are references that could alias. In that case, it is very
useful to define the behavior of the code. It would be naive to believe
that code tripping on unspecified order of execution can only be
"obviously dumb" by eye inspection.

> In both C and C++ we already have ways of specifying a particular
> evaluation order when needed. We just write the expressions in the
> particular order, and put a semicolon between each. That's it!

Given that they do offer terse ways to express complicated computations,
and that people will use and abuse those terse ways, it is mightlily
important to define behavior.

> I don't belive there is any advantage in supporting function calls
> containing large expressions, with multiple interdependent side
> effects. In my opinion this would encourage writing hard to understand
> code, rather than good and readable code.
>
>
> I would rather see compiler writers spend their time on implementing
> what is already in the standard. That would be more useful to me.

I believe that advocating a standard that leaves unnecessarily much to
the whim of the compiler is an entirely fallacious viewpoint that goes
exactly against what a standard is supposed to do, and that favors
vendor lock-in. I'm also sure that most people on the standardization
committee believe the same. The thing worth discussing, therefore, is
whether defining behavior can impact optimality of code generation or not.

Andrei

Andrei Alexandrescu (See Website For Email)

unread,

Dec 9, 2005, 11:58:00 PM12/9/05

to

Herb Sutter wrote:
> After sending this, I thought I should clarify something that on rereading
> wasn't clear the way I first wrote it:
>
> hsu...@gotw.ca (Herb Sutter) wrote:
>

>> a) how much optimization loss there actually is on popular
>>implementations and platforms (note this is likely to vary greatly by
>>application)
>

> [...]

>
>>The
>>most concrete information I know of, but which still needs measuring, is
>>that a) is likely to cost several _times_ the throughput on some standard
>>Spec microbenchmarks where we can get an order-of-magnitude perf gain by
>>doing things like choosing to stride the other way across arrays (i.e.,
>>the program's loop strides rows then columns, and we know we'll get better
>>cache behavior by reordering all or just chunks of the loop by striding
>>columns then rows). But we don't know the cost on typical app code (e.g.,
>>some smart people I know expect <10%, maybe <5%, but we need to measure).
>
>

> Specifically, the stride issue was specifically about the performance gain
> from of a memory model latitude for reordering reads/writes, not latitude
> for reordering expression evaluation.

Ah, now that makes sense. (I was a bit confused.)

So the jury is still out on finding cases (that are not source-level
optimizable in an obvious way) in which a specified order of argument
evaluation forces the compiler to generate pessimized code.

Andrei

Bo Persson

unread,

Dec 10, 2005, 5:22:37 PM12/10/05

to

"Andrei Alexandrescu (See Website For Email)"

<SeeWebsit...@moderncppdesign.com> skrev i meddelandet
news:Ir8zA...@beaver.cs.washington.edu...

> Bo Persson wrote:
>> "Ron Natalie" <r...@spamcop.net> skrev i meddelandet
>>>but it's generally not possible to
>>>tell that the functions have untoward effect.
>>>
>>>All it takes is a write to the same ultimate output channel (which
>>>may be unrelated C++ streams) to cause variability in observed
>>>behavior.
>>>
>>
>>
>> Yes, but should we encourage that kind of coding, by defining its
>> meaning? I have never felt that I need to write code like
>>
>> i = f(i++, i++);
>>
>> so I don't think it is very productive to spend time specifying
>> exactly what it means.
>
> Maybe, however, at some point someon might write:
>
> f(a++, b++);
>
> where a and b are references that could alias. In that case, it is
> very useful to define the behavior of the code. It would be naive to
> believe that code tripping on unspecified order of execution can
> only be "obviously dumb" by eye inspection.
>

I would still argue that this is what I call "bad code", and that
doing the aliasing without noticing it is even worse. :-)

However, in a previous post you wrote:

>(1) A "programmer willing to optimize" who knows that right-to-left
>evaluation is algorithmically better AND knows that left-to-right
>evaluation is guaranteed will introduce named temporary to force
>right-to-left evaluation. That is the best solution of all, better
>than the programmer just leaving optimality at the whim of the
>compiler.

And here I agree, totally!

Right now, I write my code under the assumption that the compiler is
smart enough to select the proper order - one that is good enough. I
have argued that in my code there is no advantage for a left-to-right
order, because I tend not to write code where it matters.

You just made me realize that the "smart enough" compiler, that I
trust to select a good order, must of course be smart enough to see
this as well. So, if my code is written without the nasty side
effects, and without order dependencies, the compiler can use the
as-if rule and continue to produce the same code for *my* programs.

So, instead of me telling the OP that he can arrange his code better,
we can both get what we want. He can have his left-to-right order of
evaluation, and I can still write my code so that it generally doesn't
matter. In the few cases where it really does matter for me, I can
rearrange *my* code to evaluate the arguments before the function
call, in any order I want. Fine with me!

So, specifying the order of evaluation in the Standard might be a good
idea after all.

Thanks Andrei!

Apologies to Hyman Rosen, for telling you how to write your code.

Bo Persson

Momchil Velikov

unread,

Dec 10, 2005, 5:24:20 PM12/10/05

to

"Andrei Alexandrescu See Website For Email wrote:
> So the jury is still out on finding cases (that are not source-level
> optimizable in an obvious way) in which a specified order of argument
> evaluation forces the compiler to generate pessimized code.

How about finding cases in which the order of evaluation is not
enforceable at the source level in an obvious way ?

~velco

Andrei Alexandrescu (See Website For Email)

unread,

Dec 10, 2005, 9:11:05 PM12/10/05

to

Momchil Velikov wrote:
> "Andrei Alexandrescu See Website For Email wrote:
>
>>So the jury is still out on finding cases (that are not source-level
>>optimizable in an obvious way) in which a specified order of argument
>>evaluation forces the compiler to generate pessimized code.
>
>
> How about finding cases in which the order of evaluation is not
> enforceable at the source level in an obvious way ?

Not sure I understand. For the call (expr0)(arg1, arg2, ..., argn) the
evaluation algorithm should be as if the following happens:

1. Evaluate expr0 resulting in a function f
2. For each i in 1..n in this order, evaluate argi resulting in a value vi
3. Invoke f(v1, v2, ..., vn)

It's a pity that the intended semantics can't be easily expressed as a
source-to-source transformation. (The problem is that rvalue and lvalue
expressions would lead to different types of temporaries.)

Andrei

Momchil Velikov

unread,

Dec 12, 2005, 11:29:03 AM12/12/05

to

Andrei Alexandrescu (See Website For Email) wrote:
> There remain the "not-so-obvious" opportunities for optimization, such
> as those that reuse registers etc. By my assertion number (2)
> ("optimizing compiler") I am clarifying that an optimizing compiler can
> still evaluate things in the order they please al long as the
> left-to-right semantics are unaffected.

The question is whether the compiler is able to deduce the
existance of evaluation orders, different from the strict left-to-right
order, but preserving its semantics. I'd expect the conservative
assumptions, based on incomplete knowledge about a program
to cause the compiler to miss alternatives.

On the other hand, not imposing evaluation orders conveys
important information to the compiler - that the expression
does not depend on the evaluation order, even if this is not
evident to the compiler by other means.

~velco

Hyman Rosen

unread,

Dec 12, 2005, 11:44:38 AM12/12/05

to

Andrei Alexandrescu (See Website For Email) wrote:

> 1. Evaluate expr0 resulting in a function f
> 2. For each i in 1..n in this order, evaluate argi resulting in a value vi
> 3. Invoke f(v1, v2, ..., vn)

Step 2 is wrong. What we actually want there is
2. For each i in 1..n in this order, construct parameter i
of the function using argi. If an exception is thrown
during the construction of any parameter, the previous
parameters are destructed in reverse order of construction.

Momchil Velikov

unread,

Dec 12, 2005, 11:44:56 AM12/12/05

to

"Andrei Alexandrescu See Website For Email wrote:
> Momchil Velikov wrote:
> > "Andrei Alexandrescu See Website For Email wrote:
> >
> >>So the jury is still out on finding cases (that are not source-level
> >>optimizable in an obvious way) in which a specified order of argument
> >>evaluation forces the compiler to generate pessimized code.
> >
> > How about finding cases in which the order of evaluation is not
> > enforceable at the source level in an obvious way ?
>
> Not sure I understand.

Just presenting the opposite view. I'm not sure proponents of the
unspecifed
evaluation order should be put in a defensive position, like your
posting suggests.

> For the call (expr0)(arg1, arg2, ..., argn) the
> evaluation algorithm should be as if the following happens:
>
> 1. Evaluate expr0 resulting in a function f
> 2. For each i in 1..n in this order, evaluate argi resulting in a value vi
> 3. Invoke f(v1, v2, ..., vn)
>
> It's a pity that the intended semantics can't be easily expressed as a
> source-to-source transformation. (The problem is that rvalue and lvalue
> expressions would lead to different types of temporaries.)

Then maybe *this* is the problem to solve. Is it related to the
"forwarding problem"
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1385.htm ?

~velco

Andrei Alexandrescu (See Website For Email)

unread,

Dec 13, 2005, 12:16:25 AM12/13/05

to

Hyman Rosen wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>

>> 1. Evaluate expr0 resulting in a function f
>> 2. For each i in 1..n in this order, evaluate argi resulting in a
>> value vi
>> 3. Invoke f(v1, v2, ..., vn)
>
>

> Step 2 is wrong. What we actually want there is
> 2. For each i in 1..n in this order, construct parameter i
> of the function using argi. If an exception is thrown
> during the construction of any parameter, the previous
> parameters are destructed in reverse order of construction.

I assumed destruction as being a normal outcome of value creation.
That's a language invariant.

Andrei

Andrei Alexandrescu (See Website For Email)

unread,

Dec 13, 2005, 12:16:36 AM12/13/05

to

Momchil Velikov wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>

>>There remain the "not-so-obvious" opportunities for optimization, such
>>as those that reuse registers etc. By my assertion number (2)
>>("optimizing compiler") I am clarifying that an optimizing compiler can
>>still evaluate things in the order they please al long as the
>>left-to-right semantics are unaffected.
>
>
> The question is whether the compiler is able to deduce the
> existance of evaluation orders, different from the strict left-to-right
> order, but preserving its semantics. I'd expect the conservative
> assumptions, based on incomplete knowledge about a program
> to cause the compiler to miss alternatives.

Compilers do that extensively already. The past ten years have seen more
and more aggressive reordering by compilers, and there's no sign of it
slowing down.

> On the other hand, not imposing evaluation orders conveys
> important information to the compiler - that the expression
> does not depend on the evaluation order, even if this is not
> evident to the compiler by other means.

I think in the wake of current developments, the assertion above has
transformed from a certainty into an anachronic speculation that needs
to be revisited.

Andrei Alexandrescu (See Website For Email)

unread,

Dec 13, 2005, 12:16:41 AM12/13/05

to

Momchil Velikov wrote:
>>It's a pity that the intended semantics can't be easily expressed as a
>>source-to-source transformation. (The problem is that rvalue and lvalue
>>expressions would lead to different types of temporaries.)
>
>
> Then maybe *this* is the problem to solve. Is it related to the
> "forwarding problem"
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1385.htm ?

That's being addressed by the rvalue proposal. Solving it won't take
care of defining order of evaluation of function arguments.

Andrei

Momchil Velikov

unread,

Dec 13, 2005, 10:37:06 AM12/13/05

to

"Andrei Alexandrescu See Website For Email wrote:

> Momchil Velikov wrote:
> >>It's a pity that the intended semantics can't be easily expressed as a
> >>source-to-source transformation. (The problem is that rvalue and lvalue
> >>expressions would lead to different types of temporaries.)
> >
> > Then maybe *this* is the problem to solve. Is it related to the
> > "forwarding problem"
> > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1385.htm ?
>
> That's being addressed by the rvalue proposal. Solving it won't take
> care of defining order of evaluation of function arguments.

Sorry, I didn't understand. Do you mean that solving this
issue won't allow evaluation order defining source-to-source
transformation ?

~velco

Hyman Rosen

unread,

Dec 13, 2005, 10:38:02 AM12/13/05

to

Andrei Alexandrescu (See Website For Email) wrote:

> I assumed destruction as being a normal outcome of value creation.
> That's a language invariant.

But it should still be made clear that step two involves
binding the function parameters in order, not just
accumulating a set of values (and references) to be passed
to the function. Evaluating an expression used as an
argument and constructing a function parameter aren't so
obviously the same that it can go without saying. Remember,
one thing that this process is meant to fix is the
f(auto_ptr, auto_ptr) problem.

David Abrahams

unread,

Dec 14, 2005, 12:55:35 AM12/14/05

to

Hyman Rosen <hyr...@mail.com> writes:

> Andrei Alexandrescu (See Website For Email) wrote:
>> I assumed destruction as being a normal outcome of value
>> creation. That's a language invariant.
>
> But it should still be made clear that step two involves
> binding the function parameters in order, not just
> accumulating a set of values (and references) to be passed
> to the function. Evaluating an expression used as an
> argument and constructing a function parameter aren't so
> obviously the same that it can go without saying. Remember,
> one thing that this process is meant to fix is the
> f(auto_ptr, auto_ptr) problem.

It's interesting that this same discussion has been going on
simultaneously on one of the committee mailing lists. Let me just
point out that the general form of that problem is insoluble:

// safe under left-to-right ordering?
f(g(), new T);

As a matter of fact it isn't safe, if f has default arguments.
Leaving that aside, will users be reticent to make this transformation

f(new T, g())

Does one of those look safer to you?

IMO the right solution for the f(auto_ptr, auto_ptr) problem is to add
a library function

auto_ptr_new<T>(arg1, ... argN)

or, in an MPL-enabled world,

new_<auto_ptr<_> >(arg1, ... argN)
new_<shared_ptr<_> >(arg1, ... argN)
new_<unique_ptr<_> >(arg1, ... argN)

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Andrei Alexandrescu (See Website For Email)

unread,

Dec 14, 2005, 12:55:38 AM12/14/05

to

Momchil Velikov wrote:
> "Andrei Alexandrescu See Website For Email wrote:
>
>>Momchil Velikov wrote:
>>
>>>>It's a pity that the intended semantics can't be easily expressed as a
>>>>source-to-source transformation. (The problem is that rvalue and lvalue
>>>>expressions would lead to different types of temporaries.)
>>>
>>>Then maybe *this* is the problem to solve. Is it related to the
>>>"forwarding problem"
>>>http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1385.htm ?
>>
>>That's being addressed by the rvalue proposal. Solving it won't take
>>care of defining order of evaluation of function arguments.
>
>
> Sorry, I didn't understand. Do you mean that solving this
> issue won't allow evaluation order defining source-to-source
> transformation ?

No.

A source-to-source transformation would have made it easier for me to
define order of evaluation by translating C++ to equivalent C++. That
would have made writing my post easier, but it's not essential for
making steps towards defining the order of evaluation.

Andrei

Hyman Rosen

unread,

Dec 14, 2005, 10:32:35 AM12/14/05

to

David Abrahams wrote:
> Let me just point out that the general form
> of that problem is insoluble:
> // safe under left-to-right ordering?
> f(g(), new T);
> As a matter of fact it isn't safe, if f has default arguments.
> Leaving that aside, will users be reticent to make this transformation
> f(new T, g())
> Does one of those look safer to you?

I'm sorry, but I don't understand what you mean.

Under my proposed new regime, function parameters
will be constructed from arguments in the call in
strict left-to-right order, and the arguments will
be evaluated in strict left-to-right order. If
constructing a parameter throws an exception,
previously constructed parameters are destructed
in reverse order. So in the examples, if the new T
argument is for an auto_ptr<T> parameter, then both
versions are equally safe. In the first case, if
g() throws then new T will never be called, and in
the second case if g() throws then the auto_ptr<T>
parameter of f will be destructed and thus the
new T pointer will be freed.

So please explain why you think that the general
case is unsafe, or why people will have to change
parameter order.

Momchil Velikov

unread,

Dec 14, 2005, 10:31:40 AM12/14/05

to

"Andrei Alexandrescu See Website For Email wrote:
> Momchil Velikov wrote:
> > "Andrei Alexandrescu See Website For Email wrote:
> >>Momchil Velikov wrote:
> >>>>It's a pity that the intended semantics can't be easily expressed as a
> >>>>source-to-source transformation. (The problem is that rvalue and lvalue
> >>>>expressions would lead to different types of temporaries.)
> >>>
> >>>Then maybe *this* is the problem to solve. Is it related to the
> >>>"forwarding problem"
> >>>http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1385.htm ?
> >>
> >>That's being addressed by the rvalue proposal. Solving it won't take
> >>care of defining order of evaluation of function arguments.
> >
> > Sorry, I didn't understand. Do you mean that solving this
> > issue won't allow evaluation order defining source-to-source
> > transformation ?
>
> No.
>
> A source-to-source transformation would have made it easier for me to
> define order of evaluation by translating C++ to equivalent C++. That
> would have made writing my post easier, but it's not essential for
> making steps towards defining the order of evaluation.

I'm under the impression we put different meaning in "defining the
order of evaluation". While I mean a source-to-source transformation,
which a programmer employs whenever (s)he want to impose a concrete
order of evaluation, you seem to use it to refer to a change in the
C++ language specification.

To restate the question, will the rvalue proposal [1] enable a
programmer to perform source-to-source transformations
whenever (s)he wants to specify a concrete evaluation order
of subexpressions and function arguments ?

~velco

[1] I guess by "rvalue proposal" you mean this
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1690.html

Andrei Alexandrescu (See Website For Email)

unread,

Dec 14, 2005, 5:10:48 PM12/14/05

to

David Abrahams wrote:
> It's interesting that this same discussion has been going on

> simultaneously on one of the committee mailing lists. Let me just

> point out that the general form of that problem is insoluble:
>
> // safe under left-to-right ordering?
> f(g(), new T);
>
> As a matter of fact it isn't safe, if f has default arguments.
> Leaving that aside, will users be reticent to make this transformation
>
> f(new T, g())
>
> Does one of those look safer to you?
>

> IMO the right solution for the f(auto_ptr, auto_ptr) problem is to add
> a library function
>
> auto_ptr_new<T>(arg1, ... argN)

I'd be interested in understanding the solution of principle even if it
would be too late to push that for standardization.

The code f(new T, g()) can leak because the call to new is bounded to a
temporary that never reaches f. I believe that that's an issue that's
not deeply linked to the order of evaluation. Defining the order of
evaluation would leave that case alone.

However, I believe defining the order of evaluation would solve
f(auto_ptr<T>, auto_ptr<T>). This is because the language invariant
preserves the rule that any value that was created will be destroyed.

Too much effort for the compiler? It's not, really, and looking at the
rules for constructing arrays gives good insights. So, right now, when
you write:

T * p = new T[n];

there's a lot going on. The semantics of the code is really:

T * p;
{
size_t __i = 0;
on_scope_failure {
while (__i > 0) {
p[--__i].T::~T();
}
}
for (; __i != n; ++__i) {
new(p + __i) T();
}
}

(With this occasion I've also shown how nice coding with on_scope_xxx is
:o). It's not much of a difference with try/catch (...) in this case
though.)

We can generalize this idea to the creation of function parameter lists.

Andrei

Andrei Alexandrescu (See Website For Email)

unread,

Dec 14, 2005, 5:10:53 PM12/14/05

to

Andrei Alexandrescu (See Website For Email) wrote:

> Too much effort for the compiler? It's not, really, and looking at the
> rules for constructing arrays gives good insights. So, right now, when
> you write:
>
> T * p = new T[n];
>
> there's a lot going on. The semantics of the code is really:
>
> T * p;
> {
> size_t __i = 0;
> on_scope_failure {
> while (__i > 0) {
> p[--__i].T::~T();
> }
> }
> for (; __i != n; ++__i) {
> new(p + __i) T();
> }
> }

Oops, I meant:

T * p = operator new(n * sizeof(T));

{
size_t __i = 0;
on_scope_failure {
while (__i > 0) {
p[--__i].T::~T();
}

operator delete[](p);

}
for (; __i != n; ++__i) {
new(p + __i) T();
}
}

.with the appropriate magic of calling T::operator new and T::operator
delete if T defines those. By the way, is there a portable way of
writing that in source code? That is, call T::operator new if it's
defined, otherwise call ::operator new.

One more correction to my same post: When I said:

"The code f(new T, g()) can leak because the call to new is bounded to a
temporary that never reaches f. I believe that that's an issue that's
not deeply linked to the order of evaluation. Defining the order of
evaluation would leave that case alone."

. I referred to the case when f takes a raw pointer to T as its first
argument.

David Abrahams

unread,

Dec 14, 2005, 7:42:32 PM12/14/05

to

hyr...@mail.com (Hyman Rosen) writes:

> David Abrahams wrote:
>> Let me just point out that the general form
>> of that problem is insoluble:
>> // safe under left-to-right ordering?
>> f(g(), new T);
>> As a matter of fact it isn't safe, if f has default arguments.
>> Leaving that aside, will users be reticent to make this transformation
>> f(new T, g())
>> Does one of those look safer to you?
>
> I'm sorry, but I don't understand what you mean.
>
> Under my proposed new regime, function parameters
> will be constructed from arguments in the call in
> strict left-to-right order, and the arguments will
> be evaluated in strict left-to-right order. If
> constructing a parameter throws an exception,
> previously constructed parameters are destructed
> in reverse order. So in the examples, if the new T
> argument is for an auto_ptr<T> parameter, then both
> versions are equally safe. In the first case, if
> g() throws then new T will never be called, and in
> the second case if g() throws then the auto_ptr<T>
> parameter of f will be destructed and thus the
> new T pointer will be freed.

f doesn't have an auto_ptr<T> parameter. It takes two pointers.

> So please explain why you think that the general
> case is unsafe, or why people will have to change
> parameter order.

It's not about "having to" change. It's about whether you'll notice
that the change affects safety.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---