Evaluation precedence questions

20 views
Skip to first unread message

Alessandro Molina

unread,
Jan 14, 2004, 6:08:07 PM1/14/04
to
Standard asserts that there isn't an official way to evaluate
statements like:

a[i] = i++
and similars

I was just curious about the reasons. Why didn't the standard
define a generic evaluation direction for expressions?

What problems will happen stating that compilers must always evaluate
expressions from left or from right for example?

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Hyman Rosen

unread,
Jan 15, 2004, 5:30:47 AM1/15/04
to
Alessandro Molina wrote:
> I was just curious about the reasons. Why didn't the standard
> define a generic evaluation direction for expressions?

Because We've Always Done It This Way (tm)

People will blather on about optimization possibilities,
but that's a crock. It would be completely reasonable to
adopt Java's order of evaluation semantics and finally
get rid of this stupid albatross. But it will never happen.

Jack Klein

unread,
Jan 15, 2004, 5:38:53 AM1/15/04
to
On 14 Jan 2004 18:08:07 -0500, Alessandro Molina <jav...@supereva.it>
wrote in comp.lang.c++.moderated:

> Standard asserts that there isn't an official way to evaluate
> statements like:
>
> a[i] = i++
> and similars
>
> I was just curious about the reasons. Why didn't the standard
> define a generic evaluation direction for expressions?

For compatibility with C, and for performance.

> What problems will happen stating that compilers must always evaluate
> expressions from left or from right for example?

What will happen is this:

Any order that you pick will generate efficient object code on some
compilers for some processors, and inefficient object code on some
other compilers for some other processors.

One of the basic tenets of C++ is to be as efficient as C for
low-level programming, and in some cases more efficient than C for
high-level programming.

One of the basic tenets of C, and so inherited by C++, is to be as
efficient as possible. Period.

There are languages that will provide the guarantees that you want.
In general, they are not as fast or efficient as C or C++ at the kind
of low-level programming that C and C++ are used for, especially in
systems programming and real-time embedded systems.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html

Dhruv

unread,
Jan 15, 2004, 5:40:21 AM1/15/04
to
On Wed, 14 Jan 2004 18:08:07 -0500, Alessandro Molina wrote:

> Standard asserts that there isn't an official way to evaluate
> statements like:
>
> a[i] = i++
> and similars
>
> I was just curious about the reasons. Why didn't the standard
> define a generic evaluation direction for expressions?
>
> What problems will happen stating that compilers must always evaluate
> expressions from left or from right for example?

I don't know much about the issues, but I think that the example you have
given above should be legal. Consider:

i = 0;

Then, the quiivalen code generated would be: (possibly!)

a[0] = 0;
++i;


Refards,
-Dhruv.

Francis Glassborow

unread,
Jan 15, 2004, 10:12:26 AM1/15/04
to
In message <gg3c00p4aqehbmvo3...@4ax.com>, Jack Klein
<jack...@spamcop.net> writes

>On 14 Jan 2004 18:08:07 -0500, Alessandro Molina <jav...@supereva.it>
>wrote in comp.lang.c++.moderated:
>
> > Standard asserts that there isn't an official way to evaluate
> > statements like:
> >
> > a[i] = i++
> > and similars
> >
> > I was just curious about the reasons. Why didn't the standard
> > define a generic evaluation direction for expressions?
>
>For compatibility with C, and for performance.

I recently suggest at a WG14 meeting (C Standard) that future releases
of the C Standard should specify the order of evaluation (it cannot
break any existing well-formed code because the order is currently
unspecified). The only problem is (as I knew when I raised the question)
that it would break some existing compilers.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit

Hyman Rosen

unread,
Jan 15, 2004, 9:53:22 PM1/15/04
to
Jack Klein wrote:
> For compatibility with C

Of course not, since giving defined behavior to formerly
undefined code cannot cause incompatibility.

and for performance.

In older threads on the same subject people have managed to
come up with isolated cases where God can generate better
code by using an order of evaluation other than "left before
right, operands before operation". The practical effects of
these, especially compared with the decades of erroneous code
that this undefined behavior has engendered, are nil.

John Potter

unread,
Jan 15, 2004, 10:01:03 PM1/15/04
to
On 15 Jan 2004 05:40:21 -0500, "Dhruv" <dhru...@gmx.net> wrote:

> On Wed, 14 Jan 2004 18:08:07 -0500, Alessandro Molina wrote:

> > a[i] = i++
> > and similars

> I don't know much about the issues, but I think that the example you have


> given above should be legal. Consider:

It is legal, well formed. You just have no idea what it will do.

> i = 0;

> Then, the quiivalen code generated would be: (possibly!)

> a[0] = 0;
> ++i;

or

++i;
a[1] = 0;

or I is a 32 bit thing on an 8 bit machine and the software
increment is interleaved with the subscripting and formats
the hard drive.

John

Joshua Lehrer

unread,
Jan 16, 2004, 8:07:32 AM1/16/04
to
Jack Klein <jack...@spamcop.net> wrote in message news:<gg3c00p4aqehbmvo3...@4ax.com>...

> Any order that you pick will generate efficient object code on some
> compilers for some processors, and inefficient object code on some
> other compilers for some other processors.
>

so rather than pick one, which will be efficient on some platforms,
and not on others, we pick neither, which means that you can't write
the code at all?

joshua lehrer
factset research systems
NYSE:FDS

Frank Birbacher

unread,
Jan 16, 2004, 8:08:57 AM1/16/04
to
Hi!

Dhruv wrote:
> I don't know much about the issues, but I think that the example you have
> given above should be legal. Consider:

It is not legal if i is a built in type (int, long, ...).

Frank

Francis Glassborow

unread,
Jan 16, 2004, 8:25:31 PM1/16/04
to
In message <ngdd00p1nmcbhg3kb...@4ax.com>, John Potter
<jpo...@falcon.lhup.edu> writes

>On 15 Jan 2004 05:40:21 -0500, "Dhruv" <dhru...@gmx.net> wrote:
>
>> On Wed, 14 Jan 2004 18:08:07 -0500, Alessandro Molina wrote:
>
>> > a[i] = i++
>> > and similars
>
>> I don't know much about the issues, but I think that the example you have
>> given above should be legal. Consider:

It rather depends on what is meant by 'legal'. It breaches a requirement
of the Standard (no diagnostic required) in that it evaluates i for two
distinct purposes only one of which is to determine the value to be
written (to i)


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit

Francis Glassborow

unread,
Jan 16, 2004, 8:26:06 PM1/16/04
to
In message <10741746...@master.nyc.kbcfp.com>, Hyman Rosen
<hyr...@mail.com> writes

>Jack Klein wrote:
>> For compatibility with C
>
>Of course not, since giving defined behavior to formerly
>undefined code cannot cause incompatibility.


Careful about the terminology, undefined is usually applied to code
where the Standard says nothing at all and so it can do anything the
compiler decides. Order of evaluation is 'unspecified' which means that
the compiler is strongly constrained and cannot do anything, only choose
some order of evaluation that will deliver the values of operands in a
timely fashion.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit

Ben Hutchings

unread,
Jan 16, 2004, 8:32:36 PM1/16/04
to
Hyman Rosen wrote:
> Jack Klein wrote:
>> For compatibility with C
>
> Of course not, since giving defined behavior to formerly
> undefined code cannot cause incompatibility.
>
> and for performance.
>
> In older threads on the same subject people have managed to
> come up with isolated cases where God can generate better
> code by using an order of evaluation other than "left before
> right, operands before operation".
<snip>

Nonsense. If the implementation is free to evaluate operands in any
order it can use this freedom to minimise the number of registers or
stack slots used and thus to reduce memory traffic. This is a fairly
basic optimisation that was covered in the Compilers and Operating
Systems course I took as an undergraduate.

Francis Glassborow

unread,
Jan 16, 2004, 8:33:08 PM1/16/04
to
In message <31c49f0d.04011...@posting.google.com>, Joshua
Lehrer <usene...@lehrerfamily.com> writes

>Jack Klein <jack...@spamcop.net> wrote in message news:<gg3c00p4aqehbmvo3...@4ax.com>...
> > Any order that you pick will generate efficient object code on some
> > compilers for some processors, and inefficient object code on some
> > other compilers for some other processors.
> >
>
>so rather than pick one, which will be efficient on some platforms,
>and not on others, we pick neither, which means that you can't write
>the code at all?

How does that follow? By making some things unspecified we turn them
into QoI issues.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit

Hyman Rosen

unread,
Jan 16, 2004, 10:13:32 PM1/16/04
to
Frank Birbacher wrote:
> It is not legal if i is a built in type (int, long, ...).

And so we spin this stupid thread again and again and again...
Am I the only one who is sick to death of it?

Dhruv

unread,
Jan 16, 2004, 10:20:30 PM1/16/04
to
On Thu, 15 Jan 2004 22:01:03 -0500, John Potter wrote:

> On 15 Jan 2004 05:40:21 -0500, "Dhruv" <dhru...@gmx.net> wrote:
>
>> On Wed, 14 Jan 2004 18:08:07 -0500, Alessandro Molina wrote:
>
>> > a[i] = i++
>> > and similars
>
>> I don't know much about the issues, but I think that the example you have
>> given above should be legal. Consider:
>
> It is legal, well formed. You just have no idea what it will do.

[...]

> ++i;
> a[1] = 0;
>
> or I is a 32 bit thing on an 8 bit machine and the software
> increment is interleaved with the subscripting and formats
> the hard drive.

Ok, so let me get a few things straight in my mind:

1. Is = a sequence point?
2. wrt. operator++(int) [post increment] applied to ints (PODS), when
exactly is it said that the increment will happen? Possible options
include:

A) Between the assignment taking place, and the RHS expression being
evaluated.
B) After the value of i has been obtained in the expression, so something
like this would be illegal foo (i, i++) (Which I guess is illegal
currently).
C) After the complete expression has been evaluated. (Which was my initial
assumption).
D) None of the above?


Regards,
-Dhruv.

John Potter

unread,
Jan 17, 2004, 6:04:05 AM1/17/04
to
On 16 Jan 2004 20:26:06 -0500, Francis Glassborow
<fra...@robinton.demon.co.uk> wrote:

> In message <10741746...@master.nyc.kbcfp.com>, Hyman Rosen
> <hyr...@mail.com> writes

> >Jack Klein wrote:
> >> For compatibility with C

> >Of course not, since giving defined behavior to formerly
> >undefined code cannot cause incompatibility.

> Careful about the terminology, undefined is usually applied to code
> where the Standard says nothing at all and so it can do anything the
> compiler decides. Order of evaluation is 'unspecified' which means that
> the compiler is strongly constrained and cannot do anything, only choose
> some order of evaluation that will deliver the values of operands in a
> timely fashion.

The original example for this thread had undefined behavior. Can you
give an example using fundamental types where the unspecified order
permits two or more well defined results? I think that any expression
in which the order of evaluation makes a difference will also have
undefined behavior.

John

Francis Glassborow

unread,
Jan 17, 2004, 6:31:07 PM1/17/04
to
In message <pan.2004.01.16....@gmx.net>, Dhruv
<dhru...@gmx.net> writes

>Ok, so let me get a few things straight in my mind:
>
>1. Is = a sequence point?

No, though some argue that it should be.

>2. wrt. operator++(int) [post increment] applied to ints (PODS), when
>exactly is it said that the increment will happen? Possible options
>include:
>
>A) Between the assignment taking place, and the RHS expression being
>evaluated.
>B) After the value of i has been obtained in the expression, so something
>like this would be illegal foo (i, i++) (Which I guess is illegal
>currently).
>C) After the complete expression has been evaluated. (Which was my initial
>assumption).
>D) None of the above?

D) However it is easy to confuse evaluation, which takes place at some
time before the value is needed with the side-effect of writing a new
value to memory which takes place not later than the next sequence
point.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit

Francis Glassborow

unread,
Jan 17, 2004, 6:49:15 PM1/17/04
to
In message <i74h00p9gi60dnepu...@4ax.com>, John Potter
<jpo...@falcon.lhup.edu> writes

>The original example for this thread had undefined behavior. Can you
>give an example using fundamental types where the unspecified order
>permits two or more well defined results? I think that any expression
>in which the order of evaluation makes a difference will also have
>undefined behavior.

Yes but not because of any order of evaluation problem and changing the
rules for order of evaluation would not fix all the problems concerned
with when side effects of evaluation take place (and being more
restrictive on that issue would almost certainly degrade performance).

When I last raised the question about order of evaluation affecting
results (admittedly in a C context) the response was that that was only
undefined behaviour if some specific rule was breached so:

#include <iostream>
int i(3);
int foo(){i++; return i;}
int main(){
std::cout << i + foo() << '\n';
}

does not include UB though the output might be either 7 or 8.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit

Andrew Koenig

unread,
Jan 17, 2004, 7:00:44 PM1/17/04
to
> The original example for this thread had undefined behavior. Can you
> give an example using fundamental types where the unspecified order
> permits two or more well defined results?

int append(vector<int>& v, int n)
{
v.push_back(n);
return n;
}

vector int ambig()
{
vector<int> v;
append(3) + append(4);
return v;
}

If you don't like the use of vector<int>, it's easy to recast it in terms of
arrays and pointers.

James Dennett

unread,
Jan 17, 2004, 8:20:37 PM1/17/04
to
John Potter wrote:
> On 16 Jan 2004 20:26:06 -0500, Francis Glassborow
> <fra...@robinton.demon.co.uk> wrote:
>
> > In message <10741746...@master.nyc.kbcfp.com>, Hyman Rosen
> > <hyr...@mail.com> writes
>
> > >Jack Klein wrote:
> > >> For compatibility with C
>
> > >Of course not, since giving defined behavior to formerly
> > >undefined code cannot cause incompatibility.
>
> > Careful about the terminology, undefined is usually applied to code
> > where the Standard says nothing at all and so it can do anything the
> > compiler decides. Order of evaluation is 'unspecified' which means that
> > the compiler is strongly constrained and cannot do anything, only choose
> > some order of evaluation that will deliver the values of operands in a
> > timely fashion.
>
> The original example for this thread had undefined behavior. Can you
> give an example using fundamental types where the unspecified order
> permits two or more well defined results? I think that any expression
> in which the order of evaluation makes a difference will also have
> undefined behavior.

// Silly code for exposition only.

static int i = 0;

int f() { return i = 1; }
int g() { return i = 2; }
int h(int,int,int) { } // order of arg evaluation unspecified

int main() {
if ( h(f(), g(), i) == 1 ) ...
}

Or were you disallowing function calls as well as types
other than fundamental types?

-- James.

Frank Birbacher

unread,
Jan 17, 2004, 8:23:31 PM1/17/04
to
Hi!

Hyman Rosen wrote:
> And so we spin this stupid thread again and again and again...
> Am I the only one who is sick to death of it?

Most of the time I'm reading and writing news offline. Sometimes I'm not
online for one day. Thus my post will appear later. I just read about
ten messages of the thread now. Sorry for the inconvenience.

Frank

John Potter

unread,
Jan 17, 2004, 8:25:22 PM1/17/04
to
On 16 Jan 2004 22:13:32 -0500, Hyman Rosen <hyr...@mail.com> wrote:

> And so we spin this stupid thread again and again and again...
> Am I the only one who is sick to death of it?

It seems that you are not because you keep posting. I seem to have
misplaced the reference to your proposal to change the rules. Could
you post the committee document number?

Those who do not know the rules ask. Those who are sick of the
discussion ignore it. The rules may be stupid, but they are simple
enough to understand. Good practice avoids multiple side effects
and bypasses the problem anyway. I enjoy amusing myself with
undefined behavior which always does the right thing.

((((y = a) *= x) += b) *= x) += c;

That may be changed to well-defined some day. It is also amusing to
see what happens in other cases where there is no right thing.

x = 3;
y = ++x + ++x + ++x;
x = 3;
z = ++x + (++x + ++x);

One should not assert that undefined behavior is associative.

John

John Potter

unread,
Jan 17, 2004, 8:25:43 PM1/17/04
to
On 16 Jan 2004 22:20:30 -0500, "Dhruv" <dhru...@gmx.net> wrote:

> Ok, so let me get a few things straight in my mind:

> 1. Is = a sequence point?

No.

> 2. wrt. operator++(int) [post increment] applied to ints (PODS), when
> exactly is it said that the increment will happen?

What does that mean? Are you talking about the expression value or the
store in the variable?

> Possible options include:

> A) Between the assignment taking place, and the RHS expression being
> evaluated.

I guess you mean after evaluation and before assignment. Maybe.

> B) After the value of i has been obtained in the expression,

What? Maybe.

> so something
> like this would be illegal foo (i, i++) (Which I guess is illegal
> currently).

Well formed with unspecified behavior. It will not format the hard
drive. If it is foo (int i, int j), either i == j or i == j + 1. If
it is foo (int& i, int j), i == j + 1.

> C) After the complete expression has been evaluated. (Which was my initial
> assumption).

Maybe.

> D) None of the above?

No.

All of the above. It is unspecified. The store will happen by the
next sequence point. In the example, the sequence point is the ;

Prefix increment/decrement and assignment are much more interesting
because they produce lvalues.

i++ %= n; // ill-formed, lvalue required

++i %= n; // ill-formed C, undefined C++ which works fine everywhere

x[++i] = i; // Undefined everywhere

The last one is amusing because it turns unspecified behavior into
undefined behavior. The translation could be increment and assign
the new value. It could also be obtain the old value, increment
and assign. Since the second version accesses the old value of an
object which is modified for a reason other than to determine the
new value, it is undefined behavior. It is unspecified which order
will be done and one of the possible orders has undefined behavior;
therefor, the expression has undefined behavior always. An amusing
version would be to obtain the value of i, increment it, store the odd
numbered bits, retrieve the rhs, make the assignment, store the even
numbered bits. The store is finished by the sequence point.

There is a movement to make some lvalue expressions well defined. In
the above example the result of ++i is the lvalue after the modification
has taken place and the %= can be performed with known results. It
works everywhere because the lvalue of i is known at compile time and
the translation is fetch, increment, remainder, store. Compiler
writers are allowed to generate anything, they just don't.

John

John Potter

unread,
Jan 18, 2004, 6:21:34 AM1/18/04
to
On 17 Jan 2004 19:00:44 -0500, "Andrew Koenig" <a...@acm.org> wrote:

> > The original example for this thread had undefined behavior. Can you
> > give an example using fundamental types where the unspecified order
> > permits two or more well defined results?

> int append(vector<int>& v, int n)
> {
> v.push_back(n);
> return n;
> }

> vector int ambig()
> {
> vector<int> v;
> append(3) + append(4);
> return v;
> }

> If you don't like the use of vector<int>, it's easy to recast it in terms of
> arrays and pointers.

I intended result to be the value of the expression not some other side
effect. I guess that I should also complain about it not compiling, but
that is just a few oopses. I like the lack of globals, but not the use
of compound types.

Thanks anyway, for another example of unspecified not being undefined.

John

John Potter

unread,
Jan 18, 2004, 6:22:03 AM1/18/04
to
On 17 Jan 2004 18:49:15 -0500, Francis Glassborow
<fra...@robinton.demon.co.uk> wrote:

> When I last raised the question about order of evaluation affecting
> results (admittedly in a C context) the response was that that was only
> undefined behaviour if some specific rule was breached so:

> #include <iostream>
> int i(3);
> int foo(){i++; return i;}
> int main(){
> std::cout << i + foo() << '\n';
> }

> does not include UB though the output might be either 7 or 8.

Yes, thank you. I always forget about those nasty globals. At
least your example uses a global to produce an expression result.

John

John Potter

unread,
Jan 18, 2004, 6:24:47 AM1/18/04
to
On 17 Jan 2004 20:20:37 -0500, James Dennett <jden...@acm.org> wrote:

> // Silly code for exposition only.

> static int i = 0;

> int f() { return i = 1; }
> int g() { return i = 2; }
> int h(int,int,int) { } // order of arg evaluation unspecified

> int main() {
> if ( h(f(), g(), i) == 1 ) ...
> }

> Or were you disallowing function calls as well as types
> other than fundamental types?

Just did not consider functions using non-arguments. Since functions
with global side effects are evil, I didn't think of them. Thanks
for another example of unspecified defined behavior.

John

Thomas Mang

unread,
Jan 18, 2004, 7:24:36 PM1/18/04
to

John Potter schrieb:

>
> x[++i] = i; // Undefined everywhere
>
> The last one is amusing because it turns unspecified behavior into
> undefined behavior.

John, I do not understand why this example yields undefined behavior.
I would have guessed it's unspecified behavior, because i is modified only once
without intervening sequence point, but read twice.

Is it because i is read at least once for another purpose than to calculate the
new value?

Could you please elaborate?


regards,

Thomas

Frank Birbacher

unread,
Jan 18, 2004, 7:29:24 PM1/18/04
to
Hi!

James Dennett wrote:
> int h(int,int,int) { } // order of arg evaluation unspecified

Why are these arguments not evaluated in order? Did I missiterpreted
something? AFAIK a user defined function introduces a sequence point and
arguments are evaluated left-to-right order. Please correct me if I am
wrong.

Frank

Frank Birbacher

unread,
Jan 18, 2004, 7:30:07 PM1/18/04
to
Hi!

Andrew Koenig wrote:
> vector int ambig()
> {
> vector<int> v;
> append(3) + append(4);
> return v;
> }

This is the point where having the vector iterator being a class instead
of a pointer makes some difference, which I don't like. If the iterator
was a class, the result would be determined.

Frank

Hyman Rosen

unread,
Jan 19, 2004, 2:36:41 AM1/19/04
to
John Potter wrote:
> It seems that you are not because you keep posting. I seem to have
> misplaced the reference to your proposal to change the rules. Could
> you post the committee document number?

I just want to whine and have someone else do the hard work. It's easy
enough to specify the rules. I just haven't gone through the effort of
finding every mention of "sequence point" and revising the text. It's
especially a problem because my PDF copy of the Standard is the original
version that doesn't allow its text to be copied.

My proposal, should I write it, would eliminate all mention of sequence
points, and instead define the concept of an expression being "fully
evaluated". A fully evaluated expression has all of its side effects
complete. Then, for a function call, the postfix expression denoting
the function to be called is fully evaluated, followed by the arguments
in left-to-right order, then the call of the function itself. For
built-in operations, the operands are fully evaluated in left-to-right
order, then the operation is fully evaluated. If a fully evaluated
expression results in an lvalue, the identity of the object or function
to which the lvalue refers is fixed at that point. All full expressions
are fully evaluated once their evaluation is complete. The short circuit
operations keep their old semantics.

Eg.
void hoist(char *p) { if (*p) while (*p = *++p); }
(etc.)

Andrea Griffini

unread,
Jan 19, 2004, 2:37:25 AM1/19/04
to
On 18 Jan 2004 19:29:24 -0500, Frank Birbacher
<bloodym...@gmx.net> wrote:

>arguments are evaluated left-to-right order.

Where did you read this ? Please note that the
comma between the parameters in a function call
is NOT the comma operator.

Andrea

John Potter

unread,
Jan 19, 2004, 2:37:52 AM1/19/04
to
On 18 Jan 2004 19:24:36 -0500, Thomas Mang <a980...@unet.univie.ac.at>
wrote:

> John Potter schrieb:

> > x[++i] = i; // Undefined everywhere

> > The last one is amusing because it turns unspecified behavior into
> > undefined behavior.

> John, I do not understand why this example yields undefined behavior.
> I would have guessed it's unspecified behavior, because i is modified only once
> without intervening sequence point, but read twice.

> Is it because i is read at least once for another purpose than to calculate the
> new value?

Yes. See 5/4.

John

Francis Glassborow

unread,
Jan 19, 2004, 12:23:10 PM1/19/04
to
In message <bue66t$g7uv7$1...@ID-220067.news.uni-berlin.de>, Frank
Birbacher <bloodym...@gmx.net> writes

>Andrew Koenig wrote:
>> vector int ambig()
>> {
>> vector<int> v;
>> append(3) + append(4);
>> return v;
>> }
>
>This is the point where having the vector iterator being a class instead
>of a pointer makes some difference, which I don't like. If the iterator
>was a class, the result would be determined.

Makes no difference, either of the two calls to append can be called
first, and even if the '+' is user defined that is still the case.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit

Francis Glassborow

unread,
Jan 19, 2004, 12:23:35 PM1/19/04
to
In message <400A682D...@unet.univie.ac.at>, Thomas Mang
<a980...@unet.univie.ac.at> writes

>>
>> x[++i] = i; // Undefined everywhere
>>
>> The last one is amusing because it turns unspecified behavior into
>> undefined behavior.
>
>John, I do not understand why this example yields undefined behavior.
>I would have guessed it's unspecified behavior, because i is modified only once
>without intervening sequence point, but read twice.
>
>Is it because i is read at least once for another purpose than to calculate the
>new value?

Yes. And the reason for making that undefined is that we did not (way
back in the C days) want to specify that memory was written immediately,
only that it should be done not earlier than the prior sequence point
and not later than the next one. That actually makes sense when we
realise that writing memory actually takes time -- often, more time than
register based evaluations.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit

Francis Glassborow

unread,
Jan 19, 2004, 12:23:56 PM1/19/04
to
In message <bue6d2$gnq40$1...@ID-220067.news.uni-berlin.de>, Frank
Birbacher <bloodym...@gmx.net> writes

>James Dennett wrote:
>> int h(int,int,int) { } // order of arg evaluation unspecified
>
>Why are these arguments not evaluated in order? Did I missiterpreted
>something? AFAIK a user defined function introduces a sequence point and
>arguments are evaluated left-to-right order. Please correct me if I am
>wrong.

There is no such requirement in C++ (which is largely what this thread
is about) sub-expressions (and each argument is effectively a
sub-expression) can be evaluated in any order and at any time prior to
their use.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit

ka...@gabi-soft.fr

unread,
Jan 19, 2004, 12:38:03 PM1/19/04
to
Frank Birbacher <bloodym...@gmx.net> wrote in message
news:<bue6d2$gnq40$1...@ID-220067.news.uni-berlin.de>...

> James Dennett wrote:
> > int h(int,int,int) { } // order of arg evaluation unspecified

> Why are these arguments not evaluated in order? Did I missiterpreted
> something? AFAIK a user defined function introduces a sequence point
> and arguments are evaluated left-to-right order. Please correct me if
> I am wrong.

You're wrong. A function call introduces a sequence point, and all of
its arguments must be fully evaluated (including side effects) before
the function is called, but the arguments can be evaluated in any
order. Or the evaluation of the arguments can even be interleaved.

In fact, a number of the earlier compilers normally evaluated the
arguments from right to left.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

ka...@gabi-soft.fr

unread,
Jan 19, 2004, 12:41:57 PM1/19/04
to
Francis Glassborow <fra...@robinton.demon.co.uk> wrote in message
news:<CSfRy1Ad...@robinton.demon.co.uk>...

> In message <31c49f0d.04011...@posting.google.com>, Joshua
> Lehrer <usene...@lehrerfamily.com> writes
> >Jack Klein <jack...@spamcop.net> wrote in message
> >news:<gg3c00p4aqehbmvo3...@4ax.com>...
> > > Any order that you pick will generate efficient object code on
> > > some compilers for some processors, and inefficient object code on
> > > some other compilers for some other processors.

> >so rather than pick one, which will be efficient on some platforms,
> >and not on others, we pick neither, which means that you can't write
> >the code at all?

> How does that follow? By making some things unspecified we turn them
> into QoI issues.

I'm not sure which side Joshua is actually arguing for, but his comment
is quite correct. The question was: why is the following (or something
similar) undefined behavior:

a[ i ] = i++ ;

We could impose a specific ordering of the expression, and make it fully
defined behavior. The current solution, however, is that you can't (or
more correctly, may not) write this code.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

ka...@gabi-soft.fr

unread,
Jan 19, 2004, 12:42:19 PM1/19/04
to
Ben Hutchings <do-not-s...@bwsint.com> wrote in message
news:<slrnc0foli.1po....@tin.bwsint.com>...

> Hyman Rosen wrote:
> > Jack Klein wrote:
> >> For compatibility with C

> > Of course not, since giving defined behavior to formerly
> > undefined code cannot cause incompatibility.

> > and for performance.

> > In older threads on the same subject people have managed to
> > come up with isolated cases where God can generate better
> > code by using an order of evaluation other than "left before
> > right, operands before operation".
> <snip>

> Nonsense. If the implementation is free to evaluate operands in any
> order it can use this freedom to minimise the number of registers or
> stack slots used and thus to reduce memory traffic.

You're forgetting the as if rule. If the arguments don't contain
function calls themselves, the compiler should be able to reorganize
things exactly as it does today (at least in the cases which are legal
today).

I can see two cases where it *might* make a difference (might, because
I'm not convinced that there are cases in real code where the difference
will be measurable) :

- if the arguments contain calls to non-inlined functions, and

- in cases like f( (*p) ++, (*q) ++ ) -- the current rules allow the
compiler to assume that p and q do not point to the same object, and
interleaf the accesses, where as strict ordering wouldn't.

In both cases, I suspect that any optimization is more theoretical than
real. In most, if not all cases where the resulting difference will
have a measurable impact, the compiler will have enough knowledge to do
it under the "as if" rule.

> This is a fairly basic optimisation that was covered in the Compilers
> and Operating Systems course I took as an undergraduate.

I suspect that a lot of things that were common knowledge when I learned
about compilers are no longer true:-). Do modern compilers still use
Sethi-Ulmann numbers to decide which side of the expression to evaluate
first? (That's really a serious question. It's been a while since I've
been away from compilers, and I think techniques like register coloring
have pretty much replaced Sethi-Ulmann numbers, but I could be wrong,
particularly in the case of register poor architectures like IA-32.)

ka...@gabi-soft.fr

unread,
Jan 19, 2004, 12:52:52 PM1/19/04
to
John Potter <jpo...@falcon.lhup.edu> wrote in message
news:<ik3i00p24gbegpo6v...@4ax.com>...

> On 16 Jan 2004 22:20:30 -0500, "Dhruv" <dhru...@gmx.net> wrote:

> > Ok, so let me get a few things straight in my mind:

> > 1. Is = a sequence point?

> No.

> > 2. wrt. operator++(int) [post increment] applied to ints (PODS),
> > when exactly is it said that the increment will happen?

> What does that mean? Are you talking about the expression value or
> the store in the variable?

> > Possible options include:

> > A) Between the assignment taking place, and the RHS expression being
> > evaluated.

> I guess you mean after evaluation and before assignment. Maybe.

> > B) After the value of i has been obtained in the expression,

> What? Maybe.

> > so something like this would be illegal foo (i, i++) (Which I guess
> > is illegal currently).

> Well formed with unspecified behavior. It will not format the hard
> drive.

§5/4: "Except where noted, the order of evaluation of operands of
individual operators and subexpressions of individual expressions, and
the order in which side effects take place, is unspecified. Between the
previous and next sequence point a scalar object shall have its stored
value modifed at most once by the evaluation of an expression.
Furhtermore, the prior value shall be accessed only to determine the
value to be stored. The requirements of this paragraph shall be met for
each allowable ordering of the subexpressions of a full expression;
otherwise the behavior is UNDEFINED"

The order in which (i, i++) is evaluated is unspecified; the compiler is
even allowed to interleave the operations. (Note that the comma here is
NOT a comma operator, and does not introduce a sequence point.) In one
of the possible orderings, the prior value of i is read for itself, and
not to determine the value which will be stored. Please explain why you
think that the last sentence quoted above does not apply.

> If it is foo (int i, int j), either i == j or i == j + 1. If
> it is foo (int& i, int j), i == j + 1.

Good point. If it is foo( int& i, int j ), the code is legal and fully
specified, because passing i by reference doesn't cause the value to be
read.

> > C) After the complete expression has been evaluated. (Which was my
> > initial assumption).

> Maybe.

> > D) None of the above?

> No.

> All of the above. It is unspecified. The store will happen by the
> next sequence point. In the example, the sequence point is the ;

In the original example (a[i]=i++). How is this different from
f(i,i++)? In the latter, there is an additional sequence point in the
function call, but it is too late to make a difference.

> Prefix increment/decrement and assignment are much more interesting
> because they produce lvalues.

> i++ %= n; // ill-formed, lvalue required

> ++i %= n; // ill-formed C, undefined C++ which works fine everywhere

Does it? Have you really tried all existing compilers? And what does
"works fine" mean in this case? The language doesn't assign any
semantics to this statement, and I can think of at least two reasonable
interpretations.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

ka...@gabi-soft.fr

unread,
Jan 19, 2004, 12:55:10 PM1/19/04
to
John Potter <jpo...@falcon.lhup.edu> wrote in message
news:<416i00dajr1qrirbb...@4ax.com>...

> On 16 Jan 2004 22:13:32 -0500, Hyman Rosen <hyr...@mail.com> wrote:

> > And so we spin this stupid thread again and again and again... Am I
> > the only one who is sick to death of it?

> It seems that you are not because you keep posting. I seem to have
> misplaced the reference to your proposal to change the rules. Could
> you post the committee document number?

> Those who do not know the rules ask. Those who are sick of the
> discussion ignore it. The rules may be stupid, but they are simple
> enough to understand.

Apparently not, because people keep asking questions about them:

- "undefined behavior" seems about as difficult for a programmer to
understand (or at least accept) as "no" is for a teen-age boy, and

- even experienced programmers sometimes have problems with the fact
that the ordering introduced by sequence points is only partial.

Note that earlier in this thread, you posted that "f( i, i++ )" is
unspecified, but not undefined. This is contrary to what I have always
heard, and how I interpret the standard. So either one of us is really
stupid, or the rule is less simple that you seem to be saying.

> Good practice avoids multiple side effects and bypasses the problem
> anyway.

Can you guarantee that you've never written code which violates good
practices? The critical point here, IMHO, is not so much defining
something useful in clean programming, but ensuring reproduceable
behavior in the case of errors. I really hate it (and I suspect that
I'm not alone) when my code works in all of my tests (compiled in debug
mode, without optimizing), and fails in the production code (compiled
with optimization turned on). Even if the reason it fails is that I
have been sloppy, and violated my own coding guidelines.

Note that the undefined ordering can create additional problems in C++.
Consider the well defined:

x = f( i ++ ) + g( j ++ ) ;

If f throws, what is the value of j?

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

John Potter

unread,
Jan 20, 2004, 6:17:09 AM1/20/04
to
On 19 Jan 2004 12:55:10 -0500, ka...@gabi-soft.fr wrote:

> John Potter <jpo...@falcon.lhup.edu> wrote in message
> news:<416i00dajr1qrirbb...@4ax.com>...

> > Those who do not know the rules ask. Those who are sick of the


> > discussion ignore it. The rules may be stupid, but they are simple
> > enough to understand.

> Apparently not, because people keep asking questions about them:

> - "undefined behavior" seems about as difficult for a programmer to
> understand (or at least accept) as "no" is for a teen-age boy, and

> - even experienced programmers sometimes have problems with the fact
> that the ordering introduced by sequence points is only partial.

> Note that earlier in this thread, you posted that "f( i, i++ )" is
> unspecified, but not undefined. This is contrary to what I have always
> heard, and how I interpret the standard. So either one of us is really
> stupid, or the rule is less simple that you seem to be saying.

Actually, my incorrect statement is good enough for simplicity. If it
is undefined or unspecified and effects the result, don't use it.

> > Good practice avoids multiple side effects and bypasses the problem
> > anyway.

> Can you guarantee that you've never written code which violates good
> practices? The critical point here, IMHO, is not so much defining
> something useful in clean programming, but ensuring reproduceable
> behavior in the case of errors. I really hate it (and I suspect that
> I'm not alone) when my code works in all of my tests (compiled in debug
> mode, without optimizing), and fails in the production code (compiled
> with optimization turned on). Even if the reason it fails is that I
> have been sloppy, and violated my own coding guidelines.

> Note that the undefined ordering can create additional problems in C++.
> Consider the well defined:

> x = f( i ++ ) + g( j ++ ) ;

> If f throws, what is the value of j?

Do I care? If so, I can't write that.

In practice, the only time that I even think about the order rules is
when I am writing what you might call obfuscated code.

p = p->next = new Node(p, v);
p = (q = p)->next;

Are those valid?

John

John Potter

unread,
Jan 20, 2004, 6:19:53 AM1/20/04
to
On 19 Jan 2004 12:52:52 -0500, ka...@gabi-soft.fr wrote:

> John Potter <jpo...@falcon.lhup.edu> wrote in message
> news:<ik3i00p24gbegpo6v...@4ax.com>...

> > On 16 Jan 2004 22:20:30 -0500, "Dhruv" <dhru...@gmx.net> wrote:

> > > so something like this would be illegal foo (i, i++) (Which I guess
> > > is illegal currently).

> > Well formed with unspecified behavior. It will not format the hard
> > drive.

> §5/4: "Except where noted, the order of evaluation of operands of

[...]

> The order in which (i, i++) is evaluated is unspecified; the compiler is
> even allowed to interleave the operations. (Note that the comma here is
> NOT a comma operator, and does not introduce a sequence point.) In one
> of the possible orderings, the prior value of i is read for itself, and
> not to determine the value which will be stored. Please explain why you
> think that the last sentence quoted above does not apply.

I noticed the error right after posting. I was beginning to think that
I might need to correct it myself. Thanks for saving me from that.

> > ++i %= n; // ill-formed C, undefined C++ which works fine everywhere

> Does it? Have you really tried all existing compilers?

Still waiting for someone to give an implementation which does something
other than expected.

> And what does
> "works fine" mean in this case? The language doesn't assign any
> semantics to this statement, and I can think of at least two reasonable
> interpretations.

This is just an example of the general wording problem for expressions
which produce lvalues. The value of ++i is the new value of i and it
is an lvalue. It could also be written as (i += 1) %= n. I see no
reasonable interpretation other than

i = (i + 1) % n;

or

++i;
i %= n;

Which are different if i is volatile. The volatile question needs to be
answered anyway. Last I heard, there is work to reword those sections
which will not require the store and reload, not introduce a sequence
point and make the expression well defined as the first above. Note
that the old value is only accessed to determine the new value and the
stored value is only modified once. The old C rules just do not fit
well with the new C++ lvalue expressions. There is no C compatibility
problem because they are ill-formed C.

John

Old Wolf

unread,
Jan 20, 2004, 6:24:32 AM1/20/04
to
> > so something
> > like this would be illegal foo (i, i++) (Which I guess is illegal
> > currently).
>
> Well formed with unspecified behavior. It will not format the hard
> drive. If it is foo (int i, int j), either i == j or i == j + 1. If
> it is foo (int& i, int j), i == j + 1.

Isn't the "i" in the first parameter, a reading of "i" for a purpose
other than to determine the new value?

I must admit to never having grokked that oddly-worded line in the
Standard (and it took me a while to even parse it correctly). Was
there any debate over the exact wording when it was written?
Is there a way of rephrasing it that makes it obvious why your example
is not UB, but "i + i++" is ?

ka...@gabi-soft.fr

unread,
Jan 20, 2004, 6:30:45 AM1/20/04
to
Hyman Rosen <hyr...@mail.com> wrote in message
news:<HWFOb.4227$9L4....@nwrdny03.gnilink.net>...

> John Potter wrote:
> > It seems that you are not because you keep posting. I seem to have
> > misplaced the reference to your proposal to change the rules. Could
> > you post the committee document number?

> I just want to whine and have someone else do the hard work.

Independantly of who does the work: it's more grunt work than anything
else. It doesn't require much original thinking, but it is a lot of
work. It would be foolish for anyone to invest the effort unless there
was a reasonable chance of it being adopted. The first step is thus to
create a concensus for defining the order more strictly. Only once that
consensus has been achieved should one attact the actual specification.

FWIW: anything that will reduce undefined behavior has my vote.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

Ben Hutchings

unread,
Jan 20, 2004, 9:32:14 AM1/20/04
to
ka...@gabi-soft.fr wrote:
> Ben Hutchings <do-not-s...@bwsint.com> wrote in message
> news:<slrnc0foli.1po....@tin.bwsint.com>...
>> Hyman Rosen wrote:
>> > Jack Klein wrote:
>> >> For compatibility with C
>
>> > Of course not, since giving defined behavior to formerly
>> > undefined code cannot cause incompatibility.
>
>> > and for performance.
>
>> > In older threads on the same subject people have managed to
>> > come up with isolated cases where God can generate better
>> > code by using an order of evaluation other than "left before
>> > right, operands before operation".
>> <snip>
>
>> Nonsense. If the implementation is free to evaluate operands in any
>> order it can use this freedom to minimise the number of registers or
>> stack slots used and thus to reduce memory traffic.
>
> You're forgetting the as if rule. If the arguments don't contain
> function calls themselves, the compiler should be able to reorganize
> things exactly as it does today (at least in the cases which are legal
> today).

Agreed.

> I can see two cases where it *might* make a difference (might, because
> I'm not convinced that there are cases in real code where the difference
> will be measurable) :
>
> - if the arguments contain calls to non-inlined functions, and
>
> - in cases like f( (*p) ++, (*q) ++ ) -- the current rules allow the
> compiler to assume that p and q do not point to the same object, and
> interleaf the accesses, where as strict ordering wouldn't.
>
> In both cases, I suspect that any optimization is more theoretical than
> real. In most, if not all cases where the resulting difference will
> have a measurable impact, the compiler will have enough knowledge to do
> it under the "as if" rule.

Perhaps.

Maybe I should probably stay out and wait for informed opinion from
implementers. So far as I can see we haven't heard from them yet - but
then, they are probably tired of this subject.

>> This is a fairly basic optimisation that was covered in the Compilers
>> and Operating Systems course I took as an undergraduate.
>
> I suspect that a lot of things that were common knowledge when I learned
> about compilers are no longer true:-). Do modern compilers still use
> Sethi-Ulmann numbers to decide which side of the expression to evaluate
> first?

<snip>

I don't remember hearing the term, but evidently those are the numbers
we were calculating and using in a practical in that course. I wouldn't
know whether they are still used.

Francis Glassborow

unread,
Jan 20, 2004, 9:34:04 AM1/20/04
to
In message <843a4f78.04011...@posting.google.com>, Old Wolf
<old...@inspire.net.nz> writes

> > > so something
> > > like this would be illegal foo (i, i++) (Which I guess is illegal
> > > currently).
> >
> > Well formed with unspecified behavior. It will not format the hard
> > drive. If it is foo (int i, int j), either i == j or i == j + 1. If
> > it is foo (int& i, int j), i == j + 1.
>
>Isn't the "i" in the first parameter, a reading of "i" for a purpose
>other than to determine the new value?
>
>I must admit to never having grokked that oddly-worded line in the
>Standard (and it took me a while to even parse it correctly). Was
>there any debate over the exact wording when it was written?
>Is there a way of rephrasing it that makes it obvious why your example
>is not UB, but "i + i++" is ?

You are working from a false premise because the function call example
also has undefined behaviour. The post you quote was incorrect.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit

John Potter

unread,
Jan 20, 2004, 5:31:15 PM1/20/04
to
On 20 Jan 2004 06:24:32 -0500, old...@inspire.net.nz (Old Wolf) wrote:

> > > so something
> > > like this would be illegal foo (i, i++) (Which I guess is illegal
> > > currently).

> > Well formed with unspecified behavior. It will not format the hard
> > drive. If it is foo (int i, int j), either i == j or i == j + 1. If
> > it is foo (int& i, int j), i == j + 1.

> Isn't the "i" in the first parameter, a reading of "i" for a purpose
> other than to determine the new value?

Yep. Thanks for catching the error. It is UB if i is passed by value
and fully specified if i is passed by reference.

John

Dhruv

unread,
Jan 21, 2004, 4:03:12 AM1/21/04
to

Before I go ahead, could you explain the difference between:
Undefined behaviour AND Well formed with unspecified behavior.


Regards,
-Dhruv.

Frank Birbacher

unread,
Jan 21, 2004, 4:11:11 AM1/21/04
to
Hi!

Francis Glassborow wrote:
> Makes no difference, either of the two calls to append can be called
> first, and even if the '+' is user defined that is still the case.

Hmm, I always mix this up. Thank you for correction.

vector<int>::iterator a = v.begin();
a = (a++) +5;

Does this depend on the iterator being a pointer or class? User defined
functions introduce a sequence point, thus 'a++' is completely evaluated
before 5 is added (thus before the assignment takes place). It should
depend. Am I right here?

Frank

Francis Glassborow

unread,
Jan 21, 2004, 10:50:53 AM1/21/04
to
In message <bukbne$j4eav$1...@ID-220067.news.uni-berlin.de>, Frank
Birbacher <bloodym...@gmx.net> writes

>Francis Glassborow wrote:
> > Makes no difference, either of the two calls to append can be called
> > first, and even if the '+' is user defined that is still the case.
>
>Hmm, I always mix this up. Thank you for correction.
>
>vector<int>::iterator a = v.begin();
>a = (a++) +5;
>
>Does this depend on the iterator being a pointer or class? User defined
>functions introduce a sequence point, thus 'a++' is completely evaluated
>before 5 is added (thus before the assignment takes place). It should
>depend. Am I right here?

Yes, where a write operation is protected by a function call from other
evaluations in the same expression it is fine. However do you know that
operator ++(int) (post increment) is actually a user defined overload
for vector? You see some vector implementations use raw pointers as
their iterators.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit

Francis Glassborow

unread,
Jan 21, 2004, 10:56:40 AM1/21/04
to
In message <pan.2004.01.20....@gmx.net>, Dhruv
<dhru...@gmx.net> writes

>Before I go ahead, could you explain the difference between:
>Undefined behaviour AND Well formed with unspecified behavior.

Unspecified behaviour restricts the possibilities to a finite list of
possibilities. That list is either explicit in the Standard or is
implicit in the context. Order of evaluation is an example of the
latter. Whatever order is actually used it must meet with the
requirements of the Standard and so a programmer can write a list of all
the possibilities.

Undefined behaviour concerns cases where the Standard either explicitly
or implicitly make no requirements of code that is syntactically
correct.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit

ka...@gabi-soft.fr

unread,
Jan 21, 2004, 10:57:02 AM1/21/04
to
Ben Hutchings <do-not-s...@bwsint.com> wrote in message
news:<slrnc0q466.1eg....@tin.bwsint.com>...

[...]


> > I can see two cases where it *might* make a difference (might,
> > because I'm not convinced that there are cases in real code where
> > the difference will be measurable) :

> > - if the arguments contain calls to non-inlined functions, and

> > - in cases like f( (*p) ++, (*q) ++ ) -- the current rules allow
> > the compiler to assume that p and q do not point to the same
> > object, and interleaf the accesses, where as strict ordering
> > wouldn't.

> > In both cases, I suspect that any optimization is more theoretical
> > than real. In most, if not all cases where the resulting difference
> > will have a measurable impact, the compiler will have enough
> > knowledge to do it under the "as if" rule.

> Perhaps.

> Maybe I should probably stay out and wait for informed opinion from
> implementers. So far as I can see we haven't heard from them yet -
> but then, they are probably tired of this subject.

I'm not sure if the implementers, alone, have the answer. I'm fairly
sure that I can construct cases where it could make a difference,
especially on register-poor architectures like Intel. I doubt that the
cases I would construct would ever occur in actual code, however.
Considering the three cases I can think of:

- Really complex expressions may need to save a large number of
intermediate results -- and according to the organization of the
expression, the number of intermediate results may depend on the
order of evaluation. This offers a trivial approach to construct
cases where a compiler can improve things. But generally, the "as
if" rule will allow the optimizations anyway, even with specified
ordering. And complex expressions are hard to maintain; the
expressions that I write would rarely need more than three or four
registers, regardless of ordering.

- On all of the machines I work on, the cost of a function call is
high enough that any gains due to reordering are negligible compared
to it. And of course, even when a function call present, the "as
if" rule still allows full reordering of the parts of the expression
which have no side effects, and whose values cannot be modified by
the function (local variables whose addresses have not been taken).

- The one case I'm really not sure of is something like the second
case above. The fact that the expression makes two modifications
allows the compiler to assume that the lvalue expressions do not
refer to the same object -- that no aliasing is involved. Aliasing
poses a real problem for optimizers, and is one of the reasons why C
and C++ tend to not optimize well. On the other hand, I've yet to
see a compiler which really makes use of this information -- that,
for example, seeing an expression like "f( (*p)++, (*q)++ )"
actively notes that p and q do not refer to the same object, and
exploits this information elsewhere in the expression, or even
within a wider scope. (Note that if p and q have different types,
and neither is pointer to a character type, the compiler may already
make the assumption. As far as I know, not very many do.)

It would be interesting to hear from compiler implementers concerning
this last point. (I'm still not convinced that even if it does allow
some improved optimization, it will have an effect in real programs.
How often do you modify several lvalues of the same type in a single
expression?)

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

ka...@gabi-soft.fr

unread,
Jan 21, 2004, 10:59:00 AM1/21/04
to
John Potter <jpo...@falcon.lhup.edu> wrote in message
news:<7m5q00h9t8ic10lse...@news.west.earthlink.net>...

> On 20 Jan 2004 06:24:32 -0500, old...@inspire.net.nz (Old Wolf) wrote:

> > > > so something like this would be illegal foo (i, i++) (Which I
> > > > guess is illegal currently).

> > > Well formed with unspecified behavior. It will not format the
> > > hard drive. If it is foo (int i, int j), either i == j or i == j
> > > + 1. If it is foo (int& i, int j), i == j + 1.

> > Isn't the "i" in the first parameter, a reading of "i" for a purpose
> > other than to determine the new value?

> Yep. Thanks for catching the error. It is UB if i is passed by value
> and fully specified if i is passed by reference.

Which is an interesting reason for preferring pass by const reference to
pass by value:-).

I'd hate to have to maintain a program which depended on such
subtilities.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

ka...@gabi-soft.fr

unread,
Jan 21, 2004, 11:30:31 AM1/21/04
to
John Potter <jpo...@falcon.lhup.edu> wrote in message
news:<ukco005fmu8ib18b7...@4ax.com>...

> On 19 Jan 2004 12:55:10 -0500, ka...@gabi-soft.fr wrote:

> > John Potter <jpo...@falcon.lhup.edu> wrote in message
> > news:<416i00dajr1qrirbb...@4ax.com>...

> > > Those who do not know the rules ask. Those who are sick of the
> > > discussion ignore it. The rules may be stupid, but they are
> > > simple enough to understand.

> > Apparently not, because people keep asking questions about them:

> > - "undefined behavior" seems about as difficult for a programmer to
> > understand (or at least accept) as "no" is for a teen-age boy, and

> > - even experienced programmers sometimes have problems with the fact
> > that the ordering introduced by sequence points is only partial.

> > Note that earlier in this thread, you posted that "f( i, i++ )" is
> > unspecified, but not undefined. This is contrary to what I have
> > always heard, and how I interpret the standard. So either one of
> > us is really stupid, or the rule is less simple that you seem to be
> > saying.

> Actually, my incorrect statement is good enough for simplicity. If it
> is undefined or unspecified and effects the result, don't use it.

You should know by now that when it comes to coding, I'm the sort of guy
who wears a belt and suspenders. First rule for readable code: a
statement does one and only one thing. If it modifies x, it doesn't
modify anything else, nor have any effect on control flow.

The second rule, of course, is to follow the accepted idioms of the
language (principle of least surprise).

In the case of C and C++, the two rules often disagree, and even I have
statements that do modify two objects in a single statement. If the
lvalue expressions specifying the objects are non trivial, having to
prove that they can never specify the same object is an additional
burden on the programmer.

And of course, I've never yet met a programmer who is perfect. Everyone
is at the mercy of an error. So we test. For testing to be valid,
however, we need some sort of reproduceability. With undefined
behavior, there's no point in testing, because as we all know, it will
work in the tests, but not once deployed. (I'm overstating the case, of
course, but consistency and reproduceability are important for testing
to be really effective.)

> > > Good practice avoids multiple side effects and bypasses the
> > > problem anyway.

> > Can you guarantee that you've never written code which violates
> > good practices? The critical point here, IMHO, is not so much
> > defining something useful in clean programming, but ensuring
> > reproduceable behavior in the case of errors. I really hate it
> > (and I suspect that I'm not alone) when my code works in all of my
> > tests (compiled in debug mode, without optimizing), and fails in
> > the production code (compiled with optimization turned on). Even
> > if the reason it fails is that I have been sloppy, and violated my
> > own coding guidelines.

> > Note that the undefined ordering can create additional problems in
> > C++. Consider the well defined:

> > x = f( i ++ ) + g( j ++ ) ;

> > If f throws, what is the value of j?

> Do I care? If so, I can't write that.

> In practice, the only time that I even think about the order rules is
> when I am writing what you might call obfuscated code.

> p = p->next = new Node(p, v);
> p = (q = p)->next;

> Are those valid?

The problem isn't when I thing about the order rules. The problem is
getting reproduceable results even when I forget to think about them.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

Hyman Rosen

unread,
Jan 21, 2004, 2:52:08 PM1/21/04
to
Dhruv wrote:
> Before I go ahead, could you explain the difference between:
> Undefined behaviour AND Well formed with unspecified behavior.

Undefined behavior means that the standard imposes absolutely
no requirements on the behavior of the program, even for things
the program does before the undefined behavior occurs.

Unspecified behavior means that there are a number of possible
outcomes in a given situation, and the program may execute any
of them, but earlier and later behavior are still as specified
by the standard.

For example,
#include <cstdio>
int a() { return std::printf("a"); }
int b() { return std::printf("b"); }
int main() { return !(a() + b()); }
has unspecified behavior. It may print "ab" or "ba", but it must
print one of those two, and the program is legal. Contrariwise,
if in the above, main were written as
int main(int c, char **) { a(); b(); return !(--c + --c); }
the program would have undefined behavior, and the standard requires
nothing of this program. It need not even print anything, even though
the calls to printf happen before the undefined behavior is encountered.

John Potter

unread,
Jan 21, 2004, 2:56:05 PM1/21/04
to
On 21 Jan 2004 04:03:12 -0500, "Dhruv" <dhru...@gmx.net> wrote:

> Before I go ahead, could you explain the difference between:
> Undefined behaviour AND Well formed with unspecified behavior.

I assume that you have noticed the correction to that statement.
Sorry about the bad example.

Well formed means no diagnostic required. Unspecified behavior
means that only a few possibilities exist where undefined means
anything can happen. A better example.

struct S {
int s;
S& operator++ () {
++ s;
return *this;
}
operator int () const {
return s;
}
};
int dist (int a, int b) {
return b - a;
}
int main () {
S s = { 5 };
int i = { 5 };
std::cout << dist(s, ++ s) << "\n"; // unspecified either 1 or 0
std::cout << dist(i, ++ i) << "\n"; // undefined
}

In both cases, it is unspecified whether the first or second argument
is evaluated or first or even interleaved. In the first case, there
is a sequence point introduced by the operator++ function. This
prevents interleaving and avoids the rule about accessing the old
value for a purpose other than to determine the new value. In the
second case we get undefined behavior because of the latter rule.

John

Dhruv

unread,
Jan 23, 2004, 5:28:38 AM1/23/04