Preincrement yields lvalue?

Jack Klein

Feb 15, 1999, 3:00:00 AM

<Jack>

5.3.2 (page 77 of the ANSI PDF file of the standard) states:

1 The operand of prefix ++ is modified by adding 1, or set to true if
it is bool (this use is deprecated). The operand shall be a modifiable
lvalue. The type of the operand shall be an arithmetic type or a
pointer to a completely defined object type. The value is the new
value of the operand; it is an lvalue.

On the other hand, 5.2.6 (page 69) states:

1 The value obtained by applying a postfix ++ is the value that the
operand had before applying the operator. [Note: the value obtained is
a copy of the original value ] The operand shall be a modifiable
lvalue. The type of the operand shall be an arithmetic type or a
pointer to a complete object type. After the result is noted, the
value of the object is modified by adding 1 to it, unless the object
is of type bool, in which case it is set to true. [Note: this use is
deprecated, see annex D. ] The result is an rvalue.

If 5.3.2 is correct, this would have two consequences:

1. It would be a serious incompatibility with C, where the prefix ++
operator does not yield an lvalue.

2. Any use of the resulting lvalue as an lvalue, at least for a
scalar type, could only produce undefined behavior by modifying the
scalar twice without an intervening sequence point.

i.e.

int i = 0;
++i = 5;

Am I misinterpreting or is this a typographical error in the standard?

</Jack>
--
Do not email me with questions about programming.
Post them to the appropriate newsgroup.
Followups to my posts are welcome.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std...@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]

Michael Rubenstein

Feb 15, 1999, 3:00:00 AM

I've posted a response to Jack's comments on comp.lang.c++. I believe
that this is intentional and that it can be useful in code like

int& f(int& i)
{
    return ++i;
}

int& ri = ++i;
int* pi = &++i;

void g(int& x);
g(++i);
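
The uses listed above can be collected into one compilable fragment. This is only a minimal sketch: the names bump, store, and lvalue_uses_demo are hypothetical and do not come from the post. Each lvalue use of ++i sits in its own full expression, so i is never modified twice between sequence points.

```cpp
#include <cassert>

// Returns a reference to its argument after incrementing it.
// Legal only because built-in prefix ++ yields an lvalue in C++.
int& bump(int& i)
{
    return ++i;
}

// Hypothetical helper: writes through a reference parameter.
void store(int& x, int value)
{
    x = value;
}

// Exercises each lvalue use of ++i and returns the final value of i.
int lvalue_uses_demo()
{
    int i = 0;

    int& ri = ++i;       // bind a reference to the result; i is now 1
    assert(&ri == &i);

    int* pi = &++i;      // take the address of the result; i is now 2
    assert(pi == &i);

    store(++i, 10);      // i becomes 3, then the call sets it to 10
    assert(i == 10);

    bump(i);             // i is now 11
    return i;
}
```

Calling lvalue_uses_demo() from main should return 11 under these assumptions.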

I would like to see other opinions.

I also would like to add a question on the assumption that I am
correct. Is the result of prefix ++ on integer types a modifiable
lvalue?

The standard says (3.10)

If an expression can be used to modify the object to which it
refers, the expression is called *modifiable*. A program that
attempts to modify an object through a nonmodifiable lvalue or
rvalue expression is ill-formed.

(asterisks indicate italics in the standard making this the definition
of "modifiable"). As Jack has pointed out, modifying i using ++i
would result in undefined behavior; does this mean (as far as the
standard goes) that ++i cannot be used to modify i, making it a
nonmodifiable lvalue?

Why is this question meaningful? Consider the program

void f()
{
    int i = 0;
    ++i = 1;
}

int main()
{
}

If ++i is "modifiable" even though modifying i using it results in
undefined behavior, then this program is well-formed. Since the code
that would modify i using ++i is never executed, there is no problem.

However, if ++i is not modifiable, then a diagnostic is required.

Let's see now. If i is an int,

1. ++i = 1; results in undefined behavior. Therefore, ++i is
not a modifiable lvalue.

2. Therefore, a diagnostic is required for ++i = 1;

3. Therefore, ++i = 1; does not result in undefined behavior.

--
Michael M Rubenstein

Siemel Naran

Feb 15, 1999, 3:00:00 AM

On 15 Feb 99 07:22:53 GMT, Jack Klein <jack...@att.net> wrote:

>1. It would be a serious incompatibility with C, where the prefix ++
>operator does not yield an lvalue.

Similar remarks hold for the question mark operator. In C,
(cond?t:f) is an rvalue, but in C++, it is an lvalue.
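
A small sketch of the conditional-operator case, assuming both the second and third operands are lvalues of the same type (the function name conditional_lvalue_demo is hypothetical). In C the same effect would need something like *(cond ? &a : &b) = 5.

```cpp
#include <cassert>

// Assigns through the conditional operator. This is valid C++ when the
// second and third operands are lvalues of the same type.
int conditional_lvalue_demo(bool pick_first)
{
    int a = 1;
    int b = 2;
    (pick_first ? a : b) = 5;   // selects a or b, then assigns to it
    return a * 10 + b;          // packs both values into one result
}
```

With pick_first true only a is changed, and with pick_first false only b is changed.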

>2. Any use of the resulting lvalue as an lvalue, at least for a
>scalar type, could only produce undefined behavior by modifying the
>scalar twice without an intervening sequence point.
>
>i.e.
>
>int i = 0;
>++i = 5;

The above code is well defined, I think, and should result in i
equal to 5. It is equivalent to,
operator=(operator++(i),5); // before, 'i' is 0
It is not specified which is evaluated first -- 'operator++(i)'
or '5'. It doesn't matter though, because the two expressions
do not share common variables. Hence, we get
operator=(i,5); // before, 'i' is 1
And finally, the above evaluates to,
i; // before 'i' is 5
A compiler may give a warning that the '++i' has no effect.

This code does have undefined behavior:
v[i]=i++;
The reason is that it is not specified which is evaluated first
-- 'v.operator[](i)' or 'i++'.

--
----------------------------------
Siemel B. Naran (sbn...@uiuc.edu)
----------------------------------

James Kuyper

Feb 15, 1999, 3:00:00 AM

Siemel Naran wrote:
>
> On 15 Feb 99 07:22:53 GMT, Jack Klein <jack...@att.net> wrote:
>
> >1. It would be a serious incompatibility with C, where the prefix ++
> >operator does not yield an lvalue.
>
> Similar remarks hold for the question mark operator. In C,
> (cond?t:f) is an rvalue, but in C++, it is an lvalue.
>
> >2. Any use of the resulting lvalue as an lvalue, at least for a
> >scalar type, could only produce undefined behavior by modifying the
> >scalar twice without an intervening sequence point.
> >
> >i.e.
> >
> >int i = 0;
> >++i = 5;
>
> The above code is well defined, I think, and should result in i
> equal to 5. It is equivalent to,
> operator=(operator++(i),5); // before, 'i' is 0

Not quite. The operator function version has extra sequence points that
don't apply to non-class types. Those sequence points give it defined
behavior. The original code apparently does not (I hope I'm wrong about
that).
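
To illustrate the distinction, here is a minimal sketch with a hypothetical wrapper class Int (not part of the thread). For a class type, ++x = 5 really is a pair of function calls, so the increment is complete before the assignment starts.

```cpp
#include <cassert>

// Hypothetical wrapper class; its operators are ordinary member functions.
struct Int {
    int value;

    explicit Int(int v) : value(v) {}

    // Prefix increment: a function call, so its side effect is complete
    // before the function returns.
    Int& operator++()
    {
        ++value;
        return *this;
    }

    Int& operator=(int v)
    {
        value = v;
        return *this;
    }
};

// For a class type, ++x = 5 means x.operator++().operator=(5).
int class_type_demo()
{
    Int x(0);
    ++x = 5;     // the increment finishes inside operator++, then 5 is assigned
    return x.value;
}
```

The same spelling applied to a plain int has no such function-call sequence points, which is the case under discussion.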

Christian Bau

Feb 16, 1999, 3:00:00 AM

In article <36d0c594...@netnews.worldnet.att.net>, jack...@att.net
(Jack Klein) wrote:

> 5.3.2 (page 77 of the ANSI PDF file of the standard) states:
>
> 1 The operand of prefix ++ is modified by adding 1, or set to true if
> it is bool (this use is deprecated). The operand shall be a modifiable
> lvalue. The type of the operand shall be an arithmetic type or a
> pointer to a completely defined object type. The value is the new
> value of the operand; it is an lvalue.
>
> On the other hand, 5.2.6 (page 69) states:
>
> 1 The value obtained by applying a postfix ++ is the value that the
> operand had before applying the operator. [Note: the value obtained is
> a copy of the original value ] The operand shall be a modifiable
> lvalue. The type of the operand shall be an arithmetic type or a
> pointer to a complete object type. After the result is noted, the
> value of the object is modified by adding 1 to it, unless the object
> is of type bool, in which case it is set to true. [Note: this use is
> deprecated, see annex D. ] The result is an rvalue.
>
> If 5.3.2 is correct, this would have two consequences:
>

> 1. It would be a serious incompatibility with C, where the prefix ++
> operator does not yield an lvalue.
>

> 2. Any use of the resulting lvalue as an lvalue, at least for a
> scalar type, could only produce undefined behavior by modifying the
> scalar twice without an intervening sequence point.

If ++i is an lvalue, you can take its address or use it to initialise a
reference, so you can write these:

int i;
extern void f (int* p);
extern void g (int& p);

f (&++i); /* Would be illegal C, but C programmers
haven't missed this feature */
g (++i); /* C++ programmers would like this to be legal */
g (i++); /* Not legal C++, and it would be difficult to
give this meaningful semantics */

I think C++ programmers would only use this in situations that could not
be made legal C anyway, like in g (++i).

Steve Clamage

Feb 16, 1999, 3:00:00 AM

jack...@att.net (Jack Klein) writes:

><Jack>

>5.3.2 (page 77 of the ANSI PDF file of the standard) states:

>1 The operand of prefix ++ is modified by adding 1, or set to true if
>it is bool (this use is deprecated). The operand shall be a modifiable
>lvalue. The type of the operand shall be an arithmetic type or a
>pointer to a completely defined object type. The value is the new
>value of the operand; it is an lvalue.

> ...

>If 5.3.2 is correct, this would have two consequences:

>1. It would be a serious incompatibility with C, where the prefix ++
>operator does not yield an lvalue.

No, it is upward-compatible with C. Every valid C expression using
pre-increment is also valid in C++ (except for possible C++
restrictions on types). You can write valid C++ code that is
not valid C, but that is true of C++ in general. (Otherwise, what
would be the point of inventing C++?)

>2. Any use of the resulting lvalue as an lvalue, at least for a
>scalar type, could only produce undefined behavior by modifying the
>scalar twice without an intervening sequence point.

I don't understand your objection.

>i.e.

>int i = 0;
>++i = 5;

The last line is not valid in C because of the lvalue rule, and it
is not valid in C++ because of the sequence-point rule.

But there certainly are lvalue uses of ++i with well-defined behavior.
You can pass ++i to a reference parameter of a function, because
there is a sequence point between the evaluation of ++i and the
call of the function.
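
A minimal sketch of that well-defined use, with hypothetical names observe_then_overwrite and call_demo. The callee sees the incremented value and can then modify the object again, because the second modification happens after the sequence point that precedes the function body.

```cpp
#include <cassert>

// Hypothetical callee: by the time the body runs, the increment applied
// in the argument expression has already taken effect.
int observe_then_overwrite(int& r)
{
    int seen = r;   // value produced by ++i in the caller
    r = 5;          // a second modification, after the sequence point
    return seen;
}

// Passes ++i to a reference parameter and returns the final value of i.
int call_demo()
{
    int i = 0;
    int seen = observe_then_overwrite(++i);
    assert(seen == 1);
    return i;
}
```

Under these assumptions call_demo() returns 5, with no undefined behavior involved.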

--
Steve Clamage, stephen...@sun.com

Steve Clamage

Feb 16, 1999, 3:00:00 AM

mik...@ix.netcom.com (Michael Rubenstein) writes:

> ...

>I also would like to add a question on the assumption that I am
>correct. Is the result of prefix ++ on integer types a modifiable
>lvalue?

I would assume so. The lvalue was modifiable before, it is not
made const, and the standard doesn't say it isn't modifiable.

>The standard says (3.10)

> If an expression can be used to modify the object to which it
> refers, the expression is called *modifiable*. A program that
> attempts to modify an object through a nonmodifiable lvalue or
> rvalue expression is ill-formed.

>(asterisks indicate italics in the standard making this the definition
>of "modifiable"). As Jack has pointed out, modifying i using ++i
>would result in undefined behavior; does this mean (as far as the
>standard goes) that ++i cannot be used to modify i, making it a
>nonmodifiable lvalue?

I don't see the point of the question. Consider this:

int k = 0;
k = (k = k + 1) + (k = k + 2); // undefined behavior

Here we have attempted to modify k several times between sequence
points. Do you question whether k is a modifiable lvalue?
If not, why do you question whether ++i is a modifiable lvalue?

The property of being a modifiable lvalue is separate from
the rules about sequence points.

You are not required to modify ++i just because it is modifiable.
You are merely allowed to use it in contexts (such as passing
it to a reference parameter of a function) where a modifiable
lvalue is required, as long as you don't break other language
rules.

Steve Clamage

Feb 16, 1999, 3:00:00 AM

sbn...@localhost.localdomain.COM (Siemel Naran) writes:

>>int i = 0;
>>++i = 5;

>The above code is well defined, I think, and should result in i

>equal to 5. It is equivalent to,
> operator=(operator++(i),5); // before, 'i' is 0

No. Have a look at section 1.9 for a discussion of sequence points.
The only sequence point in the last line occurs at the end of the
expression. Both the pre-increment and the assignment attempt to
modify the value of i, each modification being a side effect. The
code has undefined results.

Although it is likely the result will be either 1, 5, or 6, and
nothing horrible will happen, the compiler is allowed to diagnose
the error and refuse to compile the code.

Your discussion of operator= and operator++ applies only to
interpreting the rules about overloaded operators. Quoting from
13.6, "Built-in operators":

"The candidate operator functions that represent the built-in
operators defined in clause 5 are specified in this subclause.
These candidate functions participate in the operator overload
resolution process as described in 13.3.1.2 and are used for no
other purpose."

Thus, you cannot use function-call semantics to infer sequence
points for built-in operators.

I'm sure someone will question how the result of the undefined
operation could be 1, 5, or 6. Assume that i is in memory,
and the machine can do arithmetic only in registers, but can
assign literal values to memory locations.
So after i is initialized to zero, we could have
A: reg=i; i=5; reg+=1; i=reg; ==> 1
B: reg=i; reg+=1; i=reg; i=5; ==> 5
C: i=5; reg=i; reg+=1; i=reg; ==> 6
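
The three orderings can be checked by writing each one out as ordinary sequential statements. This is only a simulation of the hypothetical machine described above, not a way to observe what a real compiler does with the undefined expression; the function names and the variable reg are illustrative.

```cpp
#include <cassert>

// Each function spells out one ordering as separate statements;
// 'reg' stands in for the machine register.

// Ordering A: the assignment lands between the load and the store.
int ordering_a()
{
    int i = 0;
    int reg = i;   // load i (0)
    i = 5;         // assignment happens first
    reg += 1;
    i = reg;       // the increment's store overwrites the 5
    return i;
}

// Ordering B: the increment completes, then the assignment.
int ordering_b()
{
    int i = 0;
    int reg = i;
    reg += 1;
    i = reg;       // increment completes, i is 1
    i = 5;
    return i;
}

// Ordering C: the assignment completes, then the increment.
int ordering_c()
{
    int i = 0;
    i = 5;
    int reg = i;   // load the already-assigned 5
    reg += 1;
    i = reg;
    return i;
}
```

The three functions return 1, 5, and 6 respectively, matching the table.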

Michael Rubenstein

Feb 16, 1999, 3:00:00 AM

This is what I have been assuming. However, it is questionable whether
this agrees with the definitions of the standard.

First, let's look at the idea that the lvalue was modifiable before
and therefore should still be modifiable. We are dealing with
different lvalue expressions. ++i and i refer to the same object but
they are different expressions and in 3.10 modifiable is defined with
respect to an expression, not to some abstract referent to an object.

I cannot find anything that prohibits passing a nonmodifiable lvalue
expression as a reference argument. Nor can I find anything that
prevents the new lvalue expression -- remember the parameter in the
function is a different expression -- from being modifiable.

k is a modifiable lvalue because it CAN be used to modify an object.
It CAN do this without causing undefined behavior. The fact that it
can also be used in expressions that cause undefined behavior doesn't
change this.

However, the expression ++i CANNOT be used to modify an object
without causing undefined behavior. Does this mean that the
expression ++i CANNOT be used to modify an object? ++i may be
passed as an argument, used to initialize a reference, etc., but then
a different lvalue expression is used to modify the object.

We're agreed that other expressions, including some that are derived
from ++i, can be used to modify i. I can't find anything that says that
this means that ++i must be modifiable.

I should note that until recently my reasoning was the same as Steve's
and I still prefer this view. I recently posted a message on
comp.lang.c++ that made the implicit assumption that ++i is a
modifiable lvalue.

Jack Klein posted a message on comp.lang.c suggesting that it would be
better if the standard said that ++i is not a modifiable lvalue. I
was about to post an answer similar to Steve's, pointing out that it
was modifiable since it could be used to initialize a reference that
is modifiable. Then I checked the definition and realized that
modifiable refers to the expression, not to some abstract concept of
an lvalue, which raises the question of whether ++i is a modifiable
lvalue expression.

--
Michael M Rubenstein

Siemel Naran

Feb 16, 1999, 3:00:00 AM

On 15 Feb 1999 23:21:30 GMT, James Kuyper <kuy...@wizard.net> wrote:
>Siemel Naran wrote:

>> >int i = 0;
>> >++i = 5;
>>
>> The above code is well defined, I think, and should result in i
>> equal to 5. It is equivalent to,
>> operator=(operator++(i),5); // before, 'i' is 0

>Not quite. The operator function version has extra sequence points that
>don't apply to non-class types. Those sequence points give it defined
>behavior. The original code apparently does not (I hope I'm wrong about
>that).

I'm not quite familiar with the term "sequence points". Please tell
me what it means in simple language, and tell me what else the above
snippet -- "int i=0; ++i=5;" -- could result in. I claim that the
above code is well defined and results in a reference to 'i', where
'i' has the value 5. So what else could happen?

Could the result be a reference to 'i' with the value 6? This
happens if the "i=5" is done first, then the "++i". This sounds
wrong to me, though.

--
----------------------------------
Siemel B. Naran (sbn...@uiuc.edu)
----------------------------------

David R Tribble

Feb 16, 1999, 3:00:00 AM

Steve Clamage <cla...@Eng.Sun.COM> wrote:
> But there certainly are lvalue uses of ++i with well-defined behavior.
> You can pass ++i to a reference parameter of a function, because
> there is a sequence point between the evaluation of ++i and the
> call of the function.

Andrew Koenig wrote:
> It is true that some uses of the value are illegitimate.
> Others, however, are just fine:
> int i = 0;
> int& j = ++i; // equivalent to ++i; int& j = i;

Of course, it makes one wonder what's so hard about simply
incrementing 'i' prior to its use by 'j':

int i = 0;
i++;
int & j = i;

I personally believe that allowing '++i' to evaluate to an l-value
is an abomination, but at least it's consistent with the other
abomination of allowing 'i=5' to return an l-value.

-- David R. Tribble, dtri...@technologist.com --

Francis Glassborow

Feb 16, 1999, 3:00:00 AM

In article <slrn7chedt....@fermi.ceg.uiuc.edu>, Siemel Naran
<sbn...@fermi.ceg.uiuc.edu> writes

>I'm not quite familiar with the term "sequence points". Please tell
>me what it means in simple language, and tell me what else the above
>snippet -- "int i=0; ++i=5;" -- could result in. I claim that the
>above code is well defined and results in a reference to 'i', where
>'i' has the value 5. So what else could happen?

OK I will take your statement at face value. Sequence points occur:
at the end of a full expression
at a comma (sequence) operator
after the evaluation of the arguments of a function
at the return from a function
after the '?' in a conditional operator
after the '&&' in a logical and
after the '||' in a logical or

While between sequence points the code must behave as if the elements of
the expression have been evaluated sequentially (no parallel execution)
there is no such constraint on the way in which side-effects are
produced. The only certainties are that no side effects from evaluation
may occur before the immediately preceding sequence point and all side
effects must be complete before the immediately following sequence
point. Between sequence points there is no ordering of side effects.
Hence (and in case of doubt it is explicitly stated somewhere) writing
twice to the same storage between sequence points results in undefined
behaviour (the two writes might overlap)

There are other complexities with sequence points because they can be
nested, IOW a sequence point may apply to a subexpression but not to the
containing full expression.
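
A short sketch of three of the sequence points listed above, using a hypothetical function sequence_point_demo. Each expression modifies i and then reads it again, which is well defined only because a sequence point separates the two.

```cpp
#include <cassert>

// Demonstrates the sequence points of &&, the comma operator, and ?:.
// Returns the final value of i.
int sequence_point_demo()
{
    int i = 0;

    // && : the left operand, including its side effect, is complete
    // before the right operand is evaluated.
    bool both = (i++ == 0) && (i == 1);
    assert(both);

    // Comma operator: ++i finishes before i * 2 is evaluated.
    int doubled = (++i, i * 2);         // i is 2, doubled is 4
    assert(doubled == 4);

    // ?: : the condition is complete before the chosen branch runs.
    int picked = (i++ == 2) ? i : -1;   // i is 3, picked is 3
    assert(picked == 3);

    return i;
}
```

Replacing && with a bitwise & in the first expression would remove the sequence point and make the read of i undefined under the rules described here.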

>
>Could the result be a reference to 'i' with the value 6? This
>happens if the "i=5" is done first, then the "++i". This sounds
>wrong to me, though.

Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

James Kuyper

Feb 17, 1999, 3:00:00 AM

Siemel Naran wrote:
....

> I'm not quite familiar with the term "sequence points". Please tell
> me what it means in simple language, and tell me what else the above

Other people have already given you more technical answers, and you can
get the full answers simply by searching the Standard, so I'll concentrate
on putting those answers in a context.

Sequence points help describe the extent to which the consequences of a
C++ statement are predictable. At a sequence point, all side effects of
previously evaluated expressions are complete, and none of the side
effects of later expressions have started yet. The absence of a sequence
point gives an implementor greater freedom to optimize code by
rearranging the order of side effects. The presence of a sequence point
makes the results of C++ code more predictable for the developer.
However, it doesn't help as much as some people expect - the
implementation still has considerable freedom to reorder
sub-expressions. But whichever order the implementation chooses, it must
keep distinct the side-effects of expressions separated by sequence
points.

Most people who are unfamiliar with the concept tend to think about C
code as if every operation was a sequence point. Sequence points
actually occur far less frequently than that. Also, many people find it
counter-intuitive that the main effect of "i=5" is returning the value
'5', and that the change in 'i's value is "only" a side-effect.

....

> Could the result be a reference to 'i' with the value 6? This
> happens if the "i=5" is done first, then the "++i". This sounds
> wrong to me, though.

It is permitted by the standard, however.

All...@my-dejanews.com

Feb 17, 1999, 3:00:00 AM

In article <slrn7chedt....@fermi.ceg.uiuc.edu>,

sbn...@KILL.uiuc.edu wrote:
>
> On 15 Feb 1999 23:21:30 GMT, James Kuyper <kuy...@wizard.net> wrote:
> >Siemel Naran wrote:
>
> >> >int i = 0;
> >> >++i = 5;
> >>
> >> The above code is well defined, I think, and should result in i
> >> equal to 5. It is equivalent to,
> >> operator=(operator++(i),5); // before, 'i' is 0
>
> >Not quite. The operator function version has extra sequence points that
> >don't apply to non-class types. Those sequence points give it defined
> >behavior. The original code apparently does not (I hope I'm wrong about
> >that).
>

> I'm not quite familiar with the term "sequence points". Please tell
> me what it means in simple language, and tell me what else the above

> snippet -- "int i=0; ++i=5;" -- could result in. I claim that the
> above code is well defined and results in a reference to 'i', where
> 'i' has the value 5. So what else could happen?
>

> Could the result be a reference to 'i' with the value 6? This
> happens if the "i=5" is done first, then the "++i". This sounds
> wrong to me, though.

See http://www.eskimo.com/~scs/C-faq/faq.html -- this is the FAQ
not for comp.std.c++, but for comp.lang.c. Pay particular attention
to section 3 -- all of it except question 3.16 applies equally well
to C++. You should be interested in question 3.8, "What's a
''sequence point''?" The rules for C++ are *slightly* different,
but the concepts still apply.

----
All...@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.


James...@dresdner-bank.com

Feb 17, 1999, 3:00:00 AM

In article <slrn7cgk93....@localhost.localdomain>, sbn...@uiuc.edu wrote:

> >2. Any use of the resulting lvalue as an lvalue, at least for a
> >scalar type, could only produce undefined behavior by modifying the
> >scalar twice without an intervening sequence point.
> >

> >i.e.

> >
> >int i = 0;
> >++i = 5;
>
> The above code is well defined, I think, and should result in i
> equal to 5. It is equivalent to,
> operator=(operator++(i),5); // before, 'i' is 0

Almost. But the difference is significant; the function version has
sequence points which aren't present in the initial expression.

> It is not specified which is evaluated first -- 'operator++(i)'
> or '5'. It doesn't matter though, because the two expressions
> do not share common variables. Hence, we get
> operator=(i,5); // before, 'i' is 1
> And finally, the above evaluates to,
> i; // before 'i' is 5
> A compiler may give a warning that the '++i' has no effect.

According to the standard, any attempt to modify the same value more
than once without an intervening sequence point results in undefined
behavior. For the reasons you give, I doubt that any compiler *will* in
fact give a result other than 5. But the standard is clear: the
expression contains undefined behavior.

> This code does have undefined behavior:
> v[i]=i++;
> The reason is that it is not specified which is evaluated first
> -- 'v.operator[](i)' or 'i++'.

No. The reason the code has undefined behavior is that a variable is
modified, and accessed for reasons other than determining the value to
be assigned, without an intervening sequence point.
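
Splitting the statement so that the read and the modification of i fall in separate full expressions removes the problem. The two helpers below are a hypothetical sketch of the two intents a programmer might have had; the names are not from the thread.

```cpp
#include <cassert>
#include <vector>

// Stores the old value of i at the old index, then increments.
// Returns the incremented i.
int store_at_old_index(std::vector<int>& v, int i)
{
    v[i] = i;
    ++i;
    return i;
}

// Increments first, then stores the old value at the new index.
// Returns the incremented i.
int store_at_new_index(std::vector<int>& v, int i)
{
    int old = i;
    ++i;
    v[i] = old;
    return i;
}
```

Either form states the intended order explicitly, so the result no longer depends on how the implementation schedules side effects.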

--
James Kanze GABI Software, Sàrl
Conseils en informatique orienté objet --
-- Beratung in industrieller Datenverarbeitung
mailto: ka...@gabi-soft.fr mailto: James...@dresdner-bank.com

Siemel Naran

Feb 17, 1999, 3:00:00 AM

On 17 Feb 1999 18:46:31 GMT, James...@dresdner-bank.com
><slrn7cgk93....@localhost.localdomain>, sbn...@uiuc.edu wrote:

>> operator=(operator++(i),5); // before, 'i' is 0

>> operator=(i,5); // before, 'i' is 1

>> i; // before 'i' is 5

>According to the standard, any attempt to modify the same value more

>than once without an intervening sequence point results in undefined
>behavior. For the reasons you give, I doubt that any compiler *will* in
>fact give a result other than 5. But the standard is clear: the
>expression contains undefined behavior.

Fine. Now please explain to me why the rules for builtin types are
different from the rules for user types. After all, I like to think
that there exist classes like this
class int { ... };
class double { ... };
...

If your answer is that the 'undefined rule' gives implementors greater
freedom and thus allows them to generate more optimized code, then my
counter is that they can generate the more optimized code anyway
through the as-if rule. Eg, if it is found that in
j=i++;
the code is more optimized if we do "j=i; ++i", then this is what we,
the optimizer implementors, should do.

In fact, this must be the rule, because 'i' and 'j' may be objects of
a user-defined class Int. (The reason for using a user-defined class
Int is to get some extra safety -- we can prohibit the conversion from
double to int, enforce units checking, etc.) So usage of the user
class Int should be just as efficient as usage of the builtin class int.

>> This code does have undefined behavior:
>> v[i]=i++;
>> The reason is that it is not specified which is evaluated first
>> -- 'v.operator[](i)' or 'i++'.
>
>No. The reason the code has undefined behavior is that a variable is
>modified, and accessed for reasons other than determining the value to
>be assigned, without an intervening sequence point.

Your language is rather technical :). All I'm saying is if i==3,
then the above statement "v[i]=i++" is equivalent to either of these
v[3]=3; // evaluate LHS first
v[4]=3; // evaluate RHS first

--
----------------------------------
Siemel B. Naran (sbn...@uiuc.edu)
----------------------------------

Francis Glassborow

Feb 17, 1999, 3:00:00 AM

In article <slrn7cm7es....@fermi.ceg.uiuc.edu>, Siemel Naran
<sbn...@fermi.ceg.uiuc.edu> writes

>Fine. Now please explain to me why the rules for builtin types are
>different from the rules for user types. After all, I like to think
>that there exist classes like this
> class int { ... };
> class double { ... };

You may like to think in that way but considerable time was spent by one
work group writing the standard in an attempt to define the builtins in
that way without success. This is just one more instance of something
that seems sensible and easy proving to be quite otherwise.

In addition, what you are proposing would require a very radical rewrite
of the compilers, and, worse still, the optimisers (and those are buggy
enough without starting again from scratch)

Oh, and I have just remembered that you cannot model && and || through
function calls.
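
A sketch of why that is so, using a hypothetical class Flag with an overloaded operator&&. The built-in operator skips its right operand when the left one is false, while the overloaded version is an ordinary function call whose arguments are all evaluated.

```cpp
#include <cassert>

// Hypothetical class with an overloaded operator&&.
struct Flag {
    bool on;
    explicit Flag(bool b) : on(b) {}
};

Flag operator&&(const Flag& a, const Flag& b)
{
    return Flag(a.on && b.on);
}

// Counts how many times a right-hand operand gets evaluated.
int g_calls = 0;

bool builtin_rhs()
{
    ++g_calls;
    return true;
}

Flag overloaded_rhs()
{
    ++g_calls;
    return Flag(true);
}

// Built-in &&: returns the number of right-operand evaluations.
int builtin_and_calls()
{
    g_calls = 0;
    bool lhs = false;
    bool r = lhs && builtin_rhs();          // short-circuits: rhs never runs
    (void)r;
    return g_calls;
}

// Overloaded &&: returns the number of right-operand evaluations.
int overloaded_and_calls()
{
    g_calls = 0;
    Flag lhs(false);
    Flag r = lhs && overloaded_rhs();       // plain call: rhs always runs
    (void)r;
    return g_calls;
}
```

The built-in form evaluates the right operand zero times here and the overloaded form evaluates it once.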

Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

Scott Meyers

Feb 18, 1999, 3:00:00 AM

> Fine. Now please explain to me why the rules for builtin types are
> different from the rules for user types. After all, I like to think
> that there exist classes like this
> class int { ... };
> class double { ... };

> ...

This kind of thinking can only get you in trouble. The rules for the
built-in types are fundamentally different from those for user-defined
types.

As a general rule, the built-in types behave like they do in C, and the
reason they do is for C compatibility. Other current threads point out
ways in which C and C++ differ for built-in types, but general
compatibility was the goal. Don't forget the C/C++ compatibility
manifesto: if a program is valid standard C and valid standard C++, it has
the same semantics. I don't think this manifesto was fully achieved,
but I find that it explains a lot about how the built-in types behave in
C++.

For user-defined types, virtually everything behaves like a function call,
so function call semantics rule. It's important to remember this when
looking at a statement like this:

x = y;

Is this an assignment? Are its semantics governed by section 5.17 of the
standard ("Assignment operators")? If x and y are of built-in type, the
answer is yes. If they are of user-defined type, the answer is no, and the
relevant section of the standard is 5.2.2 ("Function call"). (If one is a
built-in and one is a user-defined type, we warp to the rules for
overloading resolution, and life becomes miserable.)

As another example of how built-ins and user-defined types differ, consider
the notions of rvalues and lvalues. I've found that these terms are
meaningful only for built-in types, because all operations on user-defined
types have function call semantics, and it's (always?) valid to invoke
functions on rvalues. Hence, objects of user-defined type always act like
lvalues. Example:

int i, j;

i + j = 10; // error! Can't assign to rvalue

class MyInt {
... // assume a public operator=(int) exists
};

MyInt operator+(const MyInt&, const MyInt&);

MyInt i, j;

i + j = 10; // fine, assign 10 to the sum of i and j

This is why we have to approximate rvalue semantics for user-defined types
by declaring them const, e.g.,

const MyInt operator+(const MyInt&, const MyInt&);

MyInt i, j;

i + j = 10; // error, can't invoke non-const operator= on const object;
// this makes the sum of i and j act like an rvalue
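
For readers who want to compile the non-const case, here is a minimal sketch. The post elides the body of MyInt, so the constructor, get(), and the data member below are assumptions made purely for illustration.

```cpp
#include <cassert>

// Hypothetical minimal MyInt; the members shown here are assumed.
class MyInt {
public:
    explicit MyInt(int v) : value_(v) {}
    MyInt& operator=(int v) { value_ = v; return *this; }
    int get() const { return value_; }
private:
    int value_;
};

// Non-const return type: the temporary sum can be assigned to.
MyInt operator+(const MyInt& a, const MyInt& b)
{
    return MyInt(a.get() + b.get());
}

// "Assigns" to the sum and returns i.get() + j.get() afterwards.
int assign_to_sum_demo()
{
    MyInt i(1), j(2);
    i + j = 10;                 // compiles: operator= runs on the temporary,
                                // which is then discarded
    return i.get() + j.get();   // i and j are untouched
}

// If the declaration is changed to
//     const MyInt operator+(const MyInt&, const MyInt&);
// the line "i + j = 10;" no longer compiles, because operator=(int)
// is a non-const member function.
```

The assignment has no lasting effect, which is why it is usually a bug worth rejecting at compile time.
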
Scott

--
Scott Meyers, Ph.D. sme...@aristeia.com
Software Development Consultant http://www.aristeia.com/
Visit http://meyerscd.awl.com/ to demo the Effective C++ CD

Steve Clamage

Feb 18, 1999, 3:00:00 AM

sbn...@fermi.ceg.uiuc.edu (Siemel Naran) writes:

>On 17 Feb 1999 18:46:31 GMT, James...@dresdner-bank.com
>><slrn7cgk93....@localhost.localdomain>, sbn...@uiuc.edu wrote:

>>> operator=(operator++(i),5); // before, 'i' is 0
>>> operator=(i,5); // before, 'i' is 1
>>> i; // before 'i' is 5

>>According to the standard, any attempt to modify the same value more
>>than once without an intervening sequence point results in undefined
>>behavior. For the reasons you give, I doubt that any compiler *will* in
>>fact give a result other than 5. But the standard is clear: the
>>expression contains undefined behavior.

>Fine. Now please explain to me why the rules for builtin types are

>different from the rules for user types.

There are no sequence-point or side-effect rules for types. The
rules apply to operations. The rules for built-in operations were
inherited from C. Overloaded operators are a syntax veneer on
top of function calls. Thus, the rules for overloaded operators
(that is, on user-defined types) are exactly the rules for
function calls, because they are in fact function calls.

>After all, I like to think
>that there exist classes like this
> class int { ... };
> class double { ... };
> ...

If we were developing a language from scratch, that might be
true. It isn't true for C++, and thinking it is can only lead
to confusion. You've already encountered some confusion by
conflating built-in operators with function calls. The mental
model would also lead you to think that you could derive from
"class int", which you cannot. (At least not in the way C++ is
currently defined.)

>>> This code does have undefined behavior:
>>> v[i]=i++;
>>> The reason is that it is not specified which is evaluated first
>>> -- 'v.operator[](i)' or 'i++'.
>>
>>No. The reason the code has undefined behavior is that a variable is
>>modified, and accessed for reasons other than determining the value to
>>be assigned, without an intervening sequence point.

>Your language is rather technical :).

This is a technical subject, and we are discussing a rather
finely-drawn part of the language specification.

>All I'm saying is if i==3,
>then the above statement "v[i]=i++" is equivalent to either of these
> v[3]=3; // evaluate LHS first
> v[4]=3; // evaluate RHS first

But it isn't only a case of evaluating the LHS and RHS. Two of the
operators have side effects apart from determining the value of each
operand. The language definition leaves unspecified when, between
sequence points, the side effects occur. More than 2 outcomes are
possible, because there are three operators (assign-op, increment,
subscript) and two separate side effects (assignment to i from the
increment, assignment to v[i] from the assign-op).

The precedence rules require the subscript to be evaluated before
the assignment, and the increment to be evaluated before the
assignment. The relative order of subscript and increment is not
specified, and the side effects can take place any time after
their associated operation, in any order.
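
One way to sidestep the whole question, sketched here on the assumption that v is a std::vector<int> (the thread never says what v is): split the statement so that each side effect ends at its own full expression. The helper names are made up for the example.

```cpp
#include <cassert>
#include <vector>

// Meaning 1: store at the old index, then advance the index.
void store_then_advance(std::vector<int>& v, int& i) {
    v[i] = i;   // the store is complete at the end of this statement
    ++i;        // the increment happens strictly afterwards
}

// Meaning 2: advance the index, then store the old value at the new index.
void advance_then_store(std::vector<int>& v, int& i) {
    const int old = i;
    ++i;
    v[i] = old;
}

void demo_split_statement() {
    std::vector<int> v(6, 0);
    int i = 3;
    store_then_advance(v, i);
    assert(v[3] == 3 && i == 4);

    std::vector<int> w(6, 0);
    int j = 3;
    advance_then_store(w, j);
    assert(w[4] == 3 && j == 4);
}
```

Either helper says which of the two readings was intended, and neither leaves anything for the compiler to choose.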

--
Steve Clamage, stephen...@sun.com

Francis Glassborow

Feb 18, 1999, 3:00:00 AM

In article <7afgke$6ts$1...@engnews1.eng.sun.com>, Steve Clamage
<stephen...@sun.com> writes

>The precedence rules require the subscript to be evaluated before
>the assignment, and the increment to be evaluated before the
>assignment. The relative order of subscript and increment is not
>specified, and the side effects can take place any time after
>their associated operation, in any order.

Many people seem to think that there must be a strict (if unspecified)
ordering for side effects (analogous to that for evaluation of sub-
expressions). The reason that side-effects are more insidious (hence
the undefined behaviour) is that AFAIK there is no requirement for
strict sequential application of side-effects. The result is that non-
atomic operations could, technically, be interlaced.

Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
---

James...@dresdner-bank.com

Feb 18, 1999, 3:00:00 AM

In article <slrn7cm7es....@fermi.ceg.uiuc.edu>,

sbn...@KILL.uiuc.edu wrote:
>
> On 17 Feb 1999 18:46:31 GMT, James...@dresdner-bank.com
> ><slrn7cgk93....@localhost.localdomain>, sbn...@uiuc.edu wrote:
>
> >> operator=(operator++(i),5); // before, 'i' is 0
> >> operator=(i,5); // before, 'i' is 1
> >> i; // before 'i' is 5
>
> >According to the standard, any attempt to modify the same value more
> >than once without an intervening sequence point results in undefined
> >behavior. For the reasons you give, I doubt that any compiler *will* in
> >fact give a result other than 5. But the standard is clear: the
> >expression contains undefined behavior.
>
> Fine. Now please explain to me why the rules for builtin types are
> different from the rules for user types.

Because they are, and always have been. The rules for user defined
types are whatever the user chooses to make them, and for various
reasons, it is impossible for him to make them exactly the same as the
rules for the built-in types.

> After all, I like to think
> that there exist classes like this
> class int { ... };
> class double { ... };
> ...

But they don't exist.

> If your answer is that the 'undefined rule' gives implementors greater
> freedom and thus allows them to generate more optimized code, then my
> counter is that they can generate the more optimized code anyway
> through the as-if rule. Eg, if it is found that in
> j=i++;
> the code is more optimized if we do "j=i; ++i", then this is what we,
> the optimizer implementors, should do.

This is the policy of Java. In theory, I agree -- make the language
safe, and let the compiler writers sweat. In practice... I must say
that to date, the optimization in Java compilers is significantly below
what we expect normally from C++ compilers.

> In fact, this must be the rule, because 'i' and 'j' may be types of
> a user defined class Int. (The reason for using a user defined class
> Int is to get some extra safety -- we can prohibit the conversion from
> double to int, enforce units checking, etc.) So usage of the user
> class Int should be just as efficient as usage of the builtin class int.

It's not the rule. User defined classes obey different rules than
built-in types. That's the way the language is defined.

> >> This code does have undefined behavior:
> >> v[i]=i++;
> >> The reason is that it is not specified which is evaluated first
> >> -- 'v.operator[](i)' or 'i++'.
> >
> >No. The reason the code has undefined behavior is that a variable is
> >modified, and accessed for reasons other than determining the value to
> >be assigned, without an intervening sequence point.
>

> Your language is rather technical :). All I'm saying is if i==3,
> then the above statement "v[i]=i++" is equivalent to either of these
> v[3]=3; // evaluate LHS first
> v[4]=3; // evaluate RHS first

What you are saying is what will *probably* be the case. What I was
saying is what the standard says. In particular, according to the
standard, the above statement can be the equivalent to a command to
reformat the hard disk, or anything else. (Especially, it doesn't have
to even correspond to anything that you could write in C++.)

I'm not saying that this is the way it should be -- there are arguments
pro and contra, and globally, this is one place where I personally
prefer the Java approach of defining everything. But this *is* the way
the language is currently defined, and I really doubt that it will
change.

--
James Kanze GABI Software, Sàrl
Conseils en informatique orienté objet --
-- Beratung in industrieller Datenverarbeitung
mailto: ka...@gabi-soft.fr mailto: James...@dresdner-bank.com

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own

Valentin Bonnard

Feb 18, 1999, 3:00:00 AM

Siemel Naran wrote:

> Fine. Now please explain to me why the rules for builtin types are
> different from the rules for user types. After all, I like to think
> that there exist classes like this
> class int { ... };
> class double { ... };

But there are no such classes. Otherwise, expressions such as
int() + double() would be ambiguous, because we have as
candidates:
- operator+ (int, int)
- operator+ (double, double)
- ...
and as conversion sequences:
- int -> double
- double -> int
- ...
and why on earth should class double be preferred over class int?
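
A rough sketch of that ambiguity, with two hypothetical classes MyInt and MyDouble standing in for "class int" and "class double". Each converts implicitly to the other, as the built-in types do, and the mixed addition then has no best candidate.

```cpp
#include <cassert>

struct MyDouble;  // forward declaration so MyInt can name it

// Hypothetical class versions of int and double.
struct MyInt {
    int v;
    MyInt(int x = 0) : v(x) {}
    MyInt(const MyDouble& d);              // MyDouble -> MyInt
};

struct MyDouble {
    double v;
    MyDouble(double x = 0.0) : v(x) {}
    MyDouble(const MyInt& i) : v(i.v) {}   // MyInt -> MyDouble
};

inline MyInt::MyInt(const MyDouble& d) : v(static_cast<int>(d.v)) {}

inline MyInt operator+(const MyInt& a, const MyInt& b) {
    return MyInt(a.v + b.v);
}
inline MyDouble operator+(const MyDouble& a, const MyDouble& b) {
    return MyDouble(a.v + b.v);
}

// MyInt(1) + MyDouble(2.5);   // error: ambiguous. Each operator+ needs
//                             // exactly one user-defined conversion,
//                             // so neither candidate is better.

// With classes, the caller has to pick a side explicitly.
inline MyDouble add_as_double(const MyInt& a, const MyDouble& b) {
    return MyDouble(a) + b;
}

// The built-in types need no such help: the int operand becomes a double.
inline double builtin_sum() { return 1 + 2.5; }

void demo_mixed_addition() {
    assert(add_as_double(MyInt(1), MyDouble(2.5)).v == 3.5);
    assert(builtin_sum() == 3.5);
}
```

The built-in case works only because the language has special rules (the usual arithmetic conversions) that rank int -> double above double -> int; ordinary overload resolution has nothing comparable for user-defined conversions.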

--

Valentin Bonnard

Christopher Eltschka

Feb 18, 1999, 3:00:00 AM

Scott Meyers wrote:

[...]

> As another example of how built-ins and user-defined types differ, consider
> the notions of rvalues and lvalues. I've found that these terms are
> meaningful only for built-in types, because all operations on user-defined
> types have function call semantics, and it's (always?) valid to invoke
> functions on rvalues.

No, it's not always valid. It is valid to
- pass per value or const reference
- invoke member functions

It is not valid to pass per non-const reference.

If you think of all operators on built-in types as global functions,
you get the lvalue-rules right (though not the others).

Your example shows the different behaviour for member functions.
The interesting point is that operator= *must* be implemented
as a member, so while we can prevent operator+= from working on
rvalues by making it global (which, however, may have other unwanted
effects), we cannot do so for operator=.

Example:

class X {};

void foo(X&);

foo(X()); // Error: rvalue bound to non-const reference

X x;
foo(x); // Ok: x is lvalue

This usually shows up in streams; f.ex. we cannot simply do

string s;
MyClass MyObject;
...
istringstream(s) >> MyObject;

since operator>>(istream&, MyClass&) takes a non-const
reference to the stream. However, if the operator>> were an
istream member or istringstream member, it would work.
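
The member versus non-member difference can be seen without any streams. The following is only a sketch, with a hypothetical Acc class and helper names, following the rules as described above.

```cpp
#include <cassert>

// Hypothetical accumulator type, for illustration only.
struct Acc {
    int total;
    Acc() : total(0) {}

    // Member function: may be invoked on a temporary (an rvalue).
    Acc& add(int n) {
        total += n;
        return *this;
    }
};

// Non-member taking a non-const reference: needs an lvalue argument.
inline void add_free(Acc& a, int n) { a.total += n; }

// Non-member taking a const reference: accepts lvalues and rvalues.
inline int read(const Acc& a) { return a.total; }

inline int member_call_on_temporary() {
    return Acc().add(3).add(4).total;   // OK: members work on rvalues
}

inline int const_ref_to_temporary() {
    return read(Acc());                 // OK: const reference binds to an rvalue
}

inline int non_const_ref_needs_lvalue() {
    // add_free(Acc(), 3);              // error: rvalue bound to non-const reference
    Acc a;
    add_free(a, 3);                     // OK: a is an lvalue
    return a.total;
}

void demo_rvalue_rules() {
    assert(member_call_on_temporary() == 7);
    assert(const_ref_to_temporary() == 0);
    assert(non_const_ref_needs_lvalue() == 3);
}
```

The commented-out call is the same situation as the istringstream example: a temporary on the left of a non-member function whose first parameter is a non-const reference.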

[...]

James Kuyper

Feb 18, 1999, 3:00:00 AM

Siemel Naran wrote:
...

> Fine. Now please explain to me why the rules for builtin types are
> different from the rules for user types. After all, I like to think
> that there exist classes like this
> class int { ... };
> class double { ... };

> ...

Backward compatibility with C, and with early implementations of C++ that
produced code to be compiled by C, prevent treating the built-in 'int'
in a manner completely consistent with user-defined classes.

> If your answer is that the 'undefined rule' gives implementors greater
> freedom and thus allows them to generate more optimized code, then my
> counter is that they can generate the more optimized code anyway
> through the as-if rule. Eg, if it is found that in

Some optimizations are achieved by rearranging code to produce the same
effect by different means - those are the optimizations allowed by the
as-if rule. Other optimizations are achieved by rearranging code in ways
that are legal, despite the fact that they produce different effects.
That is the kind of optimization that is permitted by the various rules
that identify unspecified, implementation-defined, or undefined
behavior.

> j=i++;
> the code is more optimized if we do "j=i; ++i", then this is what we,
> the optimizer implementors, should do.
>

> In fact, this must be the rule, because 'i' and 'j' may be types of
> a user defined class Int. (The reason for using a user defined class
> Int is to get some extra safety -- we can prohibit the conversion from
> double to int, enforce units checking, etc.) So usage of the user
> class Int should be just as efficient as usage of the builtin class int.

Because of the differences between built-in classes and user-defined
ones, the efficiency concerns are inherently different. However, if your
'class Int' was defined using mostly inline member functions, it would
probably be just about as fast as 'int', with all the advantages you
list. It just couldn't be used as an exact substitute for 'int'.
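
For what it's worth, a sketch of such a class might look like the following. The class and its members are invented for the example, and it assumes a compiler with "= delete" (C++11), which did not exist when this thread was written; at the time one would have declared the double constructor private and left it undefined.

```cpp
#include <cassert>

// Hypothetical wrapper in the spirit of the 'class Int' discussed above.
class Int {
public:
    explicit Int(int v) : v_(v) {}
    Int(double) = delete;   // refuse construction from double

    int value() const { return v_; }

    Int& operator++() { ++v_; return *this; }                  // prefix
    Int operator++(int) { Int old(*this); ++v_; return old; }  // postfix

    friend Int operator+(Int a, Int b) { return Int(a.v_ + b.v_); }

private:
    int v_;
};

// Int bad(2.5);   // error: use of deleted constructor

inline int post_increment_demo() {
    Int i(3);
    Int j = i++;    // a function call: i is fully updated when it returns
    return j.value() * 10 + i.value();   // j is 3, i is 4
}

void demo_int_wrapper() {
    assert(post_increment_demo() == 34);
    assert((Int(2) + Int(3)).value() == 5);
}
```

All the members are defined in the class body and are therefore implicitly inline, which is what gives an optimizer the chance to reduce them to plain int operations.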

Andrei Alexandrescu

Feb 18, 1999, 3:00:00 AM

Francis Glassborow wrote in message
<3nRoVzAK...@robinton.demon.co.uk>...

>Oh, and I have just remembered that you cannot model && and || through
>function calls.

Neither ",".
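
A small sketch of why && cannot be modelled this way, using a hypothetical Flag class. The built-in operator skips its right operand when the left one is false; an overloaded operator&& is a function call, so both arguments are evaluated before the function body runs.

```cpp
#include <cassert>

// Hypothetical boolean-like class, for illustration only.
struct Flag {
    bool on;
    explicit Flag(bool b) : on(b) {}
};

// Overloaded &&: an ordinary function, so both arguments are always evaluated.
inline Flag operator&&(const Flag& a, const Flag& b) {
    return Flag(a.on && b.on);
}

// Each helper counts how often the right-hand operand is evaluated.
inline bool touch_bool(int& calls) { ++calls; return true; }
inline Flag touch_flag(int& calls) { ++calls; return Flag(true); }

inline int builtin_rhs_evaluations(bool lhs) {
    int calls = 0;
    bool r = lhs && touch_bool(calls);         // built-in &&: short-circuits
    (void)r;
    return calls;
}

inline int overloaded_rhs_evaluations(bool lhs) {
    int calls = 0;
    Flag r = Flag(lhs) && touch_flag(calls);   // function call: no short-circuit
    (void)r;
    return calls;
}

void demo_short_circuit() {
    assert(builtin_rhs_evaluations(false) == 0);
    assert(overloaded_rhs_evaluations(false) == 1);
}
```

The same loss applies to || and to the comma operator: once overloaded, their operands are just function arguments.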

Andrei
---

All...@my-dejanews.com

Feb 18, 1999, 3:00:00 AM

In article <36CA21F6...@wizard.net>,

James Kuyper <kuy...@wizard.net> wrote:
>
> Siemel Naran wrote:

> ....

> > I'm not quite familiar with the term "sequence points". Please tell
> > me what it means in simple language, and tell me what else the above
>

> Other people have already given you more technical answers; and you can
> get the full answers simply by searching the Standard, so I'll concentrate
> on putting those answers in a context.
>
> Sequence points help describe the extent to which the consequences of a
> C++ statement are predictable. At a sequence point, all side effects of
> previously evaluated expressions are complete, and none of the side
> effects of later expressions have started yet. The absence of a sequence
> point gives an implementor greater freedom to optimize code by
> rearranging the order of side effects. The presence of a sequence point
> makes the results of C++ code more predictable for the developer.
> However, it doesn't help as much as some people expect - the
> implementation still has considerable freedom to reorder
> sub-expressions. But whichever order the implementation chooses, it must
> keep distinct the side-effects of expressions separated by sequence
> points.
>
> Most people who are unfamiliar with the concept, tend to think about C
> code as if every operation was a sequence point. Sequence points
> actually occur far less frequently than that. Also, many people find it
> counter-intuitive that the main effect of "i=5" is returning the value
> '5', and that the change in 'i's value is "only" a side-effect.
>
> ....

> > Could the result be a reference to 'i' with the value 6? This
> > happens if the "i=5" is done first, then the "++i". This sounds
> > wrong to me, though.
>

> It is permitted by the standard, however.

Look at
http://www.dejanews.com/[ST_rn=ap]/getdoc.xp?AN=207985697.1
for a detailed explanation of *why* the standard makes
i = ++i;
undefined. The analysis refers to C, but the concepts apply
equally well to C++.

----
All...@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.


Siemel Naran

Feb 19, 1999, 3:00:00 AM

On 18 Feb 1999 21:38:16 GMT, All...@my-dejanews.com

> James Kuyper <kuy...@wizard.net> wrote:
>> Siemel Naran wrote:

>> > Could the result be a reference to 'i' with the value 6? This
>> > happens if the "i=5" is done first, then the "++i". This sounds
>> > wrong to me, though.

>> It is permitted by the standard, however.

>Look at
> http://www.dejanews.com/[ST_rn=ap]/getdoc.xp?AN=207985697.1
>for a detailed explanation of *why* the standard makes
> i = ++i;
>undefined. The analysis refers to C, but the concepts apply
>equally well to C++.

Thanks for the reference. To be perfectly honest, I didn't find a
satisfactory explanation for why the code should be undefined. But
I buy the main reason -- namely that establishing rules to make the
code defined would have been futile, because code like this does
not get written. The reason would be satisfactory if it
explained why code like this does not get written.

Interestingly, this was posted on comp.lang.c++ today. It shows
that there are indeed compilers out there that take advantage of
the undefined behaviour rule to produce different results:

On Thu, 18 Feb 1999 12:13:02 -0700, fysx <fy...@shaw.wave.ca> wrote:

>void use_IntClass() {
> IntClass a = 2;
> IntClass b = a++ + a++;
>
> cout << a << " " << b << endl;
>}
>
>
>void use_int() {
> int a = 2;
> int b = a++ + a++;
>
> cout << a << " " << b << endl;
>}

> * Watcom C++ V11.0 generates:
> * 4 5 [calling use_IntClass]
> * 4 5 [calling use_int]
> *
> * Microsoft Visual C++ 6.0 generates:
> * 4 5
> * 4 4
> *
> * Cygnus Win32 Beta19 g++ generates:
> * 4 5
> * 4 4
> *
> * Borland C++ 4.0 generates:
> * 4 5
> * 4 5
> *
> */

--
----------------------------------
Siemel B. Naran (sbn...@uiuc.edu)
----------------------------------

Christian Bau

Feb 19, 1999, 3:00:00 AM

In article <slrn7cm7es....@fermi.ceg.uiuc.edu>,
sbn...@KILL.uiuc.edu wrote:

> If your answer is that the 'undefined rule' gives implementors greater
> freedom and thus allows them to generate more optimized code, then my
> counter is that they can generate the more optimized code anyway
> through the as-if rule. Eg, if it is found that in

> j=i++;
> the code is more optimized if we do "j=i; ++i", then this is what we,
> the optimizer implementors, should do.

Try to optimise this:

typedef struct { int top; int left; int bottom; int right; } TRect;

long width_height_area (TRect* rect, int* width, int* height)
{
return (*width = rect->right - rect->left)
* (*height = rect->bottom - rect->top);
}

a. Assume the "undefined" rule.
b. Assume Java-style rules, and make sure your code is correct if width ==
height or width == &rect->bottom or height == &rect->right.

Then come back.
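
To make case (b) concrete, here is a sketch of what a fixed left-to-right rule would oblige the generated code to do. The function name is invented; the point is that rect->bottom has to be re-read after the store through width, in case the two alias.

```cpp
#include <cassert>

typedef struct { int top; int left; int bottom; int right; } TRect;

// Same computation with a fixed order and one side effect per statement.
long width_height_area_ordered(TRect* rect, int* width, int* height)
{
    *width = rect->right - rect->left;    // may overwrite rect->bottom if aliased
    const int w = *width;                 // value of the first assignment
    *height = rect->bottom - rect->top;   // must re-read rect->bottom here
    const int h = *height;                // value of the second assignment
    return w * h;
}

void demo_ordered_area() {
    TRect r = {0, 0, 3, 4};
    int w = 0, h = 0;
    assert(width_height_area_ordered(&r, &w, &h) == 12);   // no aliasing

    TRect s = {0, 0, 3, 4};
    int h2 = 0;
    // width aliases s.bottom: storing 4 there changes the height computed next
    assert(width_height_area_ordered(&s, &s.bottom, &h2) == 16);
}
```

Under the "undefined" rule the compiler may load all four fields first and do the two stores whenever convenient; under the ordered rule it may not, unless it can prove that none of the pointers alias.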

Francis Glassborow

Feb 19, 1999, 3:00:00 AM

In article <slrn7cphsc....@fermi.ceg.uiuc.edu>, Siemel Naran
<sbn...@fermi.ceg.uiuc.edu> writes

>Interestingly, this was posted on comp.lang.c++ today. It shows
>that there are indeed compilers out there that take advantage of
>the undefined behaviour rule to produce different results:

You are still confusing undefined behaviour with unspecified behaviour.
The former has the potential for causing real damage while the latter
only results in unexpected results. E.g.

int i = 0;
int fn(){return ++i;}
int gn(){return i=(i?10:0);}
int main(){
int test=0;
test = fn()+gn();
return test;
}

AFAIK contains no undefined behaviour but it does result in unspecified
behaviour. It is allowed to return 1 or 11 but nothing else.
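
Working the two permitted orders through by hand gives the sums 11 (fn first) and 1 (gn first). The sketch below forces each order in turn; the helper names are invented, and the global is replaced by a parameter so that both orders can start from i == 0.

```cpp
#include <cassert>

// Same bodies as fn() and gn() above, with the counter passed in.
inline int fn_step(int& i) { return ++i; }
inline int gn_step(int& i) { return i = (i ? 10 : 0); }

// Order 1: fn runs first. fn returns 1 (i becomes 1), then gn returns 10.
inline int sum_fn_first() {
    int i = 0;
    const int a = fn_step(i);
    const int b = gn_step(i);
    return a + b;
}

// Order 2: gn runs first. gn returns 0 (i stays 0), then fn returns 1.
inline int sum_gn_first() {
    int i = 0;
    const int b = gn_step(i);
    const int a = fn_step(i);
    return a + b;
}

void demo_unspecified_order() {
    assert(sum_fn_first() == 11);
    assert(sum_gn_first() == 1);
}
```

Because each function call runs to completion before the other starts, these two results are the only possibilities, which is what separates unspecified from undefined behaviour here.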

However:

int i=0;
int main(){
int test=0;
test = (++i) + (i=(i?10:0));
return test;
}

exhibits undefined behaviour because there is no rule to prohibit
simultaneous attempts to update i (a side-effect of the evaluations of
the sub-expressions). It may never happen, but the rules inherited from C
were crafted so simultaneous side-effects are permitted though
simultaneous evaluations (as I believe was clarified by a DR) are not.

Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

James...@dresdner-bank.com

Feb 19, 1999, 3:00:00 AM

In article <36CC29D0...@wizard.net>,
James Kuyper <kuy...@wizard.net> wrote:

> Some optimizations are achieved by rearranging code to produce the same
> effect by different means - those are the optimizations allowed by the
> as-if rule. Other optimizations are achieved by rearranging code in ways
> that are legal, despite the fact that they produce different effects.
> That is the kind of optimization that is permitted by the various rules
> that identify unspecified, implementation-defined, or undefined
> behavior.

It's not so much a case of different optimizations having different
effects. Undefined behavior is a way of telling the compiler writer
that this case doesn't occur, so he doesn't have to consider it when
trying to prove the legality of an optimizing transformation. Thus, for
example, when faced with an expression such as « x = (*p)++ », the
compiler doesn't have to consider the case where p points to x, which
could potentially limit some optimization.

--
James Kanze GABI Software, Sàrl
Conseils en informatique orienté objet --
-- Beratung in industrieller Datenverarbeitung
mailto: ka...@gabi-soft.fr mailto: James...@dresdner-bank.com


---

Scott Meyers

Feb 19, 1999, 3:00:00 AM

> This usually shows up in streams; f.ex. we cannot simply do
>
> string s;
> MyClass MyObject;
> ...
> istringstream(s) >> MyObject;
>
> since operator>>(istream&, MyClass&) takes a non-const
> reference to the stream. However, if the operator>> were an
> istream member or istringstream member, it would work.

And, in another difference between built-in and user-defined types,
operator>> (and operator<<) *is* a member function for most built-in types,
the exception being char.

Scott

--
Scott Meyers, Ph.D. sme...@aristeia.com
Software Development Consultant http://www.aristeia.com/
Visit http://meyerscd.awl.com/ to demo the Effective C++ CD

Siemel Naran

Feb 20, 1999, 3:00:00 AM

On 19 Feb 1999 15:57:03 GMT, Francis Glassborow

>You are still confusing undefined behaviour with unspecified behaviour.
>The former has the potential for causing real damage while the latter
>only results in unexpected results. E.g.

Oh, I didn't know there was a distinction between 'undefined' and
'unspecified'. In a way, they're both the same thing.

Unexpected results may be damaging too, maybe even more damaging,
for the simple reason that we're oblivious to the fact that the
results are wrong. IOW, if our program prints "1", we might
think that this is the right result and neglect to pay attention
to it. Then two years later, we're screwed. But if the program
ends with a program crash or a completely bizarre result, then
we'll be forced to investigate the mistake.

--
----------------------------------
Siemel B. Naran (sbn...@uiuc.edu)
----------------------------------

---

Francis Glassborow

Feb 20, 1999, 3:00:00 AM

In article <slrn7crlh6....@fermi.ceg.uiuc.edu>, Siemel Naran
<sbn...@fermi.ceg.uiuc.edu> writes

>On 19 Feb 1999 15:57:03 GMT, Francis Glassborow
>
>>You are still confusing undefined behaviour with unspecified behaviour.
>>The former has the potential for causing real damage while the latter
>>only results in unexpected results. E.g.
>
>Oh, I didn't know there was a distinction between 'undefined' and
>'unspecified'. In a way, they're both the same thing.

Not in my coding shop they aren't.

>
>Unexpected results may be damaging too, maybe even more damaging,
>for the simple reason that we're oblivious to the fact that the
>results are wrong. IOW, if our program prints "1", we might
>think that this is the right result and neglect to pay attention
>to it. Then two years later, we're screwed. But if the program
>ends with a program crash or a completely bizarre result, then
>we'll be forced to investigate the mistake.

It is part of any testing routine to validate results. If the process
of getting those results sets fire to the building (which is an
unlikely, but possible consequence of undefined behaviour) or reformats
your hard drive (which I have seen happen) rewrites your CMOS (certainly
possible) or reprograms your graphics card (which happened to me once)
you would understand the difference between undefined and unspecified.

Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

Steve Clamage

Feb 21, 1999, 3:00:00 AM

sbn...@fermi.ceg.uiuc.edu (Siemel Naran) writes:

>On 19 Feb 1999 15:57:03 GMT, Francis Glassborow

>>You are still confusing undefined behaviour with unspecified behaviour.
>>The former has the potential for causing real damage while the latter
>>only results in unexpected results. E.g.

>Oh, I didn't know there was a distinction between 'undefined' and
>'unspecified'. In a way, they're both the same thing.

In a way, dogs and cats are the same thing, but the differences matter
greatly to dogs and cats, and also to many humans.

See section 1.3 of the standard for definitions. "Undefined" means
that the standard places no requirements on the implementation
regarding the source code. Literally anything might happen at run
time, if the code compiles and links. As a quality of
implementation issue, an implementation can choose to diagnose
code having undefined behavior.

"Unspecified" means that the implementation doesn't have to
document what will happen, and needn't be consistent.
Usually the standard will specify a range of possible behaviors,
if only implicitly.

The standard generally reserves "undefined" for situations for
which it is not feasible to specify a range of behaviors, and
for errors which we don't want to require all implementations to
detect. The One-Definition Rule is a perfect example. Detecting
violations is very difficult, and would impose a significant
burden on implementations (and on compile times). It is not
possible to specify a range of behaviors for programs which
violate the ODR.

The "undefined" label is also sometimes reserved as a hook
for implementations to provide extensions to C++. That is,
an implementation can document predictable behavior for
what the standard leaves undefined.

An example of "unspecified" behavior is the order of evaluation
of subexpressions between sequence points. They must all be
evaluated according to the precedence and associativity rules.
Beyond that, the implementation is free to use any evaluation
order; you can't depend on knowing or being able to find out
the order.
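
As an illustration of "unspecified but bounded", here is a sketch with an invented logging helper. The two operands of + may be evaluated in either order, so the log can come out as {1, 2} or {2, 1}, but nothing outside that range is allowed and the sum is 3 either way.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: records the order in which operands are evaluated.
inline int note(std::vector<int>& log, int id) {
    log.push_back(id);
    return id;
}

// The two calls may run in either order; each runs to completion.
inline int sum_with_log(std::vector<int>& log) {
    return note(log, 1) + note(log, 2);
}

void demo_evaluation_order() {
    std::vector<int> log;
    assert(sum_with_log(log) == 3);
    assert(log.size() == 2);
    // Either order is conforming; a program must not rely on one of them.
    assert((log[0] == 1 && log[1] == 2) || (log[0] == 2 && log[1] == 1));
}
```

A program whose correctness depends on which of the two logs it gets is still well defined, just not portable.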

--
Steve Clamage, stephen...@sun.com
---

J. Kanze

Feb 22, 1999, 3:00:00 AM

sme...@aristeia.com (Scott Meyers) writes:

Not to mention char*. I still wonder why the change. (And according to
what logic.)

Until the final draft, it was a member for all built-in types. And the
difference is not insignificant. Consider the following:

extern void trace( ostream& message ) ;

trace( ostrstream() << "x = " << x ) ;

This was fully legal until the final draft; it is now illegal.
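
The usual workaround is to give the stream a name, so that it is an lvalue and both the member and the non-member operator<< can be applied to it. The sketch below uses std::ostringstream in place of ostrstream, and format_x is an invented helper, not part of any trace facility mentioned above.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds the message in a named stream and returns the text.
inline std::string format_x(int x) {
    std::ostringstream message;   // a named object is an lvalue
    message << "x = " << x;       // non-member operator<< for const char* applies
    return message.str();
}

void demo_named_stream() {
    assert(format_x(42) == "x = 42");
}
```

It costs an extra line at each call site, which is exactly the inconvenience being complained about.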

--
James Kanze +33 (0)1 39 23 84 71 mailto: ka...@gabi-soft.fr
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
-- Beratung in objektorientierter Datenverarbeitung

All...@my-dejanews.com

Feb 22, 1999, 3:00:00 AM

In article <WIBQejAN...@robinton.demon.co.uk>,

Francis Glassborow <fran...@robinton.demon.co.uk> wrote:
> It is part of any testing routine to validate results. If the process
> of getting those results sets fire to the building (which is an
> unlikely, but possible consequence of undefined behaviour) or reformats
> your hard drive (which I have seen happen) rewrites your CMOS (certainly
> possible) or reprograms your graphics card (which happened to me once)
> you would understand the difference between undefined and unspecified.

If a program evokes undefined behavior then it's unlikely but
conceivable that it would reformat your hard drive, rewrite your
CMOS, or reprogram your graphics card. In each case the compiler
vendor could defend the compliance of the compiler itself,
because when a program evokes undefined behavior there are no
more limits on what the program can do.

However, the first example you list above is in a whole new ballpark.
How would you prepare a compiler validation suite to ensure that
errant compilers cannot set the computer ablaze? If there is ANY
combination of events, defined or otherwise, that cause the computer
to set fire to your building, the problem lies neither in the compiler
nor in the source code fed to it.

Captain Kirk may be able to cause a computer to explode simply by
pointing out that it has already made logic errors, but for the rest
of us we're going to have to attach some kind of hardware which is
beyond the scope of the C or C++ languages.

Could we form a committee to design a new language? I would do it
myself, but I'm a bit intimidated by the Q/A angle -- when these
programs blow up, they really BLOW UP. In Dilbert style, the
committee's first task would be to select a name. Early
suggestions include
BLAM -- Bomb Language Arts Model
YABL -- Yet Another Bomb Language
CILL -- Computer Interface for Large 'Lectronics
(doubles for KILL among the nearly-literate)
POW -- Programs that Operate like Windows95
TICK -- Test Interface to Control KABOOMs
DUCK!-- (the exclamation heard second-most-often from the
Q/A department -- the first-most-often exclamation
isn't appropriate here...)
ABCD -- Another Bomb-Control Definition
BOOM -- Bomb Object-Oriented Modeling

----
All...@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.


Francis Glassborow

Feb 22, 1999, 3:00:00 AM

In article <7ash7a$p2q$1...@nnrp1.dejanews.com>, All...@my-dejanews.com
writes

>However, the first example you list above is in a whole new ballpark.
>How would you prepare a compiler validation suite to ensure that
>errant compilers cannot set the computer ablaze? If there is ANY
>combination of events, defined or otherwise, that cause the computer
>to set fire to your building, the problem lies neither in the compiler
>nor in the source code fed to it.

I believe there was once a monitor whose scan rate could be set to zero
under software control. In such a circumstance the monitor caught fire
in a very few seconds. Now, many household appliances are controlled by
embedded controllers - consider the effect of undefined behaviour on the
controller of your central heating boiler:)

Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

James Kuyper

Feb 23, 1999, 3:00:00 AM

All...@my-dejanews.com wrote:
...

> However, the first example you list above is in a whole new ballpark.
> How would you prepare a compiler validation suite to ensure that
> errant compilers cannot set the computer ablaze? If there is ANY

How would you write a validation suite to ensure that a compiler cannot
produce a core dump? You can't prove a negative. If you have a specific
reason for anticipating a core dump, or a fire-setting, you can test the
cases that trigger that worry.
Otherwise, the best you can do is stress-test the implementation, and
note all possible symptoms of a defective implementation (not just core
dumps and flames).

> combination of events, defined or otherwise, that cause the computer
> to set fire to your building, the problem lies neither in the compiler
> nor in the source code fed to it.

Whether or not your computer can set fire to your building depends upon
the peripherals attached to it. Obviously, a robotically controlled
flame thrower is the ideal peripheral for this task :-) A computer with
some means of destroying itself under software control, is actually a
plausible need for certain high-security applications. However, any
device that can be forced under software control into a regime where it
overheats catastrophically will also do.

What some people miss is that defined behaviour also is allowed to
include burning down the building - that's outside the scope of the
Standard. A conforming implementation of C++, running a strictly
conforming piece of C++ code, can do anything else it wants to, in
addition to translating and executing the program in conformance with
the standard.

However, regardless of whatever else a conforming compiler does while
running a conforming C++ program, it must also do whatever it is that
the Standard says must be done. If the implementation sets a fire that
could destroy the computer the program is running on, then all
operations required by the C++ Standard must be completed before the
fire reaches the computer, or the implementation isn't conforming.
Obviously, a fire set by some other process is outside the
responsibility of the implementor :-)

All...@my-dejanews.com

Feb 23, 1999, 3:00:00 AM

In article <36D1F6E5...@wizard.net>,

You claim, in essence, that the program defines a *minimum* set of
acceptable behavior, and the compiler is free to add additional
(potentially random) effects as well.

#include <iostream>
int main() {
std::cout << (111*111) << std::endl;
}

If I run this program I expect the output to be:

12321

But by your logic, the output could be

12345
12345
12345

because first it spits out "123" as required, then (for reasons of
its own) it emits "45\n1", and then it spits out the required '2',
and then adds "345\n" just for style, and then emits the '1',
followed by "2345" just because it looks good.

I'm fairly certain that not only is this nonsensical, but also
forbidden. From section 1.9 p1:

The semantic descriptions in this International Standard define a
parameterized nondeterministic abstract machine. This International
Standard places no requirement on the structure of conforming
implementations. In particular, they need not copy or emulate the
structure of the abstract machine. Rather, conforming implementations
are required to emulate (only) the observable behavior of the
^^^^^^
abstract machine as explained below.

(Emphasis mine). Despite the word "only" being in parentheses for some
strange reason, I believe that it is extremely significant. The compiler
may do anything which can reasonably be labelled an emulation of the
abstract machine, and nothing else. For example, floating point numbers
might be rounded or truncated, since either one is a reasonable
emulation. Flames cannot be made to erupt (barring special hardware
intended for that purpose), since that is NOT a reasonable emulation.

----
All...@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own

All...@my-dejanews.com

Feb 24, 1999, 3:00:00 AM

In article <gT+ei5AP...@robinton.demon.co.uk>,

Francis Glassborow <fran...@robinton.demon.co.uk> wrote:
>
> In article <7ash7a$p2q$1...@nnrp1.dejanews.com>, All...@my-dejanews.com
> writes

> >However, the first example you list above is in a whole new ballpark.
> >How would you prepare a compiler validation suite to ensure that
> >errant compilers cannot set the computer ablaze? If there is ANY

> >combination of events, defined or otherwise, that cause the computer
> >to set fire to your building, the problem lies neither in the compiler
> >nor in the source code fed to it.
>

> I believe there was once a monitor whose scan rate could be set to zero
> under software control. In such a circumstance the monitor caught fire
> in a very few seconds. Now, many household appliances are controlled by
> embedded controllers - consider the effect of undefined behaviour on the
> controller of your central heating boiler:)

Part of my comments, which you snipped, addressed the possibility
that additional hardware exists. Such hardware is beyond the scope
of the C++ language.


---

James Kuyper

Feb 24, 1999, 3:00:00 AM

All...@my-dejanews.com wrote:
>
> In article <36D1F6E5...@wizard.net>,
> James Kuyper <kuy...@wizard.net> wrote:
> >
> > All...@my-dejanews.com wrote:
> > ...

...

> > > combination of events, defined or otherwise, that cause the computer
> > > to set fire to your building, the problem lies neither in the compiler
> > > nor in the source code fed to it.
> >

> > Whether or not your computer can set fire to your building depends upon
> > the peripherals attached to it. Obviously, a robotically controlled
> > flame thrower is the ideal peripheral for this task :-) A computer with
> > some means of destroying itself under software control, is actually a
> > plausible need for certain high-security applications. However, any
> > device that can be forced under software control into a regime where it
> > overheats catastrophically will also do.
> >
> > What some people miss is that defined behaviour also is allowed to
> > include burning down the building - that's outside the scope of the
> > Standard. A conforming implementation of C++, running a strictly
> > conforming piece of C++ code, can do anything else it wants to, in
> > addition to translating and executing the program in conformance with
> > the standard.
> >
> > However, regardless of whatever else a conforming compiler does while
> > running a conforming C++ program, it must also do whatever it is that
> > the Standard says must be done. If the implementation sets a fire that
> > could destroy the computer the program is running on, then all
> > operations required by the C++ Standard must be completed before the
> > fire reaches the computer, or the implementation isn't conforming.
> > Obviously, a fire set by some other process is outside the
> > responsibility of the implementor :-)
>
> You claim, in essence, that the program defines a *minimum* set of
> acceptable behavior, and the compiler is free to add additional
> (potentially random) effects as well.

No - not if the behaviour falls into the category described as
"observable" by the standard.

> #include <iostream>
> int main() {
> std::cout << (111*111) << std::endl;
> }
>
> If I run this program I expect the output to be:
>
> 12321
>

For example, the following behavior is output sent to stdout, which is
required to match the abstract machine's output. Hence, the following
behavior is prohibited.

> But by your logic, the output could be
>
> 12345
> 12345
> 12345
>
> because first it spits out "123" as required, then (for reasons of
> its own) it emits "45\n1", and then it spits out the required '2',
> and then adds "345\n" just for style, and then emits the '1',
> followed by "2345" just because it looks good.
>
> I'm fairly certain that not only is this nonsensical, but also
> forbidden. From section 1.9 p1:
>
> The semantic descriptions in this International Standard define a
> parameterized nondeterministic abstract machine. This International
> Standard places no requirement on the structure of conforming
> implementations. In particular, they need not copy or emulate the
> structure of the abstract machine. Rather, conforming implementations
> are required to emulate (only) the observable behavior of the
>                         ^^^^^^
> abstract machine as explained below.
>
> (Emphasis mine). Despite the word "only" being in parentheses for some
> strange reason, I believe that it is extremely significant. The compiler

That "only" modifies the "required"; implementations are only required
to emulate the observable behavior of the abstract machine. The other
behaviors need not be emulated. This certainly refers to the details
about whether or not a given number is loaded into memory, or retained
in a register from a previous retrieval. All of those effects are, in
principle, observable in a scientific sense, but not in the sense used
by the standard. I contend that it also applies to all other methods
whereby a computer affects its environment, except for the specific I/O
channels described by the standard.

> may do anything which can reasonably be labelled an emulation of the
> abstract machine, and nothing else. For example, floating point numbers
> might be rounded or truncated, since either one is a reasonable
> emulation. Flames cannot be made to erupt (barring special hardware
> intended for that purpose), since that is NOT a reasonable emulation.

The standard goes on to define what is meant by observable behavior.
Anything that falls into that category must occur only exactly as
required by the standard. Things not described there are beyond the
scope of the standard. The standard, for instance, does not specify how
much electromagnetic radiation may be emitted by the computer as a
consequence of running the program. However, it's impossible to run a
program on an electronic computer without producing a change in the EM
radiation it produces. In principle, there's no reason why the intensity
of the radiation might not reach lethal levels - such issues fall
outside the scope of the standard (OSHA might be interested, however
:-).

All...@my-dejanews.com

Feb 24, 1999, 3:00:00 AM

In article <36D33E63...@wizard.net>,

James Kuyper <kuy...@wizard.net> wrote:
> All...@my-dejanews.com wrote:
> > > > combination of events, defined or otherwise, that cause the computer
> > > > to set fire to your building, the problem lies neither in the compiler
> > > > nor in the source code fed to it.
> > >
> > > Whether or not your computer can set fire to your building depends upon
> > > the peripherals attached to it. Obviously, a robotically controlled
> > > flame thrower is the ideal peripheral for this task :-)

...

> > > What some people miss is that defined behaviour also is allowed to
> > > include burning down the building - that's outside the scope of the
> > > Standard.

...

> > > However, regardless of whatever else a conforming compiler does while
> > > running a conforming C++ program, it must also do whatever it is that
> > > the Standard says must be done.

...

> > You claim, in essence, that the program defines a *minimum* set of
> > acceptable behavior, and the compiler is free to add additional
> > (potentially random) effects as well.
>
> No - not if the behaviour falls into the category described as
> "observable" by the standard.

So it's okay if nobody sees the flames?

> > The semantic descriptions in this International Standard define a
> > parameterized nondeterministic abstract machine. This International
> > Standard places no requirement on the structure of conforming
> > implementations. In particular, they need not copy or emulate the
> > structure of the abstract machine. Rather, conforming implementations
> > are required to emulate (only) the observable behavior of the
> >                         ^^^^^^
> > abstract machine as explained below.
> >
> > (Emphasis mine). Despite the word "only" being in parentheses for some
> > strange reason, I believe that it is extremely significant. The compiler
>
> That "only" modifies the "required"; implementations are only required
> to emulate the observable behavior of the abstract machine.

And nothing else. I think we agree on this point.

> The other
> behaviors need not be emulated. This certainly refers to the details
> about whether or not a given number is loaded into memory, or retained
> in a register from a previous retrieval. All of those effects are, in
> principle, observable in a scientific sense, but not in the sense used
> by the standard. I contend that it also applies to all other methods
> whereby a computer affects its environment, except for the specific I/O
> channels described by the standard.

...

> The standard goes on to define what is meant by observable behavior.
> Anything that falls into that category must occur only exactly as
> required by the standard. Things not described there are beyond the
> scope of the standard. The standard, for instance, does not specify how
> much electromagnetic radiation may be emitted by the computer as a
> consequence of running the program. However, it's impossible to run a
> program on an electronic computer without producing a change in the EM
> radiation it produces. In principle, there's no reason why the intensity
> of the radiation might not reach lethal levels - such issues fall
> outside the scope of the standard (OSHA might be interested, however

The standard continues: "At program termination, all data written
into files shall be identical to one of the possible results that
execution of the program according to the abstract semantics would
have produced." By files, I believe we include all classifications
of I/O, including the robotically controlled flame thrower. The
compiler must not activate it unless the source code calls for it.


Andrew J Robb

Feb 25, 1999, 3:00:00 AM

Jack Klein wrote:

> <Jack>
>
> 5.3.2 (page 77 of the ANSI PDF file of the standard) states:
>
> 1 The operand of prefix ++ is modified by adding 1, or set to true if
> it is bool (this use is deprecated). The operand shall be a modifiable
> lvalue. The type of the operand shall be an arithmetic type or a
> pointer to a completely defined object type. The value is the new
> value of the operand; it is an lvalue.

++i = 5;

Is the problem that a compiler is free to implement this as either of the
following?

(++i) = 5; // i == 5
++(i = 5); // i == 6

This must be true for classes too - either:

i.operator++().operator=(5);
i.operator=(5).operator++();
---

James Kuyper

Feb 25, 1999, 3:00:00 AM

All...@my-dejanews.com wrote:
> In article <36D33E63...@wizard.net>,

> James Kuyper <kuy...@wizard.net> wrote:
> > All...@my-dejanews.com wrote:
...

> > > You claim, in essence, that the program defines a *minimum* set of
> > > acceptable behavior, and the compiler is free to add additional
> > > (potentially random) effects as well.
> >
> > No - not if the behaviour falls into the category described as
> > "observable" by the standard.
>

> So it's okay if nobody sees the flames?

No. Flames, although entirely observable in a physical sense, don't
constitute observable behavior in the sense of the standard (unless
stdout or some other file is mapped into flame forms :)

...

> > That "only" modifies the "required"; implementations are only required
> > to emulate the observable behavior of the abstract machine.
>

> And nothing else. I think we agree on this point.

I think not - I believe that you're implying that implementations are
prohibited from doing anything else. That's not what it says. The
standard doesn't state that the abstract machine occupies space,
consumes electricity, makes noise, exerts a gravitational pull, etc.
That doesn't mean an implementation is prohibited from doing those
things. Setting fires is simply an additional (admittedly more active)
kind of behaviour that is also outside the scope of the standard.

...

> The standard continues: "At program termination, all data written
> into files shall be identical to one of the possible results that
> execution of the program according to the abstract semantics would
> have produced." By files, I believe we include all classifications
> of I/O, including the robotically controlled flame thrower. The
> compiler must not activate it unless the source code calls for it.

No, files are specifically the things that stdio functions read from
and write to. Nothing prohibits the implementation from producing side
effects outside the range of the behaviors described by the standard;
any attempt to prohibit them would be ridiculous.

James Kuyper

Feb 25, 1999, 3:00:00 AM

Andrew J Robb wrote:
...

> ++i = 5;
>
> Is the problem that a compiler is free to implement this as either of the
> following?
>
> (++i) = 5; // i == 5
> ++(i = 5); // i == 6
>
> This must be true for classes too - either:
>
> i.operator++().operator=(5);
> i.operator=(5).operator++();

The rules for built-in types are different from the rules for
user-defined types. If 'i' is of a user-defined type, with both of those
operators defined, then function-call semantics are involved, and there
are sequence points separating the two writes to 'i'. In that case,
'++i=5' is legal, and only the first expansion you've listed is
permitted.
However, if 'i' is of a built-in type, then "++i = 5" involves two
writes to 'i' without an intervening sequence point. That means that the
two options you've listed are the two most likely types of undefined
behavior allowed by that expression.

Stanley Friesen [Contractor]

Feb 25, 1999, 3:00:00 AM

In article <7b1viu$ens$1...@nnrp1.dejanews.com>, <All...@my-dejanews.com> wrote:
>
>
>The standard continues: "At program termination, all data written
>into files shall be identical to one of the possible results that
>execution of the program according to the abstract semantics would
>have produced." By files, I believe we include all classifications
>of I/O, including the robotically controlled flame thrower. The
>compiler must not activate it unless the source code calls for it.
>

I don't think you need to do much interpretation. A few sentences
distant from this is one that defines calls to the (implementation-
defined) I/O routines as part of "observable behavior". Ergo,
the activity of any I/O channel that is "attached" to any of the
set of I/O routines is part of observable behavior.

Indeed, boiled down, the definition of "observable behavior" is
essentially "any input/output behavior" (since "volatile" will often
be used to specify I/O registers).

James Kuyper

Feb 26, 1999, 3:00:00 AM

James Kuyper wrote:
>
> All...@my-dejanews.com wrote:

> > In article <36D33E63...@wizard.net>,

> > James Kuyper <kuy...@wizard.net> wrote:
> > > All...@my-dejanews.com wrote:
> ...

> > > > You claim, in essence, that the program defines a *minimum* set of
> > > > acceptable behavior, and the compiler is free to add additional
> > > > (potentially random) effects as well.
> > >
> > > No - not if the behaviour falls into the category described as
> > > "observable" by the standard.
> >

> > So it's okay if nobody sees the flames?
>
> No. Flames, although entirely observable in a physical sense, don't
> constitute observable behavior in the sense of the standard (unless
> stdout or some other file is mapped into flame forms :)

For the first time since we started this exchange, I've finally had a
chance to check the exact wording the standard uses to describe
observable behavior. Section 1.9, p6 says "The observable behavior of
the abstract machine is its sequence of reads and writes to volatile
data and calls to library I/O functions." There's nothing in there about
such purely physical effects as setting a fire. Such effects cannot, in
themselves, have any effect on the conformance of an implementation.