sequence points and , operator

David Leimbach

Oct 21, 2001, 8:46:05 PM

I am trying to understand exactly how the rb_tree find algorithm works and
I stumbled across this bit of code in SGI's STL implementation, in
stl_tree.h

__y = __x, __x = _S_left(__x);

I assume this is basically the same as:
__y = (__x = _S_left(__x));

Is the reason for the , operator being invoked above to handle an issue
with sequence points? I still don't totally understand the whole sequence
point issue with C++.

I think
x = x++;
violates a sequence point requirement in the standard but I don't totally
understand all the implications of the rule.

I suppose I should purchase the standard and look it up eh?
Now where is that darn URL?


David Leimbach

Oct 22, 2001, 2:54:12 AM

David Leimbach wrote:

> I am trying to understand exactly how the rb_tree find algorithm works and
> I stumbled across this bit of code in SGI's STL implementation for
> stl_tree.h
>
> __y = __x, __x = _S_left(__x);
>
> I assume this is basically the same as:
> __y = (__x = _S_left(__x));

I wasn't awake this morning... That's not what I should think the ,
operator does.

I should amend that... :P
__y = __x; __x = _S_left(__x);

Jack Klein

Oct 22, 2001, 2:54:50 AM

On 21 Oct 2001 20:46:05 -0400, David Leimbach <leim...@bellsouth.net>
wrote in comp.lang.c++.moderated:

> I am trying to understand exactly how the rb_tree find algorithm works and
> I stumbled across this bit of code in SGI's STL implementation for
> stl_tree.h
>
> __y = __x, __x = _S_left(__x);
>
> I assume this is basically the same as:
> __y = (__x = _S_left(__x));

No, this is the equivalent of this:

__y = __x; __x = _S_left(__x);

In this particular case the statement could be broken into two totally
separate statements without any change in meaning, but that is not
always so.
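
A minimal sketch of the difference between the two forms. The `Node`
type and the `left()` helper are hypothetical stand-ins for the tree
node and `_S_left`; they are not taken from stl_tree.h.

```cpp
// Hypothetical stand-ins for the tree node and _S_left(); not SGI's code.
struct Node {
    Node* left_child;
};

Node* left(Node* n) { return n->left_child; }

// Comma form: y receives the OLD x, then x steps down to its left child.
Node* step_with_comma(Node*& x) {
    Node* y = nullptr;
    y = x, x = left(x);
    return y;
}

// Nested-assignment form: x steps down first, then y receives the NEW x.
Node* step_with_nested_assignment(Node*& x) {
    Node* y = nullptr;
    y = (x = left(x));
    return y;
}
```

Starting from the same node, the first function returns that node and
the second returns its left child, so the two forms are not
interchangeable.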

There is actually no good reason at all from a C++ standpoint for the
code to be written the way it is, rather than the way I rewrote it.
There might possibly be an optimization issue, where some compiler
generates slightly better code for the version with the comma
operator, but I doubt it.

> Is the reason for the , operator being invoked above to handle an issue
> with sequence points? I still don't totally understand the whole sequence
> point issue with C++.

It's perfectly simple. We very much want our compilers to generate
the fastest, smallest, most efficient executables possible from our
source code, consistent with making the executable correct. In some
cases that involves the compiler rearranging the execution sequence of
the code from the way we write it in the source, and performing other
optimizations.

The concept of sequence points is one of the ways the standard has of
telling us what we can and cannot count on the compiler to do, and
what we in turn have to do to write correct code.

There are points in a program where we can stop and take a snapshot
and find that the real state of the real program exactly matches the
state of the abstract machine described by the language. These are
called sequence points.

At every sequence point, all side effects of code already executed
have taken place, and side effects have not yet taken place for code
that has not yet executed.

Consider the code snippet:

int x;
x = 3;
cout << x << endl;
x = 4;

There is no doubt in our minds that the output will be the value 3.
The semicolon at the end of the statement "x = 3;", and indeed the
semicolon at the end of every statement, is a sequence point. Before
any code after that semicolon is executed the side effect of storing
the value 3 into the int object x will be fully completed. And we
know that the output will not be 4 because the output is finished by
the time execution reaches the semicolon at the end of the "cout << x
<< endl;" statement. The side effect of storing the value 4 into x in
a later statement has not yet happened.

Of course, because code can have loops, gotos, and function calls,
statements do not always execute in the top to bottom order of the
source code file; the rule applies to statements in the order they
are actually executed when the program runs.

In addition to the sequence point at the end of every statement, C and
C++ define several other sequence points:

1. There is a sequence point just before a call to a function, after
all of the function's arguments have been evaluated.

2. There is a sequence point when a function returns, before any
further code in the calling function is executed.

3. The logical operators || and && generate a sequence point. Their
left hand expression is evaluated and any side effects of the
evaluation are completed. The right hand expression is then evaluated
only if the left hand result has not already determined the value of
the whole expression.

4. The comma operator generates a sequence point (but the commas that
separate arguments to a function are just punctuation, they are not
comma operators).

5. The ternary operator generates a sequence point. In an expression
like:

x = y < 0 ? -y : y;

...a simple statement that sets x to the absolute value of y, the
conditional expression is evaluated and any side effects it has are
completed. Then one of the two other expressions is evaluated.
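
The ordering guarantees for && and ?: in the list above can be
observed with a small sketch. The `trace` vector and the `note()`
helper are names invented for this illustration; they only record
which operand ran, and in what order.

```cpp
#include <vector>

// Records the order in which the note() calls below actually run.
std::vector<int> trace;

int note(int id, int value) {
    trace.push_back(id);
    return value;
}

// && : the left operand is fully evaluated first; the right operand
// runs only when the left one was true.
bool both_positive(int a, int b) {
    return note(1, a) > 0 && note(2, b) > 0;
}

// ?: : the condition is evaluated first, then exactly one branch.
int absolute(int y) {
    return note(1, y) < 0 ? note(2, -y) : note(3, y);
}

// By contrast, in a call such as f(note(1, a), note(2, b)) the comma is
// only a separator, and the order in which the two arguments are
// evaluated is not specified.
```

Calling `both_positive(-1, 5)` leaves only id 1 in `trace`, because the
right operand is never evaluated.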

> I think
> x = x++;
> violates a sequence point requirement in the standard but I don't totally
> understand all the implications of the rule.

Why do we even need the concept of sequence points? So compiler
writers can put the best possible code generators in their compilers.
Sequence points tell us when everything is absolutely in agreement
with what we wrote in the source code. In between sequence points
there is no guarantee when side effects are performed.

Considering your example statement "x = x++;", you are modifying an
object twice without a sequence point in between, and that produces
undefined behavior. Undefined behavior means anything could happen as
far as the C or C++ standard is concerned. It's just like dividing by
0 in a math problem. But leaving that aside for a moment, let's look
at what could happen based on what this code is actually telling the
compiler to do:

1. The lvalue on the left hand side of the equal sign (the
destination) will receive a new value by the sequence point at the end
of the statement. But this can't happen until that new value is
determined from the expression on the right.

2. The expression on the right is a post increment operator.
Assuming this is a built-in type like int, this expression requires
that the program first get the current value of x, which will be the
value yielded by the expression, and also add one to that value and
store the new value back to x before the sequence point at the end of
the statement.

Note that the statement requires two stores to x sometime before the
next sequence point, and there is nothing in the language that states
which is performed first.

Let's assume that x is an int and had the value of 0 before this
statement. So consider a compiler that generates machine language to
do the following:

1. Read the current value of x into a register (value of x++); the
register holds 0
2. Store that value into x (assign to destination); x remains 0
3. Increment the register (the ++ part of x++); the register holds 1
4. Store the register value back into x (side effect of x++); x
becomes 1

x would wind up with the value of 1.

But let's assume another possible machine language sequence:

1. Read the current value of x into a register (value of x++); the
register holds 0
2. Use an instruction that directly increments the memory location
where x is stored (side effect of x++); x becomes 1
3. Store the register value into x (assign to destination); x becomes
0 again

x would wind up with the value of 0.

Which is correct? Either, both, or any other result. If you break
the rules the standard can't help you.
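
One way to see why the outcome is ambiguous is to write the two
stores out by hand, as separate well-defined statements, in each
possible order. This is only a model of what a compiler might emit,
and the function names are invented for the illustration.

```cpp
// Both functions model "x = x++" with its two stores written out as
// separate statements. Neither is what a compiler must do; they are
// two of the orders it could pick.

// Order A: the assignment's store happens first, the increment's last.
int assignment_store_first(int x) {
    int reg = x;   // value of x++ (the old x)
    x = reg;       // store for "x = ..."
    x = reg + 1;   // store for the ++ side effect
    return x;
}

// Order B: the increment's store happens first, the assignment's last.
int increment_store_first(int x) {
    int reg = x;   // value of x++ (the old x)
    x = reg + 1;   // store for the ++ side effect
    x = reg;       // store for "x = ..."
    return x;
}
```

With an initial value of 0 these return 1 and 0 respectively. Writing
`++x;` or `x = x + 1;` in the source states which result is meant and
avoids the question entirely.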

> I suppose I should purchase the standard and look it up eh?
> Now where is that darn URL?

www.ansi.org click on the "webstore" link and search for "14882".

But now we come to the place where C++ muddies the waters. There are
differences between C++ and C. In C the issue of sequence points is
straightforward, but in C++ it is not.

C++ defines some operators that waddle like sequence points, fly like
sequence points, and quack like sequence points, but are not defined
as ducks... er, sequence points.

Specifically the preincrement and predecrement operators, and all of
the assignment operators, when applied to scalar types return the
modified object (an lvalue) in C++. And the language automagically
requires that these operators store the new modified value into the
object before yielding that lvalue, even before the next sequence
point.

So there can be a point during the execution of a statement where the
storing of a value into an object must have already taken place even
though there has not been a sequence point yet.
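
A small sketch of the lvalue property being described, assuming a
plain int. It shows only that assignment and preincrement yield the
object itself, already holding the new value; it does not exercise the
mid-expression case that the wording question is about. The function
name is invented for the illustration.

```cpp
// Returns true when built-in assignment and pre-increment on an int
// both yield the object itself (an lvalue) holding the new value.
bool assignment_yields_updated_object() {
    int x = 0;

    int& after_assign = (x = 5);   // refers to x, not to a temporary
    bool assign_ok = (&after_assign == &x) && (after_assign == 5);

    int& after_increment = ++x;    // likewise refers to x
    bool increment_ok =
        (&after_increment == &x) && (after_increment == 6);

    return assign_ok && increment_ok;
}
```

The same declarations would not compile in C, where these operators
yield a plain value rather than an lvalue.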

Hopefully this was just an oversight in adopting some of the wording
from the C to the C++ standard, and will be corrected in some future
version of the C++ standard.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq

Erik Max Francis

Oct 22, 2001, 2:55:47 AM

David Leimbach wrote:

> __y = __x, __x = _S_left(__x);
>
> I assume this is basically the same as:
> __y = (__x = _S_left(__x));

No, it is equivalent to

__y = __x;
__x = _S_left(__x);

You may note that they are quite different.

The comma operator evaluates its first operand, discards the result,
and yields the value of its second operand; when a comma expression is
used as a standalone statement, that value is simply discarded as well.
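
A short sketch of both cases, with function names made up for the
illustration:

```cpp
// The value of a comma expression is the value of its right operand.
int comma_result() {
    int a = 0;
    int b = (a = 1, a + 10);   // a becomes 1 first, then b gets 11
    return b;
}

// Used as a statement, the value is simply thrown away.
int comma_statement() {
    int a = 0, b = 0;          // this comma separates declarators
    a = 3, b = a + 1;          // this one is the comma operator
    return a * 10 + b;
}
```

Because assignment binds more tightly than the comma operator,
`a = 3, b = a + 1;` groups as `(a = 3), (b = a + 1);`.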

--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, US / 37 20 N 121 53 W / ICQ16063900 / &tSftDotIotE
/ \ The multitude of books is making us ignorant.
\__/ Voltaire
Product's Quake III Arena Tips / http://www.bosskey.net/
Tips and tricks from the absolute beginner to the Arena Master.

Hanh Huynh-Huu

Oct 22, 2001, 2:56:05 AM

> I am trying to understand exactly how the rb_tree find algorithm works and
> I stumbled across this bit of code in SGI's STL implementation for
> stl_tree.h
>
> __y = __x, __x = _S_left(__x);
>
> I assume this is basically the same as:
> __y = (__x = _S_left(__x));

Nope, __y gets the previous value of __x (before it gets overwritten by
_S_left(__x)).
In your code, both get set to the same value.

>
> Is the reason for the , operator being invoked above to handle an issue
> with sequence points? I still don't totally understand the whole sequence
> point issue with C++.
>
> I think
> x = x++;
> violates a sequence point requirement in the standard but I don't totally
> understand all the implications of the rule.

you might want to check this out ...
http://wwwold.dkuug.dk/JTC1/SC22/WG14/www/docs/n926.htm

>
> I suppose I should purchase the standard and look it up eh?
> Now where is that darn URL?

should be in the faq


James Kanze

Nov 4, 2001, 9:42:07 PM

Jack Klein <jack...@spamcop.net> writes:

|> On 21 Oct 2001 20:46:05 -0400, David Leimbach
|> <leim...@bellsouth.net> wrote in comp.lang.c++.moderated:

|> > I am trying to understand exactly how the rb_tree find algorithm
|> > works and I stumbled across this bit of code in SGI's STL
|> > implementation for stl_tree.h

|> > __y = __x, __x = _S_left(__x);

|> > I assume this is basically the same as:
|> > __y = (__x = _S_left(__x));

|> No, this is the equivalent of this:

|> __y = __x; __x = _S_left(__x);

|> In this particular case the statement could be broken into two
|> totally separate statements without any change in meaning, but that
|> is not always so.

|> There is actually no good reason at all from a C++ standpoint for
|> the code to be written the way it is, rather than the way I rewrote
|> it. There might possibly be an optimization issue, some compiler
|> generates slightly better code for the version with the comma
|> operator, but I doubt it.

In general, I find such uses of the comma operator very bad style. It's
worth pointing out, however, that most of them occur in contexts where a
composite statement (multiple statements in {...}) wouldn't be legal --
in a condition of a while, or one of the operations in a for, for
example, or as the second or third operand for ?:.
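
A sketch of one such context, the step clause of a for statement,
where a {...} block is not allowed. The function name is invented for
the illustration.

```cpp
#include <cstddef>
#include <vector>

// Reverses v in place, walking one index up and one index down.
void reverse_in_place(std::vector<int>& v) {
    if (v.empty()) return;
    // The comma in the init clause separates two declarators; the one
    // in the step clause "++i, --j" is the comma operator.
    for (std::size_t i = 0, j = v.size() - 1; i < j; ++i, --j) {
        int tmp = v[i];
        v[i] = v[j];
        v[j] = tmp;
    }
}
```

Without the comma operator, one of the two updates would have to move
into the loop body.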

--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(0)179 2607481