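A default assignment operator returns a mutable reference to the left hand
side, making it possible to write what -- to me at least -- are strange
assignment expressions like:
class MyClass {...};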
MyClass a, b, c;
...
(a = b) = c;
which ultimately leaves 'a' with the value held by 'c'. In fact, my compiler
allows this kind of assignment with built-in types as well. But since this
construction seems so un-natural, I usually prevent it for user-defined
types by defining operator= to return a reference to a const object instead.
I don't understand why I would ever want to return a mutable lvalue from an
assignment expression/operator. Any thoughts?
Thanks,
Greg
[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]
So you could call a non-const method for the result of an assignment,
for example.
struct A
{
void whatever() {}
};
A a, b;
(a = b).whatever();
Just a compact way of writing "a = b; a.whatever();".
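A minimal sketch of the same point, assuming a hypothetical Counter type (not from the original post). The chained call compiles only because operator= hands back a non-const reference; with a `const Counter&` return type the call to the non-const member would be rejected.

```cpp
#include <cassert>

// Hypothetical type for illustration; not part of the original post.
struct Counter {
    int value;
    int bumps;
    // Returns a mutable reference, like the compiler-generated operator=.
    Counter& operator=(const Counter& rhs) {
        value = rhs.value;
        return *this;
    }
    void bump() { ++bumps; }  // non-const member function
};

// Assigns b to a, then calls a non-const member on the result.
int bump_after_assign() {
    Counter a = {0, 0};
    Counter b = {7, 0};
    (a = b).bump();  // equivalent to: a = b; a.bump();
    assert(a.value == 7);
    return a.bumps;  // the bump landed on a itself
}
```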
Victor
--
Please remove capital A's from my address when replying by mail
Count me too on that one!
> assignment expressions like:
>
> class MyClass {...};
> MyClass a, b, c;
> ...
> (a = b) = c;
>
> which ultimately leaves 'a' with the value held by 'c'. In fact, my
> compiler
> allows this kind of assignment with built-in types as well.
In C++, unlike in C, the result of an assignment is a reference to
the assigned expression -- that's why the IMO absolutely senseless
assignment that you showed above is allowed for built-in types.
> But since this
> construction seems so un-natural, I usually prevent it for user-defined
> types by defining operator= to return a reference to a const object
> instead.
Of course; that's what I *always* do! I'm yet to find a good reason
to convince me that allowing modifying operations/methods on the
result of an assignment is a good idea.
Sure, I always agree that any overloaded operator should behave as
built-in types behave... But sorry, I make the exception on this
one: I'll never agree with the built-in types on this one! :-)
> I don't understand why I would ever want to return a mutable lvalue from an
> assignment expression/operator. Any thoughts?
Well, Victor pointed out a situation in which it doesn't seem too
illogical -- however, I don't think that such compactness is necessary;
if you want to do `a = b; a.whatever();', well just do it!
That's my $0.02...
Carlos
--
> A default assignment operator returns a mutable reference to the left hand
> side, making it possible to write what -- to me at least -- are strange
> assignment expressions like:
>
> class MyClass {...};
> MyClass a, b, c;
> ...
> (a = b) = c;
>
> which ultimately leaves 'a' with the value held by 'c'. In fact, my
> compiler
> allows this kind of assignment with built-in types as well.
When in doubt, do as the ints. Of course, the above does not leave
a with the value held by c always. There are two assignments to a
without a sequence point giving undefined behavior.
I like ((((y = a) *= x) += b) *= x) += c; for user types.
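As a rough illustration of that one-liner, here is a sketch that evaluates a*x*x + b*x + c by Horner's rule on a user-defined type. The Num wrapper and the horner function are hypothetical names chosen for this example; every operator is a function call, so each step is fully sequenced.

```cpp
#include <cassert>

// Hypothetical numeric wrapper; all assignment forms return Num&.
struct Num {
    long v;
    Num& operator=(const Num& r)  { v = r.v;  return *this; }
    Num& operator*=(const Num& r) { v *= r.v; return *this; }
    Num& operator+=(const Num& r) { v += r.v; return *this; }
};

// Evaluates a*x*x + b*x + c with the chained form from the post.
long horner(long a, long b, long c, long x) {
    Num y = {0}, A = {a}, B = {b}, C = {c}, X = {x};
    ((((y = A) *= X) += B) *= X) += C;
    assert(y.v == a * x * x + b * x + c);
    return y.v;
}
```

The chain would not compile if any of the three operators returned a reference to const.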
> But since this
> construction seems so un-natural, I usually prevent it for user-defined
> types by defining operator= to return a reference to a const object
> instead.
You can not then place those objects in any standard container.
> I don't understand why I would ever want to return a mutable lvalue from an
> assignment expression/operator. Any thoughts?
See above and do a search on deja for lots of heat on this subject. I
prefer allowing something reasonable to preventing something stupid
which is unlikely to be done accidentally.
John
Whether you like the (a = b) = c; type of thing or not, it is something
that is supported by all basic types. If you want your type to be usable
by anyone, including people who prefer that sort of terseness over
multi-line alternatives, then a mutable lvalue is better.
If you don't want to support this sort of (a=b)=c; syntax, the alternatives
are to return a const reference, return a separate object, or simply
return void.
> A default assignment operator returns a mutable reference to the left hand
> side, making it possible to write what -- to me at least -- are strange
> assignment expressions like:
>
> class MyClass {...};
> MyClass a, b, c;
> ...
> (a = b) = c;
>
> which ultimately leaves 'a' with the value held by 'c'. In fact, my
> compiler
> allows this kind of assignment with built-in types as well. But since this
> construction seems so un-natural, I usually prevent it for user-defined
> types by defining operator= to return a reference to a const object
> instead.
>
> I don't understand why I would ever want to return a mutable lvalue from an
> assignment expression/operator. Any thoughts?
Perhaps because that is the way the "built-in" assignment operator has
always worked in C?
I'm not sure you want to override the assignment operator to work in an
unexpected way.
Anders.
--
Anders Pytte Milkweed Software
PO Box 32 voice: (802) 586-2545
Craftsbury, VT 05826 email: and...@milkweed.com
<<snip>>
You're right, this is silly.
> I don't understand why I would ever want to return a mutable lvalue from an
> assignment expression/operator. Any thoughts?
In case you want to pass the results of the assignment to
a function that takes a mutable reference. You want your
class objects to behave the way everything else does.
This is extremely important in the presence of templates.
MikeT
Agreed. These kinds of chained assignments are really ugly.
Luckily, they are very rarely used.
> But since this
> construction seems so un-natural, I usually prevent it for user-defined
> types by defining operator= to return a reference to a const object
> instead.
>
Hmm, what do you usually do with the const return value? Probably
something like
while( obj1 = obj2 )
{
}
This works if the class provides a conversion operator to type "bool"
(or to a type implicitly convertible to bool).
This kind of chaining is used more frequently than the previous
example, but is it really less ugly? (A good compiler will always warn
you about the assignment in a conditional expression)
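A small sketch of that situation, assuming a hypothetical Token type with a const-reference operator= and a C++11 explicit conversion to bool. It shows that returning a reference to const still permits assignment inside a condition; the doubled parentheses are only there to quiet the usual compiler warning.

```cpp
#include <cassert>

// Hypothetical type: assignment returns a reference to const,
// and the object converts to bool in a condition.
struct Token {
    int id;
    const Token& operator=(const Token& rhs) { id = rhs.id; return *this; }
    explicit operator bool() const { return id != 0; }
};

// Counts tokens until the first one whose id is zero.
int count_until_zero() {
    const int ids[] = {3, 5, 0};
    Token cur = {-1};
    Token next = {ids[0]};
    int n = 0;
    int i = 0;
    while ((cur = next)) {  // assignment used as the loop condition
        ++n;
        ++i;
        next.id = ids[i];
    }
    assert(cur.id == 0);
    return n;
}
```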
Why do you want to prevent the user of your class from writing the
former, but still allow him to write the latter?
So, IMO the alternatives for assignment return types are:
- void (to prevent misuse)
- a mutable lvalue (to mimic built-in types)
> A default assignment operator returns a mutable reference to the left hand
> side, making it possible to write what -- to me at least -- are strange
> assignment expressions like:
>
> class MyClass {...};
> MyClass a, b, c;
> ...
> (a = b) = c;
>
> which ultimately leaves 'a' with the value held by 'c'. In fact, my
> compiler
> allows this kind of assignment with built-in types as well. But since this
> construction seems so un-natural, I usually prevent it for user-defined
> types by defining operator= to return a reference to a const object
> instead.
>
> I don't understand why I would ever want to return a mutable lvalue from an
> assignment expression/operator. Any thoughts?
>
[snip]
My preference is to make any overloaded operator behave as much like
the builtin operator of the same name as is feasible, given the
goals of the overloaded operator.
(If it is not feasible to make the overloaded operator's behavior
conceptually similar to that of the matching builtin operator, the
decision to name that function with an operator's symbol should be
re-examined (but not necessarily abandoned; see <iostream>).)
So I (usually) return a non-const reference from operator=(),
operator++(), operator*() (when used for dereferencing a smart
pointer to a non-const object), operator[](),
operator<<(some_ostream_like_beast&,Foo), and
operator>>(some_istream_like_beast&,Foo) .
The case for operator=() returning a modifiable lvalue is arguable,
but think about operator[]() for a moment:
std::vector<int> v;
v.push_back(1);
v[0]=2;
If the 3rd statement was not legal, I suspect a large segment of the
C++ community would rise in protest. A similar case can be made for
operator*(). (Of course, neither of these operators should return
modifiable lvalues when they are used to access const objects.)
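The const/non-const split mentioned in that parenthesis can be sketched as follows. TinyVec is a hypothetical three-element container invented for this example; it is not how std::vector is implemented.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size container with both operator[] overloads.
class TinyVec {
    int data_[3] = {0, 0, 0};
public:
    // Non-const access yields a modifiable lvalue, so v[0] = 2 works.
    int& operator[](std::size_t i) { return data_[i]; }
    // Const access yields a reference to const; writing through it is rejected.
    const int& operator[](std::size_t i) const { return data_[i]; }
};

int write_then_read() {
    TinyVec v;
    v[0] = 2;                // uses the non-const overload
    const TinyVec& cv = v;
    assert(cv[1] == 0);      // uses the const overload
    return cv[0];
}
```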
Of course, there are people who avoid using operator= as much as
possible (see the 'Top 10 Language Constructs (C++)' thread.), and
others who want nearly everything to be const. But I often think
these people should probably be using a language expressly designed
to make mutation seldom or never necessary. (Most (all?) functional
languages are good examples of this.)
There are several idioms that rely on assignment returning a non-const
reference. This has been considered in great depth and it is a close
call. However doing it differently from everyone else is unhelpful. The
key feature is that we should not wilfully make udts different from
built-ins.
Example:
mytype & func(mytype & parm){
return parm = /* expression */;
}
Of course it can be written differently, but lack of uniformity is a
pain that becomes even worse when you start using templates.
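To make the template point concrete, here is a sketch of a generic helper written in the style of the example above. The names reset_to and Celsius are hypothetical. The template body compiles for any T whose assignment yields T&, built-in or user-defined; a type whose operator= returned `const T&` would fail at the return statement.

```cpp
#include <cassert>
#include <string>

// Generic helper that relies on assignment yielding a mutable lvalue.
template <class T>
T& reset_to(T& target, const T& value) {
    return target = value;
}

// Hypothetical user-defined type that follows the built-in convention.
struct Celsius {
    double deg;
    Celsius& operator=(const Celsius& r) { deg = r.deg; return *this; }
};

bool works_for_builtin_and_udt() {
    int i = 0;
    std::string s;
    Celsius t = {0.0};
    int& ri = reset_to(i, 3);
    std::string& rs = reset_to(s, std::string("x"));
    Celsius& rt = reset_to(t, Celsius{21.5});
    assert(i == 3 && s == "x" && t.deg == 21.5);
    // Each returned reference names the object that was assigned to.
    return &ri == &i && &rs == &s && &rt == &t;
}
```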
Francis Glassborow Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
What if you REALLY don't want to create a temporary object, so you code
(a =b) +=5;
as opposed to a = b +5 or even a = b; a += 5;
Hey, it's a reason; I didn't say it was a good one.
Joe Gottman
Wrong! In C, assignment returns *the value* assigned, unlike in C++,
where a reference to the assigned expression is returned... Of course,
your point still holds, as we're discussing this in a C++ context...
Carlos
--
Const reference: arguably fine (I actually defend that option).
By value: horrible (unnecessarily inefficient).
Void: absolutely unacceptable. You wouldn't be able to code (a=b)=c;
but you wouldn't be able to code any of these either:
a = b = c;
if ((a = function()) != whatever)
.....
Can you explain further?
My thoughts: (a = b) returns a reference to a, after which
assignment to c takes place? Or, in other words, isn't the result
of (a=b) a parameter in the second operator= call?
--
- Bruce
> I don't understand why I would ever want to return a mutable lvalue from an
> assignment expression/operator. Any thoughts?
Consider the case of say a class with reference counting:
class myStuff {
public:
// usual suspects
myStuff& operator=( myStuff& s );
};
Note that the parameter is not a const ref. This is because we're going to
avoid/delay the expensive copy of the string and merely increment the
reference count in s.
This allows writing "normal" chained assignments such as:
myStuff a;
myStuff b;
myStuff c;
a = b = c = "N/A";
(not that I'd code this way but...)
Now, you could probably get around this by "correctly" implementing myStuff
so that the members associated with reference counting were mutable, and
therefore you could use const refs.
Mostly though, I think you're more likely to use it (if at all) in some
form of:
( foo = bar ).someNonConstMember();
or
baz( foo = bar );
where baz takes a non-const ref. (as in the operator= example above)
Much more important to my mind is ensuring that genericity works without
surprises. Those that insist on const & returns for assignment will find
their users complaining that their favourite templates are giving
bizarre error messages.
Francis Glassborow Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
Read the example again, and produce a reason for ever using it. That is
NOT a chained assignment but a reassignment to the same lvalue. However
that is not the reason for plain ref return from assignment.
I hate to sound like the code police, but I don't think this kind of
"creativity" or "terseness" is readable. So in general, I'd like to prevent
it in my projects.
Greg
Strictly speaking, I'm uncertain whether it's legal, but my implementation
permits it. In other words,
class MyType {
public:
...
const MyType& operator=(const MyType& rhs);
};
std::container<MyType> c1, c2;
...
c1 = c2;
works just fine. In your defense, Austern says that, in order for a type X
to model Assignable, it must support the expression x = y with a return
type of X&. However, a return type of const X& from operator=(..) clearly
does not prohibit assignment to the elements of a standard container in a
case like that above.
> See above and do a search on deja for lots of heat on this subject.
>
Will do.
Thanks,
> In case you want to pass the results of the assignment to
> a function that takes a mutable reference.
I don't think many people (with the exception of the turkeys in this
newsgroup - myself included) will have the presence of mind to ask whether
foo() might modify its argument when they see
foo(a = b);
so I'd rather it not be written in the first place.
> You want your
> class objects to behave the way everything else does.
This seems intuitive enough, but I apparently haven't seen enough to
convince myself yet.
> This is extremely important in the presence of templates.
I'd like to learn more about this.
Sequence points are not relevant here -- I'm pretty sure that the
standard specifies that an assignment returns a reference to the
assigned expression *after* the assignment took place. That is
enough to guarantee that the above statement's behavior is not
undefined.
In a case like i = i++; the sequence points are relevant because
the standard specifies that in the post-increment, the side-effect
takes place *any time* between reading the original value to be
returned, and the next sequence point -- since assignment is not
a sequence point, the behavior is undefined.
But please note that the two situations are very different...
If what you say is true, then it can't possibly make sense to
have built-in types returning a mutable reference to the
assigned expression, because as soon as you do *anything*
that modifies that expression, you get undefined behavior...
(unfortunately, I couldn't find the exact place in Stroustrup's
book where I read the description of the assignment operator's
behavior, but I'm pretty sure that it said that the reference
is returned *after* the assignment took place)
Carlos
--
I haven't tried it, but I'm assuming
while (obj1 = obj2) {}
won't compile unless the kind of conversion to bool you speak of has been
explicitly defined.
Greg
> John Potter <jpo...@falcon.lhup.edu> wrote:
> >> (a = b) = c;
> [snip]
> > Of course, the above does not leave a with the value held by c
> > always. There are two assignments to a without a sequence point
> > giving undefined behavior.
>
> Can you explain further?
>
> My thoughts: (a = b) returns a reference to a, after which
> assignment to c takes place? Or, in other words, isn't the result
> of (a=b) a parameter in the second operator= call?
We have snipped enough to lose the original context. In the
original, a, b, c were of user defined type and the statement does
leave a with the value of c. It was followed by a mention that it was
accepted for builtin types also. My response would not apply to the
original statement but does apply to builtins. For user defined types,
the operator= is a function (even when the compiler generates it) and a
function call introduces sequence points. For user types:
(a = b) = c; // is
a.operator=(b).operator=(c); // effectively
operator=(&operator=(&a, b), c); // & gives this parameter
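A sketch that makes the call order visible for a user-defined type. Traced and its log are hypothetical helpers for this illustration; each operator= call appends the stored value to the log, so the chained form and the explicit member-call form can be compared.

```cpp
#include <cassert>
#include <string>

// Hypothetical type whose operator= records every value it stores.
struct Traced {
    int v;
    static std::string& log() {
        static std::string s;
        return s;
    }
    Traced& operator=(const Traced& r) {
        v = r.v;
        log() += std::to_string(v) + ";";
        return *this;
    }
};

// Runs the chained form, then the explicit member-call form.
std::string chained() {
    Traced::log().clear();
    Traced a = {0}, b = {7}, c = {5};
    (a = b) = c;                   // stores 7 in a, then 5 in a
    a.operator=(b).operator=(c);   // same two calls, spelled out
    assert(a.v == 5);
    return Traced::log() + "a=" + std::to_string(a.v);
}
```

Both forms log the store of 7 before the store of 5, because the inner call must finish before its result can be used as the object of the outer call.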
For int a, b, c; :
(a = b) = c;
uses the builtin assignment operator which is not a function and there
are no sequence points between the two assignments. Modification of
a variable more than once without a sequence point is undefined
behavior. The implementation is allowed to arrange things as it sees
fit. Examples: load b, load c, sto c, sto b. Perform the two stores
in parallel possibly mixing bytes if int is wider than the data bus.
Interesting that the statement is valid with undefined behavior for
builtins, yet someone is always trying to make it invalid for user
types where it has well defined behavior.
John
>Strictly speaking, I'm uncertain whether it's legal, but my implementation
>permits it. In other words,
> class MyType {
> public:
> ...
> const MyType& operator=(const MyType& rhs);
> };
> std::container<MyType> c1, c2;
> ...
> c1 = c2;
>works just fine.
More accurately, the particular examples that you tried happen
to work on the current version of your implementation,
but your implementation is under no obligation to continue
to support them in future releases.
--
Andrew Koenig, a...@research.att.com, http://www.research.att.com/info/ark
>I don't think many people (with the exception of the turkeys in this
>newsgroup - myself included) will have the presence of mind to ask whether
>foo() might modify its argument when they see
> foo(a = b);
>so I'd rather it not be written in the first place.
This particular example isn't nearly as important as
Thing& Thing::assign(const Thing& newvalue)
{
return *this = newvalue;
}
I think that many programmers would think that
return *this = newvalue;
is simply an abbreviation for
*this = newvalue;
return *this;
and so it is -- but only if assignment returns a nonconst reference.
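A compilable sketch of that abbreviation, filling in a hypothetical Thing with a single int member. If operator= were declared to return `const Thing&`, the return statement in assign would be rejected, because a reference to const cannot initialize the `Thing&` result.

```cpp
#include <cassert>

// Hypothetical body for the Thing class named in the post.
struct Thing {
    int v;
    Thing& operator=(const Thing& r) { v = r.v; return *this; }
    // Relies on operator= returning a non-const reference.
    Thing& assign(const Thing& newvalue) {
        return *this = newvalue;
    }
};

bool assign_returns_self() {
    Thing a = {1};
    Thing b = {2};
    Thing& r = a.assign(b);
    assert(a.v == 2);
    return &r == &a;  // the result refers to a itself
}
```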
>
> "John Potter" <jpo...@falcon.lhup.edu> wrote in message
> news:3980fd81...@news.csrlink.net...
> > On 27 Jul 2000 18:54:25 -0400, Greg Hickman <greg.h...@lmco.com>
> > wrote:
> >
> > You can not then place those objects in any standard container.
> >
>
> Strictly speaking, I'm uncertain whether it's legal, but my
> implementation permits it.
And, I suspect all others also. This point is not subject to debate,
the standard (23.1/4 Table 64) says exactly what Matt says in his book.
To refute that, would require running all standard algorithms on all
containers for all implementations now and in the future. Easier to
return T& <g>.
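The return-type requirement cited here can be checked at compile time. This is a sketch in C++11, which postdates the thread; the trait name assignment_yields_mutable_ref and the two test types are hypothetical, and the check covers only the return type, not the rest of the Assignable requirements.

```cpp
#include <type_traits>
#include <utility>

// True when the expression t = u (t of type T, u of type const T)
// has type T&, as the Assignable requirement asks.
template <class T>
struct assignment_yields_mutable_ref
    : std::is_same<decltype(std::declval<T&>() = std::declval<const T&>()),
                   T&> {};

// Hypothetical types: one follows the built-in convention, one does not.
struct Plain {
    Plain& operator=(const Plain&) { return *this; }
};
struct ConstRet {
    const ConstRet& operator=(const ConstRet&) { return *this; }
};

static_assert(assignment_yields_mutable_ref<int>::value, "built-in");
static_assert(assignment_yields_mutable_ref<Plain>::value, "Plain");
static_assert(!assignment_yields_mutable_ref<ConstRet>::value, "ConstRet");
```

A type like ConstRet may still happen to work with a given container on a given implementation, which is the situation described earlier in the thread.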
No heat here. Just the facts. Other reasons are always good for
a fun debate.
John
Yes, but that doesn't prove anything: that simply shows that
there is at least one situation in which the operation that you
do with a standard container does not require the return type
of the operator= to be assignable. That doesn't prove that
the standard containers *never* require an assignable return
type.
> In your defense, Austern says that, in order for a type X
> to model Assignable, it must support the expression x = y with a return
> type of X&.
If that is the case (and I definitely tend to blindly believe
anything stated in Austern's book :-)), that would mean that
there are or may be situations (some algorithms or some operations
on certain containers) in which the return value of operator=
must be assignable. You simply showed one particular situation
in which it is not.
I particularly dislike (big time!) returning a modifiable
reference, but after reading this discussion, I'm tending
to accept the argument... I had never thought that using
standard containers and/or algorithms might lead to non-
compiling code -- seeing that it is the case, I'm starting
to agree with "the other band" :-)
Carlos
--
> John Potter wrote:
> >
> > > (a = b) = c;
> >
> > When in doubt, do as the ints. Of course, the above does not leave
> > a with the value held by c always. There are two assignments to a
> > without a sequence point giving undefined behavior.
Reading your post, I see that we are talking about builtins not user
defined types. My statement above is misleading because the expression
was talking about a udt and it does as expected. If the expression
were about builtins, my statement would be correct and that is what
you are questioning.
> Sequence points are not relevant here -- I'm pretty sure that the
> standard specifies that an assignment returns a reference to the
> assigned expression *after* the assignment took place. That is
> enough to guarantee that the above statement's behavior is not
> undefined.
Sounds good to me and I have also made the statement that the expression
could not have undefined behavior for the same reasons.
5.17/1 ... The result of the assignment operation is the value stored
in the left operand after the assignment has taken place; the result
is an lvalue.
I think that it was James Kanze who pointed out that there is nothing
in that statement about returning anything. Builtin operators are not
functions and do not return things. They are used in expressions and
the expressions have values. The value (rvalue) of the expression is
(the same as) the value stored in the left operand after the assignment
has taken place. The lvalue of the expression is the left operand, and
that is a known fact at compile time. It sure is nice to infer a time
sequence, but the rules for sequence points do not support it. Nothing
says that the rvalue used was obtained from the left operand after the
assignment has taken place. It only says that they have the same value.
Consider a = b = c. There is no way that a conforming program can tell
whether the rvalue assigned to a came from the rvalue of b after the
assignment or from the original rvalue of c.
> If what you say is true, then it can't possibly make sense to
> have built-in types returning a mutable reference to the
> assigned expression, because as soon as you do *anything*
> that modifies that expression, you get undefined behavior...
Not if you introduce a sequence point in what you do.
void f(int&);
f(a = b);
The function will get a reference to a after the assignment has taken
place and can modify a with well defined behavior. That is the reason
for the lvalue (I almost said returned by) of the expression.
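A runnable version of that two-line example, with a hypothetical function body added. The function call supplies the sequence point, so the second modification of a happens strictly after the assignment.

```cpp
#include <cassert>

// Hypothetical callee that modifies its argument.
void add_one(int& x) { ++x; }

int assign_then_modify() {
    int a = 0;
    int b = 41;
    add_one(a = b);  // a = b is an lvalue in C++, so it binds to int&
    assert(b == 41);
    return a;
}
```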
There is no sequence point and the behavior is undefined. Fight it
long enough and hard enough, as I did, and someone will recall a
conforming system where it did not work. When the standard says that
something has undefined behavior, it has undefined behavior. No amount
of testing of existing systems will be of any use when that type of
expression [ (a = b) += c; ] stops your pacemaker.
John
And I am pretty certain that you are mistaken. Sequence points are about
side effects and in particular when memory must be written. Evaluating a
reference and writing to the underlying object are quite distinct. The
former must be done prior to evaluating the containing expression, but
the latter need only be done some time prior to the next sequence point.
>
Francis Glassborow Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
[snip]
> More accurately, the particular examples that you tried happen
> to work on the current version of your implementation,
> but your implementation is under no obligation to continue
> to support them in future releases.
That's all the incentive I need to fall in line!
Greg
> And, I suspect all others also. This point is not subject to debate,
> the standard (23.1/4 Table 64) says exactly what Matt says in his book.
Thanks for the information. It was your original warning related to STL
containers which prompted me to look up his definition of 'Assignable' -
prior to that, I didn't know there was cause for concern.
Greg
P.S.
Is the text of the standard freely available on the internet?
Although I understand the syntax of your example, my experience is perhaps
too narrow to appreciate the reasons why programmers want to return
references to non-const in this manner. Presumably, one wants to mutate
the object referred to via the returned reference, but I would tend to do
this in two statements like
Thing t1, t2;
t1.assign(t2); // 1
updateTheThing(t1); // 2
rather than
updateTheThing(t1.assign(t2));
Francis alluded to idioms which depend on operator=() returning a reference
to non-const in a previous post, and I think you've tried to give an example
for illustration here, but I don't really understand these "idioms". I'd
like to learn more, but if need be, compatibility with built-ins and the
STL container concepts are certainly reasons enough to hold me over until
the light comes on.
Thanks,
Greg
It is not only standard library templates but all third party ones. Any
time you make a udt's operators behave differently from built-in types
you need a very good reason because doing so threatens using template
technology with the udt.
Francis Glassborow Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
>Although I understand the syntax of your example, my experience is perhaps
>too narrow to appreciate the reasons why programmers want to return
>references to non-const in this manner.
Well, for example, you do it every time you write
std::cout << "Hello, world!" << std::endl;
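The same convention applies to a user-written inserter. A sketch with a hypothetical Point type: operator<< returns the stream by non-const reference, which is what lets the next << in the chain write to it.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical type with its own stream inserter.
struct Point {
    int x;
    int y;
};

// Returns the stream by non-const reference so calls can be chained.
std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << '(' << p.x << ", " << p.y << ')';
}

std::string show(const Point& a, const Point& b) {
    std::ostringstream os;
    os << a << " -> " << b;  // each << writes to the stream returned by the previous one
    assert(os.good());
    return os.str();
}
```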
--
Andrew Koenig, a...@research.att.com, http://www.research.att.com/info/ark
AFAIK, not freely, but you can download a PDF version for US$ 18,
from http://www.ansi.org (I think it's worth the price).
Carlos
--
Roger that.
Of course! I made reference to your (or was it John's?) original argument
about using udt's with STL containers... But of course, the argument
is immediately extended to any template-based library.
> Any
> time you make a udt's operators behave differently from built-in types
> you need a very good reason because doing so threatens using template
> technology with the udt.
Well, just to be picky -- UDT's *always* behave differently whenever
there is assignment or side effects involved; in particular, the
expression (a=b) = c, which leads to undefined behavior (yes, you
convinced me!!! ;-)) for built-in types, is guaranteed to work for
udt's (because the function call introduces a sequence point).
Even more hilarious: something like this:
i = i++;
has well defined behavior for UDT's!!!! (regardless of whether the
operator++ function is inline or not, of course)
But yes, I do get your point... :-)
Carlos
--
Well, after checking the standard, I am pretty sure that I was not
mistaken; I must go back to my original point (even though you had
me convinced! :-)). I quote 5.17 (page 89 in the PDF version):
"The result of the assignment operation is the value stored in the
left operand after the assignment has taken place; the result is an
lvalue"
Now, that tells me that sequence points are *not* relevant in the
behavior of:
(a = b) = c;
Furthermore, according to the above paragraph (quoted from the
standard), I conclude that the behavior of such an expression is
perfectly defined: after the statement is executed, the value
of a is the same as the value of c, and both b and c remain
unchanged.
Did I misread something in the text of the standard?
Carlos
--
PS: If my reasoning here is true, then obviously forget about
my other message about a = b = c;
I agree. I simply mentioned it as technically possible. I certainly
don't advocate it.
>
> Void: absolutely unacceptable (you wouldn't be able to code (a=b)=c;
> but you wouldn't be able to code any of these either:
>
> a = b = c;
> if ((a = function()) != whatever)
> .....
>
I have seen style guidelines that vigorously discourage any of
these constructs. The fact that I dislike those particular
style guides is another story :-)
Actually, I agree with you. But that's a stylistic choice that not
all users of your classes will agree with. Of course, if you are
in control of all the places where your classes will be used
(or they use the same style guidelines) life will be a lot
easier.
> Well, after checking the standard, I am pretty sure that I was not
> mistaken; I must go back to my original point (even though you had
> me convinced! :-)). I quote 5.17 (page 89 in the PDF version):
>
> "The result of the assignment operation is the value stored in the
> left operand after the assignment has taken place; the result is an
> lvalue"
>
> Now, that tells me that sequence points are *not* relevant in the
> behavior of:
>
> (a = b) = c;
>
> Furthermore, according to the above paragraph (quoted from the
> standard), I conclude that the behavior of such expression is
> perfectly defined: after the statement is executed, the value
> of a is the same as the value of c, and both b and c remain
> unchanged.
>
> Did I misread something in the text of the standard?
You need to also use 5/4. The stored value may only be modified once
and the only use of the old value is to compute the new value, between
sequence points.
Let's add a couple of lines.
b = 7;
c = 5;
(a = b) = c;
Assignment does not introduce a sequence point (see 1.9).
The rvalue of a = b is 7. The lvalue of a = b is a. Since the
expression (a = b) = c contains no inner sequence points, the
implementation is allowed to assume that nothing is modified
more than once and execute things in any consistent order.
compute lvalue of a = b // a (compile time)
compute rvalue of a = b // 7
compute rvalue of c // 5
store 5 in a
store 7 in a
Since the fifth line makes the fourth line useless and the third is only
used for it, optimize out those two lines. The problem that you are
having is in thinking that the 7 must be stored in a in order to
compute the lvalue of the expression. The standard says that the rvalue
of the expression is the value of the lhs after the assignment takes
place. It doesn't say that it is computed by reading it. The
expression may be reduced to either a = b or a = c. Undefined.
To complicate matters a bit.
int* p;
p = &(a = b);
No problem, the lvalue of the expression is used (invalid C).
*(p = &(a = b)) = c;
No amount of trickery will change the fact that a is modified twice
without a sequence point. Undefined.
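For built-in types, the two stores can be separated by a statement boundary while still using the lvalue result of the first assignment. This is a sketch with a hypothetical function name; it follows the C++98 rules discussed in this thread. Later revisions of the standard (C++11 onward) reworded the sequencing rules for assignment, so check the edition your compiler targets; the code below does not rely on that.

```cpp
#include <cassert>

int store_twice(int b, int c) {
    int a = 0;
    int* p = &(a = b);  // uses the lvalue of a = b; one store to a
    assert(p == &a);
    *p = c;             // second store, in a separate full expression
    return a;
}
```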
> PS: If my reasoning here is true, then obviously forget about
> my other message about a = b = c;
This one is no problem because all interpretations give the same
result.
compute lvalue of a // a (compile time)
compute lvalue of (b = c) // b (compile time)
compute rvalue of c // 5
compute lvalue of b // b (compile time)
store 5 in b
compute rvalue of b // 5
store 5 in a
compute lvalue of a // a (compile time)
compute lvalue of b // b (compile time)
compute rvalue of c // 5
compute rvalue of (b = c) // 5 (noted at compile time)
store 5 in a
store 5 in b
Another easy good one.
struct S { S* next; };
S* p;
p = p->next;
p is used as an rvalue and is assigned to; however, the only use of
the old value of p is in the computation of the new value.
Another bad one.
a = b + ++b;
The first b may be the old or new value depending upon the order of
evaluation. Since it is possible to access the old b for a reason
other than computing the new b, we get undefined behavior.
An exercise for the readers:
struct S { S* next; S* back; };
S* p;
S* q;
(((q->back = p->back)->next = q)->next = p)->back = q;
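As a starting point for the exercise, here is one reading of what the chain is meant to do, written as one store per statement. This is a sketch only: the function names are hypothetical, it assumes a circular doubly linked list, and it deliberately leaves open whether the one-line form is well defined.

```cpp
#include <cassert>

struct S {
    S* next;
    S* back;
};

// Links node q into the list immediately before node p.
void insert_before(S* p, S* q) {
    q->back = p->back;   // q takes over p's old predecessor
    q->back->next = q;   // that predecessor now points forward to q
    q->next = p;         // q points forward to p
    p->back = q;         // p points back to q
}

bool insert_into_singleton() {
    S p;
    p.next = &p;         // one-node circular list
    p.back = &p;
    S q;
    q.next = nullptr;
    q.back = nullptr;
    insert_before(&p, &q);
    assert(q.back == &p);
    return p.next == &q && p.back == &q && q.next == &p && q.back == &p;
}
```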
John
Among the other reasons, one can include the fact that unused code is
prone to be full of bugs. That is why many of the classes I use declare
the assignment as returning void. While the probability of bugs in a
simple return statement can appear low, there is a probability of
forgetting to type in that little '&' in the declaration. The result is
that we return a copy instead of a reference. If the return value was never
used in code, it may well pass testing and provide surprises in the
future. A 'void' return type will give an error message in that
uncertain future.
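A sketch of the failure mode described here, using a hypothetical Sloppy type whose operator= is missing the '&'. The code compiles and the plain assignment works, but anything done to the result of the assignment lands on a temporary copy.

```cpp
#include <cassert>

// Hypothetical type: note the missing '&' on the return type.
struct Sloppy {
    int value;
    int bumps;
    Sloppy operator=(const Sloppy& rhs) {  // returns a copy by mistake
        value = rhs.value;
        return *this;
    }
    void bump() { ++bumps; }
};

int bumps_seen_by_a() {
    Sloppy a = {0, 0};
    Sloppy b = {7, 0};
    (a = b).bump();       // bumps the temporary copy, not a
    assert(a.value == 7); // the assignment itself still worked
    return a.bumps;       // a was never bumped
}
```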
My own guideline is: don't code for the future, because we don't know
how to test the future yet. My experience shows that even if you try to
test parts of code unused in the current project, the test will never be
good enough. Better not to write the code in the first place.
Of course, most classes should either declare the assignment private
(and keep it unimplemented) or use the default one generated by the
compiler.
> A default assignment operator returns a mutable reference to the left hand
> side, making it possible to write what -- to me at least -- are strange
> assignment expressions like:
>
> class MyClass {...};
> MyClass a, b, c;
> ...
> (a = b) = c;
>
> which ultimately leaves 'a' with the value held by 'c'. In fact, my
> compiler
> allows this kind of assignment with built-in types as well. But since this
> construction seems so un-natural, I usually prevent it for user-defined
> types by defining operator= to return a reference to a const object
> instead.
>
> I don't understand why I would ever want to return a mutable lvalue from an
> assignment expression/operator. Any thoughts?
>
> Thanks,
>
> Greg
>
This is for compatibility with C. In addition, it shortens multiple
assignment statements. We all know what "a=b=c" means mathematically.
--
Harry Erwin, PhD, <mailto:her...@gmu.edu>,Computational Neuroscientist
(modeling bat behavior), Senior SW Analyst and Security Engineer, and
Adjunct Professor of Computer Science, GMU. Looking--CV available at:
<http://mason.gmu.edu/~herwin/CV.htm>
I think it is a matter of interpretation.
(a=b) returns an lvalue for a (in effect its address), but at what stage
is that lvalue updated?
Take a different form:
a = (b = c);
Now when does b actually acquire the value stored in c? The statement
has to do two things in regard to b. First it must determine what to
store in b (that may include conversions, promotions etc.) and second it
must actually write that value to b. It also needs to determine what
value needs to be stored in a. It certainly is not required to actually
write to b and then read it back again (BTW, I think that would be
undefined behaviour, writing and then reading the same storage between
sequence points)
Note that it does not need to actually write to b in order to determine
what will be stored there.
Now revert to the original
(a=b) = c; (for built-in types)
It does not actually have to write to a in order to determine what the
lvalue of a is and it is only that which it needs for the second
assignment.
Of course this is pure language law (lore) and probably has no practical
impact (until you meet a compiler that inverts the order of the two
stores to a).
Francis Glassborow Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
I read that one, but it seems pretty clear that the description of
the assignment operation overrides 5/4.
> Let's add a couple of lines.
>
> b = 7;
> c = 5;
> (a = b) = c;
>
> Assignment does not introduce a sequence point (see 1.9).
Agreed. (well, we never disagreed on that :-))
> The rvalue of a = b is 7. The lvalue of a = b is a.
NO!!!! The lvalue of (a = b) is not a. The lvalue of (a = b)
is the value stored in a *after* the assignment has taken place,
and it has to be assignable. So, an implementation that wants
to consider that the result of (a = b) is a reference to a,
must wait until a has been assigned before using that result;
otherwise, what that implementation is using *is not* the
result of (a = b) as specified by the standard.
> Since the
> expression (a = b) = c contains no inner sequence points, the
> implementation is allowed to assume that nothing is modified
> more than once and execute things in any consistent order.
Exactly: in any *consistent* order...
> compute lvalue of a = b // a (compile time)
Nope! The result of (a = b) is not a at compile time -- it
can be obtained at compile time by analyzing the whole
expression...
> compute rvalue of a = b // 7
> compute rvalue of c // 5
> store 5 in a
This last line does not comply with the definition of assignment.
You're not storing 5 in the result of (a = b); you're storing
5 in something that happens to be in the same physical location
as will be the result of (a = b).
> used for it, optimize out those two lines. The problem that you are
> having is in thinking that the 7 must be stored in a in order to
> compute the lvalue of the expression.
No, that's not what I'm implying: but the 7 must be stored in a
*before* the result value of (a = b) is used; the result of (a = b)
must be 7, according to the definition I quoted in the previous
message. It must be an lvalue that evaluates to 7 -- thus, any
use of that lvalue must be done *after* the seven is stored.
> The standard says that the rvalue
> of the expression is the value of the lhs after the assignment takes
> place. It doesn't say that it is computed by reading it.
And I'm not saying it either. The compiler doesn't need to read
from a ever in that expression -- it just has to wait until after
the assignment to be able to use the lvalue that (a = b) gives as
result. The compiler shall be able to optimize it either way that
leads to the same result for any possible values of a, b, and c.
So, the optimizer might be free to reduce the expression to a = c,
because the sequence implied by the definition will imply that
end result, regardless of the value of b.
> *(p = &(a = b)) = c;
>
> No amount of trickery will change the fact that a is modified twice
> without a sequence point. Undefined.
Disagree! The compiler is not allowed to use the address of (a = b)
until after a has been assigned, because it wouldn't be using the
result of the expression (a = b) as per its definition.
> a = b = c;
>
> This one is no problem because all interpretations give the same
> result.
According to your reasoning for the other situation, they won't.
You argue that b doesn't have to be assigned before taking its
address, since the result of (b = c) is b. Ok, what could prevent
the compiler from doing this:
compute lvalue of (b = c) // b
Now, since the compiler already knows the result of what has to
be assigned to a, it can go ahead and do it first:
a = the result of (b = c) // which is b.
Finally, before the sequence point is introduced, go ahead and
do the assignment that is pending... b = c;
End result? a has now the original value of b, and b has now
the value of c (which was unchanged).
When you analyze this expression, you're forcing the compiler
to do things in the order you want. Just tell me how
the rules of sequence points and assignments that you stated
would prevent a compiler from executing a = b = c; as I
described it above?
Carlos
--
I think it's pretty clear in the text that I quoted from the standard:
at any stage *before* making use of it.
> Now revert to the original
>
> (a=b) = c; (for built in types
>
> It does not actually have to write to a in order to determine what the
> lvalue of a is and it is only that which it needs for the second
> assignment.
So, what you're saying is that the piece of text that I quoted means
nothing -- thus, should be removed from the standard.
Notice that the text says first that the result of the assignment is
*the value stored in the left operand*, and then it states that it
is an lvalue. So, knowing that the value of the assignment is the
address of a is not sufficient (it's not sufficient to comply with
the standard); the result of the assignment *MUST* be the value
assigned to a. The implementation will have to figure out a way
of obtaining an lvalue that is guaranteed to correspond to the
value after assignment -- and it must wait until the assignment
has taken place *before* using the lvalue -- otherwise it's not
using the result of the assignment: it's using something that
happens to be in the same physical address that the result of
the assignment will have.
You mention that the lvalue doesn't need to be read -- of course
not! The compiler doesn't need to read from a when it knows that
it just assigned some value to it. But the implementation *is
required* to find a sequence of instructions that correspond to
the effect of not using the result of the assignment (a = b)
until *after* the assignment has taken place.
> Of course this is pure language law
I know... But I'm intrigued by it... (we all are, aren't we? :-))
> Greg Hickman <greg.h...@lmco.com> wrote:
> > I don't understand why I would ever want to return a mutable lvalue from
> > an assignment expression/operator. Any thoughts?
>
> This is for compatibility with C. In addition, it shortens multiple
> assignment statements. We all know what "a=b=c" means mathematically.
No, take another look. The value of a = b in C is an rvalue not a
mutable lvalue. (a = b) = c is ill-formed in C and well-formed in
C++ with undefined behavior for built-ins and well defined behavior
for user types.
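For user types, Greg's habit of returning a reference to const can be sketched as follows. The class name Money is invented for illustration, and the type trait is only used to show the effect without provoking a compile error. Plain chaining still works because it only reads the result, while (a = b) = c is rejected.

```cpp
#include <type_traits>
#include <utility>

struct Money {   // hypothetical value class
    int cents;
    explicit Money(int c) : cents(c) {}
    Money(const Money&) = default;
    // Returning const Money& makes the result of an assignment read-only.
    const Money& operator=(const Money& o) { cents = o.cents; return *this; }
};

// The type of the expression (a = b) is const Money&.
static_assert(std::is_same<decltype(std::declval<Money&>() = std::declval<const Money&>()),
                           const Money&>::value,
              "assignment yields const Money&");

// a = b = c parses as a = (b = c) and only reads the inner result.
int chain() {
    Money a(1), b(2), c(3);
    a = b = c;
    return a.cents + b.cents;   // both now hold 3
}

// (a = b) = c would need a non-const left operand, so it is ill-formed;
// the trait reports that as false.
const bool result_is_assignable =
    std::is_assignable<const Money&, const Money&>::value;
```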
John
Hmmm. Forgetting the little & in a declaration will, worst case,
cause invocation of a copy constructor on the return statement.
In practice, it often turns out that if you need to write an assignment
operator or a copy constructor, you need to write the other.
Both are a conscious decision, and would be tested in tandem
anyway. As such, I'd suggest that the test harness would need to
specifically include cases that *use* the return value from
an assignment operator.
>
> My own guideline is: don't code for the future, because we don't know
> how to test the future yet. My experience shows that even if you try to
> test parts of code unused in the current project, the test will never be
> good enough. Better not to write the code in the first place.
You sound like an advocate of an approach known as "extreme programming"
:-)
I take a middle ground: code for as much of the future as you can
reasonably anticipate and test for within limits of your project.
If you are coding for the future, but can not write sensible use cases
or test cases, then you are probably overreaching. If you have
a tight deadline, anticipating for the future may not be practical
at all. If you can anticipate the need, and have the time to do
it properly, you will probably save time down track when the
system is extended or enhanced.
IMHO, likely uses of assignment operators and copy constructors
can be reasonably anticipated, and both working code and suitable
test harnesses generated fairly early in the life of a class.
>
> Of course, most classes should either declare the assignment private
> (and keep it unimplemented) or used the default one generated by the
> compiler.
Sort of. Private functions still need an implementation, even if it
is not used anywhere [some compilers/linkers fail to complete a
build without them].
On use of default assignment operators or copy constructors, I agree.
Looking at a small system I've worked on, that involved about
300 classes, only 6 of those classes [two are templates] explicitly
supply assignment operators and copy constructors. All of those
classes do some resource management (memory, sockets, etc) for
which the compiler generated functions will cause confusion.
I do not understand this. the 'value of 'a=b' is a reference to a, that
is the only conceivable interpretation in context. (int)7 is not, and
never can be any kind of lvalue. Until we have a clear agreement on such
basic terminology there is little purpose in pursuing the issue.
lvalues can be dereferenced to produce rvalue's. The point in debate is
exactly how that dereferencing should be achieved. Elsewhere an issue
has been raised concerning assignment to a volatile variable. Now
clarifying that might help resolve our understanding of the fundamental
meaning of the C++ rules re assignment.
Francis Glassborow Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
Where on Earth do you find that? A compiler must not fail (indeed it is
hard to see how it could) because an unused function is not defined.
And a linker cannot know that it was declared if it was not defined.
[...]
> So, what you're saying is that the piece of text that I quoted means
> nothing -- thus, should be removed from the standard.
The quoted sentence is self-contradictory as you correctly point out
in following paragraph.
> Notice that the text says first that the result of the assignment is
> *the value stored in the left operand*, and then it states that it
> is an lvalue. So, knowing that the value of the assignment is the
> address of a is not sufficient (it's not sufficient to comply with
> the standard); the result of the assignment *MUST* be the value
> assigned to a. The implementation will have to figure out a way
> of obtaining an lvalue that is guaranteed to correspond to the
> value after assignment -- and it must wait until the assignment
> has taken place *before* using the lvalue -- otherwise it's not
> using the result of the assignment: it's using something that
> happens to be in the same physical address that the result of
> the assignment will have.
Right.
> You mention that the lvalue doesn't need to be read -- of course
> not! The compiler doesn't need to read from a when it knows that
> it just assigned some value to it. But the implementation *is
> required* to find a sequence of instructions that correspond to
> the effect of not using the result of the assignment (a = b)
> until *after* the assignment has taken place.
... not in cases, like (a=b)=c, whose behavior is undefined. ;-)
Tom Payne
Yes, but the semantics of the program can be vastly different. I've never
seen a test that checked for this before. And I'm guilty on this count,
also.
> You sound like an advocate of an approach known as "extreme programming"
>:-)
Or a victim of tight deadlines!
> > Of course, most classes should either declare the assignment private
> > (and keep it unimplemented) or used the default one generated by the
> > compiler.
>
> Sort of. Private functions still need an implementation, even if it
> is not used anywhere [some compilers/linkers fail to complete a
> build without them].
Hum, is this conformant? I always thought that simply declaring a
function did not force it to be linked in. That's the behavior on all
compilers I've ever worked on.
> John Potter wrote:
> >
> > You need to also use 5/4. The stored value may only be modified once
> > and the only use of the old value is to compute the new value, between
> > sequence points.
>
> I read that one, but it seems pretty clear that the description of
> the assignment operation overrides 5/4.
>
> > Let's add a couple of lines.
> >
> > b = 7;
> > c = 5;
> > (a = b) = c;
> >
> > Assignment does not introduce a sequence point (see 1.9).
>
> Agreed. (well, we never disagreed on that :-))
>
> > The rvalue of a = b is 7. The lvalue of a = b is a.
>
> NO!!!! The lvalue of (a = b) is not a. The lvalue of (a = b)
> is the value stored in a *after* the assignment has taken place,
> and it has to be assignable.
The lvalue of something is its address. The lvalue of (a = b) is
the address of a. The lvalue to rvalue conversion involves reading
the value stored at that location. The rvalue of (a = b) is the
rvalue of b. At the next sequence point, it will also be the rvalue
of a.
The whole purpose of sequence points is to allow the implementation
to load all needed values into the cpu, perform all calculations
and then do all of the stores at the sequence point in any order or
even in parallel, and to let the user know when stored values are
usable. If you read 5/4 as any expression which violates these rules
may not perform as described later in this clause, everything makes
sense.
Good examples for assignment and prefix ++ are hard to come by. ++ ++ a is
also undefined, but I'll skip that one. Let's try another assignment
example.
void f (int& a, int& b) { b = (a = 3) - 2; }
r1 = 2
r2 = 3
r3 = r2
r3 -= r1
b = r3
a = r2
Perfectly valid translation and gives the expected results. Now
violate 5/4
int x;
f(x, x);
You get unexpected results. The point being that the implementation
generated valid code under the assumption that 5/4 was not violated.
When 5/4 is violated, the problem belongs to the coder not the
implementation.
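The effect can be reproduced without relying on any particular compiler by writing the register listing out by hand. This is only a sketch: f_translated is an invented name, and a real implementation may well schedule the stores differently.

```cpp
// Hand-written version of the r1..r3 listing: all values are computed
// first, then the two stores are issued, b before a.
void f_translated(int& a, int& b) {
    int r1 = 2;
    int r2 = 3;
    int r3 = r2;
    r3 -= r1;
    b = r3;   // store 1 in b
    a = r2;   // store 3 in a
}

// Distinct objects: a ends up 3 and b ends up 1, as the source promises.
int distinct_case() {
    int x = 0, y = 0;
    f_translated(x, y);
    return x * 10 + y;   // 31
}

// Same object passed twice: the later store wins, so x is 3, not the 1
// that reading "b = (a = 3) - 2" left to right would suggest.
int aliased_case() {
    int x = 0;
    f_translated(x, x);
    return x;   // 3
}
```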
> > a = b = c;
> >
> > This one is no problem because all interpretations give the same
> > result.
>
> According to your reasoning for the other situation, they won't.
> You argue that b doesn't have to be assigned before taking its
> address, since the result of (b = c) is b.
Correction: the lvalue of (b = c) is b, the rvalue of (b = c) is the
rvalue of c. Again, play with the rvalues in the cpu and save the
stores until the end. If the implementation wishes to use the lvalue
to rvalue conversion on the lvalue of the expression, it must still
get the correct rvalue for the expression. Note the difference from
the other case in which there was no lvalue to rvalue conversion.
John
Then those compilers are broken. A function that is not
called anywhere in the code does not need to be found at
link-time.
The trick mentioned above is a very common trick (see Scott
Meyers Effective C++), typically used with the assignment
operator and with the copy constructor.
The trick is that you put it/them in the private section;
that prevents any client code from using it/them (i.e.,
that will prevent any client code from doing anything that
uses assignment and/or creating copies of an object).
But that doesn't prevent you from doing it from a member
function of the class; that's why you don't implement
it; because if you don't call it, the linker must not
give errors (the function is not referenced in any of
the object modules), and if you call it (which is what
you were trying to prevent), then you can't get an
error while compiling the module, but at least you
would get an error at link-time.
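A minimal sketch of the idiom, using an invented class named Connection. The two private members are declared and never defined; the type traits from <type_traits> (a later addition to the library than this discussion) are used here only to confirm that client code cannot copy or assign.

```cpp
#include <type_traits>

class Connection {   // hypothetical resource-owning class
public:
    explicit Connection(int fd) : fd_(fd) {}
    int fd() const { return fd_; }
private:
    // Declared but deliberately never defined: client code fails to
    // compile, and an accidental use inside the class fails to link.
    Connection(const Connection&);
    Connection& operator=(const Connection&);
    int fd_;
};

const bool can_copy_construct = std::is_copy_constructible<Connection>::value;
const bool can_copy_assign = std::is_copy_assignable<Connection>::value;

// Ordinary use is unaffected, and nothing here needs the missing definitions.
int make_and_read() {
    Connection c(42);
    return c.fd();
}
```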
If a compiler is not able to deal with that, it is broken!
Carlos
--
Nope. I mean, yes... but no... ;-)
Don't worry, remember that I'm a teacher, so I will make my
biggest effort to explain myself ;-)
What the standard says is that the result of a = b is the value
stored in a after the assignment has taken place. And it then
further specifies that the return is an lvalue.
This last statement from the standard is what leaves the compilers
no choice but to implement the return value as a reference to a
(it's the only way that you could have an lvalue that is the
value assigned to a).
But the thing is that the way I see that statement, sequence
points are irrelevant *in this particular case* (in the
particular case of assignments).
As you said in a previous message, sequence points are about
when should things be stored in memory. John then stated
that my error was that I was trying to impose a temporal
sequence where the standard didn't require one.
But then I checked the definition of the assignment operation,
and that definition *does impose* a temporal sequence. It
says that the result is the value assigned after the
assignment has taken place. And that's why I insist that
you can't simply state that the result of (a = b) is a
reference to a -- no; the phrase `a reference to a' doesn't
have any timing restrictions, whereas the definition of
assignment does impose a timing restriction!
I agree with John's remark about 5/4; but the way I see it,
the definition for the assignment is a particular case that
overrides the general situation described in 5/4.
> (int)7 is not, and
> never can be any kind of lvalue. Until we have a clear agreement on such
> basic terminology there is little purpose in pursuing the issue.
I'm not implying that the value 7 can be an lvalue; what I'm
saying is that the way I read the definition, I see that
whatever the compiler does, it must figure out a way for the
result of the assignment to have an rvalue of 7. Since that
result must also be an lvalue, then the compiler has no
choice: it must be a reference to a; but a reference that
can not be used before the assignment has taken place (that's
the subtle difference I'm trying to emphasize: you say that
the result is a reference to a -- I say that is a reference
to a, but not in an unqualified manner; it's a reference to
a whose use has restrictions in the timing).
> exactly how that dereferencing should be achieved. Elsewhere an issue
> has been raised concerning assignment to a volatile variable.
I don't think that would be much trouble, since the compiler is
not allowed to optimize anything around a volatile variable; of
course, I guess your point is that depending on what one interprets
as the definition of assignment, then certain maneuvers might be
considered valid, non-optimized operations.
I'm not sure if we're going to reach agreement (maybe if a few
extra persons jump in...), but it certainly has been fun (actually,
a sadistic thought crossed my mind today: I was going to bring
the topic to discuss it with my students!!!!! ;-) John, do you
think they would find it confusing? ;-))
Cheers,
> On use of default assignment operators or copy constructors, I agree.
> Looking at a small system I've worked on, that involved about
> 300 classes, only 6 of those classes [two are templates] explicitly
> supply assignment operators and copy constructors. All of those
> classes do some resource management (memory, sockets, etc) for
> which the compiler generated functions will cause confusion.
I've been wondering whether a consensus exists on whether the copy
constructor and assignment operator should be protected in abstract base
classes? Without restating the entire rationale, I seem to recall Scott
Meyers recommending this to prevent clients from trying to copy or assign
abstract bases.
Makes sense to me. What do you think? Should I conclude from your remarks
that you don't follow/recommend this practice?
Greg
I think the place to start is 5/4. The 'except where noted' statement
in 5/4 has limited application; it merely says evaluation order is
unspecified by default. The next sentence speaks on a somewhat
different matter, and 5/4 does not indicate that there may be
exceptions to the rule that a scalar object may have its value
modified at most once between sequence points. Since the standard only
specifies full-expressions (vs all expressions) as being bounded by
sequence points (1.9/16), the expression (a = b) *in the context of*
(a = b) = c does not introduce a sequence point. Since there are
multiple assignments to the lvalue 'a' between the previous and next
sequence points, the statement is undefined.
For reference, 5/4 of standard:
4 Except where noted, the order of evaluation of operands of
individual operators and subexpressions of individual expressions, and
the order in which side effects take place, is unspecified. Between
the previous and next sequence point a scalar object shall have its
stored value modified at most once by the evaluation of an expression.
Furthermore, the prior value shall be accessed only to determine the
value to be stored. The requirements of this paragraph shall be met
for each allowable ordering of the subexpressions of a full
expression; otherwise the behavior is undefined. [Example:
i = v[i++]; // the behavior is unspecified
i = 7, i++, i++; // i becomes 9
i = ++i + 1; // the behavior is unspecified
i = i + 1; // the value of i is incremented
--end example]
--
- Bruce
Well...
Actually this issue is now being discussed on an internal Standards
Committee reflector. The deep problem is the behaviour of volatile
variables.
We know what the intent was and it is now pretty clear that the words
fail to express that intent (full backward compatibility with C +
consistency for udts and built-ins). The question now is how do we fix
it?
Francis Glassborow Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
Think of sequence points as deadlines for side effects. The side
effect of x=7 need not occur until the first subsequent sequence
point. In the interim, x is in a state that is so undefined that any
attempt to read or write to x yields undefined behavior.
> But then I checked the definition of the assignment operation,
> and that definition *does impose* a temporal sequence. It
> says that the result is the value assigned after the
> assignment has taken place.
But that assignment has not "taken place" until the first subsequent
occurrence of a sequence point.
> And that's why I insist that
> you can't simply state that the result of (a = b) is a
> reference to a -- no; the phrase `a reference to a' doesn't
> have any timing restrictions, whereas the definition of
> assignment does impose a timing restriction!
The reference to a has the same timing restrictions as a.
Specifically, between an assignment to a and the first subsequent
occurrence of a sequence point, a is too hot to handle.
Tom Payne
I agree on both counts. Of course, if you declare the assignment
operator private you usually want to do the same with the copy
constructor. Also, even though this is a very well-known trick,
I always feel free to document the declarations with a comment.
This led me to look for a more direct way of expressing the intent
(i.e., disallowing assignment and copy construction), and the
obvious solution seemed to be to move the declarations into a
private base class as shown below. Does anyone have an opinion,
favorable or otherwise, about this approach?
// NoAssign - derive privately from this class to disallow
// copy construction and assignment.
class NoAssign {
protected:
NoAssign() {}
private:
NoAssign(const NoAssign&);
NoAssign& operator=(const NoAssign&);
};
// Example: This class cannot be assigned to or copy constructed.
class MyClass : NoAssign {
// ...
};
> Then those compilers are broken.
Most compilers are broken, but I don't think that is your point.
> A function that is not
> called anywhere in the code does not need to be found at
> link-time.
For most implementations this is only true for non-virtual functions.
> I'm not sure if we're going to reach agreement (maybe if a few
> extra persons jump in...), but it certainly has been fun
You might get some more opinions in the x = y = z; Undefined? thread
in csc++ Jan 00. You will find me playing the same role there that
you are playing here.
> (actually,
> a sadistic thought crossed my mind today: I was going to bring
> the topic to discuss it with my students!!!!! ;-) John, do you
> think they would find it confusing? ;-))
Only if they try to make sense out of undefined behavior. <G>
It might help to look at the standard on prefix ++/--. There is
similar wording but it says the new value, it is an lvalue. It
seems like the lvalue part was just added to the C verbiage. In
the case of assignment, the one statement covers all of the @=
operators as well as =. It could not say the value of the rhs.
The making sense part requires accepting that expressions are not
a group of functions which return something. Expressions have an
rvalue (computed non-memory values) and sometimes also an lvalue.
It is impossible to write code which uses both of them. The lvalue
of the expression can always be computed at compile time. We are
talking about expressions using the builtin operators, there are
no functions. The side effects of the operators may be delayed,
but the rvalue must be calculated correctly, unless undefined
behavior enters the code.
John
Oops... Certainly, I didn't have in mind virtual functions
when I wrote that (nor did the guy to whom I replied, I
think). Since we were talking about copy constructor (can't
be virtual) and overloaded assignment operator (not a good
idea to make it virtual), then of course, I overlooked the
possibility of virtual functions.
Sorry...
Carlos
--
<snipped>
Does anyone have an opinion,
> favorable or otherwise, about this approach?
>
> // NoAssign - derive privately from this class to disallow
> // copy construction and assignment.
> class NoAssign {
> protected:
> NoAssign() {}
> private:
> NoAssign(const NoAssign&);
> NoAssign& operator=(const NoAssign&);
> };
>
> // Example: This class cannot be assigned to or copy constructed.
> class MyClass : NoAssign {
> // ...
> };
>
And in the context of the original post that is what we were talking
about. ctors are necessarily non-virtual and copy assignments almost
certainly should not be virtual.
However thanks for dotting the i's and crossing the t's
Francis Glassborow Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
Hum, I do have a question then. How do these compilers (that require an
implementation for virtual functions) handle the case of pure virtual
functions? These can have an implementation or not, at the liberty of
the programmer. How can the linker know whether it requires the function
to exist or not?
As an example I submit the following test case. It is designed so that:
- each function doesn't know about the others,
- the bar() function calls foo() virtually,
- the bar() function is called when the object is still only a Base,
- the Base::foo() implementation is not referenced directly in any
function.
So what is the correct behavior of the program? MSVC 6.0 does *not* call
Base::foo() but rather does a run-time error "pure virtual function
call". Is this normal? Are implementations of pure virtual functions
only usable as direct call from sub-classes? Or should they be set
temporarily in the virtual table during construction?
-- file named h.h: --
struct Base
{
Base();
virtual void foo() = 0;
};
struct Der : Base
{
void foo() { };
};
-- file named f1.cpp: --
#include "h.h"
#include <stdio.h>
void Base::foo() { puts("Base::foo()"); }
-- file named f2.cpp: --
#include "h.h"
void bar(Base & base);
Base::Base() { bar(*this); }
-- file named f3.cpp: --
#include "h.h"
void bar(Base & base) { base.foo(); }
-- file m.cpp --
#include "h.h"
int main(int, char **)
{
Der d; // This prints "Base::foo()".
return 0;
}
The whole point about pure virtuals is exactly that you have told the
compiler that there will be no dynamic version (so it can, for example,
enter 0 into a vtable if it is handling virtuals that way) and you can
only call the static version explicitly. Unless that is called, the
linker has nothing to do.
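A short sketch of that explicit call, with foo() changed to return a string so the result can be checked. This is an adaptation for illustration, not the poster's test case. The qualified name Base::foo() bypasses virtual dispatch, which is the only way the definition of a pure virtual function is reached here.

```cpp
#include <string>

struct Base {
    virtual ~Base() {}
    virtual std::string foo() = 0;   // pure virtual...
};

// ...but it may still be given a definition.
std::string Base::foo() { return "Base::foo()"; }

struct Der : Base {
    // The qualified call Base::foo() is resolved statically.
    std::string foo() { return "Der::foo() -> " + Base::foo(); }
};

// Virtual dispatch through a Base reference reaches Der::foo(), which in
// turn calls the Base definition explicitly.
std::string call_through_base() {
    Der d;
    Base& b = d;
    return b.foo();
}
```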
Francis Glassborow Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation
> A default assignment operator returns a mutable reference to the left hand
> side, making it possible to write what -- to me at least -- are strange
> assignment expressions like:
>
> class MyClass {...};
> MyClass a, b, c;
> ...
> (a = b) = c;
>
> which ultimately leaves 'a' with the value held by 'c'. In fact, my
> compiler
> allows this kind of assignment with built-in types as well. But since this
> construction seems so un-natural, I usually prevent it for user-defined
> types by defining operator= to return a reference to a const object
> instead.
>
> I don't understand why I would ever want to return a mutable lvalue from an
> assignment expression/operator. Any thoughts?
Unless my boss requires otherwise (he doesn't) I return void.
> (a = b) = c is ill-formed in C and well-formed in
> C++ with undefined behavior for built-ins and well defined behavior
> for user types.
What about the behavior of a=(b=c) in C++? I presume it's defined
for user-defined types but not for built-ins, and I suspect that this
inconsistency and incompatibility with C is simply an oversight.
Tom Payne
a = (b = c) is never a problem (not in C, not in C++). Why wouldn't
it be defined for built-in types? (the argument with (a = b) = c
is that there are two assignments to `a' without a sequence point
separating them -- I have to be honest and confess that I'm still
having a hard time accepting that argument... But let's say that
by majority, I'm sort of getting convinced ;-)).
In a = (b = c) there is one assignment to `b' and one assignment to
`a' within the expression -- that doesn't seem to call for undefined
behaviour.
Carlos
--
> What about the behavior of a=(b=c) in C++? I presume it's defined
> for user-defined types but not for built-ins, and I suspect that this
> inconsistency and incompatibility with C is simply an oversight.
Why is this incompatible with C? The a= assignment needs an r-value,
which is provided by the sub-expression (b=c). There are no multiple
assignments to the same variable. The expression is interpreted no
differently from a=b=c.
The problem with (a=b)=c is that there are two assignments to a
without an intervening sequence point -- unless we are talking about
UDTs, where operator= introduces a sequence point because of the
function call.
--
- Bruce
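
[Editorial sketch, not part of the original post.] For a user-defined type, (a = b) = c is two nested function calls, which is the case described above as well defined. The type Box and the helper assign_twice are hypothetical names for illustration.

```cpp
#include <cassert>

// Hypothetical user-defined type; Box is not a name from the thread.
struct Box {
    int value;

    explicit Box(int v) : value(v) {}

    // Conventional signature: return a non-const reference to *this.
    Box& operator=(const Box& other) {
        value = other.value;
        return *this;
    }
};

// (a = b) = c means a.operator=(b).operator=(c). The inner call
// finishes before the outer call starts, so a ends with c's value.
inline int assign_twice(Box& a, const Box& b, const Box& c) {
    (a = b) = c;
    assert(a.value == c.value);
    return a.value;
}
```

Changing the return type to const Box& is the variant the original poster describes: the outer assignment would then fail to compile because operator= is a non-const member.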
> Why is this incompatible with C? The a= assignment needs an r-value,
> which is provided by the sub-expression (b=c).
My mistake. I had noted that (b=c) returns b's lvalue, which is then
accessed to determine its rvalue, and somehow got the notion that this
violated the prohibition on accessing "the prior value except to
determine the value to be stored." But, of course, it's not the
"prior value" that's being accessed here.
Tom Payne
|> The making sense part requires accepting that expressions are not
|> a group of functions which return something. Expressions have an
|> rvalue (computed non-memory values) and sometimes also an lvalue.
That's what the C standard says. It doesn't mention rvalue, since *all*
expressions are rvalues. Some are also lvalues.
The C++ standard chose to go a different route. I don't know why.
Perhaps someone remembered all of the discussions the C standard caused,
and decided that something else would be better.
|> It is impossible to write code which uses both of them. The lvalue
|> of the expression can always be computed at compile time.
The lvalue of *p can be computed at compile time?
|> We are
|> talking about expressions using the builtin operators, there are
|> no functions. The side effects of the operators may be delayed,
|> but the rvalue must be calculated correctly, unless undefined
|> behavior enters the code.
Right. In fact, the side effects of the operators may never take place,
unless they are observable behavior.
--
James Kanze mailto:ka...@gabi-soft.de
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
Ziegelhüttenweg 17a, 60598 Frankfurt, Germany Tel. +49(069)63198627
|> Carlos Moreno <mor...@cyberx.com> wrote:
|> [...]
|> > As you [Francis] said in a previous message, sequence points are about
|> > when should things be stored in memory. John then stated
|> > that my error was that I was trying to impose a temporal
|> > sequence where the standard didn't require one.
|> Think of sequence points as deadlines for side effects.
Think of sequence points as a part of the specifications of the
requirements on the program (not the implementation). If a program
doesn't meet its end of the contract (by respecting the restrictions
regarding the sequence points), the implementation is freed from its
obligations.
|> In article <3987516E...@cyberx.com>, Carlos Moreno
|> <mor...@cyberx.com> writes
|> >I'm not sure if we're going to reach agreement (maybe if a few
|> >extra persons jump in...), but it certainly has been fun (actually,
|> >a sadistic thought crossed my mind today: I was going to bring
|> >the topic to discuss it with my students!!!!! ;-) John, do you
|> >think they would find it confusing? ;-))
|> Well...
|> Actually this issue is now being discussed on an internal Standards
|> Committee reflector. The deep problem is the behaviour of volatile
|> variables.
The question concerning volatile variables was discussed with C.
Obviously, the fact that the expression is an rvalue in C, and that
there is no lvalue-to-rvalue conversion, probably affects the semantics
with regards to volatile variables -- one could maintain that the lvalue
to rvalue conversion constitutes an access. (The definition of what
constitutes an access is implementation defined, but it is quite clear
that an lvalue to rvalue conversion should result in an access in every
other context.)
|> We know what the intent was and it is now pretty clear that the words
|> fail to express that intent (full backward compatibility with C +
|> consistency for udts and built-ins). The question now is how do we fix
|> it?
Note, however, that the only problem is with regards to volatile
variables. For the rest, section 1.9 covers the question quite well.
And there is no question that (a=b)=c is illegal; volatility has nothing
to do with that.
|> What the standard says is that the result of a = b is the value
|> stored in a after the assignment has taken place. And it then
|> further specifies that the return is an lvalue.
1.9 makes it quite clear that these descriptions describe the behavior
of a parameterized non-deterministic abstract machine, and that all that
is required of an implementation is that the observable behavior be
identical to one possible behavior of that abstract machine, *for* *a*
*legal* *C++* *program*. The whole point of sequence points is to limit
what a legal C++ program can do, independently of the abstract machine,
in other words, to define under what conditions the implementation is
required to maintain the observable effects. Thus, the requirements on
a=b is that the observable semantics of the program be the same as if
the compiler wrote the value to a, and used the value written as the
return value of the expression. For any legal program, the compiler can
optimize however it wants, as long as the observable behavior
conforms.
The key words, of course, are for any legal program. If the program
modifies the same value twice without an intervening sequence point, in
this expression, or in any other, the program contains undefined
behavior, and the standard places *no* requirements on the
implementation.
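
[Editorial sketch, not part of the original post.] Under the sequence-point rules being discussed, the usual way to stay inside the contract is to split the two writes into separate full expressions. The helper name assign_in_two_steps is hypothetical. Later revisions of the C++ standard reworded the sequencing rules, so the status of the one-line form may depend on the edition being targeted.

```cpp
#include <cassert>

// Under the sequence-point rules discussed in this thread, the single
// expression
//     (a = b) = c;
// modifies the int a twice with no intervening sequence point, so it
// is deliberately not executed in this sketch. Writing two statements
// puts a sequence point (the end of each full expression) between
// the two writes.
inline int assign_in_two_steps(int& a, int b, int c) {
    a = b;   // first write to a; sequence point at the end of the statement
    a = c;   // second write to a
    assert(a == c);
    return a;
}
```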
|> This last statement from the standard is what leaves the compilers
|> no choice but to implement the return value as a reference to a
|> (it's the only way that you could have an lvalue that is the value
|> assigned to a).
This last statement defines the behavior of the abstract machine, see
5.1.2.3, particularly paragraph 3. The only requirements are that the
observable behavior be "as if" the compiler had implemented it this way,
and these requirements only apply to legal programs. Modifying the
value twice without an intervening sequence point renders the program
illegal, and removes the requirements (and all other requirements).
|> But the thing is that the way I see that statement, sequence
|> points are irrelevant *in this particular case* (in the
|> particular case of assignments).
Sequence points are never irrelevant. The standard says, very clearly,
that modifying an object twice without an intervening sequence point is
undefined behavior. Undefined behavior means just that -- you can
ignore everything else in the standard (and even common sense).
|> As you said in a previous message, sequence points are about
|> when should things be stored in memory.
Sequence points have nothing to do with when things should be stored in
memory. When (and if) things are stored in memory is entirely at the
discretion of the compiler. Sequence points have one, and only one
effect: they define when a program is legal, and the compiler must
maintain the observable behavior, and when it isn't, and all
requirements are removed.
The point has been widely discussed, argued and clarified in C. In this
particular case, C++ adopted the C wording more or less as is, both with
regards to sequence points and with the meaning of the specified
semantics of the operators.
|> John then stated
|> that my error was that I was trying to impose a temporal
|> sequence where the standard didn't require one.
The only temporal sequence the standard imposes is in the observable
behavior of a legal program. At any given sequence point, all writes to
volatile variables before the sequence point must occur before any
writes to a volatile variable after the sequence point. If there is no
intervening sequence point, there is no defined order. See 1.9; a full
understanding of 1.9 is important in order to understand the rest of the
standard.
|> But then I checked the definition of the assignment operation,
|> and that definition *does impose* a temporal sequence.
It may say that the abstract machine carries out the operations in a
certain order. Although I'm not totally convinced even of that. What
the abstract machine does, however, has nothing to do with the legality
of the program, and the requirements only hold for legal programs.
[...]
|> > exactly how that dereferencing should be achieved. Elsewhere an issue
|> > has been raised concerning assignment to a volatile variable.
|> I don't think that would be much trouble, since the compiler is
|> not allowed to optimize anything around a volatile variable; of
|> course, I guess your point is that depending on what one interprets
|> as the definition of assignment, then certain maneuvers might be
|> considered valid, non-optimized operations.
Accessing or modifying an object through an expression with a volatile
type is an observable behavior. The compiler is required to respect
order here. But only in a legal program.
|> I'm not sure if we're going to reach agreement (maybe if a few extra
|> persons jump in...), but it certainly has been fun (actually, a
|> sadistic thought crossed my mind today: I was going to bring the
|> topic to discuss it with my students!!!!! ;-) John, do you think
|> they would find it confusing? ;-))
Very. It's not the sort of thing for students. Or even practicing
programmers, really. The entire logic is very legalistic. In a way
that makes it incredibly precise, but very far from what is actually
going on. The goal, of course, is NOT to specify what an implementation
should do, but to define a contract. The requirements on the programmer
are the legality of the program: no multiple modification of a variable
without an intervening sequence point, for example. The requirements on
the implementation are that the observable behavior conform to one
possible version of the abstract machine. And as is often the case with
contracts, if you don't meet your obligations, the other party (the
implementation) is freed from its obligations.
|> In article <39860E5E...@cyberx.com>, Carlos Moreno
|> <mor...@cyberx.com> writes
|> >NO!!!! The lvalue of (a = b) is not a. The lvalue of (a = b)
|> >is the value stored in a *after* the assignment has taken place,
|> >and it has to be assignable. So, an implementation that wants
|> >to consider that the result of (a = b) is a reference to a,
|> >must wait until a has been assigned before using that result;
|> >otherwise, what that implementation is using *is not* the
|> >result of (a = b) as specified by the standard.
|> I do not understand this. the 'value of 'a=b' is a reference to a, that
|> is the only conceivable interpretation in context. (int)7 is not, and
|> never can be any kind of lvalue. Until we have a clear agreement on such
|> basic terminology there is little purpose in pursuing the issue.
The standard could be clearer, but I think that the intent is that the
expression (a=b) have the value that was, or will be, stored in a.
At any rate, the discussion is irrelevant for the legality of (a=b)=c.
The description of operator = provided a definition of the semantics of
assignment in a legal program. Since this expression modifies a twice
without an intervening sequence point, it has no defined semantics.
|> lvalues can be dereferenced to produce rvalue's. The point in
|> debate is exactly how that dereferencing should be
|> achieved.
I see nothing in the standard which says that the value must be
dereferenced. All that counts is that the visible effects of the program
be the same as if it were dereferenced.
|> Elsewhere an issue has been raised concerning assignment
|> to a volatile variable. Now clarifying that might help resolve our
|> understanding of the fundamental meaning of the C++ rules re
|> assignment.
The problem is that the exact semantics of volatile are implementation
defined.
Note that the problem is not new. The question was raised in C:
volatile int x ;
x = 0 ;
Everyone agrees that there must be a write to x. A (too) literal
interpretation of volatile suggests that there must also be a read of x,
since the expression x = 0 has a value.
I don't know if there was ever a formal demand for interpretation of
this point, but the consensus in comp.std.c at the time was that
whatever happened was implementation defined.
Agreed, but there is now an issue with a = (b=c) that is specifically a
C++ one (and very much language lawyer meat)
Francis Glassborow Association of C & C++ Users
Are we referring to C (where that is manifestly not true) or C++ where I
think it is also untrue as the return is a reference. IOWs assignment in
C returns an lvalue and that same value is stored in x, while in C++ the
returned value is an rvalue which is independent of the storage process.
What am I missing here?
In C++, the expression lhs=rhs has the side effect of storing the
rvalue rhs into the lvalue lhs and returning the lvalue lhs (i.e., a
reference to lhs) as its result. In C, that expression has the same
side effect but returns the rvalue rhs as its result. A C++ program can
obtain that rvalue through lvalue-to-rvalue conversion, which may or
may not involve read access to lhs (depending on implementation). In
cases where lhs is volatile, however, that additional read access is
part of the program's observable behavior and is different from the
behavior in C.
Tom Payne
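
[Editorial sketch, not part of the original post.] The claim that a C++ assignment expression yields an lvalue designating its left operand can be checked at compile time. This sketch assumes a C++11 or later compiler (decltype and static_assert postdate this thread), and the helper name result_aliases_lhs is hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// decltype applied to an lvalue expression that is not a plain name
// yields a reference type. The operands are unevaluated, so nothing
// is actually assigned here.
static_assert(
    std::is_same<decltype(std::declval<int&>() = 1), int&>::value,
    "assigning to an int yields int&");
static_assert(
    std::is_same<decltype(std::declval<volatile int&>() = 1),
                 volatile int&>::value,
    "assigning to a volatile int yields volatile int&");

// Binds a reference to the result of the assignment and reports
// whether it designates the left operand itself.
inline bool result_aliases_lhs(int& lhs, int rhs) {
    int& result = (lhs = rhs);   // legal in C++: the result is an lvalue
    return &result == &lhs;
}
```

Any read of the value through that lvalue is the lvalue-to-rvalue conversion described above; for a volatile left operand that read would be part of the observable behavior.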
That I made one of my irritating errors of inverting lvalue and rvalue
:(
Francis Glassborow Association of C & C++ Users