Return Value Optimization


Jeet Sukumaran

Nov 4, 1999
From Meyers' "More Effective C++," return value optimization
is said to be likely to take place when a temporary object is returned:

Example #1:
SomeType foo(void) {
// stuff ...
return SomeType(/* stuff */);
}

Lippman & Lajoie's "C++ Primer" indicates another possible RVO scenario:

Example #2:
SomeType foo(void) {
SomeType a;
// stuff ..
return a;
}

Given a statement like:

s = foo();

where s is of SomeType. In Example #1, the temporary is constructed
directly in s's memory-space, as is the object a in Example#2 (if RVO
takes place).

OK -- here are my questions:

(1) Does RVO take place if the memory-space "to be filled" is not a
named object, but is itself another temporary, as in some of the below
code fragments:

if (foo() == s) { }
std::cout << foo();
while ( foo() ) { }

(2) Is there a need to choose between implicit return value
conversion and RVO-facilitation, or is one a clear-cut winner always?

For example:

SomeType foobar(void) {
AnotherType a;

// do something with a

// which is better?
return SomeType(a); // #1: emphasize RVO
return a; // #2: implicit return value conversion
}

In the above case, "a" can be implicitly converted to SomeType. I am
not sure in which circumstances to use #1, which facilitates RVO, or
#2, which allows the standard implicit return value conversion to take
place. Furthermore, would the decision be made differently if foobar
was commonly expected to generate temporaries (as in the example code
fragments under (1) above_, and only rarely have its return value
assigned to a named object?

As a whole, I'm curious about this return value optimization thing.
I've found few references on it. The Meyers and Lippman & Lajoie ones
are gems. I could not locate any mention of it in the Standard (text
searches for "optimization") or in "C++PL3" (Eyeball Mk1 searches). So,
any thoughts would be welcome.

Thanks!

-- jeet


Sent via Deja.com http://www.deja.com/
Before you buy.

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]


John Potter

Nov 5, 1999
On 4 Nov 1999 18:02:42 -0500, Jeet Sukumaran
<jeet_su...@my-deja.com> wrote:

: From Meyer's "More Effective C++," return value optimization
: is said to likely to take place when a temporary object is returned:
:
: Example #1:
: SomeType foo(void) {
: // stuff ...
: return SomeType(/* stuff */);
: }

In this case, there is not much to optimize. Yes, stuff can be used
to initialize the return value. If you are also seeing the form
return SomeType(lhs) += rhs;
in an operator+, that is not likely to be optimized. Eliding the copy
depends upon the compiler knowing that operator+= returns *this, which
is not required.

: Lippman & Lajoie's "C++ Primer" indicate another possible RVO scenario:
:
: Example #2:
: SomeType foo(void) {
: SomeType a;
: // stuff ..
: return a;
: }

This is explicitly allowed by the standard. In particular
SomeType tmp(lhs);
tmp += rhs;
return tmp;
in operator+ can use RVO by constructing tmp in the return area.

: Given a statement like:
:
: s = foo();
:
: where s is of SomeType. In Example #1, the temporary is constructed
: directly in s's memory-space, as is the object a in Example#2 (if RVO
: takes place).

NO. There are two things going on here. The calling function and
the called function. RVO takes place in the called function. Your
example above is an assignment. The return value must not be
constructed in s because s has previously been constructed. The
return value must be constructed in some temporary in the caller's
space and passed to operator= for the assignment.

Return by value of a UDT is usually implemented by the caller
providing uninitialized space and passing a pointer to that space
as an invisible parameter. The called function then constructs
the return value in that space. If you change the example to
SomeType s(foo());
the caller may pass the space for s to the function and construction
may take place in s.
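
The assignment-versus-initialization point can be checked with a small instrumented class. What follows is only a sketch: Counted, its counters, make(), assign_case() and init_case() are hypothetical names made up for illustration. The number of copy constructions varies with the compiler and language mode, so only the assignment count is pinned down.

```cpp
#include <cassert>

// Hypothetical instrumented type: counts copy constructions and assignments.
struct Counted {
    static int copies;       // copy-constructor calls
    static int assignments;  // copy-assignment calls
    int value;

    explicit Counted(int v) : value(v) {}
    Counted(const Counted& other) : value(other.value) { ++copies; }
    Counted& operator=(const Counted& other) {
        value = other.value;
        ++assignments;
        return *this;
    }
};
int Counted::copies = 0;
int Counted::assignments = 0;

// Returns an unnamed temporary, as in Example #1.
Counted make() { return Counted(42); }

// Assignment: s already exists, so operator= has to run on it.
int assign_case() {
    Counted s(0);
    Counted::copies = 0;
    Counted::assignments = 0;
    s = make();
    assert(s.value == 42);
    return Counted::assignments;
}

// Initialization: s is being created, so the caller may hand its storage
// to make() and no assignment is involved at all.
int init_case() {
    Counted::copies = 0;
    Counted::assignments = 0;
    Counted s(make());
    assert(s.value == 42);
    return Counted::assignments;
}
```

Calling assign_case() should report one assignment and init_case() none. Inspecting Counted::copies right after each call shows how many copies a particular compiler actually kept.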

: OK -- here are my questions:
:
: (1) Does RVO take place if the memory-space "to be filled" is not a
: named object, but is itself another temporary, as in some of the below
: code fragments:
:
: if (foo() == s) { }
: std::cout << foo()
: while ( foo() ) { }

That should be clear from the above. The caller has nothing to do
with RVO. The temporary will be constructed by the called function.
Whether there is any optimization in the called function has nothing
to do with the caller.

: (2) Is there a need to choose between implicit return value
: conversion and RVO-faciliation, or is one a clear-cut winner always?
:
: For example:
:
: SomeType foobar(void) {
: AnotherType a;
:
: // do something with a
:
: // which is better?
: return SomeType(a); // #1: emphasize RVO
: return a; // #2: implicit return value conversion
: }

The return requires a conversion. There is no RVO possible. In
both cases, the return SomeType will be constructed from a. The
first says construct a SomeType and copy it to the return value.
The second says return a by converting it to SomeType and copying
it to the return value. Both cases involve the same copy to the return
value which may be easily removed by almost all compilers.

: In the above case, "a" can be implicitly converted to SomeType. I am
: not sure in which circumstances to use #1, which facilitates RVO, or
: #2, which allows the standard implicit return value conversion to take
: place. Furthermore, would the decision be made differently if foobar
: was commonly expected to generate temporaries (as in the example code
: fragments under (1) above_, and only rarely have its return value
: assigned to a named object?

It likely makes no difference. If the called function is inline, lots
of things can be done under the "as if" rule. There is nothing that you
can do to help that.

: As a whole, I'm curious about this return value optimization thing.
: I've found few references on it. The Meyer's and Lippman&Lajoie ones
: are gems. I could not locate any mention of it in the Standard (text
: searches for "optimization") or in "C++PL3" (Eyeball Mk1 searches). So,
: any thoughts would be welcome.

12.8/15 is the paragraph which covers this.

The use of a local variable which is constructed at the beginning
of the function in the return value area and then manipulated in
the function is what most want. It saves a copy at the return.
Cfront implemented this. I have heard that the HP compiler
has it available as an option which is normally disabled. I know
of no other compiler that supports it.

Many hope that will change. The HP view shows that there are also
many who do not want it messing up their intentions. Time will
tell.

You can easily test your compiler by writing a simple class with
a copy ctor which produces output. This optimization is an issue
for many because it can be detected. That violates the "as if"
rule, but the standard makes an explicit exemption for it.
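
A test class of that kind might look like the sketch below. Probe, make_named(), make_unnamed() and count_copies() are hypothetical names, not taken from any book or compiler test suite. The copy ctor both prints and counts, so the result can be read off the terminal or checked in code.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical probe type: the copy ctor announces itself and counts calls.
struct Probe {
    static int copies;
    int id;

    explicit Probe(int i) : id(i) { std::printf("ctor %d\n", id); }
    Probe(const Probe& other) : id(other.id) {
        ++copies;
        std::printf("copy ctor %d\n", id);
    }
};
int Probe::copies = 0;

// Named local returned by value: candidate for the named form of RVO.
Probe make_named() {
    Probe p(1);
    p.id = 7;  // manipulate the local before returning it
    return p;
}

// Unnamed temporary returned by value.
Probe make_unnamed() { return Probe(7); }

// Initializes a fresh object from fn() and reports how many times
// the copy ctor actually ran along the way.
int count_copies(Probe (*fn)()) {
    Probe::copies = 0;
    Probe result(fn());
    assert(result.id == 7);
    return Probe::copies;
}
```

A compiler that elides everything reports 0 for both functions. One that elides nothing reports 2: one copy into the return value and one into result. Comparing count_copies(make_named) with count_copies(make_unnamed) shows whether the named case is treated differently.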

John

J.Barfurth

Nov 5, 1999

Jeet Sukumaran <jeet_su...@my-deja.com> wrote in news message
7vrrp6$ihm$1...@nnrp1.deja.com...

> >From Meyer's "More Effective C++," return value optimization
> is said to likely to take place when a temporary object is returned:
>
> Example #1:
> SomeType foo(void) {
> // stuff ...
> return SomeType(/* stuff */);
> }
>
> Lippman & Lajoie's "C++ Primer" indicate another possible RVO scenario:
>
> Example #2:
> SomeType foo(void) {
> SomeType a;
> // stuff ..
> return a;
> }
>
> Given a statement like:
>
> s = foo();
This too creates a temporary: The return value optimisation can elide copy
construction but not copy assignment. See below.

> where s is of SomeType. In Example #1, the temporary is constructed
> directly in s's memory-space, as is the object a in Example#2 (if RVO
> takes place).

It won't. s is constructed already. The compiler should not just construct
something else in its 'memory-space'. To allow the full optimization (no
copying done) use:
SomeType new_s = foo();

> OK -- here are my questions:
>
> (1) Does RVO take place if the memory-space "to be filled" is not a
> named object, but is itself another temporary, as in some of the below
> code fragments:
>
> if (foo() == s) { }
> std::cout << foo()
> while ( foo() ) { }

It could. If the optimization is done at all, it probably will take place in
this case. It seems to me, that it would be even easier to do this
optimization for a temporary (depending on the way return-by-value is
implemented by the compiler for objects of class type). If e.g. the returned
value is just left on the stack (which means the RVO is impossible for
non-block-scoped (named) objects), that value can still be used as a
temporary.

> (2) Is there a need to choose between implicit return value
> conversion and RVO-faciliation, or is one a clear-cut winner always?
>
> For example:
>
> SomeType foobar(void) {
> AnotherType a;
>
> // do something with a
>
> // which is better?
> return SomeType(a); // #1: emphasize RVO
> return a; // #2: implicit return value conversion
> }

I wouldn't expect any difference in generated code between #1 and #2 if RVO
is done. Even if it usually isn't or cannot be done in cases like Example#1
above, line #1 should be easily optimized to do the same thing as line #2.
<BTW>Avoid reusing names: 'Example' #1 is still in scope when you declare
line #1 ;-) </BTW>

Assuming that the appropriate copy c'tor and implicit conversion are
available to allow both #1 and #2 to compile, a compiler should do the same
in both cases IMHO. If line #1 causes a copy constructor to be called even
if the return value is discarded or used as a temporary, the compiler either
would seem very simplistic to me or goes out of its way to ensure copying
temporaries around as much as possible.
Assuming again that both versions are legal, I'd select the one that is more
readable: If the implicit conversion is obvious (else you better make it
explicit anyways), and if the context is such that a casual reader will
still recall that the function actually returns a SomeType and not an
AnotherType, I'd select version #2. Else I prefer #1 - it makes the returned
type obvious. If the conversion isn't natural, I'd make it explicit if I can
and/or use:
return static_cast<SomeType>(a); // #3
which still shouldn't make a difference if RVO is done.
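
For reference, the three spellings side by side, as a minimal sketch. It assumes SomeType has a non-explicit constructor taking AnotherType; the int member and the function names form1 to form3 and via_all_forms are hypothetical and exist only so the results can be compared.

```cpp
#include <cassert>

struct AnotherType {
    int n;
};

// Assumption: the conversion is a non-explicit converting constructor.
struct SomeType {
    int n;
    SomeType(const AnotherType& a) : n(a.n) {}
};

SomeType form1(const AnotherType& a) { return SomeType(a); }               // #1
SomeType form2(const AnotherType& a) { return a; }                         // #2
SomeType form3(const AnotherType& a) { return static_cast<SomeType>(a); }  // #3

// Runs all three forms on the same input and checks that they agree.
int via_all_forms(int n) {
    AnotherType a;
    a.n = n;
    int r1 = form1(a).n;
    int r2 = form2(a).n;
    int r3 = form3(a).n;
    assert(r1 == r2 && r2 == r3);
    return r1;
}
```

All three build the returned SomeType from a through the same constructor, so the choice between them comes down to readability, as argued above.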

> In the above case, "a" can be implicitly converted to SomeType. I am
> not sure in which circumstances to use #1, which facilitates RVO, or
> #2, which allows the standard implicit return value conversion to take
> place. Furthermore, would the decision be made differently if foobar
> was commonly expected to generate temporaries (as in the example code
> fragments under (1) above_, and only rarely have its return value
> assigned to a named object?

Probably not: the client code won't notice the difference. If RVO can
construct the return value into any named object it can do so for a
temporary as well. If it can't, the client code will get a temporary always
(which it might have to copy then).

> As a whole, I'm curious about this return value optimization thing.
> I've found few references on it. The Meyer's and Lippman&Lajoie ones
> are gems. I could not locate any mention of it in the Standard (text
> searches for "optimization") or in "C++PL3" (Eyeball Mk1 searches). So,
> any thoughts would be welcome.

Search for "elide". iirc the standard talks about eliding copy constructors.
That can be done for other cases of copy initialization as well.

> Thanks!
>
> -- jeet

HTH
-- Jörg

Gabriel Dos_Reis

Nov 6, 1999
Jeet Sukumaran <jeet_su...@my-deja.com> writes:

| >From Meyer's "More Effective C++," return value optimization
| is said to likely to take place when a temporary object is returned:
|
| Example #1:
| SomeType foo(void) {
| // stuff ...
| return SomeType(/* stuff */);
| }

Meyers' coverage of what he called "return value optimization" is
incomplete. He mostly dealt with what should be termed "unnamed return
value optimization", to emphasize the fact that the return-expression
refers to an unnamed object.

| Lippman & Lajoie's "C++ Primer" indicate another possible RVO scenario:
|
| Example #2:
| SomeType foo(void) {
| SomeType a;
| // stuff ..
| return a;
| }
|
| Given a statement like:
|
| s = foo();
|
| where s is of SomeType. In Example #1, the temporary is constructed
| directly in s's memory-space, as is the object a in Example#2 (if RVO
| takes place).

Yes. Actually that is called "named return value optimization" to
emphasize the point that the return-expression is naming a local object.
You'll find an in-depth coverage in the ARM and "Inside the C++ Object
Model" (by Lippman).

| OK -- here are my questions:
|
| (1) Does RVO take place if the memory-space "to be filled" is not a
| named object, but is itself another temporary, as in some of the below
| code fragments:
|
| if (foo() == s) { }
| std::cout << foo()
| while ( foo() ) { }

Yes. See the first sentence of 12.8/15 in ISO C++ 14882.

| (2) Is there a need to choose between implicit return value
| conversion and RVO-faciliation, or is one a clear-cut winner always?
|
| For example:
|
| SomeType foobar(void) {
| AnotherType a;
|
| // do something with a
|
| // which is better?
| return SomeType(a); // #1: emphasize RVO
| return a; // #2: implicit return value conversion
| }
|
| In the above case, "a" can be implicitly converted to SomeType. I am
| not sure in which circumstances to use #1, which facilitates RVO, or
| #2, which allows the standard implicit return value conversion to take
| place.

Actually optimizations are 'quality of implementation' issues. If
your favorite compiler is smart enough to apply both the unnamed and
the named return value optimization, then it probably doesn't matter
which form you use. However, it has been my observation that most
compilers can more easily optimize away unnamed objects than named ones.
Form #1 is an instance of what Lippman called a "computational
constructor" in "Inside the C++ Object Model" and has the virtue of
turning on the unnamed return value optimization; it is said to have
been noted first by Jerry Schwartz.

| ... Furthermore, would the decision be made differently if foobar
| was commonly expected to generate temporaries (as in the example code
| fragments under (1) above_, and only rarely have its return value
| assigned to a named object?

If the value returned by foo() doesn't depend on the (number of) calls,
it is probably a good idea to compute it once.

| As a whole, I'm curious about this return value optimization thing.
| I've found few references on it. The Meyer's and Lippman&Lajoie ones
| are gems.

I think that the ARM and "Inside the C++ Object Model" are authoritative
treatises on the subject.

| ... I could not locate any mention of it in the Standard (text
| searches for "optimization")

Look at the paragraph 12.8/17.

Hope this helps


--
Gabriel Dos Reis, dos...@cmla.ens-cachan.fr

Jeet Sukumaran

Nov 8, 1999
"elide"

That's it! Searches for *that* were sooooo much more fruitful, not
just in the standard (12.8 etc. as you were kind enough to point out),
but on deja etc..

Those who picked up on my error in assuming RVO when the return value
is assigned to an already-constructed object --- kudos, and touche.

Gabriel Dos Reis: I've heard that Lippman's "Inside the Object Model"
covers RVO in depth, but the ACCU review of the book says that it is
severely crippled by logical errors and/or misconceptions? Something to
do with an assignment operator being a pre-requisite for RVO? I'll
have a look at the ARM ... actually, I'd have a look at the "Inside the
Object Model" too if it was easily available, but it isn't, and the ARM
is.

J. Potter: Searching for elide on the deja led to some ideas for
testing when and where copy construction takes place -- I'll be giving
it a spin. I'm curious, though: you say --

: return SomeType(a);
: return a;

> The return requires a conversion. There is
> no RVO possible. In both cases, the return
> SomeType will be constructed from a. The
> first says construct a SomeType and copy it to
> the return value. The second says return a by
> converting it to SomeType and copying it to
> the return value. Both cases involve the same
> copy to the return value which may be easily
> removed by almost all compilers.

So conversion on return precludes RVO? Am I reading you correctly?
"The return requires a conversion. There is no RVO possible. In
both cases, the return SomeType will be constructed from a." ... does
this, i.e. conversion on return, preclude RVO altogether?

J. Barfurth: I should have used scope resolution operators,
e.g. "(2)::#1" vs. "(1)::#1". I too prefer my (2)::#1 statement,
i.e. "return Type(x)" ... since it makes the object's construction
explicitly clear.

In fact ... just to check, BOTH:

return Type(x);
return static_cast<Type>(x);

DO construct an object of Type, right? i.e., in both cases the
constructor is called? I suppose that, in the first, if no suitable
constructor is available, the compiler will flag an error? I wonder
what happens in a similar situation with the second statement? If,
after preprocessing, compilation and optimization, the above two
statements will look the same, I suppose that it would be ok for me to
treat the two as semantically equivalent statements (in fact, is there
a real difference between the two)? I really do prefer the first form,
not just because of its explicitness, but also because of its
compactness. I will then probably leave static_casts purely for the
dirty business of pointer-casting.

In general, I now have a much keener awareness of named and
unnamed RVO. I also see that it is (at present) a quirky and
implementation-dependent thing, and I do not have TOO much control over it
(right?). Still, it's nice knowing that it has the potential to be
there, and that you are helping it with one particular construction and
not another.

-- jeet



Scott Meyers

Nov 8, 1999
On 6 Nov 1999 03:28:42 -0500, Gabriel Dos_Reis wrote:
> Meyer's coverage of what is he called "return value optimization" is
> incomplete. He mostly dealt with what should be termed "unnamed return
> value optimization", to emphasize the fact the return-expression
> refers to an unnamed object.

Fair enough. However, please note the following entries in the MEC++
errata list (available at
http://www.aristeia.com/BookErrata/mec++-errata.html), which indicate
changes made to MEC++ regarding this issue:

! 7/19/96  sdm  109  In July 1996, the standardization committee        9/26/96
                     decided that named objects may be treated
                     essentially the same as unnamed objects for
                     purposes of performing the return value
                     optimization. I added a footnote to this effect.

! 2/ 6/98  sdm  101  As noted in the footnote on page 109 (added on     6/ 6/98
                104  9/26/96), and contrary to the text on these pages,
                109  compilers may now apply the same optimizations to
                110  named objects that they've always been able to
                     apply to unnamed objects. I reworded things
                     and/or added clarifying footnotes, and one result
                     was that the information formerly in the footnote
                     on page 109 ended up in a new footnote on page 104.

As background on the relationship between MEC++ and the RVO (and to some
degree my understanding of it and Stan Lippman's), the following is from a
posting I made to this newsgroup on 3 September 1996.

[ Begin 3 Sep 96 posting from Scott Meyers ]

Marc Girod wrote:

"DK" == Dietmar Kuehl <ku...@uzwil.informatik.uni-konstanz.de> writes:

DK> Note, that the object returned in the
DK> 'operator+()' is an unnamed one: The compiler is free to elide this
DK> object! If the function is inlined, chances are that indeed only the
DK> necessary object on the caller's stack will be created.

This is what I read in Meyers' book. Only, after that, I got Stan
Lippman's review in C++ Report 8/7 (June 96), asserting the opposite:
that the optimization is called the "named return value optimization",
and that you'd rather write:

const USD operator+ (const USD& u1, const USD& u2) {
USD s(u1);
s += u2;
return s;
}

As it turns out, we're both wrong: BOTH forms allow for the return value
optimization. There is an interesting story behind this, however.

When Stan's review of my book came out in early July, I wrote the
following, which I planned to submit to the C++ Report as a letter to
the editor. I never submitted the letter, and I'll explain why below.

Stan Lippman's comments on the return value optimization in his review
of _More Effective C++_ failed to include important information
regarding the status of that optimization with respect to the emerging
language standard. The draft standard currently contains
self-contradictory language that both allows and disallows the
optimization.

Consider again the first example that Stan gives in his review:

X bar() // bar returns an object of type X
{
X xx; // xx is a local automatic of type X
...
return xx; // optimization question: may xx be omitted from the
} // program by combining it with bar's return value?

Stan states that it is possible to perform the optimization and then
goes on to show how cfront implemented it in 1991. But he fails to
address the question of whether the optimization is allowed.

The standard reference in 1991 was the ARM, and this is what the ARM
says in section 3.5:

A named automatic object may not be destroyed before the end of its
block nor may an automatic named object of a class with a constructor
or a destructor with side effects be eliminated even if it appears to
be unused.

In cases where X has a constructor or destructor with side effects,
then, the optimization does not appear to have been allowed, and if
cfront performed it in such cases, that appears to have been a violation
of the ARM. (The situation is not as clear as it could be. ARM 12.1.1c
gives an example showing elimination of a named auto object without
considering whether that object has a constructor or destructor with
side effects. However, that is commentary, and the material in section
3.5 is non-commentary, and the general rule of thumb when reading the
ARM has been that non-commentary trumps commentary.)

The current draft ANSI/ISO standard carries the sentiment of the above
passage over as paragraph 3 of section 3.7.2:

If a named automatic object has initialization or a destructor with
side effects, it shall not be destroyed before the end of its block,
nor shall it be eliminated as an optimization even if it appears to be
unused.

However, it also includes contradictory language -- language not present
in the ARM -- in section 12.8, paragraph 15:

Whenever a class object is copied and the implementation can prove
that either the original or the copy will never again be used, an
implementation is permitted to treat the original and the copy as two
different ways of referring to the same object and not perform a copy
at all. In that case, the object is destroyed at the later of times
when the original and the copy would have been destroyed without the
optimization.

This is an unusual definition of the word "optimization," because it
implies that the timing of a call to an object's destructor is dependent
on whether the optimization is applied. In other words, such
"optimization" may change a program's behavior! Personally, I find
this disturbing, but that's neither here nor there.

Stan nearly mentions the impact of the return value optimization on
destructors, but ultimately he considers only the impact on copy
constructor invocation. He argues that it is necessary to allow the
return value optimization to allow for equally efficient code to be
generated for the following constructs:

X xx0(1024);
X xx1 = X(1024);
X xx2 = (X)1024;

However, the objects to be optimized out of existence in these cases are
temporary objects -- *unnamed* objects -- and are hence not subject to
the provisions of section 3.7.2 (paragraph 2) of the draft standard.
There has never been any question that eliminating temporary objects is
allowed.
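
The three initialization forms can be compared with a sketch like the following. It assumes a hypothetical class X with an int converting constructor and a counting copy constructor; neither the class body nor the function name copies_for_three_forms comes from Stan's review.

```cpp
#include <cassert>

// Hypothetical X: converting ctor from int, plus a copy ctor that counts.
struct X {
    static int copies;
    int size;

    X(int n) : size(n) {}
    X(const X& other) : size(other.size) { ++copies; }
};
int X::copies = 0;

// Builds one object with each form and reports the copy ctor calls seen.
int copies_for_three_forms() {
    X::copies = 0;
    X xx0(1024);      // direct initialization: no temporary involved
    X xx1 = X(1024);  // unnamed temporary, then copy-initialization
    X xx2 = (X)1024;  // cast producing an unnamed temporary, then copy-init
    assert(xx0.size == 1024 && xx1.size == 1024 && xx2.size == 1024);
    return X::copies;
}
```

If the temporaries in the second and third forms are eliminated, the function reports 0 copies. If they are kept, it reports 2.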

The long and short of it is this:
- Section 3.7.2 of the current draft standard prohibits the elimination
  of named objects as an optimization.
- Section 12.8 allows it.
- As Stan reported, Josee Lajoie believes that the committee will
  ultimately allow named objects to be eliminated in conjunction
  with the return value optimization.

This letter accurately summarized the situation as of the beginning of
July, i.e., at the time both my book and Stan's review were written. In
mid-July, however, the ANSI/ISO standardization committee met again, and
during that meeting they adopted the change in rules that Stan expected.
By the end of July, then, the draft language standard *did* allow for
the elimination of named objects under the conditions described in DWP
12.8. In that respect, Stan was right and I was wrong. At no time was
it ever invalid to eliminate unnamed objects, however, and in that
respect, I was right and Stan was wrong.

As things stand now, there is no difference between named and unnamed
objects in terms of the ability of compilers to optimize them out of
existence. At least that's what the current DWP says. What real
compilers do probably varies from vendor to vendor, and if your code is
dependent on such things, you should run your own tests and, as they say
on the X Files, trust no one. Not me, not my books, not Stan, not
Stan's books.

(Actually, if your code is dependent on such things, you should try to
rework your code to eliminate the dependency.)

I apologize for any confusion that may arise from my treatment of this
issue in "More Effective C++".

Scott

PS - I don't know where Stan got the terminology "named return value"
for the optimization we're discussing. It's called the "return value
optimization" in the ARM and in every discussion I've ever seen on the
topic. The only thing I know called a "named return value" is a
syntactic extension to C++ supported by g++ that allows programmers to
explicitly refer to a function's return value by giving it a name.

[ End 3 Sep 96 posting from Scott Meyers ]

Scott

--
Scott Meyers, Ph.D. sme...@aristeia.com
Software Development Consultant http://www.aristeia.com/
Visit http://meyerscd.awl.com/ to demo the Effective C++ CD

John Potter

Nov 9, 1999
On 8 Nov 1999 04:39:12 -0500, Jeet Sukumaran
<jeet_su...@my-deja.com> wrote:

: I've heard that Lippman's "Inside the Object Model"
: covers RVO in depth, but the ACCU review of the book says that it is
: severely crippled by logical errors and/or misconceptions? Something to
: do with an assignment operator being a pre-requisite for RVO?

The problem is that Lippman is talking about cfront as "The" C++
Object Model. On RVO, he is correct that cfront used a user-defined
copy ctor (not assignment) as the switch that enabled it. I guess
the logic is that if the user accepts the compiler-generated copy
ctor, copying must be cheap. It could also be that cfront was
pre-standard and the copy ctor was not yet as well defined as it is
now. Bitwise copy was common. The book is historical in nature.
Most of it was obsolete when written, but still informative.

: Searching for elide on the deja led to some ideas for
: testing when and where copy construction takes place -- I'll be giving
: it a spin. I'm curious, though: you say --
:
: : return SomeType(a);
: : return a;
:
: > The return requires a conversion. There is
: > no RVO possible. In both cases, the return
: > SomeType will be constructed from a. The
: > first says construct a SomeType and copy it to
: > the return value. The second says return a by
: > converting it to SomeType and copying it to
: > the return value. Both cases involve the same
: > copy to the return value which may be easily
: > removed by almost all compilers.
:
: So conversion on return precludes RVO? Am I reading you correctly?
: , "The return requires a conversion. There is no RVO possible. In
: both cases, the return SomeType will be constructed from a." ... does
: this, i.e. coversion on return, preclude RVO altogether?

Sorry, semantics. In D&E the term RVO was used to describe what has
evolved to NRVO. What I meant was that the local variable a could
not be removed. The cases that you talk about are not any kind of
RVO in my mind. They are just removal of temporaries, which can
happen on a call as well as a return. The classic Meyers/Lippman
debate was as follows.

X operator+ (X const& lhs, X const& rhs) { // Lippman
X t(lhs);
t += rhs;
return t;
}

With NRVO (Lippman added the N to the D&E term), the construction
takes place in the return value space and the named variable t
is not used.

X operator+ (X const& lhs, X const& rhs) { // Meyers
return X(lhs) += rhs;
}

The claim was that it should be easier to remove this unnamed
temporary. In reality, it was not removed. I had one broken
compiler which used a const cast on lhs for this. It did
save a copy, but I sure didn't like the effect. It was later
pointed out by Jason Merrill that removal of the temporary was
highly unlikely because it required knowledge that += returned
*this. Lippman's form requires no knowledge outside of the
function.
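
The two forms can be compared by counting copy ctor calls, as in the sketch below. The class body, the counter, and the names add_named, add_unnamed and copies_used are hypothetical. Plain functions stand in for operator+ so that both forms can coexist in one program.

```cpp
#include <cassert>

// Hypothetical X with a counting copy ctor and a member operator+=.
struct X {
    static int copies;
    int v;

    explicit X(int n) : v(n) {}
    X(const X& o) : v(o.v) { ++copies; }
    X& operator+=(const X& rhs) { v += rhs.v; return *this; }
};
int X::copies = 0;

// Lippman's form: a named local that may be built in the return area.
X add_named(const X& lhs, const X& rhs) {
    X t(lhs);
    t += rhs;
    return t;
}

// Meyers' form: the return expression is the X& produced by +=, so the
// return value has to be copied from it.
X add_unnamed(const X& lhs, const X& rhs) {
    return X(lhs) += rhs;
}

// Adds 2 and 3 with the given function and reports copy ctor calls.
int copies_used(X (*add)(const X&, const X&)) {
    X a(2), b(3);
    X::copies = 0;
    X sum(add(a, b));
    assert(sum.v == 5);
    return X::copies;
}
```

On a conforming compiler the unnamed form cannot go below two copies: one to build X(lhs) and one to copy the result of += into the return value. The named form can get down to one, the construction of t from lhs, when the named optimization is applied.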

: In fact ... just to check, BOTH:
:
: return Type(x);
: return static_cast<Type>(x);

The language has an ambiguity. The first could be a direct ctor call
or a function style cast. The resolution is that it is a function
style cast which makes it the same as (Type)x which is translated as
static_cast<Type>(x). You have found two ways to write the same
thing. Of course, the cast does call the ctor to get the job done.
You could also just say return x unless there is a conversion and
it is via an explicit ctor.
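
The explicit-ctor caveat can be shown with a sketch like this one. Source, Implicit, Explicit and the function names are hypothetical, and the static_assert checks assume a C++11 or later compiler for <type_traits>.

```cpp
#include <cassert>
#include <type_traits>

struct Source {
    int n;
};

// Same conversion twice: once through a plain ctor, once through an explicit one.
struct Implicit {
    int n;
    Implicit(const Source& s) : n(s.n) {}
};
struct Explicit {
    int n;
    explicit Explicit(const Source& s) : n(s.n) {}
};

// Both spellings perform the same conversion and compile with an explicit ctor.
Explicit by_functional_cast(const Source& x) { return Explicit(x); }
Explicit by_static_cast(const Source& x) { return static_cast<Explicit>(x); }

// A bare "return x;" needs an implicit conversion, so it only works here.
Implicit by_plain_return(const Source& x) { return x; }

// Compile-time statement of the same facts.
static_assert(std::is_convertible<Source, Implicit>::value,
              "implicit conversion is available");
static_assert(!std::is_convertible<Source, Explicit>::value,
              "an explicit ctor blocks the implicit conversion");
static_assert(std::is_constructible<Explicit, Source>::value,
              "explicit construction still works");

// Checks that the two cast spellings agree and returns the converted value.
int same_result(int n) {
    Source s;
    s.n = n;
    assert(by_functional_cast(s).n == by_static_cast(s).n);
    return by_plain_return(s).n;
}
```

Writing "return x;" in a function returning Explicit would be rejected by the compiler, which is the one case where the shorter form is not available.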

John

Gabriel Dos_Reis

Nov 15, 1999
sme...@aristeia.com (Scott Meyers) writes:

| On 6 Nov 1999 03:28:42 -0500, Gabriel Dos_Reis wrote:
| > Meyer's coverage of what is he called "return value optimization" is
| > incomplete. He mostly dealt with what should be termed "unnamed return
| > value optimization", to emphasize the fact the return-expression
| > refers to an unnamed object.
|
| Fair enough. However, please note the following entries in the MEC++
| errata list (available at
| http://www.aristeia.com/BookErrata/mec++-errata.html), which indicate
| changes made to MEC++ regarding this issue:

[...]

| PS - I don't know where Stan got the terminology "named return value"
| for the optimization we're discussing. It's called the "return value
| optimization" in the ARM and in every discussion I've ever seen on the
| topic. The only thing I know called a "named return value" is a
| syntactic extension to C++ supported by g++ that allows programmers to
| explicitly refer to a function's return value by giving it a name.
|
| [ End 3 Sep 96 posting from Scott Meyers ]
|
| Scott

Well, actually I didn't intend to reopen a past (closed?) Lippman
vs. Meyers war.

The fact is that the expression "return value optimization" was
part of established practice to mean a precise, particular type of
optimization (described in the ARM), whereas you used it to describe
another kind of optimization. I find that very confusing.
Furthermore, you made some assertions that constitute strong
evidence that you were not talking about what is described in the
ARM. Here is one such assertion, from MEC++, page 109:

The final efficiency observation concerns implementing the
standalone operators. Look again at the implementation for
operator+:

template<class T>
const T operator+(const T& lhs, const T& rhs)
{ return T(lhs) += rhs; }

...

The first implementation[*] /is/ eligible for the return value
optimization, so compilers can generate better code for it.


[*] 'first implementation' refers to the snippet I quoted.

As it was proven elsewhere in this thread, that claim is simply untrue
as it relies on the fact that operator+= returns *this; in fact, to
make any optimization at all, the compiler has to know what operator+=
is doing.

In that respect I don't think it corresponds to what is described in
the ARM.

As far as the adjective 'named' added by Lippman is concerned, I find
it a Good Thing as it emphasizes what is really going on. On
technical grounds, it corresponds to the ARM's description; at the same
time it differentiates it from what he calls 'computational constructor'.
That is why I recommended Lippman's book. I didn't intend to diminish
yours.


As an aside, I don't much like your recommended template
implementation of operator+ as it heavily relies on operator+= being a
member function (or taking a const T& as its first argument, unlikely
to happen in a real-life application).

--
Gabriel Dos Reis, dos...@cmla.ens-cachan.fr

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]

Gabriel Dos_Reis

Nov 15, 1999
Jeet Sukumaran <jeet_su...@my-deja.com> writes:


[...]

| Gabriel Dos Reis: I've heard that Lippman's "Inside the Object Model"
| covers RVO in depth, but the ACCU review of the book says that it is
| severely crippled by logical errors and/or misconceptions?

I don't know of the ACCU review of the book.

| ... Something to
| do with an assignment operator being a pre-requisite for RVO?

No, in any case it didn't assert that an assignment operator was a
pre-requisite for the named return value optimization. But it stated
that the presence of a copy constructor was indeed a pre-requisite
of the NRVO. I think that _was_ a particular requirement of Cfront,
not that of the Standard as I understand it.

John Potter

Nov 15, 1999
On 15 Nov 1999 05:35:50 -0500, Gabriel Dos_Reis
<gdos...@korrigan.inria.fr> wrote:

: As an aside, I don't much like your recommended template
: implementation of operator+ as it heavily relies on operator+= being a
: member function

Which I hope is very likely to happen in all real-life applications.

: (or taking a const T& as its first argument, unlikely
: to happen in a real-life application).

I don't use it because no compiler that implements (N)RVO which I
have tested optimizes it. It just doesn't work. At the time, it
was a reasonable recommendation, but time has shown otherwise.

John

Gabriel Dos_Reis

Nov 15, 1999
jpo...@falcon.lhup.edu (John Potter) writes:

| On 15 Nov 1999 05:35:50 -0500, Gabriel Dos_Reis
| <gdos...@korrigan.inria.fr> wrote:
|
| : As an aside, I don't much like your recommended template
| : implementation of operator+ as it heavily relies on operator+= being a
| : member function
|
| Which I hope is very likely to happen in all real-life applications.

Well, I have to conclude we don't have access to the same 'real-life
applications' :-)

I faced a library where there was a need that the first operand of
operator+= be a modifiable *lvalue*. The natural way to meet that
requirement is to implement operator+= as a non-member function

T& operator+=(T&, const T&);

operator+= as a member function simply doesn't meet that requirement.
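
To make the lvalue requirement concrete, here is a minimal sketch. The types Member and Free and the two helper functions are hypothetical names invented for illustration; they come from no library mentioned in this thread.

```cpp
#include <cassert>

// Hypothetical type with operator+= as a member function.
struct Member {
    int v;
    explicit Member(int x) : v(x) {}
    Member& operator+=(const Member& rhs) { v += rhs.v; return *this; }
};

// Hypothetical type with operator+= as a non-member taking T& first.
struct Free {
    int v;
    explicit Free(int x) : v(x) {}
};
Free& operator+=(Free& lhs, const Free& rhs) { lhs.v += rhs.v; return lhs; }

// A member operator+= may be invoked on a temporary, which is what
// "return T(lhs) += rhs;" depends on.
int member_on_temporary() {
    return (Member(1) += Member(2)).v;
}

// The non-member form accepts only a modifiable lvalue on the left.
// Writing "Free(1) += Free(2)" is rejected at compile time, because a
// temporary cannot bind to the non-const reference parameter Free&.
int free_on_lvalue() {
    Free a(1);
    Free& r = (a += Free(2));
    assert(&r == &a);  // this operator+= returns its left operand
    return a.v;
}
```

With a type like Free, the template "return T(lhs) += rhs;" would not compile at all, whereas the form that declares a named local and applies += to it would.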


Also, one should be aware that blindly implementing operator+
in terms of operator+= isn't always the best solution.
If you try to implement an efficient library for univariate dense
polynomial processing you'll quickly realize that contrary to popular
belief, operator+= is best implemented... in terms of operator+.

--
Gabriel Dos Reis, dos...@cmla.ens-cachan.fr


Scott Meyers

Nov 23, 1999
On 15 Nov 1999 05:35:50 -0500, Gabriel Dos_Reis wrote:
> The fact is that the expression "return value optimization" was
> part of established practice to mean a precise, particular type of
> optimization (described in the ARM), whereas you used it to describe
> another kind of optimization. I find that very confusing.

Perhaps established practice is in the eye of the beholder. To me, the RVO
has always referred to the ability of a compiler to use the memory for a
function's return value to hold a local object of the same type that would
otherwise have to be copied into the return value location, thus saving the
cost of the copy.

> Furthermore, you made some assertions that constitute strong
> evidence that you were not talking about what is described in the
> ARM. Here is one such assertion, from MEC++, page 109:
>
> The final efficiency observation concerns implementing the
> standalone operators. Look again at the implementation for
> operator+:
>
> template<class T>
> const T operator+(const T& lhs, const T& rhs)
> { return T(lhs) += rhs; }
>
> ...
>
> The first implementation[*] /is/ eligible for the return value
> optimization, so compilers can generate better code for it.
>
> [*] 'first implementation' refers to the snippet I quoted.
>
> As it was proven elsewhere in this thread, that claim is simply untrue
> as it relies on the fact that operator+= returns *this; in fact, to
> make any optimization at all, the compiler has to know what operator+=
> is doing.

I agree. I don't know what printing of MEC++ you are quoting, but the
material you cited reads as follows in the current printing:

The final efficiency observation concerns implementing the stand-alone
operators. Look again at the implementation for operator+:

template<class T>
const T operator+(const T& lhs, const T& rhs)
{ return T(lhs) += rhs; }

[...]

template<class T>
const T operator+(const T& lhs, const T& rhs)
{
    T result(lhs);          // copy lhs into result
    return result += rhs;   // add rhs to it and return
}

This template is almost equivalent to the [first] one above, but there is a
crucial difference. This second template contains a named object,
result. The fact that this object is named means that the return value
optimization (see Item 20) was, until relatively recently, unavailable
for this implementation of operator+ (see the footnote on page 104). The
first implementation has always been eligible for the return value
optimization, so the odds may be better that the compilers you use will
generate optimized code for it.

Now, truth in advertising compels me to point out that the expression

return T(lhs) += rhs;

is more complex than most compilers are willing to subject to the return
value optimization. The first implementation above may thus cost you one
temporary object within the function, just as you'd pay for using the
named object result. However, the fact remains that unnamed objects have
historically been easier to eliminate than named objects, so when faced
with a choice between a named object and a temporary object, you may be
better off using the temporary. It should never cost you more than its
named colleague, and, especially with older compilers, it may cost you
less.

I want to point out my deliberate (and repeated) use of the word "may". If
we assume that T's operator+= is an inline function (and thus visible to
the compiler), I continue to believe that

return T(lhs) += rhs;

could lead to use of the RVO. At the same time, I concede that this
optimization is not available in general (especially if operator+= isn't
visible during compilation of the above construct), and the revised wording
of the above paragraph (vis a vis the wording in the first few (several?)
printings of the book) reflects my improved understanding of the situation.

If I were to write the passage today from scratch, I'd drop the subject
altogether, because you're much righter on this particular topic than my
book is. Sigh.

> In that respect I don't think it corresponds to what is described in
> the ARM.

I sympathize with your confusion, and I think you have a valid point. At
the same time, the fundamental issue *is* the same: whether a compiler is
permitted to construct a local object in the memory for a function's return
value, thus eliminating the need to copy the object into the return value
location.

> As an aside, I don't much like your recommended template
> implementation of operator+ as it heavily relies on operator+= being a
> member function (or taking a const T& as its first argument, unlikely
> to happen in a real-life application).

There are two ways to skin this cat, I suppose. My preference is to make
sure that operations that are supposed to return rvalue-acting objects
return const objects, thus prohibiting their use in lvalue contexts.
That's why I advocate having operator+ and operator++(int) return const
objects, for example. In addition, it strikes me as much more natural to
make all the assignment operators (including operator+= et al) member
functions, given that operator= itself must be a member. Until I read your
post, I'd frankly never heard of anybody proposing that operator+= should
be a nonmember, but I can see how it would work. Still, in my experience,
it's quite unconventional to declare it that way. FWIW, the only place
I've ever seen it spelled out where certain operators should be declared is
Rob Murray's "C++ Strategies and Tactics," and in that book (on page 47),
he recommends that all assignment operators (including op=) be made
members. Surely that should settle the matter :-)
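
The effect of returning a const object can be sketched as follows. This is only an illustration: Num, add_const, and add_plain are hypothetical names, and the check relies on std::is_assignable from C++11, which postdates this thread.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical value type with a conventional member operator+=.
struct Num {
    int v;
    explicit Num(int x) : v(x) {}
    Num& operator+=(const Num& rhs) { v += rhs.v; return *this; }
};

// Returns a const object, in the style advocated above.
const Num add_const(const Num& lhs, const Num& rhs) {
    Num result(lhs);
    result += rhs;
    return result;
}

// Same body, but returns a non-const object.
Num add_plain(const Num& lhs, const Num& rhs) {
    Num result(lhs);
    result += rhs;
    return result;
}

// add_plain(a, b) = c;  // accepted: assigns to a temporary, then drops it
// add_const(a, b) = c;  // rejected: operator= is a non-const member

// The same two facts, expressed as compile-time queries.
const bool plain_result_assignable = std::is_assignable<
    decltype(add_plain(std::declval<Num>(), std::declval<Num>())), Num>::value;
const bool const_result_assignable = std::is_assignable<
    decltype(add_const(std::declval<Num>(), std::declval<Num>())), Num>::value;

// Reading from a const result is unaffected.
int sum_via_const_return() {
    const Num s = add_const(Num(2), Num(3));
    assert(s.v == 5);
    return s.v;
}
```

Note that the const return type also rules out "add_const(a, b) += c", since a non-const member operator+= cannot be called on a const temporary.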

Bill Seymour

Nov 23, 1999
Scott Meyers wrote:
>
> template<class T>
> const T operator+(const T& lhs, const T& rhs)
> { return T(lhs) += rhs; }
>
> [...]
>
> template<class T>
> const T operator+(const T& lhs, const T& rhs)
> {
> T result(lhs); // copy lhs into result
> return result += rhs; // add rhs to it and return
> }
>

I don't see the difference between the two vis-à-vis the RVO
since both return the result of operator+=. Shouldn't the
body of the second be written

T result(lhs);
result += rhs;
return result;

to make it an example of returning a named object? Or am I
missing something?

--Bill Seymour

J.Barfurth

Nov 23, 1999

Scott Meyers <sme...@aristeia.com> wrote in message
MPG.12a3de7ee...@news.teleport.com...

> On 15 Nov 1999 05:35:50 -0500, Gabriel Dos_Reis wrote:

> > The first implementation[*] /is/ eligible for the return value
> > optimization, so compilers can generate better code for it.
> >
> > [*] 'first implementation' refers to the snippet I quoted.

Also the first one below

What was your reason to omit:

template<class T>
const T operator+(const T& lhs, const T& rhs)
{
    T result(lhs);  // copy lhs into result
    result += rhs;  // add rhs to it
    return result;  // and return
}

where the (N)RVO can be done even for an out-of-line operator+=?

This is in line with something I remember reading a while back (iirc in
Steve McConnell's "Code Complete"):
Don't try to be clever by putting too much into one line (or expression);
the compiler's ability to optimize may degrade with complex expressions.
I don't know to what degree today's optimizers are still susceptible
to this, but a decent optimizer should at least not do any worse than in
your second example above.
Iirc the author had a (measured) example where this degradation actually
occurred (something like a one-line strcpy).

IMHO it hurts readability to put too much into an expression. This applies
especially to cases like assignment operators, where to me the side effect
of changing the object assigned to is immediately evident, while the return
value is incidental. I avoid using such return values (i.e. from operations
with side effects). As usual this is nothing absolute. If a function
is more complicated than the example above, and using such return values
reduces complexity by avoiding an abundance of local names, then it might be
acceptable. But mostly I'd rather refactor such functions...

-- Jörg Barfurth

John Potter

Nov 24, 1999
On 23 Nov 1999 10:38:28 -0500, sme...@aristeia.com (Scott Meyers)
wrote:

: On 15 Nov 1999 05:35:50 -0500, Gabriel Dos_Reis wrote:
: > The fact is that the expression "return value optimization" was
: > part of established pratice to mean a precise particular type of
: > optimization (described in the ARM) whereas you used it to describe
: > another kind of optimizations. I find that very confusing.
:
: Perhaps established practice is in the eye of the beholder. To me, the RVO
: has always referred to the ability of a compiler to use the memory for a
: function's return value to hold a local object of the same type that would
: otherwise have to be copied into the return value location, thus saving the
: cost of the copy.

void f (T t);
void g (T const& lhs, T const& rhs)
{
f(T(lhs) += rhs);
}

Should this be called the call value optimization? Restate the above
replacing return by parameter. Both of these are covered by the same
temporary removal statement in the standard. The other statement in
the standard covers removing a named local variable in the specific
case of return by value. This one needs a name. It is the one that
the ARM mentioned, cfront implemented, and the standard was changed
to allow (after printing of MEC++).

: I agree. I don't know what printing of MEC++ you are quoting, but the
: material you cited reads as follows in the current printing:

Empirical evidence here.

: The final efficiency observation concerns implementing the stand-alone
: operators. Look again at the implementation for operator+:
:
: template<class T>
: const T operator+(const T& lhs, const T& rhs)
: { return T(lhs) += rhs; }

Copy not removed.

: template<class T>
: const T operator+(const T& lhs, const T& rhs)
: {
: T result(lhs); // copy lhs into result
: return result += rhs; // add rhs to it and return
: }

Copy not removed.

template<class T>
const T operator+(const T& lhs, const T& rhs)
{
    T result(lhs);
    result += rhs;
    return result;
}

Copy removed regardless of where operator+= was located. It is a
purely local optimization.
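
The three variants above can be compared with a small copy-counting sketch. Tracked, copy_count, and the add_* function names are made up for illustration. How many copies the third variant makes depends on whether the compiler in use applies the named return value optimization, so no particular count is promised for it.

```cpp
#include <cassert>

static int copy_count = 0;  // copy constructions observed so far

// Hypothetical type that counts how often it is copy-constructed.
struct Tracked {
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& o) : value(o.value) { ++copy_count; }
    Tracked& operator+=(const Tracked& rhs) { value += rhs.value; return *this; }
};

// Variant 1: unnamed temporary, returns the result of operator+=.
const Tracked add_temp(const Tracked& lhs, const Tracked& rhs) {
    return Tracked(lhs) += rhs;  // returned expression is a Tracked&: copied
}

// Variant 2: named local, still returns the result of operator+=.
const Tracked add_named_combined(const Tracked& lhs, const Tracked& rhs) {
    Tracked result(lhs);
    return result += rhs;        // also a Tracked&, so also copied
}

// Variant 3: named local, returned by name.
const Tracked add_named_split(const Tracked& lhs, const Tracked& rhs) {
    Tracked result(lhs);
    result += rhs;
    return result;               // the only form eligible for the NRVO
}

typedef const Tracked (*AddFn)(const Tracked&, const Tracked&);

// Counts copy constructions made by one call; the result is read
// straight from the returned temporary so the caller adds no copy.
int copies_made_by(AddFn add) {
    Tracked a(2), b(3);
    copy_count = 0;
    const int sum = add(a, b).value;
    assert(sum == 5);
    return copy_count;
}
```

Calling copies_made_by(add_named_split) on a given compiler shows whether it removes the copy; the other two variants always pay for the copy out of the reference returned by operator+=.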

The results are for cfront and the only modern compiler that I know
of which performs the RVO. For other compilers, the copy was not
removed for any of them. The GNU extension

template<class T>
const T operator+(const T& lhs, const T& rhs) result(lhs)
{
    result += rhs;
    return result;
}

produces a syntax error due to the const on the return value. The
standard explicitly allows ignoring cv qualifiers.

See also Defect Report: Lifetime of "named" temporaries? in csc++.

: FWIW, the only place
: I've ever seen it spelled out where certain operators should be declared is
: Rob Murray's "C++ Strategies and Tactics," and in that book (on page 47),
: he recommends that all assignment operators (including op=) be made
: members. Surely that should settle the matter :-)

He also recommends that all unary operators be members. But we know
that postfix should be non-members. ;-)

template <class T>
T const operator++ (T& that, int) {
    T tmp(that);
    ++that;
    return tmp;
}

Hey, there's that RVO again. ;-)

John

Scott Meyers

Nov 25, 1999
On 23 Nov 1999 15:37:20 -0500, J.Barfurth wrote:
> What was your reason to omit:
> template<class T>
> const T operator+(const T& lhs, const T& rhs)
> {
> T result(lhs); // copy lhs into result
> result += rhs; // add rhs to it
> return result; // and return
> }
> where the (N)RVO can be done even for out-of-line operator+= ?

Probably just that I didn't think of it :-)

In retrospect, I made what I'd guess is a common mistake. I "know" that
operator+= returns its left-hand argument, so I figured that

return result += rhs;

and

result += rhs;
return result;

are equivalent. Except that I can't "know" this, because operator+= can do
whatever it pleases. I simply overlooked that.
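
A deliberately perverse sketch shows how the two forms can diverge. Odd and the two add_* functions are hypothetical; the only point is that nothing in the language forces operator+= to return *this.

```cpp
#include <cassert>

// Hypothetical type whose operator+= does NOT return *this.
struct Odd {
    int value;
    explicit Odd(int v) : value(v) {}
    Odd& operator+=(const Odd& rhs) {
        value += rhs.value;
        static Odd elsewhere(-1);
        return elsewhere;  // legal C++, though it defies convention
    }
};

// Returns whatever operator+= returns, which here is not result.
Odd add_combined(const Odd& lhs, const Odd& rhs) {
    Odd result(lhs);
    return result += rhs;
}

// Returns the named local, regardless of what operator+= returns.
Odd add_split(const Odd& lhs, const Odd& rhs) {
    Odd result(lhs);
    result += rhs;
    return result;
}

// True when the two forms produce different values for the same operands.
bool forms_disagree() {
    const Odd a(2), b(3);
    const int combined = add_combined(a, b).value;
    const int split = add_split(a, b).value;
    assert(split == 5);
    return combined != split;
}
```

Because the compiler cannot assume the conventional behaviour, it has to treat the combined form as returning an arbitrary T&, which is why only the split form qualifies for the optimization on a named local.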

As you and others have demonstrated to the point where even I can see it,
the implementation above is clearly the best way to go. I'll add this to
the MEC++ errata, and I'll try to incorporate it in future printings of the
book. If somebody can tell me who originally pointed out this mistake in
the book, I'll gladly add that person to the acknowledgements when I make
the change.

Thanks to all for showing me the folly of my ways.

> This is in line with something I remember reading a while back (iirc in
> Steve McConnell's "Code Complete"):

Good book. I like his others even better. "Software Project Survival
Guide" was a revelation to me. Then again, I don't get out much...

Michel Michaud

Nov 26, 1999
Scott Meyers wrote :

>
> On 23 Nov 1999 15:37:20 -0500, J.Barfurth wrote:
> > template<class T>
> > const T operator+(const T& lhs, const T& rhs)
> > {
> > T result(lhs); // copy lhs into result
> > result += rhs; // add rhs to it
> > return result; // and return
> > }

> As you and others have demonstrated to the point where even I can see it,
> the implementation above is clearly the best way to go. I'll add this

I don't want to restart an old discussion, but I still think the
following way has to be considered (especially as a template):

template<class T>
const T operator+(T lhs, const T& rhs)
{
    lhs += rhs;
    return lhs;
}

--
Michel Michaud (mic...@removethis.mail2.cstjean.qc.ca)
http://www3.sympatico.ca/michel.michaud

John Potter

Nov 26, 1999
On 26 Nov 1999 12:44:46 -0500, Michel Michaud
<mic...@gest-netware.cstjean.qc.ca> wrote:

: I don't want to restart an old discussion, but I still think the
: following way has to be considered (especially as a template):
:
: template<class T>
: const T operator+(T lhs, const T& rhs)
: {
: lhs += rhs;
: return lhs;
: }

In theory it is possible for the parameter and the return value to
occupy the same space. It has been recently stated in csc++ that
it is allowed by the standard.

In practice, it will not happen. The copy to lhs will be performed
by the caller. The copy from lhs to the return value will be
performed by the function.

The simple local form will work on a compiler which implements it.

This form could work since the template is inline; however, do not
hold your breath waiting.
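
The two copies described above can be observed with a counting type. This is a sketch under assumptions: Probe, probe_copies, and add_by_value are invented names, and the comment about parameters reflects the wording of later revisions of the standard (C++11 onward), which explicitly exclude function parameters from this form of copy elision.

```cpp
#include <cassert>

static int probe_copies = 0;  // copy constructions observed so far

// Hypothetical type that counts how often it is copy-constructed.
struct Probe {
    int value;
    explicit Probe(int v) : value(v) {}
    Probe(const Probe& o) : value(o.value) { ++probe_copies; }
    Probe& operator+=(const Probe& rhs) { value += rhs.value; return *this; }
};

// The by-value form: the left operand is copied into the parameter.
const Probe add_by_value(Probe lhs, const Probe& rhs) {
    lhs += rhs;
    return lhs;  // lhs is a parameter, so it is copied into the return value
}

// Counts the copies made by one call whose left argument is an lvalue.
int copies_with_lvalue_argument() {
    Probe a(2), b(3);
    probe_copies = 0;
    const int sum = add_by_value(a, b).value;  // read from the temporary
    assert(sum == 5);
    return probe_copies;
}
```

One copy is made by the caller when initializing lhs from a, and a second by the function when returning lhs, which matches the description of what happens in practice.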

John

Gabriel Dos_Reis

Nov 26, 1999
sme...@aristeia.com (Scott Meyers) writes:


[...]

| As you and others have demonstrated to the point where even I can see it,
| the implementation above is clearly the best way to go. I'll add this to
| the MEC++ errata, and I'll try to incorporate it in future printings of the
| book. If somebody can tell me who originally pointed out this mistake in
| the book, I'll gladly add that person to the acknowledgements when I make
| the change.

I don't know who originally pointed out that flaw. But I remember
having discussed a similar issue with John Potter a year (or so) ago. I
guess he knows the first person who pointed out the flaw.

--
Gabriel Dos Reis, dos...@cmla.ens-cachan.fr


John Potter

Dec 6, 1999
On 15 Nov 1999 23:06:12 -0500, Gabriel Dos_Reis
<gdos...@korrigan.inria.fr> wrote:

: Also, one should be aware that blindly implementing operator+
: in terms of operator+= isn't always the best solution.
: If you try to implement an efficient library for univariate dense
: polynomial processing you'll quickly realize that contrary to popular
: belief, operator+= is best implemented... in terms of operator+.

I must confess ignorance, but am open to education. Would this be
like matrix multiplication?

The blind solution:

M& operator*= (M const& rhs) {
    M tmp;
    // do the work of tmp = *this * rhs
    *this = tmp; // one copy assignment
    return *this;
}
M operator* (M const& lhs, M const& rhs) {
    M tmp(lhs);
    tmp *= rhs;
    return tmp; // one or two copy ctor depending on RVO
}

The alternative:

M& operator*= (M const& rhs) {
    return *this = *this * rhs; // one copy assignment
}
M operator* (M const& lhs, M const& rhs) {
    M tmp;
    // do the work of tmp = lhs * rhs
    return tmp; // zero or one copy ctor depending on RVO
}

Copies        *=     *
blind         1      2/3
alternative   1/2    0/1

The alternative has a better average. It is also possible that a
knowledge of the domain and common coding indicates that * is used
much more than *=. Is this what you were talking about?
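
The copy counts tabulated above can be reproduced with a small 2x2 matrix sketch. Data, BlindM, AltM, matrix_copies, and the helper functions are all hypothetical names. The counter includes both copy constructions and copy assignments, and the totals for functions that return a named local depend on whether the compiler applies the NRVO.

```cpp
#include <cassert>

static int matrix_copies = 0;  // copy constructions + copy assignments

// Hypothetical 2x2 matrix payload (row-major) that counts its copies.
struct Data {
    double m[4];
    Data() { for (int i = 0; i < 4; ++i) m[i] = 0.0; }
    Data(const Data& o) { for (int i = 0; i < 4; ++i) m[i] = o.m[i]; ++matrix_copies; }
    Data& operator=(const Data& o) {
        for (int i = 0; i < 4; ++i) m[i] = o.m[i];
        ++matrix_copies;
        return *this;
    }
};

// out = x * y; out must not alias x or y. Makes no copies of Data.
void multiply(const Data& x, const Data& y, Data& out) {
    out.m[0] = x.m[0] * y.m[0] + x.m[1] * y.m[2];
    out.m[1] = x.m[0] * y.m[1] + x.m[1] * y.m[3];
    out.m[2] = x.m[2] * y.m[0] + x.m[3] * y.m[2];
    out.m[3] = x.m[2] * y.m[1] + x.m[3] * y.m[3];
}

// "Blind": operator* is written in terms of operator*=.
struct BlindM : Data {
    BlindM& operator*=(const BlindM& rhs) {
        BlindM tmp;
        multiply(*this, rhs, tmp);
        *this = tmp;              // one copy assignment
        return *this;
    }
};
BlindM operator*(const BlindM& lhs, const BlindM& rhs) {
    BlindM tmp(lhs);              // one copy construction
    tmp *= rhs;                   // one copy assignment inside operator*=
    return tmp;                   // zero or one more, depending on NRVO
}

// "Alternative": operator*= is written in terms of operator*.
struct AltM : Data {
    AltM& operator*=(const AltM& rhs);
};
AltM operator*(const AltM& lhs, const AltM& rhs) {
    AltM tmp;
    multiply(lhs, rhs, tmp);
    return tmp;                   // zero or one, depending on NRVO
}
AltM& AltM::operator*=(const AltM& rhs) {
    return *this = *this * rhs;   // one assignment plus operator*'s cost
}

// Fills a with the identity and b with 1..4 for the measurements below.
template <class M> void fill(M& a, M& b) {
    a.m[0] = a.m[3] = 1.0;
    b.m[0] = 1.0; b.m[1] = 2.0; b.m[2] = 3.0; b.m[3] = 4.0;
}

// Copies made while evaluating a * b; the result is read from the temporary.
template <class M> int copies_for_product() {
    M a, b;
    fill(a, b);
    matrix_copies = 0;
    const double entry = (a * b).m[1];
    assert(entry == 2.0);
    return matrix_copies;
}

// Copies made while evaluating a *= b.
template <class M> int copies_for_compound() {
    M a, b;
    fill(a, b);
    matrix_copies = 0;
    a *= b;
    assert(a.m[1] == 2.0);
    return matrix_copies;
}
```

The pattern to look for is that copies_for_product<AltM>() never exceeds copies_for_product<BlindM>(), at the price of a possible extra copy in copies_for_compound<AltM>(), which is the trade-off in the table.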

John

Gabriel Dos_Reis

Dec 6, 1999
jpo...@falcon.lhup.edu (John Potter) writes:

| On 15 Nov 1999 23:06:12 -0500, Gabriel Dos_Reis
| <gdos...@korrigan.inria.fr> wrote:
|
| : Also, one should be aware that blindly implementing operator+
| : in terms of operator+= isn't always the best solution.
| : If you try to implement an efficient library for univariate dense
| : polynomial processing you'll quickly realize that contrary to popular
| : belief, operator+= is best implemented... in terms of operator+.
|
| I must confess ignorance, but am open to education. Would this be
| like matrix multiplication?

Yes. It's also like implementing operator+= for bignums. They all can
be seen as 'matrix-vector multiplication'. In any case your
comparison is accurate.

| The blind solution:

[...]

|
| The alternative:

[...]

| Copies        *=     *
| blind         1      2/3
| alternative   1/2    0/1
|
| The alternative has a better average. It is also possible that a
| knowledge of the domain and common coding indicates that * is used
| much more than *=. Is this what you were talking about?

Absolutely yes.




--
Gabriel Dos Reis, dos...@cmla.ens-cachan.fr

