C++0x: returning rvalue references, recycling temporaries

SG

Sep 11, 2008, 2:16:45 PM

[split from thread "std::max(unsigned, size_t), amd64 and C++0x"]

On 11 Sep., 10:59, Howard Hinnant <howard.hinn...@gmail.com> wrote:
> On Sep 9, 6:28 pm, SG <s.gesem...@gmail.com> wrote:
> >
> > There's some benefit to returning rvalue references: temporary objects
> > can be recycled:

string operator+(const string&, const string&);
string && operator+(string &&, const string &);
string && operator+(const string &, string &&);
string && operator+(string &&, string &&);
(operator+ overloads example, see document N1377)

string x = "/home/foo/" + get_string() + ".png"; // is fine

> > string && result = "/home/foo/" + get_string() + ".png";
> >
> > but I guess this [...] would result in a dangling reference. [...]
> > Defining a reference and initializing it by an _rvalue reference
> > returned by a function_ is doomed to fail, isn't it?
>
> I consider this a bug in the rvalue-ref papers. :-)

What exactly? The line of code ("string && result = ...") is not part
of document N1377. The reason I brought this up was because I'm not
sure about its intended semantics w.r.t. the temporary's lifetime.
I'm currently assuming that it's not ill-formed but usually unsafe
because such a function most likely returns a reference to a temporary
(function argument T&&). But I really like the idea of recycling
temporary objects. I'd appreciate it if you could share your thoughts
on this.

A& f(A& x) { return x; } // #1
A&& f(A&& x) { return x; } // #2

A a = f(A()); // #2 OK
A&& b = A(); // -- OK
A&& c = f(b); // #1 OK
A&& d = f(A()); // #2 Ouch!
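
The overload choices annotated above (#1 for the named reference b, #2 for a temporary) can be checked without creating a dangling reference. A minimal sketch, assuming a hypothetical helper `which` that takes the same parameter types as f but returns the overload number instead of a reference:

```cpp
#include <cassert>

struct A {};

// Hypothetical stand-ins for f: same parameter types as #1 and #2,
// but they return the overload number rather than a reference.
int which(A&)  { return 1; }  // like #1: selected for lvalues
int which(A&&) { return 2; }  // like #2: selected for rvalues

void check_overload_choice()
{
    A a;
    A&& b = A();               // the temporary lives as long as b
    assert(which(a) == 1);     // a plain lvalue
    assert(which(b) == 1);     // b has a name, so the expression b is an lvalue
    assert(which(A()) == 2);   // a temporary
}
```

Only the last call corresponds to the "Ouch!" line: there the argument is a temporary that dies at the end of the full expression, so a reference returned to it would dangle.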

Cheers,
SG

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

David Abrahams

Sep 11, 2008, 7:13:40 PM

on Thu Sep 11 2008, SG <s.gesemann-AT-gmail.com> wrote:

> I really like the idea of recycling temporary objects. I'd appreciate
> it if you could share your thoughts on this.

I think the costs of moving data into a fresh rvalue object with the
right lifetime properties are so low that recycling almost never makes
sense.

T const& x = foo(T());

In most cases it's nearly free for foo to return a copy of its argument
(safe in this code) no matter how heavyweight T is, so why risk
returning a reference that could dangle?
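
A minimal sketch of that argument, assuming a hypothetical type `T` that only counts how it is copied and moved, and assuming `foo` is written to return a move-constructed value (neither definition appears in the thread):

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a heavyweight type; it only counts constructions.
struct T {
    static int copies;
    static int moves;
    T() = default;
    T(const T&) { ++copies; }
    T(T&&) noexcept { ++moves; }
};
int T::copies = 0;
int T::moves = 0;

// Returns a fresh object by value, move-constructed from the argument.
T foo(T&& t) { return std::move(t); }

void check_by_value_return()
{
    T::copies = T::moves = 0;
    T const& x = foo(T());   // x binds to the returned temporary, which lives as long as x
    assert(T::copies == 0);  // nothing was copied
    assert(T::moves == 1);   // the only cost is one move construction
    (void)x;
}
```

For a type that owns a buffer, that single move is typically a few pointer assignments, which is the "nearly free" cost referred to above.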

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com

Howard Hinnant

Sep 11, 2008, 11:38:31 PM

On Sep 11, 7:13 pm, David Abrahams <d...@boostpro.com> wrote:
> on Thu Sep 11 2008, SG <s.gesemann-AT-gmail.com> wrote:
>
> > I really like the idea of recycling temporary objects. I'd appreciate
> > it if you could share your thoughts on this.
>
> I think the costs of moving data into a fresh rvalue object with the
> right lifetime properties are so low that recycling almost never makes
> sense.
>
> T const& x = foo(T());
>
> In most cases it's nearly free for foo to return a copy of its argument
> (safe in this code) no matter how heavyweight T is, so why risk
> returning a reference that could dangle?

<nod> That's a fair summary of my thoughts on this as well. It isn't
that:

string a, b, c;
string d = a + b + c;

is unsafe when string+string returns a string&&. It is just that you
allow the possibility for your client to do something like:

const string& d = a + b + c;
// use d here

That code works if a + b + c returns by value and doesn't work if it
returns by string&&. The risk of disaster is small (not many will
catch the result by reference). The cost to prevent the disaster is
small (as long as you design string as cheap to move).

If the cost of preventing the small risk was large, I'd argue for the
small risk, and just document the risk's existence. But since the
cost of prevention is small, it seems like electrons well spent to
me. In the end, it is an engineering tradeoff either way. Either
answer can be argued to be correct.

When I gave string+string a string&& return in N1377 I had not yet
appreciated the (small) risk involved with that choice. One lives and
learns. :-)

-Howard


marc....@gmail.com

Sep 12, 2008, 4:58:51 PM

On 12 sep, 05:38, Howard Hinnant <howard.hinn...@gmail.com> wrote:
> <nod> That's a fair summary of my thoughts on this as well. It isn't
> that:
>
> string a, b, c;
> string d = a + b + c;
>
> is unsafe when string+string returns a string&&. It is just that you
> allow the possibility for your client to do something like:
>
> const string& d = a + b + c;
> // use d here
>
> That code works if a + b + c returns by value and doesn't work if it
> returns by string&&.

Really, it doesn't? I just looked at N2723, and in paragraph 8.5.3.5
it seems like the case where the initializer is an rvalue reference
falls into the last "Otherwise", where a temporary is created and
bound to the reference (not as good as binding to the same object as
the rvalue ref and increasing its lifetime, but still ok). Or does the
term "rvalue" mean "either rvalue or rvalue reference" there? I am not
that familiar with the language of the standard, so I may easily
misinterpret.

I am really interested in this, because I have already started writing
code that looks exactly like your example:

const string& d = a + b + c;

with my own type instead of std::string. And since I wrote that within
the first day of trying to use rvalue references, I assume it is not a
marginal case.

What is your advice for someone writing a class like string? Always
return by value?

Howard Hinnant

Sep 12, 2008, 10:28:51 PM

Yes, always return by value.

Consider:

string operator+(const string&, const string&);

string&& operator+(string&& x, const string& y) {x += y; return x;}

const string& d = a + b + c;

"a+b" forms a temporary string which then binds to "x" in the second
operator+ overload. That temporary is appended to and then returned by
rvalue ref. d now refers to the temporary created by "a+b" and modified
by "+= c". The lifetime of that temporary ends at the semicolon after
"c". So d is a dangling reference.

Now consider:

string operator+(const string&, const string&);

string operator+(string&& x, const string& y) {x += y; return move(x);}

const string& d = a + b + c;

This is slightly more expensive because it involves an extra move
construction. However "a + b + c" now creates a temporary whose
lifetime is going to extend beyond the semicolon until the reference d
is destructed due to the language in 12.2 [class.temporary].

For the cost of an extra move construction (after an append) you can
sleep better at night. :-)
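
A minimal sketch of the by-value design, assuming a hypothetical wrapper class `Str` (std::string already has its own operator+, so the overloads are shown on a stand-in type):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical string-like class with just enough for the example.
struct Str {
    std::string data;
    Str& operator+=(const Str& rhs) { data += rhs.data; return *this; }
};

// Both overloads return by value, as recommended above.
Str operator+(const Str& x, const Str& y) { Str r(x); r += y; return r; }
Str operator+(Str&& x, const Str& y)      { x += y; return std::move(x); }

std::string join_three()
{
    Str a{"/home/"}, b{"foo/"}, c{"bar.png"};
    // a + b uses the first overload; (a + b) + c uses the second one.
    const Str& d = a + b + c;  // d binds to the returned temporary, whose lifetime is extended
    return d.data;             // d is still valid here
}
```

The temporary produced by a + b still dies at the semicolon, but its buffer has already been moved into the object that d refers to.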

-Howard

Seungbeom Kim

Sep 13, 2008, 11:09:04 PM

Howard Hinnant wrote:
>
> Now consider:
>
> string operator+(const string&, const string&);
> string operator+(string&& x, const string& y) {x += y; return move(x);}
>
> const string& d = a + b + c;
>
> This is slightly more expensive because it involves an extra move
> construction. However "a + b + c" now creates a temporary whose
> lifetime is going to extend beyond the semicolon until the reference d
> is destructed due to the language in 12.2 [class.temporary].
>
> For the cost of an extra move construction (after an append) you can
> sleep better at night. :-)

1. How is it more expensive by involving an extra move construction?
As far as I understand, move(x) above is just a shorthand for
static_cast<string&&>(x). Does the static_cast incur any (significant)
operation at the machine-code level?

2. Why do we need the move? It converts an lvalue reference to an rvalue
reference, but we are returning by (r)value anyway, and we can always
safely return a reference when the return type is not a reference and
that creates a copy of the referent; is this correct?

T foo(T& t) { return t; } // returns a *copy* of t
T foo(T&& t) { return t; } // what about this?

What about not writing the explicit move: is it correct/desirable?

string operator+(string&& x, const string& y) { x += y; return x; }

--
Seungbeom Kim

SG

Sep 14, 2008, 5:47:29 AM

On 14 Sep., 05:09, Seungbeom Kim <musip...@bawi.org> wrote:
> 1. How is it more expensive by involving an extra move construction?
> As far as I understand, move(x) above is just a shorthand for
> static_cast<string&&>(x). Does the static_cast incur any (significant)
> operation at the machine-code level?

It is a little more expensive because a new temporary is created based
on an old one (by move construction) instead of recycling the old
temporary (by returning a reference to it).

> 2. Why do we need the move? It converts an lvalue reference to an rvalue
> reference, but we are returning by (r)value anyway, and we can always
> safely return a reference when the return type is not a reference and
> that creates a copy of the referent; is this correct?
>
> T foo(T& t) { return t; } // returns a *copy* of t
> T foo(T&& t) { return t; } // what about this?

The return value is also copy-constructed, because 't' is a named
reference and a named rvalue reference is treated as an lvalue. The
statement "return move(t);" forces move-construction of the return
value. A move() makes sense here because in the 2nd overload 't' is
guaranteed to be a temporary: lvalues select the more attractive 1st
overload (T&). So 't' won't be used anymore and can be moved from.

> What about not writing the explicit move: is it correct/desirable?
>
> string operator+(string&& x, const string& y) { x += y; return x; }

Correct and safe. But this will copy-construct a new temporary based
on an old temporary. Since the old temporary isn't needed anymore a
move-construction of the new temporary is as safe but potentially
faster.
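
A minimal sketch of the two return forms, assuming a hypothetical counting type `T` and hypothetical function names. Under the rules described in this thread the plain form copies; as far as I know, later language revisions (C++20 onward) treat a returned rvalue-reference parameter as movable, so the sketch only checks that the plain form costs exactly one construction of either kind:

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts copy and move constructions.
struct T {
    static int copies;
    static int moves;
    T() = default;
    T(const T&) { ++copies; }
    T(T&&) noexcept { ++moves; }
};
int T::copies = 0;
int T::moves = 0;

T plain_return(T&& t)  { return t; }             // t is a named reference
T explicit_move(T&& t) { return std::move(t); }  // always move-constructs the result

void check_return_forms()
{
    T::copies = T::moves = 0;
    const T& r1 = explicit_move(T());
    assert(T::moves == 1 && T::copies == 0);

    T::copies = T::moves = 0;
    const T& r2 = plain_return(T());
    // One construction either way; whether it is a copy or a move
    // depends on the language version the compiler implements.
    assert(T::copies + T::moves == 1);
    (void)r1; (void)r2;
}
```

Writing the move explicitly gives the same behavior in every language version, which is why it is spelled out in the operator+ example.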

Cheers,
SG


marc....@gmail.com

Sep 15, 2008, 7:16:38 PM

On 13 sep, 04:28, Howard Hinnant <howard.hinn...@gmail.com> wrote:
> Consider:
>
> string operator+(const string&, const string&);
> string&& operator+(string&& x, const string& y) {x += y; return x;}
>
> const string& d = a + b + c;
>
> "a+b" forms a temporary string which then binds to "x" in the second
> operator+ overload. That temporary is appended to and then returned by
> rvalue ref. d now refers to the temporary created by "a+b" and modified
> by "+= c". The lifetime of that temporary ends at the semicolon after
> "c". So d is a dangling reference.

When I read 8.5.3.5, it looks like it says that d does not refer to
the temporary created by a+b but to a newly created temporary that
gets initialized from the rvalue-ref. But if you say different, I
trust you.

Would there be any ill effect to changing the standard so that the
lifetime extension of 12.2 also applies to this case? It looks like a
fairly natural expectation.

> For the cost of an extra move construction (after an append) you can
> sleep better at night. :-)

Suboptimal solutions prevent me from sleeping well :-(

SG

Sep 16, 2008, 6:28:53 PM

On 16 Sep., 01:16, marc.gli...@gmail.com wrote:
> Would there be any ill effect to changing the standard so that the
> lifetime extension of 12.2 also applies to this case? It looks like a
> fairly natural expectation.

I also thought about this. The problem, however, is that the compiler
can't tell just by looking at a function prototype which object the
returned reference refers to. Consider:

// insert more operator+ overloads here (#1...#3)
string&& operator+(string&& a, string&& b); // #4

In this case the implementation of #4 could return a reference to 'a',
a reference to 'b', or any other reference. Under the hood, references
are probably returned in the form of a pointer. So, to extend the
lifetime of the right temporary, the compiler would need to know which
temporary is actually referenced (if any). This is not known at compile
time from the function prototype alone. It might even be a
runtime-dependent reference, as in this case:

string&& operator+(string&& a, string&& b)
{
    if (a.capacity() - a.length() < b.length()
        && b.capacity() - b.length() >= a.length()) {
        b.insert(0, a);
        return b;
    } else {
        a += b;
        return a;
    }
}

This makes extending the lifetime of the right temporary almost
impossible.
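
For comparison, a minimal sketch of the same runtime choice with a by-value return, assuming a hypothetical free function `concat` in place of operator+ (std::string already defines that operator). The buffer is still picked at run time, but the caller receives a fresh object, so no lifetime has to be tracked:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for operator+(string&&, string&&), returning by value.
std::string concat(std::string&& a, std::string&& b)
{
    const bool a_is_too_small = a.capacity() - a.size() < b.size();
    const bool b_has_room     = b.capacity() - b.size() >= a.size();
    if (a_is_too_small && b_has_room) {
        b.insert(0, a);        // reuse b's buffer
        return std::move(b);
    }
    a += b;                    // reuse (or grow) a's buffer
    return std::move(a);
}

void check_concat()
{
    std::string roomy;
    roomy.reserve(64);
    roomy += "world";
    // Which branch runs depends on the library's capacities; the result does not.
    assert(concat(std::string("hello "), std::move(roomy)) == "hello world");
    assert(concat(std::string(40, 'x'), std::string("y")) == std::string(40, 'x') + "y");
}
```

The price is the extra move construction discussed earlier in the thread.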

Cheers,
SG