Preferring operator=(T&& rhs) over operator=(T rhs) may allow a more efficient copy-swap idiom

oguz...@gmail.com

Apr 13, 2018, 2:11:20 AM

to ISO C++ Standard - Future Proposals

Let me know if this is a stupid idea or something that has already been considered. To my knowledge, if a class declares both of the following operator overloads, compilers report an ambiguity error when the right-hand side is an rvalue (probably as required by the standard):

C& C::operator=(C rhs);
C& C::operator=(C&& rhs);

However, if compilers could prefer the rvalue-reference version over the pass-by-value version for rvalue arguments (e.g. c2 = c1 + c1), that could lead to a more efficient implementation of the copy-swap idiom.

Below is sample code that demonstrates the problem. The issue is explained in the long comment block inside the main function:

#include <utility> // std::swap (lives in <utility> since C++11)

class C
{
    int* data;

     public:

     C() : data(nullptr) { }

     C(int data) : data(new int)
     {
         *(this->data) = data;
     }

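     C(const C& rhs) : data(rhs.data ? new int(*rhs.data) : nullptr)
     {
         // Deep copy. A default-constructed source holds nullptr,
         // so it must not be dereferenced.
     }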

     C(C&& rhs) : C()
     {
         // Move constructor is first creating a default
         // object and swapping it with the rvalue reference.
         swap(*this, rhs);
     }

     C& operator=(C rhs)
     {
         // We let the compiler copy into rhs and we swap the
         // current object with this copy. Together with the
         // move constructor above, this implements the copy-swap
         // idiom. Thanks to the move-constructor above, the
         // copy to rhs is not a deep copy if the input is an rvalue
         // reference.
         swap(*this, rhs);
         return *this;
     }

     // The function below is commented out because it fails compilation
     // due to ambiguity with the above function. However, if it worked
     // it could have saved us an extra call to the move constructor when
     // we make calls such as c2 = c1 + c1 (see operator+ below). If it had
     // worked, the temporary created from c1 + c1 would have been directly 
     // taken by rvalue reference instead of its copy  (albeit shallow) being
     // created. 
     
     /*
     C& operator=(C&& rhs)
     {
         swap(*this, rhs);
         return *this;
     }
     */

     C operator+(const C& rhs) const
     {
         C result(*data + *(rhs.data));
         return result;
     }

     friend void swap(C& lhs, C& rhs);

     ~C()
     {
         delete data;
     }
};

void swap(C& lhs, C& rhs)
{
    std::swap(lhs.data, rhs.data);
}

int main()
{
    C c1(7);
    C c2;

    // The following will first create the "result" inside operator+.
    // The return value will then get move-constructed from the result
    // (I'm assuming that -fno-elide-constructors option is used). Then
    // the "rhs" parameter of operator= will get move-constructed from
    // this return temporary.
    //
    // But if we could overload operator=(C&&), this second move-construction
    // could have been avoided as the return temporary could have been directly
    // captured by rvalue reference.
    //
    // Granted that we can implement operator=(const C&) and operator=(C&&)
    // to make assignment as efficient as possible for both lvalue and rvalue
    // inputs. However, to my knowledge, having operator=(const C&) would
    // essentially prevent us from having the copy-swap idiom.

    c2 = c1 + c1;

    return 0;
}

Nicolas Lesser

Apr 13, 2018, 2:33:18 AM

to std-pr...@isocpp.org

You can have the copy-swap idiom with an operator=(const C&); I don't know why you think otherwise.

C& operator=(const C& rhs) {
    C copy(rhs);
    swap(copy, *this);
    return *this;
}

This is essentially what your operator=(C) is doing, except that the copy is now explicit instead of being made by the compiler for you.

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/62809333-fd1a-4bde-8805-ccc0578a4284%40isocpp.org.

oguz...@gmail.com

Apr 13, 2018, 3:24:32 AM

to ISO C++ Standard - Future Proposals

I remember reading on Stack Overflow that letting the compiler take care of the copy may allow for a more efficient implementation in certain cases. See the following links:

-- https://stackoverflow.com/questions/3279543/what-is-the-copy-and-swap-idiom?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa
-- https://web.archive.org/web/20140113221447/http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/

What they say is that if you let the compiler make the copy (by taking the parameter by value) rather than making it yourself, the compiler may be able to elide the copy. But thinking about this more carefully: given a move-assignment operator, in any case where copy elision could have happened (because the argument is an rvalue), the move-assignment operator would be selected and the copy avoided. So in the presence of a move-assignment operator, making the copy yourself and letting the compiler make it would be equally efficient. Please correct me if this reasoning is wrong.

Oguz

Nicolas Lesser

Apr 13, 2018, 5:11:42 AM

to std-pr...@isocpp.org

On Fri, Apr 13, 2018 at 9:24 AM, <oguz...@gmail.com> wrote:

> I remember reading in SO forums that letting the compiler take care of the copy may allow for a more efficient implementation in certain cases. See the following links:
>
> -- https://stackoverflow.com/questions/3279543/what-is-the-copy-and-swap-idiom?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa
> -- https://web.archive.org/web/20140113221447/http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/
>
> What they say is that if you let the compiler do the copy, rather than doing it yourself, the compiler may accomplish copy-elision instead of pass-by-value.

Yup, or a move. In the worst case it does a copy.

> But now thinking about this more carefully, given that we have a move-assignment operator, in a case where copy elision could have happened (due to the actual parameter being an rvalue object), our move-assignment operator would kick in avoiding the copy.

Exactly.

> So in the presence of a move-assignment operator, doing the copy yourself or letting the compiler do it for us would be equally efficient. Please correct me if this reasoning is wrong.

Yes: for rvalues your move-assignment operator would be chosen instead (that's the whole point: avoiding copies for rvalues). Pass-by-value gets you the most efficient code possible if you don't want to write two overloads of your operator= (lvalue -> copy; xvalue -> move; prvalue -> nothing).

But if you are writing two overloads anyway, there is no need for pass-by-value: passing by reference is even better (lvalue -> copy; xvalue -> nothing; prvalue -> nothing). Still, because moves are generally cheap, the single pass-by-value overload is usually sufficient.
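The two cost tables above can be checked with a small counting experiment. The following is only a sketch: the types ByValue and TwoOverloads, the global counters, and the helper count_constructions are hypothetical names introduced for illustration and do not appear in the thread. It counts copy and move constructions only; the prvalue case relies on copy elision, which C++17 guarantees and which earlier standards merely permit.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counters, used only to observe which constructors run.
int g_copies = 0;
int g_moves = 0;

// Variant 1: a single by-value assignment operator (copy-and-swap).
struct ByValue {
    int* data = nullptr;
    ByValue() = default;
    explicit ByValue(int v) : data(new int(v)) {}
    ByValue(const ByValue& o) : data(o.data ? new int(*o.data) : nullptr) { ++g_copies; }
    ByValue(ByValue&& o) noexcept : data(o.data) { o.data = nullptr; ++g_moves; }
    ByValue& operator=(ByValue rhs) noexcept { std::swap(data, rhs.data); return *this; }
    ~ByValue() { delete data; }
};

// Variant 2: two reference overloads; the copy is made explicitly.
struct TwoOverloads {
    int* data = nullptr;
    TwoOverloads() = default;
    explicit TwoOverloads(int v) : data(new int(v)) {}
    TwoOverloads(const TwoOverloads& o) : data(o.data ? new int(*o.data) : nullptr) { ++g_copies; }
    TwoOverloads(TwoOverloads&& o) noexcept : data(o.data) { o.data = nullptr; ++g_moves; }
    TwoOverloads& operator=(const TwoOverloads& rhs) {
        TwoOverloads copy(rhs);      // explicit copy, then swap
        std::swap(data, copy.data);
        return *this;
    }
    TwoOverloads& operator=(TwoOverloads&& rhs) noexcept {
        std::swap(data, rhs.data);   // binds directly, no new object
        return *this;
    }
    ~TwoOverloads() { delete data; }
};

struct Counts { int copies; int moves; };

// Runs `assign` and reports how many copy/move constructions it triggered.
template <class F>
Counts count_constructions(F assign) {
    g_copies = 0;
    g_moves = 0;
    assign();
    return Counts{g_copies, g_moves};
}

void check_assignment_costs() {
    ByValue a(1), b;
    Counts lv = count_constructions([&] { b = a; });             // lvalue
    Counts xv = count_constructions([&] { b = std::move(a); });  // xvalue
    Counts pv = count_constructions([&] { b = ByValue(3); });    // prvalue
    assert(lv.copies == 1 && lv.moves == 0);
    assert(xv.copies == 0 && xv.moves == 1);
    assert(pv.copies == 0 && pv.moves == 0);  // construction elided

    TwoOverloads x(1), y;
    lv = count_constructions([&] { y = x; });
    xv = count_constructions([&] { y = std::move(x); });
    pv = count_constructions([&] { y = TwoOverloads(3); });
    assert(lv.copies == 1 && lv.moves == 0);
    assert(xv.copies == 0 && xv.moves == 0);
    assert(pv.copies == 0 && pv.moves == 0);
}
```

The only row that differs between the two variants is the xvalue one, where the by-value operator pays for one move construction.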

Richard Hodges

Apr 13, 2018, 7:04:47 AM

to std-pr...@isocpp.org

On 13 April 2018 at 07:11, <oguz...@gmail.com> wrote:

> Let me know if this is a stupid idea or something that is already considered. But to my knowledge, if we have the following operator overloads for a class, compilers will generate an ambiguity error (probably enforced by the standard):

Imagine an external function, foo:

extern void foo(Bar b);

where foo is defined in another translation unit.

That function is not expecting a reference; it is expecting a fully formed Bar to exist in its arguments (i.e. for all intents and purposes 'on the stack').

Assume at the call site:

foo(Bar(arguments, of, bar));

A well-behaved compiler will:

1. allocate space on the stack, in foo's argument stack frame

2. construct b right there on the stack.

3. assume that foo will destroy b.

In other words, copy elision will take place. The b belongs to foo. The caller constructed it on foo's behalf.
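A small way to observe this, sketched under assumptions: Bar below is a hypothetical instrumented stand-in with a two-argument constructor, and the three foo_* functions are illustrative names for the three signatures discussed in this message. The sketch only counts constructor calls; it says nothing about who runs the destructor, which depends on the platform ABI.

```cpp
#include <cassert>

// Hypothetical counters for the instrumented Bar below.
int g_ctor = 0;
int g_copy = 0;
int g_move = 0;

struct Bar {
    int x;
    Bar(int a, int b) : x(a + b) { ++g_ctor; }
    Bar(const Bar& o) : x(o.x) { ++g_copy; }
    Bar(Bar&& o) noexcept : x(o.x) { ++g_move; }
};

// The three signatures under discussion.
int foo_by_value(Bar b)            { return b.x; }
int foo_by_rvalue_ref(Bar&& b)     { return b.x; }
int foo_by_const_ref(const Bar& b) { return b.x; }

void reset_counts() { g_ctor = g_copy = g_move = 0; }

void check_single_construction() {
    // In every case the temporary is constructed exactly once and never
    // copied or moved. For the by-value call this relies on copy elision
    // (guaranteed since C++17, permitted before that).
    reset_counts();
    assert(foo_by_value(Bar(1, 2)) == 3);
    assert(g_ctor == 1 && g_copy == 0 && g_move == 0);

    reset_counts();
    assert(foo_by_rvalue_ref(Bar(1, 2)) == 3);
    assert(g_ctor == 1 && g_copy == 0 && g_move == 0);

    reset_counts();
    assert(foo_by_const_ref(Bar(1, 2)) == 3);
    assert(g_ctor == 1 && g_copy == 0 && g_move == 0);
}
```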

Now consider:

extern void foo(Bar&& b);

The same compiler will:

1. allocate space on the stack, in the caller's stack frame

2. construct b right there on the stack.

3. pass the address of b to foo

4. call b's destructor upon foo's return

Now consider:

extern void foo(Bar const& b);

The same compiler will:

1. allocate space on the stack, in the caller's stack frame

2. construct b right there on the stack.

3. pass the address of b to foo

4. call b's destructor upon foo's return

In the pass-by-reference cases the code incurs a cost of one extra operation on the stack - pushing the address of b. The only difference in the implementations is when the work of destruction is done. The same amount of work is actually done.

The question is: do we care about this minuscule extra cost?

If we absolutely do, then it's worth thinking about allowing the three overloads to coexist (though this may be a can of worms).

If we don't mind the push, it is probably best to leave it alone.

That's my 2c.

Does that seem reasonable?


floria...@gmail.com

Apr 13, 2018, 7:49:50 AM

to ISO C++ Standard - Future Proposals

In my experience, values are much easier to deal with, especially when you want to return them.
Consider a simple operator+.

If you want to use references and optimize for rvalues, you do:

struct Foo {
  Foo& operator+=(const Foo&);
};
Foo operator+(const Foo& a, const Foo& b) { Foo R(a); R += b; return R; }
Foo&& operator+(Foo&& a, const Foo& b) { a += b; return std::move(a); }
Foo&& operator+(const Foo& a, Foo&& b) { return std::move(b) + a; }
Foo&& operator+(Foo&& a, Foo&& b) { return std::move(a) + b; }

And if you do that, you might end up with dangling references in a few cases.

If you do the same with passing by value:

struct Foo {
  Foo& operator+=(const Foo&); // unchanged
};
Foo operator+(Foo a, Foo b) { a += b; return a; }

The code here is much simpler, you cannot have dangling references, and in most cases it is at least as optimized as the reference implementation (though the language does not guarantee that, I think).
(Maybe you still need to take the second argument by reference, though...)

Arthur O'Dwyer

Apr 13, 2018, 7:40:56 PM

to ISO C++ Standard - Future Proposals, floria...@gmail.com

On Friday, April 13, 2018 at 4:49:50 AM UTC-7, floria...@gmail.com wrote:

> In my experience, values are much easier to deal with, especially when you want to return them.
> Consider a simple operator+
>
> If you want to use references and optimize for rvalues, you do:
>
> struct Foo {
>   Foo& operator+=(const Foo&);
> };
> Foo operator+(const Foo& a, const Foo& b) { Foo R; R += a ; return R; }
> Foo&& operator+(Foo&& a, const Foo& b) { a += b; return std::move(a); }
> Foo&& operator+(const Foo& a, Foo&& b) { return std::move(b) + a; }
> Foo&& operator+(Foo&& a, Foo&& b) { return std::move(a) + b; }
>
> And if you do that, you might end up with dangling references in few cases.

Right; and also I believe you silently break lifetime-extension, for people who care about that kind of thing.

Foo a, b, c;
const auto& d = (a + b) + c; // oops: d dangles, because the temporary from (a + b) dies at the end of this statement

I think (with about 90% confidence) that a sufficient and maximally optimizable version can look like this:

struct Foo {
    Foo& operator+=(const Foo&);
};

Foo operator+(Foo a, const Foo& b) { a += b; return a; }
Foo operator+(const Foo& a, Foo&& b) { b += a; return b; }

Notice that

- we always return by value (which will move, not copy, in cases where it cannot be elided)

- we always take by-const-ref as a general rule, if we do not need to make a copy

- we always take by-value as a general rule, if we do need to make a copy

- we have one special overload (const T&, T&&) that breaks the general rules for performance's sake

Notice that "Foo e = a + b + c + d" does exactly one more "move" than we'd like it to, but I think that is unavoidable as of C++17.

https://wandbox.org/permlink/gc61gvtGTwy24Mby
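For readers who want to count the operations themselves, here is a minimal sketch of the two-overload scheme with instrumented constructors. It is not the code behind the link above: the int payload, the global counters, and check_chain are hypothetical additions. It also deviates in one place, writing return std::move(b) in the second overload, because before C++20 a plain return b for an rvalue-reference parameter copies rather than moves. Note too that the second overload computes b + a, so it assumes addition is commutative.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counters for observing copy and move constructions.
int g_copies = 0;
int g_moves = 0;

struct Foo {
    int v = 0;  // illustrative payload
    Foo() = default;
    explicit Foo(int x) : v(x) {}
    Foo(const Foo& o) : v(o.v) { ++g_copies; }
    Foo(Foo&& o) noexcept : v(o.v) { ++g_moves; }
    Foo& operator+=(const Foo& o) { v += o.v; return *this; }
};

// General rule: take by value when a copy is needed, return by value.
Foo operator+(Foo a, const Foo& b) { a += b; return a; }

// Special overload: reuse the rvalue on the right-hand side.
// std::move is explicit so that C++17 and earlier also move here.
Foo operator+(const Foo& a, Foo&& b) { b += a; return std::move(b); }

void check_chain() {
    Foo a(1), b(2), c(3), d(4);

    g_copies = 0;
    g_moves = 0;
    Foo e = a + b + c + d;
    assert(e.v == 10);
    // Only `a` is copied, into the by-value parameter of the first call;
    // every later step reuses a temporary.
    assert(g_copies == 1);

    g_copies = 0;
    g_moves = 0;
    Foo f = a + Foo(5);   // picks the (const Foo&, Foo&&) overload
    assert(f.v == 6);
    assert(g_copies == 0);
}
```

Inspecting g_moves after each expression shows how many moves a particular compiler and language mode actually perform.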

I have a blog post on the subject of unavoidable moves.

https://quuxplusone.github.io/blog/2018/03/29/the-superconstructing-super-elider/

Thoughts?

–Arthur

oguz...@gmail.com

Apr 14, 2018, 5:43:27 AM

to ISO C++ Standard - Future Proposals

I think this is very reasonable. My original question was about the case where copy elision is turned off. However, you are right that in a real-world setting copy elision will take place, and maximum efficiency can be gained by implementing all three versions. Whether this minimal gain in efficiency is worth the effort of adding an extra function is something everybody should weigh for their own application.

In my original question I also wanted to find out whether the rules could be set up to prefer an rvalue reference over a value parameter in overload resolution. An rvalue reference is chosen over an lvalue reference, so why is an rvalue reference not chosen over a value parameter? Is there a case where a programmer would prefer pass-by-value for a temporary object?

Thanks,

Oguz

oguz...@gmail.com

Apr 14, 2018, 5:47:53 AM

to ISO C++ Standard - Future Proposals, oguz...@gmail.com

Oops, never mind. You've actually answered this question. When copy elision applies, a value parameter can be faster than an rvalue reference (or an lvalue reference, for that matter)!

Cheers,

Oguz

oguz...@gmail.com

Apr 18, 2018, 10:56:44 PM

to ISO C++ Standard - Future Proposals

Thinking and experimenting a bit more about it: the three overloads cannot coexist, right?

My observation has been that as soon as you have "void foo(Bar b)" together with either reference version, you start to get an ambiguous-overload error from the compiler. This doesn't work:

class C
{
};

void f(C a)
{
}

void f(const C& a)
{
}

void f(C&& a)
{
}

int main()
{
    f(C()); // error: the call is ambiguous between f(C) and f(C&&)
    return 0;
}
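Which combinations are callable can be checked at compile time without tripping the error itself. The sketch below is an illustration under assumptions: the tag types A, B and D, the trait can_call_f, and the function which are hypothetical names, not part of the code above. Each tag type gets a different combination of overloads of f, and the trait reports whether a call with an lvalue (T&) or an rvalue (T) argument resolves to a single overload.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical tag types; each has its own combination of overloads of f.
struct A {};  // by value + rvalue reference
struct B {};  // all three
struct D {};  // const lvalue reference + rvalue reference

void f(A) {}
void f(A&&) {}

void f(B) {}
void f(const B&) {}
void f(B&&) {}

void f(const D&) {}
void f(D&&) {}

// can_call_f<T>::value is true when f(std::declval<T>()) is well-formed.
// An ambiguous call is a substitution failure, so it yields false.
template <class T, class = void>
struct can_call_f : std::false_type {};
template <class T>
struct can_call_f<T, decltype(f(std::declval<T>()))> : std::true_type {};

static_assert(can_call_f<A&>::value,  "lvalue: only f(A) is viable");
static_assert(!can_call_f<A>::value,  "rvalue: f(A) vs f(A&&) is ambiguous");
static_assert(!can_call_f<B&>::value, "lvalue: f(B) vs f(const B&) is ambiguous");
static_assert(!can_call_f<B>::value,  "rvalue: f(B) vs f(B&&) is ambiguous");
static_assert(can_call_f<D&>::value,  "lvalue: f(const D&) is chosen");
static_assert(can_call_f<D>::value,   "rvalue: f(D&&) is chosen");

// With two reference overloads, rvalues prefer the rvalue reference.
int which(const D&) { return 1; }
int which(D&&)      { return 2; }

void check_preference() {
    D d;
    assert(which(d) == 1);
    assert(which(D()) == 2);
    assert(which(std::move(d)) == 2);
}
```

With all three overloads present, neither lvalue nor rvalue calls resolve; the pair of reference overloads is the only combination here that accepts both.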

Cheers,

Oguz

On Friday, April 13, 2018 at 2:04:47 PM UTC+3, Richard Hodges wrote:
