Better Performance in Polymorphic Programming: Trivially Swappable

216 views
Skip to first unread message

Mingxin Wang

unread,
Jun 28, 2018, 11:45:08 PM6/28/18
to ISO C++ Standard - Future Proposals
During the process of implementing the proposal P0957 (https://wg21.link/p0957), I found that if the concept of "Trivially Swappable" is defined, the performance of the implementation of will be improved to a certain extent without reducing usability.

For instance, in the move constructor of a value-semantics-based polymorphic wrapper (e.g., `std::function`, `std::any` or the `proxy` in P0957) with SBO (Small Buffer Optimization, aka. SOO, Small Object Optimization), if the value being type-erased is not guaranteed to be trivially swappable, the implementation should always make an indirect call to the concrete move constructor when moving the value from one buffer to another. If we have the concept of "Trivially Swappable", the overhead can be eliminated. This feature could also help in GC-based situations.

I am also wondering if this concept could help in generating default move constructors.

I am looking forward to your comments!

Mingxin Wang

Nicol Bolas

unread,
Jun 29, 2018, 12:16:05 AM6/29/18
to ISO C++ Standard - Future Proposals
On Thursday, June 28, 2018 at 11:45:08 PM UTC-4, Mingxin Wang wrote:
During the process of implementing the proposal P0957 (https://wg21.link/p0957), I found that if the concept of "Trivially Swappable" is defined, the performance of the implementation of will be improved to a certain extent without reducing usability.

OK, so... what would this concept mean? Can you provide a definition of these requirements and what they would allow you to do?

I am also wondering if this concept could help in generating default move constructors.

Do we need help generating default move constructors? Is `= default` not good enough? Or are you talking about something else?

Mingxin Wang

unread,
Jun 29, 2018, 3:19:41 AM6/29/18
to ISO C++ Standard - Future Proposals

On Friday, June 29, 2018 at 12:16:05 PM UTC+8, Nicol Bolas wrote:
On Thursday, June 28, 2018 at 11:45:08 PM UTC-4, Mingxin Wang wrote:
During the process of implementing the proposal P0957 (https://wg21.link/p0957), I found that if the concept of "Trivially Swappable" is defined, the performance of the implementation of will be improved to a certain extent without reducing usability.

OK, so... what would this concept mean? Can you provide a definition of these requirements and what they would allow you to do?

Informally, a type meets the TriviallySwappable requirements if the "std::swap" function overload of this type performs bitwise swap operation.
 

I am also wondering if this concept could help in generating default move constructors.

Do we need help generating default move constructors? Is `= default` not good enough? Or are you talking about something else?

If the construction of a type involves heap allocation with exclusive ownership, e.g. `std::unique_ptr`, the default move constructor will not work. But as long as `std::unique_ptr` is TriviallySwappable and DefaultConstructible, the move constructor could be generated with the default constructor and bitwise swap, which is equivalent to the semantics defined in the standard.

Gašper Ažman

unread,
Jun 29, 2018, 4:56:01 AM6/29/18
to std-pr...@isocpp.org
How does this interface with the concept of relocatable? One would think relocatable implies trivially swappable, same as noexcept movable implies noexcept swappable.

G

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/18469feb-bdda-466d-ba8c-37933c1ea807%40isocpp.org.

inkwizyt...@gmail.com

unread,
Jun 29, 2018, 8:32:42 AM6/29/18
to ISO C++ Standard - Future Proposals


On Friday, June 29, 2018 at 9:19:41 AM UTC+2, Mingxin Wang wrote:

On Friday, June 29, 2018 at 12:16:05 PM UTC+8, Nicol Bolas wrote:
On Thursday, June 28, 2018 at 11:45:08 PM UTC-4, Mingxin Wang wrote:
During the process of implementing the proposal P0957 (https://wg21.link/p0957), I found that if the concept of "Trivially Swappable" is defined, the performance of the implementation of will be improved to a certain extent without reducing usability.

OK, so... what would this concept mean? Can you provide a definition of these requirements and what they would allow you to do?

Informally, a type meets the TriviallySwappable requirements if the "std::swap" function overload of this type performs bitwise swap operation.
 
And how you detect/define this? Probably best way would be if function `bit_swap(&a, &b)` exists. This function could work for relocate too (you swap with uninitialized memory).

Thiago Macieira

unread,
Jun 29, 2018, 6:08:10 PM6/29/18
to std-pr...@isocpp.org
On Friday, 29 June 2018 00:19:40 PDT Mingxin Wang wrote:
> Informally, a type meets the *TriviallySwappable* requirements if the
> "std::swap" function overload of this type performs bitwise swap operation.

Why do you need the concept? Why can't you just use std::swap for your use-
case?

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center



Arthur O'Dwyer

unread,
Jun 29, 2018, 6:51:12 PM6/29/18
to ISO C++ Standard - Future Proposals
On Friday, June 29, 2018 at 1:56:01 AM UTC-7, Gašper Ažman wrote:
How does this interface with the concept of relocatable? One would think relocatable implies trivially swappable, same as noexcept movable implies noexcept swappable.

In my C++Now 2018 talk on "trivially relocatable," I promised in the outline to talk about its relationship to "trivially swappable," and then did not actually do so — sorry!
Essentially, yes, if a type is trivially relocatable then it intuitively ought to be considered trivially swappable.  However, there are two minor caveats that I can think of off the top of my head (and the reason I didn't talk about it at C++Now is that I haven't thought about it much, and the reason for *that* is that I don't have a motivating use-case).

Caveat (A): Trivial relocation can be optimized into memcpy() or memmove().  Trivial swap cannot be optimized into mem-anything, because there is no libc primitive for swapping arrays of bytes. We could certainly propose to add a __builtin_memswap() that would perform the swap "in-place" in cache-line-sized blocks, but I'm not aware of any proposals nor prior art in that area.

Caveat (B): Notice that whereas "relocate" means "move-construct, then destroy", we might say that "swap" means "move-construct, then move-assign, then move-assign, then destroy." (This being the operation done by the unconstrained std::swap template.)  This involves a relationship among 3 operations, which might be a little scarier than relocate's relationship among 2 operations, which is scarier than the current Standard Library's "trivially X" traits which all involve only a single operation.

Caveat (C): For small types like unique_ptr, __builtin_memswap() will not be any faster than the unconstrained std::swap template. The point of optimizing into mem-anything is to get speedups on large arrays, such as during std::vector reallocation.  std::vector swapping is already fast, and cannot be made faster by __builtin_memswap().  Now, std::array swapping could be made faster; consider—

    std::array<std::unique_ptr<int>, 10000> a;
    std::array<std::unique_ptr<int>, 10000> b;
    a.swap(b);  // could probably get a factor-of-2 speedup on this operation by using __builtin_memswap

But, this is not an operation that happens often enough in real programs for anyone to get really motivated about.

TLDR I think "trivially swappable" is a straightforward corollary to "trivially relocatable", but its cost-of-specification might be a bit higher, and its benefit-in-performance is essentially zero as far as any use-case I can think of.

–Arthur

Mingxin Wang

unread,
Jun 29, 2018, 8:08:23 PM6/29/18
to ISO C++ Standard - Future Proposals
On Friday, June 29, 2018 at 8:32:42 PM UTC+8, Marcin Jaczewski wrote:
On Friday, June 29, 2018 at 9:19:41 AM UTC+2, Mingxin Wang wrote:
Informally, a type meets the TriviallySwappable requirements if the "std::swap" function overload of this type performs bitwise swap operation.
 
And how you detect/define this? Probably best way would be if function `bit_swap(&a, &b)` exists. This function could work for relocate too (you swap with uninitialized memory).

I think we can have a special type traits `std::is_trivially_swappable`, and `std::is_trivially_swappable<T>::value` is always `true`. Users may provide specializations that make it `false` indicating a type is NOT trivially swappable. After all, in most cases I have seen where you do not want to print the absolute value of `this`, the types always meet the TriviallySwappable requirements.

Nicol Bolas

unread,
Jun 29, 2018, 8:58:42 PM6/29/18
to ISO C++ Standard - Future Proposals
On Friday, June 29, 2018 at 6:51:12 PM UTC-4, Arthur O'Dwyer wrote:
On Friday, June 29, 2018 at 1:56:01 AM UTC-7, Gašper Ažman wrote:
How does this interface with the concept of relocatable? One would think relocatable implies trivially swappable, same as noexcept movable implies noexcept swappable.

In my C++Now 2018 talk on "trivially relocatable," I promised in the outline to talk about its relationship to "trivially swappable," and then did not actually do so — sorry!
Essentially, yes, if a type is trivially relocatable then it intuitively ought to be considered trivially swappable.  However, there are two minor caveats that I can think of off the top of my head (and the reason I didn't talk about it at C++Now is that I haven't thought about it much, and the reason for *that* is that I don't have a motivating use-case).

Caveat (A): Trivial relocation can be optimized into memcpy() or memmove().  Trivial swap cannot be optimized into mem-anything, because there is no libc primitive for swapping arrays of bytes. We could certainly propose to add a __builtin_memswap() that would perform the swap "in-place" in cache-line-sized blocks, but I'm not aware of any proposals nor prior art in that area.

Performing a byte copy on a TrivialRelocatable object into another object of that type leaves the original object in a conceptually invalid state. That is, even though it is technically valid, if you called its destructor, that would be really bad. You will have violated the TrivialRelocatable agreement.

However, if you take an object which is invalid through TrivialRelocation, and perform a trivial relocation operation into that, then the object should be considered valid again.

A trivial swap operation is merely this:

void trivial_swap(T &a, T &b)
{
  std
::byte buff[sizeof(T)];
  memcpy
(buff, &a, sizeof(T));
  memcpy
(&a, &b, sizeof(T)); //b is no longer valid.
  memcpy
(&b, buff, sizeof(T)); //b is valid again.
}

I believe the validity of this code naturally falls out of a type being TriviallyRelocatable.

`b` stopped being valid because we copied its values into a living object of type `T`, and thus that object took ownership over those values. But `buff` contains values of a now orphaned object. By copying that value into `b`, we unorphan that object's values.

Mingxin Wang

unread,
Jun 29, 2018, 9:03:26 PM6/29/18
to ISO C++ Standard - Future Proposals
On Saturday, June 30, 2018 at 6:51:12 AM UTC+8, Arthur O'Dwyer wrote:
Caveat (A): Trivial relocation can be optimized into memcpy() or memmove().  Trivial swap cannot be optimized into mem-anything, because there is no libc primitive for swapping arrays of bytes. We could certainly propose to add a __builtin_memswap() that would perform the swap "in-place" in cache-line-sized blocks, but I'm not aware of any proposals nor prior art in that area.

I think it is acceptible just to make it "implementation-defined". If the type to swap is small, the compiler may generate specific instructions performing efficient copy operations for 1, 2, 4, 8... bytes (or tricks like `a ^= b ^= a ^= b`). Otherswise, the implementation may invoke `memcpy` twice. This is the usual implementation I have seen for the specializations of `std::swap`.

Caveat (B): Notice that whereas "relocate" means "move-construct, then destroy", we might say that "swap" means "move-construct, then move-assign, then move-assign, then destroy." (This being the operation done by the unconstrained std::swap template.)  This involves a relationship among 3 operations, which might be a little scarier than relocate's relationship among 2 operations, which is scarier than the current Standard Library's "trivially X" traits which all involve only a single operation.

By saying "I am also wondering if this concept could help in generating default move constructors", I am thinking of the possibility to make "swap" a primitive, in other words, to make "swap beneath move", and the generated move constructors are always exception-safe.

Caveat (C): For small types like unique_ptr, __builtin_memswap() will not be any faster than the unconstrained std::swap template. The point of optimizing into mem-anything is to get speedups on large arrays, such as during std::vector reallocation.  std::vector swapping is already fast, and cannot be made faster by __builtin_memswap().  Now, std::array swapping could be made faster; consider—

    std::array<std::unique_ptr<int>, 10000> a;
    std::array<std::unique_ptr<int>, 10000> b;
    a.swap(b);  // could probably get a factor-of-2 speedup on this operation by using __builtin_memswap

But, this is not an operation that happens often enough in real programs for anyone to get really motivated about.

The main motivation for the TriviallySwappable requirements is to avoid unnecessary runtime dispatch in polymorphic use cases, and it has been briefly illustrated in my initial post.

Nicol Bolas

unread,
Jun 29, 2018, 9:04:50 PM6/29/18
to ISO C++ Standard - Future Proposals
On Friday, June 29, 2018 at 6:08:10 PM UTC-4, Thiago Macieira wrote:
On Friday, 29 June 2018 00:19:40 PDT Mingxin Wang wrote:
> Informally, a type meets the *TriviallySwappable* requirements if the
> "std::swap" function overload of this type performs bitwise swap operation.

Why do you need the concept? Why can't you just use std::swap for your use-
case?

He didn't really explain the problem very well. It's a performance issue, not a functionality issue.

Say you have a type-erased type like `any`. It's storing some type-erased value, and it's using small buffer optimization. If you move an `any`, and the stored object fits within the small buffer (like a `unique_ptr<T>`), then you can only move it by invoking `unique_ptr<T>`'s move constructor. This requires an indirect call through the type-erasure machinery.

That's slow.

A swap operation would require 3 of these moves. That's really slow.

However, if the `any` could, at swap time, detect that its contents were TriviallySwappable, it could perform the swap with 3 memcpy operations.

Nicol Bolas

unread,
Jun 29, 2018, 9:09:21 PM6/29/18
to ISO C++ Standard - Future Proposals
On Friday, June 29, 2018 at 9:03:26 PM UTC-4, Mingxin Wang wrote:
On Saturday, June 30, 2018 at 6:51:12 AM UTC+8, Arthur O'Dwyer wrote:
Caveat (A): Trivial relocation can be optimized into memcpy() or memmove().  Trivial swap cannot be optimized into mem-anything, because there is no libc primitive for swapping arrays of bytes. We could certainly propose to add a __builtin_memswap() that would perform the swap "in-place" in cache-line-sized blocks, but I'm not aware of any proposals nor prior art in that area.

I think it is acceptible just to make it "implementation-defined". If the type to swap is small, the compiler may generate specific instructions performing efficient copy operations for 1, 2, 4, 8... bytes (or tricks like `a ^= b ^= a ^= b`). Otherswise, the implementation may invoke `memcpy` twice. This is the usual implementation I have seen for the specializations of `std::swap`.

Caveat (B): Notice that whereas "relocate" means "move-construct, then destroy", we might say that "swap" means "move-construct, then move-assign, then move-assign, then destroy." (This being the operation done by the unconstrained std::swap template.)  This involves a relationship among 3 operations, which might be a little scarier than relocate's relationship among 2 operations, which is scarier than the current Standard Library's "trivially X" traits which all involve only a single operation.

By saying "I am also wondering if this concept could help in generating default move constructors", I am thinking of the possibility to make "swap" a primitive, in other words, to make "swap beneath move", and the generated move constructors are always exception-safe.

That would not have that effect. A type with a throwing move constructor is one that cannot be empty, and therefore allocates state even if it is empty. The classic example being `std::list` implementations that are required to have a single node. Your "swap primitive" wouldn't be able to allocate that memory, so it could not use that implementation.

Compiler-generated move constructors are always as exception-safe as the move constructors they call. And your feature here can't change that.

Mingxin Wang

unread,
Jun 29, 2018, 10:10:33 PM6/29/18
to ISO C++ Standard - Future Proposals
I do not see the difference between allocating constructions and non-allocating constructions. Generating move constructors with `swap` requires the types to be trivially swappable and default constructible, rather than trivially default constructible. Thus I think the move constructor of `std::list` can theoretically be generated with `swap`.

Compiler-generated move constructors are always as exception-safe as the move constructors they call. And your feature here can't change that.

You are right about that. However, the default constructors are not always correct, e.g. for `std::unique_ptr`, because there is a chance for the default move constructors to have different semantics from other hand-written constructors. Generating move constructors with `swap` could avoid such abuse.

Nicol Bolas

unread,
Jun 29, 2018, 11:23:29 PM6/29/18
to ISO C++ Standard - Future Proposals
On Friday, June 29, 2018 at 10:10:33 PM UTC-4, Mingxin Wang wrote:
On Saturday, June 30, 2018 at 9:09:21 AM UTC+8, Nicol Bolas wrote:
On Friday, June 29, 2018 at 9:03:26 PM UTC-4, Mingxin Wang wrote:
By saying "I am also wondering if this concept could help in generating default move constructors", I am thinking of the possibility to make "swap" a primitive, in other words, to make "swap beneath move", and the generated move constructors are always exception-safe.

That would not have that effect. A type with a throwing move constructor is one that cannot be empty, and therefore allocates state even if it is empty. The classic example being `std::list` implementations that are required to have a single node. Your "swap primitive" wouldn't be able to allocate that memory, so it could not use that implementation.

I do not see the difference between allocating constructions and non-allocating constructions. Generating move constructors with `swap` requires the types to be trivially swappable and default constructible, rather than trivially default constructible. Thus I think the move constructor of `std::list` can theoretically be generated with `swap`.

It could, but it still wouldn't be noexcept because `std::list`'s default constructor is not noexcept. Or more specifically, it is not required to be.

So you've gained nothing.

Also, trivial swapping should not require default constructible. The whole point of trivial-anyability is that it can be done via memcpy of bytes. So a trivial swap should be able to work through a byte array.

Compiler-generated move constructors are always as exception-safe as the move constructors they call. And your feature here can't change that.

You are right about that. However, the default constructors are not always correct, e.g. for `std::unique_ptr`, because there is a chance for the default move constructors to have different semantics from other hand-written constructors. Generating move constructors with `swap` could avoid such abuse.

The ability of hand-written constructors to have different behavior from default constructors is not "abuse". It's why you can write move constructors at all.

Default move constructors should not try to be fancy. They should provide very simple, very obvious semantics: move each element. The fancier the compiler tries to get, the more likely it is that something will go wrong.

Arthur O'Dwyer

unread,
Jun 29, 2018, 11:58:09 PM6/29/18
to ISO C++ Standard - Future Proposals
On Fri, Jun 29, 2018 at 5:58 PM, Nicol Bolas <jmck...@gmail.com> wrote:
On Friday, June 29, 2018 at 6:51:12 PM UTC-4, Arthur O'Dwyer wrote:
On Friday, June 29, 2018 at 1:56:01 AM UTC-7, Gašper Ažman wrote:
How does this interface with the concept of relocatable? One would think relocatable implies trivially swappable, same as noexcept movable implies noexcept swappable.

In my C++Now 2018 talk on "trivially relocatable," I promised in the outline to talk about its relationship to "trivially swappable," and then did not actually do so — sorry!
Essentially, yes, if a type is trivially relocatable then it intuitively ought to be considered trivially swappable.  However, there are two minor caveats that I can think of off the top of my head (and the reason I didn't talk about it at C++Now is that I haven't thought about it much, and the reason for *that* is that I don't have a motivating use-case).

Caveat (A): Trivial relocation can be optimized into memcpy() or memmove().  Trivial swap cannot be optimized into mem-anything, because there is no libc primitive for swapping arrays of bytes. We could certainly propose to add a __builtin_memswap() that would perform the swap "in-place" in cache-line-sized blocks, but I'm not aware of any proposals nor prior art in that area.

void trivial_swap(T &a, T &b)
{
  std
::byte buff[sizeof(T)];
  memcpy
(buff, &a, sizeof(T));
  memcpy
(&a, &b, sizeof(T)); //b is no longer valid.
  memcpy
(&b, buff, sizeof(T)); //b is valid again.
}

I believe the validity of this code naturally falls out of a type being TriviallyRelocatable.

Correct. What you've written there seems like essentially the same thing you'd get from the unconstrained std::swap template after optimization, which is why I went on to say that it didn't seem like the cost/benefit was there for small objects. (And, as I wrote, for large objects trivial swap cannot be optimized into mem-anything, because there is no libc primitive for swapping arrays of bytes.)

However, I just went and checked Godbolt, and in fact both GCC and Clang do have real trouble optimizing the unconstrained std::swap template for unique_ptr!
So the benefit in practice (to specializing std::swap for trivially relocatable types) would at least be non-zero.
But I'd still want to see a use-case that measurably benefits from an O(1)-faster swap operation.

–Arthur

Nicol Bolas

unread,
Jun 30, 2018, 12:18:44 AM6/30/18
to ISO C++ Standard - Future Proposals
The OP is attempting to avoid indirect calls during swapping of type-erased objects with small storage buffers. That's the use case. Swaps would be cheap if the object always contained a pointer to heap allocated memory. But with small storage optimization, that's not possible. If the object lives in the small storage buffer, then you have to call its move constructor.

So just look at the code for swapping `any` objects. It's not pretty.

The goal of this notion is to restore the ability to have cheap swapping.

Now granted, TriviallyRelocatable gives him exactly what he wants. So I don't see a need for yet another Trivially-whatever concept.

Oh, and BTW: the alignas part on your byte buffer isn't necessary; trivial copy operations to an array of bytes don't have to be aligned to the alignment of the original object. That only matters if you're copying into an object of the appropriate type.

Thiago Macieira

unread,
Jun 30, 2018, 1:32:55 AM6/30/18
to std-pr...@isocpp.org
On Friday, 29 June 2018 18:04:50 PDT Nicol Bolas wrote:
> On Friday, June 29, 2018 at 6:08:10 PM UTC-4, Thiago Macieira wrote:
> > On Friday, 29 June 2018 00:19:40 PDT Mingxin Wang wrote:
> > > Informally, a type meets the *TriviallySwappable* requirements if the
> > > "std::swap" function overload of this type performs bitwise swap
> >
> > operation.
> >
> > Why do you need the concept? Why can't you just use std::swap for your
> > use-
> > case?
>
> He didn't really explain the problem very well. It's a performance issue,
> not a functionality issue.

Ok, so XY issue. He has a problem X (performance), he thinks he can solve it
with Y (concept) and asked about Y.

> Say you have a type-erased type like `any`. It's storing some type-erased
> value, and it's using small buffer optimization. If you move an `any`, and
> the stored object fits within the small buffer (like a `unique_ptr<T>`),
> then you can only move it by invoking `unique_ptr<T>`'s move constructor.
> This requires an indirect call through the type-erasure machinery.

Sorry, but why does it? Why can't there be a std::swap(std::any &, std::any &)
that knows that it can simply swap the internals, fast?

> A swap operation would require 3 of these moves. That's really slow.
>
> However, if the `any` could, at swap time, detect that its contents were
> TriviallySwappable, it could perform the swap with 3 memcpy operations.

std::any would know that its contents are trivially swappable without the need
for the concept. It knows that by design.

Mingxin Wang

unread,
Jun 30, 2018, 5:47:20 AM6/30/18
to ISO C++ Standard - Future Proposals
On Saturday, June 30, 2018 at 1:32:55 PM UTC+8, Thiago Macieira wrote:

> Say you have a type-erased type like `any`. It's storing some type-erased
> value, and it's using small buffer optimization. If you move an `any`, and
> the stored object fits within the small buffer (like a `unique_ptr<T>`),
> then you can only move it by invoking `unique_ptr<T>`'s move constructor.
> This requires an indirect call through the type-erasure machinery.

Sorry, but why does it? Why can't there be a std::swap(std::any &, std::any &)
that knows that it can simply swap the internals, fast?

In short, the problem is that: in order to stay compatible with non-trivially-swappable objects, it would be difficult to implement polymorphic facilities like `std::any` without dispatching of similar code most of the time. If the polymorphic facilities no longer support non-trivially-swappable objects, there will be a considerable performance improvement without decline in practical usability.

> A swap operation would require 3 of these moves. That's really slow.
>
> However, if the `any` could, at swap time, detect that its contents were
> TriviallySwappable, it could perform the swap with 3 memcpy operations.

std::any would know that its contents are trivially swappable without the need
for the concept. It knows that by design.

Yes, but as long as they support non-trivially-swappable ones on construction, they could only know about the object at runtime when using them.

Mingxin Wang

unread,
Jun 30, 2018, 6:10:49 AM6/30/18
to ISO C++ Standard - Future Proposals
After learning about your talk, I am pretty sure that the semantics of "TriviallyRelocatable" is exactly what I want, and SBO in polymorphism could be a strong motivation. Is there any paper on this concept? I think the PFA (P0957) that I am currently working on will be a loyal referer.

Nicol Bolas

unread,
Jun 30, 2018, 9:26:02 AM6/30/18
to ISO C++ Standard - Future Proposals
On Saturday, June 30, 2018 at 1:32:55 AM UTC-4, Thiago Macieira wrote:
On Friday, 29 June 2018 18:04:50 PDT Nicol Bolas wrote:
> On Friday, June 29, 2018 at 6:08:10 PM UTC-4, Thiago Macieira wrote:
> > On Friday, 29 June 2018 00:19:40 PDT Mingxin Wang wrote:
> > > Informally, a type meets the *TriviallySwappable* requirements if the
> > > "std::swap" function overload of this type performs bitwise swap
> >
> > operation.
> >
> > Why do you need the concept? Why can't you just use std::swap for your
> > use-
> > case?
>
> He didn't really explain the problem very well. It's a performance issue,
> not a functionality issue.

Ok, so XY issue. He has a problem X (performance), he thinks he can solve it
with Y (concept) and asked about Y.

Isn't every proposal an XY issue using that reasoning? That same XY issue is why TriviallyCopyable exists.


> Say you have a type-erased type like `any`. It's storing some type-erased
> value, and it's using small buffer optimization. If you move an `any`, and
> the stored object fits within the small buffer (like a `unique_ptr<T>`),
> then you can only move it by invoking `unique_ptr<T>`'s move constructor.
> This requires an indirect call through the type-erasure machinery.

Sorry, but why does it? Why can't there be a std::swap(std::any &, std::any &)
that knows that it can simply swap the internals, fast?

The TriviallyCopyable designation makes it legal for you to do byte-copies of such objects. These are the only types where that is legal. `unique_ptr<T>` is not TriviallyCopyable, but it is logically TriviallySwappable. That is, given an understanding of how it's implementation (and compilers) works, you ought to be able to do byte copies between two objects of that type.

However, just because it "ought" to work doesn't mean it is legal by the C++ standard. TriviallyCopyable is the only grouping where byte copying of any form is allowed.

Furthermore, let's say that std::any can break the rules. Well, that would only apply to standard library defined types. So if you tried to store a non-standard type in `std::any`, it wouldn't be able to guess whether the type was TriviallySwappable or not. Nor would users be able to write their own type-erased types that can take advantage of that tool.

> A swap operation would require 3 of these moves. That's really slow.
>
> However, if the `any` could, at swap time, detect that its contents were
> TriviallySwappable, it could perform the swap with 3 memcpy operations.

std::any would know that its contents are trivially swappable without the need
for the concept. It knows that by design.

It can't know that because "trivially swappable" isn't a thing. With no standard-defined mechanism to key into, the only thing it could do is pick a number of standard-defined types and declare that they fit the bill.

Tony V E

unread,
Jun 30, 2018, 11:13:26 AM6/30/18
to Thiago Macieira
> std::any would know that its contents are trivially swappable without the need 
for the concept. It knows that by design

‎If the type inside the any has a pointer to itself, you can't swap the bytes.


Sent from my BlackBerry portable Babbage Device
  Original Message  
From: Thiago Macieira
Sent: Saturday, June 30, 2018 1:32 AM
To: std-pr...@isocpp.org
Reply To: std-pr...@isocpp.org
Subject: Re: [std-proposals] Re: Better Performance in Polymorphic Programming: Trivially Swappable
--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/2028231.PPXpJpiaJL%40tjmaciei-mobl1.

Thiago Macieira

unread,
Jun 30, 2018, 12:41:18 PM6/30/18
to std-pr...@isocpp.org
On Saturday, 30 June 2018 02:47:20 PDT Mingxin Wang wrote:
> > > This requires an indirect call through the type-erasure machinery.
> >
> > Sorry, but why does it? Why can't there be a std::swap(std::any &,
> > std::any &)
> > that knows that it can simply swap the internals, fast?
>
> In short, the problem is that: in order to stay compatible with
> non-trivially-swappable objects, it would be difficult to implement
> polymorphic facilities like `std::any` without dispatching of similar code
> most of the time. If the polymorphic facilities no longer support
> non-trivially-swappable objects, there will be a considerable performance
> improvement without decline in practical usability.

Please give an actual example. I can't see what you're trying to do.

Note: do not include a memcpy of an object of a class with a virtual table.
Assume such classes are not trivially relocatable or swappable.

> > A swap operation would require 3 of these moves. That's really slow.
> >
> > > However, if the `any` could, at swap time, detect that its contents were
> > > TriviallySwappable, it could perform the swap with 3 memcpy operations.
> >
> > std::any would know that its contents are trivially swappable without the
> > need
> > for the concept. It knows that by design.
>
> Yes, but as long as they support non-trivially-swappable ones on
> construction, they could only know about the object at runtime when using
> them.

Sounds like QoI to me. std::any needs to use its internal storage only when at
compile time it can determine that the contents can be trivially relocated.
For anything else, it needs to use dynamic allocation, regardless of size.
That way, all std::any swaps will be trivial.

How it determines trivial relocatability is the topic of the discussion.
Currently, in standard C++, it requires a trivial type, as standard C++ does
not have something like Qt's Q_MOVABLE_TYPE.

Thiago Macieira

unread,
Jun 30, 2018, 12:45:53 PM6/30/18
to std-pr...@isocpp.org
On Saturday, 30 June 2018 06:26:01 PDT Nicol Bolas wrote:
> > std::any would know that its contents are trivially swappable without the
> > need
> > for the concept. It knows that by design.
>
> It can't know that because "trivially swappable" isn't a thing. With no
> standard-defined mechanism to key into, the only thing it could do is pick
> a number of standard-defined types and declare that they fit the bill.

My point is that std::any can be designed in such a way that it knows by
construction that it is always trivially swappable. A trivial implementation
contains a pointer to a base class of polymorphic type that is derived for
each hosted type. A pointer is trivially swappable, therefore std::any's
swap() function is trivially swappable.

That's why I am saying this is QoI: the implementation can design it in such a
way. Or they can make different trade-offs, by saying that it will avoid
memory allocation but requiring then that there be a runtime dispatch in order
to do moves and swaps.

Can we have the cake and eat it too? I like that idea.

Thiago Macieira

unread,
Jun 30, 2018, 1:16:44 PM6/30/18
to std-pr...@isocpp.org
On Saturday, 30 June 2018 09:45:49 PDT Thiago Macieira wrote:
> That's why I am saying this is QoI: the implementation can design it in such
> a way. Or they can make different trade-offs, by saying that it will avoid
> memory allocation but requiring then that there be a runtime dispatch in
> order to do moves and swaps.
>
> Can we have the cake and eat it too? I like that idea.

Actually, is this worth it? Maybe with a different example, other than
std::any. I'd like to see such a concrete example.

std::any can only be trivially swapped with another std::any if *both* are
trivially swappable. And in the case we're discussing, you can only determine
if they are at runtime anyway. Therefore, doing swaps in an array of std::any
cannot be vectorised anyway and must be done for each element, with a dynamic
check of the trait.

So I ask again: is it worth storing this trait? Since std::any needs to check
dynamically what type it has, it may as well simply do the type-erased
swapping on its own.

If we take the example of std::array<std::unique_ptr<T>>, then std::array's
swap function calls unique_ptr's, which knows its internals and knows whether
it can trivially swap them. In the obvious implementation of std::unique_ptr,
swapping is trivial: just swap two pointers. That means the whole array's swap
is just swapping pointers and the compiler can optimise this.

Nicol Bolas

unread,
Jun 30, 2018, 1:25:02 PM6/30/18
to ISO C++ Standard - Future Proposals


On Saturday, June 30, 2018 at 1:16:44 PM UTC-4, Thiago Macieira wrote:
On Saturday, 30 June 2018 09:45:49 PDT Thiago Macieira wrote:
> That's why I am saying this is QoI: the implementation can design it in such
> a way. Or they can make different trade-offs, by saying that it will avoid
> memory allocation but requiring then that there be a runtime dispatch in
> order to do moves and swaps.
>
> Can we have the cake and eat it too? I like that idea.

Actually, is this worth it? Maybe with a different example, other than
std::any. I'd like to see such a concrete example.

std::any can only be trivially swapped with another std::any if *both* are
trivially swappable.

In the proposed implementation based on this concept, `std::any` would use small buffer optimization only for types that are TriviallySwappable/TriviallyRelocatable. Therefore, `std::any` itself would be TriviallySwappable/TriviallyRelocatable, regardless of the type it holds.

And indeed, with this concept, we can require implementations of `std::any` to be TriviallyRelocatable, and thus yoke the benefits of employing that.

Nicol Bolas

unread,
Jun 30, 2018, 1:30:05 PM6/30/18
to ISO C++ Standard - Future Proposals


On Saturday, June 30, 2018 at 11:13:26 AM UTC-4, Tony V E wrote:
> std::any would know that its contents are trivially swappable without the need 
for the concept. It knows that by design

‎If the type inside the any has a pointer to itself, you can't swap the bytes.

That's not entirely true. The type would have to have an invariant that some pointer points to itself or a subobject thereof. It's OK to do a memcpy of a TriviallyCopyable object that just so happens to be storing a pointer to itself, because the result is exactly the same as if you had done `auto T = other_object;`.

So in the case you're talking about, the type would have to have an invariant such that when you perform a copy, the copy doesn't copy the other object's pointer to itself. It initializes its own "itself" pointer to point to itself.

In that case... such a type would not be TriviallyRelocatable/TriviallySwappable (and if you designate your type as such, you're lying and deserve what you get). Therefore, `std::any` would not store it in the small buffer.

This is why TriviallyRelocatable is opt-in.

Nicol Bolas

unread,
Jun 30, 2018, 1:31:39 PM6/30/18
to ISO C++ Standard - Future Proposals


On Saturday, June 30, 2018 at 12:45:53 PM UTC-4, Thiago Macieira wrote:
On Saturday, 30 June 2018 06:26:01 PDT Nicol Bolas wrote:
> > std::any would know that its contents are trivially swappable without the
> > need
> > for the concept. It knows that by design.
>
> It can't know that because "trivially swappable" isn't a thing. With no
> standard-defined mechanism to key into, the only thing it could do is pick
> a number of standard-defined types and declare that they fit the bill.

My point is that std::any can be designed in such a way that it knows by
construction that it is always trivially swappable. A trivial implementation
contains a pointer to a base class of polymorphic type that is derived for
each hosted type. A pointer is trivially swappable, therefore std::any's
swap() function is trivially swappable.

That's why I am saying this is QoI: the implementation can design it in such a
way. Or they can make different trade-offs, by saying that it will avoid
memory allocation but requiring then that there be a runtime dispatch in order
to do moves and swaps.

Or, we can get the best of both worlds by defining the concept of TriviallySwapable. Or TriviallyRelocatable, since it turns out that this is the same thing.

Can we have the cake and eat it too? I like that idea.

How is your suggestion having cake and eating it? With your way, an implementation must pick one or the other. With the suggested solution, you get both.

Thiago Macieira

unread,
Jun 30, 2018, 4:51:26 PM6/30/18
to std-pr...@isocpp.org
On Saturday, 30 June 2018 10:31:39 PDT Nicol Bolas wrote:
> > Can we have the cake and eat it too? I like that idea.
>
> How is your suggestion having cake and eating it? With your way, an
> implementation must pick one or the other. With the suggested solution, you
> get both.

It isn't. I meant to say that I am convinced there can be a better situation.

Nicol Bolas

unread,
Jun 30, 2018, 6:36:58 PM6/30/18
to ISO C++ Standard - Future Proposals
On Saturday, June 30, 2018 at 4:51:26 PM UTC-4, Thiago Macieira wrote:
On Saturday, 30 June 2018 10:31:39 PDT Nicol Bolas wrote:
> > Can we have the cake and eat it too? I like that idea.
>
> How is your suggestion having cake and eating it? With your way, an
> implementation must pick one or the other. With the suggested solution, you
> get both.

It isn't. I meant to say that I am convinced there can be a better situation.

Well, it turns out that the OP really just wants TriviallyRelocatable, which adequately solves his problem. So we already have a better solution: build the infrastructure needed for TriviallyRelocatable, and we solve this problem and plenty of others.

Thiago Macieira

unread,
Jul 1, 2018, 10:47:56 AM7/1/18
to std-pr...@isocpp.org
On Saturday, 30 June 2018 15:36:58 PDT Nicol Bolas wrote:
> Well, it turns out that the OP really just wants TriviallyRelocatable,
> which adequately solves his problem. So we already have a better solution:
> build the infrastructure needed for TriviallyRelocatable, and we solve this
> problem and plenty of others.

Agreed. Just note that trivial relocatability goes one step further than
destructive moves. With the latter, to implement a swap you'd still need three
actual calls.

And that trivial relocatability gives us the efficient memswap that Arthur
suggested, plus it could be implemented by regular library code.

Arthur O'Dwyer

unread,
Jul 2, 2018, 2:45:20 AM7/2/18
to ISO C++ Standard - Future Proposals
FYI, I just implemented this "check if trivially swappable, and if so, trivially swap" optimization in a branch of SG14's stdext::inplace_function:
And then I realized that it might be better to stick with the "relocate" primitive:

But if this is only speeding up `swap`, then I still don't really see the point.  Is anyone motivated to do some kind of benchmark to prove that "trivial swap" is a useful operation?

(Recall that I have done the benchmarks to prove that "trivial relocate" is a useful operation, in non-type-erased contexts. What we're looking for here is some special indication about trivial-swap-in-terms-of-trivial-relocate.)

It occurs to me that "trivial swap" might be useful to algorithms such as std::sort, which can be based on swaps. But I'm not sure if those algorithms currently work by swapping, or if they work by moving (in which case they might be adjusted to work by trivially-relocating, in the trivially relocatable case).

–Arthur

Mingxin Wang

unread,
Jul 2, 2018, 6:29:45 AM7/2/18
to ISO C++ Standard - Future Proposals
After reading the code, I find the implementation was not exactly consistent with what I imagine. I did not mean that the implementation should not be compatible with non-trivially-relocatable types, but should optimize for trivially-relocatable types. In order to clarify the motivation, I made an illustrative example similar to `std::function`, which is also the equivalent code expected to be generated with `static_proxy<Callable<void>>` defined in the PFA (p0957).

#include <cstdio>

#include <utility>
#include <type_traits>

/**
 * The definition of is_trivially_relocatable
 * Let's assume that any move constructible type is trivially relocatable by
 * default.
 */
template <class T>
struct is_trivially_relocatable : std::is_move_constructible<T> {};

template <class T>
inline constexpr bool is_trivially_relocatable_v =
    is_trivially_relocatable<T>::value;

/**
 * The configuration of SBO and corresponding data structure
 */
inline constexpr std::size_t SBO_SIZE = sizeof(void*);

template <class T>
inline constexpr bool USE_SBO =
    is_trivially_relocatable_v<T> &&
    sizeof(T) <= SBO_SIZE &&
    std::alignment_of_v<T> <= std::alignment_of_v<void*>;

union storage_t {
  void* data_ptr_;
  mutable char data_[SBO_SIZE];
};

/**
 * A basic implementation for the dispatch specifically for Callable types
 * Note that there is NO runtime overhead in checking whether a type is suitable
 * for SBO.
 */
template <class R, class... Args>
struct vtable_t {
 public:
  template <class T>
  constexpr explicit vtable_t(std::in_place_type_t<T>)
      : fun_(fun_impl<T>), destroy_(destroy_impl<T>) {}

  R(*fun_)(storage_t&, Args&&...);
  void(*destroy_)(storage_t&);

 private:
  template <class T>
  static R fun_impl(storage_t& s, Args&&... args) {
    if constexpr (USE_SBO<T>) {
      return (*reinterpret_cast<T*>(s.data_))(std::forward<Args>(args)...);
    } else {
      return (*static_cast<T*>(s.data_ptr_))(std::forward<Args>(args)...);
    }
  }

  template <class T>
  static void destroy_impl(storage_t& s) {
    if constexpr (USE_SBO<T>) {
      reinterpret_cast<T*>(s.data_)->~T();
    } else {
      delete static_cast<T*>(s.data_ptr_);
    }
  }
};

/**
 * Declare the value of the vtables
 */
template <class T, class R, class... Args>
inline constexpr vtable_t<R, Args...> VTABLE{std::in_place_type<T>};

/**
 * A basic functionality defined by `std::function`
 */
template <class T>
class my_function;

template <class R, class... Args>
class my_function<R(Args...)> {
 public:
  /**
   * An empty state of the vtable is not designed to keep the code simple
   */
  my_function() : vtable_(nullptr) {}

  template <class T>
  my_function(T&& data) : my_function(std::in_place_type<std::decay_t<T>>,
                                      std::forward<T>(data)) {}

  template <class T, class... _Args>
  explicit my_function(std::in_place_type_t<T>, _Args&&... args) {
    if constexpr (USE_SBO<T>) {
      new (reinterpret_cast<T*>(storage_.data_)) T(std::forward<_Args>(args)...);
    } else {
      storage_.data_ptr_ = new T(std::forward<_Args>(args)...);
    }
    vtable_ = &VTABLE<T, R, Args...>;
  }

  /**
   * This is what I want: No dispatch when moving or swapping
   * The code is correct because if the type is NOT trivially relocatable, it
   * will be stored on the heap, and its pointer is simply trivially relocatable.
   */
  my_function(my_function&& rhs) {
    storage_ = rhs.storage_;
    vtable_ = rhs.vtable_;
    rhs.vtable_ = nullptr;
  }

  /**
   * In order to keep the code simple, the copy constructor was not implemented
   */
  my_function(const my_function&) = delete;

  ~my_function() {
    if (vtable_ != nullptr) {
      vtable_->destroy_(storage_);
    }
  }

  R operator()(Args... args) {
    vtable_->fun_(storage_, std::forward<Args>(args)...);
  }

  /**
   * Performing bit-wise swap operation is OK!
   */
  void swap(my_function& rhs) {
    std::swap(storage_, rhs.storage_);
    std::swap(vtable_, rhs.vtable_);
  }

 private:
  storage_t storage_;
  const vtable_t<R, Args...>* vtable_;
};

int main() {
  my_function<void()> f([] { puts("lalala"); });
  f();
  auto g = std::move(f);  // Nice!
  g();
  f.swap(g);  // Nice!
  f();
  return 0;
}

The code works fine with latest gcc, clang and msvc.

Thank you!

Mingxin Wang

Arthur O'Dwyer

unread,
Jul 2, 2018, 1:08:06 PM7/2/18
to ISO C++ Standard - Future Proposals
On Mon, Jul 2, 2018 at 3:29 AM, Mingxin Wang <wmx16...@163.com> wrote:
After reading the code, I find the implementation was not exactly consistent with what I imagine. I did not mean that the implementation should not be compatible with non-trivially-relocatable types, but should optimize for trivially-relocatable types. In order to clarify the motivation, I made an illustrative example similar to `std::function`, which is also the equivalent code expected to be generated with `static_proxy<Callable<void>>` defined in the PFA (p0957).

#include <cstdio>

#include <utility>
#include <type_traits>

/**
 * The definition of is_trivially_relocatable
 * Let's assume that any move constructible type is trivially relocatable by
 * default.
 */
template <class T>
struct is_trivially_relocatable : std::is_move_constructible<T> {};


For the purposes of this discussion, I'd greatly prefer/appreciate it if you'd use a plausible definition, such as

template<class T>
struct is_trivially_relocatable : std::bool_constant<
    std::is_trivially_move_constructible_v<T> &&
    std::is_trivially_destructible_v<T>
> {};
template<>
struct is_trivially_relocatable<std::string> : std::true_type {};

Using a real definition will make it clear what we're really talking about, and save a lot of time and confusion. Every time you make up a fake definition, such as "a type is trivially relocatable iff it is move-constructible," you make it harder to talk about the actual proposal, because every question about your proposal ends up having at least two answers: the fake answer (what happens if we assume "all move-constructible types are trivially relocatable") and one or more real answers (what happens if we replace the fake assumption with various real-world assumptions).

For example, in my "swap" implementation, I found that I needed to make an executive decision about what to do with types X where
    is_trivially_move_constructible_v<X> && is_trivially_destructible_v<X> && not is_trivially_assignable_v<X>
I decided that such types should not be "trivially swapped" by default (although of course they can opt-in to the trivial swap codepath, the same way std::string does in my example code above).


 

template <class T>
inline constexpr bool USE_SBO =
    is_trivially_relocatable_v<T> &&
    sizeof(T) <= SBO_SIZE &&
    std::alignment_of_v<T> <= std::alignment_of_v<void*>;


This is similar to what std::function does today, except that today it uses SBO only when
    is_nothrow_relocatable_v<T>
You're going a step further here and saying let's only use SBO for things where the relocation operation doesn't need to be type-erased at all (because it's just memcpy).
So you get fewer virtual dispatches, but — at least on paper — there is a set of types T that you are "kicking out" of SBO. (Namely, the set of types which are nothrow-relocatable but not trivially relocatable. For example: a struct containing an offset_ptr. We very well may not care about such types, there may not be many of them in practice, but we should be honest about what happens to them.)

For what it's worth: inplace_function has no "non-SBO" path, so we can't use the "kick it out of SBO" approach that you were thinking of, and so I hadn't really been thinking about it. But you're right, "kick it out of SBO" is a perfectly valid solution for std::function, std::any, etc.

 

  template <class T>
  static void destroy_impl(storage_t& s) {
    if constexpr (USE_SBO<T>) {
      reinterpret_cast<T*>(s.data_)->~T();
    } else {
      delete static_cast<T*>(s.data_ptr_);
    }
  }
};


I see!  This requires that we have the following behavior for std::function/std::any/etc.:

    std::function<void()> f1 = foo;
    auto f2 = std::move(f1);
    assert( bool(f1) == false );

Right now, libc++ fails the assertion for std::function; libc++ passes the equivalent assertion for std::any, and libstdc++ passes the assertion in both cases.
And SG14's stdext::inplace_function fails the assertion too, but I have just reopened https://github.com/WG21-SG14/SG14/issues/113 to get this fixed for real.

Speeding up "move-construction" of type-erased types is a bigger deal than just speeding up "swap"!

By the way, would anyone here be interested in co-authoring a paper on "is_trivially_relocatable" for San Diego?
Same question in re "is_trivially_comparable / has_strong_structural_equality" for San Diego.
In both cases I think it's unlikely to get a lot of changes in, but just getting a standard type-trait (so we can test the trivial relocatability of lambda types, for example) would be a big win.
Email me offline if so.

–Arthur
Reply all
Reply to author
Forward
0 new messages