Node containers and std::unique_ptr

801 views
Skip to first unread message

dsk...@google.com

unread,
May 19, 2017, 11:51:07 AM5/19/17
to cxx
Hi,

map<Key*, std::unique_ptr<Value>> came up in the last two codereviews I saw. Map's node is pretty big by itself (4 pointers), so in order to minimize memory overhead we need to maximize the payload. I checked the codebase, and it seems that the pattern is pretty common:

list<std::unique_ptr : 49
map<.*,\s+std::unique_ptr : 594 (all maps)
set<std::unique_ptr : 32 (all sets)

Of course there are reasons to use std::unique_ptr. I can think of:
  • Polymorphic payload
  • Incomplete type (i.e. the definition is in .cc file), although std::list explicitly allows incomplete types
  • Payload is adopted (i.e. created somewhere else) / payload ownership is released
  • ..
But I'm sure there are cases where std::unique_ptr was used unnecessarily. Should we care, and try to detect & warn about those cases? Or maybe write some guidelines?


dan...@chromium.org

unread,
May 19, 2017, 11:53:52 AM5/19/17
to Dmitry Skiba, cxx
It's unclear to me what you're proposing to replace it. list is slow and should almost always be avoided. These are maps that own the value, in which case we should use unique_ptr to express that ownership.  Maybe a concrete example would help.
 


--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To post to this group, send email to c...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/0a126eec-3d9a-4103-ac0e-c837722b62a5%40chromium.org.

Dmitry Skiba

unread,
May 19, 2017, 12:10:42 PM5/19/17
to Dana Jansens, cxx
I'm proposing to replace map<Key, std::unique_ptr<Value>> with map<Key, Value>. Same for other containers.

I.e. map.insert({key, base::MakeUnique<Value>(...)}) becomes map.emplace(key, Value(...)).

Primiano Tucci

unread,
May 19, 2017, 12:24:03 PM5/19/17
to Dmitry Skiba, Dana Jansens, cxx
I'm proposing to replace map<Key, std::unique_ptr<Value>> with map<Key, Value>
This means that Value has to be copiable or moveable. If the object is not cheaply moveable that means deep moving lot of bytes on insertions/removals. It's not immediate to me why such replacement would be beneficial. 
Furthermore, in many of those cases where a container holds a unique_ptr, the lifetime of the owned object is typically larger than its presence in the container (e.g. base::Value). In other words, in many cases people need to std::move() the ownership in and out of the container.

One thing I have not fully understood is: why the difference between map<Key, std::unique_ptr<Value>> and map<Key, Value> isn't just sizeof(std::unique_ptr) i.e. a pointer?

Jeffrey Yasskin

unread,
May 19, 2017, 12:45:28 PM5/19/17
to Primiano Tucci, Dmitry Skiba, Dana Jansens, cxx
On Fri, May 19, 2017 at 9:23 AM, 'Primiano Tucci' via cxx <c...@chromium.org> wrote:
I'm proposing to replace map<Key, std::unique_ptr<Value>> with map<Key, Value>
This means that Value has to be copiable or moveable. If the object is not cheaply moveable that means deep moving lot of bytes on insertions/removals. It's not immediate to me why such replacement would be beneficial. 

Using map.emplace() means insertions don't have to move anything.

I'm also skeptical of the claim that folks are doing this because they have an expensive-to-move type. Do you have examples?
 
Furthermore, in many of those cases where a container holds a unique_ptr, the lifetime of the owned object is typically larger than its presence in the container (e.g. base::Value). In other words, in many cases people need to std::move() the ownership in and out of the container.

Dmitry mentioned this possibility: "Payload is adopted (i.e. created somewhere else) / payload ownership is released"
 
One thing I have not fully understood is: why the difference between map<Key, std::unique_ptr<Value>> and map<Key, Value> isn't just sizeof(std::unique_ptr) i.e. a pointer?

Avoiding a layer of pointers can reduce cache misses. map<> is terrible on this point, but having it store pointers instead of values makes it worse.

Jeffrey
 

Primiano Tucci

unread,
May 19, 2017, 1:14:03 PM5/19/17
to Jeffrey Yasskin, Dmitry Skiba, Dana Jansens, cxx
On Fri, May 19, 2017 at 9:45 AM Jeffrey Yasskin <jyas...@chromium.org> wrote:
On Fri, May 19, 2017 at 9:23 AM, 'Primiano Tucci' via cxx <c...@chromium.org> wrote:
I'm proposing to replace map<Key, std::unique_ptr<Value>> with map<Key, Value>
This means that Value has to be copiable or moveable. If the object is not cheaply moveable that means deep moving lot of bytes on insertions/removals. It's not immediate to me why such replacement would be beneficial. 

Using map.emplace() means insertions don't have to move anything.

I'm also skeptical of the claim that folks are doing this because they have an expensive-to-move type. Do you have examples?

The first 5 I got from codesearch are:
1) src/ppapi/host/ppapi_host.h: ResourceHost has a vector 
2) src/crypto/nss_util.cc: ChromeOSUserData has a vector, a callback, two unique_ptrs
3) tools/gn/loader.h: ToolchainRecord has a vector, and an inner class with bunch of strings and other nested objects.
4) ui/gfx/image/image.cc: RepresentationType seems just an enum, not sure why this is a unique_ptr
5) net/dns/record_parsed.h: RecordParsed has a string, 3 ints, a unique_ptr, and a base::Time. Not sure how expensive this is to move, but definitely not trivial. 

Ryan Hamilton

unread,
May 19, 2017, 1:31:13 PM5/19/17
to Primiano Tucci, Dmitry Skiba, Dana Jansens, cxx
On Fri, May 19, 2017 at 9:23 AM, 'Primiano Tucci' via cxx <c...@chromium.org> wrote:
I'm proposing to replace map<Key, std::unique_ptr<Value>> with map<Key, Value>
This means that Value has to be copiable or moveable. If the object is not cheaply moveable that means deep moving lot of bytes on insertions/removals. It's not immediate to me why such replacement would be beneficial. 
Furthermore, in many of those cases where a container holds a unique_ptr, the lifetime of the owned object is typically larger than its presence in the container (e.g. base::Value). In other words, in many cases people need to std::move() the ownership in and out of the container.

​In addition to Value needing to be at least moveable to be stored directly in the map, it also means that Value*s are not guaranteed to be valid as long as the associated Value is alive. For some usecases, this can be a major problem.

Jeffrey Yasskin

unread,
May 19, 2017, 1:40:42 PM5/19/17
to Primiano Tucci, Jeffrey Yasskin, Dmitry Skiba, Dana Jansens, cxx
On Fri, May 19, 2017 at 10:13 AM, Primiano Tucci <prim...@google.com> wrote:


On Fri, May 19, 2017 at 9:45 AM Jeffrey Yasskin <jyas...@chromium.org> wrote:
On Fri, May 19, 2017 at 9:23 AM, 'Primiano Tucci' via cxx <c...@chromium.org> wrote:
I'm proposing to replace map<Key, std::unique_ptr<Value>> with map<Key, Value>
This means that Value has to be copiable or moveable. If the object is not cheaply moveable that means deep moving lot of bytes on insertions/removals. It's not immediate to me why such replacement would be beneficial. 

Using map.emplace() means insertions don't have to move anything.

I'm also skeptical of the claim that folks are doing this because they have an expensive-to-move type. Do you have examples?

The first 5 I got from codesearch are:
1) src/ppapi/host/ppapi_host.h: ResourceHost has a vector 
2) src/crypto/nss_util.cc: ChromeOSUserData has a vector, a callback, two unique_ptrs
3) tools/gn/loader.h: ToolchainRecord has a vector, and an inner class with bunch of strings and other nested objects.

ToolchainRecord is DISALLOW_COPY_AND_ASSIGN, so it *can't* be moved. (And it contains a Lock, which more fundamentally can't be moved.) That should exempt it from Dmitry's check.

emplace() still lets you put it into a map by value, but it's ugly:

  std::map<int, Lock> m;
  m.emplace(std::piecewise_construct, std::forward_as_tuple(3), std::tuple<>());

There's a new C++17 function try_emplace(key, value_ctor_args...) that makes this better, but of course we can't use it yet.

4) ui/gfx/image/image.cc: RepresentationType seems just an enum, not sure why this is a unique_ptr
5) net/dns/record_parsed.h: RecordParsed has a string, 3 ints, a unique_ptr, and a base::Time. Not sure how expensive this is to move, but definitely not trivial. 

RecordParsed would be assignments to about 11 pointers, which is likely cheaper than allocating a block to store it.

Jeffrey

Jeffrey Yasskin

unread,
May 19, 2017, 1:42:32 PM5/19/17
to Ryan Hamilton, Primiano Tucci, Dmitry Skiba, Dana Jansens, cxx
std::map<> doesn't invalidate pointers to its members until they're .erase()d. I think you mean that you couldn't have the initial Value() constructor stash |this| somewhere external and have that be the address that's actually stored in the map?

Jeffrey

Dmitry Skiba

unread,
May 19, 2017, 2:01:12 PM5/19/17
to Primiano Tucci, Jeffrey Yasskin, Dana Jansens, cxx
On Fri, May 19, 2017 at 10:13 AM, Primiano Tucci <prim...@google.com> wrote:


On Fri, May 19, 2017 at 9:45 AM Jeffrey Yasskin <jyas...@chromium.org> wrote:
On Fri, May 19, 2017 at 9:23 AM, 'Primiano Tucci' via cxx <c...@chromium.org> wrote:
I'm proposing to replace map<Key, std::unique_ptr<Value>> with map<Key, Value>
This means that Value has to be copiable or moveable. If the object is not cheaply moveable that means deep moving lot of bytes on insertions/removals. It's not immediate to me why such replacement would be beneficial. 

Using map.emplace() means insertions don't have to move anything.

I'm also skeptical of the claim that folks are doing this because they have an expensive-to-move type. Do you have examples?

The first 5 I got from codesearch are:
1) src/ppapi/host/ppapi_host.h: ResourceHost has a vector 
2) src/crypto/nss_util.cc: ChromeOSUserData has a vector, a callback, two unique_ptrs
3) tools/gn/loader.h: ToolchainRecord has a vector, and an inner class with bunch of strings and other nested objects.
4) ui/gfx/image/image.cc: RepresentationType seems just an enum, not sure why this is a unique_ptr
5) net/dns/record_parsed.h: RecordParsed has a string, 3 ints, a unique_ptr, and a base::Time. Not sure how expensive this is to move, but definitely not trivial. 

#4 is polymorphic case, ImageRep is an abstract class.
Regarding vectors - why are they not movable? For example vector<shared_ptr<Foo>> is certainly movable.

My motivation behind all this was to reduce number of allocations, and to use memory more efficiently. For example map's node has 4 pointers for internal purposes (libc++), so with payload of pair<Key*, Value*> we're utilizing just 30% of node memory. If Value is embedded directly, the utilization gets higher, reflecting more efficient use of memory. For sets that is even worse, i.e. a set<Value*> allocates 20 bytes, but uses only 4 for the payload. 

I think that cost of additional allocation / free that unqiue_ptr incurs will almost always be higher than the cost of moving things around. Cost of allocation / free is also very unpredictable, while moving things around is way more trivial and stable.

Dmitry Skiba

unread,
May 19, 2017, 2:04:16 PM5/19/17
to Ryan Hamilton, Primiano Tucci, Dana Jansens, cxx
On Fri, May 19, 2017 at 10:31 AM, Ryan Hamilton <r...@chromium.org> wrote:

I'm not sure I understand - you mean, the pointer to the Value is not stable between container operations?

dan...@chromium.org

unread,
May 19, 2017, 2:08:11 PM5/19/17
to Dmitry Skiba, Primiano Tucci, Jeffrey Yasskin, cxx
On Fri, May 19, 2017 at 2:01 PM, Dmitry Skiba <dsk...@google.com> wrote:

On Fri, May 19, 2017 at 10:13 AM, Primiano Tucci <prim...@google.com> wrote:


On Fri, May 19, 2017 at 9:45 AM Jeffrey Yasskin <jyas...@chromium.org> wrote:
On Fri, May 19, 2017 at 9:23 AM, 'Primiano Tucci' via cxx <c...@chromium.org> wrote:
I'm proposing to replace map<Key, std::unique_ptr<Value>> with map<Key, Value>
This means that Value has to be copiable or moveable. If the object is not cheaply moveable that means deep moving lot of bytes on insertions/removals. It's not immediate to me why such replacement would be beneficial. 

Using map.emplace() means insertions don't have to move anything.

I'm also skeptical of the claim that folks are doing this because they have an expensive-to-move type. Do you have examples?

The first 5 I got from codesearch are:
1) src/ppapi/host/ppapi_host.h: ResourceHost has a vector 
2) src/crypto/nss_util.cc: ChromeOSUserData has a vector, a callback, two unique_ptrs
3) tools/gn/loader.h: ToolchainRecord has a vector, and an inner class with bunch of strings and other nested objects.
4) ui/gfx/image/image.cc: RepresentationType seems just an enum, not sure why this is a unique_ptr
5) net/dns/record_parsed.h: RecordParsed has a string, 3 ints, a unique_ptr, and a base::Time. Not sure how expensive this is to move, but definitely not trivial. 

#4 is polymorphic case, ImageRep is an abstract class.
Regarding vectors - why are they not movable? For example vector<shared_ptr<Foo>> is certainly movable.

My motivation behind all this was to reduce number of allocations, and to use memory more efficiently. For example map's node has 4 pointers for internal purposes (libc++), so with payload of pair<Key*, Value*> we're utilizing just 30% of node memory. If Value is embedded directly, the utilization gets higher, reflecting more efficient use of memory. For sets that is even worse, i.e. a set<Value*> allocates 20 bytes, but uses only 4 for the payload. 

I think that cost of additional allocation / free that unqiue_ptr incurs will almost always be higher than the cost of moving things around. Cost of allocation / free is also very unpredictable, while moving things around is way more trivial and stable.

My gut feeling is that we do have many unique_ptrs in containers because they used to be raw pointers because we didn't want copyable types, and we had no ability to use moving in the past. So the migration from raw pointer to unique pointer was much more obvious than raw pointer to making the type movable and storing by value.

There's many types in chrome with DISALLOW_COPY_AND_ASSIGN that don't have a move constructor but could.

Ryan Hamilton

unread,
May 19, 2017, 2:42:29 PM5/19/17
to Dmitry Skiba, Primiano Tucci, Dana Jansens, cxx
Indeed. So for example if the Value is stored in the map, but Value*s are also handed out to external callers​ then any mutation of the map may invalidate the caller's Value*s.

Dmitry Skiba

unread,
May 19, 2017, 2:50:15 PM5/19/17
to Ryan Hamilton, Primiano Tucci, Dana Jansens, cxx
That's true for base::small_map, but not for std::map - it guarantees iterator / pointer stability. See http://stackoverflow.com/questions/6438086/iterator-invalidation-rules/6442829

Ryan Hamilton

unread,
May 19, 2017, 3:17:16 PM5/19/17
to Dmitry Skiba, Primiano Tucci, Dana Jansens, cxx
Ah, good point. I was thinking about containers in general, not std::map specifically. Whoops!
Reply all
Reply to author
Forward
0 new messages