splice in std::map

504 views
Skip to first unread message

Aso Renji

unread,
Sep 29, 2015, 7:30:47 PM9/29/15
to ISO C++ Standard - Future Proposals

std::list contain splice method, that can move values from on list to another, but std::map - not. Why std::map don't have same functional? Very simple proposal - include splice method into std::map. If target map already have same keys as moved values, splice throw exception. In other case splice grant no-throw guarantee. Yes, i can move values through pair of map1.insert(*it) and map2.erase(it). But in this case my code can't grant any no-throw guarantee, because any call of std::map::insert can throw std::bad_alloc.

Howard Hinnant

unread,
Sep 29, 2015, 8:14:29 PM9/29/15
to std-pr...@isocpp.org
On Sep 29, 2015, at 7:30 PM, Aso Renji <asor...@gmail.com> wrote:
>
> std::list contain splice method, that can move values from on list to another, but std::map - not. Why std::map don't have same functional? Very simple proposal - include splice method into std::map. If target map already have same keys as moved values, splice throw exception. In other case splice grant no-throw guarantee. Yes, i can move values through pair of map1.insert(*it) and map2.erase(it). But in this case my code can't grant any no-throw guarantee, because any call of std::map::insert can throw std::bad_alloc.

Alan Talbot has graciously submitted revision 2 of N3645 "Splicing Maps and Sets" to the pre-Kona mailing. The mailing deadline was just last Friday and so it is not available yet (give it a week or two). Here is a convenience link to the previous revision:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3645.pdf

Spoiler: there is not a proposal for a member function called splice. But the functionality of transferring ownership of a node from one container to another is there. And there is even more (and more powerful) functionality proposed. For example one can efficiently change the value of a key using this functionality (with no deallocation/reallocation sequence).

This includes functionality that many people have asked for over an extended period of time. And yet the committee has not yet allowed your std::lib vendor to provide it. If you care about access to this functionality so you can use it in your code, make sure you locate a committee member and let them know. If you and 1,000 others don’t go to the effort to do that, then Alan and his coauthors have an uphill battle in Kona next month.

Howard

Jared Grubb

unread,
Sep 29, 2015, 8:55:14 PM9/29/15
to ISO C++ Standard - Future Proposals
Really cool paper!

This would cross off one of my STL wishlist items (move element out of a std::set).

Jared

Edward Catmur

unread,
Sep 30, 2015, 11:07:52 AM9/30/15
to ISO C++ Standard - Future Proposals
On Wednesday, 30 September 2015 01:14:29 UTC+1, Howard Hinnant wrote:
 And there is even more (and more powerful) functionality proposed.  For example one can efficiently change the value of a key using this functionality (with no deallocation/reallocation sequence).

How? *a.extract(k) would still return pair<Key const, T>& so there's no actual legal way to modify the key, notwithstanding that modifying it wouldn't break container invariants.

I still think this is a great proposal, I just don't think it should be oversold.

Howard Hinnant

unread,
Sep 30, 2015, 11:34:16 AM9/30/15
to std-pr...@isocpp.org
The original intent was that *a.extract(k) would return a reference to a pair<Key, T> instead of a pair<Key const, T>. See my comment dated 2009-09-19 in LWG issue 839:

http://cplusplus.github.io/LWG/lwg-active.html#839

for a sketch. (extract is renamed to remove there). This can be made possible by having the map store a union{pair<Key, T>, pair<Key const, T>} and returning the proper member as requested. Technically this is not implementable in portable C++, but that is one of the reasons the std::lib exists: to write non-portable code so you don’t have to.

The technique has been field tested in libc++ for the last 6 or so years:

https://github.com/llvm-mirror/libcxx/blob/master/include/map#L616-L654

When *a.extract(k) is accessed, the node is no longer stored in the map. It is owned uniquely by the node_ptr, and you have to re-insert it into the map if desired (see LWG 839 for a more thorough — but still brief — sketch of the original idea).

I’m reviewing N3645 and I agree that it is not consistent with my description above. I was not present at the meeting where this was discussed, and N3645 is a product of that meeting, so I am behind on some of the details.

That being said, I have reviewed the pre-Kona version (which will be P0083R0). In this paper the syntax for modifying the key is different:

auto np = a.extract(k);
np.key() = new_key;
a.insert(std::move(np));

Personally I prefer the old syntax:

auto np = a.extract(k);
np->first = new_key;
a.insert(std::move(np));

However I want the functionality far more than I want any specific syntax to do it with. So I’m a happy camper. :-)

Howard

Howard Hinnant

unread,
Sep 30, 2015, 11:00:24 PM9/30/15
to std-pr...@isocpp.org
On Sep 30, 2015, at 11:34 AM, Howard Hinnant <howard....@gmail.com> wrote:
>
> (which will be P0083R0)

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0083r0.pdf

Howard

Ami Tavory

unread,
Oct 1, 2015, 5:16:04 AM10/1/15
to std-pr...@isocpp.org
  The document states:

"Although these [splice operations, a.t.] will have the same complexity as a conventional insert and erase, the actual cost will typically be much less since the objects do not need to be copied nor the nodes reallocated."

  Hope I didn't miss anything, but for generalized tree-based associative-containers (that is, both node based and ordered-vector based), split and join operations - which are specialized cases of splice - can be done without exceptions and with excellent complexity (llogarithmic for red-black trees, linear for ordered vectors). Split transfers keys that are larger (smaller) than some key, and join merges two containers where one set of keys is smaller (larger) than the other. The complexity is much lower than an equivalent sequence of inserts and erases, not only because the individual nodes are transferred rather than allocated, copied, and deallocated, but also because entire sub-trees can be transferred without restructuring (see Introduction To Algorithms - Red-Black Trees).
  The suggested operations can be implemented in terms of these operations, and, depending on the usage pattern, this can be very efficient (it stands to reason that the use of tree-based containers coincides with dealing with contiguous ranges of keys).
  I've implemented this many years ago as a libstdc++ extension.

 
Howard

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.

maurice barnum

unread,
Oct 1, 2015, 6:00:40 AM10/1/15
to std-pr...@isocpp.org, howard....@gmai.com
is there a reason the proposal doesn't include operations like

map<Key, T, Comp, Allocator> extract(const_iterator first,
const_iterator last);

for map,multimap,set,multiset? replace map<> above with "my type"

combined with merge, this would allow me to move a subtree from one
sorted container (sorry, i don't know the preferred term) to another
without any allocations. i'm also thinking of use cases where i have
a shared set from which i'd like to quickly extract a range of items
and then do stuff with that range whilst minimizing the critical section
and/or minimizing iterative interlocking.

supporting the same operations for the hash containers probably doesn't
make any sense.

Howard Hinnant

unread,
Oct 1, 2015, 10:46:35 AM10/1/15
to std-pr...@isocpp.org, howard....@gmai.com
The authors of this proposal have no implementation experience for your suggestion. Perhaps you could propose this additional API.

Howard

Ami Tavory

unread,
Oct 1, 2015, 11:08:13 AM10/1/15
to std-pr...@isocpp.org, howard....@gmai.com
  I'd like to point out again, that, unless I misunderstood something, this can be done, without exceptions and in sublinear time, using well-known algorithms for balanced trees (details in my previous message).

Howard Hinnant

unread,
Oct 1, 2015, 11:20:08 AM10/1/15
to std-pr...@isocpp.org
On Oct 1, 2015, at 11:08 AM, Ami Tavory <ata...@gmail.com> wrote:
>
>> The authors of this proposal have no implementation experience for your suggestion. Perhaps you could propose this additional API.
>>
> I'd like to point out again, that, unless I misunderstood something, this can be done, without exceptions and in sublinear time, using well-known algorithms for balanced trees (details in my previous message).
>

Thank you. The proposal process is like an open-source software project. Anyone willing to do the work can participate. I encourage anyone with implementation experience to do so.

Howard

Ami Tavory

unread,
Oct 1, 2015, 11:26:11 AM10/1/15
to std-pr...@isocpp.org
  Sorry again if I'm misunderstanding you, but I tried to point out that I've implemented this in a libstdc++ extension which I think has been bundled with g++ for the last decade (at least I always see it in the ext/pb_ds directory of /usr/include/c++/[ver]). Incidentally, here are the performance tests for the split/join operations from the docs. Please let me know if I can help out with anything.

Howard

David Krauss

unread,
Oct 1, 2015, 11:38:04 AM10/1/15
to std-pr...@isocpp.org

On 2015–10–01, at 11:26 PM, Ami Tavory <ata...@gmail.com> wrote:

Sorry again if I'm misunderstanding you, but I tried to point out that I've implemented this in a libstdc++ extension which I think has been bundled with g++ for the last decade (at least I always see it in the ext/pb_ds directory of /usr/include/c++/[ver]). Incidentally, here are the performance tests for the split/join operations from the docs. Please let me know if I can help out with anything.

Howard seems to be simply asking for a write-up which could be properly considered by the committee. Just looking at the documentation page (perhaps overlooking source code comments), all I see is this:

Additional Methods

Tree-based containers support split and join methods. It is possible to split a tree so that it passes all nodes with keys larger than a given key to a different tree. These methods have the following advantages over the alternative of externally inserting to the destination tree and erasing from the source tree:

  1. These methods are efficient - red-black trees are split and joined in poly-logarithmic complexity; ordered-vector trees are split and joined at linear complexity. The alternatives have super-linear complexity.
  2. Aside from orders of growth, these operations perform few allocations and de-allocations. For red-black trees, allocations are not performed, and the methods are exception-free. 

It looks very promising, but not yet actionable.

Just as I was sending this I got your last message, and though the graphs are nice proof, they don’t seem to add any concepts.

A proposal should clearly state the interface you want to add, its complexity requirements, and its relation to the existing library and other proposals like extract.

Come back to this list with a draft and we’ll help to point out any gaps.

For what it’s worth, I think extracting a unique_ptr (or unique_ptr-alike) and splitting a new container are orthogonal, and both would be nice to have.

inkwizyt...@gmail.com

unread,
Oct 1, 2015, 12:10:52 PM10/1/15
to ISO C++ Standard - Future Proposals
I think this could be solved if we could apply common initial sequence in that case.
e.g. `pair<Key const, T>&` can alias `pair<Key, T>&` based on this both have same layout but only difference is constnes of members.
Only danger of that approach is that calling member function will have different behavior because will work with different const version of members:
template<typename A, typename B>
struct P
{
    A a
;
    B b
;
   
int foo() { return a.bar() + b.bar(); }
};

struct T
{
   
int bar() { return 0; }
   
int bar() const { return 1; }
};

union
{
    P
<T, T> f;
    P
<const T, T> s;
    P
<const T, const T> t;
   
const P<T, T> r;
} a { P<T, T>{} };

assert(a.f.foo() == 0);
assert(a.s.foo() == 1);
assert(a.t.foo() == 2);
assert(a.r.foo() == 2);



Another example could be `std::move` that will only partially move objects.

Ami Tavory

unread,
Oct 1, 2015, 12:17:01 PM10/1/15
to std-pr...@isocpp.org
On Thu, Oct 1, 2015 at 6:37 PM, David Krauss <pot...@gmail.com> wrote:

On 2015–10–01, at 11:26 PM, Ami Tavory <ata...@gmail.com> wrote:

Sorry again if I'm misunderstanding you, but I tried to point out that I've implemented this in a libstdc++ extension which I think has been bundled with g++ for the last decade (at least I always see it in the ext/pb_ds directory of /usr/include/c++/[ver]). Incidentally, here are the performance tests for the split/join operations from the docs. Please let me know if I can help out with anything.

Howard seems to be simply asking for a write-up which could be properly considered by the committee. Just looking at the documentation page (perhaps overlooking source code comments), ...

  OK, thanks for the explanation. I'll try to phrase it precisely in the terms you described.
 

Additional Methods

Tree-based containers support split and join methods. It is possible to split a tree so that it passes all nodes with keys larger than a given key to a different tree. These methods have the following advantages over the alternative of externally inserting to the destination tree and erasing from the source tree:

  1. These methods are efficient - red-black trees are split and joined in poly-logarithmic complexity; ordered-vector trees are split and joined at linear complexity. The alternatives have super-linear complexity.
  2. Aside from orders of growth, these operations perform few allocations and de-allocations. For red-black trees, allocations are not performed, and the methods are exception-free. 

It looks very promising, but not yet actionable.

Just as I was sending this I got your last message, and though the graphs are nice proof, they don’t seem to add any concepts.

A proposal should clearly state the interface you want to add, its complexity requirements, and its relation to the existing library and other proposals like extract.

Come back to this list with a draft and we’ll help to point out any gaps.

For what it’s worth, I think extracting a unique_ptr (or unique_ptr-alike) and splitting a new container are orthogonal, and both would be nice to have.

Agustín K-ballo Bergé

unread,
Oct 1, 2015, 12:38:35 PM10/1/15
to std-pr...@isocpp.org
On 10/1/2015 1:10 PM, inkwizyt...@gmail.com wrote:
> I think this could be solved if we could apply common initial sequence
> in that case.

The common initial sequence rule applies only to standard-layout unions,
so it would only work for standard-layout keys and values, giving
undefined behavior elsewhere. Furthermore, you are only allowed to read
from a member of the common initial sequence, not modify it.

Regards,
--
Agustín K-ballo Bergé.-
http://talesofcpp.fusionfenix.com

Howard Hinnant

unread,
Oct 1, 2015, 12:46:11 PM10/1/15
to std-pr...@isocpp.org
On Oct 1, 2015, at 12:38 PM, Agustín K-ballo Bergé <kaba...@hotmail.com> wrote:
>
> On 10/1/2015 1:10 PM, inkwizyt...@gmail.com wrote:
>> I think this could be solved if we could apply common initial sequence
>> in that case.
>
> The common initial sequence rule applies only to standard-layout unions, so it would only work for standard-layout keys and values, giving undefined behavior elsewhere. Furthermore, you are only allowed to read from a member of the common initial sequence, not modify it.

Fortunately, just like <atomic>, <typeinfo>, and parts of several other libraries such as <type_traits>, <mutex> and <locale>, there is no requirement that <map> be implementable in portable C++.

The std::lib protects us from non-portable and undefined behavior by providing well-defined functionality that is otherwise unaccessible to us. Sometimes this requires intimate cooperation with the compiler the std::lib ships with.

Howard

Agustín K-ballo Bergé

unread,
Oct 1, 2015, 1:06:46 PM10/1/15
to std-pr...@isocpp.org
Nod, we are saying the same thing, libc++'s implementation requires
compiler magic.

Ami Tavory

unread,
Oct 1, 2015, 6:56:56 PM10/1/15
to std-pr...@isocpp.org
  I'd like to ask about the following alternative, which seems to me like something between what the paper mentions as "Talbot's original idea" and this version of the paper. IMHO, it offers improved performance for ordered containers, natural operations for multi-key containers, and offers nearly all of the benefits of node_ptr without introducing a new class.

It requires adding split and join methods to all associative containers. 

*{map|set} methods:

    void split(iterator b, iterator e, container &other)

Input: takes a range given by two iterators to the object b, e, and another empty object of the same type other
Effect: transplants the range into other, removing it from the original object.
Complexity: O( (log(n))^2 ), where n is the size of the object to begin with.
Exception guarantees: those of the comparison functor (this method does not allocate memory).

    void join(container &other)

Input: takes an object of same type other, whose keys may not be both smaller-equal and larger with those of this object.
Effect: transplants the content of other into this object, emptying it.
Complexity: O( (log(max(m, n))^2 ), where m, n are the sizes of the objects before the operation.
Exception guarantees: those of the comparison functor (this method must not allocate memory).

unordered_*{map|set} have the following methods:

    void split(const key &k, container &other)

Input: takes a key reference k, and another empty object of the same type other
Effect: transplants the range of elements with key k, removing it from the original object.
Complexity: expected O( n ), where n is the size of the range.
Exception guarantees: those of the hash & equality functors, as well as those of inserting elements into other.

    void join(container &other)

Input: takes an object of same type other.
Effect: transplants the content of other into this object, emptying it.
Complexity: expected O( n ), where n is the size of other before the operation.
Exception guarantees: none, but if an exception is thrown, the each element must be in either this object or other.

-------------------------------------------------------------

This alternative has tradeoffs relative to the proposal. It has lower complexity for ordered associative containers. It extends naturally to multi-key containers. It unfortunately is not generic w.r.t. ordered/unordered containers.

Another point is that an empty container can basically function as a node_ptr (by splitting a single-size range into it), without introducing a new class. For ordered containers, there are no disadvantages (no exceptions can be thrown). For unordered containers, there is a disadvantage in terms of exception safety, but it is less severe than might be thought. It's possible to configure an empty unordered associative container so that it will not throw on the first insert. Thus for a sequence of movements between containers, a single object can be pre-built, and then used repeatedly with no exception possibilities. 



On Thu, Oct 1, 2015 at 6:37 PM, David Krauss <pot...@gmail.com> wrote:

David Krauss

unread,
Oct 1, 2015, 7:25:21 PM10/1/15
to std-pr...@isocpp.org, Howard Hinnant
On 2015–10–02, at 6:56 AM, Ami Tavory <ata...@gmail.com> wrote:

  I'd like to ask about the following alternative, which seems to me like something between what the paper mentions as "Talbot's original idea" and this version of the paper. IMHO, it offers improved performance for ordered containers, natural operations for multi-key containers, and offers nearly all of the benefits of node_ptr without introducing a new class.

I suspect that node_ptr should be a unique_ptr specialization. Its treatment of allocators (propagate_on_container_move_assignment etc) is unnecessary and complex. Pointers are not containers. See P0043 §1.1 for some tangential treatment.

It requires adding split and join methods to all associative containers. 

*{map|set} methods:

    void split(iterator b, iterator e, container &other)

Why does this not return by value? Is efficiency gained by passing a non-empty other, as opposed to separate split and join operations?

Even so, I think that split-by-value should be provided.

Input: takes a range given by two iterators to the object b, e, and another empty object of the same type other
Effect: transplants the range into other, removing it from the original object.
Complexity: O( (log(n))^2 ), where n is the size of the object to begin with.
Exception guarantees: those of the comparison functor (this method does not allocate memory).

    void join(container &other)

This should take an rvalue reference with move-from semantics. Also, it may allocate memory when the source has an unequal allocator.

Input: takes an object of same type other, whose keys may not be both smaller-equal and larger with those of this object.
Effect: transplants the content of other into this object, emptying it.
Complexity: O( (log(max(m, n))^2 ), where m, n are the sizes of the objects before the operation.
Exception guarantees: those of the comparison functor (this method must not allocate memory).

unordered_*{map|set} have the following methods:

    void split(const key &k, container &other)

Input: takes a key reference k, and another empty object of the same type other

This should certainly return by value instead.

Another point is that an empty container can basically function as a node_ptr (by splitting a single-size range into it), without introducing a new class.

I suspect that node_ptr is intended simply as an interface wrapper on a container class. I’d presume that this solution was already considered.

For ordered containers, there are no disadvantages (no exceptions can be thrown). For unordered containers, there is a disadvantage in terms of exception safety, but it is less severe than might be thought. It's possible to configure an empty unordered associative container so that it will not throw on the first insert. Thus for a sequence of movements between containers, a single object can be pre-built, and then used repeatedly with no exception possibilities. 

No possible exception subsequent to populating that first object, right? This somewhat explains the reference parameters, but the difference seems unsubstantial.

Altogether this looks pretty good, thanks for sharing!

inkwizyt...@gmail.com

unread,
Oct 2, 2015, 6:45:48 AM10/2/15
to ISO C++ Standard - Future Proposals
My point was relaxing this rules to not standard layout types if difference is only const.
If we don't change const type we have exactly same type and we could allow writing to it from any union member.
If we change constnes then we can only read from that. Only rule here would be that current active union member should not have const members.

union
{
    pair<A, B> a;
    pair<const A, B> b;
} u1, u2;

new (&u1.a) pair<A, B>();
u1.a.first.foo(); //current allowed
u1.b.first.foo(); //using IntSeqRul, different member type, only read allowed
u1.b.second.bar(); //using IntSeqRul, same member type, write allowed

new (&u2.b) pair<A, B>();
u2.a.first.foo(); //illegal because it remove const

Omer Rosler

unread,
Apr 5, 2016, 6:57:38 PM4/5/16
to ISO C++ Standard - Future Proposals, asor...@gmail.com
Hello,
I just read the post-Jacksonville version of the paper (P0083R2) and I have a question (hope this is the right place for it).
Why doesn't the paper propose adding the node_handle interface to {forward_}list as well?
It will enable a uniform extraction API for all node-based containers in the STL and I don't see any immediate problems.

Morwenn

unread,
Apr 6, 2016, 3:53:14 AM4/6/16
to ISO C++ Standard - Future Proposals, asor...@gmail.com
My best guess is that it's not the problem that the paper is trying to solve; having a uniform API would be interesting, but it's already possible to splice lists.

I guess that if the std::map & std::set splicing proposal is accepted, then a subsequent proposal could propose a similar interface for lists too, but in this
case it would also be interesting to see how it plays with P0310 (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0310r0.pdf) which proposes
among other things to expose the node type of a list to users.
Reply all
Reply to author
Forward
0 new messages