std::list::extract?

709 views
Skip to first unread message

Mutz, Marc

unread,
Feb 13, 2018, 5:38:08 AM2/13/18
to std-pr...@isocpp.org
Hi,

Whenever I find myself using std::list, I do so because of slice(). Yet
a typical slice() call like

std::list<~~~> local;
local.splice(local.begin(), tasks, tasks.begin());

is quite a mouthful and even long-time C++ developers (incl. myself)
need to look up what this call does. In C++17, and with std::set, this
could be written as:

auto local = tasks.extract(tasks.begin());

which is self-documenting, even if you don't remember the exact
semantics of extract() - at least it's clear what is being extracted. We
also don't need to duplicate the type of 'tasks', thanks to auto.

Now, surely when extract() was added to the associative containers,
adding it to std::list was discussed, too. P0083R3 as written suggests
that extract() is the replacement for slice(), which, according to the
proposal, is not applicable for associative containers, which must
maintain ordering. It does not contain any thoughts on adding it to
list, too, apparently deeming splice() fit for the task (which it is,
technically).

It turns out, however, that the API eventually designed turned out much
nicer than the catch-all slice() we have on list, and that developers
(incl. myself) expected extract() on list, too, what with list also
being node-based.

So, would a proposal to add node_handle functionality to std::list and
std::forward_list be welcome?

Thanks,
Marc

Nicol Bolas

unread,
Feb 13, 2018, 10:49:44 AM2/13/18
to ISO C++ Standard - Future Proposals

Why? What's the point of exposing `node_handle` and that sort of thing on `std::list`? `list` already has splicing functionality, and splicing multiple nodes at once is a very fast operation in lists.

Your principle problem is that the current API for it is difficult to work with, since you have to have an already existing `list` type to do the splicing. So, just make an API that creates a new `list` that gets spliced into. It will be no less efficient than the node_type version.

Mutz, Marc

unread,
Feb 13, 2018, 6:17:28 PM2/13/18
to std-pr...@isocpp.org
On 2018-02-13 16:49, Nicol Bolas wrote:
[...]
> Why? What's the point of exposing `node_handle` and that sort of thing
> on `std::list`?

Consistency, for one.

> `list` already has splicing functionality, and
> splicing multiple nodes at once is a very fast operation in lists.
>
> Your principle problem is that the current API for it is difficult to
> work with, since you have to have an already existing `list` type to
> do the splicing. So, just make an API that creates a new `list` that
> gets spliced into. It will be no less efficient than the node_type
> version.

extract() and node-insert make sense on std::list, so for consistency
they should be there. That's one part, the one I'm interested in.
Whether there's another API that could be added to std::list that
returns a list instead of a node_handle is another topic, and orthogonal
to whether node_handle-API should be on list, too.

Thanks,
Marc

Arthur O'Dwyer

unread,
Feb 13, 2018, 6:23:09 PM2/13/18
to ISO C++ Standard - Future Proposals
On Tuesday, February 13, 2018 at 2:38:08 AM UTC-8, Mutz, Marc wrote:
Hi,

Whenever I find myself using std::list, I do so because of slice(). Yet
a typical slice() call like

    std::list<~~~> local;
    local.splice(local.begin(), tasks, tasks.begin());

For reference, the above two-liner is equivalent to this three-liner...
    decltype(tasks) local;
    local.push_back(std::move(tasks.front()));
    tasks.pop_front();

is quite a mouthful and even long-time C++ developers (incl. myself)
need to look up what this call does. In C++17, and with std::set, this
could be written as:

    auto local = tasks.extract(tasks.begin());

No, that's incorrect. In C++17, the `extract` method returns a "node handle", which is not at all the same thing as a list. What you'd actually write in a hypothetical `extract`-enabled std::list implementation would be...

    decltype(tasks) local;
    auto nh = tasks.extract(tasks.begin());
    local.insert(std::move(nh));

or simply

    decltype(tasks) local;
    local.insert(tasks.extract(tasks.begin()));

Now unfortunately we hit a snag. `std::list`, being a sequence container (not an associative container), does not have a position-agnostic `insert` method!
So we end up having to tell the list exactly where we want to insert the new item:

    decltype(tasks) local;
    local.insert(local.begin(), tasks.extract(tasks.begin()));

Compare this to the version you didn't like:

    decltype(tasks) local;
    local.insert(local.begin(), tasks, tasks.begin());

Isn't the version you didn't like better than the version you're proposing?

HTH,
–Arthur

Nicol Bolas

unread,
Feb 13, 2018, 10:17:39 PM2/13/18
to ISO C++ Standard - Future Proposals
On Tuesday, February 13, 2018 at 6:17:28 PM UTC-5, Mutz, Marc wrote:
On 2018-02-13 16:49, Nicol Bolas wrote:
[...]
> Why? What's the point of exposing `node_handle` and that sort of thing
> on `std::list`?

Consistency, for one.

Different kinds of things are not expected to be consistent with each other because they're different. Sequence containers don't have the same interfaces as associative containers and vice-versa.

For example, `list` can handle splicing with arbitrary numbers of nodes, so long as they're all in a range. Associative containers cannot; `node_handle` only contains a single element. So their API only supports one-by-one element transfers.

So how would you extend the `extract` interface to handle ranges of nodes? Well, you'd have to have some object to store the range of nodes, obviously. And since it's a range, maybe it'd be nice to be able to add more elements to the range. Or perhaps remove some.

And we just reinvented `std::list`.

Adding `node_handle` functionality to `list` is like adding `operator[]` to `list`. Sure you can do it, but that doesn't mean you should. It serves no useful purpose and is fundamentally inferior to what we already can do at present.

Mutz, Marc

unread,
Feb 14, 2018, 7:37:41 AM2/14/18
to std-pr...@isocpp.org
Hi,

On 2018-02-14 00:23, Arthur O'Dwyer wrote:
[...]
>> std::list<~~~> local;
>> local.splice(local.begin(), tasks, tasks.begin());
>
> For reference, the above two-liner is equivalent to this
> three-liner...
> decltype(tasks) local;
> local.push_back(std::move(tasks.front()));
> tasks.pop_front();

No, it is not. In the splice() call, the value_type is neither copied
nor moved. The list node is simple re-linked. In the three-liner, the
value_type is moved, a new node is allocated, and the old one destroyed.

[...]
>> auto local = tasks.extract(tasks.begin());
>
> No, that's incorrect. In C++17, the `extract` method returns a "node
> handle", which is not at all the same thing as a list.

A node_handle gives me access to the element already (cf.
std::set::node_type::value()). My specific use-case was extracting a
single element from a list (think work-queue of a thread-pool, which is
100% of my use-cases for std::list over 15 years of professional C++
development). Right now, for ownership reasons, I need to use a local
std::list. But with node_handle, which is an owner, too, I don't need
that:

auto local = tasks.extract(tasks.begin());
do_something_with(local.value());
// at end of scope: 'local' deletes 'local.value()'
// alternatively, I can insert the node into some outgoing queue:
// out.insert(out.end(), std::move(local));

The benefit here is that the extraction and insertion of the work item
does not involve memory (de)allocations.

As for ranges: splice() is ok there, since the iterator pair
unambiguously identifies the source and the single iterator the target.
It's the one-element case where the splice() call is overly opaque.


Taking a step back, I appreciate that map::extract() was conceived as
the corresponding API to list::splice(), and that therefore,
feature-wise, list::extract() is not strictly necessary. But for me, not
having followed the standardisation process of this feature, extract()
et al were a way to allow access to nodes of node-based containers.
Splicing being just one of them. Taking a cue from Herb, let's look at
use-cases for splice() and ways to make them simpler: I showed node
extraction above, but node creation is another example:

You may want to separate node allocation from insertion into a (probably
shared) container: If you want the memory allocation outside the
critical section, what do you do? You create a std::list with one
element outside the critical section, and then splice the list into the
shared container under mutex protection.

So, there's even a use-case for a std::Container::node_type::create()
(or, rather, static container::create_node() b/c of allocators)
function!

Thanks,
Marc

Nicol Bolas

unread,
Feb 14, 2018, 9:57:09 AM2/14/18
to ISO C++ Standard - Future Proposals

We don't need a special API for the case of one element. If an API is fine for many elements, then it works just as well for a single one. Let's not forget the Zero/One/Infinity rule. Some types can only handle One, so they use  `node_handle`. Splice is for types that can handle Infinity.

If you're making a thread work-queue, you wouldn't want to write that code differently for the "pop one item" operation vs. "pop many items". One item is just a special case.

Taking a step back, I appreciate that map::extract() was conceived as
the corresponding API to list::splice(), and that therefore,
feature-wise, list::extract() is not strictly necessary. But for me, not
having followed the standardisation process of this feature, extract()
et al were a way to allow access to nodes of node-based containers.
Splicing being just one of them. Taking a cue from Herb, let's look at
use-cases for splice() and ways to make them simpler: I showed node
extraction above, but node creation is another example:

You may want to separate node allocation from insertion into a (probably
shared) container: If you want the memory allocation outside the
critical section, what do you do? You create a std::list with one
element outside the critical section, and then splice the list into the
shared container under mutex protection.

So, there's even a use-case for a std::Container::node_type::create()
(or, rather, static container::create_node() b/c of allocators)
function!

... how is that a use-case? You use a container to do that. Node handles don't need to have creation functionality. Node handles are nothing more than intermediaries. They allow some slight value manipulation, but they aren't meant to be single-element containers. You create a node in a container, extract it, and then insert it somewhere else. That's the usage pattern.

I don't know why it is that you want to treat `node_handle`s as real objects rather than intermediaries, but that's not their function. We use containers; that's the way the system is designed, and it works quite well this way.

If there was some design deficiency in this process, something that is otherwise genuinely impossible with this design, that would be one thing. But everything you want to do can be done, just not the way you want to do it. Containers have good support for node creation as it is. They already have allocators and ways to manipulate them. They already have `emplace` and similar functions. And so forth.

There's no reason beyond personal preference to duplicate all of these APIs in `node_handle`s.

Nicol Bolas

unread,
Feb 14, 2018, 10:29:55 AM2/14/18
to ISO C++ Standard - Future Proposals

To me, the principle need for a `list::extract` is to avoid having to do this:

decltype(source) lst(source.get_allocator());
lst
.splice(lst.begin(), begin, end);

It'd be much easier to be able to do:

auto lst = source.extract(begin, end);

It should just be a convenience wrapper around `splice` functionality.

Mutz, Marc

unread,
Feb 15, 2018, 3:28:01 AM2/15/18
to std-pr...@isocpp.org
On 2018-02-14 16:29, Nicol Bolas wrote:
[...]
> To me, the principle need for a `list::extract` is to avoid having to
> do this:
>
> decltype(source) lst(source.get_allocator());
> lst.splice(lst.begin(), begin, end);
> It'd be much easier to be able to do:
>
> auto lst = source.extract(begin, end);
> It should just be a convenience wrapper around `splice` functionality.

Right. It's a bit weird that list::extract returns a list while
map::extract returns a node_handle, but I guess that's ok, since map
does not have an extract(it, it) overload.

Now, what about the one-element case? Saying

auto lst = source.extract(source.begin(), ++source.begin());

is clearly wasteful, since you need to increment an iterator for no good
reason (even C++98 splice() is overloaded for range/single-element). So,
assume we also add

auto extracted = source.extract(source.begin());

or even

auto extracted = source.extract_front();

what's the decltype of 'extracted'? A list? Or something compatible with
a std::set::node_type?

Also: if we add list::extract(it, it) -> list, should we add
map::extract(it, it) -> map? After all, a sub-range of a map is properly
ordered for becoming a new map, both in the RB-tree and the hash-map
implementations.

Thanks,
Marc

Arthur O'Dwyer

unread,
Feb 15, 2018, 1:24:16 PM2/15/18
to ISO C++ Standard - Future Proposals
On Thu, Feb 15, 2018 at 12:27 AM, Mutz, Marc <ma...@kdab.com> wrote:
On 2018-02-14 16:29, Nicol Bolas wrote:
[...]
To me, the principle need for a `list::extract` is to avoid having to
do this:

decltype(source) lst(source.get_allocator());
lst.splice(lst.begin(), begin, end);
It'd be much easier to be able to do:

auto lst = source.extract(begin, end);
It should just be a convenience wrapper around `splice` functionality.

Right. It's a bit weird that list::extract returns a list while map::extract returns a node_handle, but I guess that's ok, since map does not have an extract(it, it) overload.

Okay, I'm convinced that Marc has a use-case for "moveless popping" an element off of a std::list (which could be used to shorten a critical section, for example).
However, `extract` is not the right name for this thing unless it returns a node_handle, which it apparently does not.
Marc, the name for "snip a range of elements out of a larger sequence" would be more like "sublist", analogous to "substr". However, "substr" is copying, whereas you want something that mutates the original container too; therefore the name should include a verb. Perhaps "extract_sublist" or "pop_sublist" or "snip_sublist". Perhaps just "snip", actually — it's sufficiently different from the already-taken "extract", and pleasingly similar to the existing verb "splice".

You'll have to decide how this is going to work with allocators (grrr, what a mess the committee has made here). The "snip" algorithm itself does not allocate or deallocate any memory, but it does cause the creation of a brand-new container (the returned list) which must come with an allocator suitable for deallocating the node(s) that you've snipped out. This means that the "snip" algorithm must copy the list's existing allocator — but it must not use select_on_container_copy_construction. You should specify this precisely, and think about whether there are any corner cases that might matter here.

Since list iterators are non-random-access iterators, I believe there will be a desire for

    auto lst = source.snip(source.begin(), 42);

as that is implementable much more efficiently than

    auto lst = source.snip(source.begin(), std::next(source.begin(), 42));

This eliminates your concern about the one-element case; it becomes simply

    auto lst = source.snip(source.begin(), 1);

with no special coding necessary.
However, there is still a use-case for

    auto lst = source.snip(first, last);

as well; I would tentatively expect both signatures to be provided.

Notice that source.snip(first, last) cannot possibly be an O(1) operation, because it needs to walk the entire sublist to decide on the value of lst.size() and the value to subtract from source.size().
Consider whether `forward_list::snip_after` should exist. It could probably be O(1) because forward_list does not track its own size.

Also: if we add list::extract(it, it) -> list, should we add map::extract(it, it) -> map? After all, a sub-range of a map is properly ordered for becoming a new map, both in the RB-tree and the hash-map implementations.

Proper sequential ordering isn't sufficient to become a brand-new map (AFAIK); you also need proper tree structure. Snipping a subtree into a new tree is possible, but detecting whether a range makes up a subtree structure isn't possible at the moment, and snipping out a subtree would tend to leave the source tree unbalanced.
Is it actually possible to snip out a range from an RB-tree, turn the range into a balanced RB-tree, and rebalance the source RB-tree, and do it all faster than if you just transferred the node-handles one at a time?

–Arthur

Nicol Bolas

unread,
Feb 15, 2018, 2:41:23 PM2/15/18
to ISO C++ Standard - Future Proposals


On Thursday, February 15, 2018 at 1:24:16 PM UTC-5, Arthur O'Dwyer wrote:
On Thu, Feb 15, 2018 at 12:27 AM, Mutz, Marc <ma...@kdab.com> wrote:
On 2018-02-14 16:29, Nicol Bolas wrote:
[...]
To me, the principle need for a `list::extract` is to avoid having to
do this:

decltype(source) lst(source.get_allocator());
lst.splice(lst.begin(), begin, end);
It'd be much easier to be able to do:

auto lst = source.extract(begin, end);
It should just be a convenience wrapper around `splice` functionality.

Right. It's a bit weird that list::extract returns a list while map::extract returns a node_handle, but I guess that's ok, since map does not have an extract(it, it) overload.

Okay, I'm convinced that Marc has a use-case for "moveless popping" an element off of a std::list (which could be used to shorten a critical section, for example).
However, `extract` is not the right name for this thing unless it returns a node_handle, which it apparently does not.

Fair enough.

Marc, the name for "snip a range of elements out of a larger sequence" would be more like "sublist", analogous to "substr". However, "substr" is copying, whereas you want something that mutates the original container too; therefore the name should include a verb. Perhaps "extract_sublist" or "pop_sublist" or "snip_sublist". Perhaps just "snip", actually — it's sufficiently different from the already-taken "extract", and pleasingly similar to the existing verb "splice".


Since list iterators are non-random-access iterators, I believe there will be a desire for

    auto lst = source.snip(source.begin(), 42);

as that is implementable much more efficiently than

    auto lst = source.snip(source.begin(), std::next(source.begin(), 42));

If you only have a size rather than an iterator, yes this will be faster. But if it's a choice between a size and an iterator you already have, neither is going to be faster.

In any case, if we're going to add sized `snip`, then we should also add sized `splice`, for the same reasons.

This eliminates your concern about the one-element case; it becomes simply

    auto lst = source.snip(source.begin(), 1);

with no special coding necessary.
However, there is still a use-case for

    auto lst = source.snip(first, last);

as well; I would tentatively expect both signatures to be provided.

Notice that source.snip(first, last) cannot possibly be an O(1) operation, because it needs to walk the entire sublist to decide on the value of lst.size() and the value to subtract from source.size().

Whether this is problematic or not depends entirely on what you do with the extracted elements. If you splice the entire snipped list into some other `list`, then that can be an O(1) operation, because you already have its size. If you had instead returned some specialized, unsized snipped-node-sequence object, then you would have to compute the new size at `splice` time. So whether at snip or splice time, you're computing a size.

But really, this is an argument for having an un-sized/non-O(1)-sized `list`. We have to deal with what we've got.

Reply all
Reply to author
Forward
0 new messages