Possibility of linear complexity for std::{set|map}::merge


mr...@google.com

Jun 28, 2018, 2:10:06 PM
to ISO C++ Standard - Discussion
Pre-C++17, there was no good way to merge the contents of associative containers like std::set and std::map without making copies, but C++17 added a new merge() function that directly extracts nodes from the source container. While this is a great addition for merging containers with copy-expensive keys, what surprises me is that the complexity of this function is O(N log N), at least according to Table 90 of N4713.
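
For illustration, a minimal usage sketch of the node-based merge (just my own toy example):

#include <cassert>
#include <set>
#include <string>

int main() {
    std::set<std::string> dest{"a", "c"};
    std::set<std::string> source{"b", "c", "d"};
    dest.merge(source);  // relinks nodes from source into dest; no string copies
    assert(dest.size() == 4);    // dest == {"a", "b", "c", "d"}
    assert(source.size() == 1);  // source == {"c"}: keys already in dest stay behind
}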

The two other ways I know of to merge associative containers are:

1) std::{set|map}::insert(other_container.begin(), other_container.end())
2) std::set_union() (or std::merge() for std::multi{set|map})

The former just seems to be std::{set|map}::merge() with copies instead of moves (with the same O(N log N) complexity). The latter can possibly achieve linear complexity with the right sort of inserter iterator, although its unwieldy nature makes it prone to mistakes, and it creates a third copy instead of merging into one of the two sources. Even though it does copies, the latter shows that merging the containers in linear time should be possible.
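
For concreteness, a rough sketch of both alternatives (the helper names are mine, and this assumes std::set<int> with the same comparator on both sides):

#include <algorithm>
#include <iterator>
#include <set>

// 1) Range insert: effectively merge() with copies instead of node moves,
//    O(B log(A + B)) in the general case.
void merge_by_copy(std::set<int>& a, const std::set<int>& b) {
    a.insert(b.begin(), b.end());
}

// 2) set_union into a third container: the output is produced in sorted order,
//    so with a hinted inserter the insertions can be amortized constant and the
//    whole pass linear, but everything is copied and a separate result set is needed.
std::set<int> merge_by_set_union(const std::set<int>& a, const std::set<int>& b) {
    std::set<int> result;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::inserter(result, result.end()));
    return result;
}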

Granted, std::{set|map}::merge also takes associative containers with a different comparator than its own, so the non-linear complexity probably comes from accounting for the source container being in a completely different order. But if the comparators are actually the same, what prevents us from making the process linear?

template<class C2>
void merge(std::set<Key, C2, Allocator>& source);
template<class C2>
void merge(std::set<Key, C2, Allocator>&& source);
template<class C2>
void merge(std::multiset<Key, C2, Allocator>& source);
template<class C2>
void merge(std::multiset<Key, C2, Allocator>&& source);

// New, specialized versions for containers with the same key comparison function.
void merge(std::set<Key, Compare, Allocator>& source);
void merge(std::set<Key, Compare, Allocator>&& source);
void merge(std::multiset<Key, Compare, Allocator>& source);
void merge(std::multiset<Key, Compare, Allocator>&& source);

It would be something like this (not the best implementation, but the idea is there):

void std::set::merge(std::set<Key, Compare, Allocator>& source) {
  auto comp = this->key_comp();
  auto source_it = source.begin();
  for (auto it = this->begin(); it != this->end(); ++it) {
    if (source_it == source.end()) break;
    if (comp(*it, *source_it)) continue;
    this->insert(it, source.extract(source_it++));
  }
  while (source_it != source.end()) {
    this->insert(this->end(), source.extract(source_it++));
  }
}

Is there something I'm missing, or is this actually possible? If so, is it possible to retroactively apply it to C++17, given how recent it is and how it's not really a change in the API but rather in the implementation? It shouldn't cause API/ABI breakage either.

Richard Smith

Jun 28, 2018, 4:15:57 PM
to std-dis...@isocpp.org
I think you want "while (!comp(*it, *source_it))" here, not "if (comp(*it, *source_it))".

Is there something I'm missing, or is this actually possible?

That's theoretically possible, but that implementation violates the current complexity requirements, because the first loop is O(A + B), whereas the required complexity is O(B log (A + B)) (where A = dest.size() and B = source.size()). If B is small compared to A, this will slow down the algorithm. (I think that can be fixed by using a tree traversal over A in the first loop instead of a linear scan, which should give O(min(A, B log (A + B)) + B) for the first loop, which is an improvement over the standard's algorithm if B is large compared to A and is sorted according to A's comparator, but is otherwise the same.)

If so, is it possible to retroactively apply it to C++17, given how recent it is and how it's not really a change in the API but rather in the implementation? It shouldn't cause API/ABI breakage either.

The standard does not dictate a particular implementation; if you are able to implement something like the above in a way that satisfies (or improves upon) the current complexity requirements, I imagine standard library vendors would be interested in seeing a benchmark for how that implementation compares to their existing one. If you can persuade standard library vendors that your approach is better in practice, that'd be good motivation for a library paper suggesting that the asymptotic complexity guarantees of merge be adjusted to be O(min(A, B log (A + B)) + B) in the case where B is sorted according to A's comparator.
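
If it helps, a rough sketch of the kind of workload to time (the constants and structure are just my guess at a starting point): two large same-comparator sets whose keys interleave, run once against the stock library and once against a patched merge().

#include <chrono>
#include <cstdio>
#include <set>

int main() {
    constexpr int kN = 1'000'000;
    std::set<int> dest, source;
    for (int i = 0; i < kN; ++i) dest.insert(2 * i);        // even keys
    for (int i = 0; i < kN; ++i) source.insert(2 * i + 1);  // odd keys

    auto start = std::chrono::steady_clock::now();
    dest.merge(source);                                      // the operation under test
    auto stop = std::chrono::steady_clock::now();

    std::printf("merging %d + %d elements took %lld ms\n", kN, kN,
                static_cast<long long>(
                    std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)
                        .count()));
}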

T. C.

Jun 29, 2018, 6:06:53 AM
to ISO C++ Standard - Discussion, mr...@google.com
The comparator can be stateful. How can you reliably know that they have the same behavior across the two containers?

mr...@google.com

Jun 29, 2018, 12:15:14 PM
to ISO C++ Standard - Discussion, mr...@google.com
On Thursday, June 28, 2018 at 1:15:57 PM UTC-7, Richard Smith wrote:
That's theoretically possible, but that implementation violates the current complexity requirements, because the first loop is O(A + B), whereas the required complexity is O(B log (A + B)) (where A = dest.size() and B = source.size()). If B is small compared to A, this will slow down the algorithm. (I think that can be fixed by using a tree traversal over A in the first loop instead of a linear scan, which should give O(min(A, B log (A + B)) + B) for the first loop, which is an improvement over the standard's algorithm if B is large compared to A and is sorted according to A's comparator, but is otherwise the same.)

Ah, true, since it's a tree anyways, tree traversal to the next hint would be better. Out of curiosity, how did you get O(min(A, B log (A + B)) + B) as a complexity?

On Friday, June 29, 2018 at 3:06:53 AM UTC-7, T. C. wrote:
The comparator can be stateful. How can you reliably know that they have the same behavior across the two containers?

I'm not sure what to think about that. Is it some static internal variable that changes upon every insert, or maybe a comparator that hashes each node's pointer and chooses between multiple comparisons? A stateful comparator just seems like a can of worms that results in weird behavior everywhere. The result of calling merge() or anything else should not change based on implementation, so a comparator that changes behavior depending on how many times it was constructed or called, what it was compared with, or in which order the comparisons happened should be considered undefined behavior. Also, a comparator should not behave differently between container instances, or else something like moving or copying between containers would break the sorted-order invariant. In fact, for that reason alone, I don't think there is ever a good use for a comparator that changes behavior midway.

T. C.

Jun 29, 2018, 12:20:53 PM
to ISO C++ Standard - Discussion, mr...@google.com


On Friday, June 29, 2018 at 12:15:14 PM UTC-4, mr...@google.com wrote:
On Friday, June 29, 2018 at 3:06:53 AM UTC-7, T. C. wrote:
The comparator can be stateful. How can you reliably know that they have the same behavior across the two containers?

I'm not sure what to think about that. Is it some static internal variable that changes upon every insert, or maybe a comparator that hashes each node's pointer and chooses between multiple comparisons? A stateful comparator just seems like a can of worms that results in weird behavior everywhere. The result of calling merge() or anything else should not change based on implementation, so a comparator that changes behavior depending on how many times it was constructed or called, what it was compared with, or in which order the comparisons happened should be considered undefined behavior. Also, a comparator should not behave differently between container instances, or else something like moving or copying between containers would break the sorted-order invariant. In fact, for that reason alone, I don't think there is ever a good use for a comparator that changes behavior midway.

It doesn't need to change behavior midway. The point is that the behavior of the comparator is not solely determined by its type. 

template<class T> struct comp { bool ascending; bool operator()(const T& l, const T& r) const { return ascending? l < r : r < l; } };

set<int, comp<int>> s1(comp<int>{true});
set<int, comp<int>> s2(comp<int>{false});

s1.merge(s2);

mr...@google.com

Jun 29, 2018, 12:58:03 PM
to ISO C++ Standard - Discussion, mr...@google.com
On Friday, June 29, 2018 at 9:20:53 AM UTC-7, T. C. wrote:
It doesn't need to change behavior midway. The point is that the behavior of the comparator is not solely determined by its type. 

template<class T> struct comp { bool ascending; bool operator()(const T& l, const T& r) const { return ascending? l < r : r < l; } };

set<int, comp<int>> s1(comp<int>{true});
set<int, comp<int>> s2(comp<int>{false});

s1.merge(s2);

I see, that makes sense. So in other words, while going through the source to figure out where to put the next node, I also need to check whether it is sorted relative to the previous element? Sort of like this (pseudo-code-ish):

void std::set::merge(std::set<Key, C2, Allocator>& source) {
  auto comp = this->key_comp();
  auto dest_it = this->root_it;  // position of the most recently inserted node
  bool have_prev = false;        // becomes true after the first insertion
  for (auto source_it = source.begin(); source_it != source.end(); ) {
    // If this is the first iteration or the source is not sorted via comp at this point,
    // start searching from the root. Otherwise, start searching from where the previous
    // source element was inserted (its key is *dest_it).
    auto search_hint_it = (have_prev && comp(*dest_it, *source_it)) ? dest_it : this->root_it;
    auto insert_hint_it = this->lower_bound_hint(search_hint_it, *source_it);
    dest_it = this->insert(insert_hint_it, source.extract(source_it++));
    have_prev = true;
  }
}

This would be similar to the iterator-range constructor, which is linear if the range is sorted by comp and O(N log N) otherwise.

Of course, this assumes that all insertions can start from the root and that a lower_bound_hint implemented via tree traversal exists.
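
For comparison, the range-constructor guarantee I'm referring to, in a tiny standalone example (illustrative only):

#include <set>
#include <vector>

int main() {
    std::vector<int> sorted = {1, 2, 3, 4, 5};
    std::set<int> s(sorted.begin(), sorted.end());      // linear: input already sorted by comp
    std::vector<int> shuffled = {3, 1, 5, 2, 4};
    std::set<int> t(shuffled.begin(), shuffled.end());  // O(N log N): input not sorted
}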

Richard Smith

Jun 29, 2018, 5:27:42 PM
to std-dis...@isocpp.org, mr...@google.com
On Fri, 29 Jun 2018 at 09:15, mrpi via ISO C++ Standard - Discussion <std-dis...@isocpp.org> wrote:
On Thursday, June 28, 2018 at 1:15:57 PM UTC-7, Richard Smith wrote:
That's theoretically possible, but that implementation violates the current complexity requirements, because the first loop is O(A + B), whereas the required complexity is O(B log (A + B)) (where A = dest.size() and B = source.size()). If B is small compared to A, this will slow down the algorithm. (I think that can be fixed by using a tree traversal over A in the first loop instead of a linear scan, which should give O(min(A, B log (A + B)) + B) for the first loop, which is an improvement over the standard's algorithm if B is large compared to A and is sorted according to A's comparator, but is otherwise the same.)

Ah, true, since it's a tree anyways, tree traversal to the next hint would be better. Out of curiosity, how did you get O(min(A, B log (A + B)) + B) as a complexity?

Just a crude upper bound: the tree traversal visits all nodes in A at most three times each[1], so if we don't skip any parts of the tree, the traversal is bounded above by k(A + B) steps. And each time we advance through the tree looking for the next element in B, that takes at most log (#elements) time, which is bounded above by log(A + B), which gives the other bound of B log (A + B) for the traversal. The other cost is for B correctly-hinted insertions, which adds an amortized upper bound of O(B) since insert with a correct hint is amortized constant time.

 [1] OK, the tree gets rebalanced as we go, due to the insertions, so this is not exactly the whole truth, but it should be close enough. I think you can make this precise by including the potential additional one-time traversal cost as a constant factor in the (amortized constant) cost of the rebalancing.
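
Putting the two pieces together in symbols (just restating the argument above): the traversal costs at most min(k(A + B), B log (A + B)), and the correctly-hinted insertions cost O(B) amortized, so the total is O(min(A, B log (A + B)) + B).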
