std::midpoint and iterators

292 views
Skip to first unread message

jgot...@gmail.com

unread,
Mar 5, 2019, 9:00:40 PM3/5/19
to ISO C++ Standard - Future Proposals
The new std::midpoint function (defined in P0811R2) defines a new function std::midpoint.  It is defined for all integral types and pointers, but not for other iterator types.  I think it should be defined for all random-access iterator types.  A major use-case for this function is in implementing algorithms that have to split a range in two, such as mergesort or median-of-three partitioning for quicksort.  Also, as this function is specified now, it is implementation-defined whether it will work for std::vector<X>::iterators.  If vector's iterators are specified as pointers it will, and if not it won't.  That can easily change when switching between compilers, or even between optimization levels of the same compiler.

Joe Gottman

Jeffrey Yasskin

unread,
Mar 6, 2019, 1:41:10 PM3/6/19
to std-pr...@isocpp.org
It sounds useful to extend midpoint() to work on at least random-access iterators, and possibly all forward iterators. Want to write a paper?

On Tue, Mar 5, 2019 at 6:00 PM <jgot...@gmail.com> wrote:
The new std::midpoint function (defined in P0811R2) defines a new function std::midpoint.  It is defined for all integral types and pointers, but not for other iterator types.  I think it should be defined for all random-access iterator types.  A major use-case for this function is in implementing algorithms that have to split a range in two, such as mergesort or median-of-three partitioning for quicksort.  Also, as this function is specified now, it is implementation-defined whether it will work for std::vector<X>::iterators.  If vector's iterators are specified as pointers it will, and if not it won't.  That can easily change when switching between compilers, or even between optimization levels of the same compiler.

Joe Gottman

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/17327c95-88a5-4467-b775-dee91f8b3bce%40isocpp.org.

jgot...@gmail.com

unread,
Mar 6, 2019, 9:40:58 PM3/6/19
to ISO C++ Standard - Future Proposals
I'm of two minds about extending the midpoint() function to all iterator types.  It would definitely be useful, but there are a couple of potential problems.  Currently the midpoint() function is always of constant complexity; if we extend it to forward iterators it will be O(n) for forward or bidirectional iterator types. Also, for a pair of pointers  p and q into the same container,, both midpoint(p, q) and midpoint(q, p) are currently well-defined.  It is possible to maintain this property if p and q are random-access iterators into the same container, but not if they are only bidirectional or forward iterators.

Joe Gottman

Daniel Krügler

unread,
Mar 7, 2019, 2:03:32 AM3/7/19
to std-pr...@isocpp.org
Am Do., 7. März 2019 um 03:41 Uhr schrieb <jgot...@gmail.com>:
>
> I'm of two minds about extending the midpoint() function to all iterator types. It would definitely be useful, but there are a couple of potential problems. Currently the midpoint() function is always of constant complexity; if we extend it to forward iterators it will be O(n) for forward or bidirectional iterator types. Also, for a pair of pointers p and q into the same container,, both midpoint(p, q) and midpoint(q, p) are currently well-defined. It is possible to maintain this property if p and q are random-access iterators into the same container, but not if they are only bidirectional or forward iterators.
>
> Joe Gottman

I don't consider this as an argument, *not* to provide this
functionality. We have a long-established practice of functions in the
standard library (especially in [algorithms]) that specify complexity
limits depending on iterator categories, e.g. for is_permutation:

Complexity: No applications of the corresponding predicate if
ForwardIterator1 and ForwardIterator2
meet the requirements of random access iterators and last1 - first1 !=
last2 - first2.
Otherwise, exactly last1 - first1 applications of the corresponding
predicate if equal(first1,
last1, first2, last2) would return true if pred was not given in the
argument list or equal(first1,
last1, first2, last2, pred) would return true if pred was given in the
argument list; otherwise,
at worst O(N2), where N has the value last1 - first1.

Several other examples exist.

- Daniel

Nicol Bolas

unread,
Mar 7, 2019, 11:29:46 AM3/7/19
to ISO C++ Standard - Future Proposals


On Thursday, March 7, 2019 at 2:03:32 AM UTC-5, Daniel Krügler wrote:
Am Do., 7. März 2019 um 03:41 Uhr schrieb <jgot...@gmail.com>:
>
> I'm of two minds about extending the midpoint() function to all iterator types.  It would definitely be useful, but there are a couple of potential problems.  Currently the midpoint() function is always of constant complexity; if we extend it to forward iterators it will be O(n) for forward or bidirectional iterator types. Also, for a pair of pointers  p and q into the same container,, both midpoint(p, q) and midpoint(q, p) are currently well-defined.  It is possible to maintain this property if p and q are random-access iterators into the same container, but not if they are only bidirectional or forward iterators.
>
> Joe Gottman

I don't consider this as an argument, *not* to provide this
functionality.

And yet, we do this all the time. `std::sort`, `std::lower_bound`, and such all require RandomAccessRanges. They don't need that requirement; you can do sorting and binary searches on any ForwardRange. But that's not something anyone actually wants to do; it's excruciatingly slow.

Furthermore, let's say that I have a `std::forward_list`. This is a ForwardRange, but it is also a SizedRange. If I want to do some `midpoint`-based algorithm, the most efficient way to do this is not by iterators, but by indices. That is, you start with an index range of [0, size), and you `midpoint` that to get your middle index. You then use `range::advance` to increment the begin iterator to the midpoint. And then you decide which side you're interested in.

But you keep the index around to do you next `midpoint` call. This minimize the number of increments to iterators: you will only ever perform n increments at the most.

If we let people use `midpoint` directly on ForwardRanges, you're performing lots more iterator increments. Each part of a binary search increments the iterator N times, followed by N/2 to get the midpoint. Even if we restrict it to a SizedRange, that still doesn't work, because we only know the size of a `std::forward_list`, not the size of an iterator pair from a `forward_list`.

This is precisely why we don't require `sort` and `lower_bound` to work on non-RandomAccessRanges: to prevent people from writing bad, inefficient algorithms. I submit that `midpoint` is just a part of `lower_bound`, so it should have a similar restriction.

Daniel Krügler

unread,
Mar 10, 2019, 9:40:31 AM3/10/19
to std-pr...@isocpp.org
Am Do., 7. März 2019 um 17:29 Uhr schrieb Nicol Bolas <jmck...@gmail.com>:
>
>
>
> On Thursday, March 7, 2019 at 2:03:32 AM UTC-5, Daniel Krügler wrote:
>>
>> Am Do., 7. März 2019 um 03:41 Uhr schrieb <jgot...@gmail.com>:
>> >
>> > I'm of two minds about extending the midpoint() function to all iterator types. It would definitely be useful, but there are a couple of potential problems. Currently the midpoint() function is always of constant complexity; if we extend it to forward iterators it will be O(n) for forward or bidirectional iterator types. Also, for a pair of pointers p and q into the same container,, both midpoint(p, q) and midpoint(q, p) are currently well-defined. It is possible to maintain this property if p and q are random-access iterators into the same container, but not if they are only bidirectional or forward iterators.
>> >
>> > Joe Gottman
>>
>> I don't consider this as an argument, *not* to provide this
>> functionality.
>> And yet, we do this all the time.

In my original reply I was giving an argument, *why* we could add this
functionality. It was not an argument that was intended to suggest
that it needs to be added.

> `std::sort`, `std::lower_bound`, and such all require RandomAccessRanges. They don't need that requirement; you can do sorting and binary searches on any ForwardRange. But that's not something anyone actually wants to do; it's excruciatingly slow.

This alone is not a reason to not provide such a functionality. For
example, std::distance is "slow" for forward iterators, but it is
still a useful functionality.

> Furthermore, let's say that I have a `std::forward_list`. This is a ForwardRange, but it is also a SizedRange.

I cannot find evidence that supports your claim: According to [range.sized] p2

"[..] T models SizedRange only if ranges::size(t) is O(1) [..]

std::forward_list does not provide a size() member function, because
that would conflict with the design targets of that specific
container. In addition, the requirements imposed by [range.prim.size]
seem to conflict with ranges::size() being well-formed for
std::forward_list. There does not even exist support for a std::size
range access for std::forward_list.

> If I want to do some `midpoint`-based algorithm, the most efficient way to do this is not by iterators, but by indices. That is, you start with an index range of [0, size), and you `midpoint` that to get your middle index. You then use `range::advance` to increment the begin iterator to the midpoint. And then you decide which side you're interested in.
>

Our original point of discussion was an algorithm, now you are
switching to container functionality. Our (unwritten) guidelines have
the effect that we have a much more conservative view when deciding to
provide functionality for containers (such as for forward_list for
example), but we have no such strict guidelines for algorithms.

> But you keep the index around to do you next `midpoint` call. This minimize the number of increments to iterators: you will only ever perform n increments at the most.
>
> If we let people use `midpoint` directly on ForwardRanges, you're performing lots more iterator increments. Each part of a binary search increments the iterator N times, followed by N/2 to get the midpoint. Even if we restrict it to a SizedRange, that still doesn't work, because we only know the size of a `std::forward_list`, not the size of an iterator pair from a `forward_list`.
>

I was not suggesting to provide container support for midpoint. This
would be inappropriate for the reasons that you and me have provided.

> This is precisely why we don't require `sort` and `lower_bound` to work on non-RandomAccessRanges: to prevent people from writing bad, inefficient algorithms. I submit that `midpoint` is just a part of `lower_bound`, so it should have a similar restriction.

I disagree that this is *precisely* the reason why we don't have
standarized std::sort and std::lower_bound to work for non-random
access ranges. We do provide support for the algorithm partition_point
even for forward iterators for example, even though the algorithm
needs to call the equivalent of std::distance which has linear
complexity for non-random access iterators. Stepanov and McJones
provide in Elements of Programming

We need to decide whether we want an middle_point algorithm for
forward iterators. There may be some reasons not to provide it, but
the question is whether they are sufficient. On the other hand we can
always start with stricter requirements and can relax those
requirements in the future if there is evidence that there are
sufficient motivating reasons. So unless someone brings forward some
usecases for plain forward iterator support for middle_point I'm happy
to standardize only a random access iterator based version of
middle_point. However, such a reduced provision does not naturally
fall out of the existing Standard Library contents, it requires an
individual decision.

- Daniel
Reply all
Reply to author
Forward
0 new messages