# Emulating map with a sorted vector

11 views

### Scott Meyers

Sep 27, 2000, 3:00:00 AM9/27/00
to
[I'm posting this to both comp.std.c++ and comp.lang.c++.moderated, because
there is a standards question as well as a workaround question.]

In his column in the April 2000 C++ Report, Matt Austern explains how a
sorted vector may sometimes be a better choice than a set. I've been
trying to extend his reasoning to argue that a sorted vector<pair<K,V> >
may sometimes be better than a map<K, V>.

If I have a sorted vector<T> and I want to search it using lower_bound and
a custom comparison function, the signature of the comparison function is
expected to be this, more or less:

bool compFunc(const T& lhs, const T& rhs); // return whether lhs precedes
// rhs in the sort order

When T is pair<K, V>, that means we expect the comparison function to look
more or less like this:

bool compFunc(const pair<K, V>& lhs, const pair<K, V>& rhs);

The problem is that if I'm emulating map/multimap semantics, I don't want
to compare a pair<K, V> and a pair<K, V>, I want to compare a K and a K,
because maps look only at keys when searching.

While it's running, lower_bound compares a given value with a value it
finds in a range. With a custom comparison function, there is no reason
why the type of the given value must be the same as the type of the value
in the range. In my case, I want to compare a type K (a specific key
value) with an object of type pair<K, V> (a value from a
vector<pair<K, V> >). That suggests I should be able to declare my
comparison function as follows:

bool compFunc(const pair<K, V>& p, const K& v);

>From what I can tell, the standard neither explicitly allows nor explicitly
prohibits this, so I wrote the following program to see if it would
compile. (I don't care if it runs, only if it compiles.)

#include <vector>
#include <string>
#include <utility>
#include <algorithm>

using std::vector;
using std::pair;
using std::string;
using std::lower_bound;

typedef vector<pair<int, string> > MapVec;

bool keyCompare(const pair<int, string>& p, int val);

int main()
{
MapVec mv;

lower_bound(mv.begin(), mv.end(), 10, keyCompare); // all I care about
} // is whether this
// call compiles

I was hoping that all my platforms would accept this, because that way I
could sleep soundly at night after telling people they could consider doing
this kind of thing. Of the five platforms I tested, four compiled it, but
one did not. As I said, I believe both results are conforming, but I'd
appreciate others' readings of the matter.

You may wonder why I care. I want to be able to search the sorted vector
of pairs using lower_bound and a comparison function. If the comparision
function must compare two pair objects, that means I have to construct a
pair, even though I may have only a key value. This is not only awkward,
it may also engender a performance penalty for the construction of the
superfluous value part of the pair. In my code above, for example, I'd
have to turn an int into a pair<int, string>, and that would require that I
construct a string I will never use. I want to avoid the awkwardness and
the possible performance penalty.

If I can't use my asymmetric comparison function above, is there a better
alternative to writing my own function to perform binary search? Such a
function would be almost identical to the algorithms binary_search and
lower_bound, so I'd prefer to avoid that if I can.

Thanks,

Scott

--
I've scheduled more public seminars. Check out
http://www.aristeia.com/seminars/ for details.

---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]

### David Abrahams

Sep 27, 2000, 3:00:00 AM9/27/00
to

wrote in message news:MPG.143ab91a1...@news.supernews.com...

> I was hoping that all my platforms would accept this, because that way I
> could sleep soundly at night after telling people they could consider
doing
> this kind of thing. Of the five platforms I tested, four compiled it, but
> one did not. As I said, I believe both results are conforming, but I'd
> appreciate others' readings of the matter.

The standard clearly requires this to work, and I take advantage of the fact
all the time for the same reasons you do:

25.3.3.1 lower_bound

template<class ForwardIterator, class T>
ForwardIterator lower_bound(ForwardIterator first, ForwardIterator last,
const T& value);

template<class ForwardIterator, class T, class Compare>
ForwardIterator lower_bound(ForwardIterator first, ForwardIterator last,
const T& value, Compare comp);

1 Requires: Type T is LessThanComparable (20.1.2).
2 Effects: Finds the first position into which value can be inserted without
violating the ordering.
3 Returns: The furthermost iterator i in the range [first, last] such that
for any iterator j in the range [first, i) the following corresponding
conditions hold: *j < value or comp(*j, value) != false

The only requirement on comp() here is that it be able to accept the
arguments in the order specified.

So, time to name names: which is the nonconforming library?

-Dave

### Brian McNamara!

Sep 27, 2000, 3:00:00 AM9/27/00
to
"David Abrahams" <abra...@mediaone.net> once said:
>The standard clearly requires this to work, and I take advantage of the fact
>all the time for the same reasons you do:

I disagree; the standard all but forbids it.

See 25.3-4. I can't quote it here, as my copy of the standard doesn't
let me cut-and-paste text (argh), but it says that comparators must be
"strict weak orderings" and then defines strict weak orderings as
operations on two arguments _of_the_same_type_.

Note also that LessThanComparable (20.1.2) is defined for two arguments
of the same type.

So I think Scott's original example is not standards-conforming.

I also think this is a weakness in the standard (there are more
general/expressive choices for "template concepts" than the ones
described in the standard, and we need those to solve this problem),
but I'm sure they did the best they could for the first standard.

--
Brian McNamara

### Scott Meyers

Sep 27, 2000, 3:00:00 AM9/27/00
to
On Wed, 27 Sep 2000 14:35:38 GMT, David Abrahams wrote:
> The standard clearly requires this to work, and I take advantage of the fact
> all the time for the same reasons you do:

I would be thrilled if it were supposed to work, but I'm skeptical. In the
example code I posted, for example, my comparison function is declared like
this:

bool keyCompare(const pair<int, string>& p, int val);

Where does the standard say that they pair should be the first parameter
and the int should be the second? I only put them in that order because I
looked at several STL implementations and noticed that that was the order
in which they were placed in calls to the comparison function. Would an
implementation be non-conforming if the types were expected in the reverse
order?

> So, time to name names: which is the nonconforming library?

I honestly don't know, but it looks like it might be SGI's. It's whatever
library is used by Comeau's on-line test compiler
(http://www.comeaucomputing.com/tryitout/). I found the diagnostics
interesting, because they suggest that the library is using some kind of
compile-time semantic assertion rather than rejecting some specific line of
code. (As Program Chair, I can't resist pointing out that there will be
two papers and a discussion session on this topic at the October 10
workshop on template programming. Check out
http://www.oonumerics.org/tmpw00.)

Anyway, this is what Comeau C++ says about my sample code:

Comeau C/C++ 4.2.43 for ONLINE_EVALUATION

"como/include/sgi/concept_checks.h", line 262: error: no suitable user-
defined
conversion from "__check_equal<std::iterator_traits<std::pair<int,
std::string> *>::value_type>" to "__check_equal<int>" exists
__check_equal<_TypeX> t1 = __check_equal<_TypeY>();
^
detected during:
instantiation of "void _STL_SAME_TYPE_ERROR<_TypeX,
_TypeY>::__type_X_not_same_as_type_Y(_TypeX, _TypeY)
[with _TypeX=int,
_TypeY=std::iterator_traits<std::pair<int, std::string>
*>::value_type]" at line 1978 of
"como/include/sgi/stl_algo.h"
instantiation of "_ForwardIter std::lower_bound(_ForwardIter,
_ForwardIter, const _Tp &, _Compare) [with
_ForwardIter=std::vector<std::pair<int, std::string>,
std::allocator<std::pair<int,
std::string>>>::value_type
*, _Tp=int, _Compare=bool (*)(const std::pair<int,
std::string> &, int)]" at line 19 of "5505.c"

"como/include/sgi/concept_checks.h", line 308: error: no suitable constructor
exists to convert from "const int" to "std::pair<int, std::string>"
return __f(__first, __second);
^
detected during:
instantiation of "_Ret _STL_BINARY_FUNCTION_ERROR<_Func, _Ret,
_First,
_Second>::__binary_function_requirement_violation(_Func
&, const _First &, const _Second &) [with _Func=bool
(*)(const std::pair<int, std::string> &, int),
_Ret=bool, _First=int, _Second=int]" at line 1980 of
"como/include/sgi/stl_algo.h"
instantiation of "_ForwardIter std::lower_bound(_ForwardIter,
_ForwardIter, const _Tp &, _Compare) [with
_ForwardIter=std::vector<std::pair<int, std::string>,
std::allocator<std::pair<int,
std::string>>>::value_type
*, _Tp=int, _Compare=bool (*)(const std::pair<int,
std::string> &, int)]" at line 19 of "5505.c"

Scott

--
I've scheduled more public seminars. Check out
http://www.aristeia.com/seminars/ for details.

---

### llewelly.@@edevnull.dot.com

Sep 27, 2000, 3:00:00 AM9/27/00
to
"David Abrahams" <abra...@mediaone.net> writes:

> wrote in message news:MPG.143ab91a1...@news.supernews.com...
> > I was hoping that all my platforms would accept this, because that way I
> > could sleep soundly at night after telling people they could consider
> doing
> > this kind of thing. Of the five platforms I tested, four compiled it, but
> > one did not. As I said, I believe both results are conforming, but I'd
> > appreciate others' readings of the matter.
>

> The standard clearly requires this to work, and I take advantage of the fact
> all the time for the same reasons you do:
>
>

> 25.3.3.1 lower_bound
>
> template<class ForwardIterator, class T>
> ForwardIterator lower_bound(ForwardIterator first, ForwardIterator last,
> const T& value);
>
> template<class ForwardIterator, class T, class Compare>
> ForwardIterator lower_bound(ForwardIterator first, ForwardIterator last,
> const T& value, Compare comp);
>
> 1 Requires: Type T is LessThanComparable (20.1.2).
> 2 Effects: Finds the first position into which value can be inserted without
> violating the ordering.
> 3 Returns: The furthermost iterator i in the range [first, last] such that
> for any iterator j in the range [first, i) the following corresponding
> conditions hold: *j < value or comp(*j, value) != false

I think 25.3.3.1 could use some clairification. There are actually two
lower_bound() unctions, and one of them has no need to require that
T be LessThanComparable (as the user specifes comp), but I cannot
find any wording in the standard that makes the 'Requires' section
conditional on which function is used.

(by the way, this issue is (I think) orthogonal to the question Scott

>
> The only requirement on comp() here is that it be able to accept the
> arguments in the order specified.
>

> So, time to name names: which is the nonconforming library?

I do not know; all the compilers I have compiled it, but gave a link
error until I provided an (empty) definition for keyCompare().

### David Abrahams

Sep 27, 2000, 3:00:00 AM9/27/00
to

"Scott Meyers" <sme...@aristeia.com> wrote in message
news:MPG.143bd2a98...@news.supernews.com...

> On Wed, 27 Sep 2000 14:35:38 GMT, David Abrahams wrote:
> > The standard clearly requires this to work, and I take advantage of the
fact
> > all the time for the same reasons you do:
>
> I would be thrilled if it were supposed to work, but I'm skeptical. In
the
> example code I posted, for example, my comparison function is declared
like
> this:
>
> bool keyCompare(const pair<int, string>& p, int val);
>
> Where does the standard say that they pair should be the first parameter
> and the int should be the second?

In the excerpt I quoted in my previous post. Specifically:

3 Returns: The furthermost iterator i in the range [first, last] such that
for any iterator j in the range [first, i) the following corresponding
conditions hold: *j < value or comp(*j, value) != false

> I only put them in that order because I

> looked at several STL implementations and noticed that that was the order
> in which they were placed in calls to the comparison function. Would an
> implementation be non-conforming if the types were expected in the reverse
> order?

Yes, but I think that Brian McNamara is technically correct, that the
comparator is required to be a StrictWeakOrdering, which defines an
operation on operands of the same type. In fact, that's *almost* what the
Comeau compiler is checking for here. I don't think it got the check quite
right here, though, because it's not checking that the comparator can
compare objects of the same type. The STL code reads:

__STL_REQUIRES_SAME_TYPE(_Tp,
typename iterator_traits<_ForwardIter>::value_type);

Which is checking that the value type of the iterator is the same as the
type of the argument you're comparing with. I think it would be fairly easy
to fix the concept check to do the right thing (see the end of this
message), though I also agree with Brian that the StrictWeakOrdering
requirement is a serious weakness; maybe even a defect.

I know it's a bit tricky, but surely, this KeyCompare satisfies all the
requirements of a StrictWeakOrdering and ought to work. After all, the
standard does -not- say that the iterator's value_type and the 3rd argument
to lower_bound have to be the same type:

struct KeyCompare {
bool operator()(const std::pair<int, string>& x, const std::pair<int,
string>& y)
{ return x.first < y.first; }
bool operator()(const pair<int, string>& p, int val) { return p.first <
val; }
};

> > So, time to name names: which is the nonconforming library?
>

> I honestly don't know, but it looks like it might be SGI's. It's whatever
> library is used by Comeau's on-line test compiler
> (http://www.comeaucomputing.com/tryitout/). I found the diagnostics
> interesting, because they suggest that the library is using some kind of
> compile-time semantic assertion rather than rejecting some specific line
of
> code. (As Program Chair, I can't resist pointing out that there will be
> two papers and a discussion session on this topic at the October 10
> workshop on template programming. Check out
> http://www.oonumerics.org/tmpw00.)
>
> Anyway, this is what Comeau C++ says about my sample code:

<concept check violations snipped>

A more accurate concept check would involve instantiating a template which
uses the comparator to compare two objects of the iterator's value_type.
Something like this should do it:

template <class F, class T>
struct CheckOrderingOnType {
friend void call_operator_must_accept_operands_of_same_type(const F& f,
const T& t) {
f(t, t);
}

Then you just instantiate (using SGI's argument names):

CheckOrderingOnType<__comp, typename
iterator_traits<_ForwardIter>::value_type> x;

-Dave

### Fergus Henderson

Sep 27, 2000, 3:00:00 AM9/27/00
to
"David Abrahams" <abra...@mediaone.net> writes:

>25.3.3.1 lower_bound
...

>template<class ForwardIterator, class T, class Compare>
>ForwardIterator lower_bound(ForwardIterator first, ForwardIterator last,
>const T& value, Compare comp);
>
>1 Requires: Type T is LessThanComparable (20.1.2).
>2 Effects: Finds the first position into which value can be inserted without
>violating the ordering.

>3 Returns: The furthermost iterator i in the range [first, last] such that
>for any iterator j in the range [first, i) the following corresponding
>conditions hold: *j < value or comp(*j, value) != false
>

>The only requirement on comp() here is that it be able to accept the
>arguments in the order specified.

There are additional requirements on comp() specified in 25.3:
it has to induce a strict weak ordering.

| 25.3 - Sorting and related operations [lib.alg.sorting]
|
| -1- All the operations in lib.alg.sorting have two versions: one that takes a function
| object of type Compare and one that uses an operator<.
|
| -2- Compare is used as a function object which returns true if the first argument is less
| than the second, and false otherwise. Compare comp is used throughout for algorithms
| assuming an ordering relation. It is assumed that comp will not apply any non-constant
| function through the dereferenced iterator.
|
| -3- For all algorithms that take Compare, there is a version that uses operator< instead.
| That is, comp(*i, *j) != false defaults to *i < *j != false. For the algorithms to work
| correctly, comp has to induce a strict weak ordering on the values.
|
| -4- The term strict refers to the requirement of an irreflexive relation (!comp(x, x) for
| all x), [...]

The definition of "strict weak ordering" implies that `comp' must
accept arguments of the same type.

So if `comp' does not accept elements of the same type, then
algorithms such as lower_bound() do not have to "work correctly",
whatever that means. It's not clear if "not working correctly" can
include failing to compile.

--
Fergus Henderson <f...@cs.mu.oz.au> | "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh> | of excellence is a lethal habit"
PGP: finger f...@128.250.37.3 | -- the last words of T. S. Garp.

### llewelly.@@edevnull.dot.com

Sep 28, 2000, 3:00:00 AM9/28/00
to
Scott Meyers <sme...@aristeia.com> writes:

> On Wed, 27 Sep 2000 14:35:38 GMT, David Abrahams wrote:
> > The standard clearly requires this to work, and I take advantage of the fact
> > all the time for the same reasons you do:
>
> I would be thrilled if it were supposed to work, but I'm skeptical. In the
> example code I posted, for example, my comparison function is declared like
> this:
>
> bool keyCompare(const pair<int, string>& p, int val);
>
> Where does the standard say that they pair should be the first parameter

> and the int should be the second? I only put them in that order because I

> looked at several STL implementations and noticed that that was the order
> in which they were placed in calls to the comparison function. Would an
> implementation be non-conforming if the types were expected in the reverse
> order?
>

> > So, time to name names: which is the nonconforming library?
>
> I honestly don't know, but it looks like it might be SGI's. It's whatever
> library is used by Comeau's on-line test compiler
> (http://www.comeaucomputing.com/tryitout/). I found the diagnostics
> interesting, because they suggest that the library is using some kind of
> compile-time semantic assertion rather than rejecting some specific line of
> code. (As Program Chair, I can't resist pointing out that there will be
> two papers and a discussion session on this topic at the October 10
> workshop on template programming. Check out
> http://www.oonumerics.org/tmpw00.)

Yes - the newest version of sgi's lib tries to provide explicit
constraints as components that are used like compile time asserts.

[snip]

### James Kuyper

Sep 28, 2000, 3:00:00 AM9/28/00
to
"llewelly."@@edevnull.dot.com wrote:
...

> I think 25.3.3.1 could use some clairification. There are actually two
> lower_bound() unctions, and one of them has no need to require that
> T be LessThanComparable (as the user specifes comp), but I cannot
> find any wording in the standard that makes the 'Requires' section
> conditional on which function is used.
>
> (by the way, this issue is (I think) orthogonal to the question Scott

Yes, I find the wording that the standard uses in these functions
confusing. When there are two functions, and one uses a "a<b" and the
other uses
"comp(a,b)", the two descriptions are always intermingled. There are
parts that apparantly only apply to one or the other, and the reader is
required to disentangle them. Similar things are done with Predicate
arguments.

If the authors were to reply to this criticism by saying "it's obvious
which statements apply to which version", I'd have to agree, in most
cases. However, I've found that I can usually disentangle things like
this much better than most people, even better than most technically
trained people. Even I was fairly confused when I first ran into this
style of description, and I could easily see other people remaining
confused permanently.

I think those clauses should be re-written. However, I suspect that
doing so would expand the standard by dozens of pages of
almost-redundant description.

### James Kuyper

Sep 28, 2000, 3:00:00 AM9/28/00
to
Fergus Henderson wrote:
...
> The definition of "strict weak ordering" implies that `comp' must
> accept arguments of the same type.

Not quite. It requires that comp(*i,*j) and comp(*j,*i) must both be
legal expressions, but that's not quite the same thing. Consider:

template<class Key, class T>
struct KeyCompare
{
bool operator()(const Key&, const Key&);
bool operator()(const std::pair<Key,T>&, const Key&);
bool operator()(const Key&, const std::pair<Key,T>&);
bool operator()(const std::pair<Key,T>&, const std::pair<Key,T>&);

### Dietmar Kuehl

Sep 28, 2000, 3:00:00 AM9/28/00
to
In article <39D29BA6...@wizard.net>,

James Kuyper <kuy...@wizard.net> wrote:
> I think those clauses should be re-written. However, I suspect that
> doing so would expand the standard by dozens of pages of
> almost-redundant description.

Not necessarily:

template<class ForwardIterator, class T>

ForwardIterator
lower_bound(ForwardIterator first, ForwardIterator last,

const T& value);

Returns: lower_bound(first, last, value,
less<typename iterator_traits<ForwardIterator>::value_type>());

I'd guess that implementations use something like this anyway.
--
<mailto:dietma...@yahoo.com> <http://www.dietmar-kuehl.de/>

Sent via Deja.com http://www.deja.com/

### Ross Smith

Sep 28, 2000, 3:00:00 AM9/28/00
to
Matthew Austern wrote:
>
> The stupid part is that it's easy to write the algorithm so that
> something like Scott's use works. What's hard is finding the right
> way to generalize the strict weak ordering requirement. Getting rid
> of it entirely, and saying that comp is any old function object, is
> definitely wrong. What's less clear is how to generalize it
> correctly.
>
> Here's one way to state the generalization. I'm afraid of putting
> something this complicated in the standard, unless there's a way to
> say the same thing in a much simpler way. I'm taking V to be
> ForwardIterator's value type.
>
> comp is a function object whose first argument type is V
> and whose second argument type is T, and where comp(x, y)
> is equivalent to comp'(x, pi(y)), where comp' is some
> strict weak ordering on V and where pi is some
> homomorphism from V to T.

I think it can be stated more simply, something like this:

comp is a function object whose first argument type is V and whose
second argument type is T, such that, for any two iterators i<j in
the range [first,last), and any value t of type T, comp(*j,t)
implies comp(*i,t).

This suggests an obvious generalisation to a "lower_bound_if()" that
takes a unary predicate instead of a binary one (from which a usage
equivalent to Scott's asymmetric comparison can be easily derived using
a binder).

There's a problem, though.

Compare lower_bound() with upper_bound(). The natural way to use an
asymmetric comparison with lower_bound() is comp(*i,t), whereas the
natural usage for upper_bound() is comp(t,*i). A look at the code of the
SGI and Dinkumware implementations shows that they do indeed arrange
their comparisons this way.

If the necessity isn't obvious, think about under what circumstances you
need to compare with the first and last elements in the sequence. Given
only commp(*i,t), upper_bound() can't distinguish between situations
where the correct result is first, and those where it's first+1.
Similarly, given only comp(t,*i), lower_bound() can't distinguish
between results of last-1 and last.

So I don't think it's possible to modify the standard to allow the
asymmetric comparisons Scott wanted, unless we're willing to give up the
ability to call lower_bound() and upper_bound() on the same set of
arguments (and throw out equal_range(), or at least leave it restricted
to the currently allowed symmetric comparisons).

The unary-predicate search function I mentioned above would still be
useful, though. The distinction between lower_bound() and upper_bound()
disappears (there's no notion of equivalence, so there are no borderline
cases that satisfy one but not the other), and we can define a
"binary_find_if()" that returns the first iterator in the sequence for
which pred(*i) is true, or last if there is no such element. The
requirement on pred is that, for any two iterators i<j in [first,last),
pred(i) implies pred(j).

Here's the algorithm (based on upper_bound() from the SGI STL):

template <typename FwdIter, typename Pred>
FwdIter binary_find_if(FwdIter first, FwdIter last, Pred p) {
typename std::iterator_traits<FwdIter>::difference_type
half, length(std::distance(first, last));
FwdIter middle;
while (length > 0) {
half = length >> 1;
middle = first;
if (p(*middle)) {
length = half;
} else {
first = middle;
++first;
length -= half + 1;
}
}
return first;
}

--
Ross Smith <ros...@ihug.co.nz> The Internet Group, Auckland, New Zealand
========================================================================
"C++ is to programming as sex is to reproduction. Better ways might
technically exist but they're not nearly as much fun." -- Nikolai Irgens

### James Kuyper

Sep 28, 2000, 3:00:00 AM9/28/00
to
Dietmar Kuehl wrote:
>
> In article <39D29BA6...@wizard.net>,
> James Kuyper <kuy...@wizard.net> wrote:
> > I think those clauses should be re-written. However, I suspect that
> > doing so would expand the standard by dozens of pages of
> > almost-redundant description.
>
> Not necessarily:
>
> template<class ForwardIterator, class T>
> ForwardIterator
> lower_bound(ForwardIterator first, ForwardIterator last,
> const T& value);
>
> Returns: lower_bound(first, last, value,
> less<typename iterator_traits<ForwardIterator>::value_type>());
>
> I'd guess that implementations use something like this anyway.

You're right. The standard could simply add a statement near the start
of the Algorithms section that any Compare template parameter defaults
to std::less<iterator_traits<Iterator>::value_type>, where Iterator
stands in for the appropriate iterator argument. Similarly, it could say
that BinaryPredicate always defaults to std::equal_to<>. It's a little
more complicated with functions that take Predicate arguments; the form
that doesn't take the Predicate argument takes a value argument instead,
and is equivalent to using the Predicate form with
std::bind2nd(std::equal_to<T>(),value).

This would significantly simplify the descriptions of the algorithms
themselves. However, there is a cost to this approach: the part of the
description that covers the function object is usually more abstract
than the part that covers the corresponding operator; it might be harder
for users to recognize that the default behavior is what they need.
However, even with that disadvantage, I still think it would be easier
to understand than the way it's currently written.

### David Abrahams

Sep 28, 2000, 3:00:00 AM9/28/00
to

"Dietmar Kuehl" <dietma...@yahoo.com> wrote in message
news:8qv4n3\$ml5\$1...@nnrp1.deja.com...

> In article <39D29BA6...@wizard.net>,
> James Kuyper <kuy...@wizard.net> wrote:
> > I think those clauses should be re-written. However, I suspect that
> > doing so would expand the standard by dozens of pages of
> > almost-redundant description.
>
> Not necessarily:
>
> template<class ForwardIterator, class T>
> ForwardIterator
> lower_bound(ForwardIterator first, ForwardIterator last,
> const T& value);
>
> Returns: lower_bound(first, last, value,
> less<typename iterator_traits<ForwardIterator>::value_type>());
>
> I'd guess that implementations use something like this anyway.

Oh, here's an interesting twist! Implementations which do exactly that are
nonconforming, since today's standard specifies operator<(), not less<>(),
which may have been specialized.

I'd guess that this is arguably a weakness, too. By today's standard, it
makes the first form useless for pointers unless they point into the same
array.

-Dave

### John Potter

Sep 28, 2000, 3:00:00 AM9/28/00
to
On Wed, 27 Sep 2000 17:52:37 GMT, Scott Meyers <sme...@aristeia.com>
wrote:

> Anyway, this is what Comeau C++ says about my sample code:
>

> Comeau C/C++ 4.2.43 for ONLINE_EVALUATION
>
> "como/include/sgi/concept_checks.h", line 262: error: no suitable user-
> defined
> conversion from "__check_equal<std::iterator_traits<std::pair<int,
> std::string> *>::value_type>" to "__check_equal<int>" exists
> __check_equal<_TypeX> t1 = __check_equal<_TypeY>();

So, our friendly compiler is being picky about nonsense. A slight
modification should sufficiently confuse it.

bool keyCompare(const pair<int, string>& p,

const pair<int,string>& val);

int main()
{
MapVec mv;

lower_bound(mv.begin(), mv.end(),

reinterpret_cast<pair<int,string>const&>(
static_cast<int const&>(10)), keyCompare);
}

You should be able to sleep well when recommending this elegant code. ;)

John

### Scott Meyers

Sep 28, 2000, 3:00:00 AM9/28/00
to
On Thu, 28 Sep 2000 06:01:06 CST, Dietmar Kuehl wrote:
> Returns: lower_bound(first, last, value,
> less<typename iterator_traits<ForwardIterator>::value_type>());

Alas, this isn't quite true. lower_bound() without a predicate doesn't use
less<T>, it hardwires in use of operator<(), just like find() hardwires in
use of operator==() and sort() without a predicate hardwires in use of
operator<(). I think it's too bad that these do that instead of using the
corresponding comparison function objects, but them's the breaks.

Scott

--
I've scheduled more public seminars. Check out
http://www.aristeia.com/seminars/ for details.

---

### Dietmar Kuehl

Sep 28, 2000, 3:00:00 AM9/28/00
to
Hi,
In article <MPG.143d026eb...@news.supernews.com>,

Scott Meyers <sme...@aristeia.com> wrote:
> On Thu, 28 Sep 2000 06:01:06 CST, Dietmar Kuehl wrote:
> > Returns: lower_bound(first, last, value,
> > less<typename
iterator_traits<ForwardIterator>::value_type>());
>
> Alas, this isn't quite true. lower_bound() without a predicate
> doesn't use less<T>, it hardwires in use of operator<(), just like
> find() hardwires in use of operator==() and sort() without a
> predicate hardwires in use of operator<().

Are you talking of what the standard says or how the implementations
work you have looked at? I think that the implementation is free to
implement it either way. ... and even the above statement in the
standard would not disallow this. Well, if 'std::less<T>' is
specialized to produce some side effect, the implementation is probably
supposed to also produce this side effect. <sigh>

Sent via Deja.com http://www.deja.com/

---

### Dietmar Kuehl

Sep 28, 2000, 3:00:00 AM9/28/00
to
Hi,
In article <tiHA5.19421\$tn.4...@typhoon.ne.mediaone.net>,

"David Abrahams" <abra...@mediaone.net> wrote:
> Oh, here's an interesting twist! Implementations which do exactly
> that are nonconforming, since today's standard specifies operator<(),
> not less<>(), which may have been specialized.

It may have been specialized but actually this doesn't matter! In
20.3.3 paragraph 4 'operator()()' of 'std::less<T>' is defined to
return 'x < y'. The implementation of 'operator()()' might be different
but it still has to produce the equivalent of 'x < y' since there are
restrictions on specializations of standard classes (17.4.3.1 paragraph
1):

Such a specialization ... results in undefined behavior ... unless
the specialiation meets the standard library requirements for the
original template.

The standard library requirement on 'std::less<T>::operator()()' is to
return the result of 'x < y' if the arguments 'x' and 'y' are passed to
this function - whether specialized or not. That is, the following
assertion should never fail:

assert((x < y) == std::less<T>()(x, y));

assuming that 'x' and 'y' are of type 'T' or 'T const&'.

### Ross Smith

Sep 28, 2000, 3:00:00 AM9/28/00
to
Dietmar Kuehl wrote:
>
> Hi,
> In article <MPG.143d026eb...@news.supernews.com>,
> Scott Meyers <sme...@aristeia.com> wrote:
> > On Thu, 28 Sep 2000 06:01:06 CST, Dietmar Kuehl wrote:
> > > Returns: lower_bound(first, last, value,
> > > less<typename
> iterator_traits<ForwardIterator>::value_type>());
> >
> > Alas, this isn't quite true. lower_bound() without a predicate
> > doesn't use less<T>, it hardwires in use of operator<(), just like
> > find() hardwires in use of operator==() and sort() without a
> > predicate hardwires in use of operator<().
>
> Are you talking of what the standard says or how the implementations
> work you have looked at? I think that the implementation is free to
> implement it either way. ... and even the above statement in the
> standard would not disallow this. Well, if 'std::less<T>' is
> specialized to produce some side effect, the implementation is probably
> supposed to also produce this side effect. <sigh>

No, I'm afraid Scott is right. Check the first few paragraphs of 25.3.
The sorting algorithms are explicitly stated to default to operator<,
not std::less.

--
Ross Smith <ros...@ihug.co.nz> The Internet Group, Auckland, New Zealand
========================================================================
"C++ is to programming as sex is to reproduction. Better ways might
technically exist but they're not nearly as much fun." -- Nikolai Irgens

---

### James Kuyper

Sep 28, 2000, 3:00:00 AM9/28/00
to
Matthew Austern wrote:
...
> Incidentally, there should be similar words for the version that uses
> operator<. Again, it clearly is wrong to permit any old operator<
> that happens to have the right argument types: operator< has to induce
> a strict weak ordering.

That's already there; that's part of what LessThanComparable means.
What's less clear (but still, I believe, true) is that the
LessThanComparable requirement applies only when using the "<" version,
and strict weak ordering only applies to the Compare object version, and
only when operating on the objects in the iterator range.

I don't think Compare has to create a strict weak ordering of all
objects it could be applied to; I think it need only create a strict
weak ordering of the objects that you actually give to lower_bound<>.
25.3.3p3: "_comp_ has to induce a strict weak ordering on the values".
Not on all possible values, only on "the" values.

### John Potter

Sep 29, 2000, 3:00:00 AM9/29/00
to
On Wed, 27 Sep 2000 19:25:42 GMT, "David Abrahams"
<abra...@mediaone.net> wrote:

> > Where does the standard say that they pair should be the first parameter
> > and the int should be the second?
>

> In the excerpt I quoted in my previous post. Specifically:
>

> 3 Returns: The furthermost iterator i in the range [first, last] such that
> for any iterator j in the range [first, i) the following corresponding
> conditions hold: *j < value or comp(*j, value) != false

Consider:

template <class FI, class T, class Comp>
FI lower_bound (FI lo, FI hi, T const& t, Comp comp) {
while (lo != hi) {
FI mid(lo), pre;
FI::distance_type d(distance(lo, hi) / 2);
if (d) {
pre = mid ++;
}
if (comp(*mid, t))
lo = ++ mid;
else if (comp(t, *mid))
hi = mid;
else if (lo == mid)
hi = mid;
else if (comp(*pre, *mid))
lo = hi = mid;
else
hi = pre;
}
return lo;
}

This gets the job done, but you may claim that it does not meet the
complexity requirements of at most log(last - first) + 1 compares.
Is that a strict requirement or amortized or bigO? If the intent of
the standard is to specify the algorithm, it would be better to
present the code than try to fake it with words. Ostream_iterator
is an example where the code is essentially given.

John

### Scott Meyers

Sep 29, 2000, 3:00:00 AM9/29/00
to
On Thu, 28 Sep 2000 13:40:49 CST, John Potter wrote:
> lower_bound(mv.begin(), mv.end(),
> reinterpret_cast<pair<int,string>const&>(
> static_cast<int const&>(10)), keyCompare);
> }
>
> You should be able to sleep well when recommending this elegant code. ;)

Too much typing. I'd use C-style casts. Much more elegant. Much.

Scott

--
I've scheduled more public seminars. Check out
http://www.aristeia.com/seminars/ for details.

---

### David Abrahams

Sep 29, 2000, 3:00:00 AM9/29/00
to

"Dietmar Kuehl" <dietma...@yahoo.com> wrote in message
news:8r03ii\$g9t\$1...@nnrp1.deja.com...

> Hi,
> In article <tiHA5.19421\$tn.4...@typhoon.ne.mediaone.net>,
> "David Abrahams" <abra...@mediaone.net> wrote:
> > Oh, here's an interesting twist! Implementations which do exactly
> > that are nonconforming, since today's standard specifies operator<(),
> > not less<>(), which may have been specialized.
>
> It may have been specialized but actually this doesn't matter! In
> 20.3.3 paragraph 4 'operator()()' of 'std::less<T>' is defined to
> return 'x < y'. The implementation of 'operator()()' might be different
> but it still has to produce the equivalent of 'x < y' since there are
> restrictions on specializations of standard classes (17.4.3.1 paragraph
> 1):
>
> Such a specialization ... results in undefined behavior ... unless
> the specialiation meets the standard library requirements for the
> original template.
>
> The standard library requirement on 'std::less<T>::operator()()' is to
> return the result of 'x < y' if the arguments 'x' and 'y' are passed to
> this function - whether specialized or not. That is, the following
> assertion should never fail:
>
> assert((x < y) == std::less<T>()(x, y));

Yes, but it is allowed to do additional work (e.g. have other effects like
side-effects). I'm just proving that the proposed specification of
lower_bound() is not strictly equivalent to the current one.

-Dave

### James Kuyper

Sep 29, 2000, 3:00:00 AM9/29/00
to
Dietmar Kuehl wrote:
>
> Hi,
> In article <tiHA5.19421\$tn.4...@typhoon.ne.mediaone.net>,
> "David Abrahams" <abra...@mediaone.net> wrote:
> > Oh, here's an interesting twist! Implementations which do exactly
> > that are nonconforming, since today's standard specifies operator<(),
> > not less<>(), which may have been specialized.
>
> It may have been specialized but actually this doesn't matter! In
> 20.3.3 paragraph 4 'operator()()' of 'std::less<T>' is defined to
> return 'x < y'. The implementation of 'operator()()' might be different
> but it still has to produce the equivalent of 'x < y' since there are
> restrictions on specializations of standard classes (17.4.3.1 paragraph
> 1):
>
> Such a specialization ... results in undefined behavior ... unless
> the specialiation meets the standard library requirements for the
> original template.
>
> The standard library requirement on 'std::less<T>::operator()()' is to
> return the result of 'x < y' if the arguments 'x' and 'y' are passed to
> this function - whether specialized or not. That is, the following
> assertion should never fail:
>
> assert((x < y) == std::less<T>()(x, y));
>
> assuming that 'x' and 'y' are of type 'T' or 'T const&'.

Two problems:
1) std::less<T> might have been specialized by the user to cause
side-effects, in addition to meeting the standards requirements on it's
return value.

2) The standard specifically constrains std::less<T*>(p,q) more tightly
than "p<q" (where p and q are both pointers to T). That tighter
constraint is meaningful only because it allows std::less<T*>(p,q) to
return a different value than "p<q".

Therefore, requiring the use of std::less<> would be a significant
change in the requirements.

### Dietmar Kuehl

Sep 29, 2000, 3:00:00 AM9/29/00
to
Hi,
Ross Smith (ros...@ihug.co.nz) wrote:
: No, I'm afraid Scott is right. Check the first few paragraphs of 25.3.

: The sorting algorithms are explicitly stated to default to operator<,
: not std::less.

These sections define the semantics which means that eg.

std::sort(begin, end);

behaves as if

std::sort(begin, end, std::less<T>());

is used. In particular, I cannot see any restriction which would
prohibit the implementation

template <typename It>
void sort(It begin, It end) {
sort(begin, end, std::less<typename iterator_traits<It>::value_type>());
}

This is what I'm proposing to explicitly state in the standard instead
of using the somewhat convoluted technique to intermangle description
of two different functions.

Actually, Dave's concern about specialization of 'std::less' for user
defined types 'T' is something which also slightly worries me: Although
I think the standard requirements on specializations should be
sufficient, I'm not clear whether the following is a valid user
specification of 'std::less':

struct foo { bool operator< (foo const&) const; };
namepspace std {
template <>
class less<foo> {
bool operator(foo const& f1, foo const& f2) const {
std::cout << "hello, world\n";
return f1 < f2;
}
};
}

Clearly, the behavior of 'std::sort()' would vary depending on whether
the function is implemented like given above or without use of the
'std::less<foo>' functor. Of course, 'std::sort()' can still use the
comparator version but using a special auxiliary functor and not
'std::less<foo>': Internally used classes, say '_Less<foo>' do not run
the risk of being specialized. However, I don't think that
implementations should be required to use such strange classes but
should be allowed to use the standard functors.
--
<mailto:dietma...@yahoo.de>
<http://www.fmi.uni-konstanz.de/~kuehl/>
I am a realistic optimist - that's why I appear to be slightly pessimistic

### James Kuyper

Sep 29, 2000, 3:00:00 AM9/29/00
to
John Potter wrote:
...

> Is that a strict requirement or amortized or bigO? If the intent of
> the standard is to specify the algorithm, it would be better to
> present the code than try to fake it with words. Ostream_iterator
> is an example where the code is essentially given.

It is not the intent of the standard to specify the algorithm. It is the
intent of the standard to require an algorithm at least as good as a
particular algorithm, known by the committee to be a feasible algorithm,
but without requiring use of that particular algorithm. This statement
is true of all the complexity requirements, not just this particular
one.

### James Kuyper

Sep 30, 2000, 3:00:00 AM9/30/00
to
Dietmar Kuehl wrote:
...

> sufficient, I'm not clear whether the following is a valid user
> specification of 'std::less':
>
> struct foo { bool operator< (foo const&) const; };
> namepspace std {
> template <>
> class less<foo> {
> bool operator(foo const& f1, foo const& f2) const {
> std::cout << "hello, world\n";
> return f1 < f2;
> }
> };
> }

It is. It meets all the requirements for user-defined specialization of
std::less<foo>.

### Dietmar Kuehl

Sep 30, 2000, 8:54:45 PM9/30/00
to
Hi,
In article <39D5291A...@wizard.net>,

James Kuyper <kuy...@wizard.net> wrote:
> It is. It meets all the requirements for user-defined specialization
of
> std::less<foo>.

Except that it has different effects than those specified in the
standard. The standards lists the effect to be 'returns the result of
x < y' (or something like this). The specialization has additional
effects.

Sent via Deja.com http://www.deja.com/

---

### John Potter

Oct 2, 2000, 3:00:00 AM10/2/00
to
On Fri, 29 Sep 2000 23:36:15 GMT, James Kuyper <kuy...@wizard.net>
wrote:

> John Potter wrote:
> ...
> > Is that a strict requirement or amortized or bigO? If the intent of
> > the standard is to specify the algorithm, it would be better to
> > present the code than try to fake it with words. Ostream_iterator
> > is an example where the code is essentially given.
>
> It is not the intent of the standard to specify the algorithm. It is the
> intent of the standard to require an algorithm at least as good as a
> particular algorithm, known by the committee to be a feasible algorithm,
> but without requiring use of that particular algorithm. This statement
> is true of all the complexity requirements, not just this particular
> one.

That's a nice bunch of nonsense to counter my nonsense. Let's skip
that and get back to the first part which is a question that you did

The complexity requirements in the standard are a mixture of wordings
which appear to be bigO, amortized bigO, and absolute. Some are
impossible to meet, some are meaningless, and some are just confusing.

Impossible:
vector<int> v;
v.insert(v.end(), istream_iterator<int>(cin),
istream_iterator<int>());
Since the distance to the end of the vector is zero, this must execute
in zero time.

Meaningless:
vector::push_back is amortized constant time.
There is a general problem with amortized over what. We may assume
that it is amortized over a sequence of push_backs. For associative
iterators, the assumption of over a complete traversal using only
++ or --, not both is not as obvious.
new_size = (1 + 1e-20) * size() + 1;
This is amortized constant time as size() goes to an impossible
value. For usable values it is linear.

Confusing:
The lower_bound above. The posted algorithm is O(log(size())).

Taken with a grain of salt, the complexity requirements are great. The
insert example will be linear, the multiplier for vector growth will be
a reasonable number, and my posted algorithm is conformant. Taken
literally, the complexity requirements need a lot of work to hold in a
court of law. As a user, all I ask of the standard is that the
complexity requirements promise an order statistic. Absolutes may be
left to QoI not conformance.

My real objection was to the statement that the formal description of
the result of lower_bound restricts the operations that an
implementation may perform. Any algorithm which produces the required
result with logarithmic compares is fine.

John

### James Kuyper

Oct 3, 2000, 3:00:00 AM10/3/00
to
John Potter wrote:
...

> The complexity requirements in the standard are a mixture of wordings
> which appear to be bigO, amortized bigO, and absolute. Some are
> impossible to meet, some are meaningless, and some are just confusing.
>
> Impossible:
> vector<int> v;
> v.insert(v.end(), istream_iterator<int>(cin),
> istream_iterator<int>());
> Since the distance to the end of the vector is zero, this must execute
> in zero time.

Conventional usage is that complexity measures are taken in the limit of
large N. Ideally it would be in the limit of infinite N, but for real
implementations there's always a physically imposed maximum N that is
meaningful, and "large" should be defined by comparison with that
maximum. In principle, every algorithm gets an additional logarithmic
factor due to the fact that the size needed to store loop iterators
grows logarithmically with the maximum range they need to cover. For
instance, each time you multiply N by 2^32, you need an additional 32
bits of storage for your iterators. There is necessarily a corresponding
increase in the time needed to read and write the values of those
iterators.
As a practical matter, complexity measures are usually taken as applying
with fixed-size data types, with corresponding range limits, and N is
taken to be of size comparable to but significantly less than those
limits.

The standard says nothing to endorse this conventional usage, but
neither does it reject it. As you point out, without some such
interpretation, the requirements are literally unachievable.

> Meaningless:
> vector::push_back is amortized constant time.
> There is a general problem with amortized over what. We may assume
> that it is amortized over a sequence of push_backs. For associative
> iterators, the assumption of over a complete traversal using only
> ++ or --, not both is not as obvious.
> new_size = (1 + 1e-20) * size() + 1;
> This is amortized constant time as size() goes to an impossible
> value. For usable values it is linear.

I can't disagree; "amortized" should be better defined; it's definition
will generally have to be different in the different contexts that it's
used in.

> Confusing:
> The lower_bound above. The posted algorithm is O(log(size())).

Not necessarily. On ForwardIterators, std::distance() and std::advance()
aren't guaranteed to be any better than linear in the distance
traversed. The way you use them, the total distance traversed by
std::distance() is approximately 2*size(). For large enough N, the
linear number of iterations will dominate the logarithmic number of
comparisons, even if the comparisons are much more expensive than the
iterations.

This is perfectly consistent with the standard; it requires a
logarithmic number of comparisons, but the number of "steps taken" is
said to be linear for non-random-access iterators. Therefore, yes, your
algorithm does meet the standard's complexity requirements. I haven't
bothered to verify that it does the right thing, but it looks like it
would be a fairly typical implementation of std::lower_bound, except for
the reversed order of the comparisons, and the corresponding reversals
in logic.

...

> My real objection was to the statement that the formal description of
> the result of lower_bound restricts the operations that an
> implementation may perform. Any algorithm which produces the required
> result with logarithmic compares is fine.

Agreed (modulo the fact that there's also a complexity requirement on
the number of steps taken). But that DOES restrict "the operations that
an implementation may perform"; there's lots of algorithms that don't
meet the complexity requirements, and they're all prohibited for a
conforming implementation.

No one was making that statement, anyway. Complexity requirements are a
side issue you brought in, and I responded to. The real issue was that
the definition of std::lower_bound is in terms of comp(*j,value), not
!comp(value,*j). As has been pointed out, the requirements on 'comp'
guarantee that the code can be written in terms of reversed comparisons
instead of forward ones, with corresponding reversals in the logic.
However, it's only those requirements which allow such an implementation
- if it weren't for the requirement that compare() must impose a "strict
weak order" on the value, then the reversal would be prohibited.

### Brian McNamara!

Oct 3, 2000, 3:00:00 AM10/3/00
to
Ross Smith <ros...@ihug.co.nz> once said:
>Compare lower_bound() with upper_bound(). The natural way to use an
>asymmetric comparison with lower_bound() is comp(*i,t), whereas the
>natural usage for upper_bound() is comp(t,*i). A look at the code of the
>SGI and Dinkumware implementations shows that they do indeed arrange
>their comparisons this way.
>
>If the necessity isn't obvious, think about under what circumstances you
>need to compare with the first and last elements in the sequence. Given
>only commp(*i,t), upper_bound() can't distinguish between situations
>where the correct result is first, and those where it's first+1.
>Similarly, given only comp(t,*i), lower_bound() can't distinguish
>between results of last-1 and last.

This is an important point. However

>So I don't think it's possible to modify the standard to allow the
>asymmetric comparisons Scott wanted, unless we're willing to give up the
>ability to call lower_bound() and upper_bound() on the same set of
>arguments (and throw out equal_range(), or at least leave it restricted
>to the currently allowed symmetric comparisons).

I maybe disagree with this (I'm unclear about exactly what you mean).
Here is my take.

Basically, the lower_bound/upper_bound/equal_range trio divide a
sequence into three ranges:

--------------|----------|---------------------------
lower equiv upper

The full division requires _two_ tests, by virtue of the equivalence
class in the middle. With a symmetric comparator, these two tests can
both be performed by a single comparator; x and y are in the same
equivalence class if !comp(x,y)&&!comp(y,x).

When x and y are different types, however, the comparator allows only
one test. Consider the set of names bound to types considered here:

FI lower_bound(FI first, FI last, const T& val, C comp);
typedef iterator_traits<FI>::value_type V;

If C has the signature

bool operator()( V, T );

then lower_bound works, but not the other two. That is, we can
partition the sequence like so:

--------------|--------------------------------------
lower rest

If C has the signature

bool operator()( T, V );

then upper_bound works, but not the other two. That is, we can
partition the sequence like so:

-------------------------|---------------------------
rest upper

If C has both signatures, then we can do all three operations and divide
the sequence into

--------------|----------|---------------------------
lower equiv upper

Thus the "requirements" are

algorithm comparator
--------- ----------
lower_bound bool operator()( V, T );
upper_bound bool operator()( T, V );
equal_range both versions

Note that if T and V are the same type, then a single comparator
operation meets both requirements simultaneously, and degenerates into
the version in the current standard.

With these relaxed restrictions, not only would Scott's (OP) code
compile, but also one could define a comparator like

struct C {
bool operator()(const pair<int, string>& p, int val);
bool operator()(int val, const pair<int, string>& p);
};

which was legal for equal_range. (It was unclear to me if you would
consider this a "symmetric" comparator.)

>The unary-predicate search function I mentioned above would still be
>useful, though. The distinction between lower_bound() and upper_bound()
>disappears (there's no notion of equivalence, so there are no borderline
>cases that satisfy one but not the other), and we can define a
>"binary_find_if()" that returns the first iterator in the sequence for
>which pred(*i) is true, or last if there is no such element. The
>requirement on pred is that, for any two iterators i<j in [first,last),
>pred(i) implies pred(j).

This is quite nice!

--
Brian McNamara