Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Andrei's "iterators must go" presentation

603 views
Skip to first unread message

georg...@gmail.com

unread,
May 9, 2009, 2:03:29 PM5/9/09
to
I've read through this presentation:

http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/2009/05/08/iterators-must-go.pdf

and I am confused about how ranges are significantly different enough
from iterators to warrant the avocation of their elimination. Yeah,
iterators have some complexity, but I've always been led to believe
the complexity was there for a reason.

Is it syntactic sugar or is there something actually gained here?


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Jerry Coffin

unread,
May 10, 2009, 4:32:26 PM5/10/09
to
In article <e8f5fb01-1fde-411f-b10b-72dacafea4e5
@t10g2000vbg.googlegroups.com>, georg...@gmail.com says...

> I've read through this presentation:
>
> http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/2009/05/08/iterators-must-go.pdf
>
> and I am confused about how ranges are significantly different enough
> from iterators to warrant the avocation of their elimination. Yeah,
> iterators have some complexity, but I've always been led to believe
> the complexity was there for a reason.
>
> Is it syntactic sugar or is there something actually gained here?

There's more than syntactic sugar. There's the recognition that
iterators almost always come in pairs, but (right now) there's no way to
verify that a pair is really a pair. Just for example, something like:

int a[10], b[10], c[10];

std::copy(a, b, c);

This makes no sense, but C++ doesn't provide a reasonable way to verify
that. Containers reduce the problems, but we can still do silly things:

std::vector<int> a(10), b(10), c(10};

std::copy(a.begin(), b.end(), c.begin());

This gets especially ugly when/if you have (for example) two sources of
input and need to place the output in a third place -- with five
iteratators to track, it's easy to make a mistake, and nearly impossible
for the compiler to diagnose the mistake.

By wrapping the beginning and ending points up into a single range
object, we make it relatively difficult to create a range that's invalid
-- we can only construct a range using a range's constructors, and those
are (intended to be) designed to only allow valid ranges to be
constructed.

The syntactic advantage shouldn't be taken lightly either -- for a
typical algorithm, you're cutting the number of parameters in half,
which can be quite significant.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Anand Hariharan

unread,
May 10, 2009, 4:36:32 PM5/10/09
to
On May 9, 1:03 pm, "george.r...@gmail.com" <george.r...@gmail.com>
wrote:

> I've read through this presentation:
>
> http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/2009...

>
> and I am confused about how ranges are significantly different enough
> from iterators to warrant the avocation of their elimination. Yeah,
> iterators have some complexity, but I've always been led to believe
> the complexity was there for a reason.
>
> Is it syntactic sugar or is there something actually gained here?
>

Idea of 'ranges' and how they can alleviate difficulties in using
iterators isn't new. Dietmar Kuehl spoke about them almost four years
ago. Look up Message IDs

<3kp2ugF...@individual.net>
<3kva4lF...@individual.net>

both from the thread titled "Manual Loops vs STL Algorithms in the
Real World" initiated by Scott Meyers.

- Anand

Edward Diener

unread,
May 10, 2009, 4:33:25 PM5/10/09
to
georg...@gmail.com wrote:
> I've read through this presentation:
>
> http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/2009/05/08/iterators-must-go.pdf
>
> and I am confused about how ranges are significantly different enough
> from iterators to warrant the avocation of their elimination. Yeah,
> iterators have some complexity, but I've always been led to believe
> the complexity was there for a reason.
>
> Is it syntactic sugar or is there something actually gained here?
>
>

The PDF unfortunately tells very little about Andrei's argument,
whatever it is. Until one gets a formal, written presentation arguing
the case which the title of the PDF suggests, I would suspend judgment
on what the title implies.

Boost libraries have both iterator and range support, and range is used
there to further simplify end-user access to elements in a container.
Iterators still seem wonderfully flexible and easy to use to me so
whatever Andrei's argument(s) are, whether for iterators/ranges in C++
or D, they need to be presented in a way I can understand and not solely
in a PDF document with lots of capital letters, formatting, pictures,
and little explanation. I would guess Andrei's actual talk presented a
detailed formal argument but the PDF file itself does not do so and IMO
should just be ignored.

Martin Eisenberg

unread,
May 10, 2009, 7:37:48 PM5/10/09
to
georg...@gmail.com wrote:

> http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/
> 2009/05/08/iterators-must-go.pdf

Interesting. I'm wondering about the injunction against "squirreling
away" single iterators, though. I sort of see where it comes from but
is it really necessary to double the space and copying cost (when the
range has known finite size) wherever I have a specific container
entry on a leash?


Martin

--
Quidquid latine scriptum est, altum videtur.

Mathias Gaunard

unread,
May 10, 2009, 7:36:53 PM5/10/09
to
On 9 mai, 20:03, "george.r...@gmail.com" <george.r...@gmail.com>
wrote:

> I've read through this presentation:
>
> http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/2009...

>
> and I am confused about how ranges are significantly different enough
> from iterators to warrant the avocation of their elimination. Yeah,
> iterators have some complexity, but I've always been led to believe
> the complexity was there for a reason.
>
> Is it syntactic sugar or is there something actually gained here?

He says some things are not possible as iterators though. This is
perfectly false. His system allows nothing more in terms of
expressiveness.

It allows writing more optimal ranges for some situations than the
iterator system, that is all. That could also be possible with a much
more conservative proposition: simply allow the "past-the-end"
iterator to be of a distinct type, convertible to the base type for
compatibility. This has the same effect as having a "is_at_end"
primitive.

georg...@gmail.com

unread,
May 11, 2009, 8:10:13 AM5/11/09
to
> Idea of 'ranges' and how they can alleviate difficulties in using
> iterators isn't new. Dietmar Kuehl spoke about them almost four years
> ago.

There was a lot of good information there, thanks!

That was four years ago, so presumably the people who are on the
standards committee are well aware of the problems associated with
some of the STL's design. So why no 0x changes for some of these
issues?

alan_mc...@yahoo.com

unread,
May 11, 2009, 8:12:14 AM5/11/09
to
I've long felt that STL iterators were a crock,
because I've seen something better than either
STL iterators or Andrei's "ranges."

Before the STL came out, there was a C++
library called the RogueWave ("RW") library, which
had a different notion of "iterator".
My company was using this library when I started
there (as we still are.)

A RW iterator was a single object which held
the beginning, end, and a position for a
sequential container object. The iterator had
increment, decrement, set to begining,
(set to end?), dereference, and ways to test
whether the iterator was at the end of its
range.

As I recall, a loop with a RW iterator was
in principle:

Container<Type> cont;
Container<Type>::iterator iter( cont );

while ( iter++ )
{
Type &obj = *iter;
...
}

(The template names are made up, and I may
have used the wrong operators, but loops
looked essentially like this.)

It was a Really Useful Concept, something
that C (and most C libraries) did _not_ have,
and our company modelled the iterators in its
own libraries on them.

By contrast, the STL "iterator" concept is
basically a generalization of the C pointer
concept, which is itself really just some
a memory address with some semantic sugar added.
(In fact, as far as I can tell, every STL
iterator can be implemented as a an object
containing a single memory address.)

It has all the flaws and limitations that
C pointers have, which have been known
ever since C was invented. Andrei's objections
to iterators are essentially objections to
basing high-level programming on something
as low-level as C pointers.


My only objection to Andrei's proposal (to
the extent that I understand it) is that it
is _still_ not as useful as an RW iterator
because it does _not_ include position.

One still has to run around with iterators,
pointers, or what have you, to actually
iterate. With the RW iterator, you
have everything there.


I would like to suggest that Andrei (and
others?) take a look at the RW iterator
concept -- maybe we can bring the STL all
the way up to the 1990's!


(And while you all are at it, there are
a bunch of other things that neither STL
nor Boost have yet done as well as RW,
particularly date-time and timezone.)

Andrei Alexandrescu

unread,
May 11, 2009, 8:41:05 AM5/11/09
to
Mathias Gaunard wrote:
> On 9 mai, 20:03, "george.r...@gmail.com" <george.r...@gmail.com>
> wrote:
>> I've read through this presentation:
>>
>> http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/2009...
>>
>> and I am confused about how ranges are significantly different enough
>> from iterators to warrant the avocation of their elimination. Yeah,
>> iterators have some complexity, but I've always been led to believe
>> the complexity was there for a reason.
>>
>> Is it syntactic sugar or is there something actually gained here?
>
> He says some things are not possible as iterators though.

I did not allege that.

Andrei

Dave Jenkins

unread,
May 11, 2009, 8:41:13 AM5/11/09
to

"Edward Diener" <eldi...@tropicsoft.com> wrote in message
news:TKlNl.39339$b9.2...@bignews6.bellsouth.net...

> The PDF unfortunately tells very little about Andrei's argument,
> whatever it is. Until one gets a formal, written presentation arguing
> the case which the title of the PDF suggests, I would suspend judgment
> on what the title implies.

I heard Andrei's talk at BoostCon and it was excellent. He's very
entertaining as well as thought provoking.

You're right Ed, it would be hard to judge the argument just from the PDF.
He made some good but subtle arguments for ditching iterators and going to
ranges. But you have to understand that his concept of ranges is not the
same as Boost ranges. I believe Andrei is planning to write an article on
the subject soon, so we will have more to discuss then.

Pete Becker

unread,
May 11, 2009, 12:37:02 PM5/11/09
to
alan_mc...@yahoo.com wrote:
>
> As I recall, a loop with a RW iterator was
> in principle:
>
> Container<Type> cont;
> Container<Type>::iterator iter( cont );
>
> while ( iter++ )
> {
> Type &obj = *iter;
> ...
> }
>

Read about range-based for loops in C++0x.


>
> By contrast, the STL "iterator" concept is
> basically a generalization of the C pointer
> concept, which is itself really just some
> a memory address with some semantic sugar added.
> (In fact, as far as I can tell, every STL
> iterator can be implemented as a an object
> containing a single memory address.)
>

No, many STL iterators are more complex than a single memory address.
The abstraction that iterators present is like the one presented by
pointers, but its implemention is often far more sophisticaed.

And if you prefer RW-style iterators or pretty much any other iteration
technique, you can write a straightforward template or two to implement
it on top of STL-style iterators.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of
"The Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

Pete Becker

unread,
May 11, 2009, 12:37:03 PM5/11/09
to
georg...@gmail.com wrote:
>> Idea of 'ranges' and how they can alleviate difficulties in using
>> iterators isn't new. Dietmar Kuehl spoke about them almost four years
>> ago.
>
> There was a lot of good information there, thanks!
>
> That was four years ago, so presumably the people who are on the
> standards committee are well aware of the problems associated with
> some of the STL's design. So why no 0x changes for some of these
> issues?
>

Yes, those of us who are on the standards committee are well aware that
different designs have different strengths and weaknesses. Whether those
constitute problems is a matter of judgment and perspective. Also, note
that Andrei's paper starts from a fundamental misunderstanding of the
STL design. It's not "iterators = gcd(containers, algorithms)". It's
"iterators = gcd(sequences, algorithms)". Unfortunately, this is a very
common mistake.

If you have a list of things you think are problems you might check the
current working draft to see whether they've been addressed before
asserting "no 0x changes". In particular, look at the Range concept and
range-based for loops.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of
"The Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

Dave Harris

unread,
May 11, 2009, 12:34:13 PM5/11/09
to
georg...@gmail.com () wrote (abridged):

> I've read through this presentation:
>
> http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/200
> 9/05/08/iterators-must-go.pdf
>
> and I am confused about how ranges are significantly different
> enough from iterators to warrant the avocation of their
> elimination. Yeah, iterators have some complexity, but I've
> always been led to believe the complexity was there for a reason.
>
> Is it syntactic sugar or is there something actually gained here?

I thought the PDF was pretty self-explanatory. It makes a strong case
that much of the complexity of iterators arises because they are the
wrong abstraction for the job at hand.

You can make them work, but it's clumsy. For example, if you want to
present the characters of a classic C-string, terminated by '\0', then
that's easier with a single range object than with a pair of iterators.
There's nowhere for the second iterator to point to. It has to be in a
special state as an "end" iterator, and comparing another iterator
against it has to detect that state and then compare the current
character against '\0'. Having the functionality and interface split over
2 objects turns what should be a simple job into a harder one.

Syntactic sugar would mean using something like std::pair<iterator,
iterator>, and even that isn't to be sneezed at because it helps keep the
right pairs together. The PDF, however, goes far beyond that. It gives
many more examples than C-Strings.

-- Dave Harris, Nottingham, UK.

Mathias Gaunard

unread,
May 11, 2009, 12:37:41 PM5/11/09
to
On 11 mai, 14:41, Andrei Alexandrescu <SeeWebsiteForEm...@erdani.org>
wrote:

> > He says some things are not possible as iterators though.
>
> I did not allege that.

"Iterators can't implement it!", slide 52, about range adaptor Stride.

Replacing a pair of iterators by a single object doesn't bring
anything new in expressiveness.
Proof is, I can write an adapter on top of a pair of standard
iterators to fulfill your concepts, making standard iterators at least
as expressive.
I cannot do it the other way around, since there is no way to
implement comparison between iterators with your primitives, making
iterators *more* expressive.

Most of the advertised advantages of your approach are simply
available by grouping the two iterators in a pair rather than passing
two iterators around. Which is what Boost.Range does.
It is true however your concepts are more practical to write, and can
allow more optimal iteration in some cases. Both of these could still
be achieved in a more conservative manner rather than just throwing
away everything and starting anew.

There are a few flaws in the presentation, also: you're taking your
ranges by value, and you pass containers as ranges (slide 55, for
example), leading not only to spurious copies but also lifetime issues
with range adaptors (assuming chain takes by value, chain(v, s)
obviously references ranges that don't exist anymore).

Olivier

unread,
May 11, 2009, 12:39:58 PM5/11/09
to

> You're right Ed, it would be hard to judge the argument just from the PDF.

Is there a video of the presentation Andrei did as, yes, understanding
his full argumentation based solely on the PDF is not easy.

Olivier.

Peter Dimov

unread,
May 11, 2009, 12:59:49 PM5/11/09
to
The slides were interesting, but left me with a question. std::find
returns an iterator, let's call it 'middle', from which one can
construct both [begin, middle) and [middle, end). But the range-based
find returns the second range, and I see no way (from what has been
shown in the slides) to derive [begin, middle) from it.

This is a concrete manifestation of a general principle, of iterators
acting as cursors into a sequence, representing a position, not a set
of elements. How do ranges deal with the need to represent positions?

Hakusa

unread,
May 11, 2009, 6:01:51 PM5/11/09
to
On May 11, 12:34 pm, brang...@cix.co.uk (Dave Harris) wrote:

> You can make them work, but it's clumsy. For example, if you want to
> present the characters of a classic C-string, terminated by '\0', then
> that's easier with a single range object than with a pair of iterators.
> There's nowhere for the second iterator to point to. It has to be in a
> special state as an "end" iterator, and comparing another iterator
> against it has to detect that state and then compare the current
> character against '\0'. Having the functionality and interface split over
> 2 objects turns what should be a simple job into a harder one.

Reading that paragraph made me fantasize a little about the idea of a
sentinel iterator...

for_each( cString, sentinel_iterator<char>('\0'), f );
for_each( cString, make_sentinel('\0'), f ); // Better

Probably wouldn't be hard to implement at all, but the problem being
the first two arguments are supposed to be the same type.

But is that harder? It's more typing, but it seems fairly straight
forward.

Andrei Alexandrescu

unread,
May 11, 2009, 6:01:13 PM5/11/09
to
Mathias Gaunard wrote:
> On 11 mai, 14:41, Andrei Alexandrescu <SeeWebsiteForEm...@erdani.org>
> wrote:
>
>>> He says some things are not possible as iterators though.
>> I did not allege that.
>
> "Iterators can't implement it!", slide 52, about range adaptor Stride.

Iterators have trouble implementing strides because they don't
intrinsically know their limits. That's why an iterator could implement
a stride by essentially defining a mechanism to figure out whether it's
safe to take a step. Such a mechanism makes the iterator essentially a
range underneath.

Andrei

Andrei Alexandrescu

unread,
May 12, 2009, 5:14:07 AM5/12/09
to
Pete Becker wrote:
> georg...@gmail.com wrote:
>>> Idea of 'ranges' and how they can alleviate difficulties in using
>>> iterators isn't new. Dietmar Kuehl spoke about them almost four years
>>> ago.
>>
>> There was a lot of good information there, thanks!
>>
>> That was four years ago, so presumably the people who are on the
>> standards committee are well aware of the problems associated with
>> some of the STL's design. So why no 0x changes for some of these
>> issues?
>>
>
> Yes, those of us who are on the standards committee are well aware that
> different designs have different strengths and weaknesses. Whether those
> constitute problems is a matter of judgment and perspective. Also, note
> that Andrei's paper starts from a fundamental misunderstanding of the
> STL design. It's not "iterators = gcd(containers, algorithms)". It's
> "iterators = gcd(sequences, algorithms)". Unfortunately, this is a very
> common mistake.

Good point. I see how the metaphor I used could have been interpreted as
a misunderstanding, but I fail to see how ranges proper stem from such a
misunderstanding. If you care to share, I'm all ears.

Andrei

--

Frank Birbacher

unread,
May 12, 2009, 5:15:41 AM5/12/09
to
Hi!

Peter Dimov schrieb:


> But the range-based
> find returns the second range, and I see no way (from what has been
> shown in the slides) to derive [begin, middle) from it.

Or how about std::partition, I usually need both resulting ranges.

> How do ranges deal with the need to represent positions?

This is an important point for me. Although there are Boost.Ranges
around for quite a while now, I don't use them at all. I don't see the
advantage. Most of the time when I need iterators and algorithms I need
to handle positions. So I must use iterators anyways. That's why I don't
start with ranges at all. Currently I don't think they mix well.

Frank

Maciej Sobczak

unread,
May 12, 2009, 9:26:20 AM5/12/09
to
On 9 Maj, 20:03, "george.r...@gmail.com" <george.r...@gmail.com>
wrote:

> http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/2009...

> Is it syntactic sugar or is there something actually gained here?

What is certainly *not* gained here is the alleged higher security of
ranges.
Granted, ranges allow to avoid the problem of using iterators from
distinct sequences when a single sequence is intended, but frankly -
how many times have you seen a bug because of this possibility? I
don't claim it never ever happens, but I'm asking about real
statistics.

Statistically, a much bigger problem is the UB related to dangling
aliases, when the iterator/pointer/reference/range/<put-your-favorite-
aliasing-mechanism-here> is invalidated due to element removal.
The undefined behavior is a real issue which is not addressed by
ranges and therefore in the area of code security the added value of
ranges is much smaller than suggested in the presentation.

--
Maciej Sobczak * www.msobczak.com * www.inspirel.com

Database Access Library for Ada: www.inspirel.com/soci-ada

Mathias Gaunard

unread,
May 12, 2009, 11:18:25 AM5/12/09
to
On 12 mai, 11:15, Frank Birbacher <bloodymir.c...@gmx.net> wrote:

> This is an important point for me. Although there are Boost.Ranges
> around for quite a while now, I don't use them at all. I don't see the
> advantage. Most of the time when I need iterators and algorithms I need
> to handle positions. So I must use iterators anyways. That's why I don't
> start with ranges at all. Currently I don't think they mix well.

I personally only use ranges as a commodity for the user, the
algorithm usually extracts both iterators to do its work.

Also, for output I still use output iterators, not ranges.

Pete Becker

unread,
May 12, 2009, 11:18:39 AM5/12/09
to
Andrei Alexandrescu wrote:
> Pete Becker wrote:
>> georg...@gmail.com wrote:
>>>> Idea of 'ranges' and how they can alleviate difficulties in using
>>>> iterators isn't new. Dietmar Kuehl spoke about them almost four years
>>>> ago.
>>>
>>> There was a lot of good information there, thanks!
>>>
>>> That was four years ago, so presumably the people who are on the
>>> standards committee are well aware of the problems associated with
>>> some of the STL's design. So why no 0x changes for some of these
>>> issues?
>>>
>>
>> Yes, those of us who are on the standards committee are well aware
>> that different designs have different strengths and weaknesses.
>> Whether those constitute problems is a matter of judgment and
>> perspective. Also, note that Andrei's paper starts from a fundamental
>> misunderstanding of the STL design. It's not "iterators =
>> gcd(containers, algorithms)". It's "iterators = gcd(sequences,
>> algorithms)". Unfortunately, this is a very common mistake.
>
> Good point. I see how the metaphor I used could have been interpreted as
> a misunderstanding, but I fail to see how ranges proper stem from such a
> misunderstanding. If you care to share, I'm all ears.
>

Sorry, that was a bit hyperbolic. There's nothing wrong with ranges.
They're more limiting than iterators, and that extra limiting can be a
good thing, but it doesn't mean we should abandon iterators. I do tend
to bristle at examples that are chosen to illustrate the weaknesses of
one approach and the strengths of others, without a similar focus on the
weaknesses of the approach being advocated.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of
"The Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

Frank Birbacher

unread,
May 12, 2009, 11:41:17 AM5/12/09
to
Hi!

Maciej Sobczak schrieb:


> Granted, ranges allow to avoid the problem of using iterators from
> distinct sequences when a single sequence is intended, but frankly -
> how many times have you seen a bug because of this possibility? I
> don't claim it never ever happens, but I'm asking about real
> statistics.

I feel very confident in using iterators. I use them quite often. And I
had such a bug twice within a couple of string routines. I mixed the
iterators of two different std::strings in two opportunities in
different projects I had. Luckily enough one of the used compilers is
MSVC which employs runtime checks in debug mode for this kind of errors.

> Statistically, a much bigger problem is the UB related to dangling
> aliases, when the iterator/pointer/reference/range/<put-your-favorite-
> aliasing-mechanism-here> is invalidated due to element removal.

I don't encounter this problem often.

Frank

Andrei Alexandrescu

unread,
May 12, 2009, 3:34:15 PM5/12/09
to
Maciej Sobczak wrote:
> On 9 Maj, 20:03, "george.r...@gmail.com" <george.r...@gmail.com>
> wrote:
>
>> http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/2009...
>
>> Is it syntactic sugar or is there something actually gained here?
>
> What is certainly *not* gained here is the alleged higher security of
> ranges.

I understand. I did not allege that. The problem with slides is that
they hardly convey a significant percentage of what's going on (it was a
90-minutes talk that prolonged per popular demand into a 110-minutes talk).

The slide "Checkable?" shows the asserts checking for bumping
past-the-end and also the fact that the pointers are private. The
voiceover said with a nice Romanian accent that dangling ranges are not
any more helpful than dangling iterators.

Something I also pointed out is that in a checked implementation, the
marginal cost of checking for dangling ranges is lower than the cost for
iterators, so ranges are at a net advantage.


Andrei

Stephen Howe

unread,
May 12, 2009, 4:10:09 PM5/12/09
to
>I've read through this presentation:
>
>http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/2009/05/08
/iterators-must-go.pdf

Yes I saw this at ACCU. Slide 15/52 is incorrect.

(i) There is no #include <errno.h>
(ii) And errno must be set to 0 before each library call and if the
library function has failed, only then can errno be examined and only
if the implementation indicates that it sets errno should it be
examined.
(iii) It is perfectly possible that errno might be set even if all
library functions have succeeded.

Cheers

Stephen Howe

Andrei Alexandrescu

unread,
May 12, 2009, 4:11:26 PM5/12/09
to

I understand. Now, the audience of the talk was definitely an expert in
the goods and bads of iterators, so it was safe to stick with the deltas
without risking being accused of intellectual dishonesty. There's also
been criticism of iterators before (our own James Kanze has been doing
that forever in this group), but a cure was never presented convincingly.

To me the main surprise was that ranges, in spite of their narrower
interface and dirt-simple implementation, can be used exclusively to
implement a superset of <algorithm> with huge gains in generality and at
a fraction of the cost. (Yes, I've implemented <algorithm> and more
twice: once with iterators, then with ranges.)


Andrei

--

Dragan Milenkovic

unread,
May 12, 2009, 4:10:38 PM5/12/09
to
Andrei Alexandrescu wrote:

> Mathias Gaunard wrote:
>>
>> "Iterators can't implement it!", slide 52, about range adaptor Stride.
>
> Iterators have trouble implementing strides because they don't
> intrinsically know their limits. That's why an iterator could implement
> a stride by essentially defining a mechanism to figure out whether it's
> safe to take a step. Such a mechanism makes the iterator essentially a
> range underneath.

So maybe some iterators are ranges underneath, but in order to kill'em
I guess all or most of iterators should have to be ranges underneath.
OTOH, all ranges are iterators underneath. Two levels of abstraction
seems the right thing to do.

I have to ask why is it wrong to make an iterator know its limits?
It may be ugly, but IMHO it doesn't violate the concept of iterator
or make it wrong. Right?

Oh, I cannot refrain from commenting that after using them in practice
for a while, complex numbers become more intuitive... :-D

--
Dragan

Dave Harris

unread,
May 12, 2009, 5:58:49 PM5/12/09
to
see.my....@gmail.com (Maciej Sobczak) wrote (abridged):

> Granted, ranges allow to avoid the problem of using iterators from
> distinct sequences when a single sequence is intended, but frankly -
> how many times have you seen a bug because of this possibility? I
> don't claim it never ever happens, but I'm asking about real
> statistics.

It's happened to me, albeit rarely. However, I'm aware of being
constantly vigilant about the possibility. Having to worry about it is a
cost. I don't have enough spare brain cells to be able to waste them
thinking about stuff like that.


> Statistically, a much bigger problem is the UB related to dangling
> aliases, when the iterator/pointer/reference/range/<put-your-

> favorite-aliasing-mechanism-here> is invalidated due to element


> removal. The undefined behavior is a real issue which is
> not addressed by ranges and therefore in the area of code
> security the added value of ranges is much smaller than suggested
> in the presentation.

Although ranges do offer a kind of economy of scale that I find
attractive. You can store a pointer to a vector plus an index into it,
for the same memory cost as two pointers, and so avoid some of the
invalidation issues. A list of ranges to update will be roughly half as
long as the corresponding list of iterators. It generally feels easier
and cheaper to do clever stuff with ranges. It is a more cohesive
abstraction.

-- Dave Harris, Nottingham, UK.

--

Andrei Alexandrescu

unread,
May 12, 2009, 5:59:25 PM5/12/09
to
Dragan Milenkovic wrote:
> Andrei Alexandrescu wrote:
>> Mathias Gaunard wrote:
>>>
>>> "Iterators can't implement it!", slide 52, about range adaptor Stride.
>>
>> Iterators have trouble implementing strides because they don't
>> intrinsically know their limits. That's why an iterator could
>> implement a stride by essentially defining a mechanism to figure out
>> whether it's safe to take a step. Such a mechanism makes the iterator
>> essentially a range underneath.
>
> So maybe some iterators are ranges underneath, but in order to kill'em
> I guess all or most of iterators should have to be ranges underneath.
> OTOH, all ranges are iterators underneath. Two levels of abstraction
> seems the right thing to do.

I'd rather have one than two, but that's just me.

> I have to ask why is it wrong to make an iterator know its limits?
> It may be ugly, but IMHO it doesn't violate the concept of iterator
> or make it wrong. Right?

If it knows its limits it better supports an interface that reflects
that knowledge. This means... make them ranges.


Andrei

--

Mathias Gaunard

unread,
May 12, 2009, 6:00:35 PM5/12/09
to
On 12 mai, 22:10, Dragan Milenkovic <dra...@plusplus.rs> wrote:

> I have to ask why is it wrong to make an iterator know its limits?
> It may be ugly, but IMHO it doesn't violate the concept of iterator
> or make it wrong. Right?

There are quite a few cases that require it.
There is nothing wrong with it, except the "end" iterator becomes
redundant and thus wastes memory.

If you allow the end iterator to be of a different type that you
eventually make empty, and define ranges as compressed pairs rather
than pairs, then you solve that problem.

The iterator abstraction will always be needed anyway. It's a position
within a sequence, something that a range isn't and shouldn't be
anyway.

--

Anand Hariharan

unread,
May 13, 2009, 4:58:02 AM5/13/09
to
On May 12, 8:26 am, Maciej Sobczak <see.my.homep...@gmail.com> wrote:
> On 9 Maj, 20:03, "george.r...@gmail.com" <george.r...@gmail.com>
> wrote:
>
> >http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/2009...
> > Is it syntactic sugar or is there something actually gained here?
>
> What is certainly *not* gained here is the alleged higher security of
> ranges.
> Granted, ranges allow to avoid the problem of using iterators from
> distinct sequences when a single sequence is intended, but frankly -
> how many times have you seen a bug because of this possibility? I
> don't claim it never ever happens, but I'm asking about real
> statistics.
>

It is (of course) quite difficult to come up with statistics, but it
is not unusual that people do stuff like so:

stringstream ss;
// ...
char *foo = malloc( ss.str().size() + 1 );
copy( ss.str().begin(), ss.str().end(), foo );

I remember seeing at least one such thread and smiling to myself at
the rather unfortunate error made by the OP. He had done quite a bit
of research trying to figure out what was causing the buffer overflow,
but the bug in the above line had eluded him.

- Anand

Stephen Howe

unread,
May 13, 2009, 4:56:39 AM5/13/09
to
>To me the main surprise was that ranges, in spite of their narrower
>interface and dirt-simple implementation, can be used exclusively to
>implement a superset of <algorithm> with huge gains in generality and at
>a fraction of the cost. (Yes, I've implemented <algorithm> and more
>twice: once with iterators, then with ranges.)

Another possibility that occured to on reading this, is that at its
simplest, a range could be a pair of iterators (then begin and end).
But it need not be. A range could be "thinner", because some of the
"guts" of the pair of iterators could be duplicated, therefore you can
economise on the range "guts".

sizeof(range) <= 2 * sizeof(iterator)

Stephen Howe

alan_mc...@yahoo.com

unread,
May 13, 2009, 5:03:18 AM5/13/09
to
On May 12, 6:00 pm, Mathias Gaunard <loufo...@gmail.com> wrote:

> The iterator abstraction will always be needed anyway. It's a position
> within a sequence, something that a range isn't and shouldn't be
> anyway.

I'd say it's somewhat less than a position.

A single iterator, by itself, is not of much use.
You can dereference it -- if you know it isn't end() or rend().
You can increment it -- if you know it isn't at the end,
or decrement it -- if you know it isn't at the beginning.

Basically, in the absence of additional information,
an iterator value by itself (and I'm restricting myself to valid
iterator values) really can't be used for much of
anything.

This is what I don't like about STL iterators -- IMHO,
they aren't really an abstraction of anything
(except C pointers.) An STL iterator is _part_ of an abstraction.
E.g., one of the two endpoints of a sequence. Or one of
the three "points" that define the state of iteration within
a sequence.

This is not to say you can't write good code with them,
or that they aren't useful. But they don't provide
a more abstract way of describing a problem.
To me,

copy( a_range_start, a_range_end, b_copystart )

is no more abstract than

memcpy( b_copystart, a_range_start, a_range_end-a_range_start );

or

for ( p = a_range_start, q = b_copystart; p < a_range_end; ) *q++
= *p++;

(and the third has the advantage of being explicit.)

In one respect, iterators are a step backward from pointers:
with pointers, you can have a "null" value which indicates
"not pointing to anything." The closest thing iterators have
is an "end()" value (cf. the return value from find() ),
but you can't tell if it's the "end()"
value unless you have the container around (or have kept
that container's "end()" value somewhere.

t...@42.sci.utah.edu

unread,
May 13, 2009, 5:09:47 AM5/13/09
to
Frank Birbacher <bloodym...@gmx.net> wrote:
>
> I mixed the iterators of two different std::strings in two
> opportunities in different projects I had. Luckily enough one of the
> used compilers is MSVC which employs runtime checks in debug mode for
> this kind of errors.

You might also consider libstdc++'s debug mode, next time:

http://gcc.gnu.org/onlinedocs/libstdc++/manual/debug_mode.html

To summarize all the docs in a short blurb: compile with
`-D_GLIBCXX_DEBUG -D_GLIBCXX_CONCEPT_CHECKS'.

Cheers,

-tom

cjhopman

unread,
May 13, 2009, 12:12:36 PM5/13/09
to
On May 9, 1:03 pm, "george.r...@gmail.com" <george.r...@gmail.com>
wrote:

> I've read through this presentation:
>
> http://www.boostcon.com/si...

>
> and I am confused about how ranges are significantly different enough
> from iterators to warrant the avocation of their elimination.
>
> Is it syntactic sugar or is there something actually gained here?

Hm-

Interesting presentation.

To me, it seems as though the benefit of iterators is that they often
act like pointers. That's nice when you want to use them like a
pointer, but doesn't really help for other uses. For example, almost
all std algorithms work on ranges. Using iterators with this is fine,
but I definitely see ways that ranges are better for this. Possibly
the biggest thing that I see is that we currently use iterator
comparison to determine if we are at the end of a range... That seems
wrong, and means that we need an explicit iterator to mark the end of
a range. With the current standard library, a range HAS to be a pair
of iterators, but why? A pair of iterators is not necessarily the
natural representation of a range and is not actually required.

For example, say you want a container interface for a c-string. With
iterators how do you find the .end() iterator? Do you scan through the
string to find the end? Or do you write a complicated iterator that
has special comparisons to work as you want? With ranges, this would
be simple. The idea is that we can have a specialized comparison for
telling if we are at the end. When you are scanning through a range,
you just need increment and a check to see if you are at the end (in
fact these could be the same method), you don't need a generalized
iterator comparison.

Ranges could also allow for more efficient algorithms. See
http://lafstern.org/matt/segmented.pdf. As presented in that paper, it
is possible to address the problem with iterators, but with ranges it
just becomes natural.

Just my thoughts,

-Chris Hopman

Dragan Milenkovic

unread,
May 13, 2009, 12:13:06 PM5/13/09
to
alan_mc...@yahoo.com wrote:
> On May 12, 6:00 pm, Mathias Gaunard <loufo...@gmail.com> wrote:
>
>> The iterator abstraction will always be needed anyway. It's a position
>> within a sequence, something that a range isn't and shouldn't be
>> anyway.
>
> I'd say it's somewhat less than a position.
>
> A single iterator, by itself, is not of much use.
> You can dereference it -- if you know it isn't end() or rend().
> You can increment it -- if you know it isn't at the end,
> or decrement it -- if you know it isn't at the beginning.
>
> Basically, in the absence of additional information,
> an iterator value by itself (and I'm restricting myself to valid
> iterator values) really can't be used for much of
> anything.

I used an index like this (if this was a bad idea, please tell me):

std::list<Stuff>
std::map<Key, std::list<Stuff>::iterator>

There really is no need to have "end" information with each iterator.
It would also be silly if we got different ends with
each iterator in the map.

I also used ranges for stuff where I needed them. So maybe
names were a bad choice (iterator -> pointer/position,
range -> iterator), maybe not... But I really want both abstractions.
Or another approach that can cover both iterators and ranges.


> This is what I don't like about STL iterators -- IMHO,
> they aren't really an abstraction of anything
> (except C pointers.) An STL iterator is _part_ of an abstraction.
> E.g., one of the two endpoints of a sequence. Or one of
> the three "points" that define the state of iteration within
> a sequence.
>
> This is not to say you can't write good code with them,
> or that they aren't useful. But they don't provide
> a more abstract way of describing a problem.
>
> To me,
>
> copy( a_range_start, a_range_end, b_copystart )
>
> is no more abstract than
>
> memcpy( b_copystart, a_range_start, a_range_end-a_range_start );
>
> or
>
> for ( p = a_range_start, q = b_copystart; p < a_range_end; ) *q++
> = *p++;
>
> (and the third has the advantage of being explicit.)
>
> In one respect, iterators are a step backward from pointers:
> with pointers, you can have a "null" value which indicates
> "not pointing to anything." The closest thing iterators have
> is an "end()" value (cf. the return value from find() ),
> but you can't tell if it's the "end()"
> value unless you have the container around (or have kept
> that container's "end()" value somewhere.

I disagree, iterators are not a step backward.
"null" pointer cannot help you to mark the end of an array,
you have to use either its length or past-the-end pointer.

Would it be acceptable if there was a limit information
with each C-pointer? There simply _is_ more than one level
of abstraction. I didn't give it much thought, but maybe
it would be sufficient to have only ranges and pointers
(iterators with only a dereferencing operation), but
I'm pretty sure there cannot be only one abstraction.

--
Dragan

Stephen Howe

unread,
May 13, 2009, 4:15:17 PM5/13/09
to
>I understand. Now, the audience of the talk was definitely an expert in
>the goods and bads of iterators, so it was safe to stick with the deltas
>without risking being accused of intellectual dishonesty. There's also
>been criticism of iterators before (our own James Kanze has been doing
>that forever in this group), but a cure was never presented convincingly.

What was James's criticism of iterators Andrei?
I have just googled and found some things but nothing substantial.

Cheers

Stephen Howe

Mathias Gaunard

unread,
May 14, 2009, 4:38:51 AM5/14/09
to
On 13 mai, 11:03, alan_mckenn...@yahoo.com wrote:

> I'd say it's somewhat less than a position.
>
> A single iterator, by itself, is not of much use.

It allows you to refer to a single element in a sequence.
Just like an index allows you to refer to a single element in an
array, an iterator allows you to refer to a single element in any
sequence (it's not exactly the same, but you get the idea).

It is useful as itself to indicate the position of an element, like as
the result of std::find, to insert another element before/after
another, as a hint for insertion in balanced trees, etc.

Mathias Gaunard

unread,
May 14, 2009, 5:10:00 AM5/14/09
to

The latter.
Calling it complicated is quite overdoing it, though.


> With ranges, this would
> be simple. The idea is that we can have a specialized comparison for
> telling if we are at the end.

Nothing forces you to generate iterators directly. You can use some
kind of CRTP-based facade or policy-based design to automatically
generate iterators with the primitives of your choice.

This is actually recommended, since writing iterators can be quite
verbose otherwise.


> When you are scanning through a range,
> you just need increment and a check to see if you are at the end (in
> fact these could be the same method), you don't need a generalized
> iterator comparison.

The thing is, you lose nothing by representing ranges as pairs of
iterators. Yet you would lose something by getting rid of iterators,
since those are still useful as positions.

The generalized iterator comparison is a mechanism to compare
compositions, and is worth it.

Thomas J. Gritzan

unread,
May 14, 2009, 5:10:00 AM5/14/09
to
Stephen Howe schrieb:

>> I understand. Now, the audience of the talk was definitely an expert in
>> the goods and bads of iterators, so it was safe to stick with the deltas
>> without risking being accused of intellectual dishonesty. There's also
>> been criticism of iterators before (our own James Kanze has been doing
>> that forever in this group), but a cure was never presented convincingly.
>
> What was James's criticism of iterators Andrei?
> I have just googled and found some things but nothing substantial.

James often argues that the iterator pattern in the GoF was better than
those of Java or the STL.

For example, "Iterators in Java and C++" thread in c.l.c++:
http://groups.google.com/group/comp.lang.c++/browse_thread/thread/a6647e44ef6d9b96/3bf8ff26c8185e6e#3bf8ff26c8185e6e

One problem with STL iterators is that it's not easy to chain them. You
always need to provide a pair of them, and a function that filters out
or alters some elements has to return this pair somehow.

With a one-object iterator (or Andrei's range) it's easy to do
do_something( filter1( filter2( it ) ) );

It might be a bad idea to base the new range-based for loop on a pair of
iterators for this reason. It's trivial to write an adaptor that takes a
pair of iterators and provides hasnext/next/get functions, but once this
for loop is in the language, we won't be able to change it to something
that works on a kind of enumerator.

--
Thomas

Mathias Gaunard

unread,
May 14, 2009, 2:49:07 PM5/14/09
to
On 14 mai, 11:10, "Thomas J. Gritzan" <phygon_antis...@gmx.de> wrote:

> One problem with STL iterators is that it's not easy to chain them. You
> always need to provide a pair of them, and a function that filters out
> or alters some elements has to return this pair somehow.
>
> With a one-object iterator (or Andrei's range) it's easy to do
> do_something( filter1( filter2( it ) ) );

It works just fine with pairs of iterators.
The recently accepted extension to Boost.Range does just that.

do_something(filtered(filtered(range, f1), f2));
or
do_something(range | filtered(f1) | filtered(f2));


--

alan_mc...@yahoo.com

unread,
May 14, 2009, 3:36:52 PM5/14/09
to
On May 14, 4:38 am, Mathias Gaunard <loufo...@gmail.com> wrote:
> On 13 mai, 11:03, alan_mckenn...@yahoo.com wrote:
>
> > I'd say it's somewhat less than a position.
>
> > A single iterator, by itself, is not of much use.
>
> It allows you to refer to a single element in a sequence.

Unless it happens to have the value seq.end()
for the sequence that your iterator value
originally came from. In that case, it
doesn't refer to anything -- it is either
the endpoint of a range or a "magic value"
used to indicate "no position at all."

The only way to know is to be supplied
with additional information -- the
end() value for the particular sequence
your iterator came from, or else a
guarrantee from whoever supplies
your iterator value that they will _never_
return and end() value.
end().

> It is useful as itself to indicate the
> position of an element, like

> the result of std::find,

This is the textbook example of an iterator
value that may not refer to anything at all.
You cannot use the result of std::find()
without an additional iterator value -- specifically,
the end() value of the sequence it was used on.

My reason for saying that iterators are
an incomplete concept is that when I
actually try to use them, I find
myself constantly having to go back
to the original sequence for additional
information.

In particular, I've written a number of
application-specific containers that
are implemented in terms of STL containers,
and I've found that it never makes sense
to pass an iterator to the outside world --
returning iterators always ends up meaning
that too much of the implementation has to
be exposed to the user. In most cases,
I end up returning a reference or pointer to
a contained value (or a copy of it.) If
that's not enough, then an iterator isn't
enough, either -- I have to define
a class to return the contained information.

Dave Jenkins

unread,
May 14, 2009, 3:47:17 PM5/14/09
to

"cjhopman" <cjho...@gmail.com> wrote in message
news:f4842b64-3363-4b44...@b1g2000vbc.googlegroups.com...

>
> Ranges could also allow for more efficient algorithms. See
> http://lafstern.org/matt/segmented.pdf. As presented in that paper, it
> is possible to address the problem with iterators, but with ranges it
> just becomes natural.

I don't see how the segmented algorithms in Matt Austern's paper
http://lafstern.org/matt/segmented.pdf would benefit from Andrei's ranges.
Andrei's ranges are 1-dimensional.

For segmented algorithms, I think the natural abstraction is
multi-dimensional ranges. If he can extend his ranges to the
multi-dimensional case, then segmented/hierarchical algorithms would be
natural and efficient.

Dave Jenkins

Dragan Milenkovic

unread,
May 15, 2009, 8:14:51 AM5/15/09
to
alan_mc...@yahoo.com wrote:
> On May 14, 4:38 am, Mathias Gaunard <loufo...@gmail.com> wrote:
>> On 13 mai, 11:03, alan_mckenn...@yahoo.com wrote:
>>
>>> I'd say it's somewhat less than a position.
>>> A single iterator, by itself, is not of much use.
>> It allows you to refer to a single element in a sequence.
>
> Unless it happens to have the value seq.end()
> for the sequence that your iterator value
> originally came from. In that case, it
> doesn't refer to anything -- it is either
> the endpoint of a range or a "magic value"
> used to indicate "no position at all."

In my experience, it does not happen too often in the real world.
In the use case where you need an iterator for position,
you actually _know_ it's a good position, otherwise there
is no position to start with. If I set up a collection of pointers
to refer to some objects somewhere, I know those pointers never
have a null value. If you want a higher-level abstraction, use a range.
But sometimes I need an iterator just as the next person needs
a plain old pointer.

IMHO, much bigger issues are pointer-to-garbage / invalid iterator.
But I have found (and someone else in this thread did the same),
the invalid iterator happens very rarely once iterators get under
your skin.

Also, this code:

for (std::set<Foo>::const_iterator i=x.foos().begin();
i!=x.foos().end(); ++i)

... happens only once in your life (and just because nobody pointed you
to this common error). After that, each and every time you will write:

const std::set<Foo> & foos = x.foos();
for (std::set<Foo>::const_iterator i=foos.begin(); i!=foos.end(); ++i)

... of course - with typedefs or maybe user-defined container classes...

(The assumption was that x.foos() might or must return a copy
of the collection.)

<snip/>


>> It is useful as itself to indicate the
>> position of an element, like
>> the result of std::find,
>
> This is the textbook example of an iterator
> value that may not refer to anything at all.
> You cannot use the result of std::find()
> without an additional iterator value -- specifically,
> the end() value of the sequence it was used on.
>
> My reason for saying that iterators are
> an incomplete concept is that when I
> actually try to use them, I find
> myself constantly having to go back
> to the original sequence for additional
> information.

But why is this a bad thing? You _do_ need
this additional information, and whether you
get it from the collection or from the iterator
itself, it's just a matter of semantics.

And I believe that iterator has a thinner interface
and implementation thanks to this semantics.
I might be wrong...

> In particular, I've written a number of
> application-specific containers that
> are implemented in terms of STL containers,
> and I've found that it never makes sense
> to pass an iterator to the outside world --
> returning iterators always ends up meaning
> that too much of the implementation has to
> be exposed to the user. In most cases,
> I end up returning a reference or pointer to
> a contained value (or a copy of it.) If
> that's not enough, then an iterator isn't
> enough, either -- I have to define
> a class to return the contained information.

You used the term "never". Could you give an example
(in words, not necessarily in code). I mean, iterator
is just a way to iterate through a sequence of values,
nothing else. It does it's purpose. For use cases
where ranges are needed - use ranges. But find()
and end() have no problems... you can always
write in your class a version of find() that returns
a valid iterator or throws an exception.
(Maybe someone could correct me if this is a bad thing).

Anyway, maybe iterator concepts should be revised,
but I don't think they suffers from problems related
to find(), end(), having not enough information...
Maybe we could add a few refined concepts like
a StubbornIterator that stays valid if you add/remove
elements other that itself from the collection,
but I have no idea if it would help in anyway...

Just my thoughts based on C++03 experience,
but with very little research done... If I find time,
maybe I'll correct this. Could someone give references
to current and previous discussions and attempts
to fix and/or change iterators.

--
Dragan

cjhopman

unread,
May 15, 2009, 8:14:02 AM5/15/09
to
On May 14, 4:10 am, Mathias Gaunard <loufo...@gmail.com> wrote:
> On 13 mai, 18:12, cjhopman <cjhop...@gmail.com> wrote:
>
> > For example, say you want a container interface for a c-string. With
> > iterators how do you find the .end() iterator? Do you scan through the
> > string to find the end? Or do you write a complicated iterator that
> > has special comparisons to work as you want?
>
> The latter.
> Calling it complicated is quite overdoing it, though.
>
> > With ranges, this would
> > be simple. The idea is that we can have a specialized comparison for
> > telling if we are at the end.
>
> Nothing forces you to generate iterators directly. You can use some
> kind of CRTP-based facade or policy-based design to automatically
> generate iterators with the primitives of your choice.
>
> This is actually recommended, since writing iterators can be quite
> verbose otherwise.

I think you may have misunderstood me. I was not saying that writing
an iterator, in general, is complicated. I was saying that this
particular iterator would be unnecessarily complicated. Just think
about its basic design... Because we can not actually get a pointer to
the end of the range, every iterator would need a flag to determine if
it were an "end" iterator (for example an "end" iterator's base
pointer could hold a special value such as 0). Then, either iteration
becomes more than just incrementing a pointer, or comparison will have
two cases. This is too complicated.

The range implementation would be much simpler. The increment and
return true if at end is simply "return *(++i) == 0".

> > When you are scanning through a range,
> > you just need increment and a check to see if you are at the end (in
> > fact these could be the same method), you don't need a generalized
> > iterator comparison.
>
> The thing is, you lose nothing by representing ranges as pairs of
> iterators. Yet you would lose something by getting rid of iterators,
> since those are still useful as positions.

While it is somewhat unintuitive, a range can denote a position just
as well as an iterator. However, you do lose something by representing
a range as a pair of iterators. Consider the example above, the range
implementation is not only simpler, it is more efficient. Another case
would be iterating through a deque, d. In this case, after each
increment the iterator implementation will compare the iterator
against d.end(). However, much of the time, there is not really any
need for that comparison, it is already determined in the increment,
and the range implementation can take advantage of that fact.

Now, I have cheated a little here. For example, the null terminated
range would be incompatible with ranges with a different end point in
the c-string. To have something completely compatible in that case
would leave you with basically a pair of complicated iterators. Even
if you require that compatibility, though, the deque example is still
more efficient in the range implementation.

Things could be even more efficient with a further change to the
algorithms interface. That is, give ranges more power over how
algorithms are done. One example of this would be to have algorithms
make liberal use of for_each and allow it to be specialized for each
range. Consider the case of the deque. Currently, for_each applied to
a sequence in a deque requires at least 3 comparisons and one addition
for each element. With the range interface that has an increment and
return true if at end method it requires 2 comparisons and one
addition for each element. And with an interface where for_each is
specialized by the range, it would require one comparison and one
addition per element.

Just doing for_each would cover many of the algorithms, but you would
likely want a for_each_stop_at or some such for algorithms like find
or find_if. For something like copy or swap_ranges, you could do a
binary_for_each (which could default to for_each on the primary range
with a functor that increments the secondary range, which could be an
unchecked increment that doesn't care about the end of the range).

-Chris Hopman

Thomas Beckmann

unread,
May 15, 2009, 8:16:47 AM5/15/09
to
After having some thoughts on the issue my opinion is that there are places
for positions (iterators) and ranges (pairs of iterators). So why not
support both concepts, by i.e. overloading?

Also, when thinking about strings and substrings I asked myself, why not
make ranges provide the container interface? - Just to make my point clear,
iterators, enumerators, ranges and containers are distinct but closely
related.

Mathias Gaunard

unread,
May 15, 2009, 8:18:37 AM5/15/09
to
On 14 mai, 21:36, alan_mckenn...@yahoo.com wrote:

> > It allows you to refer to a single element in a sequence.
>
> Unless it happens to have the value seq.end()
> for the sequence that your iterator value
> originally came from. In that case, it
> doesn't refer to anything -- it is either
> the endpoint of a range or a "magic value"
> used to indicate "no position at all."

It's just like an out of bounds index.


> > It is useful as itself to indicate the
> > position of an element, like
> > the result of std::find,
>
> This is the textbook example of an iterator
> value that may not refer to anything at all.
> You cannot use the result of std::find()
> without an additional iterator value -- specifically,
> the end() value of the sequence it was used on.

It's not fundamentally different from returning -1, as is often done
to search in arrays.

cjhopman

unread,
May 15, 2009, 8:16:08 AM5/15/09
to
On May 14, 2:47 pm, "Dave Jenkins" <da...@jenkins.net> wrote:
> "cjhopman" <cjhop...@gmail.com> wrote in message

>
> news:f4842b64-3363-4b44...@b1g2000vbc.googlegroups.com...
>
>
>
> > Ranges could also allow for more efficient algorithms. See
> >http://lafstern.org/matt/segmented.pdf. As presented in that paper, it
> > is possible to address the problem with iterators, but with ranges it
> > just becomes natural.
>
> I don't see how the segmented algorithms in Matt Austern's paper
> http://lafstern.org/matt/segmented.pdfwould benefit from Andrei's ranges.

> Andrei's ranges are 1-dimensional.
>
> For segmented algorithms, I think the natural abstraction is
> multi-dimensional ranges. If he can extend his ranges to the
> multi-dimensional case, then segmented/hierarchical algorithms would be
> natural and efficient.
>
> Dave Jenkins

There is no need for multi-dimensional ranges for segmented algorithms
to be efficient. I wrote up a little code to demonstrate this (
http://pages.cs.wisc.edu/~hopman/segmented.cpp ). For the range
interface we have the single method for increment position and
returning true if at the end. Note that the beginning and the end of
the range that we are doing stuff over is not actually the beginning
or the end of the container (or of a given segment). This requires the
range to have the extra left_node_end member, something that can't be
done with iterators.

On my laptop (compiled with g++ -O3), the iterator implementation runs
in 4.00s and the range implementation in 2.53s. That is the iterator
implementation is almost 60% slower (I was definitely a bit surprised
by that). Now there is at least one bug in my range implementation (if
the end of the range is the beginning of a segment it will fail
miserably), but that is easily fixed. The point is that a one-
dimensional range is fine, no need for multi-dimensional ranges.

(Also, I was wrong in my other post about the number of comparisons in
iterating through the deque in my other post... oh well)

-Chris Hopman

georg...@gmail.com

unread,
May 15, 2009, 8:44:08 AM5/15/09
to
> If you have a list of things you think are problems you might check the
> current working draft to see whether they've been addressed before
> asserting "no 0x changes". In particular, look at the Range concept and
> range-based for loops.

I am aware of the Range concept and the range-based for loop, however
these weren't the focus of Andrei's presentation as I understood it.
My initial inquiry to the group was about whether it was possible that
iterators were a flawed way of thinking. Arguments presented in this
thread are compelling, in particular that iterators are sequence-
oriented rather than container-oriented, and that it's difficult for a
compiler to distinguish a pair of ranged iterators mistakenly from two
different sequences.

mzdude

unread,
May 15, 2009, 8:45:40 AM5/15/09
to
Ranges are useful. I have some lightweight classes the keep a pair
of iterators together to form a range. In my domain I have two
large vectors, one comprises an X-axis and the other the Y-Axis.
Every once in a while we venture into 3 dimensional plots. But
2 dimensions is enough for now.

I often have to work on a subset of this data. Often a range
is denoted by time (which happens to be X-axis).
iterator lb = lower_bound(c.begin(),c.end(), startTime);
iterator hb = lower_bound(lb,c.end(), endTime);
myRange range(lb,hb); // useful pairing of iterators


My question is if iterators go, and we are left with something like
range r( c ); // set range to begin, end of container
r = lower_bound(r,startTime); // returns [lb,end())

How would one set the upper end of the range to bound the range
such that range r = [lb,hb)?

Assuming one can do this easily and elegantly, I now need to
translate that range into the appropriate range for the Y-Axis.
I currently use the std::distance() to get an index. I translate
that index into a Y-Axis iterator.

X-Axis Y-Axis
------ --------
1.0 13.2
1.1 13.4 <-----|
1.2 15.1 | Plot this sub range
1.3 10.2 <-----|
1.4 6.3

In his presentation Andrei mentions lock step. This means a find would
have be something like
rngX,rngY = lower_bound(zip(rngX,rngY),startTime)

Am I missing something?

Thomas Beckmann

unread,
May 15, 2009, 8:45:29 AM5/15/09
to
>> So maybe some iterators are ranges underneath, but in order to kill'em
>> I guess all or most of iterators should have to be ranges underneath.
>> OTOH, all ranges are iterators underneath. Two levels of abstraction
>> seems the right thing to do.
>
> I'd rather have one than two, but that's just me.
>
>> I have to ask why is it wrong to make an iterator know its limits?
>> It may be ugly, but IMHO it doesn't violate the concept of iterator
>> or make it wrong. Right?
>
> If it knows its limits it better supports an interface that reflects that
> knowledge. This means... make them ranges.

Firstly, knowledge of the end can be considered an implementation detail.
Hiding it is a good thing with regard to changability. Also notice, there is
nothing to prevent you from creating an iterator type providing a richer
interface.

Bytheway, I actually think the iterator interface being based on mimicing
build-in pointer syntax and semantics is a good thing making it easier to
read.


In practice, the biggest problem to me is, that I need to write typedefs to
use containers and iterators intuitively: one for the container, two for
iterators (const/nonconst) and two for ranges (pair of const/nonconst
iterators). These come in addition to variables. This is just to make the
implementation read easier, at the cost of bloating the interface. Even
worse, the iterator and range typedefs are usually part of the public
interface while the others are not.

Thomas J. Gritzan

unread,
May 15, 2009, 8:46:27 AM5/15/09
to
Mathias Gaunard schrieb:

> On 14 mai, 11:10, "Thomas J. Gritzan" <phygon_antis...@gmx.de> wrote:
>
>> One problem with STL iterators is that it's not easy to chain them. You
>> always need to provide a pair of them, and a function that filters out
>> or alters some elements has to return this pair somehow.
>>
>> With a one-object iterator (or Andrei's range) it's easy to do
>> do_something( filter1( filter2( it ) ) );
>
> It works just fine with pairs of iterators.
> The recently accepted extension to Boost.Range does just that.
>
> do_something(filtered(filtered(range, f1), f2));
> or
> do_something(range | filtered(f1) | filtered(f2));

No. Boost.Range contains one-object iterators, at least syntactically,
which makes it easy to chain them. Try that if you always have to pass
around a pair of them.

Because a Boost.Range always contains a pair of STL iterators, you
always have to provide an end iterator, even if you don't specify a
range (which is first element to one-past-last element) and the last
element should be identified by a condition.
It's possible to do that: istream_iterator is one example, in
Boost.Range are many others.

But why take the burden? Why should we always have to give two
iterators? Think about an "iterator", well, I'll call it enumerator,
which does std::getline(std::cin, line):

class line_enumerator
{
std::string line;
public:
line_enumerator() { next(); }
bool done() const { return std::cin; }
std::string const& get() const { return line; }
void next() { std::getline(std::cin, line); }
}

How to use:

for (line_enumerator en; !en.done(); en.next())
{
std::string line = en.get();
// ...
}

This enumerators work exactly like Andrei's range, but I think that
"range" is the wrong name, especially when it comes to "infinite range",
so I call them enumerator.

Try to write the above line_enumerator as STL iterator (using
Boost.Range is cheating, because they did the hard work).

--
Thomas

Dragan Milenkovic

unread,
May 15, 2009, 3:47:49 PM5/15/09
to
cjhopman wrote:
> On May 14, 4:10 am, Mathias Gaunard <loufo...@gmail.com> wrote:
>> On 13 mai, 18:12, cjhopman <cjhop...@gmail.com> wrote:
<snip/>

>>> When you are scanning through a range,
>>> you just need increment and a check to see if you are at the end (in
>>> fact these could be the same method), you don't need a generalized
>>> iterator comparison.
>> The thing is, you lose nothing by representing ranges as pairs of
>> iterators. Yet you would lose something by getting rid of iterators,
>> since those are still useful as positions.
>
> While it is somewhat unintuitive, a range can denote a position just
> as well as an iterator. However, you do lose something by representing
> a range as a pair of iterators. Consider the example above, the range
> implementation is not only simpler, it is more efficient. Another case
> would be iterating through a deque, d. In this case, after each
> increment the iterator implementation will compare the iterator
> against d.end(). However, much of the time, there is not really any
> need for that comparison, it is already determined in the increment,
> and the range implementation can take advantage of that fact.

I think there is a std::logic_error here.

If you have a container and two iterators, you lose nothing by
representing a range for the container as a pair of iterators.
I believe this is what Mathias said, and it is true.

You said that you lose if you _always_ represent a range with
a pair of iterators. This is also true. So don't do it... :-D

I'm not trying to solve anything, just speaking in favor
of iterators. Consider this example if we used ranges
for positions:

std::list<Bar> l;
std::list<Bar>::range pos1 = my_find(l, 123);
l.push_back(new_bar);
std::list<Bar>::range pos2 = my_find(l, 123);

Is 'pos1 == pos2' considering they have different limits,
or is it a requirement by design that both 'pos1' and 'pos2'
will iterate through 'new_bar'?

--
Dragan

georg...@gmail.com

unread,
May 15, 2009, 3:46:39 PM5/15/09
to
On May 11, 1:59 pm, Peter Dimov <pdi...@gmail.com> wrote:
> The slides were interesting, but left me with a question. std::find
> returns an iterator, let's call it 'middle', from which one can
> construct both [begin, middle) and [middle, end). But the range-based
> find returns the second range, and I see no way (from what has been
> shown in the slides) to derive [begin, middle) from it.
>
> This is a concrete manifestation of a general principle, of iterators
> acting as cursors into a sequence, representing a position, not a set
> of elements. How do ranges deal with the need to represent positions?

This is an interesting point that I would like to see addressed. I've
used this technique before, too.

It seems to me that treating two ranges as positions (iterators) is
more awkward than treating two positions as a range.


--

georg...@gmail.com

unread,
May 15, 2009, 3:46:07 PM5/15/09
to
On May 12, 5:11 pm, Andrei Alexandrescu

> To me the main surprise was that ranges, in spite of their narrower
> interface and dirt-simple implementation, can be used exclusively to
> implement a superset of <algorithm> with huge gains in generality and at
> a fraction of the cost. (Yes, I've implemented <algorithm> and more
> twice: once with iterators, then with ranges.)

Which cost are you referring to? Cost of development of the library?

I find this rather awkward:

template <class R, class T>
R find(R r, T value);

Isn't it wasteful to return a range here (I'm assuming that a range is
at worst twice the size of an iterator)? Doesn't it pre-suppose that I
haven't already eliminated duplicates from the container and ordered
them somehow?

Mathias Gaunard

unread,
May 15, 2009, 3:58:17 PM5/15/09
to
On 15 mai, 14:14, cjhopman <cjhop...@gmail.com> wrote:

> I think you may have misunderstood me. I was not saying that writing
> an iterator, in general, is complicated. I was saying that this
> particular iterator would be unnecessarily complicated. Just think
> about its basic design... Because we can not actually get a pointer to
> the end of the range, every iterator would need a flag to determine if
> it were an "end" iterator (for example an "end" iterator's base
> pointer could hold a special value such as 0). Then, either iteration
> becomes more than just incrementing a pointer, or comparison will have
> two cases. This is too complicated.
>
> The range implementation would be much simpler. The increment and
> return true if at end is simply "return *(++i) == 0".

What you're saying is that the iterator-based iteration would generate
code like this:

for(T* p = begin; p != 0;)
{
f(*p);

++p;
if(*p == 0)
p = 0;
}

while your range-based solution would generate code like this:

for(T* p = begin; *p != 0; p++)
{
f(*p);
}

is the second one really faster?

Thomas Beckmann

unread,
May 15, 2009, 3:59:48 PM5/15/09
to
> On my laptop (compiled with g++ -O3), the iterator implementation runs
> in 4.00s and the range implementation in 2.53s. That is the iterator
> implementation is almost 60% slower (I was definitely a bit surprised
> by that). Now there is at least one bug in my range implementation (if
> the end of the range is the beginning of a segment it will fail
> miserably), but that is easily fixed. The point is that a one-
> dimensional range is fine, no need for multi-dimensional ranges.


Repeat your test reversing the sections in main, do your range test then the
iterator test. Chances are you get different results due to memory
allocation overhead in the runtime, cache misses, etc. Notice, the second
run does not change any data which the hardware might take advantage of.
However, I have no idea of whether CPUs could actually detect that and skip
updating system memory.

A design issue is that range::next() mixes up advancing and querying
operations. Now, for supporting const-correct code you will need an explicit
has_next() function. Also, I find the iterator implementation of advance and
query easier to understand than the range implementation. Same applies for
the sections in main() using it.

Finally, in this case 2u*sizeof(iterator) < sizeof(range) holds. I worked
some time in embedded environment and could think of ways to make the
iterator have a single pointer member only, i.e. by aligned allocations.
Chances are that such an implementation does not show any difference in the
generated binary. To me the design issue of the last paragraph weights the
most though.


Regards,
Thomas.

alan_mc...@yahoo.com

unread,
May 15, 2009, 3:58:34 PM5/15/09
to
On May 15, 8:14 am, Dragan Milenkovic <dra...@plusplus.rs> wrote:
> alan_mckenn...@yahoo.com wrote:

> > My reason for saying that iterators are
> > an incomplete concept is that when I
> > actually try to use them, I find
> > myself constantly having to go back
> > to the original sequence for additional
> > information.
>
> But why is this a bad thing? You _do_ need
> this additional information, and whether you
> get it from the collection or from the iterator
> itself, it's just a matter of semantics.

I'm not saying that iterators are useless.
I'm saying that they are a low-level concept,
conceptually no different from pointers
(except that they threw away the concept of
a null value.)

You end up writing code that doesn't look
a lot different from what you wrote in C.

If you want to represent the iteration process
from, say, a "for" loop as an object, you'll
get something more like the RogueWave iterator
concept. The STL iterator is more like
an index variable.


>
> > In particular, I've written a number of
> > application-specific containers that
> > are implemented in terms of STL containers,
> > and I've found that it never makes sense
> > to pass an iterator to the outside world --
> > returning iterators always ends up meaning
> > that too much of the implementation has to
> > be exposed to the user. In most cases,
> > I end up returning a reference or pointer to
> > a contained value (or a copy of it.) If
> > that's not enough, then an iterator isn't
> > enough, either -- I have to define
> > a class to return the contained information.
>
> You used the term "never". Could you give an example
> (in words, not necessarily in code).

An example (which comes up frequently): I
want to store a table of data, in an object,
and let the application look up a "row" of
the table using a key. In this case, I define
a struct for the data in a row, and supply a
"lookUp()" function which you give
the key values to and it returns a pointer to
a (const) row struct -- or a null pointer if
there is no such row (exceptions are _way_ too
expensive, not to mention making the calling
code clumsier)

If I returned an iterator, I would need to
have another function to supply the iterator
value that indicates "not found."

If there are several rows with the same key,
and I want to return them all, an STL iterator
still wouldn't work, even though my user
will probably want to "iterate." I need
to return an object which can be iterated
but also knows when it gets to the end
and can be "rewound." This is _not_ an
STL iterator. (A _pair_ of iterators might
be enough, since start=end indicates "no rows".)


--

alan_mc...@yahoo.com

unread,
May 15, 2009, 3:57:54 PM5/15/09
to
On May 15, 8:18 am, Mathias Gaunard <loufo...@gmail.com> wrote:
> On 14 mai, 21:36, alan_mckenn...@yahoo.com wrote:
>
> > > It allows you to refer to a single element in a sequence.
>
> > Unless it happens to have the value seq.end()
> > for the sequence that your iterator value
> > originally came from. In that case, it
> > doesn't refer to anything -- it is either
> > the endpoint of a range or a "magic value"
> > used to indicate "no position at all."
>
> It's just like an out of bounds index.

No, a (non-negative) out-of-bounds index is
usually more like an invalid iterator. If
you want an index value that signals "no
position at all", you'll use something like
-1, which is instantly recognizable as not
a valid position.

The STL overloads the "end()" concept to represent
two very different things:

(a) "no position at all" -- corresponds
to the conventions index=-1
or pointer=null

(b) endpoint of a range -- corresponds
to array size or a pointer to one
past the end of an array.

The first is, IMHO, just a poor choice
on the part of the STL authors -- it would
have been possible to add a "null" value
to the iterator concept at the beginning.
(It's probably too late now.)
And, since iterators are just a generalization
of a C-style pointer, for which null is one
of the more useful values, it's a surprising
omission.

The second is by its nature something that
is not generally useful except when paired with
something else -- an array size is usually
not much use unless paired with some way
to access the array, a "one-past-the-end"
pointer is not much use unless you have
another pointer that points into a valid
part of the array.


> > > It is useful as itself to indicate the
> > > position of an element, like
> > > the result of std::find,
>
> > This is the textbook example of an iterator
> > value that may not refer to anything at all.
> > You cannot use the result of std::find()
> > without an additional iterator value -- specifically,
> > the end() value of the sequence it was used on.
>
> It's not fundamentally different from returning -1, as is often done
> to search in arrays.

Except that -1 is instantly recognizable as
"not a position" without having to know the
array it refers to.

Frank Birbacher

unread,
May 16, 2009, 5:33:39 PM5/16/09
to
Hi!

alan_mc...@yahoo.com schrieb:


> (exceptions are _way_ too
> expensive, not to mention making the calling
> code clumsier)

If you immediately need to catch the exception, they are the wrong tool.
You are right to oppose. But often enough I wish there was a "const
mapped_type& std::map::at(key_type const& key) const" function which
would throw if the key was not found.


> If there are several rows with the same key,
> and I want to return them all, an STL iterator
> still wouldn't work, even though my user
> will probably want to "iterate." I need
> to return an object which can be iterated
> but also knows when it gets to the end
> and can be "rewound." This is _not_ an
> STL iterator. (A _pair_ of iterators might
> be enough, since start=end indicates "no rows".)

Just like std::map and multimap equal_range member function does? This
is perfectly fine for me.

In general I agree that iterators are seldom useful outside some scope
around the container, e.g. in a higher level interface definition. But
for working with all sorts of containers I really love the concept.

Frank

georg...@gmail.com

unread,
May 16, 2009, 5:30:53 PM5/16/09
to
On May 15, 4:58 pm, alan_mckenn...@yahoo.com wrote:
> If I returned an iterator, I would need to
> have another function to supply the iterator
> value that indicates "not found."

In this case, I fail to see how the scenario differs significantly
from using a custom range that contains an embedded end marker.

> If there are several rows with the same key,
> and I want to return them all, an STL iterator
> still wouldn't work, even though my user
> will probably want to "iterate." I need
> to return an object which can be iterated

You can make an iterator that has whatever behaviour you want.

cjhopman

unread,
May 16, 2009, 5:34:03 PM5/16/09
to
On May 15, 2:58 pm, Mathias Gaunard <loufo...@gmail.com> wrote:
> On 15 mai, 14:14, cjhopman <cjhop...@gmail.com> wrote:
> > ...

>
> What you're saying is that the iterator-based iteration would generate
> code like this:
>
> for(T* p = begin; p != 0;)
> {
> f(*p);
>
> ++p;
> if(*p == 0)
> p = 0;
>
> }
>
> while your range-based solution would generate code like this:
>
> for(T* p = begin; *p != 0; p++)
> {
> f(*p);
>
> }
>
> is the second one really faster?

Yes, the second one is faster.

I was actually more saying that the range one would be simpler, but it
turns out that it is faster too.

I wrote up a quick test case (similar to my segmented data structure
one). It takes about 1.60 seconds with the iterator loop and 0.95
seconds with the range loop on my laptop (compiled with g++ -O3, code
is here http://pages.cs.wisc.edu/~hopman/cstring_range.cpp). Pretty
neat, no?

-Chris Hopman

cjhopman

unread,
May 16, 2009, 5:33:52 PM5/16/09
to
On May 15, 2:59 pm, "Thomas Beckmann" <ka6552-...@online.de> wrote:
> Repeat your test reversing the sections in main, do your range test then the
> iterator test. Chances are you get different results due to memory
> allocation overhead in the runtime, cache misses, etc. Notice, the second
> run does not change any data which the hardware might take advantage of.

Reversing the sections in main makes no difference, the timing is
still the same.

Notice, the second run does the same thing as the first one on a
different data structure.

> A design issue is that range::next() mixes up advancing and querying
> operations. Now, for supporting const-correct code you will need an explicit
> has_next() function.

You want the advancing and querying operations to be done together,
otherwise you often end up doing the query twice (also, "advancing"
and "reaching the end" are very much related things). You could add an
additional querying operation, if you wanted.

> Also, I find the iterator implementation of advance and
> query easier to understand than the range implementation. Same applies for
> the sections in main() using it.

The sections in main using it are a simple issue... especially since
you could just use a range-based for loop. You can change the
interface around some to get something that you would like.

The iterator implementation is only easier due to the optimization
that the range implementation allowed me to do. If you remove the
optimization they will be almost exactly the same. In this case, a
range implementation can be more efficient than the iterator
implementation, there are other cases where the range implementation
would be simpler. In all cases, the range implementation could just be
a pair of iterators and so neither more complicated nor less efficient
than the iterator implementation.

> Finally, in this case 2u*sizeof(iterator) < sizeof(range) holds. I worked
> some time in embedded environment and could think of ways to make the
> iterator have a single pointer member only, i.e. by aligned allocations.

Again, a range could just be a pair of iterators. That is, a range can
do anything iterators can do... the reverse is not true.

-Chris Hopman

Hakusa

unread,
May 16, 2009, 5:33:20 PM5/16/09
to
On May 15, 8:16 am, "Thomas Beckmann" <ka6552-...@online.de> wrote:
> After having some thoughts on the issue my opinion is that there are places
> for positions (iterators) and ranges (pairs of iterators). So why not
> support both concepts, by i.e. overloading?
>
> Also, when thinking about strings and substrings I asked myself, why not
> make ranges provide the container interface? - Just to make my point clear,
> iterators, enumerators, ranges and containers are distinct but closely
> related.

Agreed. Though, the discussion seems to have devolved into deciding
whether iterators or ranges are the cure-all.

georg...@gmail.com

unread,
May 16, 2009, 8:46:55 PM5/16/09
to
On May 16, 6:33 pm, Hakusa <Hak...@gmail.com> wrote:
> Agreed. Though, the discussion seems to have devolved into deciding
> whether iterators or ranges are the cure-all.

No, that's where the discussion started. Andrei's presentation made
that claim ("Iterators must go"), and I came to the newsgroup for a
discusson on the merits of it.

Andrei Alexandrescu

unread,
May 17, 2009, 5:52:00 PM5/17/09
to
georg...@gmail.com wrote:
> On May 12, 5:11 pm, Andrei Alexandrescu
>> To me the main surprise was that ranges, in spite of their narrower
>> interface and dirt-simple implementation, can be used exclusively to
>> implement a superset of <algorithm> with huge gains in generality and at
>> a fraction of the cost. (Yes, I've implemented <algorithm> and more
>> twice: once with iterators, then with ranges.)
>
> Which cost are you referring to? Cost of development of the library?
>
> I find this rather awkward:
>
> template <class R, class T>
> R find(R r, T value);
>
> Isn't it wasteful to return a range here (I'm assuming that a range is
> at worst twice the size of an iterator)? Doesn't it pre-suppose that I
> haven't already eliminated duplicates from the container and ordered
> them somehow?

The spec of find is very simple: reduce r from its front until value is
found, or r is exhausted. There's no ordering or uniqueness requirement.

The range may be larger than iterator but that hardly impedes find's
efficiency. The find function takes linear time, and therefore does
enough work to drown the extra cost of returning e.g. two words instead
of one.


Andrei

Andrei Alexandrescu

unread,
May 17, 2009, 5:51:58 PM5/17/09
to
cjhopman wrote:
> On May 15, 2:59 pm, "Thomas Beckmann" <ka6552-...@online.de> wrote:
>> Repeat your test reversing the sections in main, do your range test then the
>> iterator test. Chances are you get different results due to memory
>> allocation overhead in the runtime, cache misses, etc. Notice, the second
>> run does not change any data which the hardware might take advantage of.
>
> Reversing the sections in main makes no difference, the timing is
> still the same.
>
> Notice, the second run does the same thing as the first one on a
> different data structure.
>
>> A design issue is that range::next() mixes up advancing and querying
>> operations. Now, for supporting const-correct code you will need an explicit
>> has_next() function.
>
> You want the advancing and querying operations to be done together,
> otherwise you often end up doing the query twice (also, "advancing"
> and "reaching the end" are very much related things). You could add an
> additional querying operation, if you wanted.

This is an interesting point. My first design defined a different
interface for input ranges versus forward ranges. Input ranges were:

template <class T> struct InputRange
{
bool empty() const;
T popNext();
}

whereas forward ranges were (and are):

template <class T> struct ForwardRange
{
bool empty() const;
T& front();
void popFront();
}

I think one can pursue such a design, but I found it difficult to write
code that works efficiently with forward ranges adapted to input ranges.
(Notice return-by-value from popNext). To simplify everybody's life, I
decided to require input ranges to cache the last element fetched (just
like the STL does). Things seem to work pretty well.


Andrei

georg...@gmail.com

unread,
May 17, 2009, 9:10:50 PM5/17/09
to
On May 17, 6:52 pm, Andrei Alexandrescu

<SeeWebsiteForEm...@erdani.org> wrote:
> The find function takes linear time, and therefore does
> enough work to drown the extra cost of returning e.g. two words instead
> of one.

In the case of find(), I can't argue against performance. I'll re-word
my posting.

If I know there is only zero or one occurrence of an element in the
sequence, I am paying the space price of a range when I know a range
can't possibly exist. That same space price would also hold true if I
simply wanted to hold N individual ranges into a sequence that I know
only represent one element each. So by eliminating iterators, I would
be forced to pay the price of extra storage in a range in
circumstances where I know ranges aren't required.

Doesn't that mean that I paying for space for something I don't use?

georg...@gmail.com

unread,
May 17, 2009, 9:11:35 PM5/17/09
to
Further to the point of using ranges with erase:

With iterators, to erase everything in a sequence from one particular
element to the end, I would write something like:

auto i = find(v.begin(), v.end(), val);
v.erase(i, v.end());

What would this look like with ranges?

auto r = find(v.all(), val);
auto r2 = range(r.front(), v.end());
v.erase(r2);

That's not right because I assume r.front() loses the context of the
container it's in, so there would be no way to detect that the same
container is being used to construct the new range.

How would this work?

Martin Eisenberg

unread,
May 18, 2009, 10:36:56 AM5/18/09
to
Andrei Alexandrescu wrote:

> The spec of find is very simple: reduce r from its front until
> value is found, or r is exhausted. There's no ordering or
> uniqueness requirement.

That sounds more like lower_bound. In a parallel post, George Ryan
evidently has find(r, x) := upper_bound(lower_bound(r, x), x) in
mind. That seems reasonable too when the range is in fact sorted.


Martin

--
Quidquid latine scriptum est, altum videtur.

Andrei Alexandrescu

unread,
May 18, 2009, 10:43:54 AM5/18/09
to
georg...@gmail.com wrote:
> On May 17, 6:52 pm, Andrei Alexandrescu
> <SeeWebsiteForEm...@erdani.org> wrote:
>> The find function takes linear time, and therefore does
>> enough work to drown the extra cost of returning e.g. two words instead
>> of one.
>
> In the case of find(), I can't argue against performance. I'll re-word
> my posting.
>
> If I know there is only zero or one occurrence of an element in the
> sequence, I am paying the space price of a range when I know a range
> can't possibly exist. That same space price would also hold true if I
> simply wanted to hold N individual ranges into a sequence that I know
> only represent one element each. So by eliminating iterators, I would
> be forced to pay the price of extra storage in a range in
> circumstances where I know ranges aren't required.
>
> Doesn't that mean that I paying for space for something I don't use?
>
>

If you want to hold N individual ranges into a non-random-access
container, ranges will be at a disadvantage compared to iterators. You
can store pointers to the elements though, or indexes if the container
has random access. After having hacked into this stuff for years, I can
say that I am willing to forgo this one scenario for the sake of
advantages that ranges bring.

Andrei Alexandrescu

unread,
May 18, 2009, 10:43:52 AM5/18/09
to
georg...@gmail.com wrote:
> Further to the point of using ranges with erase:
>
> With iterators, to erase everything in a sequence from one particular
> element to the end, I would write something like:
>
> auto i = find(v.begin(), v.end(), val);
> v.erase(i, v.end());
>
> What would this look like with ranges?
>
> auto r = find(v.all(), val);
> auto r2 = range(r.front(), v.end());
> v.erase(r2);
>
> That's not right because I assume r.front() loses the context of the
> container it's in, so there would be no way to detect that the same
> container is being used to construct the new range.
>
> How would this work?
>

v.erase(find(v.all(), val));

Andrei

Dave Harris

unread,
May 18, 2009, 10:43:54 AM5/18/09
to
pdi...@gmail.com (Peter Dimov) wrote (abridged):

> The slides were interesting, but left me with a question. std::find
> returns an iterator, let's call it 'middle', from which one can
> construct both [begin, middle) and [middle, end). But the
> range-based find returns the second range, and I see no way
> (from what has been shown in the slides) to derive [begin, middle)
> from it.

We'd need functions to implement an algebra of ranges. Concatenating
ranges is easy. Taking the first N items of a range is also easy; I often
use std::find with erase, like:
iterator middle = v.find( v.begin(), v.end(), item );
if (middle != v.end())
v.erase( middle );

and if erase() now only accepts range arguments, and std::find() returns
the range [middle, end), then presumably we need to reduce the result of
std::find() to a range which is either empty, or contains a single item,
so that erase() doesn't delete too much. And this can be done in a
generic way.

A function that returns [begin, middle) when passed [begin, end) and
[middle, end) seems harder to write in a generic way. In other words, I
think its implementation depends on the implementation of the ranges.
Some types of range may not be able to represent it at all. For example,
if you have a null-terminated string, and a [middle, end) range
implemented as a single pointer, then [begin, middle) has to be
represented differently.

Is that a problem? It's surely no different to when we produced
categories of iterator.

-- Dave Harris, Nottingham, UK.

Dave Harris

unread,
May 18, 2009, 10:43:54 AM5/18/09
to
georg...@gmail.com () wrote (abridged):

> With iterators, to erase everything in a sequence from one
> particular element to the end, I would write something like:
>
> auto i = find(v.begin(), v.end(), val);
> v.erase(i, v.end());
>
> What would this look like with ranges?

I think you misunderstand what range find returns: it returns (middle,
end]. So to do what you want is the simple case:

v.erase( find( v.all(), value ) );

The harder (and more common) case is if you only want to erase a single
item. Then you need something like:

auto tail = find( v.all(), value );
v.erase( first<1>( tail ) );

where the result of first<N>() is a range consisting of the first N items
of its argument range (or fewer, if the argument range is shorter).
("First" may not be the best name.)

(Alternatively, we could have a version of erase that takes a range and
only deletes the first item of that range; but we'd have to be careful
not to confuse it with the normal range-erase. It might be better to just
provide a mechanism to make single-item ranges easy to detect and
overload/specialise for at compile-time, if the efficiency of the
single-item case matters.)

This definition of find() is arguably what std::find() does already. In
practice you usually have to compare its result to end() to see if
anything was found, so you have:

auto end = v.end();
auto middle = find( v.begin(), end, value );
if (middle != end)
v.erase( middle );

In other words, find() ought to return end as well as middle, but it can
only return 1 result, and end is the argument the caller passed to it, so
it relies on the caller remembering that argument to reconstruct the
range. (In practice we usually go back to the original v.end() instead,
assuming that it won't have changed.) Having find() return both middle
and end makes its result more self-contained and cohesive.

(I find "more cohesive" to be common phrase when comparing ranges with
iterators.)

-- Dave Harris, Nottingham, UK.

--

Seungbeom Kim

unread,
May 18, 2009, 10:43:55 AM5/18/09
to
georg...@gmail.com wrote:
>
> I find this rather awkward:
>
> template <class R, class T>
> R find(R r, T value);
>
> Isn't it wasteful to return a range here (I'm assuming that a range is
> at worst twice the size of an iterator)?

That could be useful in the particular context of continued searches,
but otherwise I also feel it awkward that find() should return a range;
intuitively, find() answers the question "Where is something?", and
the the type of the answer should be a position (or positions), not
a range. The waste of space is secondary.

I'm also curious to hear what other search algorithms that's not linear,
such as lower_bound(), upper_bound() or equal_range(), should return.

--
Seungbeom Kim

Peter Dimov

unread,
May 18, 2009, 4:37:46 PM5/18/09
to
On May 18, 5:43 pm, brang...@cix.co.uk (Dave Harris) wrote:
> pdi...@gmail.com (Peter Dimov) wrote (abridged):
>
> > The slides were interesting, but left me with a question. std::find
> > returns an iterator, let's call it 'middle', from which one can
> > construct both [begin, middle) and [middle, end). But the
> > range-based find returns the second range, and I see no way
> > (from what has been shown in the slides) to derive [begin, middle)
> > from it.
>
> We'd need functions to implement an algebra of ranges.

This is, more or less, my point; that the necessary algebra was not
shown in the slides.

> Concatenating ranges is easy. Taking the first N items of a range is also

easy; [...]

Not if you stick to the range concepts in the slides (you can, in
principle, derive the first N items of a bidirectional multipass range
but it doesn't feel quite right).

If you have operator- for ranges, so that you can derive [begin,
middle) as [begin, end) - [middle, end), then taking the first N items
is also easy: R - advance(R, N). (It still requires a multipass range
though.)

...

> A function that returns [begin, middle) when passed [begin, end) and
> [middle, end) seems harder to write in a generic way.

One can think of ranges ending at 'end' as isomorphic to iterators
from which 'end' is reachable. [begin, end) is isomorphic to 'begin',
[middle, end) is isomorphic to 'middle', and the iterator pair <begin,
middle> can be said to be isomorphic to the range pair < [begin, end),
[middle, end) >. Which can now be convinced to model the forward range
concept... but not the bidirectional range concept because [middle,
end) cannot grow into [begin, end).

> Is that a problem?

Not really, I was just wondering how the algebraic closure looks like,
as the concepts in the slides do not, in my opinion, match iterators
in expressive power.

Andrei Alexandrescu

unread,
May 18, 2009, 4:39:33 PM5/18/09
to
Dave Harris wrote:
> A function that returns [begin, middle) when passed [begin, end) and
> [middle, end) seems harder to write in a generic way. In other words, I
> think its implementation depends on the implementation of the ranges.
> Some types of range may not be able to represent it at all. For example,
> if you have a null-terminated string, and a [middle, end) range
> implemented as a single pointer, then [begin, middle) has to be
> represented differently.

Yes, if arbitrary slicing is needed for ranges, it would have to be a
primitive. So far I got away without it, but that doesn't mean it ought
to be part of certain range categories.

Andrei

Andrei Alexandrescu

unread,
May 18, 2009, 4:38:57 PM5/18/09
to
Seungbeom Kim wrote:
> georg...@gmail.com wrote:
>>
>> I find this rather awkward:
>>
>> template <class R, class T>
>> R find(R r, T value);
>>
>> Isn't it wasteful to return a range here (I'm assuming that a range is
>> at worst twice the size of an iterator)?
>
> That could be useful in the particular context of continued searches,
> but otherwise I also feel it awkward that find() should return a range;
> intuitively, find() answers the question "Where is something?", and
> the the type of the answer should be a position (or positions), not
> a range. The waste of space is secondary.

The answer type reflects the input range capabilities.

> I'm also curious to hear what other search algorithms that's not linear,
> such as lower_bound(), upper_bound() or equal_range(), should return.
>

In my implementation the three primitives require a random-access range.
They respectively return (a) the subrange of elements strictly smaller
than the sought value, (b) the subrange of elements strictly greater
than the sought value, and (c) the subrange of elements neither smaller
nor greater than the sought value. It's a pleasure working with those
instead of their iterator counterparts.

http://www.digitalmars.com/d/2.0/phobos/std_algorithm.html#lowerBound
http://www.digitalmars.com/d/2.0/phobos/std_algorithm.html#upperBound
http://www.digitalmars.com/d/2.0/phobos/std_algorithm.html#equalRange


Andrei

--

Dragan Milenkovic

unread,
May 18, 2009, 4:38:02 PM5/18/09
to
Andrei Alexandrescu wrote:
[snip]

>
> If you want to hold N individual ranges into a non-random-access
> container, ranges will be at a disadvantage compared to iterators. You
> can store pointers to the elements though, or indexes if the container
> has random access. After having hacked into this stuff for years, I can
> say that I am willing to forgo this one scenario for the sake of
> advantages that ranges bring.

You cannot use pointers if you want to insert/erase relative to those
positions. Random access containers invalidate iterators on
insert/erase, so one _must_ use indices. I usually save iterators
for lists and sets.

Isn't this just how boost multi_index library uses iterators?

IMHO, it's not just one scenario. More like selling your left arm
for the sake of buying a car. I'm sure some people will want their
arm back so there will be something like:

typedef int position_type;
void erase_by_position(container & c, const position_type & p);
range position_to_range(container & c, const position_type & p);

Also, may I ask is there operator== defined for a range and how does
it work? Is a range even meant to be saved or only used during the
current iteration?

I apologize for commenting without having the complete picture
on the range proposal, but you did say it doesn't cover this scenario.
Is there any kind of document available concerning the proposal?

--
Dragan

Andrei Alexandrescu

unread,
May 18, 2009, 4:39:06 PM5/18/09
to
Dave Harris wrote:
> georg...@gmail.com () wrote (abridged):
>> With iterators, to erase everything in a sequence from one
>> particular element to the end, I would write something like:
>>
>> auto i = find(v.begin(), v.end(), val);
>> v.erase(i, v.end());
>>
>> What would this look like with ranges?
>
> I think you misunderstand what range find returns: it returns (middle,
> end]. So to do what you want is the simple case:
>
> v.erase( find( v.all(), value ) );
>
> The harder (and more common) case is if you only want to erase a single
> item. Then you need something like:
>
> auto tail = find( v.all(), value );
> v.erase( first<1>( tail ) );
>
> where the result of first<N>() is a range consisting of the first N items
> of its argument range (or fewer, if the argument range is shorter).
> ("First" may not be the best name.)

v.erase(tail, 1);

Andrei

georg...@gmail.com

unread,
May 18, 2009, 4:44:44 PM5/18/09
to
On May 18, 11:43 am, Andrei Alexandrescu
> v.erase(find(v.all(), val));

Ah, I see - I misunderstood. I assumed the range returned from find
would only contain the elements that matched the find, but the
reduction of R contains all of the original elements after the found
element.

georg...@gmail.com

unread,
May 18, 2009, 4:44:55 PM5/18/09
to
On May 18, 11:43 am, brang...@cix.co.uk (Dave Harris) wrote:

> I think you misunderstand what range find returns: it returns

Yes, I didn't realise that the range returned from find contained a
sub-range from the original. I assumed it returned a new range with
just the found elements in it.


> The harder (and more common) case is if you only want to erase a single
> item. Then you need something like:
>
> auto tail = find( v.all(), value );
> v.erase( first<1>( tail ) );

I don't find the syntax

v.erase( first<1>( find( v.all(), val ) ) );

too ugly, so I'm happy with that. :-)

Andrei Alexandrescu

unread,
May 18, 2009, 4:45:15 PM5/18/09
to
Martin Eisenberg wrote:
> Andrei Alexandrescu wrote:
>
>> The spec of find is very simple: reduce r from its front until
>> value is found, or r is exhausted. There's no ordering or
>> uniqueness requirement.
>
> That sounds more like lower_bound. In a parallel post, George Ryan
> evidently has find(r, x) := upper_bound(lower_bound(r, x), x) in
> mind. That seems reasonable too when the range is in fact sorted.

It's not quite lower_bound, as I think of lower_bound as the "lower",
left-hand side, range.

Returning the remainder of the range is the most general option possible
when you want to accommodate input ranges. It so happens it turns out to
be very convenient in practice.


Andrei

--

georg...@gmail.com

unread,
May 18, 2009, 4:46:31 PM5/18/09
to
On May 11, 1:37 pm, Pete Becker <p...@versatilecoding.com> wrote:

> If you have a list of things you think are problems you might check the
> current working draft to see whether they've been addressed before
> asserting "no 0x changes". In particular, look at the Range concept and
> range-based for loops.

I have been reading through the draft, but as you can imagine it's
much easier to follow when you've got someone to offer pointers. :-)

It's unfortunate that the range concepts didn't get extended to the
algorithm library, but I understand there's only so many hours in the
day and that it's been ten years between standards.

Like everyone else, I have concerns over the length of time between
standards updates - Has the deadline for TR2 proposals lapsed?

georg...@gmail.com

unread,
May 19, 2009, 8:38:23 AM5/19/09
to
On May 18, 5:38 pm, Dragan Milenkovic <dra...@plusplus.rs> wrote:

> Is there any kind of document available concerning the proposal?

I found this proposal by Thorsten Ottosen:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1871.html
and it looks like it's in the holding pattern for TR2. I haven't read
enough of it yet to see how similar it is to Andrei's approach.

I couldn't find anything else on the committee page, but I am not a
pro at navigating it.


--

Dave Harris

unread,
May 19, 2009, 8:41:41 AM5/19/09
to
SeeWebsit...@erdani.org (Andrei Alexandrescu) wrote (abridged):

> > v.erase( first<1>( tail ) );
>
> v.erase(tail, 1);

Is there a good way to do it without an implied loop? (I suppose it
matters less for vectors than for lists, but even for vectors we are
destructing N items with a loop rather than 1 item without one.)

I guess using something like first<> doesn't work if erase only accepts
ranges defined by its own vector. At least, it needs a way to tunnel down
into the range to get a pointer or similar that corresponds to a position
in its array.

-- Dave Harris, Nottingham, UK.

--

Martin Eisenberg

unread,
May 19, 2009, 3:22:25 PM5/19/09
to
Andrei Alexandrescu wrote:
> Martin Eisenberg wrote:
>> Andrei Alexandrescu wrote:
>>
>>> The spec of find is very simple: reduce r from its front until
>>> value is found, or r is exhausted. There's no ordering or
>>> uniqueness requirement.
>>
>> That sounds more like lower_bound. In a parallel post, George
>> Ryan evidently has find(r, x) := upper_bound(lower_bound(r, x),
>> x) in mind. That seems reasonable too when the range is in fact
>> sorted.
>
> It's not quite lower_bound, as I think of lower_bound as the
> "lower", left-hand side, range.

I see. That raises the question how to get from a range to its
complement, for the bounding algorithms also serve to locate the
element just inside a bound as well as the one just outside.

> Returning the remainder of the range is the most general option
> possible when you want to accommodate input ranges. It so
> happens it turns out to be very convenient in practice.

But once you detect the lower bound in an input range, the prefix is
gone. So lower_bound is undefined on input ranges?


Martin

--
Quidquid latine scriptum est, altum videtur.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

Andrei Alexandrescu

unread,
May 19, 2009, 6:02:21 PM5/19/09
to
Dave Harris wrote:
> SeeWebsit...@erdani.org (Andrei Alexandrescu) wrote (abridged):
>>> v.erase( first<1>( tail ) );
>> v.erase(tail, 1);
>
> Is there a good way to do it without an implied loop? (I suppose it
> matters less for vectors than for lists, but even for vectors we are
> destructing N items with a loop rather than 1 item without one.)

The interesting problem is really splice(), i.e. yanking an entire
sublist off a list in O(1). I have not addressed that in my lib yet.

> I guess using something like first<> doesn't work if erase only accepts
> ranges defined by its own vector. At least, it needs a way to tunnel down
> into the range to get a pointer or similar that corresponds to a position
> in its array.

That is correct.


Andrei

Andrei Alexandrescu

unread,
May 19, 2009, 6:02:34 PM5/19/09
to
georg...@gmail.com wrote:
> On May 18, 5:38 pm, Dragan Milenkovic <dra...@plusplus.rs> wrote:
>
>> Is there any kind of document available concerning the proposal?
>
> I found this proposal by Thorsten Ottosen:
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1871.html
> and it looks like it's in the holding pattern for TR2. I haven't read
> enough of it yet to see how similar it is to Andrei's approach.

That proposal, as well as Adobe's ranges, are very different from the
ranges in my talk. This is because Boost's and Adobe's ranges use
iterators as a primitive notion.

Andrei

--

Andrei Alexandrescu

unread,
May 19, 2009, 6:02:00 PM5/19/09
to
Martin Eisenberg wrote:
> Andrei Alexandrescu wrote:
>> Martin Eisenberg wrote:
>>> Andrei Alexandrescu wrote:
>>>
>>>> The spec of find is very simple: reduce r from its front until
>>>> value is found, or r is exhausted. There's no ordering or
>>>> uniqueness requirement.
>>> That sounds more like lower_bound. In a parallel post, George
>>> Ryan evidently has find(r, x) := upper_bound(lower_bound(r, x),
>>> x) in mind. That seems reasonable too when the range is in fact
>>> sorted.
>> It's not quite lower_bound, as I think of lower_bound as the
>> "lower", left-hand side, range.
>
> I see. That raises the question how to get from a range to its
> complement, for the bounding algorithms also serve to locate the
> element just inside a bound as well as the one just outside.
>
>> Returning the remainder of the range is the most general option
>> possible when you want to accommodate input ranges. It so
>> happens it turns out to be very convenient in practice.
>
> But once you detect the lower bound in an input range, the prefix is
> gone. So lower_bound is undefined on input ranges?

It is in the original, iterator-based STL and in my library. In
addition, currently my library requires a random-access range for
lower_bound. I plan to return a different type from lower_bound when an
input or forward range are passed: a lazy range of unknown length that
makes progress until it hits the sought element, at which point it
stops. That would make lower_bound more general than iterator_based STL
because it can operate on input ranges (iterator-based STL requires at
least forward ranges)>

Andrei

--

georg...@gmail.com

unread,
May 19, 2009, 8:51:23 PM5/19/09
to
On May 19, 7:02 pm, Andrei Alexandrescu
<SeeWebsiteForEm...@erdani.org> wrote:

> The interesting problem is really splice(), i.e. yanking an entire
> sublist off a list in O(1). I have not addressed that in my lib yet.

How would ranges handle other STL containers like deques and lists,
anyway? I think in iterators as the abstraction of sequences, so I
assume that the range class(es) would have to specific to each type of
container?

Andrei Alexandrescu (See Website For Email)

unread,
May 20, 2009, 8:13:24 AM5/20/09
to
georg...@gmail.com wrote:
> On May 19, 7:02 pm, Andrei Alexandrescu
> <SeeWebsiteForEm...@erdani.org> wrote:
>
>> The interesting problem is really splice(), i.e. yanking an entire
>> sublist off a list in O(1). I have not addressed that in my lib yet.
>
> How would ranges handle other STL containers like deques and lists,
> anyway? I think in iterators as the abstraction of sequences, so I
> assume that the range class(es) would have to specific to each type of
> container?

In that regard they are very much like iterators: they have intimate
knowledge of the container they crawl on.

Andrei

georg...@gmail.com

unread,
May 20, 2009, 3:43:36 PM5/20/09
to
On May 20, 9:13 am, "Andrei Alexandrescu (See Website For Email)"

> In that regard they are very much like iterators: they have intimate
> knowledge of the container they crawl on.

Okay, you may have thought I was through asking questions, but alas I
am not. :-)

I'm summarizing the whole thread here, but so far I see advantages in
not having to worry about this class of error:

int a[10], b[10], c[10];
std::copy(a,b,c);

even though there appears to be controversy over how common or
pressing those errors may be. And I also like that ranges can be
chained together. The C++ committee seems to on board with the Range
concept and TR2 proposals for ranges.

I like iterators because they are the good(ish?) concept for
specifying a single position into a sequence (notwithstanding their
combined access and traversal, which may or may not be a zero sum
argument, and arguments in favour of indicies or pointers which are
possibly out-of-scope for this thread). Plus, I don't have to pay for
the space cost of a range when I don't need it, and that seems very C+
+-like to me.

So it really comes down to why using iterators as a primitive notation
would be good or bad for C++, and I haven't really read a good
argument against having both iterators _and_ ranges. It appears to me
that the only major failing of iterators (their not knowing their
limits) could be solved by having a common past-the-end representation
external to the containers; and that seems fairly straightforward.
There are some minor syntactical ugliness with iterators, but those
arguments get subjective pretty fast.

You're stating that iterators should be a thing of the past, but I
haven't seen a coup-de-grace for the argument against using iterators
in conjunction with (or as primitives for) ranges. There's one slide
(#21) where you list the final nail in the coffin for iterators.
Perhaps you (and/or others) wouldn't mind elaborating on them a bit?

"All iterator types are fundamentally unsafe." I'm not sure what you
mean here. Do you mean you can iterate off the end and frolic in
memory?

"For most iterator types, given an iterator..." You mention you can't
say whether certain operations are possible on an iterator, but I've
never really tried to use iterators without already knowing their
types. Is there an advance use case where that's common?

georg...@gmail.com

unread,
May 21, 2009, 11:12:02 AM5/21/09
to
On May 19, 7:02 pm, Andrei Alexandrescu wrote:

> This is because Boost's and Adobe's ranges use iterators as a
> primitive notion.

I have to admit that the proposal's syntax of

std::vector<int*> vec = ...;
copy( vec | reversed | indirected,
ostream_iterator<int>( cout ) );

seems overcomplicated next to (guessing):

copy( vec.all(), ostream_range( cout ) );

however the former does have the advantage of dereferencing the
contents of the container automagically for you. That seems kinda
handy.

I did a quick look of the std.algorithm page for D, and I didn't see
anything in the copy() or the range class there that would allow for
something like that. Any plans?

Another thing that bothers me a bit about the range proposal for TR2
is that it looks like there may be a lot of template metamagic going
on, and that always seems to bog down compilation times. I mean,
processors are fast and all, but even with my four cores I can
sometimes be impressed at how long some boost libraries can take. :-)

Andrei Alexandrescu

unread,
May 21, 2009, 3:52:31 PM5/21/09
to
georg...@gmail.com wrote:
> On May 20, 9:13 am, "Andrei Alexandrescu (See Website For Email)"
>> In that regard they are very much like iterators: they have intimate
>> knowledge of the container they crawl on.
>
> Okay, you may have thought I was through asking questions, but alas I
> am not. :-)
>
> I'm summarizing the whole thread here, but so far I see advantages in
> not having to worry about this class of error:
>
> int a[10], b[10], c[10];
> std::copy(a,b,c);
>
> even though there appears to be controversy over how common or
> pressing those errors may be. And I also like that ranges can be
> chained together. The C++ committee seems to on board with the Range
> concept and TR2 proposals for ranges.

I consider the added safety of copy() a nice perk brought about by the
higher abstraction provided by ranges.

> I like iterators because they are the good(ish?) concept for
> specifying a single position into a sequence (notwithstanding their
> combined access and traversal, which may or may not be a zero sum
> argument, and arguments in favour of indicies or pointers which are
> possibly out-of-scope for this thread). Plus, I don't have to pay for
> the space cost of a range when I don't need it, and that seems very C+
> +-like to me.

Sounds more like C to me. Iterators are like pointers - unwieldy,
unsafe, error-prone, and almost always suspect when not encapsulated
(e.g. smart pointers, vector, other collections etc.) Ranges are a
definite step up. In my experience, often there's no extra cost to
ranges because what was needed was two iterators anyway.

> So it really comes down to why using iterators as a primitive notation
> would be good or bad for C++, and I haven't really read a good
> argument against having both iterators _and_ ranges.

Complexity.

> It appears to me
> that the only major failing of iterators (their not knowing their
> limits) could be solved by having a common past-the-end representation
> external to the containers; and that seems fairly straightforward.
> There are some minor syntactical ugliness with iterators, but those
> arguments get subjective pretty fast.
>
> You're stating that iterators should be a thing of the past, but I
> haven't seen a coup-de-grace for the argument against using iterators
> in conjunction with (or as primitives for) ranges. There's one slide
> (#21) where you list the final nail in the coffin for iterators.
> Perhaps you (and/or others) wouldn't mind elaborating on them a bit?

Ranges are a coup-de-grace for iterators as much as vector<int> is a
coup-de-grace for int* (accompanied by the inevitable size_t), and smart
pointers are a coup-de-grace for bald pointers. Yes, vector and smart
pointers do add overhead. Is it worth it? I'd say so.

If I can choose, I will define one abstraction instead of two. As such,
a design offering both iterators and ranges would be distinctly
unappealing to me (also because the uncomfortable problem of ranges that
don't quite have meaningful iterators associated).

> "All iterator types are fundamentally unsafe." I'm not sure what you
> mean here. Do you mean you can iterate off the end and frolic in
> memory?

Which of an iterator's primitive operations can you check?

> "For most iterator types, given an iterator..." You mention you can't
> say whether certain operations are possible on an iterator, but I've
> never really tried to use iterators without already knowing their
> types. Is there an advance use case where that's common?

I don't understand this.


Andrei

Andrei Alexandrescu

unread,
May 21, 2009, 3:52:18 PM5/21/09
to
Peter Dimov wrote:
> On May 18, 5:43 pm, brang...@cix.co.uk (Dave Harris) wrote:
>> pdi...@gmail.com (Peter Dimov) wrote (abridged):
>>
>>> The slides were interesting, but left me with a question. std::find
>>> returns an iterator, let's call it 'middle', from which one can
>>> construct both [begin, middle) and [middle, end). But the
>>> range-based find returns the second range, and I see no way
>>> (from what has been shown in the slides) to derive [begin, middle)
>>> from it.
>> We'd need functions to implement an algebra of ranges.
>
> This is, more or less, my point; that the necessary algebra was not
> shown in the slides.
>
>> Concatenating ranges is easy. Taking the first N items of a range is also
> easy; [...]
>
> Not if you stick to the range concepts in the slides (you can, in
> principle, derive the first N items of a bidirectional multipass range
> but it doesn't feel quite right).

The way to derive the first N items of a range is by creating a new kind
of range (called Take in D's std.algorithm) that takes another range and
a number. The Take range lazily iterates its host range until the number
gets exhausted. There's a convenience function called take() that gives
you type deduction. So you can write:

foreach (e; take(40, uniform(0, 100))
{
... use e ...
}

to get 40 random integers numbers in [0, 100). Lazy iteration is key for
composing infinite ranges, because obviously random numbers will never
finish so any eager operation on them will go on forever.

There's one interesting detail about Take alluded to in my slides. If
the host range offers a size() operation (length in D lingo), the Take
range also offers a size() operation which is obviously min(n,
host.size()). If the host range is infinite (a primitive concept in D's
ranges), then again Take has a size() computed as n. Finally, if the
range is finite but doesn't know its length, Take also doesn't know its
length. I think this is one pretty cool example on how range categories
interact.

> If you have operator- for ranges, so that you can derive [begin,
> middle) as [begin, end) - [middle, end), then taking the first N items
> is also easy: R - advance(R, N). (It still requires a multipass range
> though.)

I gave this some more thought and figured what the right design is.
find() must return the right-hand range, any other design would be too
limiting. However sometimes the left-hand range is needed, and that is
best done with an until function. until lazily iterates its input range
until it finds the sought object. So then you can iterate through a socket:

foreach (char c; until(socket, '\n')) { ... }

That loop will read a line. If you just want to copy that line, you'd write:

copy(until(socket, '\n'), appender(&someString));

(This hasn't been implemented yet, don't look it up.)

>> A function that returns [begin, middle) when passed [begin, end) and
>> [middle, end) seems harder to write in a generic way.
>
> One can think of ranges ending at 'end' as isomorphic to iterators
> from which 'end' is reachable.

I think they're isomorphic to iterators from which 'end' is reachable
and distinguishable.

> [begin, end) is isomorphic to 'begin',
> [middle, end) is isomorphic to 'middle', and the iterator pair <begin,
> middle> can be said to be isomorphic to the range pair < [begin, end),
> [middle, end) >. Which can now be convinced to model the forward range
> concept... but not the bidirectional range concept because [middle,
> end) cannot grow into [begin, end).
>> Is that a problem?
>
> Not really, I was just wondering how the algebraic closure looks like,
> as the concepts in the slides do not, in my opinion, match iterators
> in expressive power.

Any higher-level abstraction will by necessity disable some of the
functionality that can be achieved with lower-level abstractions. The
key is to gain enough from the better abstraction to justify the lost
expressive power. After extensive experience with both iterators and
ranges, I can say that indeed there is loss (e.g. "I want a compact
index storing iterators in collections without random access") but that
is overwhelmed by the huge gains in power, simplicity, and generality.
I've implemented the STL with iterators and then a large superset of it
with ranges, in a much shorter time and with enormous gains in
generality. In wake of that experience, to me there's no looking back at
iterators.


Andrei

Andrei Alexandrescu

unread,
May 21, 2009, 3:52:03 PM5/21/09
to

There is no proposal. I plan to write an article in August.

I did not find it necessary to define comparison for equality between
ranges. I do have an equal(r1, r2) that compares ranges element for element.


Andrei


--

Dave Harris

unread,
May 21, 2009, 9:38:42 PM5/21/09
to
georg...@gmail.com () wrote (abridged):

> So it really comes down to why using iterators as a primitive
> notation would be good or bad for C++, and I haven't really read a
> good argument against having both iterators _and_ ranges.

There are two sides to that issue: containers and algorithms.

First, I doubt if anyone is really going to break existing C++ code by
deleting std::vector<>.begin(), or std::find(first, last, value), so we
are stuck with iterators for C++. If we add ranges, we have to make sure
they play nicely with iterators.

I don't see a problem on the containers side. I think containers need to
export something like an iterator regardless, that efficiently represents
a single position and can be incremented etc in a uniform way. What we
have now is at worst over-engineered but still useful. Containers can
export ranges in addition, and probably those ranges will be implemented
in terms of one or two iterators. There's no clash of names because the
iterator-producing functions will use names like begin() and end(), and
the range functions will use names like all().

Adding range-based algorithms alongside iterator-based algorithms strikes
me as a bit trickier. If you use the same function names, then the range
version of binary_op transform() will have the same name and number of
arguments as the iterator version of the unary_op transform(), and the
arguments are templated, so they'll tend to match each other. I imagine
we'd either need some category magic to sort it out, or else put the
range-based algorithms in a different namespace entirely, or else just
give them different names.

For me, the basic issue Andrei raises is whether the algorithms should be
implemented most directly in terms of ranges, or in terms of iterators.
In other words, if range-based algorithms are in a new namespace called
"stdr", then should we have:

template< class Range, class Value >
Range stdr::find( Range all, Value value ) {
return Range(
std::find( all.begin(), all.end(), value ), all.end() );
}

or:

template< class Iterator, class Value >
Iterator std::find( Iterator first, Iterator last, Value value ) {
stdr::iterator_range<Iterator> r( first, last );
return stdr::find( r, value ).begin();
}

Andrei argues persuasively that the algorithms should use ranges as
primitives, mainly because ranges are more general. That is, ranges
should not be required to have iterator-producing members begin() and
end() in order to work with algorithms. (Although some ranges can have.)

-- Dave Harris, Nottingham, UK.

--

Hakusa

unread,
May 21, 2009, 9:36:21 PM5/21/09
to
On May 21, 11:12 am, george.r...@gmail.com wrote:
> On May 19, 7:02 pm, Andrei Alexandrescu wrote:
>
> > This is because Boost's and Adobe's ranges use iterators as a
> > primitive notion.
>
> I have to admit that the proposal's syntax of
>
> std::vector<int*> vec = ...;
> copy( vec | reversed | indirected,
> ostream_iterator<int>( cout ) );
>
> seems overcomplicated next to (guessing):
>
> copy( vec.all(), ostream_range( cout ) );

To me, the proposal is overcomplicated next to
copy (
vec.begin(), vec.end(),
ostream_iterator<int>( cout, " " )
);

And I don't understand why operator | (Container, SomethingElse)
should return ... ANYTHING. I hate to use such a cliche term, but it
seems like extreme syntactic sugar. Maybe more syntactic magic.

What should be done if I type "vec | reversed | in_order"? (Or
whatever it should be.) Why should one need knowledge of these
attribute things in order to call copy?

It is loading more messages.
0 new messages