iterators and views of lists

5 views
Skip to first unread message

Brendan Miller

unread,
Dec 15, 2009, 10:39:29 PM12/15/09
to pytho...@python.org
I was trying to reimplement some of the c++ library of generic
algorithms in c++ in python, but I was finding that this is
problematic to do this in a generic way because there isn't any
equivalent of c++'s forward iterators, random access iterators, etc.
i.e. all python iterators are just input iterators that can't mutate
the sequence they iterate over nor move backwards or by an arbitrary
offset.

I'm wondering if anyone has done work towards creating more powerful
iterators for python, or creating some more pythonic equivalent.

In particular, I was thinking that slices are almost equivalent to a
range of random access iterators, except that they create an
unnecessary copy. If there was a version of slice that defined a
mutable view over the original list, you'd have something equivalent
to t a pair of random access iterators.

For instance, if you wanted to reverse a segment of a list in place
you would be able to do:

list = [1,2,3,4]

#pretend this is some variant of slice that produces views instead
#obviously this couldn't use the same syntax, and might be a method instead.
view = list[1:]
view.reverse()

print list
[1,4,3,2]

This doesn't really handle forward and bidirectional iterators...
which I guess would be good for algorithms that operator over disk
base datastructures...

Anyone else had similar thoughts?

Brendan

Terry Reedy

unread,
Dec 16, 2009, 12:09:34 AM12/16/09
to pytho...@python.org
On 12/15/2009 10:39 PM, Brendan Miller wrote:
> I was trying to reimplement some of the c++ library of generic
> algorithms in c++ in python, but I was finding that this is
> problematic to do this in a generic way because there isn't any
> equivalent of c++'s forward iterators, random access iterators, etc.
> i.e. all python iterators are just input iterators that can't mutate
> the sequence they iterate over nor move backwards or by an arbitrary
> offset.
>
> I'm wondering if anyone has done work towards creating more powerful
> iterators for python, or creating some more pythonic equivalent.

For sequences, integer indexes let you do anything you want that the
container supports. No need for anything more, though you can wrap ints
if you must. The read-only iterator protocal is intentionally minimal. I

Brendan Miller

unread,
Dec 16, 2009, 2:48:04 AM12/16/09
to pytho...@python.org
On Tue, Dec 15, 2009 at 9:09 PM, Terry Reedy <tjr...@udel.edu> wrote:
> On 12/15/2009 10:39 PM, Brendan Miller wrote:
>> I'm wondering if anyone has done work towards creating more powerful
>> iterators for python, or creating some more pythonic equivalent.
>
> For sequences, integer indexes let you do anything you want that the
> container supports.

No, that's what I'm getting at... Most of the existing mutating
algorithms in python (sort, reverse) operate over entire collections,
not partial collections delimited by indexes... which would be really
awkward anyway.

Currently people slice and dice with well... slices, but those are
copying, so if you want to operate over part of a range you make a
copy, perform the operation, then copy the results back in.

I was thinking you'd want something like random access iterators in
c++, or pointers in c, to write typical in place algorithmic code. To
me, something like non-copying slices (maybe you'd call it a list
view?) would seem functionally similar and maybe more pythonic.

Bearophile

unread,
Dec 16, 2009, 5:54:03 AM12/16/09
to
Brendan Miller:

> Currently people slice and dice with well... slices, but those are
> copying, so if you want to operate over part of a range you make a
> copy, perform the operation, then copy the results back in.
>
> I was thinking you'd want something like random access iterators in
> c++, or pointers in c, to write typical in place algorithmic code. To
> me, something like non-copying slices (maybe you'd call it a list
> view?) would seem functionally similar and maybe more pythonic.

There are surely ways to modify the current situation, introducing
views, and other kind of iterators (C++ design in this regard can be
improved, see the last works of Andrei Alexandrescu about D).

This can lead to a certain increase of the performance of Python
programs, but it also surely makes writing programs harder and
introduces new bug sources.

Modern languages are now doing something kinda different from what you
ask for, introducing more immutable data (that in theory lead to more
copying, but in practice allow for more data sharing too, because
there's less need to actually copy data when the data is immutable),
see Clojure. An hypothetical Python language designed today probably
copies more stuff from Haskell and Clojure.

Python is kept intentionally simple, because it's designed to be
usable by people that are not just professional programmers, but for
example a biologist that needs to perform some computations and
doesn't want to use all the time learning how to use the language
itself (think about C++ again). So Python programmers live with a less
performing language. When performance of the code is not enough (even
with the future Unladen Swallow), they use another language (often to
write extensions used from Python code).

Bye,
bearophile

Steven D'Aprano

unread,
Dec 16, 2009, 6:39:07 AM12/16/09
to
On Tue, 15 Dec 2009 23:48:04 -0800, Brendan Miller wrote:

> On Tue, Dec 15, 2009 at 9:09 PM, Terry Reedy <tjr...@udel.edu> wrote:
>> On 12/15/2009 10:39 PM, Brendan Miller wrote:
>>> I'm wondering if anyone has done work towards creating more powerful
>>> iterators for python, or creating some more pythonic equivalent.
>>
>> For sequences, integer indexes let you do anything you want that the
>> container supports.
>
> No, that's what I'm getting at... Most of the existing mutating
> algorithms in python (sort, reverse) operate over entire collections,
> not partial collections delimited by indexes... which would be really
> awkward anyway.

I'm sympathetic to your request for list views. I've often wanted some
way to cleanly and neatly do this:

for item in seq[1:]:
process(item)

without making an unnecessary copy of almost all of seq.

None of the alternatives are very nice. They all obscure the simplicity
of the algorithm and add unnecessary effort:


for i, item in seq:
if i != 0:
process(item)


it = iter(seq)
try:
it.next() # throw the first value away
except StopIteration:
pass
for item in it:
process(item)


if not isinstance(seq, basestring):
raise TypeError("this only works on strings")
for item in buffer(seq, 1):
process(item)


Although that last one is the least horrible, particularly if you are
prepared to drop the explicit type-check.

> Currently people slice and dice with well... slices, but those are
> copying, so if you want to operate over part of a range you make a copy,
> perform the operation, then copy the results back in.
>
> I was thinking you'd want something like random access iterators in c++,
> or pointers in c, to write typical in place algorithmic code. To me,
> something like non-copying slices (maybe you'd call it a list view?)
> would seem functionally similar and maybe more pythonic.


Unless you have a really huge amount of data in your list, chances are
that the cost of making a temporary copy will be relatively small. After
all, under the hood you're just copying pointers (at least for CPython),
not the entire data object. So most of the time the extra complexity of
in-place algorithms is just premature optimization.

But not always.

--
Steven

Paul Rudin

unread,
Dec 16, 2009, 7:16:03 AM12/16/09
to
Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> writes:


> I'm sympathetic to your request for list views. I've often wanted some
> way to cleanly and neatly do this:
>
> for item in seq[1:]:
> process(item)
>
> without making an unnecessary copy of almost all of seq.
>

I don't know how it's implemented - but presumably itertools.islice
could provide what you're asking for?

Anh Hai Trinh

unread,
Dec 16, 2009, 7:58:58 AM12/16/09
to
On Dec 16, 10:39 am, Brendan Miller <catph...@catphive.net> wrote:
> I was trying to reimplement some of the c++ library of generic
> algorithms in c++ in python, but I was finding that this is
> problematic to do this in a generic way because there isn't any
> equivalent of c++'s forward iterators, random access iterators, etc.
> i.e. all python iterators are just input iterators that can't mutate
> the sequence they iterate over nor move backwards or by an arbitrary
> offset.

You might be interested in this library <http://pypi.python.org/pypi/
stream>.

You can easily create arbitrary "slice", for example

i = mylist >> takei(primes())

will return an iterator over the items of mylist with a prime number
index, given that primes() return an iterator over prime numbers.

I'm not familiar with forward/random-access/bidirectional iterators.

Peter Otten

unread,
Dec 16, 2009, 8:33:05 AM12/16/09
to
Paul Rudin wrote:

For skipping just a few items islice() is perfect, for big offsets its O(N)
behaviour may get annoying:

$ python -m timeit -s"from itertools import islice; N=10**5; r = range(N)"
"for i in islice(r, N, N): pass"
100 loops, best of 3: 1.93 msec per loop

$ python -m timeit -s"from itertools import islice; N=10**6; r = range(N)"
"for i in islice(r, N, N): pass"
10 loops, best of 3: 18.9 msec per loop

$ python -m timeit -s"from itertools import islice; N=10**7; r = range(N)"
"for i in islice(r, N, N): pass"
10 loops, best of 3: 188 msec per loop

islice() could be changed to special-case lists and tuples, but that feels a
bit unclean.

Carl Banks

unread,
Dec 16, 2009, 9:06:08 AM12/16/09
to
On Dec 15, 11:48 pm, Brendan Miller <catph...@catphive.net> wrote:
> I was thinking you'd want something like random access iterators in
> c++, or pointers in c, to write typical in place algorithmic code. To
> me, something like non-copying slices (maybe you'd call it a list
> view?) would seem functionally similar and maybe more pythonic.


My general answer to the question, "Has anyone done any work in Python
to get a certain feature that we have in C++?, is, "There's little
demand for many features of C++ because they are actually workarounds
for the language's poor expressibility". I'd say that's kind of the
case here.

C++ is poorly expressive (container operations are often an excercise
in line-noise-management), and values fine-tuned optimization. So it
makes sense to offer things like mutator views that reduce the number
of container operations needed, and avoids copying.

Python expresses container operations succinctly and clearly, and
doesn't value performance as much. Copying and replacing is good
enough most of the time, so there's less demand for things like
mutator views.


Carl Banks

Francesco Bochicchio

unread,
Dec 16, 2009, 12:44:47 PM12/16/09
to
On Dec 16, 1:58 pm, Anh Hai Trinh <anh.hai.tr...@gmail.com> wrote:

>
> You might be interested in this library <http://pypi.python.org/pypi/
> stream>.
>
> You can easily create arbitrary "slice", for example
>
>   i = mylist >> takei(primes())
>
> will return an iterator over the items of mylist with a prime number
> index, given that primes() return an iterator over prime numbers.
>


Nice :-)

I was starting to experiment data flow programming with python myself,
although I'm just playing with it..
I find the idea of data flow programming fascinatin, and wonder if it
can be considered a general-purpose program paradigm.

Ciao
-----
FB

Brendan Miller

unread,
Dec 16, 2009, 1:57:45 PM12/16/09
to Paul Rudin, pytho...@python.org
> --
> http://mail.python.org/mailman/listinfo/python-list
>

itertools.islice returns an iterator. My main point is that in python
iterators are weak and can't be used to write many types of
algorithms. They only go in one direction and they can't write to the
collection.

Another poster mentioned a stream library that is also iterator based.
Basically I'm saying that anything with iterators is pretty limited
because of the interface they present. See section 6.5 here for the
iterator protocol:

http://docs.python.org/library/stdtypes.html

Basically this is only useful for writing for loops or map/reduce
operations. However, python's primary datastructures, the dynamic
array (list) and hashtable (dictionary) are more powerful than that.

Anh Hai Trinh

unread,
Dec 16, 2009, 3:38:47 PM12/16/09
to
On Dec 16, 2:48 pm, Brendan Miller <catph...@catphive.net> wrote:

> No, that's what I'm getting at... Most of the existing mutating
> algorithms in python (sort, reverse) operate over entire collections,
> not partial collections delimited by indexes... which would be really
> awkward anyway.

Ok it can be done! The code is here: <http://gist.github.com/258134>.

>>> from listagent import listagent
>>> x = [22, 7, 2, -5, 8, 4]
>>> listagent(x)[1:].sort()
>>> x
[22, -5, 2, 4, 7, 8]
>>> listagent(x)[::2].reverse()
>>> x
[7, -5, 2, 4, 22, 8]

Basically the agent refers to the original list only by "address
translation", and indeed made no copy of anything whatever! I
implemented Shell sort but I suppose others are possible.

The implementation is incomplete, for now you cannot do slice
assignment, i.e.

a = listagent(x)[::-2]
a[1:] = [4, 2]

but it should be easy to implement, and I'm too sleepy.

Peace,
----aht

Brendan Miller

unread,
Dec 16, 2009, 6:16:57 PM12/16/09
to Anh Hai Trinh, pytho...@python.org

Very cool, that's more or less what I was thinking of.

I have a couple of thoughts:
1. Since [:] by convention already creates a copy, it might violate
people's expectations if that syntax were used.

2. I'd give the listagent the mutable sequence interface, but and then
make new algorithms (except sort and reverse which are required by the
mutable sequence) as functions that just expect an object that mutable
sequence compatible. That way the algorithm is decoupled from the
datastructure, and the same code can handle both lists and listagents,
or even potentially other datastructures that can expose the same
interface.

For instance, I was coding up some stl algorithms in python including
next_permutation. So, if you wanted to make a generic next_permutation
in python, you wouldn't want to make it a member of listagent, because
then regular lists couldn't use it conveniently. However, if it were
just a function that accepted anything that implemented mutable
sequence, it could operate both over lists and over listagent without
any conversion.

Anh Hai Trinh

unread,
Dec 17, 2009, 11:41:19 AM12/17/09
to
> I have a couple of thoughts:
> 1. Since [:] by convention already creates a copy, it might violate
> people's expectations if that syntax were used.

Indeed, listagent returns self on __getitem__[:]. What I meant was
this:

x = [0, 1, 2, 3, 4, 5, 6, 7]
a = listagent(x)[::2]
a[:] = listagent(x)[::-2]

And we get x = [7, 1, 5, 3, 3, 5, 1, 7], the copying happens in-place,
of course.


> 2. I'd give the listagent the mutable sequence interface

Done! I put the code in a repository here for those who might be
interested:
<http://github.com/aht/listagent.py>.

In retrospect, the Python gurus here was right though. Copy, modify
then replace is good enough, if not better since items are just
pointers instead of values. For reversing, you need to translate all
the indices (which cost exactly one addition and one multiplication
per index). Is that cheaper than copying all the pointers to a new
list? For sorting, you definitely need to construct a lookup table
since the sort algorithm needs to look over the indices multiple
times, which means you are using two pointer indirections per index.
Is that cheaper than just copying all the pointers to a new list? Even
if there is any benefit at all, you'll need to implement listagent in
C to squeeze it out.

However, using listagents is faster if you just want a few items out
of the slice. And it's cute.

Peace,
----aht

Brendan Miller

unread,
Dec 17, 2009, 3:07:59 PM12/17/09
to Anh Hai Trinh, pytho...@python.org

Well, it doesn't really need to be any slower than a normal list. You
only need to use index and do extra additions because it's in python.
However, if listagent were written in C, you would just have a pointer
into the contents of the original list, and the length, which is all
that list itself has.

I don't actually expect you to write that, I'm just pointing it out.

As for copying pointers not taking much time... that depends on how
long the list is. if you are working with small sets of data, you can
do almost anything and it will be efficient. However, if you have
megabytes or gigabytes of data (say you are working with images or
video), than the difference between an O(1) or an O(n) operation is a
big deal.

I agree though, it doesn't matter to everyone and anyone. The reason I
was interested was because i was trying to solve some specific
problems in an elegant way. I was thinking it would be cool to make
python more usable in programming competitions by giving it its own
port of the STL's algorithm library, which needs something along the
lines of C++'s more powerful iterators.

Steven D'Aprano

unread,
Dec 17, 2009, 9:44:08 PM12/17/09
to
On Thu, 17 Dec 2009 12:07:59 -0800, Brendan Miller wrote:

> I was thinking it would be cool to make python more usable in
> programming competitions by giving it its own port of the STL's
> algorithm library, which needs something along the lines of C++'s more
> powerful iterators.

For the benefit of those of us who aren't C++ programmers, what do its
iterators do that Python's don't?


--
Steven

Brendan Miller

unread,
Dec 18, 2009, 1:00:20 AM12/18/09
to Steven D'Aprano, pytho...@python.org

Python iterators basically only have one operation:

next(), which returns the next element or throws StopIteration.

In C++ terminology this is a Input iterator. It is good for writing
"for each" loops or map reduce operations.

An input iterator can't mutate the data it points to.

C++ also has progressively stronger iterators:
http://www.sgi.com/tech/stl/Iterators.html

InputIterator <- read only, one direction, single pass
ForwardIterator <- read/write, one direction, multi pass
BidirectionalIterator <- read/write, can move in either direction
RandomAccessIterator <- read/write, can move in either direction by an
arbitrary amount in constant time (as powerful as a pointer)

Each only adds extra operations over the one before. So a
RandomAccessIterator can be used anywhere a InputIterator can, but not
vice versa.

Also, this is a duck typing relationship, not a formal class
inheritance. Anything that quacks like a RandomAccessIterator is a
RandomAccessIterator, but there is no actual RandomAccessIterator
class.

So, for instance stl sort function takes pair of random access
iterator delimiting a range, and can sort any datastructure that can
provide that powerful of an iterator (arrays, vectors, deques).

http://www.sgi.com/tech/stl/sort.html

MyCollection stuff;
// put some stuff in stuff

sort(stuff.begin(), stuff.end());

Where begin() and end() by convention return iterators pointing to the
beginning and end of the sequence.

Anh Hai Trinh

unread,
Dec 18, 2009, 12:03:21 PM12/18/09
to
On Dec 18, 3:07 am, Brendan Miller <catph...@catphive.net> wrote:

> Well, it doesn't really need to be any slower than a normal list. You
> only need to use index and do extra additions because it's in python.
> However, if listagent were written in C, you would just have a pointer
> into the contents of the original list, and the length, which is all
> that list itself has.

You're right, I was thinking in Python instead of C. So the
translations in that case is really straight forward, using just
pointer arithmetic.

----aht

Carl Banks

unread,
Dec 18, 2009, 1:39:15 PM12/18/09
to
On Dec 17, 10:00 pm, Brendan Miller <catph...@catphive.net> wrote:
> On Thu, Dec 17, 2009 at 6:44 PM, Steven D'Aprano
>
> <st...@remove-this-cybersource.com.au> wrote:
> > On Thu, 17 Dec 2009 12:07:59 -0800, Brendan Miller wrote:
>
> >> I was thinking it would be cool to make python more usable in
> >> programming competitions by giving it its own port of the STL's
> >> algorithm library, which needs something along the lines of C++'s more
> >> powerful iterators.
>
> > For the benefit of those of us who aren't C++ programmers, what do its
> > iterators do that Python's don't?
>
> Python iterators basically only have one operation:
>
> next(), which returns the next element or throws StopIteration.
>
> In C++ terminology this is a Input iterator. It is good for writing
> "for each" loops or map reduce operations.

Hmm. I guess the main thing that's bothering me about this whole
thread is that the true power of Python iterators is being overlooked
here, and I don't think you're being fair to call Python iterators
"weak" (as you did in another thread) or say that they are only good
for for-else type loops.

The fact is, Python iterators have a whole range of powers that C++
iterators do not. C++ iterators, at least the ones that come in STL,
are limited to iterating over pre-existing data structures. Python
iterators don't have this limation--the data being "iterated" over can
be virtual data like an infinite list, or data generated on the fly.
This can be very powerful.

Here's a cool slideshow on what can be done with iterators in Python
(generators specifically):

http://www.dabeaz.com/generators/

It is true that Python iterators can't be used to mutate the
underlying structure--if there is actual underlying data structure--
but it doesn't mean they are weak or limited. Python and C++
iterators are similar in their most basic usage, but grow more
powerful in different directions.

Carl Banks

Alf P. Steinbach

unread,
Dec 18, 2009, 2:08:27 PM12/18/09
to
* Carl Banks:

It's good that Python iterators can do things.

However, it's not the case that C++ iterators can't do those things.

C++ iterators very much routinely do such things.

However, C++ iterators are flawed in a way that Python iterators are not.

You might say, in an analogy with control structures, that this flaw gives C++
iterators the power of "goto" but also with all the negative baggage...

I'm too lazy to Google, but you might search for Alexandrescu and "ranges", and
possibly throw in "iterators" among the search terms.


Cheers & hth.,

- Alf

Lie Ryan

unread,
Dec 18, 2009, 2:20:41 PM12/18/09
to
On 12/18/2009 7:07 AM, Brendan Miller wrote:
> As for copying pointers not taking much time... that depends on how
> long the list is. if you are working with small sets of data, you can
> do almost anything and it will be efficient. However, if you have
> megabytes or gigabytes of data (say you are working with images or
> video), than the difference between an O(1) or an O(n) operation is a
> big deal.

A 1-million member list takes ~130 msec, 10 million 1.3s. In many cases
the algorithm would easily dwarf the copying itself.

Carl Banks

unread,
Dec 18, 2009, 2:32:52 PM12/18/09
to

STL Iterators? No. Maybe if you're talking about boost or homemade
iterators or something I could buy it, though I'd still bs suspicious
of "routinely".


> However, C++ iterators are flawed in a way that Python iterators are not.
>
> You might say, in an analogy with control structures, that this flaw gives C++
> iterators the power of "goto" but also with all the negative baggage...

That too.


> I'm too lazy to Google, but you might search for Alexandrescu and "ranges", and
> possibly throw in "iterators" among the search terms.


Carl Banks

Alf P. Steinbach

unread,
Dec 18, 2009, 3:18:04 PM12/18/09
to

The intersection of STL and the C++ standard library is of nearly the same size
as the STL, but I don't understand why you're limiting yourself to the STL.

After all, the STL was first developed for the Ada language, IIRC.

Using the standard C++ library, per the 1998 first (and current!) C++ standard,
below is an idiomatic "copy Carl Banks to output" C++ program where, you might
note, the "ostream_iterator" has a sister "istream_iterator" that definitely
does not only iterat over a "pre-existing data structure", as you wrote.

Of course, mostly one defines iterators in the source code or by using Boost or
similar libraries; that's how C++ works.

There's no special language support for iterators in C++; it's wholly a library
and application defined concept.


<code>
#include <iostream>
#include <iterator>
#include <vector>
#include <algorithm>

int main()
{
using namespace std;
static char const* const words[] =
{ "Carl", "Banks", "is", "wrong", "on", "the", "Internet", "!" };

copy(
words, words + sizeof(words)/sizeof(*words),
ostream_iterator<char const*>( cout, "\n" )
);
}
</code>

Brendan Miller

unread,
Dec 18, 2009, 4:10:28 PM12/18/09
to Carl Banks, pytho...@python.org

When I said they are "weak" I meant it in sense that the algorithms
writeable against an InputerIterator interface (which is what python's
iterator protocol provides) is a proper subset of the algorithms that
can be written against a RandomAccessIterator interface. The class of
algorithms expressible against a python iterator is indeed limited to
those that can be expressed with a for each loop or map/reduce
operation.

Yes, python iterators can indeed iterate over infinite lists. All that
you need to do is have the next() operation never throw. The same
thing can be done in c++, or any other language that has iterators.

Generators provide a convenient way to write input iterators; however,
the same thing can be done in any language. It just requires more
boilerplate code to keep track of the iteration state.

Of course, that's not to say generators aren't important. They make it
that much easier to write your own iterators, so in a practical sense,
people are more likely to write their own iterators in python than in
C++.

Carl Banks

unread,
Dec 18, 2009, 4:35:01 PM12/18/09
to
On Dec 18, 12:18 pm, "Alf P. Steinbach" <al...@start.no> wrote:
[lots of tangential information snipped]
> ...but I don't understand why you're limiting yourself to the STL.

Because I believe the vast majority of iterators used in C++ in
practice are the ones provided by STL types, therefore I felt a
statement about the STL iterators reflected the commonly used power
(if not the ultimate capability) of iterators in C++. I suppose it
was a bit hasty.


> ... "ostream_iterator" ...

Well that's a good example.

I retract my statement that C++ iterators in general not having this
power, but I still doubt that C++ iterators are used like this much.
In most code I've seen people use STL or similar data structures, and
then use the iterators for the particular data type, and/or pointers.


Carl Banks

Terry Reedy

unread,
Dec 18, 2009, 4:45:38 PM12/18/09
to pytho...@python.org
On 12/18/2009 1:00 AM, Brendan Miller wrote:

>> For the benefit of those of us who aren't C++ programmers, what do its
>> iterators do that Python's don't?

It depends on what one means by 'iterator'. Python iterators do not fit
in the STL hierarchy. On the other hand, Python indexes are a form of
random access iterator, the top of the hierarchy.

> Python iterators basically only have one operation:

> next(), which returns the next element or throws StopIteration.
>
> In C++ terminology this is a Input iterator. It is good for writing
> "for each" loops or map reduce operations.
>

> An input iterator can't mutate the data it points to.

For that, Python uses indexes. Random access 'iterators' are a
generalization of sequence indexes.

> C++ also has progressively stronger iterators:
> http://www.sgi.com/tech/stl/Iterators.html
>
> InputIterator<- read only, one direction, single pass
> ForwardIterator<- read/write, one direction, multi pass
> BidirectionalIterator<- read/write, can move in either direction
> RandomAccessIterator<- read/write, can move in either direction by an
> arbitrary amount in constant time (as powerful as a pointer)

An index can be incremented or decremented an arbitrary amount. Note
that Python indexes are object pointers, not memory byte pointers, so
that index + 1 points to the next object, not the next byte of memory in
possibly the same object. They are, however, much safer than pointers in
that x[i] can only retrieve objects of x and never, surprise, surprise,
of some other collection.

> Each only adds extra operations over the one before. So a
> RandomAccessIterator can be used anywhere a InputIterator can, but not
> vice versa.
>
> Also, this is a duck typing relationship, not a formal class
> inheritance. Anything that quacks like a RandomAccessIterator is a
> RandomAccessIterator, but there is no actual RandomAccessIterator
> class.
>
> So, for instance stl sort function takes pair of random access
> iterator delimiting a range, and can sort any datastructure that can
> provide that powerful of an iterator (arrays, vectors, deques).

Python, traditionally, only came with one mutable builtin sequence type,
so the sort function was made a method of that type. There is also the
importable array module. Apparently, there has been little demand for a
generic array.sort method as there is none. One could easily write a
function that took an array or other seqeunce and two indexes as args.

The generic reversed() builtin function by default uses sequence indexes
in the obvious manner to return a reversed version of *any* sequence,
mutable or not.

> http://www.sgi.com/tech/stl/sort.html
>
> MyCollection stuff;
> // put some stuff in stuff
>
> sort(stuff.begin(), stuff.end());
>
> Where begin() and end() by convention return iterators pointing to the
> beginning and end of the sequence.

start,stop = 0, len(seq)

Terry Jan Reedy


Bearophile

unread,
Dec 18, 2009, 5:47:43 PM12/18/09
to
Brendan Miller:

> I agree though, it doesn't matter to everyone and anyone. The reason I
> was interested was because i was trying to solve some specific
> problems in an elegant way. I was thinking it would be cool to make
> python more usable in programming competitions by giving it its own
> port of the STL's algorithm library, which needs something along the
> lines of C++'s more powerful iterators.

It seems you have missed my post, so here it is, more explicitly:

http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/2009/05/08/iterators-must-go.pdf

Bye,
bearophie

Steven D'Aprano

unread,
Dec 18, 2009, 8:12:18 PM12/18/09
to
On Fri, 18 Dec 2009 10:39:15 -0800, Carl Banks wrote:

> It is true that Python iterators can't be used to mutate the underlying
> structure--if there is actual underlying data structure--

An iterator is a protocol. So long as you have __iter__ and next (or
__next__ in Python 3) methods, your class is an iterator. What you do in
those methods are up to you. If you want to mutate the underlying data,
you're free to do so.

Just for chuckles, here's a quick-and-dirty iterator version of the Tower
of Hanoi that mutates itself as needed. (I make no promise that it is
either efficient or bug-free.)


class Hanoi:
def __init__(self, n=5):
self.a = range(n, 0, -1)
self.b = []
self.c = []
self.num = 0
def __str__(self):
template = "Needle %d: %s"
d = [template % (i, data) for (i,data) in
enumerate([self.a, self.b, self.c])]
return '\n'.join(d) + '\n'
def __iter__(self):
return self
def next(self):
if [] == self.a == self.c:
raise StopIteration
self.num += 1
if self.num % 2:
# Odd-numbered move: move the smallest disk one
# position clockwise.
needle = self.find_smallest()
nxt = "bca"["abc".find(needle)]
else:
# Even-numbered move: make the only legal move of the
# next-smallest disk.
needle, nxt = self.find_legal()
self.move(needle, nxt)
return str(self)
def move(self, frm, to):
"""Move the top disk from needle `frm` to needle `to`."""
from_needle = getattr(self, frm)
to_needle = getattr(self, to)
to_needle.append(from_needle[-1])
del from_needle[-1]
def find_smallest(self):
for needle in 'abc':
if 1 in getattr(self, needle):
return needle
def find_legal(self):
"""Find the only legal move not using the smallest disk."""
ignore = self.find_smallest()
if ignore == 'a': disks = 'bc'
elif ignore == 'b': disks = 'ac'
else: disks = 'ab'
left_needle = getattr(self, disks[0])
right_needle = getattr(self, disks[1])
if left_needle == []:
return disks[1], disks[0]
elif right_needle == []:
return disks[0], disks[1]
elif left_needle[-1] < right_needle[-1]:
return disks[0], disks[1]
else:
return disks[1], disks[0]


And in use:


>>> p = Hanoi()
>>> for state in p:
... print state
...
Needle 0: [5, 4, 3, 2]
Needle 1: [1]
Needle 2: []

Needle 0: [5, 4, 3]
Needle 1: [1]
Needle 2: [2]

[ rest of output skipped for brevity ]


Python iterators don't generally mutate the thing they are iterating
over, not because they can't, but because it's generally a bad idea to do
so. In general, it really only makes sense as an optimization in
situations where you have so much data that you can't afford to make a
copy of it. For the occasional time where it does make sense, Python
iterators are perfectly capable of doing so.

--
Steven

Gregory Ewing

unread,
Dec 19, 2009, 12:10:17 AM12/19/09
to
Terry Reedy wrote:
> On the other hand, Python indexes are a form of
> random access iterator, the top of the hierarchy.

The term "random access iterator" seems oxymoronic to
me. Iteration is all about operating on things in
sequence. If you're accessing elements arbitrarily,
then you're not iterating.

> Python, traditionally, only came with one mutable builtin sequence type,
> so the sort function was made a method of that type.

Also it probably operates rather more efficiently
than a generic one would, as it doesn't have to go
through a general mechanism to access elements, but
can take advantage of its knowledge of the internal
structure of a list.

--
Greg

Brendan Miller

unread,
Dec 19, 2009, 12:51:46 AM12/19/09
to Bearophile, pytho...@python.org
> --
> http://mail.python.org/mailman/listinfo/python-list
>

Andrei is arguing for replacing iterators with ranges, which are
equivalently powerful to C++ iterators but easier to use. Actually,
what I want in python is something like this.

If you look at Anh Hai Trinh's posts he implemented something
basically like a python version of andre's ranges, which he called
listagents. That pdf is interesting though because he's thought
through what an interface for bidrectional ranges should look like
which I had not yet.

However, you should note that andrei's ranges allow mutation of the
original datastructure they are a range over. My impression from your
earlier post was that you disagreed with that idea of mutating
algorithms and wanted something more functional, whereas I, and andrei
in that pdf, are more concerned with imperative programming and in
place algorithms.

I don't want to get into a big discussion about FP vs imperative
programming, as that is simply too large a topic and I have had that
discussion many times before. I'm more of a turing machine than a
lambda calculus guy myself, but if other people make other choices
that's fine with me.

Lie Ryan

unread,
Dec 19, 2009, 1:57:42 AM12/19/09
to

Terry Reedy

unread,
Dec 19, 2009, 3:15:06 AM12/19/09
to pytho...@python.org

I presume there is no array sort because it is O(n) to copy array to
list and back again with tolist and fromlist methods.

Anyone needed space-saving of in place in array can write array sort
with generic quicksort or whatever is appropriate to peculiarities of
specific data.

tjr

Anh Hai Trinh

unread,
Dec 19, 2009, 12:04:25 PM12/19/09
to
On Dec 19, 5:47 am, Bearophile <bearophileH...@lycos.com> wrote:

> It seems you have missed my post, so here it is, more explicitly:
>

> http://www.boostcon.com/site-media/var/sphene/sphwiki/attachment/2009...


Interestingly, my `listagent` can be used as a lazy iterator and thus
using itertools we can compose them just like Andrei's `range`.

the stage:

x = range(0, 20, 2); x
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

y = range(10, 20); y
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

z = [996, 758, 670, 236, 898, 337, 442, 452, 490, 547]

from listagent import listagent
import itertools

chain:

sorted(itertools.chain(listagent(x)[::2], listagent(y)[-1:1:-2]))
[0, 4, 8, 12, 13, 15, 16, 17, 19]

zip:

sorted(itertools.izip(listagent(z)[1::3], listagent(x)[2::3]))
[(452, 16), (758, 4), (898, 10)]

stride: slicing an agent returns another one, those lazy agents!

python -m timeit -s'from listagent import listagent' 'list(listagent
(range(1000000))[1:-1:5][::7][10000:-10000:42])'
10 loops, best of 3: 55.5 msec per loop

python -m timeit 'range(1000000)[1:-1:5][::7][10000:-10000:42]'
10 loops, best of 3: 280 msec per loop

radial: not implemented


`listagent` is implemented in pure Python, of course. If implemented
in C, there is surprisingly little difference between a slicing
`agent` like this, or a listiterator, since both will have to keep a
pointer to an memory index of in the original list: the listiterator
advances this pointer when we call next(), whereas the agent keeps a
pointer to the starting element of the slice and return basically
(PyObject *) *(p+(i*step)) on __getitem__[i]. Mutability is a just
matter of exposing this pointer to Python code in a nice way.

----aht

Rhodri James

unread,
Dec 19, 2009, 9:01:22 PM12/19/09
to
On Fri, 18 Dec 2009 21:10:28 -0000, Brendan Miller <catp...@catphive.net>
wrote:

> When I said they are "weak" I meant it in sense that the algorithms
> writeable against an InputerIterator interface (which is what python's
> iterator protocol provides) is a proper subset of the algorithms that
> can be written against a RandomAccessIterator interface. The class of
> algorithms expressible against a python iterator is indeed limited to
> those that can be expressed with a for each loop or map/reduce
> operation.

I think the most relevant thing that can be said to this is that you need
to stop trying to write Python code as if it was C++. The languages
constructs are intended for different purposes; iterators in C++ are part
of the small arsenal of devices needed to provide type-independence to
algorithms, something that comes a lot more easily to Python through
dynamic typing and easy slicing. I'm barely literate in C++, but I can't
see uses for a C++ random access iterator that aren't served in Python by
simple indexing (possibly with additional bounds checking) or a Python
iterator (possibly over a sliced sequence).

The downside, of course, is that you don't have the security provided by
static type checking. You pays your money, you takes your choice;
preferably, however, you don't do like a friend of mine and try to write
FORTRAN in whatever language you're using!

--
Rhodri James *-* Wildebeeste Herder to the Masses

Anh Hai Trinh

unread,
Dec 20, 2009, 1:18:43 AM12/20/09
to
On Dec 20, 12:04 am, Anh Hai Trinh <anh.hai.tr...@gmail.com> wrote:

> chain:
>
>   sorted(itertools.chain(listagent(x)[::2], listagent(y)[-1:1:-2]))
>   [0, 4, 8, 12, 13, 15, 16, 17, 19]
>
> zip:
>
>   sorted(itertools.izip(listagent(z)[1::3], listagent(x)[2::3]))
>   [(452, 16), (758, 4), (898, 10)]

I think I mis-interpret Andrei's slides. I think what he meant to sort
a chain of slices is such that to move the smaller elements into the
first-given slices in-place, thus moving items around whatever lists
underlying those slices.

And for zip, I think he meant sort the first slice, but moving in-
place the items referred by the others in lock-step.

Reply all
Reply to author
Forward
0 new messages