
Python 3000 idea -- + on iterables -> itertools.chain


John Reese

Nov 12, 2006, 7:46:26 PM
It seems like it would be clear and mostly backwards compatible if the
+ operator on any iterables created a new iterable that iterated
through first its left operand and then its right, in the style of
itertools.chain. This would allow summation of generator expressions,
among other things, to have the obvious meaning.

Any thoughts? Has this been discussed before? I didn't see it
mentioned in PEP 3100.

The exception to the compatibility argument is of course those
iterables for which + is already defined, like tuples and lists, for
that set of code that assumes that the result is of that same type,
explicitly or implicitly by calling len or indexing or what have you.
In those cases, you could call tuple or list on the result. There are
any number of other things in Python 3000 switching from lists to
one-at-a-time iterators, like dict.items(), so presumably this form of
incompatibility isn't a showstopper.
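
As a rough illustration of the proposal, here is what the chaining looks like today when spelled with itertools.chain, written for current Python 3. The variable names are made up for the example; the only library call used is itertools.chain.

```python
from itertools import chain

# Two generator expressions; neither supports "+" today.
evens = (n for n in range(0, 6, 2))
odds = (n for n in range(1, 6, 2))

plus_error = None
try:
    evens + odds
except TypeError as exc:
    plus_error = str(exc)

# chain() walks its first argument to exhaustion, then the second.
combined = list(chain(evens, odds))
print(combined)  # [0, 2, 4, 1, 3, 5]

# For types that already define "+", the proposal would change the result
# type; wrapping the result in tuple() or list() restores the old one.
as_tuple = tuple(chain((1, 2), (3, 4)))
print(as_tuple)  # (1, 2, 3, 4)
```

Under the proposal, `evens + odds` would presumably behave like `chain(evens, odds)` instead of raising TypeError.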

Fredrik Lundh

Nov 12, 2006, 8:17:24 PM
to pytho...@python.org
John Reese wrote:

> It seems like it would be clear and mostly backwards compatible if the
> + operator on any iterables created a new iterable that iterated
> through first its left operand and then its right, in the style of
> itertools.chain.

you do know that "iterable" is an informal interface, right? to what
class would you add this operation?

</F>

George Sakkis

Nov 12, 2006, 9:34:26 PM
Fredrik Lundh wrote:

The base object class would be one candidate, similarly to the way
__nonzero__ is defined to use __len__, or __contains__ to use __iter__.
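
The two fallbacks mentioned here can be seen with a small class that defines neither a truth-test method nor __contains__. This is a sketch for current Python 3, where __nonzero__ is spelled __bool__; the class name Bag is invented for the example.

```python
class Bag:
    """Defines only __len__ and __iter__; no __bool__ or __contains__."""

    def __init__(self, *items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


empty, full = Bag(), Bag(1, 2, 3)

# Truth testing falls back to __len__ when __bool__ is not defined.
print(bool(empty), bool(full))  # False True

# "in" falls back to iterating via __iter__ and comparing each item.
print(2 in full, 9 in full)  # True False
```

Note that neither fallback is a method inherited from object; the interpreter applies them when it dispatches the operation, which is one reason "put it on object" is not an exact parallel.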

Alternatively, iter() could be a wrapper type (or perhaps mixin)
instead of a function, something like:

from itertools import chain, tee, islice

import __builtin__
_builtin_iter = __builtin__.iter

class iter(object):

    def __init__(self, iterable):
        self._it = _builtin_iter(iterable)

    def __iter__(self):
        return self
    def next(self):
        return self._it.next()

    def __getitem__(self, index):
        if isinstance(index, int):
            try: return islice(self._it, index, index+1).next()
            except StopIteration:
                raise IndexError('Index %d out of range' % index)
        else:
            start,stop,step = index.start, index.stop, index.step
            if start is None: start = 0
            if step is None: step = 1
            return islice(self._it, start, stop, step)

    def __add__(self, other):
        return chain(self._it, other)
    def __radd__(self,other):
        return chain(other, self._it)

    def __mul__(self, num):
        return chain(*tee(self._it,num))

    __rmul__ = __mul__

__builtin__.iter = iter


if __name__ == '__main__':
    def irange(*args):
        return iter(xrange(*args))

    assert list(irange(5)[:3]) == range(5)[:3]
    assert list(irange(5)[3:]) == range(5)[3:]
    assert list(irange(5)[1:3]) == range(5)[1:3]
    assert list(irange(5)[3:1]) == range(5)[3:1]
    assert list(irange(5)[:]) == range(5)[:]
    assert irange(5)[3] == range(5)[3]

    s = range(5) + range(7,9)
    assert list(irange(5) + irange(7,9)) == s
    assert list(irange(5) + range(7,9)) == s
    assert list(range(5) + irange(7,9)) == s

    s = range(5) * 3
    assert list(irange(5) * 3) == s
    assert list(3 * irange(5)) == s
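
The code above targets Python 2 (__builtin__, xrange, .next()). For readers on Python 3, a rough port might look like the sketch below. It is not part of the original post: the class is named Iter here (a made-up name) and it does not replace the builtin iter.

```python
from itertools import chain, islice, tee


class Iter:
    """Hypothetical Python 3 port of the wrapper above (not a builtin)."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __getitem__(self, index):
        if isinstance(index, int):
            try:
                return next(islice(self._it, index, index + 1))
            except StopIteration:
                raise IndexError('Index %d out of range' % index) from None
        # Slice case: islice accepts None for start, stop, and step.
        return islice(self._it, index.start, index.stop, index.step)

    def __add__(self, other):
        return chain(self._it, other)

    def __radd__(self, other):
        return chain(other, self._it)

    def __mul__(self, num):
        # tee() makes num independent iterators over the same items.
        return chain(*tee(self._it, num))

    __rmul__ = __mul__


def irange(*args):
    return Iter(range(*args))


print(list(irange(5) + irange(7, 9)))  # [0, 1, 2, 3, 4, 7, 8]
print(list(3 * irange(2)))             # [0, 1, 0, 1, 0, 1]
```

One limitation carried over from the original: the operators return plain itertools objects rather than Iter instances, so the result of `a + b` no longer supports indexing or `*`.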


George

Fredrik Lundh

Nov 13, 2006, 2:26:59 AM
to pytho...@python.org
George Sakkis wrote:

> The base object class would be one candidate, similarly to the way
> __nonzero__ is defined to use __len__, or __contains__ to use __iter__.
>
> Alternatively, iter() could be a wrapper type (or perhaps mixin)
> instead of a function, something like:

so you're proposing to either make *all* objects respond to "+", or
introduce limited *iterator* algebra.

not sure how that matches the OP's wish for "mostly backwards
compatible" support for *iterable* algebra, really...

(iirc, GvR has shot down a few earlier "let's provide sugar for
itertools" proposals. no time to dig up the links right now, but it's in
the python-dev archives, somewhere...)

</F>

Georg Brandl

Nov 13, 2006, 3:07:01 AM
George Sakkis wrote:
> Fredrik Lundh wrote:
>
>> John Reese wrote:
>>
>> > It seems like it would be clear and mostly backwards compatible if the
>> > + operator on any iterables created a new iterable that iterated
>> > through first its left operand and then its right, in the style of
>> > itertools.chain.
>>
>> you do know that "iterable" is an informal interface, right? to what
>> class would you add this operation?
>>
>> </F>
>
> The base object class would be one candidate, similarly to the way
> __nonzero__ is defined to use __len__, or __contains__ to use __iter__.

What has a better chance of success in my eyes is an extension to yield
all items from an iterable without using an explicit for loop: instead of

    for item in iterable:
        yield item

you could write

    yield from iterable

or

    yield *iterable

etc.
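
For context, the first spelling was later added to the language: "yield from" is available from Python 3.3 onward (PEP 380). A minimal sketch of how it expresses chaining, with chain_all as a made-up helper name:

```python
def chain_all(*iterables):
    """Yield every item of each iterable in turn, like itertools.chain."""
    for iterable in iterables:
        # Yields the same items as: for item in iterable: yield item
        yield from iterable


print(list(chain_all(range(3), "ab", (x * x for x in [4, 5]))))
# [0, 1, 2, 'a', 'b', 16, 25]
```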

Georg

George Sakkis

Nov 13, 2006, 3:59:40 AM
Fredrik Lundh wrote:

> George Sakkis wrote:
>
> > The base object class would be one candidate, similarly to the way
> > __nonzero__ is defined to use __len__, or __contains__ to use __iter__.
> >
> > Alternatively, iter() could be a wrapper type (or perhaps mixin)
> > instead of a function, something like:
>
> so you're proposing to either make *all* objects respond to "+", or
> introduce limited *iterator* algebra.

If by 'respond to "+"' is implied that you can get a "TypeError:
iterable argument required", as you get now for attempting "x in y" for
non-iterable y, why not ? Although I like the iterator algebra idea
better.

> not sure how that matches the OP's wish for "mostly backwards
> compatible" support for *iterable* algebra, really...

Given the subject of the thread, backwards compatibility is not the
main prerequisite. Besides, it's an *extension* idea; allow operations
that were not allowed before, not the other way around or modifying
existing semantics. Of course, programs that attempt forbidden
expressions on purpose so that they can catch and handle the exception
would break when suddenly no exception is raised, but I doubt there are
many of those...

George

Carl Banks

Nov 13, 2006, 5:59:57 AM
Georg Brandl wrote:
> What has a better chance of success in my eyes is an extension to yield
> all items from an iterable without using an explicit for loop: instead of
>
>     for item in iterable:
>         yield item
>
> you could write
>
>     yield from iterable
>
> or
>
>     yield *iterable

Since this is nothing but an alternate way to spell a very specific
(and not-too-common) for loop, I expect this has zero chance of
success.


Carl Banks

Carl Banks

Nov 13, 2006, 6:38:29 AM
George Sakkis wrote:
> Fredrik Lundh wrote:
>
> > George Sakkis wrote:
> >
> > > The base object class would be one candidate, similarly to the way
> > > __nonzero__ is defined to use __len__, or __contains__ to use __iter__.
> > >
> > > Alternatively, iter() could be a wrapper type (or perhaps mixin)
> > > instead of a function, something like:
> >
> > so you're proposing to either make *all* objects respond to "+", or
> > introduce limited *iterator* algebra.
>
> If by 'respond to "+"' is implied that you can get a "TypeError:
> iterable argument required", as you get now for attempting "x in y" for
> non-iterable y, why not ?

Bad idea on many, many levels. Don't go there.


> Although I like the iterator algebra idea
> better.
>
> > not sure how that matches the OP's wish for "mostly backwards
> > compatible" support for *iterable* algebra, really...
>
> Given the subject of the thread, backwards compatibility is not the
> main prerequisite. Besides, it's an *extension* idea; allow operations
> that were not allowed before, not the other way around or modifying
> existing semantics.

You missed the important word (in spite of Fredrik's emphasis):
iterable. Your iter class solution only works for *iterators* (and not
even all iterators); the OP wanted it to work for any *iterable*.
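
The distinction being drawn here can be checked directly. The sketch below uses the abstract base classes in collections.abc from current Python 3, which did not exist when this thread was written; the variable names are illustrative only.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]     # an iterable: defines __iter__ but not __next__
cursor = iter(numbers)  # an iterator: defines both

print(isinstance(numbers, Iterable), isinstance(numbers, Iterator))  # True False
print(isinstance(cursor, Iterable), isinstance(cursor, Iterator))    # True True

# An iterable can be walked repeatedly; an iterator is consumed once.
print(list(numbers), list(numbers))  # [1, 2, 3] [1, 2, 3]
print(list(cursor), list(cursor))    # [1, 2, 3] []
```

A "+" defined only on an iterator wrapper therefore covers `cursor` but not `numbers` itself, which is the gap being pointed out.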

"Iterator" and "iterable" are protocols. The only way to implement
what the OP wanted is to change iterable protocol, which means changing
the documentation to say that iterable objects must implement __add__
and that it must chain the iterables, and updating all iterable types
to do this. Besides the large amount of work that this will need,
there are other problems.

1. It increases the burden on third party iterable developers.
Protocols should be kept as simple as possible for this reason.
2. Many iterable types already implement __add__ (list, tuple, string),
so this new requirement would complicate these guys a lot.

> Of course, programs that attempt forbidden
> expressions on purpose so that they can catch and handle the exception
> would break when suddenly no exception is raised, but I doubt there are
> many of those...

3. While not breaking backwards compatibility in the strictest sense,
the adverse effect on incorrect code shouldn't be brushed aside. It
would be a bad thing if this incorrect code:

a = ["hello"]
b = "world"
a+b

suddenly started failing silently instead of raising an exception.
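
To make the concern concrete, here is the same mistake run both ways in current Python 3: with today's "+", and with itertools.chain standing in for the proposed semantics (that substitution is an assumption about how the proposal would behave).

```python
from itertools import chain

a = ["hello"]
b = "world"

# Today the mistake is caught immediately.
try:
    a + b
    raised = False
except TypeError:
    raised = True
print(raised)  # True

# Chaining semantics would accept it and iterate the string by character.
print(list(chain(a, b)))  # ['hello', 'w', 'o', 'r', 'l', 'd']
```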


Carl Banks

George Sakkis

Nov 13, 2006, 10:00:16 AM
Carl Banks wrote:

> George Sakkis wrote:
> > Fredrik Lundh wrote:
> >
> > > George Sakkis wrote:
> > >
> > > > The base object class would be one candidate, similarly to the way
> > > > __nonzero__ is defined to use __len__, or __contains__ to use __iter__.
> > > >
> > > > Alternatively, iter() could be a wrapper type (or perhaps mixin)
> > > > instead of a function, something like:
> > >
> > > so you're proposing to either make *all* objects respond to "+", or
> > > introduce limited *iterator* algebra.
> >
> > If by 'respond to "+"' is implied that you can get a "TypeError:
> > iterable argument required", as you get now for attempting "x in y" for
> > non-iterable y, why not ?
>
> Bad idea on many, many levels. Don't go there.

Do you also find the way "in" works today a bad idea ?

> > Although I like the iterator algebra idea
> > better.
> >
> > > not sure how that matches the OP's wish for "mostly backwards
> > > compatible" support for *iterable* algebra, really...
> >
> > Given the subject of the thread, backwards compatibility is not the
> > main prerequisite. Besides, it's an *extension* idea; allow operations
> > that were not allowed before, not the other way around or modifying
> > existing semantics.
>
> You missed the important word (in spite of Fredrik's emphasis):
> iterable. Your iter class solution only works for *iterators* (and not
> even all iterators); the OP wanted it to work for any *iterable*.

I didn't miss the important word, I know the distinction between
iterables and iterators; that's why I said I like the iterator algebra
idea better (compared to extending the object class, which would
effectively create an iterable algebra).

> "Iterator" and "iterable" are protocols. The only way to implement
> what the OP wanted is to change iterable protocol, which means changing
> the documentation to say that iterable objects must implement __add__
> and that it must chain the iterables, and updating all iterable types
> to do this. Besides the large amount of work that this will need,
> there are other problems.
>
> 1. It increases the burden on third party iterable developers.
> Protocols should be kept as simple as possible for this reason.
> 2. Many iterable types already implement __add__ (list, tuple, string),
> so this new requirement would complicate these guys a lot.

If __add__ was ever to be part of the *iterable* protocol, it would be
silly to implement it for every new iterable type; the implementation
would always be the same (i.e. chain(self,other)), so it should be put
in a base class all iterables extend from. That would be either a
mixin class, or object. This is parallel to how __contains__ is part of
the sequence protocol, but if you (the 3rd party sequence developer)
don't define one, a default __contains__ that relies on __getitem__ is
created for you.
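
The mixin variant described here might be sketched as follows in current Python 3. ChainableMixin and Countdown are hypothetical names invented for this example; nothing like them exists in the standard library.

```python
from itertools import chain


class ChainableMixin:
    """Hypothetical mixin: gives any iterable subclass a chaining "+"."""

    def __add__(self, other):
        return chain(self, other)

    def __radd__(self, other):
        return chain(other, self)


class Countdown(ChainableMixin):
    """Example third-party iterable; it defines only __iter__."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


print(list(Countdown(3) + Countdown(2)))  # [3, 2, 1, 2, 1]
print(list(Countdown(2) + "ab"))          # [2, 1, 'a', 'b']
print(list([0] + Countdown(2)))           # [0, 2, 1]
```

The iterable author writes __iter__ and inherits the rest, but existing types such as list and str would keep their own "+", so mixed expressions still depend on which operand's method runs first.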

> > Of course, programs that attempt forbidden
> > expressions on purpose so that they can catch and handle the exception
> > would break when suddenly no exception is raised, but I doubt there are
> > many of those...
>
> 3. While not breaking backwards compatibility in the strictest sense,
> the adverse effect on incorrect code shouldn't be brushed aside. It
> would be a bad thing if this incorrect code:
>
> a = ["hello"]
> b = "world"
> a+b
>
> suddenly started failing silently instead of raising an exception.

That's a good example for why I prefer an iterator rather than an
iterable algebra; the latter is too implicit as "a + b" doesn't call
only __add__, but __iter__ as well. On the other hand, with a concrete
iterator type "iter(a) + iter(b)" is not any more error-prone than
'int(3) + int("2")' or 'str(3) + str("2")'.

What's the objection to an *iterator* base type and the algebra it
introduces explicitly ?

George

Georg Brandl

Nov 13, 2006, 11:30:54 AM

well, it could also be optimized internally, i.e. with a new opcode.

Georg

Carl Banks

Nov 13, 2006, 7:22:03 PM

George Sakkis wrote:
> Carl Banks wrote:
> > George Sakkis wrote:
> > > If by 'respond to "+"' is implied that you can get a "TypeError:
> > > iterable argument required", as you get now for attempting "x in y" for
> > > non-iterable y, why not ?
> >
> > Bad idea on many, many levels. Don't go there.
>
> Do you also find the way "in" works today a bad idea ?

Augh. I don't like it much, but (assuming that there are good use
cases for testing containment in iterables that don't define
__contains__) it seems to be the best way to accomplish it for
iterables in general. However, "in" isn't even comparable to "add"
here.

First of all, unlike "add", the nature of "in" more or less requires
that the second operand is some kind of collection, so surprises are
kept to a minimum. Second, testing containment is just a bit more
important, and thus deserving of a special case, than chaining
iterables.

The problem is taking a very general, already highly overloaded
operator +, and adding a special case to the interpreter for one of the
least common uses. It's just a bad idea.


> > 3. While not breaking backwards compatibility in the strictest sense,
> > the adverse effect on incorrect code shouldn't be brushed aside. It
> > would be a bad thing if this incorrect code:
> >
> > a = ["hello"]
> > b = "world"
> > a+b
> >
> > suddenly started failing silently instead of raising an exception.
>
> That's a good example for why I prefer an iterator rather than an
> iterable algebra; the latter is too implicit as "a + b" doesn't call
> only __add__, but __iter__ as well. On the other hand, with a concrete
> iterator type "iter(a) + iter(b)" is not any more error-prone than
> 'int(3) + int("2")' or 'str(3) + str("2")'.
>
> What's the objection to an *iterator* base type and the algebra it
> introduces explicitly ?

Well, it still makes it more work to implement iterator protocol, which
is enough reason to make me -1 on it. Anyways, I don't think it's very
useful to have it for iterators because most people write functions for
iterables. You'd have to write "iter(a)+iter(b)" to chain two
iterables, which pretty much undoes the main convenience of the +
operator (i.e., brevity). But it isn't dangerous.


Carl Banks
