Iterator class to allow self-restarting generator expressions?

John O'Hagan

Mar 1, 2009, 10:20:28 AM
to pytho...@python.org
Inspired by some recent threads here about using classes to extend the
behaviour of iterators, I'm trying to replace some top-level functions
aimed at doing such things with a class.

So far it's got a test for emptiness, a non-consuming peek-ahead method, and
an extended next() which can return slices as well as the normal mode, but
one thing I'm having a little trouble with is getting generator expressions
to restart when exhausted. This code works for generator functions:

class Regen(object):
    """Optionally restart generator functions"""
    def __init__(self, generator, options=None, restart=False):
        self.gen = generator
        self.options = options
        self.gen_call = generator(options)
        self.restart = restart

    def __iter__(self):
        return self

    def next(self):
        try:
            return self.gen_call.next()
        except StopIteration:
            if self.restart:
                # build a fresh generator from the stored factory and restart
                self.gen_call = self.gen(self.options)
                return self.gen_call.next()
            else:
                raise

used like this:

def gen():
    for i in range(3):
        yield i

reg = Regen(gen, restart=True)

I'd like to do the same for generator expressions, something like:

genexp = (i for i in range(3))

regenexp = Regen(genexp, restart=True)

such that regenexp would behave like reg, i.e. restart when exhausted (and
would only raise StopIteration if it's actually empty). However, because
generator expressions aren't callable, the above approach won't work.
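
That is, you can't call a generator expression the way Regen calls a
generator function; roughly:

genexp = (i for i in range(3))
genexp()   # TypeError: 'generator' object is not callable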

I suppose I could convert expressions to functions like:

def gen():
    genexp = (i for i in range(3))
    for j in genexp:
        yield j

but that seems tautological.

Any clues or comments appreciated.

John

Mark Tolonen

Mar 1, 2009, 11:39:46 AM
to pytho...@python.org

"John O'Hagan" <rese...@johnohagan.com> wrote in message
news:200903011520....@johnohagan.com...

> Inspired by some recent threads here about using classes to extend the
> behaviour of iterators, I'm trying to replace some top-level
> functions
> aimed at doing such things with a class.
>
> So far it's got a test for emptiness, a non-consuming peek-ahead method,
> and
> an extended next() which can return slices as well as the normal mode, but
> one thing I'm having a little trouble with is getting generator
> expressions
> to restart when exhausted. This code works for generator functions:

[snip code]

The Python help shows the Python-equivalent code (or go to the source) for
things like itertools.islice and itertools.cycle, which sound like what you
are re-implementing. It looks like, to handle generators, cycle saves the
items in another list as they are generated, then uses that list to produce
successive iterations.
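
For reference, the itertools docs give a pure-Python equivalent of cycle
along these lines (a sketch of the idea, not the actual C implementation):

def cycle(iterable):
    # yield items from the iterable, saving them as they are produced...
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    # ...then replay the saved list indefinitely
    while saved:
        for element in saved:
            yield element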

-Mark


Gabriel Genellina

Mar 1, 2009, 11:54:21 AM
to pytho...@python.org
On Sun, 01 Mar 2009 13:20:28 -0200, John O'Hagan <rese...@johnohagan.com>
wrote:

> Inspired by some recent threads here about using classes to extend the
> behaviour of iterators, I'm trying to replace some top-level
> functions
> aimed at doing such things with a class.
>
> So far it's got a test for emptiness, a non-consuming peek-ahead method,
> and
> an extended next() which can return slices as well as the normal mode,
> but
> one thing I'm having a little trouble with is getting generator
> expressions
> to restart when exhausted. This code works for generator functions:

[...]

> I'd like to do the same for generator expressions, something like:
>
> genexp = (i for i in range(3))
>
> regenexp = Regen(genexp, restart=True)
>
> such that regenexp would behave like reg, i.e. restart when exhausted
> (and
> would only raise StopIteration if it's actually empty). However because
> generator expressions aren't callable, the above approach won't work.

I'm afraid you can't do that. There is no way of "cloning" a generator:

py> g = (i for i in [1,2,3])
py> type(g)()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: cannot create 'generator' instances
py> g.gi_code = code
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: readonly attribute
py> import copy
py> copy.copy(g)
Traceback (most recent call last):
...
TypeError: object.__new__(generator) is not safe, use generator.__new__()
py> type(g).__new__
<built-in method __new__ of type object at 0x1E1CA560>

You can do that with a generator function because it acts as a "generator
factory", building a new generator when called. Even using the Python C
API, to create a generator one needs a frame object -- and there is no way
to create a frame object "on the fly" that I know of :(

py> import ctypes
py> PyGen_New = ctypes.pythonapi.PyGen_New
py> PyGen_New.argtypes = [ctypes.py_object]
py> PyGen_New.restype = ctypes.py_object
py> g = (i for i in [1,2,3])
py> g2 = PyGen_New(g.gi_frame)
py> g2.gi_code is g.gi_code
True
py> g2.gi_frame is g.gi_frame
True
py> g.next()
1
py> g2.next()
2

g and g2 share the same execution frame, so they're not independent. There
is no easy way to create a new frame in Python:

py> type(g.gi_frame)()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: cannot create 'frame' instances

One could try using PyFrame_New -- but that's way too magic for my taste...

--
Gabriel Genellina

Chris Rebert

Mar 1, 2009, 12:51:07 PM
to Gabriel Genellina, pytho...@python.org
On Sun, Mar 1, 2009 at 8:54 AM, Gabriel Genellina
<gags...@yahoo.com.ar> wrote:
> On Sun, 01 Mar 2009 13:20:28 -0200, John O'Hagan <rese...@johnohagan.com>
> wrote:
>
>> Inspired by some recent threads here about using classes to extend the
>> behaviour of iterators, I'm trying to replace some top-level
>> functions
>> aimed at doing such things with a class.
>>
>> So far it's got a test for emptiness, a non-consuming peek-ahead method,
>> and
>> an extended next() which can return slices as well as the normal mode, but
>> one thing I'm having a little trouble with is getting generator
>> expressions
>> to restart when exhausted. This code works for generator functions:
>
> [...]
>
>> I'd like to do the same for generator expressions, something like:
>>
>> genexp = (i for i in range(3))
>>
>> regenexp = Regen(genexp, restart=True)
>>
>> such that regenexp would behave like reg, i.e. restart when exhausted (and
>> would only raise StopIteration if it's actually empty). However because
>> generator expressions aren't callable, the above approach won't work.
>
> I'm afraid you can't do that. There is no way of "cloning" a generator:

Really? What about itertools.tee()? Sounds like it'd do the job,
albeit with some caveats.
http://docs.python.org/library/itertools.html#itertools.tee
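
Something along these lines (untested sketch):

import itertools

genexp = (i for i in range(3))
first, second = itertools.tee(genexp)
print list(first)    # [0, 1, 2]
print list(second)   # [0, 1, 2] -- replayed from tee's internal buffer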

Cheers,
Chris

--
Follow the path of the Iguana...
http://rebertia.com

Gabriel Genellina

Mar 1, 2009, 1:08:04 PM
to pytho...@python.org
On Sun, 01 Mar 2009 15:51:07 -0200, Chris Rebert <cl...@rebertia.com>
wrote:

> On Sun, Mar 1, 2009 at 8:54 AM, Gabriel Genellina
> <gags...@yahoo.com.ar> wrote:
>> On Sun, 01 Mar 2009 13:20:28 -0200, John O'Hagan
>> <rese...@johnohagan.com>
>> wrote:
>>
>>> Inspired by some recent threads here about using classes to extend the
>>> behaviour of iterators, I'm trying to replace some top-level
>>> functions
>>> aimed at doing such things with a class.

>> I'm afraid you can't do that. There is no way of "cloning" a generator:


>
> Really? What about itertools.tee()? Sounds like it'd do the job,
> albeit with some caveats.
> http://docs.python.org/library/itertools.html#itertools.tee

It doesn't clone the generator; it just stores the generated objects in a
temporary array to be re-yielded later.
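
The pure-Python equivalent in the docs makes that explicit -- roughly (from
memory, so treat it as a sketch):

import collections

def tee(iterable, n=2):
    it = iter(iterable)
    deques = [collections.deque() for i in range(n)]
    def gen(mydeque):
        while True:
            if not mydeque:            # when the local queue is empty...
                newval = it.next()     # ...fetch a new value and
                for d in deques:       # load it into every queue
                    d.append(newval)
            yield mydeque.popleft()
    return tuple(gen(d) for d in deques)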

--
Gabriel Genellina

Terry Reedy

Mar 1, 2009, 5:21:15 PM
to pytho...@python.org
John O'Hagan wrote:
> Inspired by some recent threads here about using classes to extend the
> behaviour of iterators, I'm trying to replace some top-level functions
> aimed at doing such things with a class.
>
> So far it's got a test for emptiness, a non-consuming peek-ahead method, and
> an extended next() which can return slices as well as the normal mode, but
> one thing I'm having a little trouble with is getting generator expressions
> to restart when exhausted. This code works for generator functions:
>
> class Regen(object):
> """Optionally restart generator functions"""
> def __init__(self, generator, options=None, restart=False):
> self.gen = generator

Your 'generator' parameter is actually a generator function -- a
function that creates a generator when called.

> self.options = options

Common practice would use 'args' instead of 'options'.

> self.gen_call = generator(options)

If the callable takes multiple args, you want '*options' (or *args)
instead of 'options'.

That aside, your 'gen_call' attribute is actually a generator -- a
special type of iterator (an uncallable object with a next() method;
__next__ in 3.0).

It is worthwhile keeping the nomenclature straight. As you discovered,
generator expressions create generators, not generator functions. Other
than being given the default .__name__ attribute '<genexpr>', there is
nothing special about their result. So I would not try to treat them
specially. Initializing a Regen instance with *any* generator (or other
iterator) will fail.

On the other hand, your Regen instances could be initialized with *any*
callable that produces iterators, including iterator classes. So you
might as well call the parameters iter_func and iterator.

In general, for all iterators and not just generators, reiteration
requires a new iterator, obtained either by duplicating the original or by
saving the values in a list and iterating over that.
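
A minimal sketch of such a factory-based wrapper (hypothetical names,
untested):

class Reiter(object):
    """Restart by calling the iterator factory again when exhausted."""
    def __init__(self, iter_func, *args):
        self.iter_func = iter_func
        self.args = args
        self.iterator = iter(iter_func(*args))

    def __iter__(self):
        return self

    def next(self):
        try:
            return self.iterator.next()
        except StopIteration:
            # ask the factory for a fresh iterator and start over
            self.iterator = iter(self.iter_func(*self.args))
            return self.iterator.next()

# Any iterator factory works: a generator function, a lambda wrapping a
# generator expression, or a callable like iter() or xrange():
r1 = Reiter(lambda: (i * i for i in range(3)))
r2 = Reiter(iter, [1, 2, 3])
r3 = Reiter(xrange, 5)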

Terry Jan Reedy

John O'Hagan

Mar 1, 2009, 10:33:14 PM
to pytho...@python.org
On Sun, 1 Mar 2009, Mark Tolonen wrote:
> "John O'Hagan" <rese...@johnohagan.com> wrote in message
> news:200903011520....@johnohagan.com...
>
> > Inspired by some recent threads here about using classes to extend the
> > behaviour of iterators, I'm trying to replace some top-level
> > functions
> > aimed at doing such things with a class.
> >
> > So far it's got a test for emptiness, a non-consuming peek-ahead method,
> > and
> > an extended next() which can return slices as well as the normal mode,
> > but one thing I'm having a little trouble with is getting generator
> > expressions
> > to restart when exhausted. This code works for generator functions:
>
> [snip code]
>
> The Python help shows the Python-equivalent code (or go to the source) for
> things like itertools.islice and itertools.cycle, which sound like what
> you are re-implementing. It looks like, to handle generators, cycle saves
> the items in another list as they are generated, then uses that list to
> produce successive iterations.
>

Thanks for your reply Mark; I've looked at the itertools docs (again, this
time I understood more of it!), but because the generators in question
produce arbitrarily many results (which I should have mentioned), it would
not always be practical to hold them all in memory.

So I've used a "buffer" instance attribute in my iterator class, which only
holds as many items as are required by the peek(), next() and __nonzero__()
methods, in order to minimize memory use (come to think of it, I should add a
clear() method as well...).

But the islice() function looks very useful and could replace some code in my
generator functions, as could some of the ingenious recipes at the end of the
itertools chapter. It's always good to go back to the docs!
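
For example, islice can take a bounded slice from a generator that never
ends, roughly:

import itertools

squares = (i * i for i in itertools.count())   # arbitrarily many results
print list(itertools.islice(squares, 5))       # [0, 1, 4, 9, 16]
print list(itertools.islice(squares, 3))       # [25, 36, 49] -- picks up where it left off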

As for restarting the iterators, it seems from other replies that I must use
generator function calls rather than expressions in order to do that.

Thanks,

John

Lie Ryan

Mar 2, 2009, 3:48:02 AM
to pytho...@python.org
Gabriel Genellina wrote:
> On Sun, 01 Mar 2009 15:51:07 -0200, Chris Rebert <cl...@rebertia.com>
> wrote:
>> On Sun, Mar 1, 2009 at 8:54 AM, Gabriel Genellina
>> <gags...@yahoo.com.ar> wrote:
>>> On Sun, 01 Mar 2009 13:20:28 -0200, John O'Hagan
>>> <rese...@johnohagan.com>
>>> wrote:
>>>
>>>> Inspired by some recent threads here about using classes to extend the
>>>> behaviour of iterators, I'm trying to replace some top-level
>>>> functions
>>>> aimed at doing such things with a class.
>
>>> I'm afraid you can't do that. There is no way of "cloning" a generator:
>>
>> Really? What about itertools.tee()? Sounds like it'd do the job,
>> albeit with some caveats.
>> http://docs.python.org/library/itertools.html#itertools.tee
>
> It doesn't clone the generator, it just stores the generated objects in
> a temporary array to be re-yielded later.
>

How about creating something like itertools.tee() that will save and
dump items as necessary? The "new tee" (let's call it tea) would return
several generators that all refer to a common "tea" object. The
common tea object would keep track of which items have been collected by
each generator and generate new items as necessary. If an item has
already been collected by all generators, that item is dumped.

Somewhat like this: # untested

class Tea(object):
    def __init__(self, iterable, nusers):
        self.iterable = iterable
        self.cache = {}
        self.nusers = nusers

    def next(self, n):
        try:
            item, nusers = self.cache[n]
            self.cache[n] = (item, nusers - 1)
        except KeyError:  # the item hasn't been generated yet
            item = self.iterable.next()
            self.cache[n] = (item, self.nusers - 1)
        else:
            if nusers - 1 == 0:  # every client has now seen this item
                del self.cache[n]
        return item

class TeaClient(object):
    def __init__(self, tea):
        self.n = 0
        self.tea = tea
    def next(self):
        self.n += 1
        return self.tea.next(self.n)

def tea(iterable, nusers):
    teaobj = Tea(iterable, nusers)
    return [TeaClient(teaobj) for _ in range(nusers)]

Gabriel Genellina

Mar 2, 2009, 6:43:31 AM
to pytho...@python.org

That's exactly what itertools.tee does! Or am I missing something?

--
Gabriel Genellina

John O'Hagan

Mar 2, 2009, 11:01:50 AM
to pytho...@python.org
On Sun, 1 Mar 2009, Terry Reedy wrote:
> John O'Hagan wrote:
> > Inspired by some recent threads here about using classes to extend the
> > behaviour of iterators, I'm trying to replace some top-level
> > functions aimed at doing such things with a class.
> >
> > So far it's got a test for emptiness, a non-consuming peek-ahead method,
> > and an extended next() which can return slices as well as the normal
> > mode, but one thing I'm having a little trouble with is getting generator
> > expressions to restart when exhausted. This code works for generator
> > functions:
> >

Thanks to all who replied for helping to clear up my various confusions on
this subject. For now I'm content to formulate my iterators as generator
function calls, and I'll study the various approaches offered here. Here's
my attempt at a class that does what I want:

class Exgen(object):
    """works for generator functions"""
    def __init__(self, iter_func, restart=False, *args):
        self.iter_func = iter_func
        self.args = args
        self.iterator = iter_func(*args)
        self.restart = restart
        self._buffer = []
        self._buff()

    def __iter__(self):
        return self

    def __nonzero__(self):
        if self._buffer:
            return True
        return False

    def _buff(self, stop=1):
        """Store items in a list as required"""
        for _ in range(stop - len(self._buffer)):
            try:
                self._buffer.append(self.iterator.next())
            except StopIteration:
                if self.restart:
                    self.iterator = self.iter_func(*self.args)
                    self._buffer.append(self.iterator.next())
                else:
                    break

    def peek(self, start=0, stop=1):
        """See a slice of what's coming up"""
        self._buff(stop)
        return self._buffer[start:stop]

    def next(self, start=0, stop=1):
        """Consume a slice"""
        self._buff(stop)
        if self._buffer:
            result = self._buffer[start:stop]
            self._buffer = self._buffer[:start] + self._buffer[stop:]
            return result
        else:
            raise StopIteration

    def clear(self):
        """Empty the buffer"""
        self._buffer = []
        self._buff()
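
Used something like this (an untested example with made-up arguments):

def squares(n):
    for i in range(n):
        yield i * i

ex = Exgen(squares, True, 4)   # restart=True, args=(4,)
print bool(ex)                 # True -- the buffer is non-empty
print ex.peek(0, 3)            # [0, 1, 4] -- look ahead without consuming
print ex.next(0, 2)            # [0, 1] -- consume a slice
print ex.next(0, 2)            # [4, 9]
print ex.next(0, 2)            # [0, 1] -- exhausted, so it restarted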

Regards,

John
