Distinguishing active generators from exhausted ones

Michal Kwiatkowski

Jul 25, 2009, 2:30:54 PM
Hi,

Is there a way to tell if a generator has been exhausted using pure
Python code? I've looked at the CPython sources and it seems that
something like an "active"/"exhausted" attribute on genobject is missing
from the API. For the time being I am using a simple C extension to
look at the f_stacktop pointer of the generator frame, which seems to
differentiate active generators from exhausted ones. See
http://bazaar.launchpad.net/~ruby/pythoscope/support-python2.3/annotate/286/pythoscope/_util.c#L16
for the complete source code.

I may be missing something obvious here. Is there a better way to tell
if a given generator object is still active or not?

Cheers,
mk

Jason Tackaberry

Jul 25, 2009, 4:00:02 PM
to pytho...@python.org
On Sat, 2009-07-25 at 11:30 -0700, Michal Kwiatkowski wrote:
> Is there a way to tell if a generator has been exhausted using pure
> Python code? I've looked at CPython sources and it seems that

Upon a cursory look, after a generator 'gen' is exhausted (meaning
gen.next() has raised StopIteration), it seems that gen.gi_frame will be
None.

Cheers,
Jason.
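
A minimal sketch of the check Jason describes, written for a current
Python 3 interpreter (an assumption; there gen.next() is spelled
next(gen)). The helper name is_exhausted and the countdown generator are
made up for illustration, and the check relies on gi_frame becoming None,
which the thread establishes only for Python 2.5 and later:

```python
def is_exhausted(gen):
    """Return True once the generator has finished and dropped its frame."""
    return gen.gi_frame is None


def countdown(n):
    # Illustrative generator; any generator function would do.
    while n > 0:
        yield n
        n -= 1


gen = countdown(2)
states = [is_exhausted(gen)]      # checked before the generator has run

for _ in gen:                     # run the generator to completion
    pass

states.append(is_exhausted(gen))  # checked after StopIteration was raised
```

The check never resumes the generator, so it has no side effects of its
own.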

Michal Kwiatkowski

Jul 25, 2009, 4:20:11 PM

Only in Python 2.5 or higher though. I need to support Python 2.3 and
2.4 as well, sorry for not making that clear in the original post.

Cheers,
mk

Ben Finney

Jul 25, 2009, 7:10:40 PM
Michal Kwiatkowski <consta...@gmail.com> writes:

> I may be missing something obvious here. Is there a better way to tell
> if a given generator object is still active or not?

    foo = the_generator_object
    try:
        do_interesting_thing_that_needs(foo.next())
    except StopIteration:
        generator_is_exhausted()

In other words, don't LBYL, because it's EAFP. Whatever you need to do
that requires the next item from the generator, do that; you'll get a
specific exception if the generator is exhausted.

--
\ “Courteous and efficient self-service.” —café, southern France |
`\ |
_o__) |
Ben Finney

Hendrik van Rooyen

Jul 26, 2009, 3:48:49 AM
to pytho...@python.org
On Saturday 25 July 2009 20:30:54 Michal Kwiatkowski wrote:
> Hi,
>
> Is there a way to tell if a generator has been exhausted using pure
> Python code? I've looked at CPython sources and it seems that
> something like "active"/"exhausted" attribute on genobject is missing
> from the API. For the time being I am using a simple C extension to
> look at f_stacktop pointer of the generator frame, which seems to
> differentiate active generators from exhausted ones. See
> http://bazaar.launchpad.net/~ruby/pythoscope/support-python2.3/annotate/286/pythoscope/_util.c#L16
> for complete source code.
>
> I may be missing something obvious here. Is there a better way to tell
> if a given generator object is still active or not?

Is there a reason why you cannot just call the next method and handle
the StopIteration when it happens?

- Hendrik

Michal Kwiatkowski

Jul 26, 2009, 4:45:21 AM
On Jul 26, 1:10 am, Ben Finney <ben+pyt...@benfinney.id.au> wrote:

> Michal Kwiatkowski <constant.b...@gmail.com> writes:
> > I may be missing something obvious here. Is there a better way to tell
> > if a given generator object is still active or not?
>
>     foo = the_generator_object
>     try:
>         do_interesting_thing_that_needs(foo.next())
>     except StopIteration:
>         generator_is_exhausted()
>
> In other words, don't LBYL, because it's EAFP. Whatever you need to do
> that requires the next item from the generator, do that; you'll get a
> specific exception if the generator is exhausted.

The thing is I don't need the next item. I need to know if the
generator has stopped without invoking it. Why - you may ask. Well,
the answer needs some explaining.

I'm working on the Pythoscope project (http://pythoscope.org) and I
use tracing mechanisms of CPython (sys.settrace) to capture function
calls and values passed to and from them. Now, the problem with
generators is that when they are ending (i.e. returning instead of
yielding) they return None, which is in fact indistinguishable from
"yield None". That means I can't tell if the last captured None was in
fact yielded or is a bogus value which should be rejected. Let me show
you with an example.

    import sys

    def trace(frame, event, arg):
        if event != 'line':
            print frame, event, arg
        return trace

    def gen1():
        yield 1
        yield None

    def gen2():
        yield 1

    sys.settrace(trace)
    print "gen1"
    g1 = gen1()
    g1.next()
    g1.next()
    print "gen2"
    g2 = gen2()
    [x for x in g2]
    sys.settrace(None)

The first generator isn't finished; it yielded 1 and None. The second one
is exhausted after yielding a single value (1). The problem is that
under Python 2.4 or 2.3 both invocations will generate the same trace
output. So, to know whether the last None was actually a yielded value,
I need to know if a generator is active or not.

Your solution, while it gives me an answer, is not acceptable because
generators can cause side effects (imagine a call to launch_rockets()
before the next yield statement ;).

Cheers,
mk
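
The ambiguity described in this post can be reproduced on a current
Python 3 interpreter with a sketch along these lines. It is not the
Pythoscope code: the events dictionary and the filtering by function name
are illustrative assumptions, and next(g) replaces the g.next() spelling
used in the thread. It records only the argument of each 'return' trace
event seen inside the two generators:

```python
import sys

# Values passed to the trace function on 'return' events, per generator.
events = {"gen1": [], "gen2": []}


def trace(frame, event, arg):
    name = frame.f_code.co_name
    if name in events and event == "return":
        events[name].append(arg)
    return trace


def gen1():
    yield 1
    yield None


def gen2():
    yield 1


sys.settrace(trace)
try:
    g1 = gen1()
    next(g1)
    next(g1)            # gen1 is now suspended at "yield None"
    for _ in gen2():    # gen2 runs to completion
        pass
finally:
    sys.settrace(None)
```

Both lists should end with None, once from a real "yield None" and once
from the generator returning, which is why the trace output alone cannot
tell the two cases apart.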

Aahz

Jul 26, 2009, 7:56:59 PM
In article <2a408da6-af57-45d0...@s15g2000yqs.googlegroups.com>,
Michal Kwiatkowski <consta...@gmail.com> wrote:

Are you sure? It appears to work in Python 2.4; I don't have time to
check 2.3.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer

Terry Reedy

Jul 26, 2009, 8:10:00 PM
to pytho...@python.org
Michal Kwiatkowski wrote:

> The thing is I don't need the next item. I need to know if the
> generator has stopped without invoking it.

Write a one-ahead iterator class, which I have posted before, that sets
.exhausted to True when next fails.

tjr
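
Terry's class is not reproduced in this thread, so the following is only
a guess at what a one-ahead wrapper could look like, in Python 3 syntax.
The name OneAhead and everything except the .exhausted attribute are
hypothetical:

```python
class OneAhead:
    """Iterator wrapper that fetches one item ahead of the consumer.

    .exhausted becomes True as soon as the wrapped iterator has raised
    StopIteration, i.e. before the consumer asks for a missing item.
    """

    _MISSING = object()   # marks "no buffered item"

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.exhausted = False
        self._advance()

    def _advance(self):
        # Pull the next item now so exhaustion is known in advance.
        try:
            self._buffered = next(self._it)
        except StopIteration:
            self._buffered = self._MISSING
            self.exhausted = True

    def __iter__(self):
        return self

    def __next__(self):
        if self.exhausted:
            raise StopIteration
        item = self._buffered
        self._advance()
        return item
```

Note that the wrapper resumes the underlying generator one step earlier
than a plain loop would, so it only suits generators whose side effects
are not timing-sensitive.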

greg

Jul 26, 2009, 9:30:41 PM
Michal Kwiatkowski wrote:

> The first generator isn't finished, it yielded 1 and None. Second one
> is exhausted after yielding a single value (1). The problem is that,
> under Python 2.4 or 2.3 both invocations will generate the same trace
> output.

This seems to be a deficiency in the trace mechanism.
There really ought to be a 'yield' event to distinguish
yields from returns.

You could put in a feature request on python-dev
concerning this.

--
Greg

Steven D'Aprano

Jul 26, 2009, 11:18:53 PM


And hope that the generator doesn't have side-effects...


--
Steven

Terry Reedy

Jul 27, 2009, 2:02:19 AM
to pytho...@python.org

If run to exhaustion, the same number of side-effects happen.
The only difference is that they each happen one step sooner.
For reading a file that is irrelevant. Much else, and the iterator is
not just an iterator.

tjr


Steven D'Aprano

Jul 27, 2009, 2:22:54 AM

I believe the OP specifically said he needs to detect whether or not an
iterator is exhausted, without running it to exhaustion, so you shouldn't
assume that the generator has been exhausted.

When it comes to side-effects, timing matters. For example, a generator
which cleans up after it has run (deleting temporary files, closing
sockets, etc.) will leave the environment in a different state if run to
exhaustion than just before exhaustion. Even if you store the final
result in a one-ahead class, you haven't saved the environment, and that
may be significant.

(Of course, it's possible that it isn't significant. Not all differences
make a difference.)

The best advice is, try to avoid side-effects, especially in generators.


--
Steven
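
The timing point can be illustrated with a small Python 3 sketch. The
reader generator and the log list are invented for the example; the
"cleanup" entry merely stands in for deleting temporary files or closing
sockets:

```python
log = []


def reader():
    yield "last item"
    log.append("cleanup")   # stand-in for real cleanup work


# Plain consumption: the item is processed before the cleanup runs.
for item in reader():
    log.append("processed " + item)
plain_order = list(log)

# One-ahead consumption: looking ahead resumes the generator early,
# so the cleanup runs before the current item has been processed.
log.clear()
it = reader()
current = next(it)
sentinel = object()
upcoming = next(it, sentinel)   # the look-ahead step
log.append("processed " + current)
lookahead_order = list(log)
```

The same two things happen in both runs; only their order differs, which
matters whenever processing the last item depends on the resources the
cleanup releases.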

Michal Kwiatkowski

Jul 27, 2009, 9:46:01 AM
On Jul 27, 1:56 am, a...@pythoncraft.com (Aahz) wrote:
> >> Upon a cursory look, after a generator 'gen' is exhausted (meaning
> >> gen.next() has raised StopIteration), it seems that gen.gi_frame will be
> >> None.
>
> >Only in Python 2.5 or higher though. I need to support Python 2.3 and
> >2.4 as well, sorry for not making that clear in the original post.
>
> Are you sure? It appears to work in Python 2.4; I don't have time to
> check 2.3.

No, it does not work in Python 2.4. gi_frame can be None only in
Python 2.5 and higher.

Via "What’s New in Python 2.5" (http://docs.python.org/whatsnew/2.5.html):

"""
Another even more esoteric effect of this change: previously, the
gi_frame attribute of a generator was always a frame object. It’s now
possible for gi_frame to be None once the generator has been
exhausted.
"""

Cheers,
mk

Aahz

Jul 27, 2009, 10:41:45 AM
In article <1c8ae01e-2e9c-497c...@g31g2000yqc.googlegroups.com>,

Michal Kwiatkowski <consta...@gmail.com> wrote:
>On Jul 27, 1:56 am, a...@pythoncraft.com (Aahz) wrote:
>>>> Upon a cursory look, after a generator 'gen' is exhausted (meaning
>>>> gen.next() has raised StopIteration), it seems that gen.gi_frame will be
>>>> None.
>>>
>>>Only in Python 2.5 or higher though. I need to support Python 2.3 and
>>>2.4 as well, sorry for not making that clear in the original post.
>>
>> Are you sure? It appears to work in Python 2.4; I don't have time to
>> check 2.3.
>
>No, it does not work in Python 2.4. gi_frame can be None only in
>Python 2.5 and higher.

You're right, I guess I must have made a boo-boo when I was switching
versions.

Terry Reedy

Jul 27, 2009, 4:47:34 PM
to pytho...@python.org
Steven D'Aprano wrote:
> On Mon, 27 Jul 2009 02:02:19 -0400, Terry Reedy wrote:
>
>> Steven D'Aprano wrote:
>>> On Sun, 26 Jul 2009 20:10:00 -0400, Terry Reedy wrote:
>>>
>>>> Michal Kwiatkowski wrote:
>>>>
>>>>> The thing is I don't need the next item. I need to know if the
>>>>> generator has stopped without invoking it.
>>>> Write a one-ahead iterator class, which I have posted before, that
>>>> sets .exhausted to True when next fails.
>>>
>>> And hope that the generator doesn't have side-effects...
>> If run to exhaustion, the same number of side-effects happen. The only
>> difference is that they each happen one step sooner. For
>> reading a file that is irrelevant. Much else, and the iterator is not
>> just an iterator.
>
> I believe the OP specifically said he needs to detect whether or not an
> iterator is exhausted, without running it to exhaustion, so you shouldn't
> assume that the generator has been exhausted.

I believe the OP said he needs to determine whether or not an iterator
(specifically generator) is exhausted without consuming an item when it
is not. That is slightly different. The wrapper I suggested makes that
easy. I am obviously not assuming exhaustion when there is a .exhausted
True/False flag to check.

There are two possible definitions of 'exhausted': 1) will raise
StopIteration on the next next() call; 2) has raised StopIteration at
least once. The wrapper converts 2) to 1), which is to say, it obeys
definition 1 once the underlying iteration has obeyed definition 2.

Since it is trivial to set 'exhausted=True' in the generator user code
once StopIteration has been raised (meaning 2), I presume the OP wants
the predictive meaning 1).

Without an iterator class wrapper, I see no way to predict what a
generator will do (especially, raise StopIteration the first time)
without analyzing its code and local variable state.

I said in response to YOU that once exhaustion has occurred, then the
same number of side effects would have occurred.

> When it comes to side-effects, timing matters.

Sometimes. And I admitted that possibility (slightly garbled).

> For example, a generator
> which cleans up after it has run (deleting temporary files, closing
> sockets, etc.) will leave the environment in a different state if run to
> exhaustion than just before exhaustion. Even if you store the final
> result in a one-ahead class, you haven't saved the environment, and that
> may be significant.

Of course, an eager-beaver generator written to be a good citizen might
well close resources as soon as it knows *they* are exhausted, long
before *it* yields the last items from the in-memory last block read.
For all I know, file.readlines could do such.

Assuming that is not the case, the cleanup will not happen until
what turns out to be the final item is requested from the wrapper. Once
cleanup has happened, .exhausted will be set to True. If proper
processing of even the last item requires that cleanup not have
happened, then that and prediction of exhaustion are incompatible. One
who wants both should write an iterator class instead of a generator function.

> (Of course, it's possible that it isn't significant. Not all differences
> make a difference.)
>
> The best advice is, try to avoid side-effects, especially in generators.

Agreed.

Terry Jan Reedy

Michal Kwiatkowski

Jul 27, 2009, 5:37:57 PM
On Jul 27, 10:47 pm, Terry Reedy <tjre...@udel.edu> wrote:
> There are two possible definitions of 'exhausted': 1) will raise
> StopIteration on the next next() call; 2) has raised StopIteration at
> least once. The wrapper converts 2) to 1), which is to say, it obeys
> definition 1 once the underlying iteration has obeyed definition 2.
>
> Since it is trivial to set 'exhausted=True' in the generator user code
> once StopIteration has been raised (meaning 2), I presume the OP wants
> the predictive meaning 1).

No, I meant the second meaning (i.e. generator is exhausted when it
has returned instead of yielding).

While, as you showed, it is trivial to create a generator that will
have the "exhausted" flag, in my specific case I have no control over
the user code. I have to use what the Python genobject API gives me
plus the context of the trace function.

Cheers,
mk
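
For readers on a much newer interpreter than the ones discussed in this
thread: Python 3.2 and later expose the generator state through
inspect.getgeneratorstate(). That does nothing for Python 2.3 or 2.4, so
take this only as a sketch of how the same question can be answered
there without resuming the generator (the gen function is illustrative):

```python
import inspect


def gen():
    yield None


g = gen()
states = [inspect.getgeneratorstate(g)]      # created, not yet started

next(g)                                      # runs up to "yield None"
states.append(inspect.getgeneratorstate(g))  # paused at the yield

for _ in g:                                  # let the generator return
    pass
states.append(inspect.getgeneratorstate(g))  # finished
```

After "yield None" the generator is reported as suspended rather than
closed, which is exactly the distinction the trace output lacks.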
