Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

question about generators

0 views
Skip to first unread message

Andrew Koenig

unread,
Aug 14, 2002, 11:38:20 AM8/14/02
to
I had a function along the following lines:

def f():
if <condition>:
print <something>
else:
<do something>
f()
<do something else>

and I wanted to turn it into a generator instead of having it
generate output directly. My first try was to change "print"
to "yield", and that failed horribly.

Of course, what happened was that the recursive call was now
creating a generator that was being thrown away. To make it
work, I had to do this:

def f():
if <condition>:
yield <something>
else:
<do something>
for i in f():
yield i
<do something else>

So... my question is this: Is there a cleaner general way of making
this kind of program transformation? Obviously, replacing

for i in f():
yield i

by

yield f()

won't work -- nor should it, because the compiler doesn't know that
I might not be intending to write a generator that yields a sequence
of genrators.

--
Andrew Koenig, a...@research.att.com, http://www.research.att.com/info/ark

Daniel Dittmar

unread,
Aug 14, 2002, 12:16:52 PM8/14/02
to
Andrew Koenig wrote:
> def f():
> if <condition>:
> yield <something>
> else:
> <do something>
> for i in f():
> yield i
> <do something else>
>
> So... my question is this: Is there a cleaner general way of making
> this kind of program transformation?

The Unix way would be to use pipes/queues:
def f(queue):
if <condition>:
queue.append (...)
else:
<do something>
f (queue)
<do something else>

This of course requires a second thread for the queue to be read.
Maybe someone creates a
yield >>queue, value
to allow this in one thread.

Daniel

Bjorn Pettersen

unread,
Aug 14, 2002, 2:45:28 PM8/14/02
to
> From: Andrew Koenig [mailto:a...@research.att.com]

[snip turning a function into a generator]



> So... my question is this: Is there a cleaner general way of

> making this kind of program transformation? Obviously, replacing


>
> for i in f():
> yield i
>

> by
>
> yield f()
>
> won't work -- nor should it, because the compiler doesn't
> know that I might not be intending to write a generator that
> yields a sequence of genrators.

I've noticed a similar problem, ie. all my generators are recursive and
I have to do the for-loop dance, and yes it had me flumoxed the first
time too.

It seems like this is a common enough idiom that it could warrant
special syntax(?) Coming from a C++ background (but having also
programmed in Python for the last five years) something like

yield << f()

would seem natural, and also symetric to print >> ;->

-- bjorn

Jonathan Hogg

unread,
Aug 14, 2002, 4:36:38 PM8/14/02
to
On 14/8/2002 16:38, in article yu993cth...@europa.research.att.com,
"Andrew Koenig" <a...@research.att.com> wrote:

> I had a function along the following lines:
>
> def f():
> if <condition>:
> print <something>
> else:
> <do something>
> f()
> <do something else>
>
> and I wanted to turn it into a generator instead of having it
> generate output directly. My first try was to change "print"
> to "yield", and that failed horribly.

I'm confused, the pattern you show above will only ever 'print <something>'
once. There are only two paths through the function, one prints, the other
calls itself recursively. So the function can only work down a bit through
some recursive calls doing <something>, then when the condition is true it
will print something, bubble back up through all the calls doing <something
else>, and exit.

So if it can only return one thing, you don't need a generator. So I'm
guessing there's something crucial missing from the pattern shown.

I was going to suggest analysing the function and turning it into a loop
instead of recursion, but I can't suggest how to do that without knowing a
bit more about how it was meant to work.

That said, I usually just do what you did and put a 'for x in f(): yield x'

Jonathan

Andrew Koenig

unread,
Aug 14, 2002, 5:27:27 PM8/14/02
to
Jonathan> I'm confused, the pattern you show above will only ever
Jonathan> 'print <something>' once.

You're quite right. I really meant this:

def f():
for <...>


if <condition>:
print <something>
else:
<do something>
f()
<do something else>

--

Carl Banks

unread,
Aug 14, 2002, 5:16:56 PM8/14/02
to
Jonathan Hogg wrote:
> On 14/8/2002 16:38, in article yu993cth...@europa.research.att.com,
> "Andrew Koenig" <a...@research.att.com> wrote:
>
>> I had a function along the following lines:
>>
>> def f():
>> if <condition>:
>> print <something>
>> else:
>> <do something>
>> f()
>> <do something else>
>>
>> and I wanted to turn it into a generator instead of having it
>> generate output directly. My first try was to change "print"
>> to "yield", and that failed horribly.
>
> I'm confused, the pattern you show above will only ever 'print <something>'
> once.

Not so fast! He could have two calls to f(), or the f() inside of a
loop. Omitted for simplicity, of course.


--
CARL BANKS
http://www.aerojockey.com

Aahz

unread,
Aug 14, 2002, 7:06:53 PM8/14/02
to
In article <yu993cth...@europa.research.att.com>,

Not really. The two functions are not really semantically equivalent.
Consider the necessary code had your original function instead of
"print" used "return".
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/

Tim Peters

unread,
Aug 14, 2002, 8:53:25 PM8/14/02
to
[Bjorn Pettersen, on Andrew Koenig's

for i in f():
yield i

]

> I've noticed a similar problem, ie. all my generators are recursive and
> I have to do the for-loop dance, and yes it had me flumoxed the first
> time too.
>
> It seems like this is a common enough idiom that it could warrant
> special syntax(?)

There was a lot of discussion of this at the time (dig into the iterators
mailing list archive). We decided to keep Simple Generators, well,
*simple*, to start with.

> Coming from a C++ background (but having also programmed in Python for
> the last five years) something like
>
> yield << f()
>
> would seem natural, and also symetric to print >> ;->

The one I like best is

yield every some_expression

where "every" acts like a pseudo-keyword in the context of a yield statement
(like "as" does in the context of "from x import y as z"). This honors a
similar meaning for "every" in the Icon language.

Barry has already reserved

yield << f()

to mean that the file named by "f" should read one line and suspend its
value to the caller <wink>.


Jonathan Hogg

unread,
Aug 15, 2002, 3:42:49 AM8/15/02
to
On 14/8/2002 22:27, in article yu99wuqt...@europa.research.att.com,
"Andrew Koenig" <a...@research.att.com> wrote:

> You're quite right. I really meant this:
>
> def f():
> for <...>

> if <condition>:
> print <something>
> else:
> <do something>
> f()
> <do something else>

Hmmm... that's a doozy. I'm not sure I'd want to try re-writing that without
recursion. The resulting solution would probably be completely unobvious.

I'd definitely just stick with the 'for x in f(): yield x' convention ;-)

I don't think it's so bad anyway as it reads better for me. Understanding
what a recursive algorithm is doing can be hard at the best of times, but
the for loop makes it explicit there will be multiple results at this point.
With some appropriate naming, it can be quite elegant:

if is_something( node ):
yield node.value
else:
for value in all_sub_somethings_of( node ):
yield value

Well, maybe...

Jonathan

Andrew Koenig

unread,
Aug 15, 2002, 9:19:28 AM8/15/02
to
>> You're quite right. I really meant this:

>> def f():
>> for <...>
>> if <condition>:
>> print <something>
>> else:
>> <do something>
>> f()
>> <do something else>

Jonathan> Hmmm... that's a doozy. I'm not sure I'd want to try
Jonathan> re-writing that without recursion. The resulting solution
Jonathan> would probably be completely unobvious.

Jonathan> I'd definitely just stick with the 'for x in f(): yield x'
Jonathan> convention ;-)

So let me bring the discussion back to my original question. I started with

def f():
for <...>:
if <condition>:
print <something>
else:
<do something>
f()
<do something else

I wanted to change this function so that instead of printing its results,
it yielded them. So I made the obvious change:

def f():
for <...>:
if <condition>:

yield <something>


else:
<do something>
f()
<do something else

and the program stopped working. That's when I realized that yield wasn't
like print, and I had to write

def f():
for <...>:
if <condition>:

yield <something>
else:
<do something>

for i in f():
yield i

<do something else

instead. At first I wondered if perhaps calling a generator from
another generator, and not doing anything with the results, should
automatically yield them, but decided that was too clever. Still, I
wondered if there was an easier way of doing this program
transformation. So far, apparently there isn't.

Steve Holden

unread,
Aug 15, 2002, 11:08:11 AM8/15/02
to
"Andrew Koenig" <a...@research.att.com> wrote in message
news:yu99k7ms...@europa.research.att.com...

But don't you think this is because of the fundamentally different nature of
"print" and "yield"? I didn't see your answer to Aahz' point, which was
(I'll repeat it here in case your server didn't catch the post):

'''


>So... my question is this: Is there a cleaner general way of making
>this kind of program transformation?

Not really. The two functions are not really semantically equivalent.
Consider the necessary code had your original function instead of
"print" used "return".

'''

I think that's a valid question.
regards
-----------------------------------------------------------------------
Steve Holden http://www.holdenweb.com/
Python Web Programming http://pydish.holdenweb.com/pwp/
-----------------------------------------------------------------------

Andrew Koenig

unread,
Aug 15, 2002, 12:02:03 PM8/15/02
to
Steve> But don't you think this is because of the fundamentally
Steve> different nature of "print" and "yield"?

At one level, of course I do -- but on another level, I don't.

A generator is a program that yields a sequence of values
to its caller. If I have a program that prints a sequence
of values -- that is, a program that yields a sequence of
values to the printer -- one might think that by changing all
of the print statements to yield statements (ignoring formatting
issues, of course), one could change one program to another.

That is, suppose we have:

def f():
# ...
print foo
# ...

It is not all that absurd to suppose that we could transform the
program this way and maintain its behavior:

def g():
# ...
yield foo
# ...

def f():
for i in g():
print i

And indeed we can -- the only problem being recursive calls.

In other words, let's go back to my original example:

def f():
for <...>:
if <condition>:
print <something>
else:
<do something>
f()
<do something else>

and transform it exactly as I have done above:

def g():


for <...>:
if <condition>:
yield <something>
else:
<do something>
f()
<do something else>

def f():
for i in g():
print i

This doesn't work because it's a ``shallow transformation'' that fails
in the face of recursive calls. To find the corresponding deep
transformation, we have to map the recursion into an analogous recursion:


def g():


for <...>:
if <condition>:
yield <something>
else:
<do something>

for i in g()


yield i
<do something else>

and then it works.

Andrew Koenig

unread,
Aug 15, 2002, 1:30:37 PM8/15/02
to
Aha! I just realized part of the origin of my puzzlement.

Yield actually does one of two very different things depending
on context. Consider:

def f():
yield 1
yield 2

The "yield 1" statement does two things:

1) It creates a generator object and returns that object to
the caller;

2) When the caller calls the generator's "next" method,
it passes 1 back to the caller.

The "yield 2" statement does only one thing: It passes 2 back to the
caller.

Let's call these two statements "type 1" and "type 2" yield
statements.

How do we know which yield statement does what? The first statement
executed in a function is type 1, all others, including re-executions
of that same statement, are type 2.

If a function calls another function that also contains a yield
statement, the first yield statement executed in that function is
also type 1.

There is no way of executing a type 2 yield statement in a function
without first executing a type 1 yield statement in the same function.

These facts mean that yield statements break a form of abstraction
that is commonplace in many other contexts: the ability to take a
collection of statements and put them in a function. For example:

def f():
<statement 1>
<statement 2>
<statement 3>
<statement 4>

Under ordinary circumstances, I can rewrite this as

def f():
<statement 1>
g()
<statement 4>

def g():
<statement 2>
<statement 3>

without changing the meaning of the program (provided that statement 2
and statement 3 do not refer to local variables of f).

However, if I use yield statements, the abstraction breaks down:

def f():
yield 1
yield 2
yield 3
yield 4

is not equivalent to

def f():
yield 1
g()
yield 4

def g():
yield 2
yield 3

I think that Tim's "yield every g()" notion is really a way of saying
``I want to call g, but I want every yield in g and anything it calls
to be considered type 2, not type 1.''

Bernhard Herzog

unread,
Aug 15, 2002, 1:56:23 PM8/15/02
to
Andrew Koenig <a...@research.att.com> writes:

> Aha! I just realized part of the origin of my puzzlement.
>
> Yield actually does one of two very different things depending
> on context. Consider:
>
> def f():
> yield 1
> yield 2
>
> The "yield 1" statement does two things:
>
> 1) It creates a generator object and returns that object to
> the caller;

It doesn't. This happens when f is called. However, f doesn't really
start to execute until the generator's next() method is called:
>>> def f():
... print 'a'
... yield 1
... print 'b'
... yield 2
...
>>> g = f()
>>> g.next()
a
1
>>> g.next()
b
2

> 2) When the caller calls the generator's "next" method,
> it passes 1 back to the caller.
>
> The "yield 2" statement does only one thing: It passes 2 back to the
> caller.
>
> Let's call these two statements "type 1" and "type 2" yield
> statements.

All yields are of type 2

[...]


> These facts mean that yield statements break a form of abstraction
> that is commonplace in many other contexts: the ability to take a
> collection of statements and put them in a function. For example:
>
> def f():
> <statement 1>
> <statement 2>
> <statement 3>
> <statement 4>
>
> Under ordinary circumstances, I can rewrite this as
>
> def f():
> <statement 1>
> g()
> <statement 4>
>
> def g():
> <statement 2>
> <statement 3>
>
> without changing the meaning of the program (provided that statement 2
> and statement 3 do not refer to local variables of f).

As you observe this doesn't for yield statements. In this regard yield
is just like return for which it wouldn't work either. Both yield and
return redirect the control flow back to the immediate caller, only with
return the caller is the caller of the function, with yield it's the
caller of the generator's next method.

Bernhard

--
Intevation GmbH http://intevation.de/
Sketch http://sketch.sourceforge.net/
MapIt! http://www.mapit.de/

Neil Schemenauer

unread,
Aug 15, 2002, 3:13:45 PM8/15/02
to
Bernhard Herzog wrote:

> Andrew Koenig <a...@research.att.com> writes:
> > The "yield 1" statement does two things:
> >
> > 1) It creates a generator object and returns that object to
> > the caller;
>
> It doesn't. This happens when f is called. However, f doesn't really
> start to execute until the generator's next() method is called

Right. Whether 'def something' is going to be a "generator function" or
a plain function is determined at compile time by examining the body for
yield statements. When a generator function is called it returns a
generator-iterator. Calling .next() on the generator-iterator begins
execution of the code.

For all the details refer to PEP 255 (www.python.org/peps/pep-0255.html).

Neil

Aahz

unread,
Aug 15, 2002, 3:48:56 PM8/15/02
to
In article <yu993ctg...@europa.research.att.com>,

Andrew Koenig <a...@research.att.com> wrote:
>
>Steve> But don't you think this is because of the fundamentally
>Steve> different nature of "print" and "yield"?
>
>At one level, of course I do -- but on another level, I don't.
>
>A generator is a program that yields a sequence of values to its
>caller. If I have a program that prints a sequence of values -- that
>is, a program that yields a sequence of values to the printer -- one
>might think that by changing all of the print statements to yield
>statements (ignoring formatting issues, of course), one could change
>one program to another.

Nope. A generator is a function that creates an iterator. An iterator
object produces values when its next() method is called. Don't get
confused by the fact that Python generators use the same terminology as
similar constructs in other languages. Especially don't confuse the
issue further by re-using Python terminology in idiosyncratic ways; in
Python, "sequences" refer strictly to array-like objects (lists, tuples,
and strings). "Stream" seems to be gaining some popularity to refer to
the output of an iterator.

I repeat my suggestion that you construct a solution using "return"
instead of "print".

Tim Peters

unread,
Aug 15, 2002, 5:08:59 PM8/15/02
to
[Andrew Koenig]
> ...

> Yield actually does one of two very different things depending
> on context.

I expect that's been cleared up by now, so I'll skip a repetition.

> ...


> These facts mean that yield statements break a form of abstraction
> that is commonplace in many other contexts: the ability to take a
> collection of statements and put them in a function. For example:
>
> def f():
> <statement 1>
> <statement 2>
> <statement 3>
> <statement 4>
>
> Under ordinary circumstances, I can rewrite this as
>
> def f():
> <statement 1>
> g()
> <statement 4>
>
> def g():
> <statement 2>
> <statement 3>
>
> without changing the meaning of the program (provided that statement 2
> and statement 3 do not refer to local variables of f).

More relevant I think is that 2 and 3 must not affect control flow in ways
that change as a result of the move. For example, if 2 is "return", the
"before" and "after" spellings say quite different things.

> However, if I use yield statements,

Among many other control-flow statements, yes <wink>.

> the abstraction breaks down:

> def f():
> yield 1
> yield 2
> yield 3
> yield 4
>
> is not equivalent to
>
> def f():
> yield 1
> g()
> yield 4
>
> def g():
> yield 2
> yield 3
>
> I think that Tim's "yield every g()" notion is really a way of saying
> ``I want to call g, but I want every yield in g and anything it calls
> to be considered type 2, not type 1.''

I don't buy the type 1-vs-2 distinction (all yields are "type 2" in Python);

yield every x

is intended as pure sugar for

for _temp_var in x:
yield _temp_var

A

yield every g()

in your "after" f() would make it quite like the "before" spelling.

If you haven't played with the Icon language, you should! It's a lot of
fun, and it's very educational to program in a language where generators are
ubiquitous (all expressions in Icon are generators, although many of them--
like, say, the integer literal 42 --yield a sequence with only one value).
What I'm calling "yield every" here is closest to what's called "suspend" in
Icon; Icon also has an "every" control structure, which forces its
expression argument to generate all its results, but without any notion of
returning them *to* "a caller". So, e.g.,

every f((1 to 3) + (5 to 10))

calls the function f() 18 times, once each for the 18 sums 1+5, 1+6, ...,
3+10. OK, I'm lying, it may do a lot more than that, if "f" is an
expression that happens to generate multiple callables too. Like

every (f|g|h)(42)

calls f(42) then g(42) then h(42).

suspend (f|g|h)(42)

does the same, but also yields each function-call result to the current
caller.

Generators weren't meant to be central to Python programming, though, and it
took a lot of discipline to keep the Simple Generators PEP simple <0.7
wink>.


Andrew Koenig

unread,
Aug 15, 2002, 5:37:57 PM8/15/02
to
Tim> If you haven't played with the Icon language, you should!

I have.

Paul Rubin

unread,
Aug 15, 2002, 5:54:10 PM8/15/02
to
Andrew Koenig <a...@research.att.com> writes:
> Steve> But don't you think this is because of the fundamentally
> Steve> different nature of "print" and "yield"?
>
> At one level, of course I do -- but on another level, I don't.
>
> A generator is a program that yields a sequence of values
> to its caller. If I have a program that prints a sequence
> of values -- that is, a program that yields a sequence of
> values to the printer -- one might think that by changing all
> of the print statements to yield statements (ignoring formatting
> issues, of course), one could change one program to another.

The problem you're hitting is that generators do what you expect, but
Python doesn't support generators--it only supports a limited type of
generator called a "simple generator", which can only yield directly
to its caller. Its callees cannot yield through it.

Implementing generators requires something like Scheme continuations,
which save the whole call stack up to the yield point, and switch back
to it when the generator is resumed. That was considered infeasible
in Jython and a pain in the neck in CPython, hence the limitation to
simple generators, which save only a single stack frame (that's just
another Python heap object). Stackless Python could implement full
generators that do what you want them to; I don't know if it does so though.

Anyway, the problem you're having where yield doesn't really work like
print comes from this implementation limitation, not anything inherent
in the notion of generators.

Andrew Koenig

unread,
Aug 15, 2002, 6:48:01 PM8/15/02
to
Paul> The problem you're hitting is that generators do what you
Paul> expect, but Python doesn't support generators--it only supports
Paul> a limited type of generator called a "simple generator", which
Paul> can only yield directly to its caller. Its callees cannot yield
Paul> through it.

Paul> Implementing generators requires something like Scheme
Paul> continuations, which save the whole call stack up to the yield
Paul> point, and switch back to it when the generator is resumed.
Paul> That was considered infeasible in Jython and a pain in the neck
Paul> in CPython, hence the limitation to simple generators, which
Paul> save only a single stack frame (that's just another Python heap
Paul> object). Stackless Python could implement full generators that
Paul> do what you want them to; I don't know if it does so though.

Ah, I see.

Paul> Anyway, the problem you're having where yield doesn't really
Paul> work like print comes from this implementation limitation, not
Paul> anything inherent in the notion of generators.

To complicate matters, I now realize that my mental model of what was
happening is wrong anyway. I said earlier that I imagined the first
yield in a generator being different from the others, but what is
actually happening is that the mere textual appearance of a yield
statement in a function causes the function to return an iterator
immediately.

This example shows the difference:

def f():
print "Hello"
yield 1

>>> a = f()
>>> a.next()
Hello
1

In my previous model, I would have expected the call to f() to print
Hello and then return an iterator.

Tim Peters

unread,
Aug 15, 2002, 6:46:06 PM8/15/02
to
[Tim]

> If you haven't played with the Icon language, you should!

[Andrew Koenig]
> I have.

Cool! Then build on that: Python's generators are the same as Icon's
generators, except that exposing the .next() method allows Python's flavor
to be used for *some* things that require full-blown coexpressions in Icon
(driving a generator in Icon is, as in CLU too, wholly tied to control
structures -- exposing .next() allows Python's generators to be more
flexible).

The syntactic sugar is different in that Icon's "suspend" *implies* "every",
but in Icon too a generator can return a result only to its immmediate
caller (ditto in CLU).

If you want something fancier than that, you have to use a coexpression in
Icon (provided your platform supports Icon coexps -- not all do), and
explicitly name the intended target coexp. That's where your analogy to
Python's "print" breaks down: "print" is syntactic sugar in Python for (in
part) explicitly naming sys.stdout as the intended target. That's why it
doesn't matter from where you invoke print -- the target is global. The
target of yield is "one up the call stack", though, and that's relative to
which function you're in. The same is true of Icon's suspend.

There's a Generator.py in the Python Demo/threads/ directory with a
different approach, using threads under the covers to allow "yielding"
values directly across any number of "stack frames". There the producer
feeds values to a_generator.put(), and a consumer sucks them out via
a_generator.get(). Everything is explicit then, and the kinds of examples
you're writing work differently.


Carel Fellinger

unread,
Aug 15, 2002, 9:08:36 PM8/15/02
to
On Thu, Aug 15, 2002 at 04:02:03PM +0000, Andrew Koenig wrote:
> Steve> But don't you think this is because of the fundamentally
> Steve> different nature of "print" and "yield"?
>
> At one level, of course I do -- but on another level, I don't.
>
> A generator is a program that yields a sequence of values
> to its caller. If I have a program that prints a sequence
> of values -- that is, a program that yields a sequence of
> values to the printer -- one might think that by changing all
> of the print statements to yield statements (ignoring formatting
> issues, of course), one could change one program to another.

And here lies the root of your confusion:), the sequence sent to
the printer is more like a single string, sent in bits and pieces.
And though those bits and pieces themselves are strings, what the
printer sees is one single stream of characters, i.e. a single string.

To me the print statement is more like an elaborate *join* function,
joining strings with spaces and the occasional newline character(s).

Moreover it does this secretely, as a side effect. no matter where the
print statement is put in your source, all of its output will magically
appear on the same paper.

Generators, on the other hand, really are on pair with plain sequences
without trickery or wichcraft. If you want to join such a sequence
you'll have to do it yourself, and if you want to extent one sequence
with another you'll really have to code it.


Let's remove some of the wichcraft and use StringIO. And, as a bonus,
find the proper way to refactor print statements to keep its output
within program control.

>>> import StringIO
>>> printer = StringIO.StringIO()
>>> def f():
... print >> printer, "spam spam", "and spam"
... print >> printer, "or even more spam",
...
>>> f()
>>> printer.getvalue()
'spam spam and spam\nor even more spam'
>>> list(printer.getvalue())
['s', 'p', 'a', 'm', ' ', 'a', 'n', 'd', ' ', 's', 'p', 'a', 'm', '\n', 'e', 'v', 'e', 'n', ' ', 'm', 'o', 'r', 'e', ' ', 's', 'p', 'a', 'm']

>>> def g():
... yield "spam spam", "and spam" #this yields a tuple!
... yield "or even more spam"
...
>>> list(r)
[('spam', 'and spam'), 'or even more spam']


This simple example show that even without recursion print doesn't behave
like yield at all. That's why Aahz and Steve urged you to rewrite it
with return statements. But I would rather forget about generators and
try to refactor with StringIO instead.


> In other words, let's go back to my original example:
>

Let's make the printer visible:

import StringIO
printer = StringIO.StringIO()

> def f():

def f(printer): #and keep it visible

> for <...>:
> if <condition>:
> print <something>

print >> printer, <something> #keep it visible

> else:
> <do something>
> f()

f(printer) #just keep the printer visible

> <do something else>

The rest of the program need not change, just refer to printer.getvalue()
to see what otherwise would have appeared on paper.


--
groetjes, carel


David Eppstein

unread,
Aug 15, 2002, 8:33:26 PM8/15/02
to
In article <7xbs83i...@ruckus.brouhaha.com>,
Paul Rubin <phr-n...@NOSPAMnightsong.com> wrote:

> Anyway, the problem you're having where yield doesn't really work like
> print comes from this implementation limitation, not anything inherent
> in the notion of generators.

I think it's also inherent in the syntax chosen for simple generators --
there's no way of distinguishing a function that when called returns a
new simple generator (call this a generator factory) from a function
that when called routes its yields to the outlet of its caller (call
this a subgenerator).

Tim has proposed a "yield every x()" syntactic-sugar that would allow
you to take a generator factory x and use it as if it were a
subgenerator. This seems a reasonable idea, but there is an efficiency
argument for having a direct syntax for subgenerators:

Suppose you have a long call chain of "yield every" statements (as can
occur for instance in the tree traversal example from the simple
generator PEP) -- say the length of the chain is L. Then the obvious
implementation is to simply expand each "yield every x" statement into
"for i in x: yield i". But, this would incur O(L) overhead for each
generated item. One could try to speed this up by using some kind of
balanced tree data structure to keep track of which generators are
participating in "yield every" statements, but this seems difficult
because of the complication that a single x may occur in multiple
simultaneous "yield every" statements.

On the other hand, suppose you have a syntax for defining subgenerators,
and a long call chain of subgenerators. Each subgenerator can point
directly to the output point for its yields (known at the time the
subgenerator is created and never changed), so the time per yielded item
is only a constant.

--
David Eppstein UC Irvine Dept. of Information & Computer Science
epps...@ics.uci.edu http://www.ics.uci.edu/~eppstein/

Bengt Richter

unread,
Aug 15, 2002, 10:40:26 PM8/15/02
to
On Thu, 15 Aug 2002 17:30:37 GMT, Andrew Koenig <a...@research.att.com> wrote:

>Aha! I just realized part of the origin of my puzzlement.
>
>Yield actually does one of two very different things depending
>on context. Consider:
>
> def f():
> yield 1
> yield 2
>
>The "yield 1" statement does two things:

AFAIK actually not. You don't get to a yield statement by calling f().
You can get to the first one by calling f().next(), and if you want
to continue to the second one, you have to call .next() of what f()
returned a second time. E.g.,
>>> def f():
... yield 1
... yield 2
...
>>> fgen = f()
>>> fgen.next()
1
>>> fgen.next()
2
>>> fgen.next()
Traceback (most recent call last):
File "<stdin>", line 1, in ?
StopIteration

But successive call to f() just get you more fresh generator objects:
>>> f()
<generator object at 0x00852BB0>
>>> f()
<generator object at 0x008526D0>
>>> f()
<generator object at 0x00852BB0>


>
> 1) It creates a generator object and returns that object to

^^ f() yes, yield no


> the caller;
>
> 2) When the caller calls the generator's "next" method,
> it passes 1 back to the caller.
>
>The "yield 2" statement does only one thing: It passes 2 back to the
>caller.
>
>Let's call these two statements "type 1" and "type 2" yield
>statements.

AFAIK there aren't two kinds. And you don't get to any yield by calling f().

>
>How do we know which yield statement does what? The first statement
>executed in a function is type 1, all others, including re-executions
>of that same statement, are type 2.
>
>If a function calls another function that also contains a yield
>statement, the first yield statement executed in that function is
>also type 1.
>
>There is no way of executing a type 2 yield statement in a function
>without first executing a type 1 yield statement in the same function.
>

Well, I don't think it's the yield statements per se that get executed
to do different things. Personally, I don't like f() returning a generator object,
when -- prior to insertions of one or more yields in its body -- f() would do the
usual function execution. That effectively makes def f(...) bind the name f
to a factory function instead of a function. You need a class (or maybe closure-maker) so
that a generator instance can have state and keep track of it, and the factory function
can return an instance of the generator class, but I think it's misleading to substitute
a factory function bound to the same name as the function that previously didn't have
any yield statements. I.e.,

def foo():
yield 1
yield 2

logically makes foo into a factory function, not the original function. I.e.,

foogen = foo()

does not call the apparent foo() function at all. foo() now generates a class
instance that has a method .next(), which is used to execute successive chunks
of the purported foo whose name was stolen. With a fresh foogen instance, naturally
foogen.next() executes the code in the original foo up to the first yield and returns.
The foogen state can then save a start location just past the yield for use when
foogen.next() is called next time.

I think I would rather have had the compiler introduce a __makegen__ attribute to
foo when it saw yield and wanted to make a factory function. Thus foo.__makegen__()
would return what foo() does now, leaving foo() to raise a warning exception complaining
about an unbound interator method or something like that if called by itself.
One could speculate about contexts for dynamically associating foo with a generator
instance, and advancing its state for the result of its next yield point, but that
would be a longer discussion...

The
for i in foo(): ...
construct would look for foo.__makegen__() first, but lacking __makegen__ it would
of course call foo() as an ordinary function and expect an iterable object back
as it does now. This obviously has to be preserved so as to make e.g.,

def foo(): return [1,2,3]
work as expected in
for i in foo(): ...

But IMO implicit switcheroos using the same name are not too Pythonic ;-/
If you disassemble foo you get no hint in the byte codes that foo() is not going to
execute that code, but instead is going to return an object whose .next()
method will somehow make controlled use of the code you see disassembled.

BTW, ISTM the semantics of generator creation are very analogous to those of thread creation.
I.e., there is a computing thing whose execution state advances, and whose state needs
to be kept track of. A yield is analogous to setting a condition variable that the 'caller' is
waiting on, and suspending execution until getting a signal to resume. Except with a generator
the context switches are synchronous.

I think that's because a function definition is implicitly converted to
a factory function definition by putting yield(s) in the body.
(What happened to the 'explicit is good' zen ;-)

So your first call to f() doesn't give you 1, and neither does a second call to f().
It gives you a fresh object created by the factory function going by the name f.
You have to set theobj = f() and call theobj.next() to get action defined by the
original function.

>I think that Tim's "yield every g()" notion is really a way of saying
>``I want to call g, but I want every yield in g and anything it calls
>to be considered type 2, not type 1.''

I think f() and g() are always 'type 1' if there's a yield in their definition.
You have to use the objects they return to get 'type 2' results.
Successive 'type 2' (yield) results are produced by repeatedly calling
the .next() method of a single 'type 1' object returned by f() or g().
Calling f() repeatedly gives you successive freshly initialized 'type 1'
objects.

IOW "yield every g()" is nice sugar for doing the type 1 iterobj=g() call
and then yielding every available (and always type 2) iterobj.next() result.

Regards,
Bengt Richter

Bengt Richter

unread,
Aug 16, 2002, 4:01:00 AM8/16/02
to
On 15 Aug 2002 15:48:56 -0400, aa...@pythoncraft.com (Aahz) wrote:
[...]

>similar constructs in other languages. Especially don't confuse the
>issue further by re-using Python terminology in idiosyncratic ways; in
>Python, "sequences" refer strictly to array-like objects (lists, tuples,
>and strings). "Stream" seems to be gaining some popularity to refer to
>the output of an iterator.

Is there an official glossary of Python terminology? Seems like it would
be a good thing as a section of the docs, and referenced in introduction,
tutorial, etc. (I know I'm taking a risk, not having searched before posting this ;-)

Regards,
Bengt Richter

greg

unread,
Aug 16, 2002, 4:24:54 AM8/16/02
to
Tim Peters wrote:
>
> There was a lot of discussion of this at the time (dig into the iterators
> mailing list archive). We decided to keep Simple Generators, well,
> *simple*, to start with.

I don't agree with the characterisation of Python generators
as "simpler" than Icon ones (from the user's perspective).
There's a point of view from which Icon's generators are
simpler than Python's -- that is, the one Andrew Koenig
was adopting when he didn't immediately see the need for
the "for...yield" loop.

I predicted, at the time generators were introduced, that
some people would find this aspect of them unintuitive,
and it seems that I was right.

If by "simple" you're referring to the *implementation*,
then yes, it would have been much harder to add Icon-style
generators to CPython. (But only because the guts of CPython
hadn't been designed with them in mind from the beginning
like Icon had.)

Greg Ewing

greg

unread,
Aug 16, 2002, 4:32:27 AM8/16/02
to
Andrew Koenig wrote:
> At first I wondered if perhaps calling a generator from
> another generator, and not doing anything with the results, should
> automatically yield them, but decided that was too clever. Still, I
> wondered if there was an easier way of doing this program
> transformation. So far, apparently there isn't.

Actually, there is a way, sort of, although it doesn't
involve using Python generators. You can write it in
so-called "continuation passing style" like this:

def f(produce_result):
for ...:
if ...:
produce_result(x)
else:
...
f(produce_result)
...

and you call it with a function (or some other callable
object) that does what you want done with each value.

This is very similar to the way Icon works under the
covers.

Greg Ewing

greg

unread,
Aug 16, 2002, 5:06:55 AM8/16/02
to
Paul Rubin wrote:
>
> Implementing generators requires something like Scheme continuations,
> which save the whole call stack up to the yield point, and switch back
> to it when the generator is resumed. That was considered infeasible
> in Jython and a pain in the neck in CPython, hence the limitation to
> simple generators, which save only a single stack frame (that's just
> another Python heap object). Stackless Python could implement full
> generators that do what you want them to;

No, no, no. You *don't* need Stackless-ness to implement
Icon-style generators. CPython *could* have been made to
do them (although it would have been a lot of work).

However, as Tim pointed out in another post, Python's
generators can actually do something that Icon's can't:
you can have multiple Python generators on the go in
parallel. This is possible because it uses a sort of
one-level-deep version of Stackless-ness, keeping a
Python frame object around for each active generator.

So, although you lose something the Python way, you
gain something too. This is another reason why it's
a bit misleading to call Python's generators "simpler"
than Icon's.

Greg Ewing

greg

unread,
Aug 16, 2002, 5:16:15 AM8/16/02
to
David Eppstein wrote:
>
> Suppose you have a long call chain of "yield every" statements
> -- say the length of the chain is L. Then the obvious
> implementation is to simply expand each "yield every x" statement into
> "for i in x: yield i". But, this would incur O(L) overhead for each
> generated item.

I have an idea about this, which I hope I'll get around to
following up. I *think* it will be possible to optimise
"yield every x" in the case where x is a generator-iterator,
to eliminate the call chain.

This would be done by keeping a stack of active generator-
iterators somewhere (not exactly sure where yet). When you
hit a "yield every", you push the sub-generator-iterator
onto the stack, so that subsequent yields go directly to
the most recent one. When it is exhausted, you pop the
stack and try to resume the next most recent one. When
the stack is empty, you raise StopIteration.

Greg Ewing

greg

unread,
Aug 16, 2002, 5:32:14 AM8/16/02
to
Tim Peters wrote:
>
> The syntactic sugar is different in that Icon's "suspend" *implies* "every",
> but in Icon too a generator can return a result only to its immmediate
> caller (ditto in CLU).

(Presumably by "every" here you're talking about the proposed
Python "yield every", not the Icon "every".)

That's one way of thinking about it, I suppose, but it may not
be the best one for getting your head around how Icon works.
In Icon, calling, suspending and returning actually all
transfer control in the same direction -- *down* the call
stack. The only time you go *up* the call stack is when
you fail!

It's confusing trying to compare two languages when simple
words like "return" can mean quite different things...

Greg Ewing

Neil Schemenauer

unread,
Aug 16, 2002, 10:35:32 AM8/16/02
to
David Eppstein wrote:
> Tim has proposed a "yield every x()" syntactic-sugar that would allow
> you to take a generator factory x and use it as if it were a
> subgenerator. This seems a reasonable idea, but there is an efficiency
> argument for having a direct syntax for subgenerators:

I think that depends on how 'yield every' works. Does it require a
generator-iterator or just any iterator? Also, does it allow the
generator-iterator to be passed? For example,

def grange(n):
for i in xrange(n):
yield i

def grange_wrapper():
return grange()

def a():
yield every grange(10)

def b():
yield every grange_wrapper(10)

def c():
yield every range(10)

do 'a', 'b', 'c' all work?

Neil

Tim Peters

unread,
Aug 16, 2002, 10:31:44 AM8/16/02
to
[Tim]

> There was a lot of discussion of this at the time (dig into the
> iterators mailing list archive). We decided to keep Simple
> Generators, well, *simple*, to start with.

[Greg Ewing]


> I don't agree with the characterisation of Python generators
> as "simpler" than Icon ones (from the user's perspective).

The characterization Simple was in contrast to continuation-based gimmicks,
not at all to Icon. The word "generator" means something much hairier to,
e.g., Scheme-heads (e.g., you've seen Paul Rubin's recent post about
generators, which assumes a meaning for the word that isn't intended in Icon
or Python or CLU or Sather). Simple Generators was in contrast to (Complex)
Stacklessness.

> There's a point of view from which Icon's generators are
> simpler than Python's

Absolutely. Icon is built on generators from the ground up, so they're
natural as breathing there; they'll never be that natural in Python.

> -- that is, the one Andrew Koenig was adopting when he didn't
> immediately see the need for the "for...yield" loop.

Andrew's example "fails" in the same way if code in Icon, so this particular
example isn't helpful to your point (in Icon he would need to plop the
recursive call into an explicit "suspend" control structure to get the
example to work).

> I predicted, at the time generators were introduced, that
> some people would find this aspect of them unintuitive,
> and it seems that I was right.

Some Icon users are also suprised by this -- indeed, recursive generators
are the pons asinorum for would-be idiomatic Icon programmers too.
"Forgetting" the suspend-- and using return instead of suspend --are very
common newbie mistakes there. However, Icon isn't any better at telepathy
than Python is, and a programmer may very well *not* wish to yield values
produced by callee generators. In Icon and Python, you have to explicitly
"pass them up" if that's what you want (btw, in another message you seem to
assume meanings for "up" and "down" that are the reverse of most-common
usage: IME most people seem to think of callees as returning "up" to their
callers; I'm acutely aware of this because the idea of a stack "growing
down" is repugnant to my sense of beauty too <wink>).

> If by "simple" you're referring to the *implementation*,

> then yes it would have been much harder to add Icon-style


> generators to CPython. (But only because the guts of CPython
> hadn't been designed with them in mind from the beginning
> like Icon had.)

It's more that the *semantics* of Python weren't designed with generators in
mind at all. Icon-style generators with a wholly Icon flavor would be a
poor fit to Python. I call what we implemented in Python "resumable
functions", because that's still the best conceptual fit to Python's
semantics. Implementation ease and ease of fitting into Python's semantic
model are usually strongly related, perhaps because Guido has always favored
operational definitions.


Tim Peters

unread,
Aug 16, 2002, 1:00:19 PM8/16/02
to
[Neil Schemenauer]

> I think that depends on how 'yield every' works.

I thought of it as pure syntactic sugar,

yield every expr

same-as

for _tempname in expr:
yield _tempname

where _tempname is an internal vrbl name that doesn't conflict with any
other vrbl name.

> Does it require a generator-iterator or just any iterator?

By definition, the semantics are exactly the same as in the "for" loop
rewrite. So, for example,

yield every [1, 2, 3]

would yield 1, then yield 2, then yield 3, because that's what

for x in [1, 2, 3]:
yield x

does today. So expr must evaluate to an iterable object, and that's the
only requirement. A generator-iterator is one kind of iterable object, of
course.

Note that

yield every 666

would raise an exception, because

for x in 666:

raises one today.

> Also, does it allow the generator-iterator to be passed?

Couldn't parse that one.

> For example,
>
> def grange(n):
> for i in xrange(n):
> yield i
>
> def grange_wrapper():
> return grange()
>
> def a():
> yield every grange(10)
>
> def b():
> yield every grange_wrapper(10)

Since grange_wrapper() above doesn't accept an argument, it's unclear
whether you're asking about exception behavior here.

> def c():
> yield every range(10)
>
> do 'a', 'b', 'c' all work?

If you think these "work" today (grange_wrapper() really muddied your
intent -- you're passing it an argument but it doesn't accept one, while it
in turn doesn't pass an argument to a function that requires one), yes, else
no:

def aa():
for x in grange(10):
yield x

def bb():
for x in grange_wrapper(10):
yield x

def cc():
for x in range(10):
yield x


Neil Schemenauer

unread,
Aug 16, 2002, 1:00:45 PM8/16/02
to
Tim Peters wrote:
> I thought of it as pure syntactic sugar,
>
> yield every expr
>
> same-as
>
> for _tempname in expr:
> yield _tempname

Tasty.

> > Also, does it allow the generator-iterator to be passed?
>
> Couldn't parse that one.

I meant that would it work if you called a function that returned a
generator-iterator (as opposed to calling a generator-function).

> Since grange_wrapper() above doesn't accept an argument, it's unclear
> whether you're asking about exception behavior here.

I should be more careful when I type examples. grange_wrapper was
supposed to take an argument just like grange(). Forget the examples.
Your proposed 'yield every' would accept any object that supported the
iterator protocol. In that case, David Eppstein's idea of optimizing
away the extra 'for' loop would be difficult.

It's not a big problem, IMO. I'd much rather have the extra
flexibility. The loop would be efficient since it could be implemented
as a C function (like map()) instead of having to go through the VM.
Have you pitched 'yield every' to Guido yet? Is it worth trying to come
up with a patch?

Neil

David Eppstein

unread,
Aug 16, 2002, 1:25:26 PM8/16/02
to
In article <3D5CC2DC...@cosc.canterbury.ac.nz>,
greg <gr...@cosc.canterbury.ac.nz> wrote:

But the set of generators currently involved in yield every statements
doesn't form a stack, or even a set of paths -- it's not hard to set up
a situation with generators x, y, and z where x and y are both
simultaneously calling "yield every" on z. The order in which x and y
get items from z would then be controlled by the order in which x and y
are themselves called.

Now that I think about it, though, a generator can only be calling
"yield every" on one other generator, and the graph of "yield every"
calls is acyclic (attempts at cyclicity result in ValueError: generator
already executing) so the "yield every" connections should form a
forest, where the roots of the forest are the generators that are
actually doing stuff instead of passing the buck to other generators.
Each time a "yield every" statement is executed, it adds one more edge,
connecting two trees in the forest to form a single tree, and each time
a generator involved in a "yield every" statement terminates, it removes
the root of a tree and splits it into a forest of subtrees. Each actual
call to a generator involved in this forest should be diverted to the
root of the tree to which it belongs. So you could use a dynamic tree
data structure such as the one of Sleator and Tarjan [A data structure
for dynamic trees, J. Comp. Sys. Sci. 24:362-381, 1983] to perform each
diversion in logarithmic time. Probably this is too complicated to be
practical, though.

Here's an attempt at something more practical, based on the guess that
multiple "yield every" calls to the same generator are unlikely. Each
generator x maintains two pointers, child(x) and ancestor(x). Initially
child(x)=None and ancestor(x)=x. Calls to x are diverted to
ancestor(x), and the child pointers are used to back up the stack when
ancestors terminate.

When any generator x is called, perform the following steps:
while ancestor(x) is not x itself but has terminated:
ancestor(x) = child(ancestor(x))
while ancestor(ancestor(x)) != ancestor(x):
ancestor(x) = ancestor(ancestor(x))
perform actual generator call on ancestor(x)

A "yield every y" statement from generator x expands to something like
if child(y): # y already in a different yield every
for i in y: yield i # simulate by for loop instead of diverting
else:
child(y) = x
ancestor(x) = ancestor(y)
divert generator call to ancestor(x)

It's hard to say much about worst-case performance of this scheme, but
in typical cases where a generator involved in a "yield every" is not
ever called by a different statement (true e.g. in the tree traversal
example) it looks like it allows constant amortized time generation of
each element regardless of the yield chain length.

Bengt Richter

unread,
Aug 16, 2002, 4:37:06 PM8/16/02
to
On Fri, 16 Aug 2002 07:35:32 -0700, Neil Schemenauer <n...@python.ca> wrote:

>David Eppstein wrote:
>> Tim has proposed a "yield every x()" syntactic-sugar that would allow
>> you to take a generator factory x and use it as if it were a
>> subgenerator. This seems a reasonable idea, but there is an efficiency
>> argument for having a direct syntax for subgenerators:
>
>I think that depends on how 'yield every' works. Does it require a
>generator-iterator or just any iterator? Also, does it allow the
>generator-iterator to be passed? For example,
>
> def grange(n):
> for i in xrange(n):
> yield i
>
> def grange_wrapper():

^^(n)
> return grange()
^^(n)

>
> def a():
> yield every grange(10)
>
> def b():
> yield every grange_wrapper(10)
>
> def c():
> yield every range(10)
>
>do 'a', 'b', 'c' all work?
>

I think if you substitute fakegen below for grange,
the answers should be the same. The object returned just
needs to be iterable, and yield every is sugar for an implicit
for loop using the .next method of the object returned by
the expression in "yield every <expression>". It doesn't
matter how you write the expression, so long as it evaluates
to an iterable object in some viable state (not necessarily
its first, but usually so).

Which BTW means you should be able to write
f = fakegen(25)
f.next()
yield every f
to throw the first away and get all the rest

>>> class fakegen:
... def __iter__(self): return self
... def __init__(self, n):
... self.n = n
... self.i = 0
... def next(self):
... i = self.i
... if i < self.n:
... self.i += 1
... return i # yield i
... else:
... raise StopIteration
...
>>> for x in fakegen(5): print x,
...
0 1 2 3 4

A lot of folks seem to be forgetting that an initialized _instance_ of something
has to be created first, and then used via .next() to step to the next yield result.

I.e., the "for x in fakegen(n)" obscures that this is happening:
>>> f=fakegen(3)
>>> f.next()
0
>>> f.next()
1
>>> f.next()
2
>>> f.next()


Traceback (most recent call last):
File "<stdin>", line 1, in ?

File "<stdin>", line 12, in next
StopIteration

Likewise "yield every fakegen(n)" obscures the fact that an instance of
something must be created and stepped through via .next().

Note that you could bind to the next method of a new generator:
>>> g=fakegen(3).next
>>> g()
0
>>> g()
1
>>> g()
2
>>> g()


Traceback (most recent call last):
File "<stdin>", line 1, in ?

File "<stdin>", line 12, in next
StopIteration

So presumably if g were visible you could write yield g()
in several different places inside another generator, and expect to
get the results in the order called. (Which is quite different from 'every').

Anyway, if I understand correctly, yields in the body of a def fun(...): ... cause
'fun' effectively to be bound to a factory function (apparently based on a flag
associated with the code so as to divert normal execution to do the factory job),
not the function it looks like in dis.dis(fun).

The factory function acts like the fakegen class above and returns an initialized
iterable object. That object makes use of what you see in dis.dis(fun), and maintains
state (frame etc) to walk from yield to yield.

At least, this concept of things seems to work in making sense of what I've seen so far ;-)

Regards,
Bengt Richter

greg

unread,
Aug 18, 2002, 6:46:58 AM8/18/02
to
Tim Peters wrote:
>
> > -- that is, the one Andrew Koenig was adopting when he didn't
> > immediately see the need for the "for...yield" loop.
>
> Andrew's example "fails" in the same way if code in Icon, so this particular
> example isn't helpful to your point (in Icon he would need to plop the
> recursive call into an explicit "suspend" control structure to get the
> example to work).

Yes, that's true. But the equivalent to that in Python would
be "yield f()", which doesn't work either (and was one of the
things he tried, if I remember correctly).

> (btw, in another message you seem to
> assume meanings for "up" and "down" that are the reverse of most-common
> usage: IME most people seem to think of callees as returning "up" to their
> callers;

Which is the point I was rather provocatively trying to
make! I don't know how other people deal with it, but
for me the key to understanding Icon control flow is
to think of "suspend" as a form of call rather than
a form of return. It's equivalent to a call with an
implicit continuation passed down.

Or at least that's what my memory says from when I
played with Icon quite a few years ago. I may be
mis-remembering some of the details -- perhaps I
should go away and refresh my knowledge before saying
any more!

> I'm acutely aware of this because the idea of a stack "growing
> down" is repugnant to my sense of beauty too <wink>).

Hmmm, yes, the stack metaphor does suggest upward growth.
But we talk about "depth" of recursion, not "height" of
recursion!

Ah, well. I never, ever claim to be consistent. :-)

Greg

Greg Ewing

unread,
Aug 19, 2002, 8:08:06 PM8/19/02
to
David Eppstein wrote:

> But the set of generators currently involved in yield every statements
> doesn't form a stack, or even a set of paths

>


> Now that I think about it, though, a generator can only be calling
> "yield every" on one other generator,

>

> When any generator x is called, perform the following steps:


> while ancestor(x) is not x itself but has terminated:
> ancestor(x) = child(ancestor(x))
> while ancestor(ancestor(x)) != ancestor(x):
> ancestor(x) = ancestor(ancestor(x))
> perform actual generator call on ancestor(x)

I've thought about this a bit more, and I think it can be done
a lot more simply than that. All that's needed is a pointer in
each generator-iterator that points to the subject of the
current yield-every statement being executed, if any.

The next() method of the generator-iterator first checks this
pointer, and if it's not null, does a next() on it instead.
If it's null, or calling next() on it raises StopIteration,
carry on executing until the next yield as usual.

This will still require going down a chain of next() methods
when generators are nested, but the calls will all be C calls
and should therefore be quite fast.

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg

David Eppstein

unread,
Aug 20, 2002, 1:05:15 AM8/20/02
to
In article <3D61886...@something.invalid>,
Greg Ewing <see_repl...@something.invalid> wrote:

> I've thought about this a bit more, and I think it can be done
> a lot more simply than that. All that's needed is a pointer in
> each generator-iterator that points to the subject of the
> current yield-every statement being executed, if any.
>
> The next() method of the generator-iterator first checks this
> pointer, and if it's not null, does a next() on it instead.
> If it's null, or calling next() on it raises StopIteration,
> carry on executing until the next yield as usual.
>
> This will still require going down a chain of next() methods
> when generators are nested, but the calls will all be C calls
> and should therefore be quite fast.

Somehow I find solutions that depend on the slowness of interpreted
Python to be inelegant. But I guess that's not a serious practical
criticism...

Greg Ewing

unread,
Aug 20, 2002, 2:26:32 AM8/20/02
to
David Eppstein wrote:

> Somehow I find solutions that depend on the slowness of interpreted
> Python to be inelegant.


If interpreted Python were as fast as C, this optimisation
mightn't gain much. But it's not, so it would. What's so
bad about that?

Andreas Leitgeb

unread,
Aug 20, 2002, 4:35:30 AM8/20/02
to
Greg Ewing <see_repl...@something.invalid> wrote:
> I've thought about this a bit more, and I think it can be done
> a lot more simply than that. All that's needed is a pointer in
> each generator-iterator that points to the subject of the
> current yield-every statement being executed, if any.

Maybe I'm misunderstanding this, but
I strongly want to oppose any implicit propagation of nested
generator-calls, whether recursive or not.

Even a recursive generator may choose to do whatever it likes
with the iterator returned from a sub-generator - even yield
it "as is"!

The previously suggested new pseudo-keyword "every" as in
yield every <iterator>
would be very clear, concise and good syntactic sugar for the current
for i in <iterator>: yield i
with some bonuses:
+ no local variable needs to be bound
+ speed (at least because of previous point :-)
+ could be used right of a colon:
if <whatever>: yield every <iterator>
The only "disadvantage" being, that it's a change in syntax
with some (likely marginal) effect in python's code-size.

There may be better words than "every", perhaps "each" sounds better
in this context.

--
Newsflash: Sproingy made it to the ground !
read more ... <http://avl.enemy.org/sproingy/>

Delaney, Timothy

unread,
Aug 20, 2002, 11:34:43 PM8/20/02
to
> From: Andreas...@siemens.at [mailto:Andreas...@siemens.at]

>
> There may be better words than "every", perhaps "each" sounds better
> in this context.

Whilst 'yield every' is lovely syntax sugar (or 'yield each', or whatever),
I'm starting to question the utility of it.

I post the following example of code from the sets discussion over at
python-dev (Guido's code ;) ...

> def product(s, *sets):
> if not sets:
> for x in s:
> yield (x,)
> else:
> subproduct = list(product(*sets))
> for x in s:
> for t in subproduct:
> yield (x,) + t

As we can see, there are two sections of the format:

for x in <seq>:
yield <something>

However, in neither case would you be able to use

yield every <seq>

I have no ideas as to whether it would be made more general. Perhaps with a
generator version of map ...

def func (s, t=t):
return (x,) + t

yield every generatormap(func, subproduct)

but that is *way* uglier (and longer) than the simple for: loop.

Is it worthwhile to have a construct which can only be used in one specific
case (yield every element in an iterable object) when the alternative is so
short? The *only* advantages I can think of is that it could be somewhat
faster, and wouldn't require a temporary name.

Tim Delaney

Greg Ewing

unread,
Aug 21, 2002, 2:05:16 AM8/21/02
to
Andreas Leitgeb wrote:

>
> Maybe I'm misunderstanding this, but
> I strongly want to oppose any implicit propagation of nested
> generator-calls, whether recursive or not.

Don't worry, there wouldn't be any change in
semantics. What I'm talking about would only
be done when executing a "yield every" statement,
and it would only be an optimisation -- the
"yield every" statement would still have the
same semantics as a for-loop with a "yield"
in it.

> There may be better words than "every", perhaps "each" sounds better
> in this context.


I was originally thinking of using "yield from",
in the interests of re-using an existing keyword.
But since it can be a pseudo-keyword, that's
not really an issue.

Andrew Koenig

unread,
Aug 21, 2002, 9:06:14 AM8/21/02
to
>> def product(s, *sets):
>> if not sets:
>> for x in s:
>> yield (x,)
>> else:
>> subproduct = list(product(*sets))
>> for x in s:
>> for t in subproduct:
>> yield (x,) + t

Delaney> As we can see, there are two sections of the format:

Delaney> for x in <seq>:
Delaney> yield <something>

Delaney> However, in neither case would you be able to use

Delaney> yield every <seq>

def product(s, *sets):
if not sets:

yield every [(x,) for x in s]
else:
subproduct = list(product(*sets))
yield every [(x,) + t for x in s for t in subproduct]

I should think that would work as long as "yield every" takes an
iterable rather than just a generator.

Andrew Koenig

unread,
Aug 21, 2002, 9:41:13 AM8/21/02