A Problem with Python's 'yield'
* To: LL1 Mailing List <address@hidden>
* Subject: A Problem with Python's 'yield'
* From: Eric Kidd <address@hidden>
* Date: 27 May 2003 11:15:20 -0400
* Organization:
* Sender: address@hidden
I'm going to pick on Python here, but only because the example code will
be short and sweet. :-) I believe several other implementations of
generators have the same problem.
Python's generator system, used naively, turns an O(N) tree traversal
into an O(N log N) tree traversal:
    class Tree:
        def __init__(self, value, left=None, right=None):
            self.value = value
            self.left = left
            self.right = right

        def in_order(self):
            if self.left is not None:
                for v in self.left.in_order():
                    yield v
            yield self.value
            if self.right is not None:
                for v in self.right.in_order():
                    yield v

    t = Tree(2, Tree(1), Tree(3))
    for v in yield_bug.t.in_order():
        print v
This prints:
1
2
3
Unfortunately, this snippet calls 'yield' 5 times, because the leaf
values must be yielded twice on their way back up the tree.
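The count can be checked with a small sketch in modern Python 3 (the code in this thread is Python 2). The `counter` argument, the `build` helper and `count_yields` are illustrative additions, not part of the original post. Every executed `yield` is tallied, including the re-yields done by ancestor nodes.

```python
class Tree:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    def in_order(self, counter):
        # counter is a one-element list used as a mutable tally of yields.
        if self.left is not None:
            for v in self.left.in_order(counter):
                counter[0] += 1
                yield v
        counter[0] += 1
        yield self.value
        if self.right is not None:
            for v in self.right.in_order(counter):
                counter[0] += 1
                yield v


def build(lo, hi):
    """Build a balanced tree holding the integers lo..hi (hypothetical helper)."""
    if lo > hi:
        return None
    mid = (lo + hi) // 2
    return Tree(mid, build(lo, mid - 1), build(mid + 1, hi))


def count_yields(tree):
    counter = [0]
    values = list(tree.in_order(counter))
    return values, counter[0]


small_values, small_yields = count_yields(Tree(2, Tree(1), Tree(3)))
big_values, big_yields = count_yields(build(1, 15))
print(small_yields, big_yields)
```

Each value is yielded once per level it has to climb, so the total is the sum of the node depths. In a balanced tree that sum grows as N log N.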
We can shorten the code--and make it run in O(N) time--by adding a new
keyword to replace the "for v in ...: yield v" pattern:
    def in_order(self):
        if self.left is not None:
            yield_all self.left.in_order()
        yield self.value
        if self.right is not None:
            yield_all self.right.in_order()
Interestingly enough, this allows you to define notions such as
"tail-recursive generation", and apply the usual bag of
recursion-optimization techniques.
Cheers,
Eric
|>oug
You should also have looked for the responses to that. Tim Peters's
response is available from
http://aspn.activestate.com/ASPN/Mail/Message/624273
as linked from
http://aspn.activestate.com/ASPN/Mail/Message/python-dev/758572
Here are the most relevant parts.
I'm not bothered -- this comes with the territory. If/when
full-fledged coroutines make it in too, people worried about that can
use them instead. Curious fact: I *was* worried about the worst-case
time aspects of "simple generators" in Icon years ago, but in practice
never ever got burned by it. And rewriting stuff to use Icon
co-expressions instead invariably resulted in messier code that ran
significantly slower in virtually all cases, except for the ones I
*contrived* to prove the O() difference.
BTW, Python almost never worries about worst-case behavior, and people
using Python dicts instead of, e.g., balanced trees, get to carry their
shame home with them hours earlier each day <wink> .
Andrew
da...@dalkescientific.com
Maybe. Until you define the semantics of yield_all and at least outline an
implementation, I am not convinced of 'run in O(N) time'. There was once a
several-post discussion of a related idea of having yield somehow,
magically, skip intermediate generators that only yielded values on up,
without transformation. But it was never clear how to do this practically
without negatively impacting all generators. Certainly, if <yield_all
iterator> == <for i in iterator: yield i>, I don't see how anything is
gained except for a few keystrokes. If <yield_all iterator> == <yield
list(i for i in iterator)> then the replacement is a semantic change.
>     def in_order(self):
>         if self.left is not None:
>             yield_all self.left.in_order()
>         yield self.value
>         if self.right is not None:
>             yield_all self.right.in_order()
If and when I write a text-based double-recursion to iteration transformer,
a pseudokeyword might be an idea for indicating that stacked yields are
identity functions and therefore bypassable.
Terry J. Reedy
This is very reminiscent of discussions several years ago about tail
recursion and how it would be a great thing to optimise the edge cases.
Of course we didn't have generators then, so we couldn't complain about
*their* inefficiencies then.
>
>>     def in_order(self):
>>         if self.left is not None:
>>             yield_all self.left.in_order()
>>         yield self.value
>>         if self.right is not None:
>>             yield_all self.right.in_order()
>
>
> If and when I write a text-based double-recursion to iteration transformer,
> a pseudokeyword might be an idea for indicating that stacked yields are
> identity functions and therefore bypassable.
>
The key words in the above being "use" and "case", I suspect.
python:-always-something-new-to-bitch-about-ly y'rs - steve
Small diversion:
You weren't lazy enough because you added words. The idiom AFAIK is:
Plus ça change, plus ça reste la même chose.
You shouldn't add the "la"; I think that came from translating
too literally. Adding an article to a comparative in French
turns it into a superlative. So instead of writing:
The more it changes, the more it stays the same
You wrote something like:
Most it changes, most it is the same thing.
--
Antoon Pardon
> On Mon, 28 Feb 2005 18:25:51 -0500, Douglas Alan wrote:
>> While writing a generator, I was just thinking how Python needs a
>> "yield_all" statement. With the help of Google, I found a
>> pre-existing discussion on this from a while back in the
>> Lightweight Languages mailing list. I'll repost it here in order
>> to improve the chances of this enhancement actually happening
>> someday.
> You should also have looked for the responses to that. Tim Peters's
> response is available from
> http://aspn.activestate.com/ASPN/Mail/Message/624273
[...]
> Here are the most relevant parts.
[...]
> BTW, Python almost never worries about worst-case behavior, and people
> using Python dicts instead of, e.g., balanced trees, get to carry their
> shame home with them hours earlier each day <wink> .
If you'll reread what I wrote, you'll see that I'm not concerned with
performance, but rather my concern is that I want the syntactic sugar.
I'm tired of writing code that looks like
    def foogen(arg1):

        def foogen1(arg2):
            # Some code here

        # Some code here
        for e in foogen1(arg3): yield e
        # Some code here
        for e in foogen1(arg4): yield e
        # Some code here
        for e in foogen1(arg5): yield e
        # Some code here
        for e in foogen1(arg6): yield e
when it would be much prettier and easier to read if it looked like:
    def foogen(arg1):

        def foogen1(arg2):
            # Some code here

        # Some code here
        yield_all foogen1(arg3)
        # Some code here
        yield_all foogen1(arg4)
        # Some code here
        yield_all foogen1(arg5)
        # Some code here
        yield_all foogen1(arg6)
|>oug
> Certainly, if <yield_all
> iterator> == <for i in iterator: yield i>, I don't see how anything
> is gained except for a few keystrokes.
What's gained is making one's code more readable and maintainable,
which is one of the primary reasons that I use Python.
|>oug
One of the reasons why Python is readable is that the core language is
comparatively small. Adding a new reserved word simply to save a few
characters is a difficult choice, and each case has to be judged on its
merits, but it seems to me that in this case the extra syntax is a burden
that would have to be learned by all Python programmers with very little
benefit.
Remember that many generators will want to do slightly more than just yield
from another iterator, and the for loop allows you to put in additional
processing easily, whereas 'yield_all' has very limited application, e.g.
    for tok in tokenstream():
        if tok.type != COMMENT:
            yield tok
I just scanned a random collection of my Python files: out of 50 yield
statements I found only 3 which could be rewritten using yield_all.
> Douglas Alan wrote:
>> "Terry Reedy" <tjr...@udel.edu> writes:
>>> Certainly, if <yield_all
>>> iterator> == <for i in iterator: yield i>, I don't see how anything
>>> is gained except for a few keystrokes.
>> What's gained is making one's code more readable and maintainable,
>> which is one of the primary reasons that I use Python.
> One of the reasons why Python is readable is that the core language is
> comparatively small.
It's not that small anymore. What it *is* is relatively conceptually
simple and readily comprehensible (i.e. "lightweight"), unlike
languages like C++ and Perl.
> Adding a new reserved word simply to save a few
> characters
It's not to "save a few characters". It's to make it immediately
clear what is happening.
> is a difficult choice, and each case has to be judged on its merits,
> but it seems to me that in this case the extra syntax is a burden
> that would have to be learned by all Python programmers with very
> little benefit.
The amount of effort to learn what "yield_all" does compared to the
amount of effort to understand generators in general is so minuscule
as to be negligible. Besides, by this argument, the standard library
should be kept as small as possible too, since people have to learn
all that stuff in order to understand someone else's code.
> Remember that many generators will want to do slightly more than just yield
> from another iterator, and the for loop allows you to put in additional
> processing easily whereas 'yield_all' has very limited application e.g.
>     for tok in tokenstream():
>         if tok.type != COMMENT:
>             yield tok
> I just scanned a random collection of my Python files: out of 50 yield
> statements I found only 3 which could be rewritten using yield_all.
For me, it's a matter of providing the ability to implement
subroutines elegantly within generators. Without yield_all, it is not
elegant at all to use subroutines to do some of the yielding, since
the calls to the subroutines are complex, verbose statements, rather
than simple ones.
I vote for the ability to have elegant, readable subroutining,
regardless of how much you in particular would use it.
|>oug
    Doug>     def foogen1(arg2):
    Doug>         # Some code here
    Doug>     # Some code here
    Doug>     yield_all foogen1(arg3)
    Doug>     # Some code here
    Doug>     yield_all foogen1(arg4)
    Doug>     # Some code here
    Doug>     yield_all foogen1(arg5)
    Doug>     # Some code here
    Doug>     yield_all foogen1(arg6)
If this idea advances I'd rather see extra syntactic sugar introduced to
complement the current yield statement instead of adding a new keyword.
It's a bit clumsy to come up with something that will work syntactically
since the next token following the yield keyword can be any identifier.
You'd thus need another keyword there. Something like:
    def foogen(arg1):

        def foogen1(arg2):
            # Some code here

        # Some code here
        yield from foogen1(arg3)
        # Some code here
        yield from foogen1(arg4)
        # Some code here
        yield from foogen1(arg5)
        # Some code here
        yield from foogen1(arg6)
It would be nicer if that was
    yield all from <something>
but "all" is a valid identifier, so that might break existing code, though
maybe the presence of the following "from" can be used to distinguish
these two cases:
    yield <expr>
    yield all from <expr>
Skip
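As it happens, the `yield from` spelling sketched above is valid syntax in Python 3.3 and later (PEP 380). A minimal sketch in that later dialect follows; the body given to `foogen1` is an arbitrary stand-in, since the post only says "Some code here".

```python
def foogen(arg1):
    def foogen1(arg2):
        # Stand-in body for illustration only.
        yield arg2
        yield arg2 * 2

    # Each `yield from` delegates to the sub-generator until it is exhausted,
    # then execution continues with the next statement.
    yield from foogen1(arg1)
    yield from foogen1(arg1 + 1)


result = list(foogen(1))
print(result)
```

The caller of `foogen` sees one flat stream of values and cannot tell which sub-generator produced each one.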
yield *<expr>
(Mu-hu-ha-ha-ha!)
You absolutely and definitively have my vote.
When I first learned generators, I even wondered whether I was doing
something wrong when faced with the sub-generator problem you
describe. I was wondering, "Why am I doing this extra for-loop? Is there
something wrong? Can I return the sub-iterator itself and let the final
upper loop do the job? But no, I absolutely have to 'yield'. What then?"
Therefore, the suggestion you make, or something similar, would actually
have eased my learning.
Regards,
Francis Girard
>If this idea advances I'd rather see extra syntactic sugar introduced to
>complement the current yield statement instead of adding a new keyword.
>It's a bit clumsy to come up with something that will work syntactically
>since the next token following the yield keyword can be any identifier.
>You'd thus need another keyword there. Something like:
>
>
I'd agree on *not* introducing a new keyword. I run into this issue
every once in a while, but new keywords for minor syntactic sugar seem
a bit much.
> # Some code here
> yield from foogen1(arg3)
>
>
...
>It would be nicer if that was
>
> yield all from <something>
>
>
I don't really like the need to look past that (potentially long)
expression to see the effect of the operation. I don't mind the yield
from syntax, it nicely encapsulates the learning of "generators" so that
when you see yield up front you know something generatish is going on.
I'd be fine with:
    for yield on foogen1(arg3)
or
    for yield from foogen1(arg3)
which goes more toward the idea of being syntactic sugar for a for loop
that yields each value that is produced. Of course, what happens with:
    [ for yield from foogen1(arg3) ]
would then have to be defined... that might make it too complex a
change. Oh well.
Have fun all,
Mike
________________________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://www.vrplumber.com
http://blog.vrplumber.com
PyCon is coming...
+1 for "generatish" as VOTW (Vocabulation of the Week). =)
STeVe
> > Certainly, if <yield_all
> > iterator> == <for i in iterator: yield i>, I don't see how anything
> > is gained except for a few keystrokes.
>
> What's gained is making one's code more readable and maintainable,
> which is one of the primary reasons that I use Python.
I don't see a lot of difference in readability and maintainability
between the two versions. And if yield_all is going to expand into the
loop, anyway, I'd prefer to make that obvious by using the for-loop
version, rather than using a keyword and pretending that passing the
iterators on has no overhead.
If we're talking about machinery behind the scenes to shortcut chains of
yield_all's, so that the time to pass items up through the chain is
smaller than it would be in the for-loop case, I'd think that would be a
better reason for a keyword, because it's not something that can be done
very easily without one in the current language. I don't know how to
make such shortcutting machinery faster than logarithmic in the worst
case (taking into account the possibility that multiple generators could
have yield_all's to the same iterator) but I think it could be made
nearly constant time in most situations. On the other hand, I'm not
convinced that this would be needed frequently enough to warrant the
complexity of trying to optimize it.
--
David Eppstein
Computer Science Dept., Univ. of California, Irvine
http://www.ics.uci.edu/~eppstein/
No, this won't do. What is needed is a way to yield the results of a generator
from inside another generator without having to do a for-yield-loop inside the
outer generator.
Regards,
Francis Girard
> Therefore, the suggestion you make, or something similar, would actually
> have eased my learning.
Yes, I agree 100%. Not having something like "yield_all" hurt my
ability to learn to use Python's generators quickly because I figured
that Python had to have something like yield_all. But no matter how
hard I looked, I couldn't find anything about it in the manual.
So the argument that adding a feature makes the language harder to
learn is a specious one. Sometimes an extra feature makes the
language easier to learn.
|>oug
> In article <lcacpnx...@gaffa.mit.edu>,
> Douglas Alan <nes...@mit.edu> wrote:
>> > Certainly, if <yield_all
>> > iterator> == <for i in iterator: yield i>, I don't see how anything
>> > is gained except for a few keystrokes.
>> What's gained is making one's code more readable and maintainable,
>> which is one of the primary reasons that I use Python.
> I don't see a lot of difference in readability and maintainability
> between the two versions.
In that case, your brain works nothing like mine.
|>oug
Guido has generally observed a parsimony about the introduction of
features such as the one you suggest into Python, and in particular he
is reluctant to add new keywords - even in cases like decorators that
cried out for a keyword rather than the ugly "@" syntax.
In my opinion that is a good thing mostly, not only because it avoids
code breakage (which can't possibly be bad) but also because it tends to
limit the number of ways that different programmers can express the same
idea.
I suspect this is why people have suggested that you were "only going to
save a few keystrokes". Despite your feeling that yield_all makes the
intent of the code more obvious there seems to be a majority in favor of
the simpler expression of the idea with a yield in a loop.
If you think there's a genuine chance this could be usefully added to
Python the solution is obvious: write and submit a PEP.
regards
Steve
--
Meet the Python developers and your c.l.py favorites March 23-25
Come to PyCon DC 2005 http://www.pycon.org/
Steve Holden http://www.holdenweb.com/
> Guido has generally observed a parsimony about the introduction of
> features such as the one you suggest into Python, and in particular
> he is reluctant to add new keywords - even in cases like decorators
> that cried out for a keyword rather than the ugly "@" syntax.
In this case, that is great, since I'd much prefer
    yield *gen1(arg)
to
    yield_all gen1(arg)
anyway, as someone else suggested in this thread (followed by a
demonic laugh). The only reason I mentioned "yield_all" is because
there was a preexisting discussion that used "yield_all".
|>oug
I'm guessing the * syntax is pretty unlikely to win Guido's approval.
There have been a number of requests[1][2][3] for syntax like:
    x, y, *rest = iterable
for unpacking a variable-sized list (where *rest would work in an
analogous way to what it does in the args of a def). Guido has
consistently rejected these proposals, e.g.:
"I think it's not worth adding now, and if you don't hear from me again
on this topic it's because I haven't changed my mind..."
My suspicion is that if he doesn't like the * syntax when there's a
close parallel to the argument parsing usage, he's not likely to like it
when there isn't one.
STeVe
[1]http://mail.python.org/pipermail/python-dev/2002-November/030349.html
[2]http://mail.python.org/pipermail/python-dev/2004-August/046684.html
[3]http://mail.python.org/pipermail/python-dev/2004-November/049895.html
> I'm guessing the * syntax is pretty unlikely to win Guido's
> approval. There have been a number of requests[1][2][3] for syntax
> like:
> x, y, *rest = iterable
Oh, it is so wrong that Guido objects to the above. Python needs
fully destructuring assignment!
|>oug
Douglas> If you'll reread what I wrote, you'll see that I'm not
Douglas> concerned with performance, but rather my concern is that
Douglas> I want the syntactic sugar. I'm tired of writing code
Douglas> that looks like
    Douglas> def foogen(arg1):
    Douglas>     def foogen1(arg2):
    Douglas>         # Some code here
    Douglas>     # Some code here
    Douglas>     for e in foogen1(arg3): yield e
    Douglas>     # Some code here
    Douglas>     for e in foogen1(arg4): yield e
    Douglas>     # Some code here
    Douglas>     for e in foogen1(arg5): yield e
    Douglas>     # Some code here
    Douglas>     for e in foogen1(arg6): yield e
How about writing it like the following?
    def gen_all(gen):
        for e in gen:
            yield e

    def foogen(arg1):
        def foogen1(arg2):
            # Some code here
        # Some code here
        gen_all(arg3)
        # Some code here
        gen_all(arg4)
        # Some code here
        gen_all(arg5)
        # Some code here
        gen_all(arg6)
Regards,
Isaac.
    def gen_all(gen):
        for e in gen:
            yield e

    def foogen(arg1):
        def foogen1(arg2):
            # Some code here
        # Some code here
        gen_all(arg3)
                ^ I mean foogen1(arg3), obviously, and similar for below
If you actually try doing this, you will see why I want "yield_all".
|>oug
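What goes wrong can be seen in a small sketch (modern Python 3; the generator bodies and the names `broken` and `working` are made up for illustration). Calling a generator function as a bare statement only builds a generator object, which is then thrown away without running.

```python
def foogen1(arg):
    yield arg
    yield arg + 1


def broken(arg):
    # Calling a generator function only creates a generator object;
    # as a bare statement the object is discarded without ever running.
    foogen1(arg)
    yield "done"


def working(arg):
    # The explicit loop re-yields every value, which is the boilerplate
    # that the proposed yield_all would replace.
    for e in foogen1(arg):
        yield e
    yield "done"


broken_result = list(broken(10))
working_result = list(working(10))
print(broken_result, working_result)
```

In the posted version the situation is slightly worse: with no `yield` left in its body, `foogen` is not a generator at all and simply returns None.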
Douglas> If you actually try doing this, you will see why I want
Douglas> "yield_all".
Oh... I see your point.
I was about to suggest that the code in my posts before should be made
to work somehow. I mean, if in
    def fun1(x):
        if not x:
            raise MyErr()
        ...

    def fun2():
        ...
        fun1(val)

    fun2()
we can expect that main gets the exception thrown by fun1, why in
    def fun1(x):
        if not x:
            yield MyObj()
        ...

    def fun2():
        fun1(val)

    for a in fun2():
        ...
we cannot expect MyObj() to be yielded to main? But soon I found that
it is not realistic: there is no way to know that fun2 has generator
semantics. Perhaps it was short-sighted not to introduce a new
keyword instead of def when defining generators.
Regards,
Isaac.
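The point about `fun2` can be checked directly. A minimal sketch in modern Python 3, using the standard `inspect.isgeneratorfunction`; the string "MyObj" is a stand-in for the post's `MyObj()`, and the argument `0` is chosen arbitrarily.

```python
import inspect


def fun1(x):
    if not x:
        yield "MyObj"


def fun2():
    # No yield appears in this body, so Python compiles fun2 as an
    # ordinary function; the generator created by fun1 is simply dropped.
    fun1(0)


fun1_is_gen = inspect.isgeneratorfunction(fun1)
fun2_is_gen = inspect.isgeneratorfunction(fun2)
fun2_result = fun2()
print(fun1_is_gen, fun2_is_gen, fun2_result)
```

Whether a `def` is a generator is decided at compile time purely by the presence of `yield` in its own body, so a call to another generator cannot change it.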
If you do write a PEP, try to get genexp syntax supported by the yield keyword.
That is, the following currently triggers a syntax error:
    def f():
        yield x for x in gen1(arg)
I wouldn't mind seeing it successively yield the values returned by gen1()
instead. To my mind that better conveys what's going on than putting the yield
statement inside the for loop (it should also provide the opportunity to
optimise the next() calls by temporarily swapping the outer generator's frame
for the inner generator's frame, and swapping them back only when the inner
generator is exhausted)
That syntax error I mentioned makes it backwards compatible, too (existing code
must have parentheses around the genexp, which will not be altered if the yield
keyword gains a native genexp syntax).
If anyone thinks that this would result in a couple of parentheses making too
much difference, consider this:
Py> [x for x in []]
[]
Py> [(x for x in [])]
[<generator object at 0x009E6698>]
Cheers,
Nick.
--
Nick Coghlan | ncog...@email.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net
Yeah, there are a lot of folks that like this idea, but most of us are
willing to concede that Guido's intuition for these kinds of things is
generally pretty good. This would be convenient for me occasionally,
but I certainly wouldn't need it that often...
STeVe
Hmm. My impression is that Guido did not like x,*y=iterable because he
does *not* see it as a 'close parallel' but as a strained analogy. To me,
yield *iterable is closer to the use in function calling. It would mean
'unpack in time' rather than 'unpack in space' but that is natural (to me,
anyway) since that is what generators are about.
In any case, I also like this better than yield_all and would estimate that
it has a higher chance of acceptance, even if still under 50%. Part of the
justification could be that the abbreviated form is much easier to speed up
than the expanded form. If the ref count of the iterator is just 1, then
it might be reasonable to assume that iterator.next will not be called from
anywhere else. (I.e., I can't think of an exception, but know that there
might be one somehow.)
Terry J. Reedy
I don't see the * in
yield *iterable
being much closer to the use in argument unpacking. Note what happens
to iterables after *:
py> def gen():
...     yield 1
...     yield 2
...     print "complete!"
...
py> def f(*args):
...     print args
...
py> f(*gen())
complete!
(1, 2)
The entire iterable is immediately exhausted. So if
yield *iterable
is supposed to parallel argument unpacking, I would expect that it would
also immediately exhaust the iterable, e.g. it would be equivalent to:
    for item in tuple(iterable):
        yield item
which I don't think is what the OP wants.
I'm certain I could get used to the syntax. I'm only suggesting that I
don't find it very intuitive. (And I *have* thought a lot about
argument unpacking -- see my older threads about *args being an iterable
instead of a tuple.)
STeVe
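The difference between the two readings can be made visible with a side-effect log. This is a sketch in modern Python 3; the names `log`, `eager` and `lazy` are illustrative and not from the thread.

```python
log = []


def gen():
    log.append("produce 1")
    yield 1
    log.append("produce 2")
    yield 2
    log.append("complete")


def eager(iterable):
    # Mirrors argument unpacking: the whole iterable is consumed first.
    for item in tuple(iterable):
        yield item


def lazy(iterable):
    # Mirrors the for-loop idiom: items are pulled one at a time.
    for item in iterable:
        yield item


first_eager = next(eager(gen()))
eager_log = list(log)

log.clear()
first_lazy = next(lazy(gen()))
lazy_log = list(log)
print(eager_log, lazy_log)
```

Both wrappers hand back the same first value, but the eager one has already run the inner generator to completion before the consumer sees anything.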
> If you do write a PEP, try to get genexp syntax supported by the
> yield keyword.
> That is, the following currently triggers a syntax error:
> def f():
> yield x for x in gen1(arg)
Wouldn't
yield *(x for x in gen1(arg))
be sufficient, and would already be supported by the proposal at
hand?
Also, with the syntax you suggest, it's not automatically clear
whether you want to yield the generator created by the generator
expression or the values yielded by the expression. The "*" makes
this much more explicit, if you ask me, without hindering readability.
|>oug
> Douglas Alan wrote:
>> Steve Holden <st...@holdenweb.com> writes:
>>>Guido has generally observed a parsimony about the introduction of
>>>features such as the one you suggest into Python, and in particular
>>>he is reluctant to add new keywords - even in cases like decorators
>>>that cried out for a keyword rather than the ugly "@" syntax.
>>
>> In this case, that is great, since I'd much prefer
>>
>> yield *gen1(arg)
>
> If you do write a PEP, try to get genexp syntax supported by the yield keyword.
>
> That is, the following currently triggers a syntax error:
> def f():
> yield x for x in gen1(arg)
Hmmmm.
At first I liked this, but the reason that is a syntax error is that it is
"supposed" to be
    def f():
        yield (x for x in gen1(arg))
which today on 2.4 returns a generator instance which will in turn
yield one generator instance from the genexp, and I am quite uncomfortable
with the difference between the proposed behaviors with and without the
parens.
Which sucks, because at first I really liked it :-)
We still would need some syntax to say "yield this 'in place' rather than
as an object".
Moreover, since "yield" is supposed to be analogous to "return", what does
return x for x in gen1(arg)
do? Both "it returns a list" and "it returns a generator" have some
arguments in their favor.
And I just now note that any * syntax, indeed, any syntax at all will
break this.
You know, given the marginal gains this gives anyway, maybe it's best
just to observe that, in the event that this is really important, it's
possible to hand-code the short-circuiting without too much work, and let
people write a recipe or something.
    def genwrap(*generators):
        # *generators arrives as a tuple, which has no append/pop
        generators = list(generators)
        while generators:
            try:
                returnedValue = generators[-1].next()
                if hasattr(returnedValue, 'next'):
                    generators.append(returnedValue)
                    continue
                yield returnedValue
            except StopIteration:
                generators.pop()
Not tested at all because the wife is calling and I gotta go :-)
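For reference, here is a runnable sketch of the same explicit-stack idea in modern Python 3. It is an adaptation and not the code as posted: it uses `next()` and an `isinstance(..., GeneratorType)` check in place of the `hasattr(..., 'next')` test, and the tuple-based tree encoding and `in_order` helper are made up for the example.

```python
from types import GeneratorType


def genwrap(*generators):
    # Explicit stack of active generators; the top of the stack is the
    # one currently being drained.
    stack = list(generators)
    while stack:
        try:
            value = next(stack[-1])
        except StopIteration:
            stack.pop()
            continue
        if isinstance(value, GeneratorType):
            # A yielded generator is descended into instead of passed on.
            stack.append(value)
        else:
            yield value


def in_order(tree):
    # tree is a (left, value, right) tuple or None (illustrative encoding).
    if tree is None:
        return
    left, value, right = tree
    yield in_order(left)
    yield value
    yield in_order(right)


tree = ((None, 1, None), 2, (None, 3, None))
flat = list(genwrap(in_order(tree)))
print(flat)
```

Here each value passes through exactly two yields (the node's own and `genwrap`'s) regardless of depth, at the cost of never being able to yield a generator as a plain value.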
Jeremy> def f():
Jeremy> yield (x for x in gen1(arg))
Jeremy> which today on 2.4 returns a generator instance which will in
Jeremy> turn yield one generator instance from the genexp, and I am
Jeremy> quite uncomfortable with the difference between the proposed
Jeremy> behaviors with and without the parens.
Jeremy> Which sucks, because at first I really liked it :-)
Jeremy> We still would need some syntax to say "yield this 'in place'
Jeremy> rather than as an object".
    def f():
        yield from (x for x in gen1(arg))
Skip
This suggestion had been made in a previous posting and it has my preference :
    def f():
        yield from gen1(arg)
Regards
Francis
It would, but, as Steven pointed out, the * in func(*args) results in
tuple(args) being passed to the underlying function.
So I see no reason to expect "yield *iterable" to imply a for loop that yields
the iterator's contents. IMO, it's even more of a stretch than the tuple
unpacking concept (at least that idea involves tuples!)
Whereas:
    yield x for x in iterable if condition
Maps to:
    for x in iterable:
        if condition:
            yield x
Just as:
    [x for x in iterable if condition]
Maps to:
    lc = []
    for x in iterable:
        if condition:
            lc.append(x)
And:
    (x for x in iterable if condition)
Maps to:
    def g():
        for x in iterable:
            if condition:
                yield x
And removing a couple of parentheses is at least as clear as adding an asterisk
to the front :)
And it would continue to do so in the future. On the other hand, removing the
parens makes it easy to write things like tree traversal algorithms:
    def left_to_right_traverse(node):
        yield x for x in node.left
        yield node.value
        yield x for x in node.right
In reality, I expect yielding each item of a sub-iterable to be more common than
building a generator that yields generators.
> , and I am quite uncomfortable
> with the difference between the proposed behaviors with and without the
> parens.
Why? Adding parentheses can be expected to have significant effects when it
causes things to be parsed differently. Like the example I posted originally:
    [x for x in iterable]    # List comp (no parens == eval in place)
    [(x for x in iterable)]  # Parens - generator goes in list
Or, for some other cases where parentheses severely affect parsing:
    print x, y
    print (x, y)
    assert x, y
    assert (x, y)
If we want to pass an iterator into a function, we use a generator expression,
not extended call syntax. It makes sense to base a sub-iterable yield syntax on
the former, rather than the latter.
> Moreover, since "yield" is supposed to be analogous to "return", what does
>
> return x for x in gen1(arg)
>
> do? Both "it returns a list" and "it returns a generator" have some
> arguments in their favor.
No, it would translate to:
    for x in gen1(arg):
        return x
Which is nonsense, so you would never make it legal.
> And I just now note that any * syntax, indeed, any syntax at all will
> break this.
As you noted, this argument is specious because it applies to *any* change to
the yield syntax - yield and return are fundamentally different, since yield
allows resumption of processing on the next call to next().
> You know, given the marginal gains this gives anyway,
I'm not so sure the gains will be marginal. Given the penalties CPython imposes
on recursive calls, eliminating the nested "next()" invocations could
significantly benefit any code that uses nested iterators.
An interesting example where this could apply is:
    def flatten(iterable):
        for item in iterable:
            if item is iterable:
                # Do the right thing for self-iterative things
                # like length 1 strings
                yield iterable
                raise StopIteration
            try:
                itr = iter(item)
            except TypeError:
                yield item
            else:
                yield x for x in flatten(item)
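A runnable approximation in modern Python 3 is sketched below, with two swapped-in choices that are not in the post: delegation is spelled `yield from` (PEP 380, Python 3.3+) rather than the proposed `yield x for x in ...`, and strings are treated as leaves through an explicit `isinstance` check instead of the identity test.

```python
def flatten(iterable):
    for item in iterable:
        if isinstance(item, (str, bytes)):
            # Swapped-in rule: treat strings as leaves so that iterating
            # a one-character string cannot recurse forever.
            yield item
            continue
        try:
            iter(item)
        except TypeError:
            # Not iterable: a leaf value.
            yield item
        else:
            # Delegate to the nested flatten until it is exhausted.
            yield from flatten(item)


flat = list(flatten([1, [2, [3, "ab"]], "c", (4,)]))
print(flat)
```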
Cheers,
Nick.
P.S. Which looks more like executable pseudocode?
    def traverse(node):
        yield *node.left
        yield node.value
        yield *node.right

    def traverse(node):
        yield x for x in node.left
        yield node.value
        yield x for x in node.right
> Doug> def foogen(arg1):
>
> Doug>     def foogen1(arg2):
> Doug>         # Some code here
>
> Doug>     # Some code here
> Doug>     yield_all foogen1(arg3)
> Doug>     # Some code here
> Doug>     yield_all foogen1(arg4)
> Doug>     # Some code here
> Doug>     yield_all foogen1(arg5)
> Doug>     # Some code here
> Doug>     yield_all foogen1(arg6)
>
> If this idea advances I'd rather see extra syntactic sugar introduced to
> complement the current yield statement instead of adding a new keyword.
You can work around the need for something like yield_all, or
explicit loops, by defining an "iflatten" generator, which yields
every element of its (iterable) argument, unless the element is a
generator, in which case we recurse into it:
>>> from types import GeneratorType
>>> def iflatten(it):
...     it = iter(it)
...     for val in it:
...         if isinstance(val, GeneratorType):
...             for v2 in iflatten(val):
...                 yield v2
...         else:
...             yield val
To take this one step further, you can define an @iflattened
decorator (yes, it needs a better name...)
>>> def iflattened(f):
...     def wrapper(*args, **kw):
...         for val in iflatten(f(*args, **kw)):
...             yield val
...     return wrapper
Now, we can do things like:
>>> @iflattened
... def t():
...     def g1():
...         yield 'a'
...         yield 'b'
...         yield 'c'
...     def g2():
...         yield 'd'
...         yield 'e'
...         yield 'f'
...     yield g1()
...     yield 1
...     yield g2()
...
>>> list(t())
['a', 'b', 'c', 1, 'd', 'e', 'f']
This can probably be tidied up and improved, but it may be a
reasonable workaround for something like the original example.
Paul.
--
The most effective way to get information from usenet is not to ask
a question; it is to post incorrect information. -- Aahz's Law
This is why even though in some sense I'd love to see yield *expr, I can't
imagine it's going to get into the language itself; it's too easy to do it
yourself, or provide a library function to do it (which would A: Be a lot
easier if we had some sort of "iterable" interface support and B: Be a
great demonstration of something useful that really needs protocol support
to come off right, because isinstance(something, GeneratorType) isn't
sufficient in general).
Abstractly I like the star syntax, but concretely I'm not a big fan of
adding something to the language that can be done right now with a fairly
short function/generator and hardly even any extra keystrokes to invoke it
when done right, and that overrides my abstract appreciation.
Paul> You can work around the need for something like yield_all,
Paul> or explicit loops, by defining an "iflatten" generator,
Paul> which yields every element of its (iterable) argument,
Paul> unless the element is a generator, in which case we recurse
Paul> into it:
Paul> ...
Only if you'd never want to yield a generator.
Regards,
Isaac.
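The objection is easy to demonstrate. In this sketch (modern Python 3) `iflatten` follows Paul's definition, rewritten with `yield from`, and `pairs` is a made-up generator whose items are meant to be generators.

```python
from types import GeneratorType


def iflatten(it):
    for val in it:
        if isinstance(val, GeneratorType):
            # Any generator value is expanded, wanted or not.
            yield from iflatten(val)
        else:
            yield val


def pairs():
    # The caller is supposed to receive two generator objects.
    yield (n for n in (1, 2))
    yield (n for n in (3, 4))


plain = [type(item).__name__ for item in pairs()]
flattened = list(iflatten(pairs()))
print(plain, flattened)
```

Once the wrapper is applied there is no way left to pass a generator through as an ordinary value, which a dedicated statement would avoid by making delegation explicit at each call site.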