
Help with coroutine-based state machines?


Alan Kennedy

May 29, 2003, 11:10:02 AM

Greetings all,

I've got a few questions about generators, coroutines, etc, that
I've been meaning to ask for several weeks. But I knew it would take
time to explain. So I left it until I had the time to write it up
properly. And it turned out that there was a lot more to it than I
originally foresaw, mainly because I wanted the concepts to be as
clear as possible. As I wrote it and rewrote it and rewrote it, it
quickly grew to be a length more suitable for an article than a
Usenet post. But because I was asking questions about stuff I didn't
know, an article wasn't appropriate.

So, the post/thing itself is given below. It's quite long, and a
little rambling in parts. You'll probably need some time available
to work through it all, if you're interested.

If any of the good people in the group can help me answer some of
the questions, I'll write this all up more concisely, and put it on
its own web page.

Thanks for any assistance you can offer.

Alan.

===============================================================
State machines.
---------------

I've been playing around with state machine paradigms, seeking an
optimum paradigm with three primary requirements in mind.

1. That the paradigm be as simple to code as possible.
2. That machines developed using the paradigm be as flexible as
possible, and ideally inheritable.
3. That the paradigm be as resource-efficient as possible (this is
less important than the other two).

"Traditional" Paradigms.
------------------------

I played around with a few traditional state-machine paradigms, the
kind you see posted, for example, in the Cookbook. They were mostly
based on declaring machines or even individual states as objects,
and sometimes requiring explicit declaration of all state
transitions and events.

To me, most of them seemed difficult to work with, because of the
need to modify up to 3 or 4 different variables (in different
sections of the code) in order to make the simplest modification to
a state or event. While I recognise the benefit of such approaches
in more formal or rigorous circumstances, I needed more
flexibility.

So, none of the more "traditional" paradigms (that I found) really stood
out as what I was looking for in terms of simplicity and
expressiveness.

Generators and state machines.
------------------------------

But then I came across David Mertz's excellent article describing the
use of generators to implement state machines, and I was bigtime
impressed.

http://www-106.ibm.com/developerworks/linux/library/l-pygen.html

This article was the "dawning point" for me, when (I think) I
finally "got" generators, and the flexibility that they offer.

If you haven't read David's article before, I recommend that you do
if you are going to continue reading this post, because I'll be
continually referring to the generator-based state machine
implementation in that article.

Simplifying the model.
----------------------

However, once I did finally "get it", when I looked back at the
example code in the article, I found myself easily confused by the
code. Trying to get my head around the important difference between
the generator-function and the generator-iterator, while also trying
to follow the dictionaries and tuples needed for dispatch, mixed in
with a non-generator function (using classic 'return's) for state
classification and namespace declarations to access data, left me a
little spun out.

So I decided to try and simplify the model a little. Code for a full
working example (but a different state machine) is given below. I
left it all in one piece rather than split it up and sprinkle it
through the discussion points. I have included code snippets where
it is necessary to illustrate a point.

Eliminating global data.
------------------------

The first thing that I found confusing was the use of global data.
Instead, it seemed more natural to me to encapsulate the state
machine in a class, and to use instance variables of objects of that
class to contain instance data. That way, no namespace (i.e.
"global") declarations would be required: the data could be simply
accessed through the self object.

Cleaning up the dispatch table.
-------------------------------

When I wrapped it all in a class, and removed the "cargo" and
namespace declarations, I realised that the generators were all
yielding a simple string, containing the name of the next state,
like so

yield 'newstate'

This would then be looked up in a "jump table", i.e. a dictionary
mapping the state names to generator-iterators representing the
states, like so

while 1:
    newstate = jumptable[newstate].next()

This jumptable, containing the instantiated generator-iterator
objects, would be set up at start time.

This seemed to me superfluous, since I had already named all of the
states, through the names of the generator-functions themselves: why
should I need a string to identify them as well? Also, having a jump
table means that I have to maintain state names in three different
places: the name of the generator-function, the name of the state in
the jump-table and when instantiating the generator-iterator.

So, if I wanted to do away with the jumptable, I would have to yield
the generator-iterator for the next state directly to the
dispatcher, and have the dispatcher call its .next() method. But how
would I refer to the generator-iterator, when the state's name in the
instance dictionary (self) referred to the generator-function, not a
generator-iterator? I would have to come up with some way to
directly yield the generator-iterator object for each state, while
accessing them using their generator-function names. A state change
could then be declared as

yield newstate

And dispatching would then simply become

while 1:
    newstate = newstate.next()

Sweet. Much simpler :-)

Turning generator-functions into generator-iterators.
-----------------------------------------------------

In order to do away with the jump table, I decided that all names
(in the instance dictionary) which referred to generator-functions
be changed so that they referred instead to the corresponding
instantiated generator-iterators. This would mean that state changes
would be indicated by a statement like

yield self.newstate

To do this, here is the little bit of hackery that I came up with:
at __init__ time of the state-machine object, it modifies the
instance dictionary, replacing generator-functions with their
instantiated generator-iterators.

def __init__(self):
    statenames = ['state1', 'state2', 'state3', 'state4']
    for name in statenames:
        # Turn each generator-function into an instantiated
        # generator-iterator
        self.__dict__[name] = eval('self.%s()' % name)

# Obviously, there must be a generator-function in the class
# corresponding to each element of the statenames list, e.g.

def state1(self):
    while 1:
        if self.readytogo:
            yield self.state2
        else:
            yield self.state3


This turns every reference to a generator-function into a reference
to the corresponding instantiated generator-iterator.

Differentiating generator-functions from functions.
---------------------------------------------------

I'm not happy with the way that I have to explicitly list the state
names. I tried to find a way to differentiate between functions and
generator-functions while reflecting on the instance's dictionary.
Pseudo code for this would look like

import types

for k in self.__dict__.keys():
    if type(self.__dict__[k]) is types.GeneratorType:
        self.__dict__[k] = eval('self.%s()' % k)

But there isn't a generator type in the types module? So, that's a
question I'd like to ask: can anyone tell me how to discover the
generator-functions of an object, and differentiate them from
ordinary functions?

A full working example.
-----------------------

Now's the time for a full example. In order to simplify it, I chose
a different state machine from that in David's article. It's a very
simple state machine: one that counts from one number to another in
a specified increment, and then stops. The paradigm is capable of a
lot more than that (I've just used it very successfully to control
multiple independent automated users in a load testing simulation,
for example): but I want this example to be short and concise.

#----------------------------------------------
# This code is also available at
# http://xhaus.com/alan/python/FSMgenpy.html
#

class FSM:

    def __init__(self, startnum, incrnum, stopnum):
        for name in ['idle','start','increment','checkfinished','finished']:
            self.__dict__[name] = eval('self.%s()' % name)
        self.startnum = startnum
        self.incrnum = incrnum
        self.stopnum = stopnum
        self.counter = 0
        self.exit = 'exitsentinel'

    def idle(self):
        while 1:
            # We could wait for some condition here.
            print "Idle state: %d" % self.counter
            yield self.start

    def start(self):
        while 1:
            self.counter = self.startnum
            print "Start state: %d" % self.counter
            yield self.increment

    def increment(self):
        while 1:
            self.counter = self.counter + self.incrnum
            print "Increment state: %d" % self.counter
            yield self.checkfinished

    def checkfinished(self):
        while 1:
            if self.counter >= self.stopnum:
                yield self.finished
            else:
                yield self.increment

    def finished(self):
        while 1:
            print "Finished state: %d" % self.counter
            yield self.exit

    def dispatch(self, startstate):
        newstate = startstate
        while 1:
            next = newstate.next()
            if next == self.exit:
                return
            else:
                newstate = next

if __name__ == "__main__":
    m = FSM(1,5,50)
    m.dispatch(m.idle)
    m.dispatch(m.idle)
#----------------------------------------------

Pausing or stopping from the state machine.
-------------------------------------------

You will notice that I have slightly modified the dispatch function,
compared to the one described above. It now looks for a sentinel
value signifying exit from the state machine. This is because the
only way in which the state machine can suspend operation and
relinquish control is to return from a non-generator function. If
any form of return is attempted from inside a generator-iterator, or
an exception such as a StopIteration() is raised, then that
generator-iterator will be destroyed and the state-machine will stop
working. Furthermore, it won't be possible to create a new
generator-iterator, because we've overwritten the reference to the
generator-function in the instance dictionary! Hmm, maybe that is
too much a hack :-| Anyway.

The addition of the sentinel value for exiting or suspending the
state machine means that we can now suspend and resume the machine
at will. It can be suspended by yielding a "suspend" state (which
would be a simple string sentinel value to the dispatcher), and then
resumed by calling the dispatch method with a new or saved starting
state. The example "main" function given above runs the example
state machine through its paces twice. When you run it, take a look
at the value of the counter in the "idle" state, and how it differs
between invocations.

Combining with threads.
-------------------------

In the example usage I described above, where I used this paradigm
to control automated testers, each automated tester ran in its own
thread. Each had a dedicated input Queue.Queue, through which it
would receive commands, which would then be carried out by setting
instance data and transferring to the relevant state.

In order to check for messages in the incoming queue, at least one
of the states must check whether there are any entries in the queue. In
simple cases, this can be a blocking wait, where the
Queue.Queue.get() does not return until a message is available.

If you need the machine to be going off doing tasks in its parallel
thread, but still checking for messages, e.g. stop commands, then a
non-blocking .get(), carried out at a regular interval, is
recommended. However, making the time intervals too short can
significantly consume your cpu cycles. Also, I found that
threading.Condition() variables were very useful as a mechanism for
synchronisation between threads, and integrated well with the
"resumable function" concept.

Paradigm found.
---------------

To me, this paradigm of state machines is simplicity itself. Thus,
it fulfills my primary goal of having a simple paradigm: I really
can't see any way for the code to be simpler, in textual terms at
least. Furthermore, because all of the parameter parsing involved in
function calls is avoided, it should also be more cpu-efficient than
a "procedurally" based model. (Actually, I haven't done any timing
to validate this, but I can't see how the generator version could be
slower?)

Parameter passing.
------------------

However, this avoidance of parameter parsing has a downside: you
can't pass parameters between states :-) That would be a problem in
most real-world state-machines, which would most likely be
processing external data streams of some form. The two ways that I
can think of for getting data into the state-machine are

1. Sending parameters wrapped up in a message object on a
Queue.Queue. This would be recommended in a multi-threaded
situation.
2. By setting instance variables of the state-machine object
directly. This could be dangerous in a multi-threaded situation.
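
As a sketch of option 1, here is a hypothetical message object; the
Command class, the 'count' command name and the inbox queue are all
invented for illustration:

import Queue

class Command:
    # A hypothetical message object carrying parameters into the machine.
    def __init__(self, name, **params):
        self.name = name
        self.params = params

inbox = Queue.Queue()
inbox.put(Command('count', startnum=1, incrnum=5, stopnum=50))

# A receiving state would block on self.inbox.get(), copy the
# parameters onto the instance, and then yield the state that uses
# them, e.g.:
#     cmd = self.inbox.get()
#     self.startnum = cmd.params['startnum']
#     yield self.start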

I am aware of Raymond Hettinger's PEP 288: Generator Attributes, and
now appreciate the problem that it is trying to solve. And I also
appreciate the "cargo" problem David was trying to solve in his
article. If anyone has any improved methods for passing parameters in
the paradigm I've described above, I'd like to hear about them.

Generator Gestalt.
------------------

It was only while stepping through the execution path of the code
above that I finally understood some of the sophisticated
flow-control mechanisms that generators offer: if you temporarily
"forget" that there is a dispatcher function involved, you see the
thread of execution far more naturally: named units of python code
where the data and the code that manipulates it are very closely
associated. The flow of control passes seamlessly back and forth
between the named units. There is no stack giving you an ancestry or
history of invocations. Each "state" in the state-machine
implementation really *is* an encapsulated, resumable state.

The "temporal locality" of the code (as opposed to the "spatial
locality" of traditional code) greatly aids readability, and
modifiability, IMHO. The code seems more natural, more
"encapsulated", because data is generally only modified in code that
is "close" to the declaration of the state machine "event" to which
it relates. Also, the ease with which the flow between states can be
modified (i.e. no mucking with state or transition declarations,
lookup tables or parameter values) greatly increases the flexibility
of the code.

It also seems to me that this paradigm should have significant
cpu-efficiency advantages over other paradigms, and potentially make
python-implemented algorithms cpu-competitive with statically typed
and compiled languages based on a traditional function paradigm,
such as Java or C.

At the beginning, when writing real state machines with the above
paradigm, I found myself getting weird bugs and subtle deviations
from expected behaviour. For example, a counter might start at 2,
rather than an expected, recently initialised, 1. But mostly they
weren't bugs at all: they were inherent flaws I designed into the
state-machine. Sometimes when writing and debugging the code, I
mistakenly assumed that the code was being entered from the top each
time, i.e. directly after the while statement, as it would be in
"procedural" code. But rather, the real situation was that the code
was being resumed in the statement immediately after the last
executed yield statement. Once I had discarded this last
preconception, it all fell into place. (Anybody know a Buddha smilie?
O:-)

Generators: powerful stuff, to be sure, to be sure. Although I know
there was at some stage controversy about the rate at which new
features were/are being introduced into python, I'm convinced that
advanced features like generators are a necessary step for python to
remain competitive in conceptual terms, compared to other languages.
New concepts in computer hardware, such as massively parallel
computing or hyperthreading, and new classes of computational
problems such as protein folding, will require new conceptual
approaches from computer languages. Generators are those concepts,
or are at least the precursors of the future concepts. (I hope that
one day I will also "get" Continuations, and Co-Routines, and am now
at least in the position of having identified some of what I don't
know about Stackless Python %~} <= my brain is seriously starting to
wobble.......

Sorry, wrong meeting ;-) For a moment there, I thought I was at the
"Stackless Revolution" meeting, which is on *Tuesday* nights, down
the docks.......

Ah, the Zen of Python :-)

Inheritance.
------------

Lastly, about my final requirement: that my state-machine
paradigm should support extensibility, or ideally inheritance.
Although I haven't yet tried to do it, I imagine that subclassing
these state-machine classes should work perfectly well, as long as
the instantiation of generator-functions to generator-iterators is
properly managed.
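
Though I haven't tested it, here is a sketch of what that subclassing
might look like, given the FSM class above (VerboseFSM is an invented
name): because __init__ instantiates each generator-iterator via a
lookup on self, an overridden state generator-function is picked up
automatically.

class VerboseFSM(FSM):
    def finished(self):
        # Overrides the parent state; FSM.__init__ instantiates this
        # version because eval('self.finished()') resolves through self.
        while 1:
            print "All done: counted from %d to %d" % \
                  (self.startnum, self.counter)
            yield self.exit

# m = VerboseFSM(1, 5, 50)
# m.dispatch(m.idle)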

Request for comments.
---------------------

As I mentioned above, there are a number of questions in this post
which prevent it from being an explanatory article about this
paradigm. However, if (as I hope :-) the great and knowledgeable
people in this group help me iron out the wrinkles, I'll
write it up and post it on its own page.

And finally: infinite loops.
----------------------------

Another question I just thought of: Each of the state
generator-functions must be wrapped inside an infinite while loop,
so that the states never "run out of values". These while loops
clutter the code somewhat, and it would be nice to not need them. I
wonder if it would be possible to remove them? On second thoughts,
maybe not. At least not without fundamentally changing the meaning
of the yield keyword (is that what "yield every" is for?). Or maybe
some incantation with __metaclasses__?

And really finally: notification of state change.
-------------------------------------------------

Since I've still got your attention ;-) another thing that I plan to
address, perhaps with __metaclasses__, is sending notification about
state changes to registered objects, without having to stick in
explicit callback notifications in every state. This could be simply
done by putting a name attribute on every generator-iterator, and
notifying the notifiable objects in the dispatch loop. So the
__init__ would be like this

statenames = ['state1', 'state2', 'state3', 'state4']
for state in statenames:
    self.__dict__[state] = eval('self.%s()' % state)
    self.__dict__[state].name = state

and the dispatch would become

while 1:
    next = newstate.next()
    for o in self.notifiableobjects:
        o.notifystatechange(next.name, self)
    newstate = next

But it would be cooler to use __metaclasses__ B-)
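
One wrinkle with the snippets above: current CPython
generator-iterators don't accept attribute assignment (that limitation
is part of what PEP 288 is about), so setting a .name on them would
raise an AttributeError. A minimal sketch of the same idea using a
side table that maps each iterator to its name, reusing the invented
names above:

statenames = ['state1', 'state2', 'state3', 'state4']
self.statename = {}
for state in statenames:
    genit = getattr(self, state)()
    setattr(self, state, genit)
    self.statename[genit] = state   # iterator -> name, instead of .name

# ...and the dispatch loop looks the name up in the table:
while 1:
    next = newstate.next()
    if next == self.exit:
        return
    for o in self.notifiableobjects:
        o.notifystatechange(self.statename[next], self)
    newstate = next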

Really really finally: event-driven processing loops.
-----------------------------------------------------

And for the last moebius-twist of all, I've been trying to picture
how I could integrate state-machines like this with an asynchronous
event-driven server framework like twisted, medusa or asyncore. I
read some posts about "sources" and "sinks", but methinks it may be
more complex than that. Particularly, it is the "pseudo-threading"
which I don't quite get. Any pointers to relevant
articles/emails/postings/sourcecode greatly appreciated.

No, really, really finally: light bulb switches on...
-----------------------------------------------------

A very interesting thought just struck me.

The state changes in the examples above always 'yield'ed the next
state from the current state machine, i.e. 'yield self.newstate'.
But if there were multiple objects all operating this state machine
paradigm, they could yield back and forth to each other, simply by
saying "yield other.newstate" (obviously they must hold a reference
to 'other').

If 'other's dispatching mechanism operated by the same principle,
and was capable of suspending the state machine through 'return'ing
a value instead of generating the next state, then input could be
processed by being fed into one state-machine object and the final
processed output being returned from another connected state-machine
object. So I could build a network of mutually 'yield'-ing state
machines to process a stream of input.
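
A minimal sketch of that idea, with two invented classes whose
dispatcher follows the same convention as above (a string sentinel
suspends the network):

class Ping:
    def __init__(self):
        self.state = self.run()         # instantiate the gen-it
    def run(self):
        while 1:
            print "ping"
            yield self.other.state      # transfer to the other machine

class Pong:
    def __init__(self):
        self.state = self.run()
    def run(self):
        count = 0
        while count < 3:
            print "pong"
            count = count + 1
            yield self.other.state
        yield 'exit'                    # sentinel suspends the network

ping, pong = Ping(), Pong()
ping.other, pong.other = pong, ping

state = ping.state
while 1:
    next = state.next()
    if next == 'exit':
        break
    state = next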

Also, the interim states in the state-machine network could be
suspended when no input was available, by 'yield'ing a
generator-iterator which generates the resume state, when its
.next() method is called. This would have the effect of resuming the
state machine network directly in place from where it 'yield'ed,
when the input becomes available. I think I see the beginnings of
the answer to my asyncore question. O-) <= brain settles back down
to a bose-einstein state of calm awareness

Ah! So that's what full coroutines are? One or more objects,
'yield'ing control back and forth between a set of resumable states,
without the need for a dispatcher function? Which would mean that
you could call a "function" on one object and get the return value
from a different "function" on the same object, or even a different
"function" on a different object. Which could make a right mess of a
traditional function call stack. Which is why Stackless Python is as
it is. Ah, the soft chink of the penny dropping.......why is my head
suddenly full of Feynman diagrams.......?

Which is why python generators are only "semi"-coroutines, as
opposed to full coroutines. Because, without a dispatcher function,
you can only yield control to the calling function, rather than any
resumable state that you have a reference to. So how would the
interface between "procedural" and coroutine code work in the full
coroutine world? Presumably, there would be no difference between a
generator-function and a generator-iterator: or at least, there
would be no need to explicitly instantiate the generator-iterator?
And they could be passed parameters on each invocation? And the
top-of-stack data structure would have to be designed to represent
the free flow back and forth between resumable states of multiple
objects. And generator-iterators would not be "broken" by
'return'ing a value or raising an exception? Implementation
question: Why does this asymmetry exist in the current (c)python
generator implementation? It seems to me that it should be
reasonably straightforward to get around.........

Continuations.
--------------

If I've got it right about coroutines, then what is the last step of
understanding required for Continuations?

What do continuations offer that coroutines don't?

Is it that continuations allow resumption of states across thread
boundaries?

Thanks for reading this far through my ramblings :-)

kind regards,

--
alan kennedy
-----------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan: http://xhaus.com/mailto/alan

Terry Reedy

May 29, 2003, 2:59:53 PM

"Alan Kennedy" <ala...@hotmail.com> wrote in message
news:3ED622CA...@hotmail.com...

> Turning generator-functions into generator-iterators.
> -----------------------------------------------------
>
> In order to do away with the jump table, I decided that all names
> (in the instance dictionary) which referred to generator-functions
> be changed so that they referred instead to the corresponding
> instantiated generator-iterators. This would mean that state changes
> would be indicated by a statement like
>
> yield self.newstate

I very much like the idea of factoring out the unneeded redundancy of
representing states by name strings instead of by the state objects
themselves.

> To do this, here is the little bit of hackery that I came up with:
> at __init__ time of the state-machine object, it modifies the
> instance dictionary, replacing generator-functions with their
> instantiated generator-iterators.

I found this somewhat misleading when trying to understand the code.
The generator functions, as instance methods, are class attributes in
the *class* dictionary. They should typically be called exactly once
for each instance of the machine, with that instance as the argument.
The obvious place to do this is in .__init__(). So I would not call
this hackery.

The resulting gen-it objects, being instance specific, belong and get
put in the *instance* dict. Again, this is not hackery. Because you
used the same names, they overshadow rather than replace their
namesakes, which continue to sit, unchanged, in the class dict. So


> Furthermore, it won't be possible to create a new
> generator-iterator, because we've overwritten the reference to the
> generator-function in the instance dictionary!

is incorrect. You can still access FSM.idle as exactly that, even
from the same instance.

This shadowing could be called hackery, but it is unnecessary. Give
either the gen-funcs or the gen-its an affix to differentiate the two
types. For instance, give the former a 'g_' prefix, as in " def
g_idle(self): ...", and add the prefix to the calls, so that


> self.__dict__[name] = eval ('self.%s()' % name)

becomes
self.__dict__[name] = eval ('self.g_%s()' % name)

There is also a little more redundancy to squeeze out. You only need
the gen-its to access their .next methods and the lookup only needs to
be done once, in .__init__, rather than with each state switch. So
you could move '.next' from .dispatch to .__init__ to get

self.__dict__[name] = eval ('self.g_%s()' % name).next
and
next = newstate()
(instead of newstate.next()). In other words, represent states by the
corresponding .next methods instead of the gen-its that have those
methods. Since the .next functions do not have parameters, not even
self (since self is bound within by the gen-func call), they also
belong in the instance dict.

Terry J. Reedy


David Mertz

May 29, 2003, 4:54:37 PM

Alan Kennedy <ala...@hotmail.com> wrote previously:

|yield 'newstate'
|This would be then looked up in a "jump table", i.e. a dictionary
|mapping the state names to generator-iterators representing the
|states, like so
|while 1:
| newstate = jumptable[newstate].next()
|This seemed to me superfluous, since I had already named all of the
|states, through the names of the generator-functions themselves

In retrospect, I agree with Kennedy. In my book, I have revised the
state machine examples to use the state objects directly, rather than
look up their names in a jump table.

That said, the reason I had once done it the way I did was because I had
an example that let users select a jump target, e.g. by typing its name
at a prompt (or a GUI widget with state items would work, although I
never presented that). Users enter strings not object identities. And
likewise, strings (names of states) might come out of text processing,
e.g. reading a file that contains flow instructions.


Carel Fellinger

May 29, 2003, 6:08:58 PM

On Thu, May 29, 2003 at 04:10:02PM +0100, Alan Kennedy wrote:

... a lot of fine things snipped

> To do this, here is the little bit of hackery that I came up with:
> at __init__ time of the state-machine object, it modifies the
> instance dictionary, replacing generator-functions with their
> instantiated generator-iterators.
>
> def __init__(self):
>     statenames = ['state1', 'state2', 'state3', 'state4']
>     for name in statenames:
>         # Turn each generator-function into an instantiated
>         # generator-iterator
>         self.__dict__[name] = eval('self.%s()' % name)

I don't see the hackery here, and maybe you won't either :) if you
replace __dict__ and eval with get/setattr calls, like:

    setattr(self, name, getattr(self, name)())

see, it looks like ordinary python code now :)

...
> Differentiating generator-functions from functions.
...


> question I'd like to ask: can anyone tell me how to discover the
> generator-functions of an object, and differentiate them from
> ordinary functions?

the simplest and best thing to do would be to distinguish them on name, like:

all generator function names have to start with "g_"

you could also use the inspect module to get the source of each function
and look for "yield", or play with the code object, but how do you know
that every generator so found is meant to be used as a state-iterator?
It could well be some kind of helper generator.

...


> working. Furthermore, it won't be possible to create a new
> generator-iterator, because we've overwritten the reference to the
> generator-function in the instance dictionary! Hmm, maybe that is
> too much a hack :-| Anyway.

Ah, I see why you thought it hackery, but be assured: you didn't :)
The generator-function is still there as a method in the class
dictionary, you merely shadowed it by binding the generator-iterator
to the same name in the instance of the class.


> The addition of the sentinel value for exiting or suspending the
> state machine means that we can now suspend and resume the machine
> at will. It can be suspended by yielding a "suspend" state (which

I haven't really thought it out, but couldn't you use an instance flag
for this and test on it instead of those "while 1:" loops? besides...

...


> The "temporal locality" of the code (as opposed to the "spatial
> locality" of traditional code) greatly aids readability, and
> modifiability, IMHO. The code seems more natural, more

...right...

> And finally: infinite loops.
> ----------------------------
>
> Another question I just thought of: Each of the state
> generator-functions must be wrapped inside an infinite while loop,
> so that the states never "run out of values". These while loops
> clutter the code somewhat, and it would be nice to not need them. I

...wrong:) To me at least, the presence of those infinite loops
highlights that we're dealing with "temporal locality" here, so
I would leave them in.


--
groetjes, carel

Steven Taschuk

May 29, 2003, 5:40:08 PM

Quoth Alan Kennedy:
[...]

> def __init__(self):
>     statenames = ['state1', 'state2', 'state3', 'state4']
>     for name in statenames:
>         # Turn each generator-function into an instantiated
>         # generator-iterator
>         self.__dict__[name] = eval('self.%s()' % name)
[...]

> I'm not happy with the way that I have to explicitly list the state
> names. I tried to find a way to differentiate between functions and
> generator-functions while reflecting on the instance' dictionary.

I'd just use a naming convention.

def __init__(self):
    for name in self.__class__.__dict__.keys():
        if name.startswith('state_'):
            setattr(self, name[6:], getattr(self, name)())

Then

def state_foo(self):
    ...

causes an iterator to be added as self.foo.

[...]
> class FSM:
[...]


> def idle(self):
>     while 1:
>         # We could wait for some condition here.
>         print "Idle state: %d" % self.counter
>         yield self.start
>
> def start(self):
>     while 1:
>         self.counter = self.startnum
>         print "Start state: %d" % self.counter
>         yield self.increment

[...]

All your state functions are of this form, I note:

while 1:
    # do some calculation, determining next_state
    yield next_state

There's not much point to using generators in such a case, imho;
just use normal functions and return the next state:

class Squares(object):
    def __init__(self, start, end):
        self.startnum = start
        self.endnum = end
    def start(self):
        self.counter = self.startnum
        return self.increment
    def increment(self):
        print self.counter**2
        self.counter += 1
        if self.counter >= self.endnum:
            return None
        else:
            return self.increment
    def run(self):
        for x in self.iterrun():
            pass
    def iterrun(self):
        # *This* is where a generator is handy.
        self._state = self.start
        while self._state is not None:
            self._state = self._state()
            yield None  # or something more useful

(This gives suspend/resume, btw; for example

machine = Squares(3, 100).iterrun()
while 1:
    for i in range(5):
        machine.next()
    # do other stuff, possibly halt, etc.

pauses the machine every five state changes to do something else.)

Moreover, when the state functions are generators, you can write
"states" such as

def foo(self):
    for nextstate in [self.state0, self.state1, self.state2]:
        yield nextstate

But this is not a state in the normal sense of "state machine";
each time you reach this "state", the machine's behaviour is
different, due solely to differences in the history of execution
(instead of due to, e.g., differences in input, which would be
legitimate). The real state of the machine is the combination of
the states of all of the iterators.

Besides, one of the joys of generators is that they let you *get
rid of* state machines. The whole machine above is better as

def squares(start, end):
    for i in range(start, end):
        yield i**2

for val in squares(3, 6):
    print val

(Note that you get suspend/resume for free here too.) The states
are now provided by the normal control flow structures of the
language, which is much much easier to read.

[...]


> Which is why python generators are only "semi"-coroutines, as
> opposed to full coroutines. Because, without a dispatcher function,
> you can only yield control to the calling function, rather than any

> resumable state that you have a reference to. [...]

In part, yes. But even with the dispatcher, you can't call
another function and have *it* transfer control to another
coroutine, suspending the whole call stack. (Not transparently,
anyway; it could be done by replacing all calls to functions with
yields to a dispatcher.)

[...]
--
Steven Taschuk stas...@telusplanet.net
"What I find most baffling about that song is that it was not a hit."
-- Tony Dylan Davis (CKUA)

Neil Schemenauer

May 30, 2003, 1:33:56 AM

Alan Kennedy <ala...@hotmail.com> wrote:
> Differentiating generator-functions from functions.
> ---------------------------------------------------

> But there isn't a generator type in the types module?

Is that a question? :-) Seriously, there is no generator type
because a generator-function has the same type as a function.

> So, that's a question I'd like to ask: can anyone tell me how
> to discover the generator-functions of an object, and
> differentiate them from ordinary functions?

Here's one way:

>>> def f():
...     pass
...
>>> def g():
...     yield 1
...
>>> f.func_code.co_flags & 0x20
0
>>> g.func_code.co_flags & 0x20
32

I don't recommend using it though. The CO_GENERATOR (0x20)
constant is not available from Python code, AFAIK.
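
If you wanted to wrap that test up anyway, a small helper might look
like the following; the 0x20 value is hard-coded from CPython's
Include/code.h since, as noted, the constant isn't exported to Python:

CO_GENERATOR = 0x20   # CPython's internal flag for generator-functions

def isgenfunc(func):
    # True for generator-functions, False for ordinary functions.
    return bool(func.func_code.co_flags & CO_GENERATOR)

# isgenfunc(f) -> False; isgenfunc(g) -> True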

One final parting shot. From the caller's point of view, there
is essentially no difference between:

def f():
    return iter([1, 2, 3])

and:

def g():
    for i in [1, 2, 3]:
        yield i

or even:

def f2():
    return g()

Cheers,

Neil

Yair Chuchem

May 30, 2003, 4:54:07 AM

How about this implementation of your state-machine, without the cluttering
of the while loops, and as a side effect, without use of generators too:


### Code snippet start

class FSM:

    def __init__(self, startnum, incrnum, stopnum):
        self.startnum = startnum
        self.incrnum = incrnum
        self.stopnum = stopnum
        self.counter = 0
        self.exit = 'exitsentinel'

    def idle(self):
        # We could wait for some condition here.
        print "Idle state: %d" % self.counter
        return self.start

    def start(self):
        self.counter = self.startnum
        print "Start state: %d" % self.counter
        return self.increment

    def increment(self):
        self.counter = self.counter + self.incrnum
        print "Increment state: %d" % self.counter
        return self.checkfinished

    def checkfinished(self):
        if self.counter >= self.stopnum:
            return self.finished
        else:
            return self.increment

    def finished(self):
        print "Finished state: %d" % self.counter
        return self.exit

    def dispatch(self, startstate):
        newstate = startstate
        while newstate != self.exit:
            newstate = newstate()

if __name__ == "__main__":
    m = FSM(1,5,50)
    m.dispatch(m.idle)
    m.dispatch(m.idle)

### Code snippet end


Is there something I'm missing?

Yair

Alan Kennedy

May 30, 2003, 6:47:12 AM

Terry Reedy wrote:

> I very much like the idea of factoring out the unneeded redundancy of
> representing states by name strings instead of by the state objects
> themselves.

Thanks for all your great points, Terry.

Indeed, factoring them out made this one case simpler for me to understand. But
as the good Dr. Mertz has pointed out in another message, there are many real
world situations where having a jump-table (maybe "dispatch-table" is a better
term?) indexed by instances of your input data tokens is the best approach in
terms of clarity and efficiency.

> The generator functions, as instance methods, are class attributes in
> the *class* dictionary. They should typically be called exactly once
> for each instance of the machine, with that instance as the argument.
> The obvious place to do this is in .__init__(). So I would not call
> this hackery.

Thanks for pointing out my misunderstanding. I only realised my mistake when I
went to rewrite the code to incorporate some of the suggested improvements.

When I actually went to iterate through the instance dictionary looking for
gen-funcs, I found there were no funcs there at all! They were of course in the
class dictionary. So, combining this with

1. Your other great tip of storing the gen-it.next methods (not the gen-its
themselves) in the instance dictionary,
2. Carel Fellinger's nicer setattr/getattr call instead of the eval(),
3. Using a naming convention to differentiate state gen-funcs from other
gen-funcs and funcs

The __init__ method now looks like

def __init__(self):
    for name in self.__class__.__dict__.keys():
        if type(self.__class__.__dict__[name]) is types.FunctionType \
           and name[:2] == 's_':
            setattr(self, name, getattr(self, name)().next)

> The resulting gen-it objects, being instance specific, belong and get
> put in the *instance* dict. Again, this is not hackery. Because you
> used the same names, they overshadow rather than replace their
> namesakes, which continue to sit, unchanged, in the class dict. You
> can still access FSM.idle as exactly that, even from the same instance.

This is now completely clear to me, many thanks. This means that if any of the
gen-its gets destroyed, it can be recreated. So it would probably be better to
factor out the gen-it creation into a ".reset()" function, which could be called
between invocations of the state machine, as well as from the __init__.
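
A sketch of that .reset() refactoring, using the same "s_" naming
convention as the updated code below (and assuming "import types"):

def reset(self):
    # Called from __init__, and between runs of the machine.
    for name, obj in self.__class__.__dict__.items():
        if type(obj) is types.FunctionType and name[:2] == 's_':
            # Instantiate from the class dict entry: the instance
            # attribute may already shadow it with a stale iterator.
            setattr(self, name, obj(self).next)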

> There is also a little more redundancy to squeeze out. You only need
> the gen-its to access their .next methods and the lookup only needs to
> be done once, in .__init__, rather than with each state switch. So
> you could move '.next' from .dispatch to .__init__ to get
>
> self.__dict__[name] = eval ('self.g_%s()' % name).next
> and
> next = newstate()
> (instead of newstate.next()). In other words, represent states by the
> corresponding .next methods instead of the gen-its that have those
> methods. Since the .next functions do not have parameters, not even
> self (since self is bound within by the gen-func call), they also
> belong in the instance dict.

Thanks for that last remaining contraction, which more cleanly describes the
actuality of dispatching, in the coroutine sense. This makes the dispatch loop
even cleaner, especially when combined with Carel Fellinger's tip of checking an
instance flag for the exit condition on the loop, instead of looping infinitely:

def dispatch(self, startstate):
    state = startstate
    while state != self.s_exit:
        state = state()

Ahhh, sublime! (Still seeking that "Zen of Python" smilie, if anyone has one?)

The updated version of the code then becomes

#----------------------------------------------
# This code is also available at
# http://xhaus.com/alan/python/FSMgen2py.html
#

import types

class FSM:

    def __init__(self, startnum, incrnum, stopnum):
        for name in self.__class__.__dict__.keys():
            if type(self.__class__.__dict__[name]) is types.FunctionType \
               and name[:2] == 's_':
                setattr(self, name, getattr(self, name)().next)
                #self.__dict__[name] = eval('self.%s()' % name).next
        self.startnum = startnum
        self.incrnum = incrnum
        self.stopnum = stopnum
        self.counter = 0
        self.s_exit = 'exitsentinel'

    def s_idle(self):
        while 1:
            # We could wait for some condition here.
            print "Idle state: %d" % self.counter
            yield self.s_start

    def s_start(self):
        while 1:
            self.counter = self.startnum
            print "Start state: %d" % self.counter
            yield self.s_increment

    def s_increment(self):
        while 1:
            self.counter = self.counter + self.incrnum
            print "Increment state: %d" % self.counter
            yield self.s_checkfinished

    def s_checkfinished(self):
        while 1:
            if self.counter >= self.stopnum:
                yield self.s_finished
            else:
                yield self.s_increment

    def s_finished(self):
        while 1:
            print "Finished state: %d" % self.counter
            yield self.s_exit

    def dispatch(self, startstate):
        state = startstate
        while state != self.s_exit:
            state = state()

if __name__ == "__main__":
    m = FSM(1,5,50)
    m.dispatch(m.s_idle)
    m.dispatch(m.s_idle)
#----------------------------------------------

Thanks to all who took the time to help me understand.

kind regards,

Alan Kennedy

May 30, 2003, 7:56:47 AM

Yair Chuchem wrote:

> How about this implementation of your state-machine, without the
> cluttering of the while loops, and as a side effect, without use
> of generators too:

[extremely clear example elided]

> Is there something I'm missing?

No, obviously something I'm missing :-)

Yours is hopefully the sort of solution I would have arrived at
for my original problem if I had actually thought the problem
all the way through, and designed it myself, rather than go looking
for pre-cooked recipes.

I fully recognise that my design for a state machine could be
considered overblown. I suppose that what fired my enthusiasm was
that during the process of developing my design, I inadvertently
stumbled on the realisation of what coroutines were, which was
much more fun than state machines ;-)

Thanks also to Neil Schemenauer and Steven Taschuk for providing
several alternatives which are just as clean as Yair's, and also
don't need coroutines or generators.

Particularly I have taken on board Steven's advice re using generators
to avoid the need for state machines, and also that the same
suspend/resume functionality can be implemented by simply having a
yield in the dispatch function.

Michele Simionato

May 30, 2003, 9:40:49 AM

Alan Kennedy <ala...@hotmail.com> wrote in message news:<3ED622CA...@hotmail.com>...

> But there isn't a generator type in the types module? So, that's a
> question I'd like to ask: can anyone tell me how to discover the
> generator-functions of an object, and differentiate them from
> ordinary functions?

Others have answered your question. I will add that *there is* a generator
type in the types module (even if it does not do what you want):

>>> from types import GeneratorType
>>> def g():
...     yield 1
...
>>> type(g()) is GeneratorType
True

Hint: look at the source code of the types module and see how GeneratorType
is defined ...


Michele

Alan Kennedy

May 30, 2003, 9:47:49 AM

Steven Taschuk wrote:

> [Much good advice on appropriate use of generators elided]

> Moreover, when the state functions are generators, you can write
> "states" such as
>
> def foo(self):
>     for nextstate in [self.state0, self.state1, self.state2]:
>         yield nextstate
>
> But this is not a state in the normal sense of "state machine";
> each time you reach this "state", the machine's behaviour is
> different, due solely to differences in the history of execution
> (instead of due to, e.g., differences in input, which would be
> legitimate). The real state of the machine is the combination of
> the states of all of the iterators.

Had you said this to me a month ago, I wouldn't have had a clue what
you were talking about. But now I understand, and I agree that what I
coded could be "dangerous" for, say, someone trying to debug the code
and who does not understand how the history of execution and the
resumption of "states" affects the current state of the machine.

But if that principle is correct, i.e. that "states" in a "machine"
should not have "after-effects", then surely that same argument could
be applied to the concept of coroutines as a whole?

At some level, every object/module/codeobject+data is a state-machine.
But objects that use coroutines have "after-effects", so they perhaps
should not be used to implement state-machines (i.e. objects). Hmmm.

How to resolve this paradox?

>> Which is why python generators are only "semi"-coroutines, as
>> opposed to full coroutines. Because, without a dispatcher function,
>> you can only yield control to the calling function, rather than any
>> resumable state that you have a reference to. [...]

> In part, yes. But even with the dispatcher, you can't call
> another function and have *it* transfer control to another
> coroutine, suspending the whole call stack. (Not transparently,
> anyway; it could be done by replacing all calls to functions with
> yields to a dispatcher.)

I don't understand.

But I think I need to work through and code a good few examples
before I'll be in a position to know what I don't understand, and
before I could ask meaningful questions.

Thanks much.

Terry Reedy

May 30, 2003, 11:15:07 AM

"Yair Chuchem" <yai...@012.net.il> wrote in message
news:3ed7...@news.012.net.il...

> How about this implementation of your state-machine, without the
> cluttering of the while loops, and as a side effect, without use of
> generators too:

In this example, the while loops are certainly not needed. In a real
example, driven by input, states could be visited many times. For
that, while loops are needed within generators. Of course, any
generator *method* can be written as a plain method by rearranging
lines and saving persistent data as instance attributes. But the
generator form will be faster in repeated execution and may be easier
to write.
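
An invented illustration of that point: written as a plain method, the
persistent data must be saved via instance attributes on every visit;
written as a generator method, it is just a local variable surviving
in the frozen frame between .next() calls.

class Plain:
    def __init__(self):
        self.visits = 0
    def state(self):
        self.visits = self.visits + 1
        return self.visits

class Gen:
    def state(self):
        visits = 0
        while 1:
            visits = visits + 1
            yield visits

p = Plain()
print p.state(), p.state()   # 1 2
g = Gen().state()
print g.next(), g.next()     # 1 2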

Terry J. Reedy


Lulu of the Lotus-Eaters

May 30, 2003, 12:49:58 PM

Yair Chuchem <yai...@012.net.il> wrote previously:

|How about this implementation of your state-machine, without the cluttering
|of the while loops, and as a side effect, without use of generators too:
|Is there something I'm missing?

Almost everything.

Generators serve at least three purposes for state machines:

(1) They avoid function call overhead (much faster)
(2) They let you branch into the middle of an execution context (where
you left off).
(3) They let you (easily) save local values for when you return to a
state.

Your approach loses all of those benefits (and doesn't even get rid of
the 'while' loop, just imposes a narrow condition rather than more
flexible breaks from a 'while 1').

Yours, Lulu...

--
mertz@ _/_/_/_/_/_/_/ THIS MESSAGE WAS BROUGHT TO YOU BY:_/_/_/_/ v i
gnosis _/_/ Postmodern Enterprises _/_/ s r
.cx _/_/ MAKERS OF CHAOS.... _/_/ i u
_/_/_/_/_/ LOOK FOR IT IN A NEIGHBORHOOD NEAR YOU_/_/_/_/_/ g s


Steven Taschuk

May 30, 2003, 1:52:18 PM

Quoth Alan Kennedy:
> Steven Taschuk wrote:
[...]

> > But this is not a state in the normal sense of "state machine";
[...]

> But if that principle is correct, i.e. that "states" in a "machine"
> should not have "after-effects", then surely that same argument could
> be applied to the concept of coroutines as a whole?
>
> At some level, every object/module/codeobject+data is a state-machine.
> But objects that use coroutines have "after-effects", so they perhaps
> should not be used to implement state-machines (i.e. objects). Hmmm.
>
> How to resolve this paradox?

No paradox. It's just a question of what represents the state of
the machine; for your FSM objects, it might not be as simple as
"the value of the state attribute", but might also include the
internal state of the iterator which that attribute refers to.

For example:

x = True
def foo():
    global x
    while True:
        yield x
        x = not x

In this boring state machine, the state is the present value of x;
knowing that, you can predict the entire future behaviour of the
machine. But in this other boring state machine:

x = True
def foo():
    global x
    while True:
        yield x
        yield not x
        x = not x

the state is a combination of the present value of x and whether
the iterator is about to resume from the first or the second
yield. Knowing x doesn't tell you enough to predict the future
behaviour of the machine.

In the former case it would be clear to rename 'x' to 'state',
since that's what that variable means. But in the latter such a
name would be misleading, since that variable is not the complete
state.

In the same way, with an implementation of your FSM class in which
the generator iterators preserve information across .next
invocations, such as

def pseudostate(self):
    for next in [self.state0, self.state1, self.state2]:
        yield next

the value of the state attribute is not the whole story, so it's a
bit misleading imho to call that the state of the machine.

(Also it's easy to make state machines which are not *finite*
state machines using the FSM class. For example,

def state_foo(self):
    n = 0L
    while True:
        yield n
        n += 1

Now the machine has an infinite number of states (theoretically;
in practical terms n will eventually consume all memory). Another
way to see it is that state_foo represents not a single state, but a
family of states.)

[...]
> > In part, yes. But even with the dispatcher, you can't call
> > another function and have *it* transfer control to another
> > coroutine, suspending the whole call stack. (Not transparently,
> > anyway; it could be done by replacing all calls to functions with
> > yields to a dispatcher.)
>
> I don't understand.

Perhaps I can make this clearer with some code that doesn't work:

queue = []

def read():
    if not queue: # queue empty
        yield 'producer'
    return queue.pop(0)

def write(value):
    if len(queue) >= 5: # queue full
        yield 'consumer'
    queue.append(value)

def producer():
    i = 0
    while True:
        write(i)
        i += 1

def consumer():
    while 1:
        print read()

What this broken code is intended to do is this: The consumer
process reads from the queue and prints values. When the queue is
empty, the call to read() will transfer control to the producer
process, which will then write new values to the queue. When the
queue gets full, the producer's call to write() will transfer
control back to the consumer, which -- and this is the important
bit -- will resume in the middle of its call to read(), and
continue as if nothing had happened.

There's no dispatcher function we can write which will make this
work as written, because the yields in read() and write() can't
yield to a dispatcher which invoked producer and consumer; they
can only yield to their immediate caller.

Now, if the calls to read() and write() were done using a
dispatcher, rather than using the normal Python call mechanism, it
could be done:

queue = []
_returned = None

def read():
    if not queue:
        print 'queue empty, transferring to producer'
        yield 'transfer', 'producer'
    assert queue
    value = queue.pop(0)
    print 'read %r from queue' % (value,)
    yield 'return', value

def write(value):
    if len(queue) >= 5:
        print 'queue full, transferring to consumer'
        yield 'transfer', 'consumer'
    assert len(queue) < 5
    queue.append(value)
    print 'wrote %r to queue' % (value,)
    yield 'return', None

def producer():
    for i in range(23):
        yield 'call', 'write', i
    print 'no more values; transferring to consumer to flush queue'
    yield 'transfer', 'consumer'

def consumer():
    while True:
        yield 'call', 'read'
        print _returned

def run():
    global _returned
    coroutines = { # call stacks
        'producer': [producer()],
        'consumer': [consumer()],
    }
    functions = {
        'read': read,
        'write': write,
    }

    current = 'producer' # 'consumer' works just as well
    while True:
        stack = coroutines[current]
        prefix = ' '*len(stack)
        try:
            command = stack[-1].next()
        except (StopIteration, IndexError):
            print 'halt!'
            break
        if command[0] == 'transfer':
            current = command[1]
        elif command[0] == 'call':
            func = functions[command[1]]
            if len(command) > 2:
                it = func(command[2])
            else:
                it = func()
            stack.append(it)
        elif command[0] == 'return':
            _returned = command[1]
            stack.pop()
        else:
            raise ValueError('unknown command %r' % (command[0],))

if __name__ == '__main__':
    run()

This is kind of neat imho, but it has lots of disadvantages. Most
notably, it's not transparent -- you can't write normal calls and
normal returns and expect everything to just work. You have to
know you're running under the dispatcher and make calls
appropriately. (Also a real version of this idea should provide
tracebacks which refer not to the actual Python call stack but to
the call stack implemented by the dispatcher.)

(One of my blue-sky ideas is a tool which would transform a normal
program into the form above (perhaps by bytecode translation).
Then you could write normal calls and returns, but still have
coroutines. Or you could just use Stackless's microthreads.)

--
Steven Taschuk Aral: "Confusion to the enemy, boy."
stas...@telusplanet.net Mark: "Turn-about is fair play, sir."
-- _Mirror Dance_, Lois McMaster Bujold

Alan Kennedy

Jun 3, 2003, 5:26:55 AM

[Alan spends the weekend upgrading an old laptop, so he can look at
this stuff in the comfort of the garden, which was bathed in glorious
sunshine for the entirety of the Irish Bank Holiday weekend B-). He
also prints and reads the writings of Christian Tismer[1], David
Mertz[2,3] and the Timbot[4,5], in relation to generators,
coroutines, continuations, pseudo/weightless/micro-threads, etc].

Steven Taschuk wrote:

[
Lots of great stuff which illustrates the problems of calling from
generator/coroutine code to functional code and vice versa.
]

I now understand. Thanks for taking the time to explain.

Thanks to your clear examples, I now picture coroutines in the
following way (which I hope is correct :-).

1. The "traditional" function/method call paradigm builds a stack
frame every time a function/method is called. Since this stack frame
holds the references to the local variables for the function/method,
it must be destroyed and collected if memory is not to be "leaked".
The only time when the stack frame can be destroyed is when the
references it contains are no longer required: i.e. after the
instance of the function/method call has finished.

2. The only method by which the "traditional" function call can
invoke another "traditional" function/method call is to call it
"recursively", thus causing the construction of another stack frame.
There is no mechanism for it to "substitute" the stack frame of the
called function "in place of" its own stack frame. (Although I
believe that Stackless can do this, after much rewriting and
reinvention of truth :-).

3. Because all "traditional" function/method calls, involve the
creation and eventual destruction of a stack frame, I label the space
in which they operate "Linear stack space".

4. There is another space, which I call "Generator space". When a
call is made into "Generator space", a stack frame is not
constructed: at least not every time. Instead, a resumable and
persistent stack frame is "resumed": this "resumable stack frame" was
created once, sometime in the past: it is termed, in current python
terminology, the generator-iterator.

5. When the code in "Generator space" (i.e. the generator-iterator)
is called, it resumes immediately after where it last 'yield'ed, and
continues until it 'yield's again, or returns or excepts. When it
'yield's, two things happen. A: The resumable stack frame is
"frozen", so that it can later be resumed again. and B: A python
object, which may be None, is transferred back to the caller.

6. For any call from "Linear Stack space" into "Generator space",
there is a crossover point, P. When the called code in "Generator
space" finishes executing, it can only enter back into "Linear stack
space" through point P: it cannot exit through any other point.
(According to the current python model).

7. If any calls are made from "Generator space" into "Linear stack
space", this leads to the creation of a stack frame, which must be
destroyed. If the called function/method in "Linear stack space" then
calls back into "Generator space", and the "Linear space" function is
not allowed to exit naturally, this is essentially unbound recursion,
which will lead eventually to a blown stack. (The non-functioning
Producer/Consumer example illustrates this: the code in "Generator
space" could be made to work by "traditionally" calling the write and
read functions, but this mutual recursion would eventually blow the
"Linear space" stack).

8. When a stack frame in "Generator space" 'yield's, it is possible
to adopt a convention for the 'yield'ed value: the "dispatcher"
convention. Under this convention, the value 'yield'ed can be a
reference to another resumable stack frame in Generator space. A
special "dispatch"-function is coded to recognise these references,
and to immediately call back into "Generator space". The newly
resumed stack frame is still limited by the same constraints as the
original call into "Generator space": it must eventually return to
"Linear stack space" through point P, in this case the dispatcher
function.

9. A complex network of interacting "resumable stack frames", or
generators, can mutually 'yield' control to each other, assuming the
cooperation of the dispatcher function. Execution continually
"bounces" back and forth between the dispatch function (Point P, in
"Linear stack space") and various 'yield'ed resumable states in
"Generator space".

10. A network of interacting generator-iterators could potentially
execute much faster than an equivalent series of "traditional"
function/method calls in "Linear stack space", since there is no
overhead associated with construction/destruction of stack frames, no
parsing of parameters, etc. Such a network of interacting generators
is called "coroutines", and requires the presence of a dispatcher
function: the dispatcher function must be explicitly created by the
programmer, as python currently exists.

10a: refinement: In fact python generators are really
"semi"-coroutines, because of the requirement to keep 'yield'ing to a
dispatcher function. In a language which implemented
"full"-coroutines (maybe a __future__ version of python?), the
dispatcher would not be required (at least explicitly), and resumable
states could transfer control directly to any other resumable state
for which they hold a reference.

11. The efficiency benefits of generators can also be realised by
*not* adopting the dispatcher convention. Instead, a generator can
simply 'yield' the series of values it has been coded to generate:
the nodes in a data structure, the numbers in a computed series, etc.
This can lead to more natural code, particularly for the calling
function which utilises the series of values 'yield'ed: it is mostly
unaware that there is a resumable state involved in the generation of
the values. Simple example:

def squares(n):
    for i in xrange(n):
        yield n, n*n

# Instantiate a generator-iterator
sqgen = squares(100)
# We can do the following because the generator-iterator
# conveniently implements the iterator protocol.
for i, sq in sqgen():
    print "The square of %d is %d" % (i, sq)

12. It is possible to avoid the "mutual" recursion problem of calling
from "Generator space" into "Linear stack space", by the following
actions

o Moving all "traditional" function/method calls required from
"Linear stack space" into "Generator space".
o By expanding the "dispatcher" convention to allow generators to
yield special values representing a function call or a return value.
The dispatcher function must then explicitly manage a call stack.
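Sketching the second approach (the ("call",)/("return",) convention
below is my own invention, not established practice): each generator
yields a ("call", gen) pair to invoke a sub-generator, or a ("return",)
marker when it is finished, and the dispatcher maintains the call stack
explicitly. Since generator-iterators cannot (currently) receive a
value at a 'yield', results travel back through a shared location:

result = [None]                  # shared box standing in for return values

def adder(a, b):
    result[0] = a + b
    yield ("return",)

def main():
    yield ("call", adder(2, 3))  # the dispatcher runs adder on our behalf
    print "2 + 3 =", result[0]   # resumed here once adder has "returned"
    yield ("return",)

def run(gen):
    stack = [gen]                # the explicitly managed call stack
    while stack:
        op = stack[-1].next()
        if op[0] == "call":
            stack.append(op[1])  # push the callee; resume it next time
        elif op[0] == "return":
            stack.pop()          # pop back to the caller

run(main())                      # prints: 2 + 3 = 5

Note that main never calls adder directly, so "Linear stack space"
stays a constant two frames deep (run plus whichever generator is
resumed) for the whole computation.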

13. It is possible to model ultra-lightweight threads by representing
each thread as a simple generator which steps through the states
required to implement the functionality required of the "thread". The
"execution pointer" of such threads is advanced simply by calling the
".next()" method of the "thread"s generator-iterator. (Incidentally,
as well as being highly efficient in terms of resource consumption,
these ultra-lightweight threads offer much finer control of
thread-prioritisation, thread-creation, destruction and "collection",
etc.)
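Here is about the smallest round-robin scheduler I can think of for
such "threads" (a sketch of my own, not taken from any of the papers
referenced below):

def worker(name, steps):
    for i in xrange(steps):
        print "%s: step %d" % (name, i)
        yield None               # voluntarily give up control

threads = [worker("A", 3), worker("B", 2)]

while threads:
    for t in threads[:]:         # copy: we mutate the list while looping
        try:
            t.next()             # advance this "thread" by one step
        except StopIteration:
            threads.remove(t)    # the thread has finished: "collect" it

Thread-prioritisation then reduces to nothing more than how often the
scheduler chooses to call a given "thread"s .next() method.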

Now that I think I've got my head around it, I think I'm going to try
my hand at an example or two, which I will post to the newsgroup
(might be a few weeks, I'm kinda busy right now). The two main
interesting examples that come to mind are

1. An ultra-lightweight thread implementation of an
asyncore/medusa/twisted style socket server.
2. A coroutine based version of something that is normally
(relatively) resource-intensive: a series of SAX2 callbacks to a
chain of ContentHandlers/Filters.

Lastly, I think I'm beginning to "get" continuations. Although the
examples given in Christian Tismer's paper "Continuations and
Stackless Python"[1] are difficult to follow without the definition
of a continuation object (which seems to require intimate familiarity
with the Stackless implementation), I think I understand the general
principle.

And it's a principle I'm not really interested in pursuing, because I
can't see that it will make me any more productive, or my programs
more efficient (than they could be using the "coroutines" and
"weightless-threads" described above). And I can imagine that
continuation code could be a little brain-bending to write (thus
making me *less* productive %-), as this example of a while loop
(which I sort of understand) shows:

def looptest(n):
    this = continuation.current()
    k = this.update(n)
    if k:
        this(k-1)
    else:
        del this.link

But I can see the power inherent in the continuations paradigm.

Again, many thanks to all who responded to my original post.

Reading material.

[1] http://www.stackless.com/spcpaper.pdf
[2] http://www-106.ibm.com/developerworks/linux/library/l-pygen.html
[3] http://www-106.ibm.com/developerworks/library/l-pythrd.html
[4] http://tinyurl.com/dbyn
[5] http://www.tismer.com/research/stackless/coroutines.tim.peters.html

Terry Reedy

Jun 3, 2003, 2:48:22 PM

"Alan Kennedy" <ala...@hotmail.com> wrote in message
news:3EDC69DF...@hotmail.com...
...

> Thanks to your clear examples, I now picture coroutines in the
> following way (which I hope is correct :-).

Some further comments, based on my understandings...

> 2. The only method by which the "traditional" function call can
> invoke another "traditional" function/method call is to call it
> "recursively",

'Recursive' usually refers to a function calling itself, either
directly or via another function. Using 'recursive' for what are
normally considered non-recursive calls that create a call chain in
which each function is unique is a somewhat confusing expansion of its
meaning.

> thus causing the construction of another stack frame.
> There is no mechanism for it to "substitute" the stack frame of the
> called function "in place of" its own stack frame. (Although I
> believe that Stackless can do this, after much rewriting and
> reinvention of truth :-).
>
> 3. Because all "traditional" function/method calls, involve the
> creation and eventual destruction of a stack frame, I label the
space
> in which they operate "Linear stack space".

Generators also involve the creation and eventual destruction of a
stack frame...

> 4. There is another space, which I call "Generator space".

What specifically do you consider to be part of this 'space'?
Candidates include generator functions, the generator objects they
produce (which follow the iterator protocol by having .__iter__ and
.next methods), the .next methods themselves, and the frames
associated with activated .next methods.

> When a call is made into "Generator space",

This implies that it contains the .next methods ...

> a stack frame is not constructed: at least not every time.

One is constructed on the first call to a generator .next() method.
On this call, control enters at the top like any other function. On
subsequent calls ...

> Instead, a resumable and persistent stack frame is "resumed":
> this "resumable stack frame" was created once, sometime in the past:

Very specifically, on the first call

> it is termed, in current python terminology, the generator-iterator.

Calling a generator function returns a generator object that is not
callable but which has a .next() method which is. The frame is
associated with the .next method.

> 5. When the code in "Generator space" (i.e. the generator-iterator)
> is called, it resumes immediately after where it last 'yield'ed, and
> continues until it 'yield's again, or returns or excepts. When it
> 'yield's, two things happen.
> A: The resumable stack frame is "frozen",
> so that it can later be resumed again. and

This is more something that does not happen - the usual destruction of
the stack frame. 'Freezing' implies that something special is done
that takes time and that would have to be reversed by 'thawing'.
Generators are efficient because yielding leaves the frame as it is, so
execution can be resumed exactly where it left off.

> B: A python object, which may be None, is transferred back to the
> caller.

As with normal functions.

> 6. For any call from "Linear Stack space" into "Generator space",
> there is a crossover point, P. When the called code in "Generator
> space" finishes executing, it can only enter back into "Linear stack
> space" through point P: it cannot exit through any other point.
> (According to the current python model).

I'm not sure that this is a helpful way to look at the process. When
a gen-it.next frame is created or reactivated, it is tacked on to the
end of the linear frame stack just like any other execution frame. If
it calls another function, yet another frame is added. When it yields
or returns, it is removed from the chain, just like any other
function, and control passes back to the calling function, just like
any other function. The only differences are that 'removal' !=
'deletion' (because the gen object holds a reference to the frame, so
ref count does *not* go to 0) and 're-addition' != 'set execution
pointer to beginning of associated code-object'.

> 7. If any calls are made from "Generator space" into "Linear stack
> space", this leads to the creation of a stack frame, which must be
> destroyed. If the called function/method in "Linear stack space"
> then calls back into "Generator space", and the "Linear space"
> function is not allowed to exit naturally, this is essentially
> unbound recursion, which will lead eventually to a blown stack.

To me, this is unnecessarily confusing. For some reason, you see and
present generators as more special than they are. They are resumable
functions, and they are resumable mostly because yield does not
disable the frame. I don't believe they do anything that cannot be done with
class methods. They are just easier to write (usually) and faster to
execute.

Indeed, I think it possibly useful for understanding generators to
think of a def statement with yield in the body as being something
like a syntactically condensed version of a class statement iterator
template with standard defs for __init__ and __iter__ and a
class-specific .next method that is filled in from the body of the
def. Calling a generator function returns an initialized instance
with the args stored away for later use by the .next method. Because
the whole package is a fairly common use case, it can be and now has
been optimized so the programmer need only write what is specific and
the interpreter adds less overhead than normal.
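For instance, an illustrative expansion along those lines (a sketch
only; this class is one possible hand-written equivalent, not the
interpreter's actual mechanics) might read:

def countdown(n):
    while n > 0:
        yield n
        n = n - 1

class Countdown:
    def __init__(self, n):       # the args, stored away for later use
        self.n = n
    def __iter__(self):
        return self
    def next(self):              # one step per call, like a resumed frame
        if self.n <= 0:
            raise StopIteration
        n = self.n
        self.n = self.n - 1
        return n

print list(countdown(3))         # [3, 2, 1]
print list(Countdown(3))         # [3, 2, 1]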

Terry J. Reedy


Alan Kennedy

Jun 3, 2003, 4:32:58 PM
[Terry]

> Some further comments, based on my understandings...

I completely agree with all that you said here.

In defense of the original statements, I can only say that I
was trying to express the concepts in a way that made sense to me,
and that would hopefully make sense to others. A good example of
this is:

[Alan]


>> 2. The only method by which the "traditional" function call can
>> invoke another "traditional" function/method call is to call it
>> "recursively",

[Terry]


> 'Recursive' usually refers to a function calling itself, either
> directly or via another function. Using 'recursive' for what are
> normally considered non-recursive calls that create a call chain in
> which each function is unique is a somewhat confusing expansion of its
> meaning.

That is why my original statements were liberally sprinkled with ""s.
I simply don't know how to concisely express the concept of a function
calling another function, resulting in the construction of a new stack
frame. I don't know the reason why we don't have a simple word like
"recursion" to describe this. Indeed, the word "recursion" itself only
implies a function calling itself; there is no concept of stack frames
at all. But I definitely used it inappropriately.

I suppose my description of it was also coloured by reading Christian
Tismer's description of removing the recursive calls to the C eval loop
in the interpreter, which is what happens each time a python function
is executed.

[Alan]


>> 4. There is another space, which I call "Generator space".

[Terry]


> What specifically do you consider to be part of this 'space'?

Whenever I think about this stuff, I find myself thinking more in terms
of what happens in the space, i.e. how the rules change in the space,
rather than objects that live in the space.

But on reading your other points, I see that I may be trying a little
too hard with the "Space" analogy.

> This is more something that does not happen - the usual destruction of
> the stack frame. 'Freezing' implies that something special is done
> that takes time and that would have to be reversed by 'thawing'.
> Generators are efficient because yielding leaves the frame as it is, so
> execution can be resumed exactly where it left off.

You're right, I'm describing something that doesn't happen. I was
trying to verbally describe the concept of the stack frame staying
persistent and undamaged, ready for later re-entry.

[Terry]


> I'm not sure that this is a helpful way to look at the process. When
> a gen-it.next frame is created or reactivated, it is tacked on to the
> end of the linear frame stack just like any other execution frame. If
> it calls another function, yet another frame is added. When it yields
> or returns, it is removed from the chain, just like any other
> function, and control passes back to the calling function, just like
> any other function. The only differences are that 'removal' !=
> 'deletion' (because the gen object holds a reference to the frame, so
> ref count does *not* go to 0) and 're-addition' != 'set execution
> pointer to beginning of associated code-object'.

These were the key statements that made me rethink the "space" analogy.
The change of behaviour I was really trying to describe (badly ;-) was
the change in the behaviour of the top of the stack before, during and
after .next() calls. Seems like the top of the stack has protruded into
some other "space" where the rules just aren't the same ....... :-)

I think what originally put the "space" analogy into my head was the fact
that, by virtue of introducing a dispatcher function, the resumable stack
frame at the top of the call stack during a .next() call can be made to
(rapidly) move around between a set of such resumable states. The increased
rapidity (compared to non-generator function calls) was an attribute of
"generator space".

And what also put the "space" idea in my mind was the fact that there could
be multiple connected resumable states, just "hanging there in space",
waiting to be resumed. Every running program has a set of zero or more of
these resumable states that the interpreter keeps somewhere special .....

[Alan]
[ Description of an attribute of generator space elided]

[Terry]


> To me, this is unnecessarily confusing. For some reason, you see and
> present generators as more special than they are. They are resumable
> functions, and they are resumable mostly because yield does not
> disable. I don't believe they do anything that cannot be done with
> class methods. They are just easier to write (usually) and faster to
> execute.

Absolutely.

The only thing I would add is that I was trying to build a mental
picture of coroutines as well as generators. I don't think I explained
that properly.

> Indeed, I think it possibly useful for understanding generators to
> think of a def statement with yield in the body as being something
> like a syntactically condensed version of a class statement iterator
> template with standard defs for __init__ and __iter__ and a
> class-specific .next method that is filled in from the body of the
> def. Calling a generator function returns an initialized instance
> with the args stored away for later use by the .next method. Because
> the whole package is a fairly common use case, it can be and now has
> been optimized so the programmer need only write what is specific and
> the interpreter adds less overhead than normal.

A much better model to use, many thanks.

Terry Reedy

Jun 3, 2003, 5:43:28 PM

"Alan Kennedy" <ala...@hotmail.com> wrote in message
news:3EDD05FA...@hotmail.com...

After reading your response to my response to your essay, I think we
can combine ideas and converge on the following essentials, with
implementation details left out:

Traditionally, execution frames form a linear stack. With generators,
there can also be a 'pool' of detached, suspended, ready-to-go frames
that are not currently on the stack.

The interpreter executes the code attached to the top frame of the
stack. That code can bury its frame by calling another function,
whose frame gets put on top. Or it can unbury the frame underneath by
returning (or yielding), so that its frame gets detached and deleted
(or not). But it cannot directly replace itself (or its frame), such
as with a co-routine call.

However, we can simulate such co-routine frame switching as follows:
write a set of 'state' generators, giving each knowledge of the others
it needs to know about. Instead of having them yield data (pass data
through attributes or globals instead), have them yield the generator
method corresponding to the next state. Then support this set of
state methods with a servant dispatch method/frame that sits under the
changing top frame and that helps the state methods 'in effect' call
each other by calling the state method yielded by each state on behalf
of that yielding state.
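For concreteness, here is one way such a set of state generators might
look (an illustrative sketch with invented names, stepping the Collatz
rule, with the data passed through a global as suggested above):

n = 7

def odd():
    global n
    while 1:
        n = 3 * n + 1            # odd step of the rule
        yield even_state         # next state: n is now even

def even():
    global n
    while 1:
        n = n / 2                # even step
        if n == 1:
            yield None           # finished: tell the servant to stop
        elif n % 2:
            yield odd_state
        else:
            yield even_state

odd_state, even_state = odd(), even()

state = odd_state                # 7 is odd, so start there
while state is not None:
    print n
    state = state.next()         # the states 'in effect' call each other
print n                          # ends at 1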

While mechanically the same, this is conceptually different from the
typical generator usage in which the caller decides on the sequence of
calls. By making the caller a servant, control effectively passes
to the callees.

Terry J. Reedy


Lulu of the Lotus-Eaters

Jun 3, 2003, 7:12:00 PM
Alan Kennedy <ala...@hotmail.com> wrote previously:
|def squares(n):
| for i in xrange(n):
| yield n, n*n
|sqgen = squares(100)

|for i, sq in sqgen():
| print "The square of %d is %d" % (i, sq)

Should be:

def squares(n):
    for i in xrange(n):
        yield i, i*i                     # <-- i
sqgen = squares(100)

for i, sq in sqgen:                      # <-- no parens
    print "The square of %d is %d" % (i, sq)

--
Keeping medicines from the bloodstreams of the sick; food from the bellies
of the hungry; books from the hands of the uneducated; technology from the
underdeveloped; and putting advocates of freedom in prisons. Intellectual
property is to the 21st century what the slave trade was to the 16th.


Alan Kennedy

Jun 4, 2003, 6:04:35 AM
Terry Reedy wrote:

> After reading your response to my response to your essay, I think we
> can combine ideas and converge on the following essentials, with
> implementation details left out:

[ Excellent model for generators and coroutines elided ]

Thanks, Terry, for that excellent summary.

I'm really glad that you raised the problems with my original essay, because now
I have a much clearer picture of what is happening. Or rather, my picture is
almost the same, but my terminology when describing it and using it will be
*much* cleaner.

This kind of collaborative model development is, to my mind, what Usenet is
great for. When I first approached the generators and coroutines concepts, I
really needed to find a discussion similar to the one we've just had, to help me
clarify my thoughts. And now, thanks to the magic of Usenet archives like Google
Groups, etc, our model is preserved for all to see and use.

Thanks, Terry, it's been fun :-)

And of course, thanks to all the other contributors to this thread.
