
Trying to wrap my head around futures and coroutines


Skip Montanaro

Jan 6, 2014, 7:56:00 PM
to Python

I've been programming for a long while in an event&callback-driven world. While I am comfortable enough with the mechanisms available (almost 100% of what I do is in a PyGTK world with its signal mechanism), it's never been all that satisfying, breaking up my calculations into various pieces, and thus having my algorithm scattered all over the place.

So, I'm looking for a little guidance. It seems to me that futures, coroutines, and/or the new Tulip/asyncio package might be my salvation, but I'm having a bit of trouble seeing exactly how that would work. Let me outline a simple hypothetical calculation. I'm looking for ways in which these new facilities might improve the structure of my code.

Let's say I have a dead simple GUI with two buttons labeled "Do A" and "Do B". Each corresponds to executing a particular activity, A or B, which takes some non-zero amount of time to complete (as perceived by the user) or cancel (as perceived by the state of the running system - it's not safe to run A until B is complete/canceled, and vice versa). The user, being the fickle sort that he is, might change his mind while A is running and decide to execute B instead. (The roles can also be reversed.) If s/he wants to run task A, task B must be canceled or allowed to complete before A can be started. Logically, the code looks something like (I fear Gmail is going to destroy my indentation):

def do_A():
    when B is complete, _do_A()
    cancel_B()

def do_B():
    when A is complete, _do_B()
    cancel_A()

def _do_A():
    do the real A work here, we are guaranteed B is no longer running

def _do_B():
    do the real B work here, we are guaranteed A is no longer running

cancel_A and cancel_B might be no-ops, in which case they need to start up the other calculation immediately, if one is pending.

This is pretty simple execution, and if my job was this simple, I'd probably just keep doing things the way I do now, which is basically to catch a "complete" or "canceled" signal from the A and B tasks and execute the opposite task if it's pending. But it's not this simple. In reality there are lots of, "oh, you want to do X? You need to make sure A, B, and C are not active." And other stuff like that.

I have this notion that I should be able to write do_A() something like this:

def do_A():
    cancel_B()
    yield from ... ???
    _do_A()
    ...

or

def do_A():
    future = cancel_B()
    future.on_completion(_do_A)
    ... or ???

with the obvious similar structure for do_B. To my mind's eye, the first option is preferable, since it's obvious that when control reaches the line after the yield from statement, it's fine to do the guts of task A.
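As a toy illustration of why the first form can work (a self-contained sketch with invented names, driven by a hand-rolled loop rather than asyncio): the generator suspends at its yield until B reports completion, and the line after the yield then runs knowing B has stopped.

```python
# Toy sketch, all names invented: do_A is a generator that suspends
# (yield) until B has finished, then does the real A work linearly.

def do_A(b_state, log):
    while b_state["running"]:
        yield                       # suspend; the driver resumes us later
    log.append("real A work")       # guaranteed: B is no longer running

def drive(gen, b_state, b_done_after=3):
    # Stand-in for an event loop: resume the generator each tick, and
    # mark B as finished after a few ticks.
    for tick in range(10):
        if tick == b_done_after:
            b_state["running"] = False
        try:
            next(gen)
        except StopIteration:
            break

b_state = {"running": True}
log = []
drive(do_A(b_state, log), b_state)
```

The code after the yield is the body of _do_A, written inline in a linear style rather than as a separate callback.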

So, is my simpleminded view of the world a possibility with the current facilities available in 3.3 or 3.4?

Thx,

Skip

MRAB

Jan 6, 2014, 9:22:22 PM
to pytho...@python.org
[snip]
Do you really need to use futures, etc?

What you could do is keep track of which tasks are active.

When the user clicks a button to start a task, the task checks whether
it can run. If it can run, it starts the real work. On the other hand,
if it can't run, it's set as the pending task.

When a task completes or is cancelled, if there is a pending task, that
task is unset as the pending task and retried; it'll then either start
the real work or be set as the pending task again.
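The scheme above could be sketched like this (a minimal illustration with invented names, using plain callables rather than real GUI tasks; conflicts are modelled as a set of task names):

```python
# Minimal sketch of the pending-task scheme: run a task immediately if
# no conflicting task is active, otherwise hold it as pending and retry
# it when the conflicting task finishes. All names are invented.

active = set()   # names of tasks currently running
pending = None   # (name, conflicts, work) waiting for its conflicts to clear

def start(name, conflicts, work):
    """Run `work` now if no conflicting task is active, else set it pending."""
    global pending
    if active & conflicts:
        pending = (name, conflicts, work)   # can't run yet
    else:
        active.add(name)
        work()

def finish(name):
    """Called when a task completes or is cancelled; retry any pending task."""
    global pending
    active.discard(name)
    if pending is not None:
        retry, pending = pending, None
        start(*retry)   # either starts now or goes pending again

log = []
start("A", {"B"}, lambda: log.append("A ran"))   # runs immediately
start("B", {"A"}, lambda: log.append("B ran"))   # A is active -> pending
finish("A")                                      # B is retried and runs
```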

Cameron Simpson

Jan 6, 2014, 9:29:58 PM
to Skip Montanaro, Python
On 06Jan2014 18:56, Skip Montanaro <skip.mo...@gmail.com> wrote:
[...]
> Let's say I have a dead simple GUI with two buttons labeled, "Do A" and "Do
> B". Each corresponds to executing a particular activity, A or B, which take
> some non-zero amount of time to complete (as perceived by the user) or
> cancel (as perceived by the state of the running system - not safe to run A
> until B is complete/canceled, and vice versa). The user, being the fickle
> sort that he is, might change his mind while A is running, and decide to
> execute B instead. (The roles can also be reversed.) If s/he wants to run
> task A, task B must be canceled or allowed to complete before A can be
> started.

I take it we can ignore the user hammering on buttons faster than
jobs can run or be cancelled?

> Logically, the code looks something like (I fear Gmail is going to
> destroy my indentation):
>
> def do_A():
>     when B is complete, _do_A()
>     cancel_B()
[...]
> def _do_A():
>     do the real A work here, we are guaranteed B is no longer running
[...]
> cancel_A and cancel_B might be no-ops, in which case they need to start up
> the other calculation immediately, if one is pending.

I wouldn't have cancel_A do this, I'd have do_A do this more overtly.

> This is pretty simple execution, and if my job was this simple, I'd
> probably just keep doing things the way I do now, which is basically to
> catch a "complete" or "canceled" signal from the A and B tasks and execute
> the opposite task if it's pending. But it's not this simple. In reality
> there are lots of, "oh, you want to do X? You need to make sure A, B, and C
> are not active." And other stuff like that.

What's wrong with variations on:

from threading import Lock

lock_A = Lock()
lock_B = Lock()

def do_A():
    with lock_B:
        with lock_A:
            _do_A()

def do_B():
    with lock_A:
        with lock_B:
            _do_B()

You can extend this with multiple locks for A,B,C provided you take
the excluding locks before taking the inner lock for the core task.

Regarding cancellation, I presume your code polls some cancellation
flag regularly during the task?
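One common shape for that flag (a sketch with invented names; the task checks a threading.Event between chunks of work and stops early once it is set):

```python
import threading

# Sketch of polling a cancellation flag: the task checks an Event
# between chunks of work and returns early once it is set. Names
# are invented for illustration.

def task_A(cancelled, done_chunks, total=5):
    for i in range(total):
        if cancelled.is_set():      # poll the flag regularly
            return "cancelled"
        done_chunks.append(i)       # one "chunk" of real work
    return "complete"

cancelled = threading.Event()
done = []
status = task_A(cancelled, done)    # flag never set: runs to completion

cancelled.set()                     # request cancellation
status2 = task_A(cancelled, [])     # returns before doing any work
```

The granularity of the chunks bounds how long a cancel request can go unnoticed.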

Cheers,
--
Cameron Simpson <c...@zip.com.au>

Many are stubborn in pursuit of the path they have chosen, few in pursuit
of the goal. - Friedrich Nietzsche

Cameron Simpson

Jan 6, 2014, 9:45:21 PM
to Skip Montanaro, Python
On 07Jan2014 13:29, I wrote:
> def do_A():
>     with lock_B:
>         with lock_A:
>             _do_A()

Um, of course there would be a cancel_B() up front above, like this:

def do_A():
    cancel_B()
    with lock_B:
        with lock_A:
            _do_A()

I'm with MRAB: you don't really need futures unless you're looking to
move to a multithreaded approach and aren't multithreaded already.
Even then, you don't need futures; just track running threads and
what's meant to run next.

You can do all your blocking with Locks fairly easily unless there
are complexities not yet revealed. (Of course, this is a truism,
but I mean "conveniently".)

Cheers,
--
Cameron Simpson <c...@zip.com.au>

Follow! But! Follow only if ye be men of valor, for the entrance to this cave
is guarded by a creature so foul, so cruel that no man yet has fought with it
and lived! Bones of four fifty men lie strewn about its lair. So,
brave knights, if you do doubt your courage or your strength, come no
further, for death awaits you all with nasty big pointy teeth.
- Tim The Enchanter

Skip Montanaro

Jan 6, 2014, 10:15:56 PM
to Python
From the couple of responses I've seen, I must not have made myself
clear. Let's skip specific hypothetical tasks. Using coroutines,
futures, or other programming paradigms that have been introduced in
recent versions of Python 3.x, can traditionally event-driven code be
written in a more linear manner so that the overall algorithms
implemented in the code are easier to follow? My code is not
multi-threaded, so using threads and locking is not really part of the
picture. In fact, I'm thinking about this now precisely because the
first sentence of the asyncio documentation mentions single-threaded
concurrent code: "This module provides infrastructure for writing
single-threaded concurrent code using coroutines, multiplexing I/O
access over sockets and other resources, running network clients and
servers, and other related primitives."

I'm trying to understand if it's possible to use coroutines or objects
like asyncio.Future to write more readable code, that today would be
implemented using callbacks, GTK signals, etc.

S

MRAB

Jan 6, 2014, 10:23:32 PM
to pytho...@python.org
> def do_A():
>     with lock_B:
>         with lock_A:
>             _do_A()
>
> def do_B():
>     with lock_A:
>         with lock_B:
>             _do_B()
>
>
It's safer to lock in the same order to reduce the chance of deadlock:

def do_A():
    with lock_A:
        with lock_B:
            _do_A()
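A runnable sketch of the safe ordering (invented names): with the orderings reversed, two threads can each grab their first lock and wait forever on the other's; when both take A then B, neither can hold one lock while waiting on the other.

```python
import threading

# Sketch: both functions take the locks in the same order, A then B,
# so two concurrent threads can never deadlock on this pair of locks.
# Names are invented for illustration.

lock_A = threading.Lock()
lock_B = threading.Lock()
log = []

def do_A():
    with lock_A:
        with lock_B:
            log.append("A")

def do_B():
    with lock_A:          # same order as do_A, not reversed
        with lock_B:
            log.append("B")

threads = [threading.Thread(target=do_A), threading.Thread(target=do_B)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```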

Phil Connell

Jan 15, 2014, 5:18:32 AM
to Skip Montanaro, Python
On Mon, Jan 06, 2014 at 06:56:00PM -0600, Skip Montanaro wrote:
> So, I'm looking for a little guidance. It seems to me that futures,
> coroutines, and/or the new Tulip/asyncio package might be my salvation, but
> I'm having a bit of trouble seeing exactly how that would work. Let me
> outline a simple hypothetical calculation. I'm looking for ways in which
> these new facilities might improve the structure of my code.

This instinct is exactly right -- the point of coroutines and tulip futures is
to liberate you from having to daisy-chain callbacks together.


>
> Let's say I have a dead simple GUI with two buttons labeled, "Do A" and "Do
> B". Each corresponds to executing a particular activity, A or B, which take
> some non-zero amount of time to complete (as perceived by the user) or
> cancel (as perceived by the state of the running system - not safe to run A
> until B is complete/canceled, and vice versa). The user, being the fickle
> sort that he is, might change his mind while A is running, and decide to
> execute B instead. (The roles can also be reversed.) If s/he wants to run
> task A, task B must be canceled or allowed to complete before A can be
> started. Logically, the code looks something like (I fear Gmail is going to
> destroy my indentation):
>
> def do_A():
>     when B is complete, _do_A()
>     cancel_B()
>
> def do_B():
>     when A is complete, _do_B()
>     cancel_A()
>
> def _do_A():
>     do the real A work here, we are guaranteed B is no longer running
>
> def _do_B():
>     do the real B work here, we are guaranteed A is no longer running
>
> cancel_A and cancel_B might be no-ops, in which case they need to start up
> the other calculation immediately, if one is pending.

It strikes me that what you have is two linear sequences of 'things to do':
- 'Tasks', started in reaction to some event.
- Cancellations, if a particular task happens to be running.

So, a reasonable design is to have two long-running coroutines, one that
executes your 'tasks' sequentially, and another that executes cancellations.
These are both fed 'things to do' via a couple of queues populated in event
callbacks.

Something like (apologies for typos/non-working code):


import asyncio

cancel_queue = asyncio.Queue()
run_queue = asyncio.Queue()

running_task = None
running_task_name = ""

def do_A():
    cancel_queue.put_nowait("B")
    run_queue.put_nowait(("A", _do_A()))

def do_B():
    cancel_queue.put_nowait("A")
    run_queue.put_nowait(("B", _do_B()))

def do_C():
    run_queue.put_nowait(("C", _do_C()))

@asyncio.coroutine
def canceller():
    while True:
        name = yield from cancel_queue.get()
        if running_task_name == name:
            running_task.cancel()

@asyncio.coroutine
def runner():
    global running_task, running_task_name
    while True:
        name, coro = yield from run_queue.get()
        running_task_name = name
        running_task = asyncio.async(coro)
        try:
            yield from running_task
        except asyncio.CancelledError:
            pass  # cancelled via canceller; move on to the next task

def main():
    ...
    cancel_task = asyncio.Task(canceller())
    run_task = asyncio.Task(runner())
    ...



Cheers,
Phil

Oscar Benjamin

Jan 15, 2014, 7:58:26 AM
to pytho...@python.org
Hi Skip,

I don't yet understand how asyncio works in complete examples (I'm not sure
that many people do yet) but I have a loose idea of it so take the following
with a pinch of salt and expect someone else to correct me later. :)

With asyncio the idea is that you can run IO operations concurrently in
a single thread. Execution can switch between different tasks while each task
can be written as a linear-looking generator function without the need for
callbacks and locks. Execution switching is based on which task has available
IO data. So the core switcher keeps track of a list of objects (open files,
sockets etc.) and executes the task when something is available.

From the perspective of the task generator code, what happens is that you yield
to allow other code to execute while you wait for some IO, e.g.:

@asyncio.coroutine
def task_A():
    a1 = yield from futureA1()
    a2 = yield from coroutineA2(a1)  # Indirectly yields from futures
    a3 = yield from futureA3(a2)
    return a3

At each yield statement you are specifying some operation that takes time.
During that time other coroutine code is allowed to execute in this thread.

If task_B has a reference to the future that task_A is waiting on then it can
be cancelled with the Future.cancel() method. I think that you can also cancel
with a reference to the task. So I think you can do something like

@asyncio.coroutine
def task_A():
    # Cancel the other task and wait for it to finish
    if ref_taskB is not None:
        ref_taskB.cancel()
        yield from asyncio.wait([ref_taskB])
    try:
        # Linear-looking code with no callbacks
        a1 = yield from futureA1()
        a2 = yield from coroutineA2(a1)  # Indirectly yields from futures
        a3 = yield from futureA3(a2)
    except asyncio.CancelledError:
        stop_A()
        raise  # Or return something...
    return a3

Then task_B() would have the reciprocal structure. The general form, with more
than just A or B, would be to have a reference to the current task; then you
could factor out the cancellation code with context managers, decorators or
something else.
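The factoring-out could look something like this (a sketch with invented names; the Task class here is a stand-in for an asyncio.Task that merely records whether cancel() was called, so the cancel-the-current-task step lives in one context manager instead of being repeated in every task body):

```python
import contextlib

current_task = None   # whichever task object is "running", if any

class Task:
    # Stand-in for asyncio.Task: just records whether cancel() was called.
    def __init__(self, name):
        self.name = name
        self.cancelled = False
    def cancel(self):
        self.cancelled = True

@contextlib.contextmanager
def exclusive(task):
    """Cancel whatever is running, install `task` as current, clear on exit."""
    global current_task
    if current_task is not None:
        current_task.cancel()   # with real asyncio you'd also wait for it
    current_task = task
    try:
        yield task
    finally:
        current_task = None

a = Task("A")
with exclusive(a):
    # While A "runs", the user asks for B; entering B's block cancels A.
    b = Task("B")
    with exclusive(b):
        pass   # B's linear work goes here
```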


Oscar