Why are Tasks not callable?

Paul Sokolovsky

Apr 24, 2014, 8:32:15 PM

to python...@googlegroups.com

Hello,

This must be a FAQ, but: 1) Google can't get the search results right; 2) people are going to ask it again, again, and again, so here it goes.

Why are Task objects in asyncio not callable? Just from studying asyncio, it becomes clear that its core is a loop which runs callbacks from a queue. So, how do we integrate coroutines (via generators) into that? Obviously, we add an adapter object which translates the call protocol into the generator protocol, and voila.
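
As a rough sketch of the adapter idea described above: the class name CallableCoro and the ticker() generator are hypothetical and nothing like them exists in asyncio. The only real asyncio calls used are new_event_loop(), call_soon(), run_forever(), stop() and close(). The adapter is a plain callable, so the user decides when to hand it to the loop.

```python
import asyncio

class CallableCoro:
    """Hypothetical adapter: each call advances the wrapped generator one step."""

    def __init__(self, gen, loop):
        self.gen = gen
        self.loop = loop
        self.finished = False

    def __call__(self):
        try:
            next(self.gen)
        except StopIteration:
            self.finished = True
            self.loop.stop()           # toy shutdown: only one adapter is running
        else:
            self.loop.call_soon(self)  # re-queue ourselves for the next step

def ticker(log):
    for i in range(3):
        log.append(i)
        yield                          # hand control back to the loop

log = []
loop = asyncio.new_event_loop()
adapter = CallableCoro(ticker(log), loop)
ran_before_scheduling = list(log)      # creating the adapter runs nothing
loop.call_soon(adapter)                # scheduling is a separate, explicit step
loop.run_forever()
loop.close()
```

This is only meant to show the shape of the proposal, not how asyncio.Task is implemented.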

Now, why that undocumented _step() method and, worse, why is it scheduled to be called in the *Task constructor*? How come just creating a task object makes it scheduled? Why are users denied the ability to schedule Tasks as they see fit? In general, why all this tight coupling between the event loop and Tasks?

Where this comes from: I'm looking for the simplest possible way to schedule a coroutine to be run. run_until_complete() and asyncio.wait() are too complicated and bloated for such a simple task (and that's what the asyncio docs give in examples).

Thanks.

Guido van Rossum

Apr 24, 2014, 11:32:43 PM

to Paul Sokolovsky, python-tulip

Paul,

I recommend that you take a step back and look at it from a different angle.

If you have a coroutine that is waiting for you to tell it to run again, you should use a Future. The coroutine should use "yield from" on it, and in the other code, when you are ready to make it run, you call set_result() (or set_exception()) on that same Future.

That's all there is to it!
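
A minimal sketch of the pattern described above, written with the async/await syntax of current Python rather than the "yield from" of the 3.4 era (generator-based coroutines are no longer supported by recent interpreters). The names waiter() and main() are made up for illustration.

```python
import asyncio

async def waiter(fut):
    # The coroutine parks here until someone resolves the Future.
    value = await fut
    return "got " + value

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    task = asyncio.ensure_future(waiter(fut))
    await asyncio.sleep(0)   # let waiter() run up to its await
    fut.set_result("go")     # "the other code" tells it to continue
    return await task

outcome = asyncio.run(main())
```

Calling fut.set_exception(...) instead would make the await inside waiter() raise.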

Good luck,

--Guido

--
--Guido van Rossum (python.org/~guido)

Paul Sokolovsky

Apr 25, 2014, 12:33:41 PM

to gu...@python.org, python-tulip, Victor Stinner

Hello,

On Thu, 24 Apr 2014 20:32:43 -0700
Guido van Rossum <gu...@python.org> wrote:

> Paul,
>
> I recommend that you take a step back and look at it from a different
> angle.

I appreciate the BDFL taking time to respond to this! I'm a novice at
this and so far definitely look at it from one angle (specifically, how
to use asyncio with coroutines and not let other stuff like Futures
get in the way). I can imagine that the discussed interface/behavior of
Tasks comes from requirements to interact with other asyncio entities.
I would appreciate help in understanding which ones and why exactly.

>
> If you have a coroutine that is waiting for you to tell it to run
> again, you should use a Future. The coroutine should use "yield from"
> on it, and in the other code, when you are ready to make it run, you
> call set_result() (or set_exception()) on that same Future.

... Unfortunately this description doesn't provide much insight to
me ;-(. I hope it's not because I'm dumb, but because, as many
presentations mention, details of async programming in general, and
of the asyncio module in particular, may not be obvious to developers
who are used to a conventional programming model, so people need to
spend time to learn it.

Back to your description, I'm not even sure which of my questions it
answers. Granted, there were a few of them, and actually, after I wrote
the original mail, it occurred to me that I had all the data to answer
one of the questions I posed: "What is the simplest possible way to
schedule a coroutine to be run?". But the answer was so unexpected, so
ambush-like, that it didn't occur to me at that time:

The way to schedule a coroutine is to instantiate a Task object for
it.

So, the main question which remains is: How can instantiation of an
object have such a strong side effect? Why is such an object modestly
called Task, and not
SelfSchedulingTaskViolatingExplicitIsBetterThanImplicitPrinciple?

Actually, my first question was: is it documented anywhere at all? And
re-reading the asyncio docs, I found a proverbial example of fine print.
So, there's section "18.5.2.4.
Task" (https://docs.python.org/3.4/library/asyncio-task.html#task), it
has subsection "18.5.2.4.1. Example: Parallel execution of tasks". From
the name it's clear that the subsection contains just an example and
commentary on it. So, after a long example block, in the very last short
line of the subsection - a place where few people will pay enough
attention to it - I found that sacred phrase: "A task is automatically
scheduled for execution when it is created. The event loop stops when
all tasks are done."

--
Best regards,
Paul mailto:pmi...@gmail.com

Gustavo Carneiro

Apr 25, 2014, 1:07:02 PM

to Paul Sokolovsky, Guido van Rossum, python-tulip, Victor Stinner

On 25 April 2014 17:33, Paul Sokolovsky <pmi...@gmail.com> wrote:

Hello,

Because the whole point of a Task is to be scheduled for execution, it makes no sense to instantiate a Task and then have to call a separate method to schedule it. If you do not want to schedule it _right now_ then what is the point of creating the Task in the first place?

I suspect your expectations are tainted by previous knowledge of the threading API, which has a separate Thread.start() method. I think it makes _some_ sense that Thread objects do not start the actual thread automatically, since threads are preemptive and prone to race conditions, and you may want to store the Thread object in some data structure _before_ the thread actually begins executing. With asyncio.Task, even if the task is scheduled to be executed, it is guaranteed not to be executed until you reach a "yield from" statement, so you have plenty of opportunity to do any setup prior to the task executing.
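
The ordering guarantee described above can be checked with a small experiment. This is a sketch in current async/await syntax, where `await` plays the role of "yield from"; worker(), main() and the events list are illustrative names only.

```python
import asyncio

events = []

async def worker():
    events.append("worker ran")

async def main():
    task = asyncio.create_task(worker())   # scheduled here, but not yet run
    events.append("task created")
    # Still no await reached, so there is time for setup on the task.
    task.add_done_callback(lambda t: events.append("done callback"))
    events.append("setup done")
    await task                             # control returns to the loop here

asyncio.run(main())
```

The worker body only appears in events after "setup done", because nothing else can run until main() reaches its first await.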

--

Gustavo J. A. M. Carneiro

Gambit Research
"The universe is always one step beyond logic." -- Frank Herbert

Paul Sokolovsky

Apr 25, 2014, 2:05:48 PM

to Gustavo Carneiro, python-tulip

Hello,

On Fri, 25 Apr 2014 18:07:02 +0100
Gustavo Carneiro <gjcar...@gmail.com> wrote:

[]

> > So, the main question which remains is: How an instantiation of
> > object may have such a strong side effect? Why such an object is
> > modestly called Task, and not
> > SelfSchedulingTaskViolatingExplicitIsBetterThanImplicitPrinciple?
> >
> > Actually, my first question was: is it ever documented somewhere?
> > And re-reading asyncio docs, I found a proverbial example of
> > fineprint. So, there's section "18.5.2.4.
> > Task" (https://docs.python.org/3.4/library/asyncio-task.html#task),
> > it has subsection "18.5.2.4.1. Example: Parallel execution of
> > tasks". From the name it's clear that the subsection contains just
> > an example and commentary to it. So, after a long example block, in
> > very last short line of the subsection - a place where few people
> > will pay enough attention to it, I found that sacral phrase - "A
> > task is automatically scheduled for execution when it is created.
> > The event loop stops when all tasks are done."
> >
> >
> Because the whole point of Task is to be scheduled for execution, it
> makes no sense to instantiate a Task and then have to call a separate
> method the schedule it.

What a fresh perspective! Indeed, why not take all objects and make
them do "the most expected action" by default right away. Why all those
messy methods on Popen - it can just slurp any output on construction
and terminate the child process, HttpRequest can do a GET and slurp
entire response, etc.

> If you do not want to schedule it _right
> now_ then what is the point of creating the Task in the first place?

Because asyncio forces me! Advertised as a native Python async framework,
it doesn't even allow native Python coroutines to be used directly, and
instead appears to be an attempt to marry Twisted and Node.JS, with native
Python features being second-class citizens that have to be disguised
as those projects' constructs, which have questionable traits hinting
at their heritage.

> I suspect your expectactions are tainted by the previous knowledge of
> the threading API, which has a separate Thread.start() method. I

My expectations are "tainted" by: 1) the basic programming rule of thumb
that you first initialize things properly, and then execute them; 2) an
intuitive feeling for, and even explicit knowledge of, Python's "explicit
is better than implicit" principle; 3) acquaintance (cursory, I have to
admit) with the many-year history of using generators/coroutines for async
cooperative multitasking, and a desire to do that using the standardized
API asyncio promotes.

> think it makes _some_ sense that Thread objects do not start the
> actual thread automatically, since threads are preemptive and prone
> to race conditions, and you may want to store the Thread object in
> some data structure _before_ the thread actually begins executing.
> With asyncio.Task, even if the task is scheduled to be executed, it
> is guaranteed not to be executed until you reach "yield from"
> statement, so you have plenty of opportunity to any setup prior to
> the task executing.

Let's sum up what you're saying here: the asyncio Task implementation, by
relying on internal asyncio implementation details (so naive users
who fixate on such behavior will fail miserably in other
contexts), violates the "Explicit is better than implicit" principle *just
because it can*?

Gustavo Carneiro

Apr 25, 2014, 2:27:31 PM

to Paul Sokolovsky, python-tulip

Actually, Popen is a good example. When you instantiate a Popen object, the subprocess is launched immediately, even before the Popen object is returned to the caller. If you want to do any additional setup, you can optionally pass a preexec_fn parameter. This parameter is optional because it is so rarely needed. It makes sense that Popen launches a subprocess directly because that's what it is there to do; it's obvious.
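
For reference, a small sketch of the Popen behaviour mentioned above. It launches the current Python interpreter via sys.executable so that it does not assume any particular external program; the child command and variable names are illustrative only.

```python
import subprocess
import sys

# The child process is started by the constructor itself; there is no
# separate start() call.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('child ran')"],
    stdout=subprocess.PIPE,
)
pid_available = proc.pid is not None   # a pid exists as soon as Popen() returns
output, _ = proc.communicate()         # wait for the child and collect stdout
```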

> > If you do not want to schedule it _right
> > now_ then what is the point of creating the Task in the first place?
>
> Because asyncio forces me! Advertised as native Python async framework,
> it doesn't even allow to use directly native Python coroutines, and
> instead appears to be attempt to marry Twisted and Node.JS, with native
> Python features being second-class citizens requiring being disguised
> into those projects' stuff, which has questionable traits, hinting of
> its heritage.

I do not know what you mean by "native Python coroutines". As far as I'm aware, there are no "native" Python coroutines. Natively, Python has generator functions. Asyncio is a library that creates an illusion of coroutines on top of generator functions.

If you are complaining about whether or not Python should have native coroutines, with deep integration with the language itself, then I think that discussion is out of scope for this list.

> > I suspect your expectactions are tainted by the previous knowledge of
> > the threading API, which has a separate Thread.start() method. I
>
> My expectations are "tainted" by: 1) basic programming rule of thumb
> that you first initialize things properly, and then execute them; 2)
> intuitive feeling, and even explicit knowledge, of Python's "explicit is
> better than implicit" principle; 3) acquaintance (cursory, I have to
> admit) with many-year history of using generators/coroutines for async
> cooperative multitasking, and desire to use that using standardized API
> asyncio promotes.

As I said, a Task is not executed immediately upon instantiation; it is merely _scheduled_ to be executed as soon as you return control to the main loop. You have plenty of opportunity to initialize things before the task is executed.

> > think it makes _some_ sense that Thread objects do not start the
> > actual thread automatically, since threads are preemptive and prone
> > to race conditions, and you may want to store the Thread object in
> > some data structure _before_ the thread actually begins executing.
> > With asyncio.Task, even if the task is scheduled to be executed, it
> > is guaranteed not to be executed until you reach "yield from"
> > statement, so you have plenty of opportunity to any setup prior to
> > the task executing.
>
> Let's sum up what you're saying here: asyncio Task implementation, by
> relying on internal asyncio implementation details (so, naive users
> who will get fixation on such behavior will fail miserably in other
> contexts), violates "Explicit is better than implicit" principle *just
> because it can* ?

I'm not sure what you mean.

I suspect you want to do something clever and non-standard with Task and asyncio, and are facing an obstacle due to the way Task schedules itself for execution. I believe automatic scheduling is what normal asyncio users want. If you are trying to achieve something "less normal", maybe you can work with the Tulip maintainers to allow instantiating a Task without it being automatically scheduled for execution. For instance, by adding a keyword parameter to the constructor:

Task(mygenerator(), schedule_for_execution=False)

Regards,

--

Paul Sokolovsky

Apr 25, 2014, 3:47:56 PM

to Gustavo Carneiro, python-tulip

Hello,

On Fri, 25 Apr 2014 19:27:31 +0100
Gustavo Carneiro <gjcar...@gmail.com> wrote:

[]

> I do not know what you mean by "native Python coroutines".

I mean the same as this PEP: http://legacy.python.org/dev/peps/pep-0342/

> As far as
> I'm aware, there are no "native" Python coroutines. Natively, Python
> has generator functions. Asyncio is a library that creates an
> illusion of co-routine on top of generator functions.
>
> If you are complaining about whether or not Python should have native
> coroutines, with deep integration with the language itself, then I
> think that discussion is out of scope for this list.
>
>
> >
> > > I suspect your expectactions are tainted by the previous
> > > knowledge of the threading API, which has a separate
> > > Thread.start() method. I
> >
> > My expectations are "tainted" by: 1) basic programming rule of thumb
> > that you first initialize things properly, and then execute them; 2)
> > intuitive feeling, and even explicit knowledge, of Python's
> > "explicit is better than implicit" principle; 3) acquaintance
> > (cursory, I have to admit) with many-year history of using
> > generators/coroutines for async cooperative multitasking, and
> > desire to use that using standardized API asyncio promotes.
> >
>
> As I said, Task is not executed immiediately upon instantiation, it is
> merely _scheduled_ to be executed as soon as you return control to
> the main loop. You have plenty of opportunity to initialize things
> before the task is executed.

And that's neither natural nor obvious. Take for example
https://docs.python.org/3.4/library/asyncio-task.html#example-parallel-execution-of-tasks
One may think that it's asyncio.wait() which, after being scheduled and
run itself, causes its argument tasks to be scheduled and run. So, one may
want to precreate a task list to be passed to wait() at a later time, when
actually needed. But nope, that's not how it works - tasks will start to
run even before being passed to wait().
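
The surprise described above can be reproduced with a short sketch (current async/await syntax; job() and main() are illustrative names). The tasks are created up front, the loop gets control once through an unrelated await, and only then is wait() called.

```python
import asyncio

async def job(name, log):
    log.append(name)

async def main():
    log = []
    # "Precreate" the tasks, intending to pass them to wait() later.
    tasks = [asyncio.ensure_future(job(n, log)) for n in ("a", "b")]
    await asyncio.sleep(0)            # any await hands control to the loop
    started_before_wait = list(log)   # both jobs have already run by now
    await asyncio.wait(tasks)
    return started_before_wait

started_before_wait = asyncio.run(main())
```

So wait() does not start anything; it only waits for tasks that were scheduled when they were created.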

> > > think it makes _some_ sense that Thread objects do not start the
> > > actual thread automatically, since threads are preemptive and
> > > prone to race conditions, and you may want to store the Thread
> > > object in some data structure _before_ the thread actually begins
> > > executing. With asyncio.Task, even if the task is scheduled to be
> > > executed, it is guaranteed not to be executed until you reach
> > > "yield from" statement, so you have plenty of opportunity to any
> > > setup prior to the task executing.
> >
> > Let's sum up what you're saying here: asyncio Task implementation,
> > by relying on internal asyncio implementation details (so, naive
> > users who will get fixation on such behavior will fail miserably in
> > other contexts), violates "Explicit is better than implicit"
> > principle *just because it can* ?
> >
>
> I'm not sure what you mean.
>
> I suspect you want to do something clever and non-standard with Task
> and asyncio, and are facing an obstacle due to the way Task schedules
> itself for execution.

I would like to write an alternative asyncio implementation, centered
around efficient coroutine support. My implementation would allow
scheduling coroutines in the event loop directly, but as asyncio doesn't
allow that, I would need to provide those asyncio.async() and
asyncio.Task() adapters. My initial thought was that those would be just
identity functions, but oops - it turned out that those functions/objects
have side effects, such that it's actually pretty hard to have both a
lean, simple and clean implementation/API and upstream asyncio
compatibility.

> I believe this is what normal asyncio users
> want. If you are trying to achieve something "less normal", maybe
> you can work with the Tulip maintainers to allow instantiating a Task
> without it being automatically scheduled for execution. For
> instance, by adding a keword parameter to the constructor:
>
> Task(mygenerator(), schedule_for_execution=False)

Well, I know too little of asyncio to suggest something like that; what
I'm trying to achieve first is to understand why asyncio was
designed/implemented this way. From my point of view, however, the best
solution would be to add a loop method allowing one to schedule a
coroutine/generator directly. The implementation for mainline asyncio
would be a trivial wrapper around Task(), but other implementations could
implement just e.g. coroutine support w/o extra wrappers and layers.
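
A sketch of what such a wrapper could look like on top of mainline asyncio. The function name schedule() is hypothetical; asyncio.Task and its loop keyword are real. Later asyncio releases did add a loop.create_task() method that serves this purpose.

```python
import asyncio

def schedule(loop, coro):
    """Hypothetical helper: schedule a coroutine on a given loop.

    On mainline asyncio this is just a thin wrapper around Task();
    an alternative implementation could queue the coroutine directly.
    """
    return asyncio.Task(coro, loop=loop)

async def hello():
    return "done"

loop = asyncio.new_event_loop()
task = schedule(loop, hello())
result = loop.run_until_complete(task)
loop.close()
```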

Paul Sokolovsky

May 1, 2014, 10:14:37 PM

to Paul Sokolovsky, python-tulip

Hello,

On Fri, 25 Apr 2014 21:05:48 +0300
Paul Sokolovsky <pmi...@gmail.com> wrote:

[]

> > I suspect your expectactions are tainted by the previous knowledge
> > of the threading API, which has a separate Thread.start() method. I
>
> My expectations are "tainted" by: 1) basic programming rule of thumb
> that you first initialize things properly, and then execute them; 2)
> intuitive feeling, and even explicit knowledge, of Python's "explicit
> is better than implicit" principle; 3) acquaintance (cursory, I have
> to admit) with many-year history of using generators/coroutines for
> async cooperative multitasking, and desire to use that using
> standardized API asyncio promotes.
>
> > think it makes _some_ sense that Thread objects do not start the
> > actual thread automatically, since threads are preemptive and prone
> > to race conditions, and you may want to store the Thread object in
> > some data structure _before_ the thread actually begins executing.
> > With asyncio.Task, even if the task is scheduled to be executed, it
> > is guaranteed not to be executed until you reach "yield from"
> > statement, so you have plenty of opportunity to any setup prior to
> > the task executing.
>
> Let's sum up what you're saying here: asyncio Task implementation, by
> relying on internal asyncio implementation details (so, naive users
> who will get fixation on such behavior will fail miserably in other
> contexts), violates "Explicit is better than implicit" principle *just
> because it can* ?

Ok, I did some (re)reading on the topic, and had some time to think
about it, based on the arguments provided, and here are some additional
thoughts and arguments:

Point #1

First of all, I probably should have mentioned that my expectations for
a coroutine scheduler are set by the wonderful series on generators and
coroutines by David Beazley. This specific slide gives the essence of
it:
http://www.slideshare.net/dabeaz/a-curious-course-on-coroutines-and-concurrency-5286140/137
. So, it's possible to write a *coroutine* scheduler in such a way that
coroutines do not (and cannot, if needed) access the main loop directly.
They communicate with it using yield/yield from, which serve the same
purpose as a syscall in an OS design. So, knowing that Python offers such
a level of separation, it added to my cognitive dissonance to see that
asyncio not only does not separate object access, it even tightly couples
the behavior of Task to a loop.
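
To make the "yield as syscall" idea concrete, here is a toy round-robin scheduler in that spirit. It is not taken from the slides; run() and counter() are made-up names. The generators never see the scheduler object and only talk to it through the values they yield.

```python
from collections import deque

def run(coros):
    """Toy scheduler: resume each generator in turn until all finish."""
    ready = deque(coros)
    trace = []
    while ready:
        coro = ready.popleft()
        try:
            request = next(coro)   # the yielded value is the "syscall"
        except StopIteration:
            continue               # this coroutine is finished, drop it
        trace.append(request)      # a real scheduler would act on the request
        ready.append(coro)         # put it back at the end of the queue
    return trace

def counter(name, n):
    for i in range(n):
        yield (name, i)            # the only contact with the scheduler

trace = run([counter("a", 2), counter("b", 1)])
```

Here the scheduler only records the requests; handling I/O or sleep requests would go where the trace is appended.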

Point #2

The latest of David's series was presented just recently at PyCon
2014: http://www.dabeaz.com/finalgenerator/ . Starting from slide 43, he
presents a step-by-step walkthrough of building a concurrent execution
framework, which (un)surprisingly shapes up as having almost the same
API and architecture as asyncio. So, it should be fair to say that those
slides are a good tutorial on asyncio design for dummies. His
framework is very similar to asyncio: it starts with
callbacks, then switches to coroutines as a more adequate representation,
they get wrapped in Tasks for bookkeeping, results are represented by
Futures, then it's shown that Task and Future share many traits, so it
makes sense to make one a subclass of the other, etc.

They are very similar except for one implementation detail: David's
framework doesn't use cooperative multitasking for execution, but
rather a thread pool. You can easily imagine what that means: a started
Task really does start immediately, so if it suddenly starts behind the
user's back, there's no time to add callbacks to it later. That's why
David's framework doesn't start Tasks behind the user's back, which is
the natural solution (like, you don't need to know that it doesn't start
them - it's just the default choice). During the initial stages of the
design, Tasks are kickstarted using a .step() method; later, explicit
scheduling functions are introduced: start_inline_future(),
run_inline_future().

So, let's step back and survey the situation.
https://docs.python.org/3.4/library/asyncio-task.html#future explicitly
says that asyncio.Future is "almost" compatible with
concurrent.futures.Future. Why "almost"? Apparently because
concurrent.futures.Future has some features that depend on the concurrent
execution model, and specifically on the underlying thread/process
implementations, which don't map well to the cooperative/event loop
execution model. PEP-3156 explicitly mentions that it would be nice to
unify both Futures in the future.

Certainly, asyncio would learn from such experience and try to provide
an API model that doesn't rely on particular underlying details which
would hamper compatibility and reuse, yes? No, because what we are
talking about is that asyncio (ab)uses the fact that the underlying event
loop doesn't start execution immediately, so it forcefully schedules a
Task and makes the user apply important changes to it after it is already
in an active state, which is backwards from a general point of view.

Point #3

Yet another perspective. Ok, after all, there's nothing wrong with being
able to schedule a coroutine using a global function - after all,
Point #1 above praises complete separation between coroutines and the loop
using a yield. As yield cannot be used outside a function, it's not
such a bad idea to provide a global function to schedule a coroutine. One
problem here is that "Task" or "async" are not very suggestive names for
a function which performs scheduling. Actually, I have a hypothesis as to
why it's hard to imagine such a purpose for them at all. It's
grounded in a dichotomy in the asyncio API:

1. Some operations are expressed as methods of the event loop object, e.g.

loop.run_forever()
loop.call_soon()

2. While others are expressed as global functions taking an optional loop
parameter:

asyncio.wait(..., loop=None, ...)
asyncio.sleep(..., loop=None)

This API asymmetry is not particularly obvious at first look. The docs
start with a description of loop methods, which kind of sets the
expectation that all important functions should be available as such, and
that the rest are just objects/factory functions, and not normal functions
with side effects, to which category both

asyncio.Task(..., loop=None)
asyncio.async(..., loop=None)

should be related (regardless of the actual implementation details, like
the fact that "Task" is implemented as a class).
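
The two API styles can be seen side by side in a short sketch. Note that in current Python the module-level functions such as asyncio.sleep() no longer accept the loop= parameter shown in the 3.4 signatures above, so it is omitted here; the calls list is an illustrative name.

```python
import asyncio

loop = asyncio.new_event_loop()
calls = []

# Style 1: an operation expressed as a method of the loop object.
loop.call_soon(calls.append, "call_soon")

# Style 2: a module-level function; its coroutine is handed to the loop,
# which wraps it in a Task behind the scenes.
loop.run_until_complete(asyncio.sleep(0))

loop.close()
```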

How can this issue be solved (besides being clearly described in the docs)?
Well, it would help if the module offered just one variety of
API. For example, my problem is that I expected all operations to be
available as methods of the loop.

But dropping that and having stuff like:

asyncio.run_forever(loop=None)

would work just as well, and probably would allow for an even more
efficient implementation (no need for a dummy loop object when we have
an "embedded loop", for example).

Finally, having both models, but offering more complete coverage of
operations in both (with easy-to-understand names), would be good too.

Guido van Rossum

May 1, 2014, 10:41:52 PM

to Paul Sokolovsky, python-tulip

Paul,

Where were you when PEP 3156 was being discussed?

There's probably a very good reason that explains why the current API is "right", but the point is moot -- we have selected an API, we have implemented it, we have released it, and now we should live with it and start using it.

--Guido

Paul Sokolovsky

May 2, 2014, 12:38:16 PM

to gu...@python.org, python-tulip

Hello,

On Thu, 1 May 2014 19:41:52 -0700
Guido van Rossum <gu...@python.org> wrote:

> Paul,
>
> Where were you when PEP 3156 was being discussed?

I wasn't around, sorry ;-).

>
> There's probably a very good reason that explains why the current API
> is "right", but the point is moot -- we have selected an API, we have
> implemented it, we have released it, and now we should live with it
> and start using it.

Yeah, "start using" is exactly what I'm trying to do. I hope this close
attention to the asyncio API doesn't get misinterpreted - I'm sure many
people, myself included, consider asyncio a very important module, so
detailed attention to it is yet to come IMHO. Besides, the PEP mentions
that there's room for adjustments until 3.5, if they are found
worthy. But as I mentioned, I'm not pleading for any (specific) changes
(besides docs clarifications), I just consider it good to learn what
asyncio *is* by considering how it compares with expectations from
prior art and how it could be different. If this thread gives
insight to a casual googler later, I'll consider it to have served its
purpose well.

With that intro, I'd like to finish this thread with conclusions on
the specific use case I intended for asyncio. As I mentioned previously,
the idea was to implement a "light" version of asyncio for
MicroPython, so that 1) the same code could run unmodified on both
the CPython and uPython implementations, while 2) the uPython
implementation would be really light on resource usage. The 2nd
requirement means that the framework should be a very thin layer on top of
coroutines, as they are implemented at the C level and thus inherently
more efficient than Python callbacks and Future instances to wrap them
(and I spent a lot of time figuring out how "yield from" should work and
implementing that).

Well, a careful reading of PEP 3156 sets the record straight. asyncio was
designed with quite different requirements! To put it in a funny way, I
took for granted that Python's async framework would support
coroutines, and then looked for an explanation of why all those Futures
and Tasks stand in my way of just using them. But asyncio actually
started with that proverbial "least common denominator" of callbacks,
and then provided an excuse to support native coroutines in the same
framework (because the BDFL doesn't like callbacks. Just kidding ;-) ).
Specifically, the event loop abstraction lacks any support
for coroutines, and coroutine support is fully implemented on top of
the public event loop API. Here are the relevant quotes from the PEP:

"For users (like myself) who don't like using callbacks, a scheduler is
provided for writing asynchronous I/O code as coroutines using the PEP
380 yield from expressions. The scheduler is not pluggable;
pluggability occurs at the event loop level, and the standard scheduler
implementation should work with any conforming event loop
implementation. (In fact this is an important litmus test for
conforming implementations.)"

"For interoperability between code written using coroutines and other
async frameworks, the scheduler defines a Task class that behaves like
a Future."

"The scheduler has no public interface. You interact with it by using
yield from future and yield from task. In fact, there is no single
object representing the scheduler -- its behavior is implemented by the
Task and Future classes using only the public interface of the event
loop, so it will work with third-party event loop implementations, too."
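
The layering these quotes describe can be illustrated with a toy task class that drives a generator using nothing but the loop's public call_soon(). MiniTask and work() are hypothetical names, and this is a deliberately simplified sketch, not asyncio's actual Task implementation (which also handles Futures, exceptions and cancellation).

```python
import asyncio

class MiniTask:
    """Toy task: steps a generator using only loop.call_soon()."""

    def __init__(self, gen, loop):
        self._gen = gen
        self._loop = loop
        self.done = False
        self.result = None
        loop.call_soon(self._step)   # scheduled from the constructor

    def _step(self):
        try:
            next(self._gen)
        except StopIteration as exc:
            self.result = exc.value  # the generator's return value
            self.done = True
            self._loop.stop()        # toy shutdown: assumes a single task
        else:
            self._loop.call_soon(self._step)

def work():
    yield          # each bare yield gives the loop a turn
    yield
    return 42

loop = asyncio.new_event_loop()
task = MiniTask(work(), loop)
loop.run_forever()
loop.close()
```

Because only call_soon() is used, the same class would run on any event loop that provides that method, which is the property the PEP text emphasizes.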

Does it make sense? Pretty much, especially taking into account that
asyncio's "business case" is providing a foundation for various
existing async frameworks to interoperate.

As I mentioned previously, I thought that adding a loop method
to allow scheduling a coroutine directly would solve my issues, but based
on the requirements above, it's a no-go, as it would break asyncio's
layering.

So, that's it - asyncio has respectable aims and requirements, but
those unfortunately do not cover all possible requirements for an async
framework that the Python community may have. That of course comes as a
little surprise, but I hope people who use other frameworks will at
least give asyncio a try and, if they dismiss it, do so based on a
requirements mismatch, and not because they didn't have time for it,
don't "like" it, or don't understand it.

My specific choice for MicroPython then is to develop an event loop with
native coroutine support - it will mimic the asyncio API, but won't be
compatible, so another adaptation layer will be needed to run
asyncio-upy on native asyncio. And I don't know where that will lead me
- maybe I'll find Futures and wrapping coros in Tasks to be unavoidably
useful, and then it will be just one small step to full asyncio
compatibility. We'll see.

In the meantime, while analyzing all this stuff, I drafted a trivial
asyncio subset implementation (native API) which is capable of running
the examples from the asyncio docs (event loop & coroutines/tasks sections).
Maybe it will be useful for others studying asyncio design:
https://github.com/micropython/micropython-lib/tree/asyncio/asyncio_slow
