
PEP 492: isn't the "await" redundant?


Kouli

Aug 26, 2016, 5:04:48 AM
Hello,

I have recently discovered Python's coroutines and I am enjoying the
whole asyncio system a lot. But please help me understand one thing in
the design of Python's coroutines: why do we have to use "await" (or
"yield from") in coroutines? Why can't coroutines be used
_from_coroutines_ (those designated by 'async def') with a simple
call-like syntax (i.e. without the 'await' keyword)? The same goes for
'async with' and 'async for'. In most places where a coroutine is
referenced from another coroutine, it is referenced using "await".
Couldn't it be avoided in these places? This way, one would not have to
differentiate between function and coroutine "call" from within a
coroutine...

Current syntax:

async def work(x):
    await asyncio.sleep(x)

def main(x):
    loop.run_until_complete(work(x))

Proposed syntax:

async def work(x):
    asyncio.sleep(x)  # compiler "adds" 'await' automatically when in 'async def'

def main(x):
    loop.run_until_complete(work(x))  # compiler leaves as is when in 'def'

Historically, a generator was defined by using the keyword 'yield'
inside its definition. We now have explicit syntax with the keyword
'async', so why should we need the additional keyword 'await' as well?
I tried to come up with a (minor) example that would need the "leave as
is" behavior inside 'async def', but I haven't found such a coroutine
reference in the examples. Should it be needed (please tell me), it
would require a special syntax (at least when there are arguments -
without arguments one could simply leave out the parentheses).

Kouli

Chris Angelico

Aug 26, 2016, 5:18:38 AM
On Fri, Aug 26, 2016 at 6:55 PM, Kouli <d...@kou.li> wrote:
> I have recently discovered Python's coroutines and I am enjoying the
> whole asyncio system a lot. But please help me understand one thing in
> the design of Python's coroutines: why do we have to use "await" (or
> "yield from") in coroutines? Why can't coroutines be used
> _from_coroutines_ (those designated by 'async def') with a simple
> call-like syntax (i.e. without the 'await' keyword)?

Two reasons. One is that Python allows you to call any function and
inspect its return value - async functions are no different, and they
do return something. The other is that it makes yield points obvious.
Consider this hypothetical function:

async def get_user_data(id):
    await db.query("select name from users where id=?", (id,))
    name = (await db.fetchone())[0]
    # transaction handling elided
    return name

Now, suppose we're trying to figure out what's going on. One good
solid technique is what I call "IIDPIO debugging": If In Doubt, Print
It Out.

async def get_user_data(id):
    print("Starting g_u_d")
    q = db.query("select name from users where id=?", (id,))
    print(q)
    await q
    f = db.fetchone()
    print(f)
    name = (await f)[0]
    # transaction handling still elided
    print("g_u_d: Returning %r" % name)
    return name

It's completely obvious, here, that this function will call db.query,
print stuff out, and then put itself on ice until the query's done,
before attempting the fetch. If the call on the second line
automatically put the function into waiting mode, this display would
be impossible, and the wait points would be entirely implicit. (If you
want implicit wait points, use threading, not async I/O.)
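
(To make the first reason concrete - a minimal sketch, nothing fancy:
calling a coroutine function without awaiting it just hands back a
coroutine object, which you can print and inspect like any other value.)

import asyncio

async def work(x):
    await asyncio.sleep(x)

c = work(1)    # nothing runs yet - we just get an awaitable back
print(c)       # <coroutine object work at 0x...>
c.close()      # discard it unrun, silencing the "never awaited" warning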

It's also possible to wait for things that didn't come from function
calls per se. For instance, the same database lookup could be
implemented using an ORM, something like this:

class Table:
    async def __getitem__(self, id):
        if id in self._cache:
            return self._cache[id]
        await db.query(...)
        data = await db.fetchone()
        self._cache[id] = self.make_object(...)
        return self._cache[id]

Whether this is good code or not is up to you, but it's perfectly
legal, and would be used as "users = Table(...); my_user = await
users[123]". To allow that, Python absolutely has to allow arbitrary
expressions to be waited on, not just function calls.
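
(Another small sketch of awaiting something that isn't a function call -
here a plain asyncio Future; the names are illustrative only.)

import asyncio

loop = asyncio.get_event_loop()

async def demo():
    fut = loop.create_future()                 # a Future, not a coroutine call
    loop.call_later(0.1, fut.set_result, 42)   # something else completes it
    return await fut                           # awaiting an arbitrary expression

print(loop.run_until_complete(demo()))         # 42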

Does that answer the question?

ChrisA

Marko Rauhamaa

Aug 26, 2016, 5:41:48 AM
Kouli <d...@kou.li>:
> We now have explicit syntax with the keyword 'async', so why should we
> need the additional keyword 'await' as well?

This is an important question.

> This way, one would not have to differentiate between function and
> coroutine "call" from within a coroutine...

You'd still need to remember to add the 'async' keyword all over the
place.

How about making *every* function *always* an async, unconditionally?
That way *every* function would be an async and every function call
would be an await.


Marko

Chris Angelico

Aug 26, 2016, 5:50:13 AM
On Fri, Aug 26, 2016 at 7:41 PM, Marko Rauhamaa <ma...@pacujo.net> wrote:
> How about making *every* function *always* an async, unconditionally?
> That way *every* function would be an async and every function call
> would be an await.

If you want threading, you know where to find it.

ChrisA

Marko Rauhamaa

Aug 26, 2016, 6:08:25 AM
Chris Angelico <ros...@gmail.com>:
Ultimately, asyncio and multithreading might well merge. It will be
difficult for a programmer to decide in the beginning of the design
which way to go as the programming models are almost identical.


Marko

Chris Angelico

Aug 26, 2016, 6:49:15 AM
(Did you mean to send this to the list? I hope so; I'm replying to the list.)

On Fri, Aug 26, 2016 at 8:30 PM, Milan Krčmář <milan....@gmail.com> wrote:
>> Two reasons. One is that Python allows you to call any function and
>> inspect its return value - async functions are no different, and they
>> do return something. The other is that it makes yield points obvious.
>> Consider this hypothetical function:
>
> The return value could be inspected (with the proposed syntax) as well.
> The yield point is often visible just from the function you are using:
> asyncio.sleep() vs. sleep() etc.

Not sure how it could be inspected - only the resulting value could.
You couldn't see the Awaitable that gets returned in between.
Depending on the situation, that could be extremely useful.

>> Now, suppose we're trying to figure out what's going on. One good
>> solid technique is what I call "IIDPIO debugging": If In Doubt, Print
>> It Out.
>
> Yes, the 'q = ...; print(q); await q' is a use case to introduce await.
>
>>
>> async def get_user_data(id):
>>     print("Starting g_u_d")
>>     q = db.query("select name from users where id=?", (id,))
>>     print(q)
>>     await q
>
>> "users = Table(...); my_user = await users[123]"
>
> An interesting example. But the 'my_user = await users[123]' must
> appear inside 'async def', so it would be written as
> 'my_user = users[123]' in "my" syntax...

So what that really means is that the instant something hits an
awaitable, the entire thread gets paused. That's a perfectly
reasonable way of thinking... if you have a concept of threads that
get explicitly spun off. Otherwise, it's a bit tricky, because there's
no easy way to implement the boundary - the point at which the
awaitable gets added to a queue somewhere (i.e. the top of the event
loop).

> Chris, thank you for such a long reply. I now feel much more
> reconciled to the "verbose" syntax ;-)

No probs, happy to help out.

ChrisA

Gregory Ewing

Aug 26, 2016, 10:18:03 PM
Marko Rauhamaa wrote:

> How about making *every* function *always* an async, unconditionally?
> That way *every* function would be an async and every function call
> would be an await.

1. Many people regard it as a feature that you can see where
potential suspension points are.

2. Doing this would require massive changes to the core
interpreter and all C extensions. (The original version of
Stackless Python did something similar, and it was judged
far too big a change to incorporate into CPython.)

--
Greg

Marko Rauhamaa

Aug 27, 2016, 12:59:17 AM
Gregory Ewing <greg....@canterbury.ac.nz>:

> Marko Rauhamaa wrote:
>> How about making *every* function *always* an async,
>> unconditionally? That way *every* function would be an async and
>> every function call would be an await.
>
> 1. Many people regard it as a feature that you can see where
> potential suspension points are.

Yeah, it's actually crucial since every suspension point will also
require consideration for alternate stimuli like a possible cancellation
or timeout.
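
(For example - a sketch with illustrative names only - the await is
exactly where a cancellation will surface, so that is where you handle it:)

import asyncio

async def fetch(x):
    try:
        await asyncio.sleep(x)      # suspension point: cancellation lands here
    except asyncio.CancelledError:
        print("cancelled while waiting - clean up, then re-raise")
        raise

async def main():
    task = asyncio.ensure_future(fetch(10))
    await asyncio.sleep(0.1)
    task.cancel()                   # delivered at fetch()'s await
    try:
        await task
    except asyncio.CancelledError:
        pass

loop = asyncio.get_event_loop()
loop.run_until_complete(main())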


Marko

Kouli

Aug 27, 2016, 3:34:02 PM
Thank you for all your answers. After reading them, I am now more
comfortable with the current syntax.

The most important reason for 'await', to me, is now the fact that you
quite _often_ need to prepare the 'awaitable' object and wait for it
later (like ChrisA's example with print()), i.e. split the expression
across several lines:

fut = coro(x)
....
await fut

I assumed this was only a minor use case (compared to 'await coro(x)'),
but I have learned it isn't. Every time you need to "wait for more than
one thing" (more than one 'future'), you also need the split - not only
for parallel branching, but even for simple async operations combined
with a timeout, e.g. asyncio.wait_for(). And I prefer an explicit
'await' for simple waiting over a special syntax for splitting (i.e.
doing simple waiting without 'await', as in the proposal at the top of
this thread, plus a more complicated syntax for the split - something
like functools.partial(coro, x)).
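
(A sketch of that split combined with a timeout - 'coro' here is just a
stand-in for any real coroutine:)

import asyncio

async def coro(x):                 # stand-in for any real coroutine
    await asyncio.sleep(x)
    return x

async def main():
    fut = asyncio.ensure_future(coro(2))           # prepare the awaitable now
    # ... do other work here ...
    try:
        return await asyncio.wait_for(fut, 5.0)    # wait for it later, with a timeout
    except asyncio.TimeoutError:
        return None

loop = asyncio.get_event_loop()
print(loop.run_until_complete(main()))             # 2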

Kouli

Lawrence D’Oliveiro

Sep 9, 2016, 5:24:04 AM
On Friday, August 26, 2016 at 10:08:25 PM UTC+12, Marko Rauhamaa wrote:
> Ultimately, asyncio and multithreading might well merge. It will be
> difficult for a programmer to decide in the beginning of the design
> which way to go as the programming models are almost identical.

The two will never merge, because asyncio is non-preemptive while threading is preemptive. Threading is for compute performance (which is of little use in pure Python unless you write a C extension module), at the cost of much trickier programming and a greater propensity for bugs. Asyncio, on the other hand, lets you interleave background processing with waits for time-consuming external activities (I/O, including network I/O, or just waiting for the user to click a button or press a key), while keeping race conditions manageable.

So you see, they have very different application areas that only superficially overlap.

Chris Angelico

Sep 9, 2016, 5:28:42 AM
FWIW I think it's great that they have similar coding styles. We don't
have a problem with threading and multiprocessing having very similar
APIs, do we? Yet they exist to solve distinctly different problems.

ChrisA

Marko Rauhamaa

Sep 9, 2016, 7:27:12 AM
Chris Angelico <ros...@gmail.com>:
> FWIW I think it's great that they have similar coding styles. We don't
> have a problem with threading and multiprocessing having very similar
> APIs, do we? Yet they exist to solve distinctly different problems.

Well, Ext4, BtrFS, XFS and ReiserFS have very similar APIs. In fact,
they exist to solve the same problems. One day, a file system might
emerge that supersedes all other file systems.

It's a similar deal between asyncio and threading. The problem space is
the same: managing concurrency with almost identical programming models.


Marko

Chris Angelico

Sep 9, 2016, 8:10:53 AM
C, Python, Ruby, and COBOL exist to solve the same problems, and have
very similar APIs - write your code in a text file, then run it
through some parser so it executes. Are we some day going to eliminate
all programming languages bar one? I doubt it. And I don't want to.

ChrisA

Steve D'Aprano

Sep 9, 2016, 8:28:14 AM
On Fri, 9 Sep 2016 07:28 pm, Chris Angelico wrote:

> We don't
> have a problem with threading and multiprocessing having very similar
> APIs, do we? Yet they exist to solve distinctly different problems.

Surely not?

I would think that threading and multiprocessing are two distinct
solutions to the same problem: how to run two or more chunks of code
at the same time.

In CPython we usually say "use threads for I/O bound tasks, processes for
CPU bound tasks" but that's only because of the GIL. One doesn't need such
a distinction in Jython or IronPython.



--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

Chris Angelico

unread,
Sep 9, 2016, 8:39:04 AM9/9/16
to
On Fri, Sep 9, 2016 at 10:27 PM, Steve D'Aprano
<steve+...@pearwood.info> wrote:
> On Fri, 9 Sep 2016 07:28 pm, Chris Angelico wrote:
>
>> We don't
>> have a problem with threading and multiprocessing having very similar
>> APIs, do we? Yet they exist to solve distinctly different problems.
>
> Surely not?
>
> I would think that threading and multiprocessing are two distinct
> solutions to the same problem: how to run two or more chunks of code
> at the same time.
>
> In CPython we usually say "use threads for I/O bound tasks, processes for
> CPU bound tasks" but that's only because of the GIL. One doesn't need such
> a distinction in Jython or IronPython.

You also want to use processes if you need the ability to kill one of
them externally, or track resource usage separately, or keep
process-wide state such as the current working directory independent.
So there are some problems (e.g. multi-user services) where the
additional isolation is important.

In contrast, you want to use threads if you need the ability to
quickly and easily share mutable data, or if you want all resource
usage to be lumped together - e.g. if you're not really doing several
separate jobs, but one single conceptual job.

From a systems administration POV, threads logically belong together,
but processes are distinct beasts that communicate through
clearly-defined IPC. There are times when you want one, and times when
you want the other. The GIL just pulls a specific category of problem
out of the hands of threads and into the hands of processes, due to
its inability to spread Python code across CPU cores; but it didn't
create the distinction. If a future version of CPython eliminates the
GIL and allows threads to concurrently run CPU-heavy code, there will
still be a need for multiprocessing.

ChrisA

Lawrence D’Oliveiro

Sep 9, 2016, 5:03:33 PM
On Saturday, September 10, 2016 at 12:39:04 AM UTC+12, Chris Angelico wrote:
> In contrast, you want to use threads if you need the ability to
> quickly and easily share mutable data, or if you want all resource
> usage to be lumped together - eg if you're not really doing several
> separate jobs, but are doing one single conceptual job.

Multiple processes are usually preferable to multiple threads. The default-shared-nothing memory model is less bug-prone than default-shared-everything.

Think of every time you use “&” and “|” in a shell command line: you are creating multiple processes, and yet they are doing a “single conceptual job”.