WSGI and Async [was: Pyramid 2 ideas]


Alice Bevan–McGregor

Mar 24, 2011, 3:05:33 AM
to pylons...@googlegroups.com, web...@python.org
[Cross posted between the source of the discussion and the Web-SIG for
additional discussion there.]

On 2011-03-15 14:54:18 -0700, Mike Orr said:

> There has been an ongoing discussion between the WSGI developers and
> Twisted about how to be more compatible. The upshot is that
> asynchronous servers need some kind of token in the output stream that
> means "I'm not ready; come back later." Other middleware would have to
> pass this token through unchanged. And of course, the application would
> have to use non-blocking libraries such as non-blocking database
> executors. I'm not sure if ordinary file access is "blocking" enough to
> require that too.

Not just Twisted; have a gander at the Web-SIG mailing list for
December and January.[1]

Unfortunately the amount of interconnectedness (and thus complexity)
needed for a working solution takes the concept of async completely out
of the domain of a low-level specification like WSGI.

An "I'm not ready; come back later" token is already implemented in
marrow.server.http (yield None from your body iterator). As an example,
it would add an immediate (or slightly delayed) callback in the
reactor[2], which would then poll the application for real data. That's
not async; that's no better than AJAX polling! (And is unidirectional
to boot.)
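
To make the polling behaviour concrete, here is a minimal sketch of a body iterator that yields None until it has data, plus a toy loop standing in for the server. The names slow_body and drain are hypothetical and are not marrow.server.http's API; a real server would reschedule the poll through its reactor rather than spin in a loop.

```python
import collections


def slow_body(ready_after):
    """Body iterator: yields None ("not ready") a few times, then real data."""
    for _ in range(ready_after):
        yield None  # "I'm not ready; come back later"
    yield b"Hello, "
    yield b"world!"


def drain(body):
    """Toy stand-in for a server: keep re-polling the iterator until it ends.

    Returns the joined body and the number of polls that produced nothing.
    """
    pending = collections.deque([body])
    chunks, empty_polls = [], 0
    while pending:
        iterator = pending.popleft()
        try:
            chunk = next(iterator)
        except StopIteration:
            continue  # body finished; do not reschedule
        if chunk is None:
            empty_polls += 1  # nothing to send; schedule another poll
        else:
            chunks.append(chunk)
        pending.append(iterator)
    return b"".join(chunks), empty_polls


data, polls = drain(slow_body(3))
```

Every None costs one wasted trip through the loop, which is the polling overhead complained about above.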

Non-blocking libraries… how do they determine how to be non-blocking?
Socket and file operations, which can be easily made non-blocking
through the use of select/epoll/kqueue/libevent/libev, have the
distinction of being handled (and likely already used) at the WSGI
server's reactor level. How would a third-party library interface to
an existing reactor in an agnostic way? I'm fairly confident that it
just wouldn't be feasible.

If non-blocking libraries implemented their own async reactors… how
would you coordinate the mess of having, potentially, half a dozen
reactors?

Future objects take some of the headache away, allowing the WSGI
application to ask for some work to be performed by a third-party
library, returning a Future, then being suspended pending the result of
the Future. Futures are easy to detect (through duck typing) and can
be easily ignored (and passed along) by middleware.
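
As a rough illustration of the duck-typing point, the sketch below shows a middleware that transforms ordinary body chunks and forwards anything Future-like untouched. The helper is_future, the middleware, and the 'demo.future' environ key are all hypothetical names invented for this example; no existing server defines them.

```python
from concurrent.futures import Future


def is_future(obj):
    """Duck-typed check: treat anything with these methods as a Future."""
    return all(callable(getattr(obj, name, None))
               for name in ("result", "done", "add_done_callback"))


def uppercase_middleware(app):
    """Hypothetical middleware: upper-cases byte chunks, passes futures along."""
    def wrapper(environ, start_response):
        for chunk in app(environ, start_response):
            if is_future(chunk):
                yield chunk  # not ours to interpret; hand it to the server
            else:
                yield chunk.upper()
    return wrapper


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield environ["demo.future"]  # hypothetical key carrying pending work
    yield b"body"


pending = Future()
chunks = list(uppercase_middleware(app)(
    {"demo.future": pending}, lambda status, headers: None))
```

The middleware never needs to know which executor produced the future, only that it is not a byte string.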

Still, Futures are usually bound to an Executor (reactor), and that
executor instance would need to be passed to the third-party libraries
somehow. marrow.server.http provides a 'wsgi.executor' environment
variable which is, usually, a thread pool worker, which still doesn't
quite qualify for async status.
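
A minimal sketch of how an application might use such an executor, assuming the server places a concurrent.futures-style executor under the 'wsgi.executor' key. Only the key name comes from the post; the function blocking_lookup and the exact semantics are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def blocking_lookup(key):
    """Stand-in for a blocking third-party call (e.g. a database query)."""
    return {"answer": 42}.get(key)


def app(environ, start_response):
    executor = environ["wsgi.executor"]  # assumed: an Executor instance
    future = executor.submit(blocking_lookup, "answer")
    start_response("200 OK", [("Content-Type", "text/plain")])
    # result() blocks this worker until the pool finishes the job, so this
    # is offloading to a thread, not true async I/O.
    return [str(future.result()).encode("ascii")]


with ThreadPoolExecutor(max_workers=2) as pool:
    environ = {"wsgi.executor": pool}
    body = b"".join(app(environ, lambda status, headers: None))
```

The work still occupies a thread for its full duration, which is why a thread pool worker falls short of async.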

A PEP extending Futures for use with true async models would be a great
start, and could likely be combined with a simple extension to WSGI to
add the appropriate environment variables. Alex Grönholm, Graham
Dumpleton, everyone on the (excellent, if lacking in bloody combat ;)
WSGI panel at PyCon, and I seem to all agree that async has no part in
core WSGI. There would simply be no way to get a consensus on a single
API with so many disparate implementations already in the wild.

> The upshot has been that Twisted runs WSGI applications in a thread
> anyway because it can't be sure they won't block.

As does marrow.server.http if requested to do so. Extremely small or
efficient applications can choose not to.

> And there hasn't been enough interest from WSGI developers to actually
> pursue using it with asynchronous servers.

I've been interested, as has my partner in crime. We've actually
fiddled around with futures-based core IO reactors, different return /
yield styles for WSGI applications, and all sorts of crazy things, and
always came to the same conclusion. :(

> I think Python has a future object now which standardizes Twisted's
> Deferred and the equivalent in other asynchronous servers. So that's a
> start.

The core Futures implementation (concurrent.futures; core in Python 3.2
with a portable back-port maintained, I believe, by Alex) utilizes a
thread pool or process pool, has referential limitations (i.e. don't
pass the executor to a future running in a process pool… deadlocks are
bad), and I simply have no idea at this time how difficult it would be
to create a true async reactor under that model.
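
For reference, a minimal sketch of the thread-pool model using only concurrent.futures from the standard library. The callback on_done is a hypothetical name; it plays a role loosely similar to a Deferred callback, but it fires from a worker thread (or immediately in the caller if the future is already done), not from an event loop.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

results = []
done = threading.Event()


def on_done(future):
    """Runs once the future completes; no reactor is involved."""
    results.append(future.result())
    done.set()


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(pow, 2, 10)   # work runs in a pool thread
    future.add_done_callback(on_done)
    done.wait(timeout=5)               # wait for the callback to fire
```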


The end result of all of this is that async support should be its own
PEP, extending WSGI (333, now 3333) and potentially extending Futures,
PEP 3148[3], to create an acceptable generalized API for async
interfaces, not just worker pools. I've abandoned the idea for my own
WSGI 2 WIP, PEP 444[4], for which marrow.server.http is the reference
implementation (and idea sandbox).

— Alice.

P.S. If anyone has information I don't have, or simply can't remember
at midnight after a very long day, feel free to correct me! :)

[1] http://mail.python.org/pipermail/web-sig/

[2] I know, 'reactor' isn't exactly an accurate term. I just can't
remember the right one right now.

[3] http://www.python.org/dev/peps/pep-3148/

[4] http://bit.ly/fRyMJ2


Daniel Holth

Mar 24, 2011, 4:59:16 PM
to pylons...@googlegroups.com
Proposal: import eventlet

Done! Invented in 1963, coroutines let me use WSGI asynchronously without rewriting anything. What could be sweeter?

Alice Bevan–McGregor

Mar 24, 2011, 9:44:49 PM
to pylons...@googlegroups.com

Cross-platform, cross-implementation. (Both re: the greenlet C extension.)

Python 3 support, esp. now that we have PEP 3333.

Something that's officially out of beta status. (See the pypi page.)

Something that doesn't require hideous monkey patching.

Something light-weight. (Core is 7K SLoC in 76 files… and unit tests
fail badly on my system.)

Something better maintained. (http://blog.gevent.org/2010/02/27/why-gevent/)

Something with fewer new & open substantive issues.
(https://bitbucket.org/which_linden/eventlet/issues?status=new&status=open)

Something everyone can agree on. ;)

Feel free to correct me on any of the above points, though I'm aware
that the use of greenlet is optional via an installation-time
command-line argument.

— Alice.


Daniel Holth

Mar 25, 2011, 4:22:53 PM
to pylons...@googlegroups.com, Alice Bevan–McGregor
None of those points says anything about a perfect implementation of coroutines (I heard Lua has the best ones) versus callback-style or some other more-stackful style of asynchronicity. On the other hand, I'm sure there is a lot of Twisted code, for example, that doesn't need to be rewritten. I just feel proponents of one style sometimes forget to explain that the word 'async' by itself does not imply a particular programming style.
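
To illustrate the distinction between styles, here is a small standard-library sketch of the same "double a value" operation written callback-style and coroutine-style. All names are hypothetical, and the generator-based coroutine relies on modern Python 3 (return values from generators); it is not how eventlet or Twisted implement things.

```python
def fetch_callback(value, callback):
    """Callback style: the result is delivered to a function you pass in."""
    callback(value * 2)


def fetch_coroutine(value):
    """Coroutine style: suspend at `yield`, get resumed with the result."""
    doubled = yield ("double", value)
    return doubled + 1


def run(coro):
    """Tiny trampoline: answers each yielded request and resumes the coroutine."""
    try:
        request = next(coro)
        while True:
            _operation, argument = request
            request = coro.send(argument * 2)
    except StopIteration as stop:
        return stop.value


collected = []
fetch_callback(5, collected.append)
coroutine_result = run(fetch_coroutine(5))
```

Both versions are asynchronous in structure; they differ only in whether control returns through a passed-in function or by resuming a suspended frame.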