Re: [Web-SIG] PEP 444 / WSGI 2 Async

Alice Bevan–McGregor

Jan 5, 2011, 11:01:47 PM

to web...@python.org

[Apologies if this is a double- or triple-post; I seem to be having a
stupid number of connectivity problems today.]

Howdy!

Apologies for the delay in responding; it’s been a hectic start to the
new year. :)

On 2011-01-03, at 6:22 AM, Timothy Farrell wrote:

> You don't know me but I'm the author of the Rocket Web Server
> (http://pypi.python.org/pypi/rocket) and have, in the past, been
> involved in the web2py community. Like you, I'm interested in seeing
> web development come to Python3. I'm glad you're taking up WSGI2. I
> have a feature-request for it that perhaps we could work in.

Of course; in fact, I hope you don’t mind that I’ve re-posted this
response to the web-sig mailing list. Async needs significantly
broader discussion. I would appreciate it if you could reply to the
mailing list thread.

> I would like to see futures added as a server option. This way,
> controllers could dispatch emails (or run some other blocking or
> long-running task) that would not block the web-response. WSGI2
> Servers could provide a futures executor as environ['wsgi.executor']
> that the app could use to offload processes that need not complete
> before the web-request is served to the client.

E-mail dispatch is one of the things I solved a long time ago with
TurboMail; it uses a dedicated thread pool and can deliver > 100 unique
messages per second (more if you use BCC) in the default configuration,
so I don’t really see that one use case as one that can benefit from
the futures module. Updating TurboMail to use futures would be an
interesting exercise. ;)

I was thinking of exposing the executor as
environ['wsgi.async.executor'], with 'wsgi.async' being a boolean value
indicating support.

> What should the server do with the future instances?

The executor returns future instances when running executor.submit/map;
the application never generates its own Future instances. The
application may, however, use whatever executor it sees fit; it can,
for example, have one thread pool executor and one process pool, used
for different tasks.
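
A minimal sketch of the use case quoted above, assuming the server exposes a concurrent.futures executor under environ['wsgi.executor'] (the key name comes from the proposal and is not settled) and that the application returns a PEP 444-style (status, headers, body) tuple. The send_welcome_email function and the hand-built environ are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

sent = []  # stands in for a real outgoing mail queue


def send_welcome_email(address):
    # Hypothetical slow task; a real one would talk to an SMTP server.
    sent.append(address)
    return address


def app(environ):
    # Hand the slow work to the server-provided executor and respond at once.
    environ['wsgi.executor'].submit(send_welcome_email, 'user@example.com')
    return b'200 OK', [(b'Content-Type', b'text/plain')], [b'Queued.\n']


# What a server might do: create one executor and expose it on each request.
executor = ThreadPoolExecutor(max_workers=2)
status, headers, body = app({'wsgi.executor': executor})

# Only the demo waits here; shutdown() blocks until pending tasks finish.
executor.shutdown()
```

The response tuple is available before the task is known to have run, which is the point of the request: the client is not kept waiting on the e-mail.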

The server itself can utilize any combination of single-threaded
IO-based async (see further on in this message), and multi-threaded or
multi-process management of WSGI requests. Resuming suspended
applications (ones pending future results) is an implementation detail
of the server.

> Should future.add_done_callback() be allowed? I'm not sure how
> practical/reliable this would be. (By the time the callback is called,
> the calling environment could be gone. Is this undefined behavior?)

If you wrap your callback in a partial(my_callback, environ) the
environ will survive the end of the request/response cycle (due to the
incremented reference count), and should be allowed to enable
intelligent behaviour in the callbacks. (Obviously the callbacks will
not be able to deliver a response to the client at the time they are
called; the body iterator can, however, wait for the future instance to
complete and/or timeout.)
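
The partial() idea can be sketched as follows, again assuming a server-provided environ['wsgi.executor']. The callback name, the log list, and the environ contents are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial

log = []


def my_callback(environ, future):
    # partial() supplies environ; the executor supplies the finished future.
    log.append((environ['PATH_INFO'], future.result()))


def app(environ):
    future = environ['wsgi.executor'].submit(sum, [1, 2, 3])
    # The partial object holds a reference to environ, so the dict stays
    # alive until the callback has run, even after the response is sent.
    future.add_done_callback(partial(my_callback, environ))
    return b'200 OK', [], [b'ok']


with ThreadPoolExecutor(max_workers=1) as executor:
    app({'wsgi.executor': executor, 'PATH_INFO': '/report'})
# Leaving the with-block waits for the task, so the callback has run by now.
```

Note that the callback runs in whichever thread completes the future (or immediately in the calling thread if the future is already done), so it should not assume it is on the request's thread.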

A little bit later in this message I describe a better solution than
the application registering its own callbacks.

> Do we need to also specify what type of executor is provided (threaded
> vs. separate process)?

I think that’s an application-specific configuration issue, not really
the concern of the PEP.

> Do you have any thoughts about this?

I believe that intelligent servers need some way to ‘pause’ a WSGI
worker rather than relying on the worker executing in a thread and
blocking while waiting for the return value of a future. Using
generator syntax (yield) with the following rules is my initial idea:

* The application may yield None. This is a polite way to have the
async reactor (in the WSGI server/gateway) reschedule the worker for
the next reactor cycle. Useful as a hint that “I’m about to do
something that may take a moment”, allowing other workers to get a
chance to perform work. (Cooperative multi-tasking on single-threaded
async servers.)

* The application must yield one 3-tuple WSGI response, and must not
yield additional data afterwards. This is usually the last thing the
WSGI application would do, with possible cleanup code afterwards
(before falling off the bottom / raising StopIteration / returning
None).

* The application may yield Future instances returned by
environ['wsgi.executor'].submit/map; the worker will then be paused
pending execution of the future; the return value of the future will be
returned from the yield statement. Exceptions raised by the future
will be re-raised from the yield statement and can thus be captured in
a natural way. E.g.:

    try:
        complex_value = yield environ['wsgi.executor'].submit(long_running)
    except:
        pass  # handle exceptions generated from within long_running

Similar rules apply to the response body iterator: it yields
bytestrings, may yield unicode strings where native strings are unicode
strings, and may yield Future instances which will pause the body
iterator as per the application callable.
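
The yield rules above can be sketched from the server side. This is a rough illustration for a blocking (threaded) server only, where a yielded future is resolved by calling .result(); the names run_blocking, long_running and the 'example.n' key are hypothetical, and a single-threaded reactor would register a callback instead of blocking.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def run_blocking(app, environ):
    """Drive a generator application to its response on a threaded server."""
    gen = app(environ)
    response = None
    to_send = None
    to_throw = None
    while True:
        try:
            if to_throw is not None:
                item = gen.throw(to_throw)  # surfaces at the app's yield
            else:
                item = gen.send(to_send)    # first call sends None (start)
        except StopIteration:
            return response
        to_send = to_throw = None
        if item is None:
            continue  # polite reschedule; nothing else to run in this sketch
        if isinstance(item, Future):
            try:
                to_send = item.result()     # blocks this worker thread only
            except Exception as exc:
                to_throw = exc
        elif response is None:
            response = item                 # the single 3-tuple response
        else:
            raise RuntimeError('application yielded a second response')


def long_running(n):
    if n < 0:
        raise ValueError('negative input')
    return n * 2


def app(environ):
    yield None  # hint: about to do something slow
    try:
        value = yield environ['wsgi.executor'].submit(
            long_running, environ['example.n'])
        body = str(value).encode('ascii')
    except ValueError as exc:
        body = str(exc).encode('ascii')
    yield b'200 OK', [(b'Content-Type', b'text/plain')], [body]


with ThreadPoolExecutor(max_workers=1) as pool:
    ok = run_blocking(app, {'wsgi.executor': pool, 'example.n': 21})
    failed = run_blocking(app, {'wsgi.executor': pool, 'example.n': -1})
```

In the second call the ValueError raised inside long_running is re-raised at the yield expression, so the application's own try/except handles it.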

Servers must:

* Allow configuration of the future implementation for options like
threading / processes.

* Allow developers to override the executor completely.

* Provide additional attributes on wsgi.input: async_ prefixed versions
of the read methods, which are factories returning server-specific
Future instances. (Allowing a single-threaded async server to handle
socket IO intelligently with select/epoll/etc.)

To the libraries you use, futures make async pretty much transparent.
E.g. libraries (such as a DB layer) must not create their own Future
objects, but must instead utilize an executor passed to them explicitly
by the application.

My ideas thus far,

— Alice.

P.s. a number of these ideas (wsgi.executor, wsgi.async, some of the
yield syntax described above) have been soundly argued against by a
co-conspirator over IRC. I’ll re-read my IRC logs and reply with those
considerations in mind (and transcribed logs) shortly.

P.p.s. my kernel panicked while I was translating my rewrite into ReST;
I'll re-do the conversion tonight or tomorrow morning and submit it
downstream ASAP.

_______________________________________________
Web-SIG mailing list
Web...@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/python-web-sig-garchive-9074%40googlegroups.com

Alice Bevan–McGregor

Jan 6, 2011, 12:51:44 AM

to web...@python.org

Alex Grönholm and I have been discussing async implementation details
(and other areas of PEP 444) for some time on IRC. Below are the
cleaned-up log transcriptions with additional notes where needed.

Note: The logs are in mixed chronological order — discussion of one
topic is chronological, potentially spread across days, but separate
topics may jump around a bit in time. Because of this I have
eliminated the timestamps as they add nothing to the discussion.
Dialogue in square brackets indicates text added after-the-fact for
clarity. Topics are separated by three hyphens. Backslashes indicate
joined lines.

This should give a fairly comprehensive explanation of the rationale
behind some decisions in the rewrite; a version of these conversations
(in narrative style vs. discussion) will be added to the rewrite Real
Soon Now™ under the Rationale section.

— Alice.

--- General

agronholm: my greatest fear is that a standard is adopted that does not
solve existing problems

GothAlice: [Are] there any guarantees as to which thread / process a
callback [from the future instance] will be executed in?

--- 444 vs. 3333

agronholm: what new features does pep 444 propose to add to pep 3333? \
async, filters, no buffering?

GothAlice: Async, filters, no server-level buffering, native string
usage, the definition of "byte string" as "the format returned by
socket read" (which, on Java, is unicode!), and the allowance for
returned data to be Latin1 Unicode. \ All of this together will allow a
'''def hello(environ): return "200 OK", [], ["Hello world!"]''' example
application to work across Python versions without modification (or use
of b"" prefix)

agronholm: why the special casing for latin1 btw? is that an http thing?

GothAlice: Latin1 = \u0000 → \u00FF — it's one of the only formats that
can be decoded while preserving raw bytes, and if another encoding is
needed, transcode safely. \ Effectively requiring Latin1 for unicode
output ensures single byte conformance on the data. \ If an application
needs to return UTF-8, for example, it can return an encoded UTF-8
bytestream, which will be passed right through.
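
The byte-preserving property of Latin1 described here can be checked directly. A small Python 3 sketch; the sample name is arbitrary.

```python
raw = bytes(range(256))               # every possible byte value
text = raw.decode('latin-1')          # never fails: byte N maps to U+00NN
roundtrip = text.encode('latin-1')    # the original bytes come back

# Transcoding: bytes that were really UTF-8 can still be recovered after
# having been carried around as a Latin1-decoded native string.
wire = 'Gr\u00f6nholm'.encode('utf-8')
as_native = wire.decode('latin-1')
recovered = as_native.encode('latin-1').decode('utf-8')

# Code points above U+00FF cannot be encoded as Latin1 at all.
try:
    '\u20ac'.encode('latin-1')
    euro_fits = True
except UnicodeEncodeError:
    euro_fits = False
```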

--- Filters

agronholm: regarding middleware, you did have a point there --
exception handling would be pretty difficult with ingress/egress filters

GothAlice: Yup. It's pretty much a do or die scenario in filter-land.

agronholm: but if we're not ditching middleware, I wonder about the
overall benefits of filtering \ it surely complicates the scenario so
it'd better be worth it \ I don't so much agree with your reasoning
that [middleware] complicates debugging \ I don't see any obvious
performance improvements either (over middleware)

GothAlice: Simplified debugging of your application w/ reduced stack to
sort through, reduced nested stack overhead (memory allocation
improvement), clearer separation of tasks (egress compression is a good
example). This follows several of the Zen of Python guidelines: \
Simple is better than complex. \ Flat is better than nested. \ There
should be one-- and preferably only one --obvious way to do it. \ If
the implementation is hard to explain, it's a bad idea. \ If the
implementation is easy to explain, it may be a good idea.

agronholm: I would think that whatever memory the stack elements
consume is peanuts compared to the rest of the application \
ingress/egress isn't exactly simpler than middleware

GothAlice: The implementation for ingress/egress filters is two lines
each: a for loop and a call to the elements iterated over. Can't get
much simpler or easier to explain. ;) \ Middleware is pretty complex…
\ The majority of ingress filters won't have to examine wsgi.input, and
supporting async on egress would be relatively easy for the filters
(pass-through non-bytes data in body_iter). \ If you look at a system
that offers input filtering, output filtering, and decorators
(middleware), modifying input should "obviously" be an input filter,
and vice-versa.
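
The "for loop and a call" claim can be sketched like this. The argument lists are assumptions made for illustration (ingress filters receive environ and modify it in place; egress filters receive and return the response parts), and serve, mark_request and shout are hypothetical names.

```python
def serve(app, environ, ingress_filters=(), egress_filters=()):
    # Ingress: each filter may modify environ in place.
    for ingress in ingress_filters:
        ingress(environ)
    status, headers, body = app(environ)
    # Egress: each filter returns a (possibly altered) response.
    for egress in egress_filters:
        status, headers, body = egress(environ, status, headers, body)
    return status, headers, body


def mark_request(environ):
    environ['example.seen'] = True


def shout(environ, status, headers, body):
    # Wrap the body iterator lazily instead of buffering it.
    return status, headers, (chunk.upper() for chunk in body)


def hello(environ):
    return b'200 OK', [], [b'Hello world!']


environ = {}
status, headers, body = serve(hello, environ, [mark_request], [shout])
result = b''.join(body)
```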

agronholm: how does a server invoke the ingress filters \ in my
opinion, both ingress and egress filters should essentially be pipes \
compression filters are a good example of this \ once a block of
request data (body) comes through from the client, it should be sent
through the filter chain

agronholm: consider an application that receives a huge gzip encoded
upload \ the decompression filter decompresses as much as it can using
the incoming data \ the application only gets the next block once the
decompression filter has enough raw data to decompress

GothAlice: Ingress decompression, for example, would accept the environ
argument, detect gzip content-encoding, then decompress the wsgi.input
into its own buffer, and finally replace wsgi.input in the environ with
its decompressed version. \ Alternatively, it could decompress chunks
and have a more intelligent replacement for wsgi.input (to delay
decompression until it is needed).

agronholm: are you saying that the filter should decompress all of the
data at once? how would this work with async?

GothAlice: The first example is the easiest to implement, but you are
correct in that it would buffer all the data up-front. The second I
described (intelligent wsgi.input replacement) would work in an async
application environment. (But would be harder to code and unit-test.)

agronholm: I don't really see how it would work

GothAlice: environ = parse_headers() ; decompression_filter(environ)

agronholm: wouldn't it be simpler to just have ingress filters return
the data chunk, altered or not?

GothAlice: decompression_filter(environ): if
environ.get('HTTP_TRANSFER_ENCODING', None) == 'gzip':
environ['wsgi.input'] = StreamDecompression(environ['wsgi.input'])
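
A rough sketch of the "intelligent replacement" variant, which decompresses only as much as each read() needs. StreamDecompression is the hypothetical name from the line above, implemented here with zlib. One assumption to flag: the sketch checks Content-Encoding (HTTP_CONTENT_ENCODING), as in the earlier prose description of the filter, rather than the HTTP_TRANSFER_ENCODING key used in the one-liner.

```python
import gzip
import io
import zlib


class StreamDecompression:
    """Hypothetical file-like wrapper that gunzips another stream on demand."""

    def __init__(self, raw, chunk_size=4096):
        self.raw = raw
        self.chunk_size = chunk_size
        # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        self.decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
        self.buffer = b''
        self.eof = False

    def read(self, size=-1):
        # Pull compressed chunks only until the request can be satisfied.
        while not self.eof and (size < 0 or len(self.buffer) < size):
            chunk = self.raw.read(self.chunk_size)
            if chunk:
                self.buffer += self.decoder.decompress(chunk)
            else:
                self.buffer += self.decoder.flush()
                self.eof = True
        if size < 0:
            size = len(self.buffer)
        data, self.buffer = self.buffer[:size], self.buffer[size:]
        return data


def decompression_filter(environ):
    if environ.get('HTTP_CONTENT_ENCODING') == 'gzip':
        environ['wsgi.input'] = StreamDecompression(environ['wsgi.input'])


payload = b'hello world ' * 1000
environ = {
    'HTTP_CONTENT_ENCODING': 'gzip',
    'wsgi.input': io.BytesIO(gzip.compress(payload)),
}
decompression_filter(environ)
first = environ['wsgi.input'].read(5)
rest = environ['wsgi.input'].read()
```

This version still blocks inside raw.read(); making it async-aware is exactly the open question in the discussion that follows.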

agronholm: I'm not very comfortable with the idea of wsgi.input in
async apps \ I'm just thinking what would happen when you do
environ['wsgi.input'].read()

GothAlice: One of two things: in a sync environment, it blocks until it
can read, in an async environment [combined with yield] it
pauses/shelves your application until the data is available.

agronholm: I'd rather do away with wsgi.input altogether, but I haven't
yet figured out how the application would read the entire request body
then

agronholm: it should be fairly easy to write a helper function for that though

GothAlice: Returning the internal socket representation would improve
some things, and make things generally worse. :/

agronholm: returning socket from what?

GothAlice: In Tornado's HTTP server, you read and write directly
from/to the IOStream. \ wsgi.input, though, is more abstracted

agronholm: argh, I can't think of a way to make this work beautifully

GothAlice: Yeah. :(

agronholm: the requirements of async apps are a big problem

agronholm: returning magic values from the app sounds like a bad idea

agronholm: the best solution I can come up with is to have
wsgi.async_input or something, which returns an async token for any
given read operation

agronholm: most filters only deal with the headers \ so what if we made
it so that the filter chain is only accessed once, and filters that
need to modify the body as well would return a generator \ and when the
server receives more data, it would feed it to the first generator in
the chain, feed the results from that to the next etc.

agronholm: the generators could also return futures, at which point the
server adjourns processing of the chain until the callback fires \ in
multithreaded mode, the server would simply call .result() which would
block, and in single threaded mode, add a callback to the reactor

GothAlice: Hmm.

agronholm: the ingress filters' return values would affect what is sent
to the application

agronholm: [I'm] trying to solve the inherent difficulties with having
a file-like object in the environ \ my solution would allow them to
work transparently with sync and async apps alike

GothAlice: Hmm. What would the argspec of an ingress filter be, then?
(The returned value, via yield, being wsgi.input chunks.)

agronholm: probably environ, body_generator or something

agronholm: the beauty in wsgi in general is of course that it requires
no importing of predefined functions or anything \ so there should be
some way for the application to read the entire request at once

GothAlice: I think combining wsgi.async with specific attributes on
wsgi.input which can be yielded as async tokens might be a way to go.

GothAlice: agronholm: yielding None from the application being a polite
way to re-schedule the application after a reactor cycle to give other
connections a chance before doing something potentially blocking.

agronholm: I thought None meant "I'm done here" \ otoh, the app has to
return some response

GothAlice: That's yielding an application response tuple followed by
StopIteration. \ (Not necessarily immediately returning StopIteration
after yielding the response; there may be clean-up to do; which is a
nice addition.)

GothAlice: Three options: yield None (reschedule to be nice/cooperative
behaviour), yield status, headers, body (deliver a response), and yield
AsyncToken.

agronholm: so what would the application yield if it wanted to generate
the body in chunks? (potentially a slow process)

GothAlice: A body_iter that generates the body in chunks, as per a
standard (non-generator) application callable. \ That wouldn't change.
\ But often an application would want to async stream the response body
in before starting body generation.

GothAlice: An application MUST be a callable returning (status_bytes,
header_list, body_iter) OR a generator. IF the application is a
generator, it MUST yield EITHER None (delay execution), a
(status_bytes, header_list, body_iter) tuple, or an async token. After
yielding a response the application generator MAY perform additional
actions before raising StopIteration, but MUST NOT yield anything but
None or async tokens from that point onward.

agronholm: one of my concerns is how a request body modifying
middleware will work with async apps unless it's specifically designed
with those in mind \ you suggested that such middleware replace
wsgi.input with their own

GothAlice: It would have to be; or it could simply yield through
non-bytes chunks, returning the result of the yield back up (which may
be ignored).

agronholm: what guarantee is there that the replacement has
.async_read() unless the filter was specifically designed to be async
aware?

GothAlice: Or, if the developer was in a particularly black mood, the
middleware could re-set wsgi.async to be false. ;)

agronholm: I don't quite understand the meaning or point of wsgi.async

GothAlice: wsgi.async is a boolean representing the capability of the
underlying server to accept async tokens.

agronholm: why would that ever be false? \ in a blocking/threaded
server, implementing support for that is trivial

GothAlice: Why does no HTTP server in Python conform to the HTTP/1.1
spec properly? Lazy developers. ;) [And lack of interest
down-stream. Calling server authors idiots was not my intention.]

agronholm: they could just as well forgo setting wsgi.async altogether

GothAlice: environ.get('wsgi.async', False) is the only way to armor
against that, I guess.

agronholm: well I think we're talking about *conforming* servers here \
there's not much that can be done about incomplete implementations

GothAlice: However, if wsgi.async is going to be in the WSGI2 spec,
it'll be required. if the server hasn't gotten around to implementing
async yet, it should be False.

agronholm: I think wsgi.async is useless \ "hasn't gotten around to"?
that's not a lot of work, really \
that flag just paves way for half assed implementations

GothAlice: Still, some method to detect the capability should be
present. Something more than attempting to access wsgi.input's
async_read attribute and catching the AttributeError exception.

agronholm: the capability should be *required* \ given how easy it is
to implement \ I don't see any justification not to require it

GothAlice: We'll have to see how easy it is to add to m.s.http before
I'll admit it's "easy" in the general sense. ;) If it turns out to be
simple (and not-too-badly-performance-impacting) I'll make it required.

agronholm: fair enough

agronholm: robert pointed out the difficulty of executing them in the
right order

GothAlice: Indeed; this problem exists with the current middleware system, too.

agronholm: it'd probably be easier to specify them as a list in the
deployment descriptor

GothAlice: (My note about appending to ingress_filters, prepending to
egress_filters to simulate middleware behaviour is functional, though
non-optimal; the filters, if co-dependent, should be middleware
instead.)

agronholm: webcore's current middleware system is too much magic imho

GothAlice: I agree. \ An init.d-style ordering system would have to be
its own PEP.

agronholm: also, I was thinking if we could [specify] filters that needed both
ingress/egress capabilities (such as session middleware) in a way that
only required specifying it once

GothAlice: … wouldn't that be middleware? ;) \ Thus far I've defined
ingress and egress filters as distinct and separate, with
dual-functionality requirements being fulfilled by middleware.

agronholm: we could probably simplify that

GothAlice: "There should be one, and preferably only one, right way to
do something." ;)

agronholm: yes, and that is the point of my idea :)

GothAlice: Replacing middleware isn't a small task; the power of
potentially redirecting application flow (amongst other gems the
middleware structure brings to the table) would be very difficult to
model cleanly when separated into ingress/egress.

agronholm: btw, I very much agreed with PJE's suggestion of making
filtering its own middleware instead of a static part of the interface

GothAlice: The problem with not mentioning filtering in the PEP is that
middleware authors won't take it into consideration when coding. (That's
why it's optional for servers to implement and includes an example
middleware implementation of the API.)

--- Async

agronholm: +1 for async wsgi using the new concurrent.futures stdlib feature

agronholm: I still don't like the idea of wsgi.executor \ imho that
should be left up to the application or framework \ not the web server
\ and I still disapprove of the wsgi.async flag

GothAlice: The server does, however, need to be able to capture async
read requests across environ['wsgi.input'].async_read*

GothAlice: What would the semantics be for a worker on a
single-threaded async server to wait for a long-running task? Was my
code example (the simplified try/except block) inadequate?

agronholm: if the app needs to do heavy lifting, it delegates the task
to a thread/process pool, which returns a future, which the app yields
back \ when the callback is activated, the reactor will resume
execution of that app \ I think you pretty much got it right in your
revised example code

GothAlice: Just replace environ['wsgi.executor'] with an
application-specific one?

agronholm: essentially, yes \ that would greatly simplify the
implementation of the interface

GothAlice: And it is all done via done_callbacks… hmm. For the
purposes of the callbacks, though, exceptions are ignored. :/

agronholm: what is your concern with this specifically?

GothAlice: That my desired syntax (try/except around a value=yield
future) won't be able to capture and elevate exceptions back to the
WSGI application.

agronholm: oh, that is not a problem since the reactor will call
.result() on it anyway and send any exceptions back to the application

GothAlice: Back to the environment issue for a moment: not providing an
executor in the environment means middleware will not be able to
utilize async features without having their own executor in addition to
the application's. How about I explicitly require that servers allow
overriding of the executor used? \ How often would an application want
to utilize multiple executors at once?

agronholm: the middleware could have a constructor argument for passing
an executor

GothAlice: That would then require passing an executor to multiple
layers of middleware, creating a number of additional references and
local variables, vs. configuring a "default executor" at the server
level.

agronholm: there are pros and cons with the wsgi.executor approach

GothAlice: There would be no requirement for the application to use
wsgi.executor; if an application has a default threaded executor
(wsgi.executor), it can use a multi-process one for specific jobs
[ignoring the one in the env] without too much worry.

agronholm: essentially wsgi.executor would be a convenience then

GothAlice: Exactly. \ (And mostly a helper to middleware so they don't
each need explicit configuration or management of their own executors.)
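
The two positions in this exchange can be combined in one sketch: middleware that accepts an executor as a constructor argument and otherwise falls back to the server's default in environ['wsgi.executor']. AuditMiddleware and its records list are hypothetical, and the key name is the one under discussion rather than a settled one.

```python
from concurrent.futures import ThreadPoolExecutor


class AuditMiddleware:
    """Hypothetical middleware that records each request path off-thread."""

    def __init__(self, app, executor=None):
        self.app = app
        self.executor = executor  # an explicit constructor argument wins
        self.records = []

    def __call__(self, environ):
        executor = self.executor
        if executor is None:
            # Fall back to the server-wide default executor.
            executor = environ['wsgi.executor']
        executor.submit(self.records.append, environ['PATH_INFO'])
        return self.app(environ)


def hello(environ):
    return b'200 OK', [], [b'Hello world!']


server_pool = ThreadPoolExecutor(max_workers=1)
own_pool = ThreadPoolExecutor(max_workers=1)

shared = AuditMiddleware(hello)                      # uses wsgi.executor
private = AuditMiddleware(hello, executor=own_pool)  # ignores it

shared({'wsgi.executor': server_pool, 'PATH_INFO': '/a'})
private({'wsgi.executor': server_pool, 'PATH_INFO': '/b'})

# shutdown() waits for the submitted tasks before returning.
server_pool.shutdown()
own_pool.shutdown()
```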

--- Optional Components

GothAlice: I think full HTTP/1.1 conformance should be a requirement
for WSGI2 servers, too. (chunked requests, not just chunked responses)
\ Because there's really no point in writing a -new- HTTP/1.0 server.
;)

agronholm: indeed

GothAlice: One thing I've been grappling [with while] rewriting PEP 444 is
that pretty much everything marked 'optional' or 'may' in WSGI 1 / PEP
333 no developer actually gets around to implementing. Thus making
HTTP/1.1 support non-optional [in PEP 444].

GothAlice: Something I've noticed with Python HTTP code: none of it is
complete, and all of the servers that report HTTP/1.1 compliance
straight up lie. Zero I found support chunked response bodies, and
zero support chunked requests (which is required by HTTP/1.1). \ (The
servers I looked at universally had embedded comments along the lines
of: "Chunked responses are left up to application developers.")

GothAlice: If it's too demanding [or appears too daunting], a "may
implement" feature becomes a "never will be implemented" feature.

agronholm: I would prefer requiring HTTP/1.1 support from all WSGI2 servers

GothAlice: I mean, if I can do it in 172 Python opcodes, I'm certain it
can't be -that- hard to implement. ;)

Antoine Pitrou

Jan 6, 2011, 6:53:14 AM

to web...@python.org

Alice Bevan–McGregor <alice@...> writes:
>
> agronholm: what new features does pep 444 propose to add to pep 3333? \
> async, filters, no buffering?
>
> GothAlice: Async, filters, no server-level buffering, native string
> usage, the definition of "byte string" as "the format returned by
> socket read" (which, on Java, is unicode!), and the allowance for
> returned data to be Latin1 Unicode.

Regardless of the rest, I think the latter would be a large step backwards.
Clear distinction between bytes and unicode is a *feature* of Python 3.
Unicode-ignorant programmers should use frameworks which do the encoding work
for them.

(by the way, why are you targeting both Python 2 and 3?)

> agronholm: I'm not very comfortable with the idea of wsgi.input in
> async apps \ I'm just thinking what would happen when you do
> environ['wsgi.input'].read()
>
> GothAlice: One of two things: in a sync environment, it blocks until it
> can read, in an async environment [combined with yield] it
> pauses/shelves your application until the data is available.

Er, for the record, in Python 3 non-blocking file objects return None when
read() would block. For example:

>>> import fcntl, os
>>> r, w = os.pipe()
>>> flags = fcntl.fcntl(r, fcntl.F_GETFL, 0)
>>> fcntl.fcntl(r, fcntl.F_SETFL, flags | os.O_NONBLOCK)
0
>>> os.read(r, 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 11] Resource temporarily unavailable
>>> f = open(r, "rb")
>>> f.read(1) is None
True

> agronholm: the requirements of async apps are a big problem
>
> agronholm: returning magic values from the app sounds like a bad idea
>
> agronholm: the best solution I can come up with is to have
> wsgi.async_input or something, which returns an async token for any
> given read operation

The idiomatic abstraction for non-blockingness under POSIX is file descriptors.
So, at the low level (the WSGI level), exchanging fds between server and app
could be enough to allow both to wake up each other (perhaps two fds: one the
server can wait on, one the app can wait on). Similarly to what signalfd() does.
Then higher-level tools can wrap inside Futures or whatever else.

However, this also means Windows compatibility becomes more complicated, unless
the fds are sockets.
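
A rough sketch of the fd-based wake-up, assuming a single pipe where the application side writes one byte and the server side waits with select(). The variable names are illustrative. As noted above, this is POSIX-only as written, since select() on Windows accepts sockets but not pipes.

```python
import os
import select
import threading

# One pipe: the app writes to signal, the server waits on the read end.
server_wait_fd, app_signal_fd = os.pipe()
result = {}


def app_work():
    result['value'] = 6 * 7            # stand-in for slow work
    os.write(app_signal_fd, b'\x01')   # wake the server


worker = threading.Thread(target=app_work)
worker.start()

# A server's event loop would include this fd in its select()/poll() set.
readable, _, _ = select.select([server_wait_fd], [], [], 5.0)
woken = server_wait_fd in readable
if woken:
    os.read(server_wait_fd, 1)         # drain the wake-up byte

worker.join()
os.close(server_wait_fd)
os.close(app_signal_fd)
```

A second pipe in the opposite direction would give the application something to wait on, and a Future could be layered on top by completing it when the fd becomes readable.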

Regards

Antoine.

chris...@gmail.com

Jan 6, 2011, 8:03:15 AM

to web...@python.org

On Wed, 5 Jan 2011, Alice Bevan–McGregor wrote:

> This should give a fairly comprehensive explanation of the rationale behind
> some decisions in the rewrite; a version of these conversations (in narrative
> style vs. discussion) will be added to the rewrite Real Soon Now™ under the
> Rationale section.

Thanks for this. I've been trying to follow along with this
conversation as an interested WSGI app developer and admit that much
of the thrust of things is getting lost in the details and people's
tendency to overquote.

One thing that would be useful is if, when you post, Alice, you could
give the URL of whatever and wherever your current draft is.

That out of the way some comments:

For me WSGI is a programmers' aid used to encourage encapsulation and
separation of concerns in web applications I develop. After that there's
a bit about reusability and portability, but the structure of the
apps/middleware themselves are the most important concerns for me. I
don't use frameworks, or webob or any of that stuff. I just cook up
callables that take environ and start_response. I don't want my
awareness of the basics of HTTP abstracted away, because I want to make
sure that my apps behave well.

Plain WSGI is a good thing, for me, because it means that my
applications are a) very webby (in the stateless HTTP sense) and b)
very testable.

This all works because WSGI is very simple, so my tendency is to be
resistant to ideas which appear to add complexity.

> --- 444 vs. 3333

> GothAlice: Async, filters, no server-level buffering, native string usage,
> the definition of "byte string" as "the format returned by socket read"
> (which, on Java, is unicode!), and the allowance for returned data to be
> Latin1 Unicode. \ All of this together will allow a '''def hello(environ):
> return "200 OK", [], ["Hello world!"]''' example application to work across
> Python versions without modification (or use of b"" prefix)

On async:

I agree with some others who have suggested that maybe async should be
its own thing, rather than integrated into a WSGI2. A server could
choose to be WSGI2 compliant or AWSGI compliant, or both.

Having spent some time messing about with node.js recently, I can say
that the coding style for happy little async apps is great fun, but
actually not what I really want to be doing in my run-of-the-mill as-
RESTful-as-possible web apps.

This might make me a bit of a dinosaur. Or a grape.

That said I can understand why an app author might like to be able to
read or write in an async way, and being able to shelve an app to wait
around for the next cycle would be a good thing. I just don't want
efforts to make that possible to make writing a boring wsgi thing more
annoying.

On filters:

I can't get my head around filters yet. They sound like a different
way to do middleware, with a justification of something along the
lines of "I don't like middleware for filtering". I'd like to be
(directly) pointed at a more robust justification. I suspect you have
already pointed at such a thing, but it is lost in the sands of
time...

Filters seem like something that could be added via a standardized piece
of middleware, rather than being part of the spec. I like minimal specs.

> GothAlice: Latin1 = \u0000 → \u00FF — it's one of the only formats that can
> be decoded while preserving raw bytes, and if another encoding is needed,
> transcode safely. \ Effectively requiring Latin1 for unicode output ensures
> single byte conformance on the data. \ If an application needs to return
> UTF-8, for example, it can return an encoded UTF-8 bytestream, which will be
> passed right through,

There's a rule of thumb about constraints. If you must constrain, do
none, one or all, never some. Ah, here it is:
http://en.wikipedia.org/wiki/Zero_One_Infinity

Does that apply here? It seems you either allow unicode strings or you
don't, not a certain subsection.

My own personal method is: textual apps _always_ return unicode
producing iterators and a piece of (required, thus not official by some
people's biases) middleware turns it into UTF-8 on the way out. I've
naively never understood why you would want to do anything else. My general
rule is unicode inside, UTF-8 at the boundaries.
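
The "unicode inside, UTF-8 at the boundaries" rule could look roughly like this as WSGI 1-style middleware (environ plus start_response). This is a sketch rather than the middleware referred to above; utf8_boundary and text_app are hypothetical names, and headers such as Content-Length are not adjusted.

```python
def utf8_boundary(app):
    """Middleware: encode any text chunks the wrapped app yields as UTF-8."""
    def wrapper(environ, start_response):
        for chunk in app(environ, start_response):
            # Bytes pass through untouched; text is encoded on the way out.
            yield chunk.encode('utf-8') if isinstance(chunk, str) else chunk
    return wrapper


def text_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain; charset=UTF-8')])
    yield 'caf\u00e9 \u2713'


captured = []


def start_response(status, headers):
    # Minimal stand-in for the server's start_response callable.
    captured.append((status, headers))


body = b''.join(utf8_boundary(text_app)({}, start_response))
```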

That's all I got so far. I applaud you for taking on this challenge.
It's work that needs to be done. I hope to be able to comment more and
make a close reading of the various documents, but time is tough
sometimes. I'll do what I can as I can.

Thanks.
--
Chris Dent http://burningchrome.com/
[...]

P.J. Eby

Jan 6, 2011, 9:07:52 AM

to chris...@gmail.com, web...@python.org

At 01:03 PM 1/6/2011 +0000, chris...@gmail.com wrote:
>Does that apply here? It seems you either allow unicode strings or you
>don't, not a certain subsection.

That's why PEP 3333 requires bytes instead - only the application
knows what it's sending, and the server and middleware shouldn't have to guess.

>My general rule is unicode inside, UTF-8 at the boundaries.

Which would be easy to enforce if you can only yield bytes, as is the
case with PEP 3333.

I worry a bit that right now, there may be Python 3.2 servers (other
than the ones built on wsgiref.handlers) that may not be enforcing
this rule yet.

Alice Bevan–McGregor

Jan 6, 2011, 10:27:50 AM

to web...@python.org

On 2011-01-06 03:53:14 -0800, Antoine Pitrou said:
> Alice Bevan–McGregor <alice@...> writes:
>> GothAlice: ... native string usage, the definition of "byte string" as
>> "the format returned by socket read" (which, on Java, is unicode!) ...

Just so no-one feels the need to correct me; agronholm made sure I
didn't drink the kool-aid of one article I was reading and basing some
ideas on. Java socket objects use byte-based buffers, not unicode. My
bad!

> Regardless of the rest, I think the latter would be a large step backwards.
> Clear distinction between bytes and unicode is a *feature* of Python 3.
> Unicode-ignorant programmers should use frameworks which do the encoding work
> for them.

+0.5

I'm beginning to agree; with the advent of b'' syntax in 2.6, the only
compelling reason to include this "feature" (examples that work without
modification across major versions of Python) goes up in smoke. The
examples should use the b'' syntax and have done with it.

> (by the way, why you are targeting both Python 2 and 3?)

For the same reason that Python 3 features are introduced to 2.x;
migration. Users are more likely to adopt something that doesn't
require them to change production environments, and 3.x is far away
from being deployed in production anywhere but on Gentoo, it seems. ;)

Broad development and deployment options are a Good Thing™, and with
b'', there is no reason -not- to target 2.6+. (There is no requirement
that a PEP 444 / WSGI 2 server even try to be a cross-compatible
polyglot; there is room for 2.x-specific and 3.x-specific solutions,
and, in theory, it should be possible to support Python < 2.6, I just
don't feel it's worthwhile to lock your application into Very Old™
interpreters.)

>> agronholm: I'm not very comfortable with the idea of wsgi.input in
>> async apps \ I'm just thinking what would happen when you do
>> environ['wsgi.input'].read()
>>
>> GothAlice: One of two things: in a sync environment, it blocks until it
>> can read, in an async environment [combined with yield] it
>> pauses/shelves your application until the data is available.
>
> Er, for the record, in Python 3 non-blocking file objects return None when
> read() would block.

-1

I'm aware, however that's not practically useful. How would you detect
from within the WSGI 2 application that the file object has become
readable? Implement your own async reactor / select / epoll loop?
That's crazy talk! ;)
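
For reference, the convention under discussion looks like this on a POSIX system. It is only a sketch using a pipe in place of a client socket; the bare select() call stands in for whatever event loop a server would actually run.

```python
import os
import select

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)

# buffering=0 gives a raw FileIO object; its read() returns None (not b'')
# when the descriptor is non-blocking and no data is available yet.
reader = open(read_fd, 'rb', buffering=0)
first = reader.read(1024)

os.write(write_fd, b'request body')

# Something still has to wait for readiness; here a single select() call
# plays the part of the server's event loop.
ready, _, _ = select.select([reader], [], [], 1.0)
second = reader.read(1024) if ready else None

reader.close()
os.close(write_fd)
```

The None return tells the caller that the read would block, but not when to retry, which is the gap the readiness wait fills.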

>> agronholm: the requirements of async apps are a big problem
>>
>> agronholm: returning magic values from the app sounds like a bad idea
>>
>> agronholm: the best solution I can come up with is to have
>> wsgi.async_input or something, which returns an async token for any
>> given read operation
>
> The idiomatic abstraction for non-blockingness under POSIX is file descriptors.
> So, at the low level (the WSGI level), exchanging fds between server and app
> could be enough to allow both to wake up each other (perhaps two fds: one the
> server can wait on, one the app can wait on). Similarly to what
> signalfd() does.
> Then higher-level tools can wrap inside Futures or whatever else.

-0

Hmm; I'll have to mull that over. Initial thoughts: having a magic
yield value that combines a fd and operation (read/write) is too
magical.

> However, this also means Windows compatibility becomes more complicated, unless
> the fds are sockets.

+1 for pure futures which (in theory) eliminate the need for dedicated
async versions of absolutely everything at the possible cost of
slightly higher overhead.

- Alice.

Alice Bevan–McGregor

Jan 6, 2011, 10:44:19 AM

to web...@python.org

Chris,

On 2011-01-06 05:03:15 -0800, Chris Dent said:
> On Wed, 5 Jan 2011, Alice Bevan–McGregor wrote:
>> This should give a fairly comprehensive explanation of the rationale
>> behind some decisions in the rewrite; a version of these
>> conversations (in narrative style vs. discussion) will be added to
>> the rewrite Real Soon Now™ under the Rationale section.
>
> Thanks for this. I've been trying to follow along with this
> conversation as an interested WSGI app developer and admit that much of
> the thrust of things is getting lost in the details and people's
> tendency to overquote.

Yeah; I knew the IRC log dump was only so useful. It's a lot of
material to go through, and much of it was discussed at strange hours
with little sleep. ;)

> One thing that would be useful is if, when you post, Alice, you could
> give the URL of whatever and wherever your current draft is.

Tomorrow (ack, today!) I'll finish converting over the PEP from Textile
to ReStructuredText and get it re-submitted to the Python website.

https://github.com/GothAlice/wsgi2/blob/master/pep444.textile
http://www.python.org/dev/peps/pep-0444/

> I don't use frameworks, or webob or any of that stuff. I just cook up
> callables that take environ and start_response. I don't want my
> awareness of the basics of HTTP abstracted away, because I want to make
> sure that my apps behave well.

Kudos! That approach is heavily frowned upon in the #python IRC
channel, but I fully agree that working solutions can be reasonably
made using that methodology. There are some details that are made
easier by frameworks, though. Testing benefits from MVC: you can test
the dict return value of the controller, the templates, and the model
all separately.

> Plain WSGI is a good thing, for me, because it means that my
> applications are a) very webby (in the stateless HTTP sense) and b)
> very testable.

c) And very portable. You need not depend on some pre-arranged stack
(including web server).

> I agree with some others who have suggested that maybe async should be
> its own thing, rather than integrated into a WSGI2. A server could
> choose to be WSGI2 compliant or AWSGI compliant, or both.

-1

That is already the case with filters, and will be when I ratify the
async idea (after further discussion here). My current thought process
is that async will be optional for server implementors and will be
easily detectable by applications and middleware and have zero impact
on middleware/applications if disabled (by configuration) or missing.

> That said I can understand why an app author might like to be able to
> read or write in an async way, and being able to shelve an app to wait
> around for the next cycle would be a good thing.

Using futures, async covers any callable at all; you can queue up a
dozen DB calls at the top of your application, then (within a body
generator) yield those futures to be paused pending the data. That
would, as an example, allow complex pages to be generated and streamed
to the end-user in an efficient way -- the user would see a page begin
to appear, and the browser downloading static resources, while
intensive tasks complete.
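
A rough sketch of that flow, assuming the draft convention that an application returns a (status, headers, body) tuple and that the server exposes an executor as environ['wsgi.async.executor']. The serve() driver, slow_query() and the exact yield-a-future protocol are illustrative assumptions, not settled API.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def slow_query(n):
    # Stand-in for a blocking database call.
    return n * 2


def app(environ):
    executor = environ['wsgi.async.executor']
    pending = executor.submit(slow_query, 21)   # queued before any output

    def body():
        yield b'<h1>Report</h1>'                # can be sent right away
        yield pending                           # "pause me until this is done"
        yield b'<p>' + str(pending.result()).encode('ascii') + b'</p>'

    return b'200 OK', [(b'Content-Type', b'text/html')], body()


def serve(application):
    """Toy driver: blocks on each future instead of parking the request."""
    with ThreadPoolExecutor(max_workers=2) as executor:
        environ = {'wsgi.async': True, 'wsgi.async.executor': executor}
        status, headers, body = application(environ)
        chunks = []
        for item in body:
            if isinstance(item, Future):
                item.result()   # a real async server would switch requests here
            else:
                chunks.append(item)
        return status, headers, b''.join(chunks)
```

Calling serve(app) returns the assembled response. A real server would write each bytes chunk to the client as it arrives and resume the generator from a completion callback rather than blocking.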

> I just don't want efforts to make that possible to make writing a
> boring wsgi thing more
> annoying.

+9001

See above.

> I can't get my head around filters yet. They sound like a different way
> to do middleware, with a justification of something along the lines of
> "I don't like middleware for filtering". I'd like to be (directly)
> pointed at a more robust justification. I suspect you have already
> pointed at such a thing, but it is lost in the sands of time...

Filters offer several benefits, some of which are mild:

:: Simplified application / middleware debugging via smaller stack.
:: Clearly defined tasks; ingress = altering the environ / input,
egress = altering the output.
:: Egress filters are not executed if an unhandled exception is raised.

The latter point is important; you do not want badly written middleware
to absorb exceptions that should bubble, etc. (I'll need to elaborate
on this and add a few more points when I get some sleep.)

> Filters seem like something that could be added via a standardized
> piece of middleware, rather than being part of the spec. I like minimal
> specs.

Filters are optional, and an example is/will be provided for utilizing
ingress/egress filter stacks as middleware.

The problem with /not/ including the filtering API (which, by itself is
stupidly simple and would barely warrant its own PEP, IMHO) is that a
separate standard would not be seen and taken into consideration when
developers are writing what they will think /must/ be middleware.
Seeing as a middleware version of a filter is trivial to create (just
execute the filter in a thin middleware wrapper), it should be a
consideration up front.
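
As a sketch of how thin that wrapper could be: the filter signatures below (an ingress filter takes environ; an egress filter takes and returns status, headers and body) are assumptions made for this example, as are all the function names and the 'example.request_id' key.

```python
def filtered(app, ingress=(), egress=()):
    """Run ingress/egress filter stacks around an app as plain middleware."""
    def middleware(environ):
        for f in ingress:
            f(environ)                      # may alter environ in place
        # An unhandled exception here propagates; egress filters never run.
        status, headers, body = app(environ)
        for f in egress:
            status, headers, body = f(environ, status, headers, body)
        return status, headers, body
    return middleware


def add_request_id(environ):
    environ['example.request_id'] = 'abc123'    # hypothetical key


def stamp_header(environ, status, headers, body):
    request_id = environ['example.request_id'].encode('ascii')
    return status, headers + [(b'X-Request-Id', request_id)], body


def hello(environ):
    return b'200 OK', [(b'Content-Type', b'text/plain')], [b'hello']


def broken(environ):
    raise RuntimeError('boom')
```

Usage would be filtered(hello, ingress=[add_request_id], egress=[stamp_header]). Wrapping broken instead shows the third benefit listed above: the exception bubbles up and no egress filter gets a chance to swallow it.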

>> Latin1 = \u0000 → \u00FF [snip]
>
> There's a rule of thumb about constraints. If you must constrain, do
> none, one or all, never some. Ah, here it is:
> http://en.wikipedia.org/wiki/Zero_One_Infinity
>
> Does that apply here? It seems you either allow unicode strings or you
> don't, not a certain subsection.

+1

See my post from a few minutes ago which covers this.

> That's all I got so far. I applaud you for taking on this challenge.
> It's work that needs to be done. I hope to be able to comment more and
> make a close reading of the various documents, but time is tough
> sometimes. I'll do what I can as I can.

Thank you, and I look forward to additional input!

- Alice.

chris...@gmail.com

Jan 6, 2011, 12:06:10 PM

to web...@python.org

On Thu, 6 Jan 2011, Alice Bevan–McGregor wrote:

> Yeah; I knew the IRC log dump was only so useful. It's a lot of material to
> go through, and much of it was discussed at strange hours with little sleep.
> ;)

I wasn't actually talking about the log dump. That was useful. What I
was talking about were earlier messages in the thread where people
were making responses, quoting vast swaths of text for no clear
reason.

> https://github.com/GothAlice/wsgi2/blob/master/pep444.textile

Thanks, watching that now.

>> I don't use frameworks, or webob or any of that stuff. I just cook up
>> callables that take environ and start_response. I don't want my awareness
>> of the basics of HTTP abstracted away, because I want to make
>> sure that my apps behave well.
>
> Kudos! That approach is heavily frowned upon in the #python IRC channel, but
> I fully agree that working solutions can be reasonably made using that
> methodology. There are some details that are made easier by frameworks,
> though. Testing benefits from MVC: you can test the dict return value of the
> controller, the templates, and the model all separately.

I should have been more explicit here as I now feel I must defend
myself from frowns. I'm not talking about single methods that do the
entire app. I nest a series of middleware that bottom out at Selector
which then does url based dispatch to applications, which themselves
are defined as handlers (simple wsgi functions) and access
StorageInterfaces and Serializations. The middleware, handlers, stores
and serializers are all independently testable (and usable).

I find the MVC language ineffective when thinking about the HTTP
verbs and the resources the app(s) present(s). In fact I think it
encourages _bad_ thinking.

Anyway, the point of the methodology is that, from the perspective of
the web data, the entry and exit points are unadulterated WSGI.

> That is already the case with filters, and will be when I ratify the async
> idea (after further discussion here). My current thought process is that
> async will be optional for server implementors and will be easily detectable
> by applications and middleware and have zero impact on
> middleware/applications if disabled (by configuration) or missing.

This notion of being detectable seems weird to me. Are we actually
expecting an application to query the server, find out it is not async
capable, and choose a different code path as a result? Seems much more
likely that the installer will choose a server or app that meets their
needs. That is: you don't need to detect, you need to know (presumably
at install/config time).

Or maybe I am imagining the use cases incorrectly here. I think of app
being async as an explicit choice made by the builder to achieve some
goal.

>> I can't get my head around filters yet.[snip]
>
> Filters offer several benefits, some of which are mild:
>
> :: Simplified application / middleware debugging via smaller stack.
> :: Clearly defined tasks; ingress = altering the environ / input, egress =
> altering the output.
> :: Egress filters are not executed if an unhandled exception is raised.

Taken individually none of these seem super critical to me.

Or to put it another way: Yeah, so?

(This is the aforementioned resistance showing through. The above
sounds perfectly nice, reasonable and desirable, but not
_necessary_.)

> Filters are optional, and an example is/will be provided for utilizing
> ingress/egress filter stacks as middleware.

In a conversation with some people about the Atom Publishing Protocol
I tried to convince them that the terms SHOULD and MAY had no place in
a spec. WSGI* is not really the same kind of spec, but optionality
still grates in the same way.

> The problem with /not/ including the filtering API (which, by itself is
> stupidly simple and would barely warrant its own PEP, IMHO) is that a
> separate standard would not be seen and taken into consideration when
> developers are writing what they will think /must/ be middleware.

Yeah, so? :)

> See my post from a few minutes ago which covers this.

Yay!

Randy Syring

Jan 6, 2011, 12:20:48 PM

to web...@python.org

Alice,

Being a web application developer and relying on frameworks like Werkzeug and WebOb, I may not have much of a dog in this fight. However, I have been following web-sig for a couple years and I have seen the difficulties involved in reaching consensus on modifying/updating the WSGI spec. It's clear to me that most people on this list who can contribute in meaningful ways to the creation of WSGI 2 have very little time to do so. Jobs, family, and life in general limit the amount of time that can be spent on something like this. Motivation seems generally low anyway, because what we have currently works. It may have warts, but it works, and that very fact seems to limit the number of people interested in donating time to improving the spec.

So, getting a WSGI 2 spec discussed and approved is going to be hard enough. Every time something controversial is added to the spec, it's going to make it that much harder to move forward. I'm not saying that means all controversial items should be dropped, some things are worth fighting for, just pointing out that IMO we are already working uphill, and adding weights to our rucksacks should only be done when absolutely necessary.

On 01/06/2011 10:44 AM, Alice Bevan–McGregor wrote:
> On 2011-01-06 05:03:15 -0800, Chris Dent said:
>> I agree with some others who have suggested that maybe async should be its own thing, rather than integrated into a WSGI2. A server could choose to be WSGI2 compliant or AWSGI compliant, or both.

Adding async to the spec is a death blow IMO. You gain nothing by putting it in and lose a lot of interest and time spent discussing it. Make it a separate PEP that references the first. That way, those who don't really care about it can still work on WSGI 2 without the distraction of the async parts. If you make the new async PEP dependent on the WSGI 2 spec, then those ideas can be tossed around all day long without distracting from or taking energy away from the core WSGI 2 ideas.

So, I agree with Chris and others who have said async should be a separate PEP.

-1 on having async in PEP 444 / WSGI 2

With respect to filters:

On 12/14/2010 04:25 PM, Ian Bicking wrote:
> <...>
>
> GzipFilter is wonky at best (it interacts oddly with range requests and etags). Prefix handling is useful (e.g., paste.deploy.config.PrefixMiddleware), and usually global and unconfigured. Debugging and logging stuff often needs per-path configuration, which can mean multiple instances applied after dispatch. Encoding and Decoding don't apply to WSGI. Tidy is intrusive and I think questionable on a global level. I don't think the use cases are there. Tightly bound pre-filters and post-filters are particularly problematic. This all seems like a lot of work to avoid a few stack frames in a traceback.

I agree with Ian's analysis of filters. I don't see the benefit and it's just another item to detract from other core issues that could be addressed. -1 on filters.

Alice, I do appreciate the time you are giving this issue. But my feeling so far is that the things you have focused on are not the things that concern most of the people pushing towards a WSGI 2. On January 2nd, Phillip Eby sent three emails to web-sig. IMO, they had great wisdom concerning different parts of WSGI 2, async, and the political aspects of the PEP process. Your approach so far doesn't seem to have benefited from that wisdom, especially regarding the latter two items, and IMO this ship is dead in the water until that changes.

What I mean is, your approach just doesn't seem to take the history of the web-sig and the main contributors' opinions into account enough. I have been on the web-sig long enough to have respect for the opinions of guys like Ian, Graham, Phillip, Armin, etc. They have shown their competence and their care through their contributions via code and via posts to this list. I have personally benefited greatly by being able to use the code they have built and I trust them and their opinions with regards to issues discussed on web-sig. So when I see Ian write the above about filters or I see Phillip write the following about async:

> I suggest reviewing the Web-SIG history of previous async discussions; there's a lot more to having a meaningful API spec than having a plausible approach. It's not that there haven't been past proposals, they just couldn't get as far as making it possible to write a non-trivial async application that would actually be portable among Python-supporting asynchronous web servers.

and I don't see you respond to their concerns/suggestions, it makes it difficult for me to trust the direction you are heading.

Overall, you have shown energy and a willingness to contribute, which I greatly appreciate. But, I have the opinion that you are coming into this process ignorant of or mostly disregarding the many discussions that have taken place before on this list, are pushing an agenda that seems mostly defined by your likes and dislikes, and are mostly disregarding the suggestions/concerns of the list "heavy-weights". That opinion may not be true, in fact, I am not even saying that is what is going on. What I am saying is that this is what it appears like to a no-name follower of the web-sig. We only see what you write here, the burden of proof is on you to communicate your intentions and agenda. That may not be fair, but if life were to suddenly get fair, I doubt it would happen on web-sig... :)

Then again, my opinion and impression could be completely off, and if that is the case, feel free to ignore me. :)

--------------------------------------
Randy Syring
Intelicom
Direct: 502-276-0459
Office: 502-212-9913

For the wages of sin is death, but the
free gift of God is eternal life in 
Christ Jesus our Lord (Rom 6:23)

Eric Larson

Jan 6, 2011, 1:02:33 PM

to chris...@gmail.com, web...@python.org

At Thu, 6 Jan 2011 13:03:15 +0000 (GMT),
chris dent wrote:
> <snip>
> On async:
>
> I agree with some others who have suggested that maybe async should be
> its own thing, rather than integrated into a WSGI2. A server could
> choose to be WSGI2 compliant or AWSGI compliant, or both.
> </snip>

+1

After seeing some of the ideas regarding how to add async into a new
version of WSGI, it isn't clear what specific problem the async feature
addresses in terms of WSGI servers. Is the goal to support long
running connections? Are we trying to support WebSockets and other
long running connection interfaces? If that is the case, async is a
*technique* for handling this paradigm, but it doesn't address the
real problem. There are techniques that have sounded reasonable, like
making the socket available so that a server can give it to the
application to do something useful with it (here is an example doing
something similar with CherryPy - http://trac.defuze.org/browser/oss/ws4cp/ws4cp.py).

Just to summarize, I'm for making async something else, and I see finding
a way to support long running connections in WSGI, without adopting a
particular technique, as a potentially viable goal.

Just my $.02 on the issue.

Eric Larson

Antoine Pitrou

Jan 6, 2011, 1:15:19 PM

to web...@python.org

Alice Bevan–McGregor <alice@...> writes:

> > Er, for the record, in Python 3 non-blocking file objects return None when
> > read() would block.
>
> -1
>
> I'm aware, however that's not practically useful. How would you detect
> from within the WSGI 2 application that the file object has become
> readable? Implement your own async reactor / select / epoll loop?
> That's crazy talk! ;)

I was just pointing out that if you need to choose a convention for signaling
blocking reads on a non-blocking object, it's already there.

By the way, an event loop is the canonical implementation of asynchronous
programming, so I'm not sure what you're complaining about. Or perhaps you're
using "async" in a different meaning? (which one?)

> >> agronholm: the requirements of async apps are a big problem
> >>
> >> agronholm: returning magic values from the app sounds like a bad idea
> >>
> >> agronholm: the best solution I can come up with is to have
> >> wsgi.async_input or something, which returns an async token for any
> >> given read operation
> >
> > The idiomatic abstraction for non-blockingness under POSIX is file
> > descriptors. So, at the low level (the WSGI level), exchanging fds
> > between server and app could be enough to allow both to wake up each
> > other (perhaps two fds: one the server can wait on, one the app can
> > wait on). Similarly to what signalfd() does.
> > Then higher-level tools can wrap inside Futures or whatever else.
>
> -0
>
> Hmm; I'll have to mull that over. Initial thoughts: having a magic
> yield value that combines a fd and operation (read/write) is too
> magical.

I don't understand why you want a "yield" at this level. IMHO, WSGI needn't
involve generators. A higher-level wrapper (framework, middleware, whatever) can
wrap fd-waiting in fancy generator stuff if so desired. Or, in some other
environments, delegate it to a reactor with callbacks and deferreds. Or whatever
else, such as futures.

By the way, the concurrent.futures module is new. Though it will be there in
3.2, it's not guaranteed that its API and semantics will be 100% stable while
people start to really flesh it out.

> +1 for pure futures which (in theory) eliminate the need for dedicated
> async versions of absolutely everything at the possible cost of
> slightly higher overhead.

I don't understand why futures would solve the need for a low-level async
facility. You still need to define a way for the server and the app to wake each
other (and for the server to wake multiple apps). This isn't done "naturally" in
Python (except perhaps with stackless or greenlets). Using fds gives you
well-known flexible possibilities.
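
To make the fd idea concrete, here is a small POSIX-only sketch in which an application thread wakes a waiting server loop through a pipe. The names and the single-pipe setup are illustrative; the two-fd arrangement described earlier would add a second pipe for the opposite direction.

```python
import os
import selectors
import threading

wake_r, wake_w = os.pipe()


def app_worker():
    # The app finishes some work, then signals the server by writing a byte.
    os.write(wake_w, b'\0')


sel = selectors.DefaultSelector()
sel.register(wake_r, selectors.EVENT_READ, data='app-ready')

worker = threading.Thread(target=app_worker)
worker.start()

events = sel.select(timeout=5)      # the server's loop sleeps here
woken = [key.data for key, _ in events]
os.read(wake_r, 1)                  # drain the wake-up byte

worker.join()
sel.close()
os.close(wake_r)
os.close(wake_w)
```

A server multiplexing many applications would register one such descriptor per request on the same selector. On Windows, pipes cannot be used this way, which matches the compatibility concern raised earlier in the thread.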

If you want to put the futures API in WSGI, think of the poor authors of a WSGI
server written in C who will have to write their own executor and future
implementation. I'm sure they have better things to do.

Regards

Antoine.

Alice Bevan–McGregor

Jan 6, 2011, 2:59:11 PM

to web...@python.org

On 2011-01-06 09:06:10 -0800, chris...@gmail.com said:
> I wasn't actually talking about the log dump. That was useful. What I
> was talking about were earlier messages in the thread where people were
> making responses, quoting vast swaths of text for no clear reason.

Ah. :) I do make an effort to trim quoted text to only the relevant parts.

> On Thu, 6 Jan 2011, Alice Bevan–McGregor wrote:
>> https://github.com/GothAlice/wsgi2/blob/master/pep444.textile
>
> Thanks, watching that now.

The textile document will no longer be updated; the pep-444.rst
document is where it'll be at.

> I should have been more explicit here as I now feel I must defend
> myself from frowns. I'm not talking about single methods that do the
> entire app. I nest a series of middleware that bottom out at Selector
> which then does url based dispatch to applications, which themselves
> are defined as handlers (simple wsgi functions) and access
> StorageInterfaces and Serializations. The middleware, handlers, stores
> and serializers are all independently testable (and usable).

*nods* My framework (WebCore) is basically a packaged up version of a
custom middleware stack so I can easily re-use it from project to
project. I assumed (in my head) you were "rolling your own"
framework/stack.

>> That is already the case with filters, and will be when I ratify the
>> async idea (after further discussion here). My current thought process
>> is that async will be optional for server implementors and will be
>> easily detectable by applications and middleware and have zero impact
>> on middleware/applications if disabled (by configuration) or missing.
>
> This notion of being detectable seems weird to me. Are we actually
> expecting an application to query the server, find out it is not async
> capable, and choose a different code path as a result? Seems much more
> likely that the installer will choose a server or app that meets their
> needs. That is: you don't need to detect, you need to know (presumably
> at install/config time).
>
> Or maybe I am imagining the use cases incorrectly here. I think of app
> being async as an explicit choice made by the builder to achieve some
> goal.

More to the point it needs to be detectable by middleware without
explicitly configuring every layer of middleware, potentially with
differing configuration mechanics and semantics. (I.e. arguments like
enable_async, async_enable, iLoveAsync, ...)
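
A sketch of what such detection might look like from inside a piece of middleware, assuming the 'wsgi.async' flag and 'wsgi.async.executor' key floated earlier in this thread. The audit() middleware and its log callable are hypothetical.

```python
def audit(app, log):
    """Middleware that offloads logging when the server advertises async."""
    def middleware(environ):
        result = app(environ)
        path = environ.get('PATH_INFO')
        if environ.get('wsgi.async', False):
            # Async-capable server: hand the slow part to its executor.
            environ['wsgi.async.executor'].submit(log, path)
        else:
            # Flag missing or disabled: plain synchronous behaviour.
            log(path)
        return result
    return middleware
```

The middleware takes no async-related constructor argument; the only switch is the environ flag, so every layer in a stack sees the same answer without per-layer configuration.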

>>> I can't get my head around filters yet.[snip]
>>
>> Filters offer several benefits, some of which are mild:
>>
>> :: Simplified application / middleware debugging via smaller stack.
>> :: Clearly defined tasks; ingress = altering the environ / input,
>> egress = altering the output.
>> :: Egress filters are not executed if an unhandled exception is raised.
>
> Taken individually none of these seem super critical to me.
>
> Or to put it another way: Yeah, so?
>
> (This is the aforementioned resistance showing through. The above
> sounds perfectly nice, reasonable and desirable, but not _necessary_.)

It isn't necessary; it is, however, an often re-implemented feature of
a framework on top of WSGI. CherryPy, Paste, Django, etc. all
implement some form of non-WSGI (or, hell, Paste uses WSGI middleware)
thing they call a 'filter'.

>> Filters are optional, and an example is/will be provided for utilizing
>> ingress/egress filter stacks as middleware.
>
> In a conversation with some people about the Atom Publishing Protocol I
> tried to convince them that the terms SHOULD and MAY had no place in a
> spec. WSGI* is not really the same kind of spec, but optionality
> still grates in the same way.

I fully agree; that's why a lot of the PEP 333 "optionally" or "may"
features have become "must". "Optionally" and "may" simply never get
implemented.

Filters are optional because a number of people have raised valid
arguments that they might not be entirely needed. Thus, they're not
required. But I strongly feel that some defined API should be present
in (or /at least/ referred to by) the PEP, otherwise the future will
hold the same server-specific incompatible implementations.

Alex Grönholm

Jan 6, 2011, 3:48:21 PM

to web...@python.org

06.01.2011 20:02, Eric Larson wrote:
> At Thu, 6 Jan 2011 13:03:15 +0000 (GMT),
> chris dent wrote:
>> <snip>
>> On async:
>>
>> I agree with some others who have suggested that maybe async should be
>> its own thing, rather than integrated into a WSGI2. A server could
>> choose to be WSGI2 compliant or AWSGI compliant, or both.
>> </snip>
> +1
>
> After seeing some of the ideas regarding how to add async into a new
> version of WSGI, it isn't clear what specific problem the async feature
> addresses in terms of WSGI servers. Is the goal to support long
> running connections? Are we trying to support WebSockets and other
> long running connection interfaces? If that is the case, async is a
> *technique* for handling this paradigm, but it doesn't address the
> real problem. There are techniques that have sounded reasonable, like
> making the socket available so that a server can give it to the
> application to do something useful with it (here is an example doing
> something similar with CherryPy - http://trac.defuze.org/browser/oss/ws4cp/ws4cp.py).

The primary idea behind asynchronous servers/applications is the ability
to efficiently serve a huge number of concurrent connections with a
small number of threads. Asynchronous applications tend to be faster
because there is less thread context switching happening in the CPU. Any
application that runs on top of a web server that allocates fewer threads
to the application than the number of connections has to be quick to
respond so as not to starve the thread pool or block the event loop.
This is true regardless of whether nonblocking I/O or some other
technique is used. I'm a bit unclear as to how else you would do this.
Care to elaborate on that? I looked at the CherryPy code, but I couldn't
yet figure that out.

> Just to summarize, I'm for making async something else while finding a
> way to support long running connections in WSGI outside of adopting a
> particular technique a potentially viable goal.
>
> Just my $.02 on the issue.
>
> Eric Larson

Robert Brewer

Jan 6, 2011, 4:08:04 PM

to Alice Bevan–McGregor, web...@python.org

Alice Bevan–McGregor wrote:
> chris...@gmail.com said:
> > I can't get my head around filters yet...
>
> It isn't necessary; it is, however, an often re-implemented feature of
> a framework on top of WSGI. CherryPy, Paste, Django, etc. all
> implement some form of non-WSGI (or, hell, Paste uses WSGI middleware)
> thing they call a 'filter'.

Or, if you had actually read what I wrote weeks ago, you'd say "CherryPy used to have a thing they call a 'filter', but then replaced it with a much better mechanism ("hooks and tools") once the naïve categories of ingress/egress were shown in practice to be inadequate." Not to mention that, even when CherryPy had something called a 'filter', it not only predated WSGI but ran at the innermost WSGI layer, not the outermost. It's apples and oranges at best, or reinventing the square wheel at worst.

We don't need Yet Another Way of hooking in processing components; if anything, we need a standard mechanism to compose existing middleware graphs so that invariant orderings are explicit and guaranteed. For example, "encode, then gzip, then cache". By introducing egress filters as described in PEP 444 (which mentions gzip as a candidate for an egress filter), you're then stuck in a tug-of-war as to whether to build a new caching component as middleware, as an egress filter, or (most likely, in order to compete) both.
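
One way to picture "explicit and guaranteed" ordering is a tiny composition helper. This is purely an illustrative sketch: compose() and the three toy layers are hypothetical, use a (status, headers, body) return convention, and are not an existing CherryPy, Paste, or PEP 444 API.

```python
import gzip


def compose(app, *layers):
    """Wrap app so the response passes through layers in the order listed."""
    for layer in layers:
        app = layer(app)        # first listed ends up innermost
    return app


def encode(app):
    def mw(environ):
        status, headers, body = app(environ)
        return status, headers, [chunk.encode('utf-8') for chunk in body]
    return mw


def gzipped(app):
    def mw(environ):
        status, headers, body = app(environ)
        headers = headers + [(b'Content-Encoding', b'gzip')]
        return status, headers, [gzip.compress(b''.join(body))]
    return mw


def cache(store):
    def layer(app):
        def mw(environ):
            key = environ['PATH_INFO']
            if key not in store:
                store[key] = app(environ)   # caches the already-gzipped result
            return store[key]
        return mw
    return layer


def text_app(environ):
    return b'200 OK', [], ['h\u00e9llo']
```

With compose(text_app, encode, gzipped, cache({})) the declared order is the processing order. Swapping encode and gzipped fails immediately, because gzip would be handed text instead of bytes, which is the kind of invariant the paragraph above wants made explicit.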

Robert Brewer
fuma...@aminus.org

Sylvain Hellegouarch

Jan 6, 2011, 4:11:33 PM

to Alex Grönholm, web...@python.org

2011/1/6 Alex Grönholm <alex.g...@nextday.fi>

06.01.2011 20:02, Eric Larson wrote:

At Thu, 6 Jan 2011 13:03:15 +0000 (GMT),
chris dent wrote:

<snip>
On async:

I agree with some others who have suggested that maybe async should be
its own thing, rather than integrated into a WSGI2. A server could
choose to be WSGI2 compliant or AWSGI compliant, or both.
</snip>

+1

After seeing some of the ideas regarding how to add async into a new
version of WSGI, it isn't clear what specific problem the async feature
addresses in terms of WSGI servers. Is the goal to support long
running connections? Are we trying to support WebSockets and other
long running connection interfaces? If that is the case, async is a
*technique* for handling this paradigm, but it doesn't address the
real problem. There are techniques that have sounded reasonable like
making available the socket such that a server can give it to the
application to do something useful with it (here is an example doing
something similar with CherryPy - http://trac.defuze.org/browser/oss/ws4cp/ws4cp.py).

The primary idea behind asynchronous servers/applications is the ability to efficiently serve a huge number of concurrent connections with a small number of threads. Asynchronous applications tend to be faster because there is less thread context switching happening in the CPU. Any application that runs on top of a web server that allocates fewer threads to the application than the number of connections has to be quick to respond so as not to starve the thread pool or block the event loop. This is true regardless of whether nonblocking I/O or some other technique is used. I'm a bit unclear as to how else you would do this. Care to elaborate on that? I looked at the CherryPy code, but I couldn't yet figure that out.

Since I wrote that piece of code, I guess I ought to chime in. First of all, the code isn't part of CherryPy; it's simply one idea for providing WebSocket support to CherryPy. Considering that WebSocket bootstraps on HTTP but, once that's done, is just a raw socket with bits and pieces on top, I wanted to find a way not to block CherryPy from serving other requests once a WebSocket handshake had been performed. The idea was simply to detach the socket from the worker thread once the handshake had been performed. Then the application had a socket at hand and, in this particular instance, I simply decided to use asyncore to loop through those sockets alongside the CherryPy HTTP server. In effect, you end up with asyncore for WS sockets and CherryPy for any HTTP serving, but from within one single process, using CherryPy's main loop.

By and large this is not a generic solution for implementing async in WSGI but a specific example of how one can have both threads and an async loop playing together. It's merely a proof of concept :)

Hope that clarifies that piece of code.

--

- Sylvain
http://www.defuze.org
http://twitter.com/lawouach

Alice Bevan–McGregor

Jan 6, 2011, 4:59:54 PM

to web...@python.org

On 2011-01-06 13:08:04 -0800, Robert Brewer said:

> Or, if you had actually read what I wrote weeks ago...

I did. Apologies for forgetting the detail of the implementation being
deprecated.

> We don't need Yet Another Way of hooking in processing components; if
> anything, we need a standard mechanism to compose existing middleware
> graphs so that invariant orderings are explicit and guaranteed. For
> example, "encode, then gzip, then cache". By introducing egress filters
> as described in PEP 444 (which mentions gzip as a candidate for an
> egress filter), you're then stuck in a tug-of-war as to whether to
> build a new caching component as middleware, as an egress filter, or
> (most likely, in order to compete) both.

I do, in fact, have a proposal for declaring dependencies, however such
declaration is utterly useless unless differing middleware-based
implementations (e.g. sessions) can agree on a common API for their
feature sets. I feel strongly that this idea does not belong in PEP
444; it's one of the few things I think should be its own PEP.

My mechanism (for which I do have a working implementation against WSGI
1; my web framework uses it) involves middleware layers declaring
several attributes on themselves:

provides - abstract API names
uses - ordering hint, no dependency
needs - die if dependency is not met
before - explicit ordering, including "*"
after - explicit ordering, including "*"
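
A minimal sketch of how such declarations could be resolved into an ordering, not the actual implementation referred to above. Only the attribute names `provides`, `uses`, and `needs` come from the list; the `Layer` class, the `order_layers` function, and the example layer names are hypothetical, and `before`/`after` are left out for brevity. It leans on the standard library's `graphlib` (Python 3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Layer:
    """Hypothetical middleware description; only the attribute names
    (provides / uses / needs) come from the proposal."""

    def __init__(self, name, provides=(), uses=(), needs=()):
        self.name = name
        self.provides = set(provides)
        self.uses = set(uses)
        self.needs = set(needs)


def order_layers(layers):
    """Return layer names so that providers come before their consumers."""
    # Map each abstract API name to the layer that provides it.
    providers = {}
    for layer in layers:
        for api in layer.provides:
            providers[api] = layer.name

    sorter = TopologicalSorter()
    for layer in layers:
        sorter.add(layer.name)
        # "needs": hard dependency, fail loudly if nothing provides it.
        for api in layer.needs:
            if api not in providers:
                raise LookupError(
                    f"{layer.name} needs {api!r}, which nothing provides")
            if providers[api] != layer.name:
                sorter.add(layer.name, providers[api])
        # "uses": ordering hint only, silently ignored when absent.
        for api in layer.uses:
            if api in providers and providers[api] != layer.name:
                sorter.add(layer.name, providers[api])
    return list(sorter.static_order())


layers = [
    Layer("auth", provides={"auth"}, needs={"sessions"}),
    Layer("sessions", provides={"sessions"}, uses={"cache"}),
    Layer("cache", provides={"cache"}),
]
ordered = order_layers(layers)
```

A cycle in the declarations would surface as `graphlib.CycleError` from `static_order()`, which is one way to make an impossible ordering explicit rather than silent.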

For this to really work, however, it'd also need either an
entrypoint-based way of looking up components (making the graph truly
dynamic), or it needs to be combined with explicit packages a la
setuptools.require. In that instance, you've already done the ordering
yourself, so dependency graphing is moot.

- Alice.

Alex Grönholm

Jan 6, 2011, 7:00:25 PM

to web...@python.org

Yes, this is how I figured it too. In the end, what really matters is that code that doesn't get a dedicated thread has to be designed a little differently. The purpose of this discussion is to come up with a standard interface for such applications. I'd also like to explore the possibility of incorporating such a mechanism in PEP 444, provided that it does not complicate the implementation too much. Otherwise, a separate specification may be necessary.

Alice Bevan–McGregor

Jan 6, 2011, 8:47:55 PM

to web...@python.org

On 2011-01-06 09:20:48 -0800, Randy Syring said:
> Being a web application developer and relying on frameworks like
> Werkzeug and WebOb, I may not have much of a dog in this fight.

All input is welcome; I do want to hear from both framework developers
and users of frameworks. I suspect this discussion occurring on the
Web-SIG list would be somewhat of an impediment for users to
contribute, so thank you for posting!

> However, I have been following web-sig for a couple years and I have
> seen the difficulties involved in reaching consensus on
> modifying/updating the WSGI spec.

I've read through the archives and seen the issues as well. I do
believe that, on this one topic, it will be simply impossible to please
everyone. Up here in Canada we have

> It's clear to me that most people on this list who can contribute in
> meaningful ways to the creation of WSGI 2 have very little time to do
> so.

One benefit of mailing lists over other communications channels (IRC,
etc.) is that mailing list traffic sticks around for a while and
doesn't require realtime effort.

> Motivation seems generally low anyway, because what we have currently works.

The burst of traffic after Guido offered to push PEP 3333 ratification
proves that what we have /doesn't/ currently work, at least, for
everyone. Python 3 continues to be a problem.

> It may have warts, but it works, and that very fact seems to limit the
> number of people interested in donating time to improving the spec.

Limiting the scope to Python 2: PEP 333 has a number of issues,
including what I feel are the worst sins for a "standard": ambiguity and
complexity. While people may feel comfortable with the standard they
have learned thus far, I don't think they should be complacent when it
comes to examining possible improvements.

> Every time something controversial is added to the spec, it's going to
> make it that much harder to move forward.

Thus my pushing for the controversial parts to be optional. While,
demonstrably, not everyone will use these parts, having them be present
is important to capture mind-space and get people thinking about the
broader implications of what they code.

> On 2011-01-06 05:03:15 -0800, Chris Dent said:
>> I agree with some others who have suggested that maybe async should be
>> its own thing, rather than integrated into a WSGI2. A server could
>> choose to be WSGI2 compliant or AWSGI compliant, or both.
>
> Adding async to the spec is a death blow IMO. You gain nothing by
> putting it in and lose a lot of interest and time spent discussing it.
> Make it a separate PEP that references the first. That way, those who
> don't really care about it can still work on WSGI 2 without the
> distraction of the async parts. If you make the new async PEP
> dependent on the WSGI 2 spec, then those ideas can be tossed around all
> day long without distracting from or taking energy away from the core
> WSGI 2 ideas.

Tossing the idea around all day long will then, of course, be happening
regardless. Unfortunately for that particular discussion, PEP 3148 /
Futures seems to have won out in the broader scope. Having a ratified
and incorporated language PEP (core in 3.2 w/ compatibility package for
2.5 or 2.6+ support) reduces the scope of async discussion down to:
"how do we integrate futures into WSGI 2" instead of "how do we define
an async API at all".

> I suggest reviewing the Web-SIG history of previous async discussions;
> there's a lot more to having a meaningful API spec than having a
> plausible approach. It's not that there haven't been past proposals,
> they just couldn't get as far as making it possible to write a
> non-trivial async application that would actually be portable among
> Python-supporting asynchronous web servers.

See my previous paragraph about futures.

> [snip]
>
> We only see what you write here, the burden of proof is on you to
> communicate your intentions and agenda.

Unfortunately the reality is that no solution will be agreeable to
everyone, and, for lack of a better phrase, no amount of hand-holding
can make it otherwise.

I am attempting to improve my PR somewhat through the many posts today
and with as much thorough information as possible. ;)

> Then again, my opinion and impression could be completely off, and if
> that is the case, feel free to ignore me. :)

That, despite popular opinion otherwise, is something I rarely do,
though my brain is so full these days that some things might get
accidentally expunged from my LRU heap. ;)

P.J. Eby

Jan 6, 2011, 11:49:57 PM

to al...@gothcandy.com, web...@python.org

At 05:47 PM 1/6/2011 -0800, Alice Bevan–McGregor wrote:
>Tossing the idea around all day long will then, of course, be
>happening regardless. Unfortunately for that particular discussion,
>PEP 3148 / Futures seems to have won out in the broader scope.

Do any established async frameworks or server (e.g. Twisted,
Eventlet, Gevent, Tornado, etc.) make use of futures?

> Having a ratified and incorporated language PEP (core in 3.2 w/
> compatibility package for 2.5 or 2.6+ support) reduces the scope of
> async discussion down to: "how do we integrate futures into WSGI 2"
> instead of "how do we define an async API at all".

It would be helpful if you addressed the issue of scope, i.e., what
features are you proposing to offer to the application developer.

While the idea of using futures presents some intriguing
possibilities, it seems to me at first glance that all it will do is
move the point where the work gets done. That is, instead of simply
running the app in a worker, the app will be farming out work to
futures. But if this is so, then why doesn't the server just farm
the apps themselves out to workers?

I guess what I'm saying is, I haven't heard use cases for this from
the application developer POV -- why should an app developer care
about having their app run asynchronously?

So far, I believe you're the second major proponent (i.e. ones with
concrete proposals and/or implementations to discuss) of an async
protocol... and what you have in common with the other proponent is
that you happen to have written an async server that would benefit
from having apps operating asynchronously. ;-)

I find it hard to imagine an app developer wanting to do something
asynchronously for which they would not want to use one of the
big-dog asynchronous frameworks. (Especially if their app involves
database access, or other communications protocols.)

This doesn't mean I think having a futures API is a bad thing, but
ISTM that a futures extension to WSGI 1 could be defined right now
using an x-wsgi-org extension in that case... and you could then
find out how many people are actually interested in using it.

Mainly, though, what I see is people using the futures thing to
shuffle off compute-intensive tasks... but if they do that, then
they're basically trying to make the server's life easier... but
under the existing spec, any truly async server implementing WSGI is
going to run the *app* in a "future" of some sort already...

Which means that the net result is that putting in async is like
saying to the app developer: "hey, you know this thing that you just
could do in WSGI 1 and the server would take care of it for
you? Well, now you can manage that complexity by yourself! Isn't
that wonderful?" ;-)

I could be wrong of course, but I'd like to see what concrete use
cases people have for async. We dropped the first discussion of
async six years ago because someone (I think it might've been James)
pointed out that, well, it isn't actually that useful. And every
subsequent call for use cases since has been answered with, "well,
the use case is that you want it to be async."

Only, that's a *server* developer's use case, not an app developer's
use case... and only for a minority of server developers, at that.

Alex Grönholm

Jan 7, 2011, 12:24:12 AM

to web...@python.org

07.01.2011 06:49, P.J. Eby wrote:
> At 05:47 PM 1/6/2011 -0800, Alice Bevan–McGregor wrote:
>> Tossing the idea around all day long will then, of course, be
>> happening regardless. Unfortunately for that particular discussion,
>> PEP 3148 / Futures seems to have won out in the broader scope.
>
> Do any established async frameworks or server (e.g. Twisted, Eventlet,
> Gevent, Tornado, etc.) make use of futures?

I understand that Twisted has incorporated futures support to their
deferreds. Others, I believe, don't support them yet. You have to
consider that Python 3.2 (the first Python with futures support in
stdlib) hasn't even been released yet, and it's only been two weeks
since I released the drop-in backport
(http://pypi.python.org/pypi/futures/2.1).

>
>
>> Having a ratified and incorporated language PEP (core in 3.2 w/
>> compatibility package for 2.5 or 2.6+ support) reduces the scope of
>> async discussion down to: "how do we integrate futures into WSGI 2"
>> instead of "how do we define an async API at all".
>
> It would be helpful if you addressed the issue of scope, i.e., what
> features are you proposing to offer to the application developer.
>
> While the idea of using futures presents some intriguing
> possibilities, it seems to me at first glance that all it will do is
> move the point where the work gets done. That is, instead of simply
> running the app in a worker, the app will be farming out work to
> futures. But if this is so, then why doesn't the server just farm the
> apps themselves out to workers?
>
> I guess what I'm saying is, I haven't heard use cases for this from
> the application developer POV -- why should an app developer care
> about having their app run asynchronously?

Applications need to be asynchronous to work on a single threaded
server. There is no other benefit than speed and concurrency, and having
to program a web app to operate asynchronously can be a pain. AFAIK
there is no other way if you want to avoid the context switching
overhead and support a huge number of concurrent connections.

Thread/process pools are only necessary in an asynchronous application
where the app needs to use blocking network APIs or do heavy
computation, and such uses can unfortunately present a bottleneck. It
follows that it's pretty pointless to have an asynchronous application
that uses a thread/process pool on every request.

The goal here is to define a common API for these mutually incompatible
asynchronous servers to implement so that you could one day run an
asynchronous app on Twisted, Tornado, or whatever without modifications.

Guido van Rossum

Jan 7, 2011, 12:30:11 AM

to P.J. Eby, al...@gothcandy.com, web...@python.org

On Thu, Jan 6, 2011 at 8:49 PM, P.J. Eby <p...@telecommunity.com> wrote:
> At 05:47 PM 1/6/2011 -0800, Alice Bevan–McGregor wrote:
>>
>> Tossing the idea around all day long will then, of course, be happening
>> regardless. Unfortunately for that particular discussion, PEP 3148 /
>> Futures seems to have won out in the broader scope.
>
> Do any established async frameworks or server (e.g. Twisted, Eventlet,
> Gevent, Tornado, etc.) make use of futures?

PEP 3148 Futures are meant for a rather different purpose than those
async frameworks. Those frameworks all are trying to minimize the
number of threads using some kind of callback-based non-blocking I/O
system. PEP 3148 OTOH doesn't care about that -- it uses threads or
processes proudly. This is useful for a different type of application,
where there are fewer, larger tasks, and the overhead of threads
doesn't matter.

The Monocle framework, which builds on top of Tornado or Twisted, uses
something not entirely unlike Futures, though they call it Callback.

I don't think the acceptance of PEP 3148 should be taken as forcing
the direction that async frameworks should take.

--
--Guido van Rossum (python.org/~guido)

Jacob Kaplan-Moss

Jan 7, 2011, 12:35:24 AM

to Alice Bevan–McGregor, web...@python.org

On Thu, Jan 6, 2011 at 7:47 PM, Alice Bevan–McGregor
<al...@gothcandy.com> wrote:
> All input is welcome; I do want to hear from both framework developers and
> users of frameworks.

OK, here's my input. I'm not comfortable speaking on behalf of the
entire Django core team, but I am consciously wearing my Django BDFL
hat, and I do know that many (most?) of Django's core team feels as I
do.

And I'm feeling incredibly disheartened.

Python 3 was released in December 2008. I assumed we'd have an updated
WSGI spec within maybe 6 months. It's been two years, and we still
don't have a WSGI spec. This fundamentally means that it's not worth
my time to port Django to Python 3 -- the bits where Django meets WSGI
are critical, and I simply can't get excited about targeting a moving
spec with potential incompatible implementations. The lack of a
WSGI-for-Py3 is a fundamental enthusiasm killer. Django is, at the end
of the day, a framework designed to be deployed. Until I can do so,
there's no use even starting the porting process.

A few months ago, PJE posted PEP 3333. It looked good... and then
nothing happened. I tried to prod things forward, and some more
discussion ensued... and now it looks like it's stalling again. Each
time, discussion of PEP 444 seems to derail discussion of PEP 3333.

I have no skin in this game. Frankly, I have a huge amount of trouble
following the discussions, and I can't speak to the technical merits
of one over the other. But even if PEP 444 is a million times better
than PEP 3333, 444 is clearly a *lot* farther off. But PEP 444 seems to
be where all the energy keeps ending up.

At this rate, I really wonder if it'll be another two years before we
have a working WSGI for Python 3. I hope I'm being pessimistic. Prove
me wrong. Please.

Can we please, please, PLEASE, pause discussion of PEP 444 until PEP
3333 is finalized?

Jacob

Alice Bevan–McGregor

Jan 7, 2011, 2:23:52 AM

to web...@python.org

On 2011-01-06 21:35:24 -0800, Jacob Kaplan-Moss said:
> And I'm feeling incredibly disheartened.

As the author of my own small WSGI framework (with world-wide, though
still limited use) I have the luxury of being able to embrace
experimental technologies. The lack of WSGI capability in Python 3
thoroughly depressed me for the same reasons you describe.

Then I got fed up, tracked down something to tackle, and picked up PEP
444 knowing full well that PEP 3333 existed and was nearer to
completion. PEP 3333 -should- be ratified ASAP in order for developers
to begin to move forward. PEP 444, despite the seeming high blood
pressure on the Web-SIG list, is a long, long way off, and I recognize
that. Despite my boundless enthusiasm for debate, I certainly hope
everyone else realizes this, too. ;)

I wrote an experimental proof-of-concept HTTP/1.1 server against PEP
444 (and continue to update it as my rewrite progresses) over the
course of a week. It just so happened to be stupidly performant under
ideal conditions (see the webpy mailing list for a more real-world
comparison against a CPython extension-based server), extremely simple
code to maintain and experiment on, and will continue to be my/the
"reference implementation" for PEP 444.

Other than mod_wsgi, are there any PEP 3333-compliant (or
near-compliant) components in the wild? Enough to bring a framework to
life in Python 3? What I see is the chicken-and-egg problem endemic
with Python 3: developers wait on upstream to port before they do, and
upstream developers are either waiting themselves or don't see the
demand to port.

Any standard needs early adopters / implementors in order to truly test
the specification; without such, much of the discussion is pure
thought-experiment and practical problems may arise after the standard
is ratified, which is never good. ;)

With the Marrow suite I'm attempting to brute-force the Python 3
problem domain within the context of testing PEP 444 and providing
(after ratification) a solid meta-framework foundation a la Paste.
Yes, that means I'm re-inventing enough wheels for a 6-axle rig, but it
also means (in theory) I should have a solid understanding of the
strengths and weaknesses of the PEP. (The WebOb equivalent is only
partially complete as of this writing.)

> A few months ago, PJE posted PEP 3333. It looked good... and then
> nothing happened. I tried to prod things forward, and some more
> discussion ensued... and now it looks like it's stalling again. Each
> time, discussion of PEP 444 seems to derail discussion of PEP 3333.

I see the opposite in regards to recent traffic on the Web-SIG; PEP 444
discussion has encouraged PEP 3333 discussion. See the "Declaring PEP
3333 accepted" thread (encouraged by Guido himself).

> At this rate, I really wonder if it'll be another two years before we
> have a working WSGI for Python 3. I hope I'm being pessimistic. Prove
> me wrong. Please.

I'll do what I can. :)

> Can we please, please, PLEASE, pause discussion of PEP 444 until PEP
> 3333 is finalized?

This is something I've seen fairly often around PEP 444 threads;
instead of reviving (or starting a new) PEP 3333 thread, a complaint is
levied against PEP 444 discussion itself. That doesn't help. ;)

Truthfully, this month already surpasses the amount of activity
(post-wise) of the last three months combined, and includes quite a
number of posts about PEP 3333. (PEP 3333 has had no significant
discussion - again, by post count - since October.)

- Alice.

Graham Dumpleton

Jan 7, 2011, 2:40:53 AM

to Alice Bevan–McGregor, web...@python.org

On 7 January 2011 18:23, Alice Bevan–McGregor <al...@gothcandy.com> wrote:
> On 2011-01-06 21:35:24 -0800, Jacob Kaplan-Moss said:
> Other than mod_wsgi, are there any PEP 3333-compliant (or near-compliant)
> components in the wild? Enough to bring a framework to life in Python 3?
> What I see is the chicken-and-egg problem endemic with Python 3: developers
> wait on upstream to port before they do, and upstream developers are either
> waiting themselves or don't see the demand to port.

There is also uWSGI and CherryPy WSGI server. I recollect that Benoit
may have started looking it over for gunicorn.

Graham

Alice Bevan–McGregor

Jan 7, 2011, 2:44:26 AM

to web...@python.org

On 2011-01-06 23:40:53 -0800, Graham Dumpleton said:

> There is also uWSGI and CherryPy WSGI server. I recollect that Benoit
> may have started looking it over for gunicorn.

Ah, right, I recall seeing CherryPy mentioned in archived discussions.
So there's hope, then, for relatively quick adoption once ratified. :)

- Alice.

Alice Bevan–McGregor

Jan 7, 2011, 3:39:28 AM

to web...@python.org

On 2011-01-06 20:49:57 -0800, P.J. Eby said:
> It would be helpful if you addressed the issue of scope, i.e., what
> features are you proposing to offer to the application developer.

Conformity, predictability, and portability. That's a lot of y's.
(Pardon the pun!)

Alex Grönholm's post describes the goal quite clearly.

> So far, I believe you're the second major proponent (i.e. ones with
> concrete proposals and/or implementations to discuss) of an async
> protocol... and what you have in common with the other proponent is
> that you happen to have written an async server that would benefit from
> having apps operating asynchronously. ;-)

Well, the Marrow HTTPd does operate in multi-process mode, and, one
day, multi-threaded or a combination. Integration of a futures
executor to the WSGI environment would alleviate the major need for a
multi-threaded implementation in the server core; intensive tasks can
be deferred to a thread pool vs. everything being deferred to a thread
pool. (E.g. template generation, PDF/other text extraction for
indexing of file uploads, image scaling, etc. all of which are real use
cases I have which would benefit from futures.)

> I find it hard to imagine an app developer wanting to do something
> asynchronously for which they would not want to use one of the big-dog
> asynchronous frameworks. (Especially if their app involves database
> access, or other communications protocols.)

Admittedly, a truly async server needs some way to allow file
descriptors to be registered with the reactor core, with the WSGI
application being resumed upon some event (e.g. socket is readable or
writeable for DB access, or even pipe operations for use cases I can't
think of at the moment).

Futures integration is a Good Idea, IMHO, and being optional and easily
added to the environ by middleware for servers that don't implement it
natively is even better.

As for how to provide a generic interface to an async core, I have two
ideas, but one is magical and the other is more so; I'll describe these
in a separate post.

> This doesn't mean I think having a futures API is a bad thing, but ISTM
> that a futures extension to WSGI 1 could be defined right now using an
> x-wsgi-org extension in that case... and you could then find out how
> many people are actually interested in using it.

I'll add writing up a WSGI middleware layer that configures and adds a
future.executor to the environ to my already overweight to-do list. It
actually is something I have a use for right now on at least one
commercial project. :)
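
A minimal sketch of what such a middleware layer might look like against WSGI 1, assuming the standard `concurrent.futures` thread pool. The environ key `x-wsgi-org.executor` is a made-up name for illustration (no such key has been agreed on), and `ExecutorMiddleware`, `demo_app`, and `log` are likewise hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical key name; nothing in any spec defines it.
EXECUTOR_KEY = "x-wsgi-org.executor"


class ExecutorMiddleware:
    """Adds a shared executor to the environ unless the server
    (or an outer layer) has already provided one."""

    def __init__(self, app, max_workers=4):
        self.app = app
        self.executor = ThreadPoolExecutor(max_workers=max_workers)

    def __call__(self, environ, start_response):
        environ.setdefault(EXECUTOR_KEY, self.executor)
        return self.app(environ, start_response)


log = []  # stands in for the side effect of the background work


def demo_app(environ, start_response):
    executor = environ[EXECUTOR_KEY]
    # Offload work that need not finish before the response is sent.
    executor.submit(log.append, "thumbnail scaled")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"upload accepted\n"]


application = ExecutorMiddleware(demo_app)
```

Using `setdefault` means a server that supplies its own executor natively takes precedence, which matches the "optional and easily added by middleware" idea.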

> Mainly, though, what I see is people using the futures thing to shuffle
> off compute-intensive tasks...

That's what it's for. ;)

> ...but if they do that, then they're basically trying to make the
> server's life easier... but under the existing spec, any truly async
> server implementing WSGI is going to run the *app* in a "future" of
> some sort already...

Running the application in a future is actually not a half-bad way for
me to add threading to marrow.server... thanks!

> Which means that the net result is that putting in async is like saying
> to the app developer: "hey, you know this thing that you just could do
> in WSGI 1 and the server would take care of it for you? Well, now you
> can manage that complexity by yourself! Isn't that wonderful?" ;-)

That's a bit extreme; PEP 444 servers may still implement threading,
multi-processing, etc. at the reactor level (a la CherryPy or Paste).
Giving WSGI applications access to a futures executor (possibly the one
powering the main processing threads) simply gives applications the
ability to utilize it, not the requirement to do so.

> I could be wrong of course, but I'd like to see what concrete use cases
> people have for async.

Earlier in this post I illustrated a few that directly apply to a
commercial application I am currently writing. I'll elaborate:

:: Image scaling would benefit from multi-processing (spreading the
load across cores). Also, only one scale is immediately required before
returning the post-upload page: the thumbnail. The other scales can be
executed without halting the WSGI application's return.

:: Asset content extraction and indexing would benefit from threading,
and would also not require pausing the WSGI application.

:: Since most templating engines aren't streaming (see my unanswered
thread in the general mailing list re: this), pausing the application
pending a particularly difficult render is a boon to single-threaded
async servers, though true streaming templating (with flush semantics)
would be the holy grail. ;)

:: Long-duration calls to non-async-aware libraries such as DB access.
The WSGI application could queue up a number of long DB queries, pass
the futures instances to the template, and the template could then
.result() (block) across them or yield them to be suspended and resumed
when the result is available.

:: True async is useful for WebSockets, which seem a far superior
solution to JSON/AJAX polling in addition to allowing real web-based
socket access, of course.
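
To make the DB-query use case above concrete, here is a rough sketch using only the standard `concurrent.futures` API. The `slow_query`, `render`, and `handle_request` names are invented for illustration, and `time.sleep` stands in for a blocking database call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_query(name, delay=0.05):
    time.sleep(delay)  # stand-in for a blocking DB call
    return f"rows for {name}"


def render(futures):
    # The "template" blocks on each future only at the point where it
    # actually needs the value.
    return "\n".join(
        f"{key}: {future.result()}" for key, future in futures.items())


def handle_request(executor):
    # Queue all queries up front; they run concurrently in the pool.
    futures = {
        name: executor.submit(slow_query, name)
        for name in ("users", "orders", "invoices")
    }
    return render(futures)
```

With a pool of three workers the three queries overlap, so the request waits roughly as long as the slowest query rather than the sum of all three.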

> We dropped the first discussion of async six years ago because someone
> (I think it might've been James) pointed out that, well, it isn't
> actually that useful. And every subsequent call for use cases since
> has been answered with, "well, the use case is that you want it to be
> async."

See the above. ;)

- Alice.

Alice Bevan–McGregor

Jan 7, 2011, 4:02:49 AM

to web...@python.org

On 2011-01-06 10:15:19 -0800, Antoine Pitrou said:

> Alice Bevan–McGregor <alice@...> writes:
>>> Er, for the record, in Python 3 non-blocking file objects return None when
>>> read() would block.
>>
>> -1
>>
>> I'm aware, however that's not practically useful. How would you detect
>> from within the WSGI 2 application that the file object has become
>> readable? Implement your own async reactor / select / epoll loop?
>> That's crazy talk! ;)
>
> I was just pointing out that if you need to choose a convention for
> signaling blocking reads on a non-blocking object, it's already there.

I don't. I need a way to suspend execution of a WSGI application
pending some operation, often waiting for socket or file read or write
availability. (Just as often something entirely unrelated to file
descriptors, see my previous post from a few moments ago.)

> By the way, an event loop is the canonical implementation of
> asynchronous programming, so I'm not sure what you're complaining
> about. Or perhaps you're using "async" in a different meaning? (which
> one?)

If you use non-blocking sockets, and the WSGI server provides a way to
directly access the client socket (ack!), utilizing the None response
on reads would require you to utilize a tight loop within your
application to wait for actual data. That's really, really bad, and in
a single-threaded server, deadly.
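
For illustration, a small sketch of the alternative to a tight polling loop: letting a selector (the kind of thing a reactor is built on) report when the socket is readable. It uses only the standard `socket` and `selectors` modules; the `demo` function is hypothetical. Note that a raw non-blocking socket raises `BlockingIOError` rather than returning None, but the problem for the application is the same.

```python
import selectors
import socket


def demo():
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    selector = selectors.DefaultSelector()
    try:
        # With no data pending, a non-blocking recv() cannot succeed;
        # polling it in a loop would burn CPU and stall everything else.
        try:
            reader.recv(1024)
            would_block = False
        except BlockingIOError:
            would_block = True

        # Instead, ask the selector to report readiness.
        selector.register(reader, selectors.EVENT_READ)
        idle = selector.select(timeout=0)      # nothing to read yet
        writer.sendall(b"ping")
        ready = selector.select(timeout=1.0)   # waits until readable
        data = reader.recv(1024)
        return would_block, idle, len(ready), data
    finally:
        selector.close()
        reader.close()
        writer.close()
```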

> I don't understand why you want a "yield" at this level. IMHO, WSGI
> needn't involve generators. A higher-level wrapper (framework,
> middleware, whatever) can wrap fd-waiting in fancy generator stuff if
> so desired. Or, in some other environments, delegate it to a reactor
> with callbacks and deferreds. Or whatever else, such as futures.

WSGI already involves generators: the response body. In fact, the
templating engine I wrote (and extended to support flush semantics)
utilizes a generator to return the response body. Works like a hot
damn, too.
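
As a reminder of what a generator response body looks like, a minimal WSGI 1 style (PEP 333 / PEP 3333) application; the chunk contents are arbitrary and the `app` name is just for illustration.

```python
def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])

    def body():
        # Each yield hands one chunk to the server; the function is
        # suspended until the server asks for the next chunk.
        for number in range(3):
            yield f"chunk {number}\n".encode("utf-8")

    return body()
```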

Yield is the Python language's native way to suspend execution of a
callable in a re-entrant way. A trivial example of this is an async
"ping-pong" reactor. I wrote one ("you aren't a real Python programmer
unless...") as an experiment and utilize it for server monitoring with
tasks being generally scheduled against time, vs. edge-triggered or
level-triggered fd operation availability.
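
A rough sketch of such a time-scheduled generator reactor, written from scratch for illustration rather than taken from the one mentioned above. The convention that each task yields the number of seconds to pause is an assumption of this sketch, as are the `run`, `player`, and `events` names.

```python
import heapq
import time

events = []  # records which task ran, in order


def player(name, pause, turns):
    """A task: do a little work, then yield how long to be suspended."""
    for _ in range(turns):
        events.append(name)
        yield pause


def run(tasks):
    """Resume each generator when its requested pause has elapsed."""
    queue = []
    counter = 0  # unique tie-breaker so generators are never compared
    for task in tasks:
        heapq.heappush(queue, (time.monotonic(), counter, task))
        counter += 1
    while queue:
        wake_at, _, task = heapq.heappop(queue)
        delay = wake_at - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        try:
            pause = next(task)  # run the task up to its next yield
        except StopIteration:
            continue            # task finished; drop it
        heapq.heappush(queue, (time.monotonic() + pause, counter, task))
        counter += 1
```

Calling `run([player("ping", 0.02, 3), player("pong", 0.03, 2)])` interleaves the two tasks in a single thread; neither one blocks the other while it is waiting.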

Everyone has their own idea of what a "deferred" is, and there is only
one definition of a "future", which (in a broad sense) is the same as
the general idea of a "deferred". Deferreds just happen to be
implementation-specific and often require rewriting large portions of
external libraries to make them compatible with that specific deferred
implementation. That's not a good thing.

Hell; an extension to the futures spec to handle file descriptor events
might not be a half-bad idea. :/

> By the way, the concurrent.futures module is new. Though it will be
> there in 3.2, it's not guaranteed that its API and semantics will be
> 100% stable while people start to really flesh it out.

Ratification of PEP 444 is a long way off itself. Also, Alex Grönholm
maintains a pypi backport of the futures module compatible with 2.x+
(not sure of the specific minimum version) and < 3.2. I'm fairly
certain deprecation warnings wouldn't kill the usefulness of that
implementation. Worrying about instability, at this point, may be
premature.

>> +1 for pure futures which (in theory) eliminate the need for dedicated
>> async versions of absolutely everything at the possible cost of
>> slightly higher overhead.
>
> I don't understand why futures would solve the need for a low-level
> async facility.

You misinterpreted; I didn't mean to imply that futures would replace
an async core reactor, just that long-running external library calls
could be trivially deferred using futures.

> You still need to define a way for the server and the app to wake each
> other (and for the server to wake multiple apps).

Futures is a pretty convenient way to have a server wake an app; using
a future completion callback wrapped (using partial) with the paused
application generator would do it. (The reactor Marrow uses, a
modified Tornado IOLoop, would require calling
reactor.add_callback(partial(worker, app_gen)) followed by
reactor._wake() in the future callback.)
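
A sketch of that wake-up path using only the standard library. A queue.Queue stands in for the reactor's callback list (so there is no add_callback or _wake here), and serve, resume, app and slow_query are hypothetical names, not Marrow's API:

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from functools import partial

ready = queue.Queue()  # stand-in for the reactor's pending-callback list

def resume(app_gen, future):
    """Future completion callback: hand the paused generator back to the loop."""
    ready.put((app_gen, future))

def serve(app_gen):
    """Drive one generator-style app and collect the body chunks it yields."""
    chunks = []
    value = next(app_gen)
    while True:
        if hasattr(value, "add_done_callback"):  # the app yielded a future
            value.add_done_callback(partial(resume, app_gen))
            gen, fut = ready.get()  # a real reactor would serve other clients here
            try:
                value = gen.send(fut.result())
            except StopIteration:
                break
        else:
            chunks.append(value)
            try:
                value = next(app_gen)
            except StopIteration:
                break
    return chunks

def slow_query():
    return 42

def app(executor):
    result = yield executor.submit(slow_query)  # suspend until the future is done
    yield b"answer: " + str(result).encode("ascii")

with ThreadPoolExecutor(max_workers=1) as pool:
    body = serve(app(pool))
```

Blocking on ready.get() keeps the sketch short; the point is only that the callback, not the application, decides when the generator is re-entered.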

"Waking up the server" would be accomplished by yielding a futures
instance (or fd magical value, etc).

> This isn't done "naturally" in Python (except perhaps with stackless or
> greenlets). Using fds give you well-known flexible possibilities.

Yield is the natural way for one side of that, re-entering the
generator on future completion covers the other side. Stackless and
greenlets are alternate ideas, but yield is built-in (and soon, so will
futures).

> If you want to put the futures API in WSGI, think of the poor authors
> of a WSGI server written in C who will have to write their own executor
> and future implementation. I'm sure they have better things to do.

If they embed a Python interpreter via C, they can utilize native
implementations of future executors, though these will obviously be
slightly less performant than a native C implementation. (That is,
unless the stdlib version in 3.2 will have C backing.)

- Alice.

P.J. Eby

Jan 7, 2011, 11:10:43 AM

to al...@gothcandy.com, web...@python.org

At 12:39 AM 1/7/2011 -0800, Alice Bevan–McGregor wrote:
>Earlier in this post I illustrated a few that directly apply to a
>commercial application I am currently writing. I'll elaborate:
>
>:: Image scaling would benefit from multi-processing (spreading the
>load across cores). Also, only one scale is immediately required
>before returning the post-upload page: the thumbnail. The other
>scales can be executed without halting the WSGI application's return.
>
>:: Asset content extraction and indexing would benefit from
>threading, and would also not require pausing the WSGI application.
>
>:: Since most templating engines aren't streaming (see my unanswered
>thread in the general mailing list re: this), pausing the
>application pending a particularly difficult render is a boon to
>single-threaded async servers, though true streaming templating
>(with flush semantics) would be the holy grail. ;)

In all these cases, ISTM the benefit is the same if you future the
WSGI apps themselves (which is essentially what most current async
WSGI servers do, AFAIK).

>:: Long-duration calls to non-async-aware libraries such as DB access.
>The WSGI application could queue up a number of long DB queries,
>pass the futures instances to the template, and the template could
>then .result() (block) across them or yield them to be suspended and
>resumed when the result is available.
>
>:: True async is useful for WebSockets, which seem a far superior
>solution to JSON/AJAX polling in addition to allowing real web-based
>socket access, of course.

The point as it relates to WSGI, though, is that there are plenty of
mature async APIs that offer these benefits, and some of them (e.g.
Eventlet and Gevent) do so while allowing blocking-style code to be
written. That is, you just make what looks like a blocking call, but
the underlying framework silently suspends your code, without tying
up the thread.

Or, if you can't use a greenlet-based framework, you can use a
yield-based framework. Or, if for some reason you really wanted to
write continuation-passing style code, you could just use the raw Twisted API.

But in all of these cases you would be better off than if you used a
half-implementation of the same thing using futures under WSGI,
because all of those frameworks already have mature and sophisticated
APIs for doing async communications and DB access. If you try to do
it with WSGI under the guise of "portability", all this means is that
you are stuck rolling your own replacements for those existing APIs.

Even if you've already written a bunch of code using raw sockets and
want to make it asynchronous, Eventlet and Gevent actually let you
load a compatibility module that makes it all work, by replacing the
socket API with an exact duplicate that secretly suspends your code
whenever a socket operation would block.

IOW, if you are writing a truly async application, you'd almost have
to be crazy to want to try to do it *portably*, vs. picking a
full-featured async API and server suite to code against. And if
you're migrating an existing, previously-synchronous WSGI app to
being asynchronous, the obvious thing to do would just be to grab a
copy of Eventlet or Gevent and import the appropriate compatibility
modules, not rewrite the whole thing to use futures.

Antoine Pitrou

Jan 7, 2011, 12:04:07 PM

to web...@python.org

Alice Bevan–McGregor <alice@...> writes:
>
> > I don't understand why you want a "yield" at this level. IMHO, WSGI
> > needn't involve generators. A higher-level wrapper (framework,
> > middleware, whatever) can wrap fd-waiting in fancy generator stuff if
> > so desired. Or, in some other environments, delegate it to a reactor
> > with callbacks and deferreds. Or whatever else, such as futures.
>
> WSGI already involves generators: the response body.

Wrong. The response body is an arbitrary iterable, which means it can be a
sequence, a generator, or something else. WSGI doesn't mandate any specific
feature of generators, such as coroutine-like semantics, and the server doesn't
have to know about them.
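
To make the distinction concrete, here is a small sketch (list_app, generator_app and collect are illustrative names) of two PEP 3333-style applications, one returning a list and one written as a generator. The stand-in server below only iterates, so it cannot tell them apart:

```python
def list_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello ", b"world"]

def generator_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"hello "
    yield b"world"

def collect(app):
    """Consume an app the way a server would: plain iteration only."""
    seen = []

    def start_response(status, headers, exc_info=None):
        seen.append(status)

    body = b"".join(app({}, start_response))
    return seen[0], body

from_list = collect(list_app)
from_generator = collect(generator_app)
```

No send(), throw() or other generator-specific method is used anywhere in collect.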

> Everyone has their own idea of what a "deferred" is, and there is only
> one definition of a "future", which (in a broad sense) is the same as
> the general idea of a "deferred".

A Twisted deferred is as well defined as a Python stdlib future; actually,
deferreds have been in use by the Python community for much, much longer than
futures. But that's beside the point, since I'm proposing that your spec
doesn't rely on a high-level abstraction at all.

> Ratification of PEP 444 is a long way off itself.

Right, that's why I was suggesting you drop your concern for Python 2
compatibility.

Antoine.

Timothy Farrell

Jan 7, 2011, 12:37:35 PM

to web...@python.org

When I originally requested a futures executor option (the email that started this thread), this is more like what I had in mind. I'm not against async...rather indifferent. But I wanted the ability for the server to run something after the response had been fully served to the client and thus not blocking the response. The example I gave was sending an email, but there are plenty of other use cases. Futures seemed like the right way to do this. I'm also not sure futures is the right way to build an async specification and for that matter, there will be a lot to work out with regard to PEP 444.
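
One way this could be sketched with what PEP 3333 already offers: a body wrapper whose close() method, which a compliant server calls once the request is finished, submits the deferred task to an executor. AfterResponse is a hypothetical helper, and how the server would expose the executor is left open here:

```python
from concurrent.futures import ThreadPoolExecutor

class AfterResponse:
    """Body wrapper: submit `task` to `executor` once the server calls close()."""

    def __init__(self, body, executor, task):
        self._body = body
        self._executor = executor
        self._task = task
        self.future = None

    def __iter__(self):
        return iter(self._body)

    def close(self):
        # Honour the wrapped body's own close() first, as middleware should.
        if hasattr(self._body, "close"):
            self._body.close()
        self.future = self._executor.submit(self._task)

sent = []
with ThreadPoolExecutor(max_workers=1) as pool:
    body = AfterResponse([b"thanks"], pool, lambda: sent.append("email"))
    served = b"".join(body)  # what the server sends to the client
    body.close()             # server signals the request is done
    body.future.result()     # waited on here only so the sketch is deterministic
```

In a real application nothing would wait on the future; the response is already complete by the time the task is submitted.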

Rather than responding to this, I'll start a new thread since this takes the environ["wsgi.executor"] discussion in a different direction. Please send your comments there.

-t


Jacob Kaplan-Moss

Jan 7, 2011, 1:40:42 PM

to Alice Bevan–McGregor, web...@python.org

On Fri, Jan 7, 2011 at 1:23 AM, Alice Bevan–McGregor
<al...@gothcandy.com> wrote:
> Other than mod_wsgi, are there any PEP 3333-compliant (or near-compliant)
> components in the wild? Enough to bring a framework to life in Python 3?
> What I see is the chicken-and-egg problem endemic with Python 3: developers
> wait on upstream to port before they do, and upstream developers are either
> waiting themselves or don't see the demand to port.

I don't see that problem any more. I have at least three WSGI servers
I could test against: modwsgi, CherryPy, and Django's half-assed
built-in server. I guess I could add wsgiref as a 4th, but only sorta.
And looks like Benoit's getting Gunicorn up to snuff.

What happens now goes something like this:

1. Get excited to port Django to Python 3.
2. Hack for a while.
3. Get something working under runserver - woo!
4. Hm, it fails under modwsgi.
5. OK, problem fixed.
6. Wait, no, now it doesn't work under runserver.
7. Or CherryPy. Dammit.
8. Lose interest for another 6 months.

At this point, there's a bug somewhere. It's *probably* in my code --
like I said earlier, I only barely grok WSGI -- but without a spec to
refer to I'm pretty much hosed. See, Django on Py 2 jumps through a
whole bunch of hoops to gloss over the string/unicode distinction and
over the question of encoding, and that stuff's pretty fiddly. The
"right" way to handle that on Python 3 depends entirely on the issues
hammered out in PEP 3333 -- particularly the byte/str decisions. I'm
starting to assume that PEP 3333 is going to get accepted in a form
fundamentally the same as it appears right now, but if I'm wrong I get
to do this stuff all over again. Nothing's more of an
enthusiasm-killer than knowing I might have to start all over again
later.

I really want a definitive answer. If the spec says it's my fault, I
want people to yell at me loudly until Django is compliant. If the
spec says it's not my fault, I want to be able to be an asshole [1]
until all the app containers are compliant.

[1] That's a technical term; see
http://diveintomark.org/archives/2004/08/16/specs.

>> Can we please, please, PLEASE, pause discussion of PEP 444 until PEP 3333
>> is finalized?
>
> This is something I've seen fairly often around PEP 444 threads; instead of
> reviving (or starting a new) PEP 3333 thread, a complaint is levied against
> PEP 444 discussion itself. That doesn't help. ;)

Unfortunately, I really don't know any other way of helping than being
a pain in the ass. I don't understand the issues well enough to
contribute technically, so I've decided I'm going to continue to
complain loudly. Hopefully you'll get sick of me and give me a spec so
I'll shut the hell up!

I do feel crappy asking you to put 444 on hold. I understand that the
SIG should be perfectly capable of working on more than one thing at
once. However, to my eyes it seems like it keeps getting derailed. I
understand this, too: clearly WSGI isn't perfect, and when you run up
against some of those issues it's a *lot* more fun to ignore 'em just
for a bit longer and work on something more exciting.

I really do appreciate the enthusiasm for PEP 444. I share it: it
seems a lot easier to implement, and it'll certainly make some of the
things Django's doing a lot easier. I just would really like to see
that enthusiasm and energy turned full blast on PEP 3333 until it's
done.

Jacob

(Luckily for you, I'm going on vacation tomorrow, so you won't have to
deal with me complaining for at least a week!)

Alice Bevan–McGregor

Jan 7, 2011, 3:29:56 PM

to web...@python.org

On 2011-01-07 09:04:07 -0800, Antoine Pitrou said:
> Alice Bevan–McGregor <alice@...> writes:
>>> I don't understand why you want a "yield" at this level. IMHO, WSGI
>>> needn't involve generators. A higher-level wrapper (framework,
>>> middleware, whatever) can wrap fd-waiting in fancy generator stuff if
>>> so desired. Or, in some other environments, delegate it to a reactor
>>> with callbacks and deferreds. Or whatever else, such as futures.
>>
>> WSGI already involves generators: the response body.
>
> Wrong.

I'm aware that it can be any form of iterable, from a list-wrapped
string all the way up to generators or other nifty things. I
mistakenly omitted these assuming that the other iterables were
universally understood and implied.

However, using a generator is a known, valid use case that I do see in
the wild. (And also rely upon in some of my own applications.)

> Right, that's why I was suggesting you drop your concern for Python 2
> compatibility.

-1

There is practically no reason for doing so; esp. considering that I've
managed to write a 2k/3k polyglot server that is more performant out of
the box than any other WSGI HTTP server I've come across and is far
simpler in implementation than most of the ones I've come across with
roughly equivalent feature sets.

Cross compatibility really isn't that hard, and arguing that 2.x
support should be dropped for the sole reason that "it might be dead by
the time this is ratified" is a bit off.

Python 2.x will be around for a long time.

- Alice.

Alice Bevan–McGregor

Jan 7, 2011, 3:37:38 PM

to web...@python.org

On 2011-01-07 08:10:43 -0800, P.J. Eby said:
> At 12:39 AM 1/7/2011 -0800, Alice Bevan–McGregor wrote:
>> :: Image scaling would benefit from multi-processing (spreading the
>> load across cores). Also, only one scale is immediately required
>> before returning the post-upload page: the thumbnail. The other
>> scales can be executed without halting the WSGI application's
>> return.
>>
>> :: Asset content extraction and indexing would benefit from
>> threading, and would also not require pausing the WSGI application.
>
> In all these cases, ISTM the benefit is the same if you future the
> WSGI apps themselves (which is essentially what most current async
> WSGI servers do, AFAIK).

Image scaling and asset content extraction should not block the
response to a HTTP request; these need to be 'forked' from the main
request. Only template generation (where the app needs to effectively
block pending completion) is solved easily by threading the whole
application call.
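
A sketch of that split, assuming the server hands the application an executor under environ["wsgi.executor"] (the key proposed earlier in this thread, not part of any accepted spec). scale and handle_upload are placeholders, and a thread pool is used only to keep the example self-contained; a ProcessPoolExecutor could be swapped in to spread the work across cores:

```python
from concurrent.futures import ThreadPoolExecutor

def scale(image, size):
    # Placeholder for real image work; returns a label instead of pixels.
    return "%s@%dpx" % (image, size)

def handle_upload(environ, image):
    executor = environ["wsgi.executor"]  # hypothetical key, see note above
    thumbnail = executor.submit(scale, image, 64)
    # The remaining scales are "forked": submitted, but never waited on here.
    background = [executor.submit(scale, image, size) for size in (256, 1024)]
    # Only the thumbnail has to exist before the post-upload page is returned.
    return thumbnail.result(), background

with ThreadPoolExecutor(max_workers=2) as pool:
    thumb, pending = handle_upload({"wsgi.executor": pool}, "photo.png")
    others = [future.result() for future in pending]  # collected for the demo only
```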

>> :: Long-duration calls to non-async-aware libraries such as DB access.
>> The WSGI application could queue up a number of long DB queries,
>> pass the futures instances to the template, and the template could
>> then .result() (block) across them or yield them to be suspended and
>> resumed when the result is available.
>>
>> :: True async is useful for WebSockets, which seem a far superior
>> solution to JSON/AJAX polling in addition to allowing real
>> web-based socket access, of course.
>
> The point as it relates to WSGI, though, is that there are plenty of
> mature async APIs that offer these benefits, and some of them (e.g.
> Eventlet and Gevent) do so while allowing blocking-style code to be
> written. That is, you just make what looks like a blocking call, but
> the underlying framework silently suspends your code, without tying
> up the thread.
>
> Or, if you can't use a greenlet-based framework, you can use a
> yield-based framework. Or, if for some reason you really wanted to
> write continuation-passing style code, you could just use the raw
> Twisted API.

But is there really any problem with providing a unified method for
indicating a suspend point? What the server does when it gets the
yielded value is entirely up to the implementation of the server; if it
(the server) wants to use greenlets, it can. If it has other
methodologies, it can go nuts.

> Even if you've already written a bunch of code using raw sockets and
> want to make it asynchronous, Eventlet and Gevent actually let you load
> a compatibility module that makes it all work, by replacing the socket
> API with an exact duplicate that secretly suspends your code whenever a
> socket operation would block.

I generally frown upon magic, and each of these implementations is
completely specific. :/

- Alice.

Paul Davis

Jan 7, 2011, 3:42:24 PM

to Alice Bevan–McGregor, web...@python.org

> There is practically no reason for doing so; esp. considering that I've
> managed to write a 2k/3k polyglot server that is more performant out of the
> box than any other WSGI HTTP server I've come across and is far simpler in
> implementation than most of the ones I've come across with roughly
> equivalent feature sets.

Is the code for this server online? I'd be interested in reading through it.

Antoine Pitrou

Jan 7, 2011, 4:21:36 PM

to web...@python.org

Alice Bevan–McGregor <alice@...> writes:
>
> On 2011-01-07 09:04:07 -0800, Antoine Pitrou said:
> > Alice Bevan–McGregor <alice@...> writes:
> >>> I don't understand why you want a "yield" at this level. IMHO, WSGI
> >>> needn't involve generators. A higher-level wrapper (framework,
> >>> middleware, whatever) can wrap fd-waiting in fancy generator stuff if
> >>> so desired. Or, in some other environments, delegate it to a reactor
> >>> with callbacks and deferreds. Or whatever else, such as futures.
> >>
> >> WSGI already involves generators: the response body.
> >
> > Wrong.
>

> I'm aware that it can be any form of iterable, [snip]

Ok, so, WSGI doesn't "already involve generators". QED.

> > Right, that's why I was suggesting you drop your concern for Python 2
> > compatibility.
>
> -1
>
> There is practically no reason for doing so;

Of course, there is one: a less complex PEP without any superfluous
compatibility language sprinkled all over. And a second one: a simpler PEP is
probably easier to get constructive comments about, and (perhaps some day)
consensus on.

> esp. considering that I've
> managed to write a 2k/3k polyglot server that is more performant out of
> the box than any other WSGI HTTP server I've come across and is far
> simpler in implementation than most of the ones I've come across with
> roughly equivalent feature sets.

Just because you "managed to write" some piece of code for a *particular* use
case doesn't mean that cross-compatibility is a solved problem. If you think
it's easy, then I'm sure the authors of various 3rd-party libs would welcome
your help achieving it.

> Python 2.x will be around for a long time.

And so will PEP 3333 and even PEP 333. People who value legacy compatibility
will favour these old PEPs over your new one anyway. People who don't will
progressively jump to 3.x.

Antoine.

Alice Bevan–McGregor

Jan 7, 2011, 4:23:41 PM

to web...@python.org

On 2011-01-07 12:42:24 -0800, Paul Davis said:

> Is the code for this server online? I'd be interested in reading through it.

https://github.com/pulp/marrow.server.http

There are two branches: master will always refer to the version
published on Python.org, and draft refers to my rewrite. (When
published, draft will be merged.)

- Alice.

Alice Bevan–McGregor

Jan 7, 2011, 5:33:23 PM

to web...@python.org

On 2011-01-07 13:21:36 -0800, Antoine Pitrou said:
> Ok, so, WSGI doesn't "already involve generators". QED.

This can go around in circles; by allowing all forms of iterable, it
involves generators. Generators are a type of iterable. QED right
back. ;)

>>> Right, that's why I was suggesting you drop your concern for Python 2
>>> compatibility.
>>
>> -1
>>
>> There is practically no reason for doing so;
>
> Of course, there is one: a less complex PEP without any superfluous
> compatibility language sprinkled all over.

There isn't any "compatibility language" sprinkled within the PEP. In
fact, the only mention of it is in the introduction (stating that < 2.6
support may be possible but is undefined) and the title of a section
"Python Cross-Version Compatibility".

Using native strings where possible encourages compatibility, though
for the environ variables previously mentioned (URI, etc.) explicit
exceptional behaviour is clearly defined. (Byte strings and true
unicode.)

> Just because you "managed to write" some piece of code for a
> *particular* use case doesn't mean that cross-compatibility is a solved
> problem.

The particular use case happens to be PEP 444 as implemented using an
async and multi-process (some day multi-threaded) HTTP server, so I'm
not quite sure what you're getting at, here. I think that use case is
sufficiently broad to be able to make claims about the ease of
implementing PEP 444 in a compatible way.

> If you think it's easy, then I'm sure the authors of various 3rd-party
> libs would welcome your help achieving it.

I helped proof a book about Python 3 compatibility and am giving a
presentation in March that contains information on Python 3
compatibility from the viewpoint of implementing the Marrow suite.

>> Python 2.x will be around for a long time.
>
> And so will PEP 3333 and even PEP 333. People who value legacy
> compatibility will favour these old PEPs over your new one anyway.
> People who don't will progressively jump to 3.x.

Yup. Not sure how this is really an issue. PEP 444 is the /future/,
333[3] is /now/ [-ish].

- Alice.

Alice Bevan–McGregor

Jan 7, 2011, 5:38:40 PM

to web...@python.org

On 2011-01-07 09:04:07 -0800, Antoine Pitrou said:

> WSGI doesn't mandate any specific feature of generators, such as
> coroutine-like semantics, and the server doesn't have to know about
> them.

The joy of writing a new specification is that we are not (potentially)
shackled by old ways of doing things. Case in point: dropping
start_response and changing the return value. PEP 444 isn't WSGI 1,
and can change things, including additional changes to the allowable
return value.
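
To illustrate the difference in calling convention, the sketch below adapts a start_response-style application to one that simply returns a (status, headers, body) tuple. to_tuple_style and legacy_app are hypothetical names, the draft's exact string types are not reproduced, and the legacy write() callable is deliberately not supported:

```python
def to_tuple_style(wsgi1_app):
    """Wrap a start_response-style app so it returns (status, headers, body)."""
    def app(environ):
        captured = {}

        def start_response(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = headers

        # list() forces generator-based apps to run far enough to call
        # start_response before the captured values are read.
        body = list(wsgi1_app(environ, start_response))
        return captured["status"], captured["headers"], body
    return app

def legacy_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"hello"

status, headers, body = to_tuple_style(legacy_app)({})
```

Buffering the whole body is a simplification for the sketch; a real adapter would keep the body lazy.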

- Alice.

Alex Grönholm

Jan 7, 2011, 6:24:30 PM

to web...@python.org

07.01.2011 07:24, Alex Grönholm wrote:
> 07.01.2011 06:49, P.J. Eby wrote:
>> At 05:47 PM 1/6/2011 -0800, Alice Bevan–McGregor wrote:
>>> Tossing the idea around all day long will then, of course, be
>>> happening regardless. Unfortunately for that particular discussion,
>>> PEP 3148 / Futures seems to have won out in the broader scope.
>>
>> Do any established async frameworks or server (e.g. Twisted,
>> Eventlet, Gevent, Tornado, etc.) make use of futures?
> I understand that Twisted has incorporated futures support to their
> deferreds. Others, I believe, don't support them yet. You have to
> consider that Python 3.2 (the first Python with futures support in
> stdlib) hasn't even been released yet, and it's only been two weeks
> since I released the drop-in backport
> (http://pypi.python.org/pypi/futures/2.1).

Exarkun corrected me on this -- there is currently no futures support in
Twisted. Sorry about the false information.

Antoine Pitrou

Jan 7, 2011, 10:36:52 PM

to web...@python.org

Alice Bevan–McGregor <alice@...> writes:
>

> On 2011-01-07 13:21:36 -0800, Antoine Pitrou said:
> > Ok, so, WSGI doesn't "already involve generators". QED.
>
> This can go around in circles; by allowing all forms of iterable, it
> involves generators. Generators are a type of iterable. QED right
> back. ;)

Please read back in context.

> There isn't any "compatibility language" sprinkled within the PEP.[...]
>
> Using native strings where possible encourages compatibility, [snip]

The whole "native strings" thing *is* compatibility cruft. A Python 3 PEP would
only need two string types: bytes and unicode (str).

> > Just because you "managed to write" some piece of code for a
> > *particular* use case doesn't mean that cross-compatibility is a solved
> > problem.
>
> The particular use case happens to be PEP 444 as implemented using an
> async and multi-process (some day multi-threaded) HTTP server, so I'm
> not quite sure what you're getting at, here.

It's becoming too difficult to parse. You aren't sure yet what the async part of
PEP 444 should look like but you have already implemented it?

> > If you think it's easy, then I'm sure the authors of various 3rd-party
> > libs would welcome your help achieving it.
>
> I helped proof a book about Python 3 compatibility and am giving a
> presentation in March that contains information on Python 3
> compatibility from the viewpoint of implementing the Marrow suite.

Well, I hope not too many people will waste time trying to write
cross-compatible code rather than solely target Python 3. The whole point of
Python 3 is to make developers' life better, not worse.

> >> Python 2.x will be around for a long time.
> >
> > And so will PEP 3333 and even PEP 333. People who value legacy
> > compatibility will favour these old PEPs over your new one anyway.
> > People who don't will progressively jump to 3.x.
>
> Yup. Not sure how this is really an issue. PEP 444 is the /future/,
> 333[3] is /now/ [-ish].

Please read back in context (instead of stripping it), *again*.

Alex Grönholm

Jan 7, 2011, 10:45:00 PM

to web...@python.org

08.01.2011 05:36, Antoine Pitrou wrote:
> Alice Bevan–McGregor <alice@...> writes:
>> On 2011-01-07 13:21:36 -0800, Antoine Pitrou said:
>>> Ok, so, WSGI doesn't "already involve generators". QED.
>> This can go around in circles; by allowing all forms of iterable, it
>> involves generators. Generators are a type of iterable. QED right
>> back. ;)
> Please read back in context.
>
>> There isn't any "compatibility language" sprinkled within the PEP.[...]
>>
>> Using native strings where possible encourages compatibility, [snip]
> The whole "native strings" thing *is* compatibility cruft. A Python 3 PEP would
> only need two string types: bytes and unicode (str).
>
>>> Just because you "managed to write" some piece of code for a
>>> *particular* use case doesn't mean that cross-compatibility is a solved
>>> problem.
>> The particular use case happens to be PEP 444 as implemented using an
>> async and multi-process (some day multi-threaded) HTTP server, so I'm
>> not quite sure what you're getting at, here.
> It's becoming too difficult to parse. You aren't sure yet what the async part of
> PEP 444 should look like but you have already implemented it?

We are still discussing the possible mechanics of PEP 444 with async
support. There is nothing definite yet, and certainly no workable
implementation yet either. Async support may or may not materialize in
PEP 444, in another PEP or not at all based on the discussions on this
list and on IRC.

>>> If you think it's easy, then I'm sure the authors of various 3rd-party
>>> libs would welcome your help achieving it.
>> I helped proof a book about Python 3 compatibility and am giving a
>> presentation in March that contains information on Python 3
>> compatibility from the viewpoint of implementing the Marrow suite.
> Well, I hope not too many people will waste time trying to write
> cross-compatible code rather than solely target Python 3. The whole point of
> Python 3 is to make developers' life better, not worse.
>
>>>> Python 2.x will be around for a long time.
>>> And so will PEP 3333 and even PEP 333. People who value legacy
>>> compatibility will favour these old PEPs over your new one anyway.
>>> People who don't will progressively jump to 3.x.
>> Yup. Not sure how this is really an issue. PEP 444 is the /future/,
>> 333[3] is /now/ [-ish].
> Please read back in context (instead of stripping it), *again*.

P.J. Eby

Jan 8, 2011, 12:09:34 AM

to al...@gothcandy.com, web...@python.org

At 12:37 PM 1/7/2011 -0800, Alice Bevan–McGregor wrote:
>But is there really any problem with providing a unified method for
>indicating a suspend point?

Yes: a complexity burden that is paid by the many to serve the few --
or possibly non-existent.

I still haven't seen anything that suggests there is a large enough
group of people who want a "portable" async API to justify
inconveniencing everyone else in order to suit their needs, vs.
simply having a different calling interface for that need.

If I could go back and change only ONE thing about WSGI 1, it would
be the calling convention. It was messed up from the start,
specifically because I wasn't adamant enough about weighing the needs
of the many against the needs of the few. Only a few needed a
push protocol (write()), and only a few even remotely cared about our
minor nod to asynchrony (yielding empty strings to pause output).

If I'd been smart (or more to the point, prescient), I'd have just
done a 3-tuple return value from the get-go, and said to hell with
those other use cases, because everybody else is paying to carry a
few people who aren't even going to use these features for real. (As
it happens, I thought write() would be needed in order to drive
adoption, and it may well have been at one time.)

Anyway, with a new spec we have the benefit of hindsight: we know
that, historically, nobody has actually cared enough to propose a
full-blown async API who wasn't also trying to make their async
server implementation work without needing threads. Never in the
history of the web-sig, AFAIK, has anyone come in and said, "hey, I
want to have an async app that can run on any async framework."

Nobody blogs or twitters about how terrible it is that the async
frameworks all have different APIs and that this makes their apps
non-portable. We see lots of complaints about not having a Python 3
WSGI spec, but virtually none about WSGI being essentially synchronous.

I'm not saying there's zero audience for such a thing... but then,
at some point there was a non-zero audience for write() and for
yielding empty strings. ;-)

The big problem is this: if, as an app developer, you want this
hypothetical portable async API, you either already have an app that
is async or you don't. If you do, then you already got married to
some particular API and are happy with your choice -- or else you'd
have bit the bullet and ported.

What you would not do, is come to the Web-SIG and ask for a spec to
help you port, because you'd then *still have to port* to the new
API... unless of course you wanted it to look like the API you're
already using... in which case, why are you porting again, exactly?

Oh, you don't have an app... okay, so *hypothetically*, if you had
this API -- which, because you're not actually *using* an async API
right now, you probably don't even know quite what you need --
hypothetically if you had this API you would write an app and then
run it on multiple async frameworks...

See? It just gets all the way to silly. The only way you can
actually get this far in the process seems to be if you are on the
server side, thinking it would be really cool to make this thing
because then surely you'll get users.

In practice, I can't imagine how you could write an app with
substantial async functionality that was sanely portable across the
major async frameworks, with the possible exception of the two that
at least share some common code, paradigms, and API. And even if you
could, I can't imagine someone wanting to.

So far, you have yet to give a concrete example of an application
that you personally (or anyone you know of) want to be able to run on
two different servers. You've spoken of hypothetical apps and
hypothetical portability... but not one concrete, "I want to run
this under both Twisted and Eventlet" (or some other two
frameworks/servers), "because of [actual, non-hypothetical rationale here]".

I don't deny that [actual non-hypothetical rationale] may exist
somewhere, but until somebody shows up with a concrete case, I don't
see a proposal getting much traction. (The alternative would be if
you pull a rabbit out of your hat and propose something that doesn't
cost anybody anything to implement... but the fact that you're
tossing the 3-tuple out in favor of yielding indicates you've got no
such proposal ready at the present time.)

On the plus side, the "run this in a future after the request"
concept has some legs, and I hope Timothy (or anybody) takes it and
runs with it. That has plenty of concrete use cases for portability
-- every sufficiently-powerful web framework will want to either
provide that feature, build other features on top of it, or both.

It's the "make the request itself async" part that's the hard sell
here, and in need of some truly spectacular rationale in order to
justify the ubiquitous costs it imposes.

Alex Grönholm

Jan 8, 2011, 1:13:17 AM

to web...@python.org

How do you suppose common async middleware could be implemented without
a common async API? Today we have plenty of WSGI middleware, which would
not be possible without a common API. You would have to make separate
interfaces for every major framework and separately test against each of
them instead of having a reasonable expectation that it will work
uniformly across compliant frameworks. I would really love to see common
middleware components that are usable on twisted, tornado etc. without
modifications.
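
For comparison, here is a minimal sketch of the portability that synchronous middleware already gets from the shared WSGI (PEP 3333) calling convention. The names add_header_middleware, hello_app and run are illustrative only and do not come from any particular framework.

```python
def add_header_middleware(app, name, value):
    """Wrap any WSGI application and append one response header."""
    def wrapped(environ, start_response):
        def custom_start_response(status, headers, exc_info=None):
            # Pass everything through, adding our header at the end.
            return start_response(status, list(headers) + [(name, value)], exc_info)
        return app(environ, custom_start_response)
    return wrapped


def hello_app(environ, start_response):
    # A trivial application; the middleware knows nothing about it.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello\n']


def run(app):
    """Tiny stand-in for a server: call the app and capture the response."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app({}, start_response))
    return captured['status'], captured['headers'], body
```

The middleware only relies on the (environ, start_response) contract, which is why it can wrap any compliant application on any compliant server. There is no equivalent shared contract for the async case today.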

You seem to be under the impression that asynchronous applications only
have some specialized uses. Asynchronous applications are no more
limited in scope than synchronous ones are. It's just an alternative
programming paradigm that has the potential of squeezing more
performance out of a server. Note that I am in no way insisting that
PEP 444 require async support; I'm only exploring that possibility. If
we cannot figure out a way to make it easy for implementors to support,
then I will push for a separate specification.

>
>
> I don't deny that [actual non-hypothetical rationale] may exist
> somewhere, but until somebody shows up with a concrete case, I don't
> see a proposal getting much traction. (The alternative would be if
> you pull a rabbit out of your hat and propose something that doesn't
> cost anybody anything to implement... but the fact that you're tossing
> the 3-tuple out in favor of yielding indicates you've got no such
> proposal ready at the present time.)
>
> On the plus side, the "run this in a future after the request" concept
> has some legs, and I hope Timothy (or anybody) takes it and runs with
> it. That has plenty of concrete use cases for portability -- every
> sufficiently-powerful web framework will want to either provide that
> feature, build other features on top of it, or both.

What exactly does "run this in a future after the request" mean? There
seems to be some terminology confusion here.

>
>
> It's the "make the request itself async" part that's the hard sell
> here, and in need of some truly spectacular rationale in order to
> justify the ubiquitous costs it imposes.

Alice Bevan–McGregor

Jan 8, 2011, 1:20:07 AM

to web...@python.org

On 2011-01-07 22:13:17 -0800, Alex Grönholm said:

> 08.01.2011 07:09, P.J. Eby wrote:
>> On the plus side, the "run this in a future after the request" concept
>> has some legs... [snip]
>
> What exactly does "run this in a future after the request" mean? There
> seems to be some terminology confusion here.

I suspect he's referring to some of the notes on the "PEP 444 feature
request - Futures executor" thread and several of my illustrated use
cases, notably:

:: Image scaling (e.g. to multiple sizes) after an image is uploaded,
where the response (Congratulations, image uploaded!) does not require
the result of the scaling.

:: Content indexing which can also be performed after returning the
success page.

The former would executor.submit() a number of scaling jobs, attach
completion callbacks to perform some cleanup / database updating /
etc., and return a response immediately. The latter is a single
executor submission that is entirely non-time-critical.
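
As a rough sketch of the image-scaling case, assuming the server exposes a concurrent.futures executor in the environment: the 'wsgi.executor' key is only the name proposed in the feature request thread, and scale_image, 'demo.results' and run_demo are hypothetical stand-ins added so the example can be run on its own.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_image(name, size):
    # Stand-in for real image scaling work (hypothetical helper).
    return "%s@%dx%d" % (name, size, size)


def application(environ, start_response):
    executor = environ['wsgi.executor']  # proposed key, not part of any spec
    results = environ['demo.results']    # hypothetical: lets the demo observe callbacks
    for size in (64, 256, 1024):
        future = executor.submit(scale_image, 'upload.png', size)
        # Completion callback: cleanup / database updates would go here.
        future.add_done_callback(lambda f: results.append(f.result()))
    # The response does not wait for any of the scaling jobs.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Congratulations, image uploaded!']


def run_demo():
    results, statuses = [], []
    # Leaving the with block waits for outstanding jobs, standing in for
    # whatever lifecycle management a real server would provide.
    with ThreadPoolExecutor(max_workers=2) as executor:
        environ = {'wsgi.executor': executor, 'demo.results': results}
        body = application(environ, lambda status, headers: statuses.append(status))
    return statuses, body, results
```

The application never constructs a Future itself; it only receives them from executor.submit(), which matches the division of labour described earlier in the thread.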

And likely other use cases as well. This (inclusion of an executor
tuned to the underlying server in the environment) is one thing I think
we can (almost) all agree is a good idea. :D Discussion on that
particular idea should be relegated to the feature request thread,
though.

- Alice.

Alice Bevan–McGregor

Jan 8, 2011, 1:57:17 AM

to web...@python.org

On 2011-01-07 13:21:36 -0800, Antoine Pitrou said:
> Ok, so, WSGI doesn't "already involve generators". QED.

Let me try this again. With the understanding that:

:: PEP 333[3] and 444 define a response body as an iterable.
:: Thus WSGI involves iterables through definition.
:: A generator is a type of iterable.
:: Thus WSGI involves generators through the use of iterables.

The hypothetical redefinition of an application as a generator is not
too far out to lunch, considering that WSGI _already involves
generators_. (And that the simple case, an application that does not
utilize async, will require a single word be changed: s/return/yield)
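
To illustrate the one-word change, here is a sketch only. It assumes the PEP 444 style of returning a (status, headers, body) 3-tuple; the string types shown and the call helper are my own assumptions for the example, not something the draft specifies.

```python
import inspect


def app_return(environ):
    # PEP 444 style: return the (status, headers, body) 3-tuple.
    return '200 OK', [('Content-Type', 'text/plain')], [b'Hello world!\n']


def app_yield(environ):
    # Hypothetical generator form of the same application: s/return/yield
    yield '200 OK', [('Content-Type', 'text/plain')], [b'Hello world!\n']


def call(app, environ):
    """Hypothetical server-side helper that accepts either form."""
    result = app(environ)
    if inspect.isgenerator(result):
        # A non-async application yields its response exactly once.
        return next(result)
    return result
```

An application that actually wanted async behaviour would yield other things before the final response; what those things are is exactly what still needs discussion.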

Is that clearer? The idea referred to below (and posted separately)
involves this redefinition, which I understand fully will have a number
of strong opponents. Considering PEP 444 is a new spec (already
breaking direct compatibility via the /already/ redefined return value)
I hope people do not reject this out of hand but instead help explore
the idea further.

On 2011-01-07 19:36:52 -0800, Antoine Pitrou said:
> Alice Bevan–McGregor <alice@...> writes:
>> The particular use case happens to be PEP 444 as implemented using an
>> async and multi-process (some day multi-threaded) HTTP server, so I'm
>> not quite sure what you're getting at, here.
>
> It's becoming too difficult to parse. You aren't sure yet what the async
> part of PEP 444 should look like but you have already implemented it?

Marrow HTTPd (marrow.server.http) [1] is, internally, an asynchronous
server. It does not currently expose the reactor to the WSGI
application via any interface whatsoever. I am, however, working on
some p-code examples (that I will post for discussion as mentioned
above) which I can base a fork of m.s.http off of to experiment.

This means that, yes, I'm not sure how async will work in PEP 444 /in
the end/, but I am at least attempting to explore the practical
implications of the ideas thus far in a real codebase. I'm "getting it
done", even if it has to change or be scrapped.

>> I helped proof a book about Python 3 compatibility and am giving a
>> presentation in March that contains information on Python 3
>> compatibility from the viewpoint of implementing the Marrow suite.
>
> Well, I hope not too many people will waste time trying to write
> cross-compatible code rather than solely target Python 3. The whole
> point of Python 3 is to make developers' life better, not worse.

I agree, with one correction to your first point. Application and
framework developers should whole-heartedly embrace Python 3 and make
full use of its many features, simplifications and clarifications.
However, it is demonstrably not Insanely Difficult™ to have compatible
server and middleware implementations with the draft's definition of
native string. If server and middleware developers are willing to
create polyglot code, I'm not going to stop them.

Note that this type of compatibility is not mandated, and the use of
native strings (with one well defined byte string exception) means that
pure Python 3 programmers can be blissfully ignorant of the
compatibility implications -- everything else is "unicode" (str), even
if it's just "bytes-in-unicode" (latin1/iso-8859-1). Pure Python 2
programmers have only a small difference (for them) of the URI values
being unicode; the remaining values are byte strings (str).
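
A small sketch of what "bytes-in-unicode" means in practice on Python 3. The helper names to_native and reinterpret are made up for this example, and the UTF-8 reinterpretation is just one choice an application might make.

```python
def to_native(raw):
    # latin1 maps every byte 0-255 to the code point with the same number,
    # so decoding can never fail and never loses information.
    return raw.decode('latin1')


def reinterpret(native, encoding='utf-8'):
    # Recover the original bytes, then decode them with the real encoding.
    return native.encode('latin1').decode(encoding)


raw = b'/caf\xc3\xa9'           # UTF-8 encoded bytes as received off the wire
native = to_native(raw)         # a str with one character per original byte
text = reinterpret(native)      # the text the client actually meant
```

Because the latin1 round trip is lossless, code that only passes the value along never needs to care about the real encoding; only code that wants the text has to re-decode it.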

I would like to hear a technical reason why this (native strings) is a
bad idea instead of vague "this will make things harder" -- it won't,
at least, not measurably, and I have the proof as a working, 100% unit
tested, performant, cross-compatible polyglot HTTP/1.1-compliant server.
It was written in several days' worth of "full-time work" spread across
weeks because this is a spare-time project; i.e. not a lot of literal
work, nor "hard".

Hell, it has transformed from a crappy hack to experiment with HTTP
into a complete (or very nearly so) implementation of PEP 444 in both
of its current forms (published and draft) that is almost usable,
ignoring the fact that PEP 444 is mutable, of course.

- Alice.

[1] http://bit.ly/fLfamO
