Re: [Web-SIG] [Python-Dev] wsgi validator with asynchronous handlers/servers


PJ Eby

Mar 24, 2013, 1:14:10 AM
to Luca Sbardella, web-sig
On Sat, Mar 23, 2013 at 7:30 PM, Luca Sbardella
<luca.sb...@gmail.com> wrote:
>PJ Eby wrote:
>> The validator is correct for the spec. You *must* call
>> start_response() before yielding any strings at all.
>
>
> Thanks for response PJ,
> that is what I, unfortunately, didn't want to hear, the validator being
> correct for the "spec" means I can't use it for my asynchronous stuff, which
> is a shame :-(((
> But why commit to send headers when you may not know about your response?
> Sorry if this is the wrong mailing list for the issue, I'll adjust as I go
> along.

Because async was added as an afterthought to WSGI about nine years
ago, and we didn't get it right, but it long ago was too late to do
anything about it. A properly async WSGI implementation will probably
have to wait for Tulip (Guido's project to bring a standard async
programming API to Python).
_______________________________________________
Web-SIG mailing list
Web...@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/python-web-sig-garchive-9074%40googlegroups.com

Guido van Rossum

Mar 24, 2013, 9:08:04 PM
to Luca Sbardella, web-sig, python-tulip
Hi Luca,

Unfortunately I haven't thought yet about the interactions between WSGI and Tulip or PEP 3156. While I am pretty familiar with WSGI, I have never used its async features, so I can't be much of a help. My best guess is that we won't make any changes to WSGI to support PEP 3156 in Python 3.4, but that once that is out, some folks will come up with an improved design for WSGI that supports interoperability with standard async event loops. OTOH, maybe you can read up on the PEP and check out the Tulip implementation (http://code.google.com/p/tulip/) and maybe you can come up with a suitable design for integrating PEP 3156 into WSGI? Though it may have to be named WSGI 2.0 to emphasize that it is backwards incompatible.

--Guido



On Sun, Mar 24, 2013 at 2:18 PM, Luca Sbardella <luca.sb...@gmail.com> wrote:
Hello,

First time here. I'm Luca, and I write lots of Python of the asynchronous variety.
This question is about WSGI and the way pulsar (http://quantmind.github.com/pulsar/) handles asynchronous WSGI responses.

Yesterday I sent a message to the python-dev mailing list regarding wsgiref.validate.validator. This is the original message:

I have an asynchronous wsgi application handler which yields empty bytes before it is ready to yield the response body and, importantly, to call start_response.

Something like this:

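def wsgi_handler(environ, start_response):
    # generate_body, maybe_async and is_async are pseudocode helpers.
    body = generate_body(environ)
    body = maybe_async(body)
    # Yield empty bytes while the result is still pending, so the
    # server can release the event loop between iterations.
    while is_async(body):
        yield b''
    # Headers are committed only once the response is known.
    start_response(...)
    ...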

I started using wsgiref.validate.validator recently, a nice little gem in the standard lib, and I discovered that the above handler does not validate! Disaster.
Reading PEP 3333:

"the application must invoke the start_response() callable before the iterable yields its first body bytestring, so that the server can send the headers before any body content. However, this invocation may be performed by the iterable's first iteration, so servers must not assume that start_response() has been called before they begin iterating over the iterable."

The pseudocode above does yield bytes before start_response, but they are not *body* bytes; they are empty bytes, yielded so that the asynchronous WSGI server releases the event loop and calls back at the next event loop iteration.
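For illustration, the behaviour described here can be reproduced with a short script. This is a minimal sketch, assuming Python 3 and the standard-library wsgiref.validate.validator; the two applications and the run() helper are hypothetical names made up for the example, not code from pulsar.

```python
from wsgiref.util import setup_testing_defaults
from wsgiref.validate import validator


def early_yield_app(environ, start_response):
    # Hypothetical async-style handler: yields an empty chunk first
    # and only calls start_response() afterwards.
    yield b""
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"hello"


def headers_first_app(environ, start_response):
    # Same body, but start_response() runs during the first iteration,
    # before any bytestring (even an empty one) is yielded.
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b""
    yield b"hello"


def run(app):
    """Drive app through the validator roughly the way a minimal server would."""
    environ = {}
    setup_testing_defaults(environ)
    environ["QUERY_STRING"] = ""  # avoids a validator warning about a missing key

    def start_response(status, headers, exc_info=None):
        return lambda data: None  # dummy write() callable

    result = validator(app)(environ, start_response)
    try:
        return b"".join(result)
    finally:
        result.close()
```

With these definitions, run(headers_first_app) returns the body, while run(early_yield_app) raises AssertionError from the validator as soon as the first empty chunk is yielded.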


And the response was


>PJ Eby wrote:
>> The validator is correct for the spec.  You *must* call
>> start_response() before yielding any strings at all.
>
>
> Thanks for response PJ,
> that is what I, unfortunately, didn't want to hear, the validator being
> correct for the "spec" means I can't use it for my asynchronous stuff, which
> is a shame :-(((
> But why commit to send headers when you may not know about your response?
> Sorry if this is the wrong mailing list for the issue, I'll adjust as I go
> along.

Because async was added as an afterthought to WSGI about nine years
ago, and we didn't get it right, but it long ago was too late to do
anything about it.  A properly async WSGI implementation will probably
have to wait for Tulip (Guido's project to bring a standard async
programming API to Python).

And so here I am.
I know Tulip is in its early stages, but is there anything in the pipeline about WSGI?
Happy to help if needed.

Regards
Luca




--
--Guido van Rossum (python.org/~guido)

Guido van Rossum

Mar 25, 2013, 2:48:09 PM
to Luca Sbardella, web-sig, python-tulip
Awesome! Can't wait to see that.


On Mon, Mar 25, 2013 at 11:30 AM, Luca Sbardella <luca.sb...@gmail.com> wrote:
> maybe you can read up on the PEP and check out the Tulip implementation (http://code.google.com/p/tulip/) and maybe you can come up with a suitable design for integrating PEP 3156 into WSGI? Though it may have to be named WSGI 2.0 to emphasize that it is backwards incompatible.


I have an idea already,
I'll write an initial implementation based on tulip.http.Response & tulip.http.ServerHttpProtocol and I'll write a little example using it. 

Luca

Manlio Perillo

Mar 25, 2013, 5:50:17 PM
to web...@python.org
On 24/03/2013 06:14, PJ Eby wrote:
> [...]
>> Thanks for response PJ,
>> that is what I, unfortunately, didn't want to hear, the validator being
>> correct for the "spec" means I can't use it for my asynchronous stuff, which
>> is a shame :-(((
>> But why commit to send headers when you may not know about your response?
>> Sorry if this is the wrong mailing list for the issue, I'll adjust as I go
>> along.
>
> Because async was added as an afterthought to WSGI about nine years
> ago, and we didn't get it right, but it long ago was too late to do
> anything about it. A properly async WSGI implementation will probably
> have to wait for Tulip (Guido's project to bring a standard async
> programming API to Python).

Do you really need a standard async programming API to design and
implement an async WSGI specification?

I think it is not needed.
Some time ago I posted a sample implementation and documentation for a
very simple async extension for WSGI:
https://bitbucket.org/mperillo/txwsgi

An interesting example of how an async API can be designed is
PostgreSQL's libpq, where the API exposes a direct interface to the
protocol state machine (PQconsumeInput), so you can not only use it with
any async framework you like, but you can also use it in blocking mode.

This, as far as I know, is impossible with the network protocol
implementations in Twisted or other async frameworks.
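To make the design idea concrete: a protocol state machine that performs no I/O of its own can be driven by blocking code and by an event loop alike. The following is only a sketch of that general idea in Python, not libpq's API; LineProtocol and the two driver functions are hypothetical names, and the "protocol" is just newline-delimited text.

```python
import asyncio
import socket


class LineProtocol:
    """I/O-free state machine: callers feed it bytes and get complete lines back."""

    def __init__(self):
        self._buffer = b""

    def feed(self, data):
        self._buffer += data
        # Everything before the last newline is complete; keep the rest buffered.
        *lines, self._buffer = self._buffer.split(b"\n")
        return lines


def read_lines_blocking(sock):
    """Blocking driver: the caller owns the socket and the recv() calls."""
    proto, lines = LineProtocol(), []
    while True:
        data = sock.recv(4096)
        if not data:
            return lines
        lines.extend(proto.feed(data))


async def read_lines_async(reader):
    """Event-loop driver: same state machine, fed from an asyncio StreamReader."""
    proto, lines = LineProtocol(), []
    while True:
        data = await reader.read(4096)
        if not data:
            return lines
        lines.extend(proto.feed(data))
```

Both drivers share the parsing logic unchanged; only the way bytes are obtained differs.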



Regards Manlio

est

Apr 27, 2013, 12:36:29 AM
to Guido van Rossum, Luca Sbardella, python-tulip, web-sig
Hi,

Newbie opinion here.

Since we are talking about Tulip and PEP 3156, I think it's high time we address some of the design flaws in WSGI 1.0.

One major problem with WSGI is that it cannot handle true post-response hooks.

The closest hack I found is this:


As discussed by Graham Dumpleton here

Although the response has been returned to the client, it will still hold the HTTP connection open until __callback finishes.

Yet a post-response hook is a pretty common design pattern in the modern Web world. I can think of a few uses:

 - A user uploads a file and gets back HTML saying "Upload OK"; the Web worker then continues transferring the file to Amazon S3, which is slow and takes some time.
 - After a series of user interactions on a web page, use the existing db connection to write OLAP logs for later analysis.
 - Notify another ZMQ/XMPP connection about the HTTP request.
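A minimal sketch of one way such a hook is commonly built within WSGI 1.0, assuming it wraps the response iterable and runs a callback from close(). This is not necessarily the hack referred to in this message; CallOnClose and with_post_response_hook are hypothetical names.

```python
class CallOnClose:
    """Wrap a WSGI response iterable and run a callback when the server closes it."""

    def __init__(self, iterable, callback):
        self._iterable = iterable
        self._callback = callback

    def __iter__(self):
        return iter(self._iterable)

    def close(self):
        try:
            # Preserve the wrapped iterable's own close(), if it has one.
            inner_close = getattr(self._iterable, "close", None)
            if inner_close is not None:
                inner_close()
        finally:
            # Runs after the body was handed to the server, but still inside
            # the request, so the worker stays busy until it returns.
            self._callback()


def with_post_response_hook(app, callback):
    """Middleware: call callback() once the server has finished with the response."""
    def wrapper(environ, start_response):
        return CallOnClose(app(environ, start_response), callback)
    return wrapper
```

The hook only fires if the server honours the close() method on the returned iterable, which WSGI requires it to do after the request completes.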

Currently, Celery is extremely popular (at least in Django and other non-async web frameworks). But IMHO it's too heavyweight, and copying Python data and objects from a cluster of Web workers to another cluster of task queue workers is not worth it.

Another problem is the good old CGI environ design. I can't help but ask: why?

Every HTTP header is transferred via environ, capitalized and given an HTTP_ prefix, e.g. HTTP_HOST. There's some serious information loss here:

1. Actual header string case
2. Header order

Since WSGI is a higher-level framework, I think it's time for us to deliver the original header state in a SortedDict.
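The information loss can be shown with a small sketch of the CGI-style name mapping. The helper names cgi_key and headers_to_environ are hypothetical, and real servers differ in details such as how they join repeated headers.

```python
def cgi_key(header_name):
    """Map an HTTP header name to its CGI-style environ key."""
    name = header_name.upper().replace("-", "_")
    # CGI special-cases these two headers: they get no HTTP_ prefix.
    if name in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        return name
    return "HTTP_" + name


def headers_to_environ(raw_headers):
    """Fold a list of (name, value) pairs into environ-style keys."""
    environ = {}
    for name, value in raw_headers:
        key = cgi_key(name)
        if key in environ:
            # Repeated headers collapse into one comma-joined value.
            environ[key] = environ[key] + "," + value
        else:
            environ[key] = value
    return environ
```

After the mapping, "host" and "Host" are indistinguishable, "X-Custom" and "x_custom" collide, and the position of each header relative to the others is no longer part of the data model.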

Again, as newbie advice: we should take this chance to integrate PEP 3156 with a dead simple WSGI 3.0 design:

def application(request):
    ip = request.remote_ip
    length = request.headers["Content-Length"]
    request.write("<html>done.</html>")
    request.close()
    db.log(length) # some post-response actions.




Graham Dumpleton

Apr 27, 2013, 1:24:33 AM
to est, Luca Sbardella, web-sig, python-tulip
I described a different way of doing WSGI which would better cope with post response hooks at the Python Web Summit at PyCon in 2012. It made use of the context manager abstraction so it wouldn't screw with the returned iterable.

http://www.slideshare.net/GrahamDumpleton/pycon-us-2012-state-of-wsgi-2-14808297

Graham

PJ Eby

Apr 27, 2013, 6:21:31 PM
to Graham Dumpleton, Luca Sbardella, python-tulip, web-sig
On Sat, Apr 27, 2013 at 1:24 AM, Graham Dumpleton
<graham.d...@gmail.com> wrote:
> I described a different way of doing WSGI which would better cope with post
> response hooks at the Python Web Summit at PyCon in 2012. It made use of the
> context manager abstraction so it wouldn't screw with the returned iterable.
>
> http://www.slideshare.net/GrahamDumpleton/pycon-us-2012-state-of-wsgi-2-14808297

Also, wsgi_lite provides a way of registering resources to be closed
post-response, that works within WSGI 1.0, also without altering the
returned iterable:

https://bitbucket.org/pje/wsgi_lite#close-and-resource-cleanups

Although wsgi_lite provides programmatic support for this, it's
internally implemented as a stock WSGI extension key
('wsgi_lite.closing') in the environ, and can be offered today by
servers or middleware in a 1.0 environment. I just haven't gotten
around to knocking out a PEP for it.
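
As a rough illustration of how a middleware could offer such an extension key in a WSGI 1.0 environment: this is a sketch only, not wsgi_lite's actual implementation. The key name 'example.closing', its "callable that registers an object with close()" interface, and both names below are assumptions made for the example.

```python
class _CloseResources:
    """Response wrapper that closes registered resources after the response."""

    def __init__(self, result, resources):
        self._result = result
        self._resources = resources

    def __iter__(self):
        return iter(self._result)

    def close(self):
        try:
            if hasattr(self._result, "close"):
                self._result.close()
        finally:
            # Close registered resources in reverse registration order.
            while self._resources:
                self._resources.pop().close()


def closing_middleware(app, key="example.closing"):
    """Expose a registration callable under a hypothetical environ key."""
    def wrapper(environ, start_response):
        resources = []
        environ[key] = resources.append  # app calls environ[key](resource)
        return _CloseResources(app(environ, start_response), resources)
    return wrapper
```

The application itself returns a plain iterable and only calls environ['example.closing'](resource); the wrapping happens in the middleware or server.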