> It would be helpful if you addressed the issue of scope, i.e., what features are you proposing to offer to the application developer.
Conformity, predictability, and portability. That's a lot of y's. (Pardon the pun!)
Alex Grönholm's post describes the goal quite clearly.
> So far, I believe you're the second major proponent (i.e. ones with > concrete proposals and/or implementations to discuss) of an async > protocol... and what you have in common with the other proponent is > that you happen to have written an async server that would benefit from > having apps operating asynchronously. ;-)
Well, the Marrow HTTPd does operate in multi-process mode, and, one day, multi-threaded or a combination. Integration of a futures executor to the WSGI environment would alleviate the major need for a multi-threaded implementation in the server core; intensive tasks can be deferred to a thread pool vs. everything being deferred to a thread pool. (E.g. template generation, PDF/other text extraction for indexing of file uploads, image scaling, etc. all of which are real use cases I have which would benefit from futures.)
> I find it hard to imagine an app developer wanting to do something > asynchronously for which they would not want to use one of the big-dog > asynchronous frameworks. (Especially if their app involves database > access, or other communications protocols.)
Admittedly, a truly async server needs some way to allow file descriptors to be registered with the reactor core, with the WSGI application being resumed upon some event (e.g. socket is readable or writeable for DB access, or even pipe operations for use cases I can't think of at the moment).
Futures integration is a Good Idea, IMHO, and being optional and easily added to the environ by middleware for servers that don't implement it natively is even better.
As for how to provide a generic interface to an async core, I have two ideas, but one is magical and the other is more so; I'll describe these in a discrete post.
> This doesn't mean I think having a futures API is a bad thing, but ISTM that a futures extension to WSGI 1 could be defined right now using an x-wsgi-org extension in that case... and you could then find out how many people are actually interested in using it.
I'll add writing up a WSGI middleware layer that configures and adds a future.executor to the environ to my already overweight to-do list. It actually is something I have a use for right now on at least one commercial project. :)
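For the sake of discussion, such a middleware layer might look roughly like the sketch below. It is only a sketch under assumptions: the environ key "wsgi.executor" is the name floated in this thread and is not part of any ratified spec, the executor comes from concurrent.futures (stdlib in 3.2, or the futures backport on 2.x), and ExecutorMiddleware and demo_app are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


class ExecutorMiddleware(object):
    """Hypothetical middleware: expose a shared executor to the wrapped app."""

    def __init__(self, app, executor=None, key="wsgi.executor"):
        self.app = app
        self.key = key  # assumed key name, not defined by any ratified spec
        self.executor = executor or ThreadPoolExecutor(max_workers=4)

    def __call__(self, environ, start_response):
        # Do not clobber an executor supplied natively by the server.
        environ.setdefault(self.key, self.executor)
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    # Defer a (pretend) expensive computation to the pool, then wait for it.
    future = environ["wsgi.executor"].submit(sum, range(1000))
    body = str(future.result()).encode("ascii")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


application = ExecutorMiddleware(demo_app)
```

Because the middleware uses setdefault, a server that provides its own executor keeps control of it, and the application code stays the same either way.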
> Mainly, though, what I see is people using the futures thing to shuffle > off compute-intensive tasks...
That's what it's for. ;)
> ...but if they do that, then they're basically trying to make the > server's life easier... but under the existing spec, any truly async > server implementing WSGI is going to run the *app* in a "future" of > some sort already...
Running the application in a future is actually not a half-bad way for me to add threading to marrow.server... thanks!
> Which means that the net result is that putting in async is like saying to the app developer: "hey, you know this thing that you just could do in WSGI 1 and the server would take care of it for you? Well, now you can manage that complexity by yourself! Isn't that wonderful?" ;-)
That's a bit extreme; PEP 444 servers may still implement threading, multi-processing, etc. at the reactor level (a la CherryPy or Paste). Giving WSGI applications access to a futures executor (possibly the one powering the main processing threads) simply gives applications the ability to utilize it, not the requirement to do so.
> I could be wrong of course, but I'd like to see what concrete use cases people have for async.
Earlier in this post I illustrated a few that directly apply to a commercial application I am currently writing. I'll elaborate:
:: Image scaling would benefit from multi-processing (spreading the load across cores). Also, only one scale is immediately required before returning the post-upload page: the thumbnail. The other scales can be executed without halting the WSGI application's return.
:: Asset content extraction and indexing would benefit from threading, and would also not require pausing the WSGI application.
:: Since most templating engines aren't streaming (see my unanswered thread in the general mailing list re: this), pausing the application pending a particularly difficult render is a boon to single-threaded async servers, though true streaming templating (with flush semantics) would be the holy grail. ;)
:: Long-duration calls to non-async-aware libraries such as DB access. The WSGI application could queue up a number of long DB queries, pass the futures instances to the template, and the template could then .result() (block) across them or yield them to be suspended and resumed when the result is available.
:: True async is useful for WebSockets, which seem a far superior solution to JSON/AJAX polling in addition to allowing real web-based socket access, of course.
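To make the image-scaling case above concrete, here is a rough sketch. It assumes a module-level thread pool for brevity (a ProcessPoolExecutor would be the way to actually spread the load across cores), and scale and handle_upload are hypothetical stand-ins rather than code from the real application.

```python
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=4)


def scale(name, size):
    # Stand-in for real image scaling; returns a label instead of pixels.
    return "%s@%dx%d" % (name, size[0], size[1])


def handle_upload(name):
    # Only the thumbnail is needed to render the post-upload page.
    thumbnail = executor.submit(scale, name, (64, 64))
    # The remaining scales are queued but never waited on by the request.
    background = [executor.submit(scale, name, size)
                  for size in [(320, 240), (1024, 768)]]
    return thumbnail.result(), background


thumb, pending = handle_upload("photo.png")
```

The request blocks only on thumbnail.result(); the futures in pending complete on their own time and could just as well be handed to a template or ignored entirely.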
> We dropped the first discussion of async six years ago because someone (I think it might've been James) pointed out that, well, it isn't actually that useful. And every subsequent call for use cases since has been answered with, "well, the use case is that you want it to be async."
On 2011-01-06 10:15:19 -0800, Antoine Pitrou said:
> Alice Bevan–McGregor <alice@...> writes:
>>> Er, for the record, in Python 3 non-blocking file objects return None when read() would block.
>> -1
>> I'm aware, however that's not practically useful. How would you detect >> from within the WSGI 2 application that the file object has become >> readable? Implement your own async reactor / select / epoll loop? >> That's crazy talk! ;)
> I was just pointing out that if you need to choose a convention for > signaling blocking reads on a non-blocking object, it's already there.
I don't. I need a way to suspend execution of a WSGI application pending some operation, often waiting for socket or file read or write availability. (Just as often something entirely unrelated to file descriptors, see my previous post from a few moments ago.)
> By the way, an event loop is the canonical implementation of > asynchronous programming, so I'm not sure what you're complaining > about. Or perhaps you're using "async" in a different meaning? (which > one?)
If you use non-blocking sockets, and the WSGI server provides a way to directly access the client socket (ack!), utilizing the None response on reads would require you to run a tight loop within your application to wait for actual data. That's really, really bad, and in a single-threaded server, deadly.
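A small illustration of the point, assuming a POSIX system (os.set_blocking and select on pipes are not portable everywhere) and using a pipe in place of a client socket. The raw read returns None when nothing is available, so the application either spins on it or something has to wait for readiness, which is really the reactor's job:

```python
import os
import select

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)
reader = open(read_fd, "rb", buffering=0)   # raw, unbuffered file object

first = reader.read(1024)        # nothing written yet: no data available

os.write(write_fd, b"ready")
# Instead of spinning on read(), ask the OS to block until data arrives.
readable, _, _ = select.select([reader], [], [], 1.0)
second = reader.read(1024) if readable else None

reader.close()
os.close(write_fd)
```

Here first is None because the pipe is empty; polling read() until it stops returning None is exactly the tight loop described above.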
> I don't understand why you want a "yield" at this level. IMHO, WSGI > needn't involve generators. A higher-level wrapper (framework, > middleware, whatever) can wrap fd-waiting in fancy generator stuff if > so desired. Or, in some other environments, delegate it to a reactor > with callbacks and deferreds. Or whatever else, such as futures.
WSGI already involves generators: the response body. In fact, the templating engine I wrote (and extended to support flush semantics) utilizes a generator to return the response body. Works like a hot damn, too.
Yield is the Python language's native way to suspend execution of a callable in a re-entrant way. A trivial example of this is an async "ping-pong" reactor. I wrote one ("you aren't a real Python programmer unless...") as an experiment and utilize it for server monitoring with tasks being generally scheduled against time, vs. edge-triggered or level-triggered fd operation availability.
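For anyone who has not written one, a "ping-pong" scheduler can be sketched in a few lines. This is not the monitoring reactor mentioned above (that one schedules against time); player and run are hypothetical names and the scheduling here is plain round-robin.

```python
from collections import deque


def player(name, rounds, log):
    for i in range(rounds):
        log.append("%s %d" % (name, i))
        yield  # suspend here; the loop decides when to resume us


def run(tasks):
    # Round-robin scheduler: resume each generator until all are exhausted.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue
        queue.append(task)


log = []
run([player("ping", 2, log), player("pong", 2, log)])
# log is now ["ping 0", "pong 0", "ping 1", "pong 1"]
```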
Everyone has their own idea of what a "deferred" is, and there is only one definition of a "future", which (in a broad sense) is the same as the general idea of a "deferred". Deferreds just happen to be implementation-specific and often require rewriting large portions of external libraries to make them compatible with that specific deferred implementation. That's not a good thing.
Hell; an extension to the futures spec to handle file descriptor events might not be a half-bad idea. :/
> By the way, the concurrent.futures module is new. Though it will be > there in 3.2, it's not guaranteed that its API and semantics will be > 100% stable while people start to really flesh it out.
Ratification of PEP 444 is a long way off itself. Also, Alex Grönholm maintains a pypi backport of the futures module compatible with 2.x+ (not sure of the specific minimum version) and < 3.2. I'm fairly certain deprecation warnings wouldn't kill the usefulness of that implementation. Worrying about instability, at this point, may be premature.
>> +1 for pure futures which (in theory) eliminate the need for dedicated >> async versions of absolutely everything at the possible cost of >> slightly higher overhead.
> I don't understand why futures would solve the need for a low-level > async facility.
You misinterpreted; I didn't mean to imply that futures would replace an async core reactor, just that long-running external library calls could be trivially deferred using futures.
> You still need to define a way for the server and the app to wake each > other (and for the server to wake multiple apps).
Futures are a pretty convenient way to have a server wake an app; using a future completion callback wrapped (using partial) with the paused application generator would do it. (The reactor Marrow uses, a modified Tornado IOLoop, would require calling reactor.add_callback(partial(worker, app_gen)) followed by reactor._wake() in the future callback.)
"Waking up the server" would be accomplished by yielding a futures instance (or fd magical value, etc).
> This isn't done "naturally" in Python (except perhaps with stackless or > greenlets). Using fds give you well-known flexible possibilities.
Yield is the natural way for one side of that, re-entering the generator on future completion covers the other side. Stackless and greenlets are alternate ideas, but yield is built-in (and soon, so will futures).
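Both halves can be sketched without any particular reactor. In the sketch below a thread-safe queue stands in for the reactor's add_callback / wake mechanism (an assumption for illustration, not how Marrow or Tornado actually implement it), and app, resume and serve are hypothetical names. There is no error handling: a future that raises would stall the loop.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def app(executor, out):
    # "Wake the server": hand it a future and suspend.
    result = yield executor.submit(pow, 2, 10)
    out.append(result)


def resume(ready, gen, future):
    # Runs when the future completes; schedules the generator to continue.
    ready.put((gen, future.result()))


def serve(app_gen):
    ready = queue.Queue()  # stand-in for the reactor's callback queue
    ready.put((app_gen, None))
    while True:
        gen, value = ready.get()
        try:
            yielded = gen.send(value)
        except StopIteration:
            return
        # "Wake the app": re-enter the generator on future completion.
        yielded.add_done_callback(partial(resume, ready, gen))


out = []
with ThreadPoolExecutor(max_workers=1) as pool:
    serve(app(pool, out))
```

The application never sees the callback machinery; it yields a future and is handed the result when it is resumed.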
> If you want to put the futures API in WSGI, think of the poor authors > of a WSGI server written in C who will have to write their own executor > and future implementation. I'm sure they have better things to do.
If they embed a Python interpreter via C, they can utilize native implementations of future executors, though these will obviously be slightly less performant than a native C implementation. (That is, unless the stdlib version in 3.2 will have C backing.)
>Earlier in this post I illustrated a few that directly apply to a >commercial application I am currently writing. I'll elaborate:
>:: Image scaling would benefit from multi-processing (spreading the load across cores). Also, only one scale is immediately required before returning the post-upload page: the thumbnail. The other scales can be executed without halting the WSGI application's return.
>:: Asset content extraction and indexing would benefit from >threading, and would also not require pausing the WSGI application.
>:: Since most templating engines aren't streaming (see my unanswered >thread in the general mailing list re: this), pausing the >application pending a particularly difficult render is a boon to >single-threaded async servers, though true streaming templating >(with flush semantics) would be the holy grail. ;)
In all these cases, ISTM the benefit is the same if you future the WSGI apps themselves (which is essentially what most current async WSGI servers do, AFAIK).
>:: Long-duration calls to non-async-aware libraries such as DB access. >The WSGI application could queue up a number of long DB queries, >pass the futures instances to the template, and the template could >then .result() (block) across them or yield them to be suspended and >resumed when the result is available.
>:: True async is useful for WebSockets, which seem a far superior >solution to JSON/AJAX polling in addition to allowing real web-based >socket access, of course.
The point as it relates to WSGI, though, is that there are plenty of mature async APIs that offer these benefits, and some of them (e.g. Eventlet and Gevent) do so while allowing blocking-style code to be written. That is, you just make what looks like a blocking call, but the underlying framework silently suspends your code, without tying up the thread.
Or, if you can't use a greenlet-based framework, you can use a yield-based framework. Or, if for some reason you really wanted to write continuation-passing style code, you could just use the raw Twisted API.
But in all of these cases you would be better off than if you used a half-implementation of the same thing using futures under WSGI, because all of those frameworks already have mature and sophisticated APIs for doing async communications and DB access. If you try to do it with WSGI under the guise of "portability", all this means is that you are stuck rolling your own replacements for those existing APIs.
Even if you've already written a bunch of code using raw sockets and want to make it asynchronous, Eventlet and Gevent actually let you load a compatibility module that makes it all work, by replacing the socket API with an exact duplicate that secretly suspends your code whenever a socket operation would block.
IOW, if you are writing a truly async application, you'd almost have to be crazy to want to try to do it *portably*, vs. picking a full-featured async API and server suite to code against. And if you're migrating an existing, previously-synchronous WSGI app to being asynchronous, the obvious thing to do would just be to grab a copy of Eventlet or Gevent and import the appropriate compatibility modules, not rewrite the whole thing to use futures.
> > I don't understand why you want a "yield" at this level. IMHO, WSGI > > needn't involve generators. A higher-level wrapper (framework, > > middleware, whatever) can wrap fd-waiting in fancy generator stuff if > > so desired. Or, in some other environments, delegate it to a reactor > > with callbacks and deferreds. Or whatever else, such as futures.
> WSGI already involves generators: the response body.
Wrong. The response body is an arbitrary iterable, which means it can be a sequence, a generator, or something else. WSGI doesn't mandate any specific feature of generators, such as coroutine-like semantics, and the server doesn't have to know about them.
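To illustrate the distinction (a sketch only; list_app, generator_app and drain are hypothetical names): a server that simply iterates over the body, and calls close() if present, handles a list and a generator identically and never relies on send() or throw().

```python
def list_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello ", b"world"]          # a plain sequence


def generator_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"hello "                        # a generator
    yield b"world"


def drain(app):
    # The server side only iterates; it never calls send() or throw().
    body = app({}, lambda status, headers: None)
    try:
        return b"".join(body)
    finally:
        close = getattr(body, "close", None)
        if close is not None:
            close()
```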
> Everyone has their own idea of what a "deferred" is, and there is only > one definition of a "future", which (in a broad sense) is the same as > the general idea of a "deferred".
A Twisted deferred is as well defined as a Python stdlib future; actually, deferreds have been in use by the Python community for much, much longer than futures. But that's beside the point, since I'm proposing that your spec doesn't rely on a high-level abstraction at all.
> Ratification of PEP 444 is a long way off itself.
Right, that's why I was suggesting you drop your concern for Python 2 compatibility.
When I originally requested a futures executor option (the email that started this thread), this is more like what I had in mind. I'm not against async...rather indifferent. But I wanted the ability for the server to run something after the response had been fully served to the client and thus not blocking the response. The example I gave was sending an email, but there are plenty of other use cases. Futures seemed like the right way to do this. I'm also not sure futures is the right way to build an async specification and for that matter, there will be a lot to work out with regard to PEP 444.
Rather than responding to this, I'll start a new thread since this takes the environ["wsgi.executor"] discussion in a different direction. Please send your comments there.
----- Original Message -----
From: "Guido van Rossum" <gu...@python.org>
To: "P.J. Eby" <p...@telecommunity.com>
Cc: al...@gothcandy.com, web-...@python.org
Sent: Thursday, January 6, 2011 11:30:11 PM
Subject: Re: [Web-SIG] PEP 444 / WSGI 2 Async
On Thu, Jan 6, 2011 at 8:49 PM, P.J. Eby <p...@telecommunity.com> wrote:
> At 05:47 PM 1/6/2011 -0800, Alice Bevan–McGregor wrote:
>> Tossing the idea around all day long will then, of course, be happening >> regardless. Unfortunately for that particular discussion, PEP 3148 / >> Futures seems to have won out in the broader scope.
> Do any established async frameworks or server (e.g. Twisted, Eventlet, > Gevent, Tornado, etc.) make use of futures?
PEP 3148 Futures are meant for a rather different purpose than those async frameworks. Those frameworks all are trying to minimize the number of threads using some kind of callback-based non-blocking I/O system. PEP 3148 OTOH doesn't care about that -- it uses threads or processes proudly. This is useful for a different type of application, where there are fewer, larger tasks, and the overhead of threads doesn't matter.
The Monocle framework, which builds on top of Tornado or Twisted, uses something not entirely unlike Futures, though they call it Callback.
I don't think the acceptance of PEP 3148 should be taken as forcing the direction that async frameworks should take.
>> Having a ratified and incorporated language PEP (core in 3.2 w/ >> compatibility package for 2.5 or 2.6+ support) reduces the scope of async >> discussion down to: "how do we integrate futures into WSGI 2" instead of >> "how do we define an async API at all".
> It would be helpful if you addressed the issue of scope, i.e., what features > are you proposing to offer to the application developer.
> While the idea of using futures presents some intriguing possibilities, it > seems to me at first glance that all it will do is move the point where the > work gets done. That is, instead of simply running the app in a worker, the > app will be farming out work to futures. But if this is so, then why > doesn't the server just farm the apps themselves out to workers?
> I guess what I'm saying is, I haven't heard use cases for this from the > application developer POV -- why should an app developer care about having > their app run asynchronously?
> So far, I believe you're the second major proponent (i.e. ones with concrete > proposals and/or implementations to discuss) of an async protocol... and > what you have in common with the other proponent is that you happen to have > written an async server that would benefit from having apps operating > asynchronously. ;-)
> I find it hard to imagine an app developer wanting to do something > asynchronously for which they would not want to use one of the big-dog > asynchronous frameworks. (Especially if their app involves database access, > or other communications protocols.)
> This doesn't mean I think having a futures API is a bad thing, but ISTM that > a futures extension to WSGI 1 could be defined right now using an x-wsgi-org > extension in that case... and you could then find out how many people are > actually interested in using it.
> Mainly, though, what I see is people using the futures thing to shuffle off > compute-intensive tasks... but if they do that, then they're basically > trying to make the server's life easier... but under the existing spec, any > truly async server implementing WSGI is going to run the *app* in a "future" > of some sort already...
> Which means that the net result is that putting in async is like saying to > the app developer: "hey, you know this thing that you just could do in WSGI > 1 and the server would take care of it for you? Well, now you can manage > that complexity by yourself! Isn't that wonderful?" ;-)
> I could be wrong of course, but I'd like to see what concrete use cases > people have for async. We dropped the first discussion of async six years > ago because someone (I think it might've been James) pointed out that, well, > it isn't actually that useful. And every subsequent call for use cases > since has been answered with, "well, the use case is that you want it to be > async."
> Only, that's a *server* developer's use case, not an app developer's use > case... and only for a minority of server developers, at that.
On Fri, Jan 7, 2011 at 1:23 AM, Alice Bevan–McGregor <al...@gothcandy.com> wrote:
> Other than mod_wsgi, are there any PEP 3333-compliant (or near-compliant) components in the wild? Enough to bring a framework to life in Python 3? What I see is the chicken-and-egg problem endemic with Python 3: developers wait on upstream to port before they do, and upstream developers are either waiting themselves or don't see the demand to port.
I don't see that problem any more. I have at least three WSGI servers I could test against: modwsgi, CherryPy, and Django's half-assed built-in server. I guess I could add wsgiref as a 4th, but only sorta. And looks like Benoit's getting Gunicorn up to snuff.
What happens now goes something like this:
1. Get excited to port Django to Python 3.
2. Hack for a while.
3. Get something working under runserver - woo!
4. Hm, it fails under modwsgi.
5. OK, problem fixed.
6. Wait, no, now it doesn't work under runserver.
7. Or CherryPy. Dammit.
8. Lose interest for another 6 months.
At this point, there's a bug somewhere. It's *probably* in my code -- like I said earlier, I only barely grok WSGI -- but without a spec to refer to I'm pretty much hosed. See, Django on Py 2 jumps through a whole bunch of hoops to gloss over the string/unicode distinction and over the question of encoding, and that stuff's pretty fiddly. The "right" way to handle that on Python 3 depends entirely on the issues hammered out in PEP 3333 -- particularly the byte/str decisions. I'm starting to assume that PEP 3333 is going to get accepted in a form fundamentally the same as it appears right now, but if I'm wrong I get to do this stuff all over again. Nothing's more of an enthusiasm-killer than knowing I might have to start all over again later.
I really want a definitive answer. If the spec says it's my fault, I want people to yell at me loudly until Django is compliant. If the spec says it's not my fault, I want to be able to be an asshole [1] until all the app containers are compliant.
>> Can we please, please, PLEASE, pause discussion of PEP 444 until PEP 3333 >> is finalized?
> This is something I've seen fairly often around PEP 444 threads; instead of > reviving (or starting a new) PEP 3333 thread, a complaint is levied against > PEP 444 discussion itself. That doesn't help. ;)
Unfortunately, I really don't know any other way of helping than being a pain in the ass. I don't understand the issues well enough to contribute technically, so I've decided I'm going to continue to complain loudly. Hopefully you'll get sick of me and give me a spec so I'll shut the hell up!
I do feel crappy asking you to put 444 on hold. I understand that the SIG should be perfectly capable of working on more than one thing at once. However, to my eyes it seems like it keeps getting derailed. I understand this, too: clearly WSGI isn't perfect, and when you run up against some of those issues it's a *lot* more fun to ignore 'em just for a bit longer and work on something more exciting.
I really do appreciate the enthusiasm for PEP 444. I share it: it seems a lot easier to implement, and it'll certainly make some of the things Django's doing a lot easier. I just would really like to see that enthusiasm and energy turned full blast on PEP 3333 until it's done.
On 2011-01-07 09:04:07 -0800, Antoine Pitrou said:
> Alice Bevan–McGregor <alice@...> writes:
>>> I don't understand why you want a "yield" at this level. IMHO, WSGI needn't involve generators. A higher-level wrapper (framework, middleware, whatever) can wrap fd-waiting in fancy generator stuff if so desired. Or, in some other environments, delegate it to a reactor with callbacks and deferreds. Or whatever else, such as futures.
>> WSGI already involves generators: the response body.
> Wrong.
I'm aware that it can be any form of iterable, from a list-wrapped string all the way up to generators or other nifty things. I mistakenly omitted these assuming that the other iterables were universally understood and implied.
However, using a generator is a known, valid use case that I do see in the wild. (And also rely upon in some of my own applications.)
> Right, that's why I was suggesting you drop your concern for Python 2 > compatibility.
-1
There is practically no reason for doing so; esp. considering that I've managed to write a 2k/3k polyglot server that is more performant out of the box than any other WSGI HTTP server I've come across and is far simpler in implementation than most of the ones I've come across with roughly equivalent feature sets.
Cross compatibility really isn't that hard, and arguing that 2.x support should be dropped for the sole reason that "it might be dead by the time this is ratified" is a bit off.
> At 12:39 AM 1/7/2011 -0800, Alice Bevan–McGregor wrote:
>> :: Image scaling would benefit from multi-processing (spreading the load across cores). Also, only one scale is immediately required before returning the post-upload page: the thumbnail. The other scales can be executed without halting the WSGI application's return.
>> :: Asset content extraction and indexing would benefit from threading, and would also not require pausing the WSGI application.
> In all these cases, ISTM the benefit is the same if you future the WSGI apps themselves (which is essentially what most current async WSGI servers do, AFAIK).
Image scaling and asset content extraction should not block the response to a HTTP request; these need to be 'forked' from the main request. Only template generation (where the app needs to effectively block pending completion) is solved easily by threading the whole application call.
>> :: Long-duration calls to non-async-aware libraries such as DB access. The WSGI application could queue up a number of long DB queries, pass the futures instances to the template, and the template could then .result() (block) across them or yield them to be suspended and resumed when the result is available.
>> :: True async is useful for WebSockets, which seem a far superior solution to JSON/AJAX polling in addition to allowing real web-based socket access, of course.
> The point as it relates to WSGI, though, is that there are plenty of mature async APIs that offer these benefits, and some of them (e.g. Eventlet and Gevent) do so while allowing blocking-style code to be written. That is, you just make what looks like a blocking call, but the underlying framework silently suspends your code, without tying up the thread.
> Or, if you can't use a greenlet-based framework, you can use a > yield-based framework. Or, if for some reason you really wanted to > write continuation-passing style code, you could just use the raw > Twisted API.
But is there really any problem with providing a unified method for indicating a suspend point? What the server does when it gets the yielded value is entirely up to the implementation of the server; if it (the server) wants to use greenlets, it can. If it has other methodologies, it can go nuts.
> Even if you've already written a bunch of code using raw sockets and want to make it asynchronous, Eventlet and Gevent actually let you load a compatibility module that makes it all work, by replacing the socket API with an exact duplicate that secretly suspends your code whenever a socket operation would block.
I generally frown upon magic, and each of these implementations is completely specific. :/
> There is practically no reason for doing so; esp. considering that I've managed to write a 2k/3k polyglot server that is more performant out of the box than any other WSGI HTTP server I've come across and is far simpler in implementation than most of the ones I've come across with roughly equivalent feature sets.
> On 2011-01-07 09:04:07 -0800, Antoine Pitrou said:
> > Alice Bevan–McGregor <alice@...> writes:
> >>> I don't understand why you want a "yield" at this level. IMHO, WSGI needn't involve generators. A higher-level wrapper (framework, middleware, whatever) can wrap fd-waiting in fancy generator stuff if so desired. Or, in some other environments, delegate it to a reactor with callbacks and deferreds. Or whatever else, such as futures.
> >> WSGI already involves generators: the response body.
> > Wrong.
> I'm aware that it can be any form of iterable, [snip]
Ok, so, WSGI doesn't "already involve generators". QED.
> > Right, that's why I was suggesting you drop your concern for Python 2 > > compatibility.
> -1
> There is practically no reason for doing so;
Of course, there is one: a less complex PEP without any superfluous compatibility language sprinkled all over. And a second one: a simpler PEP is probably easier to get constructive comments about, and (perhaps some day) consensus on.
> esp. considering that I've managed to write a 2k/3k polyglot server that is more performant out of the box than any other WSGI HTTP server I've come across and is far simpler in implementation than most of the ones I've come across with roughly equivalent feature sets.
Just because you "managed to write" some piece of code for a *particular* use case doesn't mean that cross-compatibility is a solved problem. If you think it's easy, then I'm sure the authors of various 3rd-party libs would welcome your help achieving it.
> Python 2.x will be around for a long time.
And so will PEP 3333 and even PEP 333. People who value legacy compatibility will favour these old PEPs over your new one anyway. People who don't will progressively jump to 3.x.
There are two branches: master will always refer to the version published on Python.org, and draft refers to my rewrite. (When published, draft will be merged.)
On 2011-01-07 13:21:36 -0800, Antoine Pitrou said:
> Ok, so, WSGI doesn't "already involve generators". QED.
This can go around in circles; by allowing all forms of iterable, it involves generators. Generators are a type of iterable. QED right back. ;)
>>> Right, that's why I was suggesting you drop your concern for Python 2 >>> compatibility.
>> -1
>> There is practically no reason for doing so;
> Of course, there is one: a less complex PEP without any superfluous > compatibility language sprinkled all over.
There isn't any "compatibility language" sprinkled within the PEP. In fact, the only mention of it is in the introduction (stating that < 2.6 support may be possible but is undefined) and the title of a section "Python Cross-Version Compatibility".
Using native strings where possible encourages compatibility, though for the environ variables previously mentioned (URI, etc.) explicit exceptional behaviour is clearly defined. (Byte strings and true unicode.)
> Just because you "managed to write" some piece of code for a > *particular* use case doesn't mean that cross-compatibility is a solved > problem.
The particular use case happens to be PEP 444 as implemented using an async and multi-process (some day multi-threaded) HTTP server, so I'm not quite sure what you're getting at, here. I think that use case is sufficiently broad to be able to make claims about the ease of implementing PEP 444 in a compatible way.
> If you think it's easy, then I'm sure the authors of various 3rd-party > libs would welcome your help achieving it.
I helped proof a book about Python 3 compatibility and am giving a presentation in March that contains information on Python 3 compatibility from the viewpoint of implementing the Marrow suite.
>> Python 2.x will be around for a long time.
> And so will PEP 3333 and even PEP 333. People who value legacy > compatibility will favour these old PEPs over your new one anyway. > People who don't will progressively jump to 3.x.
Yup. Not sure how this is really an issue. PEP 444 is the /future/, 333[3] is /now/ [-ish].
On 2011-01-07 09:04:07 -0800, Antoine Pitrou said:
> WSGI doesn't mandate any specific feature of generators, such as > coroutine-like semantics, and the server doesn't have to know about > them.
The joy of writing a new specification is that we are not (potentially) shackled by old ways of doing things. Case in point: dropping start_response and changing the return value. PEP 444 isn't WSGI 1, and can change things, including additional changes to the allowable return value.
> 07.01.2011 06:49, P.J. Eby wrote:
>> At 05:47 PM 1/6/2011 -0800, Alice Bevan–McGregor wrote:
>>> Tossing the idea around all day long will then, of course, be happening regardless. Unfortunately for that particular discussion, PEP 3148 / Futures seems to have won out in the broader scope.
>> Do any established async frameworks or servers (e.g. Twisted, Eventlet, Gevent, Tornado, etc.) make use of futures?
> I understand that Twisted has incorporated futures support to their deferreds. Others, I believe, don't support them yet. You have to consider that Python 3.2 (the first Python with futures support in stdlib) hasn't even been released yet, and it's only been two weeks since I released the drop-in backport (http://pypi.python.org/pypi/futures/2.1).
Exarkun corrected me on this -- there is currently no futures support in Twisted. Sorry about the false information.
>>> Having a ratified and incorporated language PEP (core in 3.2 w/ >>> compatibility package for 2.5 or 2.6+ support) reduces the scope of >>> async discussion down to: "how do we integrate futures into WSGI 2" >>> instead of "how do we define an async API at all".
>> It would be helpful if you addressed the issue of scope, i.e., what >> features are you proposing to offer to the application developer.
>> While the idea of using futures presents some intriguing >> possibilities, it seems to me at first glance that all it will do is >> move the point where the work gets done. That is, instead of simply >> running the app in a worker, the app will be farming out work to >> futures. But if this is so, then why doesn't the server just farm >> the apps themselves out to workers?
>> I guess what I'm saying is, I haven't heard use cases for this from the application developer POV -- why should an app developer care about having their app run asynchronously?
> Applications need to be asynchronous to work on a single threaded server. There is no other benefit than speed and concurrency, and having to program a web app to operate asynchronously can be a pain. AFAIK there is no other way if you want to avoid the context switching overhead and support a huge number of concurrent connections.
> Thread/process pools are only necessary in an asynchronous application > where the app needs to use blocking network APIs or do heavy > computation, and such uses can unfortunately present a bottleneck. It > follows that it's pretty pointless to have an asynchronous application > that uses a thread/process pool on every request.
> The goal here is to define a common API for these mutually > incompatible asynchronous servers to implement so that you could one > day run an asynchronous app on Twisted, Tornado, or whatever without > modifications.
>> So far, I believe you're the second major proponent (i.e. ones with >> concrete proposals and/or implementations to discuss) of an async >> protocol... and what you have in common with the other proponent is >> that you happen to have written an async server that would benefit >> from having apps operating asynchronously. ;-)
>> I find it hard to imagine an app developer wanting to do something >> asynchronously for which they would not want to use one of the >> big-dog asynchronous frameworks. (Especially if their app involves >> database access, or other communications protocols.)
>> This doesn't mean I think having a futures API is a bad thing, but >> ISTM that a futures extension to WSGI 1 could be defined right now >> using an x-wsgi-org extension in that case... and you could then >> find out how many people are actually interested in using it.
>> Mainly, though, what I see is people using the futures thing to >> shuffle off compute-intensive tasks... but if they do that, then >> they're basically trying to make the server's life easier... but >> under the existing spec, any truly async server implementing WSGI is >> going to run the *app* in a "future" of some sort already...
>> Which means that the net result is that putting in async is like >> saying to the app developer: "hey, you know this thing that you just >> could do in WSGI 1 and the server would take care of it for you? >> Well, now you can manage that complexity by yourself! Isn't that >> wonderful?" ;-)
>> I could be wrong of course, but I'd like to see what concrete use >> cases people have for async. We dropped the first discussion of >> async six years ago because someone (I think it might've been James) >> pointed out that, well, it isn't actually that useful. And every >> subsequent call for use cases since has been answered with, "well, >> the use case is that you want it to be async."
>> Only, that's a *server* developer's use case, not an app developer's >> use case... and only for a minority of server developers, at that.
> On 2011-01-07 13:21:36 -0800, Antoine Pitrou said:
> > Ok, so, WSGI doesn't "already involve generators". QED.
> This can go around in circles; by allowing all forms of iterable, it involves generators. Generators are a type of iterable. QED right back. ;)
Please read back in context.
> There isn't any "compatibility language" sprinkled within the PEP.[...]
> Using native strings where possible encourages compatibility, [snip]
The whole "native strings" thing *is* compatibility cruft. A Python 3 PEP would only need two string types: bytes and unicode (str).
> > Just because you "managed to write" some piece of code for a > > *particular* use case doesn't mean that cross-compatibility is a solved > > problem.
> The particular use case happens to be PEP 444 as implemented using an > async and multi-process (some day multi-threaded) HTTP server, so I'm > not quite sure what you're getting at, here.
It's becoming too difficult to parse. You aren't sure yet what the async part of PEP 444 should look like but you have already implemented it?
> > If you think it's easy, then I'm sure the authors of various 3rd-party > > libs would welcome your help achieving it.
> I helped proof a book about Python 3 compatibility and am giving a > presentation in March that contains information on Python 3 > compatibility from the viewpoint of implementing the Marrow suite.
Well, I hope not too many people will waste time trying to write cross-compatible code rather than solely target Python 3. The whole point of Python 3 is to make developers' life better, not worse.
> >> Python 2.x will be around for a long time.
> > And so will PEP 3333 and even PEP 333. People who value legacy > > compatibility will favour these old PEPs over your new one anyway. > > People who don't will progressively jump to 3.x.
> Yup. Not sure how this is really an issue. PEP 444 is the /future/, > 333[3] is /now/ [-ish].
Please read back in context (instead of stripping it), *again*.
> Alice Bevan–McGregor <alice@...> writes:
>> On 2011-01-07 13:21:36 -0800, Antoine Pitrou said:
>>> Ok, so, WSGI doesn't "already involve generators". QED.
>> This can go around in circles; by allowing all forms of iterable, it involves generators. Generators are a type of iterable. QED right back. ;)
> Please read back in context.
>> There isn't any "compatibility language" sprinkled within the PEP.[...]
>> Using native strings where possible encourages compatibility, [snip]
> The whole "native strings" thing *is* compatibility cruft. A Python 3 PEP would only need two string types: bytes and unicode (str).
>>> Just because you "managed to write" some piece of code for a *particular* use case doesn't mean that cross-compatibility is a solved problem.
>> The particular use case happens to be PEP 444 as implemented using an async and multi-process (some day multi-threaded) HTTP server, so I'm not quite sure what you're getting at, here.
> It's becoming too difficult to parse. You aren't sure yet what the async part of PEP 444 should look like but you have already implemented it?
We are still discussing the possible mechanics of PEP 444 with async support. There is nothing definite yet, and certainly no workable implementation yet either. Async support may or may not materialize in PEP 444, in another PEP or not at all based on the discussions on this list and on IRC.
>>> If you think it's easy, then I'm sure the authors of various 3rd-party libs would welcome your help achieving it.
>> I helped proof a book about Python 3 compatibility and am giving a presentation in March that contains information on Python 3 compatibility from the viewpoint of implementing the Marrow suite.
> Well, I hope not too many people will waste time trying to write cross-compatible code rather than solely target Python 3. The whole point of Python 3 is to make developers' life better, not worse.
>>>> Python 2.x will be around for a long time.
>>> And so will PEP 3333 and even PEP 333. People who value legacy compatibility will favour these old PEPs over your new one anyway. People who don't will progressively jump to 3.x.
>> Yup. Not sure how this is really an issue. PEP 444 is the /future/, 333[3] is /now/ [-ish].
> Please read back in context (instead of stripping it), *again*.
At 12:37 PM 1/7/2011 -0800, Alice Bevan–McGregor wrote:
>But is there really any problem with providing a unified method for indicating a suspend point?
Yes: a complexity burden that is paid by the many to serve the few -- or possibly non-existent.
I still haven't seen anything that suggests there is a large enough group of people who want a "portable" async API to justify inconveniencing everyone else in order to suit their needs, vs. simply having a different calling interface for that need.
If I could go back and change only ONE thing about WSGI 1, it would be the calling convention. It was messed up from the start, specifically because I wasn't adamant enough about weighing the needs of the many against the needs of the few. Only a few needed a push protocol (write()), and only a few even remotely cared about our minor nod to asynchrony (yielding empty strings to pause output).
If I'd been smart (or more to the point, prescient), I'd have just done a 3-tuple return value from the get-go, and said to hell with those other use cases, because everybody else is paying to carry a few people who aren't even going to use these features for real. (As it happens, I thought write() would be needed in order to drive adoption, and it may well have been at one time.)
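To make the two calling conventions concrete, here is a minimal sketch contrasting a WSGI 1 application (status and headers pushed through `start_response`) with a 3-tuple return. The byte-string types in `tuple_app` and the `call_wsgi1` helper are assumptions made for illustration; they are not taken from either spec's final wording.

```python
# WSGI 1 (PEP 3333) style: status and headers go through a callback,
# and the body is returned as an iterable of byte strings.
def wsgi1_app(environ, start_response):
    body = b"Hello, world!\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


# 3-tuple style: everything the server needs comes back as one value.
# Assumption: byte strings throughout; the exact types are still
# being discussed for PEP 444.
def tuple_app(environ):
    body = b"Hello, world!\n"
    headers = [(b"Content-Type", b"text/plain"),
               (b"Content-Length", str(len(body)).encode("ascii"))]
    return b"200 OK", headers, [body]


def call_wsgi1(app, environ):
    """Hypothetical helper: drive a WSGI 1 app and collect a 3-tuple."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        return lambda data: None  # write() is ignored in this sketch

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

The helper hints at the cost being described: a server has to carry the callback, the `write()` callable, and the returned iterable, whereas the 3-tuple form needs only one return value.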
Anyway, with a new spec we have the benefit of hindsight: we know that, historically, nobody has actually cared enough to propose a full-blown async API who wasn't also trying to make their async server implementation work without needing threads. Never in the history of the web-sig, AFAIK, has anyone come in and said, "hey, I want to have an async app that can run on any async framework."
Nobody blogs or twitters about how terrible it is that the async frameworks all have different APIs and that this makes their apps non-portable. We see lots of complaints about not having a Python 3 WSGI spec, but virtually none about WSGI being essentially synchronous.
I'm not saying there's zero audience for such a thing... but then, at some point there was a non-zero audience for write() and for yielding empty strings. ;-)
The big problem is this: if, as an app developer, you want this hypothetical portable async API, you either already have an app that is async or you don't. If you do, then you already got married to some particular API and are happy with your choice -- or else you'd have bit the bullet and ported.
What you would not do, is come to the Web-SIG and ask for a spec to help you port, because you'd then *still have to port* to the new API... unless of course you wanted it to look like the API you're already using... in which case, why are you porting again, exactly?
Oh, you don't have an app... okay, so *hypothetically*, if you had this API -- which, because you're not actually *using* an async API right now, you probably don't even know quite what you need -- hypothetically if you had this API you would write an app and then run it on multiple async frameworks...
See? It just gets all the way to silly. The only way you can actually get this far in the process seems to be if you are on the server side, thinking it would be really cool to make this thing because then surely you'll get users.
In practice, I can't imagine how you could write an app with substantial async functionality that was sanely portable across the major async frameworks, with the possible exception of the two that at least share some common code, paradigms, and API. And even if you could, I can't imagine someone wanting to.
So far, you have yet to give a concrete example of an application that you personally (or anyone you know of) want to be able to run on two different servers. You've spoken of hypothetical apps and hypothetical portability... but not one concrete, "I want to run this under both Twisted and Eventlet" (or some other two frameworks/servers), "because of [actual, non-hypothetical rationale here]".
I don't deny that [actual non-hypothetical rationale] may exist somewhere, but until somebody shows up with a concrete case, I don't see a proposal getting much traction. (The alternative would be if you pull a rabbit out of your hat and propose something that doesn't cost anybody anything to implement... but the fact that you're tossing the 3-tuple out in favor of yielding indicates you've got no such proposal ready at the present time.)
On the plus side, the "run this in a future after the request" concept has some legs, and I hope Timothy (or anybody) takes it and runs with it. That has plenty of concrete use cases for portability -- every sufficiently-powerful web framework will want to either provide that feature, build other features on top of it, or both.
It's the "make the request itself async" part that's the hard sell here, and in need of some truly spectacular rationale in order to justify the ubiquitous costs it imposes.
> At 12:37 PM 1/7/2011 -0800, Alice Bevan–McGregor wrote:
>> But is there really any problem with providing a unified method for indicating a suspend point?
> Yes: a complexity burden that is paid by the many to serve the few -- > or possibly non-existent.
> I still haven't seen anything that suggests there is a large enough > group of people who want a "portable" async API to justify > inconveniencing everyone else in order to suit their needs, vs. simply > having a different calling interface for that need.
> If I could go back and change only ONE thing about WSGI 1, it would be the calling convention. It was messed up from the start, specifically because I wasn't adamant enough about weighing the needs of the many against the needs of the few. Only a few needed a push protocol (write()), and only a few even remotely cared about our minor nod to asynchrony (yielding empty strings to pause output).
> If I'd been smart (or more to the point, prescient), I'd have just > done a 3-tuple return value from the get-go, and said to hell with > those other use cases, because everybody else is paying to carry a few > people who aren't even going to use these features for real. (As it > happens, I thought write() would be needed in order to drive adoption, > and it may well have been at one time.)
> Anyway, with a new spec we have the benefit of hindsight: we know > that, historically, nobody has actually cared enough to propose a > full-blown async API who wasn't also trying to make their async server > implementation work without needing threads. Never in the history of > the web-sig, AFAIK, has anyone come in and said, "hey, I want to have > an async app that can run on any async framework."
> Nobody blogs or twitters about how terrible it is that the async > frameworks all have different APIs and that this makes their apps > non-portable. We see lots of complaints about not having a Python 3 > WSGI spec, but virtually none about WSGI being essentially synchronous.
> I'm not saying there's zero audience for such a thing... but then, at > some point there was a non-zero audience for write() and for yielding > empty strings. ;-)
> The big problem is this: if, as an app developer, you want this > hypothetical portable async API, you either already have an app that > is async or you don't. If you do, then you already got married to > some particular API and are happy with your choice -- or else you'd > have bit the bullet and ported.
> What you would not do, is come to the Web-SIG and ask for a spec to > help you port, because you'd then *still have to port* to the new > API... unless of course you wanted it to look like the API you're > already using... in which case, why are you porting again, exactly?
> Oh, you don't have an app... okay, so *hypothetically*, if you had > this API -- which, because you're not actually *using* an async API > right now, you probably don't even know quite what you need -- > hypothetically if you had this API you would write an app and then run > it on multiple async frameworks...
> See? It just gets all the way to silly. The only way you can > actually get this far in the process seems to be if you are on the > server side, thinking it would be really cool to make this thing > because then surely you'll get users.
> In practice, I can't imagine how you could write an app with > substantial async functionality that was sanely portable across the > major async frameworks, with the possible exception of the two that at > least share some common code, paradigms, and API. And even if you > could, I can't imagine someone wanting to.
> So far, you have yet to give a concrete example of an application that > you personally (or anyone you know of) want to be able to run on two > different servers. You've spoken of hypothetical apps and > hypothetical portability... but not one concrete, "I want to run this > under both Twisted and Eventlet" (or some other two > frameworks/servers), "because of [actual, non-hypothetical rationale > here]".
How do you suppose common async middleware could be implemented without a common async API? Today we have plenty of WSGI middleware, which would not be possible without a common API. Without one, you would have to write a separate interface for every major framework and test against each of them separately, instead of having a reasonable expectation that the middleware will work uniformly across compliant frameworks. I would really love to see common middleware components that are usable on Twisted, Tornado, etc. without modifications.
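As a reminder of what the common synchronous API buys today, here is a minimal sketch of WSGI 1 middleware that runs unchanged on any compliant server. The names `add_header_middleware` and `hello_app` are made up for this example.

```python
def add_header_middleware(app, name, value):
    """Wrap any WSGI 1 application and append one response header."""
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Copy the header list so the wrapped app's list is untouched.
            headers = list(headers) + [(name, value)]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return wrapped


def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


# The wrapped callable is itself a WSGI 1 application.
application = add_header_middleware(hello_app, "X-Example", "1")
```

Nothing in the wrapper knows which server is calling it; an async equivalent would need the same kind of shared contract.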
You seem to be under the impression that asynchronous applications only have some specialized uses. Asynchronous applications are no more limited in scope than synchronous ones are. It's just an alternative programming paradigm that has the potential of squeezing more performance out of a server. Note that I am in no way insisting that PEP 444 require async support; I'm only exploring that possibility. If we cannot figure out a way to make it easy for implementors to support, then I will push for a separate specification.
> I don't deny that [actual non-hypothetical rationale] may exist > somewhere, but until somebody shows up with a concrete case, I don't > see a proposal getting much traction. (The alternative would be if > you pull a rabbit out of your hat and propose something that doesn't > cost anybody anything to implement... but the fact that you're tossing > the 3-tuple out in favor of yielding indicates you've got no such > proposal ready at the present time.)
> On the plus side, the "run this in a future after the request" concept > has some legs, and I hope Timothy (or anybody) takes it and runs with > it. That has plenty of concrete use cases for portability -- every > sufficiently-powerful web framework will want to either provide that > feature, build other features on top of it, or both.
What exactly does "run this in a future after the request" mean? There seems to be some terminology confusion here.
> It's the "make the request itself async" part that's the hard sell > here, and in need of some truly spectacular rationale in order to > justify the ubiquitous costs it imposes.
> 08.01.2011 07:09, P.J. Eby wrote:
>> On the plus side, the "run this in a future after the request" concept has some legs... [snip]
> What exactly does "run this in a future after the request" mean? There > seems to be some terminology confusion here.
I suspect he's referring to some of the notes on the "PEP 444 feature request - Futures executor" thread and several of my illustrated use cases, notably:
:: Image scaling (e.g. to multiple sizes) after an image is uploaded, where the response (Congratulations, image uploaded!) does not require the result of the scaling.
:: Content indexing which can also be performed after returning the success page.
The former would executor.submit() a number of scaling jobs, attach completion callbacks to perform some cleanup / database updating / etc., and return a response immediately. The latter is a single executor submission that is entirely non-time-critical.
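A rough sketch of the image scaling case, using the standard `concurrent.futures` API. The environ key `wsgi.executor`, the stand-in `scale_image` function, and the byte-string 3-tuple response are all assumptions for illustration; nothing here is settled in the draft.

```python
import io
from concurrent.futures import ThreadPoolExecutor

# Assumption: the server (or middleware) places an executor in the
# environ under this key. The key name is hypothetical.
EXECUTOR_KEY = "wsgi.executor"

SIZES = [(64, 64), (256, 256), (1024, 1024)]
finished = []  # stand-in for a database update


def scale_image(data, size):
    """Stand-in for real image scaling; just echoes the requested size."""
    return size


def record_result(future):
    # Completion callback: runs once the scaling job has finished.
    finished.append(future.result())


def upload_app(environ):
    executor = environ[EXECUTOR_KEY]
    data = environ["wsgi.input"].read()
    for size in SIZES:
        job = executor.submit(scale_image, data, size)
        job.add_done_callback(record_result)
    # The response does not wait for any of the scaling jobs.
    body = b"Congratulations, image uploaded!\n"
    return b"200 OK", [(b"Content-Type", b"text/plain")], [body]


def demo():
    """Play the server's role: provide the executor and call the app."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        environ = {EXECUTOR_KEY: pool,
                   "wsgi.input": io.BytesIO(b"raw image")}
        response = upload_app(environ)
    # Leaving the with-block waits for outstanding jobs to complete.
    return response
```

The content indexing case would look the same with a single `submit()` call and no callback.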
And likely other use cases as well. This (inclusion of an executor tuned to the underlying server in the environment) is one thing I think we can (almost) all agree is a good idea. :D Discussion on that particular idea should be relegated to the feature request thread, though.
On 2011-01-07 13:21:36 -0800, Antoine Pitrou said:
> Ok, so, WSGI doesn't "already involve generators". QED.
Let me try this again. With the understanding that:
:: PEP 333[3] and 444 define a response body as an iterable.
:: Thus WSGI involves iterables through definition.
:: A generator is a type of iterable.
:: Thus WSGI involves generators through the use of iterables.
The hypothetical redefinition of an application as a generator is not too far out to lunch, considering that WSGI _already involves generators_. (And that the simple case, an application that does not utilize async, will require a single word be changed: s/return/yield)
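A minimal sketch of that one-word change, with a toy driver standing in for the server. The driver and its rule (the first tuple yielded is the response, anything else is a suspend point) are assumptions invented for this example, not part of any draft.

```python
def returning_app(environ):
    # Draft-style application: return the response 3-tuple.
    return b"200 OK", [(b"Content-Type", b"text/plain")], [b"Hello\n"]


def yielding_app(environ):
    # Same application under the hypothetical redefinition:
    # the only change is s/return/yield.
    yield b"200 OK", [(b"Content-Type", b"text/plain")], [b"Hello\n"]


def run_generator_app(app, environ):
    """Hypothetical server-side driver, for illustration only.

    Assumption: the response is the first tuple the generator yields;
    anything else yielded is treated as a suspend point and skipped.
    A real server would resume the generator from its reactor instead.
    """
    for item in app(environ):
        if isinstance(item, tuple):
            return item
    raise RuntimeError("application finished without a response")
```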
Is that clearer? The idea referred to below (and posted separately) involves this redefinition, which I fully understand will have a number of strong opponents. Considering PEP 444 is a new spec (already breaking direct compatibility via the /already/ redefined return value) I hope people do not reject this out of hand but instead help explore the idea further.
On 2011-01-07 19:36:52 -0800, Antoine Pitrou said:
> Alice Bevan–McGregor <alice@...> writes:
>> The particular use case happens to be PEP 444 as implemented using an async and multi-process (some day multi-threaded) HTTP server, so I'm not quite sure what you're getting at, here.
> It's becoming too difficult to parse. You aren't sure yet what the async part of PEP 444 should look like but you have already implemented it?
Marrow HTTPd (marrow.server.http) [1] is, internally, an asynchronous server. It does not currently expose the reactor to the WSGI application via any interface whatsoever. I am, however, working on some p-code examples (that I will post for discussion as mentioned above) which I can base a fork of m.s.http off of to experiment.
This means that, yes, I'm not sure how async will work in PEP 444 /in the end/, but I am at least attempting to explore the practical implications of the ideas thus far in a real codebase. I'm "getting it done", even if it has to change or be scrapped.
>> I helped proof a book about Python 3 compatibility and am giving a >> presentation in March that contains information on Python 3 >> compatibility from the viewpoint of implementing the Marrow suite.
> Well, I hope not too many people will waste time trying to write cross-compatible code rather than solely target Python 3. The whole point of Python 3 is to make developers' life better, not worse.
I agree, with one correction to your first point. Application and framework developers should whole-heartedly embrace Python 3 and make full use of its many features, simplifications and clarifications. However, it is demonstrably not Insanely Difficult™ to have compatible server and middleware implementations with the draft's definition of native string. If server and middleware developers are willing to create polyglot code, I'm not going to stop them.
Note that this type of compatibility is not mandated, and the use of native strings (with one well defined byte string exception) means that pure Python 3 programmers can be blissfully ignorant of the compatibility implications -- everything else is "unicode" (str), even if it's just "bytes-in-unicode" (latin1/iso-8859-1). Pure Python 2 programmers have only a small difference (for them) of the URI values being unicode; the remaining values are byte strings (str).
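To illustrate "bytes-in-unicode", here is a small sketch of converting wire bytes to and from a native string. The helper names `to_native` and `from_native` are hypothetical and do not come from the draft or from Marrow.

```python
import sys

PY3 = sys.version_info[0] >= 3


def to_native(raw, encoding="iso-8859-1"):
    """Hypothetical helper: turn wire bytes into a native string.

    On Python 3 the native string is unicode (str), so the bytes are
    decoded as latin1, which maps every byte to one code point. On
    Python 2 the native string is already a byte string.
    """
    return raw.decode(encoding) if PY3 else raw


def from_native(value, encoding="iso-8859-1"):
    """Inverse of to_native: recover the original wire bytes."""
    return value.encode(encoding) if PY3 else value


header = b"caf\xc3\xa9"                # UTF-8 bytes as sent by a client
native = to_native(header)              # "bytes-in-unicode" on Python 3
assert from_native(native) == header    # the round trip is lossless
```

Because latin1 covers all 256 byte values, no information is lost either way; code that needs the real text can re-encode and then decode with the actual charset.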
I would like to hear a technical reason why this (native strings) is a bad idea instead of vague "this will make things harder" -- it won't, at least, not measurably, and I have the proof as a working, 100% unit tested, performant, cross-compatible polyglot HTTP/1.1-compliant server. Written in several days' worth of "full-time work" spread across weeks because this is a spare-time project; i.e. not a lot of literal work, nor "hard".
Hell, it has transformed from a crappy hack to experiment with HTTP into a complete (or very nearly so) implementation of PEP 444 in both of its current forms (published and draft) that is almost usable, ignoring the fact that PEP 444 is mutable, of course.