[Web-SIG] PEP 444 Goals

Timothy Farrell

Jan 6, 2011, 2:36:17 PM

to web...@python.org

Hello web-sig. My name is Timothy Farrell. I'm the developer of the Rocket web server. I understand that most of you are more experienced and passionate than I am, but I've come here because I want to see certain things standardized. I'm pretty new to this forum, but I've read through all the recent discussions on PEP 444. That being said, I'll try to take a humble approach.

It seems to me that the spec that Alice is working on could be something great but the problems are not well defined (in the PEP). This causes confusion about what the goals are. There's some disagreement about whether or not certain features should be in PEP 444. I think the people disagreeing have different ideas of what PEP 444 ought to be. The first thing that should be done is to clearly define the shortcomings of PEP 3333 that PEP 444 seeks to address, and to limit our PEP 444 discussions to solving those problems.

Since Alice is rewriting the PEP perhaps we should all sit back for a second until we have a PEP to work off of. That will help the discussion be a little more focused.

Sorry if I've stepped on anyone's toes.

-tim
_______________________________________________
Web-SIG mailing list
Web...@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/python-web-sig-garchive-9074%40googlegroups.com

Alice Bevan–McGregor

Jan 6, 2011, 3:52:29 PM

to web...@python.org

> It seems to me that the spec that Alice is working on could be
> something great but the problems are not well defined (in the PEP).
> This causes confusion about what the goals are.

For completeness' sake, here's a slightly simplified Abstract:

:: A proposed second-generation standard interface between web servers
and Python 2.6+ and 3.1+ applications.

The rationale for even having such an interface is outlined in PEP 333.

Ignoring async for the moment, the goals of the PEP 444 rewrite are:

:: Clear separation of "narrative" from "rules to be followed". This
allows developers of both servers and applications to easily run
through a conformance "check list".

:: Isolation of examples and rationale to improve readability of the
core rulesets.

:: Clarification of often mis-interpreted rules from PEP 333 (and those
carried over in 3333).

:: Elimination of unintentional non-conformance, esp. re: cgi.FieldStorage.

:: Massive simplification of call flow. Replacing start_response with
a returned 3-tuple immensely simplifies the task of middleware that
needs to capture HTTP status or manipulate (or even examine) response
headers. [1]

:: Reduction of re-implementation / NIH syndrome by incorporating the
most common (1%) of the features usually relegated to middleware or
functional helpers. Unicode decoding of a small handful of values (CGI
values that pull from the request URI) is the biggest example. [2, 3]

:: Cross-compatibility considerations. The definition and use of
native strings vs. byte strings is the biggest example of this in the
rewrite.

:: Making optional (and thus rarely-implemented) features non-optional.
E.g. server support for HTTP/1.1, with clarifications for interfacing
applications to 1.1 servers. Thus pipelining, chunked encoding, et al.,
as per the HTTP 1.1 RFC.

There are likely others I can't think of at the moment. ;) If I
remember anything else as I wake up more fully (caffeine zombie, here)
I'll post an additional reply.

Footnotes:

[1] This also happens to be a very Pythonic solution.

[2] This does not mean WSGI 2 will attempt to "compete" with
frameworks; merely reduce the multiplication of effort for the common
denominator.

[3] Filters are covered under re-implementation.
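
To illustrate the call-flow goal above, here is a minimal sketch contrasting middleware that has to intercept start_response (PEP 333 style) with middleware that simply unpacks a returned 3-tuple. All function names are hypothetical, and the use of byte strings for the status and header values in the tuple style is an assumption for illustration, not something the rewritten draft is confirmed to require.

```python
def app_333(environ, start_response):
    """PEP 333 style: status and headers travel through a callback."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def capture_status_333(app):
    """Middleware must wrap start_response just to see the status."""
    def middleware(environ, start_response):
        def wrapped_start_response(status, headers, exc_info=None):
            # Only here can the middleware examine or alter the headers.
            headers = headers + [("X-Seen-Status", status)]
            return start_response(status, headers, exc_info)
        return app(environ, wrapped_start_response)
    return middleware


def app_tuple(environ):
    """Returned 3-tuple style: (status, headers, body)."""
    return b"200 OK", [(b"Content-Type", b"text/plain")], [b"hello"]


def capture_status_tuple(app):
    """Middleware just unpacks the return value and passes it on."""
    def middleware(environ):
        status, headers, body = app(environ)
        return status, headers + [(b"X-Seen-Status", status)], body
    return middleware
```

The second middleware needs no closure over a callback and no knowledge of exc_info, which is the simplification being argued for.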

> Since Alice is rewriting the PEP perhaps we should all sit back for a
> second until we have a PEP to work off of. That will help the
> discussion be a little more focused.

I'll have a direct translation of my current rewritten draft into ReST
for incorporation on the Python.org website within a few hours.
Unfortunately, in the short term, it still doesn't include a high-level
goal overview, though it will incorporate the consensus (thus far) on
removing the ability to return unicode response data.

> Sorry if I've stepped on anyone's toes.

No worries; you do raise a very valid point.

- Alice.

James Y Knight

Jan 6, 2011, 4:06:36 PM

to Alice Bevan–McGregor, web...@python.org

On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
> :: Making optional (and thus rarely-implemented) features non-optional. E.g. server support for HTTP/1.1 with clarifications for interfacing applications to 1.1 servers. Thus pipelining, chunked encoding, et. al. as per the HTTP 1.1 RFC.

Requirements on the HTTP compliance of the server don't really have any place in the WSGI spec. You should be able to be WSGI compliant even if you don't use the HTTP transport at all (e.g. maybe you just send around requests via SCGI).

The original spec got this right: chunking etc are something which is not relevant to the wsgi application code -- it is up to the server to implement the HTTP transport according to the HTTP spec, if it's purporting to be an HTTP server.

James

Alice Bevan–McGregor

Jan 6, 2011, 4:56:09 PM

to web...@python.org

On 2011-01-06 13:06:36 -0800, James Y Knight said:

> On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
>> :: Making optional (and thus rarely-implemented) features non-optional.
>> E.g. server support for HTTP/1.1 with clarifications for interfacing
>> applications to 1.1 servers. Thus pipelining, chunked encoding, et.
>> al. as per the HTTP 1.1 RFC.
>
> Requirements on the HTTP compliance of the server don't really have any
> place in the WSGI spec. You should be able to be WSGI compliant even if
> you don't use the HTTP transport at all (e.g. maybe you just send
> around requests via SCGI).
> The original spec got this right: chunking etc are something which is
> not relevant to the wsgi application code -- it is up to the server to
> implement the HTTP transport according to the HTTP spec, if it's
> purporting to be an HTTP server.

Chunking is actually quite relevant to the specification, as WSGI and
PEP 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;)
allow for chunked bodies, via the body iterator, regardless of
higher-level support for chunking. Previously you /had/ to define a
length; with chunked encoding at the server level, you don't.

I agree, however, that not all gateways will be able to implement the
relevant HTTP/1.1 features. FastCGI does; SCGI, after a quick Google
search, seems to support it as well. I should re-word it as:

"For those servers capable of HTTP/1.1 features the implementation of
such features is required."

+1

- Alice.

James Y Knight

Jan 6, 2011, 5:01:09 PM

to Alice Bevan–McGregor, web...@python.org

On Jan 6, 2011, at 4:56 PM, Alice Bevan–McGregor wrote:

> On 2011-01-06 13:06:36 -0800, James Y Knight said:
>
>> On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote:
>>> :: Making optional (and thus rarely-implemented) features non-optional. E.g. server support for HTTP/1.1 with clarifications for interfacing applications to 1.1 servers. Thus pipelining, chunked encoding, et. al. as per the HTTP 1.1 RFC.
>> Requirements on the HTTP compliance of the server don't really have any place in the WSGI spec. You should be able to be WSGI compliant even if you don't use the HTTP transport at all (e.g. maybe you just send around requests via SCGI).
>> The original spec got this right: chunking etc are something which is not relevant to the wsgi application code -- it is up to the server to implement the HTTP transport according to the HTTP spec, if it's purporting to be an HTTP server.
>
> Chunking is actually quite relevant to the specification, as WSGI and PEP 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;) allow for chunked bodies regardless of higher-level support for chunking. The body iterator. Previously you /had/ to define a length, with chunked encoding at the server level, you don't.

No you don't -- HTTP 1.0 allows indeterminate-length output. The server simply must close the connection to indicate the end of the response if either the client version is HTTP/1.0, or the server doesn't implement HTTP/1.1.
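
As a rough sketch of that rule from the server's side (the function name and return values are made up for illustration, and a real server has more cases to consider), the framing decision could look like this:

```python
def choose_framing(client_version, response_headers, server_supports_11=True):
    """Decide how the end of a response body is signalled to the client.

    client_version is a string such as "HTTP/1.0" or "HTTP/1.1";
    response_headers is a list of (name, value) pairs from the application.
    Returns "length", "chunked" or "close".
    """
    names = {name.lower() for name, _ in response_headers}
    if "content-length" in names:
        # The application supplied a length; the connection can be reused.
        return "length"
    if client_version == "HTTP/1.1" and server_supports_11:
        # No length, but both sides speak 1.1: the server can chunk.
        return "chunked"
    # No length and no chunking: closing the socket marks the end.
    return "close"
```

In none of the three cases does the application itself need to know which framing was chosen.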

James

Alice Bevan–McGregor

Jan 6, 2011, 5:14:32 PM

to web...@python.org

On 2011-01-06 14:01:09 -0800, James Y Knight said:

> No you don't -- HTTP 1.0 allows indeterminate-length output. The server
> simply must close the connection to indicate the end of the response if
> either the client version HTTP/1.0, or the server doesn't implement
> HTTP/1.1.

Ah, you are correct. There was something, somewhere I was reading
related to WSGI about requiring content-length... but no matter.

Interestingly enough, HTTP/1.0 also supports pipelining (though
obviously not if content-length is missing) via the `Connection:
keep-alive` header. HTTP/1.1 mandates keep-alive by default (which is
a good thing, IMHO) and offers a work-around for missing content-length
to preserve the connection: chunked encoding. Add to that 100-Continue
(allowing delayed /transfer/ of the request body until the first
wsgi.input.read() operation) and the ability to request proper, full
URLs, amongst other goodies.

Arguing against mandated HTTP/1.1 support (where possible) seems...
silly to me. HTTP/1.1 has been around for a long time (adopted by the
major browsers in 1996), is well understood, is /trivial/ to implement
(I managed it as part of my 172 Python opcode HTTP server
implementation), and Just Makes Sense.

If there can be a good technical reason why the adapted language ("if
possible, it's required") can not be used, I'll definitely re-consider
this point. Considering that detection is easy (SERVER_PROTOCOL ==
"HTTP/1.0"), adaptation by the application to either case is easy
(detect and, if chunking is not available, consume the body_iter and
determine the length) and it's a 15 year old standard: welcome to the
20th century. ;)
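
A minimal sketch of that adaptation, assuming SERVER_PROTOCOL carries the CGI-style value (for example "HTTP/1.0") and that the application hands back a (status, headers, body_iter) triple; the helper name is hypothetical:

```python
def ensure_content_length(environ, status, headers, body_iter):
    """Buffer the body and add Content-Length when the server is HTTP/1.0.

    On any other protocol the body iterator is passed through untouched so
    the server remains free to stream (and chunk) it.
    """
    has_length = any(name.lower() == "content-length" for name, _ in headers)
    if environ.get("SERVER_PROTOCOL") == "HTTP/1.0" and not has_length:
        body = b"".join(body_iter)  # consumes the iterator
        headers = headers + [("Content-Length", str(len(body)))]
        return status, headers, [body]
    return status, headers, body_iter
```

Buffering trades memory for a reusable connection, so an application producing very large bodies may prefer to skip it and let the connection close instead.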

- Alice.

Graham Dumpleton

Jan 6, 2011, 5:29:36 PM

to Alice Bevan–McGregor, web...@python.org

I would question whether FASTCGI, SCGI or AJP support the concept of
chunking of responses to the extent that the application can prepare
the final content including chunks as required by the HTTP
specification. Further, in Apache at least, the output from a web
application served via those protocols is still pushed through the
Apache output filter chain so as to allow the filters to modify the
response, eg., apply compression using mod_deflate. As a consequence,
the standard HTTP 'CHUNK' output filter is still a part of the output
filter stack. This means that were a web application to try and do
chunking itself, then Apache would rechunk such that the original
chunking became part of the content, rather than the transfer
encoding.
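
To make the re-chunking problem concrete, here is a small self-contained sketch of HTTP/1.1 chunked framing (no chunk extensions or trailers; the function names are my own). If an application chunks its output and the server then chunks it again, a client that decodes once is left with the application's chunk framing as part of the content:

```python
def encode_chunked(parts):
    """Yield each non-empty part as an HTTP/1.1 chunk, then the terminator."""
    for part in parts:
        if part:  # a zero-length chunk would end the body early
            yield b"%x\r\n" % len(part) + part + b"\r\n"
    yield b"0\r\n\r\n"


def decode_chunked(data):
    """Decode a complete chunked body (no extensions or trailers)."""
    body = b""
    while True:
        size_line, _, data = data.partition(b"\r\n")
        size = int(size_line, 16)
        if size == 0:
            return body
        body += data[:size]
        data = data[size + 2:]  # skip the chunk data and its trailing CRLF


# What the application produced when it chunked the body itself.
app_output = b"".join(encode_chunked([b"hello"]))
# What goes on the wire once the server's own chunk filter runs on top.
on_the_wire = b"".join(encode_chunked([app_output]))
```

Decoding `on_the_wire` once, as a client would, returns `app_output` (chunk sizes and CRLFs included) rather than `b"hello"`.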

So, in order to be able to achieve what I think you want, with a web
application being able to do chunking itself, you would need to modify
the implementations of mod_fcgid, mod_fastcgi, mod_scgi, mod_ajp and
also likely mod_cgi and mod_cgid of Apache.

The only WSGI implementation I know of for Apache where you might even
be able to do what you want is uWSGI. This is because I believe from
memory it uses a mode in Apache by default called assbackwards. What
this allows is for the output from the web application to bypass the
Apache output filter stack and directly control the raw HTTP output.
This gives uWSGI a little bit less overhead in Apache, but at the loss
of the ability to actually use Apache output filters and for Apache to
fix up response headers in any way. There is a flag in uWSGI which can
optionally be set to make it use the more traditional mode and not use
assbackwards.

Thus, I believe you would be fighting against server implementations
such as Apache and likely also nginx, Cherokee, lighttpd etc, to allow
chunking to be supported at the level of the web application.

About all you can do is ensure that the WSGI specification doesn't
include anything in it which would prevent a web application
harnessing indirectly such a feature as chunking where the web server
supports it.

As it is, it isn't even chunked responses which are the problem,
because if an underlying web server supports chunking for responses,
all you need to do is not set the content length.

The problem area with chunking is the request content, as the way that
the WSGI specification is written prevents being able to have chunked
request content. I have described the issue previously and made
suggestions about an alternate way that wsgi.input could be used.
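
A small sketch of why the current rules get in the way. An application following the usual convention reads at most CONTENT_LENGTH bytes, and a chunked request has no CONTENT_LENGTH. The second function shows one possible alternative convention, under the assumption (hypothetical here, and not necessarily the scheme referred to above) that the server de-chunks the request and returns an empty string from read() at the end of the body:

```python
import io


def read_body_bounded(environ):
    """Usual convention: read exactly CONTENT_LENGTH bytes, never more."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return environ["wsgi.input"].read(length)


def read_body_until_eof(environ, block_size=8192):
    """Hypothetical alternative: read until the server signals end of input."""
    stream = environ["wsgi.input"]
    blocks = []
    while True:
        block = stream.read(block_size)
        if not block:  # empty read taken to mean "no more request content"
            break
        blocks.append(block)
    return b"".join(blocks)


# A chunked request as the application might see it: no CONTENT_LENGTH.
chunked_environ = {"wsgi.input": io.BytesIO(b"streamed request body")}
```

With `chunked_environ`, the bounded reader sees an empty body, while the read-until-empty reader recovers all of it.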

Graham


Graham Dumpleton

Jan 6, 2011, 6:14:16 PM

to Alice Bevan–McGregor, web...@python.org

One other comment about HTTP/1.1 features.

You will always be battling to have some HTTP/1.1 features work in a
controllable way. This is because WSGI gateways/adapters aren't often
directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
AJP, CGI etc. In this sort of situation you are at the mercy of what
the modules implementing those protocols do, or even are hamstrung by
how those protocols work.

The classic example is 100-continue processing. This simply cannot
work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
mechanisms where proxying is performed as the protocol being used
doesn't implement a notion of end to end signalling in respect of
100-continue.

The current WSGI specification acknowledges that by saying:

"""
Servers and gateways that implement HTTP 1.1 must provide transparent
support for HTTP 1.1's "expect/continue" mechanism. This may be done
in any of several ways:

* Respond to requests containing an Expect: 100-continue request with
an immediate "100 Continue" response, and proceed normally.
* Proceed with the request normally, but provide the application with
a wsgi.input stream that will send the "100 Continue" response if/when
the application first attempts to read from the input stream. The read
request must then remain blocked until the client responds.
* Wait until the client decides that the server does not support
expect/continue, and sends the request body on its own. (This is
suboptimal, and is not recommended.)
"""

If you are going to try and push for full visibility of HTTP/1.1 and
an ability to control it at the application level then you will fail
with 100-continue to start with.

So, although option 2 above would be the most ideal and gives the
application control, specifically the ability to send an error
response based on request headers alone, without reading the request
body and triggering the 100-continue, it isn't practical to require
it, as the majority of hosting mechanisms for WSGI wouldn't even be
able to implement it that way.
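
For reference, option 2 is easy to sketch for a server that owns the client socket, which is exactly what the proxied mechanisms lack. The class and callback names below are hypothetical:

```python
import io


class ContinueOnFirstRead:
    """Wrap wsgi.input so "100 Continue" is sent lazily, on the first read.

    send_continue is a server-supplied callable that writes the interim
    response to the client; it is called at most once.
    """

    def __init__(self, stream, send_continue):
        self._stream = stream
        self._send_continue = send_continue
        self._sent = False

    def read(self, size=-1):
        if not self._sent:
            self._sent = True
            self._send_continue()
        return self._stream.read(size)
```

An application that rejects the request from its headers alone never calls read(), so the client is never told to send the body.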

The same goes for any other feature: there is no point mandating a
feature that can only realistically be implemented on a minority of
implementations. This would be even worse where dependence on such a
feature would mean that the WSGI application would no longer be
portable to another WSGI server and destroys the notion that WSGI
provides a portable interface.

This isn't restricted to HTTP/1.1 features either; it also applies to
raw SCRIPT_NAME and PATH_INFO. Only WSGI servers that are directly
hooked into the URL parsing of the base HTTP server can provide that
information, which basically means that only pure Python HTTP/WSGI
servers are likely able to provide it without guessing, and such
servers are usually used with the WSGI application mounted at the root
anyway.

Graham

Alex Grönholm

Jan 6, 2011, 7:46:48 PM

to web...@python.org

On 07.01.2011 01:14, Graham Dumpleton wrote:

> One other comment about HTTP/1.1 features.
>
> You will always be battling to have some HTTP/1.1 features work in a
> controllable way. This is because WSGI gateways/adapters aren't often
> directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
> AJP, CGI etc. In this sort of situation you are at the mercy of what
> the modules implementing those protocols do, or even are hamstrung by
> how those protocols work.
>
> The classic example is 100-continue processing. This simply cannot
> work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
> mechanisms where proxying is performed as the protocol being used
> doesn't implement a notion of end to end signalling in respect of
> 100-continue.

I think we need some concrete examples to figure out what is and isn't possible with WSGI 1.0.1.
My motivation for participating in this discussion can be summed up in that I want the following two applications to work properly:

- PlasmaDS (Flex Messaging implementation)
- WebDAV

The PlasmaDS project is the planned Python counterpart to Adobe's BlazeDS. Interoperability with the existing implementation requires that both the request and response use chunked transfer encoding, to achieve bidirectional streaming. I don't really care how this happens, I just want to make sure that there is nothing preventing it.

The WebDAV spec, on the other hand, says (http://www.webdav.org/specs/rfc2518.html#STATUS_102):

The 102 (Processing) status code is an interim response used to inform the client that the server has accepted the complete request, but has not yet completed it. This status code SHOULD only be sent when the server has a reasonable expectation that the request will take significant time to complete. As guidance, if a method is taking longer than 20 seconds (a reasonable, but arbitrary value) to process the server SHOULD return a 102 (Processing) response. The server MUST send a final response after the request has been completed.
Again, I don't care how this is done as long as it's possible.


Graham Dumpleton

Jan 6, 2011, 8:15:21 PM

to Alex Grönholm, web...@python.org

2011/1/7 Alex Grönholm <alex.g...@nextday.fi>:

> On 07.01.2011 01:14, Graham Dumpleton wrote:
>
>> One other comment about HTTP/1.1 features.
>>
>> You will always be battling to have some HTTP/1.1 features work in a
>> controllable way. This is because WSGI gateways/adapters aren't often
>> directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI,
>> AJP, CGI etc. In this sort of situation you are at the mercy of what
>> the modules implementing those protocols do, or even are hamstrung by
>> how those protocols work.
>>
>> The classic example is 100-continue processing. This simply cannot
>> work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting
>> mechanisms where proxying is performed as the protocol being used
>> doesn't implement a notion of end to end signalling in respect of
>> 100-continue.
>
> I think we need some concrete examples to figure out what is and isn't
> possible with WSGI 1.0.1.
> My motivation for participating in this discussion can be summed up in that
> I want the following two applications to work properly:
>
> - PlasmaDS (Flex Messaging implementation)
> - WebDAV
>
> The PlasmaDS project is the planned Python counterpart to Adobe's BlazeDS.
> Interoperability with the existing implementation requires that both the
> request and response use chunked transfer encoding, to achieve bidirectional
> streaming. I don't really care how this happens, I just want to make sure
> that there is nothing preventing it.

That can only be done by changing the rules around how wsgi.input is used.
I'll try and find a reference to where I have posted information about
this before, otherwise I'll write something up again about it.

> The WebDAV spec, on the other hand, says
> (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>
> The 102 (Processing) status code is an interim response used to inform the
> client that the server has accepted the complete request, but has not yet
> completed it. This status code SHOULD only be sent when the server has a
> reasonable expectation that the request will take significant time to
> complete. As guidance, if a method is taking longer than 20 seconds (a
> reasonable, but arbitrary value) to process the server SHOULD return a 102
> (Processing) response. The server MUST send a final response after the
> request has been completed.

That I don't offhand see a way of being able to do as protocols like
SCGI and CGI definitely don't allow interim status. I am suspecting
that FASTCGI and AJP don't allow it either.

I'll have to even do some digging as to how you would even handle that
in Apache with a normal Apache handler.

Graham


Graham Dumpleton

Jan 6, 2011, 9:09:58 PM

to Alex Grönholm, web...@python.org

2011/1/7 Graham Dumpleton <graham.d...@gmail.com>:

BTW, even if WSGI specification were changed to allow handling of
chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or
mod_wsgi daemon mode. Also not likely to work on uWSGI either.

This is because all of these work on the expectation that the complete
request body can be written across to the separate application process
before actually reading the response from the application.

In other words, both way streaming is not possible.

The only solution which would allow this with Apache is mod_wsgi
embedded mode, which in mod_wsgi 3.X already has an optional feature
which can be enabled so as to allow you to step out of current bounds
of the WSGI specification and use wsgi.input as I will explain, to do
this both way streaming.

Pure Python HTTP/WSGI servers which are a front facing server could
also be modified to handle this if the WSGI specification were changed,
but whether those same servers will work if put behind a web proxy will
depend on how the front end web proxy works.

Graham

Alex Grönholm

Jan 6, 2011, 9:36:32 PM

to web...@python.org

Then I suppose this needs to be standardized in PEP 444, wouldn't you agree?


Graham Dumpleton

Jan 6, 2011, 9:55:15 PM

to Alex Grönholm, web...@python.org

Huh! Not sure you understand what I am saying. Even if you changed the
WSGI specification to allow for it, the bulk of implementations
wouldn't be able to support it. The WSGI specification has no
influence over distinct protocols such as FASTCGI, SCGI, AJP or CGI or
proxy implementations and so can't be used to force them to be changed.

So, as much as I would like to see WSGI specification changed to allow
it, others may not on the basis that there is no point if few
implementations could support it.

Graham


James Y Knight

Jan 6, 2011, 9:55:20 PM

to Alex Grönholm, web...@python.org

On Jan 6, 2011, at 7:46 PM, Alex Grönholm wrote:

> The WebDAV spec, on the other hand, says (http://www.webdav.org/specs/rfc2518.html#STATUS_102):
>
> The 102 (Processing) status code is an interim response used to inform the client that the server has accepted the complete request, but has not yet completed it. This status code SHOULD only be sent when the server has a reasonable expectation that the request will take significant time to complete. As guidance, if a method is taking longer than 20 seconds (a reasonable, but arbitrary value) to process the server SHOULD return a 102 (Processing) response. The server MUST send a final response after the request has been completed.
>
> Again, I don't care how this is done as long as it's possible.

This pretty much has to be generated by the server implementation. One thing that could be done in WSGI is a callback function inserted into the environ to suggest to the server that it generate a certain 1xx response. That is, something like:

    if 'wsgi.intermediate_response' in environ:
        environ['wsgi.intermediate_response'](102, {'Random-Header': 'Whatever'})

If a server implements this, it should probably ignore any requests from the app to send a 100 or 101 response. The server should be free to ignore the request, or not implement it. Given that the only actual use case (WebDAV) is rather rare and marks it as a SHOULD, I don't see any real practical issues with it being optional.
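
A rough sketch of what the server side of such a callback might look like. The 'wsgi.intermediate_response' key is only the suggestion made above, not part of any spec, and the factory name, the `out` object standing in for the client connection, and the header formatting are all assumptions for illustration:

```python
import io


def make_intermediate_response(out):
    """Build a callable suitable for environ['wsgi.intermediate_response'].

    out is any object with a write(bytes) method, standing in for the
    client connection of a server that speaks raw HTTP/1.1.
    """
    def intermediate_response(status_code, headers):
        # 100 and 101 stay under server control; anything outside
        # 102..199 is not an interim response the app may request.
        if not 102 <= status_code <= 199:
            return
        reason = {102: "Processing"}.get(status_code, "")
        lines = ["HTTP/1.1 %d %s" % (status_code, reason)]
        lines.extend("%s: %s" % item for item in headers.items())
        out.write(("\r\n".join(lines) + "\r\n\r\n").encode("latin-1"))
    return intermediate_response


# Server side: expose the callback only if the transport can support it.
connection = io.BytesIO()
environ = {"wsgi.intermediate_response": make_intermediate_response(connection)}
```

A gateway that cannot emit interim responses would simply leave the key out of environ, and the application's `in environ` check keeps it portable.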

The other thing that could be done is simply have a server-side configuration to allow sending 102 after *any* request takes > 20 seconds to process. That wouldn't require any changes to WSGI.

I'd note that HTTP/1.1 clients are *required* to be able to handle any number of 1xx responses followed by a final response, so it's supposed to be perfectly safe for a server to always send a 102 as a response to any request, no matter what the app is, or what client user-agent is (so long as it advertised HTTP/1.1), or even whether the resource has anything to do with WebDAV. Of course, I'm willing to bet that's patently false back here in the Real World -- no doubt plenty of "HTTP/1.1" clients incorrectly barf on 1xx responses.

James

Graham Dumpleton

Jan 6, 2011, 10:08:20 PM

to James Y Knight, web...@python.org

2011/1/7 James Y Knight <fo...@fuhm.net>:

FWIW, Apache provides ap_send_interim_response() to allow interim status.

This is used by mod_proxy, but nowhere else in Apache core code. So,
you would be fine if proxying to a pure Python HTTP/WSGI server which
could generate interim responses, but would be out of luck with
FASTCGI, SCGI, AJP, CGI and any modules which do custom proxying using
their own protocol such as uWSGI or mod_wsgi daemon mode.

In all the latter, the wire protocols for proxy connection would
themselves need to be modified as well as module implementation, which
isn't going to happen for any of those which are generic protocols.

Graham


Alex Grönholm

Jan 6, 2011, 11:12:20 PM

to web...@python.org

I believe I understand what you are saying, but I don't want to restrict
the freedom of the developer just because of some implementations that
can't support some particular feature. If you need to do streaming, use
a server that supports it, obviously! If Java can do it, why can't we? I
would hate having to rely on a non-standard implementation if we have
the possibility to standardize this in a specification.

P.J. Eby

Jan 6, 2011, 11:18:12 PM

to al...@gothcandy.com, web...@python.org

At 12:52 PM 1/6/2011 -0800, Alice Bevan–McGregor wrote:
>Ignoring async for the moment, the goals of the PEP 444 rewrite are:
>
>:: Clear separation of "narrative" from "rules to be
>followed". This allows developers of both servers and applications
>to easily run through a confomance "check list".
>
>:: Isolation of examples and rationale to improve readability of the
>core rulesets.
>
>:: Clarification of often mis-interpreted rules from PEP 333 (and
>those carried over in 3333).
>
>:: Elimination of unintentional non-conformance, esp. re: cgi.FieldStorage.
>
>:: Massive simplification of call flow. Replacing start_response
>with a returned 3-tuple immensely simplifies the task of middleware
>that needs to capture HTTP status or manipulate (or even examine)
>response headers. [1]

A big +1 to all the above as goals.

>:: Reduction of re-implementation / NIH syndrome by incorporating
>the most common (1%) of features most often relegated to middleware
>or functional helpers.

Note that nearly every application-friendly feature you add will
increase the burden on both server developers and middleware
developers, which ironically means that application developers
actually end up with fewer options.

> Unicode decoding of a small handful of values (CGI values that
> pull from the request URI) is the biggest example. [2, 3]

Does that mean you plan to make the other values bytes, then? Or
will they be unicode-y-bytes as well? What happens for additional
server-provided variables?

The PEP 3333 choice was for uniformity. At one point, I advocated
simply using surrogateescape coding, but this couldn't be made
uniform across Python versions and maintain compatibility.

Unfortunately, even with the move to 2.6+, this problem remains,
unless server providers are required to register a surrogateescape
error handler -- which I'm not even sure can be done in Python 2.x.
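
For reference, a small sketch of what the surrogateescape handler (PEP 383) does on Python 3.1+, where it is built in. This runs on Python 3 only and says nothing about whether an equivalent can be registered on 2.x; the sample bytes are made up.

```python
# Python 3 only: surrogateescape is a built-in error handler there.
raw = b"/caf\xe9/menu"        # Latin-1 encoded bytes, not valid UTF-8

# Undecodable bytes become lone surrogates (U+DC80..U+DCFF) instead of
# raising UnicodeDecodeError.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
roundtrip = text.encode("utf-8", "surrogateescape")
```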

>:: Cross-compatibility considerations. The definition and use of
>native strings vs. byte strings is the biggest example of this in the rewrite.

I'm not sure what you mean here. Do you mean "portability of WSGI 2
code samples across Python versions (esp. 2.x vs. 3.x)?"

Alice Bevan–McGregor

Jan 7, 2011, 12:13:06 AM1/7/11

to web...@python.org

On 2011-01-06 14:14:32 -0800, Alice Bevan–McGregor said:
> There was something, somewhere I was reading related to WSGI about
> requiring content-length... but no matter.

Right, I remember now: the HTTP 1.0 specification. (Honestly not
trying to sound sarcastic!) See:

http://www.w3.org/Protocols/HTTP/1.0/draft-ietf-http-spec.html#Entity-Body

However, after testing every browser on my system (from Links and
ELinks, through Firefox, Chrome, Safari, Konqueror, and Dillo) across
the following test code, I find that they all handle a missing
content-length in the same way: reading the socket until it closes.

http://pastie.textmate.org/1435415
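
The behaviour described above can be sketched from the client's side. This is not the pastie test code; it is an illustrative stand-in that uses a connected socket pair in place of a real server and browser, and the response text is made up.

```python
import socket

RESPONSE = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"                       # end of headers; note: no Content-Length
    b"body ends when the socket closes\n"
)


def read_until_close(sock):
    """Collect everything the peer sends until it closes the connection."""
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:              # empty read means the peer closed its end
            break
        chunks.append(data)
    return b"".join(chunks)


# A connected socket pair stands in for a real server and browser.
server_side, client_side = socket.socketpair()
server_side.sendall(RESPONSE)
server_side.close()               # closing is the only end-of-body signal

head, _, body = read_until_close(client_side).partition(b"\r\n\r\n")
client_side.close()
```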

James Y Knight

Jan 7, 2011, 12:26:32 AM1/7/11

to Alice Bevan–McGregor, web...@python.org

On Jan 7, 2011, at 12:13 AM, Alice Bevan–McGregor wrote:

> On 2011-01-06 14:14:32 -0800, Alice Bevan–McGregor said:
>> There was something, somewhere I was reading related to WSGI about requiring content-length... but no matter.
>
> Right, I remember now: the HTTP 1.0 specification. (Honestly not trying to sound sarcastic!) See:
>
> http://www.w3.org/Protocols/HTTP/1.0/draft-ietf-http-spec.html#Entity-Body

You've misread that section. In HTTP/1.0, *requests* were required to have a Content-Length if they had a body (HTTP 1.1 fixed that with chunked request support). Responses have never had that restriction: they have always (even since before HTTP 1.0) been allowed to omit Content-Length and terminate by closing the socket.

HTTP 1.1 didn't really add any new functionality to *responses* by adding chunking, simply a bit of efficiency and error detection ability.
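
As an illustration of the chunked framing mentioned above, a minimal sketch of an encoder for HTTP/1.1 chunked transfer coding. The function name chunked is hypothetical, and trailers and chunk extensions are left out.

```python
def chunked(parts):
    """Yield HTTP/1.1 chunked transfer-coding frames for an iterable of bytes."""
    for part in parts:
        if part:  # skip empty parts: a zero-length chunk would end the body
            # Each chunk is: size in hex, CRLF, the data, CRLF.
            yield b"%x\r\n%s\r\n" % (len(part), part)
    yield b"0\r\n\r\n"  # last-chunk followed by an empty trailer section


framed = b"".join(chunked([b"Hello", b"", b", world"]))
```

The explicit terminating chunk is what gives the receiver the error detection: a connection that drops early never delivers it.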

James

Alice Bevan–McGregor

Jan 7, 2011, 12:31:52 AM1/7/11

to web...@python.org

On 2011-01-06 21:26:32 -0800, James Y Knight said:
> You've misread that section. In HTTP/1.0, *requests* were required to
> have a Content-Length if they had a body (HTTP 1.1 fixed that with
> chunked request support). Responses have never had that restriction:
> they have always (even since before HTTP 1.0) been allowed to omit
> Content-Length and terminate by closing the socket.

Ah ha, that explains my confusion, then! Thank you.

- Alice.

chris...@gmail.com

Jan 7, 2011, 4:08:42 AM1/7/11

to web...@python.org

On Thu, 6 Jan 2011, Alice Bevan–McGregor wrote:

> :: Clear separation of "narrative" from "rules to be followed". This allows
> developers of both servers and applications to easily run through a
> conformance "check list".

+1

> :: Isolation of examples and rationale to improve readability of the core
> rulesets.

+1

> :: Clarification of often mis-interpreted rules from PEP 333 (and those
> carried over in 3333).

+1

> :: Elimination of unintentional non-conformance, esp. re: cgi.FieldStorage.

+1

> :: Massive simplification of call flow. Replacing start_response with a
> returned 3-tuple immensely simplifies the task of middleware that needs to
> capture HTTP status or manipulate (or even examine) response headers. [1]

+1

I was initially resistant to this one in a "we fear change" kind of
way, but I've since recognized that a) I was thinking about it
mostly in terms of existing code I have that would need to be
changed and b) it _is_ more pythonic.

> :: Reduction of re-implementation / NIH syndrome by incorporating the most
> common (1%) of features most often relegated to middleware or functional
> helpers. Unicode decoding of a small handful of values (CGI values that pull
> from the request URI) is the biggest example. [2, 3]

0 (as in unsure, need to be convinced, etc)

The zero here is in large part because this particular goal could
cover a large number of things from standardized query string
processing (maybe a good idea) to filters (which I've already
expressed reservations about).

So this goal seems like it ought to be several separate goals.

> :: Cross-compatibility considerations. The definition and use of native
> strings vs. byte strings is the biggest example of this in the rewrite.

+1

> :: Making optional (and thus rarely-implemented) features non-optional. E.g.
> server support for HTTP/1.1 with clarifications for interfacing applications
> to 1.1 servers. Thus pipelining, chunked encoding, et al. as per the HTTP
> 1.1 RFC.

0

The other option (than non-optional) for optional things is to
remove them.

I think working from a list of goals is an excellent way to make
some headway.

--
Chris Dent http://burningchrome.com/
[...]

Alice Bevan–McGregor

Jan 7, 2011, 4:17:05 AM1/7/11

to web...@python.org

On 2011-01-06 20:18:12 -0800, P.J. Eby said:
>> :: Reduction of re-implementation / NIH syndrome by incorporating the
>> most common (1%) of features most often relegated to middleware or
>> functional helpers.
>
> Note that nearly every application-friendly feature you add will
> increase the burden on both server developers and middleware
> developers, which ironically means that application developers actually
> end up with fewer options.

Some things shouldn't have multiple options in the first place. ;) I
definitely consider implementation overhead on server, middleware, and
application authors to be important.

As an example, if yield syntax is allowable for application objects (as
it is for response bodies) middleware will need to iterate over the
application, yielding up-stream anything that isn't a 3-tuple. When it
encounters a 3-tuple, the middleware can do its thing. If the app
yield semantics are required (which may be a good idea for consistency
and simplicity sake if we head down this path) then async-aware
middleware can be implemented as a generator regardless of the
downstream (wrapped) application's implementation. That's not too much
overhead, IMHO.
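
A rough sketch of the generator-style middleware described above. It assumes the application is a generator that may yield arbitrary non-response values before finally yielding the 3-tuple. The names async_app and tag_responses and the None "not ready" marker are hypothetical, not part of any draft.

```python
def async_app(environ):
    """Hypothetical application: yields a placeholder, then the response."""
    yield None  # assumed "not ready yet" marker, passed up to the server
    yield "200 OK", [("Content-Type", "text/plain")], [b"done\n"]


def tag_responses(app):
    """Middleware written as a generator around a generator application."""
    def middleware(environ):
        for item in app(environ):
            if isinstance(item, tuple) and len(item) == 3:
                # The 3-tuple is the actual response: do the middleware's work.
                status, headers, body = item
                yield status, headers + [("X-Tagged", "1")], body
                return
            # Anything else is yielded up-stream untouched.
            yield item
    return middleware


results = list(tag_responses(async_app)({}))
```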

>> Unicode decoding of a small handful of values (CGI values that pull
>> from the request URI) is the biggest example. [2, 3]
>
> Does that mean you plan to make the other values bytes, then? Or will
> they be unicode-y-bytes as well?

Specific CGI values are bytes (one, I believe), specific ones are true
unicode (URI-related values) and decoded using a configurable encoding
with a fallback to "bytes in unicode" (iso-8859-1/latin1), are kept
internally consistent (if any one fails, treat as if they all failed),
have the encoding used recorded in the environ, and all others are
native strings ("bytes in unicode" where native strings are unicode).
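
The decoding rule for the URI-derived values could look roughly like this on Python 3. This is only a sketch: the function name decode_uri_values and the environ key "example.uri_encoding" are hypothetical placeholders, since the actual key names are for the PEP to define.

```python
def decode_uri_values(raw_values, encoding="utf-8"):
    """Decode URI-derived byte values as a group.

    If any single value fails to decode with the configured encoding,
    all of them fall back to iso-8859-1 ("bytes in unicode"), so the
    group stays internally consistent.
    """
    try:
        decoded = {key: value.decode(encoding)
                   for key, value in raw_values.items()}
        used = encoding
    except UnicodeDecodeError:
        decoded = {key: value.decode("iso-8859-1")
                   for key, value in raw_values.items()}
        used = "iso-8859-1"
    decoded["example.uri_encoding"] = used  # hypothetical key name
    return decoded


good = decode_uri_values({"PATH_INFO": b"/caf\xc3\xa9", "QUERY_STRING": b"q=1"})
bad = decode_uri_values({"PATH_INFO": b"/caf\xe9", "QUERY_STRING": b"q=\xc3\xa9"})
```

In the second call the query string is valid UTF-8 on its own, but it is still decoded as iso-8859-1 because the path was not.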

> What happens for additional server-provided variables?

That is the domain of the server to document, though native strings
would be nice. (The PEP only covers CGI variables.)

> The PEP 3333 choice was for uniformity. At one point, I advocated
> simply using surrogateescape coding, but this couldn't be made uniform
> across Python versions and maintain compatibility.

As an open question to anyone: is surrogateescape available in Python
2.6? Mandating that as a minimum version for PEP 444 has yielded
benefits in terms of back-ported features and syntax, like b''.

>> :: Cross-compatibility considerations. The definition and use of
>> native strings vs. byte strings is the biggest example of this in
>> the rewrite.
>
> I'm not sure what you mean here. Do you mean "portability of WSGI 2
> code samples across Python versions (esp. 2.x vs. 3.x)?"

It should be possible (and currently is, as demonstrated by
marrow.server.http) to create a polyglot server, polyglot
middleware/filters (demonstrated by marrow.wsgi.egress.compression),
and polyglot applications, though obviously polyglot code demands the
"lowest common denominator" in terms of feature use. Application /
framework authors would likely create Python 3 specific WSGI
applications to make use of the full Python 3 feature set, with
cross-compatibility relegated to server and middleware authors.

- Alice.

Alice Bevan–McGregor

Jan 7, 2011, 4:31:32 AM1/7/11

to web...@python.org

On 2011-01-07 01:08:42 -0800, chris.dent said:
> ... this particular goal [reduction of reimplementation / NIH] could
> cover a large number of things from standardized query string
> processing (maybe a good idea) to filters (which I've already expressed
> reservations about).
>
> So this goal seems like it ought to be several separate goals.

+1

This definitely needs to be broken out to be explicit over the things
that can be abstracted away from middleware and applications. Input
from framework authors would be valuable here to see what they disliked
re-implementing the most. ;)

Query string processing is a difficult task at the best of times, and
is one area that is reimplemented absolutely everywhere. (At some
point I should add up the amount of code + unit testing code that
covers this topic alone from the top 10 frameworks.)

> The other option (than non-optional) for optional things is to remove them.

True; though optional things already exist as if they were not there.
Implementors rarely, it seems, expend the effort to implement optional
components. Thus every HTTP server I came across has comments in the
code saying "up to the application to implement chunked responses",
indicating -some- thought, despite chunked /request/ support being
mandated by HTTP/1.1. (And other ignored requirements.)

P.J. Eby

Jan 7, 2011, 11:28:15 AM1/7/11

to al...@gothcandy.com, web...@python.org

At 01:17 AM 1/7/2011 -0800, Alice Bevan–McGregor wrote:
>On 2011-01-06 20:18:12 -0800, P.J. Eby said:
>>>:: Reduction of re-implementation / NIH syndrome by
>>>incorporating the most common (1%) of features most often
>>>relegated to middleware or functional helpers.
>>Note that nearly every application-friendly feature you add will
>>increase the burden on both server developers and middleware
>>developers, which ironically means that application developers
>>actually end up with fewer options.
>
>Some things shouldn't have multiple options in the first place. ;)

I meant that if a server doesn't implement the spec because of a
required feature, then the app developer doesn't have the option of
using that feature anyway -- meaning that adding the feature to the
spec didn't really help.

> I definitely consider implementation overhead on server,
> middleware, and application authors to be important.
>
>As an example, if yield syntax is allowable for application objects
>(as it is for response bodies) middleware will need to iterate over
>the application, yielding up-stream anything that isn't a
>3-tuple. When it encounters a 3-tuple, the middleware can do its
>thing. If the app yield semantics are required (which may be a good
>idea for consistency and simplicity sake if we head down this path)
>then async-aware middleware can be implemented as a generator
>regardless of the downstream (wrapped) application's implementation.
>That's not too much overhead, IMHO.

The reason I proposed the 3-tuple return in the first place (see
http://dirtsimple.org/2007/02/wsgi-middleware-considered-harmful.html
) was that I wanted to make middleware *easy* to write.

Easy enough to write quick, say, 10-line utility functions that are
correct middleware -- so that you could actually build your
application out of WSGI functions calling other WSGI-based functions.

The yielding thing wouldn't work for that at all.
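
To illustrate the kind of small utility meant here, a sketch of an application assembled from plain functions that call each other, assuming the 3-tuple return. The names by_path, hello and not_found are hypothetical.

```python
def not_found(environ):
    return "404 Not Found", [("Content-Type", "text/plain")], [b"not found\n"]


def hello(environ):
    return "200 OK", [("Content-Type", "text/plain")], [b"hello\n"]


def by_path(routes, default=not_found):
    """Build an application that delegates to another one based on PATH_INFO."""
    def app(environ):
        # Delegation is an ordinary function call returning the 3-tuple.
        return routes.get(environ.get("PATH_INFO", ""), default)(environ)
    return app


site = by_path({"/hello": hello})
found = site({"PATH_INFO": "/hello"})
missing = site({"PATH_INFO": "/nope"})
```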

>>>Unicode decoding of a small handful of values (CGI values that
>>>pull from the request URI) is the biggest example. [2, 3]
>>Does that mean you plan to make the other values bytes, then? Or
>>will they be unicode-y-bytes as well?
>
>Specific CGI values are bytes (one, I believe), specific ones are
>true unicode (URI-related values) and decoded using a configurable
>encoding with a fallback to "bytes in unicode" (iso-8859-1/latin1),
>are kept internally consistent (if any one fails, treat as if they
>all failed), have the encoding used recorded in the environ, and all
>others are native strings ("bytes in unicode" where native strings
>are unicode).

So, in order to know what type each CGI variable is, you'll need a reference?

>>What happens for additional server-provided variables?
>
>That is the domain of the server to document, though native strings
>would be nice. (The PEP only covers CGI variables.)

I mean the ones required by the spec, not server-specific extensions.

>>The PEP 3333 choice was for uniformity. At one point, I advocated
>>simply using surrogateescape coding, but this couldn't be made
>>uniform across Python versions and maintain compatibility.
>
>As an open question to anyone: is surrogateescape available in Python
>2.6? Mandating that as a minimum version for PEP 444 has yielded
>benefits in terms of back-ported features and syntax, like b''.

No, otherwise I'd totally go for the surrogateescape approach. Heck,
I'd still go for it if it were possible to write a surrogateescape
handler for 2.6, and require that a PEP 444 server register one with
Python's codec system. I don't know if it's *possible*, though,
hopefully someone with more knowledge can weigh in on that.

>>>:: Cross-compatibility considerations. The definition and use
>>>of native strings vs. byte strings is the biggest example of this
>>>in the rewrite.
>>I'm not sure what you mean here. Do you mean "portability of WSGI
>>2 code samples across Python versions (esp. 2.x vs. 3.x)?"
>
>It should be possible (and currently is, as demonstrated by
>marrow.server.http) to create a polyglot server, polyglot
>middleware/filters (demonstrated by marrow.wsgi.egress.compression),
>and polyglot applications, though obviously polyglot code demands the
>"lowest common denominator" in terms of feature use. Application /
>framework authors would likely create Python 3 specific WSGI
>applications to make use of the full Python 3 feature set, with
>cross-compatibility relegated to server and middleware authors.

I'm just asking whether, in your statement of goals and rationale,
you would expand "cross compatibility" as meaning cross-python
version portability, or whether you meant something else.

Éric Araujo

Jan 7, 2011, 11:36:56 AM1/7/11

to P.J. Eby, web...@python.org

> No, otherwise I'd totally go for the surrogateescape approach. Heck,
> I'd still go for it if it were possible to write a surrogateescape
> handler for 2.6, and require that a PEP 444 server register one with
> Python's codec system. I don't know if it's *possible*, though,
> hopefully someone with more knowledge can weigh in on that.

This error handler is written in C; I don’t know whether it would be
possible to reimplement it in Python. See PEP 383 for a description,
Python/codecs.c for the source.
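
For what it's worth, the registration mechanism itself (codecs.register_error) accepts handlers written in Python. The sketch below covers the decoding half only and runs on Python 3. The handler name "demo-escape" is made up, and whether the same approach carries over to 2.6 is exactly the open question, so treat it as an illustration rather than an answer.

```python
import codecs


def demo_escape(exc):
    """Decode-only imitation of surrogateescape, written in pure Python."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc  # the encoding direction is not handled in this sketch
    bad = exc.object[exc.start:exc.end]
    if any(byte < 128 for byte in bad):
        raise exc  # ASCII bytes are never escaped
    # Map each undecodable byte 0x80..0xFF to a lone surrogate U+DC80..U+DCFF
    # and tell the codec to resume right after the bad bytes.
    return "".join(chr(0xDC00 + byte) for byte in bad), exc.end


codecs.register_error("demo-escape", demo_escape)

data = b"a\xffb"
decoded = data.decode("utf-8", "demo-escape")
```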

Regards

Alice Bevan–McGregor

Jan 7, 2011, 4:22:23 PM1/7/11

to web...@python.org

On 2011-01-07 08:28:15 -0800, P.J. Eby said:
> At 01:17 AM 1/7/2011 -0800, Alice Bevan–McGregor wrote:
>> On 2011-01-06 20:18:12 -0800, P.J. Eby said:
>>>> :: Reduction of re-implementation / NIH syndrome by incorporating the
>>>> most common (1%) of features most often relegated to middleware or
>>>> functional helpers.
>>> Note that nearly every application-friendly feature you add will
>>> increase the burden on both server developers and middleware
>>> developers, which ironically means that application developers
>>> actually end up with fewer options.
>>
>> Some things shouldn't have multiple options in the first place. ;)
>
> I meant that if a server doesn't implement the spec because of a
> required feature, then the app developer doesn't have the option of
> using that feature anyway -- meaning that adding the feature to the
> spec didn't really help.

I truly can not worry about non-conformant applications, middleware, or
servers and still keep my hair.

>> I definitely consider implementation overhead on server, middleware,
>> and application authors to be important.
>>
>> As an example, if yield syntax is allowable for application objects (as
>> it is for response bodies) middleware will need to iterate over the
>> application, yielding up-stream anything that isn't a 3-tuple. When it
>> encounters a 3-tuple, the middleware can do its thing. If the app
>> yield semantics are required (which may be a good idea for consistency
>> and simplicity sake if we head down this path) then async-aware
>> middleware can be implemented as a generator regardless of the
>> downstream (wrapped) application's implementation. That's not too much
>> overhead, IMHO.
>
> The reason I proposed the 3-tuple return in the first place (see
> http://dirtsimple.org/2007/02/wsgi-middleware-considered-harmful.html)
> was that I wanted to make middleware *easy* to write.

This was noted several times, and I do agree with that blog article
which states that a lot of middleware shouldn't be middleware.

> Easy enough to write quick, say, 10-line utility functions that are
> correct middleware -- so that you could actually build your
> application out of WSGI functions calling other WSGI-based functions.
>
> The yielding thing wouldn't work for that at all.

Handling a possible generator isn't that difficult.

>>>> Unicode decoding of a small handful of values (CGI values that pull
>>>> from the request URI) is the biggest example. [2, 3]
>>> Does that mean you plan to make the other values bytes, then? Or will
>>> they be unicode-y-bytes as well?
>>
>> Specific CGI values are bytes (one, I believe), specific ones are true
>> unicode (URI-related values) and decoded using a configurable encoding
>> with a fallback to "bytes in unicode" (iso-8859-1/latin1), are kept
>> internally consistent (if any one fails, treat as if they all failed),
>> have the encoding used recorded in the environ, and all others are
>> native strings ("bytes in unicode" where native strings are unicode).
>
> So, in order to know what type each CGI variable is, you'll need a reference?

Reference? Re-read what I wrote. Only URI-specific values utilize an
encoding reference variable in the environment; that's four values out
of the entire environ. There is one, clearly defined bytes value. The
rest are native strings, decoded using
latin1/iso-8859-1/"str-in-unicode" where native strings are unicode.

>>> What happens for additional server-provided variables?
>>
>> That is the domain of the server to document, though native strings
>> would be nice. (The PEP only covers CGI variables.)
>
> I mean the ones required by the spec, not server-specific extensions.

The spec clearly defines the expected value types (see above). If it
doesn't, I will fix that. ;)

> I'm just asking whether, in your statement of goals and rationale, you
> would expand "cross compatibility" as meaning cross-python version
> portability, or whether you meant something else.

Cross-Python version portability is what it was intended to mean.

- Alice.

P.J. Eby

Jan 7, 2011, 11:34:09 PM1/7/11

to al...@gothcandy.com, web...@python.org

At 01:22 PM 1/7/2011 -0800, Alice Bevan–McGregor wrote:
>On 2011-01-07 08:28:15 -0800, P.J. Eby said:
>>At 01:17 AM 1/7/2011 -0800, Alice Bevan–McGregor wrote:
>>>On 2011-01-06 20:18:12 -0800, P.J. Eby said:
>>>>>:: Reduction of re-implementation / NIH syndrome by
>>>>>incorporating the most common (1%) of features most often
>>>>>relegated to middleware or functional helpers.
>>>>Note that nearly every application-friendly feature you add will
>>>>increase the burden on both server developers and middleware
>>>>developers, which ironically means that application developers
>>>>actually end up with fewer options.
>>>Some things shouldn't have multiple options in the first place. ;)
>>I meant that if a server doesn't implement the spec because of a
>>required feature, then the app developer doesn't have the option
>>of using that feature anyway -- meaning that adding the feature to
>>the spec didn't really help.
>
>I truly can not worry about non-conformant applications, middleware,
>or servers and still keep my hair.

I said "if a server doesn't implement the *spec*", meaning, they
choose not to support PEP 444 *at all*, not that they skip providing
the feature.

>>Easy enough to write quick, say, 10-line utility functions that
>>are correct middleware -- so that you could actually build
>>your application out of WSGI functions calling other WSGI-based functions.
>>The yielding thing wouldn't work for that at all.
>
>Handling a possible generator isn't that difficult.

That it's difficult at all removes degree-of-difficulty as a
strong motivation to switch.

>>So, in order to know what type each CGI variable is, you'll need a reference?
>
>Reference? Re-read what I wrote. Only URI-specific values utilize
>an encoding reference variable in the environment; that's four
>values out of the entire environ. There is one, clearly defined
>bytes value. The rest are native strings, decoded using
>latin1/iso-8859-1/"str-in-unicode" where native strings are unicode.

IOW, there are six specific facts someone needs to remember in order
to know the type of a given CGI variable, over and above the mere
fact that it's a CGI variable. Hence, "reference".

Alice Bevan–McGregor

Jan 8, 2011, 1:13:07 AM1/8/11

to web...@python.org

On 2011-01-07 20:34:09 -0800, P.J. Eby said:
> That it [handling generators] is difficult at all removes
> degree-of-difficulty as a strong motivation to switch.

Agreed. I will be following up with a more concrete idea (including
p-code) to better describe what is currently in my brain. (One half of
which will be just as objectionable, the other half, with Alex
Grönholm's input, far more reasonable.)

> IOW, there are six specific facts someone needs to remember in order to
> know the type of a given CGI variable, over and above the mere fact that
> it's a CGI variable. Hence, "reference".

No, practically there is one. If you are implementing a Python 3
solution, a single value (original URI) is an instance of bytes, the
rest are str. If you are implementing a Python 2 solution, there's a
single rule you need to remember: values derived from the URI
(QUERY_STRING, PATH_INFO, etc.) are unicode, the rest are str.

Polyglot implementors are already accepting that they will need to
include more in their headspace before writing a single line of code;
knowing that "native string" differs between the two languages is a
fundamental concept necessary for the act of writing polyglot code.

- Alice.
