Unlike compression, which almost all web servers support (even though it's next to impossible to handle ETags properly for dynamic content at that level), decompression generally isn't supported.
My only concern with adding this to GzipMiddleware is backwards compatibility. Suddenly every user of GzipMiddleware may see different behavior.
For example, double-decoding could occur in the following circumstances:
- if an application gzip-decodes the request body without checking the Content-Encoding header, because it knows it's gzipped,
- if a WSGI middleware gzip-decodes the request body but doesn't strip the Content-Encoding header.
Arguably, these are bad designs, and I think we can make this change and document it in the release notes.
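To illustrate what double-decoding looks like in practice, here's a minimal stdlib-only sketch. The gunzip helper and the sample body are made up for the example; the point is only that running a gzip decoder over an already-decoded body fails.

```python
import gzip
import zlib

# Sample request body, purely illustrative.
body = b'{"name": "example"}'
compressed = gzip.compress(body)


def gunzip(data):
    """Hypothetical helper: decode a gzip-wrapped payload with zlib."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    return zlib.decompress(data, 16 + zlib.MAX_WBITS)


# First decode, e.g. performed by a middleware.
once = gunzip(compressed)

# Second decode, e.g. by application code that assumes the body is
# still gzipped because the Content-Encoding header was left in place.
try:
    gunzip(once)
    double_decode_failed = False
except zlib.error:
    double_decode_failed = True
```

Stripping the Content-Encoding header after decoding, and checking it before decoding, avoids both cases listed above.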
The idea behind this is simply to decompress the body of requests containing a Content-Encoding: gzip header.
This patch doesn't work at all. request.POST is a MultiValueDict, not a string! You're artificially setting it to a string in the test, and you're unit-testing the implementation, so the test passes, but a real request wouldn't work.
The process for parsing the request body into POST, FILES, or request.body is quite complicated in Django. The only reasonable implementation is to wrap environ['wsgi.input'] in a gzip decoder, and that's much easier to implement as a WSGI middleware. I have such an implementation floating around:
import logging
import zlib
from io import BytesIO


class UnzipRequestMiddleware(object):
    """A middleware that unzips POSTed data.

    For this middleware to kick in, the client must provide a value
    for the ``Content-Encoding`` header. The only accepted value is
    ``gzip``. Any other value is ignored.
    """

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        encoding = environ.get('HTTP_CONTENT_ENCODING')
        if encoding == 'gzip':
            data = environ['wsgi.input'].read()
            try:
                # 16 + MAX_WBITS makes zlib expect a gzip header and trailer.
                uncompressed = zlib.decompress(data, 16 + zlib.MAX_WBITS)
                environ['wsgi.input'] = BytesIO(uncompressed)
                # CGI-style environ values must be strings.
                environ['CONTENT_LENGTH'] = str(len(uncompressed))
            except zlib.error:
                logging.warning(u"Could not decompress request data.", exc_info=True)
                # The stream is already consumed, so hand the raw bytes back.
                environ['wsgi.input'] = BytesIO(data)
        return self.app(environ, start_response)
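For anyone who wants to try the idea without a server, here's a rough Python 3 sketch of the same approach with a tiny harness. GunzipRequestMiddleware, echo_app and post_gzipped are hypothetical names for this example. It assumes the body really is in gzip format (hence the 16 + MAX_WBITS argument), and it additionally strips the Content-Encoding header after decoding, which is my own addition to avoid double-decoding.

```python
import gzip
import io
import zlib


class GunzipRequestMiddleware:
    """Hypothetical Python 3 variant of the WSGI middleware above."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if environ.get('HTTP_CONTENT_ENCODING') == 'gzip':
            # Reads the whole body into memory, like the snippet above.
            data = environ['wsgi.input'].read()
            try:
                # 16 + MAX_WBITS: expect a gzip header and trailer.
                body = zlib.decompress(data, 16 + zlib.MAX_WBITS)
            except zlib.error:
                # Not valid gzip: pass the raw bytes through untouched.
                body = data
            else:
                environ['CONTENT_LENGTH'] = str(len(body))
                # Assumption: strip the header so nothing decodes twice.
                del environ['HTTP_CONTENT_ENCODING']
            environ['wsgi.input'] = io.BytesIO(body)
        return self.app(environ, start_response)


def echo_app(environ, start_response):
    """Minimal WSGI app that returns the request body it received."""
    length = int(environ.get('CONTENT_LENGTH') or 0)
    payload = environ['wsgi.input'].read(length)
    start_response('200 OK', [('Content-Type', 'application/octet-stream')])
    return [payload]


def post_gzipped(app, raw_body):
    """Send a gzipped POST body through `app` using a hand-built environ."""
    compressed = gzip.compress(raw_body)
    environ = {
        'REQUEST_METHOD': 'POST',
        'CONTENT_LENGTH': str(len(compressed)),
        'HTTP_CONTENT_ENCODING': 'gzip',
        'wsgi.input': io.BytesIO(compressed),
    }
    chunks = app(environ, lambda status, headers: None)
    return b''.join(chunks), environ


body, environ = post_gzipped(GunzipRequestMiddleware(echo_app), b'a=1&b=2')
```

Updating CONTENT_LENGTH matters here because the wrapped application only reads that many bytes from the stream.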
If you'd like to implement this as a Django middleware, I'll have a look at the code. It will be more complex than your initial attempt and it will require a healthy amount of documentation, especially for middleware ordering. For example, if a middleware accesses request.POST prior to unzipping, that won't work.
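To make the ordering problem concrete, here's a rough stdlib-only simulation. LazyRequest, unzip, early_reader and run are made-up names, not Django APIs; LazyRequest only imitates a request object that parses its body once, on first access to POST, and caches the result.

```python
import gzip
import io
from urllib.parse import parse_qs


class LazyRequest:
    """Hypothetical request whose POST is parsed once, on first access."""

    def __init__(self, stream):
        self.stream = stream
        self._post = None

    @property
    def POST(self):
        if self._post is None:
            # Consumes the stream and caches whatever it contained.
            raw = self.stream.read().decode('latin-1')
            self._post = parse_qs(raw)
        return self._post


def unzip(request):
    """Simulated middleware: swap the stream for its gunzipped content."""
    data = request.stream.read()
    request.stream = io.BytesIO(gzip.decompress(data) if data else b'')


def early_reader(request):
    """Simulated middleware that touches request.POST."""
    return request.POST


def run(steps, body):
    """Apply the simulated middleware in order, then return request.POST."""
    request = LazyRequest(io.BytesIO(gzip.compress(body)))
    for step in steps:
        step(request)
    return request.POST


good = run([unzip, early_reader], b'a=1')  # unzip first: parsed form data
bad = run([early_reader, unzip], b'a=1')   # POST was cached from gzip bytes
```

In the second ordering the stream has already been consumed and the parsed result cached, so unzipping afterwards has no effect on POST.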
You should also check how this interacts with upload handlers. If the decoding is performed at the right level, it should be independent, but please check: