A base compression middleware class

57 views
Skip to first unread message

Illia Volochii

unread,
Jun 14, 2021, 7:47:01 PM6/14/21
to Django developers (Contributions to Django itself)
Hi all,

There is `GZipMiddleware` that compresses response bodies using gzip.
https://github.com/django/django/blob/main/django/middleware/gzip.py

But there are other algorithms supported by browsers (e.g., Brotli).
At the moment, if somebody wants to add support for any of them in a
Django project, one has to create a new middleware class redefining
all the logic processing responses.

I propose simplifying the process by creating
`BaseCompressionMiddleware`. Its subclasses would just need to define
a tuple of supported compressors.

The tuple elements could be instances of such a data class:
```
@dataclass
class Compressor:
    name: str
    compress_string_func: Callable[[bytes], bytes]
    compress_sequence_func: Optional[
        Callable[[Iterable[bytes]], Iterator[bytes]]
    ] = None

    def is_accepted_by(self, accept_encoding):
        return bool(re.search(rf'\b{self.name}\b', accept_encoding))
```


And the class could select a compressor using such code:
```
@classmethod
def select_compressor(cls, request, response):
    ae = request.META.get('HTTP_ACCEPT_ENCODING', '')
    for compressor in cls.compressors:
        if (
            compressor.is_accepted_by(ae)
            and (not response.streaming or compressor.compress_sequence_func)
        ):
            return compressor
```


Note, in the example `compress_string_func` is required and
`compress_sequence_func` is optional. I think that
`compress_string_func` would usually be a library function, and
`compress_sequence_func` would require some custom logic. And we
should not force users to define the latter.

`GZipMiddleware` could be changed to
```
gzip_compressor = Compressor(
    name='gzip',
    compress_string_func=compress_string,
    compress_sequence_func=compress_sequence,
)

class GZipMiddleware(BaseCompressionMiddleware):
    compressors = (gzip_compressor,)
```


And someone wanting to support both Brotli and gzip could define:
```
import brotli
from django.middleware.gzip import gzip_compressor

brotli_compressor = Compressor(
    name='br',
    compress_string_func=brotli.compress,
)

class CompressionMiddleware(BaseCompressionMiddleware)
    compressors = (brotli_compressor, gzip_compressor)
```


Please let me know what you think about such an enhancement.
I will be happy to create a pull request if this is accepted.

Thanks,
Illia

Curtis Maloney

unread,
Jun 14, 2021, 8:16:23 PM6/14/21
to 'Mike Hansen' via Django developers (Contributions to Django itself)
Hi Illia,

I like the idea here, and your design looks sensible at first blush, but I feel conflicted.

As much as I like the idea of allowing more flexibility and future-proofing this middleware,
AIUI the gzip middleware is generally discouraged, as it's typically more efficient to have your web server do this work.

I'd be more tempted to deprecate GzipMiddleware in core, and move this feature into a 3rd party app.

An additional hook for your interface: a function to test if this compressor should be used on this response. For instance, the gzip middleware has a minimum size below which it's deemed not worth the effort to try to compress.

--
Curtis
--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.

Tom Carrick

unread,
Jun 15, 2021, 6:35:35 AM6/15/21
to django-d...@googlegroups.com
It's also worth noting that the security issue mentioned in the docs[1] makes it unsuitable for many (most?) Django projects, unfortunately, and brotli is also susceptible to this attack.

It's probably not something I'd be keen on encouraging the use of, though I also think the idea in principle is a good one.


Tom

Reply all
Reply to author
Forward
0 new messages