[RFC] HttpResponse: freeze API, read_once

6 views
Skip to first unread message

Forest Bond

unread,
Sep 13, 2010, 1:34:55 PM9/13/10
to django-developers
Hi,

I'd like to return to the problem of middleware, view decorators, etc. affecting
responses in negative ways.


Background
==========

Previous discussion on django-developers (this has come up more than once,
though, I think):

* http://groups.google.com/group/django-developers/browse_thread/thread/f5ba6df0400a9c57

Some relevant tickets (there are probably more):

* http://code.djangoproject.com/ticket/6527
* http://code.djangoproject.com/ticket/6027
* http://code.djangoproject.com/ticket/12214

This originally came up for me with streaming responses (with iterator content),
but since then I've seen a few new situations where I wanted to be able to
prevent middleware from doing certain things to certain responses.

Russell had proposed a capabilities-based approach to dealing with this problem
(see previous discussion). Ultimately, I don't think this is the correct way to
go simply because I don't think the problem is limited to the question of what
you *can* do to a response. It's also a question of what you *should* do to a
response.

In fact, the only real "capability" I'm aware of is the case where the response
content is an iterator. Everything else is a matter of developer preference,
which he would most likely want to express in his view or using a view
decorator. To accommodate this, we need an API that the developer can use to
express his preferences for how a particular response should be handled.

So here are the things that I think we need to be able to communicate about a
response:

1. Don't set, change, or remove header X.
2. Don't modify, replace, or remove the response content.
3. Response content can only be read once, don't try to read it.


Response Freezing
=================

To address #1 and #2, I'd like to propose a response "freezing" API. The basic
idea is that I can freeze a particular header expect that that header will not
be modified by middleware or view decorators. I can also freeze the response
content to indicate that the content should not be modified.

The relevant functions would be:

* HttpResponse.freeze_header(header)
* HttpResponse.freeze_all_headers()
* HttpResponse.header_is_frozen(header)
* HttpResponse.freeze_content()
* HttpResponse.content_is_frozen()

Any response-handling code should check to see if the things it wants to modify
are frozen before actually doing anything. If they are, it should not modify
the response at all.

If middleware tries to modify something that has been frozen, an exception
should be raised. This is backwards compatible because the freeze API is new,
and so long as no one calls a freeze method, responses will continue to behave
as they do now. However, if there are concerns about the transition, we can
emit a DeprecationWarning instead for a few releases. In any case, middleware
included with Django should be updated to respect frozen requests.

To be clear with regard to backwards compatibility: no views, decorators, or
middleware included with Django should call the freeze_* functions, at least
until the freeze API has been out long enough for people to fix their
middleware.


Iterator Protection
===================

To address #3, we need a way to protect responses with iterator content from
being drained prematurely. For this, I propose a boolean attribute on
HttpResponse instances called "read_once". This will default to False to
preserve backwards compatibility, but if it is set to True, an exception should
be raised when the content is read. Middleware can inspect response.read_once
to see if it is okay to read the response content.

Obviously, there will need to be some special handling by the HTTP handler so
that it can access the content so it can feed it to the client without tripping
protection.


Examples
========

Some simple functions that demonstrate how middleware/view decorators would need
to handle responses:


def set_etag(response):
if response.read_once or response.header_is_frozen('ETag'):
return response
response['ETag'] = calculate_etag(response.content)
return response


def set_content_length(response):
if response.read_once or response.header_is_frozen('Content-Length'):
return response
response['Content-Length'] = str(len(response.content))
return response


def compress_content(response):
# Note that we don't need to check response.read_once here since we are
# replacing the content with a non-iterator. If the user didn't want us
# to do this, he would have frozen the content.
if response.header_is_frozen('Content-Encoding') \
or response.header_is_frozen('Content-Length') \
or response.content_is_frozen():
return response
response.content = compress_string(response.content)
response['Content-Encoding'] = 'gzip'
response['Content-Length'] = str(len(response.content))
return response


Shortcomings
============

The biggest problem with the approach I've proposed is that it is not
sufficiently granular to handle the case where I don't want the response content
modified (e.g. compressed) but I *do* want to return HTTP 304 Not Modified where
appropriate. That's the only use case I can think of right now that is not
addressed by my proposal. I have a few ideas about this but I'm curious to see
what ideas other people have and I think it can be addressed later.


I am more than happy to provide an implementation of the above, but I'd like to
get an idea of what chance it has of being accepted before I start working on
it. I am also happy to maintain it outside of Django svn for a while so that
people can test it out.

Feedback would be appreciated.


Thanks,
Forest
--
Forest Bond
http://www.alittletooquiet.net
http://www.pytagsfs.org

signature.asc

Santiago Perez

unread,
Sep 13, 2010, 2:58:12 PM9/13/10
to django-d...@googlegroups.com
The biggest problem with the approach I've proposed is that it is not
sufficiently granular to handle the case where I don't want the response content
modified (e.g. compressed) but I *do* want to return HTTP 304 Not Modified where
appropriate.  That's the only use case I can think of right now that is not
addressed by my proposal.  I have a few ideas about this but I'm curious to see
what ideas other people have and I think it can be addressed later.

Another missing use case is when you have streaming content that you don't want to load in memory but you still want it to be modified (compressed for instance) in ways that can be done during the streaming. 

Forest Bond

unread,
Sep 13, 2010, 4:08:58 PM9/13/10
to django-d...@googlegroups.com
Hi Santiago,

Good point.

But it can be added fairly easily if we also stipulate that with iterator
content the iterator itself is available as attribute "content_iterator".

def compress_content(response):


if response.header_is_frozen('Content-Encoding') \
or response.header_is_frozen('Content-Length') \
or response.content_is_frozen():
return response

if response.read_once:
response.content_iterator = compress_iterator(response.content_iterator)
# We cannot know the response length.
if 'Content-Length' in response:
del response['Content-Length']
else:
response.content = compress_string(response.content)
response['Content-Length'] = str(len(response.content))
response['Content-Encoding'] = 'gzip'
return response

signature.asc

Luke Plant

unread,
Sep 13, 2010, 8:22:37 PM9/13/10
to django-d...@googlegroups.com
Hi Forest,

you wrote:
>
> I'd like to return to the problem of middleware, view decorators, etc. affecting
> responses in negative ways.

Thanks so much for all your hard work and persistence on this issue.
Although I haven't gone through this in detail, I think this is arriving
at a good solution now. I massively prefer the explicit 'freeze' API on
HttpResponse to subclasses that do different things.

It would be extremely useful to me if you could go through the use cases
linked on the previous thread and see if your solution addressed them
all.

> I am more than happy to provide an implementation of the above, but I'd like to
> get an idea of what chance it has of being accepted before I start working on
> it. I am also happy to maintain it outside of Django svn for a while so that
> people can test it out.

>From my perspective that would be great. One thing that would be
brilliant whatever the eventual solution would be a set of use cases
encoded into some tests - I suspect that whatever solution/API is chosen
it would be worth doing the tests first so that you can see if it is
going to work in practice, and so that other people can see how to use
it.

Many thanks.

Luke

--
When you're great, people often mistake candor for bragging.
-- Calvin and Hobbes

Luke Plant || http://lukeplant.me.uk/

Forest Bond

unread,
Sep 14, 2010, 12:27:54 AM9/14/10
to django-d...@googlegroups.com
Hi Luke,

On Tue, Sep 14, 2010 at 01:22:37AM +0100, Luke Plant wrote:
> you wrote:
> >
> > I'd like to return to the problem of middleware, view decorators, etc. affecting
> > responses in negative ways.
>
> Thanks so much for all your hard work and persistence on this issue.

Thanks. Persistence is one of my few good traits. ;)

> Although I haven't gone through this in detail, I think this is arriving
> at a good solution now. I massively prefer the explicit 'freeze' API on
> HttpResponse to subclasses that do different things.

It's no silver bullet, but it's probably good enough to cover the majority of
use cases without making things complicated. I'm curious to hear what people
think.

> It would be extremely useful to me if you could go through the use cases
> linked on the previous thread and see if your solution addressed them
> all.

I'll get an e-mail out tomorrow with a good overview of each use case I know of
and how well the freeze API would deal with them.

> > I am more than happy to provide an implementation of the above, but I'd like to
> > get an idea of what chance it has of being accepted before I start working on
> > it. I am also happy to maintain it outside of Django svn for a while so that
> > people can test it out.
>
> From my perspective that would be great. One thing that would be
> brilliant whatever the eventual solution would be a set of use cases
> encoded into some tests - I suspect that whatever solution/API is chosen
> it would be worth doing the tests first so that you can see if it is
> going to work in practice, and so that other people can see how to use
> it.

That's a great idea. Probably step one then would be to write up some test
cases and make sure we all agree about those. Once I hear what people have to
say about the proposed API, I'll start on that.

signature.asc

Forest Bond

unread,
Sep 14, 2010, 10:03:06 AM9/14/10
to django-developers
Hi,

I was re-reading this and wanted to clarify a few points.

On Mon, Sep 13, 2010 at 01:34:55PM -0400, Forest Bond wrote:
> Response Freezing
> =================
>
> To address #1 and #2, I'd like to propose a response "freezing" API. The basic
> idea is that I can freeze a particular header expect that that header will not
> be modified by middleware or view decorators. I can also freeze the response
> content to indicate that the content should not be modified.
>
> The relevant functions would be:
>
> * HttpResponse.freeze_header(header)
> * HttpResponse.freeze_all_headers()

I actually don't think freeze_all_headers is necessary.

> Iterator Protection
> ===================
>
> To address #3, we need a way to protect responses with iterator content from
> being drained prematurely. For this, I propose a boolean attribute on
> HttpResponse instances called "read_once". This will default to False to
> preserve backwards compatibility, but if it is set to True, an exception should
> be raised when the content is read. Middleware can inspect response.read_once
> to see if it is okay to read the response content.

I just realized that this paragraph is inconsistent with my compress_content
example below, which wouldn't work if an exception is raised when the content is
read.

Perhaps we can simply say that if read_once is True, middleware can access the
response content via attribute content_iterator only. I'd like to raise an
exception if it is accessed directly via attribute content just so we can weed
out the common case where a middleware has not been updated to check read_once.

> Examples
> ========


>
> def compress_content(response):
> # Note that we don't need to check response.read_once here since we are
> # replacing the content with a non-iterator. If the user didn't want us
> # to do this, he would have frozen the content.
> if response.header_is_frozen('Content-Encoding') \
> or response.header_is_frozen('Content-Length') \
> or response.content_is_frozen():
> return response
> response.content = compress_string(response.content)
> response['Content-Encoding'] = 'gzip'
> response['Content-Length'] = str(len(response.content))
> return response

Thanks,

signature.asc

Forest Bond

unread,
Sep 14, 2010, 11:33:18 PM9/14/10
to django-d...@googlegroups.com
Hi,

On Tue, Sep 14, 2010 at 01:22:37AM +0100, Luke Plant wrote:

> It would be extremely useful to me if you could go through the use cases
> linked on the previous thread and see if your solution addressed them
> all.

Sorry, I'll have to get this out tomorrow. I ended up being a little short on
time today.

For what it's worth, looking at the use cases again has been instructive and I
think the proposed API can be simplified some and still be able to address each
of them.

signature.asc

Tai Lee

unread,
Sep 15, 2010, 12:25:36 AM9/15/10
to Django developers
I'm going to try and preempt a possible objection that has been raised
in previous discussions on this subject.

Won't this change require a lot of repetitive logic to be shifted into
middleware functions? All middleware authors will now need to inspect
the response and find out if they can make the changes they want to
make.

If an old middleware hasn't been updated and a developer tries to
freeze a header that the old middleware needs to alter, an exception
will be raised. The only option for the developer in this case would
be to alter the old 3rd party middleware themselves?

As an alternative (that has been raised in one of the previous
discussions), what's wrong with the "disabled_middleware" (or whatever
name is deemed appropriate) attribute on HttpResponse? This would make
it easy for developers to disable or skip certain response middleware
on a per-response basis, or create their own HttpResponse subclasses
that skip certain incompatible middleware, and would not require any
changes to existing middleware functions nor require middleware
authors to check in their functions if it is allowed to make any
changes. The code that determines whether or not changes (middleware)
are applied to responses would be in one place, the Django code that
applies middleware, rather than the 3rd party middleware code itself.

Cheers.
Tai.
>  signature.asc
> < 1KViewDownload

Tai Lee

unread,
Sep 15, 2010, 1:14:54 AM9/15/10
to Django developers
After re-reading the rest of those previous discussions, I see at
least one thing wrong with the "disabled_middleware" attribute
approach. 3rd party apps won't know which middleware is installed that
might break their responses, and end-users won't have a mechanism to
disable middleware for responses in 3rd party apps (just their own
apps).

The last post (by me) on the previous discussion thread that was
linked has a suggestion to work around this by adding a
CONDITIONAL_MIDDLEWARE_CLASSES setting, where end-users can map
response middleware that should or shouldn't execute for a particular
HttpResponse class.

This should be completely backwards compatible, will not require
duplicate, repetitive or error prone code in each middleware function
to interrogate the capabilities (or frozen headers, etc) for each
response.

Perhaps this could even also map response middleware functions to view
functions in addition to HttpResponse classes?

Regarding the freeze API discussed above, how would one make use of
this in a view that generates and returns a response with
direct_to_template (or similar)? Would users simply be required to
construct and alter their own HttpResponse manually and return that,
rather than using a helper function like direct_to_template?

Cheers.
Tai.

Forest Bond

unread,
Sep 15, 2010, 2:08:49 PM9/15/10
to django-d...@googlegroups.com
Hi Tai,

On Tue, Sep 14, 2010 at 09:25:36PM -0700, Tai Lee wrote:
> I'm going to try and preempt a possible objection that has been raised
> in previous discussions on this subject.
>
> Won't this change require a lot of repetitive logic to be shifted into
> middleware functions? All middleware authors will now need to inspect
> the response and find out if they can make the changes they want to
> make.

Middleware will have to be changed to support streaming responses, yes. But the
checks that middleware have to do on the response are fairly trivial.

> If an old middleware hasn't been updated and a developer tries to
> freeze a header that the old middleware needs to alter, an exception
> will be raised. The only option for the developer in this case would
> be to alter the old 3rd party middleware themselves?

Or they can fix it by subclassing::

class FixedMiddleware(BrokenMiddleware):
def process_response(self, request, response):
if response.header_is_frozen('X-My-Header'):
return response
return super(FixedMiddleware, self).process_response(request, response)

That's not a huge penalty for someone that really need streaming responses, I
wouldn't think. It also wouldn't take that much for the author of the 3rd party
middleware to implement the fix properly.

signature.asc
Reply all
Reply to author
Forward
0 new messages