Re: [Django] #7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse objects.

34 views
Skip to first unread message

Django

unread,
Jun 23, 2011, 1:41:15 AM6/23/11
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New | Status: new
feature | Component: Core (Other)
Milestone: | Severity: Normal
Version: SVN | Keywords: stream HttpResponse
Resolution: | Content-Length
Triage Stage: Design | Has patch: 1
decision needed | Needs tests: 0
Needs documentation: 0 | Easy pickings: 0
Patch needs improvement: 0 |
UI/UX: 0 |
-------------------------------------+-------------------------------------
Changes (by mrmachine):

* ui_ux: => 0
* easy: => 0


Comment:

Just updated the patch to apply cleanly against trunk again. Removed (now)
redundant changes to CSRF middleware (it no longer breaks streaming). Also
made `_get_content_generator()` consistent with `_get_content()` in that
it now calls `smart_str()` on each chunk. This was causing me problems
when passing a generator that yielded unicode strings instead of bytecode
strings, which was breaking `compress_sequence()`.

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:24>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Sep 14, 2011, 7:34:48 PM9/14/11
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New | Status: new
feature | Component: Core (Other)
Milestone: | Severity: Normal
Version: SVN | Keywords: stream HttpResponse
Resolution: | Content-Length
Triage Stage: Design | Has patch: 1
decision needed | Needs tests: 0
Needs documentation: 0 | Easy pickings: 0
Patch needs improvement: 1 |
UI/UX: 0 |
-------------------------------------+-------------------------------------
Changes (by PaulM):

* needs_better_patch: 0 => 1


Comment:

I'm afraid this patch broke again with r16829. I think that change should
make this patch quite a bit cleaner in the long run.

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:25>

Django

unread,
Sep 16, 2011, 5:52:07 PM9/16/11
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New | Status: new
feature | Component: Core (Other)
Milestone: | Severity: Normal
Version: SVN | Keywords: stream HttpResponse
Resolution: | Content-Length
Triage Stage: Accepted | Has patch: 1
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 1 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
Changes (by jacob):

* stage: Design decision needed => Accepted


Comment:

It's clear to me that we have to do *something* about this (though *what*
is far from obvious).

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:26>

Django

unread,
Jan 7, 2012, 12:57:08 AM1/7/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: SVN
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 1
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by Chris Ghormley <chris@…>):

It doesn't advance the discussion on what to do next, but the patch I just
added should let people like me who wanted to painlessly apply mrmachine's
patch to Django 1.3.1, do so.

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:27>

Django

unread,
Jan 13, 2012, 7:54:37 AM1/13/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: SVN
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 1
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by mitar):

* cc: mmitar@… (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:28>

Django

unread,
Jul 31, 2012, 10:58:57 PM7/31/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 1
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by mrmachine):

I've updated this patch on a branch at GitHub.

https://github.com/thirstydigital/django/tree/tickets/7581-streaming-
response-middleware

If somebody can provide some concrete direction on the desired approach to
resolve this (BDFL decision?), I'll try to add docs and tests as well.

FYI, I've been using this patch in various incarnations for 4 years in
production environments without any problems.

My vote is still for allowing middleware to ask the response if it is
streaming (generator as content) or not, and act accordingly. Either by
changing their behaviour (to be compatible with streaming responses),
doing nothing (to avoid breaking a streaming response), or forcing the
response to consume the content (breaking the streaming response, but
ensuring that the middleware always runs).

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:29>

Django

unread,
Aug 23, 2012, 11:35:41 AM8/23/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 1
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by mrmachine):

Updated this patch with some minor tweaks arising from IRC discussion with
akaariai. Consume generator content on access, not assignment, so that
generator content can still stream if no middleware has to access
`response.content`. Raise `PendingDeprecationWarning` if you explicitly
create an `HttpResponse` with `streaming_content=True` and then you (or
some middleware) accesses `response.content`.

Also see recent google developers discussion at
https://groups.google.com/d/topic/django-developers/RrNPfuJxnlM/discussion

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:30>

Django

unread,
Sep 3, 2012, 12:26:25 AM9/3/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 1
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by mindsocket):

* cc: mindsocket (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:31>

Django

unread,
Sep 22, 2012, 2:07:49 PM9/22/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 1

Has patch: 1 | Patch needs improvement: 1
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by mrmachine):

* needs_docs: 0 => 1


Comment:

I've updated the patch on my branch after further feedback from Aymeric
Augustin and Anssi Kääriäinen on google groups discussion.

The new patch refactors `HttpResponse` into `HttpResponseBase` and
`HttpResponse` classes, and adds a new `HttpStreamingResponse` class.

The `HttpResponse` class is simplified, and will consume iterator content
whenever it is assigned, so that content can be safely accessed multiple
times.

The `HttpStreamingResponse` class exposes a new attribute,
`streaming_content`. If you assign an iterator as streaming content, it
will not be consumed until the response is iterated. The streaming content
can be wrapped with a new generator (e.g. GZipMiddleware). The class
prevents accidental assignment to the `content` attribute, to avoid
confusing middleware that checks for a `content` attribute on response
objects.

Both classes use `HttpResponseBase.make_bytes()` to yield bytestrings when
iterated, or a single bytestring when `HttpResponse.content` is accessed.
This has simplified the implementation a little and removed some
duplication.

The `close()` method will now close all "file-like" content objects that
were assigned to `content` or `streaming_content`, even if they are later
replaced by different content. I think the old code would have left
unclosed objects in this case.

As an added bonus, you can now `write()` to both regular and streaming
responses, when iterable or non-iterable content is assigned. For
streaming responses, this just wraps the old streaming content with a new
generator that yields an additional chunk at the end. For regular
responses, assigned content is always normalised to a list now, so we can
always append additional content.

Middleware have been updated for compatibility with streaming responses.
They now check responses for `content` or `streaming_content` before
attempting to read or replace those attributes. New and 3rd party
middleware should follow this example.

We now have tests, too, and the full test suite passes here (2.7, OS X,
SQLite). If this approach is accepted, I will add docs as well.

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:32>

Django

unread,
Sep 28, 2012, 4:53:29 AM9/28/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 1
Has patch: 1 | Patch needs improvement: 1
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by mrmachine):

I've updated the branch again slightly, to use `HttpStreamingResponse` in
the `django.views.static.serve` view. This is MASSIVELY faster on my local
machine. It was slightly less of an improvement with `GZipMiddleware`
enabled, but still a huge improvement compared to regular responses.

I noticed when I was testing a site that hosts a 16MB iPad app. When I
tried to install the app on the iPad from my local dev/test site, it was
*extremely* slow. So I tried to download the file directly with Firefox on
the same machine. Firefox was pulling it down at 60-70KB/s from localhost.

When using `HttpStreamingResponse` instead, it's instantaneous. This is
not only useful or noticeable with large files, but in general all images
and other static content now loads MUCH faster with the dev server.
Previously I could literally see even small files (60KB images) being
progressively drawn to the page. Now it's all instant. Much nicer during
development, especially on image heavy sites.

I also found that `django.http.utils.conditional_content_removal()` was
accessing `response.content` directly. I've updated that to work with
regular and streaming responses. I couldn't find *any* tests for this
function (or any in `django.http.utils`, so I added a new regression tests
package, `http_utils`.

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:33>

Django

unread,
Sep 28, 2012, 5:05:23 PM9/28/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 1
Has patch: 1 | Patch needs improvement: 1
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by akaariai):

Quick skimming, and looks good. Could you create a pull request for easier
reviewing?

I will try to see if we can get this into 1.5. Unfortunately HTTP protocol
isn't my strongest area, so help is welcome for reviewing...

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:34>

Django

unread,
Sep 29, 2012, 12:41:58 AM9/29/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0

Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by mrmachine):

* needs_better_patch: 1 => 0
* needs_docs: 1 => 0


Comment:

I've added docs (patch should be complete, now), and opened a pull
request.

https://github.com/django/django/pull/407

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:35>

Django

unread,
Oct 19, 2012, 10:20:36 AM10/19/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by aaugustin):

There's a lot of history on this ticket; I reviewed as much as possible
but couldn't read everything. Here's the current status of my review.

I'm still at the design decision level; I haven't looked at the code yet.

'''Hard requirements'''

- The behavior of responses instantiated with a string must not change.
- ''Below, I'm only considering responses instantiated with an iterator
(often a generator).''
- Accessing `request.content` in a regular `HttpResponse` must exhaust the
iterator and switch to a string-based response, which can be repeatedly
accessed.

'''Requirements'''

- Discussions on CSRF middleware are moot for the reason explained by Luke
in comment 20.
- Discussions on the `Content-Length` header can be ignored; it's optional
and should be omitted if it can't be computed efficiently.
- Discussions about conditionally disabling response middleware are
outside of the scope of this ticket.
- I'm -1 on `HttpResponse.disabled_middleware` and all similar ideas,
especially those requiring new settings — it's already way too difficult
to figure out how to make middleware run in a meaningful order.

'''Personal opinions'''

- I consider streaming a large CSV export as a valid use case. Ultimately
we should be able to stream from the database to the end user (I know this
isn't possible with `django.db` right now). The two other uses cases
brought up in the [https://groups.google.com/forum/#!topic/django-
developers/RrNPfuJxnlM/discussion latest discussion] are also interesting.
- I have absolutely no interest in fixing `GzipMiddleware`. I'd rather
deprecate it, on the grounds that it's
[http://www.python.org/dev/peps/pep-3333/#other-http-features prohibited
by PEP 3333] (and everyone should be using WSGI these days).

'''Questions'''

- One of the goals is to switch some parts of Django (e.g.
`django.views.static.serve`) to streaming by default. This will be
backward incompatible if the `StreamingHttpResponse` API isn't (at least
temporarily) a superset of the `HttpResponse` API — specifically if
accessing `StreamingHttpResponse.content` raises an exception.
- In the long run accessing `StreamingHttpResponse.content` should raise
an exception, in order to guarantee the streaming behavior isn't
accidentally broken by middleware.
- This is why `StreamingHttpResponse` provides the content in a
different attribute: `streaming_content`
- We could use a deprecation path to solve both issues at the same time:
`StreamingHttpResponse.content` would work initially, but be deprecated
and removed in 1.7.
- Middleware must be able to provide different implementation for regular
and streamed responses.
- The suggested idiom is `if hasattr(response, 'streaming_content'):
...`. This is more pythonic than the alternative `if isinstance(response,
StreamingHttpResponse): ...` but it still looks wrong.
- I'm considering adding `get_streaming_response` to the response
middleware API. If a response is a streaming response (which would be
determined by duck-typing: `hasattr(response, 'streaming_content')`) and
the middleware provides a `get_streaming_response` method, that method
would be called. Otherwise `get_response` would be called. Middleware
authors who want to support streaming responses will have to implement
that new method.

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:36>

Django

unread,
Oct 19, 2012, 12:38:47 PM10/19/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by mrmachine):

I assume you meant `process_response()` and `process_streaming_response()`
methods for middleware classes? I think this would be much nicer to work
with, and easier to explain to middleware authors than asking them to do
`hasattr(response, 'streaming_content')`. But it would mean a little more
work to refactor some of the bundled middleware classes.

One thing you didn't mention is automatically closing file-like objects
passed as content to regular and streaming responses.

For streaming responses, Django should close them automatically, because
users won't be able to. I think that we should do the same for regular
responses as well for consistency and to avoid accidentally leaving files
open.

I'm not sure if this would be a backwards incompatible change, or if it
would need a deprecation path, or if a deprecation path would be possible?
I guess this could be implemented separately, if it decided to go ahead.

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:37>

Django

unread,
Oct 19, 2012, 12:57:49 PM10/19/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by aaugustin):

Yes I meant `process_response`.

I didn't look at the closing issue yet, but it shouldn't be a problem.

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:38>

Django

unread,
Oct 20, 2012, 5:51:10 AM10/20/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by aaugustin):

'''Design decisions'''

- I'm not going to refactor `HttpResponse` right now; there's another
ticket about that.
- I'm going to introduce a subclass of `StreamingHttpResponse` that
provides a deprecated content attribute, and use that class in the static
serve view, to preserve backwards compatibility.
- I'm going to introduce a `response.streaming` boolean flag to avoid the
`hasattr(response, 'streaming_content')` pattern.
- I'm not going to change the middleware API.

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:39>

Django

unread,
Oct 20, 2012, 9:29:02 AM10/20/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by aaugustin):

I've finished working on a patch. Barring unexpected objections, I'll
commit it soon.

To keep things manageable, I have made the minimum changes required to
support streaming responses and enable them in the static serve view. I've
taken care not to change the behavior of `HttpResponse` in any way. (I've
saved that for #18796.)

I've kept the tests written by mrmachine (with the exception of those that
tested changes to `HttpResponse`). I've rewritten the code to make fewer
changes, because I didn't trust myself enough to push such a large and
sensitive patch in a single commit.

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:40>

Django

unread,
Oct 20, 2012, 11:31:00 AM10/20/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: new
Component: Core (Other) | Version: master
Severity: Normal | Resolution:
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by aaugustin):

mrmachine's patch contained code to call `close()` on the response object
when the generator terminates.

This isn't necessary, because the WSGI server is already required to call
that method when the response is finished.

I removed it.

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:41>

Django

unread,
Oct 20, 2012, 2:05:42 PM10/20/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: closed

Component: Core (Other) | Version: master
Severity: Normal | Resolution: fixed

Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by Aymeric Augustin <aymeric.augustin@…>):

* status: new => closed
* resolution: => fixed


Comment:

In [changeset:"4b27813198ae31892f1159d437e492f7745761a0"]:
{{{
#!CommitTicketReference repository=""
revision="4b27813198ae31892f1159d437e492f7745761a0"
Fixed #7581 -- Added streaming responses.

Thanks mrmachine and everyone else involved on this long-standing ticket.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:42>

Django

unread,
Oct 21, 2012, 4:43:36 AM10/21/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: closed
Component: Core (Other) | Version: master
Severity: Normal | Resolution: fixed
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by mrmachine):

Excellent. Very happy to see this committed. I have a couple of comments
that may or may not be important.

1. We're not checking if the `close` attribute on assigned content is
callable before adding it to `_closable_objects`? Probably an edge case
not worth worrying about, though.

2. Should we pop closable objects off the list before closing them (or
just empty the list after closing all items), so that multiple calls to
`close()` don't re-attempt to close objects already closed? Graham's blog
seems to indicate that multiple WSGI middleware may each call `close()` on
the response? http://blog.dscpl.com.au/2012/10/obligations-for-calling-
close-on.html

3. Returning `self._iterator` directly in
`HttpStreamingResponse.streaming_content`, instead of a new iterator that
yields bytestrings means that middleware authors can't expect consistent
bytestring chunks when wrapping streaming_content in a new iterator? They
may have to type check or coerce each chunk (potentially reproducing
something like `make_bytes()`? For comparison, `HttpResponse.content`
always returns a bytestring.

4. There may still be a case for consuming iterators assigned to
`HttpResponse.content` immediately, rather than on first access. Maybe
after a deprecation period, though? If `HttpResponse._container` is an
iterator, the response can't be pickled by cache middleware. It would also
allow additional content to be written to the response. This is probably
an issue for another ticket, anyway.

5. We should add an `ETag` header in the `serve()` view, if etags are
enabled in settings. ETags are normally added by `CommonMiddleware`, but
middleware can't calculate an ETag for a streaming response, and we do
have access to the raw file in `serve()` so it should be done there (like
we do for `Last-Modified` and `Content-Length`).

6. The note in the docs about iterators being closed after the response is
iterated as a point of difference from regular responses is not true
anymore. We rely on WSGI to call `close()` on both regular and streaming
responses.

7. The `http_utils` test wasn't included?

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:43>

Django

unread,
Oct 21, 2012, 5:10:09 AM10/21/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: closed
Component: Core (Other) | Version: master
Severity: Normal | Resolution: fixed
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by aaugustin):

Replying to [comment:43 mrmachine]:


> 1. We're not checking if the `close` attribute on assigned content is
callable before adding it to `_closable_objects`? Probably an edge case
not worth worrying about, though.

Yes, primarily because `callable` is gone in Python 3.

If someone has a custom file-like / iterator object that has a `close`
attribute that isn't callable, that'll teach him a bit of API design
principles :)

> 2. Should we pop closable objects off the list before closing them (or
just empty the list after closing all items), so that multiple calls to
`close()` don't re-attempt to close objects already closed? Graham's blog
seems to indicate that multiple WSGI middleware may each call `close()` on
the response? http://blog.dscpl.com.au/2012/10/obligations-for-calling-
close-on.html

I'm not sure which part of that blog post you're referring to. This
sentence indicates that `close()` should be called once:

> The intent of that statement, at least from the perspective of the WSGI
server, is that close() only be called once all content has been consumed
from the iterable and that content has been sent to the client.

Furthermore, `close()` can usually be called on objects that are already
closed; it just does nothing.

> 3. Returning `self._iterator` directly in
`HttpStreamingResponse.streaming_content`, instead of a new iterator that
yields bytestrings means that middleware authors can't expect consistent
bytestring chunks when wrapping streaming_content in a new iterator? They
may have to type check or coerce each chunk (potentially reproducing
something like `make_bytes()`? For comparison, `HttpResponse.content`
always returns a bytestring.

That's what #18796 is about: normalizing the conversion of the content to
`bytes` (which is moot because content is already `bytes` in 99.99% of
responses).

I'll copy your remark over there.

> 4. There may still be a case for consuming iterators assigned to
`HttpResponse.content` immediately, rather than on first access. Maybe
after a deprecation period, though? If `HttpResponse._container` is an
iterator, the response can't be pickled by cache middleware.

If that's true, it's a bug of the cache middleware, which can easily be
fixed by making it access `response.content`. If that bug wasn't reported
yet, could you create a new ticket?

> It would also allow additional content to be written to the response.
This is probably an issue for another ticket, anyway.

Allowing writes to a regular response instantiated with an iterator is
#6527. I've uploaded a patch extracted from your work over there.

> 5. We should add an `ETag` header in the `serve()` view, if etags are
enabled in settings. ETags are normally added by `CommonMiddleware`, but
middleware can't calculate an ETag for a streaming response, and we do
have access to the raw file in `serve()` so it should be done there (like
we do for `Last-Modified` and `Content-Length`).

This never came into the discussion until now, could you create a new
ticket for this feature request?

> 6. The note in the docs about iterators being closed after the response
is iterated as a point of difference from regular responses is not true
anymore. We rely on WSGI to call `close()` on both regular and streaming
responses.

I'll fix that.

> 7. The `http_utils` test wasn't included?

Oops, my git-fu must have failed me. I'll commet them separately.

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:44>

Django

unread,
Oct 21, 2012, 4:45:47 PM10/21/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: closed
Component: Core (Other) | Version: master
Severity: Normal | Resolution: fixed
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by Aymeric Augustin <aymeric.augustin@…>):

In [changeset:"5e629a015e96a5564a0b0a273e1374b9651ce8e9"]:
{{{
#!CommitTicketReference repository=""
revision="5e629a015e96a5564a0b0a273e1374b9651ce8e9"
Added tests for conditional_content_removal.

Refs #7581. Thanks mrmachine.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:45>

Django

unread,
Dec 24, 2012, 2:30:38 PM12/24/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: closed
Component: Core (Other) | Version: master
Severity: Normal | Resolution: fixed
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by Aymeric Augustin <aymeric.augustin@…>):

In [changeset:"cd914175c8209c5f366e87d94f8f6d050347757d"]:
{{{
#!CommitTicketReference repository=""
revision="cd914175c8209c5f366e87d94f8f6d050347757d"
[1.5.x] Prevented caching of streaming responses.

The test introduced in 4b278131 accidentally passed because of a
limitation of Python < 3.3.

Refs #17758, #7581.

Backport of 1c8be95 from master.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:46>

Django

unread,
Dec 24, 2012, 2:30:39 PM12/24/12
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: closed
Component: Core (Other) | Version: master
Severity: Normal | Resolution: fixed
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by Aymeric Augustin <aymeric.augustin@…>):

In [changeset:"1c8be95a864540d416602577d1aa03d58ba33168"]:
{{{
#!CommitTicketReference repository=""
revision="1c8be95a864540d416602577d1aa03d58ba33168"


Prevented caching of streaming responses.

The test introduced in 4b278131 accidentally passed because of a
limitation of Python < 3.3.

Refs #17758, #7581.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:47>

Django

unread,
May 8, 2013, 12:56:58 AM5/8/13
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: closed
Component: Core (Other) | Version: master
Severity: Normal | Resolution: fixed
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by anonymous):

It should be noted that the Apache2 webserver by default (at least on
Ubuntu Linux) enables mod_deflate that suffers from the same issue and
thus effectively prevents streaming. With mod_deflate disabled, streaming
with Django works very nice.

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:48>

Django

unread,
Jun 29, 2013, 12:50:02 PM6/29/13
to django-...@googlegroups.com
#7581: Middleware accessing HttpResponse.content breaks streaming HttpResponse
objects.
-------------------------------------+-------------------------------------
Reporter: mrmachine | Owner: ccahoon
Type: New feature | Status: closed
Component: Core (Other) | Version: master
Severity: Normal | Resolution: fixed
Keywords: stream HttpResponse | Triage Stage: Accepted
Content-Length | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by Aymeric Augustin <aymeric.augustin@…>):

In [changeset:"8b9b8d3bda09eb1b447631182d06c6c5e51425f6"]:
{{{
#!CommitTicketReference repository=""
revision="8b9b8d3bda09eb1b447631182d06c6c5e51425f6"
Removed compatibility code for streaming responses.

This code provided a deprecation path for old-style streaming responses.

Refs #6527, #7581.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/7581#comment:49>

Reply all
Reply to author
Forward
0 new messages