[Django] #17834: ETag generated from empty content can break http caching


Django

Mar 5, 2012, 10:48:27 AM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+--------------------
Reporter: paulegan | Owner: nobody
Type: Bug | Status: new
Component: HTTP handling | Version: 1.3
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------
The [source:/django/trunk/django/utils/cache.py?rev=17286#L100
patch_response_headers] function will set an ETag header using an md5 hash
of the response content. Many responses like redirects can have an empty
body and this results in the same ETag for each such response. The
response headers may vary but an intermediate cache can serve an incorrect
response because the ETag is not unique.

A simple solution is to not set ETag if the content is empty - see
attached patch.

{{{#!python
>>> from django.http import HttpResponseRedirect
>>> from django.utils.cache import patch_response_headers
>>> r1 = HttpResponseRedirect('/u1')
>>> patch_response_headers(r1)
>>> r2 = HttpResponseRedirect('/u2')
>>> patch_response_headers(r2)
>>> r1['ETag'] == r2['ETag']
True
}}}
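
The collision can also be reproduced without Django. The following is a minimal sketch, assuming (as described above) that the ETag is the quoted MD5 hex digest of the response body alone; `etag_for` and the dictionaries are hypothetical stand-ins, not Django's API.

```python
import hashlib


def etag_for(body: bytes) -> str:
    """Hypothetical helper: quoted MD5 hex digest of the body only."""
    return '"%s"' % hashlib.md5(body).hexdigest()


# Two redirects to different locations both carry an empty body,
# so a body-only digest cannot tell them apart.
redirect_1 = {"Location": "/u1", "body": b""}
redirect_2 = {"Location": "/u2", "body": b""}

same = etag_for(redirect_1["body"]) == etag_for(redirect_2["body"])
print(same)
```

Every empty body hashes to the MD5 of the empty string, regardless of the `Location` header.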

--
Ticket URL: <https://code.djangoproject.com/ticket/17834>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Mar 5, 2012, 11:22:07 AM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------
Reporter: paulegan | Owner: nobody
Type: Bug | Status: new
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------
Changes (by claudep):

* needs_docs: => 0
* has_patch: 0 => 1
* needs_better_patch: => 0
* needs_tests: => 1
* stage: Unreviewed => Accepted


--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:1>

Django

Mar 5, 2012, 1:43:26 PM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------
Reporter: paulegan | Owner: nobody
Type: Bug | Status: closed
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution: wontfix
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------
Changes (by claudep):

* status: new => closed
* resolution: => wontfix


Comment:

I dug a little more into the code. In Django's own code,
patch_response_headers is never called for responses whose status code is
not in the 200 range. If you want to call this function from your own
code, I think it's your responsibility not to call it on redirect
responses.

--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:2>

Django

Mar 5, 2012, 2:09:28 PM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------
Reporter: paulegan | Owner: nobody
Type: Bug | Status: closed
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution: wontfix
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------

Comment (by paulegan):

True, internally it's currently used only in UpdateCacheMiddleware, but
even there a response with status 200 can still have an empty body.

--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:3>

Django

Mar 5, 2012, 3:47:54 PM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------
Reporter: paulegan | Owner: nobody
Type: Bug | Status: closed
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution: wontfix
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------

Comment (by claudep):

Yes, but empty or not, two identical responses get the same ETag. Is that
a problem for you?

--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:4>

Django

Mar 6, 2012, 2:48:01 AM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------
Reporter: paulegan | Owner: nobody
Type: Bug | Status: closed
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution: wontfix
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------

Comment (by paulegan):

The problem is that the responses are not necessarily identical. The
response headers are often as important as the content itself. The http
spec is clear about this -
http://www.w3.org/Protocols/rfc2616/rfc2616-sec7.html - and goes on to say
"in order to be legal, a strong entity tag MUST change whenever the
associated entity value changes in any way".

Ideally Django ought to compute the ETag from the entity body and entity
headers but in practice the body alone is usually sufficient for uniquely
identifying the entity. Of course that's not the case when the body is
empty. If explicitly coding an ETag for such a response, the values from
the appropriate entity headers would be included when generating the tag.
For a function like patch_response_headers, where the addition of the ETag
header is implicit, I think it is simpler & safer to not compute a tag
when the body is empty (and the headers dominate).
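
The proposal in the paragraph above can be sketched roughly as follows. This is only an illustration under stated assumptions: `patch_etag` is a hypothetical stand-in for the ETag step of patch_response_headers, headers are modeled as a plain dict, and the tag is assumed to be a quoted MD5 hex digest of the body.

```python
import hashlib


def patch_etag(headers: dict, body: bytes) -> None:
    """Hypothetical stand-in for the ETag step: skip empty bodies."""
    if not body:
        return  # nothing to hash; leave the response without an ETag
    # Do not overwrite an ETag that was set explicitly.
    headers.setdefault("ETag", '"%s"' % hashlib.md5(body).hexdigest())


redirect_headers = {"Location": "/u1"}
patch_etag(redirect_headers, b"")

page_headers = {"Content-Type": "text/html"}
patch_etag(page_headers, b"<p>hello</p>")
```

With this behavior an empty redirect simply carries no implicit ETag, so two redirects can no longer collide on the same tag.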


I should note that this ticket isn't simple nit-picking but based on a
real-world issue found in a production environment.

--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:5>

Django

Mar 6, 2012, 3:34:22 AM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------
Reporter: paulegan | Owner: nobody
Type: Bug | Status: closed
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution: wontfix
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------

Comment (by claudep):

I guess that generating the ETag based on the content will match 98% of
use cases. Now for the 2% remaining, you should be able to take advantage
of view decorators to customize the generation of ETags
(https://docs.djangoproject.com/en/dev/topics/conditional-view-processing/).
Did you try to use them?

Replying to [comment:5 paulegan]:

> I should note that this ticket isn't simple nit-picking but based on a
> real-world issue found in a production environment.

Sure, I don't blame you for discussing it :-) But as the use case of the
!HttpResponseRedirect in the description is not entirely valid IMHO, maybe
you could provide us with more details about your real use case.

--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:6>

Django

Mar 6, 2012, 3:59:42 AM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------
Reporter: paulegan | Owner: nobody
Type: Bug | Status: closed
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution: wontfix
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------

Comment (by paulegan):

The specific issue I came across involved a redirect response and did not
go through the cache middleware. I don't entirely agree that this
immediately invalidates the bug, however, since many projects use
patch_response_headers directly (a Google search throws up quite a few
examples and there are sure to be many more private projects, as in my
case).

Setting that argument aside, my main point is that a user of
patch_response_headers may unexpectedly create responses that don't meet
the HTTP requirements and run into caching issues. It's certainly an
unusual case that probably won't affect too many projects and can
definitely be worked around with other methods (an explicit ETag or the
conditional decorators). However, when the entity body is empty, it seems
to me to be safer to avoid the potential for ETag collision.

I guess it comes down to a balance between those users of
patch_response_headers who expect an ETag even on empty responses and
those who expect the function to do the right thing (not break caching).
:-)

--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:7>

Django

Mar 6, 2012, 10:19:04 AM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------
Reporter: paulegan | Owner: nobody
Type: Bug | Status: reopened
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Design decision needed
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------
Changes (by claudep):

* status: closed => reopened
* resolution: wontfix =>
* stage: Accepted => Design decision needed


Comment:

At this point, I see 3 ways to solve this:

1. Consider it a corner case, and let programmers handle
patch_response_headers carefully for responses without content.
2. Change the ETag computation to always include headers (generate the
digest from str(response) instead of str(response.content)).
3. Special-case the ETag computation to use headers only if
response.content is empty (OP proposal).

Note also that I have refactored ETag computing in a patch on #14722.
Marking as DDN to get a core committer opinion.
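
Options 2 and 3 above could look roughly like the sketch below. It makes several assumptions: headers are modeled as a plain dict and serialized in sorted order so the digest is deterministic, the tag is a quoted MD5 hex digest, and all function names are hypothetical. It is not the patch attached to this ticket.

```python
import hashlib


def digest(*parts: bytes) -> str:
    """Quoted MD5 hex digest over the concatenated parts."""
    h = hashlib.md5()
    for part in parts:
        h.update(part)
    return '"%s"' % h.hexdigest()


def serialize_headers(headers: dict) -> bytes:
    # Assumption: sort by name so equal header sets hash identically.
    lines = (f"{name}: {value}\r\n" for name, value in sorted(headers.items()))
    return "".join(lines).encode("latin-1")


def etag_option_2(headers: dict, body: bytes) -> str:
    """Option 2: always hash headers together with the body."""
    return digest(serialize_headers(headers), b"\r\n", body)


def etag_option_3(headers: dict, body: bytes) -> str:
    """Option 3: hash the body, falling back to headers when it is empty."""
    if body:
        return digest(body)
    return digest(serialize_headers(headers))
```

Under both variants two empty redirects with different `Location` headers get different tags. They differ for non-empty responses: option 2 changes the tag whenever any header changes, while option 3 keeps the body-only behavior.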

--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:8>

Django

Jun 5, 2012, 7:55:37 AM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------
Reporter: paulegan | Owner: nobody
Type: Bug | Status: reopened
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Design decision needed
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------
Changes (by rene.puls@…):

* cc: rene.puls@… (added)


Comment:

I just stumbled over the same problem and would like to add another
possible solution.

The RFC says this about redirect responses (e.g. section 10.3.2): "Unless
the request method was HEAD, the entity of the response SHOULD contain a
short hypertext note with a hyperlink to the new URI(s)." While this is
not a requirement, it would help in this scenario, because the response
body (and thus the ETag) would then depend on the redirect URL.
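
As a rough illustration of this idea, the sketch below builds a short hypertext note for a redirect and hashes it. The wording of the note, the helper names, and the quoted MD5 tag format are all assumptions made for the example; nothing here is what Django emits.

```python
import hashlib
from html import escape


def redirect_body(url: str) -> bytes:
    """Hypothetical short hypertext note pointing at the new URI."""
    link = escape(url, quote=True)
    note = f'<p>This resource has moved to <a href="{link}">{link}</a>.</p>'
    return note.encode("utf-8")


def etag_for(body: bytes) -> str:
    """Quoted MD5 hex digest of the body only."""
    return '"%s"' % hashlib.md5(body).hexdigest()


etag_1 = etag_for(redirect_body("/u1"))
etag_2 = etag_for(redirect_body("/u2"))
print(etag_1 != etag_2)
```

Because the target URL appears in the body, a body-only digest is enough to distinguish the two redirects.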

--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:9>

Django

Mar 23, 2013, 3:40:29 AM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------
Reporter: paulegan | Owner: nobody
Type: Bug | Status: new
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------
Changes (by aaugustin):

* stage: Design decision needed => Accepted


Comment:

It seems to me that the headers could be included in the ETag calculation
without any harm.

--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:11>

Django

Mar 23, 2013, 9:26:30 AM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------
Reporter: paulegan | Owner: nobody
Type: Bug | Status: new
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------

Comment (by aaugustin):

#12789 was a duplicate.

--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:12>

Django

Nov 7, 2015, 11:11:24 AM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+-----------------------------------------
Reporter: paulegan | Owner: dwightgunning
Type: Bug | Status: assigned
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+-----------------------------------------
Changes (by dwightgunning):

* owner: nobody => dwightgunning
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:13>

Django

Oct 14, 2016, 1:02:45 AM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------------
Reporter: Paul Egan | Owner: Dwight Gunning
Type: Bug | Status: assigned
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------------
Changes (by Kevin Christopher Henry):

* cc: k@… (added)


Comment:

For reference, here's what the updated
[https://tools.ietf.org/html/rfc7232#section-2.1 RFC 7232] specification
has to say on the matter:

> A "strong validator" [like the ETags we compute] is representation
> metadata that changes value whenever a change occurs to the representation
> data that would be observable in the payload body of a 200 (OK) response
> to GET. A strong validator might change for reasons other than a change to
> the representation data, such as when a semantically significant part of
> the representation metadata is changed (e.g., Content-Type), but it is in
> the best interests of the origin server to only change the value when it
> is necessary to invalidate the stored responses held by remote caches and
> authoring tools.

So, the specification does not require the `ETag` to change unless the
response body itself changes. It //could// change, if we know that a
change to the headers invalidates the response. Since the framework can't,
in general, know that, and since the specification cautions against
invalidating the response when it's unnecessary, my opinion is that the
status quo of basing the `ETag` only on the response body is fine.

(If we were to try and take the headers into account, though, it's worth
noting which headers they should be.
[https://tools.ietf.org/html/rfc7231#section-3.1 RFC 7231] defines the
representation metadata as the `Content-Type`, `Content-Encoding`,
`Content-Language`, and `Content-Location` headers.)
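
If the headers were taken into account, restricting the digest to the four representation metadata headers listed above might look like the following sketch. The function name, the dict-based header model, and the quoted MD5 tag format are assumptions for illustration only.

```python
import hashlib

# The representation metadata headers named above.
REPRESENTATION_HEADERS = (
    "Content-Type",
    "Content-Encoding",
    "Content-Language",
    "Content-Location",
)


def etag_with_metadata(headers: dict, body: bytes) -> str:
    """Hypothetical ETag over the body plus representation metadata only."""
    h = hashlib.md5()
    for name in REPRESENTATION_HEADERS:
        # A missing header is hashed as an empty value.
        value = headers.get(name, "")
        h.update(f"{name}: {value}\n".encode("utf-8"))
    h.update(body)
    return '"%s"' % h.hexdigest()


html = etag_with_metadata({"Content-Type": "text/html"}, b"hi")
text = etag_with_metadata({"Content-Type": "text/plain"}, b"hi")
print(html != text)
```

Note that `Location` is not in this list, so under this sketch two empty redirects to different URLs would still share a tag.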

--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:14>

Django

Apr 27, 2025, 7:22:35 AM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------
Reporter: Paul Egan | Owner: nobody
Type: Bug | Status: new
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------
Comment (by Jitesh Nair):

Hi,

The original issue is not reproducible in the latest version 6.

Also, as of the 3.1 release, an ETag is no longer added to an empty
response:
https://docs.djangoproject.com/en/5.2/releases/3.1/#:~:text=ETag%20header%20to%20responses%20with%20an%20empty

Corresponding ticket: https://code.djangoproject.com/ticket/30812

It's a very old issue. Unless someone can reproduce it, I propose to
close this.

Regards,
Jitesh
--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:15>

Django

Apr 29, 2025, 3:44:35 AM
to django-...@googlegroups.com
#17834: ETag generated from empty content can break http caching
-------------------------------+------------------------------------
Reporter: Paul Egan | Owner: nobody
Type: Bug | Status: closed
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------
Changes (by Sarah Boyce):

* resolution: => fixed
* status: new => closed

Comment:

Thank you, Jitesh.
Agreed that this is fixed by ee6b17187fbf19d498c16bd46ec6dd6aaf86f453 and
the tests there should be sufficient to prevent this being re-introduced.
--
Ticket URL: <https://code.djangoproject.com/ticket/17834#comment:16>