Proposal: Track used headers and use that information to automatically populate Vary header.


Linus Lewandowski

Jan 25, 2019, 10:03:32 AM
to Django developers (Contributions to Django itself)
Right now, it's hard to report the Vary header correctly. Headers might get accessed in many different places, such as middleware or subroutines (which can't use patch_vary_headers because they don't have access to the response object), and all of those accesses should be reflected in the Vary header, or something might get cached incorrectly.

However, thanks to the newly added request.headers property (see https://code.djangoproject.com/ticket/20147 and https://github.com/django/django/commit/4fc35a9c3efdc9154efce28cb23cb84f8834517e), we now have a single place that is used to access request headers. We can track which ones were accessed, and then set the Vary header automatically, for example in a middleware.

What do you think about:
1) adding some code to track accessed headers in request.headers,
2) adding a new middleware (or expanding an existing one), that sets the Vary header based on 1),
3) deprecating the patch_vary_headers function and the vary_on_headers/vary_on_cookie decorators, and recommending request.headers instead?
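
Step 1 could be sketched roughly as follows. This is a minimal, framework-free illustration and not Django's actual request.headers implementation: the TrackingHeaders class and its accessed attribute are hypothetical names, and a plain dict stands in for the real request headers.

```python
from collections.abc import Mapping


class TrackingHeaders(Mapping):
    """Hypothetical read-only header mapping that records every lookup."""

    def __init__(self, headers):
        # Header names are case-insensitive, so store them lower-cased.
        self._headers = {name.lower(): value for name, value in headers.items()}
        self.accessed = set()

    def __getitem__(self, name):
        # Record the lookup even if the header is absent: the response
        # may still depend on whether the client sent it.
        self.accessed.add(name.lower())
        return self._headers[name.lower()]

    def __iter__(self):
        return iter(self._headers)

    def __len__(self):
        return len(self._headers)


headers = TrackingHeaders({"Accept-Language": "pl", "User-Agent": "curl"})
headers["accept-language"]   # recorded
headers.get("X-Missing")     # Mapping.get() goes through __getitem__, so also recorded
print(sorted(headers.accessed))  # ['accept-language', 'x-missing']
```

A middleware implementing step 2 would then read the accessed set after the view has run and merge it into the response's Vary header.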

Thanks,
Linus

PS. This is a follow-up to ticket https://code.djangoproject.com/ticket/28533.

Adam Johnson

Jan 25, 2019, 10:56:06 AM
to django-d...@googlegroups.com
Accessing the value of a header doesn't necessarily mean that the response varies based upon it; for example, it might simply be accessed for storage in informational logs. Additionally, Request.headers is not the only way to access the values now; Request.META has not been removed. I don't believe any of Django's internal header lookups have been changed to use Request.headers, and it's unlikely that third-party packages or applications will ever all be moved.

Anyway I'm pretty sure you can write such a middleware yourself, replacing Request.headers with a proxy object (and maybe Request.META too), then adding 'Vary' on the way out based upon accessed keys, at least as a proof of concept. If it gets some usage that would prove it could be valuable to add to Django itself.
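
A proof of concept along those lines might look like the sketch below. It assumes that request.headers can be reassigned on the request object, and it follows the usual get_response middleware calling convention. RecordingProxy and AutoVaryMiddleware are hypothetical names. To stay self-contained it merges the Vary value by hand and uses a plain dict as a stand-in response; real Django code would more likely call patch_vary_headers at that point. It does not cover lookups made through Request.META.

```python
from types import SimpleNamespace


class RecordingProxy:
    """Hypothetical wrapper that notes which header names are looked up."""

    def __init__(self, wrapped, seen):
        self._wrapped = wrapped
        self._seen = seen

    def __getitem__(self, name):
        self._seen.add(name.lower())
        return self._wrapped[name]

    def __contains__(self, name):
        self._seen.add(name.lower())
        return name in self._wrapped

    def get(self, name, default=None):
        self._seen.add(name.lower())
        return self._wrapped.get(name, default)

    def __getattr__(self, attr):
        # Anything else (items(), keys(), ...) is passed through untracked.
        return getattr(self._wrapped, attr)


class AutoVaryMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        seen = set()
        request.headers = RecordingProxy(request.headers, seen)
        response = self.get_response(request)
        if seen:
            # Keep whatever Vary the view already set, then append new names.
            existing = [h.strip() for h in response.get("Vary", "").split(",") if h.strip()]
            known = {h.lower() for h in existing}
            response["Vary"] = ", ".join(existing + sorted(seen - known))
        return response


def view(request):
    lang = request.headers.get("Accept-Language", "en")
    return {"Content-Language": lang}  # a dict stands in for an HttpResponse


request = SimpleNamespace(headers={"Accept-Language": "pl"})
response = AutoVaryMiddleware(view)(request)
print(response["Vary"])  # accept-language
```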



--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/81484ff1-552e-4103-9fa8-8a3348512b84%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
Adam

Linus Lewandowski

Jan 25, 2019, 12:39:43 PM
to django-d...@googlegroups.com
> Accessing the value of a header doesn't necessarily mean that the response varies based upon it; for example, it might simply be accessed for storage in informational logs.

True; we would probably need a way to access headers without marking them as used, maybe something like request.headers.get(XYZ, vary_response=False).

However, right now people commonly forget to patch Vary, which leads to problems with caching. With this approach, that won't happen again; but in some cases we might make caching less efficient than it could be, because somebody used request.headers[XYZ] and not request.headers.get(XYZ, vary_response=False). Given these two cases, I feel that working correctly is more important than perfectly efficient caching, but opinions here may differ.
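
The opt-out could be sketched like this. Only the vary_response keyword comes from the suggestion above; the VaryAwareHeaders class and its vary_on attribute are hypothetical, and nothing like this exists in Django's request.headers today.

```python
class VaryAwareHeaders:
    """Hypothetical header mapping where tracking is on unless opted out."""

    def __init__(self, headers):
        self._headers = {name.lower(): value for name, value in headers.items()}
        self.vary_on = set()

    def __getitem__(self, name):
        # Subscript access has no way to opt out, so it is always tracked.
        self.vary_on.add(name.lower())
        return self._headers[name.lower()]

    def get(self, name, default=None, vary_response=True):
        if vary_response:
            self.vary_on.add(name.lower())
        return self._headers.get(name.lower(), default)


headers = VaryAwareHeaders({"Accept-Language": "pl", "User-Agent": "curl"})

# Read only for logging: the response does not depend on it.
user_agent = headers.get("User-Agent", vary_response=False)

# Read to choose the response language: tracked by default.
language = headers.get("Accept-Language", "en")

print(sorted(headers.vary_on))  # ['accept-language']
```

The default of vary_response=True is what makes the safe behaviour automatic: forgetting the keyword costs some cache efficiency, never correctness.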

> Additionally, Request.headers is not the only way to access the values now; Request.META has not been removed. I don't believe any of Django's internal header lookups have been changed to use Request.headers, and it's unlikely that third-party packages or applications will ever all be moved.

It's not, but we can assume that all existing code uses patch_vary_headers correctly, so we don't need to track it here. This is mostly about new code that's going to be written with request.headers, so that it works correctly without anyone having to worry about the Vary header.


> Anyway I'm pretty sure you can write such a middleware yourself, replacing Request.headers with a proxy object (and maybe Request.META too), then adding 'Vary' on the way out based upon accessed keys, at least as a proof of concept. If it gets some usage that would prove it could be valuable to add to Django itself.

I guess it isn't something that people are going to be looking for; it's always easier to add another patch_vary_headers invocation than to add a new package, so the usage won't be high. But I'll probably do so, at least for my own usage. I'm already using it in one project, and I need it in others.

--
Regards:
Linus Lewandowski
Senior Python Developer

Florian Apolloner

Jan 25, 2019, 4:53:04 PM
to Django developers (Contributions to Django itself)
While reading this thread, https://code.djangoproject.com/ticket/19649 came to mind. I think most (if not all) of what is discussed there is basically the same issue, even though that ticket only concerns itself with the Cookie header.

I do not agree that request.headers is __now__ the single place for accessing headers; that still seems to be request.META. So in that sense, any argument for fixing this should probably not rely on request.headers, at least not as long as we still have headers in request.META.

Cheers,
Florian

Dan Davis

Jan 25, 2019, 8:16:44 PM
to django-d...@googlegroups.com
I would like this. Django is a framework with batteries, and my development group tells me "Django is too hard". This is because they don't understand HTTP; mostly they understand HTML/CSS and SQL, with maybe some easy jQuery level of SQL. So, this kind of solution would fit well for my developers. The young engineers all love that we switched to Django, however, so any such solution should be opt-out.

James Bennett

Jan 25, 2019, 8:27:33 PM
to django-d...@googlegroups.com
On Fri, Jan 25, 2019 at 9:39 AM Linus Lewandowski <linus.le...@netguru.co> wrote:
> True; we would probably need a way to access headers without marking them as used, maybe something like request.headers.get(XYZ, vary_response=False).
>
> However, right now people commonly forget to patch Vary, which leads to problems with caching. With this approach, that won't happen again; but in some cases we might make caching less efficient than it could be, because somebody used request.headers[XYZ] and not request.headers.get(XYZ, vary_response=False). Given these two cases, I feel that working correctly is more important than perfectly efficient caching, but opinions here may differ.

My immediate thought here is: if people already aren't taking the time to patch using the existing mechanism, they also aren't going to take the time to opt out of patching. So what you're proposing is effectively still "any accessed header patches Vary". And that seems like it's as bad as the problem it's trying to solve.

Dan Davis

Jan 25, 2019, 9:02:15 PM
to django-d...@googlegroups.com
On Fri, Jan 25, 2019 at 8:27 PM James Bennett <ubern...@gmail.com> wrote:
> My immediate thought here is: if people already aren't taking the time to patch using the existing mechanism, they also aren't going to take the time to opt out of patching. So what you're proposing is effectively still "any accessed header patches Vary". And that seems like it's as bad as the problem it's trying to solve.

The people who do take the time to patch the Vary header can probably adapt easily to the new mechanism; it's the people who don't take the time to do this that we can help with such a feature.