If I understand correctly, this code is ran for each request. Would we
not get more performance by setting
{{{
redirect_url = ''
if settings.PREPEND_WWW:
host = request.get_host()
if host and not host.startswith('www.')
redirect_url = ('%s://www.%s' % (request.scheme, host))
}}}
as this evaluates most of the code only when settings.PREPEND_WWW is true.
Again, maybe I completely misunderstand how python code optimization
works. But I hope it helps
--
Ticket URL: <https://code.djangoproject.com/ticket/27575>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Old description:
> please ignore all this if its stupid, but I was trying to create a 1.10
> middleware and learn from existing code. Looking at
> django.middleware.common.CommonMiddleware I saw this in process_request
> {{{
> host = request.get_host()
> must_prepend = settings.PREPEND_WWW and host and not
> host.startswith('www.')
> redirect_url = ('%s://www.%s' % (request.scheme, host)) if must_prepend
> else ''
> }}}
>
> If I understand correctly, this code is ran for each request. Would we
> not get more performance by setting
> {{{
> redirect_url = ''
> if settings.PREPEND_WWW:
> host = request.get_host()
> if host and not host.startswith('www.')
> redirect_url = ('%s://www.%s' % (request.scheme, host))
> }}}
> as this evaluates most of the code only when settings.PREPEND_WWW is
> true.
>
> Again, maybe I completely misunderstand how python code optimization
> works. But I hope it helps
New description:
please ignore all this if its stupid, but I was trying to create a 1.10
middleware and learn from existing code. Looking at
django.middleware.common.CommonMiddleware I saw this in process_request
{{{
host = request.get_host()
must_prepend = settings.PREPEND_WWW and host and not
host.startswith('www.')
redirect_url = ('%s://www.%s' % (request.scheme, host)) if must_prepend
else ''
}}}
If I understand correctly, this code is ran for each request. Would we
not get more performance by setting
{{{
redirect_url = ''
if settings.PREPEND_WWW:
host = request.get_host()
if host and not host.startswith('www.'):
redirect_url = ('%s://www.%s' % (request.scheme, host))
}}}
as this evaluates most of the code only when settings.PREPEND_WWW is true.
Again, maybe I completely misunderstand how python code optimization
works. But I hope it helps
--
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:1>
Comment (by Simon Charette):
Hi Joris,
I'd say it's worth investigating but one thing that you'll need to make
sure is that `get_host()` is called inconditionally as this method has the
side effect of performing `ALLOWED_HOSTS` validation.
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:2>
* stage: Unreviewed => Accepted
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:3>
Comment (by JorisBenschop):
Hi Simon,
Thanks for this clarification.
Looking at the code, to my eye the get_host() only returns it output to a
variable that is only used once, but now i understand there is some
implicit 'magic' ran behind the screens as well. Indeed that would make
the patch a lot less useful, but then would it maybe be worth annotating
this hidden behaviour in a comment within the def instead?
Thanks for your gelp
Joris
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:4>
* has_patch: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:5>
* cc: desecho@… (added)
Comment:
We could add a comment in process_request method
{{{
host = request.get_host() # Also perform host validation
}}}
And probably in get_host method as well
{{{
"""Return the HTTP host using the environment or request headers. Perform
host validation."""
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:6>
Comment (by JorisBenschop):
Wouldnt refactoring this function (validate_host?) make sense if it does a
lot more than retrieving a hostname?
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:7>
Comment (by Anton Samarchyan):
A function get_host could be renamed to get_host_and_validate
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:8>
* status: new => assigned
* owner: nobody => Ketan Bhatt
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:9>
Comment (by Ketan Bhatt):
@Simon Charette
So the final changes that need to be made are:
1. Rename `get_host()` to `get_host_and_validate()`
2. Handle occurrences of `get_host` everywhere in the docs and change it
to the new name
3. Mention the change in the changelist (will need help here regarding
where to put this change exactly)
Please confirm once and I shall make a PR
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:10>
Comment (by Tim Graham):
I don't think there's value in renaming the function at this point. In
fact, it might be better to move that validation elsewhere so it always
runs during a request, regardless of if `get_host()` is called. There was
some discussion about this at the time host header validation was added
but I don't have time to look into this more right now.
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:11>
* component: Core (Other) => HTTP handling
* easy: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:12>
Comment (by Chris Jerdonek):
> In fact, it might be better to move that validation elsewhere so it
always runs during a request, regardless of if get_host() is called.
Doing this might be complicated by the fact that, currently, host
validation has to take place after custom middleware in certain
situations, per this note in the `HttpRequest.get_host()` docs:
https://docs.djangoproject.com/en/3.1/ref/request-
response/#django.http.HttpRequest.get_host
An easier option might be to cache `get_host()`'s return value. Then
middleware that might need the return value could call `get_host()` early
on to warm the cache (or fail the request) without too much penalty (since
`get_host()` can be getting called elsewhere, anyways).
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:13>
* owner: Ketan Bhatt => (none)
* status: assigned => new
Comment:
Tim, what were you envisioning for this ticket? In particular, were you
envisioning a backwards incompatible change?
Also, something weaker than requiring it to be run for all Django installs
for all requests would be to ensure that it runs for all requests only
when any of certain middleware is enabled. That initial list of middleware
could include e.g. any middleware that currently calls
`HttpRequest.get_host()` under certain conditions (e.g.
`CsrfViewMiddleware`). That approach would at least let middleware call it
without needing to handle `DisallowedHost` everywhere, because those
middleware could each be updated to call it at the beginning if it hasn't
been called yet (and then caching the value for later middleware).
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:14>
Comment (by Tim Graham):
I didn't have anything specific in mind. It seems that running host
validation before middleware isn't feasible due to the use case you
pointed out. The other option would be to run it afterward (just in case
no middleware already did, which I'm not sure is a likely possibility, but
that was a concern with the optimization originally proposed here in
`CommonMiddleware`).
--
Ticket URL: <https://code.djangoproject.com/ticket/27575#comment:15>