I have reopened #285 [2] and attached a patch which seemingly fixes
this. Could I please have some people more expert than me check it out
and provide some feedback?
[1] http://code.google.com/p/modwsgi/wiki/IntegrationWithDjango
[2] http://code.djangoproject.com/ticket/285
>From memory, your patch to mod_python handler will possibly break
things. This is because even when application is hosted at root of
server, for certain setups path_info is not the whole URI but has a
leading component missing because of how Apache matches URLs to
resources.
There are also various other impacts on Django from making the changes
which I cant remember right now. It is something which has been
discussed before and if Google groups search isn't totally screwed up
you might be able to find the discussion in the archives.
Graham
Graham
On Jul 27, 12:51 pm, Graham Dumpleton <Graham.Dumple...@gmail.com>
wrote:
On Jul 27, 2:51 pm, Graham Dumpleton <Graham.Dumple...@gmail.com>
wrote:
> On Jul 27, 12:34 pm, SmileyChris <smileych...@gmail.com> wrote:
>
> > Reading through the mod_wsgi docs [1], I came to a section explaining
> > a problem with Django's core HTTP handling (search for "SCRIPT_NAME").
>
> > I have reopened #285 [2] and attached a patch which seemingly fixes
> > this. Could I please have some people more expert than me check it out
> > and provide some feedback?
>
> > [1]http://code.google.com/p/modwsgi/wiki/IntegrationWithDjango
> > [2]http://code.djangoproject.com/ticket/285
>From memory, your patch to mod_python handler will possibly break
> things. This is because even when application is hosted at root of
> server, for certain setups path_info is not the whole URI but has a
> leading component missing because of how Apache matches URLs to
> resources.
I understand that. That's why I propose changing PATH_INFO to match
mod_python_request_object.uri. Nothing in the previous discussion has
brought up why this is a bad idea.
In fact, ModPythonRequest currently sets path = self._req.uri
currently anyway.
> There are also various other impacts on Django from making the changes
> which I cant remember right now.
I read through all other references to PATH_INFO in Django. Nothing
else which jumps out at me that would break.
Since I changed the url resolver to work with PATH_INFO rather than
request.path, it won't break reverse resolving even for Request
handlers which use SCRIPT_NAME.
Ahh, I read the way the patch was being applied the wrong way
around. :-(
Anyway, one potential reason why using req.uri may be bad is that
Apache does not do complete normalisation on it. Thus, one can get
repeating slashes which can cause the URL not to match a URL pattern
unless Django is at some point normalising it which I don't know if it
does or not. The req.path_info value on the other hand is normalised
properly.
> > There are also various other impacts on Django from making the changes
> > which I cant remember right now.
>
> I read through all other references to PATH_INFO in Django. Nothing
> else which jumps out at me that would break.
>
> Since I changed the url resolver to work with PATH_INFO rather than
> request.path, it won't break reverse resolving even for Request
> handlers which use SCRIPT_NAME.
>From memory, one of the issues is that if one is going to have Django
understand the larger concept of SCRIPT_NAME, then it would be
desirable to be able to provide in urls.py URL patterns which are
anchored, but expressed relative to the SCRIPT_NAME mount point. That
way you can change what the mount point is and urls.py doesn't need to
be changed and everything in Django will just work. At the moment one
has to include the mount point in the URL patterns meaning you have to
change them if you change the mount point, thus making it hard to
provide a Django application as a relocatable WSGI application
component.
One can't simply make such a change though such that this more
desirable behaviour would be the default, instead for backward
compatibility you would need to retain the existing behaviour and
require a switch to be added to enable the new behaviour.
The other problem is that although with the better WSGI adapters
SCRIPT_NAME is provided correctly set for you automatically, in
mod_python one cannot actually deduce SCRIPT_NAME automatically with
the way the adapter works and so you push onto the user the need to
define the mount point manually a second time.
What it comes down to is there are a few sides issues that really need
to be looked at at the same time and the simplistic change may not be
adequate to cover these or not result in the same behaviour on all
hosting solutions.
Graham
Uh, for mod_python handler, Django is *already* using req.uri. Nearly
everywhere in Django, we use request.path, which is filled from
req.uri
I'm making the change to request.META['PATH_INFO'] which isn't used
anywhere (important) in Django anyway.
> > Since I changed the url resolver to work with PATH_INFO rather than
> > request.path, it won't break reverse resolving even for Request
> > handlers which use SCRIPT_NAME.
> From memory, one of the issues is that if one is going to have Django
> understand the larger concept of SCRIPT_NAME, then it would be
> desirable to be able to provide in urls.py URL patterns which are
> anchored, but expressed relative to the SCRIPT_NAME mount point.
This is exactly why I wrote the patch, and it does it without stepping
on toes.
URL patterns are archored
I'll have to plead then that I don't understand the implications of
the change you made to base.py. I guess that is what I get for
commenting when I don't even use Django myself. :-)
All I know is that the change in wsgi.py by itself is more or less the
same one I suggest as a workaround for mod_wsgi to allow mounting
Django at point other than root, and with only that change you still
need to provide the mount point, ie., SCRIPT_NAME, in urls.py.
It sounds like you are saying then that the base.py change results in
what I refer to as the desirable behaviour for urls.py.
Graham
On Jul 27, 4:16 pm, Graham Dumpleton <Graham.Dumple...@gmail.com>
wrote:
> The other problem is that although with the better WSGI adapters
> SCRIPT_NAME is provided correctly set for you automatically, in
> mod_python one cannot actually deduce SCRIPT_NAME automatically with
> the way the adapter works and so you push onto the user the need to
> define the mount point manually a second time.
I get that mod_python is a bit dumb, but my change isn't changing
anything for mod_python users. It only changes the behaviour of the
wsgi request handler.
> What it comes down to is there are a few sides issues that really need
> to be looked at at the same time and the simplistic change may not be
> adequate to cover these or not result in the same behaviour on all
> hosting solutions.
Granted, the one thing I hadn't put too much consideration into is url
`reverse` method. At the moment, it is totally decoupled from request,
so it can't figure out SCRIPT_NAME. Perhaps request could get a
SCRIPT_NAME aware reverse method (which would still work without it)?
Are there any other issues I'm missing?
Exactly. That change ensures that the url (forward) resolver doesn't
need to worry about SCRIPT_NAME (if it's provided, which for
mod_python it isn't so no change there).
The only consideration I see that needs addressing is reverse
resolving (see my previous comment).
Coming in a bit late to this thread because I'm catching up on a back
log today. Reverse URL construction (especially w.r.t things like HTTP
redirects) is the big missing part. I was talking about this with a
couple of people at OSCON, including discussing with Jacob how we can
incorporate this so that the necessary backwards incompatibility changes
aren't too painful. I've got a similar patch in one of my branches
locally that I'll compare with your patch Chris and grab any bits I've
missed. It'll be a couple of days before I'm all set up to test this
correctly, since I'm travelling and reinstalled my laptop shortly before
leaving, so I need to configure Apache, etc, first.
Haven't worked out exactly how to acommodate things like mod_python not
giving the full SCRIPT_NAME yet, but it's a relatively minor issue since
it's just a fact of life we need to work around. Something will present
itself.
We do need to fix this, so thanks for the patch so far. We might as well
fix the whole thing at once so that it's easier to move an existing
setup to a different URL root without having to edit all the code
(letting Apache handle the SCRIPT_NAME portion). Initially there will be
a settings variable that will preserve backwards-compat and we'll break
it a bit before 1.0 after everybody's had time to adjust their code. So
it will be relatively painless.
Cheers,
Malcolm
FWIW, I know I said that SCRIPT_NAME can't be calculated in
mod_python. Well, that isn't strictly true.
If you are using mod_python 3.3.1 and have used the Location directive
then it is probably possible by using req.hlist.location.
I didn't mention this before because the fact it doesn't work with
older versions of mod_python nor when Location is not being used sort
of limits its usefulness. Didn't seem to make sense to therefore
depend on it.
Graham
Aah.. thanks. That provides the explanation to resolve a note I made on
Thursday evening: "did mod_python change at some point? 3.3 is now
returning SCRIPT_NAME?!"
Should be able to accommodate that, I'm sure. Thanks, Graham. I'm not
going mad after all.
Cheers,
Malcolm
I'm not sure I've got much to add to this discussion that I didn't
tell Malcolm in person, but I do have one idea worth tossing out
there. Back in the dark ages of Django-as-internal-project, we didn't
have the concept of URL inclusion. Each top-level bit had its own
location block in Apache; something like::
<Location /news/>
PythonOption urlconf news.urls
PythonOption urlroot /news/
</Location>
Repeat for each top level path component... we've come a long way,
baby. We threw this whole thing out when we figured out the inclusion
mechanism, but the url-root-as-PythonOption bit isn't such a bad idea.
People using mod-python < 3.3.1 could use the PythonOption and kill it
once they upgrade.
Not a brilliant idea, to be sure, but now someone *else* has to come
up with something better :)
Jacob
Sorry, but you may be going mad after all, unless I am.
The change in mod_python 3.3 to allow access to the value of the
Location directive in which the PythonHandler directive was used,
doesn't in anyway change what the value of SCRIPT_NAME is as set by
Apache. I would thus expect SCRIPT_NAME to still be wrong in various
circumstances.
Am a bit busy at the moment, but when I get a chance I'll revisit this
issue of SCRIPT_NAME with mod_python and see if I can't work out some
half sensible guidelines as to how Apache works out what to set it to
and how we may be able to work out a correct value. As I suggest
though, such magic may only be an option with mod_python 3.3 and then
not always.
Graham
We obviously need some way to get the SCRIPT_NAME from request to the
reverse resolver. Can we rely on SCRIPT_NAME remaining a constant for
handler instance?
If so, how about we just set it as a global cached var which the
resolver can look at, which would make the whole process transparent
to the end user.
If not, we'll need to tie the two together somehow (hence earlier
request.reverse(...) suggestion). We'd need to rethink the {% url %}
tag too...
Sort of like this wiki solution [1] but put the whole request into
_thread_locals.
Would there be any performance downside in doing that? Is it safe to
do that?
This would solve the reverse URL construction issue of this discussion
thread, along with request.urlconf not being able to be used in
permalinks [2] and the url template tag [3]
It should be mandatory Middleware, not optional (I've written
something related to that in my latest patch in #987 [4] - by the way,
Malcolm, have you checked that patch out yet?).
[1] http://code.djangoproject.com/wiki/CookBookThreadlocalsAndUser
[2] http://code.djangoproject.com/ticket/3530
[3] http://code.djangoproject.com/ticket/5034
[4] http://code.djangoproject.com/ticket/987
http://code.djangoproject.com/ticket/285 <- seems to have a good,
working patch.
(I've noted that it doesn't work with scgi. I'm not familiar with the
full history of this. I guess we could write middleware to work with
full_path if it isn't desired that request.path == request.full_path)
Regardless, one way or another I think this patch should make it in.
As noted in 285, django currently isn't following the CGI RFC. This
issue has been open for well over two years now. Perhaps now is a
good time to finally resolve it?
Ho-hum. Saying a ticket has been open for two years is completely
irrelevant. Instead, have a look at the activity on the ticket. It's
only relatively recently that a more or less complete patch has arrived.
So it's been showing some activity and is moving along. Not "opened two
years ago and nothing's been done". There's quite a difference.
Then you read a bit further in the comments and you can see where it's
up to (needing a final review). It hasn't been checked in yet because
(a) it's a very crucial piece of infrastructure so getting it right is
important, (b) it will be backwards incompatible, so getting it right is
important to reduce the amount of disruption caused by needing to take
multiple swings at it (and to keep the backwards incomaptibility to a
minimum and to find the easiest forwards porting route) and (c) because
there are a couple of little things in it that I'm not happy with and I
haven't had time to fix yet (one of which is the one you noticed).
It will be checked in in due course. For the reasons listed above, we're
being careful, but it's not being ignored.
Malcolm
--
Works better when plugged in.
http://www.pointy-stick.com/blog/