[Django] #17133: get_script_name goofs when there is Apache URL rewriting

Django

Oct 28, 2011, 6:34:41 PM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
-------------------------------+--------------------
Reporter: gjanee@… | Owner: nobody
Type: Uncategorized | Status: new
Component: HTTP handling | Version: 1.3
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------
When running under Apache+mod_wsgi (I'm not sure the mod_wsgi is relevant
here), using a WSGIScriptAlias of /foo, a request for URL /foo/bar//baz
(note double slash) gets transformed by Apache to /foo/bar/baz and then by
Django to /foo//bar/baz (note where double slashes are now). Furthermore,
a callback function for urlpattern "^bar/.*" will be called on this latter
path even though it doesn't match the pattern. That is, request.path will
be "/foo//bar/baz", which can be confusing for a callback function that is
expecting request.path to match the WSGIScriptAlias plus the urlpattern.

I don't totally understand what Django is doing here, but I believe the
problem is in django.core.handlers.base.get_script_name. In that
function, to get the script name in this situation Django starts with the
environment variable SCRIPT_URL, and strips off a number of characters at
the end equal to the length of environment variable PATH_INFO. Because in
this case Apache collapses the double slash in /foo/bar//baz to
/foo/bar/baz, get_script_name strips off one less character than
necessary, so that instead of the script name being /foo it is /foo/.
This may account for the double slash appearing after /foo, as in /foo//bar/baz.
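
The arithmetic described above can be reproduced without Apache. What follows is a minimal sketch, not Django's actual code: `naive_script_name` is a hypothetical helper that only mimics the slicing step described in this report, the two values are the ones assumed for the `/foo/bar//baz` request, and joining the script name with PATH_INFO is assumed to be how request.path ends up with its value.

```python
def naive_script_name(script_url, path_info):
    """Strip len(path_info) characters from the end of script_url.

    Hypothetical helper mimicking the slicing step described in the
    report; it is not the Django function itself.
    """
    return script_url[:-len(path_info)] if path_info else script_url


# Assumed values: SCRIPT_URL keeps the double slash from the request,
# while PATH_INFO has had it collapsed by Apache.
script_url = "/foo/bar//baz"
path_info = "/bar/baz"

script_name = naive_script_name(script_url, path_info)
print(repr(script_name))              # '/foo/'
print(repr(script_name + path_info))  # '/foo//bar/baz'
```

PATH_INFO is one character shorter than the matching tail of SCRIPT_URL, so the slice leaves one extra character (the trailing slash) on the script name.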

--
Ticket URL: <https://code.djangoproject.com/ticket/17133>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Oct 28, 2011, 6:44:03 PM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
-------------------------------+--------------------------------------
Reporter: gjanee@… | Owner: nobody
Type: Uncategorized | Status: new
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------
Changes (by anonymous):

* needs_docs: => 0
* needs_tests: => 0
* needs_better_patch: => 0


Comment:

Sorry, the wiki formatting screwed the description up, attempt number 2:

{{{

When running under Apache+mod_wsgi (I'm not sure the mod_wsgi is relevant
here), using a WSGIScriptAlias of /foo, a request for URL /foo/bar//baz
(note double slash) gets transformed by Apache to /foo/bar/baz and then by
Django to /foo//bar/baz (note where double slashes are now). Furthermore,
a callback function for urlpattern "^bar/.*" will be called on this latter
path even though it doesn't match the pattern. That is, request.path will
be "/foo//bar/baz", which can be confusing for a callback function that is
expecting request.path to match the WSGIScriptAlias plus the urlpattern.

I don't totally understand what Django is doing here, but I believe the
problem is in django.core.handlers.base.get_script_name. In that
function, to get the script name in this situation Django starts with the
environment variable SCRIPT_URL, and strips off a number of characters at
the end equal to the length of environment variable PATH_INFO. Because in
this case Apache collapses the double slash in /foo/bar//baz to
/foo/bar/baz, get_script_name strips off one less character than
necessary, so that instead of the script name being /foo it is /foo/.
This may account for the double slash appearing after /foo, as in /foo//bar/baz.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/17133#comment:1>

Django

Nov 25, 2011, 3:43:10 PM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
-------------------------------+--------------------------------------
Reporter: gjanee@… | Owner: nobody
Type: Uncategorized | Status: closed
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------
Changes (by aaugustin):

* status: new => closed
* resolution: => needsinfo


Comment:

While it's clear that something's wrong in your setup, I'm unable to
confirm that there's a bug in Django, and determine how we could fix it,
sorry.

You might be exceeding the limits of how much `mod_rewrite` magic Django
can compensate for; if so, you can work around the problem with the
`FORCE_SCRIPT_NAME` setting.
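
As a sketch of that workaround, assuming the application is mounted at `/foo` as in the report (the value is an assumption; substitute your own mount point), the setting would go in the project's settings module:

```python
# settings.py (fragment)
# Assumed mount point, taken from the WSGIScriptAlias in the report.
# When set, Django uses this value as the script name instead of
# deriving it from the server-provided environment.
FORCE_SCRIPT_NAME = "/foo"
```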

If we wanted to debug this further, we'd need the values of the relevant
environment variables, especially `SCRIPT_URL`, `REDIRECT_URL`,
`PATH_INFO`, `SCRIPT_NAME`, as well as the relevant bits of your Apache
configuration.
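
One way to collect those values is a small WSGI wrapper that prints them for every request. This is only a sketch: `DumpPathVars` and `KEYS` are hypothetical names, and it assumes the server exposes these keys in the WSGI environ and sends stderr to its error log.

```python
import sys

# The environment variables asked for above.
KEYS = ("SCRIPT_URL", "REDIRECT_URL", "PATH_INFO", "SCRIPT_NAME")


class DumpPathVars:
    """WSGI middleware that logs the path-related environ keys per request."""

    def __init__(self, app, stream=None):
        self.app = app
        self.stream = stream if stream is not None else sys.stderr

    def __call__(self, environ, start_response):
        for key in KEYS:
            # repr() makes doubled slashes and missing keys (None) visible.
            print(f"{key} = {environ.get(key)!r}", file=self.stream)
        return self.app(environ, start_response)
```

In a WSGI script this would wrap the existing callable, for example `application = DumpPathVars(application)`.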

--
Ticket URL: <https://code.djangoproject.com/ticket/17133#comment:2>

Django

Oct 23, 2012, 8:54:14 AM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
-------------------------------+--------------------------------------
Reporter: gjanee@… | Owner: nobody
Type: Uncategorized | Status: reopened
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------
Changes (by mwilck):

* status: closed => reopened
* has_patch: 0 => 1
* resolution: needsinfo =>


Comment:

This is definitely a Django bug, given that Apache/mod_wsgi is as it is.
Here are the variables for my case:

SCRIPT_URL = '/mst/milestones//accounts/login//help'
PATH_INFO = u'/milestones/accounts/login/help'

SCRIPT_NAME = u'/mst/m'

The correct SCRIPT_NAME would be just u'/mst'.

--
Ticket URL: <https://code.djangoproject.com/ticket/17133#comment:3>

Django

Oct 23, 2012, 8:57:45 AM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
-------------------------------+--------------------------------------
Reporter: gjanee@… | Owner: nobody
Type: Uncategorized | Status: reopened
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------

Comment (by mwilck):

See e.g.
http://trac.edgewall.org/demo-1.1/changeset/10609/branches/0.12-stable/trac/web/main.py
for an independent assessment of the same problem.

--
Ticket URL: <https://code.djangoproject.com/ticket/17133#comment:4>

Django

Oct 23, 2012, 8:59:38 AM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
-------------------------------+--------------------------------------
Reporter: gjanee@… | Owner: nobody
Type: Uncategorized | Status: reopened
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------

Comment (by anonymous):

Replying to [comment:3 mwilck]:
Sorry I messed up the formatting, and it's important here.

{{{
SCRIPT_URL = '/mst/milestones//accounts/login//help'
PATH_INFO = u'/milestones/accounts/login/help'
SCRIPT_NAME = u'/mst/m'
}}}

The correct SCRIPT_NAME would be just u'/mst'.
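
These values can be checked by hand. The sketch below first applies the plain slicing, then shows one possible correction: collapsing runs of slashes in SCRIPT_URL before slicing, so that it matches what Apache has already done to PATH_INFO. `script_name_after_squash` is a hypothetical helper, and this is not claimed to be what the attached patch does.

```python
import re

# One or more consecutive slashes.
_slashes_re = re.compile(r"/+")


def script_name_after_squash(script_url, path_info):
    """Collapse slash runs in script_url, then strip path_info from its end."""
    squashed = _slashes_re.sub("/", script_url)
    return squashed[:-len(path_info)] if path_info else squashed


script_url = "/mst/milestones//accounts/login//help"
path_info = "/milestones/accounts/login/help"

# Plain slicing: PATH_INFO is two characters shorter than the matching
# tail of SCRIPT_URL, so two extra characters survive.
print(repr(script_url[:-len(path_info)]))                     # '/mst/m'
# Squash first, then slice.
print(repr(script_name_after_squash(script_url, path_info)))  # '/mst'
```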

--
Ticket URL: <https://code.djangoproject.com/ticket/17133#comment:5>

Django

Oct 23, 2012, 10:08:47 AM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
-------------------------------+--------------------------------------
Reporter: gjanee@… | Owner: nobody
Type: Uncategorized | Status: reopened
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------
Description changed by claudep:

Old description:

> When running under Apache+mod_wsgi (I'm not sure the mod_wsgi is relevant
> here), using a WSGIScriptAlias of /foo, a request for URL /foo/bar//baz
> (note double slash) gets transformed by Apache to /foo/bar/baz and then
> by Django to /foo//bar/baz (note where double slashes are now).
> Furthermore, a callback function for urlpattern "^bar/.*" will be called
> on this latter path even though it doesn't match the pattern. That is,
> request.path will be "/foo//bar/baz", which can be confusing for a
> callback function that is expecting request.path to match the
> WSGIScriptAlias plus the urlpattern.
>
> I don't totally understand what Django is doing here, but I believe the
> problem is in django.core.handlers.base.get_script_name. In that
> function, to get the script name in this situation Django starts with the
> environment variable SCRIPT_URL, and strips off a number of characters at
> the end equal to the length of environment variable PATH_INFO. Because
> in this case Apache collapses the double slash in /foo/bar//baz to
> /foo/bar/baz, get_script_name strips off one less character than
> necessary, so that instead of the script name being /foo it is /foo/.
> This may account for the double slash appearing /foo, as in
> /foo//bar/baz.

New description:

When running under Apache+mod_wsgi (I'm not sure the mod_wsgi is relevant
here), using a WSGIScriptAlias of `/foo`, a request for URL
`/foo/bar//baz` (note double slash) gets transformed by Apache to
`/foo/bar/baz` and then by Django to `/foo//bar/baz` (note where double
slashes are now). Furthermore, a callback function for urlpattern
`"^bar/.*"` will be called on this latter path even though it doesn't
match the pattern. That is, request.path will be `/foo//bar/baz`, which
can be confusing for a callback function that is expecting request.path to
match the WSGIScriptAlias plus the urlpattern.

I don't totally understand what Django is doing here, but I believe the
problem is in django.core.handlers.base.get_script_name. In that
function, to get the script name in this situation Django starts with the
environment variable SCRIPT_URL, and strips off a number of characters at
the end equal to the length of environment variable PATH_INFO. Because in
this case Apache collapses the double slash in `/foo/bar//baz` to
`/foo/bar/baz`, get_script_name strips off one less character than
necessary, so that instead of the script name being `/foo` it is `/foo/`.
This may account for the double slash appearing after `/foo`, as in
`/foo//bar/baz`.

--
Ticket URL: <https://code.djangoproject.com/ticket/17133#comment:6>

Django

Oct 23, 2012, 3:42:41 PM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
-------------------------------+------------------------------------
Reporter: gjanee@… | Owner: nobody
Type: Uncategorized | Status: reopened
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------
Changes (by lrekucki):

* needs_docs: => 0
* needs_better_patch: => 0
* needs_tests: => 0
* stage: Unreviewed => Accepted


Comment:

Here's an explanation from Graham Dumpleton (a bit old, but seems
up-to-date):
https://groups.google.com/d/msg/django-users/31oV1WhuAZ4/lpbt_3CCcXYJ

Nitpicking: The import and regexp definition should be on top of the file
and a comment with a link to this issue would decrease the WTH level ;)

--
Ticket URL: <https://code.djangoproject.com/ticket/17133#comment:7>

Django

Mar 15, 2013, 8:50:20 AM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
--------------------------------------+------------------------------------
Reporter: gjanee@… | Owner: nobody
Type: Cleanup/optimization | Status: reopened
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by aaugustin):

* type: Uncategorized => Cleanup/optimization


--
Ticket URL: <https://code.djangoproject.com/ticket/17133#comment:8>

Django

Jun 13, 2014, 8:23:50 PM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
--------------------------------------+------------------------------------
Reporter: gjanee@… | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: HTTP handling | Version: 1.3
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by timo):

* needs_tests: 0 => 1


Comment:

Is there any way to add a test?
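
One possible shape for such a test, sketched under assumptions: the logic only reads a WSGI-style environ, so a plain dict can stand in for Apache. `get_script_name_sketch` is a hypothetical stand-in written for this example (including its REDIRECT_URL and SCRIPT_NAME fallbacks); a real test would call Django's own function instead.

```python
import re
import unittest

_slashes_re = re.compile(r"/+")


def get_script_name_sketch(environ):
    """Hypothetical stand-in for the function under test.

    Reads SCRIPT_URL (falling back to REDIRECT_URL) and PATH_INFO from a
    WSGI-style environ dict and returns the derived script name.
    """
    script_url = environ.get("SCRIPT_URL") or environ.get("REDIRECT_URL", "")
    if not script_url:
        # No rewriting information available: trust SCRIPT_NAME.
        return environ.get("SCRIPT_NAME", "")
    script_url = _slashes_re.sub("/", script_url)
    path_info = environ.get("PATH_INFO", "")
    return script_url[:-len(path_info)] if path_info else script_url


class ScriptNameTests(unittest.TestCase):
    def test_double_slash_in_request(self):
        environ = {"SCRIPT_URL": "/foo/bar//baz", "PATH_INFO": "/bar/baz"}
        self.assertEqual(get_script_name_sketch(environ), "/foo")

    def test_several_double_slashes(self):
        environ = {
            "SCRIPT_URL": "/mst/milestones//accounts/login//help",
            "PATH_INFO": "/milestones/accounts/login/help",
        }
        self.assertEqual(get_script_name_sketch(environ), "/mst")

    def test_no_rewriting(self):
        environ = {"SCRIPT_NAME": "/foo", "PATH_INFO": "/bar/baz"}
        self.assertEqual(get_script_name_sketch(environ), "/foo")
```

Saved as a module, this can be run with `python -m unittest`.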

--
Ticket URL: <https://code.djangoproject.com/ticket/17133#comment:10>

Django

Oct 23, 2015, 9:04:01 AM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
--------------------------------------+------------------------------------
Reporter: gjanee@… | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: HTTP handling | Version: master
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by claudep):

* version: 1.3 => master
* needs_tests: 1 => 0


Comment:

PR [https://github.com/django/django/pull/5468 #5468]

--
Ticket URL: <https://code.djangoproject.com/ticket/17133#comment:11>

Django

Oct 23, 2015, 4:09:35 PM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
-------------------------------------+-------------------------------------
Reporter: gjanee@… | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: HTTP handling | Version: master
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/17133#comment:12>

Django

Oct 23, 2015, 4:19:04 PM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
-------------------------------------+-------------------------------------
Reporter: gjanee@… | Owner: nobody
Type: Cleanup/optimization | Status: closed
Component: HTTP handling | Version: master
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Claude Paroz <claude@…>):

* status: new => closed
* resolution: => fixed


Comment:

In [changeset:"10ace52a41fc3458fb7df51d11974b398f0fbde3" 10ace52a]:
{{{
#!CommitTicketReference repository=""
revision="10ace52a41fc3458fb7df51d11974b398f0fbde3"
Fixed #17133 -- Properly handled successive slashes in incoming requests

Thanks gja...@ucop.edu for the report and Tim Graham for the review.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/17133#comment:13>

Django

Oct 27, 2015, 9:30:10 AM
to django-...@googlegroups.com
#17133: get_script_name goofs when there is Apache URL rewriting
-------------------------------------+-------------------------------------
Reporter: gjanee@… | Owner: nobody
Type: Cleanup/optimization | Status: closed
Component: HTTP handling | Version: master
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Claude Paroz <claude@…>):

In [changeset:"ea2f48ce8bd865874108af79766cdef2d197e540" ea2f48c]:
{{{
#!CommitTicketReference repository=""
revision="ea2f48ce8bd865874108af79766cdef2d197e540"
Refs #17133 -- Optimized script_url handling in get_script_name

10ace52a added some regex processing for each request with SCRIPT_URL set.
In a speed-critical section, applying the regex conditionally will save
some resources.
}}}
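
The idea in that commit message can be illustrated with a short sketch. This is an assumption about the general shape of such a guard, not a copy of the changeset, and `squash_slashes` is a hypothetical name: a cheap substring test decides whether the regex runs at all.

```python
import re

# One or more consecutive slashes.
_slashes_re = re.compile(r"/+")


def squash_slashes(script_url):
    """Collapse slash runs, skipping the regex when no '//' is present."""
    if "//" in script_url:
        return _slashes_re.sub("/", script_url)
    # Common case: nothing to collapse, return the string untouched.
    return script_url
```

Most requests contain no doubled slash, so they pay only for the `in` check.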

--
Ticket URL: <https://code.djangoproject.com/ticket/17133#comment:14>
