[Django] #25623: Django always returns a white page along with 400 on latin encoded URLs.


Django

Oct 28, 2015, 10:50:57 AM
to django-...@googlegroups.com
#25623: Django always returns a white page along with 400 on latin encoded URLs.
-----------------------------+--------------------
Reporter: shredding | Owner: nobody
Type: Bug | Status: new
Component: Core (URLs) | Version: 1.8
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
-----------------------------+--------------------
Given a URL like /Raumh%F6he.htm served with gunicorn or waitress (and maybe
other servers as well), Django raises a UnicodeDecodeError here:

https://github.com/django/django/blob/master/django/core/handlers/wsgi.py#L200

... which results in a blank white page served with a 400 status (sample:
https://www.djangoproject.com/Raumh%F6he.htm)

This does not happen with the built-in runserver command because, as far as
I understand, it double-encodes the query:

https://github.com/django/django/blob/master/django/core/servers/basehttp.py#L154-L157

Gunicorn, on the other hand, explicitly decodes as Latin-1:

https://github.com/benoitc/gunicorn/blob/master/gunicorn/_compat.py#L82

... which leads to the error because Django explicitly expects UTF-8.
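
The failure described above can be reproduced without any server. The following is a minimal sketch, assuming the server follows the PEP 3333 convention of exposing request bytes as an ISO-8859-1 decoded native string on Python 3; the `environ` dict here is a hand-built stand-in, not a real WSGI environment.

```python
from urllib.parse import unquote_to_bytes

# The path from the report: "ö" is percent-encoded as the Latin-1 byte 0xF6.
raw_path = "/Raumh%F6he.htm"

# Step 1: undo the percent-encoding to get the raw request bytes.
path_bytes = unquote_to_bytes(raw_path)

# Step 2: store the bytes as a native string decoded with ISO-8859-1.
# This decode can never fail, because every byte maps to a code point.
environ = {"PATH_INFO": path_bytes.decode("iso-8859-1")}

# Step 3: recover the original bytes and try to decode them as UTF-8,
# which is where the UnicodeDecodeError from the report appears.
recovered = environ["PATH_INFO"].encode("iso-8859-1")
try:
    path_info = recovered.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    path_info = None
    error = exc

print(recovered)
print(error)
```

The byte 0xF6 cannot start a valid UTF-8 sequence, so the strict decode in step 3 fails regardless of what follows it.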

--
Ticket URL: <https://code.djangoproject.com/ticket/25623>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Oct 28, 2015, 10:52:18 AM
to django-...@googlegroups.com
#25623: Django always returns a white page along with 400 on latin encoded URLs.
-----------------------------+--------------------------------------
Reporter: shredding | Owner: nobody
Type: Bug | Status: new
Component: Core (URLs) | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-----------------------------+--------------------------------------
Changes (by shredding):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


New description:

Given a URL like /Raumh%F6he served with gunicorn or waitress (and maybe
other servers as well), Django raises a UnicodeDecodeError here:

https://github.com/django/django/blob/master/django/core/handlers/wsgi.py#L200

... which results in a blank white page served with a 400 status (sample:
https://www.djangoproject.com/Raumh%F6he)

This does not happen with the built-in runserver command because, as far as
I understand, it double-encodes the query:

https://github.com/django/django/blob/master/django/core/servers/basehttp.py#L154-L157

Gunicorn, on the other hand, explicitly decodes as Latin-1:

https://github.com/benoitc/gunicorn/blob/master/gunicorn/_compat.py#L82

... which leads to the error because Django explicitly expects UTF-8.

--
Ticket URL: <https://code.djangoproject.com/ticket/25623#comment:1>

Django

Oct 28, 2015, 11:10:47 AM
to django-...@googlegroups.com
#25623: Django always returns a white page along with 400 on latin encoded URLs.
-----------------------------+--------------------------------------
Reporter: shredding | Owner: nobody
Type: Bug | Status: new
Component: Core (URLs) | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-----------------------------+--------------------------------------

Comment (by shredding):

For what it's worth:

Pyramid raises a 500 (http://www.pylonsproject.org/Raumh%F6he.htm) with the
error: URLDecodeError: 'utf8' codec can't decode byte 0xf6 in position 11:
invalid start byte (urldispatch.py, line 86).

Flask handles it correctly.
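
The thread does not say how Flask avoids the error. One way a framework could avoid raising on such a path is lenient decoding, sketched below with plain `bytes.decode`; this is an illustration of the general technique, not a claim about Flask's actual code.

```python
# Raw path bytes containing the Latin-1 byte 0xF6, which is invalid UTF-8.
latin1_bytes = b"/Raumh\xf6he.htm"

# errors="replace" substitutes U+FFFD for each undecodable byte instead of
# raising UnicodeDecodeError, so routing can continue (and typically 404).
lenient = latin1_bytes.decode("utf-8", errors="replace")

print(repr(lenient))
```

The cost of this approach is that the original character is lost: the path now contains a replacement character rather than "ö".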

--
Ticket URL: <https://code.djangoproject.com/ticket/25623#comment:2>

Django

Oct 28, 2015, 11:23:54 AM
to django-...@googlegroups.com
#25623: Django always returns a white page along with 400 on latin encoded URLs.
-------------------------------+------------------------------------
Reporter: shredding | Owner: nobody
Type: Bug | Status: new
Component: HTTP handling | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------
Changes (by timgraham):

* component: Core (URLs) => HTTP handling
* stage: Unreviewed => Accepted


--
Ticket URL: <https://code.djangoproject.com/ticket/25623#comment:3>

Django

Oct 29, 2015, 2:28:14 AM
to django-...@googlegroups.com
#25623: Django always returns a white page along with 400 on latin encoded URLs.
-------------------------------+------------------------------------
Reporter: shredding | Owner: nobody
Type: Bug | Status: new
Component: HTTP handling | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------

Comment (by DheerendraRathor):

In the part mentioned by OP,

{{{
def get_path_info(environ):
    """
    Returns the HTTP request's PATH_INFO as a unicode string.
    """
    path_info = get_bytes_from_wsgi(environ, 'PATH_INFO', '/')

    return path_info.decode(UTF_8)
}}}

I changed `UTF_8` to `ISO_8859_1` in my local Django installation, and it
then correctly returned a 404 instead of a 400.
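
A caveat with that local change, sketched here under the assumption that clients may also send standard UTF-8 percent-encoded paths: decoding as ISO-8859-1 never raises, but it misreads UTF-8 bytes. `decode_path` is a hypothetical helper for illustration, not a Django function.

```python
def decode_path(path_bytes, encoding):
    """Decode raw path bytes with the given encoding (strict errors)."""
    return path_bytes.decode(encoding)

latin1_path = b"/Raumh\xf6he"          # "ö" as one Latin-1 byte
utf8_path = b"/Raumh\xc3\xb6he"        # "ö" as two UTF-8 bytes

# The Latin-1 path decodes to the intended text.
from_latin1 = decode_path(latin1_path, "iso-8859-1")

# The UTF-8 path also decodes without error, but each of its two bytes
# becomes a separate character, producing mojibake.
from_utf8 = decode_path(utf8_path, "iso-8859-1")

print(from_latin1)
print(from_utf8)
```

So the 400 goes away for Latin-1 URLs, at the price of silently mangling URLs that were encoded as UTF-8.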

--
Ticket URL: <https://code.djangoproject.com/ticket/25623#comment:4>

Django

Nov 7, 2015, 6:09:38 AM
to django-...@googlegroups.com
#25623: Django always returns a white page along with 400 on latin encoded URLs.
-------------------------------+------------------------------------
Reporter: shredding | Owner: nobody
Type: Bug | Status: closed
Component: HTTP handling | Version: 1.8
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+------------------------------------
Changes (by jgeskens):

* status: new => closed
* resolution: => invalid


Comment:

Django handles this correctly. URIs that are encoded as latin1 are not
standard; they should be UTF-8 encoded (see
https://tools.ietf.org/html/rfc3986).

Because the URL is not encoded in the standard way, Django correctly
responds with "400 Bad Request".
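
For comparison, here is a small sketch of the two encodings of the same path using the standard library's `urllib.parse`. The unencoded path with a literal "ö" is an assumption about what the URL in the report was meant to spell.

```python
from urllib.parse import quote, unquote

name = "/Raumh\u00f6he"  # the path with a literal "ö"

# quote() percent-encodes using UTF-8 by default.
utf8_url = quote(name)

# The form from the report results from encoding with Latin-1 instead.
latin1_url = quote(name, encoding="latin-1")

# The UTF-8 form round-trips under strict UTF-8 decoding.
decoded_ok = unquote(utf8_url, errors="strict")

# The Latin-1 form does not: strict UTF-8 decoding raises.
try:
    unquote(latin1_url, errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(utf8_url, latin1_url, strict_failed)
```

A client that requests the UTF-8 form would therefore reach normal URL resolution rather than the 400 response.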

--
Ticket URL: <https://code.djangoproject.com/ticket/25623#comment:5>
