The code is in utils/http.py here:
https://github.com/django/django/blob/382ab137312961ad62feb8109d70a5a581fe8350/django/utils/http.py#L282
If a non-unicode URL is passed into is_safe_url(), it now throws the
following exception:
TypeError: must be unicode, not str
due to the unicodedata.category() call on this line:
--
Ticket URL: <https://code.djangoproject.com/ticket/26308>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* cc: ned@… (added)
* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:1>
Comment (by timgraham):
The issue came up after we privately developed the patch, however, we
didn't see a reason why an application would use a non-unicode value for a
URL since Django handles all URLs as unicode. Using bytestring URLs
already crashed `is_safe_url()` on Python 3.
What's your use case? Could you fix your application instead?
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:2>
Comment (by claudep):
The issue is on Python 2, where bytestrings may be as common as unicode
strings, especially for strings like URLs which have seldom non-ASCII
chars in them. I'd suggest adding a `force_text` somewhere which could be
removed once we drop Python 2.
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:3>
* cc: pauloswald@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:4>
* cc: michaelvantellingen@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:5>
* stage: Unreviewed => Accepted
Comment:
This was also reported in IRC and on the security mailing list. In both
cases the problematic code passes `request.META.get('HTTP_REFERER')`
(`str` (bytes) on Python 2 and `str` (unicode on Python 3) to
`is_safe_url()`).
Doing this crashes Python 2 even before the security patch if the URL
contains any non-ASCII characters. I guess there isn't much disadvantage
to a `force_text()` call. Some projects are adapting by
[https://github.com/django-oscar/django-
oscar/commit/8a06e9334e501bb5a56a2ffb1d29221e4ba09dce inserting it
themselves].
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:6>
Comment (by doctoryes):
In our application ([https://github.com/edx/edx-
platform/blob/master/lms/urls.py#L71 edx-platform]), we use the following
Django module in our urls.py for language setting:
{{{ url(r'^i18n/', include('django.conf.urls.i18n')),}}}
The application then performs a POST to http://example.com/i18n/setlang/,
which flows through the following stack:
{{{
/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-
packages/django/core/handlers/base.py in get_response
response =
wrapped_callback(request, *callback_args, **callback_kwargs)
/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-
packages/django/utils/decorators.py in inner
return func(*args, **kwargs)
/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-
packages/django/utils/decorators.py in inner
return func(*args, **kwargs)
/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-
packages/django/views/i18n.py in set_language
if not is_safe_url(url=next,
host=request.get_host()):
/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-
packages/django/utils/http.py in is_safe_url
return _is_safe_url(url, host) and
_is_safe_url(url.replace('\\', '/'), host)
/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-
packages/django/utils/http.py in _is_safe_url
if unicodedata.category(url[0])[0] == 'C':
}}}
So the URL handling in its current form never flows through our code -
only Django code.
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:7>
Comment (by doctoryes):
For anyone blocked by this bug, below is a monkey patch that should
function as a work-around.
NO IMPLIED GUARANTEES OR WARRANTIES!
{{{
#!div style="font-size: 80%"
{{{#!python
from django.utils import http
from django.utils.encoding import force_text
def patch():
"""
Monkey patch the django.utils.http.is_safe_url function to convert the
incoming
url and host parameters to unicode.
"""
def create_is_safe_url_wrapper(wrapped_func):
def _wrap_is_safe_url(*args, **kwargs):
def _conv_text(value):
return None if value is None else force_text(value)
return wrapped_func(
# Converted *args.
*tuple(map(_conv_text, args)),
# Converted **kwargs.
**{key: _conv_text(value) for key, value in
kwargs.items()}
)
return _wrap_is_safe_url
http.is_safe_url = create_is_safe_url_wrapper(http.is_safe_url)
}}}
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:8>
Comment (by felixxm):
Replying to [comment:6 timgraham]:
> This was also reported in IRC and on the security mailing list. In both
cases the problematic code passes `request.META.get('HTTP_REFERER')`
(`str` (bytes) on Python 2 and `str` (unicode on Python 3) to
`is_safe_url()`).
>
> Doing this crashes Python 2 even before the security patch if the URL
contains any non-ASCII characters. I guess there isn't much disadvantage
to a `force_text()` call. Some projects are adapting by
[https://github.com/django-oscar/django-
oscar/commit/8a06e9334e501bb5a56a2ffb1d29221e4ba09dce inserting it
themselves].
As far As i'm concerned we should add following patch do i18n:
{{{#!diff
--- a/django/views/i18n.py
+++ b/django/views/i18n.py
@@ -13,6 +13,7 @@ from django.utils._os import upath
from django.utils.encoding import smart_text
from django.utils.formats import get_format, get_format_modules
from django.utils.http import is_safe_url
+from django.utils.text import force_text
from django.utils.translation import (
LANGUAGE_SESSION_KEY, check_for_language, get_language, to_locale,
)
@@ -35,6 +36,8 @@ def set_language(request):
next = request.POST.get('next', request.GET.get('next'))
if not is_safe_url(url=next, host=request.get_host()):
next = request.META.get('HTTP_REFERER')
+ if next is not None:
+ next = force_text(next)
if not is_safe_url(url=next, host=request.get_host()):
next = '/'
response = http.HttpResponseRedirect(next)
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:9>
Comment (by timgraham):
Why would you patch there instead of `is_safe_url()` itself to prevent
Python 2 users from having to add similar boilerplate in similar code? I
was thinking something like (with `errors='replace'` if that's needed):
{{{#!diff
diff --git a/django/utils/http.py b/django/utils/http.py
index 6e782bd..73e45fd 100644
--- a/django/utils/http.py
+++ b/django/utils/http.py
@@ -290,6 +290,8 @@ def is_safe_url(url, host=None):
url = url.strip()
if not url:
return False
+ if six.PY2:
+ url = force_text(url)
# Chrome treats \ completely as / in paths but it could be part of
some
# basic auth credentials so we need to check both URLs.
return _is_safe_url(url, host) and _is_safe_url(url.replace('\\',
'/'), host)
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:10>
Comment (by claudep):
[https://github.com/django/django/pull/6242 PR]
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:11>
* has_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:12>
* stage: Accepted => Ready for checkin
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:13>
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:14>
* cc: felisiak.mariusz@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:15>
* status: new => closed
* resolution: => fixed
Comment:
In [changeset:"ada7a4aefb9bec4c34667b511022be6057102f98" ada7a4ae]:
{{{
#!CommitTicketReference repository=""
revision="ada7a4aefb9bec4c34667b511022be6057102f98"
Fixed #26308 -- Prevented crash with binary URLs in is_safe_url()
This fixes a regression introduced by c5544d28923.
Thanks John Eskew for the reporti and Tim Graham for the review.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:16>
Comment (by Claude Paroz <claude@…>):
In [changeset:"78f48300567b816b3c8177c33bef1a3ea6b36987" 78f4830]:
{{{
#!CommitTicketReference repository=""
revision="78f48300567b816b3c8177c33bef1a3ea6b36987"
[1.9.x] Fixed #26308 -- Prevented crash with binary URLs in is_safe_url()
This fixes a regression introduced by c5544d28923.
Thanks John Eskew for the reporti and Tim Graham for the review.
Backport of ada7a4aef from master.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:17>
Comment (by Claude Paroz <claude@…>):
In [changeset:"28bed24f552aa01e5b69902493f5ee2e06514522" 28bed24]:
{{{
#!CommitTicketReference repository=""
revision="28bed24f552aa01e5b69902493f5ee2e06514522"
[1.8.x] Fixed #26308 -- Prevented crash with binary URLs in is_safe_url()
This fixes a regression introduced by c5544d28923.
Thanks John Eskew for the reporti and Tim Graham for the review.
Backport of ada7a4aef from master.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:18>
Comment (by timgraham):
In 552f03869ea7f3072b3fa19ffb6cb2d957fd8447:
Added safety to URL decoding in is_safe_url() on Python 2
The errors='replace' parameter to force_text altered the URL before
checking
it, which wasn't considered sane. Refs [24fc935218] and [ada7a4aef].
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:19>
Comment (by timgraham):
In 9c195d45a64b0d2baee218e617ca3a762efc0bf5:
[1.9.x] Added safety to URL decoding in is_safe_url() on Python 2
The errors='replace' parameter to force_text altered the URL before
checking
it, which wasn't considered sane. Refs 24fc935218 and ada7a4aef.
Backport of 552f03869e from master.
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:20>
Comment (by timgraham):
In beb392b85e71fdd41209d323126181d74090fecb:
[1.8.x] Added safety to URL decoding in is_safe_url() on Python 2
The errors='replace' parameter to force_text altered the URL before
checking
it, which wasn't considered sane. Refs 24fc935218 and ada7a4aef.
Backport of 552f03869e from master.
--
Ticket URL: <https://code.djangoproject.com/ticket/26308#comment:21>