[Django] #28796: urlsafe_base64_encode broken on Django 2.0

11 views
Skip to first unread message

Django

unread,
Nov 13, 2017, 6:19:08 PM11/13/17
to django-...@googlegroups.com
#28796: urlsafe_base64_encode broken on Django 2.0
-----------------------------------------+------------------------
Reporter: Nick | Owner: nobody
Type: Uncategorized | Status: new
Component: Uncategorized | Version: 2.0
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-----------------------------------------+------------------------
Suppose we have:

hashed_msg = hmac.new(key, msg=message, digestmod=sha384)
url = urlsafe_base64_encode(hashed.digest())

Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-like
objects could be placed in urls no problem, creating (say)
localhost:8000/asdfwerqwefa

However, in django 2.0, the url becomes (say)

localhost:8000/b'asdfwerqwefa'

which seems less 'urlsafe', as the quotes and the 'b' are added into the
url. Shouldn't urlsafe_base64_encode now return a string, if bytes in urls
now retain the quotes?

--
Ticket URL: <https://code.djangoproject.com/ticket/28796>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Nov 13, 2017, 10:35:11 PM11/13/17
to django-...@googlegroups.com
#28796: django.utils.http.urlsafe_base64_encode() broken in Django 2.0
---------------------------+--------------------------------------
Reporter: Nick | Owner: nobody
Type: Bug | Status: new
Component: Utilities | Version: 2.0
Severity: Normal | Resolution:

Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------+--------------------------------------
Changes (by Tim Graham):

* type: Uncategorized => Bug
* component: Uncategorized => Utilities


Comment:

Could you please edit the description to include a complete code snippet
that reproduces the problem?

Ideally, you could also
[https://docs.djangoproject.com/en/dev/internals/contributing/triaging-
tickets/#bisecting-a-regression bisect] to find the commit where the
behavior changed.

--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:1>

Django

unread,
Nov 14, 2017, 3:38:42 AM11/14/17
to django-...@googlegroups.com
#28796: django.utils.http.urlsafe_base64_encode() broken in Django 2.0
---------------------------+--------------------------------------
Reporter: Nick | Owner: nobody

Type: Bug | Status: new
Component: Utilities | Version: 2.0
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------+--------------------------------------

Comment (by Claude Paroz):

I would be all for making `urlsafe_base64_encode` return a string, but it
would be plainly backwards incompatible, as much existing code is already
doing a `decode()` on the result of this function. See usage in
`django/contrib/auth/forms.py` for example.

--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:2>

Django

unread,
Nov 14, 2017, 2:16:40 PM11/14/17
to django-...@googlegroups.com
#28796: django.utils.http.urlsafe_base64_encode() broken in Django 2.0
---------------------------+--------------------------------------
Reporter: Nick | Owner: nobody

Type: Bug | Status: new
Component: Utilities | Version: 2.0
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------+--------------------------------------
Description changed by Nick:

Old description:

> Suppose we have:
>
> hashed_msg = hmac.new(key, msg=message, digestmod=sha384)
> url = urlsafe_base64_encode(hashed.digest())
>
> Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-

> like objects could be placed in urls no problem, creating (say)


> localhost:8000/asdfwerqwefa
>
> However, in django 2.0, the url becomes (say)
>
> localhost:8000/b&#39;asdfwerqwefa&#39;
>
> which seems less 'urlsafe', as the quotes and the 'b' are added into the
> url. Shouldn't urlsafe_base64_encode now return a string, if bytes in
> urls now retain the quotes?

New description:

Suppose we have:

hashed_msg = hmac.new(key, msg=message, digestmod=sha384)
url = urlsafe_base64_encode(hashed.digest())

Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-like
objects could be placed in urls no problem, creating (say)
localhost:8000/asdfwerqwefa

However, in django 2.0, the url becomes (say)

localhost:8000/b&#39;asdfwerqwefa&#39;

which seems less 'urlsafe', as the quotes and the 'b' are added into the
url. Shouldn't urlsafe_base64_encode now return a string, if bytes in urls
now retain the quotes?

I have created a full project which reproduces the bug here:

https://github.com/NAThompson/urlsafe_bug

To reproduce the bug, type:

$ ./manage.py shell
>>> import reproduce

--

--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:3>

Django

unread,
Nov 14, 2017, 2:53:59 PM11/14/17
to django-...@googlegroups.com
#28796: django.utils.http.urlsafe_base64_encode() broken in Django 2.0
---------------------------+--------------------------------------
Reporter: Nick | Owner: nobody

Type: Bug | Status: new
Component: Utilities | Version: 2.0
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------+--------------------------------------
Description changed by Nick:

Old description:

> Suppose we have:


>
> hashed_msg = hmac.new(key, msg=message, digestmod=sha384)
> url = urlsafe_base64_encode(hashed.digest())
>
> Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-

> like objects could be placed in urls no problem, creating (say)


> localhost:8000/asdfwerqwefa
>
> However, in django 2.0, the url becomes (say)
>
> localhost:8000/b&#39;asdfwerqwefa&#39;
>
> which seems less 'urlsafe', as the quotes and the 'b' are added into the
> url. Shouldn't urlsafe_base64_encode now return a string, if bytes in
> urls now retain the quotes?
>

> I have created a full project which reproduces the bug here:
>
> https://github.com/NAThompson/urlsafe_bug
>
> To reproduce the bug, type:
>
> $ ./manage.py shell
> >>> import reproduce

New description:

Suppose we have:

hashed_msg = hmac.new(key, msg=message, digestmod=sha384)
url = urlsafe_base64_encode(hashed.digest())

Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-like
objects could be placed in urls no problem, creating (say)
localhost:8000/asdfwerqwefa

However, in django 2.0, the url becomes (say)

localhost:8000/b&#39;asdfwerqwefa&#39;

which seems less 'urlsafe', as the quotes and the 'b' are added into the
url. Shouldn't urlsafe_base64_encode now return a string, if bytes in urls
now retain the quotes?

I have created a full project which reproduces the bug here:

https://github.com/NAThompson/urlsafe_bug

To reproduce the bug, type:

$ ./manage.py shell
>>> import reproduce


This bug is present in commit d90936f41a2d7a3361e51fc49be033ba0f05f458,
but much before that the 'django.urls.path' doesn't exist, so bisection
doesn't work cleanly.

--

--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:4>

Django

unread,
Nov 14, 2017, 4:06:27 PM11/14/17
to django-...@googlegroups.com
#28796: reverse() coerces bytes in args/kwargs to str which adds "b" prefixes to
the ouput
-----------------------------+--------------------------------------
Reporter: Nick | Owner: nobody
Type: Bug | Status: new
Component: Core (URLs) | Version: 2.0
Severity: Normal | Resolution:

Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-----------------------------+--------------------------------------
Changes (by Tim Graham):

* component: Utilities => Core (URLs)


Comment:

I completed the bisect by replacing `path(...)` with `url('(?P<token>.*)',
...` in the sample project. The commit where the behavior changed is
301de774c21d055e9e5a7073e5bffdb52bc71079.

The change in behavior is in `reverse()`. Args and kwargs have `str()`
applied rather than `force_text()` which adds the b prefix.
https://github.com/django/django/commit/301de774c21d055e9e5a7073e5bffdb52bc71079
#diff-b2f90a810a553411112f412de2a26617R424

Do you consider that a bug, Claude?

--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:5>

Django

unread,
Nov 15, 2017, 4:05:50 AM11/15/17
to django-...@googlegroups.com
#28796: reverse() coerces bytes in args/kwargs to str which adds "b" prefixes to
the ouput
-----------------------------+--------------------------------------
Reporter: Nick | Owner: nobody

Type: Bug | Status: new
Component: Core (URLs) | Version: 2.0
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-----------------------------+--------------------------------------

Comment (by Claude Paroz):

I would say that `reverse` is not supposed to handle arbitrary
bytestrings, so generally speaking, it's developer's role to decode any
bytestring to a string (e.g. `force_text(b'caf\xe9')` will still produce a
`DjangoUnicodeDecodeError`).

Now I could understand that we privilege backwards compatibility and still
accept ASCII bytesstrings by reintroducing `force_text` (at the expense of
a slight overhead). It's some sort of design choice.

--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:6>

Django

unread,
Nov 15, 2017, 10:52:50 AM11/15/17
to django-...@googlegroups.com
#28796: reverse() coerces bytes in args/kwargs to str which adds "b" prefixes to
the ouput
-----------------------------+--------------------------------------
Reporter: Nick | Owner: nobody

Type: Bug | Status: new
Component: Core (URLs) | Version: 2.0
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-----------------------------+--------------------------------------

Old description:

> Suppose we have:
>
> hashed_msg = hmac.new(key, msg=message, digestmod=sha384)
> url = urlsafe_base64_encode(hashed.digest())
>
> Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-

> like objects could be placed in urls no problem, creating (say)


> localhost:8000/asdfwerqwefa
>
> However, in django 2.0, the url becomes (say)
>
> localhost:8000/b&#39;asdfwerqwefa&#39;
>
> which seems less 'urlsafe', as the quotes and the 'b' are added into the
> url. Shouldn't urlsafe_base64_encode now return a string, if bytes in
> urls now retain the quotes?
>

> I have created a full project which reproduces the bug here:
>
> https://github.com/NAThompson/urlsafe_bug
>
> To reproduce the bug, type:
>
> $ ./manage.py shell
> >>> import reproduce
>

> This bug is present in commit d90936f41a2d7a3361e51fc49be033ba0f05f458,
> but much before that the 'django.urls.path' doesn't exist, so bisection
> doesn't work cleanly.

New description:

Suppose we have:
{{{
import hmac
from hashlib import sha384
from django.utils.http import urlsafe_base64_encode

hashed_msg = hmac.new(b"secret", msg=b"qwerty", digestmod=sha384)
url = urlsafe_base64_encode(hashed_msg.digest())


}}}
Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-like

objects could be used in `reverse()`, creating (say)
localhost:8000/asdfwerqwefa

However, in Django 2.0, the url becomes (say)

localhost:8000/b&#39;asdfwerqwefa&#39;

which seems less 'urlsafe', as the quotes and the 'b' are added into the
url. Shouldn't urlsafe_base64_encode now return a string, if bytes in urls
now retain the quotes?

I have created a full project which reproduces the bug here:

https://github.com/NAThompson/urlsafe_bug

To reproduce the bug, type:
{{{
$ ./manage.py shell
>>> import reproduce
}}}

This bug is present in commit d90936f41a2d7a3361e51fc49be033ba0f05f458,
but much before that the 'django.urls.path' doesn't exist, so bisection
doesn't work cleanly.

--

Comment (by Tim Graham):

I think we could document the backwards incompatibility and point out the
solution to add `.decode()`, which should also work without older versions
of Django. Does it seem okay, Nick?

--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:7>

Django

unread,
Nov 15, 2017, 4:41:59 PM11/15/17
to django-...@googlegroups.com
#28796: reverse() coerces bytes in args/kwargs to str which adds "b" prefixes to
the ouput
-----------------------------+--------------------------------------
Reporter: Nick | Owner: nobody

Type: Bug | Status: new
Component: Core (URLs) | Version: 2.0
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-----------------------------+--------------------------------------

Comment (by Nick):

I think this backwards incompatibility is an improvement on the overall
design, so I think documenting it and moving on is a good decision.

--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:8>

Django

unread,
Nov 15, 2017, 6:18:37 PM11/15/17
to django-...@googlegroups.com
#28796: reverse() coerces bytes in args/kwargs to str which adds "b" prefixes to
the ouput
--------------------------------------+------------------------------------
Reporter: Nick | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Documentation | Version: 2.0
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by Tim Graham):

* type: Bug => Cleanup/optimization
* has_patch: 0 => 1
* component: Core (URLs) => Documentation
* stage: Unreviewed => Accepted


Comment:

[https://github.com/django/django/pull/9358 PR]

--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:9>

Django

unread,
Nov 16, 2017, 9:29:53 AM11/16/17
to django-...@googlegroups.com
#28796: reverse() coerces bytes in args/kwargs to str which adds "b" prefixes to
the ouput
--------------------------------------+------------------------------------
Reporter: Nick | Owner: nobody
Type: Cleanup/optimization | Status: closed
Component: Documentation | Version: 2.0
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by Tim Graham <timograham@…>):

* status: new => closed
* resolution: => fixed


Comment:

In [changeset:"6bf85ff7e3b837378589e449ba27be8971d9b14c" 6bf85ff]:
{{{
#!CommitTicketReference repository=""
revision="6bf85ff7e3b837378589e449ba27be8971d9b14c"
Fixed #28796 -- Doc'd backwards incompatibility when reverse() receives
bytestring args/kwargs.

Due to 301de774c21d055e9e5a7073e5bffdb52bc71079.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:10>

Django

unread,
Nov 16, 2017, 9:30:15 AM11/16/17
to django-...@googlegroups.com
#28796: reverse() coerces bytes in args/kwargs to str which adds "b" prefixes to
the ouput
--------------------------------------+------------------------------------
Reporter: Nick | Owner: nobody

Type: Cleanup/optimization | Status: closed
Component: Documentation | Version: 2.0
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"0f7b5b38b603e06d6e3c8bdeecd096ad7f305af6" 0f7b5b38]:
{{{
#!CommitTicketReference repository=""
revision="0f7b5b38b603e06d6e3c8bdeecd096ad7f305af6"
[2.0.x] Fixed #28796 -- Doc'd backwards incompatibility when reverse()
receives bytestring args/kwargs.

Due to 301de774c21d055e9e5a7073e5bffdb52bc71079.

Backport of 6bf85ff7e3b837378589e449ba27be8971d9b14c from master
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:11>

Reply all
Reply to author
Forward
0 new messages