hashed_msg = hmac.new(key, msg=message, digestmod=sha384)
url = urlsafe_base64_encode(hashed.digest())
Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-like
objects could be placed in urls no problem, creating (say)
localhost:8000/asdfwerqwefa
However, in django 2.0, the url becomes (say)
localhost:8000/b'asdfwerqwefa'
which seems less 'urlsafe', as the quotes and the 'b' are added into the
url. Shouldn't urlsafe_base64_encode now return a string, if bytes in urls
now retain the quotes?
--
Ticket URL: <https://code.djangoproject.com/ticket/28796>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* type: Uncategorized => Bug
* component: Uncategorized => Utilities
Comment:
Could you please edit the description to include a complete code snippet
that reproduces the problem?
Ideally, you could also
[https://docs.djangoproject.com/en/dev/internals/contributing/triaging-
tickets/#bisecting-a-regression bisect] to find the commit where the
behavior changed.
--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:1>
Comment (by Claude Paroz):
I would be all for making `urlsafe_base64_encode` return a string, but it
would be plainly backwards incompatible, as much existing code is already
doing a `decode()` on the result of this function. See usage in
`django/contrib/auth/forms.py` for example.
--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:2>
Old description:
> Suppose we have:
>
> hashed_msg = hmac.new(key, msg=message, digestmod=sha384)
> url = urlsafe_base64_encode(hashed.digest())
>
> Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-
> like objects could be placed in urls no problem, creating (say)
> localhost:8000/asdfwerqwefa
>
> However, in django 2.0, the url becomes (say)
>
> localhost:8000/b'asdfwerqwefa'
>
> which seems less 'urlsafe', as the quotes and the 'b' are added into the
> url. Shouldn't urlsafe_base64_encode now return a string, if bytes in
> urls now retain the quotes?
New description:
Suppose we have:
hashed_msg = hmac.new(key, msg=message, digestmod=sha384)
url = urlsafe_base64_encode(hashed.digest())
Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-like
objects could be placed in urls no problem, creating (say)
localhost:8000/asdfwerqwefa
However, in django 2.0, the url becomes (say)
localhost:8000/b'asdfwerqwefa'
which seems less 'urlsafe', as the quotes and the 'b' are added into the
url. Shouldn't urlsafe_base64_encode now return a string, if bytes in urls
now retain the quotes?
I have created a full project which reproduces the bug here:
https://github.com/NAThompson/urlsafe_bug
To reproduce the bug, type:
$ ./manage.py shell
>>> import reproduce
--
--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:3>
Old description:
> Suppose we have:
>
> hashed_msg = hmac.new(key, msg=message, digestmod=sha384)
> url = urlsafe_base64_encode(hashed.digest())
>
> Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-
> like objects could be placed in urls no problem, creating (say)
> localhost:8000/asdfwerqwefa
>
> However, in django 2.0, the url becomes (say)
>
> localhost:8000/b'asdfwerqwefa'
>
> which seems less 'urlsafe', as the quotes and the 'b' are added into the
> url. Shouldn't urlsafe_base64_encode now return a string, if bytes in
> urls now retain the quotes?
>
> I have created a full project which reproduces the bug here:
>
> https://github.com/NAThompson/urlsafe_bug
>
> To reproduce the bug, type:
>
> $ ./manage.py shell
> >>> import reproduce
New description:
Suppose we have:
hashed_msg = hmac.new(key, msg=message, digestmod=sha384)
url = urlsafe_base64_encode(hashed.digest())
Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-like
objects could be placed in urls no problem, creating (say)
localhost:8000/asdfwerqwefa
However, in django 2.0, the url becomes (say)
localhost:8000/b'asdfwerqwefa'
which seems less 'urlsafe', as the quotes and the 'b' are added into the
url. Shouldn't urlsafe_base64_encode now return a string, if bytes in urls
now retain the quotes?
I have created a full project which reproduces the bug here:
https://github.com/NAThompson/urlsafe_bug
To reproduce the bug, type:
$ ./manage.py shell
>>> import reproduce
This bug is present in commit d90936f41a2d7a3361e51fc49be033ba0f05f458,
but much before that the 'django.urls.path' doesn't exist, so bisection
doesn't work cleanly.
--
--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:4>
* component: Utilities => Core (URLs)
Comment:
I completed the bisect by replacing `path(...)` with `url('(?P<token>.*)',
...` in the sample project. The commit where the behavior changed is
301de774c21d055e9e5a7073e5bffdb52bc71079.
The change in behavior is in `reverse()`. Args and kwargs have `str()`
applied rather than `force_text()` which adds the b prefix.
https://github.com/django/django/commit/301de774c21d055e9e5a7073e5bffdb52bc71079
#diff-b2f90a810a553411112f412de2a26617R424
Do you consider that a bug, Claude?
--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:5>
Comment (by Claude Paroz):
I would say that `reverse` is not supposed to handle arbitrary
bytestrings, so generally speaking, it's developer's role to decode any
bytestring to a string (e.g. `force_text(b'caf\xe9')` will still produce a
`DjangoUnicodeDecodeError`).
Now I could understand that we privilege backwards compatibility and still
accept ASCII bytesstrings by reintroducing `force_text` (at the expense of
a slight overhead). It's some sort of design choice.
--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:6>
Old description:
> Suppose we have:
>
> hashed_msg = hmac.new(key, msg=message, digestmod=sha384)
> url = urlsafe_base64_encode(hashed.digest())
>
> Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-
> like objects could be placed in urls no problem, creating (say)
> localhost:8000/asdfwerqwefa
>
> However, in django 2.0, the url becomes (say)
>
> localhost:8000/b'asdfwerqwefa'
>
> which seems less 'urlsafe', as the quotes and the 'b' are added into the
> url. Shouldn't urlsafe_base64_encode now return a string, if bytes in
> urls now retain the quotes?
>
> I have created a full project which reproduces the bug here:
>
> https://github.com/NAThompson/urlsafe_bug
>
> To reproduce the bug, type:
>
> $ ./manage.py shell
> >>> import reproduce
>
> This bug is present in commit d90936f41a2d7a3361e51fc49be033ba0f05f458,
> but much before that the 'django.urls.path' doesn't exist, so bisection
> doesn't work cleanly.
New description:
Suppose we have:
{{{
import hmac
from hashlib import sha384
from django.utils.http import urlsafe_base64_encode
hashed_msg = hmac.new(b"secret", msg=b"qwerty", digestmod=sha384)
url = urlsafe_base64_encode(hashed_msg.digest())
}}}
Now url is a bytes object, say b'asdfwerqwefa'. In django 1.11, bytes-like
objects could be used in `reverse()`, creating (say)
localhost:8000/asdfwerqwefa
However, in Django 2.0, the url becomes (say)
localhost:8000/b'asdfwerqwefa'
which seems less 'urlsafe', as the quotes and the 'b' are added into the
url. Shouldn't urlsafe_base64_encode now return a string, if bytes in urls
now retain the quotes?
I have created a full project which reproduces the bug here:
https://github.com/NAThompson/urlsafe_bug
To reproduce the bug, type:
{{{
$ ./manage.py shell
>>> import reproduce
}}}
This bug is present in commit d90936f41a2d7a3361e51fc49be033ba0f05f458,
but much before that the 'django.urls.path' doesn't exist, so bisection
doesn't work cleanly.
--
Comment (by Tim Graham):
I think we could document the backwards incompatibility and point out the
solution to add `.decode()`, which should also work without older versions
of Django. Does it seem okay, Nick?
--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:7>
Comment (by Nick):
I think this backwards incompatibility is an improvement on the overall
design, so I think documenting it and moving on is a good decision.
--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:8>
* type: Bug => Cleanup/optimization
* has_patch: 0 => 1
* component: Core (URLs) => Documentation
* stage: Unreviewed => Accepted
Comment:
[https://github.com/django/django/pull/9358 PR]
--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:9>
* status: new => closed
* resolution: => fixed
Comment:
In [changeset:"6bf85ff7e3b837378589e449ba27be8971d9b14c" 6bf85ff]:
{{{
#!CommitTicketReference repository=""
revision="6bf85ff7e3b837378589e449ba27be8971d9b14c"
Fixed #28796 -- Doc'd backwards incompatibility when reverse() receives
bytestring args/kwargs.
Due to 301de774c21d055e9e5a7073e5bffdb52bc71079.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:10>
Comment (by Tim Graham <timograham@…>):
In [changeset:"0f7b5b38b603e06d6e3c8bdeecd096ad7f305af6" 0f7b5b38]:
{{{
#!CommitTicketReference repository=""
revision="0f7b5b38b603e06d6e3c8bdeecd096ad7f305af6"
[2.0.x] Fixed #28796 -- Doc'd backwards incompatibility when reverse()
receives bytestring args/kwargs.
Due to 301de774c21d055e9e5a7073e5bffdb52bc71079.
Backport of 6bf85ff7e3b837378589e449ba27be8971d9b14c from master
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/28796#comment:11>