1.6 reverse() escapes unreserved chars in path components


Erik van Zijst

Mar 1, 2014, 1:26:09 AM
to django-d...@googlegroups.com
Django's django.core.urlresolvers.reverse() seems to have changed its behavior in 1.6. It now runs the arguments through quote(), without specifying the safe characters for path components. As a result:

on 1.4.10:
In [2]: reverse('test', args=['foo:bar'])
Out[2]: '/foo:bar'

but on 1.6.2:
In [2]: reverse('test', args=['foo:bar'])
Out[2]: '/foo%3Abar'

It would seem to me that this is a regression, as ":@-._~!$&'()*+,;=" are all allowed unescaped in path segments AFAIK.
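A minimal sketch of the difference, using Python 3's `urllib.parse.quote` as a stand-in for the quoting step. The constant name `PATH_SEGMENT_SAFE` is made up for illustration, and this is not Django's actual code:

```python
from urllib.parse import quote

# Hypothetical name: the characters listed in the post, i.e. the
# non-alphanumeric characters RFC 3986 allows unescaped in a path segment.
PATH_SEGMENT_SAFE = ":@-._~!$&'()*+,;="

# With the default safe="/" the colon is percent-encoded.
default_quoted = quote("foo:bar")

# With the path-segment characters marked safe, the colon is preserved.
segment_quoted = quote("foo:bar", safe=PATH_SEGMENT_SAFE)

print(default_quoted)
print(segment_quoted)
```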

Cheers,
Erik

Sam Lai

Mar 1, 2014, 5:41:41 PM
to django-d...@googlegroups.com
The relevant commit and issue -

https://github.com/django/django/commit/31b5275235bac150a54059db0288a19b9e0516c7
https://code.djangoproject.com/ticket/13260
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-develop...@googlegroups.com.
> To post to this group, send email to django-d...@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/064ba557-a722-484f-93bf-423048b51b14%40googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

Erik van Zijst

Mar 1, 2014, 8:28:33 PM
to django-d...@googlegroups.com
Yes I saw that, but I'm confused. I thought these characters are
allowed unescaped in path segments.

Sam Lai

Mar 2, 2014, 8:58:37 AM
to django-d...@googlegroups.com
I wasn't expressing an opinion either way; just adding the relevant
commit to the conversation.

Looks like RFC 3986 is the relevant RFC describing the permitted
characters in URIs, specifically sections 2.2 and 2.3 -
http://tools.ietf.org/html/rfc3986#section-2.2

It seems like the fix makes it easier for 90% of the uses, but
explicitly blocks the other 10% (i.e. uses involving the use of
'reserved' characters as permitted by the RFC).

The relevant django-developers discussion is here -
https://groups.google.com/forum/#!searchin/django-developers/13260/django-developers/Gofq5y40mYA/v_4yjrBItWkJ
The final post addresses this issue, but doesn't seem to have been
taken into account when the patch was accepted.

Erik van Zijst

Mar 5, 2014, 5:04:51 PM
to django-d...@googlegroups.com
On Sunday, 2 March 2014 05:58:37 UTC-8, Sam Lai wrote:
> It seems like the fix makes it easier for 90% of the uses, but
> explicitly blocks the other 10% (i.e. uses involving the use of
> 'reserved' characters as permitted by the RFC).

Yes. I'm bringing this up because it breaks certain OAuth 1 clients against Bitbucket.

In some places we redirect to URLs whose path segment contains a ":". Prior to us upgrading to 1.6 the response's location header preserved that colon, but now it gets escaped, changing the URL (e.g. https://api.bitbucket.org/2.0/repositories/david/django-storages/pullrequests/51/diff redirecting to https://api.bitbucket.org/2.0/repositories/david/django-storages/diff/regadas/django-storages%3A069fd1d01fbf..f153a70ba254)

In OAuth 1, requests are signed, including the request URL, but RFC 5849 does not mandate any pre-processing of the URL. For several OAuth clients (including requests-oauthlib and python-oauth2), that means they compute the signature over a string that contains "%3A" instead of ":".

On the server, however, the request path is automatically unquoted before it reaches the middleware and views. As our OAuth layer is a middleware that reconstructs the signature, it ends up computing over ":", yielding a different signature than the client and breaking authentication.
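To illustrate the mismatch, here is a rough sketch of an HMAC-SHA1 signature in the style of OAuth 1. It is deliberately simplified and is not a complete RFC 5849 implementation; the helper names, the example.com URL, the parameters, and the secret are all made up for illustration:

```python
import base64
import hashlib
import hmac
from urllib.parse import quote, unquote


def oauth_encode(value):
    # Percent-encode everything except the RFC 3986 unreserved characters.
    return quote(value, safe="-._~")


def sign(method, url, params, consumer_secret, token_secret=""):
    # Simplified base string: METHOD & encoded URL & encoded sorted params.
    normalized = "&".join(
        f"{oauth_encode(k)}={oauth_encode(v)}" for k, v in sorted(params.items())
    )
    base_string = "&".join(
        [method.upper(), oauth_encode(url), oauth_encode(normalized)]
    )
    key = f"{oauth_encode(consumer_secret)}&{oauth_encode(token_secret)}"
    digest = hmac.new(key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


# Made-up request data for the sketch.
params = {"oauth_nonce": "abc123", "oauth_timestamp": "1394056000"}
redirect_url = "https://api.example.com/diff/user/repo%3A069fd1d..f153a70"

# The client signs the URL exactly as it appeared in the Location header.
client_sig = sign("GET", redirect_url, params, "secret")

# The server sees the path after unquoting, so it signs ":" instead of "%3A".
server_sig = sign("GET", unquote(redirect_url), params, "secret")

print(client_sig == server_sig)
```

The two base strings differ only in how the colon is represented ("%253A" versus "%3A" after encoding), which is enough to produce unrelated signatures.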

This might be addressable by changing these OAuth clients to unquote the path segment before signing, but a better solution would seem to be to make urlresolvers.py:RegexURLResolver respect the reserved characters for path segments and not escape what does not need to be escaped.
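As a rough sketch of what such quoting could look like (an assumption about the approach, not the eventual patch; `quote_path_segment` is a hypothetical helper), the characters allowed in a path segment are left alone while delimiters such as "/", "?", "#" and "%" are still escaped, so the value round-trips through unquoting:

```python
from urllib.parse import quote, unquote

# Non-alphanumeric characters RFC 3986 permits unescaped in a path segment.
PATH_SEGMENT_SAFE = ":@-._~!$&'()*+,;="


def quote_path_segment(value):
    # Hypothetical helper: escape only what a path segment cannot carry as-is.
    return quote(value, safe=PATH_SEGMENT_SAFE)


# A revision spec like the one in the redirect keeps its colon.
spec = quote_path_segment("django-storages:069fd1d01fbf..f153a70ba254")

# Characters that would change the URL's structure are still escaped.
tricky = quote_path_segment("a/b?c#d%e f")

print(spec)
print(tricky)
print(unquote(tricky))
```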

I'll follow up with a pull request, unless there are strong feelings about, or unwanted consequences of, that approach.

Cheers,
Erik

Erik van Zijst

Mar 5, 2014, 5:30:44 PM
to django-d...@googlegroups.com
On Wednesday, 5 March 2014 14:04:51 UTC-8, Erik van Zijst wrote:
> I'll follow up with a pull request, unless there are strong feelings about, or unwanted consequences of, that approach.
