[Django] #30043: AdminURLFieldWidget incorrectly unquotes URLs e.g. containing %2F

20 views
Skip to first unread message

Django

unread,
Dec 15, 2018, 8:14:35 PM12/15/18
to django-...@googlegroups.com
#30043: AdminURLFieldWidget incorrectly unquotes URLs e.g. containing %2F
-------------------------------------+-------------------------------------
Reporter: Brenton | Owner: nobody
Partridge |
Type: Bug | Status: new
Component: | Version: master
contrib.admin | Keywords: admin, urlfield,
Severity: Normal | smart_urlquote, url, quote
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
Intended behavior: If a URLField is used in the Django admin, and the
currently saved URL in the database contains the escape %2F, the href of
the rendered link should contain that escape verbatim (e.g. %2F, not a /
in that location).

Current behavior (master branch, Python 3.7.0): The href of the rendered
link contains a / in that location, creating an extra portion of the path
and, when clicked, sending the admin user to a completely different and
unintended URL.

Root cause:
https://github.com/django/django/blob/315357ad25a6590e7f4564ec2e56a22132b09001/django/contrib/admin/widgets.py#L340
uses `smart_urlquote` to set the href (subsequently read by
https://github.com/django/django/blob/master/django/contrib/admin/templates/admin/widgets/url.html
).

smart_urlquote (as implemented here:
https://github.com/django/django/blob/315357ad25a6590e7f4564ec2e56a22132b09001/django/utils/html.py#L203
) has caused problems before e.g.
https://code.djangoproject.com/ticket/28123 . That issue, though, refers
to a backport, whereas this behavior is broken on the master branch in
Python 3.

One fix would be to not use smart_urlquote, and just ensure that the href
is set to the exact same value as initially shown in the input field and
rendered as the text of the <a> tag. Is there a reason why smart_urlquote
was used in the first place, though?

I'm happy to submit a patch and test case if this is a desired fix.

--
Ticket URL: <https://code.djangoproject.com/ticket/30043>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Dec 17, 2018, 5:55:59 AM12/17/18
to django-...@googlegroups.com
#30043: AdminURLFieldWidget incorrectly unquotes URLs e.g. containing %2F
-------------------------------------+-------------------------------------
Reporter: Brenton Partridge | Owner: nobody
Type: Bug | Status: new
Component: contrib.admin | Version: master
Severity: Normal | Resolution:
Keywords: admin, urlfield, | Triage Stage: Accepted
smart_urlquote, url, quote |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Carlton Gibson):

* stage: Unreviewed => Accepted


Comment:

Hi Brenton.

Thanks for the report.

I'm going to Accept this for now. It looks right...


{{{
>>> from django.utils.html import smart_urlquote
>>> smart_urlquote('%2F')
'/'
}}}

...but I'd quite like to see an example of the actual type of URL you're
hitting problems with.

> I'm happy to submit a patch and test case...

Can you begin by adding a test case (here or in a PR) showing the failing
example?
Thanks.

--
Ticket URL: <https://code.djangoproject.com/ticket/30043#comment:1>

Django

unread,
Dec 21, 2018, 6:39:43 PM12/21/18
to django-...@googlegroups.com
#30043: AdminURLFieldWidget incorrectly unquotes URLs e.g. containing %2F
-------------------------------------+-------------------------------------
Reporter: Brenton Partridge | Owner: nobody
Type: Bug | Status: new

Component: contrib.admin | Version: master
Severity: Normal | Resolution:
Keywords: admin, urlfield, | Triage Stage: Accepted
smart_urlquote, url, quote |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham):

* cc: Florian Apolloner (added)


Comment:

The `smart_urlquote()` dates to the original implementation in #17549
merged by Florian. I don't have much expertise to say whether or not it's
useful.

I explored the first
[https://github.com/django/django/blob/19e863a844db137045499276f0b0494b180f3f1a/tests/admin_widgets/tests.py#L369
test URL] `http://example.com/<sometag>some text</sometag>` and found that
clicking that URL stills works after removing `smart_urlquote()` as long
as the space in the test URL is encoded (bit test URL changed from `some
text` to `some%20text`. That seems fine because `URLValidator` won't
accept URLs that unencoded spaces in them.

Tread carefully since there was also a security issue in this code,
cbe6d5568f4f5053ed7228ca3c3d0cce77cf9560.

--
Ticket URL: <https://code.djangoproject.com/ticket/30043#comment:2>

Django

unread,
Dec 22, 2018, 4:19:31 AM12/22/18
to django-...@googlegroups.com
#30043: AdminURLFieldWidget incorrectly unquotes URLs e.g. containing %2F
-------------------------------------+-------------------------------------
Reporter: Brenton Partridge | Owner: nobody
Type: Bug | Status: new

Component: contrib.admin | Version: master
Severity: Normal | Resolution:
Keywords: admin, urlfield, | Triage Stage: Accepted
smart_urlquote, url, quote |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Florian Apolloner):

Dropping `smart_urlquote` would result in the wrong URL returned for
`/äöü`. The main issue here is that we do not know if the URL supplied is
already sufficiently quoted or not. Hence we try to unquote and quote
again to prevent attacks which contain half quoted URLs. To me unquoting
seems a valid choice to do since most users will properly copy URLs from
browser location bars which would return `/%C3%A4%C3%B6%C3%BC` for the
previous example.

But please try to convince me otherwise, I might miss something important
here (this code is tricky).

--
Ticket URL: <https://code.djangoproject.com/ticket/30043#comment:3>

Django

unread,
Sep 26, 2019, 7:58:06 PM9/26/19
to django-...@googlegroups.com
#30043: AdminURLFieldWidget incorrectly unquotes URLs e.g. containing %2F
-------------------------------------+-------------------------------------
Reporter: Brenton Partridge | Owner: nobody
Type: Bug | Status: new

Component: contrib.admin | Version: master
Severity: Normal | Resolution:
Keywords: admin, urlfield, | Triage Stage: Accepted
smart_urlquote, url, quote |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Brady):

* cc: Brady (added)


Comment:

I've opened up a PR that is my first attempt at fixing this by adding a
special case to check for `%2F` and pass the segment that's being quoted
through `iri_to_uri` instead, which has the `%` sign set as "safe"

The PR can be found here:
[https://github.com/django/django/pull/11837]

--
Ticket URL: <https://code.djangoproject.com/ticket/30043#comment:4>

Django

unread,
Sep 27, 2019, 2:46:09 AM9/27/19
to django-...@googlegroups.com
#30043: AdminURLFieldWidget incorrectly unquotes URLs e.g. containing %2F
-------------------------------------+-------------------------------------
Reporter: Brenton Partridge | Owner: nobody
Type: Bug | Status: new

Component: contrib.admin | Version: master
Severity: Normal | Resolution:
Keywords: admin, urlfield, | Triage Stage: Accepted
smart_urlquote, url, quote |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* has_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/30043#comment:5>

Django

unread,
Sep 28, 2019, 10:50:43 AM9/28/19
to django-...@googlegroups.com
#30043: AdminURLFieldWidget incorrectly unquotes URLs e.g. containing %2F
-------------------------------------+-------------------------------------
Reporter: Brenton Partridge | Owner: Brady
Type: Bug | Status: assigned

Component: contrib.admin | Version: master
Severity: Normal | Resolution:
Keywords: admin, urlfield, | Triage Stage: Accepted
smart_urlquote, url, quote |
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* owner: nobody => Brady
* needs_better_patch: 0 => 1
* status: new => assigned
* needs_tests: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/30043#comment:6>

Django

unread,
Oct 2, 2019, 6:14:54 AM10/2/19
to django-...@googlegroups.com
#30043: AdminURLFieldWidget incorrectly unquotes URLs e.g. containing %2F
-------------------------------------+-------------------------------------
Reporter: Brenton Partridge | Owner: Brady
Type: Bug | Status: assigned
Component: contrib.admin | Version: master
Severity: Normal | Resolution:
Keywords: admin, urlfield, | Triage Stage: Accepted
smart_urlquote, url, quote |
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Carlton Gibson):

Hi Brady.

Florian on the PR:

> I do not feel good about the special casing, also the tests seem rather
minimal. How long can a segment be? what happens if the segment contains
other special characters etc?

Taking the special casing first. At the DjangoCon sprints we were close to
thinking this might be intractable... because of the way `quote()` will
always double-escape... (memory slipping now). Can you re-construct the
analysis here that led to the idea to just handle the special case?

It may be that we have to say we can't solve it, but at least then we'll
see *why* your suggestion is as it is.

From there we can think about whether more test cases would be needed.
Thanks for the effort!

--
Ticket URL: <https://code.djangoproject.com/ticket/30043#comment:7>

Django

unread,
Oct 2, 2019, 9:55:57 AM10/2/19
to django-...@googlegroups.com
#30043: AdminURLFieldWidget incorrectly unquotes URLs e.g. containing %2F
-------------------------------------+-------------------------------------
Reporter: Brenton Partridge | Owner: Brady
Type: Bug | Status: assigned
Component: contrib.admin | Version: master
Severity: Normal | Resolution:
Keywords: admin, urlfield, | Triage Stage: Accepted
smart_urlquote, url, quote |
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Brady):

Hey Carlton,

So I got close to thinking this was intractable essentially due to not
being able to identify if the string had already been encoded or not.
Specifically around the path that had been specified (I think this is
where the original ticket came from, because during testing the %2F only
became a problem when it was part of the path. Identified by the use case
you showed above `smart_urlquote('%2F')` ) - Although I would think this
is expected behavior in the sense that if you have %2F in the path, I
would think you'd want that converted to a `/` anyways.

I tried a few ways of handling this case without a special case, and could
not come up with a viable solution, but once I added a check to see if the
segment already contained `%2F` was I able to pass the tests. To be honest
though, the more I think about it, the more I might have been coding to
pass the tests as opposed to actually solving the problem. (I'm not
actually sure what the real problem is, because the ticket author didn't
provide a use case)

I can continue to write some more tests, and try to figure out a way to
make this less of a special case, but would love to hear both your
thoughts on how much of a problem this actually is.

--
Ticket URL: <https://code.djangoproject.com/ticket/30043#comment:8>

Django

unread,
Oct 2, 2019, 10:40:28 AM10/2/19
to django-...@googlegroups.com
#30043: AdminURLFieldWidget incorrectly unquotes URLs e.g. containing %2F
-------------------------------------+-------------------------------------
Reporter: Brenton Partridge | Owner: Brady
Type: Bug | Status: assigned
Component: contrib.admin | Version: master
Severity: Normal | Resolution:
Keywords: admin, urlfield, | Triage Stage: Accepted
smart_urlquote, url, quote |
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Carlton Gibson):

Right, yes, that was it. The issue is that there's no (easy) way to know
if a string has already been quoted, which re-reading is exactly what
Florian said above:

> The main issue here is that we do not know if the URL supplied is
already sufficiently quoted or not.

I think the use-case will be where you have a URL, with a quoted-URL as a
query parameter (say). As understand it, you do want `%2F`s in there, so
you'd use `quote_plus`, as used by default by `urlencode`:

{{{
>>> from urllib.parse import quote, quote_plus, urlencode
>>> quote('https://djangoproject.com/')
'https%3A//djangoproject.com/'
>>> quote_plus('https://djangoproject.com/testing/')
'https%3A%2F%2Fdjangoproject.com%2Ftesting%2F'
>>> urlencode({'q':'https://djangoproject.com/testing/'})
'q=https%3A%2F%2Fdjangoproject.com%2Ftesting%2F'
}}}

But, if it's part of the path, and not the query parameters, you don't
want `quote_plus`. Taking Florian's example:

{{{
>>> quote('/äöü')
'/%C3%A4%C3%B6%C3%BC'
>>> quote_plus('/äöü')
'%2F%C3%A4%C3%B6%C3%BC'
}}}

The 2nd, as part of the path, would be wrong.

The good thing about `iri_to_uri()` is that it won't ''double-percent-
escape'', whereas `quote()` keeps on changing the input (re-quoting) each
time, so if a URL does **look** quoted you can pass it through and get the
right results. **But edge cases...???**

`smart_urlquote()`, perhaps why it has the name,
[https://github.com/django/django/blob/315357ad25a6590e7f4564ec2e56a22132b09001/django/utils/html.py#L224-L233
**already** handles the the query string parameters separately, using
`urlencode`]. (And looking at the issue linked in the comment there, that
was the exact discussion from #22267...) — **SO** short of concrete test
cases (i.e. full example URLs where the behaviour is wrong) I can't see
that we can progress.

On that basis, I'm going to close as `needsinfo`. Happy to re-open if full
test cases come up.

Thank you for your effort and time looking into this Brady! Welcome
aboard! 🎉

--
Ticket URL: <https://code.djangoproject.com/ticket/30043#comment:9>

Django

unread,
Nov 30, 2019, 5:44:25 AM11/30/19
to django-...@googlegroups.com
#30043: AdminURLFieldWidget incorrectly unquotes URLs e.g. containing %2F
-------------------------------------+-------------------------------------
Reporter: Brenton Partridge | Owner: Brady
Type: Bug | Status: closed
Component: contrib.admin | Version: master
Severity: Normal | Resolution: needsinfo

Keywords: admin, urlfield, | Triage Stage: Accepted
smart_urlquote, url, quote |
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Baptiste Mispelon):

* status: assigned => closed
* resolution: => needsinfo


Comment:

Closing per [comment:9 previous comment]

--
Ticket URL: <https://code.djangoproject.com/ticket/30043#comment:10>

Reply all
Reply to author
Forward
0 new messages