[Django] #22130: Accelerated deprecation fix_ampersands and clean_html

11 views
Skip to first unread message

Django

unread,
Feb 23, 2014, 3:08:27 PM2/23/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
---------------------------------+------------------------
Reporter: erikr | Owner: nobody
Type: Bug | Status: new
Component: Template system | Version: 1.6
Severity: Normal | Keywords: nlsprint14
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+------------------------
I would like to propose the accelerated deprecation of the
`fix_ampersands` built-in template filter, because in all use cases, it
either simply does not work, or stimulates the user to create security
vulnerabilities.

'''fix_ampersands'''
Fix_ampersands aims to fix unescaped ampersands, by replacing them with
`&`. It attempts smart detection of HTML entities, so in `©`, no
change is made. The detection of this is, understandably, limited, for
example `R&D; HRM; ...` would cause it not to 'fix' the ampersand, even
though it is not an entity.

The template tag is marked with `is_safe=True`, which means that the
default escaping will occur of the input string was unsafe, and will not
occur if the string was safe. However, this means that with an unsafe
string, the fixing is ineffective, because the ampersand in the just
generated entity is again escaped by the standard HTML escaping:

{{{
>>> print Template("{{ text|fix_ampersands }}").render(Context({'text':
'foo & bar'}))
foo & bar
}}}

If the string is marked as safe, this issue does not occur:


{{{
>>> print Template("{{ text|fix_ampersands|safe
}}").render(Context({'text': 'foo & bar'}))
foo & bar
}}}

To the unsuspecting user, this may seem like the solution. However, this
means only the ampersand is escaped, introducing an XSS vulnerability:


{{{
>>> print Template("{{ text|fix_ampersands|safe
}}").render(Context({'text': 'foo & bar <script>alert("xss")</script>'}))
foo &amp; bar <script>alert("xss")</script>
}}}

Therefore, the fix_ampersands tag is only useful when an otherwise safe
string is encountered, but which has unescaped ampersands, and does not
have any unescaped ampersands that could be misread as an HTML entity.
However, safe strings should never have unescaped ampersands. Therefore,
the only use cases I see for this tag are broken or insecure.

'''clean_html'''
Fix_ampersands is only used in `django.utils.clean_html`. This function is
not documented, and appears to have been added with the initial
publication of django. The use of this function is unclear, because it
does numerous manipulations on HTML tags, but does not actually make it
safe to include. Also, the things it cleans look like they are arbitrary
selected, like 'Trim stupid HTML such as <br clear="all">'.

'''Proposal'''
Searching on GitHub, I have not actually managed to locate any usage of
either `clean_html` or `fix_ampersands`. I did not do a very thorough
search, but it seems it is quite rarely used.

Considering that `clean_html`'s behaviour is rather arbitrary and
incomplete, and any use of `fix_ampersands` is broken or insecure, I
suggest to start an accelerated deprecation cycle as of 1.7. I suggest
this to be accelerated because the current tag stimulates users to create
potentially serious vulnerabilities in their projects. If accepted, I will
be happy to write the patch.

--
Ticket URL: <https://code.djangoproject.com/ticket/22130>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Feb 23, 2014, 3:08:47 PM2/23/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
---------------------------------+--------------------------------------

Reporter: erikr | Owner: nobody
Type: Bug | Status: new
Component: Template system | Version: 1.6
Severity: Normal | Resolution:

Keywords: nlsprint14 | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
---------------------------------+--------------------------------------
Changes (by erikr):

* cc: eromijn@… (added)
* needs_docs: => 0
* needs_tests: => 0
* needs_better_patch: => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/22130#comment:1>

Django

unread,
Feb 23, 2014, 3:16:48 PM2/23/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
---------------------------------+--------------------------------------

Reporter: erikr | Owner: nobody
Type: Bug | Status: new
Component: Template system | Version: 1.6
Severity: Normal | Resolution:

Keywords: nlsprint14 | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
---------------------------------+--------------------------------------

Comment (by erikr):

Django-developers thread started on
https://groups.google.com/forum/#!topic/django-developers/UehmGnEjiSE

Also, it should be `django.utils.html.clean_html`, not
`django.utils.clean_html`.

--
Ticket URL: <https://code.djangoproject.com/ticket/22130#comment:2>

Django

unread,
Feb 23, 2014, 3:18:22 PM2/23/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
--------------------------------------+------------------------------------
Reporter: erikr | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Utilities | Version: master
Severity: Normal | Resolution:
Keywords: nlsprint14 | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by bmispelon):

* type: Bug => Cleanup/optimization
* version: 1.6 => master
* component: Template system => Utilities
* stage: Unreviewed => Accepted


Comment:

I'm +1 on the idea.

`fix_ampersands` and `clean_html` have been in Django since the beginning
and I'm guessing that they were made for a very specific use-case at World
Online but I don't believe they have their place inside a generic web
framework.

As an argument, here's `clean_html`'s docstring, explaining what it does
exactly:
{{{
Clean the given HTML. Specifically, do the following:
* Convert <b> and <i> to <strong> and <em>.
* Encode all ampersands correctly.
* Remove all "target" attributes from <a> tags.
* Remove extraneous HTML, such as presentational tags that open and
immediately close and <br clear="all">.
* Convert hard-coded bullets into HTML unordered lists.
* Remove stuff like "<p>&nbsp;&nbsp;</p>", but only if it's at the
bottom of the text.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/22130#comment:3>

Django

unread,
Mar 1, 2014, 4:44:16 AM3/1/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
--------------------------------------+------------------------------------
Reporter: erikr | Owner: erikr
Type: Cleanup/optimization | Status: assigned

Component: Utilities | Version: master
Severity: Normal | Resolution:
Keywords: nlsprint14 | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by erikr):

* owner: nobody => erikr
* status: new => assigned


Comment:

As there were no objections on the mailing list, I think we can move
forward on this.

As Alex clarified on the list, there does not appear to be any code on
GitHub that uses clean_html, and given the functionality, I don't see any
use for it at all. There was also no documentation for it. Therefore, I
will simply remove clean_html. ``fix_ampersands`` will be set up for
accelerated deprecation: DeprecationWarning in 1.7, removal in 1.8.

--
Ticket URL: <https://code.djangoproject.com/ticket/22130#comment:4>

Django

unread,
Mar 1, 2014, 7:23:02 AM3/1/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
--------------------------------------+------------------------------------
Reporter: erikr | Owner: erikr
Type: Cleanup/optimization | Status: assigned
Component: Utilities | Version: master
Severity: Normal | Resolution:
Keywords: nlsprint14 | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by erikr):

* has_patch: 0 => 1


Comment:

PR in https://github.com/django/django/pull/2382

--
Ticket URL: <https://code.djangoproject.com/ticket/22130#comment:5>

Django

unread,
Mar 1, 2014, 12:09:05 PM3/1/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
--------------------------------------+------------------------------------
Reporter: erikr | Owner: erikr
Type: Cleanup/optimization | Status: closed
Component: Utilities | Version: master
Severity: Normal | Resolution: fixed

Keywords: nlsprint14 | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------

Comment (by Aymeric Augustin <aymeric.augustin@…>):

In [changeset:"3273bd7b254680a5b241e2fdbc3196956b2b44e8"]:
{{{
#!CommitTicketReference repository=""
revision="3273bd7b254680a5b241e2fdbc3196956b2b44e8"
Merge pull request #2382 from erikr/deprecate-fix-ampersands

Fixed #22130 -- Deprecated fix_ampersands and clean_html.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/22130#comment:7>

Django

unread,
Mar 1, 2014, 12:09:04 PM3/1/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
--------------------------------------+------------------------------------
Reporter: erikr | Owner: erikr
Type: Cleanup/optimization | Status: closed
Component: Utilities | Version: master
Severity: Normal | Resolution: fixed
Keywords: nlsprint14 | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by Erik Romijn <eromijn@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"775975f15d7d461c154e558cba5fb0592539126f"]:
{{{
#!CommitTicketReference repository=""
revision="775975f15d7d461c154e558cba5fb0592539126f"
Fixed #22130 -- Deprecated fix_ampersands, removed utils.clean_html()
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/22130#comment:6>

Django

unread,
Mar 1, 2014, 8:06:17 PM3/1/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
--------------------------------------+------------------------------------
Reporter: erikr | Owner: erikr
Type: Cleanup/optimization | Status: new

Component: Utilities | Version: master
Severity: Normal | Resolution:
Keywords: nlsprint14 | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by timo):

* status: closed => new
* has_patch: 1 => 0
* resolution: fixed =>


Comment:

The test of the new warning fails when run after `template_tests`:
`./runtests.py template_tests defaultfilters`

See
[http://ci.djangoproject.com/job/Django/4100/database=sqlite3,python=python3.3/testReport/junit/defaultfilters.tests/DefaultFiltersTests/test_fix_ampersands/
Jenkins] for details.

--
Ticket URL: <https://code.djangoproject.com/ticket/22130#comment:8>

Django

unread,
Mar 2, 2014, 3:04:00 AM3/2/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
--------------------------------------+------------------------------------
Reporter: erikr | Owner: erikr
Type: Cleanup/optimization | Status: new
Component: Utilities | Version: master
Severity: Normal | Resolution:
Keywords: nlsprint14 | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by erikr):

* has_patch: 0 => 1


Comment:

Sorry about that - this is the same issue I asked about on the mailing
list: it turns out the warnings filters across multiple tests interact.
`template_tests` sets an ignore for these DeprecationWarnings, which means
`defaultfilters` does not see them anymore, even though it sets filters
back to `default`. Even an explicit call to `warnings.resetwarnings()` did
not change this. I fixed this for `utils_tests` by removing that specific
check, hadn't realised it was an issue here too as the order for my
locally run tests was different.

As Aymeric told me it was not really necessary to test for the warning
actually being issued, the quick fix is:
https://github.com/django/django/pull/2383

--
Ticket URL: <https://code.djangoproject.com/ticket/22130#comment:9>

Django

unread,
Mar 2, 2014, 3:21:51 AM3/2/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
-------------------------------------+-------------------------------------
Reporter: erikr | Owner: aaugustin
Type: | Status: assigned
Cleanup/optimization | Version: master
Component: Utilities | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: nlsprint14 | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by aaugustin):

* status: new => assigned

* owner: erikr => aaugustin


Comment:

I'll try to figure out if I can stop template_tests from leaking warning
state first.

--
Ticket URL: <https://code.djangoproject.com/ticket/22130#comment:10>

Django

unread,
Mar 2, 2014, 9:30:55 AM3/2/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
-------------------------------------+-------------------------------------
Reporter: erikr | Owner: aaugustin
Type: | Status: assigned
Cleanup/optimization | Version: master
Component: Utilities | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: nlsprint14 | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by aaugustin):

I spend some time working on this and I cannot figure out what's going on.
It doesn't appear to be related to warning filters. If you insert the
following code in `test_fix_ampersands`:

{{{
print("\n\nWARNING STATE\n")
for f in warnings.filters:
print(f)
if f[1] and f[1].pattern: print(' ' + f[1].pattern)
if f[3] and f[3].pattern: print(' ' + f[3].pattern)
}}}

you can see that the warning filters are exactly identical with or without
running the `template_tests` first, even though captured warnings are
different .

While attempting to nail down the issue, I hit #22184, and that's more
work than I'm willing to do just to test a deprecation warning that very
obviously works. I don't want to waste more time on this, so I'll just
merge Erik's pull request.

--
Ticket URL: <https://code.djangoproject.com/ticket/22130#comment:11>

Django

unread,
Mar 2, 2014, 9:31:22 AM3/2/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
-------------------------------------+-------------------------------------
Reporter: erikr | Owner: aaugustin
Type: | Status: closed
Cleanup/optimization | Version: master
Component: Utilities | Resolution: fixed

Severity: Normal | Triage Stage: Accepted
Keywords: nlsprint14 | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by Erik Romijn <eromijn@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"6268792e19ab37446e438f4777c0bb8e786c5c3f"]:
{{{
#!CommitTicketReference repository=""
revision="6268792e19ab37446e438f4777c0bb8e786c5c3f"
Fixed #22130 -- fixed template_tests/defaultfilters order dependent test
failure
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/22130#comment:12>

Django

unread,
Mar 2, 2014, 9:31:22 AM3/2/14
to django-...@googlegroups.com
#22130: Accelerated deprecation fix_ampersands and clean_html
-------------------------------------+-------------------------------------
Reporter: erikr | Owner: aaugustin
Type: | Status: closed
Cleanup/optimization | Version: master
Component: Utilities | Resolution: fixed
Severity: Normal | Triage Stage: Accepted
Keywords: nlsprint14 | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by Aymeric Augustin <aymeric.augustin@…>):

In [changeset:"a08f906556c14357852bbc9c4c7d10c6558dafcf"]:
{{{
#!CommitTicketReference repository=""
revision="a08f906556c14357852bbc9c4c7d10c6558dafcf"
Merge pull request #2383 from erikr/fix-tests-fix-ampersands-deprecation

Fixed #22130 -- fixed template_tests/defaultfilters order dependent test
failure
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/22130#comment:13>

Reply all
Reply to author
Forward
0 new messages