An argument against mark_safe.

Jonathan Slenders

Oct 16, 2013, 3:30:01 PM
to django-d...@googlegroups.com

Currently, there is a discussion going on at python-ideas about taint tracking in Python. Taint tracking means tracking data that comes from untrusted sources and preventing it from being used in sensitive places. This video [1] from last year explains the problems very well.

I noticed that we can do better in Django. We already have mark_safe, but what does such a SafeText mean? Safe as HTML, JavaScript, CSS, SQL or even something else? We know it's usually HTML, but that's not always the case.

Some people still have JavaScript in their templates and they use template tags inside their JavaScript. :(

Some people use the templating engine even for other stuff than generating HTML. The point is that we can't assume that "safe" means "safe as HTML". We have many languages on the web and HTML is just one of them.

I propose some changes that are backwards compatible for django.utils.safestring:
We should rename SafeText to HtmlText. Further we should not expect people to call format_html.

Instead of mark_safe, I propose that we call:
HtmlText('<p> %s </p>')
This explicitly annotates a string as HTML.

Instead of format_html, I propose that we do:
HtmlText('<p> %s </p>') % 'unsafe text'
Just as django.utils.safestring.SafeText implements __add__, we can implement __mod__ as well.

I think that string interpolation feels more natural. (Or for those who prefer .format(), we can add that method to HtmlText.)
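The interpolation idea above could look roughly like the following minimal sketch. It is not Django's implementation: HtmlText here is a hypothetical stand-alone class, the helper _escape_unless_html is invented for illustration, and the standard library's html.escape stands in for Django's escaping. It only handles %s-style placeholders with positional arguments; dict-style %(name)s and numeric codes such as %d are left out.

```python
from html import escape


def _escape_unless_html(value):
    # Values already annotated as HtmlText pass through unchanged;
    # everything else is converted to text and HTML-escaped.
    if isinstance(value, HtmlText):
        return value
    return escape(str(value))


class HtmlText(str):
    """Hypothetical stand-in for a renamed SafeText; not Django's actual API."""

    def __mod__(self, args):
        # Accept both a single value and a tuple of values, as str % does.
        if not isinstance(args, tuple):
            args = (args,)
        escaped = tuple(_escape_unless_html(a) for a in args)
        return HtmlText(str.__mod__(self, escaped))

    def format(self, *args, **kwargs):
        # Same rule for .format(): escape every argument that is not HtmlText.
        escaped_args = [_escape_unless_html(a) for a in args]
        escaped_kwargs = {k: _escape_unless_html(v) for k, v in kwargs.items()}
        return HtmlText(str.format(self, *escaped_args, **escaped_kwargs))


# Untrusted text is escaped; text already marked as HTML is not.
greeting = HtmlText('<p> %s </p>') % '<script>alert(1)</script>'
nested = HtmlText('<div>%s</div>') % HtmlText('<b>bold</b>')
formatted = HtmlText('{} and {}').format('<a>', HtmlText('<i>x</i>'))
```

The result of each operation is again an HtmlText, so it can be interpolated into a larger HtmlText without being escaped a second time.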

It can also be possible to stack escaping filters in the future:
HtmlText('<script>%s</script>') % JavascriptText("function() { echo '%s'; }") % 'hello world'
(Implementing JavascriptText can be hard, as the escaping rules differ between different parts of the code.)

Further, I would deprecate mark_for_escaping and EscapeData. [2] There should never be a reason to call this function.

Any suggestions?

[1] http://www.youtube.com/watch?v=WmZvnKYiNlE
[2] mark_for_escaping = lambda s: str(s) # Actually: mark_for_escaping == str

Russell Keith-Magee

Oct 16, 2013, 7:20:18 PM
to Django Developers
I can't fault your reasoning -- "safe" isn't a universal concept, and we don't just generate HTML. There have already been a couple of bugs raised about escaping in JavaScript, IIRC.

Your proposed approach certainly sounds like a good starting point. I'm sure there will be plenty of devil in the detail -- not the least of which is how to handle this in templates that can have mixed content (e.g., an HTML page with JavaScript and CSS content embedded).

From an implementation point of view, we can't just remove mark_safe arbitrarily -- we'd need to have a migration plan in place, so that all the existing code out there that is using mark_safe continues to work (presumably interpreting safe text as safe HTML in the interim).
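One possible shape for such an interim period is sketched below. This is only an assumption about how a migration could look, not an agreed plan: HtmlText is the hypothetical class name from the proposal, and the choice of PendingDeprecationWarning and the warning text are made up for illustration.

```python
import warnings


class HtmlText(str):
    """Hypothetical replacement for SafeText (name taken from the proposal)."""


# The old class name stays importable, so isinstance() checks in
# existing code keep working.
SafeText = HtmlText


def mark_safe(s):
    # Existing callers keep working; "safe" is interpreted as "safe as HTML".
    warnings.warn(
        "mark_safe() is treated as 'safe as HTML'; use HtmlText() instead.",
        PendingDeprecationWarning,
        stacklevel=2,
    )
    return HtmlText(s)
```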

The other problem is finding someone to volunteer to actually do the work :-) What you've described isn't a small undertaking.

Yours
Russ Magee %-)

Daniele Procida

Oct 17, 2013, 4:06:00 AM
to django-d...@googlegroups.com
On Wed, Oct 16, 2013, Jonathan Slenders <jonathan...@gmail.com> wrote:

>Some people still have javascript in their templates and they use template
>tags inside their javascript. :(

I am not sure if you're saying this is a bad thing, but it is unavoidable, isn't it? For example I use the Google Maps API, and I don't know of any other way to generate the map items dynamically than to build some of the JS for it using template tags. Is there a problem doing it like this, and is there a better way?

Daniele

Marc Tamlyn

Oct 17, 2013, 4:17:32 AM
to django-d...@googlegroups.com

Personally, I tend to attach data attributes to the wrapping node for the map/chart as HTML-escaped JSON and then read them from an external JS file. Especially as that JS can be fairly complex, it's best to keep it outside the template, where it can be compressed.
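As a rough illustration of the server side of that approach, here is a small sketch using only the Python standard library. The function name map_container, the data-items attribute name and the sample data are hypothetical; in a real project the escaping would normally be done by the template engine rather than by hand.

```python
import json
from html import escape, unescape


def map_container(items):
    # Serialise to JSON first, then HTML-escape the result so it is safe
    # inside a double-quoted attribute value.
    payload = escape(json.dumps(items), quote=True)
    return '<div id="map" data-items="%s"></div>' % payload


items = [{"name": 'Bar "Tom & Jerry" <new>', "lat": 50.85, "lng": 4.35}]
markup = map_container(items)

# Simulate the browser: it un-escapes the attribute value, after which the
# external JS file can run JSON.parse() on it.
attribute = markup.split('data-items="', 1)[1].rsplit('"></div>', 1)[0]
decoded = json.loads(unescape(attribute))
```

The external script then only needs to read the attribute and parse it, so no JavaScript has to be generated by the template.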

But we digress. I'm in favour of mark_safe having a name which suggests HTML, as that's what it does. However, I don't think it will stop people doing it wrong. I'm also not sure there's an easy way to apply your own default escape function (instead of HTML escaping), say if you are outputting a JS or TeX doc and have written an escaper you're happy with. We may as well make this pluggable.
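To make the "pluggable" idea concrete, a toy sketch follows. Nothing here is an existing Django hook: render, escape_tex and the escaper parameter are hypothetical names, and the TeX escaper covers only the common special characters.

```python
from html import escape as escape_html

# Replacement table for a simple TeX escaper (illustrative, not exhaustive).
_TEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_tex(text):
    # Replace character by character so that output is never escaped twice.
    return "".join(_TEX_SPECIALS.get(ch, ch) for ch in text)


def render(template, values, escaper=escape_html):
    # The default escaper is HTML; callers can plug in their own.
    return template % tuple(escaper(str(v)) for v in values)


html_out = render("<p>%s</p>", ["a & b"])
tex_out = render(r"\textbf{%s}", ["50% & more"], escaper=escape_tex)
```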

Marc


Bruno Renié

Oct 17, 2013, 4:18:25 AM
to django-d...@googlegroups.com
I see at least 2 more robust ways:

* Loading map items in another JSON request or as an embedded JSON
string in a hidden <textarea> or something. Then your JS code can
simply JSON.parse() the data and no other code generation is required.

* Using data-attributes that can be parsed using completely static JS
code: <div id="map" data-width="600" data-height="200">. You could
imagine storing map items as DOM elements with data-attributes
attached.

Bruno