There is currently a discussion on python-ideas about taint tracking in Python: tracking data that comes from untrusted sources and preventing it from being used in sensitive places. This video [1] from last year explains the problems very well.
I noticed that we can do better in Django. We already have mark_safe, but what does such a SafeText mean? Safe as HTML, JavaScript, CSS, SQL or even something else? We know it's usually HTML, but that's not always the case.
Some people still have JavaScript in their templates and they use template tags inside their JavaScript. :(
Some people use the templating engine even for other stuff than generating HTML. The point is that we can't assume that "safe" means "safe as HTML". We have many languages on the web and HTML is just one of them.
I propose some changes that are backwards compatible for django.utils.safestring:
We should rename SafeText to HtmlText. Furthermore, we should not expect people to have to call format_html.
Instead of mark_safe, I propose that we call:
HtmlText('<p> %s </p>')
This explicitly annotates a string as HTML.
Instead of format_html, I propose that we do:
HtmlText('<p> %s </p>') % 'unsafe text'
Like django.utils.safestring.SafeText.__add__, we can implement SafeText.__mod__.
I think that string interpolation feels more natural. (Or for those who prefer .format(), we can add that method to HtmlText.)
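To make the idea concrete, here is a minimal sketch of how such an HtmlText could behave. It is not Django's implementation: the class body, the _escape_arg helper and the use of the standard library's html.escape are all assumptions made for illustration.

```python
from html import escape


def _escape_arg(value):
    """Hypothetical helper: escape one interpolated value for HTML."""
    if isinstance(value, HtmlText):
        return value  # already annotated as HTML, leave untouched
    if isinstance(value, (int, float)):
        return value  # plain numbers carry no markup; keeps %d working
    return escape(str(value))


class HtmlText(str):
    """Sketch of a str subclass that marks its content as HTML."""

    def __mod__(self, args):
        # Escape every interpolated value, whatever shape args has.
        if isinstance(args, tuple):
            args = tuple(_escape_arg(a) for a in args)
        elif isinstance(args, dict):
            args = {k: _escape_arg(v) for k, v in args.items()}
        else:
            args = _escape_arg(args)
        return HtmlText(str.__mod__(self, args))

    def format(self, *args, **kwargs):
        # Same idea for those who prefer .format().
        safe_args = [_escape_arg(a) for a in args]
        safe_kwargs = {k: _escape_arg(v) for k, v in kwargs.items()}
        return HtmlText(str.format(self, *safe_args, **safe_kwargs))


print(HtmlText('<p> %s </p>') % '<b>unsafe text</b>')
```

In this sketch the result of the interpolation is again an HtmlText, so it can be passed into another HtmlText without being escaped a second time.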
It could also become possible to stack escaping filters in the future:
HtmlText('<script>%s</script>') % JavascriptText("function() { echo '%s'; }") % 'hello world'
(Implementing JavascriptText can be hard, as escaping is different in different parts of the code.)
Further, I would deprecate mark_for_escaping and EscapeData. [2] There should never be a reason to call this function.
Any suggestions?
[1] http://www.youtube.com/watch?v=WmZvnKYiNlE
[2] mark_for_escaping = lambda s: str(s) # Actually: mark_for_escaping == str
Personally I tend to attach data attributes to the wrapping node for the map/chart as HTML-escaped JSON and then read them from an external JS file. Especially as that JS can be fairly complex, it's best to keep it outside the template where it can be compressed.
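A rough sketch of that data-attribute approach, using only the standard library. The data_attribute helper and the "chart" attribute name are made up for illustration, and the attribute name itself is assumed to be trusted.

```python
import json
from html import escape, unescape


def data_attribute(name, value):
    """Hypothetical helper: serialise value as JSON, then HTML-escape it.

    Assumes `name` is a trusted, valid attribute suffix.
    """
    payload = escape(json.dumps(value), quote=True)
    return 'data-%s="%s"' % (name, payload)


chart_config = {'title': '<b>"Sales"</b> & more', 'points': [1, 2, 3]}
attr = data_attribute('chart', chart_config)
print('<div id="chart" %s></div>' % attr)

# The browser un-escapes the attribute value, so the external JS file
# sees the original JSON again; unescape() mimics that step here.
escaped_payload = attr[len('data-chart="'):-1]
round_tripped = json.loads(unescape(escaped_payload))
```

The external script can then read the value from the element's dataset and parse it as JSON, so no template tags are needed inside the JavaScript itself.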
But we digress. I'm in favour of mark_safe having a name which suggests HTML, as that's what it does. I don't think it will stop people doing it wrong, however. I'm also not sure there's an easy way to apply your own default escape function (instead of HTML escaping), say if you are outputting a JS or TeX doc and have written an escaper you're happy with. We may as well make this pluggable.
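As a rough illustration of what "pluggable" could mean, here is a small sketch in which the escape function is just a parameter. None of this is an existing Django API: render, tex_escape and the escaper argument are hypothetical names, and the TeX escaper only covers the common special characters.

```python
from html import escape as html_escape

# Single-pass translation table, so a backslash introduced by one
# replacement is never escaped again by another.
_TEX_SPECIALS = str.maketrans({
    '\\': r'\textbackslash{}',
    '&': r'\&',
    '%': r'\%',
    '$': r'\$',
    '#': r'\#',
    '_': r'\_',
    '{': r'\{',
    '}': r'\}',
    '~': r'\textasciitilde{}',
    '^': r'\textasciicircum{}',
})


def tex_escape(text):
    """Hypothetical escaper for TeX output."""
    return str(text).translate(_TEX_SPECIALS)


def render(template, context, escaper=html_escape):
    """Interpolate context into template, escaping every value
    with the given escaper (HTML escaping by default)."""
    escaped = {key: escaper(str(value)) for key, value in context.items()}
    return template % escaped


print(render('<p>%(msg)s</p>', {'msg': 'a < b'}))
print(render(r'\emph{%(msg)s}', {'msg': '50% & more'}, escaper=tex_escape))
```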
Marc