Escaping in templates...


James Bennett

Apr 16, 2007, 1:54:25 AM
to django-d...@googlegroups.com
Short and sweet: since we're already planning some
backwards-incompatible changes for the next release, how about we
hammer out auto-escaping of template output while we're at it? Even
those of us who don't like it (myself included) are probably at the
point of accepting that we have to do it eventually, so why don't we
get it out of the way?

--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."

Malcolm Tredinnick

Apr 16, 2007, 2:10:15 AM
to django-d...@googlegroups.com
On Mon, 2007-04-16 at 00:54 -0500, James Bennett wrote:
> Short and sweet: since we're already planning some
> backwards-incompatible changes for the next release, how about we
> hammer out auto-escaping of template output while we're at it? Even
> those of us who don't like it (myself included) are probably at the
> point of accepting that we have to do it eventually, so why don't we
> get it out of the way?

What do you see as the currently undecided issues? We seemed to have
quite a good consensus after the innumerable earlier threads and the
implementation I wrote has been kept up to date by Michael Radziej
(except for the admin portion, which I can update to newforms-admin
without too much trouble).

On or off by default seemed to be the only debate (and it's not really a
debate -- Adrian didn't like it).

Regards,
Malcolm

James Bennett

Apr 16, 2007, 2:39:54 AM
to django-d...@googlegroups.com
On 4/16/07, Malcolm Tredinnick <mal...@pointy-stick.com> wrote:
> On or off by default seemed to be the only debate (and it's not really a
> debate -- Adrian didn't like it).

I don't *like* it, but I've come around to accepting that it's better
for us not to point the gun at a developer's foot and say "try not to
pull the trigger".

So I'd like to see a final decision on enabled-by-default vs. optional
and get it documented and in trunk.

(and, though I'd pretty much always be turning it off manually, I'd be
+1 on enabling by default)

Michael Radziej

Apr 16, 2007, 3:04:31 AM
to django-d...@googlegroups.com
Hi,

Malcolm Tredinnick wrote:
> What do you see as the currently undecided issues? We seemed to have
> quite a good consensus after the innumerable earlier threads and the
> implementation I wrote has been kept up to date by Michael Radziej
> (except for the admin portion, which I can update to newforms-admin
> without too much trouble).

First, a disclaimer--I brought it up to date only with oldforms in mind.
Especially the rendering function will probably need to be modified for
auto-escape. There are also no tests for {% autoescape %} within newforms.

Default on/off isn't really a big issue, since you can switch it on in
your base template.

I have been developing with the autoescape patch applied (without admin
and only within oldforms) for about two months, though the result is not
in production yet (but it'll go into production this week).

There's currently one bug with the patch: In tracebacks (debug view),
everything is escaped two times. I haven't ruled out a connection with
my other patches, that's why I haven't reported it before.

My general experience is that it's very nice and easy for the templates,
but it makes the template tags and filters a bit more difficult.


> On or off by default seemed to be the only debate (and it's not really a
> debate -- Adrian didn't like it).

I wouldn't turn it on by default before more people have gained experience
with it. My proposal:

- apply the patches
- leave it off by default
- wait for feedback


Michael

Malcolm Tredinnick

Apr 16, 2007, 4:14:20 AM
to django-d...@googlegroups.com
On Mon, 2007-04-16 at 09:04 +0200, Michael Radziej wrote:
> Hi,
>
> Malcolm Tredinnick wrote:
> > What do you see as the currently undecided issues? We seemed to have
> > quite a good consensus after the innumerable earlier threads and the
> > implementation I wrote has been kept up to date by Michael Radziej
> > (except for the admin portion, which I can update to newforms-admin
> > without too much trouble).
>
> First, a disclaimer--I brought it up to date only with oldforms in mind.
> Especially the rendering function will probably need to be modified for
> auto-escape. There are also no tests for {% autoescape %} within newforms.
>
> Default on/off isn't really a big issue, since you can switch it on in
> your base template.
>
> I have been developing with the autoescape patch applied (without admin
> and only within oldforms) for about two months, though the result is not
> in production yet (but it'll go into production this week).
>
> There's currently one bug with the patch: In tracebacks (debug view),
> everything is escaped two times. I haven't ruled out a connection with
> my other patches, that's why I haven't reported it before.

I don't think it used to do that, but I can't say it was an area I
tested extensively (although I'm pretty sure I triggered more than one
debug traceback when developing it). Easy enough to fix, though.

> My general experience is that it's very nice and easy for the templates,
> but it makes the template tags and filters a bit more difficult.

Only those that generate output which potentially includes HTML, though.
For all the others, you don't have to do anything. For those with mixed
content, it should be only a couple of lines extra, I would have thought
(based on porting the standard filters and tags and my own use of this
patch) -- are you seeing something different?

Regards,
Malcolm

Armin Ronacher

Apr 16, 2007, 6:08:43 AM
to Django developers
Hoi,

-sys.maxint for autoescaping. I added support for that to Jinja
quite a while ago and it was a pain in the ass. It makes things more
complicated (speaking of return values and arguments of filters) and
it bloats the implementation. Not worth the work.

Regards,
Armin

Malcolm Tredinnick

Apr 16, 2007, 6:15:52 AM
to django-d...@googlegroups.com

You understand we already have a working implementation in Trac, right?
Includes documentation for developers and everything. :-)

Most filters and template tags require no changes, for example.

Regards,
Malcolm


Michael Radziej

Apr 16, 2007, 7:08:50 AM
to django-d...@googlegroups.com

Armin, you're coming a bit late to this discussion. We had a few quite
extensive threads months ago. If you seriously want to engage in this
discussion, please read what's been written before. I understand that you
couldn't have known this, having joined later, but I wouldn't like to go
through all the arguments again and again.

Anyway, to put your mind at ease:

- it is not obligatory and probably not on by default (and you'll always be
able to switch it off without much effort).

- there's a way to mark strings as "don't need further escaping"
(mark_safe())
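
For readers unfamiliar with the idea, the marking scheme can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code from the patch in #2359: the class `SafeText` and the helper `escape_unless_safe` are made-up names, and only `mark_safe` is a name taken from this thread.

```python
import html


class SafeText(str):
    """A str subclass whose only job is to carry a 'no further escaping' marker."""


def mark_safe(value):
    # Name borrowed from the thread; the body is an assumption for illustration.
    return SafeText(value)


def escape_unless_safe(value):
    # Hypothetical helper: escape plain strings, pass marked strings through.
    if isinstance(value, SafeText):
        return value
    return SafeText(html.escape(str(value)))


user_input = "<b>bold</b>"
trusted = mark_safe("<b>bold</b>")

escaped_user_input = escape_unless_safe(user_input)  # angle brackets become entities
untouched_trusted = escape_unless_safe(trusted)      # returned unchanged
```

The result of escaping is itself marked, so passing it through the helper a second time would not escape it again.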

Whether it is an advantage or not is probably a matter of how you develop and
how much you trust your web designer (or yourself). Different people will
see different trade-offs here, that's just normal.

Ticket #2359 contains patches and instructions.

Michael


--
noris network AG - Deutschherrnstraße 15-19 - D-90429 Nürnberg -
Tel +49-911-9352-0 - Fax +49-911-9352-100
http://www.noris.de - The IT-Outsourcing Company

Vorstand: Ingo Kraupa (Vorsitzender), Joachim Astel, Hansjochen Klenk -
Vorsitzender des Aufsichtsrats: Stefan Schnabel - AG Nürnberg HRB 17689

Armin Ronacher

Apr 16, 2007, 10:49:58 AM
to Django developers
Hi,

On Apr 16, 1:08 pm, Michael Radziej <m...@noris.de> wrote:
> Armin, you're coming a bit late to this discussion. We had a few quite
> extensive threads months ago. If you seriously want to engage in this
> discussion, please read what's been written before. I understand that you
> couldn't have known this, having joined later, but I wouldn't like to go
> through all the arguments again and again.

Dammit, haven't investigated further. Sorry for that.

> - there's a way to mark strings as "don't need further escaping"
> (mark_safe())
>
> Whether it is an advantage or not is probably a matter of how you develop and
> how much you trust your web designer (or yourself). Different people will
> see different trade-offs here, that's just normal.

Yeah. After reading the current implementation it looks quite sane to
me. As long as there is a way to turn it off completely I can live
with that :-)
The problem I discovered was what happens if you pass a string
containing markup to a filter, etc. In my test cases I often screwed
things up because a filter concatenated the markup object with another
string, with the result being a plain string containing the markup.
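
The pitfall described here is easy to reproduce with any marker type built as a `str` subclass. A small sketch, assuming hypothetical `Markup` and `StickyMarkup` classes (not Jinja's or Django's actual code): ordinary concatenation returns a plain `str`, so the marker is silently lost unless the class overrides `__add__`.

```python
import html


class Markup(str):
    """Hypothetical marker type for strings that already contain HTML."""


class StickyMarkup(Markup):
    """Variant that keeps the marker when combined with other text."""

    def __add__(self, other):
        # Escape plain strings before joining so the result stays safe.
        if not isinstance(other, Markup):
            other = html.escape(str(other))
        return StickyMarkup(str.__add__(self, other))


# Inherited str.__add__ returns a plain str: the marker is gone.
plain_result = Markup("<em>hi</em>") + " & bye"

# The override escapes the plain operand and keeps the marker.
sticky_result = StickyMarkup("<em>hi</em>") + " & bye"
```

A real implementation would need the same treatment for `__radd__`, `%` formatting, `join`, and the other string methods, which is where much of the extra complexity comes from.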

Regards,
Armin

Tom Tobin

Apr 16, 2007, 11:26:32 AM
to django-d...@googlegroups.com
On 4/16/07, James Bennett <ubern...@gmail.com> wrote:
>
> Short and sweet: since we're already planning some
> backwards-incompatible changes for the next release, how about we
> hammer out auto-escaping of template output while we're at it? Even
> those of us who don't like it (myself included) are probably at the
> point of accepting that we have to do it eventually, so why don't we
> get it out of the way?

I haven't been participating much on Django-dev over the last few
months, but this made me go "eep?".

I'm still -1 on autoescaping as implemented in the latest patch in
#2359; the terminology used is strongly HTML-centric (e.g.,
``convert_to_words.is_safe`` -- safe from what?). We should be using
naming that makes it explicit that this is for HTML escaping, since
Django templates see a wide range of application (e.g., emails, CSV
files, Javascript . . .).

If that's fixed, I'll consider myself -0 if autoescaping is off by
default, and -1 otherwise. I still consider autoescaping to be a poor
substitute for actually *knowing your code*, but I get the feeling at
this point that the social pressure in its favor is going to win out.

Michael Radziej

Apr 17, 2007, 4:24:19 AM
to django-d...@googlegroups.com
Hi Tom,

On Mon, Apr 16, Tom Tobin wrote:

> I haven't been participating much on Django-dev over the last few
> months, but this made me go "eep?".
>
> I'm still -1 on autoescaping as implemented in the latest patch in
> #2359; the terminology used is strongly HTML-centric (e.g.,
> ``convert_to_words.is_safe`` -- safe from what?). We should be using
> naming that makes it explicit that this is for HTML escaping, since
> Django templates see a wide range of application (e.g., emails, CSV
> files, Javascript . . .).
>
> If that's fixed, I'll consider myself -0 if autoescaping is off by
> default, and -1 otherwise. I still consider autoescaping to be a poor
> substitute for actually *knowing your code*, but I get the feeling at
> this point that the social pressure in its favor is going to win out.

Malcolm hasn't tried to implement a general escaping framework to escape
mail, XML, PDF or anything else. It's only HTML escaping. So I don't see why
the terminology shouldn't come from HTML.

A general escaping framework has been rejected, IIRC. First, Django is a
web development framework, and as such mostly produces html. Second, the
main motivation for auto-escape is to have a safeguard against scripting
attacks, and this is also for html.

Maybe in your case "knowing your code" is fine, but even within Django
missing escapes have shown up as bugs in the past. This is the number one
cause of cross-site scripting holes.

Simon G.

Apr 17, 2007, 8:00:32 AM
to Django developers
This is one of those issues which is never going to please everyone.

So - I've started a list of the various proposals [1], and could you
all add any other proposals to this page, along with any pros/cons,
and vote on the one(s) you prefer.

This way we can get some idea of what a consensus view might look like.

--Simon
[1] http://code.djangoproject.com/wiki/AutoEscapingProposals

Ned Batchelder

Apr 17, 2007, 9:08:11 AM
to django-d...@googlegroups.com
I've been following this discussion with interest. XSS fragility is a
real weak point for text-based templating engines, and we need to find a
solution.

On the topic of HTML-escaping vs. general escaping: Absolutely the
reason to do auto-escaping is to make it dead easy to avoid XSS
problems, and so HTML escaping is easily the most important thing to get
right. While Django's dedication to template agnosticism is great
(allowing emails to be generated with templates, for example), by far
most of the text generated through the template engine is HTML, and that
is the most vulnerable part of the Django ecosystem.

That said, though, keep in mind that not all text in a .html template is
HTML:

<p>My first variable is {{my_var1}}</p>
<script>
var my_second_variable = "{{my_var2}}";
blah();
</script>

In this case, my_var1 needs to have "escape" applied. The case of
my_var2 is a bit trickier. "addslashes" is good, but isn't enough
(since </script> appearing in my_var2 will cause problems). Things can
of course get even trickier:

<script>
document.write("<p>{{my_var3}}</p>");
</script>

My brain starts to hurt trying to figure out how to protect my_var3!
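
One common way to handle the my_var2 case is to serialise the value as a JavaScript string literal and then neutralise the characters the HTML parser could act on. The sketch below uses only the standard library; `js_string_literal` is a made-up helper, not something from Django or the patch, and it emits its own surrounding quotes. It does nothing for the my_var3 case, where the value is parsed again as HTML by document.write.

```python
import json


def js_string_literal(value):
    """Return a quoted JavaScript string literal for inlining in <script>."""
    # json.dumps handles quotes, backslashes and control characters.
    literal = json.dumps(str(value))
    # Replace characters that could end the script element or start markup.
    return (
        literal.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


my_var2 = 'x"; </script><script>alert(1)//'
snippet = "var my_second_variable = %s;" % js_string_literal(my_var2)
```

Because the replacements are ordinary `\uXXXX` escapes, the JavaScript engine still sees the original text once the literal is evaluated.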

--Ned.


--
Ned Batchelder, http://nedbatchelder.com

Tom Tobin

Apr 17, 2007, 11:29:47 AM
to django-d...@googlegroups.com
On 4/17/07, Michael Radziej <m...@noris.de> wrote:
> >
> > I'm still -1 on autoescaping as implemented in the latest patch in
> > #2359; the terminology used is strongly HTML-centric (e.g.,
> > ``convert_to_words.is_safe`` -- safe from what?). We should be using
> > naming that makes it explicit that this is for HTML escaping, since
> > Django templates see a wide range of application (e.g., emails, CSV
> > files, Javascript . . .).
> >
> > If that's fixed, I'll consider myself -0 if autoescaping is off by
> > default, and -1 otherwise. I still consider autoescaping to be a poor
> > substitute for actually *knowing your code*, but I get the feeling at
> > this point that the social pressure in its favor is going to win out.
>
> Malcolm hasn't tried to implement a general escaping framework to escape
> mail, XML, PDF or anything else. It's only HTML escaping. So I don't see why
> the terminology shouldn't come from HTML.
>
> A general escaping framework has been rejected, IIRC. First, Django is a
> web development framework, and as such mostly produces html. Second, the
> main motivation for auto-escape is to have a safeguard against scripting
> attacks, and this is also for html.

I think you misunderstood me; I'm not saying there should be a
general-output escaping framework. I'm saying that if there *is* an
HTML escaping framework, the object/variable naming should make it
clear that we're dealing with HTML-specific escaping where such code
comes into contact with the general templating system. Setting
"is_safe" on a filter doesn't tell me a thing about what it's "safe"
from; setting "is_html_safe" *does* give me an idea about what's going
on.
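
To make the naming question concrete, here is a rough sketch of how such a flag might be attached to a filter function and consulted by the engine. Everything in it is hypothetical: `SafeText`, `apply_filter`, `render`, and the exact meaning of the flag are assumptions for illustration. The patch in #2359 spells the attribute `is_safe`; it is written `is_html_safe` here only to show the proposed name.

```python
import html


class SafeText(str):
    """Hypothetical marker for text that is already HTML-escaped."""


def upper(value):
    return value.upper()


# Assumed meaning: the filter adds no HTML-significant characters of its
# own, so HTML-safe input yields HTML-safe output.
upper.is_html_safe = True


def apply_filter(func, value):
    """Hypothetical engine hook: run a filter and keep the marker if allowed."""
    result = func(value)
    if isinstance(value, SafeText) and getattr(func, "is_html_safe", False):
        return SafeText(result)
    return result


def render(value):
    # Final output step: escape anything not marked as HTML-safe.
    return value if isinstance(value, SafeText) else html.escape(value)


safe_out = render(apply_filter(upper, SafeText("<b>ok</b>")))
unsafe_out = render(apply_filter(upper, "<b>no</b>"))
```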

> Maybe in your case "knowing your code" is fine, but even within Django
> missing escapes have shown up as bugs in the past. This is the number one
> cause of cross-site scripting holes.

I don't think this line of argument is ever going to reach resolution
between the pro and con camps regarding auto-escaping, so I'm not
really trying to argue that point here (my strongly-held views
notwithstanding). I'm trying to make sure that whatever auto-escaping
implementation *does* get accepted is tolerable. :-)

Michael Radziej

Apr 17, 2007, 12:03:58 PM
to django-d...@googlegroups.com
On Tue, Apr 17, Tom Tobin wrote:

> I think you misunderstood me; I'm not saying there should be a
> general-output escaping framework. I'm saying that if there *is* an
> HTML escaping framework, the object/variable naming should make it
> clear that we're dealing with HTML-specific escaping where such code
> comes into contact with the general templating system. Setting
> "is_safe" on a filter doesn't tell me a thing about what it's "safe"
> from; setting "is_html_safe" *does* give me an idea about what's going
> on.

Ah, you're right. Now, honestly, I am not a huge fan of the current names,
but I don't consider it very important, and I don't have good names either. I
trust Malcolm to choose what he finds best ;-)

> > Maybe in your case "knowing your code" is fine, but even within Django
> > missing escapes have shown up as bugs in the past. This is the number one
> > cause of cross-site scripting holes.
>
> I don't think this line of argument is ever going to reach resolution
> between the pro and con camps regarding auto-escaping, so I'm not
> really trying to argue that point here (my strongly-held views
> notwithstanding). I'm trying to make sure that whatever auto-escaping
> implementation *does* get accepted is tolerable. :-)

Fine! Apologies for this misunderstanding.

Malcolm Tredinnick

Apr 17, 2007, 10:49:38 PM
to django-d...@googlegroups.com
On Tue, 2007-04-17 at 05:00 -0700, Simon G. wrote:
> This is one of those issues which is never going to please everyone.
>
> So - I've started a list of the various proposals (1), and could you
> all add any other proposals to this page, along with any pros/cons,
> and vote on the one(s) you prefer.
>
> This way we can get some idea of what a consensus view might look like

If you're going to make a page like that, perhaps leave out the truly
subjective judgements like "magic" (a poorly defined term at the best of
times). Aside from the fact that I totally disagree with the claim (and
looking back at all the previous threads on this nobody's ever mentioned
the word except when talking about some alternative proposals), it
doesn't really add anything to the value of such a list.

Thanks,
Malcolm


Malcolm Tredinnick

Apr 17, 2007, 10:52:03 PM
to django-d...@googlegroups.com

The last two cases (var2 and var3) are where you do need to understand
your code. If that style of code is predominant in a particular block in
a template, it's worth marking that block as non-autoescaping and doing
it manually. Even if it is autoescaping in that block, you can still
mark those variables as safe from further escaping with the current
implementation.

Anything resembling auto-escaping in script blocks is a pretty doomed
exercise. The examples you gave are typical -- it's almost impossible to
come up with a robust scheme since the escaping that needs to be applied
to Javascript code is too context-dependent (and even
semantics-dependent sometimes, as in your var3 case).

Cheers,
Malcolm

Simon G.

Apr 17, 2007, 11:35:56 PM
to Django developers
Sorry - I just skim read the discussions on it in "AutoEscape" and
"AutoEscaping Alternative" where that was mentioned. Wasn't making any
value judgements :-)

--Simon

On Apr 18, 2:49 pm, Malcolm Tredinnick <malc...@pointy-stick.com>
wrote:

Malcolm Tredinnick

Apr 17, 2007, 11:48:27 PM
to django-d...@googlegroups.com
On Tue, 2007-04-17 at 20:35 -0700, Simon G. wrote:
> Sorry - I just skim read the discussions on it in "AutoEscape" and
> "AutoEscaping Alternative" where that was mentioned. Wasn't making any
> value judgements :-)

Unfortunately, the AutoEscapingAlternative page uses strawmen to try and
make its arguments. I really think we should keep the discussion on this
list, rather than trying to also track "votes" on a wiki page.

In the past threads, we basically had consensus anyway, I'm not sure
that revisiting everything again is worth the hassle.

Regards,
Malcolm

SmileyChris

Apr 18, 2007, 4:35:58 AM
to Django developers
On Apr 18, 3:48 pm, Malcolm Tredinnick <malc...@pointy-stick.com>
wrote:

> Unfortunately, the AutoEscapingAlternative page uses strawmen to try and
> make its arguments.
Ok, it's less "controversial" now.

SmileyChris

Apr 18, 2007, 4:58:00 AM
to Django developers
On Apr 18, 3:48 pm, Malcolm Tredinnick <malc...@pointy-stick.com>
wrote:

> In the past threads, we basically had consensus anyway, I'm not sure
> that revisiting everything again is worth the hassle.

Without trying to rock the boat... reading back, I'm not sure there
was a resounding consensus.

I actually like Malcom's proposal. Can't say I'd be thrilled if it was
on by default though.

I don't want to push my alternative that hard, because it's easy
enough to use without it being in core (slightly related: I still
would have liked to see escaping work recursively for lists -
http://code.djangoproject.com/ticket/2862)

Michael Radziej

Apr 18, 2007, 6:46:46 AM
to django-d...@googlegroups.com
Hi Chris,

To round up my opinion about this: Chris's alternative is too simplistic.

For me, Malcolm's approach solves two key issues:

- It makes "escaping" the rule and not-escaping the exception.

If you err on the wrong side and get double escaping, this isn't nice,
but it's harmless. If you err and skip escaping, you get a possible
XSS attack. Chris's approach cannot do this to the same degree since
you'll usually get plenty of exceptions as soon as you use template
filters that return html code (e.g., for rendering special variables).

- It moves the responsibility for escaping from the template to the context.

The template writer shouldn't need to know whether the context variables
are already escaped or not. Worse, this could change over time.
Finally, the programmer should know what has already been escaped
and what hasn't, because he's the one who does it.
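
The asymmetry in the first point can be seen with the standard library's `html.escape` alone (a small illustration, independent of either proposal): escaping twice only garbles what is displayed, while forgetting to escape lets the markup through.

```python
import html

comment = '<script>alert("xss")</script>'

escaped_once = html.escape(comment)        # shown to the reader as typed
escaped_twice = html.escape(escaped_once)  # ugly on screen, but inert
not_escaped = comment                      # reaches the browser as live markup
```

After the second pass every `&` from the first pass has become `&amp;`, so the page shows literal entity text rather than running anything.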

I can honestly say from my own experience that Chris's approach wouldn't
work for me, while Malcom's does.


So long,

Armin Ronacher

Apr 18, 2007, 1:35:14 PM
to Django developers
Hoi,

Another small notice. Pylons and other frameworks thought about
implementing "__html__" for objects that return an html representation
of the object. If there is none it's converted to unicode and escaped.
Adding something like "__html__ = lambda s: s" to the escaped string
base classes could improve support for other template engines like
genshi.
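
A minimal sketch of that convention, assuming a hypothetical `to_html` helper in the consuming template engine (the class names `Safe` and `Link` are also made up for illustration): objects with an `__html__` method supply their own markup, and everything else is converted to text and escaped.

```python
import html


class Safe(str):
    """Already-escaped string; its HTML form is itself."""

    __html__ = lambda s: s


class Link:
    """Example object that knows how to render itself as HTML."""

    def __init__(self, url, label):
        self.url, self.label = url, label

    def __html__(self):
        # The object is responsible for escaping its own parts.
        return '<a href="%s">%s</a>' % (
            html.escape(self.url),
            html.escape(self.label),
        )


def to_html(obj):
    # Prefer the object's own HTML representation when it provides one.
    if hasattr(obj, "__html__"):
        return obj.__html__()
    return html.escape(str(obj))
```

With this in place, `to_html` can accept safe strings, rich objects and untrusted text through a single code path.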

Regards,
Armin

SmileyChris

Apr 18, 2007, 6:18:49 PM
to Django developers
Thinking about it more, I wouldn't actually be against Malcom's
autoescaping solution being on by default - I do see the importance of
solid XSS protection!

I can also see a solution which would maintain backwards compatibility
for old sites:
TEMPLATE_AUTOESCAPE=False in conf.global_settings
TEMPLATE_AUTOESCAPE=True in conf.project_template.settings

Would that be too confusing to document? It seems like a good
compromise - old projects don't break, new projects are protected by
default.

Malcolm Tredinnick

Apr 18, 2007, 9:36:00 PM
to django-d...@googlegroups.com
On Wed, 2007-04-18 at 08:58 +0000, SmileyChris wrote:
[...]

> I actually like Malcom's proposal. Can't say I'd be thrilled if it was
> on by default though.

Since this is actually a good idea, we should give credit where it's
due: the whole theory behind this (and most of the details) is Simon
Willison's creation. I just wrote the code and polished off some of the
less smooth corners (and brow-beat Adrian and Jacob into thinking about
it at OSCON last year). All credit to our young English friend, please

[Oh.. and if people don't start taking the effort to spell "Malcolm"
correctly, there's going to be blood in the streets. I'm just saying...
it's not that hard. Thankyou.]

Cheers,
Malcolm

SmileyChris

Apr 19, 2007, 1:07:14 AM
to Django developers
On Apr 19, 1:36 pm, Malcolm Tredinnick <malc...@pointy-stick.com>
wrote:

> Since this is actually a good idea, we should give credit where it's
> due: the whole theory behind this (and most of the details) is Simon
> Willison's creation. I just wrote the code and polished off some of the
> less smooth corners (and brow-beat Adrian and Jacob into thinking about
> it at OSCON last year). All credit to our young English friend, please

Yes yes, I meant Malcolm's implementation. I heard a little voice
saying something was askew when I was writing it but I tend to ignore
the voices in my head.
Three cheers to all the brains involved! :D

Michael Radziej

Apr 19, 2007, 3:07:29 AM
to django-d...@googlegroups.com
Hi Malcolm,

Malcolm Tredinnick wrote:
> On Wed, 2007-04-18 at 08:58 +0000, SmileyChris wrote:
> [...]
>> I actually like Malcom's proposal. Can't say I'd be thrilled if it was
>> on by default though.
>
> Since this is actually a good idea, we should give credit where it's
> due: the whole theory behind this (and most of the details) is Simon
> Willison's creation. I just wrote the code and polished off some of the
> less smooth corners (and brow-beat Adrian and Jacob into thinking about
> it at OSCON last year). All credit to our young English friend, please

Well ... Simon *is* much easier to spell, so it's fine for us!

> [Oh.. and if people don't start taking the effort to spell "Malcolm"
> correctly, there's going to be blood in the streets. I'm just saying...
> it's not that hard. Thankyou.]

Hey, you spelled mine wrong 10 months and 3 days ago. I still have one
good ;-)

Michael

James Bennett

Apr 19, 2007, 6:02:50 AM
to django-d...@googlegroups.com
On 4/18/07, SmileyChris <smile...@gmail.com> wrote:
> I can also see a solution which would maintain backwards compatibility
> for old sites:
> TEMPLATE_AUTOESCAPE=False in conf.global_settings
> TEMPLATE_AUTOESCAPE=True in conf.project_template.settings

I'd be against this; any solution which has the same template
rendering differently depending on project settings will get a -1 from
me because it brings back too many bad memories of PHP and things
which would mysteriously change their behavior based on some obscure
setting.

webograph

Apr 20, 2007, 10:57:38 AM
to django-d...@googlegroups.com
i like the idea of having a __html__ __str__-like function a lot, especially for string/html representation of database objects. is this something that would be compatible with django's design principles?

for implementation, here's a suggestion of which i'm not sure if it works:
one could define __str__ functions like
def __str__(self, contenttype='text/plain')
using different output if specified or raising an exception for unknown content types. can be easily extended to serve text/javascript or other potential representations.

just another thought on that issue: does something like
>>> red=ColourObject('red','#FF0000')
>>> s=ContentString('<html><body><h1>%s</h1></body></html>','text/html')
>>> print s%red
<html><body><h1><span style="color:#FF0000">red</span></h1></body></html>
>>> print "spam is %s."%red
spam is red.
>>> print ContentString(red,'text/css')
color:#FF0000;

make sense?
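
One way the session above could be backed is sketched below. It is an assumption-laden illustration, not an existing Django API: `ColourObject` and `ContentString` are the names from the message, while the `render(contenttype)` hook is invented here as a stand-in for the proposed `__str__(self, contenttype=...)` signature.

```python
import html


class ColourObject:
    def __init__(self, name, code):
        self.name, self.code = name, code

    def __str__(self):
        # Plain-text representation, used by ordinary string formatting.
        return self.name

    def render(self, contenttype):
        # Hypothetical per-content-type hook; raises for unknown types.
        if contenttype == "text/html":
            return '<span style="color:%s">%s</span>' % (
                self.code,
                html.escape(self.name),
            )
        if contenttype == "text/css":
            return "color:%s;" % self.code
        raise ValueError("unknown content type: %s" % contenttype)


class ContentString(str):
    """String that remembers which content type it holds."""

    def __new__(cls, value, contenttype):
        if hasattr(value, "render"):
            value = value.render(contenttype)
        self = super().__new__(cls, value)
        self.contenttype = contenttype
        return self

    def __mod__(self, arg):
        # Objects with a render hook are asked for the matching form;
        # other values are inserted as plain text in this sketch.
        rendered = arg.render(self.contenttype) if hasattr(arg, "render") else str(arg)
        return ContentString(str.__mod__(self, rendered), self.contenttype)


red = ColourObject("red", "#FF0000")
page = ContentString("<html><body><h1>%s</h1></body></html>", "text/html") % red
sentence = "spam is %s." % red
css = ContentString(red, "text/css")
```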

webograph


ps if there is a mime type object in django, this could of course replace the strings
