Concrete django.template Suggestions

Armin Ronacher

unread,

Sep 19, 2008, 12:08:57 PM9/19/08

to Django developers

Hi everybody,

After my quite controversal blog post about Jinja and Django I summed
up some suggestions for the Django template engine that don't require
a complete reimplementation / breaking backwards compatibility :)

First of all the thread in-safeties I discovered [and the explanation
why nobody noticed so far]. First of all the explanation, because
that's probably more interesting: Per default templates are parsed and
evaluated each request so you will never notice that. However I've
seen enough applications by now that keep the template objects in
memory where these problems become visible. Why would anyone want to
not parse the templates each request? Because it's a lot faster. How
much? ~100 req/sec versus 60 req/sec for the file browser page on
bitbucket, and that's the overall request time including all of the
request handling, http header parsing and of course database.

Where are threading bugs? Everywhere, where a tag render method
somehow modifies the node object. Where does this happen? From what
I've seen so far it's done in "block", "ifchanged", "cycle" and
"extends". Now, of course the attribute assignments happen for a
reason: they are necessary, at least for cycle and ifchanged. Block
and extends look like accidents, they could be solved a lot cleaner
(block would put a reference to a hypothetical SuperBlock into the
context, rather than a reference to itself) and extends doesn't have
to store the evaluated value on the node at all, a local variable is
just fine. Ifchanged and cycle however have to somewhere store the
information, and I recommend using the context object for that. The
context could get `get_state_var`, `set_state_var` functions that can
be used to store node related information from one render call to the
next::

def get_state_var(self, node, name, default=None):
storage = self._state_vars.get(id(node), None)
if storage is not None and name in storage:
return storage[name]
return default

def set_state_var(self, node, name, value):
self._state_vars.setdefault(id(node), {})[name] = value

The cycle node could then look like this::

class CycleNode(Node):
def __init__(self, cyclevars, variable_name=None):
self.cycle_vars = map(Variable, cyclevars)
self.variable_name = variable_name

def render(self, context):
pos = context.get_state_var(self, 'pos', 0)
value = self.cycle_iter[pos].resolve(context)
context.set_state_var(self, 'pos', (pos + 1) %
len(self.cycle_vars))
if self.variable_name:
context[self.variable_name] = value
return value

This used in all core tags makes the `Template` object thread safe,
but not the `Context` which is perfectly okay because it doesn't
happen (or should not happen) that a context object is shared between
threads. At least I don't see a situation where this is useful.

What's harder to fix is how the i18n integration translates filter
arguments or gettext constants (those _("foo") thingies). Currently
that happens very often at node/object creation rather than node/
object evaluation. A good example for that problem is
FilterExpression.__init__. The translation would have to be moved to
the resolve method in that case. When would a language change between
template compilation and rendering? If the language changes each
request which is a very common case on multilingual websites.

Changing the parser to a more advanced system or making the template
engine more consistent is not so important for the time being, but I
want to raise a few situations I encountered where the behaviour is
confusing:

- cycle arguments are delimited by spaces and each item can be an
expression (however if there is a comma involved somewhere, it
seems like the tag is interpreted as comma separated list of
strings
which makes the 'cycle "foo", "bar", "baz"' yield unexpected
results.

- On the other hand arguments for the url tag are comma delimited,
but whitespace after comma is not allowed.

- The group-by part of the regroup tag is an expression, but
everything
else but a variable name returns unexpected results. Furthermore
does this tag only work on dicts. By the group-by part I'm
refering
to the expression after the "by" keyword:

{% regroupy foo by bar as blah %}

bar is here treated as constant string, even though it's
represented
as variable in the regroup node and in the syntax.

- Likewise ssi excepts one argument that looks like a variable, but
is
treated as constant string.

Tags to be excluded from the variable-as-string rule should be block
and load because they represent neither variables nor strings. The
argument for block works like a label and the argument for load is a
library.

Regards,
Armin

Malcolm Tredinnick

unread,

Sep 19, 2008, 9:48:33 PM9/19/08

to django-d...@googlegroups.com

On Fri, 2008-09-19 at 09:08 -0700, Armin Ronacher wrote:
[...]
> Where are threading bugs?

We're all going to find this easier if the right words are used.
Template class instances are mutable objects that hold their current
state. This is a design choice, not a bug. When something is done
intentionally and works as intended, that's a design choice, not a bug.
The distinction is important, as it leads to discussions being made as
to the differences in opinions about the decision, not an all-or-nothing
bug fix. It's for the same reason we don't refer to dictionaries, lists,
sets and other mutable Python objects as having thread bugs.

> Everywhere, where a tag render method
> somehow modifies the node object. Where does this happen? From what
> I've seen so far it's done in "block", "ifchanged", "cycle" and
> "extends". Now, of course the attribute assignments happen for a
> reason: they are necessary, at least for cycle and ifchanged. Block
> and extends look like accidents, they could be solved a lot cleaner
> (block would put a reference to a hypothetical SuperBlock into the
> context, rather than a reference to itself) and extends doesn't have
> to store the evaluated value on the node at all, a local variable is
> just fine.

That would change the rendering behaviour, however. Whether they were
accidents originally or not isn't really a concern right this moment.
Backwards compatibility is. The approach is certainly worth considering
(we have to do something like this for reset-ability of cycling, for
example), but the backwards-incompatible effects needs to be itemised
and addressed as well.

[...]

> This used in all core tags makes the `Template` object thread safe,
> but not the `Context` which is perfectly okay because it doesn't
> happen (or should not happen) that a context object is shared between
> threads. At least I don't see a situation where this is useful.

All that leaves is the thousands of third-party tags. Again, trying to
reach for some guarantee that instances can be shared between threads
has quite an extensive set of secondary effects. It might be preferable
to say that our current situation of being a mutable object holding
state is not that bad, rather than causing massive disruptions to the
userbase, since the current code still meets the "fast enough" criteria
for a very large group of uses.

> What's harder to fix is how the i18n integration translates filter
> arguments or gettext constants (those _("foo") thingies). Currently
> that happens very often at node/object creation rather than node/
> object evaluation. A good example for that problem is
> FilterExpression.__init__. The translation would have to be moved to
> the resolve method in that case. When would a language change between
> template compilation and rendering? If the language changes each
> request which is a very common case on multilingual websites.

That's actually a known. From memory, it's buried in a ticket talking
about something else, but I know it's been on my list of things to
address for a while now and pre-1.1 is probably the time frame where I'm
goign to be doing a lot of i18n work.

A quite reasonable goal is that template instances should be cacheable
right after they've been loaded (we already know that rendering changes
state, but right after loading they're always in the same state). That
means being pickle-able and, later, unpickle-able in a different view
with, as you note, a different active locale.

There are actually a whole swathe of problems with the various i18n
template tags that need to be cleaned up. They're in line behind sorting
some of the thing like making Variable() consistent and stuff like that
so that we can reuse the common support stuff. That will come out of
merging the work Johannes Dollinger is proposing with the existing code.
So there's an ordering of fixes going on there, but it's certainly on my
personal radar.

>
> Changing the parser to a more advanced system or making the template
> engine more consistent is not so important for the time being, but I
> want to raise a few situations I encountered where the behaviour is
> confusing:
>
> - cycle arguments are delimited by spaces and each item can be an
> expression (however if there is a comma involved somewhere, it
> seems like the tag is interpreted as comma separated list of
> strings
> which makes the 'cycle "foo", "bar", "baz"' yield unexpected
> results.
>
> - On the other hand arguments for the url tag are comma delimited,
> but whitespace after comma is not allowed.
>
> - The group-by part of the regroup tag is an expression, but
> everything
> else but a variable name returns unexpected results. Furthermore
> does this tag only work on dicts. By the group-by part I'm
> refering
> to the expression after the "by" keyword:
>
> {% regroupy foo by bar as blah %}
>
> bar is here treated as constant string, even though it's
> represented
> as variable in the regroup node and in the syntax.

Most of these are already parts of existing tickets. You might want to
make sure the remainder are filed as such if they can be fixed by
loosening the parsing slightly (whilst maintaining backwards
compatibility for the advertised, documented behaviour). I think you'll
also find that Johannes' patch(es) address a number of these.

> - Likewise ssi excepts one argument that looks like a variable, but
> is
> treated as constant string

Things like that have been around forever and really can't be changed
now. It's not that hard to get it right and backwards compatibility is
very important.

Regards,
Malcolm

zvoase

unread,

Sep 20, 2008, 5:31:03 AM9/20/08

to Django developers

Probably the best thing to do, if anything, is to make something new
and get it working alongside the old system, much like "oldforms" and
"newforms" in 0.96. That way you keep backwards compatibility but
people can also leverage new features in their code.

Regards,
Zack

On Sep 20, 3:48 am, Malcolm Tredinnick <malc...@pointy-stick.com>
wrote:

zvoase

unread,

Sep 20, 2008, 9:56:49 AM9/20/08

to Django developers

Oh, and by the way, with regards to thread safety, changes should be
made in the context, not the node. I'm currently working on a refactor
of the template system at the moment, something will be done within a
couple of days. If it's really necessary (I mean, I'm just doing it
for fun, but it might be useful).

Regards,
Zack

Johannes Dollinger

unread,

Sep 21, 2008, 9:48:34 AM9/21/08

to django-d...@googlegroups.com

Am 19.09.2008 um 18:08 schrieb Armin Ronacher:
[snip]

>
> What's harder to fix is how the i18n integration translates filter
> arguments or gettext constants (those _("foo") thingies). Currently
> that happens very often at node/object creation rather than node/
> object evaluation. A good example for that problem is
> FilterExpression.__init__. The translation would have to be moved to
> the resolve method in that case. When would a language change between
> template compilation and rendering? If the language changes each
> request which is a very common case on multilingual websites.

My patch treats _(x) as syntactic sugar for x|gettext - where
`gettext = lambda x: x and ugettext(x) or u''`.

I agree, a convention for tuples and literal strings would be really
helpful. I had to work around the {% url %} viewname issue (that's
why TokenStream.parse_string() takes bare=True). It's solved
backwards compatible yet accepts properly quoted strings. Whitespace
may occur between tokens but is discarded. {% regroup %} uses
TokenStream.parse_expression(), so that should be fixed too. {% cycle
%} is problematic because it accepts the legacy {% cycle a,b as c %}
format but that should be curable.

Reply all

Reply to author

Forward