Q's about speeding up template rendering.

Chris Slowe

Aug 9, 2007, 9:28:18 PM
to Mako Templates for Python
Hi Group,

I'm one of reddit's developers, and we are currently in the midst of
rearchitecting the site with Mako as the new templating engine.
Overall I'd say the experience (having migrated from Cheetah) has been
very positive. We're right now trying to clean up and speed up and
I'm noticing in the code profiling a couple of spots in Mako that seem
to be slowing the rendering down a bit.

For starters, I'm noticing that when the templates are compiled,
<%namespace> tags in the template generate corresponding _populate()
calls in the compiled form, but that these seem to be inserted
regardless of whether any of the resulting imports are used in that
funcall. Where this becomes problematic is that I ended up creating a
"library" template of commonly rendered parts of the page (a bunch of
%defs and a %namespace at the top of that one with some other basic
bits and pieces) which gets called pretty much everywhere. After I
tracked this, I ended up splitting the file up by the "what namespace
do you need" metric and that seems to have made things a bit perkier,
but of course at the cost of keeping things organized. Is there
another way to do this?

Another one is related to the way filtering is done. In the past,
we've operated under the assumption that we'll accept whatever
suspicious/malicious data the user might want to send to us, and sanitize
it later whenever it is rendered. This is a rather straightforward
operation in Cheetah with #filter to toggle the different filter
built-ins. In Mako there seems to be a global default filter, and
%def (or %{}) specific ones. But, turning on a global "h" filter
(which is the closest analogy I can make to what we have done in the
past in Cheetah) seems to generate a bit of a performance hit, and has
rather dire consequences as to the quality of the HTML generated
in funcalls, as there seems no built in "unfilter" or "nofilter"
operation. Am I missing something? Suggestions?

Considering that our pages are basically long listings of near-
identical objects with similar rendering code, I have no doubt that
we've managed to abuse Mako a bit. Mainly I hope this didn't come off
as a diatribe (I really like working with mako) and was aiming for
something like a call to help/call to action.

Any help, and any other known "gotchas" to improve/hamper performance,
is appreciated,
Chris

Michael Bayer

Aug 9, 2007, 11:20:38 PM
to Mako Templates for Python

On Aug 9, 9:28 pm, Chris Slowe <chris.sl...@gmail.com> wrote:
> Hi Group,
>
> I'm one of reddit's developers, and we are currently in the midst of
> rearchitecting the site with Mako as the new templating engine.
> Overall I'd say the experience (having migrated from Cheetah) has been
> very positive.

that is really exciting to hear. I'm a reddit junkie myself.

>
> For starters, I'm noticing that when the templates are compiled,
> <%namespace> tags in the template generate corresponding _populate()
> calls in the compiled form, but that these seem to be inserted
> regardless of whether any of the resulting imports are used in that
> funcall.
> Where this becomes problematic is that I ended up creating a
> "library" template of commonly rendered parts of the page (a bunch of
> %defs and a %namespace at the top of that one with some other basic
> bits and pieces) which gets called pretty much everywhere. After I
> tracked this, I ended up splitting the file up by the "what namespace
> do you need" metric and that seems to have made things a bit perkier,
> but of course at the cost of keeping things organized. Is there
> another way to do this?

the most straightforward way around this is to not use "import='*'",
and to instead access namespace attributes from the namespace name
directly, i.e.

<%namespace name="foo" file="bar"/>

${foo.bat()}

or to at least qualify which names you'll be using, i.e.
<%namespace file="bar" import="x, y, z"/>.

One would hypothesize that there would be a way, given the usage of
"import='*'", that we could figure out the actual list of tokens to be
pulled from the import at compile time, based on the names present
within the body of the template, since we are doing lots of decision
making based upon that information.

But the issue here is that if your template imports three namespaces
using "import='*'", and then you reference some identifier names
"foo", "bar", and "bat", Mako has to assume you want to get those from
the context, since you haven't specified anywhere else they might come
from. Therefore the current approach is that each namespace which had
an "import='*'" populates its full set of tokens into the context.

i can think of two ways, from a "mako architecture" point of view, to
decrease the time spent populating the namespace into the context.
one would be to not populate the context at all, and instead have the
context "fall back" onto the namespace data if a value is not found;
however this means that context values sent to render() would
supersede namespace names, which is something i don't like (random data
placed into the context can change the execution of the template to
something unexpected). but also, it means the logic of context.get()
must be changed to consult the namespaces as well which also adds
overhead to retrieving those names. if the order of lookup were
reversed, then you add overhead in looking up the non-namespace
names. so that's unlikely to improve performance in the majority of
cases.
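
A minimal sketch of that first option, assuming plain dicts stand in for the render() context and for a namespace (these are hypothetical stand-ins, not Mako's Context or Namespace classes). It illustrates the precedence problem described above: when the context is consulted first, a stray context key silently replaces a namespace def.

```python
from collections import ChainMap

# Hypothetical stand-ins: plain dicts model the context and one namespace.
namespace_defs = {"sidebar": lambda: "<div>real sidebar</div>"}
context_data = {"title": "front page"}

# Context first, namespace as fallback: nothing is copied up front.
lookup = ChainMap(context_data, namespace_defs)
before = lookup["sidebar"]()

# The hazard: a value placed into the context shadows the namespace def.
context_data["sidebar"] = lambda: "unexpected"
after = lookup["sidebar"]()

print(before)
print(after)
```

Swapping the two arguments to ChainMap gives the reversed lookup order, where every ordinary context name has to miss in the namespaces first.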

the other way would be to limit the _populate() call just to names
that are locally used in the code block; i.e. instead of
_populate(['*']) it would be _populate(['a', 'b', 'c']), where 'a',
'b', and 'c' are names that were referenced locally. that's a little
weird though because *all* namespaces would have those same names
called (since we don't know at compile time which namespaces actually
have which tokens available), and each namespace would have to
silently skip over token names which they do not contain. this might
speed you up a little bit though it would require some testing.
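
A rough sketch of the difference between populating everything and populating only the locally referenced names. The populate() helper and the dict-based namespaces are assumptions made for illustration; this is not the code Mako generates.

```python
def populate(target, namespaces, names):
    """Copy defs from each namespace into target.

    names == ["*"] copies everything. Otherwise only the listed names are
    copied, and a namespace silently skips names it does not define.
    """
    for ns in namespaces:
        if names == ["*"]:
            target.update(ns)
        else:
            for name in names:
                if name in ns:
                    target[name] = ns[name]
    return target


# Hypothetical namespaces, each with one def the template never uses.
foo = {"a": "foo.a", "unused1": "foo.unused1"}
bar = {"b": "bar.b", "c": "bar.c", "unused2": "bar.unused2"}

everything = populate({}, [foo, bar], ["*"])
only_used = populate({}, [foo, bar], ["a", "b", "c"])

print(sorted(everything))
print(sorted(only_used))
```

Note that every namespace is asked for every name, which is the "a little weird" part: foo is probed for "b" and "c" even though only bar has them.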

However, you can do the equivalent to the latter approach anyway, by
just saying "import='a, b, c'" instead of "import='*'". that
eliminates the guesswork that Mako would have to otherwise do and is
faster at both compile and runtime.

But contrast both of these approaches to just calling values off the
namespace directly, i.e. don't use "import" at all, and just say
${foo.bat()}, and there's no comparison. there's zero upfront overhead
to this approach and if you don't reference the namespace's functions
in the block, then *nothing* gets rendered into the compiled
render_body(), regardless of how many <%namespace> tags you have.
originally this was my intended usage of namespaces, the
"import=something" idea was kind of an afterthought. i think it's
cleaner to reference functions relative to their parent namespace
anyway and this is the approach I take in my own code.

>
> Another one is related to the way filtering is done. In the past,
> we've operated under the assumption that we'll accept whatever
> suspicious/malicious data the user might want to send to us, and
> sanitize it later whenever it is rendered. This is a rather
> straightforward operation in Cheetah with #filter to toggle the
> different filter built-ins. In Mako there seems to be a global
> default filter, and %def (or %{}) specific ones. But, turning on a
> global "h" filter (which is the closest analogy I can make to what we
> have done in the past in Cheetah) seems to generate a bit of a
> performance hit, and has rather dire consequences as to the quality
> of the HTML generated in funcalls, as there seems no built in
> "unfilter" or "nofilter" operation. Am I missing something?
> Suggestions?

yes, you're missing something, but it's not your fault; i added it a few
releases ago and forgot to document it. If you are using a global
filter of some kind, i.e. at the <%page> or Template() level, you can
disable it by using the "n" filter, which looks like:

${"sometext" | n}

This filter can also be used with other filters, such as:

${"sometext" | n, x}

which would turn off default filtering and turn on XML filtering in
that expression. the 'n' flag turns off *everything*, even the global
"unicode()" filter that's declared in Template, so it eliminates all
filtering overhead.

also, all of these filters apply only to expressions, i.e. ${}.
there's no default filtering applied to the output of <%def>s (since
the output of a <%def> which you'd want to filter is...the
expressions!).

of course, if you're lamenting that there's *any* default filter, that
can be turned off entirely by just sending default_filters=[] to your
Template/TemplateLookup. the whole filtering thing is completely
optional (which is very intentional because i know its a speed hit).
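
To make the interaction between default filters and "n" concrete, here is a small conceptual model in plain Python. The apply_filters() helper, the FILTERS table and the default of ("unicode", "h") are all assumptions for this sketch (the "h" default models the global HTML filter from the question); it does not call Mako and is not how Mako implements filtering internally.

```python
import html

# Hypothetical filter table: "h" is modeled with the stdlib's html.escape,
# "unicode" with str. Real Mako filters may differ in detail.
FILTERS = {"h": html.escape, "unicode": str}


def apply_filters(value, expr_filters=(), default_filters=("unicode", "h")):
    """Apply default filters, then per-expression ones; 'n' drops the defaults."""
    names = list(expr_filters)
    if "n" in names:
        # "n" disables every default filter; other listed filters still run.
        names = [name for name in names if name != "n"]
    else:
        names = list(default_filters) + names
    for name in names:
        value = FILTERS[name](value)
    return value


escaped = apply_filters("<b>hi</b>")                    # defaults apply
raw = apply_filters("<b>hi</b>", ["n"])                 # like ${... | n}
explicit = apply_filters("<b>hi</b>", ["n", "h"])       # like ${... | n, h}
no_defaults = apply_filters("<b>hi</b>", default_filters=())

print(escaped, raw, explicit, no_defaults)
```

The last call corresponds to passing default_filters=[] as described above: with no defaults configured, an unfiltered expression costs nothing extra.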

>
> Considering that our pages are basically long listings of near-
> identical objects with similar rendering code, I have no doubt that
> we've managed to abuse Mako a bit. Mainly I hope this didn't come off
> as a diatribe (I really like working with mako) and was aiming for
> something like a call to help/call to action.

oh not at all, you haven't hit upon anything i wasn't aware of. when i
wrote this thing, all i would do was run the basic.py test as compared
to cheetah and spend all day trying to trim hundredths of a second off
the rendering time, so as to stay matched with cheetah and its C
extensions.

>
> Any help, and any other known "gotchas" to improve/hamper performance,
> is appreciated,

I hope my suggestions prove to be helpful, and let me know if you need
anything else. this is the first time anyone has dug into mako's
performance, since we already were pretty darn fast out of the gate,
but naturally there are always more optimizations that can be made and
i am glad you are taking a close look at them.

i'm also 'zzzeek' on freenode channels #pylons and #sqlalchemy.

- mike

Michael Bayer

Aug 10, 2007, 12:15:24 AM
to Mako Templates for Python

On Aug 9, 11:20 pm, Michael Bayer <zzz...@gmail.com> wrote:
>
> But the issue here is that if your template imports three namespaces
> using "import='*'", and then you reference some identifier names
> "foo", "bar", and "bat", Mako has to assume you want to get those from
> the context, since you haven't specified anywhere else they might come
> from. Therefore the current approach is that each namespace which had
> an "import='*'" populates its full set of tokens into the context.

Actually, I have to correct myself. this is not exactly what
happens. what happens *is* actually along the lines of the "check the
namespaces first, then check the context" technique i mentioned,
combined with a dictionary which is being populated. it conceptually
looks like this, if you imported namespaces "foo", "bar", and "bat",
and your template then references the variable names "x", "y", and
"z":

namespace_names = {}
for namespace in (foo, bar, bat):
    namespace.copy_everything_into(namespace_names)

x = namespace_names.get("x", context.get("x", UNDEFINED))
y = namespace_names.get("y", context.get("y", UNDEFINED))
z = namespace_names.get("z", context.get("z", UNDEFINED))

the alternate way to do this would be:

x = foo.get("x", bar.get("x", bat.get("x",
        context.get("x", UNDEFINED))))
...

which one is faster generally depends on how many names are present
in each of "foo", "bar", and "bat", versus how many namespaces are
imported into the template overall. but also i'm looking at
_populate(), and i can see why it's slowing you down, since there's a
lot of overhead in creating a callable "namespace" thingy which gets
placed in the rendered dictionary (we're talking about the workings of
Namespace._get_star()). Whereas if we went with the second approach,
and "x" didn't exist anywhere, we'd have the overhead of three
"AttributeErrors" being thrown (although hasattr() might speed it up).

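The two strategies can be put side by side in a short sketch. This assumes the namespaces and the context are plain dicts and uses a placeholder UNDEFINED sentinel; the function names are hypothetical and nothing here is Mako's actual code. Membership tests are used instead of exceptions, in the spirit of the hasattr() remark.

```python
UNDEFINED = object()  # placeholder sentinel for this sketch


def lookup_merged(namespaces, context, names):
    """Strategy 1: copy every namespace into one dict, then look names up."""
    merged = {}
    for ns in namespaces:
        merged.update(ns)  # cost grows with the total number of defs
    return {n: merged.get(n, context.get(n, UNDEFINED)) for n in names}


def lookup_chained(namespaces, context, names):
    """Strategy 2: probe each namespace in turn, then fall back to context."""
    result = {}
    for n in names:
        for ns in namespaces:  # cost grows with names times namespaces
            if n in ns:
                result[n] = ns[n]
                break
        else:
            result[n] = context.get(n, UNDEFINED)
    return result


foo, bar, bat = {"x": 1, "p": 0}, {"y": 2, "q": 0}, {"r": 0}
context = {"z": 3}
names = ["x", "y", "z", "missing"]

merged_result = lookup_merged([foo, bar, bat], context, names)
chained_result = lookup_chained([foo, bar, bat], context, names)
print(merged_result == chained_result)
```

One difference worth keeping in mind: if two namespaces define the same name, the merged dict keeps the last one copied in, while the chained lookup returns the first one found.
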
anyway, let me know if you can just use <namespacename>.<defname>
instead, since it's still way faster than anything we can do with
"import". otherwise, we can look into doing away with the _get_star()
call and adding a fast "return this def if you have it" function to
Namespace. this would also be an excuse for me to put out the next
release where there's some other things that have been waiting around.
