Anybody interested in helping flesh out what would be involved in a
Genshi performance enhancement project, or even better in mentoring a
student who wants to work on this?
--
Mark Ramm-Christensen
email: mark at compoundthinking dot com
blog: www.compoundthinking.com/blog
On 05.03.2008, at 20:51, Mark Ramm wrote:
> Someone suggested to me that we try to find a GSoC candidate to work
> on Genshi related performance enhancments as part of the TurboGears
> GSoC mentoring organization proposal. I love genshi but I would love
> it a lot more we were able to improve the performance somewhat ;)
>
> Anybody interested in helping flesh out what would be involved in a
> Genshi performance enhancement project, or even better in mentoring a
> student who wants to work on this?
That's a pretty good idea.
One thing though is that the candidate would need to come with pretty
good knowledge of Python, and ideally also HTML/XML. Many parts of the
Genshi code-base aren't exactly for the faint of heart, especially in
those areas that are performance critical (match templates, xpath,
serialization). But those are also the areas with the largest
potential improvements. So it'd really take someone with solid
understanding of Python (in particular stuff like generators and
closures).
If we can find a good candidate for this, there are a number of
approaches for improving performance that have been discussed so far.
Some of those already have branches in the repository, or (incomplete/
outdated) patches in the ticket system. I'll try to give a quick
overview here:
1) Static match templates
Many match templates don't actually need to be run at render time;
rather they represent transformations that could be done immediately
after the template has been parsed (let's call this "compile time").
For example, if you have a match template that matches every <foo>…</
foo> element and transforms that into <div class="foo">…</div>, Genshi
should be able to expand that transformation at compile time. When the
template is actually rendered, Genshi no longers sees <foo> tags and a
corresponding match template, it just sees the <div class="foo"> tags.
Static matching should probably be opt-in (or at least opt-out), for
example by defining the match template as <py:match path="foo"
static="true">.
This would require some surgery to add a proper optimization stage to
the template "compilation" process. That stage would also allow other
kinds of optimizations, such as moving the checks for valid nesting of
py:choose/py:when/py:otherwise out of the render stage. But static
matching definitely has the most potential for a huge speed boost in a
lot of scenarios.
2) Matching fast-paths
Here, the idea is to add fast paths for simple but common match
template constructs such as matching by tag name and/or attribute
value. Instead of going through the full XPath matching algorithm, you
use a simple hash lookup to determine whether a given element matches.
(Alec Flett recently brought up this idea, and started a branch to
implement it. Alec, how's that branch doing?)
3) Serialization hints / Markup event collapsing
Genshi currently computes the XML or HTML representation of every
event in a template output stream at render time, to enable generating
different serializations of the same markup at the infoset(-ish)
level. I.e. write your templates in XHTML, and switch to HTML on the
fly when rendering.
However, if you know beforehand that you're going to be using a
specific serialization method, the representation of many of the
template events could be pre-computed, so that actually serializing
that event just means returning a static string. For example, you
could pre-compute the string representation of the markup event
(START, ('b', [(class, 'foo')])) to be <b class="foo">. When the
serializer sees that event, it just returns the pre-computed string.
The challenge with this is that template directives and expressions,
but also stream filters, can replace events in a template output
stream. So if that START tag had an expression in an attribute value,
you can't pre-compute the serialization (especially considering that
expressions returning None remove the attribute altogether). Or a
stream filter might replace that event entirely, changing the tagname
to "strong" or adding/modifying attributes. XML namespaces raise yet
more challenges.
But in general, being able to map an event to a static string should
really help performance. Taking it a bit further, it should be
possible to collapse multiple "static" events into just one event.
Alec Thomas has started some work in this direction on the "optimizer"
branch. Alec, any insights you'd like to add to this?
4) Compilation to Python byte code
This would compile templates down to Python byte code. I started this
on the "inline" branch, but it's incomplete, and the measurement
results were somewhat underwhelming. It might still help general
performance, but I think the stuff mentioned above would be more
beneficial.
Okay that's all I can think of for now.
I'd be willing to act as mentor for such a project, if there's someone
who wants to tackle the problems and seems capable of doing the work.
Cheers,
Chris
--
Christopher Lenz
cmlenz at gmx.de
http://www.cmlenz.net/
Hi,
First, I think getting involved with GSoC is a great idea.
Now a rough outline of what I was doing before life intervened:
The rendering optimisation concentrates on two overall strategies:
reducing the number of events that need rendering, and decreasing the
total number of events that require processing at all.
It does this by wrapping each Template in an Optimizer object. The
Optimizer analyses the Genshi stream during rendering, annotating the
stream with OPTIMIZER events during analysis, then generally performing
optimisations after the render is complete.
The collapsing of static events Chris mentioned is already implemented:
http://genshi.edgewall.org/browser/branches/experimental/optimizer/genshi/template/optimize.py#L189
There are still bugs, but it's really only a PoC.
The next strategy I was playing with was caching rendered blocks keyed
on input variables.
eg.
Given the template fragment:
<div>
<a href="${href}">${label}</a>
</div>
The optimiser would cache the rendered output on the key (href, label).
I'd like to say I have time to continue working on it soon, but time is
a rare commodity at the moment :(
--
Evolution: Taking care of those too stupid to take care of themselves.
This fast-path py:match work sounds really cool. Is there a way to get
hold of it so I can try it out?
Schiavo
Simon
Nevermind, I am silly and have found the branch [1].
[1] http://svn.edgewall.org/repos/genshi/branches/experimental/match-fastpaths
Schiavo
Simon