Hey Mark,
On 05.03.2008, at 20:51, Mark Ramm wrote:
> Someone suggested to me that we try to find a GSoC candidate to work
> on Genshi-related performance enhancements as part of the TurboGears
> GSoC mentoring organization proposal. I love genshi but I would love
> it a lot more if we were able to improve the performance somewhat ;)
> Anybody interested in helping flesh out what would be involved in a
> Genshi performance enhancement project, or even better in mentoring a
> student who wants to work on this?
That's a pretty good idea.
One thing though is that the candidate would need to come with pretty
good knowledge of Python, and ideally also HTML/XML. Many parts of the
Genshi code-base aren't exactly for the faint of heart, especially in
those areas that are performance critical (match templates, xpath,
serialization). But those are also the areas with the largest
potential improvements. So it'd really take someone with solid
understanding of Python (in particular stuff like generators and
closures).
If we can find a good candidate for this, there are a number of
approaches for improving performance that have been discussed so far.
Some of those already have branches in the repository, or
(incomplete/outdated) patches in the ticket system. I'll try to give a
quick overview here:
1) Static match templates
Many match templates don't actually need to be run at render time;
rather they represent transformations that could be done immediately
after the template has been parsed (let's call this "compile time").
For example, if you have a match template that matches every
<foo>…</foo> element and transforms that into <div class="foo">…</div>,
Genshi should be able to expand that transformation at compile time.
When the template is actually rendered, Genshi no longer sees <foo>
tags and a corresponding match template; it just sees the
<div class="foo"> tags.
Static matching should probably be opt-in (or at least opt-out), for
example by defining the match template as <py:match path="foo"
static="true">.
This would require some surgery to add a proper optimization stage to
the template "compilation" process. That stage would also allow other
kinds of optimizations, such as moving the checks for valid nesting of
py:choose/py:when/py:otherwise out of the render stage. But static
matching definitely has the most potential for a huge speed boost in a
lot of scenarios.
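To make the idea a bit more concrete, here is a rough sketch in plain
Python. It is not Genshi code: the (kind, data) tuples only mimic the
shape of stream events, and expand_static_match() is a hypothetical
helper that handles nothing beyond a simple tag rename plus extra
attributes. The point is only that the rewrite runs once on the parsed
template instead of on every render.

```python
# Simplified stand-ins for stream event kinds (assumption, not Genshi's API).
START, END, TEXT = "START", "END", "TEXT"


def expand_static_match(events, tag, new_tag, extra_attrs):
    """Rewrite every <tag> element to <new_tag> with extra attributes.

    Meant to run once, right after parsing ("compile time").
    """
    out = []
    for kind, data in events:
        if kind == START and data[0] == tag:
            _name, attrs = data
            out.append((START, (new_tag, list(attrs) + list(extra_attrs))))
        elif kind == END and data == tag:
            out.append((END, new_tag))
        else:
            # Anything else passes through untouched.
            out.append((kind, data))
    return out


# Parsed form of <foo>hello</foo>
parsed = [
    (START, ("foo", [])),
    (TEXT, "hello"),
    (END, "foo"),
]

# After this call the template only ever contains <div class="foo">.
compiled = expand_static_match(parsed, "foo", "div", [("class", "foo")])
```

At render time the stream in `compiled` needs no match processing for
<foo> at all, which is where the speed-up would come from.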
2) Matching fast-paths
Here, the idea is to add fast paths for simple but common match
template constructs such as matching by tag name and/or attribute
value. Instead of going through the full XPath matching algorithm, you
use a simple hash lookup to determine whether a given element matches.
(Alec Flett recently brought up this idea, and started a branch to
implement it. Alec, how's that branch doing?)
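As an illustration only (this is not what the branch does, and
MatchIndex is a made-up name), a fast path could look roughly like the
following: simple "match by tag name" templates go into a dict, and
only the remaining ones fall back to a slower predicate that stands in
for full XPath matching. The sketch ignores declaration order between
simple and complex templates, which a real implementation would have
to preserve.

```python
class MatchIndex:
    """Hypothetical index that separates simple matches from complex ones."""

    def __init__(self):
        self.by_tag = {}  # tag name -> list of handlers (fast path)
        self.slow = []    # (predicate, handler) pairs (slow path)

    def add_simple(self, tag, handler):
        """Register a template that matches purely by tag name."""
        self.by_tag.setdefault(tag, []).append(handler)

    def add_complex(self, predicate, handler):
        """Register a template that needs a full match test."""
        self.slow.append((predicate, handler))

    def handlers_for(self, tag, attrs):
        # Fast path: a single hash lookup on the tag name.
        found = list(self.by_tag.get(tag, ()))
        # Slow path: evaluate each remaining predicate in turn.
        found.extend(h for pred, h in self.slow if pred(tag, attrs))
        return found


index = MatchIndex()
index.add_simple("foo", "foo-handler")
index.add_complex(
    lambda tag, attrs: dict(attrs).get("class") == "note",
    "note-handler",
)

foo_hits = index.handlers_for("foo", [])
note_hits = index.handlers_for("p", [("class", "note")])
no_hits = index.handlers_for("p", [])
```

If most match templates in an application are of the simple kind, the
slow list stays short and the per-element cost is dominated by the
dict lookup.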
3) Serialization hints / Markup event collapsing
Genshi currently computes the XML or HTML representation of every
event in a template output stream at render time, to enable generating
different serializations of the same markup at the infoset(-ish)
level. I.e. write your templates in XHTML, and switch to HTML on the
fly when rendering.
However, if you know beforehand that you're going to be using a
specific serialization method, the representation of many of the
template events could be pre-computed, so that actually serializing
that event just means returning a static string. For example, you
could pre-compute the string representation of the markup event
(START, ('b', [('class', 'foo')])) to be <b class="foo">. When the
serializer sees that event, it just returns the pre-computed string.
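A minimal sketch of that pre-computation, again in plain Python with
simplified (kind, data) tuples rather than real Genshi events. The
function names are invented for the example, and it assumes a single
fixed XML-style serialization with no namespaces.

```python
from html import escape

# Simplified stand-ins for stream event kinds (assumption, not Genshi's API).
START, END, TEXT = "START", "END", "TEXT"


def render_event(kind, data):
    """Serialize one event to a string (the work we want to do only once)."""
    if kind == START:
        tag, attrs = data
        rendered_attrs = "".join(
            ' %s="%s"' % (name, escape(value, quote=True))
            for name, value in attrs
        )
        return "<%s%s>" % (tag, rendered_attrs)
    if kind == END:
        return "</%s>" % data
    return escape(data, quote=False)


def precompute(events):
    """Attach the serialized string to each static event ahead of time."""
    return [(kind, data, render_event(kind, data)) for kind, data in events]


def serialize(precomputed):
    """At render time, just return the strings that were computed earlier."""
    return "".join(text for _kind, _data, text in precomputed)


events = [
    (START, ("b", [("class", "foo")])),
    (TEXT, "hi"),
    (END, "b"),
]
cached = precompute(events)
output = serialize(cached)
```

Here `cached[0][2]` holds the ready-made start tag, so the serializer
never has to look at the tag name or attributes again.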
The challenge with this is that template directives and expressions,
but also stream filters, can replace events in a template output
stream. So if that START tag had an expression in an attribute value,
you can't pre-compute the serialization (especially considering that
expressions returning None remove the attribute altogether). Or a
stream filter might replace that event entirely, changing the tagname
to "strong" or adding/modifying attributes. XML namespaces raise yet
more challenges.
But in general, being able to map an event to a static string should
really help performance. Taking it a bit further, it should be
possible to collapse multiple "static" events into just one event.
Alec Thomas has started some work in this direction on the "optimizer"
branch. Alec, any insights you'd like to add to this?
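For what it's worth, the collapsing step could be sketched like this.
This is not taken from the "optimizer" branch. It assumes the stream
has already been reduced to a list in which static items are plain
strings and dynamic items are callables taking a context dict, and it
leaves out escaping of dynamic values for brevity.

```python
def collapse_static(items):
    """Merge runs of adjacent static strings; leave dynamic items alone."""
    out = []
    buffered = []
    for item in items:
        if isinstance(item, str):
            buffered.append(item)
            continue
        # A dynamic item ends the current run of static strings.
        if buffered:
            out.append("".join(buffered))
            buffered = []
        out.append(item)
    if buffered:
        out.append("".join(buffered))
    return out


def render(items, context):
    """Emit static strings as-is and call dynamic items with the context."""
    return "".join(
        item if isinstance(item, str) else item(context) for item in items
    )


def name_expr(context):
    # Stand-in for a template expression such as ${name}.
    return context["name"]


stream = ["<p>", "Hello, ", name_expr, "!", "</p>"]
collapsed = collapse_static(stream)
result = render(collapsed, {"name": "World"})
```

Five items become three, and the two static ones are single strings,
so the render loop does correspondingly less work per request.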
4) Compilation to Python byte code
This would compile templates down to Python byte code. I started this
on the "inline" branch, but it's incomplete, and the measurement
results were somewhat underwhelming. It might still help general
performance, but I think the stuff mentioned above would be more
beneficial.
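Just to show the general shape of the technique (this is not how the
"inline" branch works, and the ('text', ...) / ('var', ...) part format
is invented for the example): generate Python source for a generator
function, then turn it into a code object with the built-in compile()
and run it with exec().

```python
def compile_template(parts):
    """Compile a list of ('text', str) / ('var', name) parts to a function."""
    lines = [
        "def render(ctx):",
        # Keeps render() a generator even if there are no parts at all.
        "    if False: yield ''",
    ]
    for kind, value in parts:
        if kind == "text":
            # Static text becomes a string literal in the generated code.
            lines.append("    yield %r" % value)
        elif kind == "var":
            # Variables are looked up in the context dict at render time.
            lines.append("    yield str(ctx[%r])" % value)
        else:
            raise ValueError("unknown part kind: %r" % (kind,))
    source = "\n".join(lines)
    code = compile(source, "<template>", "exec")  # source -> byte code
    namespace = {}
    exec(code, namespace)
    return namespace["render"]


render = compile_template([
    ("text", "<p>Hello, "),
    ("var", "name"),
    ("text", "!</p>"),
])
page = "".join(render({"name": "World"}))
```

The interpreter loop over template events is gone; what remains is an
ordinary generator function that Python executes directly.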
Okay, that's all I can think of for now.
I'd be willing to act as mentor for such a project, if there's someone
who wants to tackle the problems and seems capable of doing the work.
Cheers,
Chris
--
Christopher Lenz
cmlenz at gmx.de
http://www.cmlenz.net/