This is probably true. Genshi's approach is different to the other templating engines in that Genshi:
* first parses an XML template into a stream of events
* generates a stream of output events from the template
* renders the stream of output events to a new XML string
The downside to this approach is performance. The upside is that templates which are valid XML can be expected to generate output that is valid XML, and problems like missing or broken end tags can be picked up easily in the templates themselves.
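A minimal sketch of the three steps above, using only the standard library. This is not Genshi's code: the event tuples, the `$name` placeholder handling and the function names `parse`, `substitute` and `render` are all assumptions made for illustration. It also shows the upside mentioned above: a broken end tag is rejected when the template is parsed.

```python
import xml.parsers.expat
from xml.sax.saxutils import escape, quoteattr

START, TEXT, END = "START", "TEXT", "END"

def parse(source):
    """Step 1: parse an XML string into a list of (kind, data) events.

    Raises xml.parsers.expat.ExpatError if the source is not well-formed.
    """
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver each run of text as a single event
    parser.StartElementHandler = lambda name, attrs: events.append((START, (name, attrs)))
    parser.EndElementHandler = lambda name: events.append((END, name))
    parser.CharacterDataHandler = lambda data: events.append((TEXT, data))
    parser.Parse(source, True)
    return events

def substitute(events, context):
    """Step 2: generate output events, replacing $name placeholders in text."""
    for kind, data in events:
        if kind == TEXT:
            for key, value in context.items():
                data = data.replace("$" + key, str(value))
        yield kind, data

def render(events):
    """Step 3: serialize the output events to a new XML string."""
    parts = []
    for kind, data in events:
        if kind == START:
            name, attrs = data
            attr_text = "".join(" %s=%s" % (k, quoteattr(v)) for k, v in attrs.items())
            parts.append("<%s%s>" % (name, attr_text))
        elif kind == END:
            parts.append("</%s>" % data)
        else:
            parts.append(escape(data))  # text is escaped on output
    return "".join(parts)

template = parse('<p class="greeting">Hello, $name!</p>')
print(render(substitute(template, {"name": "World & co"})))

try:
    parse("<p><b>broken</p>")  # mismatched end tag
except xml.parsers.expat.ExpatError as exc:
    print("template rejected:", exc)
```

Every event passes through several Python-level calls on its way to the output, which is where the cost discussed in this thread comes from.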
Whether the reduced performance is worth this benefit will depend on your use case. In my own current use case the performance requirements are very moderate (parsing a few tens of text and XML files at process launch) and the benefits of picking up problems early are clearly worth a few fractions of a second in start-up time.
Other templating engines I've looked at generally work directly on the XML as though it were plain text. This appears to be borne out by the analysis in the talk you reference -- it's all about raw string operations and concatenating output. In Genshi's case a substantial part of the time is spent processing event streams (if I'm remembering correctly from my dive into Genshi benchmarking many years ago).
The article doesn't actually appear to contain the details of the code used to perform the Genshi benchmark so I can't comment in any detail on whether they could have done better. The Trac developers on the list will probably have a better idea of how to optimally use Genshi for web templating than I do.
I am keen to see Genshi's performance improved, potentially even at the expense of some sort of major code reworking if the speed increase warranted it (say an order of magnitude or two decrease in the time a real world application like Trac spends rendering templates). I don't think it makes any sense to change Genshi's core philosophy of parsing XML though -- if we did that it wouldn't be Genshi.
I did some profiling myself a few months ago and it seems that
Genshi spends most of its output serialization time applying its
filters. While the generators used probably aren't slow at all, the
speed gain is entirely offset by the number of method calls required
to process the stream events. Installing with the C-extension speedups
doesn't seem to help much either. Like you, I actually think that
parse-time error checking and filters are two of Genshi's most
distinguishing features, and I have refused to switch to another template
engine even though I have some pretty heavy traffic requirements.
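A rough way to see the per-event overhead described above is to push the same events through a growing chain of do-nothing generator filters and time it. This is only a sketch under assumptions: `make_events`, `passthrough` and `through_filters` are hypothetical names, and real Genshi filters do far more work per event than this.

```python
import timeit

def make_events(n):
    """Build a list of simple (kind, data) events."""
    return [("TEXT", "chunk %d" % i) for i in range(n)]

def passthrough(stream):
    """A do-nothing filter: costs one generator resume per event."""
    for event in stream:
        yield event

def through_filters(events, depth):
    """Wrap the events in `depth` nested filters and consume the result."""
    stream = iter(events)
    for _ in range(depth):
        stream = passthrough(stream)
    return list(stream)

events = make_events(10000)
for depth in (0, 1, 5, 10):
    seconds = timeit.timeit(lambda: through_filters(events, depth), number=20)
    print("filters=%2d  %.4fs" % (depth, seconds))
```

The absolute numbers depend on the machine and interpreter; the point is only to compare how the time changes as filters are stacked.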
I agree that this is a pretty firm design issue that I can't do much
about without some fundamental changes in how the stream events are
represented. Is using ElementTree or lxml a viable option at all
should a revamp happen?
Jimmy Yuen Ho Wong
On May 13, 4:05 am, Simon Cross <hodges...@gmail.com> wrote:
> [snip]
> I've done some profiling myself a few months ago and it seems that
> Genshi spends most of output serialization time in applying it's
> filters. While the generators used probably aren't slow at all, the
> speed gain is entirely offset by the number of method calls required
> to process the stream events. Installing with the C-extension speedups
> don't seem to help much either. I'm like you, I actually think that
> parse time error checking and filters are 2 of Genshi's most
> distinguishing features and have refused to switch to another template
> engine even though I have some pretty heavy traffic requirements.
> I agree that this is a pretty firm design issue that I can't do much
> about without some fundamental changes in how the stream events are
> represented. Is using ElementTree or lxml a viable option at all
> should a revamp happen?
By the way, did someone already put cmlenz' ideas about a "Genshi v2" somewhere on the wiki? If not, I can try to dig up the mail he sent last year to Trac-dev about this.
On Thu, May 12, 2011 at 11:17 PM, Yuen Ho Wong <wyue...@gmail.com> wrote:
> I agree that this is a pretty firm design issue that I can't do much
> about without some fundamental changes in how the stream events are
> represented. Is using ElementTree or lxml a viable option at all
> should a revamp happen?
ElementTree would break the streaming model, which would be a pity (and I don't know off-hand whether it would be faster). I don't know lxml well enough to comment on its use.
Aside: Genshi provides some things (e.g. XPath) that I'm not sure will ever be blindingly fast in the general case.
On Thu, May 12, 2011 at 11:29 PM, Christian Boos <cb...@neuf.fr> wrote:
> By the way, did someone already put cmlenz' ideas about a "Genshi v2"
> somewhere on the wiki? If not, I can try to dig the mail he sent last year
> to Trac-dev about this.
I don't think I've seen them and a quick search of the Trac and Genshi wikis and mailing lists didn't turn anything up. If you could dig them up that would be much appreciated.
I just took a look at the serializer's __call__ methods again and it
seems that a lot of the ideas Tenjin uses are already being used in
Genshi, except for aliasing the global names locally. Memoization
was added in 0.6, so that should help somewhat (the benchmark Tenjin
did was against 0.5.x). Maybe, as low-hanging fruit, someone can
rewrite all the _emit calls in C and do name aliasing and see how they
perform? Anything more than that but short of going Genshi2 would be
to rewrite all the serializers in C entirely, but would rewriting the
generators in C make THAT much of a difference?
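For readers unfamiliar with the two tricks mentioned above, here is a generic sketch of local name aliasing and memoization in a serializer-like loop. It does not reproduce Genshi's or Tenjin's actual code; `serialize_plain` and `serialize_aliased` are hypothetical names, and the cache is a plain dict chosen for illustration.

```python
import timeit
from xml.sax.saxutils import escape

def serialize_plain(texts):
    """Baseline: global lookup of escape and attribute lookup of append per item."""
    out = []
    for text in texts:
        out.append(escape(text))
    return "".join(out)

def serialize_aliased(texts, _escape=escape):
    """Same output, with names aliased locally and escaped values memoized."""
    out = []
    append = out.append      # alias the bound method once
    cache = {}               # memoize escaped strings seen before
    cache_get = cache.get
    for text in texts:
        escaped = cache_get(text)
        if escaped is None:
            escaped = cache[text] = _escape(text)
        append(escaped)
    return "".join(out)

texts = ["a < b", "Tom & Jerry", "plain"] * 5000
assert serialize_plain(texts) == serialize_aliased(texts)
for func in (serialize_plain, serialize_aliased):
    print(func.__name__, timeit.timeit(lambda: func(texts), number=20))
```

How much this helps depends heavily on how repetitive the data is and on the interpreter version, so the timings should be read as indicative only.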
On May 13, 7:01 am, Christian Boos <cb...@neuf.fr> wrote:
> [snip]
On Thursday, May 12, 2011 at 10:05:26 PM, Simon Cross <hodges...@gmail.com> wrote:
> On Thu, May 12, 2011 at 8:40 PM, Yuen Ho Wong <wyue...@gmail.com>
> wrote:
> > I was just wondering if there is some truth to its claim that
> > Genshi is the
> > second slowest template engine in the batch...
> > http://www.slideshare.net/kwatch/how-to-create-a-highspeed-template-e...
> > in-python
> [snip]
> I am keen to see Genshi's performance improved, potentially even at
> the expense of some sort of major code reworking if the speed increase
> warranted it (say an order of magnitude or two decrease in the time a
> real world application like Trac spends rendering templates). I don't
> think it makes any sense to change Genshi's core philosophy of parsing
> XML though -- if we did that it wouldn't be Genshi.
Agreed here. Has anyone else here looked at chameleon's genshi support? It's supposed to retain the XML-structure-based approach while producing speedup...
AFAIK Chameleon's Genshi support doesn't do stream processing, so it
doesn't support filters. They've also dropped support for Genshi in
2.0. I'm not sure whether 2.0 final will have Genshi support.
On May 13, 2:32 pm, David Fraser <dav...@sjsoft.com> wrote:
> [snip]
OK, I've done some thinking and a few small experiments and I must say
that cmlenz is correct. If we could somehow make matching templates and
includes static, that would probably shave a significant portion off the
time used when serializing the output. I'm not sure about fragment caching
tho. Are you going to cache the fragments as strings or stream events?
If you cache them as strings, then all the XPath code, matching
templates and whatnot are going to be completely broken. You can't
transform the strings back to stream events in place, as that will
probably be even slower than just storing them as streams. Compiling to
bytecode is likely not useful either. Bytecode doesn't speed up
execution time in CPython, only startup time.
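The string-versus-events question above can be illustrated with a toy fragment cache. This is a sketch only: the event tuples, `render` and `rename_tags` are hypothetical stand-ins, not Genshi's stream API.

```python
from xml.sax.saxutils import escape

# One fragment, cached as structured events (hypothetical representation).
fragment_events = [
    ("START", "li"), ("TEXT", "cached item"), ("END", "li"),
]

def render(events):
    """Serialize (kind, data) events to a string."""
    parts = []
    for kind, data in events:
        if kind == "START":
            parts.append("<%s>" % data)
        elif kind == "END":
            parts.append("</%s>" % data)
        else:
            parts.append(escape(data))
    return "".join(parts)

# The same fragment, cached as an already-rendered string.
fragment_string = render(fragment_events)

def rename_tags(events, old, new):
    """A stream filter: it needs structured events, not flat text."""
    for kind, data in events:
        if kind in ("START", "END") and data == old:
            data = new
        yield kind, data

# The event cache can still be passed through filters after caching.
print(render(rename_tags(fragment_events, "li", "dd")))
# The string cache can only be pasted into the output verbatim,
# unless it is parsed back into events first.
print(fragment_string)
```

Caching strings skips the serialization cost entirely, which is the attraction; caching events keeps filters and match templates working but saves less.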
At this point, I have to ask, how was it decided that the basic data
structure in Genshi was to be a stream? Event-based parsing is a more
suitable model for one-off, memory-efficient light processing. This is
well-known in the Java world. Genshi seems to be designed for some
pretty dynamic manipulation and therefore a tree is probably a far
more suitable model than a stream. Generators also seem to be a
terrible choice when you need to reconstitute a large number of
things into some other form, like a string or a list, which has to be
done in the end.
>>> import timeit
>>> timeit.timeit('list(i for i in xrange(10000))', number=10000)
7.5280599594116211
>>> timeit.timeit('list([i for i in xrange(10000)])', number=10000)
6.1257951259613037
>>> timeit.timeit('[i for i in xrange(10000)]', number=10000)
5.4970419406890869
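The transcript above is Python 2 (`xrange`). For anyone wanting to repeat the comparison on Python 3, a sketch along these lines should work; the repetition count is reduced from the original 10000 to keep the run short, and no particular timings are implied.

```python
import timeit

N = 10000      # items per run, as in the transcript above
REPEAT = 200   # fewer repetitions than the original, to keep the run short

statements = [
    "list(i for i in range(N))",    # generator expression consumed by list()
    "list([i for i in range(N)])",  # list comprehension, then copied by list()
    "[i for i in range(N)]",        # list comprehension only
]

for stmt in statements:
    seconds = timeit.timeit(stmt, globals={"N": N}, number=REPEAT)
    print("%-30s %.3fs" % (stmt, seconds))
```

Results will differ between interpreter versions, so the relative ordering seen in 2011 should not be assumed to hold today.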
When you chose stream, were you concerned about speed, memory usage or
text templates? I ask this because memory usage concern doesn't seem
to be the case either since there is a cache and a template loader.
I'd imagine most server-side usage of markup templates, or indeed even
text templates would require caching of the data structure from
parsing. However, text templates are really better stored as a
sequence of string fragments and some transformation functions that
take some Python data and output a string.
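The representation suggested above (string fragments plus functions that turn Python data into strings) could look roughly like this. It is a sketch with a made-up `${name}` placeholder syntax handled by a small regular expression; it is not how Genshi's TextTemplate is implemented, and `compile_text_template` and `render` are hypothetical names.

```python
import re

_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")

def compile_text_template(source):
    """Split the source into literal strings and lookup functions."""
    parts = []
    pos = 0
    for match in _PLACEHOLDER.finditer(source):
        if match.start() > pos:
            parts.append(source[pos:match.start()])  # literal text
        name = match.group(1)
        # Bind name as a default argument so each function keeps its own key.
        parts.append(lambda data, name=name: str(data[name]))
        pos = match.end()
    if pos < len(source):
        parts.append(source[pos:])
    return parts

def render(parts, data):
    """Join literal fragments with the results of the lookup functions."""
    return "".join(p if isinstance(p, str) else p(data) for p in parts)

compiled = compile_text_template("Dear ${user}, ticket #${id} was updated.\n")
print(render(compiled, {"user": "Alice", "id": 42}), end="")
```

The compiled list can be cached and rendered many times, with no event stream involved.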
I think by using a stream as the fundamental data structure, you end
up serving neither markup templates nor text templates right. Do
people use Genshi as their text template engine at all? Can we just
have a template engine that does one thing only and does it well? I
plan to experiment with a rewrite of Genshi in the coming couple of
months that preserves as much Genshi markup template functionality as
possible and is fast at the same time. I really like the idea of
Genshi and I'd hate to see it stagnate. Any ideas, opinions and
suggestions are welcome at this point.
> Ok I've done some thinking and a little experiments and I must say
> that cmlez is correct. If we could somehow make matching templates and
> includes static, that'll probably shave a significant portion of time
> used when serializing the output. I'm not sure about fragment caching
> tho. Are you going to cache the fragments as strings or stream events?
> If you cache them as strings, then all the XPath code, matching
> templates and whatnot are going to be completely broken. You can't
> transform the strings back to stream events in place as that'll
> probably be even slower then just storing as streams. Compiling to
> bytecodes is likely not useful also. Bytecodes don't speed up
> execution time in CPython, but only startup time.
> As this point, I have to ask, how was it decided that the basic data
> structure in Genshi was to be a stream? Event-based parsing is a more
> suitable model for one-off, memory-efficient light processing. This is
> well-known in the Java world. Genshi seems to be designed for some
> pretty dynamic manipulation and therefore a tree is probably a far
> more suitable model than a stream. Generators also seems to be a
> terrible choice for when you need to reconstitute the large number of
> things into some other form, like a string or a list, which has to be
> done in the end.
Not to mention that IIRC in a lot of places in the code, the stream is "linearized" with list(stream). Not using generators would also probably make it easier to "grasp" what the code is actually doing ;-) I would be very interested to see how a tree-based Genshi would behave. Another wild idea: with an lxml.etree backend, the xpath operations could be very fast...
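As a small taste of the tree-based model floated above, here is a sketch using the standard library's ElementTree, which understands only a subset of XPath. lxml.etree is not used here so the example stays dependency-free; it offers fuller XPath support through its own API. The document and the transformation are made up for illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<html><body>"
    "<div class='note'>first</div>"
    "<p>plain</p>"
    "<div class='note'>second</div>"
    "</body></html>"
)

# Select matching nodes with a path expression and modify them in place,
# roughly what a match template does to a stream.
for div in doc.findall(".//div[@class='note']"):
    div.set("class", "note highlighted")
    div.text = div.text.upper()

print(ET.tostring(doc, encoding="unicode"))
```

With a tree, the whole document is in memory and can be revisited freely, at the cost of the incremental processing that a stream allows.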
> >>> timeit.timeit('list(i for i in xrange(10000))', number=10000)
> 7.5280599594116211
> >>> timeit.timeit('list([i for i in xrange(10000)])', number=10000)
> 6.1257951259613037
> >>> timeit.timeit('[i for i in xrange(10000)]', number=10000)
> 5.4970419406890869
> When you chose stream, were you concerned about speed, memory usage or
> text templates? I ask this because memory usage concern doesn't seem
> to be the case either since there is a cache and a template loader.
> I'd imagine most server-side usage of markup templates, or indeed even
> text templates would require caching of the data structure from
> parsing. However, text templates are really better stored as a
> sequence of string fragments and some transformation functions that
> take some Python data and output a string.
When I first learned about Genshi and its generator based implementation, I thought it would be possible to stream the results, e.g. by using chunked encoding transfer. However, we never really tried to do it in Trac so I don't know whether this is actually possible or not. I also don't know if this was part of the initial motivation for using generators but it's a possibility.
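For context, this is what generator-based streaming looks like at the WSGI level in general. It is a sketch that does not involve Genshi or Trac: `render_rows` and `app` are hypothetical, and whether the response actually goes out with chunked transfer encoding depends on the server in front of the application.

```python
from html import escape

def render_rows(rows):
    """Yield the page piece by piece instead of building one big string."""
    yield b"<ul>"
    for row in rows:
        yield ("<li>%s</li>" % escape(row)).encode("utf-8")
    yield b"</ul>"

def app(environ, start_response):
    # No Content-Length header is set, so a server is free to send the
    # body incrementally (server-dependent behaviour).
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return render_rows(["one", "two", "three"])

def fake_start_response(status, headers):
    """Stand-in for what a WSGI server would pass to the application."""
    print(status)

# Drive the application by hand, consuming one chunk at a time.
for chunk in app({}, fake_start_response):
    print(chunk)
```

The benefit only materializes if nothing in the pipeline collects the whole output first, for example with `list(stream)` or a final string join.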
> I think by using a stream as the fundamental data structure, you end
> up neither serving markup templates nor text templates right. Do
> people use Genshi as their text template engine at all? Can we just
> have a template engine that just do one thing only and does it well? I
> plan to experiment with a rewrite of Genshi in the coming couple of
> months that preserve as much Genshi markup template functionality as
> possible and be fast at the same time. I really like the idea of
> Genshi and I'd hate to see it stagnate. Any ideas, opinions and
> suggestions is welcome at this point.
Trac uses text templates in some places. Actually we initially used "normal" xhtml templates to generate text, but while possible, this proved to be awkward, to the point that Christopher added TextTemplates. See http://trac.edgewall.org/changeset/3730
> I think by using a stream as the fundamental data structure, you end
> up neither serving markup templates nor text templates right. Do
> people use Genshi as their text template engine at all? Can we just
> have a template engine that just do one thing only and does it well? I
I use it for generating email message body both in text and html format.
On Sat, May 14, 2011 at 10:05 AM, Christian Boos <cb...@neuf.fr> wrote:
> Trac uses text templates in some places. Actually we initially used "normal"
> xhtml templates to generate text but while possible, this proved to be
> awkard to the point Christopher added TextTemplates. See
> http://trac.edgewall.org/changeset/3730
Greetings,
I haven't followed Genshi much lately but I saw this post and just wanted to add my two cents.
I like the text template feature as well. It's useful for generating CSS, JavaScript, and the like. I'd hate to have to use a separate template engine just for that, or use the aforementioned kludge of wrapping the text in XML. And at any rate, it's not the culprit of Genshi's performance issues and shouldn't be sacrificed.
Genshi is still far and away my favorite templating engine for Python. The performance issues are not normally a problem for me, but they're still a problem. I just hope it doesn't languish away completely.
On Fri, Jun 17, 2011 at 9:16 PM, Erik <hyugaricd...@gmail.com> wrote:
> I like the text template feature as well. It's useful for generating
> CSS, JavaScript, and the like. I'd hate to have to use a separate
> template engine just for that, or use the aforementioned kludge of
> wrapping the text in XML. And at any rate, it's not the culprit of
> Genshi's performance issues and shouldn't be sacrificed.
Just to reassure everyone, TextTemplates aren't going anywhere. Trac and various Trac plugins use them, and these form an important part of Genshi's user base.
At my own day job, Genshi TextTemplates are used more extensively than XML templates.
No worries, I'm just experimenting with my own reimplementation of
Genshi that suits my needs. Based on what I've tried so far, it seems
like it's going to be extremely difficult to reproduce the current
Genshi functionality with high fidelity using lxml. The problem is that
Genshi does a lot of things, like empty tag filtering, namespace
flattening, white space filtering and so forth, that don't seem to have
one-to-one equivalents in the API lxml exposes. I could be
wrong tho, so I can definitely use some help. At this point I really
don't have a whole lot of confidence in speeding up Genshi using lxml
from the way it is now. Using the libxml2 and libxslt Python bindings
might make it possible to keep this new Genshi backwards compatible, but it
sounds like a lot of work.
For those who worry about TextTemplates going away, don't. I'm just
a random dude dropping in looking for ideas. I'm not gonna fork a new
Genshi without TextTemplate... yet. I think it's more likely for me to
just give up and move on to pyTenjin than to fork Genshi.
Y.H. Wong
On Jun 18, 7:39 am, Simon Cross <hodges...@gmail.com> wrote:
> [snip]
It looks like a throwback to Kid. I would reeeeeally like to see any
Genshi derivative support filters. But then again, I think it's
too early to judge Kajiki now since it's still a very new project. I also
haven't seen any performance numbers and I'm too lazy to test it out
myself. i18n support is still not quite on par with Genshi. I think
Kajiki is off to a good start but I'd love to see API compatibility
with Genshi. It's definitely a project worth watching.
On Jun 21, 2:13 am, Nando Florestan <nandoflores...@gmail.com> wrote:
On the topic of improved performance: I know very little of Genshi's internals, so bear with me if what I've been thinking isn't possible with Genshi itself, but I have been wondering whether it would be possible to improve performance whilst retaining the XML parsing goodness in the following way.
A template system that truly understands XML is incredibly useful at development time, but not so important in production, where it causes performance issues. How hard would it be to have the template system parse XML at development time but, in production, use a simpler string replacement mechanism, by way of template compilation and a switchable 'development mode'?
Is it remotely doable, or a bit mental in the case of Genshi?
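To make the question concrete, here is a toy sketch of the idea, not a proposal for how Genshi itself could do it. Rendering is plain string substitution via the standard library's `string.Template`, and a hypothetical `dev_mode` flag adds an XML well-formedness check on the result. Filters, match templates and XPath are exactly what such a shortcut gives up.

```python
import string
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def render(template_text, data, dev_mode=False):
    """Render by string substitution; validate as XML only in dev mode."""
    # Escape values so substituted data cannot break the markup.
    safe = {key: escape(str(value)) for key, value in data.items()}
    output = string.Template(template_text).substitute(safe)
    if dev_mode:
        # Development only: fail loudly if the result is not well-formed XML.
        ET.fromstring(output)
    return output

good = "<p>Hello, $name!</p>"
bad = "<p>Hello, $name!<p>"   # broken end tag

print(render(good, {"name": "A & B"}, dev_mode=True))
print(render(bad, {"name": "A & B"}))   # production mode: error goes unnoticed
try:
    render(bad, {"name": "A & B"}, dev_mode=True)
except ET.ParseError as exc:
    print("caught in development:", exc)
```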
On 20 June 2011 19:13, Nando Florestan <nandoflores...@gmail.com> wrote:
> So how do you guys like the approach taken by Kajiki?
> -- > You received this message because you are subscribed to the Google Groups > "Genshi" group. > To post to this group, send email to genshi@googlegroups.com. > To unsubscribe from this group, send email to > genshi+unsubscribe@googlegroups.com. > For more options, visit this group at > http://groups.google.com/group/genshi?hl=en.
On Monday, June 27, 2011 at 11:55:38 AM, "Nicholas Dudfield" <ndudfi...@gmail.com> wrote:
>> The benefits of a template system that truly understands XML is
>> incredibly useful at development time,
Thanks Nicholas, I've seen that before. It looks like a nice workflow!
My question/point though, is whether it would be possible to build a version of Genshi that has a fast 'production mode', where the system is simplified and compiled for speed.
I think if it is possible, it would give the best of both worlds. Anyone?
On 27 June 2011 10:55, Nicholas Dudfield <ndudfi...@gmail.com> wrote:
On Monday, June 27, 2011, at 5:01:41 PM, "Joshua Rowley" <jos...@vannelluna.com> wrote:
> Thanks Nicholas, I've seen that before. It looks like a nice
> workflow!
> My question/point though, is whether it would be possible to build a
> version of Genshi that has a fast 'production mode', where the
> system is simplified and compiled for speed.
> I think if it is possible, it would give the best of both worlds.
> Anyone?
Indeed. It's worth looking into and I think some people have tried some such things (vagueness...) But if you build onto the genshi architecture significantly then your code all works around streams being processed through filters, which means it's not simple...