Non-compiled: 0.0679 sec
Precompiled: 0.0052 sec
BytecodeCache: 0.0018 sec
While precompiled templates are 13 times faster than non-compiled
ones, they are still about 3 times slower than the bytecode cache
(take these numbers as illustrations; the benchmark was not
scientific). I guess this is due to the exec used to run the
generated code.
So I experimented with loading the precompiled templates as modules,
and (surprise!) they were consistently 3 times faster than the
bytecode cache:
templates as modules: 0.0006 sec
This is blazing fast. However, I had to hack how Jinja2 generates the
Python code so that a precompiled template could be imported. The
generated functions expect environment to be present in the module,
and environment is defined in the namespace during exec in
Template.from_code(). Here my lack of Python knowledge shows: I could
not find a way to import the generated code while inserting the
required environment variable (I asked in #python, and they said it is
not possible :P). So I added a "lazy" environment variable to the
generated code (hacking jinja2.compiler).
So, what do you think about wrapping the generated code in a function,
and calling that function with environment and whatever other
parameters are needed in Template.from_code()? Or some other method
to make the environment variable 'injectable' without exec (allowing
the code to be imported)? Do you think 'templates as modules' makes
sense and can be supported?
I hope this makes sense and I am not missing anything. :-)
regards,
-- rodrigo
http://paste.pocoo.org/show/124381/
The generated code is wrapped in a run(environment, template)
function, which is called in Template.from_code() to get the template
variables. The generated code is now importable, and we can load a
template "as module". No compile(), no exec.
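To illustrate the difference, here is a minimal sketch in plain Python. It is not the actual patch: the two source strings, FakeEnvironment, and the keys of the returned dictionary are hypothetical stand-ins for what jinja2.compiler emits. It assumes the generated code binds environment when the function is defined (modelled here as a default argument), which is why the first style only works through exec with a prepared namespace. Running the source with an empty namespace stands in for a plain import.

```python
# Style A (assumed shape): `environment` must already exist in the
# namespace when the function is *defined*, so a plain import fails.
GLOBAL_STYLE_SOURCE = '''
def root(context, environment=environment):
    yield environment.greeting + ", " + context["name"]
'''

# Style B: everything lives inside run(), and environment arrives as an
# ordinary argument, so the module body needs nothing injected.
WRAPPED_STYLE_SOURCE = '''
def run(environment, template):
    def root(context):
        yield environment.greeting + ", " + context["name"]
    return {"name": template, "root": root}
'''


class FakeEnvironment:
    """Hypothetical stand-in for a jinja2 Environment."""
    greeting = "Hello"


def load_with_injected_namespace(source, environment):
    # What the exec-based path does: pre-populate the namespace.
    namespace = {"environment": environment}
    exec(compile(source, "<template>", "exec"), namespace)
    return namespace["root"]


def define_without_environment(source):
    # Stand-in for "import": run the module body with nothing injected.
    namespace = {}
    exec(compile(source, "<template>", "exec"), namespace)
    return namespace


env = FakeEnvironment()

# Style A works only when environment is injected before execution.
root_a = load_with_injected_namespace(GLOBAL_STYLE_SOURCE, env)
rendered_a = "".join(root_a({"name": "world"}))

# Style A cannot be defined (i.e. imported) without it.
try:
    define_without_environment(GLOBAL_STYLE_SOURCE)
    import_failed = False
except NameError:
    import_failed = True

# Style B defines cleanly; environment is supplied later via run().
module_like = define_without_environment(WRAPPED_STYLE_SOURCE)
variables = module_like["run"](env, "hello.html")
rendered_b = "".join(variables["root"]({"name": "world"}))

print(rendered_a, import_failed, rendered_b)
```

Both styles render the same text; only the wrapped one can be defined without a pre-populated namespace.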
Let me know what you think.
-- rodrigo
Some notes about the previous proof of concept:
1) the existing cache becomes obsolete (cached bytecode needs to be
cleared).
2) it doesn't affect the performance of normal loading or bytecode
cache loading.
3) a new loading method can be used - precompiled code as modules
(a quick and dirty ModuleLoader example is here [1]).
4) loading from modules is several times faster than using the
bytecode cache (from 3 to 5 times in the templates I tested). This
will shine in App Engine and kick a** of everything else, but it can
also be used outside of App Engine by those who want extra
performance at the cost of precompiling the templates.
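For readers without access to the paste, point 3 can be sketched roughly as follows. This is not the ModuleLoader from [1]: SketchModuleLoader, its method names, the "tmpl_" module naming scheme, and the dict used as environment are all hypothetical, and it uses today's importlib rather than what was available at the time. It only shows the mechanism: write the generated source once as a .py file, then load it with a normal import and call run().

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical generated code in the run(environment, template) shape.
PRECOMPILED_SOURCE = '''
def run(environment, template):
    def root(context):
        yield environment["greeting"] + ", " + context["name"]
    return {"name": template, "root": root}
'''


class SketchModuleLoader:
    """Hypothetical loader mapping template names to importable modules."""

    def __init__(self, package_dir):
        self.package_dir = str(package_dir)
        # Make the directory of precompiled templates importable.
        if self.package_dir not in sys.path:
            sys.path.insert(0, self.package_dir)

    @staticmethod
    def module_name(template_name):
        # "hello.html" becomes "tmpl_hello_html", a valid module name.
        safe = "".join(c if c.isalnum() else "_" for c in template_name)
        return "tmpl_" + safe

    def precompile(self, template_name, generated_source):
        # Done once, ahead of time (e.g. before deployment).
        filename = self.module_name(template_name) + ".py"
        (Path(self.package_dir) / filename).write_text(generated_source)
        importlib.invalidate_caches()

    def load(self, environment, template_name):
        # A plain import: no compile() and no exec in our own code.
        module = importlib.import_module(self.module_name(template_name))
        return module.run(environment, template_name)


with tempfile.TemporaryDirectory() as tmp:
    loader = SketchModuleLoader(tmp)
    loader.precompile("hello.html", PRECOMPILED_SOURCE)
    variables = loader.load({"greeting": "Hello"}, "hello.html")
    rendered = "".join(variables["root"]({"name": "world"}))

print(rendered)
```

After the first import the module stays in sys.modules, so repeated loads of the same template only pay for a dictionary lookup plus the run() call.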
Ok, now I'm definitely flooding pocoo-libs. :-/ Sorry about this. :-)
-- rodrigo