django template modules compiled with cython


Alexandru Damian

Dec 9, 2015, 9:25:19 AM
to Django developers (Contributions to Django itself)
Hi,

I've compiled the django.template.{base,context,context_processor} modules with cython in order to speed up template rendering.

I've come to the conclusion that this is needed after profiling long page loads that had the data from the database returned in 10-50 ms,
but the page would take seconds to complete - especially admin change views with lots of inlines.

I am seeing a 13% decrease in page rendering time across the board, obviously passing all django tests.

To make this easy to deploy I've compiled it as a drop-in replacement (targeting django 1.8.7) that uses import hooks (PEP 302)
to replace targeted modules on the fly.
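
For readers unfamiliar with the import-hook approach, here is a minimal sketch of redirecting an import to a replacement module through a meta path finder (the current form of the PEP 302 hooks). It is not taken from django-cemplate; `AliasLoader`, `ReplacementFinder`, `install`, and the module names in the usage note are all hypothetical.

```python
import importlib
import importlib.abc
import importlib.util
import sys


class AliasLoader(importlib.abc.Loader):
    """Loader that makes one module name resolve to another module."""

    def __init__(self, replacement_name):
        self.replacement_name = replacement_name

    def exec_module(self, module):
        # Import the replacement under its own name, then publish it under
        # the requested name. The import system returns whatever object is
        # in sys.modules once loading has finished.
        replacement = importlib.import_module(self.replacement_name)
        sys.modules[module.__name__] = replacement


class ReplacementFinder(importlib.abc.MetaPathFinder):
    """Meta path finder that redirects selected imports (hypothetical helper)."""

    def __init__(self, replacements):
        # Maps the targeted module name to the name of its replacement.
        self.replacements = dict(replacements)

    def find_spec(self, fullname, path=None, target=None):
        replacement_name = self.replacements.get(fullname)
        if replacement_name is None:
            return None  # not a targeted module; let normal import continue
        return importlib.util.spec_from_loader(
            fullname, AliasLoader(replacement_name)
        )


def install(replacements):
    """Put the finder first on sys.meta_path so it wins over normal lookup."""
    finder = ReplacementFinder(replacements)
    sys.meta_path.insert(0, finder)
    return finder
```

With this sketch, `install({"slow_impl": "fast_impl"})` would make a later `import slow_impl` return the `fast_impl` module. The hook only takes effect if it is installed before the targeted module is first imported, since modules already in `sys.modules` are never looked up again.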

The sources/releases are here: https://github.com/ddalex/django-cemplate

Can you please advise on how to continue this work ?
I feel that maintaining separate trees for each Django release is not scalable on my part.

Cheers,
Alex

Russell Keith-Magee

Dec 9, 2015, 7:30:03 PM
to Django Developers
Hi Alex,

On Wed, Dec 9, 2015 at 9:57 PM, Alexandru Damian <dda...@gmail.com> wrote:
> Hi,
>
> I've compiled the django.template.{base,context,context_processor} modules with cython in order to speed up template rendering.
>
> I've come to the conclusion that this is needed after profiling long page loads that had the data from the database returned in 10-50 ms,
> but the page would take seconds to complete - especially admin change views with lots of inlines.

If you’re seeing admin pages taking *seconds* to render, and the database is truly taking 10-50ms, there’s something going *very* wrong here. Can you share your profile data?
 
> I am seeing a 13% decrease in page rendering time across the board, obviously passing all django tests.

Have you tried running PyPy instead? The reports I’ve seen suggest you’ll get a *lot* more than a 13% speedup.

> To make this easy to deploy I've compiled it as a drop-in replacement (targeting django 1.8.7) that uses import hooks (PEP 302)
> to replace targeted modules on the fly.
>
> The sources/releases are here: https://github.com/ddalex/django-cemplate
>
> Can you please advise on how to continue this work ?
> I feel that maintaining separate trees for each Django release is not scalable on my part.
 
It’s difficult to say without knowing what you’re proposing. Is there some change you’d like to see merged into Django? Some API hook that you’d like to add to make your life easier?

Yours,
Russ Magee %-)

Alexandru Damian

Dec 10, 2015, 6:52:46 AM
to Django developers (Contributions to Django itself)
Hi Russ,

Thanks for coming back to me so quickly.


On Thursday, December 10, 2015 at 12:30:03 AM UTC, Russell Keith-Magee wrote:
> Hi Alex,
>
> On Wed, Dec 9, 2015 at 9:57 PM, Alexandru Damian <dda...@gmail.com> wrote:
>> Hi,
>>
>> I've compiled the django.template.{base,context,context_processor} modules with cython in order to speed up template rendering.
>>
>> I've come to the conclusion that this is needed after profiling long page loads that had the data from the database returned in 10-50 ms,
>> but the page would take seconds to complete - especially admin change views with lots of inlines.
>
> If you’re seeing admin pages taking *seconds* to render, and the database is truly taking 10-50ms, there’s something going *very* wrong here. Can you share your profile data?

I cProfile-instrumented def get_response() in django/core/handlers/base.py  without (vanilla django) and with cemplate, posted the results here:

https://github.com/ddalex/django-cemplate/tree/master/profile_data

Both profiles are with all caches primed. Please compare the time spent in django/template/base.py in vanilla (0.773 seconds) and cemplate (0.288 seconds).
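
As an illustration of this kind of instrumentation, here is a minimal sketch of wrapping a callable with cProfile from the standard library. It is not the exact instrumentation used for the profiles linked above; the `profiled` helper and its parameters are hypothetical.

```python
import cProfile
import functools
import io
import pstats


def profiled(func, stream, limit=10):
    """Wrap `func` so every call is profiled and a report is written to `stream`."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        profiler = cProfile.Profile()
        try:
            # runcall() profiles exactly this one call and returns its result.
            return profiler.runcall(func, *args, **kwargs)
        finally:
            stats = pstats.Stats(profiler, stream=stream)
            # Sort by cumulative time so the most expensive call paths come first.
            stats.sort_stats("cumulative").print_stats(limit)

    return wrapper


def render_page(n):
    # Stand-in workload for the example; a real target would be a view or handler.
    return sum(i * i for i in range(n))


report = io.StringIO()
profiled_render = profiled(render_page, report)
result = profiled_render(10)
```

After the call, `report.getvalue()` holds the pstats table, which can be compared between a vanilla run and an accelerated one.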
 
>> I am seeing a 13% decrease in page rendering time across the board, obviously passing all django tests.
>
> Have you tried running PyPy instead? The reports I’ve seen suggest you’ll get a *lot* more than a 13% speedup.

Looked at pypy first; unfortunately, not all the modules we are using are pypy-compatible.

>> To make this easy to deploy I've compiled it as a drop-in replacement (targeting django 1.8.7) that uses import hooks (PEP 302)
>> to replace targeted modules on the fly.
>>
>> The sources/releases are here: https://github.com/ddalex/django-cemplate
>>
>> Can you please advise on how to continue this work ?
>> I feel that maintaining separate trees for each Django release is not scalable on my part.
>
> It’s difficult to say without knowing what you’re proposing. Is there some change you’d like to see merged into Django? Some API hook that you’d like to add to make your life easier?


I am thinking of cythonizing parts of django (e.g. the modules that I am modifying here) - this will bring .c compilation on install, but will make code faster in the framework, where it is likely that nobody is modifying it. It brings in an extra dependency on cython for developers. What is your take on this ?

> Yours,
> Russ Magee %-)

Cheers,
Alex

Florian Apolloner

Dec 10, 2015, 1:14:46 PM
to Django developers (Contributions to Django itself)


On Thursday, December 10, 2015 at 12:52:46 PM UTC+1, Alexandru Damian wrote:
> I am thinking of cythonizing parts of django (e.g. the modules that I am modifying here) - this will bring .c compilation on install, but will make code faster in the framework, where it is likely that nobody is modifying it. It brings in an extra dependency on cython for developers. What is your take on this ?

As long as it is optional; getting Cython running on Windows is most likely not the easiest thing to do ;)

Cheers,
Florian

Tim Graham

Dec 10, 2015, 3:44:55 PM
to Django developers (Contributions to Django itself)
Sorry for the ignorance, but I have little idea what cython is and what incorporating this into Django involves. If you could give a little background information that would be appreciated.

Russell Keith-Magee

Dec 10, 2015, 6:56:38 PM
to Django Developers
If you’re proposing that Django use Cython by default, I have to say that I’m not especially enthusiastic about the idea. I’m not a huge fan of Cython - I’ve found it to be a bunch of headaches every time I’ve tried to use it, and based on the data you’re presenting, a 13% performance boost isn’t enough to get me excited.

If you’re proposing a mechanism by which someone who wants to use Cython can opt-in - I might be more inclined; but I’d need to see what the maintenance overhead would be for the core team.

I’d be a *lot* more interested in providing some sort of hook so that Cython support can be maintained externally, rather than by the core team. 
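
One common shape for an opt-in accelerator, shown here only as a sketch and not as anything Django provides, is to try importing a compiled module and fall back to the pure-Python one when it is missing. The helper name `import_with_fallback` and the module names in the usage note are hypothetical.

```python
import importlib


def import_with_fallback(accelerated_name, pure_python_name):
    """Return the accelerated module if it is installed, else the pure-Python one."""
    try:
        return importlib.import_module(accelerated_name)
    except ImportError:
        # The compiled extension is absent (for example, no compiler was
        # available at install time), so use the portable implementation.
        return importlib.import_module(pure_python_name)
```

A package could then write something like `engine = import_with_fallback("myproject._fast_engine", "myproject.engine")`, so that installations without the compiled extension keep working unchanged.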

Yours,
Russ Magee %-)

Shai Berger

Dec 10, 2015, 7:18:52 PM
to django-d...@googlegroups.com
On Thursday 10 December 2015 22:44:54 Tim Graham wrote:
> Sorry for the ignorance, but I have little idea what cython is and what
> incorporating this into Django involves. If you could give a little
> background information that would be appreciated.
>
Cython (http://cython.org/) is a system for writing C extensions for Python.
It defines a language which is mostly a superset of Python and provides a
compiler which translates code from this language to C, such that modules
defined in Cython become loadable extension modules.

The Cython language includes, besides Python constructs, syntax which
translates more directly to C, allowing Cython code to easily use existing C
libraries; since it produces extension modules, the same code is also easy to
combine with pure Python.

The home page says "All of this makes Cython the ideal language for wrapping
external C libraries, embedding CPython into existing applications, and for
fast C modules that speed up the execution of Python code". As you may
imagine, the picture is not all rosy -- you do get complicated builds and
cryptic error messages and subtle interactions and all that jazz, but in my
experience it mostly stands up to that promise.

(this doesn't mean it's a perfect match for Django, of course)

Shai.

Florian Apolloner

Dec 11, 2015, 4:11:55 AM
to Django developers (Contributions to Django itself)
Hi Alexandru,


On Thursday, December 10, 2015 at 12:52:46 PM UTC+1, Alexandru Damian wrote:
> I cProfile-instrumented def get_response() in django/core/handlers/base.py without (vanilla django) and with cemplate, posted the results here:

The results are odd, looks as if you are running in a "test"-like environment. Please make sure you run them with DEBUG=False and without any test instrumentation to mimic a production environment.

> Both profiles are with all caches primed. Please compare time spent in django/template/base.py in vanilla (0.773 seconds) and cemplate (0.288 seconds).

Yeah, that is roughly a 100% speedup in template rendering (does one say 100% if twice as fast?!)
 
>>> Can you please advise on how to continue this work ?
>>> I feel that maintaining separate trees for each Django release is not scalable on my part.
>>
>> It’s difficult to say without knowing what you’re proposing. Is there some change you’d like to see merged into Django? Some API hook that you’d like to add to make your life easier?

I am not really convinced that replacing the whole file is a good idea. In my experience one gets better results when using Cython by strategically replacing single functions and rewriting those in C directly. A factor of two is all nice and well, but if this is still just 10% in the overall response there might be other (better) optimizations out there.
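
The arithmetic behind these figures can be worked through with a short sketch. The two timings are the ones quoted above; the 20% share of total response time used at the end is an assumed value for illustration only, not a number taken from the profiles.

```python
def speedup_factor(before, after):
    """How many times faster the new timing is."""
    return before / after


def time_reduction(before, after):
    """Fraction of the original time that was saved."""
    return 1 - after / before


def overall_reduction(fraction, factor):
    """Amdahl-style estimate: share of the total time saved when `fraction`
    of the total runs `factor` times faster and the rest is unchanged."""
    return fraction * (1 - 1 / factor)


vanilla, cemplate = 0.773, 0.288  # seconds in django/template/base.py

factor = speedup_factor(vanilla, cemplate)  # about 2.68 times faster
saved = time_reduction(vanilla, cemplate)   # about 63% less time in that module

# Assumed for illustration: the module accounts for 20% of the whole response.
whole_page = overall_reduction(0.20, factor)  # about 12.5% off the whole response
```

Under that assumption, even a speedup of more than 2.5 times in one module removes only around an eighth of the total response time, which is the point being made about looking for other optimizations.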

Cheers,
Florian

Alexandru Damian

Dec 14, 2015, 9:53:50 AM
to Django developers (Contributions to Django itself)


On Friday, December 11, 2015 at 9:11:55 AM UTC, Florian Apolloner wrote:
> Hi Alexandru,
>
> On Thursday, December 10, 2015 at 12:52:46 PM UTC+1, Alexandru Damian wrote:
>> I cProfile-instrumented def get_response() in django/core/handlers/base.py without (vanilla django) and with cemplate, posted the results here:
>
> The results are odd, looks as if you are running in a "test"-like environment. Please make sure you run them with DEBUG=False and without any test instrumentation to mimic a production environment.

I've switched to using runprofileserver from django-extensions; the uploaded files are in kcachegrind format - these are profiled with all instrumentation off, and DEBUG set to False.

>> Both profiles are with all caches primed. Please compare time spent in django/template/base.py in vanilla (0.773 seconds) and cemplate (0.288 seconds).
>
> Yeah, that is roughly a 100% speedup in template rendering (does one say 100% if twice as fast?!)
>
>>>> Can you please advise on how to continue this work ?
>>>> I feel that maintaining separate trees for each Django release is not scalable on my part.
>>>
>>> It’s difficult to say without knowing what you’re proposing. Is there some change you’d like to see merged into Django? Some API hook that you’d like to add to make your life easier?
>
> I am not really convinced that replacing the whole file is a good idea. In my experience one gets better results when using Cython by strategically replacing single functions and rewriting those in C directly.

This is the actual approach I am taking, but at class level. I selectively choose the base classes and convert those to Cython language; the modules are packaged in as a whole to make packaging easier.

I am not sure how one would go about replacing just certain classes at runtime in Django, if that's your suggestion. Any suggestions ?
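
One possible way to swap individual classes at runtime, offered only as a sketch and not necessarily what was being suggested, is to rebind the names on the already-imported module. The helper `replace_attributes` and the names in the usage note are hypothetical.

```python
def replace_attributes(module, replacements):
    """Rebind selected names on `module` and return the originals for undo."""
    originals = {}
    for name, new_object in replacements.items():
        # getattr raises AttributeError if the name does not exist, which
        # guards against silently patching a misspelled name.
        originals[name] = getattr(module, name)
        setattr(module, name, new_object)
    return originals
```

For example, `replace_attributes(some_module, {"Node": FastNode})` would make later lookups of `some_module.Node` return `FastNode`. Code that ran `from some_module import Node` before the patch keeps the old class, and subclasses defined earlier still inherit from it, so this only works reliably if it is applied very early at startup.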

 
> A factor of two is all nice and well, but if this is still just 10% in the overall response there might be other (better) optimizations out there.

I'm still trying to find low-hanging fruit here. I have profiled the no-data pages at 300 ms, which is still too high for my target (rendering pages in under 200 ms). As expected, most of the time is spent in django.template.*, but it is difficult to pinpoint what exactly is going wrong.


> Cheers,
> Florian

Florian Apolloner

Dec 15, 2015, 3:51:58 AM
to Django developers (Contributions to Django itself)


On Monday, December 14, 2015 at 3:53:50 PM UTC+1, Alexandru Damian wrote:
>> I am not really convinced that replacing the whole file is a good idea. In my experience one gets better results when using Cython by strategically replacing single functions and rewriting those in C directly.
>
> This is the actual approach I am taking, but at class level. I selectively choose the base classes and convert those to Cython language; the modules are packaged in as a whole to make packaging easier.

Well, defaulttags.pyx and base.pyx seem to indicate to me that the whole files are compiled? I understand that this is most likely due to cdef-ing some base classes -- therefore you need a C context. But that is what I meant by replacing the whole files.

Cheers,
Florian

Adam Johnson

Feb 5, 2016, 11:23:59 AM
to Django developers (Contributions to Django itself)
Hi guys,

I work with Alex here at YPlan. We deployed a tidied-up, updated version of Alex's code as django-speedboost, since it looked promising in local profiling. You can see the code here: https://github.com/YPlan/django-speedboost . It uses a Cythonized version of Django 1.8.8's template engine, and passes Django 1.8.8's test suite. This is also compatible with 1.8.9, since there were no template changes, but not 1.9.x (as far as we know!).

However, we recently took it out of production. The speed boost we could measure locally with profiling, about 10% of whole-page time, didn't seem to translate into a speed boost in production. We don't really know why, but since it's quite hackish and requires maintenance, we've stopped using it. Just wanted to let everyone know where we got to.

Thanks,

Adam

Cristiano Coelho

Feb 5, 2016, 7:28:16 PM
to Django developers (Contributions to Django itself)
Hi Adam,

I'm sorry it didn't work out after all. Let me tell you that Python itself is quite slow, and the template engine adds a good deal of overhead as well, making this slowness even worse; it is especially noticeable in big or nested loops. Also, a 10% improvement seems too low to matter.

I think that if you really want a performance boost you have a few options (all quite bad though):

- Use PyPy rather than python; this one is quite complicated if you have cloud deployments or lots of 3rd-party libraries.
- Use Jinja2 template engine which is said to be faster
- Rewrite the template engine, or parts of it, in actual C code as a C extension (this is very complicated)
- Stop using templates completely (only keep it for a start page) and move to client side templating such as AngularJS (the web is moving towards this with client side + REST Api, this is a highly performant and scalable option compared to server side templating)
- There's a new Microsoft project called Pyjion, which says they have implemented a JIT API for CPython, adding JIT support to python while still being 100% compatible with it, this is quite different from PyPy approach. I believe this is quite alpha yet but looks promising: https://github.com/Microsoft/Pyjion

Adam Johnson

Feb 8, 2016, 10:40:38 AM
to Django developers (Contributions to Django itself)
Hey Cristiano

Yeah it's a shame it didn't work out.

In our case, we would love to try Jinja2, but we can't, because the slow pages are in the Django Admin rather than our own code :(

For our main site we do use javascript templates - clientside as you say, except we call out to a node.js server to render the first page view "isomorphically" so Google likes us.

Pyjion does look interesting, thanks for the link.