Cython usage within Django


Tom Forbes

May 21, 2017, 5:59:55 PM
to django-d...@googlegroups.com
Hello,
There was a very interesting talk at Pycon about using Cython to speed up hotspots in Python programs:

https://www.youtube.com/watch?v=_1MSX7V28Po

It got me wondering about possibly using Cython in selected places within Django. I realize that when Django was first released the distribution situation was a bit more wild-west, which is partly why Django does not rely on any third-party dependencies. But that situation is rapidly changing (see https://github.com/django/deps/blob/master/draft/0007-dependency-policy.rst#background-and-motivation), and with these changes could it also be time to investigate Cython usage for select parts of Django?

Several popular projects use Cython 'speedup' modules with pure-Python fallbacks with great success, for example aiohttp (https://github.com/aio-libs/aiohttp/blob/master/setup.py#L20). I did some quick and dirty profiling of the 'django.utils.html.escape' function and found that by simply including Cython as part of the build, with no syntax changes, the function executes twice as fast.
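The 'speedup with fallback' pattern can be sketched roughly as below. This is only an illustration: the module name `_html_speedups` is hypothetical (it exists in neither Django nor aiohttp), and the pure-Python `escape` is a simplified stand-in, not Django's actual implementation. `time_escape` is a hypothetical helper for a quick timing comparison.

```python
import timeit

try:
    # Hypothetical compiled extension; the name is an assumption for illustration.
    from _html_speedups import escape
    USING_SPEEDUPS = True
except ImportError:
    USING_SPEEDUPS = False

    def escape(text):
        """Simplified pure-Python fallback (not Django's actual code)."""
        return (
            str(text)
            .replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace('"', "&quot;")
            .replace("'", "&#39;")
        )


def time_escape(number=10000):
    """Return the seconds taken to escape a small sample string `number` times."""
    sample = "<a href='/x?y=1&z=2'>link</a>"
    return timeit.timeit(lambda: escape(sample), number=number)
```

Running `time_escape()` once with the compiled module importable and once without gives a rough before/after figure for this one function, nothing more.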

There are lots of considerations to take into account (like ensuring the Cython functions are in sync with the fallback ones), but it seems that it could make a big difference with small, self-contained functions (like html.escape or html.escapejs) that are executed frequently as part of a request. Other functions that might be worth looking at include http.multipartparser.parse_header or utils.baseconv.BaseConverter.convert.

My question is: is this something that's worth exploring, or is it outside of the realms of possibility?

Carl Meyer

May 21, 2017, 6:30:52 PM
to django-d...@googlegroups.com
On 05/21/2017 02:59 PM, Tom Forbes wrote:
> There are lots of considerations to take into account (like ensuring the
> Cython functions are in sync with the fallback ones)

It's possible to only have one version of the code, using only Python
syntax, and conditionally compile it with Cython if available. This
gives up some potential efficiency wins from type annotation, but avoids
the need to keep two copies in sync.
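A minimal sketch of that single-source approach, assuming a setuptools-based build. `optional_ext_modules` is a hypothetical helper name; `cythonize` is the real entry point in `Cython.Build`, and it accepts plain `.py` files as well as `.pyx`.

```python
def optional_ext_modules(patterns):
    """Return compiled extension modules if Cython is usable, else an empty list.

    An empty list means setuptools installs the plain .py files unchanged.
    """
    try:
        from Cython.Build import cythonize  # only present if Cython is installed
    except ImportError:
        return []
    try:
        return cythonize(patterns)
    except Exception:
        # Any Cython failure falls back to the pure-Python install.
        return []
```

In a `setup.py` the result would be passed as `ext_modules=optional_ext_modules([...])` with the paths of the modules to compile. This only covers the Python-to-C translation step; a missing C compiler shows up later, during the extension build, and would need its own handling.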

Regardless, though, I think CI would need to run the tests both with
Cython and with non-Cython fallback.
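One way to exercise both code paths is to run the same assertions against every implementation that can be imported. A rough sketch using `unittest`; the module name `_html_speedups` is hypothetical, and `py_escape` is a simplified stand-in rather than Django's code.

```python
import importlib
import unittest


def py_escape(text):
    """Simplified pure-Python reference implementation (stand-in, not Django's)."""
    return str(text).replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def available_implementations():
    """Collect the pure-Python version plus the compiled one, if importable."""
    impls = {"python": py_escape}
    try:
        # Hypothetical compiled module; the name is an assumption.
        compiled = importlib.import_module("_html_speedups")
    except ImportError:
        return impls
    impls["cython"] = compiled.escape
    return impls


class EscapeParityTests(unittest.TestCase):
    CASES = [("<b>", "&lt;b&gt;"), ("a & b", "a &amp; b"), ("plain", "plain")]

    def test_all_implementations_agree(self):
        # Every available implementation must produce the expected output.
        for name, func in available_implementations().items():
            for raw, expected in self.CASES:
                with self.subTest(implementation=name, raw=raw):
                    self.assertEqual(func(raw), expected)
```

A test like this only checks what happens to be installed, so CI would still need one job with the compiled module built and one without it.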

We've moved toward releasing wheels instead of sdist on PyPI for recent
versions; for this to be useful it would mean releasing multiple binary
wheels for different platforms.

There's no question this could make a big difference to Django CPU
usage; the question is whether it's worth the added CI and release
complexity when it would likely provide little value to the majority of
Django users.

Carl

Russell Keith-Magee

May 21, 2017, 6:36:29 PM
to django-d...@googlegroups.com
Hi Tom,

My immediate reaction is No, for three reasons:

1. My experience has been that Cython isn’t especially stable.  Admittedly, I haven’t looked at it for a couple of years, but when I did, I ended up getting caught in some really nasty bugs that came back and forth between micro versions.

2. Even if Cython *was* stable: The execution speed of your Django stack is almost certainly *not* the bottleneck of your application. Query time, database transfer time, and just basic client-end connection latency will, for most applications, be a *much* bigger performance problem than the execution time of the Python stack.

3. Even if the Django code in your app *was* your bottleneck, switching to PyPy as your interpreter will almost certainly give you better performance for less engineering effort. 

If you want to do some experimentation, by all means go right ahead; however, I would caution you that any patch you produce will need to demonstrate a *significant* improvement in real-world use cases for us to adopt the engineering overhead of integrating Cython into Django’s runtime environment.

Yours
Russ Magee %-)
--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAFNZOJOsAdL422Ntj4cUkYF1bjqUBdMAXp33xZ%3DapSwqXMasvA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Tom Forbes

May 21, 2017, 7:30:20 PM
to django-d...@googlegroups.com
Hey Carl and Russell,
Thank you for your quick replies!

Carl:
Agreed, duplicate CI runs would have to be performed (which would double the time or double the number of runners required). As I understand it, the Django project itself would not distribute pre-compiled wheels; the setup.py Cython 'stuff' would handle this (and fail gracefully if anything goes wrong, such as no C compiler being available). The type annotations sound interesting and would alleviate a fair bit of engineering effort; keeping two duplicate copies in sync sounded horrible.

Russell:
The point release issues certainly sound troubling and almost make me want to give up based on that alone. Can you elaborate on them at all? Were you doing anything particularly crazy or complex with your cythonified code?

You are of course right that the speed of the Python code itself is rarely the bottleneck, and that the usual suspects like the database are more important to optimize. However, I don't think it's necessarily always a waste of time to explore optimizing some Python/Django functions, if that only means moving them to a separate file and running Cython on them to get a speed boost.

That's the dream at least, but it's rarely that simple in practice. After perusing the Django code for functions that look like they could benefit from Cython, it seems a lot are tightly coupled and could not be extracted without a bit of effort. Plus the engineering/CI/release overhead would be considerable.

So perhaps this is just a pipe dream and not worth the effort. Thanks for replying anyway!

Tom

Curtis Maloney

May 21, 2017, 11:47:49 PM
to django-d...@googlegroups.com
On 22/05/17 09:30, Tom Forbes wrote:
> Hey Carl and Russell,
> Thank you for your quick replies!
>
> Russell:
>
> You are of course right that the speed of the Python code itself is
> rarely the bottleneck, and that the usual suspects like the database are
> more important to optimize. However, I don't think it's necessarily
> always a waste of time to explore optimizing some Python/Django
> functions, if that only means moving them to a separate file and
> running Cython on them to get a speed boost.

Of course, I'm sure Russell won't object to being shown to be wrong, if you
feel like giving it a go anyway :)

--
Curtis

Adam Johnson

May 22, 2017, 3:02:26 AM
to django-d...@googlegroups.com
FYI, a colleague and I tried Cythonizing parts of django.template in 2015 and shared our findings on this list: https://groups.google.com/forum/#!topic/django-developers/CKcZwC3J1eQ . We didn't build it into Django core or a fork; our package replaced parts of Django as they were imported, using an import hook.

TL;DR: it worked, and we could measure the improvement locally when benchmarking a single function, but it didn't move the needle measurably in production, though we expected it would.
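The gap between a single-function benchmark and production is roughly what Amdahl's law predicts. A back-of-the-envelope sketch follows; the 2% share is an invented figure for illustration, not a measurement from any real deployment, and `overall_speedup` is a hypothetical helper name.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-request speedup when only `fraction` of the time gets faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Hypothetical: a function taking 2% of request time is made 2x faster.
print(round(overall_speedup(0.02, 2.0), 4))  # 1.0101
```

Under that assumption, doubling the speed of the function makes the whole request about 1% faster, which is easily lost in production noise.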

--
Adam

Florian Apolloner

May 22, 2017, 9:37:09 AM
to Django developers (Contributions to Django itself)


On Monday, May 22, 2017 at 1:30:20 AM UTC+2, Tom Forbes wrote:
> As I understand it, the Django project itself would not distribute pre-compiled wheels; the setup.py Cython 'stuff' would handle this (and fail gracefully if anything goes wrong, such as no C compiler being available).

Since most pip installs should be wheels nowadays, setup.py is not invoked on the target computer; hence, as Carl said, we'd need to provide manylinux wheels.
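To make the distinction concrete: a wheel's filename ends in Python, ABI, and platform tags. A pure-Python wheel carries `none-any`, while a compiled one is tied to a specific ABI and platform. A small sketch that reads those tags; the helper names and both example filenames are made up for illustration.

```python
def wheel_tags(filename):
    """Split a wheel filename into its (python, abi, platform) tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    # The last three dash-separated fields are the compatibility tags.
    python_tag, abi_tag, platform_tag = filename[:-4].split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def is_pure_python(filename):
    """A wheel is platform-independent when its ABI is 'none' and platform is 'any'."""
    _, abi_tag, platform_tag = wheel_tags(filename)
    return abi_tag == "none" and platform_tag == "any"


print(is_pure_python("example-1.0-py2.py3-none-any.whl"))  # True
print(is_pure_python("example-1.0-cp36-cp36m-manylinux1_x86_64.whl"))  # False
```

A project shipping compiled speedups would need one wheel of the second kind for each supported Python version and platform, plus a fallback for everything else.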

ijazz jazz

Jun 12, 2018, 9:35:22 AM
to Django developers (Contributions to Django itself)
Cython is a superset of Python that lets you significantly improve the speed. The easiest way to use Cython is to use the special pyximport feature. You need to put the code to cythonize in its own module, write one line
