Tornado performance decay


Jacob Kristhammar

Dec 10, 2010, 8:09:13 AM
to Tornado Web Server
While performance testing a project that's based on Tornado we've
started to notice a steady decline in performance.
This led us to take a closer look at some of the components in our
system.

As it turns out, the Tornado performance has effectively been cut in
half since July this year. The performance metric we refer to is the
simple hello world test on the Tornado documentation site [1].

In order to pinpoint what's been going on in Tornado over the last six
months we wrote a simple script that made it easier to see when things
started going south (you can find the scripts in [2]).

The script simply checks out each commit in turn and runs the basic
ab test.
We had one dedicated machine[3] hosting Tornado and another dedicated
machine[4] doing the bombing.
This is repeated five times (raw test results [5]).
Another script[6] processes the output data and calculates the mean
and standard deviation of all the samples.
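
As a rough illustration of that processing step (this is not the script in [6]), the sketch below pulls the "Requests per second" figure out of each ab report and summarizes the runs for one commit. The function names and the sample figures are made up for the example, and the use of the sample (rather than population) standard deviation is an assumption.

```python
import re
import statistics

# ab prints a line such as: "Requests per second:    3021.45 [#/sec] (mean)"
RPS_PATTERN = re.compile(r"^Requests per second:\s+([0-9.]+)", re.MULTILINE)


def requests_per_second(ab_output):
    """Return the requests/sec figure from one ab report, or None if absent."""
    match = RPS_PATTERN.search(ab_output)
    return float(match.group(1)) if match else None


def summarize(samples):
    """Return (mean, sample standard deviation) for one commit's runs."""
    mean = statistics.mean(samples)
    # stdev needs at least two samples; report 0.0 for a single run.
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return mean, stdev


# Hypothetical figures for five runs against a single commit.
reports = [
    "Requests per second:    %.2f [#/sec] (mean)\n" % rps
    for rps in (3000.0, 3100.0, 2900.0, 3050.0, 2950.0)
]
samples = [requests_per_second(report) for report in reports]
mean, stdev = summarize(samples)
print("mean=%.1f stdev=%.1f" % (mean, stdev))
```

Keeping the standard deviation next to the mean makes it easier to tell a real regression between two commits from run-to-run noise.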

We imported the data into a Google spreadsheet and plotted how
performance has changed over time [7].
The document has the following sheets:
1. "Annotated trend": we've marked the commits that seem to have the
biggest impact on performance.
2. "Annotated data": the data used to create sheet one.
3. "Raw data": the raw data imported after processing with [6].
4. "Raw trend": a plot of the performance over time with all commits
annotated (pretty messy).

As stated on the Tornado web page, "Not very scientific, but at a high
level, it should give you a sense" of how Tornado is evolving in terms
of performance.

We're not exactly sure what to do with this data, but it's an
interesting observation we wanted to share with the community.

The following points are at least worth mentioning:
1. The performance number in [1] is no longer true. (We haven't
compared Tornado against Django or web.py though)
2. A lot of the performance hits seem to be a result of the
introduction of StackContext (maybe a fair tradeoff?)
3. Many selected Tornado because of its simplicity and performance,
which seem to be the things in decay.
4. We're no longer sure if we should invest more time in Tornado
(unless this trend changes) and may start to look at other options.

In general...
What are your thoughts about the future of Tornado?
What's changed since you first open sourced Tornado? E.g. in this tech
talk [8] (at 3:20) Bret Taylor says:
"There's a lot of intention to not adding things that would hurt
performance, so if something's useful but slow, we won't add it."

-- Jacob Kristhammar / Rickard Böttcher


[1] http://www.tornadoweb.org/documentation#performance
[2] https://gist.github.com/735246#file_start_test.sh
[3] Memory: 8GB, CPU: Quad-Core AMD Opteron™ Processor 1354, 2.2GHz
OS: Ubuntu 10.04 LTS
[4] Memory: 4GB, CPU: Intel(R) Core(TM)2 Duo CPU T9550 @ 2.66GHz
OS: Ubuntu Server 9.10
[5] https://gist.github.com/735246#file_test_raw_data.txt
[6] https://gist.github.com/735246#file_process_tornado_bm.py
[7] https://spreadsheets.google.com/ccc?key=0AhQ6xoOmtiH-dFdGeDE1YUU5dUpId0FUZklYb1hKZnc&hl=en&authkey=CJDyuSk
[8] http://bret.appspot.com/entry/tornado-tech-talk

Rickard Böttcher

Dec 10, 2010, 9:35:59 AM
to Tornado Web Server
Quick note:

Some of the spikes seen in the chart are due to the fact that
development was done in different branches, which therefore use an
old codebase where a feature making Tornado slower/faster is not yet
incorporated. When the code is later merged back in, all of the
features are present and thus the performance goes back to what we saw
before the commit.

Japhy Bartlett

Dec 10, 2010, 1:31:29 PM
to python-...@googlegroups.com
bummer! thanks for doing some serious testing and bringing it to light.

- Japhy


Craig Campbell

Dec 10, 2010, 1:44:02 PM
to Tornado Web Server
For the sake of those who don't know, what exactly did StackContext
add? (sorry if this is a stupid question)

On Dec 10, 1:31 pm, Japhy Bartlett <japhy.bartl...@gmail.com> wrote:
> bummer!  thanks for doing some serious testing and bringing it to light.
>
> - Japhy
>
> 2010/12/10 Rickard Böttcher <rickard.bottc...@gmail.com>:

Phil Plante

Dec 10, 2010, 3:03:02 PM
to Tornado Web Server
Is it possible for you to run a similar test against other
frameworks? I would be interested to see this for other heavily
developed projects such as Django.

Ben Darnell

Dec 10, 2010, 3:12:39 PM
to python-...@googlegroups.com
Wow, thanks a lot for the detailed analysis. This is great
information and I'd love to get these scripts or something like them
running on a regular basis.

FWIW on my laptop (i.e. the least scientific or reproducible
environment possible), I see a much less dramatic performance drop -
~20% fewer requests per second today compared to the 1.0 release.

On Fri, Dec 10, 2010 at 5:09 AM, Jacob Kristhammar
<krist...@gmail.com> wrote:
> The following points are at least worth mentioning:
> 1. The performance number in [1] is no longer true. (We haven't
> compared Tornado against Django or web.py though)
> 2. A lot of the performance hits seems to be a result of the
> introduction of StackContext (maybe a fair tradeoff?)

This is very surprising to me and warrants further analysis.
Fundamentally StackContext is doing the same sort of thing the old
async_callback wrappers were doing, so it really shouldn't add that
much overhead (although it does add the equivalent of a couple of
async_callback wrappers to purely synchronous handlers like the hello
world handler used in this benchmark). I think the convenience of
StackContext is worth a small performance hit, but it's hard to
justify in the face of such a large slowdown.

For Craig, and anyone else following along who is unfamiliar with
StackContext, it was added in Tornado 1.1 primarily as an
error-handling mechanism for asynchronous operations. It eliminated
the need to explicitly wrap your callbacks in self.async_callback().

> 3. Many selected Tornado because of it's simplicity and performance
> which seems to be the things in decay.

Personally my attraction to Tornado has always been its simplicity,
with performance being a secondary concern. Where do you feel that
Tornado is losing simplicity? (or maybe this should be a separate
thread).

> 4. We're no longer sure if we should invest more time in Tornado
> (unless this trend changes) and start to look at other options.

That would be unfortunate, since you've made a number of valuable contributions.

>
> In general...
> What's your thoughts about the future of Tornado?

I think the core of Tornado (IOLoop, HTTPServer, etc) is fairly stable
now, and I don't anticipate major changes in that area (certainly not
anything as invasive as StackContext was). I don't think we'll see
this downward trend in performance continue, and now that you've
called my attention to it things should start to improve.

> What's changed since you first open sourced Tornado e.g. in this tech-

I've taken over the project from Bret, for one thing. :) But
seriously, I don't think there's been a change in philosophy or
strategy. If I had been looking at benchmark numbers more closely I
probably wouldn't have added StackContext (at least not in its current
form), but my performance analysis was always in the context of a
non-trivial application, where per-request overhead tends to fade into
the background (if you had asked me before this email which part of
tornado was most in need of performance improvements, I probably would
have said template rendering).
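
A back-of-the-envelope way to see why a fixed per-request cost dominates a hello world benchmark but fades in a larger application. The numbers are made up, and the model assumes a single process handling one request at a time, so throughput is simply the inverse of time per request.

```python
def throughput_drop(handler_ms, extra_overhead_ms):
    """Fraction of requests/sec lost when a fixed per-request cost is added."""
    before = 1.0 / handler_ms
    after = 1.0 / (handler_ms + extra_overhead_ms)
    return 1.0 - after / before


# Hypothetical: 0.1 ms of added framework overhead per request.
for handler_ms in (0.1, 1.0, 10.0):
    drop = throughput_drop(handler_ms, 0.1)
    print("handler %5.1f ms -> %4.1f%% fewer requests/sec"
          % (handler_ms, 100 * drop))
```

With these figures a 0.1 ms handler loses half its throughput, while a 10 ms handler loses about 1%.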

-Ben

Ben Darnell

Dec 10, 2010, 3:52:46 PM
to python-...@googlegroups.com
On Fri, Dec 10, 2010 at 12:12 PM, Ben Darnell <b...@bendarnell.com> wrote:
> This is very surprising to me and warrants further analysis.
> Fundamentally StackContext is doing the same sort of thing the old
> async_callback wrappers were doing, so it really shouldn't add that
> much overhead (although it does add the equivalent of a couple of
> async_callback wrappers to purely synchronous handlers like the hello
> world handler used in this benchmark).  I think the convenience of
> StackContext is worth a small performance hit, but it's hard to
> justify in the face of such a large slowdown.


It turns out that the 'with' statement has a huge amount of overhead -
a with statement is more than 5 times as expensive as a try/except,
and the @contextlib.contextmanager decorator is another factor of 5
slower than a manually-written class with __enter__/__exit__ methods.
(benchmark: https://gist.github.com/736778).
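
For anyone who wants to try this locally, a comparison along these lines can be sketched with timeit. This is not the code from the gist above, and the function and class names are invented for the example. Each variant pays the same function-call overhead, and the ratios depend heavily on the Python version, so don't expect the exact factors quoted here.

```python
import contextlib
import timeit


def use_try_except():
    try:
        pass
    except Exception:
        raise


class ManualContext(object):
    """Hand-written context manager with explicit __enter__/__exit__."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # do not swallow exceptions


@contextlib.contextmanager
def generator_context():
    """The same no-op context, built with the decorator."""
    yield


def use_manual_with():
    with ManualContext():
        pass


def use_decorated_with():
    with generator_context():
        pass


def measure(number=100000):
    """Return seconds taken for `number` calls of each variant."""
    variants = [use_try_except, use_manual_with, use_decorated_with]
    return {fn.__name__: timeit.timeit(fn, number=number) for fn in variants}


for name, seconds in sorted(measure().items()):
    print("%-20s %.4f s" % (name, seconds))
```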

It's easy enough to get rid of the @contextmanager decorator, and I'll
look into whether we can do without the with statement at all.

-Ben

Jacob Kristhammar

Dec 10, 2010, 6:27:15 PM
to python-...@googlegroups.com
First of all, it's great to see the good reception and appreciation of this post. It was only meant as constructive criticism to switch some focus back on the things that made us pick Tornado to begin with (if it was ever lost).

It's great to see that you've already started to find weak spots and potential improvements, nice work!

More comments below,

-- Jacob


On Dec 10, 2010, at 9:12 PM, Ben Darnell wrote:

> Wow, thanks a lot for the detailed analysis. This is great
> information and I'd love to get these scripts or something like it
> running on a regular basis.
>

We actually realized the same thing for some of our other projects while compiling this data ;)

Maybe it would be a good thing to revive one of the SC threads, but here are some thoughts.

Wrapping your head around how StackContext changes the internals of Tornado is not the most straightforward task. We've been messing around quite a lot in the internals of Tornado, writing some client libraries etc., and hence had to get our hands dirty (or at least understand how SC changed things).
Even though it's clearly "easier" from the standard user's point of view, it added "magic" that makes people less conscious of what is actually going on. Prior to SC it was easier to have a mental model of how things worked all the way down to the ioloop, which is something I found very nice in comparison to more complex frameworks that make it close to impossible to know everything.

Writing asynchronous code is not the most straightforward task in itself, but Tornado was/is a very clean framework providing you with a minimal toolkit to do it. Most everything just made sense when Tornado was released to the public. The fact that you explicitly had to wrap with async_callback worked as a good sanity check that you were doing things right, and it felt very natural IMHO. Maybe it's just my lack of deep understanding of how SC really works, but I actually find myself thinking "hmm, well, SC just fixed it" (without knowing why) sometimes when things blow up (which I guess is part of its purpose).

TL;DR: It's harder to track a request's lifetime, from seed to bread so to speak.


>
>> 4. We're no longer sure if we should invest more time in Tornado
>> (unless this trend changes) and start to look at other options.
>
> That would be unfortunate, since you've made a number of valuable contributions.

We still depend heavily on Tornado and will continue to use it for now. I've also gained confidence in a brighter future after the good reaction to this post. As long as we know that performance is still valued we can dig in to fix it. I liked the idea of not putting too many features in there (i.e. Bret's quote).

>
>>
>> In general...
>> What's your thoughts about the future of Tornado?
>
> I think the core of Tornado (IOLoop, HTTPServer, etc) is fairly stable
> now, and I don't anticipate major changes in that area (certainly not
> anything as invasive as StackContext was). I don't think we'll see
> this downward trend in performance continue, and now that you've
> called my attention to it things should start to improve.
>
>> What's changed since you first open sourced Tornado e.g. in this tech-
>
> I've taken over the project from Bret, for one thing. :) But
> seriously, I don't think there's been a change in philosophy or
> strategy. If I had been looking at benchmark numbers more closely I
> probably wouldn't have added StackContext (at least not in its current
> form), but my performance analysis was always in the context of a
> non-trivial application, where per-request overhead tends to fade into
> the background (if you had asked me before this email which part of
> tornado was most in need of performance improvements, I probably would
> have said template rendering).

This is a good point. These numbers don't mean the world, but they're probably a good trend indicator of overall performance.

Ben Darnell

Dec 10, 2010, 8:04:55 PM
to python-...@googlegroups.com
On Fri, Dec 10, 2010 at 12:52 PM, Ben Darnell <b...@bendarnell.com> wrote:
> On Fri, Dec 10, 2010 at 12:12 PM, Ben Darnell <b...@bendarnell.com> wrote:
>> This is very surprising to me and warrants further analysis.
>> Fundamentally StackContext is doing the same sort of thing the old
>> async_callback wrappers were doing, so it really shouldn't add that
>> much overhead (although it does add the equivalent of a couple of
>> async_callback wrappers to purely synchronous handlers like the hello
>> world handler used in this benchmark).  I think the convenience of
>> StackContext is worth a small performance hit, but it's hard to
>> justify in the face of such a large slowdown.
>
>
> It turns out that the 'with' statement has a huge amount of overhead -
> a with statement is more than 5 times as expensive as a try/except,
> and the @contextlib.contextmanager decorator is another factor of 5
> slower than a manually-written class with __enter__/__exit__ methods.
> (benchmark: https://gist.github.com/736778).
>
> It's easy enough to get rid of the @contextmanager decorator, and I'll
> look into whether we can do without the with statement at all.

I've just committed a few changes that in my tests reclaim about 40%
of the performance lost since 1.0.

-Ben

Landon

Dec 10, 2010, 11:06:28 PM
to Tornado Web Server
Ben-

As a side note to the discussion here- you rock as an open source
project owner. Keep on truckin'. Tornado's a fantastic framework and
I can't explain how great the community around it is.

-Landon

Romy

Dec 11, 2010, 2:41:17 AM
to Tornado Web Server
+1

Josh Marshall

Dec 11, 2010, 2:58:22 AM
to python-...@googlegroups.com

Totally agree - keep rocking Ben.

Josh Marshall

On Dec 11, 2010 1:41 AM, "Romy" <romy.m...@gmail.com> wrote:

+1


On Dec 10, 8:06 pm, Landon <lando...@gmail.com> wrote:
> Ben-
>

> As a side note to the discussion ...

Jesus Cea

Dec 11, 2010, 7:30:11 AM
to python-...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/12/10 21:52, Ben Darnell wrote:
> It turns out that the 'with' statement has a huge amount of overhead -
> a with statement is more than 5 times as expensive as a try/except,
> and the @contextlib.contextmanager decorator is another factor of 5
> slower than a manually-written class with __enter__/__exit__ methods.
> (benchmark: https://gist.github.com/736778).
>
> It's easy enough to get rid of the @contextmanager decorator, and I'll
> look into whether we can do without the with statement at all.

Could I suggest filing a bug against Python? It would be an improvement for
Python 3.3. That would not help Tornado, since it doesn't
support Python 3.x (yet), but filing a bug seems appropriate and sensible.

- --
Jesus Cea Avion _/_/ _/_/_/ _/_/_/
jc...@jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/
jabber / xmpp:jc...@jabber.org _/_/ _/_/ _/_/_/_/_/
. _/_/ _/_/ _/_/ _/_/ _/_/
"Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/
"My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQCVAwUBTQNu05lgi5GaxT1NAQJoUAQAnsBKbV4mBAHsqNUrp3a7OvQ8J0EohHDi
QqAWPte9r2qBrsVMaJoBtRMIRHTRt1s6nxeA9eYlc2hiEX3hnsK4I2MEodDmazj8
PB7p/5LQ4FbB2EJOJADeeaw5IHFeFxMxt7Z1NS324unFrZzPZkb1v4S6vnoD78Vx
wm47YTtSW/Q=
=1O8x
-----END PGP SIGNATURE-----

paolo.losi

Dec 16, 2010, 3:25:52 AM
to Tornado Web Server


On Dec 11, 1:30 pm, Jesus Cea <j...@jcea.es> wrote:
> > It's easy enough to get rid of the @contextmanager decorator, and I'll
> > look into whether we can do without the with statement at all.
>
> Could I suggest to file a bug in python?. It would be an improvement for
> python 3.3. That would be not of help for Tornado, since it doesn't
> support Python 3.x (yet), but filing a bug seems appropiate and sensible.

I concur with Jesus.

It's a real pity that the contextmanager decorator takes such
a big performance hit.

I have to say that I find stack contexts so elegant and powerful
that I would be willing to pay a 50% decrease in performance to get
their elegance and effectiveness back.

If you compare the "original" stack context with the state of the art
(twisted errback or node.js eventsource [1], which is still
in development), it's easy to see how powerful and useful a concept
it is.

Paolo

[1] http://markmail.org/download.xqy?id=cgdskhvomzw6smyw&number=1