Performance of a django website


Richard Coleman

Dec 11, 2007, 12:53:31 PM
to django...@googlegroups.com
At work, we are developing a commercial website based on Django. It's a
fairly dynamic site (think social networking). I am doing the initial
load testing to estimate the number of servers we will need for the
production site. The production site will be load balanced using a pair
of BigIP boxes.

When I stress test the dynamic part of the site, I am only getting about
300 requests per second on my test setup. This is using two Dell 1950's
(one for web, one for mysql database). These are very powerful machines
(3.0ghz Xeons, 8 cores each, 16 gig of ram, 15k SAS drives, etc.)

I've done all the standard performance tuning I find when reading
through various websites about django tuning. Is this the performance I
should expect?

When I'm running the load test, the CPU on the web server gets
completely buried (even with 8 cores). The mysql server doesn't seem
loaded at all. Any suggestions on how to find the bottlenecks?

I'm running Ubuntu 7.10 (server, 64 bit version). Django from a very
recent trunk.

Thanks for any advice.

Richard Coleman
rcol...@criticalmagic.com

l5x

Dec 11, 2007, 1:08:16 PM
to Django users
I think that some info about your web-server could be important.

Rajesh Dhawan

Dec 11, 2007, 1:08:48 PM
to Django users
Hi Richard,

>
> When I stress test the dynamic part of the site, I am only getting about
> 300 requests per second on my test setup. This is using two Dell 1950's
> (one for web, one for mysql database). These are very powerful machines
> (3.0ghz Xeons, 8 cores each, 16 gig of ram, 15k SAS drives, etc.)

- How are you running your Django app? Mod_python? FastCGI?
- What web server are you using?
- What's the nature of the dynamic Django request you are measuring?
How many DB queries does it make? Is DEBUG mode off?
- What kind of numbers do you get when you serve a simple and small
HTML file statically from your Web server without Django?
- What kind of numbers do you get when the Django view you are
benchmarking goes directly to a simple template (i.e. no DB queries)?
- Are you using memcache?

-Rajesh D

Joseph Heck

Dec 11, 2007, 1:13:00 PM
to django...@googlegroups.com
Add on to Rajesh's list -

what pages are you requesting and have you profiled them to understand
what's taking long?

-joe

Joe

Dec 11, 2007, 1:18:54 PM
to Django users
I've found the largest memory hog to be the native way related tables
are set up.

Check class definitions with related tables and edit as such:

class ...(models.Model):
    relatedtable = models.ForeignKey(RelatedTable, core=True,
                                     raw_id_admin=True)

The raw_id_admin=True prevents django from pulling up all related
records when the object is loaded etc. While I would have expected
this to only affect the admin side, it had a huge impact on the public
side of our site. Memory consumption went from +500mb to 20-30mb.

Joe

On Dec 11, 12:53 pm, Richard Coleman <rcole...@criticalmagic.com>
wrote:
> rcole...@criticalmagic.com

Jeremy Dunck

Dec 11, 2007, 1:26:38 PM
to django...@googlegroups.com
On Dec 11, 2007 12:18 PM, Joe <joer...@gmail.com> wrote:
>
> I've found the largest memory hog to be the native way related tables
> are setup.
>
> Check class definitions with related tables and edit as such:
>
> class ...(models.Mode):
> relatedtable = models.ForeignKey(RelatedTable, core=True,
> raw_id_admin=True)
>
> The raw_id_admin=True prevents django from pulling up all related
> records when the object is loaded etc. While I would have expected
> this to only affect the admin side, it had a huge impact on the public
> side of our site. Memory consumption went from +500mb to 20-30mb.
>

You must be using manipulators in oldforms. Try newforms.

Richard Coleman

Dec 11, 2007, 1:37:36 PM
to django...@googlegroups.com

1. mod_python
2. apache 2.2.4
3. I'm using funkload and ab to measure the requests per second of one
of the base pages within the dynamic part of the website
4. When I hit a static page in the same way (using ab), I get 6500
requests per second.
5. This is without memcached, or any other caching.

Richard Coleman
rcol...@criticalmagic.com

Richard Coleman

Dec 11, 2007, 1:41:03 PM
to django...@googlegroups.com

I forgot to mention these:

6. django debug turned off
7. mod_python debug turned off
8. django template debugging turned off
9. apache maxclient cranked up to 1000 (although it never gets close to
that many processes).

Richard Coleman
rcol...@criticalmagic.com

Brian Morton

Dec 11, 2007, 1:41:34 PM
to Django users
Are you serving static content from the same apache instance? Also,
what kind of network connectivity do you have between your web and
mysql servers? It sounds like apache might need some tuning in terms
of thread parameters. Have you enabled caching yet? Turn on the
cache framework site-wide and set your expiration period to 1 minute
or something like that. You can go back and only enable it for views
that you want cached later.
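
As a rough illustration of the site-wide cache suggestion, a settings.py fragment might look like the sketch below. It uses the setting names from Django releases of this era (CACHE_BACKEND, CACHE_MIDDLEWARE_SECONDS, django.middleware.cache.CacheMiddleware); later releases configure caching through a CACHES dict and split the middleware in two. The memcached address and the key prefix are assumptions, not values from this thread.

```python
# settings.py fragment: a sketch, not a drop-in configuration.

# Assumption: a memcached daemon is listening locally on its default port.
CACHE_BACKEND = "memcached://127.0.0.1:11211/"

# Cache each page for 60 seconds, matching the "1 minute" suggestion.
CACHE_MIDDLEWARE_SECONDS = 60

# Hypothetical prefix so several sites can share one memcached instance.
CACHE_MIDDLEWARE_KEY_PREFIX = "mysite"

# Enabling the cache middleware turns on caching for the whole site.
# Its position relative to other middleware matters; check the cache
# documentation for the Django version in use before copying this order.
MIDDLEWARE_CLASSES = (
    "django.middleware.cache.CacheMiddleware",
    "django.middleware.common.CommonMiddleware",
)
```

With a short timeout like this, a page that is hit hundreds of times per second is rendered by Django only about once a minute per URL, at the cost of content being up to a minute stale.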

On Dec 11, 12:53 pm, Richard Coleman <rcole...@criticalmagic.com>
wrote:
> rcole...@criticalmagic.com

Eric Walstad

Dec 11, 2007, 1:41:42 PM
to django...@googlegroups.com
If you don't already use them:

Firebug:
http://www.getfirebug.com/
will show you how long each component of the page takes from request to
response and shows you what components are returned in parallel (among
other really cool features). This can help identify your bottleneck.
For example, if your images are taking longer than expected while the
page itself returns quickly then you may want to serve images with a
dedicated web server, like lighttpd.

Yahoo's 'yslow' tool gives your page a 'grade' and offers advice on how
to improve your grade. It also has other handy bits (JSLint, Empty vs.
Primed cache comparisons).
http://developer.yahoo.com/yslow/

The grades are based on their 'Rules for High Performance Websites':
http://developer.yahoo.com/performance/index.html#rules

Hm, funny, their yslow page scores a 'D' :)

Richard Coleman

Dec 11, 2007, 1:49:30 PM
to django...@googlegroups.com
Brian Morton wrote:
> Are you serving static content from the same apache instance? Also,
> what kind of network connectivity do you have between your web and
> mysql servers? It sounds like apache might need some tuning in terms
> of thread parameters. Have you enabled caching yet? Turn on the
> cache framework site-wide and set your expiration period to 1 minute
> or something like that. You can go back and only enable it for views
> that you want cached later.
>
>
1. Static content and dynamic are on the same server (that will change
on production). But they are on different Apache virtual hosts. Mod_python
is turned off for the static stuff. Hitting only a static page is very
fast (6500 requests per second). Hitting the dynamic side is slow (300
requests per second).

2. It's gigE between the web server and mysql server, on a dedicated
switch. The database is working pretty hard (7000 selects per second),
but doesn't seem to be the bottleneck. The webserver is hammered
(typing any command takes a long time).
3. I'm using prefork MPM on apache with maxclients set to 1000.

We are starting to experiment with caching, but I want to improve the
raw performance of the site as well.

Richard Coleman
rcol...@criticalmagic.com

Joseph Heck

Dec 11, 2007, 1:54:22 PM
to django...@googlegroups.com
Definitely take that specific dynamic page and profile it - see what's
happening and why it's slow. That's clearly your first bottleneck to
work on from the data you've provided.

There's a wiki page on that very process here:
http://code.djangoproject.com/wiki/ProfilingDjango and there's
additionally some information you can get when you enable DEBUG to see
how long the SQL queries are taking for that dynamic page. There's a
nice snippet to help there at
http://www.djangosnippets.org/snippets/93/
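
For a feel of what profiling a single page involves, here is a minimal sketch using only the standard library's cProfile and pstats modules. It is not the recipe from the wiki page; `fake_view` is a hypothetical stand-in for the slow Django view, and in a real project you would wrap the actual view callable (or the request handler) the same way.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(15)
    return result, buffer.getvalue()


def fake_view(n):
    # Hypothetical stand-in for a CPU-heavy view function.
    return sum(i * i for i in range(n))


result, report = profile_call(fake_view, 10000)
print(report)
```

The report lists call counts and cumulative time per function, which is usually enough to see whether the time goes into template rendering, ORM object construction, or something else.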

-joe

Brian Morton

Dec 11, 2007, 2:11:11 PM
to Django users
Joseph raises a good point. I only recently discovered what a
performance killer DEBUG mode can be when it comes to queries. It
will cause your client machine (the web server in this case) to run
out of memory after a lengthy process (such as data loading). Are you
profiling in DEBUG mode?

On Dec 11, 1:54 pm, "Joseph Heck" <joseph.h...@gmail.com> wrote:
> Definitely take that specific dynamic page and profile it - see what's
> happening and why its slow. That's clearly your first bottleneck to
> work on from the data you've provided.
>
> There's a wiki page on that very process here:
> http://code.djangoproject.com/wiki/ProfilingDjango and there's
> additionally some information you can get when you enable DEBUG to see
> how long the SQL queries are taking for that dynamic page. There's a
> nice snippet to help there at
> http://www.djangosnippets.org/snippets/93/
>
> -joe
>
> On Dec 11, 2007 10:49 AM, Richard Coleman <rcole...@criticalmagic.com> wrote:
>
>
>
> > Brian Morton wrote:
> > > Are you serving static content from the same apache instance? Also,
> > > what kind of network connectivity do you have between your web and
> > > mysql servers? It sounds like apache might need some tuning in terms
> > > of thread parameters. Have you enabled caching yet? Turn on the
> > > cache framework site-wide and set your expiration period to 1 minute
> > > or something like that. You can go back and only enable it for views
> > > that you want cached later.
>
> > 1. Static content and dynamic are on the same server (that will change
> > on production). But they are on different apache virtual. Mod_python
> > is turned off for the static stuff. Hitting only a static page is very
> > fast (6500 requests per second). Hitting the dynamic side is slow (300
> > requests per second).
>
> > 2. It's gigE between the web server and mysql server, on a dedicated
> > switch. The database is working pretty hard (7000 selects per second),
> > but doesn't seem to be the bottleneck. The webserver is hammered
> > (typing any command takes a long time).
> > 3. I'm using prefork MPM on apache with maxclients set to 1000.
>
> > We are starting to experiment with caching, but I want to improve the
> > raw performance of the site as well.
>
> > Richard Coleman
> > rcole...@criticalmagic.com

Brian Morton

Dec 11, 2007, 2:18:46 PM
to Django users
Sorry, just saw your earlier post about debug being turned off. I
would say it is definitely time to start profiling your code.

On Dec 11, 2:11 pm, Brian Morton <rokclim...@gmail.com> wrote:
> Joseph raises a good point. I only recently discovered what a
> performance killer DEBUG mode can be when it comes to queries. It
> will cause your client machine (the web server in this case) to run
> out of memory after a lengthy process (such as data loading). Are you
> profiling in DEBUG mode?
>
> On Dec 11, 1:54 pm, "Joseph Heck" <joseph.h...@gmail.com> wrote:
>
> > Definitely take that specific dynamic page and profile it - see what's
> > happening and why its slow. That's clearly your first bottleneck to
> > work on from the data you've provided.
>
> > There's a wiki page on that very process here:
> > http://code.djangoproject.com/wiki/ProfilingDjango and there's

Rajesh Dhawan

Dec 11, 2007, 2:21:39 PM
to Django users
Hi again,

> 3. I'm using prefork MPM on apache with maxclients set to 1000.

Apart from Joe's excellent profiling suggestion, I would recommend
reducing maxclients to a much lower value (like say 100) and then
increasing it gradually.

From your description of the sluggish response of your web server, it
looks like the machine may be swapping to disk. Each of your clients
fires up a Python VM taking up several tens of MB of RAM. Even if each
VM takes up just 20MB (it's probably more than that), a thousand
clients will eat up 20GB, forcing your machine to swap.
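
The arithmetic behind that estimate can be sketched as follows. The 20 MB per process and 16 GB of RAM come from this thread; the 2 GB held back for the OS and everything else is purely an assumption for illustration.

```python
def max_prefork_clients(total_ram_mb, per_process_mb, reserved_mb=0):
    """Largest number of child processes that fit in RAM without swapping."""
    usable_mb = total_ram_mb - reserved_mb
    return max(usable_mb // per_process_mb, 0)


ram_mb = 16 * 1024       # 16 GB web server, as described in the thread
per_process_mb = 20      # low-end per-process guess from the message above

# Memory needed if all 1000 allowed clients were actually running.
needed_mb = 1000 * per_process_mb
print("1000 clients need about %.1f GB" % (needed_mb / 1024.0))

# Assumption: keep 2 GB free for the OS, disk cache and other daemons.
print("clients that fit:", max_prefork_clients(ram_mb, per_process_mb, 2048))
```

The point is only that MaxClients should be derived from measured per-process memory rather than picked as a round number.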

-Rajesh

Brian Morton

Dec 11, 2007, 2:22:17 PM
to Django users

Forest Bond

Dec 11, 2007, 2:27:28 PM
to django...@googlegroups.com
Hi,

On Tue, Dec 11, 2007 at 01:37:36PM -0500, Richard Coleman wrote:
> Rajesh Dhawan wrote:
> >> When I stress test the dynamic part of the site, I am only getting about
> >> 300 requests per second on my test setup. This is using two Dell 1950's
> >> (one for web, one for mysql database). These are very powerful machines
> >> (3.0ghz Xeons, 8 cores each, 16 gig of ram, 15k SAS drives, etc.)
> >
> > - How are you running your Django app? Mod_python? FastCGI?
> > - What web server are you using?
> > - What's the nature of the dynamic Django request you are measuring?
> > How many DB queries does it make? Is DEBUG mode off?
> > - What kind of numbers do you get when you serve a simple and small
> > HTML file statically from your Web server without Django?
> > - What kind of numbers do you get when the Django view you are
> > benchmarking goes directly to a simple template (i.e. no DB queries)?
> > - Are you using memcache?
>

> 1. mod_python
> 2. apache 2.2.4
> 3. I'm using funkload and ab to measure the requests per second of one
> of the base pages within the dynamic part of the website
> 4. When I hit a static page in the same way (using ab), I get 6500
> requests per second.
> 5. This is without memcached, or any other caching.

I'm not all that qualified to comment as I've never done this sort of testing,
but I do wonder if the performance you are seeing may not be all that
unreasonable, given that no caching is currently being performed? Using
memcached has been known to improve performance dramatically.

Profiling never hurts, but there's nothing wrong with doing the easy stuff
first, right?

-Forest
--
Forest Bond
http://www.alittletooquiet.net


Richard Coleman

Dec 11, 2007, 2:28:58 PM
to django...@googlegroups.com

The machine is definitely not swapping. It has 16 gig of ram, and top
never shows more than 7 gig of ram in use when I do the load testing.

I have maxclients set to 1000, but the number of apache processes never
gets close to that value. I've played around with various settings for
maxclient, but as long as I don't set it too low, this never comes into
play.

Richard Coleman
rcol...@criticalmagic.com

Jeremy Dunck

Dec 11, 2007, 2:38:47 PM
to django...@googlegroups.com
On Dec 11, 2007 1:21 PM, Rajesh Dhawan <rajesh...@gmail.com> wrote:
>
> Hi again,
>
> > 3. I'm using prefork MPM on apache with maxclients set to 1000.
>
> Apart from Joe's excellent profiling suggestion, I would recommend
> reducing maxclients to a much lower value (like say 100) and then
> increasing it gradually.

Please make a 'hello world' view to simplify the troubleshooting.

If it's still "only" 300 r/s without hitting the DB, the problem is
much smaller.

You haven't listed your ab parameters; if you're using maxclients so
high, perhaps you're paying the process startup cost (nearly) each
time, so consider setting startservers high. In general, if you have
a dedicated web server, setting startservers near maxclients isn't a
bad way to go.

Richard Coleman

Dec 11, 2007, 2:39:50 PM
to django...@googlegroups.com
Forest Bond wrote:
>>> - How are you running your Django app? Mod_python? FastCGI?
>>> - What web server are you using?
>>> - What's the nature of the dynamic Django request you are measuring?
>>> How many DB queries does it make? Is DEBUG mode off?
>>> - What kind of numbers do you get when you serve a simple and small
>>> HTML file statically from your Web server without Django?
>>> - What kind of numbers do you get when the Django view you are
>>> benchmarking goes directly to a simple template (i.e. no DB queries)?
>>> - Are you using memcache?
>>>
>> 1. mod_python
>> 2. apache 2.2.4
>> 3. I'm using funkload and ab to measure the requests per second of one
>> of the base pages within the dynamic part of the website
>> 4. When I hit a static page in the same way (using ab), I get 6500
>> requests per second.
>> 5. This is without memcached, or any other caching.
>>
>
> I'm not all that qualified to comment as I've never done this sort of testing,
> but I do wonder if the performance you are seeing may not be all that
> unreasonable, given that no caching is currently being performed? Using
> memcached has been known to improve performance dramatically.
>
> -Forest
>
I think a large part of my question really comes down to "Is 300
requests per second reasonable for an uncached django site on a single
machine?". Maybe it is.

We are looking at using the memcached API in our code, and I'm sure we
will get a speedup there. But before we start down that route, I wanted
to make sure we weren't doing anything silly on the raw site.

I think my next step is to use the profiler and find out why the site is
so CPU intensive.
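
As a back-of-envelope check on "CPU intensive", the figures reported in this thread imply a per-request CPU budget. The sketch below assumes all 8 cores are fully busy at 300 requests per second, which matches the description of the web server being buried but is not a measured number.

```python
def cpu_ms_per_request(cores, requests_per_second, utilisation=1.0):
    """Average CPU time (ms) spent per request across all cores."""
    return cores * utilisation * 1000.0 / requests_per_second


# 8 cores and 300 requests/second are the figures quoted in the thread;
# full utilisation is an assumption.
dynamic_ms = cpu_ms_per_request(cores=8, requests_per_second=300)
print("about %.1f ms of CPU per dynamic request" % dynamic_ms)
```

Under that assumption each page costs roughly 27 ms of CPU time, which is the number a profiler run should help break down.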

Thanks for all the help.

Richard Coleman
rcol...@criticalmagic.com

Karen Tracey

Dec 11, 2007, 2:52:16 PM
to django...@googlegroups.com
On Dec 11, 2007 1:49 PM, Richard Coleman <rcol...@criticalmagic.com> wrote:
> 1. Static content and dynamic are on the same server (that will change
> on production). But they are on different apache virtual. Mod_python
> is turned off for the static stuff. Hitting only a static page is very
> fast (6500 requests per second). Hitting the dynamic side is slow (300
> requests per second).
>
> 2. It's gigE between the web server and mysql server, on a dedicated
> switch. The database is working pretty hard (7000 selects per second),
> but doesn't seem to be the bottleneck. The webserver is hammered
> (typing any command takes a long time).

DB handling 7000 selects/sec while web server is serving 300 pages/sec implies each page request is generating >20 select requests.  Does that seem reasonable to you based on the content of the page you are serving?  (It seems high to me.)  Brian Morton provided a pointer to the doc for select_related, which could help to reduce the number of selects.  Even if the database is keeping up, so many round-trips talking to it for a single page request is going to add up.
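
To make the "many selects per page" pattern concrete, here is a small framework-free sketch using the standard library's sqlite3 module. The two-table schema is hypothetical. The first function mimics lazily following a foreign key for each row (one extra SELECT per object); the second fetches the same data with a single JOIN, which is the kind of query select_related is meant to produce.

```python
import sqlite3

# Hypothetical two-table schema standing in for a model with a ForeignKey.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES author(id)
    );
""")
conn.executemany("INSERT INTO author VALUES (?, ?)",
                 [(i, "author%d" % i) for i in range(1, 21)])
conn.executemany("INSERT INTO post VALUES (?, ?, ?)",
                 [(i, "post%d" % i, i) for i in range(1, 21)])
conn.commit()

# Record every SELECT the connection executes from here on.
selects = []


def record(sql):
    if sql.lstrip().upper().startswith("SELECT"):
        selects.append(sql)


conn.set_trace_callback(record)


def titles_lazy():
    """One query for the posts, then one more per post for its author."""
    rows = conn.execute(
        "SELECT title, author_id FROM post ORDER BY id").fetchall()
    result = []
    for title, author_id in rows:
        name = conn.execute(
            "SELECT name FROM author WHERE id = ?",
            (author_id,)).fetchone()[0]
        result.append((title, name))
    return result


def titles_joined():
    """The same data fetched with a single JOIN."""
    return conn.execute(
        "SELECT post.title, author.name FROM post "
        "JOIN author ON author.id = post.author_id "
        "ORDER BY post.id").fetchall()


del selects[:]
lazy = titles_lazy()
lazy_count = len(selects)

del selects[:]
joined = titles_joined()
joined_count = len(selects)

print("lazy: %d SELECTs, joined: %d SELECT" % (lazy_count, joined_count))
```

Both functions return identical rows; only the number of round-trips differs, and over a network link to a separate database server each round-trip adds latency even when the database itself is idle.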

Karen

Jeremy Dunck

Dec 11, 2007, 2:54:13 PM
to django...@googlegroups.com
On Dec 11, 2007 1:39 PM, Richard Coleman <rcol...@criticalmagic.com> wrote:
...

> >
> I think a large part of my question really comes down to "Is 300
> requests per second reasonable for an uncached django site on a single
> machine?". Maybe it is.
>
> We are looking at using the memcached API in our code, and I'm sure we
> will get a speedup there. But before we start down that route, I wanted
> to make sure we weren't doing any silly on the raw site.
>
> I think my next step is to use the profiler and find out why the site is
> so CPU intensive.
>
> Thanks for all the help.

Again I suggest a hello-world view as a baseline, but if you really
are doing 7000 queries per second, perhaps you're also constructing a
lot of ORM objects; there's some overhead to the signalling (pre/post
init, pre/post save).

James Bennett

Dec 11, 2007, 3:15:03 PM
to django...@googlegroups.com
On Dec 11, 2007 1:28 PM, Richard Coleman <rcol...@criticalmagic.com> wrote:
> I have maxclients set to 1000, but the number of apache processes never
> gets close to that value. I've played around with various settings for
> maxclient, but as long as I don't set it too low, this never comes into
> play.

Just a possibility:

If you're launching huge numbers of requests, but all from the same
machine (e.g., the one running ab), your request-generating machine
may be saturating its own bandwidth, which could cause Apache
processes on the web nodes to stall as they wait for you to be able to
receive data, which in turn leads to requests piling up and increased
load on the web node (but not on the DB node, which doesn't notice
anything unusual).

Not saying that's necessarily what's happening, but it's something to
look into. Also, 300 req./sec. for something you apparently haven't
done much profiling on is pretty damned impressive in my book.

--
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."

Graham Dumpleton

Dec 11, 2007, 5:35:17 PM
to Django users
On Dec 12, 6:21 am, Rajesh Dhawan <rajesh.dha...@gmail.com> wrote:
> Hi again,
>
> > 3. I'm using prefork MPM on apache with maxclients set to 1000.
>
> Apart from Joe's excellent profiling suggestion, I would recommend
> reducing maxclients to a much lower value (like say 100) and then
> increasing it gradually.
>
> From your description of the sluggish response of your web server, it
> looks like the machine may be swapping to disk. Each of your clients
> fires up a Python VM taking up several tens of MB of RAM.

To clarify this comment: each of the 'clients' does not fire up a
Python VM.

The Python framework as a whole is initialised only once per Apache
child process. This initialisation as a side effect will always create
the main Python interpreter instance for the Apache child process.

Unless you use the PythonInterpreter directive to force the site to always
run in the main Python interpreter instance, an additional Python sub
interpreter instance will by default be created per virtual host. The
per virtual host sub interpreter instance will be created the first
time a request to mod_python for that virtual host is received.

When any interpreter instance is created within an Apache child
process, that interpreter instance will then persist for the life of
that process and will be used in handling all requests which map to
that interpreter instance.

A Python interpreter instance when first created and before any web
application is loaded should only add a few hundred kilobytes.
mod_python adds to that with its dispatch code, but the total is still
not tens of MBs. Thus to say that 'Python VM taking up several
tens of MB of RAM' isn't accurate. If you are seeing this sort of
memory usage from just loading mod_python, then you have a Python
installation which doesn't provide a shared library for Python and
which also may include debug symbols.

Now, when a Django site instance is loaded, typically this can add
about 5MB per process. This value will increase over time based on
your application and how it uses memory based on database queries and
internal operations. Yes this can get up to tens of MBs, but that is
from the Django application and not mod_python or Python itself.

> Even if each
> VM takes up just 20MB (it's probably more than that), a thousand
> clients will eat up 20GB, forcing your machine to swap.

This can be a problem if using prefork, so dropping max clients to a
level more realistic for the amount of memory available is a good
idea. If memory is still an issue, then it would be worthwhile to
validate that your particular Django application is thread safe and
switch to the worker MPM for Apache instead.

Although the worker MPM uses multithreading, the GIL is not the big
issue some claim it is. This is because Apache is a multiprocess web
server, so there is no restriction on making good use of multiple
processes and cores.

As far as memory overhead goes, mod_wsgi uses less base level memory
than mod_python. The per request overhead of mod_wsgi is also less
than mod_python, so if you really want to squeeze as much as
possible out of what you've got, mod_wsgi should be better. That said,
realistically the bottlenecks appear to be in the application
or database rather than the underlying web server adapter.

Graham

Nic

Dec 12, 2007, 5:48:43 AM
to django...@googlegroups.com
"Karen Tracey" <kmtr...@gmail.com> writes:


Have you tried a connection pool?


--
Nic Ferrier
http://www.woome.com - Enjoy the minute!

lgr888999

Dec 17, 2007, 1:32:07 PM
to Django users
What's really interesting in this discussion is that there has been no
word about the Django application itself... Have you coded your
application in a Django-efficient way?

Artiom Diomin

Dec 18, 2007, 3:31:35 AM
to django...@googlegroups.com
Have you tried mod_wsgi?

Nic wrote:

Dima Dogadaylo

Dec 26, 2007, 2:44:18 AM
to Django users


On Dec 11, 8:37 pm, Richard Coleman <rcole...@criticalmagic.com>
wrote:
> 1. mod_python
> 2. apache 2.2.4
> 3. I'm using funkload and ab to measure the requests per second of one
> of the base pages within the dynamic part of the website
> 4. When I hit a static page in the same way (using ab), I get 6500
> requests per second.
> 5. This is without memcached, or any other caching.

Apache may cache static pages and serve them from memory, but it passes
Django requests through to your application without caching. If you want
to achieve the speed of static pages for your Django pages, try
WebAlchemy: http://www.mysoftparade.com/blog/webalchemy-django-apache/

Graham Dumpleton

Dec 26, 2007, 5:04:31 AM
to Django users
I suspect that could have been done somewhat more simply by using
other features of Apache mod_rewrite. In other words, one could have
avoided the need to rewrite the .htaccess file when files are updated.
One could just have used features of mod_rewrite to check for the
existence of a static file of appropriate name and internally redirect
to that if it existed, otherwise pass through to web application as
per normal.

So, a reasonable approach, though not a new one and not done as well as
it could have been. :-)

Graham

ecir...@gmail.com

Dec 26, 2007, 6:32:45 PM
to Django users


On Dec 11, 8:39 pm, Richard Coleman <rcole...@criticalmagic.com>
wrote:
> rcole...@criticalmagic.com

Hi,
this might be of help:
http://effbot.org/zone/zone-django-notes.htm
http://effbot.org/zone/django-memcached-view.htm

Btw, I'm really interested in how many reqs you get from a bare
HelloWorld. (I assume it is not that 300 number.)
So please, keep us informed!