Hmmm, you have come from the pinax group have you?
I already warned in that group that 25 processes was a rather ridiculous number.
> WSGIScriptAlias / /home/xx/deploy/bme.wsgi
> ErrorLog /var/log/apache2/error.xx.log
> LogLevel warn
> CustomLog /var/log/apache2/access.xx.log common
> ServerSignature On
> </VirtualHost>
>
> When the server is restarted it starts painfully slow until many =~ 10
> requests are made, after that speed comes to what's expected.
>
> Any ideas on how to troubleshoot that? I tested on two different servers
> (Debian Etch on a Dell Server, and Ubuntu Intrepid on an EC2 instance) and
> experienced the same issue.
You see what you see because the application is by default lazily loaded
the first time a request arrives that targets it.
Things are much worse than they need to be in your case because you
have so many processes needing to load it, so the first request that
hits each process will be slow.
Cut down the number of processes you need. You shouldn't need that
many even if each is single threaded, unless you have lots of
long-running requests occurring.
You then want to preload the WSGI script file, so that at least the
Python modules are loaded and ready. Django still lazily initialises
itself; I haven't yet worked out a recipe for people to use to force
it to initialise early.
Thus use this configuration:
WSGIImportScript /home/xx/deploy/bme.wsgi process-group=xx application-group=%{GLOBAL}
<VirtualHost *>
ServerName xx.com
DocumentRoot /home/xx/src
WSGIDaemonProcess xx user=xx group=xx threads=1 processes=25 python-path=/home/xx/lib/python2.5/site-packages
WSGIProcessGroup xx
WSGIApplicationGroup %{GLOBAL}
WSGIScriptAlias / /home/xx/deploy/bme.wsgi
ErrorLog /var/log/apache2/error.xx.log
LogLevel warn
CustomLog /var/log/apache2/access.xx.log common
ServerSignature On
</VirtualHost>
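For context, the bme.wsgi script file being preloaded would typically be a standard Django WSGI entry point along the lines of this minimal sketch; the settings module name 'bme.settings' is an assumption on my part, not something from your post:

```python
# Minimal sketch of a bme.wsgi script for this setup (Django 1.0 era).
# The settings module name 'bme.settings' is an assumption.
import os
import sys

# Match the python-path given to WSGIDaemonProcess.
sys.path.insert(0, '/home/xx/lib/python2.5/site-packages')
os.environ['DJANGO_SETTINGS_MODULE'] = 'bme.settings'

import django.core.handlers.wsgi

# mod_wsgi looks for the callable named 'application'.
application = django.core.handlers.wsgi.WSGIHandler()
```

WSGIImportScript then causes this file to be imported at process startup, so the module imports above happen before the first request.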
The two changes are to add the WSGIApplicationGroup directive, so that
it is known which interpreter is being used, and to add
WSGIImportScript to force the script to be loaded into that
process/interpreter automatically at startup of the process, rather
than later when the first request arrives.
Right now it is a pain that WSGIImportScript has to be outside of the
VirtualHost. This is fixed in mod_wsgi 3.0, where it can be set more
appropriately inside of the VirtualHost, although it must come after
the WSGIDaemonProcess directive. This ordering will actually be
enforced, so when you move to mod_wsgi 3.0 in the future, the
configuration file will need to be updated. For this latter issue see:
http://code.google.com/p/modwsgi/issues/detail?id=110
Graham
BTW, for forward compatibility with mod_wsgi 3.0 you could also just
move WSGIDaemonProcess outside of the VirtualHost.
WSGIDaemonProcess xx user=xx group=xx threads=1 processes=25 python-path=/home/xx/lib/python2.5/site-packages
WSGIImportScript /home/xx/deploy/bme.wsgi process-group=xx application-group=%{GLOBAL}
<VirtualHost *>
ServerName xx.com
DocumentRoot /home/xx/src
WSGIProcessGroup xx
WSGIApplicationGroup %{GLOBAL}
WSGIScriptAlias / /home/xx/deploy/bme.wsgi
ErrorLog /var/log/apache2/error.xx.log
LogLevel warn
CustomLog /var/log/apache2/access.xx.log common
ServerSignature On
</VirtualHost>
The implication of doing this is that, technically, WSGIProcessGroup
could select that daemon process from another VirtualHost. If it is
your own system this shouldn't be an issue.
Graham
Ariel, did you find that things improved for your pinax installation though?
Graham
I'm curious, why is this automatically true?
For most sites I would totally agree, but what about the case of an
active site where, even without any long-lived requests, you might
expect 20+ concurrent requests (or a lot more; this could be one of
many app servers behind a load balancer)? When that's the case, is 25
processes still a ridiculous number?
Just trying to understand. ;)
Thanks,
Brett
I guess what I was reacting to was that the configuration was being
passed around without explanation of what sort of setup it was
appropriate for. My concern is that if people pass around such a
configuration, you are going to have all these people who like to
use a VPS with very minimal memory trying to use it, thinking that
they have to.
In practice though, multiply 25 by the 40MB process size of a
reasonable Django application and you get about 1GB, which is quite a
bit more than the minimal 128MB slices that people who don't want to
pay for something bigger tend to use. Even then, 10 processes would be
too much for that sort of setup anyway.
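To make the arithmetic explicit (40MB per process is the rough ballpark above, not a measured figure):

```python
# Rough memory arithmetic for the daemon process counts discussed above.
# 40MB per process is a ballpark for a reasonable Django application.
mb_per_process = 40

print(25 * mb_per_process)  # 1000 MB, i.e. about 1GB for 25 processes
print(10 * mb_per_process)  # 400 MB, still far beyond a 128MB VPS slice
```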
Although I disagree with some of his conclusions, Collin Grady posted
a recent blog entry which is quite informative in some ways.
http://collingrady.wordpress.com/2009/01/06/mod_python-versus-mod_wsgi/
He moved from mod_python to mod_wsgi. For mod_wsgi he used daemon mode
with a single process and single thread. From what he has said, his
site receives quite a bit of traffic, yet a single request handler
thread was sufficient in general.
What you need to remember is that the daemon processes aren't handling
static requests, only dynamic requests, and if you have tuned your
Django application well, your requests should complete in well under a
second. As such, you are going to have quite a lot of slots within a
one-second period for handling requests. Thus even a single process
with a single thread should be able to handle enough requests for
small, and perhaps up to average-sized, sites. In other words, 25
processes is going to be overkill for the majority.
To try and gauge how much you can probably handle, set things up with
a single process and single thread and run 'ab' against a typical URL
with an average response time; just don't use keep-alive. From the
'ab' statistics output you can get an idea of how long a request takes
and thus how many requests per second you can serve. This can then be
a guide as to how many processes may be needed, although be guarded
about assuming this will scale completely linearly, as that is
generally not the case.
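As a rough sketch of that estimate (the numbers here are illustrative, not measurements, and as said above it will not scale perfectly linearly):

```python
import math

def processes_needed(mean_request_seconds, target_requests_per_second):
    """Rough lower bound on single-threaded processes for a target load."""
    per_process_rps = 1.0 / mean_request_seconds
    # Round up; treat this as a starting point, not an exact answer,
    # since scaling is generally not completely linear.
    return math.ceil(target_requests_per_second / per_process_rps)

# e.g. 'ab' reports a mean of 50ms per request and you aim for 30 req/s
print(processes_needed(0.05, 30))  # 2
```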
In summary, as a recommendation: if your application isn't
multithread-safe, I would go so far as to say to start out with only
2 (maybe 3) processes. This will still work on most VPS systems
without killing them. Then look at the actual performance of your
system and gauge how much extra capacity you need.
So yes, some systems may need quite a lot more processes, as well as
horizontal scaling across machines, but giving advice that 25
processes is a starting point is probably not a good idea and will
give people the wrong impression when it crashes their memory-limited
systems.
Graham