Hi all,
We're deploying a web service on Ubuntu 14.04 AWS instances with Apache 2.4, mod_wsgi, and Django 1.6. Most requests return nothing but an HTTP response code, while a few return small (under 1 KB) XML payloads. There is no static content being served, and we make no use of the ORM.
I am trying to tune Apache and mod_wsgi for a large number of clients and high throughput. (The machines have 8 CPUs and 15 GB of RAM.) My Apache config looks something like this:
<VirtualHost *:80>
    ServerName example.com

    WSGIDaemonProcess webservice processes=30 threads=100 display-name=%{GROUP}
    WSGIProcessGroup webservice
    WSGIScriptAlias / "/home/ubuntu/webservice/wsgi.py"

    <Directory />
        Options All
        AllowOverride All
        Require all granted
    </Directory>
</VirtualHost>
This works fine. However, when we use the Apache benchmark tool ab to throw heavy load at the service, we see something strange. The Apache servers sit behind an Elastic Load Balancer (ELB) at AWS. All requests go to a Django view that simply returns a 200 response code with no HTML content, and all middleware has been disabled. According to the Apache logs all requests take under a second, yet according to the load balancer logs a very few requests take upwards of a minute. We have set ab to simulate 1500 simultaneous users constantly making requests.

I suspect the ELB can handle all the traffic we're throwing at it, but that there aren't enough worker threads in Apache to hand the requests off to mod_wsgi. I've tried setting "MaxRequestWorkers 3000", although I've read that the MPM settings don't really apply to mod_wsgi in daemon mode. Still, it seems like there must be some Apache settings that would allow more clients to be serviced at the same time.
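To narrow down where the missing minute goes, one thing we may try is logging per-request service times on the Apache side. This is just a sketch using the standard %D token from mod_log_config (the "timing" nickname is mine):

# %D logs the time taken to serve each request, in microseconds. Note that
# it only starts counting once Apache has read the request, so time a
# connection spends waiting in the kernel's listen backlog (which is what
# I suspect is happening) would not show up here.
LogFormat "%h %l %u %t \"%r\" %>s %b %D" timing
CustomLog ${APACHE_LOG_DIR}/timing.log timing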
What Apache settings do I need to change to allow more throughput to mod_wsgi? Apache is running the event MPM. Do I also need to play with ThreadsPerChild? ThreadLimit?
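For reference, this is the kind of mpm_event block I've been experimenting with. The numbers are illustrative guesses on my part, sized so that ServerLimit x ThreadsPerChild covers MaxRequestWorkers (and ThreadLimit has to be raised above its default of 64 for ThreadsPerChild to reach 100):

<IfModule mpm_event_module>
    # Illustrative sizing only: ThreadLimit must be >= ThreadsPerChild,
    # and ServerLimit x ThreadsPerChild must be >= MaxRequestWorkers.
    ServerLimit            30
    ThreadLimit           100
    ThreadsPerChild       100
    MaxRequestWorkers    3000
    # 0 disables periodic recycling of child processes.
    MaxConnectionsPerChild  0
</IfModule>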
Also, I'd appreciate a sanity check here. If allowing 3000 clients per machine is not recommended, we can switch to a larger number of smaller machines. Or, if these machines should be able to handle more simultaneous requests, we can try higher numbers. I suspect that with a very high number of threads we'll run into GIL contention, but I honestly don't know. We are using New Relic to gather performance data, and occasionally we see innocuous lines of Python taking seconds to execute.
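If GIL contention does turn out to be the problem, my understanding is that we could keep the same total capacity while shifting toward more processes and fewer threads per process, something like this (illustrative numbers only):

# 120 processes x 25 threads still allows 3000 concurrent requests, but each
# Python interpreter now runs at most 25 threads, so there should be less GIL
# contention within any one process (at the cost of more memory).
WSGIDaemonProcess webservice processes=120 threads=25 display-name=%{GROUP}

Does that reasoning sound right?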
Thanks,
Russ