I have a configured and running setup that I am looking to optimize. I do not want to swap out Apache for gunicorn or other options at this stage.
My setup is as follows:
- Ubuntu 11.04
- Default nginx from apt-get
- Default apache from apt-get
Nginx serves static files, and passes application requests through to Apache. Apache will have between 5 and 8 Django projects (i.e. distinct websites) with small to medium traffic. Apache only serves Django projects (via mod_wsgi); I don't need PHP or anything else that Django does not need.
Starting from the default Ubuntu Apache install, which modules can I disable, and are there any other configuration tweaks I can make to use the resources on my machine more efficiently?
So we can fill out the full picture, can you post the snippets from
nginx to set it up as proxy to Apache/mod_wsgi, plus the Apache
configuration snippets for mod_wsgi in Apache as well. Also indicate
the MPM Apache is built for and current settings in Apache for
Timeout, KeepAlive, MaxKeepAliveRequests and KeepAliveTimeout, as well
as, for the MPM used, what you have for:
# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_prefork_module>
StartServers 1
MinSpareServers 1
MaxSpareServers 10
MaxClients 150
MaxRequestsPerChild 0
</IfModule>
# worker MPM
# StartServers: initial number of server processes to start
# MaxClients: maximum number of simultaneous client connections
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_worker_module>
StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
</IfModule>
That will give us a good starting point.
Graham
> --
> You received this message because you are subscribed to the Google Groups
> "modwsgi" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/modwsgi/-/-TEVSnX2XKAJ.
> To post to this group, send email to mod...@googlegroups.com.
> To unsubscribe from this group, send email to
> modwsgi+u...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/modwsgi?hl=en.
>
http://code.google.com/p/modwsgi/wiki/CheckingYourInstallation#Apache_Modules_Loaded
Graham
apache2 -M
Now. I am going to go through things to try as a series of emails over
time, as I don't have the state of mind right now to do it all at once.
First up. If not using PHP then disable mod_php.
Second, now that you have got rid of mod_php, switch Apache
installation from prefork MPM to worker MPM.
Third, since you are using mod_wsgi daemon mode, ensure that mod_wsgi
doesn't unnecessarily enable Python interpreters in Apache child
processes. For that read:
http://blog.dscpl.com.au/2009/11/save-on-memory-with-modwsgi-30.html
Changing from prefork to worker MPM and ensuring the Python
interpreters aren't initialised when not needed, will cut down on
memory footprint in the Apache child processes which do the proxying
to mod_wsgi daemon mode processes. This is because instead of
potentially up to 150 processes under excessive load, you would at
most have 6 processes. You have so few now because each of those 6 can
handle 25 concurrent requests in worker MPM, whereas prefork needs a
separate process for each request.
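The arithmetic behind those numbers can be sketched in a few lines of Python. This is only an illustration using the default Ubuntu values quoted earlier in the thread (MaxClients 150, ThreadsPerChild 25). The function names are made up for the example and are not part of Apache or mod_wsgi.

```python
import math


def max_prefork_processes(max_clients):
    # prefork MPM: one single-threaded child process per concurrent request,
    # so the process count can grow all the way up to MaxClients.
    return max_clients


def max_worker_processes(max_clients, threads_per_child):
    # worker MPM: each child process runs threads_per_child threads,
    # so far fewer processes are needed to reach the same MaxClients.
    return math.ceil(max_clients / threads_per_child)


if __name__ == "__main__":
    print(max_prefork_processes(150))     # 150
    print(max_worker_processes(150, 25))  # 6
```

Each of those processes carries its own memory overhead, which is why the drop from a possible 150 to at most 6 matters on a small VPS.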
To try and gauge progress on this before you start, change:
WSGIDaemonProcess site-1 user=my-user group=www-data threads=25
to:
WSGIDaemonProcess site-1 user=my-user group=www-data threads=25 display-name=%{GROUP}
When you do a 'ps' now, the mod_wsgi daemon processes will be named
'(wsgi:site-1)' and can be distinguished easily from Apache root and
child process.
Use 'ps' to look at number of processes and sizes before making the
other changes above. For example, on my Mac OS X box with prefork I
see:
$ ps auxwwww | egrep 'httpd|USER' | grep -v grep
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
_www 58942 0.0 0.1 2440732 2184 ?? S 7:31am 0:08.82 /usr/sbin/httpd -D FOREGROUND
_www 58941 0.0 0.1 2440732 2184 ?? S 7:31am 0:08.83 /usr/sbin/httpd -D FOREGROUND
_www 58940 0.0 0.1 2440732 2196 ?? S 7:31am 0:08.79 /usr/sbin/httpd -D FOREGROUND
_www 58937 0.0 0.1 2440732 2184 ?? S 7:30am 0:08.82 /usr/sbin/httpd -D FOREGROUND
root 58908 0.0 0.0 2439768 1208 ?? Ss 7:30am 0:01.97 /usr/sbin/httpd -D FOREGROUND
_www 61069 0.0 0.1 2440732 2232 ?? S 11:22am 0:05.44 /usr/sbin/httpd -D FOREGROUND
_www 60310 0.0 0.1 2440732 2184 ?? S 9:47am 0:06.62 /usr/sbin/httpd -D FOREGROUND
You would grep for 'apache2' and not 'httpd'.
The 'root' owned process is the Apache parent process and the rest are
the Apache child processes proxying to the mod_wsgi daemon processes.
These would also be serving static files if Apache were being used
for that.
Remember to check this when traffic is flowing and not just when idle, as
Apache will dynamically create more and more processes to meet demand.
You can read a bit about that at:
http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html
So, disable PHP, switch to worker MPM, disable interpreter
initialisation in embedded mode, and restrict code execution in
embedded mode just to catch any configuration stuff ups.
Then capture 'ps' output again and post both so we can compare.
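To make the before and after captures easier to compare, something like the following Python sketch could total up process counts and resident memory per command. It assumes the 'ps aux' column layout shown above (RSS in the sixth column, in kilobytes, and COMMAND as the eleventh), with each process on a single unwrapped line. The function name summarise_ps is made up for this example.

```python
from collections import defaultdict


def summarise_ps(ps_text):
    """Return {(user, command): (process_count, total_rss_kb)} for 'ps aux' style text."""
    totals = defaultdict(lambda: [0, 0])
    for line in ps_text.strip().splitlines():
        # Split into at most 11 fields so the command keeps its arguments.
        fields = line.split(None, 10)
        # Skip the header row and anything that is not a full 11-column row.
        if len(fields) < 11 or fields[0] == "USER":
            continue
        user, rss_kb, command = fields[0], int(fields[5]), fields[10]
        totals[(user, command)][0] += 1
        totals[(user, command)][1] += rss_kb
    return {key: tuple(value) for key, value in totals.items()}


if __name__ == "__main__":
    sample = """\
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
_www 58942 0.0 0.1 2440732 2184 ?? S 7:31am 0:08.82 /usr/sbin/httpd -D FOREGROUND
_www 58941 0.0 0.1 2440732 2184 ?? S 7:31am 0:08.83 /usr/sbin/httpd -D FOREGROUND
root 58908 0.0 0.0 2439768 1208 ?? Ss 7:30am 0:01.97 /usr/sbin/httpd -D FOREGROUND
"""
    for (user, command), (count, rss_kb) in summarise_ps(sample).items():
        print(user, command, count, rss_kb)
```

Grouping by user as well as command keeps the root owned parent separate from the child processes. Once display-name is set, the mod_wsgi daemon processes show up under their own names too.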
BTW, out of the modules you are loading into Apache, can you say
which you think you are using? I note for example that rpaf is being
loaded. Are you actually using that?
After we look at results for changes above, then we can look at what
is the minimum set of Apache modules you need and strip those back.
Graham
How you switch depends on whether you installed Apache from binary
packages from an operating system or repository, or from source code.
Either way, each MPM is a separate Apache binary compiled from source code
with different build time configure options. So, if using binaries,
install the alternate binary version that the operating system package
manager should offer. If from source, then rebuild from source code
and pass '--with-mpm=worker' to 'configure' for Apache.
Most third party Apache modules shouldn't need recompiling when you
switch MPM, but some may. For mod_wsgi the binary should be okay when
going between worker MPM and prefork MPM. You can't use that binary
interchangeably, though, if you were to switch to an MPM such as ITK or
similar.
Graham
I am still looking into the Apache recompilation on Ubuntu (I am using aptitude) in order to complete the steps. I haven't yet found a guide on doing this that is clear to me.
WSGIRestrictEmbedded On
And my currently enabled modules are (I disabled rpaf, and Ubuntu installed php5-cgi with the mpm-worker module automatically):
alias auth_basic authn_file authz_default authz_groupfile authz_host authz_user autoindex cgi cgid deflate dir env mime negotiation reqtimeout setenvif status wsgi
Thanks Graham, I really appreciate the help.
The context: 512 MB VPS on Linode, running Ubuntu 11.04.
Here are the snippets:
A virtual host config from nginx:
server {
    listen ip.adress:80;
    server_name example.com www.example.com;
    access_log /srv/logs/example.com/nginx-access.log;
    error_log /srv/logs/example.com/nginx-error.log error;
    location / {
        proxy_pass http://127.0.0.1:8080;
        include /etc/nginx/proxy.conf;
    }
}
# app1
WSGIDaemonProcess app1.domain.com display-name=app1 python-path=/srv/www/app1.domain.com/company-app1:/srv/www/app1.domain.com/lib/python2.7/site-packages
WSGIProcessGroup app1.domain.com
WSGIScriptAlias / /srv/www/app1.domain.com/company-app1/app1/wsgi.py
# app1.test
WSGIDaemonProcess app1.test.domain.com display-name=app1-test python-path=/srv/www/app1.test.domain.com/company-app1:/srv/www/app1.test.domain.com/lib/python2.7/site-packages
WSGIProcessGroup app1.test.domain.com
WSGIScriptAlias / /srv/www/app1.test.domain.com/company-app1/app1/wsgi.py
# app2
WSGIDaemonProcess app2.domain.com processes=3 display-name=app2 python-path=/srv/www/app2.domain.com/company-app2:/srv/www/app2.domain.com/lib/python2.7/site-packages
WSGIProcessGroup app2.domain.com
WSGIApplicationGroup %{GLOBAL}
WSGIScriptAlias / /srv/www/app2.domain.com/company-app2/app2/config/wsgi.py
WSGIPassAuthorization On
# app2.test
WSGIDaemonProcess app2.test.domain.com display-name=app2-test python-path=/srv/www/app2.test.domain.com/company-app2:/srv/www/app2.test.domain.com/lib/python2.7/site-packages
WSGIProcessGroup app2.test.domain.com
WSGIScriptAlias / /srv/www/app2.test.domain.com/company-app2/app2/config/wsgi.py
WSGIPassAuthorization On
# app3
WSGIDaemonProcess app2.domain.com display-name=app3 python-path=/srv/www/app2.domain.com/company-app2/api:/srv/www/app2.domain.com/lib/python2.7/site-packages
WSGIProcessGroup app2.domain.com
WSGIScriptAlias /api /srv/www/app2.domain.com/company-app2/api/project/wsgi.py
# app3.test
WSGIDaemonProcess app2.test.domain.com display-name=app3-test python-path=/srv/www/app2.test.domain.com/company-app2/api:/srv/www/app2.test.domain.com/lib/python2.7/site-packages
WSGIProcessGroup app2.test.domain.com
WSGIScriptAlias /api /srv/www/app2.test.domain.com/company-app2/api/project/wsgi.py
Timeout 60
KeepAlive Off
MaxKeepAliveRequests 100
KeepAliveTimeout 15
<IfModule prefork.c>
StartServers 8
MinSpareServers 5
MaxSpareServers 20
ServerLimit 256
MaxClients 256
MaxRequestsPerChild 4000
</IfModule>
<IfModule worker.c>
StartServers 4
MaxClients 300
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
</IfModule>
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
apache 21998 0.1 0.0 488888 7300 ? Sl 22:40 0:00 app1
apache 21999 0.0 0.0 488888 7300 ? Sl 22:40 0:00 app1-test
apache 22000 35.7 2.9 1033412 237960 ? Sl 22:40 3:46 app2
apache 22001 40.8 2.5 967876 208024 ? Sl 22:40 4:18 app2
apache 22002 45.4 3.5 1098948 282940 ? Sl 22:40 4:48 app2
apache 22003 1.1 0.0 488888 7300 ? Sl 22:40 0:03 app2-test
apache 22004 0.4 0.0 488888 7296 ? Sl 22:40 0:01 app3
apache 22005 0.0 0.0 488888 7284 ? Sl 22:40 0:00 app3-test
...
On 9 Jul 2015, at 7:47 am, Brian Littmann <br...@newmedio.com> wrote:
Howdy. Glad I found this thread, and thanks so much, Graham, for all the support you give to the community. I've been reading so many SO and SF questions and your replies are everywhere.
I must admit, I was in the "naive Apache" crowd not too long ago and our server recently screeched to a halt. We just bumped up the server specs, but there are still some configuration kinks to work out, which I have some questions about.
Our server/app situation:
- CentOS 6.3 server with 8GB RAM and a quad-core processor
- Default Apache 2.2 install using mpm-prefork (gross, I know)
- Hosting 6 internal Django apps (3 production, 3 staging) - Not too traffic heavy (maybe 20 req/sec total), but data heavy
- MySQL db backend and an external MySQL db on another server
- Already done:
- All apps are served via mod_wsgi in daemon mode, each set up in a separate VirtualHost
- Using "WSGIRestrictEmbedded On" in global conf
The to-do list:
- Uninstall unused Apache modules (probably based on this list: http://haydenjames.io/strip-apache-improve-performance-memory-efficiency/)
- Swap out MPM prefork for worker (see question below about thread safety)
- Set up nginx proxy for serving static assets (if it's really worth it)
- Extra caching in one of the apps
Questions:
- A) Should I be setting "WSGIApplicationGroup %{GLOBAL}" in all of my virtual hosts so that all apps run under the same Python interpreter?
- B) How can I tell if the Python/Django code is thread-safe for using MPM worker?
- Would I need to make sure multiple db writes in a function are wrapped in a transaction?
- For reference: http://stackoverflow.com/a/20491426/720054
- Apps are on Django 1.5, Django 1.6, and one is an API using Django REST Framework.
- C) When MPM worker is used, how should the corresponding httpd (apache2) conf be set up? I've read your math somewhere, Graham, but I also saw you go for about 25% more threads for Apache than total for mod_wsgi daemon processes.
On Tuesday, August 23, 2011 at 7:57:45 PM UTC-5, Graham Dumpleton wrote:
Just a quick message to say I haven't forgotten about this discussion thread. I have been quite busy with work, plus I was at Sydney PyCon over the weekend and the sprint days after that. Life should return to semi normal tomorrow.
I would be interested to hear if you are already seeing better memory usage just by switching to worker MPM, ensuring you are using daemon mode, and turning off interpreters in the Apache child processes.
Graham
So being stuck with mod_wsgi 3.2 is not good at all and you could be seeing extra memory use, possibly quite significant, as a result.
Even if Apache 2.4 could be used, using at least mod_wsgi version 4.4.0 or later is recommended. Version 4.4.12 or later if your WSGI application is for some reason streaming a lot of data in small blocks.
By data heavy do you mean the amount of response content returned, or by the amount of processing done in the WSGI application process itself?
- Hosting 6 internal Django apps (3 production, 3 staging) - Not too traffic heavy (maybe 20 req/sec total), but data heavy
- MySQL db backend and an external MySQL db on another server
Are the applications primarily I/O bound waiting on the database or other backend services, or are they running CPU intensive tasks in process?
Knowing the balance is important because for CPU bound tasks, the impact of the Python global interpreter lock (GIL) can be significant and this can dictate how much you should be relying on multiple processes rather than threads for concurrency.
Questions:
- B) How can I tell if the Python/Django code is thread-safe for using MPM worker?
- Would I need to make sure multiple db writes in a function are wrapped in a transaction?
- For reference: http://stackoverflow.com/a/20491426/720054
- Apps are on Django 1.5, Django 1.6, and one is an API using Django REST Framework.
If you had multithreading issues in your code I think you would know already. This is because with your configuration using mod_wsgi daemon mode, you are already running your WSGI applications in a multithread configuration with potential for more than one request to be handled concurrently in the same process. The default number of threads per daemon process with raw mod_wsgi is 15. Thus each Django instance is currently running with processes consisting of 15 threads.
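As a rough way to see how many threads each daemon process group is running, here is a small Python sketch that reads one WSGIDaemonProcess line and fills in the defaults when the options are left out. The 15 thread default is the one mentioned above. The single process default is my understanding of the mod_wsgi default when 'processes' is not given. The function name daemon_capacity is made up for this example, and the parsing is naive (it splits on whitespace, so quoted option values containing spaces are not handled).

```python
def daemon_capacity(directive):
    """Return (group_name, processes, threads) for one WSGIDaemonProcess line."""
    parts = directive.split()
    if len(parts) < 2 or parts[0] != "WSGIDaemonProcess":
        raise ValueError("not a WSGIDaemonProcess directive: %r" % directive)
    name = parts[1]
    # Collect the key=value options that follow the group name.
    options = dict(p.split("=", 1) for p in parts[2:] if "=" in p)
    # Defaults assumed when the options are omitted: 1 process, 15 threads.
    processes = int(options.get("processes", 1))
    threads = int(options.get("threads", 15))
    return name, processes, threads


if __name__ == "__main__":
    line = "WSGIDaemonProcess app2.domain.com processes=3 display-name=app2"
    name, processes, threads = daemon_capacity(line)
    print(name, processes * threads)  # app2.domain.com 45
```

So a group declared with processes=3 and no threads option can be handling up to 45 requests at once, 15 in each process.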
This is the same whether you are using prefork or worker MPM. The MPM only affects whether the Apache child worker processes are using multithreading, not the mod_wsgi daemon processes, the configuration of which is dictated by the WSGIDaemonProcess directive.
- C) When MPM worker is used, how should the corresponding httpd (apache2) conf be set up? I've read your math somewhere, Graham, but I also saw you go for about 25% more threads for Apache than total for mod_wsgi daemon processes.
As a rough guess that is a good place to start where you only have the one WSGI application running in a daemon process group. Because you have more than one, you will likely want to increase that.
What you set this to really depends on how large the responses you are serving up are and how slow your HTTP clients are. If you deal with a lot of mobile clients, which can be slow to talk to, you might want more capacity in the child worker processes doing the proxying or serving up static content.
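To put a number on that starting point, here is a rough Python sketch of the "about 25% more Apache threads than daemon threads" rule of thumb from the question, rounded up to whole worker MPM child processes. It is a first guess only, not a recommendation. The function name and the 120 thread total (eight daemon processes at the default 15 threads each) are assumptions made for the example.

```python
import math


def suggest_worker_settings(total_daemon_threads, threads_per_child=25, headroom=0.25):
    """Rough worker MPM sizing: daemon threads plus headroom, in whole child processes."""
    wanted_threads = total_daemon_threads * (1 + headroom)
    # Apache can only add capacity one child process at a time, so round up.
    child_processes = math.ceil(wanted_threads / threads_per_child)
    return {
        "ThreadsPerChild": threads_per_child,
        "MaxClients": child_processes * threads_per_child,
        "child_processes": child_processes,
    }


if __name__ == "__main__":
    # Assumed total: 8 daemon processes x 15 threads each = 120 threads.
    print(suggest_worker_settings(120))
```

With slow clients or large responses, raising the headroom argument gives the extra proxying capacity mentioned above.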
The problem with creating too many child worker processes is that if you start to get backlogged, it can make the backlog harder to clear, as clients can still connect rather than getting a connection failure and having the request fail. More recent mod_wsgi versions offer various timeouts to cope with backlogging, which clear a backlog by failing requests before they get to the WSGI application. This prevents the server being completely overwhelmed and unresponsive in such a situation, with the WSGI application still handling requests that the client has likely already given up on.
As a first pass, hope this helps. We can keep going as need be. The issue remaining is how CPU intensive your tasks are, how long the response times are, and so how much of the capacity of the daemon processes is being used. If those CPU usage figures under load are getting up to 75% or more for a single process, I would definitely be a bit worried and would need to look at that more closely.