Running Django on Apache Behind nginx - What Apache optimizations can I make?

Paul Walsh

Aug 11, 2011, 8:46:18 AM

to mod...@googlegroups.com

I asked this on Stack Overflow (http://stackoverflow.com/questions/7022614/running-django-on-apache-behind-nginx-what-apache-optimizations-can-i-make), and it was suggested I ask here. So, here goes:

I have a configured and running setup that I am looking to optimize. I do not want to swap out Apache for gunicorn or other options at this stage.

My setup is so:

Ubuntu 11.04
Default nginx from apt-get
Default Apache from apt-get

Nginx serves static files and passes application requests through to Apache. Apache will host between 5 and 8 Django projects (i.e. distinct websites) with small to medium traffic. Apache only serves Django projects (via mod_wsgi), so I don't need PHP or anything else that Django does not need.

From the default Ubuntu/Apache install, which mods can I disable, and are there any other configuration tweaks I can make to use the resources on my machine more efficiently?

Graham Dumpleton

Aug 11, 2011, 6:59:19 PM

to mod...@googlegroups.com

Hi Paul. Thanks for bringing the discussion over here. We haven't done
this exercise for a while and probably a good thing to recap and also
include latest ideas. Maybe this time I will collect all the details
together and make a document of it somewhere.

So we can fill out the full picture, can you post the snippets from
nginx to set it up as proxy to Apache/mod_wsgi, plus the Apache
configuration snippets for mod_wsgi in Apache as well. Also indicate
the MPM Apache is built for and current settings in Apache for
Timeout, KeepAlive, MaxKeepAliveRequests and KeepAliveTimeout, as well
as, for the MPM used, what you have for:

# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_prefork_module>
StartServers 1
MinSpareServers 1
MaxSpareServers 10
MaxClients 150
MaxRequestsPerChild 0
</IfModule>

# worker MPM
# StartServers: initial number of server processes to start
# MaxClients: maximum number of simultaneous client connections
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_worker_module>
StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
</IfModule>

That will give us a good starting point.

Graham

> --
> You received this message because you are subscribed to the Google Groups
> "modwsgi" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/modwsgi/-/-TEVSnX2XKAJ.
> To post to this group, send email to mod...@googlegroups.com.
> To unsubscribe from this group, send email to
> modwsgi+u...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/modwsgi?hl=en.
>

Graham Dumpleton

Aug 11, 2011, 7:41:16 PM

to mod...@googlegroups.com

Also, what modules Apache is currently loading.

http://code.google.com/p/modwsgi/wiki/CheckingYourInstallation#Apache_Modules_Loaded

Graham

Paul Walsh

Aug 12, 2011, 2:30:08 PM

to mod...@googlegroups.com

Thanks Graham, I really appreciate the help.

The context:

512mb VPS on Linode, running Ubuntu 11.04

Here are the snippets:

A virtual host config from nginx:

server {
    listen ip.address:80;
    server_name example.com www.example.com;

    access_log /srv/logs/example.com/nginx-access.log;
    error_log /srv/logs/example.com/nginx-error.log error;

    location / {
        proxy_pass http://127.0.0.1:8080;
        include /etc/nginx/proxy.conf;
    }
}

server {
    listen ip.address:80;
    root /srv/static/example.com;
    server_name assets.example.com;

    access_log /srv/logs/example.com/nginx-access.log;
    error_log /srv/logs/example.com/nginx-error.log error;

    location / {
        expires 30d;
    }
}

Apache virtual host for the same site:

<VirtualHost *:8080>
    ServerAdmin m...@example.com
    ServerName example.com
    ServerAlias www.example.com

    DocumentRoot /srv/src/example.com_venv/apache

    ErrorLog /srv/logs/example.com/error.log
    CustomLog /srv/logs/example.com/access.log combined

    WSGIScriptAlias / /srv/src/example.com_venv/apache/django.wsgi
    WSGIDaemonProcess site-1 user=my-user group=www-data threads=25
    WSGIProcessGroup site-1
</VirtualHost>

Ok, and in my apache2.conf file

KeepAlive On
Timeout 300
MaxKeepAliveRequests 100
KeepAliveTimeout 15

And..

# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_prefork_module>
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 150
MaxRequestsPerChild 0
</IfModule>

# worker MPM
# StartServers: initial number of server processes to start
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadLimit: ThreadsPerChild can be changed to this maximum value during a
#              graceful restart. ThreadLimit can only be changed by stopping
#              and starting Apache.
# ThreadsPerChild: constant number of worker threads in each server process
# MaxClients: maximum number of simultaneous client connections
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_worker_module>
StartServers 2
MinSpareThreads 25
MaxSpareThreads 75
ThreadLimit 64
ThreadsPerChild 25
MaxClients 150
MaxRequestsPerChild 0
</IfModule>

Then, I couldn't find another way to check enabled mods on ubuntu, except actually look at the mods-enabled directory.

There I have:

authz_user.load
rpaf.load
mime.load
autoindex.load
setenvif.load
cgi.load
negotiation.load
auth_basic.load
status.load
authn_file.load
deflate.load
php5.load
authz_default.load
wsgi.load
authz_groupfile.load
dir.load
reqtimeout.load
authz_host.load
env.load

And the associated conf files.

I couldn't work out on Ubuntu how to check if I have prefork MPM or worker MPM.

thanks.

Paul Walsh

Aug 12, 2011, 3:21:24 PM

to mod...@googlegroups.com

Oh, ok, to check MPM on Ubuntu was as simple as apache2 -V.

That shows me:

Server MPM: Prefork
threaded: no
forked: yes (variable process count)
Server compiled with....
-D APACHE_MPM_DIR="server/mpm/prefork"
-D APR_HAS_SENDFILE
-D APR_HAS_MMAP
-D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
-D APR_USE_SYSVSEM_SERIALIZE
-D APR_USE_PTHREAD_SERIALIZE
-D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
-D APR_HAS_OTHER_CHILD
-D AP_HAVE_RELIABLE_PIPED_LOGS
-D DYNAMIC_MODULE_LIMIT=128
-D HTTPD_ROOT="/etc/apache2"
-D SUEXEC_BIN="/usr/lib/apache2/suexec"
-D DEFAULT_PIDLOG="/var/run/apache2.pid"
-D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
-D DEFAULT_LOCKFILE="/var/run/apache2/accept.lock"
-D DEFAULT_ERRORLOG="logs/error_log"
-D AP_TYPES_CONFIG_FILE="mime.types"
-D SERVER_CONFIG_FILE="apache2.conf"
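For anyone wanting to script this check, the MPM line can be pulled out of the `apache2 -V` text. This is only a minimal sketch: the function name `parse_server_mpm` and the sample string are made up for illustration, and it assumes the output contains a `Server MPM:` line like the one shown above.

```python
import re


def parse_server_mpm(version_output):
    """Return the MPM name (lower case) from `apache2 -V` text, or None."""
    match = re.search(r"^\s*Server MPM:\s*(\S+)", version_output, re.MULTILINE)
    return match.group(1).lower() if match else None


# Sample text copied from the output above.
sample = """Server MPM: Prefork
threaded: no
forked: yes (variable process count)
"""

print(parse_server_mpm(sample))
```

How the text is obtained (for example by capturing the command's output with `subprocess`) is left to the reader, since the binary name differs between distributions.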

Graham Dumpleton

Aug 13, 2011, 5:56:27 AM

to mod...@googlegroups.com

You should be able to get the list of modules actually being loaded by using
the -M option to Apache. Thus:

apache2 -M

Now. I am going to go through things to try as a series of emails over
time as don't have state of mind now to try and do it all at once.

First up. If not using PHP then disable mod_php.

Second, now that you have got rid of mod_php, switch Apache
installation from prefork MPM to worker MPM.

Third, since you are using mod_wsgi daemon mode, ensure that mod_wsgi
doesn't unnecessarily enable Python interpreters in Apache child
processes. For that read:

http://blog.dscpl.com.au/2009/11/save-on-memory-with-modwsgi-30.html

Changing from prefork to worker MPM and ensuring the Python
interpreters aren't initialised when not needed, will cut down on
memory footprint in the Apache child processes which do the proxying
to mod_wsgi daemon mode processes. This is because instead of
potentially up to 150 processes under excessive load, you would at
most have 6 processes. You have so few now because each of those 6 can
handle 25 concurrent requests in worker MPM, whereas prefork needed a
separate process for each request.
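The arithmetic behind the 150 versus 6 figures can be written out as follows. This is a minimal sketch using the MaxClients and ThreadsPerChild values posted earlier in the thread; the function name is hypothetical.

```python
import math


def max_child_processes(max_clients, threads_per_child=1):
    """Upper bound on Apache child processes for a given MaxClients.

    prefork serves one request per process (threads_per_child=1);
    worker serves ThreadsPerChild requests per process.
    """
    return math.ceil(max_clients / threads_per_child)


prefork_limit = max_child_processes(150)       # MaxClients 150, prefork
worker_limit = max_child_processes(150, 25)    # MaxClients 150, ThreadsPerChild 25
print(prefork_limit, worker_limit)  # 150 6
```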

To try and gauge progress on this before you start, change:

WSGIDaemonProcess site-1 user=my-user group=www-data threads=25

to:

WSGIDaemonProcess site-1 user=my-user group=www-data threads=25 display-name=%{GROUP}

When you do a 'ps' now, the mod_wsgi daemon processes will be named
'(wsgi:site-1)' and can be distinguished easily from Apache root and
child process.

Use 'ps' to look at number of processes and sizes before making the
other changes above. For example, on my Mac OS X box with prefork I
see:

$ ps auxwwww | egrep 'httpd|USER' | grep -v grep
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
_www 58942 0.0 0.1 2440732 2184 ?? S 7:31am 0:08.82 /usr/sbin/httpd -D FOREGROUND
_www 58941 0.0 0.1 2440732 2184 ?? S 7:31am 0:08.83 /usr/sbin/httpd -D FOREGROUND
_www 58940 0.0 0.1 2440732 2196 ?? S 7:31am 0:08.79 /usr/sbin/httpd -D FOREGROUND
_www 58937 0.0 0.1 2440732 2184 ?? S 7:30am 0:08.82 /usr/sbin/httpd -D FOREGROUND
root 58908 0.0 0.0 2439768 1208 ?? Ss 7:30am 0:01.97 /usr/sbin/httpd -D FOREGROUND
_www 61069 0.0 0.1 2440732 2232 ?? S 11:22am 0:05.44 /usr/sbin/httpd -D FOREGROUND
_www 60310 0.0 0.1 2440732 2184 ?? S 9:47am 0:06.62 /usr/sbin/httpd -D FOREGROUND

On Ubuntu you would grep for 'apache2' rather than 'httpd'.

The 'root' owned process is the Apache parent process and the rest are
the Apache child processes proxying to the mod_wsgi daemon processes.
They would also be serving static files if Apache were being used for
that.

Remember to check this when traffic is flowing and not just when idle, as
Apache will dynamically create more and more processes to meet demand.
You can read a bit about that at:

http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html

So, disable PHP, switch to worker MPM, disable interpreter
initialisation in embedded mode and restrict code execution in
embedded mode just to catch any configuration stuff ups.

Then capture 'ps' output and post both so we can compare.

BTW, out of the modules you are loading into Apache, can you say
which you think you are using? I note for example that rpaf is being
loaded. Are you actually using that?

After we look at results for changes above, then we can look at what
is the minimum set of Apache modules you need and strip those back.

Graham


D_bot

Aug 14, 2011, 12:21:13 PM

to modwsgi

I'm a bit hazy on how to switch from prefork MPM to worker MPM.

Since the results from Paul's apache2 -V command seem to show his Apache
server was compiled with the prefork option, does this mean a
recompilation to include worker.c is needed?


D_bot

Aug 14, 2011, 4:33:49 PM

to modwsgi

never mind -- i didn't read far enough

Graham Dumpleton

Aug 14, 2011, 7:52:06 PM

to mod...@googlegroups.com

I know you worked it out, but for the benefit of others I will explain below.

How you switch depends on whether you installed Apache from the binary
packages of an operating system repository or from source code.

Either way, Apache for each MPM is a separate binary compiled from
source code with different build time configure options. So, if using
binaries, install the alternate binary version that the operating
system package manager should offer. If building from source, rebuild
from source code and pass '--with-mpm=worker' to 'configure' for Apache.

Most third party Apache modules shouldn't need recompiling when you
switch MPM, but some may. For mod_wsgi the binary should be okay when
going between worker MPM and prefork MPM. You can't use that binary
interchangeably, though, if you were to switch to an MPM such as ITK or
similar.

Graham

Paul Walsh

Aug 16, 2011, 2:38:08 AM

to mod...@googlegroups.com

I am still looking into the Apache recompilation on Ubuntu (I am using aptitude) in order to complete the steps. I haven't found a clear guide (clear for me) on doing this yet.

Douglas Epling

Aug 16, 2011, 8:39:53 AM

to mod...@googlegroups.com

Yeah, this is turning out to be no trivial task on Fedora either. I have gone so far as to download the source rpm, extract and install it so I could access the httpd.spec along with configuration files. And when I run ./configure --with-mpm=worker it complains about not finding configure in the BUILD/httpd-2.2.17/pcre directory. And I am of the notion this is a bug in the httpd.spec script.

Once in a while, life would be so much easier if we compiled and installed from source instead of relying on the binary repositories.

I am waiting for word from my distro.


--
Douglas Epling
President/Owner
YourHelpDesk.us, LLC

D_bot

Aug 16, 2011, 8:49:41 AM

to modwsgi

Okay, word from RedHat is to edit the /etc/sysconfig/httpd file to
allow worker MPM. Maybe it is this simple on Ubuntu.


Paul Walsh

Aug 16, 2011, 5:16:12 PM

to mod...@googlegroups.com

Ok, it was easy enough in Ubuntu to change to the worker module via the repository:

sudo apt-get install apache2-mpm-worker

Removes prefork and installs worker. I've tested and no apparent resulting issues. I'll now move on to the other optimisations.

Paul Walsh

Aug 16, 2011, 5:46:02 PM

to mod...@googlegroups.com

Ok, so the updated environment:

I am running mpm-worker:

me@machine:/etc/apache2$ apache2 -l
Compiled in modules:
core.c
mod_log_config.c
mod_logio.c
worker.c
http_core.c
mod_so.c

me@machine:/etc/apache2$ apache2 -V
Server version: Apache/2.2.17 (Ubuntu)
Server built: Feb 22 2011 18:34:09
Server's Module Magic Number: 20051115:25
Server loaded: APR 1.4.2, APR-Util 1.3.9
Compiled using: APR 1.4.2, APR-Util 1.3.9
Architecture: 32-bit
Server MPM: Worker
threaded: yes (fixed thread count)
forked: yes (variable process count)
Server compiled with....
-D APACHE_MPM_DIR="server/mpm/worker"
-D APR_HAS_SENDFILE
-D APR_HAS_MMAP
-D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
-D APR_USE_SYSVSEM_SERIALIZE
-D APR_USE_PTHREAD_SERIALIZE
-D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
-D APR_HAS_OTHER_CHILD
-D AP_HAVE_RELIABLE_PIPED_LOGS
-D DYNAMIC_MODULE_LIMIT=128
-D HTTPD_ROOT="/etc/apache2"
-D SUEXEC_BIN="/usr/lib/apache2/suexec"
-D DEFAULT_PIDLOG="/var/run/apache2.pid"
-D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
-D DEFAULT_ERRORLOG="logs/error_log"
-D AP_TYPES_CONFIG_FILE="mime.types"
-D SERVER_CONFIG_FILE="apache2.conf"

I have added the following to my apache2.conf:

WSGIRestrictEmbedded On

And my currently enabled modules are (I disabled rpaf, and Ubuntu installed php5-cgi with the mpm-worker module automatically):

alias auth_basic authn_file authz_default authz_groupfile authz_host authz_user autoindex cgi cgid deflate dir env mime negotiation reqtimeout setenvif status wsgi

Douglas Epling

Aug 22, 2011, 11:39:14 PM

to mod...@googlegroups.com

On Fri, Aug 12, 2011 at 2:30 PM, Paul Walsh <pauly...@gmail.com> wrote:

Thanks Graham, I really appreciate the help.

The context:

512mb VPS on Linode, running Ubuntu 11.04

Here are the snippets:

A virtual host config from nginx:

server {
listen ip.address:80;
server_name example.com www.example.com;

access_log /srv/logs/example.com/nginx-access.log;
error_log /srv/logs/example.com/nginx-error.log error;

location / {
proxy_pass http://127.0.0.1:8080;
include /etc/nginx/proxy.conf;
}
}

Shouldn't we be telling nginx something like:

    location ~ \.wsgi$ {
        include /etc/nginx/proxy.conf;
        proxy_pass http://localhost:8080;
    }
?

And couldn't this be just another "location" directive within the default server directive? In other words I am thinking my one nginx server, and its replications, can either serve the request for static files or forward the request to apache, depending on which "location" directive best fits.


Graham Dumpleton

Aug 23, 2011, 8:57:45 PM

to mod...@googlegroups.com

Just a quick message to say I haven't forgotten about this discussion thread. Have been quite busy with work plus was at Sydney PyCon over the weekend and sprint days after that. Life should return to semi normal tomorrow.

Would be interested to hear if you are already seeing better memory usage just by switching to worker and ensuring using daemon mode and turning off interpreters in apache child processes.

Graham


Brian Littmann

Jul 8, 2015, 5:48:45 PM

to mod...@googlegroups.com

Howdy. Glad I found this thread, and thanks so much, Graham, for all the support you give to the community. I've been reading so many SO and SF questions and your replies are everywhere.

I must admit, I was in the "naive Apache" crowd not too long ago and our server recently screeched to a halt. We just bumped up the server specs, but there are still some configuration kinks to work out, which I have some questions about.

Our server/app situation:

CentOS 6.3 server with 8GB RAM and a quad-core processor
Default Apache 2.2 install using mpm-prefork (gross, I know)
Hosting 6 internal Django apps (3 production, 3 staging) - Not too traffic heavy (maybe 20 req/sec total), but data heavy
MySQL db backend and an external MySQL db on another server

Already done:

All apps are served via mod_wsgi in daemon mode, each set up in a separate VirtualHost
Using "WSGIRestrictEmbedded On" in global conf

The to-do list:

Uninstall unused Apache modules (probably based on this list: http://haydenjames.io/strip-apache-improve-performance-memory-efficiency/)
Swap out MPM prefork for worker (see question below about thread safety)
Set up nginx proxy for serving static assets (if it's really worth it)
Extra caching in one of the apps

Questions:

A) Should I be setting "WSGIApplicationGroup %{GLOBAL}" in all of my virtual hosts so that all apps run under the same Python interpreter?
B) How can I tell if the Python/Django code is thread-safe for using MPM worker?

Would I need to make sure multiple db writes in a function are wrapped in a transaction?
For reference: http://stackoverflow.com/a/20491426/720054
Apps are on Django 1.5, Django 1.6, and one is an API using Django REST Framework.

C) When MPM worker is used, how should the corresponding httpd (apache2) conf be set up? I've read your math somewhere, Graham, but I also saw you go for about 25% more threads for Apache than total for mod_wsgi daemon processes.
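The rule of thumb recalled in question C (Apache threads at roughly 25% above the total mod_wsgi daemon threads) can be written as arithmetic. This is a minimal sketch, not a confirmed recommendation: the function name and the example layout are hypothetical. The only figure not taken from the question is the 15 threads per group, which is the mod_wsgi default when `threads=` is not given.

```python
import math


def apache_threads_needed(daemon_groups, headroom=0.25):
    """Return (total daemon threads, Apache threads with headroom).

    daemon_groups: iterable of (processes, threads) pairs, one per
    WSGIDaemonProcess directive.
    """
    total = sum(processes * threads for processes, threads in daemon_groups)
    return total, math.ceil(total * (1 + headroom))


# Hypothetical layout: two groups of 1 process x 15 threads, one of 3 x 15.
groups = [(1, 15), (1, 15), (3, 15)]
total, apache_threads = apache_threads_needed(groups)
print(total, apache_threads)  # 75 94
```

The second number would then be a candidate for MaxClients under worker MPM, rounded up to a multiple of ThreadsPerChild.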

Jason Garber

Jul 8, 2015, 5:57:25 PM

to mod...@googlegroups.com

Hi Brian,

Here is an example nginx config for you. Yes, using nginx is always worth it. Between request/response buffering, and insane speed, it's great.

Also - here is another tip... https://github.com/appcove/acn-linux is a set of tools for setting up a Centos/RHEL server. It has scripts within for installing and configuring both nginx, mod_rpaf (so apache plays nice behind a proxy with regard to REMOTE_ADDR), and lots of other stuff.

Lastly, it also contains ConfStruct, which is a python system for building out nginx and apache files in any development or production environment. A single file in your project will define the config needs for that project, and then the `cst` command will build all of your config files for that project based on your local environment. The following nginx output came from ConfStruct.

server
{
    listen 192.168.50.12:80;
    listen 192.168.50.12:443 ssl;
    server_name www--participantinsights--com--jason.step.appcove.net;

    ssl_certificate ssl/WILDCARD.step.appcove.net.crt;
    ssl_certificate_key ssl/WILDCARD.step.appcove.net.key;

    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Scheme $scheme;

    client_max_body_size 1000m;

    location ~ \.(py|pyc|pyo|wsgi)$
    {
        return 403;
    }

    location ~ \.([a-zA-Z0-9])+$ #NOTE: this is development settings. Production is different.
    {
        root /home/jason/DevLevel.2/PartIn/Web/Main;
        add_header Cache-Control 'no-cache, no-store, max-age=0, must-revalidate';
        add_header Expires 'Thu, 01 Jan 1970 00:00:01 GMT';
    }

    location /
    {
        add_header Cache-Control 'no-cache, no-store, max-age=0, must-revalidate';
        add_header Expires 'Thu, 01 Jan 1970 00:00:01 GMT';
        proxy_pass http://127.0.0.1:60601;
    }
}

server
{
    listen 192.168.50.12:80;
    server_name participantinsights--com--jason.step.appcove.net;
    return 301 http://www--participantinsights--com--jason.step.appcove.net$request_uri;
}

Here is apache for the fun of it:

#================================================================================================
# Background Processing

Listen 127.0.0.1:60600

<VirtualHost 127.0.0.1:60600>
    ServerName _default_
    WSGIDaemonProcess async-60600 processes=1 threads=1 python-path=/home/jason/DevLevel.2/PartIn/Python display-name=async-60600
    WSGIImportScript /home/jason/DevLevel.2/PartIn/Async/wsgi_import_script.wsgi process-group=async-60600 application-group=%{GLOBAL}
    ErrorLog /home/jason/DevLevel.2/PartIn/apache-error.log
</VirtualHost>

#================================================================================================
# Main

WSGIDaemonProcess Port60601 processes=2 threads=2 python-path=/home/jason/DevLevel.2/PartIn/Python

Listen 127.0.0.1:60601
NameVirtualHost 127.0.0.1:60601

<VirtualHost 127.0.0.1:60601>
    ServerName _default_
    DocumentRoot /home/jason/DevLevel.2/PartIn/Web/Main
    AddDefaultCharset UTF-8

    RewriteEngine on
    RewriteOptions inherit

    # Forbid any python source files from being served.
    RewriteRule \.(py|pyc|pyo|wsgi)$ - [F]

    WSGIScriptAlias /learn /home/jason/DevLevel.2/PartIn/Web/Main/learn/__init__.wsgi
    WSGIScriptAlias / /home/jason/DevLevel.2/PartIn/Web/Main/__init__.wsgi
    WSGIProcessGroup Port60601

    LogLevel info
    ErrorLog /home/jason/DevLevel.2/PartIn/apache-error.log
</VirtualHost>

--

Jason


Graham Dumpleton

Jul 8, 2015, 6:02:03 PM

to mod...@googlegroups.com

I will reply later when finished with moving kids around to places they need to go and attending other chores, but can you provide details on a couple of items.

What are the WSGIDaemonProcess directives set to for each VirtualHost?

What are the Apache prefork MPM settings?

<IfModule mpm_prefork_module>
StartServers 1
MinSpareServers 1
MaxSpareServers 10
MaxRequestWorkers 250
MaxConnectionsPerChild 0
</IfModule>

What are settings for KeepAlive, KeepAliveTimeout and Timeout directives?

Do you have any idea what the CPU usage level is for a process out of each hosted Django site?

Graham


Brian Littmann

Jul 8, 2015, 6:55:19 PM

to mod...@googlegroups.com

Not sure if using periods in the process or process group name works or if I should use hyphens instead.

App 1 conf

# app1
WSGIDaemonProcess app1.domain.com display-name=app1 python-path=/srv/www/app1.domain.com/company-app1:/srv/www/app1.domain.com/lib/python2.7/site-packages
WSGIProcessGroup app1.domain.com
WSGIScriptAlias / /srv/www/app1.domain.com/company-app1/app1/wsgi.py

# app1.test
WSGIDaemonProcess app1.test.domain.com display-name=app1-test python-path=/srv/www/app1.test.domain.com/company-app1:/srv/www/app1.test.domain.com/lib/python2.7/site-packages
WSGIProcessGroup app1.test.domain.com
WSGIScriptAlias / /srv/www/app1.test.domain.com/company-app1/app1/wsgi.py

App 2 conf

# app2
WSGIDaemonProcess app2.domain.com processes=3 display-name=app2 python-path=/srv/www/app2.domain.com/company-app2:/srv/www/app2.domain.com/lib/python2.7/site-packages
WSGIProcessGroup app2.domain.com
WSGIApplicationGroup %{GLOBAL}
WSGIScriptAlias / /srv/www/app2.domain.com/company-app2/app2/config/wsgi.py
WSGIPassAuthorization On

# app2.test
WSGIDaemonProcess app2.test.domain.com display-name=app2-test python-path=/srv/www/app2.test.domain.com/company-app2:/srv/www/app2.test.domain.com/lib/python2.7/site-packages
WSGIProcessGroup app2.test.domain.com
WSGIScriptAlias / /srv/www/app2.test.domain.com/company-app2/app2/config/wsgi.py
WSGIPassAuthorization On

App 3 conf

# app3
WSGIDaemonProcess app2.domain.com display-name=app3 python-path=/srv/www/app2.domain.com/company-app2/api:/srv/www/app2.domain.com/lib/python2.7/site-packages
WSGIProcessGroup app2.domain.com
WSGIScriptAlias /api /srv/www/app2.domain.com/company-app2/api/project/wsgi.py

# app3.test
WSGIDaemonProcess app2.test.domain.com display-name=app3-test python-path=/srv/www/app2.test.domain.com/company-app2/api:/srv/www/app2.test.domain.com/lib/python2.7/site-packages
WSGIProcessGroup app2.test.domain.com
WSGIScriptAlias /api /srv/www/app2.test.domain.com/company-app2/api/project/wsgi.py

httpd conf

Timeout 60
KeepAlive Off
MaxKeepAliveRequests 100
KeepAliveTimeout 15

<IfModule prefork.c>
StartServers       8
MinSpareServers    5
MaxSpareServers   20
ServerLimit      256
MaxClients       256
MaxRequestsPerChild  4000
</IfModule>

<IfModule worker.c>
StartServers         4
MaxClients         300
MinSpareThreads     25
MaxSpareThreads     75
ThreadsPerChild     25
MaxRequestsPerChild  0
</IfModule>

Running 'ps aux' for each app at a relatively slow time (it's almost 6pm CDT, so load is lower)

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
apache   21998  0.1  0.0 488888  7300 ?        Sl   22:40   0:00 app1
apache   21999  0.0  0.0 488888  7300 ?        Sl   22:40   0:00 app1-test

apache   22000 35.7  2.9 1033412 237960 ?      Sl   22:40   3:46 app2
apache   22001 40.8  2.5 967876 208024 ?       Sl   22:40   4:18 app2
apache   22002 45.4  3.5 1098948 282940 ?      Sl   22:40   4:48 app2
apache   22003  1.1  0.0 488888  7300 ?        Sl   22:40   0:03 app2-test

apache   22004  0.4  0.0 488888  7296 ?        Sl   22:40   0:01 app3
apache   22005  0.0  0.0 488888  7284 ?        Sl   22:40   0:00 app3-test
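To compare memory per app over time, the RSS column of output like the above can be totalled per display name. This is a minimal sketch, assuming the 11-column `ps aux` layout shown; the function name is hypothetical and the sample rows are copied from the listing above.

```python
from collections import defaultdict


def rss_by_command(ps_output):
    """Sum the RSS column (KB) per COMMAND from `ps aux` text."""
    totals = defaultdict(int)
    for line in ps_output.splitlines():
        fields = line.split(None, 10)  # COMMAND is the 11th column
        if len(fields) < 11 or fields[0] == "USER":
            continue  # skip blank lines and the header row
        totals[fields[10].strip()] += int(fields[5])
    return dict(totals)


sample = """USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
apache   22000 35.7  2.9 1033412 237960 ?      Sl   22:40   3:46 app2
apache   22001 40.8  2.5 967876 208024 ?       Sl   22:40   4:18 app2
apache   22002 45.4  3.5 1098948 282940 ?      Sl   22:40   4:48 app2
apache   22003  1.1  0.0 488888  7300 ?        Sl   22:40   0:03 app2-test
"""

for name, kb in sorted(rss_by_command(sample).items()):
    print(f"{name}: {kb / 1024:.1f} MB")
```

Grouping by display name only works because each WSGIDaemonProcess sets `display-name`; without it the daemon processes show up under the Apache binary name.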

Graham Dumpleton

Jul 8, 2015, 7:04:16 PM

to mod...@googlegroups.com

Sorry, one more thing.

What version of mod_wsgi are you using?

Graham


Brian Littmann

Jul 8, 2015, 7:15:08 PM

to mod...@googlegroups.com

Whoops: mod_wsgi 3.2


Graham Dumpleton

Jul 8, 2015, 8:33:23 PM

to mod...@googlegroups.com

On 9 Jul 2015, at 7:47 am, Brian Littmann <br...@newmedio.com> wrote:

> Howdy. Glad I found this thread, and thanks so much, Graham, for all the support you give to the community. I've been reading so many SO and SF questions and your replies are everywhere.
>
> I must admit, I was in the "naive Apache" crowd not too long ago and our server recently screeched to a halt. We just bumped up the server specs, but there are still some configuration kinks to work out, which I have some questions about.
>
> Our server/app situation:
> CentOS 6.3 server with 8GB RAM and a quad-core processor
> Default Apache 2.2 install using mpm-prefork (gross, I know)

Using Apache 2.2 with such an old mod_wsgi version you will have potential for seeing much more memory use by Apache.

Areas where extra memory usage would be seen are as follows.

In the Apache child worker processes.

Firstly because the per request thread memory pool threshold is unlimited, so Apache will not release extra memory it grabs for the pools to handle one off requests and instead holds on to it in the memory pool. For prefork MPM this isn’t so bad, because there is only one thread per process and so there isn’t much difference between the memory being held by the memory pool and it being on the memory free list for the process. For worker MPM it can be a big problem. Apache 2.4 sets a threshold for the per request thread memory pool, but even then it is larger than it really needs to be. This affects mod_wsgi embedded mode as well as daemon mode. The Apache directive to set the threshold is ThreadStackSize. Apache 2.4 has it as 8MB per thread as threshold. In mod_wsgi-express where daemon mode is used, I set it to 262144, thus 256KB, which is much smaller.

Secondly because the Apache child worker processes, when proxying to the mod_wsgi daemon processes, don’t implement any flow control. If you are serving up large responses from the WSGI application to a slow client which can’t keep up, then the response will get buffered up in memory of the Apache child worker processes, worst case up to a significant percentage of the size of the response content. Apache 2.4 implements a flow control mechanism and will control how much can be buffered in memory by the child worker processes and will stop reading from the WSGI application as need be. Version 4.3.1 or later of mod_wsgi also has changes to work around the fact that Apache 2.2 doesn’t implement any flow control, to avoid the problem.

Thirdly because Apache child worker processes badly handle proxying of data returned from the WSGI application when using mod_wsgi daemon mode, where the response content is written/yielded as many small blocks of memory. Even with flow control in place in Apache 2.4, this results in extra memory being used when buffering up to the threshold in the child worker processes. This is because the per data block overhead when buffering can be as much as or more than the amount of data if talking about data blocks of a very small size. Version 4.4.12 or later of mod_wsgi has a change to avoid this issue by force flushing, possibly causing blocking, when too many separate data blocks are being buffered by the flow control mechanism.
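For anyone stuck on an older mod_wsgi, one application-side way to avoid handing the server many tiny blocks is to coalesce them before they are yielded. The following is only a sketch of that idea and is not what mod_wsgi itself does internally; the helper name `coalesce_blocks`, the 8KB size and the sample application are all made up for illustration.

```python
def coalesce_blocks(blocks, min_size=8192):
    """Yield byte strings of at least min_size bytes (except possibly the last).

    `blocks` is any iterable of byte strings, such as the iterable a WSGI
    application returns. The 8KB default is an arbitrary illustration value.
    """
    pending = []
    pending_size = 0
    try:
        for block in blocks:
            if not block:
                continue  # nothing to buffer for an empty block
            pending.append(block)
            pending_size += len(block)
            if pending_size >= min_size:
                yield b"".join(pending)
                pending = []
                pending_size = 0
        if pending:
            yield b"".join(pending)  # flush whatever is left at the end
    finally:
        # WSGI servers call close() on the returned iterable, so pass that
        # on to the wrapped iterable if it has a close() method.
        close = getattr(blocks, "close", None)
        if close is not None:
            close()


def application(environ, start_response):
    # Hypothetical WSGI application that generates 10,000 tiny rows.
    start_response("200 OK", [("Content-Type", "text/plain")])
    rows = (("row %d\n" % i).encode("ascii") for i in range(10000))
    return coalesce_blocks(rows)
```

The trade-off is latency: nothing reaches the client until a full chunk has accumulated, so this suits bulk downloads rather than responses that are meant to trickle out incrementally.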

In the Apache child worker process if using embedded mode, or the mod_wsgi daemon process if using daemon mode.

Because mod_wsgi, due to Apache 1.3 compatibility, was using an older Apache API for reading request content, each read of a block of data from the request content was causing allocation of memory which wasn’t used after the read, but was only released as part of the memory pool at the end of the request. If the request content was very large, or otherwise read as many blocks, then this would cause extra memory usage. In embedded mode and worker MPM this was exacerbated due to the lack of a per request thread memory pool threshold in Apache 2.2. Version 4.4.0 or later avoids this problem: because Apache 1.3 support was dropped in version 4.X, the older Apache 1.3 API which was causing the problem could be replaced with code using the newer Apache 2.X API, which doesn’t have the same issue.

So being stuck with mod_wsgi 3.2 is not good at all and you could be seeing extra memory use, possibly quite significant, as a result.

Even if Apache 2.4 could be used, using at least mod_wsgi version 4.4.0 or later is recommended. Version 4.4.12 or later if your WSGI application is for some reason streaming a lot of data in small blocks.

Hosting 6 internal Django apps (3 production, 3 staging) - Not too traffic heavy (maybe 20 req/sec total), but data heavy

By data heavy do you mean the amount of response content returned, or by the amount of processing done in the WSGI application process itself?

MySQL db backend and an external MySQL db on another server

Are the applications primarily I/O bound waiting on the database or other backend services, or are they running CPU intensive tasks in process?

Knowing the balance is important because for CPU bound tasks, the impact of the Python global interpreter lock (GIL) can be significant and this can dictate how much you should be relying on multiple processes rather than threads for concurrency.
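A rough way to get a feel for that balance is to time the same number of threads on a task that mostly waits versus one that mostly computes. This is only an illustrative sketch: the task functions are stand-ins invented here, not anything taken from the applications being discussed.

```python
import threading
import time


def io_bound_task(delay=0.2):
    # Stands in for waiting on a database or other backend service.
    # Sleeping releases the GIL, as blocking I/O calls generally do.
    time.sleep(delay)


def cpu_bound_task(n=200000):
    # Pure Python arithmetic; the running thread holds the GIL while
    # it executes bytecode, so other threads cannot compute in parallel.
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_in_threads(task, count):
    """Run `task` in `count` threads and return the elapsed wall time."""
    threads = [threading.Thread(target=task) for _ in range(count)]
    start = time.time()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.time() - start
```

With `run_in_threads(io_bound_task, 4)` the waits overlap, so the elapsed time stays close to a single wait. With `run_in_threads(cpu_bound_task, 4)` on standard CPython the elapsed time is typically close to running the task four times in a row, which is the case where more processes rather than more threads is the thing to look at.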

Already done:
All apps are served via mod_wsgi in daemon mode, each set up in a separate VirtualHost
Using "WSGIRestrictEmbedded On" in global conf
The to-do list:
Uninstall unused Apache modules (probably based on this list: http://haydenjames.io/strip-apache-improve-performance-memory-efficiency/)
Swap out MPM prefork for worker (see question below about thread safety)
Set up nginx proxy for serving static assets (if it's really worth it)

Using nginx to serve up static assets can definitely always help. It is preferable if doing that to have the static assets hosted out of a separate domain so that clients don’t send all the dynamic WSGI application cookies when requesting static assets as well, thus cutting down on request sizes.

If able to host on a separate domain, running the nginx server for static assets on an entirely different physical host is also better. If this cannot be done, then using nginx as a front end to the existing Apache instance can be done with only minimal extra latency on the requests that go through to the WSGI application.

Being behind a front end proxy, the back end would then have to deal with all the special headers that nginx would have to send so that the back end knew the public URL of the front end. I have blogged about these issues recently in:

Proxying to a Python web application running in Docker.

Redirection problems when proxying to Apache running in Docker.

Even though I was talking about Docker, still relevant to case of nginx in front of Apache. Using such an old version of mod_wsgi you wouldn’t be able to make use of the capabilities of mod_wsgi to handle the special headers to be set by the proxy. You would need mod_wsgi version 4.4.10 or later for all that special stuff. Thus, you would need to use a WSGI middleware or any special feature of a web framework to fix up things for you.
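As a minimal sketch of the WSGI middleware route, assuming nginx has been configured to send X-Forwarded-Proto and X-Forwarded-Host and is the only thing connecting from 127.0.0.1 (both of those are assumptions, as is the class name), something along these lines would restore the public scheme and host before the application sees the request:

```python
class ProxyHeadersMiddleware(object):
    """Rewrite the WSGI environ from headers set by a trusted front end proxy."""

    def __init__(self, application, trusted_proxies=("127.0.0.1",)):
        self.application = application
        self.trusted_proxies = set(trusted_proxies)

    def __call__(self, environ, start_response):
        # Only believe the headers when the request came from the proxy,
        # otherwise any client could spoof the scheme and host.
        if environ.get("REMOTE_ADDR") in self.trusted_proxies:
            proto = environ.get("HTTP_X_FORWARDED_PROTO")
            if proto in ("http", "https"):
                environ["wsgi.url_scheme"] = proto
            host = environ.get("HTTP_X_FORWARDED_HOST")
            if host:
                environ["HTTP_HOST"] = host
        return self.application(environ, start_response)
```

In the WSGI script file this would wrap the existing application object, for example `application = ProxyHeadersMiddleware(application)`. For Django specifically, the framework feature route would be the SECURE_PROXY_SSL_HEADER and USE_X_FORWARDED_HOST settings instead of a middleware like this.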

Extra caching in one of the apps
Questions:
A) Should I be setting "WSGIApplicationGroup %{GLOBAL}" in all of my virtual hosts so that all apps run under the same Python interpreter?

If you only have one WSGI application instance delegated to each mod_wsgi daemon process group then yes, use this.

The application group context is local to just that daemon process group. So although all WSGI applications are forced into the main interpreter, it is the main interpreter for just that set of daemon processes.

B) How can I tell if the Python/Django code is thread-safe for using MPM worker?
Would I need to make sure multiple db writes in a function are wrapped in a transaction?
For reference: http://stackoverflow.com/a/20491426/720054
Apps are on Django 1.5, Django 1.6, and one is an API using Django REST Framework.

If you had multithreading issues in your code I think you would know already. This is because with your configuration using mod_wsgi daemon mode, you are already running your WSGI applications in a multithread configuration with potential for more than one request to be handled concurrently in the same process. The default number of threads per daemon process with raw mod_wsgi is 15. Thus each Django instance is currently running with processes consisting of 15 threads.

This is the same whether or not you are using prefork or worker MPM. The MPM only affects whether the Apache child worker processes are using multithreading, not the mod_wsgi daemon processes, the configuration of which is dictated by the WSGIDaemonProcess directive.

With the Apache child worker processes only proxying, using worker MPM would mean less overall memory used by the child worker processes as there will be fewer of them. The processes are also less likely to be subject to the process churn that prefork MPM is particularly susceptible to. Even with worker MPM, I still recommend increasing the maximum spare workers to try to keep child worker processes resident. I talked about the process churn issue in:

http://lanyrd.com/2013/pycon/scdyzk/

C) When MPM worker is used, how should the corresponding httpd (apache2) conf be set up? I've read your math somewhere, Graham, but I also saw you go for about 25% more threads for Apache than total for mod_wsgi daemon processes.

As a rough guess that is a good place to start where you only have the one WSGI application running in a daemon process group. Because you have more than one, you likely want to increase that.

What you set this to really depends on how large the responses you are serving up are and how slow your HTTP clients are. If you deal with a lot of mobile clients which can be slow to talk to, you might want more capacity in the child worker processes doing the proxying or serving up static content.
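To make the rough arithmetic concrete, here is a small sketch that adds up the daemon threads and applies the 25% headroom mentioned in the question. The layout of six daemon process groups with one process of 15 threads each, and the 25 threads per Apache child, are assumptions for illustration only, not measured or recommended values.

```python
def daemon_thread_total(process_groups):
    """Sum processes * threads over all mod_wsgi daemon process groups."""
    return sum(processes * threads for processes, threads in process_groups)


def suggest_apache_threads(process_groups, headroom=0.25, threads_per_child=25):
    """Return (total Apache threads, child processes) for the worker MPM.

    The total is the daemon thread count plus `headroom`, rounded up to a
    whole number of child processes of `threads_per_child` threads each.
    """
    wanted = daemon_thread_total(process_groups) * (1 + headroom)
    children = int(-(-wanted // threads_per_child))  # ceiling division
    return children * threads_per_child, children


# Hypothetical layout: six daemon process groups, each a single process
# with 15 threads (the default thread count mentioned earlier).
groups = [(1, 15)] * 6
print(suggest_apache_threads(groups))  # prints (125, 5)
```

Here 90 daemon threads plus 25% gives 112.5, which rounds up to 5 child processes of 25 threads. That is only a starting point; slow clients and large responses argue for going higher.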

The problem with creating too many child worker processes is that if you start to get backlogged it can make it harder to clear, as clients can still connect rather than getting a connection failure and causing the request to fail. There are various things you can do with timeouts in more recent mod_wsgi versions to cope with backlogging and so clear a backlog by failing requests before they get to the WSGI application. This prevents the server being completely overwhelmed and unresponsive in such a situation, with the WSGI application still handling requests that the client has likely already given up on anyway.

As a first pass, hope this helps. We can keep going as needs be. The issue remaining is how CPU intensive your tasks are, how long are response times and so how much capacity of the daemon processes is being used. If those CPU usage figures when under load are getting up to 75% or more for a single process I would definitely be worried a bit and would need to look at that more closely.

Graham

On Tuesday, August 23, 2011 at 7:57:45 PM UTC-5, Graham Dumpleton wrote:
Just a quick message to say I haven't forgotten about this discussion thread. Have been quite busy with work plus was at Sydney PyCon over the weekend and sprint days after that. Life should return to semi normal tomorrow.

Would be interested to hear if you are already seeing better memory usage just by switching to worker and ensuring using daemon mode and turning off interpreters in apache child processes.

Graham


Brian Littmann

Jul 8, 2015, 11:30:41 PM

to mod...@googlegroups.com

On Wednesday, July 8, 2015 at 7:33:23 PM UTC-5, Graham Dumpleton wrote:

So being stuck with mod_wsgi 3.2 is not good at all and you could be seeing extra memory use, possibly quite significant, as a result.

Even if Apache 2.4 could be used, using at least mod_wsgi version 4.4.0 or later is recommended. Version 4.4.12 or later if your WSGI application is for some reason streaming a lot of data in small blocks.

Rad. I'll try to upgrade to Apache 2.4 and the latest mod_wsgi. We aren't doing any streaming in small blocks.

Hosting 6 internal Django apps (3 production, 3 staging) - Not too traffic heavy (maybe 20 req/sec total), but data heavy
By data heavy do you mean the amount of response content returned, or by the amount of processing done in the WSGI application process itself?

Mostly the size of the response content. We're shuffling a lot of data to the client. Some response sizes are 215KB and 635KB, though I think some of that could be mitigated by code changes.

MySQL db backend and an external MySQL db on another server
Are the applications primarily I/O bound waiting on the database or other backend services, or are they running CPU intensive tasks in process?

Knowing the balance is important because for CPU bound tasks, the impact of the Python global interpreter lock (GIL) can be significant and this can dictate how much you should be relying on multiple processes rather than threads for concurrency.

Related to the above, mostly some sizable queries to the external database. App2 is the big player here and we're planning to add some server-side caching so those queries aren't run all the time.

Questions:

B) How can I tell if the Python/Django code is thread-safe for using MPM worker?
Would I need to make sure multiple db writes in a function are wrapped in a transaction?
For reference: http://stackoverflow.com/a/20491426/720054
Apps are on Django 1.5, Django 1.6, and one is an API using Django REST Framework.
If you had multithreading issues in your code I think you would know already. This is because with your configuration using mod_wsgi daemon mode, you are already running your WSGI applications in a multithread configuration with potential for more than one request to be handled concurrently in the same process. The default number of threads per daemon process with raw mod_wsgi is 15. Thus each Django instance is currently running with processes consisting of 15 threads.

This is the same whether or not you are using prefork or worker MPM. The MPM only affects whether the Apache child worker processes are using multithreading, not the mod_wsgi daemon processes, the configuration of which is dictated by the WSGIDaemonProcess directive.

I had wondered that. I may have a few issues, given I've received a couple IntegrityErrors and MultipleObjectsReturned errors, but they've been few and far between. Not sure how concerned I should be...
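One common source of occasional IntegrityError and MultipleObjectsReturned errors is two concurrent requests both doing "look it up, create it if missing" on the same value. Whether that is what is happening here is only a guess. The usual remedy is to let the database enforce uniqueness and treat the resulting error as "someone else got there first". A self-contained sketch of that pattern using sqlite3 from the standard library, with a made-up `tag` table:

```python
import sqlite3


def get_or_create_tag(conn, name):
    """Return the id of the row for `name`, inserting it if needed."""
    try:
        with conn:  # commits on success, rolls back on error
            cursor = conn.execute(
                "INSERT INTO tag (name) VALUES (?)", (name,))
            return cursor.lastrowid
    except sqlite3.IntegrityError:
        # The UNIQUE constraint fired, so the row already exists
        # (possibly created by another request); fetch it instead.
        row = conn.execute(
            "SELECT id FROM tag WHERE name = ?", (name,)).fetchone()
        return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)")
first = get_or_create_tag(conn, "urgent")
second = get_or_create_tag(conn, "urgent")
```

Both calls return the same id and the table ends up with a single row. Because the constraint lives in the database, it holds across threads and across separate daemon processes, which a lock inside one Python process would not. In Django terms the equivalent would be a unique constraint on the model field combined with get_or_create.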

C) When MPM worker is used, how should the corresponding httpd (apache2) conf be set up? I've read your math somewhere, Graham, but I also saw you go for about 25% more threads for Apache than total for mod_wsgi daemon processes.
As a rough guess that is a good place to start where you only have the one WSGI application running in a daemon process group. Because you have more than one, you likely want to increase that.

What you set this to really depends on how large the responses you are serving up are and how slow your HTTP clients are. If you deal with a lot of mobile clients which can be slow to talk to, you might want more capacity in the child worker processes doing the proxying or serving up static content.

App2 has a corresponding iOS app that consumes an API. The GET requests it makes aren't nearly as big as the "desktop" app requests, but there is one endpoint to pull down a bunch of data (2MB+ worth, yikes). It also POSTs somewhat frequently to update its location. I don't know numbers off the top of my head, but there might be 50+ devices being used at any given time, and that number is going to increase.

The problem with creating too many child worker processes is that if you start to get backlogged it can make it harder to clear as clients can still connect rather than getting a connection failure and causing the request to fail. There are various things you can do with more recent mod_wsgi versions with timeouts to cope with backlogging and so clear a back log by failing requests before they get to the WSGI application. This prevents the server being completely overwhelmed and unresponsive in such a situation with the WSGI application still handling requests where the client has likely already given up on anyway.

As a first pass, hope this helps. We can keep going as need be. The issue remaining is how CPU intensive your tasks are, how long are response times and so how much capacity of the daemon processes is being used. If those CPU usage figures when under load are getting up to 75% or more for a single process I would definitely be worried a bit and would need to look at that more closely.

I will follow up tomorrow with some more numbers pulled during business hours. For those requests that are 215KB and 635KB, they are currently taking 3 seconds and 4.2 seconds, respectively, but I bet that is because of the queries we are running every time.

Cheers for all of the info. It is greatly appreciated.

Brian Littmann

Jul 9, 2015, 1:39:41 PM

to mod...@googlegroups.com

Ok, I have major impostor syndrome when it comes to server config. Can anyone give me some guidance regarding the below before I go mucking around with a live production server? Obviously a test server would be great, but I feel constrained due to this being client work.

Apache is at 2.2.15 and I think updating to 2.4 is too daunting at the moment.
mod_wsgi was installed via yum and is at 3.2. What is the best way to update to the latest 4.4.*? Just: yum localinstall <rpm url>? If system Python is at 2.6.6, can I still install this package? (Server is running CentOS 6.3.)
Is it still better to use worker MPM rather than prefork for Apache 2.2, even though Graham says "Apache will not release extra memory it grabs for the pools to handle one off requests and so holds on to it in the memory pool."?

Thanks heaps for any suggestions.

Graham Dumpleton

Jul 9, 2015, 8:51:55 PM

to mod...@googlegroups.com

You can leave Apache 2.2.15 for now and keep prefork MPM. The latest mod_wsgi version should deal with all the issues I know about for Apache 2.X around memory usage.

The easiest way to install mod_wsgi is to use the ‘pip’ installable variant.

https://pypi.python.org/pypi/mod_wsgi

This will actually install mod_wsgi-express as part of your Python installation, but you can then tell mod_wsgi-express to copy the mod_wsgi.so into your Apache installation.

This is a lot simpler than downloading mod_wsgi source code yourself and having to configure it.

Before you can do this you need to ensure you have various packages installed.

httpd-devel.x86_64 - The development package for Apache.

python-devel.x86_64 - The development package for Python.

gcc.x86_64 - The GNU C compiler.

Also assume you have ‘pip’ installed already.

You now have two options as to where/how you install mod_wsgi-express.

If you don’t mind doing a ‘pip install’ of a package into system wide Python installation, as root run:

pip install mod_wsgi

With this done, you can then as a normal user check it worked by running:

mod_wsgi-express start-server

Use curl or a browser to hit the URL:

http://localhost:8000/

Next as root run:

mod_wsgi-express install-module

This will install a file ‘mod_wsgi-py26.so’ into the Apache modules directory.

Because you already have system mod_wsgi package installed, this will result in two different mod_wsgi modules under /etc/httpd/modules now being present.

640 -rw-r--r-- 1 root root 652286 Jul 10 00:15 mod_wsgi-py26.so
152 -rwxr-xr-x 1 root root 152584 Oct 15 2014 mod_wsgi.so

When you ran ‘mod_wsgi-express install-module’ it will have output:

LoadModule wsgi_module /usr/lib64/httpd/modules/mod_wsgi-py26.so
WSGIPythonHome /usr

These are output as the lines you now need to add to the Apache configuration file.

Technically the WSGIPythonHome isn’t needed as we are using the system Python package and not one installed in a different location.

Before we can add those two lines to the Apache configuration, we have to uninstall the existing system mod_wsgi package.

yum erase mod_wsgi.x86_64

This will have the effect of removing the files:

/etc/httpd/conf.d/wsgi.conf

/etc/httpd/modules/mod_wsgi.so

With those removed we can now create an alternate ‘conf.d’ file.

Thus create:

/etc/httpd/conf.d/wsgi-express.conf

and in it add:

LoadModule wsgi_module /usr/lib64/httpd/modules/mod_wsgi-py26.so

Don’t worry about WSGIPythonHome.

You can then restart Apache and check everything is okay.
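One way to check which mod_wsgi is actually handling requests after the restart is a throwaway WSGI script that reports a few environ values. This is a sketch: where you mount it (for example with a temporary WSGIScriptAlias) is up to you, and the fallback to "unknown" is only there so the script also runs under other WSGI servers.

```python
def application(environ, start_response):
    # Keys starting with "mod_wsgi." are set by mod_wsgi itself; the
    # "wsgi." keys are part of the WSGI specification.
    keys = [
        "mod_wsgi.version",
        "mod_wsgi.process_group",
        "mod_wsgi.application_group",
        "wsgi.multithread",
        "wsgi.multiprocess",
    ]
    lines = ["%s = %r" % (key, environ.get(key, "unknown")) for key in keys]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

If the version line still shows a 3.x tuple, Apache is still loading the old module. A non-empty process group confirms the request was handled in daemon mode, and the multithread value shows whether that process is running more than one request thread.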

If you are not comfortable doing a ‘pip install’ as root into the system wide Python installation and don’t need mod_wsgi-express for later, then you can instead use a temporary virtual environment.

If you don’t have ‘pip’ installed in system wide Python installation but do have ‘virtualenv’ or some other tool for creating virtual environments, you can use this approach instead.

Thus as a normal user run:

virtualenv /tmp/mod_wsgi-express

source /tmp/mod_wsgi-express/bin/activate

pip install mod_wsgi

Then install the module into the Apache modules directory again.

mod_wsgi-express install-module

This will output:

LoadModule wsgi_module /usr/lib64/httpd/modules/mod_wsgi-py26.so
WSGIPythonHome /tmp/mod_wsgi-express

Again ignore the WSGIPythonHome and after removing the system package for mod_wsgi, create the wsgi-express.conf and add just:

LoadModule wsgi_module /usr/lib64/httpd/modules/mod_wsgi-py26.so

That might sound all complicated, but once you go through it, you should see it isn’t that bad. :-)

So give that a go.

If you have issues, remove the wsgi-express.conf, yum install ‘mod_wsgi.x86_64’ once more, and restart Apache to put it back to how it was.

Graham
