http://groups.google.com/group/modwsgi
Please keep followups on that list.
On 27/01/2008, Rob Gabaree <r...@rawb.net> wrote:
> I currently have a VPS with 256MB of memory running a small WordPress
> blog using Apache and mod_php. I've been getting into Python lately
> and would like to start using some Python web frameworks (such as
> Django or Pylons) with either mod_wsgi or mod_python.
>
> Will there be any issues running mod_php and either mod_python or
> mod_wsgi together?
> I saw in the mod_wsgi wiki the MySQL shared library
> issue, but is there anything else?
There are other mentions of issues with PHP in the same document:
http://code.google.com/p/modwsgi/wiki/ApplicationIssues
So, just ensure you read through the document properly.
> Also, will running either mod_wsgi or mod_python take up a lot more
> memory per Apache process compared to now with just mod_php? I would
> think yes, since the Python interpreter is embedded.
The Python interpreter itself doesn't take up that much space. There
are various people spreading FUD about this and claiming that just the
act of embedding Python within Apache will cause the Apache child
processes to bloat out in size and that therefore the concept is evil.
This is not strictly true.
The only time where the Python interpreter may take up a lot of memory
is where your Python installation didn't provide a shared library.
See:
http://code.google.com/p/modwsgi/wiki/InstallationIssues
The real problem is not the Python interpreter, but the specific
Python web application that you want to run. If you were only running
small tuned custom applications then your memory usage on top of the
Python interpreter might be quite minimal. If however you loaded up an
application which uses Pylons or TurboGears for example, then you are
immediately hit with 7MB+ per process of additional memory usage
because of the huge basic overhead of such packages.
So the killer on memory usage is the particular web framework you use.
Some of these web frameworks are truly mega frameworks in the sense
that for even a hello world application they import ridiculous amounts
of code which often you would not even necessarily use directly. If
you use any of these mega frameworks and run them in embedded mode of
mod_wsgi in a memory constrained VPS, especially if running with
prefork MPM in Apache, you will quickly start to see all your memory
being used up.
> I only plan to
> run the WordPress PHP blog, plus maybe 2-3 Python applications which
> shouldn't see too much traffic. I'm asking because I know with Rails
> you have to run multiple Mongrel servers on the backend and that
> usually takes up something like 70-90MB of memory per app, which isn't
> that good on a server with 256MB of memory :)
As suggested above, depending on which Python web frameworks you use
and how big your application is, you can very well hit the same
problem. To avoid this happening you just need to make sure you
configure your system appropriately.
I have been over these a few times on the mailing list, so maybe
search the archives on Google Groups for 'VPS' and 'memory'. To get
you started though:
1. It is recommended that you use the Apache worker MPM instead of the
Apache prefork MPM. This will, independent of whether Python is used,
cut down on memory usage. To do this though, you will need to ensure
that all third party modules you use for PHP are actually thread safe;
not all are.
2. Don't use mod_python, as it only supports embedded mode, so a copy
of each of your Python web applications runs in every Apache child
process.
3. Do use mod_wsgi, but make sure you use its daemon mode to push each
Python web application into its own distinct process. For each Python
web application use a single multithreaded process.
4. If your Python web applications aren't going to conflict and any
Python web framework used supports hosting instances of different
applications in the same process, then delegate non-conflicting Python
web applications to run in the same application group within the same
daemon process. This will allow them to share code from common Python
modules.
5. Use maximum-requests to force mod_wsgi daemon processes to be
periodically recycled if they have memory leaks of some form. This
will ensure they don't keep growing in size.
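As a rough illustration, points 1 and 3-5 above might look something
like the Apache configuration sketch below. All names, paths and
numbers are illustrative assumptions for a small VPS, not values from
this thread, and would need tuning for a real site:

```apache
# 1. Worker MPM with modest process/thread counts.
<IfModule mpm_worker_module>
StartServers          2
ServerLimit           2
ThreadsPerChild      15
MaxClients           30
</IfModule>

# 3/5. One multithreaded mod_wsgi daemon process, recycled after a
# number of requests to cap any slow memory growth.
WSGIDaemonProcess example processes=1 threads=15 maximum-requests=1000

# 4. Two non-conflicting applications delegated to the same process
# and the same application group, so they share common Python modules.
WSGIScriptAlias /app1 /srv/www/app1.wsgi
WSGIScriptAlias /app2 /srv/www/app2.wsgi
<Location /app1>
WSGIProcessGroup example
WSGIApplicationGroup %{GLOBAL}
</Location>
<Location /app2>
WSGIProcessGroup example
WSGIApplicationGroup %{GLOBAL}
</Location>
```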
Anyway, that is the start of things. Depending on what sort of VPS
system you have and how they gauge memory usage, other things can be
done to limit problems.
Graham
This is why I want to change how mod_wsgi does its forking, so that I
can load Pylons/TurboGears/Django before the forking, and significantly
reduce the amount of per-process memory being used.
> 3. Do use mod_wsgi, but make sure you use its daemon mode to
> push each Python web application into its own distinct
> process. For each Python web application use a single
> multithreaded process.
>
> 4. If your Python web applications aren't going to conflict
> and any Python web framework used supports hosting instances
> of different applications in the same process, then delegate
> non conflicting Python web applications to run in the same
> application group within the same daemon process. This will
> allow them to share code from common Python modules.
Most frameworks provide an easy mechanism for combining applications
together into one big one. If memory is limited and traffic is low, I
would combine all the applications into one, and load that one
application in a single daemon process that has multiple threads.
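For combining applications by URL prefix, a minimal hand-rolled WSGI
dispatcher might look like the sketch below. The application names are
made up for illustration; in practice something like Paste's urlmap
gives you a more robust version of the same idea:

```python
# Hypothetical sketch: several small WSGI apps served as one combined
# application, dispatched by URL prefix, so a single multithreaded
# mod_wsgi daemon process can host them all.

def blog_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'blog']

def wiki_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'wiki']

# Naive prefix matching; note '/blogger' would also match '/blog'.
ROUTES = {'/blog': blog_app, '/wiki': wiki_app}

def application(environ, start_response):
    path = environ.get('PATH_INFO', '')
    for prefix, app in ROUTES.items():
        if path.startswith(prefix):
            # Shift the matched prefix into SCRIPT_NAME, as WSGI expects.
            environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + prefix
            environ['PATH_INFO'] = path[len(prefix):]
            return app(environ, start_response)
    start_response('404 Not Found', [('Content-Type', 'text/plain')])
    return [b'not found']
```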
(Incidentally, mod_wsgi probably shouldn't count how many threads are
active in the process, but rather, the number of threads it has
started/stopped. Maybe it does this already?)
> 5. Use maximum-requests to force mod_wsgi daemon processes to
> be periodically recycled if they have memory leaks of some
> form. This will ensure they don't keep growing in size.
There needs to be a way of limiting the system resources available to
the daemon processes, ala ulimit -m. Otherwise, the first request could
exhaust all available memory. maximum-requests is not really useful for
this, AFAICT.
- Brian
Hmmm, not really. You can't host two distinct instances of Django,
TurboGears, CherryPy or web.py applications in the same Python
interpreter instance, as they all depend on a single global
configuration for an application. The only one that tries to do it
properly is Pylons, from what I know.
> If memory is limited, and traffic is low, I would combine
> all the applications into one application, and load that one application
> in a single daemon process that has multiple threads.
>
> (Incidentally, mod_wsgi probably shouldn't count how many threads are
> active in the process, but rather, the number of threads it has
> started/stopped. Maybe it does this already?)
mod_wsgi doesn't stop or start threads. This is all done by Apache
and controlled by its MPM directives. The mod_wsgi code just executes
in the context of the existing Apache threads, so I'm not sure what
you are saying here.
> > 5. Use maximum-requests to force mod_wsgi daemon processes to
> > be periodically recycled if they have memory leaks of some
> > form. This will ensure they don't keep growing in size.
>
> There needs to be a way of limiting the system resources available to
> the daemon processes, ala ulimit -m. Otherwise, the first request could
> exhaust all available memory. maximum-requests is not really useful for
> this, AFAICT.
Problem is I haven't found a way of doing a clean shutdown when the
memory soft limit is reached. This is because, unlike some other
system resource limits, reaching it doesn't trigger a signal or
anything.
Graham
Whoops, that is for embedded mode.
For daemon mode the threads are created by mod_wsgi, but using Apache
thread functions, not Python ones. Thus they call in as external
threads from Python's perspective. The required number of threads are
created at the outset and reused; threads are never explicitly killed,
nor new ones started.
That said, I'm still not sure what you are suggesting. :-)
Graham
I meant that, once you standardize on a framework, it is pretty easy to
combine multiple small applications into one (with one configuration
file).
> > (Incidentally, mod_wsgi probably shouldn't count how many
> > threads are active in the process, but rather, the number of
> > threads it has started/stopped. Maybe it does this already?)
>
> The mod_wsgi doesn't stop or start threads. This is all done
> by Apache and controlled by its MPM directives. The mod_wsgi
> code just executes in the context of the existing Apache
> threads, so not sure what you are saying here.
Not a major issue. Does the "threads" limit on WSGIDaemonProcess
mean "no more than <X> threads can exist in the process" or "mod_wsgi
will not start more than <X> threads"? In other words, if the
application starts its own threads, does that contribute to the thread
count?
> > There needs to be a way of limiting the system resources
> > available to the daemon processes, ala ulimit -m. Otherwise,
> > the first request could exhaust all available memory.
> > maximum-requests is not really useful for this, AFAICT.
>
> Problem is I haven't found a way of doing a clean shutdown
> when the memory soft limit is reached. This is because it
> doesn't trigger signals or anything when it is reached unlike
> some other system resource limits.
I am happy with the semantics of Posix setrlimit(), which will simply
refuse to allocate memory beyond the limits.
- Brian
Any threads the application creates are its own problem and concern;
mod_wsgi will not know anything about them.
The only consideration mod_wsgi gives to such application threads is
that, like standard command line Python, when shutting down the
process it will trigger Python code which attempts to signal
non-daemon Python application threads to stop, so that the process can
hopefully exit cleanly.
In practice there should not be any non-daemon Python threads, as the
configuration for embedded Python is such that any threads are created
by default as daemon threads. Thus, user code would have had to
explicitly mark a thread as a non-daemon thread when it created it.
BTW, a daemon thread has nothing to do with mod_wsgi daemon mode;
Python itself talks about daemon and non-daemon threads, just to make
it all confusing in this context.
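The daemon-thread behaviour being described is plain Python and easy
to demonstrate outside of Apache entirely (using current Python
syntax; nothing here is mod_wsgi specific):

```python
import threading
import time

def worker():
    # Stand-in for some long-running background task.
    time.sleep(60)

# A daemon thread will not block interpreter shutdown; the process
# can exit while this thread is still running.
t = threading.Thread(target=worker, daemon=True)
t.start()
print(t.daemon)   # True

# Note: a new thread inherits the daemon flag of the thread that
# creates it, which is why threads spawned from an externally
# registered (daemon-flagged) request thread come out as daemon
# threads too, unless user code explicitly sets daemon=False.
```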
> > > There needs to be a way of limiting the system resources
> > > available to the daemon processes, ala ulimit -m. Otherwise,
> > > the first request could exhaust all available memory.
> > > maximum-requests is not really useful for this, AFAICT.
> >
> > Problem is I haven't found a way of doing a clean shutdown
> > when the memory soft limit is reached. This is because it
> > doesn't trigger signals or anything when it is reached unlike
> > some other system resource limits.
>
> I am happy with the semantics of Posix setrlimit(), which will simply
> refuse to allocate memory beyond the limits.
If people are happy with a hard limit which causes outright failure,
and a potential crash of the process if some underlying C code in
Apache/mod_wsgi doesn't cope with that well, I can quite easily add a
WSGIDaemonProcess option so that the web administrator can set a fixed
memory limit.
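For what it's worth, a sketch of what such an option might do inside a
daemon process is below. This is plain POSIX setrlimit() behaviour,
not an existing mod_wsgi directive; the 512MB figure is made up, and
the limit is applied in a forked child so nothing else is affected:

```python
import os
import resource

def apply_memory_limit(limit_bytes):
    # Hard RLIMIT_AS: allocations beyond the limit simply fail
    # (MemoryError at the Python level); no signal is delivered,
    # which is why a clean shutdown on breach is awkward.
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

pid = os.fork()
if pid == 0:
    # Child: limit the address space, then try to blow past it.
    apply_memory_limit(512 * 1024 * 1024)
    try:
        blob = bytearray(1024 * 1024 * 1024)  # 1GB, exceeds the limit
        os._exit(1)
    except MemoryError:
        os._exit(0)   # allocation refused, process still intact
else:
    _, status = os.waitpid(pid, 0)
```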
Graham