Reducing pylons app memory usage?


Marcin Kasperski

Oct 16, 2007, 10:22:14 AM
to pylons-...@googlegroups.com

Any ideas how I could reduce Pylons app RAM usage? At the moment the
pylons process takes above 100MB (an almost static site serving some
templates) - both when run with paste and when run under mod_wsgi.
Quite a lot, considering that, for example, the whole MoinMoin manages
to work in ~25MB.


Max Ischenko

Oct 16, 2007, 1:47:17 PM
to pylons-...@googlegroups.com

Did you mean RSS or VSS?
Does it start with 100MB or does it grow over time?

There are some memory issues you may have run into, see recent thread "debugging memory leaks".
On my machine it's about 25Mb RSS (production sites, few K requests).

dou       9846  0.6  6.0 111952 25936 ?        Sl   02:16   6:23 /home/dou/bin/python -u /home/dou/server.py

Max.
http://www.developers.org.ua -- ukrainian software community.


Marcin Kasperski

Oct 17, 2007, 5:59:39 AM
to pylons-...@googlegroups.com
>
> Any ideas how I could reduce Pylons app RAM usage? At the moment the
> pylons process takes above 100MB (an almost static site serving some
> templates) - both when run with paste, and when run under mod_wsgi.
> Quite a lot, considering that, for example, the whole MoinMoin manages
> to work in ~25MB.
>
> Did you mean RSS or VSS?

RSS. VSS is also fairly significant, 16-17MB.

Production machine, after working for about a day under low load
(apache2 processes use mod_wsgi to load the Pylons app)

VSS RSS
12197 www-data 18 0 130m 17m 2384 S 0.0 18.2 0:04.32 apache2
8909 www-data 16 0 132m 16m 2376 S 0.0 17.2 0:06.16 apache2

Development machine, paste server immediately after startup

26815 marcink 15 0 109m 15m 2840 S 1 0.8 0:01.76 python2.4

My application is an almost static website, where Pylons is mostly
rendering Mako templates. Some dynamic functions are planned but not
yet written; no database is used at the moment.

Just for comparison:

(MoinMoin - fairly large and complicated web app written in python)
8929 www-data 16 0 24964 9224 1716 S 0.0 9.4 0:10.64 moin.fcg

(My custom FICS bot, reasonably complicated python app based on twisted,
which also has some simple web interface embedded and uses cheetah templates)
14592 marcink 15 0 79584 8908 1388 S 0.0 9.1 37:56.39 python


> Does it start with 100MB or does it grow over time?

Starts with.

> There are some memory issues you may have run into, see recent thread
> "debugging memory leaks".

Those were mainly problems with cherrypy IIRC.

> On my machine it's about 25Mb RSS (production sites, few K requests).
>
> dou 9846 0.6 6.0 111952 25936 ? Sl 02:16 6:23 /home/dou/bin/python -u /home/dou/server.py

So it is similar to my case. Quite a lot.

Why bother? Well, I use 96MB RAM VPS to host my website....

I suspect a few different factors may be aggregated here:

a) Pylons loads pleeeeeentyyyyyy of Python modules, whether they are
needed or not. Any hints on how those could be cut down a bit?

b) I am not sure whether some module isn't caching something in memory
(I haven't enabled such things, but maybe there are some defaults)

c) Some thread structures are allocated (??? - should they be under mod_wsgi?)
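Suspicion (a) is checkable rather than a guess. A minimal sketch (the helper name is my own invention, not a Pylons API) that groups everything the interpreter has imported by top-level package:

```python
import sys

def loaded_module_counts():
    """Group the entries in sys.modules by top-level package name."""
    counts = {}
    for name, mod in list(sys.modules.items()):
        if mod is None:  # placeholders left by the import machinery
            continue
        top = name.split('.')[0]
        counts[top] = counts.get(top, 0) + 1
    return counts

if __name__ == '__main__':
    # Show the ten packages contributing the most modules.
    ranked = sorted(loaded_module_counts().items(), key=lambda kv: -kv[1])
    for pkg, n in ranked[:10]:
        print('%5d  %s' % (n, pkg))
```

Module count is only a proxy for memory, but calling this from a controller after startup quickly confirms or refutes (a).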

Any ideas?

Dalius Dobravolskas

Oct 17, 2007, 6:05:14 AM
to pylons-...@googlegroups.com
> 12197 www-data 18 0 130m 17m 2384 S 0.0 18.2 0:04.32 apache2
> 8909 www-data 16 0 132m 16m 2376 S 0.0 17.2 0:06.16 apache2
That might be an Apache problem, not Pylons ;-) I guess you have apache2
and are using the default worker module. Use prefork. On Debian-based
systems it is enough to install it:
aptitude install apache2-mpm-prefork

--
Dalius
http://blog.sandbox.lt

Christoph Haas

Oct 17, 2007, 10:13:06 AM
to pylons-...@googlegroups.com

I feel like adding a "me, too" here. I have a Pylons application running
here under "paster serve" which consumes 110 MB RAM. It consists of 6
controllers, uses SQLAlchemy 0.4 to connect to a PostgreSQL database and
renders a few Mako templates. Nothing fancy really. Hardly any users on
it.

I just killed the webserver and restarted it. It has been running for
three days with very few users on it (development system). The memory
usage only dropped down to 105 MB of RAM after a process restart.

Is there a way to profile RAM usage or otherwise get an overview? I
can afford the 100 MB RAM but have no idea what makes Pylons so hungry.

Btw, is it normal that "paster serve" consumes 1-3% of my CPU while
running with "--reload" while not serving HTTP requests? I wouldn't use
that in production of course so that problem won't bother me there. Just
wondering how many files "--reload" really monitors. When I upgraded
related components like Mako or SQLAlchemy I found my application
restarted, so I assume it must be a whole lot of files that are
monitored for changes.

Kindly
Christoph

Max Ischenko

Oct 17, 2007, 2:03:12 PM
to pylons-...@googlegroups.com
On 10/17/07, Christoph Haas <em...@christoph-haas.de> wrote:

On Tue, Oct 16, 2007 at 04:22:14PM +0200, Marcin Kasperski wrote:
> Any ideas how I could reduce Pylons app RAM usage? At the moment the
> pylons process takes above 100MB (an almost static site serving some
> templates) - both when run with paste, and when run under mod_wsgi.
> Quite a lot, considering that, for example, the whole MoinMoin manages
> to work in ~25MB.

I feel like adding a "me, too" here. I have a Pylons application running
here under "paster serve" which consumes 110 MB RAM. It consists of 6
controllers, uses SQLAlchemy 0.4 to connect to a PostgreSQL database and
renders a few Mako templates. Nothing fancy really. Hardly any users on
it.

I think you're wrong. 110Mb is VSS, i.e. "virtual" memory requested. Actual memory usage is much better reflected by RSS, which is a reasonable 25Mb.

Max.


Ben Bangert

Oct 17, 2007, 4:19:40 PM
to pylons-...@googlegroups.com
On Oct 17, 2007, at 2:59 AM, Marcin Kasperski wrote:

> RSS. VSS is also fairly significant, 16-17MB.

I think you have those reversed according to the other numbers.

> Development machine, paste server immediately after startup
>
> 26815 marcink 15 0 109m 15m 2840 S 1 0.8 0:01.76 python2.4

That looks fine to me; that's 15 megs of RAM consumed.

> (MoinMoin - fairly large and complicated web app written in python)
> 8929 www-data 16 0 24964 9224 1716 S 0.0 9.4 0:10.64 moin.fcg
>
> (My custom FICS bot, reasonably complicated python app based on
> twisted,
> which also has some simple web interface embedded and uses cheetah
> templates)
> 14592 marcink 15 0 79584 8908 1388 S 0.0 9.1 37:56.39 python

Note that in these cases, they're using less resident ram as well.
I'll bet some libraries that may trigger various extensions (maybe an
SQL C extension?) could easily cause the Pylons one to load more
virtual. Though I'm rather baffled why there's such a concern over
virtual.

> Those were mainly problems with cherrypy IIRC.

Nope, it was a problem with the Registry in Paste. The workaround
from the other thread applies if you see the resident ram continue to
grow over time.

> So it is similar to my case. Quite a lot.

25 megs for a long lived multi-thread web server is pretty dang low.

> Why bother? Well, I use 96MB RAM VPS to host my website....

VPS's do not count virtual in your RAM allocation, only resident. 25
megs is not unreasonable, and is the lowest starting point for Rails
apps as well though I frequently see them run around 50-75 megs *and*
they're not multi-threaded so you have to run multiple ones for a
single site if you have higher load.

Apache is rather a pig on resources (the threading worker helps a
bit); I'd suggest checking out nginx or lighttpd. nginx has more
modules to replace Apache functionality and has been more reliable in
my experience, but both of them will mean dropping Apache (which uses
at least 13 megs per process); in exchange lighttpd/nginx take
about 3-5 megs of RAM. They're also significantly faster.

> I suspect a few different factors may be aggregated here:
>
> a) Pylons load pleeeeeentyyyyyy of python modules, whether they are
> needed, or not. Any hints how could those be cut down a bit?

Pylons for the most part doesn't load what it doesn't need. There's
maybe one or two small modules that with a lot of extra code could
optionally load themselves. But that's hardly a meg.

> b) I am not sure whether some module isn't caching something in memory
> (haven't enabled such a things, but maybe there are some defaults)
>
> c) Some thread structures are allocated (??? - should they be under
> mod_wsgi?)

The default paster serve runs a thread-pool of 10; you could reduce
the thread-pool if you are running a low-traffic site, and that will
likely drop the footprint a little. Add to the [server:main]
section a line like:
threadpool_nworkers = 5

That should lower it, see if that affects the ram usage.

Cheers,
Ben

Christoph Haas

Oct 17, 2007, 5:24:19 PM
to pylons-...@googlegroups.com
On Wed, Oct 17, 2007 at 09:03:12PM +0300, Max Ischenko wrote:
> On 10/17/07, Christoph Haas <em...@christoph-haas.de> wrote:
>
> On Tue, Oct 16, 2007 at 04:22:14PM +0200, Marcin Kasperski wrote:
> > Any ideas how I could reduce Pylons app RAM usage? At the moment the
> > pylons process takes above 100MB (an almost static site serving some
> > templates) - both when run with paste, and when run under mod_wsgi.
> > Quite a lot, considering that, for example, the whole MoinMoin manages
> > to work in ~25MB.
>
> I feel like adding a "me, too" here. I have a Pylons application running
> here under "paster serve" which consumes 110 MB RAM. It consists of 6
> controllers, uses SQLAlchemy 0.4 to connect to a PostgreSQL database and
> renders a few Mako templates. Nothing fancy really. Hardly any users on
> it.
>
> I think you're wrong.

I think you are right. :)

> 110Mb is VSS, i.e . "virtual" memory requested. Actual
> memory usage is much better reflected by RSS which is a reasonable 25Mb.

Why am I still confused by the top columns after 15 years of UNIX usage?
Yes, the virtual usage is around 100 MB while the resident size is at
about 18 MB.

One could argue though whether 18 MB is reasonable. For someone who
comes from the C=64 ages and wrote applications within a few KB of code
18 MB looks pretty huge. OTOH I have stopped wondering why a Firefox
with no page displayed swallows 200 MB. I'd like to point out though
that I still don't need my kids to program my VCR (or rather DVB-S
settop box)! ;)

Cheers
Christoph

Ben Bangert

Oct 17, 2007, 7:32:24 PM
to pylons-...@googlegroups.com
On Oct 17, 2007, at 2:24 PM, Christoph Haas wrote:

> Why am I still confused by the top columns after 15 years of UNIX
> usage.
> Yes, the virtual usage is around 100 MB while the resident size is at
> about 18 MB.

Of course, since most of this ram use is from the libraries, it means
it will increase very little should you run multiple Pylons apps in a
single process. You can do this with the ini file to tell Paste to
load multiple apps, and even host them on different virtual hosts,
that'll save you some ram if you're running multiple Pylons apps on
the same machine.
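Such a multi-app ini could be sketched like this, using Paste's urlmap composite; the app names and file paths are illustrative, not from the thread:

```ini
[composite:main]
use = egg:Paste#urlmap
/ = mainsite
/blog = blogsite

[app:mainsite]
use = config:%(here)s/mainsite.ini

[app:blogsite]
use = config:%(here)s/blogsite.ini
```

Both apps then run in the one Python process, so the imported libraries are counted only once.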

Cheers,
Ben

Graham Dumpleton

Oct 17, 2007, 7:56:23 PM
to pylons-discuss
Ben Bangert wrote:
> Apache is rather a pig on resources (the threading worker helps a
> bit), I'd suggest checking out nginx or lightty. nginx has a bit more
> modules to replace Apache functionality and has been more reliable in
> my experience, but both of them will mean dropping Apache (which uses
> at least 13 megs per process), and in exchange lighty/nginx take
> about 3-5 megs of ram. They're also significantly faster.

Base Apache memory use does not have to be excessive if you configure
it correctly. The most important thing is not to load in Apache
modules you do not need. If your Apache links a lot of modules
statically, this may mean you need to rebuild it so it uses DSO
modules for a lot of the core Apache modules. That way you can avoid
loading them. Usually even this is not absolutely required.

Next thing is to avoid Apache prefork MPM and use worker MPM. Most
problems with excessive memory use which people complain about are
because they are using prefork MPM and as a consequence they need to
run with a lot more Apache child processes. If you are running a
Python application in an embedded mode within Apache child processes,
this means many more copies of the application and thus more
overall memory use. By using worker MPM there are fewer Apache child
processes and thus lower overall memory use. For a low-volume site,
also tweak the Apache configuration so you don't start up as many
initial Apache child processes and reduce the maximum allowed number
of processes.

Next problem can be if you are using mod_python and your OS supplies a
crappy version of Python which doesn't have a shared library for the
Python library, or doesn't put it in the correct place. The result on
such systems is that the Python static library gets embedded within
the mod_python.so file and often when it is loaded into Apache,
requires address fixups for it then to be able to work. These fixups
result in the memory going from being shared to local process memory.
So, instead of a shared library that is counted once across all
processes, every Apache child process sees a 3MB+ hit to memory use.
This lack of a shared library for Python will also see the same
problem arising if mod_wsgi is used.

If all you want to do is host a WSGI application, then use of
mod_python is also now becoming a poor option. This is because
mod_python loads a lot of extra modules that aren't strictly needed in
hosting a WSGI application, but relate to mod_python's own way of
writing Python web applications. Also, mod_python loads some modules
up front that should only really be loaded on demand if specific
features of mod_python are required. Thus, using mod_python you can
incur a couple extra MBs of memory use you do not need. A better
option in this respect is mod_wsgi as it is targeted at WSGI
applications and doesn't have these extra memory overheads and
inefficiencies in memory use that mod_python does.

Next issue is Python web applications which gradually accrue/leak
memory over time. Because they can do this, it is important to set the
maximum number of requests that an Apache child process should handle
before the process is recycled. Setting a limit ensures that the
memory usage is brought back to the base level and any dead memory
that can't be reclaimed for whatever reason is thrown away and you
don't just have a process that just keeps on getting bigger and bigger
over time until you run out of memory.

Further option is not to run your Python application embedded within
the Apache child processes, but to run it in a separate daemon process
using daemon mode of mod_wsgi, or use flup in conjunction with one of
the fastcgi/scgi/ajp solutions for Apache. By doing this you can more
closely control how many processes you want to run your web
application in. So, you can specifically say you want only one daemon
process used. You should still though ensure you configure the daemon
process to be recycled after a maximum number of requests if you have
a leaky web application.

If using daemon processes, still use Apache worker MPM. All the Apache
child processes will then be doing is serving static files and
proxying requests for Python application through to the daemon
process.
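That daemon-mode arrangement might be sketched as follows for mod_wsgi; the process group name, counts, and paths here are illustrative, so check your mod_wsgi version's documentation for exact option support:

```apache
# One daemon process with a small thread pool, recycled every 1000
# requests so leaked memory is periodically reclaimed.
WSGIDaemonProcess pylonsapp processes=1 threads=5 maximum-requests=1000
WSGIProcessGroup pylonsapp
WSGIScriptAlias / /srv/pylonsapp/app.wsgi
```

The Apache children (worker MPM) then only serve static files and proxy to this one daemon.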

Note that in general the argument that lighttpd and nginx are better
because they are quicker in serving static files is totally
meaningless in the context of Python web applications, as the real
work is being done in the Python application and that along with any
database access will be your bottleneck. Unless your web application
is specifically being used as a mechanism for working with large
numbers of very large distinct static media files, you will not really
gain anything from using lighttpd or nginx, as static files aren't the
bottleneck. All you will possibly do is make your setup and
configuration more complicated than it needs to be.

In summary, configure it properly and select which Apache modules you
use appropriately, and Apache doesn't need to be the bloated thing
that people claim it is. Using Apache can also be simpler to maintain
as configuration and process management can all be handled in the one
place and you don't need to be running a separate supervisor and
process control system for a distinct web server running just the
Python application. In an environment where you aren't memory
constrained and need high performance and scalability, Apache will in
general also be a better choice due to its ability to create
additional child processes to handle any extra temporary demand, such
processes then being killed off when no longer required. So learn how to
use and configure Apache properly, and you should be okay.

Graham

Ian Bicking

Oct 17, 2007, 7:58:47 PM
to pylons-...@googlegroups.com

It'd be hot if you could load up Pylons and then fork for each app,
sharing a bunch of the libraries.

You can't. But such a thing is not impossible. Mmm... a whole entry point:

[app:otherprocapp]
use = egg:DeploySubprocess
app = config:%(here)s/myapp.ini
# maybe also:
add_libraries = /path/to/some/virtualenv-or-workingenv/

It'd fork, use that add_libraries to do a site.addsitedir(), and maybe
start up with flup's scgi or fcgi server, with s/fcgi_app on the
unforked side, forwarding any requests over, and doing a little process
management (just passing over signals, so that you can kill both as a
unit by killing the parent).

I'm guessing 200 lines of code. Who wants to write it?

Note you won't notice the memory savings unless you look closely; it's
hard to see shared memory. Surprisingly hard. But much of the memory
will be shared between the master process and the forked subprocess.
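The heart of the proposal is plain os.fork() on POSIX: pages holding the already-imported libraries are shared copy-on-write between parent and child. A minimal sketch (the serve callable and the signal plumbing are left abstract; fork_app is my own name):

```python
import os

def fork_app(serve):
    """Fork a child that runs serve(); return the child's pid so the
    parent can forward signals and wait on it. Modules imported before
    the fork are shared copy-on-write with the child (POSIX only)."""
    pid = os.fork()
    if pid == 0:  # child process
        try:
            serve()
        finally:
            os._exit(0)  # never return into the parent's code path
    return pid  # parent process
```

The parent would keep a table of pids and relay SIGTERM and friends, as suggested above.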

--
Ian Bicking : ia...@colorstudy.com : http://blog.ianbicking.org
: Write code, do good : http://topp.openplans.org/careers

Marcin Kasperski

Oct 18, 2007, 6:18:36 AM
to pylons-...@googlegroups.com
>> RSS. VSS is also fairly significant, 16-17MB.
>
> I think you have those reversed according to the other numbers.

Of course, a typo. Both are too large, though ;-)

> Though I'm rather baffled why there's such a concern
> over virtual.

In VPS offers not only RAM but also swap is usually fairly limited.
Offers like 64-128MB RAM + 128-256MB swap are fairly frequent, and
reasonable when low-volume websites are to be hosted....

> VPS's do not count virtual in your RAM allocation, only resident.

See above. Swap is also limited. Note also that for Pylons a VPS is
a more likely solution than for, say, PHP; there are not that many
ready-to-use Pylons hosting offers.

> 25 megs is not unreasonable, and is the lowest starting point for
> Rails apps as well though I frequently see them run around 50-75
> megs *and* they're not multi-threaded so you have to run multiple
> ones for a single site if you have higher load.

I've never used Rails, so I'm not truly interested in a comparison with
them. But, for example, using apache+mod_perl (+ some custom framework)
I started to reach similar numbers only after my app grew really
large....

Bah. Even Zope - yes, Zope! - takes less:

(zope after starting, standard standalone server)

marcink 5028 0.0 0.1 70084 2340 ? Sl Jul10 0:02 /usr/bin/python /home/marcink/Zope/Install/lib/python/Zope2/Startup/run.py -C /home/marcink/Zope/Instance/etc/zope.conf

(after serving some requests)

marcink 5028 0.0 0.4 70084 13340 ? Sl Jul10 0:02 /usr/bin/python /home/marcink/Zope/Install/lib/python/Zope2/Startup/run.py -C /home/marcink/Zope/Instance/etc/zope.conf

> I'd suggest checking out nginx or lightty.

I am watching with interest efforts to port WSGI to nginx which show
up on some lists. Nevertheless, it seems it will take some time...

>> a) Pylons load pleeeeeentyyyyyy of python modules, whether they are
>> needed, or not. Any hints how could those be cut down a bit?
>
> Pylons for the most part doesn't load what it doesn't need. There's
> maybe one or two small modules that with a lot of extra code could
> optionally load themselves. But that's hardly a meg.

Well, one can easily notice the myriad webhelpers modules loaded by
default. Regarding the Pylons internals I do not know, so it is hard
to comment, but... well, something is using up this memory.


(webhelpers/__init__.py)

from webhelpers.rails import *

(webhelpers/rails/__init__.py)

from asset_tag import *
from urls import *
from javascript import *
from tags import *
from prototype import *
from scriptaculous import *
from form_tag import *
from secure_form_tag import *
from text import *
from form_options import *
from date import *
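Whether those star-imports actually cost real memory can be measured rather than guessed; a rough sketch using the stdlib resource module (my own helper, with the caveat that ru_maxrss is kilobytes on Linux but bytes on Mac OS X, and it only ever grows, so the delta is a coarse signal):

```python
import importlib
import resource

def import_cost(module_name):
    """Growth in peak resident set size caused by importing module_name
    (platform-dependent units: kB on Linux, bytes on Mac OS X)."""
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    importlib.import_module(module_name)
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return after - before
```

Running it once per suspect package from a fresh interpreter would show which of these imports carry real weight.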


>> c) Some thread structures are allocated (??? - should they be under
>> mod_wsgi?)
>
> The default serve runs a thread-pool of 10, you could reduce the
> thread-pool if you are running a low traffic site and that will
> likely drop the footprint a little. If you add to the [server:main]
> section a bit like:
> threadpool_nworkers = 5

Is it of any importance when we are working under apache with
mod_wsgi?

(will give it a try, nevertheless, thanks)

Ben Bangert

Oct 18, 2007, 2:34:42 PM
to pylons-...@googlegroups.com
On Oct 18, 2007, at 3:18 AM, Marcin Kasperski wrote:

> Bah. Even Zope - yes, Zope! - takes less:
>
> (zope after starting, standard standalone server)
>
> marcink 5028 0.0 0.1 70084 2340 ? Sl Jul10 0:02 /usr/bin/python /home/marcink/Zope/Install/lib/python/Zope2/Startup/run.py -C /home/marcink/Zope/Instance/etc/zope.conf
>
> (after serving some requests)
>
> marcink 5028 0.0 0.4 70084 13340 ? Sl Jul10 0:02 /usr/bin/python /home/marcink/Zope/Install/lib/python/Zope2/Startup/run.py -C /home/marcink/Zope/Instance/etc/zope.conf

Is Zope using a mysqldb connector? I should note that in a large app
I have, it takes 100 megs virtual on my remote debian box, but the
*same* app consumes 64megs of virtual or less on my OSX system. This
leads me to wonder what other libraries or C based Python extensions
might be doing in the course of the app loading.

> Well, one can easily notice myriads of webhelpers modules loaded by
> default. Regarding the pylons internals - I do not know, so hard to
> comment, but ... well, something is using up this memory.
>
> (webhelpers/__init__.py)
>
> from webhelpers.rails import *
>
> (webhelpers/rails/__init__.py)
>
> from asset_tag import *
> from urls import *
> from javascript import *
> from tags import *
> from prototype import *
> from scriptaculous import *
> from form_tag import *
> from secure_form_tag import *
> from text import *
> from form_options import *
> from date import *

Of course, it's easy enough to remove this import, thus not loading
WebHelpers at all. Does that help? Not so much... I did some tests
loading the helpers and not loading them, and there was zero
difference in virtual, and about 10 bytes difference in resident.
This generally leads me to believe that whatever is taking the RAM
is all loaded by either or both of Paste and Pylons, so altering your
own imports in the project makes no difference.

> Is it of any importance when we are working under apache with
> mod_wsgi?
>
> (will give it a try, nevertheless, thanks)

I'll continue to look into it. I'm still curious why there's a 40 meg
virtual difference between brand new projects based purely on the
system they're running on.

Cheers,
Ben

Philip Jenvey

Oct 18, 2007, 3:23:14 PM
to pylons-...@googlegroups.com

On Oct 18, 2007, at 11:34 AM, Ben Bangert wrote:

> On Oct 18, 2007, at 3:18 AM, Marcin Kasperski wrote:
>
>> Bah. Even Zope - yes, Zope! - takes less:
>>
>> (zope after starting, standard standalone server)
>>
>> marcink 5028 0.0 0.1 70084 2340 ? Sl Jul10 0:02 /usr/bin/python /home/marcink/Zope/Install/lib/python/Zope2/Startup/run.py -C /home/marcink/Zope/Instance/etc/zope.conf
>>
>> (after serving some requests)
>>
>> marcink 5028 0.0 0.4 70084 13340 ? Sl Jul10 0:02 /usr/bin/python /home/marcink/Zope/Install/lib/python/Zope2/Startup/run.py -C /home/marcink/Zope/Instance/etc/zope.conf
>
> Is Zope using a mysqldb connector? I should note that in a large
> app I have, it takes 100 megs virtual on my remote debian box, but
> the *same* app consumes 64megs of virtual or less on my OSX system.
> This leads me to wonder what other libraries or C based Python
> extensions might be doing in the course of the app loading.

Are the Zope and Moin numbers from the same environment your Pylons
is running on?

These numbers can easily vary per platform.

Here's the most basic of Pylons apps:

FreeBSD 6, Python 2.5.1, after startup:

USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
pjenvey 90260 0.0 1.6 18692 16712 p3 S+ 12:14PM 0:01.28 /usr/local/bin/python /usr/local/bin/paster serve development.ini

After hitting a 'Hello World' controller with 1000 requests:

pjenvey 90260 1.4 1.8 20604 18884 p3 S+ 12:14PM 0:08.65 /usr/local/bin/python /usr/local/bin/paster serve development.ini

OS X Python 2.4.4 from MacPorts (without even hitting a controller yet):

USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
pjenvey 19600 7.6 -0.6 57196 12768 p1 S+ 12:16PM 0:01.09 /opt/local/Library/Frameworks/

1000 requests:

pjenvey 19600 0.0 -0.7 58480 14636 p1 S+ 12:16PM 0:04.64 /opt/local/Library/Frameworks/

Then here's just running plain 'python' on OS X:

pjenvey 19631 0.0 -0.1 36876 2488 p5 S 12:17PM 0:00.08 /opt/local/Library/Frameworks/

So OS X, at least with my macports binary, isn't a great platform for
analyzing this.

--
Philip Jenvey


Ben Bangert

Oct 18, 2007, 3:32:52 PM
to pylons-...@googlegroups.com
On Oct 18, 2007, at 3:18 AM, Marcin Kasperski wrote:

> Bah. Even Zope - yes, Zope! - takes less:
>
> (zope after starting, standard standalone server)
>
> marcink 5028 0.0 0.1 70084 2340 ? Sl Jul10 0:02 /usr/bin/python /home/marcink/Zope/Install/lib/python/Zope2/Startup/run.py -C /home/marcink/Zope/Instance/etc/zope.conf
>
> (after serving some requests)
>
> marcink 5028 0.0 0.4 70084 13340 ? Sl Jul10 0:02 /usr/bin/python /home/marcink/Zope/Install/lib/python/Zope2/Startup/run.py -C /home/marcink/Zope/Instance/etc/zope.conf

I've run a few tests on my OSX machine, can you run them on your box?

First, brand new app, started up with paster serve:

bbangert 5796 1.0 -1.0 64688 20136 p3 S+ 4:56PM 27:00.89 /opt/local/Library/Frameworks/Python.framework/Versi

Next, Paster serve + basic WSGI function app (no Pylons at all):

bbangert 6922 0.0 -0.3 49816 6368 p7 S+ 12:19PM 0:00.88 /opt/local/Library/Frameworks/Python.framework/Versi

And finally, base Python itself, running Python on the prompt and not
loading a single thing....

bbangert 6940 0.0 -0.1 36956 2776 p8 S+ 12:22PM 0:00.11 /opt/local/Library/Frameworks/Python.framework/Versi


So paster adds about 4 megs resident and 10 megs virtual over base
Python, and Pylons adds another 10 megs virtual and 14 megs
resident. I think Pylons could be optimized some more, but at this
point I don't think it's a high priority: given the other features
we're working on, trying to make Pylons run in exceptionally limited
VPS's just isn't a priority. (Also note that Webfaction for a mere
7.50/month can host a Pylons app - http://www.webfaction.com/pylons-hosting)
Cheap VPS's (like serveraxis.com) start around $30/mth with half a gig
of RAM and a gig of swap... I'm a bit curious what VPS you're on that's
so constrained; maybe you'd be better off with a shared provider like
Webfaction?

If you'd like to contribute a patch or continue to hunt down what's
taking the RAM, we'll be happy to apply patches.

Cheers,
Ben

Ian Bicking

Oct 18, 2007, 3:37:33 PM
to pylons-...@googlegroups.com
Ben Bangert wrote:
> If you'd like to contribute a patch of continue to hunt down what's
> taking the ram, we'll be happy to apply patches.

Working scripts/tools to give analysis of where the memory is being used
would probably be even more useful than patches. There are tools out
there, but which one works best and how to invoke it is something I
don't think any of us have really figured out properly. If someone does
that, then if there's low-hanging fruit now or in the future it can
probably be handled easily.
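As one starting point for such a tool, here is a sketch (my own, not an existing utility): snapshot live objects by type via the garbage collector and diff two snapshots. Counts are object populations rather than bytes, but a type whose population keeps growing between snapshots is a strong leak hint:

```python
import gc
from collections import Counter

def object_census():
    """Count live objects by type name, as seen by the collector."""
    return Counter(type(o).__name__ for o in gc.get_objects())

def census_diff(before, after, limit=10):
    """Return the types whose population grew most between snapshots."""
    growth = Counter(after)
    growth.subtract(before)
    return growth.most_common(limit)
```

Take a census after startup and another after a few thousand requests, then diff them to see which types account for the growth.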

Marcin Kasperski

Oct 19, 2007, 5:35:51 AM
to pylons-...@googlegroups.com
> Are the Zope and Moin numbers from the same environment your Pylons
> is running on?

Yes. Debian on i386 in both cases.


Marcin Kasperski

Oct 19, 2007, 5:43:03 AM
to pylons-...@googlegroups.com
> I've run a few tests on my OSX machine, can you run them on your box?

Could you share the code and config of the test apps? I wouldn't like
to experiment too much with creating them....

> Cheap VPS's (like serveraxis.com) start around $30/mth with half
> a gig of ram, and a gig of swap...

Well, I wouldn't call $30/m cheap, at least considering this is a hobby
project, not a business one. Of course it matters that I earn in
Poland, not in the US...
But this is outside the main discussion. Paying some attention to
memory usage makes sense even in stronger setups; RAM is still
fairly often the main scalability limit in web setups.

Graham Dumpleton

Oct 21, 2007, 7:13:08 PM
to pylons-discuss
One more thing to possibly add to this list of suggestions for cutting
down Apache memory usage.

If you are using a Linux system and are using the worker MPM for
Apache, you should see if the Apache configuration you are using
defines a value for the ThreadStackSize directive. It seems that if
this isn't set, then on Linux the operating system will allocate the
default 8MB of virtual memory per thread, to be consumed by private
and stack memory. Since some VPSes seem to count virtual memory usage
(perhaps even if not yet allocated) as actual memory used in some
strange way, this can result in the appearance that you are using huge
amounts of memory.

You might therefore consider using this directive and drop down the
amount of memory allocated per thread. An alternative is to use ulimit
in the Apache startup script to override the stack resource limit.

Obviously you should test various settings on a test setup and ensure
that your particular application doesn't fail due to the reduction,
thus running out of stack memory.
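The ulimit alternative can be sketched like this, assuming a Linux box where the 8MB default applies; the line would go near the top of the Apache startup script, whose exact location varies by distribution:

```shell
# Cap the per-thread stack at 512 kB instead of the Linux default of
# 8 MB of virtual memory; processes started from this shell inherit it.
ulimit -s 512
# Confirm the limit now in effect for this shell:
ulimit -s
```

As the message says, test that your application still works under the reduced limit before relying on it.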

For further discussion on this specific issue see the following thread
on Apache developers list.

http://marc.info/?l=apache-httpd-dev&m=119297997210029&w=2

Also, the Apache documentation for the ThreadStackSize directive:

http://httpd.apache.org/docs/2.2/mod/mpm_common.html#threadstacksize

Note, using ThreadStackSize would only apply for mod_python or
embedded mode of mod_wsgi. The ulimit approach would also apply for
mod_wsgi daemon mode. Changes to mod_wsgi daemon mode to allow
ThreadStackSize to apply are being looked at.

Graham


Jon Rosebaugh

Oct 21, 2007, 7:49:06 PM
to pylons-...@googlegroups.com
On 10/18/07, Marcin Kasperski <Marcin.K...@softax.com.pl> wrote:
> > I'd suggest checking out nginx or lightty.
>
> I am watching with interest efforts to port WSGI to nginx which show
> up on some lists. Nevertheless, it seems it will take some time...

What needs to be ported? nginx already supports reverse-proxying to an
HTTP server, and paste.httpserver is an excellent app server.

Graham Dumpleton

Oct 21, 2007, 9:20:52 PM
to pylons-discuss
On Oct 22, 9:49 am, "Jon Rosebaugh" <chai...@gmail.com> wrote:

FWIW, I also never quite understood the basis for getting the
equivalent of Apache mod_wsgi working on nginx because how nginx works
somewhat limits its overall usefulness. All the same, I haven't
objected to the person trying and using code from Apache mod_wsgi in
the process.

The big difference between how Apache works and how nginx works
(correct me if I am wrong) is that nginx uses an asynchronous event
driven model, running with a fixed number of worker processes. This
makes it somewhat hard to extend it with embedded modules that are
able to block, as when the application supported by the module blocks,
the whole process blocks. Although with Apache prefork MPM any one
child process can also block completely, Apache can dynamically create
additional child processes to cope with a spike in demand or when
existing child processes are busy.

In nginx, my understanding is that it isn't able to create additional
worker processes like Apache is able to, which would be somewhat
limiting. Also understand that nginx isn't internally thread-safe at
this point, so trying to use Python-level threads to allow a blocking
mode of operation on top of nginx's asynchronous model also seems
problematic.

Thus, in order to support still being able to handle other requests
such as those for static files, or proxying requests through to
another HTTP server or FASTCGI process, my understanding is that they
originally expected that the nginx with embedded WSGI support would
itself be run as a backend server to which you would proxy from a
front end nginx instance which would do all the other work. In other
words, WSGI in nginx is merely attempting to use the nginx web server
stack in the belief that it would provide a better performing WSGI
server, but not really provide the services of a general purpose web
server at the same time.

In this respect, it isn't really any different to proxying to
paste.httpserver or using a nginx FASTCGI solution.

So Jon, we may not see eye to eye on how Apache mod_wsgi is
implemented or when it would be appropriate to use it, but on nginx
mod_wsgi, we would at least appear to have the same view. ;-)

Graham

Marcin Kasperski

Oct 22, 2007, 8:12:09 AM10/22/07
to pylons-...@googlegroups.com
> you should see if the Apache configuration you are using
> defines a value for the ThreadStackSize directive.

Thanks for the hint.

I am still to make my experiments, but here is initial summary for
this thread from my point of view.

It is highly suspected that a significant part of the RAM consumption
comes from thread stacks and thread-local data. Therefore some
tuning here is likely to improve things.

The following paths are to be explored:

1) Keep using embedded mod_wsgi (and apache worker mode), but:
   - experiment with ThreadStackSize, trying to reduce it from the default 8MB
   - tune the worker thread and process counts (standard
     apache worker parameters)
   - check whether any unnecessary apache modules are loaded
   - as a safety measure, set some MaxRequestsPerChild

2) Consider (and test) mod_wsgi daemon mode. Its main advantage over
   embedded mode is that it allows tuning the wsgi threads (count,
   stack size...) separately from the main apache threads.

3) Serve content using standalone paste, either directly or behind a
   reverse proxy, and use
     [server:main]
     threadpool_nworkers = 5
   to tune the worker thread count (default is 10).
   In that case, nginx is to be explored as an alternative to apache
   (to serve the static content and perform reverse proxying)

(I leave out the idea of using fastcgi/scgi with flup, as I do not think
it has any advantages over 2) and 3) in my case; similarly I leave out
the idea of standalone cherrypy, as I do not know of any advantages
over paste)
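For option 3, one additional knob worth knowing about (my addition, not
something discussed above, and it requires Python 2.5+) is
`threading.stack_size()`, which caps the stack reservation of any thread
started after the call:

```python
import threading

# Cap the stack reserved for threads started after this call.
# 512 KB instead of the ~8 MB Linux default; too small a value will
# break deep call stacks, so test your application first.
threading.stack_size(512 * 1024)

t = threading.Thread(target=lambda: None)
t.start()
t.join()
```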

Bob Ippolito

Oct 22, 2007, 8:28:59 AM10/22/07
to pylons-...@googlegroups.com
On 10/17/07, Graham Dumpleton <Graham.D...@gmail.com> wrote:
>
> Note that in general the argument that lighttpd and nginx are better
> because they are quicker in serving static files is totally
> meaningless in the context of Python web applications, as the real
> work is being done in the Python application and that along with any
> database access will be your bottleneck. Unless your web application
> is specifically being used as a mechanism for working with large
> numbers of very large distinct static media files, you will not really
> gain anything from using lighttpd or nginx as static files isn't the
> bottleneck. All you will possibly do is just make your setup and
> configuration more complicated than it needs to be.

That's not true. nginx or lighttpd are far better than Apache +
mod_proxy by any metric, at least with Apache 2.0.x on FreeBSD, Mac OS
X, or Linux (other platforms I have not tried myself). Also, unless
you're using lots of mod_rewrite, nginx isn't any more complicated
than Apache. In fact, I find it much easier because you don't have to
figure out what you have to turn off.

In an environment with only 96 MB available, using a web server that
uses forks or threads is a really bad choice anyway. That pretty much
rules Apache 2.0.x out.

-bob

Bob Ippolito

Oct 22, 2007, 8:34:15 AM10/22/07
to pylons-...@googlegroups.com

But any of those pages are going to get written to any time you touch
any of those objects, due to the ref count changing. If you actually use
those libraries, the savings should disappear pretty fast. Doing this
for C code makes more sense, since you don't increase the ref count on
C functions when you call them, but I think most modern OSes already
share C libraries across processes using a copy-on-write strategy.
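The refcount effect described here is easy to see from Python itself (a
minimal sketch of the mechanism: CPython stores the refcount inside the
object, so merely referencing an object writes to its memory page,
defeating copy-on-write sharing in a forked child):

```python
import sys

x = object()
before = sys.getrefcount(x)

# Merely creating new references mutates the object's refcount field,
# which lives inside the object itself -- the page holding it gets
# dirtied even though we never "change" the object.
refs = [x] * 1000
after = sys.getrefcount(x)

print(after - before)  # 1000
```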

-bob

Graham Dumpleton

Oct 22, 2007, 7:24:16 PM10/22/07
to pylons-discuss
On Oct 22, 10:28 pm, "Bob Ippolito" <b...@redivi.com> wrote:

The complexity I am referring to is not in configuring nginx, but that
you also have to install and configure some separate supervisor system
which starts up and ensures that your backend process is running. You
also need to integrate that into the system startup scripts whereas
web servers such as Apache on prepackaged Linux distributions
generally have that. This is extra software dependencies and extra
work to setup. Although it may be manageable if you are running a
single web application, it becomes a lot more work if you want to run
a mix of web applications, or multiple sites, all in distinct
processes as you may then need separate supervisor configurations and
startup scripts for each. Because Apache when using appropriate
modules can itself act as the supervisor for the applications, it all
becomes somewhat simpler as the configuration is all in one spot and
all the startup scripts become irrelevant as that would already exist
for Apache.

BTW, Apache/mod_proxy and Apache/fastcgi/flup are not the fastest
ways of supporting WSGI applications with Apache; better-performing,
easier-to-set-up solutions exist that don't need horrible rewrite
rules. So, comparing to those solutions doesn't give the full
picture.

Also, as I keep trying to put out, speed of request handling and
proxying is generally not the problem as your bottleneck is more often
than not going to be in your Python web application or in the way it
interacts with a database. Thus, for most it would be far more
important to use a system which has less dependencies on third party
packages and is easier to setup.

True, the answer may not always be Apache, especially if forced to
use a memory-constrained environment more tailored to PHP web
application hosting. In some respects though this points out an issue
that Python web developers still have yet to come to grips with. That
is that web hosting solutions offered by companies tend to be tailored
to PHP programming which has a model which throws away stuff at the
end of each request. This is in contrast to Python where everything
generally persists and so applications are fatter. This isn't helped
by Python's batteries-included mentality, as all that does is cause
additional memory bloat due to a lot of dead, unused code being
imported that you don't really need. If Python folks expect to work
within the constrained environments offered for PHP web applications,
then the large frameworks have to find ways of cutting down on memory
use somehow.

Graham


Graham Dumpleton

Oct 22, 2007, 7:26:14 PM10/22/07
to pylons-discuss
On Oct 22, 10:12 pm, Marcin Kasperski <Marcin.Kasper...@softax.com.pl>
wrote:

> 2) Consider (and test) mod_wsgi daemon mode. Main advantage over
> embedded mode is that it will allow to tune wsgi threads (count,
> stack size...) separately from main apache threads.

Note though that mod_wsgi daemon mode doesn't allow tuning the stack
size on a per-process-group basis yet. The only way of tuning it at
the moment is to use 'ulimit -s' within the environment of Apache
itself, i.e., set it from the Apache 'envvars' startup file.
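A sketch of that, assuming the Debian/Ubuntu layout where apachectl
sources /etc/apache2/envvars at startup (the path and value are
illustrative):

```shell
# Appended to Apache's envvars file. Caps the per-thread stack (in KB)
# for everything Apache spawns, including mod_wsgi daemon processes.
# Test your application against the reduced limit before relying on it.
ulimit -s 1024
```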

Graham

Cliff Wells

Oct 22, 2007, 8:53:00 PM10/22/07
to pylons-...@googlegroups.com
On Thu, 2007-10-18 at 12:18 +0200, Marcin Kasperski wrote:

> I am watching with interest efforts to port WSGI to nginx which show
> up on some lists. Nevertheless, it seems it will take some time...

Actually I believe Manlio's implementation is usable right now for WSGI
2.0, with WSGI 1.0 support slated for the next release.

http://hg.mperillo.ath.cx/nginx/mod_wsgi/

Regardless, WSGI support isn't required to run Pylons under Nginx. I
personally use plain HTTP proxying and am quite satisfied.

Regards,
Cliff

Graham Dumpleton

Oct 22, 2007, 8:59:52 PM10/22/07
to pylons-discuss

On Oct 23, 10:53 am, Cliff Wells <cl...@develix.com> wrote:
> On Thu, 2007-10-18 at 12:18 +0200, Marcin Kasperski wrote:
> > I am watching with interest efforts to port WSGI to nginx which show
> > up on some lists. Nevertheless, it seems it will take some time...
>
> Actually I believe Manlio's implementation is usable right now for WSGI
> 2.0, with WSGI 1.0 support slated for the next release.
>
> http://hg.mperillo.ath.cx/nginx/mod_wsgi/

Except that there is no such thing as WSGI 2.0. The only version of
WSGI that exists is 1.0.

What they have done is taken one core idea which had been put up for
future discussion for a WSGI 2.0 and implemented it. What has been
done should not be called WSGI 2.0, as there isn't a guarantee that
WSGI 2.0 will ever even be created or that it will look like what they
have implemented.

For further information about the ideas that have been put up for a
successor to WSGI 1.0, then see:

http://www.wsgi.org/wsgi/WSGI_2.0

Graham

Cliff Wells

Oct 22, 2007, 9:54:15 PM10/22/07
to pylons-...@googlegroups.com
On Mon, 2007-10-22 at 23:24 +0000, Graham Dumpleton wrote:

> The complexity I am referring to is not in configuring nginx, but that
> you also have to install and configure some separate supervisor system
> which starts up and ensures that your backend process is running.

I think this argument doesn't really hold. Either you "have to" install
a supervisor process or you don't. If you do "have to", then you
arguably need to install one for Apache as well, so the complexity is
equivalent. If you don't "have to" (I've had Pylons and TurboGears
sites up for months without restarts or supervisors), then once again
the complexity is even.

> You also need to integrate that into the system startup scripts whereas
> web servers such as Apache on prepackaged Linux distributions
> generally have that.

If by "integrate" you mean add a line or two to /etc/rc.local (or your
distro's equivalent), then I'll agree with you, but simultaneously fail
to see your point.

> This is extra software dependencies and extra work to setup.

I'll take this added complexity, subtract the complexity of reducing
Apache's memory footprint and still come out with an overall savings in
complexity.

I'll agree that Apache can be made more efficient (ignoring the
threading vs async debate) but doing so is far more complicated and
error-prone than installing Nginx and a supervisor. Apache starts out
big and must be pruned down, Nginx starts out small and pretty much
stays there regardless.

> Although it may be manageable if you are running a
> single web application, it becomes a lot more work if you want to run
> a mix of web applications, or multiple sites, all in distinct
> processes as you may then need separate supervisor configurations and
> startup scripts for each.

Since the OP mentioned 96MB of RAM, I'll assume he's running a single
web app. Unless he requires mod_svn or some other esoteric module, there's
simply no way Apache can win here.

> Because Apache when using appropriate
> modules can itself act as the supervisor for the applications, it all
> becomes somewhat simpler as the configuration is all in one spot and
> all the startup scripts become irrelevant as that would already exist
> for Apache.

I know this is difficult for an Apache expert to grok, but configuring
Apache is anything but easy to understand. Some of this is due to
Apache's rather large feature-set, but some of it is also due to
Apache's rather weird and limited configuration syntax.

> Also, as I keep trying to put out, speed of request handling and
> proxying is generally not the problem as your bottleneck is more often
> than not going to be in your Python web application or in the way it
> interacts with a database. Thus, for most it would be far more
> important to use a system which has less dependencies on third party
> packages and is easier to setup.

In general I agree, but if Apache is sucking down 1/10th or more of your
memory (and with 96MB, that's certainly likely), then it will certainly
reduce overall system performance significantly as disk caching will
suffer and you'll be forced to swap much sooner.

> True, the answer may not always be Apache, especially, if forced to
> use a memory constrained environment more tailored to PHP web
> application hosting.

I generally find Apache preferable to Nginx when one or more of the
following conditions is present:

1) Some esoteric feature of Apache is needed.
2) The customer/user is an Apache expert.
3) I don't care about the installation and hope someone else has to
maintain it.

> If Python folks expect to work
> within the constrained environments offered for PHP web applications,
> then the large frameworks have to find ways of cutting down on memory
> use somehow.

I'm not sure the blame is being aimed in the right direction here. I
don't think PHP is what Pylons is competing with and I have a definite
concern that tailoring it to run in a $7/month hosting account would
require too deep of cuts. RoR is much more of a direct competitor, and
as has been noted, Pylons uses a fraction of the RAM RoR does (not to
mention CPU).

Quite frankly, were I consulting with a customer in the same position as
the OP, what I'd be suggesting is they quit trying to run their site in
a toy environment. I think a 96MB VPS is fine for testing, but not a
realistic deployment environment for much more than a personal blog or
photo album.


As an aside, my Pylons (paste+cherrypy) app for Breve only uses around
48MB after a couple months of uptime (there are four Pylons backends
load balanced by Nginx):

hosting ~ # ps -U twisty -o rss,vsz,command
RSS VSZ COMMAND
48000 132720 /usr/bin/python /var/www/virtual/twisty-industries.com//bin/paster serve breve1.ini
48312 133264 /usr/bin/python /var/www/virtual/twisty-industries.com//bin/paster serve breve2.ini
46024 130860 /usr/bin/python /var/www/virtual/twisty-industries.com//bin/paster serve breve3.ini
47920 132860 /usr/bin/python /var/www/virtual/twisty-industries.com//bin/paster serve breve4.ini


This leads me to believe there is something wrong with the OP's
application or perhaps the leak exists in a layer outside the Pylons
core (i.e. template engine).


Regards,
Cliff

Graham Dumpleton

Oct 23, 2007, 1:58:43 AM10/23/07
to pylons-discuss
On Oct 23, 11:54 am, Cliff Wells <cl...@develix.com> wrote:
> On Mon, 2007-10-22 at 23:24 +0000, Graham Dumpleton wrote:
> > The complexity I am referring to is not in configuring nginx, but that
> > you also have to install and configure some separate supervisor system
> > which starts up and ensures that your backend process is running.
>
> I think this argument doesn't really hold. Either you "have to" install
> a supervisor process or you don't. If you do "have to", then you
> arguably need to install one for Apache as well, so the complexity is
> equivalent. If you don't "have to" (I've had Pylons and TurboGears
> sites up for months without restarts or supervisors), then once again
> the complexity is even.

You do not need a separate supervisor if you use mod_python, mod_wsgi
or mod_fastcgi. In the case of mod_python and embedded mode of
mod_wsgi the standard Apache child process supervisor does it all for
you. For mod_wsgi daemon mode the mod_wsgi implementation makes use of
the Apache runtime library supervisor directly, the same one used by
Apache to manage its child processes. For mod_fastcgi it contains its
own supervisor implementation.

In all cases the supervisor is a part of Apache or provided by the
Apache module in question, there is no need to install a separate
supervisor system. It is only when you use mod_proxy to a back end web
server, or use fastcgi in external process mode that you need to
install a separate supervisor. Thus the supervisor comes for free if
you don't restrict yourself to using mod_proxy to a back end web
server.

> > You also need to integrate that into the system startup scripts whereas
> > web servers such as Apache on prepackaged Linux distributions
> > generally have that.
>
> If by "integrate" you mean add a line or two to /etc/rc.local (or your
> distro's equivalent), then I'll agree with you, but simultaneously fail
> to see your point.

And that may well be fine if you know your way around UNIX system
administration. More and more you are getting people coming to Python
web application programming who know nothing about UNIX system
administration, so if getting things to work simply means adding a
line or two in the Apache configuration to get it all going, it is
much better than having to install a separate supervisor system and
ensure that the system starts it all up on a reboot.

An Apache installation on the other hand is already going to be setup
to start on boot. Using periodic cron jobs to check if an application
is running and restart it, or using 404 error document handler scripts
to trigger starting of an application like one sees suggested
occasionally are not robust solutions.

> > This is extra software dependencies and extra work to setup.
>
> I'll take this added complexity, subtract the complexity of reducing
> Apache's memory footprint and still come out with an overall savings in
> complexity.

Use mod_wsgi daemon mode or mod_fastcgi and the memory footprint of
Apache is not the big deal that people think it is as you can limit
your web application to run in a single or limited number of threaded
processes, thereby getting rid of most of the problems. Yes, this is
not much different to mod_proxy in terms of architecture, but you are
able to skip the need for a separate supervisor if you want to ensure
your application always stays running. You are of course throwing out
the good features of Apache in as far as its scalability, but then if
you need to do this, scalability is the least of your worries.

Whether you use Apache or a backend web server implemented in Python,
the fact remains that the bulk of the memory consumption comes from
the Python web framework and application. Just the starting memory
size for the most popular Python web frameworks range from 5MB up to
15MB and that is before any of the user specific code has been loaded.

Even if you use a Python web server backend, on Linux it would appear
you are still going to have to contend with the large stack size it
allocates to each thread as that is a property of Linux and not
Apache, so you still need to configure your system around that, or
reduce the numbers of threads you use, something that can also be done
with Apache daemon process solutions. But then, why that is an issue I
don't know, for reasons mentioned below.

> > Although it may be manageable if you are running a
> > single web application, it becomes a lot more work if you want to run
> > a mix of web applications, or multiple sites, all in distinct
> > processes as you may then need separate supervisor configurations and
> > startup scripts for each.
>
> Since the OP mentioned 96MB of RAM, I'll assume he's running single web
> app. Unless he requires mod_svn or some other esoteric module, there's
> simply no way Apache can win here.

If one doesn't know that running your Python web application embedded
within the Apache child processes is not your only option then this is
true, but mod_proxy is not the only solution, other options do exist
such as mod_wsgi daemon mode and mod_fastcgi where Apache in effect
still manages the processes for you but the application itself is in a
distinct standalone process. Use these solutions and it is just a
matter of selecting the correct Apache MPM and setting the number of
Apache child processes/threads to sensible values based on the
constrained memory available to ensure the Apache child processes are
predominant. Do that and Apache can be an acceptable solution.

> > Because Apache when using appropriate
> > modules can itself act as the supervisor for the applications, it all
> > becomes somewhat simpler as the configuration is all in one spot and
> > all the startup scripts become irrelevant as that would already exist
> > for Apache.
>
> I know this is difficult for an Apache expert to grok, but configuring
> Apache is everything except easy to understand. Some of this is due to
> Apache's rather large feature-set, but some of it is also due to
> Apache's rather weird and limited configuration syntax.

In practice, neither Apache's or lighttpd's configuration syntax is
going to be easy for a new user. Whether one thinks one or the other
is simpler is ultimately going to be due to familiarity. I don't know
what nginx configuration is like, so I can't comment on it.

> > If Python folks expect to work
> > within the constrained environments offered for PHP web applications,
> > then the large frameworks have to find ways of cutting down on memory
> > use somehow.
>
> I'm not sure the blame is being aimed in the right direction here. I
> don't think PHP is what Pylons is competing with and I have a definite
> concern that tailoring it to run in a $7/month hosting account would
> require too deep of cuts. RoR is much more of a direct competitor, and
> has been noted, Pylons uses a fraction of the RAM RoR does (not to
> mention CPU).
>
> Quite frankly, were I consulting with a customer in the same position as
> the OP, what I'd be suggesting is they quit trying to run their site in
> a toy environment. I think a 96MB VPS is fine for testing, but not a
> realistic deployment environment for much more than a personal blog or
> photo album.

I'd agree that people trying to use these 96MB VPSes for real stuff is
a bit silly, but some people can't seem to be convinced that they
should spend a bit more money on better solutions. At the same time,
web hosting companies don't seem to want a middle ground, with many
instead just offering a high-priced premium service or a low-cost
basic service only really suitable for PHP sites.

This may not change until a simple to deploy, easy to configure and
manage solution is available for Python web hosting. Such a solution
should also be easy to deal with as far as the user is concerned in as
much as issues such as code reloading. Too often you see complaints
about Python because of the need for the whole web server to be
restarted with some of the more popular hosting solutions, when even a
simple code change has been made. People compare this to PHP and point
out how that isn't required there. Web hosting companies see this
as well, and consequently it probably deters many from trying to
support Python as they see it as requiring more intervention to
manage.

> This leads me to believe there is something wrong with the OP's
> application or perhaps the leak exists in a layer outside the Pylons
> core (i.e. template engine).

What doesn't make sense to me about the OP's problem is why, on the
VPS being used, the virtual memory size even gets counted toward the
process size limits. Only the RSS should matter, but I have seen
it a number of times where people suggest this is not how the web
hosting companies are measuring it. If it was truly done by actual
memory used and not virtual memory, there shouldn't even be a need to
adjust the per thread stack size as the thread never actually uses
that memory in the first place. So, is it the web hosting companies
calculating things wrongly, or do these VPS systems allocate actual
physical memory in the outer container they run in, even though the OS
processes in the VPS haven't actually mapped the virtual memory yet,
causing it to become real memory?

Now, I know some will be rolling their eyes at this point thinking
that Graham is off again on his crusade to make everyone use Apache.
Although I prefer Apache, it is not specifically what I am up to. What
I would like to see is simply that people know what the choices are
and any trade off with using one solution over another. I just find it
annoying when people indiscriminately say don't use Apache, when there
are actually various solutions for Apache that exist and each has
their strengths and weaknesses in different areas. The same goes for
lighttpd and nginx. They may be much better than Apache for static
file serving, especially large media files and light weight proxying,
but when hosting large Python web applications, especially where you
may need to serve a mix of Python web applications such as Django,
Trac and Mercurial source code browsers, or many different sites
running distinct Python web application instances, they may not be
appropriate, and so their speed in serving static files is quite
meaningless.

I would really like to see some members of the Python community as a
whole get together and put together information which balances up all
the available solutions, comparing features and also performance for
different use cases. This would help people make an informed choice
rather than being guided by individuals' personal preferences and
recommendations. Such a comparison would no doubt also help to
identify where Python web hosting solutions are lacking, guiding
improvements, and one day we might even be able to develop a solution
which web hosting companies find acceptable for use in commodity web
hosting. Have that and they may start to relate to what we need and
provide us better solutions instead of these PHP-biased setups.

Graham

Marcin Kasperski

Oct 31, 2007, 11:39:40 AM10/31/07
to pylons-...@googlegroups.com

Just a (post-mortem?) note. After a very good experience migrating
(on some non-pylons site) from moinmoin hosted on apache with fastcgi
to moinmoin on twisted proxied behind nginx (I did not take any formal
measurements, but it just feels like the app runs faster), I opted for
a similar path on my pylons site. So I run nginx (to serve static
resources and to reverse-proxy pylons) and paster (to serve pylons
content).

I am extremely happy with nginx memory usage (<4MB virtual, 700kB
resident).

I am still not very happy with paster (~100MB virtual, 22MB resident),
and hoping to search for some improvements here. But at least it is
just one such process...

Unfortunately this:

> [server:main]
> threadpool_nworkers = 5

did not work for me. Merely entering this into the ini file results
in the following error:

$ paster serve runtime/development.ini
Starting server in PID 16404.
Traceback (most recent call last):
  File "/usr/bin/paster", line 7, in ?
    sys.exit(
  File "/usr/lib/python2.4/site-packages/PasteScript-1.3.6-py2.4.egg/paste/script/command.py", line 78, in run
    invoke(command, command_name, options, args[1:])
  File "/usr/lib/python2.4/site-packages/PasteScript-1.3.6-py2.4.egg/paste/script/command.py", line 117, in invoke
    exit_code = runner.run(args)
  File "/usr/lib/python2.4/site-packages/PasteScript-1.3.6-py2.4.egg/paste/script/command.py", line 212, in run
    result = self.command()
  File "/usr/lib/python2.4/site-packages/PasteScript-1.3.6-py2.4.egg/paste/script/serve.py", line 232, in command
    server(app)
  File "/usr/lib/python2.4/site-packages/paste/deploy/loadwsgi.py", line 139, in server_wrapper
    wsgi_app, context.global_conf,
  File "/usr/lib/python2.4/site-packages/paste/deploy/util/fixtypeerror.py", line 57, in fix_call
    val = callable(*args, **kw)
  File "/usr/lib/python2.4/site-packages/Paste-1.4-py2.4.egg/paste/httpserver.py", line 1303, in server_runner
    serve(wsgi_app, **kwargs)
  File "/usr/lib/python2.4/site-packages/Paste-1.4-py2.4.egg/paste/httpserver.py", line 1253, in serve
    threadpool_options=threadpool_options)
  File "/usr/lib/python2.4/site-packages/Paste-1.4-py2.4.egg/paste/httpserver.py", line 1108, in __init__
    ThreadPoolMixIn.__init__(self, nworkers, daemon_threads,
TypeError: __init__() got multiple values for keyword argument 'nworkers'

Ben Bangert

Oct 31, 2007, 11:56:44 AM10/31/07
to pylons-...@googlegroups.com
On Oct 31, 2007, at 8:39 AM, Marcin Kasperski wrote:

I am still not very happy with paster (~100MB virtual, 22MB resident),
and hoping to search for some improvements here. But at least it is
just one such process...

Can you ask your ISP if the virtual is counting against some limit? The reason I ask is that virtual doesn't necessarily even reflect swap (which I can believe a picky virtual provider might care about). Virtual can encompass shared libraries, paging files, existing resident, etc. So 100MB virtual is *not* 100 MB of swap. I continue to be baffled why there is such a problem with a number that itself varies wildly depending on architecture (vs the resident size, which *is* consistent across architectures).

For example, according to this detailed explanation of unix memory management:

It would appear that the virtual number you see could even be referencing files on the filesystem (not in swap or ram). This is generally why the most meaningful numbers you *should* be looking at, are the resident memory, and shared memory (useful with something like Postgres to see how much ram each individual process is taking).

Can you elaborate on how you believe your hosting provider is counting the virtual number vs swap files and why you believe that virtual does mean swap? Please keep in mind, the 'virtual' memory on unix is *not* the same as 'virtual' memory on Windows (where it does generally mean swap).
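For the record, the two numbers being argued about can be read straight
from ps with portable `-o` keywords (a sketch; sizes are reported in KB
on Linux):

```shell
# Resident (RSS) and virtual (VSZ) size, in KB, of the current process.
# VSZ counts every mapping (thread stacks, shared libraries, mapped
# files); RSS counts only the pages actually sitting in physical RAM.
ps -o rss=,vsz= -p $$
```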

Cheers,
Ben

Marcin Kasperski

Oct 31, 2007, 12:38:29 PM10/31/07
to pylons-...@googlegroups.com
>> [server:main]
>> threadpool_nworkers = 5
>
> did not work for me.


Resolved. One should use

threadpool_workers = 5

(without 'n'). Then it works. Some observations in the next post.
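For completeness, the working section looks like this (the `use` line is
my assumption about the surrounding config, not quoted from the thread):

```ini
[server:main]
use = egg:Paste#http
threadpool_workers = 5
```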

Marcin Kasperski

Oct 31, 2007, 12:41:09 PM10/31/07
to pylons-...@googlegroups.com
Ben Bangert <b...@groovie.org> writes:

> On Oct 31, 2007, at 8:39 AM, Marcin Kasperski wrote:
>
> I am still not very happy with paster (~100MB virtual, 22MB resident),
> and hoping to search for some improvements here. But at least it is
> just one such process...
>
> Can you ask your ISP if the virtual is counting against some limit?

Well, I recently migrated some app to another host and the problem
does not hurt me that much, so I won't trouble him... But I am almost
sure that it is indeed RAM and swap that are measured; I can't even
imagine how it could be otherwise.


> The reason I ask is that virtual doesn't necessarily even reflect
> swap (which I can believe a picky virtual provider might care
> about). Virtual can encompass shared libraries, paging files,
> existing resident, etc. So 100MB virtual is *not* 100 MB of swap.

This is true. But this is also relatively easy to measure: it suffices
to compare the virtual size of the running app with the virtual size of
a python process that has just loaded all the libraries. The latter is
fairly well approximated by paster shell.

The latter allocates about 27MB of virtual on my host (but almost 20MB
of resident).


BTW, in the meantime I resolved the workers problem, and indeed it
seems it is all about per-thread storage. See for yourself
(measurements after starting and serving one request):


* threadpool_workers = 5

virtual 66084, resident 18956

* threadpool_workers = 10

virtual 108104, resident 18900

* threadpool_workers = 20

virtual 189060, resident 19132


So it looks like it is all about thread-local storage. Maybe just
the thread stack size...

Ben Bangert

Oct 31, 2007, 3:08:29 PM10/31/07
to pylons-...@googlegroups.com
On Oct 31, 2007, at 9:41 AM, Marcin Kasperski wrote:

> This is true. But this is also relatively easy to measure, it suffices
> to compare virtual size of the running app with virtual size of python
> which just loaded all the libraries. The latter is fairly well
> approximated by paster shell.
>
> The latter allocates about 27MB of virtual on my host (but almost 20MB
> of resident).

To try and illustrate how useless the virtual number you're so worried
about is: on my OS X unix box, according to the OS, it's using 9.6GB
of virtual memory. Now, I know where the real swap file it uses is
actually stored, and while top says 9.6GB of virtual, the swap file is
64 megs....

For another test, to really try and gauge what is occurring, I opened
up some swap counters to watch swap files page in and page out. Then I
started increasing the threadpool workers to see whether swapping was
occurring.

At 15 threadpool:
VM: 65MB, Res: 17.13MB

At 65 threadpool:
VM: 90.65MB, Res: 18.13MB

So yes, increasing threadpool workers is definitely making virtual
memory increase. Should we all be worried and stress out over it?
Well, I was watching paging and swap activity the entire time, and
there was *Zero* pager activity. Zero pageins, Zero pageouts, Zero
increase in swap.

Let me reiterate: the increase in virtual memory did *not* cause any
increase in swap.

Can we please stop worrying over the virtual numbers now?

If you have some sort of hard data about real swap activity, I'd love
to see it, but posting virtual numbers is pointless and does not
represent swap usage. If there's some other unix expert that can
shine more light on what the heck this virtual number does mean,
that'd be great.

In the meantime, I'm considering the virtual memory issue a moot point.

Cheers,
Ben

Graham Dumpleton

Oct 31, 2007, 6:52:53 PM10/31/07
to pylons-discuss

We are talking about VPS systems, not a full OS running directly on
hardware. As a result, the amount of virtual memory may not be a moot
point.

As an example of configuration values for one VPS system:

Memory Limit - maximum virtual memory size a VPS/context can allocate;
Used memory - virtual memory size used by a running VPS/context;
Max total memory - maximum virtual memory usage by VPS/context;
Context RSS limit - maximum resident memory size a VPS/context can
allocate. If the limit is exceeded, the VPS starts to use the host's SWAP;
Context RSS - resident memory size used by a running VPS/context;
Max RSS memory - maximum resident memory usage by VPS/context;
Disk Limit - maximum disk space that can be used by a VPS (calculated
for the entire VPS file tree);
Used disk memory - disk space used by a VPS file tree;
Files limit - maximum number of files that can be switched to a VPS/context;
Used files - number of files used in a VPS/context;
TCP sockets limit - limit on the number of established connections in
a virtual server;
Established sockets - number of established connections in a virtual
server.

In respect of these limits it states:

"""when summary virtual memory size used by VPS/context exceedes
Memory Limit value, processes can't allocate the required memory and
fail to run"""

and:

"""it is recommended to set Context RSS Limit to one third of Memory
Limit"""

In other words the VPS configuration allows the provider to control
both Context RSS Limit and Memory Limit. Feasibly, if a provider is
setting the Memory Limit to be too low a value in relation to the
Context RSS Limit, then how a particular OS allocates virtual memory
can be a problem as it will reach the defined Memory Limit before it
reaches the Context RSS Limit.

This is why how the provider has set up the VPS is an issue. It would
appear that some providers may be setting quite restrictive limits,
perhaps not really understanding how processes make use of memory.

FWIW, this issue has come up before on the Apache httpd developers
list:

http://marc.info/?l=apache-httpd-dev&m=119296272027150&w=2

It was in relation to this that the Linux default per-thread stack
size came up as a contributor.

How valid all this is still needs some investigation, but it is
probably too early to dismiss it as a non-issue.

Graham

Graham Dumpleton

Oct 31, 2007, 7:44:34 PM10/31/07
to pylons-discuss
On Nov 1, 3:41 am, Marcin Kasperski <Marcin.Kasper...@softax.com.pl>
wrote:

What figures do you get for the above if you run:

ulimit -s 512

in your shell prior to running the web application?

You will need to exit the shell when done to get it back to default
value.

If the VPS Memory Limit (virtual memory size) is found to be an issue
for some providers of VPS systems, for Python 2.5 at least, the simple
solution would be to allow a configuration option something like:

threadpool_workers_stackize = 524288

when this is present, then the web server could call:

import thread
thread.stack_size(value)

This way the user could change it if they needed to because of an
overly restrictive VPS configuration.

Graham

Ben Bangert

Oct 31, 2007, 8:05:16 PM10/31/07
to pylons-...@googlegroups.com
On Oct 31, 2007, at 4:44 PM, Graham Dumpleton wrote:

> What figures do you get for the above if you run:
>
>   ulimit -s 512
>
> in your shell prior to running the web application?
>
> You will need to exit the shell when done to get it back to default
> value.
>
> If the VPS Memory Limit (virtual memory size) is found to be an issue
> for some providers of VPS systems, for Python 2.5 at least, the simple
> solution would be to allow a configuration option something like:
>
>   threadpool_workers_stackize = 524288
>
> when this is present, then the web server could call:
>
>   import thread
>   thread.stack_size(value)
>
> This way the user could change it if they needed to because of an
> overly restrictive VPS configuration.

Thanks Graham! I've looked over that Apache thread; the clarifying post was:

I did some more testing, with base Python, then paster shell (which loads a Pylons app but *no* threads), then paster serve (which has threads). Sure enough, all the virtual consumption seems to be due to the thread stack size on Linux being rather hefty.

On a Linux system, Pylons itself consumes about 15MB of virtual (not too unreasonable, I hope?). The massive increase in virtual is due to "paster serve" spawning 10 threads (at 8MB virtual each), plus one watch thread that monitors the other threads in case they get stuck. Since Linux allocates 8MB of stack per thread, this is where the extra 80MB of virtual comes from.

You can run Pylons with FastCGI, directly loading the Pylons app (thus no threadpools). This should work great with your nginx setup as well, and mitigate the virtual issue from the thread stack size on Linux. Your virtual should then hover around 20MB instead of the 100MB you're seeing.

Cheers,
Ben

Ben Bangert

Oct 31, 2007, 8:09:12 PM10/31/07
to pylons-...@googlegroups.com
On Oct 31, 2007, at 5:05 PM, Ben Bangert wrote:

> You can run Pylons with FastCGI, directly loading the Pylons app
> (thus no threadpools). This should work great with your nginx setup
> as well, and mitigate the Virtual issue from the thread size on
> linux. Your virtual should then hover around 20mb instead of the
> 100mb you're seeing.

Actually, Ian clarified that the threadpool can be turned off. Add
the following to the [server] section:
use_threadpool = false

On my linux system, that cut the Virtual from 100Mb to 18Mb.

Cheers,
Ben

Marcin Kasperski

Nov 5, 2007, 6:53:48 AM11/5/07
to pylons-...@googlegroups.com
> Actually, Ian clarified that the threadpool can be turned off. Add the
> following to the [server] section:
> use_threadpool = false
>
> On my linux system, that cut the Virtual from 100Mb to 18Mb.

But this is then single-process mode, I guess?

Marcin Kasperski

Nov 5, 2007, 7:08:30 AM11/5/07
to pylons-...@googlegroups.com
>> * threadpool_workers = 10
>>
>> virtual 108104, resident 18900
>> (...)

>> So it looks like it is all about thread-local storage. Maybe just
>> the thread stack size...
>
> What figures do you get for the above if you run:
>
> ulimit -s 512

*Far* smaller; for example with workers=10 I got about 27MB virtual
after startup and about 30MB later. Must give it some testing, but it
looks like an interesting way to go....

BTW, in the meantime I added a database connection to my app. Whoa,
there is another pool there ;-) Fortunately it is also possible to tune
(sqlalchemy.default.pool_size, sqlalchemy.default.max_overflow).
And the ulimit trick impacts this one too.

> If the VPS Memory Limit (virtual memory size) is found to be an issue
> for some providers of VPS systems, for Python 2.5 at least, the simple
> solution would be to allow a configuration option something like:
>
> threadpool_workers_stackize = 524288
>
> when this is present, then the web server could call:
>
> import thread
> thread.stack_size(value)
>

Thanks for the hint. I am still using python 2.4 (mainly out of laziness).
I suspect the result would be the same....

Has anybody ever tried measuring the actual stack usage in typical
pylons apps?


Marcin Kasperski

Nov 5, 2007, 7:33:54 AM11/5/07
to pylons-...@googlegroups.com
> Any ideas how could I reduce pylons app RAM usage? At the moment
> pylons process takes above 100MB (almost static site serving some
> templates) - both when run with paste, and when run under mod_wsgi.
> Quite a lot, considering that for example whole moinmoin manages to
> work on ~25MB.

Short summary for possible future readers:

a) High pylons virtual memory usage is almost solely caused by the
huge stack allocated on Linux by default (the default Linux stack size
is 8MB). It is still not quite clear whether this is really
troublesome (it looks like in typical configurations this stack
remains a purely virtual number), but in some configurations it may
theoretically cause trouble (like reaching VPS limits).

b) An easy way to force this memory down is to use ulimit to narrow
the stack size. For example
ulimit -s 512
to use a 512kB stack (the command may be injected into the shell
script starting paste, or in a similar place).

A more or less equivalent solution for Python >= 2.5
is to call
import thread
thread.stack_size(512 * 1024)
in the application initialization code. An elegant way is to move
this number to the configuration (like threadpool_worker_stack_size).

c) While tuning the memory consumption it also makes sense to
pay attention to the following runtime parameters:

[server:main]
threadpool_workers = 10 # 10 is default

[app:main]
sqlalchemy.default.pool_size = 3 # 5 is default
sqlalchemy.default.max_overflow = 7 # pool_size+max_overflow = max simultaneous database sessions

Those impact not only virtual, but also real memory usage....
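To make the pool_size / max_overflow arithmetic concrete, here is a toy stdlib-only pool (an editor's sketch, deliberately much simpler than SQLAlchemy's real QueuePool). The one rule it demonstrates is that pool_size + max_overflow caps the number of simultaneously checked-out connections:

```python
import queue

class MiniPool:
    """Toy connection pool (editor's sketch, not SQLAlchemy's QueuePool).

    It only demonstrates the sizing rule from the config above:
    at most pool_size + max_overflow connections may be checked out
    at the same time.
    """

    def __init__(self, creator, pool_size=3, max_overflow=7):
        self._creator = creator
        self._idle = queue.Queue()
        self._capacity = pool_size + max_overflow  # hard cap on open connections
        self._open = 0

    def checkout(self):
        try:
            return self._idle.get_nowait()  # reuse an idle connection
        except queue.Empty:
            if self._open >= self._capacity:
                raise RuntimeError("pool exhausted")
            self._open += 1
            return self._creator()          # open a fresh connection

    def checkin(self, conn):
        self._idle.put(conn)                # return it for reuse

# With pool_size=3 and max_overflow=7, exactly 10 simultaneous
# checkouts succeed and the 11th fails.
pool = MiniPool(creator=object, pool_size=3, max_overflow=7)
conns = [pool.checkout() for _ in range(10)]
try:
    pool.checkout()
except RuntimeError as exc:
    print(exc)  # → pool exhausted
```

Each pooled connection also holds real memory on both the client and the database server, which is why shrinking these two numbers lowers resident usage, not just virtual.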


The text above relates to running the application under paste, but a
similar idea can be applied when using mod_wsgi. If using embedded
mod_wsgi, one may consider using the Apache ThreadStackSize directive;
if using mod_wsgi daemon mode (which allows one to configure the
number of worker processes and threads in detail), ulimit -s may be
used (future releases of mod_wsgi are also to have configuration
parameters to tune this).

Thanks everybody for many valuable suggestions.

PS This thread did not resolve the old 'what is the best pylons
hosting method' question, as that was not its purpose. There are many
options, for example direct paste, paste hidden behind a reverse proxy
(probably best if light, like nginx or cherokee), embedded mod_wsgi,
mod_wsgi in daemon mode, fastcgi, mod_python...

Personally I feel that in a typical application it is a Good Thing to
concentrate Pylons in one (or a few) multithreaded processes (so
python code, template caches, database pools etc. may be shared between
threads) dedicated solely to this task (so their memory is not wasted
on trivial tasks like serving images) - so either paste shielded
behind a light reverse proxy (which also directly serves static data),
or mod_wsgi daemon mode, seems most promising. But in case somebody
would like to discuss this, please ... start a new thread ;-)


Alberto Valverde

Nov 5, 2007, 7:57:33 AM11/5/07
to pylons-...@googlegroups.com
Thanks for the summary Martin! I've copied it to Pylons' wiki [1] so
it's easier to find for reference in the future.

Alberto

[1] http://wiki.pylonshq.com/x/M4Kq

Marcin Kasperski

Nov 5, 2007, 9:03:35 AM11/5/07
to pylons-...@googlegroups.com
Alberto Valverde <alb...@toscat.net> writes:

> Thanks for the summary Martin!

A good old habit from the times when I participated in
tru64-unix-managers. On that list the initial poster was always
expected to create a 'solution' post with the summary info, at least
when the reply was not trivial. I must say it worked great.

> I've copied it to Pylons' wiki [1] so
> it's easier to find for reference in the future.

Good idea. Maybe it would make sense to create a wider 'Pylons tuning'
page and start gathering info there about different aspects of this
complex theme....

Ian Bicking

Nov 5, 2007, 9:59:10 AM11/5/07
to pylons-...@googlegroups.com

No, it spawns a new thread for each request, and the thread dies at the
end of the request. It's a little slower, and in some cases you can get
an excessive number of threads (if you are responding very slowly in
your app), but otherwise it's about the same as with the thread pool.

--
Ian Bicking : ia...@colorstudy.com : http://blog.ianbicking.org
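The thread-per-request mode Ian describes can be sketched as follows (editor's illustration, not Paste's actual code): with use_threadpool = false, each request gets a fresh thread that dies when the request ends, so no idle worker stacks are held between requests.

```python
import threading

def handle(request, results):
    results.append("handled %s" % request)

def serve(request, results):
    # no pool: a fresh thread per request, dying at the end of it
    t = threading.Thread(target=handle, args=(request, results))
    t.start()
    return t

results = []
threads = [serve(i, results) for i in range(3)]
for t in threads:
    t.join()

# all request threads are gone; nothing is kept around for reuse
assert not any(t.is_alive() for t in threads)
print(sorted(results))  # → ['handled 0', 'handled 1', 'handled 2']
```

The trade-off Ian notes falls out of the sketch: creation cost is paid per request, and nothing bounds the thread count if requests are slow, whereas a pool caps concurrency at the price of permanently reserved stacks.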
