With your configuration, yes, that would be expected.
> Is it typical for memory consumption to increase throughout a day?
No, but that is going to be an issue with TurboGears, how you use it,
or your specific application code.
> I've been reading the release notes, was lazy initialisation of Python
> interpreter already existent with version 2.5?
That would reduce the memory used by the other processes a little, but
not your main fat one.
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 444 apache 25 0 1682m 1.1g 10m S 26.0 58.2 35:54.74 httpd
> ...
>
>
> apache .conf file:
Is this seriously all you have in your apache.conf file, or have you
only shown the relevant bits?
I know some people promote the idea of throwing away the complete
Apache configuration and only adding back in the bits thought to be
necessary, but I disagree with that: the defaults in the underlying C
code don't actually match what the stock Apache configuration files
set up, so removing the configuration delivered with Apache can have
unknown consequences.
> =========
> LoadModule wsgi_module modules/mod_wsgi.so
> AddHandler wsgi-script .wsgi
The AddHandler line is not needed if using WSGIScriptAlias.
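For example, a minimal mounting of the application needs only the
module load and the script alias (this is just a sketch reusing the
paths from your own configuration, not a complete tested config):

```apache
# Sketch only: the AddHandler/.wsgi mapping is unnecessary because
# WSGIScriptAlias already routes requests for /tg to the script.
LoadModule wsgi_module modules/mod_wsgi.so
WSGIScriptAlias /tg /home/rarch/trunk/src/appserver/wsgi-config/wsgi-deployment.py
```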
> WSGIPythonHome /home/rarch/tg2env
> WSGIPythonEggs /home/rarch/tg2env/lib/python-egg-cache
You shouldn't need WSGIPythonEggs, as you are using the python-eggs
option to the WSGIDaemonProcess directive below. The WSGIPythonEggs
directive only applies to embedded mode, but you are delegating
everything to a daemon mode process.
> WSGIDaemonProcess rarch threads=15 display-name=%{GROUP} python-eggs=/
> home/rarch/tg2env/lib/python-egg-cache
> WSGIProcessGroup rarch
> WSGISocketPrefix run
I suspect that this value of WSGISocketPrefix is going to leave the
socket listener files in the wrong place. If you do need to override
it like that, it would be:
WSGISocketPrefix run/wsgi
With what you have, the socket files would be left in the Apache root
directory, not in the 'run' subdirectory, and without a 'wsgi' prefix
on their names.
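To spell that out (the exact socket file name is illustrative; the
point is the directory and prefix):

```apache
# Sketch: with this prefix, the daemon process sockets end up as e.g.
#   <ServerRoot>/run/wsgi.<pid>.<n>.sock
# rather than loose in the Apache root directory.
WSGISocketPrefix run/wsgi
```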
> WSGIRestrictStdout Off
>
> # we'll make the root directory of the domain call the wsgi script
> WSGIScriptAlias /tg /home/rarch/trunk/src/appserver/wsgi-config/wsgi-
> deployment.py
>
> # make the wsgi script accessible
> <Directory /home/rarch/trunk/src/appserver/wsgi-config>
> Order allow,deny
> Allow from all
> </Directory>
>
> <Location /tg/_debug>
> AuthType Basic
> AuthName "For your company's security, this link is for
> retailarchitects.com support only. Please copy the Server Debug link
> and email it to your administrator."
> AuthUserFile /home/rarch/trunk/src/appserver/debugpasswd
> Require valid-user
> </Location>
> ================
>
> Thank you so very much for any time you spare to point me in the right
> direction, documentation regarding memory management with mod_wsgi,
> etc.
TurboGears is known to have a large base memory footprint to begin
with. The size of your process though appears to be the result of
application code performing caching and not purging the cache
properly. Alternatively, objects in your application are creating
reference cycles which the Python garbage collector can't break, and
so they hang around.
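To illustrate the reference cycle case with a small sketch (nothing
TurboGears specific here), an object cycle cannot be reclaimed by
reference counting alone; it hangs around until the cyclic garbage
collector gets a chance to run:

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.ref = None

def make_cycle():
    a, b = Node(), Node()
    a.ref, b.ref = b, a      # a -> b -> a: a reference cycle
    return weakref.ref(a)    # observe the object without keeping it alive

gc.disable()                 # pretend the cyclic collector never runs
probe = make_cycle()
assert probe() is not None   # refcounting alone can't reclaim the cycle
gc.collect()                 # the cyclic collector can break it
assert probe() is None       # ...after which the objects are gone
```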
I would probably suggest you ask about your memory problems on the
TurboGears mailing list.
Other than that, the only time memory leaks usually come from Apache
itself is when mod_python is also loaded, but since only one process
has memory problems and certain other bits of the configuration are
working, you don't appear to be doing that.
Graham
In the context of your very fat web application, I don't think the
processes would appear any 'fatter' than they already were. Since you
are restricting it to two, rather than however many the Apache MPM
for embedded mode allowed, you have at least constrained it.
In other words, using multiple threads in a process will increase the
amount of base memory used by that process, but not necessarily by so
much that I would label the process 'fatter'.
> If so, I'm not sure I understand the purpose of the threads, since
> wouldn't they need to effectively wait for a process anyway? Earlier,
> I believed threads=15 (and processes=1) would allow me to have many
> simultaneous requests processing in parallel. Can this one process
> accept multiple requests and multitask them, and if so, then what
> advantage is gained from processes=2 or higher (does it only make
> sense with multi-core processor)?
Despite the presence of the GIL in Python which restricts only one
thread to running Python code at a time in a process, with a
potentially I/O bound process like a web application, there is ample
opportunity for the GIL to be released while code is waiting for I/O,
such that an effective level of concurrency can still be handled with
one single multithreaded process.
So, while using multiple processes across multiple CPUs can allow you
to harness the CPU power of the whole system, the nature of web
applications is such that you can still achieve a lot with a single
process.
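As a rough illustration of why one multithreaded process can still
overlap I/O bound work despite the GIL, here is a sketch using
time.sleep as a stand-in for a blocking request handler:

```python
import threading
import time

def io_bound_task():
    time.sleep(0.2)   # stand-in for a blocking I/O wait; the GIL is released here

start = time.monotonic()
threads = [threading.Thread(target=io_bound_task) for _ in range(15)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start
# The 15 "requests" overlap in one process: total time is roughly
# 0.2 seconds, not 15 * 0.2 = 3 seconds.
```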
Have a read of the comments I make in the following posts about
parallelisation in Apache/mod_wsgi.
http://blog.dscpl.com.au/2007/09/parallel-python-discussion-and-modwsgi.html
http://blog.dscpl.com.au/2007/07/web-hosting-landscape-and-modwsgi.html
>> TurboGears is known to have a large base memory foot print to begin
>> with. The size of your process though appears to be the result of
>> application code performing caching and not purging the cache
>> properly. Alternatively, objects in application and creating reference
>> count cycles between objects which the Python garbage collector can't
>> break and so they hang around.
>>
>
> Yes, I cache things that make sense to cache, but cache them as part
> of 'session' objects which I believed to be being garbage collected,
> maybe that is my problem. I wanted to see if this was typical of even
> well behaved wsgi apps running thru apache *because of an article I
> read*, which reads:
Even if not explicitly caching, object cycles can still cause problems
for transient objects.
You might try seeing if you can get the following going:
http://pypi.python.org/pypi/Dozer
This will allow you to try and track where all the objects are being
created and what type they are.
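Dozer is WSGI middleware, so hooking it up in your deployment script
is a matter of wrapping the application callable. A stand-in wrapper
with the same shape is sketched below (with Dozer installed it would
instead be `from dozer import Dozer; application = Dozer(application)`):

```python
# Sketch of the wrapping pattern, using a trivial app and a stand-in
# middleware in place of Dozer.

def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello']

class CountingMiddleware:
    """Shaped like Dozer: wraps the WSGI app and sees every request."""
    def __init__(self, app):
        self.app = app
        self.requests = 0

    def __call__(self, environ, start_response):
        self.requests += 1
        return self.app(environ, start_response)

application = CountingMiddleware(application)
```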
> "If you serve 99% static files and 1% dynamic files with Apache, each
> httpd process will use from 3-20 megs of RAM (depending on your MOST
> complex dynamic page).
That is a motherhood statement that has no practical usefulness and
would likely be totally meaningless outside the specific setup and
application the person was using. At a guess I would say that
statement wasn't even made about Python web applications. Python web
applications tend to have much larger memory requirements.
> This occurs because a process grows to accommodate whatever it is
> serving, and NEVER decreases until that process dies. Unless you have
> very few dynamic pages and major traffic fluctuation, most of your
> httpd processes will soon take up an amount of RAM equal to the
> largest dynamic script on your system. A very smart web server would
> deal with this automatically. As it is, you have a few options to
> manually improve RAM usage."
>
> http://onlamp.com/pub/a/onlamp/2004/02/05/lamp_tuning.html
>
> This article lead me to hypothesize [hypothesise ;) ] that it would be
> typical for any apache/wsgi server to slowly increase in RAM
> consumption as more and more requests simultaneously requested
> processing that required some bulk of RAM... if these occurred in
> parallel, then the article suggests this RAM is NEVER returned to the
> OS.
The memory use should plateau though and shouldn't just keep growing.
If it keeps growing, you have an object leak through bad caching or
object cycles.
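The difference between a cache that plateaus and one that leaks is
simply whether old entries are ever purged. A minimal sketch of a
bounded cache (illustrative only, not your session machinery):

```python
from collections import OrderedDict

class BoundedCache:
    """Purges the oldest entries instead of growing without limit."""
    def __init__(self, max_entries=100):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)          # mark as most recently used
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)   # evict the oldest entry

    def get(self, key):
        return self._data.get(key)

cache = BoundedCache(max_entries=3)
for i in range(10):
    cache.put(i, 'session-%d' % i)
# Only the 3 newest entries survive, so memory use plateaus rather
# than growing with every request.
```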
> I can take further inquiry to the turbogears group if I can't resolve
> my memory problems, but please first answer this: suppose my WSGI app
> grabbed a large amount of RAM *and assume it properly disposed of it*:
> would I see the RAM returned to the OS, or would the apache process
> hold it indefinitely?
The simple answer is that no, you wouldn't see memory returned to the
OS. There are some slight exceptions to this, but only with recent
Python versions (not sure which; it may even only be some of the
Python 3.X versions). You shouldn't count on those exceptions though,
as in most cases you aren't likely to encounter them.
Graham
>> I would probably suggest you ask about your memory problems on the
>> TurboGear mailing list.
>>
>> Other than that, the only place that memory leaks usually come from
>> Apache itself are when mod_python is also loaded, but since only one
>> process has memory problems and certain other bits of configuration
>> are working, you don't appear to be doing that.
>>
>> Graham
>