Number of opened connections with mod_python


Karl Pflästerer

Oct 8, 2007, 4:35:36 PM10/8/07
to sqlal...@googlegroups.com
Hi,
I'm trying to use SA together with mod_python and its publisher handler.
How can I limit the number of open connections?
I'm using MySQL as the database, Apache 2.2.6 prefork, mod_python and SA.
There are only 8 Apache children running, but many more connections are open.

My problem is that connections are never closed, so after some time the
MySQL server doesn't accept any more requests.
A simple script like:

engine = create_engine('mysql://karl@localhost/pflaesterer?use_unicode=True&charset=utf8',
                       encoding='utf8', convert_unicode=False, pool_recycle=1,
                       pool_size=1, echo_pool=True, strategy='threadlocal')
meta = MetaData()
meta.bind = engine

def index(req):
    conn = engine.connect()
    conn.close()
    return 'hier'


called ten times gives me 10 open connections.

At the moment this is only on my private Apache, but how would I do it right
if I had lots of requests (let's say 60-70 Apache children running most of
the time)? At work I use PHP (I would prefer Python) and we use neither
persistent connections nor a connection pool, so there is no problem with
too many open connections.

TIA
Karl

Michael Bayer

Oct 8, 2007, 7:54:32 PM10/8/07
to sqlal...@googlegroups.com

On Oct 8, 2007, at 4:35 PM, Karl Pflästerer wrote:

>
> Hi,
> I try to use SA together with mod_python and its publisher algorithm.
> How can I limit the number of opened connections?
> I use Mysql as db, Apache2.2.6 prefork, mod_python and SA.
> There are only 8 Apache childs running, but a lot more connections
> open.
>
> I have the problem that connections are never closed, so after some
> time
> the mysql sever doesn't handle any more requests.
> A simple script like:
>
> engine = create_engine('mysql://karl@localhost/pflaesterer?
> use_unicode=True&charset=utf8',
> encoding = 'utf8', convert_unicode=False,
> pool_recycle=1, pool_size=1, echo_pool=True,
> strategy='threadlocal')
> meta = MetaData()
> meta.bind = engine
>
> def index(req):
> conn = engine.connect()
> conn.close()
> return 'hier'
>
>
> called ten times gives me 10 opened connections.

If I run the app from a shell, using a loop that calls index() 10
times, only one connection is ever opened, since it's immediately
returned to the pool and then reused for the next iteration. But
that's also because the program runs in under a second; since you
have "pool_recycle" set to only 1 second, it's going to open a brand
new connection every second (and close the previous one). If I add
"time.sleep(1)" to my loop, this behavior becomes apparent (but
still, only one connection is open at a time).

Keep in mind there's also a "max_overflow" setting at work here,
which defaults to 10, so the above engine is capable of pooling 11
simultaneous connections if you weren't calling close() and
connections were left open. To narrow it down to exactly 1 in all
cases, max_overflow would need to be set to 0.
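A minimal sketch of that cap (an in-memory SQLite URL is used here so the
example runs standalone; with the MySQL URL from this thread, QueuePool is
already the default and the poolclass argument can be dropped):

```python
# Sketch: cap the engine at exactly one connection via pool_size=1 and
# max_overflow=0, then show that a second checkout is refused.
from sqlalchemy import create_engine, exc
from sqlalchemy.pool import QueuePool

engine = create_engine(
    'sqlite://',
    poolclass=QueuePool,   # force QueuePool for the in-memory SQLite URL
    pool_size=1,           # one pooled connection
    max_overflow=0,        # no "overflow" connections beyond pool_size
    pool_timeout=0.2,      # fail fast instead of the default 30s wait
)

c1 = engine.connect()      # checks out the single pooled connection
try:
    engine.connect()       # pool exhausted: raises TimeoutError
    capped = False
except exc.TimeoutError:
    capped = True
c1.close()                 # returns the connection to the pool (not closed)
```

Note that close() here only returns the connection to the pool; the
underlying database connection stays open until the pool recycles it.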

However, even if you set pool_size=1 and max_overflow=0 (and
pool_recycle to something sane like 3600), none of this will help
control the total number of connections when using mod_python with
Apache prefork. First off, the "threadlocal" setting is meaningless,
since prefork does not use multiple threads per process. Secondly,
each new Apache child process is a brand new Python interpreter with
its own engine and connection pool. So in this case, to control the
number of concurrent database connections, you'd have to control the
number of Apache processes; if Apache opened 60 child processes,
you'd have 60 connection pools. Apache's default limits on the
number of child processes are also quite high for a dynamic scripting
application; each child process can take many megabytes of memory,
and you can quickly run out of physical memory unless you set this
number pretty low (or have many gigabytes of RAM).
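In other words, the total connection count is roughly MaxClients x
(pool_size + max_overflow), so capping the child count caps the
connections. A hedged sketch of the relevant prefork directives (the
values are illustrative, not recommendations):

```apache
# httpd.conf -- prefork MPM; total DB connections is roughly
# MaxClients x (pool_size + max_overflow) of each child's engine.
<IfModule mpm_prefork_module>
    StartServers        4
    MinSpareServers     2
    MaxSpareServers     8
    MaxClients          20
    MaxRequestsPerChild 1000
</IfModule>
```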

There's no way to have a single in-Python connection pool shared
among child processes, since that's not prefork's execution model.
The "threaded" MPM would produce the threaded behavior you're looking
for, but I think in most cases, to get a single multithreaded app
server, people use a technology like FastCGI or mod_proxy to
communicate with a separate, multithreaded Python process.


Graham Dumpleton

Oct 8, 2007, 8:56:15 PM10/8/07
to sqlalchemy

On Oct 9, 9:54 am, Michael Bayer <mike...@zzzcomputing.com> wrote:
> There's no way to have a single in-python connection pool shared
> amongst child processes since that's not prefork's execution model.
> The "threaded" MPM would produce the threaded behavior you're looking
> for but I think in most cases in order to have a single, threaded app
> server most people use a technology like FastCGI or mod_proxy which
> communicate with a separate, multithreaded Python process.

Or use daemon mode of mod_wsgi.

All these other ways of hosting Python applications in conjunction
with Apache are of little value to the OP, though, as they are using
mod_python.publisher, which ties them to mod_python. To use these
other hosting methods, they would need to look at a WSGI-based
dispatcher. That way they would have more flexibility in what hosting
arrangement they use.

Graham

Graham Dumpleton

Oct 8, 2007, 9:09:01 PM10/8/07
to sqlalchemy
Damn it. I hate it when Google Groups appears to lose my messages.
The post I just sent to this group appeared really quickly, but the
much longer message I sent earlier still hasn't arrived.

Anyway, what my original, longer post (which appears to have
disappeared) pointed out was that because the OP is using
mod_python.publisher, their problems with resource consumption may be
exacerbated by mod_python's module reloading mechanism. In short, the
engine object will be recreated every time the code file is changed
and reloaded. If this doesn't automatically release the connections
still in the pool, it may result in many pools being created, with
older ones becoming unusable and just consuming resources.

Even if the settings are the bigger problem, module reloading could
still make it worse, depending on how SQLAlchemy works. Thus the OP
should probably also move the creation of the engine out into a
separate Python module somewhere on the PythonPath, and not in the
document tree. That way mod_python will not reload it.
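That layout might look like the following (file and module names are
hypothetical; the SA settings follow Michael's earlier suggestions):

```
# myapp_db.py -- placed on the PythonPath, NOT under the document
# root, so mod_python's reloader leaves it alone and the engine (and
# its pool) is created exactly once per Apache child.
from sqlalchemy import create_engine, MetaData

engine = create_engine(
    'mysql://karl@localhost/pflaesterer?use_unicode=True&charset=utf8',
    pool_size=1, max_overflow=0, pool_recycle=3600)
meta = MetaData()
meta.bind = engine

# index.py -- the published handler, in the document tree; reloading
# this file no longer recreates the engine:
#
#   from myapp_db import engine
#
#   def index(req):
#       conn = engine.connect()
#       try:
#           return 'hier'
#       finally:
#           conn.close()
```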

For more details of mod_python module reloader see import_module()
documentation in:

http://www.modpython.org/live/current/doc-html/pyapi-apmeth.html

The end of the documentation for that function briefly mentions the
issues with increased resource consumption when reloading occurs.
Since mod_python.publisher uses that function, it is affected.

Sorry if this is duplicated information due to my original post
finally turning up.

Graham


Karl Pflästerer

Oct 10, 2007, 8:49:40 AM10/10/07
to sqlal...@googlegroups.com
On 09.10.2007 03:09 Graham Dumpleton wrote:
[...]

> their problems with consumption of resources may be getting
> exacerbated by mod_python's module reloading mechanism. In short, the
> engine object will be recreated every time the code file is changed
> and reloaded. If this doesn't automatically release the connections
> still in the pool, this may result any many pools being created, with
> older ones becoming unusable and just consuming resources.
>
> Even if the issues with the settings is more the problem, the module
> reloading could still make it worse of a problem depending on
> SQLAlchemy works. Thus OP should probably also move the creation of
> the engine out into a separate Python module somewhere on the
> PythonPath and not in the document tree. This way mod_python will not
> do reloading of it.

Hi,
thanks for the answer. I moved the engine creation and the table
reflection code into a separate package, and now the number of open
connections stays at the same level. The module reloading of
mod_python was what caused my trouble (although in most circumstances
it's the right thing, if you don't want to restart Apache every time
you change some code).

Karl
