_DBProxy.dispose() doesn't actually dispose of the pool/connections?

68 views
Skip to first unread message

Carl Meyer

unread,
Nov 15, 2012, 7:26:14 PM11/15/12
to sqlal...@googlegroups.com
Hi,

I'm using SQLAlchemy's connection pool implementation directly (via
manage() and _DBProxy), and it seems to me that _DBProxy does not
dispose of its pools/connections the way that the
documentation/docstrings/method names seem to imply it would.

The _DBProxy.dispose method docstring says that it will "dispose the
pool referenced by the given connect arguments", but it doesn't ever
call dispose() on the pool (which would actually close the db
connections), it just removes the pool from its internal mapping of
pools. The effect of this is that idle connections in the pool remain
open until the process ends. Here's example code:

    from sqlalchemy.pool import manage
    import psycopg2

    proxy = manage(psycopg2)

    params = {'user': 'carljm', 'database': 'foo'}

    # create a pool and a connection
    conn = proxy.connect(**params)

    # release the connection back into the pool
    conn.close()

    # at this point proxy.pools contains a single pool,
    # proxy.pools.values()[0].status() shows one checked-in connection,
    # and pg_stat_activity on the db server shows the idle connection

    # this ought to dispose of the pool, but it doesn't
    proxy.dispose(**params)

    # now proxy.pools is empty so we have no reference to the pool
    # but if we check pg_stat_activity, idle connection is still there!
    # it stays open until the process ends

Similarly, _DBProxy.close() also only clears its map of pools, it never
actually calls their dispose method, causing the same symptom; a
"closed" proxy will still have open connections to the database until
the end of the process.

As a result, clear_managers (which just calls the close method of each
manager/proxy) also does not actually dispose of all pools and
connections, as its docstring claims it does.

It seems that perhaps _DBProxy is relying on garbage collection to
somehow clean everything up once it lets go of its reference to the
pool, but even on CPython with refcounting GC this does not seem to
work; in the code sample above, I never create any unusual references to
any pool internals, but the idle connection remains alive as long as the
process is alive (even after "del conn" and "del proxy"). Maybe some
previous version of Pool had a __del__ method that called its dispose
method, and _DBProxy is still implicitly relying on this
no-longer-present behavior?

If this is a bug, I can file it on Trac - just sanity-checking myself
here first since this is my first real dive into the SA codebase. I
searched Trac and didn't find anything describing this problem, though I
found some older tickets (e.g. #297 [0]) that seemed to imply that Pool
used to have __del__ methods it doesn't anymore.

For what it's worth, I ran into this using Kenneth Reitz's
django-postgrespool [1] to use SQLAlchemy's connection pooling with
Django. Django creates and tears down a fresh test database for each
test run, and the teardown was failing due to the idle connections
hanging around. I was able to work around it by basically reimplementing
_DBProxy.dispose, with an added call to the pool's dispose method. [2]

Thanks,

Carl

  [0] http://www.sqlalchemy.org/trac/ticket/297
  [1] https://github.com/kennethreitz/django-postgrespool
  [2]
https://github.com/carljm/django-postgrespool/blob/dispose/django_postgrespool/base.py#L137

Michael Bayer

unread,
Nov 15, 2012, 10:41:57 PM11/15/12
to sqlal...@googlegroups.com

On Nov 15, 2012, at 7:26 PM, Carl Meyer wrote:

> Hi,
>
> I'm using SQLAlchemy's connection pool implementation directly (via
> manage() and _DBProxy), and it seems to me that _DBProxy does not
> dispose of its pools/connections the way that the
> documentation/docstrings/method names seem to imply it would.
>
> The _DBProxy.dispose method docstring says that it will "dispose the
> pool referenced by the given connect arguments", but it doesn't ever
> call dispose() on the pool (which would actually close the db
> connections), it just removes the pool from its internal mapping of
> pools. The effect of this is that idle connections in the pool remain
> open until the process ends. Here's example code:

that's possible, it's a pretty old, little used, API, and that document probably assumes GC will pick up on it, and it would be better if dispose() was called on the pool all the way through. But the garbage collector should be reclaiming the pool and those connections (all DBAPIs I'm aware of release TCP/IP connections when the connection is garbage collected). Even calling gc.collect(), you're not seeing anything clear out ?

> It seems that perhaps _DBProxy is relying on garbage collection to
> somehow clean everything up once it lets go of its reference to the
> pool, but even on CPython with refcounting GC this does not seem to
> work;

CPython clears unreachable cycles periodically as well. calling gc.collect() will force this process to occur.

> Maybe some
> previous version of Pool had a __del__ method that called its dispose
oh. actually there's a __del__ on _DBProxy and that's probably why gc.collect() isn't doing it (it interferes with cyclic GC).

your best bet for now is to just iterate through db.proxy.pools and just call dispose() on them.

http://www.sqlalchemy.org/trac/ticket/2609 is added.



Carl Meyer

unread,
Nov 16, 2012, 1:03:45 PM11/16/12
to sqlal...@googlegroups.com
Thanks Michael.

On 11/15/2012 08:41 PM, Michael Bayer wrote:
> that's possible, it's a pretty old, little used, API, and that
> document probably assumes GC will pick up on it, and it would be
> better if dispose() was called on the pool all the way through. But
> the garbage collector should be reclaiming the pool and those
> connections (all DBAPIs I'm aware of release TCP/IP connections when
> the connection is garbage collected). Even calling gc.collect(),
> you're not seeing anything clear out ?

gc.collect() does close the connection iff the closed connection object
(well, ConnectionFairy actually in this case) has also gone out of scope
or been deleted; but not otherwise. So apparently the ConnectionFairy
(even after its closed) still holds references that keep the pool and
its connections alive.

>> It seems that perhaps _DBProxy is relying on garbage collection to
>> somehow clean everything up once it lets go of its reference to
>> the pool, but even on CPython with refcounting GC this does not
>> seem to work;
>
> CPython clears unreachable cycles periodically as well. calling
> gc.collect() will force this process to occur.

Yeah, when I mentioned refcounting I wasn't even thinking of cycle
collection, I just meant that CPython is the best-case for reliance on
GC, compared to eg pypy, so it's worth having a deterministic way to
make sure all the connections are closed.

> your best bet for now is to just iterate through db.proxy.pools and
> just call dispose() on them.
>
> http://www.sqlalchemy.org/trac/ticket/2609 is added.

Thanks!

Carl
Reply all
Reply to author
Forward
0 new messages