SQLAlchemy's QueuePool lock (was [gevent] gevent in production)


Dan Korostelev

Jul 15, 2010, 1:34:17 PM
to gev...@googlegroups.com
2010/7/11 Marcin Bachry <hege...@gmail.com>:
> Dan Korostelev <nad...@gmail.com> writes:
>> Yep, I'm using this wait callback:
>> http://bitbucket.org/denis/psycogreen/src/tip/gevent/psyco_gevent.py
>>
>> BTW, I used to get what looked like deadlocks with SQLAlchemy's
>> connection pool until I specified max_overflow=-1 in the engine
>> arguments. Unfortunately I didn't have time to dig into it and
>> understand the reason, so if anyone has an explanation, I'd like to
>> learn about it. :-)
>
> Check out this thread on the eventlet mailing list:
>
>  https://lists.secondlife.com/pipermail/eventletdev/2010-March/000793.html

Thanks. I finally took a look at it. It seems to me that with
max_overflow=-1, under very high load with expensive DB queries, we
can end up with a "too many open connections" exception. I made a
simple subclass of SQLAlchemy's QueuePool that uses gevent's Semaphore
instead of threading.Lock to make it green:

from gevent.coros import Semaphore
from sqlalchemy.pool import QueuePool

class GreenQueuePool(QueuePool):

    def __init__(self, *args, **kwargs):
        super(GreenQueuePool, self).__init__(*args, **kwargs)
        # QueuePool only creates _overflow_lock (a threading.Lock) when
        # max_overflow != -1; swap it for a gevent Semaphore so waiting
        # greenlets yield instead of blocking the whole event loop.
        if self._overflow_lock is not None:
            self._overflow_lock = Semaphore()
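
To plug it into an engine, something like this should work (an
untested sketch; the connection URL is just an example):

from sqlalchemy import create_engine

# poolclass tells SQLAlchemy to build its pool from our green class.
engine = create_engine('postgresql://localhost/mydb',
                       poolclass=GreenQueuePool,
                       pool_size=10, max_overflow=20)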

Now I wonder whether I should actually use it, or whether it's a waste
of performance (on lock acquires/releases) and I should simply keep
all my DB interactions minimal and fast, so that connection pool
overflows never bother me. Any experienced opinions, please? :)

--
WBR, Dan Korostelev

lasizoillo

Jul 15, 2010, 4:57:10 PM
to gev...@googlegroups.com
2010/7/15 Dan Korostelev <nad...@gmail.com>:

> 2010/7/11 Marcin Bachry <hege...@gmail.com>:
>> Dan Korostelev <nad...@gmail.com> writes:
>>> Yep, I'm using this wait callback:
>>> http://bitbucket.org/denis/psycogreen/src/tip/gevent/psyco_gevent.py
>>>
>>> BTW, I used to get what looked like deadlocks with SQLAlchemy's
>>> connection pool until I specified max_overflow=-1 in the engine
>>> arguments. Unfortunately I didn't have time to dig into it and
>>> understand the reason, so if anyone has an explanation, I'd like to
>>> learn about it. :-)
>>
>> Check out this thread on the eventlet mailing list:
>>
>>  https://lists.secondlife.com/pipermail/eventletdev/2010-March/000793.html
>
> Thanks. I finally took a look at it. It seems to me that with
> max_overflow=-1, under very high load with expensive DB queries, we
> can end up with a "too many open connections" exception. I made a
> simple subclass of SQLAlchemy's QueuePool that uses gevent's
> Semaphore instead of threading.Lock to make it green:
>
> from gevent.coros import Semaphore
>
> class GreenQueuePool(QueuePool):
>
>    def __init__(self, *args, **kwargs):
>        super(GreenQueuePool, self).__init__(*args, **kwargs)
>        if self._overflow_lock is not None:
>            self._overflow_lock = Semaphore()
>

I don't understand. Isn't threading.Lock already replaced with a
gevent.coros.Semaphore via thread.allocate_lock when you monkey-patch?
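
What I mean is something like this (a sketch; the URL is made up, and
note that anything creating locks must be imported after patching):

from gevent import monkey

# thread.allocate_lock (which threading.Lock aliases in Python 2) is
# replaced with a gevent semaphore, so locks created afterwards are
# already "green".
monkey.patch_thread()

from sqlalchemy import create_engine  # imported after patching
engine = create_engine('postgresql://localhost/mydb')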

> Now I wonder whether I should actually use it, or whether it's a
> waste of performance (on lock acquires/releases) and I should simply
> keep all my DB interactions minimal and fast, so that connection pool
> overflows never bother me. Any experienced opinions, please? :)
>

Connection pools exist for one reason only: to limit the *heavy*
backend processes on the server. If it were possible to create cheap
connections, a pool would be unnecessary. I tested 100 async
connections, but they are not cheap in memory usage; there is still
one server process per client :-(

Async database connections are harder than I had hoped :-(

Without a connection pool you can exhaust the server's memory, start
swapping, degrade disk I/O, and degrade your whole service. With a
small pool you'll get latency and unnecessary connection errors. IMHO
a connection pool is the lesser evil.

Regards,
Javi

Dan Korostelev

Jul 15, 2010, 8:45:33 PM
to gev...@googlegroups.com
2010/7/16 lasizoillo <lasiz...@gmail.com>:

Actually, I'm trying to avoid monkey patching and use gevent's
facilities explicitly. I'll dig deeper into the QueuePool class
tomorrow. It could probably be made more lightweight if written from
scratch using gevent's Queue, as in the sketch below.
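
Something like this is what I have in mind (a rough, untested sketch;
create_connection stands in for whatever DB-API connect call you use,
and the overflow policy is simplified):

from gevent.queue import Queue, Empty

class SimpleGreenPool(object):
    """Minimal green pool: keep up to pool_size idle connections,
    create an "overflow" connection whenever the queue is empty."""

    def __init__(self, create_connection, pool_size=10):
        self._create = create_connection
        self._queue = Queue(maxsize=pool_size)

    def get(self):
        try:
            # Grab an idle connection without blocking.
            return self._queue.get(block=False)
        except Empty:
            # None idle: behave like max_overflow=-1 and make a new one.
            return self._create()

    def put(self, conn):
        if self._queue.full():
            # Pool already holds pool_size idle connections; close the
            # overflow connection instead of keeping it around.
            conn.close()
        else:
            self._queue.put(conn)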

>> Now I wonder whether I should actually use it, or whether it's a
>> waste of performance (on lock acquires/releases) and I should simply
>> keep all my DB interactions minimal and fast, so that connection
>> pool overflows never bother me. Any experienced opinions, please? :)
>>
>
> Connection pools exist for one reason only: to limit the *heavy*
> backend processes on the server. If it were possible to create cheap
> connections, a pool would be unnecessary. I tested 100 async
> connections, but they are not cheap in memory usage; there is still
> one server process per client :-(
>
> Async database connections are harder than I had hoped :-(
>
> Without a connection pool you can exhaust the server's memory, start
> swapping, degrade disk I/O, and degrade your whole service. With a
> small pool you'll get latency and unnecessary connection errors.
> IMHO a connection pool is the lesser evil.

Yep, I agree. But I was really talking about still using a connection
pool, just with an unlimited maximum number of connections (that is,
max_overflow=-1 in SQLAlchemy). Connections are still pooled, but if
there are no free connections in the pool, one more is always created.
With a cooperative multitasking model and simple, short-lived database
transactions, I currently can't think of a load that would use up all
the pooled connections and create so many "overflow" connections that
the database server gets into trouble. Considering my lack of
experience, of course. :-)
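
In SQLAlchemy terms, that's just this (illustrative; the URL is made
up):

from sqlalchemy import create_engine

# Keep up to 10 idle connections pooled, but never block or raise
# when more are needed: overflow connections are created on demand.
engine = create_engine('postgresql://localhost/mydb',
                       pool_size=10, max_overflow=-1)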

--
WBR, Dan Korostelev
