What is a good pattern for using sqlalchemy+psycopg2+gevent in a pylons app

1,775 views
Skip to first unread message

afrotypa

unread,
Jun 4, 2010, 10:16:30 AM6/4/10
to sqlalchemy
Hi All,


For a while I have been toying with the idea of developing/running a
database (postgresql backend) driven WSGI app using an async python
WSGI server such as gevent.

Now that the newer psycopg2 (>=2.2.0) drivers provide coroutine
support (via a hook which makes c extension database drivers co-
routine friendly) I am looking to try gevent out as soon as I can
figure out how to integrate gevent and sqlalchemy.

In theory it should be possible to run blocking SQLAlchemy queries in
a spawned greenlet using an async DB API 2.0 driver such as psycopg2.
Blocking only the greenlet handling the specific request.

See the following URL for details of coroutine support in psycopg2:-

http://bitbucket.org/dvarrazzo/psycogreen/wiki/Home

An example of a wait call back implementation for gevent is here :-

http://bitbucket.org/dvarrazzo/psycogreen/src/tip/gevent/psyco_gevent.py

My question is what is a good pattern for using SQLAlchemy with a co-
routine library such as gevent in a pylons based WSGI app.?

For starters how do I configure the SQLAlchemy connection pool to
create async connections only?.

i.e.

Write a custom getconn method and passing async=True to
psycopg2.connect?

Or

pass connect_args to create_engine?.

Assuming gevent spawns a greenlet to handle each WSGI request, I think
SQLAlchemy should just work as usual in this situation, since the
scopedSession would ensure that a different SQLAlchemy session is used
for each request. Is this a reasonable assumption to make about this?.

If I wanted to run multiple independent queries (each in its own
greenlet) in the same pylons request, it appears I would have to
explicitly create an sqlalchemy session for each greenlet instance in
this case. Is this the case?

Am I missing something very important here?

Hopefully someone here might be able to shed some more light into all
this.

Thanks,

Note: I already posted this question on pylons-discuss, but I am
thinking this is more an SQLAlchemy than a pylons question.

Michael Bayer

unread,
Jun 25, 2010, 9:41:23 AM6/25/10
to sqlal...@googlegroups.com

On Jun 25, 2010, at 3:32 AM, Yang Zhang wrote:

Writing an async SQLA application isn't a matter of figuring out implementation details. That's not such a big deal. It's a matter of, what would the end-user application look like ? Sample code ? Best patterns for use ? Since I've never worked with async libraries (other than javascript), I don't have the best answer.

It hardly seems convenient to have entirely new Session objects for every query. I really have no idea what SQLAlchemy would look like supporting async queries in a useful way, but this is not a matter of implementation within SQLA - it seems obvious that we'd need a different Connection class that performs the execute() call differently (since to execute, you need to give a callback who gets the results, right ?). Assuming I'm not totally off base with that, the "you need a callback argument for every execute" pattern would be extracted out into a custom Session subclass and a custom Query subclass, which both support a "callback" calling style. The unit of work probably can't take advantage of this approach, since it deals with executing many statements in order, and would continue to use plain "execute" to do its work - the modified Connection, Session, and Query would continue to implement "synchronous" versions of themselves for this purpose. Similarly, lazyloaders and object refresh calls would still be synchronous, since they occur on attribute access.

That's as deep as my thought goes on this and its all based on hypotheticals since I've never used twisted or greenlets or anything like that. Sorry if I'm totally off on how gevent/greenlets work, the linked documents didn't really make it clear how they work for someone who isn't already familiar.


Yang Zhang

unread,
Jun 25, 2010, 5:33:07 PM6/25/10
to sqlal...@googlegroups.com
On Fri, Jun 25, 2010 at 6:41 AM, Michael Bayer <mik...@zzzcomputing.com> wrote:
> That's as deep as my thought goes on this and its all based on hypotheticals since I've never used twisted or greenlets or anything like that.    Sorry if I'm totally off on how gevent/greenlets work, the linked documents didn't really make it clear how they work for someone who isn't already familiar.

That's completely fair, the docs leave a lot to be desired. Think of
greenlets as an implementation of cooperative threading. In typical
threading, threads can first of all run in parallel, and the threading
is also preemptive, meaning context switches between threads can
happen at any time. With greenlets, "threads" are run in a single real
OS thread (no parallelism, only multiplexed concurrency), and context
switches between threads happen only voluntarily.

Why is this useful? Because then you can write non-blocking code in a
blocking style. Non-blocking IO is useful for scalable systems
development, but event-driven programming (a la twisted) tends to be
more tedious and less natural than programming with blocking IO. Plus,
most existing code is written against blocking IO, but event-driven
programming makes those difficult to reuse.

So to answer your first question of what things would look like in an
asynchronous world: with cooperative threads like greenlets,
everything would hopefully look identical. Embracing event-driven
style would indeed spell out significant changes to both sqlalchemy
and user code (and that's also not what I'm personally interested in
using).

Upon closer inspection, it seems that changes might not even be
necessary for sqlalchemy, since one can globally set the asynchronous
callback for psycopg to hook directly into gevent.

http://bitbucket.org/dvarrazzo/psycogreen/src/tip/gevent/psyco_gevent.py
--
Yang Zhang
http://yz.mit.edu/

afrotypa

unread,
Jul 29, 2010, 8:23:33 PM7/29/10
to sqlalchemy
Sorry I wasnt monitoring this thread. Didnt get a response right away
and thought no one had responded.

I also think that theoretically there really would be not much of a
change to using sqlalchemy (within pylons - pylons already creates a
scopedsession object for every web request anyhow) in a co-routine
framework such as gevent.

Once I get a chance I would try it out and update this thread
accordingly.

Thanks for your inputs guys.

On Jun 25, 5:33 pm, Yang Zhang <yanghates...@gmail.com> wrote:
> On Fri, Jun 25, 2010 at 6:41 AM, Michael Bayer <mike...@zzzcomputing.com> wrote:
> > That's as deep as my thought goes on this and its all based on hypotheticals since I've never used twisted or greenlets or anything like that.    Sorry if I'm totally off on howgevent/greenlets work, the linked documents didn't really make it clear how they work for someone who isn't already familiar.
Reply all
Reply to author
Forward
0 new messages