Asynchronous PostgreSQL

2,188 views
Skip to first unread message

Frank Smit

unread,
Mar 1, 2011, 4:46:23 PM3/1/11
to Tornado Web Server
Hello,

I'm planning to use Tornado for a project I'm working on. And I've
decided to use PostgreSQL to store all the data. I read that Psycopg
has asynchronous support[1] and I've been wondering if it's possible
to integrate this with Tornado. I'm not sure how to do this any help
will be appreciated.

Regards,
Frank


[1]: http://initd.org/psycopg/docs/advanced.html#asynchronous-support

Cliff Wells

unread,
Mar 1, 2011, 5:22:52 PM3/1/11
to python-...@googlegroups.com
I haven't tried it, but there's an async Twisted driver here:

https://launchpad.net/txpostgres

It uses the async features of psycopg2, so it would be a good starting
point for a Tornado driver.

Another option (that I have used) is ngx_postgres. It works well, but
requires making your queries via HTTP which brings its own slew of
excitement:

https://github.com/FRiCKLE/ngx_postgres/

Regards,
Cliff

Ben Darnell

unread,
Mar 1, 2011, 5:24:31 PM3/1/11
to python-...@googlegroups.com, Frank Smit
You need to use the IOLoop {add,update,remove}_handler methods to listen for updates on the connection's fileno().  Very roughly:

class AsyncConnection:
  def __init__(self, conn):
    self.conn = conn
    self.callback = None

  def query(self, args, callback):
    self.conn.execute(args)
    self.callback = callback
    self._update_handler()

  def _update_handler(self):
    state = self.conn.poll()
    if state == psycopg2.extensions.POLL_OK:
      callback = self.callback
      self.callback = None
      callback()
    elif state == psycopg2.extensions.POLL_READ:
      IOLoop.instance().add_handler(self.conn.fileno(), IOLoop.READ, 
        self._io_callback)
    elif state == psycopg2.extensions.POLL_WRITE:
      IOLoop.instance().add_handler(self.conn.fileno(), IOLoop.WRITE,
        self._io_callback)
  
  def _io_callback(self, events):
    # maybe keep track of the previous state so you can use update_handler
    # instead of remove/add
    IOLoop.instance().remove_handler(self.conn.fileno())
    self._update_handler()

-Ben

Frank Smit

unread,
Mar 2, 2011, 3:44:30 PM3/2/11
to python-...@googlegroups.com, Ben Darnell
Ok, I tried the code: http://pastebin.com/1XAcfji9 - And this is the
output: http://pastebin.com/mAHn0D03

I'm not really familiar with this.

Frank Smit

unread,
Mar 2, 2011, 5:24:29 PM3/2/11
to python-...@googlegroups.com
I've changed some things in the code now and it works:
http://pastebin.com/6DyxHBrV. I shouldn't ask too soon for help when
it doesn't work. :)

Thanks for the help!

Peter Bengtsson

unread,
Mar 3, 2011, 10:05:37 AM3/3/11
to python-...@googlegroups.com
I wish I could put a big fat star on that piece of code. 

Man, we really need a site to collect all these nifty snippets related to Tornado. 

Frank Smit

unread,
Mar 3, 2011, 10:53:58 AM3/3/11
to python-...@googlegroups.com, pet...@gmail.com
It's not in a very usable state right now. Just after I mailed my
message I discovered that the cursor can not do two concurrent queries
(I read about it). You can see it when you start the script and go to
your browser at localhost:whateverport and refresh a couple of times
(very quickly). You'll see in the terminal that it throws an
exception.

I'm going to take a look at it this evening.

Jeremy Kelley

unread,
Mar 3, 2011, 1:33:36 PM3/3/11
to python-...@googlegroups.com

Just create a gist on gist.github.com or create a public repository
(github, bitbucket, etc). Then, link to them from the tornado wiki.

Not ultra-sexy but very low cost of entry and available today.

-j


--
The Christian ideal has not been tried and found wanting;
it has been found difficult and left untried – G. K. Chesterton

emcconne

unread,
Mar 4, 2011, 10:58:02 AM3/4/11
to Tornado Web Server
You mentioned FRiCKLE/ngx_postgres as a async option for Postgres, but
then went on to say "via HTTP which brings its own slew of
excitement". Care to elaborate on any of your experiences for those
of us new to ngx_postgres?

Regards,
Brent

Carl S. Yestrau Jr.

unread,
Mar 4, 2011, 11:32:26 AM3/4/11
to python-...@googlegroups.com
I'm currently using it. You'll have to compile your own nginx with the
custom modules. What really bit me was the escape routines (everything
seems to be escaped as a string) and complex conditional sql that you
would normally generate with logic. It's pretty cool once you have it
up and running but the ramp up might kill features you could have
otherwise built.

Cliff Wells

unread,
Mar 4, 2011, 12:10:46 PM3/4/11
to python-...@googlegroups.com
On Fri, 2011-03-04 at 07:58 -0800, emcconne wrote:
> You mentioned FRiCKLE/ngx_postgres as a async option for Postgres, but
> then went on to say "via HTTP which brings its own slew of
> excitement". Care to elaborate on any of your experiences for those
> of us new to ngx_postgres?

I was making reference to trying to map relational datasets onto a
RESTful interface. If you're comfortable doing that (I've found using
lots of server-side views to be key) then it's mostly just a discovery
process. I know for me it took lots and lots of refactoring, but
overall I was fairly happy with the result.

Regards,
Cliff


Frank Smit

unread,
Mar 4, 2011, 3:41:13 PM3/4/11
to python-...@googlegroups.com
Hi, I've made a simple connection pool (not completely finished) and
it works without problems (except one). The only problem is that the
Tornado server can't fork into multiple processes, because the IOloop
is initiated before the server is started. That's what the exception
tells me. Besides that I think everything works.

Here's the code: http://pastebin.com/JAng2yRU. Would be cool if people
can test it. :) You only need to start a PostgreSQL database and
change the settings in the script.

Regards,
Frank

Frank Smit

unread,
Mar 8, 2011, 5:14:25 PM3/8/11
to python-...@googlegroups.com
Here's my last version: https://gist.github.com/861193

An example is included. I think everything works now. Feedback is
welcome. Have fun.

Regards,
Frank

paolo.losi

unread,
Mar 11, 2011, 8:40:07 AM3/11/11
to Tornado Web Server
Frank,

nice and interesting work!
Would you mind creating a project on github?
I'd be more than willing to contribute to it...

Paolo

Frank Smit

unread,
Mar 11, 2011, 12:53:27 PM3/11/11
to python-...@googlegroups.com, paolo...@gmail.com

Frank Smit

unread,
Mar 26, 2011, 5:48:28 PM3/26/11
to python-...@googlegroups.com
I got some time again and want to work on unit tests. I haven't used
this too much and was wondering how I should make unit tests for this
wrapper.

How do you handle unit tests with a database that needs to be
configured? And what should I exactly test?

Regards,
Frank

Peter Bengtsson

unread,
Mar 28, 2011, 6:12:18 AM3/28/11
to python-...@googlegroups.com, Frank Smit
I actually tried it and it worked really well. Thanks. However my scenario was very basic and just an excuse to play with your code. 

However, I quickly gave up when I realised I had to actually write the SQL :)

One thing that would have helped, would be if I could mix synchronous with asynchronous. Especially since a couple of queries leading up to a big slow one doesn't need to do a bunch async callbacks. 

Frank Smit

unread,
Mar 28, 2011, 7:02:41 AM3/28/11
to python-...@googlegroups.com
I haven't actually done anything advanced with it. Just confirmed that
it worked with one or two query,

It's only a wrapper for Psycopg so it's obvious that you have to write
SQL. ;) You want some kind of ORM?

I was thinking of adding some kind of query chain
[https://github.com/FSX/momoko/issues#issue/3], but I haven't had time
for this yet. The idea is that you only need one callback that is
called once the last query has finished. And maybe optional callbacks
in between. Not sure about his yet.

Regards,
Frank Smit

Cliff Wells

unread,
Mar 28, 2011, 2:03:18 PM3/28/11
to python-...@googlegroups.com
On Mon, 2011-03-28 at 13:02 +0200, Frank Smit wrote:
> I haven't actually done anything advanced with it. Just confirmed that
> it worked with one or two query,
>
> It's only a wrapper for Psycopg so it's obvious that you have to write
> SQL. ;) You want some kind of ORM?

Just my two cents: having used SQLobject and SQLAlchemy, I'd take the
plain SQL API any day. I know a lot of people like ORMs, but I prefer
the simplicity of plain SQL. ORMs do make some tedious things simple,
but the added mental cost of the abstraction isn't worth it.

> I was thinking of adding some kind of query chain
> [https://github.com/FSX/momoko/issues#issue/3], but I haven't had time
> for this yet. The idea is that you only need one callback that is
> called once the last query has finished. And maybe optional callbacks
> in between. Not sure about his yet.

I wrote a little bit of code that simply acts as an accumulator for
multiple queries. It's not a chain like described in the link above,
but it does unroll the async callback sequence a bit, which makes things
nicer if you have to do multiple queries for a single page.

class Aggregator (object):
'''used as a callback that accumulates results of multiple
separate callbacks and finishes when they are all accounted for
'''
def __init__ (self, finish, required):
'''finish is a callback function to be invoked when all required callbacks are done
required is a list of names of callbacks (strings)
'''
self.finish = finish
self.required = required
self.values = { }

def __call__ (self, which, values):
self.values [which] = values
if set (self.required) == set (self.values.keys ()):
self.finish (self.values)

def __add__ (self, required):
self.required.append (required)
return self

The above class gets used like so (seriously simplified from working
code, so it's certainly broken).

class RequestHandler (BaseHandler):
@tornado.web.asynchronous
def get (self, ...):
def render (values):
# values is a dictionary, where the keys are the same passed to Aggregator.__init__
# and the values are the query results.
# ... do some stuff to values, render a template, etc
self.write (...)
self.finish ()

views = {
'query1': 'select * from foo',
'query2': 'select * from bar'
}

ag = Aggregator (render, views.keys ())

for key, query in views.items ():
# theoretical call to some async db api
db.execute (query, callback=functools.partial (ag, key))


Hope that makes sense and I didn't screw it up too badly in the
simplification. Basically it allows me to have a single callback (an
Aggregator object) that only calls render() once all the queries have
completed.

Regards,
Cliff


Frank Smit

unread,
Mar 29, 2011, 5:54:59 PM3/29/11
to python-...@googlegroups.com
Hi,

I added a simple implementation of your aggregator, Cliff. I called it
BatchQuery, since I don't know what aggregator means. :) Here's the
commit: https://github.com/FSX/momoko/commit/43fa5fe095c99210ecf4bce6da4d0e84ec2460f2

The pool cleaner is also fixed. When more than one connections needed
to be removed a IndexError popped up.

Goodnight,
Frank

Cliff Wells

unread,
Mar 29, 2011, 9:52:34 PM3/29/11
to python-...@googlegroups.com
On Tue, 2011-03-29 at 23:54 +0200, Frank Smit wrote:
> Hi,
>
> I added a simple implementation of your aggregator, Cliff. I called it
> BatchQuery, since I don't know what aggregator means. :) Here's the
> commit: https://github.com/FSX/momoko/commit/43fa5fe095c99210ecf4bce6da4d0e84ec2460f2

One issue I see is that it doesn't appear possible to know which queries
you are seeing the results of by the time you reach the callback. For
instance, if the queries were for a blog, and one query represented
"article by id", and another were for "comments by post", you'd want to
have them named in some way so as to be able to easily reference them
later (this is one reason I used a dict rather than a sequence).

Regards,
Cliff


Frank Smit

unread,
Mar 30, 2011, 8:30:37 AM3/30/11
to python-...@googlegroups.com
You're right, didn't think about at the time of writing. Will fix this
when I get home.

Regards,
Frank

Frank Smit

unread,
Mar 31, 2011, 5:07:05 PM3/31/11
to python-...@googlegroups.com

Andrew Zeneski

unread,
Mar 31, 2011, 7:30:01 PM3/31/11
to python-...@googlegroups.com
Frank,

This BatchQuery implementation looks really cool! Often when I have a series of statements to run, I will want to make sure they run in a specific order. Since dictionaries are unordered, maybe the list idea was better. The return could still be a dict, with the key being the sequence the query was in the original list; or just use the dict in the collect method, then sort it and return a tuple of results back.

Just a thought.

Andrew

Cliff Wells

unread,
Mar 31, 2011, 8:54:33 PM3/31/11
to python-...@googlegroups.com
On Thu, 2011-03-31 at 19:30 -0400, Andrew Zeneski wrote:
> Frank,
>
> This BatchQuery implementation looks really cool! Often when I have a
> series of statements to run, I will want to make sure they run in a
> specific order. Since dictionaries are unordered, maybe the list idea
> was better. The return could still be a dict, with the key being the
> sequence the query was in the original list; or just use the dict in
> the collect method, then sort it and return a tuple of results back.

If you need them run in a specific order, then async is not the way to
go. Rather you should simply use the standard sync API or chain your
queries via callbacks.

Regards,
Cliff

Frank Smit

unread,
Apr 1, 2011, 5:06:48 AM4/1/11
to python-...@googlegroups.com
BatchQuery is not the solution for that as Cliff says. Even if the
queries are executed in the right order the results will be not in the
order. The first one that finishes is the first one in the result
list.

I am planning to make a chain. This is a bit more complicated than BatchQuery.

Regards,
Frank

Frank Smit

unread,
Apr 3, 2011, 5:01:59 PM4/3/11
to python-...@googlegroups.com
Here's the first try for an query chain:
https://github.com/FSX/momoko/commit/4b200edeaaf2cabd88598d640e631671cf21b8bf

Regards,
Frank

Cliff Wells

unread,
Apr 3, 2011, 5:32:58 PM4/3/11
to python-...@googlegroups.com
On Sun, 2011-04-03 at 23:01 +0200, Frank Smit wrote:
> Here's the first try for an query chain:
> https://github.com/FSX/momoko/commit/4b200edeaaf2cabd88598d640e631671cf21b8bf

FYI

hasattr(link, '__call__')

could just be

callable (link)

Cliff


Frank Smit

unread,
Apr 12, 2011, 2:41:15 PM4/12/11
to python-...@googlegroups.com
Got docs up: http://momoko.61924.nl/

Frank Smit

unread,
Apr 30, 2011, 8:51:50 AM4/30/11
to python-...@googlegroups.com
Released 0.2.0: http://61924.nl/04-30-2011/momoko-020.html

I'll look into ORM functionality for 0.3.0 if it's needed.

Frank Smit

unread,
May 26, 2011, 6:55:06 AM5/26/11
to python-...@googlegroups.com
I've been trying to use Momoko in an actual project. But it's not as
nice as coding in a blocking style. Got to think about everything I
do.

Executing queries before a request method function (get/post/...). It
also not easy. I've now modified _execute in the request handler so it
executes a query and then executes prepare and get in a callback
(example: http://pastebin.com/kWdsag9N).

Anyone got suggestions to make it nicer to work with non-blocking code?

Regards,
Frank

Mikhail Korobov

unread,
May 26, 2011, 8:24:35 AM5/26/11
to python-...@googlegroups.com
Hi Frank,

PEP-342 describes a way to write sync-looking non-blocking code. There are several Tornado implementations (swirl, adisp). Examples of resulting syntax (for adisp) can be found in e.g. tornado-slacker readme and in an example for brukva. So I think just a callback is enough for momoko if one of these libraries are used, there is no need to invent something fancy.

Frank Smit

unread,
May 26, 2011, 4:34:56 PM5/26/11
to python-...@googlegroups.com
Thanks, Mikhail.

This will make lots of things easier. But both Swirl and Adisp don't
look like their maintained at the moment.

Mikhail Korobov

unread,
May 26, 2011, 4:48:14 PM5/26/11
to python-...@googlegroups.com
I haven't used swirl personally but they both are small enough and adisp certainly works - don't know what maintenance is necessary for them. One may consider them as code snippets (adisp e.g. is a single file with a huge comment and about 50 lines of python code).

Frank Smit

unread,
May 31, 2011, 5:56:04 PM5/31/11
to python-...@googlegroups.com
I've been trying to get a chain working with Adisp, but it doesn't
work for some reason and I can't figure it out.

Here is my attempt:
https://github.com/FSX/momoko/commit/6172f54779540647c72e1b306dd276362f7759e2

Frank Smit

unread,
Aug 7, 2011, 4:07:46 PM8/7/11
to python-...@googlegroups.com
Released Momoko 0.3.0: http://bit.ly/pJjCCV

Ovidiu Predescu

unread,
Aug 12, 2011, 9:22:32 PM8/12/11
to python-...@googlegroups.com, Ben Darnell
This certainly deserves a wider audience!

After a brief look at the underlying adisp framework that Frank uses
in his library, I found it to be excellent: it makes asynchronous
programming a lot easier since you don't need to pass callbacks
around. I know other people have been looking at doing a similar
thing.

https://launchpad.net/adisp

It even handles issuing multiple asynchronous requests in parallel,
all with a very easy syntax.

Ovidiu

Frank Smit

unread,
Aug 14, 2011, 10:01:21 AM8/14/11
to python-...@googlegroups.com
Adisp is also included in Tornado Tools:
https://github.com/truemped/tornadotools

And maybe some feedback in this? :) https://github.com/FSX/momoko/issues/6

Frank Smit

unread,
Dec 15, 2011, 4:32:40 PM12/15/11
to python-...@googlegroups.com

Cliff Wells

unread,
Jan 3, 2012, 1:22:16 AM1/3/12
to python-...@googlegroups.com
I just got around to a new project and was finally able to use Momoko.
This appears to be an excellent, well thought-out library.

Thanks for your work on this.

Regards,
Cliff

Frank Smit

unread,
Jan 5, 2012, 5:50:00 PM1/5/12
to python-...@googlegroups.com
Hi,

I've been trying to remove the Poller class and not add/remove the
connection from the IOLoop each time a connection or cursor is
created.

See commit: https://github.com/FSX/momoko/commit/62de6a2a80f63004dac6072296bdb1ca778b6f7f

I'm kinda stuck now, because I'm not able to get a cursor back.
Creating connections seems to be working fine. This is the error I
get:

Traceback (most recent call last):
File "/home/frank/Code/momoko/py2env/lib/python2.7/site-packages/tornado-2.1.1git-py2.7.egg/tornado/stack_context.py",
line 183, in wrapped
callback(*args, **kwargs)
File "/home/frank/Code/momoko/py2env/lib/python2.7/site-packages/momoko/pools.py",
line 301, in _handle_events
state = self._conn.poll()
InterfaceError: the asynchronous cursor has disappeared

Is there someone who knows more about this error?

Cliff Wells

unread,
Jan 5, 2012, 10:57:40 PM1/5/12
to python-...@googlegroups.com
Unfortunately I picked up a bottle of scotch tonight, so your code looks
like so many hieroglyphics to me at the moment, but I assume you've seen
this?

https://dndg.it/cgi-bin/gitweb.cgi?p=public/psycopg2.git;a=commitdiff_plain;h=5f6e77357567bfba2046bd9fd4b8a2d3ad009d87

The exception you are getting was introduced at the same time that async
cursors were made weakrefs. This is probably a clue.

Cliff

Cliff Wells

unread,
Jan 9, 2012, 10:55:00 PM1/9/12
to python-...@googlegroups.com
On Thu, 2011-12-15 at 22:32 +0100, Frank Smit wrote:

I'm looking for a solution to the case where the PostgreSQL server is
restarted. It doesn't appear that there's any handling of this type of
error. The entire Tornado app must be restarted to recover.

Regards,
Cliff

Cliff Wells

unread,
Jan 9, 2012, 11:15:26 PM1/9/12
to python-...@googlegroups.com

Issue opened: https://github.com/FSX/momoko/issues/11


Regards,
Cliff

Reply all
Reply to author
Forward
0 new messages