I'm planning to use Tornado for a project I'm working on, and I've
decided to use PostgreSQL to store all the data. I read that Psycopg
has asynchronous support[1] and I've been wondering if it's possible
to integrate this with Tornado. I'm not sure how to do this, so any help
would be appreciated.
On Tue, 2011-03-01 at 13:46 -0800, Frank Smit wrote:
> Hello,
>
> I'm planning to use Tornado for a project I'm working on. And I've
> decided to use PostgreSQL to store all the data. I read that Psycopg
> has asynchronous support[1] and I've been wondering if it's possible
> to integrate this with Tornado. I'm not sure how to do this any help
> will be appreciated.
    def _update_handler(self):
        state = self.conn.poll()
        if state == psycopg2.extensions.POLL_OK:
            callback = self.callback
            self.callback = None
            callback()
        elif state == psycopg2.extensions.POLL_READ:
            IOLoop.instance().add_handler(self.conn.fileno(), IOLoop.READ,
                                          self._io_callback)
        elif state == psycopg2.extensions.POLL_WRITE:
            IOLoop.instance().add_handler(self.conn.fileno(), IOLoop.WRITE,
                                          self._io_callback)

    def _io_callback(self, events):
        # maybe keep track of the previous state so you can use update_handler
        # instead of remove/add
        IOLoop.instance().remove_handler(self.conn.fileno())
        self._update_handler()
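The poll-and-reregister cycle can be tried without a database. The sketch below is only an illustration under stated assumptions: it swaps Tornado's IOLoop for the standard-library `selectors` module, `FakeConnection` is a hypothetical stand-in for an async psycopg2 connection, and the `POLL_*` constants are local stand-ins for the ones in `psycopg2.extensions`. With a real IOLoop, note that Tornado invokes handlers as `handler(fd, events)`.

```python
import selectors
import socket

# Local stand-ins for psycopg2.extensions.POLL_OK / POLL_READ / POLL_WRITE.
POLL_OK, POLL_READ, POLL_WRITE = 0, 1, 2


class FakeConnection:
    """Hypothetical stand-in for an async psycopg2 connection."""

    def __init__(self, sock, states):
        self._sock = sock
        self._states = list(states)

    def fileno(self):
        return self._sock.fileno()

    def poll(self):
        # Report the scripted states in order, then POLL_OK.
        return self._states.pop(0) if self._states else POLL_OK


class AsyncConnection:
    def __init__(self, conn, selector):
        self.conn = conn
        self.selector = selector
        self.callback = None

    def wait(self, callback):
        self.callback = callback
        self._update_handler()

    def _update_handler(self):
        state = self.conn.poll()
        if state == POLL_OK:
            callback, self.callback = self.callback, None
            callback()
        elif state == POLL_READ:
            self.selector.register(self.conn.fileno(), selectors.EVENT_READ,
                                   self._io_callback)
        elif state == POLL_WRITE:
            self.selector.register(self.conn.fileno(), selectors.EVENT_WRITE,
                                   self._io_callback)

    def _io_callback(self):
        # Drop the current registration, then ask the connection what it
        # wants to wait for next.
        self.selector.unregister(self.conn.fileno())
        self._update_handler()


def run_demo():
    a, b = socket.socketpair()
    selector = selectors.DefaultSelector()
    done = []
    conn = AsyncConnection(FakeConnection(a, [POLL_WRITE, POLL_READ]), selector)
    conn.wait(lambda: done.append("finished"))
    b.send(b"x")  # make `a` readable so the POLL_READ step can fire
    while not done:
        for key, _events in selector.select(timeout=1):
            key.data()  # the registered _io_callback
    selector.close()
    a.close()
    b.close()
    return done
```

Calling `run_demo()` walks the fake connection through one write wait and one read wait before the completion callback runs.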
On Tue, Mar 1, 2011 at 11:24 PM, Ben Darnell <b...@bendarnell.com> wrote:
> You need to use the IOLoop {add,update,remove}_handler methods to listen for
> updates on the connection's fileno(). Very roughly:
>
> class AsyncConnection:
>     def __init__(self, conn):
>         self.conn = conn
>         self.callback = None
>
>     def query(self, args, callback):
>         self.conn.execute(args)
>         self.callback = callback
>         self._update_handler()
>
>     def _update_handler(self):
>         state = self.conn.poll()
>         if state == psycopg2.extensions.POLL_OK:
>             callback = self.callback
>             self.callback = None
>             callback()
>         elif state == psycopg2.extensions.POLL_READ:
>             IOLoop.instance().add_handler(self.conn.fileno(), IOLoop.READ,
>                                           self._io_callback)
>         elif state == psycopg2.extensions.POLL_WRITE:
>             IOLoop.instance().add_handler(self.conn.fileno(), IOLoop.WRITE,
>                                           self._io_callback)
>
>     def _io_callback(self, events):
>         # maybe keep track of the previous state so you can use update_handler
>         # instead of remove/add
>         IOLoop.instance().remove_handler(self.conn.fileno())
>         self._update_handler()
>
> -Ben
It's not in a very usable state right now. Just after I sent my message I discovered that the cursor cannot run two queries concurrently (I read about it afterwards). You can see this by starting the script, opening localhost (on whatever port you configured) in your browser and refreshing a few times very quickly. The terminal will then show an exception.
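One way to live with the one-query-at-a-time limitation described above is to queue requests per connection and start the next query only when the previous one has finished. The following is a minimal sketch of that idea, not the code under discussion: `start_query(sql, on_done)` is a hypothetical hook standing in for whatever begins an asynchronous query, and `FakeDriver` exists only so the example runs without a database.

```python
from collections import deque


class QueryQueue:
    """Run at most one query at a time on a single connection."""

    def __init__(self, start_query):
        # start_query(sql, on_done) is a hypothetical hook that begins an
        # asynchronous query and calls on_done(result) when it completes.
        self._start_query = start_query
        self._pending = deque()
        self._busy = False

    def execute(self, sql, callback):
        self._pending.append((sql, callback))
        self._run_next()

    def _run_next(self):
        if self._busy or not self._pending:
            return
        self._busy = True
        sql, callback = self._pending.popleft()

        def on_done(result):
            # Free the connection before handing the result to the caller,
            # then start whatever is queued next.
            self._busy = False
            callback(result)
            self._run_next()

        self._start_query(sql, on_done)


class FakeDriver:
    """Hypothetical driver that records how many queries are in flight."""

    def __init__(self):
        self.in_flight = []
        self.max_in_flight = 0

    def start_query(self, sql, on_done):
        self.in_flight.append((sql, on_done))
        self.max_in_flight = max(self.max_in_flight, len(self.in_flight))

    def complete_one(self):
        sql, on_done = self.in_flight.pop(0)
        on_done("rows for " + sql)


def run_demo():
    driver = FakeDriver()
    queue = QueryQueue(driver.start_query)
    results = []
    for sql in ("select 1", "select 2", "select 3"):
        queue.execute(sql, results.append)
    while driver.in_flight:
        driver.complete_one()
    return results, driver.max_in_flight
```

A pool of several connections, each with its own queue, would let unrelated requests proceed in parallel while still never overlapping two queries on one connection.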
On Thu, Mar 3, 2011 at 4:05 PM, Peter Bengtsson <pete...@gmail.com> wrote:
> I wish I could put a big fat star on that piece of code.
> Man, we really need a site to collect all these nifty snippets related to
> Tornado.
Just create a gist on gist.github.com or create a public repository (github, bitbucket, etc). Then, link to them from the tornado wiki.
Not ultra-sexy but very low cost of entry and available today.
-j
You mentioned FRiCKLE/ngx_postgres as an async option for Postgres, but
then went on to say "via HTTP which brings its own slew of
excitement". Care to elaborate on any of your experiences for those
of us new to ngx_postgres?
Regards,
Brent
On Mar 1, 5:22 pm, Cliff Wells <cl...@develix.com> wrote:
> I haven't tried it, but there's an async Twisted driver here:
> It uses the async features of psycopg2, so it would be a good starting
> point for a Tornado driver.
> Another option (that I have used) is ngx_postgres. It works well, but
> requires making your queries via HTTP which brings its own slew of
> excitement:
I'm currently using it. You'll have to compile your own nginx with the custom modules. What really bit me were the escape routines (everything seems to be escaped as a string) and complex conditional SQL that you would normally generate with application logic. It's pretty cool once you have it up and running, but the ramp-up might cost you features you could otherwise have built.
On Fri, 2011-03-04 at 07:58 -0800, emcconne wrote:
> You mentioned FRiCKLE/ngx_postgres as a async option for Postgres, but
> then went on to say "via HTTP which brings its own slew of
> excitement". Care to elaborate on any of your experiences for those
> of us new to ngx_postgres?
I was making reference to trying to map relational datasets onto a RESTful interface. If you're comfortable doing that (I've found using lots of server-side views to be key) then it's mostly just a discovery process. I know for me it took lots and lots of refactoring, but overall I was fairly happy with the result.
Hi, I've made a simple connection pool (not completely finished) and it works with one exception: the Tornado server can't fork into multiple processes, because the IOLoop is initialized before the server is started. That's what the exception tells me. Besides that, I think everything works.
Here's the code: http://pastebin.com/JAng2yRU. Would be cool if people can test it. :) You only need to start a PostgreSQL database and change the settings in the script.
I actually tried it and it worked really well. Thanks. However my scenario was very basic and just an excuse to play with your code.
However, I quickly gave up when I realised I had to actually write the SQL :)
One thing that would have helped is being able to mix synchronous and asynchronous queries, especially since a couple of quick queries leading up to a big slow one don't need a bunch of async callbacks.
I haven't actually done anything advanced with it. Just confirmed that it worked with one or two queries.
It's only a wrapper for Psycopg so it's obvious that you have to write SQL. ;) You want some kind of ORM?
I was thinking of adding some kind of query chain [https://github.com/FSX/momoko/issues#issue/3], but I haven't had time for this yet. The idea is that you only need one callback, which is called once the last query has finished, and maybe optional callbacks in between. Not sure about this yet.
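The query-chain idea can be sketched in a few lines. This is not momoko's API, only an illustration of the shape described above: `run_chain`, its parameters, and the `execute(sql, callback)` hook are hypothetical names, and `fake_execute` simulates an async driver by deferring callbacks to a list.

```python
def run_chain(execute, queries, on_finish, on_step=None):
    """Run queries one after another; call on_finish(results) after the last.

    execute(sql, callback) is a hypothetical async hook. on_step, if given,
    is called as on_step(index, result) after each query completes.
    """
    queries = list(queries)
    results = []

    def start(index):
        if index == len(queries):
            on_finish(results)
            return

        def done(result):
            results.append(result)
            if on_step is not None:
                on_step(index, result)
            start(index + 1)  # only now is the next query issued

        execute(queries[index], done)

    start(0)


def run_demo():
    pending = []  # callbacks waiting for the fake event loop
    log = []

    def fake_execute(sql, callback):
        log.append("start " + sql)
        pending.append(lambda: callback(sql.upper()))

    finished = []
    run_chain(fake_execute, ["select 1", "select 2"], finished.append)
    while pending:
        pending.pop(0)()
    return log, finished
```

Because each query is started from the previous query's callback, the chain also guarantees execution order, at the cost of running the queries strictly one at a time.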
On Mon, 2011-03-28 at 13:02 +0200, Frank Smit wrote:
> I haven't actually done anything advanced with it. Just confirmed that
> it worked with one or two query,
>
> It's only a wrapper for Psycopg so it's obvious that you have to write
> SQL. ;) You want some kind of ORM?
Just my two cents: having used SQLObject and SQLAlchemy, I'd take the plain SQL API any day. I know a lot of people like ORMs, but I prefer the simplicity of plain SQL. ORMs do make some tedious things simple, but for me the added mental cost of the abstraction isn't worth it.
> I was thinking of adding some kind of query chain
> [https://github.com/FSX/momoko/issues#issue/3], but I haven't had time
> for this yet. The idea is that you only need one callback that is
> called once the last query has finished. And maybe optional callbacks
> in between. Not sure about his yet.
I wrote a little bit of code that simply acts as an accumulator for multiple queries. It's not a chain like described in the link above, but it does unroll the async callback sequence a bit, which makes things nicer if you have to do multiple queries for a single page.
    class Aggregator(object):
        '''used as a callback that accumulates results of multiple
        separate callbacks and finishes when they are all accounted for
        '''
        def __init__(self, finish, required):
            '''finish is a callback function to be invoked when all required callbacks are done
            required is a list of names of callbacks (strings)
            '''
            self.finish = finish
            self.required = required
            self.values = {}

        def __call__(self, which, values):
            self.values[which] = values
            if set(self.required) == set(self.values.keys()):
                self.finish(self.values)
The above class gets used like so (seriously simplified from working code, so it's certainly broken).
    class RequestHandler(BaseHandler):
        @tornado.web.asynchronous
        def get(self, ...):
            def render(values):
                # values is a dictionary, where the keys are the same passed to Aggregator.__init__
                # and the values are the query results.
                # ... do some stuff to values, render a template, etc
                self.write(...)
                self.finish()

            views = {
                'query1': 'select * from foo',
                'query2': 'select * from bar'
            }

            ag = Aggregator(render, views.keys())

            for key, query in views.items():
                # theoretical call to some async db api
                db.execute(query, callback=functools.partial(ag, key))
Hope that makes sense and I didn't screw it up too badly in the simplification. Basically it allows me to have a single callback (an Aggregator object) that only calls render() once all the queries have completed.
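For anyone who wants to run the pattern without Tornado or a database, here is a self-contained sketch. The `Aggregator` follows the class above; `FakeDB` is a hypothetical stand-in for "some async db api" whose `flush()` method plays the role of the event loop delivering results.

```python
import functools


class Aggregator(object):
    """Callback that accumulates named results and fires once all are in."""

    def __init__(self, finish, required):
        self.finish = finish
        self.required = set(required)
        self.values = {}

    def __call__(self, which, values):
        self.values[which] = values
        if self.required == set(self.values):
            self.finish(self.values)


class FakeDB(object):
    """Hypothetical async API: execute() queues work, flush() completes it."""

    def __init__(self):
        self._pending = []

    def execute(self, query, callback):
        self._pending.append((query, callback))

    def flush(self):
        # Complete in reverse order to show that arrival order is irrelevant.
        while self._pending:
            query, callback = self._pending.pop()
            callback("rows from: " + query)


def run_demo():
    rendered = []  # stands in for render(); receives the dict of results
    views = {"query1": "select * from foo", "query2": "select * from bar"}
    ag = Aggregator(rendered.append, views.keys())
    db = FakeDB()
    for key, query in views.items():
        db.execute(query, callback=functools.partial(ag, key))
    db.flush()
    return rendered
```

`functools.partial(ag, key)` is what ties each result to its name: the database layer only supplies the rows, and the key is bound in advance.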
One issue I see is that it doesn't appear possible to know which query's results you are looking at by the time you reach the callback. For instance, if the queries were for a blog, and one query represented "article by id" and another was for "comments by post", you'd want to have them named in some way so you can easily reference them later (this is one reason I used a dict rather than a sequence).
On Wed, Mar 30, 2011 at 2:30 PM, Frank Smit <fr...@61924.nl> wrote:
> You're right, I didn't think about it at the time of writing. Will fix this
> when I get home.
>
> Regards,
> Frank
This BatchQuery implementation looks really cool! Often when I have a series of statements to run, I want to make sure they run in a specific order. Since dictionaries are unordered, maybe the list idea was better. The return value could still be a dict, keyed by each query's position in the original list; or just use the dict in the collect method, then sort it and return a tuple of results.
Just a thought.
Andrew
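The "collect into a dict, then sort and return a tuple" suggestion could look roughly like this. It is a sketch, not the BatchQuery code: `OrderedCollector` and `callback_for` are hypothetical names. Note that it only orders the results; the queries themselves may still complete in any order.

```python
class OrderedCollector:
    """Collect results of concurrent queries and return them in list order."""

    def __init__(self, count, finish):
        self._count = count
        self._finish = finish
        self._results = {}  # position in the original list -> result

    def callback_for(self, position):
        def collect(result):
            self._results[position] = result
            if len(self._results) == self._count:
                ordered = tuple(self._results[i] for i in sorted(self._results))
                self._finish(ordered)
        return collect


def run_demo():
    queries = ["select * from article", "select * from comment"]
    finished = []
    collector = OrderedCollector(len(queries), finished.append)
    callbacks = [collector.callback_for(i) for i in range(len(queries))]
    # Simulate the second query completing before the first.
    callbacks[1]("comments")
    callbacks[0]("article")
    return finished
```

In the demo the results arrive out of order, yet the finish callback receives them in the order the queries were listed.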
On Thu, 2011-03-31 at 19:30 -0400, Andrew Zeneski wrote:
> Frank,
>
> This BatchQuery implementation looks really cool! Often when I have a
> series of statements to run, I will want to make sure they run in a
> specific order. Since dictionaries are unordered, maybe the list idea
> was better. The return could still be a dict, with the key being the
> sequence the query was in the original list; or just use the dict in
> the collect method, then sort it and return a tuple of results back.
If you need them run in a specific order, then async is not the way to go. Rather, you should simply use the standard sync API or chain your queries via callbacks.