Multiple Database calls in an asynchronous request

rsnj

Jun 25, 2011, 12:29:36 AM
to Tornado Web Server
I am currently using Tornado and asyncmongo to create a website that
accesses MongoDB. Everything is working great except when I need to
make multiple requests to MongoDB within a single request to my
handler. I would like to keep all my database calls asynchronous so
that there is no blocking on the server.

How can I accomplish this? There are several cases in which I need to
retrieve different documents from multiple collections. I also
sometimes need to use data retrieved by my first query, such as a
foreign key, in the second. I will also need the data from both
requests when I render my template.

Thanks!

Matt Ferguson

Jun 25, 2011, 1:12:07 AM
to python-...@googlegroups.com

rsnj

Jun 25, 2011, 7:05:52 AM
to Tornado Web Server
I'm already aware of how to use asyncmongo at its basic level. My
question is more about how to handle a request in which I need
multiple asynchronous callbacks and need to pass data between them.
Is there a data structure available to do this?

For instance, I have a user_id in my request, so I need to retrieve my
user object from the users collection, but then for that user object I
also need to retrieve a list of events from my events collection. How
can I retrieve and pass along both the user object and the events list
from MongoDB if they are stored in separate collections? It's less a
question of how to make the calls to MongoDB and more a question
of how to make two asynchronous calls in a single request with
Tornado.

On Jun 25, 1:12 am, Matt Ferguson <mbf...@gmail.com> wrote:
> http://www.dunnington.net/entry/asynchronous-mongodb-in-tornado-with-...

Frank Smit

Jun 25, 2011, 7:21:50 AM
to python-...@googlegroups.com
Check out the source code of Momoko (https://github.com/FSX/momoko)
(AdispClient and the adisp module) or Brukva
(https://github.com/kmerenkov/brukva). Both use adisp to make
blocking-style calls possible. Maybe you can modify asyncmongo to do
this too.

Rafael Garcia

Jun 25, 2011, 8:27:39 AM
to python-...@googlegroups.com
check out swirl too: http://code.naeseth.com/swirl/

Ben Darnell

Jun 25, 2011, 12:02:20 PM
to python-...@googlegroups.com
To make multiple sequential async calls, just start the second call from the first one's callback:

  @asynchronous
  def get(self):
    self.db.users.find({"username":self.current_user}, callback=self._on_user)

  def _on_user(self, response, error):
    self.db.events.find(..., callback=self._on_events)

  def _on_events(self, response, error):
    self.render("page.html", events=...)

The libraries Frank and Rafael pointed out can help streamline this pattern by making the whole thing look like one big function.  However, they are less suitable for situations where you need to make more than one async call in parallel.  For that, I like to use something like this to manage all the callbacks: https://gist.github.com/741041

-Ben
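
For the parallel case, one way to manage the callbacks is a small counting barrier: each async call gets its own callback, and a final function fires once every result has arrived. The sketch below only illustrates that idea and is not the code from the gist; `CallbackBarrier`, `fake_find`, and the `pending` queue are hypothetical names, and the queue stands in for the IOLoop so the example runs without Tornado or a database.

```python
import functools


class CallbackBarrier(object):
    """Collect results from several callbacks, then call `done` once."""

    def __init__(self, keys, done):
        self._remaining = set(keys)
        self._results = {}
        self._done = done

    def callback_for(self, key):
        # Each async call gets a callback that remembers its own key.
        return functools.partial(self._on_result, key)

    def _on_result(self, key, response, error=None):
        self._results[key] = response
        self._remaining.discard(key)
        if not self._remaining:
            self._done(self._results)


# Hypothetical stand-in for the event loop: callbacks queued here run later.
pending = []


def fake_find(collection, callback):
    # Stand-in for an async query; it "completes" when the queue is drained.
    pending.append(
        functools.partial(callback, "%s-result" % collection, error=None))


finished = []
barrier = CallbackBarrier(["user", "events"], finished.append)
fake_find("users", barrier.callback_for("user"))
fake_find("events", barrier.callback_for("events"))

# Drain the queue in reverse to show that completion order does not matter.
while pending:
    pending.pop()()
```

In a real handler, the `done` function would be the place to call `self.render(...)` with both results. Error handling is left out of this sketch.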

Landon

Jun 26, 2011, 12:19:57 AM
to Tornado Web Server
Ben,

In this sort of situation, should
self.async_callback(self.callback_function,
additional_params=additional_params) be used to define callbacks that
carry extra data, or should another pattern be used? This is what
I've been forced to use without much choice in the matter. Maybe I'm
not Python-capable enough to figure out how to send those callback
arguments with straight asyncmongo, but I recall you specifically
saying that self.async_callback() itself shouldn't be used.

Can you clarify?

example:

def get(self, data):
    self.db.coll.find({}, callback=self.async_callback(self.on_callback,
                                                       data=data))

def on_callback(self, response, error, data):
    self.do_stuff()

This is what a large portion of my asyncmongo 'chain' looks like.

Andrew Fort

Jun 26, 2011, 1:09:55 AM
to python-...@googlegroups.com
On Sat, Jun 25, 2011 at 9:19 PM, Landon <land...@gmail.com> wrote:
> Ben,
>
> In this sort of situation, should the use of
> self.async_callback(self.callback_function,
> additional_params=additional_params) be used to define callbacks with
> inline data to them, or should another pattern be used?  This is what
> i've been forced to use without much choice in the matter.  Maybe i'm
> not python-capable enough to figure out how to send those callback
> arguments with straight asyncmongo, but I recall you specifically
> saying that self.async_callback() itself shouldn't be used.

Yeah, it requires a little magic... fortunately, said magic exists in
the Python standard library (functools module).

As far as I can tell, this is the preferred approach when adding
callbacks to the IOLoop (via add_timeout or add_callback), or as
callback arguments to methods on the IOStream or HTTPClient classes.

If you look at async_callback's implementation, it uses
functools.partial() to bind arguments to your callback, allowing
the "magic" callable returned by partial() to be provided to
Tornado.

e.g.,

import functools

def http_callback(http_response_object, bar=None, baz=None):
    # does fabulous things based on a tornado HTTPClient response object
    pass

bar_param = "some additional state I want to pass to http_callback"
baz_param = "something else..."
...

httpclient_cb = functools.partial(http_callback, bar=bar_param, baz=baz_param)

and then you could use it like so:

import tornado.httpclient

hc = tornado.httpclient.AsyncHTTPClient()

hc.fetch("http://example.com/some/url", httpclient_cb)

httpclient_cb will be called with an httpclient.HTTPResponse instance as
the first argument, and functools.partial takes care of the rest of
your arguments.

Though I didn't show it, you can also pass positional args with
functools.partial; see the documentation
(http://docs.python.org/library/functools.html#functools.partial) for
more info.

For another code example, see the docstring of the IOLoop class in ioloop.py.
(https://github.com/facebook/tornado/blob/v2.0.0/tornado/ioloop.py#L54)

-a

--
Andrew Fort (af...@choqolat.org)
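
A quick illustration of the positional form mentioned above: arguments given to functools.partial are placed before whatever the caller passes later, so the response has to come after them in the callback's signature. `on_fetch` and its arguments are made-up names for this sketch.

```python
import functools


def on_fetch(request_id, label, response):
    # request_id and label were bound ahead of time; response arrives later.
    return "%s/%s: %s" % (request_id, label, response)


# Positional arguments bound by partial come first in the final call.
cb = functools.partial(on_fetch, 42, "profile")

# Whoever invokes the callback supplies only the response.
result = cb("<response body>")
```

When a library passes the response as the first parameter of the callback, as in the HTTPClient example above, binding the extra state by keyword is usually the simpler fit.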

Mikhail Korobov

Jun 26, 2011, 6:03:42 AM
to python-...@googlegroups.com
Hi Ben,

adisp can also handle parallel async calls:

c = brukva.Client()
c.connect()

class MainHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    @adisp.process
    def get(self):
        foo, bar = yield c.async.get('foo'), c.async.get('bar') # that's it
        self.set_header('Content-Type', 'text/html')
        self.render("template.html", title="Simple demo", foo=foo, bar=bar)
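
For readers wondering how a decorator like this can make callback code read sequentially, here is a rough sketch of the general generator technique. It is not adisp's actual implementation; `process`, `async_get`, and the `pending` queue are hypothetical names, and only one yielded call at a time is handled (no parallel tuples).

```python
import functools

pending = []  # stand-in for the event loop's queue of ready callbacks


def async_get(key):
    """Return a function that starts the 'async' call when given a callback."""
    def start(callback):
        pending.append(functools.partial(callback, "value-of-%s" % key))
    return start


def process(gen_func):
    """Drive a generator: each yielded task is started with a callback
    that resumes the generator with the task's result."""
    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        gen = gen_func(*args, **kwargs)

        def step(result=None):
            try:
                task = gen.send(result)
            except StopIteration:
                return
            task(step)

        step()
    return wrapper


output = []


@process
def handler():
    foo = yield async_get("foo")
    bar = yield async_get("bar")  # could use foo to build this query
    output.append((foo, bar))


handler()
while pending:
    pending.pop(0)()
```

Each `yield` hands control back to the driver, which resumes the generator from the callback, so the second call can use the result of the first.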

Ben Darnell

Jun 26, 2011, 12:38:24 PM
to python-...@googlegroups.com
async_callback was primarily used for exception handling, which is now unnecessary thanks to stack_context.  It could also bind parameters into a function as a convenience, but the preferred way to do that now is to use functools.partial from the standard library (or use a lambda).

-Ben
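
To make the two options concrete, the sketch below binds an extra argument to a callback both ways. `on_response` is a made-up callback. The loop at the end shows one practical difference: a lambda looks its variables up when it runs, while partial captures the value when it is created.

```python
import functools


def on_response(response, user=None):
    return (response, user)


# Two equivalent ways to bind `user` ahead of time.
with_partial = functools.partial(on_response, user="alice")
with_lambda = lambda response: on_response(response, user="alice")

same = with_partial("r") == with_lambda("r")

# Difference: lambdas see the loop variable's final value, partial does not.
lambdas = [lambda response: on_response(response, user=name)
           for name in ("alice", "bob")]
partials = [functools.partial(on_response, user=name)
            for name in ("alice", "bob")]

lambda_users = [cb("r")[1] for cb in lambdas]
partial_users = [cb("r")[1] for cb in partials]
```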

rsnj

Jun 27, 2011, 10:31:23 AM
to Tornado Web Server
Thanks for all the replies! I'm going to look into both adisp and
functools.partial.

But at the most basic level, would something like this work, assuming
I need to make my calls sequentially?

@tornado.web.asynchronous
def get(self, id):
    self.id = id
    self.db = asyncmongo.Client(pool_id='mypool', host='localhost',
                                port=27017, dbname='mydb')

    self.db.users.find_one({'username': self.current_user},
                           callback=self.on_user)

def on_user(self, response, error):
    if error:
        raise tornado.web.HTTPError(500)
    self.user = response
    self.db.documents.find_one({'id': self.id, 'user': self.user},
                               callback=self.on_document)

def on_document(self, response, error):
    if error:
        raise tornado.web.HTTPError(500)
    self.render('template', first_name=self.user['first_name'],
                document=response)

Are there any issues with using instance attributes on the handler
within a request?

Ben Darnell

Jun 27, 2011, 11:42:49 AM
to python-...@googlegroups.com
Yes, instance variables work too.  As a stylistic matter I prefer to use instance variables for things that are valid throughout the entire life of the instance and pass objects that are valid only during certain phases of the request via functools.partial, but there's nothing wrong with just using instance variables for everything as long as you're careful about when they exist.

-Ben

Landon

Jun 27, 2011, 11:47:09 PM
to Tornado Web Server
Ben-

Any chance you could drop a quick example of how to format that stuff
"as intended"? I have a lot of this in my code and I'd like to be
sure I'm doing it right, since I rely heavily on async_callback.

Thanks!

-L

Ben Darnell

Jun 28, 2011, 1:07:05 AM
to python-...@googlegroups.com
It's hard to think of a quick non-contrived example, but basically you'd just use functools.partial wherever you're currently using async_callback.  If you've got a long chain of callbacks you may want to set some things as instance attributes just to avoid repeating them all the way down the chain, but until the repetition gets annoying I prefer to keep state bound into the callback rather than in `self` to make it explicit what lives across async segments.

-Ben
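
As one possible shape for such a chain, the sketch below carries the user from the first callback into the second with functools.partial instead of storing it on `self`. Everything here is a stand-in so that it runs without Tornado or asyncmongo: `FakeCollection` mimics only the `find_one(query, callback=...)` call style used earlier in the thread, and `Handler` is a plain class, not a `RequestHandler`.

```python
import functools


class FakeCollection(object):
    """Hypothetical stand-in: answers find_one() by calling back at once."""

    def __init__(self, doc):
        self.doc = doc

    def find_one(self, query, callback):
        callback(self.doc, error=None)


class Handler(object):
    def __init__(self, users, documents):
        self.users = users
        self.documents = documents
        self.rendered = None

    def get(self, doc_id):
        # doc_id is only needed by the next step, so bind it into the callback.
        self.users.find_one(
            {"username": "alice"},
            callback=functools.partial(self._on_user, doc_id=doc_id))

    def _on_user(self, user, error, doc_id):
        # Pass the user forward explicitly instead of setting self.user.
        self.documents.find_one(
            {"id": doc_id, "user": user["_id"]},
            callback=functools.partial(self._on_document, user=user))

    def _on_document(self, document, error, user):
        # Stand-in for self.render(...): record what would be rendered.
        self.rendered = {"first_name": user["first_name"],
                         "document": document}


handler = Handler(FakeCollection({"_id": 1, "first_name": "Alice"}),
                  FakeCollection({"id": 7, "title": "Notes"}))
handler.get(7)
```

Each callback's signature then lists exactly the state it depends on, which is the explicitness described above. Error checks are omitted to keep the sketch short.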

Andrew Fort

Jun 28, 2011, 1:16:37 AM
to python-...@googlegroups.com
On Mon, Jun 27, 2011 at 10:07 PM, Ben Darnell <b...@bendarnell.com> wrote:
> It's hard to think of a quick non-contrived example, but basically you'd
> just use functools.partial wherever you're currently using async_callback.
>  If you've got a long chain of callbacks you may want to set some things as
> instance attributes just to avoid repeating them all the way down the chain,
> but until the repetition gets annoying I prefer to keep state bound into the
> callback rather than in `self` to make it explicit what lives across async
> segments.
> -Ben

As an example, a problem I've had in a proxy (using IOStream
instances) lately was that of knowing a stream proxy connection was
successful.

In v1.2.1, at least, IOStream.connect() will call the callback you
supply even if the connection did not succeed, so to know for sure I
needed to do a write to the stream, as the write callback only occurs
when the write succeeded.

I passed the "header" (the bytes I was going to write) from the method
calling the stream's connect(), through the connect callback, to the
write callback, by using functools.partial().

Inside the write callback, I set a self._connected = True variable
for the rest of the proxy connection to know it was OK to continue
with operations such as flushing the data which had buffered on the
connection class while the connection - and other asynchronous
operations such as connection authentication - were underway.

Cheers,
Andrew
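
A rough sketch of that hand-off, under the behavior described above (the connect callback always fires; the write callback fires only on success). `FakeStream` is a hypothetical stand-in for IOStream that reproduces just that behavior so the example runs on its own; `ProxyConnection` and its methods are invented names too.

```python
import functools


class FakeStream(object):
    """Stand-in stream: connect() always calls back, write() only on success."""

    def __init__(self, reachable):
        self.reachable = reachable
        self.written = []

    def connect(self, address, callback):
        callback()  # fires whether or not the connection succeeded

    def write(self, data, callback):
        if self.reachable:  # a failed write never calls back
            self.written.append(data)
            callback()


class ProxyConnection(object):
    def __init__(self, stream):
        self.stream = stream
        self._connected = False

    def start(self, address, header):
        # Carry `header` through the connect callback with partial.
        self.stream.connect(
            address, functools.partial(self._on_connect, header))

    def _on_connect(self, header):
        self.stream.write(header, functools.partial(self._on_write, header))

    def _on_write(self, header):
        # Only reached if the write succeeded, so the connection is usable.
        self._connected = True


good = ProxyConnection(FakeStream(reachable=True))
good.start(("example.invalid", 80), b"HELLO")
bad = ProxyConnection(FakeStream(reachable=False))
bad.start(("example.invalid", 80), b"HELLO")
```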
