Using AsyncHTTPClient-Based Library in Multi-Threaded Backend Server

Calvin

Aug 26, 2012, 7:05:49 PM
to python-...@googlegroups.com
My application uses Tornado for its front-end servers, and must interact with several external APIs.  We've written a few custom libraries that use AsyncHTTPClient to make non-blocking HTTP requests to those external services.

We're currently building several back-end web servers for our internal APIs.  For these servers, we're thinking about using a multi-threaded web server (i.e., not Tornado) since 1) they don't need to handle many concurrent connections and 2) this allows us to use third-party SDKs that make blocking network requests (e.g., boto for communicating with AWS).

Is it a good idea to use the AsyncHTTPClient-based libraries in a multi-threaded environment?  If so, then what would the recommended implementation look like?  Should I instantiate a private IOLoop per thread? Or is it safe to share a single IOLoop among all threads?  Or is there some other better design?

Ben Darnell

Aug 27, 2012, 1:02:42 AM
to python-...@googlegroups.com
The only IOLoop method that's safe to call from another thread is
add_callback. So you need to either give each thread its own IOLoop,
or have your other threads use add_callback to send work to the IOLoop
thread (and then something else to send responses back, depending on
the threading model for the rest of your app). You can do it, but
it's kind of tricky, and I'd question whether it's worth it. Since
you're already using a thread per request and making blocking calls
out to AWS for other parts of the app, you might as well stick with
that model for everything unless you have a specific reason to use
AsyncHTTPClient for some parts (e.g. if you're long-polling or
otherwise expect some operations to be much slower than the ones
you're doing synchronously).

-Ben

Calvin

Aug 27, 2012, 7:41:41 PM
to python-...@googlegroups.com, b...@bendarnell.com
Ah I forgot to explain in the original post that both the front-end and back-end servers must be able to communicate with the same external APIs.

The reason I'd like to use the AsyncHTTPClient in the multi-threaded environment is so I can re-use the code I've already written for the front-ends.  I want to avoid maintaining both an async and non-async version of the API client library for the two different environments.

Is there some clever way of implementing these libraries such that I can easily pass in a flag (e.g., use_async=False) that toggles between asynchronous and non-asynchronous modes?  I guess what I have in mind is something like the following, but this seems like a rather clumsy implementation (this code is untested):

from tornado import httpclient

class FooAPIClient(object):
    # The flag is named use_async because `async` is a reserved word in Python 3.7+.
    def __init__(self, use_async=True):
        self.use_async = use_async
        if self.use_async:
            self.http_client = httpclient.AsyncHTTPClient()
        else:
            self.http_client = httpclient.HTTPClient()

    def fetch_user(self, user_id, callback=None):
        url = "http://api.foo.com/user?id=" + str(user_id)
        if self.use_async:
            def on_response(response):
                if callback:
                    callback(response)

            # Non-blocking: the response is delivered to the callback later.
            self.http_client.fetch(url, callback=on_response)
        else:
            # Blocking: the response is returned directly.
            return self.http_client.fetch(url)

Ben Darnell

Aug 28, 2012, 12:36:16 AM
to python-...@googlegroups.com
On Mon, Aug 27, 2012 at 7:41 PM, Calvin <calvin...@gmail.com> wrote:
> Ah I forgot to explain in the original post that both the front-end and
> back-end servers must be able to communicate with the same external APIs.
>
> The reason I'd like to use the AsyncHTTPClient in the multi-threaded
> environment is so I can re-use the code I've already written for the
> front-ends. I want to avoid maintaining both an async and non-async version
> of the API client library for the two different environments.

OK, that's a good reason to mix sync and async styles. The basic
pattern you need to use is this:

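import threading

from tornado.httpclient import AsyncHTTPClient
from tornado.ioloop import IOLoop

class BlockingCallback:
    def __init__(self):
        self.event = threading.Event()

    def set(self, arg):
        # Runs on the IOLoop thread: store the result and wake the waiter.
        self.arg = arg
        self.event.set()

    def wait(self):
        # Runs on the worker thread: block until set() has been called.
        self.event.wait()
        return self.arg

def do_synchronous_work(url):
    # do stuff
    # ...
    # now start an async fetch
    bc = BlockingCallback()
    http_client = AsyncHTTPClient()
    IOLoop.instance().add_callback(http_client.fetch, url, callback=bc.set)
    response = bc.wait()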

-Ben
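
The pattern above can be wrapped in a small helper so that any callback-style function gets a blocking facade. This is only a sketch in plain Python 3 with no Tornado dependency: call_blocking, fake_fetch, run_now, and the schedule argument are hypothetical names, and a threading.Timer stands in for the IOLoop thread that would eventually invoke the callback. With Tornado, schedule would presumably be IOLoop.instance().add_callback.

```python
import functools
import threading


class BlockingCallback(object):
    """Park the calling thread until set() is called from another thread."""

    def __init__(self):
        self.event = threading.Event()
        self.arg = None

    def set(self, arg):
        self.arg = arg
        self.event.set()

    def wait(self):
        self.event.wait()
        return self.arg


def call_blocking(schedule, async_func, *args):
    """Hand async_func to `schedule` and block until its callback fires.

    `schedule` is a hypothetical hook; with Tornado it would be
    IOLoop.instance().add_callback.
    """
    bc = BlockingCallback()
    schedule(functools.partial(async_func, *args, callback=bc.set))
    return bc.wait()


def fake_fetch(url, callback):
    """Stand-in for AsyncHTTPClient.fetch: answers from another thread."""
    timer = threading.Timer(0.05, callback, args=("response for " + url,))
    timer.start()


def run_now(func):
    """Trivial scheduler for this demo: call the function immediately."""
    func()


result = call_blocking(run_now, fake_fetch, "http://api.foo.com/user?id=1")
print(result)
```

The API client library can then stay purely callback-based, and only the multi-threaded servers go through the blocking helper.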

Calvin

Aug 28, 2012, 1:25:56 PM
to python-...@googlegroups.com, b...@bendarnell.com
Awesome, thanks Ben -- that makes sense.

Last question (I promise!):  Given that pattern, how does the IOLoop get started?  Should I just create a new thread during the application initialization that calls IOLoop.instance().start() and blocks for the lifetime of the application?

Ben Darnell

Aug 28, 2012, 1:27:12 PM
to python-...@googlegroups.com
Yes, you just need to call IOLoop.start in some thread and then leave
it alone.

-Ben
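
Starting the loop in its own thread, as described above, might look like the following sketch. It assumes Python 3 and uses the standard library's asyncio event loop as a stand-in for Tornado's IOLoop (Tornado 5 and later run on top of asyncio), so loop.run_forever plays the role of IOLoop.start and loop.call_soon_threadsafe the role of add_callback. start_loop_thread and stop_loop_thread are hypothetical helper names.

```python
import asyncio
import threading


def start_loop_thread():
    """Create an event loop and run it forever in a daemon thread."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, name="io-loop")
    thread.daemon = True  # do not keep the process alive at exit
    thread.start()
    return loop, thread


def stop_loop_thread(loop, thread):
    """Ask the loop to stop from outside its thread, then clean up."""
    loop.call_soon_threadsafe(loop.stop)
    thread.join()
    loop.close()


# Application start-up: start the loop once and leave it alone.
loop, thread = start_loop_thread()

ran_on = []
done = threading.Event()


def record():
    # Executed by the loop thread, not by the thread that scheduled it.
    ran_on.append(threading.current_thread().name)
    done.set()


# Any other thread hands work over through the thread-safe entry point.
loop.call_soon_threadsafe(record)
done.wait(timeout=5)

# Application shutdown.
stop_loop_thread(loop, thread)
```

Marking the thread as a daemon is a design choice for this sketch: the loop never returns on its own, so a non-daemon thread would keep the process alive unless it is stopped explicitly.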

Calvin

Aug 28, 2012, 1:41:09 PM
to python-...@googlegroups.com, b...@bendarnell.com
Thanks a ton!

Calvin

Aug 28, 2012, 3:00:27 PM
to python-...@googlegroups.com, b...@bendarnell.com
Okay one more stupid question (please pardon my ignorance -- still getting acquainted with Tornado/async programming).

Given this code:

    bc = BlockingCallback() 
    IOLoop.instance().add_callback(some_async_func) 
    response = bc.wait()

How does one handle exceptions that occur in some_async_func?  Right now, errors in some_async_func cause bc.wait() to block forever.

I looked over the documentation for stack_context, but am still a little confused about its usage.

Ben Darnell

Aug 28, 2012, 3:39:56 PM
to python-...@googlegroups.com
Exception handling in async code is a pain. The best thing to do is
to use a stack context in some_async_func to catch any stray
exceptions and ensure that the callback will always be called. If
some_async_func uses @gen.engine all you need is a try/except around
the whole function (thanks to gen.engine's internal stack context),
otherwise you'll need to do it by hand. simple_httpclient has an
example of this.

from tornado.stack_context import ExceptionStackContext

def some_async_func(callback):
    def handle_exception(*exc_info):
        callback(('error', exc_info))
        return True  # consume the error so it doesn't get logged again
    with ExceptionStackContext(handle_exception):
        # do stuff...
        # eventually something (probably another callback) will do
        callback(response)
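
On the waiting side, the blocked thread still has to notice the ('error', exc_info) tuple and should not wait forever if the callback never fires. A minimal sketch of that, assuming Python 3 and no Tornado dependency: the timeout parameter, the re-raise in wait, and the guarded and failing_async_func helpers are hypothetical additions, and a plain thread stands in for the IOLoop. Note that guarded only catches exceptions raised synchronously inside the function; errors raised in later callbacks still need the stack context described above.

```python
import sys
import threading


class BlockingCallback(object):
    def __init__(self):
        self.event = threading.Event()
        self.arg = None

    def set(self, arg):
        self.arg = arg
        self.event.set()

    def wait(self, timeout=None):
        # Event.wait returns False if the timeout expired first.
        if not self.event.wait(timeout):
            raise RuntimeError("callback was never invoked")
        arg = self.arg
        # Unpack the ('error', exc_info) convention and re-raise here.
        if isinstance(arg, tuple) and len(arg) == 2 and arg[0] == 'error':
            exc_type, exc_value, tb = arg[1]
            raise exc_value.with_traceback(tb)
        return arg


def guarded(func, callback):
    """Call func(callback); report a synchronous failure via callback."""
    try:
        func(callback)
    except Exception:
        callback(('error', sys.exc_info()))


def failing_async_func(callback):
    raise ValueError("boom")


bc = BlockingCallback()
worker = threading.Thread(target=guarded, args=(failing_async_func, bc.set))
worker.start()
try:
    bc.wait(timeout=5)
    outcome = "no error"
except ValueError as exc:
    outcome = "raised: %s" % exc
worker.join()
print(outcome)
```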