tornado + py-amqplib (subtitle: how to integrate blocking libraries to the ioloop)

paolo.losi

Feb 17, 2010, 8:44:57 AM
to Tornado Web Server
Hi all,

first of all, thanks for tornadoweb. It's really a great piece of
software, and we are using it with great satisfaction. We found it to be
very flexible/powerful and easy to understand at the same time.

Now we are trying to integrate a web application with an AMQP broker
(the web app should act as a consumer of a RabbitMQ queue and forward
the message to the browser on an async HTTP request).

The AMQP library we would like to use is py-amqplib since, AFAIK, it
is the most widespread and tested.

After some googling I found some approaches that require hijacking
py-amqplib.
A good review is on:

http://blag.ypermutator.ca/infrastructure/tornado/

I don't much like the approach of "monkey patching" py-amqplib,
because it obviously creates a strong dependency on py-amqplib
internals.

First question is:
1) Has anyone found and successfully used alternative approaches to
integrate AMQP consuming in tornadoweb?

The solution that I envisioned is a little more general, since it would
also accommodate other "blocking consumer" libraries.

The idea is to have a slave process in charge of the blocking reads on
the AMQP queue, and to let the slave process communicate with the tornado
ioloop via a pipe.

I would implement an InputPipe class along the lines of IOStream to
integrate the pipe into the ioloop.
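
A minimal sketch of the pipe idea, under several assumptions: it uses only the standard library, with `selectors` standing in for the tornado ioloop and a tiny child script standing in for the blocking AMQP consumer. The constructor arguments of `InputPipe`, `WORKER_SOURCE`, `run_demo` and the newline-delimited framing are all illustrative choices, not anything taken from tornado or py-amqplib, and the sketch is POSIX-only. With Tornado itself, the read end would presumably be registered through `IOLoop.add_handler(fd, handler, IOLoop.READ)` rather than a selector.

```python
import os
import selectors
import subprocess
import sys

# Hypothetical stand-in for the slave process. A real worker would block on
# the AMQP library and write each delivery to stdout instead.
WORKER_SOURCE = r"""
import sys, time
for i in range(3):
    time.sleep(0.05)
    sys.stdout.write("message %d\n" % i)
    sys.stdout.flush()
"""


class InputPipe:
    """Deliver newline-delimited messages from a pipe to a callback."""

    def __init__(self, selector, read_fd, on_message):
        self._selector = selector
        self._fd = read_fd
        self._on_message = on_message
        self._buffer = b""
        self.is_open = True
        os.set_blocking(read_fd, False)  # reads must never stall the loop
        selector.register(read_fd, selectors.EVENT_READ, self._on_readable)

    def _on_readable(self):
        try:
            chunk = os.read(self._fd, 4096)
        except BlockingIOError:
            return  # spurious wakeup, nothing to read yet
        if not chunk:
            # The worker closed its end: stop watching the descriptor.
            self._selector.unregister(self._fd)
            self.is_open = False
            return
        self._buffer += chunk
        # Hand over every complete line; keep any partial line buffered.
        while b"\n" in self._buffer:
            line, self._buffer = self._buffer.split(b"\n", 1)
            self._on_message(line.decode("utf-8"))


def run_demo():
    received = []
    selector = selectors.DefaultSelector()
    worker = subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE], stdout=subprocess.PIPE
    )
    pipe = InputPipe(selector, worker.stdout.fileno(), received.append)
    # Minimal event loop: dispatch readiness events to registered callbacks.
    while pipe.is_open:
        for key, _events in selector.select(timeout=1.0):
            key.data()
    worker.stdout.close()
    worker.wait()
    selector.close()
    return received
```

Calling `run_demo()` collects the three lines emitted by the child. The loop itself only ever performs non-blocking reads, which is the property the slave-process approach is after.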

The next questions are:
2) is the approach sensible?
3) has anyone already tried this approach?

Thanks for your attention

Paolo

Grigory Fateyev

Feb 17, 2010, 8:59:30 AM
to python-...@googlegroups.com
Hello paolo.losi!

On Wed, 17 Feb 2010 05:44:57 -0800 (PST) you wrote:

> Now we are trying to integrate a web application with a amqp broker
> (the web app should act as a consumer of a rabbitmq queue and forward
> the message to the browser on an async http request).

In our project we decided to use the STOMP protocol to send messages to
the queue. Look at http://stomp.codehaus.org/Protocol

--
All the best!

Emery

Feb 17, 2010, 6:56:53 PM
to Tornado Web Server
Just went through an incredibly frustrating series of trial and error
in order to get this same thing to work by forcing amqplib's socket in
and out of blocking between establishing new channels and receiving
messages for a consumer. Basically, we found it's not too difficult to
do this if you're sure you are using independent connections for
incoming consumers. But if you want to use multiple channels you end
up with corrupted streams from previous consumer data still remaining
in the input buffer of the socket when you try to communicate with the
amqp broker. And ultimately fixing this would be the same effort as
writing an asynchronous amqp library. We ended up doing our own
queuing in memory using a collections.deque and publishing messages
through HTTP POSTs. At this point I'm actually extremely disappointed
in AMQP as a protocol, and its implementations both in clients and
brokers. Nothing but unnecessary problems, with little benefit apart
from file-mapped persistence, which didn't even help us.
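
For illustration only, in-memory queuing of this kind might look roughly like the sketch below. It assumes a single logical queue where each message goes to exactly one waiting request; `InMemoryQueue`, `publish`, `wait_for_message` and `max_backlog` are hypothetical names, not Emery's actual code. In a comet server, the HTTP POST handler would call `publish` and each long-polling request would call `wait_for_message` with a callback that finishes the response.

```python
from collections import deque


class InMemoryQueue:
    """Hand each published message to one waiting callback, or buffer it."""

    def __init__(self, max_backlog=100):
        # Bounded backlog: the oldest messages are dropped once it is full.
        self._backlog = deque(maxlen=max_backlog)
        self._waiters = deque()

    def publish(self, message):
        if self._waiters:
            # Somebody is already waiting: deliver immediately.
            callback = self._waiters.popleft()
            callback(message)
        else:
            self._backlog.append(message)

    def wait_for_message(self, callback):
        if self._backlog:
            # A message arrived before the consumer asked for it.
            callback(self._backlog.popleft())
        else:
            self._waiters.append(callback)
```

Everything here runs on the ioloop thread, so no locking is needed, but nothing survives a restart either; that is the trade-off against a broker.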

paolo.losi

Feb 18, 2010, 4:11:25 AM
to Tornado Web Server
Thanks Emery for sharing your experience.

On Feb 18, 12:56 am, Emery <emery.denuc...@gmail.com> wrote:
> Just went through an incredibly frustrating series of trial and error
> in order to get this same thing to work by forcing amqplib's socket in
> and out of blocking between establishing new channels and receiving
> messages for a consumer. Basically, we found it's not too difficult to
> do this if you're sure you are using independent connections for
> incoming consumers. But if you want to use multiple channels you end
> up with corrupted streams from previous consumer data still remaining
> in the input buffer of the socket when you try to communicate with the
> amqp broker. And ultimately fixing this would be the same effort as
> writing an asynchronous amqp library.

That is exactly my point.

> We ended up doing our own
> queuing in memory using a collections.deque and publishing messages
> through HTTP POSTs. At this point I'm actually extremely disappointed
> in AMQP as a protocol, and its implementations both in clients and
> brokers. Nothing but unnecessary problems without very much benefit
> but file mapped persistence, which didn't even help us.

While I agree with you about the very poor protocol implementation
on the Python client side, we have been quite satisfied with the
flexibility of the AMQP model and the broker's functionality (we're
using RabbitMQ).

Did you find any problems on the broker side that we should be aware of?

I'm going to experiment with the approach described (slave process)
and let everyone know how it turns out.

Thanks
Paolo

Wil Tan

Feb 18, 2010, 7:25:53 AM
to python-...@googlegroups.com
Hi Paolo,

I know you dislike monkey-patching, but the monkey patching done in gevent [1] works very well in my experience. This makes amqplib cooperate.

Coupled with gtornado [2] (my tiny project to monkey patch Tornado to use gevent's hub), one could have the best of both worlds - one can use Tornado's web framework in a high performance epoll/kqueue-based http server along with amqplib and any other pure python library that uses (e.g. smtplib, xmpppy.)

A demo of Tornado + amqplib using this technique is here: http://gist.github.com/307604
Notice how apart from the first two lines of application-wide monkey-patching, the rest is just standard.



=wil

Emery

Feb 19, 2010, 1:51:28 AM
to Tornado Web Server
Yeah we use RabbitMQ too. I just feel amqp is somewhat over-engineered
for its purpose in the sense that it has so much functionality, like
topic exchanges and restart persistence, but at least in our use we
wouldn't actually be able to fully depend on those things, so it just
seems to get in the way of simple message queueing - but yeah, that's
just for our particular usage (a comet server), and likely somewhat
misguided contempt from my own failure to make sense and use of some
of the features. ;) One good experience I had in the past was
aggregating log messages from separate systems to a central
repository. That was pretty cool.

It's too bad I simply wasn't able to find an existing python amqp
library that was built for async, other than txamqp, which is made to
fit Twisted's event machine layer, and I wasn't ambitious enough to
combine tornado and twisted, or to rewrite amqplib as asynchronous. :)

No real bugs in the broker itself, just odd behavior that we didn't
expect to be a problem until actually implementing. Anyway I'm certain
you can get everything to work fine with rabbitmq if you take the leap
of making amqplib expect async traffic, or easier if you can just have
more connections.

One piece of information that might be of value to you is that
rabbitmq didn't seem to succumb to the c10k problem even with many
connections. We just decided to pull out when we realized rabbitmq
wasn't gaining us much of anything. None of the cool features were
usable at all because of some scenario that would require custom
logic. meh

Carlos A. Rocha

Feb 22, 2010, 2:40:36 PM
to Tornado Web Server
Hi Emery,

> Yeah we use RabbitMQ too. I just feel amqp is somewhat over-engineered

> for its purpose...

Try zeromq. It is super light and has very high performance:
http://www.zeromq.org/

I am currently working on integrating zmq's poll into an ioloop
implementation. Ping me if you are interested.

--
Carlos A. Rocha

paolo.losi

Feb 22, 2010, 6:42:35 PM
to Tornado Web Server
Hi Wil,

On Feb 18, 1:25 pm, Wil Tan <dre...@gmail.com> wrote:
> Hi Paolo,
>
> I know you dislike monkey-patching, but the monkey patching done in gevent
> [1] works very well in my experience. This makes amqplib cooperate.
>
> Coupled with gtornado [2] (my tiny project to monkey patch Tornado to use
> gevent's hub), one could have the best of both worlds - one can use
> Tornado's web framework in a high performance epoll/kqueue-based http server
> along with amqplib and any other pure python library that uses (e.g.
> smtplib, xmpppy.)

It's very interesting... you remind me that I should look at gevent
more closely.
The problem that I see with your solution is that you need a different
AMQP connection for every HTTP request. Since I'm forced to use an SSL
connection to the AMQP broker, that could be very expensive due to SSL
negotiation.
I guess it's impossible to share the same connection, right?

> A demo of Tornado + amqplib using this technique is here: http://gist.github.com/307604

Thanks for your example. I'll try to work along the lines.

Paolo

Wil Tan

Feb 23, 2010, 11:06:55 AM
to python-...@googlegroups.com
Hello Paolo,

On Tue, Feb 23, 2010 at 10:42 AM, paolo.losi <paolo...@gmail.com> wrote:

> It's very interesting... you remind me me that I should look at gevent
> more closely.
> The problem that I see with your solution is that you need a different
> amqp connection
> for every http request. Since I'm forced to use ssl connection to the
> amqp broker,
> that could be very expensive due to ssl negotiation.

Indeed, SSL negotiation is expensive.

 
> I guess it's impossible to share the same connection, right?


While I have not tested it in this configuration, I have used the multiplexing feature of AMQP in other projects to share a single connection. So you can have a single connection used by different greenlets, each with its own channel, or perhaps a pool of connections. With that configuration, you'll have to take care of reconnecting to the broker in case the connection is unexpectedly severed (server restart, network split, etc.). Also, if you are sharing a connection between greenlets (even with their own channels), I suspect you might need to lock access to the connection object to prevent race conditions.
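
A rough sketch of the locking concern, with stand-ins throughout: `threading` threads and `threading.Lock` play the role of greenlets and gevent's own lock primitives, and `FakeConnection` / `LockedChannel` are hypothetical classes that only record writes, not amqplib objects. The assumption being illustrated is that one AMQP frame takes more than one write on the shared socket, so another channel must not be allowed to write in between.

```python
import threading


class FakeConnection:
    """Hypothetical stand-in for a shared AMQP connection (records writes)."""

    def __init__(self):
        self.wire = []

    def write(self, data):
        self.wire.append(data)


class LockedChannel:
    """One channel on a shared connection; each frame is written atomically."""

    def __init__(self, connection, lock, channel_id):
        self._connection = connection
        self._lock = lock
        self._channel_id = channel_id

    def send(self, body):
        # A frame takes two writes here. The lock keeps another channel's
        # header from landing between this header and its body.
        with self._lock:
            self._connection.write(("header", self._channel_id))
            self._connection.write(("body", self._channel_id, body))


def run_demo(channels=4, frames_per_channel=50):
    connection = FakeConnection()
    lock = threading.Lock()  # one lock per shared connection

    def worker(channel_id):
        channel = LockedChannel(connection, lock, channel_id)
        for n in range(frames_per_channel):
            channel.send(n)

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(channels)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return connection.wire
```

With the lock in place, every header on the recorded wire is immediately followed by the body of the same channel. Reads need the same care, and reconnection handling is left out entirely.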


 
> > A demo of Tornado + amqplib using this technique is here: http://gist.github.com/307604
>
> Thanks for your example. I'll try to work along the lines.
>
> Paolo

=wil

paolo.losi

Feb 23, 2010, 1:00:34 PM
to Tornado Web Server
Hi all,

I have implemented a draft of the idea outlined earlier in this thread.
You can find the code at [1] and a demo application at [2].

The idea is to use AMQP consumer and/or producer slave processes
that communicate with the main ioloop process via a socket. The approach
is simple, efficient and "correct".

Code review, patches etc... very welcome!

[1] http://code.google.com/p/tornado-amqp/
[2] http://code.google.com/p/tornado-amqp/source/browse/demo/demo.py

Thanks everyone for the help.

Paolo

Douglas Stanley

Feb 23, 2010, 3:39:43 PM
to python-...@googlegroups.com
I've been watching this list for quite some time, and I've never seen
anyone mention hooking tornado up to the multiprocessing library and
using a process pool to run background tasks asynchronously.

I did a simple proof of concept modifying the hello world sample, where
it set up a process pool, then requests were handed off to one of the
processes in the pool to be executed. In my proof of concept, I simply
executed a function which slept and then returned the PID of the process
that handled the request.

It worked quite well. Why hasn't anyone suggested something simple like
this before? Or has it been suggested and I just somehow missed the
message?

Here's a link to the gist:

http://gist.github.com/312676
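
The hand-off described above might look roughly like this sketch. It is not the code from the gist. To stay self-contained it uses `os.getpid` as the pooled task (a real task would be a module-level function that does the blocking work), and `run_demo` is an illustrative name. It assumes a POSIX system, since it asks for the "fork" start method explicitly.

```python
import multiprocessing
import os
import queue


def run_demo():
    completed = queue.Queue()  # thread-safe hand-back to the caller
    # "fork" keeps the sketch independent of how this file is launched;
    # it is only available on POSIX systems.
    context = multiprocessing.get_context("fork")
    with context.Pool(processes=2) as pool:
        # The callback fires in a helper thread of the parent process
        # once a worker has finished the task.
        pool.apply_async(os.getpid, callback=completed.put)
        worker_pid = completed.get(timeout=10)
    return worker_pid
```

One detail that matters for Tornado: because the `apply_async` callback runs on a helper thread rather than on the ioloop thread, a request handler should presumably not finish the response from inside it directly, but hop back onto the loop first, for example with `IOLoop.add_callback`, which is documented as safe to call from other threads.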

Doug

--
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

Denis Bilenko

Feb 23, 2010, 4:04:00 PM
to Tornado Web Server

On Feb 23, 5:42 am, "paolo.losi" <paolo.l...@gmail.com> wrote:
> Hi Will,
>
> On Feb 18, 1:25 pm, Wil Tan <dre...@gmail.com> wrote:
>
> > Hi Paolo,
>
> > I know you dislike monkey-patching, but the monkey patching done in gevent
> > [1] works very well in my experience. This makes amqplib cooperate.
>
> > Coupled with gtornado [2] (my tiny project to monkey patch Tornado to use
> > gevent's hub), one could have the best of both worlds - one can use
> > Tornado's web framework in a high performance epoll/kqueue-based http server
> > along with amqplib and any other pure python library that uses (e.g.
> > smtplib, xmpppy.)
>
> It's very interesting... you remind me me that I should look at gevent
> more closely.
> The problem that I see with your solution is that you need a different
> amqp connection
> for every http request. Since I'm forced to use ssl connection to the
> amqp broker,
> that could be very expensive due to ssl negotiation.
> I guess it's impossible to share the same connection, right?

No, it's quite possible. You have to create a dedicated greenlet that
connects and stays connected to the broker, then communicate with it
using Queue[1] and/or AsyncResult[2].

I have used py-amqplib with gevent that way. I had separate connections
for the producer and for each consumer's channel, each connection managed
by its own greenlet. I think it's the same scheme as using slave
processes, only you completely avoid the inter-process communication
overhead.

[1] http://www.gevent.org/gevent.queue.html
[2] http://www.gevent.org/gevent.event.html#gevent.event.AsyncResult
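
The dedicated-owner scheme can be sketched with standard-library stand-ins: a thread in place of the greenlet, `queue.Queue` in place of gevent's Queue, and `concurrent.futures.Future` in place of AsyncResult. `FakeConnection`, `connection_owner`, `publish` and `run_demo` are hypothetical names used only for illustration, not gevent or amqplib APIs.

```python
import queue
import threading
from concurrent.futures import Future


class FakeConnection:
    """Hypothetical stand-in for a long-lived broker connection."""

    def __init__(self):
        self.sent = []

    def publish(self, body):
        self.sent.append(body)
        return len(self.sent)  # pretend this is a delivery number


def connection_owner(requests, connect):
    """Only this worker ever touches the connection."""
    connection = connect()
    while True:
        item = requests.get()
        if item is None:  # shutdown sentinel
            break
        body, result = item
        try:
            result.set_result(connection.publish(body))
        except Exception as exc:
            result.set_exception(exc)


def publish(requests, body):
    """Queue a publish request; the returned Future carries the outcome."""
    result = Future()
    requests.put((body, result))
    return result


def run_demo():
    connection = FakeConnection()
    requests = queue.Queue()
    owner = threading.Thread(
        target=connection_owner, args=(requests, lambda: connection)
    )
    owner.start()
    first = publish(requests, "hello")
    second = publish(requests, "world")
    outcome = (first.result(timeout=5), second.result(timeout=5))
    requests.put(None)
    owner.join(timeout=5)
    return outcome, connection.sent
```

Since a single worker serializes all access, no lock around the connection is needed, and the expensive SSL handshake happens once per owner rather than once per HTTP request.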

Andrew Badr

Feb 24, 2010, 5:46:59 AM
to Tornado Web Server
It looks like someone has already done the hard part of porting
Tornado to Twisted: http://github.com/dustin/tornado. The port is a
little out of date, but might not take that much work to update. I
really want to get Tornado and RabbitMQ working together, so it'll
either be that or the new lib someone posted.

Andrew
