Re: Publishing persistent messages with RabbitMQ is extremely slow


Michael Klishin

May 23, 2016, 7:43:59 AM
to rabbitmq-users
This sounds like a client or app problem. You can use Wireshark to see
what's being sent on the wire, with timestamps and TCP-level details:
http://www.rabbitmq.com/amqp-wireshark.html

On Monday, 23 May 2016 13:11:44 UTC+3, Sezgin Riggs wrote:

Hello,


I'm dealing with a weird problem with RabbitMQ 3.6.2.


When I publish messages with delivery_mode=1 (to a durable queue), the RabbitMQ management panel shows around 2500 messages per second, which is fine for me. But when I publish the same messages with delivery_mode=2 to make them persistent, the incoming message rate drops to 15-20 messages per second. I'm using Python 2.7 and the pika library.


Each message consists of just a URL, so the messages are small. I'm publishing with basic_publish().


I have a desktop-grade Intel i7 server with 32GB of memory and 7200RPM SATA disks. CPU load is very low, between 1.50 and 2.00 (for 8 cores). The producer runs on the same server: it reads data from MongoDB and does some sorting before sending to RabbitMQ, but it doesn't write anything to disk, it only reads. More than 20GB of memory is free, and iotop shows around 80K/s "total disk write" and around 250K/s "actual disk write".


I stopped MongoDB and all other operations and wrote a simple loop that publishes the numbers from 0 to 50000. With durable=True in queue_declare(), the results are the same; with durable=False it gets fast again. In addition, I ran a crawler on the same server; it crawls more than 200 pages/sec and writes them to MongoDB without any performance problem. So it doesn't look like a hardware or disk problem: everything other than RabbitMQ handles disk writes fine. When I run the same simple loop against a RabbitMQ installation on a virtual machine with an SSD (and much less CPU and memory), it works fine too.
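The test script itself is not shown in the thread, so the following is only a reconstruction of what such a loop might look like, written for Python 3 rather than the Python 2.7 mentioned above. The queue name "speed_test", the helper names, and the localhost broker are assumptions made for illustration; the pika calls used (BlockingConnection, queue_declare, confirm_delivery, basic_publish, BasicProperties) are the ones named in this discussion.

```python
import time


def publish_numbers(channel, queue, count, properties=None):
    """Publish the integers 0..count-1 to `queue`; return messages per second."""
    started = time.perf_counter()
    for n in range(count):
        channel.basic_publish(
            exchange="",        # default exchange: routes by queue name
            routing_key=queue,
            body=str(n),
            properties=properties,
        )
    elapsed = time.perf_counter() - started
    return count / elapsed if elapsed > 0 else float("inf")


def main(persistent=True, confirm=False, count=50000, queue="speed_test"):
    # pika is imported here so the helper above can be exercised without it.
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue=queue, durable=True)
    if confirm:
        # With BlockingConnection, every publish now waits for its confirm.
        channel.confirm_delivery()
    props = pika.BasicProperties(delivery_mode=2 if persistent else 1)
    try:
        rate = publish_numbers(channel, queue, count, props)
        print("%.0f messages/sec" % rate)
    finally:
        connection.close()
```

Calling main(persistent=True, confirm=True) and then main(persistent=True, confirm=False) against the same broker is one way to compare the two cases; the numbers will depend on the machine and disk.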


My problem looks the same as the one mentioned here: https://groups.google.com/forum/#!topic/rabbitmq-users/ue2inq0dXpU

I also shared the problem and details on Stack Overflow: http://stackoverflow.com/questions/37380128/why-publishing-persistent-messages-with-rabbitmq-is-so-slow


Node stats in the management panel show the following:

File descriptors: 55 / 1024 available

Socket descriptors: 1 / 829 available

Erlang processes: 248 / 1048576 available

Memory: 277MB / 12GB high watermark

Disk space: 2.2TB / 12GB low watermark

The I/O statistics (average time per operation) are as follows:

Read: 3.0ms

Write: 0.13ms

Seek: 0.05ms

Sync: 70ms


Also you can see the output of "iostat -x 2 5" at http://pastebin.com/a0ds7vKx


Thank you very much in advance...

vitaly numenta

May 23, 2016, 5:20:21 PM
to rabbitmq-users
Hi Sezgin,

Are you using pika.BlockingConnection with delivery_mode=2, mandatory=True, and publisher confirms enabled via confirm_delivery()? That is a bad match for the synchronous-style interface of BlockingConnection: BlockingChannel waits for the ACK from RabbitMQ for each message, but RabbitMQ doesn't flush each message to disk immediately. Instead it waits a configured amount of time so that it can flush multiple incoming messages to disk at once for I/O efficiency. IIRC, this value is configurable.

However, for best throughput (especially in this scenario), you should find one of the asynchronous interfaces much more performant; for example, take a look at SelectConnection or TornadoConnection.
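A rough sanity check of why blocking on every confirm is so slow: the original post reports an average Sync time of 70ms. If each confirmed persistent publish ends up waiting for roughly one disk sync (an assumption, not something measured in this thread), the ceiling is about 1/0.070, or roughly 14 messages/sec, which is close to the reported 15-20. The function names and the in-flight window of 1000 below are purely illustrative, and the pipelined figure is an idealised upper bound rather than a prediction.

```python
def serial_confirm_rate(wait_per_message_s):
    """Messages/second when each publish blocks until its own confirm arrives."""
    return 1.0 / wait_per_message_s


def pipelined_confirm_rate(wait_per_message_s, in_flight):
    """Idealised bound when `in_flight` unconfirmed messages share each wait."""
    return in_flight / wait_per_message_s


sync_time_s = 0.070  # "Sync: 70ms" from the node statistics in the original post

print(round(serial_confirm_rate(sync_time_s), 1))  # 14.3
print(round(pipelined_confirm_rate(sync_time_s, 1000)))
```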

vitaly numenta

May 23, 2016, 5:23:24 PM
to rabbitmq-users
When using an asynchronous interface, you would publish each subsequent message without the delay of waiting for the ACK to the prior message to arrive.

vitaly numenta

May 23, 2016, 8:44:02 PM
to rabbitmq-users
Try with mandatory=False first, and please report the difference in performance.

The mandatory flag tells the server how to react if the message cannot be routed to a queue. If it is True, the server returns an unroutable message with a Return method. If it is False, the server silently drops the message.

If I recall correctly, mandatory in combination with persistence and confirm_delivery may be responsible for the majority of the performance loss, since RabbitMQ will delay the publisher ACK until the persistent message has been written to disk.

With mandatory=False combined with confirm_delivery, you should still get ACKs, acknowledging that RabbitMQ broker received your message, but you won't be guaranteed that it made it into the queue and was saved to disk.
 
After that, dropping confirm_delivery should improve performance further (by eliminating the latency of the Basic.Publish/Basic.Ack round trip) at the expense of not getting any per-message confirmation from the broker.

Michael Klishin

May 24, 2016, 5:51:19 AM
to rabbitm...@googlegroups.com
A note on the last sentence: publisher confirms are entirely asynchronous in RabbitMQ, both as far as clients
and internal implementation go. Streaming confirms introduce little overhead and have been available in multiple clients
for years.

I assume with BlockingConnection Pika cannot support those but I'm just making sure that publisher confirms aren't seen
as a naive block-and-wait way of doing things in general.
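To illustrate what streaming confirms involve on the client side, here is a minimal sketch of the bookkeeping: the publisher keeps sending and records each delivery tag, and confirms are matched against the pending set whenever they arrive. The ConfirmTracker class and its method names are hypothetical and not part of any client library. It relies only on two protocol facts: delivery tags on a channel in confirm mode start at 1, and a confirm with the multiple flag set covers every tag up to and including the given one. Nack handling is omitted for brevity.

```python
class ConfirmTracker:
    """Tracks delivery tags of published messages still awaiting a confirm."""

    def __init__(self):
        self.next_tag = 1      # first message after enabling confirms has tag 1
        self.pending = set()

    def published(self):
        """Record one publish and return the delivery tag it was assigned."""
        tag = self.next_tag
        self.next_tag += 1
        self.pending.add(tag)
        return tag

    def confirmed(self, delivery_tag, multiple=False):
        """Mark tags as confirmed; return the sorted list of tags released."""
        if multiple:
            done = {t for t in self.pending if t <= delivery_tag}
        else:
            done = {delivery_tag} & self.pending
        self.pending -= done
        return sorted(done)


tracker = ConfirmTracker()
for _ in range(5):
    tracker.published()            # publish without waiting for any confirm
tracker.confirmed(3, multiple=True)  # one broker ack releases tags 1, 2 and 3
print(sorted(tracker.pending))     # [4, 5]
```

With one of pika's asynchronous adapters, the confirmed() call would be driven from the confirm callback registered on the channel; the exact callback signature differs between pika versions, so check the documentation for the version in use.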

--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
To post to this group, send email to rabbitm...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
MK

Staff Software Engineer, Pivotal/RabbitMQ

vitaly numenta

May 24, 2016, 4:58:43 PM
to rabbitmq-users
Thanks for the clarification, Michael. That's correct: although RabbitMQ's publisher confirms are asynchronous, the BlockingConnection API, which is synchronous, handles them synchronously. In an earlier post, I suggested using one of pika's asynchronous adapters (e.g., SelectConnection) to get the maximum performance, but the initiator of the post declined, citing the complexity of dealing with asynchronous interfaces.

Michael Klishin

May 25, 2016, 4:50:38 PM
to rabbitm...@googlegroups.com
I believe your question is answered earlier in this thread. Blocking connections in Pika cannot handle server-sent confirms
in a way other than "publish a message and block until a confirm arrives, then repeat." Doesn't that sound like it would be
the worst possible way of doing things throughput-wise?

Most clients can publish and receive acks entirely asynchronously and I believe Pika also can with a different connection driver.

On Wed, May 25, 2016 at 11:29 PM, Sezgin Riggs <sez...@gmail.com> wrote:
Hello again,

I tried a test script that just publishes 500,000 messages in a for loop. With mandatory=False and confirm_delivery enabled (and delivery_mode=2), it still publishes around 20 messages/sec.

With delivery_mode=2, mandatory=True and confirm_delivery disabled, it's much faster, around 15,000 messages/sec.

With delivery_mode=2, mandatory=False and confirm_delivery disabled, it's again around 15,000 messages/sec.

In all runs, queue_declare() used durable=True.

So the main cause seems to be the confirm_delivery option itself. But since it just provides a basic ACK, what can be the reason for the slowness?

Thank you very much again...

Sezgin


vitaly numenta

May 25, 2016, 5:06:06 PM
to rabbitmq-users
All the other connection drivers (or adapters, as pika refers to them) in pika besides BlockingConnection are asynchronous, and should yield far better performance with publisher confirms. Those would be pika.SelectConnection, pika.TornadoConnection, etc.

vitaly numenta

May 25, 2016, 5:08:47 PM
to rabbitmq-users
The choice of the blocking versus asynchronous adapters is a tradeoff between simplicity and performance.