How to scale RabbitMQ for a given publish rate and number of queues

Kiran D

Jun 26, 2017, 10:25:41 PM

to rabbitmq-users

We are performance testing our applications and have run into a bottleneck in the publisher application. I am looking for suggestions on how to scale our RabbitMQ infrastructure to handle the projected load.

Test Setup:

RabbitMQ Server

Version: 3.6.10
OS: RHEL 7.2
Single node, non-clustered
SSL enabled
Default configuration settings for RabbitMQ server

RMQ server hardware

16 vCPUs
32GB RAM
All hosts on the same LAN

Application messaging nature

Topic based pub-sub pattern
Chronological ordering of messages is to be maintained (makes concurrent message handling difficult).
Single publisher, 600 consumers.
Each consumer binds to a single queue (1 to 1 mapping)
Avg message publish rate 70-75 msgs/sec
Msg payload size ~1KB
Test duration 2 hours
Producer, RMQ server on different hosts.
600 consumers split across 4 hosts.

Publisher

rabbitmq-c/SimpleAmqpClient C++ client
Publish Confirm disabled
Single thread

Consumer

Java client
Msg Acknowledgement disabled

Queue

Persistent queue (auto-delete is disabled)
Queue policy for auto expire is set for HA re-connection
HA mode is set for all 600 queues (however, no cluster setup available for the performance system)

Performance Test Observations

400 Clients, 70 msg/sec Publish rate: No noticeable messaging latency. The application responds within acceptable range for performance test.
600 clients, 70 msg/sec Publish rate: The application holds up for the first hour. However, we noticed severe degradation after 1 hour.

The publishing channel spent a lot of time in the flow control state. Several message publish calls were blocked for more than a minute.
Messages were delivered to clients with large and growing latency until the end of the test (2 hours duration).
There was no noticeable CPU spike on either the RMQ host or the consumer hosts.
There were no alarms on the RMQ server.
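
To illustrate why a publisher can sit in flow state without any CPU spike or alarm, here is a minimal sketch of credit-based flow control between one sender and one receiver. It is a simplified tick-based model, not RabbitMQ's implementation. The function name, the parameter names, and the credit values (200 initial, 50 granted back per 50 messages processed) are assumptions chosen for illustration.

```python
def simulate_credit_flow(publish_per_tick, drain_per_tick, ticks,
                         initial_credit=200, more_credit_after=50):
    """Simplified model: the sender spends one credit per message and
    the receiver grants credit back in batches as it processes messages.
    All parameter values are illustrative assumptions."""
    credit = initial_credit
    backlog = 0                 # accepted but not yet processed downstream
    processed_since_grant = 0
    blocked_ticks = 0
    accepted = 0
    for _ in range(ticks):
        # Sender side: can only send while it still holds credit.
        sendable = min(publish_per_tick, credit)
        if sendable < publish_per_tick:
            blocked_ticks += 1  # the sender is "in flow" for this tick
        credit -= sendable
        backlog += sendable
        accepted += sendable
        # Receiver side: process what it can, then grant credit in batches.
        done = min(drain_per_tick, backlog)
        backlog -= done
        processed_since_grant += done
        while processed_since_grant >= more_credit_after:
            processed_since_grant -= more_credit_after
            credit += more_credit_after
    return accepted, blocked_ticks


if __name__ == "__main__":
    # Publisher offers 70 msgs per tick, downstream drains 30 per tick.
    print(simulate_credit_flow(70, 30, 60))
    # Downstream keeps up: the sender is never blocked.
    print(simulate_credit_flow(70, 70, 60))
```

In this model the accepted rate converges to the drain rate once the initial credit is spent, which matches the observation that the publisher offers 70 msgs/sec while the broker moves far fewer. Raising the credit values only delays the point at which the sender blocks; it does not raise the sustained rate.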

Questions:

How do we scale the RabbitMQ infrastructure to support the above load?
Concurrent workers are not feasible because we depend on strict ordering of messages.
The consumer programs are simulated for this test; they only consume each message and discard it. There is no processing involved, so I do not expect any latency or delay in message consumption.
I have a simulator program that generates messages at an average of 70/sec (400, 700, 1000, 400, 700, 1000 messages over 10-second time slices in a minute). What I noticed is that the publisher program publishes 70/sec (number of BasicPublish calls), but the broker publishes at only about 30/sec. The broker does not seem to keep up with the publish rate when the number of queues is 600, and this forces the connection into flow state.
What needs to be done to address this flow state? It seems like the RabbitMQ broker is quite aggressive in putting the connection into flow state. Should I modify the default credit values?
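
The load figures above can be checked with a little arithmetic. The sketch below uses the slice sizes and queue count from this post. The worst-case fan-out figure assumes that every published message matches the binding of every queue, which is an assumption and not something measured in the test.

```python
# Message counts per 10-second slice over one minute, as described above.
SLICES = [400, 700, 1000, 400, 700, 1000]
SLICE_SECONDS = 10
QUEUES = 600


def average_rate(slices, slice_seconds):
    """Average publish rate in msgs/sec over the whole cycle."""
    return sum(slices) / (len(slices) * slice_seconds)


def peak_rate(slices, slice_seconds):
    """Publish rate in msgs/sec during the busiest slice."""
    return max(slices) / slice_seconds


def worst_case_deliveries(rate, queues):
    """Deliveries/sec if every message is routed to every queue
    (assumption: all bindings match all routing keys)."""
    return rate * queues


if __name__ == "__main__":
    avg = average_rate(SLICES, SLICE_SECONDS)
    peak = peak_rate(SLICES, SLICE_SECONDS)
    print("average msgs/sec:", avg)
    print("peak msgs/sec:", peak)
    print("worst-case deliveries/sec at average rate:",
          worst_case_deliveries(avg, QUEUES))
    print("worst-case deliveries/sec at peak rate:",
          worst_case_deliveries(peak, QUEUES))
```

The point of the sketch is that the broker-side work scales with publish rate times the number of matching queues, so a modest publish rate can still translate into a large delivery rate.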

Regards,

Kiran

Kiran D

Jun 26, 2017, 10:33:43 PM

to rabbitmq-users

To add to this, I also noticed that the channel continues to publish messages for 5+ minutes after I kill my publisher application. It looks like the channel gets its own process on the RMQ server and does not die until the backlog is drained. Is there a way to get information on the channel publish backlog at runtime? This backlog is not the same as the queue's Ready messages. Also, is there any way to destroy the channel immediately after the connected application is killed?

Kiran D

Jun 27, 2017, 6:04:56 PM

to rabbitmq-users

Update #1: I am monitoring another performance run right now and noticed that the publisher channel is performing ~3.5-4 million reductions/sec. I understand this means the channel process is performing a lot of units of work and consuming processor time. Is there a way to improve this? Should I be looking at the topics I use for routing? Will wildcards and/or a 3-5 level hierarchy in my topics affect the work performed by this process?

Ex: Typical topics published to will be A.B.C.1, A.B.C.2, etc. The queue bindings look like: A.B.C.*, A.*.*.*, etc.
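
For reference, the matching rules behind these bindings can be sketched as follows. In AMQP topic exchanges, `*` matches exactly one dot-separated word and `#` matches zero or more words. The function below is a naive illustrative matcher with a hypothetical name; it is not how the broker routes internally and says nothing about the broker's routing cost.

```python
def topic_matches(pattern, routing_key):
    """Return True if routing_key matches the binding pattern under
    AMQP topic rules: '*' is exactly one word, '#' is zero or more."""
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i, j):
        if i == len(p):
            # Pattern exhausted: match only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(i + 1, jj) for jj in range(j, len(k) + 1))
        if j == len(k):
            return False
        if p[i] == "*" or p[i] == k[j]:
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


if __name__ == "__main__":
    for binding in ["A.B.C.*", "A.*.*.*"]:
        for key in ["A.B.C.1", "A.B.C.2"]:
            print(binding, key, topic_matches(binding, key))
```

With the bindings from this post, both example routing keys match both patterns, so a queue bound with `A.*.*.*` receives every four-word message whose first word is `A`.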

Michael Klishin

Jun 27, 2017, 6:29:05 PM

to rabbitm...@googlegroups.com

Every single operation a client performs goes through a channel process.

Consumer deliveries are the only exception: queues delegate serialisation and socket operations to channel writers without involving channels.

Topic segment depth does affect routing efficiency, but not linearly:

https://www.rabbitmq.com/blog/2010/09/14/very-fast-and-scalable-topic-routing-part-1/

http://www.rabbitmq.com/blog/2011/03/28/very-fast-and-scalable-topic-routing-part-2/

As with most other questions in this area, try, measure and see for yourself.

--

MK

Staff Software Engineer, Pivotal/RabbitMQ
