Publishing to RabbitMQ degraded by F5?

Ryan Brown

May 28, 2015, 4:07:05 PM
to rabbitm...@googlegroups.com
Hello all,

I have an application that works as a RESTful pub/sub system, leveraging RabbitMQ for persistence and routing. The system was originally designed to handle load balancing across our 4-node RabbitMQ cluster in a simple round-robin fashion. We recently switched to an F5 to abstract the cluster implementation from the application. However, through a series of performance/load tests we have established that publication throughput into RabbitMQ drops by nearly 50% with the F5 handling the load balancing. To me this is a clear sign that we are doing something wrong with our configuration.

I am admittedly a bit out of my element with some of the F5 settings. Below is the VIP we have currently configured:

ltm virtual rabbitmq.edu4u.net {
    destination 10.52.165.51:5672
    ip-protocol tcp
    mask 255.255.255.255
    pool rabbitmq
    profiles {
        fastL4 { }
    }
    snatpool app_pool
    source-port change
    translate-address enabled
    translate-port enabled
    vlans-disabled
}

The rabbitmq pool looks like this:

ltm pool rabbitmq {
    load-balancing-mode least-connections-node
    members {
        rmq01:5672 {
            address 10.52.246.132
        }
        rmq02:5672 {
            address 10.52.246.133
        }
        rmq03:5672 {
            address 10.52.246.134
        }
        rmq04:5672 {
            address 10.52.246.135
        }
    }
    monitor tcp
}

I'm not entirely sure what I should be looking for here. I have confirmed that our F5 idle timeout is longer than our RabbitMQ heartbeat interval. Other than that, I have not found anything glaringly obvious that could cause the performance degradation we're seeing.
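One thing worth checking in the VIP above is the idle timeout on the fastL4 profile: its stock value is 300 seconds, which can silently reset long-lived AMQP connections that go quiet between messages. A sketch of a custom profile with a longer timeout (the profile name and the 3600-second value are illustrative assumptions, not from this thread):

ltm profile fastl4 fastL4_amqp {
    defaults-from fastL4
    idle-timeout 3600
}

The virtual server would then reference fastL4_amqp in its profiles block instead of the stock fastL4.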

Any help or guidance would be greatly appreciated.

Best.

Ryan


Michael Klishin

May 28, 2015, 9:32:09 PM
to rabbitm...@googlegroups.com, Ryan Brown
 On 28 May 2015 at 23:07:05, Ryan Brown (ryank...@gmail.com) wrote:
> a series of performance/load tests we have established that
> publication throughput into RabbitMQ is reduced by nearly 50%
> with the F5 handling the load-balancing. To me this is a clear
> sign that we are doing something wrong with our configuration.

In both tests, do publishers and consumers connect to the same nodes?
It can be a data locality issue, when a load balancer distributes
connections to different nodes and thus every message has to be moved
between them.
--
MK

Staff Software Engineer, Pivotal/RabbitMQ
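Michael's data-locality point can be put in rough numbers. A minimal sketch (the model and figures are illustrative assumptions, not measurements from this thread):

```python
# Rough model of the data-locality cost described above: if a load
# balancer spreads publisher connections uniformly across an N-node
# cluster, and each queue's master process lives on one node, only
# 1/N of publishes arrive on the master's node; the rest take an
# extra inter-node hop before reaching the queue.

def expected_extra_hops(nodes: int) -> float:
    """Expected extra intra-cluster hops per publish, assuming a
    uniformly balanced connection and one master node per queue."""
    return (nodes - 1) / nodes

# For the 4-node cluster in this thread, 3 out of 4 publishes are
# forwarded between nodes before reaching the queue master.
print(expected_extra_hops(4))  # → 0.75
```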


Ryan Brown

May 28, 2015, 9:56:31 PM
to Michael Klishin, rabbitm...@googlegroups.com
Michael,

Publishers and consumers do connect to the same nodes. The way the application currently works, we have connections that publish incoming messages to RMQ; the same application nodes also subscribe to all of the queues and deliver to the subscribing endpoints. The bottleneck appears to be in the initial publishing to RMQ.

Your comment about data locality is interesting. I wonder if the issue could be related to the fact that we are using all HA queues with active/active replication? That would significantly increase the chatter between the nodes. Could that potentially slow down publishing? We have noticed backpressure being applied under higher loads (though not actually high compared to what I have seen RMQ handle in the past: ~250 msg/s).
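As a back-of-the-envelope check on the chatter mirroring adds, here is an editorial sketch (the counts are assumptions about classic mirrored queues, where each publish is copied from the queue master to every mirror):

```python
def internode_transfers(nodes: int, mirrors: int, locality_hop: bool) -> int:
    """Inter-node copies of one published message: one per mirror,
    plus an extra hop when the publisher's connection landed on a
    node other than the queue master's."""
    assert 0 <= mirrors < nodes
    return mirrors + (1 if locality_hop else 0)

# 4-node cluster mirrored to all nodes: 3 replication copies, plus 1
# forwarding hop when the balancer picks a non-master node.
print(internode_transfers(4, mirrors=3, locality_hop=True))  # → 4
```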

One additional note is that we are using a headers exchange to facilitate some fairly complex routing schemes. My understanding is that this is significantly slower than a topic or fanout exchange.
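For reference, headers-exchange matching amounts to comparing each binding's argument table against the message's headers, which is why it costs more than a hash lookup on a routing key. A simplified model of the semantics (my own sketch, not RabbitMQ's actual implementation):

```python
# Sketch of headers-exchange matching: "x-match" in the binding
# arguments selects whether all pairs must match ("all", the
# default) or any single pair suffices ("any"). Arguments whose
# keys begin with "x-" are not themselves matched.

def headers_match(binding_args: dict, message_headers: dict) -> bool:
    x_match = binding_args.get("x-match", "all")
    pairs = [(k, v) for k, v in binding_args.items()
             if not k.startswith("x-")]
    if x_match == "any":
        return any(message_headers.get(k) == v for k, v in pairs)
    return all(message_headers.get(k) == v for k, v in pairs)

# "any" needs one matching pair; "all" needs every pair.
print(headers_match({"x-match": "any", "type": "report", "region": "eu"},
                    {"type": "report"}))  # → True
```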

-rb

Michael Klishin

May 28, 2015, 10:10:08 PM
to Ryan Brown, rabbitm...@googlegroups.com
On 29 May 2015 at 04:56:28, Ryan Brown (ryank...@gmail.com) wrote:
> Your comment about data locality is interesting. I wonder if
> the issue could be related to the fact that we are using all HA queues
> with active/active replication? That would significantly
> increase the chatter between the nodes. Could that potentially
> slow-down publishing?

It will slow down the queue processes, which will eventually result in back pressure on publishers.

This is not something a load balancer can affect, though. 

Ryan Brown

May 28, 2015, 10:42:57 PM
to Michael Klishin, rabbitm...@googlegroups.com
Understood. My thought process took a bit of a tangent. Thank you.

-rb

Michael Klishin

May 28, 2015, 10:44:56 PM
to Ryan Brown, rabbitm...@googlegroups.com
On 29 May 2015 at 05:42:55, Ryan Brown (ryank...@gmail.com) wrote:
> Understood. My thought process took a bit of a tangent.

I should point out that the mirroring implementation we have today
over-emphasizes safety of delivery by trading off a lot of efficiency
when you mirror to less than "all" nodes.

We are working on a new implementation for future versions, based on a well-known algorithm.