Load balancing of workloads in RabbitMQ cluster


Konstantin Kalin

Nov 30, 2015, 3:40:13 PM11/30/15
to rabbitmq-users
I'm running into the following issue with a RabbitMQ cluster. Each queue exists on one node (the master queue) and the other nodes proxy connections from publishers and consumers of that queue. This leads to a problem that prevents horizontal scaling.

Let's look at the following use cases:
1) Scale-up (adding a new node to the cluster)
The node will get only new connections (assuming that clients have a way to discover it). All existing queues will stay on the other nodes. If an application doesn't create any new queues, there is no horizontal scaling of the workload (queues) at all.
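To make the stickiness concrete, here is a toy placement model (a hypothetical illustration, not RabbitMQ internals; node names are made up): a queue lives on whichever node it was declared on and never moves, so a node added later receives none of the existing workload.

```python
# Toy model of queue placement in a cluster: a queue is placed on a node
# at declaration time and never migrates afterwards. This is a
# hypothetical sketch, not RabbitMQ's actual placement code.

class Cluster:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.placement = {}              # queue name -> node it lives on

    def declare_queue(self, name):
        # New queues go to the least-loaded node *at declaration time*.
        if name not in self.placement:
            counts = {n: 0 for n in self.nodes}
            for node in self.placement.values():
                counts[node] += 1
            self.placement[name] = min(self.nodes, key=lambda n: counts[n])
        return self.placement[name]

    def add_node(self, node):
        self.nodes.append(node)          # existing queues are NOT migrated

cluster = Cluster(["rabbit@a", "rabbit@b"])
for i in range(10):
    cluster.declare_queue(f"q{i}")

cluster.add_node("rabbit@c")
load = {n: list(cluster.placement.values()).count(n) for n in cluster.nodes}
print(load)
```

The new node sits at zero queues until the application declares a *new* queue, which is exactly the scale-up problem described above.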

2) Concurrently booting a RabbitMQ cluster together with the services that use the broker (micro-service architecture).
There is a big chance that the services are quick enough to all establish connections with the first available RabbitMQ node, so the load is not distributed at all. Even if we periodically drop a connection and establish a new one, the queue will stay on the same node. Only temporary queues get balanced.
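A common client-side mitigation (hostnames are hypothetical and the actual connect step is stubbed out here) is to shuffle the node list before connecting, so concurrently booting services don't all pile onto the first node that comes up:

```python
# Sketch of client-side connection spreading: each service instance tries
# the cluster nodes in its own random order and uses the first one that
# accepts. Hostnames are made up; the connection itself is stubbed out.
import random

NODES = ["rabbit1.example.com", "rabbit2.example.com", "rabbit3.example.com"]

def connection_order(nodes):
    """Return the nodes in a per-process random order to try in turn."""
    order = list(nodes)
    random.shuffle(order)
    return order

order = connection_order(NODES)
print(order)
```

Note that this only spreads connections and *newly declared* queues across nodes; it does nothing for queues that already exist, which is exactly the limitation being described.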

Are there any plans for a feature that rebalances workloads when one node gets more load than the others?

Thank you,
Konstantin. 

Michael Klishin

Dec 1, 2015, 1:39:11 AM12/1/15
to rabbitm...@googlegroups.com, Konstantin Kalin
 On 30 November 2015 at 23:40:16, Konstantin Kalin (konstant...@gmail.com) wrote:
> Let's look at following use cases:
> 1) Scale up (Adding new node into the cluster)
> The node will get only new connections (assuming that clients
> have a way to discover it). All existing queues will stay on other
> nodes. If an application doesn't create any new queue there is
> no horizontal scaling of workloads(queues) at all.

Queues must guarantee ordering. Therefore all work that happens in a queue
has to stay on that node (in fact, chances are using more than one CPU core
would yield next to no benefit).

Migrating existing queues is not currently on the roadmap.

> 2) Concurrent booting RabbitMQ cluster with services that use
> the broker - micro-service architecture.
> There is a big chance that the services are quick enough to establish
> connections with first available RabbitMQ node. Thus the load
> is not distributed at all. Even if we periodically drop a connection
> and establish a new one the queue will stay on same node. Only temporary
> queues are balanced.

See https://github.com/rabbitmq/rabbitmq-server/releases/tag/rabbitmq_v3_6_0_rc1
and specifically https://github.com/rabbitmq/rabbitmq-server/issues/121.

> Is there any plans or a feature to do workload balancing if one
> node gets more load?

Beyond what's mentioned above, distributing workload in a messaging system very much
requires re-distributing client connections for better data locality, and nearly all messaging protocols
we have today don't cover this in any way in their specs.

You are missing (or at least not mentioning) one other implementation detail
which is a much bigger issue in a cluster: how messages are mirrored from master.
Currently it happens by nodes forming a ring, which scalability-wise isn't great.
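For illustration, the ring can be modelled like this (a simplification of the behaviour, not the actual gm "guaranteed multicast" implementation): each operation travels node-to-node starting from the master, so the replication path grows linearly with the number of mirrors.

```python
# Simplified model of ring-based mirroring: a message visits the nodes
# one at a time around the ring, starting at the master. Latency grows
# with cluster size. Illustration only, not RabbitMQ's real gm module.

def ring_broadcast_hops(nodes, master):
    """Return the node order a message visits, starting at the master."""
    i = nodes.index(master)
    return [nodes[(i + k) % len(nodes)] for k in range(len(nodes))]

nodes = ["a", "b", "c", "d", "e"]
path = ring_broadcast_hops(nodes, master="c")
print(path)   # four hops after the master in a five-node ring
```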

There are plans to replace this with Raft (and involve a quorum of nodes). This is a non-trivial
undertaking, however.
--
MK

Staff Software Engineer, Pivotal/RabbitMQ


Konstantin Kalin

Dec 1, 2015, 10:04:18 AM12/1/15
to rabbitmq-users, konstant...@gmail.com


> Queues must guarantee ordering. Therefore all work that happens in a queue
> has to stay on that node (in fact, chances are using more than one CPU core
> would yield next to no benefit).

Yes, I know that queue performance is ultimately limited by a single CPU. But before it reaches that point there are many other bottlenecks :)
 

> Migrating existing queues is not currently on the roadmap.

I think it's kinda important. Let's assume that we have dozens of notification queues and each queue has multiple consumers.
Currently, to have these queues distributed between RabbitMQ cluster nodes, I need to introduce some kind of system lock that waits until the RabbitMQ cluster is fully operational.
It's hard to create such a lock, and impossible in some cases. And it doesn't solve the RabbitMQ scale-up use case. So a user of RabbitMQ is pretty much on his own to deal with this situation.

At the same time, RabbitMQ already has the notion of HA queues and master election. Currently, master queue re-election (I'm not touching HA queues here) happens on a RabbitMQ node crash (or a split-brain situation).
I think it could be reused for re-balancing as well. Let's say we need to rebalance the load on a cluster after we add a new node to it.
RabbitMQ could choose a queue to be moved, start its slave on a specific node, wait until the slave queue catches up with the master queue, and re-elect that slave as the new master.
Does it make sense?
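As a sketch (purely hypothetical - no such mechanism exists in RabbitMQ today), the procedure would look something like this, with the synchronisation check simulated by a callback:

```python
# Hypothetical rebalancing procedure: move a queue's master to another
# node only after a slave there has fully caught up. A real version
# would live inside the broker; this just simulates the control flow.

def rebalance_queue(placement, queue, target_node, synced):
    """Move `queue`'s master to `target_node` once a slave there is synced.

    placement: dict of queue -> master node, mutated in place
    synced:    callable(queue, node) -> True when the slave on `node`
               has caught up with the master
    """
    # Step 1: start a slave on the target node (simulated as a no-op).
    # Step 2: wait until the slave catches up with the master.
    if not synced(queue, target_node):
        return False                 # not safe to promote yet
    # Step 3: promote the synced slave to be the new master.
    placement[queue] = target_node
    return True

placement = {"notifications": "rabbit@a"}
moved = rebalance_queue(placement, "notifications", "rabbit@c",
                        synced=lambda q, n: True)
print(moved, placement["notifications"])
```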
Yes, I know about this feature. It's a great improvement, though it doesn't solve the issue I described above.

> You are missing (or at least not mentioning) one other implementation detail
> which is a much bigger issue in a cluster: how messages are mirrored from master.
> Currently it happens by nodes forming a ring, which scalability-wise isn't great.

I deliberately didn't mention mirrored queues, to avoid side-tracking the discussion :)

Thank you,
Konstantin. 

Michael Klishin

Dec 1, 2015, 10:39:57 AM12/1/15
to rabbitm...@googlegroups.com, Konstantin Kalin
On 1 December 2015 at 18:04:22, Konstantin Kalin (konstant...@gmail.com) wrote:
> I think it's kinda important. Let assume that we have dozens
> of notification queues and each queue has multiple consumers.
> Currently to have these queues distributed between RabbitMQ
> cluster nodes I need introduce some kind of system lock that will
> wait until RabbitMQ cluster is fully operational.
> It's hard to create such lock even impossible in some cases. And
> it doesn't solve RabbitMQ scale-up use-case. So user of RabbitMQ
> is pretty much at his own to deal with this situation.
>
> At same time RabbitMQ has already notion of HA queues and master
> election. Currently master queue re-election (I'm not touching
> HA queues here) happens on an event of RabbitMQ node crash(or
> split brain situation).
> I think it can be reused in re-balancing as well. Let's say we need
> to rebalance the load on a cluster after we add new node into the
> cluster.
> RabbitMQ can choose a queue to be moved, start its slave on specific
> node, wait until the slave queue catches up with the master queue
> and re-elect the slave queue as new master one.
> Does it make sense?

I think you greatly overestimate how effective moving a queue master can be
if consumer connections aren't moved (for which there is no provision in any
of the popular messaging protocols, although AMQP 0-8 kinda had a similar idea).
And the complexity of implementing this.

Yeah, we have bigger fish to catch (or 10).

Konstantin Kalin

Dec 1, 2015, 11:40:13 AM12/1/15
to rabbitmq-users, konstant...@gmail.com

> RabbitMQ can choose a queue to be moved, start its slave on specific
> node, wait until the slave queue catches up with the master queue
> and re-elect the slave queue as new master one.
> Does it make sense?

> I think you greatly overestimate how effective moving a queue master can be
> if consumer connections aren't moved (for which there is no provision in any
> of the popular messaging protocols, although AMQP 0-8 kinda had a similar idea).
> And the complexity of implementing this.
>
> Yeah, we have bigger fish to catch (or 10).

I see your point that it's not a simple task to implement. And I still believe RabbitMQ should have it, like many other clustered products do. Honestly, I think you underestimate the issue. And with multiple consumers per queue, there is a big chance that the consumers are already distributed between nodes and not all of them are connected to the node running the master queue :)

Currently there is little point in scaling up a RabbitMQ cluster unless the majority of traffic follows a short-lived-queues pattern (where ZMQ makes better sense anyway).
Thus a RabbitMQ cluster has to be planned ahead (with growth predicted) for a long-running system, and a user has to keep "warm" extra hardware for future growth.
I don't think anyone likes restarting a whole system just to be able to scale RabbitMQ.
Secondly, a full system power-on requires orchestration to make sure that the RabbitMQ cluster starts before the other services. And what should be done if there is a power outage and not all RabbitMQ nodes have recovered... and so on. Of course there are always humans who can perform the proper restart sequence, but that just increases the downtime. The less a human has to deal with, the less downtime for the service.

Thank you,
Konstantin. 

Michael Klishin

Dec 1, 2015, 12:44:46 PM12/1/15
to rabbitm...@googlegroups.com, Konstantin Kalin
On 1 December 2015 at 19:40:16, Konstantin Kalin (konstant...@gmail.com) wrote:
> with multiple consumers per queue there is big chance that consumers
> are already distributed between nodes and not all consumers
> are connected to a node running master queue :)

Exactly, and once the master is migrated, the consumers won't be, and your data locality gets worse.

Konstantin Kalin

Dec 2, 2015, 11:33:27 PM12/2/15
to rabbitmq-users, konstant...@gmail.com


On Tuesday, December 1, 2015 at 9:44:46 AM UTC-8, Michael Klishin wrote:
> On 1 December 2015 at 19:40:16, Konstantin Kalin (konstant...@gmail.com) wrote:
> > with multiple consumers per queue there is big chance that consumers
> > are already distributed between nodes and not all consumers
> > are connected to a node running master queue :)
>
> Exactly, and once the master is migrated, the consumers won't be, and your data locality gets worse.

Well, honestly, that's irrelevant for many workloads where processing a single message takes longer than the extra hop between two Erlang nodes. Even if I develop an app/service that is aware of where a master queue is running (which feels like a bad design decision) and is smart enough to notice that the master queue has failed over to another node, it still doesn't solve the issue I'm trying to describe. I will try to explain one more time. Also, I'm not trying to say that RabbitMQ is a "bad product" - I really like it and have worked with it on several projects.

Well... a story :)
Let's say I designed a service/app that uses RabbitMQ for messaging. It deals with a lot of "permanent" queues which are handled by multiple consumers. I set up a RabbitMQ cluster with 2 nodes, hoping to add one more node later when the load grows. Everything works fine until the load reaches the point where one RabbitMQ node cannot handle all the queues running on it. That sounds like no big deal - add an additional RabbitMQ node and done. But the reality is that adding the node doesn't help at all. The queues will stay on the overloaded node, which keeps trying to cope with the load and eventually either crashes or partitions. There is only one way to redistribute the load: the service/app needs to be restarted and be intelligent enough to establish "proper" connections (the feature you mentioned earlier simplifies this task :). Thus the service/app will face downtime. Isn't that an issue?

Thank you,
Konstantin.