RabbitMQ Active/Passive cluster failover scenarios


Craig Sharp

Nov 15, 2023, 9:47:05 AM
to rabbitmq-users
Hi all,

Has anyone done anything like the following?

Scenario 1
Active/Passive Clusters
Two distinct two-node clusters on the current RabbitMQ version: one active, and one on standby (running but with no client connections).
Both clusters are behind an LTM VIP.
In normal operation, the first cluster is enabled in the LTM. If that cluster has an issue, the second cluster can be manually enabled in the LTM and the first cluster disabled.

Scenario 2
Active/Passive single nodes
This scenario is similar to scenario 1, but only uses a single RabbitMQ node for each of the active and passive sides. Both are still behind the LTM and can be failed over manually.

We are currently on V3.8.22 and I am pulling my hair out over network partitions and instability with mirrored queues. To upgrade our old environment, I am building and testing the latest RabbitMQ version using quorum queues on a three-node cluster.
Do quorum queues eliminate network partition issues and make for a more stable cluster? If so, I would not need to consider the two scenarios above.

Thanks,
Craig

Michal Kuratczyk

Nov 15, 2023, 12:30:11 PM
to rabbitm...@googlegroups.com
Not directly, but yes, you should see fewer partitions when you use a recent version and quorum queues.
The mirroring algorithm is very inefficient and may cause the communication links between the nodes
(aka Erlang distribution) to become blocked or suffer high latency, which in turn may trigger a network partition
without an actual network-level issue.
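
For illustration, here is a minimal sketch of how a client could opt in to a quorum queue at declaration time. It assumes the Python pika client; the helper names `quorum_queue_arguments` and `declare_quorum_queue` and the queue name are hypothetical and not from this thread. The queue type is chosen with the `x-queue-type` argument, and quorum queues have to be declared durable.

```python
from typing import Any, Dict


def quorum_queue_arguments(initial_group_size: int = 3) -> Dict[str, Any]:
    """Build the optional arguments that request a quorum queue.

    initial_group_size is the number of replicas to start with; 3 matches
    the three-node test cluster described in the question.
    """
    return {
        "x-queue-type": "quorum",
        "x-quorum-initial-group-size": initial_group_size,
    }


def declare_quorum_queue(channel: Any, name: str, initial_group_size: int = 3) -> Any:
    """Declare a durable quorum queue on a pika-style channel.

    `channel` is anything with a queue_declare(queue=..., durable=...,
    arguments=...) method, such as pika's BlockingChannel.
    """
    return channel.queue_declare(
        queue=name,
        durable=True,  # quorum queues cannot be non-durable
        arguments=quorum_queue_arguments(initial_group_size),
    )


# Hypothetical usage against a local broker (requires `pip install pika`):
#
#   import pika
#   connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
#   declare_quorum_queue(connection.channel(), "orders")
#   connection.close()
```

An existing mirrored queue cannot be switched to a quorum queue by redeclaring it with different arguments, so a migration typically means declaring new queues and moving consumers and publishers over.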

Another common cause is querying the Management API too often (for example, monitoring /api/queues every few
seconds with a large number of queues in the cluster).
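
One way to lighten that kind of monitoring is to poll less often and request less data per call. The sketch below is only an illustration under stated assumptions: the helper names `queues_url` and `MinIntervalGate`, the host, and the 60-second interval are all hypothetical. It relies on the `columns`, `page` and `page_size` query parameters of the Management HTTP API to trim the response, and on a small gate object to enforce a minimum gap between polls.

```python
import time
from typing import Callable, Optional, Sequence
from urllib.parse import urlencode


def queues_url(base: str, columns: Sequence[str], page: int = 1, page_size: int = 100) -> str:
    """Build a paginated /api/queues URL that asks only for the listed columns."""
    query = urlencode(
        {"page": page, "page_size": page_size, "columns": ",".join(columns)},
        safe=",",  # keep the comma-separated column list readable
    )
    return f"{base.rstrip('/')}/api/queues?{query}"


class MinIntervalGate:
    """Allow an action at most once per min_interval_s seconds."""

    def __init__(self, min_interval_s: float, clock: Callable[[], float] = time.monotonic):
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._last: Optional[float] = None

    def ready(self) -> bool:
        """Return True (and record the time) if enough time has passed."""
        now = self._clock()
        if self._last is None or now - self._last >= self._min_interval_s:
            self._last = now
            return True
        return False


# Hypothetical usage: poll at most once a minute, two columns only.
#
#   gate = MinIntervalGate(60.0)
#   if gate.ready():
#       url = queues_url("http://localhost:15672", ["name", "messages"])
#       ...fetch `url` with HTTP basic auth...
```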

In RabbitMQ 4.0, to be released next year, all that network partition stuff will change as we're replacing
Mnesia with Khepri. RabbitMQ will effectively be a Raft system (Khepri uses Ra/Raft, just like quorum queues).

If you are interested in a warm standby offering, we have it as a commercial feature:

Best,

--
Michał
RabbitMQ team