Hello,
I am new to RabbitMQ and am trying to solve a problem in the existing infrastructure at work. Our requirement is to deploy a cluster to ensure high availability of channels and configuration across data centers. A RabbitMQ cluster would have been perfect for us but for the strong recommendations against deploying it across WANs. Our requirement differs slightly from what I came across in the documentation and in posts online: for us, having a backup channel (preferably passive) with all the configuration information replicated (vhosts, queues, users, exchanges, etc.) is of primary importance. We are OK with dropping messages if a node fails, as long as further communication can continue on the other nodes.
Federation and Shovel seem to be logical solutions for replicating messages across DCs, while HA and HA pacemaker solve a different problem. What we need is a simple HA solution for communication so that, say, if a data center goes down, we can continue with one or more nodes in another data center. That is:
- Monitor the nodes to detect when one fails
- Move over to one of the failover nodes.
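
To make the two steps concrete, here is a rough sketch in Python of the behaviour I have in mind. It is only an illustration, not something we run today: the host names are made up, the check is a plain TCP connect to the default AMQP port (5672), and the function names node_is_up and pick_active_node are my own. A real monitor would probably need a stronger health check than "the port accepts connections".

```python
import socket
from typing import Callable, Optional, Sequence, Tuple

Node = Tuple[str, int]  # (host, AMQP port)


def node_is_up(host: str, port: int = 5672, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the node's AMQP port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unresolvable host, etc. all count as "down".
        return False


def pick_active_node(
    nodes: Sequence[Node],
    probe: Callable[[str, int], bool] = node_is_up,
) -> Optional[Node]:
    """Return the first node in priority order that passes the probe.

    Returns None when no node is reachable.
    """
    for host, port in nodes:
        if probe(host, port):
            return (host, port)
    return None


if __name__ == "__main__":
    # Hypothetical host names, listed in order of preference.
    candidates = [("rmq-primary.example", 5672), ("rmq-standby.example", 5672)]
    print(pick_active_node(candidates))
```

The list is ordered, so the standby is only chosen when the primary fails the probe, which matches the "preferably passive" backup described above.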
I don't think this would be such an absurd use case that it isn't supported at all. Any ideas or pointers will be highly appreciated! :)
A word on our current config: We have thousands of servers across (AWS) data centers around the world that currently communicate with a single RabbitMQ node. The servers aren't very chatty, but we are concerned about a single point of failure in case things go wrong. Our current scheme for fixing things is to manually detect that the RabbitMQ node has gone down and recreate it using the configuration information (vhosts, users, etc.) that we store (and keep current) elsewhere. The servers are already configured to take a list of RabbitMQ servers to attempt to connect to.
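
One way I can imagine keeping a passive standby's configuration current, instead of recreating the node by hand, is to copy the broker definitions over the management plugin's HTTP API (GET and POST on /api/definitions). The sketch below is untested against our setup and rests on several assumptions: the management plugin is enabled on both nodes, it listens on port 15672 over plain HTTP, and the same admin credentials work on both. The function names and host names are hypothetical. Definitions cover vhosts, users, queues, exchanges and so on, but not messages, which is acceptable for us.

```python
import base64
import json
import urllib.request
from typing import Optional


def definitions_request(
    host: str,
    user: str,
    password: str,
    definitions: Optional[dict] = None,
    port: int = 15672,  # assumed management port
) -> urllib.request.Request:
    """Build a request that exports (GET) or imports (POST) broker definitions."""
    url = f"http://{host}:{port}/api/definitions"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}"}
    data = None
    if definitions is not None:
        data = json.dumps(definitions).encode()
        headers["Content-Type"] = "application/json"
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def copy_definitions(primary: str, standby: str, user: str, password: str) -> None:
    """Export definitions from the primary and import them into the standby."""
    export = definitions_request(primary, user, password)
    with urllib.request.urlopen(export, timeout=10) as resp:
        definitions = json.load(resp)
    upload = definitions_request(standby, user, password, definitions)
    with urllib.request.urlopen(upload, timeout=10) as resp:
        resp.read()  # drain the response; HTTP errors raise before this point


if __name__ == "__main__":
    # Hypothetical hosts and credentials.
    copy_definitions("rmq-primary.example", "rmq-standby.example", "admin", "secret")
```

Run periodically (from cron, say), something like this would keep the standby's vhosts, users, queues and exchanges close to the primary's without clustering across the WAN.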
Thanks in advance!
Regards,
Vaibhaw