Federated Cluster Setup

Kris Reese

Sep 2, 2014, 4:55:41 PM

to rabbitm...@googlegroups.com

Hello,

I'm having trouble understanding how to set up a federated cluster. I have a setup as follows:

Two rabbitmq nodes in data center 1 (DC1):
DEV@DC1 mqmgr@devmq02 ~ $ rabbitmqctl cluster_status
Cluster status of node rabbit@devmq02 ...
[{nodes,[{disc,[rabbit@devmq02,rabbit@devmq03]}]},
{running_nodes,[rabbit@devmq03,rabbit@devmq02]},
{cluster_name,<<"rabbit@dc1">>},
{partitions,[]}]
...done.

Two rabbitmq nodes in data center 2 (DC2):
DEV@DC2 mqmgr@devmq04 ~ $ rabbitmqctl cluster_status
Cluster status of node rabbit@devmq04 ...
[{nodes,[{disc,[rabbit@devmq04,rabbit@devmq05]}]},
{running_nodes,[rabbit@devmq05,rabbit@devmq04]},
{cluster_name,<<"rabbit@dc2">>},
{partitions,[]}]
...done.

I've set the following policy on one node in each cluster:
DEV@DC1 mqmgr@devmq02 ~ $ rabbitmqctl set_policy federate-me ".*" '{"federation-upstream-set":"all"}'
DEV@DC2 mqmgr@devmq04 ~ $ rabbitmqctl set_policy federate-me ".*" '{"federation-upstream-set":"all"}'

Next, I tried to set_parameter via the command line, but something is wrong with the command:
DEV@DC1 mqmgr@devmq02 ~ $ rabbitmqctl set_parameter federation-upstream rabbitdc2 '[{"uri":"amqp://user:pass...@devmq04.myserver.com","expires":3600000}, {"uri":"amqp://user:password@devmq05.myserver.com","expires":3600000}]'
Setting runtime parameter "rabbitdc2" for component "federation-upstream" to "[{\"uri\":\"amqp://user:pass...@devmq04.myserver.com\",\"expires\":3600000}, {\"uri\":\"amqp://user:pass...@devmq05.myserver.com\",\"expires\":3600000}]" ...
Error: Validation failed

Unrecognised terms [[{<<"uri">>,<<"amqp://user:pass...@devmq04.myserver.com">>},
                     {<<"expires">>,3600000}],
                    [{<<"uri">>,<<"amqp://user:pass...@devmq05.myserver.com">>},
                     {<<"expires">>,3600000}]] in rabbitdc2
Key "uri" not found in rabbitdc2

DEV@DC2 mqmgr@devmq04 ~ $ rabbitmqctl set_parameter federation-upstream rabbitdc1 '[{"uri":"amqp://user:pass...@devmq02.myserver.com","expires":3600000}, {"uri":"amqp://user:password@devmq03.myserver.com","expires":3600000}]'
(same error output as above)

So I set up the Federation Upstream via the management console, where I space-delimited the URIs. I ran this via the management console on a node in each cluster (devmq02 and devmq04 respectively).

What I'm looking to do is have DC1 as the primary site, replicating messages between DC1 and DC2. In the event DC1 is lost, a load balancer would begin to direct traffic to the DC2 rabbitmq cluster.

Am I in the realm of possibility?

Thank you.

Michael Klishin

Sep 2, 2014, 4:59:11 PM

to rabbitm...@googlegroups.com, Kris Reese

On 3 September 2014 at 00:55:53, Kris Reese (ktr...@gmail.com) wrote:

--
MK

Staff Software Engineer, Pivotal/RabbitMQ

Michael Klishin

Sep 2, 2014, 5:02:05 PM

to rabbitm...@googlegroups.com, Kris Reese

On 3 September 2014 at 00:55:53, Kris Reese (ktr...@gmail.com) wrote:

> So I setup the Federation Upstream via the management console
> where I space delimited the URIs. I ran this via the management
> console on a node in each cluster (devmq02 and devmq04 respectively).

Have you managed to set up the upstream in the end? Is the issue simply that you
couldn't do it via rabbitmqctl or something seems wrong when performed via management UI?

> What I'm looking to do is have DC1 as the primary site, replicating
> messages between DC1 and DC2. In the event DC1 is lost, a load balancer
> would begin to direct traffic to the DC2 rabbitmq cluster.
>
> Am I in the realm of possibility?

This is what exchange federation is for, pretty much. Note that you'll have to make
sure the topology (not just exchanges but queues, bindings) in both clusters is the same.

Alvaro Videla

Sep 2, 2014, 5:17:17 PM

to Kris Reese, rabbitm...@googlegroups.com

Hi Kris,

On Tue, Sep 2, 2014 at 10:55 PM, Kris Reese <ktr...@gmail.com> wrote:

> Next, I tried to set_parameter via command line but have something incorrect with the command:
> DEV@DC1 mqmgr@devmq02 ~ $ rabbitmqctl set_parameter federation-upstream rabbitdc2 '[{"uri":"amqp://user:pass...@devmq04.myserver.com","expires":3600000}, {"uri":"amqp://user:password@devmq05.myserver.com","expires":3600000}]'

There you are using a JSON array for a federation-upstream, but arrays are allowed only for federation-upstream-set. See the reference here:

http://www.rabbitmq.com/federation-reference.html

Regards,

Alvaro
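
For reference, a minimal sketch of how the value could be reshaped so that it validates: one JSON object per federation-upstream, with both hosts listed under a single "uri" key. As far as I recall from the federation reference, "uri" accepts either a string or a list of strings, and the plugin picks one URI from the list when connecting; please check that against the reference for your version. The helper name upstream_command and the user:password credentials are placeholders, not something from the thread.

```python
import json
import shlex


def upstream_command(name, uris, expires_ms):
    """Build a rabbitmqctl command that defines one federation upstream.

    The value is a single JSON object. Several URIs go into a list under
    the "uri" key instead of into an outer JSON array of objects.
    """
    value = {"uri": list(uris), "expires": expires_ms}
    args = ["rabbitmqctl", "set_parameter", "federation-upstream",
            name, json.dumps(value)]
    # shlex.quote wraps the JSON in single quotes for the shell.
    return " ".join(shlex.quote(a) for a in args)


# Placeholder credentials; substitute the real ones.
cmd = upstream_command(
    "rabbitdc2",
    ["amqp://user:password@devmq04.myserver.com",
     "amqp://user:password@devmq05.myserver.com"],
    3600000,
)
print(cmd)
```

The printed line can be pasted into a shell on a DC1 node; the same call with the devmq02/devmq03 URIs and the name rabbitdc1 would cover the DC2 side.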

Kris Reese

Sep 2, 2014, 5:21:28 PM

to rabbitm...@googlegroups.com, ktr...@gmail.com

I believe the upstream has been set up. Attached are some screenshots showing the upstreams. What I'm not sure about is whether, at this point, I should see any running links under Federation Status. Right now, it shows "... no links ...". Also note that at this time there are no queues declared or anything of the sort. Just a stock install with the above setup.

Screen Shot 2014-09-02 at 4.14.52 PM.png

Screen Shot 2014-09-02 at 4.18.59 PM.png

Kris Reese

Sep 3, 2014, 10:25:23 AM

to rabbitm...@googlegroups.com, ktr...@gmail.com

So I think the reason I was not seeing any Running Links is that I had another priority 0 policy defined as follows:

rabbitmqctl set_policy ha-all ".*" '{"ha-mode":"all"}'

My initial thought was that I would need to mirror queues within the cluster so that HA is attainable within a cluster (say, if one node in the cluster goes down, there would be no need to cut over to DC2). So how does queue mirroring play with federation?

Alvaro Videla

Sep 3, 2014, 10:30:40 AM

to Kris Reese, rabbitm...@googlegroups.com

If you want to have two policy definitions, take a look here "Combining Policy Definitions" on how to do it: http://www.rabbitmq.com/parameters.html
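
The background, as I understand it, is that at most one policy applies to a given queue or exchange, so two policies that both match ".*" at the same priority cannot both take effect; the keys have to live in one definition. A small sketch of that merge follows. The function name combine_definitions is made up for illustration, and refusing conflicting keys is my own choice rather than anything RabbitMQ does.

```python
import json


def combine_definitions(*definitions):
    """Merge several policy definitions into a single definition.

    Raises ValueError if two definitions set the same key to
    different values, since one policy can carry only one value.
    """
    merged = {}
    for definition in definitions:
        for key, value in definition.items():
            if key in merged and merged[key] != value:
                raise ValueError(f"conflicting values for {key!r}")
            merged[key] = value
    return merged


federation = {"federation-upstream-set": "all"}
mirroring = {"ha-mode": "all"}

combined = combine_definitions(federation, mirroring)
# The JSON printed here is what would be passed to a single
# rabbitmqctl set_policy call as the definition argument.
print(json.dumps(combined))
```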

Michael Klishin

Sep 3, 2014, 11:16:12 AM

to Kris Reese, rabbitm...@googlegroups.com

Exchange federation can be used with mirrored queues. Mirroring federated queues is possible technically but I'd need to try it to see if it works the way I expect.

MK

Kris Reese

Sep 3, 2014, 11:44:14 AM

to rabbitm...@googlegroups.com, ktr...@gmail.com

perfect -- thanks Alvaro! Now I feel that my policy is properly defined.

I'm still a bit confused, however, about the overall federation setup. Do I need to set up Federation Upstreams on a node in each cluster? In other words, is my DC2 cluster the upstream to the DC1 cluster, and is my DC1 cluster the upstream to the DC2 cluster?

Right now, I've only defined both nodes of the DC1 cluster as an upstream on a node in the DC2 cluster via the management console.

Michael Klishin

Sep 3, 2014, 11:52:33 AM

to Kris Reese, rabbitm...@googlegroups.com

You can do either, and even mutual federation. Depends on what you want to achieve.

MK

Kris Reese

Sep 3, 2014, 1:10:49 PM

to rabbitm...@googlegroups.com, ktr...@gmail.com, mic...@rabbitmq.com

I am interested in the following setup:

DC1: two nodes (node1 and node2) in a cluster

DC2: two nodes (node3 and node4) in a cluster

DC1 and DC2 are in two different geographical locations.

DC2 is essentially to be a mirror copy of DC1, sitting idle/standby unless DC1 were to completely go offline.

On node1 and node3, I set the following policy, which was replicated to the respective cluster partner:

rabbitmqctl set_policy ha-fed ".*" '{"federation-upstream-set":"all","ha-mode":"all"}'

On node1, via the management console, I defined the following Federation Upstream:

Name: rabbitdc2

URI: amqp://user:password@node3 amqp://user:password@node4

Expiry: 3600000ms

Ack mode: on-confirm

On node3, via the management console, I defined the following Federation Upstream:

Name: rabbitdc1

URI: amqp://user:password@node1 amqp://user:password@node2

Expiry: 3600000ms

Ack mode: on-confirm

Via the management console, I see Running Links via Federation Status as expected.

Next, I create a Queue named "test" on node1 via the management console. I see it automatically show up on node2, node3, and node4.

Next, I publish a message to the "test" queue on node1. I see the message mirror over to node2. But I do not see the message populate on node3 and node4.

Next, I want to consume the message and have the fact that it was consumed reflected on node3 and node4 as well.

I'm unclear as to what components I am missing for messages posted to a queue to replicate across the clusters. I read the How It Works section of http://www.rabbitmq.com/federated-queues.html, and it seems that messages are only retrieved on an as-needed basis. With that in mind, I ran the consumer against node4; even though it showed 0 messages in the test queue on that node, the messages were consumed. I then reposted the messages to the queue on node1 and shut down node1 and node2 (the DC1 cluster), and now I'm not able to consume those messages from my DC2 cluster. I was hoping that with the shutdown of one of the nodes, the messages would sync between clusters. I'm missing something, but I'm not sure what at this point.

Thanks for the time

--

Kris

Kris Reese

Sep 3, 2014, 1:58:05 PM

to rabbitm...@googlegroups.com, ktr...@gmail.com, mic...@rabbitmq.com

ok so... I wasn't publishing a message to the exchange, and if the following is true, then it makes sense:

Exchange federation replicates messages that are published to exchange X
in the upstream to downstream. Messages not yet published downstream will
be lost.

Queue federation does not replicate messages: it distributes them between
federated nodes/clusters, favoring local consumers.

However, consuming the message from the queue leaves the message in the standby cluster's queue...

Michael Klishin

Sep 3, 2014, 2:15:08 PM

to Kris Reese, rabbitm...@googlegroups.com

Then make dc1 an upstream of dc2 for all exchanges. The rest of the topology should be identical.

MK
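
A rough sketch of what that one-directional setup might look like on the policy side: the federation policy is set on the DC2 cluster only and restricted to exchanges with --apply-to, so queues are left out of federation. The policy name federate-exchanges and the ".*" pattern are assumptions carried over from earlier in the thread, and the upstream pointing at DC1 still has to be defined on DC2 separately.

```python
import json
import shlex


def policy_command(name, pattern, definition, apply_to):
    """Build a rabbitmqctl set_policy command limited to one object type."""
    args = ["rabbitmqctl", "set_policy", "--apply-to", apply_to,
            name, pattern, json.dumps(definition)]
    return " ".join(shlex.quote(a) for a in args)


# Intended for a DC2 node only: DC2 is the downstream, DC1 the upstream.
cmd = policy_command(
    "federate-exchanges",                 # hypothetical policy name
    ".*",                                 # pattern used earlier in the thread
    {"federation-upstream-set": "all"},
    "exchanges",
)
print(cmd)
```

If the same queues also need mirroring inside each cluster, that would go into a separate policy applied to queues, since this one no longer matches them.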

Kris Reese

Sep 3, 2014, 2:36:41 PM

to rabbitm...@googlegroups.com, ktr...@gmail.com, mic...@rabbitmq.com

ok -- I've done just that.

However, if I simulate a DC1 failure by shutting down rabbitmq on both nodes in DC1, the queued messages are not being replicated over to DC2.

Michael Klishin

Sep 3, 2014, 3:17:30 PM

to rabbitm...@googlegroups.com, Kris Reese

On 3 September 2014 at 22:36:47, Kris Reese (ktr...@gmail.com) wrote:
> However, if I simulate a DC1 failure by shutting down rabbitmq
> on both nodes in DC1, the queued messages are not being replicated
> over to DC2.

Exchange federation replicates a stream of messages to an exchange. If it helps,
it does so by binding an anonymous queue to the exchange, consuming all messages
from it and publishing them over a RabbitMQ client to the downstream machine.

So your downstream exchanges will have an illusion of messages being published to them
by the clients. If those messages cannot be routed anywhere, for example, they are gone.

This is pretty much like async replication in a data store like MySQL. Enqueued messages in DC1
that were not replicated to DC2 are gone if DC1 is gone.

Kris Reese

Sep 3, 2014, 5:28:33 PM

to rabbitm...@googlegroups.com, ktr...@gmail.com

Looks like I've made good progress. I'm at the point now where I've only federated the exchanges. I created the same queue across all nodes in both clusters, and bound them to the amq.fanout exchange. Now, when I produce a message to the amq.fanout exchange, the message shows up in the said queue across all nodes.

However, when I consume the message from a node in DC1, the consumption is not replicated in DC2. What is the solution to keep the queues in sync across the cluster?

Thanks!

Michael Klishin

Sep 3, 2014, 5:38:30 PM

to rabbitm...@googlegroups.com, Kris Reese

On 4 September 2014 at 01:28:39, Kris Reese (ktr...@gmail.com) wrote:
> However, when I consume the message from a node in DC1, the consumption
> is not replicated in DC2. What is the solution to keep the queues
> in sync across the cluster?

Message acks/nacks are visible to all cluster nodes but federation does not form clusters.

There is currently no solution other than clustering that does this, but clustering
is not supposed to be used over WAN. Queue federation is for spreading a single queue
over multiple nodes/clusters that each have consumers.
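
To make the behaviour concrete, here is a toy in-memory model of two clusters joined by exchange federation. It is not RabbitMQ code and uses no client library: the Broker class and its methods are invented for illustration, and each "cluster" is reduced to one fanout exchange feeding one queue. It only shows that publishes are copied downstream while consumption stays local.

```python
from collections import deque


class Broker:
    """Toy stand-in for one cluster: a fanout exchange feeding one queue."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()
        self.downstreams = []  # brokers that federate this exchange

    def publish(self, message):
        # Route to the local queue.
        self.queue.append(message)
        # The federation link republishes the message downstream.
        for broker in self.downstreams:
            broker.publish(message)

    def consume(self):
        # Consuming (and acking) removes the message here only;
        # nothing is sent over the federation link.
        return self.queue.popleft()


dc1, dc2 = Broker("dc1"), Broker("dc2")
dc1.downstreams.append(dc2)  # DC1 is the upstream of DC2

dc1.publish("order-1")
dc1.publish("order-2")
dc1.consume()

print(list(dc1.queue))  # ['order-2']
print(list(dc2.queue))  # ['order-1', 'order-2']
```

The already-consumed "order-1" is still sitting in the DC2 queue, which is the situation described above.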

Kris Reese

Sep 3, 2014, 5:48:12 PM

to rabbitm...@googlegroups.com, ktr...@gmail.com

Thank you... that answer, ultimately, is what I've been struggling to understand, and the simple answer of "there is currently no solution" will hopefully get rid of this headache :)

With that said, perhaps it's best to go with an Active/Active approach between data centers and use federated queues. The load balancer would then be responsible for directing traffic to nodes that are online, while avoiding the problem network partitions pose with a cluster config over a WAN.

Michael Klishin

Sep 3, 2014, 6:22:14 PM

to rabbitm...@googlegroups.com, Kris Reese

On 4 September 2014 at 01:48:19, Kris Reese (ktr...@gmail.com) wrote:
> With that said, perhaps it's best to go with an Active/Active
> approach between data centers and use federated queues. The
> load balancer would then be responsible for directing traffic
> to nodes that are online, while avoiding the problem network
> partitions pose with a cluster config over a WAN.

Note that federated queues are NOT the same as mirrored queues over federation links.

Federated queues get a set of messages distributed among them (a "logical queue"), which
is not replicated, so if DC1 goes dark, all the messages in federated queues
that were in DC1 will not be available to consumers in DC2.

What you're asking for is mirroring over WAN. It remains to be seen if key RabbitMQ cluster semantics
(CP in CAP) can be reasonably preserved in an AP solution, so there is no
timeline for this feature.

What you possibly can do is to connect DC1 and DC2 via federated exchanges but have the standby
DC have a policy that sets message TTL for all queues to a few hours. Then during failover
you can remove the policy. Still not the same as "AP mirroring" but may be an OK solution
for a standby cluster nonetheless.
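
A sketch of what the standby TTL idea could look like as commands, assuming a 4-hour TTL, a policy named standby-ttl, and a ".*" pattern (all three are placeholders). The "message-ttl" key takes milliseconds. Keep in mind that only one policy applies per queue, so if the standby queues are also mirrored, "message-ttl" and "ha-mode" would probably need to share one definition.

```python
import json
import shlex


def hours_to_ms(hours):
    """Convert hours to the milliseconds that message-ttl expects."""
    return int(hours * 60 * 60 * 1000)


def standby_ttl_commands(policy_name, pattern, ttl_hours):
    """Return (set_command, clear_command) for a queue TTL policy."""
    definition = {"message-ttl": hours_to_ms(ttl_hours)}
    set_args = ["rabbitmqctl", "set_policy", "--apply-to", "queues",
                policy_name, pattern, json.dumps(definition)]
    clear_args = ["rabbitmqctl", "clear_policy", policy_name]

    def join(args):
        return " ".join(shlex.quote(a) for a in args)

    return join(set_args), join(clear_args)


set_cmd, clear_cmd = standby_ttl_commands("standby-ttl", ".*", 4)
print(set_cmd)    # run on the standby cluster during normal operation
print(clear_cmd)  # run on the standby cluster at failover
```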

Kris Reese

Sep 4, 2014, 12:17:14 PM

to rabbitm...@googlegroups.com, ktr...@gmail.com

Michael,

Thank you for the clarification and the time helping me better understand.

Kris
