RabbitMQ Clustering - Best practices?

Dayton Turner

May 15, 2017, 8:59:13 PM
to 2600hz-dev
Hi list,

I'm interested in doing some AMQP clustering within our zones, specifically to be able to sustain individual node restarts without taking amqp offline completely (rolling restarts, basically).

I see in Kamailio's config comments that there are suggestions about ensuring that the BLF queues are marked with the proper ha-mode policy. I don't know if these are the only queues that require ha-mode declarations, or if any others do as well. (If so, which ones?)

Having never successfully clustered RabbitMQ with Kazoo in the past, I'm unsure if this is a resiliency solution, or if it's an "if amqp1 has problems, now amqp2 does too" sort of thing.

Would love to hear feedback from others who have experimented with this! AMQP is currently the last "single point of failure" that can impact the entire cluster when it has problems.

:)

Arek Fryz

May 15, 2017, 10:10:51 PM
to 2600h...@googlegroups.com

Dayton,

I was under the impression that if the primary AMQP goes down, the system will switch to the secondary. I think I tested it a while ago and it worked.

I may be wrong about the current release.

Arek


--
You received this message because you are subscribed to the Google Groups "2600hz-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to 2600hz-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Yuriy Nasida

May 16, 2017, 1:36:44 AM
to 2600h...@googlegroups.com
I think Dayton is correct. The Kamailio conf file just suggests setting the IP address of the AMQP server. But what happens if AMQP goes down? The only idea I have so far is that the 2nd Kamailio server should use the 2nd AMQP server.

Btw, there is one more question here.
Why does Kazoo look for the AMQP server only locally? I can set an external IP in the kz zones file, but it is ignored. That means an AMQP node always has to sit alongside the Kazoo node.



Dayton Turner

May 16, 2017, 9:21:33 AM
to 2600hz-dev
So, yes, you can totally fail over to a secondary AMQP, but there's a significant difference between failing over because rabbit is stopped and failing over when rabbit is 'unhealthy' for any reason. The secondary failover works, though I have seen Kamailio get kind of 'stuck' afterwards, where it doesn't relay all its messages to the broker properly; a quick Kamailio restart clears it up. Still, there are good reasons not to want to fail over all your queues to a full backup AMQP host, and to share full state with a cluster member instead.

Note: AMQP over the WAN is a bad idea. Erlang messaging over the WAN (a latent or lossy link) is a bad idea.



fred

May 16, 2017, 1:52:38 PM
to 2600hz-dev
Seems to work when I checked. You will probably lose any in-progress calls in that zone when the primary AMQP goes down. Easy to test just by taking it offline while making calls. Not a problem for me as long as it keeps the zone functioning.

I think the advice about not running AMQP over the WAN applies if you are creating AMQP clusters. Not sure how many people are doing that. Seems like a big hassle for little benefit. The way I am approaching it is that if the datacenter goes down, the zone is down, and I will still have other zones for redundancy.

Joshua Laroff

May 16, 2017, 5:32:27 PM
to 2600h...@googlegroups.com
While I'm not quite convinced that my clustered RabbitMQ configuration would be considered "best practices", the thread's subject does end with a question mark, so I'll share what I have done.

***If you are going to cluster RabbitMQ in a Kazoo deployment you will need to modify kazoo-rabbitmq so that Rabbit's mnesia database is not removed on restart.
I simply comment out the offending line in kazoo-rabbitmq, specifically line 27 (rm -rf /var/lib/rabbitmq/mnesia/kazoo-rabbit*).
Pay attention when upgrading kazoo-configs in the future if you have made this modification.

Clustering Rabbit within a zone is pretty straightforward.
Off the top of my head, here are the steps I would take to cluster two RabbitMQ instances:
We'll call them R1 and R2
1. Start both instances: systemctl start kazoo-rabbitmq
2. Stop RabbitMQ using rabbitmqctl on both instances: rabbitmqctl stop_app
3. Reset RabbitMQ using rabbitmqctl on both instances: rabbitmqctl reset
4. Stop both instances: systemctl stop kazoo-rabbitmq
5. Restart both instances: systemctl start kazoo-rabbitmq
6. Stop and reset R2 (for good measure): rabbitmqctl stop_app && rabbitmqctl reset
7. Join cluster from R1: rabbitmqctl join_cluster kazoo-r...@R2.FQDN
8. Restart R2 using rabbitmqctl: rabbitmqctl start_app
9. Set the HA Policy: rabbitmqctl set_policy HA '^(.*)' '{"ha-mode": "all"}'
10. Run rabbitmqctl cluster_status to verify your cluster
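
For reference, the join portion of the steps above can be sketched as a small shell function. This is only a sketch, and it assumes the stock RabbitMQ sequence (stop_app, reset, join_cluster, start_app) run on the node that is joining, pointed at a peer whose RabbitMQ app is already running. The function names run and join_cluster and the DRY_RUN switch are made up for illustration, and the node name is a placeholder you must replace with your real Erlang node name. By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: run this on the node that is JOINING the cluster.
# "run", "join_cluster" and DRY_RUN are hypothetical helpers, not part of Kazoo or RabbitMQ.

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 is the Erlang node name of a peer that is already up (placeholder value in the examples).
join_cluster() {
    peer_node="$1"
    run rabbitmqctl stop_app || return 1                  # stop the RabbitMQ app, keep the Erlang VM up
    run rabbitmqctl reset || return 1                     # wipe local state; this node must be blank to join
    run rabbitmqctl join_cluster "$peer_node" || return 1
    run rabbitmqctl start_app || return 1
    run rabbitmqctl cluster_status                        # confirm both nodes are listed
}
```

Calling join_cluster with the peer's node name prints the five commands in order; setting DRY_RUN=0 first would execute them for real, so try that on a test zone before production.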

To take advantage of this you must update your zone configuration in config.ini as well as your Kamailio config:
config.ini example:
[zone]
name = "zone1"
amqp_uri = "amqp://guest:guest@R1"
amqp_uri = "amqp://guest:guest@R2" 

For Kamailio modify local.cfg adding:
#!define MY_AMQP_URL_SECONDARY
#!substdef "!MY_AMQP_URL!kazoo://guest:guest@R1:5672!g"
#!substdef "!MY_AMQP_URL_SECONDARY!kazoo://guest:guest@R2:5672!g"
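
On the config.ini side, a quick way to double check that both brokers actually made it into the zone section is to list the amqp_uri values. A minimal sketch follows, assuming the layout from the example above ([zone] header, amqp_uri = "..." lines); the function name list_zone_amqp_uris is made up here, and you pass it the path to your own config.ini.

```shell
#!/bin/sh
# Sketch only: "list_zone_amqp_uris" is a hypothetical helper, not a Kazoo tool.
# Prints every amqp_uri value found under a [zone] section of the given ini file.
list_zone_amqp_uris() {
    awk '
        # Any section header toggles whether we are inside a [zone] section.
        /^[ \t]*\[/ { in_zone = ($0 ~ /^[ \t]*\[zone\]/); next }
        in_zone && /^[ \t]*amqp_uri[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")   # drop the key and the equals sign
            gsub(/"/, "")              # drop the surrounding quotes
            sub(/[ \t]+$/, "")         # drop trailing whitespace
            print
        }
    ' "$1"
}
```

With the example zone above, this should print one line per broker, R1 first and R2 second; a single line of output means the secondary entry is missing.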


Hopefully this is helpful to those considering clustering rabbitmq!

Josh



fred

May 17, 2017, 2:56:15 PM
to 2600hz-dev
Thanks for the step by step.  I think this is the first time I have seen this documented anywhere.  

What does this accomplish as opposed to just setting up Primary/Secondary in Kazoo? Without AMQP clustering you lose in-progress calls when the Primary AMQP goes down, and I think there is a timeout period of 1 minute or so before it fails over, but the Zone will stay up and new call attempts will go to the Secondary after the timeout period, correct?