[3.7][Kubernetes][Cluster] Making RabbitMQ persistent, resetting the cluster config?


Nils Peeters

Dec 21, 2017, 5:00:45 AM
to rabbitmq-users
Hiya,

I made a RabbitMQ (3.7) cluster on Kubernetes on GKE with the help of the official peer discovery plugin, which I found on RabbitMQ's clustering page.
Unfortunately I'm unable to make the cluster persistent (even after a full cluster reboot), and I was hoping someone could help me out or give me some advice.

Initial attempt
I started from the example YAML files in the official GitHub examples.
At first I had some issues with the RBAC rules in Kubernetes (I needed an endpoint reader role, etc.), but after fixing that everything worked fine. However, I needed to make the whole cluster persistent / redundant (it also needs to work with the delayed message plugin).
The solution I found was making the '/var/lib/rabbitmq' folder persistent by mounting a volume at this path (I used a Google Persistent Disk, which is basically a physical hard drive).
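For reference, a rough sketch of what I mean in the StatefulSet spec (the claim name, storage class and size are illustrative, not my exact YAML):

# in the container spec: mount the persistent volume at RabbitMQ's data directory
volumeMounts:
- name: rabbitmq-data
  mountPath: /var/lib/rabbitmq

# in the StatefulSet spec: one Persistent Disk backed claim per pod
volumeClaimTemplates:
- metadata:
    name: rabbitmq-data
  spec:
    accessModes: [ "ReadWriteOnce" ]
    storageClassName: standard   # GKE's default PD-backed StorageClass
    resources:
      requests:
        storage: 10Gi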

Discovery issues with erlang cookie (solved)
This was when the real issues started popping up. Discovery was failing because each Rabbit node had a different Erlang cookie, and because of that they refused to form a cluster.
After some searching I found a workaround: setting the environment variable RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS to "-setcookie cookie1" on all Rabbit nodes, so they share a cookie instead of relying on the file-based Erlang cookie.
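In the StatefulSet container spec that boils down to something like this (the cookie value is just the example above; in practice it should come from a Secret):

env:
- name: RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS
  value: "-setcookie cookie1"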

Cluster config is saved and contains IP addresses (unresolved)
After tackling this issue, I noticed that the peer-discovery-k8s plugin uses the Pods' IPs to discover peers and form a cluster, but a Pod's IP is never static. That wouldn't be a problem if the cluster config weren't stored in the '/var/lib/rabbitmq/mnesia' folder.
Because of this, when a Rabbit node crashes and recovers (e.g. runs out of memory), or Kubernetes moves a RabbitMQ Pod around during a cluster upgrade, RabbitMQ's boot process fails.
The node gets a different IP assigned by Kubernetes, but the previous cluster config is still saved on disk, so it tries to connect to the cluster using the old IP addresses and cluster formation fails.

Possible solutions
At this point there are two possible stable solutions in my opinion (if you think I could try something else, I'm all ears):
  1. The official plugin stops using Pod IPs and uses host names instead. We're working with StatefulSets, so Pod names and their host names are predictable (e.g. rabbitmq-0, rabbitmq-1, ...) and Kubernetes enforces the numeric ordering (see the sketch after this list).
    I would use host names for discovery instead of the IPs from the Kubernetes endpoints, but I'm not an Erlang developer, so I can't fork the plugin and make a stable fix.
    Changing RABBITMQ_NODENAME to 'rabbit@${MY_HOSTNAME}' is not enough, because the plugin really expects an IP address as RABBITMQ_NODENAME.
  2. Reset the whole cluster config on container boot (I haven't been able to do this so far, documentation is lacking), or save the cluster config outside my persistent path (/var/lib/rabbitmq), but I can't find any documentation on how to do that either.
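As a rough sketch of what I mean with solution 1 (purely illustrative, the plugin does not support this today as far as I can tell; the service and namespace names are placeholders):

env:
- name: MY_POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
- name: RABBITMQ_USE_LONGNAME
  value: "true"
- name: RABBITMQ_NODENAME
  value: "rabbit@$(MY_POD_NAME).<headless-service>.<namespace>.svc.cluster.local"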
You can easily replicate my use-case by doing the following:
(This works as-is only in Google Kubernetes Engine; otherwise you might have to modify the 'volumeClaimTemplates' to a persistent volume type your cluster supports. EmptyDir is not persistent.)
  1. Go to the parent folder of my zip
  2. kubectl apply -f <foldername_of_zip>
  3. Wait and watch the logs of the containers until the cluster is formed (there is also a management console available at nodeport 31672)
  4. Scale the StatefulSet to 0, scale it again to 3 after the StatefulSet has 0 available pods
  5. Watch the whole thing burn
There's also a minikube guide in the readme.md on the official GitHub examples page, but it doesn't take persistence into account.

Lastly, I've temporarily disabled the readiness and liveness probes until I'm able to form a stable cluster.

I've added all my modified YAML files to this post.

I hope I gave you guys enough information, any help would be really appreciated (especially with 'possible solution 2').

Kind regards,
Nils Peeters


Nils Peeters

Dec 21, 2017, 5:11:00 AM
to rabbitmq-users
Apparently uploading files is broken at the moment, or ZIP / YAML files are not allowed.
I've uploaded the YAML files to my github page:

Gabriele Santomaggio

Dec 21, 2017, 11:19:48 AM
to rabbitmq-users
Hi,
Please check the new example https://github.com/rabbitmq/rabbitmq-peer-discovery-k8s/tree/v3.7.x/examples/k8s_statefulsets, it uses the official rabbitmq docker image so you can change the cookie and other settings easily.


-
Gabriele

Nils Peeters

Dec 27, 2017, 12:15:06 PM
to rabbitmq-users
Hi Gabriele,

Thanks for your quick reply! I was only able to pick this up this week.
This is great stuff and helped me get a whole lot further, thanks.
I managed to create a cluster in Kubernetes with discovery based on hostnames.

The initial deployment is no longer a problem at all, but when I scaled the cluster down to, for example, 1 and then back up to 3, the 2 Rabbit nodes that were restarted failed to start.
I think this is because of some sort of configuration mismatch, but I have no idea on how to solve this robustly.

On rabbit node 0 (the first) I saw the following messages when I scaled down:
Removing node 'rab...@rabbitmq-1.rabbitmq.test-rabbitmq.svc.cluster.local' from cluster
Removing node 'rab...@rabbitmq-2.rabbitmq.test-rabbitmq.svc.cluster.local' from cluster

When I scaled them back up, I saw the following relevant message in the logs of Rabbit node 1:
throw:{error,{inconsistent_cluster,"Node 'rab...@rabbitmq-1.rabbitmq.test-rabbitmq.svc.cluster.local' thinks it's clustered with node 'rab...@rabbitmq-0.rabbitmq.test-rabbitmq.svc.cluster.local', but 'rab...@rabbitmq-0.rabbitmq.test-rabbitmq.svc.cluster.local' disagrees"}}

And a similar message on node 2.
I've read up a bit on this error and read the section about breaking apart a cluster in the clustering guide, but the solutions offered there and in other answers require that you reset the node with rabbitmqctl.
If I understand this correctly, that also means all the data gets wiped off the Rabbit node, which is not something you want when a node crashes (out of memory, for example) and has valuable (delayed) messages on it.

I've pushed my updated YAML files to my github: https://github.com/Hetkoekje/rabbitmq-k8s.

Thanks in advance.

Kind regards,
Nils

Thanh Pham Minh

Jan 10, 2018, 3:25:54 AM
to rabbitmq-users
I also have the same goal of making rabbitmq-autocluster on k8s persistent; my similar topic: https://groups.google.com/forum/#!topic/rabbitmq-users/aaGpuyyAF78

Many thanks to Nils Peeters for the idea of clustering RabbitMQ by hostname.

This is my solution: https://gist.github.com/pmint93/cb87c5f46502ce8047a084238cad03e4

It works well on a K8s 1.8.4 cluster managed with kops, running on AWS.

Michael Klishin

Jan 10, 2018, 12:18:00 PM
to rabbitm...@googlegroups.com
Thank you for sharing that!

--
MK

Staff Software Engineer, Pivotal/RabbitMQ

Nils Peeters

Jan 15, 2018, 12:18:56 PM
to rabbitmq-users
Thanks for posting your solution, Thanh Pham Minh, really appreciated.

I've tried your configuration, but on my Google Cloud Kubernetes cluster (1.8.5).
With your example the nodes don't seem to be able to form a cluster, unfortunately (a split brain, it seems).
Am I doing something wrong? (I ran your script)

Furthermore, I don't see any major differences in our setups aside from 3 things:

1#
kubectl exec rabbitmq-0 -n $KUBE_NAMESPACE -- rabbitmqctl set_policy ha-all "" '{"ha-mode":"all","ha-sync-mode":"automatic"}'

I tried this to be sure, but as expected it does not seem to affect cluster formation.

2#
You keep `/var/lib/rabbitmq/mnesia` persistent instead of `/var/lib/rabbitmq`; I tried changing this, with no effect.

3#
Your setup uses the old autocluster plugin and an older version of RabbitMQ.
I used the same Docker image and the same plugin as you did, and I got the results described above.
---

Lastly, did you really test the same use-case as I did?
1. Create the initial cluster
2. Wait until the cluster is completely formed
3. Delete the pod with the highest ID from the StatefulSet (rabbitmq-2 in my case) and see if it automatically recovers

Thanks in advance for any suggestions.

Thanh Pham Minh

Jan 15, 2018, 10:07:04 PM
to rabbitmq-users
Hmm, yeah, I see it seems similar; those minor scripts are just for my use case. But it's hard to say it's the same, because it uses a different image and I'm not sure rabbitmq-autocluster isn't doing something else.

Also, I've tested my cluster by running these scenarios:
1. Delete the first rabbitmq pod (rabbitmq-0)
2. Delete the last rabbitmq pod (rabbitmq-2 in my 3 replicas statefulset)
3. Delete all pods

And it recovered automatically every time.

Did you check that Pod DNS is working in your K8s cluster? Try to exec into a pod, then curl another one by its DNS name (e.g. curl rabbitmq-0.<headless service name>.<namespace>.svc.cluster.local:15672).

Arthur Wiebe

Jan 23, 2018, 11:23:08 PM
to rabbitmq-users
I am seeing something similar. Configs include:

- env:
  - name: MY_POD_NAME
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: metadata.name
  - name: RABBITMQ_USE_LONGNAME
    value: "true"
  - name: RABBITMQ_NODENAME
    value: rabbit@$(MY_POD_NAME).rabbitmq.default.svc.cluster.local


And in rabbitmq.conf

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
cluster_formation.k8s.hostname_suffix = rabbitmq.default.svc.cluster.local
cluster_formation.k8s.address_type = hostname


Yet the peer discovery results seem to be off.

If from rabbitmq-1 I run the command:
rabbitmqctl join_cluster rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local

On the rabbitmq-0 log I see:
2018-01-24 03:42:55.100 [info] <0.288.0> node 'rab...@rabbitmq-1.rabbitmq.default.svc.cluster.local' up
2018-01-24 03:42:57.720 [warning] <0.386.0> Peer discovery: removing unknown node rab...@rabbitmq-1.rabbitmq.default.svc.cluster.local from the cluster
2018-01-24 03:42:57.720 [info] <0.386.0> Removing node 'rab...@rabbitmq-1.rabbitmq.default.svc.cluster.local' from cluster

I have not been able to figure out what rabbit_peer_discovery_k8s is actually discovering, as it seems to be "discovering" different names than actually exist.

Running everything on IPs instead of hostname works fine, except of course there is no data persistence.

Thanh Pham Minh

Jan 23, 2018, 11:56:28 PM
to rabbitmq-users
Let's dig into rabbit_peer_discovery_k8s and see what it does by reading the code:
Config mapping: https://github.com/rabbitmq/rabbitmq-peer-discovery-k8s/blob/master/include/rabbit_peer_discovery_k8s.hrl
Peer discovering: https://github.com/rabbitmq/rabbitmq-peer-discovery-k8s/blob/master/src/rabbit_peer_discovery_k8s.erl

As I see it, it's pretty simple: at startup, peer discovery makes a query to the k8s API at api/v1/namespaces/<namespace>/endpoints/<service-name>.
That API returns a list of addresses with 2 fields, ip and hostname; the K8S_ADDRESS_TYPE env variable decides which of the two is picked for discovery.
K8S_HOSTNAME_SUFFIX is used to construct the fully qualified domain names (FQDNs) of the nodes, which RabbitMQ then uses to form the cluster.
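For illustration, the relevant part of the Endpoints object that API returns looks roughly like this (service name, pod names and IPs are made up):

# kubectl get endpoints rabbitmq -o yaml   (illustrative output)
subsets:
- addresses:
  - hostname: rabbitmq-0
    ip: 10.52.1.17
    targetRef:
      kind: Pod
      name: rabbitmq-0
  - hostname: rabbitmq-1
    ip: 10.52.2.9
    targetRef:
      kind: Pod
      name: rabbitmq-1
  ports:
  - name: amqp
    port: 5672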

So I think it's a good idea to "debug" rabbitmq_peer_discovery_k8s itself by understanding what it does and tracing the logs to figure out what's wrong.

Hope it helps!

Arthur Wiebe

Jan 24, 2018, 9:10:16 AM
to rabbitmq-users
Yeah, exactly. I've gone through the source code as well, and it looks like I'm doing everything right. The plugin doesn't log anything useful even if I set the log level to debug.

Arthur Wiebe

Jan 24, 2018, 9:37:00 AM
to rabbitmq-users
After closer examination, it appears the node_name function which adds the hostname suffix does not add a . in between the hostname and the suffix. So changing K8S_HOSTNAME_SUFFIX (or cluster_formation.k8s.hostname_suffix) to .rabbitmq.default.svc.cluster.local solved the issue!
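In other words, with my service and namespace names the working rabbitmq.conf lines are (note the leading dot in the suffix):

cluster_formation.k8s.address_type = hostname
cluster_formation.k8s.hostname_suffix = .rabbitmq.default.svc.cluster.local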

Amanpreet Singh

Apr 9, 2018, 6:41:18 AM
to rabbitmq-users
I'm hitting this with RabbitMQ 3.7 with k8s peer discovery as well. I'm using a k8s StatefulSet.

The cluster is formed correctly, but deleting a pod puts it in a crash loop forever due to BOOT FAILED, with this log:

2018-04-09 09:15:15.618 [debug] <0.97.0> Lager installed handler lager_forwarder_backend into error_logger_lager_event
2018-04-09 09:15:15.618 [debug] <0.100.0> Lager installed handler lager_forwarder_backend into rabbit_log_lager_event
2018-04-09 09:15:15.613 [debug] <0.94.0> Lager installed handler error_logger_lager_h into error_logger
2018-04-09 09:15:15.618 [debug] <0.86.0> Supervisor lager_sup started gen_event:start_link({local,error_logger_lager_event}) at pid <0.96.0>
2018-04-09 09:15:15.618 [debug] <0.86.0> Supervisor lager_sup started gen_event:start_link({local,rabbit_log_lager_event}) at pid <0.99.0>
2018-04-09 09:15:15.618 [debug] <0.86.0> Supervisor lager_sup started gen_event:start_link({local,rabbit_log_channel_lager_event}) at pid <0.102.0>
2018-04-09 09:15:15.623 [debug] <0.103.0> Lager installed handler lager_forwarder_backend into rabbit_log_channel_lager_event
2018-04-09 09:15:15.623 [debug] <0.86.0> Supervisor lager_sup started gen_event:start_link({local,rabbit_log_connection_lager_event}) at pid <0.105.0>
2018-04-09 09:15:15.623 [debug] <0.106.0> Lager installed handler lager_forwarder_backend into rabbit_log_connection_lager_event
2018-04-09 09:15:15.623 [debug] <0.86.0> Supervisor lager_sup started gen_event:start_link({local,rabbit_log_mirroring_lager_event}) at pid <0.108.0>
2018-04-09 09:15:15.624 [debug] <0.109.0> Lager installed handler lager_forwarder_backend into rabbit_log_mirroring_lager_event
2018-04-09 09:15:15.624 [debug] <0.86.0> Supervisor lager_sup started gen_event:start_link({local,rabbit_log_queue_lager_event}) at pid <0.111.0>
2018-04-09 09:15:15.624 [debug] <0.112.0> Lager installed handler lager_forwarder_backend into rabbit_log_queue_lager_event
2018-04-09 09:15:15.624 [debug] <0.86.0> Supervisor lager_sup started gen_event:start_link({local,rabbit_log_federation_lager_event}) at pid <0.114.0>
2018-04-09 09:15:15.624 [debug] <0.115.0> Lager installed handler lager_forwarder_backend into rabbit_log_federation_lager_event
2018-04-09 09:15:15.625 [debug] <0.86.0> Supervisor lager_sup started gen_event:start_link({local,rabbit_log_upgrade_lager_event}) at pid <0.117.0>
2018-04-09 09:15:15.625 [debug] <0.118.0> Lager installed handler lager_forwarder_backend into rabbit_log_upgrade_lager_event
2018-04-09 09:15:15.638 [debug] <0.81.0> Supervisor gr_param_sup started gr_param:start_link(gr_lager_default_tracer_params) at pid <0.120.0>
2018-04-09 09:15:15.642 [debug] <0.80.0> Supervisor gr_counter_sup started gr_counter:start_link(gr_lager_default_tracer_counters) at pid <0.121.0>
2018-04-09 09:15:15.647 [debug] <0.82.0> Supervisor gr_manager_sup started gr_manager:start_link(gr_lager_default_tracer_params_mgr, gr_lager_default_tracer_params, []) at pid <0.122.0>
2018-04-09 09:15:15.647 [debug] <0.82.0> Supervisor gr_manager_sup started gr_manager:start_link(gr_lager_default_tracer_counters_mgr, gr_lager_default_tracer_counters, [{input,0},{filter,0},{output,0},{job_input,0},{job_run,0},{job_time,0},{job_error,0}]) at pid <0.123.0>
2018-04-09 09:15:15.948 [info] <0.33.0> Application lager started on node 'rab...@rabbitmq-1.rabbitmq.default.svc.cluster.local'
2018-04-09 09:15:16.084 [debug] <0.90.0> Lager installed handler lager_backend_throttle into lager_event
2018-04-09 09:15:17.969 [debug] <0.127.0> Supervisor inet_gethost_native_sup started undefined at pid <0.128.0>
2018-04-09 09:15:17.969 [debug] <0.60.0> Supervisor kernel_safe_sup started inet_gethost_native:start_link() at pid <0.127.0>
2018-04-09 09:15:17.979 [debug] <0.136.0> Supervisor mnesia_sup started mnesia_sup:start_event() at pid <0.137.0>
2018-04-09 09:15:17.979 [debug] <0.136.0> Supervisor mnesia_sup started mnesia_ext_sup:start() at pid <0.138.0>
2018-04-09 09:15:17.979 [debug] <0.139.0> Supervisor mnesia_kernel_sup started mnesia_monitor:start() at pid <0.140.0>
2018-04-09 09:15:17.980 [debug] <0.139.0> Supervisor mnesia_kernel_sup started mnesia_subscr:start() at pid <0.141.0>
2018-04-09 09:15:17.980 [debug] <0.139.0> Supervisor mnesia_kernel_sup started mnesia_locker:start() at pid <0.142.0>
2018-04-09 09:15:17.980 [debug] <0.139.0> Supervisor mnesia_kernel_sup started mnesia_recover:start() at pid <0.143.0>
2018-04-09 09:15:17.980 [debug] <0.139.0> Supervisor mnesia_kernel_sup started mnesia_tm:start() at pid <0.144.0>
2018-04-09 09:15:17.980 [debug] <0.139.0> Supervisor mnesia_kernel_sup started mnesia_checkpoint_sup:start() at pid <0.145.0>
2018-04-09 09:15:17.980 [debug] <0.139.0> Supervisor mnesia_kernel_sup started mnesia_controller:start() at pid <0.146.0>
2018-04-09 09:15:17.981 [debug] <0.139.0> Supervisor mnesia_kernel_sup started mnesia_late_loader:start() at pid <0.147.0>
2018-04-09 09:15:17.981 [debug] <0.136.0> Supervisor mnesia_sup started mnesia_kernel_sup:start() at pid <0.139.0>
2018-04-09 09:15:17.981 [info] <0.33.0> Application mnesia started on node 'rab...@rabbitmq-1.rabbitmq.default.svc.cluster.local'
2018-04-09 09:15:17.983 [info] <0.33.0> Application mnesia exited with reason: stopped

BOOT FAILED
===========

Error description:
    init:do_boot/3
    init:start_em/1
    rabbit:start_it/1 line 444
    rabbit:'-boot/0-fun-0-'/0 line 300
    rabbit_mnesia:check_cluster_consistency/0 line 663
throw:{error,{inconsistent_cluster,"Node 'rab...@rabbitmq-1.rabbitmq.default.svc.cluster.local' thinks it's clustered with node 'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local', but 'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local' disagrees"}}
Log file(s) (may contain more information):
   <stdout>

2018-04-09 09:15:17.984 [error] <0.5.0>
Error description:
    init:do_boot/3
    init:start_em/1
    rabbit:start_it/1 line 444
    rabbit:'-boot/0-fun-0-'/0 line 300
    rabbit_mnesia:check_cluster_consistency/0 line 663
throw:{error,{inconsistent_cluster,"Node 'rab...@rabbitmq-1.rabbitmq.default.svc.cluster.local' thinks it's clustered with node 'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local', but 'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local' disagrees"}}
Log file(s) (may contain more information):
   <stdout>
{"init terminating in do_boot",{error,{inconsistent_cluster,"Node 'rab...@rabbitmq-1.rabbitmq.default.svc.cluster.local' thinks it's clustered with node 'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local', but 'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local' disagrees"}}}



Any ideas how to deal with this? (other than deleting the rabbitmq-1 data, of course)

Michael Klishin

Apr 9, 2018, 6:54:09 AM
to rabbitm...@googlegroups.com
Please start new threads for new questions.

This particular question was asked multiple times in different places on GitHub, and our team responded there. We explained what the error means. To provide an informed recommendation, someone needs to investigate how exactly the chart provisions nodes and what exactly is done to recreate them.

One hypothesis we had was that nodes are reset upon recreation which is wrong (a reset deletes all data).

Server logs from all nodes are critically important in investigating this outcome. Consider inspecting them.

--
Staff Software Engineer, Pivotal/RabbitMQ

Petr Šebek

Apr 9, 2018, 8:13:23 AM
to rabbitmq-users
I had the same problem, and I discovered that the only problem for me was the `cluster_formation.node_cleanup.only_log_warning` setting being set to false. According to http://www.rabbitmq.com/cluster-formation.html#node-health-checks-and-cleanup it performs node deletion when a node leaves the cluster, which is certainly not what I want for a clustered setup. Setting it to true fixed the inconsistent_cluster issue.
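For clarity, the rabbitmq.conf line in question:

cluster_formation.node_cleanup.only_log_warning = true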

Michael Klishin

Apr 9, 2018, 9:35:48 AM
to rabbitm...@googlegroups.com
Thank you, Petr.

To make it clear, "only_log_warning = false" will indeed remove a node from the cluster when it disappears from the list reported by the backend, even if the disappearance is transient. That's why only_log_warning defaults to true:


However, this example sets it to `false`:

I'll have to check with the author to understand the intent.


Michael Klishin

Apr 9, 2018, 10:38:23 AM
to rabbitm...@googlegroups.com
I clarified the intent with Gabriele.

The goal was to demonstrate cluster scaling up and down. The automatic node removal feature
makes most sense in environments where failed nodes are replaced with blank ones,
namely AWS Autoscaling Groups.

We will change the default to only emit a warning. The downside of that setting is that nodes
that are gone for good will have to be manually deleted by the operator.

--
MK

Staff Software Engineer, Pivotal/RabbitMQ

Michael Klishin

Apr 9, 2018, 11:04:19 AM
to rabbitm...@googlegroups.com
So apparently the Kubernetes Chart adopted the config we used in our example verbatim.

We corrected the example and submitted the PR for the Chart.

Never deploy examples you find on the Internet into production, folks. Not without a careful
review anyway.

Michael Klishin

Apr 9, 2018, 11:47:27 AM
to rabbitm...@googlegroups.com


Many thanks to Petr Šebek for uncovering this mystery.

Michael Klishin

Apr 18, 2018, 11:12:51 PM
to rabbitm...@googlegroups.com
FTR, https://github.com/kubernetes/charts/pull/4823 was accepted and should ship with the 1.3.2 release of the chart.

Thanks for the productive discussion everyone, and in particular Petr :)

Matt Yule-Bennett

Jun 1, 2018, 7:11:43 AM
to rabbitmq-users
With this change in place, the unreachable node remains part of the cluster after pods are replaced. Is there a recommended approach for removing such nodes other than periodic manual cleanup?

We have at least once seen the situation where a queue created by a federation consumer remained "homed" to an offline node. This may or may not be a bug in the federation setup, but it caused an outage for us because messages published to the federated exchange to which the queue was bound could not be confirmed.

While thinking about how to solve this problem I wondered why the nodename is set to the IP address rather than the name of the pod. In a stateful set the pod names are consistent, which would allow a replacement pod to rejoin the cluster without changing the identity of the node. A caveat is that you have to use fully qualified domain names and add a headless service so that pods in the namespace can resolve themselves by name. Is there a reason why this would not work?

Thanks,
Matt.


Michael Klishin

Jun 1, 2018, 7:15:52 AM
to rabbitm...@googlegroups.com
Please start new threads for new questions.

You have two options:

 * Manually delete nodes (recommended, RabbitMQ cannot possibly know when it should happen)
 * Use the automatic forced removal of unreachable nodes (the risks of which are documented and discussed in this thread)

The latter option has a configurable node inactivity timeout, which can be set fairly high (say, hours).
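A hedged sketch of what that could look like in rabbitmq.conf, assuming the cleanup check interval (in seconds) is the knob referred to here; verify against the cluster formation guide:

# risky: forcefully removes nodes missing from the discovery backend's list
cluster_formation.node_cleanup.only_log_warning = false
# how often the cleanup check runs, in seconds (set high, e.g. 4 hours)
cluster_formation.node_cleanup.interval = 14400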


Michael Klishin

Jun 1, 2018, 7:18:11 AM
to rabbitm...@googlegroups.com
From [1]:

«When a list of peer nodes is computed from a list of pod containers returned by Kubernetes, either hostnames or IP addresses can be used. This is configurable using the cluster_formation.k8s.address_type key […]
Supported values are ip or hostname, the former is used by default…»

Pascal Larivee

Jun 1, 2018, 8:05:53 AM
to rabbitm...@googlegroups.com
I'm using a StatefulSet with hostnames, and nodes rejoin the cluster after being offline and find their data OK. So I do not have to deal with nodes that are no longer part of the cluster (unless scaling down the cluster).


--
Pascal Larivée

Michael Klishin

Jun 1, 2018, 8:45:16 AM
to rabbitm...@googlegroups.com
Thanks for the data point, Pascal.

I think it makes more sense to default to “hostname” now that we highly recommend stateful sets in the docs. Unfortunately this is not the kind of change that would be safe to ship in 3.7.x so updating the docs is the best we can do in the short term.

Do you agree with the default change idea?

Pascal Larivee

Jun 4, 2018, 9:42:42 AM
to rabbitmq-users
I do agree with the change, as it will make things simpler for new users and prevent a bad experience. But it will remove the pleasure of figuring out how to do it ;)
Pascal Larivée

Michael Klishin

Jun 4, 2018, 11:01:03 AM
to rabbitm...@googlegroups.com
May I ask you to file an issue with some details from this thread and a link to it (in Google Groups)?


Thank you.


8239...@163.com

Jul 29, 2018, 10:06:59 PM
to rabbitmq-users
Hello guys,

I also have the same problem. We use hostnames.
When a node is down for a long time, the other cluster nodes clear it. But when the node comes back, it cannot rejoin the cluster, so it fails.

If you have any solution, please tell me.
Thanks

On Thursday, December 21, 2017 at 6:00:45 PM UTC+8, Nils Peeters wrote:

Michael Klishin

Jul 30, 2018, 4:26:36 PM
to rabbitm...@googlegroups.com
The solution is to not enable automatic node cleanup, since what's described in this thread is the primary con (risk) of that feature. It is pretty explicitly stated in the docs now [1].

