Hi, we are running an old version of RabbitMQ (3.8.11), which we plan to upgrade soon. In the meantime, I'd like to understand the root cause of the error below and how to resolve it properly.
I have a cluster of 3 RabbitMQ nodes. One of them, rabbitmq-2, fails to boot with the following log:
2022-11-18 01:48:03.613 [info] <0.44.0> Application lager started on node 'rab...@rabbitmq-2.rabbitmq-service.example-qa.svc.cluster.local'
2022-11-18 01:48:03.918 [debug] <0.288.0> Lager installed handler lager_backend_throttle into lager_event
2022-11-18 01:48:19.306 [info] <0.44.0> Application mnesia started on node 'rab...@rabbitmq-2.rabbitmq-service.example-qa.svc.cluster.local'
2022-11-18 01:48:19.309 [info] <0.44.0> Application mnesia exited with reason: stopped
2022-11-18 01:48:19.316 [info] <0.272.0> Feature flags: list of feature flags found:
2022-11-18 01:48:19.316 [info] <0.272.0> Feature flags: [x] drop_unroutable_metric
2022-11-18 01:48:19.316 [info] <0.272.0> Feature flags: [x] empty_basic_get_metric
2022-11-18 01:48:19.316 [info] <0.272.0> Feature flags: [x] implicit_default_bindings
2022-11-18 01:48:19.316 [info] <0.272.0> Feature flags: [x] maintenance_mode_status
2022-11-18 01:48:19.316 [info] <0.272.0> Feature flags: [x] quorum_queue
2022-11-18 01:48:19.316 [info] <0.272.0> Feature flags: [x] user_limits
2022-11-18 01:48:19.316 [info] <0.272.0> Feature flags: [x] virtual_host_metadata
2022-11-18 01:48:19.317 [info] <0.272.0> Feature flags: feature flag states written to disk: yes
2022-11-18 01:48:19.513 [info] <0.44.0> Application mnesia started on node 'rab...@rabbitmq-2.rabbitmq-service.example-qa.svc.cluster.local'
2022-11-18 01:48:19.516 [info] <0.44.0> Application mnesia exited with reason: stopped
2022-11-18 01:48:19.518 [error] <0.272.0>
2022-11-18 01:48:19.518 [error] <0.272.0> BOOT FAILED
2022-11-18 01:48:19.518 [error] <0.272.0> ===========
2022-11-18 01:48:19.519 [error] <0.272.0> Error during startup: {error,
2022-11-18 01:48:19.519 [error] <0.272.0> {inconsistent_cluster,
2022-11-18 01:48:19.519 [error] <0.272.0> "Node 'rab...@rabbitmq-2.rabbitmq-service.example-qa.svc.cluster.local' thinks it's clustered with node 'rab...@rabbitmq-1.rabbitmq-service.example-qa.svc.cluster.local', but 'rab...@rabbitmq-1.rabbitmq-service.example-qa.svc.cluster.local' disagrees"}}
2022-11-18 01:48:19.519 [error] <0.272.0>
2022-11-18 01:48:20.520 [info] <0.271.0> [{initial_call,{application_master,init,['Argument__1','Argument__2','Argument__3','Argument__4']}},{pid,<0.271.0>},{registered_name,[]},{error_info,{exit,{{inconsistent_cluster,"Node 'rab...@rabbitmq-2.rabbitmq-service.example-qa.svc.cluster.local' thinks it's clustered with node 'rab...@rabbitmq-1.rabbitmq-service.example-qa.svc.cluster.local', but 'rab...@rabbitmq-1.rabbitmq-service.example-qa.svc.cluster.local' disagrees"},{rabbit,start,[normal,[]]}},[{application_master,init,4,[{file,"application_master.erl"},{line,138}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}},{ancestors,[<0.270.0>]},{message_queue_len,1},{messages,[{'EXIT',<0.272.0>,normal}]},{links,[<0.270.0>,<0.44.0>]},{dictionary,[]},{trap_exit,true},{status,running},{heap_size,1598},{stack_size,28},{reductions,322}], []
2022-11-18 01:48:20.520 [error] <0.271.0> CRASH REPORT Process <0.271.0> with 0 neighbours exited with reason: {{inconsistent_cluster,"Node 'rab...@rabbitmq-2.rabbitmq-service.example-qa.svc.cluster.local' thinks it's clustered with node 'rab...@rabbitmq-1.rabbitmq-service.example-qa.svc.cluster.local', but 'rab...@rabbitmq-1.rabbitmq-service.example-qa.svc.cluster.local' disagrees"},{rabbit,start,[normal,[]]}} in application_master:init/4 line 138
2022-11-18 01:48:20.521 [info] <0.44.0> Application rabbit exited with reason: {{inconsistent_cluster,"Node 'rab...@rabbitmq-2.rabbitmq-service.example-qa.svc.cluster.local' thinks it's clustered with node 'rab...@rabbitmq-1.rabbitmq-service.example-qa.svc.cluster.local', but 'rab...@rabbitmq-1.rabbitmq-service.example-qa.svc.cluster.local' disagrees"},{rabbit,start,[normal,[]]}}
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{{inconsistent_cluster,\"Node 'rab...@rabbitmq-2.rabbitmq-service.example-qa.svc.cluster.local' thinks it's clustered with node 'rab...@rabbitmq-1.rabbitmq-service.example-qa.svc.cluster.local', but 'rab...@rabbitmq-1.rabbitmq-service.example-qa.svc.cluster.local' disagrees\"},{rabbit,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{{inconsistent_cluster,"Node 'rab...@rabbitmq-2.rabbitmq-service.example-qa.svc.cluster.local' thinks it's clustered w
Crash dump is being written to: /var/log/rabbitmq/erl_crash.dump...done
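
In case it helps with triage, here is a minimal Python sketch for pulling the node names out of the inconsistent_cluster message, so it is clear which node believes it is clustered and which one disagrees. The regular expression is based only on the wording in the log above and may not match other RabbitMQ versions. The function name and the sample node names (rabbit@node-a, rabbit@node-b) are hypothetical, used for illustration only.

```python
import re

# Pattern built from the error text in the log above. The exact wording is an
# assumption taken from this one log and may differ between RabbitMQ versions.
INCONSISTENT = re.compile(
    r"Node '(?P<local>[^']+)' thinks it's clustered with node "
    r"'(?P<peer>[^']+)', but '(?P<disagreeing>[^']+)' disagrees"
)


def find_inconsistent_pairs(log_text):
    """Return the unique (local, peer) node pairs named in inconsistent_cluster errors."""
    pairs = []
    for match in INCONSISTENT.finditer(log_text):
        pair = (match.group("local"), match.group("peer"))
        if pair not in pairs:  # the same message is repeated many times in the log
            pairs.append(pair)
    return pairs


if __name__ == "__main__":
    # Hypothetical sample line; node names are placeholders, not real hosts.
    sample = (
        "Error during startup: {error, {inconsistent_cluster, "
        "\"Node 'rabbit@node-b' thinks it's clustered with node "
        "'rabbit@node-a', but 'rabbit@node-a' disagrees\"}}"
    )
    for local, peer in find_inconsistent_pairs(sample):
        print(f"{local} believes it is clustered with {peer}; {peer} disagrees")
```

With the pair identified, the output of `rabbitmqctl cluster_status` on each of the two nodes can be compared to see which members each one lists.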
Thanks!