RabbitMQ 3.9 startup crash with k8s peer discovery on EKS

Mark Summers

Sep 28, 2022, 8:03:50 AM

to rabbitmq-users

RabbitMQ 3.9 crashes at startup when I try to use the Kubernetes (k8s) peer discovery plugin on EKS. Has anyone encountered this failure mode before?

Please note: This happens after the plugin is able to contact the Kubernetes API, so it is somehow related to the data that was returned. If the plugin simply cannot connect to the endpoint (e.g. nxdomain), then startup proceeds and logging is initialised, with various errors and warnings printed to the logs intermittently.

It looks to me as though something about my setup is causing the discovery plugin to go down a code path that isn't normally used. I say this because, looking at the code, I don't see how it could ever have worked: it seems that a string value is being passed to a function (list_to_atom) that expects a list, although the trace below actually fails in lists:nth/2, which is called with an empty list:

https://github.com/rabbitmq/rabbitmq-server/blob/v3.9.22/deps/rabbitmq_peer_discovery_common/src/rabbit_peer_discovery_util.erl#L219

BOOT FAILED
===========
Exception during startup:

error:function_clause

lists:nth/2, line 170
args: [1,[]]
rabbit_peer_discovery_util:node_name/1, line 219
lists:map/2, line 1243
rabbit_peer_discovery_k8s:list_nodes/0, line 44
rabbit_peer_discovery_k8s:lock/1, line 76
rabbit_peer_discovery:lock/0, line 190
rabbit_mnesia:init_with_lock/3, line 104
rabbit_mnesia:init/0, line 76

Error:
rabbit_is_not_running
Starting broker...{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{function_clause,{rabbit,start,[normal,[]]}}}"}

Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{function_clause,{rabbit,start,[normal,[]]}}})
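
The `args: [1,[]]` frame suggests that something was split into zero parts before `lists:nth(1, ...)` was called. The sketch below is a Python model, not the plugin's Erlang code, of one way that could happen. It assumes the node name is built by taking the first dot-separated label of a hostname returned by the Kubernetes API. The names `first_label` and `node_name`, and the `rabbit` prefix, are hypothetical and chosen only for illustration.

```python
def first_label(hostname: str) -> str:
    """Return the first dot-separated label of a hostname.

    Assumption: this models Erlang's string:tokens(Value, ".") followed by
    lists:nth(1, Tokens). Empty tokens are dropped, so an empty hostname
    yields an empty token list.
    """
    tokens = [t for t in hostname.split(".") if t]
    if not tokens:
        # Analogue of lists:nth(1, []) failing with function_clause.
        raise ValueError(f"no usable label in hostname {hostname!r}")
    return tokens[0]


def node_name(hostname: str, prefix: str = "rabbit") -> str:
    """Build a node name such as 'rabbit@host' (hypothetical helper)."""
    if "@" in hostname:
        # Already a full node name, so use it unchanged.
        return hostname
    return f"{prefix}@{first_label(hostname)}"


if __name__ == "__main__":
    print(node_name("rabbitmq-0.rabbitmq-headless.default.svc.cluster.local"))
    try:
        node_name("")
    except ValueError as exc:
        print("failed:", exc)
```

If this reading is right, a single endpoint entry with an empty hostname or address would be enough to abort the boot, so the endpoints data returned for the service may be worth inspecting.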
