I don't know why the RabbitMQ operator has stopped working (I have had it working on previous occasions).
I am using EKS on AWS with Kubernetes version 1.21.
The installation of the RabbitMQ cluster operator and the topology operator seems to work properly:
NAME                                          READY   STATUS    RESTARTS   AGE
messaging-topology-operator-f9c69d45b-xnmns   1/1     Running   0          84m
rabbitmq-cluster-operator-7cbf865f89-7lvsp    1/1     Running   0          84m
The problem is that when I try to deploy a RabbitMQ cluster using the operator, the pod keeps restarting and never reaches the Ready state:
Normal Scheduled 29m default-scheduler Successfully assigned dev/rabbitmq-server-0 to ip-10-50-121-78.eu-central-1.compute.internal
Normal SuccessfulAttachVolume 29m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-3c1b2424-6a32-4428-8f64-53448ed79a53"
Normal Pulled 29m kubelet Container image "rabbitmq:3.8.21-management" already present on machine
Normal Created 29m kubelet Created container setup-container
Normal Started 29m kubelet Started container setup-container
Normal Created 26m (x2 over 28m) kubelet Created container rabbitmq
Normal Started 26m (x2 over 28m) kubelet Started container rabbitmq
Normal Pulled 18m (x5 over 28m) kubelet Container image "rabbitmq:3.8.21-management" already present on machine
Warning Unhealthy 8m58s (x80 over 28m) kubelet Readiness probe failed: dial tcp 10.50.125.24:5672: connect: connection refused
Warning BackOff 3m46s (x46 over 24m) kubelet Back-off restarting failed container
Looking at the logs of the pod itself, I see that the node fails to boot with an epmd error:
kubectl logs -n dev rabbitmq-server-0 -f
WARNING: 'docker-entrypoint.sh' generated/modified the RabbitMQ configuration file, which will no longer happen in 3.9 and later! (https://github.com/docker-library/rabbitmq/pull/424)
Generated end result, for reference:
------------------------------------
loopback_users.guest = false
total_memory_available_override_value = 268435456
listeners.tcp.default = 5672
management.tcp.port = 15672
------------------------------------
Configuring logger redirection
15:35:04.103 [warning] cluster_formation.randomized_startup_delay_range.min and cluster_formation.randomized_startup_delay_range.max are deprecated
15:36:03.455 [error]
15:36:03.455 [error] BOOT FAILED
15:36:03.455 [error] ===========
15:36:03.455 [error] ERROR: epmd error for host rabbitmq-server-0.rabbitmq-nodes.dev: timeout (timed out)
15:36:03.455 [error]
BOOT FAILED
===========
ERROR: epmd error for host rabbitmq-server-0.rabbitmq-nodes.dev: timeout (timed out)
15:36:04.456 [error] Supervisor rabbit_prelaunch_sup had child prelaunch started with rabbit_prelaunch:run_prelaunch_first_phase() at undefined exit with reason {epmd_error,"rabbitmq-server-0.rabbitmq-nodes.dev",timeout} in context start_error
15:36:04.457 [error] CRASH REPORT Process <0.152.0> with 0 neighbours exited with reason: {{shutdown,{failed_to_start_child,prelaunch,{epmd_error,"rabbitmq-server-0.rabbitmq-nodes.dev",timeout}}},{rabbit_prelaunch_app,start,[normal,[]]}} in application_master:init/4 line 142
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbitmq_prelaunch,{{shutdown,{failed_to_start_child,prelaunch,{epmd_error,\"rabbitmq-server-0.rabbitmq-nodes.dev\",timeout}}},{rabbit_prelaunch_app,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,rabbitmq_prelaunch,{{shutdown,{failed_to_start_child,prelaunch,{epmd_error,"rabbitmq-server-0.rabbitmq-nodes.dev",timeout}}},
Crash dump is being written to: erl_crash.dump...%
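To tell a DNS problem apart from an unreachable port, a check along these lines could be run from another pod in the dev namespace. This is only a sketch under assumptions: it assumes Python is available in that pod and that epmd is listening on its default port 4369, and the names host_from_epmd_error and check_epmd are made up for illustration.

```python
import re
import socket

EPMD_PORT = 4369  # default epmd port; an assumption, adjust if overridden


def host_from_epmd_error(line):
    """Extract the host name from an 'epmd error for host ...' log line."""
    match = re.search(r"epmd error for host ([^\s:]+):", line)
    return match.group(1) if match else None


def check_epmd(host, port=EPMD_PORT, timeout=5.0):
    """Resolve the host, then try a plain TCP connection to the epmd port."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"DNS lookup failed for {host}: {exc}"
    address = infos[0][4][0]  # first resolved IP address
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return f"{host} -> {address}, port {port} reachable"
    except OSError as exc:
        return f"{host} -> {address}, port {port} not reachable: {exc}"
```

Calling check_epmd(host_from_epmd_error(line)) with the BOOT FAILED line from the log would report whether rabbitmq-server-0.rabbitmq-nodes.dev resolves at all, and if so whether the port accepts connections.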
Has anyone had a similar problem? Last week it was working perfectly and I haven't changed any of the manifests.
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: rabbitmq
  namespace: dev
  annotations:
    rabbitmq.com/topology-allowed-namespaces: dev
spec:
  resources:
    requests:
      cpu: 250m
      memory: 256Mi
    limits:
      cpu: 250m
      memory: 256Mi
  replicas: 1
  tolerations:
    - key: dedicated
      operator: Equal
      value: spot
      effect: NoSchedule
  rabbitmq:
    additionalPlugins:
      - rabbitmq_mqtt
      - rabbitmq_management
      - rabbitmq_management_agent
      - rabbitmq_top
      - rabbitmq_shovel
      - rabbitmq_shovel_management
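For reference, the 256Mi memory limit in the manifest is the same number as the total_memory_available_override_value = 268435456 shown in the generated configuration, so the node is working with 256 MiB in total. A quick sketch of the conversion follows. Note that quantity_to_bytes is a hypothetical helper that only handles the Ki/Mi/Gi suffixes, not the full Kubernetes quantity syntax.

```python
# Binary suffixes used by Kubernetes resource quantities (powers of 1024).
BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def quantity_to_bytes(quantity):
    """Convert a quantity such as '256Mi' to bytes (Ki/Mi/Gi only)."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: already a plain byte count


limit_bytes = quantity_to_bytes("256Mi")
print(limit_bytes == 268435456)  # prints True
```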
Thank you very much