RabbitMQ is not coming up on Kubernetes cluster


mohan m

Dec 12, 2019, 4:25:59 AM
to rabbitmq-users
I'm new to RabbitMQ. I'm trying to deploy RabbitMQ in a Kubernetes cluster using the StatefulSet method.

kubectl describe pod rabbitmq-0 lists the error below:

============================================================================
 Type     Reason     Age                 From                          Message
  ----     ------     ----                ----                          -------
  Warning  Unhealthy  17m (x67 over 81m)  kubelet, kube-node03  (combined from similar events): Readiness probe failed: Error: unable to perform an operation on node 'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local'. Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
=============================================

Can someone help me fix this issue?

Thanks
Mohan M



Wesley Peng

Dec 12, 2019, 6:13:00 AM
to rabbitm...@googlegroups.com
On 2019/12/12 5:25 PM, mohan m wrote:
>   Warning  Unhealthy  17m (x67 over 81m)  kubelet, kube-node03
> (combined from similar events): Readiness probe failed: Error: unable to
> perform an operation on node
> 'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local'. Please see
> diagnostics information and suggestions below.

Hi,

First, please check that the hostname
"rabbitmq-0.rabbitmq.default.svc.cluster.local" and its short name are
resolvable, either via DNS or the local hosts file.

regards.

Gabriele Santomaggio

Dec 12, 2019, 8:13:30 AM
to rabbitmq-users
How are you trying to deploy RabbitMQ in K8s?
With Helm?
The RabbitMQ GitHub example?
Your own YAML file?

You could also have a permission issue.

-
Gabriele Santomaggio

mohan m

Dec 12, 2019, 1:03:19 PM
to rabbitm...@googlegroups.com
The RabbitMQ GitHub example?

Yes, the RabbitMQ GitHub example.

--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/rabbitmq-users/8f184b97-4a47-4df2-ac31-868281afa702%40googlegroups.com.

mohan m

Dec 12, 2019, 2:23:09 PM
to rabbitm...@googlegroups.com
This is the error message I'm getting now:

{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,\"{failed_connect,[{to_address,{\\"kubernetes.postgres.svc.cluster.local\\",443}},\n   

Gabriele Santomaggio

Dec 12, 2019, 3:33:17 PM
to rabbitmq-users
I think you have a permission issue. Try putting the log in debug mode;
you will have more information about your error.




mohan m

Dec 13, 2019, 3:56:15 AM
to rabbitmq-users

I made some modifications. I'm now receiving the error below while starting the container. Can someone help with this?

 kubectl logs rabbitmq-0 -n postgres
  ##  ##
  ##  ##      RabbitMQ 3.7.10. Copyright (C) 2007-2018 Pivotal Software, Inc.
  ##########  Licensed under the MPL.  See http://www.rabbitmq.com/
  ######  ##
  ##########  Logs: <stdout>
              Starting broker...
2019-12-13 08:44:20.597 [info] <0.219.0>
 Starting RabbitMQ 3.7.10 on Erlang 21.2.3
 Copyright (C) 2007-2018 Pivotal Software, Inc.
 Licensed under the MPL.  See http://www.rabbitmq.com/
2019-12-13 08:44:20.605 [info] <0.219.0>
 node           : rab...@rabbitmq-0.rabbitmq.postgres.svc.cluster.local
 home dir       : /var/lib/rabbitmq
 config file(s) : /etc/rabbitmq/rabbitmq.conf
 cookie hash    : XhdCf8zpVJeJ0EHyaxszPg==
 log(s)         : <stdout>
 database dir   : /var/lib/rabbitmq/mnesia/rab...@rabbitmq-0.rabbitmq.postgres.svc.cluster.local
2019-12-13 08:44:22.588 [info] <0.227.0> Memory high watermark set to 3997 MiB (4191661260 bytes) of 9993 MiB (10479153152 bytes) total
2019-12-13 08:44:22.591 [info] <0.229.0> Enabling free disk space monitoring
2019-12-13 08:44:22.592 [info] <0.229.0> Disk free limit set to 50MB
2019-12-13 08:44:22.595 [info] <0.232.0> Limiting to approx 1048476 file handles (943626 sockets)
2019-12-13 08:44:22.595 [info] <0.233.0> FHC read buffering:  OFF
2019-12-13 08:44:22.595 [info] <0.233.0> FHC write buffering: ON
2019-12-13 08:44:22.596 [info] <0.219.0> Node database directory at /var/lib/rabbitmq/mnesia/rab...@rabbitmq-0.rabbitmq.postgres.svc.cluster.local is empty. Assuming we need to join an existing cluster or initialise from scratch...
2019-12-13 08:44:22.596 [info] <0.219.0> Configured peer discovery backend: rabbit_peer_discovery_k8s
2019-12-13 08:44:22.596 [info] <0.219.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2019-12-13 08:44:22.596 [info] <0.219.0> Peer discovery backend does not support locking, falling back to randomized delay
2019-12-13 08:44:22.596 [info] <0.219.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized startup delay.
2019-12-13 08:44:22.603 [info] <0.219.0> Failed to get nodes from k8s - {failed_connect,[{to_address,{"kubernetes.postgres.svc.cluster.local",443}},
                 {inet,[inet],nxdomain}]}
2019-12-13 08:44:22.604 [error] <0.218.0> CRASH REPORT Process <0.218.0> with 0 neighbours exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.postgres.svc.cluster.local\",443}},\n                 {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 164 in application_master:init/4 line 138
2019-12-13 08:44:22.604 [info] <0.43.0> Application rabbit exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.postgres.svc.cluster.local\",443}},\n                 {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 164
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,\"{failed_connect,[{to_address,{\\"kubernetes.postgres.svc.cluster.local\\",443}},\n                 {inet,[inet],nxdomain}]}\"}},[{rabbit_mnesia,init_from_config,0,[{file,\"src/rabbit_mnesia.erl\"},{line,164}]},{rabbit_mnesia,init_with_lock,3,[
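For reference, the kubernetes.postgres.svc.cluster.local name in that failed_connect is the Kubernetes API endpoint the peer discovery plugin tries to contact (by default kubernetes.default.svc.cluster.local). The relevant rabbitmq.conf keys look roughly like this (a sketch; values are illustrative):

```ini
# Peer discovery via the Kubernetes API (rabbitmq_peer_discovery_k8s plugin).
cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
# nxdomain in the error above means this host does not resolve from the pod.
cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
cluster_formation.k8s.port = 443
cluster_formation.k8s.address_type = hostname
cluster_formation.k8s.service_name = rabbitmq
```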

mohan m

Dec 13, 2019, 4:13:17 AM
to rabbitmq-users

It looks like the k8s cluster does not pick up the correct node name as we configured it. Can someone help fix this issue?

Luke Bakken

Dec 13, 2019, 1:34:11 PM
to rabbitmq-users
Hi Mohan,

I searched this group using the terms kubernetes nxdomain:


One of those discussions may help you out.

Thanks,
Luke

mohan m

Dec 14, 2019, 3:13:03 AM
to rabbitm...@googlegroups.com
Thank you for your response. I will check it.

Thanks
Mohan M


Gabriele Santomaggio

Dec 14, 2019, 6:18:46 AM
to rabbitmq-users

In this link [1] you can find how to debug DNS problems.
You have to check your DNS using:

kubectl exec -ti busybox -- nslookup  rabbitmq-op-developing

Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      rabbitmq-op-developing
Address 1: 10.106.38.198 rabbitmq-op-developing.default.svc.cluster.local


The first part, "rabbitmq-op-developing", is the service name; you should have the same value in:

K8S_SERVICE_NAME = rabbitmq-op-developing

and the pod should be something like:

Name:  "RABBITMQ_NODENAME",
Value: fmt.Sprintf("rabbit@%s.%s.%s.svc.cluster.local", "$(MY_POD_NAME)", "$(K8S_SERVICE_NAME)", "$(MY_POD_NAMESPACE)"),
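The three pieces have to line up: the node name is just those values joined into the pod's DNS name. A quick shell sketch of the assembly, using illustrative values (inside the pod, MY_POD_NAME and MY_POD_NAMESPACE come from the Downward API):

```shell
# Illustrative values; in a real pod these are injected as env vars.
MY_POD_NAME=rabbitmq-0
K8S_SERVICE_NAME=rabbitmq-op-developing
MY_POD_NAMESPACE=default

# Same shape as the operator's fmt.Sprintf above.
RABBITMQ_NODENAME="rabbit@${MY_POD_NAME}.${K8S_SERVICE_NAME}.${MY_POD_NAMESPACE}.svc.cluster.local"
echo "$RABBITMQ_NODENAME"
# prints rabbit@rabbitmq-0.rabbitmq-op-developing.default.svc.cluster.local
```

If the K8S_SERVICE_NAME part does not match the Service that nslookup resolves, the node name points at a DNS name that does not exist.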



-
Gabriele Santomaggio

mohan m

Dec 16, 2019, 6:11:53 AM
to rabbitmq-users

============================================
kubectl get svc -n postgres
NAME             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                          AGE

rabbitmq         NodePort    10.254.136.51    <none>        15672:30002/TCP,5672:30003/TCP   3d15h
rabbitmq-lb      ClusterIP   None             <none>        15672/TCP,5672/TCP               3d15h

=======================================
 kubectl exec -it busybox /bin/sh -n postgres
/ $ nslookup rabbitmq
Server:         10.254.0.10
Address:        10.254.0.10:53


*** Can't find rabbitmq.svc.cluster.local: No answer
*** Can't find rabbitmq.cluster.local: No answer
*** Can't find rabbitmq.postgres.svc.cluster.local: No answer
*** Can't find rabbitmq.svc.cluster.local: No answer
*** Can't find rabbitmq.cluster.local: No answer
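One thing worth checking from this output: for per-pod names like rabbitmq-0.rabbitmq.postgres.svc.cluster.local to resolve, the StatefulSet's serviceName generally has to match a headless Service (clusterIP: None) whose selector matches the pod labels. A minimal sketch, assuming the app: rabbitmq label used in the manifest:

```yaml
# Sketch of a headless Service; the name must equal the StatefulSet's
# serviceName so that each pod gets a stable DNS record.
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq
  namespace: postgres
spec:
  clusterIP: None          # headless
  selector:
    app: rabbitmq
  ports:
    - name: amqp
      port: 5672
    - name: http
      port: 15672
```

With this in place, nslookup rabbitmq-0.rabbitmq.postgres.svc.cluster.local from busybox should return the pod IP.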


==================

Please find my StatefulSet YAML file:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
spec:
  serviceName: rabbitmq
  # Three nodes is the recommended minimum. Some features may require a majority of nodes
  # to be available.
  replicas: 3
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      serviceAccountName: rabbitmq
      terminationGracePeriodSeconds: 10

      containers:
      - name: rabbitmq-k8s
        image: rabbitmq:3.7.10
        lifecycle:
          postStart:
            exec:
              command:
                - /bin/sh
                - -c
                - >
                  until rabbitmqctl --erlang-cookie ${RABBITMQ_ERLANG_COOKIE} node_health_check; do sleep 1; done;
                  rabbitmqctl --erlang-cookie ${RABBITMQ_ERLANG_COOKIE} set_policy ha-all "" '{"ha-mode":"all", "ha-sync-mode": "automatic"}'
        securityContext:
          runAsUser: 999
          runAsGroup: 999
        imagePullPolicy: Always
#        command: [ "/bin/sh", "-c", "--" ]
#        args: [ "while true; do sleep 30; done;" ]
        volumeMounts:
          - name: config-volume
            mountPath: /etc/rabbitmq
        # Learn more about what ports various protocols use
        ports:
          - name: http
            containerPort: 15672
            protocol: TCP
          - name: amqp
            containerPort: 5672
            protocol: TCP
          - name: port3
            containerPort: 25672
            protocol: TCP
          - name: port4
            containerPort: 15672
            protocol: TCP
        livenessProbe:
          exec:
            command:
              - /bin/sh
              - -c
              - >
                curl -H "Authorization: Basic ${RABBITMQ_BASIC_AUTH}" http://localhost:15672/api/aliveness-test/%2F
          initialDelaySeconds: 15
          periodSeconds: 5
        readinessProbe:
          exec:
            command:
              - /bin/sh
              - -c
              - >
                curl -H "Authorization: Basic ${RABBITMQ_BASIC_AUTH}" http://localhost:15672/api/aliveness-test/%2F
          initialDelaySeconds: 15
          periodSeconds: 5

        env:
          - name: MY_POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: MY_POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          - name: RABBITMQ_USE_LONGNAME
            value: "true"
#          - name: RABBITMQ_BASIC_AUTH
#            value: 1234
          # See a note on cluster_formation.k8s.address_type in the config file section
#          - name: K8S_SERVICE_NAME
#            value: rabbitmq-internal
#          - name: RABBITMQ_DEFAULT_USER
#            value: mohan
#          - name: RABBITMQ_DEFAULT_PASS
#            value: secret_pass
          - name: RABBITMQ_NODENAME
#            value: rabbit@$(MY_POD_NAME).$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
            value: "rabbit@$(MY_POD_NAME).rabbitmq.$(MY_POD_NAMESPACE).svc.cluster.local"
          - name: K8S_HOSTNAME_SUFFIX
            value: .rabbitmq.$(MY_POD_NAMESPACE).svc.cluster.local
#            value: .$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
          - name: RABBITMQ_ERLANG_COOKIE
            value: "mycookie"
#          - name: K8S_ADDRESS_TYPE
#            value: "hostname"
          - name: K8S_SERVICE_NAME
            value: rabbitmq
          - name: RABBITMQ_ERL_COOKIE
            value: "mycookie"
#          - name: RABBITMQ_DEFAULT_USER
#            value: user
#          - name: RABBITMQ_DEFAULT_PASS
#            value: pass
#          - name: NODE_NAME
#            valueFrom:
#              fieldRef:
#                fieldPath: metadata.name
      volumes:
        - name: config-volume
          configMap:
            name: rabbitmq-config
            items:
            - key: rabbitmq.conf
              path: rabbitmq.conf
            - key: enabled_plugins
              path: enabled_plugins
  volumeClaimTemplates:
    - metadata:
        name: data-rabbitmq
      spec:
        storageClassName: cinder
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 2Gi


========================

I'm still receiving the error message below:

2019-12-16 11:09:46.064 [info] <0.219.0> Node database directory at /var/lib/rabbitmq/mnesia/rab...@rabbitmq-0.rabbitmq.postgres.svc.cluster.local is empty. Assuming we need to join an existing cluster or initialise from scratch...
2019-12-16 11:09:46.064 [info] <0.219.0> Configured peer discovery backend: rabbit_peer_discovery_k8s
2019-12-16 11:09:46.064 [info] <0.219.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2019-12-16 11:09:46.064 [info] <0.219.0> Peer discovery backend does not support locking, falling back to randomized delay
2019-12-16 11:09:46.064 [info] <0.219.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized startup delay.
2019-12-16 11:09:46.071 [info] <0.219.0> Failed to get nodes from k8s - {failed_connect,[{to_address,{"kubernetes.postgres.svc.cluster.local",443}},
                 {inet,[inet],nxdomain}]}
2019-12-16 11:09:46.072 [error] <0.218.0> CRASH REPORT Process <0.218.0> with 0 neighbours exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.postgres.svc.cluster.local\",443}},\n                 {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 164 in application_master:init/4 line 138
2019-12-16 11:09:46.072 [info] <0.43.0> Application rabbit exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.postgres.svc.cluster.local\",443}},\n                 {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 164
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,\"{failed_connect,[{to_address,{\\"kubernetes.postgres.svc.cluster.local\\",443}},\n                 {inet,[inet],nxdomain}]}\"}},[{rabb

Gabriele Santomaggio

unread,
Dec 17, 2019, 3:50:36 AM12/17/19
to rabbitmq-users

mohan m

Dec 17, 2019, 6:10:57 AM
to rabbitm...@googlegroups.com
Hi,

Thanks for your response. I have updated the settings cluster_formation.k8s.host and cluster_formation.k8s.port with the Kubernetes endpoint (obtained via kubectl get ep) and its port. Now the pod is up and running.

Like below:

cluster_formation.k8s.host = <ip address of kubernetes endpoint>   # obtained via kubectl get ep
cluster_formation.k8s.port = 6443
cluster_formation.k8s.address_type = ip
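For completeness, the peer-discovery section of rabbitmq.conf then looks roughly like this (a sketch; the host value is whatever kubectl get ep kubernetes reports in your cluster):

```ini
cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
cluster_formation.k8s.host = 10.233.0.1   # illustrative API endpoint IP
cluster_formation.k8s.port = 6443
# With address_type = ip the plugin returns pod IPs, so discovered peers
# are expected to be named rabbit@<pod IP>; hostname-based node names
# (rabbit@rabbitmq-0...) will not match what discovery returns.
cluster_formation.k8s.address_type = ip
```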


But the cluster status below is shown on the rabbitmq-0 pod only, even though rabbitmq-1 and rabbitmq-2 are running.
===========================================================
kubectl exec   $FIRST_POD rabbitmq-diagnostics cluster_status
Cluster status of node rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local ...
[{nodes,[{disc,['rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local']}]},
 {running_nodes,['rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local']},
 {cluster_name,<<"rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local">>},
 {partitions,[]},
 {alarms,[{'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local',[]}]}]

=============

Pod list below:
rabbitmq-0                                 1/1     Running            0          111m
rabbitmq-1                                 1/1     Running            0          111m
rabbitmq-2                                 1/1     Running            0          110m

Can you help me understand why pods rabbitmq-1 and rabbitmq-2 are not listed in the cluster status?


mohan m

Dec 17, 2019, 9:31:53 PM
to rabbitmq-users

Hi,


We have deployed RabbitMQ using the StatefulSet method. Three pods are running, as below:

rabbitmq-0   1/1   Running   0   11m
rabbitmq-1   1/1   Running   0   10m
rabbitmq-2   1/1   Running   0   10m

============================================

Please find a sample log below from one of the pods:

2019-12-18 01:59:43.451 [info] <0.33.0> Application rabbitmq_management started on node 'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local' completed with 5 plugins.
2019-12-18 01:59:43.685 [info] <0.5.0> Server startup complete; 5 plugins started.

  • rabbitmq_management
  • rabbitmq_web_dispatch
  • rabbitmq_peer_discovery_k8s
  • rabbitmq_management_agent
  • rabbitmq_peer_discovery_common

==============================================

My concern is that each pod shows a cluster status, but the other two pods are not listed in it. For example, I logged on to pod rabbitmq-0 and could see the status below; however, pods rabbitmq-1 and rabbitmq-2 are not listed:

rabbitmq@rabbitmq-0:/$ rabbitmqctl cluster_status
Cluster status of node rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local ...
[{nodes,[{disc,['rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local']}]},
 {running_nodes,['rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local']},
 {cluster_name,<<"rab...@rabbitmq-0.rabbitmq.defaulf.svc.cluster.local">>},
 {partitions,[]},
 {alarms,[{'rab...@rabbitmq-0.rabbitmq.defaulf.svc.cluster.local',[]}]}]

Do we still have an issue joining the nodes? Can anyone help me understand this?






Wesley Peng

Dec 17, 2019, 9:36:15 PM
to rabbitm...@googlegroups.com
Hi

On 2019/12/18 10:31, mohan m wrote:
> rabbitmq@rabbitmq-0:/$ rabbitmqctl cluster_status
> Cluster status of node rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local ...
> [{nodes,[{disc,['rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local']}]},
> {running_nodes,['rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local']},
> {cluster_name,<<"rab...@rabbitmq-0.rabbitmq.defaulf.svc.cluster.local">>},
> {partitions,[]},
> {alarms,[{'rab...@rabbitmq-0.rabbitmq.defaulf.svc.cluster.local',[]}]}]

From the info above, the cluster was not created successfully. Have you
followed the guide?
https://www.rabbitmq.com/clustering.html


regards.

mohan m

Dec 18, 2019, 10:03:49 AM
to rabbitmq-users
The RabbitMQ service came up standalone on each pod. It seems peer discovery is not happening properly, which prevents the pods from joining the cluster. Has anyone faced the same issue? Can anyone help me fix it? Please find the log below.

=================================================================

2019-12-17 09:16:05.784 [info] <0.215.0> Node database directory at /var/lib/rabbitmq/mnesia/rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local is empty. Assuming we need to join an existing cluster or initialise from scratch...
2019-12-17 09:16:05.784 [info] <0.215.0> Configured peer discovery backend: rabbit_peer_discovery_k8s
2019-12-17 09:16:05.785 [info] <0.215.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2019-12-17 09:16:05.785 [info] <0.215.0> Peer discovery backend does not support locking, falling back to randomized delay
2019-12-17 09:16:05.785 [info] <0.215.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized startup delay.
2019-12-17 09:16:05.815 [info] <0.215.0> All discovered existing cluster peers:
2019-12-17 09:16:05.815 [info] <0.215.0> Discovered no peer nodes to cluster with
2019-12-17 09:16:05.818 [info] <0.43.0> Application mnesia exited with reason: stopped
2019-12-17 09:16:05.869 [info] <0.215.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2019-12-17 09:16:05.891 [info] <0.215.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2019-12-17 09:16:05.914 [info] <0.215.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2019-12-17 09:16:05.915 [info] <0.215.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping registration.
2019-12-17 09:16:05.915 [info] <0.215.0> Priority queues enabled, real BQ is rabbit_variable_queue
2019-12-17 09:16:05.918 [info] <0.411.0> Starting rabbit_node_monitor
2019-12-17 09:16:05.938 [info] <0.215.0> message_store upgrades: 1 to apply
2019-12-17 09:16:05.938 [info] <0.215.0> message_store upgrades: Applying rabbit_variable_queue:move_messages_to_vhost_store
2019-12-17 09:16:05.938 [info] <0.215.0> message_store upgrades: No durable queues found. Skipping message store migration
2019-12-17 09:16:05.938 [info] <0.215.0> message_store upgrades: Removing the old message store data
2019-12-17 09:16:05.939 [info] <0.215.0> message_store upgrades: All upgrades applied successfully
2019-12-17 09:16:05.963 [info] <0.215.0> Management plugin: using rates mode 'basic'
2019-12-17 09:16:05.965 [info] <0.215.0> Adding vhost '/'
2019-12-17 09:16:05.974 [info] <0.450.0> Making sure data directory '/var/lib/rabbitmq/mnesia/rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local/msg_stores/vhosts/628WB79CIFDYO9LJI6DKMI09L' for vhost '/' exists
2019-12-17 09:16:05.976 [info] <0.450.0> Starting message stores for vhost '/'
2019-12-17 09:16:05.977 [info] <0.454.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_transient": using rabbit_msg_store_ets_index to provide index
2019-12-17 09:16:05.977 [info] <0.450.0> Started message store of type transient for vhost '/'
2019-12-17 09:16:05.978 [info] <0.457.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_persistent": using rabbit_msg_store_ets_index to provide index
2019-12-17 09:16:05.978 [warning] <0.457.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_persistent": rebuilding indices from scratch
2019-12-17 09:16:05.979 [info] <0.450.0> Started message store of type persistent for vhost '/'
2019-12-17 09:16:05.980 [info] <0.215.0> Creating user 'guest'
2019-12-17 09:16:05.981 [info] <0.215.0> Setting user tags for user 'guest' to [administrator]
2019-12-17 09:16:05.982 [info] <0.215.0> Setting permissions for 'guest' in '/' to '.*', '.*', '.*'
2019-12-17 09:16:05.984 [warning] <0.481.0> Setting Ranch options together with socket options is deprecated. Please use the new map syntax that allows specifying socket options separately from other options.
2019-12-17 09:16:05.985 [info] <0.495.0> started TCP listener on [::]:5672
2019-12-17 09:16:05.987 [info] <0.215.0> Setting up a table for connection tracking on this node: 'tracked_connection_on_node_rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2019-12-17 09:16:05.989 [info] <0.215.0> Setting up a table for per-vhost connection counting on this node: 'tracked_connection_per_vhost_on_node_rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2019-12-17 09:16:05.991 [info] <0.549.0> Peer discovery: enabling node cleanup (will only log warnings). Check interval: 30 seconds.
2019-12-17 09:16:06.037 [info] <0.558.0> Management plugin: HTTP (non-TLS) listener started on port 15672
2019-12-17 09:16:06.037 [info] <0.664.0> Statistics database started.
2019-12-17 09:16:06.216 [info] <0.8.0> Server startup complete; 5 plugins started.
 * rabbitmq_peer_discovery_k8s
 * rabbitmq_management
 * rabbitmq_web_dispatch
 * rabbitmq_peer_discovery_common
 * rabbitmq_management_agent
 completed with 5 plugins.


mohan m

unread,
Dec 23, 2019, 10:58:19 AM12/23/19
to rabbitmq-users
Can anyone help fix this issue?

Michael Klishin

Jan 2, 2020, 6:31:50 AM
to rabbitmq-users
There is a section on troubleshooting discovery [1].

There's another message in one of the log snippets that says a readiness probe has failed. The peer discovery plugin will only consider
nodes in the Ready state (that is, nodes whose readiness probe succeeded) for clustering; the rest will be filtered out.

I honestly don't think using curl and the extremely basic aliveness probe is necessary. Consider one of the basic health checks [2]
and add more if you need them ([2] explains the methodology).
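For example, a much lighter readiness probe along these lines is commonly used with this image (a sketch; adjust intervals to taste, and note that rabbitmq-diagnostics ping only checks that the node's runtime is up and responding):

```yaml
readinessProbe:
  exec:
    command: ["rabbitmq-diagnostics", "-q", "ping"]
  initialDelaySeconds: 10
  periodSeconds: 30
  timeoutSeconds: 10
```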




--
MK

Staff Software Engineer, Pivotal/RabbitMQ

li yl

Feb 26, 2020, 10:44:19 AM
to rabbitmq-users
I have the same problem. Is your RabbitMQ cluster running OK? If yes, could you share your YAML with me? Thanks a lot.


Michael Klishin

Feb 27, 2020, 11:03:29 AM
to rabbitmq-users