RabbitMQ is not coming up on Kubernetes cluster


mohan m

Dec 12, 2019, 4:25:59 AM
to rabbitmq-users
I'm new to RabbitMQ. I'm trying to deploy RabbitMQ in a Kubernetes cluster using the StatefulSet method.

kubectl describe pod rabbitmq-0 lists the error below:

============================================================================
 Type     Reason     Age                 From                          Message
  ----     ------     ----                ----                          -------
  Warning  Unhealthy  17m (x67 over 81m)  kubelet, kube-node03  (combined from similar events): Readiness probe failed: Error: unable to perform an operation on node 'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local'. Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
=============================================

Can someone help me fix this issue?

Thanks
Mohan M



Wesley Peng

Dec 12, 2019, 6:13:00 AM
to rabbitm...@googlegroups.com
On 2019/12/12 5:25 PM, mohan m wrote:
>   Warning  Unhealthy  17m (x67 over 81m)  kubelet, kube-node03
> (combined from similar events): Readiness probe failed: Error: unable to
> perform an operation on node
> 'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local'. Please see
> diagnostics information and suggestions below.

Hi,

First, please check that the hostname
"rabbitmq-0.rabbitmq.default.svc.cluster.local" and its short name are
resolvable, either via DNS or the local hosts file.

regards.

Gabriele Santomaggio

Dec 12, 2019, 8:13:30 AM
to rabbitmq-users
How are you trying to deploy RabbitMQ in K8s?
With Helm?
The RabbitMQ GitHub example?
Your own YAML file?

You could also have a permission issue.

-
Gabriele Santomaggio

mohan m

Dec 12, 2019, 1:03:19 PM
to rabbitm...@googlegroups.com
The RabbitMQ GitHub example?

Yes, the RabbitMQ GitHub example.

--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/rabbitmq-users/8f184b97-4a47-4df2-ac31-868281afa702%40googlegroups.com.

mohan m

Dec 12, 2019, 2:23:09 PM
to rabbitm...@googlegroups.com
This is the error message I'm getting now:

{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,\"{failed_connect,[{to_address,{\\"kubernetes.postgres.svc.cluster.local\\",443}},\n   

Gabriele Santomaggio

Dec 12, 2019, 3:33:17 PM
to rabbitmq-users
I think you have a permission issue. Try putting the log in debug mode;
you will have more information about your error.




mohan m

Dec 13, 2019, 3:56:15 AM
to rabbitmq-users

I made some modifications. I'm now receiving the error below while starting the container. Can someone help with this?

 kubectl logs rabbitmq-0 -n postgres
  ##  ##
  ##  ##      RabbitMQ 3.7.10. Copyright (C) 2007-2018 Pivotal Software, Inc.
  ##########  Licensed under the MPL.  See http://www.rabbitmq.com/
  ######  ##
  ##########  Logs: <stdout>
              Starting broker...
2019-12-13 08:44:20.597 [info] <0.219.0>
 Starting RabbitMQ 3.7.10 on Erlang 21.2.3
 Copyright (C) 2007-2018 Pivotal Software, Inc.
 Licensed under the MPL.  See http://www.rabbitmq.com/
2019-12-13 08:44:20.605 [info] <0.219.0>
 node           : rab...@rabbitmq-0.rabbitmq.postgres.svc.cluster.local
 home dir       : /var/lib/rabbitmq
 config file(s) : /etc/rabbitmq/rabbitmq.conf
 cookie hash    : XhdCf8zpVJeJ0EHyaxszPg==
 log(s)         : <stdout>
 database dir   : /var/lib/rabbitmq/mnesia/rab...@rabbitmq-0.rabbitmq.postgres.svc.cluster.local
2019-12-13 08:44:22.588 [info] <0.227.0> Memory high watermark set to 3997 MiB (4191661260 bytes) of 9993 MiB (10479153152 bytes) total
2019-12-13 08:44:22.591 [info] <0.229.0> Enabling free disk space monitoring
2019-12-13 08:44:22.592 [info] <0.229.0> Disk free limit set to 50MB
2019-12-13 08:44:22.595 [info] <0.232.0> Limiting to approx 1048476 file handles (943626 sockets)
2019-12-13 08:44:22.595 [info] <0.233.0> FHC read buffering:  OFF
2019-12-13 08:44:22.595 [info] <0.233.0> FHC write buffering: ON
2019-12-13 08:44:22.596 [info] <0.219.0> Node database directory at /var/lib/rabbitmq/mnesia/rab...@rabbitmq-0.rabbitmq.postgres.svc.cluster.local is empty. Assuming we need to join an existing cluster or initialise from scratch...
2019-12-13 08:44:22.596 [info] <0.219.0> Configured peer discovery backend: rabbit_peer_discovery_k8s
2019-12-13 08:44:22.596 [info] <0.219.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2019-12-13 08:44:22.596 [info] <0.219.0> Peer discovery backend does not support locking, falling back to randomized delay
2019-12-13 08:44:22.596 [info] <0.219.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized startup delay.
2019-12-13 08:44:22.603 [info] <0.219.0> Failed to get nodes from k8s - {failed_connect,[{to_address,{"kubernetes.postgres.svc.cluster.local",443}},
                 {inet,[inet],nxdomain}]}
2019-12-13 08:44:22.604 [error] <0.218.0> CRASH REPORT Process <0.218.0> with 0 neighbours exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.postgres.svc.cluster.local\",443}},\n                 {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 164 in application_master:init/4 line 138
2019-12-13 08:44:22.604 [info] <0.43.0> Application rabbit exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.postgres.svc.cluster.local\",443}},\n                 {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 164
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,\"{failed_connect,[{to_address,{\\"kubernetes.postgres.svc.cluster.local\\",443}},\n                 {inet,[inet],nxdomain}]}\"}},[{rabbit_mnesia,init_from_config,0,[{file,\"src/rabbit_mnesia.erl\"},{line,164}]},{rabbit_mnesia,init_with_lock,3,[
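For reference, the kubernetes.postgres.svc.cluster.local name in that failed_connect is the Kubernetes API endpoint the peer discovery plugin tries to contact (by default kubernetes.default.svc.cluster.local). The relevant rabbitmq.conf keys look roughly like this (a sketch; values are illustrative):

```ini
# Peer discovery via the Kubernetes API (rabbitmq_peer_discovery_k8s plugin).
cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
# nxdomain in the error above means this host does not resolve from the pod.
cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
cluster_formation.k8s.port = 443
cluster_formation.k8s.address_type = hostname
cluster_formation.k8s.service_name = rabbitmq
```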

mohan m

Dec 13, 2019, 4:13:17 AM
to rabbitmq-users

It looks like the k8s cluster does not pick up the correct node name as we configured it. Can someone help fix this issue?

Luke Bakken

Dec 13, 2019, 1:34:11 PM
to rabbitmq-users
Hi Mohan,

I searched this group using the terms kubernetes nxdomain:


One of those discussions may help you out.

Thanks,
Luke

mohan m

Dec 14, 2019, 3:13:03 AM
to rabbitm...@googlegroups.com
Thank you for your response. I will check it.

Thanks
Mohan M


Gabriele Santomaggio

Dec 14, 2019, 6:18:46 AM
to rabbitmq-users

In this link [1] you can find how to debug DNS problems.
You have to check your DNS using:

kubectl exec -ti busybox -- nslookup  rabbitmq-op-developing

Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      rabbitmq-op-developing
Address 1: 10.106.38.198 rabbitmq-op-developing.default.svc.cluster.local


The first part, "rabbitmq-op-developing", is the service name; you should have the same value in:

K8S_SERVICE_NAME = rabbitmq-op-developing

and the pod should be something like:

Name:  "RABBITMQ_NODENAME",
Value: fmt.Sprintf("rabbit@%s.%s.%s.svc.cluster.local", "$(MY_POD_NAME)", "$(K8S_SERVICE_NAME)", "$(MY_POD_NAMESPACE)"),
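The three pieces have to line up: the node name is just those values joined into the pod's DNS name. A quick shell sketch of the assembly, using illustrative values (inside the pod, MY_POD_NAME and MY_POD_NAMESPACE come from the Downward API):

```shell
# Illustrative values; in a real pod these are injected as env vars.
MY_POD_NAME=rabbitmq-0
K8S_SERVICE_NAME=rabbitmq-op-developing
MY_POD_NAMESPACE=default

# Same shape as the operator's fmt.Sprintf above.
RABBITMQ_NODENAME="rabbit@${MY_POD_NAME}.${K8S_SERVICE_NAME}.${MY_POD_NAMESPACE}.svc.cluster.local"
echo "$RABBITMQ_NODENAME"
# prints rabbit@rabbitmq-0.rabbitmq-op-developing.default.svc.cluster.local
```

If the K8S_SERVICE_NAME part does not match the Service that nslookup resolves, the node name points at a DNS name that does not exist.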



-
Gabriele Santomaggio

mohan m

Dec 16, 2019, 6:11:53 AM
to rabbitmq-users

============================================
kubectl get svc -n postgres
NAME             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                          AGE

rabbitmq         NodePort    10.254.136.51    <none>        15672:30002/TCP,5672:30003/TCP   3d15h
rabbitmq-lb      ClusterIP   None             <none>        15672/TCP,5672/TCP               3d15h

=======================================
 kubectl exec -it busybox /bin/sh -n postgres
/ $ nslookup rabbitmq
Server:         10.254.0.10
Address:        10.254.0.10:53


*** Can't find rabbitmq.svc.cluster.local: No answer
*** Can't find rabbitmq.cluster.local: No answer
*** Can't find rabbitmq.postgres.svc.cluster.local: No answer
*** Can't find rabbitmq.svc.cluster.local: No answer
*** Can't find rabbitmq.cluster.local: No answer
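One thing worth checking from this output: for per-pod names like rabbitmq-0.rabbitmq.postgres.svc.cluster.local to resolve, the StatefulSet's serviceName generally has to match a headless Service (clusterIP: None) whose selector matches the pod labels. A minimal sketch, assuming the app: rabbitmq label used in the manifest:

```yaml
# Sketch of a headless Service; the name must equal the StatefulSet's
# serviceName so that each pod gets a stable DNS record.
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq
  namespace: postgres
spec:
  clusterIP: None          # headless
  selector:
    app: rabbitmq
  ports:
    - name: amqp
      port: 5672
    - name: http
      port: 15672
```

With this in place, nslookup rabbitmq-0.rabbitmq.postgres.svc.cluster.local from busybox should return the pod IP.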


==================

Please find my StatefulSet YAML file:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
spec:
  serviceName: rabbitmq
  # Three nodes is the recommended minimum. Some features may require a majority of nodes
  # to be available.
  replicas: 3
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      serviceAccountName: rabbitmq
      terminationGracePeriodSeconds: 10

      containers:
      - name: rabbitmq-k8s
        image: rabbitmq:3.7.10
        lifecycle:
          postStart:
            exec:
              command:
                - /bin/sh
                - -c
                - >
                  until rabbitmqctl --erlang-cookie ${RABBITMQ_ERLANG_COOKIE} node_health_check; do sleep 1; done;
                  rabbitmqctl --erlang-cookie ${RABBITMQ_ERLANG_COOKIE} set_policy ha-all "" '{"ha-mode":"all", "ha-sync-mode": "automatic"}'
        securityContext:
          runAsUser: 999
          runAsGroup: 999
        imagePullPolicy: Always
#        command: [ "/bin/sh", "-c", "--" ]
#        args: [ "while true; do sleep 30; done;" ]
        volumeMounts:
          - name: config-volume
            mountPath: /etc/rabbitmq
        # Learn more about what ports various protocols use
        ports:
          - name: http
            containerPort: 15672
            protocol: TCP
          - name: amqp
            containerPort: 5672
            protocol: TCP
          - name: port3
            containerPort: 25672
            protocol: TCP
          - name: port4
            containerPort: 15672
            protocol: TCP
        livenessProbe:
          exec:
            command:
              - /bin/sh
              - -c
              - >
                curl -H "Authorization: Basic ${RABBITMQ_BASIC_AUTH}" http://localhost:15672/api/aliveness-test/%2F
          initialDelaySeconds: 15
          periodSeconds: 5
        readinessProbe:
          exec:
            command:
              - /bin/sh
              - -c
              - >
                curl -H "Authorization: Basic ${RABBITMQ_BASIC_AUTH}" http://localhost:15672/api/aliveness-test/%2F
          initialDelaySeconds: 15
          periodSeconds: 5

        env:
          - name: MY_POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: MY_POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          - name: RABBITMQ_USE_LONGNAME
            value: "true"
#          - name: RABBITMQ_BASIC_AUTH
#            value: 1234
          # See a note on cluster_formation.k8s.address_type in the config file section
#          - name: K8S_SERVICE_NAME
#            value: rabbitmq-internal
#          - name: RABBITMQ_DEFAULT_USER
#            value: mohan
#          - name: RABBITMQ_DEFAULT_PASS
#            value: secret_pass
          - name: RABBITMQ_NODENAME
#            value: rabbit@$(MY_POD_NAME).$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
            value: "rabbit@$(MY_POD_NAME).rabbitmq.$(MY_POD_NAMESPACE).svc.cluster.local"
          - name: K8S_HOSTNAME_SUFFIX
            value: .rabbitmq.$(MY_POD_NAMESPACE).svc.cluster.local
#            value: .$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
          - name: RABBITMQ_ERLANG_COOKIE
            value: "mycookie"
#          - name: K8S_ADDRESS_TYPE
#            value: "hostname"
          - name: K8S_SERVICE_NAME
            value: rabbitmq
          - name: RABBITMQ_ERL_COOKIE
            value: "mycookie"
#          - name: RABBITMQ_DEFAULT_USER
#            value: user
#          - name: RABBITMQ_DEFAULT_PASS
#            value: pass
#          - name: NODE_NAME
#            valueFrom:
#              fieldRef:
#                fieldPath: metadata.name
      volumes:
        - name: config-volume
          configMap:
            name: rabbitmq-config
            items:
            - key: rabbitmq.conf
              path: rabbitmq.conf
            - key: enabled_plugins
              path: enabled_plugins
  volumeClaimTemplates:
    - metadata:
        name: data-rabbitmq
      spec:
        storageClassName: cinder
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 2Gi


========================

I'm still receiving the error message below:

2019-12-16 11:09:46.064 [info] <0.219.0> Node database directory at /var/lib/rabbitmq/mnesia/rab...@rabbitmq-0.rabbitmq.postgres.svc.cluster.local is empty. Assuming we need to join an existing cluster or initialise from scratch...
2019-12-16 11:09:46.064 [info] <0.219.0> Configured peer discovery backend: rabbit_peer_discovery_k8s
2019-12-16 11:09:46.064 [info] <0.219.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2019-12-16 11:09:46.064 [info] <0.219.0> Peer discovery backend does not support locking, falling back to randomized delay
2019-12-16 11:09:46.064 [info] <0.219.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized startup delay.
2019-12-16 11:09:46.071 [info] <0.219.0> Failed to get nodes from k8s - {failed_connect,[{to_address,{"kubernetes.postgres.svc.cluster.local",443}},
                 {inet,[inet],nxdomain}]}
2019-12-16 11:09:46.072 [error] <0.218.0> CRASH REPORT Process <0.218.0> with 0 neighbours exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.postgres.svc.cluster.local\",443}},\n                 {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 164 in application_master:init/4 line 138
2019-12-16 11:09:46.072 [info] <0.43.0> Application rabbit exited with reason: no case clause matching {error,"{failed_connect,[{to_address,{\"kubernetes.postgres.svc.cluster.local\",443}},\n                 {inet,[inet],nxdomain}]}"} in rabbit_mnesia:init_from_config/0 line 164
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{case_clause,{error,\"{failed_connect,[{to_address,{\\"kubernetes.postgres.svc.cluster.local\\",443}},\n                 {inet,[inet],nxdomain}]}\"}},[{rabb

Gabriele Santomaggio

unread,
Dec 17, 2019, 3:50:36 AM12/17/19
to rabbitmq-users

mohan m

Dec 17, 2019, 6:10:57 AM
to rabbitm...@googlegroups.com
Hi,

Thanks for your response. I have updated the settings cluster_formation.k8s.host and cluster_formation.k8s.port with the Kubernetes endpoint (obtained via kubectl get ep) and its port. Now the pod is up and running.

Like below:

cluster_formation.k8s.host = <ip address of kubernetes endpoint>   # obtained via kubectl get ep
cluster_formation.k8s.port = 6443
cluster_formation.k8s.address_type = ip
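For completeness, the peer-discovery section of rabbitmq.conf then looks roughly like this (a sketch; the host value is whatever kubectl get ep kubernetes reports in your cluster):

```ini
cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
cluster_formation.k8s.host = 10.233.0.1   # illustrative API endpoint IP
cluster_formation.k8s.port = 6443
# With address_type = ip the plugin returns pod IPs, so discovered peers
# are expected to be named rabbit@<pod IP>; hostname-based node names
# (rabbit@rabbitmq-0...) will not match what discovery returns.
cluster_formation.k8s.address_type = ip
```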


But the cluster status below is shown on the rabbitmq-0 pod only, even though rabbitmq-1 and rabbitmq-2 are running.
===========================================================
kubectl exec   $FIRST_POD rabbitmq-diagnostics cluster_status
Cluster status of node rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local ...
[{nodes,[{disc,['rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local']}]},
 {running_nodes,['rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local']},
 {cluster_name,<<"rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local">>},
 {partitions,[]},
 {alarms,[{'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local',[]}]}]

=============

Pod list below:
rabbitmq-0                                 1/1     Running            0          111m
rabbitmq-1                                 1/1     Running            0          111m
rabbitmq-2                                 1/1     Running            0          110m

Can you help me understand why pods rabbitmq-1 and rabbitmq-2 are not listed in the cluster status?


mohan m

Dec 17, 2019, 9:31:53 PM
to rabbitmq-users

Hi,


We have deployed RabbitMQ using the StatefulSet method. Three pods are running, as below:

rabbitmq-0   1/1   Running   0   11m
rabbitmq-1   1/1   Running   0   10m
rabbitmq-2   1/1   Running   0   10m

============================================

Please find a sample log below from one of the pods:

2019-12-18 01:59:43.451 [info] <0.33.0> Application rabbitmq_management started on node 'rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local' completed with 5 plugins.
2019-12-18 01:59:43.685 [info] <0.5.0> Server startup complete; 5 plugins started.

  • rabbitmq_management
  • rabbitmq_web_dispatch
  • rabbitmq_peer_discovery_k8s
  • rabbitmq_management_agent
  • rabbitmq_peer_discovery_common

==============================================

My concern is that each pod shows a cluster status, but the other two pods are not listed in it. For example, I logged on to pod rabbitmq-0 and could see the status below; however, pods rabbitmq-1 and rabbitmq-2 are not listed:

rabbitmq@rabbitmq-0:/$ rabbitmqctl cluster_status
Cluster status of node rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local ...
[{nodes,[{disc,['rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local']}]},
 {running_nodes,['rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local']},
 {cluster_name,<<"rab...@rabbitmq-0.rabbitmq.defaulf.svc.cluster.local">>},
 {partitions,[]},
 {alarms,[{'rab...@rabbitmq-0.rabbitmq.defaulf.svc.cluster.local',[]}]}]

Do we still have an issue joining the nodes? Can anyone help me understand this?






Wesley Peng

Dec 17, 2019, 9:36:15 PM
to rabbitm...@googlegroups.com
Hi

On 2019/12/18 10:31, mohan m wrote:
> rabbitmq@rabbitmq-0:/$ rabbitmqctl cluster_status
> Cluster status of node rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local ...
> [{nodes,[{disc,['rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local']}]},
> {running_nodes,['rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local']},
> {cluster_name,<<"rab...@rabbitmq-0.rabbitmq.defaulf.svc.cluster.local">>},
> {partitions,[]},
> {alarms,[{'rab...@rabbitmq-0.rabbitmq.defaulf.svc.cluster.local',[]}]}]

From the info above, the cluster was not created successfully. Have you
followed the guide?
https://www.rabbitmq.com/clustering.html


regards.

mohan m

Dec 18, 2019, 10:03:49 AM
to rabbitmq-users
The RabbitMQ service came up standalone on each pod. It seems peer discovery is not happening properly, which prevents the pods from joining the cluster. Has anyone faced the same issue? Can anyone help me fix it? Please find the log below.

=================================================================

2019-12-17 09:16:05.784 [info] <0.215.0> Node database directory at /var/lib/rabbitmq/mnesia/rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local is empty. Assuming we need to join an existing cluster or initialise from scratch...
2019-12-17 09:16:05.784 [info] <0.215.0> Configured peer discovery backend: rabbit_peer_discovery_k8s
2019-12-17 09:16:05.785 [info] <0.215.0> Will try to lock with peer discovery backend rabbit_peer_discovery_k8s
2019-12-17 09:16:05.785 [info] <0.215.0> Peer discovery backend does not support locking, falling back to randomized delay
2019-12-17 09:16:05.785 [info] <0.215.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping randomized startup delay.
2019-12-17 09:16:05.815 [info] <0.215.0> All discovered existing cluster peers:
2019-12-17 09:16:05.815 [info] <0.215.0> Discovered no peer nodes to cluster with
2019-12-17 09:16:05.818 [info] <0.43.0> Application mnesia exited with reason: stopped
2019-12-17 09:16:05.869 [info] <0.215.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2019-12-17 09:16:05.891 [info] <0.215.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2019-12-17 09:16:05.914 [info] <0.215.0> Waiting for Mnesia tables for 30000 ms, 9 retries left
2019-12-17 09:16:05.915 [info] <0.215.0> Peer discovery backend rabbit_peer_discovery_k8s does not support registration, skipping registration.
2019-12-17 09:16:05.915 [info] <0.215.0> Priority queues enabled, real BQ is rabbit_variable_queue
2019-12-17 09:16:05.918 [info] <0.411.0> Starting rabbit_node_monitor
2019-12-17 09:16:05.938 [info] <0.215.0> message_store upgrades: 1 to apply
2019-12-17 09:16:05.938 [info] <0.215.0> message_store upgrades: Applying rabbit_variable_queue:move_messages_to_vhost_store
2019-12-17 09:16:05.938 [info] <0.215.0> message_store upgrades: No durable queues found. Skipping message store migration
2019-12-17 09:16:05.938 [info] <0.215.0> message_store upgrades: Removing the old message store data
2019-12-17 09:16:05.939 [info] <0.215.0> message_store upgrades: All upgrades applied successfully
2019-12-17 09:16:05.963 [info] <0.215.0> Management plugin: using rates mode 'basic'
2019-12-17 09:16:05.965 [info] <0.215.0> Adding vhost '/'
2019-12-17 09:16:05.974 [info] <0.450.0> Making sure data directory '/var/lib/rabbitmq/mnesia/rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local/msg_stores/vhosts/628WB79CIFDYO9LJI6DKMI09L' for vhost '/' exists
2019-12-17 09:16:05.976 [info] <0.450.0> Starting message stores for vhost '/'
2019-12-17 09:16:05.977 [info] <0.454.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_transient": using rabbit_msg_store_ets_index to provide index
2019-12-17 09:16:05.977 [info] <0.450.0> Started message store of type transient for vhost '/'
2019-12-17 09:16:05.978 [info] <0.457.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_persistent": using rabbit_msg_store_ets_index to provide index
2019-12-17 09:16:05.978 [warning] <0.457.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_persistent": rebuilding indices from scratch
2019-12-17 09:16:05.979 [info] <0.450.0> Started message store of type persistent for vhost '/'
2019-12-17 09:16:05.980 [info] <0.215.0> Creating user 'guest'
2019-12-17 09:16:05.981 [info] <0.215.0> Setting user tags for user 'guest' to [administrator]
2019-12-17 09:16:05.982 [info] <0.215.0> Setting permissions for 'guest' in '/' to '.*', '.*', '.*'
2019-12-17 09:16:05.984 [warning] <0.481.0> Setting Ranch options together with socket options is deprecated. Please use the new map syntax that allows specifying socket options separately from other options.
2019-12-17 09:16:05.985 [info] <0.495.0> started TCP listener on [::]:5672
2019-12-17 09:16:05.987 [info] <0.215.0> Setting up a table for connection tracking on this node: 'tracked_connection_on_node_rab...@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2019-12-17 09:16:05.989 [info] <0.215.0> Setting up a table for per-vhost connection counting on this node: 'tracked_connection_per_vhost_on_node_rabbit@rabbitmq-0.rabbitmq.default.svc.cluster.local'
2019-12-17 09:16:05.991 [info] <0.549.0> Peer discovery: enabling node cleanup (will only log warnings). Check interval: 30 seconds.
2019-12-17 09:16:06.037 [info] <0.558.0> Management plugin: HTTP (non-TLS) listener started on port 15672
2019-12-17 09:16:06.037 [info] <0.664.0> Statistics database started.
2019-12-17 09:16:06.216 [info] <0.8.0> Server startup complete; 5 plugins started.
 * rabbitmq_peer_discovery_k8s
 * rabbitmq_management
 * rabbitmq_web_dispatch
 * rabbitmq_peer_discovery_common
 * rabbitmq_management_agent
 completed with 5 plugins.


mohan m

unread,
Dec 23, 2019, 10:58:19 AM12/23/19
to rabbitmq-users
Can anyone help fix this issue?

Michael Klishin

Jan 2, 2020, 6:31:50 AM
to rabbitmq-users
There is a section on troubleshooting discovery [1].

There's another message in one of the log snippets that says a readiness probe has failed. The peer discovery plugin will only consider
nodes in the Ready state (that is, nodes whose readiness probe succeeded) for clustering; the rest will be filtered out.

I honestly don't think using curl and the extremely basic aliveness probe is necessary. Consider one of the basic health checks [2]
and add more if you need them ([2] explains the methodology).
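For example, a much lighter readiness probe along these lines is commonly used with this image (a sketch; adjust intervals to taste, and note that rabbitmq-diagnostics ping only checks that the node's runtime is up and responding):

```yaml
readinessProbe:
  exec:
    command: ["rabbitmq-diagnostics", "-q", "ping"]
  initialDelaySeconds: 10
  periodSeconds: 30
  timeoutSeconds: 10
```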




--
MK

Staff Software Engineer, Pivotal/RabbitMQ

li yl

Feb 26, 2020, 10:44:19 AM
to rabbitmq-users
I have the same problem. Is your RabbitMQ cluster running OK? If yes, could you share your YAML with me? Thanks a lot.


Michael Klishin

Feb 27, 2020, 11:03:29 AM
to rabbitmq-users