Cannot list all queues on k8s rabbitmq cluster from remote CLI node


hasis hasis

Jun 5, 2020, 10:14:57 AM
to rabbitmq-users
Hi there,
I've deployed a RabbitMQ cluster to an on-premise k8s cluster via the rabbitmq-ha Helm chart.

Slight changes in values.yaml:
[root@node:/home/matej.hasul/charts/stable/rabbitmq-ha] git diff

diff --git a/stable/rabbitmq-ha/values.yaml b/stable/rabbitmq-ha/values.yaml
index 4e54a9c..9a5caff 100644
--- a/stable/rabbitmq-ha/values.yaml
+++ b/stable/rabbitmq-ha/values.yaml
@@ -505,7 +505,7 @@ livenessProbe:
     command:
       - /bin/sh
       - -c
-      - 'timeout 5 wget -O - -q --header "Authorization: Basic `echo -n \"$RABBIT_MANAGEMENT_USER:$RABBIT_MANAGEMENT_PASSWORD\" | base64`" http://localhost:15672/api/healthchecks/node | grep -qF "{\"status\":\"ok
+      - 'timeout 5 wget -O - -q --header "Authorization: Basic `echo -n \"$RABBIT_MANAGEMENT_USER:$RABBIT_MANAGEMENT_PASSWORD\" | base64`" http://127.0.0.1:15672/api/healthchecks/node | grep -qF "{\"status\":\"ok
 
 readinessProbe:
   initialDelaySeconds: 20
@@ -516,7 +516,7 @@ readinessProbe:
     command:
       - /bin/sh
       - -c
-      - 'timeout 3 wget -O - -q --header "Authorization: Basic `echo -n \"$RABBIT_MANAGEMENT_USER:$RABBIT_MANAGEMENT_PASSWORD\" | base64`" http://localhost:15672/api/healthchecks/node | grep -qF "{\"status\":\"ok
+      - 'timeout 3 wget -O - -q --header "Authorization: Basic `echo -n \"$RABBIT_MANAGEMENT_USER:$RABBIT_MANAGEMENT_PASSWORD\" | base64`" http://127.0.0.1:15672/api/healthchecks/node | grep -qF "{\"status\":\"ok
 
 # Specifies an existing secret to be used for RMQ password, management user password and Erlang Cookie
 existingSecret: ""


Install chart:
[root@node:/home/matej.hasul/charts/stable/rabbitmq-ha] helm -n mhs2 install rabbit . --set prometheus.exporter.enabled=false --set prometheus.operator.enabled=false
NAME: rabbit
LAST DEPLOYED: Fri Jun  5 14:04:35 2020
NAMESPACE: mhs2
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
** Please be patient while the chart is being deployed **

  Credentials:

    Username            : guest
    Password            : $(kubectl get secret --namespace mhs2 rabbit-rabbitmq-ha -o jsonpath="{.data.rabbitmq-password}" | base64 --decode)
    Management username : management
    Management password : $(kubectl get secret --namespace mhs2 rabbit-rabbitmq-ha -o jsonpath="{.data.rabbitmq-management-password}" | base64 --decode)
    ErLang Cookie       : $(kubectl get secret --namespace mhs2 rabbit-rabbitmq-ha -o jsonpath="{.data.rabbitmq-erlang-cookie}" | base64 --decode)

  RabbitMQ can be accessed within the cluster on port 5672 at rabbit-rabbitmq-ha.mhs2.svc.cluster.local

  To access the cluster externally execute the following commands:

    export POD_NAME=$(kubectl get pods --namespace mhs2 -l "app=rabbitmq-ha" -o jsonpath="{.items[0].metadata.name}")
    kubectl port-forward $POD_NAME --namespace mhs2 5672:5672 15672:15672

  To Access the RabbitMQ AMQP port:

    amqp://127.0.0.1:5672/

  To Access the RabbitMQ Management interface:

    URL : http://127.0.0.1:15672


I can list all queues from all nodes when rabbitmqctl list_queues is run inside the pods (there are no queues, which is correct):
Timeout: 60.0 seconds ...
Listing queues for vhost / ...


[root@node:/home/matej.hasul/charts/stable/rabbitmq-ha] kubectl -n mhs2 exec -it rabbit-rabbitmq-ha-1 -- rabbitmqctl list_queues
Timeout: 60.0 seconds ...
Listing queues for vhost / ...


[root@node:/home/matej.hasul/charts/stable/rabbitmq-ha] kubectl -n mhs2 exec -it rabbit-rabbitmq-ha-2 -- rabbitmqctl list_queues
Timeout: 60.0 seconds ...
Listing queues for vhost / ...



Run CLI pod:
[root@node:/home/matej.hasul/charts/stable/rabbitmq-ha] cat /tmp/f
apiVersion: v1
kind: Pod
metadata:
  name: rabbitmq-mgmt
spec:
  containers:
  - command:
    - sh
    - -c
    - 'sleep 3600'
    env:
    - name: RABBITMQ_ERLANG_COOKIE
      valueFrom:
        secretKeyRef:
          key: rabbitmq-erlang-cookie
          name: rabbit-rabbitmq-ha
    image: rabbitmq:3.8.4-alpine
    imagePullPolicy: Always
    name: connectors-rabbitmq-ha-tests
    resources:
      limits:
        cpu: 500m
        ephemeral-storage: 10Mi
        memory: 256Mi
      requests:
        cpu: 100m
        ephemeral-storage: 1Mi
        memory: 128Mi
    securityContext:
      runAsGroup: 101
      runAsNonRoot: true
      runAsUser: 100

[root@node:/home/matej.hasul/charts/stable/rabbitmq-ha] kubectl -n mhs2 apply -f /tmp/f

However, if I try to list queues from the CLI pod, it times out:
kubectl -n mhs2 exec -it rabbitmq-mgmt -- /bin/bash

bash-5.0$ rabbitmqctl -l -n rab...@rabbit-rabbitmq-ha-0.rabbit-rabbitmq-ha-discovery.mhs2.svc.cluster.local list_queues -t 10
Timeout: 10.0 seconds ...
Listing queues for vhost / ...
{:badrpc, {:timeout, 10.0, "Some queue(s) are unresponsive, use list_unresponsive_queues command."}}

Listing only local queues works perfectly:
bash-5.0$ rabbitmqctl -l -n rab...@rabbit-rabbitmq-ha-0.rabbit-rabbitmq-ha-discovery.mhs2.svc.cluster.local list_queues --local -t 10
Timeout: 10.0 seconds ...
Listing queues for vhost / ...

Queue listing works (both local and cluster-wide) on a single-node "cluster". That makes me think that something is off in the networking. I consulted the networking guide but cannot find out what's wrong. epmd and the ports seem to be OK on the RabbitMQ nodes:
[root@node:/home/matej.hasul/charts/stable/rabbitmq-ha] kubectl -n mhs2 exec -it rabbit-rabbitmq-ha-0 -- epmd -names
epmd: up and running on port 4369 with data:
name rabbit at port 25672

[root@node:/home/matej.hasul/charts/stable/rabbitmq-ha] kubectl -n mhs2 exec -it rabbit-rabbitmq-ha-0 -- netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:4369            0.0.0.0:*               LISTEN      111/epmd
tcp        0      0 0.0.0.0:15672           0.0.0.0:*               LISTEN      170/beam.smp
tcp        0      0 0.0.0.0:25672           0.0.0.0:*               LISTEN      170/beam.smp
tcp        0      0 :::4369                 :::*                    LISTEN      111/epmd
tcp        0      0 :::5672                 :::*                    LISTEN      170/beam.smp


Connections from the CLI node to the RabbitMQ node are allowed:
kubectl -n mhs2 exec -it rabbitmq-mgmt -- /bin/bash

bash-5.0# curl rabbit-rabbitmq-ha-0.rabbit-rabbitmq-ha-discovery.mhs2.svc.cluster.local:25672
curl: (52) Empty reply from server

bash-5.0# curl rabbit-rabbitmq-ha-0.rabbit-rabbitmq-ha-discovery.mhs2.svc.cluster.local:4369
curl: (52) Empty reply from server
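
The curl checks above only cover node 0. A minimal sketch of the same reachability test for all three pods and both ports, assuming the CLI pod's bash supports /dev/tcp redirections and has a `timeout` binary; `node_host`, `check_port` and `probe_cluster` are hypothetical helper names, not part of any RabbitMQ tooling:

```shell
#!/usr/bin/env bash
# Sketch only: probe epmd (4369) and the distribution port (25672) on each
# cluster pod from inside the CLI pod. All function names are made up here.

# Build the DNS name of pod N behind the discovery service used above.
node_host() {
  printf 'rabbit-rabbitmq-ha-%s.rabbit-rabbitmq-ha-discovery.mhs2.svc.cluster.local' "$1"
}

# Return 0 if a TCP connection to host:port opens within 3 seconds.
# Assumption: bash was built with /dev/tcp support.
check_port() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Probe pods 0..2 (the pod names seen in the kubectl exec calls above).
probe_cluster() {
  local i port host
  for i in 0 1 2; do
    host=$(node_host "$i")
    for port in 4369 25672; do
      if check_port "$host" "$port"; then
        echo "$host:$port open"
      else
        echo "$host:$port unreachable"
      fi
    done
  done
}
```

Running `probe_cluster` inside the CLI pod prints one line per pod and port. An open port only shows that the CLI pod can connect outbound to that pod; it says nothing about connections in the opposite direction.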


Any clue/idea/issue is appreciated.

Matej

Michael Klishin

Jun 9, 2020, 4:32:23 PM
to rabbitm...@googlegroups.com
Start with enabling HTTP API access logs and then see server logs for clues. An empty response can mean a reset TCP connection. There isn't a whole lot of information to work with and we do not guess on this list.
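
A minimal sketch of the first step, assuming the nodes read a new-style rabbitmq.conf and that the management plugin's `management.http_log_dir` key is what enables the HTTP API access log. The helper name `enable_http_access_log`, the config file path and the log directory are placeholders, and how the setting reaches the pods (for example through the chart's values) is left out:

```shell
#!/usr/bin/env bash
# Sketch only: append the HTTP API access log setting to a rabbitmq.conf file.
# enable_http_access_log is a hypothetical helper; both arguments are
# placeholders chosen by the caller.
enable_http_access_log() {
  local conf_file=$1 log_dir=$2
  # Add the key only once so repeated runs do not duplicate it.
  if ! grep -q '^management\.http_log_dir' "$conf_file" 2>/dev/null; then
    printf 'management.http_log_dir = %s\n' "$log_dir" >> "$conf_file"
  fi
}
```

For example, `enable_http_access_log /etc/rabbitmq/rabbitmq.conf /var/log/rabbitmq/http` (both paths are assumptions) would add the line, after which the node typically needs a restart to pick up the change.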


From: rabbitm...@googlegroups.com <rabbitm...@googlegroups.com> on behalf of hasis hasis <hasis...@gmail.com>
Sent: Friday, June 5, 2020 5:14:57 PM
To: rabbitmq-users <rabbitm...@googlegroups.com>
Subject: [Suspected Spam] [rabbitmq-users] Cannot list all queues on k8s rabbitmq cluster from remote CLI node
 