RabbitMQ pods are affected after worker-node restart


burak bingöl

Oct 19, 2021, 7:55:53 AM
to rabbitmq-users
I have three RabbitMQ nodes on an internal Kubernetes cluster, deployed as a StatefulSet with the bitnami/rabbitmq Helm chart. The cluster hosts both classic and quorum queues. If a worker node shuts down unexpectedly, or a Kubernetes admin moves the RabbitMQ pods to other worker nodes, the RabbitMQ cluster starts behaving abnormally. For example, it logs errors like this afterwards:


2021-10-18 05:37:15.203 [error] <0.10784.180> Channel error on connection <0.19689.38> (192.168.7.206:24385 -> 192.168.8.168:5672, vhost: "/", user: "admin"), channel 34: operation queue.declare caused a channel exception not_found: failed to perform operation on queue "queue name" in vhost "/" due to timeout 
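
To see how often and for which queues this error occurs, lines like the one above could be pulled out of the node logs with a small script. This is only a rough sketch: the pattern is an assumption derived from the single line quoted here, and the names `CHANNEL_TIMEOUT` and `parse_channel_timeout` are made up for illustration.

```python
import re

# Assumed pattern, based only on the one log line quoted in this post.
CHANNEL_TIMEOUT = re.compile(
    r'^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \[error\] .*?'
    r'operation (?P<operation>[\w.]+) caused a channel exception (?P<reason>\w+): '
    r'failed to perform operation on queue "(?P<queue>[^"]*)" '
    r'in vhost "(?P<vhost>[^"]*)" due to timeout'
)


def parse_channel_timeout(line):
    """Return the fields of a channel timeout error, or None if the line does not match."""
    match = CHANNEL_TIMEOUT.search(line.strip())
    return match.groupdict() if match else None
```

Counting the returned `queue` values per hour should show whether the timeouts cluster around the moment the pods were rescheduled.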

Consumer counts also keep increasing on the RabbitMQ nodes, although there should be only one consumer because only one service consumes the queue. See the attachment.
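
The consumer count can also be read from the management HTTP API instead of the UI. A minimal sketch, assuming the management plugin is enabled and reachable; the base URL, credentials, and function names here are placeholders for illustration, not values from this cluster.

```python
import base64
import json
import urllib.parse
import urllib.request


def queue_url(base_url, vhost, queue):
    # The vhost "/" has to be percent-encoded as %2F in management API paths.
    return "{}/api/queues/{}/{}".format(
        base_url.rstrip("/"),
        urllib.parse.quote(vhost, safe=""),
        urllib.parse.quote(queue, safe=""),
    )


def consumer_count(queue_info):
    # "consumers" is the consumer count field of the queue object the API returns.
    return queue_info.get("consumers", 0)


def fetch_queue(base_url, vhost, queue, user, password):
    # Plain HTTP basic auth against the management API; needs network access.
    request = urllib.request.Request(queue_url(base_url, vhost, queue))
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

For example, `consumer_count(fetch_queue("http://localhost:15672", "/", "queue name", "admin", "..."))` would be expected to return 1 when only one service consumes the queue.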

I resolved the issue by stopping the StatefulSet and reinitializing RabbitMQ. Is it possible to solve this without reinitializing RabbitMQ, or is there a configuration setting that I am missing?
rabbit-issue.png

Wesley Peng

Oct 19, 2021, 8:18:56 AM
to rabbitm...@googlegroups.com

It may be due to network congestion if you have a busy node or cluster.

Can you deploy monitoring such as Prometheus to watch system and network performance?

 

Regards.

 

From: <rabbitm...@googlegroups.com> on behalf of burak bingöl <bingol...@gmail.com>
Reply-To: <rabbitm...@googlegroups.com>
Date: Tuesday, October 19, 2021, 7:56 PM
To: rabbitmq-users <rabbitm...@googlegroups.com>
Subject: [rabbitmq-users] Rabbitmq pods are effecting after worker-node restart


burak bingöl

Oct 19, 2021, 10:22:05 AM
to rabbitmq-users
Hi Wesley,
We monitor pod network, CPU, and memory usage with Grafana.
You could be right, but I couldn't see any network overload at the time of the issue.
Only memory consumption was too high on two nodes; it decreased after the RabbitMQ nodes were restarted.
However, the log errors and the consumer-count issue that I described above appeared after the first reboot.


On Tuesday, October 19, 2021 at 15:18:56 UTC+3, wes...@magenta.de wrote: