--
You received this message because you are subscribed to a topic in the Google Groups "rabbitmq-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/rabbitmq-users/YidEQVCY2NQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to rabbitmq-user...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/rabbitmq-users/6afe9a8e-1aeb-4d95-aa0f-457017dafb49n%40googlegroups.com.
The issue occurred when we applied some changes to our k8s cluster-dns and tried serving stale records (https://github.com/johanneswuerbach/rabbitmq-bug/blob/master/kubeconfigs/coredns/broken_configmap.yml#L18), but so far I haven't been able to replicate the same behaviour locally. https://github.com/johanneswuerbach/rabbitmq-bug shows our rough setup, though.
--
While I have some knowledge of RabbitMQ as a user / administrator, I don't know much about the internals. We are using `{"ha-mode":"exactly","ha-params":2,"ha-sync-mode":"automatic"}` with classic queues (quorum queues look preferable for our usage, but we haven't had the time to switch yet).
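For readers following along, a policy definition like the one quoted above is typically applied with `rabbitmqctl set_policy`. The sketch below only builds the command line and does not run it; the policy name `ha-two` and the pattern `.*` are hypothetical placeholders, not values taken from this thread.

```python
import json
import shlex

# Policy definition quoted in the message above.
HA_DEFINITION = {"ha-mode": "exactly", "ha-params": 2, "ha-sync-mode": "automatic"}


def set_policy_argv(name, pattern, definition, apply_to="queues"):
    """Build the argv for `rabbitmqctl set_policy` (does not execute it)."""
    return [
        "rabbitmqctl", "set_policy",
        "--apply-to", apply_to,
        name, pattern,
        json.dumps(definition, separators=(",", ":")),
    ]


# "ha-two" and ".*" are hypothetical placeholders.
argv = set_policy_argv("ha-two", ".*", HA_DEFINITION)
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` on a broker node would apply the policy; printing it first makes it easy to review what would be executed.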
We run RabbitMQ in a k8s StatefulSet (with the default podManagementPolicy OrderedReady) with "rabbitmq-upgrade post_upgrade" as a postStart hook and "rabbitmq-upgrade await_online_quorum_plus_one -t 600; rabbitmq-upgrade await_online_synchronized_mirror -t 600;" as a preStop hook, which, as far as I understand, ensures within reason that we have a synchronised mirror ready and that masters are balanced across the cluster.
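As a rough illustration of the hook setup described above, the following sketch renders a container `lifecycle` stanza as JSON (which Kubernetes accepts alongside YAML). It is not copied from the linked repository; wrapping the commands in `/bin/sh -c` is an assumption about how the hooks are invoked.

```python
import json

# Commands quoted in the message above.
POST_START = "rabbitmq-upgrade post_upgrade"
PRE_STOP = (
    "rabbitmq-upgrade await_online_quorum_plus_one -t 600; "
    "rabbitmq-upgrade await_online_synchronized_mirror -t 600;"
)


def exec_hook(shell_command):
    """Wrap a shell command line in a Kubernetes exec handler."""
    # Assumption: the container image ships /bin/sh.
    return {"exec": {"command": ["/bin/sh", "-c", shell_command]}}


lifecycle = {"postStart": exec_hook(POST_START), "preStop": exec_hook(PRE_STOP)}
print(json.dumps({"lifecycle": lifecycle}, indent=2))
```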
If I understand your question correctly, a queue master is some sort of (Erlang) process that would only receive messages if it is currently running. How can I see whether it's currently running, and can it be that it still hasn't started long (like hours) after the node is ready? Our health/readiness checks are here https://github.com/johanneswuerbach/rabbitmq-bug/blob/master/kubeconfigs/rabbitmq/statefulset.yaml#L49-L66 and both pass on all nodes, but the cluster is still not routing messages.
Do I understand correctly that RabbitMQ might drop messages as unroutable in case the queue master fails, and that there is no layer of buffering/retry/persistence here?
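One way to look at the master process per queue, sketched under assumptions: `rabbitmqctl list_queues` can print the `pid`, `slave_pids`, `synchronised_slave_pids` and `state` columns, and the helper below flags queues with no master pid or a state other than `running`. The sample text is made up, the exact textual form of the pid columns is an assumption, and this check may well not reveal the problem discussed in this thread.

```python
import subprocess

COLUMNS = ["name", "pid", "slave_pids", "synchronised_slave_pids", "state"]


def parse_list_queues(output):
    """Parse tab-separated `rabbitmqctl list_queues` output into dicts."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        # Pad short rows so that missing trailing columns become empty strings.
        fields += [""] * (len(COLUMNS) - len(fields))
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def suspicious(rows):
    """Return names of queues that have no master pid or are not 'running'."""
    return [r["name"] for r in rows if not r["pid"] or r["state"] != "running"]


def list_queues():
    """Run rabbitmqctl on a broker node and return the parsed rows."""
    cmd = ["rabbitmqctl", "list_queues", "--quiet", "--no-table-headers", *COLUMNS]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_list_queues(result.stdout)


# Made-up sample output; queue and node names are hypothetical.
SAMPLE = (
    "orders\t<rabbit@rmq-0.1.100.0>\t[<rabbit@rmq-1.1.200.0>]"
    "\t[<rabbit@rmq-1.1.200.0>]\trunning\n"
    "events\t\t\t\tdown\n"
)
flagged = suspicious(parse_list_queues(SAMPLE))
print(flagged)
```

On a live node, `suspicious(list_queues())` would apply the same check to real output.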
Since you are using automatic message synchronisation, when the queue master goes offline and the existing queue mirror becomes the new queue master, a new queue mirror needs to be created and all messages need to be synchronised before the queue will accept new messages. During this period it is possible for new messages to not be delivered to this queue. Since messages can potentially be delivered to multiple queues (multiple queues can match the topic) I am not 100% certain what the exact behaviour is for publisher confirms when messages are delivered to some but not all queues. I am not sure if the mandatory flag will help, since messages are delivered, but only to non-blocked running queues. Do you know @luke @michael?
As an analogy, if you think of RabbitMQ as a hamburger (can you tell that it's close to lunch-time for me?), we are debugging an issue with how the ketchup interacts with the cheese in the hamburger. DNS is the menu that we used to order the hamburger. To complete the analogy, K8S is the restaurant.
--
Thank you for this great analogy, it made me laugh so hard :-D
While the issue might be at a different level, it seems somewhat related. We were running 3.8.5 for 27 days on 81 clusters before rolling out the DNS change, and within 48h had three clusters failing with this behaviour. On all of them the issue started after the RabbitMQ cluster was rolled by some automation. Once we reverted the DNS change we never saw this issue again (31 days now). We upgraded to 3.8 (3.8.3 to be precise) 177 days ago and never saw anything like that in that time either. I know that correlation doesn't imply causation, but this looks odd.
I tried running "rabbitmq-diagnostics list_unresponsive_queues", but no queue is returned, and "rabbitmq-diagnostics cluster_status" doesn't list anything either. I also enabled debug logging, but don't see anything special. Is there something I should look out for?
Is there any way to see which internal state the routing decision is based on, and whether an attempt is made to contact the queue master process?