Hi, we are experiencing a problem where some messages are not routed to their destination queues whenever a node restarts.
We are running an OpenStack cluster (Rocky release) in both dev and production environments, with 3 controller nodes and several compute nodes. The scale is around 500-1000 queues in dev and 4000 queues in production. Whenever a node restarts, several queues have problems receiving published messages. When publishing with the mandatory flag set to true, I can see rabbitmq-server return "not_route" to the client. Under such circumstances, it seems there is nothing I can do except manually rebuild the binding by first deleting it and then creating an identical one, or manually delete the queue and force OpenStack to create the same queue again.
We are not using HA or durable queues, for various reasons. I have tried HA and durable queues, and they seem to be fine. The queues are bound to topic exchanges with a routing key identical to the queue name. Every synchronous RPC call creates a direct exchange and a queue. Both direct and topic exchanges seem to suffer from the same problem.
Surprisingly, the problem is significantly mitigated with an older release such as 3.7.4; so far we haven't observed any unroutable messages with it. It also seems to help if I make the client wait for some seconds before reconnecting (kombu_reconnect_timeout=30).
I believe the problem can be easily reproduced with around 100 connections and 300 non-HA queues on a 3-node cluster.
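To make the symptom concrete, here is a toy in-memory sketch of what the description above amounts to: the queue still exists, but its binding is gone, so a mandatory publish is returned instead of routed, and re-creating the binding restores delivery. This is only an illustrative model, not RabbitMQ's implementation or a client API. The class `ToyBroker`, the exchange name `nova`, and the queue name `compute.node1` are hypothetical, and keys are matched exactly because the routing key equals the queue name here.

```python
class ToyBroker:
    """Toy model of queues and bindings (hypothetical, not RabbitMQ code)."""

    def __init__(self):
        self.queues = {}        # queue name -> list of delivered messages
        self.bindings = set()   # rows of (exchange, routing_key, queue)

    def declare_queue(self, queue):
        self.queues.setdefault(queue, [])

    def bind(self, exchange, routing_key, queue):
        self.bindings.add((exchange, routing_key, queue))

    def unbind(self, exchange, routing_key, queue):
        self.bindings.discard((exchange, routing_key, queue))

    def publish(self, exchange, routing_key, body, mandatory=False):
        """Return 'routed', 'returned' (mandatory, no route) or 'dropped'."""
        targets = [q for (x, key, q) in self.bindings
                   if x == exchange and key == routing_key and q in self.queues]
        for q in targets:
            self.queues[q].append(body)
        if targets:
            return "routed"
        return "returned" if mandatory else "dropped"


broker = ToyBroker()
broker.declare_queue("compute.node1")
broker.bind("nova", "compute.node1", "compute.node1")
before = broker.publish("nova", "compute.node1", "ping 1", mandatory=True)

# Simulate the state seen after a node restart: queue present, binding missing.
broker.unbind("nova", "compute.node1", "compute.node1")
during = broker.publish("nova", "compute.node1", "ping 2", mandatory=True)

# Manual workaround from the report: re-create an identical binding.
broker.bind("nova", "compute.node1", "compute.node1")
after = broker.publish("nova", "compute.node1", "ping 3", mandatory=True)

print(before, during, after)
```

Without the mandatory flag the second publish would be silently dropped in this model, which is why the flag is useful for spotting the affected queues.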
cluster_formation.k8s.address_type = hostname
cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
cluster_formation.node_cleanup.interval = 10
cluster_formation.node_cleanup.only_log_warning = true
cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
cluster_partition_handling = autoheal
listeners.tcp.1 = 0.0.0.0:5672
log.console.level = debug
loopback_users.guest = false
management.load_definitions = /var/lib/rabbitmq/definitions.json
net_ticktime = 5
queue_master_locator = min-masters
+----------+-----------------+----------+------------------------+-----------------------+----------+
| vhost    | name            | apply-to | definition             | pattern               | priority |
+----------+-----------------+----------+------------------------+-----------------------+----------+
| cinder   | ha_ttl_cinder   | all      | {"message-ttl": 70000} | ^(?!(amq\.|reply_)).* | 0        |
| glance   | ha_ttl_glance   | all      | {"message-ttl": 70000} | ^(?!(amq\.|reply_)).* | 0        |
| keystone | ha_ttl_keystone | all      | {"message-ttl": 70000} | ^(?!(amq\.|reply_)).* | 0        |
| neutron  | ha_ttl_neutron  | all      | {"message-ttl": 70000} | ^(?!(amq\.|reply_)).* | 0        |
| nova     | ha_ttl_nova     | all      | {"message-ttl": 70000} | ^(?!(amq\.|reply_)).* | 0        |
+----------+-----------------+----------+------------------------+-----------------------+----------+
[DEFAULT]
amqp_auto_delete = false
amqp_durable_queues = true
I am happy to provide more information if what is listed is not enough. Thank you for the help.
--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/rabbitmq-users/da577dc0-9c93-4760-b0c3-df68cc3fe9b2%40googlegroups.com.
The least efficient part of queue deletion is binding deletion, because there is currently no way to load the bindings of a queue (or exchange) without a full scan of one of the binding tables. This unfortunate design can be mitigated with secondary indices, but that would be a breaking schema change and therefore can only go into 3.8.0.
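As a rough illustration of the cost difference described above, the sketch below contrasts deleting a queue's bindings by scanning a flat table with deleting them through a per-queue secondary index. It is a plain Python model under stated assumptions, not how RabbitMQ actually stores bindings. The names `delete_queue_full_scan`, `IndexedBindings`, `topic_x` and `q42` are hypothetical, and the 4000-queue figure is taken from the production scale mentioned earlier in the thread.

```python
from collections import defaultdict


def delete_queue_full_scan(rows, queue):
    """Delete all bindings of `queue` by visiting every row in the table."""
    doomed = {row for row in rows if row[2] == queue}
    rows.difference_update(doomed)
    return len(doomed)


class IndexedBindings:
    """Binding table with a secondary index keyed by queue name."""

    def __init__(self):
        self.rows = set()                  # (exchange, routing_key, queue)
        self.by_queue = defaultdict(set)   # queue -> rows that mention it

    def add(self, exchange, routing_key, queue):
        row = (exchange, routing_key, queue)
        self.rows.add(row)
        self.by_queue[queue].add(row)

    def delete_queue(self, queue):
        """Delete all bindings of `queue` by touching only its own rows."""
        doomed = self.by_queue.pop(queue, set())
        self.rows.difference_update(doomed)
        return len(doomed)


QUEUES = 4000
flat = {("topic_x", f"q{i}", f"q{i}") for i in range(QUEUES)}
indexed = IndexedBindings()
for exchange, key, queue in flat:
    indexed.add(exchange, key, queue)

removed_scan = delete_queue_full_scan(flat, "q42")   # inspects all 4000 rows
removed_index = indexed.delete_queue("q42")          # inspects 1 row
print(removed_scan, removed_index, len(flat), len(indexed.rows))
```

Both paths remove the same rows; the difference is that the scan's work grows with the total number of bindings, while the indexed lookup grows only with the bindings of the queue being deleted. The index has to be kept in sync on every bind and unbind, which is the kind of schema change the reply refers to.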