Question on queue_leader_locator


AK

Oct 31, 2023, 4:07:11 AM
to rabbitmq-users
Hi,

We have a 5-node RabbitMQ (3.11.2) cluster with queue_leader_locator set to balanced. According to the documentation, queue leaders should be balanced across the nodes as queues are declared. However, when queues are declared in bulk, we observe that most queue leaders end up on a single node, so the distribution is skewed.

The documentation at https://www.rabbitmq.com/quorum-queues.html#leader-placement says:
  • balanced: If there are overall less than 1000 queues (classic queues, quorum queues, and streams), pick the node hosting the minimum number of quorum queue leaders. If there are overall more than 1000 queues, pick a random node.

What exactly does the above mean? Will the first 1000 queues all go to the same node if they are declared at the same time?

Regards,
Arati

Michal Kuratczyk

Oct 31, 2023, 4:31:56 AM
to rabbitm...@googlegroups.com
If a queue is declared while there are fewer than 1000 queues in the cluster,
we perform an exact check/comparison and place the leader on the node with the fewest leaders.
If there are more than 1000 queues, a random node is picked, because that's a much faster operation than
counting leaders on all nodes, and at this scale we assume it doesn't really matter (random still provides a relatively
even distribution). The history of "balanced" is that the "least-leaders" strategy we used to have
didn't scale well: for example, importing 10000 queues into a cluster would get very slow over time, since for every new
queue we had to count the exact number of queue leaders on each node. The "random" strategy, on the other hand,
doesn't work well with a low number of queues (with just a handful of queues, the distribution can be very skewed).
Hence, we implemented a mixed "balanced" strategy, which provides a perfect distribution for a relatively small
number of queues and a fairly even distribution for a large number of queues, while providing much better performance:
O(1) instead of O(N) for N > 1000.
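
The rule described above can be sketched in a few lines of Python. This is only an illustrative model, not RabbitMQ's actual implementation: the function and variable names are made up, every queue is assumed to be a quorum queue, declarations are assumed to arrive strictly one at a time, and a count of exactly 1000 is treated as "random" because the description leaves that case open.

```python
import random

# Threshold taken from the description above; the handling of exactly
# 1000 queues is an assumption made for this sketch.
QUEUE_COUNT_THRESHOLD = 1000


def pick_leader_node(nodes, leaders_per_node, total_queues, rng):
    """Pick the node for the next queue leader (hypothetical helper)."""
    if total_queues < QUEUE_COUNT_THRESHOLD:
        # Exact comparison: choose the node with the fewest leaders, O(N).
        return min(nodes, key=lambda node: leaders_per_node[node])
    # Large clusters: skip the counting and pick any node, O(1).
    return rng.choice(nodes)


def simulate(nodes, new_queues, seed=0):
    """Declare queues one after another and return leaders per node."""
    rng = random.Random(seed)
    leaders = {node: 0 for node in nodes}
    for total_queues in range(new_queues):
        node = pick_leader_node(nodes, leaders, total_queues, rng)
        leaders[node] += 1
    return leaders


if __name__ == "__main__":
    print(simulate(["n1", "n2", "n3", "n4", "n5"], 500))
    print(simulate(["n1", "n2", "n3", "n4", "n5"], 5000))
```

Because the sketch counts leaders before every sequential declaration, it always produces an even split below the threshold. It says nothing about what happens when many declarations are in flight concurrently.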

If this is not the behaviour you see, please provide an executable test case.

If you think the docs are not clear enough - please PR a suggested edit.



--
Michał
RabbitMQ team

AK

Oct 31, 2023, 7:46:10 AM
to rabbitmq-users
We have a 5-node cluster running version 3.11.2, with queue-leader-locator set to balanced in the config file on all nodes.
We had 276 queues balanced across the 5 nodes (56 on each of 3 nodes, 54 on each of 2 nodes).
When the app that uses this cluster starts, it declares 269 new queues, after which the queue distribution looks as follows:
1st node: 81 queues
2nd node: 59 queues
3rd node: 59 queues
4th node: 67 queues
5th node: 279 queues

Triggering a rebalance manually distributed the queues evenly. But even though balanced mode was set and the queues were balanced, the distribution became skewed again when the app (amqp-client 5.7.3) declared new queues.
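
As a sanity check on the figures above, here is a minimal Python sketch that adds them up and compares them with an even split. The labels `node1` to `node5` and the function name are placeholders chosen for this sketch, and the counts are copied from the message as reported.

```python
# Figures copied from the report above; node labels are placeholders.
QUEUES_BEFORE = 276
QUEUES_DECLARED = 269
MAX_PER_NODE_BEFORE = 56  # the report says 56 on 3 nodes, 54 on 2 nodes
LEADERS_AFTER = {"node1": 81, "node2": 59, "node3": 59, "node4": 67, "node5": 279}


def summarize(leaders_after, before, declared, max_before):
    """Compare the reported distribution with an even split."""
    total = sum(leaders_after.values())
    busiest = max(leaders_after, key=leaders_after.get)
    return {
        "total": total,
        "consistent_with_report": total == before + declared,
        "even_share_per_node": total / len(leaders_after),
        "busiest_node": busiest,
        "busiest_fraction": leaders_after[busiest] / total,
        # The busiest node held at most `max_before` queues beforehand,
        # so it gained at least this many of the newly declared ones.
        "min_gain_on_busiest": leaders_after[busiest] - max_before,
    }


if __name__ == "__main__":
    for key, value in summarize(
        LEADERS_AFTER, QUEUES_BEFORE, QUEUES_DECLARED, MAX_PER_NODE_BEFORE
    ).items():
        print(f"{key}: {value}")
```

The totals agree (276 + 269 = 545). An even split would be 109 queues per node, whereas the fifth node holds 279, which is a little over half of all queues and means at least 223 of the 269 new queues landed there.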

Michal Kuratczyk

Oct 31, 2023, 8:28:56 AM
to rabbitm...@googlegroups.com
Again, if this is not the behaviour you see, please provide an executable test case.



--
Michał
RabbitMQ team

AK

Oct 31, 2023, 10:16:47 AM
to rabbitmq-users
Hi Michal,

Can you give me an example of an executable test case?

Regards,
Arati

Michal Kuratczyk

Oct 31, 2023, 10:37:08 AM
to rabbitm...@googlegroups.com
Something I can just execute against a freshly deployed cluster, ideally by simply copy-pasting your commands.
Whether that's a bunch of `rabbitmqctl` / `rabbitmqadmin` or similar commands,
or https://perftest.rabbitmq.com/ or a custom app you develop (sometimes this is necessary,
but should be avoided if possible, as it adds steps for me to understand how to build it / run it
and introduces a risk of client-side problems).

Here's a surprisingly related example:
In this case it was an internal report, so I use bazel to start a development node and I assume people
know where to find our sample definition files (I refer to a file from https://github.com/rabbitmq/sample-configs/)
but overall, someone can copy-paste this and see the problem and then re-run the steps to test
the fix. As an additional benefit, creating such steps often leads to enlightenment where you realize
what you did wrong (I'm not suggesting you did anything wrong here, just in general).
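
For a case like the one in this thread, the client-side part of such a test could be as small as the following Python sketch. It is not the example referred to above. Everything in it is an assumption made for illustration: the helper name, the queue name prefix, the choice of quorum queues, and the `RecordingChannel` stand-in, which only mimics the `queue_declare` method of an AMQP channel so the sketch runs without a broker.

```python
class RecordingChannel:
    """Stand-in for an AMQP channel that records declarations (hypothetical)."""

    def __init__(self):
        self.declared = []

    def queue_declare(self, queue, durable=False, arguments=None):
        self.declared.append(
            {"queue": queue, "durable": durable, "arguments": arguments}
        )


def declare_quorum_queues(channel, count, prefix="repro-qq-"):
    """Declare `count` durable quorum queues on `channel`, one after another."""
    names = []
    for index in range(count):
        name = f"{prefix}{index}"
        channel.queue_declare(
            queue=name,
            durable=True,  # quorum queues have to be durable
            arguments={"x-queue-type": "quorum"},
        )
        names.append(name)
    return names


# Dry run against the stand-in channel; 269 matches the number of
# queues the app in this thread declares on startup.
dry_run_channel = RecordingChannel()
declared_names = declare_quorum_queues(dry_run_channel, 269)
print(len(declared_names), "queues declared, first:", declared_names[0])
```

Against a real cluster, `channel` would come from an AMQP client instead, for example `pika.BlockingConnection(pika.ConnectionParameters(host=...)).channel()` with the pika library. The leader placement would then be inspected with the usual CLI or management tools and compared with the distribution reported earlier.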




--
Michał

AK

Oct 31, 2023, 10:40:17 AM
to rabbitmq-users
Understood Michal. Thanks.
I will try and reproduce this issue at my end and get back with the exact steps.