It's an interesting one and certainly requires a bit of trial and error to find the best approach. Personally, I'd expect multiple queues to handle this use case best. Since these are tasks, destructive consumption is the better fit - once performed, a task should no longer be in the queue (with a stream, it would remain there).
A single very long quorum queue is indeed not great, due to slow quorum queue recovery (e.g. after a node restart) and other issues.
Given no ordering requirements and no strict latency requirements (I mean that it's ok if a task takes a bit of time to execute),
I'd expect a random exchange with a bunch of classic queues (v2 of course ;) ) to work well (there's a sketch after this list):
* classic queues v2 can handle very long queues quite well, but with the messages split across many queues, they wouldn't even need to
* if a node hosting a given queue is unavailable for some time, the tasks in that queue will take longer to execute, but that shouldn't be a huge problem
* a local random exchange (https://github.com/rabbitmq/rabbitmq-server/pull/8334, discussed in the video) could further improve this solution by making sure that messages are always published locally (on the node the publishing connection is on) and, with a bit of work, that they are consumed locally as well
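To make the topology concrete, here's a minimal sketch in Python with pika. It assumes the rabbitmq_random_exchange plugin (exchange type x-random) is enabled; the exchange name, queue names and queue count are made up for illustration (the x-local-random type from the PR above would slot in the same way once available):

```python
# Minimal sketch of the random-exchange fan-out described above.
# Assumes the rabbitmq_random_exchange plugin is enabled:
#   rabbitmq-plugins enable rabbitmq_random_exchange
# Names and the queue count are illustrative, not a recommendation.
import pika

QUEUES_PER_NODE = 4  # "a few per node" - tune by trial and error

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# A random exchange routes each published message to exactly one bound
# queue, chosen at random, so the backlog is spread across all queues.
channel.exchange_declare(exchange="tasks", exchange_type="x-random", durable=True)

for i in range(QUEUES_PER_NODE):
    # Classic queue v2 is selected with the x-queue-version argument.
    channel.queue_declare(
        queue=f"tasks-{i}",
        durable=True,
        arguments={"x-queue-version": 2},
    )
    # The random exchange ignores the routing key when picking a queue.
    channel.queue_bind(queue=f"tasks-{i}", exchange="tasks", routing_key="")

# Publish a task; delivery_mode=2 makes the message persistent.
channel.basic_publish(
    exchange="tasks",
    routing_key="",
    body=b'{"task": "resize-image", "id": 123}',
    properties=pika.BasicProperties(delivery_mode=2),
)
connection.close()
```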
As for how many queues - that'd certainly require trial and error, but I'd expect the number to be a few per node, not more.
If we split even 100M messages between 3 nodes and then further into a few queues per node, that'd be a few million messages per queue.
This should be handled quite well with classic queues (or even quorum queues, but there would be replication overhead for sure).
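On the consuming side, manual acks give you the destructive, at-least-once task semantics mentioned at the top: a task is removed from its queue only after it has actually been performed. A sketch, again with made-up names:

```python
# Worker sketch: consume tasks with manual acks so a task is deleted
# from the queue only once the work is done (at-least-once semantics).
import pika

def perform_task(body):
    ...  # placeholder for the actual task execution logic

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Keep the prefetch low so long-running tasks don't pile up on one worker.
channel.basic_qos(prefetch_count=10)

def handle_task(ch, method, properties, body):
    perform_task(body)
    # Ack only after the work is done - this is the destructive consumption.
    ch.basic_ack(delivery_tag=method.delivery_tag)

# One worker can consume from several of the queues declared earlier.
for i in (0, 1):
    channel.basic_consume(queue=f"tasks-{i}", on_message_callback=handle_task)

channel.start_consuming()
```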
Lastly, keep in mind that classic queues (both v1 and v2) have two storage mechanisms: messages below a size threshold are embedded in the queue index, while larger messages go to a shared message store.
So this is another dimension worth testing/investigating: there could be a significant difference depending on whether your messages fall below that threshold or not (changing the threshold itself, via queue_index_embed_msgs_below, is an alternative).
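For reference, the threshold is set node-wide in rabbitmq.conf (4096 bytes is the default, if I remember correctly):

```
# rabbitmq.conf
# Messages up to this many bytes are embedded in the queue index;
# larger messages go to the shared message store.
queue_index_embed_msgs_below = 4096
```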
Best,