Slow confirms when using multiple queues

Dong Young Kim

Aug 7, 2015, 3:24:37 PM
to rabbitmq-users
I have a fanout exchange with at least 1000 queues bound to it (each consumer has its own queue). I also enabled confirms, so a publisher receives a confirm once the message has been written to disk (the messages are persistent). I also need to publish messages in order, so I cannot spawn a separate thread to wait for confirms. I need to publish synchronously (publish, confirm, publish, confirm, etc.).

What I've noticed is that as the number of queues increases, the time to confirm increases by a significant amount. According to https://www.rabbitmq.com/tutorials/amqp-concepts.html, the broker copies each message into each queue it is routed to. So if you have 1000 queues, the broker will copy the message 1000 times before sending an acknowledgement back. This takes a long time and will make it impractical to use in our application as the number of queues grows to 2000, 3000, and more.

First, can someone confirm that the broker is indeed copying the messages? If this is true, why doesn't the broker just keep a reference count and do some book-keeping to write the message only once to disk and figure out which queues this message belongs to? If not, how does it work?

Second, is there a workaround for this problem? One solution I thought of is to have an exchange publish to multiple other exchanges (2nd level exchanges, if you will), with each of these 2nd level exchanges finally writing to the queues. Basically, the broker would be writing to disk in parallel. I am not sure how confirms would work in this case, though.


I'd appreciate any response. Thanks!

Michael Klishin

Aug 7, 2015, 3:42:02 PM
to rabbitm...@googlegroups.com, Dong Young Kim
On 7 Aug 2015 at 22:24:40, Dong Young Kim (dongyou...@gmail.com) wrote:
> I have a fanout exchange and there are at least 1000 queues bound
> to this exchange (each consumer has its own queue). I also enabled
> confirms so a publisher will receive a confirm when the message
> has been written to disk (persistent message). Also, I need to
> publish messages in order so I cannot spawn a separate thread
> to wait for confirms. I need to publish synchronously (publish,
> confirm, publish, confirm, etc.).

I know nothing about your application but I don’t see why you
need to wait for a confirm for every single message. That’s the least
efficient way of doing it, and it will be slow regardless of how many
queues a message is routed to.

> What I've noticed is that as the number of queues increases, the
> time to confirm increases by a significant amount. According
> to https://www.rabbitmq.com/tutorials/amqp-concepts.html,
> the broker copies each message into the queue. So if you have 1000
> queues, the broker will copy the message 1000 times before sending
> an acknowledgement back. This takes a long time and will make
> it impractical to use in our application as the number of queues
> increase to 2000, 3000, and more.
>
> First, can someone confirm that the broker is indeed copying
> the messages?

Yes, every queue gets a semantic copy.

> If this is true, why doesn't the broker just keep
> a reference count and do some book-keeping to write the message
> only once to disk and figure out which queues this message belongs
> to? If not, how does it work?

There is reference counting going on at multiple levels, primarily in the runtime,
which does not store the same binary [which is a separate data type in Erlang]
N times.

You still need to update queue indices N times, and that happens sequentially.

> Second, is there a workaround to this problem? One solution I
> thought of is have an exchange publish to multiple other exchanges
> (2nd level exchanges if you will), and each of these 2nd level
> exchanges will finally write to the queue. Basically, the broker
> is writing to disk in parallel. I am not sure how confirms would
> work in this case, though.

This assumes that exchanges do the routing and N exchanges would do it in parallel.
That’s not the case: exchanges are just named routing tables. Channel processes perform
routing. You can use multiple channels for publishing and get better parallelism that way.

Don’t wait for every single confirm before publishing the next message; use confirm listeners
(callbacks) instead. That will make a bigger difference than using multiple channels.
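
For readers following along, the bookkeeping behind confirm listeners can be sketched without a broker. This is a minimal illustration, not the code of any RabbitMQ client: `ConfirmTracker` and its method names are hypothetical. It relies only on two protocol facts: confirms number messages per channel starting at 1, and an ack or nack may carry a `multiple` flag that settles every outstanding message up to and including the given delivery tag.

```python
from typing import Dict, List


class ConfirmTracker:
    """Hypothetical helper: tracks published-but-unconfirmed messages."""

    def __init__(self) -> None:
        self.next_seq = 1                        # publish sequence numbers start at 1
        self.outstanding: Dict[int, bytes] = {}  # seq -> body awaiting a confirm
        self.nacked: List[bytes] = []            # bodies the broker rejected

    def record_publish(self, body: bytes) -> int:
        """Call right after publishing; returns the message's sequence number."""
        seq = self.next_seq
        self.outstanding[seq] = body
        self.next_seq += 1
        return seq

    def _settled_tags(self, delivery_tag: int, multiple: bool) -> List[int]:
        # With multiple=True, every outstanding tag <= delivery_tag is settled.
        if multiple:
            return sorted(t for t in self.outstanding if t <= delivery_tag)
        return [delivery_tag] if delivery_tag in self.outstanding else []

    def handle_ack(self, delivery_tag: int, multiple: bool = False) -> None:
        """Listener callback for a positive confirm."""
        for tag in self._settled_tags(delivery_tag, multiple):
            del self.outstanding[tag]

    def handle_nack(self, delivery_tag: int, multiple: bool = False) -> None:
        """Listener callback for a negative confirm; keep bodies for retry."""
        for tag in self._settled_tags(delivery_tag, multiple):
            self.nacked.append(self.outstanding.pop(tag))


tracker = ConfirmTracker()
for payload in (b"m1", b"m2", b"m3", b"m4", b"m5"):
    tracker.record_publish(payload)      # publish without waiting in between

tracker.handle_ack(3, multiple=True)     # one ack settles messages 1..3
tracker.handle_nack(4)                   # message 4 was not accepted
```

The publisher keeps sending while the callbacks settle messages in the background, so a single confirm can cover many publishes instead of costing one round trip each.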
--
MK

Staff Software Engineer, Pivotal/RabbitMQ


Dong Young Kim

Aug 7, 2015, 4:01:34 PM
to rabbitmq-users, dongyou...@gmail.com
Hi Michael, thanks for your reply.

My application needs to publish in order. I need to know that the previous message has been successfully delivered before I can publish the next one. This is because I am guaranteeing to the consumer that the order in which the publisher publishes messages is the same as the order in which the consumer consumes them. By using confirm listeners, I might publish the next message before realizing that my previous message wasn't successfully delivered, which will cause the consumer to consume out of order.

How would I publish to one exchange using multiple channels to achieve any kind of parallelism? Wouldn't publishing on two different channels ultimately duplicate the message in the queues?

Thanks.

Michael Klishin

Aug 7, 2015, 4:08:04 PM
to rabbitm...@googlegroups.com, Dong Young Kim
On 7 Aug 2015 at 23:01:38, Dong Young Kim (dongyou...@gmail.com) wrote:
> How would I publish to one exchange using multiple channels
> to achieve any kind of parallelism?

Channels are logical connections multiplexed over a TCP connection. Each channel is a runtime
process (think of it as a lightweight thread). They can and will be executed concurrently and in
parallel if your system has enough cores for that.

> Wouldn't publishing on two
> different channels ultimately duplicate the message in the
> queues?

How would publishing using N channels duplicate anything? You’d publish message 1 on channel 1,
message 2 on channel 2, message 3 on channel 3, message 4 on channel 1, etc.

What that may lead to is natural race conditions between channels delivering messages to queues,
which is probably a big deal for your case.
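
The round-robin assignment described above can be sketched as a plain function. `assign_round_robin` is a hypothetical name, and channels are represented only by their numbers; a real publisher would call its client library's publish method on the chosen channel instead of appending to a list.

```python
from itertools import cycle
from typing import Dict, Iterable, List


def assign_round_robin(messages: Iterable[str], channel_count: int) -> Dict[int, List[str]]:
    """Spread messages over channels 1..channel_count in rotation."""
    if channel_count < 1:
        raise ValueError("channel_count must be at least 1")
    per_channel: Dict[int, List[str]] = {ch: [] for ch in range(1, channel_count + 1)}
    # Message 1 -> channel 1, message 2 -> channel 2, ..., then wrap around.
    for channel, message in zip(cycle(range(1, channel_count + 1)), messages):
        per_channel[channel].append(message)
    return per_channel


plan = assign_round_robin(["message 1", "message 2", "message 3", "message 4"], 3)
```

Each message is published exactly once, so nothing is duplicated. Order is preserved within a single channel but not across channels, which is the race mentioned above.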

Strict ordering requirements and parallelism typically don’t go well together. Either eliminate
that requirement or timestamp your messages and make your consumer use timestamps to recreate
the original order. Otherwise, you will have much lower throughput with any number of queues,
cores, and so on.
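
Consumer-side reordering could look roughly like the sketch below. One swap to note: it uses a publisher-assigned sequence number (consecutive integers) rather than a wall-clock timestamp, because consecutive numbers let the consumer tell when a message is still missing. `Resequencer` is a hypothetical name, and how the number travels with the message (for example, in a header) is left open.

```python
from typing import Dict, List


class Resequencer:
    """Hypothetical consumer-side buffer that releases messages in publish order."""

    def __init__(self, first_seq: int = 1) -> None:
        self.expected = first_seq          # next sequence number to release
        self.pending: Dict[int, str] = {}  # early arrivals, keyed by sequence number

    def accept(self, seq: int, body: str) -> List[str]:
        """Buffer one delivery and return every message that is now in order."""
        if seq < self.expected:
            return []                      # already released: drop the duplicate
        self.pending[seq] = body
        ready: List[str] = []
        # Release the contiguous run starting at the expected number.
        while self.expected in self.pending:
            ready.append(self.pending.pop(self.expected))
            self.expected += 1
        return ready


reorder = Resequencer()
early = reorder.accept(2, "second")        # held back: message 1 has not arrived
caught_up = reorder.accept(1, "first")     # releases both, in publish order
```

The buffer grows while a gap is open, so a real consumer would also need a policy for messages that never arrive.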