On 7 Aug 2015 at 22:24:40, Dong Young Kim (dongyou...@gmail.com) wrote:
> I have a fanout exchange and there are at least 1000 queues bound
> to this exchange (each consumer has its own queue). I also enabled
> confirms so a publisher will receive a confirm when the message
> has been written to disk (persistent message). Also, I need to
> publish messages in order so I cannot spawn a separate thread
> to wait for confirms. I need to publish synchronously (publish,
> confirm, publish, confirm, etc.).
I know nothing about your application but I don’t see why you
need to wait for a confirm for every single message. That’s the least
efficient way of doing it, and it will be slow regardless of how many
queues a message is routed to.
> What I've noticed is that as the number of queues increases, the
> time to confirm increases by a significant amount. According
> to https://www.rabbitmq.com/tutorials/amqp-concepts.html,
> the broker copies each message into the queue. So if you have 1000
> queues, the broker will copy the message 1000 times before sending
> an acknowledgement back. This takes a long time and will make
> it impractical to use in our application as the number of queues
> increases to 2000, 3000, and more.
>
> First, can someone confirm that the broker is indeed copying
> the messages?
Yes, every queue gets a semantic copy.
> If this is true, why doesn't the broker just keep
> a reference count and do some book-keeping to write the message
> only once to disk and figure out which queues this message belongs
> to? If not, how does it work?
There is reference counting going on at multiple levels, primarily in the runtime,
which does not store the same binary (binaries are a separate data type in Erlang)
N times.
You still need to update queue indices N times, and that happens sequentially.
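
To make the distinction concrete, here is a toy Python model of the idea. It is
only an illustration, not how RabbitMQ's message store or queue index is actually
implemented; ToyBroker, publish_fanout and consume are hypothetical names made up
for this sketch. The body is stored once with a reference count, while each of the
N bound queues still gets its own index entry, written one after another.

```python
from collections import deque


class ToyBroker:
    """Toy model only: NOT RabbitMQ's real message store or queue index."""

    def __init__(self, queue_names):
        # msg_id -> [body, refcount]; the body is held exactly once.
        self.store = {}
        # One index (here just a deque of message ids) per bound queue.
        self.queues = {name: deque() for name in queue_names}

    def publish_fanout(self, msg_id, body):
        """Route one message to every queue; return the number of index updates."""
        if not self.queues:
            return 0  # nothing bound, so nothing to store
        self.store[msg_id] = [body, 0]
        # The index updates happen one after another: N queues, N updates.
        for index in self.queues.values():
            index.append(msg_id)
            self.store[msg_id][1] += 1
        return len(self.queues)

    def consume(self, queue_name):
        """Pop the next message from one queue and release its reference."""
        msg_id = self.queues[queue_name].popleft()
        entry = self.store[msg_id]
        entry[1] -= 1
        if entry[1] == 0:
            del self.store[msg_id]  # last reference gone, drop the body
        return entry[0]


broker = ToyBroker([f"q{i}" for i in range(1000)])
updates = broker.publish_fanout(1, b"payload")
# One stored body, but 1000 sequential index updates before a confirm could be sent.
```

In this model the cost that grows with the number of queues is the loop over the
indices, not the storage of the body.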
> Second, is there a workaround to this problem? One solution I
> thought of is have an exchange publish to multiple other exchanges
> (2nd level exchanges if you will), and each of these 2nd level
> exchanges will finally write to the queue. Basically, the broker
> is writing to disk in parallel. I am not sure how confirms would
> work in this case, though.
This assumes that exchanges do the routing and N exchanges would do it in parallel.
That’s not the case: exchanges are just named routing tables. Channel processes perform
routing. You can use multiple channels for publishing and get better parallelism that way.
Don’t wait for every single confirm before publishing the next message; use confirm listeners
(callbacks) instead. That will make a bigger difference than using multiple channels.
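
As a rough sketch of the bookkeeping a confirm listener needs, here is a
broker-free Python example. ConfirmTracker and its methods are hypothetical names;
on_ack and on_nack stand in for the client callbacks (in the Java client,
ConfirmListener's handleAck and handleNack, which receive a delivery tag and a
"multiple" flag). It assumes sequence numbers start at 1 on the channel and that
"multiple" settles every outstanding message up to and including the given number.

```python
import bisect


class ConfirmTracker:
    """Tracks unconfirmed publishes for one channel (hypothetical helper)."""

    def __init__(self):
        self.next_seq = 1      # assumed: first publish in confirm mode is 1
        self.outstanding = []  # ascending list of unconfirmed sequence numbers
        self.nacked = []       # sequence numbers the broker rejected

    def on_publish(self):
        """Call right after each publish; returns that message's sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        self.outstanding.append(seq)  # appended in order, so the list stays sorted
        return seq

    def _settle(self, seq, multiple):
        if multiple:
            # Everything up to and including seq is settled at once.
            cut = bisect.bisect_right(self.outstanding, seq)
            settled = self.outstanding[:cut]
            del self.outstanding[:cut]
            return settled
        i = bisect.bisect_left(self.outstanding, seq)
        if i < len(self.outstanding) and self.outstanding[i] == seq:
            return [self.outstanding.pop(i)]
        return []  # unknown or already settled

    def on_ack(self, seq, multiple=False):
        return self._settle(seq, multiple)

    def on_nack(self, seq, multiple=False):
        settled = self._settle(seq, multiple)
        self.nacked.extend(settled)
        return settled


tracker = ConfirmTracker()
for _ in range(5):
    tracker.on_publish()          # publish 1..5 back to back, no waiting
tracker.on_ack(3, multiple=True)  # one confirm settles 1, 2 and 3
tracker.on_nack(4)                # 4 was rejected; 5 is still outstanding
```

Messages are still published in order on the one channel; the publisher just stops
blocking on each confirm and decides later what to do with anything left in
outstanding or recorded in nacked.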
--
MK
Staff Software Engineer, Pivotal/RabbitMQ