Stalls between prefetches in RabbitMQ


Roy Reznik

Jun 22, 2017, 1:25:35 PM
to rabbitmq-users
Hi,

We're seeing very large stalls between prefetches in our consumer.
We see that we're handling a batch of messages very quickly; then the thread that handled the messages waits for a new task (probably because it came from a pool), and a very long time (tens of seconds, sometimes even 90 seconds) passes before a new thread starts handling another batch.
We're using Java, RabbitMQ 3.6.10 & Erlang 19.3.6.
We can try to increase the prefetch count, but why does it take so long to fetch a batch?
Our current prefetch count is 250.
Message size is ~2K.
Our RabbitMQ node has plenty of free CPU, RAM, IO and network bandwidth.
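
As a back-of-envelope check on the numbers above, the prefetch (QoS) setting bounds how much data can sit unacknowledged per consumer at any moment. This is a sketch; the 250/2 KB figures are simply the ones quoted in this thread:

```java
// Back-of-envelope: upper bound on unacknowledged data per consumer
// implied by the QoS (prefetch) settings mentioned above.
public class PrefetchWindow {
    static long inFlightBytes(int prefetchCount, int avgMessageBytes) {
        return (long) prefetchCount * avgMessageBytes;
    }

    public static void main(String[] args) {
        // Figures from this thread: prefetch 250, ~2 KB messages.
        System.out.println(inFlightBytes(250, 2048) + " bytes unacked per consumer");
    }
}
```

So each consumer can have roughly half a megabyte in flight, which is small relative to the stall durations described.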

Thanks,
Roy.

Michael Klishin

Jun 22, 2017, 1:48:36 PM
to rabbitm...@googlegroups.com
Use Wireshark to see when exactly messages are sent (as well as acknowledgements).

RabbitMQ does not deliver batches; it delivers messages one by one, when prefetch allows for it and the queue process is not busy with something else, such as moving messages to disk or garbage collection.
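
To make the "one by one, as prefetch allows" behaviour concrete, here is a toy model (illustrative only, not the broker's actual implementation): the queue hands out one message at a time and never lets more than `prefetch` deliveries be unacknowledged at once.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model of QoS-limited delivery. Real queue processes also
// interleave disk paging and GC, which this sketch omits.
public class QosModel {
    private final int prefetch;
    private int unacked = 0;
    private final Queue<String> ready = new ArrayDeque<>();

    QosModel(int prefetch) { this.prefetch = prefetch; }

    void publish(String msg) { ready.add(msg); }

    // Returns the next message, or null if the prefetch window is full
    // (the queue must wait for an ack before delivering more).
    String deliver() {
        if (unacked >= prefetch || ready.isEmpty()) return null;
        unacked++;
        return ready.poll();
    }

    void ack() { if (unacked > 0) unacked--; }

    int unackedCount() { return unacked; }
}
```

A slow consumer that sits on acknowledgements therefore stalls its own deliveries, even when the queue itself is idle.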

Message breakdown for individual queues, GC activity for individual queues as well as the node as a whole, consumer utilisation, and effective QoS can all be inspected via the management UI. Those values
help in reasoning about the system.

The Java client provides metrics as well, starting with version 4.0.

Collecting data and metrics is step number 1 for any performance evaluation.

--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
To post to this group, send email to rabbitm...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Staff Software Engineer, Pivotal/RabbitMQ

Michael Klishin

Jun 22, 2017, 1:52:06 PM
to rabbitm...@googlegroups.com
Here's a blog post that explains what Consumer Utilisation is in the UI and how it can be used
to reason about prefetch and consumer throughput (without any consumer application metrics, at least):

Here's how Java client metrics can be accessed, e.g. via JMX:


Roy Reznik

Jun 22, 2017, 1:59:02 PM
to rabbitmq-users

Hi Michael,

We see in Wireshark that basicGet doesn't return for tens of seconds.
Consumer utilization looks like a sine function (ups and downs).
Queue metrics look like this:




Michael Klishin

Jun 22, 2017, 1:59:14 PM
to rabbitm...@googlegroups.com
If I had to investigate a "slower than expected" JVM-based consumer, I'd use VisualVM or
similar to profile the app and see what that may reveal about Consumer interface implementations.

If there's evidence that consumer utilisation is high (not necessarily 100% but
in that ballpark),
see CPU Utilisation and Parallelism Considerations in http://www.rabbitmq.com/queues.html.

Existing rabbitmq-users thread with some very generic advice about
throughput:



Michael Klishin

Jun 22, 2017, 2:04:11 PM
to rabbitm...@googlegroups.com
basic.get is polling; there is a roundtrip for each message.
QoS prefetch only really makes sense for basic.consume (which
you definitely should use if throughput and latency matter).

With basic.consume there is no roundtrip, messages are "pushed" to the consumer
as QoS prefetch allows.

See "Receiving Messages by Subscription ("Push API")" in http://www.rabbitmq.com/api-guide.html.

According to the chart, the queue isn't overloaded. "Reductions" is an abstract VM unit,
so 150K isn't necessarily a high or low number; it's primarily useful for comparing how much
CPU and scheduler time different processes (queues, channels, connections) consume relative to each other.

Roy Reznik

Jun 22, 2017, 2:04:30 PM
to rabbitmq-users
We're pretty certain the problem lies in the server and not the client, which is why we haven't debugged it like this.
It looks like the server simply "halts" every once in a while.
We run all sorts of queue listing, status, etc. with rabbitmqctl — could this be holding locks of some kind?

Roy.

Michael Klishin

Jun 22, 2017, 2:06:36 PM
to rabbitm...@googlegroups.com
The above can be misleading.

basic.get is like HTTP GET: a roundtrip, after which the client sends an ack
(if manual acknowledgements are used). So it takes 2 to 3 "hops" per message,
where the last one is asynchronous.

With basic.consume the client subscribes once and receives deliveries with
optional acknowledgements: http://www.rabbitmq.com/confirms.html.

So it can be as low as 1 "hop", or 2, and both are entirely asynchronous in the protocol.
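
A simplified latency model of the difference (hypothetical numbers; real latency depends on many factors): with basic.get each message pays a full synchronous round trip before the next can be fetched, while basic.consume keeps up to a prefetch window of deliveries in flight concurrently.

```java
// Illustrative model only: basic.get pays one synchronous round trip
// per message; basic.consume pipelines deliveries, so roughly
// n/prefetch "round trip slots" cover n messages.
public class PollVsPush {
    // Time to receive n messages when each needs a synchronous roundtrip.
    static double pollMillis(int messages, double rttMillis) {
        return messages * rttMillis;
    }

    // Time when up to `prefetch` deliveries are in flight concurrently.
    static double pushMillis(int messages, double rttMillis, int prefetch) {
        return Math.ceil((double) messages / prefetch) * rttMillis;
    }
}
```

With a 1 ms round trip, fetching 1,000 messages one by one costs about a second, while a prefetch window of 250 brings the same transfer down to a handful of round trips.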

Michael Klishin

Jun 22, 2017, 2:13:07 PM
to rabbitm...@googlegroups.com
`rabbitmqctl list_queues` does not acquire locks per se, but it does query
the state of all queue processes, which means they all will be competing
for scheduler resources.

Listing via HTTP API uses tables with periodically collected stats, which
in theory should be less intrusive.

Unless you list all queues every few seconds and there are at least 1000s of them,
I'd be surprised if it blocked basic.get for 10 or 90 seconds.

Even if your connection or channel frequently goes into the flow control state
(throttled by its downstream components), tens of seconds is too long.
Only resource-driven (e.g. memory) alarms can throttle for several seconds or more.

I'd suggest switching to basic.consume first.



Roy Reznik

Jun 22, 2017, 2:22:37 PM
to rabbitmq-users
Hi Michael,

Thanks for the answers.
Our solution does use basic.consume; we did basic.get just to see if the queue was working.
I think the fact that basic.get took tens of seconds tells us that something is wrong, and just switching to basic.consume won't solve it.
There's something fundamentally not working here.
From the picture I pasted with the queue metrics, do they make any sense? I don't know what numbers should be expected there.

Thanks
Roy.

Michael Klishin

Jun 22, 2017, 2:33:12 PM
to rabbitm...@googlegroups.com
I mentioned the metric above:

> According to the chart, the queue isn't overloaded. "Reductions" is an abstract VM unit,
> so 150K isn't necessarily a high or low number; it's primarily useful for comparing how much
> CPU and scheduler time different processes (queues, channels, connections) consume relative to each other.

Please do not use basic.get if throughput is of any significance.

Steps I'd take if I were you:

 * Switch to basic.consume
 * Increase QoS or try automatic acknowledgements as an experiment to get consumer utilisation high

Then

 * See message location breakdown (RAM, disk, paged out transient messages) on the queue page
 * See disk activity charts on the same page: are they high?
 * See what queue's GC chart on the same page says (it's in MBs, not abstract units)
 * See node I/O metrics
 * Post details about how many queues are used, how many nodes are involved, how many cores they have
 * Ideally post some kind of diagram that demonstrates how your system operates

I'd suspect that the VM scheduler-to-CPU-core binding used by default can be suboptimal for your case,
but no amount of CPU context switching can take 10 to 90 seconds.

The shortest road to finding what is "fundamentally not working" is through collecting and analysing metrics.



Roy Reznik

Jun 22, 2017, 3:56:47 PM
to rabbitmq-users
Hi Michael, thanks. Yes, I'd like to collect any interesting metrics that will help us analyze the issue.

We're already using basic.consume.
Regarding automatic acknowledgements: durability is important for us, so that is not an option currently.

Here is the GC chart for the entire node:


Roy.

Michael Klishin

Jun 22, 2017, 4:20:09 PM
to rabbitm...@googlegroups.com
This chart alone only tells us that the node is far from idle.

Other metrics mentioned above, in particular for the queue(s) in question,
are more interesting.


Roy Reznik

Jun 22, 2017, 4:42:08 PM
to rabbitmq-users
Here are the queue metrics:




Roy.

Michael Klishin

Jun 22, 2017, 5:05:01 PM
to rabbitm...@googlegroups.com
There are 302 consumers on a single queue with 2M+ messages, at 50% consumer utilisation rate.

While there is no information on the number of channels or connections those consumers are on,
this is asking for

 * rabbitmq-sharding or more queues used
 * Experiments with a higher prefetch (finding the right number is usually a matter of trial and error, with fast consumers and their memory usage monitored you can safely go into 100s)

550 deliveries/second at 2 kB each is 1.1 MB/s, or 8.8 Mbit/s, which is low.
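
The arithmetic behind that figure (assuming 2,000-byte messages, a rounding of the ~2 kB quoted earlier):

```java
// Sanity check on the quoted rate: messages per second times message size.
public class ThroughputCheck {
    static double megabytesPerSecond(int msgsPerSec, int msgBytes) {
        return msgsPerSec * (double) msgBytes / 1_000_000.0;
    }

    static double megabitsPerSecond(int msgsPerSec, int msgBytes) {
        return megabytesPerSecond(msgsPerSec, msgBytes) * 8.0;
    }
}
```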

We are aware of a couple of scenarios where, after a backlog with sudden rate spikes and drops,
a queue can settle into delivering at a low rate; the exact reason is yet to be investigated.

However, the recommendation above still stands. This queue is mostly busy with moving data to disk
and GC. With N queues those activities can be parallelised (to some extent, eventually message store throughput
will become the limiting factor).



Roy Reznik

Jul 5, 2017, 1:46:37 PM
to rabbitmq-users
Eventually, we upgraded from network SSD to local SSD, and that resolved the issue for a while, but it returned after a week or so.
We came to the conclusion that RabbitMQ is simply unable to handle a rate of 10-15K messages/second of ~4 KB each on a server with 220 GB of RAM, 16 cores, and a local SSD.

Michael Klishin

Jul 5, 2017, 2:37:14 PM
to rabbitm...@googlegroups.com
I cannot argue with your conclusions, but we are aware of environments that handle at least that many
messages, an order of magnitude greater in size, with much less RAM and fewer cores.


Michael Klishin

Jul 5, 2017, 2:51:12 PM
to rabbitm...@googlegroups.com
…as well as environments with monster nodes (in the 200+ GB of RAM range, massive network links) that
go well above 100K (message size distribution is unknown).

But use whatever works for you.

Roy Reznik

Jul 5, 2017, 4:59:22 PM
to rabbitmq-users
Hi Michael,

I guess these environments probably don't have the ack/durability requirements we have (the strictest ones), or they are heavily optimized and tweaked (we're using pretty basic settings), or they are using a different version that works better.
We split all of our data across different queues with hundreds of consumers that are independent of each other, and we still see large stalls at the *node* level, which are probably a bug/lock of some kind (this image shows *node*-level message rates):


Like I said, all consumers are independent and therefore it doesn't make any sense that they all "halt" at the same second - the issue *must* be in RabbitMQ.
None of the resources are fully utilized - memory, CPU or IOPS.

Roy.

Michael Klishin

Jul 5, 2017, 5:50:07 PM
to rabbitm...@googlegroups.com
There are pretty much no "node wide locks" in RabbitMQ and even GC is per-process, not stop-the-world.
However, there are resources that are inherently shared by everything on a node: two message stores,
a file handle cache, internal database schema, and a few others. Except for the message stores
and file handle cache, they are rarely on the hot path.

Reviewing the data provided in this thread suggests that there is a single queue with hundreds
of consumers that is expected to deliver throughput higher than X. This is a known anti-pattern.

There is a section on this scenario in the docs (http://www.rabbitmq.com/queues.html) but let me repeat:
a workload based around a single queue is limited to one CPU core on the hot path and will be
affected by said queue's GC pauses. It does not matter how many cores, network bandwidth
or RAM you have: a single queue can only go so fast. RabbitMQ was built with the assumption that
there will be multiple queues in concurrent use.

rabbitmq-sharding can let you trade off ordering for parallelism on a "single logical queue".
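
The core idea can be sketched in a few lines (a hypothetical router, not rabbitmq-sharding's actual partitioning code): hash the routing key to pick one of N physical queues, which preserves per-key ordering but gives up global ordering across the logical queue.

```java
// Illustrative sharding: one logical queue backed by N physical queues.
// Messages with the same routing key always land on the same shard,
// so per-key ordering survives; ordering across keys does not.
public class ShardRouter {
    static String shardQueue(String logicalQueue, String routingKey, int shards) {
        // floorMod keeps the index non-negative for negative hash codes.
        int idx = Math.floorMod(routingKey.hashCode(), shards);
        return logicalQueue + "." + idx;
    }
}
```

Each physical queue then runs as its own Erlang process, so delivery, paging, and GC parallelise across cores.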

I cannot suggest much without access to application code and/or a fair amount of metrics — which in this thread were limited
to a chart at a time — but we hear claims that "RabbitMQ is the problem" every few days and 
in a pretty substantial % of cases they end up
being issues with applications, downstream consumer services, reliance on known anti-patterns,
users ignoring any advice given to them, and so on.

We plan on publishing a few posts about how such issues can be investigated in a manner
that does not rely on gut feeling.




Roy Reznik

Jul 6, 2017, 8:27:09 AM
to rabbitmq-users
Michael,

Like I stated above, we have *hundreds of queues* with *thousands of consumers* (overall, not per queue) - we are *not* using a single queue whatsoever.
When only producers run (consumers are turned off) - they produce data *very* quickly into the node.
When only consumers run (producers are turned off) - they consume data *very* quickly from the node.
When both of them run together, the node periodically halts and behaves as stated above with very, very poor performance.

We are not ignoring advice, we are just trying to understand what are the next steps in order to debug such an issue.
Unfortunately, after reading all the data available on the internet regarding RabbitMQ performance (that we were able to find or were referred to), and also asking here in the forum, we were unable to find any concrete next steps.
If there is any more data that will help to debug this issue and understand why RabbitMQ is behaving so poorly, please let me know.
We also tried to use rabbitmq-top - it seems like most of the queues are "waiting" (!?)




Roy.

Michael Klishin

Jul 6, 2017, 10:24:08 AM
to rabbitm...@googlegroups.com
Sorry, I did not suggest that you specifically were ignoring our advice.
It's not an uncommon thing to see, though.

Thanks for the details, we will investigate it as time permits.


Michael Klishin

Jul 6, 2017, 10:24:26 AM
to rabbitm...@googlegroups.com
This should read "I did not mean to suggest".

Roy Reznik

Jul 6, 2017, 4:36:30 PM
to rabbitmq-users
Thanks.

Is there any way to extract a stack trace of currently running tasks?
We want to try and see what it halts on.

Roy.

Michael Klishin

Jul 6, 2017, 6:31:00 PM
to rabbitm...@googlegroups.com
A current stack trace changes many times a second, so that is not a reliable approach.
Asking every process for its state can be a very expensive operation on production systems.

Erlang profiling tools are all based on VM tracing: dbg, Recon [1], and so on.
Except for dbg, this means a certain library must be on the code path.

Some examples of how the observer app and tracing can be used on remote nodes:


I'd highly recommend experimenting with this on a non-production node first,
e.g. loaded with PerfTest [2].

