Trouble with implementing broker backpressure


Kayla Oliva

Jan 6, 2025, 12:15:35 PM
to Druid User
Hi, I'm having a couple of issues with implementing broker backpressure.

A. What should you observe that tells you it's time to implement backpressure? For context, this mechanism came up as something that could help our cluster's stability during sustained spikes in traffic. However, I don't believe I see any sign of resource contention in the brokers.
[Attachment: A.png]
B. No matter how low (even 1 byte) I set maxQueuedBytes in the query context, the metric query/node/backpressure does not get emitted. Here's the PR where the metric was introduced.
[Attachment: B.png]
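
For readers following along, here is a minimal sketch of passing maxQueuedBytes through the query context, assuming the query goes to the Broker's SQL endpoint (/druid/v2/sql) as a JSON body with a "context" object. The datasource name, the Broker URL, and the helper names build_sql_payload and run_query are hypothetical and only for illustration.

```python
import json
import urllib.request


def build_sql_payload(sql: str, max_queued_bytes: int) -> dict:
    """Build a Druid SQL request body with maxQueuedBytes in the query context."""
    if max_queued_bytes < 0:
        raise ValueError("max_queued_bytes must be non-negative")
    return {"query": sql, "context": {"maxQueuedBytes": max_queued_bytes}}


def run_query(broker_url: str, payload: dict) -> list:
    """POST the payload to the Broker's SQL endpoint and return the parsed JSON."""
    request = urllib.request.Request(
        url=f"{broker_url}/druid/v2/sql",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


# "my_datasource" is a placeholder; substitute a real datasource name.
payload = build_sql_payload('SELECT COUNT(*) AS "cnt" FROM "my_datasource"', 1_000_000)
print(json.dumps(payload, indent=2))

# Against a live cluster (hypothetical host and port):
# rows = run_query("http://localhost:8082", payload)
```

Only the payload is built and printed here; the network call is left commented out so the snippet runs without a cluster.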

gi...@imply.io

Apr 1, 2025, 6:02:24 AM
to Druid User
It looks like the query/node/backpressure metric was never actually implemented in a release. It was reverted prior to going out. I raised a PR to clean up the docs: https://github.com/apache/druid/pull/17854

Note that these days, backpressure is enabled by default.

Kayla Oliva

Apr 1, 2025, 12:37:09 PM
to Druid User
Hi,
Thanks for the insight! Any ideas on when backpressure started getting enabled by default?

gi...@imply.io

Apr 1, 2025, 12:58:10 PM
to Druid User
Broker backpressure was enabled by default in https://github.com/apache/druid/pull/12840, which was Druid 24.0.0.

Gian

Kayla Oliva

Apr 1, 2025, 3:29:14 PM
to Druid User
Any ideas on a path forward for tuning maxQueuedBytes? What would make sense to observe?

gi...@imply.io

Apr 2, 2025, 12:05:53 PM
to Druid User
I would start with a question: what are you trying to achieve with tuning? I've rarely needed to tune this parameter in the clusters I've worked on, so my first guess is that it probably doesn't need to be tuned.

In theory, there are a couple of things to think about:

1) If the parameter is too large, you can run out of memory on the Broker. If the Broker is throwing OutOfMemoryErrors, and the heap dump suggests the culprit is buffered data from Historicals, then lowering the parameter would help.

2) If the parameter is too small, it might limit the bandwidth available from Historicals to Brokers. You would mostly notice this for queries that send a large amount of data from Historicals to Brokers. In this case, raising it might help data transfer more quickly.

But again, it's pretty rare that this parameter needs to be tuned.

Gian
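
To put rough numbers on point 1 above, here is a back-of-the-envelope sketch. It assumes the limit applies per query, so the data buffered across the Broker is approximated as maxQueuedBytes times the number of concurrent queries, and it ignores any overshoot before backpressure takes effect. The function names, the budget_fraction parameter, and all numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
def estimated_buffered_bytes(max_queued_bytes: int, concurrent_queries: int) -> int:
    """Rough estimate of bytes buffered on the Broker if every query fills its queue."""
    return max_queued_bytes * concurrent_queries


def fits_in_heap(
    max_queued_bytes: int,
    concurrent_queries: int,
    heap_bytes: int,
    budget_fraction: float = 0.25,
) -> bool:
    """Check the estimate against an assumed share of heap reserved for buffering."""
    budget = heap_bytes * budget_fraction
    return estimated_buffered_bytes(max_queued_bytes, concurrent_queries) <= budget


heap = 16 * 1024**3  # illustrative 16 GiB Broker heap
for limit in (25_000_000, 200_000_000):
    total = estimated_buffered_bytes(limit, concurrent_queries=40)
    print(limit, total, fits_in_heap(limit, 40, heap))
```

If an estimate like this sits comfortably inside the heap and the Broker is not throwing OutOfMemoryErrors, that is consistent with the advice above to leave the parameter alone.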