Optimum Historical server size

Suhas

Apr 5, 2022, 8:54:01 AM

to Druid User

Hello,

Every historical has a "druid.server.maxSize" setting, which I have set to 500 GB. I run 12 historicals serving a total of 6 TB. What is the optimum size for a historical? I found no noticeable difference between six 1 TB historicals and twelve 500 GB historicals. Has anyone made any observations on this?

Suhas

Mark Herrera

Apr 5, 2022, 3:20:37 PM

to Druid User

Hi Suhas,

Unfortunately, there is no optimum size. You'll find general guidelines and rules-of-thumb in the basic cluster tuning doc.

Regarding druid.server.maxSize, it "controls the total size of segment data that can be assigned by the Coordinator to a Historical." What's the size of your segment data? That may explain why you're not seeing a difference.

Best,

Mark

Suhas

Apr 6, 2022, 2:20:21 AM

to Druid User

Hello Mark,

The total size of all the segment data is about 5.7 TB. Currently, I am running 12 historicals each configured with a druid.server.maxSize=500GB. Should I just have six 1 TB-sized historicals instead? I believe, given all the other parameters left unchanged, too few historicals can cause queries to slow down. Too many historicals may add to overhead latencies. Any sweet spot for this?
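As a rough sanity check on the numbers above, here is a minimal sketch of the arithmetic, assuming 1 TB = 1000 GB and ignoring replication. The helper name `historicals_needed` is made up for illustration and is not part of Druid.

```python
import math


def historicals_needed(total_segment_gb: float, max_size_gb: float) -> int:
    """Smallest number of historicals whose combined maxSize covers the data."""
    return math.ceil(total_segment_gb / max_size_gb)


total_gb = 5700  # about 5.7 TB of segment data, as stated in the post

for max_size_gb in (500, 1000):
    count = historicals_needed(total_gb, max_size_gb)
    headroom_gb = count * max_size_gb - total_gb
    print(f"maxSize={max_size_gb} GB: {count} historicals, {headroom_gb} GB headroom")
```

Both layouts cover the data with the same total capacity, so the choice between them comes down to per-node resources rather than storage.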

Suhas

Mark Herrera

Apr 6, 2022, 4:01:40 PM

to Druid User

Hello Suhas,

Thanks for providing your segment data size. I'm sorry I didn't ask this in my prior response, but are you collecting any metrics?

I'm linking to a discussion about cluster sizing. You're asking a difficult question, and, unfortunately, there is no simple answer. Stated differently, and quoting from the linked discussion, "[t]his is a complex issue that needs an in-depth analysis . . . ."

Back to your original question, when you were looking for differences between six 1 TB historicals and twelve 500 GB historicals, what were you measuring? For good performance, you'll want enough historicals to have a good (free system memory / total size of all druid.segmentCache.locations) ratio.
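To make that ratio concrete, here is a small sketch. The memory figures are purely hypothetical (the thread does not say how much RAM each node has), and `page_cache_ratio` is an illustrative helper, not a Druid API.

```python
def page_cache_ratio(free_memory_gb: float, segment_cache_gb: float) -> float:
    """Free system memory divided by the total size of the segment cache locations."""
    return free_memory_gb / segment_cache_gb


# Hypothetical: twelve nodes with 64 GB free each versus six nodes with 128 GB free each.
twelve_small = page_cache_ratio(free_memory_gb=64, segment_cache_gb=500)
six_large = page_cache_ratio(free_memory_gb=128, segment_cache_gb=1000)

print(f"twelve 500 GB historicals: {twelve_small:.3f}")
print(f"six 1 TB historicals:      {six_large:.3f}")
```

If the total memory across the cluster stays the same, as in this made-up case, the ratio is identical for both layouts, which would be one possible reason for seeing no difference.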

I'll look forward to continuing this discussion.

Best,

Mark

Didip Kerabat

Jun 14, 2022, 10:42:19 AM

to Druid User

Hello, we run fairly large Druid clusters, and our golden rule is a 1:10 ratio between a Historical's RAM and disk. Usually, 10 CPUs are enough for each Historical.

For example, if there's 256GB RAM available for each historical, we then attach 2560GB SSD to it.

This golden rule works pretty well.
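The rule is simple enough to write down as a sketch. The function names below are illustrative only, and the 10x factor is this poster's rule of thumb, not an official Druid recommendation.

```python
def disk_for_ram(ram_gb: float, disk_per_ram: float = 10.0) -> float:
    """Disk to attach to a Historical with the given RAM under a 1:10 RAM-to-disk rule."""
    return ram_gb * disk_per_ram


def ram_for_disk(disk_gb: float, disk_per_ram: float = 10.0) -> float:
    """RAM suggested by the same rule for a given amount of segment disk."""
    return disk_gb / disk_per_ram


print(disk_for_ram(256))  # the 256 GB RAM example from above
print(ram_for_disk(500))  # a Historical holding 500 GB of segments
```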

Tijo Thomas

Jun 14, 2022, 1:45:21 PM

to druid...@googlegroups.com

+1 on what Didip mentioned.
A slightly more detailed calculation would be: total memory = free memory
for the operating system page cache (10-15% of the actively queryable
data size) + JVM heap (lookups and Historical heap) + direct memory for
buffers. The calculations for JVM heap and direct memory are
available in the basic cluster tuning doc which Mark pointed out.
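A minimal sketch of that sum follows. The direct memory formula, (numThreads + numMergeBuffers + 1) * buffer size, is the one given in the basic cluster tuning doc. Every input value below is a hypothetical example, and the function names are made up for illustration.

```python
def direct_memory_gb(num_threads: int, num_merge_buffers: int, buffer_size_gb: float) -> float:
    """Direct memory per the tuning doc: (numThreads + numMergeBuffers + 1) * buffer size."""
    return (num_threads + num_merge_buffers + 1) * buffer_size_gb


def total_memory_gb(queryable_data_gb: float, heap_gb: float, direct_gb: float,
                    page_cache_fraction: float = 0.10) -> float:
    """Page cache share of the queryable data, plus JVM heap, plus direct memory."""
    return queryable_data_gb * page_cache_fraction + heap_gb + direct_gb


# Hypothetical Historical: 500 GB of queryable data, 8 GB heap,
# 15 processing threads, 4 merge buffers, 0.5 GB processing buffers.
direct_gb = direct_memory_gb(num_threads=15, num_merge_buffers=4, buffer_size_gb=0.5)

for fraction in (0.10, 0.15):
    needed = total_memory_gb(500, heap_gb=8, direct_gb=direct_gb, page_cache_fraction=fraction)
    print(f"page cache at {fraction:.0%}: about {needed:.0f} GB total memory")
```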

Thanks

--
Tijo Thomas
Solutions Architect | => Imply , Bangalore , India

Usama Mehboob

Aug 11, 2023, 10:23:38 AM

to Druid User

Is there any golden rule for brokers?
We have 4 TB of data on each historical, and each one has 128 GB of RAM (we have 4 historicals).
We have 3 brokers (each with 384 GB of RAM). We always get out-of-memory errors on the brokers and historicals. Errors on the historicals are less frequent, but the brokers frequently log:
```
2023-08-10T20:01:05,204 INFO [Curator-ConnectionStateManager-0] org.apache.zookeeper.ZooKeeperTestable - injectSessionExpiration() called
2023-08-10T20:01:05,204 WARN [Curator-ConnectionStateManager-0-EventThread] org.apache.curator.ConnectionState - Session expired event received
2023-08-10T20:01:24,485 WARN [Curator-ConnectionStateManager-0] org.apache.curator.framework.state.ConnectionStateManager - Session timeout has elapsed while SUSPENDED. Injecting a session expiration. Elapsed ms: 517651. Adjusted session timeout ms: 30000
Terminating due to java.lang.OutOfMemoryError: Java heap space
```
thanks