Question on druid memory configs


quangtrung tran

May 29, 2016, 10:44:58 PM
to Druid User
Hi,

According to http://druid.io/docs/latest/operations/performance-faq.html, I saw this recommendation:

memory_for_segments = total_memory - heap - direct_memory - jvm_overhead

What are the corresponding configs for those attributes? To my understanding, they are:

total_memory: Total RAM of the machine
direct_memory: -XX:MaxDirectMemorySize=2000m (found inside jvm.config)
heap: xms / xmx ???
jvm_overhead: ???
memory_for_segments: is it druid.processing.buffer.sizeBytes??

Thanks

sascha...@smaato.com

May 31, 2016, 3:19:48 AM
to Druid User
Hi,

I'm not an expert, but to my understanding you are right about the heap and direct memory configuration; memory_for_segments, however, has nothing to do with the processing buffer setting. In Java you can only configure the heap (-Xmx) and the direct memory (-XX:MaxDirectMemorySize). The rest of the system's RAM is then usable for memory-mapping the segments (as the documentation states, that is the default unless the memory quota is limited explicitly via a Linux cgroup setting). The formula subtracts jvm_overhead to indicate that some memory will be eaten up by the Java platform itself.
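The FAQ formula can be sketched numerically. The node size and JVM settings below are purely illustrative assumptions, not values from this thread:

```python
# Hypothetical 64 GB historical node; heap and direct memory sizes are assumed.
GB = 1024 ** 3
total_memory  = 64 * GB   # total RAM of the machine
heap          = 8 * GB    # -Xmx8g
direct_memory = 10 * GB   # -XX:MaxDirectMemorySize=10g
jvm_overhead  = 1 * GB    # rough allowance for the JVM platform itself

# Whatever remains is what the OS can use to memory-map segment files.
memory_for_segments = total_memory - heap - direct_memory - jvm_overhead
print(memory_for_segments // GB)  # -> 45
```

Note that memory_for_segments is not set anywhere; it falls out of what the JVM does not claim.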

The following page is also a nice example of the kinds of settings you'd have on each node type: http://druid.io/docs/latest/configuration/production-cluster.html

To my understanding, the processing buffer is for storing intermediate query results. The performance FAQ says that both off-heap and on-heap memory are used for processing: "Historical nodes use off-heap memory to store intermediate results" ... "On historicals, the JVM heap is used for GroupBy queries, some data structures used for intermediate computation". I think "off-heap" here is synonymous with direct memory.

Nishant Bangarwa

May 31, 2016, 3:34:56 AM
to druid...@googlegroups.com
See inline.

On Mon, 30 May 2016 at 08:15 quangtrung tran <tranquang...@gmail.com> wrote:
Hi,

According to http://druid.io/docs/latest/operations/performance-faq.html. I saw this recommendation:

memory_for_segments = total_memory - heap - direct_memory - jvm_overhead

What are the corresponding configs for those attributes? for my understanding, they are:

total_memory: Total RAM of the machine
direct_memory: -XX:MaxDirectMemorySize=2000m (found inside jvm.config)
direct_memory should be at least (druid.processing.numThreads * druid.processing.buffer.sizeBytes).
Generally a processing buffer of 512 MB is enough, with processing threads = cpu_cores - 1.
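This lower bound can be sketched with illustrative numbers (the 16-core machine here is an assumption, not from the thread):

```python
# Minimum direct memory per Nishant's rule of thumb:
# druid.processing.numThreads * druid.processing.buffer.sizeBytes
MB = 1024 ** 2
cpu_cores   = 16
num_threads = cpu_cores - 1       # druid.processing.numThreads
buffer_size = 512 * MB            # druid.processing.buffer.sizeBytes

min_direct_memory = num_threads * buffer_size
print(min_direct_memory // MB)    # -> 7680
```

So on this hypothetical box, -XX:MaxDirectMemorySize would need to be comfortably above 7680 MB.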

heap: xms / xmx ???
A 3-4 GB heap should be sufficient for small/medium-size setups. GroupBy queries use heap memory, so you might need to raise it to 6 GB if you use groupBys extensively.
jvm_overhead: ???
This includes system overheads too, i.e., memory used by other OS daemons and processes; typically 1-2 GB.

memory_for_segments: is it druid.processing.buffer.sizeBytes??
No, this is the system memory available for memory-mapping Druid segment files.
The related config is druid.server.maxSize - the maximum total size of segments that will be assigned to a historical node.
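One way to read the relationship between the two numbers (the figures below are illustrative assumptions): druid.server.maxSize bounds how much segment data lands on disk, while memory_for_segments bounds how much of it can stay memory-mapped at once. The ratio gives a rough sense of how much of the assigned data is servable without paging from disk:

```python
# Illustrative only: segments beyond memory_for_segments are paged from disk,
# so a maxSize far above memory_for_segments means more page-cache misses.
GB = 1024 ** 3
memory_for_segments = 45 * GB    # free RAM left for the page cache (assumed)
server_max_size     = 100 * GB   # druid.server.maxSize (assumed)

fraction_resident = memory_for_segments / server_max_size
print(round(fraction_resident, 2))  # -> 0.45
```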


Thanks

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/3bb2c279-23ac-4b3c-ae4e-fe22e08c5297%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

quangtrung tran

May 31, 2016, 3:43:07 AM
to Druid User
Thanks Sascha & Nishant, it's all clear to me now :D

Prabakaran Hadoop

May 23, 2019, 10:27:11 PM
to Druid User
Hi,
Please explain to me the purpose of direct memory in Druid.
How is it used? How is it related to the JVM?




Abhishek Jain

May 28, 2019, 11:35:47 AM
to Druid User
Hey,

This will give you an idea of how direct memory is used in Druid: http://druid.io/docs/latest/operations/performance-faq.html

Thanks
Abhishek