druid performance optimization issues


zoucaitou

Feb 28, 2018, 9:53:01 PM
to Druid User
System
  • ubuntu 16.04
  • druid 0.10.1
  • hadoop 2.9.0
Hardware
  • 16 CPU cores
  • 64 GB memory
  • 500 GB storage (not SSD)
Distributed
  • master node => node1
  • data node => node2
  • query node => node3

Master Node

Coordinator
 
jvm.config
-server
-Xmx10g
-Xms10g
-XX:NewSize=512m
-XX:MaxNewSize=512m
-XX:+UseG1GC
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=/home/druid/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
-Dderby.stream.error.file=/home/druid/derby.log
 
runtime.properties 
druid.service=druid/coordinator
druid.host=xxxxxxxx
druid.port=8081

druid.coordinator.startDelay=PT30S
druid.coordinator.period=PT30S
druid.coordinator.merge.on=true
 
Overlord
jvm.config
-server
-Xmx4g
-Xms4g
-XX:NewSize=256m
-XX:MaxNewSize=256m
-XX:+UseConcMarkSweepGC
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=/home/druid/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

 
runtime.properties 
druid.service=druid/overlord
druid.host=xxxxxxx
druid.port=8090

druid.indexer.autoscale.doAutoscale=true
druid.indexer.autoscale.strategy=ec2
druid.indexer.autoscale.workerIdleTimeout=PT90m
druid.indexer.autoscale.terminatePeriod=PT5M

druid.indexer.queue.startDelay=PT30S
druid.coordinator.period=PT30S

druid.indexer.runner.type=remote
druid.indexer.storage.type=metadata

Data Node

Historical
 
jvm.config
-server
-Xmx12g
-Xms12g
-XX:NewSize=6g
-XX:MaxNewSize=6g
-XX:MaxDirectMemorySize=30g
-XX:+UseConcMarkSweepGC
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=/home/druid/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

 
runtime.properties 
druid.service=druid/historical
druid.host=xxxxxx
druid.port=8083

druid.server.tier=hot
druid.server.priority=100

# HTTP server threads
druid.server.http.numThreads=45

# Processing threads and buffers
druid.processing.buffer.sizeBytes=1073741824
druid.processing.numMergeBuffers=11
druid.processing.numThreads=15
druid.processing.tmpDir=/home/druid/processing

# Segment storage
druid.segmentCache.locations=[{"path":"/home/druid/segment-cache","maxSize":300000000000}]
druid.server.maxSize=300000000000

# Query cache
druid.historical.cache.useCache=false
druid.historical.cache.populateCache=false
# druid.cache.type=caffeine
# druid.cache.sizeInBytes=2000000000
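As a sanity check on the direct-memory sizing above: Druid needs roughly one processing buffer per processing thread, one per merge buffer, plus one spare, so required direct memory is about sizeBytes × (numThreads + numMergeBuffers + 1). A minimal sketch of that arithmetic (my own back-of-the-envelope check, using the numbers from this config, not part of the original post):

```python
# Rough direct-memory check for the historical node config above.
# Assumption: required direct memory ~= sizeBytes * (numThreads + numMergeBuffers + 1).
buffer_size = 1073741824        # druid.processing.buffer.sizeBytes (1 GiB)
num_threads = 15                # druid.processing.numThreads
num_merge_buffers = 11          # druid.processing.numMergeBuffers

required = buffer_size * (num_threads + num_merge_buffers + 1)
max_direct = 30 * 1024**3       # -XX:MaxDirectMemorySize=30g

print(required / 1024**3)       # 27.0 GiB required
print(required <= max_direct)   # True: 27 GiB fits under the 30g limit
```

So the 30g MaxDirectMemorySize is just enough for these buffer settings, but it leaves little headroom on a 64 GB box once the 12g heap is counted.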

 
MiddleManager
jvm.config
-server
-Xmx64m
-Xms64m
-XX:+UseConcMarkSweepGC
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=/home/druid/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager


 
runtime.properties 
druid.service=druid/middlemanager
druid.host=xxxxxxxx
druid.port=8091

# Number of tasks per middleManager
druid.worker.capacity=10

# Task launch parameters
druid.indexer.runner.javaOpts=-server -Xmx3g -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

druid.indexer.task.baseTaskDir=/home/druid/task
druid.indexer.task.restoreTasksOnRestart=true

# HTTP server threads
druid.server.http.numThreads=45

# Processing threads and buffers
druid.indexer.fork.property.druid.processing.buffer.sizeBytes=336870912
druid.indexer.fork.property.druid.processing.numThreads=2
druid.indexer.fork.property.druid.segmentCache.locations=[{"path": "/home/druid/processing", "maxSize": 0}]
druid.indexer.fork.property.druid.server.http.numThreads=45

druid.processing.buffer.sizeBytes=100000000
druid.processing.numMergeBuffers=2
druid.processing.numThreads=3
druid.processing.tmpDir=/home/druid/processing

# Hadoop indexing
druid.indexer.task.hadoopWorkingPath=/home/druid/hadoop-tmp
druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.7.3"]

Query Node

Broker
 
jvm.config
-server
-Xmx20g
-Xms20g
-XX:NewSize=6g
-XX:MaxNewSize=6g
-XX:MaxDirectMemorySize=30g
-XX:+UseConcMarkSweepGC
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=/home/druid/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager


 
runtime.properties 
druid.service=druid/broker
druid.host=xxxxxx
druid.port=8082

# HTTP server threads
druid.broker.http.numConnections=20
druid.server.http.numThreads=45

# Processing threads and buffers
druid.processing.buffer.sizeBytes=1073741824
druid.processing.numMergeBuffers=11
druid.processing.numThreads=15
druid.processing.tmpDir=/home/druid/processing

# Query cache (enabled on the broker)
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
druid.cache.type=memcached
druid.cache.hosts=node1:11211,node3:11211
druid.cache.memcachedPrefix=druid
druid.cache.numConnections=12

druid.broker.select.tier=highestPriority


Cluster

[metric monitor screenshots omitted]

The metric monitor shows that query/wait time on the historical node is very high.



GunWoo Kim

Jun 30, 2018, 1:51:21 PM
to Druid User
Hi, zoucaitou

How many historical nodes do you use?

I checked your historical node JVM config:
-Xmx12g
-Xms12g
-XX:NewSize=6g
-XX:MaxNewSize=6g
-XX:MaxDirectMemorySize=30g


Your server has 64 GB of memory, and with this historical JVM config, the memory available for segment loading (OS page cache) is under 20 GB.

If 20 GB is not enough to serve the segments on the historical node, segments will be paged in and out, which can affect query processing.

Check your total segment size per historical node against the memory available for those segments.
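To make the arithmetic concrete, here is a quick sketch using the numbers quoted in this thread (my own estimate; the "everything else fits in direct memory" simplification is an assumption):

```python
# Back-of-the-envelope memory budget for one 64 GB historical server,
# using the JVM settings quoted above.
total_ram_gb = 64
heap_gb = 12              # -Xmx12g / -Xms12g
direct_gb = 30            # -XX:MaxDirectMemorySize=30g

# What remains for the OS page cache, which Druid relies on
# to keep memory-mapped segments hot:
page_cache_gb = total_ram_gb - heap_gb - direct_gb
print(page_cache_gb)      # 22

# The segment cache is allowed to hold up to 300 GB of segments
# (druid.server.maxSize), so at best only a small fraction can stay cached:
segment_cache_gb = 300
print(segment_cache_gb / page_cache_gb)   # ~13.6x more segments than page cache
```

If the node actually holds anywhere near 300 GB of segments, queries will constantly fault segments in from a non-SSD disk, which would explain the high query/wait times.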

good luck :)

zhangxin...@gmail.com

Jul 14, 2018, 10:13:21 AM
to Druid User
I think you can add GC logging configuration to your historical nodes' jvm.config, then watch how the heap behaves in the GC logs while queries are running. My guess is either the young generation is too small, in which case you can increase -XX:NewSize and -XX:MaxNewSize, or the whole heap is too small, which you can address by raising -Xmx and -Xms.
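For reference, the jvm.config above already enables -XX:+PrintGCDetails and -XX:+PrintGCTimeStamps; the missing piece is writing the output to a file. One way to do that on a JDK 8 HotSpot JVM (standard HotSpot flags, not from the original post; the log path is just an example) is to add:

```
-Xloggc:/home/druid/historical-gc.log
-XX:+PrintGCDateStamps
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=5
-XX:GCLogFileSize=20m
```

With rotation enabled the logs stay bounded, so the flags can be left on permanently while you watch young-generation and full-GC behavior under query load.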

On Thursday, March 1, 2018 at 10:53:01 AM UTC+8, zoucaitou wrote: