My Druid is very slow


Alberto González Mesas

Jan 25, 2016, 5:52:15 AM
to Druid User
Hi guys!

we have deployed Druid in a lab environment (v0.8.1).

Cluster features:

(x1) Broker node: Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz (x6), 8GB total memory (Xmx/Xms: 3GB, NewSize/MaxNewSize: 1GB, MaxDirectMemorySize: 4GB), 18GB total disk.

druid.cache.sizeInBytes=50000000
druid.processing.buffer.sizeBytes=629145600
druid.processing.numThreads=5
druid.query.groupBy.maxResults=10000000

(x1) Historical node: Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz (x8), 12GB total memory (Xmx/Xms: 4GB, NewSize/MaxNewSize: 2GB, MaxDirectMemorySize: 5GB), 20GB total disk.

druid.segmentCache.locations=[{"path": "/opt/druid/store/indexCache", "maxSize": 6442450944}]
druid.processing.buffer.sizeBytes=629145600
druid.server.http.numThreads=10
druid.server.http.maxIdleTime=PT5m
druid.processing.numThreads=7
druid.processing.columnCache.sizeBytes=0
druid.query.groupBy.singleThreaded=false
druid.query.groupBy.maxIntermediateRows=50000
druid.query.groupBy.maxResults=10000000
druid.cache.type=local
druid.cache.sizeInBytes=536870912
druid.cache.initialSize=268435456
druid.server.http.numThreads=50
druid.server.maxSize=6442450944


Segments => 7 shards in 7 intervals.
1 segment = 114MB (granularity = day) => 1,296,502 rows/segment

1 row = 30 dimensions and 3 metrics


What is the problem?

When running a groupBy in broker node: curl -X POST 'http://BROKER-ADDR:8080/druid/v2/?pretty' -H 'content-type: application/json' -d @query_groupBy_01.json > resul.txt

Result: Time Total is 27s


Query: query_groupBy_01.json


{
  "queryType": "groupBy",
  "dataSource": "dsLab",
  "granularity": "minute",
  "dimensions": [ "field1", "field2", "field3" ],
  "limitSpec": { "type": "default", "limit": 50, "columns": [ {"dimension": "inbytes", "direction": "descending"}] },
  "aggregations": [
    { "type": "doubleSum", "name": "inbytes", "fieldName": "inbytes" }
  ],
  "intervals": [
        "2015-08-01T00:00:00.000/2015-08-08T00:00:00.000"
  ],
  "having": {
    "type": "greaterThan",
    "aggregation": "inbytes",
    "value": 0.0
  },
  "context" : {
    "timeout": 120000,
    "queryId": "q0002"
  }
}


We do not know why Druid groupBy queries are so slow... any ideas?

Thanks in advance

Nicholas McKoy

Jan 25, 2016, 8:39:10 AM
to Druid User
+1, my groupBy queries are also really slow, to the point that curl times out. I increased curl's max timeout, but I think the Jetty server on the broker sends a timeout anyway. I increased my broker and historical memory to about 30GB, and that improved performance, but I still can't query a 10-day interval with a few JavaScript filters. This is my query:

{
    "queryType": "groupBy",
    "dataSource": "datasource1",
    "granularity": "all",
    "dimensions": ["dim1", "dim2", "dim3"],
    "limitSpec": {
      "type": "default", "limit": 10, "columns": [
            {
                "dimension": "avg_time",
                "direction": "DESCENDING"
            }
          ]  
    },
    "filter": {
        "type": "and",
        "fields": [
            {
                "type": "javascript",
                "dimension": "time",
                "function": "function(x){return(x < 60000)}"
            },
            {
                "type": "javascript",
                "dimension": "dim3",
                "function": "function(x){return(x == 'value1' || x == 'value2' || x == 'value3' || x == 'value4' || x == 'value5' || x == 'value6' || x == 'value7' || x == 'value8' || x == 'value9')}"
            }
        ]
    },
    "aggregations": [
        {
            "type": "count",
            "name": "count"
        },
        {
            "type": "longSum",
            "name": "sumTime",
            "fieldName": "sumTime"
        }
    ],
    "postAggregations": [
     { "type": "arithmetic",
       "name": "avg_time",
       "fn": "/",
       "fields": [
         { "type": "fieldAccess", "fieldName": "sumTime" },
         { "type": "fieldAccess", "fieldName": "count" }
       ]
     }
    ],
    "intervals": [
        "2016-01-01T00:00:00.000/2016-01-11T00:00:00.000"
    ],
    "having" : {
        "type": "greaterThan",
        "aggregation": "count",
        "value": 1000
    },
    "context": {
        "priority": 1,
        "chunkPeriod": "PT24H"
    }
}

Nicholas McKoy

Jan 25, 2016, 8:42:40 AM
to Druid User
Alberto,


The performance FAQ in the Druid docs helped me. It details how broker and historical nodes use the JVM heap and off-heap memory to store and merge results, and it helped me figure out the right JVM heap and off-heap memory sizes to improve performance.

But if anyone can help with other configurations that would improve performance, that would help a ton.

Gian Merlino

Jan 25, 2016, 1:24:24 PM
to druid...@googlegroups.com
Hey Nicholas,

One simple thing you could try first is replacing your second JS filter with an "or" of "selector" filters. Generally, any non-JS filter or extractionFn will be faster than the equivalent JS-based option, so it's best to reach for the JS stuff only when you're trying to do something that isn't supported by native filters or extractionFns. One tip is to use PlyQL in verbose mode to help write your Druid queries (http://github.com/implydata/plyql); it converts SQL to Druid queries in as close to the ideal way as it can.
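For example, Nicholas's second JS filter above could be rewritten as an "or" of "selector" filters like this (dimension and value names are taken from his query; the remaining values follow the same pattern):

```json
{
  "type": "or",
  "fields": [
    { "type": "selector", "dimension": "dim3", "value": "value1" },
    { "type": "selector", "dimension": "dim3", "value": "value2" },
    { "type": "selector", "dimension": "dim3", "value": "value3" }
  ]
}
```

Native selector filters avoid running a JavaScript function during filtering, so they can be evaluated much more cheaply.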

Another thing you could do, if you have a lot of historical nodes, is reduce merging overhead by disabling caching on the broker and enabling it on the historical nodes instead. This moves merging to the historicals, which generally works better for larger clusters.
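A minimal sketch of that caching change, assuming Druid 0.8.x property names (verify against the docs for your version):

```properties
# Broker runtime.properties: stop using and populating the broker-side cache
druid.broker.cache.useCache=false
druid.broker.cache.populateCache=false

# Historical runtime.properties: cache per-segment results on the historicals
druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true
druid.cache.type=local
druid.cache.sizeInBytes=536870912
```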

Gian

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/5d0cb330-d6ef-4bb3-9c55-a24487837a3b%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Alberto González Mesas

Jan 26, 2016, 2:41:46 AM
to druid...@googlegroups.com
Thanks Nicholas McKoy, I read the FAQ but it did not solve our problem. I tested several JVM configurations and none of them solved it.

Does anybody use groupBy queries in their Druid deployment?


Fangjin Yang

Jan 27, 2016, 11:41:10 PM
to Druid User
Alberto, can you use the existing Druid metrics to narrow down what is slow? Is it merging? Segment scans? Network time?
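For reference, one way to surface those metrics is to enable an emitter. A minimal sketch, assuming the 0.8.x logging emitter (metric names such as query/time and query/segment/time may vary by version):

```properties
# runtime.properties on the broker and historicals:
# write metric events to the service log
druid.emitter=logging
druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
```

You can then compare query/time (total per-node query time) against query/segment/time (per-segment scan time) in the logs to see where the time is going.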


Alberto González Mesas

Feb 2, 2016, 3:35:00 AM
to druid...@googlegroups.com
Hi guys!

We "solved" our problem with slow groupBy queries. The cause was high cardinality in the segments: the high-cardinality dimension contains IPv4 address values.
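For anyone hitting the same wall: a segmentMetadata query reports per-column cardinality, which is a quick way to confirm that a dimension like an IPv4 column is the culprit (the dataSource and interval here are taken from the first post):

```json
{
  "queryType": "segmentMetadata",
  "dataSource": "dsLab",
  "intervals": [ "2015-08-01/2015-08-08" ]
}
```

The response includes a cardinality entry per column; dimensions with very many distinct values (like raw IPv4 addresses) make groupBy's intermediate result sets very large.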


Nicholas McKoy

Feb 2, 2016, 3:47:27 PM
to Druid User
Fangjin,

How would I go about sending those metrics to a Kafka topic? Is there a property config to do that?

Fangjin Yang

Feb 5, 2016, 8:04:32 PM
to Druid User
No, you emit metrics from Druid over HTTP to a service that acts as the Kafka producer.
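A sketch of that setup, assuming the built-in HTTP emitter; the recipient URL is a placeholder for whatever service you run to forward the JSON events to a Kafka topic:

```properties
# runtime.properties: POST metric events as JSON to an external collector,
# which can then produce them to Kafka
druid.emitter=http
druid.emitter.http.recipientBaseUrl=http://collector.example.com:8080/druid-metrics
```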