Hi guys!
We have deployed Druid (v0.8.1) in a lab environment.
Cluster configuration:
(x1) Broker node: Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz (x6), 8GB total memory (Xmx/Xms: 3GB, NewSize/MaxNewSize: 1GB, MaxDirectMemorySize: 4GB), 18GB total disk.
druid.cache.sizeInBytes=50000000
druid.processing.buffer.sizeBytes=629145600
druid.processing.numThreads=5
druid.query.groupBy.maxResults=10000000
(x1) Historical node: Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz (x8), 12GB total memory (Xmx/Xms: 4GB, NewSize/MaxNewSize: 2GB, MaxDirectMemorySize: 5GB), 20GB total disk.
druid.segmentCache.locations=[{"path": "/opt/druid/store/indexCache", "maxSize": 6442450944}]
druid.processing.buffer.sizeBytes=629145600
druid.server.http.numThreads=10
druid.server.http.maxIdleTime=PT5m
druid.processing.numThreads=7
druid.processing.columnCache.sizeBytes=0
druid.query.groupBy.singleThreaded=false
druid.query.groupBy.maxIntermediateRows=50000
druid.query.groupBy.maxResults=10000000
druid.cache.type=local
druid.cache.sizeInBytes=536870912
druid.cache.initialSize=268435456
druid.server.http.numThreads=50
druid.server.maxSize=6442450944
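As a sanity check on the memory settings above, here is a minimal sketch. It assumes the commonly cited sizing rule that direct memory should be at least druid.processing.buffer.sizeBytes * (druid.processing.numThreads + 1), and that "4GB"/"5GB" mean 4 GiB/5 GiB as the JVM interprets them. The function name and the `nodes` dict are ours, for illustration only.

```python
# Sanity check: do the processing buffers fit in MaxDirectMemorySize?
# Assumption: needed direct memory ~= buffer.sizeBytes * (numThreads + 1).
GIB = 1024 ** 3


def required_direct_memory(buffer_size_bytes, num_threads):
    """Bytes of direct memory needed under the assumed sizing rule."""
    return buffer_size_bytes * (num_threads + 1)


# name: (druid.processing.buffer.sizeBytes, druid.processing.numThreads,
#        MaxDirectMemorySize in bytes, assuming GB means GiB)
nodes = {
    "broker": (629145600, 5, 4 * GIB),
    "historical": (629145600, 7, 5 * GIB),
}

for name, (buf, threads, max_direct) in nodes.items():
    needed = required_direct_memory(buf, threads)
    print(f"{name}: needs {needed / GIB:.2f} GiB, "
          f"has {max_direct / GIB:.2f} GiB, fits={needed <= max_direct}")
```

Under these assumptions both nodes fit (about 3.52 GiB needed on the broker and 4.69 GiB on the historical), so the direct memory settings do not look like the obvious culprit.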
Segments => 7 shards in 7 intervals.
1 segment = 114MB (granularity = day) => 1,296,502 rows/segment
1 row = 30 dimensions and 3 metrics
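To put the data volume in perspective, a quick back-of-the-envelope calculation from the figures above. It assumes all 7 segments are roughly the same size and row count as the one we measured; the variable names are ours.

```python
# Rough data volume, assuming every segment matches the one measured above.
SEGMENTS = 7
ROWS_PER_SEGMENT = 1_296_502
SEGMENT_SIZE_MB = 114
SEGMENT_CACHE_MAX_BYTES = 6442450944  # druid.segmentCache.locations maxSize

total_rows = SEGMENTS * ROWS_PER_SEGMENT
total_size_mb = SEGMENTS * SEGMENT_SIZE_MB
cache_size_mb = SEGMENT_CACHE_MAX_BYTES / (1024 ** 2)

print(f"total rows:    {total_rows}")
print(f"total size:    {total_size_mb} MB")
print(f"segment cache: {cache_size_mb:.0f} MiB")
```

That is 9,075,514 rows in about 798MB, which is well below the 6144 MiB segment cache on the historical node.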
What is the problem?
When running a groupBy query against the broker node:
curl -X POST 'http://BROKER-ADDR:8080/druid/v2/?pretty' -H 'content-type: application/json' -d @query_groupBy_01.json > resul.txt
Result:
Total time is 27s
Query: query_groupBy_01.json
{
  "queryType": "groupBy",
  "dataSource": "dsLab",
  "granularity": "minute",
  "dimensions": [ "field1", "field2", "field3" ],
  "limitSpec": { "type": "default", "limit": 50, "columns": [ {"dimension": "inbytes", "direction": "descending"} ] },
  "aggregations": [
    { "type": "doubleSum", "name": "inbytes", "fieldName": "inbytes" }
  ],
  "intervals": [
    "2015-08-01T00:00:00.000/2015-08-08T00:00:00.000"
  ],
  "having": {
    "type": "greaterThan",
    "aggregation": "inbytes",
    "value": 0.0
  },
  "context": {
    "timeout": 120000,
    "queryId": "q0002"
  }
}
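For reference, a small sketch of how many time buckets the query above spans. With "granularity": "minute" the groupBy results are bucketed per minute over the whole interval, whereas our segments have day granularity. The helper `count_buckets` is ours, written only for this illustration.

```python
from datetime import datetime, timedelta


def count_buckets(interval, bucket):
    """Count whole `bucket`-sized time buckets in an ISO-8601 'start/end' interval."""
    fmt = "%Y-%m-%dT%H:%M:%S.%f"
    start_s, end_s = interval.split("/")
    start = datetime.strptime(start_s, fmt)
    end = datetime.strptime(end_s, fmt)
    return (end - start) // bucket  # timedelta // timedelta yields an int


interval = "2015-08-01T00:00:00.000/2015-08-08T00:00:00.000"
minute_buckets = count_buckets(interval, timedelta(minutes=1))
day_buckets = count_buckets(interval, timedelta(days=1))
print(minute_buckets, day_buckets)  # prints: 10080 7
```

So the query groups field1, field2 and field3 within each of 10,080 minute buckets, compared with only 7 buckets at day granularity. We have not measured how much of the 27s this accounts for.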
We do not know why Druid groupBy queries are so slow. Any ideas?
Thanks in advance.