My Druid is very slow


Alberto González Mesas

Jan 25, 2016, 5:52:15 AM
to Druid User
Hi guys!

we have deployed Druid in a lab environment (v0.8.1).

Cluster features:

(x1) Broker node: Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz (x6), 8GB total memory (Xmx/Xms: 3GB, NewSize/MaxNewSize: 1GB, MaxDirectMemorySize: 4GB), 18GB total disk.

druid.cache.sizeInBytes=50000000
druid.processing.buffer.sizeBytes=629145600
druid.processing.numThreads=5
druid.query.groupBy.maxResults=10000000

(x1) Historical node: Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz (x8), 12GB total memory (Xmx/Xms: 4GB, NewSize/MaxNewSize: 2GB, MaxDirectMemorySize: 5GB), 20GB total disk.

druid.segmentCache.locations=[{"path": "/opt/druid/store/indexCache", "maxSize": 6442450944}]
druid.processing.buffer.sizeBytes=629145600
druid.server.http.numThreads=10
druid.server.http.maxIdleTime=PT5m
druid.processing.numThreads=7
druid.processing.columnCache.sizeBytes=0
druid.query.groupBy.singleThreaded=false
druid.query.groupBy.maxIntermediateRows=50000
druid.query.groupBy.maxResults=10000000
druid.cache.type=local
druid.cache.sizeInBytes=536870912
druid.cache.initialSize=268435456
druid.server.http.numThreads=50
druid.server.maxSize=6442450944


Segments => 7 shards in 7 intervals.
1 segment = 114MB (granularity = day) => 1,296,502 rows/segment

1 row = 30 dimensions and 3 metrics


What is the problem?

When running a groupBy in broker node: curl -X POST 'http://BROKER-ADDR:8080/druid/v2/?pretty' -H 'content-type: application/json' -d @query_groupBy_01.json > resul.txt

Result: Time Total is 27s


Query: query_groupBy_01.json


{
  "queryType": "groupBy",
  "dataSource": "dsLab",
  "granularity": "minute",
  "dimensions": [ "field1", "field2", "field3" ],
  "limitSpec": { "type": "default", "limit": 50, "columns": [ {"dimension": "inbytes", "direction": "descending"}] },
  "aggregations": [
    { "type": "doubleSum", "name": "inbytes", "fieldName": "inbytes" }
  ],
  "intervals": [
        "2015-08-01T00:00:00.000/2015-08-08T00:00:00.000"
  ],
  "having": {
    "type": "greaterThan",
    "aggregation": "inbytes",
    "value": 0.0
  },
  "context" : {
    "timeout": 120000,
    "queryId": "q0002"
  }
}


We do not know why Druid groupBy queries are so slow... any ideas?

Thanks in advance

Nicholas McKoy

Jan 25, 2016, 8:39:10 AM
to Druid User
+1, my groupBy queries are also really slow, to the point that curl times out. I increased curl's max timeout, but I think the Jetty server on the broker sends a timeout anyway. I increased my broker and historical memory to about 30GB, and that improved performance, but I still can't query a 10-day interval with a few JavaScript filters. This is my query:

{
    "queryType": "groupBy",
    "dataSource": "datasource1",
    "granularity": "all",
    "dimensions": ["dim1", "dim2", "dim3"],
    "limitSpec": {
      "type": "default", "limit": 10, "columns": [
            {
                "dimension": "avg_time",
                "direction": "DESCENDING"
            }
          ]  
    },
    "filter": {
        "type": "and",
        "fields": [
            {
                "type": "javascript",
                "dimension": "time",
                "function": "function(x){return(x < 60000)}"
            },
            {
                "type": "javascript",
                "dimension": "dim3",
                "function": "function(x){return(x == 'value1' || x == 'value2' || x == 'value3' || x == 'value4' || x == 'value5' || x == 'value6' || x == 'value7' || x == 'value8' || x == 'value9')}"
            }
        ]
    },
    "aggregations": [
        {
            "type": "count",
            "name": "count"
        },
        {
            "type": "longSum",
            "name": "sumTime",
            "fieldName": "sumTime"
        }
    ],
    "postAggregations": [
     { "type": "arithmetic",
       "name": "avg_time",
       "fn": "/",
       "fields": [
         { "type": "fieldAccess", "fieldName": "sumTime" },
         { "type": "fieldAccess", "fieldName": "count" }
       ]
     }
    ],
    "intervals": [
        "2016-01-01T00:00:00.000/2016-01-11T00:00:00.000"
    ],
    "having" : {
        "type": "greaterThan",
        "aggregation": "count",
        "value": 1000
    },
    "context": {
        "priority": 1,
        "chunkPeriod": "PT24H"
    }
}

Nicholas McKoy

Jan 25, 2016, 8:42:40 AM
to Druid User
Alberto,


The performance FAQ in the Druid docs helped me. It details how broker and historical nodes use the JVM heap and off-heap memory to store and merge results, and it helped me figure out the right JVM heap and off-heap memory sizes to improve performance.

But if anyone can help with other configurations that would improve performance, that would help a ton.

Gian Merlino

Jan 25, 2016, 1:24:24 PM
to druid...@googlegroups.com
Hey Nicholas,

One simple thing you could try first is replacing your second JS filter with an "or" of "selector" filters. Generally, any non-JS filter or extractionFn will be faster than the equivalent JS-based option, so it's best to reach for the JS stuff only when you're trying to do something that isn't supported by native filters or extractionFns. One tip is to use PlyQL in verbose mode to help write your Druid queries (http://github.com/implydata/plyql); it converts SQL to Druid queries in as close to the ideal way as it can.
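For example, Nicholas's second JS filter above could be rewritten as an "or" of "selector" filters like this (dimension and value names are taken from his query; the remaining values follow the same pattern):

```json
{
  "type": "or",
  "fields": [
    { "type": "selector", "dimension": "dim3", "value": "value1" },
    { "type": "selector", "dimension": "dim3", "value": "value2" },
    { "type": "selector", "dimension": "dim3", "value": "value3" }
  ]
}
```

Native selector filters avoid running a JavaScript function during filtering, so they can be evaluated much more cheaply.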

Another thing you could do, if you have a lot of historical nodes, is reduce merging overhead by disabling caching on the broker and enabling it on the historical nodes instead. This moves merging to the historicals, which generally works better for larger clusters.
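A minimal sketch of that caching change, assuming Druid 0.8.x property names (verify against the docs for your version):

```properties
# Broker runtime.properties: stop using and populating the broker-side cache
druid.broker.cache.useCache=false
druid.broker.cache.populateCache=false

# Historical runtime.properties: cache per-segment results on the historicals
druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true
druid.cache.type=local
druid.cache.sizeInBytes=536870912
```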

Gian

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/5d0cb330-d6ef-4bb3-9c55-a24487837a3b%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Alberto González Mesas

Jan 26, 2016, 2:41:46 AM
to druid...@googlegroups.com
Thanks Nicholas McKoy, I read the FAQ but it did not solve our problem. I tested several JVM configurations and none of them solved it.

Does anybody use groupBy queries in their Druid deployment?


Fangjin Yang

Jan 27, 2016, 11:41:10 PM
to Druid User
Alberto, can you use the existing Druid metrics to narrow down what is slow? Is it merging? Segment scans? Network time?
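For reference, one way to surface those metrics is to enable an emitter. A minimal sketch, assuming the 0.8.x logging emitter (metric names such as query/time and query/segment/time may vary by version):

```properties
# runtime.properties on the broker and historicals:
# write metric events to the service log
druid.emitter=logging
druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]
```

You can then compare query/time (total per-node query time) against query/segment/time (per-segment scan time) in the logs to see where the time is going.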


Alberto González Mesas

Feb 2, 2016, 3:35:00 AM
to druid...@googlegroups.com
Hi guys!

We "solved" our problem with slow groupBy queries. The cause was high cardinality in the segments: the high-cardinality dimension contains IPv4 address values.
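For anyone hitting the same wall: a segmentMetadata query reports per-column cardinality, which is a quick way to confirm that a dimension like an IPv4 column is the culprit (the dataSource and interval here are taken from the first post):

```json
{
  "queryType": "segmentMetadata",
  "dataSource": "dsLab",
  "intervals": [ "2015-08-01/2015-08-08" ]
}
```

The response includes a cardinality entry per column; dimensions with very many distinct values (like raw IPv4 addresses) make groupBy's intermediate result sets very large.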


Nicholas McKoy

Feb 2, 2016, 3:47:27 PM
to Druid User
Fangjin,

How would I go about sending those metrics to a Kafka topic? Is there a property config to do that?

Fangjin Yang

Feb 5, 2016, 8:04:32 PM
to Druid User
No, you emit metrics from Druid over HTTP to a service that acts as the Kafka producer.
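A sketch of that setup, assuming the built-in HTTP emitter; the recipient URL is a placeholder for whatever service you run to forward the JSON events to a Kafka topic:

```properties
# runtime.properties: POST metric events as JSON to an external collector,
# which can then produce them to Kafka
druid.emitter=http
druid.emitter.http.recipientBaseUrl=http://collector.example.com:8080/druid-metrics
```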