Understanding Druid query performance and tuning Druid for better query times


Kiran Sunkari

Aug 9, 2018, 3:39:47 AM
to Druid User
Hey Team,

We use Druid in our production setup and we are trying to tune it for performance. We are running the following version:

Druid v 0.12.1


- I understand that per-segment query latency depends on the number of cores across Historicals versus the number of segments, and I have tuned this to reduce query time.
- Now, as mentioned in the above link, I am trying to reduce memory maps by setting "druid.server.maxSize" to match my available physical memory, so that the Historical behaves like an in-memory store, i.e. segments are served from physical memory.
- Despite doing this, I do not see a significant improvement in query performance (query/segment/time).
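For reference, the relevant Historical settings look roughly like this (sizes and paths are placeholders, not my actual values). Note that Druid memory-maps segment files, so what actually ends up resident in RAM is governed by the OS page cache, not by maxSize itself:

```properties
# Historical runtime.properties (illustrative values only)
# maxSize caps the total bytes of segments this Historical will be assigned.
druid.server.maxSize=250000000000

# Local segment cache; segments are memory-mapped from here, so in-memory
# residency depends on free OS page cache, not directly on maxSize.
druid.segmentCache.locations=[{"path":"/var/druid/segment-cache","maxSize":250000000000}]

# Page cache available for segments ~= physical RAM - JVM heap - direct memory,
# so an oversized -Xmx or -XX:MaxDirectMemorySize shrinks what can stay resident.
```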

Is the approach I am using correct? I also do not see a spike in memory usage, so I suspect that segments are not actually being served from physical memory.

If segments are only loaded into memory on demand, would I be better off using compute-optimized machines rather than memory-optimized ones? Am I correct in thinking so?

Thanks
--Kiran.

Jihoon Son

Aug 10, 2018, 9:22:22 PM
to druid...@googlegroups.com
Hi Kiran,

What kind of queries did you test? Would you share one of them?
Also, what is your expectation for those queries?

Jihoon


Kiran Sunkari

Aug 13, 2018, 2:55:51 AM
to Druid User
@JihoonSon, I am executing a basic timeseries query over 3 months of data.
Query:
{
  "queryType": "timeseries",
  "dataSource": "test-events",
  "intervals": [
    "2018-02-01T00:00:00.000Z/2018-07-18T00:00:00.000Z"
  ],
  "granularity": "all",
  "aggregations": [
    {
      "type": "hyperUnique",
      "name": "count",
      "fieldName": "userIdHll"
    }
  ]
}
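For completeness, a minimal sketch of submitting this query from Python to a Broker (the URL and endpoint below assume the default Broker port 8082; adjust for your setup):

```python
import json
import urllib.request

# The timeseries query from above, built as a Python dict.
query = {
    "queryType": "timeseries",
    "dataSource": "test-events",
    "intervals": ["2018-02-01T00:00:00.000Z/2018-07-18T00:00:00.000Z"],
    "granularity": "all",
    "aggregations": [
        {"type": "hyperUnique", "name": "count", "fieldName": "userIdHll"}
    ],
}

body = json.dumps(query).encode("utf-8")
print(body.decode("utf-8"))

# POST to the Broker's native query endpoint (default port 8082):
# req = urllib.request.Request(
#     "http://localhost:8082/druid/v2",
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```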