SSD vs EBS Druid performance

sumatheja dasararaju

Aug 13, 2017, 2:09:51 PM
to Druid User
Hi Everyone,

All the documentation for production clusters mentions AWS r3.8xl instances. Is there any known limitation with using the r4 series instead (same CPU and RAM, but r4 uses EBS where r3 has local SSD)? I would assume r4 adds some network latency to reads and writes. Is there anything else that would significantly affect cluster performance?

Thanks in advance. Let me know if you need any further information to better answer this.

Gian Merlino

Aug 14, 2017, 2:22:46 PM
to druid...@googlegroups.com
EBS performance can vary a lot depending on usage and whether or not you have provisioned IOPS. I think overall your experience would depend on how much reading you are actually doing from EBS, vs the in-memory cache, and also what kind of EBS you are using.

Gian

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+unsubscribe@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/4046e581-61c3-49a8-83b7-49ea1af18c5d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

sumatheja dasararaju

Aug 14, 2017, 3:00:17 PM
to Druid User
Thanks for the response, Gian. We are seeing query latencies of ~2 seconds and are trying to find the root cause. To explain our use case better:

We have 15 hosts (r4.8xl), a configured max size of 300 GB per node, and over 3.5 TB of data as of now. We are on Druid 0.9.2.
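One rough way to reason about how much reading would hit EBS rather than memory is to compare the segment data held per node with the RAM left over for the operating system's page cache. The sketch below is only a back-of-the-envelope estimate: the 244 GiB figure is the published RAM of an r4.8xlarge, the 15 nodes and 3.5 TB come from this post, and the 64 GiB reserved for JVM heap and processing buffers is a purely hypothetical placeholder, as is the function name `page_cache_coverage`.

```python
GIB = 1024 ** 3  # bytes in a gibibyte (instance RAM is quoted in GiB)
GB = 1000 ** 3   # bytes in a gigabyte (data sizes here are treated as decimal GB)


def page_cache_coverage(total_data_gb, historical_nodes,
                        ram_gib_per_node, reserved_gib_per_node):
    """Estimate the fraction of per-node segment data that fits in page cache.

    Assumes data is spread evenly across nodes and that all RAM not
    reserved for the JVM is available to the OS page cache.
    """
    if historical_nodes <= 0:
        raise ValueError("historical_nodes must be positive")
    data_per_node_bytes = total_data_gb * GB / historical_nodes
    if data_per_node_bytes == 0:
        return 1.0
    cache_bytes = max(0, (ram_gib_per_node - reserved_gib_per_node) * GIB)
    return min(1.0, cache_bytes / data_per_node_bytes)


# Figures from this thread, plus a hypothetical 64 GiB reserved for the JVM.
coverage = page_cache_coverage(
    total_data_gb=3500,        # "over 3.5 TB of data"
    historical_nodes=15,       # "15 hosts (r4.8xl)"
    ram_gib_per_node=244,      # r4.8xlarge memory
    reserved_gib_per_node=64,  # assumption, not taken from the attached configs
)
print(f"Estimated share of segment data that fits in page cache: {coverage:.0%}")
```

If the estimate is well below 100%, a meaningful share of queries would be expected to read from disk, and the storage type matters more; if it is close to 100%, storage is less likely to explain the latency on its own.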

Here is the query we are running:


{
    "queryType": "groupBy",
    "dataSource": "sales-rank-daily",
    "dimensions": [
        {
            "type": "listFiltered",
            "delegate": "category",
            "values": [
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################",
                "#####################"
            ]
        }
    ],
    "limitSpec": { "dimension": "category-rank", "direction": "descending", "dimensionOrder": "numeric" },
    "granularity": "day",
    "filter": {
        "type": "and",
        "fields": [
            { "type": "selector", "dimension": "asin", "value": "####" },
            { "type": "selector", "dimension": "######", "value": 1 }
        ]
    },
    "aggregations": [
        { "type": "doubleMin", "name": "sales-rank", "fieldName": "category-rank" }
    ],
    "intervals": ["2015-06-23/2017-06-23"]
}
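To help tell storage latency apart from other causes, it can be useful to run the query above several times in a row and compare the first (cold) run with the later (warm) ones. The following is a minimal sketch using only the Python standard library. The broker address `http://localhost:8082/druid/v2/` is an assumption based on the default broker port, and `build_request` and `time_query` are hypothetical helper names, not part of Druid.

```python
import json
import time
import urllib.request

# Assumption: the broker listens on its default port on this host.
BROKER_URL = "http://localhost:8082/druid/v2/"


def build_request(query, broker_url=BROKER_URL):
    """Build an HTTP POST request carrying a native Druid JSON query."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        broker_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def time_query(query, runs=5, broker_url=BROKER_URL,
               opener=urllib.request.urlopen):
    """Send the query `runs` times and return wall-clock seconds per run.

    `opener` is injectable so the timing loop can be exercised
    without a live cluster.
    """
    timings = []
    for _ in range(runs):
        request = build_request(query, broker_url)
        started = time.perf_counter()
        with opener(request) as response:
            response.read()  # drain the body so the full result is timed
        timings.append(time.perf_counter() - started)
    return timings
```

A possible usage is `time_query(json.load(open("query.json")))`, where `query.json` is a hypothetical file holding the query. A large gap between the first timing and the rest would suggest that disk reads dominate; consistently slow runs would point elsewhere, such as the query shape or merge cost.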

I have set the default groupBy strategy to v2. I have 9 hosts (r4.8xl) running the overlord, broker and coordinator, and 9 hosts (r4.8xl) running middle managers. I am attaching the config for each node type. We were going to try switching to SSD to see whether it improves performance. If you see anything that would improve performance, or anything we are doing wrong, please point it out. The cluster is going to grow to at least 5 times its current size.

Attachments:
JVM args for Nodes
historical.properties
middle-manager.properties
overlord.properties
coordinator.properties
broker.properties