High Availability Druid cluster requires a minimum of 12 machines


Kevin Sookocheff

Nov 25, 2015, 5:07:58 PM
to Druid User
Hi,

We are evaluating Druid as an analytics store but the devops story seems quite complicated. Am I reading this right that a high availability cluster requires at least 12 machines?

Overlord x2
MiddleManager x2
Coordinator x2
Historical x2
Broker x2
Real-Time x2


Are most people combining nodes onto single machines, or deploying all 12 and scaling/tuning from there?

Thanks,

Kevin

Himanshu

Nov 25, 2015, 9:56:27 PM
to Druid User
Hi,

- You can put the Overlord and Coordinator on the same machine.
- You need either MiddleManagers or Realtime nodes, depending on whether you plan to ingest data in real time from Storm (via Tranquility) or from Kafka. You need neither if you are going to do batch ingestion only. Please see https://groups.google.com/forum/#!searchin/druid-development/fangjin$20yang$20%22thoughts%22/druid-development/aRMmNHQGdhI/muBGl0Xi_wgJ (the number of those nodes will also depend on the scale of your realtime ingestion).
- The number of Historical nodes is driven by the size of your segments and how many of them you want to keep loaded in the Druid cluster so that they are queryable.


-- Himanshu

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/b7bd1635-10c1-4d05-8e78-30c34f6da6ff%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Fangjin Yang

Nov 28, 2015, 5:45:35 PM
to Druid User
No.

Overlord/coordinator x2
Middlemanager/Historical x2
Broker x2

You might also want to read about HA Druid here: http://imply.io/docs/latest/cluster.html

shims...@gmail.com

Apr 14, 2016, 8:15:18 PM
to Druid User
Hello, I was wondering if you could go into detail about how to achieve an HA Druid cluster. Currently I have a coordinator, broker, historical, realtime, and overlord in my setup. So far I only have one of each. So to achieve an HA environment, would I just add a copy of each node and let them communicate with ZooKeeper, and that is it? With this in mind, is there a test tool so that while we are testing our Druid cluster's performance we can also test its HA?

Gian Merlino

Apr 14, 2016, 10:48:13 PM
to druid...@googlegroups.com
We have some new clustering docs available here that should be useful: http://druid.io/docs/0.9.0/tutorials/cluster.html

To get HA you need 2x Coordinator, 2x Overlord, 2x+ Broker, 2x+ Historical, and 2x+ MiddleManager. You don't get much benefit from adding more Coordinators and Overlords (they are failover-based HA) but you do get scaling benefit from adding more Brokers, Historicals, and MiddleManagers.

This *doesn't* mean you need 10 machines. Especially for smaller clusters it is very common to colocate Druid services on the same physical machines. You could in theory get by with 2 physical machines although most people do 4–6 for a basic cluster (separating data-heavy services from coordination services).

Also, you generally don't need both Overlords and Realtime nodes; you pick one or the other (we recommend Overlords these days).

Gian
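
The colocation advice above can be checked with a small sketch. It counts machines in a layout and reports any service that has fewer than two instances. The machine names and the `check_ha` helper are hypothetical, made up for illustration; the six-machine grouping follows the Overlord/Coordinator, MiddleManager/Historical, Broker split suggested earlier in the thread.

```python
from collections import Counter

# Services that need at least two instances for HA, per the post above.
HA_SERVICES = ("coordinator", "overlord", "broker", "historical", "middlemanager")


def check_ha(layout):
    """Return (machine_count, services_with_fewer_than_two_instances).

    `layout` maps a machine name to the list of Druid services it runs.
    This is a hypothetical helper, not part of Druid.
    """
    counts = Counter(svc for services in layout.values() for svc in services)
    missing = sorted(s for s in HA_SERVICES if counts[s] < 2)
    return len(layout), missing


# Hypothetical machine names; three colocated groups, two machines each.
six_machine_layout = {
    "master-1": ["coordinator", "overlord"],
    "master-2": ["coordinator", "overlord"],
    "data-1": ["historical", "middlemanager"],
    "data-2": ["historical", "middlemanager"],
    "query-1": ["broker"],
    "query-2": ["broker"],
}

# The theoretical minimum: every service on each of two machines.
two_machine_layout = {
    "node-1": list(HA_SERVICES),
    "node-2": list(HA_SERVICES),
}

print(check_ha(six_machine_layout))  # (6, [])
print(check_ha(two_machine_layout))  # (2, [])
```

Both layouts satisfy the two-instance requirement; the sketch says nothing about whether two machines have enough memory or disk for the combined services.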


Shim Ster

Apr 14, 2016, 10:52:13 PM
to druid...@googlegroups.com
Hi Gian,

Thank you very much for replying to my comment so quickly. But what about the configuration needed to attain HA? Do we just have the nodes connect to ZooKeeper, or are there other configurations we would need? And is there a script we can use to test failover in the cluster?

Regards,

Shim


charles.allen

Apr 15, 2016, 10:46:25 AM
to Druid User
Just FYI, our method of testing failover on our cluster is by doing normal operations like rolling restarts for upgrades or configuration changes. And we exercise it regularly.
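
For readers who want something to leave running while a node is restarted or stopped, here is a minimal sketch of a leader watcher. It is not an official Druid tool. It assumes each Coordinator exposes a leader endpoint such as `/druid/coordinator/v1/leader` on port 8081; check that path and port against the API docs for your Druid version. The hostnames are hypothetical. The demo at the bottom runs against a simulated timeline, not a real cluster.

```python
import time
import urllib.request


def fetch_leader(url, timeout=2.0):
    """Return the body of a leader endpoint, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8").strip()
    except OSError:  # covers URLError, HTTPError, timeouts, refused connections
        return None


def watch_leader(urls, rounds, interval=1.0, fetch=fetch_leader, sleep=time.sleep):
    """Poll the URLs in order and record (round, leader) on every change.

    A leader of None means no URL answered in that round.
    """
    changes = []
    last = object()  # sentinel so the first observation is always recorded
    for i in range(rounds):
        leader = None
        for url in urls:
            leader = fetch(url)
            if leader is not None:
                break
        if leader != last:
            changes.append((i, leader))
            last = leader
        if i < rounds - 1:
            sleep(interval)
    return changes


# Against a real cluster (assumed hostnames and endpoint path):
#   watch_leader(["http://coordinator-1:8081/druid/coordinator/v1/leader",
#                 "http://coordinator-2:8081/druid/coordinator/v1/leader"],
#                rounds=60)

# Simulated run: coordinator-1 leads, disappears, then coordinator-2 takes over.
timeline = iter(["coordinator-1", None, "coordinator-2", "coordinator-2"])
demo = watch_leader(["http://example.invalid"], rounds=4,
                    fetch=lambda url: next(timeline), sleep=lambda s: None)
print(demo)  # [(0, 'coordinator-1'), (1, None), (2, 'coordinator-2')]
```

The gap between the round where the leader disappears and the round where a new one shows up gives a rough failover time, limited by the polling interval.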


Gian Merlino

Apr 15, 2016, 12:22:44 PM
to druid...@googlegroups.com
Hey Shim,

The Druid nodes are all automatically HA assuming you have configured them to use an external metadata store and zookeeper cluster (rather than the default embedded metadata option). All you have to do is start up multiple of them.

It is up to you to make your metadata store and zookeeper cluster HA. The usual way of doing that is using MySQL/PostgreSQL with replication and failover, and setting up a 3 or 5 node zookeeper cluster.

Gian
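
The 3 or 5 node recommendation follows from ZooKeeper needing a majority of its ensemble to stay up. A minimal sketch of that arithmetic (the function name is made up for illustration):

```python
def zk_tolerated_failures(ensemble_size):
    """Number of ZooKeeper servers that can fail while a majority survives."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    quorum = ensemble_size // 2 + 1  # strict majority
    return ensemble_size - quorum


for n in (1, 2, 3, 4, 5):
    print(n, zk_tolerated_failures(n))
# 1 0
# 2 0
# 3 1
# 4 1
# 5 2
```

An even-sized ensemble tolerates no more failures than the odd size below it, which is why odd sizes are the usual choice.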

Shim Ster

Apr 19, 2016, 11:50:56 PM
to druid...@googlegroups.com
Hi everyone,

I am sorry for the late reply. I have set everything up, but I was wondering if there is a test script to test failover besides upgrading each node one by one. I was hoping for something that I could leave running from the broker node, which would test its communication with the coordinator while I shut down one of the coordinators to test the failover. I'm sorry if I have a lot of questions.


Regards,

Shim

shims...@gmail.com

Apr 20, 2016, 7:56:57 PM
to Druid User
In addition to my earlier post, I currently have 5 node types deployed in my cluster, namely overlord, broker, historical, realtime, and coordinator. Currently we only have one of each and are planning to set up an HA environment. I am aware that I could have the coordinator and overlord on the same nodes, which reduces my HA cluster to 8 machines. Are there any other nodes that I can merge together to minimize the number of servers deployed?


Fangjin

Apr 22, 2016, 5:48:14 PM
to Druid User
You should take a look at how Imply chooses to package Druid if you want to use less hardware:

