Please see https://groups.google.com/d/msg/druid-development/eIWDPfhpM_U/AzMRxSQGAgAJ
for the work on reducing usage of Zookeeper inside Druid and eventually making it optional.
The latest Druid code can now run a Druid cluster with Zookeeper usage reduced to the bare minimum, that is, only for discovering other Druid nodes and for leader election of the Overlord/Coordinator. If you are already running a Druid cluster with the latest code from Druid master (to be released in Druid 0.13.0), the following options can be enabled.
1. Using HTTP based segment discovery on Coordinators and Brokers.
’ and restart the Druid process at Coordinators and Brokers.
Verification: On process start, "Starting HttpServerInventoryView." should be printed in the log.
2. Using HTTP based segment load/drop management at Coordinators.
’ and restart the Druid process at Coordinators. You can optionally configure ‘druid.coordinator.loadqueuepeon.http.batchSize’ at the Coordinator to process multiple load/drop requests in parallel. See its description in the doc at https://github.com/druid-io/druid/blob/83c6c48bed5312f20d7344358dd9804d080f68c1/docs/content/configuration/coordinator.md for more details.
Verification: In the Coordinator logs, the “HttpLoadQueuePeon” class should print messages as segment load/drop requests are prepared to be sent to historical nodes.
3. Using HTTP based task management at Overlords running in remote mode, i.e. using MiddleManagers to run the tasks.
’ and restart the Druid process at Overlords.
Note that all Overlords must use the same task runner type config at all times. So, update the configuration at all Overlords, then stop all of them, and then start all of them.
Verification: In the Overlord logs, something like “io.druid.indexing.overlord.hrtr.HttpRemoteTaskRunner - Starting...” should be printed on process start.
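The verification messages listed above can be checked with a small script instead of reading the logs by eye. The following is only a sketch: the option labels and the function name `check_markers` are made up for illustration, and the patterns are taken from the messages quoted above, so they may need adjusting to your logging format.

```python
import re

# Patterns based on the verification messages quoted in this post.
# The labels on the left are illustrative names, not Druid terminology.
EXPECTED_MARKERS = {
    "segment discovery": re.escape("Starting HttpServerInventoryView."),
    "segment load/drop": r"HttpLoadQueuePeon",
    # The post says "something like ... HttpRemoteTaskRunner - Starting...",
    # so allow any characters between the class name and "Starting".
    "task management": r"HttpRemoteTaskRunner.*Starting",
}


def check_markers(log_text, markers=EXPECTED_MARKERS):
    """Return {label: bool} telling whether each pattern occurs in log_text."""
    return {
        label: bool(re.search(pattern, log_text))
        for label, pattern in markers.items()
    }
```

To use it, read the relevant process log into a string and pass it to `check_markers`; each marker is only expected in the log of the process where the corresponding option was enabled.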
There are more configurations to tweak certain behavior, but the defaults should work fine.
For curious readers wondering how Coordinator HTTP based segment discovery manages to keep segment state up-to-date in real time, see https://github.com/druid-io/druid/blob/83c6c48bed5312f20d7344358dd9804d080f68c1/server/src/main/java/io/druid/server/http/SegmentListerResource.java#L87
(basically an implementation of “long polling” via HTTP). Overlord HTTP based task management uses the same mechanism to keep task state up-to-date.
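For readers unfamiliar with long polling, here is a minimal in-process sketch of the general idea. It is not Druid's implementation: the class `ChangeLog` and its methods are hypothetical, and a blocking method call stands in for the HTTP request. The reader passes the position it has already seen, and the call returns as soon as newer changes exist or the timeout expires.

```python
import threading


class ChangeLog:
    """Hypothetical append-only change history with long-poll style reads."""

    def __init__(self):
        self._changes = []  # every change published so far, in order
        self._cond = threading.Condition()

    def publish(self, change):
        """Record a change and wake up any readers blocked in poll()."""
        with self._cond:
            self._changes.append(change)
            self._cond.notify_all()

    def poll(self, last_seen, timeout):
        """Wait until changes exist after position last_seen, or until timeout.

        Returns (new_position, changes); changes is empty if the wait timed out.
        """
        with self._cond:
            self._cond.wait_for(
                lambda: len(self._changes) > last_seen, timeout=timeout
            )
            changes = self._changes[last_seen:]
            return last_seen + len(changes), changes
```

A reader calls `poll` in a loop, feeding the returned position back in each time. It therefore learns of a change almost immediately after it is published, without issuing frequent short requests.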
Note: The above options are available for testing bleeding-edge Druid code only at this time and should not be enabled in production. Once they have been verified in multiple test clusters, we will start removing the hard-coded Zookeeper based segment and task management.