How does DL do load balancing?


Jon Derrick

May 13, 2016, 2:29:05 AM5/13/16
to distributedlog-user
Hello,

I am new to DistributedLog.

In Kafka, if you add a broker to a cluster to handle increased demand, new partitions are allocated to it, but the broker does not automatically share the load of existing partitions on other brokers. Admins have to redistribute the data by manually reassigning partitions.

Does DL have a similar concept for redistributing load when adding new machines? If so, what algorithm does it use?

- jderrick

Leigh Stewart

May 13, 2016, 10:53:38 AM5/13/16
to Jon Derrick, distributedlog-user
Depends on the configuration you run the system in. 

The important thing to understand is that BookKeeper log segments are evenly spread across the cluster of BookKeeper nodes, transparently. This is one significant advantage of using BookKeeper as the storage layer. Unlike Kafka, where entire copies of partitions are hosted on a single shard, DL streams are segmented and the segments are randomly distributed across nodes.

So adding a node is a very low-touch operation: it will transparently and immediately be included in the cluster and start accepting new log segments - no manual operation required.
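
To make the idea concrete, here is a minimal toy sketch in Java (not the real BookKeeper placement policy or client API; the node names and sizes are made-up assumptions): each new log segment picks its own ensemble of storage nodes, so a freshly added node immediately becomes eligible for new segments and no existing data has to move.

// Illustrative sketch only -- not the actual BookKeeper placement policy or API.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PlacementSketch {
    // Pick an ensemble of `ensembleSize` bookies for one new log segment.
    static List<String> pickEnsemble(List<String> bookies, int ensembleSize) {
        List<String> shuffled = new ArrayList<>(bookies);
        Collections.shuffle(shuffled);
        return shuffled.subList(0, ensembleSize);
    }

    public static void main(String[] args) {
        List<String> bookies =
                new ArrayList<>(List.of("bookie-1", "bookie-2", "bookie-3", "bookie-4"));

        // Segments of a single stream land on different ensembles.
        for (int segment = 1; segment <= 3; segment++) {
            System.out.println("segment " + segment + " -> " + pickEnsemble(bookies, 3));
        }

        // "Adding a node": it simply joins the candidate set and starts
        // receiving new segments. Nothing existing is moved.
        bookies.add("bookie-5");
        System.out.println("segment 4 -> " + pickEnsemble(bookies, 3));
    }
}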


Sijie Guo

May 14, 2016, 3:11:06 AM5/14/16
to Leigh Stewart, Jon Derrick, distributedlog-user
To add to what Leigh said, the way DL stores data is different from how Kafka stores data. For a DL stream, the write proxy doesn't actually store the data locally; it stores log segments in BookKeeper. That means even if a DL stream is hot, its data is segmented and spread across storage nodes. However, a Kafka partition is the finest granularity of storage, so when a Kafka partition is hot, you basically can't do anything except move data around. That's my 2c.
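
A rough illustration of that difference, using assumed numbers and toy Java rather than real DL or Kafka code: a hot DL stream rolls into many log segments that land on different storage nodes, while a Kafka partition's full copy stays on one broker.

// Illustrative sketch with assumed sizes -- not real DL or Kafka code.
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HotStreamSketch {
    public static void main(String[] args) {
        List<String> nodes = List.of("node-1", "node-2", "node-3", "node-4");
        long segmentBytes = 512L * 1024 * 1024;   // assumed 512 MB per rolled segment
        int segments = 16;                        // a "hot" stream that rolled 16 times

        Map<String, Long> bytesPerNode = new HashMap<>();
        for (int i = 0; i < segments; i++) {
            // Round-robin stands in for segment placement here; the point is only
            // that each segment can land on a different node.
            String node = nodes.get(i % nodes.size());
            bytesPerNode.merge(node, segmentBytes, Long::sum);
        }
        System.out.println("DL-style segmented stream: " + bytesPerNode);

        // Kafka-style: the whole partition's bytes sit on one broker (plus its
        // replicas), so relieving a hot broker means physically moving partition data.
        System.out.println("Kafka-style partition: {broker-1=" + (segmentBytes * segments) + "}");
    }
}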

- Sijie

Jon Derrick

May 14, 2016, 4:07:39 PM5/14/16
to Sijie Guo, Leigh Stewart, distributedlog-user
Thanks Sijie & Leigh.

It seems DL has a different design than Kafka. If so, when a broker (I am not sure what the equivalent is in DL, the write proxy or the BookKeeper node?) goes down and then comes back, does it have to catch up with the leader broker?

jderrick

Leigh Stewart

Jun 6, 2016, 11:38:51 AM6/6/16
to Jon Derrick, Sijie Guo, distributedlog-user
No, it doesn't. "Catch up" is a master/slave replication concept. With BookKeeper's direct replication, everything is always caught up.
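
For intuition, here is a minimal sketch of the quorum-write idea behind that statement (toy Java, not the BookKeeper client API; the ensemble and quorum sizes are assumptions): a write is durable once an ack quorum of the ensemble responds, so a node that was down simply rejoins the pool for later writes instead of replaying a leader's log, and under-replicated entries can be repaired by copying from the nodes that already have them.

// Toy sketch of quorum writes -- not the BookKeeper client API.
import java.util.List;
import java.util.Set;

public class QuorumWriteSketch {
    // An entry is durable once at least `ackQuorum` ensemble members store it.
    static boolean writeEntry(List<String> ensemble, Set<String> downNodes, int ackQuorum) {
        int acks = 0;
        for (String node : ensemble) {
            if (!downNodes.contains(node)) {
                acks++;   // node stored the entry and acknowledged
            }
        }
        return acks >= ackQuorum;  // durable without waiting for the down node
    }

    public static void main(String[] args) {
        List<String> ensemble = List.of("bookie-1", "bookie-2", "bookie-3");

        // Writes keep succeeding while bookie-3 is down.
        System.out.println("bookie-3 down, ack quorum 2: "
                + writeEntry(ensemble, Set.of("bookie-3"), 2));

        // When bookie-3 comes back it just serves new writes; there is no
        // leader log for it to replay.
        System.out.println("bookie-3 back,  ack quorum 2: "
                + writeEntry(ensemble, Set.of(), 2));
    }
}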