How does etcd handle network outages between hosts?

Petr Horáček

Aug 6, 2016, 7:59:01 AM
to etcd-dev
Hello,

let's say I have an etcd cluster spread across 2 switches, with 5 hosts connected to each switch and the switches connected to each other. What happens when the connection between the switches is cut? Will there be two sub-clusters, each with its own primary host? And what happens if, after data changes in each sub-cluster, the connection between the switches comes back up?

Thanks in advance,
Petr

Vick Khera

Aug 6, 2016, 12:13:10 PM
to etcd-dev
You need to have an odd number of servers to ensure quorum can be computed properly.

Seán C. McCord

Aug 6, 2016, 12:33:02 PM
to Vick Khera, etcd-dev

Proper calculation of quorum has nothing to do with the odd or even count of members.  The algorithm will never result in a tie.  It requires _greater_ than 50%, not 50%.  This is an important distinction, lest people begin to think that running an even-numbered set is somehow unsafe for the integrity of their data.

The reason for maintaining an odd number of members is that the additional member in an even-numbered set is marginalized by that same algorithm: it adds weight, overhead, and a higher quorum requirement without supplying any meaningful redundancy.
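
To make the arithmetic concrete, here is a minimal sketch of the majority rule described above. The function names `quorum` and `fault_tolerance` are made up for illustration and are not part of etcd; the only assumption is that quorum means strictly more than half of the members.

```python
def quorum(members: int) -> int:
    """Smallest member count that is strictly greater than 50%."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can be lost while a quorum is still reachable."""
    return members - quorum(members)


# Compare odd and even cluster sizes side by side.
for n in range(1, 8):
    print(f"{n} members: quorum={quorum(n)}, "
          f"tolerated failures={fault_tolerance(n)}")
```

Going from 3 to 4 members raises the quorum from 2 to 3 while the number of tolerated failures stays at 1, which is the sense in which the extra member is marginalized.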


--
Seán C McCord
CyCore Systems, Inc

Brandon Philips

Aug 8, 2016, 8:59:42 PM
to Seán C. McCord, Vick Khera, etcd-dev
Playing with http://play.etcd.io/ can help you understand under what conditions writes can go through.
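
As a rough offline companion to experimenting there, the sketch below applies the strict-majority rule from earlier in the thread to a network split. It assumes every host in the original question is a voting etcd member (10 in total), and the helper name `sides_with_quorum` is hypothetical, not an etcd API.

```python
def quorum(members: int) -> int:
    """Smallest member count that is strictly greater than 50%."""
    return members // 2 + 1


def sides_with_quorum(partition_sizes: list[int]) -> list[bool]:
    """For each side of a split, report whether it still holds a majority
    of the full cluster and could therefore keep committing writes."""
    needed = quorum(sum(partition_sizes))
    return [size >= needed for size in partition_sizes]


# The scenario from the question: 10 members, 5 behind each switch.
print(sides_with_quorum([5, 5]))  # [False, False]

# A 5-member cluster split 3/2: only the larger side keeps quorum.
print(sides_with_quorum([3, 2]))  # [True, False]
```

Under that assumption, an even 5/5 split leaves neither side with the 6 members a majority needs, so the two sides cannot diverge by both accepting writes.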

Petr Horáček

Aug 9, 2016, 3:16:29 AM
to Brandon Philips, Seán C. McCord, Vick Khera, etcd-dev
Hello guys,

thanks for the responses.

I think I understand it (a little bit better) now. If I want to handle
a split of the cluster, I have to remove the disconnected hosts from both
sides. When I want them to become one big cluster again, I have to do
the sync manually. There is no mechanism inside etcd that can handle
it (and that's good: sync is specific to each user, keep it
simple).

Thanks and keep up the good work,
Petr