How many etcd servers are appropriate for an etcd cluster spanning two datacenters?


Hiroaki Nakamura

Jun 2, 2017, 9:50:41 PM
to CoreOS User
Hi all,

I started evaluating etcd and am thinking about running an etcd cluster spanning two datacenters.

If I run an etcd cluster in just one datacenter, I understand that three or five etcd servers are good,
since an odd number makes it clear which servers form a majority.

With two datacenters, how many etcd servers are appropriate?
Five servers, with two servers in one datacenter and three servers in the other?

Could you give some advice?

Thanks!
Hiroaki

paul...@coreos.com

Jun 6, 2017, 11:24:19 AM
to CoreOS User
Hi Hiroaki,

Five servers would work just as well in two data centers as in one. The big downside is that if an entire data center disappears, you won't have a majority. That's unavoidable with a two data center setup, no matter how large your cluster is.

Other important notes are in this guide: http://coreos.com/etcd/docs/latest/admin_guide.html
    "The recommended etcd cluster size is 3, 5 or 7, which is decided by the fault tolerance requirement. A 7-member cluster can provide enough fault tolerance in most cases. While larger cluster provides better fault tolerance the write performance reduces since data needs to be replicated to more machines."

Growing the number of nodes in your cluster is a tradeoff.
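The quorum arithmetic behind those recommendations can be sketched in a few lines (a toy illustration of Raft majority counting, not part of etcd itself):

```python
def quorum(n):
    """Majority required for an n-member etcd cluster to commit writes."""
    return n // 2 + 1

def fault_tolerance(n):
    """How many members can fail while the cluster remains writable."""
    return n - quorum(n)

# Even sizes add replication cost without adding fault tolerance:
# a 6-member cluster tolerates the same 2 failures as a 5-member one.
for n in range(1, 8):
    print(f"size={n}  quorum={quorum(n)}  tolerates={fault_tolerance(n)}")
```

This is why the docs recommend 3, 5, or 7: each step to the next odd size buys one more tolerable failure, while even sizes buy nothing.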

Cheers,

Hiroaki Nakamura

Jun 7, 2017, 11:05:56 AM
to paul...@coreos.com, CoreOS User
Hi Paul,

Thanks for your response.

I'll go for a cluster size of 5, that is, run 2 servers in data center A and
3 servers in data center B.
And I'll prepare to set up 1 additional server in data center A if data
center B disappears.

Thanks again!

Seán C. McCord

Jun 7, 2017, 11:48:53 AM
to Hiroaki Nakamura, paul...@coreos.com, CoreOS User
Just keep in mind that it will not be sufficient to simply add another etcd server in DC A.  You would still fail quorum even after adding the replacement server in DC A, because quorum requires a majority, not just a plurality: you would have 3 live members of 6, rather than 3 of 5.  Thus, you would have to remove one of the failed members and add your new one, all without the cluster being up.  This is a non-trivial recovery.
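The 3-of-6 trap above can be checked with the same strict-majority rule (a toy check, not etcd code):

```python
def has_quorum(alive, members):
    """Raft needs a strict majority of the *registered* member list."""
    return alive > members // 2

# 5-member cluster: 3 in DC B, 2 in DC A. DC B burns down:
assert not has_quorum(2, 5)   # 2 of 5 alive -- no quorum

# Naively adding a replacement in DC A grows the member list to 6:
assert not has_quorum(3, 6)   # 3 of 6 alive -- still no quorum

# Remove one dead member AND add the new one -> 3 of 5 -> quorum:
assert has_quorum(3, 5)
```

The key point is that quorum is counted against the registered membership, not against the servers that happen to be running.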


--
Seán C McCord
CyCore Systems, Inc

Hiroaki Nakamura

Jun 9, 2017, 9:13:48 AM
to Seán C. McCord, paul...@coreos.com, CoreOS User
Hi Seán,
Thanks for your comment.

I thought there would remain 2 servers in DC A if DC B disappeared.
And if I add one server in DC A, then there would be 3 servers in DC A.

Could you tell me what you mean by 6 servers?
Do the 3 servers come back to the cluster if DC B recovers?

If so, how do I prevent that?
Is it something like this?
(1) Stop the remaining 2 servers in DC A.
(2) Change initial-cluster and initial-cluster-token in the config.
https://github.com/coreos/etcd/blob/864ffec88c7a53896cd2dedf5f3544b65be980b3/embed/config.go#L96-L97
(3) Start the 2 servers in DC A.
(4) Add 1 server in DC A.
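For reference: once quorum is already lost, etcd's documented escape hatch is the `--force-new-cluster` flag rather than hand-editing initial-cluster. A rough sketch of that path (server names, IPs, and the data directory below are placeholders, not from this thread):

```shell
# On one surviving server in DC A, restart etcd with --force-new-cluster.
# This keeps the local data but resets the member list to just this node.
etcd --name infra0 \
     --data-dir /var/lib/etcd \
     --force-new-cluster

# Then grow the cluster back one member at a time (etcd v2 etcdctl syntax):
etcdctl member add infra1 http://10.0.1.11:2380
# ...start etcd on infra1 with --initial-cluster-state=existing, and repeat.
```

See the disaster recovery section of the admin guide linked earlier for the full procedure.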

Seán C. McCord

Jun 9, 2017, 11:15:23 AM
to Hiroaki Nakamura, paul...@coreos.com, CoreOS User
All I'm saying is that you will need to remember to _remove_ the failed DC's members from the cluster.

If you start with 5 members and 3 are burned, the cluster still has 5 registered members, so adding another member brings you up to 6, of which only 3 will be alive.  3 out of 6 members will not give you quorum.

Another issue I did not mention before: if you merely lost connectivity with the failed DC and then you recover the remaining cluster, you will have a split brain.  The "failed" DC will have been operating with 3/5 (quorum), which is perfectly fine from its perspective.

Again, the best answer here is to have a third location, the tie-breaker.  Then any location can go down without concern for split brain or loss of remaining functionality.

The point is that you want to maintain an odd number not only of members, but of data centers/locations/failable components.
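The three-location advice can be sanity-checked the same way (toy placement with hypothetical site names):

```python
def has_quorum(alive, members):
    """Strict majority of the registered member list."""
    return alive > members // 2

# 5 members across three sites: 2 in A, 2 in B, 1 in C (the tie-breaker).
placement = {"A": 2, "B": 2, "C": 1}
members = sum(placement.values())

# Losing any single site still leaves a majority alive:
for site, count in placement.items():
    assert has_quorum(members - count, members)

# Whereas a 2+3 split across two sites cannot survive losing the larger one:
assert not has_quorum(2, 5)
```

With the tie-breaker site, no single location's failure can remove a majority, so neither quorum loss nor split brain is possible from one outage.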

Hiroaki Nakamura

Jun 9, 2017, 10:27:47 PM
to Seán C. McCord, paul...@coreos.com, CoreOS User
Thanks for your detailed explanation.
I understand what you're saying now and plan to have a third location.
Thanks!