etcd-operator: Split etcd across nodes in different availability zones

Drew Wells

Nov 13, 2017, 1:51:54 PM

to CoreOS Dev

We want to spread the etcd pods deployed by etcd-operator across different AZs. In theory, our etcd cluster could then survive a node going down, since no more than one etcd pod would be running on it.

As an example, say we have Kubernetes nodes in 3 different AZs: AZ1, AZ2, and AZ3.

Now I create a deployment with size: 3 for etcd. The desired behavior is that these pods are created: etcd-0001 on AZ1, etcd-0002 on AZ2, and etcd-0003 on AZ3. Basically, we want per-pod taints that decrease the chance an etcd pod will be scheduled with a peer on the same node. I've searched through the database for a hack to make this happen, but it doesn't appear possible to get per-pod behavior like this.
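To make the goal concrete, here is a minimal sketch (not etcd-operator code) of why the spread matters. It assumes only that etcd, being Raft-based, needs a majority of members to stay available; the helper names `quorum` and `survives_any_zone_loss` are made up for illustration.

```python
from collections import Counter


def quorum(size: int) -> int:
    """Number of members a Raft-based cluster such as etcd needs for a majority."""
    return size // 2 + 1


def survives_any_zone_loss(placement: dict[str, str]) -> bool:
    """Return True if losing any single zone still leaves a majority of members.

    `placement` maps a member name to the zone it runs in (hypothetical input).
    """
    per_zone = Counter(placement.values())
    total = len(placement)
    return all(total - lost >= quorum(total) for lost in per_zone.values())


# One member per AZ, as in the desired layout.
spread = {"etcd-0001": "AZ1", "etcd-0002": "AZ2", "etcd-0003": "AZ3"}
# Two members landed in the same AZ.
packed = {"etcd-0001": "AZ1", "etcd-0002": "AZ1", "etcd-0003": "AZ3"}

print(survives_any_zone_loss(spread))  # True
print(survives_any_zone_loss(packed))  # False
```

With three members the quorum is two, so the cluster tolerates losing one AZ only when no AZ holds more than one member.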

Are there recommendations for getting better reliability out of an etcd cluster when an AZ becomes unavailable, beyond what's been described in this example?

Best,

Drew

Brandon Philips

Nov 13, 2017, 1:55:28 PM

to coreo...@googlegroups.com

Hello Drew-

If you could specify an anti-affinity node selector, would that work?

Brandon

--

CTO, CoreOS, Inc

Tectonic is enterprise Kubernetes

https://coreos.com/tectonic

Drew Wells

Nov 13, 2017, 2:16:12 PM

to coreo...@googlegroups.com

Yeah, I believe that would work. I was not sure if the operator-created pods supported this.

Drew Wells

Nov 15, 2017, 3:54:52 PM

to coreo...@googlegroups.com

Is this something in the pipeline for etcd-operator?

hongch...@coreos.com

Nov 16, 2017, 3:36:06 PM

to CoreOS Dev

The etcd operator currently only supports anti-affinity across nodes. However, the feature could be extended beyond that.
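For reference, the difference between node-level and zone-level anti-affinity in a Kubernetes pod spec comes down to the `topologyKey` of the anti-affinity term. The sketch below builds such an `affinity` stanza as a plain dictionary. It is an illustration under assumptions, not the operator's implementation: the `etcd_cluster` pod label and the cluster name are assumed, and the zone label shown is the one in use at the time of this thread (newer clusters use `topology.kubernetes.io/zone`).

```python
import json

# Well-known node labels used as topology keys.
HOSTNAME_KEY = "kubernetes.io/hostname"
ZONE_KEY = "failure-domain.beta.kubernetes.io/zone"


def pod_anti_affinity(cluster_name: str, topology_key: str) -> dict:
    """Build a pod `affinity` stanza that keeps peers of one etcd cluster apart.

    The `etcd_cluster` label is an assumption about how member pods are labeled.
    """
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {
                        "matchLabels": {"etcd_cluster": cluster_name}
                    },
                    "topologyKey": topology_key,
                }
            ]
        }
    }


# Hypothetical cluster name; ZONE_KEY spreads peers across AZs,
# HOSTNAME_KEY would only spread them across nodes.
print(json.dumps(pod_anti_affinity("example-etcd-cluster", ZONE_KEY), indent=2))
```

A hard requirement like this leaves a pod unschedulable when no eligible zone is free; `preferredDuringSchedulingIgnoredDuringExecution` is the softer alternative.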

Could you create an issue describing your use case at https://github.com/coreos/etcd-operator/ ?

Thanks!

Drew Wells

Nov 16, 2017, 3:51:52 PM

to coreo...@googlegroups.com

I can, thanks!

Drew Wells

Nov 17, 2017, 3:02:53 PM

to coreo...@googlegroups.com

Issue opened: https://github.com/coreos/etcd-operator/issues/1676

Please let me know if it needs more clarification.
