Backing up master etcd

ad...@cloudhealthtech.com

Mar 8, 2017, 8:08:13 AM

to Kubernetes user discussion and Q&A

I would like to add a Kubernetes cron job that backs up the master etcd and sends the backup off to S3. I'm currently having trouble figuring out how to expose the etcd server (running on the master node, or elsewhere) to the cron job container.

Does anyone have insight into how to do this? Or is there a better, recommended approach?

Thanks,
Adam

Daniel Smith

Mar 8, 2017, 1:17:23 PM

to kubernet...@googlegroups.com, Matt Liggett, Wojciech Tyczynski

It's a bit of a shame that we don't have an "insert your backup code here" spot for this in the default setup scripts. If you're running only one etcd replica, this is fairly important. Note that old backups (how old depends on churn in the cluster) are not likely to do much good: restoring one rewinds the state of the cluster, which probably requires a cluster-wide reboot if you want to respect the "ResourceVersions never go backwards" constraint, and could, for example, re-run jobs that were supposed to run only once. So the best backups are frequent ones.

Depending on your cluster setup, you might be able to exec something in the etcd container (I think our recent containers include the correct etcdctl tool?), but I'd probably recommend modifying the etcd pod definition to include your backup script in a sidecar container. (I'm assuming you have total admin access to the cluster.)
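As a rough illustration of what such a backup script might run, whether exec'd in the etcd container or placed in a sidecar, here is a minimal Python sketch. It only assembles and runs the commands. The endpoint, data directory, bucket name, key prefix, and all function names are assumptions for illustration, and it assumes both `etcdctl` and the `aws` CLI are available in the container. The v3 branch uses `etcdctl snapshot save`; the v2 branch uses `etcdctl backup`, which copies the data directory instead.

```python
import datetime
import os
import subprocess


def snapshot_command(api_version, target,
                     endpoint="http://127.0.0.1:2379",  # assumed endpoint
                     data_dir="/var/lib/etcd"):          # assumed data dir
    """Return (argv, extra_env) for a backup with the matching etcdctl."""
    if api_version == 3:
        # etcd v3: snapshot taken over the client API into a single file.
        argv = ["etcdctl", "--endpoints=" + endpoint,
                "snapshot", "save", target]
        return argv, {"ETCDCTL_API": "3"}
    if api_version == 2:
        # etcd v2: copy of the data directory into a backup directory.
        argv = ["etcdctl", "backup",
                "--data-dir", data_dir, "--backup-dir", target]
        return argv, {}
    raise ValueError("unsupported etcd API version: %r" % (api_version,))


def s3_key(now, prefix="etcd-backups"):  # hypothetical key layout
    """Build a timestamped object key so frequent backups do not collide."""
    return "%s/etcd-%s" % (prefix, now.strftime("%Y%m%dT%H%M%SZ"))


def upload_command(local_path, bucket, key, recursive=False):
    """Return the argv for copying the backup to S3 with the aws CLI."""
    argv = ["aws", "s3", "cp", local_path, "s3://%s/%s" % (bucket, key)]
    if recursive:
        # A v2 backup is a directory, so it has to be copied recursively.
        argv.append("--recursive")
    return argv


def run_backup(api_version, local_path, bucket, now=None,
               runner=subprocess.run):
    """Take one backup and upload it; raises if either command fails."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    argv, extra_env = snapshot_command(api_version, local_path)
    env = dict(os.environ, **extra_env)
    runner(argv, env=env, check=True)
    runner(upload_command(local_path, bucket, s3_key(now),
                          recursive=(api_version == 2)),
           env=env, check=True)
```

Something like `run_backup(3, "/backup/snapshot.db", "my-backup-bucket")` could then be invoked on a schedule. The `runner` parameter exists only so the command sequence can be exercised without a live etcd.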

It's also worth mentioning that you haven't successfully backed anything up until you've tested the restore procedure! Test after each etcd version upgrade, if not continuously. (Kubernetes 1.6 is going to migrate clusters to etcd 3.0.x, for example, and the etcdctl tool is *not* compatible across many versions, if any.)
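A restore test along these lines could start as small as the following Python sketch. It assumes an etcd v3 snapshot file and an `etcdctl` that matches the cluster's etcd version; the function names and the scratch-directory layout are hypothetical. It restores the snapshot into a fresh scratch directory with `etcdctl snapshot restore` and reports whether the command succeeded.

```python
import os
import subprocess
import tempfile


def restore_command(snapshot_path, data_dir):
    """Return the argv that restores a v3 snapshot into data_dir."""
    return ["etcdctl", "snapshot", "restore", snapshot_path,
            "--data-dir", data_dir]


def restore_drill(snapshot_path, runner=subprocess.run):
    """Restore a snapshot into a scratch directory.

    Returns (ok, data_dir). The data directory is a not-yet-existing
    subdirectory of a fresh temp dir, because the restore expects to
    create it itself.
    """
    scratch = tempfile.mkdtemp(prefix="etcd-restore-drill-")
    data_dir = os.path.join(scratch, "data")
    env = dict(os.environ, ETCDCTL_API="3")
    try:
        runner(restore_command(snapshot_path, data_dir), env=env, check=True)
    except (OSError, subprocess.CalledProcessError):
        # etcdctl missing, wrong version, or snapshot unreadable.
        return False, data_dir
    return True, data_dir
```

A successful restore command is only the first half of the drill. A fuller test would start a throwaway etcd on the restored directory and read back a key that is known to exist, and it would be re-run whenever the etcd or etcdctl version changes.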

br...@flightdeckdata.com

Mar 8, 2017, 1:38:31 PM

to Kubernetes user discussion and Q&A

Have you looked at Burry?

https://github.com/mhausenblas/burry.sh

ma...@arroyonetworks.com

Mar 10, 2017, 9:48:13 AM

to Kubernetes user discussion and Q&A

On Wednesday, March 8, 2017 at 8:08:13 AM UTC-5, Adam Schepis wrote:

This page might be helpful: https://coreos.com/etcd/docs/latest/v2/admin_guide.html#disaster-recovery

Brandon Philips

Mar 10, 2017, 1:22:25 PM

to kubernet...@googlegroups.com, Matt Liggett, Wojciech Tyczynski, Xiang Li, Hongchao Deng

We are actively working on using the etcd Operator underneath Kubernetes to enable use of its backup and recovery integrations.

Help wanted! The bootkube incubator project has a flag, `--experimental-self-hosted-etcd`, which we have just put into CI/CD.