ETCD Backup & Restore issue for k8s V1.21.5


Paresh Shinde

Jan 3, 2024, 3:40:48 AM
to etcd-dev
Hello All,
I have a 3-node etcd cluster in my Kubernetes environment. It is an external etcd cluster, not one running in pods. I took a snapshot on each of the 3 etcd nodes and tried to restore each snapshot on its respective node, but the etcd service sometimes starts and sometimes does not. When it does start, every node becomes a leader and the cluster does not work as expected.
I also tried the restore process with a single node's snapshot. That works for the single node, but when I start the other nodes with fresh data directories, they fail to start. I expected the remaining nodes to sync with the running node. Could someone please suggest the correct way to recover the etcd cluster? I am attaching a screenshot for reference.
Screenshot from 2024-01-01 14-58-00.png

Benjamin Wang

Jan 3, 2024, 5:50:36 AM
to etcd-dev
Hi,
If you diagnose it using https://github.com/ahrtr/etcd-diagnosis, you may find that the three members do not belong to the same cluster at all.
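
As a rough manual check of the same thing, the cluster ID reported by each endpoint can be compared. The sketch below is only an illustration: it assumes the output of `etcdctl endpoint status -w json` has been parsed into a list of entries shaped like `{"Endpoint": ..., "Status": {"header": {"cluster_id": ...}}}`, and the endpoints and IDs in `sample` are made up. `group_by_cluster_id` is a hypothetical helper, not part of any etcd tool.

```python
from collections import defaultdict


def group_by_cluster_id(status_entries):
    """Group endpoint addresses by the cluster ID each endpoint reports.

    The entry layout (Endpoint / Status / header / cluster_id) is an
    assumption about the JSON printed by `etcdctl endpoint status -w json`.
    """
    groups = defaultdict(list)
    for entry in status_entries:
        cluster_id = entry["Status"]["header"]["cluster_id"]
        groups[cluster_id].append(entry["Endpoint"])
    return dict(groups)


# Made-up sample data: three endpoints that each report a different cluster ID.
sample = [
    {"Endpoint": "http://host1:2379", "Status": {"header": {"cluster_id": 101}}},
    {"Endpoint": "http://host2:2379", "Status": {"header": {"cluster_id": 202}}},
    {"Endpoint": "http://host3:2379", "Status": {"header": {"cluster_id": 303}}},
]

groups = group_by_cluster_id(sample)
if len(groups) > 1:
    print("members report different cluster IDs:", groups)
else:
    print("all members report the same cluster ID")
```

If more than one cluster ID shows up, the members were most likely bootstrapped as separate clusters.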

You need to restore the cluster (all members) from the same single snapshot. Please refer to https://etcd.io/docs/v3.5/op-guide/recovery/#restoring-a-cluster .
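
To make the "same snapshot for every member" point concrete, here is a small sketch that builds one restore command per member. Every command uses the same snapshot file, the same `--initial-cluster` list and the same `--initial-cluster-token`. Only `--name` and `--initial-advertise-peer-urls` differ. The member names, peer URLs, token and data directory below are placeholders, not values from your environment, and `restore_commands` is a hypothetical helper. On etcd 3.5 and later the linked guide uses `etcdutl snapshot restore` with the same flags.

```python
import shlex


def restore_commands(snapshot, members, token, data_dir):
    """Build one `etcdctl snapshot restore` command line per member.

    members maps a member name to its peer URL. All commands share the
    snapshot file, the --initial-cluster list and the cluster token.
    """
    initial_cluster = ",".join(f"{name}={url}" for name, url in members.items())
    commands = {}
    for name, peer_url in members.items():
        args = [
            "etcdctl", "snapshot", "restore", snapshot,
            "--name", name,
            "--initial-cluster", initial_cluster,
            "--initial-cluster-token", token,
            "--initial-advertise-peer-urls", peer_url,
            "--data-dir", data_dir,
        ]
        commands[name] = shlex.join(args)
    return commands


# Placeholder values for illustration only.
members = {
    "m1": "http://host1:2380",
    "m2": "http://host2:2380",
    "m3": "http://host3:2380",
}
commands = restore_commands("snapshot.db", members, "etcd-cluster-1", "/var/lib/etcd-restored")
for name, command in commands.items():
    print(f"# run on {name}")
    print(command)
```

Each command is meant to be run on its own node with a copy of the one snapshot file, writing into a data directory that does not exist yet. Each etcd member is then started with the same name, peer URL and data directory that were used in its restore command.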

Regards,
Benjamin
