Etcd: On node drain sometimes etcd goes to CrashloopBackOff state


ABHAY KUMAR

Apr 2, 2024, 3:22:17 AM
to etcd-dev
Hello Guys,

Setup info:
I have a 3-node Kubernetes cluster (Ubuntu) running etcd 3.5.9, deployed with the bitnami helm charts.
Multiple pods belonging to the etcd StatefulSet are allowed to be scheduled onto the same node.
NOTE: This issue is intermittent.

Issue: 
When I drain one of the nodes, the etcd instance on that node restarts and gets scheduled onto an available node. Normally, a terminating etcd instance removes itself from the etcd cluster member list (member is removed), and when it comes back up it joins the member list again (member is added). In the failing case, however, the etcd member is missing from the etcd member list.
voltha  voltha-etcd-cluster-client-0  1/1 Running  0 14h
voltha  voltha-etcd-cluster-client-1  1/1 Running  0 14h
voltha  voltha-etcd-cluster-client-2  0/1 CrashLoopBackOff 173 (4m31s ago) 14h

The logs below are from the first time etcd crashes; it fails with a "member not found" error. After Kubernetes restarts it, etcd never comes up and stays stuck in the CrashLoopBackOff state.

Error Logs:


2024-02-20T15:03:22.167595028Z stderr F etcd 15:03:22.16 INFO ==> ** Starting etcd setup **
2024-02-20T15:03:22.185870876Z stderr F etcd 15:03:22.18 INFO ==> Validating settings in ETCD_* env vars..
2024-02-20T15:03:22.188192766Z stderr F etcd 15:03:22.18 WARN ==> You set the environment variable ALLOW_NONE_AUTHENTICATION=yes. For safety reasons, do not use this flag in a production environment.
2024-02-20T15:03:22.19508585Z stderr F etcd 15:03:22.19 INFO ==> Initializing etcd
2024-02-20T15:03:22.197954667Z stderr F etcd 15:03:22.19 INFO ==> Generating etcd config file using env variables
2024-02-20T15:03:22.221563855Z stderr F etcd 15:03:22.22 INFO ==> Detected data from previous deployments
2024-02-20T15:03:22.336259228Z stderr F etcd 15:03:22.33 INFO ==> Updating member in existing cluster
2024-02-20T15:03:22.38884315Z stderr F {"level":"warn","ts":"2024-02-20T15:03:22.388607Z","logger":"etcd-client","caller":"v...@v3.5.9/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0001c0000/voltha-etcd-cluster-client-0.voltha-etcd-cluster-client-headless.voltha.svc.cluster.local:2379","attempt":0,"error":"rpc error: code = NotFound desc = etcdserver: member not found"}
2024-02-20T15:03:22.388876646Z stderr F Error: etcdserver: member not found

When I checked, the member really is not present in the member list:
kv exec -it voltha-etcd-cluster-client-1 -- etcdctl member list -w table
+----+--------+------+------------+--------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+----+--------+------+------------+--------------+------------+
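
For anyone who wants to script the same check, here is a minimal sketch. It assumes the default ("simple") output of etcdctl member list in 3.5.x, which prints one comma-separated line per member in the column order shown above. The function name member_present is hypothetical, not part of etcdctl or the bitnami scripts.

```shell
#!/bin/sh
# member_present NAME (hypothetical helper)
# Reads the default "simple" output of `etcdctl member list` on stdin.
# Assumed line format: ID, STATUS, NAME, PEER ADDRS, CLIENT ADDRS, IS LEARNER
# Exits 0 if a member called NAME is listed, 1 otherwise.
member_present() {
    awk -F', ' -v name="$1" '
        $3 == name { found = 1 }   # third column is the member name
        END { exit !found }        # exit status 0 only when the name was seen
    '
}

# Example use against a healthy pod (needs a running cluster, so left commented out):
# kv exec voltha-etcd-cluster-client-1 -- etcdctl member list \
#     | member_present voltha-etcd-cluster-client-2 \
#     || echo "voltha-etcd-cluster-client-2 is missing from the member list"
```

The example leaves out -it so that no terminal carriage returns end up in the piped output.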

As this issue is intermittent, I am hoping someone can help me resolve it, or tell me whether it is a known issue that has been fixed in a later release.


Thanks,

Abhay

James Blair

Apr 2, 2024, 6:42:55 PM
to etcd-dev
Hi Abhay - Thanks for your question.

The etcd project does not currently provide support for etcd delivered via the bitnami helm charts.

For support with the behaviour of etcd run via the bitnami charts please raise an issue with https://github.com/bitnami/charts/issues.

If you can recreate the issue outside the context of the bitnami charts, or confirm that the behaviour is not related to the custom scripts used in those charts, then please let us know with clear reproduction steps by raising a bug at https://github.com/etcd-io/etcd/issues/new?assignees=&labels=type%2Fbug&projects=&template=bug-report.yml and we can take a look.


Thanks
James

ABHAY KUMAR

Apr 3, 2024, 1:39:45 PM
to etcd-dev
Thanks for the quick reply, James.