Scenario:
- A Cassandra cluster is built from 3 pods.
- A failure occurs on one of the Kubernetes worker nodes.
- A replacement worker node joins the Kubernetes cluster.
- A new pod from the StatefulSet is scheduled on the new node.
- As the pod's IP address has changed, the new pod is visible as a new Cassandra node (4 nodes in the cluster in total) and is unable to bootstrap until the dead one is removed (see the cleanup sketch below).
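For reference, a rough sketch of that cleanup, run from one of the surviving pods (the pod name "cassandra-0" and the host ID are placeholders):

    # Find the Host ID of the dead node (listed with status DN)
    kubectl exec cassandra-0 -- nodetool status

    # Remove the dead node so the replacement pod can finish bootstrapping
    kubectl exec cassandra-0 -- nodetool removenode <host-id-of-dead-node>

    # Last resort if removenode gets stuck streaming
    kubectl exec cassandra-0 -- nodetool removenode force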
It's very difficult to follow the official procedure (https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsReplaceNode.html), as Cassandra is run as a StatefulSet: the procedure boils down to starting the replacement node with a special JVM flag, and there is no clean way to inject that flag into just one pod.
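For context, the crux of that procedure is roughly this line in cassandra-env.sh (on 3.x there is also replace_address_first_boot, which is ignored after the first successful bootstrap):

    # The replacement node must start with the dead node's IP address
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<dead_node_ip>"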
One completely hacky workaround I've found is to use a ConfigMap to supply JAVA_OPTS. As changing a ConfigMap doesn't recreate pods (yet), you can manipulate a running pod's startup options and thus follow the procedure; a sketch follows below.
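A minimal sketch of that workaround, with heavy assumptions: the ConfigMap name and key are hypothetical, and the container entrypoint is assumed to append the mounted jvm.options file to JAVA_OPTS on every container start:

    # 1. Add the replace flag to the ConfigMap the pods read their extra
    #    JVM options from (ConfigMap name and key are hypothetical)
    kubectl patch configmap cassandra-config \
      -p '{"data":{"jvm.options":"-Dcassandra.replace_address=<dead_node_ip>"}}'

    # 2. The change reaches running pods without recreating them; the stuck
    #    replacement pod picks the flag up on its next container restart
    #    (or delete it to force one)
    kubectl delete pod cassandra-2

    # 3. After the node has bootstrapped, revert the ConfigMap so future
    #    restarts don't attempt another replacement
    kubectl patch configmap cassandra-config -p '{"data":{"jvm.options":""}}'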
However, as I mentioned, that's super hacky. I'm wondering if anyone running Cassandra on top of Kubernetes has a better idea of how to deal with such a failure?
K.