# Problem description
When upgrading the control plane with kubeadm in a stacked etcd topology, in-flight requests suffer terrible processing times: when the etcd pod is updated, kube-apiserver has not been drained and is still processing requests. For more details, you can check this issue. As a reminder, here's what currently happens during a kubeadm upgrade:
1. etcd is upgraded, which typically takes between 5s and 30s.
2. During that time, kube-apiserver logs countless errors, and in-flight requests stall (see the sketch after this list).
3. Once etcd is back online, in-flight requests may complete (depending on the etcd downtime), with a substantial delay.
4. kube-apiserver pod upgrade starts.
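
For illustration, here is a minimal client-go sketch of what a caller experiences during step 2: the request is accepted by kube-apiserver but stalls while etcd is unavailable, until etcd comes back or the client-side timeout fires. The kubeconfig path, namespace, and timeout are assumptions:

```go
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Hypothetical kubeconfig path; any admin kubeconfig would do.
	config, err := clientcmd.BuildConfigFromFlags("", "/etc/kubernetes/admin.conf")
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Bound the wait so the stall is observable rather than indefinite.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	start := time.Now()
	_, err = clientset.CoreV1().Pods("default").List(ctx, metav1.ListOptions{})
	fmt.Printf("list took %v, err=%v\n", time.Since(start), err)
}
```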
# Maintenance Mode
My idea is to introduce a maintenance mode for kube-apiserver, triggered with a
`SIGUSR1` signal, with the following effects (sketched in code after this list):
- the /readyz endpoint reports a 503 status code.
- kube-apiserver removes itself from kubernetes.default.svc endpoints.
- kube-apiserver continues processing in-flight requests, but sends GOAWAY to new requests.
- kube-apiserver stays in 'maintenance mode' until it receives a `SIGTERM` (i.e. until termination).
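
A minimal sketch of the proposed signal handling, assuming a plain HTTP server on a hypothetical port; the endpoints removal and GOAWAY behavior are only noted in comments:

```go
package main

import (
	"net/http"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
)

// maintenance is set when SIGUSR1 is received and never cleared:
// per the proposal, only SIGTERM (i.e. termination) ends the mode.
var maintenance atomic.Bool

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGUSR1)
	go func() {
		<-sigs
		maintenance.Store(true)
		// The real server would also remove itself from the
		// kubernetes.default.svc endpoints and start sending GOAWAY
		// to new HTTP/2 streams; both are elided in this sketch.
	}()

	http.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		if maintenance.Load() {
			http.Error(w, "maintenance", http.StatusServiceUnavailable)
			return
		}
		w.Write([]byte("ok"))
	})
	http.ListenAndServe(":8080", nil) // hypothetical port for the sketch
}
```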
Overall, maintenance mode would be akin to a shutdown, except that the pod would continue to run. This would let kubeadm transparently put kube-apiserver into maintenance mode during an upgrade, typically just before upgrading etcd; a sketch of that sequence follows.
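
Here is a hedged sketch of the kubeadm side under that assumption; the PID, endpoint, and TLS handling are all placeholders:

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"syscall"
	"time"
)

func main() {
	apiserverPID := 1234 // hypothetical: resolved from the static pod in practice

	// Ask kube-apiserver to enter maintenance mode.
	if err := syscall.Kill(apiserverPID, syscall.SIGUSR1); err != nil {
		panic(err)
	}

	// The control plane serves with its own CA; verification is skipped
	// here purely to keep the sketch short.
	client := &http.Client{
		Timeout: 2 * time.Second,
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
		},
	}

	// Wait until /readyz reports 503, i.e. maintenance mode is active.
	for {
		resp, err := client.Get("https://127.0.0.1:6443/readyz")
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusServiceUnavailable {
				break
			}
		}
		time.Sleep(time.Second)
	}
	fmt.Println("maintenance mode active; safe to upgrade etcd")
}
```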
I will join the bi-weekly meeting next Wednesday (March 6th) and will be available to discuss it then.