Adding to what Saad has said, CSI Volume Health is an Alpha feature just introduced in Kubernetes 1.19. It enables CSI drivers to check volume conditions on the storage system and communicate that information back to Kubernetes. As Saad mentioned, initially this is informative only, so abnormal volume conditions will be logged as events on PVCs or Pods. We will be looking at how to make this information available so that corrections can be made programmatically based on volume health. The problem you described looks like a good use case for us to consider.
Here is the Volume Health KEP:
Here’s the repo: https://github.com/kubernetes-csi/external-health-monitor.
Right now volume health just provides information in the form of events. In the future, we want to look into how to take programmatic actions when the volume on the underlying storage system has problems.
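Until something like that exists, anything programmatic has to be built on top of the events themselves. Here is a rough sketch of that idea, not something the feature provides: it takes event objects shaped like the items from `kubectl get events -o json` and picks out the PVCs that have an abnormal-volume event. The reason string `VolumeConditionAbnormal` is my assumption about what the health monitor emits, and `abnormal_pvcs` is a made-up helper name, so please check both against the repo before relying on them.

```python
# Assumption: the event reason reported for abnormal volumes.
# Verify this against the external-health-monitor source before use.
ABNORMAL_REASON = "VolumeConditionAbnormal"


def abnormal_pvcs(events):
    """Map (namespace, pvc_name) to the last seen abnormal-condition message.

    `events` is an iterable of dicts shaped like core/v1 Event objects
    serialized to JSON (fields: involvedObject, reason, message).
    """
    result = {}
    for ev in events:
        obj = ev.get("involvedObject", {})
        if obj.get("kind") != "PersistentVolumeClaim":
            continue  # only PVC events are of interest in this sketch
        if ev.get("reason") != ABNORMAL_REASON:
            continue  # skip unrelated events such as provisioning
        key = (obj.get("namespace"), obj.get("name"))
        result[key] = ev.get("message", "")
    return result
```

What to do with the result (alerting, deleting the pod so it gets rescheduled, and so on) is a policy decision left to whoever runs the cluster; nothing in Kubernetes acts on these events today.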
> Since NodeUnpublish and NodeUnstage are called only in context of pods going down and not for containers, is there a way to instruct k8s to forcefully terminate the pod altogether for storage errors?
As Saad mentioned, this is not there today. This is something we can think about when working on the next phase of the Volume Health feature.
> Another enhancement (if this does not yet exist) could be to work in conjunction with k8s scheduler and decide to fail over the pod to another node where storage is accessible instead of repeatedly trying to start the pod on the same node. A negotiated error from Nodepublish or Nodestage calls would help k8s scheduler decide if it should attempt re-scheduling the pod to another node. Is anything getting discussed around this?
This is something we may want to build on top of the KEP on storage capacity tracking: https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/1472-storage-capacity-tracking

So in addition to capacity, we could also consider volume health or "pool" health when making scheduling decisions.
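To make the idea concrete, here is a purely hypothetical sketch of such a combined check. Nothing like this exists in the scheduler or in the capacity tracking KEP today; `NodeStorage`, `pool_healthy`, and `feasible_nodes` are names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeStorage:
    """Hypothetical per-node view of the storage reachable from that node."""
    node: str
    free_bytes: int      # capacity information, as in capacity tracking
    pool_healthy: bool   # assumed extra health signal, not part of any KEP


def feasible_nodes(candidates, requested_bytes):
    """Keep nodes that have enough capacity AND a healthy storage pool."""
    return [
        c.node
        for c in candidates
        if c.pool_healthy and c.free_bytes >= requested_bytes
    ]
```

With a check like this, a node whose storage is inaccessible would be filtered out before the pod is placed there, instead of the pod repeatedly failing to start on the same node.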
CC: Patrick Ohly