Hello, I am leading the scalability effort for GDCV-BM (go/ekpbm), and we're seeing an issue where CP node(s) become unreachable (and un-sshable as well) when we add 500 nodes to the cluster. Though it is somewhat expected for clusters to be under high stress at that scale, I am investigating why a node would get stuck in an unreachable state. I currently suspect that high disk throughput from etcd caused a kernel panic, but I cannot find any details about high disk throughput of this nature. Any leads would be deeply appreciated. Thanks!