Perma failing and continuously unhealthy jobs (call for action!)


Davanum Srinivas

Apr 18, 2022, 2:58:35 PM
to kubernetes-...@googlegroups.com, d...@kubernetes.io
Folks,

We have been struggling to keep our test grids green, and that has consequences for everything, including releases.

For example, here's a list of jobs that have been RED for a while now (opened yesterday!):
This is not a new problem and sig-testing has been trying to think about this, see:
We also have the reliability KEP that wants to help push folks to do better:
For now, this is a request for action from SIG chairs and leads: look for things under your SIG that you want to clean up one way or another, either by fixing issues or by removing unnecessary CI jobs. Look through these JSON files to start:
Please prioritize this work! We should use the 2 weeks (by which 1.24 was pushed out) to clean up our house. If you need help, please come to sig-testing.

thanks,
Dims

--
Davanum Srinivas :: https://twitter.com/dims

Davanum Srinivas

Jan 5, 2026, 11:32:32 AM
to kubernetes-...@googlegroups.com, d...@kubernetes.io
Folks,

To help you all dig into and prioritize fixing our CI jobs, we now have web pages for each of the metrics we capture here:

Please dig into the Broken/Failing/Flaking jobs for your SIG(s), open issues, and clean things up!

This will help us be healthier, with good CI signals and better quality, and, importantly, more sustainable by conserving CI resources (save 💰 💰 💰 money!)

Wish you all a very Happy New Year 2026!

Thanks,
Dims