Discussion: How do we Improve Kubernetes Reliability

Josh Berkus

Mar 14, 2022, 7:08:39 PM
to d...@kubernetes.io

Per discussion in a proposed KEP[1], the reliability of Kubernetes as
software is not really improving, and might be declining, depending on
how you measure it. We certainly don't have a process to ensure that it
improves over time.

As such, we are starting a discussion with our whole contributor
community around: what would it take to improve Kubernetes reliability?

This discussion will take place here on dev@ because it involves the
entire project, including Release, Architecture, Enhancements, Testing,
and all of the SIGs that write code.

We will also discuss this briefly, live, during this Thursday's
community meeting at 17:00 UTC[2].

Key questions here:

A. How do we measure reliability?
- Tests passing?
- Test coverage?
- Build reliability?
- Something else?

B. What would enable code contributors, reviewers, and leads to spend
more time on tests, troubleshooting, debugging, and reliability review,
even at the expense of features?

C. What other ways can we make sure that reliability improves over time?

Please add your thoughts in this thread, and also in the community
meeting on Thursday.

[1] https://github.com/kubernetes/enhancements/pull/3139


-- Josh Berkus
Kubernetes Community Architect
