[UPDATE] Merges to kubernetes/release repo on hold until further notice


Stephen Augustus

Jul 4, 2019, 1:49:49 PM
to kubernetes-sig-release, kubernetes-sig-testing, Kubernetes developer/contributor discussion, Kubernetes Release Team, Kubernetes Release Managers, Kubernetes Release Managers (Private)
Hi Kubies,


Following an attempt to improve the semantics of the release tooling via shellcheck (https://github.com/kubernetes/release/issues/726), we found that we were unable to stage releases.

Multiple fixes were merged in an attempt to bring us to a usable state.

An unintended and unexpected side effect of these fixes was a cascading failure of multiple release-blocking jobs. A few examples:
- https://github.com/kubernetes/kubernetes/issues/79652
- https://github.com/kubernetes/kubernetes/issues/79668
- https://github.com/kubernetes/kubernetes/issues/79669

Ultimately, it was decided that the right course of action was to revert to a known good state in the repo (https://github.com/kubernetes/release/pull/814) to stop the bleeding.

This implies that, in our current state, it is inadvisable to make any changes to the tooling in this repo.

As such, I'm advising the following course of action (h/t to @nikhita, @liggitt, and @BenTheElder for being a sounding board):
- [ ] (https://github.com/kubernetes/test-infra/pull/13328) Add a blockade for files that have the potential to impact releasing and CI signal
      (this will require repo admins to explicitly approve and override the blockade to merge changes to critical tooling)
- [ ] Examine and document exactly why these release-blocking jobs failed
      (they are using **_something_** in k/release; we need to figure out what those somethings are)
- [ ] Tag the repo after executing a successful release of Kubernetes
      (this locks in a known good state of k/release that doesn't need to be `master`)
- [ ] Refactor release tooling/jobs that depend on tooling to accept pulling a tag of k/release instead of `master`
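
To make the last item above concrete, here is a minimal sketch of how a job could fetch k/release at a pinned ref rather than always tracking `master`. The `RELEASE_TOOL_REF` variable and both function names are assumptions made for illustration; they are not existing k/release or test-infra settings.

```shell
# Hypothetical knob: a job sets RELEASE_TOOL_REF to a known-good tag.
# When it is unset or empty, fall back to master (the current behaviour).
release_tool_ref() {
  printf '%s\n' "${RELEASE_TOOL_REF:-master}"
}

# Shallow-clone k/release at the selected ref into the given directory.
# `git clone --branch` accepts a tag as well as a branch name.
clone_release_tool() {
  local dest="$1"
  git clone --depth 1 --branch "$(release_tool_ref)" \
    https://github.com/kubernetes/release "$dest"
}
```

A job would then call something like `RELEASE_TOOL_REF=<known-good tag> clone_release_tool /tmp/k-release`. Keeping `master` as the fallback means nothing changes for a job until it opts in to a tag.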

At this point, we will have gotten to a place where we can safely make changes to k/release without impacting CI. We will then:

- [ ] Write tests around the specific pieces of the tooling that caused job failure (maybe https://github.com/sstephenson/bats ?)
- [ ] Set up a presubmit job that can emulate one of the existing jobs that broke recently
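
As a rough idea of what tests around the tooling could look like, here is a plain-shell sketch that needs no extra framework. The function under test, `example_version_is_valid`, is a hypothetical stand-in and not real k/release code, and `run_case` is an assumed helper name. With bats, the same checks would be written as `@test` blocks instead.

```shell
# Hypothetical stand-in for a function from the release shell libraries.
# It only gives the test cases below something to exercise.
example_version_is_valid() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

# Run one case: a label, the expected exit status, then the command to run.
# Prints an "ok" / "not ok" line and returns non-zero on a mismatch.
run_case() {
  local label="$1" expected="$2" actual=0
  shift 2
  "$@" || actual=$?
  if [ "$actual" -eq "$expected" ]; then
    echo "ok - $label"
  else
    echo "not ok - $label (expected $expected, got $actual)"
    return 1
  fi
}

run_case "accepts a plain release tag" 0 example_version_is_valid v1.15.0
run_case "rejects a tag without the v prefix" 1 example_version_is_valid 1.15.0
```

A presubmit could run a file of such cases and fail the job on any non-zero exit, which is the signal that was missing when the shellcheck changes merged.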

For longer term goals, we should seek to:

- [ ] Write go tooling (and tests!) to replace the shell libraries (`lib/{common,gitlib,releaselib}`) and call these new tools in the existing release tooling
      (this allows us to get some immediate benefit of a more robust language w/o having to completely refactor)
- [ ] Full refactor of existing tools (shell --> go)

(Some historical references: https://github.com/kubernetes/kubernetes/pull/28922, https://github.com/kubernetes/kubernetes/issues/16529, https://github.com/kubernetes/kubernetes/issues/15560, https://github.com/kubernetes/kubernetes/issues/8686)

Please take this as an initial assessment of the situation and feel free to provide feedback. :)


-- Stephen

On Wed, Jul 3, 2019 at 10:23 AM Stephen Augustus <Ste...@agst.us> wrote:
More details to follow.
Please do not lgtm or approve anything in kubernetes/release until I give the all clear.

-- Stephen