Dustin said:
> From my amateur perspective, it sounds like this is roughly a renaming
> of "nightly" to "shippable", with some further refinement of how
> they're implemented to align better with the term, and separating
> out some of what is truly related to the shipped nightly builds
> (beetmoving, balrog, etc.?). Is that about right?
>
> What do you think of also renaming "dep" or "depend" to something like
> "CI"?
I would agree with that. Nightly builds are essentially becoming shippable
builds, with the actual shipping tasks moving to a promotion graph, which
lets us schedule both at their own cadence.
I would be pretty happy with renaming "dep" or "depend" to "CI".
RyanVM said:
> Overall, this sounds like a good plan to me. I like the idea of being able
> to consolidate PGO/Nightly/Release builds into one primary build type with
> different subsequent operations layered on top of them. Sounds like a nice
> simplification overall and it gets us closer to testing what we actually
> ship. I also do agree with Dustin that we might want to take the
> opportunity to give dep/opt builds a more clarified name as well while
> we're at it.
>
Cool, so far it sounds like people either agree or don't object.
> The only main concern I have with this proposal is that while it's a rare
> situation to happen, we have seen past instances of nightly-only test
> failures (which are about as much fun to diagnose as you'd expect), the
> main reason being features/tests that key off the update channel being set
> to "nightly" instead of "default". Looking at branch_specifics.py, it looks
> like we could still find ourselves hitting a variant of that if we're only
> setting "nightly" on mozilla-central. TBH, I'm not sure what the right
> solution to that is offhand, but I think it's something we should at least
> discuss while fleshing this out.
1. Because the act of building a shippable build isn't the same as shipping
it, we are able to change the update channel of integration branches to
"nightly" without actually affecting Nightly users. Right now I'm
picturing us either not signing those or signing them with the dep/CI key,
and disallowing beetmover and balrog from running on inbound/autoland.
2. We may want to add some additional fast tests to our CI builds on
integration branches. Essentially, if CI is supposed to tell us quickly if
something is wrong with the commit, and it doesn't catch something that we
catch at either Shippable Build time or Nightly/Release Promotion time,
then CI hasn't caught the error. Let's fill in the gaps.
Right now I'm picturing one or a handful of tests, like linters, that
inspect pushes for known problematic changes and flag them by turning
orange once they find something. One could inspect the file list of a
push: changing these sets of files may result in this type of bustage.
Another could inspect the diff of a push: this type of change may result
in this other type of bustage. These may be prone to false positives at
first, but we can iterate and improve them over time. If there's shippable
build bustage on tip-of-inbound, sheriffs can automatically flag the first
orange changeset as a potential problem push.
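
To make the idea a bit more concrete, here is a rough sketch of what such a
push-inspecting test might look like. Everything in it is an assumption for
illustration: the function names, the rule tables, and the example patterns
are placeholders, not a vetted list of risky changes and not an existing
in-tree tool.

```python
import fnmatch

# Placeholder rules (assumptions, not a vetted list):
# glob pattern over changed paths -> type of bustage it may cause.
RISKY_PATH_RULES = {
    "browser/locales/*": "l10n repack bustage",
    "*/mozconfigs/*": "shippable build configuration bustage",
}

# Placeholder rules: substring in a changed line -> type of bustage.
RISKY_DIFF_RULES = {
    "NIGHTLY_BUILD": "behaviour keyed off the nightly channel",
}


def inspect_file_list(changed_files):
    """Return (path, bustage) pairs for changed files matching a risky pattern."""
    warnings = []
    for path in changed_files:
        for pattern, bustage in RISKY_PATH_RULES.items():
            if fnmatch.fnmatch(path, pattern):
                warnings.append((path, bustage))
    return warnings


def inspect_diff(diff_text):
    """Return (line, bustage) pairs for added/removed lines matching a risky rule."""
    warnings = []
    for line in diff_text.splitlines():
        # Skip the "--- a/file" and "+++ b/file" headers of a unified diff.
        if line.startswith(("+++", "---")):
            continue
        # Only look at lines the push actually added or removed.
        if not line.startswith(("+", "-")):
            continue
        for needle, bustage in RISKY_DIFF_RULES.items():
            if needle in line:
                warnings.append((line[1:].strip(), bustage))
    return warnings


def push_is_orange(changed_files, diff_text):
    """Turn orange if either inspection finds a known problematic change."""
    return bool(inspect_file_list(changed_files) or inspect_diff(diff_text))
```

The rule tables are the part we'd iterate on: each time shippable build
bustage slips past CI, we could add or tighten a rule, and loosen the ones
that produce too many false positives.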
Changing periodic PGO builds to periodic shippable builds has a downside:
the latter will take a bit longer. We're going to clobber every time.
We're going to add multilocale, and potentially l10n single-locale repacks
on certain branches. (These can become more sheriffable with in-tree
l10n-bumper l10n-changesets [1], similar to what we did for gaia/b2g, and
similar to what's running for mobile right now.) Backfilling shippable
builds is one option to bisect bustage, but it may become a bit
heavy-handed. Making our CI builds and tests more able to catch errors ahead of
time is important. I think the positives of simplification and testing
what we ship outweigh the downsides.
[1]
https://bugzilla.mozilla.org/show_bug.cgi?id=1345619