Proposal: depend builds vs shippable builds

Aki Sasaki

Mar 25, 2017, 4:57:22 PM

to sher...@mozilla.org, release, release-engineering

(This email is a longer writeup of this bug comment [1].)

Tl;dr: we should implement a new `shippable` build type that prioritizes
correctness, to replace pgo, nightly, and release builds. We can build
these shippable builds as often as makes sense. Promotion to our various
shipping channels (nightly/aurora/beta/release/esr) can be at a separate
cadence.

Initially sending this to release + sheriffs, but please feel free to add
interested parties to the distribution list.

Background
==========

Our builds need to answer two questions: "Is my commit good?" and "Are
these binaries shippable?". These two questions have conflicting
priorities: speed vs. correctness. To prioritize speed, we don't always
clobber the tree; we skip lengthy or complex steps like PGO and
multilocale; and if we gain enough confidence in them, we might consider
using artifact builds in place of standard depend builds. To prioritize
correctness, we clobber, perform PGO, package multilocale, and the like.

Bug 932211 [2] (Nightly builds should be identical to standard builds in
automation) seems to be asking us to abandon depend builds for
consistency. We can do that, but it comes at the price of turnaround
time. Bug 1349227 [3] (Ship 2 Firefox nightlies per day) assumes that the
act of building a shippable build and the act of shipping it need to be
tied together.

Proposal
========

* Let's keep depend builds, and run them on push on all integration and
project branches.
* Let's combine PGO, nightly, and release builds into the new "shippable"
build type.
** Shippable builds should be nightly/release promotable when built off of
release branches.
** Shippable builds should run periodically on integration branches, but be
backfillable.
** Both depend and shippable builds should be choosable on Try.
** Project branch owners can alter their on-push schedules, platforms, and
build types in-tree.
** Release branches should run shippable builds on push, and skip depend
opt builds.
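
As a rough illustration only, the scheduling rules in the list above could be encoded along these lines. This is a minimal Python sketch: `BranchKind`, `builds_for`, and the trigger names are hypothetical and do not correspond to any existing in-tree configuration, and the Try behavior (nothing runs unless requested) is an assumption.

```python
from enum import Enum


class BranchKind(Enum):
    INTEGRATION = "integration"  # e.g. inbound, autoland
    PROJECT = "project"
    RELEASE = "release"          # e.g. mozilla-central and the shipping branches
    TRY = "try"


def builds_for(kind, trigger, requested=None):
    """Return the set of build types to schedule (hypothetical helper).

    trigger is "push" or "periodic"; requested is only consulted on Try.
    """
    if kind is BranchKind.TRY:
        # Both build types are choosable on Try (assumed: none by default).
        return set(requested or ()) & {"depend", "shippable"}
    if kind is BranchKind.RELEASE:
        # Shippable on push; depend opt builds are skipped.
        return {"shippable"} if trigger == "push" else set()
    if kind is BranchKind.INTEGRATION:
        # Depend on push; shippable periodically (backfill not modeled here).
        return {"depend"} if trigger == "push" else {"shippable"}
    # Project branches default to depend on push; owners may override in-tree.
    return {"depend"} if trigger == "push" else set()
```

For example, `builds_for(BranchKind.INTEGRATION, "push")` yields only the depend build, while the same call with `"periodic"` yields only the shippable build.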

We're already getting a good portion of the way there with our taskcluster
nightly work. We can move some existing nightly tasks into a separate
nightly promotion graph, and change shippable build scheduling.

If we come to a consensus here, we'd likely add shippable builds to our
todo queue, and start on them once we knock off at least one of our other
mid- to long-term projects.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=932211#c5
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=932211
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=1349227

Dustin Mitchell

Mar 27, 2017, 8:54:44 AM

to Aki Sasaki, Sheriffs, release, release-engineering

From my amateur perspective, it sounds like this is roughly a renaming
of "nightly" to "shippable", with some further refinement of how
they're implemented to align better with the term, and the separating
out some of what is truly related to the shipped nightly builds
(beetmoving, balrog, etc.?). Is that about right?

What do you think of also renaming "dep" or "depend" to something like "CI"?

Dustin

Ryan VanderMeulen

Mar 27, 2017, 9:56:15 AM

to Aki Sasaki, sheriffs, release, release-engineering

Overall, this sounds like a good plan to me. I like the idea of being able
to consolidate PGO/Nightly/Release builds into one primary build type with
different subsequent operations layered on top of them. Sounds like a nice
simplification overall and it gets us closer to testing what we actually
ship. I also do agree with Dustin that we might want to take the
opportunity to give dep/opt builds a more clarified name as well while
we're at it.

The only main concern I have with this proposal is that while it's a rare
situation to happen, we have seen past instances of nightly-only test
failures (which are about as much fun to diagnose as you'd expect), the
main reason being features/tests that key off the update channel being set
to "nightly" instead of "default". Looking at branch_specifics.py, it looks
like we could still find ourselves hitting a variant of that if we're only
setting "nightly" on mozilla-central. TBH, I'm not sure what the right
solution to that is offhand, but I think it's something we should at least
discuss while fleshing this out.

-Ryan

Aki Sasaki

Mar 27, 2017, 12:35:16 PM

to Ryan VanderMeulen, sheriffs, release, release-engineering

Dustin said:

> From my amateur perspective, it sounds like this is roughly a renaming
> of "nightly" to "shippable", with some further refinement of how
> they're implemented to align better with the term, and the separating
> out some of what is truly related to the shipped nightly builds
> (beetmoving, balrog, etc.?). Is that about right?
>
> What do you think of also renaming "dep" or "depend" to something like
> "CI"?

I would agree with that. Nightly builds are essentially becoming shippable
builds, with the actual shipping tasks moving to a promotion graph, which
lets us schedule both at their own cadence.

I would be pretty happy with renaming "dep" or "depend" to "CI".

RyanVM said:

> Overall, this sounds like a good plan to me. I like the idea of being able
> to consolidate PGO/Nightly/Release builds into one primary build type with
> different subsequent operations layered on top of them. Sounds like a nice
> simplification overall and it gets us closer to testing what we actually
> ship. I also do agree with Dustin that we might want to take the
> opportunity to give dep/opt builds a more clarified name as well while
> we're at it.
>

Cool, so far it sounds like people either agree or don't object.

> The only main concern I have with this proposal is that while it's a rare
> situation to happen, we have seen past instances of nightly-only test
> failures (which are about as much fun to diagnose as you'd expect), the
> main reason being features/tests that key off the update channel being set
> to "nightly" instead of "default". Looking at branch_specifics.py, it looks
> like we could still find ourselves hitting a variant of that if we're only
> setting "nightly" on mozilla-central. TBH, I'm not sure what the right
> solution to that is offhand, but I think it's something we should at least
> discuss while fleshing this out.

1. Because the act of building a shippable build isn't the same as shipping
it, we are able to change the update channel of integration branches to
"nightly" without actually affecting Nightly users. Right now I'm
picturing us either not signing those or signing them with the dep/CI key,
and disallowing beetmover and balrog from running on inbound/autoland.

2. We may want to add some additional fast tests to our CI builds on
integration branches. Essentially, if CI is supposed to tell us quickly if
something is wrong with the commit, and it doesn't catch something that we
catch at either Shippable Build time or Nightly/Release Promotion time,
then CI hasn't caught the error. Let's fill in the gaps.

Right now I'm picturing one or a handful of tests, like linters, that
inspect pushes for known problematic changes and flag them by turning
orange once they find something. Inspecting the file list of a push:
changing these sets of files may result in this type of bustage.
Inspecting the diff of a push: this type of change may result in this other
type of bustage. These may be false-error-prone at first, but we can
iterate and improve them over time. If there's shippable build bustage on
tip-of-inbound, sheriffs can automatically flag the first orange changeset
as a potential problem push.
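
A minimal sketch of the file-list half of that idea, in Python. Everything here is hypothetical: the glob patterns, the bustage labels, and the `flag_push` name are placeholders for illustration, not a list of paths known to cause problems.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: glob pattern -> kind of shippable-build bustage it may cause.
RISKY_PATTERNS = {
    "*/locales/*": "multilocale / l10n repack bustage",
    "build/pgo/*": "PGO bustage",
    "*/installer/*": "packaging bustage",
}


def flag_push(changed_files):
    """Return (path, risk) pairs for files in a push that match a risky pattern."""
    warnings = []
    for path in changed_files:
        for pattern, risk in RISKY_PATTERNS.items():
            # fnmatchcase's "*" also matches "/", so patterns span directories.
            if fnmatchcase(path, pattern):
                warnings.append((path, risk))
    return warnings
```

A non-empty result would turn the check orange for that push. It will produce false positives at first, so the pattern table is the part to iterate on.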

Changing periodic PGO builds to periodic shippable builds has a downside:
the latter will take a bit longer. We're going to clobber every time.
We're going to add multilocale, and potentially l10n single locale repacks
on certain branches. (These can become more sheriffable with in-tree
l10n-bumper l10n-changesets [1], similar to what we did for gaia/b2g, and
similar to what's running for mobile right now.) Backfilling shippable
builds is one option to bisect bustage, but it may become a bit
heavy-handed. Making our CI builds and tests more able to catch errors ahead of
time is important. I think the positives of simplification and testing
what we ship outweigh the downsides.
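
To make point 1 earlier in this message concrete, here is a small Python sketch of decoupling the update channel from promotion. It only models integration branches versus mozilla-central. The setting names, the `shippable_settings` function, and the "production" key label are assumptions for illustration, and it picks the "sign with the dep/CI key" option even though leaving those builds unsigned is still on the table.

```python
# Integration branches named in the thread; the set itself is illustrative.
INTEGRATION_BRANCHES = {"mozilla-inbound", "autoland"}


def shippable_settings(branch):
    """Return hypothetical settings for a shippable build on the given branch."""
    on_integration = branch in INTEGRATION_BRANCHES
    return {
        # The same update channel everywhere, so channel-keyed tests behave alike.
        "update_channel": "nightly",
        # Integration builds never get a key that Nightly users would trust.
        "signing": "dep" if on_integration else "production",
        # Promotion tasks (beetmover, balrog) are disallowed on integration branches.
        "promotion_allowed": not on_integration,
    }
```

With this shape, an autoland shippable build exercises the "nightly" channel code paths but can never reach Nightly users.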

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1345619

Chris AtLee

Mar 27, 2017, 12:44:07 PM

to Aki Sasaki, sheriffs, Ryan VanderMeulen, release, release-engineering

One thing to call out in particular here is that mozilla-central is a
"release" branch. So we would be changing all builds done off of m-c pushes to
be "shippable" builds: clobber builds with PGO, nightly update channel, etc.

Justin Wood

Mar 27, 2017, 12:51:03 PM

to Chris AtLee, sheriffs, Ryan VanderMeulen, release-engineering, release, Aki Sasaki

This is sounding more like scope creeping to become "release
promotion" for nightlies, and while that's a great idea, I'm not sure the
benefits of going that far outweigh the priority of general
tc-migration of osx/windows.

I'd be happy initially unifying pgo+nightly process as a shippable
build, without also enabling "single locale" l10n on each shippable
build (unless requested as part of a publishing nightly graph) and
without also enabling the whole branch to be always-shippable.

I can see that following shortly thereafter, but I think the work
required to do this gets exponentially easier if we're not having to
think about buildbot in the process.

~Justin Wood (Callek)

Aki Sasaki

Mar 27, 2017, 12:52:11 PM

to Justin Wood, sheriffs, Chris AtLee, Ryan VanderMeulen, release, release-engineering

Correct, I'm assuming TC nightly work is done first.

Rail Aliiev

Mar 27, 2017, 12:59:35 PM

to Chris AtLee, sheriffs, Ryan VanderMeulen, release-engineering, release, Aki Sasaki

I fully support this change. We'll have much clearer separation of *what*
we build (CI/shippable) and *when* or *how* we schedule those.

Nightly (mozilla-central branded Firefox) vs nightly (Aurora nightly,
project branch nightly, Nightly nightly?) is a good example of a mix of
concepts.

Unifying "nightly" and "release" build types under a single umbrella
(shippable, user builds, etc) sounds like a great idea.

Not sure if I agree with the "CI" name for other build types. In a way
"shippable" builds are CI builds as well...

Hal Wine

Mar 27, 2017, 2:15:01 PM

to Rail Aliiev, sheriffs, release, Ryan VanderMeulen, Chris AtLee, release-engineering, Aki Sasaki

On Mon, Mar 27, 2017 at 9:58 AM, Rail Aliiev <ra...@mozilla.com> wrote:

> Not sure if I agree with the "CI" name for other build types. In a way
> "shippable" builds are CI builds as well...
>

Depends which audience is using the term. I'd go with using the more
popular term ("CI") for the larger audience (devs), and let the awareness
that "shippable builds are a kind of CI" be internal lore.

Dustin Mitchell

Mar 27, 2017, 2:18:13 PM

to Hal Wine, sheriffs, release, Ryan VanderMeulen, Rail Aliiev, Chris AtLee, release-engineering, Aki Sasaki

Maybe the "dep" -> "CI" rename is a separate issue. I've just never
really understood where "dep" and "depend" come from, but maybe everyone
already knows that those words mean builds that happen on every push.

Dustin

Ryan VanderMeulen

Mar 27, 2017, 2:19:58 PM

to Dustin Mitchell, sheriffs, release, Rail Aliiev, Chris AtLee, Hal Wine, release-engineering, Aki Sasaki

Yeah, I don't want to rat-hole on it too much. Just figured that if we're
going to be renaming things anyway, better to do it all in one shot and be
done with it.

Chris AtLee

Mar 27, 2017, 2:21:04 PM

to Dustin Mitchell, sheriffs, release, Ryan VanderMeulen, Rail Aliiev, Hal Wine, release-engineering, Aki Sasaki

I think "dep" / "depend" comes from doing an incremental build that depends
on the previous build. i.e. a non-clobber build.

This crops up in other uses, where we sometimes group builds into "dep and
try" builds vs "nightly and release" as an indication of trust.

Ben Hearsum

Mar 27, 2017, 2:39:58 PM

to Chris AtLee, sheriffs, release, Ryan VanderMeulen, Rail Aliiev, Hal Wine, Dustin Mitchell, release-engineering, Aki Sasaki

Yep. This dates back to Tinderbox, where one machine continually built the same build configuration on the same branch over and over and over. Kill it with fire.

Jordan Lund

Mar 27, 2017, 3:41:48 PM

to Ben Hearsum, sheriffs, release, Ryan VanderMeulen, Rail Aliiev, Chris AtLee, Hal Wine, Dustin Mitchell, release-engineering, Aki Sasaki

++ to this change. I love the idea of having a 'shippable' build type that
is consistent and uniform across all our repos/channels. This part explains
it really well for me: "Promotion to our various shipping channels
(nightly/aurora/beta/release/esr) can be at a separate cadence"

Aki Sasaki

Mar 30, 2017, 12:37:58 PM

to Jordan Lund, sheriffs, Ben Hearsum, release, Ryan VanderMeulen, Rail Aliiev, Chris AtLee, Hal Wine, Dustin Mitchell, release-engineering

I think we've come to a consensus that shippable builds are a good idea.
We're mostly there on the depend->CI discussion.

I filed a bug [1] and added shippable builds to our future releng list
[2]... I'd love for us to start work on these as soon as we're shipping
taskcluster nightlies. Waiting until OSX and Windows TC nightlies are
shipping makes sense and may simplify things, but we might be able to start
on linux and android sooner, if anyone's time frees up and they're
interested. The shippable promotion graph overlaps significantly with
porting Buildbot Release Promotion tasks to taskcluster, which we want to
do anyway.

Thanks everyone!

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1352113
[2] https://github.com/mozilla/build-relengdocs/blob/master/future/index.rst#shippable-builds