Using the same revision for nightly builds

Chris AtLee

Jun 24, 2010, 3:11:29 PM

to dev-tree-...@lists.mozilla.org

Now that we've got all our regular desktop opt and debug builds for all
our branches running on the shiny new database-enabled buildbot, it has
become easier to do things like get all our nightly builds building
against the same revision. Currently the nightly builds just build the
tip of the default branch whenever they happen to run.

I've got a patch in bug 570814 [1] that will calculate the revision for
a nightly build by looking at the past 10,000 opt and debug builds and
picking one that is green or orange across all platforms. If it can't
find a green or orange run, it will use the latest change it knows about
on the given branch. And if that fails, it will fall back to using the
tip of the default branch.
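
A rough Python sketch of the selection logic described above. This is not the code from the patch in bug 570814: the tuple layout, the result labels, and the function name are assumptions made for illustration, as is the rule that every opt and debug build of a revision on a platform must be green or orange for that platform to count.

```python
from collections import defaultdict

GOOD = {"green", "orange"}  # assumed result labels; anything else counts as burning


def pick_nightly_revision(builds, platforms, latest_change=None, limit=10000):
    """Pick a revision for the nightly.

    builds: iterable of (revision, platform, result) tuples, newest first
            (hypothetical shape; opt and debug builds both appear here).
    Returns the newest revision that is green or orange on every platform,
    else the latest known change, else the string "tip".
    """
    status = defaultdict(dict)  # revision -> {platform: all builds good so far}
    order = []                  # revisions in newest-first order
    for i, (rev, platform, result) in enumerate(builds):
        if i >= limit:          # only look at the most recent `limit` builds
            break
        if rev not in status:
            order.append(rev)
        good = result in GOOD
        status[rev][platform] = status[rev].get(platform, True) and good
    for rev in order:
        # A platform with no recorded build does not count as good.
        if all(status[rev].get(p, False) for p in platforms):
            return rev
    return latest_change or "tip"
```

Calling it with an empty build list and no `latest_change` falls all the way through to "tip", which matches the last fallback in the description.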

Does this algorithm sound ok? The only downside I can see is that if we
haven't had a "good" build on all platforms for a given revision, the
next nightly could be of a revision more than 24 hours in the past.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=570814

Ted Mielczarek

Jun 24, 2010, 3:25:55 PM

to dev-tree-...@lists.mozilla.org

On Thu, Jun 24, 2010 at 3:11 PM, Chris AtLee <cat...@mozilla.com> wrote:
> I've got a patch in bug 570814 [1] that will calculate the revision for a
> nightly build by looking at the past 10,000 opt and debug builds and picking
> one that is green or orange across all platforms. If it can't find a green
> or orange run, it will use the latest change it knows about on the given
> branch. And if that fails, it will fall back to using the tip of the
> default branch.

That seems like a long way to go back. I think this is likely to
confuse people more than anything. I'd prefer that it just pick the
most recent revision, unless maybe it's actually burning, because not
getting nightlies sucks too. Having an all-green nightly would be
excellent, but given our amount of intermittent orange right now, that
means that we're going to get nightlies built from changesets landed
at weird hours of the day.

-Ted

Boris Zbarsky

Jun 24, 2010, 4:00:19 PM

On 6/24/10 3:11 PM, Chris AtLee wrote:
> Does this algorithm sound ok? The only downside I can see is that if we
> haven't had a "good" build on all platforms for a given revision, the
> next nightly could be of a revision more than 24 hours in the past.

That seems undesirable for using nightlies for bisection...

Can we perhaps make a distinction between the builds we put on the ftp
site for bisection and the builds we push out via the update channel and
try to make the _latter_ all-green? Assuming the goal is to prevent
breakage for nightly users, that is.

-Boris

Mike Shaver

Jun 24, 2010, 4:18:32 PM

to Chris AtLee, dev-tree-...@lists.mozilla.org

On Thu, Jun 24, 2010 at 12:11 PM, Chris AtLee <cat...@mozilla.com> wrote:
> Now that we've got all our regular desktop opt and debug builds for all our
> branches running on the shiny new database-enabled buildbot, it's made it
> easier to do things like get all our nightly builds building against the
> same revision. Currently the nightly builds just build the tip of the
> default branch whenever they happen to run.

You mean that for a given repository you want the same changeset built
for every platform? I think that is virtuous, but I don't think we
have had a problem with the "whatever's latest" strategy producing a
problematic number of broken builds, so I would advocate simplicity:
when the first nightly is triggered, it gets the latest on the default
branch of the repository, and then that changeset is reused for all
other platforms.
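
A minimal Python sketch of this simpler strategy, assuming a hypothetical `get_tip` callable that returns the current tip of the default branch. The point is only that the lookup happens once per nightly run and the result is shared.

```python
def schedule_nightlies(platforms, get_tip):
    """Resolve the tip once, then hand the same changeset to every platform.

    get_tip is a hypothetical callable returning the latest changeset on the
    default branch; it is called a single time, when nightlies are triggered.
    """
    revision = get_tip()
    return {platform: revision for platform in platforms}
```

Even if new changesets land while the per-platform builds are being queued, every platform still builds the revision captured at trigger time.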

Mike

Robert Kaiser

Jun 25, 2010, 12:16:31 PM

Chris AtLee wrote:

> I've got a patch in bug 570814 [1] that will calculate the revision for
> a nightly build by looking at the past 10,000 opt and debug builds and
> picking one that is green or orange across all platforms

I'm not sold on the idea as a whole yet, but I think an arbitrary number
to search back is always wrong - if we go down that lane, we should
always only search at most 24 hours back, I think.

Robert Kaiser

--
Note that any statements of mine - no matter how passionate - are never
meant to be offensive but very often as food for thought or possible
arguments that we as a community need answers to. And most of the time,
I even appreciate irony and fun! :)

Ted Mielczarek

Jun 25, 2010, 12:25:14 PM

to Mike Shaver, Chris AtLee, dev-tree-...@lists.mozilla.org

+1

-Ted

Chris AtLee

Jun 25, 2010, 1:12:52 PM

to Ted Mielczarek, dev-tree-...@lists.mozilla.org, Mike Shaver

Bah, y'all are ruining all my fun!

But yes, this is much simpler. So the behaviour will be to use the
latest revision at the time we trigger nightlies, so they'll all build
the same revision.

Ben Hearsum

Jul 5, 2010, 4:39:12 PM

to Chris AtLee, Ted Mielczarek, dev-tree-...@lists.mozilla.org, Mike Shaver

> But yes, this is much simpler. So the behaviour will be to use the
> latest revision at the time we trigger nightlies, so they'll all build
> the same revision.

We'll probably hit revisions at some point that compile on one platform
but not another, which is something that was going to be addressed with
this change originally. Is that something people care about?

John O'Duinn

Jul 7, 2010, 5:10:39 PM

to dev-tree-...@lists.mozilla.org

Hi;

I think the objective here is to raise the standard of nightly builds.
For a given night, on a given branch, we're proposing two mechanical
changes:

1) use the same source code revision for each OS

2) use a "good" source code revision. To start with, the definition of
"good" is that it does successfully compile-and-link on all OS. If the
latest source code revision did not compile-and-link, we'll use the
previous most recent code revision that *does* compile-and-link.

(Initially, we're defining "good" to be "compile-and-link" on all OS.
However, over time, we're planning to raise the standard to
"compile-and-link-and-pass-some-testsuites", and eventually
"compile-and-link-and-pass-all-testsuites". However, that will obviously
need unittests to be more stable first!)

If we ever find there are no "good" changesets since the previous
nightly, we can decide if we should produce a nightly of the same "good"
source code as the previous night or if we should just wait for a "good"
revision. Right now, I'm inclined to not generate another nightly of the
same changeset, as that feels like a waste of time/resources, but open to
suggestions if I'm missing something. Regardless, it doesn't feel useful
to attempt a nightly that we know will fail to compile-and-link.
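
The two rules above, plus the "skip the night" option, can be sketched in Python roughly as follows. The `build_ok` callback and the argument names are hypothetical; "good" here means only compile-and-link on every OS, as in the initial definition.

```python
def choose_revision(changesets, build_ok, platforms, last_nightly_rev):
    """Return the revision to build tonight, or None to skip the nightly.

    changesets: revisions on the branch, newest first.
    build_ok(rev, platform) -> bool: hypothetical lookup of whether that
        revision compiled and linked on that platform.
    last_nightly_rev: the revision used for the previous nightly.
    """
    for rev in changesets:
        if rev == last_nightly_rev:
            # Nothing good has landed since the previous nightly: skip
            # rather than rebuild the same source.
            return None
        if all(build_ok(rev, p) for p in platforms):
            return rev  # same "good" revision for every OS
    return None
```

Raising the bar later to "passes some testsuites" would only mean swapping in a stricter `build_ok`.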

This is our first time changing how we decide what revision to use for
nightly builds. If you have concerns, or see something that we might be
missing, please let us know before we go ahead with this.

tc
John.

Justin Dolske

Jul 8, 2010, 1:02:22 AM

On 7/7/10 2:10 PM, John O'Duinn wrote:

> For a given night, on a given branch, we're proposing two mechanical
> changes:
>
> 1) use the same source code revision for each OS

This seems fine and sensible to have, but is there any technical
motivation for doing this? Nightlies don't seem to have been a problem
so far as I've heard. (Cleanup as a side project is cool too, so don't
read this as stop energy!)

> 2) use a "good" source code revision. To start with, the definition of
> "good" is that it does successfully compile-and-link on all OS. If the
> latest source code revision did not compile-and-link, we'll use the
> previous most recent code revision that *does* compile-and-link.

This sounds reasonable, assuming there's a < 24 hour window so that we
never risk rebuilding the same changeset as the last nightly (or, worse,
somehow building an older changeset!)

Or maybe just a 6/12 hour window, to eliminate the confusion of a
"nightly" not having most of a day's checkins. Probably an edge case, as
I'd suspect that 99 percent of the time this skips at most one or two
of the most recent changesets. And then you'd have to deal with, say,
weekends where 1 change is checked in just after the last nightly, and
nothing else lands all day.

> If we ever find there are no "good" changesets since the previous
> nightly, we can decide if we should produce a nightly of the same "good"
> source code as the previous night or if we should just wait for a "good"
> revision. Right now, I'm inclined to not generate another nightly of the
> same changeset, as that feels like a waste of time/resources, but open to
> suggestions if I'm missing something. Regardless, it doesn't feel useful
> to attempt a nightly that we know will fail to compile-and-link.

I'd go with KISS initially...

1) Grab last 23:59 hours of changesets
2) Look for latest changeset that passed Bo build for Mac/Lin/Win.
3) Build that

Bonus points for writing this as a tool that can be run against
historical data, so that if we add more conditions (like passing unit
tests), we can see what the effect would have been.
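
A sketch of those three steps as a small Python tool that can also be replayed over historical data. The record layout, the platform keys, and all function names are assumptions for illustration; the "good" test is passed in as a function so that stricter conditions can be tried against the same history.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=23, minutes=59)
PLATFORMS = ("mac", "linux", "win")  # assumed platform keys


def built_everywhere(results):
    """Initial definition of good: the build passed on Mac, Linux and Windows."""
    return all(results.get(p, False) for p in PLATFORMS)


def pick(changesets, now, is_good):
    """changesets: (revision, pushed_at, {platform: passed}) tuples, newest first.

    Returns the latest good revision pushed within WINDOW of `now`, else None.
    """
    for rev, pushed_at, results in changesets:
        if now - pushed_at > WINDOW:
            break  # everything after this is older still
        if is_good(results):
            return rev
    return None


def replay(history, nightly_times, is_good):
    """Run the picker at each past trigger time to see what it would have built."""
    picks = {}
    for now in nightly_times:
        visible = [c for c in history if c[1] <= now]  # hide future pushes
        picks[now] = pick(visible, now, is_good)
    return picks
```

Replaying with a different `is_good` (for example, one that also requires unit tests to pass) shows how often the nightly would have been skipped or pushed back under the stricter rule.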

Justin

Phil Ringnalda

Jul 8, 2010, 1:58:46 AM

On 7/7/10 10:02 PM, Justin Dolske wrote:
> On 7/7/10 2:10 PM, John O'Duinn wrote:
>
>> For a given night, on a given branch, we're proposing two mechanical
>> changes:
>>
>> 1) use the same source code revision for each OS
>
> This seems fine and sensible to have, but is there any technical
> motivation for doing this? Nightlies don't seem to have been a problem
> so far as I've heard. (Cleanup as a side project is cool too, so don't
> read this as stop energy!)

It was more of a problem in the olden days, before there were so many
people testing with every single tinderbox build, and before anyone who
could build could painlessly build from any changeset. In the CVS days,
it wasn't terribly unusual to have regression bugs with a day of arguing
about how "I don't see this on OS X" "there's no reason why this would
be OS-specific, though" before the next nightly would come out with
everyone broken.

Still probably useful for good things, though, so in the rare case when
someone lands a new feature at 01:59:59 or 02:00:05, we don't have some
OSes waiting a day to see it.

Chris AtLee

Jul 19, 2010, 10:11:37 AM

to dev-tree-...@lists.mozilla.org

On 08/07/10 01:02 AM, Justin Dolske wrote:
>
> I'd go with KISS initially...
>
> 1) Grab last 23:59 hours of changesets
> 2) Look for latest changeset that passed Bo build for Mac/Lin/Win.
> 3) Build that
>
> Bonus points for writing this as a tool that can be run against
> historical data, so that if we add more conditions (like passing unit
> tests), we can see what the effect would have been.
>
> Justin

So one effect of this would be that for branches with low activity, we
wouldn't get nightlies if there were no checkins / green builds in the
past 24 hours.

Is it ok to have no 3.5.x or 3.6.x nightlies if nothing has changed in
the code?

Mike Beltzner

Jul 19, 2010, 10:53:03 AM

to Chris AtLee, dev-tree-...@lists.mozilla.org

On 2010-07-19, at 10:11 AM, Chris AtLee wrote:

> Is it ok to have no 3.5.x or 3.6.x nightlies if nothing has changed in the code?

It's OK to have no nightlies, but it's not OK to have no Talos data for that time period. If you can run Talos against the latest available nightly, that'd be fine.

cheers,
mike

John O'Duinn

Jul 19, 2010, 1:00:44 PM

to dev-tree-...@lists.mozilla.org

On 7/19/10 7:53 AM, Mike Beltzner wrote:
> On 2010-07-19, at 10:11 AM, Chris AtLee wrote:
>
>> Is it ok to have no 3.5.x or 3.6.x nightlies if nothing has changed in the code?
>
> It's OK to have no nightlies

Cool, rebuilding the same source code feels... sub-optimal!

> but it's not OK to have no Talos data for that time period. If you can run Talos against the latest available nightly, that'd be fine.

Not sure I follow. If we have no source change, and hence no new
nightly, you would like us to take the most-recent-nightly (which
already ran through Talos and posted results), and run it through Talos
again?

We can do that if you want, but I'm curious - what exactly is the objective?

> cheers,
> mike
tc
John.

Shawn Wilsher

Jul 19, 2010, 1:50:25 PM

to dev-tree-...@lists.mozilla.org

On 7/19/2010 10:00 AM, John O'Duinn wrote:
> We can do that if you want, but I'm curious - what exactly is the objective?

More data points, which makes it easier to spot noise and regressions.

Cheers,

Shawn

Mike Beltzner

Jul 19, 2010, 2:10:54 PM

to jod...@mozilla.com, dev-tree-...@lists.mozilla.org

On 2010-07-19, at 1:00 PM, John O'Duinn wrote:

> Not sure I follow. If we have no source change, and hence no new
> nightly, you would like us to take the most-recent-nightly (which
> already ran through Talos and posted results), and run it through Talos
> again?
>
> We can do that if you want, but I'm curious - what exactly is the objective?

As Shawn said, it's about making sure we have the data points. I believe that the graph just stops drawing if there are no new data points in time, but beyond that, there's a certain amount of jitter and noise in Talos results that we need to be able to see as "regular" vs. "due to a change in the code."

Out of curiosity: what would happen if Talos or test runs failed? Would a checkin be required to get the next run? We have some tests that fail intermittently, and without re-runs on tests, we wouldn't be able to see that, either.

cheers,
mike

John O'Duinn

Jul 19, 2010, 5:23:58 PM

to Mike Beltzner, dev-tree-...@lists.mozilla.org

On 7/19/10 11:10 AM, Mike Beltzner wrote:
> On 2010-07-19, at 1:00 PM, John O'Duinn wrote:
>
>> Not sure I follow. If we have no source change, and hence no new
>> nightly, you would like us to take the most-recent-nightly (which
>> already ran through Talos and posted results), and run it through Talos
>> again?
>>
>> We can do that if you want, but I'm curious - what exactly is the objective?
>
> As Shawn said, it's about making sure we have the data points.

That's what I'm trying to understand. See comments below.

> I believe that the graph just stops drawing if there are no new data points in time,

Incorrect. If no new data was posted to graphserver, it continues the
line from the last known datapoint forward. (This was originally done to
allow comparing FF3.0 results with FF2.0 results as we stopped producing
FF2.0 builds.)

> but beyond that, there's a certain amount of jitter and noise in Talos results that we need to be able to see as "regular" vs. "due to a change in the code."

If noise/variance in test results is the problem, it seems best to
re-run the same test 'n' times in a row on the *same* build - not
generate new builds, with new test results. Rerunning a test is
something we can trigger if/when requested. It's also part of our "help
track down intermittent orange unittest failures" work with QA and Devs,
and feels slightly orthogonal to the topic at hand.
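
To illustrate why repeated runs on the same build help, here is a small Python sketch that estimates a noise band from 'n' reruns and checks a new result against it. The numbers, the two-standard-deviation threshold, and the assumption that higher values are worse (for example, a load time in ms) are illustrative only and are not how Talos or graphserver evaluate results.

```python
from statistics import mean, stdev


def noise_band(samples, k=2.0):
    """Return (low, high) bounds of mean +/- k standard deviations.

    samples: results from rerunning one test on the *same* build.
    k: hypothetical threshold; 2.0 is an arbitrary illustrative choice.
    """
    m = mean(samples)
    s = stdev(samples)  # sample standard deviation; needs at least 2 samples
    return (m - k * s, m + k * s)


def looks_like_regression(baseline_samples, new_value, k=2.0):
    """True if new_value is above the noise band of the baseline reruns."""
    low, high = noise_band(baseline_samples, k)
    return new_value > high
```

With only a single data point per build there is no band to compare against, which is the case for wanting the reruns.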

> Out of curiosity: what would happen if Talos or test runs failed? Would a checkin be required to get the next run? We have some tests that fail intermittently, and without re-runs on tests, we wouldn't be able to see that, either.

A checkin is one way, but that generates new builds and then runs *all*
tests. The more efficient (and faster!) thing to do is ping buildduty
(or file a RelEng bug) to have that specific failing test suite be
re-run on the pre-existing build.

> cheers,
> mike

hth

tc
John.

Mike Beltzner

Jul 19, 2010, 5:25:41 PM

to jod...@mozilla.com, dev-tree-...@lists.mozilla.org

On 2010-07-19, at 5:23 PM, John O'Duinn wrote:

> If noise/variance in test results is the problem, it seems best to
> re-run the same test 'n' times in a row on the *same* build - not

Uhm, that's what I asked for, yes. Take the "most recent nightly" and rerun the Talos set on that. Just do it at least, oh I don't know, let's say twice a day.

> A checkin is one way, but that generates new builds and then runs *all*
> tests. The more efficient (and faster!) thing to do is ping buildduty
> (or file a RelEng bug) to have that specific failing test suite be
> re-run on the pre-existing build.

No, the more efficient way is to have it happen automatically. Pinging releng doesn't scale, though I suppose I could set up my own cron job to send that email! :)

cheers,
mike

Mike Beltzner

Jul 19, 2010, 5:28:01 PM

to Mike Beltzner, dev-tree-...@lists.mozilla.org, jod...@mozilla.com

On 2010-07-19, at 5:25 PM, Mike Beltzner wrote:

> Uhm, that's what I asked for, yes. Take the "most recent nightly" and rerun the Talos set on that. Just do it at least, oh I don't know, let's say twice a day.

Clarifying further, since I realize I'm using in-my-head terminology :)

I'm considering builds to only be generated when there's new code, as per your proposal.

I'm proposing that Talos and testing runs happen at least twice a day, on whatever is the most recently compiled version for that tree. This gives us data about those builds on a regular pulse. You're right, we should expect that data to be stagnant. If it's not, that tells us interesting (and valuable!) things about the tests and performance metrics packages.
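
A minimal Python sketch of that "regular pulse" rule, decoupled from whether a new build exists. The function name, its arguments, and the 12-hour interval (standing in for "at least twice a day") are assumptions for illustration, not an existing buildbot API.

```python
from datetime import datetime, timedelta


def build_to_test(latest_build, last_tested_build, last_run_at, now,
                  interval=timedelta(hours=12)):
    """Return the build Talos should run against now, or None if nothing is due.

    latest_build: identifier of the most recently compiled build for the tree.
    last_tested_build / last_run_at: what the previous Talos run used, and when.
    """
    if latest_build is None:
        return None                      # nothing has ever been built
    if latest_build != last_tested_build:
        return latest_build              # new code: test it right away
    if last_run_at is None or now - last_run_at >= interval:
        return latest_build              # no new code: rerun on the same build
    return None
```

On a quiet branch this keeps producing data points for an unchanged build, so any movement in the numbers points at the tests or the machines rather than the code.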

cheers,
mike