Changing what TryServer jobs are run by default

John O'Duinn

Jul 19, 2010, 12:36:01 PM

to dev-tree-management, dev. planning

(If you do not use TryServer, you can hit delete now!)

hi;

The new TryServer-as-a-branch runs everything that mozilla-central does:
all the builds, unittests and talos suites, on every OS. All told,
this comes to over 65 hours of compute time per push.

When rolling this new TryServer into production, we set the default to
run everything for every patch sent to TryServer. This gave us a chance
to make sure everything worked properly; more importantly, this gave us
time for people to find out how much of the new functionality they
actually used most frequently.

Based on feedback we've received so far, we're proposing changing
TryServer to *not* run Talos *by default*. Of course, we still support
running Talos on TryServer; the only difference is that now, if you want
Talos results for your TryServer patch, you'll need to ping the RelEng
buildduty person to ask them to trigger your Talos jobs for you. If you
have any concerns with this change-of-default, or think we're missing
something, please comment in
https://bugzilla.mozilla.org/show_bug.cgi?id=579573.

Obviously, this is just a short-term interim step until we have a
self-service interface for tryserver in production. At that point, users
will have much more granular control over their own jobs: they will be
able to select which specific suites to run, and to re-run those
suites on the same build if needed. That work is being tracked in
bug#473184 and bug#520226.

Thanks
John.

PS: Already mentioned elsewhere, but worth repeating: if you want
TryServer to skip builds and unittests for a given OS for your own
patch, you can use a custom mozconfig. Simply include this mozconfig as
part of your patch. Details in:
https://wiki.mozilla.org/Build:TryServerAsBranch#Disabling_specific_platforms_for_try_push

Mike Beltzner

Jul 19, 2010, 2:08:01 PM

to jod...@mozilla.com, dev. planning, dev-tree-management

On 2010-07-19, at 12:36 PM, John O'Duinn wrote:

> Based on feedback we've received so far, we're proposing changing
> TryServer to *not* run Talos *by default*. Of course, we still support
> running Talos on TryServer; the only difference is that now, if you want
> Talos results for your TryServer patch, you'll need to ping the RelEng
> buildduty person to ask them to trigger your Talos jobs for you. If you

Commented in the bug - this change seems virtuous, since most people are
looking for compilation and test runs. But chasing down the buildduty
sheriff is difficult, and it is a worse workflow than being able to
check a box on the tryserver page and have it happen automagically.

cheers,
mike

Nicholas Nethercote

Jul 20, 2010, 12:20:50 AM

to Mike Beltzner, dev. planning, dev-tree-management, jod...@mozilla.com

Indeed. Which is why John's next paragraph said the following:

> Obviously, this is just an short term interim step until we have a
> self-service interface for tryserver in production. At that point, users
> will have much more granular control over their own jobs

Nick

ps: This is good stuff, RelEng folks, thanks for all your hard work!

Mike Beltzner

Jul 20, 2010, 1:13:11 AM

to Nicholas Nethercote, dev. planning, dev-tree-management, jod...@mozilla.com

>> Commented in the bug - this change seems virtuous, since most
>> people are looking for compilation and test runs, but chasing down
>> the buildduty sheriff is difficult and not a good workflow as
>> opposed to being able to check a box on the tryserver page and have
>> it happen automagically.
>
> Indeed. Which is why John's next paragraph said the following:
>
>> Obviously, this is just an short term interim step until we have a
>> self-service interface for tryserver in production. At that point,
>> users will have much more granular control over their own jobs

I did not take that paragraph to mean the same thing, but maybe that's
because I define "self-service interface for tryserver" to be something
more than an additional checkbox. I haven't seen a good sketch of the
full plans for this eventual UI.

Even so, I don't think we can make this change without making it very
easy to say "I would like Talos runs with this build." I don't care if
the backend is that an email gets sent to all of releng and they handle
it; I care about making it simple to get performance data for tryserver
builds.

cheers,
mike

Bobby Holley

Jul 20, 2010, 1:28:53 AM

to Nicholas Nethercote, dev. planning, jod...@mozilla.com, dev-tree-management

I'd imagine that the full self-service web interface will probably take some
work. What if, in the interim, we were to do talos runs if the commit
message of the new head revision contained the string "TALOS"? This would
make it dead simple to trigger perf runs with push-to-try.
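
A minimal sketch of that check, assuming the scheduler already has the commit message of the new head revision as a string. The function name `wants_talos` and the bug number in the sample messages are made up for illustration; nothing here is RelEng's actual code.

```python
def wants_talos(commit_message: str, keyword: str = "TALOS") -> bool:
    """Return True if the head revision's commit message asks for Talos runs."""
    # Plain case-sensitive substring test, matching the proposal as worded.
    return keyword in commit_message


# Hypothetical commit messages, for illustration only.
print(wants_talos("Bug 123456 - speed up frame construction. TALOS"))
print(wants_talos("Bug 123456 - fix typo in comment"))
```

A case-sensitive match keeps ordinary prose such as "no talos impact expected" from triggering a run by accident.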

Keep up the awesome work. Armen can attest to my being Mozilla RelEng's
biggest fan. <3

-bholley

On Tue, Jul 20, 2010 at 12:20 AM, Nicholas Nethercote <
n.neth...@gmail.com> wrote:

> On Tue, Jul 20, 2010 at 4:08 AM, Mike Beltzner <belt...@mozilla.com>
> wrote:
> > On 2010-07-19, at 12:36 PM, John O'Duinn wrote:
> >
> >> Based on feedback we've received so far, we're proposing changing
> >> TryServer to *not* run Talos *by default*. Of course, we still support
> >> running Talos on TryServer; the only difference is that now, if you want
> >> Talos results for your TryServer patch, you'll need to ping the RelEng
> >> buildduty person to ask them to trigger your Talos jobs for you. If you
> >

> > Commented in the bug - this change seems virtuous, since most people are
> looking for compilation and test runs, but chasing down the buildduty
> sheriff is difficult and not a good workflow as opposed to being able to
> check a box on the tryserver page and have it happen automagically.
>
> Indeed. Which is why John's next paragraph said the following:
>
> > Obviously, this is just an short term interim step until we have a
> > self-service interface for tryserver in production. At that point, users
> > will have much more granular control over their own jobs
>

> Nick
>
> ps: This is good stuff, RelEng folks, thanks for all your hard work!

> _______________________________________________
> dev-tree-management mailing list
> dev-tree-...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-tree-management
>

Justin Dolske

Jul 20, 2010, 3:48:56 PM

On 7/19/10 10:28 PM, Bobby Holley wrote:
> I'd imagine that the full self-service web interface will probably take some
> work. What if, in the interim, we were to do talos runs if the commit
> message of the new head revision contained the string "TALOS"?

I probably wouldn't use the web interface anyway, since push-to-try is
so darn convenient.

Would it be easier (as an implementation detail) on the releng side to
have things examine a config file in the tree (ie, modifiable by the
changeset being pushed) to control what tryserver jobs are run?

E.g.,

1. hg clone mozilla-central
2. ...make changes to code...
3. edit mozilla-central/build/trytasks
4. hg push -f try
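
To make the idea concrete, here is a rough sketch of how such a file might be read. The file format (one `key = value` per line, `#` comments), the key names, and the defaults are all assumptions for illustration; no format for build/trytasks has been defined.

```python
# Hypothetical defaults: builds and unittests on, Talos off.
DEFAULT_TASKS = {"builds": True, "unittests": True, "talos": False}


def parse_trytasks(text):
    """Parse the contents of a (hypothetical) build/trytasks file."""
    tasks = dict(DEFAULT_TASKS)
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, sep, value = line.partition("=")
        key = key.strip().lower()
        if not sep or key not in tasks:
            continue  # ignore malformed lines and unknown keys
        tasks[key] = value.strip().lower() in ("1", "yes", "true", "on")
    return tasks


print(parse_trytasks("talos = yes\nunittests = no  # just need perf numbers\n"))
```

Unknown keys are ignored rather than rejected, so a typo silently falls back to the default; a real implementation might prefer to fail loudly.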

Justin

Justin Wood (Callek)

Jul 20, 2010, 6:09:31 PM

If we do this, I heartily request that we create a hook for m-c that
tests for this try-file [and rejects the changeset if it exists at tip], so
that we don't accidentally use it when pushing to try for others.
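
A sketch of what such a hook could look like as an in-process Python hook, where a true return value makes the hook fail. It assumes `repo["tip"]` yields a change context whose `manifest()` can be iterated for file paths (on a Python 3 Mercurial these would be bytes, not str), and it reuses the `build/trytasks` path from the earlier example; treat all of it as illustrative rather than a tested hook.

```python
TRY_FILE = "build/trytasks"  # hypothetical path, borrowed from the example upthread


def tip_has_try_file(paths_at_tip, try_file=TRY_FILE):
    """Return True if the try-only file is among the files present at tip."""
    return try_file in set(paths_at_tip)


def reject_try_file(ui, repo, **kwargs):
    """Hook sketch: returning True tells Mercurial the hook failed."""
    tip = repo["tip"]  # assumption: change context for the incoming tip
    if tip_has_try_file(tip.manifest()):
        ui.warn("push rejected: %s must not land outside try\n" % TRY_FILE)
        return True
    return False
```

Keeping the path test in its own function means it can be exercised without a repository; only the thin wrapper touches Mercurial objects.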

--
~Justin Wood (Callek)

Rob Arnold

Jul 20, 2010, 7:31:29 PM

to dev-pl...@lists.mozilla.org

While this is great when pushing to try, I worry that it will become a
hassle to post a patch for review since qrefresh/qdiff will no longer do the
right thing to generate my patches for me. I'm also not sure how well it
works with a stack of patches.

I like the idea of a web interface (especially if it tracked build & test
progress for me). After seeing the Science Fair at the Summit, my idea for
such a site would have a workflow like the following:

1. hg clone mozilla-central
2. ...make changes to code...
3. hg push -f try
4. Go to web interface, choose which sets of tests to run (visualized using
   that great dependency tree from Summit) and release the job off to the
   cloud.
5. Wait for results via email or periodically check the web interface (tbpl
   isn't really great for this).

-Rob

Steve Fink

Jul 20, 2010, 9:39:56 PM

to dev-pl...@lists.mozilla.org

On 07/20/2010 04:31 PM, Rob Arnold wrote:
> On Tue, Jul 20, 2010 at 3:48 PM, Justin Dolske<dol...@mozilla.com> wrote:
>
>

> While this is great when pushing to try, I worry that it will become a
> hassle to post a patch for review since qrefresh/qdiff will no longer do the
> right thing to generate my patches for me. I'm also not sure how well it
> works with a stack of patches.
>

I always seem to need a mozconfig-extra for try pushes, so this is no
extra difficulty for me. I don't find it to be a burden at all,
although you made me realize that that's because of a simple extension
that I wrote. (I'm accustomed to using Stacked Git, which is git-land's
equivalent of mq. I still find it slightly more comfortable in
some ways. This extension of mq is mostly a port of a small piece of
stg's differences.)

The extension adds two commands: qshow and qexport. Although you could
generate individual patches with qshow, I only use it for looking at
what's in my queue (similar to qdiff). To generate a patch, I'll make
sure the patch in question is currently applied, then run 'hg qexport -p
/tmp/patches'. That'll write out the whole set of applied patches to
/tmp/patches/<patchname>.patch. I then include from there. (You could
use qshow to grab a particular one by index, if that feels better than
using a temporary directory.)

My workflow is to keep the try server-specific configuration patch
somewhere in my queue (sometimes at the top, sometimes at the bottom,
oddly enough.) When I have something I want to push to try, I just push
-f. When I want to attach a patch to a bug, I qexport. If I were to push
to m-c, I'd keep the try config at the top to qpop off and then push.
(This is hypothetical; I do not have commit access.)

I even have a separate try-specific patch in my queue that limits the
build to a single arch. It hovers around the top as well, and I'll qpush
it on temporarily for a try push when I want the restriction.

(Actually, I rarely use qpop or qpush anymore, except for the -a
variants. I use qseries -v to see my queue, then qgoto to go to a
particular index. Mentally, I just want a particular state; I don't care
that much about the exact ordering of the patches.)

In short, I'm all for the "configure try server via a config file"
approach, and I think it'll work for you with some minor tool additions.

I put the current version of my extension at
https://wiki.mozilla.org/Sfink/Mercurial#extensions. It really doesn't
do much. You could probably boil most of it down to some .hgrc aliasing
or something.

Chris AtLee

Jul 21, 2010, 1:04:21 PM

to Rob Arnold, dev-pl...@lists.mozilla.org

On 20/07/10 07:31 PM, Rob Arnold wrote:
> While this is great when pushing to try, I worry that it will become a
> hassle to post a patch for review since qrefresh/qdiff will no longer do the
> right thing to generate my patches for me. I'm also not sure how well it
> works with a stack of patches.
>
> I like the idea of a web interface (especially if it tracked build & test
> progress for me). After seeing the Science Fair at the Summit, my idea for
> such a site would have a workflow like the following:
>
> 1. hg clone mozilla-central
> 2. ...make changes to code ...
> 3. push -f try
> 4. Go to web interface, choose which sets of tests to run (visualized using
> that great dependency tree from Summit) and releasing the job off to the
> cloud.
> 5. Wait for results via email or periodically check the web interface (tbpl
> isn't really great for this).

Our long term plans are to support a combination of:

- Tags / keywords in the commit message to enable/disable platforms or tests
- A custom file you can include in your push to do the same. We'll put
  a hook in place on other branches to prevent this file from being pushed
  to non-try repos
- A web interface where you can set preferences for your pushes, so you
  don't need the commit message or custom file. You'll also be able to
  cancel or re-trigger past pushes, or post new patches via the web
  interface, and it will be a place for configuring notifications and
  collecting results.

The first two of these are fairly easy; we should be starting work on
them in the next few weeks.

The web interface is much more involved. Any help people can offer here
would be much appreciated!

Cheers,
Chris

Mike Shaver

Jul 21, 2010, 2:02:35 PM

to Chris AtLee, dev-pl...@lists.mozilla.org, Rob Arnold

On Wed, Jul 21, 2010 at 1:04 PM, Chris AtLee <cat...@mozilla.com> wrote:
> - Tags / keywords in the commit message to enable/disable platforms or tests

In Mercurial 1.6, you can apparently include metadata in the *push*
message, not just in commits:

http://twitter.com/shaver/status/19091349104
http://twitter.com/bos31337/status/19093524298

This would make it easier to push-to-try with the right test matrix
enabled, without mucking up the local repo (and having to strip
commits or other ugliness before pushing for realsies).

> The web interface is much more involved. Any help people can offer here
> would be much appreciated!

Can you point to the place that the current knobs are turned WRT
triggering different types of test runs and platform selection? That
would help figure out how to help, I think.

Mike

Chris AtLee

Jul 21, 2010, 3:43:26 PM

to Mike Shaver, Rob Arnold

On 21/07/10 02:02 PM, Mike Shaver wrote:
> On Wed, Jul 21, 2010 at 1:04 PM, Chris AtLee<cat...@mozilla.com> wrote:
>> - Tags / keywords in the commit message to enable/disable platforms or tests
>
> In mercurial 1.6, you can apparently include metadata in the *push*
> message, not just in commits:
>
> http://twitter.com/shaver/status/19091349104
> http://twitter.com/bos31337/status/19093524298
>
> This would make it easier to push-to-try with the right test matrix
> enabled, without mucking up the local repo (and having to strip
> commits or other ugliness before pushing for realsies).

We could support this as well. All that's going to happen is that when
we see a new push on try, we're going to hit hg.m.o/try, grab the commit
message from the push, look for the 'special' file, and we could grab
any extra metadata once that becomes available.
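
Roughly, the lookup could reduce to merging whatever each source says on top of the defaults. The sketch below assumes each source has already been turned into a dict of settings; the order of precedence (later sources win) and the setting names are illustrative assumptions, not a decided design.

```python
def merge_try_settings(defaults, *sources):
    """Overlay settings from each source, in order, on top of the defaults."""
    merged = dict(defaults)
    for source in sources:
        if source:  # a source may be absent entirely (None or empty)
            merged.update(source)
    return merged


settings = merge_try_settings(
    {"builds": True, "unittests": True, "talos": False},  # hypothetical defaults
    {"talos": True},        # e.g. from the 'special' file
    {"unittests": False},   # e.g. from the commit message
    None,                   # push metadata, not available yet
)
print(settings)
```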

>> The web interface is much more involved. Any help people can offer here
>> would be much appreciated!
>
> Can you point to the place that the current knobs are turned WRT
> triggering different types of test runs and platform selection? That
> would help figure out how to help, I think.

Currently there are no knobs that can affect this on a per-push basis.
Our existing controls are hard-coded on a per-branch basis [0].

We're planning on writing a custom buildbot scheduler that will look in
all the places mentioned above, as well as in whatever backend the web
UI is using, to make the decisions about which builds and tests to run.
I *think* this custom scheduler is the simplest part of the project,
which we can get started on in a few weeks. If somebody else wants to
get started hacking on the buildbot scheduler, let me know [1].

What we could use help with is writing up the web side of things to
create the UI for the users, store preferences, grab past results,
cancel/re-trigger builds, etc.

Another thing to figure out is the language for how to enable/disable
things. There are several options that could be specified:

- platforms: ALL / specific platforms to build on / specific platforms
  to NOT build on
- build types: opt only, debug only, or both?
- tests: ALL / unittests only / talos only / specific tests to run /
  specific tests to NOT run
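
As one possible shape for that language, here is a small sketch that resolves a spec against the list of things available. The syntax is invented for illustration (comma-separated names, `ALL` for everything, a leading `-` to exclude), as are the platform names; an empty spec is treated as `ALL`.

```python
def select(available, spec):
    """Resolve a spec such as 'ALL', 'linux,win32' or '-macosx' against `available`."""
    tokens = [t.strip() for t in spec.split(",") if t.strip()]
    excluded = {t[1:] for t in tokens if t.startswith("-")}
    included = [t for t in tokens if not t.startswith("-") and t != "ALL"]
    # With no explicit inclusions (only 'ALL' and/or exclusions), start from everything.
    base = included if included else list(available)
    return [name for name in available if name in base and name not in excluded]


PLATFORMS = ["linux", "linux64", "macosx", "win32"]  # hypothetical platform names
print(select(PLATFORMS, "ALL"))
print(select(PLATFORMS, "linux,win32"))
print(select(PLATFORMS, "-macosx"))
```

The same function would work for tests by passing the list of available suites instead of platforms. Names that are not in `available` are silently dropped.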

Cheers,
Chris

[0] e.g.
http://hg.mozilla.org/build/buildbot-configs/file/f4eb384f0702/talos-r3/config.py#l592

[1] We need a subclass of buildbot.schedulers.BaseScheduler or Scheduler
that looks at the various data sources (commit message, etc.) and
triggers buildsets for a subset of self.builderNames based on the
information gathered. This scheduler will be used both to trigger (or
not) builds, and to trigger (or not) tests.
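
The core of that scheduler is the filtering step, which can be sketched without buildbot at all. The builder names and the (platform, kind) metadata below are hypothetical; a real scheduler would derive them from its configuration and then trigger buildsets for the names returned.

```python
# Hypothetical builders, each tagged with a platform and a job kind.
BUILDERS = {
    "try-linux-build": ("linux", "build"),
    "try-linux-unittest": ("linux", "unittest"),
    "try-linux-talos": ("linux", "talos"),
    "try-win32-build": ("win32", "build"),
    "try-win32-talos": ("win32", "talos"),
}


def choose_builders(builders, platforms, kinds):
    """Return the sorted builder names matching the requested platforms and kinds."""
    return sorted(
        name
        for name, (platform, kind) in builders.items()
        if platform in platforms and kind in kinds
    )


# A push that wants Linux builds and unittests only, no Talos.
print(choose_builders(BUILDERS, {"linux"}, {"build", "unittest"}))
```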