Changes to automated bug comments for intermittent failures

Ed Morley

Sep 28, 2015, 8:31:40 PM
to dev-tree-management, auto-tools, dev-pl...@lists.mozilla.org, dev...@lists.mozilla.org
Currently whenever a sheriff or developer classifies an intermittent
failure on Treeherder with a bug number, a comment is left on that bug for
every occurrence.

We're going to be turning these comments off, and replacing them with
periodic summaries of recent failures, in order to:
* Improve the signal-to-noise ratio of bug comments/bugmail.
* Reduce the impact these comments have on Bugzilla (the current bot has
made 670,000 comments over the years and the BMO team are understandably
unimpressed by the bot!).
* Move us one step closer to a crashstats-like model, where not all
failures have bugs filed, and instead the canonical source of truth is a
dashboard, from which the most actionable issues can be identified and
prioritised.

There will be two types of summary:
a) Daily: Intended to warn of sudden spikes in failure rate, for which
waiting a week would be too long. Posted to bugs with >= 15 failures/day
across all trees.
b) Weekly: The primary summary, intended to keep interested parties up to
date and make it clear the bug is still occurring. Posted to bugs with >= 5
failures/week across all trees.
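
As a rough illustration of the threshold rule above (a sketch only, not
Treeherder's actual implementation; the function name, constant names and
bug numbers are hypothetical), the selection of bugs that receive a summary
could look like this in Python:

```python
from collections import Counter
from typing import Dict, Iterable

# Thresholds quoted in the announcement (described there as initial estimates).
DAILY_THRESHOLD = 15   # failures per day, across all trees
WEEKLY_THRESHOLD = 5   # failures per week, across all trees


def bugs_needing_summary(bug_ids: Iterable[int], threshold: int) -> Dict[int, int]:
    """Return {bug_id: failure_count} for bugs at or above the threshold.

    bug_ids holds one entry per classified failure in the window
    (a day or a week), with all trees combined. Hypothetical input shape.
    """
    counts = Counter(bug_ids)
    return {bug: n for bug, n in counts.items() if n >= threshold}


# Made-up week of classifications: three bugs with 6, 4 and 20 failures.
week = [1001] * 6 + [1002] * 4 + [1003] * 20
print(bugs_needing_summary(week, WEEKLY_THRESHOLD))  # prints {1001: 6, 1003: 20}
```

The same function serves both summary types; only the window of
classifications and the threshold passed in differ.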

Using last week as an example: 8,000 bug comments were left with the
per-occurrence model across 800 bugs. With the above we would instead have
posted only ~300 comments (for example content, see bug 1179821 comment 13).
The thresholds are an initial best estimate (see bug 1179821 comment 5) and
will be tweaked in the future as needed.
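
For a sense of scale, the figures quoted above work out as follows. This is
a back-of-the-envelope check using only the numbers in this message; the
variable names are illustrative and the ~300 figure is approximate:

```python
per_occurrence_comments = 8000  # comments left last week, one per failure
bugs_affected = 800             # bugs those comments were spread across
summary_comments = 300          # approximate count under the summary model

# Average number of per-occurrence comments each bug received in the week.
avg_per_bug = per_occurrence_comments / bugs_affected

# Fraction of comments that would no longer be posted (roughly 96%).
reduction = 1 - summary_comments / per_occurrence_comments

print(avg_per_bug, reduction)
```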

Longer term we have plans to:
* Automatically classify failures where possible.
* Use a "failure signature" as the canonical identifier for an intermittent
failure, so we don't have to file bugs for every failure, only the most
actionable.
* Create an OrangeFactor v2 that uses Treeherder as the source of truth for
intermittent failure data (as opposed to what is mirrored to ElasticSearch
in a TBPL-era schema) and in so doing, give it the UX overhaul it has long
needed.

Barring any issues, this change will land this week. If you have any
questions, reply here or comment in bug 1179821 / bug 1179310, or see bug
1179263 for the longer-term vision.

Best wishes,

Ed

Wes Kocher

Sep 28, 2015, 9:20:58 PM
to Ed Morley, auto-tools, dev-pl...@lists.mozilla.org, dev-b2g, dev-tree-management
This is great. Thanks for this!

Wes

Ehsan Akhgari

Sep 28, 2015, 9:52:27 PM
to dev-tree-...@lists.mozilla.org
For those who haven't met TinderboxPushlog Robot, here is a picture I
took of him five years ago when he started at Mozilla
<http://farm1.static.flickr.com/24/183272970_54862f67b4.jpg> (from
<http://ehsanakhgari.org/blog/2010-04-09/assisted-starring-oranges>).

The bot is sorry for the 670,000 or so comments he has made. He didn't
know better and he wasn't really expecting to still be in production
in 2015. He promises to be more cautious in making bug comments from
now on.

I promise to have a conversation with the bot about his behavior in the
past few years... :-)