confirm failure mode and documentation


Joel Maher

Oct 11, 2023, 4:58:17 PM
to dev-pl...@mozilla.org
Understanding the results of a Try push can be difficult. I have started writing some documentation on tools to help understand them:

Specifically, I want to call out "Confirm Failure".

This runs much faster than a retrigger and yields a signal that is as strong, if not stronger. It runs as tier-3 right now, so you need to view tier-3 jobs in your try push.

Currently there are plans to make this more streamlined (and once there it will be tier-2), specifically:
 * better UI filtering for tasks that don't have a clear signal after running confirm failures
 * a more automated way to run confirm failure for try pushes (will at least require a bug, most likely a phabricator ID)
 * other tools to make running it manually easier

The more these tools are used, the more usability and edge cases can be fixed.


Joel Maher

Nov 15, 2023, 7:33:41 AM
to dev-pl...@mozilla.org, Joel Maher
I wanted to follow up on "Confirm Failure":
 * Treeherder will mark a task with a MITTEN if confirm failure runs and it is intermittent
 * many edge cases in running confirm failures have been fixed; if an infrastructure failure occurs, it will fall back to retrigger mode
 * confirm failure tasks (ending with `-cf` in the task label) are part of the original task graph (and always optimized out), which allows other tools to discover the task to trigger or get status (more streamlined)
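
As a rough illustration of the `-cf` label convention, here is a minimal Python sketch. It assumes the task labels from a task graph are already available as plain strings; the helper names and the sample labels are hypothetical and are not part of any Treeherder or Taskcluster API.

```python
# Suffix that marks a confirm failure task label, per the convention above.
CF_SUFFIX = "-cf"


def is_confirm_failure(label):
    """Return True if the task label ends with the confirm failure suffix."""
    return label.endswith(CF_SUFFIX)


def split_labels(labels):
    """Split task labels into (regular, confirm_failure) lists, keeping order."""
    regular = [label for label in labels if not is_confirm_failure(label)]
    confirm = [label for label in labels if is_confirm_failure(label)]
    return regular, confirm


# Hypothetical labels, for illustration only.
sample_labels = [
    "test-example-platform/opt-mochitest-plain-1",
    "test-example-platform/opt-mochitest-plain-cf",
    "test-example-platform/opt-xpcshell-cf",
    "build-example-platform/opt",
]

regular, confirm = split_labels(sample_labels)
```

A tool that wants to trigger or check confirm failure tasks could filter on the suffix like this, rather than constructing the tasks itself.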

As this is now more robust, I am making it tier-2. There are also some tooling prototypes in place to run this on a given push/revision, which is a step towards the end goal of running it automatically when certain criteria are met (e.g. a phabricator review, autoland failures, non-hardware-specific tests, etc.)