On 12/06/2013 08:20 PM, Anthony Ricaud wrote:
> I'm the only unofficial sheriff so I cannot catch everything, I can
> only catch it while I'm working, and I need to work on my main tasks
> so I cannot spend too much time on this. Should we get some sheriffs
> looking at Travis? Should we get managers/release management to nag
> people about fixing them?
I think this is fundamentally an engineering issue that should be
addressed by prioritization and tooling. mozilla-central has
dedicated-ish sheriffs who can and do mark intermittents all day, but
that doesn't fix the intermittent failures in and of itself. (Although
the TBPL bugzilla traffic that results from the easily marked
intermittents is very helpful in that it makes clear just how
intermittent a problem is.)
Having said that, where engineers are failing to address this, you seem
to be getting stuck with the responsibility, so I think it makes sense
for you (or others doing what you do) to demand that we come up with
a sheriffing schedule or something that spreads the load more fairly.
Engineering-wise, I think it's two-pronged:
- Engineers should absolutely spend time on fixing intermittents and
managers should be emphasizing this as a priority.
- We need to improve our JS marionette logging in the face of failures.
It's pretty hard to tell what is happening/happened in the test
failures. I assume :lightsofapollo is all over this with the automated
testing/landing stuff he and friends are working on, but there's
probably more we can all do to help improve this.
For Thunderbird's mozmill tests where we encountered a similar problem
of "okay, so the test failed, why did the test fail?! gaaaaaah!", I
added logging and failure test capturing that was useful in many cases
to understand what was happening on the server.
A blog post with screenshots can be seen at:
http://www.visophyte.org/blog/2011/03/02/teaser-rich-contextual-information-for-thunderbird-mozmill-failures/
And a current extracted log failure (though manually fetched since ArbPL
is no longer actively running/scraping for Thunderbird):
https://clicky.visophyte.org/tools/arbpl-mozmill-standalone/?log=https://clicky.visophyte.org/examples/arbpl-mozmill/20131201/mozmill-fail.log
Note that Thunderbird's failures frequently involved focus issues,
keypresses, popups, and XBL/XUL hiccups, which is why there is so much
emphasis placed on focus changes and where events were actually
handled. These are not the things that really matter for our marionette
tests. I know just having the console.log() output for the main thread
and (shimmed/faked) console.log output for the worker for the e-mail app
is probably the most bang-for-the-buck (and efforts have already been
made to try and improve this).
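
To make the "capture console.log output and show it on failure" idea
concrete, here is a rough sketch. It is not how marionette or the e-mail
app actually does it; FailureLogBuffer and runWithFailureLog are
hypothetical names made up for illustration, and a real harness would
need to be async and pull the worker's output across somehow.

```typescript
// Hypothetical sketch: buffer log lines during a test and surface them
// only when the test fails. None of these names come from marionette.

type LogEntry = { source: string; message: string };

class FailureLogBuffer {
  private entries: LogEntry[] = [];
  private maxEntries: number;

  constructor(maxEntries: number = 500) {
    this.maxEntries = maxEntries;
  }

  // Record one line from a named source, e.g. "main" or "worker".
  log(source: string, message: string): void {
    this.entries.push({ source, message });
    // Drop the oldest entries so a chatty test cannot grow without bound.
    if (this.entries.length > this.maxEntries) {
      this.entries.splice(0, this.entries.length - this.maxEntries);
    }
  }

  // Render the buffered lines, oldest first.
  dump(): string[] {
    return this.entries.map((e) => `[${e.source}] ${e.message}`);
  }

  clear(): void {
    this.entries = [];
  }
}

// Run a (synchronous) test body. The buffered log is returned only if
// the body throws; a passing test produces no log output.
function runWithFailureLog(
  body: (buf: FailureLogBuffer) => void,
  buf: FailureLogBuffer = new FailureLogBuffer()
): { passed: boolean; log: string[] } {
  buf.clear();
  try {
    body(buf);
    return { passed: true, log: [] };
  } catch (err) {
    buf.log("harness", `test failed: ${String(err)}`);
    return { passed: false, log: buf.dump() };
  }
}
```

The point of tagging each line with its source is that a failure report
can interleave main-thread and worker output in the order it happened,
which is most of what you want when asking "why did the test fail?".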
Andrew