Overall status: exit criteria nearly satisfied. One remaining piece of monitoring to add, ETA Friday. Goal: CY Exit on Friday! Stay tuned for details and the wrap-up.
Monitoring:
CQ false-rejection alerting is the only remaining item. ETA: Friday.
jparent: circulate this final monitoring piece once ready, wait for signoff from other leads, then DONE.
Flakiness:
Alan Cutter has his last CY CL: 845963005, see demo at http://attempt-timeline.trooper-o-matic.appspot.com/cq/chromium .
False rejection rate last week: 3.6% (2.3% due to failed jobs, 1.2% due to failure to trigger jobs).
I see disproportionately many triggered GPU tests among the causes, likely due to a bug in the CQ verifier, to be fixed by the new verifier this week. CLs: 123977014, 140747013
Other causes of flakiness:
Jan 5: 5.64% contentshelltest in android bots, git setup in win_gpu_triggered_tests, various test steps in other builders (no pattern)
Jan 6: 1.15% mac_chromium_rel_ng: telemetry_unittests, unit_tests, tab_capture_end2end_tests
Jan 7: 0.68% mac_chromium_rel_ng: no pattern
Jan 8: 6.81% browser_tests, tab_capture_end2end_tests (mac & win builders), android builders (no pattern), tab_capture_end2end_tests (win_gpu_triggered_tests)
Jan 9: 3.01% browser_tests all over the place, compile failures on android bots
http://build.chromium.org/p/tryserver.chromium.linux/builders/android_arm64_dbg_recipe/builds/35759
http://build.chromium.org/p/tryserver.chromium.linux/builders/android_clang_dbg_recipe/builds/36011
etc. For more, go to https://trooper-o-matic.appspot.com/cq/chromium , click a point on the false rejection graph, and look at the builds.
Jan 10: 4.62% compile on android bots
Jan 11: 0%
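As a rough cross-check of the daily figures above, the sketch below computes their unweighted mean and the worst day. The variable names are illustrative, and treating every day equally is an assumption: the 3.6% weekly figure is presumably weighted by the number of CQ attempts per day, which these notes do not list, so the two numbers need not match.

```python
# Daily CQ false rejection rates (percent), copied from the Jan 5-11 list above.
daily_rates = {
    "Jan 5": 5.64,
    "Jan 6": 1.15,
    "Jan 7": 0.68,
    "Jan 8": 6.81,
    "Jan 9": 3.01,
    "Jan 10": 4.62,
    "Jan 11": 0.0,
}

# Unweighted mean: every day counts equally (an assumption; the weekly
# figure in the notes may instead be weighted by attempts per day).
unweighted_mean = sum(daily_rates.values()) / len(daily_rates)

# Day with the highest false rejection rate.
worst_day = max(daily_rates, key=daily_rates.get)

print(f"unweighted mean: {unweighted_mean:.2f}%")
print(f"worst day: {worst_day} ({daily_rates[worst_day]:.2f}%)")
```

The unweighted mean comes out at about 3.13%, somewhat below the reported 3.6%, which would fit the busier days also being the flakier ones.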
CQ Matches Waterfall:
Ongoing work, but does not block CY Exit.
status
ng-trybot coverage is 47.5%, up from 29.6% in December
note that the above includes optional trybots that are not part of CQ because of capacity constraints
sorting out chromium.chrome coverage would get us to ~51%
sorting out GPU coverage would get us to ~64%
if we do both that’s ~70%
full completion not possible in Q1 due to hardware constraints
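The projections above are consistent with simply adding the two gains. A minimal sketch of that arithmetic, assuming the chromium.chrome and GPU gaps do not overlap (the notes do not say how the ~70% was derived, and the variable names are illustrative):

```python
# Coverage figures (percent) taken from the status list above.
current = 47.5       # ng-trybot coverage today
with_chrome = 51.0   # approximate coverage after sorting out chromium.chrome
with_gpu = 64.0      # approximate coverage after sorting out GPU

# Gain from each fix on its own.
chrome_gain = with_chrome - current
gpu_gain = with_gpu - current

# Assumption: the two gains are independent and therefore additive.
combined = current + chrome_gain + gpu_gain

print(f"chromium.chrome gain: {chrome_gain:.1f} points")
print(f"GPU gain: {gpu_gain:.1f} points")
print(f"combined coverage: ~{combined:.1f}%")
```

Under that assumption the combined figure is 67.5%, in the neighborhood of the ~70% quoted above; the inputs are themselves approximate.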
major areas
need to verify consensus
need to prioritize bugs (possibly up to Pri-0), especially https://code.google.com/p/chromium/issues/detail?id=446266 , which is on the critical path now
telemetry tests on android
need to prioritize https://code.google.com/p/chromium/issues/detail?id=397338 (Pri-0 is not really taking effect; also see comments #92+)
chromium.memory
we’re under hardware capacity constraints (detailed estimates pending)
provide in-sync (“ng”) trybot coverage for GPU tests
mostly recipe changes, could use help from GPU folks (should be able to ask for cycles because of Code Yellow)
decision: are debug (dbg) bots in scope, i.e. to be covered? if not, that’d improve our metrics; AFAICT they’re not in scope, just checking (for now metrics are conservative)
different versions of OSes
Windows: do we need all of XP, Vista, 7, and 8? We may be able to drop XP in April (http://chrome.blogspot.com/2013/10/extending-chrome-support-for-xp-users.html), and likely Vista as well at that point; optional trybots are in progress
Mac: can we standardize on 1-2 versions? Main waterfall currently has at least 10.{6,8,9}.
Linux: can we add Trusty to the main waterfall without trybot coverage? This would regress our metrics.
Bling seems to want to create their own infrastructure (https://groups.google.com/a/google.com/d/msg/chrome-infrastructure-team/DjCYRkKTiJY/FOKxTrSQBbMJ), which makes metrics and guarantees more difficult from the infra/chromium POV
Tree Open:
Status: CY satisfied.
See https://trooper-o-matic.appspot.com/tree-status/chromium for the data.
Fast Cycles: