General updates
Still a CY, welcome back Chase!
jam@, ojan@, jparent@ all gone until January. Please send updates if you ARE working. Regroup in January.
Area updates
Flakiness:
2.4% (still mostly oddballs)
jam@: how long were these outages?
~1 day. But hard to measure.
jam@: There will always be bad things checked in, but we need to detect them sooner. Example: sergey@'s change goes in, there is a 1 hour cycle time, then 30 mins or so before someone notices the badness, and the CQ keeps going … If there are outages longer than, say, 4 hours, we really should be able to do better than this.
When jobs fail on trybots without the patch, surface THIS to the sheriff. This is an earlier signal of issues (AI: sergey will follow up on this)
ojan@ says persistent without-patch failures would be easy to surface, but this is less useful
alancutter is adding flakiness data to som (no update, did not work on it last week). Will work on it this week, adding more detail on particular tryjobs
Open bug on adding flakiness of actual builders, needs an owner: http://crbug.com/436967 (AI: jparent to find someone to do this)
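Rough sketch of the without-patch signal discussed above, in case it helps the follow-up. Everything here is an assumption for illustration: the TryJobResult record, its field names, the builder and step names, and the threshold are hypothetical and are not the actual CQ or som data model. The idea is just to count without-patch failures per (builder, step) and hand anything over a threshold to the sheriff.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TryJobResult:
    """Hypothetical record of one tryjob step (not a real CQ schema)."""
    builder: str
    step: str
    failed_with_patch: bool
    failed_without_patch: bool


def without_patch_alerts(results, min_failures=3):
    """Return sorted (builder, step) pairs that failed without the patch
    at least min_failures times. These point at tree breakage rather
    than at the CL under test."""
    counts = Counter(
        (r.builder, r.step) for r in results if r.failed_without_patch
    )
    return sorted(key for key, n in counts.items() if n >= min_failures)


if __name__ == "__main__":
    # Made-up sample data: one step failing repeatedly without the patch.
    sample = [TryJobResult("linux_rel", "browser_tests", True, True)] * 3
    sample += [TryJobResult("mac_rel", "unit_tests", True, False)] * 5
    for builder, step in without_patch_alerts(sample):
        print(f"sheriff alert: {step} failing without patch on {builder}")
```

A low min_failures gives the earlier signal jam@ is asking for; a high one is closer to the "persistent" variant ojan@ mentions.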
CQ Matches Waterfall:
Automated metric for determining whether main waterfall bots match ng bots. Now: 29.6%. This metric is a bit lower because it only covers ng bots. Notable exceptions: android (but this is being converted now; they don’t do retry without patch yet)
sergiy still making progress on capacity estimates and
Monitoring:
Pingdom still catching outages before our users have to notify us! (today, crbug)
More discussions with ACQ_SRE for [internal google monitoring product] access. On track for this week.
Tree Open:
no real updates.
sean is ramping up on som
Fast Cycles:
no updates