Do we need to commit expectations? It'd seem better to me if Infra were able to file bugs for flaky tests, ignore them on the CQ, but
keep running them (auto-closing bugs for the ones that stop flaking).
I have a more general concern with explicitly disabling flaky tests, however. I've been struggling with the inability to conscientiously avoid introducing new flakes into the codebase. When making core changes that affect all tests (sometimes even in an attempt to address existing flakes), it's really hard to know that you're not introducing new flakes. It's impossible to check via a dry run (the CQ explicitly hides flakes from you, with no way to request otherwise), so the "best" approach is to wait for pinpoint/sheriffs; 2-3 days of silence is usually a good sign...
Whenever such a CL gets reverted, I'm always left to wonder how many other tests were disabled because of the systemic flake I inadvertently introduced... Some sheriffs do tag me on the bug, but there's no guarantee; the other disabled tests are gone forever...
If instead Infra had a database of known flakes, it could:
1) Ignore known flakes in CQ
2) Close bugs for flakes that vanish (like ClusterFuzz)
3) Have a "flakiness dry run" mode on the CQ that could spot new flakes before landing :)!
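For concreteness, here's a rough sketch of how such a database could tie those three behaviors together (all names, thresholds, and APIs here are hypothetical, just to illustrate the idea):

```python
# Hypothetical sketch of a flake database backing the three behaviors above.
from dataclasses import dataclass


@dataclass
class FlakeEntry:
    test_name: str
    bug_id: int
    runs_since_last_flake: int = 0


class FlakeDatabase:
    # Assumed policy: auto-close the bug after N consecutive clean runs.
    AUTO_CLOSE_THRESHOLD = 1000

    def __init__(self):
        self.known_flakes: dict[str, FlakeEntry] = {}
        self.closed_bugs: list[int] = []

    def is_known_flake(self, test_name: str) -> bool:
        # 1) The CQ consults this to ignore known flakes.
        return test_name in self.known_flakes

    def record_result(self, test_name: str, flaked: bool) -> None:
        entry = self.known_flakes.get(test_name)
        if entry is None:
            return
        if flaked:
            entry.runs_since_last_flake = 0
        else:
            entry.runs_since_last_flake += 1
            # 2) Close bugs for flakes that vanish (like ClusterFuzz).
            if entry.runs_since_last_flake >= self.AUTO_CLOSE_THRESHOLD:
                self.closed_bugs.append(entry.bug_id)
                del self.known_flakes[test_name]

    def dry_run_report(self, results: dict[str, bool]) -> list[str]:
        # 3) "Flakiness dry run": surface flakes NOT already in the
        # database, i.e. the ones this CL may have introduced.
        return [name for name, flaked in results.items()
                if flaked and not self.is_known_flake(name)]
```

The key point of the sketch is that disabled-but-still-running tests stay observable: the database keeps counting clean runs so recoveries are noticed, and the dry-run report diffs a CL's flakes against the known set instead of hiding them.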
Also, re. TestExpectations: this issue is not specific to Blink Web Tests. They're the #1 case, but there are other instances.
- Gab