Specific bot: master.tryserver.chromium.linux:linux_layout_tests_slimming_paint_v2
Bot capacity increase required before we do this: http://crbug.com/703478
Backing rationale: http://crbug.com/707281 burned a few days tracking down a cc change that broke composited animations in SPv2. As we work on the last ~20% of functionality for SPv2 to achieve parity I expect we'll see an increasing risk of breaking changes that aren't directly within Blink paint directories where we add this trybot today.
Thought: start as FYI but to prevent regression we would want to move to blocking once we have gained confidence that that would be a reasonable thing to do. So, what would we need to see to feel confident?
Given we are only adding it for certain subdirectories Blink-side for example, we see occasional regressions today. For example there were multicol changes under WebKit/Source/core/layout that broke the SPv2 trybot builds, we had to notice it was red manually and mark those tests as failing. http://crrev.com/2765013002
pdr@ set up our current try bot and overall test plan, here is the historical doc:
In there he mentions setting up add'l SPv2 specific trybots: "When spv2 is closer to launch, additional spv2-specific trybots can be added." I'm not sure exactly what he had in mind. What would be best?
> Second thought: Is the slimming paint bot not running on the waterfall? It's unfortunate to have bots on the waterfall that don't run in the CQ since that can break the waterfall. But then, it's also a problematic situation to have CQ blocking on bots (for some patches only) that don't run on the waterfall, since this allows patches to land and then block CQ for other things.
It's on the tryserver.chromium.linux waterfall here:
https://build.chromium.org/p/tryserver.chromium.linux/waterfall
I think we haven't added it to the fuller CQ because we've been building up more compatibility. We're just now approaching point where broadening likely makes sense, and cc would be an obvious initial candidate, so a bit of a mini capybara type of situation.
> Third thought: Why is this through presubmit instead of just adding it to the CQ and having analyze decide of the patch is relevant to the bot?
See above re: compatibility. I think this would allow us more direct control until we have greater confidence, but also, I know how to add it through presubmit and I don't know what we'd need to do for CQ, so open to thoughts here as well.
On Monday, April 3, 2017 at 12:07:50 PM UTC-7, danakj wrote:
> First thought: Would failing this bot be an FYI or would it block the patch landing?
Thought: start as FYI but to prevent regression we would want to move to blocking once we have gained confidence that that would be a reasonable thing to do. So, what would we need to see to feel confident?
Given we are only adding it for certain subdirectories Blink-side for example, we see occasional regressions today. For example there were multicol changes under WebKit/Source/core/layout that broke the SPv2 trybot builds, we had to notice it was red manually and mark those tests as failing. http://crrev.com/2765013002
pdr@ set up our current try bot and overall test plan, here is the historical doc:
https://docs.google.com/document/d/1QCM912Dr6u38DqyQqd7pxQxDy8FFOoWMMDq7uAXqKdA/view#heading=h.scmfc1jwpawo
In there he mentions setting up add'l SPv2 specific trybots: "When spv2 is closer to launch, additional spv2-specific trybots can be added." I'm not sure exactly what he had in mind. What would be best?
> Second thought: Is the slimming paint bot not running on the waterfall? It's unfortunate to have bots on the waterfall that don't run in the CQ since that can break the waterfall. But then, it's also a problematic situation to have CQ blocking on bots (for some patches only) that don't run on the waterfall, since this allows patches to land and then block CQ for other things.
It's on the tryserver.chromium.linux waterfall here:
https://build.chromium.org/p/tryserver.chromium.linux/waterfall
I think we haven't added it to the fuller CQ because we've been building up more compatibility. We're just now approaching point where broadening likely makes sense, and cc would be an obvious initial candidate, so a bit of a mini capybara type of situation.
On Monday, April 3, 2017 at 12:07:50 PM UTC-7, danakj wrote:
> First thought: Would failing this bot be an FYI or would it block the patch landing?
Thought: start as FYI but to prevent regression we would want to move to blocking once we have gained confidence that that would be a reasonable thing to do. So, what would we need to see to feel confident?
Given we are only adding it for certain subdirectories Blink-side for example, we see occasional regressions today. For example there were multicol changes under WebKit/Source/core/layout that broke the SPv2 trybot builds, we had to notice it was red manually and mark those tests as failing. http://crrev.com/2765013002
pdr@ set up our current try bot and overall test plan, here is the historical doc:
https://docs.google.com/document/d/1QCM912Dr6u38DqyQqd7pxQxDy8FFOoWMMDq7uAXqKdA/view#heading=h.scmfc1jwpawo
In there he mentions setting up add'l SPv2 specific trybots: "When spv2 is closer to launch, additional spv2-specific trybots can be added." I'm not sure exactly what he had in mind. What would be best?
> Second thought: Is the slimming paint bot not running on the waterfall? It's unfortunate to have bots on the waterfall that don't run in the CQ since that can break the waterfall. But then, it's also a problematic situation to have CQ blocking on bots (for some patches only) that don't run on the waterfall, since this allows patches to land and then block CQ for other things.
It's on the tryserver.chromium.linux waterfall here:
https://build.chromium.org/p/tryserver.chromium.linux/waterfall
I think we haven't added it to the fuller CQ because we've been building up more compatibility. We're just now approaching point where broadening likely makes sense, and cc would be an obvious initial candidate, so a bit of a mini capybara type of situation.
> Third thought: Why is this through presubmit instead of just adding it to the CQ and having analyze decide of the patch is relevant to the bot?
See above re: compatibility. I think this would allow us more direct control until we have greater confidence, but also, I know how to add it through presubmit and I don't know what we'd need to do for CQ, so open to thoughts here as well.
--
You received this message because you are subscribed to the Google Groups "Graphics-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to graphics-dev+unsubscribe@chromium.org.
A few things ...First, there is currently *one* machine for that builder, so, yeah, that's way under capacity for anything that you want to be sending builds to at all regularly, let alone as part of a presubmit or on the CQ. We'd normally need to either move it to a bigger pool or the CQ pool itself. But ...Second, we shouldn't have anything being run in the CQ that isn't also a waterfall (continuous) builder, because it's too hard to expect to keep a tryserver-only config green. So, you need to add a waterfall configuration. But ...Third, I'm pretty sure I asked this a year ago but I'll ask again: do you really need to run *all* of the layout tests? Because that is indeed quite expensive. And ...Fourth, adding an extra builder just to run an additional test step is also expensive (since you have to do a full build even though SPv2 is just a runtime flag) so we should try to avoid that if we can. But ...Fifth, all of this is complicated by the fact that we can't run the layout tests under swarming and hence it's much harder to parallelize that test execution with anything else. *And* part of the reason we use particular directories instead of just relying on `analyze` is because we don't rely on analyze even for the normal layout test case, because the test suites are too slow, because we're not running under swarming. But! ...Sixth, we're about to start running layout tests under swarming on linux, literally hopefully with CLs landing in the next day or two (so, really, in the next week or two before things stick).So all of this will hopefully become *much* easier Real Soon Now, and you'll be able to just add a test step into the CQ like any other step. Can we wait for that?
-- Dirk
On Mon, Apr 3, 2017 at 12:32 PM, Chris Harrelson <chri...@chromium.org> wrote:On Mon, Apr 3, 2017 at 12:29 PM, <wko...@chromium.org> wrote:On Monday, April 3, 2017 at 12:07:50 PM UTC-7, danakj wrote:
> First thought: Would failing this bot be an FYI or would it block the patch landing?
Thought: start as FYI but to prevent regression we would want to move to blocking once we have gained confidence that that would be a reasonable thing to do. So, what would we need to see to feel confident?
Given we are only adding it for certain subdirectories Blink-side for example, we see occasional regressions today. For example there were multicol changes under WebKit/Source/core/layout that broke the SPv2 trybot builds, we had to notice it was red manually and mark those tests as failing. http://crrev.com/2765013002
pdr@ set up our current try bot and overall test plan, here is the historical doc:
https://docs.google.com/document/d/1QCM912Dr6u38DqyQqd7pxQxDy8FFOoWMMDq7uAXqKdA/view#heading=h.scmfc1jwpawo
In there he mentions setting up add'l SPv2 specific trybots: "When spv2 is closer to launch, additional spv2-specific trybots can be added." I'm not sure exactly what he had in mind. What would be best?The only reason we have a specific bot is to avoid slowdown of the regular CQ with a lot of extra tests. But since it runs in parallel with the other bots, I'm not sure if it ends up slowing down any commits (assuming we have enough capacity).
> Second thought: Is the slimming paint bot not running on the waterfall? It's unfortunate to have bots on the waterfall that don't run in the CQ since that can break the waterfall. But then, it's also a problematic situation to have CQ blocking on bots (for some patches only) that don't run on the waterfall, since this allows patches to land and then block CQ for other things.
It's on the tryserver.chromium.linux waterfall here:
https://build.chromium.org/p/tryserver.chromium.linux/waterfall
I think we haven't added it to the fuller CQ because we've been building up more compatibility. We're just now approaching point where broadening likely makes sense, and cc would be an obvious initial candidate, so a bit of a mini capybara type of situation.Clarification: it is on the CQ for code touching core/paint/ or platform/graphics/, but not otherwise.
> Third thought: Why is this through presubmit instead of just adding it to the CQ and having analyze decide of the patch is relevant to the bot?
See above re: compatibility. I think this would allow us more direct control until we have greater confidence, but also, I know how to add it through presubmit and I don't know what we'd need to do for CQ, so open to thoughts here as well.
--
You received this message because you are subscribed to the Google Groups "Graphics-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to graphics-dev+unsubscribe@chromium.org.
On Tuesday, April 11, 2017 at 12:31:19 PM UTC-7, Dirk Pranke wrote:
> No one actually responded to my earlier note in this thread (apart from not doing anything, which I suppose is a response to my can we wait question ;).
Yes, I was waiting a bit to see if the stuff you mentioned in flight had landed.
>
> What do you actually want to do with these tests? I don't like running them in the CQ if we don't have a matching main waterfall builder.
Replying specifically:
> First, there is currently *one* machine for that builder, so, yeah, that's way under capacity for anything that you want to be sending builds to at all regularly, let alone as part of a presubmit or on the CQ. We'd normally need to either move it to a bigger pool or the CQ pool itself. But ...
We have that bug above aiming to add capacity. It seems to not have been fully enabled. I had assumed from reading the estimation in the bug (but perhaps I am incorrect?) that the resourcing we scoped is sufficient to add it to just cc/... changes for presubmit and/or CQ or both.
> Second, we shouldn't have anything being run in the CQ that isn't also a waterfall (continuous) builder, because it's too hard to expect to keep a tryserver-only config green. So, you need to add a waterfall configuration. But ...
I'm not super familiar with these builder setups. I see our bot listed on tryserver.chromium.linux on the waterfall page. Is that not sufficient?
https://build.chromium.org/p/tryserver.chromium.linux/waterfall
> Third, I'm pretty sure I asked this a year ago but I'll ask again: do you really need to run *all* of the layout tests? Because that is indeed quite expensive. And ...
IMO we get much of the value from running some specific sub-directories. Probably they add up to about 1/3 of total tests? You can see based on what is enabled in FlagExpectations/enable-slimming-paint-v2 for rough sense.
> Fourth, adding an extra builder just to run an additional test step is also expensive (since you have to do a full build even though SPv2 is just a runtime flag) so we should try to avoid that if we can. But ...
Hmm, this seems a bummer, really no way around having to build as well given it's runtime only?
> Fifth, all of this is complicated by the fact that we can't run the layout tests under swarming and hence it's much harder to parallelize that test execution with anything else. *And* part of the reason we use particular directories instead of just relying on `analyze` is because we don't rely on analyze even for the normal layout test case, because the test suites are too slow, because we're not running under swarming. But! ...
I don't fully grok everything above, but it sounds like it's changing with the below, so focusing there.
> Sixth, we're about to start running layout tests under swarming on linux, literally hopefully with CLs landing in the next day or two (so, really, in the next week or two before things stick).
This sounds good! Did it happen? Does it make it more feasible resource-wise for us to enable SPv2 layout test runs in some manner for cc/... changes?
Open to thoughts on best next steps. My impression from discussion to date is that adding it to analyze is preferred approach? We obviously need to get the resourcing fully addressed first.