`shards = 20,`
Can we bump up the shards instead? GPU tests might not be able to, so this might still be necessary.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
> On time intensive builders like linux-code-coverage test involving I/O
> bound operations like webgpu tests may timeout due to lack of resources,
> particularly lack of time.

Can you provide an example? The settings you're applying here only control swarming-level timeouts, and I don't see any such failures in https://ci.chromium.org/ui/p/chromium/builders/ci/linux-code-coverage/6498/infra.
There might be individual test cases that are timing out, but this CL wouldn't affect those at all.
result #1 failed (unexpectedly timed out) in task: 75afd7474d0ff411
https://chromium-swarm.appspot.com/task?id=75afd7474d0ff411&o=true&w=true
As we can see, the linux-code-coverage builder is unstable: https://luci-milo.appspot.com/ui/test/chromium/%3A%2F%2F%5C%3Ablink_web_tests!webtest%3A%3Ahttp%2Ftests%2Fdevtools%2Fsources%2Fdebugger-async%23async-callstack-promises.js?followRenames=false#:~:text=linux%2Dcode%2Dcoverage-,not_site_per_process_blink_web_tests,-Ubuntu%2D22.04
Increasing the timeouts of the _swarming shards_ will only help alleviate swarming task failures that end with status `TIMED_OUT`, e.g.:
https://chromium-swarm.appspot.com/task?id=761ea41604e77210
The example you linked (https://chromium-swarm.appspot.com/task?id=75afd7474d0ff411&o=true&w=true) exited with a normal `COMPLETED (FAILURE)` status, so it was not a swarming timeout. It might be that individual test cases within that shard are timing out, but the whole shard did not time out.
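The distinction above can be sketched in Python. This is only an illustration, not Swarming client code: the `state` and `failure` field names on the dict and the helper `swarming_timeout_would_help` are assumptions made for the example.

```python
def swarming_timeout_would_help(task_result: dict) -> bool:
    """Return True only when the whole shard hit the swarming-level timeout.

    `task_result` is a hypothetical dict with a `state` string such as
    "TIMED_OUT" or "COMPLETED" (field names are assumptions for this sketch).
    """
    return task_result.get("state") == "TIMED_OUT"


# A shard that ran to completion but reported failing test cases.
completed_with_failures = {"state": "COMPLETED", "failure": True}
# A shard that was stopped for exceeding its task-level timeout.
shard_timed_out = {"state": "TIMED_OUT", "failure": True}

for result in (completed_with_failures, shard_timed_out):
    print(result["state"], swarming_timeout_would_help(result))
```

Only the second kind of result would be affected by a larger shard timeout; the first needs a fix inside the test harness or the test itself.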
Ohhh, I get it! I've seen many tests time out across the builder, e.g. https://luci-milo.appspot.com/ui/test/chromium/%3A%2F%2F%5C%3Ablink_web_tests!webtest%3A%3Ahttp%2Ftests%2Fdevtools%2Fsources%2Fdebugger-async%23async-callstack-promises.js?followRenames=false Do you think it is okay to follow up with this approach, or what do you recommend? Many thanks!
> do you think is okay to follow up with this approach

As in: follow up with this CL as-is? I've already explained above why this CL won't improve anything.
If you're trying to make something like https://chromium-swarm.appspot.com/task?id=75afd7474d0ff411&o=true&w=true go from failing to passing, I'd suggest trying to determine why it's failing. If the reason isn't immediately obvious, I'd suggest reaching out to test owners.
I've also already suggested trying to increase the test case timeout, in case it's a per-test timeout that's causing it to fail. But again, as I've already explained, this CL applies a _swarming_ level timeout, not a test-case timeout. You'll have to research the specific test harnesses for how to bump per-test timeouts. It will differ test by test.
Many thanks for the explanation. I didn't realize until now (my bad) that these settings apply only at the swarming level. Let me proceed with the multiplier flag you suggested. Many thanks again for the tips, I appreciate it :)
targets.mixin(
args = [
"--test-timeout-sec=720",
],
),

Good thinking running the "linux-code-coverage" tryjob. Check the "webgpu_cts_tests" results on it and you'll see the error: `run_browser_tests.py: error: no such option: --test-timeout-sec=720`. So this isn't how that test harness can be configured.
But I'm not sure what it really wants. Running `./content/test/gpu/run_gpu_integration_test.py --help` doesn't show any timeout options. @bsh...@chromium.org any tips on increasing per-test timeouts for this harness?
These tests already use a heartbeat mechanism to avoid timeouts as long as the test page is able to send a heartbeat at least once every 15 seconds. I will need to look at the underlying flakes to see what's going on.
We do have the option of adding a blanket `Slow` expectation for the `clang-coverage` tag in https://source.chromium.org/chromium/chromium/src/+/main:third_party/dawn/webgpu-cts/slow_tests.txt, which will 5x various timeouts. However, I want to take a closer look at the failures first to make sure that's the correct approach here.
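As a rough illustration of how a heartbeat check like the one described could behave, here is a minimal Python sketch. It is not the GPU harness code: the function `timed_out`, its parameters, and the assumption that the heartbeat window is one of the timeouts scaled by a `Slow` expectation are all hypothetical. Only the 15-second interval and the 5x factor come from the comment above.

```python
HEARTBEAT_INTERVAL_SEC = 15.0  # interval mentioned in the comment above
SLOW_MULTIPLIER = 5.0          # "5x various timeouts" for Slow tests


def timed_out(heartbeat_times, end_time, slow=False):
    """Return True if any gap between heartbeats exceeds the allowed window.

    heartbeat_times: sorted timestamps in seconds, measured from a test
        start at 0.0, at which heartbeats arrived.
    end_time: timestamp in seconds at which the test finished.
    slow: whether the (assumed) Slow multiplier applies to the window.
    """
    allowed = HEARTBEAT_INTERVAL_SEC * (SLOW_MULTIPLIER if slow else 1.0)
    previous = 0.0
    for t in list(heartbeat_times) + [end_time]:
        if t - previous > allowed:
            return True
        previous = t
    return False


# A 30 second gap between heartbeats fails normally but passes when Slow.
print(timed_out([10, 40], 45))
print(timed_out([10, 40], 45, slow=True))
```

Under this model a long-running test never fails just for being long; it fails only when a single gap between heartbeats exceeds the window, which is why slower coverage builds may need the larger window rather than a larger shard timeout.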
There are a handful of timeouts in some tests, but there are also some non-timeout failures that will need to be investigated separately.
I'll add `Slow` expectations for the tests that are timing out.
You can abandon this CL since there won't be any Chromium-side changes needed for that.
https://dawn-review.googlesource.com/c/dawn/+/288535 is the CL to mark the timing out tests as slow.
Hey Ben and Brian,
Wow, this is pretty interesting!
I was wondering if we can apply the same workaround to `not_site_per_process_headless_shell_wpt_tests`, as it uses `run_wpt_tests.py`.
I could see some options to add a multiplier, but it's a bit obscure: https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/tools/blinkpy/wpt_tests/wpt_adapter.py;drc=bc259fc1f325d6817306e96d2cd11a1de22a2d98;l=276
What do you recommend in this case?
Many thanks for the comments and the help on this.
Web tests [support `Slow` expectations as well](https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/web_tests/SlowTests), but the available tags are much more limited so there's no way to differentiate based on whether code coverage is enabled or not.
My guess is that your best bet would be to add something that's basically identical to the `self.options.enable_sanitizer` check in the file you linked but for `self.options.enable_code_coverage` or similar.
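A minimal sketch of what such a check might look like, written in Python. This does not reproduce `wpt_adapter.py`: the function name `effective_timeout_multiplier`, both multiplier constants, and the exact option names (`timeout_multiplier`, `enable_code_coverage`) are assumptions for illustration, and the real sanitizer check may be structured differently.

```python
import argparse

# Hypothetical multipliers; real values would need tuning by the test owners.
SANITIZER_TIMEOUT_MULTIPLIER = 2.0
CODE_COVERAGE_TIMEOUT_MULTIPLIER = 3.0


def effective_timeout_multiplier(options: argparse.Namespace) -> float:
    """Combine an explicit multiplier with build-configuration multipliers."""
    # Start from the user-supplied multiplier, defaulting to 1.0 when unset.
    multiplier = getattr(options, "timeout_multiplier", None) or 1.0
    if getattr(options, "enable_sanitizer", False):
        multiplier *= SANITIZER_TIMEOUT_MULTIPLIER
    # Assumed new branch, mirroring the sanitizer one.
    if getattr(options, "enable_code_coverage", False):
        multiplier *= CODE_COVERAGE_TIMEOUT_MULTIPLIER
    return multiplier


options = argparse.Namespace(enable_code_coverage=True)
print(effective_timeout_multiplier(options))
```

Keeping the coverage branch next to the sanitizer branch means the builder only has to pass one boolean option, and per-test timeouts scale without touching the expectations files.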
Acknowledged
> Can we bump up the shards instead? GPU tests might not be able to so this might be necessary still

Will take a look! Many thanks for the suggestion :)
Acknowledged
`'--timeout-multiplier',`

Hey @bpas...@chromium.org
I just tested this CL. It works, though I'd like to ask you for some tips/review on this.
I added this flag, but we already have that option [1]. The reason is that when we call the function `command_line.add_testing_options_group(parser)` [2], it already has `rwt` set to True:
```
def add_testing_options_group(... rwt: bool = True):
```
What would you recommend in this case?
P.S. testing worked, you can take a look here: https://chromium-swarm.appspot.com/task?d=true&id=76508d3086869810
References:
By the way, I'd like to see if there's a way we can do this, or should I rename the flag? I don't want any _conflicts_ with the existing flag.
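For context on the conflict concern, here is a small Python sketch of what happens when the same option string is registered twice. It assumes an `argparse`-based parser; the group below is only a stand-in for whatever `add_testing_options_group` registers, and the helper `add_option_if_missing` is hypothetical.

```python
import argparse


def add_option_if_missing(parser, flag, **kwargs):
    """Add `flag` unless the parser already defines it. Returns True if added."""
    try:
        parser.add_argument(flag, **kwargs)
        return True
    except argparse.ArgumentError:
        # With the default conflict handler, argparse raises ArgumentError
        # when an option string is already registered.
        return False


parser = argparse.ArgumentParser()
# Stand-in for an option group that already registers the flag.
group = parser.add_argument_group("testing options")
group.add_argument("--timeout-multiplier", type=float, default=None)

added = add_option_if_missing(parser, "--timeout-multiplier", type=float)
args = parser.parse_args(["--timeout-multiplier", "3"])
print(added, args.timeout_multiplier)
```

If the existing option already carries the needed meaning, reusing it avoids both the conflict and a rename; a new name is only necessary when the semantics differ.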