
dan...@chromium.org

unread,
May 13, 2019, 2:16:22 PM5/13/19
to blink-dev, chromium-dev, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov

Subject: Proposal: Disable Android web_tests and fix Android browser_tests

Dear Chromium-dev and blink-dev,


tl;dr: webkit_layout_tests are largely disabled on Android; we should remove web tests from the Android bots. Android browser_tests are not currently compiled and cannot run, but Team Bubblesort is actively hacking to fix that.


There's been some background chatter recently about apparent holes in our testing infrastructure on Android relating to web_tests and browser_tests.


For webkit_layout_tests, the WebTest harness needs work before it will run on Android M+, leaving the bots stuck on KitKat. Even in this setup, most tests are disabled, with a steady stream of additional tests being disabled over time. This means we're mostly just testing the WebTest harness itself. Much of the actual coverage is provided by other platforms, with Android-specific behaviours covered via pixel tests on the high-DPI MacOS bots and by the virtual/android tests on Linux. As such, we recommend explicitly deciding to disable the Android bots and handling web_tests coverage on other platforms. See "Current State of Android Web Tests" below for more details.


For browser_tests, the target does not get built on Android, meaning (a) lost coverage for some features, (b) some teams end up putting Android functionality into the Linux build just to test it, and (c) some teams have written test code for browser_tests under OS_ANDROID expecting coverage, only for it to never even be compiled. creis@ started a doc to track the impact. There are over 11,000 browser_tests (taken from examining Linux build logs). While it's unclear exactly how many existing tests are applicable to Android, we estimate that thousands could eventually be enabled. More critically, there are tests we know we want on Android but currently cannot write. Because of this, we're now actively building on top of jbudorick's work on Android browser_tests, with the hope of getting a usable harness on the waterfall this quarter.
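
For the curious, the gap is at the build level: browser_tests is guarded out of the Android build in GN, so test code under OS_ANDROID there never even compiles. A rough sketch of the kind of guard involved (illustrative, not the actual build file contents):

  # chrome/test/BUILD.gn (sketch)
  if (!is_android) {
    test("browser_tests") {
      # sources, deps, etc.
    }
  }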


If there are any questions or concerns -- especially about the proposal to explicitly not support web_tests on Android -- please chime in on this thread! It's a controversial choice, so we're taking a stance to start a discussion.


Yours truly,

Team Bubblesort  (danakj, dcheng, ajwong)



Current State of Android Web Tests


-- General stats --

There is currently a single Android KitKat bot running web tests. It only runs:

  • 969 tests, which is ~1% of the 88,689 tests run on Linux;

  • of these, 161 are marked as failures in TestExpectations;

  • 121 are expected failures on the bot.

This means only ~1% of the layout test set is run. As such, the bot is mostly testing that the Android web test harness works.


-- What is being tested --


For the tests that are run, the types included are very limited. There is no pixel test coverage in the Android web tests. Android pixel web tests stopped working when WebTests were switched to use Viz; however, even before this there were already zero pixel tests passing, as evidenced when the legacy compositing mode was ripped out.


-- Where is equivalent coverage --

There is a virtual/android/ test suite which runs on Linux, using command-line parameters to make Blink behave the way it does on Android. These do include pixel tests and seem sufficient to cover the functionality gap. The suite currently comprises 100 tests that specifically target per-platform differences on Android.
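
For reference, virtual suites are declared in web_tests/VirtualTestSuites, which maps a prefix and base directory onto extra command-line flags. An entry has this shape (the base and flag here are illustrative, not the real android ones):

  {
    "prefix": "android",
    "base": "fast/scrolling",
    "args": ["--some-android-emulation-flag"]
  },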


Speaking with the rendering team, the primary differences to be tested on Android are platform differences such as A) high-DPI and B) scrolling behaviour. Examining these two cases:


  1. High-DPI is tested by the MacOS bots as well, providing good pixel test coverage for this feature in Blink painting. There is no Android-specific behaviour that differs from what we also test on MacOS.

  2. Scrolling behaviour differences are tested via the virtual/android/ test suite on Linux already, and are not run on Android.


Thus the use of the virtual/android tests and the high-DPI tests on MacOS provides sufficient coverage for the rendering team.


-- What else is there? --

The set of tests that run on the Android bot is listed in web_tests/SmokeTests. These are either text-only tests or tests marked as failures in TestExpectations, and they do not test useful Android-specific behaviours in Blink.


Examples:

  • Tests under compositing/ are text-only tests, with expectations that differ in the size of the content layer because root layer scrolling is enabled on Android. However, the feature is actually tested by the virtual/android/rootscroller tests, which run on Linux.

  • The single plugins/ test is not marked as Failure in TestExpectations, but it just verifies that the test *fails* regardless.

  • The fast/beacon/beacon-basic.html test does have a difference on Android, but it appears to be testing a difference in the WebTest harness: on Linux the test is loaded from file:// (and so doesn't accept http:), but on Android (and Fuchsia) it does accept it, seemingly because the file is loaded from http:// instead.

  • The crypto/random-values.html test has a difference on Android, which is just that SharedArrayBuffer is not available, so that result is missing from the expectations. The same difference is repeated in other tests such as beacon-basic.html. This is not a useful thing to be testing.


After going through the Android expectations directory, I (danakj) am unable to find a useful platform-specific expectation.


Albert J. Wong (王重傑)

unread,
May 13, 2019, 2:22:54 PM5/13/19
to Dana Jansens (Google Drive), blink-dev, chromium-dev, Daniel Cheng, Nasko Oskov

[ dangerously attempting to change subject... please reply to THIS email. Otherwise, I think the null-subject will cause double or triple delivery via groups ]

Yang Zhang

unread,
May 13, 2019, 6:56:07 PM5/13/19
to Chromium-dev, dan...@chromium.org, blin...@chromium.org, dch...@chromium.org, na...@chromium.org
Thanks for looking at these. 
May I ask why we aren't fixing the existing layout tests and also adding browser_tests?

Dirk Pranke

unread,
May 13, 2019, 8:40:17 PM5/13/19
to Dana Jansens, blink-dev, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu
As noted in the other reply I just sent, I feel like you're raising two issues that are best dealt with separately, so I'm splitting my replies and sending the one about the web_tests just to blink-dev (and bcc'ing chromium-dev ...).

Also, +Yang Zhang, +Erik Staab, and +Yihong Gu to try and make sure some of the EngProd folks are aware of this discussion.

As someone who spent some large chunk of time trying to get the web_tests running on Android on the bots reasonably well years ago, I have mixed feelings about this proposal.

I guess I'm reluctantly okay with you turning this off, though I am worried that doing so will make it even harder to bring them back and run on newer versions.

I have additional thoughts that might be interesting to some ...

First, I'm biased, but I'd be inclined to argue that the web_tests are the most important test suite we have in Chrome, and Android is actually our most-used platform. So, having it not run there seems like a bad thing.

But, pragmatically, we also simply don't have the hardware to run all of the tests with any sort of frequency (much like the point I just made about potentially running browser_tests), and that's not likely to change soon.

The point of the SmokeTests was to try and get some sort of broad but shallow coverage of the test suite. I don't actually know if I ever achieved that, but I also don't know that I didn't. I also don't know if that was true at one point but is no longer true. I'm not sure what the state of code coverage on Android is, but it sure would be interesting to try and pull numbers to check. It was my hope that people would add the tests that are platform-specific to it over time, but that hasn't much happened.

However, given that we don't know, to say that the tests are only testing the harness feels a bit unnecessarily harsh to me :). That said, I've also never heard of us hitting an actual Android-specific bug when running that suite, so I also couldn't argue that it is a valuable suite to run at all.

And, certainly given that no one seems to be willing to invest in making it work on newer versions of Android, that also suggests that the test suite isn't that valuable.

It also suggests that much of the test suite is generic, and there's simply no need to run most of the tests on multiple platforms. The fact that we took web_tests out of the CQ on Mac and the world hasn't fallen over (indeed, I'm not sure if we see much in the way of an increased number of Mac failures in the suite on the waterfall at all) also suggests that.

As I noted in my browser_tests reply, we're increasingly living in a world of constrained budgets and we need to be thinking more and more about only running tests that find issues not found elsewhere, and that means that we're probably more likely to repeat that exercise with more test suites on more platforms in the future, to free up scarce hardware resources for testing things that actually need to be tested on them.

In my ideal world, I'd have someone sign up to make the test suite actually work on M+, and we'd have someone focusing on running just the tests that are testing Android-specific stuff and maybe a sanity check of some subset of the rest a la SmokeTests.

Side note: you wrote:
> The single plugins/ test is not marked as Failure in TestExpectations, but it just verifies that the test *fails* regardless.

That's actually the *correct* way to handle a test that is expected to fail. web_tests are change-detector tests as much as they are compliance tests. Marking a test as `Failure` means we lose the change-detection property. The obvious downside to this "correct" approach is that you then can't tell which things are actually failing. I wanted to fix that at some point (e.g., with an -expected-failure.txt or something) but could never come up with an approach that seemed better enough to be worth it. An alternative would be to WontFix that directory, since we don't expect plugins to run at all on Android, but that loses coverage. But, this is something of a pedantic point these days, as clearly we don't and probably never will do this consistently one way or another.
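
To make that concrete, with made-up names: the "correct" approach checks the failing output in as the baseline,

  plugins/foo.html
  plugins/foo-expected.txt   <- records the known-bad output, so any *change* in the failure is caught

while the TestExpectations approach is a line like

  [ Android ] plugins/foo.html [ Failure ]

which turns any failure into a pass and loses the change-detection property.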

I don't actually understand your comments about why the platform-specific crypto results showing that SAB is missing isn't a useful thing to be testing, or what's wrong with the fact that the compositing/ tests have platform-specific expectations because they actually have platform-specific behavior, but I'm not sure either of these topics is all that important, either.

-- Dirk


Yang Zhang

unread,
May 13, 2019, 9:05:34 PM5/13/19
to Dirk Pranke, Dana Jansens, blink-dev, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Erik Staab, Yihong Gu
Thanks Dirk for following up on this! 

Like I replied previously, "why can't we fix the existing tests and add browser_tests" instead of "turning this off"?
I understand we need to use resources effectively; still, if "webkit_layout_tests" has value not duplicated by other tests, we'd better have that coverage.
Also, can we invest more in unit tests that are platform-independent, so they don't need to run on all the platforms?




Dirk Pranke

unread,
May 13, 2019, 9:15:34 PM5/13/19
to Yang Zhang, Dana Jansens, blink-dev, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Erik Staab, Yihong Gu

Thanks Dirk for following up on this! 

Like I replied previously, "why can't we fix the existing tests and add browser_tests" instead of "turning this off"?

We can do this, but historically it's been very hard to get someone to spend the required amount of time to make sure the web_tests keep running. I encourage you to talk to the web platform leads about changing that :).
 
I understand we need to use resources effectively; still, if "webkit_layout_tests" has value not duplicated by other tests, we'd better have that coverage.

Agreed, all other things being equal. But, tests are of limited value if they're not properly maintained and supported.
 
Also, can we invest more in unit tests that are platform-independent, so they don't need to run on all the platforms?

I'm not sure what you're referring to here. Are you asking for more platform-specific unit tests? If so, I generally agree that we should favor unit tests over browser_tests (as noted in the other thread).

-- Dirk

Yang Zhang

unread,
May 14, 2019, 12:07:10 AM5/14/19
to Dirk Pranke, Dana Jansens, blink-dev, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Erik Staab, Yihong Gu
We can do this, but historically it's been very hard to get someone to spend the required amount of time to make sure the web_tests keep running. I encourage you to talk to the web platform leads about changing that :).
- Thanks, will follow up.

I'm not sure what you're referring to here. Are you asking for more platform-specific unit tests? If so, I generally agree that we should favor unit tests over browser_tests (as noted in the other thread).
- Yes, that's correct.

dan...@chromium.org

unread,
May 14, 2019, 11:36:00 AM5/14/19
to Dirk Pranke, blink-dev, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu
Thanks for your thoughts on this Dirk.

On Mon, May 13, 2019 at 8:40 PM Dirk Pranke <dpr...@chromium.org> wrote:
As noted in the other reply I just sent, I feel like you're raising two issues that are best dealt with separately, so I'm splitting my replies and sending the one about the web_tests just to blink-dev (and bcc'ing chromium-dev ...).

Also, +Yang Zhang, +Erik Staab, and +Yihong Gu to try and make sure some of the EngProd folks are aware of this discussion.

As someone who spent some large chunk of time trying to get the web_tests running on Android on the bots reasonably well years ago, I have mixed feelings about this proposal.

I understand. I appreciate the effort you spent to get them running well. I think we have evolved how we're testing blink for Android in the meantime, both with more unit tests and with the virtual/android/ test suite running on Linux.
 
I guess I'm reluctantly okay with you turning this off, though I am worried that doing so will make it even harder to bring them back and run on newer versions.

It may; at this point the harness has bitrotted so much that it will be a lot of work either way.
 
I have additional thoughts that might be interesting to some ...

First, I'm biased, but I'd be inclined to argue that the web_tests are the most important test suite we have in Chrome, and Android is actually our most-used platform. So, having it not run there seems like a bad thing.

This was my original argument as well. I came into this work with the intention to get them working on M and get more tests turned on. But I talked to rendering folks and looked through the tests that are running on the bots, and was unable to justify actually spending that effort. I tried to detail why, though I think some things weren't clear based on the questions, which I will try to answer below.
 
But, pragmatically, we also simply don't have the hardware to run all of the tests with any sort of frequency (much like the point I just made about potentially running browser_tests), and that's not likely to change soon.

The cost-to-value ratio of this test suite seems very bad.
 
The point of the SmokeTests was to try and get some sort of broad but shallow coverage of the test suite. I don't actually know if I ever achieved that, but I also don't know that I didn't. I also don't know if that was true at one point but is no longer true. I'm not sure what the state of code coverage on Android is, but it sure would be interesting to try and pull numbers to check. It was my hope that people would add the tests that are platform-specific to it over time, but that hasn't much happened.

Looking through the SmokeTests I was unable to find a single test that was actually testing something Android-specific. There are Android-specific expectations, but none of them were interesting from a testing/correctness perspective. More below.
 
However, given that we don't know, to say that the tests are only testing the harness feels a bit unnecessarily harsh to me :). That said, I've also never heard of us hitting an actual Android-specific bug when running that suite, so I also couldn't argue that it is a valuable suite to run at all.

Sorry, I don't mean to be harsh. I only mean that when looking at the Android-specific expectations, the differences I see are due to the harness there, not due to "Android". Anything Android-specific in Blink seems to be triggered by command lines, not by OS_ANDROID checks, and thus we already have the ability to test it via the virtual/android/ suite.
 
And, certainly given that no one seems to be willing to invest in making it work on newer versions of Android, that also suggests that the test suite isn't that valuable.

The rate at which they seem to break with Android releases increases the costs here, but even without it, I have trouble justifying keeping these tests going on Android.
 
It also suggests that much of the test suite is generic, and there's simply no need to run most of the tests on multiple platforms. The fact that we took web_tests out of the CQ on Mac and the world hasn't fallen over (indeed, I'm not sure if we see much in the way of an increased number of Mac failures in the suite on the waterfall at all) also suggests that.

Mac is actually one of the ways we do test Android-relevant behaviour, or platform-specific behaviour, as it tests our high-dpi code paths. I don't think we have win/linux bots hitting those.
 
As I noted in my browser_tests reply, we're increasingly living in a world of constrained budgets and we need to be thinking more and more about only running tests that find issues not found elsewhere, and that means that we're probably more likely to repeat that exercise with more test suites on more platforms in the future, to free up scarce hardware resources for testing things that actually need to be tested on them.

As Albert suggested, we should consider using the crow emulator for helping with hardware constraints. In this case I think the primary constraint is engineering cost - and lack of return on investment.
 
In my ideal world, I'd have someone sign up to make the test suite actually work on M+, and we'd have someone focusing on running just the tests that are testing Android-specific stuff and maybe a sanity check of some subset of the rest a la SmokeTests.

Right, that was my original intention with this adventure. So I would say it is ideal as well, but I can't currently see a way to justify any engineering time on these. My hope with this email was that maybe I could be shown wrong and folx would have some strong ideas about return on investment for working on these tests for Android.
 
Side note: you wrote:
> The single plugins/ test is not marked as Failure in TestExpectations, but it just verifies that the test *fails* regardless.

That's actually the *correct* way to handle a test that is expected to fail. web_tests are change-detector tests as much as they are compliance tests. Marking a test as `Failure` means we lose the change-detection property. The obvious downside to this "correct" approach is that you then can't tell which things are actually failing. I wanted to fix that at some point (e.g., with an -expected-failure.txt or something) but could never come up with an approach that seemed better enough to be worth it. An alternative would be to WontFix that directory, since we don't expect plugins to run at all on Android, but that loses coverage. But, this is something of a pedantic point these days, as clearly we don't and probably never will do this consistently one way or another.

Noted. What I wanted to emphasize here was that of the ~600 tests that run, more than 100 are marked as failures in TestExpectations, while other tests in the suite are also testing for failures via their expectation results. This doesn't seem like a high-value argument for spending time/money/opportunity cost on maintaining this.
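
For readers who haven't touched these files, a "marked failure" is a TestExpectations line like the following (bug and path made up):

  crbug.com/123456 [ Android ] fast/dom/example.html [ Failure ]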
 
I don't actually understand your comments about why the platform-specific crypto results showing that SAB is missing isn't a useful thing to be testing, or what's wrong with the fact that the compositing/ tests have platform-specific expectations because they actually have platform-specific behavior, but I'm not sure either of these topics is all that important, either.

Sorry to be unclear here. My intent was to say that in the crypto case, the missing SAB appears to be an artifact of how the test harness runs tests on Android, not of the OS/platform differences of running Chrome/Blink there. And in the compositing case, yes, there is a layer size difference, but that does not appear to be useful - it is recorded in expectations as a side effect, not as the intent of the tests. The feature causing that side effect has its own set of tests, which we *don't* run on Android (they are in virtual/android/ instead), which kinda underscores to me the lack of value the suite is providing on Android bots.

I tried to find some tests that were providing Android-specific value in the test suite but was unable to do so; everything appears to be either testable via virtual/android/ (and already tested there more explicitly), failing entirely, or testing generic code, sometimes with side effects from Android-web-test-harness differences rather than Android-Chrome differences.

Cheers,
Dana

Dirk Pranke

unread,
May 14, 2019, 12:00:42 PM5/14/19
to Dana Jansens, blink-dev, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu
Thanks, Dana. It sounds like we're in agreement: it seems like running the tests on Android *should* provide value, but it's hard to see that they actually do. So, it makes sense to turn them off to reduce some of the complexity in the system until at least we get to a point where they're better maintained and more clearly targeted at high-value tests.

I'm still not following one part of your response, though: I'm not understanding your comments about not seeing tests for OS-specific differences.

For example, isn't the plugins test needing to fail only on Android an OS-specific difference?

Also, I'm pretty sure the webaudio tests are testing OS-specific implementations (or at least they used to), and I thought some of the media tests were doing so as well.

Are you saying those would also be covered by the virtual/android tests? Or something else?

-- Dirk

dan...@chromium.org

unread,
May 14, 2019, 12:28:16 PM5/14/19
to Dirk Pranke, blink-dev, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu

Thanks, Dana. It sounds like we're in agreement: it seems like running the tests on Android *should* provide value, but it's hard to see that they actually do. So, it makes sense to turn them off to reduce some of the complexity in the system until at least we get to a point where they're better maintained and more clearly targeted at high-value tests.

I'm still not following one part of your response, though: I'm not understanding your comments about not seeing tests for OS-specific differences.

For example, isn't the plugins test needing to fail only on Android an OS-specific difference?

It's an ENABLE_PLUGINS difference, I'd say, though the test is failing in a way that could be caused by any number of problems. It is not testing "there are no plugins on Android", which is a property of content, not of Blink, anyhow, right? A content unittest would be better suited for that, I believe. It seems we mostly disable plugin tests on Android. However I did find this test that seems to cover this need?
 
Also, I'm pretty sure the webaudio tests are testing OS-specific implementations (or at least they used to), and I thought some of the media tests were doing so as well.

There are two platform-specific expectations in webaudio. One has text differences around ArrayBufferView being shared and throwing an exception; I get the same result on Linux when I run it in Chrome, but the Linux web test runner seems to avoid it. This may be an Android-OS difference, but it looks more like the test environment to me. Can you confirm?

The only other platform difference is a different binary result for one resampled codec test. I didn't think this was part of Blink, and it would be covered by media unit tests and such elsewhere, but I may be mistaken?

There are no Android-specific media results in here, so I'm not sure what else you're thinking about. The webmidi differences are again the shared ArrayBufferView causing an exception.

Thanks,
Dana

Dirk Pranke

unread,
May 14, 2019, 12:35:15 PM5/14/19
to Dana Jansens, blink-dev, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu
For example, isn't the plugins test needing to fail only on Android an OS-specific difference?

It's an ENABLE_PLUGINS difference, I'd say, though the test is failing in a way that could be caused by any number of problems. It is not testing "there are no plugins on Android", which is a property of content, not of Blink, anyhow, right? A content unittest would be better suited for that, I believe. It seems we mostly disable plugin tests on Android. However I did find this test that seems to cover this need?

Yeah, I think this is one of those things where you try and decide whether having a unit test is sufficient or whether you really also want a functional (or integration) test.
 
 
Also, I'm pretty sure the webaudio tests are testing OS-specific implementations (or at least they used to), and I thought some of the media tests were doing so as well.

There are two platform-specific expectations in webaudio. One has text differences around ArrayBufferView being shared and throwing an exception; I get the same result on Linux when I run it in Chrome, but the Linux web test runner seems to avoid it. This may be an Android-OS difference, but it looks more like the test environment to me. Can you confirm?

The only other platform difference is a different binary result for one resampled codec test. I didn't think this was part of Blink, and it would be covered by media unit tests and such elsewhere, but I may be mistaken?

I think this is my confusion. You seem to be talking about different results, if I'm understanding you correctly, but I'm talking about different implementations (different code paths) that may produce the same result. If that's the case, you still should test both, right?

-- Dirk

Emil A Eklund

unread,
May 14, 2019, 12:42:53 PM5/14/19
to Dana Jansens, Dirk Pranke, blink-dev, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu
> Mac is actually one of the ways we do test Android-relevant behaviour, or platform-specific behaviour, as it tests our high-dpi code paths. I don't think we have win/linux bots hitting those.

We probably should add a Win10 high-DPI bot at some point, as we have more Windows users with high-DPI than Mac these days. That's beside the point though.

From a rendering perspective we haven't really seen too many Android-specific rendering issues, and the ones we do see are covered fairly well by virtual/android.

The areas where we have seen problems in the past have been around OS integration, things like media controls and fullscreen, and those aren't well covered by web_tests.

dan...@chromium.org

unread,
May 14, 2019, 1:35:57 PM5/14/19
to Dirk Pranke, Dale Curtis, blink-dev, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu
I think this is my confusion. You seem to be talking about different results, if I'm understanding you correctly, but I'm talking about different implementations (different code paths) that may produce the same result. If that's the case, you still should test both, right?

I agree with that, yes. My understanding has been that different code paths in Blink are run-time switchable and thus testable via virtual/android/. My further understanding is that different code paths for media are tested by media tests, not by web tests; web tests are for testing the bindings and hook-up between Blink and media code, which are not platform-specific.

It sounds like media is the one place you have concerns, and we can perhaps get some guidance/input there to help resolve that? +Dale Curtis

Raymond Toy

unread,
May 14, 2019, 1:44:08 PM5/14/19
to Dana Jansens, Dirk Pranke, blink-dev, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu
Also, I'm pretty sure the webaudio tests are testing OS-specific implementations (or at least they used to), and I thought some of the media tests were doing so as well.

There are two platform-specific expectations in webaudio. One has text differences around ArrayBufferView being shared and throwing an exception; I get the same result on Linux when I run it in Chrome, but the Linux web test runner seems to avoid it. This may be an Android-OS difference, but it looks more like the test environment to me. Can you confirm?

You mean the codec-tests and dom-exceptions test?  I had forgotten we had these and I don't remember why they're different.  The codec-tests used to be very different on Android, but since Project Spitzer, Android uses ffmpeg for decoding just like desktop.

I'll have to investigate why we still need these.

But fundamentally, webaudio on Android uses a different FFT implementation, along with a bunch of arm simd optimizations so these could produce slightly different results from desktop.  I haven't been able to run the layout tests on Android locally for a long time, so I would not be surprised if there are now new differences between Android and desktop.

 

Robert Ma

unread,
May 14, 2019, 6:33:32 PM5/14/19
to blink-dev, dpr...@chromium.org, dalec...@chromium.org, ajw...@chromium.org, dch...@chromium.org, na...@chromium.org, yz...@google.com, est...@google.com, yih...@google.com, Philip Jägenstedt
FWIW, I just checked all external/wpt tests enabled on Android, and many results are wildly wrong -- WPT should never dump the layout tree as the output. We hadn't been running web_tests for a long time until John Budorick fixed https://crbug.com/824539. The tests (or rather, the harness) have apparently all bitrotted. +foolip

Marijn Kruisselbrink

unread,
May 14, 2019, 6:43:58 PM5/14/19
to Robert Ma, blink-dev, Dirk Pranke, Dale Curtis, ajw...@chromium.org, Daniel Cheng, Nasko Oskov, yz...@google.com, est...@google.com, yih...@google.com, Philip Jägenstedt
They must have bitrotted long before 824539 was fixed as well, as https://crbug.com/755698 was filed for at least one of the broken WPT tests...


John Budorick

unread,
May 14, 2019, 6:54:43 PM5/14/19
to Robert Ma, blink-dev, Dirk Pranke, dalec...@chromium.org, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu, Philip Jägenstedt

FWIW, I just checked all external/wpt tests enabled on Android, and many results are wildly wrong -- WPT should never dump the layout tree as the output. We hadn't been running web_tests for a long time until John Budorick fixed https://crbug.com/824539. The tests (or rather, the harness) have apparently all bitrotted. +foolip

The harness had bitrotten after a few months of failing green. Some of the tests bitrotted and may have stayed that way.
 

Thanks, Dana. It sounds like we're in agreement: it seems like running the tests on Android *should* provide value, but it's hard to see that they actually do. So, it makes sense to turn them off to reduce some of the complexity in the system until at least we get to a point where they're better maintained and more clearly targeted at high-value tests.

This gets to a distinction in the course forward. If y'all are ok with permanently scrapping the current Android layout test harness, then turning off the suite seems fine. If y'all are interested in maintaining / targeting the tests better at some point in the near- to mid-term future, then turning off the suite seems counterproductive, as the harness will probably rot (again) relatively quickly.



Robert Ma

unread,
May 15, 2019, 11:24:24 AM5/15/19
to John Budorick, blink-dev, Dirk Pranke, dalec...@chromium.org, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu, Philip Jägenstedt, ser...@chromium.org

The harness had bitrotten after a few months of failing green. Some of the tests bitrotted and may have stayed that way.

Yes, that's what I meant, and I've been aware of the issue for a while but haven't had time to look into it. Sorry if my phrasing wasn't clear.

Also, it just occurred to me that Fuchsia also runs the web tests listed in SmokeTests, although I don't think they are using any Android-specific harness (only the filtering & plumbing in blinkpy). +sergeyu just in case.

Łukasz Anforowicz

unread,
May 15, 2019, 12:55:13 PM5/15/19
to blink-dev, dan...@chromium.org, chromi...@chromium.org, dch...@chromium.org, na...@chromium.org
You are probably already aware of this, but please note that the main reason for the not_site_per_process_webkit_layout_tests step on the linux-rel CQ/waterfall bot is to have test coverage of Android-specific Site Isolation behavior. Maybe this test step should be mutated into android_simulation_webkit_layout_tests? I am not sure how the virtual/android test suite fits here - it seems to cover only a subset of tests, and it seems to inject Android-specific cmdline flags into the webkit_layout_tests step on *every* bot/platform?

dan...@chromium.org

unread,
May 15, 2019, 2:38:52 PM5/15/19
to Łukasz Anforowicz, blink-dev, chromium-dev, Daniel Cheng, Nasko Oskov
On Wed, May 15, 2019 at 12:55 PM Łukasz Anforowicz <luk...@chromium.org> wrote:
You are probably already aware of this, but please note that the main reason for the not_site_per_process_webkit_layout_tests step on the linux-rel CQ/waterfall bot is to have test coverage of Android-specific Site Isolation behavior. Maybe this test step should be mutated into android_simulation_webkit_layout_tests? I am not sure how the virtual/android test suite fits here - it seems to cover only a subset of tests, and it seems to inject Android-specific cmdline flags into the webkit_layout_tests step on *every* bot/platform?

I did not realize this was meant to simulate Android specifically; that's interesting. There is also a not-site-per-process virtual test suite, which is defined to include a large number of different tests. It uses the flag --disable-site-isolation-trials, which looks the same as not_site_per_process_webkit_layout_tests. I'm not sure if --disable-blink-features=LayoutNG is also part of the "emulate Android" goal there, but it is not part of the virtual test suite. So it seems like we have some redundant coverage?

The not_site_per_process_webkit_layout_tests step would not have a way to disable tests without also disabling them for the regular test run, I think? Whereas the virtual suite does. It seems like the goal of the not_site_per_process_webkit_layout_tests target could also be achieved with a virtual test suite, replacing the current definitions with something like

  {
    "prefix": "not-site-per-process",
    "base": ".",
    "args": ["--disable-site-isolation-trials"]
  },

I'm not sure that the harness supports a virtual test suite that includes everything like that, but it could be made to if not. And the virtual tests could be marked as [ Skip ] for all platforms but Linux to avoid redundant testing.
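
That is, something like the following line in TestExpectations (illustrative, and assuming directory entries work here as they do elsewhere):

  [ Mac Win ] virtual/not-site-per-process [ Skip ]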

Either way though, thanks for bringing up yet another way we test Android outside of the Android bots. This repeats the pattern that Android differences inside of Blink are controlled by command line, so they can be tested on Linux.

Marijn Kruisselbrink

unread,
May 15, 2019, 2:43:15 PM5/15/19
to Dana Jansens, Łukasz Anforowicz, blink-dev, chromium-dev, Daniel Cheng, Nasko Oskov
On Wed, May 15, 2019 at 11:38 AM <dan...@chromium.org> wrote:

The not_site_per_process_webkit_layout_tests step would not have a way to disable tests without also disabling them for the regular test run, I think?
Isn't that what third_party/blink/web_tests/flag-specific/ and third_party/blink/web_tests/FlagExpectations/ are for?

dan...@chromium.org

unread,
May 15, 2019, 2:49:53 PM5/15/19
to Marijn Kruisselbrink, Łukasz Anforowicz, blink-dev, chromium-dev, Daniel Cheng, Nasko Oskov
Thanks, it looks like yes; I had never seen that before. So it seems functionally similar to the virtual test suite, and perhaps they are just redundant. Thanks for pointing it out :)
 


Steve Kobes

unread,
May 15, 2019, 3:10:25 PM5/15/19
to dan...@chromium.org, Marijn Kruisselbrink, Łukasz Anforowicz, blink-dev, chromium-dev, Daniel Cheng, Nasko Oskov

FlagExpectations and flag-specific were added to support dedicated bots running the entire set of layout tests with a certain flag.  Virtual test suites run on CQ/waterfall which doesn't scale to (all layout tests) * (n flags).

But if you are willing to force every layout-test-running CQ/waterfall bot to also run the flag with a special test step, I guess you might as well use virtual test suites.
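
Concretely, if memory serves, the layout is: flag-only expectations live in web_tests/FlagExpectations/<flag-name> and flag-specific baselines under web_tests/flag-specific/<flag-name>/, where <flag-name> is the flag without the leading dashes, e.g.:

  web_tests/FlagExpectations/disable-site-isolation-trials
  web_tests/flag-specific/disable-site-isolation-trials/...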

Łukasz Anforowicz

unread,
May 15, 2019, 3:17:25 PM5/15/19
to Steve Kobes, Dana Jansens, Marijn Kruisselbrink, blink-dev, chromium-dev, Daniel Cheng, Nasko Oskov
If we had a virtual/android (or virtual/not-site-per-process) that covered *all* layout tests, then every test failure/flake that affects both modes (the default mode and the android / not-site-per-process mode) would have to be duplicated. In other words, the not_site_per_process_webkit_layout_tests step inherits the test expectations of the default mode, but a virtual test suite does not. This is desirable in some cases and not in others (which is the reason why we have both not_site_per_process_webkit_layout_tests [main coverage] and virtual/not-site-per-process [to cover a handful of tests with diverging test expectations]).
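
Concretely, for a hypothetical test that flakes in both modes, the all-virtual approach needs two expectation lines where the extra-step approach needs one:

  crbug.com/111111 fast/dom/foo.html [ Failure ]
  crbug.com/111111 virtual/not-site-per-process/fast/dom/foo.html [ Failure ]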

Steve Kobes

unread,
May 15, 2019, 3:22:57 PM5/15/19
to Łukasz Anforowicz, Dana Jansens, Marijn Kruisselbrink, blink-dev, chromium-dev, Daniel Cheng, Nasko Oskov
Ah good point.  I continually forget that virtual test suites don't inherit the non-virtual expectations.

dan...@chromium.org

unread,
May 15, 2019, 3:26:24 PM5/15/19
to Steve Kobes, Łukasz Anforowicz, Marijn Kruisselbrink, blink-dev, chromium-dev, Daniel Cheng, Nasko Oskov
On Wed, May 15, 2019 at 3:22 PM Steve Kobes <sko...@chromium.org> wrote:
Ah good point.  I continually forget that virtual test suites don't inherit the non-virtual expectations.

On Wed, 15 May 2019 at 15:17, Łukasz Anforowicz <luk...@chromium.org> wrote:
If we had a virtual/android (or virtual/not-site-per-process) that covered *all* layout tests, then every test failure/flake that affects both modes (the default mode and the android / not-site-per-process mode) would have to be duplicated. In other words, the not_site_per_process_webkit_layout_tests step inherits the test expectations of the default mode, but a virtual test suite does not. This is desirable in some cases and not in others (which is the reason why we have both not_site_per_process_webkit_layout_tests [main coverage] and virtual/not-site-per-process [to cover a handful of tests with diverging test expectations]).

Thanks for the clarification! This is pretty obscure, and maybe not ideal to diverge like that. But for the purposes of this thread it has been super helpful to know we're getting Android test coverage on Linux for the whole test suite wrt Site Isolation. Renaming the suite seems like a pretty reasonable thing to do, especially if it is going to grow more flags in the future (like dropping LayoutNG currently?).

Thanks!
Dana

Christian Biesinger

unread,
May 15, 2019, 3:29:35 PM5/15/19
to Dana Jansens, Steve Kobes, Łukasz Anforowicz, Marijn Kruisselbrink, blink-dev, chromium-dev, Daniel Cheng, Nasko Oskov

> Thanks for the clarification! This is pretty obscure, and maybe not ideal to diverge like that. But for the purposes of this thread it has been super helpful to know we're getting Android test coverage on Linux for the whole test suite wrt Site Isolation. Renaming the suite seems like a pretty reasonable thing to do, especially if it is going to grow more flags in the future (like dropping LayoutNG currently?).

Dropping LayoutNG is a temporary thing; we are enabling that flag incrementally so we don't have to deal with all bots' failures at the same time.

Christian


Chris Cunningham

unread,
May 15, 2019, 3:34:18 PM5/15/19
to Robert Ma, John Budorick, blink-dev, Dirk Pranke, Dale Curtis, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu, Philip Jägenstedt, ser...@chromium.org, Matthew Wolenetz, libe...@chromium.org
> The point of the SmokeTests was to try and get some sort of broad but shallow coverage of the test suite. 
This is how the media/ team has used it.

> Looking through the SmokeTests I was unable to find a single test that was actually testing something Android-specific.
These media tests do something very Android-specific.

They use a real GPU on the bot to test hardware-accelerated decode paths. Unit testing this platform- and hardware-specific code has historically been infeasible. I believe these layout tests are the only test coverage we have of these code paths.

> virtual/android/ test suite running on Linux.
I'm not familiar with this. Can someone provide some pointers? I expect (but should confirm) that this doesn't have enough virtualization to emulate Android GPU video decode.

Chris


dan...@chromium.org

unread,
May 15, 2019, 3:50:57 PM5/15/19
to Chris Cunningham, Robert Ma, John Budorick, blink-dev, Dirk Pranke, Dale Curtis, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu, Philip Jägenstedt, Sergey Ulanov, Matthew Wolenetz, Frank Liberato
On Wed, May 15, 2019 at 3:34 PM Chris Cunningham <chcunn...@chromium.org> wrote:
+libe...@chromium.org, +wole...@chromium.org

> The point of the SmokeTests was to try and get some sort of broad but shallow coverage of the test suite. 
This is how the media/ team has used it.

> Looking through the SmokeTests I was unable to find a single test that was actually testing something Android-specific.
These media tests do something very Android-specific.

Thanks for the pointers! This seems to be a pretty limited number of tests. Does testing HW-accelerated video decode need to be done via web_tests, or would it work just as well as content_browsertests instead? Those can make full use of the GPU on all platforms as desired. I understand these are external/wpt/ tests, but as an outsider it's hard to tell how much of what they test is Android-specific, and whether a more targeted hardware-decode test would make sense as a content_browsertest. Those tests can load HTML and check whether the loaded page is happy or not, and a cursory look at the tests doesn't suggest they require the testRunner or other web_tests machinery.
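To make that concrete, here's the rough shape I have in mind. This is just a sketch; the class name, test page, and JS hook are all made up:

    // Sketch of a content_browsertest that loads a test page on real
    // hardware and asks the page how things went. Illustrative only.
    #include "content/public/test/browser_test_utils.h"
    #include "content/public/test/content_browser_test.h"
    #include "content/public/test/content_browser_test_utils.h"
    #include "content/shell/browser/shell.h"

    namespace content {

    class HwVideoDecodeBrowserTest : public ContentBrowserTest {};

    IN_PROC_BROWSER_TEST_F(HwVideoDecodeBrowserTest, DecodesOnRealGpu) {
      // Hypothetical page; the real ones would come from the media tests.
      EXPECT_TRUE(NavigateToURL(shell(), GetTestUrl("media", "decode.html")));
      // The page reports back via script, so no testRunner is needed.
      EXPECT_EQ(true, EvalJs(shell()->web_contents(), "playedSuccessfully()"));
    }

    }  // namespace content

(Exact includes and macros may differ a bit, but tests of this shape can use the GPU like any other content_browsertest.)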

Has the media team previously discussed any plans or concerns regarding maintaining the web_tests build on Android? This does sound like an important and valid thing to test, but it's a large engineering task to maintain the framework for these 6 tests (5 of which run).
 
They use a real GPU on the bot to test hw-accelerated decode paths. Unit testing this platform- and hardware-specific code has historically been infeasible. I believe these layout tests are the only test coverage we have of these code paths.

> virtual/android/ test suite running on Linux.
I'm not familiar with this. Can someone provide some pointers? I expect (but should confirm) that this doesn't have enough virtualization to emulate Android GPU video decode.

VirtualTestSuites are sets of tests run with extra command-line flags, defined in the VirtualTestSuites file. They run on all bots (Linux, etc.), and they don't gain access to Android HW as a result.
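For pointers: they're defined in third_party/blink/web_tests/VirtualTestSuites. An entry looks roughly like this (the base directory and flag here are invented for illustration):

    {
      "prefix": "android",
      "base": "some/dir",
      "args": ["--some-android-behaviour-flag"]
    }

and the tests under web_tests/some/dir/ then also run as virtual/android/some/dir/... with those flags applied.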

Thank you,
Dana

Chris

On Wed, May 15, 2019 at 8:24 AM Robert Ma <robe...@chromium.org> wrote:
From: John Budorick <jbud...@chromium.org>
Date: Tue, May 14, 2019 at 6:44 PM
To: Robert Ma
Cc: blink-dev, Dirk Pranke, <dalec...@chromium.org>, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu, Philip Jägenstedt

The harness had bitrotted after a few months of failing while staying green. Some of the tests bitrotted and may have stayed that way.

Yes, that's what I meant. I've been aware of the issue for a while but haven't had time to look into it. Sorry if my phrasing wasn't clear.

Also, it just occurred to me that Fuchsia also runs the web tests listed in SmokeTests, although I don't think they use any Android-specific harness (only the filtering & plumbing in blinkpy). +sergeyu just in case.



Frank Liberato

unread,
May 15, 2019, 11:50:40 PM5/15/19
to dan...@chromium.org, Chris Cunningham, Robert Ma, John Budorick, blink-dev, Dirk Pranke, Dale Curtis, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu, Philip Jägenstedt, Sergey Ulanov, Matthew Wolenetz
(Since mailer daemon ate my previous mail to most recipients, here it is:)

> I believe these layout tests are the only test coverage we have of these code paths.
As far as I know, that's still essentially correct.
There is some limited testing with mocked-out hardware bits these days, but it's not really a substitute for running it on the bots with real hardware end to end.  Losing these tests would be unfortunate.

Now for new stuff:

Does testing HW-accelerated video decode need to be done via web_tests, or would it work just as well as content_browsertests instead?

This is a good question. I don't have an answer right now, but I'll take a look at it today or (maybe) tomorrow to see if it's a good substitute. While I haven't read the whole thread yet, I understand that moving these tests away from web_tests would be valuable.

thanks
-fl

Dirk Pranke

unread,
May 16, 2019, 6:53:34 PM5/16/19
to Christian Biesinger, Dana Jansens, Steve Kobes, Łukasz Anforowicz, Marijn Kruisselbrink, blink-dev, chromium-dev, Daniel Cheng, Nasko Oskov
This is somewhat tangential, but I wouldn't want us to consider using a virtual test suite to run *everything* with a different flag as part of the same test step unless we were absolutely sure this is the right thing to do and we planned for it accordingly.

Perhaps most obviously, adding such a virtual test suite would double the cost of the web_tests and cycle time, and that's not something we can do w/o planning.

Slightly less obviously, that's just not what I had in mind when we first added virtual test suites, which were supposed to be relatively small, focused subsets of the main test suite. Running the whole suite differently is, to me, better done as a separate test step with explicit flags (like we did for site-per-process).

There's some tension between when a virtual test suite is better and when we should simply have a different test step, and I don't know what the right balance is. But I suspect we're already on the wrong side (i.e., we have too many virtual test suites), at least from the point of view of understanding the bigger picture of where we spend our time testing Chromium.

-- Dirk

Dale Curtis

unread,
May 23, 2019, 4:00:23 PM5/23/19
to Frank Liberato, Dana Jansens, Chris Cunningham, Robert Ma, John Budorick, blink-dev, Dirk Pranke, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu, Philip Jägenstedt, Sergey Ulanov, Matthew Wolenetz
Frank, Chris and I chatted about this. We're okay with moving forward with the deprecation. After investigation, we feel we have adequate coverage between existing content_browsertests and the GPU pixel tests. Thanks!

- dale

dan...@chromium.org

unread,
May 24, 2019, 1:40:28 PM5/24/19
to Dale Curtis, Frank Liberato, Chris Cunningham, Robert Ma, John Budorick, blink-dev, Dirk Pranke, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu, Philip Jägenstedt, Sergey Ulanov, Matthew Wolenetz
Thank you Dale!

I think this means we can't justify the effort to maintain this suite for Android going forward, and that we should remove it from the bots.

Dirk, I'll send you a CL to do so. If you (or anyone else) would like to collect more info or have any other concerns, please speak up. I do want to promote good test coverage.

Dirk Pranke

unread,
May 24, 2019, 2:04:18 PM5/24/19
to Dana Jansens, Dale Curtis, Frank Liberato, Chris Cunningham, Robert Ma, John Budorick, blink-dev, Albert J. Wong (王重傑), Daniel Cheng, Nasko Oskov, Yang Zhang, Erik Staab, Yihong Gu, Philip Jägenstedt, Sergey Ulanov, Matthew Wolenetz
I'm good with the diligence you've done already :).

-- Dirk