Developer pain as a result of "introducing blink_wpt_tests"

231 views
Skip to first unread message

Dominic Farolino

unread,
May 27, 2022, 12:03:52 PM5/27/22
to blink-dev, Weizhong Xia
I write a lot of web platform tests as a Web Platform engineer; recently I wrote one in external/wpt/ (the external web platform tests directory), and was shocked to find the eradication of `-expected.txt` files. I placed my expectations file next to the source file as we've done for many years, and found that my test was "failing" because the test runner couldn't find my test expectations file.

I dug deeper and found https://crrev.com/c/3603221 which was responsible for moving more than 21,000 platform-agnostic test expectations files away from their source files and into web_tests/platform/generic directory. I found more discussion in this email thread which I missed because blink-dev emails do not go directly in my inbox.

I must say I find this change extraordinarily inconvenient as a Web Platform engineer, and I want to push back against this. A minority of web platform tests have platform-specific failures, which justifies the need for some platform-specific test expectations directories, but I believe a huge majority have generic baselines that are wildly convenient to have right next to the actual tests themselves. Putting them in a separate directory means I and others have to open a separate browser tab to view how many expectations there are for a given directory, and requires a lot of unnecessary context switching. It is particularly confusing for clusters of tests whose names are all very similar and vary by only a few numbers or suffixes—this increases the cost of the context switching.

Furthermore, it renders tons of directories absolutely useless! All ~150 directories in web_tests/virtual (for VirtualTestSuites) are just empty directories with README files—these used to house virtualtest-specific expectations. So now for fenced frames (the project I'm working on right now), we have the following test directories:
This is so weird! Regardless of whether or not there are plans to clean this up, I can't see the upsides. The ousting of platform-agnostic expectations is only an inconvenience for WP engineers, while there might be some test-infra conveniences around BUILD.gn dependencies (maybe?). In any case, I am hard pressed to find justification in this move, and would love to see if we can reconsider this.

Thoughts?

Dom

Domenic Denicola

unread,
May 27, 2022, 12:10:03 PM5/27/22
to Dominic Farolino, blink-dev, Weizhong Xia
+1. This was really unpleasantly surprising. When I first saw the original blink-dev email, I thought "generic baselines" meant something like "non-web platform test baselines", not "WPT expectation files that are platform-agnostic".

In addition to the context-switching cost, it's just much harder to navigate between tests and their expectations, which is something I do quite often. E.g., they are no longer grouped together in code reviews, since their file paths are lexicographically far away from each other. And, as someone maintaining and reviewing several WPT directories, moving these crucial files out of the directories I commonly work in (and have metadata marking me as the point-of-contact for) into separate directories dilutes the cohesiveness of my projects.


--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAP-uykAN06y5o-WYznnicvm1YREbSsLbs6dM57LtL4vCWB%3Duzw%40mail.gmail.com.

Yoav Weiss

unread,
Jun 1, 2022, 9:06:56 AM6/1/22
to Domenic Denicola, Dominic Farolino, blink-dev, Weizhong Xia
+1 from me as well. I was similarly caught by surprise by this change (during reviews for webexposed changes), and am similarly not seeing the upside for this.

While I'm sure this is a change that was meant to be a positive one, I'd love to better understand the reasoning, and whether the current situation is a temporary one, or one that is planned to be permanent even after the move to blink_wpt_tests is done.

Weizhong Xia

unread,
Jun 2, 2022, 3:21:39 PM6/2/22
to blink-dev, yoav...@chromium.org, d...@chromium.org, blink-dev, Weizhong Xia, dom...@chromium.org
Folks

I'm sorry to see this has caused inconvenience, and sorry for being late in response to this, due to the same reason Dom had.

The reason to move baselines to one central place at this point is to make it possible to specify dependency in BUILD.gn. This is part of the work to use upstream wptrunner to run wpt tests. In crbug/1299834 we have tried to discuss some different approaches. I would say this is the least disruptive way. For blink engprod, I think to make blink devs happy is our top priority.

For the long term, we can definitely move this back once we have separate legacy tests and wpt into different folders. I can also reach out to you guys to better understand your needs.

For the folders Dom mentioned, I can check to see if any of that can be removed. I understand the README is needed for virtual test suites.

Cheers, Weizhong 

Domenic Denicola

unread,
Jun 2, 2022, 3:29:46 PM6/2/22
to Weizhong Xia, blink-dev, yoav...@chromium.org, d...@chromium.org, Weizhong Xia, dom...@chromium.org
On Thu, Jun 2, 2022 at 3:21 PM Weizhong Xia <weiz...@google.com> wrote:
Folks

I'm sorry to see this has caused inconvenience, and sorry for being late in response to this, due to the same reason Dom had.

Thanks for responding and listening to our concerns!
 

The reason to move baselines to one central place at this point is to make it possible to specify dependency in BUILD.gn.

I don't quite understand this. I've had to work with WPTs and WPT expectations my entire time working on Chromium. I've never had to "specify dependency in BUILD.gn"; I don't really know what that means. I'd love to hear more (or be referred to a doc explaining the issue), so that I understand why we're making the sacrifice we're making. (The linked bug isn't very understandable for me, unfortunately. I can't even understand enough to find the part you mentioned where you discussed different approaches.)

I'm also unsure how the decision was weighed. Is the population of people specifying dependency in BUILD.gn very large, so that their needs are outweighing those of the Chromium developers working with web platform tests?

Weizhong Xia

unread,
Jun 2, 2022, 4:25:44 PM6/2/22
to Domenic Denicola, blink-dev, yoav...@chromium.org, d...@chromium.org, Weizhong Xia
Previously you don't need to specify anything in BUILD.gn is because we are downloading the whole "web_tests" folder. Now we want to run legacy tests and wpt in different steps. Under "web_tests/virtual" we have baselines for wpt and pure virtual legacy tests, we can list the folders under "virtual/prefix" for each virtual test suite, which will make the dependency very large(yes), and we need make such change each time when we add new virtual suites. I don't think this is something blink devs want to do.

The discussion is at the beginning of the crbug. So pls scroll back to #c1, and read from there.

thanks, Weizhong

Dominic Farolino

unread,
Jun 3, 2022, 9:55:49 AM6/3/22
to Weizhong Xia, Domenic Denicola, blink-dev, yoav...@chromium.org, Weizhong Xia
For the long term, we can definitely move this back once we have separate legacy tests and wpt into different folders. I can also reach out to you guys to better understand your needs.

For the folders Dom mentioned, I can check to see if any of that can be removed. I understand the README is needed for virtual test suites.

Can you define "long term" here? Is there a timeline? I was really hoping that we'd go back to how things were almost immediately. Still the need for the delay does not quite make sense to me, and I'm really hoping that for Blink developer experience we can revert this back to the previous setup ASAP.

Under "web_tests/virtual" we have baselines for wpt and pure virtual legacy tests, we can list the folders under "virtual/prefix" for each virtual test suite, which will make the dependency very large(yes), and we need make such change each time when we add new virtual suites. I don't think this is something blink devs want to do.

I'm sorry, but like Domenic I am not sure what to make of much of this. What I do know is that Blink developers also don't want the current WPT writing experience where expectation files are positioned far away from the source test. I think this experience matters a lot.

Xianzhu Wang

unread,
Jun 6, 2022, 6:30:23 PM6/6/22
to blink-dev, Dominic Farolino, Domenic Denicola, blink-dev, Yoav Weiss, Weizhong Xia, weiz...@google.com
I think we first need to answer a question: Why do we need *-expected.txt for WPT tests?

Upstream WPT doesn't have *-expected.txt. *-expected.txt is a blink-specific thing to allow some failing WPT tests to pass on blink with temporarily allowed failures. If a test has -expected.txt, it means Chrome behaves differently than the standard web platform behavior (or the test itself is wrong). The files are sometimes harmful because they hide failures and web platform incompatibilities (e.g.). So I think -expected.txt files for WPT tests should be rare and eventually be removed. An -expected.txt should not be treated as a part of the test itself because a) it doesn't exist in upstream WPT and 2) it doesn't describe the standard expected behavior, thus isn't necessarily placed besides the test.

The directory name 'platform' may be misleading. 'baselines' may be a better name (but we should not rename until we decide what to do for generic baselines). For WPT tests, 'failures' is perhaps an even better name.

Weizhong Xia

unread,
Jun 6, 2022, 6:52:08 PM6/6/22
to Xianzhu Wang, blink-dev, Dominic Farolino, Domenic Denicola, Yoav Weiss, Weizhong Xia
Xianzhu, yes 'baselines' is the name we agreed on previously. The reason I later changed back to use 'platform' is because that will make the CL smaller, and make it easier for gerrit to handle it. We can make one round rename when everything is stabilized. (I left you a message when you are OOO. I am not sure if that message lived long enough for you to catch it.)

thanks, Weizhong

Domenic Denicola

unread,
Jun 6, 2022, 6:52:10 PM6/6/22
to Xianzhu Wang, blink-dev, Dominic Farolino, Domenic Denicola, Yoav Weiss, Weizhong Xia, weiz...@google.com
I'm not sure I understand this logic. -expected.txt files are just more-convenient, more in-your-face versions of TestExpectations files. Surely you're not suggesting we get rid of TestExpectations files?

Xianzhu Wang

unread,
Jun 6, 2022, 8:00:07 PM6/6/22
to Domenic Denicola, blink-dev, Dominic Farolino, Yoav Weiss, Weizhong Xia, weiz...@google.com
On Mon, Jun 6, 2022 at 3:52 PM Weizhong Xia <weiz...@google.com> wrote:
Xianzhu, yes 'baselines' is the name we agreed on previously. The reason I later changed back to use 'platform' is because that will make the CL smaller, and make it easier for gerrit to handle it. We can make one round rename when everything is stabilized. (I left you a message when you are OOO. I am not sure if that message lived long enough for you to catch it.)

I caught the message. I guessed that the name 'platform' might be one of the reasons for the surprise to blink developers after the change, and the name 'baselines' might make the change easier to explain :)

On Mon, Jun 6, 2022 at 3:52 PM Domenic Denicola <dom...@chromium.org> wrote:
I'm not sure I understand this logic. -expected.txt files are just more-convenient, more in-your-face versions of TestExpectations files. Surely you're not suggesting we get rid of TestExpectations files?

Sorry, "-expected.txt ... should be ... eventually be removed" in my previous email was not clear. I meant for each individual -expected.txt we should eventually remove it because we should fix the failure. The same logic applies to TestExpectations. At any time we may allow a certain number of failures but we should keep the number as small as possible.

I think we should prefer TestExpectations to -expected.txt for WPT tests because the entries in TestExpectations have associated bugs which track the fixing process, unless we find a better way to track the fixing of the failures in -expected.txt. -expected.txt files do have their values, e.g. for partially-passing tests we can discover regressions and progressions of individual sub tests, but they should be rare.

I think separating -expected.txt from the tests has the following benefits:
- It makes it clear to blink developers that the files are not a part of WPT.
- It simplifies the WPT export/import process and others by reducing blink-specific files under external/wpt.

It does make it more difficult to find -expected.txt, but we already have the similar well-known logic for platform-specific baselines. Though platform-specific baselines are rare, ignoring a platform baseline can still cause surprises.

I think we can improve the test result viewer
- to better show -expected.txt for passing tests
- to show information about tests without actually running the tests
WDYT?

Weizhong Xia

unread,
Jun 23, 2022, 2:00:30 PM6/23/22
to Xianzhu Wang, Domenic Denicola, blink-dev, Dominic Farolino, Yoav Weiss, Weizhong Xia
Hi blink devs

Thanks to those who joined the survey at https://forms.gle/ju45qciS5VTR4ywN7. Most of you expressed the desire to put baselines at the same place of the tests. Your voice is heard, and here is the plan for the next step.

In Q3 we will work on to completely separate legacy layout tests and wpt tests, to put them under `third_party/blink/web_tests` and `third_party/blink/wpt_tests` respectively. Generic baselines (including generic virtual baselines) will be moved back to their previous place. Rebaseline tool will be updated to work with this structure, and update baselines for legacy layout tests and wpt tests in a single run. We will have different copies of TestExpectations, FlagSpecificConfig, VirtualTestSuites etc for legacy tests and wpt tests. When working on those files, we will need to make sure we are updating the correct copy of the file. (We will investigate if we need some presubmit check for such a scenario).

The reason for this is two fold: as requested we want to put baselines side to side to the tests, and we want to make the directory structure right to speed up the switch to wptrunner.

Thoughts? Feel free to leave a comment in crbug/1299834.

Thanks, Weizhong


Weizhong Xia

unread,
Aug 26, 2022, 1:57:10 PM8/26/22
to Xianzhu Wang, Domenic Denicola, blink-dev, Dominic Farolino, Yoav Weiss, Weizhong Xia
Hi Blink devs

FYI we are finally ready to move generic baselines back to their original places: the test folders. I plan to land the CL today. To avoid any merge conflict during this process, we will add an OWNERS file to //third_party/blink/web_tests/platform/generic. Any change to that folder will require an additional +1, and will not be approved. Once the move is done, the "platform/generic" folder will be removed.

thanks, Weizhong

Xianzhu Wang

unread,
Aug 26, 2022, 7:53:59 PM8/26/22
to Weizhong Xia, Domenic Denicola, blink-dev, Dominic Farolino, Yoav Weiss, Weizhong Xia
Hi Blink devs,

The move has landed (https://crrev.com/1039971).

If you have any pending CLs that add/remove/modify files under web_tests/platform/generic, please sync, and add/remove/modify the corresponding file under web_tests/ instead.

Thanks,
Xianzhu

Domenic Denicola

unread,
Aug 28, 2022, 11:07:54 PM8/28/22
to Xianzhu Wang, Weizhong Xia, Domenic Denicola, blink-dev, Dominic Farolino, Yoav Weiss, Weizhong Xia
This is great news! Thanks for being receptive to feedback and to working to get us back to a more-productive state.

Philip Jägenstedt

unread,
Aug 29, 2022, 12:07:59 PM8/29/22
to Domenic Denicola, Xianzhu Wang, Weizhong Xia, blink-dev, Dominic Farolino, Yoav Weiss, Weizhong Xia
I'd like to second that, thanks Weizhong!

Weizhong Xia

unread,
Aug 29, 2022, 12:27:18 PM8/29/22
to Philip Jägenstedt, Domenic Denicola, Xianzhu Wang, blink-dev, Dominic Farolino, Yoav Weiss, Weizhong Xia
I am glad we were able to make this happen. Thanks for all the feedback on this particular issue and other blink/infra related issues. We definitely need your feedback to make blink/infra better.

Thank you, all!
Reply all
Reply to author
Forward
0 new messages