PSA: New layout test results viewer


Quinten Yearsley

Aug 1, 2017, 2:24:11 PM
to blink-dev, blink-infra
Summary: There's a new layout test results.html page (made by atotic@)!


Details: I'd like to switch over the default layout test results viewer and eventually remove the old one in order to simplify things. So, at first, I plan to switch the default viewer, but the old one is still there, and there's a link to the old viewer in the upper-right corner.

Bug: 748628

We're interested to hear any feedback or suggestions; to report a problem with the new UI, file a bug with the component Blink>Infra.

Stefan Zager

Aug 1, 2017, 2:51:41 PM
to Quinten Yearsley, blink-dev, blink-infra
On Tue, Aug 1, 2017 at 11:23 AM, Quinten Yearsley <qyea...@chromium.org> wrote:
Summary: There's a new layout test results.html page (made by atotic@)!


Details: I'd like to switch over the default layout test results viewer and eventually remove the old one in order to simplify things. So, at first, I plan to switch the default viewer, but the old one is still there, and there's a link to the old viewer in the upper-right corner.

That new viewer seems to be missing many of the most useful features of the old viewer.  Have you considered rolling this out in a less disruptive way by, e.g., leaving the default alone and providing a link at the top of the existing results page to the new results page?

Thanks,

Stefan

Nico Weber

Aug 1, 2017, 2:57:05 PM
to Quinten Yearsley, blink-dev, blink-infra
Can you share the motivation behind the rewrite?

--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAFp4ES%3D62SUXoW1qBqUeffWgCeAL_MOOCU27LUECmKa6%2BDMfDQ%40mail.gmail.com.

Stefan Zager

Aug 1, 2017, 3:04:27 PM
to Nico Weber, Quinten Yearsley, blink-dev, blink-infra
On Tue, Aug 1, 2017 at 11:56 AM, Nico Weber <tha...@chromium.org> wrote:
Can you share the motivation behind the rewrite?

The CL description refers to this discussion thread:


Seems like the primary motivation is to make it easier to generate TestExpectations entries.  If that's true, then I would suggest that someone look at incorporating the TestExpectations-generating feature into the existing results page.

Quinten Yearsley

Aug 1, 2017, 7:44:21 PM
to Stefan Zager, Nico Weber, blink-dev, blink-infra
Thanks for the feedback - calling this a "PSA" was actually a bit wrong, it should be considered more of a request for more feedback. I actually only want to change things if it's helpful and good :-)

On Tue, Aug 1, 2017 at 12:04 PM, Stefan Zager <sza...@chromium.org> wrote:
That new viewer seems to be missing many of the most useful features of the old viewer.  Have you considered rolling this out in a less disruptive way by, e.g., leaving the default alone and providing a link at the top of the existing results page to the new results page?

Which features are you thinking of now?

Right now there's a link at the top of the results page to the "new results page" (example page); it's the link in the top-right that says "Expectations".
One of the other motivations was maintainability -- the existing results page is around 1400 lines, whereas the new one is around 950 lines and uses more modern JS features.

As far as I understand it, there were a few bugs in the current viewer (for example, I think some numbers were not computed quite right, and also bugs like crbug.com/619512 and crbug.com/664274). Aleks initially looked at modifying the existing one but decided a rewrite may be easier; now I'm hoping that if we can make the new page cover everyone's use-cases then we may be able to remove the old one.

Nico Weber

Aug 2, 2017, 10:50:54 AM
to Quinten Yearsley, Stefan Zager, blink-dev, blink-infra
My feedback shouldn't count for much since I don't use the expectations viewer all that much. Having said that, I kind of know how to operate the old one and couldn't figure out the new one. (I couldn't figure out the old one at first either, but it's been around for a while, so I learned to do that.) I'd imagine most people not working with expectations all that often might be in a similar situation.

1400 lines doesn't sound all that much, and 950 doesn't sound like all that much less -- why isn't it feasible to change the existing one? Why is a full rewrite (+ completely different UI) needed?

Stefan Zager

Aug 2, 2017, 12:36:42 PM
to Quinten Yearsley, Stefan Zager, Nico Weber, blink-dev, blink-infra
On Tue, Aug 1, 2017 at 4:43 PM, Quinten Yearsley <qyea...@chromium.org> wrote:
Thanks for the feedback - calling this a "PSA" was actually a bit wrong, it should be considered more of a request for more feedback. I actually only want to change things if it's helpful and good :-)

On Tue, Aug 1, 2017 at 12:04 PM, Stefan Zager <sza...@chromium.org> wrote:
That new viewer seems to be missing many of the most useful features of the old viewer.  Have you considered rolling this out in a less disruptive way by, e.g., leaving the default alone and providing a link at the top of the existing results page to the new results page?

Which features are you thinking of now?

- Keyboard shortcuts (j, k, e, c, f).  More generally, the new results page requires a lot more clicking.
- Information about the type of failure (text diff, image diff, ref diff, etc.)
- The ability to see expected and actual results side-by-side, for visual comparison
- The pretty-diff view
- The pane with two-second toggling between expected and actual results
- The pixel-zoom behavior when hovering over an image diff (https://screenshot.googleplex.com/wG4oCHjXJdT)
- The ability to flag a subset of tests and then copy the names of those tests in the "Flagged Tests" section


If I'm only dealing with a small number of test failures on a results page, I can suffer through the new format (with some swearing).  But when working through 1500 test failures -- as I'm now doing daily as part of a large refactoring project -- it pretty much stops me in my tracks.

Quinten Yearsley

Aug 2, 2017, 6:37:57 PM
to Stefan Zager, Nico Weber, blink-dev, blink-infra
On Wed, Aug 2, 2017 at 7:50 AM, Nico Weber <tha...@chromium.org> wrote:

My feedback shouldn't count for much since I don't use the expectations viewer all that much. Having said that, I kind of know how to operate the old one and couldn't figure out the new one. (I couldn't figure out the old one at first either, but it's been around for a while, so I learned to do that.) I'd imagine most people not working with expectations all that often might be in a similar situation.

1400 lines doesn't sound all that much, and 950 doesn't sound like all that much less -- why isn't it feasible to change the existing one? Why is a full rewrite (+ completely different UI) needed?

Thanks Nico, this feedback is valuable.

It probably is feasible to change the existing one, and a full rewrite is probably not strictly necessary. The new alternate results view (test-expectations.html) wasn't originally made with the intent of replacing all features. The idea of replacing the existing one came as a later thought.

Thanks for the feedback -- so I think for now there's no need to change the default results viewer. Just added the above feedback to crbug.com/748628.

For now, we'll have two results viewers, the classic one (results.html) and the alternate one (test-expectations.html), and they have links to each other in case you want to see the other one; we can revisit this issue at some point in the future.

Aleks Totic

Aug 7, 2017, 4:10:45 AM
to Stefan Zager, Quinten Yearsley, Nico Weber, blink-dev, blink-infra
I did the rewrite. The number of lines of code was not what drove me.

It all started as a 2 hour hack. results.html was reporting an
incorrect number of "Unexpected passes", which was something I needed
to be correct. The logic to decide whether a test was unexpectedly
passing was spread between python code that generated
failing_results.json, and results.html that interpreted it.

Since I was unfamiliar with our test result data format, and did not
feel like hacking python, I hacked up JS that read full_results.json,
and correctly counted "Unexpected passes" in Javascript.
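
The counting step described above can be sketched roughly as follows. This is a minimal Python sketch, not the JavaScript that ended up in test-expectations.html. It assumes the results form a nested dictionary whose leaves carry space-separated "expected" and "actual" result strings; that shape is an assumption about full_results.json, and the function name is hypothetical.

```python
def count_unexpected_passes(node):
    """Count leaf tests that passed although PASS was not expected.

    Assumed (hypothetical) shape: directories are nested dicts; a leaf test
    is a dict holding space-separated "expected" and "actual" strings.
    """
    if "actual" in node and "expected" in node:
        expected = node["expected"].split()
        actual = node["actual"].split()
        # Only count tests where every recorded attempt passed.
        passed = bool(actual) and all(result == "PASS" for result in actual)
        return 1 if passed and "PASS" not in expected else 0
    # Otherwise treat the node as a directory and recurse into its children.
    return sum(
        count_unexpected_passes(child)
        for child in node.values()
        if isinstance(child, dict)
    )


sample = {
    "fast": {
        "a.html": {"expected": "FAIL", "actual": "PASS"},
        "b.html": {"expected": "PASS", "actual": "PASS"},
        "dom": {"c.html": {"expected": "FAIL TIMEOUT", "actual": "TIMEOUT"}},
    }
}
print(count_unexpected_passes(sample))  # prints 1 (only a.html)
```

Keeping the whole decision in one function avoids the split between generator and viewer that made the original count hard to get right.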

Now that I had something hackable, I had a way to fix all my frustrations
with results.html. Some of these frustrations are caused by my
ignorance: without help, I could not figure how to make things work.

- incorrect number of unexpected passes
- lack of statistics: No clean overview of how many tests ran, how
many failed, how many timed out.
- strange display of results where I had to scroll a lot.
- difficulty of integrating results with TestExpectations file. In
layoutng, many tests change state, and manually editing
TestExpectations file is error prone. Since the results page had all
the information, why not let it do the work.
- no links to associated bugs.
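
As an illustration of letting the results page "do the work" for TestExpectations, here is a small Python sketch that formats one expectation line from a test name and its observed results. The line shape ("bug, test path, bracketed results") and the helper name are assumptions for illustration, and the bug ID is a placeholder; this is not the generator used by test-expectations.html.

```python
def expectation_line(test_name, results, bug=None):
    """Format one TestExpectations-style line.

    Assumed line shape: "<bug> <test path> [ <Result> ... ]".
    """
    parts = []
    if bug:
        parts.append(bug)
    parts.append(test_name)
    parts.append("[ " + " ".join(results) + " ]")
    return " ".join(parts)


# Placeholder bug ID and test path, for illustration only.
print(expectation_line("fast/dom/example.html", ["Failure"], bug="crbug.com/000000"))
```

Generating the line from the same data the page displays removes the manual copy-and-edit step that makes hand-editing error prone.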

The temptation was too much, and about 2 days of work produced
test-expectations.html. The TestExpectations integration has saved me
and others a lot of time. Statistics are also very helpful in
measuring progress. Steve is using it to automatically generate flag
specific TestExpectations from 3 platforms.

I would eventually like to make test-expectations the default. My
experience is that the existing page is newbie hostile.
I also want to retain the power of the old UI. And I'd like it to
remain hackable, so we can add functionality needed by different
developers. So keep your comments coming, this is my side project when
staring at LegacyLayout code drives me batty.

Now to respond to comments:

> Keyboard shortcuts (j, k, e, c, f). More generally, the new results page requires a lot more clicking.

New page is keyboard friendly, there is no need to click at all, enter
and tab are all you need. Tab moves between tests, enter shows the
results. Why click? How is this harder than jkecf?

> Information about the type of failure (text diff, image diff, ref diff, etc.)

Ack. There will be another output format showing this information

> The ability to see expected and actual results side-by-side, for visual comparison

I found using tab to switch between these in-place to be more useful.
The results.html expected and actual were often separated, it was hard
to identify difference.

> The pretty-diff view

Ack. I assume this is the textual diff. Personally, I am happy reading
.diff files, but if there is demand....

> The pane with two-second toggling between expected and actual results

Why not just use tab/shift tab, and avoid the two second wait?

> The pixel-zoom behavior when hovering over an image diff
> (https://screenshot.googleplex.com/wG4oCHjXJdT)

Ack. My pet peeve is those single-pixel errors that are hard to spot.
We can do something that'll make this easier....

> The ability to flag a subset of tests and then copy the names of
> those tests in the "Flagged Tests" section

Ack. Something like the flagging ability already got checked in while
I was on vacation. https://chromium-review.googlesource.com/c/597388
Could be generalized to what you need. Again, I never discovered this
in the old UI.

Aleks


Aleks Totic

Sep 29, 2017, 2:02:06 PM
to blink-dev, sza...@chromium.org, qyea...@chromium.org, tha...@chromium.org, blink...@chromium.org
New layout test result viewer has been updated with some new features:

Example:
# Better image viewer

View result images as single/animated/side-by-side. 
Zoom works. Also shows color under cursor (http only)
Diff highlight. An animation highlights diff area when result is loaded.

# Better text results viewer

Now with pretty diffs.

# New output format: Group results by crash site

Want to know where your fixes can fix the greatest number of crashes? Use this view.

# New output format: Group results by text mismatch

Want to know which results are really different, and which differ only in whitespace?
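
One way to think about "differ only in whitespace" is sketched below in Python. This is an assumption about what such a check could look like, not the grouping logic the viewer actually uses; the function name is hypothetical.

```python
def differs_only_in_whitespace(expected, actual):
    """Return True if the texts differ, but match once whitespace is removed."""
    if expected == actual:
        return False  # identical texts are not a mismatch at all

    def strip_ws(text):
        # str.split() with no arguments splits on any run of whitespace.
        return "".join(text.split())

    return strip_ws(expected) == strip_ws(actual)


print(differs_only_in_whitespace("a  b\n", "a b\n"))
print(differs_only_in_whitespace("a b", "a c"))
```

The first call reports a whitespace-only difference; the second reports a real content difference.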

Big thanks to qyearsley and xiaocheng for your help.

Aleks

Robert Ma

Sep 29, 2017, 4:38:09 PM
to Aleks Totic, blink-dev, sza...@chromium.org, Quinten Yearsley, tha...@chromium.org, blink...@chromium.org
This looks great, and the UI is much more functional! Not sure if it's just a mental effect, but the filtering also seems faster.


Aleks Totic

Sep 29, 2017, 4:43:39 PM
to blink-dev, blink-infra
Thanks. 

I forgot to mention one more feature: use space + shift space to quickly navigate between detailed results.

Aleks

Philip Jägenstedt

Sep 30, 2017, 6:45:36 AM
to Aleks Totic, blink-dev, blink-infra
I'm really liking this too!

Aside: Unexpected pass is a funny thing that I don't think anyone's trying to fix. Seems like a shame to have passing tests that we could still regress without noticing. What would it take to collect and send a weekly report of these somewhere like blink...@chromium.org?


Xianzhu Wang

Sep 30, 2017, 1:58:18 PM
to Philip Jägenstedt, Aleks Totic, blink-dev, blink-infra
On Sat, Sep 30, 2017 at 3:45 AM, 'Philip Jägenstedt' via blink-dev <blin...@chromium.org> wrote:
I'm really liking this too!

Aside: Unexpected pass is a funny thing that I don't think anyone's trying to fix. Seems like a shame to have passing tests that we could still regress without noticing. What would it take to collect and send a weekly report of these somewhere like blink...@chromium.org?

I think it's a good idea to monitor and fix unexpected passes.

The flakiness dashboard might be a better source to generate the report of unexpected passes (including unexpected unflakiness) and unexpected flakiness. This is also the reason that I prefer running flaky tests somewhere to skipping them on all bots.



Aleks Totic

Sep 30, 2017, 11:40:03 PM
to Philip Jägenstedt, blink-dev, blink-infra
Aside: Unexpected pass is a funny thing that I don't think anyone's trying to fix. Seems like a shame to have passing tests that we could still regress without noticing. What would it take to collect and send a weekly report of these somewhere like blink...@chromium.org?

Incorrect reporting of unexpected passes is what kicked off this entire side project.

In layoutng, unexpected passes is how you know your patch is doing well.

I think infra is aware, and working on reducing number of unexpected passes. w-p-t imports are a frequent cause.

Aleks

Philip Jägenstedt

Oct 1, 2017, 3:07:08 AM
to Aleks Totic, robe...@chromium.org, blink-dev, blink-infra
Do you mean that the import changes the test so that it's now passing, but the failure expectation remains?

+Robert Ma, is that something that could be automatically handled?

bo...@google.com

Oct 2, 2017, 12:13:49 PM
to blink-dev, ato...@google.com, robe...@chromium.org, blink...@chromium.org, foo...@google.com
FYI: We already have Tools/Scripts/update-flaky-expectations which looks at recent trybot results and removes lines in TestExpectations that are unneeded. I'm not sure if WPT are different in any meaningful way but it might already "just work" (or the script may need some simple tweaking to handle WPT specifically). It'd be nice if we could automate to run regularly (though qyearsley@ mentioned there's still an outstanding issue or two).

Robert Ma

Oct 2, 2017, 2:13:21 PM
to bo...@google.com, Quinten Yearsley, blink-dev, Aleks Totic, blink...@chromium.org, Philip Jägenstedt
On Mon, Oct 2, 2017 at 12:13 PM, <bo...@google.com> wrote:
FYI: We already have Tools/Scripts/update-flaky-expectations which looks at recent trybot results and removes lines in TestExpectations that are unneeded. I'm not sure if WPT are different in any meaningful way but it might already "just work" (or the script may need some simple tweaking to handle WPT specifically). It'd be nice if we could automate to run regularly (though qyearsley@ mentioned there's still an outstanding issue or two).

Quinten, can you shed some light on this? I'd be interested to automate this.

(I was actually not aware of the need to clean up unexpected passes. It seems they do not turn trybots red, and hence are not handled by automatic rebaseline.)

Quinten Yearsley

Oct 2, 2017, 8:57:56 PM
to Robert Ma, David Bokan, blink-dev, Aleks Totic, blink-infra, Philip Jägenstedt
Yep! So update-flaky-expectations is a great script. It uses history of results from the test-results dashboard to decide whether a port is flaky or not. It's still theoretically possible for a test to be flaky but to only have passing results on the test-results dashboard (e.g. if results aren't being reported correctly, or for some reason the waterfall bots are different than the CQ bots).
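
To make the caveat concrete, here is a minimal Python sketch of classifying a test from its recorded history. It is not how update-flaky-expectations is implemented; the function name, the result strings, and the rule itself are assumptions for illustration.

```python
def classify(history):
    """Classify a test from a list of per-run results such as "PASS" or "FAIL"."""
    outcomes = set(history)
    if not outcomes:
        return "unknown"  # no reported runs at all
    if outcomes == {"PASS"}:
        return "passing"
    if "PASS" in outcomes:
        return "flaky"  # a mix of passing and non-passing runs
    return "failing"


print(classify(["PASS", "FAIL", "PASS"]))
print(classify(["PASS", "PASS", "PASS"]))
```

A test that really is flaky but whose reported history contains only passes comes out as "passing" under a rule like this, which is the gap described above.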

CLs created after running the script look like this:

In order to automate it entirely, I'd imagine we'd make it something similar to wpt-import/wpt-export -- with a bot account, a recipe and a builder. The builder would run the recipe, which uses the bot account credentials to run the script and commit a CL. I'm not sure whether this is the best option, but it's possible.

On the topic of unexpected passes, there's crbug.com/730704.

Aleks Totic

Oct 18, 2017, 3:36:45 PM
to blink-dev, sza...@chromium.org, qyea...@chromium.org, tha...@chromium.org, blink...@chromium.org
New layout test result viewer has been updated with a few new features:
- flagging: tests can be flagged
- text filter: you can filter tests by name

I think that this brings it up to feature parity with the existing test viewer. The exception is that certain keyboard shortcuts are different (j, k, e, c, f).

The reason for not implementing j & k is that calling element.focus() on a div moves focus, but not the focus ring, and I do not know how to work around it.

I'd prefer that this page be the default. What do you think?

Aleks

Xianzhu Wang

Oct 18, 2017, 4:40:54 PM
to Aleks Totic, blink-dev, sza...@chromium.org, Quinten Yearsley, Nico Weber, blink-infra
Thanks Aleks for the work. +1 for making the page default.

I'm a heavy user of the keyboard shortcuts. Can you file a bug for the focus issue in more detail? Perhaps I can work on the bug.

I'm also a heavy user of the following features of the current default result page:
1. List of test names without new lines. It's useful to run specific layout tests by copying the test names into the run-webkit-tests command line;
2. Repaint overlay.
I can work on them for the new result page.


Xianzhu Wang

Oct 18, 2017, 5:26:41 PM
to Aleks Totic, blink-dev, sza...@chromium.org, Quinten Yearsley, Nico Weber, blink-infra
For the rebaseline feature, I think we should rely more on the rebaseline tool and run-webkit-tests --reset-results. The rebaseline script generated in the new result page seems not to handle platform variants well.

Xiaocheng Hu

Oct 19, 2017, 1:54:53 PM
to Xianzhu Wang, Aleks Totic, blink-dev, sza...@chromium.org, Quinten Yearsley, Nico Weber, blink-infra
I did the rebaseline feature mostly for LayoutNG, where I had to manually determine, from tons of failures, whether each failure is a true failure or just needs a rebaseline. Not sure if there's a more efficient workflow...

The script is tested only on my Linux machine. I doubt it's ever run on any other machine.

Maybe I should add a note that this is a highly experimental feature that's designed for edge cases, and the rebaseline tool is preferred for general rebaselining?


Quinten Yearsley

Oct 19, 2017, 2:25:16 PM
to Xiaocheng Hu, Xianzhu Wang, Aleks Totic, blink-dev, Stefan Zager, Nico Weber, blink-infra
The "webkit-patch rebaseline*" commands are most useful for rebaselining platform-specific results. For non-platform-specific results it's faster to just run the tests locally to get baselines.

run-webkit-tests --reset-results is probably easier for non-platform-specific results. Xiaocheng, do you think that --reset-results could also work for all of the use-cases of the rebaseline feature of test-expectations.html?

Xianzhu Wang

Oct 19, 2017, 2:38:47 PM
to Quinten Yearsley, Xiaocheng Hu, Aleks Totic, blink-dev, Stefan Zager, Nico Weber, blink-infra
run-webkit-tests --reset-results rebaselines platform-specific results if there are already platform-specific baselines, for the current platform. It can also remove extra platform-specific baselines if the new platform-specific baselines are the same as the fallback ones. Together with --copy-baselines, --reset-results can also generate new platform-specific baselines if they are different from the fallback ones.

The rebaseline feature of test-expectations.html seems to lack these features, and in some cases it will rebaseline wrong results.

I think the rebaseline feature of test-expectations.html would be more useful if it just created a run-webkit-tests --reset-results command line containing all selected tests.
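
A rough sketch of that suggestion, in Python: turn a list of selected test names into one shell-safe command line. Only the command name and the --reset-results flag come from this thread; the function name and test paths are hypothetical, and the path to the script is left out on purpose.

```python
import shlex


def reset_results_command(selected_tests):
    """Build one command line that resets baselines for the selected tests."""
    # shlex.join (Python 3.8+) quotes any argument containing spaces or
    # shell metacharacters, so the result can be pasted into a terminal.
    return shlex.join(["run-webkit-tests", "--reset-results", *selected_tests])


print(reset_results_command(["fast/dom/a.html", "fast/css/b c.html"]))
```

Because the output is a single line with names separated by spaces, it also covers the "list of test names without new lines" use case.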

Stephen Chenney

Oct 19, 2017, 4:36:58 PM
to Xianzhu Wang, Quinten Yearsley, Xiaocheng Hu, Aleks Totic, blink-dev, Stefan Zager, Nico Weber, blink-infra
To circle all the way back several emails, I use the "1. List of test names without new lines" as a string to pass to rebaseline-cl, because it makes it easy to avoid rebaselining unrelated flake failures from try runs. Reverting things in git after the fact is very annoying, and hand copying names is annoying when 300 odd tests change, as happens to me not infrequently.

Cheers,
Stephen.

Aleks Totic

Oct 19, 2017, 4:40:11 PM
to Stephen Chenney, Xianzhu Wang, Quinten Yearsley, Xiaocheng Hu, blink-dev, Stefan Zager, Nico Weber, blink-infra
Names without newlines is easy. Would "Plain text without newlines" format work for you?

Aleks

Stephen Chenney

Oct 19, 2017, 4:53:24 PM
to Aleks Totic, Xianzhu Wang, Quinten Yearsley, Xiaocheng Hu, blink-dev, Stefan Zager, Nico Weber, blink-infra
Yes, that would work for my use case. Thanks.

Xiaocheng Hu

Oct 19, 2017, 5:04:55 PM
to Stephen Chenney, Aleks Totic, Xianzhu Wang, Quinten Yearsley, Xiaocheng Hu, blink-dev, Stefan Zager, Nico Weber, blink-infra
On Thu, Oct 19, 2017 at 1:53 PM, Stephen Chenney <sche...@chromium.org> wrote:
Yes, that would work for my use case. Thanks.

On Thu, Oct 19, 2017 at 4:39 PM, Aleks Totic <ato...@google.com> wrote:
Names without newlines is easy. Would "Plain text without newlines" format work for you?

Aleks

On Thu, Oct 19, 2017 at 1:36 PM, Stephen Chenney <sche...@chromium.org> wrote:
To circle all the way back several emails, I use the "1. List of test names without new lines" as a string to pass to rebaseline-cl, because it makes it easy to avoid rebaselining unrelated flake failures from try runs. Reverting things in git after the fact is very annoying, and hand copying names is annoying when 300 odd tests change, as happens to me not infrequently.

Cheers,
Stephen.

On Thu, Oct 19, 2017 at 2:38 PM, Xianzhu Wang <wangx...@chromium.org> wrote:
run-webkit-tests --reset-results rebaselines platform-specific results if there are already platform-specific baselines, for the current platform. It can also remove extra platform-specific baselines if the new platform-specific baselines are the same as the fallback ones. Together with --copy-baselines, --reset-results can also generate new platform-specific baselines if they are different from the fallback ones.

The rebaseline feature of test-expectations.html seems to lack these features, and in some cases it will rebaseline wrong results.

I think the rebaseline feature of test-expectations.html would be more useful if it just created a run-webkit-tests --reset-results command line containing all selected tests.

This seems to be the correct direction. It seems better for test-expectations.html to simply generate a list of tests to be rebaselined (using the flag feature), and then pass the list to run-webkit-tests or some other tool.
 

On Thu, Oct 19, 2017 at 11:24 AM, Quinten Yearsley <qyea...@chromium.org> wrote:
The "webkit-patch rebaseline*" commands are most useful for rebaselining platform-specific results. For non-platform-specific results it's faster to just run the tests locally to get baselines.

run-webkit-tests --reset-results is probably easier for non-platform-specific results. Xiaocheng, do you think that --reset-results could also work for all of the use-cases of the rebaseline feature of test-expectations.html?

We don't need to consider platform-specific baselines for now. However, we need to rebaseline flag-specific results.

Does run-webkit-tests --reset-results reset flag-specific baselines when, say, run with --additional-driver-flag?

Xianzhu Wang

Oct 19, 2017, 5:06:40 PM
to Xiaocheng Hu, Stephen Chenney, Aleks Totic, Quinten Yearsley, blink-dev, Stefan Zager, Nico Weber, blink-infra
On Thu, Oct 19, 2017 at 2:04 PM, Xiaocheng Hu <xiaoc...@chromium.org> wrote:


On Thu, Oct 19, 2017 at 1:53 PM, Stephen Chenney <sche...@chromium.org> wrote:
Yes, that would work for my use case. Thanks.

On Thu, Oct 19, 2017 at 4:39 PM, Aleks Totic <ato...@google.com> wrote:
Names without newlines is easy. Would "Plain text without newlines" format work for you?

Aleks

On Thu, Oct 19, 2017 at 1:36 PM, Stephen Chenney <sche...@chromium.org> wrote:
To circle all the way back several emails, I use the "1. List of test names without new lines" as a string to pass to rebaseline-cl, because it makes it easy to avoid rebaselining unrelated flake failures from try runs. Reverting things in git after the fact is very annoying, and hand copying names is annoying when 300 odd tests change, as happens to me not infrequently.

Cheers,
Stephen.

On Thu, Oct 19, 2017 at 2:38 PM, Xianzhu Wang <wangx...@chromium.org> wrote:
run-webkit-tests --reset-results rebaselines platform-specific results if there are already platform-specific baselines, for the current platform. It can also remove extra platform-specific baselines if the new platform-specific baselines are the same as the fallback ones. Together with --copy-baselines, --reset-results can also generate new platform-specific baselines if they are different from the fallback ones.

The rebaseline feature of test-expectations.html seems to lack these features, and in some cases it will rebaseline wrong results.

I think the rebaseline feature of test-expectations.html would be more useful if it just created a run-webkit-tests --reset-results command line containing all selected tests.

This seems to be the correct direction. It seems better for test-expectations.html to simply generate a list of tests to be rebaselined (using the flag feature), and then pass the list to run-webkit-tests or some other tool.
 

On Thu, Oct 19, 2017 at 11:24 AM, Quinten Yearsley <qyea...@chromium.org> wrote:
The "webkit-patch rebaseline*" commands are most useful for rebaselining platform-specific results. For non-platform-specific results it's faster to just run the tests locally to get baselines.

run-webkit-tests --reset-results is probably easier for non-platform-specific results. Xiaocheng, do you think that --reset-results could also work for all of the use-cases of the rebaseline feature of test-expectations.html?

We don't need to consider platform-specific baselines for now. However, we need to rebaseline flag-specific results.

Does run-webkit-tests --reset-results reset flag-specific baselines when, say, run with --additional-driver-flag?

Aleks Totic

Oct 20, 2017, 12:28:36 PM
to Xianzhu Wang, Xiaocheng Hu, Stephen Chenney, Quinten Yearsley, blink-dev, Stefan Zager, Nico Weber, blink-infra
Fixed the nits: j/k shortcuts, plus plain text wrapped format.


Thanks to Xianzhu for offering to look at focus issue, which made me find a bug when I tried to create a minimal reproducible case.

Aleks

qyea...@google.com

May 9, 2018, 12:25:41 PM
to blink-dev, wangx...@chromium.org, xiaoc...@chromium.org, sche...@chromium.org, qyea...@chromium.org, sza...@chromium.org, tha...@chromium.org, blink...@chromium.org
Follow-up: Does anyone use the old results viewer anymore? That is, does anyone ever click on "go back to legacy results.html"? Any objections to removing it now?