Pinpoint should also "auto-blame" improvements down to a single CL

Gabriel Charette

Feb 19, 2018, 7:33:20 AM
to benchmar...@chromium.org, scheduler-dev, v8-...@googlegroups.com, Ben Henry, sull...@chromium.org
Hello Benchmarking Dev!

First, thanks for your hard work on building and maintaining reliable benchmarks with automatic single-CL blaming tools.

I have recently been digging more into low-level scheduling primitives on the v8 side, and that work tends to move benchmarks in interesting ways.

The problem: Pinpoint only identifies and then narrows down to a single CL for regressions.
The request: Pinpoint should identify and narrow down to a single CL for improvements as well.

CLs in the scheduling space tend to move many benchmarks, and it's hard to tell what each one actually improved or regressed.

For example, did r534414 improve or regress things overall? It looks like it went both ways, but only regressions were pinpointed down to my CL. What about the dozens of other green graphs: are they coincidental or also caused by my CL? I could launch bisects on all the green dots, but not only is that tedious, it would also result in pinging a bunch of people (go/catabug/4225).

After fixing the pinpointed regressions I'm left to wonder whether this was a no-op or an overall improvement.

Not knowing what is an overall improvement is an engineering problem, as it denies us data that could otherwise serve as a hint to paradigms that should be encouraged and reproduced for further gains.

If a CL regresses one thing and improves ten, the only way to know specifically that it improved those 10 things is to have it reverted for that 1 regression and then have the revert cause its own set of "regressions" (i.e. unimprovements)...

It feels like all the data and tooling are there; can we just enable automatic Pinpoint for improvements? I understand that filing bugs for improvements is weird, but as a first pass I'm sure the vast majority of engineers would be glad to be told that they're making things better, regardless of the medium through which the news is delivered!

Thanks!
Gab

Annie Sullivan

Feb 20, 2018, 11:06:53 AM
to Gabriel Charette, scheduler-dev, v8-...@googlegroups.com, Ben Henry, speed-ser...@chromium.org, Dave Tu, Simon Hatch
bcc: benchmarking-dev
cc: speed-services-dev, dtu, simonhatch

Pinpoint does identify both regressions and improvements; you can see more details here. Dave, Simon, can one of you look into the specifics of this case?

Thanks,
Annie

simon...@chromium.org

Feb 20, 2018, 11:37:34 AM
to Chrome benchmarking, g...@google.com, schedu...@chromium.org, v8-...@googlegroups.com, benh...@chromium.org, speed-ser...@chromium.org, d...@chromium.org, simon...@chromium.org
Completely agree that we want to get to a place where you can clearly see what impact a CL had. The data and tooling are getting there; there's been considerable effort in the last year to improve things on that front. We can't enable automatic Pinpoint for improvements, although we do have plans for a much more automated sheriffing flow in the not-too-distant future (we're requesting a lot more hardware and I believe the benchmarking team is narrowing the # of configurations).

For this specific case, only regressions get triaged, thus you only see regressions get filed to you.

Gabriel Charette

Feb 20, 2018, 11:45:12 AM
to simon...@chromium.org, Chrome benchmarking, schedu...@chromium.org, v8-...@googlegroups.com, benh...@chromium.org, speed-ser...@chromium.org, d...@chromium.org
Oops, we crossed in writing :)  

On Tue, Feb 20, 2018 at 5:37 PM <simon...@chromium.org> wrote:
> Completely agree that we want to get to a place where you can clearly see what impact a CL had. The data and tooling are getting there; there's been considerable effort in the last year to improve things on that front. We can't enable automatic Pinpoint for improvements, although we do have plans for a much more automated sheriffing flow in the not-too-distant future (we're requesting a lot more hardware and I believe the benchmarking team is narrowing the # of configurations).
>
> For this specific case, only regressions get triaged, thus you only see regressions get filed to you.

What would it take to triage both? It seems like a single-bit change for a big gain.

Also, re. "you can file a bug on them". That's at least blocked on go/catabug/4225.

Gabriel Charette

Feb 20, 2018, 11:46:18 AM
to simon...@chromium.org, Chrome benchmarking, schedu...@chromium.org, v8-...@googlegroups.com, benh...@chromium.org, speed-ser...@chromium.org, d...@chromium.org
(sigh @chromium.org.. again!)