Re: Make WTF::Vector safer by default

Kentaro Hara

unread,

Aug 29, 2024, 7:36:09 AM8/29/24

to TAMURA, Kent, platform-architecture-dev, Michael Lippautz, Daniel Cheng, Nico Weber, Dana Jansens, tse...@chromium.org, Daniel Cheng

If you replace the existing, hand-written Vector::begin/end with Vector::CheckedBegin/End (except performance-sensitive ones), what will the performance look like on Speedometer? Would it be hard to measure it locally before we decide to move forward with this approach?

@platform-architecture-dev (to respect go/always-cc)

On Thu, Aug 29, 2024 at 4:04 AM TAMURA, Kent <tk...@chromium.org> wrote:

On Tue, Aug 27, 2024 at 4:01 PM TAMURA, Kent <tk...@chromium.org> wrote:

On Thu, Aug 15, 2024 at 5:44 PM Michael Lippautz <mlip...@chromium.org> wrote:
Thanks for keeping performance a top priority here.

On Thu, Aug 15, 2024 at 3:04 AM TAMURA, Kent <tk...@chromium.org> wrote:
Thank you for the feedback!
I reconsider alternative options.

Alternative 1: Adding minimal CHECKs to the current iterator
I tried to add CHECKs to * -> and [] operators.
CL: https://chromium-review.googlesource.com/c/chromium/src/+/5767974
Pinpoint: https://pinpoint-dot-chromeperf.appspot.com/job/17256d24210000

Name⋯▴ base: chromium@4ccd4ef(Variant: 0)▿ exp: chromium@4ccd4ef(Variant: 1)▿
Charts-chartjs
44.679 ms
+0.1%
Charts-observable-plot
47.839 ms
+0.4%
Editor-CodeMirror
24.150 ms
+0.3%
Editor-TipTap
78.013 ms
+0.3%
NewsSite-Next
87.927 ms
+0.4%
NewsSite-Nuxt
74.510 ms
+0.3%
Perf-Dashboard
34.905 ms
-0.1%
React-Stockcharts-SVG
82.670 ms
+0.9%
Score
25.561
-0.3%
TodoMVC-Angular-Complex-DOM
33.982 ms
+0.7%
TodoMVC-Backbone
28.726 ms
+0.1%
TodoMVC-JavaScript-ES5
49.120 ms
+0.1%
TodoMVC-JavaScript-ES6-Webpack-Complex-DOM
57.223 ms
+0.7%
TodoMVC-jQuery
120.165 ms
+0.4%
TodoMVC-Lit-Complex-DOM
22.915 ms
+0.7%
TodoMVC-Preact-Complex-DOM
15.526 ms
-0.1%
TodoMVC-React-Complex-DOM
34.814 ms
-0.4%
TodoMVC-React-Redux
35.314 ms
+0.1%
TodoMVC-Svelte-Complex-DOM
13.690 ms
+0.4%
TodoMVC-Vue
24.184 ms
+0.1%
TodoMVC-WebComponents
20.882 ms
+0.3%

IMO, this performance result is not acceptable.

This is not great as is. However, have you tried a local build w/ PGO/LTO as well?

For better or worse, this is an area where PGO/LTO and good inlining changes profiles. In the past I noticed that a bunch of base::span CHECKs that would show up in profiles for plain release builds that were all gone on PGO/LTO.

I know it's a bit cumbersome to experiment with locally but it may allow you to ship the checked iterators everywhere. E.g., Oilpan pointer compression had >1% regression (on Speedometer2) without PGO that was almost (~0.2%) gone when compiling with PGO.

We don't have high repetition counts on our PGO perf bots though (resource constraints), so we would need to verify locally that PGO indeed helps here.

Thank you for the suggestion! I tested PGO builds on MacBook Pro (M1 Pro). I ran Speedometer3 128 times for each configuration.
"base": ToT, PGO build with locally-generated profile
"checked": The CL is applied, PGO build with locally-generated profile

I think the result below is still unacceptable unfortunately.

Name⋯▴ base▿ checked▿
Charts-chartjs
42.833 ms
+1.0%
Charts-observable-plot
41.956 ms
+1.7%
Editor-CodeMirror
22.060 ms
+1.3%
Editor-TipTap
71.760 ms
+0.2%
NewsSite-Next
71.426 ms
-0.4%
NewsSite-Nuxt
56.899 ms
+0.2%
Perf-Dashboard
31.001 ms
+1.7%
React-Stockcharts-SVG
72.251 ms
+1.8%
Score
31.505
-1.0%
TodoMVC-Angular-Complex-DOM
29.201 ms
+1.1%
TodoMVC-Backbone
19.864 ms
+2.0%
TodoMVC-JavaScript-ES5
35.568 ms
+0.7%
TodoMVC-JavaScript-ES6-Webpack-Complex-DOM
45.357 ms
+0.8%
TodoMVC-jQuery
96.584 ms
+0.6%
TodoMVC-Lit-Complex-DOM
17.555 ms
+1.5%
TodoMVC-Preact-Complex-DOM
11.695 ms
+2.3%
TodoMVC-React-Complex-DOM
31.544 ms
+0.1%
TodoMVC-React-Redux
29.516 ms
+0.8%
TodoMVC-Svelte-Complex-DOM
10.727 ms
+0.7%
TodoMVC-Vue
18.643 ms
+1.8%
TodoMVC-WebComponents
13.499 ms
+0.2%

Alternative 2: Keep the current unsafe iterator as is, but Blink developers can avoid using the unsafe iterator
C++ compilers produce code calling `Vector::begin()` `Vector::end()` `iterator::operator++` and `iterator::operator*` for range-based `for` loops, and we don't want to add CHECKs to them to keep the current performance. However, Blink developers should use a safer iterator in hand-written code. So,
Don't change Vector::begin(), Vector::end(), and the current iterator implementation
Add Vector::CheckedBegin() and Vector::CheckedEnd(), which return base::CheckedContiguousIterator instances. Blink developers should use them in hand-written code.
Other methods taking an iterator argument or returning an iterator are not changed. We can have implicit converters between the unsafe iterator and base::CheckedContiguousIterator.
Blink developers replace `begin()` with `CheckedBegin() to resolve unsafe-buffer-manipulation warnings for existing code.

How big is the fallout of changes required here? I personally find DX okay here as it's all guided by warnings.

I guess we can always exempt certain hot code in case it comes up on profiles.

Currently 764 files in third_party/blink/renderer have "#pragma allow_unsafe_buffers". Not all of them are due to WTF::Vector iterator, so I think the number of files involved is much less than 764. We can proceed this incrementally, as a part of the spanification effort.

If no one objects, I'll proceed with this approach.

// Code producing a unsafe-buffer-manipulation warning
for (auto it = v.begin(); it != v.end(); ++it) {
if (it->...) {
v.erase(it);
break;
}
}

// Resolve the warning by CheckedBegin() and CheckedEnd()
for (auto it = v.CheckedBegin(); it != v.CheckedEnd(); ++it) {
if (it->...) {
v.erase(it); // implicit conversion from base::CheckedContiguousIterator to Vector<>::iterator.
break;
}
}

On Wed, Aug 14, 2024 at 3:04 PM Michael Lippautz <mlip...@chromium.org> wrote:

On Wed, Aug 14, 2024 at 6:01 AM Kentaro Hara <har...@chromium.org> wrote:
Thanks Kent-san for the idea!

I do understand the motivation but I have a few concerns about introducing yet another type of Vector: 1) the extra type regresses developer ergonomics, 2) it's not obvious when developers should use Vector vs. UnsafeVector; e.g., even if it doesn't regress Speedometer3, it may regress the real-world LCP because # of elements stored in the Vector will depend on the workload.

+1, Speedometer is a proxy for performance and we should not just work around it.

Also, wrt to 1) we are still violating one of our key principles with GC in HeapVector (the type allows on-stack and on-heap usage). We wanted to harden this case many years ago (because it causes real confusion and discussions) but held off on it because of not wanting to introduce any additional types around Vector.

My concerns are gone if we can reduce the performance overhead of Vector enough to say "You can use Vector by default. Only when it is extremely performance-critical, please use UnsafeVector -- the usage should be limited".

+1 (see below)

A few questions:

- What is our plan for iterators of std::vector, std::map etc?

- Can we consider a (less qualified but) more lightweight check? e.g., only check whether |it| is not pointing to the end element before incrementing. I think @Daniel Cheng experimented with this idea in the past maybe?

On Wed, Aug 14, 2024 at 10:48 AM TAMURA, Kent <tk...@chromium.org> wrote:
Hi WTF owners,

As a part of Spanification project, I'd like to do the following changes:
* Rename all existing Vector and HeapVector types to UnsafeVector and HeapUnsafeVector.
* Introduce Vector and HeapVector, which use base::CheckedContiguousIterator.
* Switch back UnsafeVector/HeapUnsafeVector instances to the new Vector/HeapVector if they are not performance-critical.

Isn't one of the promises that spans and friends with their checked ops are free if they are used e.g. range-based for loops when then the compiler can prove that there's no OOB? Shouldn't this be our regular case?

I am concerned about switching something as broadly used as Vector to a different version to only incrementally convert it back then. I am a bit surprised that a checked iterator here would actually regress anything because we already have a CHECK() in at() that is also used by operator[].

Do you have any feedback on this plan?

Background:
Iterators of WTF::Vector are almost raw pointers, and even code like `++it` triggers unsafe-buffer-operation errors. Using base::CheckedContiguousIterator resolves the compilation errors, but I confirmed it regressed Speedometer3. So we should have Vector with the current unsafe iterators and Vector with base::CheckedContiguousIterator, and should use the former only in hot code paths.

--
TAMURA Kent
Software Engineer, Google

--
Kentaro Hara, Tokyo

--
TAMURA Kent
Software Engineer, Google

--
TAMURA Kent
Software Engineer, Google

--
TAMURA Kent
Software Engineer, Google

Name⋯▴	base: chromium@4ccd4ef(Variant: 0)▿	exp: chromium@4ccd4ef(Variant: 1)▿
Charts-chartjs	44.679 ms	+0.1%
Charts-observable-plot	47.839 ms	+0.4%
Editor-CodeMirror	24.150 ms	+0.3%
Editor-TipTap	78.013 ms	+0.3%
NewsSite-Next	87.927 ms	+0.4%
NewsSite-Nuxt	74.510 ms	+0.3%
Perf-Dashboard	34.905 ms	-0.1%
React-Stockcharts-SVG	82.670 ms	+0.9%
Score	25.561	-0.3%
TodoMVC-Angular-Complex-DOM	33.982 ms	+0.7%
TodoMVC-Backbone	28.726 ms	+0.1%
TodoMVC-JavaScript-ES5	49.120 ms	+0.1%
TodoMVC-JavaScript-ES6-Webpack-Complex-DOM	57.223 ms	+0.7%
TodoMVC-jQuery	120.165 ms	+0.4%
TodoMVC-Lit-Complex-DOM	22.915 ms	+0.7%
TodoMVC-Preact-Complex-DOM	15.526 ms	-0.1%
TodoMVC-React-Complex-DOM	34.814 ms	-0.4%
TodoMVC-React-Redux	35.314 ms	+0.1%
TodoMVC-Svelte-Complex-DOM	13.690 ms	+0.4%
TodoMVC-Vue	24.184 ms	+0.1%
TodoMVC-WebComponents	20.882 ms	+0.3%

Name⋯▴	base▿	checked▿
Charts-chartjs	42.833 ms	+1.0%
Charts-observable-plot	41.956 ms	+1.7%
Editor-CodeMirror	22.060 ms	+1.3%
Editor-TipTap	71.760 ms	+0.2%
NewsSite-Next	71.426 ms	-0.4%
NewsSite-Nuxt	56.899 ms	+0.2%
Perf-Dashboard	31.001 ms	+1.7%
React-Stockcharts-SVG	72.251 ms	+1.8%
Score	31.505	-1.0%
TodoMVC-Angular-Complex-DOM	29.201 ms	+1.1%
TodoMVC-Backbone	19.864 ms	+2.0%
TodoMVC-JavaScript-ES5	35.568 ms	+0.7%
TodoMVC-JavaScript-ES6-Webpack-Complex-DOM	45.357 ms	+0.8%
TodoMVC-jQuery	96.584 ms	+0.6%
TodoMVC-Lit-Complex-DOM	17.555 ms	+1.5%
TodoMVC-Preact-Complex-DOM	11.695 ms	+2.3%
TodoMVC-React-Complex-DOM	31.544 ms	+0.1%
TodoMVC-React-Redux	29.516 ms	+0.8%
TodoMVC-Svelte-Complex-DOM	10.727 ms	+0.7%
TodoMVC-Vue	18.643 ms	+1.8%
TodoMVC-WebComponents	13.499 ms	+0.2%

--

Kentaro Hara, Tokyo

TAMURA, Kent

unread,

Sep 3, 2024, 3:25:13 AM9/3/24

to Kentaro Hara, platform-architecture-dev, Michael Lippautz, Daniel Cheng, Nico Weber, Dana Jansens, tse...@chromium.org, Daniel Cheng

Speedometer score should not be changed. If a CL using CheckedBegin()/CheckedEnd() regressed Speedometer, it should be reverted to begin()/end() with UNSAFE_BUFFERS().

Kentaro Hara

unread,

Sep 3, 2024, 4:50:10 AM9/3/24

to TAMURA, Kent, Michael Buettner, platform-architecture-dev, Michael Lippautz, Daniel Cheng, Nico Weber, Dana Jansens, tse...@chromium.org, Daniel Cheng

Speedometer score should not be changed.

Would you elaborate on why you can say this?

If a CL using CheckedBegin()/CheckedEnd() regressed Speedometer, it should be reverted to begin()/end() with UNSAFE_BUFFERS().

My concern is that a small regression might accumulate to a significant regression. I want to see some evidence to believe that we can end up with a state where most begin()/end() are replaced with CheckedBegin()/CheckedEnd() without a significant regression.

@Michael Lippautz will be the best person to make a call about Speedometer :)

--

Kentaro Hara, Tokyo

Michael Lippautz

unread,

Sep 3, 2024, 1:35:15 PM9/3/24

to Kentaro Hara, TAMURA, Kent, Michael Buettner, platform-architecture-dev, Michael Lippautz, Daniel Cheng, Nico Weber, Dana Jansens, tse...@chromium.org, Daniel Cheng

+1 to Dana's question about PGO (and probably double-checking that). I have seen a lot of iterator CHECKs being optimized away on PGO (yay!) which is why I asked.

On Tue, Sep 3, 2024 at 10:50 AM Kentaro Hara <har...@chromium.org> wrote:

Speedometer score should not be changed.

Would you elaborate on why you can say this?

If a CL using CheckedBegin()/CheckedEnd() regressed Speedometer, it should be reverted to begin()/end() with UNSAFE_BUFFERS().

My concern is that a small regression might accumulate to a significant regression. I want to see some evidence to believe that we can end up with a state where most begin()/end() are replaced with CheckedBegin()/CheckedEnd() without a significant regression.

+1 about the incremental regressions creeping in that we cannot fix as they don't show up.

I don't think there's a need for blocking anything though (as there's really no regression by just adding the safer defaults) but it would certainly be nice if we could switch the default everywhere and rely on PGO to optimize them away.

TAMURA, Kent

unread,

Sep 6, 2024, 9:57:47 AM9/6/24

to Michael Lippautz, Kentaro Hara, Michael Buettner, platform-architecture-dev, Daniel Cheng, Nico Weber, Dana Jansens, tse...@chromium.org, Daniel Cheng

On Wed, Sep 4, 2024 at 2:35 AM Michael Lippautz <mlip...@chromium.org> wrote:

+1 to Dana's question about PGO (and probably double-checking that). I have seen a lot of iterator CHECKs being optimized away on PGO (yay!) which is why I asked.

I generated profile data after building a binary with the change, and rebuilt a binary with the profile data.

On Tue, Sep 3, 2024 at 10:50 AM Kentaro Hara <har...@chromium.org> wrote:
Speedometer score should not be changed.

Would you elaborate on why you can say this?

If a CL using CheckedBegin()/CheckedEnd() regressed Speedometer, it should be reverted to begin()/end() with UNSAFE_BUFFERS().

My concern is that a small regression might accumulate to a significant regression. I want to see some evidence to believe that we can end up with a state where most begin()/end() are replaced with CheckedBegin()/CheckedEnd() without a significant regression.

+1 about the incremental regressions creeping in that we cannot fix as they don't show up.

Ok, I understand your concern.

If we changed begin()/end() so that they return base::CheckedContiguousIterator, we would have 0.2% regression of Speedometer3 score. The impact of the latest proposal would be smaller than it because we would update a part of begin()/end() usages.

Kentaro Hara

unread,

Sep 6, 2024, 10:03:46 AM9/6/24

to TAMURA, Kent, Michael Lippautz, Michael Buettner, platform-architecture-dev, Daniel Cheng, Nico Weber, Dana Jansens, Tom Sepez, Daniel Cheng

Thanks LGTM from my side.

2024年9月6日(金) 22:57 TAMURA, Kent <tk...@chromium.org>:

danakj

unread,

Sep 6, 2024, 10:07:58 AM9/6/24

to TAMURA, Kent, Michael Lippautz, Kentaro Hara, Michael Buettner, platform-architecture-dev, Daniel Cheng, Nico Weber, tse...@chromium.org, Daniel Cheng

On Fri, Sep 6, 2024 at 9:57 AM TAMURA, Kent <tk...@chromium.org> wrote:

On Wed, Sep 4, 2024 at 2:35 AM Michael Lippautz <mlip...@chromium.org> wrote:
+1 to Dana's question about PGO (and probably double-checking that). I have seen a lot of iterator CHECKs being optimized away on PGO (yay!) which is why I asked.

I generated profile data after building a binary with the change, and rebuilt a binary with the profile data.

On Tue, Sep 3, 2024 at 10:50 AM Kentaro Hara <har...@chromium.org> wrote:
Speedometer score should not be changed.

Would you elaborate on why you can say this?

If a CL using CheckedBegin()/CheckedEnd() regressed Speedometer, it should be reverted to begin()/end() with UNSAFE_BUFFERS().

My concern is that a small regression might accumulate to a significant regression. I want to see some evidence to believe that we can end up with a state where most begin()/end() are replaced with CheckedBegin()/CheckedEnd() without a significant regression.

+1 about the incremental regressions creeping in that we cannot fix as they don't show up.

Ok, I understand your concern.
If we changed begin()/end() so that they return base::CheckedContiguousIterator, we would have 0.2% regression of Speedometer3 score. The impact of the latest proposal would be smaller than it because we would update a part of begin()/end() usages.

https://chrome-supersize.firebaseapp.com/viewer.html?load_url=https%3A%2F%2Fstorage.googleapis.com%2Fchromium-binary-size-trybot-results%2Fandroid-binary-size%2F2024%2F08%2F06%2F1980038%2Fsupersize_diff.sizediff#focus=15360 shows where the checks failed to be elided. It may be because its protecting against going out of bounds legitimately, and avoiding the chance for a memory safety security flaw in the presence of an error.

The largest impacts are logical_line_builder.cc and grid_layout_algorithm.cc, both in calls to std::stable_sort. The libc++ sort algorithms are known for walking out of bounds and causing memory safety issues if the comparator is written incorrectly. Maybe we should consider a future improvement here that replaces the sort algorithm with one that does not go out of bounds, and for which we can avoid the bounds checking. See also https://github.com/Voultapher/sort-research-rs/blob/main/writeup/sort_safety/text.md. The libc++ sort algorithm is also slow, per the benchmarking done there.

Reply all

Reply to author

Forward