Design doc: Oilpan Caged Heap Expansion

81 views
Skip to first unread message

Kevin Babbitt

unread,
Mar 29, 2023, 7:33:08 PM3/29/23
to v8-dev...@googlegroups.com

Hi all,

 

I wrote a design doc with a proposal to give embedders an option to expand the Oilpan caged heap: https://docs.google.com/document/d/1yGAsu_41rU8_hGQ9tcSKH84Em3vj3uzw_c0YlL7SCjA/edit?usp=sharing

 

Thanks,

Kevin

 

Anton Bikineev

unread,
Mar 30, 2023, 11:11:22 AM3/30/23
to v8-dev
(looks like I sent the message privately, repeating here for visibility)

Thanks Kevin for writing up the proposal! I have a couple of comments/concerns about the proposal.

Have you investigated the OOMs resulting from Oilpan cage exhaustion? Since we returned the 4GB cage, we rarely see any in Chrome. Such OOMs usually indicate leaks. Small cages actually help manifest and fix them earlier.

Re the shift and the 100KB increase. The branchless design and the alignment bits do seamlessly allow to increase the cage size. However, there may be a performance penalty associated:
  • Since the decompression snippet becomes larger, this may add additional pressure on instruction cache.
  • CyclesPerInstruction of add is known to be 2x smaller than that of shl/sal, even on the modern IceLake/TigerLake. This may have negative impact on the workloads that have non-dependent decompressions, e.g.:
for (auto* member: heap_vector)
member->do_something();

I assume this is why llvm performs this optimization for the 4GB cage.

Kevin Babbitt

unread,
Apr 5, 2023, 5:05:50 PM4/5/23
to v8-...@googlegroups.com

Thanks Anton!

 

Re: investigation of OOMs – the motivation for doing this work was based on bulk analysis of crash data coming into Edge from Windows devices. What I found was that a non-trivial percentage of renderer OOMs had the v8-oom-location crash key set to a value indicating we had exhausted the Oilpan heap, and of those crashes a large fraction showed plenty of available commit on the system. I can certainly believe that some fraction of these crashes are leaks. I can sample a few dumps to get a better sense, but I’m not sure how to distinguish a leak in a post-mortem context – any suggestions on what to look for?

 

Re: performance – I ran a few Pinpoint jobs on a change implementing the 3-bit shift plus a larger reservation to ensure the correct bit pattern for a 4GB cage. Speedometer was perf-neutral. blink_perf.bindings and blink_perf.css showed a mix of improvements and regressions. Based on my past experience with the blink_perf.* benchmarks I’d say the results are within the margin of error, but let me know if you see something concerning or would like me to run some additional tests.

Speedometer: https://pinpoint-dot-chromeperf.appspot.com/job/13c2308e460000

blink_perf.bindings: https://pinpoint-dot-chromeperf.appspot.com/job/15b7a7ba460000

blink_perf.css: https://pinpoint-dot-chromeperf.appspot.com/job/16b8dc08460000

 

Kevin

--
--
v8-dev mailing list
v8-...@googlegroups.com
http://groups.google.com/group/v8-dev
---
You received this message because you are subscribed to the Google Groups "v8-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to v8-dev+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/v8-dev/0abb08d9-b724-4289-bebb-88125ef05d11n%40googlegroups.com.

Anton Bikineev

unread,
Apr 16, 2023, 3:40:03 PM4/16/23
to v8-...@googlegroups.com
I can sample a few dumps to get a better sense, but I’m not sure how to distinguish a leak in a post-mortem context – any suggestions on what to look for?
It's tricky indeed, unless there is a repro. If an OOM is consistently reproduced on a specific URL, you could look at the heap snapshot in devtools ("cppgc_enable_object_names = true" must be present in GN).

Re: performance – I ran a few Pinpoint jobs on a change implementing the 3-bit shift plus a larger reservation to ensure the correct bit pattern for a 4GB cage. Speedometer was perf-neutral.
Good, thanks for checking! I think that the proposed change shouldn't cause a binary size regression on Android (or generally on ARM/AArch64). However, just for safety, it'd be great to gate the feature behind a compile-time flag, e.g. "#ifdef CPPGC_ENABLE_LARGER_CAGE".

Kevin Babbitt

unread,
Apr 18, 2023, 1:53:53 PM4/18/23
to v8-...@googlegroups.com

> However, just for safety, it'd be great to gate the feature behind a compile-time flag, e.g. "#ifdef CPPGC_ENABLE_LARGER_CAGE".

 

SGTM, will do. I’ve opened https://crbug.com/1434388 to track the work. Thanks!

Reply all
Reply to author
Forward
0 new messages