[Proposal] Let GC learn business context via SPI: a hybrid Arena + GC memory model for V8


Tao Chen

Mar 28, 2026, 10:23:59 AM
to v8-...@googlegroups.com

Hi V8 team,

I recently proposed a hybrid Arena + GC memory model for JavaScript runtimes, initially posted in the Node.js discussions since that was my entry point. On reflection, however, this idea is fundamentally a V8-level concern: the SPI, the dual-heap architecture, and the write barrier extension all live inside the engine, not the runtime above it.

The Node.js discussion is here for reference: https://github.com/orgs/nodejs/discussions/5143

The core idea in one sentence: Let GC learn business context via SPI. Rather than letting the GC guess which objects are short-lived, expose a minimal SPI that allows the hosting runtime (Node, Deno, or any embedder) to communicate request-scoped lifetime boundaries to the engine — enabling Arena allocation for the common case, with automatic promotion to GC for objects that genuinely escape.
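To make the shape of the idea concrete, here is a minimal toy sketch of what such an SPI might look like. Everything here (ArenaScope, Allocate, Reset) is a hypothetical name invented for illustration; none of it is an existing V8 API, and real escape promotion is elided.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of the proposed SPI: the embedder brackets a
// request with an arena scope; allocations inside it are bump-pointer
// allocated with no GC bookkeeping, and anything still referenced from
// outside at scope exit would be promoted to the regular GC heap
// (promotion not shown here).
struct ArenaScope {
  std::vector<char> buffer;
  size_t top = 0;
  explicit ArenaScope(size_t bytes) : buffer(bytes) {}

  // Bump allocation: O(1) per object.
  void* Allocate(size_t bytes) {
    if (top + bytes > buffer.size()) return nullptr;  // would fall back to GC heap
    void* p = buffer.data() + top;
    top += bytes;
    return p;
  }

  // At request end the whole arena is discarded in one step.
  void Reset() { top = 0; }
};
```

The appeal is that the common case (nothing escapes) frees an entire request's allocations in constant time; the hard part, as the discussion below shows, is what happens when something does escape.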

This is relevant beyond Node. Chrome's Service Workers and Web Workers share the same short-lived context semantics and could benefit from the same mechanism.

Full proposal: https://github.com/babyfish-ct/node-arena-gc-hybrid

I recognize this proposal asks for a non-trivial change to V8. I deliberated for a while before sending this — but the potential upside felt too significant to stay silent.

If GC can learn business patterns from the hosting runtime via SPI, developers no longer have to choose between the expressive safety of GC languages and the performance of non-GC languages. That trade-off has been accepted as inevitable for decades. This proposal suggests it doesn't have to be.

I'd genuinely love to hear your thoughts, even if the answer is "interesting, but not feasible."

Jakob Kummerow

Mar 28, 2026, 11:21:35 AM
to v8-...@googlegroups.com
Between the pages and pages of slop, there's a short paragraph that sums it up: this describes generational GC. Which we already have. The remaining difference is that the system we have does everything automatically, whereas this proposal suggests an explicit way for embedders to say "run a scavenge now". It's unclear whether that would have any benefit; it might well end up doing more work.

AFAICT, the repeated claims of "zero cost" ignore that the cost was moved, not eliminated: the work of a scavenger run is mostly equivalent to the work that the suggested write barrier would have to do when it promotes objects on demand. And doing it on demand is almost certainly more expensive, because you'd have to scan the entire arena every time to find pointers to update, which the scavenger has to do only once, when the young generation is full.
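The scan cost being described can be illustrated with a toy model (plain C++, nothing V8-specific; all names are made up for illustration): objects in an arena hold raw pointers to each other, so moving one object out invalidates every intra-arena pointer to it, and without extra bookkeeping the only way to find those pointers is a full scan of the arena on every promotion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of on-demand promotion. Objects live in an arena and hold
// raw pointers to each other; evacuating one object forces a visit to
// every slot in the arena to patch references to the old location.
struct Node { Node* next = nullptr; };

// Moves `victim` to `new_home` and returns how many arena slots had to
// be visited to fix up pointers -- the whole arena, every time.
size_t PromoteAndFixup(std::vector<Node>& arena, Node* victim, Node* new_home) {
  *new_home = *victim;           // copy the object out of the arena
  size_t visited = 0;
  for (Node& n : arena) {        // full-arena scan: the cost in question
    ++visited;
    if (n.next == victim) n.next = new_home;
  }
  return visited;
}
```

A scavenger pays a comparable traversal cost once per collection cycle; paying it once per escaping object is where the on-demand scheme loses.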

So, in conclusion: you are right to think that generational GC is a great way to reduce the cost of GC. Which is why we already have a generational GC. Next time you submit a proposal, please be a lot more concise. Tell your AI to produce a half-page summary, not multiple pages of exaggerated claims.



Tao Chen

Mar 28, 2026, 1:19:28 PM
to v8-...@googlegroups.com
First of all, I sincerely apologize. My limited English made me hesitant to write directly, so I handed the idea to an AI and let it produce the article, without considering that AI tends to pad everything with unnecessary verbosity.

Your point — "you'd have to scan the entire arena every time to find pointers to update" — is the key insight I missed. I was too hasty and didn't think the problem through. I was focused on the idea of using business context to redefine the young generation — making it large instead of small — but completely overlooked the pointer fixup cost caused by the heavy cross-references between Arena objects.
Two things are now clear to me:

Mainstream GC implementations optimize for normal code patterns. Abandoning pointer-based object references in favor of handles just to avoid fixup costs is not a realistic trade-off.

Even a more sophisticated approach, such as a remembered set to narrow the fixup work for non-promoted Arena objects, may not yield a net win. And even if it helps somewhat, whether it justifies the engineering cost deserves serious scrutiny; as an expert in GC implementation, you can likely sense that this optimization might not pay off either.
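For what it's worth, the remembered-set idea mentioned above can be sketched in the same toy model (hypothetical names again, not V8 code): a write barrier records which slots hold interesting pointers, so fixup visits only those recorded slots instead of scanning the whole arena.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

struct Obj { Obj* ref = nullptr; };

// Toy remembered set: every pointer store goes through the barrier,
// which records the slot's address for later fixup.
struct RememberedSet {
  std::unordered_set<Obj**> slots;

  void WriteBarrier(Obj** slot, Obj* value) {
    *slot = value;
    slots.insert(slot);  // remember this slot
  }

  // Fixup touches only recorded slots, not every object.
  size_t Fixup(Obj* from, Obj* to) {
    size_t visited = 0;
    for (Obj** slot : slots) {
      ++visited;
      if (*slot == from) *slot = to;
    }
    return visited;
  }
};
```

The trade is familiar from generational GC: the scan gets cheaper, but every pointer store now pays the barrier, which is exactly why the net effect is uncertain.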

On Sat, Mar 28, 2026 at 23:21, Jakob Kummerow <jkum...@chromium.org> wrote:
--
v8-dev mailing list
v8-...@googlegroups.com
http://groups.google.com/group/v8-dev

Tao Chen

Mar 28, 2026, 1:33:55 PM
to v8-...@googlegroups.com
One more thought I want to add, not to argue, but because I think it's worth considering together.

Each request's Arena is a complete island — no cross-Arena references exist by design. This means the pointer fixup scope is strictly bounded to a single Arena, not the entire arena heap.

This doesn't eliminate the cost you described. But it does constrain it to a predictable, bounded scope. Whether that constraint changes the cost calculus enough to matter — I genuinely don't know. You have far more intuition about this than I do.
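A toy illustration of that bounded scope (hypothetical code, not V8): if the runtime keeps one arena per in-flight request and no pointers cross arena boundaries, fixup after a promotion walks only the arena the object came from, regardless of how many other arenas exist or how large they are.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node { Node* next = nullptr; };

// A runtime hosting one arena per in-flight request. Because no
// pointers cross arena boundaries by construction, fixup after a
// promotion is bounded by the size of that single arena.
struct Runtime {
  std::vector<std::vector<Node>> arenas;  // one arena per request

  size_t FixupAfterPromotion(size_t arena_index, Node* from, Node* to) {
    size_t visited = 0;
    for (Node& n : arenas[arena_index]) {  // other arenas are never touched
      ++visited;
      if (n.next == from) n.next = to;
    }
    return visited;
  }
};
```

This bounds the per-promotion cost but does not remove it: each escaping object still pays a scan proportional to its own arena's size.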

Still learning. Thanks for engaging.

On Sun, Mar 29, 2026 at 01:19, Tao Chen <babyf...@gmail.com> wrote: