--
--
v8-users mailing list
v8-u...@googlegroups.com
http://groups.google.com/group/v8-users
---
You received this message because you are subscribed to the Google Groups "v8-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to v8-users+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/v8-users/CAF8PYMgNXRdvW16Sb%3DwRaU21XGcMG3eBgkz_ey65%2BX7DdQ0a6g%40mail.gmail.com.
Hi Vitali,

Stabilising the cached data format as-is is pretty challenging; the cache as written is pretty much a direct field-by-field serialisation of the internal data structures, so freezing the cache would mean freezing the shapes of those internal objects, effectively making the internal fields an API-level guarantee. Furthermore, it's a backdoor to a stable bytecode format, which is something we've also pushed back on, as it severely limits our ability to work on the interpreter. If we wanted the slightly weaker constraint of at least guaranteeing backwards compatibility with old bytecode, we'd have to vastly expand our test suite with old bytecodes to try to maintain that compatibility, and even then I'm not sure we could fully guarantee it if some edge case isn't covered by the test suite. The same goes for porting code caches from older to newer versions: such a port would require a mapping from old to new, which would require a) some sort of log of which old fields/bytecodes translate to which new ones, and b) heavy testing to make sure that this mapping is valid.

This is also a big security problem: the deserialisation is pretty dumb (for performance reasons) and just spits out data onto the V8 heap without, e.g., checking whether the number of fields matches. Bugs in the old-to-new mapping, or in the backwards compatibility, would open up a whole Pandora's box of security issues, where one deleted field in an edge case the tests don't cover becomes an out-of-bounds write widget.

Given that this would greatly increase our development complexity (maintaining a stable API is already a lot of trouble for us), would be a big source of security issues, and wouldn't, I expect, provide much benefit for Chrome (since we expect websites to change more often than Chrome versions), I don't see us working on, or accepting patches for, a stable or even upgradeable cache.

I'd be curious to know whether you've actually observed/measured script parse time being a big problem, or whether you're more seeing issues due to lazy function compilation time. We've done a lot of work on parse time in recent years, so it's not as slow as (some) people assume. We're also prototyping a potential stable and standardisable snapshot format for the results of partial script execution, which could help you if large script "setup" code is the issue, but it wouldn't store compiled bytecode (for the above reasons).

I appreciate that this might be a disappointing answer for you, but having flexibility with internal objects and bytecode is one of the things that allows us to stay performant and secure.
What's the best way to measure script parse time vs lazy function compilation time? It's been a few months since I last looked at this, so my memory is a bit hazy on whether I measured instantiating v8::ScriptCompiler::Source, v8::ScriptCompiler::CompileUnboundScript, or the combined time of both (although I suspect both count as script parse time?). I do recall that on my laptop, using the code cache basically halved the time I was measuring on larger scripts, and I suspect I was looking at the overall time to instantiate the isolate with a script (it was a no-op on smaller scripts, so I suspect we're talking about script parse time).

FWIW, if it's helpful: when I profiled a stress test of isolate construction on my machine with a release build, I saw V8 spending a lot of time deserializing the snapshot (seemingly once for the isolate and then again for the context).
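(For reference, a hedged sketch of how those two phases map onto the public embedder API — this is a fragment, not code from the thread; it assumes an already-initialized v8::Isolate* isolate, a v8::Local<v8::String> source_string, and cache_bytes/cache_length previously produced by v8::ScriptCompiler::CreateCodeCache:)

```cpp
// Sketch: "Source construction" vs "compile" as separate phases.
// Assumes: v8::Isolate* isolate, v8::Local<v8::String> source_string,
// and (cache_bytes, cache_length) from a prior CreateCodeCache() call.

// Phase 1: wrap the script text plus the cached data.
auto* cache =
    new v8::ScriptCompiler::CachedData(cache_bytes, cache_length);
v8::ScriptCompiler::Source source(source_string, cache);

// Phase 2: compile, consuming the cache instead of parsing from scratch.
v8::Local<v8::UnboundScript> script =
    v8::ScriptCompiler::CompileUnboundScript(
        isolate, &source, v8::ScriptCompiler::kConsumeCodeCache)
        .ToLocalChecked();

// After compiling, source.GetCachedData()->rejected reports whether the
// cache was actually consumed or silently discarded (e.g. on a version
// mismatch), which matters when timing "with cache" vs "without".
```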
Breakdown of the flamegraph:
* ~22% of total runtime to run NewContextFromSnapshot. Within that, ~5% of total runtime was spent just decompressing the snapshot and the rest (17%) was deserializing it. I thought there was only 1 snapshot. Couldn't the decompression happen once in V8System instead?
* 9% of total runtime spent decompressing the snapshot for the isolate (in other words, 14% of total runtime was spent decompressing the snapshot).
In our use-case we construct a lot of isolates in the same process. I'm curious whether there are opportunities to extend V8 to use copy-on-write (COW) to reduce the memory and CPU impact of deserializing the snapshot multiple times. Is my guess correct that deserialization is actually doing non-trivial things like relocating objects, or do you think a zero-copy approach could be taken with serializing/deserializing the snapshot, so that it's prebuilt in the right format (perhaps even without any compression)?
I fully understand. I'm definitely interested in the snapshot format since presumably anything that helps the web here will also help us. Is there a paper I can reference to read up more on the proposal? I've seen a few in the wild from the broader JS community but nothing about V8's plans here. I have no idea if that will help our workload but it's certainly something we're open to exploring.
Complementary data-parallel and task-parallel JSON lexing implementations already exist (https://github.com/simdjson/simdjson, https://github.com/mogill/parallel-xml2json), but there's no way to store the results in a form V8 can use, so an extra copy-in/copy-out step is needed.

I see several "bookend" options which may be combined to varying degrees:
- A lowest common denominator data storage sequence is defined by V8
- Applications gain the ability to introspect V8's object storage sequences at runtime
- Applications can tell V8 how data is stored in memory, and V8 can adapt to an existing storage sequence
On Fri, Jul 23, 2021 at 1:18 AM Vitali Lovich <vlo...@gmail.com> wrote:

> What's the best way to measure script parse time vs lazy function compilation time? It's been a few months since I last looked at this, so my memory is a bit hazy on whether I measured instantiating v8::ScriptCompiler::Source, v8::ScriptCompiler::CompileUnboundScript, or the combined time of both (although I suspect both count as script parse time?). I do recall that on my laptop, using the code cache basically halved the time I was measuring on larger scripts, and I suspect I was looking at the overall time to instantiate the isolate with a script (it was a no-op on smaller scripts, so I suspect we're talking about script parse time).

The best way is to run with --runtime-call-stats; this will give you detailed scoped timers for almost everything we do, including compilation. Script deserialisation is certainly faster than script compilation, so I'm not surprised it has a big impact when the two are compared against each other; I'm more curious how it compares to overall worklet runtime.

> FWIW, if it's helpful: when I profiled a stress test of isolate construction on my machine with a release build, I saw V8 spending a lot of time deserializing the snapshot (seemingly once for the isolate and then again for the context).

Yeah, the isolate snapshot is the ~immutable context-independent one (think of things like the "undefined" value), which is deserialized once per isolate, and the context snapshot is the things that are mutable (think of things like the "Math" object) that have to be fresh per new context. Note that these snapshots use the same mechanism as the code cache snapshot, but are otherwise entirely distinct.

> Breakdown of the flamegraph:
> * ~22% of total runtime to run NewContextFromSnapshot. Within that, ~5% of total runtime was spent just decompressing the snapshot and the rest (17%) was deserializing it. I thought there was only 1 snapshot. Couldn't the decompression happen once in V8System instead?

It's possible that the decompression could happen once per isolate, although there is the memory impact to consider.

> * 9% of total runtime spent decompressing the snapshot for the isolate (in other words, 14% of total runtime was spent decompressing the snapshot).
>
> In our use-case we construct a lot of isolates in the same process. I'm curious whether there are opportunities to extend V8 to use COW to reduce the memory and CPU impact of deserializing the snapshot multiple times. Is my guess correct that deserialization is actually doing non-trivial things like relocating objects, or do you think a zero-copy approach could be taken with serializing/deserializing the snapshot, so that it's prebuilt in the right format (perhaps even without any compression)?

There are definitely relocations happening during deserialisation. For the isolate, we've wanted to share the "read-only space", which contains immutable immortal objects (like "undefined"), but under pointer compression this has technical issues because of limited guarantees when using mmap (IIRC). I imagine COW for the context snapshot would have similar issues, combined with the COW getting immediately defeated as soon as the GC runs (because it has to mutate the data to set mark bits). It's a direction worth exploring, but hasn't been enough of a priority for us.

Another thing we're considering looking into is deserializing the context snapshot lazily, so that unused functions/classes never get deserialized in the first place. Again, not something we've had time to prioritise, but something we're much more likely to work on at some point in the future, since it becomes more relevant to the web every time new functionality is introduced.

> I fully understand. I'm definitely interested in the snapshot format since presumably anything that helps the web here will also help us. Is there a paper I can reference to read up more on the proposal? I've seen a few in the wild from the broader JS community but nothing about V8's plans here. I have no idea if that will help our workload but it's certainly something we're open to exploring.

You're probably thinking of BinaryAST, which is unrelated to this. We haven't talked much about web snapshots yet, because it's still very preliminary, very prototypy, and we don't want to make any promises or guarantees around it ever materialising. +Marja Hölttä is leading this effort; she'll know the current state.