https://chromium.googlesource.com/v8/v8/+/e895b7af73728b0f1431549dbd52db37e8b1b577commit e895b7af73728b0f1431549dbd52db37e8b1b577
Author: Seth Brenith <
seth.b...@microsoft.com>
Date: Mon Jul 25 15:39:46 2022
Background merging of deserialized scripts
Recently,
https://crrev.com/c/v8/v8/+/3681880 added new API functions
with which an embedder could request that V8 merge newly deserialized
script data into an existing Script from the Isolate's compilation
cache. This change implements those new functions. This functionality is
still disabled by default due to the flag
merge_background_deserialized_script_with_compilation_cache.
The goal of this new functionality is to reduce memory usage when
multiple frames load the same script with a long delay between (long
enough for the script to have been evicted from Blink's in-memory cache
and for the top-level SharedFunctionInfo to be flushed). In that case,
there are two Script objects for the same script: one which was found in
the Isolate compilation cache (the "old" script), and one which was
recently deserialized (the "new" script). The new script's object graph
is essentially standalone: it may point to internalized strings and
readonly objects such as the empty feedback metadata, but otherwise
it is unconnected to the rest of the heap. The merging logic takes any
useful data from the new script's object graph and attaches it into the
old script's object graph, so that the new Script object and any other
duplicated objects can be discarded. More specifically:
1. If the new Script has a SharedFunctionInfo for a particular function
literal, and the old Script does not, then the old Script is updated
to refer to the new SharedFunctionInfo.
2. If the new Script has a compiled SharedFunctionInfo for a particular
function literal, and the old Script has an uncompiled
SharedFunctionInfo, then the old SharedFunctionInfo is updated to
point to the function_data and feedback_metadata from the new
SharedFunctionInfo.
3. If any used object from the new object graph points to a
SharedFunctionInfo, where the old object graph contains a matching
SharedFunctionInfo for the same function literal, then that pointer
is updated to point to the old SharedFunctionInfo.
The document at [0] includes diagrams showing an example merge on a very
small script.
Steps 1 and 2 above are pretty simple, but step 3 requires walking a
possibly large set of objects, so this new API lets the embedder run
step 3 from a background thread. Steps 1 and 2 are performed later, on
the main thread.
The next important question is: in what ways can the old script's object
graph be modified during the background execution of step 3, or during
the time after step 3 but before steps 1 and 2?
A. SharedFunctionInfos can go from compiled to uncompiled due to
flushing. This is okay; the worst outcome is that the function would
need to be compiled again later. Such a risk is already present,
since V8 doesn't keep IsCompiledScopes for every compiled function in
a background-deserialized script.
B. SharedFunctionInfos can go from uncompiled to compiled due to lazy
compilation. This is also okay; the merge completion logic on the
main thread will just keep this lazily compiled data rather than
inserting compiled data from the newly deserialized object graph.
C. SharedFunctionInfos can be cleared from the Script's weak array if
they are no longer referenced. This is mostly okay, because any
SharedFunctionInfo that is needed by the background merge is strongly
referenced and therefore can't be cleared. The only problem arises if
the top-level SharedFunctionInfo gets cleared, so the merge task must
deliberately keep a reference to that one.
D. SharedFunctionInfos can be created if they are needed due to lazy
compilation of a parent function. This change is somewhat troublesome
because it invalidates the background thread's work and requires a
re-traversal on the main thread to update any pointers that should
point to this lazily compiled SharedFunctionInfo.
At a high level, this change implements three previously unimplemented
functions in BackgroundDeserializeTask (in compiler.cc) and updates one:
- BackgroundDeserializeTask::SourceTextAvailable, run on the main
thread, checks whether there is a matching Script in the Isolate
compilation cache which doesn't already have a top-level
SharedFunctionInfo. If so, it saves that Script in a persistent
handle.
- BackgroundDeserializeTask::ShouldMergeWithExistingScript checks
whether the persistent handle from the first step exists (a fast
operation which can be called from any thread).
- BackgroundDeserializeTask::MergeWithExistingScript, run on a
background thread, performs step 3 of the merge described above and
generates lists of persistent data describing how the main thread can
complete the merge.
- BackgroundDeserializeTask::Finish is updated to perform the merge
steps 1 and 2 listed above, as well as a possible re-traversal of the
graph if required due to newly created SharedFunctionInfos in the old
Script.
The merge logic has nothing to do with deserialization, and indeed I
hope to reuse it for background compilation tasks as well, so it is all
contained within a new class BackgroundMergeTask (in compiler.h,cc). It
uses a second class, ForwardPointersVisitor (in compiler.cc) to perform
the object visitation that updates pointers to SharedFunctionInfos.
[0]
https://docs.google.com/document/d/1UksB5Vm7TT1-f3S9W1dK_rP9jKn_ly0WVm_UDPpWuBw/editBug: v8:12808
Change-Id: Id405869e9d5b106ca7afd9c4b08cb5813e6852c6
Reviewed-on:
https://chromium-review.googlesource.com/c/v8/v8/+/3739232