Hi all,
TL;DR: we have sample context functionality in Datadog's Node.js profiler built on top of v8::CpuProfiler. It works but we want to improve on it. We attempted a few things that don't work very well. Our ultimate realization is that we need V8 itself to support this.
Longer version:
We'd like to propose that V8 should have a supported API for attaching context to profiler samples. As prior art, the Go programming language supports a similar feature already[1][2].
Speculatively, it could look like this pair of methods:
void v8::CpuProfiler::SetSampleContext(v8::Local<v8::Value>)
and
v8::Local<v8::Value> v8::CpuProfile::GetSampleContext(int index)
(similar to e.g. CpuProfile::GetSampleTimestamp(int index).)These would work when CpuProfiler::Start is called with record_samples=true to record individual samples and not just aggregates.
When the profiler captures a sample, it’d also associate the sample with the currently set sample context. A single context can get associated with any number of samples including zero if it was replaced with another SetSampleContext call before a sample was taken. On the other hand, if the context is not replaced for some time then it can end up associated with multiple samples as well.
A hopefully obvious requirement is that these sample contexts would also have to be propagated into continuations. The isolate would thus need to track "Continuation Preserved Profiler Data" similar to how it currently tracks ContinuationPreservedEmbedderData (CPED.) It would most likely need to be a map, as there can be multiple active CPU profilers at any given time, but I don't want to get into implementation details much off the bat.
We'd also vastly prefer if the profiler could work with being passed a Local<Value> as the above method signatures suggest. The other choice would be to represent the sample context as void*, but I think it's a suboptimal choice as then ownership of the value and correct freeing of whatever was passed in would become an issue and might even require additional assistance from the CpuProfiler, complicating the API. Using GC managed values frees us from this – if a value gets associated with a sample, it remains alive; if a value is replaced with another before it gets associated with a sample, it is eventually collected (presuming it is not referenced from elsewhere, obviously.) A compromise solution would be to have some other limited context representation, e.g. a map of string key-value pairs if that would be easier to implement.
As an added bonus, the concept of sample contexts could be extended to the HeapProfiler as well – we had customers ask for context information associated with our Heap Live Size reports. It might even be simpler to implement for HeapProfiler (seeing how that one doesn't require signal handling) but even having it for CpuProfiler first would be great.
I think this is enough as a discussion starter. Let me know what you think, we'd definitely like to collaborate on this!
I’m also attaching an expose on the background of this proposal, our current state-of-the-art and things we tried recently to improve it; it’s a bit lengthy and entirely optional so I saved it for after the proposal itself:
BACKGROUND
==========
At Datadog we ship a Node.js profiler built on top of V8 profiler. A significant functional addition we have is sample contexts – we add contextual information from our tracer to the samples. This is typically the span ID so we can correlate samples to traces as well as an endpoint ID (e.g. a HTTP method and a path.) The way we achieve this has several elements, but it mostly hinges on us intercepting the PROF signal, recording TimeTicks::Now before and after we delegate to V8's signal handler, and storing the current context in a ring buffer with the timestamps. Later when we’re processing a v8::CpuProfile, we find the element in the ring buffer whose timestamps bracket the CpuProfile::GetSampleTimestamp() value and associate it with the context object in that element.
Setting the current context from the JS side is currently implemented by registering a Node.js async_hooks.createHook callback. On each async context switch – be it a V8 promise or a Node.js timeout etc. we will pull the current tracing contextual information and call a setter on the profiler.
These contexts are JS objects, and on the profiler's C++ side the profiler has basically a single std::shared_ptr<v8::Global<v8::Value>> private field that is updated to the new object.
This approach works, but it has some drawbacks. Node.js wants to deprecate async_hooks.createHook so we want to move away from using it. Under the hood it uses v8::Isolate::SetPromiseHook which – AFAICT – triggers some deoptimizations.
Recently we tried a different approach based on Isolate::SetContinuationPreservedEmbedderData(), seeing how it's also used by default for implementation of AsyncLocalStorage API in Node.js 24+. The idea here was to associate the sample context with the object Node.js stores in CPED. This way, we get async context propagation for free and can stop using async_hooks callbacks.
Unfortunately, now our signal handler code would need to call Isolate::GetContinuationPreservedEmbedderData(), which returns a v8::Local<Value> thus it needs to create a handle. We did some proofs of concept, and everything worked fine to our own astonishment – until it didn't. Calling a V8 API that returns a Local<> risks triggering allocation of a new Address array block in HandleScope::Extend, which is not signal safe. We were able to produce several fun looking crashes and even malloc deadlocks with it.
We could spend time trying to fix this. Like, use the ucontext passed to the signal handler to identify whether "known dangerous" methods are on stack – anything that has the chance to mutate isolate->handle_scope_implementer()->blocks(), but also __GI__malloc and probably few others. But it’s awfully brittle ensuring we covered all possible bases.
We don’t see an external way to implement a sample context solution that is signal safe, follows continuations, and doesn’t require a SetPromiseHook.
Hence the above proposal.
Attila.
--
[1]
https://rakyll.org/profiler-labels/[2]
https://felixge.de/2022/02/11/connecting-go-profiling-with-tracing/