Annotating CPU profiler samples with embedder values

Attila Szegedi
May 15, 2026, 10:06:49 AM
to v8-...@googlegroups.com
Hi folks,

I recently submitted both a bug[0] and a code patch[1] for annotating CPU profiler samples with embedder values; I wanted to raise awareness of it here for discussion as well.

The use cases for these annotations are manifold; in practice, at Datadog we use them to associate tracing span IDs, HTTP endpoint data, and other contextual information that our customers want to group and slice samples by in their Node.js profiles. We've been providing this functionality to our customers for the past 3 years through a workaround – we use sample timestamps to correlate samples with our values. We install our own PROF signal handler function, store the pointer to V8's own signal handler, and then delegate to V8's handler while recording the current context value and the time before and after the invocation. We can then associate each sample with the value recorded for the handler invocation whose time window contains that sample's timestamp.
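A minimal sketch of the timestamp correlation step described above, assuming the handler records are kept sorted by start time and do not overlap. The HandlerWindow struct and the ContextForSample function are hypothetical names chosen for illustration; they are not taken from the actual implementation, and the signal handler plumbing itself is left out.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <vector>

// One record per signal handler invocation (hypothetical layout).
struct HandlerWindow {
  int64_t start_us;  // time read just before delegating to V8's handler
  int64_t end_us;    // time read just after V8's handler returned
  void* context;     // context value captured for this invocation
};

// Returns the context of the window that contains sample_ts_us, if any.
// Assumes `windows` is sorted by start_us and the windows do not overlap.
std::optional<void*> ContextForSample(const std::vector<HandlerWindow>& windows,
                                      int64_t sample_ts_us) {
  // Find the first window that starts after the timestamp; the only
  // candidate that can contain the timestamp is the one just before it.
  auto it = std::upper_bound(
      windows.begin(), windows.end(), sample_ts_us,
      [](int64_t ts, const HandlerWindow& w) { return ts < w.start_us; });
  if (it == windows.begin()) return std::nullopt;  // before every window
  --it;
  if (sample_ts_us > it->end_us) return std::nullopt;  // falls in a gap
  return it->context;
}
```

A sample whose timestamp falls outside every recorded window simply gets no value, which is one reason the timestamp is only a surrogate identifier.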

Our approach works, but it's not ideal – for one thing, it doesn't work on Windows, which doesn't use UNIX signals. And frankly, we'd rather _not_ be hijacking the PROF signal and dealing with timestamps as surrogate correlation identifiers if we don't have to.

I opted for a very minimal implementation that adds a single void* to samples, in keeping with the convention for other embedder data, External values, etc. It's opaque to V8, sourced by a callback installed into the profiler, and readable through a new void* CPUProfile::GetSampleContext() API. It is in the same spirit as the additions of EmbedderStateTag and trace_id_.
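To make the shape of the idea concrete, here is a toy stand-in for a profiler, not V8 code and not the patch itself. Only the name GetSampleContext comes from the description above; ToyProfile, SampleContextCallback, SetSampleContextCallback, and TakeSample are hypothetical names, and the real signatures may well differ. It also ignores the constraints of running inside a signal handler.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical callback type: returns an opaque pointer for the current sample.
using SampleContextCallback = void* (*)();

// Toy model only; NOT V8's API.
class ToyProfile {
 public:
  // Installs the callback that supplies the opaque per-sample value.
  void SetSampleContextCallback(SampleContextCallback cb) { callback_ = cb; }

  // Called where a real profiler would record a stack sample.
  void TakeSample(int64_t timestamp_us) {
    samples_.push_back({timestamp_us, callback_ ? callback_() : nullptr});
  }

  size_t GetSamplesCount() const { return samples_.size(); }

  // The profiler never interprets the pointer; it only hands it back.
  void* GetSampleContext(size_t index) const { return samples_[index].context; }

 private:
  struct Sample {
    int64_t timestamp_us;
    void* context;
  };
  SampleContextCallback callback_ = nullptr;
  std::vector<Sample> samples_;
};
```

The point of the sketch is that the value travels with the sample itself, so no timestamp matching is needed when reading the profile back.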

One might ask why we don't "just" somehow retrofit trace_id, which is a uint64_t, so the same size (on most platforms) and opaqueness as a void*. The thing with trace_id is that, AFAICT, it has particular semantics around trace events and is also serialized to JSON; we need something more generic and don't want what look like random values rendered in JSON. E.g. in our case, that void* carries data extracted through a few dependent reads starting from the Isolate's ContinuationPreservedEmbedderData.

Anyhow, I'm curious if other folks on this list find the feature piques their interest. I'm also looking forward to reviews and naturally I'm open to further discussion.

Attila.
(Software Engineer @ Datadog, working on the Node.js profiler.)