We're excited to share Perfetto v55.1! Highlights include GPU-accelerated trace rendering, a new Heap Dump Explorer, native Linux heap profiling, a redesigned trace_processor shell, multi-GPU analysis, and a major documentation refresh.
Note: v55.0 was never released due to a build failure on Windows and an issue with parsing of some important ftrace events. v55.1 includes fixes to these issues.
Opening big traces (hundreds of processes, thousands of tracks, millions of slices) should be significantly smoother in v55. Two changes drive this:
Investigating Android and Java memory issues gets a significant improvement in the Perfetto UI through the new Heap Dump Explorer page. Instead of just showing the flamegraph for the heap dump, it is an interactive tool for exploring the classes, objects, and references in the dump. Specifically:
.hprof heap dump).

The Java HPROF importer also got a major upgrade: primitive fields, array contents, and Bitmap-specific metadata are now preserved when parsing the traces to power this functionality.
Learn more in the Heap Dump Explorer guide
heap_profile host: native Linux heap profiling

heap_profile is no longer Android-only. The new host subcommand lets you capture an allocation-attributed memory profile of a local Linux process and open the result in the same UI:
tools/heap_profile host -- ./my_binary --some-flag
It auto-downloads the bits it needs, launches a bundled daemon, and produces a flamegraph that behaves exactly like the Android equivalent (drill in by stack frame, deobfuscate symbols, etc.). The existing Android workflow lives under heap_profile android and is unchanged.
Get started with native heap profiling on Linux
trace_processor_shell

trace_processor_shell has been rebuilt around purpose-built subcommands that read more like git, replacing the flat-CLI-with-flags model:
trace_processor query # one-shot SQL
trace_processor interactive # REPL
trace_processor server # serve the UI over RPC
trace_processor metrics # run metrics
trace_processor summarize # high-level summaries
trace_processor export # write to other formats
trace_processor convert # legacy conversions
The query subcommand also gained a structured query mode for programmatic callers. Existing scripts that use the classic flat CLI keep working just as they do today, but all new functionality will be exposed through subcommands only.
New trace_processor subcommand docs
Trace Processor now models multiple GPUs as first-class citizens. The new gpu and gpu_context tables let you break down slice and memory tracks per GPU and per machine, so workloads spanning a discrete + integrated pair, a multi-GPU host, or a multi-machine setup are attributed correctly all the way through analysis.
On the recording side, the GPU data source gained InstrumentedSamplingConfig for fine-grained sampling control and custom counter groups in GpuCounterConfig. CUDA and HIP were also added to the graphics-context APIs, which is useful for compute workloads beyond traditional rendering.
The Python TraceProcessor API can now return query results as polars DataFrames as well as pandas.
df = tp.query("SELECT * FROM slice").as_polars_dataframe()
The Data Explorer (formerly "Explore") now has a full dashboard environment with side-by-side Graph and Dashboard views, an in-sidebar chart editor, smarter graph with cycle detection, import/export of dashboards between sessions, and built-in tutorials & solution recipes.
A handful of new chart and aggregation types ship across the UI:
android.aflags data source captures Android aconfig flags in effect during the trace, surfaced through Trace Processor and the UI.
/proc/slabinfo polling added to sys_stats for tracking kernel slab allocations.
traced_perf gained a CPU filter for per-CPU sampling, and raw perf events can specify a dynamic PMU by name.
trace_all_machines in TraceConfig).

Perfetto can now ingest more formats end-to-end:
The Java SDK now understands @CompileTimeConstant annotations and static track names, so instrumentation can be more efficient when the compiler can prove the strings are constant.
The documentation site received a significant facelift.
heap_profile with both host and android subcommands
traceconv (rewritten from scratch)
trace_processor shell + C++ embedding
traced_relay
trace_processor and traceconv on Windows

TraceMetadata proto was renamed to TraceAttributes.
tid field in thread_descriptor.proto is now int64 (matters for Windows, which has large thread IDs).
dev.perfetto.Catapult plugin if you need them.
arg_set_id column was added to the thread table; queries joining thread with other tables that have arg_set_id (e.g. slice) may need to qualify the column name to avoid ambiguity.

A huge thanks to everyone, inside and outside Google, who contributed to making Perfetto v55 a success. ♥️
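The arg_set_id ambiguity noted above can be reproduced outside Perfetto. What follows is a minimal sketch using Python's built-in sqlite3 module with a toy schema. The two tables, their columns (including a hypothetical utid join key on slice), and the sample rows are assumptions for illustration only and do not mirror the real Trace Processor tables. It shows an unqualified arg_set_id being rejected once both tables carry the column, and the table-qualified form working.

```python
import sqlite3

# Toy stand-ins for the real tables (assumed schema, illustration only):
# just enough columns to make arg_set_id appear on both sides of the join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thread (utid INTEGER PRIMARY KEY, name TEXT, arg_set_id INTEGER);
    CREATE TABLE slice  (id INTEGER PRIMARY KEY, utid INTEGER, name TEXT, arg_set_id INTEGER);
    INSERT INTO thread VALUES (1, 'main', 10);
    INSERT INTO slice  VALUES (100, 1, 'doFrame', 20);
""")

# Unqualified: arg_set_id exists in both tables, so SQLite refuses the query.
ambiguous_error = None
try:
    conn.execute("SELECT arg_set_id FROM slice JOIN thread USING (utid)")
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# Qualified: say which table each column comes from.
rows = conn.execute("""
    SELECT slice.name, slice.arg_set_id, thread.arg_set_id
    FROM slice JOIN thread USING (utid)
""").fetchall()

print("unqualified query failed with:", ambiguous_error)
print("qualified query returned:", rows)
```

If an existing query starts failing after the upgrade, prefixing each arg_set_id reference with its table name (or alias) is likely the smallest fix.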
For complete details, see the changelog or view all changes on GitHub.
Download Perfetto v55.1 from our releases page, get started at docs.perfetto.dev, or try the UI directly at ui.perfetto.dev.