. This is of course an implementation detail subject to change, but the logic for why it does this is sound, and so it's probably still safe to truncate stack frames at least somewhat. Doing so would likely permit many samples to be merged, which could significantly reduce the uncompressed size of the profile.
A pprof profile is, for purposes of PGO, effectively a table of execution stacks and how often they were sampled. If you want to get really good profiling data, you do as the PGO guide tells you and collect multiple profiles and merge them, which gets you more coverage, but also makes for larger, more varied sample counts, which decreases the effectiveness of compression. For purposes of PGO, we only care about the relative frequency of different code paths at a pretty coarse granularity. There are two opportunities here.
Normalizing and quantizing the sample counts should be possible to do with no significant effect on the accuracy or usefulness to PGO, and would improve the effectiveness of compression. That is, you could for example round each sample count to the nearest power of N, and then scale them all so that the smallest sample count is N (where N is e.g. 2). The effect of this would likely be minor, since most of the space in the profile is taken up by other things like the location and function tables, but it wouldn't hurt.
The other, much more complicated thing we can do is merge sampled locations. PGO is using the profile data to improve its guesses about which branches are taken (including implicit branches by type for interface methods). We generally don't actually care which specific statement within each branch is taking up the most time. If there are no possible branches between two sampled locations, from PGO's perspective one might as well merge them (e.g. just drop the one with the lower sample count). This is more complicated to do than quantization, of course, as it requires control flow analysis.
My questions for anyone who's read this far are:
- Would these ideas work, or am I making bad assumptions about what PGO actually needs?
- Are there pre-existing tools for doing this kind of thing that I just haven't noticed?
- Are there other significant opportunities for pruning the pprof data in ways that wouldn't impact PGO?
- Would this be valuable enough to try to roll it into an option for `go tool pprof -proto`?
- Any pointers for pre-existing tools for doing the control-flow analysis bits?