cmd/pprof: respect symbol boundaries during Disasm
When the profile contains locations that incorrectly computed the
disasm would leav to invalid decoding of instructions.
This change ensures that we start decoding from a known location.
Updates #74610
diff --git a/src/cmd/internal/disasm/disasm.go b/src/cmd/internal/disasm/disasm.go
index 3ae8989..1ee5f48 100644
--- a/src/cmd/internal/disasm/disasm.go
+++ b/src/cmd/internal/disasm/disasm.go
@@ -15,6 +15,7 @@
"encoding/binary"
"fmt"
"io"
+ "iter"
"os"
"path/filepath"
"regexp"
@@ -194,6 +195,29 @@
return cf.Lines[line-1], nil
}
+// TextSyms returns text symbols that overlap the specified range.
+func (d *Disasm) TextSyms(start, end uint64) iter.Seq[objfile.Sym] {
+ if start < d.textStart {
+ start = d.textStart
+ }
+ if end > d.textEnd {
+ end = d.textEnd
+ }
+
+ return func(yield func(objfile.Sym) bool) {
+ for _, sym := range d.syms {
+ symStart := sym.Addr
+ symEnd := sym.Addr + uint64(sym.Size)
+ if sym.Code != 'T' && sym.Code != 't' ||
+ symStart < d.textStart ||
+ symEnd <= start || end <= symStart {
+ continue
+ }
+ yield(sym)
+ }
+ }
+}
+
// Print prints a disassembly of the file to w.
// If filter is non-nil, the disassembly only includes functions with names matching filter.
// If printCode is true, the disassembly includes corresponding source lines.
diff --git a/src/cmd/pprof/pprof.go b/src/cmd/pprof/pprof.go
index bfc2911..474a50c 100644
--- a/src/cmd/pprof/pprof.go
+++ b/src/cmd/pprof/pprof.go
@@ -196,10 +196,17 @@
if err != nil {
return nil, err
}
+
var asm []driver.Inst
- d.Decode(start, end, nil, false, func(pc, size uint64, file string, line int, text string) {
- asm = append(asm, driver.Inst{Addr: pc, File: file, Line: line, Text: text})
- })
+ for sym := range d.TextSyms(start, end) {
+ symStart := sym.Addr
+ symEnd := sym.Addr + uint64(sym.Size)
+ symEnd = min(symEnd, end)
+ d.Decode(symStart, symEnd, sym.Relocs, false, func(pc, size uint64, file string, line int, text string) {
+ asm = append(asm, driver.Inst{Addr: pc, File: file, Line: line, Text: text})
+ })
+ }
+
return asm, nil
}
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
for sym := range d.TextSyms(start, end) {
I'm not sure how often Disasm is called and whether this iteration would cause a performance regression for some use cases.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
I'm not sure how often Disasm is called and whether this iteration would cause a performance regression for some use cases.
After thinking about, this overhead should not be a problem, because in https://github.com/google/pprof/blob/6e76a2b096b5fa52e4bb3f7f7a357bd6e6b3b7b1/internal/binutils/binutils.go#L259 it actually invokes a command which would be a significant overhead.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
This is trying to improve the handling of invalid PCs. There is still a question on the necessity of this. Where do the invalid PCs come from? If the profile is just bad, even we disassemble them correctly it may not actually help, like, is it the right place that the CPU time actually spent?
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
This is trying to improve the handling of invalid PCs. There is still a question on the necessity of this. Where do the invalid PCs come from? If the profile is just bad, even we disassemble them correctly it may not actually help, like, is it the right place that the CPU time actually spent?
I agree that this doesn't fix the invalid PC issue and there's still investigation needed on that front.
However, it does make Disasm func behavior similar to other command line tools used by pprof. From what I can tell gcc objdump, llvm objdump and go tool objdump all do this PC "fixup" to decode.
For the invalid PC-s, I'm not certain, but it did seem to me like some sort of "off by few bytes" issue. So, even if the profile PCs are wrong, they should be close to the actual corresponding line and still show useful information. So the profile will be still somewhat bad, but at least the disassembly will be correct. Whether this is improvement is worth the cost, I'm not certain, but I'm leaning towards yes... however, I can also see the other side of the argument.
Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |