Re: Issue 476 in google-breakpad: Linux stack walker does not know how to unwind through a trampoline

google-...@googlecode.com

Sep 7, 2012, 10:16:33 PM
to google-br...@googlegroups.com

Comment #1 on issue 476 by the...@chromium.org: Linux stack walker does
not know how to unwind through a trampoline
http://code.google.com/p/google-breakpad/issues/detail?id=476

Updated example program for r1031

Attachments:
example.cc 3.2 KB

google-...@googlecode.com

Sep 19, 2014, 7:25:48 PM
to google-br...@googlegroups.com

Comment #2 on issue 476 by mdemp...@chromium.org: Linux stack walker does
not know how to unwind through a trampoline
https://code.google.com/p/google-breakpad/issues/detail?id=476

Just repro'd this with breakpad r1375, gdb 7.8-gg1, linux 3.13.0-35-generic
on Ubuntu 14.04. I'll try to dig into how gdb is able to unwind past the
signal handler and why breakpad's getting stuck on it.

google-...@googlecode.com

Sep 19, 2014, 10:32:20 PM
to google-br...@googlegroups.com

Comment #3 on issue 476 by mdemp...@chromium.org: Linux stack walker does
not know how to unwind through a trampoline
https://code.google.com/p/google-breakpad/issues/detail?id=476

Brain dumping some info here; mostly for my own reference, and maybe useful
to others.

Linux's kernel_sigaction struct contains an sa_restorer field, which the
kernel uses to set up the return address for a signal handler call frame.
glibc's sigaction() sets this to __restore_rt, which is in libpthread.so
(which is why breakpad unwinds there before giving up). The restorer code is
responsible for calling sigreturn().
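
A rough illustration of the layout described above, using Python's ctypes.
The field names follow glibc's x86-64 kernel_sigaction as I understand it;
the class name, the helper and the 64-bit mask field are assumptions for
illustration, not Breakpad or glibc code.

```python
import ctypes

# Flag telling the kernel that sa_restorer is valid (x86-64 Linux value).
SA_RESTORER = 0x04000000


class KernelSigaction(ctypes.Structure):
    """Illustrative mirror of the x86-64 kernel sigaction layout (assumed)."""
    _fields_ = [
        ("k_sa_handler", ctypes.c_uint64),  # handler function pointer
        ("sa_flags", ctypes.c_uint64),      # SA_* flags
        ("sa_restorer", ctypes.c_uint64),   # return address for the handler frame
        ("sa_mask", ctypes.c_uint64),       # 64-bit kernel signal mask
    ]


def has_restorer(act):
    """True if the flags mark sa_restorer as valid and it is non-null."""
    return bool(act.sa_flags & SA_RESTORER) and act.sa_restorer != 0
```

When the handler returns, its "ret" lands on whatever address sa_restorer
held, which is why the unwinder ends up inside __restore_rt.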

gdb has two mechanisms for detecting signal frames: 1) the DWARF CIE
augmentation string contains the 'S' character, or 2) it unwinds and finds
either a function named "__restore_rt", or a function named "sigaction"
whose next instructions are "mov $__NR_rt_sigreturn, %rax; syscall".
(See gdb/amd64-linux-tdep.c.)

However, the signal frame detection only seems to affect whether unwinding
should return to PC or to PC-1, and I think either should generally be okay.
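
A minimal sketch of the two detection mechanisms described above, just to
make the logic concrete. This is not gdb's code: the function name and its
parameters are hypothetical, and the byte pattern assumes x86-64, where
__NR_rt_sigreturn is 15.

```python
# Machine code for "mov $__NR_rt_sigreturn, %rax; syscall" on x86-64,
# where __NR_rt_sigreturn is 15 (0x0f).
RT_SIGRETURN_CODE = bytes([
    0x48, 0xC7, 0xC0, 0x0F, 0x00, 0x00, 0x00,  # mov $15, %rax
    0x0F, 0x05,                                # syscall
])


def looks_like_signal_frame(function_name, code_at_pc, cie_augmentation=""):
    """Heuristic signal-frame test modelled on the checks described above."""
    # Mechanism 1: the CIE augmentation string marks the frame with 'S'.
    if "S" in cie_augmentation:
        return True
    # Mechanism 2a: the restorer is identified by its symbol name.
    if function_name == "__restore_rt":
        return True
    # Mechanism 2b: inside "sigaction", match the instruction bytes.
    if function_name == "sigaction":
        return code_at_pc.startswith(RT_SIGRETURN_CODE)
    return False
```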

stackwalker_amd64.cc (and probably others) has a check: "If the new stack
pointer is at a lower address than the old, then that's clearly
incorrect." But I don't think that's necessarily true if we use
sigaltstack() and the signal stack is at a higher address in memory than
the interrupted stack (i.e., unwinding out of the handler would then jump
to a lower address). I think this happens to work currently because in
Chrome only the main thread in a process (whose stack will always(?) be
allocated high) uses an alternate stack, whereas additional threads (whose
stacks are dynamically allocated) handle their signals on their own
regular stack.
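
To make the concern concrete, here is a small sketch of that check applied
to two stack layouts. The function names and all three addresses are
hypothetical, chosen only for illustration; the comparison mirrors the rule
quoted above, not the exact Breakpad source.

```python
def passes_sp_check(old_sp, new_sp):
    """Mimic the quoted rule: the caller's SP must be above the callee's."""
    return new_sp > old_sp


# Hypothetical addresses, chosen only to illustrate the two layouts.
MAIN_STACK_SP = 0x00007FFF15C715A8      # interrupted code on the normal stack
ALT_STACK_LOW_SP = 0x00007F6D7A000F00   # handler on an alternate stack below it
ALT_STACK_HIGH_SP = 0x00007FFFF0000F00  # handler on an alternate stack above it


def can_unwind_out_of_handler(handler_sp, interrupted_sp):
    """Would the walker accept the step from handler to interrupted frame?"""
    return passes_sp_check(handler_sp, interrupted_sp)
```

With the alternate stack below the interrupted stack the step is accepted;
with it above, the same valid unwind would be rejected as end-of-stack.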

Running dump_syms on libpthread-2.19.so gives a bunch of warnings like:

/lib/x86_64-linux-gnu/libpthread-2.19.so, section '.eh_frame': the call
frame entry at offset 0x3130 uses a DWARF expression to describe how to
recover register '.ra', but this translator cannot yet translate DWARF
expressions to Breakpad postfix expressions
/lib/x86_64-linux-gnu/libpthread-2.19.so, section '.eh_frame': the call
frame entry at offset 0x3130 uses a DWARF expression to describe how to
recover register '.ra', but this translator cannot yet translate DWARF
expressions to Breakpad postfix expressions
/lib/x86_64-linux-gnu/libpthread-2.19.so, section '.eh_frame': the call
frame entry at offset 0x31a8 uses a DWARF expression to describe how to
recover register '.cfa', but this translator cannot yet translate DWARF
expressions to Breakpad postfix expressions
/lib/x86_64-linux-gnu/libpthread-2.19.so, section '.eh_frame': the call
frame entry at offset 0x31a8 uses a DWARF expression to describe how to
recover register '$r8', but this translator cannot yet translate DWARF
expressions to Breakpad postfix expressions
/lib/x86_64-linux-gnu/libpthread-2.19.so, section '.eh_frame': the call
frame entry at offset 0x31a8 uses a DWARF expression to describe how to
recover register '$r9', but this translator cannot yet translate DWARF
expressions to Breakpad postfix expressions
/lib/x86_64-linux-gnu/libpthread-2.19.so, section '.eh_frame': the call
frame entry at offset 0x31a8 uses a DWARF expression to describe how to
recover register '$r10', but this translator cannot yet translate DWARF
expressions to Breakpad postfix expressions

Also, I get different results if I run dump_syms on
/lib/x86_64-linux-gnu/libpthread-2.19.so vs
/usr/lib/debug/lib/x86_64-linux-gnu/libpthread-2.19.so. It probably needs
to be made aware of .gnu_debuglink?

google-...@googlecode.com

Sep 19, 2014, 10:37:31 PM
to google-br...@googlegroups.com

Comment #4 on issue 476 by the...@chromium.org: Linux stack walker does
not know how to unwind through a trampoline
https://code.google.com/p/google-breakpad/issues/detail?id=476

For dump_syms, you need to run "dump_syms
/lib/x86_64-linux-gnu/libpthread-2.19.so
/usr/lib/debug/lib/x86_64-linux-gnu" for it to pick up the .gnu_debuglink.

google-...@googlecode.com

Sep 19, 2014, 10:38:30 PM
to google-br...@googlegroups.com

Comment #5 on issue 476 by mdemp...@chromium.org: Linux stack walker does
not know how to unwind through a trampoline
https://code.google.com/p/google-breakpad/issues/detail?id=476

Oops, dump_syms already supports .gnu_debuglink; I just needed to explicitly
tell it where to find the debug version of the file.

google-...@googlecode.com

Sep 19, 2014, 11:15:30 PM
to google-br...@googlegroups.com

Comment #6 on issue 476 by mdemp...@chromium.org: Linux stack walker does
not know how to unwind through a trampoline
https://code.google.com/p/google-breakpad/issues/detail?id=476

Looking closer at the DWARF expressions that are failing, most of them (and
in particular all of the ones that affect __restore_rt) are simple one- or
two-instruction programs. Relevant excerpt from "readelf
--debug-dump=frames /lib/x86_64-linux-gnu/libpthread-2.19.so":

000031a8 000000000000007c 0000001c FDE cie=00003190
pc=000000000001033f..0000000000010349
DW_CFA_def_cfa_expression (DW_OP_breg7 (rsp): 160; DW_OP_deref)
DW_CFA_expression: r8 (r8) (DW_OP_breg7 (rsp): 40)
DW_CFA_expression: r9 (r9) (DW_OP_breg7 (rsp): 48)
DW_CFA_expression: r10 (r10) (DW_OP_breg7 (rsp): 56)
DW_CFA_expression: r11 (r11) (DW_OP_breg7 (rsp): 64)
DW_CFA_expression: r12 (r12) (DW_OP_breg7 (rsp): 72)
DW_CFA_expression: r13 (r13) (DW_OP_breg7 (rsp): 80)
DW_CFA_expression: r14 (r14) (DW_OP_breg7 (rsp): 88)
DW_CFA_expression: r15 (r15) (DW_OP_breg7 (rsp): 96)
DW_CFA_expression: r5 (rdi) (DW_OP_breg7 (rsp): 104)
DW_CFA_expression: r4 (rsi) (DW_OP_breg7 (rsp): 112)
DW_CFA_expression: r6 (rbp) (DW_OP_breg7 (rsp): 120)
DW_CFA_expression: r3 (rbx) (DW_OP_breg7 (rsp): 128)
DW_CFA_expression: r1 (rdx) (DW_OP_breg7 (rsp): 136)
DW_CFA_expression: r0 (rax) (DW_OP_breg7 (rsp): 144)
DW_CFA_expression: r2 (rcx) (DW_OP_breg7 (rsp): 152)
DW_CFA_expression: r7 (rsp) (DW_OP_breg7 (rsp): 160)
DW_CFA_expression: r16 (rip) (DW_OP_breg7 (rsp): 168)

So I think we could pretty easily recognize this handful of limited
instruction patterns and generate

.cfa: $rsp 160 + ^
$r8: $rsp 40 + ^
$r9: $rsp 48 + ^
...

for the breakpad syms files. If that sounds right, I'll work on a CL.
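
A minimal sketch of the pattern matching proposed above, assuming the
expression has already been decoded into (opcode, operand) tuples. The
function names and the tuple representation are hypothetical and are not
dump_syms internals. Note that DW_CFA_expression yields the address where
the register was saved, so the register rule gets a trailing "^" to load
the value from that address.

```python
import re

# DWARF register numbers to Breakpad names for x86-64, as in the dump above.
AMD64_REGS = {0: "$rax", 1: "$rdx", 2: "$rcx", 3: "$rbx", 4: "$rsi",
              5: "$rdi", 6: "$rbp", 7: "$rsp", 8: "$r8", 9: "$r9",
              10: "$r10", 11: "$r11", 12: "$r12", 13: "$r13",
              14: "$r14", 15: "$r15"}


def translate_expression(ops):
    """Translate DW_OP_bregN, optionally followed by DW_OP_deref, into a
    Breakpad postfix string. Returns None for anything else."""
    if not ops or len(ops) > 2 or len(ops[0]) != 2:
        return None
    match = re.fullmatch(r"DW_OP_breg(\d+)", ops[0][0])
    if match is None:
        return None
    reg = AMD64_REGS.get(int(match.group(1)))
    if reg is None:
        return None
    postfix = f"{reg} {ops[0][1]} +"
    if len(ops) == 2:
        if ops[1] != ("DW_OP_deref",):
            return None
        postfix += " ^"
    return postfix


def cfa_rule(ops):
    """DW_CFA_def_cfa_expression: the expression yields the CFA itself."""
    return translate_expression(ops)


def register_rule(ops):
    """DW_CFA_expression: the expression yields the save address, so load it."""
    expr = translate_expression(ops)
    return None if expr is None else expr + " ^"
```

For example, cfa_rule([("DW_OP_breg7", 160), ("DW_OP_deref",)]) gives
"$rsp 160 + ^", and register_rule([("DW_OP_breg7", 40)]) gives
"$rsp 40 + ^". Anything outside the two recognized shapes returns None, so
a caller could keep emitting the existing warning for it.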

google-...@googlecode.com

Sep 20, 2014, 4:18:08 AM
to google-br...@googlegroups.com

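Comment #7 on issue 476 by mdemp...@google.com: Linux stack walker does not
know how to unwind through a trampoline
https://code.google.com/p/google-breakpad/issues/detail?id=476
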
I hacked together something to recognize DW_OP_bregN optionally followed by
DW_OP_deref, and now running dump_syms on libpthread-2.19.so, I get this
entry for __restore_rt:

STACK CFI INIT 1033f a $r10: $rsp 56 + ^ $r11: $rsp 64 + ^ $r12: $rsp 72 +
^ $r13: $rsp 80 + ^ $r14: $rsp 88 + ^ $r15: $rsp 96 + ^ $r8: $rsp 40 + ^
$r9: $rsp 48 + ^ $rax: $rsp 144 + ^ $rbp: $rsp 120 + ^ $rbx: $rsp 128 + ^
$rcx: $rsp 152 + ^ $rdi: $rsp 104 + ^ $rdx: $rsp 136 + ^ $rsi: $rsp 112 + ^
$rsp: $rsp 160 + ^ .cfa: $rsp 160 + ^ .ra: $rsp 168 + ^

and when I run the sample program and then run minidump_stackwalk, I get a
full stack trace past the signal handler frame:

0 breakpad_signal + 0x2112
rax = 0x000000000000002a rdx = 0x00007fff15c71000
rcx = 0x000000000000002a rbx = 0x00007fff15c715c0
rsi = 0x00007fff15c71130 rdi = 0x0000000000000001
rbp = 0x00007fff15c70ff0 rsp = 0x00007fff15c70fd0
r8 = 0x00007fff15c716c0 r9 = 0x0000000000000000
r10 = 0x0000000000000008 r11 = 0x0000000000000246
r12 = 0x00007fff15c71640 r13 = 0x00007fff15c71980
r14 = 0x0000000000000000 r15 = 0x0000000000000000
rip = 0x0000000000402112
Found by: given as instruction pointer in context
1 libpthread-2.19.so + 0x10340
rbx = 0x00007fff15c715c0 rbp = 0x00000000ffffffff
rsp = 0x00007fff15c71000 r12 = 0x00007fff15c71640
r13 = 0x00007fff15c71980 r14 = 0x0000000000000000
r15 = 0x0000000000000000 rip = 0x00007f6d7b924340
Found by: call frame info
2 libc-2.19.so + 0xc19a0
rax = 0x0000000000000023 rdx = 0x0000000000000000
rcx = 0xffffffffffffffff rbx = 0x00007fff15c715c0
rsi = 0x00007fff15c715b0 rdi = 0x00007fff15c715b0
rbp = 0x00000000ffffffff rsp = 0x00007fff15c715a8
r8 = 0x00007fff15c716c0 r9 = 0x0000000000000000
r10 = 0x0000000000000008 r11 = 0x0000000000000246
r12 = 0x00007fff15c71640 r13 = 0x00007fff15c71980
r14 = 0x0000000000000000 r15 = 0x0000000000000000
rip = 0x00007f6d7b60f9a0
Found by: call frame info
3 libc-2.19.so!__sleep [sleep.c : 137 + 0xb]
rbx = 0x00007fff15c715c0 rbp = 0x00000000ffffffff
rsp = 0x00007fff15c715b0 r12 = 0x00007fff15c71640
r13 = 0x00007fff15c71980 r14 = 0x0000000000000000
r15 = 0x0000000000000000 rip = 0x00007f6d7b60f854
Found by: call frame info
4 breakpad_signal + 0x233c
rbx = 0x0000000000000000 rbp = 0x00007fff15c718a0
rsp = 0x00007fff15c71790 r12 = 0x0000000000401fc0
r13 = 0x00007fff15c71980 r14 = 0x0000000000000000
r15 = 0x0000000000000000 rip = 0x000000000040233c
Found by: call frame info
5 libc-2.19.so!__libc_start_main [libc-start.c : 287 + 0x1a]
rbx = 0x0000000000000000 rbp = 0x0000000000000000
rsp = 0x00007fff15c718b0 r12 = 0x0000000000401fc0
r13 = 0x00007fff15c71980 r14 = 0x0000000000000000
r15 = 0x0000000000000000 rip = 0x00007f6d7b56fec5
Found by: call frame info
6 breakpad_signal + 0x1fe9
rbx = 0x0000000000000000 rbp = 0x0000000000000000
rsp = 0x00007fff15c71970 r12 = 0x0000000000401fc0
r13 = 0x00007fff15c71980 r14 = 0x0000000000000000
r15 = 0x0000000000000000 rip = 0x0000000000401fe9
Found by: call frame info
7 0x7fff15c71978
rbx = 0x0000000000000000 rbp = 0x0000000000000000
rsp = 0x00007fff15c71978 r12 = 0x0000000000401fc0
r13 = 0x00007fff15c71980 r14 = 0x0000000000000000
r15 = 0x0000000000000000 rip = 0x00007fff15c71978
Found by: call frame info

So I'll work on cleaning that up and then mail a CL.
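
For reference, a toy evaluator showing how rules like ".cfa: $rsp 160 + ^"
recover the caller's registers. It is a sketch that handles only "+" and
"^", not Breakpad's postfix evaluator. The memory contents are assumed:
they are inferred from frame 1's rsp and frame 2's rsp and rip in the trace
above, not read from the minidump.

```python
def evaluate_postfix(expr, registers, memory):
    """Evaluate a Breakpad-style postfix expression ('+' and '^' only)."""
    stack = []
    for token in expr.split():
        if token == "+":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif token == "^":
            stack.append(memory[stack.pop()])  # dereference an address
        elif token.startswith("$"):
            stack.append(registers[token])     # register value
        else:
            stack.append(int(token, 0))        # integer literal
    if len(stack) != 1:
        raise ValueError(f"malformed expression: {expr!r}")
    return stack[0]


# Frame 1 (in __restore_rt) has rsp = 0x7fff15c71000 in the trace above.
registers = {"$rsp": 0x00007FFF15C71000}
# Assumed signal-frame contents, inferred from frame 2 of the trace.
memory = {
    0x00007FFF15C71000 + 160: 0x00007FFF15C715A8,  # saved rsp
    0x00007FFF15C71000 + 168: 0x00007F6D7B60F9A0,  # saved rip
}

caller_rsp = evaluate_postfix("$rsp 160 + ^", registers, memory)
caller_rip = evaluate_postfix("$rsp 168 + ^", registers, memory)
```

Under those assumptions, caller_rsp and caller_rip come out as the rsp and
rip shown for frame 2.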