Hey guys,
I am looking for some help debugging a crash that I occasionally observe when running gcc-compiled optimized (-O3) binaries with HEAPCHECK=strict.
Here are various details:
- This is the stack trace that I get. Since the heap checker is active, every allocation invokes NewHook. That hook tries to unwind the call stack and crashes in the process.
@ 0x210d062 GetStackTrace()
@ 0x21056b4 MallocHook_GetCallerStackTrace
@ 0x20f46a4 NewHook()
@ 0x2104f32 MallocHook::InvokeNewHookSlow()
@ 0x21127cf operator new[]()
@ 0x7ff3eb637d17 std::string::_Rep::_S_create()
@ 0x1fdc175 std::string::_S_construct<>()
@ 0x7ff3eb638d30 std::string::string()
- I am running *without libunwind* and hence the unwinding method that gperftools uses is to walk the frame pointer linked list.
- My binary is compiled with gcc 5.1.0 using -O3 -fno-omit-frame-pointer.
- One of the functions in the stack trace is std::string::_S_construct<>. I looked at the disassembly of that function, and this is what I found:
1fdc140: cmp %rsi,%rdi
1fdc143: je 1fdc1b8 <_ZNSs12_S_constructIPKcEEPcT_S3_RKSaIcESt20forward_iterator_tag+0x78>
1fdc145: test %rsi,%rsi
1fdc148: push %r12
1fdc14a: push %rbp
1fdc14b: mov %rdi,%rbp
1fdc14e: push %rbx
1fdc14f: mov %rsi,%rbx
1fdc152: je 1fdc168 <_ZNSs12_S_constructIPKcEEPcT_S3_RKSaIcESt20forward_iterator_tag+0x28>
1fdc154: test %rdi,%rdi
1fdc157: jne 1fdc168 <_ZNSs12_S_constructIPKcEEPcT_S3_RKSaIcESt20forward_iterator_tag+0x28>
1fdc159: mov $0x2177460,%edi
1fdc15e: callq 411350 <_ZSt19__throw_logic_errorPKc@plt>
1fdc163: nopl 0x0(%rax,%rax,1)
1fdc168: sub %rbp,%rbx
1fdc16b: xor %esi,%esi
1fdc16d: mov %rbx,%rdi
1fdc170: callq 4112d0 <_ZNSs4_Rep9_S_createEmmRKSaIcE@plt> // next function in the call stack.
Note that the function pushes r12 and then the old frame pointer (rbp), but instead of the conventional mov %rsp,%rbp we have mov %rdi,%rbp. In other words, rbp is used as a general-purpose register here. This means that the stack frame linked list established by the frame pointers is effectively broken by this stack frame.
Now if I look at the code of GetStackTrace from gperftools, I see the following (taken from stacktrace_x86_64-inl.h):
  while (sp && n < max_depth) {
    if (*(sp+1) == reinterpret_cast<void *>(0)) {
      // In 64-bit code, we often see a frame that
      // points to itself and has a return address of 0.
      break;
    }
    // ... (rest of the loop body omitted)
  }
In this call stack, sp is no longer guaranteed to point into the stack. In fact, *(sp + 1) is not even guaranteed to be valid memory, and hence I get the segfault shown above. Of course, the crash does not trigger deterministically, so I have been unable to investigate in more detail inside gdb.
- This never happens in debug mode and I have verified that the assembly for debug mode is properly setting the frame pointer.
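To make the failure mode concrete, here is a minimal sketch (not gperftools code) of a frame-pointer walk that refuses to follow a link unless it stays inside a known stack range. The names WalkFramePointers and DemoBrokenChain, and the idea of passing explicit stack bounds, are my own assumptions for illustration. The "stack" is a plain array, so the example runs without touching real frames.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of gperftools: walks a frame-pointer chain
// starting at `sp`, following links only while they stay inside
// [stack_lo, stack_hi). Returns the number of return addresses stored.
int WalkFramePointers(void** sp, const void* stack_lo, const void* stack_hi,
                      void** result, int max_depth) {
  const std::uintptr_t lo = reinterpret_cast<std::uintptr_t>(stack_lo);
  const std::uintptr_t hi = reinterpret_cast<std::uintptr_t>(stack_hi);
  int n = 0;
  while (n < max_depth) {
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(sp);
    // sp[0] (saved frame pointer) and sp[1] (return address) must both lie
    // inside the known range and be pointer-aligned before they are read.
    if (addr < lo || addr >= hi || hi - addr < 2 * sizeof(void*)) break;
    if (addr % alignof(void*) != 0) break;
    void* return_address = sp[1];
    if (return_address == nullptr) break;  // end-of-chain marker
    result[n++] = return_address;
    void** next = static_cast<void**>(sp[0]);
    // The stack grows down, so a caller's frame sits at a higher address.
    if (reinterpret_cast<std::uintptr_t>(next) <= addr) break;
    sp = next;
  }
  return n;
}

// Builds a fake two-frame "stack" in which the second frame's link slot
// holds an arbitrary non-stack value, as happens when rbp is reused as a
// general-purpose register.
int DemoBrokenChain() {
  void* stack[8] = {};
  stack[0] = &stack[4];                         // frame 0: link to frame 1
  stack[1] = reinterpret_cast<void*>(0x1111);   // frame 0: fake return address
  stack[4] = reinterpret_cast<void*>(0x1000);   // frame 1: clobbered link
  stack[5] = reinterpret_cast<void*>(0x2222);   // frame 1: fake return address
  void* result[8];
  return WalkFramePointers(stack, stack, stack + 8, result, 8);
}
```

In this sketch DemoBrokenChain() returns 2: both return addresses are recorded, and the walk stops at the clobbered link instead of dereferencing it. A real unwinder would have to obtain the stack bounds from somewhere, which this example does not attempt.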
So given all this information,
- Does my diagnosis for why the crash is happening seem reasonable?
- Is there some compiler setting that I am missing which will ensure that the frame pointer is not omitted for the function std::string::_S_construct<>? Most of my functions do have the usual frame pointer preamble.
- Any suggestions on how to prevent this segfault?
Regards,
-- Priyendra