Hello,
I am hitting a segmentation fault inside a replaced function call in my client, and the details are a bit weird.
I instrument all basic blocks (BBs); this is the last BB before the crash:
Meta - 0x0000000000000000: mov qword ptr [gs:0x08], rsp
Meta - 0x0000000000000000: mov qword ptr [gs:0x18], 0x00000000
Meta - 0x0000000000000000: mov r11, 0x00000fffc01b0330
Meta - 0x0000000000000000: jmp r11
App - 0x00007F45023400F0: nop
The address of that BB matches the address of the mmap function in the application, which I override using drwrap_replace_native. The replacement never finishes; the crash happens inside it.
In the BB above, I insert a call that prints RSP right before the jmp instruction. It has the value 0x0000110000002E77.
I then have this info about the crash in the logs:
computing memory target for 0x00000fffc018b4da causing SIGSEGV, kernel claims it is 0x0000000000000000
opnd_compute_address for: oword ptr [rsp]
base => 0x0000110000002b9f
index,scale => 0x0000110000002b9f
disp => 0x0000110000002b9f
For SIGSEGV at cache pc 0x00000fffc018b4da, computed target write 0x0000000000000000
faulting instr: movaps oword ptr [rsp], xmm0
add_process_lock: 0 lock 0x00007f46a9437a40: name=report_buf_lock(mutex)@/home/travis/build/DynamoRIO/dynamorio/core/utils.c:2026
rank=88 owner=167333 owning_dc=0x00007f4628e5cd40 contended_event=0xffffffff prev=0x0000000000000000
lock count_times_acquired= 1 0 0 0 0+2 report_buf_lock(mutex)@/home/travis/build/DynamoRIO/dynamorio/core/utils.c:2026
SYSLOG_CRITICAL: Application /home/mewais/DCSim/Debug/ReplacementMain (167333). Tool internal crash at PC 0x00000fffc018b4da. Please report this at your tool's issue tracker. Program aborted.
Received SIGSEGV at client library pc 0x00000fffc018b4da in thread 167333
Base: 0x00007f46a8e1f000
Registers:eax=0x00000fffc02bdc10 ebx=0x0000000000020022 ecx=0x0000000000000004 edx=0x000000000000002f
esi=0x00000fffc025a960 edi=0x0000110000002b9f esp=0x0000110000002b9f ebp=0x0000000000000000
r8 =0x0000110000002e0f r9 =0x0000000000000000 r10=0x000011000000227f r11=0x0000000000000246
r12=0x0000110000002e1f r13=0x0000110000002bbf r14=0x00000fffc02bdc00 r15=0x0000110000002e0f
eflags=0x0000000000010206
I find this very hard to understand. RSP clearly holds a reasonable value, 0x0000110000002b9f, yet the instruction movaps oword ptr [rsp], xmm0 segfaults and the target address is reported as 0. How is this possible?
I was able to track down this instruction in the objdump of my client. Here is the snippet:
7218b4a0: f3 0f 1e fa endbr64
7218b4a4: 41 56 push %r14
7218b4a6: 45 31 c9 xor %r9d,%r9d
7218b4a9: 41 55 push %r13
7218b4ab: 41 54 push %r12
7218b4ad: 49 89 fc mov %rdi,%r12
7218b4b0: 55 push %rbp
7218b4b1: 53 push %rbx
7218b4b2: 48 81 ec 20 02 00 00 sub $0x220,%rsp
7218b4b9: 4c 8b 35 d8 5c 13 00 mov 0x135cd8(%rip),%r14 # 722c1198 <_ZTVN3fmt2v819basic_memory_bufferIcLm500ESaIcEEE@@Base+0x3598>
7218b4c0: 4c 8d 6c 24 20 lea 0x20(%rsp),%r13
7218b4c5: 48 89 e7 mov %rsp,%rdi
7218b4c8: 49 8d 46 10 lea 0x10(%r14),%rax
7218b4cc: 66 49 0f 6e cd movq %r13,%xmm1
7218b4d1: 66 48 0f 6e c0 movq %rax,%xmm0
7218b4d6: 66 0f 6c c1 punpcklqdq %xmm1,%xmm0
7218b4da: 0f 29 04 24 movaps %xmm0,(%rsp)
RSP is obviously used before the offending instruction as well (by the pushes and the sub), with no issues regarding its value or permissions. So why does this specific instruction end up with a computed address of 0 when RSP clearly has a valid value?
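As a sanity check on the prologue above, the sketch below adds up how far RSP moves between function entry and the movaps: five pushes plus the sub $0x220. It is plain C arithmetic on the numbers from the log and the objdump, not DynamoRIO code, and the helper names are made up for illustration. It relies on one fact: movaps with a memory operand requires that address to be 16-byte aligned, independently of whether the page is mapped and writable. Whether that is what is going wrong here is only a possibility to check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bytes the prologue subtracts from RSP before the movaps:
 * five 8-byte pushes (r14, r13, r12, rbp, rbx) plus sub $0x220. */
#define PROLOGUE_BYTES (5 * 8 + 0x220)

/* RSP at the movaps, taken from the register dump in the log. */
#define RSP_AT_FAULT UINT64_C(0x0000110000002b9f)

/* Hypothetical helper: RSP on entry to the function, reconstructed
 * by undoing the prologue. */
uint64_t rsp_at_entry(void) { return RSP_AT_FAULT + PROLOGUE_BYTES; }

/* Hypothetical helper: true if addr is a multiple of 16. */
bool is_aligned_16(uint64_t addr) { return (addr & 0xF) == 0; }
```

Under the System V x86-64 calling convention a function is entered with RSP congruent to 8 modulo 16, so this prologue would normally leave RSP 16-byte aligned at the movaps. Calling is_aligned_16(RSP_AT_FAULT) and looking at rsp_at_entry() & 0xF shows whether that holds for the values in the log.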
EXTRAS:
This stack is created by my client using dr_raw_mem_alloc, because I want it to sit specifically in the range 0x110000001000-0x110000002FFF. I also made sure to give it DR_MEMPROT_READ | DR_MEMPROT_WRITE permissions. I don't think this is the cause of the issue, since the application and the client were both working fine for a while before that mmap call. The register dump also shows that RSP at the point of failure is still well within the stack boundaries.
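The range claim can be checked with a few lines of plain C. The sketch below tests whether an address lies in the 0x110000001000-0x110000002FFF region and rounds an address down to a 16-byte boundary, which is one possible way to choose a stack pointer inside a manually allocated region. The helper names are hypothetical, and whether the stack pointer handed to the application needs such an adjustment is an assumption to verify, not something the log proves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stack region bounds, as stated in the post (inclusive). */
#define STACK_LO UINT64_C(0x110000001000)
#define STACK_HI UINT64_C(0x110000002FFF)

/* Hypothetical helper: true if addr falls inside the custom stack region. */
bool in_stack_region(uint64_t addr) {
    return addr >= STACK_LO && addr <= STACK_HI;
}

/* Hypothetical helper: round addr down to the nearest multiple of 16. */
uint64_t align_down_16(uint64_t addr) {
    return addr & ~UINT64_C(0xF);
}
```

Both RSP values quoted in this post (0x0000110000002E77 before the jmp and 0x0000110000002b9f at the fault) can be passed to in_stack_region, and align_down_16 shows the nearest 16-byte boundary below each of them.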