I am experimenting with a tracing mode where we first collect a cache-warmup
trace using the existing L0_filter and then switch to a full instruction trace.
I am using the dr_flush_region_ex method to flush the current instrumentation
(inspired by how delay instrumentation is implemented).
This works for a toy program as shown below. We start off with a delay of ~600M
instrs. Then we run with L0_filter for 100m references and then we switch to a
full instruction trace.
Thread 18547: started (global references 0)
<curiosity: rex.w on OPSZ_6_irex10_short4!>
<(1+x) Handling our fault in a TRY at 0x00007f2a8fc29442>
Thread 18547: Hit delay threshold: enabling tracing after 614465538 instructions.
Thread 18547: Switching to full trace after ~100000201 references for this thread (global references = 100000201).
In open_post_trace.
Hit -max_trace_size: disabling tracing.
<Stopping application /home/prasun/DR/array (18547)>
Thread 18547: exiting after ~10484224 references (global references = 10484224).
However when I tried this with specint benchmark, I see a segfault in the
second basic block after switching to the full instruction trace.
Any idea why this could happen?
Thread 14595: started (global references 0)
<spurious rep/repne prefix @0x00000000002b7f00 (f3 0f 1e fa): >
<curiosity: rex.w on OPSZ_6_irex10_short4!>
<get_memory_info mismatch! (can happen if os combines entries in /proc/pid/maps)
os says: 0x00007f1b7cae8000-0x00007f1b7cae9000 prot=0x00000003
cache says: 0x00007f1b7cae8000-0x00007f1b7caf4000 prot=0x00000003
>
Thread 14595: Hit delay threshold: enabling tracing after 61522051073 instructions.
Thread 14595: Switching to full trace after ~100002083 references for this thread (global references = 100002083).
In open_post_trace.
<Application /path/to/my_binary (14595). Application exception at PC 0x00000000004cba50.
Signal 11 delivered to application as default action.
Callstack:
0x00000000004cba50 </path/to/my_binary+0x2cba50>
0x00003c3c4f4f5d5d
>
<Stopping application /path/to/my_binary (14595)>
Thread 14595: exiting after ~0 references (global references = 0).
<Application /path/to/my_binary (14595). Internal Error: DynamoRIO debug check failure: /home/prasun/DR/dynamorio/core/heap.c:1951 IF_WINDOWS(doing_detach ||) vmh->num_free_blocks == vmh->num_blocks - unfreed_blocks || ((ever_beyond_vmm IF_WINDOWS(|| get_os_version() >= WINDOWS_VERSION_8_1)) && vmh->num_free_blocks >= vmh->num_blocks - unfreed_blocks)
(Error occurred @8461 frags)
version 8.0.18752, custom build
-loglevel 4 -client_lib '/home/prasun/DR/build-DDEBUGON/bin64/../clients/lib64/debug/libdrmemtrace.so;0;-trace_after_instrs 58672M -warmup_refs 100000000 -max_trace_size 80M -offline -subdir_prefix 13' -client_lib64 '/home/prasun/DR/build-DDEBUGON/bin64/../clients/lib64/debug/libdrmemtrace.so;0;-t
0x00007f1b7db240a0 0x00007f1d7db3b635
0x00007f1b7db242f0 0x00007f1d7dbd0975
0x00007f1b7db24330 0x00007f1d7dbd0a00
0x00007f1b7db24340 0x00007f1d7dab3195
0x00007f1b7db24380 0x00007f1d7dab34d1
0x00007f1b7db243b0 0x00007f1d7dab3657
0x00007f1b7db243d0 0x00007f1d7dd4a609
0x00007f1b7db24530 0x00007f1d7dd4a9ea
0x00007f1b7db24560 0x00007f1d7dd3fd34
0x00007f1b7db245d0 0x00007f1d7dd40237
0x00007f1b7db24740 0x00007f1d7dd469ff
0x00007f1b7db247e0 0x00007f1d7dd4346c
0x00007f1b7db24a10 0x00007f1d7dd4660f
0x00007f1b7db24ab0 0x00007f1d7dd03f63
0x00007f1b7c710948 0x00003c3c4f4f5d5d
/home/prasun/DR/build-DDEBUGON/lib64/debug/libdynamorio.so=0x00007f1d7da56000>
Some selected snippets from the log
1471127:SYSLOG_INFORMATION: curiosity: rex.w on OPSZ_6_irex10_short4!
1635841:SYS_open /proc/sys/vm/overcommit_memory => 3
1638442:SYS_open /sys/kernel/mm/transparent_hugepage/enabled => 3
1642124:SYSLOG_WARNING: get_memory_info mismatch! (can happen if os combines entries in /proc/pid/maps)
6581389:before instrumentation: TAG 0x00000000004cba50
6588616:before instrumentation: TAG 0x00000000004cba50
7154653:Exit from client redirection --> switch from delay_instrumentation to L0_filter
7161239:before instrumentation: TAG 0x00000000004cba50
7170675:before instrumentation: TAG 0x00000000004cba50
8270818:Exit from client redirection --> switch from L0_filter to full instr trace
8270819-
8270820:d_r_dispatch: target = 0x00000000004cba61
8270834-before instrumentation:
8270835-TAG 0x00000000004cba61
8270836- +0 L3 @0x00007f1b7dc9de30 0f 85 e9 ff ff ff jnz $0x00000000004cba50
8270837-END 0x00000000004cba61
8271176:d_r_dispatch: target = 0x00000000004cba50
8271192:before instrumentation: TAG 0x00000000004cba50
8271841:** Received SIGSEGV at cache pc 0x00000000481e8142 in thread 14595
8271849:translate context, thread 14595 at pc_recreatable spot translating
8271850:recreate_app_state -- translating from:
8271912:before instrumentation: TAG 0x00000000004cba50
8272319:recreate_app : pc is in F8461(0x00000000004cba50)
8272409:recreate_app : looking for 0x00000000481e8142 in frag @ 0x00000000481e80dc (tag 0x00000000004cba50)
8272410:recreate_app -- found valid state pc 0x00000000004cba50 - 0x0000000000000000
8272411:recreate_app_state_from_ilist translation 0x00000000004cba50 is post-walk 0x0000000000000000 so not fixing xsp
8272412:recreate_app -- found ok pc 0x00000000004cba50
8272420:recreate_app_state -- translation is:
8272559:SYSLOG_WARNING: Application /path/to/my_binary (14595). Application exception at PC 0x00000000004cba50.
8272566:SYSLOG_INFORMATION: Stopping application /path/to/my_binary (14595)
8283111: Fcache exits, client redirecting (thread): 2