New inserted BB does not execute


Mohammad Ewais

Feb 7, 2022, 1:38:26 PM
to DynamoRIO Users
Hello,

I want to inject a syscall at the beginning of each thread (created from scratch, rather than invoking another one after an already existing syscall). So I do the following:
1. In the first instrumented BB, save RAX, RDI, and RSI in my TLS region.
2. Create and insert instructions to set up RAX and the syscall parameters.
3. dr_insert_clean_call to a print function for debugging.
4. Create and insert the actual syscall instruction.
5. Remove the original BB instructions.
6. Return with DR_EMIT_DEFAULT.
7. In the second BB (really just the original instructions of the first), create and insert instructions to restore RAX, RDI, and RSI from the TLS.
8. dr_insert_clean_call to another print function.
9. Leave the rest of the BB as is.

Everything listed above is inserted as app instructions, not meta instructions. I can print the BB itself and confirm that the instructions match what I have in mind, but I have observed that the second print works fine while the first never gets invoked. Any ideas what I could be missing that causes this?

Here is my original BB:
╔══════╤════════════════════╤══════════════════════════════════╗
║ Type │ PC                 │ Disassembly                      ║
╠══════╪════════════════════╪══════════════════════════════════╣
║ App  │ 0x00007FFFF79A8490 │ nop    edx                       ║
╟──────┼────────────────────┼──────────────────────────────────╢
║ App  │ 0x00007FFFF79A8494 │ push   r15                       ║
╟──────┼────────────────────┼──────────────────────────────────╢
║ App  │ 0x00007FFFF79A8496 │ push   r14                       ║
╟──────┼────────────────────┼──────────────────────────────────╢
║ App  │ 0x00007FFFF79A8498 │ push   r13                       ║
╟──────┼────────────────────┼──────────────────────────────────╢
║ App  │ 0x00007FFFF79A849A │ push   r12                       ║
╟──────┼────────────────────┼──────────────────────────────────╢
║ App  │ 0x00007FFFF79A849C │ push   rbp                       ║
╟──────┼────────────────────┼──────────────────────────────────╢
║ App  │ 0x00007FFFF79A849D │ push   rbx                       ║
╟──────┼────────────────────┼──────────────────────────────────╢
║ App  │ 0x00007FFFF79A849E │ mov    rbx, rcx                  ║
╟──────┼────────────────────┼──────────────────────────────────╢
║ App  │ 0x00007FFFF79A84A1 │ sub    rsp, 0x00000098           ║
╟──────┼────────────────────┼──────────────────────────────────╢
║ App  │ 0x00007FFFF79A84A8 │ mov    qword ptr [rsp+0x10], rdi ║
╟──────┼────────────────────┼──────────────────────────────────╢
║ App  │ 0x00007FFFF79A84AD │ mov    dword ptr [rsp+0x0c], esi ║
╟──────┼────────────────────┼──────────────────────────────────╢
║ App  │ 0x00007FFFF79A84B1 │ mov    qword ptr [rsp], rdx      ║
╟──────┼────────────────────┼──────────────────────────────────╢
║ App  │ 0x00007FFFF79A84B5 │ test   r9, r9                    ║
╟──────┼────────────────────┼──────────────────────────────────╢
║ App  │ 0x00007FFFF79A84B8 │ jz     0x00007ffff79a84c6        ║
╚══════╧════════════════════╧══════════════════════════════════╝


This gets converted to the following two BBs (with the dr_insert_clean_call omitted for size and clarity):
╔══════╤════════════════════╤════════════════════════════════════════════╗
║ Type │ PC                 │ Disassembly                                ║
╠══════╪════════════════════╪════════════════════════════════════════════╣
║ App  │ 0x00007FFFF79A8490 │ mov    qword ptr [0x00007fffb4235b28], rax ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A8490 │ mov    qword ptr [0x00007fffb4235b30], rdi ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A8490 │ mov    qword ptr [0x00007fffb4235b38], rsi ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A8490 │ mov    rax, 0x000000000000009e             ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A8490 │ mov    rdi, 0x0000000000001003             ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A8490 │ mov    rsi, 0x00007fffb4235c88             ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A8490 │ syscall                                    ║
╚══════╧════════════════════╧════════════════════════════════════════════╝

╔══════╤════════════════════╤════════════════════════════════════════════╗
║ Type │ PC                 │ Disassembly                                ║
╠══════╪════════════════════╪════════════════════════════════════════════╣
║ App  │ 0x00007FFFF79A8490 │ mov    rax, qword ptr [0x00007fffb4235b28] ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A8490 │ mov    rdi, qword ptr [0x00007fffb4235b30] ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A8490 │ mov    rsi, qword ptr [0x00007fffb4235b38] ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A8490 │ nop    edx                                 ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A8494 │ push   r15                                 ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A8496 │ push   r14                                 ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A8498 │ push   r13                                 ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A849A │ push   r12                                 ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A849C │ push   rbp                                 ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A849D │ push   rbx                                 ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A849E │ mov    rbx, rcx                            ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A84A1 │ sub    rsp, 0x00000098                     ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A84A8 │ mov    qword ptr [rsp+0x10], rdi           ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A84AD │ mov    dword ptr [rsp+0x0c], esi           ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A84B1 │ mov    qword ptr [rsp], rdx                ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A84B5 │ test   r9, r9                              ║
╟──────┼────────────────────┼────────────────────────────────────────────╢
║ App  │ 0x00007FFFF79A84B8 │ jz     0x00007ffff79a84c6                  ║
╚══════╧════════════════════╧════════════════════════════════════════════╝

Derek Bruening

Feb 8, 2022, 10:31:06 AM
to dynamor...@googlegroups.com
Everything else works fine except the first block's clean call is not invoked?
Where is the clean call in the first block?  Is it after the syscall?
Is the start of the 2nd block the PC of the start of the 1st block + the length of SYSCALL which is 2 and is thus in the middle of an instruction?
If you run in debug build (-debug) do you get asserts/warnings?
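
The third question can be checked with plain address arithmetic. The following is a minimal sketch, not DynamoRIO code: it assumes, as the question describes, that the fall-through PC is the syscall's translation PC plus the syscall's length, and it uses hypothetical helper names (`fall_through_pc`, `is_instr_start`). The instruction start addresses are copied from the disassembly table of the original BB in the first post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Length in bytes of the x86-64 SYSCALL instruction (encoded as 0f 05). */
#define SYSCALL_LENGTH 2

/* Start addresses of the instructions in the original block, copied from
 * the disassembly table in the first post. */
static const uint64_t orig_instr_starts[] = {
    0x00007FFFF79A8490ULL, 0x00007FFFF79A8494ULL, 0x00007FFFF79A8496ULL,
    0x00007FFFF79A8498ULL, 0x00007FFFF79A849AULL, 0x00007FFFF79A849CULL,
    0x00007FFFF79A849DULL, 0x00007FFFF79A849EULL, 0x00007FFFF79A84A1ULL,
    0x00007FFFF79A84A8ULL, 0x00007FFFF79A84ADULL, 0x00007FFFF79A84B1ULL,
    0x00007FFFF79A84B5ULL, 0x00007FFFF79A84B8ULL,
};

/* Assumed model: the next block starts right after the last instruction,
 * measured from that instruction's translation PC. */
static uint64_t
fall_through_pc(uint64_t translation_pc, unsigned length)
{
    return translation_pc + length;
}

/* Returns true if pc is the first byte of one of the original instructions. */
static bool
is_instr_start(uint64_t pc)
{
    size_t n = sizeof(orig_instr_starts) / sizeof(orig_instr_starts[0]);
    for (size_t i = 0; i < n; i++) {
        if (orig_instr_starts[i] == pc)
            return true;
    }
    return false;
}
```

Under that assumption, a syscall whose translation PC is 0x00007FFFF79A8490 falls through to 0x00007FFFF79A8492, and `is_instr_start` returns false for that address: it lies inside the 4-byte `nop edx`, whose successor starts at 0x00007FFFF79A8494.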


Mohammad Ewais

Feb 8, 2022, 10:37:39 AM
to DynamoRIO Users
- Everything else works fine, although at some point the app crashes because RAX, RDI, and RSI hold 0 instead of their original values, since the first BB does not run. I diffed this version against a version without the inserted syscall, and up until the crash point all BBs and instruction traces are the same.
- In the first BB, the clean call is right before the syscall.
- I think I made a mistake here: I set the PCs of all the inserted instructions to the same value, which is the PC of "nop edx". I realize now that they should at least differ before and after the syscall. But how do I assign PCs without risking a conflict with existing ones?
- No warnings or asserts. The app just segfaults when it tries to use RAX, RDI, or RSI as an address.

Derek Bruening

Feb 8, 2022, 11:03:33 AM
to dynamor...@googlegroups.com
I would use -loglevel 4 (see https://dynamorio.org/page_logging.html#autotoc_md206) to see exactly what code is executed in the cache and in what sequence.  That should tell you exactly what is happening when the first block is executed.

Mohammad Ewais

Feb 10, 2022, 9:49:58 AM
to DynamoRIO Users
Here is the first reference to the PC in question (which I set as the translation PC for everything in the inserted BB, and which is also the PC of the first instruction in the original BB):

(target 0x00007ffff79a8490 not in cache)
fragment_add_ibl_target tag 0x00007ffff79a8490, branch 1, F0
         Table ret_trace, table 0x00007fff33fcf480, mask 0x000000000000007f
         Table indcall_trace, table 0x00007fff33fcfcc0, mask 0x00000000000007f0
         Table indjmp_trace, table 0x00007fff33fd0500, mask 0x000000000000007f

d_r_dispatch: target = 0x00007ffff79a8490
check_thread_vm_area: pc = 0x00007ffff79a8490
prepend_entry_to_fraglist: putting fragment @0x00007ffff79a8490 (shared) on vmarea 0x00007ffff79a6000-0x00007ffff7b11000
check_thread_vm_area: check_stop = 0x00007ffff7b11000

interp: start_pc = 0x00007ffff79a8490
  0x00007ffff79a8490  f3 0f 1e fa          nop    edx
  0x00007ffff79a8494  41 57                push   r15
  0x00007ffff79a8496  41 56                push   r14
  0x00007ffff79a8498  41 55                push   r13
  0x00007ffff79a849a  41 54                push   r12
  0x00007ffff79a849c  55                   push   rbp
  0x00007ffff79a849d  53                   push   rbx
  0x00007ffff79a849e  48 89 cb             mov    rbx, rcx
  0x00007ffff79a84a1  48 81 ec 98 00 00 00 sub    rsp, 0x00000098
        wrote all 6 flags now!
  0x00007ffff79a84a8  48 89 7c 24 10       mov    qword ptr [rsp+0x10], rdi
  0x00007ffff79a84ad  89 74 24 0c          mov    dword ptr [rsp+0x0c], esi
  0x00007ffff79a84b1  48 89 14 24          mov    qword ptr [rsp], rdx
  0x00007ffff79a84b5  4d 85 c9             test   r9, r9
  0x00007ffff79a84b8  74 0c                jz     0x00007ffff79a84c6
end_pc = 0x00007ffff79a84ba


It seems like my newly created BB never makes it into the code cache at all. I am not sure why that would be the case.

Derek Bruening

Feb 10, 2022, 10:46:55 AM
to dynamor...@googlegroups.com
The logs should tell you everything: what the fall-through PC is for the end of the transformed first block, determining the PC where DR decodes for the subsequent block.  It should all be there in the log.

Mohammad Ewais

Feb 16, 2022, 2:49:40 PM
to DynamoRIO Users
Hello Derek,

The logs told me nothing more than I already knew: the BB was not executing even though it was logged as being instrumented. It is worth noting that this was the first BB after a cache flush; I am not sure whether that causes any issues.
Anyway, I refactored my code a bit and managed to get the BB to actually execute, but now I have a different problem:

The original BB start PC was: 0x00007FFFF7A97D4D
The inserted BB translation PC is: 0x00007FFFF7A97D4B (the same for all instructions, including the syscall). This PC plus the 2 bytes of the syscall should yield the original PC above, so the fall-through should be fine.
What I get, however, is the following: the syscall gets invoked many times in a loop, then the loop suddenly breaks and the original BB starts doing its business.

Here's a representative part of the log:

Instrumentation
before instrumentation:
TAG  0x00007ffff7a97d4d
 +0    L3 @0x00007fff345c0d98  31 ed                xor    ebp, ebp
 +2    L3 @0x00007fff345c0e60  58                   pop    rax
 +3    L3 @0x00007fff345c0f58  5f                   pop    rdi
 +4    L3 @0x00007fff345c1020  ff d0                call   rax
END 0x00007ffff7a97d4d


after instrumentation:
TAG  0x00007ffff7a97d4d
 +0    L4 @0x00007fff345c1150  48 a3 18 9b 1f b5 ff mov    qword ptr [0x00007fffb51f9b18], rax
                               7f 00 00
 +10   L4 @0x00007fff345c11d0  48 89 3d 19 12 d4 80 mov    qword ptr [0x00007fffb51f9b20], rdi
 +17   L4 @0x00007fff345c1250  48 89 35 21 12 d4 80 mov    qword ptr [0x00007fffb51f9b28], rsi
 +24   L4 @0x00007fff345c12d0  48 b8 9e 00 00 00 00 mov    rax, 0x000000000000009e
                               00 00 00
 +34   L4 @0x00007fff345c1350  48 bf 03 10 00 00 00 mov    rdi, 0x0000000000001003
                               00 00 00
 +44   L4 @0x00007fff345c13d0  48 be 78 9c 1f b5 ff mov    rsi, 0x00007fffb51f9c78
                               7f 00 00
 +54   L3 @0x00007fff345c10e8  0f 05                syscall
END 0x00007ffff7a97d4d



Fragment 3792, tag 0x00007ffff7a97d4d, flags 0x1801018, shared, tracehead, size 59, must end trace:                                                      
                                                                                                                                                         
  0x00007fffb51ec04c  48 a3 18 9b 1f b5 ff mov    qword ptr [0x00007fffb51f9b18], rax                                                                    
                      7f 00 00                                                                                                                            
  0x00007fffb51ec056  48 89 3d c3 da 00 00 mov    <rel> qword ptr [0x00007fffb51f9b20], rdi                                                              
  0x00007fffb51ec05d  48 89 35 c4 da 00 00 mov    <rel> qword ptr [0x00007fffb51f9b28], rsi                                                              
  0x00007fffb51ec064  48 b8 9e 00 00 00 00 mov    rax, 0x000000000000009e                                                                                
                      00 00 00                                                                                                                            
  0x00007fffb51ec06e  48 bf 03 10 00 00 00 mov    rdi, 0x0000000000001003                                                                                
                      00 00 00                                                                                                                            
  0x00007fffb51ec078  48 be 78 9c 1f b5 ff mov    rsi, 0x00007fffb51f9c78                                                                                
                      7f 00 00                                                                                                                            
  0x00007fffb51ec082  e9 bc 5b d3 ff       jmp    0x00007fffb4f21c43                                                                                      
  -------- exit stub 0: -------- <target: 0x00007ffff7a97d4d> type: fall-through/speculated/IAT                                                          
  0x00007fffb4f21c43  67 65 48 a3 00 00 00 mov    qword ptr [gs:0x00], rax                                                                                
                      00
  0x00007fffb4f21c4b  48 b8 00 80 26 34 ff mov    rax, 0x00007fff34268000
                      7f 00 00
  0x00007fffb4f21c55  e9 a6 e1 2d ff       jmp    $0x00007fffb41ffe00 <fcache_return>

This looks OK so far: the original BB just makes way for the inserted one. It even shows the correct fall-through address.

Loop

Entry into F3792(0x00007ffff7a97d4d).0x00007fffb51ec04c (trace head)(shared)

fcache_enter = 0x00007fffb41ffd00, target = 0x00007fffb51ec04c
Exit from F3792(0x00007ffff7a97d4d).0x00007fffb51ec082 (shared)
 (block ends with syscall)
Entry into do_syscall to execute a non-ignorable system call
system call 158
fcache_enter = 0x00007fffb41ffd00, target = 0x00007fffb4200c40
Exit from system call
post syscall: sysnum=0x000000000000009e, result=0x0000000000000000 (0)
thread 49300 segment change => app lib tls base: 0x000000abbba01640, alt tls base: 0x0000000000000000
finished handling system call

d_r_dispatch: target = 0x00007ffff7a97d4d
priv_mcontext_t @0x00007fff348d8bc0
    SKIPPED for size
Entry into F3792(0x00007ffff7a97d4d).0x00007fffb51ec04c (trace head)(shared)


This is the bad part: it keeps going on for a while. Every time it tries to fall through to 0x00007FFFF7A97D4D, it takes the inserted BB (which starts at 0x00007FFFF7A97D4B) rather than invoking my BB instrumentation for the original.
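
The log excerpts above show fragment F3792 stored under tag 0x00007ffff7a97d4d with an exit stub that also targets 0x00007ffff7a97d4d, and each `d_r_dispatch` with that target re-enters F3792. The following toy model, which is only an illustrative sketch and not DynamoRIO's actual data structures or dispatch logic, shows why a cache keyed by tag behaves this way. All names (`fragment_t`, `lookup`, `count_syscalls`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a code cache, NOT DynamoRIO's data structures: each cached
 * fragment is keyed by its tag and records where it falls through to. */
typedef struct {
    uint64_t tag;          /* key used by the dispatcher's lookup */
    uint64_t fall_through; /* next target once the block ends */
    int ends_with_syscall; /* nonzero if the block's last instruction is a syscall */
} fragment_t;

/* Returns the cached fragment for tag, or NULL if there is none. */
static const fragment_t *
lookup(const fragment_t *cache, size_t count, uint64_t tag)
{
    for (size_t i = 0; i < count; i++) {
        if (cache[i].tag == tag)
            return &cache[i];
    }
    return NULL;
}

/* Follows fall-through targets from start for at most max_entries fragment
 * entries and returns how many syscalls were executed. A cache miss ends the
 * walk; that is the point where a new block would have to be built and
 * instrumented. */
static int
count_syscalls(const fragment_t *cache, size_t count, uint64_t start,
               int max_entries)
{
    int syscalls = 0;
    uint64_t target = start;
    for (int i = 0; i < max_entries; i++) {
        const fragment_t *f = lookup(cache, count, target);
        if (f == NULL)
            break;
        if (f->ends_with_syscall)
            syscalls++;
        target = f->fall_through;
    }
    return syscalls;
}
```

In this model, a single fragment with both tag and fall-through equal to 0x00007ffff7a97d4d repeats its syscall on every entry until the entry limit is reached, because the lookup for the fall-through target always hits the same fragment. If the same fragment were keyed by 0x00007ffff7a97d4b instead, the walk would stop after one syscall with a cache miss at 0x00007ffff7a97d4d. Whether DynamoRIO allows a block to be registered under such a different tag is not something this sketch answers.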

Break

Entry into F3792(0x00007ffff7a97d4d).0x00007fffb51ec04c (trace head)(shared)

fcache_enter = 0x00007fffb41ffd00, target = 0x00007fffb51ec04c
Exit from F3792(0x00007ffff7a97d4d).0x00007fffb51ec082 (shared)
 (block ends with syscall)
Entry into do_syscall to execute a non-ignorable system call
system call 158
fcache_enter = 0x00007fffb41ffd00, target = 0x00007fffb4200c40
Exit from system call
post syscall: sysnum=0x000000000000009e, result=0x0000000000000000 (0)
thread 49300 segment change => app lib tls base: 0x000000abbba01640, alt tls base: 0x0000000000000000
finished handling system call

d_r_dispatch: target = 0x00007ffff7a97d4d
Going to start trace with F3792 (tag 0x00007ffff7a97d4d)
Creating private copy of F3792 (0x00007ffff7a97d4d) for trace creation
check_thread_vm_area: pc = 0x00007ffff7a97d4d
new vm area for thread: 0x00007ffff79a6000-0x00007ffff7b11000 ----  libc.so.6
checking thread vmareas against executable_areas
prepend_entry_to_fraglist: putting fragment @0x00007ffff7a97d4d (private) on vmarea 0x00007ffff79a6000-0x00007ffff7b11000
check_thread_vm_area: check_stop = 0x00007ffff7b11000

I think the line in red is what triggers it to break the loop. Of course, I could be wrong.

Solution
So, based on this: is there a way to insert the new BB with a different tag, one that starts at 0x00007FFFF7A97D4B instead of 0x00007FFFF7A97D4D? Is there any other way to resolve this conflict?
Technically, this does not break the application, but it does create maybe 100 system calls instead of 1, which is obviously not good and would affect performance.