Following x64 code executed via a far call (selector 0x33) in WoW64 applications


QtK

Jun 4, 2021, 1:23:25 PM
to DynamoRIO Users
Hello,

Going through the documentation, I did not find any way of switching context to x64 when running a WoW64 application, which can be useful in the case of chameleon code, e.g.:

0x100000 mov eax, 0 // 32bits WoW64 application's code
0x100005 call 0x33:0x1000a0 
0x10000c ret
...
0x1000a0 sub rsp, 0x60 // x64 code
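
For readers unfamiliar with the encoding: a direct far call in 32-bit code is opcode 0x9A followed by a 32-bit offset and a 16-bit selector (7 bytes in total), and under WoW64 selector 0x23 is the 32-bit code segment while 0x33 is the 64-bit one. A minimal C sketch that recognizes such a mode-switching call is shown here; the helper names are hypothetical and are not part of the DynamoRIO API.

```c
#include <stddef.h>
#include <stdint.h>

/* WoW64 code-segment selectors: 0x23 runs 32-bit code, 0x33 runs 64-bit code. */
#define CS_WOW64_32 0x23
#define CS_WOW64_64 0x33

/* Decodes a direct far call (opcode 0x9A, "call ptr16:32") as encoded in
 * 32-bit code. Returns the instruction length (7) on success, 0 otherwise. */
static size_t
decode_far_call32(const uint8_t *pc, size_t avail, uint32_t *offset,
                  uint16_t *selector)
{
    if (avail < 7 || pc[0] != 0x9A)
        return 0;
    /* The offset and the selector are stored little-endian. */
    *offset = (uint32_t)pc[1] | ((uint32_t)pc[2] << 8) |
        ((uint32_t)pc[3] << 16) | ((uint32_t)pc[4] << 24);
    *selector = (uint16_t)(pc[5] | (pc[6] << 8));
    return 7;
}

/* Hypothetical helper: true if the far call at pc switches to 64-bit mode. */
static int
far_call_enters_x64(const uint8_t *pc, size_t avail)
{
    uint32_t offset;
    uint16_t selector;
    return decode_far_call32(pc, avail, &offset, &selector) != 0 &&
        selector == CS_WOW64_64;
}
```

A tool following such code would have to decode everything after the call target as 64-bit instructions until the matching far return.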

Is there any way of handling such code with a 32-bit drrun tool?

Bests,
Qetak.

Derek Bruening

Jun 4, 2021, 2:38:08 PM
to dynamor...@googlegroups.com
There is some support for mixed-mode 32-bit-and-64-bit.  The core certainly supports it, labeling each block as 32-bit or 64-bit.  You would have to use 64-bit DR and a 64-bit client, and set options like -inject_x64 to ensure 64-bit DR is injected into 32-bit children.  The drrun launcher may not support this; you would have to use a 64-bit launcher process (e.g., bin64/drrun on bin64/create_process) to inject.  This is not well-tested recently nor well-supported for clients; IIRC the "win32.mixedmode" test is currently missing from the CI (but the intent was to re-add it using the build-and-test scheme worked out for the "win32.xarch" test).  If you wanted to improve the support, the first step would be reviving something like the win32.mixedmode test, but with an added client, and then adding support to drrun for more convenient usage.

--
You received this message because you are subscribed to the Google Groups "DynamoRIO Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dynamorio-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dynamorio-users/c9b109b8-a613-4d28-944e-4dba1f23933fn%40googlegroups.com.

QtK

Jun 5, 2021, 8:57:22 AM
to DynamoRIO Users
On Friday, June 4, 2021 at 8:38:08 PM UTC+2 Derek Bruening wrote:
There is some support for mixed-mode 32-bit-and-64-bit. 
Glad to hear that most of the work is done already!
 
The core certainly supports it, labeling each block as 32-bit or 64-bit.  You would have to use 64-bit DR and a 64-bit client, and set options like -inject_x64 to ensure 64-bit DR is injected into 32-bit children.  The drrun launcher may not support this; you would have to use a 64-bit launcher process (e.g., bin64/drrun on bin64/create_process) to inject.  

I have been trying the following command line:
bin64\drrun.exe -inject_x64 -c32 32bitsclient.dll -- -c64 64bitsclient.dll -- bin64\create_process.exe hello_world32.exe
And:
bin64\drrun.exe -inject_x64 -c 64bitsclient.dll -- bin64\create_process.exe hello_world32.exe

In both cases, it looks like DR is injected into create_process.exe but not into hello_world32.exe.

win32.xarch is passing, and win32.mixedmode does not exist.

I'd be glad to improve support for mixed 32/64-bit mode, but I do not know DynamoRIO well at all, and I have no idea how to do so.
Even if I can't improve the support, I'd be satisfied with the create_process trick mentioned above if it worked in this case.

Derek Bruening

Jun 7, 2021, 11:21:35 AM
to dynamor...@googlegroups.com
Hmm, looking at https://github.com/DynamoRIO/dynamorio/pull/4653 it says:

Adds an -inject_x64 option to inject a 64-bit DR lib into a 32-bit
child from a 64-bit parent, but this option is only sketched out and
is not fully supported yet: #49 covers adding tests and official
support.

So it looks like either some additional pieces are needed for the new injection defaults from that PR,
or it may work by going back to the old injection defaults by tweaking the -early_inject, -early_inject_map,
and -early_inject_location options.

win32.mixedmode parameters are set.


QtK

Jun 7, 2021, 12:31:29 PM
to DynamoRIO Users
I'm going to play with those options a bit to try to find a solution. I would love to help improve DynamoRIO if I ever have time (and if I succeed). Could you pinpoint which source code needs to be modified?
Also, the comment on this test states:
    # TODO i#803: Once cross-arch injection works on Windows, add a test of
    # the "-c32 -c64" syntax like we have for Linux.
I am confused, because from what I've read in https://github.com/DynamoRIO/dynamorio/pull/4653, a 64-bit client should be injected into a 32-bit child.

QtK

Jun 7, 2021, 12:35:27 PM
to DynamoRIO Users
Also, you mention options that do not appear in the `-help` output of the drrun tool. I could not find any trace of those options in the documentation either. Where can I find information about them?

QtK

Jun 7, 2021, 1:17:44 PM
to DynamoRIO Users
I am noting some stuff here for the record:

I found the early_inject_location options in the source code, and -early_inject_location 5 is the only one that seems to trigger an injection into the child process, but it triggers an Out Of Memory error. That would need some investigation, but it is highly unlikely to be a true OOM, as the process is only just being injected.
After investigation using -debug -loglevel 2, an encode cti error occurs: target beyond 32-bit reach. This error gets triggered by a client assert here: https://github.com/DynamoRIO/dynamorio/blob/master/core/ir/x86/encode.c#L2639
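
As background for the "target beyond 32-bit reach" message: a near jmp or call encodes its target as a signed 32-bit displacement from the address of the next instruction, so the target has to lie within about 2 GiB of the branch. A minimal C sketch of that check follows; the function name is hypothetical and this is not the code in encode.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* A near jmp/call rel32 stores (target - address of the next instruction)
 * as a signed 32-bit displacement. Returns true if that displacement fits. */
static bool
reachable_rel32(uint64_t next_pc, uint64_t target)
{
    /* Unsigned subtraction wraps; reinterpret the result as a signed distance. */
    int64_t disp = (int64_t)(target - next_pc);
    return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

A branch emitted in a low buffer that targets a routine mapped high in the 64-bit address space fails this check, which would be consistent with the error reported above.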


Derek Bruening

Jun 7, 2021, 2:49:11 PM
to dynamor...@googlegroups.com
Looking at the git blame: looks like it's from PR#4324 and that test was later added as part of the win32.xarch test, so it seems to be a now-stale comment.
 


Derek Bruening

Jun 7, 2021, 2:51:10 PM
to dynamor...@googlegroups.com
There are (too) many internal options not supported/useful/simple enough for external docs: https://github.com/DynamoRIO/dynamorio/blob/master/core/optionsx.h


QtK

Jun 11, 2021, 7:55:01 AM
to DynamoRIO Users
Thanks for all the answers, and thanks for your precious time, which is much appreciated. 

I'm back after some debugging.
In the default injection mode (RtlUserThreadStart hooking):
A crash occurs when trying to encode a jmp to RtlUserThreadStart in the generate_switch_mode_jmp_to_hook function. This function makes a mode switch to x86 and then tries to create a jmp to RtlUserThreadStart from ntdll64, which obviously is not possible without a far call.
I'm rather confused by this function. I do not understand the purpose of the restore_esp variable defined at https://github.com/DynamoRIO/dynamorio/blob/master/suite/tests/CMakeLists.txt#L4486, or of the function as a whole. It switches to x86, "restore_esp", and then jumps to the hook, which is supposed to be a hook on the x64 version of RtlUserThreadStart. Can't we just jump directly to RtlUserThreadStart?

Also, at https://github.com/DynamoRIO/dynamorio/blob/33b31b8164bd8050adbb2e2c01869b20739614a2/core/win32/inject.c#L1250, we assume that we have switched to x86 (which is problematic, as we are trying to jmp to the x64 version of RtlUserThreadStart), and we try to jump back to x64, calculating the address to jump to using the "pre_jmp" variable, which points to a buffer. Another problem occurs with this variable at line https://github.com/DynamoRIO/dynamorio/blob/33b31b8164bd8050adbb2e2c01869b20739614a2/core/win32/inject.c#L1262

      RAW_INSERT_INT32(cur_local_pos, pre_jmp + far_jmp_len);

 pre_jmp being in x64 address space, RAW_INSERT can't be INT32, as it will trigger an ASSERT for address range. This has to be RAW_INSERT_INT64. 

I tried some stuff to fix all that, but I am really confused about the "generate_switch_mode_jmp_to_hook" function switching to x86 to "restore_esp" and jumping to an ntdll64 function, while after this function call we assume being in x86 mode.


Side note : INJECT_LOCATION_KiUserApc suggested above is triggering an OOM error (probably done by an assert or something that tries to reach the injected x64 dll into the 32bits hello_world) that is hard to debug. I was focusing on RtlUserThreadStart hook for now.

I hope you can answer the questions that the (many, many, thanks for that) comments did not answer.
Best regards,

Qtk

QtK

Jun 11, 2021, 8:44:21 AM
to DynamoRIO Users
Here is the stack trace of the encoding error, in case you need it:
0:000> kn
 # Child-SP          RetAddr           Call Site
00 00000196`c17ebc68 00000000`153bdcdf ntdll!NtRaiseHardError
01 00000196`c17ebc70 00000000`1537313b dynamorio!nt_messagebox+0x17f [c:\users\qtk\documents\dynamorio\core\win32\ntdll.c @ 3776]
02 00000196`c17ebd10 00000000`15103753 dynamorio!debugbox+0x5b [c:\users\qtk\documents\dynamorio\core\win32\os.c @ 5363]
03 00000196`c17ebd40 00000000`150fe115 dynamorio!d_r_notify+0x283 [c:\users\qtk\documents\dynamorio\core\utils.c @ 1957]
04 00000196`c17ec5a0 00000000`152d9dc9 dynamorio!external_error+0x175 [c:\users\qtk\documents\dynamorio\core\utils.c @ 204]
05 00000196`c17ec620 00000000`152da169 dynamorio!encode_cti+0x519 [c:\users\qtk\documents\dynamorio\core\ir\x86\encode.c @ 2638]
06 00000196`c17ec6f0 00000000`152c5c34 dynamorio!instr_encode_arch+0x359 [c:\users\qtk\documents\dynamorio\core\ir\x86\encode.c @ 2783]
07 00000196`c17ec980 00000000`152ba748 dynamorio!instr_encode_to_copy+0x44 [c:\users\qtk\documents\dynamorio\core\ir\encode_shared.c @ 134]
08 00000196`c17ec9d0 00000000`153d1c30 dynamorio!instrlist_encode_to_copy+0x2d8 [c:\users\qtk\documents\dynamorio\core\ir\instrlist.c @ 572]
09 00000196`c17eca50 00000000`153d2207 dynamorio!generate_switch_mode_jmp_to_hook+0x350 [c:\users\qtk\documents\dynamorio\core\win32\inject.c @ 1102]
0a 00000196`c17ecc50 00000000`153d3512 dynamorio!inject_gencode_mapped_helper+0x297 [c:\users\qtk\documents\dynamorio\core\win32\inject.c @ 1217]
0b 00000196`c17ecef0 00000000`153cde1f dynamorio!inject_gencode_mapped+0x1e2 [c:\users\qtk\documents\dynamorio\core\win32\inject.c @ 1436]
0c 00000196`c17ed190 00000000`1537c9e3 dynamorio!inject_into_new_process+0x43f [c:\users\qtk\documents\dynamorio\core\win32\inject.c @ 1580]
0d 00000196`c17ed730 00000000`153763ab dynamorio!inject_into_process+0x3a3 [c:\users\qtk\documents\dynamorio\core\win32\os.c @ 3430]
0e 00000196`c17ed9d0 00000000`153955c5 dynamorio!maybe_inject_into_process+0x1eb [c:\users\qtk\documents\dynamorio\core\win32\os.c @ 3565]
0f 00000196`c17eda40 00000000`15386c9c dynamorio!postsys_CreateUserProcess+0xb15 [c:\users\qtk\documents\dynamorio\core\win32\syscall.c @ 3243]
10 00000196`c17ee8d0 00000000`150eac67 dynamorio!post_system_call+0x1bfc [c:\users\qtk\documents\dynamorio\core\win32\syscall.c @ 4421]
11 00000196`c17eec70 00000000`150dc1e9 dynamorio!handle_post_system_call+0xf7 [c:\users\qtk\documents\dynamorio\core\dispatch.c @ 2178]
12 00000196`c17eecd0 00000000`150d8d29 dynamorio!dispatch_enter_dynamorio+0x1249 [c:\users\qtk\documents\dynamorio\core\dispatch.c @ 879]
13 00000196`c17eee80 00007ff5`dce0f31b dynamorio!d_r_dispatch+0x19 [c:\users\qtk\documents\dynamorio\core\dispatch.c @ 165]
14 00000196`c17eefe0 00000196`c17cf080 0x00007ff5`dce0f31b
15 00000196`c17eefe8 abababab`abababab 0x00000196`c17cf080
16 00000196`c17eeff0 abababab`abababab 0xabababab`abababab
17 00000196`c17eeff8 00000000`00000000 0xabababab`abababab

Using cronbuild 8.0.18655 (no important commit has been made to inject.c since this cronbuild)

QtK

Jun 11, 2021, 10:02:25 AM
to DynamoRIO Users
      RAW_INSERT_INT32(cur_local_pos, pre_jmp + far_jmp_len);

 pre_jmp being in x64 address space, RAW_INSERT can't be INT32, as it will trigger an ASSERT for address range. This has to be RAW_INSERT_INT64. 

Never mind, this is impossible, because a jmp far 0x33:[64-bit offset] does not exist in the Intel architecture. We have to do it some other way.
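
To illustrate the constraint: the direct far jump available to 32-bit code is opcode 0xEA followed by a 32-bit offset and a 16-bit selector, so the target must sit below 4 GiB. A minimal C sketch of an emitter that refuses higher targets follows; the function name is hypothetical and this is not how inject.c is written.

```c
#include <stddef.h>
#include <stdint.h>

/* Emits "jmp far selector:offset" (opcode 0xEA, ptr16:32) for 32-bit code.
 * Returns the number of bytes written (7), or 0 if the target does not fit
 * in the 32-bit offset field. buf must have room for 7 bytes. */
static size_t
emit_far_jmp_direct(uint8_t *buf, uint64_t target, uint16_t selector)
{
    if (target > UINT32_MAX)
        return 0; /* the immediate offset is only 32 bits wide */
    buf[0] = 0xEA;
    buf[1] = (uint8_t)(target & 0xff);
    buf[2] = (uint8_t)((target >> 8) & 0xff);
    buf[3] = (uint8_t)((target >> 16) & 0xff);
    buf[4] = (uint8_t)((target >> 24) & 0xff);
    buf[5] = (uint8_t)(selector & 0xff);
    buf[6] = (uint8_t)(selector >> 8);
    return 7;
}
```

When the emitter returns 0, the transfer needs another route, for example landing first on code placed at a low address.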

QtK

Jun 11, 2021, 12:24:44 PM
to DynamoRIO Users
I think I found the problem.
The second allocate_remote_code_buffer call (https://github.com/DynamoRIO/dynamorio/blob/master/core/win32/inject.c#L1205) from inject_gencode_mapped_helper fails to find a low-address region to allocate.
From the comments I have read in the code, this might be caused by my version of Windows, which is roughly Windows 10.
Any idea how we could get around this problem?

QtK

Jun 14, 2021, 11:35:31 AM
to DynamoRIO Users
From what I have read from inject_gencode_mapped_helper, it might be possible to allocate the "local_buffer" in the target process and do everything from there? The buffer seems to be written to the target process in the end anyway.
The only allocation that fails to have a result <2GB is the local allocation in current process. 
So I guess that a remote allocation of this buffer (with a bit of modification in the shellcode creation) would work?
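
The requirement discussed here, a buffer that ends up below 2GB, amounts to searching an address-space map for a free low gap. A minimal C sketch of such a search over a sorted list of occupied regions follows; the types and names are hypothetical, and this is not what allocate_remote_code_buffer does internally.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t base; /* start of an occupied region */
    uint64_t size; /* its length in bytes */
} region_t;

/* Returns the lowest address >= min_addr of a free gap of at least `size`
 * bytes that ends at or below `limit`. `used` must be sorted by base and
 * non-overlapping. Returns 0 if no such gap exists. */
static uint64_t
find_low_gap(const region_t *used, size_t count, uint64_t min_addr,
             uint64_t size, uint64_t limit)
{
    uint64_t candidate = min_addr;
    for (size_t i = 0; i < count; i++) {
        if (used[i].base >= candidate && used[i].base - candidate >= size)
            break; /* the gap in front of this region is large enough */
        if (used[i].base + used[i].size > candidate)
            candidate = used[i].base + used[i].size; /* skip past the region */
    }
    if (candidate > limit || limit - candidate < size)
        return 0;
    return candidate;
}
```

On Windows the occupied regions would come from querying the process being searched, which is why it matters whether the search runs against the 64-bit parent or the 32-bit child.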


QtK

Jun 16, 2021, 11:06:39 AM
to DynamoRIO Users
I think there is an error in the code, but I need one piece of information to fix it:
should an x64 client DLL injected into an x86 process hook RtlUserThreadStart from ntdll64 or from ntdll32?

Derek Bruening

Jun 16, 2021, 3:38:07 PM
to dynamor...@googlegroups.com
On Wed, Jun 16, 2021 at 11:06 AM QtK <qtkre...@gmail.com> wrote:
I think there is some error in the code but I'd need an information to fix it :
should a x64 client dll injected in a x86 process be hooking RtlUserThreadStart from ntdll64 or ntdll32?

ntdll64.
 


QtK

Jun 17, 2021, 4:17:54 AM
to DynamoRIO Users
Thanks for the answer. 
If we need to hook ntdll64, then I do not understand the purpose of the function generate_switch_mode_jmp_to_hook in inject.c. The comments talk about a 32-bit address, but the ntdll64 hook is a 64-bit address, and the function does a mode switch to x86 before trying to jump to the ntdll64 hook.
Could you please clarify the purpose of this function?

Your time is much appreciated, thanks again.

Derek Bruening

Jun 17, 2021, 12:58:47 PM
to dynamor...@googlegroups.com
Looking at the old code ('git show 9ce2418^:core/win32/inject.c '):
  • When x86_code was set, there were two mode switches: one in the initial takeover code, and another created by generate_switch_mode_jmp_to_hook() to go back to 32-bit mode.  So it assumed the takeover point was 32-bit code, so it has to swap to 64-bit to initialize 64-bit DR, and then back to 32-bit when it starts executing the app code at that takeover point.
  • x86_code was only set for INJECT_LOCATION_ImageEntry, not for the win32.mixedmode test which passed INJECT_LOCATION_KiUserApc: so none of the mode switch code was used at all before for injecting 64-bit DR into a WOW64 child, because it targeted ntdll64.
  • If actually targeting a 64-bit-mode takeover point, x86_code should be false (at least for purposes of whether the 2 mode switches are needed).  Maybe there is a bug in the new code which sets x86_code from the child PE (so 32-bit for WOW64) if it then uses an ntdll64 hook point?
  • Be sure any ntdll64 hook point is actually executed: there's a comment that ntdll64!RtlUserThreadStart is not executed for a WOW64 child.  So maybe a 32-bit hook point is the easiest thing there after all?  And then the two mode switches are needed.
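
The rule in the bullets above can be condensed into a small decision function. This is only a C sketch of the reasoning, with hypothetical names; it is not the x86_code logic in inject.c.

```c
typedef enum { MODE_X86_32, MODE_X86_64 } cpu_mode_t;

/* Number of mode switches the injected code needs when DR, built for
 * dr_mode, takes over at a hook point that executes in hook_mode. */
static int
mode_switches_needed(cpu_mode_t dr_mode, cpu_mode_t hook_mode)
{
    /* Same mode: initialize DR and return to the hook point directly. */
    if (dr_mode == hook_mode)
        return 0;
    /* Different modes: one switch into DR's mode to initialize it, and one
     * switch back before the app code at the hook point runs. */
    return 2;
}
```

With 64-bit DR, an ntdll64 hook point gives 0 and a 32-bit hook point gives 2, which matches the two cases described above.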

QtK

Jun 21, 2021, 9:02:07 AM
to DynamoRIO Users
Thanks for your interesting answer. 

I've deleted my previous posts that are now irrelevant. I have understood the process injection routine now. 
The last thing that bugs me is finding when dynamo_earliest_init_takeover is going to pop the pushed value to transfer control to it.

So, I now have working code that injects into and starts the target process with an x64 parent, an x86 child, and an x64 client DLL.
Now a new error shows up, which is exactly the same as the one that occurs with the KiUserApc injection point: an out of memory error.

This occurs in vmm_heap_init(), when trying to allocate space for the "vmheap" section: https://github.com/DynamoRIO/dynamorio/blob/d02506ae72fd69838e559804633ba5f0743604ca/core/heap.c#L1904
While debugging this, I found that this call tries to allocate 2GB of memory (which I *think* is impossible, as this is the whole available space for our program).

Thanks for your time and for your precious answers,

Bests,

Qtk

QtK

Jun 21, 2021, 9:17:45 AM
to DynamoRIO Users
Also, I will do a pull request providing the working code for this injection with RtlUserThreadStart once I get everything working properly.

Derek Bruening

Jun 21, 2021, 11:23:47 AM
to dynamor...@googlegroups.com
On Mon, Jun 21, 2021 at 9:02 AM QtK <qtkre...@gmail.com> wrote:
Thanks for your interesting answer. 

I've deleted my previous posts that are now irrelevant. I have understood the process injection routine now. 
The last thing that bugs me is finding when dynamo_earliest_init_takeover is going to pop the pushed value to transfer control to it.

Looking at the code: dynamorio_earliest_init_takeover_C calls dynamorio_app_take_over which takes over at the return address inside dynamorio_earliest_init_takeover_C,
which then returns back to dynamorio_earliest_init_takeover in asm and does some GPR popping and then returns to the retaddr we've placed on the stack (e.g., RtlUserThreadStart by default).  Hmm, I thought it was cleaner than this: this means there are 2 interpreted blocks presented to clients which are really DR code.  I filed https://github.com/DynamoRIO/dynamorio/issues/4958 on not passing them to clients or on tweaking the context to avoid them.

QtK

Jun 21, 2021, 11:36:10 AM
to DynamoRIO Users
Alright, I think I got it now !

About the memory allocation error: NtAllocateVirtualMemory is called with a size of 0x1000, which is tiny, and the allocation fails with 0xC000000D, which is STATUS_INVALID_PARAMETER. I have no idea yet what could cause such an error; it is not documented in the Microsoft documentation and looks quite obscure.
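
When reading raw status values like this one, it can help to split the NTSTATUS into its fields: the top two bits are the severity, bit 29 is the customer flag, bits 27 to 16 are the facility, and the low 16 bits are the code. A minimal C sketch, with a hypothetical struct and function name:

```c
#include <stdint.h>

typedef struct {
    unsigned severity; /* bits 31-30: 0 success, 1 info, 2 warning, 3 error */
    unsigned customer; /* bit 29: set for customer-defined codes */
    unsigned facility; /* bits 27-16 */
    unsigned code;     /* bits 15-0 */
} ntstatus_fields_t;

/* Splits a 32-bit NTSTATUS value into its documented bit fields. */
static ntstatus_fields_t
decode_ntstatus(uint32_t status)
{
    ntstatus_fields_t f;
    f.severity = status >> 30;
    f.customer = (status >> 29) & 1;
    f.facility = (status >> 16) & 0xfff;
    f.code = status & 0xffff;
    return f;
}
```

For 0xC000000D this gives severity 3 (error), facility 0 and code 0xD, so the failure comes from a generic parameter check rather than from a specific subsystem.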


QtK

Jun 22, 2021, 6:38:55 AM
to DynamoRIO Users
Got my debugger running again, and the second allocation tries to allocate 0x200001000 bytes. Lowering this amount makes the allocation succeed, but the program fails later on because of conflicting addresses.
I've read in this commit (https://github.com/DynamoRIO/dynamorio/commit/99fd6ce5dae8b7ea33268e1043a518ad42e4dd80#diff-1814fc3305715d5032f6fe90d9a1c30160692c96ef98db803b6c1626b364d457), which added the whole vmheap mechanism, that the -reachable_heap option can be passed to avoid using the vmheap and use only vmcode. This works for create_process.exe, but the option is not propagated to the child, which still tries to allocate both vmcode and vmheap.
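
For scale, 0x200001000 bytes is 8 GiB plus one 4 KiB page, which cannot fit in the 4 GiB that a 32-bit address space spans. A minimal C sketch of that arithmetic and of a simple clamp follows; the function names and the idea of a fixed cap are assumptions for illustration, not DynamoRIO's sizing logic.

```c
#include <stdbool.h>
#include <stdint.h>

#define GIB (1024ULL * 1024 * 1024)

/* True if a reservation of `size` bytes could fit at all in an address
 * space that spans `addr_space` bytes. */
static bool
reservation_can_fit(uint64_t size, uint64_t addr_space)
{
    return size <= addr_space;
}

/* Hypothetical policy: never request more than `cap` bytes. */
static uint64_t
clamp_reservation(uint64_t requested, uint64_t cap)
{
    return requested < cap ? requested : cap;
}
```

Clamping only makes the reservation itself succeed; any code that assumes the original size or layout still has to cope with the smaller region.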

QtK

Jun 22, 2021, 11:11:52 AM
to DynamoRIO Users
I managed to get things running with a custom allocator that allocates far less than 0x200001000 bytes, plus the -reachable_heap parameter, which skips this custom allocator for vmheap in create_process.exe.

The target gets injected and DynamoRIO starts running (my client isn't loaded yet, so I guess it's still stuck in some initialization), and a crash occurs at global_do_syscall_syscall, on the syscall instruction:

Application C:\Users\Qtk\Desktop\SomeClient\dist\Win32\Release\TargetApp.exe (4612).  DynamoRIO internal crash at PC 0x000000001541cae7.  Please report this at http://dynamorio.org/issues/.  Program aborted.
0xc000001d 0x00000000 0x000000001541cae7 0x000000001541cae7 0x0000000000000000 0x0000000000000000
Base: 0x0000000015000000
Registers: eax=0x0000000000000166 ebx=0x0000000000878000 ecx=0x00000000441d3ab0 edx=0x00000000441d3ab0
        esi=0x0000000000000000 edi=0x0000000000000000 esp=0x0000000000bdff83 ebp=0x0000000000000000
        r8 =0x0000000000000001 r9 =0x0000000000000000 r10=0x0000000000000000 r11=0x00000000007ce5a0
        r12=0x0000000000879000 r13=0x00000000007cfda0 r14=0x00000000007ceeb0 r15=0x0000000077654660
        eflags=0x0000000000010202
version 8.0.0, custom build 

Putting a bp at NtRaiseHardError shows a really weird stacktrace : 

 # Child-SP          RetAddr           Call Site
00 00000000`3c2637b8 00000000`153be37f ntdll!NtRaiseHardError
01 00000000`3c2637c0 00000000`3c263840 0x153be37f
02 00000000`3c2637c8 00000000`155708b0 0x3c263840
03 00000000`3c2637d0 00000000`00000101 0x155708b0
04 00000000`3c2637d8 00000000`00000000 0x101

Derek Bruening

Jun 22, 2021, 11:29:51 AM
to dynamor...@googlegroups.com
Sounds like changes to the default options (larger initial allocations: 8G vmheap) while the mixedmode test was disabled broke it further.  You would want to tweak the vmheap size for mixedmode.

The options not propagating is not good: that is a key part of injection.  Xref the environment variable propagation changes in PR #4653 -- are they not working for this case?


QtK

Jun 22, 2021, 3:53:09 PM
to DynamoRIO Users
There is a lot to review in this PR, as it is quite big. I'll try to find out why the options are not being propagated.

I'm not sure this will solve the injection issue though, as I tried to force -reachable_heap in the child, and as far as I can remember it resulted in another error, related to the syscall crash from my previous message. Or maybe it mentioned a far call; I'm not sure. I will have to check again.

QtK

Jun 25, 2021, 9:51:30 AM
to DynamoRIO Users
Hello again.

I managed to get the code successfully injected, the arguments properly passed to the child, and the client DLL starting. However, it crashes on the mode-switch jump.
I can see the block being passed to the callback registered with drmgr_register_bb_instrumentation_event, showing:

00000000`00b90000 680f00b900      push    0B9000Fh
00000000`00b90005 66c74424042300  mov     word ptr [rsp+4],23h
00000000`00b9000c ff2c24          jmp     fword ptr [rsp]
00000000`00b9000f 8b250010b900    mov     esp,dword ptr [00000000`01721015] // should be x86 code, but displayed as amd64 because the current WinDbg mode is amd64
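
The first three instructions of this listing build a far pointer on the stack (offset at [rsp], selector 0x23 at [rsp+4]) and jump through it, landing right after the sequence in 32-bit mode. A minimal C sketch that emits the same 15 bytes follows; the function name is hypothetical and the real generator is generate_switch_mode_jmp_to_hook.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SWITCH_SEQ_LEN 15

/* Emits, into buf (code that will execute at address `base`):
 *   68 id                   push  <base + 15>
 *   66 C7 44 24 04 iw       mov   word ptr [rsp+4], <selector>
 *   FF 2C 24                jmp   fword ptr [rsp]
 * so that execution continues right after the sequence under `selector`.
 * buf must have room for SWITCH_SEQ_LEN bytes. Returns the bytes written. */
static size_t
emit_mode_switch(uint8_t *buf, uint32_t base, uint16_t selector)
{
    uint32_t target = base + SWITCH_SEQ_LEN;
    uint8_t seq[SWITCH_SEQ_LEN] = {
        0x68, 0, 0, 0, 0,                   /* push imm32 */
        0x66, 0xC7, 0x44, 0x24, 0x04, 0, 0, /* mov word ptr [rsp+4], imm16 */
        0xFF, 0x2C, 0x24,                   /* jmp fword ptr [rsp] */
    };
    seq[1] = (uint8_t)(target & 0xff);
    seq[2] = (uint8_t)((target >> 8) & 0xff);
    seq[3] = (uint8_t)((target >> 16) & 0xff);
    seq[4] = (uint8_t)((target >> 24) & 0xff);
    seq[10] = (uint8_t)(selector & 0xff);
    seq[11] = (uint8_t)(selector >> 8);
    memcpy(buf, seq, sizeof(seq));
    return sizeof(seq);
}
```

Calling it with base 0xb90000 and selector 0x23 reproduces the bytes in the listing, with the far target 0xb9000f being the address of the mov esp instruction.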

I could not find a way to hit a breakpoint at 0xb90000. Is there a way of doing so?

QtK

Jun 25, 2021, 10:02:14 AM
to DynamoRIO Users
Oh also, as the client DLL is running, the above code is being instrumented. This is the early code you were talking about when you said "early blocks presented to the client".

Derek Bruening

Jun 25, 2021, 11:34:58 AM
to dynamor...@googlegroups.com
The block will end on the far jump.  DR will mangle it to use the "far ibl" IIRC.

To break on app code: do not use debugger breakpoints.  Quoting from 
https://dynamorio.org/page_debugging.html: "Use read watchpoints instead of breakpoints in application code, as the trap instruction inserted by the debugger into the application code can end up copied into DynamoRIO's code cache, resulting in an unhandled trap."

QtK

Jun 25, 2021, 12:28:22 PM
to DynamoRIO Users
It seems like even if I put a watchpoint on my target, the application crashes inside instrumentation of the very first block, and never reaches true execution.

QtK

Jun 29, 2021, 7:35:24 AM
to DynamoRIO Users
Injection seems to work, but RtlUserThreadCreate segfaults somehow. I suspect that I messed up the stack, but I have no idea how.
I have noticed that it takes its arguments from eax and ebx (in the x86 version, which is the one I'm calling). I'm not sure whether EBX is restored after a call to dynamorio_earliest_init_takeover. You have a comment saying:

         *  [...]So we instead only touch
* xax here and we target an asm routine in DR that will preserve the
* other regs, enabling returning to the hooked routine w/ the
* original state (except xax which is scratch and xbx which kernel
* isn't counting on of course).

Can you confirm that XBX is clobbered when calling this routine?

QtK

Jun 29, 2021, 10:53:29 AM
to DynamoRIO Users
I'm facing a curious situation.
esp is saved before the switch + jmp to RtlUserThreadStart, and restored inside the switch-mode jmp. I added some code to save and restore eax and ebx too. Saving and restoring EBX is not a problem, but saving and restoring EAX seems to be. When I do, it somehow breaks dynamorio.dll initialization: it cannot find the cached path of the DLL in the child and aborts here: https://github.com/DynamoRIO/dynamorio/blob/master/core/win32/os.c#L3247 on the dr_earliest_injected assert. I guess that under normal circumstances this condition: https://github.com/DynamoRIO/dynamorio/blob/master/core/win32/os.c#L3239 should not pass, so no assert gets triggered.

QtK

Jun 29, 2021, 11:53:57 AM
to DynamoRIO Users
The prototype of RtlUserThreadStart is (PTHREAD_START_ROUTINE pfnStartAddr, PVOID pvParam), and the start routine is held in eax.
By printing the eax value from the client during the switch + jmp, we can see that the eax value has been thrown away and is now 0. RtlUserThreadStart still runs, and crashes when jumping to the routine (obviously).

In order to fix this, I increase the space before earliest_args struct to 12 bytes, and then push eax to remote_data + 8. We can restore eax value in generate_switch_mode_jmp_to_hook, the same way this exact function restores esp (mov eax, [remote_data+8]).
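
A minimal C sketch of the layout and of the restore instruction described above, assuming the 8 bytes already reserved stay untouched and a new 4-byte slot follows them. The struct, macro and function names are hypothetical; in 32-bit code, "mov eax, [abs32]" is opcode 0xA1 followed by the 32-bit address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scratch area placed ahead of the earliest_args data. */
typedef struct {
    uint8_t existing[8]; /* space that was already reserved before the change */
    uint32_t saved_eax;  /* new slot, located at remote_data + 8 */
} scratch_t;

#define SAVED_EAX_OFFSET offsetof(scratch_t, saved_eax)

/* Emits the 32-bit instruction "mov eax, [remote_data + 8]" (A1 moffs32).
 * buf must have room for 5 bytes. Returns the number of bytes written. */
static size_t
emit_restore_eax(uint8_t *buf, uint32_t remote_data)
{
    uint32_t addr = remote_data + (uint32_t)SAVED_EAX_OFFSET;
    buf[0] = 0xA1;
    buf[1] = (uint8_t)(addr & 0xff);
    buf[2] = (uint8_t)((addr >> 8) & 0xff);
    buf[3] = (uint8_t)((addr >> 16) & 0xff);
    buf[4] = (uint8_t)((addr >> 24) & 0xff);
    return 5;
}
```

The struct is 12 bytes in total, matching the enlarged space mentioned above.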

Doing so makes DynamoRIO initialization fail; see the previous message for details about that crash. I do not yet understand why passing a NULL value in eax lets the routine and DynamoRIO start properly, while passing the proper value in eax makes DynamoRIO initialization fail.

Derek Bruening

Jun 29, 2021, 12:28:00 PM
to dynamor...@googlegroups.com
On Tue, Jun 29, 2021 at 7:35 AM QtK <qtkre...@gmail.com> wrote:
Injection seems to work, but RtlUserThreadCreate segfaults somehow. I suspect that I messed up with the stack, but I have no idea how. 
I have noticed that it takes its arguments from eax and ebx (in x86 which I'm calling). I'm not sure if EBX is getting restored after a call to dynamorio_earliest_init_takeover. You have a comment saying : 

         *  [...]So we instead only touch
* xax here and we target an asm routine in DR that will preserve the
* other regs, enabling returning to the hooked routine w/ the
* original state (except xax which is scratch and xbx which kernel
* isn't counting on of course).

This comment is stale.  PR #4653 added preservation of xax, so it is restored.
I don't know why it talks about xbx -- EARLIEST_INIT_DEBUGBREAK
uses it, but after it's been saved.  Seems an incorrect comment.
 

QtK

Jun 30, 2021, 6:39:27 AM
to DynamoRIO Users
Thanks for the clarification.

I now encounter an instr_encoding not found error in the client when the "start" export is called, as it should be (it occurs in the drmgr_register_bb_instrumentation_event callback).
The crash occurs at the first instruction executed from the .text section of the target binary, in the Visual Studio runtime:

E8 F0 03 00 00 call    ___security_init_cookie

This looks like a normal call instruction to me, so I'm not sure why it isn't decoded properly. I'm also not sure why I get an instruction encoding error rather than a decoding error, but this might be a problem with my client.

I also tested injection with an empty client, and it crashes too, at a syscall assert: https://github.com/DynamoRIO/dynamorio/blob/d02506ae72fd69838e559804633ba5f0743604ca/core/win32/syscall.c#L2964

It seems that x64 client injection into WOW64 binaries isn't fully working yet.

QtK

Jun 30, 2021, 9:43:48 AM
to DynamoRIO Users
Encoding error occurs on dr_insert_call_instrumentation and dr_insert_mbr_instrumentation callbacks.

QtK

Jun 30, 2021, 10:56:30 AM
to DynamoRIO Users
I suspect that drwrap, used in those callbacks, inserts x64 trampoline code rather than x86 code, leading to an encoding/decoding error. If this is true, we would need to get the current DynamoRIO context and change this wrapper depending on the architecture of the instruction currently being executed.
One huge problem with this is that instruction creation uses X-- registers (XAX, XBX...), which are aliases for RAX/EAX depending on DynamoRIO's build arch. I feel that making a change depending on the current context would be a huge amount of work. Or maybe I am not looking in the right direction. Do you have any hint about where I should look to fix this kind of problem?
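
The difference between a build-time alias and a per-mode choice can be sketched in a few lines of C. All names here are hypothetical stand-ins, not DynamoRIO's DR_REG_* definitions.

```c
#include <stdint.h>

typedef enum { SKETCH_REG_EAX, SKETCH_REG_RAX } sketch_reg_t;
typedef enum { SKETCH_MODE_32, SKETCH_MODE_64 } sketch_mode_t;

/* Build-time alias: fixed once, when the library is compiled, according to
 * the pointer width of the build. */
#if UINTPTR_MAX > 0xffffffffu
#    define SKETCH_REG_XAX SKETCH_REG_RAX
#else
#    define SKETCH_REG_XAX SKETCH_REG_EAX
#endif

/* Runtime alternative: choose the register from the mode of the code being
 * instrumented, independently of how the library itself was built. */
static sketch_reg_t
xax_for_mode(sketch_mode_t mode)
{
    return mode == SKETCH_MODE_64 ? SKETCH_REG_RAX : SKETCH_REG_EAX;
}
```

In a 64-bit build the alias always yields the 64-bit register, even while a 32-bit block is being instrumented, which is the mismatch described above.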

QtK

Jun 30, 2021, 12:25:17 PM
to DynamoRIO Users
Small updates. 

I now have a working injection x64 -> Wow64 -> x64. Everything seems to work as expected, except for the dr_insert_call_instrumentation and dr_insert_mbr_instrumentation callbacks I mentioned previously.
I had to disable an assert in the syscall code that checked the size of the syscall xax register.

I hope I will find an easy way to work around the dr_insert_call_instrumentation and dr_insert_mbr_instrumentation issue so that I can submit this PR.

Derek Bruening

Jun 30, 2021, 12:47:09 PM
to dynamor...@googlegroups.com
Let's separate out issues with client libraries and client instrumentation helpers being made mixed-mode-aware.  Getting plain DR working with no client would be its own separate PR.

QtK

Jul 1, 2021, 9:51:19 AM
to DynamoRIO Users
Thanks for the answer, I will.

I think the encoding problem goes deeper than expected. Even if the wrapper were successful, the callback is x64 code.

We would need some kind of check so that a switch back to the current arch (with a far jmp) is inserted prior to all callbacks to clients. Do you have an idea of where to put such mode switches so that they land in the right place?

Derek Bruening

Jul 1, 2021, 10:10:34 AM
to dynamor...@googlegroups.com
Right, looking back, I believe all the prior work on mixed-mode (mixed 64 and 32) was purely in the core with no client, including the experimental work like -x86_to_x64 trying to speed up 32-bit code by converting to 64-bit -- so it seems there never was support for these various issues with client API features you're hitting, so work would be needed to add that.  (FWIW it seems it should be less work than what's already in the core for mixed-mode, handling the far transitions and 32-vs-64 fragments.)

Looking at dr_insert_clean_call should show how it works: e.g., insert_meta_call_vargs() for the actual call; {prepare_for,cleanup_after}_clean_call() (inlined) and emit_clean_call_{save,restore}() (out-of-line) for other setup.

QtK

Jul 1, 2021, 12:31:54 PM7/1/21
to DynamoRIO Users

> Looking at dr_insert_clean_call should show how it works: e.g., insert_meta_call_vargs() for the actual call; {prepare_for,cleanup_after}_clean_call() (inlined) and emit_clean_call_{save,restore}() (out-of-line) for other setup.

The dr_insert_clean_call you are mentioning is called in functions like dr_insert_mbr_instrumentation, which, as far as I can see, use the X-- registers/macros (XAX, XBX...) to add meta instructions. The first encoding error occurs when trying to encode the save of a TLS slot with a mov instruction. This instruction is built as an x64 mov of a TLS slot to an x64 register, which obviously has no valid encoding in x86 mode (and the callback gets hit in x86 mode, because we are in WOW64). That is why I'm saying that this might go a bit deeper than it looks.
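
As a small illustration of why an x64 encoding cannot simply be reused in an x86 fragment: the bytes 0x40-0x4F are REX prefixes in 64-bit mode but one-byte inc/dec instructions in 32-bit mode. The sketch below (Python, not DR code) only classifies the first non-prefix byte; the byte string is a hand-assembled `mov gs:[disp32], rax`, and the 0x10 slot offset is an arbitrary placeholder, not a real DR TLS offset.

```python
# Illustrative only: a tiny classifier, not a real decoder and not DR code.
LEGACY_PREFIXES = {0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65, 0x66, 0x67, 0xF0, 0xF2, 0xF3}
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# 65 48 89 04 25 <disp32>: mov qword ptr gs:[disp32], rax in 64-bit mode.
# The displacement 0x10 is a placeholder chosen for this example.
STORE_RAX_TO_GS_SLOT = bytes([0x65, 0x48, 0x89, 0x04, 0x25, 0x10, 0x00, 0x00, 0x00])


def describe_first_opcode(code: bytes, mode_bits: int) -> str:
    """Skip legacy prefixes and say how the next byte is read in the given mode."""
    i = 0
    while i < len(code) and code[i] in LEGACY_PREFIXES:
        i += 1
    if i == len(code):
        raise ValueError("no opcode byte found")
    b = code[i]
    if 0x40 <= b <= 0x4F:
        if mode_bits == 64:
            return "REX prefix (W=%d)" % ((b >> 3) & 1)
        op = "inc" if b < 0x48 else "dec"
        return "%s %s" % (op, REG32[b & 7])
    return "opcode 0x%02x" % b


if __name__ == "__main__":
    for bits in (64, 32):
        print(bits, describe_first_opcode(STORE_RAX_TO_GS_SLOT, bits))
```

In 32-bit mode the REX.W byte is read as `dec eax`, and the rest becomes a 32-bit store, so the 64-bit register operand has no representation there.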

The only truly simple solution I see right now is switching mode 32->64 using something like
push CS64
call $+5
retf
before creating those meta instructions when the current instruction is in x86 mode. This does not look like an easy task, as it requires executing a block before continuing to process the current one, because the meta instructions added in the callback are added after any opportunity to edit the current block.
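
For reference, here is a sketch of the bytes such a 32->64 switch would need, written in Python just to check the arithmetic. It assumes CS64 is the 0x33 selector used by the far call at the top of this thread, and it adds an `add dword ptr [esp], 5` between the call and the retf so that the far return lands after the sequence instead of on the retf itself (the three-line outline above leaves that step implicit). The function names are made up for this example.

```python
# Sketch only: assembles the switch sequence by hand and checks where retf lands.
CS64 = 0x33  # 64-bit code segment selector, as in "call 0x33:..." earlier in the thread


def build_switch_to_x64() -> bytes:
    """Return 32-bit machine code that far-returns into 64-bit mode."""
    code = bytearray()
    code += bytes([0x6A, CS64])                    # push CS64
    code += bytes([0xE8, 0x00, 0x00, 0x00, 0x00])  # call $+5 (pushes address of next instr)
    code += bytes([0x83, 0x04, 0x24, 0x05])        # add dword ptr [esp], 5 (skip add + retf)
    code += bytes([0xCB])                          # retf (pops EIP, then CS)
    return bytes(code)


def landing_offset(code: bytes) -> int:
    """Offset, relative to the start of the sequence, where execution resumes."""
    call_off = code.index(0xE8)        # position of the call opcode
    return_addr = call_off + 5         # what the call pushes on the stack
    adjustment = code[return_addr + 3] # imm8 of the add instruction
    return return_addr + adjustment


if __name__ == "__main__":
    seq = build_switch_to_x64()
    print(seq.hex(), "lands at offset", landing_offset(seq), "of", len(seq))
```

The landing offset equals the length of the sequence, so whatever follows it would be the first code executed in 64-bit mode.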

QtK

Jul 6, 2021, 7:04:11 AM7/6/21
to DynamoRIO Users
Hello again.

Coming back to ask whether you have an idea of how I could cleanly fix this issue. In case it helps, here is the stack trace from the encoding error:

 # Child-SP          RetAddr           Call Site
00 00000000`1f945058 00000000`153be4cf ntdll!NtRaiseHardError
01 00000000`1f945060 00000000`1537392b dynamorio!nt_messagebox+0x17f [c:\users\qtk\documents\dynamorio\core\win32\ntdll.c @ 3776]
02 00000000`1f945100 00000000`15103823 dynamorio!debugbox+0x5b [c:\users\qtk\documents\dynamorio\core\win32\os.c @ 5397]
03 00000000`1f945130 00000000`150fe1e5 dynamorio!d_r_notify+0x283 [c:\users\qtk\documents\dynamorio\core\utils.c @ 1957]
04 00000000`1f945990 00000000`152dab51 dynamorio!external_error+0x175 [c:\users\qtk\documents\dynamorio\core\utils.c @ 204]
05 00000000`1f945a10 00000000`152c60a5 dynamorio!instr_encode_arch+0xa01 [c:\users\qtk\documents\dynamorio\core\ir\x86\encode.c @ 2842]
06 00000000`1f945cc0 00000000`152b0091 dynamorio!instr_encode_check_reachability+0x45 [c:\users\qtk\documents\dynamorio\core\ir\encode_shared.c @ 127]
07 00000000`1f945d10 00000000`152a479b dynamorio!private_instr_encode+0x71 [c:\users\qtk\documents\dynamorio\core\ir\instr_shared.c @ 352]
08 00000000`1f945db0 00000000`150f1004 dynamorio!instr_length+0x5b [c:\users\qtk\documents\dynamorio\core\ir\instr_shared.c @ 1239]
09 00000000`1f945df0 00000000`150ec539 dynamorio!emit_fragment_common+0x7b4 [c:\users\qtk\documents\dynamorio\core\emit.c @ 447]
0a 00000000`1f946ba0 00000000`152ec9d2 dynamorio!emit_fragment_ex+0x59 [c:\users\qtk\documents\dynamorio\core\emit.c @ 870]
0b 00000000`1f946bf0 00000000`150d976d dynamorio!build_basic_block_fragment+0x972 [c:\users\qtk\documents\dynamorio\core\arch\interp.c @ 5166]
0c 00000000`1f946e80 00000000`1f8e631a dynamorio!d_r_dispatch+0xa5d [c:\users\qtk\documents\dynamorio\core\dispatch.c @ 214]
0d 00000000`1f946fe0 00000000`1f927080 0x1f8e631a
0e 00000000`1f946fe8 abababab`abababab 0x1f927080
0f 00000000`1f946ff0 abababab`abababab 0xabababab`abababab
10 00000000`1f946ff8 00000000`00000000 0xabababab`abababab
 
Thanks for your time,

Bests

QtK

Jul 6, 2021, 10:04:12 AM7/6/21
to DynamoRIO Users
I saw in the code that it is not possible to mix x86 and x64 instructions in the same block, so creating code that jumps to x64 mode before the clean call is added via meta instructions seems impossible.

QtK

Jul 9, 2021, 2:13:20 PM7/9/21
to DynamoRIO Users
Hello,


I was wondering what is missing for this to fully work (and for fixing the DR_REG_X* problems when working with x64 in x86).

Best,

Qtk

Derek Bruening

Jul 9, 2021, 4:14:35 PM7/9/21
to dynamor...@googlegroups.com
It was fully working for core DR.  It was used for an experiment -x86_to_x64 which translated x86 code to x64 in a WOW64 process and tried to speed it up.  It was all implemented inside the core so no client interface was used and not enough cases of things like DR_REG_X were hit for a general solution to be built.


QtK

Jul 13, 2021, 6:34:10 AM7/13/21
to DynamoRIO Users
Ok I get it.

The function works well, but using it in dr_insert_xxx_instrumentation won't fix the fact that the callback function is 64-bit, not 32-bit.
This is not a trivial issue to fix.
