Locking in clean calls and drwrap_wrap pre/post callbacks

38 views
Skip to first unread message

Michael Flanders

unread,
Nov 24, 2025, 1:23:53 PM (12 days ago) Nov 24
to DynamoRIO Users
Hello!

I am running DynamoRIO and my DR client on a Windows 10 IA-32 application in debug mode. This is all running in a 4 core, 64GB 64bit Windows 10 VM. 

I am inserting clean calls before nearly every instruction and have wrapped, using `drwrap_wrap` of some file manipulating and memory allocation functions like `operator new`, `malloc`, `CreateFileA`, etc. I am aware that this amount of clean calls is not amazing, but this is a prototype and performance is not much of a concern at the moment.

Error
My `drwrap_wrap` callbacks and clean call functions both access the same global arrays and vectors using `drvector_lock` and `dr_mutex_lock`. I have triple checked that every lock has a corresponding unlock on all code paths, but I am running into the following error:
```
Finished synch with all threads: result=0
SYSLOG_WARNING: Failed to suspend for ProcessTlsInformation
SYSLOG_ERROR: Application ... (5644).  Internal Error: DynamoRIO debug check failure: C:\Users\name\Desktop\dynamorio\core\dispatch.c:545 dc == NULL || OWN_NO_LOCKS(dc)
(Error occurred @260272 frags in tid 1804)
version 11.3.20098, custom build
```

From the logs, it looks like from a call in `presys_SetInformationProcess`, `sync_with_all_threads` is called, and the amount of times DR loops, waiting for all threads to suspend, exceeds `max_loops` and DR stops attempting to suspend all threads. It seems like when execution tries to enter the code cache again, I get this OWN_NO_LOCKS failure. I get this assert curiosity:
```
SYSLOG_WARNING: CURIOSITY : loop_count < max_loops in file C:\Users\name\Desktop\dynamorio\core\synch.c line 1488
```

On loglevel=3, I get that the threads that are not at a safe spot to suspend are commonly at these PCs as well as some others which I resolved with symquery:
```
thread 8092 not at safe spot (pc=0x6d2463d2) for 4
PS C:\Users\name\Desktop> .\drmemory\bin\symquery.exe -e ..\dynamorio\build32\lib32\debug\dynamorio.dll  -f -a 0x3253d2                             d_r_print_encoding_first_line_to_buffer+0x62    
```
and this one in ntdll:
```
thread 6376 not at safe spot (pc=0x7743324c) for 4
```
which looks like a syscall/code cache code?:
```
interp: start_pc = 0x774332a0                                                                                      0x774332a0  b8 0e 00 07 00       mov    eax, 0x0007000e                                                          0x774332a5  ba 50 91 44 77       mov    edx, 0x77449150                                                          0x774332aa  ff d2                call   edx                                                                    end_pc = 0x774332ac
find_syscall_num: found syscall number write: 458766                                                             syscall # is 458766                                                                                              found optimizable system call 0x7000e                                                                            ending bb at syscall & NOT removing the interrupt itself                                                         setting cur_pc (for fall-through) to 0x774332ac                                                                  exit_branch_type=0x0 bb->exit_target=0x18e1bc00                                                                  emit_fragment: bb use ibl <0x18e1bc00>                                                                           exit_branch_type=0x0 target=0x18e1bc00 l->flags=0x9002                                                           Fragment 8713, tag 0x774332a0, flags 0x1801030, shared, size 26, must end trace:                                         [ntdll.dll!ZwSetEvent]                                                                                   Entry into F8713(0x774332a0).0x1db85920 (shared)                                                                         [ntdll.dll!ZwSetEvent]                                                                                   Exit from sourceless ibl: trace jmp*    []                                                                        (target 0x774332ac not in cache)                                                                                fragment_add_ibl_target tag 0x774332ac, branch 2, F0                                                                                                                                                                              d_r_dispatch: target = 0x774332ac                                                                                                                                                                                                 interp: start_pc = 0x774332ac                                                                                      0x774332ac  c2 08 00             ret    0x0008                                                                 end_pc = 0x774332af
```

I also have a lot of SYSLOG warnings about memory consumption:
```
program.exe.0.1804.html:85949:SYSLOG_WARNING: Application
program.exe(5644). Out of contiguous memory. Alloc type: 0x22.
program.exe.0.1804.html:85950:SYSLOG_WARNING: Application
program.exe (5644). Out of contiguous memory. Alloc type: 0x41.
```

Debugging so far
While debugging this, I removed instrumentation added instructions to try to figure out if a particular instruction caused this (like computing args to clean calls). I noticed that after removing a certain amount of added instructions, I no longer got this error and execution succeeded reliably. After adding back one more instrumentation instruction which does not touch locks, I started reliably getting the error again regardless of which one instrumentation instruction I added back. I can add this one instruction in without getting the error by bumping `synch_all_threads_max_loops`, but this doesn't work when I add back all my instrumentation instructions and doesn't seem like a proper solution. I am guessing that at a certain basic block size or memory size of basic blocks, things get moved around triggering a suspend all threads that fails?

Solution?
I am sort of stuck as far as pinpointing the exact cause of this problem, but I was thinking the problem could be:
  1. using the same locks in clean calls (app code?) and `drwrap_wrap` functions (DR code). 
  2. the amount of generated instrumentation code because of the problem happening after adding in a certain amount of instrumentation code (above)
For 1) I was reading through the docs and it seems like a fix to this might be some combo of using the functions: `dr_app_recurlock_lock`,  `dr_mark_safe_to_suspend`, and/or `dr_redirect_execution`. The locks used in the clean calls and drwrap_wrap call backs are pretty much only for synchronizing use of a custom hashtable and a couple of arena allocators. 


Derek Bruening

unread,
Nov 24, 2025, 1:53:25 PM (12 days ago) Nov 24
to Michael Flanders, DynamoRIO Users
This may be https://github.com/DynamoRIO/dynamorio/issues/569 where as you noted DR is not able to safely relocate a thread in a clean call.
In many cases this doesn't matter as no relocation is needed during app runs, particularly on Linux with the "reset" feature turned off: but on Windows there are cases such as the one you hit with the app doing SetInformationProcess.ProcessTlsInformation on another process.
There are ideas for relocating in that issue but that is not necessarily simple to implement and probably relies on clients not changing contexts so probably needs some client API to handle corner cases.
A probably simpler option is to reduce the # of clean calls you're using, which would help with performance and other things. 
Another option that just tries to improve ProcessTlsInformation would be to use a less strict synchall target: we don't really need a valid relocation context; we just need to update the app TEB pointer of each thread. We'd have to study this operation to figure out a relaxation: we might have to add a new one that disallows DR code itself but allows client code; or maybe simple locks could be relied on but I don't remember the synch model for the app TEB fields: likely locks are not used for a thread's own fields today.

--
You received this message because you are subscribed to the Google Groups "DynamoRIO Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dynamorio-use...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/dynamorio-users/66d2983d-ef6a-40f0-979f-a7caf1f79fa5n%40googlegroups.com.

Michael Flanders

unread,
Dec 2, 2025, 12:16:49 PM (4 days ago) Dec 2
to DynamoRIO Users
Thanks for looking at this.

I've been looking into this more to see the feasibility of reducing the number of clean calls and it seems I can still run into this error even without inserting any calls out of the code cache.

I am manually inserting the clean call sequence before target instructions like below (currently targeting only windows MSVC IA-32:
```
dr_prepare_for_call()
dr_insert_save_fpstate()
... compute each arg value based on target instruction memory operands and which registers are used and push the arg value...
dr_insert_call() // with 0 arguments since they are already pushed
dr_insert_restore_fpstate()
dr_cleanup_after_call() // and cleanup pushed stack arguments
```
This adds a lot of instructions before each target instruction in each code cache block.
If I comment out all of the dr_insert_call()s and set loglevel=3, I still get the error in the original post. If I comment out all of the dr_insert_call()s and set loglevel=1, I no longer get the error.

I built DynamoRIO with some extra log statements in `core/translate.c` in `recreate_app_state_internal`, since that seems to be returning a failure result 20,000 times trying to suspend all threads. It seems to be taking this fall through case here https://github.com/DynamoRIO/dynamorio/blob/07544c3340ce6a20645e87aec03a231ddcc2b8eb/core/translate.c#L1610 and returning `RECREATE_FAILURE`.

I also noticed that it is always two threads that cannot be suspended, and one seems to be in DR code and the other in a syscall (maybe? I am unsure of this one). For the latest erroring run, one thread that cannot be suspended is almost always at pc=0x71db6672 and the other is always around pc=0x77a0324c

In this run, dynamorio.dll is loaded at [0x71a90000, 0x71f73000], 0x71db6672-0x71a90000=0x326672 and, using symquery, offset 0x326672 in dynamorio.dll for this build is: dynamorio_syscall_wow64_noedx+0x1f at core/x86.asm_core.s:2818

I can't nail down where 0x77a0324c is, but it seems like it is a code cache location near ntdll syscalls. I found a nearby fragment, 0x77a03340 which seems to end in a call to ntdll.dll!ZwAllocateVirtualMemory: `Fragment 324, tag 0x77a03340, flags 0x1801030, shared, size 15, must end trace: [ntdll.dll!ZwAllocateVirtualMemory]`.

I am wondering if the amount of instructions I am adding to the code cache is causing a lot of memory (re)allocations or if a lot of dynamorio logging (loglevel>=3) is happening which cannot be suspended? 

Do you have any other suggestions or ideas for debugging this?

Michael Flanders

unread,
Dec 2, 2025, 2:15:25 PM (4 days ago) Dec 2
to DynamoRIO Users
I set a breakpoint here after this line is hit more than 800 times with a failure result https://github.com/DynamoRIO/dynamorio/blob/07544c3340ce6a20645e87aec03a231ddcc2b8eb/core/synch.c#L1441

For the two threads that seem to never suspend, they seem to be in varying states like this:
```
 10  Id: 1788.1250 Suspend: 1 Teb: 0031c000 Unfrozen
 # ChildEBP RetAddr      Args to Child
00 25670984 6ffa56dd     701f8bd8 25670a08 00001250 dynamorio!atomic_aligned_read_int [C:\Users\name\Desktop\dynamorio2\core\arch\atomic_exports.h @ 1160]
01 256709fc 6fef583c     1e71d4c0 00000000 00000001 dynamorio!bb_build_abort+0x1bd [C:\Users\name\Desktop\dynamorio2\core\arch\interp.c @ 4440]
02 25670a30 6fedaaa1     1e71d4c0 00000001 00000001 dynamorio!check_thread_vm_area_cleanup+0x2ec [C:\Users\name\Desktop\dynamorio2\core\vmareas.c @ 7284]
03 25670cf0 6ffae65c     1e71d4c0 0ebf73e0 0ebf73e0 dynamorio!check_thread_vm_area+0xfd1 [C:\Users\name\Desktop\dynamorio2\core\vmareas.c @ 7700]
04 25670d20 6ffc0c36     1e71d4c0 25670d7c 00000000 dynamorio!check_new_page_start+0x5c [C:\Users\name\Desktop\dynamorio2\core\arch\interp.c @ 715]
05 25670d54 6ff9cc04     1e71d4c0 25670d7c 0ebf73e0 dynamorio!init_interp_build_bb+0x1d6 [C:\Users\name\Desktop\dynamorio2\core\arch\interp.c @ 5048]
06 25670f04 6fe1f488     1e71d4c0 0ebf73e0 00000000 dynamorio!build_basic_block_fragment+0x2e4 [C:\Users\name\Desktop\dynamorio2\core\arch\interp.c @ 5121]
07 25670ff4 700963e4     1e71d4c0 00000000 00000000 dynamorio!d_r_dispatch+0x8a8 [C:\Users\name\Desktop\dynamorio2\core\dispatch.c @ 210]
08 25670a40 00000000     25670da0 00000001 00000000 dynamorio!call_switch_stack+0x23 [C:\Users\name\Desktop\dynamorio2\build32\core\x86.asm_core.s @ 1951]

  11  Id: 1788.1584 Suspend: 1 Teb: 0031f000 Unfrozen
 # ChildEBP RetAddr      Args to Child
00 107bec1c 700060bd     00000470 107beeb8 00000000 dynamorio!dynamorio_syscall_wow64_noedx+0x1f [C:\Users\name\Desktop\dynamorio2\build32\core\x86.asm_core.s @ 2818]
01 107beec8 6fff8cdd     00000470 00000000 701f8b00 dynamorio!os_wait_event+0x25d [C:\Users\name\Desktop\dynamorio2\core\win32\os.c @ 8572]
02 107beef4 6fe3ee06     701f8b00 00000000 00000000 dynamorio!mutex_wait_contended_lock+0xad [C:\Users\name\Desktop\dynamorio2\core\win32\os.c @ 8672]
03 107bef0c 6fe3ec5e     701f8bc8 00000000 107beff4 dynamorio!d_r_mutex_lock_app+0xa6 [C:\Users\name\Desktop\dynamorio2\core\utils.c @ 884]
04 107bef1c 6fe1f379     701f8bc8 107bea08 00000000 dynamorio!d_r_mutex_lock+0xe [C:\Users\name\Desktop\dynamorio2\core\utils.c @ 897]
05 107beff4 700963e4     21594500 00000000 00000000 dynamorio!d_r_dispatch+0x799 [C:\Users\name\Desktop\dynamorio2\core\dispatch.c @ 199]
06 107bea40 00000000     21594500 21594500 00000001 dynamorio!call_switch_stack+0x23 [C:\Users\name\Desktop\dynamorio2\build32\core\x86.asm_core.s @ 1951]
```

and

```
  10  Id: 1788.1250 Suspend: 1 Teb: 0031c000 Unfrozen
 # ChildEBP RetAddr      Args to Child
00 25670c1c 700060bd     00000470 25670eb8 00000000 dynamorio!dynamorio_syscall_wow64_noedx+0x1f [C:\Users\name\Desktop\dynamorio2\build32\core\x86.asm_core.s @ 2818]
01 25670ec8 6fff8cdd     00000470 00000000 701f8b00 dynamorio!os_wait_event+0x25d [C:\Users\name\Desktop\dynamorio2\core\win32\os.c @ 8572]
02 25670ef4 6fe3ee06     701f8b00 00000000 00000000 dynamorio!mutex_wait_contended_lock+0xad [C:\Users\name\Desktop\dynamorio2\core\win32\os.c @ 8672]
03 25670f0c 6fe3ec5e     701f8bc8 00000000 25670ff4 dynamorio!d_r_mutex_lock_app+0xa6 [C:\Users\name\Desktop\dynamorio2\core\utils.c @ 884]
04 25670f1c 6fe1f379     701f8bc8 25670a08 00000000 dynamorio!d_r_mutex_lock+0xe [C:\Users\name\Desktop\dynamorio2\core\utils.c @ 897]
05 25670ff4 700963e4     1e71d4c0 00000000 00000000 dynamorio!d_r_dispatch+0x799 [C:\Users\name\Desktop\dynamorio2\core\dispatch.c @ 199]
06 25670a40 00000000     1e71d4c0 1e71d4c0 00000001 dynamorio!call_switch_stack+0x23 [C:\Users\name\Desktop\dynamorio2\build32\core\x86.asm_core.s @ 1951]

  11  Id: 1788.1584 Suspend: 1 Teb: 0031f000 Unfrozen

 # ChildEBP RetAddr      Args to Child
00 107be990 7003358d     ffffffff 0eab1000 107be9e0 dynamorio!dynamorio_syscall_wow64_noedx+0x1f [C:\Users\name\Desktop\dynamorio2\build32\core\x86.asm_core.s @ 2818]
01 107be9c0 700046cc     0eab1000 107be9e0 0000001c dynamorio!query_virtual_memory+0x1d [C:\Users\name\Desktop\dynamorio2\core\win32\ntdll.c @ 2284]
02 107bea04 6fff5d2f     0ebf44a0 107bea2c 00000001 dynamorio!query_memory_internal+0x12c [C:\Users\name\Desktop\dynamorio2\core\win32\os.c @ 5791]
03 107bea3c 6feda4bd     0ebf44a0 107becdc 107becc8 dynamorio!get_memory_info+0x5f [C:\Users\name\Desktop\dynamorio2\core\win32\os.c @ 5868]
04 107becf0 6ffae65c     21594500 0ebf44a0 0ebf44a0 dynamorio!check_thread_vm_area+0x9ed [C:\Users\name\Desktop\dynamorio2\core\vmareas.c @ 7554]
05 107bed20 6ffc0c36     21594500 107bed7c 00000000 dynamorio!check_new_page_start+0x5c [C:\Users\name\Desktop\dynamorio2\core\arch\interp.c @ 715]
06 107bed54 6ff9cc04     21594500 107bed7c 0ebf44a0 dynamorio!init_interp_build_bb+0x1d6 [C:\Users\name\Desktop\dynamorio2\core\arch\interp.c @ 5048]
07 107bef04 6fe1f488     21594500 0ebf44a0 00000000 dynamorio!build_basic_block_fragment+0x2e4 [C:\Users\name\Desktop\dynamorio2\core\arch\interp.c @ 5121]
08 107beff4 700963e4     21594500 00000000 00000000 dynamorio!d_r_dispatch+0x8a8 [C:\Users\name\Desktop\dynamorio2\core\dispatch.c @ 210]
09 107bea40 00000000     107becdc 107becc8 107becc4 dynamorio!call_switch_stack+0x23 [C:\Users\name\Desktop\dynamorio2\build32\core\x86.asm_core.s @ 1951]
04 107becf0 6ffae65c     21594500 0ebf44a0 0ebf44a0 dynamorio!check_thread_vm_area+0x9ed [C:\Users\name\Desktop\dynamorio2\core\vmareas.c @ 7554]
05 107bed20 6ffc0c36     21594500 107bed7c 00000000 dynamorio!check_new_page_start+0x5c [C:\Users\name\Desktop\dynamorio2\core\arch\interp.c @ 715]
06 107bed54 6ff9cc04     21594500 107bed7c 0ebf44a0 dynamorio!init_interp_build_bb+0x1d6 [C:\Users\name\Desktop\dynamorio2\core\arch\interp.c @ 5048]
07 107bef04 6fe1f488     21594500 0ebf44a0 00000000 dynamorio!build_basic_block_fragment+0x2e4 [C:\Users\name\Desktop\dynamorio2\core\arch\interp.c @ 5121]
08 107beff4 700963e4     21594500 00000000 00000000 dynamorio!d_r_dispatch+0x8a8 [C:\Users\name\Desktop\dynamorio2\core\dispatch.c @ 210]
09 107bea40 00000000     107becdc 107becc8 107becc4 dynamorio!call_switch_stack+0x23 [C:\Users\name\Desktop\dynamorio2\build32\core\x86.asm_core.s @ 1951]
```

I've attached a file with a handful more stack traces. 
thread_cant_suspend.txt

Michael Flanders

unread,
Dec 2, 2025, 2:27:06 PM (4 days ago) Dec 2
to DynamoRIO Users
This is at the release_11.3.0 git tag

Derek Bruening

unread,
Dec 2, 2025, 3:55:15 PM (4 days ago) Dec 2
to Michael Flanders, DynamoRIO Users
Debug build at -loglevel 3 is slow, and if your target app runs a lot of different code without reuse, it is not surprising that a synchall has trouble with so much time spent building blocks and printing logs.
It does not seem a configuration that is worth studying; you'll have to do selective logging for diagnostics for such an app.
Do you see problems without logging, esp in release build, or in debug "-checklevel 0"?

Michael Flanders

unread,
Dec 5, 2025, 1:59:02 PM (16 hours ago) Dec 5
to DynamoRIO Users
I do get the same error with both DynamoRIO and my client compiled for a release build and with debug `-checklevel 0` and without specifying any loglevels. 

I rewrote my client to use the lean clean call approach like the `memtrace_x86.c` sample. Instrumentation still adds a good amount of bytes to each basic block, but from sampling a few basic blocks, it seems this is a pretty big saving. Running on a release build now, it seems like things are running fine without errors. 
Reply all
Reply to author
Forward
0 new messages