Increasing Memory Usage


Edgar Allen

Oct 26, 2022, 8:58:30 AM
to DynamoRIO Users
First off - I love DynamoRIO. Amazing tool. Thanks to Derek and all for their hard work.

I am working on a fuzzing harness with DynamoRIO. A big component of this harness involves selecting a point within the target application at which to halt execution and begin a while(1) loop that constantly calls fork() to create child processes; each child continues execution and reports its exit status to the parent.

It works exactly as intended; however, when I check the application's memory usage I can see that it is constantly growing. This happens whether I place the infinite fork() loop in a function pre-callback or in a basic block callback. I checked in gdb and saw that DynamoRIO is spending a lot of time in heap-related code: special_heap_alloc(), common_heap_extend_commitment(), os_heap_commit(), etc.

My loop code makes no allocations at all, so I suspect DynamoRIO is expanding its own internal heap as a result of my loop code, but I cannot understand why. Is it treating the loop as a continually expanding block of code rather than a small piece of code that loops on itself? If so, is there a way to instruct DynamoRIO to ignore this loop code and stop committing memory for it?

I tried playing with different runtime options and function wrapping options but I've had no success. Please let me know what ideas you have, and whether you need any additional info from me.

Thank you!

assad.hashm...@gmail.com

Oct 27, 2022, 4:31:36 AM
to DynamoRIO Users
> I tried playing with different runtime options and function wrapping options but I've had no success.
Have you run with -debug and -loglevel?
There may be clues/warning messages in the logs or an ASSERT which hints at the cause.
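E.g., something along these lines (the level and paths are just illustrative; adjust for your setup):

    bin64/drrun -debug -loglevel 3 -c libharness.so -- ./target_app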

Edgar Allen

Oct 27, 2022, 7:37:36 AM
to DynamoRIO Users
When running in debug mode there are no warnings or asserts that stand out to me, but I can see that the heap size is indeed expanding. Here are some comparisons:

"Thread heap breakdown" section:
  • Total cur usage: 62KB -> 1232KB
  • IR: 30K, 87039# -> 181K, 1114063#
  • TH Counter: 20K, 885# -> 452K, 19326#
  • Other: 45K, 897# -> 712K, 19342#
"Thread statistics" section:
  • Current special heap capacity (bytes): 4096 -> 1069056
  • Total reserved memory (bytes): 61440 -> 14155776
  • Heap claimed (bytes): 25904 -> 10211982
  • Current total memory from OS (bytes): 61440 -> 22380544
To clarify what is happening in my code, I have a pre-function callback on a function somewhere in my single-threaded target application. In this callback I start a while(1) loop; on each iteration I call fork() and let the child process return from the callback so it continues executing. The parent process, however, simply calls waitpid(), followed by two write() calls and a read(). Despite the simplicity of the parent and the lack of allocations, its memory is still ballooning as you can see above. I am guessing that DynamoRIO has issues with my pre-function callback that never returns, but I'm not clear on that and would appreciate any insight from someone who may know more! Thanks.
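
In rough C, the callback is this shape (simplified; the function name, file descriptors, and buffer are placeholders for my harness plumbing):

    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>
    #include "dr_api.h"
    #include "drwrap.h"

    /* Harness plumbing - these names are placeholders. */
    static int status_fd, ctrl_fd;
    static char input_buf[4096];

    static void
    wrap_pre(void *wrapcxt, void **user_data)
    {
        while (1) {
            pid_t pid = fork();
            if (pid == 0)
                return;   /* child: return from the callback and keep executing */
            /* Parent: reap the child and talk to the fuzzer. */
            int status = 0;
            waitpid(pid, &status, 0);
            write(status_fd, &status, sizeof(status));    /* report exit status */
            write(ctrl_fd, "go", 2);                      /* second write */
            read(ctrl_fd, input_buf, sizeof(input_buf));  /* read next input */
        }
    }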

Derek Bruening

Oct 27, 2022, 1:35:23 PM
to Edgar Allen, DynamoRIO Users
The debug build will report un-freed memory at exit: is there no such report about a memory leak at process exit time?

I assume this fork and waitpid are called directly in client mode and not in app (managed) mode, since you said it is in a drwrap callout. Calling some syscalls directly without going through DR API routines can cause problems. E.g., the infinite wait by this thread while in a callout from the code cache can easily cause hangs: DR may want to relocate this thread from another thread, in which case it will send it signals and wait for it to reach a relocatable point. The wait should be bracketed by dr_mark_safe_to_suspend() calls, followed by a dr_redirect_execution(), since the code cache return point will not be reliable.
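
Roughly (untested sketch; child_pid and status come from your surrounding loop):

    /* Around the DR-invisible blocking wait inside the callout: */
    void *drcontext = dr_get_current_drcontext();
    dr_mark_safe_to_suspend(drcontext, true);
    waitpid(child_pid, &status, 0);            /* long wait, hidden from DR */
    dr_mark_safe_to_suspend(drcontext, false);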

That said, it is not clear why there would be any memory usage increase from a hidden-from-DR fork and waitpid.  It would need further debugging; breakpoints or prints or whatnot on allocations to see how they are being triggered.  Is this a true fork or is this a clone that has CLONE_VM set?

Re: fuzzing: note that there is the Dr. Fuzz framework as well: https://drmemory.org/page_drfuzz.html


Edgar Allen

Oct 27, 2022, 3:00:26 PM
to DynamoRIO Users
Thanks for the response! 

I amended my while(1) loop to simply loop 1000 times and then call dr_exit_process() while still in the drwrap pre-function callback hoping that this would cause the debug logs to report something, but I don't see anything related to memory leaks. I do see a handful of these messages, though: "SYSLOG_WARNING: many pending signals: asking for 2nd special unit" and "SYSLOG_WARNING: potentially unsafe: allocating a new fragile special heap unit!".

I can confirm that my fork() call leads to clone() with flags CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD - there is no CLONE_VM flag, so I should be good there.

Regarding Dr. Fuzz, it is something I definitely want to explore in the future but unfortunately my current use case might make it a little challenging.

Finally, what you said about the callout's infinite loop causing hangs makes sense (and I do see that "many pending signals" SYSLOG message above), but I am not sure I understand what you are recommending. Are you suggesting I call  dr_mark_safe_to_suspend() in the drwrap pre-func callback prior to starting the while(1) loop, and then use dr_redirect_execution() in the subsequent child processes to return? If yes, where do I obtain the mcontext to pass into  dr_redirect_execution() - do I call dr_get_mcontext() at the start of the pre-func callback and store it there?  dr_get_mcontext() documentation doesn't seem to indicate that I can do this, so I may be misunderstanding your recommendation here.

Derek Bruening

Oct 28, 2022, 11:52:48 AM
to Edgar Allen, DynamoRIO Users
On Thu, Oct 27, 2022 at 3:00 PM Edgar Allen <omfu...@gmail.com> wrote:
> Thanks for the response!
>
> I amended my while(1) loop to simply loop 1000 times and then call dr_exit_process() while still in the drwrap pre-function callback hoping that this would cause the debug logs to report something, but I don't see anything related to memory leaks. I do see a handful of these messages, though: "SYSLOG_WARNING: many pending signals: asking for 2nd special unit" and "SYSLOG_WARNING: potentially unsafe: allocating a new fragile special heap unit!".

What are all these signals?  This is in the parent?  Is this SIGCHLD?  Are these accumulating b/c DR is waiting to get back to its dispatcher to deliver them but that never happens?  Is this a source of the memory growth seen?
 

> I can confirm that my fork() call leads to clone() with flags CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD - there is no CLONE_VM flag, so I should be good there.
>
> Regarding Dr. Fuzz, it is something I definitely want to explore in the future but unfortunately my current use case might make it a little challenging.
>
> Finally, what you said about the callout's infinite loop causing hangs makes sense (and I do see that "many pending signals" SYSLOG message above), but I am not sure I understand what you are recommending. Are you suggesting I call dr_mark_safe_to_suspend() in the drwrap pre-func callback prior to starting the while(1) loop, and then use dr_redirect_execution() in the subsequent child processes to return? If yes, where do I obtain the mcontext to pass into dr_redirect_execution() - do I call dr_get_mcontext() at the start of the pre-func callback and store it there? dr_get_mcontext() documentation doesn't seem to indicate that I can do this, so I may be misunderstanding your recommendation here.

There is a drwrap_get_mcontext.
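Its prototype is:

    dr_mcontext_t *drwrap_get_mcontext(void *wrapcxt);

It returns the application machine context at the time of the pre- or post-wrap callback.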
 

Edgar Allen

Oct 28, 2022, 1:40:38 PM
to DynamoRIO Users
Thanks for all of your help so far. Apologies for all of the questions. 

I did what I described above, but I used drwrap_get_mcontext() to store the mcontext as you recommended. The parent process is now marked as safe to suspend within its infinite loop via dr_mark_safe_to_suspend() and the child processes are returning from the loop using dr_redirect_execution(). The fuzzer is somewhat slower, but continues to work. 
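
For reference, the amended callback is now roughly this shape (simplified and untested as shown here; wrap_pre is a placeholder name and error handling is omitted):

    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>
    #include "dr_api.h"
    #include "drwrap.h"

    static void
    wrap_pre(void *wrapcxt, void **user_data)
    {
        void *drcontext = dr_get_current_drcontext();
        /* Keep a private copy of the app context at the wrap point; the
         * children hand it to dr_redirect_execution() to leave the loop. */
        dr_mcontext_t saved = *drwrap_get_mcontext(wrapcxt);
        while (1) {
            pid_t pid = fork();
            if (pid == 0) {
                /* Child: resume the app at the saved context
                 * (does not return on success). */
                dr_redirect_execution(&saved);
            }
            /* Parent: the blocking wait is bracketed as safe-to-suspend. */
            int status = 0;
            dr_mark_safe_to_suspend(drcontext, true);
            waitpid(pid, &status, 0);
            dr_mark_safe_to_suspend(drcontext, false);
            /* ... report status to the fuzzer and fetch the next input ... */
        }
    }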

Unfortunately, the memory growth in the parent process is still present. My understanding is that the parent process is now safe for DR to suspend while in its infinite loop, so I imagine DR should be doing so to address any signals it has been buffering. Either my understanding is incorrect, or the issue is not related to signals, or maybe it's something else entirely.

I think I may need to use a higher loglevel and scrutinize the logs line by line to get a better understanding of what's going on. Please let me know if you have any other ideas, or recommendations of specific things to look for in the log files. Thank you!

Derek Bruening

Oct 28, 2022, 4:00:25 PM
to Edgar Allen, DynamoRIO Users
On Fri, Oct 28, 2022 at 1:40 PM Edgar Allen <omfu...@gmail.com> wrote:
> Thanks for all of your help so far. Apologies for all of the questions.
>
> I did what I described above, but I used drwrap_get_mcontext() to store the mcontext as you recommended. The parent process is now marked as safe to suspend within its infinite loop via dr_mark_safe_to_suspend() and the child processes are returning from the loop using dr_redirect_execution(). The fuzzer is somewhat slower, but continues to work.
>
> Unfortunately, the memory growth in the parent process is still present. My understanding is that the parent process is now safe for DR to suspend while in its infinite loop, so I imagine DR should be doing so to address any signals it has been buffering. Either my understanding is incorrect, or the issue is not related to signals, or maybe it's something else entirely.

I don't know that app signals will be delivered there: I was referring to DR-sent signals for relocation.  They may be if that is a proper "safe point".  Whether they are stacking up or being delivered should be quite clear from debug logs.
 

Edgar Allen

Oct 31, 2022, 10:53:36 AM
to DynamoRIO Users
I found the specific source of the memory explosion: there are constant 4KB allocations related to pending SIGCHLD signals, like you suggested. The call stack per gdb is below. In the log files, I can see a ton of "record_pending_signal(17)" calls (17 = SIGCHLD). This is called on every fuzz iteration (AKA every fork() call), and each time another 4KB of memory is added to the process.

Originally I thought DR shouldn't be handling/storing SIGCHLD signals for long, since my client is catching them with waitpid() and moving on. But I can see now that they are building up: despite my client catching these signals, DR is continuing to store them, possibly because it never gets a chance to suspend the client and do whatever it is waiting to do with these signals (which I am not exactly clear on). You mentioned earlier that bracketing a section of code with dr_mark_safe_to_suspend() calls may help, so I added a safe-to-suspend section within the infinite loop, but the same issue is still present.

Ultimately it seems to me that DR is storing these SIGCHLD signals for something and either my client needs to help DR to process them, or block them from being stored.

#0 os_heap_commit ()
#1 vmm_heap_commit ()
#2 extend_commitment ()
#3 common_heap_extend_commitment ()
#4 special_unit_extend_commitment ()
#5 special_heap_calloc ()
#6 special_heap_alloc ()
#7 record_pending_signal ()
#8 master_signal_handler_C ()
#9 xfer_to_new_libdr ()
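
If blocking them is the way to go, one thing I might try is the client signal event - this is just an untested guess on my part (drmgr_register_signal_event hasn't come up in this thread, and I'm not sure suppression would actually avoid the queuing shown above):

    #include <signal.h>
    #include "dr_api.h"
    #include "drmgr.h"

    static dr_signal_action_t
    event_signal(void *drcontext, dr_siginfo_t *siginfo)
    {
        /* The loop already reaps children with waitpid(), so perhaps SIGCHLD
         * can simply be dropped instead of being queued for app delivery. */
        if (siginfo->sig == SIGCHLD)
            return DR_SIGNAL_SUPPRESS;
        return DR_SIGNAL_DELIVER;
    }

    /* Registered from dr_client_main(): drmgr_register_signal_event(event_signal); */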

Derek Bruening

Nov 7, 2022, 12:25:49 PM
to Edgar Allen, DynamoRIO Users
DR has a handler for SIGCHLD and it is the one receiving it and attempting to deliver it (which means that either the app must have a handler for it or your client registered for the signal event forcing DR to send it signals even if they are ignored by the app).  Your raw waitpid bypasses DR and the kernel sees it as satisfied and moves on.

For DR to deliver the signal to an app handler it needs to run code in managed mode.  For asynchronous signals like this DR waits for the thread to exit the cache and reach the dispatcher; it does not try to relocate it as this is not considered urgent and it is fine to wait a bit.  If you wanted to add a relocation step when in a safe spot that seems reasonable.  Note that this will terminate your callback and it will have to restart after any app signal handler code (if the app handler returns to the pre-wrap point).

Another possible change is limiting how many signals are queued.  The kernel doesn't queue more than one non-real-time signal; DR tries to keep more (with a limit for alarms) with the logic that A) its overhead is causing delays and it should be closer to non-DR behavior to try to deliver multiple than to drop them, and B) clients wanting to see signals might be deliberately sending them.  But there could certainly be limits.  That would address the end OOM, but does not seem a full solution as this infinite callback is still blocking signals from being delivered which is not good and might break the app.
