Thread init event called more than once


algra...@gmail.com

Sep 8, 2022, 6:28:01 PM
to DynamoRIO Users
Should a thread init callback ever be called more than once for a given thread?
My client prints a startup message (with [pid]) and registers a thread init
callback which prints [pid/tid]. It also registers a thread exit callback.

Generally it works like I expect. But for some targets I see
something like this:

### [19361] CLIENT STARTUP
### [19361/19361] Thread start
### [19362/19362] Thread start
### [19362] CLIENT STARTUP
### [19362/19362] Thread start
### [19362/19362] Thread exit
### [19361/19361] Thread exit
### [19363/19363] Thread start
### [19363] CLIENT STARTUP
### [19363/19363] Thread start
### [19363/19363] Thread exit

So when 19361 forks a new process 19362, in that new process the
thread_init callback for 19362 is called first, then dr_client_main,
then the thread_init callback is called again.

Why would a thread_init callback be called twice for the same thread?

My guess is it's something like this:

  - 19361 calls fork()
  - in the new process 19362 (which is for the time being a copy
    of 19361) DynamoRIO notices a new thread and calls thread_init -
    even though the thread has a copy of the state from 19361
  - 19362 then calls exec() to replace the old image with a new one,
    DynamoRIO does some magic to inject itself into that new
    executable, and relaunch the client, so we get a client startup
    message and then another thread_init for 19362

If my client is maintaining per-thread state, it would seem logical
to ignore at least one of these thread_inits. Is there some guidance
on the best way to maintain per-thread state in a way that works
nicely with forks and execs?

Vadim Petrochenkov

Sep 9, 2022, 3:14:46 AM
to DynamoRIO Users
My guidance would be to make a small test application that uses
- fork
- vfork
- execve, including failing execve
and executes all of them in an application that already has multiple threads.
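
A minimal sketch of the process-creation part of such a test application, using only POSIX calls. The helper names (`wait_exit_code`, `run_fork_exec`, `run_vfork_exec`) and the exit code 127 for a failed execve are assumptions made for illustration, not anything DynamoRIO prescribes.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes vfork() on glibc under strict -std modes */
#endif
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reap the child and return its exit code, or -1 if it did not exit normally. */
static int wait_exit_code(pid_t pid)
{
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* fork() + execve(). The child exits with 127 when execve() fails. */
static int run_fork_exec(char *const argv[])
{
    char *const envp[] = { NULL };
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(argv[0], argv, envp);
        _exit(127); /* only reached if execve() failed */
    }
    return wait_exit_code(pid);
}

/* vfork() + execve(). Everything the child needs is prepared before vfork(),
 * because the child should do nothing except call execve() or _exit(). */
static int run_vfork_exec(char *const argv[])
{
    char *const envp[] = { NULL };
    pid_t pid = vfork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(argv[0], argv, envp);
        _exit(127); /* failing execve: the child is still the vfork child here */
    }
    return wait_exit_code(pid);
}
```

To cover the multi-threaded case, a `main` would start a few idle pthread workers first and then call each helper with both a valid program (for example `/bin/sh -c "exit 0"`) and a nonexistent path, while the client logs every callback with its pid and tid.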

Then carefully read what the documentation for `dr_register_thread_exit_event` says about thread exit callbacks being called twice, and then test this application with your per-thread state and tracing.
All these cases appear in real applications, but they are not really covered correctly (or at least not explicitly) by the simple client examples in DynamoRIO.

The general guidance is to split your per-thread data into two parts: process-internal data, like structures in memory, and entities potentially shared between processes (including parent and child processes), like file descriptors.
Then add thread-init, thread-exit, fork-init and pre-/post-execve callbacks and carefully think about what may happen with your data in all possible cases in which these callbacks are called.
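
As a rough illustration of that split, here is a plain-C model of the per-thread state with no DynamoRIO calls in it. The type `per_thread_state_t` and the functions `state_create`, `state_on_fork_child` and `state_destroy` are hypothetical names. The assumption is that a client would call them from its thread-init, fork-init and thread-exit callbacks respectively.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-thread state, split as described above. */
typedef struct {
    /* Process-internal part: duplicated by fork(), private afterwards. */
    unsigned long event_count;
    pid_t owner_pid;
    /* Potentially shared part: a descriptor the child inherits from the parent. */
    int log_fd;
} per_thread_state_t;

/* Would run from a thread-init callback. */
static per_thread_state_t *state_create(int log_fd)
{
    per_thread_state_t *s = calloc(1, sizeof(*s));
    if (s == NULL)
        return NULL;
    s->owner_pid = getpid();
    s->log_fd = log_fd;
    return s;
}

/* Would run in the child from a fork-init callback: the memory is already a
 * private copy, so only reset it; the inherited descriptor is replaced. */
static void state_on_fork_child(per_thread_state_t *s, int new_log_fd)
{
    s->event_count = 0;
    s->owner_pid = getpid();
    if (s->log_fd >= 0)
        close(s->log_fd); /* closes only the child's copy of the descriptor */
    s->log_fd = new_log_fd;
}

/* Would run from a thread-exit callback. Safe to call twice on the same slot. */
static void state_destroy(per_thread_state_t **slot)
{
    per_thread_state_t *s = *slot;
    if (s == NULL)
        return;
    if (s->log_fd >= 0)
        close(s->log_fd);
    free(s);
    *slot = NULL;
}
```

Clearing the slot in `state_destroy` is one way to tolerate an exit callback that fires more than once. Whether the child should close, keep or reopen an inherited descriptor depends on what the client writes to it.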

algra...@gmail.com

Sep 9, 2022, 9:55:10 AM
to DynamoRIO Users
Thanks, that's helpful. I don't even have to have additional threads before I start seeing a difference between fork() and vfork().

With fork():
  - thread_init on the main thread of the original process
  - fork_init event in the new process
  - the new thread has already copied the TLS data of the parent thread
  - no additional thread_init event in the new process
  - thread_exit events in both processes

So this is what I expect fork() to do. fork() has duplicated the process, and the DynamoRIO context is already set up for the new process. The per-thread state that my client had been maintaining for the parent process (and had put in a TLS slot) is already duplicated and accessible in the child process.
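
The same copy semantics can be seen without DynamoRIO. In this small sketch an application-level `_Thread_local` variable stands in for the client's TLS slot (an analogy only, not DR's TLS mechanism), and the names `per_thread_value` and `value_seen_by_fork_child` are made up for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Application-level thread-local, standing in for a client's per-thread state. */
static _Thread_local int per_thread_value = 0;

/* Sets the thread-local in the parent, forks, and returns the value the child
 * observed (passed back through its exit code, so keep it in 0..255).
 * Returns -1 on error. */
static int value_seen_by_fork_child(int value)
{
    per_thread_value = value;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int seen = per_thread_value; /* copied from the parent at fork() time */
        per_thread_value = 0;        /* changes only the child's private copy */
        _exit(seen);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The child sees the parent's value without any initialization of its own, and its later write does not reach the parent.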

With vfork() it's completely different:
  - thread_init on the main thread of the original process
  - *no* fork_init event in either process
  - new process does *not* have TLS data of the parent thread
  - a thread_init in the new process (and dr_get_process_id_from_drcontext has the new pid)
 
So although the new (child) thread has inherited the global state and program context of the parent, it has not inherited the DynamoRIO client state. Instead, the child gets a thread_init callback as if it were supposed to set up new state.

Looking at the restrictions on what the child process can do after vfork(), maybe this is all intentional, that DynamoRIO is expecting the child to exit() or exec() very soon, and not do anything that needs tracking in its state. But the client has already instrumented the code and the client (as well as wanting access to its own state) may do a variety of things that break the vfork() requirements.

Maybe it's just not feasible for a client to meet the restrictions on what vfork()'s child can do? But intercepting the parent's call to vfork() and turning it into a call to fork() doesn't seem right either, since fork() and vfork() behave differently in respect of the parent waiting.

Derek Bruening

Sep 13, 2022, 12:08:30 PM
to algra...@gmail.com, DynamoRIO Users
With vfork, the new "process" is also a new "thread" sharing the address space of the parent, and DR treats it as such. This is why you get a thread_init event for the new child thread as though it were a regular new thread without a separate pid. For most clients, the separate pid and the potential for the signal handlers to diverge do not really matter, so treating it as a regular thread rather than a new process is the right approach. On Linux it is also possible to clone the pid and the address space but not the signal handlers, and again DR treats that like a regular thread but just tracks a separate set of signal handlers. Vfork is much the same from DR's point of view.
