Dump and reset covered BBs (drcovlib)


Daniel Elsner

Mar 14, 2022, 6:45:09 AM
to DynamoRIO Users
Hi, 
I'm looking for a way to dump BB coverage several times at runtime and reset the already-recorded coverage after each dump.
As far as I understand it, drcovlib implements an analysis pass that only sees BBs before they are put into the code cache (and does not instrument them). Hence, if I dump coverage at runtime, I will not see the covered BBs again unless the cache has been invalidated and the BBs are rebuilt.

My question is: what would be the canonical DR implementation: (1) extend drcovlib to reset the code cache after each dump, or (2) insert BB instrumentation that records seen BBs in a set which is cleared after each dump?

Re (1): How could I implement resetting the code cache for the set of already covered BBs?

Thanks in advance!

sharma...@google.com

Mar 15, 2022, 10:04:37 AM
to DynamoRIO Users
Hi Daniel,
It is indeed possible to flush the code cache: see dr_delay_flush_region, dr_unlink_flush_region, and dr_flush_region_ex, which differ in their usage constraints, the guarantees they provide, and their overhead. All of them require re-building the flushed code cache fragments, so there would definitely be some overhead associated with each of them.
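
For reference, a rough sketch of what flushing one recorded BB region with the synchronous variant might look like; flush_covered_bb, mod_base, and start_offset are placeholders for whatever your client tracks (not drcovlib API), and the usage restrictions documented for each flush routine still apply:

static void
flush_covered_bb(app_pc mod_base, uint start_offset)
{
    app_pc bb_start = mod_base + start_offset;
    /* Flush any fragments containing this pc so the BB is rebuilt (and seen
     * again by the BB event) the next time it executes. The delayed and
     * unlinking variants trade immediacy for lower overhead. */
    if (!dr_flush_region(bb_start, 1))
        dr_fprintf(STDERR, "flush failed at " PFX "\n", bb_start);
}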

For your use case, it does seem more efficient to instrument BBs to add the required information to some data structure that is dumped and cleared whenever you want (perhaps triggered via dr_set_itimer, or every N dynamic basic blocks, etc.). The container data structures offered by DR may also be useful here.
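
Roughly, a periodic dump could look like the sketch below; dump_and_reset_coverage() is a placeholder for walking your container, writing it out, and clearing it, and note that dr_set_itimer is (as far as I remember) UNIX-only, so on Windows you would need a different trigger such as a nudge or a client thread:

static void
itimer_callback(void *drcontext, dr_mcontext_t *mcontext)
{
    dump_and_reset_coverage(); /* placeholder for your own dump + clear logic */
}

static void
setup_periodic_dump(void)
{
    /* Dump every 5000 ms of wall-clock time (ITIMER_REAL). */
    if (!dr_set_itimer(ITIMER_REAL, 5000, itimer_callback))
        dr_fprintf(STDERR, "failed to set up itimer\n");
}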

Abhinav

Daniel Elsner

Mar 15, 2022, 11:33:03 AM
to DynamoRIO Users
Hi Abhinav,

thanks for your response. 
I'll go with instrumenting the BBs then; it does indeed seem more efficient than re-building the code cache for them, especially since the instrumentation is basically an increment of the BB's hit count, i.e., only a single meta instruction.

Cheers
Daniel
Message has been deleted

sharma...@google.com

Mar 16, 2022, 5:42:30 PM
to DynamoRIO Users
Hi Daniel,

For (1), you could use dr_suspend_all_other_threads/dr_resume_all_other_threads. But such a global synchronization would obviously incur a lot of overhead.
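
Roughly, such a globally synchronized dump could look like the following sketch, where dump_and_reset_coverage() stands for your own bookkeeping:

static void
dump_with_global_synch(void)
{
    void **drcontexts = NULL;
    uint num_suspended, num_unsuspended;
    if (dr_suspend_all_other_threads(&drcontexts, &num_suspended, &num_unsuspended)) {
        /* While the other threads are suspended it is safe to read and clear
         * the shared coverage data. */
        dump_and_reset_coverage(); /* placeholder */
        dr_resume_all_other_threads(drcontexts, num_suspended);
    }
}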

For (2), you may use the dr_mutex routines to protect access to your shared data structure. However, note the restrictions on their usage; in particular, you'll have to increment the coverage vector from a clean call instead of the inline increment you mentioned in your email, which will be slower.
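
Sketched out, that variant would look roughly like this (cov_lock would be created with dr_mutex_create() at client init, and my_bb_counter is a placeholder for your per-BB hit count):

static void *cov_lock;

static void
locked_increment(uint *counter)
{
    dr_mutex_lock(cov_lock);
    (*counter)++;
    dr_mutex_unlock(cov_lock);
}

/* per BB, at insertion time:
 * dr_insert_clean_call(drcontext, bb, instr, (void *)locked_increment,
 *                      false, 1, OPND_CREATE_INTPTR(&my_bb_counter));
 */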

Another option is to maintain two coverage vectors. At any time, based on some atomic flag, one of them accumulates coverage data (threads increment the BB counters in this vector) while the other can be dumped as needed and re-initialized to zero. So when you need to dump the current stats, you toggle the flag so that the current vector is no longer used for accumulating coverage data and can be accessed safely for dumping, while the other vector receives the future counter increments. The advantage of this approach is that counter increments (which happen much more frequently than coverage-vector dumps) can still be inlined (you don't need a clean call) and will therefore be much faster. You can add instrumentation that reads the atomic flag and, based on its value, decides which coverage vector to increment. As an example of such an atomic flag, see the tracing_window variable at https://github.com/DynamoRIO/dynamorio/blob/9a7eddbc5606df355decf828935b98d9cca86ace/clients/drcachesim/tracer/tracer.cpp#L256.
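
In plain C (ignoring the inserted instrumentation itself), the double-buffer scheme could be sketched roughly as follows; MAX_BBS and the per-BB indexing are placeholders, and a real implementation would use DR's atomic operations or a short grace period to handle increments still in flight during the toggle:

#define MAX_BBS 65536
static uint counters[2][MAX_BBS];
static volatile uint active_idx; /* read by the inserted increment to pick a vector */

static void
dump_and_swap(file_t log)
{
    uint old = active_idx;
    active_idx = 1 - old; /* future increments go to the other vector */
    for (uint i = 0; i < MAX_BBS; i++) {
        if (counters[old][i] != 0) {
            dr_fprintf(log, "bb %u: %u\n", i, counters[old][i]);
            counters[old][i] = 0;
        }
    }
}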

Hope this helps.

Abhinav

Daniel Elsner

Mar 17, 2022, 7:36:25 AM
to DynamoRIO Users
Hi Abhinav, 

thank you very much for your answer!
I had already removed my question because I found out that if I dump coverage through a DR annotation, the application waits until the DR annotation handler (where I dump the coverage) has finished executing. Therefore, I would assume that no further synchronization is required.
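
For context, the client side of the annotation-based dump is roughly like the following sketch (I'm going from memory on the exact dr_annotation_register_call signature, and the annotation name and dump_coverage() are placeholders); the application invokes the annotation at the points where it wants a dump:

static void
on_dump_annotation(void)
{
    dump_coverage(); /* placeholder for my own dump-and-reset routine */
}

/* at client init:
 * dr_annotation_register_call("my_client_dump_coverage", (void *)on_dump_annotation,
 *                             false, 0, DR_ANNOTATION_CALL_TYPE_FASTCALL);
 */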

However, surprisingly, I'm facing several issues with my DR instrumentation. I've tried 3 different approaches, but all of them either crash the client, do not work with the -debug DR option, or increment the BB hit counts incorrectly for each module.
I've attached an excerpt from my instrumentation and added comments for the 3 approaches. I would greatly appreciate it if you could briefly have a look and let me know what I'm doing wrong here. I took all of my code from the various DR samples and from drcov(lib).

Another problem I face is attaching the DR client to a running process (i.e., a JVM). Here, I seem to run into a deadlock when the JVM process attempts to exit (both the client and the JVM are stuck). Do you have any idea what the problem could be? I tried using nudges similar to drcovlib, which didn't help either.

Thanks in advance!
Daniel

------------------------------------------------------------

// Data structure
typedef struct _covered_bb_t {
    uint start_offset;  /* Offset to the parent module's start address. */
    uint hit_count;
} covered_bb_t;

typedef struct _module_entry_t {
    uint id;
    module_data_t* data;
    drvector_t covered_blocks;
} module_entry_t;

// Clean call
static void
clean_call(uint* ptr)
{
    *ptr += 1;
}

// Instrumentation handler
static dr_emit_flags_t
event_bb_instrumentation(void* drcontext, void* tag, instrlist_t* bb, instr_t* instr, bool for_trace, bool translating, void* user_data)
{
    if (!instr_is_app(instr))
        return DR_EMIT_DEFAULT;

    if (!drmgr_is_first_instr(drcontext, instr))
        return DR_EMIT_DEFAULT;

    app_pc start_pc;
    start_pc = dr_fragment_app_pc(tag);
    module_entry_t* mod_entry;
    mod_entry = module_table_lookup(module_table, start_pc);
    if (mod_entry == NULL || mod_entry->data == NULL) return DR_EMIT_DEFAULT;

    uint start_offset = (uint)(start_pc - mod_entry->data->start);
    covered_bb_t* entry = NULL;  /* remains NULL if this BB has not been seen before */
    int i;

    drvector_lock(&mod_entry->covered_blocks);
    for (i = (int)mod_entry->covered_blocks.entries - 1; i >= 0; i--) {
        entry = drvector_get_entry(&mod_entry->covered_blocks, i);
        ASSERT(entry != NULL, "failed to get BB entry");
        if (entry->start_offset == start_offset) {
            break;
        }
        entry = NULL;
    }
    if (entry == NULL) {
        entry = dr_global_alloc(sizeof(*entry));
        entry->hit_count = 0;  // init to zero here
        entry->start_offset = start_offset;
        drvector_append(&mod_entry->covered_blocks, entry);
    }
    drvector_unlock(&mod_entry->covered_blocks);

    /* Option 1: Clean call (supposedly slow and raises exception with -debug). Will also only increment counters for the last loaded module (?). */
    // dr_insert_clean_call(drcontext, bb, instr, (void*)clean_call, false, 1, OPND_CREATE_INTPTR(&(entry->hit_count)));
   
    /* Option 2: Insert meta instruction for increment (raises exception with -debug). */
    // drreg_reserve_aflags(drcontext, bb, instr);
    // instrlist_meta_preinsert(bb, instr, INSTR_CREATE_inc(drcontext, OPND_CREATE_ABSMEM(&(entry->hit_count), OPSZ_4)));
    // drreg_unreserve_aflags(drcontext, bb, instr);
   
    /* Option 3: Supposedly fast counter update (raises exceptions and crashes client). */
    // drx_insert_counter_update(drcontext, bb, instr, SPILL_SLOT_MAX + 1, IF_AARCHXX_(SPILL_SLOT_MAX + 1) & (entry->hit_count), 1, 0);

    return DR_EMIT_DEFAULT;
}

// DR Setup
drmgr_register_bb_instrumentation_event(NULL, event_bb_instrumentation, NULL);

// If Option 2: Init drreg
drreg_options_t ops = { sizeof(ops), 1, false };
drreg_init(&ops);

Daniel Elsner

Mar 18, 2022, 5:56:57 AM
to DynamoRIO Users
Edit: I managed to work around the `-attach` problem by removing parts from drcovlib step by step, since drcovlib itself works fine with `-attach`.
I'm also assuming that the problem with the instrumentation is rooted in the data structure I chose for storing the covered basic blocks (drvector).
The drvector container performs relocation when allocating more memory, which messes up the absolute addresses I use for the increment.
I'll try a drtable, which doesn't seem to perform relocation, and by passing DRTABLE_MEM_REACHABLE I should also be able to prevent other memory-access exceptions.
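
Alternatively, I might avoid relocation entirely by keeping each counter in its own dr_global_alloc'd block and looking it up through a drcontainers hashtable keyed by the BB offset, roughly like the sketch below (a per-module table or a combined module-id/offset key would still be needed, plus synchronization for multi-threaded apps and freeing the entries at exit):

#include "hashtable.h" /* drcontainers */

static hashtable_t cov_table;
/* at client init: hashtable_init(&cov_table, 13, HASH_INTPTR, false); */

static uint *
get_counter(uint start_offset)
{
    uint *counter = hashtable_lookup(&cov_table, (void *)(ptr_uint_t)start_offset);
    if (counter == NULL) {
        counter = dr_global_alloc(sizeof(*counter)); /* this address never moves */
        *counter = 0;
        hashtable_add(&cov_table, (void *)(ptr_uint_t)start_offset, counter);
    }
    return counter; /* safe to bake this address into the inlined increment */
}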

Thanks,
Daniel

Daniel Elsner

May 4, 2022, 6:26:17 PM
to DynamoRIO Users
Hi Abhinav,

I'm trying to use the DR client I created to instrument Java applications on Windows, and I'm facing considerable overhead (when using -attach) or even DR crashes (when starting the application through DR).

I registered a simple exception handler in my DR client and observed that the JVM process throws many 0xc0000005 access violation exceptions (which presumably is normal for JVM processes: https://stackoverflow.com/a/36258856).
Now, I've searched through all GH issues and posts in this Google group related to DR and instrumenting JVM processes, but could not get it to work. 
I've tried a handful of JDK versions and a variety of DR runtime options (-no_hw_cache_consistency, -ignore_assert_list '*', -s 60, -vm_size 1G, -no_enable_reset, -disable_traces, -no_sandbox_writes).

Basically, to reproduce it, use a JDK of your choice (I've tried OpenJDK 8, 11, 14, 17) on Windows and run for example the following, which already takes way too long:
drrun.exe -- cmd /c "mvn -version"
or, in a simple Maven project:
drrun.exe -- cmd /c "mvn test"
(Maven will use the java.exe application internally.)

Do you have any idea how I could switch off the handling of 0xC0000005 exceptions in the DR client, to reduce the overhead?

I've tried the following already:

static bool
event_exception(void* drcontext, dr_exception_t* excpt)
{
    DWORD exception_code = excpt->record->ExceptionCode;
    if (exception_code == EXCEPTION_ACCESS_VIOLATION) {
        return false; // tried "true" and "dr_redirect_execution()" as well...
    }
    return true;
}

drmgr_register_exception_event(event_exception);


Thanks in advance!
Daniel

Daniel Elsner

May 5, 2022, 7:53:47 AM
to DynamoRIO Users
I guess since my last question is more generally about running JVM processes under DR, it makes sense to put it into a new conversation. I'll open a new one!