Extra trace files generated and incorrect instruction counts in the trace files for certain applications


Vivek Govindasamy

Jun 22, 2026, 1:21:36 PM
to DynamoRIO Users
When generating traces for certain workloads such as stressapp, we observe that the number of trace files generated does not match the number of threads the application is run with: some extra trace files are generated. The total number of instructions in the trace files also does not match what perf stat reports; the trace files contain fewer instructions. In short, tracing produces more trace files than there are threads, yet the total instruction count in those files is lower than the count from perf stat.

We currently observe this issue for stressapp and multiload. For most other applications, the number of trace files equals the number of threads and the instruction counts are correct. We use our own instruction counter tool to count the instructions in the trace files. The issue seems to occur on both x86 and ARM platforms.

The traces are generated using
/bin64/drrun -t drcachesim -offline -outdir . -- ./application
and preprocessed using
/bin64/drrun -t drcachesim -indir tracefile


Bin Wang

Jun 22, 2026, 3:43:19 PM
to DynamoRIO Users
Hi Vivek,

Thank you for reporting this.

1. My guess for the extra trace files is that transient/auxiliary threads were created while the workload was running and were captured by DynamoRIO. Can you share the exact flags and thread counts you are passing when executing stressapp and multiload? How many extra trace files do you observe?
2. Did you run `perf stat -e instructions`? Do you see the inconsistent instruction count for every workload or just for stressapp/multiload? I ask because `perf stat -e instructions` counts both user-space and kernel-space instructions (e.g., system call handling), whereas DynamoRIO operates strictly in user space and has no visibility into the kernel. You could try running `perf stat -e instructions:u` (restricting perf to user-mode code), which excludes kernel instructions and should align more closely with DR's instruction count.
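
As a rough sketch of the check in point 2 (untested on your setup): the `:u` and `:k` event modifiers let perf report user-mode and kernel-mode instructions separately, so you can see how much of the gap is kernel work. `perf_split` and `count_gap` are hypothetical helper names introduced here for illustration, not part of perf or DynamoRIO, and the trace-side count is assumed to come from your own instruction counter tool.

```shell
# Hypothetical helper: run a command under perf and report user-mode and
# kernel-mode instruction counts as two separate events.
# Usage: perf_split ./application [args...]
perf_split() {
    perf stat -e instructions:u,instructions:k -- "$@"
}

# Hypothetical helper: percentage by which the trace count falls short of
# the perf count. Both arguments are plain integers that you supply.
# Usage: count_gap <perf_user_count> <trace_count>
count_gap() {
    awk -v p="$1" -v t="$2" 'BEGIN { printf "%.2f\n", (p - t) * 100 / p }'
}
```

For example, `count_gap 1000 900` prints `10.00`, meaning the trace holds 10% fewer instructions than perf counted in user mode (the numbers are made up). If the gap is still large with `instructions:u`, kernel instructions alone do not explain the difference.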
