I was not able to reproduce this issue on the main branch or with a directed test on my branch. I fixed it locally and re-ran, but then hit another assert:
<CURIOSITY : (0) && "crashed while walking dynamic header" in file /home/prasun/dynamorio/core/unix/module_elf.c line 326
<CURIOSITY : out_data->alignment == alignment in file /home/prasun/dynamorio/core/unix/module.c line 483
<Application <dir>/python3.7 (59312) DynamoRIO usage error : meta-instr faulted? must set translation field and handle fault!>
<Usage error: meta-instr faulted? must set translation field and handle fault! (/home/prasun/dynamorio/core/translate.c, line 1016)
This assert occurs in the master branch as well. I saw it in a slightly older rev (Oct 27 a314825), from which my code was forked.
It is also seen in later revs but stops appearing at 4bcc907 (Mar 10). However, from that rev on the benchmark never completes. It normally takes under 4 minutes to run, but when I ran it with the basic_counts analyzer it did not finish overnight (this is a 64-thread run, but I don't think basic_counts should have much overhead?). With just drrun it runs fine (takes about 10s extra). The processes don't use CPU and seem to be in wait/pipe_wait.
With '-offline' it also keeps running for hours, but I didn't see anything unusual with gdb or in the logs (-loglevel 4); it seemed to be executing app code with tracing instrumentation. This behavior is also seen in the most recent build (adb1bd4 May 17).
This is a TensorFlow benchmark (BERT) which has JIT code, so that may be playing a part, but we did not see the error with another TensorFlow run.
In case it is useful, I see this assert in the commit (1cd0ba5 Mar 8) before 4bcc907 (Mar 10):
ASSERT FAILURE: /home/prasun/dynamorio/clients/drcachesim/tracer/tracer.cpp:1692: tracing_disabled.load(std::memory_order_acquire) == BBDUP_MODE_COUNT ()