Cannot run drraw2trace with ERROR: Conversion failed: Failed to process file for thread 25: Failed to close prior component


Surim Oh

Nov 7, 2023, 12:04:45 AM
to DynamoRIO Users
Hi All,

I am using drraw2trace to convert an offline raw trace into a final trace.
drraw2trace stops with the following error.

ERROR: Conversion failed: Failed to process file for thread 25: Failed to close prior component

The raw trace was collected by the following command.
drrun -t drcachesim -jobs 40 -outdir <dir> -offline -- taskset -c 3 clang ...

The raw trace is 20G. The conversion (raw2trace) runs until the converted trace reaches 32G and then stops with the error. It appears to open the zip file and close it again on every write. For some reason, zipCloseFileInZip returns an error code and drraw2trace crashes.
This also happens for another workload I tested (raw trace: 174G; the converted trace stops at 12G).

Does anyone know what causes this issue, and is there a workaround?
Why does a failure to close the zip file have to be fatal for drraw2trace?

Thank you,
Surim

Surim Oh

Nov 7, 2023, 5:51:43 PM
to DynamoRIO Users
The error code is -103 which is ZIP_BADZIPFILE.
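
For readers decoding similar return values, here is a minimal lookup sketch. The numeric values are an assumption based on the return codes defined in zlib's contrib/minizip/zip.h and should be checked against the header in your own third_party/zlib tree. The names `MINIZIP_ZIP_CODES` and `describe_zip_code` are hypothetical helpers, not part of DynamoRIO or minizip.

```python
# Return codes as defined in zlib's contrib/minizip/zip.h (assumed values;
# verify against the header in your own source tree).
MINIZIP_ZIP_CODES = {
    0: "ZIP_OK",
    -1: "ZIP_ERRNO",  # same value as zlib's Z_ERRNO
    -102: "ZIP_PARAMERROR",
    -103: "ZIP_BADZIPFILE",
    -104: "ZIP_INTERNALERROR",
}


def describe_zip_code(code: int) -> str:
    """Return a readable label for a minizip zip.h return code."""
    name = MINIZIP_ZIP_CODES.get(code, "unknown")
    return f"{code} ({name})"


print(describe_zip_code(-103))
```

Codes outside this table are reported as "unknown" rather than guessed.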

Derek Bruening

Nov 7, 2023, 6:27:23 PM
to Surim Oh, DynamoRIO Users
We have never seen that before. The build uses third_party/zlib/contrib/minizip for zipfile support. I would suggest minimizing a reproducible test case to isolate where it happens: does it occur only in a custom build or also in one of the GitHub packages? Only on certain workloads or sizes of workloads?


Mingsheng Xu

Feb 19, 2024, 6:29:40 PM
to DynamoRIO Users
Hi All,

I have more information on this error with another application:

- The application being traced is from the SPEC CPU® 2017 suite: 531.deepsjeng_r
- The tracer and converter are from a release build of release 10 on Linux
- Traced from the beginning, the conversion of the raw file always fails when the trace file reaches about 28G
- I played with the tracing options trying to delimit the point of failure, but it was unsuccessful:
  1. with `-trace_for_instrs 12G`, a raw file of 49G would be successfully converted into a trace file of 25G after ~2.5 hours.
  2. with `-trace_for_instrs 14G`, a raw file of 55G would fail when the trace file reached 28G with `ERROR: Conversion failed: Failed to process file for thread 70986: Failed to close prior component`.
  3. So it seems that something between 49G and 55G in the raw file triggered the error.
  4. I tried skipping the first 49G and keeping only the region around the point of failure with `-trace_after_instrs 12G -exit_after_tracing 4G`. The resulting raw file is 8.1G and should contain the point of failure. However, the conversion succeeded.
  5. So the error seems to be sensitive to when tracing starts, and I could not reduce the test case any further.
- I also tried converting with a recent weekly build DynamoRIO-Linux-10.0.19762. It would fail with the same `ERROR: Conversion failed: Failed to process file for thread 70986: Failed to close prior component`
- Another issue not directly related to the conversion error is that the successfully converted trace (traced with `-trace_for_instrs 12G` mentioned above, for example) is unusable. For example, the view tool would say:
 
Failed to open trace/window.0000/drmemtrace.deepsjeng_s_base.memtrace-m64.64874.9297.trace.zip
Failed to initialize scheduler: Failed to open trace/window.0000/drmemtrace.deepsjeng_s_base.memtrace-m64.64874.9297.trace.zip
ERROR: failed to initialize analyzer
 
  And if I use `unzip -l <trace file>` to list the chunks within the trace file, it says:
 
Archive:  drmemtrace.deepsjeng_s_base.memtrace-m64.64874.9297.trace.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of drmemtrace.deepsjeng_s_base.memtrace-m64.64874.9297.trace.zip or
        drmemtrace.deepsjeng_s_base.memtrace-m64.64874.9297.trace.zip.zip, and cannot find drmemtrace.deepsjeng_s_base.memtrace-m64.64874.9297.trace.zip.ZIP, period.
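
The `unzip` message above means it could not find the end-of-central-directory record, which a zip writer normally emits last. As a rough way to check the same thing without `unzip`, here is a heuristic sketch that scans the tail of a file for the standard zip end-of-archive signatures. The function name `inspect_zip_tail` is hypothetical, and a plain byte search can in principle match compressed data by coincidence, so treat the result as a hint only.

```python
import os

EOCD_SIG = b"PK\x05\x06"           # end-of-central-directory record
ZIP64_LOCATOR_SIG = b"PK\x06\x07"  # zip64 end-of-central-directory locator
# The EOCD record is 22 bytes plus a comment of at most 65535 bytes; the
# 20-byte zip64 locator, when present, sits immediately before it.
TAIL_BYTES = 22 + 65535 + 20


def inspect_zip_tail(path: str) -> dict:
    """Report whether end-of-archive signatures appear in the file's tail."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(max(0, size - TAIL_BYTES))
        tail = f.read()
    return {
        "size": size,
        "has_eocd": EOCD_SIG in tail,
        "has_zip64_locator": ZIP64_LOCATOR_SIG in tail,
    }
```

Calling `inspect_zip_tail("<trace file>")` on a suspect `.trace.zip` and seeing `has_eocd` as `False` would be consistent with an archive that was truncated or never finalized, though it does not say why.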

Do you have any suggestions on how I can provide more information on this error? Thank you so much!

Mingsheng Xu

Feb 19, 2024, 6:32:20 PM
to DynamoRIO Users
Correction: The application being traced is from the SPEC CPU® 2017 suite: 631.deepsjeng_s

Derek Bruening

Feb 20, 2024, 1:44:26 PM
to Mingsheng Xu, DynamoRIO Users
It sounds like a problem with third_party/zlib/contrib/minizip. As you can see in the code, the "Failed to close prior component" message comes from a failure in minizip's zipCloseFileInZip(). I would suggest tweaking that code to insert the failure return value into the error string to at least get that error code. We know minizip claims to support >4GB components and we have tested that, so basic 64-bit zip component support should not be the problem. I would also confirm that everything works fine when not using zip format: e.g., "-trace_compress lz4".
