Detected Nondeterminism in Dr. Memory between two runs of the same, deterministic executable under same configurations


Alen Jo

Sep 12, 2024, 10:01:13 PM
to DynamoRIO Users
Hi Mr. Derek Bruening!

I originally opened an issue called "Detected Nondeterminism in Dr. Memory between two runs of the same executable under the same configurations." This new issue addresses some of the questions you may have about the determinism of the programs and the determinism of the environment.

I wanted to compare the dynamically linked and statically linked FFprobe binaries. I also built a second benchmark: a reduced version of FFprobe, in which I edited out certain sections of code based on the initial stack traces that Dr. Memory generated (this is the second set of steps in each section below).

Setup
Version of Dr. Memory: Dr. Memory version 2.6.19800 -- build 0
Environment: Barebones Ubuntu 22.04 Server with isolated Docker container
Hardware Specifications: 376GB of RAM and 2 Intel Xeon Gold 5218 16-core CPUs @ 2.30GHz

Prerequisites
Download the audio file at this Google Drive link: Audio file

Dynamically-Linked Version
I consider the dynamically linked program deterministic based on two criteria: does it produce the same output, and are the same modules loaded at runtime? The output was identical in all 250 runs, and all 250 runs loaded the same modules, which I verified using strace.
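The module check described above could be sketched roughly as follows. This is only an illustration, not the exact procedure used in the post: it assumes each run was traced with something like `strace -f -e trace=openat -o run_001.strace ./ffprobe sanic.mp3`, and the function names `loaded_modules` and `same_modules` are hypothetical helpers.

```shell
# Sketch: list the shared objects a run opened successfully, from an strace log.
# Assumption: successful openat calls end in "= <fd>" and, with -f and -o,
# lines may be prefixed with a pid.
loaded_modules() {
    # $1: path to an strace log
    grep -E '^([0-9]+ +)?openat\(.*\.so(\.[0-9]+)*".*= [0-9]+$' "$1" |
        sed -E 's/^[^"]*"([^"]*)".*$/\1/' |
        sort -u
}

# Succeeds if two strace logs show the same set of loaded shared objects.
same_modules() {
    # $1, $2: strace logs from two runs
    mods_a=$(loaded_modules "$1")
    mods_b=$(loaded_modules "$2")
    [ "$mods_a" = "$mods_b" ]
}
```

For example, `same_modules run_001.strace run_002.strace && echo same` would print "same" when both runs opened the same libraries. Load addresses are not compared here, so this only checks which modules were loaded, not where.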

Dynamically-Linked Executable Reproducible Steps - Original:
1. Clone the repository https://github.com/FFmpeg/FFmpeg
2. Install dependencies: nasm and yasm
3. Go into the cloned directory and enter "./configure --enable-shared --disable-static && make -j$(nproc)"
4. Run 250 times: drmemory -logdir "<your directory here>" ./ffprobe sanic.mp3
5. Diff each Dr. Memory run sequentially
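Steps 4 and 5 could be scripted along these lines. This is a minimal sketch under stated assumptions: each run gets its own log directory, the report is assumed to be a file named results.txt somewhere under that directory, and the masking in `normalize_report` reflects a guess about which fields (hex addresses, pid and thread IDs, timestamps) vary between otherwise identical reports. All function names are hypothetical.

```shell
# Sketch: run an application N times under Dr. Memory, one log directory per run.
run_under_drmemory() {
    # $1: number of runs, $2: root log directory, rest: application command
    n_runs=$1; log_root=$2; shift 2
    i=1
    while [ "$i" -le "$n_runs" ]; do
        run_dir=$(printf '%s/run_%03d' "$log_root" "$i")
        mkdir -p "$run_dir"
        drmemory -logdir "$run_dir" -- "$@" || echo "run $i: exit status $?" >&2
        i=$((i + 1))
    done
}

# Print the path of the first results.txt found under a run directory.
report_of() {
    find "$1" -name results.txt | sort | head -n 1
}

# Mask fields that are assumed to differ between otherwise identical runs.
normalize_report() {
    sed -E -e 's/0x[0-9a-fA-F]+/0xADDR/g' \
           -e 's/(pid|thread) [0-9]+/\1 N/g' \
           -e 's/@[0-9:.]+/@TIME/g' "$1"
}

# Compare each run's normalized report with the previous run's report and
# print one line per pair that differs.
diff_consecutive() {
    # $1: root log directory containing run_001, run_002, ...
    prev=""
    tmp_a=$(mktemp); tmp_b=$(mktemp)
    for run_dir in "$1"/run_*; do
        cur=$(report_of "$run_dir")
        [ -n "$cur" ] || continue
        if [ -n "$prev" ]; then
            normalize_report "$prev" >"$tmp_a"
            normalize_report "$cur" >"$tmp_b"
            cmp -s "$tmp_a" "$tmp_b" || echo "DIFF: $prev $cur"
        fi
        prev=$cur
    done
    rm -f "$tmp_a" "$tmp_b"
}
```

A run of the whole procedure might then look like `run_under_drmemory 250 logs ./ffprobe sanic.mp3` followed by `diff_consecutive logs | wc -l` to count the differing pairs. Normalizing before comparing is a design choice: without it, per-run details such as process IDs would likely make every pair differ.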

Dynamically-Linked Executable Reproducible Steps - Reduced:
1. Clone the repository https://github.com/joalen/FFmpeg
2. Install dependencies: nasm and yasm
3. Go into the cloned directory and enter "./configure --enable-shared --disable-static && make -j$(nproc)"
4. Run 250 times: drmemory -logdir "<your directory here>" ./ffprobe sanic.mp3
5. Diff each Dr. Memory run sequentially

Observations:
The original FFprobe program reported two uninitialized reads in one run and no errors in the other. This is the case in 30/250 Dr. Memory report pairs.

The reduced binary kept the two uninitialized reads and additionally reported a memory leak in one run, with no errors in the other run. This is the case for 16/250 Dr. Memory report pairs.

Statically-Linked Version
In the statically linked version, the outputs remained the same in all 250 runs and in addition, running ldd on the FFprobe static binary reported "not a dynamic executable." I also ran the file command, which says "ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked for GNU/Linux 3.2.0, stripped"

Statically-Linked Executable Reproducible Steps - Original:
1. Clone the repository https://github.com/FFmpeg/FFmpeg
2. Install dependencies: nasm and yasm
3. Go into the cloned directory and run: ./configure --enable-static --disable-shared --extra-libs=-static --extra-cflags="-I/usr/local/include" --extra-ldflags="-L/usr/local/lib" --pkg-config-flags="--static" && make -j$(nproc)
4. Run 250 times: drmemory -logdir "<your directory here>" ./ffprobe sanic.mp3
5. Diff each Dr. Memory run sequentially

Statically-Linked Executable Reproducible Steps - Reduced:
1. Clone the repository https://github.com/joalen/FFmpeg
2. Install dependencies: nasm and yasm
3. Go into the cloned directory and run: ./configure --enable-static --disable-shared --extra-libs=-static --extra-cflags="-I/usr/local/include" --extra-ldflags="-L/usr/local/lib" --pkg-config-flags="--static" && make -j$(nproc)
4. Run 250 times: drmemory -logdir "<your directory here>" ./ffprobe sanic.mp3
5. Diff each Dr. Memory run sequentially

Observations:
The original FFprobe program reported multiple errors about "UNADDRESSABLE ACCESS beyond heap bounds" for reading and writing operations. This turns out to be the case in 6/250 Dr. Memory report pairs.

The reduced binary reported the same "UNADDRESSABLE ACCESS beyond heap bounds" errors for both reading and writing operations. This turns out to be the case in 4/250 Dr. Memory report pairs.

More observations:
Additionally, when running Dr. Memory on that FFprobe binary, this error stack shows up:
[Attached image: drmem_crash.png]

Would you mind taking a look and providing some insight into why Dr. Memory produces the above observations?

Thank you very much!

Alen Jo

Sep 12, 2024, 11:08:09 PM
to DynamoRIO Users
To clarify the observations for the statically linked binary: the number of errors reported differed between runs for both the original and the reduced binary. One run had 491 "UNADDRESSABLE ACCESS beyond heap bounds" errors for reading and writing operations, and the other run had 492 of the same error. Sorry about the missing information!
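A count difference like 491 versus 492 could be pulled out of two reports with a sketch like the one below. It assumes that each reported error appears in results.txt as a line starting with "Error #<n>: <type>"; that layout, and the helper names `count_errors` and `compare_error_counts`, are assumptions for illustration rather than anything confirmed in this thread.

```shell
# Sketch: count the error entries of one type listed in a Dr. Memory report.
count_errors() {
    # $1: path to results.txt, $2: error type text
    grep -c -E "^Error #[0-9]+: $2" "$1" || true
}

# Print the counts for two reports side by side, with their difference.
compare_error_counts() {
    # $1, $2: reports from two runs, $3: error type text
    count_1=$(count_errors "$1" "$3")
    count_2=$(count_errors "$2" "$3")
    echo "$3: $count_1 vs $count_2 (delta $((count_2 - count_1)))"
}
```

Calling `compare_error_counts runA/results.txt runB/results.txt "UNADDRESSABLE ACCESS beyond heap bounds"` would then show at a glance whether the two runs disagree on the number of entries. Finding which entry is the extra one still requires diffing the call stacks themselves.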

Derek Bruening

Sep 23, 2024, 2:11:24 PM
to Alen Jo, DynamoRIO Users
As already mentioned, I think you will find that most sources of non-determinism are not from the tools but from the applications (really, the system libraries) themselves.  Again as already mentioned, this happens even on the same machine.

I would suggest using something like a tracing tool that records everything that happened so you can study what the difference is.

For example, using the drmemtrace tool (https://dynamorio.org/page_drcachesim.html) to produce an offline trace (stored on disk so you can re-analyze it afterward) and running a hello,world test application:

$ bin64/drrun -t drmemtrace -offline -- suite/tests/bin/simple_app; bin64/drrun -t drmemtrace -tool basic_counts -indir drmemtrace.simple_app.* 2>&1 | grep 'total (fetched) i'
Hello, world!
      134058 total (fetched) instructions

Now we add a new env var:
$ export MY_NEW_ENV_VAR=an_extra_env_var

Now run the exact same hello,world app on the exact same machine in the same shell, with this one env var as the only difference, and suddenly it executes 640 more instructions:

$ rm -rf drmemtrace.simple_app.*; bin64/drrun -stderr_mask 0 -t drmemtrace -offline -- suite/tests/bin/simple_app; bin64/drrun -t drmemtrace -tool basic_counts -indir drmemtrace.simple_app.* 2>&1 | grep 'total (fetched) i'
Hello, world!
      134698 total (fetched) instructions

I would suggest replicating something like this and examining the trace to see the difference, to build a deeper understanding of where these types of seemingly non-deterministic behaviors come from.  There is a lot of code run behind the scenes that you do not see if you think only of the "hello,world" main() routine.
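As a small aid for replicating this, the instruction counts from two basic_counts runs could be compared mechanically. The sketch below only parses lines shaped like the "total (fetched) instructions" output quoted in this message; the helper names `fetched_instructions` and `instruction_delta` are hypothetical.

```shell
# Sketch: extract the first "total (fetched) instructions" count from
# basic_counts output supplied on stdin.
fetched_instructions() {
    awk '/total \(fetched\) instructions/ { print $1; exit }'
}

# Print how many more instructions the second run executed than the first.
instruction_delta() {
    # $1: count from the first run, $2: count from the second run
    echo $(( $2 - $1 ))
}
```

With the two counts quoted in this message, `instruction_delta 134058 134698` prints 640. The number alone does not say where the extra instructions come from; that still requires examining the traces themselves.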
