You may be dealing with non-deterministic crashes, for example due to
concurrency. One way to troubleshoot that would be to re-run any of the
crashing test cases, say, 10k times (a simple shell loop will do) and
see how often it actually crashes.
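A minimal sketch of such a retry loop, in case it helps. The helper name
count_crashes and the paths in the usage comment are made up for
illustration, and it assumes the target reads the test case on stdin
(pass the file as an argument instead if that is how your binary is
called):

```shell
# Hypothetical helper: run a command RUNS times with INPUT on stdin and
# print how many runs ended with an exit status above 128, which is how
# the shell reports a child killed by a signal (e.g. 139 for SIGSEGV).
count_crashes() {
    runs=$1
    input=$2
    shift 2
    crashes=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" < "$input" > /dev/null 2>&1
        if [ "$?" -gt 128 ]; then
            crashes=$((crashes + 1))
        fi
        i=$((i + 1))
    done
    echo "$crashes"
}

# Example (hypothetical paths):
#   count_crashes 10000 path/to/crashing_testcase ./target
```

A count that is neither 0 nor the full number of runs would point at a
flaky crash. Note that a program can also exit with a status above 128
on its own, so treat the number as a hint.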
That said:
> I am also unable to verify the crashes
> using any of afl's companions like afl-tmin - it always comes up with 0 byte
> files.
This sounds very odd. What does afl-tmin say, exactly? A file being
shrunk to 0 bytes would imply that the program is not reading the
input at all. Perhaps it's running out of memory, or is not called
correctly (e.g., expecting data on stdin, not in the file in argv[1])?
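To illustrate the stdin vs. file distinction, here is a rough sketch of
the two ways afl-tmin can be invoked. The wrapper names and the
AFL_TMIN variable are my own, purely for illustration; the -i / -o
options and the @@ placeholder are standard afl-tmin usage:

```shell
# Path to afl-tmin; overridable, defaults to whatever is on PATH.
AFL_TMIN=${AFL_TMIN:-afl-tmin}

# Target reads the test case from stdin.
# Args: <crash file> <minimized output file> <target binary>
tmin_stdin() {
    "$AFL_TMIN" -i "$1" -o "$2" -- "$3"
}

# Target reads the test case from a file named on its command line;
# afl-tmin substitutes the file name for @@.
tmin_file() {
    "$AFL_TMIN" -i "$1" -o "$2" -- "$3" @@
}

# Example (hypothetical paths):
#   tmin_file path/to/crashing_testcase crash.min ./target
```

If the target expects a file name and gets the data on stdin instead (or
the other way around), it never sees the input, which would fit the
0-byte result.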
It's possible, but not trivial (basically, core dumps are slow and can
really mess up timing / crash detection). If the crashes were due to a
page fault, you may find some useful messages (including the offending
addresses) in dmesg.
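For pulling those addresses out of the kernel log, something along these
lines might do. It is only a sketch: the helper name segv_addrs is made
up, and it assumes the usual x86 Linux message format ("segfault at
ADDR ip IP ..."), which can differ between kernel versions and
architectures:

```shell
# Hypothetical helper: read kernel log lines on stdin and print
# "<fault address> <instruction pointer>" for every segfault message.
segv_addrs() {
    sed -n 's/.*segfault at \([0-9a-f]*\) ip \([0-9a-f]*\).*/\1 \2/p'
}

# Example:
#   dmesg | segv_addrs
```

The instruction pointer is only meaningful against the exact binary that
was running at the time, so a recompile in between makes it hard to map
back to source.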
Another suggestion for your testing: to make sure that
libdislocator.so is working as expected on your test runs outside AFL,
you may want to set AFL_LD_VERBOSE=1 in the environment. This should
produce verbose messages from the library; if you don't see any,
something is probably wrong with the LD_PRELOAD stuff.
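Roughly, a standalone run with the library preloaded could look like the
sketch below. The wrapper name and the DISLOCATOR variable (including
its default path) are assumptions on my part; adjust the path to
wherever your libdislocator.so actually lives:

```shell
# Assumed location of the library; override as needed.
DISLOCATOR=${DISLOCATOR:-./libdislocator.so}

# Hypothetical wrapper: run a command with libdislocator preloaded and
# its verbose messages enabled.
run_with_dislocator() {
    env AFL_LD_VERBOSE=1 LD_PRELOAD="$DISLOCATOR" "$@"
}

# Example (hypothetical paths):
#   run_with_dislocator ./target path/to/testcase
```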
dmesg has some messages but apparently only from when the fuzzer was running, not from the time I tried to resume it (i.e. when it crashed on a previously working queue entry).
Since I recompiled the binary in the meantime, I'll see if I can make sense out of the addresses found in dmesg once I get to resume this instance and the crashes start happening again.
Perhaps run it under "strace -f" and see if there's anything suspicious in the
strace output?
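As a starting point for "suspicious", you could filter the trace for
memory-mapping calls that failed. A small sketch, with a made-up helper
name, assuming the trace was written to a file with strace's -o option:

```shell
# Hypothetical helper: read an strace log on stdin and print the
# memory-mapping calls that returned an error (e.g. "= -1 ENOMEM").
failed_mem_calls() {
    grep -E '(mmap|mprotect|munmap|mremap)\(.*= -1 E[A-Z]+'
}

# Example (hypothetical paths):
#   strace -f -o trace.log ./target < path/to/testcase
#   failed_mem_calls < trace.log
```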
>this is on a Debian server setup, not a desktop setup where the core dumps
>might be put elsewhere by some desktop tool).
This sounds more like "I guess" than "I know".
What's in your /proc/sys/kernel/core_pattern file?
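For reference, a quick way to interpret that file is sketched below (the
helper name is made up). A pattern starting with "|" means the kernel
pipes the core to a helper program; an absolute path is used as given;
anything else is resolved relative to the crashing process's working
directory:

```shell
# Hypothetical helper: classify a core_pattern string.
describe_core_pattern() {
    case $1 in
        '|'*) echo "pipe" ;;          # core is piped to a helper program
        /*)   echo "absolute" ;;      # core is written to an absolute path
        *)    echo "cwd-relative" ;;  # core lands relative to the process cwd
    esac
}

# Example:
#   describe_core_pattern "$(cat /proc/sys/kernel/core_pattern)"
#   ulimit -c    # a limit of 0 means no core file gets written
```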
That's just libdislocator failing to allocate memory (and returning
NULL). You may want to run it under GDB to see what's actually causing
the SEGV after this failure.
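A non-interactive GDB run could look roughly like this. It is a sketch
under assumptions: the wrapper name, the GDB and DISLOCATOR variables
and the default library path are mine, and it assumes the target takes
the test case as a file argument. Following the child on fork is only
needed if the crash happens in a forked process:

```shell
# Assumed tool and library locations; override as needed.
GDB=${GDB:-gdb}
DISLOCATOR=${DISLOCATOR:-./libdislocator.so}

# Hypothetical wrapper: run the target under gdb with libdislocator
# preloaded into the debugged program, then print a backtrace once it
# stops.
gdb_backtrace() {
    "$GDB" -batch \
        -ex "set environment LD_PRELOAD $DISLOCATOR" \
        -ex "set follow-fork-mode child" \
        -ex run \
        -ex bt \
        --args "$@"
}

# Example (hypothetical paths):
#   gdb_backtrace ./target path/to/testcase
# For a target that reads stdin, use "run < path/to/testcase" from an
# interactive gdb session instead.
```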
> Those mmap calls look like the ones coming from afl's instrumentation. si_addr=0x7ffff1c33000 corresponds to one of those calls.
What do you mean specifically?
This is backwards.
"set follow-fork-mode child" does the magic). gdb now shows a crash in a line of code in my library that, in theory, should never be able to crash (all it does is check that an index into a vector is valid). This doesn't make it any easier...