Running AFL++ on clang-instrumented binary


George Trabucchi

May 5, 2023, 6:28:37 PM
to afl-users
Hi Everyone!

I am trying to run an experiment similar to Google's FuzzBench project, just on a much smaller scale. In the FuzzBench research, it was mentioned that they used clang's source-based coverage to collect coverage information to avoid bias between fuzzer-specific coverage tools.

This is what I am trying to do. I looked at clang's documentation for this process, and it looks like the best way to replicate it is to instrument a binary with the proper clang flags, and then run AFL++ on this binary so that I can collect the coverage information from that. But as far as I know, I can't run AFL++ on a binary that wasn't instrumented with an AFL-specific compiler (e.g. afl-gcc, afl-clang-fast). When I try, AFL++ fails because my binary "is not instrumented".

So I am thinking maybe an alternative is to instrument the binary with clang, and then instrument it again with AFL++ (in a different location), and then use a tool like "llvm-profdata" to merge AFL++'s coverage data with clang's. However, this seems to contradict the original idea, which is to collect fuzzer-independent coverage.

Any help here would be much appreciated!

George Trabucchi

Marc

May 6, 2023, 4:27:44 AM
to afl-...@googlegroups.com, George Trabucchi
Hi,

a) AFL++ works with clang's sanitize-coverage too (it's called `native`
there: `AFL_LLVM_INSTRUMENT=native afl-cc -fsanitize=fuzzer -o foo
foo.c`, or compile anything normally with `clang
-fsanitize=fuzzer-no-link` and then use AFL++'s compiler to link the
harness).
You can get a minimal cmplog by adding `trace-cmps` etc.
Note that clang's instrumentation is not as good as AFL++'s though.
So if you want to benchmark AFL++ you should use its native
instrumentation.
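
The two build routes in (a) could be wrapped roughly as below. This is only a sketch: the function names `build_native` and `build_split` are made up for illustration, the source file is assumed to be a libFuzzer-style harness (one that defines `LLVMFuzzerTestOneInput`), and the exact link line in the second route is my reading of "use AFL++'s compiler to link the harness", not something taken from the AFL++ docs.

```shell
# Hypothetical helpers; both expect: <source.c> <output-binary>

# Route 1: afl-cc drives clang's own sanitizer coverage ("native" mode).
build_native() {
    AFL_LLVM_INSTRUMENT=native afl-cc -fsanitize=fuzzer -o "$2" "$1"
}

# Route 2: compile with plain clang, then let afl-cc do the final link.
# Assumption: afl-cc accepts the clang-built object as-is at link time.
build_split() {
    clang -fsanitize=fuzzer-no-link -c -o "$2.o" "$1" &&
        afl-cc -fsanitize=fuzzer -o "$2" "$2.o"
}

# Example (not run here):
#   build_native foo.c foo
#   afl-fuzz -i in -o out -- ./foo
```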

b) FuzzBench moved away from sancov to profdata for coverage about two
years ago, I think.
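
A minimal sketch of what a profdata-based measurement could look like, assuming you fuzz with the AFL++-instrumented binary and keep a second, separate build of the same target compiled with clang's source-based coverage flags purely for measurement. The helper name `replay_queue`, the binary name `target_cov`, and the assumption that the target takes the input file as its first argument are all illustrative, not from this thread.

```shell
# Replay every AFL++ queue entry through a coverage-instrumented binary,
# writing one .profraw file per input. Prints the number of inputs replayed.
# Usage: replay_queue <coverage_binary> <queue_dir> <profraw_dir>
replay_queue() {
    cov_bin=$1
    queue_dir=$2
    raw_dir=$3
    mkdir -p "$raw_dir"
    n=0
    for input in "$queue_dir"/id*; do
        [ -f "$input" ] || continue   # skip if the glob matched nothing
        n=$((n + 1))
        # Crashing or failing inputs are ignored; coverage is all we want.
        LLVM_PROFILE_FILE="$raw_dir/$n.profraw" "$cov_bin" "$input" \
            >/dev/null 2>&1 || true
    done
    echo "$n"
}

# Example (not run here), with AFL++'s default output layout:
#   clang -fprofile-instr-generate -fcoverage-mapping -o target_cov target.c
#   replay_queue ./target_cov out/default/queue raw
#   llvm-profdata merge -sparse raw/*.profraw -o target.profdata
#   llvm-cov report ./target_cov -instr-profile=target.profdata
```

Because the coverage binary never sees the fuzzer, the same replay step can be applied to the corpus of any fuzzer under comparison.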

Regards,
Marc

> I am trying to run an experiment similar to Google's FuzzBench
> <https://dl.acm.org/doi/pdf/10.1145/3468264.3473932> project, just on a
> much smaller scale. In the FuzzBench research, it was mentioned that
> they used clang's source-based coverage to collect coverage information
> to avoid bias between fuzzer-specific coverage tools.
>
> This is what I am trying to do. I looked at clang's documentation
> <https://clang.llvm.org/docs/SourceBasedCodeCoverage.html> for this
> process, and it looks like the best way to replicate it is to instrument
> a binary with the proper clang flags, and then run AFL++ on this binary
> so that I can collect the coverage information from that. But as far as
> I know, I can't run AFL++ on a binary that wasn't instrumented with an
> AFL specific compile (e.g. afl-gcc, afl-clang-fast). When I try, AFL++
> fails because my binary "is not instrumented".
>
> So I am thinking maybe an alternative is to instrument the binary with
> clang, and then instrument it again with AFL++ (in a different
> location), and then use a tool like "llvm-profdata" to merge AFL++'s
> coverage data with clang's. However, this seems to contradict the
> original idea, which is to collect fuzzer-independent coverage.

--
Marc Heuse
www.mh-sec.de

PGP: AF3D 1D4C D810 F0BB 977D 3807 C7EE D0A0 6BE9 F573