Are afl-clang-fast++ and afl-g++ compatible?

fencep...@gmail.com

Sep 19, 2017, 3:46:07 AM
to afl-users
Hi,

The program I'm fuzzing is extremely slow, so I want to use __AFL_LOOP (persistent mode). The program itself can't be compiled with afl-clang-fast++, and I can't change that. I had to write a fuzzing harness anyway to make it fuzzable in the first place, and the harness can be compiled with afl-clang-fast++ and use __AFL_LOOP. I hope the situation is understandable. My question is: will fuzzing work if the main program is compiled with afl-g++ and only the harness with afl-clang-fast++?
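For readers unfamiliar with persistent mode, here is a minimal sketch of the kind of harness being described. It assumes the usual afl-clang-fast macros (__AFL_LOOP and __AFL_HAVE_MANUAL_CONTROL); library_parse, run_one and harness_loop are hypothetical names, and the fallback macro exists only so the sketch also builds with a plain compiler.

```c
#include <stddef.h>
#include <unistd.h>

/* afl-clang-fast defines __AFL_HAVE_MANUAL_CONTROL and __AFL_LOOP.
   This fallback (an assumption for illustration) runs the loop body
   once when the file is built with any other compiler. */
#ifndef __AFL_HAVE_MANUAL_CONTROL
static int fallback_iterations = 1;
#define __AFL_LOOP(n) (fallback_iterations-- > 0)
#endif

/* Hypothetical stand-in for the library entry point built with afl-g++. */
static int library_parse(const unsigned char *data, size_t len) {
    return len > 0 && data[0] == 'A';
}

/* Read one test case from fd and hand it to the library. */
static int run_one(int fd) {
    unsigned char buf[4096];
    ssize_t len = read(fd, buf, sizeof buf);
    if (len < 0) return -1;
    return library_parse(buf, (size_t)len);
}

/* Call this from main() in the real harness. */
int harness_loop(void) {
    while (__AFL_LOOP(1000)) {
        run_one(0);  /* AFL supplies each test case on stdin (fd 0) */
    }
    return 0;
}
```

Any state the library keeps between calls would have to be reset inside the loop; the sketch assumes library_parse is stateless.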

Cheers,
fpe

Adam

Sep 19, 2017, 3:41:08 PM
to afl-...@googlegroups.com
IIRC the coverage maps generated by code compiled with gcc are not
compatible with those created with clang. So I wouldn't expect that
mixing and matching is going to work in a sane manner. (Someone please
jump in and correct me if I'm wrong or confirm that this is accurate.)

However, if you're writing a test harness, you shouldn't even need to
compile the main program to do your fuzzing, right? Unless you're using
a .so which was compiled with gcc or are doing any IPC stuff, you should
be good to go.

Another idea would be to defer the fork() server until just before the
program reads input data (which is compatible with targets compiled with
gcc). I haven't done this myself, but you can read more about it in
section 10 of technical_details.txt or just skip directly to the blog
article which goes into more detail:
https://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html
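A rough sketch of what deferring the fork server can look like in a harness. As far as I know, the __AFL_INIT() macro is only provided by afl-clang-fast (llvm_mode), so the fallback definition below is an assumption that merely keeps the sketch building with other compilers; build_table and deferred_harness are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Fallback for compilers other than afl-clang-fast (illustration only). */
#ifndef __AFL_HAVE_MANUAL_CONTROL
#define __AFL_INIT() ((void)0)
#endif

/* Hypothetical costly, input-independent setup. */
static uint32_t *build_table(size_t n) {
    uint32_t *t = malloc(n * sizeof *t);
    if (!t) return NULL;
    for (size_t i = 0; i < n; i++) t[i] = (uint32_t)i * 2654435761u;
    return t;
}

/* Call this from main() in the real harness. */
int deferred_harness(void) {
    uint32_t *table = build_table(1u << 16);  /* runs once, before the fork server */
    if (!table) return 1;

    __AFL_INIT();  /* fork server starts here; each child inherits the table */

    unsigned char buf[256];
    ssize_t len = read(0, buf, sizeof buf);   /* first touch of the input */
    uint32_t value = (len > 0) ? table[buf[0]] : 0;
    (void)value;                              /* real code would use the input here */

    free(table);
    return 0;
}
```

The point of the placement is that everything above __AFL_INIT() is paid for once, while everything below it runs for every test case.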

Let us know how it goes and feel free to post back any notes on
deferring the fork() server if you end up going down that route.
> --
> You received this message because you are subscribed to the Google
> Groups "afl-users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to afl-users+...@googlegroups.com
> <mailto:afl-users+...@googlegroups.com>.
> For more options, visit https://groups.google.com/d/optout.

Jakub Wilk

Sep 20, 2017, 7:05:49 AM
to afl-...@googlegroups.com
* fencep...@gmail.com, 2017-09-19, 00:46:
>My question is, will the fuzzing be successful, if the main program is
>compiled with afl-g++ and only the harness with afl-clang-fast++?

* On x86, code generated by afl-gcc and afl-clang-fast++ can't be linked
together. This is because the __afl_prev_loc variable is global in
afl-gcc code, but it's thread-local in afl-clang-fast code. This makes
the linker unhappy:

/usr/bin/ld: __afl_prev_loc: TLS reference in /tmp/foo-593a28.o mismatches non-TLS reference in bar.o

I guess it would work if you renamed the variable in one of them.
This is also not a problem on x86-64, where afl-gcc's __afl_prev_loc is
declared as local to the object file.

* You must make sure that no code built with afl-gcc is executed before
__AFL_LOOP. This sucks, because if there's some costly initialization
code, you can't put it outside the loop easily.


Other than that, I can't think of a reason why mixing afl-gcc and
afl-clang-fast wouldn't work.

--
Jakub Wilk

hail...@googlemail.com

Sep 25, 2017, 10:24:38 AM
to afl-users
Hi Adam,

Trying to clarify:
The goal is to fuzz a library of a module of Apache. Instead of fuzzing Apache on every single input, there is the harness, which is directly using the library. The harness is compiled with clang. The library is compiled with gcc.

If I understand you correctly, you first say gcc and clang generated maps are incompatible.

In the second paragraph, as far as I understand, you say that if the map for a .so library was generated with gcc, it is compatible with a program that was compiled with clang.

I don't understand how those two go together.

The reference to fuzzing without execve is a general idea to speed up and has nothing to do with mixing clang and gcc - correct?

Cheers,
fpe

☣Adam

Sep 26, 2017, 3:35:32 PM
to afl-...@googlegroups.com
If the library is compiled with gcc instrumentation, you will probably want the harness to be compiled without instrumentation. The real question is: do you care about reaching new code in the harness? I'm guessing not, so just don't instrument the harness. The library code should increment the bitmap in shared memory at index "cur_location ^ prev_location" (as described in the technical_details.txt doc) for each new section of code it hits.
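The bitmap update described in technical_details.txt can be sketched roughly as follows. This is a simplified model, not the injected assembly itself: the array stands in for the shared memory region, and visit_block is a hypothetical name for what the instrumentation does at each instrumented branch point.

```c
#include <stdint.h>

#define MAP_SIZE (1u << 16)

static uint8_t  shared_mem[MAP_SIZE];  /* stand-in for the SHM region */
static uint16_t prev_location;         /* ID of the previous block, shifted */

/* Called with the compile-time ID of the block being entered. */
static void visit_block(uint16_t cur_location) {
    shared_mem[cur_location ^ prev_location]++;  /* count the edge prev -> cur */
    prev_location = cur_location >> 1;           /* shift so A->B differs from B->A */
}
```

The counter index depends on the pair (previous block, current block), which is why the map records edges rather than individual blocks.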

As I understand it, the llvm version of instrumentation doesn't use memory addresses for cur_location and prev_location, but rather random (unique) identifiers. I took a quick look at the code to verify this, but I wasn't able to find the instrumentation code, so that's just based on my (possibly faulty) memory. That's why I believe them to be incompatible. It might work out ok in the end and you'll just be detecting new branches in the test harness at "wrong" indexes, but they should be consistent... In any case, I'd encourage you to experiment and report back what works well for you in practice.

I misunderstood and thought you could only compile the .so with gcc and the main binary (which turned out to be Apache) with llvm. Since you're not running Apache, I didn't think that was relevant. Now that I realize it was the harness, not Apache, I still think the algorithms are different, but I've talked myself around to thinking it might not matter since they'll be consistent.

Let us know how it goes.



Jakub Wilk

Sep 26, 2017, 3:51:06 PM
to afl-...@googlegroups.com
* ☣Adam <ad...@dc949.org>, 2017-09-26, 14:35:
>As I understand it, the llvm version of instrumentation doesn't use
>memory addresses for cur_location and prev_location, but rather random
>(unique) identifiers.

From docs/technical_details.txt:
"The cur_location value is generated randomly to simplify the process of
linking complex projects and keep the XOR output distributed uniformly."

This is true for both afl-gcc and afl-clang-fast.
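To illustrate why random compile-time identifiers still give consistent map indexes when two instrumenters are mixed, here is a small simulation. It is only a model under stated assumptions: the seeds stand in for two separate compiler runs, each component keeps its own prev_location (a simplification of the separate __afl_prev_loc variables mentioned earlier in the thread), and all names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAP_SIZE (1u << 16)
#define N_BLOCKS 4

/* One separately instrumented component (e.g. harness or library). */
typedef struct {
    uint16_t ids[N_BLOCKS];  /* fixed at "compile time" */
    uint16_t prev;           /* this component's own prev_location */
} component;

/* Simulate compile time: draw a random ID per basic block. */
static void instrument(component *c, unsigned seed) {
    srand(seed);
    for (int i = 0; i < N_BLOCKS; i++)
        c->ids[i] = (uint16_t)(rand() % MAP_SIZE);
    c->prev = 0;
}

/* Simulate run time: block b of component c executes. */
static void hit(component *c, int b, uint8_t *map) {
    uint16_t cur = c->ids[b];
    map[cur ^ c->prev]++;
    c->prev = cur >> 1;
}

/* Execute one fixed path that alternates between the two components. */
static void run_path(component *harness, component *library, uint8_t *map) {
    harness->prev = 0;  /* each execution starts from a fresh process */
    library->prev = 0;
    hit(harness, 0, map);
    hit(library, 0, map);
    hit(library, 2, map);
    hit(harness, 1, map);
    hit(library, 3, map);
}
```

Because the IDs are baked in once, running the same path twice touches exactly the same map entries, whichever instrumenter produced each ID.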

--
Jakub Wilk

☣Adam

Oct 2, 2017, 10:01:01 AM
to afl-...@googlegroups.com
I think I was probably thinking of the qemu mode not being compatible with anything on account of using memory addresses for the bitmap indexes rather than a pseudorandom identifier.
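For comparison, an address-based scheme in the style of qemu mode could look roughly like this. The mixing step is written from my recollection of qemu_mode's afl-qemu-cpu-inl.h in AFL 2.x and should be checked against the source; addr_to_loc and log_block are hypothetical names.

```c
#include <stdint.h>

#define MAP_SIZE (1u << 16)

static uint8_t   map[MAP_SIZE];  /* stand-in for the SHM region */
static uintptr_t prev_loc;

/* Derive a map-sized location from a block's start address. */
static uintptr_t addr_to_loc(uintptr_t pc) {
    uintptr_t loc = (pc >> 4) ^ (pc << 8);  /* spread aligned addresses around */
    return loc & (MAP_SIZE - 1);
}

/* Record the edge into the block starting at pc. */
static void log_block(uintptr_t pc) {
    uintptr_t cur = addr_to_loc(pc);
    map[cur ^ prev_loc]++;
    prev_loc = cur >> 1;
}
```

Here the index depends on where the code happens to be loaded, whereas the compile-time schemes depend only on the IDs baked into the binary.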

Can anyone point me to the code which injects the identifiers (both g++ and clang)? I'd be interested to see how it works and what level of effort it would take to use the same algorithm as qemu. This would be slower, but compatible, which is sometimes desirable.
