Instrumentation as a LLVM pass?

1,306 views
Skip to first unread message

Keegan McAllister

unread,
Apr 2, 2015, 11:53:10 AM4/2/15
to afl-...@googlegroups.com
I'm working on the Servo project [1] at Mozilla Research. We're building a new browser engine, and I'd like to use afl wherever we can, as part of a defense in depth security strategy.

Although we link a number of C libraries, the Servo engine itself and many of its dependencies are written in Rust. [2] The Rust compiler uses LLVM, and it might be easy to modify afl-as.c to handle the resulting assembly dialect. However, I was wondering if anyone is working on a version of the afl instrumentation that works as an LLVM pass. This would ease integration with Rust tooling and a lot of other language ecosystems.

If there's no project underway, then I might attempt it myself. Any advice would be much appreciated. It seems that I could reuse the forkserver and setup code (extracted to a .s file and compiled separately?) and would only need to reimplement the trampoline injection in LLVM.

I found LibFuzzer [3] and ASan's coverage counters [4] through this forum. But I think afl would be better for our use cases. Rust doesn't support ASan yet, either.

Thanks,
keegan

[1] https://github.com/servo/servo
[2] http://www.rust-lang.org/
[3] http://llvm.org/docs/LibFuzzer.html
[4] https://code.google.com/p/address-sanitizer/wiki/AsanCoverage#Coverage_counters

Michal Zalewski

unread,
Apr 2, 2015, 12:36:36 PM4/2/15
to afl-users
> Although we link a number of C libraries, the Servo engine itself and many
> of its dependencies are written in Rust. [2] The Rust compiler uses LLVM,
> and it might be easy to modify afl-as.c to handle the resulting assembly
> dialect. However, I was wondering if anyone is working on a version of the
> afl instrumentation that works as an LLVM pass. This would ease integration
> with Rust tooling and a lot of other language ecosystems.

Not so far, I believe. Kostya (of ASAN fame) implemented something
equivalent to AFL instrumentation several weeks ago (the coverage
counters you mention in your message -
https://code.google.com/p/address-sanitizer/wiki/AsanCoverage#Coverage_counters).

I haven't gotten to it yet (and am feeling guilty for that). The main
challenge is that it uses ASAN-specific output format, rather than the
AFL-specific one; and comes without the forkserver bits. So, it would
probably need to be extended a bit.

> If there's no project underway, then I might attempt it myself. Any advice
> would be much appreciated. It seems that I could reuse the forkserver and
> setup code (extracted to a .s file and compiled separately?) and would only
> need to reimplement the trampoline injection in LLVM.

There is also a possibly more readable, higher-level sample in the
qemu directory. But yeah, the basic idea is to initialize the fork
server before main, and then simply add a xor operation as explained
in http://lcamtuf.coredump.cx/afl/technical_details.txt.

/mz

László Szekeres

unread,
Apr 2, 2015, 2:30:17 PM4/2/15
to afl-...@googlegroups.com
Hi,

I have a simple LLVM pass that does basically the same instrumentation as AFL (afl-as).

I haven't integrated it with AFL though, currently I just link it with a dummy runtime, which has the "counter map" and dumps it to a file at exit. It should be easy to integrate this to AFL, I haven't done it yet because I only wrote this pass to run an experiment (after a discussion with Kostya) in order to see which edge-coverage instrumentation has lower performance overhead, "AFL-style" or "ASAN-style". Both "styles" increments a counter in each basic block, the difference is the following:

ASAN-style:
 - In order to measure edge-coverage, it introduces new basic blocks, by splitting critical edges (see https://code.google.com/p/address-sanitizer/wiki/AsanCoverage#Edge_coverage).
 - It has one (1-byte) counter for each basic block, including the newly introduced ones.

AFL-style:
 - It does not introduce new basic blocks.
 - It has a fixed number of (1-byte) counters: 64k.
 - It assigns a random ID to each block and increments counter[current_block_id ^ previous_block_id].
 - It also updates previous_block_id for the sake of precision (see http://lcamtuf.coredump.cx/afl/technical_details.txt).

The AFL-style coverage metric "statistically" produces the same edge-coverage information as the ASAN-style, especially for programs which has less than 64k basic blocks (so collisions are unlikely). You would think that the AFL-style instrumentation is faster, because splitting critical edges increases the number of basic blocks by 49% (measured on the SPEC2006 benchmark), which means more instrumentation, more (unconditional) branches, more pressure on the i-cache and so on.

While this is true, ASAN-style coverage has 50% performance overhead, while the AFL-style has 68% (measured on SPEC2006). This is because in each basic block the former updates only one counter (one incb instruction), while the latter updates the counter *and* updates the previous_block_id. Two stores instead of one. The original "afl-as" instruments on the assembly level, so splitting critical edges would be of course hard, but instrumenting on the IR level "ASAN-style" gives more precision and it's also faster. However AFL would need bigger changes to support arbitrary number of counters as opposed to the fix 64k.

Anyhow, the above is just something to think about. I'm attaching both passes for both "styles" (ASAN-sytle is "simple_cov.cc"). They work on one module, so you need to compile your program with LTO, then apply the coverage instrumentation pass and link the dummy runtime. I'm happy to help integrating this into AFL and also to provide more details on the measurements I ran.

Thanks,
Laszlo
coverage.tar.gz

Michal Zalewski

unread,
Apr 4, 2015, 7:32:02 PM4/4/15
to afl-users
> Anyhow, the above is just something to think about. I'm attaching both
> passes for both "styles" (ASAN-sytle is "simple_cov.cc"). They work on one
> module, so you need to compile your program with LTO, then apply the
> coverage instrumentation pass and link the dummy runtime. I'm happy to help
> integrating this into AFL and also to provide more details on the
> measurements I ran.

Cool, interesting benchmarks!

I have never ever looked at the guts of llvm; but looks like the basic
missing piece would be just emitting the forkserver init bits at entry
point; I'm assuming this is something analogous to the *_rt.c
destructors you have?

Do you mind if I put it into experimental/ for afl?

/mz

László Szekeres

unread,
Apr 6, 2015, 2:30:42 AM4/6/15
to afl-...@googlegroups.com
Cool, interesting benchmarks!

I have never ever looked at the guts of llvm; but looks like the basic
missing piece would be just emitting the forkserver init bits at entry
point; I'm assuming this is something analogous to the *_rt.c
destructors you have?

Exactly. I've just put together a patch so you can actually try it with afl. See it attached. 

Take this as proposal - it's super experimental - I literally hacked this together tonight and just tried it with a dummy example and it seems to work. :)
I haven't run any measurements, but it should be faster, I left some comments in the README why.
 
Do you mind if I put it into experimental/ for afl?
 
Please, feel free!

Thanks,
Laszlo
afl-llvm.patch

Michal Zalewski

unread,
Apr 7, 2015, 12:36:16 AM4/7/15
to afl-users
Cool, this now lives in experimental/ - field reports welcome!

Another major benefit of doing it this way would be having
CPU-agnostic instrumentation (with the init stuff rewritten as C).
I'll probably play with it this weekend.
> --
> You received this message because you are subscribed to the Google Groups
> "afl-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to afl-users+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

László Szekeres

unread,
Apr 7, 2015, 1:29:01 AM4/7/15
to afl-...@googlegroups.com
Cool, thanks! I've just run a quick benchmark, and it looks pretty good to me:

= With original instrumentation =

$ time afl-1.58b/afl-showmap -o foo build/gzip-afl/gzip -c -9 < random-file >/dev/null
afl-showmap 1.58b (Apr  6 2015 20:58:23) by <lca...@google.com>
[*] Executing 'build/gzip-afl/gzip'...

-- Program output begins --
-- Program output ends --
[+] Captured 593 tuples in 'foo'.

real 0m11.902s
user 0m11.864s
sys 0m0.022s

= With llvm based instrumentation = 

$ time afl-1.58b/afl-showmap -o foo build/gzip-llvm/gzip -c -9 < random-file >/dev/null
afl-showmap 1.58b (Apr  6 2015 20:58:23) by <lca...@google.com>
[*] Executing 'build/gzip-llvm/gzip'...

-- Program output begins --
-- Program output ends --
[+] Captured 553 tuples in 'foo'.

real 0m4.536s
user 0m4.511s
sys 0m0.018s

I haven't looked into why the diff in the tuples.

Another major benefit of doing it this way would be having
CPU-agnostic instrumentation (with the init stuff rewritten as C).
I'll probably play with it this weekend.

Yes, I rewrote the init stuff too in C, no more assembly, I'll clean up a the patch a little bit tomorrow evening and send it.

Thanks,
Laszlo

Michal Zalewski

unread,
Apr 7, 2015, 2:01:15 AM4/7/15
to afl-users
> real 0m11.902s
>
> real 0m4.536s

Nice!

> I haven't looked into why the diff in the tuples.

Well, for one, the old wrapper forces -O3 -funroll-loops? That could
be it. It's disabled if you set AFL_DONT_OPTIMIZE beforehand.

Looking at afl-clang.py, I suspect we'll need something more robust
than the -c check to differentiate between compilation / linking (say,
to properly handle '$(CC) foo.c -o foo', which is pretty common), but
that shouldn't be a big deal...

There are also several env variables (docs/env_variables.txt) worth
honoring (AFL_HARDEN, AFL_USE_ASAN, AFL_INST_RATIO,
AFL_DONT_OPTIMIZE), but that's pretty straightforward.

> Yes, I rewrote the init stuff too in C, no more assembly, I'll clean up a
> the patch a little bit tomorrow evening and send it.

Ah, cool!:-) You can use afl_forkserver() and some bits of afl_setup()
in qemu_mode/patches/afl-qemu-cpu-inl.h as a pretty good point of
reference.

/mz

Michal Zalewski

unread,
Apr 8, 2015, 2:46:10 AM4/8/15
to afl-users
Just jotting some more stuff down...

I think it would be also nice to allocate __afl_area_ptr if SHM setup
fails, so that the binary can be run / debugged outside afl-showmap &
related tools.

It's actually probably a good optimization for the "old"
instrumentation, too - map to a fixed address or create a dummy map on
failure, and then, instead of retrieving and checking the pointer,
just write stuff unconditionally.

The one problem I did notice with the prototype is that if you have
other __attribute__((constructor)) functions in the compiled code,
they will execute before the _rt shim and always segfault :-/

Michal Zalewski

unread,
Apr 8, 2015, 2:51:48 AM4/8/15
to afl-users
> It's actually probably a good optimization for the "old"
> instrumentation, too - map to a fixed address or create a dummy map on
> failure, and then, instead of retrieving and checking the pointer,
> just write stuff unconditionally.

Hm, actually, re-reading my old notes, I think this was yielding just
around 10%, at the increased risk of getting into trouble when bumping
into other mappings. So maybe not really worth it in the old code.

/mz

László Szekeres

unread,
Apr 8, 2015, 3:28:30 AM4/8/15
to afl-...@googlegroups.com, Domagoj Babic
Find attached the cleaned up version.

- the runtime is completely rewritten in C
- afl-clang is rewritten as well, it supports all kinds of compilation 
- all flags are supported, except AFL_INST_RATIO
 
Also, no 32 bit support yet, but it's not a big deal to fix it.

I think it would be also nice to allocate __afl_area_ptr if SHM setup
fails, so that the binary can be run / debugged outside afl-showmap &
related tools.

This is done.

The one problem I did notice with the prototype is that if you have
other __attribute__((constructor)) functions in the compiled code,
they will execute before the _rt shim and always segfault :-/

This is fixed too, by setting to priority to zero. 
 
Hm, actually, re-reading my old notes, I think this was yielding just
around 10%, at the increased risk of getting into trouble when bumping
into other mappings. So maybe not really worth it in the old code.

Yes, I was planning to do this too, but then I noticed that previous conversation mentioning the 10%. And as you say it's also risky. I might try it anyway.

Thanks,
Laszlo
llvm_instrumentaion.tar.gz

Michal Zalewski

unread,
Apr 8, 2015, 3:31:26 AM4/8/15
to afl-users, Domagoj Babic
Great, thanks! I'll try to find some time tomorrow to get it out the
door, along with a bunch of other pending changes.

/mz

Keegan McAllister

unread,
Apr 8, 2015, 1:33:09 PM4/8/15
to afl-...@googlegroups.com
I'm excited to see so much progress already! Many thanks to László for sharing the code.
 
Another major benefit of doing it this way would be having
CPU-agnostic instrumentation (with the init stuff rewritten as C).
I'll probably play with it this weekend.

Yeah, this is a big deal for Servo's use case. Rust is memory-safe by default; the code that we write in the unsafe dialect is disproportionately the low-level, platform-specific stuff. So I think we'd get more value than the typical afl user from fuzzing on different platforms.

keegan

Keegan McAllister

unread,
Apr 8, 2015, 8:56:29 PM4/8/15
to afl-...@googlegroups.com
The Rust integration is working!

    https://github.com/kmcallister/afl.rs

btw, I suggest adding #include <stdlib.h> to afl_cov_rt.c. The lack of a prototype for getenv() caused mystery segfaults until I turned on -Werror and saw what was going on. Maybe this happened because I was building with gcc and not clang.

keegan

Michal Zalewski

unread,
Apr 9, 2015, 2:50:24 AM4/9/15
to afl-users
>> The one problem I did notice with the prototype is that if you have
>> other __attribute__((constructor)) functions in the compiled code,
>> they will execute before the _rt shim and always segfault :-/
>
> This is fixed too, by setting to priority to zero.

Hmm... this may be not enough; looks like it still breaks OpenSSL
build for me. We die in OPENSSL_cpuid_setup from cryptlib.c, which is
apparently called straight from _init, probably because of:

.section .init
call OPENSSL_cpuid_setup

...in some of the hand-written assembly :-/

/mz

László Szekeres

unread,
Apr 9, 2015, 1:56:45 PM4/9/15
to afl-...@googlegroups.com
Oh my! This patch should fix it.



/mz

--
You received this message because you are subscribed to the Google Groups "afl-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to afl-users+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Thanks,
Laszlo

fix.patch

Michal Zalewski

unread,
Apr 9, 2015, 2:09:54 PM4/9/15
to afl-users
Neat, thanks!

__attribute__((I_REALLY_MEAN_IT))

Jonathan Neuschäfer

unread,
Apr 9, 2015, 3:24:41 PM4/9/15
to 'László Szekeres' via afl-users
On Thu, Apr 09, 2015 at 10:56:23AM -0700, 'László Szekeres' via afl-users wrote:
> - /* Whoops. */
> + /* Whooooops. */

The best part. :D


Jonathan

László Szekeres

unread,
Apr 9, 2015, 3:30:11 PM4/9/15
to afl-...@googlegroups.com
That's should *really* not happen, so it deserves the extra o-s! :D


--
You received this message because you are subscribed to the Google Groups "afl-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to afl-users+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Thanks,
Laszlo

Konstantin Serebryany

unread,
Apr 10, 2015, 12:47:15 AM4/10/15
to afl-...@googlegroups.com
Laszlo,

Would you be interested in extending the sanitizer coverage to support
AFL's 64K shmem?
I think all we need to do is to synchronize internal sanitizer's
counters (counters_vec from sanitizer_coverage_libcdep.cc)
with the AFL's 64K shmem block at exit.
This way we will not depend on LTO, and will have AFL-compatible
instrumentation in LLVM trunk.
I'd be happy to review patches if you send them my way.

László Szekeres

unread,
Apr 10, 2015, 1:24:17 AM4/10/15
to afl-...@googlegroups.com
Of course, if there's an interest to get this into LLVM.

So, the afl instrumentation does not depend on LTO. (Only the other, "asan-style" instrumentation pass I've sent earlier, which simply counts the number of BBs in a single module-pass - in order to allocate the right sized bitmap - so it expects the whole program.)

By synchronization do you mean some sort of post-processing which maps the asan-style (variable size) counter map to the fixed 64k size afl counter map? That would change the "semantics" of the instrumentation, which might be a problem. I think it would be better to just keep the afl-style instrumentation, which is independent from the current "-fsanitize-coverage" passes and just include it under -fsanitize-coverage=afl or something. I'm happy to send an LLVM patch for that.

(On the AFL side it might worth considering the support of "asan-style" coverage, using variable size counter maps - given the benefits I've mentioned in my first e-mail. That way we had afl support in -fsanitize-coverage and -fsanitize-coverage support in afl! Wouldn't it be nice? :) 

Thanks,
Laszlo

Konstantin Serebryany

unread,
Apr 10, 2015, 2:01:49 AM4/10/15
to afl-...@googlegroups.com
On Thu, Apr 9, 2015 at 10:23 PM, 'László Szekeres' via afl-users
<afl-...@googlegroups.com> wrote:
> Of course, if there's an interest to get this into LLVM.
>
> So, the afl instrumentation does not depend on LTO. (Only the other,
> "asan-style" instrumentation pass I've sent earlier, which simply counts the
> number of BBs in a single module-pass - in order to allocate the right sized
> bitmap - so it expects the whole program.)
>
> By synchronization do you mean some sort of post-processing which maps the
> asan-style (variable size) counter map to the fixed 64k size afl counter
> map?
Yes.
> That would change the "semantics" of the instrumentation, which might
> be a problem.

The semantics will only improve (from probabilistic to strict).

The only problem I see is with SIGKILL, where we will fail to
synchronize with shmem at exit.
We can overcome this too: have large enough (e.g. N * 64K) global
variable for counters that will be used for the whole program,
and map each of the N 64K chunks to AFL's shmem when in AFL mode (when
used w/o afl, the counters will be distinct).
It may cost one extra indirection in instrumentation (can't see right
now how to avoid it w/o LTO and I don't want to depend on LTO here).

> I think it would be better to just keep the afl-style
> instrumentation, which is independent from the current "-fsanitize-coverage"
> passes and just include it under -fsanitize-coverage=afl or something.

It can be done, but I am reluctant to accept this into SanitizerCoverage.cpp.
As I've said before, AFL-style coverage is done the way it's done not
because it is optimal, but because
of the restrictions of the asm instrumentation which can not create new edges.
We can and should do better in a compiler.

> I'm
> happy to send an LLVM patch for that.
>
> (On the AFL side it might worth considering the support of "asan-style"
> coverage, using variable size counter maps - given the benefits I've
> mentioned in my first e-mail. That way we had afl support in
> -fsanitize-coverage and -fsanitize-coverage support in afl! Wouldn't it be
> nice? :)

That would be very nice indeed. :)
Also, support for
https://code.google.com/p/address-sanitizer/wiki/AsanCoverage#Caller-callee_coverage
might be good.
If anything should be done on LLVM compiler-rt side for better
integration with AFL (shmem, forkserver, whatever) -- just send the
patches.

--kcc

Michal Zalewski

unread,
Apr 10, 2015, 2:09:29 AM4/10/15
to afl-users
> (On the AFL side it might worth considering the support of "asan-style"
> coverage, using variable size counter maps - given the benefits I've
> mentioned in my first e-mail. That way we had afl support in
> -fsanitize-coverage and -fsanitize-coverage support in afl! Wouldn't it be
> nice? :)

I'm all for it. I feel very guilty for not supporting ASAN
instrumentation, but it was some combination of not having time; not
having a particularly elegant solution for the forkserver bits; and
the fact that relatively few people keep their compilers sufficiently
up-to-date (so the demand is probably modest for now).

The main reason for the 64 kB map is essentially to have a very
predictable bound on how much time afl-fuzz needs to spend analyzing
the paths, especially for very fast binaries (if you want to keep up
with 2-4 k execs/sec/core, spending 200 microseconds to find new edges
or hash the trace is problematic). But of course, that cuts both ways,
and with sparse maps, you end up wasting some CPU time.

(Well, another goal of doing that was being able to use random IDs
without risking collisions between the IDs assigned to different .o or
.so files.)

/mz

László Szekeres

unread,
Apr 10, 2015, 2:09:10 PM4/10/15
to afl-...@googlegroups.com
On Thu, Apr 9, 2015 at 11:01 PM, Konstantin Serebryany <konstantin....@gmail.com> wrote:
> That would change the "semantics" of the instrumentation, which might
> be a problem.

The semantics will only improve (from probabilistic to strict).

There are two cases:

1) when the number of edges is much less than 64k: in this case the information loss due to collisions is negligible. 

2) when the number of edges is close to 64k or more: in this case the "precision" (as in information content/entropy) of the ASAN bitmap can be larger and you might be able to encode this into the 64k AFL bitmap lossless, e.g. by compressing it. But we definitely don't want to do something like that because that changes the "semantics" of the bitmap. So the question is more like: do you have a "semantic preserving" encoding algorithm which is better than what the AFL instrumentation already does? Because, basically what the AFL instrumentation does is encoding the "precise" information that ASAN-coverege generates on the fly. It does it in a lossy way, but by keeping the "each byte is a counter for and edge" semantics.

So my point here is that the "synchronization" is unnecessary. But you might have a better map[N] --> map[64k] function in mind.
 
The only problem I see is with SIGKILL, where we will fail to
synchronize with shmem at exit.
We can overcome this too: have large enough (e.g. N * 64K) global
variable for counters that will be used for the whole program,
and map each of the N 64K chunks to AFL's shmem when in AFL mode (when
used w/o afl, the counters will be distinct).
It may cost one extra indirection in instrumentation (can't see right
now how to avoid it w/o LTO and I don't want to depend on LTO here).

> I think it would be better to just keep the afl-style
> instrumentation, which is independent from the current "-fsanitize-coverage"
> passes and just include it under -fsanitize-coverage=afl or something.

It can be done, but I am reluctant to accept this into SanitizerCoverage.cpp.
As I've said before, AFL-style coverage is done the way it's done not
because it is optimal, but because
of the restrictions of the asm instrumentation which can not create new edges.
We can and should do better in a compiler.
 
Regardless of the underlying reasons why the AFL-style coverage metric works as it is, it does have some benefits. The two that Michal pointed out are very important: 1) the constant analysis time vs. linear, 2) no need for synchronization between instrumented modules and shared libraries. Another benefit of the simple and elegant design of "AFL-style" for instance, is that it doesn't need any special instrumentation to cover function call edges, while the ASAN-style does need one. 

The performance overhead of the same styles are actually the same too. In my previous e-mail I was comparing the overhead of the AFL instrumentation to the "simple ASAN-style pass" I attached, which just breaks critical edges to get basic branch coverage, but doesn't deal with function calls (equivalent of -fsanitize-coverage=3). A more fair comparison is with the -fsanitize-coverage=4 (with additional calleer-callee coverage). The average overhead of AFL and -fsanitize-coverage=4 is basically identical: they are both ~65%, measured on SPEC2006.

So I believe both "styles" have their own benefits and it would make a lot of sense for AFL to support arbitrary size bitmaps and for -fsanitize-coverage to support fixed size bitmaps. Imagine you have a target with source code, but it uses a binary only library. Instrumenting the binary library with something more complex than afl-style would be hard, but you can do it on the assembly level for the binary and on the IR level for the main program in a compatible way. (and you need to keep the semantics for that.) To make the fixed size counter map more generic, we could have an optional multiplier for the size of the map, so you could get precision for larger programs e.g. with (n=2)*64k, while keeping it simple.

So, in order to have mutual support, I'd still suggest adding the fixed sized counter map support under -fsanitize-coverage=afl or similar as it is now, extended with the optional multiplier. This would also eliminate those complications with SIGKILL, LTO and the others you've mentioned. (But again, you might have a better mapping function.) And AFL could get guaranteed precision with the support of arbitrary size counter maps. There wouldn't be complications here either, the ASAN instrumented binary could just report the correct map size to AFL before creating the SHM. I'm happy to help with patches both ways. :)

What do you think?

Thanks,
Laszlo

Konstantin Serebryany

unread,
Apr 11, 2015, 1:49:38 AM4/11/15
to afl-...@googlegroups.com, Evgeniy Stepanov, Alexey Samsonov
+Evgeniy and Alexey

I don't buy this. 
Constant time is 64K and linear time is proportional to the number of edges, which is also around 64K for most targets where AFL shines. 
What I do buy is simplicity on the client side. 64K is really simple for the consumer of the information, I like it!
 
2) no need for synchronization between instrumented modules and shared libraries. Another benefit of the simple and elegant design of "AFL-style" for instance, is that it doesn't need any special instrumentation to cover function call edges, while the ASAN-style does need one. 

Agree, AFL solves it elegantly. 
(less precise though).  
 
The performance overhead of the same styles are actually the same too. In my previous e-mail I was comparing the overhead of the AFL instrumentation to the "simple ASAN-style pass" I attached, which just breaks critical edges to get basic branch coverage, but doesn't deal with function calls (equivalent of -fsanitize-coverage=3). A more fair comparison is with the -fsanitize-coverage=4 (with additional calleer-callee coverage). The average overhead of AFL and -fsanitize-coverage=4 is basically identical: they are both ~65%, measured on SPEC2006.


Agree. 
 
So I believe both "styles" have their own benefits and it would make a lot of sense for AFL to support arbitrary size bitmaps and for -fsanitize-coverage to support fixed size bitmaps. Imagine you have a target with source code, but it uses a binary only library. Instrumenting the binary library with something more complex than afl-style would be hard, but you can do it on the assembly level for the binary and on the IR level for the main program in a compatible way. (and you need to keep the semantics for that.) To make the fixed size counter map more generic, we could have an optional multiplier for the size of the map, so you could get precision for larger programs e.g. with (n=2)*64k, while keeping it simple.

So, in order to have mutual support, I'd still suggest adding the fixed sized counter map support under -fsanitize-coverage=afl or similar as it is now, extended with the optional multiplier. This would also eliminate those complications with SIGKILL, LTO and the others you've mentioned. (But again, you might have a better mapping function.) And AFL could get guaranteed precision with the support of arbitrary size counter maps. There wouldn't be complications here either, the ASAN instrumented binary could just report the correct map size to AFL before creating the SHM. I'm happy to help with patches both ways. :)


My greatest problem with having AFL style instrumentation is just extra added complexity in the code.
We have too much of them (both code and complexity) already.

However, if you do feel so strong about it and are willing to *maintain* this code in trunk in future -- let's do it. 
Please send patches to llvm-commits with me, samsonov@ and eugenis@  in CC (using phabricator).
(I may or may not be able to respond quickly, but one of us will).  
Make sure to include tests, for both llvm and compiler-rt.

Don't bother with -fsanitize-coverage=afl flag for now, just implement -mllvm -sanitizer-coverage-afl.
Alexey was going to implement support for -fsanitize-coverage=foo,bar flags later this month.  

--kcc 
 
What do you think?

Thanks,
Laszlo

--

Konstantin Serebryany

unread,
Apr 11, 2015, 12:21:32 PM4/11/15
to afl-...@googlegroups.com, Evgeniy Stepanov, Alexey Samsonov
BTW, one reason why I'd love to see AFL support asan-style coverage instrumentation: 
it is the actual coverage that can be mapped back to source locations and presented in human readable form. 

Consider you have a corpus that the fuzzer refuses to extend any further. 
If you've been running it with asan-style coverage your free byproduct on disk is the list of covered edges. 
You can also extract the list of all instrumented edges, and then subtract one from another to get the list of non-covered edges. 
Then you can symbolize this list and present it to a human expert who can manually guide the fuzzer towards more coverage. 

With afl-style coverage, even implemented in LLVM, you will have to re-run the corpus with some other coverage instrumentation
to get human readable results. 

--kcc 

Michal Zalewski

unread,
Apr 11, 2015, 10:06:40 PM4/11/15
to afl-users
Somebody pointed out that trying to compile -shared libraries with
afl-clang-fast now results in:

/usr/bin/ld: ./afl-llvm-rt.o: .preinit_array section is not allowed in DSO

I guess we could have a version without .preinit_array when we see
-shared, but maybe somebody has better ideas?

/mz

Michal Zalewski

unread,
Apr 11, 2015, 10:16:30 PM4/11/15
to afl-users
Never mind, I think we can get rid of the preinit stuff by replacing
mmap() with just a .comm region that __afl_area_ptr is initially set
to, and then updated to point to SHM.

Michal Zalewski

unread,
Apr 11, 2015, 10:23:10 PM4/11/15
to afl-users
Yup, seems to work. It's just this now:

u8 __afl_area_initial[MAP_SIZE];
u8* __afl_area_ptr = __afl_area_initial;

static void __afl_map_shm(void) {
...
__afl_area_ptr = shmat(shm_id, NULL, 0);
...
}

__attribute__((section(".init_array"), used))
static void (*__afl_init_f)(void) = __afl_init;

/mz

Keegan McAllister

unread,
Apr 12, 2015, 12:10:14 PM4/12/15
to afl-...@googlegroups.com
> (Well, another goal of doing that was being able to use random IDs
> without risking collisions between the IDs assigned to different .o or
> .so files.)

Yeah, it's nice that you can add instrumentation on a per-compilation-unit basis. I haven't tried it, but I think you could even mix LLVM-instrumented and afl-as-instrumented code, assuming you disable one or the other forkserver. This would be useful on big, mixed-language codebases. It's also attractive for fuzzing JITs, because afl instrumentation would be an easy addition.

I'd also like to play around with hashing other events besides CFG edge transitions. e.g.

    void *my_malloc_hook (size_t size, const void *caller) {
        __afl_area_ptr[M(caller, size)]++;
        return malloc(size);
    }

for some suitable map M. This would allow afl to notice when it's found a way to change the size of an allocation, even when it doesn't affect control flow of the instrumented code. But if you're making the bitmap format more flexible, which seems like a great idea, then I think there are better ways to incorporate these hooks.

keegan
Reply all
Reply to author
Forward
0 new messages