Making AFL modular?


Jacek Wielemborek

Sep 24, 2015, 9:17:47 AM
to afl-users, lca...@coredump.cx
Hello,

I recently read Hanno's article on lwn.net [1] and someone there
suggested using instrumentation capabilities from gcov instead of the
current instrumentation technique. I don't know enough about the current
technique AFL uses to judge whether it's actually doable and reasonable,
but I took a look at the source code of afl-fuzz.c and afl-gcc.c and had
the impression that AFL could greatly benefit from becoming more
modular, for example by splitting it into code units that solve the
following problems:

1. Instrumentation - "client" (fuzzed binary) and "server" (afl-fuzz)
functionality,

2. User interface, including CLI and fuzzer_stats,

3. Bare fuzzing algorithms,

4. Output analysis - detecting crashes and hangs, perhaps also
implementing other features: stdout analysis? Interfacing with debuggers?

5. AFL genetic algorithm,

6. Possibly other features that I missed

IMHO it would be perfect to turn all but the CLI code into a "libafl"
library that could be used for building fuzzers for different systems.
This would also make it easier to hack those parts and possibly
interface with AFL to a much greater extent.
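
As a rough illustration of what the "output analysis" unit (item 4 in the list above) could look like in isolation, here is a minimal sketch in C. It is not how afl-fuzz implements this; the names `run_result` and `run_and_classify` are hypothetical, and the timeout relies on the assumption that the target neither catches SIGALRM nor resets the alarm itself.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result codes; these names are not taken from AFL. */
enum run_result { RUN_OK, RUN_CRASH, RUN_HANG, RUN_ERROR };

/* Run argv[0] with the given arguments and classify how it ended.
   A child still running after timeout_sec seconds is terminated by
   SIGALRM, which is reported as a hang. Any other fatal signal is
   reported as a crash. */
enum run_result run_and_classify(char *const argv[], unsigned timeout_sec) {
  pid_t pid = fork();
  if (pid < 0) return RUN_ERROR;

  if (pid == 0) {
    alarm(timeout_sec);     /* a pending alarm is preserved across exec */
    execvp(argv[0], argv);
    _exit(127);             /* only reached if exec failed */
  }

  int status;
  if (waitpid(pid, &status, 0) < 0) return RUN_ERROR;

  if (WIFSIGNALED(status))
    return WTERMSIG(status) == SIGALRM ? RUN_HANG : RUN_CRASH;

  return RUN_OK;            /* normal exit, whatever the exit code */
}
```

With an interface like this, a front end only needs the returned enum and does not have to know how the child was started or timed.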

I know that this is a big design change and wonder if Michał is
interested in this. Michał, if you are, could you suggest which
milestones would have to be achieved in order to gradually reach such
an architecture? In other words, how could I help?

Cheers,
d33tah

[1] https://lwn.net/Articles/657959/


Michal Zalewski

Sep 24, 2015, 3:24:11 PM
to afl-users, Michal Zalewski
There are two sorts of modularity: one is just about making the layout
of the code more clear, and I think it's overdue. It's just not high
on my list, but afl-fuzz.c should be broken into several components,
and some of the repeated code in afl-tmin.c, afl-showmap.c, and
afl-fuzz.c should be refactored.

The other kind is about coming up with truly flexible and small APIs
between the pieces of code with the intent of making them reusable.
Here, I'm not so sure: it is a huge expense and something that will
limit flexibility in the long haul, and the question is, would anyone
want to really use just one component, but not the others? The most
convincing use case I can think of is plugging in a different mutation
engine into AFL. Stuff like using gcov... you can get gcov-equivalent
coverage via config.h, but it's just not a very good idea.
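
To make the "different mutation engine" use case concrete, a small API for it might be little more than a function-pointer type. The following is a minimal sketch in C under that assumption; `mutate_fn`, `mutate_bitflip`, `mutate_setbyte` and `apply_mutator` are hypothetical names, not part of AFL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical plug-in interface: a mutator edits buf in place.
   None of these names come from AFL. */
typedef void (*mutate_fn)(uint8_t *buf, size_t len, uint32_t seed);

/* Flip a single bit whose position is derived from the seed. */
void mutate_bitflip(uint8_t *buf, size_t len, uint32_t seed) {
  if (len == 0) return;
  size_t bit = seed % (len * 8);
  buf[bit / 8] ^= (uint8_t)(1u << (bit % 8));
}

/* Overwrite a single byte with a value derived from the seed. */
void mutate_setbyte(uint8_t *buf, size_t len, uint32_t seed) {
  if (len == 0) return;
  buf[seed % len] = (uint8_t)(seed >> 8);
}

/* The caller only sees the function pointer, so swapping engines
   means passing a different mutate_fn. */
void apply_mutator(mutate_fn engine, uint8_t *buf, size_t len, uint32_t seed) {
  engine(buf, len, seed);
}
```

The cost mentioned above shows up as soon as a mutator needs more context (dictionaries, queue entries, coverage feedback): every such need widens the signature that all engines then have to follow.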

/mz

Jacek Wielemborek

Sep 29, 2015, 8:49:10 AM
to afl-...@googlegroups.com, Michal Zalewski
On 24.09.2015 at 21:23, Michal Zalewski wrote:
> There are two sorts of modularity: one is just about making the layout
> of the code more clear, and I think it's overdue. It's just not high
> on my list, but afl-fuzz.c should be broken into several components,
> and some of the repeated code in afl-tmin.c, afl-showmap.c, and
> afl-fuzz.c should be refactored.

Since it's not high on your list, perhaps you'd accept a patch from
somebody else? If you described what you'd like to see done, I could
take it on, provided the task is small enough. For starters, how about
getting rid of global variables?

> The other kind is about coming up with truly flexible and small APIs
> between the pieces of code with the intent of making them reusable.
> Here, I'm not so sure: it is a huge expense and something that will
> limit flexibility in the long haul, and the question is, would anyone
> want to really use just one component, but not the others? The most
> convincing use case I can think of is plugging in a different mutation
> engine into AFL. Stuff like using gcov... you can get gcov-equivalent
> coverage via config.h, but it's just not a very good idea.
>
> /mz

Breaking the code into several components would already make hacking it
much simpler, so I guess that it could be enough for now.


Michal Zalewski

Oct 13, 2015, 11:09:11 AM
to afl-users
>> There are two sorts of modularity: one is just about making the layout
>> of the code more clear, and I think it's overdue. It's just not high
>> on my list, but afl-fuzz.c should be broken into several components,
>> and some of the repeated code in afl-tmin.c, afl-showmap.c, and
>> afl-fuzz.c should be refactored.
>
> Since it's not high on your list, perhaps you'd accept a patch from
> somebody else?

Probably, provided it's a polished patch (ideally, tested on multiple
OSes and with a logical layout of the code).

If it's going to be a proof-of-concept solution that would actually
make code look or work worse in the short haul, and would require
several weeks of work on my end to get it in a good / readable /
logical shape, I'd probably put it on a back burner.

/mz

Doug Birdwell

Oct 13, 2015, 4:44:14 PM
to afl-users, lca...@coredump.cx
I would add to the list what Michal already indicated below: get rid of redundant code.  This was a significant issue for me when I wrote the network fuzzing extensions -- without the modularity, these extensions are needed in 3-4 other places, and I wasn't willing to write / modify the network stuff multiple times to do this.  Therefore, network support is only implemented in afl-fuzz.

The user interface could be decoupled from the remaining code, picking up data from the running instances of the fuzzer and displaying them on demand.
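
A minimal sketch of how a separate front end could pick up such data, assuming the running fuzzer exposes it as text made of "key : value" lines (the layout of the fuzzer_stats file is taken as an assumption here). The helper name `stats_lookup` is hypothetical; reading the file into memory is left out.

```c
#include <stddef.h>
#include <string.h>

/* Look up `key` in text made of "key : value" lines.
   On success, copies the value into out (truncating if needed) and
   returns 0. Returns -1 if the key is absent or out_len is 0. */
int stats_lookup(const char *text, const char *key, char *out, size_t out_len) {
  size_t key_len = strlen(key);
  const char *line = text;

  if (out_len == 0) return -1;

  while (line && *line) {
    const char *eol = strchr(line, '\n');
    size_t line_len = eol ? (size_t)(eol - line) : strlen(line);
    const char *colon = memchr(line, ':', line_len);

    if (colon && strncmp(line, key, key_len) == 0) {
      /* Only padding spaces may sit between the key and the colon,
         so "execs" does not match "execs_done". */
      const char *p = line + key_len;
      while (p < colon && *p == ' ') p++;

      if (p == colon) {
        const char *val = colon + 1;
        while (val < line + line_len && *val == ' ') val++;

        size_t val_len = (size_t)(line + line_len - val);
        if (val_len >= out_len) val_len = out_len - 1;
        memcpy(out, val, val_len);
        out[val_len] = '\0';
        return 0;
      }
    }
    line = eol ? eol + 1 : NULL;
  }
  return -1;
}
```

A display tool built this way would only depend on the text format, not on any of the fuzzer's internal data structures.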

A second issue is the genetic algorithm.  Ideally, it would allow mutation modules to be plugged in and provide an easy way to influence their probabilities of selection - allowing users to experiment with the mutations and their rates of application.  It would also be good if it were integrated with a way to distribute processing (rather than simply run multiple copies of afl-fuzz).
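
One simple way to let users influence the selection probabilities is to give each mutation module an integer weight and pick proportionally to it. This is a minimal sketch of that idea in C, not something AFL provides; `pick_weighted` is a hypothetical name and the random draw `r` is supplied by the caller.

```c
#include <stddef.h>

/* Given per-mutator weights and a random draw r, return the index of
   the chosen mutator. Index i is chosen with probability roughly
   weights[i] / sum(weights); a weight of 0 disables that mutator.
   The caller must pass n > 0. If all weights are 0, index 0 is
   returned. */
size_t pick_weighted(const unsigned *weights, size_t n, unsigned r) {
  unsigned total = 0;
  for (size_t i = 0; i < n; i++) total += weights[i];
  if (total == 0) return 0;

  unsigned point = r % total;          /* position in [0, total) */
  for (size_t i = 0; i < n; i++) {
    if (point < weights[i]) return i;  /* point falls in slot i */
    point -= weights[i];
  }
  return n - 1;                        /* not reached when total > 0 */
}
```

Experimenting with mutation rates then amounts to editing the weight table, for instance from a config file, without touching the mutators themselves.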

Not that my network extension is a shining example of modularity, either, as Jacek pointed out in a separate post!

As Jacek said, there is an opportunity for a lot of work -- possibly justifying a modified design and an AFL v. 2.  I'm interested in helping, but my focus of attention is somewhat at the mercy of others, which usually means I am time-limited.

Doug

Doug Birdwell

Oct 13, 2015, 4:47:15 PM
to afl-users
I suspect testing on multiple / many platforms is an issue for many of us (even with Docker).  It would be nice to know who can test what (and is willing to do so), and ideally contribute patches.

Doug

Jacek Wielemborek

Oct 13, 2015, 4:56:04 PM
to afl-...@googlegroups.com, lca...@coredump.cx, bird...@gmail.com
On 13.10.2015 at 22:44, Doug Birdwell wrote:
> I would add to the list what Michal already indicated below: get rid of
> redundant code. This was a significant issue for me when I wrote the
> network fuzzing extensions -- without the modularity, these extensions are
> needed in 3-4 other places, and I wasn't willing to write / modify the
> network stuff multiple times to do this. Therefore, network support is
> only implemented in afl-fuzz.
>
> The user interface could be decoupled from the remaining code, picking up
> data from the running instances of the fuzzer and displaying them on demand.
>
> A second issue is the genetic algorithm. Ideally, it would allow mutation
> modules to be plugged in and provide an easy way to influence their
> probabilities of selection - allowing users to experiment with the
> mutations and their rates of application. It would also be good if it were
> integrated with a way to distribute processing (rather than simply run
> multiple copies of afl-fuzz).
>
> Not that my network extension is a shining example of modularity, either,
> as Jacek pointed out in a separate post!
>
> As Jacek said, there is an opportunity for a lot of work -- possibly
> justifying a modified design and an AFL v. 2. I'm interested in helping,
> but my focus of attention is somewhat at the mercy of others, which usually
> means I am time-limited.
>
> Doug

I guess that this is the moment when I can show my AFL refactoring
attempts. The code can be found here:

https://github.com/d33tah/afl-refactor

The first thing I tried was moving global variables into a separate
struct and passing it to functions as an argument. I also tried dividing
afl-fuzz.c into util.c, shm-instr.c and fuzzing-engine.c, with afl-fuzz.c
remaining as the front end. The code might look uglier now than the
original afl-fuzz.c, but I feel that it might be easier to hack now. Would anyone
like to make any suggestions as to how I could improve it further?
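
For readers who have not seen this kind of refactoring, the general pattern looks roughly like the sketch below. It is an illustration only and is not taken from the afl-refactor repository; the struct name, its fields, the helper functions and the map size are all assumptions chosen for the example.

```c
#include <stdint.h>
#include <string.h>

#define MAP_SIZE 65536   /* illustrative size, assumed for this sketch */

/* Hypothetical state struct collecting what used to be file-scope
   globals. The field names are only loosely modelled on a fuzzer. */
struct fuzz_state {
  uint8_t  virgin_bits[MAP_SIZE];  /* coverage not yet seen */
  uint64_t total_execs;            /* number of target runs */
  uint32_t queued_paths;           /* number of queue entries */
};

/* Explicit initialisation replaces static initialisers on globals. */
void fuzz_state_init(struct fuzz_state *st) {
  memset(st->virgin_bits, 0xff, sizeof st->virgin_bits);
  st->total_execs = 0;
  st->queued_paths = 0;
}

/* A function that would previously have updated globals directly
   now receives the state it works on as an argument. */
void record_exec(struct fuzz_state *st, int found_new_path) {
  st->total_execs++;
  if (found_new_path) st->queued_paths++;
}
```

The extra parameter makes every call site noisier, which is probably part of why the result can look uglier, but it also allows two independent states in one process and makes the pieces easier to test separately.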

As for the IDE, I actually tried Eclipse CDT instead of Vim for this
task. Surprisingly, it worked pretty well if you changed some defaults.


Jacek Wielemborek

Oct 19, 2015, 4:36:56 PM
to afl-...@googlegroups.com, lca...@coredump.cx, bird...@gmail.com
On 13.10.2015 at 22:55, Jacek Wielemborek wrote:
> I guess that this is the moment when I can show my AFL refactoring
> attempts. The code can be found here:
>
> https://github.com/d33tah/afl-refactor
>
> The first thing I tried was moving global variables into a separate
> struct and passing it to functions as an argument. I also tried dividing
> afl-fuzz.c into util.c, shm-instr.c and fuzzing-engine.c, with afl-fuzz.c
> remaining as the front end. The code might look uglier now than the
> original afl-fuzz.c, but I feel that it might be easier to hack now. Would anyone
> like to make any suggestions as to how I could improve it further?
>
> As for the IDE, I actually tried Eclipse CDT instead of Vim for this
> task. Surprisingly, it worked pretty well if you changed some defaults.

I had a go at trying to integrate this with Go using the cgo extension.
Here are the results:

https://github.com/d33tah/afl-refactor/tree/go

It's now possible to reference Go functions within afl-fuzz and the
other way round - the idea is that this way, the most complicated pieces
of C code could be rewritten in a higher-level language. Is anyone
willing to experiment with that?
