implementation porn for the morbidly curious

Michal Zalewski

Dec 18, 2014, 1:12:05 AM
to afl-users
The internals of afl-fuzz aren't documented particularly well outside
the comments in the code, so I figured it may be useful to describe
the underlying mechanics in case anybody wants to replicate it in
another tool, or make improvements or tweaks to the current codebase.

Enjoy!

== The instrumentation ==

The code injected by afl-gcc / afl-clang is essentially equivalent to:

  cur_location = $compile_time_random;
  shared_mem[cur_location ^ prev_location]++;
  prev_location = cur_location;

...where shared_mem[] is a 64 kB region of memory created by the
parent process and used to record instrumentation data. The size is
chosen empirically to minimize the likelihood of collisions in typical
targets while keeping the map small enough to allow microsecond-range
hashing / byte counting / etc. SHM is used to minimize the number of
syscalls and context switches when writing or reading the data.
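
To make this concrete, here is a minimal sketch of the parent-side
setup, assuming SysV shared memory and the __AFL_SHM_ID environment
variable (the name used in config.h); error handling trimmed:

  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/ipc.h>
  #include <sys/shm.h>

  #define MAP_SIZE (1 << 16)          /* the 64 kB bitmap */

  static unsigned char* setup_shm(void) {

    char id_str[16];
    int shm_id = shmget(IPC_PRIVATE, MAP_SIZE, IPC_CREAT | IPC_EXCL | 0600);

    /* Export the SHM ID so the instrumented child can shmat() to it. */
    snprintf(id_str, sizeof(id_str), "%d", shm_id);
    setenv("__AFL_SHM_ID", id_str, 1);

    return shmat(shm_id, NULL, 0);

  }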

The aforementioned code is injected at the beginning of every function
and at every straightforward conditional branch (there is also some
overspray on non-conditional branches due to the simplicity of
afl-as).

As should be evident, the code doesn't merely track branch directions,
but also keeps track of hit count using an 8-bit counter. It would
have been nicer to use saturating arithmetic, but this would require
additional instructions on x86, so the counter can sometimes overflow.

On the receiving side, the fuzzer uses a simple lookup table to
convert raw counts into more coarse buckets, so that, for example,
branch hit count changing from 1 to 2 would be seen as an interesting
development - but from 40 to 41, not so much.
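
Something along these lines; the bucket boundaries below approximate
afl's lookup table and may differ between versions:

  /* Collapse a raw 8-bit hit count into a coarse bucket, so that
     1 -> 2 registers as a new bucket while 40 -> 41 does not. The
     real thing is a precomputed 256-entry lookup table applied to
     the whole map, which avoids branching on every byte. */

  static unsigned char bucket(unsigned char cnt) {

    if (cnt == 0)   return 0;
    if (cnt == 1)   return 1;
    if (cnt == 2)   return 2;
    if (cnt == 3)   return 4;
    if (cnt <= 7)   return 8;
    if (cnt <= 15)  return 16;
    if (cnt <= 31)  return 32;
    if (cnt <= 127) return 64;
    return 128;

  }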

Based on a fair amount of testing, this form of instrumentation is
considerably more useful than simple block or edge coverage; you can
do your own testing by enabling COVERAGE_ONLY in config.h, recompiling
the target, and seeing how far you can get.

You can also try out afl-showmap to preview the captured
instrumentation data without doing any actual fuzzing work.

As should be evident, the instrumentation isn't thread-aware and
out-of-order thread execution may result in spurious variations in the
output bitmap. This is usually very self-limiting, so I haven't put
any effort into fixing it.

== The fork server ==

The afl-fuzz process runs the target binary in a fairly graceful
single-core model where the fuzzer always sleeps while the child is
running and vice versa; I'm trying to preserve this model to permit
effortless parallelization.

To avoid the significant overhead of execve(), we use a nice fork
server model devised by Jann Horn:

http://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html

In essence, the instrumentation stops the program at main() and then
simply calls fork() to create running copy-on-write instances on
demand. This is very fast everywhere except for MacOS X, because the
underlying behavior of fork() on that platform seems to be stuck in
the early 90s (huh). Apparently, running afl-fuzz in a Linux VM on
MacOS X is a lot faster than running it natively.
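
A rough sketch of what that shim does - the fd numbers match afl's
config.h, but everything else is simplified from the design described
in the post above:

  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>

  #define FORKSRV_FD 198              /* control pipe; status pipe is +1 */

  static void fork_server(void) {

    int status = 0;

    /* Phone home; if the descriptors aren't there, just run normally. */
    if (write(FORKSRV_FD + 1, &status, 4) != 4) return;

    while (1) {

      pid_t child;

      /* Block until the fuzzer requests another run. */
      if (read(FORKSRV_FD, &status, 4) != 4) _exit(1);

      child = fork();
      if (child < 0) _exit(1);
      if (!child) return;             /* child falls through into main() */

      /* Report the PID (assumed 32-bit), wait, report the exit status. */
      if (write(FORKSRV_FD + 1, &child, 4) != 4) _exit(1);
      if (waitpid(child, &status, 0) < 0) _exit(1);
      if (write(FORKSRV_FD + 1, &status, 4) != 4) _exit(1);

    }

  }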

== The main loop ==

In principle, the fuzzer follows a simple algorithm mentioned in the
README, consisting of several steps (a compact sketch in C-flavored
pseudocode follows the list):

1) Grabbing the next file from the input queue,

2) If not already done for this file, sequentially removing
variable-length blocks from the input while testing if the recorded
instrumentation output changes in any way. If any portions can be
removed without affecting the recorded path, the old file is replaced
with a new, smaller input.

This is not a substitute for starting with small files, but in many
cases, can make the target program faster and improve the odds of
actually hitting something interesting when making random tweaks (also
see docs/perf_tips.txt).

3) Mutating the file using a bunch of common-sense strategies,
starting with a deterministic pass (e.g., sequential bit flips) that
takes time proportional to the size of the file, and then continuing
with random, stacked tweaks that run for a roughly fixed number of
cycles (well, that's a lie - see calculate_score).

One nice thing about afl-fuzz is that I could select the strategies
for optimal yields without having to guess too much:

http://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html

The mutation strategies used here are considerably more sophisticated
than those of tools such as zzuf or honggfuzz, but generally don't try
to reason about the structure of the underlying data - nothing similar
to radamsa or custom template-based tools.

When any of the mutations results in an execution trace that shows a
new state transition (i.e., a previously-unseen byte in the map is
set) or a noteworthy change to the coarse hit count, the mutated file
is added at the end of the queue. Changes to the execution path that
don't trigger new state transitions, and merely alter the ordering or
sequence of existing ones, are not taken into account.

4) Recomputing a small set of "favored" inputs that offer
(approximately) the smallest and fastest-executing subset of all input
files generated so far while including every state transition seen so
far. This is discussed in some additional detail here:

https://groups.google.com/forum/#!topic/afl-users/_4PAO2aZCes

Only the favored inputs are guaranteed to be fuzzed in every queue
cycle. The paths that aren't favored are skipped 99% of the time when
there are not-yet-fuzzed favored entries ahead of them in the queue;
or 90% of the time when there's nothing urgent in the pipeline.

(If you want to do something similar for an existing data set, check
out experimental/minimization_script/).

5) Running all new queue entries from other synced fuzzers (when
running in sync mode) and importing them if they result in any locally
new state transitions being seen (see docs/parallel_fuzzing.txt).

6) If at the end of the queue, increasing the cycle counter and
restarting from the first file on the list.
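
The compact sketch promised above (the function names are made up for
readability; the real logic in afl-fuzz.c is considerably hairier):

  while (1) {

    entry = next_queue_entry();                     /* step 1 */

    if (!entry->trimmed) trim_to_same_path(entry);  /* step 2 */

    fuzz_deterministic(entry);                      /* step 3 */
    fuzz_random_stacked(entry);

    recompute_favored();                            /* step 4 */
    sync_with_other_fuzzers();                      /* step 5 */

    if (end_of_queue()) {                           /* step 6 */
      queue_cycle++;
      rewind_to_first_entry();
    }

  }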

== The exit criteria ==

The fuzzer never stops on its own, but there are three significant
milestones in the fuzzing process:

1) The completion of the first (and usually longest) queue cycle. This
is essentially the point where the fuzzer has completed deterministic
steps and some modest amount of random fuzzing on all the recursively
discovered favored paths.

2) Completion of deterministic steps on all input paths - "pending" on
the status screen reads "0". This usually takes several dozen queue
cycles and usually indicates fairly thorough coverage.

3) The point where no new paths are seen for a week or so, suggesting
that there probably isn't much more to be found (perhaps unless you
parallelize).

== Crash and hang deduplication ==

Hangs and crashes are deemed "unique" if their traces show branches
that have not been seen in any previously-recorded fault. This is a
compromise solution somewhat similar to looking at the backtrace in
gdb and marking crashes as interesting if there is any new function
seen in the call path.
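
A sketch of that check, with hypothetical names: keep a cumulative
bitmap of branches seen in earlier faults, and call a new fault unique
only if its trace clears fresh bits:

  /* MAP_SIZE is the 64 kB map size; virgin_crash starts out all-ones. */

  static int crash_is_unique(unsigned char* trace,
                             unsigned char* virgin_crash) {

    int i, new_bits = 0;

    for (i = 0; i < MAP_SIZE; i++)
      if (trace[i] & virgin_crash[i]) {
        virgin_crash[i] &= ~trace[i];  /* remember these branches */
        new_bits = 1;
      }

    return new_bits;

  }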

The alternative of looking at the faulting instruction or the last
instrumentation data point is problematic, because it can group
together unrelated faults simply because they all happen in a common
function such as strcpy() or free(). What may be obvious to a human
being is fairly hard to decide programmatically, so we err on the side
of verbosity.

The "new branch" approach can lead to some count inflation early in
the game, but the effect should quickly taper off.

== Session resume ==

Sessions can be seamlessly resumed by using the old output directory
(-o) as the new input (-i). The fuzzer loses the current position in
the queue cycle and will start over from the first file, but it does
remember which files have gone through deterministic testing, speeding
things up.
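
For example, with hypothetical directory names:

  ./afl-fuzz -i prev_out -o new_out [...old_params...]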

== The -C mode ==

The feature is described here:

http://lcamtuf.blogspot.com/2014/11/afl-fuzz-crash-exploration-mode.html

It essentially works in a manner identical to the normal mode, except
that mutations are rejected if they no longer cause a crash. I found
this to be extremely useful for evaluating the impact of crashes
without digging into the codebase.
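
For example, pointing -i at a directory of previously found crashes
(directory names hypothetical):

  ./afl-fuzz -C -i found_crashes -o explore_out [...old_params...]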

/mz

Thomas Jarosch

Dec 18, 2014, 5:51:35 AM
to afl-...@googlegroups.com
On Wednesday, 17. December 2014 22:11:44 Michal Zalewski wrote:
> Enjoy!

thanks for the write-up :) It's nice to see the
instrumentation process documented in such detail.

One thing I was wondering about yesterday: With all the intelligent
fuzzing strategies that afl applies, it's too bad we only detect
hangs or crashes.

It would be nice to detect the use of uninitialized memory.
Maybe some kind of very slow valgrind mode? Just thinking out loud here.

Related project:
https://code.google.com/p/memory-sanitizer/

LLVM only though.

[when to stop fuzzing]
> The fuzzer never stops on its own, but there are three significant
> milestones in the fuzzing process:
>
> ...
>
> 3) The point where no new paths are seen for a week or so, suggesting
> that there probably isn't much more to be found (perhaps unless you
> parallelize)

should we provide a hint to the user about this?

[session resume]
> Sessions can be seamlessly resumed by using the old output directory
> (-o) as the new input (-i). The fuzzer loses the current position in
> the queue cycle and will start over from the first file, but it does
> remember which files have gone through deterministic testing, speeding
> things up.

does that also work more or less well in parallel fuzzing mode?

foreach FUZZER
- move "output" files to "input"
- start fuzzer again

?

Cheers,
Thomas

Michal Zalewski

Dec 18, 2014, 10:21:49 AM
to afl-users
> It would be nice to detect the use of uninitialized memory.
> Maybe some kind of very slow valgrind mode? Just thinking out loud here.
>
> Related project:
> https://code.google.com/p/memory-sanitizer/
>
> LLVM only though.

Both MSAN and ASAN should be supported by afl, although with some
caveats (see docs/notes_for_asan.txt) :-)

>> 3) The point where no new paths are seen for a week or so, suggesting
>> that there probably isn't much more to be found (perhaps unless you
>> parallelize)
>
> should we provide a hint to the user about this?

The cycle counter is color-coded to provide a somewhat similar signal
(it starts yellow and goes green after a while).

>> Sessions can be seamlessly resumed by using the old output directory
>> (-o) as the new input (-i). The fuzzer loses the current position in
>> the queue cycle and will start over from the first file, but it does
>> remember which files have gone through deterministic testing, speeding
>> things up.
>
> does that also work more-or-less good in parallel fuzzing mode?
>
> foreach FUZZER
> - move "output" files to "input"
> - start fuzzer again

Yeah, this should work OK.

/mz

Thomas Jarosch

Dec 18, 2014, 10:45:02 AM
to afl-...@googlegroups.com
On Thursday, 18. December 2014 07:21:28 Michal Zalewski wrote:
> > It would be nice to detect the use of uninitialized memory.
> > Maybe some kind of very slow valgrind mode? Just thinking out loud
> > here.
> >
> > Related project:
> > https://code.google.com/p/memory-sanitizer/
> >
> > LLVM only though.
>
> Both MSAN and ASAN should be supported by afl, although with some
> caveats (see docs/notes_for_asan.txt) :-)

ah, cool, I wasn't aware of those acronyms. That's why it didn't ring a bell
while reading the README of afl. Searching for ASAN brings up
"Autistic Self Advocacy Network" as the first hit...

How does afl handle errors detected by ASAN?
Do they count as "crashes"? Or none at all?

Thanks,
Thomas

Michal Zalewski

Dec 18, 2014, 10:46:36 AM
to afl-users
> How does afl handle errors detected by ASAN?
> Do they count as "crashes"? Or none at all?

Yeah, they should end up in the crash bucket.

/mz

Jakub Wilk

Dec 18, 2014, 1:31:06 PM
to afl-...@googlegroups.com
* Michal Zalewski <lca...@gmail.com>, 2014-12-17, 22:11:
>== Session resume ==
>
>Sessions can be seamlessly resumed by using the old output directory
>(-o) as the new input (-i). The fuzzer loses the current position in
>the queue cycle and will start over from the first file, but it does
>remember which files have gone through deterministic testing, speeding
>things up.

This is cool, but it would be even better if there was an option to
resume in-place. Having to invent a name for the new output directory
every time you want to resume gets tedious after a while...

--
Jakub Wilk

Michal Zalewski

Dec 18, 2014, 2:22:28 PM
to afl-users
> This is cool, but it would be even better if there was an option to resume
> in-place. Having to invent a name for the new output directory every time
> you want to resume gets tedious after a while...

Yeah. On the flip side, there's some ambiguity around user intent.
Especially early on, users probably just want to restart (e.g.,
because they made some adjustments to the command line, added better
test cases, etc), and resuming from previous state may lead to bad
outcomes. Later in the game, they will probably want to resume, and
losing state would make them unhappy.

Not sure how to solve that. Maybe an explicit --resume option, but not
sure if adding and removing it is substantially less painful than rm +
mv.

/mz

Jakub Wilk

Dec 18, 2014, 3:17:37 PM
to afl-...@googlegroups.com
* Michal Zalewski <lca...@gmail.com>, 2014-12-18, 11:22:
>>This is cool, but it would be even better if there was an option to
>>resume in-place. Having to invent a name for the new output directory
>>every time you want to resume gets tedious after a while...
>
>Yeah. On the flip side, there's some ambiguity around user intent.
>Especially early on, users probably just want to restart (e.g., because
>they made some adjustments to the command line, added better test
>cases, etc), and resuming from previous state may lead to bad outcomes.

Yup, I typically have to restart afl-fuzz a few times before I get
everything right.

>Later in the game, they will probably want to resume, and losing state
>would make them unhappy.
>
>Not sure how to solve that. Maybe an explicit --resume option,

If you could add --resume, I would be so happy. :-)

>but not sure if adding and removing it is substantially less painful
>than rm + mv.

It would be substantially less painful for me. OMMV.

--
Jakub Wilk

Alexander Cherepanov

Dec 18, 2014, 3:25:19 PM
to afl-...@googlegroups.com
On 2014-12-18 22:22, Michal Zalewski wrote:
> Not sure how to solve that. Maybe an explicit --resume option, but not
> sure if adding and removing it is substantially less painful than rm +
> mv.

I would vote for explicit --resume!

--
Alexander Cherepanov

Michal Zalewski

Dec 18, 2014, 3:42:54 PM
to afl-users
> I would vote for explicit --resume!

Fine, fine, you win =) I need to add some code for that (for example,
to preserve old crashes and hangs). Resuming plot data would be a
major hurdle, though.

Michal Zalewski

Dec 19, 2014, 4:13:52 AM
to afl-users
>> I would vote for explicit --resume!
> Fine, fine, you win =)

This should now be implemented, more or less. More precisely, you
should be able to do an in-place resume via:

./afl-fuzz -i- -o old_output_dir [...old_params...]

I haven't tested extensively, so bug reports welcome. Your old crashes
and hangs will be backed up in a timestamped subdirectory. Crash
counts and queue position are not preserved; neither are timing stats
or plots. Resumption merely reuses the old output queue and loads the
"went through deterministic fuzzing" metadata.

/mz

Alexander Cherepanov

Dec 19, 2014, 10:06:27 AM
to afl-...@googlegroups.com
On 19.12.2014 12:13, Michal Zalewski wrote:
>>> I would vote for explicit --resume!
>> Fine, fine, you win =)
>
> This should be now implemented, more or less.

Cool, thanks!

> More precisely, you
> should be able to do in-place resume via:
>
> ./afl-fuzz -i- -o old_output_dir [...old_params...]

Sorry, I haven't tested it yet, but I have some remarks regarding this option.

The syntax "-i-" doesn't seem very intuitive because "-" is usually used
to mean stdin but before discussing it further...

I have some ideas for input selection:
- to accept a name of a single input file after -i;
- to be able to start without any input files at all. Right now I
usually do `printf "\0" > in/null`;
- to accept a file with file names instead of a directory with files;
- to recursively process input directory.

The first two ideas seem quite natural. The last two are more related
to my current workflow, and I'm not sure how useful they are.
(Generally, it's easier to copy a file list than a directory with
files, hardlinking is not always convenient, etc.)

Once the desired features are determined, the syntax for the options
can be decided, including whether stdin is acceptable anywhere.

--
Alexander Cherepanov

Michal Zalewski

Dec 19, 2014, 1:10:09 PM
to afl-users
> The syntax "-i-" doesn't seem very intuitive because "-" is usually used to
> mean stdin but before discussing it further...

Yeah, but meh, it's just a short notation for "none". I thought about
@, but we're already using it for something else; or "+", but not sure
if this is objectively less confusing. Other punctuation probably
makes less sense.

> - to accept a name of a single input file after -i;

Should be easy.

> - to be able to start without any input files at all. Right now I usually
> do `printf "\0" > in/null`;

What would be the use case? You generally need at least one viable
file... otherwise, you'd be at best wasting a ton of CPU time to reach
viable syntax, or at worst, never construct one (e.g.,
synthesizing PNGs is pretty unlikely).

> - to accept a file with file names instead of a directory with files;
> - to recursively process input directory.

That's some extra code, so I'd like to get a better sense of how
useful it would be; in what circumstances do you need to pull in files
from random locations in the filesystem or from complex directory
trees?

Note that the general philosophy is "less is better", i.e., you
probably want to start with a small number of carefully selected input
cases that genuinely do something unique, not a large data dump.

Cheers,
/mz

Alexander Cherepanov

Dec 19, 2014, 6:30:45 PM
to afl-...@googlegroups.com
On 19.12.2014 21:09, Michal Zalewski wrote:
>> The syntax "-i-" doesn't seem very intuitive because "-" is usually used to
>> mean stdin but before discussing it further...
>
> Yeah, but meh, it's just a short notation for "none". I thought about
> @, but we're already using it for something else; or "+", but not sure
> if this is objectively less confusing. Other punctuation probably
> makes less sense.

IMHO a separate option like -r would be better -- it's easier to
add/remove when editing the command line. With -r used, -i will be
ignored, so it's not a very elegant solution, but probably convenient
enough.

>> - to be able to start without any input files at all. Right now I usually
>> do `printf "\0" > in/null`;
>
> What would be the use case?

Pulling samples out of thin air:-)

> You generally need at least one viable
> file... otherwise, you'd be at best wasting a ton of CPU time to reach
> viable syntax, or at worst, will never construct one (e.g.,
> synthesizing PNGs is pretty unlikely).

Sure, it will not work in more complex cases, but it should work in
simpler ones. I don't have enough experience with afl to say how
widespread the simple cases are, but I'm definitely going to experiment
with it.

>> - to accept a file with file names instead of a directory with files;
>> - to recursively process input directory.
>
> That's some extra code, so I'd like to get a better sense of how
> useful it would be; in what circumstances do you need to pull in files
> from random locations in the filesystem

Consider minimized corpora. You have a complex directory tree containing
all kinds of samples of some type (e.g. elf) and you want to fuzz
several programs (e.g. readelf/objdump from binutils/elfutils/...).
minimize_corpus.sh is relatively fast, so you can generate a specific
corpus for each program before each fuzzing round. How would you store
info about the minimized sets of samples? Bonus points if it's easy to
save this info after the fuzzing round is over.

The solution in the spirit of afl is to create a directory near the
original directory for each corpus and hardlink the corresponding
samples there. For me, it was more convenient to store the info in a
file in a dir corresponding to the current fuzzing round.

It's not a big deal though.

> or from complex directory trees?

I mentioned this mainly for completeness:-)

> Note that the general philosophy is "less is better", i.e., you
> probably want to start with a small number of carefully selected input
> cases that genuinely do something unique, not a large data dump.

As I said elsewhere, perhaps afl-fuzz should automatically carefully
select such cases from a large data dump?

--
Alexander Cherepanov

Jakub Wilk

Dec 19, 2014, 6:45:55 PM
to afl-...@googlegroups.com
* Alexander Cherepanov <cher...@mccme.ru>, 2014-12-20, 02:30:
>>>The syntax "-i-" doesn't seem very intuitive because "-" is usually
>>>used to mean stdin but before discussing it further...
>>
>>Yeah, but meh, it's just a short notation for "none". I thought about
>>@, but we're already using it for something else; or "+", but not sure
>>if this is objectively less confusing. Other punctuation probably
>>makes less sense.
>
>IMHO a separate option like -r would be better -- it's easier to
>add/remove when editing the command line. With -r used, -i will be
>ignored, so it's not a very elegant solution, but probably convenient
>enough.

+1

--
Jakub Wilk

Michal Zalewski

Dec 19, 2014, 6:47:42 PM
to afl-users
> IMHO a separate option like -r would be better -- it's easier to add/remove
> when editing the command line. With -r used, -i will be ignored, so it's not
> a very elegant solution, but probably convenient enough.

The clarity argument cuts both ways - I thought of --resume first, but
then didn't like the ambiguity when I started implementing it. For
example, what should happen if the value of -i has changed, or there
are new files in the input directory? What would people expect?

>> What would be the use case?
> Pulling samples out of thin air:-)

Hmm... it's cool, but it's not a mainstream use case, so I wouldn't
want to keep piling on command-line options that people are more
likely to misuse than really need (and when really needed, you can
just do -i testcases/other/hello/ or create your own file in a matter
of seconds).

The reality is that most folks won't read README, and many of them
don't have prior experience with fuzzing. That's fine - the tool tries
to make it easy to hit the ground running - but it also means being
very conservative with the number of knobs we offer.

>> [...loading files from random locations...]
> Consider minimized corpora. You have a complex directory tree containing all
> kinds of samples of some type (e.g. elf) and you want to fuzz several
> programs (e.g. readelf/objdump from binutils/elfutils/...).
> minimize_corpus.sh is relatively fast so you can generate a specific corpus
> for each program before each fuzz round.

Right, and minimize_corpus.sh generates a single directory for you,
right? Or do you mean that you'd rather have it conserve disk space by
just listing files, but not making copies of them?

That's probably a fair request, although TBH, the corpus is usually
just a couple of megs, so it's not a huge saving.

> As I said elsewhere, perhaps afl-fuzz should automatically carefully select
> such cases from a large data dump?

It does, internally. But it's not a substitute for choosing a good set
of starting files, and it makes some trade-offs for performance
reasons, so the output is not exposed externally. That's why
minimize_corpus.sh exists (and why it's much slower, too).

Cheers,
/mz

Alexander Cherepanov

Dec 21, 2014, 8:16:20 PM
to afl-...@googlegroups.com
On 2014-12-20 02:47, Michal Zalewski wrote:
>> IMHO a separate option like -r would be better -- it's easier to add/remove
>> when editing the command line. With -r used, -i will be ignored, so it's not
>> a very elegant solution, but probably convenient enough.
>
> The clarity argument cuts both ways -

I didn't talk about clarity, only about convenience.

> I thought of --resume first, but
> then didn't like the ambiguity when I started implementing it. For
> example, what should happen if the value of -i has changed, or there
> are new files in the input directory? What would people expect?

I thought about ignoring -i completely when -r is given. But it could be
nice to be able to add more samples during fuzzing (through stop/restart).

>>> What would be the use case?
>> Pulling samples out of thin air:-)
>
> Hmm... it's cool,

Yes!

> but it's not a mainstream use case, so I wouldn't
> want to keep piling on command-line options that people are more
> likely to misuse than really need (and when really needed, you can
> just do -i testcases/other/hello/ or create your own file in a matter
> of seconds).

Ok

BTW, when there is no long signature blocking progress, wouldn't an
empty starting file lead to smaller samples and hence to more effective
fuzzing?

> The reality is that most folks won't read README, and many of them
> don't have prior experience with fuzzing. That's fine - the tool tries
> to make it easy to hit the ground running - but it also means being
> very conservative with the number of knobs we offer.

Ok, let's mark this as (*).

>>> [...loading files from random locations...]
>> Consider minimized corpora. You have a complex directory tree containing all
>> kinds of samples of some type (e.g. elf) and you want to fuzz several
>> programs (e.g. readelf/objdump from binutils/elfutils/...).
>> minimize_corpus.sh is relatively fast so you can generate a specific corpus
>> for each program before each fuzz round.
>
> Right, and minimize_corpus.sh generates a single directory for you,
> right? Or do you mean that you'd rather have it conserve disk space by
> just listing files, but not making copies of them?

Yes, that's what I do now.

> That's probably a fair request, although TBH, the corpus is usually
> just a couple of megs, so it's not a huge saving.

The last minimized corpus for multiarch objdump from binutils was 42MB
with 634 files. But I haven't yet migrated to the latest minimize_corpus.sh.

>> As I said elsewhere, perhaps afl-fuzz should automatically carefully select
>> such cases from a large data dump?
>
> It does, internally. But it's not a substitute for choosing a good set
> of starting files, and it makes some trade-offs for performance
> reasons, so the output is not exposed externally. That's why
> minimize_corpus.sh exists (and why it's much slower, too).

Sorry, I don't get it. I thought that input samples were not minimized
in afl-fuzz in order to permit a user to force fuzzing with specific
samples. But it seems that's not a good idea, and it's not the case
anyway. So I don't understand why the corpus isn't minimized when afl
starts. You already perform a dry run of all input samples, and in my
experience with minimize_corpus.sh, that is the bigger part of the
process.

And if you implemented the minimization process in C, it would be
faster and could use a more elaborate algo (which is not easy in bash).
I was going to play with it but haven't yet managed to get to it.

Anyway, I don't see how you can expect a user to use minimize_corpus.sh
if you are serious about (*). BTW, another blocker is big files. If you
really want your fuzzer to be usable by novices, IMHO you should either
ignore files bigger than the limit or truncate them. An error message
written in white-on-white is not very helpful:-)

--
Alexander Cherepanov

Sami Liedes

Dec 21, 2014, 8:54:54 PM
to afl-...@googlegroups.com
On Mon, Dec 22, 2014 at 04:16:18AM +0300, Alexander Cherepanov wrote:
> And if you implemented the minimization process in C, it would be faster
> and could use a more elaborate algo (which is not easy in bash). I was
> going to play with it but haven't yet managed to get to it.

I have something in C++ (and in Python, but it didn't turn out to be
as fast as I'd have liked, though fast enough to be usable). I
minimized 30k crashing inputs for Clang and got bored at tuple 2/520k
when every tuple required grepping through a 50G file with the shell
script.

The C++ thing seems to work, I think; currently it replaces the bash
script from the "Sorting trace sets..." part down, i.e. it assumes
that there already is a .traces directory with the traces (however it
doesn't need the .lookup or .all files). Caveat: This was hacked
together in the last couple of hours and tested once on one input. I
*think* it shouldn't delete your home directory or anything like that,
but no guarantees. It does minimize my 30.5k clang inputs with the
traces already generated in 7 minutes though.

http://sliedes.kapsi.fi/afl/min-traces.cpp

If for some reason you are more interested in a nearly-complete and
even slightly less tested Python script, here's one such; there's code
missing from the exec_showmap function (since I didn't want to
regenerate the traces - that takes long enough for 30k Clang c++
inputs), and you need to uncomment the code that does rm -rf to
.traces, mkdirs the output dir and copies the files to the output dir.

http://sliedes.kapsi.fi/afl/minimize_corpus.py

Note: I wouldn't have published these yet were it not for the interest
on the mailing list. So really, these may be suitable for a starting
point, but don't take them for any kind of polished gems...

Sami

Michal Zalewski

Dec 21, 2014, 9:47:32 PM
to afl-users
> BTW, when there is no long signature blocking progress, wouldn't an empty
> starting file lead to smaller samples and hence to more effective fuzzing?

The built-in trimmer reduces the size if it believes that it makes no
difference based on instrumentation output. For the most part, there's
no substantial benefit with \0 versus "hello".
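
Roughly, with hypothetical helpers, comparing paths via a checksum of
the trace map:

  #include <string.h>

  static void trim(unsigned char* buf, int* len) {

    unsigned int orig = run_and_checksum(buf, *len);   /* hypothetical */
    int block = *len / 16;    /* afl actually steps through several sizes */
    int pos = 0;

    while (block && pos + block <= *len) {

      /* Hypothetical helper: checksum of the trace with the block gone. */
      if (checksum_without_block(buf, *len, pos, block) == orig) {
        memmove(buf + pos, buf + pos + block, *len - pos - block);
        *len -= block;        /* path unchanged - keep the smaller input */
      } else pos += block;

    }

  }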

> Sorry, I don't get it. I thought that input samples were not minimized in
> afl-fuzz in order to permit a user to force fuzzing with specific samples.
> But it seems that's not a good idea, and it's not the case anyway. So I
> don't understand why the corpus isn't minimized when afl starts. You
> already perform a dry run of all input samples, and in my experience with
> minimize_corpus.sh, that is the bigger part of the process.

Several reasons for this. One is that, as mentioned, the internal
minimization algorithm works a bit differently. The other is just
trying to strike a balance between completely ignoring the user's
wishes and trying to provide a good experience for people who ignore
the README and just dump every file they could find into the input
directory.

> And if you implemented the minimization process in C, it would be faster
> and could use a more elaborate algo (which is not easy in bash). I was
> going to play with it but haven't yet managed to get to it.

Oh yeah, minimize_corpus.sh definitely should get rewritten in C, but
that's a separate topic =) If anyone wants to give it a try,
afl-showmap.c is a pretty good skeleton for this.

> Anyway, I don't see how you can expect a user to use minimize_corpus.sh if
> you are serious about (*).

That's why we pay 10x-100x more attention to the corpus "minimized" at
runtime. So, the impact of users starting with a bad corpus is not too
bad, but if somebody is actually selecting seemingly-similar test
cases on purpose, we still give them some non-trivial amount of air
time. There are also some startup warnings to warn people that they
may be Doing It Wrong (tm).

> BTW another blocker is big files. If you really want your fuzzer to be
> usable by novices, IMHO you should either ignore files bigger than the
> limit or truncate them.

The reason I kept a hard error here is essentially to have a
last-resort warning along the lines of "you are doing it *really*,
*horribly* wrong and need to read the documentation now". We can
correct / work around most problems, but when we're presented with a
10 MB video or so, there aren't that many options but to stop.

/mz

Jakub Wilk

Jan 19, 2015, 6:07:23 AM
to afl-...@googlegroups.com
* Michal Zalewski <lca...@gmail.com>, 2014-12-19, 01:13:
>./afl-fuzz -i- -o old_output_dir [...old_params...]
>
>I haven't tested extensively, so bug reports welcome. Your old crashes
>and hangs will be backed up in a timestamped subdirectory. Crash counts
>and queue position are not preserved; neither are timing stats or
>plots. Resumption merely reuses the old output queue and loads the
>"went through deterministic fuzzing" metadata.

Perhaps it's only me having bad luck all the time, but AFAICS the resume
feature doesn't really help much with completing the first cycle.

What I'm seeing is that afl-fuzz is busy deterministically processing
paths near the beginning of the queue, instead of quickly progressing to
the place it was interrupted last time.

Part of the problem is that non-favored paths are chosen at random for
deterministic processing; and if you resume then most likely a different
set will be chosen than the last time.

But I also see favored paths being deterministically processed,
even though they are early in the queue, which is more baffling.

--
Jakub Wilk

Michal Zalewski

Jan 19, 2015, 12:55:09 PM
to afl-users
> Perhaps it's only me having bad luck all the time, but AFAICS the resume
> feature doesn't really help much with completing the first cycle.

It should be a lot faster than starting over, although yeah, it does
start from the beginning, so there's a 1% chance of processing any
non-favored entries at random - shouldn't take very long, but probably
gives the wrong impression.

The set of favored paths may also differ *a bit* since later additions
may evict some of the early-stage favored paths and result in a
somewhat different set being selected; there should be significant
overlap, but you may end up seeing some changes.

I'll add code to start from the last processed offset, would be fairly
trivial and generally makes sense =)

/mz

Michal Zalewski

Jan 19, 2015, 5:16:18 PM
to afl-users
> I'll add code to start from the last processed offset, would be fairly
> trivial and generally makes sense =)

Should be done now (along with the red CPU thing).

Jakub Wilk

Jan 19, 2015, 7:18:33 PM
to afl-...@googlegroups.com
* Michal Zalewski <lca...@gmail.com>, 2015-01-19, 14:15:
>>I'll add code to start from the last processed offset, would be fairly
>>trivial and generally makes sense =)
>Should be done now

Thanks! It seems to work, but:

1) GCC says:

afl-fuzz.c: In function ‘find_start_position’:
afl-fuzz.c:2677:11: warning: variable ‘i’ set but not used [-Wunused-but-set-variable]

2) afl-fuzz doesn't save fuzzer_stats on abort, so it might be slightly
outdated. Not a big deal, but you might want to fix it. :-)

--
Jakub Wilk

Michal Zalewski

Jan 19, 2015, 7:45:43 PM
to afl-users
> afl-fuzz.c: In function ‘find_start_position’:
> afl-fuzz.c:2677:11: warning: variable ‘i’ set but not used
> [-Wunused-but-set-variable]

Grr. For the record, the alternative is:

warning: ignoring return value of 'read', declared with attribute
warn_unused_result [-Wunused-result]

...and __attribute__((used)) doesn't work. I hate compilers.

/mz

Ilfak Guilfanov

Jan 19, 2015, 8:01:26 PM
to afl-...@googlegroups.com
Hi Michal,

JFYI, one way of suppressing such a warning is:

/// Macro to avoid the message 'Parameter x is never used'
#define qnotused(x) (void)x

It works quite well.
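
For instance, a hypothetical handler with a deliberately unused
parameter:

  static void on_sigchld(int sig) {
    qnotused(sig);            /* silences the unused-parameter warning */
    /* ...handler body that never touches sig... */
  }
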
--
Best regards,
Ilfak Guilfanov
Hex-Rays SA: state of the art binary software analysis tools
tel: +32-4-2224300, fax: +32-4-2235600

Michal Zalewski

Jan 19, 2015, 8:02:11 PM
to afl-users
> #define qnotused(x) (void)x

Neat, thanks.

/mz

Michal Zalewski

Jan 20, 2015, 12:26:55 AM
to afl-users
There was a fairly nasty bug in the 1.16b release - sorry (in essence,
the fuzzer would forever ignore any test cases below the resumption
offset, instead of skipping them just once). This is now fixed.

/mz

Jakub Wilk

Jan 20, 2015, 5:01:51 AM
to afl-...@googlegroups.com
* Michal Zalewski <lca...@gmail.com>, 2015-01-19, 21:26:
>There was a fairly nasty bug in the 1.16b release - sorry (in essence,
>the fuzzer would forever ignore any test cases below the resumption
>offset, instead of skipping them just once). This is now fixed.

With 1.16b I saw:

total paths : 821
now processing : 848* (103.29%)

Was it a manifestation of the aforementioned bug, or something else?

--
Jakub Wilk

Michal Zalewski

Jan 20, 2015, 10:23:01 AM
to afl-users
> now processing : 848* (103.29%)
> Was it a manifestation of the aforementioned bug, or something else?

Yup, same thing.

/mz

Dr S3curity

Jan 28, 2015, 11:32:01 AM
to afl-...@googlegroups.com
You have developed a gcc plugin for instrumentation, right?
Are you using GIMPLE?

Thanks


Michal Zalewski

Jan 28, 2015, 11:41:35 AM
to afl-users
> You have developed a gcc plugin for instrumentation, right?

Nope. There are two key reasons: -fplugin= isn't supported on many
still-popular versions of GCC (including OpenBSD); and it wouldn't
work with clang.

Instead, we're inserting ourselves as an "assembler" that is called by
gcc or clang, and then we call real GNU as when we're done making
changes to the assembly file. This has some downsides and it's
possible that a well-implemented -fplugin= mode would be faster - if
anyone is up for implementing it, I'd be happy to add this mode as an
option to afl.

/mz

Dr S3curity

Jan 28, 2015, 12:01:28 PM
to afl-...@googlegroups.com
Do you have any documentation about what you have done for the afl "assembler", etc.?

Andrew Griffiths

Jan 28, 2015, 12:09:50 PM
to afl-...@googlegroups.com
afl-as.c is heavily commented on what's required (Apple support, state changes, inline assembly) and, where appropriate, on inserting instrumentation after a conditional branch. You can see what AFL is doing behind the scenes if you do something like:

mkdir aflasm
TMPDIR=${PWD}/aflasm AFL_KEEP_ASSEMBLY=1 make 

and it will leave the instrumented assembly files behind for you to look at.



Michal Zalewski

Jan 28, 2015, 12:45:06 PM
to afl-users
Yup, there's also this for a slightly more high-level explanation of
why the injected code looks the way it does:

http://lcamtuf.coredump.cx/afl/technical_details.txt

Jakub Wilk

Jun 30, 2017, 7:25:45 PM
to afl-...@googlegroups.com
* Michal Zalewski <lca...@gmail.com>, 2015-01-19, 09:54:
>>Perhaps it's only me having bad luck all the time, but AFAICS the resume
>>feature doesn't really help much with completing the first cycle.
[...]
>I'll add code to start from the last processed offset, would be fairly trivial
>and generally makes sense =)

This was implemented in 1.16b, but AFAICS it broke in 2.20b:
write_stats_file() writes 10 spaces after "cur_path", but find_start_position()
wants exactly 7 spaces.

--
Jakub Wilk

Michal Zalewski

Jun 30, 2017, 9:58:58 PM
to afl-users
> This was implemented in 1.16b, but AFAICS it broke in 2.20b:
> write_stats_file() writes 10 spaces after "cur_path", but
> find_start_position() wants exactly 7 spaces.

Whoops, ACK.

David Van Rood

Jul 2, 2017, 3:49:01 AM
to afl-users
Thank you, I thought I was doing something incorrectly. I was running on a 128-core machine, and I had to restart a few times.

Leo Broukhis

Jul 4, 2017, 2:52:21 AM
to afl-users
On Wednesday, December 17, 2014 at 10:12:05 PM UTC-8, Michal Zalewski wrote:

> == The instrumentation ==
>
> As should be evident, the code doesn't merely track branch directions,
> but also keeps track of hit count using an 8-bit counter. It would
> have been nicer to use saturating arithmetic, but this would require
> additional instructions on x86, so the counter can sometimes overflow.

After an increment, this should require exactly one additional instruction:
subtracting 0 with borrow from the incremented location. 
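
In C terms, the intended behavior is sketched below; whether a given
compiler or afl-as would actually emit the add/sbb pair is an
assumption (note that ADD sets the carry flag on byte wrap-around,
while INC does not):

  static inline void saturating_bump(unsigned char* p) {
    unsigned char v = *p + 1;
    *p = v ? v : 255;   /* on wrap, pin at 255 instead of rolling over */
  }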

Leo
 