Question about how afl-fuzzer generates test cases


xiedi...@gmail.com

Mar 14, 2015, 8:53:25 PM
to afl-...@googlegroups.com
Hi everyone, I want to do something like guided random testing.
The issue I'm facing now is how to generate inputs for a program.
Since afl-fuzzer uses a genetic algorithm to generate test cases, I want to
reuse that part of the code. But afl-fuzz.c has over 7k lines of code, and I
am having some trouble understanding the source.
Can anyone point out the part of the code that performs test case generation?
Thanks in advance.

Michal Zalewski

Mar 14, 2015, 8:54:13 PM
to afl-users
> Can anyone point out the part of code that performs test case generation?

You may want to have a look at docs/technical_details.txt instead?

/mz

Ben Nagy

Mar 14, 2015, 11:21:48 PM
to afl-...@googlegroups.com
On Sun, Mar 15, 2015 at 8:36 AM, <xiedi...@gmail.com> wrote:
> Hi everyone, I want to do something like guided random testing.
> The issue I'm facing now is how to generate inputs for a program.

To get a more useful answer, it might be worth starting by explaining
how the way afl works is different from your understanding of the term
"guided random testing". To me, there isn't any.

> As afl-fuzzer uses genetic algorithm to generate test cases, I want to reuse
> that part of code. But afl-fuzz.c has over 7k lines of code

This is not how I'd characterise the way afl generates test cases. The
case generation code itself is more or less state-of-the-art black-box
generation. You'd get a similar experience with radamsa or one of many
random junk generators (eventually). The case generation is not what
I would call 'genetic' because there is (currently) no feedback between
HOW cases are generated and the fitness function. There is, however,
some feedback that affects which areas of a file are fuzzed and how
long is spent on each file. Additionally, because the coverage statistic
is 'masked' by the coverage bitmap, you don't get to know exactly which
tests affected which program areas, unlike in some earlier work.
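
To make the 'masking' point concrete, here is a rough Python sketch of
a fixed-size edge-hit map in the style described in
docs/technical_details.txt. It is an illustration only, not afl's code:
the block IDs are hypothetical small integers (afl assigns random
compile-time IDs), the function names are made up for this example,
and the saturating counter is a simplification of my own.

```python
MAP_SIZE = 1 << 16  # 64 kB map, the size quoted in afl's technical notes


def record_path(block_ids, map_size=MAP_SIZE):
    """Fold a sequence of basic-block IDs into a fixed-size edge-hit map."""
    bitmap = bytearray(map_size)
    prev = 0
    for cur in block_ids:
        idx = (cur ^ prev) % map_size
        bitmap[idx] = min(bitmap[idx] + 1, 255)  # saturate (sketch-only choice)
        prev = cur >> 1  # shift so that A->B and B->A use different slots
    return bitmap


def has_new_bits(bitmap, virgin):
    """Return True if this run hit a slot never seen before; update virgin."""
    new = False
    for i, hits in enumerate(bitmap):
        if hits and not virgin[i]:
            virgin[i] = 1
            new = True
    return new
```

The fuzzer only learns "this input touched a slot nobody touched
before". Distinct edges can share a slot, and nothing maps a slot back
to a specific test or program location.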

The 'genetic' part is simply the fact that more successful cases are
used as a base for subsequent blind mutation. 'Genetic' case
generation, to me, would be (for example) if/when the generator
changes the weighting it gives to certain kinds of tests based on the
overall effectiveness of that test domain (flips, dict insert,
arithmetic...). You have to be aware, though, that because of the way
the evolution works, there's a fair chance that doing this, especially
early, would lead to local maxima that would handicap you later.
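
As a minimal sketch of that idea (successful cases become the base for
further blind mutation), assuming a scalar fitness function rather than
afl's coverage map, the loop looks roughly like this. The names mutate,
fuzz and fitness are hypothetical, and seeds must be a non-empty list
of non-empty byte strings.

```python
import random


def mutate(data, rng):
    """Blind mutation: flip one random bit of a non-empty byte string."""
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


def fuzz(seeds, fitness, rounds=1000, rng=None):
    """Keep any mutant that scores higher than the best seen so far."""
    rng = rng or random.Random(0)
    queue = list(seeds)
    best = max(fitness(s) for s in queue)
    for _ in range(rounds):
        parent = rng.choice(queue)   # pick a previously successful case
        child = mutate(parent, rng)  # the mutation never looks at fitness
        score = fitness(child)
        if score > best:             # feedback only decides what is kept
            best = score
            queue.append(child)
    return queue, best
```

Note that the feedback only controls which cases survive into the
queue. It never changes how mutate() behaves, which is the distinction
drawn above.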

I also suspect that we have something of an XY problem here, so
for best results you may want to explain what you are trying to do in
more depth (and how afl doesn't do it).

Just my 0.02.

Cheers,

ben

Dingbao Xie

Mar 15, 2015, 4:43:26 PM
to afl-...@googlegroups.com, b...@iagu.net
Thank you very much for your detailed answer.
I'm new to afl-fuzzer and to fuzz testing in general.
Let me explain my application more clearly.
I instrumented the source program, and after executing a test
case the instrumented program reports a fitness value that shows
how good the test case is. I will use that information to guide
the generation of new test cases.

Currently I can randomly generate integers as test inputs, but I don't know
how to generate complex inputs such as strings or files. I'm looking for an
existing tool to do that.

Michal Zalewski

Mar 15, 2015, 4:53:34 PM
to afl-users, b...@iagu.net
> Thank you very much for your detailed answer.
> I'm a newbie to afl-fuzzer and also fuzzing testing.
> Let me explain my application more clearly.
> I instrumented the source program and after executing a test
> case the instrumented program will give a fitness value to show
> how good the test case is. I will use such information to guide
> the generation of new test cases.

How would that be different from AFL? Or from the stricter
application of GA done by Jared DeMott a long time ago [1]?

I have a reasonably decent discussion of afl-fuzz mutation strategies here:

http://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html

...and you could just as well reuse radamsa or zzuf or
anything else of that sort.
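
For string or file inputs, tools of this sort generally treat a sample
as a plain byte buffer and tweak it. A rough sketch of three of the
operator families mentioned in this thread (bit flips, arithmetic,
dictionary insertion) follows. The function names, the token list and
the +/-35 range are illustrative assumptions, not taken from any of
these tools, and the sample must be non-empty.

```python
import random


def bit_flip(buf, rng):
    """Flip one random bit in place."""
    pos = rng.randrange(len(buf))
    buf[pos] ^= 1 << rng.randrange(8)


def arith(buf, rng):
    """Add or subtract a small value to one byte, wrapping at 256."""
    pos = rng.randrange(len(buf))
    delta = rng.randint(1, 35) * rng.choice((-1, 1))
    buf[pos] = (buf[pos] + delta) % 256


def dict_insert(buf, rng, tokens=(b"<html>", b"%s", b"\x00")):
    """Insert a token from a (hypothetical) dictionary at a random offset."""
    pos = rng.randrange(len(buf) + 1)
    buf[pos:pos] = rng.choice(tokens)


def mutate(sample, rng, max_stack=4):
    """Apply between 1 and max_stack randomly chosen operators to a copy."""
    buf = bytearray(sample)
    for _ in range(rng.randint(1, max_stack)):
        rng.choice((bit_flip, arith, dict_insert))(buf, rng)
    return bytes(buf)
```

Starting from a valid sample file, something like
mutate(open("sample.bin", "rb").read(), random.Random()) would yield
one new candidate input per call (sample.bin being a placeholder name).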

/mz

[1] http://www.vdalabs.com/tools/EFS.pdf