How to ask afl to fuzz using files of a certain size?


lael.c...@gmail.com

Nov 2, 2016, 2:07:12 PM
to afl-users
I’m trying to fuzz a complex parser. As valid input is mostly grammar-based (an instruction), using a dictionary isn’t really useful.

However, in order for a file to be valid, its size must be divisible by 8.
Trying to parse a file that doesn’t follow this rule always segfaults the parser, so I’m checking this in the program before calling the parser function.

I thought this could be easily detected with instrumentation and valid test cases. However, american fuzzy lop keeps fuzzing with files whose size is less than 8 bytes, so nothing crashes.

Is there a way to tell american fuzzy lop that input files should be at least 8 bytes in size?

Michal Zalewski

Nov 2, 2016, 3:05:06 PM
to afl-users
> I’m trying to fuzz a complex parser. As valid input is mostly grammar-based
> (an instruction), using a dictionary isn’t really useful

Why? Would it be a different situation than, say, fuzzing SQL?

> Is there a way to tell american fuzzy lop that input files should be at
> least 8 bytes in size?

You can write a postprocessor to reject files that don't meet the
criteria (see experimental/). Although I would caution against
optimizing against what may be a non-issue; wasting some CPU cycles is
usually not a big deal.

/mz

lael.c...@gmail.com

Nov 2, 2016, 3:48:43 PM
to afl-users


On Wednesday, November 2, 2016 at 20:05:06 UTC+1, Michal Zalewski wrote:
>> I’m trying to fuzz a complex parser. As valid input is mostly grammar-based
>> (an instruction), using a dictionary isn’t really useful
>
> Why? Would it be a different situation than, say, fuzzing SQL?

Basically, the parser parses bytecode. An instruction is variable-length and takes between 1 and 6 bytes. An instruction consists of zero or more prefixes (not every prefix can be combined with every other), an opcode, and operands, which can have suboperands. An instruction isn’t allowed to cross an 8-byte boundary, which means nops are required for padding.
Again, everything is binary.

I don’t think SQL has 600 opcodes, or rather keywords.

This would require writing a complex finite state machine.

> You can write a postprocessor to reject files that don't meet the
> criteria (see experimental/). Although I would caution against
> optimizing against what may be a non-issue; wasting some CPU cycles is
> usually not a big deal.
>
> /mz

I do this before parsing, period:
struct stat sb;
fstat(fd, &sb);
if (sb.st_size % 8 != 0) exit(EXIT_FAILURE);
Otherwise the crash cases wouldn’t be useful (because they would be based on files with an incorrect data size).

But how do I get afl to load the generated postprocessing shared object?
Also (just out of curiosity, it has nothing to do with my use case), would this technique be useful to reject anything smaller than 32 bytes right from the beginning?

Thanks anyway.

Michal Zalewski

Nov 2, 2016, 4:29:51 PM
to afl-users
> I don’t think SQL has 600 opcodes, or rather keywords.

I don't think that what you're describing is inherently incompatible
with how AFL uses dictionaries. I'd give it a try. Its ability to put
together grammar from dictionary tokens surprises me more often than
not.
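
AFL dictionary files are plain text with one name="value" entry per line, and binary tokens can be written with \xNN escapes, so an opcode table can be turned into a dictionary mechanically. A minimal sketch, assuming a hypothetical opcode table (the struct, the op_ prefix, and the byte values used with it are invented for illustration and do not come from the parser discussed in this thread):

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical opcode table entry; names and encodings are made up. */
struct opcode {
  const char *name;            /* letters, digits, underscores only */
  const unsigned char *bytes;  /* raw encoding of the token */
  size_t len;
};

/* Format one dictionary line such as op_demo="\x0f\xa2" into out.
   Every byte is hex-escaped so binary tokens survive intact.
   Returns the number of characters written, or -1 if out is too small. */
int dict_line(char *out, size_t cap, const struct opcode *op) {
  size_t pos;
  int n = snprintf(out, cap, "op_%s=\"", op->name);

  if (n < 0 || (size_t)n >= cap) return -1;
  pos = (size_t)n;

  for (size_t i = 0; i < op->len; i++) {
    n = snprintf(out + pos, cap - pos, "\\x%02x", (unsigned)op->bytes[i]);
    if (n < 0 || (size_t)n >= cap - pos) return -1;
    pos += (size_t)n;
  }

  n = snprintf(out + pos, cap - pos, "\"");
  if (n < 0 || (size_t)n >= cap - pos) return -1;
  return (int)(pos + (size_t)n);
}
```

Writing one such line per opcode (and per prefix) to a file gives something that can be passed to afl-fuzz with -x. Whether it helps for this particular bytecode is exactly the open question in this thread.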

> But how do I get afl to load the generated postprocessing shared object?

There's a discussion of how to pull this off in experimental/post_library/.

> Also (just out of curiosity, it has nothing to do with my use case), would
> this technique be useful to reject anything smaller than 32 bytes right
> from the beginning?

Yep, see above :-) This avoids calling the target binary altogether.

/mz

Tim Newsham

Nov 2, 2016, 5:35:43 PM
to afl-users

On Wednesday, November 2, 2016 at 8:07:12 AM UTC-10, lael.c...@gmail.com wrote:
> However, in order for a file to be valid, its size must be divisible by 8.

Why don't you read in the input file, truncate it to a whole multiple of 8 bytes, and pass it to the parser?

Tim

lael.c...@gmail.com

Nov 3, 2016, 7:32:31 AM
to afl-users


On Wednesday, November 2, 2016 at 21:29:51 UTC+1, Michal Zalewski wrote:
>> I don’t think SQL has 600 opcodes, or rather keywords.
>
> I don't think that what you're describing is inherently incompatible
> with how AFL uses dictionaries. I'd give it a try. Its ability to put
> together grammar from dictionary tokens surprises me more often than
> not.

Basically, it’s as hard as trying to disassemble x86_64.

> There's a discussion of how to pull this off in experimental/post_library/.

I couldn’t find it.

> Yep, see above :-) This avoids calling the target binary altogether.

But for 32 bytes, wouldn’t this be slower than calling the binary with files smaller than that?

> /mz

lael.c...@gmail.com

Nov 3, 2016, 7:55:14 AM
to afl-users
On Wednesday, November 2, 2016 at 22:35:43 UTC+1, Tim Newsham wrote:
> Why don't you read in the input file, truncate it to a whole multiple of 8 bytes, and pass it to the parser?
>
> Tim

Because most of the time, the file size is lower than that. How can I fill in the missing bytes without introducing errors in the way afl works?
Do I take them from the original input buffer, or do I use the standard C random() function?

Where should I put the missing bytes: at the beginning or at the end of the buffer?

Chris Kerr

Nov 3, 2016, 8:32:46 AM
to afl-...@googlegroups.com
Truncate, not pad - you make it smaller until it is a multiple of 8. The extra
bytes are thrown away. If the file is smaller than 8 bytes, exit with an error.
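
The truncate-and-reject approach described above can be sketched as a small helper in the target's input-loading code. It assumes the 8-byte unit from this thread (named BUNDLE_SIZE here); truncated_len and load_bundles are hypothetical names chosen for illustration, not part of the parser being fuzzed.

```c
#include <stdio.h>
#include <stdlib.h>

#define BUNDLE_SIZE 8  /* unit size taken from the thread */

/* Round len down to a whole number of bundles; 0 means "too short". */
size_t truncated_len(size_t len) {
  return len - (len % BUNDLE_SIZE);
}

/* Read a whole file and drop the trailing partial bundle.
   Returns a malloc'd buffer (caller frees) and stores the usable length
   in *out_len, or returns NULL if the file cannot be read or holds less
   than one full bundle. */
unsigned char *load_bundles(const char *path, size_t *out_len) {
  FILE *f = fopen(path, "rb");
  unsigned char *buf = NULL;
  size_t used = 0, cap = 0, n;

  if (!f) return NULL;

  do {
    /* Grow the buffer by 4 KiB before each read. */
    unsigned char *tmp = realloc(buf, cap + 4096);
    if (!tmp) { free(buf); fclose(f); return NULL; }
    buf = tmp;
    cap += 4096;
    n = fread(buf + used, 1, cap - used, f);
    used += n;
  } while (n > 0);
  fclose(f);

  *out_len = truncated_len(used);
  if (*out_len == 0) { free(buf); return NULL; }
  return buf;
}
```

The caller would exit with an error when load_bundles returns NULL and otherwise hand the buffer and *out_len to the parser, so no padding bytes ever need to be invented.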

lael.c...@gmail.com

Nov 3, 2016, 9:43:58 AM
to afl-users
On Thursday, November 3, 2016 at 13:32:46 UTC+1, Chris Kerr wrote:
> Truncate, not pad - you make it smaller until it is a multiple of 8. The extra
> bytes are thrown away. If the file is smaller than 8 bytes, exit with an error.

I agree about truncation. But the real problem is that most of the time the file size is lower than 8 bytes. So ~35-60% of tests are rejected and aren’t useful.

IOhannes m zmölnig

Nov 3, 2016, 9:47:31 AM
to afl-...@googlegroups.com
On 2016-11-03 12:32, lael.c...@gmail.com wrote:
>> Yep, see above :-) This avoids calling the target binary altogether.
>
> But for 32 bytes, wouldn’t this be slower than calling the binary with
> files smaller than that?

Why?


Chris Kerr

Nov 3, 2016, 10:06:03 AM
to afl-...@googlegroups.com
They are useful. AFL learns that files shorter than 8 bytes aren't interesting
and will subsequently concentrate more fuzzing effort on files longer than 8
bytes.

lael.c...@gmail.com

Nov 3, 2016, 12:53:15 PM
to afl-users


On Thursday, November 3, 2016 at 15:06:03 UTC+1, Chris Kerr wrote:
> On Thursday 03 November 2016 06:43:57 lael.c...@gmail.com wrote:
>> On Thursday, November 3, 2016 at 13:32:46 UTC+1, Chris Kerr wrote:
>>> Truncate, not pad - you make it smaller until it is a multiple of 8. The
>>> extra bytes are thrown away. If the file is smaller than 8 bytes, exit
>>> with an error.
>
> They are useful. AFL learns that files shorter than 8 bytes aren't interesting
> and will subsequently concentrate more fuzzing effort on files longer than 8
> bytes.

Even after 5 hours on a system with a Core i7-6700K, most test cases are 3 or 5 bytes long.
So the problem is definitely with files that don’t have enough input, not with those that don't strictly respect the 8-byte boundary.

Remember, having exactly 16 bytes or 65536 bytes is OK too.

lael.c...@gmail.com

Nov 3, 2016, 12:54:52 PM
to afl-users, zmoe...@umlaeute.mur.at
On Thursday, November 3, 2016 at 14:47:31 UTC+1, IOhannes m zmölnig wrote:
> On 2016-11-03 12:32, lael.c...@gmail.com wrote:
> why?

Because, if I understand the documentation correctly, it would be faster to just let the test case execute?

Michal Zalewski

Nov 3, 2016, 2:06:19 PM
to afl-users
> Even after 5 hours on a system with a Core i7-6700K, most test cases are 3
> or 5 bytes long.

You mean the test cases in <out_dir>/queue/?

If these cases crash, they should not be appearing in the output dir at all.

/mz

lael.c...@gmail.com

Nov 3, 2016, 2:25:18 PM
to afl-users
On Thursday, November 3, 2016 at 19:06:19 UTC+1, Michal Zalewski wrote:
> You mean the test cases in <out_dir>/queue/?
>
> If these cases crash, they should not be appearing in the output dir at all.
>
> /mz

No, I’m talking about the size I see when I run ls -l .cur_input. (I did it 100 times.)
And in those 5 hours, no crashes were found.

Michal Zalewski

Nov 3, 2016, 2:57:36 PM
to afl-users
> No, I’m talking about the size I see when I run ls -l .cur_input. (I did
> it 100 times.)

It's probably not a particularly good metric; looking at the queue is
a better indication of what AFL may be spending time on.

> And in those 5 hours, no crashes were found.

In an earlier e-mail, you said: "However, in order for a file to be
valid, its size must be divisible by 8. Trying to parse a file that
doesn’t follow this rule always segfaults the parser."

So I'm not sure how to reconcile this?

/mz

lael.c...@gmail.com

Nov 3, 2016, 4:24:28 PM
to afl-users


On Thursday, November 3, 2016 at 19:57:36 UTC+1, Michal Zalewski wrote:
>> And in those 5 hours, no crashes were found.
>
> In an earlier e-mail, you said: "However, in order for a file to be
> valid, its size must be divisible by 8. Trying to parse a file that
> doesn’t follow this rule always segfaults the parser."
>
> So I'm not sure how to reconcile this?
>
> /mz

I also said that I do something like this in the target process:
struct stat sb;
fstat(fd, &sb);
if (sb.st_size % BUNDLE_SIZE) exit(EXIT_FAILURE);

By the way, you still haven’t explained how I can get afl to load the compiled postprocessing library.

Michal Zalewski

Nov 3, 2016, 4:30:41 PM
to afl-users
> I also said that I do something like this in the target process:
> struct stat sb;
> fstat(fd, &sb);
> if (sb.st_size % BUNDLE_SIZE) exit(EXIT_FAILURE);

In this case, the queue should be mostly files that meet your
criteria; I suggest looking at that as an indicator of what AFL is
spending its time on.

> By the way, you still haven’t explained how I can get afl to load the compiled
> postprocessing library.

I did, several e-mails ago; please see experimental/post_library/.
Specifically, see the header of post_library.so.c.

IOhannes m zmölnig

Nov 3, 2016, 4:33:06 PM
to afl-...@googlegroups.com
On 11/03/2016 09:24 PM, lael.c...@gmail.com wrote:
> By the way, you still haven’t explained how I can get afl to load the
> compiled postprocessing library.

Well, Michal said:
> You can write a postprocessor to reject files that don't meet the
> criteria (see experimental/).

so, looking at experimental/post_library/post_library.so.c, I see
> the postprocessor library is passed to afl-fuzz via AFL_POST_LIBRARY.
> The library must be compiled with:
>
> gcc -shared -Wall -O3 post_library.so.c -o post_library.so
>

which probably answers your question:

AFL_POST_LIBRARY=/path/to/foo/postlibrary.so afl-fuzz ...
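
A postprocessor for the size rule discussed in this thread might look like the sketch below. It assumes the afl_postprocess entry point and the return conventions documented in the header of experimental/post_library/post_library.so.c (return NULL to skip a test case; *len may be updated), so check that header for your AFL version before relying on it. BUNDLE_SIZE and the choice to trim unaligned inputs rather than reject them are illustrative assumptions.

```c
#include <stddef.h>

#define BUNDLE_SIZE 8u  /* size unit taken from the thread */

/* Called by afl-fuzz for each generated test case before it reaches the
   target (name and signature as in AFL's post_library.so.c example). */
const unsigned char *afl_postprocess(const unsigned char *in_buf,
                                     unsigned int *len) {
  /* Shorter than one bundle: skip it without running the target. */
  if (*len < BUNDLE_SIZE) return NULL;

  /* Drop the trailing partial bundle. The bytes themselves are not
     modified, so the original buffer is returned with a shorter length. */
  *len -= *len % BUNDLE_SIZE;
  return in_buf;
}
```

Saved as a .so.c file, this would be built with the gcc -shared command quoted above and passed to afl-fuzz through the AFL_POST_LIBRARY environment variable.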

fgmsrd
IOhannes


lael.c...@gmail.com

Nov 3, 2016, 4:33:59 PM
to afl-users


On Thursday, November 3, 2016 at 21:30:41 UTC+1, Michal Zalewski wrote:
>> I also said that I do something like this in the target process:
>> struct stat sb;
>> fstat(fd, &sb);
>> if (sb.st_size % BUNDLE_SIZE) exit(EXIT_FAILURE);
>
> In this case, the queue should be mostly files that meet your
> criteria; I suggest looking at that as an indicator of what AFL is
> spending its time on.

Afl also reported that only 14 code paths among the ~5000 my program has were explored at the end of those 5 hours.

>> By the way, you still haven’t explained how I can get afl to load the compiled
>> postprocessing library.
>
> I did, several e-mails ago; please see experimental/post_library/.
> Specifically, see the header of post_library.so.c.

The header explains how to create and compile the library, not how to get afl to dlopen it.

lael.c...@gmail.com

Nov 3, 2016, 4:36:25 PM
to afl-users, zmoe...@umlaeute.mur.at
On Thursday, November 3, 2016 at 21:33:06 UTC+1, IOhannes m zmölnig wrote:
> so, looking at experimental/post_library/post_library.so.c, I see
>> the postprocessor library is passed to afl-fuzz via AFL_POST_LIBRARY.

Thank you, I thought AFL_POST_LIBRARY was a preprocessor macro. I didn’t think it was an environment variable.

Michal Zalewski

Nov 3, 2016, 4:42:22 PM
to afl-users
>> In this case, the queue should be mostly files that meet your
>> criteria; I suggest looking at that as an indicator of what AFL is
>> spending its time on.
>
> Afl also reported that only 14 code paths among the ~5000 my program has were
> explored at the end of those 5 hours.

Can you please have a look at the queue and tell us what percentage of
test cases meet your criteria, and how many of them do you have total?
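
One way to get these numbers is to scan the queue directory and count the files whose size satisfies the rule. A rough sketch, assuming the criterion is "at least 8 bytes and a multiple of 8" and that queue entries are regular files directly inside <out_dir>/queue/; size_ok and count_queue are hypothetical helper names.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

/* Criterion assumed from the thread: at least 8 bytes, multiple of 8. */
int size_ok(long long size) {
  return size >= 8 && size % 8 == 0;
}

/* Count the regular files in dir. Returns how many satisfy size_ok() and
   stores the overall file count in *total; returns -1 if dir cannot be
   opened. */
long count_queue(const char *dir, long *total) {
  DIR *d = opendir(dir);
  struct dirent *e;
  long ok = 0;

  *total = 0;
  if (!d) return -1;

  while ((e = readdir(d)) != NULL) {
    char path[4096];
    struct stat sb;
    int n = snprintf(path, sizeof path, "%s/%s", dir, e->d_name);

    if (n < 0 || (size_t)n >= sizeof path) continue;
    /* Skip ".", "..", and subdirectories such as .state. */
    if (stat(path, &sb) != 0 || !S_ISREG(sb.st_mode)) continue;

    (*total)++;
    if (size_ok((long long)sb.st_size)) ok++;
  }
  closedir(d);
  return ok;
}
```

Dividing the return value by *total gives the percentage of queue entries that meet the criteria.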

In general, if AFL is not finding too many code paths in the program,
the more likely explanation is that it is bumping into a roadblock
unrelated to a relatively trivial size constraint. 14 paths strongly
suggests that it isn't. Perhaps it needs a dictionary, perhaps the
program quits prematurely for other reasons (memory limits, incorrect
cmdline), perhaps not all files are properly instrumented -
unfortunately, this can be a bit painful to troubleshoot.

> The header explains how to create and compile the library, not how to get afl
> to dlopen it.

[lcamtuf@raccoon afl]$ grep -B 1 AFL_POST_LIBRARY experimental/post_library/post_library.so.c
   With that out of the way: the postprocessor library is passed to afl-fuzz
   via AFL_POST_LIBRARY. The library must be compiled with:

lael.c...@gmail.com

Nov 4, 2016, 8:37:23 PM
to afl-users
On Thursday, November 3, 2016 at 21:42:22 UTC+1, Michal Zalewski wrote:
> Can you please have a look at the queue and tell us what percentage of
> test cases meet your criteria, and how many of them do you have total?

I don’t know if it’s linked to my test case, but it seems to happen when memory is limited to 600 MB and time to 2 seconds.

> In general, if AFL is not finding too many code paths in the program,
> the more likely explanation is that it is bumping into a roadblock
> unrelated to a relatively trivial size constraint. 14 paths strongly
> suggests that it isn't. Perhaps it needs a dictionary, perhaps the
> program quits prematurely for other reasons (memory limits, incorrect
> cmdline), perhaps not all files are properly instrumented -
> unfortunately, this can be a bit painful to troubleshoot.
>
> [lcamtuf@raccoon afl]$ grep -B 1 AFL_POST_LIBRARY experimental/post_library/post_library.so.c
>    With that out of the way: the postprocessor library is passed to afl-fuzz
>    via AFL_POST_LIBRARY. The library must be compiled with:

I increased the memory allocation limit to 14000 megabytes, and I got strange results when I use the -M switch in multi mode with all my CPU cores fully loaded.

The number of code paths grows to 4000, and five minutes after that, most fuzzer instances segfault (no crashes are found in the fuzzed parser).
I suppose that I can’t just attach GDB to AFL. But can hyperthreading modify timings in a way that would cause american fuzzy lop to crash itself?

Michal Zalewski

Nov 4, 2016, 8:45:58 PM
to afl-users
>> Can you please have a look at the queue and tell us what percentage of
>> test cases meet your criteria, and how many of them do you have total?
>
> I don’t know if it’s linked to my test case, but it seems to happen when
> memory is limited to 600 MB and time to 2 seconds.

Sorry, not sure I follow?

> I increased the memory allocation limit to 14000 megabytes, and I got strange
> results when I use the -M switch in multi mode with all my CPU cores fully
> loaded.
>
> The number of code paths grows to 4000, and five minutes after that, most
> fuzzer instances segfault (no crashes are found in the fuzzed parser).

I'm guessing you're running out of memory on the system, as a
consequence of trying to run multiple instances of a target binary
that seems very memory-intensive. Checking dmesg should confirm this.

> I suppose that I can’t just attach GDB to AFL. But can hyperthreading
> modify timings in a way that would cause american fuzzy lop to crash itself?

No.

/mz

lael.c...@gmail.com

Nov 4, 2016, 8:51:49 PM
to afl-users


On Saturday, November 5, 2016 at 01:45:58 UTC+1, Michal Zalewski wrote:
> Sorry, not sure I follow?

As soon as I modify the limits, the program works normally.

> I'm guessing you're running out of memory on the system, as a
> consequence of trying to run multiple instances of a target binary
> that seems very memory-intensive. Checking dmesg should confirm this.

No, the target process is not memory-intensive; the parser is written entirely in Ragel.
The dmesg output showed nothing about this.

However, I ran the 8 american fuzzy lop instances on the same tmpfs filesystem, and my RAM is not ECC.

Michal Zalewski

Nov 4, 2016, 8:54:17 PM
to afl-users
>> Sorry, not sure I follow?
> As soon as I modify the limits, the program works normally

Right, but my question was: "can you please have a look at the queue
and tell us what percentage of test cases meet your criteria, and how
many of them do you have total?"

>> I'm guessing you're running out of memory on the system, as a
>> consequence of trying to run multiple instances of a target binary
>> that seems very memory-intensive. Checking dmesg should confirm this.
>
> No, the target process is not memory intensive, the parser is completely
> written in ragel.

If it's not memory-intensive, then increasing memory limit from 600 MB
should not make a difference, but you're saying that it did.

> However, I ran the 8 american fuzzy lop instances on the same tmpfs
> filesystem, and my RAM is not ECC.

That shouldn't matter, unless your system is having some serious issues.

/mz

lael.c...@gmail.com

Nov 4, 2016, 8:57:19 PM
to afl-users
On Saturday, November 5, 2016 at 01:54:17 UTC+1, Michal Zalewski wrote:
> If it's not memory-intensive, then increasing memory limit from 600 MB
> should not make a difference, but you're saying that it did.

Decreasing it works too.

> That shouldn't matter, unless your system is having some serious issues.

I also found that the EFI was configured to decrease the CPU frequency if all the cores run at a temperature above 98℃.

> /mz