dealing with 'stateful' fuzzing

Victor Julien

Feb 11, 2017, 12:09:23 AM
to afl-...@googlegroups.com
Hi all,

When fuzzing Suricata I'm seeing crashes that I can't reproduce. I'm
using llvm/persistent mode with asan on 64bit, mem limit 0. My
hypothesis is that it's because I'm keeping some global state.

In Suricata we process network packets. Some packets (e.g. TCP) update
global state. My fuzz harness treats each file generated by AFL as a
packet. The global state is initialized and freed outside of the
'__AFL_LOOP' loop.
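
Roughly, the harness is shaped like this (a simplified sketch;
GlobalStateInit(), DecodePacket() and GlobalStateFree() stand in for
the actual Suricata calls):

/* Build with afl-clang-fast; one AFL-generated file == one packet. */
#include <stdint.h>
#include <stdio.h>

void GlobalStateInit(void);
void DecodePacket(const uint8_t *data, size_t len);
void GlobalStateFree(void);

int main(int argc, char **argv) {
    GlobalStateInit();                  /* global state set up once */

    while (__AFL_LOOP(1000)) {
        static uint8_t buf[65536];
        FILE *f = fopen(argv[1], "rb");
        if (f == NULL)
            continue;
        size_t len = fread(buf, 1, sizeof(buf), f);
        fclose(f);
        DecodePacket(buf, len);         /* feed one packet to the decoders */
    }

    GlobalStateFree();                  /* torn down once, after the loop */
    return 0;
}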

When I realized I was partly testing additions to the state, I moved the
state init/free code inside the loop and now the crashes are gone (speed
is also down by a factor of 10). This seems to confirm my hypothesis.

Now my question is: what caused the crashes in this 'stateful' mode? In
the real world Suricata takes in lots of packets too, many of which are
controlled by bad actors, so I would like to find out what these
stateful-mode crashes actually are.

I guess a couple of things could be useful:

1. crash report by AFL containing asan symbolized output. That would at
least give me a starting point.

2. AFL option to attach gdb to the first crash case it finds to allow
manual inspection of the running program

3. on my end I guess I could write to disk the input AFL has generated
and then write some logic to read it in in the same order. That would
hopefully reproduce the same conditions.

Anything I'm overlooking that already makes this possible or easier?

Thanks,
Victor

--
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------

Michal Zalewski

Feb 11, 2017, 12:21:21 AM
to afl-users
> When fuzzing Suricata I'm seeing crashes that I can't reproduce.

One thing to try is to test the input by calling the target program
10k times or so, in a simple shell loop. Some crashes are
non-deterministic because they depend on scheduler decisions or memory
randomization artifacts.

> My hypothesis is that it's because I'm keeping some global state.

It's possible, but typically less likely. That said...

> In Suricata we process network packets. Some packets (e.g. TCP) update
> global state. My fuzz harness treats each file generated by AFL as a
> packet. The global state is initialized and freed outside of the
> '__AFL_LOOP' loop.

That sounds like a problem: if you have global state that doesn't get
reset within __AFL_LOOP(), you're bound not only to have trouble
reproducing bugs, but also to get less out of the instrumentation (since
the information about that persisted state is lost to AFL). My advice
would be to reset the state thoroughly (or not use persistent mode if this
is not doable; a deferred forkserver may be a viable alternative).
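
For reference, a bare-bones deferred-forkserver setup with
afl-clang-fast would look roughly like this (ExpensiveInit() and
DecodePacket() are placeholders for your own code); everything before
__AFL_INIT() runs once, and every test case forks from that point, so
whatever the parser does to global state doesn't leak into the next
run:

#include <stdint.h>
#include <stdio.h>

void ExpensiveInit(void);
void DecodePacket(const uint8_t *data, size_t len);

int main(int argc, char **argv) {
    ExpensiveInit();            /* heavy one-time setup */

#ifdef __AFL_HAVE_MANUAL_CONTROL
    __AFL_INIT();               /* fork server starts here */
#endif

    static uint8_t buf[65536];
    FILE *f = fopen(argv[1], "rb");
    if (f == NULL)
        return 1;
    size_t len = fread(buf, 1, sizeof(buf), f);
    fclose(f);
    DecodePacket(buf, len);
    return 0;
}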

If you want to test multi-packet parsing, modifying __AFL_LOOP to read
and process several packets may be a better option.
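
For example, within a single __AFL_LOOP pass, something along these
lines (ResetState() and DecodePacket() are placeholders for whatever
your harness calls; the 2-byte length prefix is just one possible
splitting formula):

/* One pass = one "session" of several packets, carved out of a single
 * AFL-generated file via 2-byte big-endian length prefixes. */
#include <stdint.h>
#include <stdio.h>

void ResetState(void);
void DecodePacket(const uint8_t *data, size_t len);

void FuzzSession(const char *path) {
    while (__AFL_LOOP(1000)) {
        ResetState();                       /* back to a blank slate */

        static uint8_t buf[65536];
        FILE *f = fopen(path, "rb");
        if (f == NULL)
            continue;
        size_t len = fread(buf, 1, sizeof(buf), f);
        fclose(f);

        size_t off = 0;
        while (off + 2 <= len) {
            size_t plen = (buf[off] << 8) | buf[off + 1];
            off += 2;
            if (plen > len - off)
                plen = len - off;
            DecodePacket(buf + off, plen);  /* one chunk = one packet */
            off += plen;
        }
    }
}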

> I guess a couple of things could be useful:
>
> 1. crash report by AFL containing asan symbolized output. That would at
> least give me a starting point.

AFL doesn't capture stderr for the target, although you could make a
relatively simple one-off change to accommodate this. Capturing core
files or invoking an external symbolizer is highly problematic,
because it adds an unpredictable but significant delay and tends to
lead to crashes being misdiagnosed as timeouts.

In the next major version of AFL, I'm planning to add support for
re-running a crash to collect this additional data once a fault is
confirmed, but this wouldn't help in your particular case (since
re-running might not necessarily trigger the same issue).

The best piece of info you may have is whatever ended up in dmesg - at
least the faulting instruction pointer. Better than nothing, but not
great.

> 2. AFL option to attach gdb to the first crash case it finds to allow
> manual inspection of the running program

This would require fairly substantial architectural changes (moving to
ptrace()).

> 3. on my end I guess I could write to disk the input AFL has generated
> and then write some logic to read it in in the same order. That would
> hopefully reproduce the same conditions.

This seems problematic, too. Without knowing where the chain starts,
we'd have to store a ton of historical data. Generally, for better or
worse, AFL has an expectation that the state is reset to a blank
slate, and that if you want to test multi-packet handling or something
like that, it will still be implemented within a single pass of
__AFL_LOOP.

Cheers,
/mz

Chris Kerr

Feb 13, 2017, 2:40:35 AM
to afl-...@googlegroups.com
On Thursday 09 February 2017 11:28:51 Victor Julien wrote:
> Hi all,
>
> When fuzzing Suricata I'm seeing crashes that I can't reproduce. I'm
> using llvm/persistent mode with asan on 64bit, mem limit 0. My
> hypothesis is that it's because I'm keeping some global state.
>
> In Suricata we process network packets. Some packets (e.g. TCP) update
> global state. My fuzz harness treats each file generated by AFL as a
> packet. The global state is initialized and freed outside of the
> '__AFL_LOOP' loop.
>
> Anything I'm overlooking that already makes this possible or easier?
>

Is there any way you could assert() that the global state has not been
changed? Or if there are some parts that are allowed to change and others
that are not, reset the parts that can change based on part of the AFL
input, so that every possible state is explored.
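
Something like this rough sketch, where GlobalStateChecksum() would be
a hash over the parts of the state that must stay fixed (both functions
here are placeholders, not real Suricata code):

#include <assert.h>
#include <stdint.h>

uint64_t GlobalStateChecksum(void);   /* hash the parts that must not change */
void ProcessOnePacket(const char *path);

void FuzzLoop(const char *path) {
    while (__AFL_LOOP(1000)) {
        uint64_t before = GlobalStateChecksum();
        ProcessOnePacket(path);
        /* an abort here is a reproducible crash that AFL will report */
        assert(GlobalStateChecksum() == before);
    }
}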

Victor Julien

Feb 13, 2017, 12:32:16 PM
to afl-...@googlegroups.com
On 11-02-17 06:20, Michal Zalewski wrote:
>> When fuzzing Suricata I'm seeing crashes that I can't reproduce.
>
> One thing to try is to test the input by calling the target program
> 10k times or so, in a simple shell loop. Some crashes are
> non-deterministic because they depend on scheduler decisions or memory
> randomization artifacts.
>
>> My hypothesis is that it's because I'm keeping some global state.
>
> It's possible, but typically less likely. That said...
>
>> In Suricata we process network packets. Some packets (e.g. TCP) update
>> global state. My fuzz harness treats each file generated by AFL as a
>> packet. The global state is initialized and freed outside of the
>> '__AFL_LOOP' loop.
>
> That sounds like a problem: if you have global state that doesn't get
> reset within __AFL_LOOP(), you're bound not only to have trouble
> reproducing bugs, but also to get less out of the instrumentation (since
> the information about that persisted state is lost to AFL). My advice
> would be to reset the state thoroughly (or not use persistent mode if this
> is not doable; a deferred forkserver may be a viable alternative).

So I confirmed my hypothesis: I added logic to store all the inputs
within a single AFL_LOOP pass, plus some logic to 'replay' them. I could
reliably get to the crash (which turned out to be uninteresting: a bug
in the test, not in the tested code).

> If you want to test multi-packet parsing, modifying __AFL_LOOP to read
> and process several packets may be a better option.

I don't think I understand the difference. If I understand
http://lcamtuf.coredump.cx/afl/technical_details.txt correctly, a
deferred fork server puts a loop after some global init code, which is
what I do as well: global init before the AFL_LOOP. Are there other
relevant differences?

>> I guess a couple of things could be useful:
>>
>> 1. crash report by AFL containing asan symbolized output. That would at
>> least give me a starting point.
>
> AFL doesn't capture stderr for the target, although you could make a
> relatively simple one-off change to accommodate this. Capturing core
> files or invoking an external symbolizer is highly problematic,
> because it adds an unpredictable but significant delay and tends to
> lead to crashes being misdiagnosed as timeouts.

Would it make sense to have a 'fatal' option? Just bail on the first
crash and then enter some post-crash analysis logic that can get the bt,
or analyze the dumped core? I would be perfectly happy to run such
logic, as my tests normally run for weeks/months w/o a single crash.

> In the next major version of AFL, I'm planning to add support for
> re-running a crash to collect this additional data once a fault is
> confirmed, but this wouldn't help in your particular case (since
> re-running might not necessarily trigger the same issue).
>
> The best piece of info you may have is whatever ended up in dmesg - at
> least the faulting instruction pointer. Better than nothing, but not
> great.
>
>> 2. AFL option to attach gdb to the first crash case it finds to allow
>> manual inspection of the running program
>
> This would require fairly substantial architectural changes (moving to
> ptrace()).
>
>> 3. on my end I guess I could write to disk the input AFL has generated
>> and then write some logic to read it in in the same order. That would
>> hopefully reproduce the same conditions.
>
> This seems problematic, too. Without knowing where the chain starts,
> we'd have to store a ton of historical data. Generally, for better or
> worse, AFL has an expectation that the state is reset to a blank
> slate, and that if you want to test multi-packet handling or something
> like that, it will still be implemented within a single pass of
> __AFL_LOOP.

I implemented this outside of AFL. In Suricata I've added a bunch of
command-line options to expose internal APIs specifically for AFL. In
this case the generated input files are fed to Suricata's packet
decoders/parsers.

I added an option to store all the inputs generated in a single AFL loop
and remove them if the loop completes. If the loop doesn't complete, the
inputs are retained, and using another option I can 'replay' them. I
could reproduce my issue with 100% reliability this way.
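
Roughly, the helpers look like this (the real Suricata code and option
names differ; the replay/ directory and function names here are made
up):

/* Every input seen inside one AFL loop pass is copied out; if the pass
 * completes, the copies are removed, so only the inputs from a crashed
 * pass remain on disk for replay. */
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

static int stored_count = 0;

static void StoreInput(const uint8_t *data, size_t len) {
    char path[64];
    snprintf(path, sizeof(path), "replay/%04d.bin", stored_count++);
    FILE *f = fopen(path, "wb");
    if (f != NULL) {
        fwrite(data, 1, len, f);
        fclose(f);
    }
}

static void DiscardStoredInputs(void) {
    char path[64];
    for (int i = 0; i < stored_count; i++) {
        snprintf(path, sizeof(path), "replay/%04d.bin", i);
        unlink(path);
    }
    stored_count = 0;
}

StoreInput() runs on every input inside the AFL loop, and
DiscardStoredInputs() runs when the loop finishes; the 'replay' option
then simply reads replay/0000.bin, 0001.bin, ... in order and feeds
them through the same decoder entry point.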

As far as I can tell, with identical input, Suricata behaves exactly the
same. So it seems reliable.

I guess I should test by adding a fake crash case and see how quickly
and reliably it finds it.

Thanks!

Victor Julien

Feb 13, 2017, 12:36:29 PM
to afl-...@googlegroups.com
In my case it's not the state change I was worried about; that was
intentional. However, I was wondering how to analyze the crash.

Some logic in Suricata requires state. For example IP fragmentation
requires a few fragments that are then reassembled into a single packet.
In my case the code that was crashing was only called on the 4th input
and depended on the updates to the global state by the earlier 3 inputs.

It sounds like this is a bit of abuse of the AFL concepts, but it does
seem to work.

Michal Zalewski

Feb 13, 2017, 12:52:05 PM
to afl-users
>> If you want to test multi-packet parsing, modifying __AFL_LOOP to read
>> and process several packets may be a better option.
>
> I don't think I understand the difference. If I understand
> http://lcamtuf.coredump.cx/afl/technical_details.txt correctly a
> deferred fork server puts a loop after some global init code, which is
> what I do as well before the AFL_LOOP. Are there other relevant differences?

Deferred forkserver fairly reliably prevents any "global state" issues
that crop up with __AFL_LOOP, because instead of merely looping, it
actually restores the process from a snapshot image.

That said, if you do have a way to reliably reset state, __AFL_LOOP is
definitely a better choice, and you can simulate receiving several
packets from within that code (just by splitting the AFL-generated
input according to some formula on every iteration).

>> AFL doesn't capture stderr for the target, although you could make a
>> relatively simple one-off change to accommodate this. Capturing core
>> files or invoking an external symbolizer is highly problematic,
>> because it adds an unpredictable but significant delay and tends to
>> lead to crashes being misdiagnosed as timeouts.
>
> Would it make sense to have a 'fatal' option? Just bail on the first
> crash and then enter some post-crash analysis logic that can get the bt,
> or analyze the dumped core?

The crashing process doesn't exist anymore by the time that AFL learns
about the crash, so it's too late for meaningful diagnostics. We'd
need to move to a far more clunky (and less portable) API, ptrace(),
to get that benefit.

Cheers,
/mz