new in AFL 2.41b: sensible hang detection, at last


Michal Zalewski

Apr 12, 2017, 12:55:35 AM
to afl-users
One of the major user complaints about AFL was that it was quite
haphazard about flagging test cases as "hangs".

This was a side effect of AFL applying very aggressive timeouts to
generated test cases to avoid falling into "tarpits" of slow paths.
But it also meant that unless you were manually specifying a generous
time limit (-t), most of the entries in the hangs/ folder were there
just because a test case ran marginally slower or hit a random
scheduler fluke, rather than because of any genuine DoS potential.

In AFL 2.41b, the concept of a timeout for the purpose of path
rejection is decoupled from the notion of true hangs. A timeout is
flagged as a hang only if it takes more than one second to execute.
You can fine-tune this with AFL_HANG_TMOUT (in milliseconds).

A timed-out test case is re-executed against this longer limit only if
it appears to have any new bits in the execution map. The trade-off is
that some hangs could conceivably be missed if they are identical to
something that was previously flagged as a timeout but wasn't slow
enough to count as a hang; this seems like a fair trade-off.

/mz

Leo Barnes

Apr 12, 2017, 12:28:20 PM
to afl-users
Nice! This will be incredibly useful!

//Leo

Leo Barnes

Jul 30, 2017, 3:39:18 PM
to afl-users
Hi Michal,

I'm still getting slightly weird results when it comes to skipping inputs due to timeout at startup. This is what I see:

1. I set no timeout, AFL does dry run on my 9 inputs and determines 220 ms to be a good timeout
2. I set '-t 300+' as timeout on a fresh start. AFL goes through my 9 inputs and rejects all of them due to timeout.

I'm relying heavily on persistent mode to give me better execution speed, since the target I'm fuzzing does some pretty expensive initialization (I do a dry run before the AFL persistent mode loop to increase target stability). This means that the first time the target is run, it's probably much slower than the subsequent runs, or at least I assume that is what's happening.
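
A simplified sketch of that setup, with made-up function names (built
with afl-clang-fast, which provides __AFL_LOOP()):

  #include <unistd.h>

  /* Hypothetical stand-ins for the real target being fuzzed. */
  static void expensive_library_init(void) { /* slow one-time setup */ }
  static void process_input(const unsigned char *buf, size_t len) {
    (void)buf; (void)len;
  }

  int main(void) {

    static unsigned char buf[1 << 16];

    /* Expensive setup: re-runs whenever afl-fuzz has to restart the
       process, e.g. after a timeout kills it, hence the slow first exec. */
    expensive_library_init();

    /* Internal dry run before the persistent loop to improve stability. */
    process_input((const unsigned char *)"warmup", 6);

    /* Persistent-mode loop; __AFL_LOOP() comes from afl-clang-fast. */
    while (__AFL_LOOP(1000)) {
      ssize_t len = read(0, buf, sizeof(buf));
      if (len > 0) process_input(buf, (size_t)len);
    }

    return 0;
  }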

Would it be possible to change the dry-run timeout rejection logic to have a hard timeout of AFL_HANG_TMOUT (so that we don't wait forever), but only reject inputs that have an average execution time larger than what is specified by `-t xxx`?

Thanks!
//Leo

Michal Zalewski

Jul 30, 2017, 3:53:08 PM
to afl-users
> 1. I set no timeout, AFL does dry run on my 9 inputs and determines 220 ms
> to be a good timeout
> 2. I set '-t 300+' as timeout on a fresh start. AFL goes through my 9 inputs
> and rejects all of them due to timeout.

This is not impossible in persistent mode if you have very slow
startup (since every timeout means a restart of the process). Doing a
dummy exec after every timeout during calibration would... well, "fix"
the problem, but cause problems down the line (since it could trap the
actual fuzzing process in an endless cycle of timeouts). Doing a dummy
exec after every timeout during a fuzzing job... would be expensive.

> Would it be possible to change the dry-run timeout rejection logic to have a
> hard timeout of AFL_HANG_TMOUT (so that we don't wait forever), but only
> reject inputs that have an average execution time larger than what is
> specified by `-t xxx`?

The underlying problem is still that if the timeout is shorter than
the time needed to reinitialize the persistent-mode binary after kill,
your fuzzing can get stuck in an endless loop of restarts and
timeouts. So from that perspective, it's auto-calibration that's
wrong, not -t nnn+ that's misbehaving :-(

The "correct" solution would be probably to time cold startup and
normal startup in persistent mode separately, and enforce that based
on whether the child respawned or not. Except that I'm having a hard
time imagining how this should work with -t nnn+...

/mz

Leo Barnes

Jul 30, 2017, 4:41:15 PM
to afl-users


On Sunday, July 30, 2017 at 12:53:08 PM UTC-7, Michal Zalewski wrote:
> > 1. I set no timeout, AFL does dry run on my 9 inputs and determines 220 ms
> > to be a good timeout
> > 2. I set '-t 300+' as timeout on a fresh start. AFL goes through my 9 inputs
> > and rejects all of them due to timeout.
>
> This is not impossible in persistent mode if you have very slow
> startup (since every timeout means a restart of the process). Doing a
> dummy exec after every timeout during calibration would... well, "fix"
> the problem, but cause problems down the line (since it could trap the
> actual fuzzing process in an endless cycle of timeouts). Doing a dummy
> exec after every timeout during a fuzzing job... would be expensive.
>
> > Would it be possible to change the dry-run timeout rejection logic to have a
> > hard timeout of AFL_HANG_TMOUT (so that we don't wait forever), but only
> > reject inputs that have an average execution time larger than what is
> > specified by `-t xxx`?
>
> The underlying problem is still that if the timeout is shorter than
> the time needed to reinitialize the persistent-mode binary after kill,
> your fuzzing can get stuck in an endless loop of restarts and
> timeouts. So from that perspective, it's auto-calibration that's
> wrong, not -t nnn+ that's misbehaving :-(

Yeah, I see what you mean. I guess one approach for me would be to try to use __AFL_INIT() to move the forkserver initialization to after my own internal dry run. I have never managed to get that to work though since my target is pretty complex.
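
Something along these lines is what I had in mind (again a simplified
sketch with made-up names; __AFL_INIT() and __AFL_LOOP() come from
afl-clang-fast):

  #include <unistd.h>

  /* Hypothetical stand-ins for the real target. */
  static void expensive_library_init(void) { /* slow one-time setup */ }
  static void process_input(const unsigned char *buf, size_t len) {
    (void)buf; (void)len;
  }

  int main(void) {

    static unsigned char buf[1 << 16];

    /* Everything above __AFL_INIT() runs only once, before the forkserver
       is spawned, so its cost never counts against the per-exec timeout. */
    expensive_library_init();
    process_input((const unsigned char *)"warmup", 6);   /* internal dry run */

    /* Deferred forkserver: AFL starts cloning the process from here. */
    __AFL_INIT();

    while (__AFL_LOOP(1000)) {
      ssize_t len = read(0, buf, sizeof(buf));
      if (len > 0) process_input(buf, (size_t)len);
    }

    return 0;
  }

(The llvm_mode notes warn that anything done before __AFL_INIT() must not
create threads or depend on descriptors, timers, or other state that
can't be cloned, which is probably what bites complex targets.)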
 

The "correct" solution would be probably to time cold startup and
normal startup in persistent mode separately, and enforce that based
on whether the child respawned or not. Except that I'm having a hard
time imagining how this should work with -t nnn+...

Yeah, for persistent mode it would probably be better to be able to specify two different timeouts: one for cold start and one for warm. Probably not useful for 90% of fuzzing cases, but it would definitely help with this case.

Cheers,
//Leo
 
