Michal, and all.
This is not related to instrumentation speedups, but to single-testcase run times:
I was curious how much of the run time (for certain applications) is spent writing the testcase file out to disk, and whether it would be possible to hijack open/read calls (via LD_PRELOAD) so that they read from a second SHM region instead of disk. That way, none of the I/O would ever need to touch the disk.
It would be complex to track file descriptors, but it is possible by keeping a table somewhere in afl-fuzz.
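To make the descriptor-table idea concrete, here is a rough sketch of just the bookkeeping part. It is not afl-fuzz code: the names (fake_open, fake_read, fake_close, set_testcase, FAKE_FD_BASE) are made up for illustration, the testcase buffer stands in for the second SHM region, and the actual LD_PRELOAD interposition (wrapping open/read and falling through to the real calls via dlsym(RTLD_NEXT, ...)) is left out.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FAKE_FDS 16
#define FAKE_FD_BASE 1000   /* hypothetical: keeps fake fds apart from real ones */

/* One slot per open "file" that is really backed by memory. */
struct fake_fd { int in_use; size_t offset; };

static struct fake_fd fd_table[MAX_FAKE_FDS];
static const unsigned char *testcase_buf;   /* would point into the second SHM */
static size_t testcase_len;

/* Install a new testcase and drop any descriptors left from the last run. */
static void set_testcase(const void *buf, size_t len) {
    memset(fd_table, 0, sizeof(fd_table));
    testcase_buf = buf;
    testcase_len = len;
}

/* What a hooked open() might do for the testcase path: hand out a fake fd. */
static int fake_open(void) {
    for (int i = 0; i < MAX_FAKE_FDS; i++) {
        if (!fd_table[i].in_use) {
            fd_table[i].in_use = 1;
            fd_table[i].offset = 0;
            return FAKE_FD_BASE + i;
        }
    }
    return -1;   /* table full */
}

/* What a hooked read() might do: copy from memory and advance the offset.
   Returns -1 for descriptors the table does not own, so a real wrapper
   could pass those through to the genuine read(). */
static long fake_read(int fd, void *dst, size_t count) {
    int i = fd - FAKE_FD_BASE;
    if (i < 0 || i >= MAX_FAKE_FDS || !fd_table[i].in_use) return -1;
    size_t left = testcase_len - fd_table[i].offset;
    if (count > left) count = left;
    if (count > 0) memcpy(dst, testcase_buf + fd_table[i].offset, count);
    fd_table[i].offset += count;
    return (long)count;
}

/* What a hooked close() might do: release the slot. */
static int fake_close(int fd) {
    int i = fd - FAKE_FD_BASE;
    if (i < 0 || i >= MAX_FAKE_FDS || !fd_table[i].in_use) return -1;
    fd_table[i].in_use = 0;
    return 0;
}
```

A real shim would also have to cover lseek, fstat, mmap, dup, and stdio (fopen/fread), which is where most of the complexity would probably come from.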
Thoughts?
-Parker
> What would really speed things up is a way to delay the start of the fork-server until needed. I'm using a hack where I toggle a weak global variable inspected by the afl setup routine. It'd be nice if there were a more official way to do this.
Sure. I couldn't think of anything robust / simple enough without
refactoring the forkserver code or requiring other, more substantial
changes to how binaries are linked - but ideas welcome :-)
(Ideally, it would be cool to be able to specify something like
AFL_SKIP=nnnn to skip nnnn initial instrumentation points; but some
compile-time "start here" shim may also work, with less flexibility.)
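For illustration only, the counting logic behind an AFL_SKIP=nnnn style knob could look roughly like the sketch below. AFL_SKIP is just the name floated above, not an existing option, and skip_init / hit_instrumentation_point are hypothetical helpers; the forkserver startup itself is not shown.

```c
#include <stdlib.h>

static unsigned long skip_remaining;   /* instrumentation points left to skip */
static int forkserver_started;

/* Parse the (hypothetical) AFL_SKIP value. NULL or a non-numeric string
   means "skip nothing", i.e. start at the first instrumentation point. */
static void skip_init(const char *value) {
    skip_remaining = value ? strtoul(value, NULL, 10) : 0;
    forkserver_started = 0;
}

/* Would be called at every instrumentation point. Returns 1 exactly once,
   at the point where the forkserver should be brought up, and 0 otherwise. */
static int hit_instrumentation_point(void) {
    if (forkserver_started) return 0;
    if (skip_remaining > 0) {
        skip_remaining--;
        return 0;
    }
    forkserver_started = 1;
    return 1;
}
```

In a real setup routine this would be seeded with something like skip_init(getenv("AFL_SKIP")). The count is only meaningful if the startup path hits the same number of instrumentation points on every run.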
> 1) Use the return-address pointer on the stack instead of the explicit
> random number in the hash.
This is probably a no-go unless we explicitly disable ASLR (which
would be platform-dependent and messy). Otherwise, if addresses differ
across runs, things will go south.
We could rely just on the less-significant bits of the address, but
these are often aligned (especially around jumps), so 3-4 bits of that
will show heavy bias.
> 2) Use some other hash instead of xor(old/2, new). How about keeping a hash
> like new_hash = hash(old_hash / 256, new_value). This would keep some more
> entropy describing the "path" taken to get to this point, rather than only
> the previous single label.
I played with that, essentially doing >> 8 (I think it came up in one
of the earlier threads). It didn't seem to have a particularly
meaningful impact on the ultimate edge coverage with my benchmark
targets, while producing a denser bitmap. Anyway, it's a very simple
tweak if you want to play with it :-)
>> > 1) Use the return-address pointer on the stack instead of the explicit
>> > random number in the hash.
>> This is probably a no-go unless we explicitly disable ASLR (which
>> would be platform-dependent and messy). Otherwise, if addresses differ
>> across runs, things will go south.
>
> Actually, having differing runs have differing collisions in the hash table
> might be a good thing.
Sure, but you don't want the same locations giving you different
tuples within a single job, otherwise, it becomes impossible to tell
if the mutation produced any changes in the behavior of the target
binary (well, within the scope of what afl-fuzz tries to do).
It's less of a problem for afl-fuzz, since the fork() clones will have
stable mappings, but it becomes an issue if you need to disable the
forkserver (there are several cases where this is necessary) or with
tools such as afl-tmin or afl-cmin (neither of which use the
forkserver approach).
>> I played with that, essentially doing >> 8 (I think it came up in one
>> of the earlier threads). It didn't seem to have a particularly
>> meaningful impact on the ultimate edge coverage with my benchmark
>> targets, while producing a denser bitmap. Anyway, it's a very simple
>> tweak if you want to play with it :-)
>
> It's a little more subtle than that. Instead of storing the previous
> location, you store a hash of the previous set of locations.
Sorry, what I meant to say is, I experimented with this:
shm_region[cur_loc ^ prev_loc]++;
prev_loc = cur_loc ^ (prev_loc >> 4);
This, in effect, ensures that the write location is a function of
four locations, not just two. So, you preserve more info about control
flow. But it did not seem to make a noteworthy difference compared to
simple edge cov.
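A small standalone sketch of the difference, assuming 16-bit location IDs and a 64 kB map (the helper names replay_edge and replay_history are made up for this example). Two paths that end in the same edge land on the same bitmap byte with plain edge coverage, but on different bytes once earlier history is folded into prev_loc:

```c
#include <stddef.h>
#include <stdint.h>

#define MAP_SIZE 65536u

static uint8_t shm_region[MAP_SIZE];   /* stand-in for the shared bitmap */

/* Plain edge coverage: prev_loc carries only the previous location.
   Returns the last bitmap index that was bumped. */
static uint32_t replay_edge(const uint16_t *path, size_t n) {
    uint16_t prev_loc = 0;
    uint32_t idx = 0;
    for (size_t i = 0; i < n; i++) {
        idx = (uint16_t)(path[i] ^ prev_loc);
        shm_region[idx]++;
        prev_loc = path[i] >> 1;
    }
    return idx;
}

/* The variant from this message: older locations keep contributing to
   prev_loc until they have been shifted out 4 bits at a time. */
static uint32_t replay_history(const uint16_t *path, size_t n) {
    uint16_t prev_loc = 0;
    uint32_t idx = 0;
    for (size_t i = 0; i < n; i++) {
        idx = (uint16_t)(path[i] ^ prev_loc);
        shm_region[idx]++;
        prev_loc = path[i] ^ (prev_loc >> 4);
    }
    return idx;
}
```

For the paths {0x1000, 0x2222, 0x3333} and {0x4000, 0x2222, 0x3333}, replay_edge returns 0x2222 for both, while replay_history returns 0x1011 and 0x1511 respectively. That extra sensitivity is also why the bitmap fills up faster.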
> This trick will probably let the fuzzer find 32-bit constants fairly easily
> for PNG fuzzing.
So would extracting immediate values from testl / testq / cmpl / cmpq
to build a dictionary at compile time, right?
/mz