new in 2.27b: token capture library (libtokencap)

Michal Zalewski

Aug 5, 2016, 7:25:45 PM
to afl-users
Many folks expressed a belief that having AFL intercept strcmp(),
memcmp(), and related calls would allow it to greatly improve coverage
in some targets without having to manually develop a dictionary. And
as much as it pains me to say so, it's a very reasonable belief :-)

Fully integrating this into AFL would be fairly difficult, but in
version 2.27b, you can find a new companion library, libtokencap.so.
The library works with any dynamically linked code that's compiled
with -fno-builtin, identifies syntax tokens in read-only memory
regions, and dumps them to an output file when one of them is passed to
any of the functions in question.
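
To make the read-only check concrete, here is a minimal Python sketch of the idea, not the library's actual code. It assumes the mappings are listed in the Linux /proc/<pid>/maps text format; the function names and the sample addresses are hypothetical and exist only for illustration.

```python
from typing import List, Tuple


def parse_read_only_ranges(maps_text: str) -> List[Tuple[int, int]]:
    """Collect [start, end) ranges that are readable but not writable.

    Assumes the Linux /proc/<pid>/maps layout: "start-end perms ...".
    """
    ranges = []
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or len(fields[1]) < 2:
            continue
        addr_range, perms = fields[0], fields[1]
        # Keep only mappings that can be read and cannot be written.
        if perms[0] != "r" or perms[1] == "w":
            continue
        start_s, end_s = addr_range.split("-")
        ranges.append((int(start_s, 16), int(end_s, 16)))
    return ranges


def is_read_only(addr: int, ranges: List[Tuple[int, int]]) -> bool:
    """Return True if addr falls inside one of the read-only ranges."""
    return any(start <= addr < end for start, end in ranges)


# Made-up sample mappings (hypothetical addresses and path).
SAMPLE_MAPS = """\
00400000-0040b000 r-xp 00000000 08:01 1234 /bin/target
0060a000-0060b000 r--p 0000a000 08:01 1234 /bin/target
0060b000-0060c000 rw-p 0000b000 08:01 1234 /bin/target
"""
RANGES = parse_read_only_ranges(SAMPLE_MAPS)
```

With a table like RANGES, an intercepted comparison would log an argument only when is_read_only() holds for its address, so data copied from the input file into writable buffers is skipped.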

The best way to use it right now is to let AFL run for a while, build
up a corpus, and then postprocess the corpus like so:

export AFL_TOKEN_FILE=$PWD/temp_output.txt

for i in <out_dir>/queue/id*; do
  LD_PRELOAD=/path/to/libtokencap.so \
    /path/to/target/program [...params, including $i...]
done

sort -u temp_output.txt >afl_dictionary.txt

Equipped with the resulting file, you can load it into another AFL
session and see if it improves coverage. Field reports welcome!

PS. Linux-only, sorry :-( If somebody comes up with a portable and
non-invasive way to probe for read-only mappings, we can port it
easily - see __tokencap_load_mappings.

PPS. More in libtokencap/README.tokens

/mz

Jacek Wielemborek

Aug 5, 2016, 7:30:29 PM
to afl-...@googlegroups.com
On 06.08.2016 at 01:25, Michal Zalewski wrote:
Nice! Did you test it on any library/program and manage to find new
crashes with it?

Michal Zalewski

Aug 5, 2016, 7:33:51 PM
to afl-users
> Nice! Did you test it on any library/program and manage to find new
> crashes with it?

My usual test suite isn't memcmp / strcmp-heavy, but I'll test it a
bit more soon. For now, I just know that it works with test programs
and successfully extracts one very obscure token from libpng =)

/mz

Kurt Roeckx

Aug 6, 2016, 4:25:53 AM
to afl-...@googlegroups.com
On Fri, Aug 05, 2016 at 04:25:24PM -0700, Michal Zalewski wrote:
> Many folks expressed a belief that having AFL intercept strcmp(),
> memcmp(), and related calls would allow it to greatly improve coverage
> in some targets without having to manually develop a dictionary. And
> as much as it pains me to say so, it's a very reasonable belief :-)
>
> Fully integrating this into AFL would be fairly difficult, but in
> version 2.27b, you can find a new companion library, libtokencap.so.
> The library works with any dynamically linked code that's compiled
> with -fno-builtin, identifies syntax tokens in read-only memory
> regions, and dumps them to an output file when one of them is passed to
> any of the functions in question.

What about cases that don't end up in read-only memory?

I guess I might also have other compare functions I might wish to
intercept that are specific to my application.


Kurt

Michal Zalewski

Aug 6, 2016, 10:38:32 AM
to afl-users
> What about cases that don't end up in read-only memory?

They are probably overwhelmingly non-interesting. Ignoring them is a
simple way to avoid accidentally dumping some of the input as a
dictionary.

/mz

Kurt Roeckx

Aug 6, 2016, 11:16:26 AM
to afl-...@googlegroups.com
So for this to work properly, things that are const need to be
marked as const. Fixed strings probably already work. But I'm
guessing that there are lots of cases that aren't marked const
that should be.

I think I at least have a few cases where a table used to be marked
as const, but it needed to be kept sorted so we can binary search
in it. But it seems that's too hard to do manually, so the const
got removed and it now gets sorted at run time.

Maybe I should try to get that sorted at compile time rather than
runtime.


Kurt

Michal Zalewski

Aug 6, 2016, 11:54:03 AM
to afl-users
> So for this to work properly, things that are const need to be
> marked as const. Fixed strings probably already work.

Right, this is meant to pick up constant strings. The read-only check
is probably the best we can do without coming up with a more
complicated and fragile solution. You could try rejecting patterns
that appear to be present in the input file, and otherwise log
everything. A bit less elegant, but may work.
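
A rough Python sketch of that alternative follows: log every compared string, then drop the ones that also occur in the input file. The function name and the sample tokens are hypothetical; this only illustrates the filtering step and is not code from AFL.

```python
from typing import Iterable, List


def filter_tokens(candidates: Iterable[bytes], input_data: bytes) -> List[bytes]:
    """Keep candidate tokens that do not appear verbatim in the input file."""
    kept = []
    seen = set()
    for tok in candidates:
        # Skip empty strings and duplicates.
        if not tok or tok in seen:
            continue
        # A token already present in the input was probably copied from it.
        if tok in input_data:
            continue
        seen.add(tok)
        kept.append(tok)
    return kept


# Made-up example: "abc" is part of the input, so it is rejected.
TOKENS = filter_tokens([b"IHDR", b"tEXt", b"abc", b"tEXt"], b"....abc....")
```

The trade-off is that a legitimate keyword that happens to be present in the test case gets dropped too, which is one reason this is less elegant than the read-only check.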

/mz

Shai Sarfaty

Aug 7, 2016, 4:05:55 AM
to afl-users
Hi, what about adding instrumentation that detects compares that are
wider than a byte (WORD, DWORD, 64-bit, etc.) and splits each of them
into multiple byte compares? That would let AFL see how deep into the
comparison of the number an input got.
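
To illustrate why splitting helps, here is a small Python model of the idea, under the assumption that the bytes are compared most significant first; the function name and the constants are hypothetical. A single 32-bit compare yields one yes/no answer, while the split version exposes how many leading bytes already match, and each of those steps would be a separate branch that coverage feedback can observe.

```python
def matching_prefix_bytes(a: int, b: int, width: int = 4) -> int:
    """Count how many leading bytes of a and b agree (big-endian order)."""
    a_bytes = a.to_bytes(width, "big")
    b_bytes = b.to_bytes(width, "big")
    matched = 0
    for x, y in zip(a_bytes, b_bytes):
        if x != y:
            break
        # In instrumented code, reaching this point would be a new branch.
        matched += 1
    return matched


MAGIC = 0xDEADBEEF  # made-up constant that the target compares against
PROGRESS = [
    matching_prefix_bytes(guess, MAGIC)
    for guess in (0x00000000, 0xDE000000, 0xDEAD0000, 0xDEADBEEF)
]
```

Each guess in the example gets at least as far as the previous one, so a fuzzer could keep partial matches and refine them one byte at a time instead of having to guess all four bytes at once.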

Michal Zalewski

Aug 7, 2016, 10:30:57 AM
to afl-users
> Hi, what about adding instrumentation that detects compares that are
> wider than a byte (WORD, DWORD, 64-bit, etc.) and splits each of them
> into multiple byte compares? That would let AFL see how deep into the
> comparison of the number an input got.

Patches welcome =)

Shai Sarfaty

Aug 8, 2016, 4:19:08 AM
to afl-users
OK, working on it.

Just a note: I tried setting the environment variable AFL_KEEP_ASSEMBLY=1,
but the generated file is still removed from /tmp.

Michal Zalewski

Aug 8, 2016, 4:21:41 AM
to afl-users
> Just a note: I tried setting the environment variable AFL_KEEP_ASSEMBLY=1,
> but the generated file is still removed from /tmp.

Hmm, how so? Are you setting it when running afl-gcc? I don't see
anything obviously wrong with the implementation.

You may want to set TMPDIR to have the file created at a predictable
location, though.

/mz

Hanno Böck

Aug 8, 2016, 1:37:16 PM
to afl-...@googlegroups.com
Hi,

I recently had a conversation with Kostya about this and he said he
implemented something in LibFuzzer that seems pretty smart to me.
If I recall correctly libfuzzer is trying to capture strcmp/memcmp
calls and then tries to find one of the strings in the input file and
replaces it with the other one.
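
As a rough Python illustration of the mechanism as recalled above (not libFuzzer's actual implementation), the mutation step could look like this. The function name and the sample strings are hypothetical.

```python
from typing import Optional


def substitute_compared(data: bytes, lhs: bytes, rhs: bytes) -> Optional[bytes]:
    """If one side of a captured comparison occurs in the input,
    return a copy of the input with it replaced by the other side."""
    for have, want in ((lhs, rhs), (rhs, lhs)):
        if not have:
            continue
        pos = data.find(have)
        if pos != -1:
            # Replace only the first occurrence.
            return data[:pos] + want + data[pos + len(have):]
    return None  # neither operand appears in the input


# Made-up example: the target compared the input's "GET" against "POST".
MUTATED = substitute_compared(b"GET /index", b"GET", b"POST")
```

Because it needs no read-only check and no separate dictionary file, an approach like this fits the "just works" goal, at the cost of hooking the comparisons during fuzzing itself.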

I wonder if you want to consider something like this for afl.

Also in the past you put a lot of emphasis on the ease of use of afl.
If it's possible it would certainly be best if whatever mechanism gets
implemented would "just work" without any manual effort. I think
there's nothing in principle that would prevent that.

--
Hanno Böck
https://hboeck.de/

mail/jabber: ha...@hboeck.de
GPG: BBB51E42