Fuzzing xpdf: very slow (depends on templates), RAM vs SSD no dice, AFL_PERSISTENT possible?

kciredor

Oct 23, 2017, 4:32:08 AM
to afl-users
Hi,

Trying to fuzz xpdf, I'm running into issues I cannot comprehend. It's running very slowly and there's not a lot I can think of to fix this, so it must be something I simply don't know yet ;-)

I started out fuzzing 'pdfinfo', but after reading a tweet from Ben Nagy claiming that 'pdftoppm' would be a better target (it parses the PDF more deeply), I had to agree and switched. That's when the trouble started: pdftoppm fuzzing is very slow (15 - 50 runs per core per second with the same initial templates).

- I have patched out the 'write output file' part of pdftoppm. Did not really change a lot, as in, still very slow.
- Templates: 90 PDF files, mostly <1 kB, about 10 are ~10 kB, a couple are up to 100 kB. I tried deleting all initial templates and sticking with afl's "small.pdf" file. This helps a lot, but if everyone fuzzes with the same fuzzer and the same input template, we'll all find the same bugs. (Also: it's still slow: 125/sec.) Besides, small.pdf makes pdftoppm exit early (broken 'xref' header), so it's probably never going to parse deep enough. Right?
- RAM vs SSD: mount -t ramfs -o size=2g ramfs fuzz -> putting everything in there does not increase the speed at all, it's as if the SSD is exactly as fast as running everything from RAM. Also: Ben Nagy gave advice to run everything -except crashes- from RAM. But playing around running afl, breaking, symlinking crashes to a real on disk dir and resuming afl, does not work (afl moves the symlink away and creates a new crashes dir). This is a separate issue, but, what would be the best practice? Every 30 minutes rsync the ramdisk to persistent storage?
- Switching from afl-gcc to afl-clang-fast really helped a lot. It started at 400/sec instead of 125/sec. After a while it's down to 70/sec though (it goes up and down all the time).

Last advice I got was to try AFL_PERSISTENT mode, but is that even a thing when working with a cli exec like pdftoppm?

Would love to learn from this community how to improve. I'm probably missing some best practices and many "gotchas". Can't imagine fuzzing something like xpdf should be this hard and turn out this slow.

Thanks in advance ;-)!

Cheers,
kciredor

Jonas Wagner

Oct 23, 2017, 10:41:20 AM
to afl-users
Hi kciredor,

Is your CPU fully utilized? If yes, it's probably just xpdf doing a lot of processing. I would expect this to be the bottleneck, given that you already use a ramdisk and did some other tweaks.

You can see where time is being spent in more detail if you run "perf top" or similar. Does this help?

Best,
Jonas

Michal Zalewski

Oct 23, 2017, 10:58:53 AM
to afl-users
> Started fuzzing 'pdfinfo' but reading a twitter message from Ben Nagy claiming 'pdftoppm' would be a better target (parses the pdf deeper), I had to agree and switched. That's when trouble started: pdftoppm fuzzing is very slow (15 - 50 runs per core per second with the same initial templates).

As Jonas noted, you probably need to profile the tool or the library
to understand what it's doing or why it's slow. Maybe it's parsing
some large config file at startup or so. That said, if it's hitting
100+ execs/sec with small files, it's not awful. Using llvm_mode
(afl-clang-fast) and following other tips from docs/perf_tips.txt may
help. Otherwise, parallelizing is likely the way to go.
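
For the parallelizing route, afl-fuzz is normally started as one master instance (-M) plus any number of secondaries (-S), all sharing a single output directory. A minimal sketch that only builds the command lines, one per core; the function name, the directory names, the instance names and the pdftoppm arguments are illustrative assumptions, not something taken from this thread:

```python
import shlex


def afl_parallel_commands(target_argv, n_instances, in_dir="in", sync_dir="sync"):
    """Build one afl-fuzz command line per instance: one -M master, the rest -S.

    target_argv is the target's argv, with '@@' where afl-fuzz should
    substitute the path of the current test case.
    """
    if n_instances < 1:
        raise ValueError("need at least one instance")
    commands = []
    for i in range(n_instances):
        role = "-M" if i == 0 else "-S"
        name = "fuzzer%02d" % (i + 1)  # hypothetical instance names
        argv = ["afl-fuzz", "-i", in_dir, "-o", sync_dir, role, name, "--"]
        argv += list(target_argv)
        commands.append(" ".join(shlex.quote(a) for a in argv))
    return commands


if __name__ == "__main__":
    # Assumption: a pdftoppm build with the output writing patched out,
    # so the second argument (the output root) is only a dummy.
    for cmd in afl_parallel_commands(["./pdftoppm", "@@", "/dev/null"], 4):
        print(cmd)
```

Each printed line would be started in its own terminal or screen session. All instances must point at the same -o directory, since that is how they exchange test cases.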

> - Templates: 90 pdf files, mostly <1 kb, about 10 are ~10kb, a couple are up to 100kb. I tried deleting all initial templates and stick with afl's "small.pdf" file. This helps a lot, but, if everyone fuzzes with the same fuzzer and same input template, we'll all find the same bugs. (Also: it's still slow: 125/sec). Besides: small.pdf makes pdftoppm exit early (broken 'xref' header), so it's probably never going to parse deep enough. Right?

You may want to fix the file instead of starting with large inputs. It
would also be important to use a dictionary.
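
One way to get a small seed with a valid xref table is to generate it, so that the byte offsets are computed rather than typed by hand. A minimal sketch, assuming a one-page empty document is an acceptable starting point; the function name and the object layout are illustrative, and whether pdftoppm goes deep on this particular file has not been checked here:

```python
def build_minimal_pdf():
    """Return the bytes of a one-page, empty PDF with computed xref offsets."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 72 72] /Resources << >> >>",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))            # byte offset of "N 0 obj"
        out += b"%d 0 obj\n" % num
        out += body + b"\nendobj\n"

    xref_pos = len(out)                     # byte offset of the "xref" keyword
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"         # entry 0: head of the free list
    for off in offsets:
        out += b"%010d 00000 n \n" % off    # every entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Writing the result with open("minimal.pdf", "wb").write(build_minimal_pdf()) gives a seed of a few hundred bytes (the file name is just an example). Adding a content stream or a font object to the objects list keeps the offsets correct automatically.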

> - RAM vs SSD: mount -t ramfs -o size=2g ramfs fuzz -> putting everything in there does not increase the speed at all, it's as if the SSD is exactly as fast as running everything from RAM.

Your OS likely caches all the reads & writes in RAM anyway.

> Also: Ben Nagy gave advice to run everything -except crashes- from RAM. But playing around running afl, breaking, symlinking crashes to a real on disk dir and resuming afl, does not work (afl moves the symlink away and creates a new crashes dir). This is a separate issue, but, what would be the best practice? Every 30 minutes rsync the ramdisk to persistent storage?

This is not really supported out of the box, but yeah, you could sync
the crashes/ subdir to a persistent fs every now and then.
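
A minimal sketch of such a periodic sync, assuming a small Python helper that is run from cron (or a shell loop) every 30 minutes or so. The function name and both paths are hypothetical; afl itself ships nothing like this:

```python
import shutil
from pathlib import Path


def sync_crashes(crashes_dir, backup_dir):
    """Copy files from crashes_dir that are not yet in backup_dir.

    Returns the list of file names copied in this pass. Files already
    present in the backup are left alone and nothing is ever deleted.
    """
    src, dst = Path(crashes_dir), Path(backup_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in sorted(src.iterdir()):
        target = dst / f.name
        if f.is_file() and not target.exists():
            shutil.copy2(f, target)  # copy2 also preserves timestamps
            copied.append(f.name)
    return copied
```

A call would look like sync_crashes("/path/to/ramdisk/out/crashes", "/path/to/disk/crashes"). Note the limitation: a file whose name already exists in the backup is skipped, so after starting a fresh afl-fuzz run it is safer to point the helper at a new backup directory.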

> Last advice I got was to try AFL_PERSISTENT mode, but is that even a thing when working with a cli exec like pdftoppm?

It may be doable, but I'm guessing it's not easy. You'd need to
understand and modify the code of the target program.

/mz

kciredor

Oct 23, 2017, 11:04:32 AM
to afl-users
Hi Jonas,

Funny thing: cores start at ~35% for a while (with higher execs/sec) and then go to 100% (with lower execs/sec). Not a clue...

Thanks for the perf top idea, I found the two methods responsible for most of the load and will start looking into them now ;-)

Best,
kciredor

kciredor

Oct 23, 2017, 11:07:42 AM
to afl-users
Hi Michal,

Got it, thanks for all the feedback!

One question I have remains: if we all use the same fuzzer and input (e.g. afl's small.pdf), wouldn't we all get the same bugs? I was hoping I could make a difference by using my own initial templates, with high coverage of PDF functionality, but it seems the size of the templates hurts performance badly.

Best,
kciredor

Michal Zalewski

Oct 23, 2017, 11:31:52 AM
to afl-users
> One question I have remains: if we all use the same fuzzer and input (e.g. afl's small.pdf), wouldn't we all get the same bugs?

Maybe, depends... on whether there actually are other people fuzzing
it using the exact same parameters, on how much time they spend doing
it, and on how lucky they get.

> I was hoping I could make a difference by using my own initial templates, with high coverage of PDF functionality, but it seems the size of the templates hurts performance badly.

That's definitely OK, although you want to aim for smaller files.
Getting in the 10 kB+ territory is seldom productive. A dictionary
(which can be custom-crafted) offers a better payoff here.
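
A custom dictionary is just a text file with one name="value" entry per line, passed to afl-fuzz with -x. A minimal sketch of generating one, assuming a short hand-picked list of PDF keywords; the token selection, entry names and function names are illustrative, not a vetted PDF dictionary:

```python
def afl_dict_line(name, token):
    """Format one dictionary entry as name="value".

    Bytes outside printable ASCII, plus the double quote and the backslash,
    are written as \\xNN escapes.
    """
    escaped = "".join(
        chr(b) if 32 <= b < 127 and b not in (0x22, 0x5C) else "\\x%02x" % b
        for b in token
    )
    return '%s="%s"' % (name, escaped)


# Hypothetical starter set; extend it with tokens relevant to the code
# paths you care about (filters, font types, annotation types, ...).
PDF_TOKENS = {
    "header": b"%PDF-1.4",
    "eof": b"%%EOF",
    "obj": b" obj",
    "endobj": b"endobj",
    "stream": b"stream\n",
    "endstream": b"endstream",
    "xref": b"xref",
    "trailer": b"trailer",
    "startxref": b"startxref",
    "type_page": b"/Type /Page",
    "filter_flate": b"/Filter /FlateDecode",
}


def build_dictionary(tokens):
    """Return the full dictionary text, one entry per line, sorted by name."""
    return "\n".join(afl_dict_line(n, t) for n, t in sorted(tokens.items())) + "\n"
```

Saving build_dictionary(PDF_TOKENS) to a file such as pdf_small.dict (an example name) and adding -x pdf_small.dict to the afl-fuzz command line enables it. Keeping the list short and specific to the target is the point of crafting it by hand.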

/mz

Jakub Wilk

Oct 23, 2017, 11:42:58 AM
to afl-...@googlegroups.com
* Michal Zalewski <lca...@gmail.com>, 2017-10-23, 08:31:
>A dictionary (which can be custom-crafted) offers a better payoff here.

Does anyone have a good dictionary for PDFs? The one shipped with AFL is
way too big to be useful in practice. :(

--
Jakub Wilk

kciredor

Oct 24, 2017, 11:38:21 AM
to afl-users
Afraid I can't help you there, but why is the shipped-with-afl dictionary not working out for you?