On Tue, Mar 3, 2015 at 1:16 AM, Michal Zalewski <lca...@gmail.com> wrote:
[discussion on ram]
I always use ramdisks or /dev/shm when available, if only to preserve
hardware life. Obviously it's faster, but whether or not the delta can
be considered a bottleneck for a given target is an open ( but moot
imho ) question.
> In my mind, the cons are:
>
> 1) Writing standalone filters will be usually considerably more
> complex than achieving the same goal through code changes in the
> target binary. [...]
Even if this were the case ( and I am dubious ) it only applies to
open source targets.
> 2) Especially less experienced users will be likely tempted to write
> them in scripting languages. Doing so will probably result in a fairly
> significant bottleneck for fast binaries. Arguably, not our problem,
> but then, relatively few users will honestly read all the docs, etc.
It's a valid concern, I suppose, but surely this hypothetical user
would first fuzz without fixups, and thus notice that they've killed
their performance? As an aside, I think that even lowly "scripting
languages" could probably muster the awesome power required to parse a
couple of hundred streams a second. ;)
> 3) The code will be an unintentional "attack surface" on its own,
> since in cases such as PNG, you will need to parse the file to some
> extent. So, it will probably need some babysitting to deal with
> crashes, hangs, etc. Doable, but adds more complexity.
A timeout on the call out to the fixup, and an EOF handler?
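
Roughly what I have in mind, as a minimal sketch. It assumes the fixup is an external command that reads the test case on stdin and writes the repaired one to stdout; `run_fixup` and its parameters are names made up for illustration, not anything afl provides.

```python
import subprocess

def run_fixup(cmd, data, timeout_s=1.0):
    """Pipe `data` through an external fixup command.

    Falls back to the unmodified input if the fixup hangs, exits
    non-zero, or produces no output (e.g. it bailed out on a short read).
    """
    try:
        proc = subprocess.run(cmd, input=data, stdout=subprocess.PIPE,
                              stderr=subprocess.DEVNULL, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return data  # fixup hung; subprocess.run has already killed the child
    if proc.returncode != 0 or not proc.stdout:
        return data  # fixup crashed or gave up
    return proc.stdout
```

The EOF handling lives in the fixup itself: on a truncated file it just exits non-zero, and the wrapper passes the original bytes through to the target.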
> 4) In general, when you add an "advanced" feature like this, there is
> a temptation for users to try it out,
This is the crux of the argument. Say, for example, I fix up a length
field because I know ( or guess ) it's an early parser check - great,
except I've now spared the code path that deals with incorrect
lengths. When we have flying cars etc I guess afl will be able to tell
us that we're bottlenecking at a branch that has a lot of code behind
it, or we'll all be using SAGE at a thousand tests a second or
something, but for now it's always going to be guesswork, and may do
more harm than good.
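
To make the trade-off concrete, here is a sketch of a length fixup for a made-up record layout (4-byte magic, 4-byte big-endian length, then payload). The layout and the name `fix_length` are assumptions for illustration only.

```python
import struct

def fix_length(buf):
    """Rewrite the length field of a hypothetical record:
    4-byte magic, 4-byte big-endian payload length, payload."""
    if len(buf) < 8:
        return buf  # too short to have a header: leave it alone
    payload = buf[8:]
    # Overwrite whatever length the mutator produced with the real one.
    return buf[:4] + struct.pack(">I", len(payload)) + payload
```

Every input with a full header that passes through this has a consistent length, which is exactly the problem: the target's "length does not match" branch is no longer exercised at all.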
> So far,
> the checksum use case seems to be the strongest one, but that's
> actually pretty rare (PNG, tar, TCP, what else?).
Lengths and magics are very common. Just by watching the stdout of
some pdf parsers I can see that the xrefs are parsed early. In some
cases a missing xref keyword is an immediate abort, so when the
startxref index is broken then there's a lot less chance something
good will happen. In .doc there are countless small things that I
won't bore you with, but they're easier to fix in the file than in the
reader. It can also become obvious from just watching your fuzzer
output. For example, at the moment I collate my crashes and I see a
huge number of aborts following a certain stack sequence. I can go and
look at that code and add a fix.
Before you jump on me, yes I know that having an xref token in a
dictionary would evolve past the missing keyword, but it will 'never'
evolve past the startxref [idx] check.
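
A file-side fix for that check could look something like the sketch below. It assumes a classic xref table (not an xref stream) and simply points the last startxref value at the last standalone `xref` keyword before it; `fix_startxref` is a hypothetical name, and a real fixup would probably want to be smarter about incremental updates.

```python
import re

def fix_startxref(pdf):
    """Point the last `startxref` offset at the last `xref` table before it."""
    trailers = list(re.finditer(rb"startxref\s+(\d+)", pdf))
    if not trailers:
        return pdf  # no trailer pointer to repair
    last = trailers[-1]
    # "xref" not preceded by "start", i.e. the table keyword itself.
    tables = [m.start()
              for m in re.finditer(rb"(?<!start)xref", pdf[:last.start()])]
    if not tables:
        return pdf  # nothing to point at; leave the input untouched
    # Only bytes after the table change, so the table's offset stays valid.
    return pdf[:last.start(1)] + str(tables[-1]).encode("ascii") + pdf[last.end(1):]
```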
So, IMHO, the argument boils down to "fixups are often premature
optimisation, and some users will do it wrong". The counterargument is
"sometimes they _are_ needed, and altering the code under test is
sometimes hard/impossible, and philosophically distasteful".
Unfortunately both sides are assessing "how often will it help"
anecdotally right now.
I can't really see any merit in the performance or implementation
wrangling on either side of the debate. Staying off disk is nice but
hardly a killer feature. If fixups are slower but 100x more likely to
hit new code then who cares? And if you can write a tool that babysits
a fuzz target then babysitting a socket is hardly going to be
daunting.
Anyway, just my 0.02.
PS: [apropos nothing] Is it just me or are the available flags not
covered (succinctly) anywhere except in the usage output from -help?
Every time I want to check the flags I end up back in the source
file...
Cheers,
ben