Hi Stefan,
Am 13.08.2018 um 17:45 schrieb Stefan Nagy:
> It used to be at 0.25x speed compared to a native compile.
> I made several changes, now it is at 0.65x speed.
>
> By "native speed", are you referring to /Dyninst/ forkserver-only
> instrumentation?
No, by "native" I meant afl-gcc source-code compilation.
I agree my wording was unclear; it was before my morning coffee :)
> It's unclear to me why instrumenting blocks where isEntryBlock() returns
> false yields any performance benefit. I noticed you also
> toggle setSaveFPR(false) and setTrampRecursive(true), but these gave me
> mixed results when I was messing with them a few months back. Could you
> please provide some detail on the three performance modes?
The original implementation instrumented every basic block, but writing
to the map for every basic block a) pollutes the map and b) adds
unnecessary overhead.
For example, in a chain a -> b -> c -> d -> e there is only a single
path, so we don't need to instrument b, c, d, ...
So which basic blocks do we want? Those with either 2+ outgoing or 2+
incoming edges. Adding one -x enables that check and increases the
speed by ~40% on average.
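The selection rule above can be sketched like this. A minimal standalone
sketch, not afl-dyninst code: the toy CFG, the block indices, and the
function name are made up for illustration.

```c
#include <assert.h>

/* Hypothetical 5-block CFG: the chain a -> b -> c -> d -> e from the
 * example above, plus one branch a -> d so that a and d qualify.
 * Block indices: a=0, b=1, c=2, d=3, e=4. */
enum { NBLOCKS = 5, NEDGES = 5 };
static const int edges[NEDGES][2] = {
    {0, 1}, {1, 2}, {2, 3}, {3, 4}, {0, 3}
};

/* Mark a block for instrumentation if it is the function entry block
 * or has 2+ incoming or 2+ outgoing edges (the -x heuristic). */
static void select_blocks(int instrument[NBLOCKS]) {
    int indeg[NBLOCKS] = {0}, outdeg[NBLOCKS] = {0};
    for (int i = 0; i < NEDGES; i++) {
        outdeg[edges[i][0]]++;
        indeg[edges[i][1]]++;
    }
    for (int b = 0; b < NBLOCKS; b++)
        instrument[b] = (b == 0) || indeg[b] >= 2 || outdeg[b] >= 2;
}
```

With this CFG, only a (entry, two successors) and d (two predecessors)
get instrumented; b, c, and e on the single path are skipped.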
Adding another -x (-xx) additionally toggles setSaveFPR(false) and
setTrampRecursive(true). This speeds things up a lot. Why? I am not so
sure :) I just tested various options on various binaries and these two
made a difference without a negative side effect.
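For reference, the two toggles are plain BPatch options; a fragment of
what -xx does (assuming the standard Dyninst BPatch API, function name
mine):

```cpp
#include "BPatch.h"

// Sketch of the -xx option toggles on the Dyninst instrumentation manager.
void configure_fast_mode(BPatch &bpatch) {
    // Skip saving/restoring floating-point registers around snippets.
    bpatch.setSaveFPR(false);
    // Omit the guard code that protects trampolines against re-entry
    // (faster, but unsafe if instrumentation can recurse).
    bpatch.setTrampRecursive(true);
}
```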
If you specify -xxx ... well, then I basically do what -xx does, but
afterwards search for the instructions Dyninst inserted and replace
them with my own.
Oh yes, this can go wrong :) It is experimental, and there is a big
warning to check that the program still works fine if you do it.
The most efficient implementation would be possible if BPatch_constExpr
supported XOR and working with arrays, but this is not the case (yet).
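For comparison, XOR plus array indexing is exactly what afl-gcc's
compile-time instrumentation does at each location; a simplified sketch
of that map update (names and sizes are illustrative, not AFL's exact
source):

```c
#include <stdint.h>

#define MAP_SIZE 65536
static uint8_t afl_map[MAP_SIZE];  /* shared coverage bitmap */
static uint16_t prev_loc;          /* shifted previous location */

/* Per-location update: hash the edge (prev -> cur) by XOR and bump
 * its counter, then store the shifted current location. */
static void afl_maybe_log(uint16_t cur_loc) {
    afl_map[cur_loc ^ prev_loc]++;
    prev_loc = cur_loc >> 1;
}
```

Emitting this as a Dyninst snippet is what the missing XOR/array support
currently prevents.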
Regards,
Marc