cur_loc = (cur_loc >> 4) ^ (cur_loc << 8); cur_loc &= MAP_SIZE - 1;
This is quick but dirty. The speed is useful here, since this runs every time the basic block executes, but I am a bit skeptical of the quality of (x >> 4) ^ (x << 8) as a hash function.
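For concreteness, here is the quoted mixer as a standalone function (a sketch; AFL's default MAP_SIZE of 2^16 is assumed). One illustration of the quality concern: after the mask, only the low 20 address bits can influence the result, so addresses that differ only above bit 19 collide deterministically.

```c
#include <assert.h>
#include <stdint.h>

#define MAP_SIZE (1u << 16)   /* AFL's default map size, assumed here */

/* The quoted mixer as a standalone function; cur_loc is the basic
   block's address. The >>4 path contributes input bits 4..19 and the
   <<8 path contributes input bits 0..7, so bits 20 and above of the
   address never reach the masked output. */
static uint32_t cur_loc_hash(uint32_t cur_loc) {
    cur_loc = (cur_loc >> 4) ^ (cur_loc << 8);
    return cur_loc & (MAP_SIZE - 1);
}
```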
I think a better approach here might be to use a (deterministic) random function such as SHA1(cur_loc) & (MAP_SIZE - 1). This would be much more expensive, but instead of executing it on every run of the block, the hash could be computed once at translation time and baked into the JIT-emitted code, amortizing the cost over the number of times the basic block is executed per compilation. Something like:
tcg_gen_movi_tl(aflCurLoc_reg, hashAddr(pc));
gen_helper_afl_maybe_log(aflCurLoc_reg);
where hashAddr(target_ulong addr) returns SHA1(addr) & (MAP_SIZE - 1), and gen_helper_afl_maybe_log generates a call to the afl helper in C.
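The amortisation idea can be sketched outside QEMU (hypothetical names throughout; hash_addr() stands in for SHA1(addr) & (MAP_SIZE - 1), using a cheap mixer so the sketch stays self-contained):

```c
#include <assert.h>
#include <stdint.h>

#define MAP_SIZE (1u << 16)   /* AFL's default map size, assumed */

static int hash_calls;  /* counts how often the expensive hash runs */

/* Stand-in for SHA1(addr) & (MAP_SIZE - 1); any strong hash works,
   since the cost is paid only at translation time. */
static uint32_t hash_addr(uint64_t addr) {
    hash_calls++;
    addr ^= addr >> 33;
    addr *= 0xff51afd7ed558ccdULL;   /* constant borrowed from fmix64 */
    addr ^= addr >> 33;
    return (uint32_t)(addr & (MAP_SIZE - 1));
}

/* A "translated block": the map index is baked in at translation
   time, like the immediate in the tcg_gen_movi_tl() call above. */
struct tb {
    uint64_t pc;
    uint32_t cur_loc;   /* precomputed hash */
};

static struct tb translate(uint64_t pc) {
    struct tb b = { pc, hash_addr(pc) };  /* hashed once, here */
    return b;
}

static uint8_t afl_area[MAP_SIZE];

/* Per-execution cost is just an array index, however expensive the
   hash was. */
static void execute(const struct tb *b) {
    afl_area[b->cur_loc]++;
}
```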
Tim
>I think a better approach here might be to use a (deterministic) random
>function such as SHA1(cur_loc) & (MAP_SIZE - 1). This would be much more
>expensive, but instead of executing this every time, it could be generated
>once into the JIT-emitted code, amortizing the cost over how many times the
>basic block is executed per compilation.
You'd get the same results as SHA-1 much faster using any kind of hash
function that provides good avalanche characteristics (avalanche = mixing of
bits, as a simplified explanation). SHA-1 is a cryptographic hash function
that provides properties like preimage resistance, which are unnecessary
here. What I would do is use something like the finaliser function from
Murmurhash:
h ^= h >> 16;
h *= 0x85ebca6b;
h ^= h >> 13;
h *= 0xc2b2ae35;
h ^= h >> 16;
or for 64-bit systems:
h ^= h >> 33;
h *= 0xff51afd7ed558ccd;
h ^= h >> 33;
h *= 0xc4ceb9fe1a85ec53;
h ^= h >> 33;
That's almost as fast as the original but gets much better avalanche.
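The two finalisers above, written out as compilable C (a sketch; the hash_addr() wrapper and MAP_SIZE masking are additions mirroring the hashAddr() idea from earlier in the thread, not part of Peter's message):

```c
#include <assert.h>
#include <stdint.h>

#define MAP_SIZE (1u << 16)   /* AFL's default map size, assumed */

/* 32-bit Murmurhash3 finaliser, exactly as quoted above. */
static uint32_t fmix32(uint32_t h) {
    h ^= h >> 16;
    h *= 0x85ebca6b;
    h ^= h >> 13;
    h *= 0xc2b2ae35;
    h ^= h >> 16;
    return h;
}

/* 64-bit variant. */
static uint64_t fmix64(uint64_t h) {
    h ^= h >> 33;
    h *= 0xff51afd7ed558ccdULL;
    h ^= h >> 33;
    h *= 0xc4ceb9fe1a85ec53ULL;
    h ^= h >> 33;
    return h;
}

/* Hypothetical drop-in for hashAddr() from earlier in the thread. */
static uint32_t hash_addr(uint64_t addr) {
    return (uint32_t)(fmix64(addr) & (MAP_SIZE - 1));
}
```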
Peter.
--
You received this message because you are subscribed to the Google Groups "afl-users" group.
>You'd get the same results as SHA-1 much faster using any kind of hash
>function that provides good avalanche characteristics (avalanche = mixing of
>bits, as a simplified explanation). SHA-1 is a cryptographic hash function
>That's almost as fast as the original but gets much better avalanche.
>Peter.
Tim, are you seeing any evidence of that? I was doing some
benchmarking with targets such as libpng and wasn't seeing huge
differences between QEMU and native maps.
/mz
>Based on those - is there a more optimal avalanche/hash function in this
>scenario?
No. You either get good avalanche or you don't: the definition is that a
single changed input bit should, on average, change 50% of the output bits,
so it doesn't matter what the input is.
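That 50% criterion is easy to check empirically. A small sketch (the GCC/Clang builtin __builtin_popcount is assumed) that flips each input bit of fmix32 and counts how many output bits change:

```c
#include <assert.h>
#include <stdint.h>

/* fmix32 from Murmurhash3, as quoted earlier in the thread. */
static uint32_t fmix32(uint32_t h) {
    h ^= h >> 16;
    h *= 0x85ebca6b;
    h ^= h >> 13;
    h *= 0xc2b2ae35;
    h ^= h >> 16;
    return h;
}

/* Flip every input bit of the first n inputs and return the average
   number of output bits that change; ideal avalanche gives ~16 of 32. */
static double avalanche_avg(uint32_t n) {
    uint64_t changed = 0, trials = 0;
    for (uint32_t x = 1; x <= n; x++) {
        uint32_t h = fmix32(x);
        for (int b = 0; b < 32; b++) {
            changed += __builtin_popcount(h ^ fmix32(x ^ (1u << b)));
            trials++;
        }
    }
    return (double)changed / trials;
}
```

The same loop run against (x >> 4) ^ (x << 8) would show far fewer output bits changing per flipped input bit.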
Replying to mz's comment, given that fmix() from Murmurhash is going to be,
at most, a few cycles slower than the current approach, it can't hurt to use
it.
The longer reply is that you'd need to benchmark things to see if it really is
an issue, but I'd just use fmix().
Peter.
Attached is an implementation of what I had in mind.
I investigated the afl-qemu-trace weirdness I saw earlier: afl/qemu traces basic blocks before they are executed, and in particular before it checks for various exceptional conditions. If an exception occurs, the afl/qemu tracer will be invoked several times for the same basic block, leading to some spurious self-edges.
NOTE: This seems like a bug in the current afl-qemu-trace and should probably be investigated further!
>or for 64-bit systems:
>h ^= h >> 33;
>h *= 0xff51afd7ed558ccd;
>h ^= h >> 33;
>h *= 0xc4ceb9fe1a85ec53;
>h ^= h >> 33;
>That's almost as fast as the original but gets much better avalanche.
map size 15 32768
edge collisions 24581 of 50252 -- 48.9%
node collisions 6912 of 24195 -- 28.6%
map size 16 65536
edge collisions 15741 of 50252 -- 31.3%
node collisions 4553 of 24195 -- 18.8%
map size 17 131072
edge collisions 9552 of 50252 -- 19.0%
node collisions 2820 of 24195 -- 11.7%
map size 18 262144
edge collisions 6295 of 50252 -- 12.5%
node collisions 2135 of 24195 -- 8.8%
map size 19 524288
edge collisions 4538 of 50252 -- 9.0%
node collisions 1993 of 24195 -- 8.2%
map size 20 1048576
edge collisions 3781 of 50252 -- 7.5%
node collisions 1989 of 24195 -- 8.2%
map size 21 2097152
edge collisions 2136 of 50252 -- 4.3%
node collisions 945 of 24195 -- 3.9%
map size 22 4194304
edge collisions 1270 of 50252 -- 2.5%
node collisions 408 of 24195 -- 1.7%
map size 23 8388608
edge collisions 777 of 50252 -- 1.5%
node collisions 159 of 24195 -- 0.7%
map size 24 16777216
edge collisions 543 of 50252 -- 1.1%
node collisions 0 of 24195 -- 0.0%
map size 15 32768
edge collisions 24517 of 50252 -- 48.8%
node collisions 7027 of 24195 -- 29.0%
map size 16 65536
edge collisions 15155 of 50252 -- 30.2%
node collisions 3939 of 24195 -- 16.3%
map size 17 131072
edge collisions 8540 of 50252 -- 17.0%
node collisions 2092 of 24195 -- 8.6%
map size 18 262144
edge collisions 4549 of 50252 -- 9.1%
node collisions 1074 of 24195 -- 4.4%
map size 19 524288
edge collisions 2425 of 50252 -- 4.8%
node collisions 541 of 24195 -- 2.2%
map size 20 1048576
edge collisions 1201 of 50252 -- 2.4%
node collisions 286 of 24195 -- 1.2%
map size 21 2097152
edge collisions 606 of 50252 -- 1.2%
node collisions 143 of 24195 -- 0.6%
map size 22 4194304
edge collisions 301 of 50252 -- 0.6%
node collisions 80 of 24195 -- 0.3%
map size 23 8388608
edge collisions 170 of 50252 -- 0.3%
node collisions 38 of 24195 -- 0.2%
map size 24 16777216
edge collisions 60 of 50252 -- 0.1%
node collisions 18 of 24195 -- 0.1%
Peter.