Dear all,
I would like to report some results that we got after running coverage experiments for cBench. The results are available in these two charts:

* https://homepages.dcc.ufmg.br/~fernando/coisas/cBenchResults/instrs-norm1.pdf
* https://homepages.dcc.ufmg.br/~fernando/coisas/cBenchResults/instrs-norm2.pdf

To compute these figures, we have used CFGGrind, a dynamic CFG reconstructor (https://github.com/rimsa/CFGgrind). What we have found
is that, for some benchmarks, the extra inputs greatly increase code
coverage. Examples of programs in this category include bzip2d,
bzip2e, office_ghostscript, security_sha and office_ispell.
However, these benchmarks seem to be more the exception than the
rule. For most of the benchmarks, the extra inputs do not seem to
increase coverage. For instance, in telecom_adpcm_d or network_patricia, the first input that we tried already yielded the maximum coverage that we could get with any input. The same holds for several other benchmarks, as you will see in the figures.
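For concreteness, one way to quantify this is to track how the set of covered instructions grows as more inputs are added, normalized by the coverage obtained with all inputs together. Below is a minimal Python sketch of that idea; the per-input coverage sets would come from CFGGrind's output, and the numbers are only illustrative (they do not correspond to the actual charts):

def coverage_growth(per_input_coverage):
    # per_input_coverage: one set of covered instruction addresses per
    # dataset/input, in the order the inputs were run (e.g., extracted
    # from CFGGrind's output).
    # Returns the cumulative coverage after each input, normalized by
    # the coverage reached with all inputs together.
    cumulative = set()
    sizes = []
    for covered in per_input_coverage:
        cumulative |= covered
        sizes.append(len(cumulative))
    total = sizes[-1] if sizes and sizes[-1] else 1
    return [s / total for s in sizes]

# A benchmark whose first input already reaches the maximum coverage
# stays flat at 1.0 (what we observed for telecom_adpcm_d):
print(coverage_growth([{1, 2, 3}, {1, 2}, {2, 3}]))  # [1.0, 1.0, 1.0]
# A benchmark where the extra inputs help keeps growing
# (what we observed for bzip2d):
print(coverage_growth([{1, 2}, {1, 2, 3}, {4}]))     # [0.5, 0.75, 1.0]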
So, for some of the benchmarks, it would be nice to see whether a more diverse set of datasets could be found. You guys can check the coverage using CFGGrind directly (see the sketch below). However, we can also run CFGGrind on any new dataset that eventually becomes available for cBench.
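Concretely, CFGGrind runs as a Valgrind tool, so checking a new dataset is just a matter of re-running the benchmark under it once per input. Here is a rough Python driver for that; the --cfg-outfile option below is an assumption, so please double-check the exact flags against the CFGGrind README:

import subprocess

# Rough sketch: run CFGGrind (a Valgrind tool) once per dataset of a
# benchmark. The --cfg-outfile option is assumed; see the CFGGrind
# README (https://github.com/rimsa/CFGgrind) for the exact flags and
# for how to post-process the generated CFGs.
def run_cfggrind(binary, datasets):
    # binary: path to the benchmark executable.
    # datasets: one list of command-line arguments per input.
    for i, args in enumerate(datasets, start=1):
        outfile = f"cfg-input{i}.out"  # hypothetical output name
        cmd = ["valgrind", "--tool=cfggrind",
               f"--cfg-outfile={outfile}", binary, *args]
        subprocess.run(cmd, check=True)

# Hypothetical example with three datasets for a bzip2-like benchmark:
run_cfggrind("./bzip2d", [["input1.bz2"], ["input2.bz2"], ["input3.bz2"]])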
Regards,
Fernando Pereira (UFMG)