Dear all,
I would like to report some results that we got after running
coverage experiments for cBench. The results are available in
these two charts:
* https://homepages.dcc.ufmg.br/~fernando/coisas/cBenchResults/instrs-norm1.pdf
* https://homepages.dcc.ufmg.br/~fernando/coisas/cBenchResults/instrs-norm2.pdf
To compute these figures, we have used CFGGrind, a dynamic CFG
reconstructor (https://github.com/rimsa/CFGgrind). What we have found
is that, for some benchmarks, the extra inputs greatly increase code
coverage. Examples of programs in this category include bzip2d,
bzip2e, office_ghostscript, security_sha and office_ispell.
However, these benchmarks seem to be more the exception than the
rule. For most of the benchmarks, the extra inputs do not seem to
increase coverage. For instance, in telecom_adpcm_d or
network_patricia, the first input that we tried already yielded the
maximum coverage that we could get with any input. Such is the case of
several other benchmarks, as you will see in the figures.
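To make the two patterns above concrete, here is a minimal Python sketch of
the cumulative-coverage accounting behind such charts. The per-input sets of
executed instruction addresses are made up for illustration; in practice they
would come from CFGGrind's output for each cBench dataset:

```python
def cumulative_coverage(executed_per_input, total_instrs):
    """Fraction of static instructions covered after each input is added."""
    covered = set()
    fractions = []
    for executed in executed_per_input:
        covered |= executed          # union with instructions seen so far
        fractions.append(len(covered) / total_instrs)
    return fractions

# Saturating pattern: the first input already reaches the maximum,
# and extra inputs add nothing (as in telecom_adpcm_d or network_patricia).
saturating = [{1, 2, 3, 4}, {1, 2}, {2, 3, 4}]
print(cumulative_coverage(saturating, 8))  # [0.5, 0.5, 0.5]

# Growing pattern: each extra input exercises new code
# (as in bzip2d or office_ispell).
growing = [{1, 2}, {3, 4}, {5, 6}]
print(cumulative_coverage(growing, 8))  # [0.25, 0.5, 0.75]
```

A benchmark whose curve is flat after the first input, like the first example,
is exactly the kind that would benefit from a more diverse set of datasets.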
So, for some of the benchmarks, it might be worth trying to find a
more diverse set of datasets. You guys can check the coverage using
CFGGrind directly. However, we can also run CFGGrind on any new
dataset that eventually becomes available for cBench.
Regards,
Fernando Pereira (UFMG)
--