Does "printf" in the target code influence the fuzzer?

marco restelli

Jun 2, 2021, 1:34:23 PM

to libf...@googlegroups.com

Hi all,
I am experimenting with libFuzzer and it seems to me that adding "printf" calls
in the target code changes the "data" array generated by the fuzzer.
Is this correct?

What are the best options to observe the covered and not covered parts of the
target code without altering the fuzzer?

Also: adding a "printf" seems to help the fuzzer discover a code segment. Are
there other, maybe better, ways to guide the fuzzer towards a given code segment?

Thank you,
Marco

Konstantin Serebryany

Jun 9, 2021, 3:21:30 PM

to mres...@gmail.com, libfuzzer

Hi Marco,

I don't see an obvious way for an added printf to change libFuzzer's behaviour, except for making the runs slower and more chatty.

Also, not sure what do you mean by "the "data" array generated by the fuzzer".

Is this something you can demonstrate on a self-contained fuzz target?

--kcc

--
You received this message because you are subscribed to the Google Groups "libfuzzer" group.
To unsubscribe from this group and stop receiving emails from it, send an email to libfuzzer+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/libfuzzer/CAHV2F1%2Bk1ZmZN1PXX5ddz-RHqo8C8Mezv70nZpGNoeBZoD7OPA%40mail.gmail.com.

marco restelli

Jun 11, 2021, 12:16:37 PM

to libfuzzer

Hi Konstantin,

thank you for your reply.

> Also, not sure what do you mean by "the "data" array generated by the fuzzer".

Here I meant the uint8_t * data array passed to the fuzz target.

But I think I should elaborate more on what I am trying.

In the code below, I consider inserting two calls to printf. The first one is

printf("%f %f\n", x, y);

and prints the input parameters to my function at each iteration, while the second one is inside the if block. I compare two builds obtained by enabling and commenting out the second call to printf.

The code is compiled with

clang -Wall --pedantic -g -ggdb -O1 -fsanitize=fuzzer,address,signed-integer-overflow test_with_printf.cpp -o test_with_printf

with

$ clang --version
clang version 13.0.0 (https://github.com/llvm/llvm-project.git 1dee479ff632ef841ca7b28485779d898dd15e84)
Target: x86_64-unknown-linux-gnu
Thread model: posix

I now run both executables with

./test_with_printf -seed=2035867768

The first printf prints different numbers depending on whether the second call to printf is commented out or not. More precisely, the output is the same for the first 337 iterations and starts to differ from iteration 338.

The results are reproducible: multiple runs of the same code version give identical results.

The preamble written by the fuzzer is also different in the two cases:

1) for the case with commented out printf call:

INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 2035867768
INFO: Loaded 1 modules (19 inline 8-bit counters): 19 [0x789040, 0x789053),
INFO: Loaded 1 PC tables (19 PCs): 19 [0x562720,0x562850),
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2 INITED cov: 8 ft: 8 corp: 1/1b exec/s: 0 rss: 30Mb
#5 NEW cov: 8 ft: 10 corp: 2/3b lim: 4 exec/s: 0 rss: 30Mb L: 2/2 MS: 3 CrossOver-ChangeBit-CrossOver-
#9 NEW cov: 8 ft: 12 corp: 3/6b lim: 4 exec/s: 0 rss: 30Mb L: 3/3 MS: 4 ShuffleBytes-ChangeBinInt-ChangeBit-InsertByte-
#13 NEW cov: 9 ft: 14 corp: 4/10b lim: 4 exec/s: 0 rss: 30Mb L: 4/4 MS: 4 InsertByte-ChangeBit-ChangeBit-CrossOver-

2) for the case with enabled printf call:

INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 2035867768
INFO: Loaded 1 modules (20 inline 8-bit counters): 20 [0x789080, 0x789094),
INFO: Loaded 1 PC tables (20 PCs): 20 [0x5627a0,0x5628e0),
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2 INITED cov: 7 ft: 7 corp: 1/1b exec/s: 0 rss: 29Mb
#5 NEW cov: 7 ft: 9 corp: 2/3b lim: 4 exec/s: 0 rss: 30Mb L: 2/2 MS: 3 CrossOver-ChangeBit-CrossOver-
#9 NEW cov: 7 ft: 11 corp: 3/6b lim: 4 exec/s: 0 rss: 30Mb L: 3/3 MS: 4 ShuffleBytes-ChangeBinInt-ChangeBit-InsertByte-
#13 NEW cov: 9 ft: 14 corp: 4/10b lim: 4 exec/s: 0 rss: 30Mb L: 4/4 MS: 4 InsertByte-ChangeBit-ChangeBit-CrossOver-

Here is the code:

#include <fenv.h>
#include <stdio.h>
#include <fuzzer/FuzzedDataProvider.h>

float f( float const x, float const y )
{
  float z = 0.0f;
  printf("%f %f\n", x, y);  // first printf: logs the inputs at each iteration
  if( ( 322.56f < x ) && ( x < 322.57f ) )
  {
    // printf("Inside the if block\n");  // second printf: enabled in one build only
    z -= x;
  }
  return z;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t * const data, size_t const size)
{
  FuzzedDataProvider provider(data, size);
  auto x = provider.ConsumeFloatingPointInRange<float>(0.0f,1000.0f);
  auto y = provider.ConsumeFloatingPoint<float>();
  f( x, y );
  return 0;
}

extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv)
{
  // Trap on invalid operations, division by zero and overflow.
  feenableexcept( FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW );
  return 0;
}

Konstantin Serebryany

Jun 11, 2021, 6:55:35 PM

to marco restelli, libfuzzer

The two binaries are substantially different, you can see it in these two lines:

INFO: Loaded 1 PC tables (19 PCs): 19 [0x562720,0x562850),

INFO: Loaded 1 PC tables (20 PCs): 20 [0x5627a0,0x5628e0),

For some reason, the printfs make the second binary have more instrumentation points. (For a detailed answer we will need to debug the SanitizerCoverage behaviour, but this difference doesn't surprise me).

More instrumentation points means different coverage feedback, which means different heuristic behaviour, which means different random mutations.

You may expect that given a fixed seed, libFuzzer will generate the same mutations on the same binary (at least, if this is not so, we'll want to fix it). But not for different binaries.
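The chain "different feedback, different corpus, different mutations" can be illustrated with a toy model. The sketch below is NOT libFuzzer's algorithm: RunToyFuzzer and its `shift` parameter are hypothetical, inputs are single bytes, and a smaller `shift` simply stands in for a binary with finer-grained instrumentation.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <set>
#include <vector>

// Toy coverage-guided loop (hypothetical, for illustration only).
// The "feature" of an input is input >> shift; inputs that produce an
// unseen feature are added to the corpus and can be picked for mutation later.
std::vector<std::uint8_t> RunToyFuzzer(std::uint32_t seed, unsigned shift,
                                       std::size_t iterations) {
    std::mt19937 rng(seed);
    std::vector<std::uint8_t> corpus{0};
    std::set<unsigned> seen{0u};  // feature of the initial input 0
    std::vector<std::uint8_t> generated;
    for (std::size_t i = 0; i < iterations; ++i) {
        // Pick a corpus entry, then flip one randomly chosen bit.
        const std::uint8_t base = corpus[rng() % corpus.size()];
        const std::uint8_t input =
            static_cast<std::uint8_t>(base ^ (1u << (rng() % 8)));
        generated.push_back(input);
        // Keep the input only if it reaches a feature not seen before.
        if (seen.insert(static_cast<unsigned>(input >> shift)).second) {
            corpus.push_back(input);
        }
    }
    return generated;
}
```

Two calls with the same seed and the same `shift` return identical sequences. With the same seed but different `shift` values, the sequences start out identical and can diverge once one run keeps an input the other discards, because the corpus size feeds into every later random pick.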

--kcc


marco restelli

Jun 12, 2021, 1:26:34 AM

to libfuzzer

On Saturday, June 12, 2021 at 00:55:35 UTC+2, konstantin....@gmail.com wrote:

> The two binaries are substantially different, you can see it in these two lines:
> INFO: Loaded 1 PC tables (19 PCs): 19 [0x562720,0x562850),
> INFO: Loaded 1 PC tables (20 PCs): 20 [0x5627a0,0x5628e0),
> For some reason, the printfs make the second binary have more instrumentation points.
> (For a detailed answer we will need to debug the SanitizerCoverage behaviour, but this
> difference doesn't surprise me).
> More instrumentation points means different coverage feedback, which means different heuristic behaviour,
> which means different random mutations.

OK, this makes sense. I have seen this with multiple experiments adding printf calls to the code, but I did not check the INFO lines that carefully. Then the message here for me is that if those lines differ I should not expect the fuzzer runs to be exactly comparable.
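This check can be automated by reading the PC count out of the "Loaded ... PC tables" line of each run. A minimal sketch follows; the helper names are hypothetical and the parser assumes the line has exactly the shape shown in the logs above.

```cpp
#include <cstdio>
#include <string>

// Extracts N from a line shaped like
//   "INFO: Loaded 1 PC tables (19 PCs): 19 [0x562720,0x562850),"
// Returns -1 if the line does not match. (Hypothetical helper.)
long PcCountFromInfoLine(const std::string& line) {
    unsigned tables = 0;
    long pcs = 0;
    const int matched = std::sscanf(line.c_str(),
                                    "INFO: Loaded %u PC tables (%ld PCs)",
                                    &tables, &pcs);
    return matched == 2 ? pcs : -1;
}

// Two runs are only candidates for an exact comparison if both lines
// parse and report the same number of instrumentation points.
bool SameInstrumentation(const std::string& a, const std::string& b) {
    const long pa = PcCountFromInfoLine(a);
    return pa >= 0 && pa == PcCountFromInfoLine(b);
}
```

Equal counts do not prove that two binaries are identical; differing counts are enough to say the runs are not comparable.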

> You may expect that given a fixed seed, libFuzzer will generate the same mutations
> on the same binary (at least, if this is not so, we'll want to fix it).
> But not for different binaries.

OK. Now: I would like to know which code lines are covered and which are not, possibly also knowing how many times a given line has been hit. What would be the best option to get this information, without altering the target binaries?

Thank you,

Marco

Konstantin Serebryany

Jun 14, 2021, 8:57:41 PM

to marco restelli, libfuzzer

>> You may expect that given a fixed seed, libFuzzer will generate the same mutations
>> on the same binary (at least, if this is not so, we'll want to fix it).
>> But not for different binaries.

> OK. Now: I would like to know which code lines are covered and which
> not, possibly knowing also how many times a given line has been hit.
> What would be the best option to get this information, without
> altering the target binaries?

Check here: https://github.com/google/fuzzing/blob/master/tutorial/libFuzzerTutorial.md#visualizing-coverage
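One way to collect per-line hit counts while leaving the fuzzing binary untouched is to build a second, coverage-only binary and replay the saved corpus through the same entry point. The sketch below assumes such a binary is compiled with Clang's source-based coverage flags (-fprofile-instr-generate -fcoverage-mapping) and that the resulting profile is processed with llvm-profdata merge and llvm-cov show. ReplayCorpus and FuzzTarget are hypothetical names, and this is not necessarily the exact procedure of the linked tutorial.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Same signature as LLVMFuzzerTestOneInput.
using FuzzTarget = int (*)(const std::uint8_t*, std::size_t);

// Feeds the bytes of each file to the target exactly once and returns
// the number of files that could be read. (Hypothetical helper.)
std::size_t ReplayCorpus(const std::vector<std::string>& paths,
                         FuzzTarget target) {
    std::size_t replayed = 0;
    for (const std::string& path : paths) {
        std::ifstream in(path, std::ios::binary);
        if (!in) continue;  // skip files that cannot be opened
        const std::vector<char> bytes((std::istreambuf_iterator<char>(in)),
                                      std::istreambuf_iterator<char>());
        target(reinterpret_cast<const std::uint8_t*>(bytes.data()),
               bytes.size());
        ++replayed;
    }
    return replayed;
}
```

A main function in the coverage build would pass the corpus file names from argv together with LLVMFuzzerTestOneInput. Since no mutation happens during replay, the counts reflect only the inputs already in the corpus.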

marco restelli

Jul 16, 2021, 3:33:04 AM

to libfuzzer

2021-06-15 2:57 GMT+02:00, Konstantin Serebryany
<konstantin....@gmail.com>:

Hi Konstantin,
I could finally get back to your suggestion, and indeed it works
very well, thank you!

Marco
