Does "printf" in the target code influence the fuzzer?

marco restelli

Jun 2, 2021, 1:34:23 PM
to libf...@googlegroups.com
Hi all,
I am experimenting with libFuzzer, and it seems to me that adding
"printf" calls in the target code changes the "data" array generated
by the fuzzer. Is this correct?

What are the best options to observe the covered and uncovered parts of the
target code without altering the fuzzer?

Also: adding a "printf" seems to help the fuzzer discover a code segment. Are
there other, maybe better, ways to guide the fuzzer towards a given
code segment?

Thank you,
Marco

Konstantin Serebryany

Jun 9, 2021, 3:21:30 PM
to mres...@gmail.com, libfuzzer
Hi Marco, 

I don't see an obvious way in which an added printf would change libFuzzer's behaviour,
except for making the runs slower and more chatty.
Also, I'm not sure what you mean by "the "data" array generated by the fuzzer".

Is this something you can demonstrate on a self-contained fuzz target? 

--kcc 


--
You received this message because you are subscribed to the Google Groups "libfuzzer" group.
To unsubscribe from this group and stop receiving emails from it, send an email to libfuzzer+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/libfuzzer/CAHV2F1%2Bk1ZmZN1PXX5ddz-RHqo8C8Mezv70nZpGNoeBZoD7OPA%40mail.gmail.com.

marco restelli

Jun 11, 2021, 12:16:37 PM
to libfuzzer
Hi Konstantin,

thank you for your reply.

> Also, I'm not sure what you mean by "the "data" array generated by the fuzzer".

Here I meant the   uint8_t * data   array passed to the fuzz target.
But I think I should elaborate more on what I am trying.

In the code below, I consider inserting two calls to printf: the
first one is  
  printf("%f %f\n", x, y);
and prints the input parameters to my function at each iteration,
while the second one is inside the if block. I compare two builds
obtained enabling and commenting out the second call to printf.

The code is compiled with 

clang -Wall --pedantic -g -ggdb -O1 -fsanitize=fuzzer,address,signed-integer-overflow test_with_printf.cpp -o test_with_printf

with 

$ clang --version
clang version 13.0.0 (https://github.com/llvm/llvm-project.git 1dee479ff632ef841ca7b28485779d898dd15e84)
Target: x86_64-unknown-linux-gnu
Thread model: posix

I now run both executables with 

./test_with_printf -seed=2035867768 

The first printf prints different numbers depending on whether the second
call to printf is commented out or not. More precisely, the output is
the same for the first 337 iterations and starts to differ from
iteration 338.

The results are reproducible: multiple runs of the same code version
give identical results.

The preamble written by the fuzzer is also different in the two cases:

1) for the case with commented out printf call:

INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 2035867768
INFO: Loaded 1 modules   (19 inline 8-bit counters): 19 [0x789040, 0x789053), 
INFO: Loaded 1 PC tables (19 PCs): 19 [0x562720,0x562850), 
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2 INITED cov: 8 ft: 8 corp: 1/1b exec/s: 0 rss: 30Mb
#5 NEW    cov: 8 ft: 10 corp: 2/3b lim: 4 exec/s: 0 rss: 30Mb L: 2/2 MS: 3 CrossOver-ChangeBit-CrossOver-
#9 NEW    cov: 8 ft: 12 corp: 3/6b lim: 4 exec/s: 0 rss: 30Mb L: 3/3 MS: 4 ShuffleBytes-ChangeBinInt-ChangeBit-InsertByte-
#13 NEW    cov: 9 ft: 14 corp: 4/10b lim: 4 exec/s: 0 rss: 30Mb L: 4/4 MS: 4 InsertByte-ChangeBit-ChangeBit-CrossOver-

2) for the case with enabled printf call

INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 2035867768
INFO: Loaded 1 modules   (20 inline 8-bit counters): 20 [0x789080, 0x789094), 
INFO: Loaded 1 PC tables (20 PCs): 20 [0x5627a0,0x5628e0), 
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2 INITED cov: 7 ft: 7 corp: 1/1b exec/s: 0 rss: 29Mb
#5 NEW    cov: 7 ft: 9 corp: 2/3b lim: 4 exec/s: 0 rss: 30Mb L: 2/2 MS: 3 CrossOver-ChangeBit-CrossOver-
#9 NEW    cov: 7 ft: 11 corp: 3/6b lim: 4 exec/s: 0 rss: 30Mb L: 3/3 MS: 4 ShuffleBytes-ChangeBinInt-ChangeBit-InsertByte-
#13 NEW    cov: 9 ft: 14 corp: 4/10b lim: 4 exec/s: 0 rss: 30Mb L: 4/4 MS: 4 InsertByte-ChangeBit-ChangeBit-CrossOver-


Here is the code:


#include <fenv.h>
#include <stdio.h>

#include <fuzzer/FuzzedDataProvider.h>

float f( float const x, float const y )
{
  float z = 0.0f;

  printf("%f %f\n", x, y);
  
  if( ( 322.56f < x ) && ( x < 322.57f ) )
  {
    // printf("Inside the if block\n");
    z -= x;
  }

  return z;
}


extern "C" int LLVMFuzzerTestOneInput(const uint8_t * const data, size_t const size)
{
  FuzzedDataProvider provider(data, size);

  auto x = provider.ConsumeFloatingPointInRange<float>(0.0f,1000.0f);
  auto y = provider.ConsumeFloatingPoint<float>();

  f( x, y );

  return 0;
}


extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) 
{
  feenableexcept( FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW );
  return 0;
}


Konstantin Serebryany

Jun 11, 2021, 6:55:35 PM
to marco restelli, libfuzzer
The two binaries are substantially different; you can see it in these two lines:
INFO: Loaded 1 PC tables (19 PCs): 19 [0x562720,0x562850), 
INFO: Loaded 1 PC tables (20 PCs): 20 [0x5627a0,0x5628e0), 
For some reason, the printfs make the second binary have more instrumentation points. 
(For a detailed answer we will need to debug the SanitizerCoverage behaviour, but this
difference doesn't surprise me). 
More instrumentation points means different coverage feedback, which means different heuristic behaviour, 
which means different random mutations. 

You may expect that given a fixed seed, libFuzzer will generate the same mutations
on the same binary (at least, if this is not so, we'll want to fix it). 
But not for different binaries. 

--kcc 
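
The chain of reasoning above (different instrumentation, so different coverage feedback, so different mutations) can be illustrated with a toy model. This is only a minimal sketch and not libFuzzer's actual algorithm: the function `RunToyFuzzer`, its one-bit-flip mutation, and its "feature" computation are all hypothetical. The point it shows is that the random number generator is seeded identically, yet the generated inputs depend on which earlier inputs the feedback decided to keep.

```cpp
#include <cstdint>
#include <random>
#include <set>
#include <vector>

// Toy model (hypothetical, NOT libFuzzer's real logic): returns the sequence
// of one-byte inputs produced from a fixed seed. `num_counters` stands in for
// the number of instrumentation points compiled into the binary.
std::vector<uint8_t> RunToyFuzzer(uint32_t seed, unsigned num_counters,
                                  int iterations) {
  std::mt19937 rng(seed);            // deterministic for a given seed
  std::vector<uint8_t> corpus{0};    // start from a single trivial input
  std::set<unsigned> seen_features;  // coverage features observed so far
  std::vector<uint8_t> generated;

  for (int i = 0; i < iterations; ++i) {
    // Pick a parent from the corpus and flip one random bit.
    uint8_t parent = corpus[rng() % corpus.size()];
    uint8_t child = parent ^ static_cast<uint8_t>(1u << (rng() % 8));
    generated.push_back(child);

    // Fake coverage feedback: which counter this input would hit.
    unsigned feature = child % num_counters;

    // Keep the input only if it triggers a feature not seen before.
    if (seen_features.insert(feature).second) corpus.push_back(child);
  }
  return generated;
}
```

Calling `RunToyFuzzer` twice with the same seed and the same `num_counters` returns identical sequences. Changing only `num_counters` (for example from 19 to 20) can be expected to make the sequences agree for a while and then diverge, because the corpus contents start to differ once the feedback differs.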


marco restelli

Jun 12, 2021, 1:26:34 AM
to libfuzzer
On Saturday, June 12, 2021 at 00:55:35 UTC+2, konstantin....@gmail.com wrote:

> The two binaries are substantially different; you can see it in these two lines:
> INFO: Loaded 1 PC tables (19 PCs): 19 [0x562720,0x562850), 
> INFO: Loaded 1 PC tables (20 PCs): 20 [0x5627a0,0x5628e0), 
> For some reason, the printfs make the second binary have more instrumentation points. 
> (For a detailed answer we will need to debug the SanitizerCoverage behaviour, but this
> difference doesn't surprise me). 
> More instrumentation points means different coverage feedback, which means different heuristic behaviour, 
> which means different random mutations. 

OK, this makes sense. I have seen this in multiple experiments
adding printf calls to the code, but I did not check the INFO lines
that carefully. So the takeaway for me is that if those lines
differ, I should not expect the fuzzer runs to be exactly comparable.

> You may expect that given a fixed seed, libFuzzer will generate the same mutations
> on the same binary (at least, if this is not so, we'll want to fix it). 
> But not for different binaries. 

OK. Now: I would like to know which code lines are covered and which
not, possibly knowing also how many times a given line has been hit.
What would be the best option to get this information, without
altering the target binaries?

Thank you,
   Marco

Konstantin Serebryany

Jun 14, 2021, 8:57:41 PM
to marco restelli, libfuzzer


> You may expect that given a fixed seed, libFuzzer will generate the same mutations
> on the same binary (at least, if this is not so, we'll want to fix it). 
> But not for different binaries. 

> OK. Now: I would like to know which code lines are covered and which
> not, possibly knowing also how many times a given line has been hit.
> What would be the best option to get this information, without
> altering the target binaries?
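
One possible way to get per-line hit counts while leaving the fuzzing binary untouched is sketched below. It is an assumption and not taken from this thread: build a second, separate binary from the same sources with Clang's source-based coverage flags, replay the corpus that the fuzzer saved through it, and inspect the result with llvm-cov. The helper `ReplayCorpusFile` and the `FuzzTarget` alias are hypothetical names introduced for illustration; in a real replay driver the callback would be `LLVMFuzzerTestOneInput`.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Signature of a libFuzzer-style fuzz target such as LLVMFuzzerTestOneInput.
using FuzzTarget = int (*)(const uint8_t *, size_t);

// Hypothetical helper: feeds one saved corpus file to the fuzz target.
// Returns false if the file cannot be opened.
//
// Assumed workflow for the separate coverage build:
//   compile with  -fprofile-instr-generate -fcoverage-mapping
//   run the replay binary over the corpus files (writes default.profraw)
//   llvm-profdata merge -sparse default.profraw -o default.profdata
//   llvm-cov show ./replay_binary -instr-profile=default.profdata
bool ReplayCorpusFile(const std::string &path, FuzzTarget target) {
  std::ifstream in(path, std::ios::binary);
  if (!in) return false;

  // Read the whole file into memory as raw bytes.
  std::vector<uint8_t> bytes((std::istreambuf_iterator<char>(in)),
                             std::istreambuf_iterator<char>());

  // Run the target exactly as the fuzzer would for this input.
  target(bytes.data(), bytes.size());
  return true;
}
```

Because the coverage build is a different binary, a fuzzing run on it is not expected to match a run on the original one; it is used here only to replay inputs that the original binary already produced.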


marco restelli

Jul 16, 2021, 3:33:04 AM
to libfuzzer
2021-06-15 2:57 GMT+02:00, Konstantin Serebryany
<konstantin....@gmail.com>:
Hi Konstantin,
I was finally able to get back to your suggestion, and indeed it works
very well, thank you!

Marco