Retrieving coverage information in libFuzzer

Jonas Möller

May 19, 2022, 1:12:02 PM
to libf...@googlegroups.com
Hello,

as part of a research project I am currently trying to port Nezha [1], a differential fuzzing framework based on libfuzzer, to a more recent LLVM version. Unfortunately, Nezha uses the deprecated -fsanitize-coverage=trace-pc instrumentation to get coverage information. This has been replaced by -fsanitize-coverage=pc-table (set implicitly by -fsanitize=fuzzer). If I am not mistaken, the new ModulePCTable reports frequency based coverage information for each PC and module. Is this correct so far?

Furthermore, I want to extend Nezha's coverage metric using execution path information. As far as I can tell, this is not possible with the PC table, since I could not distinguish between the calls "foo-bar-foo" and "foo-foo-bar". Would it instead be possible to hook into the __sanitizer_cov_trace_* callbacks (thus, basically extending TPC.HandleCmp to record execution traces)? Or are there edge cases where the PCTable would record coverage which would not be reported by the __sanitizer_cov_trace_* callbacks (e.g. PCTable reports edge-level coverage while the callbacks only report bb-level coverage)?

I am grateful for any help or suggestions.

Sincerely,
Jonas

[1] https://github.com/nezha-dt/nezha

Konstantin Serebryany

May 23, 2022, 3:40:47 PM
to Jonas Möller, libfuzzer
Hi Jonas, 


On Thu, May 19, 2022 at 10:12 AM Jonas Möller <jo.mo...@tu-braunschweig.de> wrote:
Hello,

as part of a research project I am currently trying to port Nezha [1], a differential fuzzing framework based on libfuzzer, to a more recent LLVM version.

I am a fan of the Nezha approach!! 
>> NEZHA exploits the behavioral asymmetries between multiple test programs to focus on inputs that are more likely to trigger semantic bugs.
(purely based on reading the paper, I haven't actually tried it in full)
 
Unfortunately, Nezha uses the deprecated -fsanitize-coverage=trace-pc instrumentation to get coverage information. This has been replaced by -fsanitize-coverage=pc-table (set implicitly by -fsanitize=fuzzer).

pc-table is not an instrumentation, it simply creates a table to be used with either =trace-pc-guard or =inline-8bit-counters. 
-fsanitize=fuzzer uses =inline-8bit-counters,pc-table
 
If I am not mistaken, the new ModulePCTable reports frequency based coverage information for each PC and module. Is this correct so far?

err. Not sure I understand this :( 

 
Furthermore, I want to extend Nezha's coverage metric using execution path information. As far as I can tell, this is not possible with the PC table, since I could not distinguish between the calls "foo-bar-foo" and "foo-foo-bar".

Right, =inline-8bit-counters,pc-table can't give you paths. 
=trace-pc or =trace-pc-guard can. 

 
Would it instead be possible to hook into the __sanitizer_cov_trace_* callbacks (thus, basically extending TPC.HandleCmp to record execution traces)?

Mmm. That would be a very indirect way to get paths. 
I'd rely on =trace-pc-guard instead. 
 
Or are there edge cases where the PCTable would record coverage which would not be reported by the __sanitizer_cov_trace_* callbacks (e.g. PCTable reports edge-level coverage while the callbacks only report bb-level coverage)?

I am grateful for any help or suggestions.

We are very close to open-sourcing another fuzzing engine where what you want might be easier to achieve. 
BTW, implementing something Nezha-like in that engine is on my list :) 
Stay tuned for details. 

 

Sincerely,
Jonas

[1] https://github.com/nezha-dt/nezha

Jonas Möller

May 23, 2022, 8:26:21 PM
to libfuzzer, Konstantin Serebryany

Hello Konstantin,

thanks for your response!

If I am not mistaken, the new ModulePCTable reports frequency based
coverage information for each PC and module. Is this correct so far?
err. Not sure I understand this :(

Just for clarification: I was referring to the "ModulePCTable" variable in the "TracePC"-class (FuzzerTracePC.cpp) and more specifically the UpdateObservedPCs function:

for (size_t i = 0; i < NumModules; i++) {
  auto &M = Modules[i];
  for (size_t r = 0; r < M.NumRegions; r++) {
    auto &R = M.Regions[r];
    if (!R.Enabled) continue;
    for (uint8_t *P = R.Start; P < R.Stop; P++)
      if (*P) {  // nonzero counter: the corresponding PC was hit
        const PCTableEntry *TE = &ModulePCTable[i].Start[M.Idx(P)];
        Observe(TE);
      }
  }
}

My interpretation of this code is that each region of a module references a section of an array (delimited by R.Start and R.Stop). Each item in the array (read via *P) holds the number of times the respective PC was executed. So if *P == 0, the corresponding PC has not been reached during execution.

Or, as a debug explanation:

printf("PC: %lu has been called %d time(s)\n", ModulePCTable[i].Start[M.Idx(P)].PC, *P);

So, as you said in your response, I am not able to get execution paths (e.g. foo-bar-foo), but only frequency information (e.g. foo: 2, bar: 1) from this information.
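To make the difference concrete, here is a minimal sketch of why counters cannot separate the two orderings while an ordered trace can. ToCounters and PathHash are hypothetical helper names chosen for illustration; they are not part of libFuzzer or Nezha, and the FNV-1a hash is just one possible order-sensitive signature.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: collapse an ordered trace into per-function hit
// counts. This is the only information counter-based coverage retains.
std::map<std::string, uint8_t> ToCounters(const std::vector<std::string> &trace) {
  std::map<std::string, uint8_t> counters;
  for (const auto &name : trace)
    counters[name]++;
  return counters;
}

// Hypothetical helper: an order-sensitive signature of the trace, computed
// with 64-bit FNV-1a over the names plus a separator byte after each name.
uint64_t PathHash(const std::vector<std::string> &trace) {
  uint64_t h = 1469598103934665603ULL;  // FNV-1a offset basis
  for (const auto &name : trace) {
    for (unsigned char c : name) {
      h ^= c;
      h *= 1099511628211ULL;  // FNV-1a prime
    }
    h ^= 0xff;  // separator so that name boundaries affect the hash
    h *= 1099511628211ULL;
  }
  return h;
}
```

For the traces foo-bar-foo and foo-foo-bar, ToCounters yields the same map (foo: 2, bar: 1) in both cases, whereas the two PathHash values are computed over different byte sequences and should differ.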

pc-table is not an instrumentation, it simply creates a table to be used
with either =trace-pc-guard or =inline-8bit-counters.
-fsanitize=fuzzer uses =inline-8bit-counters,pc-table

A little bit off topic: I am a little bit confused by this. I found that trace-pc-guard has been deprecated since commit 50a1c697127749eec567d14819d549b63af1242f and has been replaced with pc-table. Unfortunately, I have not found a reason for this switch. If both can be used (seemingly interchangeably) why was pc-table preferred?

As far as I understand your explanation, it is possible to use pc-table and trace-pc-guard in conjunction, but trace-pc-guard cannot be used with inline-8bit-counters? Or in other words, both trace-pc-guard and inline-8bit-counters can be used to populate the pc-table?

Would it instead be possible to hook into the __sanitizer_cov_trace_*
callbacks (thus, basically extending TPC.HandleCmp to record execution
traces)?
Mmm. That would be a very indirect way to get paths.
I'd rely on =trace-pc-guard instead.

Since libFuzzer currently uses inline-8bit-counters (and this method is incompatible with trace-pc-guard) wouldn't this require a sizeable rewriting of the tracing logic to populate the pc-table? I would like to avoid rewriting large sections and keep the changes to a minimum. Or is there something I am missing?

We are very close to open-sourcing another fuzzing engine where what you
want might be easier to achieve.
BTW, implementing something Nezha-like in that engine is on my list :)
Stay tuned for details.

I am definitely curious about the result :)


Sincerely,

Jonas


PS: I hope this reaches the mailing list correctly and does not create a new thread.

Konstantin Serebryany

May 25, 2022, 12:45:02 PM
to Jonas Möller, libfuzzer
On Mon, May 23, 2022 at 5:26 PM Jonas Möller <jo.mo...@tu-braunschweig.de> wrote:

Hello Konstantin,

thanks for your response!

If I am not mistaken, the new ModulePCTable reports frequency based
coverage information for each PC and module. Is this correct so far?
err. Not sure I understand this :(

Just for clarification: I was referring to the "ModulePCTable" variable in the "TracePC"-class (FuzzerTracePC.cpp) and more specifically the UpdateObservedPCs function:

for (size_t i = 0; i < NumModules; i++) {
  auto &M = Modules[i];
  for (size_t r = 0; r < M.NumRegions; r++) {
    auto &R = M.Regions[r];
    if (!R.Enabled) continue;
    for (uint8_t *P = R.Start; P < R.Stop; P++)
      if (*P) {  // nonzero counter: the corresponding PC was hit
        const PCTableEntry *TE = &ModulePCTable[i].Start[M.Idx(P)];
        Observe(TE);
      }
  }
}

My interpretation of this code is that each region of a module references a section of an array (delimited by R.Start and R.Stop). Each item in the array (read via *P) holds the number of times the respective PC was executed. So if *P == 0, the corresponding PC has not been reached during execution.

Or, as a debug explanation:

printf("PC: %lu has been called %d time(s)\n", ModulePCTable[i].Start[M.Idx(P)].PC, *P);

So, as you said in your response, I am not able to get execution paths (e.g. foo-bar-foo), but only frequency information (e.g. foo: 2, bar: 1) from this information.


Right. These are counters and nothing more.
 

pc-table is not an instrumentation, it simply creates a table to be used
with either =trace-pc-guard or =inline-8bit-counters.
-fsanitize=fuzzer uses =inline-8bit-counters,pc-table

A little bit off topic: I am a little bit confused by this. I found that trace-pc-guard has been deprecated since commit 50a1c697127749eec567d14819d549b63af1242f and has been replaced with pc-table. Unfortunately, I have not found a reason for this switch.


Performance. 
If we only need the counters, then using inline-8bit-counters is much faster. 
trace-pc-guard was removed from -fsanitize=fuzzer, but it's not deprecated as an instrumentation mechanism available in SanitizerCoverage.
 

If both can be used (seemingly interchangeably) why was pc-table preferred?

As far as I understand your explanation, it is possible to use pc-table and trace-pc-guard in conjunction, but trace-pc-guard cannot be used with inline-8bit-counters?


It is entirely possible to use both at the same time. 

% cat cov.cc
#include <stdio.h>
__attribute__((noinline))
void foo() { printf("foo\n"); }

int main(int argc, char **argv) {
  if (argc == 2)
    foo();
  printf("main\n");
}
% clang  -O2 -fsanitize-coverage=inline-8bit-counters,trace-pc-guard,pc-table  cov.cc
% objdump -d a.out  | grep -A 20  main.:
00000000004280b0 <main>:
  4280b0:       53                      push   %rbx
  4280b1:       89 fb                   mov    %edi,%ebx
  4280b3:       bf 84 fb 43 00          mov    $0x43fb84,%edi
  4280b8:       e8 43 1b ff ff          call   419c00 <__sanitizer_cov_trace_pc_guard>   <<<<< trace-pc-guard 
  4280bd:       80 05 cd 7a 01 00 01    addb   $0x1,0x17acd(%rip)        # 43fb91 <__start___sancov_cntrs+0x1>  <<<<< inline-8bit-counters

libFuzzer currently doesn't do it, but nothing prevents you from doing it. 

But, if you already inject calls into the code via trace-pc-guard, there is little point in using inline-8bit-counters
because you can increment counters inside the trace-pc-guard callback. 
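As a rough illustration of that last point, the sketch below shows a pair of trace-pc-guard callbacks that both bump a per-guard counter and append the guard id to an ordered path buffer. The two callback signatures are the ones documented for SanitizerCoverage; everything else (the buffer names, the capacities, the saturating counter) is an assumption made for this example and is not how libFuzzer does it. Fixed-size static arrays are used so that nothing depends on C++ static-initialization order, since the init callback may run very early.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed capacities for this sketch; a real tool would size these differently.
constexpr size_t kMaxGuards = 1 << 16;
constexpr size_t kMaxPathLen = 1 << 20;

static uint8_t g_counters[kMaxGuards];  // g_counters[id]: saturating hit count
static uint32_t g_path[kMaxPathLen];    // guard ids in execution order
static size_t g_path_len = 0;
static uint32_t g_next_id = 0;

// Called once per instrumented module; assigns a nonzero id to every guard.
extern "C" void __sanitizer_cov_trace_pc_guard_init(uint32_t *start,
                                                    uint32_t *stop) {
  if (start == stop || *start) return;  // empty or already initialized
  for (uint32_t *g = start; g < stop; g++) *g = ++g_next_id;
}

// Called on every instrumented edge.
extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
  uint32_t id = *guard;
  if (id == 0 || id >= kMaxGuards) return;           // unset or out of range
  if (g_counters[id] != 255) g_counters[id]++;       // frequency information
  if (g_path_len < kMaxPathLen) g_path[g_path_len++] = id;  // path information
}
```

The file defining these callbacks has to be compiled without -fsanitize-coverage, otherwise the callbacks themselves would be instrumented. The target is then built with -fsanitize-coverage=trace-pc-guard and linked against it; after a run, g_path holds the edge sequence and g_counters the same kind of frequency data that inline-8bit-counters would have provided.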


Or in other words, both trace-pc-guard and inline-8bit-counters can be used to populate the pc-table?


pc-table is populated at compile/link time and holds static information about the PCs that are instrumented by either (or both) of trace-pc-guard and inline-8bit-counters.
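For illustration, a minimal sketch of how a runtime could receive that static table. __sanitizer_cov_pcs_init is the callback documented for -fsanitize-coverage=pc-table; it is handed an array of (PC, PCFlags) pairs, where bit 0 of PCFlags marks a function entry. The globals and the CountFunctionEntries helper are hypothetical names for this example, and the sketch only remembers the table of the most recently registered module.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical storage for this sketch: the table of the last module seen.
static const uintptr_t *g_pcs_beg = nullptr;
static size_t g_num_pcs = 0;

// Called once per instrumented module with its static PC table.
// The table is a flat array of pairs: [PC0, Flags0, PC1, Flags1, ...].
extern "C" void __sanitizer_cov_pcs_init(const uintptr_t *pcs_beg,
                                         const uintptr_t *pcs_end) {
  g_pcs_beg = pcs_beg;
  g_num_pcs = static_cast<size_t>(pcs_end - pcs_beg) / 2;
}

// Hypothetical helper: count the entries flagged as function entry points.
size_t CountFunctionEntries() {
  size_t n = 0;
  for (size_t i = 0; i < g_num_pcs; i++)
    if (g_pcs_beg[2 * i + 1] & 1) n++;  // PCFlags bit 0: function entry
  return n;
}
```

Nothing in this table changes while the program runs. Entry i describes the same instrumented location as guard i or counter i of that module, which is what lets a runtime map a dynamic hit back to a PC.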
 

Would it instead be possible to hook into the __sanitizer_cov_trace_*
callbacks (thus, basically extending TPC.HandleCmp to record execution
traces)?
Mmm. That would be a very indirect way to get paths.
I'd rely on =trace-pc-guard instead.

Since libFuzzer currently uses inline-8bit-counters (and this method is incompatible with trace-pc-guard) wouldn't this require a sizeable rewriting of the tracing logic to populate the pc-table? I would like to avoid rewriting large sections and keep the changes to a minimum. Or is there something I am missing?


It should be possible to make relatively local modifications in libFuzzer to support trace-pc-guard. 
(But we won't have cycles to do code review for that, sorry)