Questions around CMP instruction tracing


Niklas Goegge

Dec 13, 2022, 3:35:13 PM
to libfuzzer
Hi,

I have been digging through the libFuzzer code and have some questions around the CMP instruction tracing that is done.

Afaict, LLVM's code coverage instrumentation (SanitizerCoverage) is used to trace CMP instructions as well as memcmp/strcmp calls. There are three tables that store recent compares (i.e. TracePC::TORC4, TracePC::TORC8 and TracePC::TORCW), which are (sometimes) used to fill the auto dictionary (i.e. MutationDispatcher::PersistentAutoDictionary) as part of the default mutations. If I am not misreading the code, each table can hold only 32 compares, meaning some compares get overwritten while a fuzz target is running (which makes sense, since we want to store *recent* compares).

Questions:

1. From what I gathered, tracing for memcmp and strcmp only happens while the actual fuzz target (user callback) is running (because libFuzzer options were otherwise leaking into the auto dict https://bugs.llvm.org/show_bug.cgi?id=37047). Wouldn't the same make sense for CMP instruction tracing?

2. If I am using a custom mutator, are compares from my mutator stored in the TORC* tables? If yes, then that could be counterproductive if my mutator also uses "LLVMFuzzerMutate", right?

3. Since only 32 compares can be stored in each table, isn't there a risk that a target like the one below would end up overwriting all the interesting compares of "fuzz_me", if "cleanup" for some reason is doing a bunch of CMP instructions?
```
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  setup();
  fuzz_me(Data, Size);
  cleanup();
  return 0;
}
```

4. More generally, how "bad" is it if compares from code unrelated to the code that should be fuzzed, leak into the three tables of recent compares?

Thanks,
Niklas

Kostya Serebryany

Dec 14, 2022, 3:20:51 PM
to Niklas Goegge, libfuzzer
Hi Niklas, 

First, a quick note on the state of libFuzzer: we have stopped our development efforts there and switched to https://github.com/google/centipede
Its logic for using CMPs is a bit different, see https://github.com/google/centipede/blob/main/runner_sancov.cc

On Tue, Dec 13, 2022 at 12:37 PM Niklas Goegge <n.go...@gmail.com> wrote:
Hi,

I have been digging through the libFuzzer code and have some questions around the CMP instruction tracing that is done.

Afaiu, LLVM's code coverage instrumentation (SanitizerCoverage) is used to trace CMP instructions

Yes. 
 
as well as memcmp/strcmp calls.

This is not handled by SanitizerCoverage, but by interceptors in the sanitizer/libFuzzer runtime.
The same holds for Centipede.
 
There are three tables that store recent compares (i.e. TracePC::TORC4, TracePC::TORC8 and TracePC::TORCW), which are used to (sometimes) fill the auto dictionary (i.e. MutationDispatcher::PersistentAutoDictionary) as part of the default mutations. If I am not misreading the code, then all tables can only hold 32 compares, meaning some compares get overwritten while a fuzz target is running (makes sense since we want to store *recent* compares).

Yes, that was my logic behind the current code.
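
To make the "recent compares" idea concrete, here is a rough sketch of a fixed-size table in which a new entry simply replaces whatever occupied its slot. It is a simplification written for this thread, not a copy of libFuzzer's source: the names `RecentCompares`, `Insert` and `Get` are illustrative, and how the caller picks the slot index is left open as an assumption.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified sketch of a fixed-size table of recent compares.
// All names here are illustrative, not libFuzzer's actual API.
template <typename T, std::size_t kSize>
struct RecentCompares {
  struct Pair {
    T A, B;
  };

  // Store the operand pair in slot (Idx % kSize).
  // Whatever was stored in that slot before is lost.
  void Insert(std::size_t Idx, const T &Arg1, const T &Arg2) {
    Pair &Slot = Table[Idx % kSize];
    Slot.A = Arg1;
    Slot.B = Arg2;
  }

  // Read back the pair currently stored in slot (Idx % kSize).
  Pair Get(std::size_t Idx) const { return Table[Idx % kSize]; }

  Pair Table[kSize] = {};
};
```

With 32 slots, two inserts whose indices are congruent modulo 32 land in the same slot, so the second silently replaces the first. That is the overwriting behaviour discussed above.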
 

Questions:

1. Tracing for memcmp and strcmp only happens while the actual fuzz target (user callback) is running (because libFuzzer options were otherwise leaking into the auto dict https://bugs.llvm.org/show_bug.cgi?id=37047). Wouldn't the same make sense for CMP instruction tracing?

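libFuzzer's own code is not instrumented with sancov, so the CMP callbacks never fire for it. The interceptors, on the other hand, kick in even while libFuzzer is doing its own computations, which is why they need the extra check.
(This is not the case for Centipede, since Centipede is fully out-of-process.)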
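
The difference can be sketched as a hook that gates itself on a "user callback is running" flag. This is only an illustration under stated assumptions: `g_running_user_callback`, `g_recorded_words`, `OnMemcmp` and `RunUserCallback` are hypothetical names, and `OnMemcmp` stands in for whatever the real interceptor would call.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical state, named for this sketch only.
static bool g_running_user_callback = false;
static std::vector<std::string> g_recorded_words;

// Stand-in for an interceptor-style hook. Interceptors see every memcmp in
// the process, including the fuzzer's own, so the hook has to gate itself.
void OnMemcmp(const void *S1, const void *S2, std::size_t N) {
  if (!g_running_user_callback) return;  // ignore compares outside the target
  g_recorded_words.emplace_back(static_cast<const char *>(S1), N);
  g_recorded_words.emplace_back(static_cast<const char *>(S2), N);
}

// Runs the target with recording enabled only for the duration of the call.
template <typename Fn>
void RunUserCallback(Fn &&Target) {
  g_running_user_callback = true;
  Target();
  g_running_user_callback = false;
}
```

A compare made while parsing flags, e.g. `OnMemcmp("-runs", "-help", 5)` outside `RunUserCallback`, records nothing, while the same call made inside the target records both operands. Compiler-inserted CMP callbacks need no such gate as long as the fuzzer's own code is built without the instrumentation.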
 

2. If I am using a custom mutator, are compares from my mutator stored in the TORC* tables?

This shouldn't happen, but maybe it does? I don't know.
 
If yes, then that could be counterproductive if my mutator also uses "LLVMFuzzerMutate", right?

Probably.

3. Since "only" 32 compares can be stored in each table, isn't there a risk that a target like the one below would end up overwriting all the interesting compares of "fuzz_me", if "cleanup" for some reason is doing a bunch of CMP instructions?
```
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  setup();
  fuzz_me(Data, Size);
  cleanup();
  return 0;
}
```

Of course. The heuristic is far from perfect.
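
A small simulation shows how quickly this can happen. It is a sketch under assumptions, not libFuzzer's code: the 32-slot `Table`, the XOR-based slot choice in `Record`, and the helper names are all made up for illustration, and "cleanup" is modelled as a loop comparing a counter against zero.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 32-slot table of (A, B) operand pairs.
struct Pair {
  std::uint32_t A, B;
};
using Table = std::array<Pair, 32>;

// Assumption for this sketch: the slot is derived from the XOR of the operands.
inline void Record(Table &T, std::uint32_t A, std::uint32_t B) {
  T[(A ^ B) % T.size()] = Pair{A, B};
}

// Counts the slots that still hold a compare against Magic.
inline std::size_t CountMagic(const Table &T, std::uint32_t Magic) {
  std::size_t N = 0;
  for (const Pair &P : T)
    if (P.B == Magic) ++N;
  return N;
}

// Simulates one run: fuzz_me does one interesting compare against a magic
// value, then cleanup does CleanupCompares unrelated compares.
inline std::size_t SurvivorsAfterRun(std::size_t CleanupCompares) {
  Table T{};
  const std::uint32_t kMagic = 0xCAFEBABE;
  Record(T, 0x11111111, kMagic);  // the compare we would like to keep
  for (std::uint32_t I = 0; I < CleanupCompares; ++I)
    Record(T, I, 0);              // unrelated compares from cleanup
  return CountMagic(T, kMagic);
}
```

In this model the interesting compare survives a run with no cleanup compares, but a cleanup loop that touches every slot (1000 iterations here) leaves none of it behind by the time the run ends.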
 

4. More generally, how bad is it if compares from code unrelated to the code that should be fuzzed, leak into the three tables of recent compares?

Having wrong comparisons in the tables shouldn't be too bad.
Not having the right ones is worse, though, so if the wrong ones overwrite the right ones, it's bad.
 

Thanks,
Niklas
