a few questions about the libFuzzer


Romanova, Katya

Jul 13, 2017, 9:43:41 PM
to libf...@googlegroups.com

Dear libFuzzer developers,

 

I’m about to start using libFuzzer and I’d like answers to a few questions before I begin.

(1)     Can libFuzzer be used efficiently to test programs with structured inputs? If so, how is the structure/file format specified to libFuzzer (if possible, please point me to the documentation)? To elaborate, here is an example. Let’s say we want to fuzz an ELF object file dumper (not that I’m going to fuzz it; it’s just a good example). If libFuzzer starts generating random ELF files, we might not test much, since almost any input will be quickly rejected. However, suppose libFuzzer could be taught that the input buffer is actually a structure, where the first section is a header (which is a structure itself), followed by a program header section, followed by a .text section, followed by a .data section, etc. Then, instead of fuzzing the whole input buffer, it would be more efficient to fuzz only one or more sections/subsections of the buffer. Can I do this? If so, how?

(2)     One more question related to question #1. Can libFuzzer be taught to generate inputs where some pre-determined part of the input will *always* contain the same value (so that this value is basically never fuzzed)? Let’s look at the ELF object file dumper as an example again. Each ELF file starts with a 4-byte magic number in its header. If this magic number is fuzzed/altered, the fuzzer won’t be testing anything useful, since the dumper will immediately report an error and exit. Thinking about it, though, the fuzz target itself might merge this pre-determined value with the rest of the fuzzed buffer before doing any useful work on it. Still, it would be nice to find out whether there is an easy existing approach to this problem.

(3)     Will libFuzzer work efficiently for fuzzing a JavaScript engine (i.e., if a large set of very small JavaScript conformance tests is added to the corpus, what are the chances that the fuzzer will succeed in creating interesting valid/invalid test cases)? Also, out of curiosity, have you tried fuzzing a C/C++ front-end with libFuzzer?

Katya.

 

Konstantin Serebryany

Jul 13, 2017, 10:04:11 PM
to Romanova, Katya, libf...@googlegroups.com
On Thu, Jul 13, 2017 at 6:43 PM, Romanova, Katya <katya.r...@sony.com> wrote:

Dear libFuzzer developers,

 

I’m about to start using libFuzzer and I’d like answers to a few questions before I begin.

(1)     Can libFuzzer be used efficiently to test programs with structured inputs?


Yes!!!! :) 
 

If so, how is the structure/file format specified to libFuzzer (if possible, please point me to the documentation)?


Protobufs. 
You need to couple libFuzzer with https://github.com/google/libprotobuf-mutator
This will require a bit of build rule hacking to get protobufs and libprotobuf-mutator to build with your project.
Although we are actively using this code in our build environment, our support for other build environments might be poor.
Shoot questions here if you get stuck.


The alternative is to use your own custom mutator that understands the structure of your data format and preserves that structure while mutating the data
(this is exactly what libprotobuf-mutator does).
See LLVMFuzzerCustomMutator and LLVMFuzzerCustomCrossOver in FuzzerInterface.h,
and test/CustomMutatorTest.cpp and test/CustomCrossOverAndMutateTest.cpp for simple examples.
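
As a rough illustration of what a structure-preserving custom mutator might look like, here is a minimal sketch. The toy format (one length byte N followed by exactly N payload bytes) and the helper IsWellFormed are made up for this example and are not from libFuzzer; only the LLVMFuzzerCustomMutator hook itself comes from FuzzerInterface.h. The mutation logic is a simple stand-in, not what libprotobuf-mutator does internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical toy format (an assumption for this sketch): one length
// byte N, followed by exactly N payload bytes.
bool IsWellFormed(const uint8_t *Data, size_t Size) {
  return Size >= 1 && Size == 1 + static_cast<size_t>(Data[0]);
}

// Hook that libFuzzer calls instead of its built-in mutations.
// Data has room for MaxSize bytes; the return value is the new size.
extern "C" size_t LLVMFuzzerCustomMutator(uint8_t *Data, size_t Size,
                                          size_t MaxSize, unsigned int Seed) {
  if (MaxSize < 1) return 0;
  std::minstd_rand Rng(Seed);

  // Recover a usable payload length even if the input is malformed.
  size_t Payload = 0;
  if (Size >= 1) Payload = std::min<size_t>(Data[0], Size - 1);
  const size_t MaxPayload = std::min<size_t>(255, MaxSize - 1);
  Payload = std::min(Payload, MaxPayload);

  switch (Rng() % 3) {
    case 0:  // Grow: append one random payload byte if there is room.
      if (Payload < MaxPayload) {
        Data[1 + Payload] = static_cast<uint8_t>(Rng());
        ++Payload;
      }
      break;
    case 1:  // Shrink: drop the last payload byte.
      if (Payload > 0) --Payload;
      break;
    default:  // Overwrite one payload byte in place.
      if (Payload > 0)
        Data[1 + Rng() % Payload] = static_cast<uint8_t>(Rng());
      break;
  }

  // Keep the length field consistent with the payload we are returning.
  Data[0] = static_cast<uint8_t>(Payload);
  assert(IsWellFormed(Data, 1 + Payload));
  return 1 + Payload;
}
```

The point is only that every output stays well-formed, so the target spends its time past the format check; a real mutator would apply richer mutations to each field.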

 

To elaborate, here is an example. Let’s say we want to fuzz an ELF object file dumper (not that I’m going to fuzz it; it’s just a good example). If libFuzzer starts generating random ELF files, we might not test much, since almost any input will be quickly rejected. However, suppose libFuzzer could be taught that the input buffer is actually a structure, where the first section is a header (which is a structure itself), followed by a program header section, followed by a .text section, followed by a .data section, etc. Then, instead of fuzzing the whole input buffer, it would be more efficient to fuzz only one or more sections/subsections of the buffer. Can I do this? If so, how?

(2)     One more question related to question #1. Can libFuzzer be taught to generate inputs where some pre-determined part of the input will *always* contain the same value (so that this value is basically never fuzzed)?


Again, you can do this using LLVMFuzzerCustomMutator.
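
A minimal sketch of that idea, assuming the fixed part is the 4-byte ELF magic (0x7f 'E' 'L' 'F') at offset 0. The tail mutation here is a deliberately trivial stand-in written for this example; in a real fuzzer you would more likely hand the tail to LLVMFuzzerMutate (also declared in FuzzerInterface.h) and only re-stamp the magic yourself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>

static const uint8_t kElfMagic[4] = {0x7f, 'E', 'L', 'F'};

// Custom mutator that never lets the first four bytes change.
extern "C" size_t LLVMFuzzerCustomMutator(uint8_t *Data, size_t Size,
                                          size_t MaxSize, unsigned int Seed) {
  if (MaxSize < sizeof(kElfMagic)) return 0;
  std::minstd_rand Rng(Seed);

  // Make sure the buffer is at least as long as the magic.
  if (Size < sizeof(kElfMagic)) {
    std::memset(Data + Size, 0, sizeof(kElfMagic) - Size);
    Size = sizeof(kElfMagic);
  }

  // Stand-in mutation (an assumption of this sketch): either overwrite
  // one byte after the magic, or append one random byte.
  const size_t TailSize = Size - sizeof(kElfMagic);
  if (TailSize > 0 && Rng() % 2 == 0) {
    Data[sizeof(kElfMagic) + Rng() % TailSize] = static_cast<uint8_t>(Rng());
  } else if (Size < MaxSize) {
    Data[Size++] = static_cast<uint8_t>(Rng());
  }

  // Re-stamp the fixed value so it survives every mutation.
  std::memcpy(Data, kElfMagic, sizeof(kElfMagic));
  assert(Size <= MaxSize);
  return Size;
}
```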
 

Let’s look at the ELF object file dumper as an example again. Each ELF file starts with a 4-byte magic number in its header. If this magic number is fuzzed/altered, the fuzzer won’t be testing anything useful, since the dumper will immediately report an error and exit. Thinking about it, though, the fuzz target itself might merge this pre-determined value with the rest of the fuzzed buffer before doing any useful work on it. Still, it would be nice to find out whether there is an easy existing approach to this problem.


For this particular example it might not be worth using a custom mutator.
libFuzzer will produce some inputs with a broken magic header, but those won't be added to the corpus, as they won't add new coverage.
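
For completeness, the approach suggested in the question (having the fuzz target glue the fixed value onto the fuzzed buffer) could be sketched as follows. DumpObjectFile is a hypothetical stand-in for the code under test and WithMagic is a helper invented for this example; only LLVMFuzzerTestOneInput is the real libFuzzer entry point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

static const uint8_t kElfMagic[4] = {0x7f, 'E', 'L', 'F'};

// Hypothetical stand-in for the dumper: rejects inputs without the magic.
bool DumpObjectFile(const uint8_t *Data, size_t Size) {
  if (Size < sizeof(kElfMagic)) return false;
  for (size_t I = 0; I < sizeof(kElfMagic); ++I)
    if (Data[I] != kElfMagic[I]) return false;
  // A real dumper would continue parsing headers and sections here.
  return true;
}

// Builds the actual input: fixed magic followed by the fuzzed bytes.
std::vector<uint8_t> WithMagic(const uint8_t *Data, size_t Size) {
  std::vector<uint8_t> Input(kElfMagic, kElfMagic + sizeof(kElfMagic));
  Input.insert(Input.end(), Data, Data + Size);
  return Input;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  const std::vector<uint8_t> Input = WithMagic(Data, Size);
  assert(Input.size() == Size + sizeof(kElfMagic));
  DumpObjectFile(Input.data(), Input.size());
  return 0;
}
```

One consequence of this design is that corpus files no longer contain the magic, so existing ELF files used as seeds would need their first four bytes stripped, and a crashing input has to be re-prefixed before it can be fed to the real tool.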
 

(3)     Will libFuzzer work efficiently for fuzzing a JavaScript engine (i.e., if a large set of very small JavaScript conformance tests is added to the corpus, what are the chances that the fuzzer will succeed in creating interesting valid/invalid test cases)?



But IIRC we don't fuzz V8 by feeding JS code into it; instead we fuzz parts of V8 that consume other kinds of inputs.
There are also some V8 fuzz targets in the Chrome repo, IIRC.

 

Also, out of curiosity, have you tried fuzzing a C/C++ front-end with libFuzzer?


We've made a clang-fuzzer that throws garbage into clang and found quite a few lexer/parser bugs.
Progress wasn't great; there are still several unfixed bugs in clang.

I am currently playing with fuzzing C++ as a structured input (with libprotobuf-mutator).
This does not stress the frontend, but instead gives the backend a hard time.
Examples: 
(Again, progress is stuck due to two timeout bugs. Once they are fixed, I'll continue.)
The code for this one is not public yet (working on it).


--kcc 