Using libFuzzer to detect/fuzz samples triggering specific source file/line

Atte Kettunen

Jun 28, 2017, 3:51:43 AM
to libf...@googlegroups.com

Hi,

I'm trying to use libFuzzer in a scenario where I have a stack trace for a crash, but no file that reproduces the crash. I have a huge number of samples and the stack trace points to a very uncommon area in the source code, so it is very likely that almost all of those samples are useless. I would prefer to use libFuzzer because the target has a high initialization overhead, so scanning all those files for every stack trace would take ages.

I took the code from "-exit_on_src_pos" and modified it to work together with DoPlainRun. My goal was to get libFuzzer to print out some message when a file triggers code lines from one of the top frames from the stack trace.

I got this far:

./fuzzer: Running 4 inputs 1 time(s) each.
Running: ./samples/should-not-trigger.sample
Executed ./samples/should-not-trigger.sample in 12 ms
Running: ./samples/should-trigger.sample
Executed ./samples/should-trigger.sample in 3 ms
PC: 12076249
INFO: found line matching 'parserInternals.c:20'.
Running: ./samples/should-not-trigger.sample
Executed ./samples/should-not-trigger.sample in 0 ms
PC: 12076249
INFO: found line matching 'parserInternals.c:20'.
***
*** NOTE: fuzzing was not performed, you have only
***       executed the target code on a fixed set of inputs.
***

The problem is that once the source position has been triggered for the first time, libFuzzer keeps 'parserInternals.c:20' marked as covered, so the message is printed for every file that follows.

I couldn't find any way to "free" those PCs, so that they wouldn't come up for the next files, until actually triggered again. Is it even possible with libFuzzer and its SanitizerCoverage interface? If not, would it be feasible to add?
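The kind of reset asked about here can be sketched outside libFuzzer. This is only a minimal model under stated assumptions: `__sanitizer_cov_trace_pc_guard_init` and `__sanitizer_cov_trace_pc_guard` are the callbacks documented for -fsanitize-coverage=trace-pc-guard, while the `PerInputCoverage` struct and all of its members are hypothetical names made up for illustration. It is not how libFuzzer tracks coverage internally.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of libFuzzer. It models coverage that is
// tracked per input instead of cumulatively.
struct PerInputCoverage {
  // hit[i] == 1 if the guard with id i + 1 fired during the current input.
  std::vector<uint8_t> hit;

  // Would be called from __sanitizer_cov_trace_pc_guard_init(start, stop):
  // gives every guard of a module a unique nonzero id.
  void InitGuards(uint32_t *start, uint32_t *stop) {
    if (start == stop || *start) return;  // empty or already initialized
    uint32_t id = static_cast<uint32_t>(hit.size());
    for (uint32_t *g = start; g < stop; ++g) *g = ++id;
    hit.resize(id, 0);
  }

  // Would be called from __sanitizer_cov_trace_pc_guard(guard).
  // The guard is left nonzero, so the same edge reports again on later inputs.
  void OnGuard(const uint32_t *guard) {
    if (*guard) hit[*guard - 1] = 1;
  }

  bool WasHit(uint32_t guard_id) const {
    return guard_id >= 1 && guard_id <= hit.size() && hit[guard_id - 1];
  }

  // Call between inputs so that earlier files do not leak into later ones.
  void Reset() { std::fill(hit.begin(), hit.end(), 0); }
};
```

With something like this, a plain run would call `Reset()` before each file and check `WasHit()` for the guards of interest afterwards, so a match is reported only for files that actually reach the line.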

Finding all the samples is one thing. Another is that we can't use libFuzzer for the actual fuzzing, because it will wander off after new coverage. One solution I was thinking of would be to add a flag "-only_with_src_pos" to libFuzzer that would collect and fuzz samples that trigger the listed source position(s) and ignore other coverage.
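The corpus decision behind the proposed "-only_with_src_pos" could look roughly like the sketch below. The flag does not exist in libFuzzer; `PointId` and `KeepForTargetedFuzzing` are hypothetical names, and the sketch assumes the source positions have already been mapped to ids of coverage points.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical: id of a coverage point (e.g. a guard or counter index).
using PointId = std::size_t;

// Returns true if the input reached at least one of the target coverage
// points. Any other coverage the input produced is deliberately ignored.
bool KeepForTargetedFuzzing(const std::vector<PointId> &hit_by_input,
                            const std::set<PointId> &targets) {
  for (PointId id : hit_by_input)
    if (targets.count(id)) return true;
  return false;
}
```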

--
Cheers,

Atte Kettunen
@attekett

Konstantin Serebryany

Jun 30, 2017, 7:52:15 AM
to Atte Kettunen, libfuzzer
It's possible to extend libfuzzer the way you suggest. But why not simply disable instrumentation for the code you don't care about?

I'm OOO, will be able to respond with more details later.

--kcc (sent from phone)

Atte Kettunen

Jun 30, 2017, 8:55:26 AM
to Konstantin Serebryany, libfuzzer

It would be more practical to have the functionality in libFuzzer, especially when we want to automate the process for multiple stack traces and a complex target program.

The suggested flag would also allow automated targeted fuzzing for regression testing.

Konstantin Serebryany

Jul 11, 2017, 6:37:22 PM
to Atte Kettunen, libfuzzer
[back from OOO, sorry for delay]

After thinking a bit more I don't see any simple way to implement this in libFuzzer. 
Right now we get the coverage feedback via a callback (__sanitizer_cov_trace_pc_guard from https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-pcs-with-guards).
This function is performance-critical and we can't insert extra code there w/o slowing down everything. 
It's likely that we will use inlined instrumentation later, which will make this even more complicated. 

--kcc 


Atte Kettunen

Aug 10, 2017, 11:09:24 AM
to Konstantin Serebryany, libfuzzer

If we set aside fuzzing for specific code paths, would it be possible to add an option for detecting files that trigger specific code lines when doing a plain run over a given corpus? That would "only" require a reset of the coverage data between the executions of individual test cases.

Two weeks ago __sanitizer_cov_reset was added to the sanitizer coverage interface, but so far I have had no luck using it in libFuzzer's DoPlainRun. Do you think it could be used in this context? If I recall correctly, SanitizerCoverage used to have a coverage data reset for custom in-memory fuzzer implementations, but it is no longer in the documentation.

Konstantin Serebryany

Aug 10, 2017, 12:32:37 PM
to Atte Kettunen, libfuzzer
On Thu, Aug 10, 2017 at 8:09 AM, Atte Kettunen <atte...@gmail.com> wrote:

> If we forget the fuzzing for specific code paths, would it be possible to add the option of detecting files that trigger specific code lines when doing plain run for a given corpus? That would "only" require reset of the coverage data between execution of each test case.

I am in the process of migrating to yet another kind of instrumentation (-fsanitize-coverage=inline-8bit-counters,pc-table; not documented yet), which will make this trivial and almost free.
The code already works, but it'll take me time to switch the defaults.

If you were to write the documentation for the new flag, what would that be?

> Two week ago __sanitizer_cov_reset

This is unrelated (it was done to collect separate coverage for every test in a large single-process test case).
libFuzzer does not use this code, it implements its own __sanitizer_cov_trace_pc_guard_init and friends.

> was added to sanitizer coverage interface, but so far I had no luck in using it in libFuzzer DoPlainRun. Do you think it could be used in this context? If I recall right, SanitizerCoverage used to have coverage data reset for custom in-memory fuzzer implementations, but it is not anymore in the documentation.
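To illustrate why counters plus a PC table would make the per-file check cheap, here is a rough standalone model. It assumes the layout described in the current Clang SanitizerCoverage documentation: one 8-bit counter per instrumented block and a parallel table of {PC, flags} pairs. `PCTableEntry` and `CollectAndReset` are hypothetical names, and this is not libFuzzer's code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// One PC-table entry: the PC of an instrumented block followed by flags.
// Entry i describes the block that increments counters[i].
struct PCTableEntry {
  uintptr_t pc;
  uintptr_t flags;
};

// Hypothetical helper: returns the indices (into `watched`) of the watched
// PCs whose counter is nonzero, then zeroes all counters so that the next
// input starts from a clean state.
std::vector<std::size_t> CollectAndReset(uint8_t *counters,
                                         const PCTableEntry *table,
                                         std::size_t n,
                                         const std::vector<uintptr_t> &watched) {
  std::vector<std::size_t> hits;
  for (std::size_t i = 0; i < n; ++i) {
    if (!counters[i]) continue;  // block was not executed by this input
    for (std::size_t w = 0; w < watched.size(); ++w)
      if (table[i].pc == watched[w]) hits.push_back(w);
  }
  std::memset(counters, 0, n);
  return hits;
}
```

The reset is a single `memset` over the counter array, and nothing has to be added to the instrumented code path itself.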

Atte Kettunen

Aug 18, 2017, 3:49:33 AM
to Konstantin Serebryany, libfuzzer

Do you mean documentation on the level of what shows up in -help=1, or more in-depth documentation on what use cases the flag would have?

Konstantin Serebryany

Aug 19, 2017, 10:29:24 PM
to Atte Kettunen, libfuzzer
On Fri, Aug 18, 2017 at 12:49 AM, Atte Kettunen <atte...@gmail.com> wrote:

> Do you mean documentation on the level of what shows up in -help=1, or more in-depth documentation on what use cases the flag would have?

-help=1, but enough for me to understand what exactly you want to happen. 

Atte Kettunen

Aug 23, 2017, 3:03:52 AM
to Konstantin Serebryany, libfuzzer

Hi,

print_on_src_pos 0 When running individual inputs without fuzzing, print the name of each file that reaches a PC originating from one of the given source locations. Example: -print_on_src_pos=foo.cc:123,bar.cc:99

-exit_on_src_pos does not accept a list of source positions as an argument; for this feature that would be a nice addition. If a list of positions can be used, the position that triggered the print should be included in the output, but other than that the output format doesn't really matter from my point of view.
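A rough sketch of how such a comma-separated value could be handled, including reporting which position matched. Both function names are hypothetical, and the sketch assumes the PC has already been symbolized into a "file:line" style string by some other means.

```cpp
#include <string>
#include <vector>

// Splits "foo.cc:123,bar.cc:99" into {"foo.cc:123", "bar.cc:99"}.
// Empty items (e.g. from a trailing comma) are skipped.
std::vector<std::string> SplitSrcPositions(const std::string &flag_value) {
  std::vector<std::string> out;
  std::string::size_type begin = 0;
  while (begin <= flag_value.size()) {
    std::string::size_type end = flag_value.find(',', begin);
    if (end == std::string::npos) end = flag_value.size();
    if (end > begin) out.push_back(flag_value.substr(begin, end - begin));
    begin = end + 1;
  }
  return out;
}

// Returns the first position that occurs as a substring of the symbolized
// location (e.g. "/src/lib/foo.cc:123:5"), or "" if none does.
// Caveat: plain substring matching means "foo.cc:12" also matches
// "foo.cc:123".
std::string MatchSrcPosition(const std::string &location,
                             const std::vector<std::string> &positions) {
  for (const std::string &p : positions)
    if (location.find(p) != std::string::npos) return p;
  return "";
}
```

Returning the matched entry makes it easy to include the triggering position in the printed message next to the file name.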




Konstantin Serebryany

Aug 24, 2017, 4:13:52 PM
to Atte Kettunen, libfuzzer
On Wed, Aug 23, 2017 at 12:03 AM, Atte Kettunen <atte...@gmail.com> wrote:

> Hi,
>
> print_on_src_pos 0 When running individual tests without fuzzing, print filenames with PC originating from the given source location. Example: -print_on_src_pos=foo.cc:123,bar.cc:99

Will this be a significant improvement over just printing all new PCs (and then using grep)? 
(-print_pcs=1; although currently it works only during fuzzing)

Atte Kettunen

Aug 25, 2017, 2:37:29 AM
to Konstantin Serebryany, libfuzzer
On Thu, Aug 24, 2017 at 11:13 PM, Konstantin Serebryany <konstantin....@gmail.com> wrote:


> Will this be a significant improvement over just printing all new PCs (and then using grep)?
> (-print_pcs=1; although currently it works only during fuzzing)

For me, -print_pcs=1 doesn't work on clang version 6.0.0-svn310235-1~exp1 (trunk) and Fuzzer@311405, so I can't verify whether anything has changed since I last used it.

IIRC -print_pcs=1 doesn't print the file that triggers that specific PC, only that a new input was found that triggers PC <X>, and it only prints when we hit that PC for the first time.

This feature would allow running through all the samples and picking out every file that triggers the specified code line(s). Printing only new PCs, as implemented in -print_pcs, gives us just the first file that triggers the specified line(s).

Konstantin Serebryany

Aug 25, 2017, 12:33:02 PM
to Atte Kettunen, libfuzzer


On Thu, Aug 24, 2017 at 11:37 PM, Atte Kettunen <atte...@gmail.com> wrote:


> For me, -print_pcs=1 doesn't work

George and I have been changing lots of stuff lately. Which flags did you try?
In particular, I know that I broke -print_pcs=1 with just -fsanitize-coverage=trace-pc-guard.
You need either -fsanitize-coverage=trace-pc-guard,pc-table or -fsanitize=fuzzer (which is what we expect to become the main flag).

> on clang version 6.0.0-svn310235-1~exp1 (trunk) and Fuzzer@311405, so can't verify if anything has changed since I last time used it.
>
> IIRC -print_pcs=1 doesn't print the file that triggers that specific PC, only that a new input was found that triggers PC <X>, and it only prints when we hit that PC for the first time.

Correct.

> This feature would allow running through all the samples and pick out all the files that trigger specified code line(s). Printing just new PCs, as implemented in print_pcs, will only give us the first file that triggers the specified line(s).

Got it, sounds doable. Let me think how to implement this better.
Maybe we should use a regexp instead of a comma-separated list?

As always, feel free to send a patch if you want it sooner. :)
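For comparison, the regexp variant could be sketched with std::regex as below. `LocationMatches` is a hypothetical name, and the sketch again assumes a symbolized "file:line" style location string is available.

```cpp
#include <regex>
#include <string>

// Returns true if the pattern matches anywhere in the symbolized location.
// std::regex uses ECMAScript syntax by default and throws std::regex_error
// if the pattern is malformed.
bool LocationMatches(const std::string &location, const std::string &pattern) {
  const std::regex re(pattern);
  return std::regex_search(location, re);
}
```

A single pattern such as `parserInternals\.c:(20|21)(:|$)` would then cover several lines of one file, and the trailing `(:|$)` keeps line 20 from also matching line 200, which a plain substring comparison cannot express.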