Generating programs around one system call

Ryan Hancock

Jan 27, 2023, 4:27:02 AM
to syzkaller
Hello folks!

I'm currently running a Linux host with FreeBSD executor VMs.

I'd like to get syzkaller to generate programs that always use a specific syscall (in my case it's mmap). How would I go about doing that? I can't seem to find how to do this.

Another side question: in the coverage reports on the web hub, how are the percentages related? I've been looking through the code to determine how they are computed. Specifically, when looking at a directory, what does the percentage inside the brackets mean versus the one outside the brackets? My assumption was that one is the PC addresses hit versus not hit, but I cannot figure out what the other percentage means.

Thank you!

Cheers,

Ryan

Ryan Hancock

Jan 27, 2023, 1:01:34 PM
to syzkaller
I must have skimmed the coverage docs the first time, but I found the explanation now, so my second question is answered.

Also, I noticed that the instructions for setting up a FreeBSD VM image are rather terse compared to the actual process. I wrote up notes that walk through the entire process in more detail. Is this a pull request you folks would be interested in?

Cheers

Aleksandr Nogikh

Jan 27, 2023, 3:08:55 PM
to Ryan Hancock, syzkaller
Hi Ryan,

On Fri, Jan 27, 2023 at 7:01 PM Ryan Hancock <kryan....@gmail.com> wrote:
>
> I must have skimmed the coverage docs the first time, but I found the explanation now, so my second question is answered.
>
> Also, I noticed that the instructions for setting up a FreeBSD VM image are rather terse compared to the actual process. I wrote up notes that walk through the entire process in more detail. Is this a pull request you folks would be interested in?

Yes, it would be great if you could publish them as a PR!

>
> Cheers
>
> On Friday, January 27, 2023 at 2:27:02 AM UTC-7 Ryan Hancock wrote:
>>
>> Hello folks!
>>
>> I'm currently running a Linux host with FreeBSD executor VMs.
>>
>> I'd like to get syzkaller to generate programs that always use a specific syscall (in my case it's mmap). How would I go about doing that? I can't seem to find how to do this.

If you mean some way to force syzkaller to include at least one mmap
call in each executed program, then no, that's not possible.

You could try to limit the set of enabled syscalls via the
enable_syscalls option
(https://github.com/google/syzkaller/blob/master/pkg/mgrconfig/config.go#L168).
It won't guarantee that mmap will be in every program, but it will
proportionally increase the interest of syzkaller in including the
mmap call into programs (simply because there won't be too many other
options). You could also try to set up a coverage filter
(https://github.com/google/syzkaller/blob/master/pkg/mgrconfig/config.go#L137)
to force syzkaller to ignore all coverage that does not belong to the
specified files/functions.
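
To make the two options concrete, here is a minimal sketch that builds a syz-manager config with both restrictions applied. As far as I know the JSON keys are `enable_syscalls` and `cover_filter` (with `files` and `functions` lists), but please check them against config.go before relying on this. The helper name `focused_config`, the syscall list, and the file path are placeholders made up for illustration, and a real config needs the other usual fields as well.

```python
import json


def focused_config(base, syscalls, files=(), functions=()):
    """Return a copy of base with syscall and coverage restrictions added."""
    cfg = dict(base)
    # Restrict program generation to this list of calls.
    cfg["enable_syscalls"] = list(syscalls)
    # Only add a coverage filter if something was actually requested.
    cover_filter = {}
    if files:
        cover_filter["files"] = list(files)
    if functions:
        cover_filter["functions"] = list(functions)
    if cover_filter:
        cfg["cover_filter"] = cover_filter
    return cfg


# Placeholder base config; real configs carry more fields (vm, kernel paths, ...).
base = {"target": "freebsd/amd64", "workdir": "./workdir"}
cfg = focused_config(
    base,
    ["mmap", "munmap", "mprotect"],
    files=["sys/vm/vm_mmap.c"],  # hypothetical path, adjust to your tree
)
print(json.dumps(cfg, indent=2))
```

Even with a config like this, mmap is only more likely to appear in a program, not guaranteed to.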

--
Aleksandr

Ryan Hancock

Jan 27, 2023, 3:29:56 PM
to syzkaller
Ahh okay. Yeah, I was using the enable/disable lists, but it's hard to tell what sort of programs will be generated.

Side question: is there a way to start the web client without actually having the runners started? I've been looking at things like the hub, but I would love to have the web interface up to further explore the corpus db and coverage info without having runners up. Maybe this is a silly question, I'm not sure haha.

Aleksandr Nogikh

Jan 30, 2023, 4:56:46 AM
to Ryan Hancock, syzkaller
On Fri, Jan 27, 2023 at 9:29 PM Ryan Hancock <kryan....@gmail.com> wrote:
>
> Ahh okay. Yeah, I was using the enable/disable lists, but it's hard to tell what sort of programs will be generated.
>
> Side question: is there a way to start the web client without actually having the runners started? I've been looking at things like the hub, but I would love to have the web interface up to further explore the corpus db and coverage info without having runners up. Maybe this is a silly question, I'm not sure haha.

For exploring the corpus db, you can take a look at this tool:
https://github.com/google/syzkaller/blob/master/tools/syz-db/syz-db.go

For coverage info, it's more tricky. Syzkaller does not preserve
coverage between syz-manager starts, the coverage is collected
dynamically after the fuzzer has started. So the best you can do is to
just save the coverage report html page as a file (to open it later)
or save the raw coverage (just a list of PCs) and then process it
yourself.
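
As a rough sketch of the "process it yourself" route: assuming the raw coverage has been saved as plain text with one hexadecimal PC per line (treat that format as an assumption and check what your syz-manager actually emits), it can be loaded into sets and compared offline. The PC values below are made up.

```python
def parse_pcs(text):
    """Parse one hexadecimal PC per line into a set of ints, skipping blank lines."""
    pcs = set()
    for line in text.splitlines():
        line = line.strip()
        if line:
            pcs.add(int(line, 16))  # base 16 accepts an optional 0x prefix
    return pcs


# Made-up dumps from two separate syz-manager runs.
run_a = parse_pcs("0xffffffff81000010\n0xffffffff81000020\n")
run_b = parse_pcs("0xffffffff81000020\n\n0xffffffff81000030\n")

common = run_a & run_b   # PCs hit in both runs
only_a = run_a - run_b   # PCs hit only in the first run
print(len(common), "PCs in both runs;", len(only_a), "only in the first")
```

Mapping the PCs back to source lines additionally needs the matching kernel binary with debug info, for example via addr2line.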

--
Aleksandr

Ryan Hancock

Feb 5, 2023, 1:44:10 PM
to Aleksandr Nogikh, syzkaller
So I have been fooling around with the executor, and I'm pretty sure individual system call coverage could be added relatively easily by just disabling
cover_enable everywhere else and moving the enabling and disabling of the KCOV device into the execute_syscall function. This would have to be changed for each platform, so maybe it's not so easy.

But when I did that, it seemed to nicely capture only the code paths that specific system calls can touch, which can be very useful for understanding the breadth of some of these larger system calls.

I wonder what your thoughts are on this. Maybe it's too specific a feature?

Ryan Hancock

Dmitry Vyukov

Feb 6, 2023, 2:47:07 AM
to Ryan Hancock, Aleksandr Nogikh, syzkaller
On Sun, 5 Feb 2023 at 19:44, Ryan Hancock <kryan....@gmail.com> wrote:
>
> So I have been fooling around with the executor, and I'm pretty sure individual system call coverage could be added relatively easily by just disabling
> cover_enable everywhere else and moving the enabling and disabling of the KCOV device into the execute_syscall function. This would have to be changed for each platform, so maybe it's not so easy.

This can be done in an OS-independent execute_call function for all
OSes at once.
Potentially we can also collect coverage, but then discard it.
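
To illustrate the "collect, then discard" variant: the sketch below is a plain Python model, not executor code. It assumes per-call coverage for one executed program is already available as (call name, PCs) pairs and keeps only the coverage of the call of interest. The record layout and the name `keep_only` are hypothetical.

```python
def keep_only(program_cover, target):
    """Union the PCs of every call named target; coverage of other calls is dropped."""
    kept = set()
    for call_name, pcs in program_cover:
        if call_name == target:
            kept.update(pcs)
    return kept


# Toy coverage for one program; the PC values are invented.
program_cover = [
    ("open", {0x10, 0x11}),
    ("mmap", {0x20, 0x21}),
    ("read", {0x11, 0x30}),
    ("mmap", {0x21, 0x22}),
]

mmap_cover = keep_only(program_cover, "mmap")
print(sorted(hex(pc) for pc in mmap_cover))
```

Because nothing is shared between calls here, a PC that both mmap and another call reach still counts for mmap.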

> But when I did that, it seemed to nicely capture only the code paths that specific system calls can touch, which can be very useful for understanding the breadth of some of these larger system calls.
>
> I wonder what your thoughts are on this. Maybe it's too specific a feature?

What exactly do you mean by "understanding the breadth of some of
these larger system calls"? Can you give some examples? What exactly
are you looking for? For what system calls?

Ryan Hancock

Feb 6, 2023, 10:58:55 AM
to Dmitry Vyukov, Aleksandr Nogikh, syzkaller
Sure!
I'm trying to show the extent to which some system calls with very broad APIs (e.g., mmap, ioctl) can touch kernel subsystems and various code paths in the kernel. Larger surface areas like this make these system calls ideal candidates for exploiting vulnerabilities or bugs in the code, even under system call disabling policies like seccomp.

Fuzzing is nice as a quick lower bound for this, since with tools like Ghidra you have to manually specify which function pointers can be traversed, etc.

There may be other uses for isolating individual calls, but this is my current use case.

Ryan Hancock

Dmitry Vyukov

Feb 8, 2023, 6:33:28 AM
to Ryan Hancock, Aleksandr Nogikh, syzkaller
On Mon, 6 Feb 2023 at 16:58, Ryan Hancock <kryan....@gmail.com> wrote:
>
> Sure!
> I'm trying to show the extent to which some system calls with very broad APIs (e.g., mmap, ioctl) can touch kernel subsystems and various code paths in the kernel. Larger surface areas like this make these system calls ideal candidates for exploiting vulnerabilities or bugs in the code, even under system call disabling policies like seccomp.
>
> Fuzzing is nice as a quick lower bound for this, since with tools like Ghidra you have to manually specify which function pointers can be traversed, etc.
>
> There may be other uses for isolating individual calls, but this is my current use case.

syz-manager can show per-syscall coverage reports. IIRC it's syscalls
-> coverage links.
It looks close enough to what you describe. Wouldn't it work for your case?

Dmitry Vyukov

Feb 10, 2023, 1:36:22 AM
to Ryan Hancock, syzkaller
On Thu, 9 Feb 2023 at 17:38, Ryan Hancock <kryan....@gmail.com> wrote:
>
> Ahh yeah, the Central Limit Theorem loves showing up. I assumed this to be the case when looking at individual system calls; that's why I think of any experiment as always giving a lower bound on a number rather
> than the number itself. Sadly, I don't have access to 30 machines or 30 days. It would be interesting if the harmonic mean across system calls actually showed significant changes, though. Lots of fun experiments that I may
> just run in the background haha.
>
> For point 2, I'm a little more confused. What do you mean by: if some code is covered by 2+ syscalls, it will be attributed to 1 of them randomly? For example, if I have a vnode_lock code block used extensively everywhere, will it
> be attributed to only 1 system call?

+syzkaller mailing list

If some code is covered by, say, both the read and write system calls, then in the
syz-manager per-syscall coverage report it will be attributed to
either read or write, but not both. That said, one line in a function can
be attributed to read, while another line in the same function can be
attributed to write. Each line goes to whichever syscall gets to cover it
first in this run.
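
A toy model may help show why this makes per-syscall numbers order-dependent. The sketch below is not syz-manager's code; it only mimics the "first syscall to cover a PC owns it" rule described above, with made-up PCs where 1 to 3 stand for shared code.

```python
def attribute_first_come(observations):
    """Map each PC to the first syscall seen covering it."""
    owner = {}
    for call_name, pcs in observations:
        for pc in pcs:
            owner.setdefault(pc, call_name)  # later calls never take over a PC
    return owner


def per_call_counts(owner):
    """Count how many PCs each syscall ended up owning."""
    counts = {}
    for call_name in owner.values():
        counts[call_name] = counts.get(call_name, 0) + 1
    return counts


read_cover = ("read", {1, 2, 3, 4})
write_cover = ("write", {1, 2, 3, 5})

read_first = per_call_counts(attribute_first_come([read_cover, write_cover]))
write_first = per_call_counts(attribute_first_come([write_cover, read_cover]))
print(read_first)   # {'read': 4, 'write': 1}
print(write_first)  # {'write': 4, 'read': 1}
```

Both calls really cover four PCs each, yet whichever one runs first appears four times larger in the attributed counts.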

> On Thu, Feb 9, 2023 at 3:54 AM Dmitry Vyukov <dvy...@google.com> wrote:
>>
>> On Thu, 9 Feb 2023 at 00:17, Ryan Hancock <kryan....@gmail.com> wrote:
>> >
>> > I did that quick experiment and looked at the coverage of mmap, for example, and checked whether various vfs_syscalls.c functions were being hit. Nothing stood out, so I think you are right that when you select a specific function it only includes the coverage of that function. One thing that I noticed, however,
>> > is that the number of lines covered after two hours is much smaller for many of the calls. Most likely due to the "coverage-guided" part of syzkaller? To be honest, I have not looked much into that side of things, so I don't know much about
>> > how the fuzzer chooses executions. It seems that disabling other coverage might be a way to force a kind of depth-first approach in the fuzzer for these system calls. I could also be completely off base haha.
>>
>> 2 points:
>> 1. There is a lot of variation between runs. We concluded that to get
>> any meaningful A/B comparison one needs to do >30 syz-manager runs for
>> A and B each, with 10 VMs and 12/24h duration, then aggregate these
>> numbers. Only statistically significant differences will matter.
>> If we compare just 2 individual runs, that's pretty much random numbers.
>>
>> 2. If some code is covered by 2+ syscalls, then it will be attributed
>> to only 1 of them randomly. This may explain part of what you see.
>> However, it should not affect the cases you described, e.g. ioctl
>> covers tons of code that is not coverable by any other syscalls.
>>
>>
>>
>> > On Wed, Feb 8, 2023 at 1:56 PM Ryan Hancock <kryan....@gmail.com> wrote:
>> >>
>> >> Hmm, interesting. I'm going to try to confirm this by re-running some things and seeing whether my modified version of syz-executor, which just turns on coverage for a selected system call, gets similar results to the default one running for the same amount of time.
>> >> When I first did this before, it was only briefly, and I remember seeing some weird sections of code being highlighted. It could have been my own bias getting in the way there, however.
>> >>
>> >> In the meantime, I'll search through some of the cover pkg code to see if I can work out how this filtering is done. Appreciate the responses.
>> >>
>> >> On Wed, Feb 8, 2023 at 1:31 PM Dmitry Vyukov <dvy...@google.com> wrote:
>> >>>
>> >>> On Wed, 8 Feb 2023 at 18:40, Ryan Hancock <kryan....@gmail.com> wrote:
>> >>> >
>> >>> > Correct me if I'm wrong, but when looking through that particular feature, it pretty much filters on all executions that have that specific system call?
>> >>>
>> >>> IIRC it should show coverage from just that system call.
>> >>>
>> >>> > Maybe I misread the code, but if I am correct, then that can make results deceiving, as now you are subject to the noise of other system calls. Suppose there is a system call that is relatively small in terms of the basic blocks it touches;
>> >>> > that would be hard to determine if it only appeared in executions alongside one that touches many basic blocks.

Ryan Hancock

Feb 11, 2023, 12:24:04 PM
to Dmitry Vyukov, syzkaller
Ahh, interesting. So this would contribute to, for example, one of read or write looking larger than the other on a specific run, as one will take ownership of a trace point over the other?
This makes me lean towards saying that isolating individual system calls is good for this reason, because then multiple system calls can own common code paths, while the other method is nice for seeing unique(ish) code paths.

Dmitry Vyukov

Feb 13, 2023, 2:36:10 AM
to Ryan Hancock, syzkaller
On Sat, 11 Feb 2023 at 18:24, Ryan Hancock <kryan....@gmail.com> wrote:
>
> Ahh, interesting. So this would contribute to, for example, one of read or write looking larger than the other on a specific run, as one will take ownership of a trace point over the other?

Correct.

But I think this should not noticeably affect the use case you
mentioned: "I'm trying to show the extent at which some system calls
which have very broad APIs (e.g., mmap, ioctl)".

Ryan Hancock

Feb 13, 2023, 4:34:55 PM
to Dmitry Vyukov, syzkaller
Okay, good to know.

For me, it does: even though I'm trying to look at the breadth of specific system calls, that requires showing what the average system call is like, so readers have a better understanding.
But fortunately I already have my method of modifying the executor, which seems to work well. You have to re-run it for each system call, but that's fine and can be easily scripted.

This has been great info. As soon as I have the time, I'll put together a pull request that adds to the FreeBSD setup README, as well as another that extends the coverage document to include what has been stated here for future reference.