syzkaller setup

Andrew Zhu Aday

Feb 10, 2017, 5:31:01 PM

to syzk...@googlegroups.com, Shankara Pailoor

Hi all,

Shankara and I are two university students looking to do a research project involving syzkaller.

Is it possible to run syzkaller on macOS + VMware Fusion using a 4.8 Linux kernel image? We currently do not have access to a Linux lab machine, and are unsure how nested virtualization (QEMU running on top of VMware) would work.

Thanks!
Andrew

Dmitry Vyukov

Feb 13, 2017, 4:26:05 AM

to Andrew Zhu Aday, syzkaller, Shankara Pailoor

On Fri, Feb 10, 2017 at 11:30 PM, Andrew Zhu Aday
<andre...@columbia.edu> wrote:
> Hi all,
>
> Shankara and I are two university students looking to do a research project
> involving syzkaller.

Hi Andrew, Shankara,

Sounds cool! Do you mind sharing any details?

> Is it possible to run syzkaller on macOS + VMware Fusion using a 4.8 Linux
> kernel image? We currently do not have access to a Linux lab machine, and
> are unsure how nested virtualization running QEMU on top of VMware would
> work.

I have not tried it.
In theory it should work. I know that Linux supports nested
virtualization fine starting from 3.16.

There is another option. Syz-manager could run directly on MacOS and
use VMware to create linux test VMs. Syz-manager should be pretty
portable, but you will need to teach it how to create VMs using VMware
(i.e. implement something similar to vm/qemu/qemu.go but using VMware
instead of qemu).

Dmitry Vyukov

Feb 14, 2017, 4:31:56 AM

to Andrew Zhu Aday, Shankara Pailoor, andreyknvl, Alexander Potapenko, syzkaller

corpus.db.zip

On Tue, Feb 14, 2017 at 3:03 AM, Andrew Zhu Aday <andre...@columbia.edu> wrote:
> Also, do you guys have coverage stats or benchmarks for running syzkaller
> against the linux kernel?
> For example, something like time to cover 50k lines starting from an empty
> corpus.
>
> Any metrics would be very helpful!

+syzkaller mailing list

I've attached my current corpus. Extract it into workdir/corpus.db and syz-manager will start from it.

You can also extract individual programs from the corpus to get an idea of how they look. To do that, run:

$ go install github.com/google/syzkaller/tools/syz-db

$ syz-db unpack corpus.db empty.dir

Then empty.dir will contain all individual programs as text files.
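
A minimal sketch for getting a first overview of the unpacked directory. It assumes each file holds one program, one call per line, written either as `name(args)` or `rN = name(args)`, with `#` starting a comment line; check a few files by hand before relying on it. The function names `count_programs` and `top_syscalls` are hypothetical helpers, not part of syzkaller.

```shell
# count_programs DIR: number of program files in the unpacked corpus.
count_programs() {
  find "$1" -type f | wc -l | tr -d ' '
}

# top_syscalls DIR [N]: the N most frequent call names across all programs.
# Assumes one call per line, optionally prefixed with "rN = ".
top_syscalls() {
  dir=$1
  n=${2:-10}
  find "$dir" -type f -exec cat {} + |
    grep -v '^#' | grep '(' |
    sed -E 's/^r[0-9]+ = //; s/\(.*$//' |
    sort | uniq -c | sort -rn | head -n "$n"
}
```

For example, `count_programs empty.dir` followed by `top_syscalls empty.dir 20` gives the corpus size and the 20 most common calls.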

Re benchmarking, if you want to compare 2 different versions of syzkaller side-by-side, you can do the following:

1. Run syz-manager with -bench=baseline flag. It will dump various execution stats to the baseline file periodically.

2. Run an experimental version of syz-manager with -bench=experiment flag. It will dump stats to experiment file.

(You can run both of them at the same time if you set up the two versions in different directories and have enough resources on the machine.)

3. Then do:

$ go install github.com/google/syzkaller/tools/syz-benchcmp

$ syz-benchcmp baseline experiment

It will show graphs comparing performance of the two versions (coverage, corpus size, executions/sec). See the attached example.

You can run the experiment starting from an empty corpus, from a large existing corpus, or whatever you want.

I've also attached 2 example stats files from my runs (baseline and mutateconst); you can extract "time to cover 50k lines" from there.
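
A rough sketch of pulling a "time to reach N coverage" figure out of one of these stats files. The file layout is an assumption here, not something the thread states: the sketch expects a sequence of pretty-printed JSON objects, one key per line, each with a numeric `"coverage"` and `"uptime"` field and a closing `}` in the first column. Open the file first and adjust the key names if they differ. `time_to_coverage` is a hypothetical helper name.

```shell
# time_to_coverage FILE TARGET
# Prints the "uptime" value of the first snapshot whose "coverage" is
# at least TARGET; prints nothing if the target is never reached.
# Assumes one '"key": value' pair per line and '}' closing each snapshot.
time_to_coverage() {
  awk -v target="$2" '
    /"coverage":/ { v = $0; gsub(/[^0-9]/, "", v); cov = v + 0 }
    /"uptime":/   { v = $0; gsub(/[^0-9]/, "", v); up = v + 0 }
    /^}/          { if (cov >= target) { print up; exit } }
  ' "$1"
}
```

Running `time_to_coverage baseline 50000` and `time_to_coverage mutateconst 50000` side by side would then give one number per run to compare, in whatever unit the file uses for uptime.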

079071243.html

baseline

mutateconst

Shankara Pailoor

Feb 14, 2017, 7:34:53 AM

to Dmitry Vyukov, Andrew Zhu Aday, andreyknvl, Alexander Potapenko, syzkaller

Awesome! Thank you.

Shankara Pailoor

Feb 19, 2017, 9:45:11 PM

to Dmitry Vyukov, Andrew Zhu Aday, andreyknvl, Alexander Potapenko, syzkaller

Hi,

Just had some follow-up questions:

1) If you don't mind sharing, what is the typical setup you have for fuzzing? Do you run it on GCE, and if so, do you have specific images? How many VMs and number of programs per VM?

2) How long do you let your fuzzers run? While using the corpus you sent I am getting approximately 65,000 LOC covered over 5-6 hours. I am using 3 VMs with around 4 processes per VM. That is pretty much the limit of what my machine can handle.

3) Do you have documents detailing the fuzzing algorithm that you would be willing to share? Looking through the code, some parameters seem to be hard-coded (e.g. in mutate.go I see there is a 20/31 chance that you insert a new call, or a 10/11 chance that you change the args of a call). What made you choose those numbers?

Regards,

Shankara

Dmitry Vyukov

Feb 20, 2017, 8:59:30 AM

to Shankara Pailoor, Andrew Zhu Aday, andreyknvl, Alexander Potapenko, syzkaller

On Mon, Feb 20, 2017 at 5:45 AM, Shankara Pailoor <sp3...@columbia.edu> wrote:

> Hi,
>
> Just had some follow up questions:
>
> 1) If you don't mind sharing, what is the typical setup you have for fuzzing? Do you run it on GCE and if so do you have specific images? How many VMs and number of programs per VM?

Sure.

On GCE we use 10 VMs, 2 CPU x 7.5 GB, 8-10 procs per manager.

Locally we use as much as you can fit into RAM with 2-2.5GB/manager, again 8-10 procs/manager.
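
As a worked example of the sizing rule above, here is a small sketch that divides available RAM by the per-instance budget. The 16 GB host and the 2 GB held back for the host OS are hypothetical numbers chosen for illustration; only the 2-2.5 GB per-instance figure comes from the message. `instances_that_fit` is a hypothetical helper name.

```shell
# instances_that_fit TOTAL_MB PER_INSTANCE_MB [RESERVED_MB]
# Integer number of instances that fit after holding back RESERVED_MB
# (default 0) for the host itself.
instances_that_fit() {
  echo $(( ($1 - ${3:-0}) / $2 ))
}
```

With these assumed numbers, `instances_that_fit 16384 2560 2048` prints 5, and `instances_that_fit 16384 2048 2048` prints 7, so the 2-2.5 GB range translates into 5 to 7 instances on such a host.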

For images we use:

https://github.com/google/syzkaller/blob/master/tools/create-image.sh

https://github.com/google/syzkaller/blob/master/tools/create-gce-image.sh

Nothing particularly special, but you may find some sysctls and boot params useful.

> 2) How long do you let your fuzzers run? While using the corpus you sent I am getting approximately 65,000 LOC covered over 5-6 hours. I am using 3 VMs with around 4 processes per VM. That is pretty much the limit of the capabilities of my machine

Generally, indefinitely. They still uncover something new (both coverage and bugs) over time.

> 3) Do you have documents detailing the fuzzing algorithm that you would be willing to share? While looking through the code it seems like there are parameters hard coded (i.e. in mutate.go I see there is 20/31 chance that you insert a new call, or 10/11 chance you change the args of a call) What made you choose those numbers?

No documentation besides the source code.

Re these magic numbers, there is no particular theory behind them. Mostly common sense. I've changed some of these numbers recently, using the benchmarking mode to actually prove that a change is useful. We need more such work.
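
To make the 20/31 and 10/11 figures from the question concrete, here is a sketch of how two such "n out of m" checks combine. It assumes the checks are chained, so the second is only tried when the first fails; the thread does not spell that out, and the `other` branch is a placeholder for whatever happens otherwise. The function names are hypothetical and this is not the code in mutate.go.

```shell
# pick_mutation R1 R2
# R1 is a roll in [0,31), R2 a roll in [0,11).
pick_mutation() {
  if [ "$1" -lt 20 ]; then
    echo insert_call    # 20/31 of all mutations
  elif [ "$2" -lt 10 ]; then
    echo mutate_args    # 10/11 of the remaining 11/31, i.e. 10/31 overall
  else
    echo other          # the remaining 1/31
  fi
}

# random_mutation: draw both rolls and pick a mutation.
random_mutation() {
  set -- $(awk 'BEGIN { srand(); print int(rand() * 31), int(rand() * 11) }')
  pick_mutation "$1" "$2"
}
```

Under that chaining assumption the overall split is about 64.5% insert, 32.3% argument change and 3.2% other, which is the kind of ratio the benchmarking mode can be used to tune.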

Shankara Pailoor

Feb 20, 2017, 9:20:22 AM

to Dmitry Vyukov, Andrew Zhu Aday, andreyknvl, Alexander Potapenko, syzkaller

One more quick question. For the corpus you are currently using, how long did it take to generate? It was purely generational, correct?

Dmitry Vyukov

Feb 21, 2017, 3:44:40 AM

to Shankara Pailoor, Andrew Zhu Aday, andreyknvl, Alexander Potapenko, syzkaller

On Mon, Feb 20, 2017 at 5:20 PM, Shankara Pailoor <sp3...@columbia.edu> wrote:
> One more quick question. For the corpus you are currently using, how long
> did it take to generate? It was purely generational, correct?

Yes, it's "purely generational" in the sense that I did not add any
programs manually.
Re how long, it's difficult to answer. It's the corpus I use during
development test runs (and I frequently leave it running overnight
and over weekends). So it has a very long history, but lots of inputs were
dropped when we changed the program format or syscall descriptions.

Andrew Zhu Aday

Feb 27, 2017, 11:53:16 PM

to Dmitry Vyukov, Shankara Pailoor, andreyknvl, Alexander Potapenko, syzkaller

Hey Dmitry,

We're currently using kcov to examine coverage of real workloads. But it's difficult for us to test anything substantial on our current nested-vm setup.

Would it be possible for us to use any pre-configured GCE machines? This would greatly help our project. Shankara already tried running an instance on GCE but was unable to add kcov.

Thanks,

Andrew

Dmitry Vyukov

Feb 28, 2017, 2:18:59 AM

to Andrew Zhu Aday, Shankara Pailoor, andreyknvl, Alexander Potapenko, syzkaller

On Tue, Feb 28, 2017 at 7:53 AM, Andrew Zhu Aday
<andre...@columbia.edu> wrote:
> Hey Dmitry,
>
> We're currently using kcov to examine coverage of real workloads. But it's
> difficult for us to test anything substantial on our current nested-vm
> setup.
>
> Would it be possible for us to use any pre-configured GCE machines? This
> would greatly help our project. Shankara already tried running an instance
> on gce but was unable to add kcov.

Hi Andrew,

What was the problem with KCOV?

Our build system continuously builds tip kernels with KASAN+KCOV and
starts these kernels on GCE machines.
Here is the config that we use:
https://github.com/google/syzkaller/blob/master/syz-gce/kernel.config
And here is a script that creates GCE-compatible images:
https://github.com/google/syzkaller/blob/master/tools/create-gce-image.sh
And here you can see how it is all scripted:
https://github.com/google/syzkaller/blob/master/syz-gce/syz-gce.go#L639
Then you need to upload the image to GCS with:
gsutil cp disk.tar.gz gs://my-images
and then create a GCE image from my-images/disk.tar.gz
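
A minimal sketch of those last two steps as one helper. The bucket `my-images` and the archive `disk.tar.gz` come from the message above; the image name `syz-image` in the usage note and the helper names `run` and `publish_gce_image` are hypothetical. The message does not say which command creates the image; `gcloud compute images create` with `--source-uri` is assumed here. Setting `DRY_RUN=1` only prints the commands so they can be reviewed first.

```shell
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# publish_gce_image ARCHIVE BUCKET IMAGE
# Uploads the image tarball to GCS, then creates a GCE image from it.
publish_gce_image() {
  archive=$1
  bucket=$2
  image=$3
  run gsutil cp "$archive" "gs://$bucket/" &&
    run gcloud compute images create "$image" \
      --source-uri "gs://$bucket/$(basename "$archive")"
}
```

For example, `DRY_RUN=1; publish_gce_image disk.tar.gz my-images syz-image` prints the two commands without touching the cloud project.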

Shankara Pailoor

Feb 28, 2017, 7:41:06 AM

to Dmitry Vyukov, Andrew Zhu Aday, andreyknvl, Alexander Potapenko, syzkaller

Hi Dmitry,

Sorry, I tried using create-gce-image.sh. I first created a userspace system using the command

sudo debootstrap --include=openssh-server,curl,tar,time,strace stable debian

I then tried to run ./create-gce-image.sh ~/debian /path/to/bzImage /path/to/vmlinux

but I got "The kernel still uses the old table. The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8)". The offending line was:

echo -en "o\nn\np\n1\n2048\n\na\n1\nw\n" | sudo fdisk /dev/nbd0

I unmounted the old wheezy mount when running syzkaller locally and ran partprobe on /dev/nbd0.

Regards,

Shankara

Dmitry Vyukov

Feb 28, 2017, 7:44:22 AM

to Shankara Pailoor, Andrew Zhu Aday, andreyknvl, Alexander Potapenko, syzkaller

On Tue, Feb 28, 2017 at 1:41 PM, Shankara Pailoor <sp3...@columbia.edu> wrote:
> Hi Dmitry,
>
> Sorry, I tried using the create-gce-image.sh and created a user space system
> using the command
>
> sudo debootstrap --include=openssh-server,curl,tar,time,strace stable debian
>
> I then tried to run ./create-gce-image.sh ~/debian /path/to/bzImage
> /path/to/vmlinux
>
> but I get "The kernel still uses the old table. The new table will be used
> at the next reboot or after you run partprobe(8) or kpartx(8)".The offending
> line was:
>
> echo -en "o\nn\np\n1\n2048\n\na\n1\nw\n" | sudo fdisk /dev/nbd0

Try adding "sleep 10" before that line.

Shankara Pailoor

Feb 28, 2017, 7:46:22 AM

to Dmitry Vyukov, Andrew Zhu Aday, andreyknvl, Alexander Potapenko, syzkaller

Same issue unfortunately.

Dmitry Vyukov

Feb 28, 2017, 7:51:13 AM

to Shankara Pailoor, Andrew Zhu Aday, andreyknvl, Alexander Potapenko, syzkaller

Maybe you need to run partprobe(8) or kpartx(8), as the message says, before that
line? I have not seen such a message on my Linux machine...

Shankara Pailoor

Feb 28, 2017, 8:53:44 AM

to Dmitry Vyukov, Andrew Zhu Aday, andreyknvl, Alexander Potapenko, syzkaller

In case it clarifies things a bit more, this is what appears in dmesg after running the command:

[227230.299273] block nbd0: queue cleared
[227250.588673] block nbd0: NBD_DISCONNECT
[227250.588793] block nbd0: Receive control failed (result -32)
[227250.588815] block nbd0: shutting down socket
[227250.588817] block nbd0: queue cleared

Dmitry Vyukov

Feb 28, 2017, 8:56:02 AM

to Shankara Pailoor, Andrew Zhu Aday, andreyknvl, Alexander Potapenko, syzkaller

On Tue, Feb 28, 2017 at 2:53 PM, Shankara Pailoor <sp3...@columbia.edu> wrote:
> Just in case this clarifies a bit more, but this is what appears in dmesg
> after running the command
>
> [227230.299273] block nbd0: queue cleared
> [227250.588673] block nbd0: NBD_DISCONNECT
> [227250.588793] block nbd0: Receive control failed (result -32)
> [227250.588815] block nbd0: shutting down socket
> [227250.588817] block nbd0: queue cleared

No ideas.

Just in case: I know that some people prefer to boot a stock OS on
GCE, then copy the kernel onto it and kexec into the new kernel. But I
don't know the details.
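
A rough sketch of what the kexec route could look like, since the message leaves the details open. Everything here is an assumption rather than a tested recipe for GCE: it presumes kexec-tools is installed on the stock OS, that the commands run as root, and that the new kernel can boot with the given command line. The kernel path in the usage note and the helper names `run` and `kexec_into` are hypothetical. `DRY_RUN=1` prints the commands instead of running them.

```shell
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# kexec_into KERNEL [CMDLINE]
# Loads KERNEL and jumps into it. CMDLINE defaults to the running
# kernel's command line from /proc/cmdline.
kexec_into() {
  kernel=$1
  cmdline=${2:-$(cat /proc/cmdline)}
  run kexec -l "$kernel" --append="$cmdline" &&
    run kexec -e   # jumps immediately, without a clean shutdown
}
```

For example, after copying a KCOV-enabled bzImage to the instance, `kexec_into /root/bzImage` would switch to it while reusing the current boot parameters.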


yang ou

Jan 25, 2019, 11:18:59 AM

to syzkaller

Hi Dmitry Vyukov,

I used your attached corpus with a recent syzkaller (commit fa3d6b0b21cfd27db2381afedc5da7a69d587191), but it was reduced from 10115 to 24 programs.

Could you share another large corpus?

Thanks!

Yang Ou

On Tuesday, February 14, 2017 UTC+8 at 5:31:56 PM, Dmitry Vyukov wrote:

Dmitry Vyukov

Jan 25, 2019, 12:14:48 PM

to yang ou, syzkaller

On Fri, Jan 25, 2019 at 5:19 PM yang ou <aen...@gmail.com> wrote:
>
> Hi Dmitry Vyukov,
> I used your attached corpus in the recent syzkaller(commit fa3d6b0b21cfd27db2381afedc5da7a69d587191),

How did you use it?

> but it was deleted from 10115 to 24 programs.
>
> Could you share me another maximum corpus.
>
> Thanks!
> Yang Ou

yang ou

Jan 26, 2019, 7:29:38 PM

to syzkaller

I'm using syzkaller to fuzz the KMSAN kernel from a month ago (commit fc036ec83f2b591cf4ee65d9e60363409df83538), with the provided .config.extended config.

The config for syzkaller is the same as https://github.com/google/syzkaller/blob/master/docs/linux/setup_ubuntu-host_qemu-vm_x86-64-kernel.md#syzkaller.

Also, is there any big difference between a corpus for KMSAN and one for other Linux kernels?

Could you share a corpus that has high coverage on KMSAN? Maybe one from syzbot.

On Saturday, January 26, 2019 UTC+8 at 12:18:59 AM, yang ou wrote:

Dmitry Vyukov

Jan 28, 2019, 5:17:22 AM

to yang ou, syzkaller

On Sun, Jan 27, 2019 at 1:29 AM yang ou <aen...@gmail.com> wrote:
>
> I'm using the syzkaller to fuzz the KMSAN kernel a month ago( commit fc036ec83f2b591cf4ee65d9e60363409df83538), with the provided .config.extended config.

And what do you do with found bugs? I don't see any bug reports from
your email address on syzkaller mailing list nor on LKML.
I mean some people sell bugs that are later used to exploit my bank
account and phone.

> And the config for syzkaller is same as https://github.com/google/syzkaller/blob/master/docs/linux/setup_ubuntu-host_qemu-vm_x86-64-kernel.md#syzkaller.
> Also, is there any big difference between the corpus of KMSAN and other linux kernel?
> Could you share a corpus has a big coverage over KMSAN? Maybe from the syzbot.

yang ou

Jan 28, 2019, 9:28:19 PM

to Dmitry Vyukov, syzkaller

Ok, I understand.

I'm a student working on a paper on improving Linux kernel fuzzing. I have only tested an older kernel, and I found that all of the bugs had already been reported, mostly by syzbot. If I find any new bugs, I will report them.

I think that a corpus with high coverage is important for kernel fuzzing, so I want to base my research on your corpus and test whether it can improve things.

So I am asking for your help. You can send it to my email address only.

Thanks for your help!

On Mon, Jan 28, 2019 at 6:17 PM, Dmitry Vyukov <dvy...@google.com> wrote:
