Manual bug reproduction


Kainaat Singh

Feb 16, 2021, 7:33:43 AM
to syzkaller
Hello,

I am trying to run syz-execprog, and I know that we need to run it in a separate VM.

1. Do I need to have the same Debian image in this VM, or will any other image work?
2. If I need to use the same Debian image, is there a way to increase the disk size in the "create-image.sh" script itself?

Dmitry Vyukov

Feb 16, 2021, 7:42:26 AM
to Kainaat Singh, syzkaller
On Tue, Feb 16, 2021 at 1:33 PM Kainaat Singh <kainaat...@gmail.com> wrote:
>
> Hello,
>
> I am trying to run syz-execprog, and I know that we need to run it in a separate VM.
>
> 1. Do I need to have the same Debian image in this VM, or will any other image work?

Hi Kainaat,

Preferably, yes, use the same image; otherwise bugs that have implicit
dependencies on the image, or on the setup it produced, may not
reproduce.
However, most bugs don't have such dependencies, and for those you can
use any image.
But telling whether a bug has such dependencies requires some expertise.

> 2. If I need to use the same Debian image, is there a way to increase the disk size in the "create-image.sh" script itself?

Do you mean locally or the upstream copy?
You can definitely modify it locally to create a larger image. The
size setting should be easy to find somewhere in the script.
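For reference, recent copies of create-image.sh expose the size as a command-line switch, so a local edit may not even be needed. A hedged usage sketch (the flag names below are the ones I have seen in the upstream script; verify against your copy's usage text before relying on them):

```sh
# Build a Debian image of ~8 GB instead of the default size.
# -d/--distribution picks the Debian release; -s/--seek sets the
# image size in megabytes (both assumed from the upstream script).
./create-image.sh -d bullseye -s 8192
```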

Kainaat Singh

Feb 16, 2021, 7:46:01 AM
to syzkaller
Thanks! I found the switch in the script. 

Kainaat Singh

Feb 18, 2021, 11:11:19 AM
to syzkaller
Hello,

I am trying to reproduce the crashes using syz-repro and syz-execprog. I have tried them with two of the crashes that I got while fuzzing the i915 driver: a "memory leak" and a "BUG: soft lockup".

When I replay the crash logs with syz-execprog, it fails to trigger any crash.

With syz-repro, for both crashes, some program causes a crash and syz-repro extracts it, but the run is unsuccessful in the end. I am sharing the tail of the logs for each crash.
For soft lockup:
program crashed: no output from test machine
bisect: crashed, chunk #2 evicted
bisect: guilty chunks: [<1>, <1>, <1>, <1>, <1>, <11>, <42>, <659>]
bisect: guilty chunks split: [<1>, <1>, <1>, <1>, <1>], <11>, [<42>, <659>]
bisect: chunk split: <11> => <5>, <6>
bisect: triggering crash without chunk #1

program did not crash
bisect: not crashed, both chunks required
bisect: too many guilty chunks, aborting
failed to extract reproducer
reproducing took 5h24m2.47271057s

For memory leak:
program crashed: memory leak
bisect: crashed, chunk #1 evicted
bisect: guilty chunks: [<1>, <1>, <1>, <1>, <1>, <1>, <1>, <3>]
bisect: guilty chunks split: [<1>, <1>, <1>, <1>, <1>, <1>, <1>], <3>, []
bisect: chunk split: <3> => <1>, <2>
bisect: triggering crash without chunk #1

program did not crash
single: failed to extract reproducer
bisect: bisecting 2516 programs with base timeout 6m0s
bisect: bisecting 2516 programs
bisect: executing all 2516 programs
testing program 

not a leak crash: no output from test machine
bisect: didn't crash
failed to extract reproducer
reproducing took 3h20m4.599591079s

Could these crashes be too flaky to reproduce? 

Dmitry Vyukov

Feb 19, 2021, 1:12:43 PM
to Kainaat Singh, syzkaller
On Thu, Feb 18, 2021 at 5:11 PM Kainaat Singh <kainaat...@gmail.com> wrote:
>
> Hello,
>
Hard to say.

But the process seems to be affected by these "no output from test
machine" issues. They happen in both cases, and that is not the
initial bug that you tried to reproduce.
If you fix the "no output from test machine" problem first, it may
become easier to reproduce the other bugs.


> --
> You received this message because you are subscribed to the Google Groups "syzkaller" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller/eab18cb4-270f-4b7a-b8a3-4755fae5c0fbn%40googlegroups.com.

Kainaat Singh

Feb 24, 2021, 6:54:42 AM
to syzkaller
I am trying to understand what syzkaller does to the VM when the kernel crashes.

1. Does it restart the VM? If not, how is the kernel put back into its original state?

2. When there is a crash, does the main thread stop fuzzing until the crash is reproduced? I am running only one VM instance, so how does syzkaller handle this situation?

In my case, a vGPU is attached to the VM, and sometimes when there is a crash during syz-repro I see the following, yet it seems to run fine afterwards:

program crashed: no output from test machine
failed to init instance: failed to create VM: failed to read from qemu: EOF
qemu-system-x86_64: -device vfio-pci,sysfsdev=/sys/bus/pci/devices/0000:00:02.0/ef8e2751-94bf-4b71-9c1d-432dca83a9ec: vfio ef8e2751-94bf-4b71-9c1d-432dca83a9ec: failed to open /dev/vfio/23: Device or re>
bisect: crashed, chunk #1 evicted
bisect: guilty chunks: [<1317>]
bisect: guilty chunks split: [], <1317>, []
bisect: chunk split: <1317> => <658>, <659>

I am guessing that the VM is still active with the vGPU still attached to it, which is why we get this error. When there is "no output from test machine", does syzkaller restart the VM?

Dmitry Vyukov

Feb 25, 2021, 9:04:27 AM
to Kainaat Singh, syzkaller
On Wed, Feb 24, 2021 at 12:54 PM Kainaat Singh <savvy....@gmail.com> wrote:
>
> I am trying to understand what syzkaller does to the VM when the kernel crashes.
>
> 1. Does it restart the VM? If not, how is the kernel put back into its original state?

Yes. If qemu is used, it's not just restarted: the previous qemu
process is killed and a new one is created.

> 2. When there is a crash, does the main thread stop fuzzing until the crash is reproduced? I am running only one VM instance, so how does syzkaller handle this situation?

Overall yes, but it's a bit involved and also depends on the current
"phase" of work.
Generally, though, it will start reproducing, and reproduction takes
precedence over fuzzing.
If you are interested in the details, you need to read and understand the code:
https://github.com/google/syzkaller/blob/master/syz-manager/manager.go#L308-L440

You can disable reproduction in the manager config if you want.
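For reference, a minimal sketch of that config fragment (only the relevant key is shown; a real syz-manager config also needs the usual target, image, workdir, and vm fields):

```json
{
	"reproduce": false
}
```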

> In my case, a vGPU is attached to the VM, and sometimes when there is a crash during syz-repro I see the following, yet it seems to run fine afterwards:
>
> program crashed: no output from test machine
> failed to init instance: failed to create VM: failed to read from qemu: EOF
> qemu-system-x86_64: -device vfio-pci,sysfsdev=/sys/bus/pci/devices/0000:00:02.0/ef8e2751-94bf-4b71-9c1d-432dca83a9ec: vfio ef8e2751-94bf-4b71-9c1d-432dca83a9ec: failed to open /dev/vfio/23: Device or re>
> bisect: crashed, chunk #1 evicted
> bisect: guilty chunks: [<1317>]
> bisect: guilty chunks split: [], <1317>, []
> bisect: chunk split: <1317> => <658>, <659>
>
> I am guessing that the VM is still active with the vGPU still attached to it, which is why we get this error.

I don't think this has anything to do with reproduction.
This looks like a bug in qemu or in the host kernel: the device is not
released immediately.


> When there is "no output from test machine", does syzkaller restart the VM?

Yes.

Kainaat Singh

Feb 26, 2021, 9:05:29 AM
to Dmitry Vyukov, syzkaller
Thanks, that helps! Sorry to bother you, but I have one last question.

Is it advisable to make additions to the syzkaller code? I would like to run a command on the host machine just before the qemu instance is started.

Palash Oswal

Feb 26, 2021, 9:43:00 AM
to Kainaat Singh, Dmitry Vyukov, syzkaller
On Fri, Feb 26, 2021 at 7:35 PM Kainaat Singh <savvy....@gmail.com> wrote:
>
> Thanks, that helps! Sorry to bother you, but I have one last question.
>
> Is it advisable to make additions to the syzkaller code? I would like to run a command on the host machine just before the qemu instance is started.


I would personally be hesitant to run the compiled code on the host
for the following reasons:
* The kernel versions might differ between the syzkaller VM and your host.
* The devices/drivers might differ between the host and the VM.
* The system might become unstable if there are actual bugs in the
host kernel, which could make the syzkaller fuzzing runs unreliable.

Kainaat Singh

Feb 26, 2021, 11:25:14 AM
to Palash Oswal, Dmitry Vyukov, syzkaller
To be clear, I want to modify qemu.go in the syzkaller code to run a command on the host machine before it spawns a VM.

And if I am not mistaken, isn't running on the host and spawning VMs the intended use of syz-manager? If I understand you correctly, you are preferring the isolated-host setup?

Dmitry Vyukov

Mar 1, 2021, 2:20:25 PM
to Kainaat Singh, Palash Oswal, syzkaller
Hi Kainaat,

Overall, useful contributions are welcome. However, for each
contribution there is a question of how useful it is for other users,
how well it fits the overall project goals, how complex it is and how
much maintenance burden it adds, and what the right design for the
feature is.

The addition you want to make is a workaround for a qemu/kernel bug,
right? Have you reported it to the qemu/kernel developers? The device
should not be busy after the qemu process has been terminated.

Note that what you want to do is already supported: you can specify
your own custom script as the qemu binary in the config, perform the
necessary actions in it, and then start the real qemu.
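As a hedged illustration of the wrapper-script approach (the path /tmp/qemu-wrapper.sh and the REAL_QEMU override are hypothetical names introduced only for this sketch):

```sh
#!/bin/sh
set -eu

# Sketch: syzkaller's qemu VM type can be pointed at any executable as
# the qemu binary, so a wrapper can run a host-side command first and
# then exec the real qemu unchanged.
cat > /tmp/qemu-wrapper.sh <<'EOF'
#!/bin/sh
# host-side pre-start action goes here (placeholder):
:   # e.g. re-create the vGPU mdev device
# hand off to the real qemu, preserving all of syzkaller's arguments;
# REAL_QEMU is an override used only for this demo
exec "${REAL_QEMU:-qemu-system-x86_64}" "$@"
EOF
chmod +x /tmp/qemu-wrapper.sh

# Demo: substitute echo for qemu to show the arguments pass through untouched.
REAL_QEMU=echo /tmp/qemu-wrapper.sh -m 2G -smp 2
```

Because the wrapper ends in exec, the real qemu replaces the shell process, so VM startup behaves exactly as before apart from the extra host-side command.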

Kainaat Singh

Mar 17, 2021, 5:29:08 PM
to syzkaller
Hi Dmitry,

Thank you for your help. I was able to make my modification and fuzz properly. It was indeed a kernel bug.

I had a few questions:
1. There is a BUG: use-after-free that KASAN catches while launching the VM. When I launch the VM manually, it boots past the KASAN BUG, but syzkaller complains and keeps trying to respawn the VM. For now I have removed KASAN from the target kernel configuration. Is it possible to work around this BUG in syzkaller with KASAN enabled?

2. In my fuzzing campaign, I notice that syzkaller queues and tries to reproduce some crashes but not others. Is that normal behaviour?

A few logs for reference:

loop: instance 0 finished, crash=true
vm-0: crash: BUG: unable to handle kernel paging request
loop: phase=4 shutdown=false instances=1/1 [0] repro: pending=0 reproducing=0 queued=0
loop: starting instance 0
VMs 0, executed 101132, cover 8633, signal 12568/21884, crashes 6, repro 0
VMs 0, executed 101132, cover 8633, signal 12568/21884, crashes 6, repro 0
fuzzer vm-0 connected
VMs 1, executed 101135, cover 8633, signal 12568/21884, crashes 6, repro 0
poll from vm-0: candidates=0 inputs=0 maxsignal=0
VMs 1, executed 101164, cover 8633, signal 12568/21884, crashes 6, repro 0
VMs 1, executed 101164, cover 8633, signal 12568/21884, crashes 6, repro 0
poll from vm-0: candidates=0 inputs=0 maxsignal=0
VMs 1, executed 101189, cover 8633, signal 12568/21890, crashes 6, repro 0


loop: instance 0 finished, crash=true
vm-0: crash: memory leak in 
VMs 0, executed 114630, cover 8833, signal 12658/22612, crashes 6, repro 0
loop: phase=4 shutdown=false instances=1/1 [0] repro: pending=0 reproducing=1 queued=1
reproducing crash 'memory leak in ': single: executing 8 programs separately with timeout 1m40s

Dmitry Vyukov

Mar 18, 2021, 3:05:14 AM
to Kainaat Singh, syzkaller
On Wed, Mar 17, 2021 at 10:29 PM Kainaat Singh <savvy....@gmail.com> wrote:
>
> Hi Dmitry,
>
> Thank you for your help. I was able to modify and fuzz properly. It was indeed a kernel bug.
>
> I had a few questions:
> 1. There is a BUG: use-after-free that KASAN catches while launching the VM. When I launch the VM manually, it boots past the KASAN BUG, but syzkaller complains and keeps trying to respawn the VM. For now I have removed KASAN from the target kernel configuration. Is it possible to work around this BUG in syzkaller with KASAN enabled?

You can use manager config "ignores" parameter for this.
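A hedged sketch of that config fragment ("ignores" entries are regular expressions matched against crash titles; the pattern below is a placeholder, adjust it to the exact title syzkaller reports for your crash):

```json
{
	"ignores": ["KASAN: use-after-free in .*"]
}
```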

Dmitry Vyukov

Mar 18, 2021, 3:06:35 AM
to Kainaat Singh, syzkaller
Yes, this is normal. syzkaller tries to reproduce the same bug at most
3 times, and it reproduces at most 1 bug of a given type at a time.

Kainaat Singh

Mar 18, 2021, 1:13:09 PM
to syzkaller
Hi Dmitry,

I tried the "ignores" parameter, but it would not ignore the bug. This seems to happen because syzkaller turns on panic-on-warn: the bug causes a kernel panic, which sends the machine into a reboot.

I guess this is why it works when I launch the VM manually.

Can I somehow get past this?

Dmitry Vyukov

Mar 18, 2021, 2:41:16 PM
to Kainaat Singh, syzkaller
On Thu, Mar 18, 2021 at 6:13 PM Kainaat Singh <savvy....@gmail.com> wrote:
>
> Hi Dmitry,
>
> I tried the "ignores" parameter, but it would not ignore the bug. This seems to happen because syzkaller turns on panic-on-warn: the bug causes a kernel panic, which sends the machine into a reboot.
>
> I guess this is why it works when I launch the VM manually.
>
> Can I somehow get past this?

I see only one place where we enforce panic_on_warn=1: in
vm/qemu/qemu.go. I assume you are using the "qemu" VM type. You can
override it by passing "panic_on_warn=0" in the "cmdline" parameter.
But I also think we should remove panic_on_warn=1 from vm/qemu.
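A hedged sketch of the relevant fragment of the manager config (the "cmdline" key in the qemu-specific "vm" section appends to the kernel command line; all other keys are omitted here):

```json
{
	"type": "qemu",
	"vm": {
		"cmdline": "panic_on_warn=0"
	}
}
```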



Dmitry Vyukov

Mar 18, 2021, 2:51:40 PM
to Kainaat Singh, syzkaller
On Thu, Mar 18, 2021 at 7:41 PM Dmitry Vyukov <dvy...@google.com> wrote:
>
> On Thu, Mar 18, 2021 at 6:13 PM Kainaat Singh <savvy....@gmail.com> wrote:
> >
> > Hi Dmitry,
> >
> > I tried the "ignores" parameter, but it would not ignore the bug. This seems to happen because syzkaller turns on panic-on-warn: the bug causes a kernel panic, which sends the machine into a reboot.
> >
> > I guess this is why it works when I launch the VM manually.
> >
> > Can I somehow get past this?
>
> I see only one place where we enforce panic_on_warn=1: in
> vm/qemu/qemu.go. I assume you are using the "qemu" VM type. You can
> override it by passing "panic_on_warn=0" in the "cmdline" parameter.
> But I also think we should remove panic_on_warn=1 from vm/qemu.

https://github.com/google/syzkaller/pull/2504 will remove it.

Kainaat Singh

Mar 22, 2021, 6:32:16 AM
to syzkaller
Thank you for your support!