Debugging Lost Connection to Test Machine


Marco Vanotti

Jun 12, 2020, 2:10:29 AM
to syzk...@googlegroups.com, Dmitry Vyukov
Hi Dmitry and syzkaller users,

I am trying to identify the root cause of a "lost connection to test machine" error. I can reproduce it locally more or less reliably, but I am at a loss to understand what's going on.

I am running syz-manager with -debug and enabling only a subset of Fuchsia system calls that I know should trigger a kernel panic. However, after some time I get the "lost connection to test machine" message, and I can't see anything in the logs.

Do you have any tips to get a better understanding of what is happening?

Best Regards,
Marco

Dmitry Vyukov

Jun 13, 2020, 6:46:45 AM
to Marco Vanotti, syzkaller
On Fri, Jun 12, 2020 at 8:10 AM 'Marco Vanotti' via syzkaller
Hi Marco,

What do these logs look like? Please post some samples.

Does syz-manager come up with reproducers? If yes, then debugging the
reproducer should shed some light.
You may also try to replay some crash logs manually with syz-execprog
or syz-crush. If it reproduces, then it's possible to minimize the log
manually or with the syz-repro utility.

Marco Vanotti

Jun 16, 2020, 2:33:35 AM
to Dmitry Vyukov, syzkaller
Hi Dmitry,

Thanks for your answer. This ended up being a different bug that caused the kernel to just reboot. I manually minimized the syscalls (removing a few and checking whether the crash still happened) and then realized what the bug was.
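
The manual minimization described above (drop a call, check whether the crash still happens, repeat) can be sketched roughly as follows. This is only an illustration, not the procedure actually used in this thread: `minimize_calls`, the `still_crashes` callback, and the placeholder call names are all hypothetical, and in practice the callback would have to run the candidate program on the test machine and report whether it still went down.

```python
from typing import Callable, List, Sequence


def minimize_calls(
    calls: Sequence[str],
    still_crashes: Callable[[List[str]], bool],
) -> List[str]:
    """Greedily drop one call at a time while the crash still reproduces.

    `still_crashes` is a hypothetical oracle supplied by the caller; it
    returns True when the given list of calls still triggers the crash.
    """
    current = list(calls)
    if not still_crashes(current):
        raise ValueError("the full call list does not reproduce the crash")

    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            # Try the list with the i-th call removed.
            candidate = current[:i] + current[i + 1:]
            if still_crashes(candidate):
                # The call was not needed; keep the shorter list and rescan.
                current = candidate
                changed = True
                break
    return current


if __name__ == "__main__":
    # Stand-in oracle: pretend the crash needs both call_b and call_d.
    def fake_oracle(candidate: List[str]) -> bool:
        return {"call_b", "call_d"} <= set(candidate)

    all_calls = ["call_a", "call_b", "call_c", "call_d", "call_e"]
    print(minimize_calls(all_calls, fake_oracle))
```

The result is minimal only in the sense that removing any single remaining call stops the crash; it assumes the crash reproduces deterministically, which a flaky reproducer would violate.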

I thought somehow syzkaller was not seeing a kernel panic happening, but it was the machine rebooting directly.

I have other "lost connection to test machine" bugs to debug :p ! But at least this one is done.

Thanks!
Marco

Dmitry Vyukov

Jun 16, 2020, 3:07:23 AM
to Marco Vanotti, syzkaller
On Tue, Jun 16, 2020 at 8:33 AM Marco Vanotti <mvan...@google.com> wrote:
>
> Hi Dmitry,
>
> Thanks for your answer. This ended up being a different bug that caused the kernel to just reboot. I manually minimized the syscalls (removing a few and checking whether the crash still happened) and then realized what the bug was.

Good, it's resolved.

> I thought somehow syzkaller was not seeing a kernel panic happening, but it was the machine rebooting directly.

This sounds bad. This means all bugs will be thrown into the "lost
connection" bucket. Is this reboot message in the log but not
recognized as a "bug"? Or is it not in the log at all? If it's the
former, please add a test to pkg/report/testdata/fuchsia/report; if
it's the latter, there is something wrong with console output
interception (however, vm/qemu should do it right because it's widely
used, so maybe the kernel is writing it somewhere else?).
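
The two cases distinguished above can be told apart with a quick check of the captured console output. The sketch below is a hypothetical triage helper, not syzkaller code: `classify_console_output` and both pattern lists are assumptions made for illustration, and the real crash patterns would come from whatever the report parser for the target OS recognizes.

```python
import re
from typing import Iterable


def classify_console_output(
    log: str,
    crash_patterns: Iterable[str],
    reboot_markers: Iterable[str],
) -> str:
    """Sort a console log into one of the three situations discussed.

    Both pattern lists are hypothetical inputs supplied by the caller.
    """
    if any(re.search(p, log) for p in crash_patterns):
        # A known crash signature is present: reported as a bug.
        return "recognized crash"
    if any(re.search(m, log) for m in reboot_markers):
        # Something about the reboot was printed, but no parser pattern
        # matches it: a candidate for a new report test case.
        return "reboot message present but not recognized"
    # Nothing useful was captured: either the kernel printed nothing or
    # the output went somewhere the console capture does not see.
    return "nothing in the log"


if __name__ == "__main__":
    # Placeholder patterns and log text, for illustration only.
    crash = [r"KERNEL PANIC"]
    reboot = [r"rebooting"]
    print(classify_console_output("boot ok\nKERNEL PANIC: oops\n", crash, reboot))
    print(classify_console_output("boot ok\nrebooting now\n", crash, reboot))
    print(classify_console_output("boot ok\n", crash, reboot))
```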

Marco Vanotti

Jun 16, 2020, 3:25:05 AM
to Dmitry Vyukov, syzkaller
No, no. What I meant is that I thought syzkaller was not seeing a crash log, and that is why I wanted to debug it.

It all started with me reverting a fix to a bug, and enabling only the related syscalls so syzkaller could reproduce the problem.

I was seeing crashes (lost connection to test machine), but I was not seeing any output that would indicate that syzkaller found the bug. That's when I sent the first email asking for help debugging it.

It turned out that there was this other bug that would cause the machine to reboot directly, no logs at all. With the bug "fixed" (kernel panic instead of direct reboot), syzkaller recognizes the crash.

I didn't wait around for syz-manager to come up with a reproducer, but it was hitting it pretty consistently.

Tomorrow I will try finding the original bug with this other bug patched up :)

Best,
Marco

Dmitry Vyukov

Jun 16, 2020, 3:33:17 AM
to Marco Vanotti, syzkaller
On Tue, Jun 16, 2020 at 9:25 AM Marco Vanotti <mvan...@google.com> wrote:
>
> No, no. What I meant is that I thought syzkaller was not seeing a crash log, and that is why I wanted to debug it.
>
> It all started with me reverting a fix to a bug, and enabling only the related syscalls so syzkaller could reproduce the problem.
>
> I was seeing crashes (lost connection to test machine), but I was not seeing any output that would indicate that syzkaller found the bug. That's when I sent the first email asking for help debugging it.
>
> It turned out that there was this other bug that would cause the machine to reboot directly, no logs at all.

Ah, I see. So this is working-as-intended on the syzkaller side. Thanks.