[syzbot] riscv/fixes test error: lost connection to test machine

syzbot

May 27, 2022, 8:55:24 AM
to a...@eecs.berkeley.edu, linux-...@vger.kernel.org, linux...@lists.infradead.org, pal...@dabbelt.com, paul.w...@sifive.com, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: c932edeaf6d6 riscv: dts: microchip: fix gpio1 reg property..
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
console output: https://syzkaller.appspot.com/x/log.txt?x=1418add5f00000
kernel config: https://syzkaller.appspot.com/x/.config?x=aa6b5702bdf14a17
dashboard link: https://syzkaller.appspot.com/bug?extid=2c5da6a0a16a0c4f34aa
compiler: riscv64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
userspace arch: riscv64

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+2c5da6...@syzkaller.appspotmail.com



---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Dmitry Vyukov

May 27, 2022, 9:02:01 AM
to syzbot, a...@eecs.berkeley.edu, linux-...@vger.kernel.org, linux...@lists.infradead.org, pal...@dabbelt.com, paul.w...@sifive.com, syzkall...@googlegroups.com
On Fri, 27 May 2022 at 14:55, syzbot
<syzbot+2c5da6...@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: c932edeaf6d6 riscv: dts: microchip: fix gpio1 reg property..
> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
> console output: https://syzkaller.appspot.com/x/log.txt?x=1418add5f00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=aa6b5702bdf14a17
> dashboard link: https://syzkaller.appspot.com/bug?extid=2c5da6a0a16a0c4f34aa
> compiler: riscv64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> userspace arch: riscv64
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+2c5da6...@syzkaller.appspotmail.com

CONFIG_KASAN_VMALLOC allows the riscv kernel to boot, but Go
processes have now started crashing with:

1970/01/01 00:06:55 fuzzer started
runtime: lfstack.push invalid packing: node=0xffffff5908a940 cnt=0x1
packed=0xffff5908a9400001 -> node=0xffff5908a940
fatal error: lfstack.push
runtime stack:
runtime.throw({0x30884c, 0xc})
/usr/local/go/src/runtime/panic.go:1198 +0x60
runtime.(*lfstack).push(0xdb3850, 0xffffff5908a940)
/usr/local/go/src/runtime/lfstack.go:30 +0x1a8

The Go runtime tries to shove some data into the upper 16 bits of pointers,
assuming they are unused.
However, the original pointer node=0xffffff5908a940 suggests that riscv now
has a 56-bit user-space address space?
Documentation/riscv/vm-layout.rst claims 48-bit pointers:
"
The RISC-V privileged architecture document states that the 64bit addresses
"must have bits 63–48 all equal to bit 47, or else a page-fault exception will
occur.":
...
0000000000000000 | 0 | 0000003fffffffff | 256 GB | user-space virtual memory, different per mm
"
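
The values in the crash message can be reproduced with a small sketch of the packing arithmetic. It assumes the 48-bit address width and 19-bit counter field that the logged numbers imply; the names `pack`, `unpack`, `addrBits` and `cntBits` are illustrative and not copied from the Go runtime. The pointer is shifted left by 16 bits and the counter is stored in the low bits, so any pointer wider than 48 bits loses its top byte.

```go
package main

import "fmt"

// Assumed layout, inferred from the values in the crash log:
// pointers are treated as 48 bits wide and 8-byte aligned, so the
// upper 16 bits plus the low 3 bits are reused for a counter.
const (
	addrBits = 48
	cntBits  = 64 - addrBits + 3
)

// pack stores node in the upper bits and cnt in the low cntBits bits.
func pack(node, cnt uint64) uint64 {
	return node<<(64-addrBits) | cnt&(1<<cntBits-1)
}

// unpack recovers the node address and drops the counter bits.
func unpack(packed uint64) uint64 {
	return packed >> cntBits << 3
}

func main() {
	node := uint64(0xffffff5908a940) // 56-bit pointer from the crash log
	packed := pack(node, 1)
	fmt.Printf("packed=%#x -> node=%#x\n", packed, unpack(packed))
	fmt.Println("round trip ok:", unpack(packed) == node)
}
```

Under these assumptions the sketch prints packed=0xffff5908a9400001 and node=0xffff5908a940, the same values as the log line, and the round trip fails because the top eight bits of the 56-bit pointer were shifted out.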

Alexandre Ghiti

May 27, 2022, 9:50:21 AM
to syzkaller-bugs
Yes, sv57 was merged recently.
 
> Documentation/riscv/vm-layout.rst claims 48-bit pointers:
> "
> The RISC-V privileged architecture document states that the 64bit addresses
> "must have bits 63–48 all equal to bit 47, or else a page-fault exception will
> occur.":

Thanks for pointing that out. I extracted that from the specification before sv57 was specified; I'll fix it.
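
The quoted sentence is the sv48 form of a rule that applies to each mode: with N virtual-address bits (39, 48 or 57), bits 63 down to N must all equal bit N-1. A minimal sketch of that check, with a hypothetical helper name `isCanonical`, applied to the pointer from the crash log:

```go
package main

import "fmt"

// isCanonical reports whether addr is a valid virtual address for a mode
// with vaBits virtual-address bits (39, 48 or 57): every bit from 63 down
// to vaBits must equal bit vaBits-1.
func isCanonical(addr uint64, vaBits uint) bool {
	shift := 64 - vaBits
	// Sign-extend the low vaBits bits and compare with the original.
	return uint64(int64(addr<<shift)>>shift) == addr
}

func main() {
	addr := uint64(0xffffff5908a940) // pointer from the crash log
	for _, bits := range []uint{39, 48, 57} {
		fmt.Printf("sv%d: canonical=%v\n", bits, isCanonical(addr, bits))
	}
}
```

The pointer is only accepted for 57 bits, which is consistent with the kernel running in sv57 mode.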

The current kernel code will use sv57, as it is supported and advertised by qemu. To my knowledge, you can't downgrade to sv48 other than by recompiling qemu with the following change:

diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 6dbe9b541f..a64b50ed75 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -637,7 +637,7 @@ static const char valid_vm_1_10_64[16] = {
     [VM_1_10_MBARE] = 1,
     [VM_1_10_SV39] = 1,
     [VM_1_10_SV48] = 1,
-    [VM_1_10_SV57] = 1
+    [VM_1_10_SV57] = 0
 };
 
 /* Machine Information Registers */

Dmitry Vyukov

May 27, 2022, 9:55:24 AM
to Alexandre Ghiti, syzkaller-bugs
There is no kernel config to force SV48/39, right?

Alexandre Ghiti

May 27, 2022, 10:01:19 AM
to syzkaller-bugs
No, we rely on what the hardware advertises: if it supports sv57, we'll go for sv57; if not, we'll try sv48, etc. I had some patches to force the downgrade via the device tree, but they never got merged.
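
The fallback order described here can be sketched as a simple widest-first selection. This is only an illustration, not the kernel's code: the `selectMode` helper and the map of advertised modes are hypothetical, and sv39 as the last fallback is an assumption, since the message only names sv57 and sv48.

```go
package main

import "fmt"

// Candidate modes, widest first. The thread names sv57 and sv48; sv39 as
// the final fallback is an assumption of this sketch.
var candidates = []string{"sv57", "sv48", "sv39"}

// selectMode returns the first (widest) candidate that is advertised.
// The advertised set is passed in as a plain map for illustration.
func selectMode(advertised map[string]bool) string {
	for _, mode := range candidates {
		if advertised[mode] {
			return mode
		}
	}
	return "bare" // hypothetical outcome when no paging mode is advertised
}

func main() {
	stock := map[string]bool{"sv39": true, "sv48": true, "sv57": true}
	patched := map[string]bool{"sv39": true, "sv48": true} // sv57 disabled, as in the qemu diff
	fmt.Println(selectMode(stock), selectMode(patched))
}
```

With sv57 advertised the selection lands on sv57; with it switched off, as in the qemu change earlier in the thread, the same logic falls back to sv48.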

Dmitry Vyukov

May 27, 2022, 1:04:21 PM
to Alexandre Ghiti, syzkaller-bugs, syzbot, Albert Ou, LKML, linux-riscv, Palmer Dabbelt, Paul Walmsley
On Fri, 27 May 2022 at 16:01, Alexandre Ghiti
+original CC list

FTR sent Go runtime change to support SV57:
https://go-review.googlesource.com/c/go/+/409055

Dmitry Vyukov

May 27, 2022, 1:12:55 PM
to Alexandre Ghiti, syzkaller-bugs, syzbot, Albert Ou, LKML, linux-riscv, Palmer Dabbelt, Paul Walmsley
Is CONFIG_CMDLINE broken on riscv?
I am running with:

CONFIG_CMDLINE="earlyprintk=serial net.ifnames=0
sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb
nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000
nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000
nf-conntrack-sane.ports=20000 binder.debug_mask=0
rcupdate.rcu_expedited=1 no_hash_pointers page_owner=on
sysctl.vm.nr_hugepages=4 sysctl.vm.nr_overcommit_hugepages=4
secretmem.enable=1 sysctl.max_rcu_stall_to_panic=1
msr.allow_writes=off dummy_hcd.num=2 smp.csd_lock_timeout=300000
watchdog_thresh=165 workqueue.watchdog_thresh=420
sysctl.net.core.netdev_unregister_timeout_secs=420 panic_on_warn=1"

But getting BUGs with the default timeout:
watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:4:2039]

Alexandre Ghiti

May 28, 2022, 4:03:31 AM
to Dmitry Vyukov, Alexandre Ghiti, syzkaller-bugs, syzbot, Albert Ou, LKML, linux-riscv, Palmer Dabbelt, Paul Walmsley
Thank you for that, I'll pull that into Ubuntu when merged. Do you know
if any other programming language does the same and would need a fix too?



Alexandre Ghiti

May 28, 2022, 4:09:58 AM
to Dmitry Vyukov, Alexandre Ghiti, syzkaller-bugs, syzbot, Albert Ou, LKML, linux-riscv, Palmer Dabbelt, Paul Walmsley
This command line is 608 characters long, but we are still stuck with the
default COMMAND_LINE_SIZE of 512, so I imagine that is the problem. I
proposed a patch last year to bump it to 1024, but it never got
merged:
https://lore.kernel.org/lkml/CAEn-LTqTXCEC=bXTvGyo8SNL0JMWRKtiSwQB7R=Pc4uh...@mail.gmail.com/T/#m4b45019dc0f5573f2a50c1f6007c5109fa35efff


>
> But getting BUGs with the default timeout:
> watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:4:2039]
>

Dmitry Vyukov

May 28, 2022, 4:26:58 AM
to Alexandre Ghiti, Alexandre Ghiti, syzkaller-bugs, syzbot, Albert Ou, LKML, linux-riscv, Palmer Dabbelt, Paul Walmsley
Nothing comes to mind right now.
But this is not only about language runtimes; it's about all software
out there. However, x86 has 5-level paging now as well, so it should
have hit these problems earlier... but somehow that did not happen for
the Go runtime.

Dmitry Vyukov

May 28, 2022, 4:31:30 AM
to Alexandre Ghiti, Alexandre Ghiti, syzkaller-bugs, syzbot, Albert Ou, LKML, linux-riscv, Palmer Dabbelt, Paul Walmsley
RISC-V maintainers, please merge it now.
I would even suggest 2048:

git grep "define COMMAND_LINE_SIZE" arch/
arch/alpha/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 256
arch/arc/include/asm/setup.h:#define COMMAND_LINE_SIZE 256
arch/arm/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 1024
arch/arm64/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 2048
arch/ia64/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 2048
arch/m68k/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 256
arch/microblaze/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 256
arch/mips/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 4096
arch/parisc/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 1024
arch/powerpc/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 2048
arch/s390/include/asm/setup.h:#define COMMAND_LINE_SIZE CONFIG_COMMAND_LINE_SIZE
arch/sparc/include/uapi/asm/setup.h:# define COMMAND_LINE_SIZE 2048
arch/sparc/include/uapi/asm/setup.h:# define COMMAND_LINE_SIZE 256
arch/um/include/asm/setup.h:#define COMMAND_LINE_SIZE 4096
arch/x86/include/asm/setup.h:#define COMMAND_LINE_SIZE 2048
arch/xtensa/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 256


It's also interesting how the kernel handles overflow. Imagine one
adds that_critical_security_feature=1 to the end of an existing long
line.
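
To make the concern concrete, here is a minimal sketch of what silent truncation would do to a trailing parameter. It assumes the line is copied into a fixed buffer of COMMAND_LINE_SIZE bytes that must also hold a terminating NUL; that is an assumption for illustration, not a statement of how the riscv code actually behaves. The `truncate` helper and the sample command line are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// truncate mimics copying cmdline into a fixed buffer of size bytes that
// must also hold a terminating NUL, so at most size-1 bytes survive.
// It returns the surviving text and the parameters that were lost or cut.
func truncate(cmdline string, size int) (kept string, lost []string) {
	if size < 1 {
		return "", strings.Fields(cmdline)
	}
	if len(cmdline) < size {
		return cmdline, nil
	}
	kept = cmdline[:size-1]
	keptParams := strings.Fields(kept)
	for i, p := range strings.Fields(cmdline) {
		// A parameter is affected if it is missing or was cut short.
		if i >= len(keptParams) || keptParams[i] != p {
			lost = append(lost, p)
		}
	}
	return kept, lost
}

func main() {
	// Hypothetical short line and a tiny limit, to keep the output readable.
	cmdline := "console=ttyS0 panic_on_warn=1 that_critical_security_feature=1"
	kept, lost := truncate(cmdline, 32)
	fmt.Printf("kept: %q\n", kept)
	if len(lost) > 0 {
		fmt.Printf("warning: command line truncated, affected parameters: %v\n", lost)
	}
}
```

In this sketch the parameter appended last is the one that disappears, and without the warning nothing would indicate that it was never applied.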

Alexandre Ghiti

May 31, 2022, 10:10:02 AM
to Dmitry Vyukov, Alexandre Ghiti, syzkaller-bugs, syzbot, Albert Ou, LKML, linux-riscv, Palmer Dabbelt, Paul Walmsley
Your comment rang a bell, so I searched my old patchsets. I had submitted a patch [1] to print a warning on overflow and to truncate the command line correctly to avoid such issues. It was picked up into another series [2], which was never actually merged. My bad on this one: I followed my patch into the series but not the series itself.

I'll try to re-submit it because I agree the current behaviour is really wrong.

Dmitry Vyukov

Aug 4, 2022, 1:36:11 AM
to Alexandre Ghiti, Alexandre Ghiti, syzkaller-bugs, syzbot, Albert Ou, LKML, linux-riscv, Palmer Dabbelt, Paul Walmsley
On Tue, 31 May 2022 at 16:10, Alexandre Ghiti
FTR I've merged the Go fix for SV57:
https://go-review.googlesource.com/c/go/+/409055
but it will only appear in Go 1.20.

And the command line length issue still stands in the way of reviving syzbot testing.

syzbot

May 25, 2023, 9:51:45 AM
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
Crashes did not happen for a while, no reproducer and no activity.