WARNING in __do_kernel_fault

syzbot

Jan 27, 2021, 11:56:23 AM
to Dave....@arm.com, catalin...@arm.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, mark.r...@arm.com, syzkall...@googlegroups.com, wi...@kernel.org
Hello,

syzbot found the following issue on:

HEAD commit: 2ab38c17 mailmap: remove the "repo-abbrev" comment
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15a25264d00000
kernel config: https://syzkaller.appspot.com/x/.config?x=ad43be24faf1194c
dashboard link: https://syzkaller.appspot.com/bug?extid=45b6fce29ff97069e2c5
userspace arch: arm64

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+45b6fc...@syzkaller.appspotmail.com

REISERFS (device loop0): Using rupasov hash to sort names
------------[ cut here ]------------
Ignoring spurious kernel translation fault at virtual address 0000000000000030
WARNING: CPU: 1 PID: 5380 at arch/arm64/mm/fault.c:364 __do_kernel_fault+0x198/0x1c0 arch/arm64/mm/fault.c:364
Modules linked in:
CPU: 1 PID: 5380 Comm: syz-executor.0 Not tainted 5.11.0-rc5-syzkaller-00037-g2ab38c17aac1 #0
Hardware name: linux,dummy-virt (DT)
pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
pc : __do_kernel_fault+0x198/0x1c0 arch/arm64/mm/fault.c:364
lr : __do_kernel_fault+0x198/0x1c0 arch/arm64/mm/fault.c:364
sp : ffff800014933830
x29: ffff800014933830 x28: f1ff00000c28bc00
x27: ffff80001231db80 x26: f0ff00002054a0b8
x25: 0000000000000000 x24: f1ff000004217680
x23: 0000000097c78006 x22: 0000000000000030
x21: 0000000000000025 x20: ffff800014933960
x19: 0000000097c78006 x18: 00000000fffffffb
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000020 x14: 6c656e72656b2073
x13: 00000000000006f9 x12: ffff8000149334e0
x11: ffff80001313b450 x10: 00000000ffffe000
x9 : ffff80001313b450 x8 : ffff80001308b450
x7 : ffff80001313b450 x6 : 0000000000000000
x5 : ffff00007fbe1948 x4 : 0000000000015ff5
x3 : 0000000000000001 x2 : 0000000000000000
x1 : 0000000000000000 x0 : f1ff00000c28bc00
Call trace:
__do_kernel_fault+0x198/0x1c0 arch/arm64/mm/fault.c:364
do_page_fault+0x1c0/0x3a0 arch/arm64/mm/fault.c:649
do_translation_fault+0xb4/0xc4 arch/arm64/mm/fault.c:660
do_mem_abort+0x44/0xbc arch/arm64/mm/fault.c:793
el1_abort+0x40/0x6c arch/arm64/kernel/entry-common.c:118
el1_sync_handler+0xb0/0xcc arch/arm64/kernel/entry-common.c:209
el1_sync+0x70/0x100 arch/arm64/kernel/entry.S:656
reiserfs_xattr_jcreate_nblocks fs/reiserfs/xattr.h:79 [inline]
reiserfs_security_init+0x98/0x10c fs/reiserfs/xattr_security.c:70
reiserfs_mkdir+0xf4/0x320 fs/reiserfs/namei.c:821
xattr_mkdir.constprop.0+0x24/0x3c fs/reiserfs/xattr.c:76
create_privroot fs/reiserfs/xattr.c:889 [inline]
reiserfs_xattr_init+0x16c/0x320 fs/reiserfs/xattr.c:1011
reiserfs_fill_super+0xa34/0xd20 fs/reiserfs/super.c:2177
mount_bdev+0x1c4/0x1f0 fs/super.c:1366
get_super_block+0x1c/0x30 fs/reiserfs/super.c:2606
legacy_get_tree+0x34/0x64 fs/fs_context.c:592
vfs_get_tree+0x2c/0xf0 fs/super.c:1496
do_new_mount fs/namespace.c:2881 [inline]
path_mount+0x3e8/0xaf0 fs/namespace.c:3211
do_mount fs/namespace.c:3224 [inline]
__do_sys_mount fs/namespace.c:3432 [inline]
__se_sys_mount fs/namespace.c:3409 [inline]
__arm64_sys_mount+0x1a8/0x2fc fs/namespace.c:3409
__invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
invoke_syscall arch/arm64/kernel/syscall.c:49 [inline]
el0_svc_common.constprop.0+0x74/0x190 arch/arm64/kernel/syscall.c:159
do_el0_svc+0x78/0x90 arch/arm64/kernel/syscall.c:198
el0_svc+0x14/0x20 arch/arm64/kernel/entry-common.c:365
el0_sync_handler+0x1a8/0x1b0 arch/arm64/kernel/entry-common.c:381
el0_sync+0x190/0x1c0 arch/arm64/kernel/entry.S:699


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Dmitry Vyukov

Jan 27, 2021, 12:00:43 PM
to syzbot, Dave Martin, Catalin Marinas, Linux ARM, LKML, Mark Rutland, syzkaller-bugs, Will Deacon, Andrey Konovalov
On Wed, Jan 27, 2021 at 5:56 PM syzbot
<syzbot+45b6fc...@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 2ab38c17 mailmap: remove the "repo-abbrev" comment
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15a25264d00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=ad43be24faf1194c
> dashboard link: https://syzkaller.appspot.com/bug?extid=45b6fce29ff97069e2c5
> userspace arch: arm64
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+45b6fc...@syzkaller.appspotmail.com

This happens on an arm64 instance with MTE enabled.
There is a GPF in reiserfs_xattr_init reported on x86_64:
https://syzkaller.appspot.com/bug?id=8abaedbdeb32c861dc5340544284167dd0e46cde
so I would assume it's just a plain NULL deref. Is this WARNING not
indicative of a kernel bug? Or is there something special about this
particular NULL deref?
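
A "plain NULL deref" rarely faults at address 0 itself: reading a member through a NULL base pointer faults at the member's offset, which is consistent with the small fault address 0x30 in the report. The following is a minimal sketch of that arithmetic. The struct `example_obj` and the helper `member_fault_address()` are hypothetical and are not taken from reiserfs; the layout was chosen only so that the member sits at offset 0x30.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, chosen only so that `target` lands at offset 0x30. */
struct example_obj {
	uint64_t header[6];	/* 6 * 8 = 48 = 0x30 bytes */
	void *target;		/* member read through the bad pointer */
};

/*
 * Address the CPU would try to access for base->target, computed without
 * dereferencing anything: base address plus the member offset.
 */
static uintptr_t member_fault_address(uintptr_t base)
{
	return base + offsetof(struct example_obj, target);
}
```

With a NULL base, `member_fault_address(0)` evaluates to 0x30, so a fault at a small non-zero address is still compatible with a NULL pointer dereference.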

Will Deacon

Jan 27, 2021, 12:15:00 PM
to Dmitry Vyukov, syzbot, Dave Martin, Catalin Marinas, Linux ARM, LKML, Mark Rutland, syzkaller-bugs, Andrey Konovalov
On Wed, Jan 27, 2021 at 06:00:30PM +0100, Dmitry Vyukov wrote:
> On Wed, Jan 27, 2021 at 5:56 PM syzbot
> <syzbot+45b6fc...@syzkaller.appspotmail.com> wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 2ab38c17 mailmap: remove the "repo-abbrev" comment
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=15a25264d00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=ad43be24faf1194c
> > dashboard link: https://syzkaller.appspot.com/bug?extid=45b6fce29ff97069e2c5
> > userspace arch: arm64
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+45b6fc...@syzkaller.appspotmail.com
>
> This happens on an arm64 instance with MTE enabled.
> There is a GPF in reiserfs_xattr_init reported on x86_64:
> https://syzkaller.appspot.com/bug?id=8abaedbdeb32c861dc5340544284167dd0e46cde
> so I would assume it's just a plain NULL deref. Is this WARNING not
> indicative of a kernel bug? Or is there something special about this
> particular NULL deref?

Congratulations, you're the first person to trigger this warning!

This fires if we take an unexpected data abort in the kernel but when we
get into the fault handler the page-table looks ok (according to the CPU via
an 'AT' instruction). Are you using QEMU system emulation? Perhaps its
handling of AT isn't quite right.

Will
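
The check described above can be sketched in userspace C as a pure function over the PAR_EL1 value that the 'AT' instruction leaves behind. This is a minimal sketch, not the kernel's code: the helper name `par_says_spurious()` and the macro names are hypothetical, and the second branch (treating an AT failure of a different fault class as spurious) is an assumption about the upstream logic that the thread does not spell out. The bit positions follow the Arm architecture's PAR_EL1 layout (F in bit 0, FST in bits 6:1).

```c
#include <stdbool.h>
#include <stdint.h>

/* PAR_EL1 layout after an AT instruction. */
#define PAR_F		(UINT64_C(1) << 0)	/* 1 = the translation aborted */
#define PAR_FST_MASK	(UINT64_C(0x3f) << 1)	/* fault status, valid when F = 1 */

/* Fault status code classes, shared by ESR_ELx.DFSC and PAR_EL1.FST. */
#define FSC_TYPE_MASK	0x3cu	/* strips the lookup level from the code */
#define FSC_TYPE_FAULT	0x04u	/* translation fault, any level */

/*
 * Hypothetical helper: given the PAR_EL1 value read back after re-walking
 * the faulting address, decide whether the original translation fault
 * would be classed as spurious.
 */
static bool par_says_spurious(uint64_t par)
{
	/* AT succeeded: the page table holds a valid translation. */
	if (!(par & PAR_F))
		return true;

	/* Assumption: AT failing for a non-translation reason also counts. */
	uint32_t fst = (uint32_t)((par & PAR_FST_MASK) >> 1);
	return (fst & FSC_TYPE_MASK) != FSC_TYPE_FAULT;
}
```

Under this sketch, a genuine NULL dereference should produce a PAR_EL1 value with F set and a translation-fault status, so the helper returns false. The warning firing on every such fault suggests that the emulated AT result did not look like that.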

Dmitry Vyukov

Jan 27, 2021, 12:24:34 PM
to Will Deacon, syzbot, Dave Martin, Catalin Marinas, Linux ARM, LKML, Mark Rutland, syzkaller-bugs, Andrey Konovalov
Hi Will,

Yes, it's qemu-system-aarch64 5.2 with -machine virt,mte=on -cpu max.
Do you see any way forward for this issue? Can we somehow prove or
disprove that QEMU is at fault?
The instance just started running, but this seems to be the most common
crash so far, and it seems to happen on _all_ GPFs.
You can see all arm64 crashes so far here:
https://syzkaller.appspot.com/upstream?manager=ci-qemu2-arm64-mte
They all happen in reiserfs_security_init, but locally I got a bunch
of different stacks, e.g.:


------------[ cut here ]------------
Ignoring spurious kernel translation fault at virtual address ffff8000170f5fe0
WARNING: CPU: 1 PID: 16450 at arch/arm64/mm/fault.c:364 __do_kernel_fault+0x198/0x1c0 arch/arm64/mm/fault.c:364
Modules linked in:
CPU: 1 PID: 16450 Comm: syz-executor.1 Not tainted 5.11.0-rc3 #36
Hardware name: linux,dummy-virt (DT)
pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
pc : __do_kernel_fault+0x198/0x1c0 arch/arm64/mm/fault.c:364
lr : __do_kernel_fault+0x198/0x1c0 arch/arm64/mm/fault.c:364
sp : ffff800015443550
x29: ffff800015443550 x28: fcff00002db4bc00
x27: 0000000000000020 x26: 0000000000000001
x25: 0000000000000018 x24: 0000000000000008
x23: 0000000080400009 x22: ffff8000170f5fe0
x21: 0000000000000025 x20: ffff800015443620
x19: 0000000097800047 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 00003d8e94eba1ca x14: 0000000000000017
x13: 0000000000000017 x12: 0000000000000000
x11: 0000000000000010 x10: 68b6895a9d433f2e
x9 : f39ec128c34c6307 x8 : fcff00002db4ca98
x7 : f5ff00000e65a400 x6 : 000000403e245885
x5 : 0000000000000000 x4 : ffff00007dbe1948
x3 : ffff00007dbe84b0 x2 : ffff00007dbe1948
x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
__do_kernel_fault+0x198/0x1c0 arch/arm64/mm/fault.c:364
do_bad_area arch/arm64/mm/fault.c:462 [inline]
do_translation_fault+0x5c/0xc4 arch/arm64/mm/fault.c:662
do_mem_abort+0x44/0xb4 arch/arm64/mm/fault.c:792
el1_abort+0x40/0x6c arch/arm64/kernel/entry-common.c:118
el1_sync_handler+0xb0/0xcc arch/arm64/kernel/entry-common.c:209
el1_sync+0x70/0x100 arch/arm64/kernel/entry.S:656
fast_imageblit drivers/video/fbdev/core/sysimgblt.c:229 [inline]
sys_imageblit+0x3b4/0x440 drivers/video/fbdev/core/sysimgblt.c:275
drm_fb_helper_sys_imageblit drivers/gpu/drm/drm_fb_helper.c:794 [inline]
drm_fbdev_fb_imageblit+0x5c/0x80 drivers/gpu/drm/drm_fb_helper.c:2266
bit_putcs_unaligned drivers/video/fbdev/core/bitblit.c:139 [inline]
bit_putcs+0x23c/0x470 drivers/video/fbdev/core/bitblit.c:188
fbcon_putcs+0xfc/0x120 drivers/video/fbdev/core/fbcon.c:1304
do_update_region+0x158/0x1b4 drivers/tty/vt/vt.c:676
invert_screen+0xe4/0x1f4 drivers/tty/vt/vt.c:800
highlight drivers/tty/vt/selection.c:57 [inline]
clear_selection drivers/tty/vt/selection.c:84 [inline]
clear_selection+0x50/0x70 drivers/tty/vt/selection.c:80
vc_do_resize+0x4f8/0x574 drivers/tty/vt/vt.c:1241
vc_resize+0x24/0x30 drivers/tty/vt/vt.c:1346
fbcon_do_set_font+0xd8/0x2c0 drivers/video/fbdev/core/fbcon.c:2402
fbcon_set_font+0x200/0x260 drivers/video/fbdev/core/fbcon.c:2488
con_font_set drivers/tty/vt/vt.c:4667 [inline]
con_font_op+0x2b8/0x444 drivers/tty/vt/vt.c:4711
vt_io_ioctl drivers/tty/vt/vt_ioctl.c:587 [inline]
vt_ioctl+0x17b0/0x2020 drivers/tty/vt/vt_ioctl.c:817
tty_ioctl+0xa60/0xe5c drivers/tty/tty_io.c:2658
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl fs/ioctl.c:739 [inline]
__arm64_sys_ioctl+0xac/0xf0 fs/ioctl.c:739
__invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
el0_svc_common.constprop.0+0x74/0x190 arch/arm64/kernel/syscall.c:158
do_el0_svc+0x78/0x90 arch/arm64/kernel/syscall.c:204
el0_svc+0x14/0x20 arch/arm64/kernel/entry-common.c:365
el0_sync_handler+0x1a8/0x1b0 arch/arm64/kernel/entry-common.c:381
el0_sync+0x190/0x1c0 arch/arm64/kernel/entry.S:699


WARNING: CPU: 1 PID: 399 at arch/arm64/mm/fault.c:364 __do_kernel_fault+0x198/0x1c0 arch/arm64/mm/fault.c:364
Modules linked in:
CPU: 1 PID: 399 Comm: syz-executor.1 Not tainted 5.11.0-rc3 #36
Hardware name: linux,dummy-virt (DT)
pstate: 60400089 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
pc : __do_kernel_fault+0x198/0x1c0 arch/arm64/mm/fault.c:364
lr : __do_kernel_fault+0x198/0x1c0 arch/arm64/mm/fault.c:364
sp : ffff800030b33a80
x29: ffff800030b33a80 x28: f0ff00000c0d4b00
x27: ffff800012d79098 x26: ffff80001333ce68
x25: f0ff00001aaa4a00 x24: faff00000f219680
x23: 0000000097810006 x22: 0000000000000114
x21: 0000000000000025 x20: ffff800030b33bb0
x19: 0000000097810006 x18: 0000000000000020
x17: 0000000000000000 x16: 0000000000000000
x15: f0ff00000c0d5010 x14: 6c656e72656b2073
x13: 756f697275707320 x12: 756166206e6f6974
x11: 616c736e61727420 x10: 6461206c61757472
x9 : 697620746120746c x8 : 3030303030303030
x7 : 3030207373657264 x6 : ffff8000132e79bf
x5 : 000000000000000a x4 : 0000000000000000
x3 : 0000000000000001 x2 : ffff00007dbe1950
x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
__do_kernel_fault+0x198/0x1c0 arch/arm64/mm/fault.c:364
do_page_fault+0x1c0/0x3a0 arch/arm64/mm/fault.c:649
do_translation_fault+0xb4/0xc4 arch/arm64/mm/fault.c:660
do_mem_abort+0x44/0xb4 arch/arm64/mm/fault.c:792
el1_abort+0x40/0x6c arch/arm64/kernel/entry-common.c:118
el1_sync_handler+0xb0/0xcc arch/arm64/kernel/entry-common.c:209
el1_sync+0x70/0x100 arch/arm64/kernel/entry.S:656
spin_unlock_irq include/linux/spinlock.h:404 [inline]
io_ring_set_wakeup_flag fs/io_uring.c:6930 [inline]
io_disable_sqo_submit+0x5c/0x90 fs/io_uring.c:8891
io_uring_create fs/io_uring.c:9711 [inline]
io_uring_setup+0x6b8/0xe10 fs/io_uring.c:9739
__do_sys_io_uring_setup fs/io_uring.c:9745 [inline]
__se_sys_io_uring_setup fs/io_uring.c:9742 [inline]
__arm64_sys_io_uring_setup+0x20/0x30 fs/io_uring.c:9742
__invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
el0_svc_common.constprop.0+0x74/0x190 arch/arm64/kernel/syscall.c:158
do_el0_svc+0x78/0x90 arch/arm64/kernel/syscall.c:204

Will Deacon

Jan 27, 2021, 12:34:53 PM
to Dmitry Vyukov, syzbot, Dave Martin, Catalin Marinas, Linux ARM, LKML, Mark Rutland, syzkaller-bugs, Andrey Konovalov
> Yes, it's qemu-system-aarch64 5.2 with -machine virt,mte=on -cpu max.
> Do you see any way forward for this issue? Can we somehow prove or
> disprove that QEMU is at fault?
> The instance just started running, but this seems to be the most common
> crash so far, and it seems to happen on _all_ GPFs.
> You can see all arm64 crashes so far here:
> https://syzkaller.appspot.com/upstream?manager=ci-qemu2-arm64-mte
> They all happen in reiserfs_security_init, but locally I got a bunch
> of different stacks, e.g.:

Your best bet is to hack is_spurious_el1_translation_fault() to dump addr,
esr and par, then we can help decipher the logs here. It could also easily be
a bug in that code, since it hasn't been run before (well, other than
contrived testing when I wrote it).

Will
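
As an illustration of what such a dump could contain, here is a userspace sketch that formats the three values and splits out the exception class (EC) and fault status code (FSC) from the syndrome. The helper names are hypothetical, and the output format is an assumption, not something proposed in the thread. Inside the kernel the same format string could be passed to pr_warn(). The ESR field positions (EC in bits 31:26, FSC in bits 5:0) follow the Arm architecture.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* ESR_ELx field positions (Arm architecture). */
#define ESR_EC_SHIFT	26
#define ESR_EC_MASK	0x3fu
#define ESR_FSC_MASK	0x3fu

/* Exception class: what kind of exception was taken. */
static unsigned int esr_ec(uint32_t esr)
{
	return (esr >> ESR_EC_SHIFT) & ESR_EC_MASK;
}

/* Fault status code: why the data abort happened. */
static unsigned int esr_fsc(uint32_t esr)
{
	return esr & ESR_FSC_MASK;
}

/*
 * Hypothetical stand-in for the debug print: write addr, esr and par into
 * buf as one line. Returns the character count, with snprintf semantics.
 */
static int format_fault_dump(char *buf, size_t len,
			     uint64_t addr, uint32_t esr, uint64_t par)
{
	return snprintf(buf, len,
			"addr=0x%016" PRIx64 " esr=0x%08" PRIx32
			" (ec=0x%02x fsc=0x%02x) par=0x%016" PRIx64,
			addr, esr, esr_ec(esr), esr_fsc(esr), par);
}
```

Feeding it the fault address 0x30 from the first report together with 0x97c78006, the value held in x19 and x23 there (assumed here, not confirmed by the thread, to be the syndrome), gives ec=0x25 and fsc=0x06: a data abort taken without a change in exception level, caused by a level 2 translation fault. The par value is the missing piece, since none of the posted logs include it.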

Andrey Konovalov

Jan 27, 2021, 1:46:42 PM
to Dmitry Vyukov, Will Deacon, syzbot, Dave Martin, Catalin Marinas, Linux ARM, LKML, Mark Rutland, syzkaller-bugs
I've reproduced this crash (by taking [1] and changing
sys_memfd_create to 279), but it manifests as a normal null-ptr-deref
for me. I'm using the latest QEMU master. Which QEMU does syzbot use
exactly?

[1] https://syzkaller.appspot.com/text?tag=ReproC&x=14d3621cd00000

Dmitry Vyukov

Jan 27, 2021, 1:57:08 PM
to Andrey Konovalov, Will Deacon, syzbot, Dave Martin, Catalin Marinas, Linux ARM, LKML, Mark Rutland, syzkaller-bugs
qemu-system-aarch64 5.2 from this container:
https://github.com/google/syzkaller/blob/master/tools/docker/syzbot/Dockerfile
you can get a prebuilt version with:
docker pull gcr.io/syzkaller/syzbot



Andrey Konovalov

Jan 27, 2021, 2:16:49 PM
to Dmitry Vyukov, Will Deacon, syzbot, Dave Martin, Catalin Marinas, Linux ARM, LKML, Mark Rutland, syzkaller-bugs
Reproduced with this QEMU, still a normal null-ptr-deref. Where do I
find the full list of arguments that are passed to QEMU on syzbot?

Dmitry Vyukov

Jan 27, 2021, 2:43:47 PM
to Andrey Konovalov, Will Deacon, syzbot, Dave Martin, Catalin Marinas, Linux ARM, LKML, Mark Rutland, syzkaller-bugs
On Wed, Jan 27, 2021 at 8:16 PM 'Andrey Konovalov' via syzkaller-bugs
I am yet to document all details of these new instances, but the
syzkaller config contains:
"qemu_args": "-machine virt,virtualization=on,mte=on,graphics=on,usb=on -cpu max"
the rest are in vm/qemu/qemu.go

Andrey Konovalov

Jan 27, 2021, 2:56:37 PM
to Dmitry Vyukov, Will Deacon, syzbot, Dave Martin, Catalin Marinas, Linux ARM, LKML, Mark Rutland, syzkaller-bugs
OK, the virtualization=on part is what causes this. Bug in QEMU?

Dmitry Vyukov

Mar 12, 2021, 5:56:53 AM
to Will Deacon, syzbot, Dave Martin, Catalin Marinas, Linux ARM, LKML, Mark Rutland, syzkaller-bugs, Andrey Konovalov
Should dumping of addr/esr/par be included in the mainline kernel code if
this WARNING is not decipherable without this info?

Also, Andrey localized this to the mte=on,virtualization=on combination.
Does this point towards a QEMU bug?

syzbot

Jul 12, 2021, 1:10:19 PM
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
Crashes did not happen for a while, no reproducer and no activity.