BUG: unable to handle kernel paging request in xfs_sb_quiet_read_verify

syzbot

Dec 19, 2019, 10:15:09 AM

to ak...@linux-foundation.org, allison....@oracle.com, arya...@virtuozzo.com, bfo...@redhat.com, darric...@oracle.com, dchi...@redhat.com, d...@axtens.net, dvy...@google.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, san...@redhat.com, syzkall...@googlegroups.com, torv...@linux-foundation.org

Hello,

syzbot found the following crash on:

HEAD commit: 2187f215 Merge tag 'for-5.5-rc2-tag' of git://git.kernel.o..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11059951e00000
kernel config: https://syzkaller.appspot.com/x/.config?x=ab2ae0615387ef78
dashboard link: https://syzkaller.appspot.com/bug?extid=4722bf4c6393b73a792b
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12727c71e00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12ff5151e00000

The bug was bisected to:

commit 0609ae011deb41c9629b7f5fd626dfa1ac9d16b0
Author: Daniel Axtens <d...@axtens.net>
Date: Sun Dec 1 01:55:00 2019 +0000

x86/kasan: support KASAN_VMALLOC

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=161240aee00000
final crash: https://syzkaller.appspot.com/x/report.txt?x=151240aee00000
console output: https://syzkaller.appspot.com/x/log.txt?x=111240aee00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+4722bf...@syzkaller.appspotmail.com
Fixes: 0609ae011deb ("x86/kasan: support KASAN_VMALLOC")

BUG: unable to handle page fault for address: fffff52000680000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 21ffee067 P4D 21ffee067 PUD aa51c067 PMD a85e1067 PTE 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 3088 Comm: kworker/1:2 Not tainted 5.5.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: xfs-buf/loop0 xfs_buf_ioend_work
RIP: 0010:xfs_sb_quiet_read_verify+0x47/0xc0 fs/xfs/libxfs/xfs_sb.c:735
Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 7f 49 8b 9c 24 30 01
00 00 48 b8 00 00 00 00 00 fc ff df 48 89 da 48 c1 ea 03 <0f> b6 04 02 84
c0 74 04 3c 03 7e 50 8b 1b bf 58 46 53 42 89 de e8
RSP: 0018:ffffc90008187cc0 EFLAGS: 00010a06
RAX: dffffc0000000000 RBX: ffffc90003400000 RCX: ffffffff82ad3c26
RDX: 1ffff92000680000 RSI: ffffffff82aa0a0f RDI: ffff8880a2cdba70
RBP: ffffc90008187cd0 R08: ffff88809eb6c500 R09: ffffed1015d2703d
R10: ffffed1015d2703c R11: ffff8880ae9381e3 R12: ffff8880a2cdb940
R13: ffff8880a2cdb95c R14: ffff8880a2cdbb74 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffff52000680000 CR3: 000000009f5ab000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
xfs_buf_ioend+0x3f9/0xde0 fs/xfs/xfs_buf.c:1162
xfs_buf_ioend_work+0x19/0x20 fs/xfs/xfs_buf.c:1183
process_one_work+0x9af/0x1740 kernel/workqueue.c:2264
worker_thread+0x98/0xe40 kernel/workqueue.c:2410
kthread+0x361/0x430 kernel/kthread.c:255
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Modules linked in:
CR2: fffff52000680000
---[ end trace 744ceb50d377bf94 ]---
RIP: 0010:xfs_sb_quiet_read_verify+0x47/0xc0 fs/xfs/libxfs/xfs_sb.c:735
Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 7f 49 8b 9c 24 30 01
00 00 48 b8 00 00 00 00 00 fc ff df 48 89 da 48 c1 ea 03 <0f> b6 04 02 84
c0 74 04 3c 03 7e 50 8b 1b bf 58 46 53 42 89 de e8
RSP: 0018:ffffc90008187cc0 EFLAGS: 00010a06
RAX: dffffc0000000000 RBX: ffffc90003400000 RCX: ffffffff82ad3c26
RDX: 1ffff92000680000 RSI: ffffffff82aa0a0f RDI: ffff8880a2cdba70
RBP: ffffc90008187cd0 R08: ffff88809eb6c500 R09: ffffed1015d2703d
R10: ffffed1015d2703c R11: ffff8880ae9381e3 R12: ffff8880a2cdb940
R13: ffff8880a2cdb95c R14: ffff8880a2cdbb74 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffff52000680000 CR3: 000000009f5ab000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

Daniel Axtens

Dec 20, 2019, 1:04:01 AM

to syzbot, ak...@linux-foundation.org, allison....@oracle.com, arya...@virtuozzo.com, bfo...@redhat.com, darric...@oracle.com, dchi...@redhat.com, dvy...@google.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, san...@redhat.com, syzkall...@googlegroups.com, torv...@linux-foundation.org

syzbot <syzbot+4722bf...@syzkaller.appspotmail.com> writes:

> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: 2187f215 Merge tag 'for-5.5-rc2-tag' of git://git.kernel.o..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=11059951e00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=ab2ae0615387ef78
> dashboard link: https://syzkaller.appspot.com/bug?extid=4722bf4c6393b73a792b
> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12727c71e00000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12ff5151e00000
>
> The bug was bisected to:
>
> commit 0609ae011deb41c9629b7f5fd626dfa1ac9d16b0
> Author: Daniel Axtens <d...@axtens.net>
> Date: Sun Dec 1 01:55:00 2019 +0000
>
> x86/kasan: support KASAN_VMALLOC

Looking at the log, it's an access of fffff52000680000 that goes wrong.

Reversing the shadow calculation, it looks like an attempted access of
FFFFC90003400000, which is in vmalloc space. I'm not sure what that
memory represents.
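
For reference, the reversal can be reproduced in userspace. This is a minimal sketch, assuming the x86-64 generic KASAN parameters (shadow scale shift of 3 and shadow offset 0xdffffc0000000000, which is the value in RAX in the register dump) and the default 4-level vmalloc range without KASLR memory randomization. The function and macro names are made up for illustration and are not the kernel's own helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 generic KASAN parameters; the offset matches RAX in the dump. */
#define SHADOW_SCALE_SHIFT 3
#define SHADOW_OFFSET      0xdffffc0000000000ULL

/* Assumed default 4-level vmalloc/ioremap range (no KASLR memory randomization). */
#define VMALLOC_START_DEFAULT 0xffffc90000000000ULL
#define VMALLOC_END_DEFAULT   0xffffe8ffffffffffULL

/* Map a kernel address to the address of its shadow byte. */
static uint64_t mem_to_shadow(uint64_t addr)
{
    return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}

/* Reverse mapping: recovers the original address rounded down to 8 bytes. */
static uint64_t shadow_to_mem(uint64_t shadow)
{
    return (shadow - SHADOW_OFFSET) << SHADOW_SCALE_SHIFT;
}

/* True if addr falls inside the assumed default vmalloc range. */
static int in_default_vmalloc_range(uint64_t addr)
{
    return addr >= VMALLOC_START_DEFAULT && addr <= VMALLOC_END_DEFAULT;
}

/* Check the mapping against the values in the oops (CR2, RBX, RDX). */
static void check_report_values(void)
{
    /* RBX holds the dereferenced pointer, CR2 the faulting shadow address. */
    assert(mem_to_shadow(0xffffc90003400000ULL) == 0xfffff52000680000ULL);
    assert(shadow_to_mem(0xfffff52000680000ULL) == 0xffffc90003400000ULL);
    /* RDX holds the intermediate value, addr >> 3. */
    assert((0xffffc90003400000ULL >> SHADOW_SCALE_SHIFT) == 0x1ffff92000680000ULL);
    assert(in_default_vmalloc_range(0xffffc90003400000ULL));
}
```

Under these assumptions the faulting address in CR2 is the shadow byte for the pointer in RBX, so the fault is on the KASAN shadow lookup and not on the data read itself.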

Looking at the instruction pointer, it seems like we're here:

static void
xfs_sb_quiet_read_verify(
        struct xfs_buf *bp)
{
        struct xfs_dsb *dsb = XFS_BUF_TO_SBP(bp);

        if (dsb->sb_magicnum == cpu_to_be32(XFS_SB_MAGIC)) { <<<< fault here
                /* XFS filesystem, verify noisily! */
                xfs_sb_read_verify(bp);

Is it possible that dsb is junk?

Regards,
Daniel

Brian Foster

Dec 20, 2019, 8:04:54 AM

to Daniel Axtens, syzbot, ak...@linux-foundation.org, allison....@oracle.com, arya...@virtuozzo.com, darric...@oracle.com, dchi...@redhat.com, dvy...@google.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, san...@redhat.com, syzkall...@googlegroups.com, torv...@linux-foundation.org

Hmm.. so the context here is a read I/O completion verifier. That means
the I/O returned success and we're running a verifier function to detect
content corruption, etc., before the buffer read returns to the caller.
This particular call is quiet superblock verification, which is used
when the filesystem may legitimately be something other than XFS (so we
don't want to spit out corruption messages if the verification fails).
From that perspective, it's certainly possible dsb is junk.

The buffer itself is a sector sized uncached buffer. That means the page
count for the buffer shouldn't be more than 1, which in turn means that
->b_addr should be initialized as such:

_xfs_buf_map_pages()
{
        ...
        if (bp->b_page_count == 1) {
                /* A single page buffer is always mappable */
                bp->b_addr = page_address(bp->b_pages[0]) + bp->b_offset;
        ...
}

... which isn't a vmap. However, we do have a multi-read dance in
xfs_readsb() where we first read the superblock without a verifier, read
the sector size specified in the super (which could be garbage) and then
re-read the superblock with a buffer based on that. So when I run the
attached reproducer, I see something like this:

<...>-885 [002] ...1 68.897501: xfs_buf_init: dev 7:0 bno 0xffffffffffffffff nblks 0x1 hold 1 pincount 0 lock 0 flags NO_IOACCT caller xfs_buf_get_uncached+0x91/0x3c0 [xfs]
repro-885 [002] ...1 68.897576: xfs_buf_get_uncached: dev 7:0 bno 0xffffffffffffffff nblks 0x1 hold 1 pincount 0 lock 0 flags NO_IOACCT|PAGES caller xfs_buf_read_uncached+0x3f/0x140 [xfs]
...
repro-885 [002] ...1 68.899077: xfs_buf_init: dev 7:0 bno 0xffffffffffffffff nblks 0x41 hold 1 pincount 0 lock 0 flags NO_IOACCT caller xfs_buf_get_uncached+0x91/0x3c0 [xfs]
repro-885 [002] ...1 68.899613: xfs_buf_get_uncached: dev 7:0 bno 0xffffffffffffffff nblks 0x41 hold 1 pincount 0 lock 0 flags NO_IOACCT|PAGES caller xfs_buf_read_uncached+0x3f/0x140 [xfs]
...

... where the sector size (65 * 512 == 33280) looks bogus. That said, it
looks like we have error checks throughout the page allocation/mapping
sequence so it isn't obvious what the problem is here. As far as we can
tell, we successfully allocated and mapped the 9 pages required for this
I/O. Thus I'd think we'd be able to get far enough to examine the
content to establish this is not a valid XFS sb and fail the mount.
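
The arithmetic behind the second pair of trace entries can be checked with a short sketch. It assumes 4 KiB pages and that nblks counts 512-byte units, consistent with the "65 * 512" figure; the helper and macro names are hypothetical and do not come from the XFS source.

```c
#include <assert.h>
#include <stddef.h>

#define BASIC_BLOCK_SIZE  512u   /* 512-byte units, as in "65 * 512" */
#define ASSUMED_PAGE_SIZE 4096u  /* assumption: 4 KiB pages (x86-64) */

/* Convert a block count from the trace (nblks) to a size in bytes. */
static size_t nblks_to_bytes(size_t nblks)
{
    return nblks * BASIC_BLOCK_SIZE;
}

/* Number of whole pages needed to hold the given number of bytes. */
static size_t bytes_to_pages(size_t bytes)
{
    return (bytes + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE;
}

/* Check the figures quoted for the two reads in the trace. */
static void check_trace_values(void)
{
    /* First read: nblks 0x1 fits in a single page. */
    assert(bytes_to_pages(nblks_to_bytes(0x1)) == 1);
    /* Second read: nblks 0x41 is 33280 bytes and spans 9 pages. */
    assert(nblks_to_bytes(0x41) == 33280);
    assert(bytes_to_pages(nblks_to_bytes(0x41)) == 9);
}
```

With a page count above 1, the single-page branch of _xfs_buf_map_pages() quoted earlier does not apply to the second buffer.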

Since this mapping functionality is fairly fundamental code in XFS, I
ran a quick test to use a multi-page directory block size (i.e. mkfs.xfs
-f <dev> -nsize=8k), started populating a directory and very quickly hit
a similar crash. I'm going to double check that this works as expected
without KASAN vmalloc support enabled, but is it possible something is
wrong with KASAN here?

Brian

Daniel Axtens

Dec 20, 2019, 2:27:57 PM

to Brian Foster, syzbot, ak...@linux-foundation.org, allison....@oracle.com, arya...@virtuozzo.com, darric...@oracle.com, dchi...@redhat.com, dvy...@google.com, linux-...@vger.kernel.org, linu...@vger.kernel.org, san...@redhat.com, syzkall...@googlegroups.com, torv...@linux-foundation.org

>> > HEAD commit: 2187f215 Merge tag 'for-5.5-rc2-tag' of git://git.kernel.o..

> Since this mapping functionality is fairly fundamental code in XFS, I
> ran a quick test to use a multi-page directory block size (i.e. mkfs.xfs
> -f <dev> -nsize=8k), started populating a directory and very quickly hit
> a similar crash. I'm going to double check that this works as expected
> without KASAN vmalloc support enabled, but is it possible something is
> wrong with KASAN here?

Yes, as it turns out. xfs is using vm_map_ram, and the commit syzkaller
is testing is missing the support for vm_map_ram. Support landed in
master at d98c9e83b5e7 ("kasan: fix crashes on access to memory mapped
by vm_map_ram()") but that's _after_ 2187f215 which syzkaller was
testing.

#syz fix: kasan: fix crashes on access to memory mapped by vm_map_ram()

Sorry for the noise.

Regards,
Daniel
