linux-next boot error: BUG: unable to handle kernel NULL pointer dereference in mempool_init_node

syzbot

Nov 11, 2020, 2:45:18 AM
to ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, s...@canb.auug.org.au, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 3e14f70c Add linux-next specific files for 20201111
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=12e6af62500000
kernel config: https://syzkaller.appspot.com/x/.config?x=d6f4c7e100b61b76
dashboard link: https://syzkaller.appspot.com/bug?extid=2d6f3dad1a42d86a5801
compiler: gcc (GCC) 10.1.0-syz 20200507

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+2d6f3d...@syzkaller.appspotmail.com

RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
NET: Registered protocol family 44
pci_bus 0000:00: resource 4 [io 0x0000-0x0cf7 window]
pci_bus 0000:00: resource 5 [io 0x0d00-0xffff window]
pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window]
pci_bus 0000:00: resource 7 [mem 0xc0000000-0xfebfefff window]
pci 0000:00:00.0: Limiting direct PCI/PCI transfers
PCI: CLS 0 bytes, default 64
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
software IO TLB: mapped [mem 0x00000000b5e00000-0x00000000b9e00000] (64MB)
RAPL PMU: API unit is 2^-32 Joules, 0 fixed counters, 10737418240 ms ovfl timer
kvm: already loaded the other module
clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x212735223b2, max_idle_ns: 440795277976 ns
clocksource: Switched to clocksource tsc
Initialise system trusted keyrings
workingset: timestamp_bits=40 max_order=21 bucket_order=0
zbud: loaded
DLM installed
squashfs: version 4.0 (2009/01/31) Phillip Lougher
FS-Cache: Netfs 'nfs' registered for caching
NFS: Registering the id_resolver key type
Key type id_resolver registered
Key type id_legacy registered
nfs4filelayout_init: NFSv4 File Layout Driver Registering...
Installing knfsd (copyright (C) 1996 ok...@monad.swb.de).
FS-Cache: Netfs 'cifs' registered for caching
Key type cifs.spnego registered
Key type cifs.idmap registered
ntfs: driver 2.1.32 [Flags: R/W].
efs: 1.0a - http://aeschi.ch.eu.org/efs/
jffs2: version 2.2. (NAND) (SUMMARY) © 2001-2006 Red Hat, Inc.
romfs: ROMFS MTD (C) 2007 Red Hat, Inc.
QNX4 filesystem 0.2.3 registered.
qnx6: QNX6 filesystem 1.0.0 registered.
fuse: init (API version 7.32)
orangefs_debugfs_init: called with debug mask: :none: :0:
orangefs_init: module version upstream loaded
JFS: nTxBlock = 8192, nTxLock = 65536
SGI XFS with ACLs, security attributes, realtime, quota, no debug enabled
9p: Installing v9fs 9p2000 file system support
FS-Cache: Netfs '9p' registered for caching
NILFS version 2 loaded
befs: version: 0.9.3
ocfs2: Registered cluster interface o2cb
ocfs2: Registered cluster interface user
OCFS2 User DLM kernel interface loaded
gfs2: GFS2 installed
BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-rc3-next-20201111-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:nearest_obj include/linux/slub_def.h:169 [inline]
RIP: 0010:____kasan_slab_free+0x19/0x110 mm/kasan/common.c:350
Code: 00 48 c7 c0 fb ff ff ff c3 cc cc cc cc cc cc cc cc 41 55 49 89 d5 41 54 49 89 fc 48 89 f7 55 48 89 f5 53 89 cb e8 f7 27 7e ff <41> 8b 7c 24 18 48 be 00 00 00 00 00 16 00 00 48 c1 e8 0c 48 89 c1
RSP: 0000:ffffc90000c67d30 EFLAGS: 00010293
RAX: 00000001436d0000 RBX: 0000000000000000 RCX: ffffffff8130a760
RDX: ffff888140748000 RSI: ffffffff8130a76a RDI: 0000000000000007
RBP: ffff8881436d0000 R08: 00000000000000fe R09: ffffed10286da800
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffffffff81945766 R14: ffff888143557944 R15: ffffffff81943b80
FS: 0000000000000000(0000) GS:ffff8880b9e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000018 CR3: 000000000b08e000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
kasan_slab_free_mempool include/linux/kasan.h:202 [inline]
kasan_poison_element mm/mempool.c:107 [inline]
add_element mm/mempool.c:124 [inline]
mempool_init_node+0x37e/0x580 mm/mempool.c:205
mempool_create_node mm/mempool.c:269 [inline]
mempool_create+0x76/0xc0 mm/mempool.c:254
mempool_create_kmalloc_pool include/linux/mempool.h:88 [inline]
init_caches fs/ceph/super.c:785 [inline]
init_ceph+0x193/0x2d7 fs/ceph/super.c:1261
do_one_initcall+0x103/0x650 init/main.c:1212
do_initcall_level init/main.c:1285 [inline]
do_initcalls init/main.c:1301 [inline]
do_basic_setup init/main.c:1321 [inline]
kernel_init_freeable+0x600/0x684 init/main.c:1521
kernel_init+0xd/0x1b8 init/main.c:1410
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
Modules linked in:
CR2: 0000000000000018
---[ end trace d7568b3491dd0938 ]---
RIP: 0010:nearest_obj include/linux/slub_def.h:169 [inline]
RIP: 0010:____kasan_slab_free+0x19/0x110 mm/kasan/common.c:350
Code: 00 48 c7 c0 fb ff ff ff c3 cc cc cc cc cc cc cc cc 41 55 49 89 d5 41 54 49 89 fc 48 89 f7 55 48 89 f5 53 89 cb e8 f7 27 7e ff <41> 8b 7c 24 18 48 be 00 00 00 00 00 16 00 00 48 c1 e8 0c 48 89 c1
RSP: 0000:ffffc90000c67d30 EFLAGS: 00010293
RAX: 00000001436d0000 RBX: 0000000000000000 RCX: ffffffff8130a760
RDX: ffff888140748000 RSI: ffffffff8130a76a RDI: 0000000000000007
RBP: ffff8881436d0000 R08: 00000000000000fe R09: ffffed10286da800
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffffffff81945766 R14: ffff888143557944 R15: ffffffff81943b80
FS: 0000000000000000(0000) GS:ffff8880b9e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000018 CR3: 000000000b08e000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Qian Cai

Nov 11, 2020, 11:26:23 AM
to syzbot, ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, s...@canb.auug.org.au, syzkall...@googlegroups.com, Andrey Konovalov, Dmitry Vyukov, Alexander Potapenko, Marco Elver
It looks to me like the code paths below were recently modified heavily by this
patchset. If this is reproducible, it can be confirmed by reverting it.

https://lore.kernel.org/linux-arm-kernel/cover.160504666...@google.com/

Andrey Konovalov

Nov 11, 2020, 12:43:58 PM
to Qian Cai, syzbot, Andrew Morton, LKML, Linux Memory Management List, Linux-Next Mailing List, Stephen Rothwell, syzkaller-bugs, Dmitry Vyukov, Alexander Potapenko, Marco Elver
On Wed, Nov 11, 2020 at 5:26 PM Qian Cai <c...@redhat.com> wrote:
>
> It looks to me like the code paths below were recently modified heavily by this
> patchset. If this is reproducible, it can be confirmed by reverting it.
>
> https://lore.kernel.org/linux-arm-kernel/cover.160504666...@google.com/

I'll try to reproduce this and figure out the issue. Thanks for letting us know!

Lorenzo Stoakes

Nov 11, 2020, 2:26:59 PM
to Andrey Konovalov, Qian Cai, syzbot, Andrew Morton, LKML, Linux Memory Management List, Linux-Next Mailing List, Stephen Rothwell, syzkaller-bugs, Dmitry Vyukov, Alexander Potapenko, Marco Elver
On Wed, 11 Nov 2020 at 17:44, Andrey Konovalov <andre...@google.com> wrote:
> I'll try to reproduce this and figure out the issue. Thanks for letting us know!

I hope you don't mind me diving in here, I was taking a look just now
and managed to reproduce this locally - I bisected the issue to
105397399 ("kasan: simplify kasan_poison_kfree").

If I stick a simple check in as below it fixes the issue, so I'm
guessing something is violating the assumptions in 105397399?


diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 7a94cebc0324..16163159a017 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -387,6 +387,11 @@ void __kasan_slab_free_mempool(void *ptr, unsigned long ip)
 	struct page *page;

 	page = virt_to_head_page(ptr);
+
+	if (!PageSlab(page)) {
+		return;
+	}
+
 	____kasan_slab_free(page->slab_cache, ptr, ip, false);
 }


--
Lorenzo Stoakes
https://ljs.io

Andrey Konovalov

Nov 11, 2020, 2:44:21 PM
to Lorenzo Stoakes, Qian Cai, syzbot, Andrew Morton, LKML, Linux Memory Management List, Linux-Next Mailing List, Stephen Rothwell, syzkaller-bugs, Dmitry Vyukov, Alexander Potapenko, Marco Elver
Ah, by the looks of it, ceph's init_caches() function asks for a
kmalloc-backed mempool, but at the same time provides a size that
doesn't fit into any kmalloc cache, so kmalloc falls back onto
page_alloc. Hard to say whether this is an issue in ceph, but I guess
we'll have to make KASAN foolproof either way and keep the PageSlab()
check in kasan_slab_free_mempool().

Thank you for debugging this, Lorenzo. I'll fix this in v10.
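
For readers following along, the failure mode described above can be sketched in plain userspace C. This is only a toy model, not kernel code: struct model_page, struct model_cache, MODEL_MAX_CACHE_SIZE, model_backing_page() and model_slab_free_mempool() are all hypothetical names invented for the illustration, and the 8192-byte threshold is an assumed value (the real limit depends on the allocator and kernel configuration). The sketch shows why a free hook must check the slab flag before touching the cache pointer when a large request is served by the page allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; NOT the real definitions. */
struct model_cache {
	unsigned int size;		/* object size served by this cache */
};

struct model_page {
	bool is_slab;			/* models PageSlab(page) */
	struct model_cache *cache;	/* models page->slab_cache; NULL if !is_slab */
};

/* Assumed threshold above which the model falls back to the page allocator. */
#define MODEL_MAX_CACHE_SIZE 8192u

static struct model_cache model_kmalloc_cache = { .size = MODEL_MAX_CACHE_SIZE };

/* Describe the page that would back an allocation of `size` bytes. */
static struct model_page model_backing_page(size_t size)
{
	struct model_page page = { .is_slab = false, .cache = NULL };

	if (size <= MODEL_MAX_CACHE_SIZE) {
		/* Small enough: served from a slab cache. */
		page.is_slab = true;
		page.cache = &model_kmalloc_cache;
	}
	/* Otherwise: page-allocator memory, with no slab cache attached. */
	return page;
}

/*
 * Guarded free hook. Returns true if the slab path ran and stores the
 * cache's object size in *seen_size; returns false if the page is not
 * slab-backed, in which case the cache pointer is never dereferenced.
 */
static bool model_slab_free_mempool(const struct model_page *page,
				    unsigned int *seen_size)
{
	if (!page->is_slab)
		return false;

	assert(page->cache != NULL);
	*seen_size = page->cache->size;
	return true;
}
```

In this model, a 256-byte request yields a slab-backed page and the hook reads the cache size, while a 64 KiB request yields a page with a NULL cache pointer and the hook returns early. Without the is_slab test, the second case would read through a NULL pointer, which matches the shape of the oops reported above.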