[linux-next:master] [slab] db93cdd664: BUG:kernel_NULL_pointer_dereference,address


kernel test robot

Sep 17, 2025, 1:01:53 AM
to Alexei Starovoitov, oe-...@lists.linux.dev, l...@intel.com, Vlastimil Babka, kasa...@googlegroups.com, cgr...@vger.kernel.org, linu...@kvack.org, olive...@intel.com


Hello,

kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:

commit: db93cdd664fa02de9be883dd29343b21d8fc790f ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: boot

config: i386-randconfig-062-20250913
compiler: clang-20
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <olive...@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202509171214...@intel.com


[ 7.101117][ T0] BUG: kernel NULL pointer dereference, address: 00000010
[ 7.102290][ T0] #PF: supervisor read access in kernel mode
[ 7.103219][ T0] #PF: error_code(0x0000) - not-present page
[ 7.104161][ T0] *pde = 00000000
[ 7.104762][ T0] Thread overran stack, or stack corrupted
[ 7.105726][ T0] Oops: Oops: 0000 [#1]
[ 7.106410][ T0] CPU: 0 UID: 0 PID: 0 Comm: swapper Tainted: G T 6.17.0-rc3-00014-gdb93cdd664fa #1 NONE 40eff3b43e4f0000b061f2e660abd0b2911f31b1
[ 7.108712][ T0] Tainted: [T]=RANDSTRUCT
[ 7.109368][ T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 7.110952][ T0] EIP: kmalloc_nolock_noprof (mm/slub.c:5607)
[ 7.112838][ T0] Code: 90 90 90 90 90 89 45 bc 0f bd 75 bc 75 05 be ff ff ff ff 46 83 fe 0e 0f 83 b6 01 00 00 6b c7 38 8b 84 b0 b4 79 d0 b2 89 45 ec <8b> 40 10 a9 00 00 01 00 75 1b 8b 0d ec 28 db b3 31 f6 a9 87 04 00
All code
========
0: 90 nop
1: 90 nop
2: 90 nop
3: 90 nop
4: 90 nop
5: 89 45 bc mov %eax,-0x44(%rbp)
8: 0f bd 75 bc bsr -0x44(%rbp),%esi
c: 75 05 jne 0x13
e: be ff ff ff ff mov $0xffffffff,%esi
13: 46 83 fe 0e rex.RX cmp $0xe,%esi
17: 0f 83 b6 01 00 00 jae 0x1d3
1d: 6b c7 38 imul $0x38,%edi,%eax
20: 8b 84 b0 b4 79 d0 b2 mov -0x4d2f864c(%rax,%rsi,4),%eax
27: 89 45 ec mov %eax,-0x14(%rbp)
2a:* 8b 40 10 mov 0x10(%rax),%eax <-- trapping instruction
2d: a9 00 00 01 00 test $0x10000,%eax
32: 75 1b jne 0x4f
34: 8b 0d ec 28 db b3 mov -0x4c24d714(%rip),%ecx # 0xffffffffb3db2926
3a: 31 f6 xor %esi,%esi
3c: a9 .byte 0xa9
3d: 87 04 00 xchg %eax,(%rax,%rax,1)

Code starting with the faulting instruction
===========================================
0: 8b 40 10 mov 0x10(%rax),%eax
3: a9 00 00 01 00 test $0x10000,%eax
8: 75 1b jne 0x25
a: 8b 0d ec 28 db b3 mov -0x4c24d714(%rip),%ecx # 0xffffffffb3db28fc
10: 31 f6 xor %esi,%esi
12: a9 .byte 0xa9
13: 87 04 00 xchg %eax,(%rax,%rax,1)
[ 7.115899][ T0] EAX: 00000000 EBX: 00000101 ECX: 00000200 EDX: 00000000
[ 7.116940][ T0] ESI: 00000009 EDI: 0000000e EBP: b2d07d18 ESP: b2d07cd4
[ 7.118013][ T0] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00210002
[ 7.119201][ T0] CR0: 80050033 CR2: 00000010 CR3: 03672000 CR4: 00000090
[ 7.120263][ T0] Call Trace:
[ 7.120791][ T0] Modules linked in:
[ 7.121455][ T0] CR2: 0000000000000010
[ 7.122145][ T0] ---[ end trace 0000000000000000 ]---
[ 7.123070][ T0] EIP: kmalloc_nolock_noprof (mm/slub.c:5607)
[ 7.123973][ T0] Code: 90 90 90 90 90 89 45 bc 0f bd 75 bc 75 05 be ff ff ff ff 46 83 fe 0e 0f 83 b6 01 00 00 6b c7 38 8b 84 b0 b4 79 d0 b2 89 45 ec <8b> 40 10 a9 00 00 01 00 75 1b 8b 0d ec 28 db b3 31 f6 a9 87 04 00


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250917/202509171214...@intel.com



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Vlastimil Babka

Sep 17, 2025, 4:03:29 AM
to kernel test robot, Alexei Starovoitov, Harry Yoo, oe-...@lists.linux.dev, l...@intel.com, kasa...@googlegroups.com, cgr...@vger.kernel.org, linu...@kvack.org
On 9/17/25 07:01, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:
>
> commit: db93cdd664fa02de9be883dd29343b21d8fc790f ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
> in testcase: boot
>
> config: i386-randconfig-062-20250913
> compiler: clang-20
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <olive...@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202509171214...@intel.com
>
>
> [ 7.101117][ T0] BUG: kernel NULL pointer dereference, address: 00000010
> [ 7.102290][ T0] #PF: supervisor read access in kernel mode
> [ 7.103219][ T0] #PF: error_code(0x0000) - not-present page
> [ 7.104161][ T0] *pde = 00000000
> [ 7.104762][ T0] Thread overran stack, or stack corrupted

Note this.

> [ 7.105726][ T0] Oops: Oops: 0000 [#1]
> [ 7.106410][ T0] CPU: 0 UID: 0 PID: 0 Comm: swapper Tainted: G T 6.17.0-rc3-00014-gdb93cdd664fa #1 NONE 40eff3b43e4f0000b061f2e660abd0b2911f31b1
> [ 7.108712][ T0] Tainted: [T]=RANDSTRUCT
> [ 7.109368][ T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [ 7.110952][ T0] EIP: kmalloc_nolock_noprof (mm/slub.c:5607)

That's here.
if (!(s->flags & __CMPXCHG_DOUBLE) && !kmem_cache_debug(s))

dmesg already contains the line "SLUB: HWalign=64, Order=0-3, MinObjects=0,
CPUs=1, Nodes=1", so all kmem caches are fully initialized and this doesn't
look like a bootstrap issue. It's probably due to the stack overflow and not
an actual bug on this line.
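
The reported address itself is consistent with the register dump: the trapping instruction reads at displacement 0x10 from EAX, EAX is 00000000, and CR2 is 00000010. A minimal sketch of that arithmetic follows. The struct layout is hypothetical (the real struct kmem_cache layout is randomized by RANDSTRUCT in this config and cannot be read off the log), and all `toy_` names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: 16 bytes of leading fields, then the field that
 * the trapping instruction loads. Not the real struct kmem_cache.
 */
struct toy_cache {
	uint32_t pad[4];	/* occupies offsets 0x00..0x0f */
	uint32_t flags;		/* lands at offset 0x10 in this toy layout */
};

/* Address touched by reading ->flags through a pointer with value 'base'. */
static uintptr_t toy_fault_address(uintptr_t base)
{
	return base + offsetof(struct toy_cache, flags);
}
```

With a base of 0 this gives 0x10, i.e. a NULL cache pointer dereferenced at a small field offset, which is what the oops reports.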

Because of that it's also unable to print the backtrace. But the only
kmalloc_nolock() usage for now is in slub itself, in alloc_slab_obj_exts():

	/* Prevent recursive extension vector allocation */
	gfp |= __GFP_NO_OBJ_EXT;
	if (unlikely(!allow_spin)) {
		size_t sz = objects * sizeof(struct slabobj_ext);

		vec = kmalloc_nolock(sz, __GFP_ZERO, slab_nid(slab));
	} else {
		vec = kcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
				   slab_nid(slab));
	}

Prevent recursive... hm? And we had stack overflow?
Also .config has CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT=y

So, this?
diff --git a/mm/slub.c b/mm/slub.c
index 837ee037abb5..c4f17ac6e4b6 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2092,7 +2092,8 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
 	if (unlikely(!allow_spin)) {
 		size_t sz = objects * sizeof(struct slabobj_ext);

-		vec = kmalloc_nolock(sz, __GFP_ZERO, slab_nid(slab));
+		vec = kmalloc_nolock(sz, __GFP_ZERO | __GFP_NO_OBJ_EXT,
+				     slab_nid(slab));
 	} else {
 		vec = kcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
 				   slab_nid(slab));
@@ -5591,7 +5592,8 @@ void *kmalloc_nolock_noprof(size_t size, gfp_t gfp_flags, int node)
 	bool can_retry = true;
 	void *ret = ERR_PTR(-EBUSY);

-	VM_WARN_ON_ONCE(gfp_flags & ~(__GFP_ACCOUNT | __GFP_ZERO));
+	VM_WARN_ON_ONCE(gfp_flags & ~(__GFP_ACCOUNT | __GFP_ZERO |
+				      __GFP_NO_OBJ_EXT));

 	if (unlikely(!size))
 		return ZERO_SIZE_PTR;
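
To illustrate why the missing flag can end in a stack overflow, here is a minimal userspace sketch. It is a toy model, not kernel code: the `toy_` names, the flag values and the depth limit are all made up, and it assumes (as a simplification) that every allocation without the no-extension flag asks for a new extension vector.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this model only; not the kernel's values. */
#define TOY_GFP_ZERO		0x1u
#define TOY_GFP_NO_OBJ_EXT	0x2u

#define TOY_MAX_DEPTH 64	/* stand-in for a finite kernel stack */

static bool toy_fixed;		/* true: pass the no-extension flag */
static int toy_depth;		/* current nesting of toy_alloc() */
static int toy_max_seen;	/* deepest nesting observed */
static bool toy_overflowed;	/* set when the depth limit is hit */

static void toy_alloc(unsigned int flags);

/* Models alloc_slab_obj_exts(): the vector itself comes from the allocator. */
static void toy_alloc_obj_exts(void)
{
	unsigned int flags = TOY_GFP_ZERO;

	if (toy_fixed)
		flags |= TOY_GFP_NO_OBJ_EXT;
	toy_alloc(flags);
}

/* Models an allocation whose hook wants an extension vector. */
static void toy_alloc(unsigned int flags)
{
	toy_depth++;
	if (toy_depth > toy_max_seen)
		toy_max_seen = toy_depth;

	if (toy_depth >= TOY_MAX_DEPTH)
		toy_overflowed = true;	/* a real kernel would overrun the stack */
	else if (!(flags & TOY_GFP_NO_OBJ_EXT))
		toy_alloc_obj_exts();	/* recurses unless the flag stops it */

	toy_depth--;
}

/* Run one top-level allocation; returns the deepest nesting reached. */
static int toy_run(bool fixed)
{
	toy_fixed = fixed;
	toy_depth = 0;
	toy_max_seen = 0;
	toy_overflowed = false;
	toy_alloc(TOY_GFP_ZERO);
	return toy_max_seen;
}
```

In this model toy_run(false) nests until it hits the depth limit, while toy_run(true) stops after the single nested allocation for the vector.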

Vlastimil Babka

Sep 17, 2025, 5:18:18 AM
to kernel test robot, Alexei Starovoitov, Harry Yoo, Suren Baghdasaryan, oe-...@lists.linux.dev, l...@intel.com, kasa...@googlegroups.com, cgr...@vger.kernel.org, linu...@kvack.org
On 9/17/25 10:03, Vlastimil Babka wrote:
> On 9/17/25 07:01, kernel test robot wrote:
>>
>>
>> Hello,
>>
>> kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:
>>
>> commit: db93cdd664fa02de9be883dd29343b21d8fc790f ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
>> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>>
>> in testcase: boot
>>
>> config: i386-randconfig-062-20250913
>> compiler: clang-20
>> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>>
>> (please refer to attached dmesg/kmsg for entire log/backtrace)

Managed to reproduce locally and my suggested fix works, so I'm going to fold
it unless there are objections or better suggestions.

Also I was curious to find out which path is triggered so I've put a
dump_stack() before the kmalloc_nolock call:

[ 0.731812][ T0] Call Trace:
[ 0.732406][ T0] __dump_stack+0x18/0x30
[ 0.733200][ T0] dump_stack_lvl+0x32/0x90
[ 0.734037][ T0] dump_stack+0xd/0x20
[ 0.734780][ T0] alloc_slab_obj_exts+0x181/0x1f0
[ 0.735862][ T0] __alloc_tagging_slab_alloc_hook+0xd1/0x330
[ 0.736988][ T0] ? __slab_alloc+0x4e/0x70
[ 0.737858][ T0] ? __set_page_owner+0x167/0x280
[ 0.738774][ T0] __kmalloc_cache_noprof+0x379/0x460
[ 0.739756][ T0] ? depot_fetch_stack+0x164/0x180
[ 0.740687][ T0] ? __set_page_owner+0x167/0x280
[ 0.741604][ T0] __set_page_owner+0x167/0x280
[ 0.742503][ T0] post_alloc_hook+0x17a/0x200
[ 0.743404][ T0] get_page_from_freelist+0x13b3/0x16b0
[ 0.744427][ T0] ? kvm_sched_clock_read+0xd/0x20
[ 0.745358][ T0] ? kvm_sched_clock_read+0xd/0x20
[ 0.746290][ T0] ? __next_zones_zonelist+0x26/0x60
[ 0.747265][ T0] __alloc_frozen_pages_noprof+0x143/0x1080
[ 0.748358][ T0] ? lock_acquire+0x8b/0x180
[ 0.749209][ T0] ? pcpu_alloc_noprof+0x181/0x800
[ 0.750198][ T0] ? sched_clock_noinstr+0x8/0x10
[ 0.751119][ T0] ? local_clock_noinstr+0x137/0x140
[ 0.752089][ T0] ? kvm_sched_clock_read+0xd/0x20
[ 0.753023][ T0] alloc_slab_page+0xda/0x150
[ 0.753879][ T0] new_slab+0xe1/0x500
[ 0.754615][ T0] ? kvm_sched_clock_read+0xd/0x20
[ 0.755577][ T0] ___slab_alloc+0xd79/0x1680
[ 0.756469][ T0] ? pcpu_alloc_noprof+0x538/0x800
[ 0.757408][ T0] ? __mutex_unlock_slowpath+0x195/0x3e0
[ 0.758446][ T0] __slab_alloc+0x4e/0x70
[ 0.759237][ T0] ? mm_alloc+0x38/0x80
[ 0.759993][ T0] kmem_cache_alloc_noprof+0x1db/0x470
[ 0.760993][ T0] ? mm_alloc+0x38/0x80
[ 0.761745][ T0] ? mm_alloc+0x38/0x80
[ 0.762506][ T0] mm_alloc+0x38/0x80
[ 0.763260][ T0] poking_init+0xe/0x80
[ 0.764032][ T0] start_kernel+0x16b/0x470
[ 0.764858][ T0] i386_start_kernel+0xce/0xf0
[ 0.765723][ T0] startup_32_smp+0x151/0x160

And the reason is we still have restricted gfp_allowed_mask at this point:
/* The GFP flags allowed during early boot */
#define GFP_BOOT_MASK (__GFP_BITS_MASK & ~(__GFP_RECLAIM|__GFP_IO|__GFP_FS))

It's only lifted to a full allowed mask later in the boot.
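
A small userspace sketch of the effect: while the boot mask is in force, even a normal request loses its reclaim bits. The bit values and `toy_` names are hypothetical, and the sketch assumes the "may spin" decision keys on the reclaim bits, as the discussion of __GFP_RECLAIM in this message suggests.

```c
#include <stdbool.h>

/* Hypothetical bit values for this model only; not the kernel's. */
#define TOY_GFP_DIRECT_RECLAIM	0x01u
#define TOY_GFP_KSWAPD_RECLAIM	0x02u
#define TOY_GFP_IO		0x04u
#define TOY_GFP_FS		0x08u
#define TOY_GFP_ZERO		0x10u

#define TOY_GFP_RECLAIM   (TOY_GFP_DIRECT_RECLAIM | TOY_GFP_KSWAPD_RECLAIM)
#define TOY_GFP_BITS_MASK 0x1fu
#define TOY_GFP_BOOT_MASK \
	(TOY_GFP_BITS_MASK & ~(TOY_GFP_RECLAIM | TOY_GFP_IO | TOY_GFP_FS))
#define TOY_GFP_KERNEL    (TOY_GFP_RECLAIM | TOY_GFP_IO | TOY_GFP_FS)

/* Starts out restricted, like the allowed mask in early boot. */
static unsigned int toy_gfp_allowed_mask = TOY_GFP_BOOT_MASK;

/* Flags as the allocator sees them after masking. */
static unsigned int toy_apply_allowed_mask(unsigned int gfp)
{
	return gfp & toy_gfp_allowed_mask;
}

/* Assumed rule: spinning is allowed only if a reclaim bit survived. */
static bool toy_allow_spinning(unsigned int gfp)
{
	return (gfp & TOY_GFP_RECLAIM) != 0;
}

/* Later in boot the mask is lifted to the full set of bits. */
static void toy_finish_boot(void)
{
	toy_gfp_allowed_mask = TOY_GFP_BITS_MASK;
}
```

Before toy_finish_boot() a TOY_GFP_KERNEL request is treated as "no spinning", which in the trace above corresponds to the path that ends up calling kmalloc_nolock().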

That means that, because "kmalloc_nolock() is not supported on architectures
that don't implement cmpxchg16b", such architectures will no longer get
objexts allocated in early boot. I guess that's not a big deal.

Also any later allocation whose flags end up without __GFP_RECLAIM for some
reason will also lose its objexts. Hope that's also acceptable.
I don't know if we can distinguish a real kmalloc_nolock() scope in
alloc_slab_obj_exts() without inventing new gfp flags or passing an extra
argument through several layers of functions.

Alexei Starovoitov

Sep 17, 2025, 2:38:56 PM
to Vlastimil Babka, kernel test robot, Alexei Starovoitov, Harry Yoo, Suren Baghdasaryan, oe-...@lists.linux.dev, kbuild test robot, kasan-dev, open list:CONTROL GROUP (CGROUP), linux-mm
On Wed, Sep 17, 2025 at 2:18 AM Vlastimil Babka <vba...@suse.cz> wrote:
>
> On 9/17/25 10:03, Vlastimil Babka wrote:
> > On 9/17/25 07:01, kernel test robot wrote:
> >>
> >>
> >> Hello,
> >>
> >> kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:
> >>
> >> commit: db93cdd664fa02de9be883dd29343b21d8fc790f ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
> >> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> >>
> >> in testcase: boot
> >>
> >> config: i386-randconfig-062-20250913
> >> compiler: clang-20
> >> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> >>
> >> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
> Managed to reproduce locally and my suggested fix works so I'm going to fold
> it unless there's objections or better suggestions.

Thanks for the fix. Not sure what I was thinking. __GFP_NO_OBJ_EXT
is obviously needed there.
Ohh. That's interesting.

> That means due to "kmalloc_nolock() is not supported on architectures that
> don't implement cmpxchg16b" such architectures will no longer get objexts
> allocated in early boot. I guess that's not a big deal.
>
> Also any later allocation having its flags screwed for some reason to not
> have __GFP_RECLAIM will also lose its objexts. Hope that's also acceptable.
> I don't know if we can distinguish a real kmalloc_nolock() scope in
> alloc_slab_obj_exts() without inventing new gfp flags or passing an extra
> argument through several layers of functions.

I think it's ok-ish.
Can we add a check to alloc_slab_obj_exts() that sets allow_spin=true
if we're in the boot phase? Like:
	if (gfp_allowed_mask != __GFP_BITS_MASK)
		allow_spin = true;
or is there some cleaner way to detect boot time by checking slab_state?
BPF is not active during boot, and nothing should be
calling kmalloc_nolock.
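
A minimal userspace sketch of how the proposed check would steer the allocation path. The mask values, the enum and the `toy_` names are hypothetical and only model the two branches in alloc_slab_obj_exts(); this is not the kernel implementation.

```c
#include <stdbool.h>

/* Hypothetical mask values for this model only. */
#define TOY_GFP_BITS_MASK 0x1fu
#define TOY_GFP_BOOT_MASK 0x10u	/* some bits still masked off during boot */

enum toy_path {
	TOY_PATH_KCALLOC,		/* normal, may-spin allocation */
	TOY_PATH_KMALLOC_NOLOCK		/* lockless allocation */
};

static enum toy_path toy_pick_path(bool allow_spin, unsigned int allowed_mask)
{
	/* Proposed check: a restricted mask means we are still booting. */
	if (allowed_mask != TOY_GFP_BITS_MASK)
		allow_spin = true;

	return allow_spin ? TOY_PATH_KCALLOC : TOY_PATH_KMALLOC_NOLOCK;
}
```

With the boot mask in force the sketch always picks the normal path, so only callers running after boot with allow_spin false would reach the lockless one.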

Vlastimil Babka

Sep 18, 2025, 3:06:43 AM
to Alexei Starovoitov, kernel test robot, Alexei Starovoitov, Harry Yoo, Suren Baghdasaryan, oe-...@lists.linux.dev, kbuild test robot, kasan-dev, open list:CONTROL GROUP (CGROUP), linux-mm
Checking the gfp_allowed_mask should work. Slab state is already UP so it
won't help, and this is not really about slab state anyway.
But is it worth it... Suren, what do you think?

Suren Baghdasaryan

Sep 18, 2025, 10:49:39 AM
to Vlastimil Babka, Alexei Starovoitov, kernel test robot, Alexei Starovoitov, Harry Yoo, oe-...@lists.linux.dev, kbuild test robot, kasan-dev, open list:CONTROL GROUP (CGROUP), linux-mm
Vlastimil's fix is correct. We definitely need __GFP_NO_OBJ_EXT when
allocating an obj_exts vector, otherwise it will try to recursively
allocate an obj_exts vector for obj_exts allocation.

For the additional __GFP_BITS_MASK check, that sounds good to me as
long as we add a comment on why that is there. Or maybe such a check
deserves to be placed in a separate function similar to
gfpflags_allow_{spinning | blocking}?

Alexei Starovoitov

Sep 18, 2025, 9:39:58 PM
to Suren Baghdasaryan, Vlastimil Babka, kernel test robot, Alexei Starovoitov, Harry Yoo, oe-...@lists.linux.dev, kbuild test robot, kasan-dev, open list:CONTROL GROUP (CGROUP), linux-mm
I would not. I think adding 'boot or not' logic to these two
will muddy the waters and will make the whole slab/page_alloc/memcg
logic and dependencies between them much harder to follow.
I'd either add a comment to alloc_slab_obj_exts() explaining
what may happen or add 'boot or not' check only there.
IMO this is niche, rare and special.

Suren Baghdasaryan

Sep 19, 2025, 11:01:51 AM
to Alexei Starovoitov, Vlastimil Babka, kernel test robot, Alexei Starovoitov, Harry Yoo, oe-...@lists.linux.dev, kbuild test robot, kasan-dev, open list:CONTROL GROUP (CGROUP), linux-mm
Ok, comment it is then.
Will you be sending a new version, or will Vlastimil be including that
in his fixup?

Alexei Starovoitov

Sep 19, 2025, 2:32:15 PM
to Suren Baghdasaryan, Vlastimil Babka, kernel test robot, Alexei Starovoitov, Harry Yoo, oe-...@lists.linux.dev, kbuild test robot, kasan-dev, open list:CONTROL GROUP (CGROUP), linux-mm
Whichever way. I can, but so far Vlastimil's phrasing of comments
has been much better than mine :) So I think he can fold what he prefers.