[syzbot ci] Re: Close race in freeing special fields and map value


syzbot ci

Feb 26, 2026, 4:31:55 AM (2 days ago) Feb 26
to and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, kerne...@meta.com, k...@meta.com, marti...@kernel.org, mem...@gmail.com, pau...@kernel.org, yats...@meta.com, syz...@lists.linux.dev, syzkall...@googlegroups.com
syzbot ci has tested the following series

[v1] Close race in freeing special fields and map value
https://lore.kernel.org/all/20260225185121....@gmail.com
* [PATCH bpf-next v1 1/4] bpf: Register dtor for freeing special fields
* [PATCH bpf-next v1 2/4] bpf: Delay freeing fields in local storage
* [PATCH bpf-next v1 3/4] bpf: Retire rcu_trace_implies_rcu_gp() from local storage
* [PATCH bpf-next v1 4/4] selftests/bpf: Add tests for special fields races

and found the following issue:
KASAN: slab-use-after-free Read in free_all

Full report is available here:
https://ci.syzbot.org/series/ad9c7ce6-d861-4afe-821a-e8fae6120b12

***

KASAN: slab-use-after-free Read in free_all

tree: bpf-next
URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next.git
base: f620af11c27b8ec9994a39fe968aa778112d1566
arch: amd64
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
config: https://ci.syzbot.org/builds/b2735b0b-963d-41a7-bb33-db10494a24e1/config
syz repro: https://ci.syzbot.org/findings/3ee61f99-7d80-4986-a0b1-492ffb6b1c46/syz_repro

==================================================================
BUG: KASAN: slab-use-after-free in free_all+0x84/0x140 kernel/bpf/memalloc.c:271
Read of size 8 at addr ffff888112ee9260 by task swapper/0/0

CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
<IRQ>
dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xba/0x230 mm/kasan/report.c:482
kasan_report+0x117/0x150 mm/kasan/report.c:595
free_all+0x84/0x140 kernel/bpf/memalloc.c:271
do_call_rcu_ttrace+0x385/0x400 kernel/bpf/memalloc.c:315
__free_by_rcu+0x23b/0x3f0 kernel/bpf/memalloc.c:381
rcu_do_batch kernel/rcu/tree.c:2617 [inline]
rcu_core+0x7cd/0x1070 kernel/rcu/tree.c:2869
handle_softirqs+0x22a/0x870 kernel/softirq.c:622
__do_softirq kernel/softirq.c:656 [inline]
invoke_softirq kernel/softirq.c:496 [inline]
__irq_exit_rcu+0x5f/0x150 kernel/softirq.c:723
irq_exit_rcu+0x9/0x30 kernel/softirq.c:739
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1056 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1056
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:697
RIP: 0010:pv_native_safe_halt+0xf/0x20 arch/x86/kernel/paravirt.c:63
Code: 8e 6d 02 c3 cc cc cc cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d 43 3d 1c 00 fb f4 <e9> 7c ea 02 00 cc cc cc cc cc cc cc cc cc cc cc cc 90 90 90 90 90
RSP: 0018:ffffffff8e407dc0 EFLAGS: 00000202
RAX: 00000000000d2297 RBX: ffffffff819a80ad RCX: 0000000080000001
RDX: 0000000000000001 RSI: ffffffff8def22fa RDI: ffffffff8c27a800
RBP: ffffffff8e407eb0 R08: ffff88812103395b R09: 1ffff1102420672b
R10: dffffc0000000000 R11: ffffed102420672c R12: ffffffff90117db0
R13: 1ffffffff1c929d8 R14: 0000000000000000 R15: 0000000000000000
arch_safe_halt arch/x86/kernel/process.c:766 [inline]
default_idle+0x9/0x20 arch/x86/kernel/process.c:767
default_idle_call+0x72/0xb0 kernel/sched/idle.c:122
cpuidle_idle_call kernel/sched/idle.c:191 [inline]
do_idle+0x1bd/0x500 kernel/sched/idle.c:332
cpu_startup_entry+0x43/0x60 kernel/sched/idle.c:430
rest_init+0x2de/0x300 init/main.c:760
start_kernel+0x385/0x3d0 init/main.c:1210
x86_64_start_reservations+0x24/0x30 arch/x86/kernel/head64.c:310
x86_64_start_kernel+0x143/0x1c0 arch/x86/kernel/head64.c:291
common_startup_64+0x13e/0x147
</TASK>

Allocated by task 5961:
kasan_save_stack mm/kasan/common.c:57 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
poison_kmalloc_redzone mm/kasan/common.c:398 [inline]
__kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:415
kasan_kmalloc include/linux/kasan.h:263 [inline]
__do_kmalloc_node mm/slub.c:5219 [inline]
__kmalloc_node_noprof+0x4e0/0x7c0 mm/slub.c:5225
kmalloc_node_noprof include/linux/slab.h:1093 [inline]
__bpf_map_area_alloc kernel/bpf/syscall.c:398 [inline]
bpf_map_area_alloc+0x64/0x170 kernel/bpf/syscall.c:411
trie_alloc+0x14f/0x340 kernel/bpf/lpm_trie.c:588
map_create+0xafd/0x16a0 kernel/bpf/syscall.c:1507
__sys_bpf+0x6e1/0x950 kernel/bpf/syscall.c:6210
__do_sys_bpf kernel/bpf/syscall.c:6341 [inline]
__se_sys_bpf kernel/bpf/syscall.c:6339 [inline]
__x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:6339
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 83:
kasan_save_stack mm/kasan/common.c:57 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584
poison_slab_object mm/kasan/common.c:253 [inline]
__kasan_slab_free+0x5c/0x80 mm/kasan/common.c:285
kasan_slab_free include/linux/kasan.h:235 [inline]
slab_free_hook mm/slub.c:2687 [inline]
slab_free mm/slub.c:6124 [inline]
kfree+0x1c1/0x630 mm/slub.c:6442
bpf_map_free kernel/bpf/syscall.c:892 [inline]
bpf_map_free_deferred+0x217/0x460 kernel/bpf/syscall.c:919
process_one_work kernel/workqueue.c:3275 [inline]
process_scheduled_works+0xb02/0x1830 kernel/workqueue.c:3358
worker_thread+0xa50/0xfc0 kernel/workqueue.c:3439
kthread+0x388/0x470 kernel/kthread.c:467
ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

Last potentially related work creation:
kasan_save_stack+0x3e/0x60 mm/kasan/common.c:57
kasan_record_aux_stack+0xbd/0xd0 mm/kasan/generic.c:556
insert_work+0x3d/0x330 kernel/workqueue.c:2199
__queue_work+0xd03/0x1020 kernel/workqueue.c:2354
queue_work_on+0x106/0x1d0 kernel/workqueue.c:2405
bpf_map_put_with_uref kernel/bpf/syscall.c:975 [inline]
bpf_map_release+0x127/0x140 kernel/bpf/syscall.c:985
__fput+0x44f/0xa70 fs/file_table.c:469
task_work_run+0x1d9/0x270 kernel/task_work.c:233
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
__exit_to_user_mode_loop kernel/entry/common.c:67 [inline]
exit_to_user_mode_loop+0xed/0x480 kernel/entry/common.c:98
__exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:325 [inline]
do_syscall_64+0x32d/0xf80 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at ffff888112ee9000
which belongs to the cache kmalloc-cg-1k of size 1024
The buggy address is located 608 bytes inside of
freed 1024-byte region [ffff888112ee9000, ffff888112ee9400)

The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x112ee8
head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
memcg:ffff888106b17c01
flags: 0x17ff00000000040(head|node=0|zone=2|lastcpupid=0x7ff)
page_type: f5(slab)
raw: 017ff00000000040 ffff88810005d280 dead000000000100 dead000000000122
raw: 0000000000000000 0000000800100010 00000000f5000000 ffff888106b17c01
head: 017ff00000000040 ffff88810005d280 dead000000000100 dead000000000122
head: 0000000000000000 0000000800100010 00000000f5000000 ffff888106b17c01
head: 017ff00000000003 ffffea00044bba01 00000000ffffffff 00000000ffffffff
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 5550, tgid 5550 (dhcpcd), ts 37867090520, free_ts 37834172120
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x231/0x280 mm/page_alloc.c:1889
prep_new_page mm/page_alloc.c:1897 [inline]
get_page_from_freelist+0x24dc/0x2580 mm/page_alloc.c:3962
__alloc_frozen_pages_noprof+0x18d/0x380 mm/page_alloc.c:5250
alloc_slab_page mm/slub.c:3255 [inline]
allocate_slab+0x77/0x660 mm/slub.c:3444
new_slab mm/slub.c:3502 [inline]
refill_objects+0x331/0x3c0 mm/slub.c:7134
refill_sheaf mm/slub.c:2804 [inline]
__pcs_replace_empty_main+0x2b9/0x620 mm/slub.c:4578
alloc_from_pcs mm/slub.c:4681 [inline]
slab_alloc_node mm/slub.c:4815 [inline]
__do_kmalloc_node mm/slub.c:5218 [inline]
__kmalloc_node_track_caller_noprof+0x572/0x7b0 mm/slub.c:5327
kmalloc_reserve net/core/skbuff.c:635 [inline]
__alloc_skb+0x2c1/0x7d0 net/core/skbuff.c:713
alloc_skb include/linux/skbuff.h:1383 [inline]
alloc_skb_with_frags+0xca/0x890 net/core/skbuff.c:6750
sock_alloc_send_pskb+0x878/0x990 net/core/sock.c:2995
unix_dgram_sendmsg+0x460/0x18e0 net/unix/af_unix.c:2125
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
sock_write_iter+0x503/0x550 net/socket.c:1195
do_iter_readv_writev+0x619/0x8c0 fs/read_write.c:-1
vfs_writev+0x33c/0x990 fs/read_write.c:1059
do_writev+0x154/0x2e0 fs/read_write.c:1105
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
page last free pid 5678 tgid 5678 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
__free_pages_prepare mm/page_alloc.c:1433 [inline]
__free_frozen_pages+0xc2b/0xdb0 mm/page_alloc.c:2978
__slab_free+0x263/0x2b0 mm/slub.c:5532
qlink_free mm/kasan/quarantine.c:163 [inline]
qlist_free_all+0x97/0x100 mm/kasan/quarantine.c:179
kasan_quarantine_reduce+0x148/0x160 mm/kasan/quarantine.c:286
__kasan_slab_alloc+0x22/0x80 mm/kasan/common.c:350
kasan_slab_alloc include/linux/kasan.h:253 [inline]
slab_post_alloc_hook mm/slub.c:4501 [inline]
slab_alloc_node mm/slub.c:4830 [inline]
__do_kmalloc_node mm/slub.c:5218 [inline]
__kmalloc_noprof+0x316/0x760 mm/slub.c:5231
kmalloc_noprof include/linux/slab.h:966 [inline]
tomoyo_realpath_from_path+0xe3/0x5d0 security/tomoyo/realpath.c:251
tomoyo_realpath_nofollow+0xc3/0x110 security/tomoyo/realpath.c:304
tomoyo_find_next_domain+0x259/0x1aa0 security/tomoyo/domain.c:731
tomoyo_bprm_check_security+0x11b/0x180 security/tomoyo/tomoyo.c:102
security_bprm_check+0x85/0x240 security/security.c:794
search_binary_handler fs/exec.c:1654 [inline]
exec_binprm fs/exec.c:1696 [inline]
bprm_execve+0x896/0x1460 fs/exec.c:1748
do_execveat_common+0x50d/0x690 fs/exec.c:1846
__do_sys_execve fs/exec.c:1930 [inline]
__se_sys_execve fs/exec.c:1924 [inline]
__x64_sys_execve+0x97/0xc0 fs/exec.c:1924
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Memory state around the buggy address:
ffff888112ee9100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888112ee9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888112ee9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888112ee9280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888112ee9300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
----------------
Code disassembly (best guess):
0: 8e 6d 02 mov 0x2(%rbp),%gs
3: c3 ret
4: cc int3
5: cc int3
6: cc int3
7: cc int3
8: cc int3
9: cc int3
a: cc int3
b: 90 nop
c: 90 nop
d: 90 nop
e: 90 nop
f: 90 nop
10: 90 nop
11: 90 nop
12: 90 nop
13: 90 nop
14: 90 nop
15: 90 nop
16: 90 nop
17: 90 nop
18: 90 nop
19: 90 nop
1a: 90 nop
1b: f3 0f 1e fa endbr64
1f: eb 07 jmp 0x28
21: 0f 00 2d 43 3d 1c 00 verw 0x1c3d43(%rip) # 0x1c3d6b
28: fb sti
29: f4 hlt
* 2a: e9 7c ea 02 00 jmp 0x2eaab <-- trapping instruction
2f: cc int3
30: cc int3
31: cc int3
32: cc int3
33: cc int3
34: cc int3
35: cc int3
36: cc int3
37: cc int3
38: cc int3
39: cc int3
3a: cc int3
3b: 90 nop
3c: 90 nop
3d: 90 nop
3e: 90 nop
3f: 90 nop


***

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
Tested-by: syz...@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzk...@googlegroups.com.

Kumar Kartikeya Dwivedi

Feb 26, 2026, 10:06:05 AM (2 days ago) Feb 26
to syzbot ci, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, kerne...@meta.com, k...@meta.com, marti...@kernel.org, pau...@kernel.org, yats...@meta.com, syz...@lists.linux.dev, syzkall...@googlegroups.com
> [...]

This happens because 'ma' is embedded in the map struct, and we have an
optimization that speeds up map destruction by copying 'ma' out instead
of waiting on rcu_barrier() for pending callbacks. Typically no
callbacks are still in flight by the time the map is gone, so tests
never caught this, but a UAF can happen once we kmemdup() and proceed
with freeing the bpf_map, at which point the 'ma' pointer becomes invalid.

Just storing the dtor and dtor ctx directly is not ok either, since the
dtor accesses the map struct and its btf_record to decide how to free
fields, which causes the same UAF. The right way, as far as I can tell,
is to duplicate the btf_record (it's not very big anyway) from the map
and store the duplicate as the dtor ctx. To remove the dependency on
ma->dtor, store the dtor directly in the bpf_mem_caches along with the
dtor ctx. The dtor can then take care of everything else.

Kumar Kartikeya Dwivedi

Feb 26, 2026, 12:37:00 PM (2 days ago) Feb 26
to syzbot+ci382...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, kerne...@meta.com, k...@meta.com, marti...@kernel.org, mem...@gmail.com, pau...@kernel.org, syz...@lists.linux.dev, syzkall...@googlegroups.com, yats...@meta.com
#syz test
---

diff --git a/include/linux/bpf_mem_alloc.h b/include/linux/bpf_mem_alloc.h
index 7517eaf94ac9..4ce0d27f8ea2 100644
--- a/include/linux/bpf_mem_alloc.h
+++ b/include/linux/bpf_mem_alloc.h
@@ -14,7 +14,7 @@ struct bpf_mem_alloc {
struct obj_cgroup *objcg;
bool percpu;
struct work_struct work;
- void (*dtor)(void *obj, void *ctx);
+ void (*dtor_ctx_free)(void *ctx);
void *dtor_ctx;
};

@@ -35,7 +35,9 @@ int bpf_mem_alloc_percpu_init(struct bpf_mem_alloc *ma, struct obj_cgroup *objcg
int bpf_mem_alloc_percpu_unit_init(struct bpf_mem_alloc *ma, int size);
void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma);
void bpf_mem_alloc_set_dtor(struct bpf_mem_alloc *ma,
- void (*dtor)(void *obj, void *ctx), void *ctx);
+ void (*dtor)(void *obj, void *ctx),
+ void (*dtor_ctx_free)(void *ctx),
+ void *ctx);

/* Check the allocation size for kmalloc equivalent allocator */
int bpf_mem_alloc_check_size(bool percpu, size_t size);
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 74f7a6f44c50..582f0192b7e1 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -125,6 +125,11 @@ struct htab_elem {
char key[] __aligned(8);
};

+struct htab_btf_record {
+ struct btf_record *record;
+ u32 key_size;
+};
+
static inline bool htab_is_prealloc(const struct bpf_htab *htab)
{
return !(htab->map.map_flags & BPF_F_NO_PREALLOC);
@@ -459,28 +464,80 @@ static int htab_map_alloc_check(union bpf_attr *attr)

static void htab_mem_dtor(void *obj, void *ctx)
{
- struct bpf_htab *htab = ctx;
+ struct htab_btf_record *hrec = ctx;
struct htab_elem *elem = obj;
void *map_value;

- if (IS_ERR_OR_NULL(htab->map.record))
+ if (IS_ERR_OR_NULL(hrec->record))
return;

- map_value = htab_elem_value(elem, htab->map.key_size);
- bpf_obj_free_fields(htab->map.record, map_value);
+ map_value = htab_elem_value(elem, hrec->key_size);
+ bpf_obj_free_fields(hrec->record, map_value);
}

static void htab_pcpu_mem_dtor(void *obj, void *ctx)
{
- struct bpf_htab *htab = ctx;
void __percpu *pptr = *(void __percpu **)obj;
+ struct htab_btf_record *hrec = ctx;
int cpu;

- if (IS_ERR_OR_NULL(htab->map.record))
+ if (IS_ERR_OR_NULL(hrec->record))
return;

for_each_possible_cpu(cpu)
- bpf_obj_free_fields(htab->map.record, per_cpu_ptr(pptr, cpu));
+ bpf_obj_free_fields(hrec->record, per_cpu_ptr(pptr, cpu));
+}
+
+static void htab_dtor_ctx_free(void *ctx)
+{
+ struct htab_btf_record *hrec = ctx;
+
+ btf_record_free(hrec->record);
+ kfree(ctx);
+}
+
+static int htab_set_dtor(const struct bpf_htab *htab, void (*dtor)(void *, void *))
+{
+ u32 key_size = htab->map.key_size;
+ const struct bpf_mem_alloc *ma;
+ struct htab_btf_record *hrec;
+ int err;
+
+ /* No need for dtors. */
+ if (IS_ERR_OR_NULL(htab->map.record))
+ return 0;
+
+ hrec = kzalloc(sizeof(*hrec), GFP_KERNEL);
+ if (!hrec)
+ return -ENOMEM;
+ hrec->key_size = key_size;
+ hrec->record = btf_record_dup(htab->map.record);
+ if (IS_ERR(hrec->record)) {
+ err = PTR_ERR(hrec->record);
+ kfree(hrec);
+ return err;
+ }
+ ma = htab_is_percpu(htab) ? &htab->pcpu_ma : &htab->ma;
+ /* Kinda sad, but cast away const-ness since we change ma->dtor. */
+ bpf_mem_alloc_set_dtor((struct bpf_mem_alloc *)ma, dtor, htab_dtor_ctx_free, hrec);
+ return 0;
+}
+
+static int htab_map_check_btf(const struct bpf_map *map, const struct btf *btf,
+ const struct btf_type *key_type, const struct btf_type *value_type)
+{
+ struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
+
+ if (htab_is_prealloc(htab))
+ return 0;
+ /*
+ * We must set the dtor using this callback, as map's BTF record is not
+ * populated in htab_map_alloc(), so it will always appear as NULL.
+ */
+ if (htab_is_percpu(htab))
+ return htab_set_dtor(htab, htab_pcpu_mem_dtor);
+ else
+ return htab_set_dtor(htab, htab_mem_dtor);
}

static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
@@ -595,17 +652,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
round_up(htab->map.value_size, 8), true);
if (err)
goto free_map_locked;
- /* See comment below */
- bpf_mem_alloc_set_dtor(&htab->pcpu_ma, htab_pcpu_mem_dtor, htab);
- } else {
- /*
- * Register the dtor unconditionally. map->record is
- * set by map_create() after map_alloc() returns, so it
- * is always NULL at this point. The dtor checks
- * IS_ERR_OR_NULL(htab->map.record) and becomes a no-op
- * for maps without special fields.
- */
- bpf_mem_alloc_set_dtor(&htab->ma, htab_mem_dtor, htab);
}
}

@@ -2318,6 +2364,7 @@ const struct bpf_map_ops htab_map_ops = {
.map_seq_show_elem = htab_map_seq_show_elem,
.map_set_for_each_callback_args = map_set_for_each_callback_args,
.map_for_each_callback = bpf_for_each_hash_elem,
+ .map_check_btf = htab_map_check_btf,
.map_mem_usage = htab_map_mem_usage,
BATCH_OPS(htab),
.map_btf_id = &htab_map_btf_ids[0],
@@ -2340,6 +2387,7 @@ const struct bpf_map_ops htab_lru_map_ops = {
.map_seq_show_elem = htab_map_seq_show_elem,
.map_set_for_each_callback_args = map_set_for_each_callback_args,
.map_for_each_callback = bpf_for_each_hash_elem,
+ .map_check_btf = htab_map_check_btf,
.map_mem_usage = htab_map_mem_usage,
BATCH_OPS(htab_lru),
.map_btf_id = &htab_map_btf_ids[0],
@@ -2519,6 +2567,7 @@ const struct bpf_map_ops htab_percpu_map_ops = {
.map_seq_show_elem = htab_percpu_map_seq_show_elem,
.map_set_for_each_callback_args = map_set_for_each_callback_args,
.map_for_each_callback = bpf_for_each_hash_elem,
+ .map_check_btf = htab_map_check_btf,
.map_mem_usage = htab_map_mem_usage,
BATCH_OPS(htab_percpu),
.map_btf_id = &htab_map_btf_ids[0],
@@ -2539,6 +2588,7 @@ const struct bpf_map_ops htab_lru_percpu_map_ops = {
.map_seq_show_elem = htab_percpu_map_seq_show_elem,
.map_set_for_each_callback_args = map_set_for_each_callback_args,
.map_for_each_callback = bpf_for_each_hash_elem,
+ .map_check_btf = htab_map_check_btf,
.map_mem_usage = htab_map_mem_usage,
BATCH_OPS(htab_lru_percpu),
.map_btf_id = &htab_map_btf_ids[0],
diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
index 137e855c718b..682a9f34214b 100644
--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ -102,7 +102,8 @@ struct bpf_mem_cache {
int percpu_size;
bool draining;
struct bpf_mem_cache *tgt;
- struct bpf_mem_alloc *ma;
+ void (*dtor)(void *obj, void *ctx);
+ void *dtor_ctx;

/* list of objects to be freed after RCU GP */
struct llist_head free_by_rcu;
@@ -261,15 +262,14 @@ static void free_one(void *obj, bool percpu)
kfree(obj);
}

-static int free_all(struct llist_node *llnode, bool percpu,
- struct bpf_mem_alloc *ma)
+static int free_all(struct bpf_mem_cache *c, struct llist_node *llnode, bool percpu)
{
struct llist_node *pos, *t;
int cnt = 0;

llist_for_each_safe(pos, t, llnode) {
- if (ma->dtor)
- ma->dtor((void *)pos + LLIST_NODE_SZ, ma->dtor_ctx);
+ if (c->dtor)
+ c->dtor((void *)pos + LLIST_NODE_SZ, c->dtor_ctx);
free_one(pos, percpu);
cnt++;
}
@@ -280,7 +280,7 @@ static void __free_rcu(struct rcu_head *head)
{
struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu_ttrace);

- free_all(llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size, c->ma);
+ free_all(c, llist_del_all(&c->waiting_for_gp_ttrace), !!c->percpu_size);
atomic_set(&c->call_rcu_ttrace_in_progress, 0);
}

@@ -312,7 +312,7 @@ static void do_call_rcu_ttrace(struct bpf_mem_cache *c)
if (atomic_xchg(&c->call_rcu_ttrace_in_progress, 1)) {
if (unlikely(READ_ONCE(c->draining))) {
llnode = llist_del_all(&c->free_by_rcu_ttrace);
- free_all(llnode, !!c->percpu_size, c->ma);
+ free_all(c, llnode, !!c->percpu_size);
}
return;
}
@@ -421,7 +421,7 @@ static void check_free_by_rcu(struct bpf_mem_cache *c)
dec_active(c, &flags);

if (unlikely(READ_ONCE(c->draining))) {
- free_all(llist_del_all(&c->waiting_for_gp), !!c->percpu_size, c->ma);
+ free_all(c, llist_del_all(&c->waiting_for_gp), !!c->percpu_size);
atomic_set(&c->call_rcu_in_progress, 0);
} else {
call_rcu_hurry(&c->rcu, __free_by_rcu);
@@ -546,7 +546,6 @@ int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size, bool percpu)
c->objcg = objcg;
c->percpu_size = percpu_size;
c->tgt = c;
- c->ma = ma;
init_refill_work(c);
prefill_mem_cache(c, cpu);
}
@@ -569,7 +568,6 @@ int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size, bool percpu)
c->objcg = objcg;
c->percpu_size = percpu_size;
c->tgt = c;
- c->ma = ma;

init_refill_work(c);
prefill_mem_cache(c, cpu);
@@ -622,7 +620,6 @@ int bpf_mem_alloc_percpu_unit_init(struct bpf_mem_alloc *ma, int size)
c->objcg = objcg;
c->percpu_size = percpu_size;
c->tgt = c;
- c->ma = ma;

init_refill_work(c);
prefill_mem_cache(c, cpu);
@@ -631,7 +628,7 @@ int bpf_mem_alloc_percpu_unit_init(struct bpf_mem_alloc *ma, int size)
return 0;
}

-static void drain_mem_cache(struct bpf_mem_cache *c, struct bpf_mem_alloc *ma)
+static void drain_mem_cache(struct bpf_mem_cache *c)
{
bool percpu = !!c->percpu_size;

@@ -642,13 +639,13 @@ static void drain_mem_cache(struct bpf_mem_cache *c, struct bpf_mem_alloc *ma)
* Except for waiting_for_gp_ttrace list, there are no concurrent operations
* on these lists, so it is safe to use __llist_del_all().
*/
- free_all(llist_del_all(&c->free_by_rcu_ttrace), percpu, ma);
- free_all(llist_del_all(&c->waiting_for_gp_ttrace), percpu, ma);
- free_all(__llist_del_all(&c->free_llist), percpu, ma);
- free_all(__llist_del_all(&c->free_llist_extra), percpu, ma);
- free_all(__llist_del_all(&c->free_by_rcu), percpu, ma);
- free_all(__llist_del_all(&c->free_llist_extra_rcu), percpu, ma);
- free_all(llist_del_all(&c->waiting_for_gp), percpu, ma);
+ free_all(c, llist_del_all(&c->free_by_rcu_ttrace), percpu);
+ free_all(c, llist_del_all(&c->waiting_for_gp_ttrace), percpu);
+ free_all(c, __llist_del_all(&c->free_llist), percpu);
+ free_all(c, __llist_del_all(&c->free_llist_extra), percpu);
+ free_all(c, __llist_del_all(&c->free_by_rcu), percpu);
+ free_all(c, __llist_del_all(&c->free_llist_extra_rcu), percpu);
+ free_all(c, llist_del_all(&c->waiting_for_gp), percpu);
}

static void check_mem_cache(struct bpf_mem_cache *c)
@@ -687,6 +684,9 @@ static void check_leaked_objs(struct bpf_mem_alloc *ma)

static void free_mem_alloc_no_barrier(struct bpf_mem_alloc *ma)
{
+ /* We can free dtor ctx only once all callbacks are done using it. */
+ if (ma->dtor_ctx_free)
+ ma->dtor_ctx_free(ma->dtor_ctx);
check_leaked_objs(ma);
free_percpu(ma->cache);
free_percpu(ma->caches);
@@ -758,7 +758,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
c = per_cpu_ptr(ma->cache, cpu);
WRITE_ONCE(c->draining, true);
irq_work_sync(&c->refill_work);
- drain_mem_cache(c, ma);
+ drain_mem_cache(c);
rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
}
@@ -773,7 +773,7 @@ void bpf_mem_alloc_destroy(struct bpf_mem_alloc *ma)
c = &cc->cache[i];
WRITE_ONCE(c->draining, true);
irq_work_sync(&c->refill_work);
- drain_mem_cache(c, ma);
+ drain_mem_cache(c);
rcu_in_progress += atomic_read(&c->call_rcu_ttrace_in_progress);
rcu_in_progress += atomic_read(&c->call_rcu_in_progress);
}
@@ -1022,9 +1022,31 @@ int bpf_mem_alloc_check_size(bool percpu, size_t size)
return 0;
}

-void bpf_mem_alloc_set_dtor(struct bpf_mem_alloc *ma,
- void (*dtor)(void *obj, void *ctx), void *ctx)
+void bpf_mem_alloc_set_dtor(struct bpf_mem_alloc *ma, void (*dtor)(void *obj, void *ctx),
+ void (*dtor_ctx_free)(void *ctx), void *ctx)
{
- ma->dtor = dtor;
+ struct bpf_mem_caches *cc;
+ struct bpf_mem_cache *c;
+ int cpu, i;
+
+ ma->dtor_ctx_free = dtor_ctx_free;
ma->dtor_ctx = ctx;
+
+ if (ma->cache) {
+ for_each_possible_cpu(cpu) {
+ c = per_cpu_ptr(ma->cache, cpu);
+ c->dtor = dtor;
+ c->dtor_ctx = ctx;
+ }
+ }
+ if (ma->caches) {
+ for_each_possible_cpu(cpu) {
+ cc = per_cpu_ptr(ma->caches, cpu);
+ for (i = 0; i < NUM_CACHES; i++) {
+ c = &cc->cache[i];
+ c->dtor = dtor;
+ c->dtor_ctx = ctx;
+ }
+ }
+ }
}

syzbot ci

Feb 26, 2026, 12:38:03 PM (2 days ago) Feb 26
to mem...@gmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, kerne...@meta.com, k...@meta.com, marti...@kernel.org, mem...@gmail.com, pau...@kernel.org, syz...@lists.linux.dev, syzkall...@googlegroups.com, yats...@meta.com

Unknown command

Kumar Kartikeya Dwivedi

Feb 26, 2026, 1:28:39 PM (2 days ago) Feb 26
to syzbot+ci382...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, kerne...@meta.com, k...@meta.com, marti...@kernel.org, mem...@gmail.com, pau...@kernel.org, syz...@lists.linux.dev, syzkall...@googlegroups.com, yats...@meta.com
#syz test:

syzbot ci

Feb 26, 2026, 1:30:19 PM (2 days ago) Feb 26
to mem...@gmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, kerne...@meta.com, k...@meta.com, marti...@kernel.org, mem...@gmail.com, pau...@kernel.org, syz...@lists.linux.dev, syzkall...@googlegroups.com, yats...@meta.com

Unknown command

Kumar Kartikeya Dwivedi

Feb 26, 2026, 1:36:56 PM (2 days ago) Feb 26
to syzbot+ci382...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, kerne...@meta.com, k...@meta.com, marti...@kernel.org, mem...@gmail.com, pau...@kernel.org, syz...@lists.linux.dev, syzkall...@googlegroups.com, yats...@meta.com
Main things to remember:
- dtor is set from map_check_btf() because the record cannot be
inspected in alloc.
- set the dtor for each cache individually.
- free the ctx only after all callbacks are done running.

Signed-off-by: Kumar Kartikeya Dwivedi <mem...@gmail.com>
---
#syz test

include/linux/bpf_mem_alloc.h | 6 ++-
kernel/bpf/hashtab.c | 86 +++++++++++++++++++++++++++--------
kernel/bpf/memalloc.c | 70 ++++++++++++++++++----------
3 files changed, 118 insertions(+), 44 deletions(-)
--
2.47.3

syzbot ci

Feb 26, 2026, 1:42:23 PM (2 days ago) Feb 26
to mem...@gmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, kerne...@meta.com, k...@meta.com, marti...@kernel.org, mem...@gmail.com, pau...@kernel.org, syz...@lists.linux.dev, syzkall...@googlegroups.com, yats...@meta.com

Unknown command
