[syzbot] [bpf?] KCSAN: data-race in __htab_map_lookup_elem / bpf_lru_pop_free

syzbot

Jun 10, 2025, 4:01:39 AM
to and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
Hello,

syzbot found the following issue on:

HEAD commit: 19272b37aa4f Linux 6.16-rc1
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=101f5a0c580000
kernel config: https://syzkaller.appspot.com/x/.config?x=f437300db311c188
dashboard link: https://syzkaller.appspot.com/bug?extid=ad4661d6ca888ce7fe11
compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/ab33a6ff9377/disk-19272b37.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/d5cfaf818a35/vmlinux-19272b37.xz
kernel image: https://storage.googleapis.com/syzbot-assets/186f6b167a3a/bzImage-19272b37.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ad4661...@syzkaller.appspotmail.com

==================================================================
BUG: KCSAN: data-race in __htab_map_lookup_elem / bpf_lru_pop_free

write to 0xffff8881042a62e8 of 4 bytes by task 22653 on cpu 0:
__local_list_add_pending kernel/bpf/bpf_lru_list.c:358 [inline]
bpf_common_lru_pop_free kernel/bpf/bpf_lru_list.c:457 [inline]
bpf_lru_pop_free+0xbd4/0xcb0 kernel/bpf/bpf_lru_list.c:504
prealloc_lru_pop kernel/bpf/hashtab.c:303 [inline]
__htab_lru_percpu_map_update_elem+0xea/0x600 kernel/bpf/hashtab.c:1349
bpf_percpu_hash_update+0x61/0xa0 kernel/bpf/hashtab.c:2408
bpf_map_update_value+0x297/0x3a0 kernel/bpf/syscall.c:266
generic_map_update_batch+0x3f5/0x540 kernel/bpf/syscall.c:1982
bpf_map_do_batch+0x255/0x380 kernel/bpf/syscall.c:5344
__sys_bpf+0x2e0/0x790 kernel/bpf/syscall.c:-1
__do_sys_bpf kernel/bpf/syscall.c:5943 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5941 [inline]
__x64_sys_bpf+0x41/0x50 kernel/bpf/syscall.c:5941
x64_sys_call+0x2478/0x2fb0 arch/x86/include/generated/asm/syscalls_64.h:322
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xd2/0x200 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff8881042a62e8 of 4 bytes by task 22637 on cpu 1:
lookup_nulls_elem_raw kernel/bpf/hashtab.c:643 [inline]
__htab_map_lookup_elem+0xab/0x150 kernel/bpf/hashtab.c:673
htab_lru_percpu_map_lookup_elem+0x20/0xb0 kernel/bpf/hashtab.c:2342
bpf_prog_1592a6279ab44e8a+0x48/0x50
bpf_dispatcher_nop_func include/linux/bpf.h:1322 [inline]
__bpf_prog_run include/linux/filter.h:718 [inline]
bpf_prog_run include/linux/filter.h:725 [inline]
__bpf_trace_run kernel/trace/bpf_trace.c:2258 [inline]
bpf_trace_run2+0x107/0x1c0 kernel/trace/bpf_trace.c:2299
__traceiter_kfree+0x2e/0x50 include/trace/events/kmem.h:94
__do_trace_kfree include/trace/events/kmem.h:94 [inline]
trace_kfree include/trace/events/kmem.h:94 [inline]
kfree+0x27b/0x320 mm/slub.c:4829
___sys_recvmsg+0x135/0x370 net/socket.c:2829
do_recvmmsg+0x1ef/0x540 net/socket.c:2923
__sys_recvmmsg net/socket.c:2997 [inline]
__do_sys_recvmmsg net/socket.c:3020 [inline]
__se_sys_recvmmsg net/socket.c:3013 [inline]
__x64_sys_recvmmsg+0xe5/0x170 net/socket.c:3013
x64_sys_call+0x1c6a/0x2fb0 arch/x86/include/generated/asm/syscalls_64.h:300
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xd2/0x200 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

value changed: 0x3dd8f34f -> 0x7cc9e3a7

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 22637 Comm: syz.3.6512 Tainted: G W 6.16.0-rc1-syzkaller #0 PREEMPT(voluntary)
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Shankari

Jul 1, 2025, 5:39:49 PM
to syzkaller-bugs
Hi,


> BUG: KCSAN: data-race in __htab_map_lookup_elem / bpf_lru_pop_free

I ran 6.16-rc4 with KCSAN enabled and a stress test that drives concurrent lookups and evictions on an LRU hash map. I have not reproduced the race yet, but the access to the `ref` field looks potentially unsynchronized.

Would converting `ref` to an `atomic_t`, or wrapping it with `READ_ONCE()`/`WRITE_ONCE()`, be the appropriate direction to fix this? Or is there a higher-level guarantee that’s supposed to prevent this race?
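
For reference, here is a minimal user-space sketch of what the `READ_ONCE()`/`WRITE_ONCE()` option could look like. Everything in it is an assumption for illustration: the macros are simplified stand-ins for the kernel ones (they rely on the GCC/Clang `__typeof__` extension), and `struct demo_lru_node` with its `demo_node_*` helpers is hypothetical, not the kernel's `struct bpf_lru_node` or its existing accessors.

```c
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel macros (assumption:
 * a volatile access is enough to model "marked" accesses here). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical, reduced node layout used only for this sketch. */
struct demo_lru_node {
	uint8_t ref;
};

/* Lookup side: set the reference bit with marked accesses, and skip
 * the store when the bit is already set to avoid dirtying the line. */
static inline void demo_node_mark_ref(struct demo_lru_node *node)
{
	if (!READ_ONCE(node->ref))
		WRITE_ONCE(node->ref, 1);
}

/* Eviction/reuse side: clear the reference bit with a marked store. */
static inline void demo_node_clear_ref(struct demo_lru_node *node)
{
	WRITE_ONCE(node->ref, 0);
}

/* Rotation side: sample the bit once so the decision uses one value. */
static inline int demo_node_is_ref(struct demo_lru_node *node)
{
	return READ_ONCE(node->ref);
}
```

Marked accesses like these only document an intentional race and stop the compiler from tearing or re-reading the value. They do not add ordering or mutual exclusion, so they fit only if a stale `ref` value is acceptable on the lookup path.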

Happy to test or help prepare a patch.

Aleksandr Nogikh

Jul 1, 2025, 5:43:28 PM
to Shankari, syzkaller-bugs
Hi Shankari,

Please note that syzkall...@googlegroups.com is mostly used as an
archive, I doubt anyone monitors it closely. If you want to reach
kernel developers, you'd better To/Cc the relevant people/mailing
lists.

FWIW here's the link to the bug report in Lore:
https://lore.kernel.org/all/6847e661.a70a022...@google.com/T/

--
Aleksandr

Shankari

Jul 16, 2025, 1:47:28 AM
to syzkaller-bugs
#syz test

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7159,6 +7159,19 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env,
 		}
 
 		ret = btf_struct_access(&env->log, reg, off, size, atype, &btf_id, &flag, &field_name);
+
+		/* Block access to sensitive kernel-internal fields */
+		if (field_name && reg->btf && btf_is_kernel(reg->btf)) {
+			const struct btf_type *base_type = btf_type_by_id(reg->btf, reg->btf_id);
+			const char *type_name = btf_name_by_offset(reg->btf, base_type->name_off);
+
+			if (strcmp(type_name, "bpf_lru_node") == 0 &&
+			    strcmp(field_name, "ref") == 0) {
+				verbose(env,
+					"access to field 'ref' in struct bpf_lru_node is not allowed\n");
+				return -EACCES;
+			}
+		}
 	}
 
 	if (ret < 0)

On Tuesday, June 10, 2025 at 1:31:39 PM UTC+5:30 syzbot wrote: