bot+b...@kernel.org
unread,10:46āÆAMĀ (5 hours ago)Ā 10:46āÆAMSign in to reply to author
Sign in to forward
You do not have permission to delete messages in this group
Either email addresses are anonymous for this group or you need the view member email addresses permission to view the original message
to ead...@qq.com, sashi...@kernel.org, ead...@qq.com, jiayua...@linux.dev, sashiko...@lists.linux.dev, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, em...@etsalapatis.com, jo...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, mem...@gmail.com, net...@vger.kernel.org, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev, marti...@kernel.org, c...@meta.com, ihor.s...@linux.dev
> bpf: Fix smp_processor_id() call trace for preemptible kernels
>
> bpf_mem_cache_free_rcu() maybe called in preemptible context, this
> will trigger the below warning message:
>
> BUG: using smp_processor_id() in preemptible [00000000] code: syz.0.17/5820
> caller is bpf_mem_cache_free_rcu+0x48/0xc0 kernel/bpf/memalloc.c:954
> Call Trace:
> check_preemption_disabled+0xd3/0xe0 lib/smp_processor_id.c:47
> bpf_mem_cache_free_rcu+0x48/0xc0 kernel/bpf/memalloc.c:954
> rhtab_delete_elem+0x185a/0x1b30 kernel/bpf/hashtab.c:2969
> __rhtab_map_lookup_and_delete_batch+0x935/0xcb0 kernel/bpf/hashtab.c:3349
> bpf_map_do_batch+0x445/0x630 kernel/bpf/syscall.c:-1
> __sys_bpf+0x906/0xd90 kernel/bpf/syscall.c:-1
>
> this_cpu_ptr() access needs to be guarded against migration.
The commit message says "guarded against migration" but the fix actually
disables preemption using get_cpu()/put_cpu(). Is this description accurate?
get_cpu() does preempt_disable() + __smp_processor_id(), not just
migrate_disable(). The later sentence "disabling preemption rather than
prohibiting migration" correctly describes the fix, but this earlier line
understates the requirement.
> Wrapping this batch operation in bpf_disable_instrumentation() risk
> blinding BPF tracing globally on the CPU if preemption occurs.
>
> bpf_disable_instrumentation() increments the per-CPU bpf_prog_active counter.
> Because migrate_disable() and rcu_read_lock() do not disable preemption under
> CONFIG_PREEMPT_RCU, the task can be preempted during this potentially long
> loop.
>
> If preempted, bpf_prog_active would remain elevated on that CPU, which could
> cause subsequent tasks scheduled on the same CPU to silently drop BPF tracing
> events (kprobes, tracepoints, perf).
The reasoning about bpf_disable_instrumentation() is sound, but could the
commit message clarify that rhtab_delete_elem() already calls
bpf_disable_instrumentation() internally?
Looking at rhtab_delete_elem():
kernel/bpf/hashtab.c:rhtab_delete_elem() {
...
bpf_disable_instrumentation();
ret = rhashtable_remove_fast(&rhtab->table, &elem->htab_node,
rhtab->rhtable_params);
bpf_enable_instrumentation();
...
}
This means the same "blinding BPF tracing" window already exists in the
per-element path. The argument is against a longer-held instrumentation
disable across the entire loop, not a categorical concern.
> Therefore, we fix this by disabling preemption rather than prohibiting
> migration.
>
> Fixes: 5af6807bdb10 ("bpf: Introduce bpf_mem_free_rcu() similar to kfree_rcu().")
Should the Fixes tag point to a different commit? The bug occurs when
__rhtab_map_lookup_and_delete_batch() calls rhtab_delete_elem() which calls
bpf_mem_cache_free_rcu() in preemptible context.
Commit 5af6807bdb10 introduced bpf_mem_free_rcu() in 2023 but did not modify
kernel/bpf/hashtab.c or introduce the rhtab code. The call chain was
introduced by:
Fixes: 6905f8601298 ("bpf: Allow special fields in resizable hashtab")
which added the rhtab implementation including __rhtab_map_lookup_and_delete_batch()
and the call to bpf_mem_cache_free_rcu() in rhtab_delete_elem().
[ ... ]
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -3345,8 +3345,10 @@
> }
>
> if (do_delete) {
> + get_cpu();
> for (i = 0; i < total; i++)
> rhtab_delete_elem(rhtab, del_elems[i], NULL, 0);
> + put_cpu();
> }
>
> rcu_read_unlock();
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See:
https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary:
https://github.com/kernel-patches/bpf/actions/runs/28451219226