[syzbot] [bpf?] KASAN: stack-out-of-bounds Write in __bpf_get_stack


syzbot

Nov 10, 2025, 1:41:34 PM
to and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, con...@arnaud-lcm.com, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
Hello,

syzbot found the following issue on:

HEAD commit: f8c67d8550ee bpf: Use kmalloc_nolock() in range tree
git tree: bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=121a50b4580000
kernel config: https://syzkaller.appspot.com/x/.config?x=e46b8a1c645465a9
dashboard link: https://syzkaller.appspot.com/bug?extid=d1b7fa1092def3628bd7
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12270412580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=128bd084580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/d9e95bfbe4ee/disk-f8c67d85.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/0766b6dd0e91/vmlinux-f8c67d85.xz
kernel image: https://storage.googleapis.com/syzbot-assets/79089f9e9e93/bzImage-f8c67d85.xz

The issue was bisected to:

commit e17d62fedd10ae56e2426858bd0757da544dbc73
Author: Arnaud Lecomte <con...@arnaud-lcm.com>
Date: Sat Oct 25 19:28:58 2025 +0000

bpf: Refactor stack map trace depth calculation into helper function

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1632d0b4580000
final oops: https://syzkaller.appspot.com/x/report.txt?x=1532d0b4580000
console output: https://syzkaller.appspot.com/x/log.txt?x=1132d0b4580000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d1b7fa...@syzkaller.appspotmail.com
Fixes: e17d62fedd10 ("bpf: Refactor stack map trace depth calculation into helper function")

==================================================================
BUG: KASAN: stack-out-of-bounds in __bpf_get_stack+0x5a3/0xaa0 kernel/bpf/stackmap.c:493
Write of size 168 at addr ffffc900030e73a8 by task syz.1.44/6108

CPU: 0 UID: 0 PID: 6108 Comm: syz.1.44 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
Call Trace:
<TASK>
dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xca/0x240 mm/kasan/report.c:482
kasan_report+0x118/0x150 mm/kasan/report.c:595
check_region_inline mm/kasan/generic.c:-1 [inline]
kasan_check_range+0x2b0/0x2c0 mm/kasan/generic.c:200
__asan_memcpy+0x40/0x70 mm/kasan/shadow.c:106
__bpf_get_stack+0x5a3/0xaa0 kernel/bpf/stackmap.c:493
____bpf_get_stack kernel/bpf/stackmap.c:517 [inline]
bpf_get_stack+0x33/0x50 kernel/bpf/stackmap.c:514
____bpf_get_stack_raw_tp kernel/trace/bpf_trace.c:1653 [inline]
bpf_get_stack_raw_tp+0x1a9/0x220 kernel/trace/bpf_trace.c:1643
bpf_prog_4b3f8e3d902f6f0d+0x41/0x49
bpf_dispatcher_nop_func include/linux/bpf.h:1364 [inline]
__bpf_prog_run include/linux/filter.h:721 [inline]
bpf_prog_run include/linux/filter.h:728 [inline]
__bpf_trace_run kernel/trace/bpf_trace.c:2075 [inline]
bpf_trace_run2+0x284/0x4b0 kernel/trace/bpf_trace.c:2116
__traceiter_kfree+0x2e/0x50 include/trace/events/kmem.h:97
__do_trace_kfree include/trace/events/kmem.h:97 [inline]
trace_kfree include/trace/events/kmem.h:97 [inline]
kfree+0x62f/0x6d0 mm/slub.c:6824
compute_scc+0x9a6/0xa20 kernel/bpf/verifier.c:25021
bpf_check+0x5df2/0x1c210 kernel/bpf/verifier.c:25162
bpf_prog_load+0x13ba/0x1a10 kernel/bpf/syscall.c:3095
__sys_bpf+0x507/0x860 kernel/bpf/syscall.c:6171
__do_sys_bpf kernel/bpf/syscall.c:6281 [inline]
__se_sys_bpf kernel/bpf/syscall.c:6279 [inline]
__x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:6279
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fc4d8b8f6c9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffcd2851bb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00007fc4d8de5fa0 RCX: 00007fc4d8b8f6c9
RDX: 0000000000000094 RSI: 00002000000000c0 RDI: 0000000000000005
RBP: 00007fc4d8c11f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fc4d8de5fa0 R14: 00007fc4d8de5fa0 R15: 0000000000000003
</TASK>

The buggy address belongs to stack of task syz.1.44/6108
and is located at offset 296 in frame:
__bpf_get_stack+0x0/0xaa0 include/linux/mmap_lock.h:-1

This frame has 1 object:
[32, 36) 'rctx.i'

The buggy address belongs to a 8-page vmalloc region starting at 0xffffc900030e0000 allocated at copy_process+0x54b/0x3c00 kernel/fork.c:2012
The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x572fb
memcg:ffff88803037aa02
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000000 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000000000 0000000000000000 00000001ffffffff ffff88803037aa02
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x2dc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_ZERO|__GFP_NOWARN), pid 1340, tgid 1340 (kworker/u8:6), ts 107851542040, free_ts 101175357499
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x240/0x2a0 mm/page_alloc.c:1850
prep_new_page mm/page_alloc.c:1858 [inline]
get_page_from_freelist+0x2365/0x2440 mm/page_alloc.c:3884
__alloc_frozen_pages_noprof+0x181/0x370 mm/page_alloc.c:5183
alloc_pages_mpol+0x232/0x4a0 mm/mempolicy.c:2416
alloc_frozen_pages_noprof mm/mempolicy.c:2487 [inline]
alloc_pages_noprof+0xa9/0x190 mm/mempolicy.c:2507
vm_area_alloc_pages mm/vmalloc.c:3647 [inline]
__vmalloc_area_node mm/vmalloc.c:3724 [inline]
__vmalloc_node_range_noprof+0x96c/0x12d0 mm/vmalloc.c:3897
__vmalloc_node_noprof+0xc2/0x110 mm/vmalloc.c:3960
alloc_thread_stack_node kernel/fork.c:311 [inline]
dup_task_struct+0x3d4/0x830 kernel/fork.c:881
copy_process+0x54b/0x3c00 kernel/fork.c:2012
kernel_clone+0x21e/0x840 kernel/fork.c:2609
user_mode_thread+0xdd/0x140 kernel/fork.c:2685
call_usermodehelper_exec_sync kernel/umh.c:132 [inline]
call_usermodehelper_exec_work+0x9c/0x230 kernel/umh.c:163
process_one_work kernel/workqueue.c:3263 [inline]
process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3346
worker_thread+0x8a0/0xda0 kernel/workqueue.c:3427
kthread+0x711/0x8a0 kernel/kthread.c:463
ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158
page last free pid 5918 tgid 5918 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1394 [inline]
__free_frozen_pages+0xbc4/0xd30 mm/page_alloc.c:2906
vfree+0x25a/0x400 mm/vmalloc.c:3440
kcov_put kernel/kcov.c:439 [inline]
kcov_close+0x28/0x50 kernel/kcov.c:535
__fput+0x44c/0xa70 fs/file_table.c:468
task_work_run+0x1d4/0x260 kernel/task_work.c:227
exit_task_work include/linux/task_work.h:40 [inline]
do_exit+0x6b5/0x2300 kernel/exit.c:966
do_group_exit+0x21c/0x2d0 kernel/exit.c:1107
get_signal+0x1285/0x1340 kernel/signal.c:3034
arch_do_signal_or_restart+0xa0/0x790 arch/x86/kernel/signal.c:337
exit_to_user_mode_loop+0x72/0x130 kernel/entry/common.c:40
exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline]
do_syscall_64+0x2bd/0xfa0 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Memory state around the buggy address:
ffffc900030e7300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffffc900030e7380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffffc900030e7400: f1 f1 f1 f1 00 00 f2 f2 00 00 f3 f3 00 00 00 00
^
ffffc900030e7480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffffc900030e7500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

syzbot

Nov 10, 2025, 2:01:43 PM
to linux-...@vger.kernel.org, syzkall...@googlegroups.com
For archival purposes, forwarding an incoming command email to
linux-...@vger.kernel.org, syzkall...@googlegroups.com.

***

Subject: Re: [syzbot] [bpf?] KASAN: stack-out-of-bounds Write in __bpf_get_stack
Author: lis...@listout.xyz
#syz test

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 2365541c81dd..c68589d0f5f0 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -479,7 +479,6 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
goto err_fault;
}

- trace_nr = trace->nr - skip;
copy_len = trace_nr * elem_size;

ips = trace->ip + skip;

--
Regards,
listout

syzbot

Nov 10, 2025, 2:17:41 PM
to linux-...@vger.kernel.org, syzkall...@googlegroups.com
For archival purposes, forwarding an incoming command email to
linux-...@vger.kernel.org, syzkall...@googlegroups.com.

***

Subject: Re: [syzbot] [bpf?] KASAN: stack-out-of-bounds Write in __bpf_get_stack
Author: lis...@listout.xyz

On 10.11.2025 10:41, syzbot wrote:
#syz test

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 2365541c81dd..2db09ce39828 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -480,7 +480,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
}

trace_nr = trace->nr - skip;
- copy_len = trace_nr * elem_size;
+ /*copy_len = trace_nr * elem_size;*/

ips = trace->ip + skip;
if (user_build_id) {
@@ -490,7 +490,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
for (i = 0; i < trace_nr; i++)
id_offs[i].ip = ips[i];
} else {
- memcpy(buf, ips, copy_len);
+ memcpy(buf, ips, trace_nr);
}

/* trace/ips should not be dereferenced after this point */

--
Regards,
listout

syzbot

Nov 10, 2025, 2:33:06 PM
to linux-...@vger.kernel.org, lis...@listout.xyz, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+d1b7fa...@syzkaller.appspotmail.com
Tested-by: syzbot+d1b7fa...@syzkaller.appspotmail.com

Tested on:

commit: f8c67d85 bpf: Use kmalloc_nolock() in range tree
git tree: bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=15ceb412580000
kernel config: https://syzkaller.appspot.com/x/.config?x=e46b8a1c645465a9
dashboard link: https://syzkaller.appspot.com/bug?extid=d1b7fa1092def3628bd7
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=17b66412580000

Note: testing is done by a robot and is best-effort only.

syzbot

Nov 10, 2025, 2:50:05 PM
to linux-...@vger.kernel.org, lis...@listout.xyz, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
invalid opcode in error_return

Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 6994 Comm: syz.1.247 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
RIP: 0010:error_return+0xa/0x20 arch/x86/entry/entry_64.S:1091
Code: cc cc cc cc cc cc cc cc cc cc cc cc 48 8d 7c 24 08 e8 5a 4c 46 0a 48 89 c7 e9 12 4c 46 0a 90 90 50 9c 58 a9 00 02 00 00 74 02 <0f> 0b 58 f6 84 24 88 00 00 00 03 0f 84 31 fc ff ff e9 60 fb ff ff
RSP: 0018:ffffc90000007a78 EFLAGS: 00010206
RAX: 0000000000000286 RBX: 1ffff1100f9266d4 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffc90000007a70
RBP: ffffffff8b46984e R08: ffffc90000007a6f R09: 0000000000000000
R10: ffffc90000007a68 R11: fffff52000000f4e R12: ffffc9000c2c3048
R13: ffffc90000007b00 R14: ffff88807c9336a0 R15: ffffc9000c2c3060
FS: 00007f9d4ee566c0(0000) GS:ffff88812613b000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000100000000 CR3: 00000000726c6000 CR4: 00000000003526f0
Call Trace:
<IRQ>
RIP: 3100:rcu_lock_release include/linux/rcupdate.h:341 [inline]
RIP: 3100:rcu_do_batch kernel/rcu/tree.c:2607 [inline]
RIP: 3100:rcu_core+0xcab/0x1770 kernel/rcu/tree.c:2861
Code: 00 00 00 00 fc ff df 41 80 3c 06 00 74 08 4c 89 ff e8 59 1d 7e 00 48 c7 43 08 00 00 00 00 48 89 df 4d 89 e3 2e e8 4d 4e 58 1e <48> c7 c7 40 d7 f3 8d 4c 89 ee e8 b6 77 f5 ff 65 8b 05 7f 61 c6 10
RSP: f400:0000000000000000 EFLAGS: 404bee7c878af400
==================================================================
BUG: KASAN: stack-out-of-bounds in __show_regs+0x4e/0x620 arch/x86/kernel/process_64.c:79
Read of size 8 at addr ffffc90000007af8 by task syz.1.247/6994

CPU: 0 UID: 0 PID: 6994 Comm: syz.1.247 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
Call Trace:
<IRQ>
dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xca/0x240 mm/kasan/report.c:482
kasan_report+0x118/0x150 mm/kasan/report.c:595
__show_regs+0x4e/0x620 arch/x86/kernel/process_64.c:79
show_regs_if_on_stack arch/x86/kernel/dumpstack.c:165 [inline]
show_trace_log_lvl+0x31d/0x550 arch/x86/kernel/dumpstack.c:237
show_regs arch/x86/kernel/dumpstack.c:470 [inline]
__die_body+0xa6/0xb0 arch/x86/kernel/dumpstack.c:412
die+0x2a/0x50 arch/x86/kernel/dumpstack.c:439
do_trap_no_signal arch/x86/kernel/traps.c:206 [inline]
do_trap+0x14a/0x3d0 arch/x86/kernel/traps.c:247
do_error_trap+0x1c1/0x280 arch/x86/kernel/traps.c:267
handle_invalid_op+0x34/0x40 arch/x86/kernel/traps.c:304
exc_invalid_op+0x39/0x50 arch/x86/kernel/traps.c:397
asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:616
RIP: 0010:error_return+0xa/0x20 arch/x86/entry/entry_64.S:1091
Code: cc cc cc cc cc cc cc cc cc cc cc cc 48 8d 7c 24 08 e8 5a 4c 46 0a 48 89 c7 e9 12 4c 46 0a 90 90 50 9c 58 a9 00 02 00 00 74 02 <0f> 0b 58 f6 84 24 88 00 00 00 03 0f 84 31 fc ff ff e9 60 fb ff ff
RSP: 0018:ffffc90000007a78 EFLAGS: 00010206
RAX: 0000000000000286 RBX: 1ffff1100f9266d4 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffc90000007a70
RBP: ffffffff8b46984e R08: ffffc90000007a6f R09: 0000000000000000
R10: ffffc90000007a68 R11: fffff52000000f4e R12: ffffc9000c2c3048
R13: ffffc90000007b00 R14: ffff88807c9336a0 R15: ffffc9000c2c3060
RIP: 3100:rcu_lock_release include/linux/rcupdate.h:341 [inline]
RIP: 3100:rcu_do_batch kernel/rcu/tree.c:2607 [inline]
RIP: 3100:rcu_core+0xcab/0x1770 kernel/rcu/tree.c:2861
Code: 00 00 00 00 fc ff df 41 80 3c 06 00 74 08 4c 89 ff e8 59 1d 7e 00 48 c7 43 08 00 00 00 00 48 89 df 4d 89 e3 2e e8 4d 4e 58 1e <48> c7 c7 40 d7 f3 8d 4c 89 ee e8 b6 77 f5 ff 65 8b 05 7f 61 c6 10
RSP: f400:0000000000000000 EFLAGS: 404bee7c878af400 ORIG_RAX: 0000000000000000
RAX: ffffffff81cbf590 RBX: ffffc9000c2c3040 RCX: 0000000000000000
RDX: 0000008000000008 RSI: 0000000000000000 RDI: ffffffff8df3d740
RBP: 0000000000000000 R08: ffffffff8d74996d R09: 0000000041b58ab3
R10: 1ffff92000000f58 R11: 1ffff92001858608 R12: ffffffff81cbf716
R13: ffff88807c932970 R14: ffff88807c9309f3 R15: ffffffff81ed3477
</IRQ>
<TASK>
</TASK>

The buggy address belongs to a 0-page vmalloc region starting at 0xffffc90000000000 allocated at map_irq_stack arch/x86/kernel/irq_64.c:49 [inline]
The buggy address belongs to a 0-page vmalloc region starting at 0xffffc90000000000 allocated at irq_init_percpu_irqstack+0x342/0x4a0 arch/x86/kernel/irq_64.c:76
The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xb8808
flags: 0xfff00000002000(reserved|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000002000 ffffea0002e20208 ffffea0002e20208 0000000000000000
raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner info is not present (never set?)

Memory state around the buggy address:
ffffc90000007980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffffc90000007a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffffc90000007a80: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 f2 f2
^
ffffc90000007b00: 00 00 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
ffffc90000007b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
----------------
Code disassembly (best guess), 6 bytes skipped:
0: df 41 80 filds -0x80(%rcx)
3: 3c 06 cmp $0x6,%al
5: 00 74 08 4c add %dh,0x4c(%rax,%rcx,1)
9: 89 ff mov %edi,%edi
b: e8 59 1d 7e 00 call 0x7e1d69
10: 48 c7 43 08 00 00 00 movq $0x0,0x8(%rbx)
17: 00
18: 48 89 df mov %rbx,%rdi
1b: 4d 89 e3 mov %r12,%r11
1e: 2e e8 4d 4e 58 1e cs call 0x1e584e71
* 24: 48 c7 c7 40 d7 f3 8d mov $0xffffffff8df3d740,%rdi <-- trapping instruction
2b: 4c 89 ee mov %r13,%rsi
2e: e8 b6 77 f5 ff call 0xfff577e9
33: 65 8b 05 7f 61 c6 10 mov %gs:0x10c6617f(%rip),%eax # 0x10c661b9


Tested on:

commit: f8c67d85 bpf: Use kmalloc_nolock() in range tree
git tree: bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=15ee6412580000
kernel config: https://syzkaller.appspot.com/x/.config?x=e46b8a1c645465a9
dashboard link: https://syzkaller.appspot.com/bug?extid=d1b7fa1092def3628bd7
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=13eaa60a580000

syzbot

Nov 10, 2025, 3:58:37 PM
to linux-...@vger.kernel.org, syzkall...@googlegroups.com
For archival purposes, forwarding an incoming command email to
linux-...@vger.kernel.org, syzkall...@googlegroups.com.

***

Subject: Re: [syzbot] [bpf?] KASAN: stack-out-of-bounds Write in __bpf_get_stack
Author: lis...@listout.xyz

On 10.11.2025 10:41, syzbot wrote:
#syz test

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 2365541c81dd..885130e4ab0d 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -480,6 +480,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
}

trace_nr = trace->nr - skip;
+ trace_nr = min_t(u32, trace_nr, size / elem_size);
copy_len = trace_nr * elem_size;

ips = trace->ip + skip;

--
Regards,
listout

Brahmajit Das

Nov 10, 2025, 4:25:57 PM
to syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, con...@arnaud-lcm.com, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
syzbot reported a stack-out-of-bounds write in __bpf_get_stack()
triggered via bpf_get_stack() when capturing a kernel stack trace.

After the recent refactor that introduced stack_map_calculate_max_depth(),
the code in stack_map_get_build_id_offset() (and related helpers) stopped
clamping the number of trace entries (`trace_nr`) to the number of elements
that fit into the stack map value (`num_elem`).

As a result, if the captured stack contained more frames than the map value
can hold, the subsequent memcpy() would write past the end of the buffer,
triggering a KASAN report like:

BUG: KASAN: stack-out-of-bounds in __bpf_get_stack+0x...
Write of size N at addr ... by task syz-executor...

Restore the missing clamp by limiting `trace_nr` to `num_elem` before
computing the copy length. This mirrors the pre-refactor logic and ensures
we never copy more bytes than the destination buffer can hold.

No functional change intended beyond reintroducing the missing bound check.

Reported-by: syzbot+d1b7fa...@syzkaller.appspotmail.com
Fixes: e17d62fedd10 ("bpf: Refactor stack map trace depth calculation into helper function")
Signed-off-by: Brahmajit Das <lis...@listout.xyz>
---
kernel/bpf/stackmap.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 2365541c81dd..885130e4ab0d 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -480,6 +480,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
}

trace_nr = trace->nr - skip;
+ trace_nr = min_t(u32, trace_nr, size / elem_size);
copy_len = trace_nr * elem_size;

ips = trace->ip + skip;
--
2.51.2
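
For reference, the clamp removed by the refactor looked roughly like this in
the pre-refactor __bpf_get_stack() (a sketch reconstructed from the
pre-refactor logic quoted later in this thread, not the exact removed lines):

	num_elem = size / elem_size;
	trace_nr = (trace_nr <= num_elem) ? trace_nr : num_elem;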

syzbot

Nov 10, 2025, 4:34:04 PM
to linux-...@vger.kernel.org, lis...@listout.xyz, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+d1b7fa...@syzkaller.appspotmail.com
Tested-by: syzbot+d1b7fa...@syzkaller.appspotmail.com

Tested on:

commit: f8c67d85 bpf: Use kmalloc_nolock() in range tree
git tree: bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=17828c12580000
kernel config: https://syzkaller.appspot.com/x/.config?x=e46b8a1c645465a9
dashboard link: https://syzkaller.appspot.com/bug?extid=d1b7fa1092def3628bd7
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=10616412580000

syzbot

Nov 10, 2025, 6:43:05 PM
to linux-...@vger.kernel.org, syzkall...@googlegroups.com
For archival purposes, forwarding an incoming command email to
linux-...@vger.kernel.org, syzkall...@googlegroups.com.

***

Subject: Re: [syzbot] [bpf?] KASAN: stack-out-of-bounds Write in __bpf_get_stack
Author: lis...@listout.xyz

On 10.11.2025 10:41, syzbot wrote:
#syz test

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 885130e4ab0d..f9081de43689 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -480,7 +480,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
}

trace_nr = trace->nr - skip;
- trace_nr = min_t(u32, trace_nr, size / elem_size);
+ trace_nr = min_t(u32, trace_nr, max_depth - skip);
copy_len = trace_nr * elem_size;

ips = trace->ip + skip;

--
Regards,
listout

syzbot

Nov 10, 2025, 7:21:52 PM
to linux-...@vger.kernel.org, syzkall...@googlegroups.com
For archival purposes, forwarding an incoming command email to
linux-...@vger.kernel.org, syzkall...@googlegroups.com.

***

Subject: Re: [syzbot] [bpf?] KASAN: stack-out-of-bounds Write in __bpf_get_stack
Author: lis...@listout.xyz

On 10.11.2025 10:41, syzbot wrote:
#syz test

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 2365541c81dd..f9081de43689 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -480,6 +480,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
}

trace_nr = trace->nr - skip;

syzbot

Nov 10, 2025, 7:22:05 PM
to linux-...@vger.kernel.org, lis...@listout.xyz, syzkall...@googlegroups.com
Hello,

syzbot tried to test the proposed patch but the build/boot failed:

failed to apply patch:
checking file kernel/bpf/stackmap.c
Hunk #1 FAILED at 480.
1 out of 1 hunk FAILED



Tested on:

commit: f8c67d85 bpf: Use kmalloc_nolock() in range tree
git tree: bpf-next
patch: https://syzkaller.appspot.com/x/patch.diff?x=114e7084580000

Brahmajit Das

Nov 10, 2025, 7:37:44 PM
to syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, con...@arnaud-lcm.com, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
syzbot reported a stack-out-of-bounds write in __bpf_get_stack()
triggered via bpf_get_stack() when capturing a kernel stack trace.

After the recent refactor that introduced stack_map_calculate_max_depth(),
the code in stack_map_get_build_id_offset() (and related helpers) stopped
clamping the number of trace entries (`trace_nr`) to the number of elements
that fit into the stack map value (`num_elem`).

As a result, if the captured stack contained more frames than the map value
can hold, the subsequent memcpy() would write past the end of the buffer,
triggering a KASAN report like:

BUG: KASAN: stack-out-of-bounds in __bpf_get_stack+0x...
Write of size N at addr ... by task syz-executor...

Restore the missing clamp by limiting `trace_nr` to `num_elem` before
computing the copy length. This mirrors the pre-refactor logic and ensures
we never copy more bytes than the destination buffer can hold.

No functional change intended beyond reintroducing the missing bound check.

Reported-by: syzbot+d1b7fa...@syzkaller.appspotmail.com
Fixes: e17d62fedd10 ("bpf: Refactor stack map trace depth calculation into helper function")
Signed-off-by: Brahmajit Das <lis...@listout.xyz>
---
Changes in v2:
- Use max_depth instead of the num_elem logic; this is similar to what
we are already doing in __bpf_get_stackid()

Changes in v1:
- RFC patch that restores the clamp on the number of trace entries by
setting trace_nr to the smaller of trace_nr and num_elem.
Link: https://lore.kernel.org/all/20251110211640...@listout.xyz/
---
kernel/bpf/stackmap.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 2365541c81dd..f9081de43689 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -480,6 +480,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
}

trace_nr = trace->nr - skip;
+ trace_nr = min_t(u32, trace_nr, max_depth - skip);
copy_len = trace_nr * elem_size;

ips = trace->ip + skip;
--
2.51.2

syzbot

Nov 10, 2025, 9:28:04 PM
to linux-...@vger.kernel.org, lis...@listout.xyz, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+d1b7fa...@syzkaller.appspotmail.com
Tested-by: syzbot+d1b7fa...@syzkaller.appspotmail.com

Tested on:

commit: f8c67d85 bpf: Use kmalloc_nolock() in range tree
git tree: bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=10790658580000
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=1033fa92580000

Brahmajit Das

Nov 11, 2025, 3:13:20 AM
to syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, con...@arnaud-lcm.com, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
syzbot reported a stack-out-of-bounds write in __bpf_get_stack()
triggered via bpf_get_stack() when capturing a kernel stack trace.

After the recent refactor that introduced stack_map_calculate_max_depth(),
the code in stack_map_get_build_id_offset() (and related helpers) stopped
clamping the number of trace entries (`trace_nr`) to the number of elements
that fit into the stack map value (`num_elem`).

As a result, if the captured stack contained more frames than the map value
can hold, the subsequent memcpy() would write past the end of the buffer,
triggering a KASAN report like:

BUG: KASAN: stack-out-of-bounds in __bpf_get_stack+0x...
Write of size N at addr ... by task syz-executor...

Restore the missing clamp by limiting `trace_nr` to `num_elem` before
computing the copy length. This mirrors the pre-refactor logic and ensures
we never copy more bytes than the destination buffer can hold.

No functional change intended beyond reintroducing the missing bound check.

Reported-by: syzbot+d1b7fa...@syzkaller.appspotmail.com
Fixes: e17d62fedd10 ("bpf: Refactor stack map trace depth calculation into helper function")
Signed-off-by: Brahmajit Das <lis...@listout.xyz>
---
Changes in v3:
- Revert to the num_elem-based logic for setting trace_nr. This was
suggested by the bpf-ci bot, which pointed out the chance of an underflow
when max_depth < skip.

Quoting the bot's reply:
The stack_map_calculate_max_depth() function can return a value less than
skip when sysctl_perf_event_max_stack is lowered below the skip value:

	max_depth = size / elem_size;
	max_depth += skip;
	if (max_depth > curr_sysctl_max_stack)
		return curr_sysctl_max_stack;

If sysctl_perf_event_max_stack = 10 and skip = 20, this returns 10.

Then max_depth - skip = 10 - 20 underflows to 4294967286 (u32 wraps),
causing min_t() to not limit trace_nr at all. This means the original OOB
write is not fixed in cases where skip > max_depth.

With the default sysctl_perf_event_max_stack = 127 and skip up to 255, this
scenario is reachable even without admin changing sysctls.

Changes in v2:
- Use max_depth instead of the num_elem logic; this is similar to what
we are already doing in __bpf_get_stackid()
Link: https://lore.kernel.org/all/20251111003721...@listout.xyz/

Changes in v1:
- RFC patch that restores the clamp on the number of trace entries by
setting trace_nr to the smaller of trace_nr and num_elem.
Link: https://lore.kernel.org/all/20251110211640...@listout.xyz/
---
kernel/bpf/stackmap.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 2365541c81dd..cef79d9517ab 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -426,7 +426,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
struct perf_callchain_entry *trace_in,
void *buf, u32 size, u64 flags, bool may_fault)
{
- u32 trace_nr, copy_len, elem_size, max_depth;
+ u32 trace_nr, copy_len, elem_size, num_elem, max_depth;
bool user_build_id = flags & BPF_F_USER_BUILD_ID;
bool crosstask = task && task != current;
u32 skip = flags & BPF_F_SKIP_FIELD_MASK;
@@ -480,6 +480,8 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
}

trace_nr = trace->nr - skip;
+ num_elem = size / elem_size;
+ trace_nr = min_t(u32, trace_nr, num_elem);

bot+b...@kernel.org

Nov 11, 2025, 3:43:43 AM
to lis...@listout.xyz, syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, con...@arnaud-lcm.com, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev, marti...@kernel.org, c...@meta.com, ihor.s...@linux.dev
> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> index 2365541c8..f9081de43 100644
> --- a/kernel/bpf/stackmap.c
> +++ b/kernel/bpf/stackmap.c
> @@ -480,6 +480,7 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
> }
>
> trace_nr = trace->nr - skip;
> + trace_nr = min_t(u32, trace_nr, max_depth - skip);
> copy_len = trace_nr * elem_size;
>
> ips = trace->ip + skip;

Can max_depth - skip underflow when max_depth < skip?

The stack_map_calculate_max_depth() function can return a value less than
skip when sysctl_perf_event_max_stack is lowered below the skip value:

	max_depth = size / elem_size;
	max_depth += skip;
	if (max_depth > curr_sysctl_max_stack)
		return curr_sysctl_max_stack;

If sysctl_perf_event_max_stack = 10 and skip = 20, this returns 10.

Then max_depth - skip = 10 - 20 underflows to 4294967286 (u32 wraps),
causing min_t() to not limit trace_nr at all. This means the original OOB
write is not fixed in cases where skip > max_depth.

With the default sysctl_perf_event_max_stack = 127 and skip up to 255, this
scenario is reachable even without admin changing sysctls.

The pre-refactor code used:

	num_elem = size / elem_size;
	trace_nr = (trace_nr <= num_elem) ? trace_nr : num_elem;

Perhaps the fix should directly use num_elem instead of max_depth - skip:

	u32 num_elem = size / elem_size;
	trace_nr = min_t(u32, trace_nr, num_elem);

Or check for underflow:

	if (max_depth > skip)
		trace_nr = min_t(u32, trace_nr, max_depth - skip);
	else
		trace_nr = 0;
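
To make the wrap concrete, here is a minimal standalone C sketch
(hypothetical userspace code, not kernel code) of the same u32 arithmetic:

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		uint32_t max_depth = 10, skip = 20;

		/* unsigned subtraction wraps instead of going negative */
		printf("%u\n", max_depth - skip);	/* prints 4294967286 */
		return 0;
	}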


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/19251115736

Yonghong Song

Nov 11, 2025, 8:45:14 PM
to Brahmajit Das, syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, con...@arnaud-lcm.com, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com


On 11/11/25 12:12 AM, Brahmajit Das wrote:
> syzbot reported a stack-out-of-bounds write in __bpf_get_stack()
> triggered via bpf_get_stack() when capturing a kernel stack trace.
>
> After the recent refactor that introduced stack_map_calculate_max_depth(),
> the code in stack_map_get_build_id_offset() (and related helpers) stopped
> clamping the number of trace entries (`trace_nr`) to the number of elements
> that fit into the stack map value (`num_elem`).
>
> As a result, if the captured stack contained more frames than the map value
> can hold, the subsequent memcpy() would write past the end of the buffer,
> triggering a KASAN report like:
>
> BUG: KASAN: stack-out-of-bounds in __bpf_get_stack+0x...
> Write of size N at addr ... by task syz-executor...
>
> Restore the missing clamp by limiting `trace_nr` to `num_elem` before
> computing the copy length. This mirrors the pre-refactor logic and ensures
> we never copy more bytes than the destination buffer can hold.
>
> No functional change intended beyond reintroducing the missing bound check.
>
> Reported-by: syzbot+d1b7fa...@syzkaller.appspotmail.com
> Fixes: e17d62fedd10 ("bpf: Refactor stack map trace depth calculation into helper function")
> Signed-off-by: Brahmajit Das <lis...@listout.xyz>

Acked-by: Yonghong Song <yongho...@linux.dev>

Lecomte, Arnaud

Nov 12, 2025, 3:40:40 AM
to Brahmajit Das, syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
I am not sure this is the right solution and I am scared that by
forcing this clamping, we are hiding something else.
If we have a look at the code below:

	if (trace_in) {
		trace = trace_in;
		trace->nr = min_t(u32, trace->nr, max_depth);
	} else if (kernel && task) {
		trace = get_callchain_entry_for_task(task, max_depth);
	} else {
		trace = get_perf_callchain(regs, kernel, user, max_depth,
					   crosstask, false, 0);
	}

trace should be (if I remember correctly) clamped there. If not, it
might hide something else. I would like to have a look at the return
for each if case through gdb.
Thanks,
Arnaud
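
A hypothetical gdb session along those lines (assuming a vmlinux with debug
info and that the locals are not optimized out; the line number matches the
hunk the patches above touch) might look like:

	(gdb) break kernel/bpf/stackmap.c:480
	(gdb) commands
	> silent
	> printf "trace->nr=%u skip=%u max_depth=%u\n", trace->nr, skip, max_depth
	> continue
	> end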

Brahmajit Das

Nov 12, 2025, 3:59:03 AM
to Lecomte, Arnaud, syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
On 12.11.2025 08:40, 'Lecomte, Arnaud' via syzkaller-bugs wrote:
> I am not sure this is the right solution and I am scared that by
> forcing this clamping, we are hiding something else.
> If we have a look at the code below:
>
> 	if (trace_in) {
> 		trace = trace_in;
> 		trace->nr = min_t(u32, trace->nr, max_depth);
> 	} else if (kernel && task) {
> 		trace = get_callchain_entry_for_task(task, max_depth);
> 	} else {
> 		trace = get_perf_callchain(regs, kernel, user, max_depth,
> 					   crosstask, false, 0);
> 	}
>
> trace should be (if I remember correctly) clamped there. If not, it
> might hide something else. I would like to have a look at the return
> for each if case through gdb.

Sure, I can do that.

>
> Thanks,
> Arnaud

--
Regards,
listout

David Laight

Nov 12, 2025, 8:38:22 AM
to Brahmajit Das, syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, con...@arnaud-lcm.com, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
Please can we have no unnecessary min_t().
You wouldn't write:
x = (u32)a < (u32)b ? (u32)a : (u32)b;

David

Brahmajit Das

Nov 12, 2025, 9:48:04 AM
to David Laight, syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, con...@arnaud-lcm.com, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
On 12.11.2025 13:35, David Laight wrote:
> On Tue, 11 Nov 2025 13:42:54 +0530
> Brahmajit Das <lis...@listout.xyz> wrote:
>
...snip...
>
> Please can we have no unnecessary min_t().
> You wouldn't write:
> x = (u32)a < (u32)b ? (u32)a : (u32)b;
>
> David
>
> > copy_len = trace_nr * elem_size;
> >
> > ips = trace->ip + skip;
>

Hi David,

Sorry, I didn't quite get that. Would you prefer something like:

	trace_nr = (trace_nr <= num_elem) ? trace_nr : num_elem;

i.e. the pre-refactor code?

--
Regards,
listout

Lecomte, Arnaud

Nov 12, 2025, 11:11:48 AM
to Brahmajit Das, David Laight, syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
min_t() is a min() with casting, which is unnecessary in this case as
trace_nr and num_elem are already u32.
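
A minimal illustration of the distinction (both helpers live in
include/linux/minmax.h; this is a sketch, not code being changed):

	u32 trace_nr, num_elem;

	trace_nr = min(trace_nr, num_elem);        /* fine: both already u32 */
	trace_nr = min_t(u32, trace_nr, num_elem); /* same result, casts redundant */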

> The pre-refactor code.
>

David Laight

Nov 13, 2025, 2:40:52 AM
to Lecomte, Arnaud, Brahmajit Das, syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
Correct

David

>
> > The pre-refactor code.
> >
>

Brahmajit Das

Nov 13, 2025, 7:49:17 AM
to Lecomte, Arnaud, syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
On 12.11.2025 08:40, 'Lecomte, Arnaud' via syzkaller-bugs wrote:
> I am not sure this is the right solution and I am scared that by
> forcing this clamping, we are hiding something else.
> If we have a look at the code below:
>
> 	if (trace_in) {
> 		trace = trace_in;
> 		trace->nr = min_t(u32, trace->nr, max_depth);
> 	} else if (kernel && task) {
> 		trace = get_callchain_entry_for_task(task, max_depth);
> 	} else {
> 		trace = get_perf_callchain(regs, kernel, user, max_depth,
> 					   crosstask, false, 0);
> 	}
>
> trace should be (if I remember correctly) clamped there. If not, it
> might hide something else. I would like to have a look at the return
> for each if case through gdb.

Hi Arnaud,
So I've been debugging this: the reproducer always takes the else
branch, so in this situation trace holds whatever get_perf_callchain()
returns.

I mostly found trace->nr to be a value around 4.

In some cases the value would jump to something like 27 or 44, just
after the code block

	if (unlikely(!trace) || trace->nr < skip) {
		if (may_fault)
			rcu_read_unlock();
		goto err_fault;
	}

So I'm assuming there's some race condition going on somewhere.
I'm still debugging, but I'm open to ideas and I could definitely be
wrong here, so please feel free to correct me / point things out.

--
Regards,
listout

Lecomte, Arnaud

Nov 13, 2025, 8:26:16 AM
to Brahmajit Das, syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
Which value? trace->nr?
> I'm still debugging, but I'm open to ideas and I could definitely be
> wrong here, so please feel free to correct me / point things out.

I should be able to have a look tomorrow evening, as I am currently a
bit overloaded with my work.

Thanks,
Arnaud

Brahmajit Das

Nov 13, 2025, 8:49:36 AM
to Lecomte, Arnaud, syzbot+d1b7fa...@syzkaller.appspotmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, marti...@linux.dev, net...@vger.kernel.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
On 13.11.2025 13:26, Lecomte, Arnaud wrote:
>
> On 13/11/2025 12:49, Brahmajit Das wrote:
> > On 12.11.2025 08:40, 'Lecomte, Arnaud' via syzkaller-bugs wrote:
> > > I am not sure this is the right solution and I am scared that by
> > > forcing this clamping, we are hiding something else.
> > > If we have a look at the code below:
...snip...
> > > might hide something else. I would like to have a look at the return for
> > > each if case through gdb.
> > Hi Arnaud,
> > So I've been debugging this: the reproducer always takes the else
> > branch, so in this situation trace holds whatever get_perf_callchain()
> > returns.
> >
> > I mostly found trace->nr to be a value around 4.
> >
> > In some cases the value would jump to something like 27 or 44, just
> > after the code block
> >
> > 	if (unlikely(!trace) || trace->nr < skip) {
> > 		if (may_fault)
> > 			rcu_read_unlock();
> > 		goto err_fault;
> > 	}
> >
> > So I'm assuming there's some race condition going on somewhere.
> Which value? trace->nr?

Yep, trace->nr

> > I'm still debugging, but I'm open to ideas and I could definitely be
> > wrong here, so please feel free to correct me / point things out.
>
> I should be able to have a look tomorrow evening, as I am currently a
> bit overloaded with my work.

Awesome, thank you. I'll try to dig around a bit more meanwhile.

--
Regards,
listout