[syzbot] [kernel?] KASAN: stack-out-of-bounds Read in __show_regs (2)


syzbot

Jun 15, 2024, 3:06:24 AM
to b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
Hello,

syzbot found the following issue on:

HEAD commit: a957267fa7e9 Add linux-next specific files for 20240611
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=171e6e56980000
kernel config: https://syzkaller.appspot.com/x/.config?x=9a880e96898e79f8
dashboard link: https://syzkaller.appspot.com/bug?extid=e9be5674af5e3a0b9ecc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/6451759a606b/disk-a957267f.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/7f635dbe5b8a/vmlinux-a957267f.xz
kernel image: https://storage.googleapis.com/syzbot-assets/33eafd1b8aec/bzImage-a957267f.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+e9be56...@syzkaller.appspotmail.com

asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
RIP: 0010:synchronize_rcu+0x0/0x360 kernel/rcu/tree.c:4005
Code: Unable to access opcode bytes at 0x1ffff920012daeae.
RSP: 76c0:0000000000000203 EFLAGS: 1ffff920012daed4
==================================================================
BUG: KASAN: stack-out-of-bounds in __show_regs+0xa6/0x610 arch/x86/kernel/process_64.c:83
Read of size 8 at addr ffffc900096d7618 by task syz-executor.0/9934

CPU: 1 PID: 9934 Comm: syz-executor.0 Not tainted 6.10.0-rc3-next-20240611-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:91 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:117
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
__show_regs+0xa6/0x610 arch/x86/kernel/process_64.c:83
show_trace_log_lvl+0x3d4/0x520 arch/x86/kernel/dumpstack.c:301
sched_show_task+0x578/0x740 kernel/sched/core.c:7432
report_rtnl_holders+0x1ba/0x2d0 net/core/rtnetlink.c:104
call_timer_fn+0x18e/0x650 kernel/time/timer.c:1792
expire_timers kernel/time/timer.c:1843 [inline]
__run_timers kernel/time/timer.c:2417 [inline]
__run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2428
run_timer_base kernel/time/timer.c:2437 [inline]
run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2447
handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
__do_softirq kernel/softirq.c:588 [inline]
invoke_softirq kernel/softirq.c:428 [inline]
__irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0xd8/0x140 kernel/locking/spinlock.c:194
Code: 9c 8f 44 24 20 42 80 3c 23 00 74 08 4c 89 f7 e8 4e d0 63 f6 f6 44 24 21 02 75 52 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> c3 eb cc f5 65 8b 05 74 4f 6e 74 85 c0 74 43 48 c7 04 24 0e 36
RSP: 0018:ffffc9000330f600 EFLAGS: 00000206
RAX: 4e6b703e49168e00 RBX: 1ffff92000661ec4 RCX: ffffffff816fafba
RDX: dffffc0000000000 RSI: ffffffff8bcac1a0 RDI: 0000000000000001
RBP: ffffc9000330f690 R08: ffffffff92fcb7ef R09: 1ffffffff25f96fd
R10: dffffc0000000000 R11: fffffbfff25f96fe R12: dffffc0000000000
R13: 1ffff92000661ec0 R14: ffffc9000330f620 R15: 0000000000000246
spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
__wake_up_common_lock+0x18c/0x1e0 kernel/sched/wait.c:108
__unix_dgram_recvmsg+0x5f4/0x12f0 net/unix/af_unix.c:2415
sock_recvmsg_nosec+0x18e/0x1d0 net/socket.c:1046
____sys_recvmsg+0x3c0/0x470 net/socket.c:2802
___sys_recvmsg net/socket.c:2846 [inline]
do_recvmmsg+0x474/0xae0 net/socket.c:2940
__sys_recvmmsg net/socket.c:3019 [inline]
__do_sys_recvmmsg net/socket.c:3042 [inline]
__se_sys_recvmmsg net/socket.c:3035 [inline]
__x64_sys_recvmmsg+0x199/0x250 net/socket.c:3035
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f53cfc7cea9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f53d0a0c0c8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 00007f53cfdb4050 RCX: 00007f53cfc7cea9
RDX: 0000000000010106 RSI: 00000000200000c0 RDI: 0000000000000003
RBP: 00007f53cfcebff4 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000006e R14: 00007f53cfdb4050 R15: 00007fffb8c061e8
</TASK>

The buggy address belongs to the virtual mapping at
[ffffc900096d0000, ffffc900096d9000) created by:
copy_process+0x5d1/0x3dc0 kernel/fork.c:2201

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88807c7f0000 pfn:0x7c7f0
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000000 0000000000000000 dead000000000122 0000000000000000
raw: ffff88807c7f0000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x102dc2(GFP_HIGHUSER|__GFP_NOWARN|__GFP_ZERO), pid 2, tgid 2 (kthreadd), ts 126928355749, free_ts 121449634751
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1470
prep_new_page mm/page_alloc.c:1478 [inline]
get_page_from_freelist+0x2cbd/0x2d70 mm/page_alloc.c:3457
__alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4715
alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2263
vm_area_alloc_pages mm/vmalloc.c:3567 [inline]
__vmalloc_area_node mm/vmalloc.c:3643 [inline]
__vmalloc_node_range_noprof+0x971/0x1460 mm/vmalloc.c:3824
alloc_thread_stack_node kernel/fork.c:310 [inline]
dup_task_struct+0x444/0x8c0 kernel/fork.c:1110
copy_process+0x5d1/0x3dc0 kernel/fork.c:2201
kernel_clone+0x226/0x8f0 kernel/fork.c:2778
kernel_thread+0x1bc/0x240 kernel/fork.c:2840
create_kthread kernel/kthread.c:412 [inline]
kthreadd+0x60d/0x810 kernel/kthread.c:765
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:144
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
page last free pid 5674 tgid 5674 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1089 [inline]
free_unref_folios+0x103a/0x1b00 mm/page_alloc.c:2669
folios_put_refs+0x76e/0x860 mm/swap.c:1020
free_pages_and_swap_cache+0x2ea/0x690 mm/swap_state.c:332
__tlb_batch_free_encoded_pages mm/mmu_gather.c:136 [inline]
tlb_batch_pages_flush mm/mmu_gather.c:149 [inline]
tlb_flush_mmu_free mm/mmu_gather.c:366 [inline]
tlb_flush_mmu+0x3a3/0x680 mm/mmu_gather.c:373
tlb_finish_mmu+0xd4/0x200 mm/mmu_gather.c:465
exit_mmap+0x44f/0xc80 mm/mmap.c:3395
__mmput+0x115/0x390 kernel/fork.c:1341
exit_mm+0x220/0x310 kernel/exit.c:565
do_exit+0x9aa/0x28e0 kernel/exit.c:861
do_group_exit+0x207/0x2c0 kernel/exit.c:1023
__do_sys_exit_group kernel/exit.c:1034 [inline]
__se_sys_exit_group kernel/exit.c:1032 [inline]
__x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1032
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Memory state around the buggy address:
ffffc900096d7500: 00 00 00 00 00 f3 f3 f3 f3 f3 f3 f3 00 00 00 00
ffffc900096d7580: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2
>ffffc900096d7600: 00 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
^
ffffc900096d7680: 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2 00 f3 f3 f3
ffffc900096d7700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
----------------
Code disassembly (best guess):
0: 9c pushf
1: 8f 44 24 20 pop 0x20(%rsp)
5: 42 80 3c 23 00 cmpb $0x0,(%rbx,%r12,1)
a: 74 08 je 0x14
c: 4c 89 f7 mov %r14,%rdi
f: e8 4e d0 63 f6 call 0xf663d062
14: f6 44 24 21 02 testb $0x2,0x21(%rsp)
19: 75 52 jne 0x6d
1b: 41 f7 c7 00 02 00 00 test $0x200,%r15d
22: 74 01 je 0x25
24: fb sti
25: bf 01 00 00 00 mov $0x1,%edi
* 2a: e8 c3 eb cc f5 call 0xf5ccebf2 <-- trapping instruction
2f: 65 8b 05 74 4f 6e 74 mov %gs:0x746e4f74(%rip),%eax # 0x746e4faa
36: 85 c0 test %eax,%eax
38: 74 43 je 0x7d
3a: 48 rex.W
3b: c7 .byte 0xc7
3c: 04 24 add $0x24,%al
3e: 0e (bad)
3f: 36 ss


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

syzbot

Jun 23, 2024, 7:38:22 AM
to b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
syzbot has found a reproducer for the following issue on:

HEAD commit: f76698bd9a8c Add linux-next specific files for 20240621
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=14293f0e980000
kernel config: https://syzkaller.appspot.com/x/.config?x=ca79e3c3b9118bd0
dashboard link: https://syzkaller.appspot.com/bug?extid=e9be5674af5e3a0b9ecc
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=118dd151980000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/f204c5d02251/disk-f76698bd.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/50289c7e8999/vmlinux-f76698bd.xz
kernel image: https://storage.googleapis.com/syzbot-assets/c360e133a94f/bzImage-f76698bd.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+e9be56...@syzkaller.appspotmail.com

Code: 8b 3d 3c 28 4a 0c 48 89 de 5b e9 d3 de 5a 00 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 8b 04 24 <65> 48 8b 0c 25 00 d6 03 00 65 8b 15 20 5b 70 7e f7 c2 00 01 ff 00
RSP: 0018:ffffc9000b8bf5b8 EFLAGS: 00000202
RAX: ffffffff81375c3f RBX: ffffffff81f9a77f RCX: ffff888025659e00
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
==================================================================
BUG: KASAN: out-of-bounds in __show_regs+0x172/0x610 arch/x86/kernel/process_64.c:87
Read of size 8 at addr ffffc9000b8bf528 by task swapper/1/0

CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.10.0-rc4-next-20240621-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
__show_regs+0x172/0x610 arch/x86/kernel/process_64.c:87
show_trace_log_lvl+0x3d4/0x520 arch/x86/kernel/dumpstack.c:301
sched_show_task+0x578/0x740 kernel/sched/core.c:7503
report_rtnl_holders+0x1ba/0x2d0 net/core/rtnetlink.c:104
call_timer_fn+0x18e/0x650 kernel/time/timer.c:1792
expire_timers kernel/time/timer.c:1843 [inline]
__run_timers kernel/time/timer.c:2417 [inline]
__run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2428
run_timer_base kernel/time/timer.c:2437 [inline]
run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2447
handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
__do_softirq kernel/softirq.c:588 [inline]
invoke_softirq kernel/softirq.c:428 [inline]
__irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
RIP: 0010:native_irq_disable arch/x86/include/asm/irqflags.h:37 [inline]
RIP: 0010:arch_local_irq_disable arch/x86/include/asm/irqflags.h:92 [inline]
RIP: 0010:acpi_safe_halt+0x21/0x30 drivers/acpi/processor_idle.c:113
Code: 90 90 90 90 90 90 90 90 90 65 48 8b 04 25 00 d6 03 00 48 f7 00 08 00 00 00 75 10 66 90 0f 00 2d d5 66 9c 00 f3 0f 1e fa fb f4 <fa> c3 cc cc cc cc 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90
RSP: 0018:ffffc900001a7d08 EFLAGS: 00000246
RAX: ffff8880176b0000 RBX: ffff88801b2db864 RCX: 00000000123988e9
RDX: 0000000000000001 RSI: ffff88801b2db800 RDI: ffff88801b2db864
RBP: 000000000003a6b8 R08: ffff8880b9537ccb R09: 1ffff110172a6f99
R10: dffffc0000000000 R11: ffffffff8b8d9ba0 R12: ffff88801a71e000
R13: 0000000000000000 R14: 0000000000000001 R15: ffffffff8ead7e20
acpi_idle_enter+0xe4/0x140 drivers/acpi/processor_idle.c:707
cpuidle_enter_state+0x112/0x480 drivers/cpuidle/cpuidle.c:267
cpuidle_enter+0x5d/0xa0 drivers/cpuidle/cpuidle.c:388
call_cpuidle kernel/sched/idle.c:155 [inline]
cpuidle_idle_call kernel/sched/idle.c:230 [inline]
do_idle+0x375/0x5d0 kernel/sched/idle.c:326
cpu_startup_entry+0x42/0x60 kernel/sched/idle.c:424
start_secondary+0x100/0x100 arch/x86/kernel/smpboot.c:313
common_startup_64+0x13e/0x147
</TASK>

The buggy address belongs to the virtual mapping at
[ffffc9000b8b8000, ffffc9000b8c1000) created by:
copy_process+0x5d1/0x3d90 kernel/fork.c:2206

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff8880678c2140 pfn:0x678c2
memcg:ffff88801a6b4002
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000000 0000000000000000 dead000000000122 0000000000000000
raw: ffff8880678c2140 0000000000000000 00000001ffffffff ffff88801a6b4002
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x102dc2(GFP_HIGHUSER|__GFP_NOWARN|__GFP_ZERO), pid 6303, tgid 6303 (syz-executor.4), ts 1539185935031, free_ts 1529629150697
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1500
prep_new_page mm/page_alloc.c:1508 [inline]
get_page_from_freelist+0x2ccb/0x2d80 mm/page_alloc.c:3487
__alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4745
alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2263
vm_area_alloc_pages mm/vmalloc.c:3576 [inline]
__vmalloc_area_node mm/vmalloc.c:3652 [inline]
__vmalloc_node_range_noprof+0x971/0x1460 mm/vmalloc.c:3833
alloc_thread_stack_node kernel/fork.c:313 [inline]
dup_task_struct+0x444/0x8c0 kernel/fork.c:1114
copy_process+0x5d1/0x3d90 kernel/fork.c:2206
kernel_clone+0x226/0x8f0 kernel/fork.c:2788
__do_sys_clone3 kernel/fork.c:3089 [inline]
__se_sys_clone3+0x2cb/0x350 kernel/fork.c:3073
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 6242 tgid 6242 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1093 [inline]
free_unref_page+0xd22/0xea0 mm/page_alloc.c:2651
__slab_free+0x31b/0x3d0 mm/slub.c:4384
qlink_free mm/kasan/quarantine.c:163 [inline]
qlist_free_all+0x9e/0x140 mm/kasan/quarantine.c:179
kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
__kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:322
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3975 [inline]
slab_alloc_node mm/slub.c:4037 [inline]
__do_kmalloc_node mm/slub.c:4157 [inline]
__kmalloc_noprof+0x1a3/0x400 mm/slub.c:4171
kmalloc_noprof include/linux/slab.h:664 [inline]
kzalloc_noprof include/linux/slab.h:778 [inline]
tomoyo_encode2 security/tomoyo/realpath.c:45 [inline]
tomoyo_encode+0x26f/0x540 security/tomoyo/realpath.c:80
tomoyo_path_perm+0x3ca/0x740 security/tomoyo/file.c:831
tomoyo_path_symlink+0xde/0x120 security/tomoyo/tomoyo.c:212
security_path_symlink+0xe3/0x140 security/security.c:1876
do_symlinkat+0x136/0x3a0 fs/namei.c:4530
__do_sys_symlink fs/namei.c:4553 [inline]
__se_sys_symlink fs/namei.c:4551 [inline]
__x64_sys_symlink+0x7a/0x90 fs/namei.c:4551
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Memory state around the buggy address:
ffffc9000b8bf400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffffc9000b8bf480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffffc9000b8bf500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
ffffc9000b8bf580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffffc9000b8bf600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
----------------
Code disassembly (best guess):
0: 8b 3d 3c 28 4a 0c mov 0xc4a283c(%rip),%edi # 0xc4a2842
6: 48 89 de mov %rbx,%rsi
9: 5b pop %rbx
a: e9 d3 de 5a 00 jmp 0x5adee2
f: 0f 1f 00 nopl (%rax)
12: 90 nop
13: 90 nop
14: 90 nop
15: 90 nop
16: 90 nop
17: 90 nop
18: 90 nop
19: 90 nop
1a: 90 nop
1b: 90 nop
1c: 90 nop
1d: 90 nop
1e: 90 nop
1f: 90 nop
20: 90 nop
21: 90 nop
22: f3 0f 1e fa endbr64
26: 48 8b 04 24 mov (%rsp),%rax
* 2a: 65 48 8b 0c 25 00 d6 mov %gs:0x3d600,%rcx <-- trapping instruction
31: 03 00
33: 65 8b 15 20 5b 70 7e mov %gs:0x7e705b20(%rip),%edx # 0x7e705b5a
3a: f7 c2 00 01 ff 00 test $0xff0100,%edx


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

Tetsuo Handa

Jun 24, 2024, 9:23:02 PM
to syzbot, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org, kasan-dev, linux-mm
Hello.

This report is triggered by my debug printk() patch at
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/net/core/rtnetlink.c?id=5210cbe9a47fc5c1f43ba16d481e6335f3e2f345
but I can't tell where the bug lives (an x86 bug, an mm bug, a KASAN bug, or a bug in my patch).
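
For context, the patch behind report_rtnl_holders() arms a timer that periodically prints which task is holding rtnl_mutex and dumps its stack. A minimal sketch of the idea follows; this is NOT the actual patch, and every identifier except report_rtnl_holders(), sched_show_task(), and rtnl_mutex is a hypothetical name:

```c
/*
 * Sketch only (assumed shape of the debug patch, not the real code).
 * The relevant property: sched_show_task() -> show_trace_log_lvl()
 * walks the holder's kernel stack from timer/softirq context while
 * that task may still be running on another CPU, so the unwinder can
 * read stack slots that are concurrently being pushed and popped.
 * That cross-task, best-effort stack walk is one plausible (but
 * unconfirmed) reading of why KASAN flags a stack-out-of-bounds
 * read in __show_regs() only with this debug patch applied.
 */
static struct timer_list rtnl_debug_timer;   /* hypothetical */
static unsigned long rtnl_lock_start;        /* jiffies when lock was taken */

static void report_rtnl_holders(struct timer_list *t)
{
	/* rtnl_mutex_owner() is an assumed helper, not a real API. */
	struct task_struct *owner = rtnl_mutex_owner();

	if (owner) {
		pr_info("DEBUG: holding rtnl_mutex for %lu jiffies.\n",
			jiffies - rtnl_lock_start);
		sched_show_task(owner);  /* dumps regs + stack of the holder */
	}
	mod_timer(&rtnl_debug_timer, jiffies + HZ);
}
```

The "DEBUG: holding rtnl_mutex for 937 jiffies." lines in the console log below match this shape; the target task (a running kworker in the log) is mid-execution when its stack is walked.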

On 2024/06/15 16:06, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: a957267fa7e9 Add linux-next specific files for 20240611
> git tree: linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=171e6e56980000
> kernel config: https://syzkaller.appspot.com/x/.config?x=9a880e96898e79f8
> dashboard link: https://syzkaller.appspot.com/bug?extid=e9be5674af5e3a0b9ecc
> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Quoting from https://syzkaller.appspot.com/text?tag=CrashReport&x=17786fb1980000
and https://syzkaller.appspot.com/text?tag=CrashLog&x=15e0202a980000 :

----------------------------------------
BUG: KASAN: stack-out-of-bounds in __show_regs+0xa6/0x610 arch/x86/kernel/process_64.c:83
Read of size 8 at addr ffffc90008807618 by task syz.0.1430/9588

CPU: 0 UID: 0 PID: 9588 Comm: syz.0.1430 Not tainted 6.10.0-rc5-next-20240624-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
__show_regs+0xa6/0x610 arch/x86/kernel/process_64.c:83
show_trace_log_lvl+0x3d4/0x520 arch/x86/kernel/dumpstack.c:301
sched_show_task+0x578/0x740 kernel/sched/core.c:7506
report_rtnl_holders+0x1ba/0x2d0 net/core/rtnetlink.c:104
call_timer_fn+0x18e/0x650 kernel/time/timer.c:1792
expire_timers kernel/time/timer.c:1843 [inline]
__run_timers kernel/time/timer.c:2417 [inline]
__run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2428
run_timer_base kernel/time/timer.c:2437 [inline]
run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2447
handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
__do_softirq kernel/softirq.c:588 [inline]
invoke_softirq kernel/softirq.c:428 [inline]
__irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0xd8/0x140 kernel/locking/spinlock.c:194
Code: 9c 8f 44 24 20 42 80 3c 23 00 74 08 4c 89 f7 e8 0e 94 61 f6 f6 44 24 21 02 75 52 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> c3 10 ca f5 65 8b 05 b4 54 6b 74 85 c0 74 43 48 c7 04 24 0e 36
RSP: 0018:ffffc9000407f600 EFLAGS: 00000206
RAX: 13958dc9d919f000 RBX: 1ffff9200080fec4 RCX: ffffffff816fd2da
RDX: dffffc0000000000 RSI: ffffffff8bcac820 RDI: 0000000000000001
RBP: ffffc9000407f690 R08: ffffffff92fe47ef R09: 1ffffffff25fc8fd
R10: dffffc0000000000 R11: fffffbfff25fc8fe R12: dffffc0000000000
R13: 1ffff9200080fec0 R14: ffffc9000407f620 R15: 0000000000000246
spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
__wake_up_common_lock+0x18c/0x1e0 kernel/sched/wait.c:108
__unix_dgram_recvmsg+0x5f4/0x12f0 net/unix/af_unix.c:2415
sock_recvmsg_nosec+0x18e/0x1d0 net/socket.c:1046
____sys_recvmsg+0x3c0/0x470 net/socket.c:2814
___sys_recvmsg net/socket.c:2858 [inline]
do_recvmmsg+0x474/0xae0 net/socket.c:2952
__sys_recvmmsg net/socket.c:3031 [inline]
__do_sys_recvmmsg net/socket.c:3054 [inline]
__se_sys_recvmmsg net/socket.c:3047 [inline]
__x64_sys_recvmmsg+0x199/0x250 net/socket.c:3047
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fdfbaf75d39
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fdfbbc7e048 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 00007fdfbb104070 RCX: 00007fdfbaf75d39
RDX: 0000000000010106 RSI: 00000000200000c0 RDI: 0000000000000003
RBP: 00007fdfbaff6766 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000006e R14: 00007fdfbb104070 R15: 00007ffeafeb36a8
</TASK>

The buggy address belongs to the virtual mapping at
[ffffc90008800000, ffffc90008809000) created by:
copy_process+0x5d1/0x3d90 kernel/fork.c:2206

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x295f2
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000000 0000000000000000 dead000000000122 0000000000000000
raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x2dc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_ZERO), pid 1052, tgid 1052 (kworker/u8:5), ts 20453244600, free_ts 0
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1500
prep_new_page mm/page_alloc.c:1508 [inline]
get_page_from_freelist+0x2ccb/0x2d80 mm/page_alloc.c:3487
__alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4745
alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2263
vm_area_alloc_pages mm/vmalloc.c:3576 [inline]
__vmalloc_area_node mm/vmalloc.c:3652 [inline]
__vmalloc_node_range_noprof+0x971/0x1460 mm/vmalloc.c:3833
alloc_thread_stack_node kernel/fork.c:313 [inline]
dup_task_struct+0x444/0x8c0 kernel/fork.c:1114
copy_process+0x5d1/0x3d90 kernel/fork.c:2206
kernel_clone+0x226/0x8f0 kernel/fork.c:2788
user_mode_thread+0x132/0x1a0 kernel/fork.c:2866
call_usermodehelper_exec_work+0x5c/0x230 kernel/umh.c:172
process_one_work kernel/workqueue.c:3224 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3305
worker_thread+0x86d/0xd40 kernel/workqueue.c:3383
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:144
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
page_owner free stack trace missing

Memory state around the buggy address:
ffffc90008807500: 00 00 00 00 00 f3 f3 f3 f3 f3 f3 f3 00 00 00 00
ffffc90008807580: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2
>ffffc90008807600: 00 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
^
ffffc90008807680: 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2 00 f3 f3 f3
ffffc90008807700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
----------------------------------------

----------------------------------------
[ 560.831831][ C0] DEBUG: holding rtnl_mutex for 937 jiffies.
[ 560.838015][ C0] task:kworker/u8:9 state:R running task stack:20216 pid:2460 tgid:2460 ppid:2 flags:0x00004000
[ 560.849882][ C0] Workqueue: netns cleanup_net
[ 560.854770][ C0] Call Trace:
[ 560.854789][ C0] <TASK>
[ 560.872376][ C0] __schedule+0x17e8/0x4a20
[ 560.877336][ C0] ? mark_lock+0x9a/0x360
[ 560.881823][ C0] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 560.887887][ C0] ? __virt_addr_valid+0x183/0x520
[ 560.893171][ C0] ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[ 560.899593][ C0] ? lock_release+0xbf/0x9f0
[ 560.904330][ C0] ? __pfx___schedule+0x10/0x10
[ 560.909271][ C0] ? lockdep_hardirqs_on+0x99/0x150
[ 560.914617][ C0] ? mark_lock+0x9a/0x360
[ 560.919119][ C0] preempt_schedule_irq+0xfb/0x1c0
[ 560.924392][ C0] ? __pfx_preempt_schedule_irq+0x10/0x10
[ 560.931783][ C0] irqentry_exit+0x5e/0x90
[ 560.936590][ C0] asm_sysvec_reschedule_ipi+0x1a/0x20
[ 560.942783][ C0] RIP: 0010:synchronize_rcu+0x0/0x360
[ 560.948403][ C0] Code: e1 07 80 c1 03 38 c1 0f 8c 97 fe ff ff 4c 89 f7 e8 15 50 80 00 e9 8a fe ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 <f3> 0f 1e fa 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 e4 e0 48
[ 560.968242][ C0] RSP: 76c0:0000000000000a06 EFLAGS: 1ffff92001100ed4
[ 560.975129][ C0] ==================================================================
[ 560.994479][ C0] BUG: KASAN: stack-out-of-bounds in __show_regs+0xa6/0x610
[ 561.002642][ C0] Read of size 8 at addr ffffc90008807618 by task syz.0.1430/9588
[ 561.014598][ C0]
[ 561.017321][ C0] CPU: 0 UID: 0 PID: 9588 Comm: syz.0.1430 Not tainted 6.10.0-rc5-next-20240624-syzkaller #0
[ 561.028952][ C0] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
[ 561.043847][ C0] Call Trace:
[ 561.047213][ C0] <IRQ>
[ 561.050101][ C0] dump_stack_lvl+0x241/0x360
[ 561.054963][ C0] ? __pfx_dump_stack_lvl+0x10/0x10
[ 561.073046][ C0] ? __pfx__printk+0x10/0x10
[ 561.077786][ C0] ? _printk+0xd5/0x120
[ 561.082123][ C0] print_report+0x169/0x550
[ 561.086775][ C0] ? __virt_addr_valid+0xbd/0x520
[ 561.091947][ C0] ? __show_regs+0xa6/0x610
[ 561.096544][ C0] kasan_report+0x143/0x180
[ 561.101170][ C0] ? show_opcodes+0x148/0x170
[ 561.105909][ C0] ? __show_regs+0xa6/0x610
[ 561.110457][ C0] __show_regs+0xa6/0x610
[ 561.114858][ C0] ? asm_sysvec_reschedule_ipi+0x1a/0x20
[ 561.120539][ C0] ? asm_sysvec_reschedule_ipi+0x1a/0x20
[ 561.126227][ C0] show_trace_log_lvl+0x3d4/0x520
[ 561.131292][ C0] ? __pfx_synchronize_rcu+0x10/0x10
[ 561.136630][ C0] sched_show_task+0x578/0x740
[ 561.141466][ C0] ? report_rtnl_holders+0x183/0x2d0
[ 561.147055][ C0] ? __pfx__printk+0x10/0x10
[ 561.151699][ C0] ? __pfx_sched_show_task+0x10/0x10
[ 561.157153][ C0] report_rtnl_holders+0x1ba/0x2d0
[ 561.162519][ C0] ? report_rtnl_holders+0x20/0x2d0
[ 561.167755][ C0] call_timer_fn+0x18e/0x650
[ 561.172361][ C0] ? call_timer_fn+0xc0/0x650
[ 561.177086][ C0] ? __pfx_report_rtnl_holders+0x10/0x10
[ 561.182785][ C0] ? __pfx_call_timer_fn+0x10/0x10
[ 561.187939][ C0] ? __pfx_report_rtnl_holders+0x10/0x10
[ 561.193631][ C0] ? __pfx_report_rtnl_holders+0x10/0x10
[ 561.199303][ C0] ? __pfx_report_rtnl_holders+0x10/0x10
[ 561.204994][ C0] ? _raw_spin_unlock_irq+0x23/0x50
[ 561.210231][ C0] ? lockdep_hardirqs_on+0x99/0x150
[ 561.215469][ C0] ? __pfx_report_rtnl_holders+0x10/0x10
[ 561.221120][ C0] __run_timer_base+0x66a/0x8e0
[ 561.226093][ C0] ? __pfx___run_timer_base+0x10/0x10
[ 561.231493][ C0] ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[ 561.237874][ C0] run_timer_softirq+0xb7/0x170
[ 561.242832][ C0] handle_softirqs+0x2c4/0x970
[ 561.247626][ C0] ? __irq_exit_rcu+0xf4/0x1c0
[ 561.252429][ C0] ? __pfx_handle_softirqs+0x10/0x10
[ 561.257856][ C0] ? irqtime_account_irq+0xd4/0x1e0
[ 561.263090][ C0] __irq_exit_rcu+0xf4/0x1c0
[ 561.267711][ C0] ? __pfx___irq_exit_rcu+0x10/0x10
[ 561.272931][ C0] irq_exit_rcu+0x9/0x30
[ 561.277231][ C0] sysvec_apic_timer_interrupt+0xa6/0xc0
[ 561.283185][ C0] </IRQ>
[ 561.286769][ C0] <TASK>
[ 561.289972][ C0] asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 561.297172][ C0] RIP: 0010:_raw_spin_unlock_irqrestore+0xd8/0x140
[ 561.307112][ C0] Code: 9c 8f 44 24 20 42 80 3c 23 00 74 08 4c 89 f7 e8 0e 94 61 f6 f6 44 24 21 02 75 52 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> c3 10 ca f5 65 8b 05 b4 54 6b 74 85 c0 74 43 48 c7 04 24 0e 36
[ 561.327228][ C0] RSP: 0018:ffffc9000407f600 EFLAGS: 00000206
[ 561.333355][ C0] RAX: 13958dc9d919f000 RBX: 1ffff9200080fec4 RCX: ffffffff816fd2da
[ 561.341352][ C0] RDX: dffffc0000000000 RSI: ffffffff8bcac820 RDI: 0000000000000001
[ 561.349458][ C0] RBP: ffffc9000407f690 R08: ffffffff92fe47ef R09: 1ffffffff25fc8fd
[ 561.357460][ C0] R10: dffffc0000000000 R11: fffffbfff25fc8fe R12: dffffc0000000000
[ 561.365478][ C0] R13: 1ffff9200080fec0 R14: ffffc9000407f620 R15: 0000000000000246
[ 561.373533][ C0] ? mark_lock+0x9a/0x360
[ 561.378221][ C0] ? __pfx__raw_spin_unlock_irqrestore+0x10/0x10
[ 561.385142][ C0] ? autoremove_wake_function+0x37/0x110
[ 561.391145][ C0] __wake_up_common_lock+0x18c/0x1e0
[ 561.396936][ C0] __unix_dgram_recvmsg+0x5f4/0x12f0
[ 561.403018][ C0] ? __pfx___unix_dgram_recvmsg+0x10/0x10
[ 561.409788][ C0] ? __pfx___might_resched+0x10/0x10
[ 561.415745][ C0] ? iovec_from_user+0x61/0x240
[ 561.421927][ C0] ? unix_dgram_recvmsg+0xb6/0xe0
[ 561.427965][ C0] ? __pfx_unix_dgram_recvmsg+0x10/0x10
[ 561.435584][ C0] sock_recvmsg_nosec+0x18e/0x1d0
[ 561.441322][ C0] ____sys_recvmsg+0x3c0/0x470
[ 561.446583][ C0] ? __pfx_____sys_recvmsg+0x10/0x10
[ 561.455788][ C0] ? __might_fault+0xaa/0x120
[ 561.460634][ C0] do_recvmmsg+0x474/0xae0
[ 561.465088][ C0] ? __pfx___futex_wait+0x10/0x10
[ 561.470148][ C0] ? __pfx_do_recvmmsg+0x10/0x10
[ 561.475130][ C0] ? __pfx_futex_wake_mark+0x10/0x10
[ 561.480509][ C0] ? futex_wait+0x285/0x360
[ 561.485124][ C0] ? __pfx_futex_wait+0x10/0x10
[ 561.490014][ C0] ? fd_install+0x9c/0x5d0
[ 561.494459][ C0] ? __pfx_lock_release+0x10/0x10
[ 561.499504][ C0] ? __pfx_do_futex+0x10/0x10
[ 561.504229][ C0] __x64_sys_recvmmsg+0x199/0x250
[ 561.510481][ C0] ? __pfx___x64_sys_recvmmsg+0x10/0x10
[ 561.517399][ C0] ? do_syscall_64+0x100/0x230
[ 561.522660][ C0] ? do_syscall_64+0xb6/0x230
[ 561.529823][ C0] do_syscall_64+0xf3/0x230
[ 561.534742][ C0] ? clear_bhb_loop+0x35/0x90
[ 561.540096][ C0] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 561.546133][ C0] RIP: 0033:0x7fdfbaf75d39
[ 561.550744][ C0] Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
[ 561.571064][ C0] RSP: 002b:00007fdfbbc7e048 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
[ 561.580376][ C0] RAX: ffffffffffffffda RBX: 00007fdfbb104070 RCX: 00007fdfbaf75d39
[ 561.588397][ C0] RDX: 0000000000010106 RSI: 00000000200000c0 RDI: 0000000000000003
[ 561.596400][ C0] RBP: 00007fdfbaff6766 R08: 0000000000000000 R09: 0000000000000000
[ 561.604404][ C0] R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
[ 561.612415][ C0] R13: 000000000000006e R14: 00007fdfbb104070 R15: 00007ffeafeb36a8
[ 561.620458][ C0] </TASK>
[ 561.623517][ C0]
[ 561.625876][ C0] The buggy address belongs to the virtual mapping at
[ 561.625876][ C0] [ffffc90008800000, ffffc90008809000) created by:
[ 561.625876][ C0] copy_process+0x5d1/0x3d90
[ 561.643549][ C0]
[ 561.645879][ C0] The buggy address belongs to the physical page:
[ 561.652306][ C0] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x295f2
[ 561.661135][ C0] flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
[ 561.668346][ C0] raw: 00fff00000000000 0000000000000000 dead000000000122 0000000000000000
[ 561.677050][ C0] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
[ 561.685659][ C0] page dumped because: kasan: bad access detected
[ 561.692219][ C0] page_owner tracks the page as allocated
[ 561.697979][ C0] page last allocated via order 0, migratetype Unmovable, gfp_mask 0x2dc2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_ZERO), pid 1052, tgid 1052 (kworker/u8:5), ts 20453244600, free_ts 0
[ 561.716523][ C0] post_alloc_hook+0x1f3/0x230
[ 561.721344][ C0] get_page_from_freelist+0x2ccb/0x2d80
[ 561.727009][ C0] __alloc_pages_noprof+0x256/0x6c0
[ 561.732233][ C0] alloc_pages_mpol_noprof+0x3e8/0x680
[ 561.737727][ C0] __vmalloc_node_range_noprof+0x971/0x1460
[ 561.743664][ C0] dup_task_struct+0x444/0x8c0
[ 561.748479][ C0] copy_process+0x5d1/0x3d90
[ 561.753128][ C0] kernel_clone+0x226/0x8f0
[ 561.757766][ C0] user_mode_thread+0x132/0x1a0
[ 561.762660][ C0] call_usermodehelper_exec_work+0x5c/0x230
[ 561.768674][ C0] process_scheduled_works+0xa2c/0x1830
[ 561.774240][ C0] worker_thread+0x86d/0xd40
[ 561.778849][ C0] kthread+0x2f0/0x390
[ 561.782979][ C0] ret_from_fork+0x4b/0x80
[ 561.787453][ C0] ret_from_fork_asm+0x1a/0x30
[ 561.792332][ C0] page_owner free stack trace missing
[ 561.797698][ C0]
[ 561.800029][ C0] Memory state around the buggy address:
[ 561.805664][ C0] ffffc90008807500: 00 00 00 00 00 f3 f3 f3 f3 f3 f3 f3 00 00 00 00
[ 561.813728][ C0] ffffc90008807580: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2
[ 561.821814][ C0] >ffffc90008807600: 00 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
[ 561.829912][ C0] ^
[ 561.834781][ C0] ffffc90008807680: 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2 00 f3 f3 f3
[ 561.842858][ C0] ffffc90008807700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 561.851024][ C0] ==================================================================
[ 561.859206][ C0] Kernel panic - not syncing: KASAN: panic_on_warn set ...
[ 561.866452][ C0] CPU: 0 UID: 0 PID: 9588 Comm: syz.0.1430 Not tainted 6.10.0-rc5-next-20240624-syzkaller #0
----------------------------------------

arch/x86/kernel/process_64.c:83 is

printk("%sRAX: %016lx RBX: %016lx RCX: %016lx\n",
       log_lvl, regs->ax, regs->bx, regs->cx);

(which looks like nothing special). The kernel stack area [ffffc90008800000, ffffc90008809000)
is 32768 bytes + 4096 bytes, and ffffc90008807618 falls within the 32768 bytes; both of
these look sane to me.

Kernel config is https://syzkaller.appspot.com/text?tag=KernelConfig&x=6221d1071c39b052 .
Can somebody find what is wrong?

Tetsuo Handa

Jul 1, 2024, 8:43:48 AM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, kasan-dev, linux-mm, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, mi...@redhat.com, tg...@linutronix.de, x...@kernel.org
Hello, KASAN people.

I suspect that KASAN's metadata for kernel stack memory got out of sync for
some unknown reason, for the stack trace of PID=7558 was successfully printed
twice before KASAN complained when trying to print it for the third time.
Could you decode what this KASAN message is saying?

Quoting from https://syzkaller.appspot.com/text?tag=CrashLog&x=119fd081980000 :

[ 226.304791][ C0] DEBUG: holding rtnl_mutex for 555 jiffies.
[ 226.310883][ C0] task:syz.1.750 state:R running task stack:24672 pid:7558 tgid:7558 ppid:5945 flags:0x00004006
[ 226.322705][ C0] Call Trace:
[ 226.326074][ C0] <TASK>
[ 226.329029][ C0] __schedule+0x17e8/0x4a20
[ 226.333672][ C0] ? __pfx___schedule+0x10/0x10
[ 226.338599][ C0] ? __pfx_lock_release+0x10/0x10
[ 226.343744][ C0] ? _raw_spin_unlock_irqrestore+0xdd/0x140
[ 226.349775][ C0] ? schedule+0x90/0x320
[ 226.354042][ C0] schedule+0x14b/0x320
[ 226.358254][ C0] synchronize_rcu_expedited+0x684/0x830
[ 226.364012][ C0] ? __pfx_synchronize_rcu_expedited+0x10/0x10
[ 226.370239][ C0] ? __pfx_wait_rcu_exp_gp+0x10/0x10
[ 226.375579][ C0] ? __pfx___might_resched+0x10/0x10
[ 226.380877][ C0] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 226.386897][ C0] ? __pfx_autoremove_wake_function+0x10/0x10
[ 226.392985][ C0] ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[ 226.399381][ C0] synchronize_rcu+0x11b/0x360
[ 226.404181][ C0] ? __pfx_synchronize_rcu+0x10/0x10
[ 226.409555][ C0] lockdep_unregister_key+0x556/0x610
[ 226.414991][ C0] ? __pfx_lockdep_unregister_key+0x10/0x10
[ 226.420908][ C0] ? lockdep_hardirqs_on+0x99/0x150
[ 226.426160][ C0] ? __qdisc_destroy+0x150/0x410
[ 226.431211][ C0] ? kfree+0x149/0x360
[ 226.435414][ C0] ? __pfx_pfifo_fast_destroy+0x10/0x10
[ 226.441013][ C0] __qdisc_destroy+0x165/0x410
[ 226.445888][ C0] dev_shutdown+0x9b/0x440
[ 226.450334][ C0] unregister_netdevice_many_notify+0x9c7/0x1d20
[ 226.456717][ C0] ? __lock_acquire+0x1359/0x2000
[ 226.461786][ C0] ? __pfx_unregister_netdevice_many_notify+0x10/0x10
[ 226.468602][ C0] ? __asan_memset+0x23/0x50
[ 226.473208][ C0] ? skb_queue_purge_reason+0x2de/0x500
[ 226.478815][ C0] ? __asan_memset+0x23/0x50
[ 226.483423][ C0] ? skb_queue_purge_reason+0x2de/0x500
[ 226.489025][ C0] ? do_raw_spin_unlock+0x13c/0x8b0
[ 226.494244][ C0] unregister_netdevice_queue+0x303/0x370
[ 226.500013][ C0] ? __pfx_unregister_netdevice_queue+0x10/0x10
[ 226.506317][ C0] __tun_detach+0x6b6/0x1600
[ 226.510942][ C0] tun_chr_close+0x108/0x1b0
[ 226.515574][ C0] ? __pfx_tun_chr_close+0x10/0x10
[ 226.520705][ C0] __fput+0x24a/0x8a0
[ 226.524742][ C0] task_work_run+0x24f/0x310
[ 226.529352][ C0] ? __pfx_task_work_run+0x10/0x10
[ 226.534482][ C0] ? syscall_exit_to_user_mode+0xa3/0x370
[ 226.540265][ C0] syscall_exit_to_user_mode+0x168/0x370
[ 226.545967][ C0] do_syscall_64+0x100/0x230
[ 226.550580][ C0] ? clear_bhb_loop+0x35/0x90
[ 226.555306][ C0] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 226.561215][ C0] RIP: 0033:0x7f2470375b99
[ 226.565668][ C0] RSP: 002b:00007ffd71072888 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4
[ 226.574109][ C0] RAX: 0000000000000000 RBX: 00007f2470505aa0 RCX: 00007f2470375b99
[ 226.582119][ C0] RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003
[ 226.590129][ C0] RBP: 00007f2470505aa0 R08: 0000000000000008 R09: 0000000871072b9f
[ 226.598153][ C0] R10: 00000000005f3974 R11: 0000000000000246 R12: 000000000003603f
[ 226.606256][ C0] R13: 0000000000000032 R14: 00007f2470505aa0 R15: 00007f2470504078
[ 226.614265][ C0] </TASK>
[ 226.617357][ C0]
[ 226.617357][ C0] Showing all locks held in the system:
[ 226.625107][ C0] 3 locks held by kworker/u8:0/11:
[ 226.630259][ C0] 2 locks held by getty/4850:
[ 226.634966][ C0] #0: ffff88802a6860a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70
[ 226.644790][ C0] #1: ffffc9000312b2f0 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x6b5/0x1e10
[ 226.654971][ C0] 2 locks held by syz.4.185/5862:
[ 226.660012][ C0] 2 locks held by syz.3.241/6079:
[ 226.665073][ C0] 1 lock held by syz-executor/7179:
[ 226.670288][ C0] 2 locks held by syz.1.750/7558:
[ 226.675345][ C0] #0: ffffffff8f5feac8 (rtnl_mutex){+.+.}-{3:3}, at: tun_chr_close+0x3e/0x1b0
[ 226.684379][ C0] #1: ffffffff8e33abf8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: synchronize_rcu_expedited+0x381/0x830
[ 226.695348][ C0] 2 locks held by syz.0.757/7594:
[ 226.700384][ C0] #0: ffffffff8f6682f0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40
[ 226.708731][ C0] #1: ffffffff8f5feac8 (rtnl_mutex){+.+.}-{3:3}, at: nl80211_pre_doit+0x5f/0x8b0
[ 226.718045][ C0] 1 lock held by syz.3.764/7609:
[ 226.723082][ C0] 2 locks held by syz.2.767/7615:
[ 226.728152][ C0] #0: ffffc90000007c00 (net/core/rtnetlink.c:83){+.-.}-{0:0}, at: call_timer_fn+0xc0/0x650
[ 226.738338][ C0] #1: ffffffff8e335820 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x55/0x2a0
[ 226.748268][ C0] 3 locks held by syz-executor/7617:
[ 226.753568][ C0]
[ 226.755932][ C0] =============================================
[ 226.755932][ C0]
[ 227.343278][ T5144] usb 3-1: new high-speed USB device number 9 using dummy_hcd
[ 227.536711][ T5144] usb 3-1: config 0 interface 0 altsetting 0 endpoint 0x81 has an invalid bInterval 0, changing to 7
[ 227.568067][ T5144] usb 3-1: config 0 interface 0 altsetting 0 endpoint 0x81 has invalid wMaxPacketSize 0
[ 227.598514][ T5144] usb 3-1: New USB device found, idVendor=05ac, idProduct=022a, bcdDevice= 0.00
[ 227.621562][ T5144] usb 3-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 227.651457][ T5144] usb 3-1: config 0 descriptor??
[ 227.814998][ C0] DEBUG: holding rtnl_mutex for 706 jiffies.
[ 227.821075][ C0] task:syz.1.750 state:D stack:24672 pid:7558 tgid:7558 ppid:5945 flags:0x00004006
[ 227.831413][ C0] Call Trace:
[ 227.834732][ C0] <TASK>
[ 227.837686][ C0] __schedule+0x17e8/0x4a20
[ 227.842246][ C0] ? __pfx___schedule+0x10/0x10
[ 227.847165][ C0] ? __pfx_lock_release+0x10/0x10
[ 227.852388][ C0] ? _raw_spin_unlock_irqrestore+0xdd/0x140
[ 227.858350][ C0] ? schedule+0x90/0x320
[ 227.862622][ C0] schedule+0x14b/0x320
[ 227.866849][ C0] synchronize_rcu_expedited+0x684/0x830
[ 227.872517][ C0] ? __pfx_synchronize_rcu_expedited+0x10/0x10
[ 227.878773][ C0] ? __pfx_wait_rcu_exp_gp+0x10/0x10
[ 227.884098][ C0] ? __pfx___might_resched+0x10/0x10
[ 227.889440][ C0] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 227.895470][ C0] ? __pfx_autoremove_wake_function+0x10/0x10
[ 227.901561][ C0] ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[ 227.907960][ C0] synchronize_rcu+0x11b/0x360
[ 227.912758][ C0] ? __pfx_synchronize_rcu+0x10/0x10
[ 227.918117][ C0] lockdep_unregister_key+0x556/0x610
[ 227.923521][ C0] ? __pfx_lockdep_unregister_key+0x10/0x10
[ 227.929465][ C0] ? lockdep_hardirqs_on+0x99/0x150
[ 227.934718][ C0] ? __qdisc_destroy+0x150/0x410
[ 227.939964][ C0] ? kfree+0x149/0x360
[ 227.944062][ C0] ? __pfx_pfifo_fast_destroy+0x10/0x10
[ 227.949668][ C0] __qdisc_destroy+0x165/0x410
[ 227.954466][ C0] dev_shutdown+0x9b/0x440
[ 227.958940][ C0] unregister_netdevice_many_notify+0x9c7/0x1d20
[ 227.965321][ C0] ? __lock_acquire+0x1359/0x2000
[ 227.970390][ C0] ? __pfx_unregister_netdevice_many_notify+0x10/0x10
[ 227.977386][ C0] ? __asan_memset+0x23/0x50
[ 227.982011][ C0] ? skb_queue_purge_reason+0x2de/0x500
[ 227.987621][ C0] ? __asan_memset+0x23/0x50
[ 227.992237][ C0] ? skb_queue_purge_reason+0x2de/0x500
[ 227.997854][ C0] ? do_raw_spin_unlock+0x13c/0x8b0
[ 228.003082][ C0] unregister_netdevice_queue+0x303/0x370
[ 228.008865][ C0] ? __pfx_unregister_netdevice_queue+0x10/0x10
[ 228.015177][ C0] __tun_detach+0x6b6/0x1600
[ 228.019812][ C0] tun_chr_close+0x108/0x1b0
[ 228.024437][ C0] ? __pfx_tun_chr_close+0x10/0x10
[ 228.029623][ C0] __fput+0x24a/0x8a0
[ 228.033651][ C0] task_work_run+0x24f/0x310
[ 228.038319][ C0] ? __pfx_task_work_run+0x10/0x10
[ 228.043546][ C0] ? syscall_exit_to_user_mode+0xa3/0x370
[ 228.049341][ C0] syscall_exit_to_user_mode+0x168/0x370
[ 228.055042][ C0] do_syscall_64+0x100/0x230
[ 228.059667][ C0] ? clear_bhb_loop+0x35/0x90
[ 228.064374][ C0] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 228.070336][ C0] RIP: 0033:0x7f2470375b99
[ 228.074815][ C0] RSP: 002b:00007ffd71072888 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4
[ 228.083252][ C0] RAX: 0000000000000000 RBX: 00007f2470505aa0 RCX: 00007f2470375b99
[ 228.091289][ C0] RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003
[ 228.091369][ T5144] appletouch 3-1:0.0: Geyser mode initialized.
[ 228.099291][ C0] RBP: 00007f2470505aa0 R08: 0000000000000008 R09: 0000000871072b9f
[ 228.099348][ C0] R10: 00000000005f3974 R11: 0000000000000246 R12: 000000000003603f
[ 228.099364][ C0] R13: 0000000000000032 R14: 00007f2470505aa0 R15: 00007f2470504078
[ 228.099396][ C0] </TASK>
[ 228.099408][ C0]
[ 228.099408][ C0] Showing all locks held in the system:
[ 228.099452][ C0] 5 locks held by kworker/u8:7/2895:
[ 228.136372][ T5144] input: appletouch as /devices/platform/dummy_hcd.2/usb3/3-1/3-1:0.0/input/input11
[ 228.140345][ C0] 2 locks held by getty/4850:
[ 228.140365][ C0] #0: ffff88802a6860a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70
[ 228.140438][ C0] #1: ffffc9000312b2f0 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x6b5/0x1e10
[ 228.140505][ C0] 6 locks held by kworker/1:5/5144:
[ 228.140523][ C0] 2 locks held by syz.4.185/5862:
[ 228.140539][ C0] 4 locks held by syz.3.241/6079:
[ 228.140557][ C0] 2 locks held by syz.1.750/7558:
[ 228.140571][ C0] #0: ffffffff8f5feac8 (rtnl_mutex){+.+.}-{3:3}, at: tun_chr_close+0x3e/0x1b0
[ 228.140639][ C0] #1: ffffffff8e33abf8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: synchronize_rcu_expedited+0x381/0x830
[ 228.140716][ C0] 2 locks held by syz.0.757/7594:
[ 228.140730][ C0] #0: ffffffff8f6682f0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40
[ 228.140791][ C0] #1: ffffffff8f5feac8 (rtnl_mutex){+.+.}-{3:3}, at: nl80211_pre_doit+0x5f/0x8b0
[ 228.140857][ C0] 2 locks held by syz.3.775/7633:
[ 228.140871][ C0] #0: ffffc9000499fb40 (net/core/rtnetlink.c:83){+.-.}-{0:0}, at: call_timer_fn+0xc0/0x650
[ 228.140931][ C0] #1: ffffffff8e335820 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x55/0x2a0
[ 228.140997][ C0]
[ 228.141004][ C0] =============================================
[ 228.141004][ C0]
[ 229.184903][ C0] DEBUG: holding rtnl_mutex for 843 jiffies.
[ 229.190994][ C0] task:syz.1.750 state:R running task stack:24672 pid:7558 tgid:7558 ppid:5945 flags:0x00004006
[ 229.202817][ C0] Call Trace:
[ 229.206142][ C0] <TASK>
[ 229.209088][ C0] __schedule+0x17e8/0x4a20
[ 229.213626][ C0] ? mark_lock+0x9a/0x360
[ 229.218009][ C0] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 229.224013][ C0] ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[ 229.230388][ C0] ? __pfx___schedule+0x10/0x10
[ 229.235286][ C0] ? lockdep_hardirqs_on+0x99/0x150
[ 229.240775][ C0] ? mark_lock+0x9a/0x360
[ 229.245156][ C0] preempt_schedule_irq+0xfb/0x1c0
[ 229.250289][ C0] ? __pfx_preempt_schedule_irq+0x10/0x10
[ 229.256053][ C0] ? __pfx_pfifo_fast_destroy+0x10/0x10
[ 229.261637][ C0] irqentry_exit+0x5e/0x90
[ 229.266098][ C0] asm_sysvec_reschedule_ipi+0x1a/0x20
[ 229.271585][ C0] RIP: 0010:lockdep_unregister_key+0x56d/0x610
[ 229.277796][ C0] Code: ff 92 48 c7 c6 d0 b7 6f 81 e8 7f 04 0a 00 e8 ea 18 0a 00 e9 e5 fb ff ff e8 c0 06 21 0a 41 f7 c7 00 02 00 00 74 d0 fb 45 84 f6 <75> cf eb e0 90 0f 0b 90 45 31 f6 e9 62 ff ff ff 90 0f 0b 90 e9 a1
[ 229.297447][ C0] RSP: 0018:ffffc90003c4f820 EFLAGS: 00000246
[ 229.303630][ C0] RAX: dffffc0000000000 RBX: 1ffff92000789f0c RCX: ffffffff947f3803
[ 229.311661][ C0] RDX: 0000000000000001 RSI: ffffffff8bcad5e0 RDI: 1ffff92000789f0c
[ 229.319713][ C0] ==================================================================
[ 229.327779][ C0] BUG: KASAN: stack-out-of-bounds in __show_regs+0x172/0x610
[ 229.335174][ C0] Read of size 8 at addr ffffc90003c4f798 by task kworker/u8:5/234
[ 229.343506][ C0]
[ 229.345910][ C0] CPU: 0 UID: 0 PID: 234 Comm: kworker/u8:5 Not tainted 6.10.0-rc6-next-20240701-syzkaller #0
[ 229.356229][ C0] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
[ 229.366301][ C0] Workqueue: bat_events batadv_nc_worker
[ 229.371979][ C0] Call Trace:
[ 229.375266][ C0] <IRQ>
[ 229.378119][ C0] dump_stack_lvl+0x241/0x360
[ 229.382825][ C0] ? __pfx_dump_stack_lvl+0x10/0x10
[ 229.388052][ C0] ? __pfx__printk+0x10/0x10
[ 229.392661][ C0] ? _printk+0xd5/0x120
[ 229.396840][ C0] print_report+0x169/0x550
[ 229.401361][ C0] ? __virt_addr_valid+0xbd/0x530
[ 229.406407][ C0] ? __show_regs+0x172/0x610
[ 229.411022][ C0] kasan_report+0x143/0x180
[ 229.415549][ C0] ? show_opcodes+0x148/0x170
[ 229.420251][ C0] ? __show_regs+0x172/0x610
[ 229.424865][ C0] __show_regs+0x172/0x610
[ 229.429297][ C0] ? asm_sysvec_reschedule_ipi+0x1a/0x20
[ 229.434949][ C0] ? wq_watchdog_touch+0xef/0x180
[ 229.439991][ C0] ? asm_sysvec_reschedule_ipi+0x1a/0x20
[ 229.445645][ C0] show_trace_log_lvl+0x3d4/0x520
[ 229.450700][ C0] ? lockdep_unregister_key+0x56d/0x610
[ 229.456267][ C0] sched_show_task+0x506/0x6d0
[ 229.461051][ C0] ? report_rtnl_holders+0x29e/0x3f0
[ 229.466361][ C0] ? __pfx__printk+0x10/0x10
[ 229.470970][ C0] ? __pfx_sched_show_task+0x10/0x10
[ 229.476270][ C0] ? _raw_spin_unlock_irqrestore+0xdd/0x140
[ 229.482180][ C0] ? __pfx__raw_spin_unlock_irqrestore+0x10/0x10
[ 229.488536][ C0] report_rtnl_holders+0x320/0x3f0
[ 229.493671][ C0] call_timer_fn+0x18e/0x650
[ 229.498279][ C0] ? call_timer_fn+0xc0/0x650
[ 229.502974][ C0] ? __pfx_report_rtnl_holders+0x10/0x10
[ 229.508624][ C0] ? __pfx_call_timer_fn+0x10/0x10
[ 229.513749][ C0] ? __pfx_report_rtnl_holders+0x10/0x10
[ 229.519403][ C0] ? __pfx_report_rtnl_holders+0x10/0x10
[ 229.525056][ C0] ? __pfx_report_rtnl_holders+0x10/0x10
[ 229.530709][ C0] ? _raw_spin_unlock_irq+0x23/0x50
[ 229.535920][ C0] ? lockdep_hardirqs_on+0x99/0x150
[ 229.541145][ C0] ? __pfx_report_rtnl_holders+0x10/0x10
[ 229.546796][ C0] __run_timer_base+0x66a/0x8e0
[ 229.551946][ C0] ? __pfx___run_timer_base+0x10/0x10
[ 229.557433][ C0] run_timer_softirq+0xb7/0x170
[ 229.562312][ C0] handle_softirqs+0x2c4/0x970
[ 229.567095][ C0] ? __irq_exit_rcu+0xf4/0x1c0
[ 229.571877][ C0] ? __pfx_handle_softirqs+0x10/0x10
[ 229.577184][ C0] ? irqtime_account_irq+0xd4/0x1e0
[ 229.582405][ C0] __irq_exit_rcu+0xf4/0x1c0
[ 229.587017][ C0] ? __pfx___irq_exit_rcu+0x10/0x10
[ 229.592330][ C0] irq_exit_rcu+0x9/0x30
[ 229.596591][ C0] sysvec_apic_timer_interrupt+0xa6/0xc0
[ 229.602250][ C0] </IRQ>
[ 229.605190][ C0] <TASK>
[ 229.608128][ C0] asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 229.614125][ C0] RIP: 0010:rcu_is_watching+0x3a/0xb0
[ 229.619525][ C0] Code: e8 4b 3b 17 0a 89 c3 83 f8 08 73 7a 49 bf 00 00 00 00 00 fc ff df 4c 8d 34 dd 40 6a de 8d 4c 89 f0 48 c1 e8 03 42 80 3c 38 00 <74> 08 4c 89 f7 e8 8c ac 80 00 48 c7 c3 98 7c 03 00 49 03 1e 48 89
[ 229.639321][ C0] RSP: 0018:ffffc90002fe7a00 EFLAGS: 00000246
[ 229.645414][ C0] RAX: 1ffffffff1bbcd48 RBX: 0000000000000000 RCX: ffffffff816f8930
[ 229.653403][ C0] RDX: 0000000000000000 RSI: ffffffff8c207ae0 RDI: ffffffff8c207aa0
[ 229.661389][ C0] RBP: ffffc90002fe7b48 R08: ffffffff8faf796f R09: 1ffffffff1f5ef2d
[ 229.669374][ C0] R10: dffffc0000000000 R11: fffffbfff1f5ef2e R12: 1ffff920005fcf50
[ 229.677448][ C0] R13: ffffffff8b5b3d7b R14: ffffffff8dde6a40 R15: dffffc0000000000
[ 229.685523][ C0] ? batadv_nc_worker+0xcb/0x610
[ 229.690482][ C0] ? lock_release+0xb0/0x9f0
[ 229.695095][ C0] lock_release+0xbf/0x9f0
[ 229.699531][ C0] ? __pfx_lock_acquire+0x10/0x10
[ 229.704573][ C0] ? batadv_nc_worker+0xcb/0x610
[ 229.709525][ C0] ? __pfx_lock_release+0x10/0x10
[ 229.714577][ C0] ? batadv_nc_worker+0xcb/0x610
[ 229.719556][ C0] batadv_nc_worker+0x28b/0x610
[ 229.724427][ C0] ? batadv_nc_worker+0xcb/0x610
[ 229.729383][ C0] ? process_scheduled_works+0x945/0x1830
[ 229.735214][ C0] process_scheduled_works+0xa2c/0x1830
[ 229.741055][ C0] ? __pfx_process_scheduled_works+0x10/0x10
[ 229.747056][ C0] ? assign_work+0x364/0x3d0
[ 229.751662][ C0] worker_thread+0x86d/0xd40
[ 229.756278][ C0] ? _raw_spin_unlock_irqrestore+0xdd/0x140
[ 229.762193][ C0] ? __kthread_parkme+0x169/0x1d0
[ 229.767245][ C0] ? __pfx_worker_thread+0x10/0x10
[ 229.772373][ C0] kthread+0x2f0/0x390
[ 229.776460][ C0] ? __pfx_worker_thread+0x10/0x10
[ 229.781595][ C0] ? __pfx_kthread+0x10/0x10
[ 229.786210][ C0] ret_from_fork+0x4b/0x80
[ 229.790645][ C0] ? __pfx_kthread+0x10/0x10
[ 229.795257][ C0] ret_from_fork_asm+0x1a/0x30
[ 229.800050][ C0] </TASK>
[ 229.803075][ C0]
[ 229.805402][ C0] The buggy address belongs to the virtual mapping at
[ 229.805402][ C0] [ffffc90003c48000, ffffc90003c51000) created by:
[ 229.805402][ C0] copy_process+0x5d1/0x3d70
[ 229.823040][ C0]
[ 229.825368][ C0] The buggy address belongs to the physical page:
[ 229.831782][ C0] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x8 pfn:0x60439
[ 229.840815][ C0] memcg:ffff88801ae90502
[ 229.845059][ C0] flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
[ 229.852190][ C0] raw: 00fff00000000000 0000000000000000 dead000000000122 0000000000000000
[ 229.860795][ C0] raw: 0000000000000008 0000000000000000 00000001ffffffff ffff88801ae90502
[ 229.869388][ C0] page dumped because: kasan: bad access detected
[ 229.875810][ C0] page_owner tracks the page as allocated
[ 229.881541][ C0] page last allocated via order 0, migratetype Unmovable, gfp_mask 0x102dc2(GFP_HIGHUSER|__GFP_NOWARN|__GFP_ZERO), pid 5945, tgid 5945 (syz-executor), ts 214193789047, free_ts 213702691290
[ 229.900048][ C0] post_alloc_hook+0x1f3/0x230
[ 229.904836][ C0] get_page_from_freelist+0x2ccb/0x2d80
[ 229.910401][ C0] __alloc_pages_noprof+0x256/0x6c0
[ 229.915617][ C0] alloc_pages_mpol_noprof+0x3e8/0x680
[ 229.921098][ C0] __vmalloc_node_range_noprof+0x971/0x1460
[ 229.927002][ C0] dup_task_struct+0x444/0x8c0
[ 229.931782][ C0] copy_process+0x5d1/0x3d70
[ 229.936391][ C0] kernel_clone+0x226/0x8f0
[ 229.940907][ C0] __x64_sys_clone+0x258/0x2a0
[ 229.945685][ C0] do_syscall_64+0xf3/0x230
[ 229.950289][ C0] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 229.956189][ C0] page last free pid 6079 tgid 6077 stack trace:
[ 229.962513][ C0] free_unref_page+0xd22/0xea0
[ 229.967289][ C0] __put_partials+0xeb/0x130
[ 229.971890][ C0] put_cpu_partial+0x17c/0x250
[ 229.976665][ C0] __slab_free+0x2ea/0x3d0
[ 229.981093][ C0] qlist_free_all+0x9e/0x140
[ 229.985702][ C0] kasan_quarantine_reduce+0x14f/0x170
[ 229.991174][ C0] __kasan_slab_alloc+0x23/0x80
[ 229.996036][ C0] kmem_cache_alloc_noprof+0x135/0x2a0
[ 230.001536][ C0] alloc_buffer_head+0x2a/0x290
[ 230.006397][ C0] folio_alloc_buffers+0x241/0x5b0
[ 230.011522][ C0] create_empty_buffers+0x3a/0x740
[ 230.016644][ C0] block_read_full_folio+0x25c/0xcd0
[ 230.021940][ C0] filemap_read_folio+0x1a0/0x790
[ 230.026979][ C0] do_read_cache_folio+0x134/0x820
[ 230.032104][ C0] do_read_cache_page+0x30/0x200
[ 230.037057][ C0] sysv_find_entry+0x1af/0x410
[ 230.041836][ C0]
[ 230.044183][ C0] Memory state around the buggy address:
[ 230.049816][ C0] ffffc90003c4f680: f2 f2 f2 f2 00 00 00 00 00 f3 f3 f3 f3 f3 f3 f3
[ 230.057889][ C0] ffffc90003c4f700: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
[ 230.065961][ C0] >ffffc90003c4f780: 00 f2 f2 f2 00 f3 f3 f3 00 00 00 00 00 00 00 00
[ 230.074059][ C0] ^
[ 230.078915][ C0] ffffc90003c4f800: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2
[ 230.086983][ C0] ffffc90003c4f880: 00 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
[ 230.095056][ C0] ==================================================================
[ 230.103224][ C0] Kernel panic - not syncing: KASAN: panic_on_warn set ...

Andrey Konovalov

Jul 1, 2024, 11:10:50 AM
to Tetsuo Handa, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, kasan-dev, linux-mm, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, mi...@redhat.com, tg...@linutronix.de, x...@kernel.org
On Mon, Jul 1, 2024 at 2:43 PM Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
>
> Hello, KASAN people.
>
> I suspect that KASAN's metadata for kernel stack memory got out of sync for
> some unknown reason, for the stack trace of PID=7558 was successfully printed
> twice before KASAN complained when trying to print it for the third time.
> Could you decode what this KASAN message is saying?
>
> Quoting from https://syzkaller.appspot.com/text?tag=CrashLog&x=119fd081980000 :

[...]

> [ 229.319713][ C0] ==================================================================
> [ 229.327779][ C0] BUG: KASAN: stack-out-of-bounds in __show_regs+0x172/0x610
> [ 229.335174][ C0] Read of size 8 at addr ffffc90003c4f798 by task kworker/u8:5/234

[...]

> [ 230.044183][ C0] Memory state around the buggy address:
> [ 230.049816][ C0] ffffc90003c4f680: f2 f2 f2 f2 00 00 00 00 00 f3 f3 f3 f3 f3 f3 f3
> [ 230.057889][ C0] ffffc90003c4f700: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
> [ 230.065961][ C0] >ffffc90003c4f780: 00 f2 f2 f2 00 f3 f3 f3 00 00 00 00 00 00 00 00
> [ 230.074059][ C0] ^
> [ 230.078915][ C0] ffffc90003c4f800: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2
> [ 230.086983][ C0] ffffc90003c4f880: 00 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
> [ 230.095056][ C0] ==================================================================

I checked some of the other syzbot reports for this bug, and this
memory state part in some of them looks different.

Specifically, for
https://syzkaller.appspot.com/text?tag=CrashLog&x=14293f0e980000:

[ 1558.929174][ C1] Memory state around the buggy address:
[ 1558.934796][ C1] ffffc9000b8bf400: 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00
[ 1558.942852][ C1] ffffc9000b8bf480: 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00
[ 1558.950897][ C1] >ffffc9000b8bf500: 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00
[ 1558.958943][ C1] ^
[ 1558.964569][ C1] ffffc9000b8bf580: 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00
[ 1558.972613][ C1] ffffc9000b8bf600: 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00

This is weird, because if the metadata is 00, then the memory should
be accessible and there should be no KASAN report.

This makes me believe you have some kind of race in your patch (or
there's a race in the kernel that your patch somehow exposes). At
least between the moment KASAN detected the issue and the moment the
reporting procedure got to printing the memory state, the memory state
changed. As this is stack memory that comes from a vmalloc allocation,
I suspect the task whose stack had been at that location died, and
something else got mapped there.

This is my best guess; I hope it's helpful.

Tetsuo Handa

Jul 2, 2024, 2:11:24 AM
to Andrey Konovalov, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, kasan-dev, linux-mm, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, mi...@redhat.com, tg...@linutronix.de, x...@kernel.org
On 2024/07/02 0:10, Andrey Konovalov wrote:
> This is weird, because if the metadata is 00, then the memory should
> be accessible and there should be no KASAN report.
>
> Which makes me believe you have some kind of a race in your patch (or
> there's a race in the kernel that your patch somehow exposes).

Yes, I believe my patch is exposing an existing race, for I can't
find a race in my patch. (Since
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=b96342141183ffa62bfed5998f9b808c84042322
calls get_task_struct() when recording the in-use state, report_rtnl_holders()
can't trigger a use-after-free even if the thread dies. Also, since the previous version
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=5210cbe9a47fc5c1f43ba16d481e6335f3e2f345
synchronously calls synchronize_rcu() when clearing the in-use state,
report_rtnl_holders() can't trigger a use-after-free because the thread
can't die before calling put_rtnl_holder(). The variable "now" cannot be
0, and !cmpxchg(&rtnl_started[idx], 0, now) must serve as a serialization
lock when recording the in-use state.)

> At
> least between the moment KASAN detected the issue and the moment the
> reporting procedure got to printing the memory state, the memory state
> changed.

Indeed, the fact that the exact line KASAN complains at varies suggests that
the memory state is being modified by somebody else.

> As this is stack memory that comes from a vmalloc allocation,
> I suspect the task whose stack had been at that location died, and
> something else got mapped there.

I believe the task can't die while __show_regs() is being called from
report_rtnl_holders().

>
> This is my best guess, I hope it's helpful.

Well, KASAN says "out-of-bounds". But the reported address

BUG: KASAN: stack-out-of-bounds in __show_regs+0x172/0x610
Read of size 8 at addr ffffc90003c4f798 by task kworker/u8:5/234

is within the kernel stack memory mapping

The buggy address belongs to the virtual mapping at
[ffffc90003c48000, ffffc90003c51000) created by:
copy_process+0x5d1/0x3d7

. Why is this "out-of-bounds"? What boundary did KASAN compare against?
Is this just a race between KASAN detecting a problem and KASAN reporting
that problem? (But as I explained above, it is unlikely that the thread
being reported can die while report_rtnl_holders() is running...)

Tetsuo Handa

Jul 2, 2024, 3:05:28 AM
to Andrey Konovalov, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, kasan-dev, linux-mm, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, mi...@redhat.com, tg...@linutronix.de, x...@kernel.org
On 2024/07/02 15:11, Tetsuo Handa wrote:
> Well, KASAN says "out-of-bounds". But the reported address
>
> BUG: KASAN: stack-out-of-bounds in __show_regs+0x172/0x610
> Read of size 8 at addr ffffc90003c4f798 by task kworker/u8:5/234
>
> is within the kernel stack memory mapping
>
> The buggy address belongs to the virtual mapping at
> [ffffc90003c48000, ffffc90003c51000) created by:
> copy_process+0x5d1/0x3d7
>
> . Why is this "out-of-bounds" ? What boundary did KASAN compare with?

I think I found a hint. The KASAN message is printed when the call trace
starts with

__schedule()
preempt_schedule_irq()
irqentry_exit()

That is, when preemption happens, does KASAN erroneously compare against an
unintended stack boundary?

[ 504.507489][ C0] DEBUG: holding rtnl_mutex for 3212 jiffies.
[ 504.513708][ C0] task:kworker/u8:5 state:R running task stack:19992 pid:340 tgid:340 ppid:2 flags:0x00004000
[ 504.525827][ C0] Workqueue: netns cleanup_net
[ 504.530890][ C0] Call Trace:
[ 504.534213][ C0] <TASK>
[ 504.537244][ C0] __schedule+0x17e8/0x4a20
[ 504.541874][ C0] ? mark_lock+0x9a/0x360
[ 504.546279][ C0] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 504.552396][ C0] ? __virt_addr_valid+0x183/0x520
[ 504.557711][ C0] ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[ 504.564121][ C0] ? lock_release+0xbf/0x9f0
[ 504.568918][ C0] ? __pfx___schedule+0x10/0x10
[ 504.573835][ C0] ? lockdep_hardirqs_on+0x99/0x150
[ 504.579189][ C0] ? mark_lock+0x9a/0x360
[ 504.583592][ C0] preempt_schedule_irq+0xfb/0x1c0
[ 504.588984][ C0] ? __pfx_preempt_schedule_irq+0x10/0x10
[ 504.594785][ C0] irqentry_exit+0x5e/0x90
[ 504.599421][ C0] asm_sysvec_reschedule_ipi+0x1a/0x20

[ 463.514954][ C1] DEBUG: holding rtnl_mutex for 993 jiffies.
[ 463.528845][ C1] task:kworker/u8:10 state:R running task stack:19856 pid:5725 tgid:5725 ppid:2 flags:0x00004000
[ 463.536743][ T9938] rock: corrupted directory entry. extent=41, offset=65536, size=8
[ 463.540652][ C1] Workqueue: netns cleanup_net
[ 463.553421][ C1] Call Trace:
[ 463.556740][ C1] <TASK>
[ 463.559706][ C1] __schedule+0x17e8/0x4a20
[ 463.564304][ C1] ? __pfx_validate_chain+0x10/0x10
[ 463.569611][ C1] ? __pfx___schedule+0x10/0x10
[ 463.574628][ C1] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 463.580760][ C1] ? preempt_schedule_irq+0xf0/0x1c0
[ 463.586149][ C1] preempt_schedule_irq+0xfb/0x1c0
[ 463.591401][ C1] ? __pfx_preempt_schedule_irq+0x10/0x10
[ 463.597269][ C1] irqentry_exit+0x5e/0x90
[ 463.601834][ C1] asm_sysvec_apic_timer_interrupt+0x1a/0x20

[ 1558.178669][ C1] DEBUG: holding rtnl_mutex for 536 jiffies.
[ 1558.184806][ C1] task:syz-executor.3 state:R running task stack:25968 pid:6351 tgid:6345 ppid:6200 flags:0x00004006
[ 1558.196699][ C1] Call Trace:
[ 1558.200068][ C1] <TASK>
[ 1558.203055][ C1] __schedule+0x17e8/0x4a20
[ 1558.207638][ C1] ? __pfx___schedule+0x10/0x10
[ 1558.212585][ C1] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 1558.218675][ C1] ? preempt_schedule_irq+0xf0/0x1c0
[ 1558.224004][ C1] preempt_schedule_irq+0xfb/0x1c0
[ 1558.229196][ C1] ? __pfx_preempt_schedule_irq+0x10/0x10
[ 1558.234986][ C1] irqentry_exit+0x5e/0x90
[ 1558.239503][ C1] asm_sysvec_reschedule_ipi+0x1a/0x20

[ 1104.439430][ C0] DEBUG: holding rtnl_mutex for 578 jiffies.
[ 1104.445729][ C0] task:kworker/u8:3 state:R running task stack:18544 pid:53 tgid:53 ppid:2 flags:0x00004000
[ 1104.459070][ C0] Workqueue: netns cleanup_net
[ 1104.464170][ C0] Call Trace:
[ 1104.467478][ C0] <TASK>
[ 1104.470481][ C0] __schedule+0x17e8/0x4a20
[ 1104.476080][ C0] ? mark_lock+0x9a/0x360
[ 1104.480776][ C0] ? __lock_acquire+0x1359/0x2000
[ 1104.486043][ C0] ? __pfx___schedule+0x10/0x10
[ 1104.490987][ C0] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 1104.497017][ C0] ? preempt_schedule_irq+0xf0/0x1c0
[ 1104.502486][ C0] preempt_schedule_irq+0xfb/0x1c0
[ 1104.507809][ C0] ? __pfx_preempt_schedule_irq+0x10/0x10
[ 1104.514030][ C0] irqentry_exit+0x5e/0x90
[ 1104.518689][ C0] asm_sysvec_reschedule_ipi+0x1a/0x20

[ 926.207053][ C1] DEBUG: holding rtnl_mutex for 517 jiffies.
[ 926.213142][ C1] task:syz.1.1365 state:R running task stack:24672 pid:11152 tgid:11152 ppid:10992 flags:0x00004006
[ 926.225053][ C1] Call Trace:
[ 926.228434][ C1] <TASK>
[ 926.231441][ C1] __schedule+0x17e8/0x4a20
[ 926.236054][ C1] ? __pfx___schedule+0x10/0x10
[ 926.241130][ C1] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 926.247265][ C1] ? kasan_save_track+0x51/0x80
[ 926.252225][ C1] ? preempt_schedule_irq+0xf0/0x1c0
[ 926.257705][ C1] preempt_schedule_irq+0xfb/0x1c0
[ 926.262899][ C1] ? __pfx_preempt_schedule_irq+0x10/0x10
[ 926.268725][ C1] ? __pfx_pfifo_fast_destroy+0x10/0x10
[ 926.274379][ C1] irqentry_exit+0x5e/0x90
[ 926.278903][ C1] asm_sysvec_apic_timer_interrupt+0x1a/0x20

[ 940.917894][ C0] DEBUG: holding rtnl_mutex for 1611 jiffies.
[ 940.924066][ C0] task:syz.2.2274 state:R running task stack:24336 pid:15954 tgid:15954 ppid:14850 flags:0x00004006
[ 940.936192][ C0] Call Trace:
[ 940.939550][ C0] <TASK>
[ 940.942540][ C0] __schedule+0x17e8/0x4a20
[ 940.947134][ C0] ? __pfx___schedule+0x10/0x10
[ 940.952070][ C0] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 940.958362][ C0] ? kasan_save_track+0x51/0x80
[ 940.963266][ C0] ? preempt_schedule_irq+0xf0/0x1c0
[ 940.968628][ C0] preempt_schedule_irq+0xfb/0x1c0
[ 940.973790][ C0] ? __pfx_preempt_schedule_irq+0x10/0x10
[ 940.979610][ C0] ? __pfx_pfifo_fast_destroy+0x10/0x10
[ 940.985227][ C0] irqentry_exit+0x5e/0x90
[ 940.989731][ C0] asm_sysvec_apic_timer_interrupt+0x1a/0x20

[ 2120.744289][ C1] DEBUG: holding rtnl_mutex for 1675 jiffies.
[ 2120.750440][ C1] task:syz-executor state:R running task stack:20288 pid:2431 tgid:2431 ppid:1 flags:0x00004006
[ 2120.762291][ C1] Call Trace:
[ 2120.765647][ C1] <TASK>
[ 2120.768615][ C1] __schedule+0x17e8/0x4a20
[ 2120.773210][ C1] ? __pfx___schedule+0x10/0x10
[ 2120.778152][ C1] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 2120.784188][ C1] ? kasan_save_track+0x51/0x80
[ 2120.789118][ C1] ? preempt_schedule_irq+0xf0/0x1c0
[ 2120.794445][ C1] preempt_schedule_irq+0xfb/0x1c0
[ 2120.799621][ C1] ? __pfx_preempt_schedule_irq+0x10/0x10
[ 2120.805378][ C1] ? kvm_kick_cpu+0x26/0xb0
[ 2120.809965][ C1] irqentry_exit+0x5e/0x90
[ 2120.814423][ C1] asm_sysvec_apic_timer_interrupt+0x1a/0x20

[ 1465.514982][ C1] DEBUG: holding rtnl_mutex for 583 jiffies.
[ 1465.521071][ C1] task:kworker/u8:2 state:R running task stack:20232 pid:35 tgid:35 ppid:2 flags:0x00004000
[ 1465.532945][ C1] Workqueue: netns cleanup_net
[ 1465.537846][ C1] Call Trace:
[ 1465.541164][ C1] <TASK>
[ 1465.544132][ C1] __schedule+0x17e8/0x4a20
[ 1465.548730][ C1] ? mark_lock+0x9a/0x360
[ 1465.553148][ C1] ? lockdep_hardirqs_on_prepare+0x43d/0x780
[ 1465.559257][ C1] ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[ 1465.565697][ C1] ? __pfx___schedule+0x10/0x10
[ 1465.570636][ C1] ? lockdep_hardirqs_on+0x99/0x150
[ 1465.575968][ C1] ? mark_lock+0x9a/0x360
[ 1465.580381][ C1] preempt_schedule_irq+0xfb/0x1c0
[ 1465.585599][ C1] ? __pfx_preempt_schedule_irq+0x10/0x10
[ 1465.591383][ C1] irqentry_exit+0x5e/0x90
[ 1465.595895][ C1] asm_sysvec_reschedule_ipi+0x1a/0x20

Andrey Konovalov

Jul 2, 2024, 10:29:29 AM
to Tetsuo Handa, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, kasan-dev, linux-mm, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, mi...@redhat.com, tg...@linutronix.de, x...@kernel.org
On Tue, Jul 2, 2024 at 8:11 AM Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
>
> Well, KASAN says "out-of-bounds". But the reported address
>
> BUG: KASAN: stack-out-of-bounds in __show_regs+0x172/0x610
> Read of size 8 at addr ffffc90003c4f798 by task kworker/u8:5/234
>
> is within the kernel stack memory mapping
>
> The buggy address belongs to the virtual mapping at
> [ffffc90003c48000, ffffc90003c51000) created by:
> copy_process+0x5d1/0x3d7
>
> . Why is this "out-of-bounds" ? What boundary did KASAN compare with?
> Is this just a race of KASAN detecting a problem and KASAN reporting
> that problem? (But as I explained above, it is unlikely that the thread
> to be reported can die while processing report_rtnl_holders()...)

KASAN creates the bug title based on the memory state byte; it doesn't
compare any boundaries. So if the memory state got corrupted or out of
sync, the title can end up being arbitrary. In this case, the bad access
likely touched a redzone between stack variables, so KASAN reported it
as out-of-bounds.

Since syzbot managed to find a reproducer for this bug, you can ask it
to run the reproducer with additional testing patches to check various
hypotheses [1]. Perhaps you can write a magic value into task_struct
(or into pt_regs of the task?) in get_rtnl_holder() before saving the
task_struct reference, and then check that marker value in
report_rtnl_holders() before doing sched_show_task(). Depending on
whether the check succeeds, this will give additional information.

One other thing that comes to mind with regards to your patch: if the
task is still executing, the location of things on its stack might
change due to CONFIG_RANDOMIZE_KSTACK_OFFSET while you're printing the
task info. However, if the task is sleeping on a lock, this shouldn't
happen... But maybe a task can wake up during sched_show_task() and
start handling a new syscall? Just some guesses.

[1] https://github.com/google/syzkaller/blob/master/docs/syzbot.md#testing-patches

Tetsuo Handa

Jul 2, 2024, 11:21:20 AM
to Andrey Konovalov, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, kasan-dev, linux-mm, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, mi...@redhat.com, tg...@linutronix.de, x...@kernel.org
On 2024/07/02 23:29, Andrey Konovalov wrote:
> One other thing that comes to mind with regards to your patch: if the
> task is still executing, the location of things on its stack might
> change due to CONFIG_RANDOMIZE_KSTACK_OFFSET while you're printing the
> task info. However, if the task is sleeping on a lock, this shouldn't
> happen... But maybe a task can wake up during sched_show_task() and
> start handling a new syscall? Just some guesses.

https://syzkaller.appspot.com/bug?extid=d7491e9e156404745fbb shows that
this bug also happens without my patch. It seems to occur when printing
the registers of a preempted thread. The 5.15 kernel does not have the
CONFIG_RANDOMIZE_KSTACK_OFFSET config option, but the

__schedule()
preempt_schedule_irq()
irqentry_exit_cond_resched()
irqentry_exit()

pattern in 5.15 resembles the

__schedule()
preempt_schedule_irq()
irqentry_exit()

pattern in linux-next.

[ 1008.224617][T14487] task:syz-executor.1 state:R running task stack:22256 pid:14483 ppid: 434 flags:0x00004000
[ 1008.224656][T14487] Call Trace:
[ 1008.224661][T14487] <TASK>
[ 1008.224669][T14487] __schedule+0xcbe/0x1580
[ 1008.224689][T14487] ? __sched_text_start+0x8/0x8
[ 1008.224709][T14487] ? ttwu_do_activate+0x15d/0x280
[ 1008.224732][T14487] ? _raw_spin_unlock_irqrestore+0x5c/0x80
[ 1008.224758][T14487] preempt_schedule_irq+0xc7/0x140
[ 1008.224781][T14487] ? __cond_resched+0x20/0x20
[ 1008.224802][T14487] ? try_invoke_on_locked_down_task+0x2a0/0x2a0
[ 1008.224829][T14487] irqentry_exit_cond_resched+0x2a/0x30
[ 1008.224851][T14487] irqentry_exit+0x30/0x40
[ 1008.224874][T14487] sysvec_apic_timer_interrupt+0x55/0xc0
[ 1008.224900][T14487] asm_sysvec_apic_timer_interrupt+0x1b/0x20
[ 1008.224923][T14487] RIP: 0010:preempt_schedule_thunk+0x5/0x18
[ 1008.224950][T14487] Code: fd 85 db 0f 84 98 00 00 00 44 8d 73 01 44 89 f6 09 de bf ff ff ff ff e8 47 e4 8f fd 41 09 de 0f 88 88 00 00 00 e8 89 e0 8f fd <4c> 89 e0 48 c1 e8 03 48 b9 00 00 00 00 00 fc ff df 0f b6 04 08 84
[ 1008.224970][T14487] RSP: 0000:0000000000000001 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
[ 1008.224991][T14487] RAX: ffff88811532d948 RBX: ffffc900072ef560 RCX: ffffc900077e7680
[ 1008.225009][T14487] RDX: ffffc900072ef5b0 RSI: ffffffff8100817a RDI: dffffc0000000001
[ 1008.225027][T14487] RBP: 0000000000000001 R08: ffff88811532d948 R09: ffffc900077e7690
[ 1008.225043][T14487] R10: 1ffff92000efced2 R11: ffffffff84bfe126 R12: ffffc900077e7680
[ 1008.225062][T14487] ==================================================================
[ 1008.225071][T14487] BUG: KASAN: stack-out-of-bounds in __show_regs+0x252/0x4d0
[ 1008.225098][T14487] Read of size 8 at addr ffffc900072ef4f8 by task syz-executor.3/14487
[ 1008.225117][T14487]
[ 1008.225123][T14487] CPU: 0 PID: 14487 Comm: syz-executor.3 Not tainted 5.15.118-syzkaller-01748-g241da2ad5601 #0
