[syzbot] inconsistent lock state in kmem_cache_alloc

7 views
Skip to first unread message

syzbot

unread,
Sep 26, 2022, 12:33:47 PMSep 26
to ak...@linux-foundation.org, kees...@chromium.org, linux-...@vger.kernel.org, mark.r...@arm.com, mhir...@kernel.org, ros...@goodmis.org, syzkall...@googlegroups.com, vba...@suse.cz
Hello,

syzbot found the following issue on:

HEAD commit: 105a36f3694e Merge tag 'kbuild-fixes-v6.0-3' of git://git...
git tree: upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=152bf540880000
kernel config: https://syzkaller.appspot.com/x/.config?x=7db7ad17eb14cb7
dashboard link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1020566c880000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=104819e4880000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+dfcc5f...@syzkaller.appspotmail.com

================================
WARNING: inconsistent lock state
6.0.0-rc6-syzkaller-00321-g105a36f3694e #0 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/1/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: might_alloc include/linux/sched/mm.h:271 [inline]
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: slab_pre_alloc_hook mm/slab.h:700 [inline]
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: slab_alloc mm/slab.c:3278 [inline]
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: __kmem_cache_alloc_lru mm/slab.c:3471 [inline]
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: kmem_cache_alloc+0x39/0x520 mm/slab.c:3491
{SOFTIRQ-ON-W} state was registered at:
lock_acquire kernel/locking/lockdep.c:5666 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
__fs_reclaim_acquire mm/page_alloc.c:4674 [inline]
fs_reclaim_acquire+0x115/0x160 mm/page_alloc.c:4688
might_alloc include/linux/sched/mm.h:271 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]
kmem_cache_alloc_trace+0x38/0x460 mm/slab.c:3557
kmalloc include/linux/slab.h:600 [inline]
kzalloc include/linux/slab.h:733 [inline]
alloc_workqueue_attrs+0x39/0xc0 kernel/workqueue.c:3394
wq_numa_init kernel/workqueue.c:5964 [inline]
workqueue_init+0x12f/0x8ae kernel/workqueue.c:6091
kernel_init_freeable+0x3fb/0x73a init/main.c:1607
kernel_init+0x1a/0x1d0 init/main.c:1512
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
irq event stamp: 96654
hardirqs last enabled at (96654): [<ffffffff81c6581b>] kfree+0x25b/0x390 mm/slab.c:3787
hardirqs last disabled at (96653): [<ffffffff81c65811>] kfree+0x251/0x390 mm/slab.c:3776
softirqs last enabled at (96624): [<ffffffff814841f3>] invoke_softirq kernel/softirq.c:445 [inline]
softirqs last enabled at (96624): [<ffffffff814841f3>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
softirqs last disabled at (96649): [<ffffffff814841f3>] invoke_softirq kernel/softirq.c:445 [inline]
softirqs last disabled at (96649): [<ffffffff814841f3>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(fs_reclaim);
<Interrupt>
lock(fs_reclaim);

*** DEADLOCK ***

1 lock held by swapper/1/0:
#0: ffffffff91227508 (&fsnotify_mark_srcu){....}-{0:0}, at: fsnotify+0x2f4/0x1680 fs/notify/fsnotify.c:544

stack backtrace:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.0.0-rc6-syzkaller-00321-g105a36f3694e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_usage_bug kernel/locking/lockdep.c:3961 [inline]
valid_state kernel/locking/lockdep.c:3973 [inline]
mark_lock_irq kernel/locking/lockdep.c:4176 [inline]
mark_lock.part.0.cold+0x18/0xd8 kernel/locking/lockdep.c:4632
mark_lock kernel/locking/lockdep.c:4596 [inline]
mark_usage kernel/locking/lockdep.c:4527 [inline]
__lock_acquire+0x11d9/0x56d0 kernel/locking/lockdep.c:5007
lock_acquire kernel/locking/lockdep.c:5666 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
__fs_reclaim_acquire mm/page_alloc.c:4674 [inline]
fs_reclaim_acquire+0x115/0x160 mm/page_alloc.c:4688
might_alloc include/linux/sched/mm.h:271 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]
__kmem_cache_alloc_lru mm/slab.c:3471 [inline]
kmem_cache_alloc+0x39/0x520 mm/slab.c:3491
fanotify_alloc_fid_event fs/notify/fanotify/fanotify.c:580 [inline]
fanotify_alloc_event fs/notify/fanotify/fanotify.c:813 [inline]
fanotify_handle_event+0x1130/0x3f40 fs/notify/fanotify/fanotify.c:948
send_to_group fs/notify/fsnotify.c:360 [inline]
fsnotify+0xafb/0x1680 fs/notify/fsnotify.c:570
__fsnotify_parent+0x62f/0xa60 fs/notify/fsnotify.c:230
fsnotify_parent include/linux/fsnotify.h:77 [inline]
fsnotify_file include/linux/fsnotify.h:99 [inline]
fsnotify_access include/linux/fsnotify.h:309 [inline]
__io_complete_rw_common+0x485/0x720 io_uring/rw.c:195
io_complete_rw+0x1a/0x1f0 io_uring/rw.c:228
iomap_dio_complete_work fs/iomap/direct-io.c:144 [inline]
iomap_dio_bio_end_io+0x438/0x5e0 fs/iomap/direct-io.c:178
bio_endio+0x5f9/0x780 block/bio.c:1564
req_bio_endio block/blk-mq.c:695 [inline]
blk_update_request+0x3fc/0x1300 block/blk-mq.c:825
scsi_end_request+0x7a/0x9a0 drivers/scsi/scsi_lib.c:541
scsi_io_completion+0x173/0x1f70 drivers/scsi/scsi_lib.c:971
scsi_complete+0x122/0x3b0 drivers/scsi/scsi_lib.c:1438
blk_complete_reqs+0xad/0xe0 block/blk-mq.c:1022
__do_softirq+0x1d3/0x9c6 kernel/softirq.c:571
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
common_interrupt+0xa9/0xc0 arch/x86/kernel/irq.c:240
</IRQ>
<TASK>
asm_common_interrupt+0x22/0x40 arch/x86/include/asm/idtentry.h:640
RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:70 [inline]
RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:130 [inline]
RIP: 0010:acpi_safe_halt+0x6f/0xb0 drivers/acpi/processor_idle.c:113
Code: f7 84 db 74 06 5b e9 a0 26 f8 f7 e8 9b 26 f8 f7 e8 f6 9d fe f7 66 90 e8 8f 26 f8 f7 0f 00 2d c8 a5 d1 00 e8 83 26 f8 f7 fb f4 <9c> 5b 81 e3 00 02 00 00 fa 31 ff 48 89 de e8 4e 23 f8 f7 48 85 db
RSP: 0018:ffffc9000038fd20 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8880121f0200 RSI: ffffffff8983119d RDI: 0000000000000000
RBP: ffff8880178c9064 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: 0000000000000001 R14: ffff8880178c9000 R15: ffff888146aa7804
acpi_idle_do_entry drivers/acpi/processor_idle.c:555 [inline]
acpi_idle_enter+0x524/0x6a0 drivers/acpi/processor_idle.c:692
cpuidle_enter_state+0x1ab/0xd30 drivers/cpuidle/cpuidle.c:239
cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:356
call_cpuidle kernel/sched/idle.c:155 [inline]
cpuidle_idle_call kernel/sched/idle.c:236 [inline]
do_idle+0x3e8/0x590 kernel/sched/idle.c:303
cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:400
start_secondary+0x21d/0x2b0 arch/x86/kernel/smpboot.c:262
secondary_startup_64_no_verify+0xce/0xdb
</TASK>
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/1
preempt_count: 101, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.0.0-rc6-syzkaller-00321-g105a36f3694e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
__might_resched.cold+0x222/0x26b kernel/sched/core.c:9892
might_alloc include/linux/sched/mm.h:274 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]
__kmem_cache_alloc_lru mm/slab.c:3471 [inline]
kmem_cache_alloc+0x381/0x520 mm/slab.c:3491
fanotify_alloc_fid_event fs/notify/fanotify/fanotify.c:580 [inline]
fanotify_alloc_event fs/notify/fanotify/fanotify.c:813 [inline]
fanotify_handle_event+0x1130/0x3f40 fs/notify/fanotify/fanotify.c:948
send_to_group fs/notify/fsnotify.c:360 [inline]
fsnotify+0xafb/0x1680 fs/notify/fsnotify.c:570
__fsnotify_parent+0x62f/0xa60 fs/notify/fsnotify.c:230
fsnotify_parent include/linux/fsnotify.h:77 [inline]
fsnotify_file include/linux/fsnotify.h:99 [inline]
fsnotify_access include/linux/fsnotify.h:309 [inline]
__io_complete_rw_common+0x485/0x720 io_uring/rw.c:195
io_complete_rw+0x1a/0x1f0 io_uring/rw.c:228
iomap_dio_complete_work fs/iomap/direct-io.c:144 [inline]
iomap_dio_bio_end_io+0x438/0x5e0 fs/iomap/direct-io.c:178
bio_endio+0x5f9/0x780 block/bio.c:1564
req_bio_endio block/blk-mq.c:695 [inline]
blk_update_request+0x3fc/0x1300 block/blk-mq.c:825
scsi_end_request+0x7a/0x9a0 drivers/scsi/scsi_lib.c:541
scsi_io_completion+0x173/0x1f70 drivers/scsi/scsi_lib.c:971
scsi_complete+0x122/0x3b0 drivers/scsi/scsi_lib.c:1438
blk_complete_reqs+0xad/0xe0 block/blk-mq.c:1022
__do_softirq+0x1d3/0x9c6 kernel/softirq.c:571
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
common_interrupt+0xa9/0xc0 arch/x86/kernel/irq.c:240
</IRQ>
<TASK>
asm_common_interrupt+0x22/0x40 arch/x86/include/asm/idtentry.h:640
RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:70 [inline]
RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:130 [inline]
RIP: 0010:acpi_safe_halt+0x6f/0xb0 drivers/acpi/processor_idle.c:113
Code: f7 84 db 74 06 5b e9 a0 26 f8 f7 e8 9b 26 f8 f7 e8 f6 9d fe f7 66 90 e8 8f 26 f8 f7 0f 00 2d c8 a5 d1 00 e8 83 26 f8 f7 fb f4 <9c> 5b 81 e3 00 02 00 00 fa 31 ff 48 89 de e8 4e 23 f8 f7 48 85 db
RSP: 0018:ffffc9000038fd20 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8880121f0200 RSI: ffffffff8983119d RDI: 0000000000000000
RBP: ffff8880178c9064 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: 0000000000000001 R14: ffff8880178c9000 R15: ffff888146aa7804
acpi_idle_do_entry drivers/acpi/processor_idle.c:555 [inline]
acpi_idle_enter+0x524/0x6a0 drivers/acpi/processor_idle.c:692
cpuidle_enter_state+0x1ab/0xd30 drivers/cpuidle/cpuidle.c:239
cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:356
call_cpuidle kernel/sched/idle.c:155 [inline]
cpuidle_idle_call kernel/sched/idle.c:236 [inline]
do_idle+0x3e8/0x590 kernel/sched/idle.c:303
cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:400
start_secondary+0x21d/0x2b0 arch/x86/kernel/smpboot.c:262
secondary_startup_64_no_verify+0xce/0xdb
</TASK>
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 3675, name: syz-executor284
preempt_count: 100, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
Preemption disabled at:
[<ffffffff89c000e1>] softirq_handle_begin kernel/softirq.c:409 [inline]
[<ffffffff89c000e1>] __do_softirq+0xe1/0x9c6 kernel/softirq.c:547
CPU: 1 PID: 3675 Comm: syz-executor284 Tainted: G W 6.0.0-rc6-syzkaller-00321-g105a36f3694e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
__might_resched.cold+0x222/0x26b kernel/sched/core.c:9892
might_alloc include/linux/sched/mm.h:274 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]
__kmem_cache_alloc_lru mm/slab.c:3471 [inline]
kmem_cache_alloc+0x381/0x520 mm/slab.c:3491
fanotify_alloc_fid_event fs/notify/fanotify/fanotify.c:580 [inline]
fanotify_alloc_event fs/notify/fanotify/fanotify.c:813 [inline]
fanotify_handle_event+0x1130/0x3f40 fs/notify/fanotify/fanotify.c:948
send_to_group fs/notify/fsnotify.c:360 [inline]
fsnotify+0xafb/0x1680 fs/notify/fsnotify.c:570
__fsnotify_parent+0x62f/0xa60 fs/notify/fsnotify.c:230
fsnotify_parent include/linux/fsnotify.h:77 [inline]
fsnotify_file include/linux/fsnotify.h:99 [inline]
fsnotify_access include/linux/fsnotify.h:309 [inline]
__io_complete_rw_common+0x485/0x720 io_uring/rw.c:195
io_complete_rw+0x1a/0x1f0 io_uring/rw.c:228
iomap_dio_complete_work fs/iomap/direct-io.c:144 [inline]
iomap_dio_bio_end_io+0x438/0x5e0 fs/iomap/direct-io.c:178
bio_endio+0x5f9/0x780 block/bio.c:1564
req_bio_endio block/blk-mq.c:695 [inline]
blk_update_request+0x3fc/0x1300 block/blk-mq.c:825
scsi_end_request+0x7a/0x9a0 drivers/scsi/scsi_lib.c:541
scsi_io_completion+0x173/0x1f70 drivers/scsi/scsi_lib.c:971
scsi_complete+0x122/0x3b0 drivers/scsi/scsi_lib.c:1438
blk_complete_reqs+0xad/0xe0 block/blk-mq.c:1022
__do_softirq+0x1d3/0x9c6 kernel/softirq.c:571
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
common_interrupt+0xa9/0xc0 arch/x86/kernel/irq.c:240
</IRQ>
<TASK>
asm_common_interrupt+0x22/0x40 arch/x86/include/asm/idtentry.h:640
RIP: 0010:kmem_cache_free.part.0+0x141/0x2e0 mm/slab.c:3727
Code: 4c 89 ea 4c 89 e7 e8 ee c0 ff ff 48 85 db 0f 85 a3 00 00 00 9c 58 f6 c4 02 0f 85 7e 01 00 00 48 85 db 74 01 fb 48 8b 44 24 08 <65> 48 2b 04 25 28 00 00 00 0f 85 7c 01 00 00 48 83 c4 10 5b 5d 41
RSP: 0018:ffffc90002f578a0 EFLAGS: 00000206
RAX: 90e611cb37c35f00 RBX: 0000000000000200 RCX: 1ffffffff1bbcead
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81c691b3
RBP: ffffea0001f3fe40 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000080000000 R11: 000000000008c07c R12: ffff888011853000
R13: ffffffff8136bf10 R14: ffff88807cf5e8a8 R15: 00007f457e2d9000
pmd_ptlock_free include/linux/mm.h:2355 [inline]
pgtable_pmd_page_dtor include/linux/mm.h:2392 [inline]
___pmd_free_tlb+0x70/0x220 arch/x86/mm/pgtable.c:72
__pmd_free_tlb arch/x86/include/asm/pgalloc.h:93 [inline]
free_pmd_range mm/memory.c:269 [inline]
free_pud_range mm/memory.c:287 [inline]
free_p4d_range mm/memory.c:321 [inline]
free_pgd_range+0x9a1/0xbe0 mm/memory.c:401
free_pgtables+0x230/0x2f0 mm/memory.c:433
exit_mmap+0x1c7/0x490 mm/mmap.c:3117
__mmput+0x122/0x4b0 kernel/fork.c:1187
mmput+0x56/0x60 kernel/fork.c:1208
exit_mm kernel/exit.c:510 [inline]
do_exit+0x9e2/0x29b0 kernel/exit.c:782
do_group_exit+0xd2/0x2f0 kernel/exit.c:925
get_signal+0x238c/0x2610 kernel/signal.c:2857
arch_do_signal_or_restart+0x82/0x2300 arch/x86/kernel/signal.c:869
exit_to_user_mode_loop kernel/entry/common.c:166 [inline]
exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:201
__syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline]
syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:294
do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f457e24abe9
Code: Unable to access opcode bytes at RIP 0x7f457e24abbf.
RSP: 002b:00007f457e1fc308 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00007f457e2d2428 RCX: 00007f457e24abe9
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f457e2d2428
RBP: 00007f457e2d2420 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f457e2a0064
R13: 0000000000000004 R14: 00007f457e1fc400 R15: 0000000000022000
</TASK>
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 3603, name: strace-static-x
preempt_count: 100, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
Preemption disabled at:
[<ffffffff89c000e1>] softirq_handle_begin kernel/softirq.c:409 [inline]
[<ffffffff89c000e1>] __do_softirq+0xe1/0x9c6 kernel/softirq.c:547
CPU: 1 PID: 3603 Comm: strace-static-x Tainted: G W 6.0.0-rc6-syzkaller-00321-g105a36f3694e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
__might_resched.cold+0x222/0x26b kernel/sched/core.c:9892
might_alloc include/linux/sched/mm.h:274 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]
__kmem_cache_alloc_lru mm/slab.c:3471 [inline]
kmem_cache_alloc+0x381/0x520 mm/slab.c:3491
fanotify_alloc_fid_event fs/notify/fanotify/fanotify.c:580 [inline]
fanotify_alloc_event fs/notify/fanotify/fanotify.c:813 [inline]
fanotify_handle_event+0x1130/0x3f40 fs/notify/fanotify/fanotify.c:948
send_to_group fs/notify/fsnotify.c:360 [inline]
fsnotify+0xafb/0x1680 fs/notify/fsnotify.c:570
__fsnotify_parent+0x62f/0xa60 fs/notify/fsnotify.c:230
fsnotify_parent include/linux/fsnotify.h:77 [inline]
fsnotify_file include/linux/fsnotify.h:99 [inline]
fsnotify_access include/linux/fsnotify.h:309 [inline]
__io_complete_rw_common+0x485/0x720 io_uring/rw.c:195
io_complete_rw+0x1a/0x1f0 io_uring/rw.c:228
iomap_dio_complete_work fs/iomap/direct-io.c:144 [inline]
iomap_dio_bio_end_io+0x438/0x5e0 fs/iomap/direct-io.c:178
bio_endio+0x5f9/0x780 block/bio.c:1564
req_bio_endio block/blk-mq.c:695 [inline]
blk_update_request+0x3fc/0x1300 block/blk-mq.c:825
scsi_end_request+0x7a/0x9a0 drivers/scsi/scsi_lib.c:541
scsi_io_completion+0x173/0x1f70 drivers/scsi/scsi_lib.c:971
scsi_complete+0x122/0x3b0 drivers/scsi/scsi_lib.c:1438
blk_complete_reqs+0xad/0xe0 block/blk-mq.c:1022
__do_softirq+0x1d3/0x9c6 kernel/softirq.c:571
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
common_interrupt+0x52/0xc0 arch/x86/kernel/irq.c:240
asm_common_interrupt+0x22/0x40 arch/x86/include/asm/idtentry.h:640
RIP: 0033:0x485e00
Code: 0f 86 de 17 00 00 48 89 8d 08 fb ff ff 48 89 ca 4c 89 d6 4c 89 e7 44 89 8d 00 fb ff ff ff 53 38 48 8b 8d 08 fb ff ff 48 39 c1 <0f> 85 92 f5 ff ff 44 8b 8d 00 fb ff ff b8 ff ff ff 7f 44 29 c8 48
RSP: 002b:00007fff73e2c5f0 EFLAGS: 00000246
RAX: 0000000000000001 RBX: 0000000000619460 RCX: 0000000000000001
RDX: 0000000000000001 RSI: 00007fff73e2cb07 RDI: 0000000000989041
RBP: 00007fff73e2cb40 R08: 0000000000000000 R09: 0000000000000000
R10: 00007fff73e2cb07 R11: 0000000000000000 R12: 0000000000617480
R13: 0000000000534349 R14: 00007fff73e2cb58 R15: 0000000000000064
</TASK>
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/1
preempt_count: 102, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 6.0.0-rc6-syzkaller-00321-g105a36f3694e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
__might_resched.cold+0x222/0x26b kernel/sched/core.c:9892
might_alloc include/linux/sched/mm.h:274 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]
__kmem_cache_alloc_lru mm/slab.c:3471 [inline]
kmem_cache_alloc+0x381/0x520 mm/slab.c:3491
fanotify_alloc_fid_event fs/notify/fanotify/fanotify.c:580 [inline]
fanotify_alloc_event fs/notify/fanotify/fanotify.c:813 [inline]
fanotify_handle_event+0x1130/0x3f40 fs/notify/fanotify/fanotify.c:948
send_to_group fs/notify/fsnotify.c:360 [inline]
fsnotify+0xafb/0x1680 fs/notify/fsnotify.c:570
__fsnotify_parent+0x62f/0xa60 fs/notify/fsnotify.c:230
fsnotify_parent include/linux/fsnotify.h:77 [inline]
fsnotify_file include/linux/fsnotify.h:99 [inline]
fsnotify_access include/linux/fsnotify.h:309 [inline]
__io_complete_rw_common+0x485/0x720 io_uring/rw.c:195
io_complete_rw+0x1a/0x1f0 io_uring/rw.c:228
iomap_dio_complete_work fs/iomap/direct-io.c:144 [inline]
iomap_dio_bio_end_io+0x438/0x5e0 fs/iomap/direct-io.c:178
bio_endio+0x5f9/0x780 block/bio.c:1564
req_bio_endio block/blk-mq.c:695 [inline]
blk_update_request+0x3fc/0x1300 block/blk-mq.c:825
scsi_end_request+0x7a/0x9a0 drivers/scsi/scsi_lib.c:541
scsi_io_completion+0x173/0x1f70 drivers/scsi/scsi_lib.c:971
scsi_complete+0x122/0x3b0 drivers/scsi/scsi_lib.c:1438
blk_complete_reqs+0xad/0xe0 block/blk-mq.c:1022
__do_softirq+0x1d3/0x9c6 kernel/softirq.c:571
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
common_interrupt+0xa9/0xc0 arch/x86/kernel/irq.c:240
</IRQ>
<TASK>
asm_common_interrupt+0x22/0x40 arch/x86/include/asm/idtentry.h:640
RIP: 0010:orc_ip arch/x86/kernel/unwind_orc.c:30 [inline]
RIP: 0010:__orc_find+0x6f/0xf0 arch/x86/kernel/unwind_orc.c:52
Code: 72 4d 4c 89 e0 48 29 e8 48 89 c2 48 c1 e8 3f 48 c1 fa 02 48 01 d0 48 d1 f8 48 8d 5c 85 00 48 89 d8 48 c1 e8 03 42 0f b6 14 38 <48> 89 d8 83 e0 07 83 c0 03 38 d0 7c 04 84 d2 75 48 48 63 03 48 01
RSP: 0018:ffffc9000038f840 EFLAGS: 00000a03
RAX: 1ffffffff1bda94f RBX: ffffffff8ded4a7c RCX: ffffffff81595532
RDX: 0000000000000000 RSI: ffffffff8e5f3a12 RDI: ffffffff8ded4a6c
RBP: ffffffff8ded4a6c R08: ffffffff8be023e0 R09: ffffc9000038f92c
R10: fffff52000071f2a R11: 000000000008c07c R12: ffffffff8ded4a8c
R13: ffffffff8ded4a6c R14: ffffffff8ded4a6c R15: dffffc0000000000
orc_find arch/x86/kernel/unwind_orc.c:178 [inline]
unwind_next_frame+0x2a3/0x1cc0 arch/x86/kernel/unwind_orc.c:448
arch_stack_walk+0x7d/0xe0 arch/x86/kernel/stacktrace.c:25
stack_trace_save+0x8c/0xc0 kernel/stacktrace.c:122
kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
__kasan_record_aux_stack+0x7e/0x90 mm/kasan/generic.c:348
call_rcu+0x99/0x790 kernel/rcu/tree.c:2793
put_task_struct_rcu_user+0x7f/0xb0 kernel/exit.c:183
context_switch kernel/sched/core.c:5185 [inline]
__schedule+0xae7/0x52b0 kernel/sched/core.c:6494
schedule_idle+0x57/0x90 kernel/sched/core.c:6598
do_idle+0x303/0x590 kernel/sched/idle.c:331
cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:400
start_secondary+0x21d/0x2b0 arch/x86/kernel/smpboot.c:262
secondary_startup_64_no_verify+0xce/0xdb
</TASK>
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/1
preempt_count: 101, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W 6.0.0-rc6-syzkaller-00321-g105a36f3694e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
__might_resched.cold+0x222/0x26b kernel/sched/core.c:9892
might_alloc include/linux/sched/mm.h:274 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]
__kmem_cache_alloc_lru mm/slab.c:3471 [inline]
kmem_cache_alloc+0x381/0x520 mm/slab.c:3491
fanotify_alloc_fid_event fs/notify/fanotify/fanotify.c:580 [inline]
fanotify_alloc_event fs/notify/fanotify/fanotify.c:813 [inline]
fanotify_handle_event+0x1130/0x3f40 fs/notify/fanotify/fanotify.c:948
send_to_group fs/notify/fsnotify.c:360 [inline]
fsnotify+0xafb/0x1680 fs/notify/fsnotify.c:570
__fsnotify_parent+0x62f/0xa60 fs/notify/fsnotify.c:230
fsnotify_parent include/linux/fsnotify.h:77 [inline]
fsnotify_file include/linux/fsnotify.h:99 [inline]
fsnotify_access include/linux/fsnotify.h:309 [inline]
__io_complete_rw_common+0x485/0x720 io_uring/rw.c:195
io_complete_rw+0x1a/0x1f0 io_uring/rw.c:228
iomap_dio_complete_work fs/iomap/direct-io.c:144 [inline]
iomap_dio_bio_end_io+0x438/0x5e0 fs/iomap/direct-io.c:178
bio_endio+0x5f9/0x780 block/bio.c:1564
req_bio_endio block/blk-mq.c:695 [inline]
blk_update_request+0x3fc/0x1300 block/blk-mq.c:825
scsi_end_request+0x7a/0x9a0 drivers/scsi/scsi_lib.c:541
scsi_io_completion+0x173/0x1f70 drivers/scsi/scsi_lib.c:971
scsi_complete+0x122/0x3b0 drivers/scsi/scsi_lib.c:1438
blk_complete_reqs+0xad/0xe0 block/blk-mq.c:1022
__do_softirq+0x1d3/0x9c6 kernel/softirq.c:571
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
common_interrupt+0xa9/0xc0 arch/x86/kernel/irq.c:240
</IRQ>
<TASK>
asm_common_interrupt+0x22/0x40 arch/x86/include/asm/idtentry.h:640
RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:70 [inline]
RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:130 [inline]
RIP: 0010:acpi_safe_halt+0x6f/0xb0 drivers/acpi/processor_idle.c:113
Code: f7 84 db 74 06 5b e9 a0 26 f8 f7 e8 9b 26 f8 f7 e8 f6 9d fe f7 66 90 e8 8f 26 f8 f7 0f 00 2d c8 a5 d1 00 e8 83 26 f8 f7 fb f4 <9c> 5b 81 e3 00 02 00 00 fa 31 ff 48 89 de e8 4e 23 f8 f7 48 85 db
RSP: 0018:ffffc9000038fd20 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8880121f0200 RSI: ffffffff8983119d RDI: ffffffff8983118a
RBP: ffff8880178c9064 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: 0000000000000001 R14: ffff8880178c9000 R15: ffff888146aa7804
acpi_idle_do_entry drivers/acpi/processor_idle.c:555 [inline]
acpi_idle_enter+0x524/0x6a0 drivers/acpi/processor_idle.c:692
cpuidle_enter_state+0x1ab/0xd30 drivers/cpuidle/cpuidle.c:239
cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:356
call_cpuidle kernel/sched/idle.c:155 [inline]
cpuidle_idle_call kernel/sched/idle.c:236 [inline]
do_idle+0x3e8/0x590 kernel/sched/idle.c:303
cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:400
start_secondary+0x21d/0x2b0 arch/x86/kernel/smpboot.c:262
secondary_startup_64_no_verify+0xce/0xdb
</TASK>
------------[ cut here ]------------
do not call blocking ops when !TASK_RUNNING; state=8 set at [<ffffffff814ad360>] ptrace_stop.part.0+0x0/0xa80 kernel/signal.c:2172
WARNING: CPU: 1 PID: 3606 at kernel/sched/core.c:9815 __might_sleep+0x105/0x150 kernel/sched/core.c:9815
Modules linked in:
CPU: 1 PID: 3606 Comm: syz-executor284 Tainted: G W 6.0.0-rc6-syzkaller-00321-g105a36f3694e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
RIP: 0010:__might_sleep+0x105/0x150 kernel/sched/core.c:9815
Code: 6f 02 00 48 8d bb 08 17 00 00 48 89 fa 48 c1 ea 03 80 3c 02 00 75 34 48 8b 93 08 17 00 00 48 c7 c7 00 1e ec 89 e8 7d c6 e8 07 <0f> 0b e9 75 ff ff ff e8 cf d5 74 00 e9 26 ff ff ff 89 34 24 e8 d2
RSP: 0018:ffffc900003f8810 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff88807dd3c080 RCX: 0000000000000000
RDX: ffff88807dd3c080 RSI: ffffffff81611da8 RDI: fffff5200007f0f4
RBP: ffffffff8b840617 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000080000102 R11: 6320746f6e206f64 R12: 0000000000000112
R13: 0000000000404cc0 R14: 0000000000000048 R15: 0000000000000200
FS: 0000555555731300(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000010 CR3: 00000000743dd000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
might_alloc include/linux/sched/mm.h:274 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]
__kmem_cache_alloc_lru mm/slab.c:3471 [inline]
kmem_cache_alloc+0x381/0x520 mm/slab.c:3491
fanotify_alloc_fid_event fs/notify/fanotify/fanotify.c:580 [inline]
fanotify_alloc_event fs/notify/fanotify/fanotify.c:813 [inline]
fanotify_handle_event+0x1130/0x3f40 fs/notify/fanotify/fanotify.c:948
send_to_group fs/notify/fsnotify.c:360 [inline]
fsnotify+0xafb/0x1680 fs/notify/fsnotify.c:570
__fsnotify_parent+0x62f/0xa60 fs/notify/fsnotify.c:230
fsnotify_parent include/linux/fsnotify.h:77 [inline]
fsnotify_file include/linux/fsnotify.h:99 [inline]
fsnotify_access include/linux/fsnotify.h:309 [inline]
__io_complete_rw_common+0x485/0x720 io_uring/rw.c:195
io_complete_rw+0x1a/0x1f0 io_uring/rw.c:228
iomap_dio_complete_work fs/iomap/direct-io.c:144 [inline]
iomap_dio_bio_end_io+0x438/0x5e0 fs/iomap/direct-io.c:178
bio_endio+0x5f9/0x780 block/bio.c:1564
req_bio_endio block/blk-mq.c:695 [inline]
blk_update_request+0x3fc/0x1300 block/blk-mq.c:825
scsi_end_request+0x7a/0x9a0 drivers/scsi/scsi_lib.c:541
scsi_io_completion+0x173/0x1f70 drivers/scsi/scsi_lib.c:971
scsi_complete+0x122/0x3b0 drivers/scsi/scsi_lib.c:1438
blk_complete_reqs+0xad/0xe0 block/blk-mq.c:1022
__do_softirq+0x1d3/0x9c6 kernel/softirq.c:571
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
common_interrupt+0xa9/0xc0 arch/x86/kernel/irq.c:240
</IRQ>
<TASK>
asm_common_interrupt+0x22/0x40 arch/x86/include/asm/idtentry.h:640
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x38/0x70 kernel/locking/spinlock.c:194
Code: 74 24 10 e8 5a 76 db f7 48 89 ef e8 e2 f7 db f7 81 e3 00 02 00 00 75 25 9c 58 f6 c4 02 75 2d 48 85 db 74 01 fb bf 01 00 00 00 <e8> 23 b9 ce f7 65 8b 05 ac 45 7f 76 85 c0 74 0a 5b 5d c3 e8 e0 85
RSP: 0018:ffffc90003067c88 EFLAGS: 00000206
RAX: 0000000000000046 RBX: 0000000000000200 RCX: 1ffffffff1bbcead
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001
RBP: ffff888026549640 R08: 0000000000000000 R09: 0000000000000000
R10: ffffed1004ca92c8 R11: 0000000000000001 R12: ffff88807e924040
R13: ffff888026549640 R14: ffffc90003067d20 R15: 0000000000000246
spin_unlock_irqrestore include/linux/spinlock.h:404 [inline]
do_notify_parent_cldstop+0x569/0xa40 kernel/signal.c:2190
ptrace_stop.part.0+0x834/0xa80 kernel/signal.c:2293
ptrace_stop kernel/signal.c:2232 [inline]
ptrace_do_notify+0x215/0x2b0 kernel/signal.c:2344
ptrace_notify+0xc4/0x140 kernel/signal.c:2356
ptrace_report_syscall include/linux/ptrace.h:420 [inline]
ptrace_report_syscall_exit include/linux/ptrace.h:482 [inline]
syscall_exit_work kernel/entry/common.c:249 [inline]
syscall_exit_to_user_mode_prepare+0x129/0x280 kernel/entry/common.c:276
__syscall_exit_to_user_mode_work kernel/entry/common.c:281 [inline]
syscall_exit_to_user_mode+0x9/0x50 kernel/entry/common.c:294
do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f457e2494f6
Code: 0f 1f 40 00 31 c9 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 49 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 3d 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 90 48 83 ec 28 89 54 24 14 48 89 74 24
RSP: 002b:00007ffd6ecf35c8 EFLAGS: 00000246 ORIG_RAX: 000000000000003d
RAX: 0000000000000000 RBX: 000000000000d707 RCX: 00007f457e2494f6
RDX: 0000000040000001 RSI: 00007ffd6ecf35f4 RDI: 00000000ffffffff
RBP: 0000000000000f02 R08: 0000000000000037 R09: 00007ffd6ed54080
R10: 0000000000000000 R11: 0000000000000246 R12: 431bde82d7b634db
R13: 00007ffd6ecf35f4 R14: 0000000000000000 R15: 0000000000000000
</TASK>
----------------
Code disassembly (best guess):
0: f7 84 db 74 06 5b e9 testl $0xf7f826a0,-0x16a4f98c(%rbx,%rbx,8)
7: a0 26 f8 f7
b: e8 9b 26 f8 f7 callq 0xf7f826ab
10: e8 f6 9d fe f7 callq 0xf7fe9e0b
15: 66 90 xchg %ax,%ax
17: e8 8f 26 f8 f7 callq 0xf7f826ab
1c: 0f 00 2d c8 a5 d1 00 verw 0xd1a5c8(%rip) # 0xd1a5eb
23: e8 83 26 f8 f7 callq 0xf7f826ab
28: fb sti
29: f4 hlt
* 2a: 9c pushfq <-- trapping instruction
2b: 5b pop %rbx
2c: 81 e3 00 02 00 00 and $0x200,%ebx
32: fa cli
33: 31 ff xor %edi,%edi
35: 48 89 de mov %rbx,%rsi
38: e8 4e 23 f8 f7 callq 0xf7f8238b
3d: 48 85 db test %rbx,%rbx


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

ead...@sina.com

unread,
Sep 27, 2022, 3:03:46 AMSep 27
to syzbot+dfcc5f...@syzkaller.appspotmail.com, ak...@linux-foundation.org, kees...@chromium.org, linux-...@vger.kernel.org, mark.r...@arm.com, mhir...@kernel.org, ros...@goodmis.org, syzkall...@googlegroups.com, vba...@suse.cz, Edward Adam Davis
From: Edward Adam Davis <ead...@sina.com>

1. IN_SOFTIRQ held the fs_reclaim lock:
mark_lock_irq kernel/locking/lockdep.c:4176 [inline]
mark_lock.part.0.cold+0x18/0xd8 kernel/locking/lockdep.c:4632
mark_lock kernel/locking/lockdep.c:4596 [inline]
mark_usage kernel/locking/lockdep.c:4527 [inline]
__lock_acquire+0x11d9/0x56d0 kernel/locking/lockdep.c:5007
lock_acquire kernel/locking/lockdep.c:5666 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
__fs_reclaim_acquire mm/page_alloc.c:4674 [inline]
fs_reclaim_acquire+0x115/0x160 mm/page_alloc.c:4688
might_alloc include/linux/sched/mm.h:271 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]

2. ENABLE_SOFTIRQ require the fs_reclaim lock:
might_alloc include/linux/sched/mm.h:274 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]

move slab_pre_alloc_hook() to irq context, confirm the context to IN_SOFTIRQ.

Link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
Reported-by: syzbot+dfcc5f...@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <ead...@sina.com>
---
mm/slab.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 10e96137b44f..29d49d1b1e96 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3275,15 +3275,19 @@ slab_alloc(struct kmem_cache *cachep, struct list_lru *lru, gfp_t flags,
bool init = false;

flags &= gfp_allowed_mask;
+ local_irq_save(save_flags);
cachep = slab_pre_alloc_hook(cachep, lru, &objcg, 1, flags);
- if (unlikely(!cachep))
+ if (unlikely(!cachep)) {
+ local_irq_restore(save_flags);
return NULL;
+ }

objp = kfence_alloc(cachep, orig_size, flags);
- if (unlikely(objp))
+ if (unlikely(objp)) {
+ local_irq_restore(save_flags);
goto out;
+ }

- local_irq_save(save_flags);
objp = __do_cache_alloc(cachep, flags);
local_irq_restore(save_flags);
objp = cache_alloc_debugcheck_after(cachep, flags, objp, caller);
--
2.37.2

ead...@sina.com

unread,
Sep 27, 2022, 3:11:40 AMSep 27
to syzbot+dfcc5f...@syzkaller.appspotmail.com, ak...@linux-foundation.org, kees...@chromium.org, linux-...@vger.kernel.org, mark.r...@arm.com, mhir...@kernel.org, ros...@goodmis.org, syzkall...@googlegroups.com, vba...@suse.cz, Edward Adam Davis
From: Edward Adam Davis <ead...@sina.com>

1. ENABLE_SOFTIRQ held the fs_reclaim lock:
{SOFTIRQ-ON-W} state was registered at:
lock_acquire kernel/locking/lockdep.c:5666 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
__fs_reclaim_acquire mm/page_alloc.c:4674 [inline]
fs_reclaim_acquire+0x115/0x160 mm/page_alloc.c:4688
might_alloc include/linux/sched/mm.h:271 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]
kmem_cache_alloc_trace+0x38/0x460 mm/slab.c:3557
kmalloc include/linux/slab.h:600 [inline]
kzalloc include/linux/slab.h:733 [inline]
alloc_workqueue_attrs+0x39/0xc0 kernel/workqueue.c:3394
wq_numa_init kernel/workqueue.c:5964 [inline]
workqueue_init+0x12f/0x8ae kernel/workqueue.c:6091
kernel_init_freeable+0x3fb/0x73a init/main.c:1607
kernel_init+0x1a/0x1d0 init/main.c:1512
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306

2. IN_SOFTIRQ require the fs_reclaim lock:
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_usage_bug kernel/locking/lockdep.c:3961 [inline]
valid_state kernel/locking/lockdep.c:3973 [inline]
mark_lock_irq kernel/locking/lockdep.c:4176 [inline]
mark_lock.part.0.cold+0x18/0xd8 kernel/locking/lockdep.c:4632
mark_lock kernel/locking/lockdep.c:4596 [inline]
mark_usage kernel/locking/lockdep.c:4527 [inline]
__lock_acquire+0x11d9/0x56d0 kernel/locking/lockdep.c:5007
lock_acquire kernel/locking/lockdep.c:5666 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
__fs_reclaim_acquire mm/page_alloc.c:4674 [inline]
fs_reclaim_acquire+0x115/0x160 mm/page_alloc.c:4688
might_alloc include/linux/sched/mm.h:271 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]

move slab_pre_alloc_hook() to irq context, confirm the context to IN_SOFTIRQ.

Link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
Reported-by: syzbot+dfcc5f...@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <ead...@sina.com>
Changes in v2:
comments update.

ead...@sina.com

unread,
Sep 28, 2022, 4:32:27 AMSep 28
to syzbot+dfcc5f...@syzkaller.appspotmail.com, syzkall...@googlegroups.com, Edward Adam Davis
From: Edward Adam Davis <ead...@sina.com>

On Mon, 26 Sep 2022 09:33:46 -0700
> syzbot found the following issue on:
>
> HEAD commit: 105a36f3694e Merge tag 'kbuild-fixes-v6.0-3' of git://git...
> git tree: upstream
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=152bf540880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=7db7ad17eb14cb7
> dashboard link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1020566c880000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=104819e4880000

Check the issue is fixed.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 105a36f3694e

syzbot

unread,
Sep 28, 2022, 4:45:18 AMSep 28
to ead...@sina.com, syzkall...@googlegroups.com
Hello,

syzbot tried to test the proposed patch but the build/boot failed:

0000000000293 R12: 0000000000080001
[ 67.677748][ T3618] R13: 00007ffc532c88c0 R14: 0000000000000000 R15: 00007ffc532c8e60
[ 67.685913][ T3618] </TASK>
[ 67.691911][ T22] Bluetooth: hci0: command 0x0409 tx timeout
2022/09/28 08:44:35 building call list...
[ 67.972646][ T11] netdevsim netdevsim0 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[ 68.375133][ T3616] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
[ 68.385124][ T3616] in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 3616, name: kworker/1:3
[ 68.394346][ T3616] preempt_count: 0, expected: 0
[ 68.399207][ T3616] RCU nest depth: 0, expected: 0
[ 68.404156][ T3616] 3 locks held by kworker/1:3/3616:
[ 68.409463][ T3616] #0: ffff888011867d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x87a/0x1610
[ 68.419873][ T3616] #1: ffffc90003087da8 ((work_completion)(&data->fib_event_work)){+.+.}-{0:0}, at: process_one_work+0x8ae/0x1610
[ 68.431956][ T3616] #2: ffff8880698ac240 (&data->fib_lock){+.+.}-{3:3}, at: nsim_fib_event_work+0x1df/0x2720
[ 68.442213][ T3616] irq event stamp: 5628
[ 68.446388][ T3616] hardirqs last enabled at (5627): [<ffffffff81c6581b>] kfree+0x25b/0x390
[ 68.455047][ T3616] hardirqs last disabled at (5628): [<ffffffff81c6883e>] kmem_cache_alloc_trace+0x37e/0x4d0
[ 68.465181][ T3616] softirqs last enabled at (5604): [<ffffffff857ce0dd>] __rhashtable_insert_fast.constprop.0+0x9ed/0x1200
[ 68.476693][ T3616] softirqs last disabled at (5602): [<ffffffff857cda2e>] __rhashtable_insert_fast.constprop.0+0x33e/0x1200
[ 68.488124][ T3616] CPU: 1 PID: 3616 Comm: kworker/1:3 Tainted: G W 6.0.0-rc6-syzkaller-00321-g105a36f3694e-dirty #0
[ 68.500217][ T3616] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
[ 68.510295][ T3616] Workqueue: events nsim_fib_event_work
[ 68.515875][ T3616] Call Trace:
[ 68.519170][ T3616] <TASK>
[ 68.522201][ T3616] dump_stack_lvl+0xcd/0x134
[ 68.526997][ T3616] __might_resched.cold+0x222/0x26b
[ 68.532305][ T3616] kmem_cache_alloc_trace+0x36f/0x4d0
[ 68.537710][ T3616] nsim_fib_event_work+0x89c/0x2720
[ 68.542932][ T3616] ? nsim_nexthop_event_nb+0x11d0/0x11d0
[ 68.548588][ T3616] ? lock_downgrade+0x6e0/0x6e0
[ 68.553558][ T3616] ? do_raw_spin_lock+0x120/0x2a0
[ 68.558650][ T3616] process_one_work+0x991/0x1610
[ 68.563702][ T3616] ? pwq_dec_nr_in_flight+0x2a0/0x2a0
[ 68.569112][ T3616] ? rwlock_bug.part.0+0x90/0x90
[ 68.574087][ T3616] ? _raw_spin_lock_irq+0x41/0x50
[ 68.579153][ T3616] worker_thread+0x854/0x1080
[ 68.583959][ T3616] ? __kthread_parkme+0x15f/0x220
[ 68.589130][ T3616] ? process_one_work+0x1610/0x1610
[ 68.594367][ T3616] kthread+0x2e4/0x3a0
[ 68.599944][ T3616] ? kthread_complete_and_exit+0x40/0x40
[ 68.605603][ T3616] ret_from_fork+0x1f/0x30
[ 68.610047][ T3616] </TASK>
executing program
[ 69.395124][ T3616] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
[ 69.404653][ T3616] in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 3616, name: kworker/1:3
[ 69.413786][ T3616] preempt_count: 0, expected: 0
[ 69.418681][ T3616] RCU nest depth: 0, expected: 0
[ 69.423635][ T3616] 3 locks held by kworker/1:3/3616:
[ 69.428842][ T3616] #0: ffff888011867d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x87a/0x1610
[ 69.439316][ T3616] #1: ffffc90003087da8 ((work_completion)(&data->fib_event_work)){+.+.}-{0:0}, at: process_one_work+0x8ae/0x1610
[ 69.451378][ T3616] #2: ffff8880698ac240 (&data->fib_lock){+.+.}-{3:3}, at: nsim_fib_event_work+0x1df/0x2720
[ 69.461528][ T3616] irq event stamp: 6434
[ 69.465960][ T3616] hardirqs last enabled at (6433): [<ffffffff81c6581b>] kfree+0x25b/0x390
[ 69.474604][ T3616] hardirqs last disabled at (6434): [<ffffffff81c6883e>] kmem_cache_alloc_trace+0x37e/0x4d0
[ 69.484706][ T3616] softirqs last enabled at (6412): [<ffffffff857ce0dd>] __rhashtable_insert_fast.constprop.0+0x9ed/0x1200
[ 69.496193][ T3616] softirqs last disabled at (6410): [<ffffffff857cda2e>] __rhashtable_insert_fast.constprop.0+0x33e/0x1200
[ 69.507599][ T3616] CPU: 1 PID: 3616 Comm: kworker/1:3 Tainted: G W 6.0.0-rc6-syzkaller-00321-g105a36f3694e-dirty #0
[ 69.519689][ T3616] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
[ 69.529765][ T3616] Workqueue: events nsim_fib_event_work
[ 69.535347][ T3616] Call Trace:
[ 69.538640][ T3616] <TASK>
[ 69.541582][ T3616] dump_stack_lvl+0xcd/0x134
[ 69.546193][ T3616] __might_resched.cold+0x222/0x26b
[ 69.551510][ T3616] kmem_cache_alloc_trace+0x36f/0x4d0
[ 69.556923][ T3616] nsim_fib_event_work+0x112d/0x2720
[ 69.562678][ T3616] ? nsim_nexthop_event_nb+0x11d0/0x11d0
[ 69.568338][ T3616] ? lock_downgrade+0x6e0/0x6e0
[ 69.573294][ T3616] ? do_raw_spin_lock+0x120/0x2a0
[ 69.578448][ T3616] process_one_work+0x991/0x1610
[ 69.583411][ T3616] ? pwq_dec_nr_in_flight+0x2a0/0x2a0
[ 69.588814][ T3616] ? rwlock_bug.part.0+0x90/0x90
[ 69.593779][ T3616] ? _raw_spin_lock_irq+0x41/0x50
[ 69.598888][ T3616] worker_thread+0x854/0x1080
[ 69.603648][ T3616] ? __kthread_parkme+0x15f/0x220
[ 69.608713][ T3616] ? process_one_work+0x1610/0x1610
[ 69.613981][ T3616] kthread+0x2e4/0x3a0
[ 69.618145][ T3616] ? kthread_complete_and_exit+0x40/0x40
[ 69.623879][ T3616] ret_from_fork+0x1f/0x30
[ 69.628998][ T3616] </TASK>
[ 70.413861][ T3616] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
[ 70.423382][ T3616] in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 3616, name: kworker/1:3
[ 70.432872][ T3616] preempt_count: 0, expected: 0
[ 70.437733][ T3616] RCU nest depth: 0, expected: 0
[ 70.442691][ T3616] 3 locks held by kworker/1:3/3616:
[ 70.448027][ T3616] #0: ffff888011867d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x87a/0x1610
[ 70.458447][ T3616] #1: ffffc90003087da8 ((work_completion)(&data->fib_event_work)){+.+.}-{0:0}, at: process_one_work+0x8ae/0x1610
[ 70.470515][ T3616] #2: ffff8880698ac240 (&data->fib_lock){+.+.}-{3:3}, at: nsim_fib_event_work+0x1df/0x2720
[ 70.480768][ T3616] irq event stamp: 7172
[ 70.485032][ T3616] hardirqs last enabled at (7171): [<ffffffff81c6581b>] kfree+0x25b/0x390
[ 70.493653][ T3616] hardirqs last disabled at (7172): [<ffffffff81c6883e>] kmem_cache_alloc_trace+0x37e/0x4d0
[ 70.503760][ T3616] softirqs last enabled at (7150): [<ffffffff857ce0dd>] __rhashtable_insert_fast.constprop.0+0x9ed/0x1200
[ 70.515207][ T3616] softirqs last disabled at (7148): [<ffffffff857cda2e>] __rhashtable_insert_fast.constprop.0+0x33e/0x1200
[ 70.526717][ T3616] CPU: 1 PID: 3616 Comm: kworker/1:3 Tainted: G W 6.0.0-rc6-syzkaller-00321-g105a36f3694e-dirty #0
[ 70.538821][ T3616] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
[ 70.549096][ T3616] Workqueue: events nsim_fib_event_work
[ 70.554806][ T3616] Call Trace:
[ 70.558119][ T3616] <TASK>
[ 70.561077][ T3616] dump_stack_lvl+0xcd/0x134
[ 70.565817][ T3616] __might_resched.cold+0x222/0x26b
[ 70.571106][ T3616] kmem_cache_alloc_trace+0x36f/0x4d0
[ 70.576535][ T3616] nsim_fib_event_work+0x112d/0x2720
[ 70.581880][ T3616] ? nsim_nexthop_event_nb+0x11d0/0x11d0
[ 70.587590][ T3616] ? lock_downgrade+0x6e0/0x6e0
[ 70.592795][ T3616] ? do_raw_spin_lock+0x120/0x2a0
[ 70.597867][ T3616] process_one_work+0x991/0x1610
[ 70.602942][ T3616] ? pwq_dec_nr_in_flight+0x2a0/0x2a0
[ 70.608368][ T3616] ? rwlock_bug.part.0+0x90/0x90
[ 70.613351][ T3616] ? _raw_spin_lock_irq+0x41/0x50
[ 70.618431][ T3616] worker_thread+0x854/0x1080
[ 70.623148][ T3616] ? __kthread_parkme+0x15f/0x220
[ 70.628216][ T3616] ? process_one_work+0x1610/0x1610
[ 70.633485][ T3616] kthread+0x2e4/0x3a0
[ 70.637587][ T3616] ? kthread_complete_and_exit+0x40/0x40
[ 70.643357][ T3616] ret_from_fork+0x1f/0x30
[ 70.647813][ T3616] </TASK>
[ 71.251847][ T11] netdevsim netdevsim0 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[ 71.433924][ T3616] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
[ 71.443464][ T3616] in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 3616, name: kworker/1:3
[ 71.452980][ T3616] preempt_count: 0, expected: 0
[ 71.457970][ T3616] RCU nest depth: 0, expected: 0
[ 71.463007][ T3616] 3 locks held by kworker/1:3/3616:
[ 71.468225][ T3616] #0: ffff888011867d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x87a/0x1610
[ 71.478784][ T3616] #1: ffffc90003087da8 ((work_completion)(&data->fib_event_work)){+.+.}-{0:0}, at: process_one_work+0x8ae/0x1610
[ 71.490848][ T3616] #2: ffff8880698ac240 (&data->fib_lock){+.+.}-{3:3}, at: nsim_fib_event_work+0x1df/0x2720
[ 71.501098][ T3616] irq event stamp: 8194
[ 71.505263][ T3616] hardirqs last enabled at (8193): [<ffffffff81c6581b>] kfree+0x25b/0x390
[ 71.513887][ T3616] hardirqs last disabled at (8194): [<ffffffff81c6883e>] kmem_cache_alloc_trace+0x37e/0x4d0
[ 71.523990][ T3616] softirqs last enabled at (8170): [<ffffffff857ce0dd>] __rhashtable_insert_fast.constprop.0+0x9ed/0x1200
[ 71.535515][ T3616] softirqs last disabled at (8168): [<ffffffff857cda2e>] __rhashtable_insert_fast.constprop.0+0x33e/0x1200
[ 71.547536][ T3616] CPU: 1 PID: 3616 Comm: kworker/1:3 Tainted: G W 6.0.0-rc6-syzkaller-00321-g105a36f3694e-dirty #0
[ 71.560071][ T3616] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
[ 71.570160][ T3616] Workqueue: events nsim_fib_event_work
[ 71.575747][ T3616] Call Trace:
[ 71.579050][ T3616] <TASK>
[ 71.581998][ T3616] dump_stack_lvl+0xcd/0x134
[ 71.586626][ T3616] __might_resched.cold+0x222/0x26b
[ 71.592043][ T3616] kmem_cache_alloc_trace+0x36f/0x4d0
[ 71.597461][ T3616] nsim_fib_event_work+0x112d/0x2720
[ 71.602792][ T3616] ? nsim_nexthop_event_nb+0x11d0/0x11d0
[ 71.608471][ T3616] ? lock_downgrade+0x6e0/0x6e0
[ 71.613360][ T3616] ? do_raw_spin_lock+0x120/0x2a0
[ 71.619041][ T3616] process_one_work+0x991/0x1610
[ 71.624021][ T3616] ? pwq_dec_nr_in_flight+0x2a0/0x2a0
[ 71.629449][ T3616] ? rwlock_bug.part.0+0x90/0x90
[ 71.634512][ T3616] ? _raw_spin_lock_irq+0x41/0x50
[ 71.639590][ T3616] worker_thread+0x854/0x1080
[ 71.644316][ T3616] ? __kthread_parkme+0x15f/0x220
[ 71.649385][ T3616] ? process_one_work+0x1610/0x1610
[ 71.654818][ T3616] kthread+0x2e4/0x3a0
[ 71.658569][ T27] kauditd_printk_skb: 19 callbacks suppressed
[ 71.658586][ T27] audit: type=1400 audit(1664354679.052:105): avc: denied { read } for pid=3606 comm="syz-fuzzer" name="fuse" dev="devtmpfs" ino=98 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:fuse_device_t tclass=chr_file permissive=1
[ 71.658998][ T3616] ? kthread_complete_and_exit+0x40/0x40
[ 71.669853][ T27] audit: type=1400 audit(1664354679.052:106): avc: denied { open } for pid=3606 comm="syz-fuzzer" path="/dev/fuse" dev="devtmpfs" ino=98 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:fuse_device_t tclass=chr_file permissive=1
[ 71.687854][ T3616] ret_from_fork+0x1f/0x30
[ 71.687894][ T3616] </TASK>
[ 71.783970][ T27] audit: type=1400 audit(1664354679.172:107): avc: denied { read } for pid=3606 comm="syz-fuzzer" name="autofs" dev="devtmpfs" ino=97 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:autofs_device_t tclass=chr_file permissive=1
[ 71.834122][ T27] audit: type=1400 audit(1664354679.172:108): avc: denied { open } for pid=3606 comm="syz-fuzzer" path="/dev/autofs" dev="devtmpfs" ino=97 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:autofs_device_t tclass=chr_file permissive=1
[ 71.860511][ T27] audit: type=1400 audit(1664354679.172:109): avc: denied { read } for pid=3606 comm="syz-fuzzer" name="btrfs-control" dev="devtmpfs" ino=1116 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:lvm_control_t tclass=chr_file permissive=1
[ 71.885804][ T27] audit: type=1400 audit(1664354679.172:110): avc: denied { open } for pid=3606 comm="syz-fuzzer" path="/dev/btrfs-control" dev="devtmpfs" ino=1116 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:lvm_control_t tclass=chr_file permissive=1
[ 71.920503][ T27] audit: type=1400 audit(1664354679.172:111): avc: denied { read } for pid=3606 comm="syz-fuzzer" name="cachefiles" dev="devtmpfs" ino=4 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:cachefiles_device_t tclass=chr_file permissive=1
[ 71.973286][ T27] audit: type=1400 audit(1664354679.172:112): avc: denied { open } for pid=3606 comm="syz-fuzzer" path="/dev/cachefiles" dev="devtmpfs" ino=4 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:cachefiles_device_t tclass=chr_file permissive=1
[ 71.998727][ T27] audit: type=1400 audit(1664354679.182:113): avc: denied { read } for pid=3606 comm="syz-fuzzer" name="renderD128" dev="devtmpfs" ino=623 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:dri_device_t tclass=chr_file permissive=1
[ 72.043451][ T27] audit: type=1400 audit(1664354679.182:114): avc: denied { open } for pid=3606 comm="syz-fuzzer" path="/dev/dri/renderD128" dev="devtmpfs" ino=623 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:dri_device_t tclass=chr_file permissive=1
[ 72.079465][ T3608] can: request_module (can-proto-0) failed.
[ 72.116644][ T3608] can: request_module (can-proto-0) failed.
[ 72.149068][ T3608] can: request_module (can-proto-0) failed.
executing program


syzkaller build log:
go env (err=<nil>)
GO111MODULE="auto"
GOARCH="amd64"
GOBIN=""
GOCACHE="/syzkaller/.cache/go-build"
GOENV="/syzkaller/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/syzkaller/jobs/linux/gopath/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/syzkaller/jobs/linux/gopath"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/syzkaller/jobs/linux/gopath/src/github.com/google/syzkaller/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3708998014=/tmp/go-build -gno-record-gcc-switches"

git status (err=<nil>)
HEAD detached at 0042f2b4c
nothing to commit, working tree clean


go list -f '{{.Stale}}' ./sys/syz-sysgen | grep -q false || go install ./sys/syz-sysgen
make .descriptions
bin/syz-sysgen
touch .descriptions
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=0042f2b4c00ce1ceeaa44a0147909fe3a6f86c5c -X 'github.com/google/syzkaller/prog.gitRevisionDate=20220922-164204'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-fuzzer github.com/google/syzkaller/syz-fuzzer
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=0042f2b4c00ce1ceeaa44a0147909fe3a6f86c5c -X 'github.com/google/syzkaller/prog.gitRevisionDate=20220922-164204'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
GOOS=linux GOARCH=amd64 go build "-ldflags=-s -w -X github.com/google/syzkaller/prog.GitRevision=0042f2b4c00ce1ceeaa44a0147909fe3a6f86c5c -X 'github.com/google/syzkaller/prog.gitRevisionDate=20220922-164204'" "-tags=syz_target syz_os_linux syz_arch_amd64 " -o ./bin/linux_amd64/syz-stress github.com/google/syzkaller/tools/syz-stress
mkdir -p ./bin/linux_amd64
gcc -o ./bin/linux_amd64/syz-executor executor/executor.cc \
-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -static-pie -fpermissive -w -DGOOS_linux=1 -DGOARCH_amd64=1 \
-DHOSTGOOS_linux=1 -DGIT_REVISION=\"0042f2b4c00ce1ceeaa44a0147909fe3a6f86c5c\"


Error text is too large and was truncated, full error text is at:
https://syzkaller.appspot.com/x/error.txt?x=15d5164c880000


Tested on:

commit: 105a36f3 Merge tag 'kbuild-fixes-v6.0-3' of git://git...
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel config: https://syzkaller.appspot.com/x/.config?x=7db7ad17eb14cb7
dashboard link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=1535662f080000

ead...@sina.com

unread,
Sep 28, 2022, 5:09:07 AMSep 28
to syzbot+dfcc5f...@syzkaller.appspotmail.com, syzkall...@googlegroups.com, Edward Adam Davis
From: Edward Adam Davis <ead...@sina.com>

On Mon, 26 Sep 2022 09:33:46 -0700
> syzbot found the following issue on:
>
> HEAD commit: 105a36f3694e Merge tag 'kbuild-fixes-v6.0-3' of git://git...
> git tree: upstream
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=152bf540880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=7db7ad17eb14cb7
> dashboard link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1020566c880000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=104819e4880000

Check the issue is fixed.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 105a36f3694e

--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -268,8 +268,11 @@ static inline void memalloc_retry_wait(gfp_t gfp_flags)
*/
static inline void might_alloc(gfp_t gfp_mask)
{
+ unsigned long save_flags;
+ local_irq_save(save_flags);
fs_reclaim_acquire(gfp_mask);
fs_reclaim_release(gfp_mask);
+ local_irq_restore(save_flags);

might_sleep_if(gfpflags_allow_blocking(gfp_mask));
}
--
2.37.2

syzbot

unread,
Sep 28, 2022, 5:44:22 AMSep 28
to ead...@sina.com, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
inconsistent lock state in slab_pre_alloc_hook

================================
WARNING: inconsistent lock state
6.0.0-rc6-syzkaller-00321-g105a36f3694e-dirty #0 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/1/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: might_alloc include/linux/sched/mm.h:273 [inline]
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: slab_pre_alloc_hook.constprop.0+0x19d/0x580 mm/slab.h:700
{SOFTIRQ-ON-W} state was registered at:
lock_acquire kernel/locking/lockdep.c:5666 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
__fs_reclaim_acquire mm/page_alloc.c:4674 [inline]
fs_reclaim_acquire+0x115/0x160 mm/page_alloc.c:4688
dma_resv_lockdep+0x342/0x545 drivers/dma-buf/dma-resv.c:753
do_one_initcall+0xfe/0x650 init/main.c:1296
do_initcall_level init/main.c:1369 [inline]
do_initcalls init/main.c:1385 [inline]
do_basic_setup init/main.c:1404 [inline]
kernel_init_freeable+0x6b1/0x73a init/main.c:1623
kernel_init+0x1a/0x1d0 init/main.c:1512
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
irq event stamp: 330269
hardirqs last enabled at (330268): [<ffffffff81c6637b>] kfree+0x25b/0x390 mm/slab.c:3787
hardirqs last disabled at (330269): [<ffffffff81c658d6>] might_alloc include/linux/sched/mm.h:272 [inline]
hardirqs last disabled at (330269): [<ffffffff81c658d6>] slab_pre_alloc_hook.constprop.0+0x196/0x580 mm/slab.h:700
softirqs last enabled at (330252): [<ffffffff814841f3>] invoke_softirq kernel/softirq.c:445 [inline]
softirqs last enabled at (330252): [<ffffffff814841f3>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
softirqs last disabled at (330265): [<ffffffff814841f3>] invoke_softirq kernel/softirq.c:445 [inline]
softirqs last disabled at (330265): [<ffffffff814841f3>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(fs_reclaim);
<Interrupt>
lock(fs_reclaim);

*** DEADLOCK ***

1 lock held by swapper/1/0:
#0: ffffffff91228508 (&fsnotify_mark_srcu){....}-{0:0}, at: fsnotify+0x2f4/0x1680 fs/notify/fsnotify.c:544

stack backtrace:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.0.0-rc6-syzkaller-00321-g105a36f3694e-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_usage_bug kernel/locking/lockdep.c:3961 [inline]
valid_state kernel/locking/lockdep.c:3973 [inline]
mark_lock_irq kernel/locking/lockdep.c:4176 [inline]
mark_lock.part.0.cold+0x18/0xd8 kernel/locking/lockdep.c:4632
mark_lock kernel/locking/lockdep.c:4596 [inline]
mark_usage kernel/locking/lockdep.c:4527 [inline]
__lock_acquire+0x11d9/0x56d0 kernel/locking/lockdep.c:5007
lock_acquire kernel/locking/lockdep.c:5666 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
__fs_reclaim_acquire mm/page_alloc.c:4674 [inline]
fs_reclaim_acquire+0x115/0x160 mm/page_alloc.c:4688
might_alloc include/linux/sched/mm.h:273 [inline]
slab_pre_alloc_hook.constprop.0+0x19d/0x580 mm/slab.h:700
slab_alloc mm/slab.c:3278 [inline]
__kmem_cache_alloc_lru mm/slab.c:3471 [inline]
kmem_cache_alloc+0x4e/0x3b0 mm/slab.c:3491
Code: f7 84 db 74 06 5b e9 90 08 f8 f7 e8 8b 08 f8 f7 e8 e6 7f fe f7 66 90 e8 7f 08 f8 f7 0f 00 2d c8 85 d1 00 e8 73 08 f8 f7 fb f4 <9c> 5b 81 e3 00 02 00 00 fa 31 ff 48 89 de e8 3e 05 f8 f7 48 85 db
RSP: 0018:ffffc9000038fd20 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8880121f0200 RSI: ffffffff8983319d RDI: 0000000000000000
RBP: ffff888145a07864 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: 0000000000000001 R14: ffff888145a07800 R15: ffff88801c0f6804
acpi_idle_do_entry drivers/acpi/processor_idle.c:555 [inline]
acpi_idle_enter+0x524/0x6a0 drivers/acpi/processor_idle.c:692
cpuidle_enter_state+0x1ab/0xd30 drivers/cpuidle/cpuidle.c:239
cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:356
call_cpuidle kernel/sched/idle.c:155 [inline]
cpuidle_idle_call kernel/sched/idle.c:236 [inline]
do_idle+0x3e8/0x590 kernel/sched/idle.c:303
cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:400
start_secondary+0x21d/0x2b0 arch/x86/kernel/smpboot.c:262
secondary_startup_64_no_verify+0xce/0xdb
</TASK>
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:277
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/1
preempt_count: 101, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.0.0-rc6-syzkaller-00321-g105a36f3694e-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
__might_resched.cold+0x222/0x26b kernel/sched/core.c:9892
might_alloc include/linux/sched/mm.h:277 [inline]
slab_pre_alloc_hook.constprop.0+0x175/0x580 mm/slab.h:700
slab_alloc mm/slab.c:3278 [inline]
__kmem_cache_alloc_lru mm/slab.c:3471 [inline]
kmem_cache_alloc+0x4e/0x3b0 mm/slab.c:3491
Code: f7 84 db 74 06 5b e9 90 08 f8 f7 e8 8b 08 f8 f7 e8 e6 7f fe f7 66 90 e8 7f 08 f8 f7 0f 00 2d c8 85 d1 00 e8 73 08 f8 f7 fb f4 <9c> 5b 81 e3 00 02 00 00 fa 31 ff 48 89 de e8 3e 05 f8 f7 48 85 db
RSP: 0018:ffffc9000038fd20 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8880121f0200 RSI: ffffffff8983319d RDI: 0000000000000000
RBP: ffff888145a07864 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: 0000000000000001 R14: ffff888145a07800 R15: ffff88801c0f6804
acpi_idle_do_entry drivers/acpi/processor_idle.c:555 [inline]
acpi_idle_enter+0x524/0x6a0 drivers/acpi/processor_idle.c:692
cpuidle_enter_state+0x1ab/0xd30 drivers/cpuidle/cpuidle.c:239
cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:356
call_cpuidle kernel/sched/idle.c:155 [inline]
cpuidle_idle_call kernel/sched/idle.c:236 [inline]
do_idle+0x3e8/0x590 kernel/sched/idle.c:303
cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:400
start_secondary+0x21d/0x2b0 arch/x86/kernel/smpboot.c:262
secondary_startup_64_no_verify+0xce/0xdb
</TASK>
----------------
Code disassembly (best guess):
0: f7 84 db 74 06 5b e9 testl $0xf7f80890,-0x16a4f98c(%rbx,%rbx,8)
7: 90 08 f8 f7
b: e8 8b 08 f8 f7 callq 0xf7f8089b
10: e8 e6 7f fe f7 callq 0xf7fe7ffb
15: 66 90 xchg %ax,%ax
17: e8 7f 08 f8 f7 callq 0xf7f8089b
1c: 0f 00 2d c8 85 d1 00 verw 0xd185c8(%rip) # 0xd185eb
23: e8 73 08 f8 f7 callq 0xf7f8089b
28: fb sti
29: f4 hlt
* 2a: 9c pushfq <-- trapping instruction
2b: 5b pop %rbx
2c: 81 e3 00 02 00 00 and $0x200,%ebx
32: fa cli
33: 31 ff xor %edi,%edi
35: 48 89 de mov %rbx,%rsi
38: e8 3e 05 f8 f7 callq 0xf7f8057b
3d: 48 85 db test %rbx,%rbx


Tested on:

commit: 105a36f3 Merge tag 'kbuild-fixes-v6.0-3' of git://git...
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=153a1b60880000
kernel config: https://syzkaller.appspot.com/x/.config?x=7db7ad17eb14cb7
dashboard link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=118f7ca8880000

ead...@sina.com

unread,
Sep 28, 2022, 6:24:14 AMSep 28
to syzbot+dfcc5f...@syzkaller.appspotmail.com, syzkall...@googlegroups.com, Edward Adam Davis
From: Edward Adam Davis <ead...@sina.com>

On Mon, 26 Sep 2022 09:33:46 -0700
> syzbot found the following issue on:
>
> HEAD commit: 105a36f3694e Merge tag 'kbuild-fixes-v6.0-3' of git://git...
> git tree: upstream
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=152bf540880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=7db7ad17eb14cb7
> dashboard link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1020566c880000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=104819e4880000

Check the issue is fixed.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 105a36f3694e

--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4671,7 +4671,10 @@ static bool __need_reclaim(gfp_t gfp_mask)

void __fs_reclaim_acquire(unsigned long ip)
{
+ unsigned long save_flags;
+ local_irq_save(save_flags);
lock_acquire_exclusive(&__fs_reclaim_map, 0, 0, NULL, ip);
+ local_irq_restore(save_flags);
}

void __fs_reclaim_release(unsigned long ip)
--

syzbot

unread,
Sep 28, 2022, 6:52:26 AMSep 28
to ead...@sina.com, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
inconsistent lock state in kmem_cache_alloc

================================
WARNING: inconsistent lock state
6.0.0-rc6-syzkaller-00321-g105a36f3694e-dirty #0 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
kworker/1:0/22 [HC0[0]:SC1[1]:HE0:SE0] takes:
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: might_alloc include/linux/sched/mm.h:271 [inline]
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: slab_pre_alloc_hook mm/slab.h:700 [inline]
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: slab_alloc mm/slab.c:3278 [inline]
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: __kmem_cache_alloc_lru mm/slab.c:3471 [inline]
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: kmem_cache_alloc+0x39/0x520 mm/slab.c:3491
{SOFTIRQ-ON-W} state was registered at:
__trace_hardirqs_on_caller kernel/locking/lockdep.c:4260 [inline]
lockdep_hardirqs_on_prepare kernel/locking/lockdep.c:4319 [inline]
lockdep_hardirqs_on_prepare+0x28b/0x400 kernel/locking/lockdep.c:4271
trace_hardirqs_on+0x2d/0x120 kernel/trace/trace_preemptirq.c:49
__fs_reclaim_acquire+0x65/0x70 mm/page_alloc.c:4677
fs_reclaim_acquire+0x100/0x150 mm/page_alloc.c:4691
might_alloc include/linux/sched/mm.h:271 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]
kmem_cache_alloc_trace+0x38/0x460 mm/slab.c:3557
kmalloc include/linux/slab.h:600 [inline]
kzalloc include/linux/slab.h:733 [inline]
alloc_workqueue_attrs+0x39/0xc0 kernel/workqueue.c:3394
wq_numa_init kernel/workqueue.c:5964 [inline]
workqueue_init+0x12f/0x8ae kernel/workqueue.c:6091
kernel_init_freeable+0x3fb/0x73a init/main.c:1607
kernel_init+0x1a/0x1d0 init/main.c:1512
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
irq event stamp: 475627
hardirqs last enabled at (475626): [<ffffffff81c6585b>] kfree+0x25b/0x390 mm/slab.c:3787
hardirqs last disabled at (475627): [<ffffffff81bc8894>] __fs_reclaim_acquire+0x44/0x70 mm/page_alloc.c:4675
softirqs last enabled at (475612): [<ffffffff857c1229>] spin_unlock_bh include/linux/spinlock.h:394 [inline]
softirqs last enabled at (475612): [<ffffffff857c1229>] nsim_dev_trap_report drivers/net/netdevsim/dev.c:814 [inline]
softirqs last enabled at (475612): [<ffffffff857c1229>] nsim_dev_trap_report_work+0x869/0xc50 drivers/net/netdevsim/dev.c:844
softirqs last disabled at (475623): [<ffffffff814841f3>] invoke_softirq kernel/softirq.c:445 [inline]
softirqs last disabled at (475623): [<ffffffff814841f3>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(fs_reclaim);
<Interrupt>
lock(fs_reclaim);

*** DEADLOCK ***

5 locks held by kworker/1:0/22:
#0: ffff888011867d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
#0: ffff888011867d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic_long_set include/linux/atomic/atomic-long.h:41 [inline]
#0: ffff888011867d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/linux/atomic/atomic-instrumented.h:1280 [inline]
#0: ffff888011867d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:636 [inline]
#0: ffff888011867d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:663 [inline]
#0: ffff888011867d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x87a/0x1610 kernel/workqueue.c:2260
#1: ffffc900003dfda8 ((work_completion)(&(&nsim_dev->trap_data->trap_report_dw)->work)){+.+.}-{0:0}, at: process_one_work+0x8ae/0x1610 kernel/workqueue.c:2264
#2: ffff88807e2f62f8 (&devlink->lock_key){+.+.}-{3:3}, at: nsim_dev_trap_report_work+0x54/0xc50 drivers/net/netdevsim/dev.c:835
#3: ffff888022b554e0 (&nsim_trap_data->trap_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:349 [inline]
#3: ffff888022b554e0 (&nsim_trap_data->trap_lock){+.+.}-{2:2}, at: nsim_dev_trap_report drivers/net/netdevsim/dev.c:785 [inline]
#3: ffff888022b554e0 (&nsim_trap_data->trap_lock){+.+.}-{2:2}, at: nsim_dev_trap_report_work+0x1c8/0xc50 drivers/net/netdevsim/dev.c:844
#4: ffffffff91227508 (&fsnotify_mark_srcu){....}-{0:0}, at: fsnotify+0x2f4/0x1680 fs/notify/fsnotify.c:544

stack backtrace:
CPU: 1 PID: 22 Comm: kworker/1:0 Not tainted 6.0.0-rc6-syzkaller-00321-g105a36f3694e-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Workqueue: events nsim_dev_trap_report_work
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_usage_bug kernel/locking/lockdep.c:3961 [inline]
valid_state kernel/locking/lockdep.c:3973 [inline]
mark_lock_irq kernel/locking/lockdep.c:4176 [inline]
mark_lock.part.0.cold+0x18/0xd8 kernel/locking/lockdep.c:4632
mark_lock kernel/locking/lockdep.c:4596 [inline]
mark_usage kernel/locking/lockdep.c:4527 [inline]
__lock_acquire+0x11d9/0x56d0 kernel/locking/lockdep.c:5007
lock_acquire kernel/locking/lockdep.c:5666 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
__fs_reclaim_acquire+0x60/0x70 mm/page_alloc.c:4676
fs_reclaim_acquire+0x100/0x150 mm/page_alloc.c:4691
might_alloc include/linux/sched/mm.h:271 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]
__kmem_cache_alloc_lru mm/slab.c:3471 [inline]
kmem_cache_alloc+0x39/0x520 mm/slab.c:3491
RIP: 0010:memory_is_nonzero mm/kasan/generic.c:114 [inline]
RIP: 0010:memory_is_poisoned_n mm/kasan/generic.c:128 [inline]
RIP: 0010:memory_is_poisoned mm/kasan/generic.c:159 [inline]
RIP: 0010:check_region_inline mm/kasan/generic.c:180 [inline]
RIP: 0010:kasan_check_range+0xa0/0x180 mm/kasan/generic.c:189
Code: 0f 49 da 49 c1 fb 03 45 85 db 0f 84 cb 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 d0 0f 84 af 00 00 00 48 83 38 00 <74> ed 48 8d 50 08 eb 09 48 83 c0 01 48 39 d0 74 7a 80 38 00 74 f2
RSP: 0018:ffffc900003dfbd8 EFLAGS: 00000246
RAX: ffffed10043547d0 RBX: ffffed10043547e9 RCX: ffffffff874f0e44
RDX: ffffed10043547e8 RSI: 00000000000000c8 RDI: ffff888021aa3e80
RBP: ffffed10043547d0 R08: 0000000000000001 R09: ffff888021aa3f47
R10: ffffed10043547e8 R11: 0000000000000003 R12: 00000000000000c8
R13: 0000000000000000 R14: 0000000000000a20 R15: 0000000000001000
memset+0x20/0x40 mm/kasan/shadow.c:44
__alloc_skb+0x114/0x2f0 net/core/skbuff.c:450
alloc_skb include/linux/skbuff.h:1257 [inline]
nsim_dev_trap_skb_build drivers/net/netdevsim/dev.c:742 [inline]
nsim_dev_trap_report drivers/net/netdevsim/dev.c:799 [inline]
nsim_dev_trap_report_work+0x2ad/0xc50 drivers/net/netdevsim/dev.c:844
process_one_work+0x991/0x1610 kernel/workqueue.c:2289
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e4/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
</TASK>
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 22, name: kworker/1:0
preempt_count: 101, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 22 Comm: kworker/1:0 Not tainted 6.0.0-rc6-syzkaller-00321-g105a36f3694e-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Workqueue: events nsim_dev_trap_report_work
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
__might_resched.cold+0x222/0x26b kernel/sched/core.c:9892
might_alloc include/linux/sched/mm.h:274 [inline]
slab_pre_alloc_hook mm/slab.h:700 [inline]
slab_alloc mm/slab.c:3278 [inline]
__kmem_cache_alloc_lru mm/slab.c:3471 [inline]
kmem_cache_alloc+0x381/0x520 mm/slab.c:3491
RIP: 0010:memory_is_nonzero mm/kasan/generic.c:114 [inline]
RIP: 0010:memory_is_poisoned_n mm/kasan/generic.c:128 [inline]
RIP: 0010:memory_is_poisoned mm/kasan/generic.c:159 [inline]
RIP: 0010:check_region_inline mm/kasan/generic.c:180 [inline]
RIP: 0010:kasan_check_range+0xa0/0x180 mm/kasan/generic.c:189
Code: 0f 49 da 49 c1 fb 03 45 85 db 0f 84 cb 00 00 00 45 89 db 4a 8d 14 d8 eb 0d 48 83 c0 08 48 39 d0 0f 84 af 00 00 00 48 83 38 00 <74> ed 48 8d 50 08 eb 09 48 83 c0 01 48 39 d0 74 7a 80 38 00 74 f2
RSP: 0018:ffffc900003dfbd8 EFLAGS: 00000246
RAX: ffffed10043547d0 RBX: ffffed10043547e9 RCX: ffffffff874f0e44
RDX: ffffed10043547e8 RSI: 00000000000000c8 RDI: ffff888021aa3e80
RBP: ffffed10043547d0 R08: 0000000000000001 R09: ffff888021aa3f47
R10: ffffed10043547e8 R11: 0000000000000003 R12: 00000000000000c8
R13: 0000000000000000 R14: 0000000000000a20 R15: 0000000000001000
memset+0x20/0x40 mm/kasan/shadow.c:44
__alloc_skb+0x114/0x2f0 net/core/skbuff.c:450
alloc_skb include/linux/skbuff.h:1257 [inline]
nsim_dev_trap_skb_build drivers/net/netdevsim/dev.c:742 [inline]
nsim_dev_trap_report drivers/net/netdevsim/dev.c:799 [inline]
nsim_dev_trap_report_work+0x2ad/0xc50 drivers/net/netdevsim/dev.c:844
process_one_work+0x991/0x1610 kernel/workqueue.c:2289
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e4/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
</TASK>
----------------
Code disassembly (best guess):
0: 0f 49 da cmovns %edx,%ebx
3: 49 c1 fb 03 sar $0x3,%r11
7: 45 85 db test %r11d,%r11d
a: 0f 84 cb 00 00 00 je 0xdb
10: 45 89 db mov %r11d,%r11d
13: 4a 8d 14 d8 lea (%rax,%r11,8),%rdx
17: eb 0d jmp 0x26
19: 48 83 c0 08 add $0x8,%rax
1d: 48 39 d0 cmp %rdx,%rax
20: 0f 84 af 00 00 00 je 0xd5
26: 48 83 38 00 cmpq $0x0,(%rax)
* 2a: 74 ed je 0x19 <-- trapping instruction
2c: 48 8d 50 08 lea 0x8(%rax),%rdx
30: eb 09 jmp 0x3b
32: 48 83 c0 01 add $0x1,%rax
36: 48 39 d0 cmp %rdx,%rax
39: 74 7a je 0xb5
3b: 80 38 00 cmpb $0x0,(%rax)
3e: 74 f2 je 0x32


Tested on:

commit: 105a36f3 Merge tag 'kbuild-fixes-v6.0-3' of git://git...
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=15064e9c880000
kernel config: https://syzkaller.appspot.com/x/.config?x=7db7ad17eb14cb7
dashboard link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=1417f52f080000

ead...@sina.com

unread,
Sep 28, 2022, 7:22:27 AMSep 28
to syzbot+dfcc5f...@syzkaller.appspotmail.com, syzkall...@googlegroups.com, Edward Adam Davis
From: Edward Adam Davis <ead...@sina.com>

On Mon, 26 Sep 2022 09:33:46 -0700
> syzbot found the following issue on:
>
> HEAD commit: 105a36f3694e Merge tag 'kbuild-fixes-v6.0-3' of git://git...
> git tree: upstream
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=152bf540880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=7db7ad17eb14cb7
> dashboard link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1020566c880000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=104819e4880000

Check the issue is fixed.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 105a36f3694e

--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -268,8 +268,11 @@ static inline void memalloc_retry_wait(gfp_t gfp_flags)
*/
static inline void might_alloc(gfp_t gfp_mask)
{
+ unsigned long save_flags;
+ local_irq_save(save_flags);
fs_reclaim_acquire(gfp_mask);
fs_reclaim_release(gfp_mask);
+ local_irq_restore(save_flags);

might_sleep_if(gfpflags_allow_blocking(gfp_mask));
}

syzbot

unread,
Sep 28, 2022, 9:59:26 AMSep 28
to ead...@sina.com, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
inconsistent lock state in slab_pre_alloc_hook

================================
WARNING: inconsistent lock state
6.0.0-rc6-syzkaller-00321-g105a36f3694e-dirty #0 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/1/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: might_alloc include/linux/sched/mm.h:273 [inline]
ffffffff8c0be7e0 (fs_reclaim){+.?.}-{0:0}, at: slab_pre_alloc_hook.constprop.0+0x19d/0x580 mm/slab.h:700
{SOFTIRQ-ON-W} state was registered at:
__trace_hardirqs_on_caller kernel/locking/lockdep.c:4260 [inline]
lockdep_hardirqs_on_prepare kernel/locking/lockdep.c:4319 [inline]
lockdep_hardirqs_on_prepare+0x28b/0x400 kernel/locking/lockdep.c:4271
trace_hardirqs_on+0x2d/0x120 kernel/trace/trace_preemptirq.c:49
__fs_reclaim_acquire+0x65/0x70 mm/page_alloc.c:4677
fs_reclaim_acquire+0x100/0x150 mm/page_alloc.c:4691
dma_resv_lockdep+0x342/0x545 drivers/dma-buf/dma-resv.c:753
do_one_initcall+0xfe/0x650 init/main.c:1296
do_initcall_level init/main.c:1369 [inline]
do_initcalls init/main.c:1385 [inline]
do_basic_setup init/main.c:1404 [inline]
kernel_init_freeable+0x6b1/0x73a init/main.c:1623
kernel_init+0x1a/0x1d0 init/main.c:1512
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
irq event stamp: 1253445
hardirqs last enabled at (1253444): [<ffffffff89a00d66>] asm_sysvec_call_function_single+0x16/0x20 arch/x86/include/asm/idtentry.h:657
hardirqs last disabled at (1253445): [<ffffffff81c65916>] might_alloc include/linux/sched/mm.h:272 [inline]
hardirqs last disabled at (1253445): [<ffffffff81c65916>] slab_pre_alloc_hook.constprop.0+0x196/0x580 mm/slab.h:700
softirqs last enabled at (1253420): [<ffffffff814841f3>] invoke_softirq kernel/softirq.c:445 [inline]
softirqs last enabled at (1253420): [<ffffffff814841f3>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
softirqs last disabled at (1253439): [<ffffffff814841f3>] invoke_softirq kernel/softirq.c:445 [inline]
softirqs last disabled at (1253439): [<ffffffff814841f3>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(fs_reclaim);
<Interrupt>
lock(fs_reclaim);

*** DEADLOCK ***

1 lock held by swapper/1/0:
#0: ffffffff91228508 (&fsnotify_mark_srcu){....}-{0:0}, at: fsnotify+0x2f4/0x1680 fs/notify/fsnotify.c:544

stack backtrace:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.0.0-rc6-syzkaller-00321-g105a36f3694e-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_usage_bug kernel/locking/lockdep.c:3961 [inline]
valid_state kernel/locking/lockdep.c:3973 [inline]
mark_lock_irq kernel/locking/lockdep.c:4176 [inline]
mark_lock.part.0.cold+0x18/0xd8 kernel/locking/lockdep.c:4632
mark_lock kernel/locking/lockdep.c:4596 [inline]
mark_usage kernel/locking/lockdep.c:4527 [inline]
__lock_acquire+0x11d9/0x56d0 kernel/locking/lockdep.c:5007
lock_acquire kernel/locking/lockdep.c:5666 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
__fs_reclaim_acquire+0x2e/0x70 mm/page_alloc.c:4676
fs_reclaim_acquire+0x100/0x150 mm/page_alloc.c:4691
might_alloc include/linux/sched/mm.h:273 [inline]
slab_pre_alloc_hook.constprop.0+0x19d/0x580 mm/slab.h:700
slab_alloc mm/slab.c:3278 [inline]
__kmem_cache_alloc_lru mm/slab.c:3471 [inline]
kmem_cache_alloc+0x4e/0x3b0 mm/slab.c:3491
RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:70 [inline]
RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:130 [inline]
RIP: 0010:acpi_safe_halt+0x6f/0xb0 drivers/acpi/processor_idle.c:113
Code: f7 84 db 74 06 5b e9 90 08 f8 f7 e8 8b 08 f8 f7 e8 e6 7f fe f7 66 90 e8 7f 08 f8 f7 0f 00 2d c8 85 d1 00 e8 73 08 f8 f7 fb f4 <9c> 5b 81 e3 00 02 00 00 fa 31 ff 48 89 de e8 3e 05 f8 f7 48 85 db
RSP: 0018:ffffc9000038fd20 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8880121f0200 RSI: ffffffff8983319d RDI: 0000000000000000
RBP: ffff8880178af864 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: 0000000000000001 R14: ffff8880178af800 R15: ffff88801be98804
acpi_idle_do_entry drivers/acpi/processor_idle.c:555 [inline]
acpi_idle_enter+0x524/0x6a0 drivers/acpi/processor_idle.c:692
cpuidle_enter_state+0x1ab/0xd30 drivers/cpuidle/cpuidle.c:239
cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:356
call_cpuidle kernel/sched/idle.c:155 [inline]
cpuidle_idle_call kernel/sched/idle.c:236 [inline]
do_idle+0x3e8/0x590 kernel/sched/idle.c:303
cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:400
start_secondary+0x21d/0x2b0 arch/x86/kernel/smpboot.c:262
secondary_startup_64_no_verify+0xce/0xdb
</TASK>
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:277
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/1
preempt_count: 101, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.0.0-rc6-syzkaller-00321-g105a36f3694e-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
__might_resched.cold+0x222/0x26b kernel/sched/core.c:9892
might_alloc include/linux/sched/mm.h:277 [inline]
slab_pre_alloc_hook.constprop.0+0x175/0x580 mm/slab.h:700
slab_alloc mm/slab.c:3278 [inline]
__kmem_cache_alloc_lru mm/slab.c:3471 [inline]
kmem_cache_alloc+0x4e/0x3b0 mm/slab.c:3491
RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline]
RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:70 [inline]
RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:130 [inline]
RIP: 0010:acpi_safe_halt+0x6f/0xb0 drivers/acpi/processor_idle.c:113
Code: f7 84 db 74 06 5b e9 90 08 f8 f7 e8 8b 08 f8 f7 e8 e6 7f fe f7 66 90 e8 7f 08 f8 f7 0f 00 2d c8 85 d1 00 e8 73 08 f8 f7 fb f4 <9c> 5b 81 e3 00 02 00 00 fa 31 ff 48 89 de e8 3e 05 f8 f7 48 85 db
RSP: 0018:ffffc9000038fd20 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8880121f0200 RSI: ffffffff8983319d RDI: 0000000000000000
RBP: ffff8880178af864 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
R13: 0000000000000001 R14: ffff8880178af800 R15: ffff88801be98804
acpi_idle_do_entry drivers/acpi/processor_idle.c:555 [inline]
acpi_idle_enter+0x524/0x6a0 drivers/acpi/processor_idle.c:692
cpuidle_enter_state+0x1ab/0xd30 drivers/cpuidle/cpuidle.c:239
cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:356
call_cpuidle kernel/sched/idle.c:155 [inline]
cpuidle_idle_call kernel/sched/idle.c:236 [inline]
do_idle+0x3e8/0x590 kernel/sched/idle.c:303
cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:400
start_secondary+0x21d/0x2b0 arch/x86/kernel/smpboot.c:262
secondary_startup_64_no_verify+0xce/0xdb
</TASK>
----------------
Code disassembly (best guess):
0: f7 84 db 74 06 5b e9 testl $0xf7f80890,-0x16a4f98c(%rbx,%rbx,8)
7: 90 08 f8 f7
b: e8 8b 08 f8 f7 callq 0xf7f8089b
10: e8 e6 7f fe f7 callq 0xf7fe7ffb
15: 66 90 xchg %ax,%ax
17: e8 7f 08 f8 f7 callq 0xf7f8089b
1c: 0f 00 2d c8 85 d1 00 verw 0xd185c8(%rip) # 0xd185eb
23: e8 73 08 f8 f7 callq 0xf7f8089b
28: fb sti
29: f4 hlt
* 2a: 9c pushfq <-- trapping instruction
2b: 5b pop %rbx
2c: 81 e3 00 02 00 00 and $0x200,%ebx
32: fa cli
33: 31 ff xor %edi,%edi
35: 48 89 de mov %rbx,%rsi
38: e8 3e 05 f8 f7 callq 0xf7f8057b
3d: 48 85 db test %rbx,%rbx


Tested on:

commit: 105a36f3 Merge tag 'kbuild-fixes-v6.0-3' of git://git...
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=1619ef6c880000
kernel config: https://syzkaller.appspot.com/x/.config?x=7db7ad17eb14cb7
dashboard link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=156db588880000

Vlastimil Babka

unread,
Sep 29, 2022, 9:24:25 AMSep 29
to syzbot, ak...@linux-foundation.org, kees...@chromium.org, linux-...@vger.kernel.org, mark.r...@arm.com, mhir...@kernel.org, ros...@goodmis.org, syzkall...@googlegroups.com, linu...@kvack.org, Michal Hocko, Jan Kara, Amir Goldstein, Matthew Bobrowski, Linux-FSDevel, Sebastian Andrzej Siewior, Matthew Wilcox
On 9/26/22 18:33, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 105a36f3694e Merge tag 'kbuild-fixes-v6.0-3' of git://git...
> git tree: upstream
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=152bf540880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=7db7ad17eb14cb7
> dashboard link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1020566c880000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=104819e4880000
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+dfcc5f...@syzkaller.appspotmail.com

+CC more folks

I'm not fully sure what this report means but I assume it's because there's
a GFP_KERNEL kmalloc() allocation from softirq context? Should it perhaps
use memalloc_nofs_save() at some well defined point?

Sebastian Andrzej Siewior

unread,
Sep 29, 2022, 9:40:47 AMSep 29
to Vlastimil Babka, syzbot, ak...@linux-foundation.org, kees...@chromium.org, linux-...@vger.kernel.org, mark.r...@arm.com, mhir...@kernel.org, ros...@goodmis.org, syzkall...@googlegroups.com, linu...@kvack.org, Michal Hocko, Jan Kara, Amir Goldstein, Matthew Bobrowski, Linux-FSDevel, Matthew Wilcox
On 2022-09-29 15:24:22 [+0200], Vlastimil Babka wrote:
> I'm not fully sure what this report means but I assume it's because there's
> a GFP_KERNEL kmalloc() allocation from softirq context? Should it perhaps
> use memalloc_nofs_save() at some well defined point?

my guess is

> > Call Trace:
> > <IRQ>

> > __fs_reclaim_acquire mm/page_alloc.c:4674 [inline]
> > fs_reclaim_acquire+0x115/0x160 mm/page_alloc.c:4688
> > might_alloc include/linux/sched/mm.h:271 [inline]
> > slab_pre_alloc_hook mm/slab.h:700 [inline]
> > slab_alloc mm/slab.c:3278 [inline]
> > __kmem_cache_alloc_lru mm/slab.c:3471 [inline]
> > kmem_cache_alloc+0x39/0x520 mm/slab.c:3491
> > fanotify_alloc_fid_event fs/notify/fanotify/fanotify.c:580 [inline]
> > fanotify_alloc_event fs/notify/fanotify/fanotify.c:813 [inline]

this sets GFP to GFP_KERNEL_ACCOUNT + (__GFP_NOFAIL ||
__GFP_RETRY_MAYFAIL) which contains GFP_KERNEL and

> > fanotify_handle_event+0x1130/0x3f40 fs/notify/fanotify/fanotify.c:948
> > send_to_group fs/notify/fsnotify.c:360 [inline]
> > fsnotify+0xafb/0x1680 fs/notify/fsnotify.c:570
> > __fsnotify_parent+0x62f/0xa60 fs/notify/fsnotify.c:230
> > fsnotify_parent include/linux/fsnotify.h:77 [inline]
> > fsnotify_file include/linux/fsnotify.h:99 [inline]
> > fsnotify_access include/linux/fsnotify.h:309 [inline]
> > __io_complete_rw_common+0x485/0x720 io_uring/rw.c:195
> > io_complete_rw+0x1a/0x1f0 io_uring/rw.c:228
> > iomap_dio_complete_work fs/iomap/direct-io.c:144 [inline]
> > iomap_dio_bio_end_io+0x438/0x5e0 fs/iomap/direct-io.c:178
> > bio_endio+0x5f9/0x780 block/bio.c:1564
> > req_bio_endio block/blk-mq.c:695 [inline]
> > blk_update_request+0x3fc/0x1300 block/blk-mq.c:825
> > scsi_end_request+0x7a/0x9a0 drivers/scsi/scsi_lib.c:541
> > scsi_io_completion+0x173/0x1f70 drivers/scsi/scsi_lib.c:971
> > scsi_complete+0x122/0x3b0 drivers/scsi/scsi_lib.c:1438
> > blk_complete_reqs+0xad/0xe0 block/blk-mq.c:1022
> > __do_softirq+0x1d3/0x9c6 kernel/softirq.c:571
> > invoke_softirq kernel/softirq.c:445 [inline]
> > __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
> > irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
> > common_interrupt+0xa9/0xc0 arch/x86/kernel/irq.c:240
> > </IRQ>


we originate from softirq we can't use GFP_KERNEL. This also noted here:

> > BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
> > in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/1
> > preempt_count: 101, expected: 0
> > RCU nest depth: 0, expected: 0
> > INFO: lockdep is turned off.
> > Preemption disabled at:
> > [<0000000000000000>] 0x0
> > CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.0.0-rc6-syzkaller-00321-g105a36f3694e #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
> > Call Trace:
> > <IRQ>
> > __dump_stack lib/dump_stack.c:88 [inline]
> > dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
> > __might_resched.cold+0x222/0x26b kernel/sched/core.c:9892
> > might_alloc include/linux/sched/mm.h:274 [inline]
> > slab_pre_alloc_hook mm/slab.h:700 [inline]
> > slab_alloc mm/slab.c:3278 [inline]
> > __kmem_cache_alloc_lru mm/slab.c:3471 [inline]
> > kmem_cache_alloc+0x381/0x520 mm/slab.c:3491
> > fanotify_alloc_fid_event fs/notify/fanotify/fanotify.c:580 [inline]
> > fanotify_alloc_event fs/notify/fanotify/fanotify.c:813 [inline]
> > fanotify_handle_event+0x1130/0x3f40 fs/notify/fanotify/fanotify.c:948

So either the caller needs to be put into task-context or
fanotify_alloc_event() needs something like

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index cd7d09a569fff..9f6c5813f7a93 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -710,7 +710,7 @@ static struct fanotify_event *fanotify_alloc_event(
__kernel_fsid_t *fsid, u32 match_mask)
{
struct fanotify_event *event = NULL;
- gfp_t gfp = GFP_KERNEL_ACCOUNT;
+ gfp_t gfp = GFP_ATOMIC | __GFP_ACCOUNT;
unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
struct inode *id = fanotify_fid_inode(mask, data, data_type, dir,
fid_mode);

Sebastian

Jan Kara

unread,
Sep 29, 2022, 9:56:29 AMSep 29
to Vlastimil Babka, syzbot, ak...@linux-foundation.org, kees...@chromium.org, linux-...@vger.kernel.org, mark.r...@arm.com, mhir...@kernel.org, ros...@goodmis.org, syzkall...@googlegroups.com, linu...@kvack.org, Michal Hocko, Jan Kara, Amir Goldstein, Matthew Bobrowski, Linux-FSDevel, Sebastian Andrzej Siewior, Matthew Wilcox, Jens Axboe, io-u...@vger.kernel.org
On Thu 29-09-22 15:24:22, Vlastimil Babka wrote:
> On 9/26/22 18:33, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 105a36f3694e Merge tag 'kbuild-fixes-v6.0-3' of git://git...
> > git tree: upstream
> > console+strace: https://syzkaller.appspot.com/x/log.txt?x=152bf540880000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=7db7ad17eb14cb7
> > dashboard link: https://syzkaller.appspot.com/bug?extid=dfcc5f4da15868df7d4d
> > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1020566c880000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=104819e4880000
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+dfcc5f...@syzkaller.appspotmail.com
>
> +CC more folks
>
> I'm not fully sure what this report means but I assume it's because there's
> a GFP_KERNEL kmalloc() allocation from softirq context? Should it perhaps
> use memalloc_nofs_save() at some well defined point?

Thanks for the CC. The problem really is that io_uring is calling into
fsnotify_access() from softirq context. That isn't going to work. The
allocation is just a tip of the iceberg. Fsnotify simply does not expect to
be called from softirq context. All the dcache locks are not IRQ safe, it
can even obtain some sleeping locks and call to userspace if there are
suitable watches set up.

So either io_uring needs to postpone fsnotify calls to a workqueue or we
need a way for io_uring code to tell iomap dio code that the completion
needs to always happen from a workqueue (as it currently does for writes).
Jens?

Honza
--
Jan Kara <ja...@suse.com>
SUSE Labs, CR

Jens Axboe

unread,
Sep 29, 2022, 10:07:44 AMSep 29
to Jan Kara, Vlastimil Babka, syzbot, ak...@linux-foundation.org, kees...@chromium.org, linux-...@vger.kernel.org, mark.r...@arm.com, mhir...@kernel.org, ros...@goodmis.org, syzkall...@googlegroups.com, linu...@kvack.org, Michal Hocko, Amir Goldstein, Matthew Bobrowski, Linux-FSDevel, Sebastian Andrzej Siewior, Matthew Wilcox, io-u...@vger.kernel.org
Something like this should probably work - I'll write a test case and
vet it.


diff --git a/io_uring/rw.c b/io_uring/rw.c
index 1ae1e52ab4cb..a25cd44cd415 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -236,14 +236,6 @@ static void kiocb_end_write(struct io_kiocb *req)

static bool __io_complete_rw_common(struct io_kiocb *req, long res)
{
- struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
-
- if (rw->kiocb.ki_flags & IOCB_WRITE) {
- kiocb_end_write(req);
- fsnotify_modify(req->file);
- } else {
- fsnotify_access(req->file);
- }
if (unlikely(res != req->cqe.res)) {
if ((res == -EAGAIN || res == -EOPNOTSUPP) &&
io_rw_should_reissue(req)) {
@@ -270,6 +262,20 @@ static inline int io_fixup_rw_res(struct io_kiocb *req, long res)
return res;
}

+static void io_req_rw_complete(struct io_kiocb *req, bool *locked)
+{
+ struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw);
+
+ if (rw->kiocb.ki_flags & IOCB_WRITE) {
+ kiocb_end_write(req);
+ fsnotify_modify(req->file);
+ } else {
+ fsnotify_access(req->file);
+ }
+
+ io_req_task_complete(req, locked);
+}
+
static void io_complete_rw(struct kiocb *kiocb, long res)
{
struct io_rw *rw = container_of(kiocb, struct io_rw, kiocb);
@@ -278,7 +284,7 @@ static void io_complete_rw(struct kiocb *kiocb, long res)
if (__io_complete_rw_common(req, res))
return;
io_req_set_res(req, io_fixup_rw_res(req, res), 0);
- req->io_task_work.func = io_req_task_complete;
+ req->io_task_work.func = io_req_rw_complete;
io_req_task_work_add(req);
}


--
Jens Axboe

Jens Axboe

unread,
Sep 29, 2022, 12:54:14 PMSep 29
to Jan Kara, Vlastimil Babka, syzbot, ak...@linux-foundation.org, kees...@chromium.org, linux-...@vger.kernel.org, mark.r...@arm.com, mhir...@kernel.org, ros...@goodmis.org, syzkall...@googlegroups.com, linu...@kvack.org, Michal Hocko, Amir Goldstein, Matthew Bobrowski, Linux-FSDevel, Sebastian Andrzej Siewior, Matthew Wilcox, io-u...@vger.kernel.org
Ran that with the attached test case, triggers it before but not with
the patch. Side note - I do wish that the syzbot reproducers were not
x86 specific, I always have to go and edit them for arm64. For this
particular one, I just gave up and wrote one myself.

Thanks for the heads-up Jan, I'll queue up this fix and mark for stable
with the right attributions.

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/fanotify.h>
#include <sys/wait.h>
#include <liburing.h>

int main(int argc, char *argv[])
{
struct io_uring_sqe *sqe;
struct io_uring_cqe *cqe;
struct io_uring ring;
int fan, ret, fd;
void *buf;

fan = fanotify_init(FAN_CLASS_NOTIF|FAN_CLASS_CONTENT, 0);
if (fan < 0) {
if (errno == ENOSYS)
return 0;
perror("fanotify_init");
return 1;
}

if (argc > 1) {
fd = open(argv[1], O_RDONLY | O_DIRECT);
if (fd < 0) {
perror("open");
return 1;
}
} else {
fd = open("file0", O_RDONLY | O_DIRECT);
if (fd < 0) {
perror("open");
return 1;
}
}

ret = fanotify_mark(fan, FAN_MARK_ADD, FAN_ACCESS|FAN_MODIFY, fd, NULL);
if (ret < 0) {
perror("fanotify_mark");
return 1;
}

ret = 0;
if (fork()) {
int wstat;

io_uring_queue_init(4, &ring, 0);
if (posix_memalign(&buf, 4096, 4096))
return 0;
sqe = io_uring_get_sqe(&ring);
io_uring_prep_read(sqe, fd, buf, 4096, 0);
io_uring_submit(&ring);
ret = io_uring_wait_cqe(&ring, &cqe);
if (ret) {
fprintf(stderr, "wait_ret=%d\n", ret);
return 1;
}
wait(&wstat);
ret = WEXITSTATUS(wstat);
} else {
struct fanotify_event_metadata m;
int fret;

fret = read(fan, &m, sizeof(m));
if (fret < 0)
perror("fanotify read");
/* fail if mask isn't right or pid indicates non-task context */
else if (!(m.mask & 1) || !m.pid)
exit(1);
exit(0);
}

return ret;
}

--
Jens Axboe

Jan Kara

unread,
Sep 30, 2022, 9:51:36 AMSep 30
to Jens Axboe, Jan Kara, Vlastimil Babka, syzbot, ak...@linux-foundation.org, kees...@chromium.org, linux-...@vger.kernel.org, mark.r...@arm.com, mhir...@kernel.org, ros...@goodmis.org, syzkall...@googlegroups.com, linu...@kvack.org, Michal Hocko, Amir Goldstein, Matthew Bobrowski, Linux-FSDevel, Sebastian Andrzej Siewior, Matthew Wilcox, io-u...@vger.kernel.org
Thanks for fixing this so quickly! The test looks good to me.

Honza

Steven Rostedt

unread,
Oct 12, 2022, 7:23:24 AMOct 12
to ead...@sina.com, syzbot+dfcc5f...@syzkaller.appspotmail.com, ak...@linux-foundation.org, kees...@chromium.org, linux-...@vger.kernel.org, mark.r...@arm.com, mhir...@kernel.org, syzkall...@googlegroups.com, vba...@suse.cz, Thomas Gleixner, Peter Zijlstra
Please do not do this. Open coding interrupt disabling due to locking
issues is not the solution. You need to make the locks themselves
disable interrupts if need be. This breaks PREEMPT_RT, and creates a
"big kernel lock" situation where there's random interrupts being
disabled for no apparent reason.

-- Steve

Vlastimil Babka

unread,
Oct 12, 2022, 8:30:17 AMOct 12
to Steven Rostedt, ead...@sina.com, syzbot+dfcc5f...@syzkaller.appspotmail.com, ak...@linux-foundation.org, kees...@chromium.org, linux-...@vger.kernel.org, mark.r...@arm.com, mhir...@kernel.org, syzkall...@googlegroups.com, Thomas Gleixner, Peter Zijlstra
On 10/12/22 13:23, Steven Rostedt wrote:
> On Tue, 27 Sep 2022 15:11:34 +0800
> ead...@sina.com wrote:
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -3275,15 +3275,19 @@ slab_alloc(struct kmem_cache *cachep, struct list_lru *lru, gfp_t flags,
>> bool init = false;
>>
>> flags &= gfp_allowed_mask;
>> + local_irq_save(save_flags);
>
> Please do not do this. Open coding interrupt disabling due to locking
> issues is not the solution. You need to make the locks themselves
> disable interrupts if need be. This breaks PREEMPT_RT, and creates a
> "big kernel lock" situation where there's random interrupts being
> disabled for no apparent reason.
>
> -- Steve

And FWIW the problem was in the calling context of io_uring, not slab code
itself in the first place, see:

https://lore.kernel.org/all/20220929135627.ykivmdks2w5vzrwg@quack3/
Reply all
Reply to author
Forward
0 new messages