[syzbot] [kernel?] possible deadlock in console_flush_all (2)

33 views
Skip to first unread message

syzbot

unread,
Oct 8, 2023, 2:28:46 AM10/8/23
to linux-...@vger.kernel.org, syzkall...@googlegroups.com, tg...@linutronix.de
Hello,

syzbot found the following issue on:

HEAD commit: 7d730f1bf6f3 Add linux-next specific files for 20231005
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=172d36e9680000
kernel config: https://syzkaller.appspot.com/x/.config?x=f532286be4fff4b5
dashboard link: https://syzkaller.appspot.com/bug?extid=f78380e4eae53c64125c
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/1d7f28a4398f/disk-7d730f1b.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/d454d124268e/vmlinux-7d730f1b.xz
kernel image: https://storage.googleapis.com/syzbot-assets/dbca966175cb/bzImage-7d730f1b.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f78380...@syzkaller.appspotmail.com

EEVDF scheduling fail, picking leftmost
======================================================
WARNING: possible circular locking dependency detected
6.6.0-rc4-next-20231005-syzkaller #0 Not tainted
------------------------------------------------------
syz-executor.0/5232 is trying to acquire lock:
ffffffff8cab8560 (console_owner){....}-{0:0}, at: console_lock_spinning_enable kernel/printk/printk.c:1855 [inline]
ffffffff8cab8560 (console_owner){....}-{0:0}, at: console_emit_next_record kernel/printk/printk.c:2888 [inline]
ffffffff8cab8560 (console_owner){....}-{0:0}, at: console_flush_all+0x49a/0xd50 kernel/printk/printk.c:2960

but task is already holding lock:
ffff8880b993c718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #4 (&rq->__lock){-.-.}-{2:2}:
_raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558
raw_spin_rq_lock kernel/sched/sched.h:1357 [inline]
rq_lock kernel/sched/sched.h:1671 [inline]
task_fork_fair+0x70/0x240 kernel/sched/fair.c:12399
sched_cgroup_fork+0x3cf/0x510 kernel/sched/core.c:4799
copy_process+0x4580/0x74b0 kernel/fork.c:2609
kernel_clone+0xfd/0x920 kernel/fork.c:2907
user_mode_thread+0xb4/0xf0 kernel/fork.c:2985
rest_init+0x27/0x2b0 init/main.c:691
arch_call_rest_init+0x13/0x30 init/main.c:823
start_kernel+0x39f/0x480 init/main.c:1068
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
secondary_startup_64_no_verify+0x166/0x16b

-> #3 (&p->pi_lock){-.-.}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:518 [inline]
try_to_wake_up+0xb0/0x15d0 kernel/sched/core.c:4213
__wake_up_common+0x140/0x5a0 kernel/sched/wait.c:107
__wake_up_common_lock+0xd6/0x140 kernel/sched/wait.c:138
tty_port_default_wakeup+0x2a/0x40 drivers/tty/tty_port.c:69
serial8250_tx_chars+0x542/0xf60 drivers/tty/serial/8250/8250_port.c:1842
serial8250_handle_irq+0x606/0xbe0 drivers/tty/serial/8250/8250_port.c:1949
serial8250_default_handle_irq+0x94/0x210 drivers/tty/serial/8250/8250_port.c:1969
serial8250_interrupt+0xfc/0x200 drivers/tty/serial/8250/8250_core.c:127
__handle_irq_event_percpu+0x22a/0x740 kernel/irq/handle.c:158
handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
handle_irq_event+0xab/0x1e0 kernel/irq/handle.c:210
handle_edge_irq+0x261/0xcf0 kernel/irq/chip.c:831
generic_handle_irq_desc include/linux/irqdesc.h:161 [inline]
handle_irq arch/x86/kernel/irq.c:238 [inline]
__common_interrupt+0xdb/0x240 arch/x86/kernel/irq.c:257
common_interrupt+0xa9/0xd0 arch/x86/kernel/irq.c:247
asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:636
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline]
_raw_spin_unlock_irqrestore+0x31/0x70 kernel/locking/spinlock.c:194
spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
uart_port_unlock_irqrestore include/linux/serial_core.h:667 [inline]
serial_port_runtime_resume+0x2ae/0x330 drivers/tty/serial/serial_port.c:41
__rpm_callback+0xc5/0x4c0 drivers/base/power/runtime.c:392
rpm_callback+0x1da/0x220 drivers/base/power/runtime.c:446
rpm_resume+0xdb5/0x1980 drivers/base/power/runtime.c:912
pm_runtime_work+0x12e/0x180 drivers/base/power/runtime.c:977
process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
process_scheduled_works kernel/workqueue.c:2703 [inline]
worker_thread+0x8b9/0x1290 kernel/workqueue.c:2784
kthread+0x33c/0x440 kernel/kthread.c:388
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304

-> #2 (&tty->write_wait){-...}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
__wake_up_common_lock+0xbb/0x140 kernel/sched/wait.c:137
tty_port_default_wakeup+0x2a/0x40 drivers/tty/tty_port.c:69
serial8250_tx_chars+0x542/0xf60 drivers/tty/serial/8250/8250_port.c:1842
serial8250_handle_irq+0x606/0xbe0 drivers/tty/serial/8250/8250_port.c:1949
serial8250_default_handle_irq+0x94/0x210 drivers/tty/serial/8250/8250_port.c:1969
serial8250_interrupt+0xfc/0x200 drivers/tty/serial/8250/8250_core.c:127
__handle_irq_event_percpu+0x22a/0x740 kernel/irq/handle.c:158
handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
handle_irq_event+0xab/0x1e0 kernel/irq/handle.c:210
handle_edge_irq+0x261/0xcf0 kernel/irq/chip.c:831
generic_handle_irq_desc include/linux/irqdesc.h:161 [inline]
handle_irq arch/x86/kernel/irq.c:238 [inline]
__common_interrupt+0xdb/0x240 arch/x86/kernel/irq.c:257
common_interrupt+0xa9/0xd0 arch/x86/kernel/irq.c:247
asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:636
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline]
_raw_spin_unlock_irqrestore+0x31/0x70 kernel/locking/spinlock.c:194
spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
uart_port_unlock_irqrestore include/linux/serial_core.h:667 [inline]
serial_port_runtime_resume+0x2ae/0x330 drivers/tty/serial/serial_port.c:41
__rpm_callback+0xc5/0x4c0 drivers/base/power/runtime.c:392
rpm_callback+0x1da/0x220 drivers/base/power/runtime.c:446
rpm_resume+0xdb5/0x1980 drivers/base/power/runtime.c:912
pm_runtime_work+0x12e/0x180 drivers/base/power/runtime.c:977
process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
process_scheduled_works kernel/workqueue.c:2703 [inline]
worker_thread+0x8b9/0x1290 kernel/workqueue.c:2784
kthread+0x33c/0x440 kernel/kthread.c:388
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304

-> #1 (&port_lock_key){-...}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
uart_port_lock_irqsave include/linux/serial_core.h:616 [inline]
serial8250_console_write+0xa7c/0x1060 drivers/tty/serial/8250/8250_port.c:3410
console_emit_next_record kernel/printk/printk.c:2894 [inline]
console_flush_all+0x4d5/0xd50 kernel/printk/printk.c:2960
console_unlock+0x10c/0x260 kernel/printk/printk.c:3029
vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2304
vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
_printk+0xc8/0x100 kernel/printk/printk.c:2329
register_console+0xb30/0x1210 kernel/printk/printk.c:3535
univ8250_console_init+0x35/0x50 drivers/tty/serial/8250/8250_core.c:717
console_init+0xba/0x5c0 kernel/printk/printk.c:3681
start_kernel+0x25a/0x480 init/main.c:1004
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
secondary_startup_64_no_verify+0x166/0x16b

-> #0 (console_owner){....}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3868 [inline]
__lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5136
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718
console_lock_spinning_enable kernel/printk/printk.c:1855 [inline]
console_emit_next_record kernel/printk/printk.c:2888 [inline]
console_flush_all+0x4af/0xd50 kernel/printk/printk.c:2960
console_unlock+0x10c/0x260 kernel/printk/printk.c:3029
vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2304
vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
_printk+0xc8/0x100 kernel/printk/printk.c:2329
pick_eevdf kernel/sched/fair.c:963 [inline]
pick_next_entity kernel/sched/fair.c:5247 [inline]
pick_next_task_fair kernel/sched/fair.c:8244 [inline]
__pick_next_task_fair+0xe2/0x540 kernel/sched/fair.c:8297
__pick_next_task kernel/sched/core.c:6003 [inline]
__pick_next_task kernel/sched/core.c:5972 [inline]
pick_next_task kernel/sched/core.c:6061 [inline]
__schedule+0x38d5/0x5a00 kernel/sched/core.c:6640
__schedule_loop kernel/sched/core.c:6753 [inline]
schedule+0xe7/0x270 kernel/sched/core.c:6768
futex_wait_queue+0xfe/0x1f0 kernel/futex/waitwake.c:369
__futex_wait+0x292/0x3c0 kernel/futex/waitwake.c:668
futex_wait+0xea/0x380 kernel/futex/waitwake.c:696
do_futex+0x227/0x350 kernel/futex/syscalls.c:102
__do_sys_futex kernel/futex/syscalls.c:179 [inline]
__se_sys_futex kernel/futex/syscalls.c:160 [inline]
__x64_sys_futex+0x1e1/0x4c0 kernel/futex/syscalls.c:160
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x63/0xcd

other info that might help us debug this:

Chain exists of:
console_owner --> &p->pi_lock --> &rq->__lock

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&rq->__lock);
lock(&p->pi_lock);
lock(&rq->__lock);
lock(console_owner);

*** DEADLOCK ***

3 locks held by syz-executor.0/5232:
#0: ffff8880b993c718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558
#1: ffffffff8cb989a0 (console_lock){+.+.}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:1924 [inline]
#1: ffffffff8cb989a0 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x162/0x5f0 kernel/printk/printk.c:2303
#2: ffffffff8cb98a10 (console_srcu){....}-{0:0}, at: rcu_lock_acquire include/linux/rcupdate.h:303 [inline]
#2: ffffffff8cb98a10 (console_srcu){....}-{0:0}, at: srcu_read_lock_nmisafe include/linux/srcu.h:232 [inline]
#2: ffffffff8cb98a10 (console_srcu){....}-{0:0}, at: console_srcu_read_lock kernel/printk/printk.c:286 [inline]
#2: ffffffff8cb98a10 (console_srcu){....}-{0:0}, at: console_flush_all+0x12a/0xd50 kernel/printk/printk.c:2952

stack backtrace:
CPU: 1 PID: 5232 Comm: syz-executor.0 Not tainted 6.6.0-rc4-next-20231005-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
check_noncircular+0x311/0x3f0 kernel/locking/lockdep.c:2187
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3868 [inline]
__lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5136
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718
console_lock_spinning_enable kernel/printk/printk.c:1855 [inline]
console_emit_next_record kernel/printk/printk.c:2888 [inline]
console_flush_all+0x4af/0xd50 kernel/printk/printk.c:2960
console_unlock+0x10c/0x260 kernel/printk/printk.c:3029
vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2304
vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
_printk+0xc8/0x100 kernel/printk/printk.c:2329
pick_eevdf kernel/sched/fair.c:963 [inline]
pick_next_entity kernel/sched/fair.c:5247 [inline]
pick_next_task_fair kernel/sched/fair.c:8244 [inline]
__pick_next_task_fair+0xe2/0x540 kernel/sched/fair.c:8297
__pick_next_task kernel/sched/core.c:6003 [inline]
__pick_next_task kernel/sched/core.c:5972 [inline]
pick_next_task kernel/sched/core.c:6061 [inline]
__schedule+0x38d5/0x5a00 kernel/sched/core.c:6640
__schedule_loop kernel/sched/core.c:6753 [inline]
schedule+0xe7/0x270 kernel/sched/core.c:6768
futex_wait_queue+0xfe/0x1f0 kernel/futex/waitwake.c:369
__futex_wait+0x292/0x3c0 kernel/futex/waitwake.c:668
futex_wait+0xea/0x380 kernel/futex/waitwake.c:696
do_futex+0x227/0x350 kernel/futex/syscalls.c:102
__do_sys_futex kernel/futex/syscalls.c:179 [inline]
__se_sys_futex kernel/futex/syscalls.c:160 [inline]
__x64_sys_futex+0x1e1/0x4c0 kernel/futex/syscalls.c:160
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f7af7e7cae9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f7af8af7178 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: ffffffffffffffda RBX: 00007f7af7f9bf88 RCX: 00007f7af7e7cae9
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f7af7f9bf88
RBP: 00007f7af7f9bf80 R08: 00007f7af8af76c0 R09: 00007f7af8af76c0
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f7af7f9bf8c
R13: 000000000000000b R14: 00007fff1e139c80 R15: 00007fff1e139d68
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the bug is already fixed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite bug's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the bug is a duplicate of another bug, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Aleksandr Nogikh

unread,
Oct 17, 2023, 6:06:25 AM10/17/23
to syzbot, Benjamin Segall, Peter Zijlstra, linux-...@vger.kernel.org, syzkall...@googlegroups.com, tg...@linutronix.de, Dmitry Vyukov
This is currently causing more than 50% daily crashes on syzbot:

https://syzkaller.appspot.com/upstream/graph/crashes

It began to appear after next-20231004. Judging by the diff between
next-20231003 and next-20231004, it looks like quite a few changes to
kernel/sched were merged in between. Could it be caused by any of
those?
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bug...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/000000000000e40a2906072e9567%40google.com.

syzbot

unread,
Oct 17, 2023, 9:07:52 AM10/17/23
to adilger...@dilger.ca, bse...@google.com, dvy...@google.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nog...@google.com, pet...@infradead.org, syzkall...@googlegroups.com, tg...@linutronix.de, ty...@mit.edu
syzbot has found a reproducer for the following issue on:

HEAD commit: 213f891525c2 Merge tag 'probes-fixes-v6.6-rc6' of git://gi..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=154b2f55680000
kernel config: https://syzkaller.appspot.com/x/.config?x=4471dbb58008c9f2
dashboard link: https://syzkaller.appspot.com/bug?extid=f78380e4eae53c64125c
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1711f94d680000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1267424d680000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/2865a8ce1485/disk-213f8915.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/9250aaf914b2/vmlinux-213f8915.xz
kernel image: https://storage.googleapis.com/syzbot-assets/a38e8bb73fa3/bzImage-213f8915.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/1f53082eac44/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f78380...@syzkaller.appspotmail.com

list_add corruption. next->prev should be prev (ffff8880b993d228), but was caff904900000000. (next=ffff8880783659f8).
======================================================
WARNING: possible circular locking dependency detected
6.6.0-rc6-syzkaller-00029-g213f891525c2 #0 Not tainted
------------------------------------------------------
syz-executor221/5041 is trying to acquire lock:
ffffffff8cab86c0 (console_owner){-...}-{0:0}, at: console_lock_spinning_enable kernel/printk/printk.c:1858 [inline]
ffffffff8cab86c0 (console_owner){-...}-{0:0}, at: console_emit_next_record kernel/printk/printk.c:2904 [inline]
ffffffff8cab86c0 (console_owner){-...}-{0:0}, at: console_flush_all+0x4ac/0xfb0 kernel/printk/printk.c:2966

but task is already holding lock:
ffff8880b993c718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #4 (&rq->__lock){-.-.}-{2:2}:
_raw_spin_lock_nested+0x31/0x40 kernel/locking/spinlock.c:378
raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558
raw_spin_rq_lock kernel/sched/sched.h:1372 [inline]
rq_lock kernel/sched/sched.h:1681 [inline]
task_fork_fair+0x70/0x240 kernel/sched/fair.c:12416
sched_cgroup_fork+0x3cf/0x510 kernel/sched/core.c:4816
copy_process+0x45ec/0x73f0 kernel/fork.c:2609
kernel_clone+0xfd/0x920 kernel/fork.c:2909
user_mode_thread+0xb4/0xf0 kernel/fork.c:2987
rest_init+0x27/0x2b0 init/main.c:691
arch_call_rest_init+0x13/0x30 init/main.c:823
start_kernel+0x39f/0x480 init/main.c:1068
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
secondary_startup_64_no_verify+0x166/0x16b

-> #3 (&p->pi_lock){-.-.}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:518 [inline]
try_to_wake_up+0xb0/0x15b0 kernel/sched/core.c:4230
__wake_up_common+0x140/0x5a0 kernel/sched/wait.c:107
__wake_up_common_lock+0xd6/0x140 kernel/sched/wait.c:138
tty_port_default_wakeup+0x2a/0x40 drivers/tty/tty_port.c:69
serial8250_tx_chars+0x542/0xf60 drivers/tty/serial/8250/8250_port.c:1843
serial8250_handle_irq+0x606/0xbe0 drivers/tty/serial/8250/8250_port.c:1950
serial8250_default_handle_irq+0x94/0x210 drivers/tty/serial/8250/8250_port.c:1970
serial8250_interrupt+0xfc/0x200 drivers/tty/serial/8250/8250_core.c:127
__handle_irq_event_percpu+0x22a/0x740 kernel/irq/handle.c:158
handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
handle_irq_event+0xab/0x1e0 kernel/irq/handle.c:210
handle_edge_irq+0x261/0xcf0 kernel/irq/chip.c:831
generic_handle_irq_desc include/linux/irqdesc.h:161 [inline]
handle_irq arch/x86/kernel/irq.c:238 [inline]
__common_interrupt+0xdb/0x240 arch/x86/kernel/irq.c:257
common_interrupt+0xa9/0xd0 arch/x86/kernel/irq.c:247
asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:636
native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
acpi_safe_halt+0x1b/0x20 drivers/acpi/processor_idle.c:112
acpi_idle_enter+0xc5/0x160 drivers/acpi/processor_idle.c:707
cpuidle_enter_state+0x82/0x500 drivers/cpuidle/cpuidle.c:267
cpuidle_enter+0x4e/0xa0 drivers/cpuidle/cpuidle.c:388
cpuidle_idle_call kernel/sched/idle.c:215 [inline]
do_idle+0x315/0x3f0 kernel/sched/idle.c:282
cpu_startup_entry+0x50/0x60 kernel/sched/idle.c:380
rest_init+0x16f/0x2b0 init/main.c:726
arch_call_rest_init+0x13/0x30 init/main.c:823
start_kernel+0x39f/0x480 init/main.c:1068
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
secondary_startup_64_no_verify+0x166/0x16b

-> #2 (&tty->write_wait){-.-.}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
__wake_up_common_lock+0xbb/0x140 kernel/sched/wait.c:137
tty_port_default_wakeup+0x2a/0x40 drivers/tty/tty_port.c:69
serial8250_tx_chars+0x542/0xf60 drivers/tty/serial/8250/8250_port.c:1843
serial8250_handle_irq+0x606/0xbe0 drivers/tty/serial/8250/8250_port.c:1950
serial8250_default_handle_irq+0x94/0x210 drivers/tty/serial/8250/8250_port.c:1970
serial8250_interrupt+0xfc/0x200 drivers/tty/serial/8250/8250_core.c:127
__handle_irq_event_percpu+0x22a/0x740 kernel/irq/handle.c:158
handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
handle_irq_event+0xab/0x1e0 kernel/irq/handle.c:210
handle_edge_irq+0x261/0xcf0 kernel/irq/chip.c:831
generic_handle_irq_desc include/linux/irqdesc.h:161 [inline]
handle_irq arch/x86/kernel/irq.c:238 [inline]
__common_interrupt+0xdb/0x240 arch/x86/kernel/irq.c:257
common_interrupt+0xa9/0xd0 arch/x86/kernel/irq.c:247
asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:636
native_safe_halt arch/x86/include/asm/irqflags.h:48 [inline]
arch_safe_halt arch/x86/include/asm/irqflags.h:86 [inline]
acpi_safe_halt+0x1b/0x20 drivers/acpi/processor_idle.c:112
acpi_idle_enter+0xc5/0x160 drivers/acpi/processor_idle.c:707
cpuidle_enter_state+0x82/0x500 drivers/cpuidle/cpuidle.c:267
cpuidle_enter+0x4e/0xa0 drivers/cpuidle/cpuidle.c:388
cpuidle_idle_call kernel/sched/idle.c:215 [inline]
do_idle+0x315/0x3f0 kernel/sched/idle.c:282
cpu_startup_entry+0x50/0x60 kernel/sched/idle.c:380
rest_init+0x16f/0x2b0 init/main.c:726
arch_call_rest_init+0x13/0x30 init/main.c:823
start_kernel+0x39f/0x480 init/main.c:1068
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
secondary_startup_64_no_verify+0x166/0x16b

-> #1 (&port_lock_key){-.-.}-{2:2}:
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x3a/0x50 kernel/locking/spinlock.c:162
serial8250_console_write+0x57e/0x1060 drivers/tty/serial/8250/8250_port.c:3411
console_emit_next_record kernel/printk/printk.c:2910 [inline]
console_flush_all+0x4eb/0xfb0 kernel/printk/printk.c:2966
console_unlock+0x10c/0x260 kernel/printk/printk.c:3035
vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2307
vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
_printk+0xc8/0x100 kernel/printk/printk.c:2332
register_console+0xa67/0x10d0 kernel/printk/printk.c:3524
univ8250_console_init+0x35/0x50 drivers/tty/serial/8250/8250_core.c:717
console_init+0xba/0x5c0 kernel/printk/printk.c:3667
start_kernel+0x25a/0x480 init/main.c:1004
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
secondary_startup_64_no_verify+0x166/0x16b

-> #0 (console_owner){-...}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3868 [inline]
__lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5136
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718
console_lock_spinning_enable kernel/printk/printk.c:1858 [inline]
console_emit_next_record kernel/printk/printk.c:2904 [inline]
console_flush_all+0x4c1/0xfb0 kernel/printk/printk.c:2966
console_unlock+0x10c/0x260 kernel/printk/printk.c:3035
vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2307
vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
_printk+0xc8/0x100 kernel/printk/printk.c:2332
__list_add_valid_or_report+0xa2/0x100 lib/list_debug.c:29
__list_add_valid include/linux/list.h:88 [inline]
__list_add include/linux/list.h:150 [inline]
list_add include/linux/list.h:169 [inline]
account_entity_enqueue kernel/sched/fair.c:3534 [inline]
enqueue_entity+0x97b/0x1490 kernel/sched/fair.c:5117
enqueue_task_fair+0x15b/0xbc0 kernel/sched/fair.c:6536
enqueue_task kernel/sched/core.c:2102 [inline]
activate_task kernel/sched/core.c:2132 [inline]
ttwu_do_activate+0x214/0xd90 kernel/sched/core.c:3787
ttwu_queue kernel/sched/core.c:4029 [inline]
try_to_wake_up+0x8e7/0x15b0 kernel/sched/core.c:4346
autoremove_wake_function+0x16/0x150 kernel/sched/wait.c:424
__wake_up_common+0x140/0x5a0 kernel/sched/wait.c:107
__wake_up_common_lock+0xd6/0x140 kernel/sched/wait.c:138
wake_up_klogd_work_func kernel/printk/printk.c:3840 [inline]
wake_up_klogd_work_func+0x90/0xa0 kernel/printk/printk.c:3829
irq_work_single+0x1b5/0x260 kernel/irq_work.c:221
irq_work_run_list kernel/irq_work.c:252 [inline]
irq_work_run_list+0x92/0xc0 kernel/irq_work.c:235
update_process_times+0x1d5/0x220 kernel/time/timer.c:2074
tick_sched_handle+0x8e/0x170 kernel/time/tick-sched.c:254
tick_sched_timer+0xe9/0x110 kernel/time/tick-sched.c:1492
__run_hrtimer kernel/time/hrtimer.c:1688 [inline]
__hrtimer_run_queues+0x647/0xc10 kernel/time/hrtimer.c:1752
hrtimer_interrupt+0x31b/0x800 kernel/time/hrtimer.c:1814
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1063 [inline]
__sysvec_apic_timer_interrupt+0x105/0x3f0 arch/x86/kernel/apic/apic.c:1080
sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1074
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
memmove+0x50/0x1b0 arch/x86/lib/memmove_64.S:70
ext4_ext_rm_leaf fs/ext4/extents.c:2736 [inline]
ext4_ext_remove_space+0x1dd1/0x4390 fs/ext4/extents.c:2958
ext4_punch_hole+0xe10/0x1040 fs/ext4/inode.c:4011
ext4_fallocate+0xe21/0x3c30 fs/ext4/extents.c:4706
vfs_fallocate+0x46c/0xe80 fs/open.c:324
ioctl_preallocate+0x1a4/0x220 fs/ioctl.c:291
file_ioctl fs/ioctl.c:334 [inline]
do_vfs_ioctl+0x158c/0x1920 fs/ioctl.c:850
__do_sys_ioctl fs/ioctl.c:869 [inline]
__se_sys_ioctl fs/ioctl.c:857 [inline]
__x64_sys_ioctl+0x112/0x210 fs/ioctl.c:857
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

other info that might help us debug this:

Chain exists of:
console_owner --> &p->pi_lock --> &rq->__lock

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&rq->__lock);
lock(&p->pi_lock);
lock(&rq->__lock);
lock(console_owner);

*** DEADLOCK ***

9 locks held by syz-executor221/5041:
#0: ffff88807e30e410 (sb_writers#4){.+.+}-{0:0}, at: ioctl_preallocate+0x1a4/0x220 fs/ioctl.c:291
#1: ffff888077854a00 (&sb->s_type->i_mutex_key#8){++++}-{3:3}, at: inode_lock include/linux/fs.h:802 [inline]
#1: ffff888077854a00 (&sb->s_type->i_mutex_key#8){++++}-{3:3}, at: ext4_punch_hole+0x1bd/0x1040 fs/ext4/inode.c:3916
#2: ffff888077854ba0 (mapping.invalidate_lock){++++}-{3:3}, at: filemap_invalidate_lock include/linux/fs.h:847 [inline]
#2: ffff888077854ba0 (mapping.invalidate_lock){++++}-{3:3}, at: ext4_punch_hole+0x370/0x1040 fs/ext4/inode.c:3963
#3: ffff888077854888 (&ei->i_data_sem){++++}-{3:3}, at: ext4_punch_hole+0xd94/0x1040 fs/ext4/inode.c:4004
#4: ffffffff8cb98958 (log_wait.lock){-...}-{2:2}, at: __wake_up_common_lock+0xbb/0x140 kernel/sched/wait.c:137
#5: ffff88807eea0a00 (&p->pi_lock){-.-.}-{2:2}, at: class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:518 [inline]
#5: ffff88807eea0a00 (&p->pi_lock){-.-.}-{2:2}, at: try_to_wake_up+0xb0/0x15b0 kernel/sched/core.c:4230
#6: ffff8880b993c718 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x29/0x130 kernel/sched/core.c:558
#7: ffffffff8cb98b00 (console_lock){+.+.}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:1927 [inline]
#7: ffffffff8cb98b00 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x162/0x5f0 kernel/printk/printk.c:2306
#8: ffffffff8cb98b70 (console_srcu){....}-{0:0}, at: rcu_lock_acquire include/linux/rcupdate.h:303 [inline]
#8: ffffffff8cb98b70 (console_srcu){....}-{0:0}, at: srcu_read_lock_nmisafe include/linux/srcu.h:232 [inline]
#8: ffffffff8cb98b70 (console_srcu){....}-{0:0}, at: console_srcu_read_lock kernel/printk/printk.c:292 [inline]
#8: ffffffff8cb98b70 (console_srcu){....}-{0:0}, at: console_flush_all+0x12d/0xfb0 kernel/printk/printk.c:2958

stack backtrace:
CPU: 0 PID: 5041 Comm: syz-executor221 Not tainted 6.6.0-rc6-syzkaller-00029-g213f891525c2 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
check_noncircular+0x311/0x3f0 kernel/locking/lockdep.c:2187
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3868 [inline]
__lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5136
lock_acquire kernel/locking/lockdep.c:5753 [inline]
lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718
console_lock_spinning_enable kernel/printk/printk.c:1858 [inline]
console_emit_next_record kernel/printk/printk.c:2904 [inline]
console_flush_all+0x4c1/0xfb0 kernel/printk/printk.c:2966
console_unlock+0x10c/0x260 kernel/printk/printk.c:3035
vprintk_emit+0x17f/0x5f0 kernel/printk/printk.c:2307
vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
_printk+0xc8/0x100 kernel/printk/printk.c:2332
__list_add_valid_or_report+0xa2/0x100 lib/list_debug.c:29
__list_add_valid include/linux/list.h:88 [inline]
__list_add include/linux/list.h:150 [inline]
list_add include/linux/list.h:169 [inline]
account_entity_enqueue kernel/sched/fair.c:3534 [inline]
enqueue_entity+0x97b/0x1490 kernel/sched/fair.c:5117
enqueue_task_fair+0x15b/0xbc0 kernel/sched/fair.c:6536
enqueue_task kernel/sched/core.c:2102 [inline]
activate_task kernel/sched/core.c:2132 [inline]
ttwu_do_activate+0x214/0xd90 kernel/sched/core.c:3787
ttwu_queue kernel/sched/core.c:4029 [inline]
try_to_wake_up+0x8e7/0x15b0 kernel/sched/core.c:4346
autoremove_wake_function+0x16/0x150 kernel/sched/wait.c:424
__wake_up_common+0x140/0x5a0 kernel/sched/wait.c:107
__wake_up_common_lock+0xd6/0x140 kernel/sched/wait.c:138
wake_up_klogd_work_func kernel/printk/printk.c:3840 [inline]
wake_up_klogd_work_func+0x90/0xa0 kernel/printk/printk.c:3829
irq_work_single+0x1b5/0x260 kernel/irq_work.c:221
irq_work_run_list kernel/irq_work.c:252 [inline]
irq_work_run_list+0x92/0xc0 kernel/irq_work.c:235
update_process_times+0x1d5/0x220 kernel/time/timer.c:2074
tick_sched_handle+0x8e/0x170 kernel/time/tick-sched.c:254
tick_sched_timer+0xe9/0x110 kernel/time/tick-sched.c:1492
__run_hrtimer kernel/time/hrtimer.c:1688 [inline]
__hrtimer_run_queues+0x647/0xc10 kernel/time/hrtimer.c:1752
hrtimer_interrupt+0x31b/0x800 kernel/time/hrtimer.c:1814
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1063 [inline]
__sysvec_apic_timer_interrupt+0x105/0x3f0 arch/x86/kernel/apic/apic.c:1080
sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1074
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
RIP: 0010:memmove+0x50/0x1b0 arch/x86/lib/memmove_64.S:71
Code: 0f 1f 44 00 00 48 81 fa a8 02 00 00 72 05 40 38 fe 74 47 48 83 ea 20 48 83 ea 20 4c 8b 1e 4c 8b 56 08 4c 8b 4e 10 4c 8b 46 18 <48> 8d 76 20 4c 89 1f 4c 89 57 08 4c 89 4f 10 4c 89 47 18 48 8d 7f
RSP: 0018:ffffc9000431f800 EFLAGS: 00000286
RAX: ffff88807723c078 RBX: ffff88807723c002 RCX: 0000000000000000
RDX: fffffffffec10288 RSI: ffff88807862bd44 RDI: ffff88807862bd38
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88807723c07c
R13: ffff88807e30c000 R14: dffffc0000000000 R15: dffffc0000000000
ext4_ext_rm_leaf fs/ext4/extents.c:2736 [inline]
ext4_ext_remove_space+0x1dd1/0x4390 fs/ext4/extents.c:2958
ext4_punch_hole+0xe10/0x1040 fs/ext4/inode.c:4011
ext4_fallocate+0xe21/0x3c30 fs/ext4/extents.c:4706
vfs_fallocate+0x46c/0xe80 fs/open.c:324
ioctl_preallocate+0x1a4/0x220 fs/ioctl.c:291
file_ioctl fs/ioctl.c:334 [inline]
do_vfs_ioctl+0x158c/0x1920 fs/ioctl.c:850
__do_sys_ioctl fs/ioctl.c:869 [inline]
__se_sys_ioctl fs/ioctl.c:857 [inline]
__x64_sys_ioctl+0x112/0x210 fs/ioctl.c:857
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7ff93a3c62d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 61 1a 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ff93a361218 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ff93a44d6d8 RCX: 00007ff93a3c62d9
RDX: 0000000020000080 RSI: 000000004030582b RDI: 0000000000000004
RBP: 00007ff93a44d6d0 R08: 00007ffd0d4db467 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ff93a41a578
R13: 0000000000000006 R14: 00007ffd0d4db380 R15: 6f6f6c2f7665642f
</TASK>
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:29!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 5041 Comm: syz-executor221 Not tainted 6.6.0-rc6-syzkaller-00029-g213f891525c2 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023
RIP: 0010:__list_add_valid_or_report+0xa2/0x100 lib/list_debug.c:29
Code: c7 c7 20 2d e9 8a e8 3d 31 3f fd 0f 0b 48 c7 c7 c0 2d e9 8a e8 2f 31 3f fd 0f 0b 48 89 d9 48 c7 c7 20 2e e9 8a e8 1e 31 3f fd <0f> 0b 48 89 f1 48 c7 c7 a0 2e e9 8a 48 89 de e8 0a 31 3f fd 0f 0b
RSP: 0018:ffffc900000079c0 EFLAGS: 00010086
RAX: 0000000000000075 RBX: ffff8880783659f8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff816b8952 RDI: 0000000000000005
RBP: ffff88807eea00b8 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000080010004 R11: 0000000000000001 R12: ffff88807eea0120
R13: ffff88807eea3c00 R14: ffff8880b993c820 R15: ffff8880b993d228
FS: 00007ff93a3616c0(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002002b000 CR3: 0000000020467000 CR4: 0000000000350ef0
Call Trace:
<IRQ>
__list_add_valid include/linux/list.h:88 [inline]
__list_add include/linux/list.h:150 [inline]
list_add include/linux/list.h:169 [inline]
account_entity_enqueue kernel/sched/fair.c:3534 [inline]
enqueue_entity+0x97b/0x1490 kernel/sched/fair.c:5117
enqueue_task_fair+0x15b/0xbc0 kernel/sched/fair.c:6536
enqueue_task kernel/sched/core.c:2102 [inline]
activate_task kernel/sched/core.c:2132 [inline]
ttwu_do_activate+0x214/0xd90 kernel/sched/core.c:3787
ttwu_queue kernel/sched/core.c:4029 [inline]
try_to_wake_up+0x8e7/0x15b0 kernel/sched/core.c:4346
autoremove_wake_function+0x16/0x150 kernel/sched/wait.c:424
__wake_up_common+0x140/0x5a0 kernel/sched/wait.c:107
__wake_up_common_lock+0xd6/0x140 kernel/sched/wait.c:138
wake_up_klogd_work_func kernel/printk/printk.c:3840 [inline]
wake_up_klogd_work_func+0x90/0xa0 kernel/printk/printk.c:3829
irq_work_single+0x1b5/0x260 kernel/irq_work.c:221
irq_work_run_list kernel/irq_work.c:252 [inline]
irq_work_run_list+0x92/0xc0 kernel/irq_work.c:235
update_process_times+0x1d5/0x220 kernel/time/timer.c:2074
tick_sched_handle+0x8e/0x170 kernel/time/tick-sched.c:254
tick_sched_timer+0xe9/0x110 kernel/time/tick-sched.c:1492
__run_hrtimer kernel/time/hrtimer.c:1688 [inline]
__hrtimer_run_queues+0x647/0xc10 kernel/time/hrtimer.c:1752
hrtimer_interrupt+0x31b/0x800 kernel/time/hrtimer.c:1814
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1063 [inline]
__sysvec_apic_timer_interrupt+0x105/0x3f0 arch/x86/kernel/apic/apic.c:1080
sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1074
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
RIP: 0010:memmove+0x50/0x1b0 arch/x86/lib/memmove_64.S:71
Code: 0f 1f 44 00 00 48 81 fa a8 02 00 00 72 05 40 38 fe 74 47 48 83 ea 20 48 83 ea 20 4c 8b 1e 4c 8b 56 08 4c 8b 4e 10 4c 8b 46 18 <48> 8d 76 20 4c 89 1f 4c 89 57 08 4c 89 4f 10 4c 89 47 18 48 8d 7f
RSP: 0018:ffffc9000431f800 EFLAGS: 00000286
RAX: ffff88807723c078 RBX: ffff88807723c002 RCX: 0000000000000000
RDX: fffffffffec10288 RSI: ffff88807862bd44 RDI: ffff88807862bd38
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88807723c07c
R13: ffff88807e30c000 R14: dffffc0000000000 R15: dffffc0000000000
ext4_ext_rm_leaf fs/ext4/extents.c:2736 [inline]
ext4_ext_remove_space+0x1dd1/0x4390 fs/ext4/extents.c:2958
ext4_punch_hole+0xe10/0x1040 fs/ext4/inode.c:4011
ext4_fallocate+0xe21/0x3c30 fs/ext4/extents.c:4706
vfs_fallocate+0x46c/0xe80 fs/open.c:324
ioctl_preallocate+0x1a4/0x220 fs/ioctl.c:291
file_ioctl fs/ioctl.c:334 [inline]
do_vfs_ioctl+0x158c/0x1920 fs/ioctl.c:850
__do_sys_ioctl fs/ioctl.c:869 [inline]
__se_sys_ioctl fs/ioctl.c:857 [inline]
__x64_sys_ioctl+0x112/0x210 fs/ioctl.c:857
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7ff93a3c62d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 61 1a 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ff93a361218 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ff93a44d6d8 RCX: 00007ff93a3c62d9
RDX: 0000000020000080 RSI: 000000004030582b RDI: 0000000000000004
RBP: 00007ff93a44d6d0 R08: 00007ffd0d4db467 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ff93a41a578
R13: 0000000000000006 R14: 00007ffd0d4db380 R15: 6f6f6c2f7665642f
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__list_add_valid_or_report+0xa2/0x100 lib/list_debug.c:29
Code: c7 c7 20 2d e9 8a e8 3d 31 3f fd 0f 0b 48 c7 c7 c0 2d e9 8a e8 2f 31 3f fd 0f 0b 48 89 d9 48 c7 c7 20 2e e9 8a e8 1e 31 3f fd <0f> 0b 48 89 f1 48 c7 c7 a0 2e e9 8a 48 89 de e8 0a 31 3f fd 0f 0b
RSP: 0018:ffffc900000079c0 EFLAGS: 00010086
RAX: 0000000000000075 RBX: ffff8880783659f8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff816b8952 RDI: 0000000000000005
RBP: ffff88807eea00b8 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000080010004 R11: 0000000000000001 R12: ffff88807eea0120
R13: ffff88807eea3c00 R14: ffff8880b993c820 R15: ffff8880b993d228
FS: 00007ff93a3616c0(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002002b000 CR3: 0000000020467000 CR4: 0000000000350ef0
----------------
Code disassembly (best guess):
0: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
5: 48 81 fa a8 02 00 00 cmp $0x2a8,%rdx
c: 72 05 jb 0x13
e: 40 38 fe cmp %dil,%sil
11: 74 47 je 0x5a
13: 48 83 ea 20 sub $0x20,%rdx
17: 48 83 ea 20 sub $0x20,%rdx
1b: 4c 8b 1e mov (%rsi),%r11
1e: 4c 8b 56 08 mov 0x8(%rsi),%r10
22: 4c 8b 4e 10 mov 0x10(%rsi),%r9
26: 4c 8b 46 18 mov 0x18(%rsi),%r8
* 2a: 48 8d 76 20 lea 0x20(%rsi),%rsi <-- trapping instruction
2e: 4c 89 1f mov %r11,(%rdi)
31: 4c 89 57 08 mov %r10,0x8(%rdi)
35: 4c 89 4f 10 mov %r9,0x10(%rdi)
39: 4c 89 47 18 mov %r8,0x18(%rdi)
3d: 48 rex.W
3e: 8d .byte 0x8d
3f: 7f .byte 0x7f


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

Peter Zijlstra

unread,
Oct 17, 2023, 10:13:31 AM10/17/23
to Aleksandr Nogikh, syzbot, Benjamin Segall, linux-...@vger.kernel.org, syzkall...@googlegroups.com, tg...@linutronix.de, Dmitry Vyukov
On Tue, Oct 17, 2023 at 12:06:12PM +0200, Aleksandr Nogikh wrote:
> This is currently causing more than 50% daily crashes on syzbot:
>
> https://syzkaller.appspot.com/upstream/graph/crashes
>
> It began to appear after next-20231004. Judging by the diff between
> next-20231003 and next-20231004, it looks like quite a few changes to
> kernel/sched were merged in between. Could it be caused by any of
> those?

> > EEVDF scheduling fail, picking leftmost

That was in next for a little bit, can you still reproduce with current
next?

Peter Zijlstra

unread,
Oct 17, 2023, 10:16:57 AM10/17/23
to syzbot, adilger...@dilger.ca, bse...@google.com, dvy...@google.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nog...@google.com, syzkall...@googlegroups.com, tg...@linutronix.de, ty...@mit.edu
On Tue, Oct 17, 2023 at 06:07:50AM -0700, syzbot wrote:
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 213f891525c2 Merge tag 'probes-fixes-v6.6-rc6' of git://gi..

> list_add corruption. next->prev should be prev (ffff8880b993d228), but was caff904900000000. (next=ffff8880783659f8).

Urgh, I've not seen that happen before. How reliable does this trigger?

Aleksandr Nogikh

unread,
Oct 17, 2023, 10:28:23 AM10/17/23
to Peter Zijlstra, syzbot, adilger...@dilger.ca, bse...@google.com, dvy...@google.com, linux...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, tg...@linutronix.de, ty...@mit.edu
Thank you for the information!

I've looked closer -- syzbot is currently fuzzing a somewhat old -next
version now (20231005), it could not upgrade after that because of a
-next boot error
(https://syzkaller.appspot.com/bug?extid=6867a9777f4b8dc4e256, already
has a patch). It can explain why we're still seeing these crashes.

Once the fix commit for "linux-next boot error: KASAN:
slab-out-of-bounds Write in vhci_setup" reaches next, it should all be
fine.

And it looks like we should just automatically stop -next fuzzing if
we cannot upgrade the kernel for more than several days in a row..

Tetsuo Handa

unread,
Mar 5, 2024, 9:05:22 AMMar 5
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 001fe047bd5d..237bcab72004 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -48,6 +48,7 @@
#include <linux/mutex_api.h>
#include <linux/plist.h>
#include <linux/poll.h>
+#include <linux/printk.h>
#include <linux/proc_fs.h>
#include <linux/profile.h>
#include <linux/psi.h>
@@ -1680,6 +1681,7 @@ static inline void
rq_lock_irqsave(struct rq *rq, struct rq_flags *rf)
__acquires(rq->lock)
{
+ printk_deferred_enter();
raw_spin_rq_lock_irqsave(rq, rf->flags);
rq_pin_lock(rq, rf);
}
@@ -1688,6 +1690,7 @@ static inline void
rq_lock_irq(struct rq *rq, struct rq_flags *rf)
__acquires(rq->lock)
{
+ printk_deferred_enter();
raw_spin_rq_lock_irq(rq);
rq_pin_lock(rq, rf);
}
@@ -1696,6 +1699,7 @@ static inline void
rq_lock(struct rq *rq, struct rq_flags *rf)
__acquires(rq->lock)
{
+ printk_deferred_enter();
raw_spin_rq_lock(rq);
rq_pin_lock(rq, rf);
}
@@ -1706,6 +1710,7 @@ rq_unlock_irqrestore(struct rq *rq, struct rq_flags *rf)
{
rq_unpin_lock(rq, rf);
raw_spin_rq_unlock_irqrestore(rq, rf->flags);
+ printk_deferred_exit();
}

static inline void
@@ -1714,6 +1719,7 @@ rq_unlock_irq(struct rq *rq, struct rq_flags *rf)
{
rq_unpin_lock(rq, rf);
raw_spin_rq_unlock_irq(rq);
+ printk_deferred_exit();
}

static inline void
@@ -1722,6 +1728,7 @@ rq_unlock(struct rq *rq, struct rq_flags *rf)
{
rq_unpin_lock(rq, rf);
raw_spin_rq_unlock(rq);
+ printk_deferred_exit();
}

DEFINE_LOCK_GUARD_1(rq_lock, struct rq,

syzbot

unread,
Mar 5, 2024, 10:21:05 AMMar 5
to linux-...@vger.kernel.org, penguin...@i-love.sakura.ne.jp, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in locks_remove_posix

general protection fault, probably for non-canonical address 0xdffffc0020008013: 0000 [#1] PREEMPT SMP KASAN
KASAN: probably user-memory-access in range [0x0000000100040098-0x000000010004009f]
CPU: 1 PID: 5487 Comm: dhcpcd-run-hook Not tainted 6.8.0-rc6-syzkaller-g8f50d5c42355-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
RIP: 0010:list_empty include/linux/list.h:373 [inline]
RIP: 0010:locks_remove_posix+0xff/0x440 fs/locks.c:2554
Code: 4d 8b be 20 02 00 00 4d 85 ff 0f 84 9f 01 00 00 e8 16 ea 7b ff 49 8d 47 50 48 ba 00 00 00 00 00 fc ff df 48 89 c1 48 c1 e9 03 <80> 3c 11 00 0f 85 09 03 00 00 49 8b 57 50 48 39 d0 0f 84 6e 01 00
RSP: 0018:ffffc9000388f978 EFLAGS: 00010202
RAX: 0000000100040098 RBX: ffff888019b36500 RCX: 0000000020008013
RDX: dffffc0000000000 RSI: ffffffff820b2e1a RDI: ffff888076090730
RBP: 1ffff92000711f30 R08: 0000000000000003 R09: 0000000000004000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffc9000388f9b0
R13: ffff88802a7e0b40 R14: ffff888076090510 R15: 0000000100040048
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000100040340 CR3: 00000000183e4000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
filp_flush+0x11b/0x1a0 fs/open.c:1518
filp_close+0x1c/0x30 fs/open.c:1527
close_files fs/file.c:432 [inline]
put_files_struct fs/file.c:447 [inline]
put_files_struct+0x1df/0x360 fs/file.c:444
exit_files+0x82/0xb0 fs/file.c:464
do_exit+0xa56/0x2ad0 kernel/exit.c:866
do_group_exit+0xd4/0x2a0 kernel/exit.c:1020
get_signal+0x23b5/0x2790 kernel/signal.c:2893
arch_do_signal_or_restart+0x90/0x7f0 arch/x86/kernel/signal.c:310
exit_to_user_mode_loop kernel/entry/common.c:105 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
irqentry_exit_to_user_mode+0x13c/0x280 kernel/entry/common.c:225
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570
RIP: 0033:0x7fbec7157428
Code: Unable to access opcode bytes at 0x7fbec71573fe.
RSP: 002b:00007fff8928a8a0 EFLAGS: 00010246
RAX: 0000000100040048 RBX: 0000555b854932c0 RCX: 00007fbec7205098
RDX: 0000000000000000 RSI: 0000555b85495d30 RDI: 0000555b85494930
RBP: 0000555b85494930 R08: 0000000000000000 R09: 0000000000000000
R10: 00007fbec7205088 R11: 0000000000000246 R12: 0000000000000000
R13: 0000555b85495d30 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:list_empty include/linux/list.h:373 [inline]
RIP: 0010:locks_remove_posix+0xff/0x440 fs/locks.c:2554
Code: 4d 8b be 20 02 00 00 4d 85 ff 0f 84 9f 01 00 00 e8 16 ea 7b ff 49 8d 47 50 48 ba 00 00 00 00 00 fc ff df 48 89 c1 48 c1 e9 03 <80> 3c 11 00 0f 85 09 03 00 00 49 8b 57 50 48 39 d0 0f 84 6e 01 00
RSP: 0018:ffffc9000388f978 EFLAGS: 00010202
RAX: 0000000100040098 RBX: ffff888019b36500 RCX: 0000000020008013
RDX: dffffc0000000000 RSI: ffffffff820b2e1a RDI: ffff888076090730
RBP: 1ffff92000711f30 R08: 0000000000000003 R09: 0000000000004000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffc9000388f9b0
R13: ffff88802a7e0b40 R14: ffff888076090510 R15: 0000000100040048
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000100040340 CR3: 00000000183e4000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
0: 4d 8b be 20 02 00 00 mov 0x220(%r14),%r15
7: 4d 85 ff test %r15,%r15
a: 0f 84 9f 01 00 00 je 0x1af
10: e8 16 ea 7b ff call 0xff7bea2b
15: 49 8d 47 50 lea 0x50(%r15),%rax
19: 48 ba 00 00 00 00 00 movabs $0xdffffc0000000000,%rdx
20: fc ff df
23: 48 89 c1 mov %rax,%rcx
26: 48 c1 e9 03 shr $0x3,%rcx
* 2a: 80 3c 11 00 cmpb $0x0,(%rcx,%rdx,1) <-- trapping instruction
2e: 0f 85 09 03 00 00 jne 0x33d
34: 49 8b 57 50 mov 0x50(%r15),%rdx
38: 48 39 d0 cmp %rdx,%rax
3b: 0f .byte 0xf
3c: 84 6e 01 test %ch,0x1(%rsi)


Tested on:

commit: 8f50d5c4 Merge branch 'Allow struct_ops maps with a la..
git tree: bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=15c2d486180000
kernel config: https://syzkaller.appspot.com/x/.config?x=7e085fc648b10d1e
dashboard link: https://syzkaller.appspot.com/bug?extid=f78380e4eae53c64125c
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=121c92fe180000

Tetsuo Handa

unread,
Mar 20, 2024, 2:50:18 AMMar 20
to syzbot, syzkall...@googlegroups.com, Alexei Starovoitov, Daniel Borkmann, John Fastabend, Petr Mladek, Steven Rostedt, John Ogness, Sergey Senozhatsky, tg...@linutronix.de, linux-...@vger.kernel.org, bpf
Hello, bpf developers.

syzbot is reporting circular locking dependency because bpf program is hitting
printk() path. To avoid this dependency, something like
https://syzkaller.appspot.com/text?tag=Patch&x=121c92fe180000 is needed.

Is it possible to call printk_deferred_{enter,exit}() from the bpf side (e.g.
bpf_prog_run()) so that we don't need to add overhead on the scheduler side
when bpf programs are not registered?

printk_deferred_{enter,exit}() requires that printk_deferred_exit() is called
on a CPU which printk_deferred_enter() was called, for percpu counter is used.
Is migrate_{disable,enable}() also needed around printk_deferred_{enter,exit}() ?

Alexei Starovoitov

unread,
Mar 20, 2024, 2:56:45 AMMar 20
to Tetsuo Handa, syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann, John Fastabend, Petr Mladek, Steven Rostedt, John Ogness, Sergey Senozhatsky, Thomas Gleixner, LKML, bpf
On Tue, Mar 19, 2024 at 11:50 PM Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
>
> Hello, bpf developers.
>
> syzbot is reporting circular locking dependency because bpf program is hitting
> printk() path. To avoid this dependency, something like
> https://syzkaller.appspot.com/text?tag=Patch&x=121c92fe180000 is needed.
>
> Is it possible to call printk_deferred_{enter,exit}() from the bpf side (e.g.
> bpf_prog_run()) so that we don't need to add overhead on the scheduler side
> when bpf programs are not registered?
>
> printk_deferred_{enter,exit}() requires that printk_deferred_exit() is called
> on a CPU which printk_deferred_enter() was called, for percpu counter is used.
> Is migrate_{disable,enable}() also needed around printk_deferred_{enter,exit}() ?
>
> On 2023/10/08 15:28, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 7d730f1bf6f3 Add linux-next specific files for 20231005
> > git tree: linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=172d36e9680000

This has nothing to do with bpf.
bpf never calls printk().

Tetsuo Handa

unread,
Mar 20, 2024, 3:05:10 AMMar 20
to Alexei Starovoitov, syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann, John Fastabend, Petr Mladek, Steven Rostedt, John Ogness, Sergey Senozhatsky, Thomas Gleixner, LKML, bpf
On 2024/03/20 15:56, Alexei Starovoitov wrote:
> This has nothing to do with bpf.
> bpf never calls printk().

Please see the Sample crash report in the dashboard.
bpf program is hitting printk() via report_bug().

-> #0 (console_owner){....}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869
__lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
console_lock_spinning_enable kernel/printk/printk.c:1873 [inline]
console_emit_next_record kernel/printk/printk.c:2901 [inline]
console_flush_all+0x810/0xfd0 kernel/printk/printk.c:2973
console_unlock+0x13b/0x4d0 kernel/printk/printk.c:3042
vprintk_emit+0x5a6/0x770 kernel/printk/printk.c:2342
_printk+0xd5/0x120 kernel/printk/printk.c:2367
__report_bug lib/bug.c:195 [inline]
report_bug+0x346/0x500 lib/bug.c:219
handle_bug+0x3e/0x70 arch/x86/kernel/traps.c:239
exc_invalid_op+0x1a/0x50 arch/x86/kernel/traps.c:260
asm_exc_invalid_op+0x1a/0x20 arch/x86/include/asm/idtentry.h:621
__local_bh_enable_ip+0x1be/0x200 kernel/softirq.c:362
spin_unlock_bh include/linux/spinlock.h:396 [inline]
__sock_map_delete net/core/sock_map.c:424 [inline]
sock_map_delete_elem+0xca/0x140 net/core/sock_map.c:446
bpf_prog_2c29ac5cdc6b1842+0x42/0x46
bpf_dispatcher_nop_func include/linux/bpf.h:1234 [inline]
__bpf_prog_run include/linux/filter.h:657 [inline]
bpf_prog_run include/linux/filter.h:664 [inline]
__bpf_trace_run kernel/trace/bpf_trace.c:2381 [inline]
bpf_trace_run4+0x25a/0x490 kernel/trace/bpf_trace.c:2422
trace_sched_switch include/trace/events/sched.h:222 [inline]
__schedule+0x250b/0x49d0 kernel/sched/core.c:6733
__schedule_loop kernel/sched/core.c:6813 [inline]
schedule+0x14b/0x320 kernel/sched/core.c:6828
ptrace_stop+0x5b4/0x940 kernel/signal.c:2358
ptrace_do_notify kernel/signal.c:2395 [inline]
ptrace_notify+0x255/0x380 kernel/signal.c:2407
ptrace_report_syscall include/linux/ptrace.h:415 [inline]
ptrace_report_syscall_exit include/linux/ptrace.h:477 [inline]
syscall_exit_work+0xbd/0x170 kernel/entry/common.c:167
syscall_exit_to_user_mode_prepare kernel/entry/common.c:194 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:199 [inline]
syscall_exit_to_user_mode+0x273/0x360 kernel/entry/common.c:212
do_syscall_64+0x10a/0x240 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x6d/0x75

Alexei Starovoitov

unread,
Mar 20, 2024, 3:12:33 AMMar 20
to Tetsuo Handa, syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann, John Fastabend, Petr Mladek, Steven Rostedt, John Ogness, Sergey Senozhatsky, Thomas Gleixner, LKML, bpf
On Wed, Mar 20, 2024 at 12:05 AM Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
>
> On 2024/03/20 15:56, Alexei Starovoitov wrote:
> > This has nothing to do with bpf.
> > bpf never calls printk().
>
> Please see the Sample crash report in the dashboard.
> bpf program is hitting printk() via report_bug().

Exactly. local_bh_neable is simply asking for a splat.
_this_ bug is in printk.
It's a generic issue.

sockmap bug is orthogonal.
It's already being looked at.

Tetsuo Handa

unread,
Mar 20, 2024, 3:30:37 AMMar 20
to Alexei Starovoitov, syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann, John Fastabend, Petr Mladek, Steven Rostedt, John Ogness, Sergey Senozhatsky, Thomas Gleixner, LKML, bpf
On 2024/03/20 16:12, Alexei Starovoitov wrote:
> On Wed, Mar 20, 2024 at 12:05 AM Tetsuo Handa
> <penguin...@i-love.sakura.ne.jp> wrote:
>>
>> On 2024/03/20 15:56, Alexei Starovoitov wrote:
>>> This has nothing to do with bpf.
>>> bpf never calls printk().
>>
>> Please see the Sample crash report in the dashboard.
>> bpf program is hitting printk() via report_bug().
>
> Exactly. local_bh_neable is simply asking for a splat.
> _this_ bug is in printk.
> It's a generic issue.

I can't catch. printk() is called due to report_bug().

If the reason report_bug() is called is that spin_unlock_bh() is bad,
this is a bug in sock_map_delete_elem() rather than a bug in printk(), isn't it.

>
> sockmap bug is orthogonal.
> It's already being looked at.

Then, can we expect that this bug is also fixed shortly?

Alexei Starovoitov

unread,
Mar 20, 2024, 4:48:09 AMMar 20
to Tetsuo Handa, John Fastabend, syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann, Petr Mladek, Steven Rostedt, John Ogness, Sergey Senozhatsky, Thomas Gleixner, LKML, bpf
On Wed, Mar 20, 2024 at 12:30 AM Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
>
> On 2024/03/20 16:12, Alexei Starovoitov wrote:
> > On Wed, Mar 20, 2024 at 12:05 AM Tetsuo Handa
> > <penguin...@i-love.sakura.ne.jp> wrote:
> >>
> >> On 2024/03/20 15:56, Alexei Starovoitov wrote:
> >>> This has nothing to do with bpf.
> >>> bpf never calls printk().
> >>
> >> Please see the Sample crash report in the dashboard.
> >> bpf program is hitting printk() via report_bug().
> >
> > Exactly. local_bh_neable is simply asking for a splat.
> > _this_ bug is in printk.
> > It's a generic issue.
>
> I can't catch. printk() is called due to report_bug().
>
> If the reason report_bug() is called is that spin_unlock_bh() is bad,
> this is a bug in sock_map_delete_elem() rather than a bug in printk(), isn't it.

There are two bugs.
The one you've started the thread about is in printk.

The sockmap bug that triggered report_bug() is separate.

>
> >
> > sockmap bug is orthogonal.
> > It's already being looked at.
>
> Then, can we expect that this bug is also fixed shortly?

cc-ing John.
I don't think it's hi-pri.

John Ogness

unread,
Mar 27, 2024, 7:10:42 AMMar 27
to Alexei Starovoitov, Tetsuo Handa, John Fastabend, syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann, Petr Mladek, Steven Rostedt, Sergey Senozhatsky, Thomas Gleixner, LKML, bpf
On 2024-03-20, Alexei Starovoitov <alexei.st...@gmail.com> wrote:
> On Wed, Mar 20, 2024 at 12:30 AM Tetsuo Handa
> <penguin...@i-love.sakura.ne.jp> wrote:
>>
>> On 2024/03/20 16:12, Alexei Starovoitov wrote:
>> > On Wed, Mar 20, 2024 at 12:05 AM Tetsuo Handa
>> > <penguin...@i-love.sakura.ne.jp> wrote:
>> >>
>> >> On 2024/03/20 15:56, Alexei Starovoitov wrote:
>> >>> This has nothing to do with bpf.
>> >>> bpf never calls printk().
>> >>
>> >> Please see the Sample crash report in the dashboard.
>> >> bpf program is hitting printk() via report_bug().
>> >
>> > Exactly. local_bh_neable is simply asking for a splat.
>> > _this_ bug is in printk.
>> > It's a generic issue.
>>
>> I can't catch. printk() is called due to report_bug().
>>
>> If the reason report_bug() is called is that spin_unlock_bh() is bad,
>> this is a bug in sock_map_delete_elem() rather than a bug in printk(), isn't it.
>
> There are two bugs.
> The one you've started the thread about is in printk.

The printk rework (which is not yet fully mainline) will correctly
handle this context.

As to the patch [0] you suggested, it would be more appropriate to
perform deferred_enter/_exit *within* the locked critical section. But
we really only want these whack-a-mole workarounds for cases that can
occur in a non-bug situation. IMHO this is not such a case and falls
into the category of "known problem, the rework will handle it".

John Ogness

[0] https://syzkaller.appspot.com/text?tag=Patch&x=121c92fe180000

Tetsuo Handa

unread,
Mar 27, 2024, 10:25:35 AMMar 27
to John Ogness, Alexei Starovoitov, John Fastabend, syzbot, syzkaller-bugs, Alexei Starovoitov, Daniel Borkmann, Petr Mladek, Steven Rostedt, Sergey Senozhatsky, Thomas Gleixner, LKML, bpf
On 2024/03/27 20:05, John Ogness wrote:
> The printk rework (which is not yet fully mainline) will correctly
> handle this context.
>
> As to the patch [0] you suggested, it would be more appropriate to
> perform deferred_enter/_exit *within* the locked critical section. But
> we really only want these whack-a-mole workarounds for cases that can
> occur in a non-bug situation. IMHO this is not such a case and falls
> into the category of "known problem, the rework will handle it".
>
> John Ogness
>
> [0] https://syzkaller.appspot.com/text?tag=Patch&x=121c92fe180000
>

Since the cause of current flood of lockdep reports is already explained
in https://syzkaller.appspot.com/bug?extid=c4f4d25859c2e5859988 , we don't
need [0] for now. But it is unfortunate that the message which explains
what went wrong cannot be reported due to reporting console_lock dependency.
Therefore, I intend [0] as a workaround for a bug situation. We can revert [0]
after the printk rework completed.

Reply all
Reply to author
Forward
0 new messages