[syzbot] [sound?] possible deadlock in __snd_pcm_lib_xfer (2)


syzbot

Aug 29, 2025, 2:38:38 PM
to linux-...@vger.kernel.org, linux...@vger.kernel.org, pe...@perex.cz, syzkall...@googlegroups.com, ti...@suse.com
Hello,

syzbot found the following issue on:

HEAD commit: 07d9df80082b Merge tag 'perf-tools-fixes-for-v6.17-2025-08..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17333fbc580000
kernel config: https://syzkaller.appspot.com/x/.config?x=e1e1566c7726877e
dashboard link: https://syzkaller.appspot.com/bug?extid=10b4363fb0f46527f3f3
compiler: Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=160e9262580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17ed0242580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/cdf0bbb7922b/disk-07d9df80.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/d1975bf771ed/vmlinux-07d9df80.xz
kernel image: https://storage.googleapis.com/syzbot-assets/942416e1bedd/bzImage-07d9df80.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+10b436...@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
syzkaller #0 Not tainted
------------------------------------------------------
syz.0.48/6160 is trying to acquire lock:
ffff8880b8923d90 ((softirq_ctrl.lock)){+.+.}-{3:3}, at: spin_lock include/linux/spinlock_rt.h:44 [inline]
ffff8880b8923d90 ((softirq_ctrl.lock)){+.+.}-{3:3}, at: __local_bh_disable_ip+0x264/0x400 kernel/softirq.c:168

but task is already holding lock:
ffff88802f930150 (&group->lock#2){+.+.}-{3:3}, at: __snd_pcm_lib_xfer+0x386/0x1ce0 sound/core/pcm_lib.c:2319

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&group->lock#2){+.+.}-{3:3}:
lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5868
rt_spin_lock+0x88/0x2c0 kernel/locking/spinlock_rt.c:56
spin_lock include/linux/spinlock_rt.h:44 [inline]
_snd_pcm_stream_lock_irqsave+0x7c/0xa0 sound/core/pcm_native.c:171
class_pcm_stream_lock_irqsave_constructor include/sound/pcm.h:682 [inline]
snd_pcm_period_elapsed+0x1e/0x80 sound/core/pcm_lib.c:1938
dummy_hrtimer_callback+0x80/0x180 sound/drivers/dummy.c:386
__run_hrtimer kernel/time/hrtimer.c:1761 [inline]
__hrtimer_run_queues+0x54f/0xd40 kernel/time/hrtimer.c:1825
hrtimer_run_softirq+0x1a3/0x2e0 kernel/time/hrtimer.c:1842
handle_softirqs+0x22c/0x710 kernel/softirq.c:579
__do_softirq kernel/softirq.c:613 [inline]
run_ktimerd+0xcf/0x190 kernel/softirq.c:1043
smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160
kthread+0x711/0x8a0 kernel/kthread.c:463
ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

-> #1 (&base->softirq_expiry_lock){+...}-{3:3}:
lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5868
rt_spin_lock+0x88/0x2c0 kernel/locking/spinlock_rt.c:56
spin_lock include/linux/spinlock_rt.h:44 [inline]
hrtimer_cpu_base_lock_expiry kernel/time/hrtimer.c:1383 [inline]
hrtimer_run_softirq+0x7c/0x2e0 kernel/time/hrtimer.c:1838
handle_softirqs+0x22c/0x710 kernel/softirq.c:579
__do_softirq kernel/softirq.c:613 [inline]
run_ktimerd+0xcf/0x190 kernel/softirq.c:1043
smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160
kthread+0x711/0x8a0 kernel/kthread.c:463
ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

-> #0 ((softirq_ctrl.lock)){+.+.}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3165 [inline]
check_prevs_add kernel/locking/lockdep.c:3284 [inline]
validate_chain+0xb9b/0x2140 kernel/locking/lockdep.c:3908
__lock_acquire+0xab9/0xd20 kernel/locking/lockdep.c:5237
reacquire_held_locks+0x127/0x1d0 kernel/locking/lockdep.c:5385
__lock_release kernel/locking/lockdep.c:5574 [inline]
lock_release+0x1b4/0x3e0 kernel/locking/lockdep.c:5889
__local_bh_enable_ip+0x10c/0x270 kernel/softirq.c:228
hrtimer_cancel+0x39/0x60 kernel/time/hrtimer.c:1491
dummy_hrtimer_stop+0xcf/0x100 sound/drivers/dummy.c:410
snd_pcm_do_stop+0x12a/0x1c0 sound/core/pcm_native.c:1525
snd_pcm_action_single sound/core/pcm_native.c:1305 [inline]
snd_pcm_action+0xe4/0x240 sound/core/pcm_native.c:1388
__snd_pcm_xrun+0x27f/0x7c0 sound/core/pcm_lib.c:180
snd_pcm_update_state+0x342/0x430 sound/core/pcm_lib.c:224
snd_pcm_update_hw_ptr0+0x10b2/0x1b00 sound/core/pcm_lib.c:493
snd_pcm_update_hw_ptr sound/core/pcm_lib.c:499 [inline]
__snd_pcm_lib_xfer+0x510/0x1ce0 sound/core/pcm_lib.c:2326
snd_pcm_oss_write3+0x1bc/0x320 sound/core/oss/pcm_oss.c:1241
snd_pcm_plug_write_transfer+0x2cb/0x4c0 sound/core/oss/pcm_plugin.c:630
snd_pcm_oss_write2 sound/core/oss/pcm_oss.c:1373 [inline]
snd_pcm_oss_write1 sound/core/oss/pcm_oss.c:1439 [inline]
snd_pcm_oss_write+0xba2/0x11a0 sound/core/oss/pcm_oss.c:2795
vfs_write+0x284/0xb40 fs/read_write.c:684
ksys_write+0x14b/0x260 fs/read_write.c:738
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Chain exists of:
(softirq_ctrl.lock) --> &base->softirq_expiry_lock --> &group->lock#2

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&group->lock#2);
                               lock(&base->softirq_expiry_lock);
                               lock(&group->lock#2);
  lock((softirq_ctrl.lock));

*** DEADLOCK ***

2 locks held by syz.0.48/6160:
#0: ffff88802f930150 (&group->lock#2){+.+.}-{3:3}, at: __snd_pcm_lib_xfer+0x386/0x1ce0 sound/core/pcm_lib.c:2319
#1: ffffffff8d9a8b80 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
#1: ffffffff8d9a8b80 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
#1: ffffffff8d9a8b80 (rcu_read_lock){....}-{1:3}, at: __rt_spin_lock kernel/locking/spinlock_rt.c:50 [inline]
#1: ffffffff8d9a8b80 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0x1bb/0x2c0 kernel/locking/spinlock_rt.c:57

stack backtrace:
CPU: 1 UID: 0 PID: 6160 Comm: syz.0.48 Not tainted syzkaller #0 PREEMPT_{RT,(full)}
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
Call Trace:
<TASK>
dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
print_circular_bug+0x2ee/0x310 kernel/locking/lockdep.c:2043
check_noncircular+0x134/0x160 kernel/locking/lockdep.c:2175
check_prev_add kernel/locking/lockdep.c:3165 [inline]
check_prevs_add kernel/locking/lockdep.c:3284 [inline]
validate_chain+0xb9b/0x2140 kernel/locking/lockdep.c:3908
__lock_acquire+0xab9/0xd20 kernel/locking/lockdep.c:5237
reacquire_held_locks+0x127/0x1d0 kernel/locking/lockdep.c:5385
__lock_release kernel/locking/lockdep.c:5574 [inline]
lock_release+0x1b4/0x3e0 kernel/locking/lockdep.c:5889
__local_bh_enable_ip+0x10c/0x270 kernel/softirq.c:228
hrtimer_cancel+0x39/0x60 kernel/time/hrtimer.c:1491
dummy_hrtimer_stop+0xcf/0x100 sound/drivers/dummy.c:410
snd_pcm_do_stop+0x12a/0x1c0 sound/core/pcm_native.c:1525
snd_pcm_action_single sound/core/pcm_native.c:1305 [inline]
snd_pcm_action+0xe4/0x240 sound/core/pcm_native.c:1388
__snd_pcm_xrun+0x27f/0x7c0 sound/core/pcm_lib.c:180
snd_pcm_update_state+0x342/0x430 sound/core/pcm_lib.c:224
snd_pcm_update_hw_ptr0+0x10b2/0x1b00 sound/core/pcm_lib.c:493
snd_pcm_update_hw_ptr sound/core/pcm_lib.c:499 [inline]
__snd_pcm_lib_xfer+0x510/0x1ce0 sound/core/pcm_lib.c:2326
snd_pcm_oss_write3+0x1bc/0x320 sound/core/oss/pcm_oss.c:1241
snd_pcm_plug_write_transfer+0x2cb/0x4c0 sound/core/oss/pcm_plugin.c:630
snd_pcm_oss_write2 sound/core/oss/pcm_oss.c:1373 [inline]
snd_pcm_oss_write1 sound/core/oss/pcm_oss.c:1439 [inline]
snd_pcm_oss_write+0xba2/0x11a0 sound/core/oss/pcm_oss.c:2795
vfs_write+0x284/0xb40 fs/read_write.c:684
ksys_write+0x14b/0x260 fs/read_write.c:738
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f0e7a70ebe9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe73a3cd48 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f0e7a935fa0 RCX: 00007f0e7a70ebe9
RDX: 000000000000fc36 RSI: 0000200000000500 RDI: 0000000000000003
RBP: 00007f0e7a791e19 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f0e7a935fa0 R14: 00007f0e7a935fa0 R15: 0000000000000003
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

syzbot

Aug 29, 2025, 8:06:06 PM
to big...@linutronix.de, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, linux...@vger.kernel.org, mi...@redhat.com, pe...@perex.cz, syzkall...@googlegroups.com, tg...@linutronix.de, ti...@suse.com, x...@kernel.org
syzbot has bisected this issue to:

commit d2d6422f8bd17c6bb205133e290625a564194496
Author: Sebastian Andrzej Siewior <big...@linutronix.de>
Date: Fri Sep 6 10:59:04 2024 +0000

x86: Allow to enable PREEMPT_RT.

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=12db5634580000
start commit: 07d9df80082b Merge tag 'perf-tools-fixes-for-v6.17-2025-08..
git tree: upstream
final oops: https://syzkaller.appspot.com/x/report.txt?x=11db5634580000
console output: https://syzkaller.appspot.com/x/log.txt?x=16db5634580000
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10307262580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17110242580000

Reported-by: syzbot+10b436...@syzkaller.appspotmail.com
Fixes: d2d6422f8bd1 ("x86: Allow to enable PREEMPT_RT.")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Hillf Danton

Aug 29, 2025, 8:46:05 PM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
> Date: Fri, 29 Aug 2025 11:38:35 -0700
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 07d9df80082b Merge tag 'perf-tools-fixes-for-v6.17-2025-08..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=17333fbc580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=e1e1566c7726877e
> dashboard link: https://syzkaller.appspot.com/bug?extid=10b4363fb0f46527f3f3
> compiler: Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=160e9262580000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17ed0242580000

#syz test

--- x/kernel/time/hrtimer.c
+++ y/kernel/time/hrtimer.c
@@ -1718,7 +1718,7 @@ EXPORT_SYMBOL_GPL(hrtimer_active);
 static void __run_hrtimer(struct hrtimer_cpu_base *cpu_base,
 			  struct hrtimer_clock_base *base,
 			  struct hrtimer *timer, ktime_t *now,
-			  unsigned long flags) __must_hold(&cpu_base->lock)
+			  unsigned long flags, bool soft) __must_hold(&cpu_base->lock)
 {
 	enum hrtimer_restart (*fn)(struct hrtimer *);
 	bool expires_in_hardirq;
@@ -1755,6 +1755,8 @@ static void __run_hrtimer(struct hrtimer
 	 * is dropped.
 	 */
 	raw_spin_unlock_irqrestore(&cpu_base->lock, flags);
+	if (soft)
+		hrtimer_cpu_base_unlock_expiry(cpu_base);
 	trace_hrtimer_expire_entry(timer, now);
 	expires_in_hardirq = lockdep_hrtimer_enter(timer);
 
@@ -1762,6 +1764,8 @@ static void __run_hrtimer(struct hrtimer
 
 	lockdep_hrtimer_exit(expires_in_hardirq);
 	trace_hrtimer_expire_exit(timer);
+	if (soft)
+		hrtimer_cpu_base_lock_expiry(cpu_base);
 	raw_spin_lock_irq(&cpu_base->lock);
 
 	/*
@@ -1822,7 +1826,8 @@ static void __hrtimer_run_queues(struct
 			if (basenow < hrtimer_get_softexpires_tv64(timer))
 				break;
 
-			__run_hrtimer(cpu_base, base, timer, &basenow, flags);
+			__run_hrtimer(cpu_base, base, timer, &basenow, flags,
+				      active_mask == HRTIMER_ACTIVE_SOFT);
 			if (active_mask == HRTIMER_ACTIVE_SOFT)
 				hrtimer_sync_wait_running(cpu_base, flags);
 		}
--

syzbot

Aug 29, 2025, 11:03:08 PM
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in __snd_pcm_lib_xfer

======================================================
WARNING: possible circular locking dependency detected
syzkaller #0 Not tainted
------------------------------------------------------
syz.0.46/6843 is trying to acquire lock:
ffff8880b8823d90 ((softirq_ctrl.lock)){+.+.}-{3:3}, at: spin_lock include/linux/spinlock_rt.h:44 [inline]
ffff8880b8823d90 ((softirq_ctrl.lock)){+.+.}-{3:3}, at: __local_bh_disable_ip+0x264/0x400 kernel/softirq.c:168

but task is already holding lock:
ffff88814ccb7150 (&group->lock#2){+.+.}-{3:3}, at: __snd_pcm_lib_xfer+0x386/0x1ce0 sound/core/pcm_lib.c:2319

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&group->lock#2){+.+.}-{3:3}:
lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5868
rt_spin_lock+0x88/0x2c0 kernel/locking/spinlock_rt.c:56
spin_lock include/linux/spinlock_rt.h:44 [inline]
_snd_pcm_stream_lock_irqsave+0x7c/0xa0 sound/core/pcm_native.c:171
class_pcm_stream_lock_irqsave_constructor include/sound/pcm.h:682 [inline]
snd_pcm_period_elapsed+0x1e/0x80 sound/core/pcm_lib.c:1938
dummy_hrtimer_callback+0x80/0x180 sound/drivers/dummy.c:386
__run_hrtimer kernel/time/hrtimer.c:1763 [inline]
__hrtimer_run_queues+0x590/0xda0 kernel/time/hrtimer.c:1829
hrtimer_run_softirq+0x1a3/0x2e0 kernel/time/hrtimer.c:1847
handle_softirqs+0x22c/0x710 kernel/softirq.c:579
__do_softirq kernel/softirq.c:613 [inline]
run_ktimerd+0xcf/0x190 kernel/softirq.c:1043
smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160
kthread+0x711/0x8a0 kernel/kthread.c:463
ret_from_fork+0x3f9/0x770 arch/x86/kernel/process.c:148
vfs_write+0x287/0xb40 fs/read_write.c:684
ksys_write+0x14b/0x260 fs/read_write.c:738
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&group->lock#2);
                               lock((softirq_ctrl.lock));
                               lock(&group->lock#2);
  lock((softirq_ctrl.lock));

*** DEADLOCK ***

2 locks held by syz.0.46/6843:
#0: ffff88814ccb7150 (&group->lock#2){+.+.}-{3:3}, at: __snd_pcm_lib_xfer+0x386/0x1ce0 sound/core/pcm_lib.c:2319
#1: ffffffff8d9a8b80 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
#1: ffffffff8d9a8b80 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
#1: ffffffff8d9a8b80 (rcu_read_lock){....}-{1:3}, at: __rt_spin_lock kernel/locking/spinlock_rt.c:50 [inline]
#1: ffffffff8d9a8b80 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0x1bb/0x2c0 kernel/locking/spinlock_rt.c:57

stack backtrace:
CPU: 0 UID: 0 PID: 6843 Comm: syz.0.46 Not tainted syzkaller #0 PREEMPT_{RT,(full)}
vfs_write+0x287/0xb40 fs/read_write.c:684
ksys_write+0x14b/0x260 fs/read_write.c:738
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f9528fcebe9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f952863e038 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f95291f5fa0 RCX: 00007f9528fcebe9
RDX: 000000000000fc36 RSI: 0000200000000500 RDI: 0000000000000003
RBP: 00007f9529051e19 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f95291f6038 R14: 00007f95291f5fa0 R15: 00007ffe192e0268
</TASK>


Tested on:

commit: 11e7861d Merge tag 'for-linus' of git://git.kernel.org..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15d8d634580000
kernel config: https://syzkaller.appspot.com/x/.config?x=bd9738e00c1bbfb4
dashboard link: https://syzkaller.appspot.com/bug?extid=10b4363fb0f46527f3f3
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=11b7fef0580000

Hillf Danton

Aug 30, 2025, 2:56:55 AM
to syzbot, Sebastian Andrzej Siewior, linux-...@vger.kernel.org, syzkall...@googlegroups.com
> Date: Fri, 29 Aug 2025 20:03:05 -0700
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> possible deadlock in __snd_pcm_lib_xfer
>
> ======================================================
> WARNING: possible circular locking dependency detected
> syzkaller #0 Not tainted
> ------------------------------------------------------
> syz.0.46/6843 is trying to acquire lock:
> ffff8880b8823d90 ((softirq_ctrl.lock)){+.+.}-{3:3}, at: spin_lock include/linux/spinlock_rt.h:44 [inline]
> ffff8880b8823d90 ((softirq_ctrl.lock)){+.+.}-{3:3}, at: __local_bh_disable_ip+0x264/0x400 kernel/softirq.c:168
>
Given softirq_ctrl is per-CPU, this report is a false positive.

Sebastian Andrzej Siewior

Sep 3, 2025, 10:59:09 AM
to Hillf Danton, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On 2025-08-30 14:56:37 [+0800], Hillf Danton wrote:
> > syz.0.46/6843 is trying to acquire lock:
> > ffff8880b8823d90 ((softirq_ctrl.lock)){+.+.}-{3:3}, at: spin_lock include/linux/spinlock_rt.h:44 [inline]
> > ffff8880b8823d90 ((softirq_ctrl.lock)){+.+.}-{3:3}, at: __local_bh_disable_ip+0x264/0x400 kernel/softirq.c:168
> >
> Given softirq_ctrl is per-CPU, this report is a false positive.

No. This can happen on a single CPU.

Sebastian

Hillf Danton

Sep 3, 2025, 9:05:44 PM
to Sebastian Andrzej Siewior, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
But the single CPU theory fails to explain the deadlock reported.

Sebastian Andrzej Siewior

Sep 4, 2025, 2:12:51 AM
to Hillf Danton, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On 2025-09-04 09:05:28 [+0800], Hillf Danton wrote:
> On Wed, 3 Sep 2025 16:59:05 +0200 Sebastian Andrzej Siewior wrote:
> > On 2025-08-30 14:56:37 [+0800], Hillf Danton wrote:
> > > > syz.0.46/6843 is trying to acquire lock:
> > > > ffff8880b8823d90 ((softirq_ctrl.lock)){+.+.}-{3:3}, at: spin_lock include/linux/spinlock_rt.h:44 [inline]
> > > > ffff8880b8823d90 ((softirq_ctrl.lock)){+.+.}-{3:3}, at: __local_bh_disable_ip+0x264/0x400 kernel/softirq.c:168
> > > >
> > > Given softirq_ctrl is per-CPU, this report is a false positive.
> >
> > No. This can happen on a single CPU.
> >
> But the single CPU theory fails to explain the deadlock reported.
>
> > > > Possible unsafe locking scenario:
> > > >
> > > >        CPU0                    CPU1
> > > >        ----                    ----
               Thread0                 Thread1
               -------                 -------
> > > >   lock(&group->lock#2);
               preempt to ->
> > > >                                lock((softirq_ctrl.lock));
> > > >                                lock(&group->lock#2);
                                       <- preempt to
> > > >   lock((softirq_ctrl.lock));
> > > >
> > > >  *** DEADLOCK ***
               now nobody makes progress

Sebastian
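
As a rough illustration of the cycle that lockdep reports, the dependency chain from the first report can be replayed in a toy lock-order graph. This is a userspace model and not kernel code: the enum values, `struct lock_graph`, `record_dependency()` and `replay_report()` are hypothetical names made up for this sketch, and real lockdep tracks far more state (contexts, held-lock stacks, lock subclasses).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy lock classes. The names mirror the report; the IDs are made up. */
enum lock_class { GROUP_LOCK, SOFTIRQ_CTRL_LOCK, EXPIRY_LOCK, NR_CLASSES };

/* dep[a][b] is true once "b was acquired while a was held" has been seen. */
struct lock_graph {
	bool dep[NR_CLASSES][NR_CLASSES];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record the ordering "held, then next". Returns false without recording
 * anything if 'next' already leads back to 'held', i.e. if the new edge
 * would close a cycle.
 */
static bool record_dependency(struct lock_graph *g, int held, int next)
{
	bool seen[NR_CLASSES] = { false };

	assert(held >= 0 && held < NR_CLASSES);
	assert(next >= 0 && next < NR_CLASSES);
	if (reachable(g, next, held, seen))
		return false;
	g->dep[held][next] = true;
	return true;
}

/* Replays the chain from the first report; true if the cycle is caught. */
static bool replay_report(void)
{
	struct lock_graph g;

	graph_init(&g);
	/* ktimerd: softirq_ctrl.lock -> softirq_expiry_lock -> group->lock */
	bool ok1 = record_dependency(&g, SOFTIRQ_CTRL_LOCK, EXPIRY_LOCK);
	bool ok2 = record_dependency(&g, EXPIRY_LOCK, GROUP_LOCK);
	/* write(): group->lock is held, hrtimer_cancel() wants softirq_ctrl.lock */
	bool ok3 = record_dependency(&g, GROUP_LOCK, SOFTIRQ_CTRL_LOCK);

	return ok1 && ok2 && !ok3;
}
```

Calling `replay_report()` from a `main()` returns true: the first two orderings are accepted, and the third one is rejected because it would close the loop. No second CPU appears anywhere in the model, which fits the point made above that the ordering problem exists between two threads on a single CPU.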

Sebastian Andrzej Siewior

Sep 4, 2025, 6:21:02 AM
to syzbot, tg...@linutronix.de, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, linux...@vger.kernel.org, mi...@redhat.com, pe...@perex.cz, syzkall...@googlegroups.com, ti...@suse.com, x...@kernel.org
This is unfortunate. Sound did nothing wrong; it is rather the special
softirq handling in this case. We don't see this often because
it requires that a timer is cancelled at the time it is running.
The assumption made by sound is that spin_lock_irq() also disables
softirqs. This is not the case on PREEMPT_RT.

The hunk below avoids the splat. Adding local_bh_disable() to
spin_lock_irq() would cure it, too. It would also result in random
synchronisation points across the kernel leading to something less
usable.
The best solution, imho, would be to get rid of softirq_ctrl.lock, which
has been proposed in
https://lore.kernel.org/all/20250901163811....@linutronix.de/

Comments?

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -84,19 +84,24 @@ void snd_pcm_group_init(struct snd_pcm_group *group)
 }
 
 /* define group lock helpers */
-#define DEFINE_PCM_GROUP_LOCK(action, mutex_action) \
+#define DEFINE_PCM_GROUP_LOCK(action, bh_lock, bh_unlock, mutex_action) \
 static void snd_pcm_group_ ## action(struct snd_pcm_group *group, bool nonatomic) \
 { \
-	if (nonatomic) \
+	if (nonatomic) { \
 		mutex_ ## mutex_action(&group->mutex); \
-	else \
-		spin_ ## action(&group->lock); \
+	} else { \
+		if (IS_ENABLED(CONFIG_PREEMPT_RT) && bh_lock) \
+			local_bh_disable(); \
+		spin_ ## action(&group->lock); \
+		if (IS_ENABLED(CONFIG_PREEMPT_RT) && bh_unlock) \
+			local_bh_enable(); \
+	} \
 }
 
-DEFINE_PCM_GROUP_LOCK(lock, lock);
-DEFINE_PCM_GROUP_LOCK(unlock, unlock);
-DEFINE_PCM_GROUP_LOCK(lock_irq, lock);
-DEFINE_PCM_GROUP_LOCK(unlock_irq, unlock);
+DEFINE_PCM_GROUP_LOCK(lock, 0, 0, lock);
+DEFINE_PCM_GROUP_LOCK(unlock, 0, 0, unlock);
+DEFINE_PCM_GROUP_LOCK(lock_irq, 1, 0, lock);
+DEFINE_PCM_GROUP_LOCK(unlock_irq, 0, 1, unlock);
 
 /**
  * snd_pcm_stream_lock - Lock the PCM stream


Sebastian

Takashi Iwai

Sep 4, 2025, 6:38:11 AM
to Sebastian Andrzej Siewior, syzbot, tg...@linutronix.de, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, linux...@vger.kernel.org, mi...@redhat.com, pe...@perex.cz, syzkall...@googlegroups.com, ti...@suse.com, x...@kernel.org
Thank you for the detailed analysis! It enlightened me.

It'd be appreciated if this gets fixed in the softirq core side.
If nothing else flies, we can take your workaround, sure...


Takashi

Hillf Danton

Sep 4, 2025, 11:04:45 PM
to Sebastian Andrzej Siewior, syzbot, Takashi Iwai, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Because of the absence of the deadlock detecting mechanism used by timers
(see lock_map_acquire() in call_timer_fn() for details), hrtimer is unable
to detect a deadlock like the reported one.

	CPU0			CPU1
	----			----
	lock A			hrtimer B callback
	cancel hrtimer B	lock A

On the other hand, softirq_ctrl plays such a detecting role by accident, but
they are two entirely different things.

Sebastian Andrzej Siewior

Sep 15, 2025, 11:28:57 AM
to Takashi Iwai, syzbot, tg...@linutronix.de, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, linux...@vger.kernel.org, mi...@redhat.com, pe...@perex.cz, syzkall...@googlegroups.com, ti...@suse.com, x...@kernel.org
snd_pcm_group_lock_irq() acquires a spinlock_t and disables interrupts
via spin_lock_irq(). This also implicitly disables the handling of
softirqs such as TIMER_SOFTIRQ.
On PREEMPT_RT softirqs are preemptible and spin_lock_irq() does not
disable them. That means a timer can be invoked during spin_lock_irq()
on the same CPU. For synchronisation reasons, local_bh_disable() has
a per-CPU lock named softirq_ctrl.lock which synchronizes individual
softirqs against each other.
syzbot managed to trigger a lockdep report where softirq_ctrl.lock is
acquired in hrtimer_cancel() in addition to hrtimer_run_softirq(). This
is a possible deadlock.

The softirq_ctrl.lock can not be made part of spin_lock_irq() as this
would lead to too much synchronisation against individual threads on the
system. To avoid the possible deadlock, softirqs must be manually
disabled before the lock is acquired.

Disable softirqs before the lock is acquired on PREEMPT_RT.

Reported-by: syzbot+10b436...@syzkaller.appspotmail.com
Fixes: d2d6422f8bd1 ("x86: Allow to enable PREEMPT_RT.")
Signed-off-by: Sebastian Andrzej Siewior <big...@linutronix.de>
---

I don't see a way around this given the report. I don't see how to
address this within the softirq. Taking this lock as part of every
spin_lock_irq() would be a mess and while testing I didn't even manage
to boot the machine. So I probably missed a detail (but then I would
only know how mad it really is).

This can be an intermediate solution until
https://lore.kernel.org/all/20250901163811....@linutronix.de/

gets merged and the !PREEMPT_RT_NEEDS_BH_LOCK case becomes the default
(i.e. not a config option anymore).

sound/core/pcm_native.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 1eab940fa2e5a..68bee40c9adaf 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -84,19 +84,24 @@ void snd_pcm_group_init(struct snd_pcm_group *group)
 }
 
 /* define group lock helpers */
-#define DEFINE_PCM_GROUP_LOCK(action, mutex_action) \
+#define DEFINE_PCM_GROUP_LOCK(action, bh_lock, bh_unlock, mutex_action) \
 static void snd_pcm_group_ ## action(struct snd_pcm_group *group, bool nonatomic) \
 { \
-	if (nonatomic) \
+	if (nonatomic) { \
 		mutex_ ## mutex_action(&group->mutex); \
-	else \
-		spin_ ## action(&group->lock); \
+	} else { \
+		if (IS_ENABLED(CONFIG_PREEMPT_RT) && bh_lock) \
+			local_bh_disable(); \
+		spin_ ## action(&group->lock); \
+		if (IS_ENABLED(CONFIG_PREEMPT_RT) && bh_unlock) \
+			local_bh_enable(); \
+	} \
 }
 
-DEFINE_PCM_GROUP_LOCK(lock, lock);
-DEFINE_PCM_GROUP_LOCK(unlock, unlock);
-DEFINE_PCM_GROUP_LOCK(lock_irq, lock);
-DEFINE_PCM_GROUP_LOCK(unlock_irq, unlock);
+DEFINE_PCM_GROUP_LOCK(lock, false, false, lock);
+DEFINE_PCM_GROUP_LOCK(unlock, false, false, unlock);
+DEFINE_PCM_GROUP_LOCK(lock_irq, true, false, lock);
+DEFINE_PCM_GROUP_LOCK(unlock_irq, false, true, unlock);
 
 /**
  * snd_pcm_stream_lock - Lock the PCM stream
--
2.51.0
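
To make the call ordering produced by the reworked macro easier to see, here is a minimal userspace sketch of the same pattern. It is an illustration under assumptions, not the kernel implementation: `struct group`, the logging stubs that share names with the kernel primitives, the runtime flag `preempt_rt` (standing in for the compile-time `IS_ENABLED(CONFIG_PREEMPT_RT)`), and `run_critical_section()` are all hypothetical and only record which call happened when.

```c
#include <assert.h>
#include <stdbool.h>

/* Events logged by the stubs so the call order can be inspected. */
enum event {
	EV_BH_DISABLE, EV_BH_ENABLE,
	EV_SPIN_LOCK_IRQ, EV_SPIN_UNLOCK_IRQ,
	EV_MUTEX_LOCK, EV_MUTEX_UNLOCK,
	EV_CRITICAL
};

#define MAX_EVENTS 16
static enum event events[MAX_EVENTS];
static int nr_events;

static void log_event(enum event ev)
{
	assert(nr_events < MAX_EVENTS);
	events[nr_events++] = ev;
}

/* Hypothetical stand-ins: the real primitives live in the kernel. */
struct group { int lock; int mutex; };
static bool preempt_rt;	/* models IS_ENABLED(CONFIG_PREEMPT_RT) at run time */

static void local_bh_disable(void) { log_event(EV_BH_DISABLE); }
static void local_bh_enable(void) { log_event(EV_BH_ENABLE); }
static void spin_lock_irq(int *lock) { (void)lock; log_event(EV_SPIN_LOCK_IRQ); }
static void spin_unlock_irq(int *lock) { (void)lock; log_event(EV_SPIN_UNLOCK_IRQ); }
static void mutex_lock(int *mutex) { (void)mutex; log_event(EV_MUTEX_LOCK); }
static void mutex_unlock(int *mutex) { (void)mutex; log_event(EV_MUTEX_UNLOCK); }

/* Same shape as the patched DEFINE_PCM_GROUP_LOCK(), minus the kernel types. */
#define DEFINE_GROUP_LOCK(action, bh_lock, bh_unlock, mutex_action) \
static void group_ ## action(struct group *g, bool nonatomic) \
{ \
	if (nonatomic) { \
		mutex_ ## mutex_action(&g->mutex); \
	} else { \
		if (preempt_rt && (bh_lock)) \
			local_bh_disable(); \
		spin_ ## action(&g->lock); \
		if (preempt_rt && (bh_unlock)) \
			local_bh_enable(); \
	} \
}

DEFINE_GROUP_LOCK(lock_irq, true, false, lock)
DEFINE_GROUP_LOCK(unlock_irq, false, true, unlock)

/* Runs one lock/unlock pair and returns the number of logged events. */
static int run_critical_section(bool rt, bool nonatomic)
{
	struct group g = { 0, 0 };

	nr_events = 0;
	preempt_rt = rt;
	group_lock_irq(&g, nonatomic);
	log_event(EV_CRITICAL);
	group_unlock_irq(&g, nonatomic);
	return nr_events;
}
```

With `rt` set and `nonatomic` clear, the log shows softirqs being disabled before the spinlock is taken and re-enabled only after it is released, which is the ordering the commit message asks for. With `rt` clear, or on the mutex path, the bottom-half calls drop out and the sequence matches the old macro.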

Takashi Iwai

Sep 16, 2025, 5:43:58 AM
to Sebastian Andrzej Siewior, Takashi Iwai, syzbot, tg...@linutronix.de, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, linux...@vger.kernel.org, mi...@redhat.com, pe...@perex.cz, syzkall...@googlegroups.com, ti...@suse.com, x...@kernel.org
I have now applied it to the for-next branch for 6.18.
It's already at a late stage for 6.17 release, and the issue doesn't
seem like an urgent regression to be fixed.


Thanks!

Takashi

Sebastian Andrzej Siewior

Sep 16, 2025, 8:23:34 AM
to Takashi Iwai, syzbot, tg...@linutronix.de, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, linux...@vger.kernel.org, mi...@redhat.com, pe...@perex.cz, syzkall...@googlegroups.com, ti...@suse.com, x...@kernel.org
On 2025-09-16 11:43:53 [+0200], Takashi Iwai wrote:
> I have now applied it to the for-next branch for 6.18.
> It's already at a late stage for 6.17 release, and the issue doesn't
> seem like an urgent regression to be fixed.

Sure. Thank you.

> Thanks!
>
> Takashi

Sebastian