possible deadlock in _snd_pcm_stream_lock_irqsave

10 views
Skip to first unread message

syzbot

unread,
Sep 9, 2020, 4:49:23 AM9/9/20
to alsa-...@alsa-project.org, ar...@arndb.de, baoli...@linaro.org, linux-...@vger.kernel.org, pe...@perex.cz, syzkall...@googlegroups.com, ti...@suse.com
Hello,

syzbot found the following issue on:

HEAD commit: dff9f829 Add linux-next specific files for 20200908
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1425773e900000
kernel config: https://syzkaller.appspot.com/x/.config?x=37b3426c77bda44c
dashboard link: https://syzkaller.appspot.com/bug?extid=561a74f84100162990b2
compiler: gcc (GCC) 10.1.0-syz 20200507
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=126577ae900000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=121e4e21900000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+561a74...@syzkaller.appspotmail.com

========================================================
WARNING: possible irq lock inversion dependency detected
5.9.0-rc4-next-20200908-syzkaller #0 Not tainted
--------------------------------------------------------
swapper/0/0 just changed the state of lock:
ffff8882153bc908 (&group->lock){..-.}-{2:2}, at: _snd_pcm_stream_lock_irqsave+0x9f/0xd0 sound/core/pcm_native.c:170
but this lock took another, SOFTIRQ-READ-unsafe lock in the past:
(&card->ctl_files_rwlock){.+.+}-{2:2}


and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
Possible interrupt unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&card->ctl_files_rwlock);
local_irq_disable();
lock(&group->lock);
lock(&card->ctl_files_rwlock);
<Interrupt>
lock(&group->lock);

*** DEADLOCK ***

1 lock held by swapper/0/0:
#0: ffffc90000007d68 ((&dpcm->timer)){+.-.}-{0:0}, at: lockdep_copy_map include/linux/lockdep.h:35 [inline]
#0: ffffc90000007d68 ((&dpcm->timer)){+.-.}-{0:0}, at: call_timer_fn+0xdb/0x760 kernel/time/timer.c:1403

the shortest dependencies between 2nd lock and 1st lock:
-> (&card->ctl_files_rwlock){.+.+}-{2:2} {
HARDIRQ-ON-R at:
lock_acquire+0x1f3/0xaf0 kernel/locking/lockdep.c:5398
__raw_read_lock include/linux/rwlock_api_smp.h:149 [inline]
_raw_read_lock+0x5b/0x70 kernel/locking/spinlock.c:223
snd_ctl_notify.part.0+0x36/0x550 sound/core/control.c:153
snd_ctl_notify+0x8f/0xb0 sound/core/control.c:181
__snd_ctl_add_replace+0x638/0x800 sound/core/control.c:382
snd_ctl_add_replace+0x76/0x130 sound/core/control.c:399
snd_card_dummy_new_mixer sound/drivers/dummy.c:885 [inline]
snd_dummy_probe+0xbbf/0x1050 sound/drivers/dummy.c:1080
platform_drv_probe+0x87/0x140 drivers/base/platform.c:747
really_probe+0x282/0x9f0 drivers/base/dd.c:553
driver_probe_device+0xfe/0x1d0 drivers/base/dd.c:738
__device_attach_driver+0x1c2/0x220 drivers/base/dd.c:844
bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:431
__device_attach+0x228/0x470 drivers/base/dd.c:912
bus_probe_device+0x1e4/0x290 drivers/base/bus.c:491
device_add+0xb17/0x1c40 drivers/base/core.c:2926
platform_device_add+0x34f/0x6d0 drivers/base/platform.c:597
platform_device_register_full+0x38c/0x4e0 drivers/base/platform.c:720
platform_device_register_resndata include/linux/platform_device.h:131 [inline]
platform_device_register_simple include/linux/platform_device.h:160 [inline]
alsa_card_dummy_init+0x1cc/0x2e0 sound/drivers/dummy.c:1168
do_one_initcall+0x10a/0x7b0 init/main.c:1204
do_initcall_level init/main.c:1277 [inline]
do_initcalls init/main.c:1293 [inline]
do_basic_setup init/main.c:1313 [inline]
kernel_init_freeable+0x5e9/0x66d init/main.c:1512
kernel_init+0xd/0x1c0 init/main.c:1402
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
SOFTIRQ-ON-R at:
lock_acquire+0x1f3/0xaf0 kernel/locking/lockdep.c:5398
__raw_read_lock include/linux/rwlock_api_smp.h:149 [inline]
_raw_read_lock+0x5b/0x70 kernel/locking/spinlock.c:223
snd_ctl_notify.part.0+0x36/0x550 sound/core/control.c:153
snd_ctl_notify+0x8f/0xb0 sound/core/control.c:181
__snd_ctl_add_replace+0x638/0x800 sound/core/control.c:382
snd_ctl_add_replace+0x76/0x130 sound/core/control.c:399
snd_card_dummy_new_mixer sound/drivers/dummy.c:885 [inline]
snd_dummy_probe+0xbbf/0x1050 sound/drivers/dummy.c:1080
platform_drv_probe+0x87/0x140 drivers/base/platform.c:747
really_probe+0x282/0x9f0 drivers/base/dd.c:553
driver_probe_device+0xfe/0x1d0 drivers/base/dd.c:738
__device_attach_driver+0x1c2/0x220 drivers/base/dd.c:844
bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:431
__device_attach+0x228/0x470 drivers/base/dd.c:912
bus_probe_device+0x1e4/0x290 drivers/base/bus.c:491
device_add+0xb17/0x1c40 drivers/base/core.c:2926
platform_device_add+0x34f/0x6d0 drivers/base/platform.c:597
platform_device_register_full+0x38c/0x4e0 drivers/base/platform.c:720
platform_device_register_resndata include/linux/platform_device.h:131 [inline]
platform_device_register_simple include/linux/platform_device.h:160 [inline]
alsa_card_dummy_init+0x1cc/0x2e0 sound/drivers/dummy.c:1168
do_one_initcall+0x10a/0x7b0 init/main.c:1204
do_initcall_level init/main.c:1277 [inline]
do_initcalls init/main.c:1293 [inline]
do_basic_setup init/main.c:1313 [inline]
kernel_init_freeable+0x5e9/0x66d init/main.c:1512
kernel_init+0xd/0x1c0 init/main.c:1402
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
(null) at:
general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.0-rc4-next-20200908-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:print_lock_trace kernel/locking/lockdep.c:1751 [inline]
RIP: 0010:print_lock_class_header kernel/locking/lockdep.c:2240 [inline]
RIP: 0010:print_shortest_lock_dependencies.cold+0x110/0x2af kernel/locking/lockdep.c:2263
Code: 48 8b 04 24 48 c1 e8 03 42 80 3c 20 00 74 09 48 8b 3c 24 e8 dd fb de f9 48 8b 04 24 48 8b 00 48 8d 78 14 48 89 fa 48 c1 ea 03 <42> 0f b6 0c 22 48 89 fa 83 e2 07 83 c2 03 38 ca 7c 08 84 c9 0f 85
RSP: 0018:ffffc90000007870 EFLAGS: 00010003
RAX: 0000000000000001 RBX: ffffffff8c8d3cc0 RCX: 0000000000000000
RDX: 0000000000000002 RSI: ffffffff815bc817 RDI: 0000000000000015
RBP: ffffc900000079a0 R08: 0000000000000004 R09: ffff8880ae620f8b
R10: 0000000000000000 R11: 6c756e2820202020 R12: dffffc0000000000
R13: ffffffff8c6de1d0 R14: 0000000000000009 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000200000c0 CR3: 0000000093f18000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
print_irq_inversion_bug.part.0+0x2c6/0x2ee kernel/locking/lockdep.c:3769
print_irq_inversion_bug kernel/locking/lockdep.c:4377 [inline]
check_usage_forwards kernel/locking/lockdep.c:3803 [inline]
mark_lock_irq kernel/locking/lockdep.c:3935 [inline]
mark_lock.cold+0x38/0x10d kernel/locking/lockdep.c:4375
mark_usage kernel/locking/lockdep.c:4260 [inline]
__lock_acquire+0x117d/0x55d0 kernel/locking/lockdep.c:4750
lock_acquire+0x1f3/0xaf0 kernel/locking/lockdep.c:5398
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x94/0xd0 kernel/locking/spinlock.c:159
_snd_pcm_stream_lock_irqsave+0x9f/0xd0 sound/core/pcm_native.c:170
snd_pcm_period_elapsed+0x24/0x250 sound/core/pcm_lib.c:1799
loopback_jiffies_timer_function+0x1a8/0x220 sound/drivers/aloop.c:669
call_timer_fn+0x1ac/0x760 kernel/time/timer.c:1413
expire_timers kernel/time/timer.c:1458 [inline]
__run_timers.part.0+0x67c/0xaa0 kernel/time/timer.c:1755
__run_timers kernel/time/timer.c:1736 [inline]
run_timer_softirq+0xae/0x1a0 kernel/time/timer.c:1768
__do_softirq+0x1f7/0xa91 kernel/softirq.c:298
asm_call_on_stack+0xf/0x20 arch/x86/entry/entry_64.S:706
</IRQ>
__run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
do_softirq_own_stack+0x9d/0xd0 arch/x86/kernel/irq_64.c:77
invoke_softirq kernel/softirq.c:393 [inline]
__irq_exit_rcu kernel/softirq.c:423 [inline]
irq_exit_rcu+0x235/0x280 kernel/softirq.c:435
sysvec_apic_timer_interrupt+0x51/0xf0 arch/x86/kernel/apic/apic.c:1091
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:581
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
Code: ff 4c 89 ef e8 23 c4 b4 f9 e9 8e fe ff ff 48 89 df e8 16 c4 b4 f9 eb 8a cc cc cc cc e9 07 00 00 00 0f 00 2d 94 f6 4a 00 fb f4 <c3> 90 e9 07 00 00 00 0f 00 2d 84 f6 4a 00 f4 c3 cc cc 55 53 e8 e9
RSP: 0018:ffffffff89a07cd8 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffffffff15796a9
RDX: ffffffff89a99dc0 RSI: ffffffff87fdd050 RDI: 0000000000000000
RBP: ffff8880a67b3064 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: ffff8880a67b3064
R13: 1ffffffff1340fa5 R14: ffff8880a67b3065 R15: 0000000000000001
arch_safe_halt arch/x86/include/asm/paravirt.h:150 [inline]
acpi_safe_halt+0x95/0x180 drivers/acpi/processor_idle.c:111
acpi_idle_do_entry+0x15c/0x1b0 drivers/acpi/processor_idle.c:524
acpi_idle_enter+0x403/0xac0 drivers/acpi/processor_idle.c:650
cpuidle_enter_state+0x150/0xa70 drivers/cpuidle/cpuidle.c:243
cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:355
call_cpuidle kernel/sched/idle.c:132 [inline]
cpuidle_idle_call kernel/sched/idle.c:213 [inline]
do_idle+0x48e/0x730 kernel/sched/idle.c:273
cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:369
start_kernel+0x490/0x4b1 init/main.c:1048
secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
Modules linked in:
---[ end trace 7b8dcf330d33be72 ]---
RIP: 0010:print_lock_trace kernel/locking/lockdep.c:1751 [inline]
RIP: 0010:print_lock_class_header kernel/locking/lockdep.c:2240 [inline]
RIP: 0010:print_shortest_lock_dependencies.cold+0x110/0x2af kernel/locking/lockdep.c:2263
Code: 48 8b 04 24 48 c1 e8 03 42 80 3c 20 00 74 09 48 8b 3c 24 e8 dd fb de f9 48 8b 04 24 48 8b 00 48 8d 78 14 48 89 fa 48 c1 ea 03 <42> 0f b6 0c 22 48 89 fa 83 e2 07 83 c2 03 38 ca 7c 08 84 c9 0f 85
RSP: 0018:ffffc90000007870 EFLAGS: 00010003
RAX: 0000000000000001 RBX: ffffffff8c8d3cc0 RCX: 0000000000000000
RDX: 0000000000000002 RSI: ffffffff815bc817 RDI: 0000000000000015
RBP: ffffc900000079a0 R08: 0000000000000004 R09: ffff8880ae620f8b
R10: 0000000000000000 R11: 6c756e2820202020 R12: dffffc0000000000
R13: ffffffff8c6de1d0 R14: 0000000000000009 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000200000c0 CR3: 0000000093f18000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

syzbot

unread,
Sep 9, 2020, 10:13:11 AM9/9/20
to syzkall...@googlegroups.com, ti...@suse.de
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+561a74...@syzkaller.appspotmail.com

Tested on:

commit: dff9f829 Add linux-next specific files for 20200908
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
patch: https://syzkaller.appspot.com/x/patch.diff?x=132b15ed900000

Note: testing is done by a robot and is best-effort only.

syzbot

unread,
Sep 9, 2020, 1:33:05 PM9/9/20
to alsa-dev...@alsa-project.org, alsa-...@alsa-project.org, ar...@arndb.de, baoli...@linaro.org, boqun...@gmail.com, linux-...@vger.kernel.org, mi...@redhat.com, paul...@chromium.org, pe...@perex.cz, pet...@infradead.org, syzkall...@googlegroups.com, ti...@suse.com, ti...@suse.de, wi...@kernel.org
syzbot has bisected this issue to:

commit e918188611f073063415f40fae568fa4d86d9044
Author: Boqun Feng <boqun...@gmail.com>
Date: Fri Aug 7 07:42:20 2020 +0000

locking: More accurate annotations for read_lock()

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=112dc243900000
start commit: dff9f829 Add linux-next specific files for 20200908
git tree: linux-next
final oops: https://syzkaller.appspot.com/x/report.txt?x=132dc243900000
console output: https://syzkaller.appspot.com/x/log.txt?x=152dc243900000
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1209e245900000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=154b15ed900000

Reported-by: syzbot+561a74...@syzkaller.appspotmail.com
Fixes: e918188611f0 ("locking: More accurate annotations for read_lock()")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Boqun Feng

unread,
Sep 10, 2020, 12:50:19 AM9/10/20
to syzbot, alsa-dev...@alsa-project.org, alsa-...@alsa-project.org, ar...@arndb.de, baoli...@linaro.org, linux-...@vger.kernel.org, mi...@redhat.com, paul...@chromium.org, pe...@perex.cz, pet...@infradead.org, syzkall...@googlegroups.com, ti...@suse.com, ti...@suse.de, wi...@kernel.org
Thanks for reporting.

On Wed, Sep 09, 2020 at 10:33:04AM -0700, syzbot wrote:
> syzbot has bisected this issue to:
>
> commit e918188611f073063415f40fae568fa4d86d9044
> Author: Boqun Feng <boqun...@gmail.com>
> Date: Fri Aug 7 07:42:20 2020 +0000
>
> locking: More accurate annotations for read_lock()
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=112dc243900000
> start commit: dff9f829 Add linux-next specific files for 20200908
> git tree: linux-next
> final oops: https://syzkaller.appspot.com/x/report.txt?x=132dc243900000
> console output: https://syzkaller.appspot.com/x/log.txt?x=152dc243900000

From what I see in the output, probably this is the new deadlock
possibility we find with lockdep, basically if we have:

CPU 0: CPU 1:
read_lock(snd_card::ctl_file_rwlock);
<irq disabled>
spin_lock(snd_pcm_group::lock);
read_lock(snd_card::ctl_file_rwlock);
<interrupted by softirq>
spin_lock(snd_pcm_group::lock);

, because the read_lock() on CPU 1 will be a fair read lock(IOW, not a
recursive reader). And if there is a third CPU is also waiting for the
write_lock(), CPU 1 cannot get the read_lock() due to the fairness:

CPU 0: CPU 1: CPU 2:
read_lock(snd_card::ctl_file_rwlock);
<irq disabled>
spin_lock(snd_pcm_group::lock);
write_lock(snd_card::ctl_file_rwlock);
read_lock(snd_card::ctl_file_rwlock); // fair read lock, can only get the lock if CPU 2 get its lock
<interrupted by softirq>
spin_lock(snd_pcm_group::lock);

That said, I'm still looking into the code to find whether there is a
lock combination of CPU 1. Given I'm not familiar with sound subsystem,
I will appreciate any help on finding the lock pattern on CPU 1 ;-)

Regards,
Boqun
Reply all
Reply to author
Forward
0 new messages