[moderation/CI] Re: blk-mq: fix possible deadlocks

syzbot ci

Nov 21, 2025, 2:55:58 AM
to syzkaller-upst...@googlegroups.com, syz...@lists.linux.dev
syzbot ci has tested the following series

[v2] blk-mq: fix possible deadlocks
https://lore.kernel.org/all/20251121062829....@fnnas.com
* [PATCH v2 1/9] blk-mq-debugfs: factor out a helper to register debugfs for all rq_qos
* [PATCH v2 2/9] blk-rq-qos: fix possible debugfs_mutex deadlock
* [PATCH v2 3/9] blk-mq-debugfs: make blk_mq_debugfs_register_rqos() static
* [PATCH v2 4/9] blk-mq-debugfs: warn about possible deadlock
* [PATCH v2 5/9] block/blk-rq-qos: add a new helper rq_qos_add_frozen()
* [PATCH v2 6/9] blk-wbt: fix incorrect lock order for rq_qos_mutex and freeze queue
* [PATCH v2 7/9] blk-iocost: fix incorrect lock order for rq_qos_mutex and freeze queue
* [PATCH v2 8/9] blk-iolatency: fix incorrect lock order for rq_qos_mutex and freeze queue
* [PATCH v2 9/9] block/blk-rq-qos: cleanup rq_qos_add()

and found the following issue:
possible deadlock in pcpu_alloc_noprof

Full report is available here:
https://ci.syzbot.org/series/162b3190-cad9-45ce-843d-4ffb08b0d52e

***

possible deadlock in pcpu_alloc_noprof

tree: torvalds
URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux
base: 23cb64fb76257309e396ea4cec8396d4a1dbae68
arch: amd64
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
config: https://ci.syzbot.org/builds/03f7d9f4-1663-4b30-aa9e-5333289a7df2/config

======================================================
WARNING: possible circular locking dependency detected
syzkaller #0 Not tainted
------------------------------------------------------
syz-executor/5988 is trying to acquire lock:
ffffffff8e046120 (fs_reclaim){+.+.}-{0:0}, at: prepare_alloc_pages+0x153/0x610

but task is already holding lock:
ffffffff8e025608 (pcpu_alloc_mutex){+.+.}-{4:4}, at: pcpu_alloc_noprof+0x286/0x1720

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (pcpu_alloc_mutex){+.+.}-{4:4}:
lock_acquire+0x120/0x360
__mutex_lock+0x187/0x1350
pcpu_alloc_noprof+0x286/0x1720
blk_stat_alloc_callback+0xd5/0x220
wbt_init+0xa3/0x500
wbt_enable_default+0x25d/0x350
blk_register_queue+0x36a/0x3f0
__add_disk+0x677/0xd50
add_disk_fwnode+0xfc/0x480
loop_add+0x7f0/0xad0
loop_init+0xd9/0x170
do_one_initcall+0x236/0x820
do_initcall_level+0x104/0x190
do_initcalls+0x59/0xa0
kernel_init_freeable+0x334/0x4b0
kernel_init+0x1d/0x1d0
ret_from_fork+0x4bc/0x870
ret_from_fork_asm+0x1a/0x30

-> #1 (&q->q_usage_counter(io)#17){++++}-{0:0}:
lock_acquire+0x120/0x360
blk_alloc_queue+0x538/0x620
__blk_mq_alloc_disk+0x15c/0x340
loop_add+0x411/0xad0
loop_init+0xd9/0x170
do_one_initcall+0x236/0x820
do_initcall_level+0x104/0x190
do_initcalls+0x59/0xa0
kernel_init_freeable+0x334/0x4b0
kernel_init+0x1d/0x1d0
ret_from_fork+0x4bc/0x870
ret_from_fork_asm+0x1a/0x30

-> #0 (fs_reclaim){+.+.}-{0:0}:
validate_chain+0xb9b/0x2140
__lock_acquire+0xab9/0xd20
lock_acquire+0x120/0x360
fs_reclaim_acquire+0x72/0x100
prepare_alloc_pages+0x153/0x610
__alloc_frozen_pages_noprof+0x123/0x370
__alloc_pages_noprof+0xa/0x30
pcpu_populate_chunk+0x182/0xb30
pcpu_alloc_noprof+0xcbf/0x1720
xt_percpu_counter_alloc+0x161/0x220
translate_table+0x1323/0x2040
ip6t_register_table+0x106/0x7d0
ip6table_filter_table_init+0x75/0xb0
xt_find_table_lock+0x30c/0x3e0
xt_request_find_table_lock+0x26/0x100
do_ip6t_get_ctl+0x730/0x1180
nf_getsockopt+0x26e/0x290
ipv6_getsockopt+0x1ed/0x290
do_sock_getsockopt+0x372/0x450
__x64_sys_getsockopt+0x1a5/0x250
do_syscall_64+0xfa/0xfa0
entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Chain exists of:
fs_reclaim --> &q->q_usage_counter(io)#17 --> pcpu_alloc_mutex

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(pcpu_alloc_mutex);
                               lock(&q->q_usage_counter(io)#17);
                               lock(pcpu_alloc_mutex);
  lock(fs_reclaim);

*** DEADLOCK ***

1 lock held by syz-executor/5988:
#0: ffffffff8e025608 (pcpu_alloc_mutex){+.+.}-{4:4}, at: pcpu_alloc_noprof+0x286/0x1720

stack backtrace:
CPU: 1 UID: 0 PID: 5988 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x189/0x250
print_circular_bug+0x2ee/0x310
check_noncircular+0x134/0x160
validate_chain+0xb9b/0x2140
__lock_acquire+0xab9/0xd20
lock_acquire+0x120/0x360
fs_reclaim_acquire+0x72/0x100
prepare_alloc_pages+0x153/0x610
__alloc_frozen_pages_noprof+0x123/0x370
__alloc_pages_noprof+0xa/0x30
pcpu_populate_chunk+0x182/0xb30
pcpu_alloc_noprof+0xcbf/0x1720
xt_percpu_counter_alloc+0x161/0x220
translate_table+0x1323/0x2040
ip6t_register_table+0x106/0x7d0
ip6table_filter_table_init+0x75/0xb0
xt_find_table_lock+0x30c/0x3e0
xt_request_find_table_lock+0x26/0x100
do_ip6t_get_ctl+0x730/0x1180
nf_getsockopt+0x26e/0x290
ipv6_getsockopt+0x1ed/0x290
do_sock_getsockopt+0x372/0x450
__x64_sys_getsockopt+0x1a5/0x250
do_syscall_64+0xfa/0xfa0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb85019140a
Code: ff c3 66 0f 1f 44 00 00 48 c7 c2 a8 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff eb b8 0f 1f 44 00 00 49 89 ca b8 37 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 06 c3 0f 1f 44 00 00 48 c7 c2 a8 ff ff ff f7
RSP: 002b:00007fffbe029728 EFLAGS: 00000246 ORIG_RAX: 0000000000000037
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fb85019140a
RDX: 0000000000000040 RSI: 0000000000000029 RDI: 0000000000000003
RBP: 0000000000000029 R08: 00007fffbe02974c R09: ffffffffffffff00
R10: 00007fb8503b2ca8 R11: 0000000000000246 R12: 00007fb85021340a
R13: 00007fb8503b4e60 R14: 00007fb8503b2ca8 R15: 00007fb8503b2ca0
</TASK>


***
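The chain above ("fs_reclaim --> &q->q_usage_counter(io)#17 --> pcpu_alloc_mutex", closed by acquiring fs_reclaim while pcpu_alloc_mutex is held) can be modeled in userspace. This is a hypothetical sketch of the cycle check lockdep performs, not kernel code; the edge names are simplified stand-ins for the lock classes in the report.

```python
# Model of the dependency graph from the lockdep report: an edge
# A -> B means "B was acquired while A was held".
edges = {
    "fs_reclaim": ["q_usage_counter_io"],        # -> #1: queue alloc/freeze under reclaim
    "q_usage_counter_io": ["pcpu_alloc_mutex"],  # -> #2: wbt_init's percpu alloc under frozen queue
    "pcpu_alloc_mutex": ["fs_reclaim"],          # -> #0: page allocation under pcpu_alloc_mutex
}

def find_cycle(edges):
    """Depth-first search for a circular lock dependency; returns the
    cycle as a list of lock names, or None if the graph is acyclic."""
    def dfs(node, path, seen):
        if node in path:
            return path[path.index(node):] + [node]
        if node in seen:
            return None
        seen.add(node)
        for nxt in edges.get(node, []):
            cycle = dfs(nxt, path + [node], seen)
            if cycle:
                return cycle
        return None
    for start in edges:
        cycle = dfs(start, [], set())
        if cycle:
            return cycle
    return None

print(find_cycle(edges))
# -> ['fs_reclaim', 'q_usage_counter_io', 'pcpu_alloc_mutex', 'fs_reclaim']
```

Any one edge removed breaks the cycle, which is why the series reorders rq_qos setup against queue freezing rather than touching reclaim itself.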

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
Tested-by: syz...@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzk...@googlegroups.com.

The email will later be sent to:
[ax...@kernel.dk bvana...@acm.org linux...@vger.kernel.org ming...@redhat.com ni...@linux.ibm.com t...@kernel.org yuk...@fnnas.com]

If the report looks fine to you, reply with:
#syz upstream

syzbot ci

Nov 29, 2025, 10:39:36 PM
to syzkaller-upst...@googlegroups.com, syz...@lists.linux.dev
syzbot ci has tested the following series

[v3] blk-mq: fix possible deadlocks
https://lore.kernel.org/all/20251130024349....@fnnas.com
* [PATCH v3 01/10] blk-mq-debugfs: factor out a helper to register debugfs for all rq_qos
* [PATCH v3 02/10] blk-rq-qos: fix possible debugfs_mutex deadlock
* [PATCH v3 03/10] blk-mq-debugfs: make blk_mq_debugfs_register_rqos() static
* [PATCH v3 04/10] blk-mq-debugfs: warn about possible deadlock
* [PATCH v3 05/10] block/blk-rq-qos: add a new helper rq_qos_add_frozen()
* [PATCH v3 06/10] blk-wbt: fix incorrect lock order for rq_qos_mutex and freeze queue
* [PATCH v3 07/10] blk-iocost: fix incorrect lock order for rq_qos_mutex and freeze queue
* [PATCH v3 08/10] blk-iolatency: fix incorrect lock order for rq_qos_mutex and freeze queue
* [PATCH v3 09/10] blk-throttle: remove useless queue frozen
* [PATCH v3 10/10] block/blk-rq-qos: cleanup rq_qos_add()

and found the following issue:
possible deadlock in pcpu_alloc_noprof

Full report is available here:
https://ci.syzbot.org/series/1aec77f0-c53f-4b3b-93fb-b3853983b6bd

***

possible deadlock in pcpu_alloc_noprof

tree: linux-next
URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next
base: 7d31f578f3230f3b7b33b0930b08f9afd8429817
arch: amd64
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
config: https://ci.syzbot.org/builds/70dca9e4-6667-4930-9024-150d656e503e/config

soft_limit_in_bytes is deprecated and will be removed. Please report your usecase to linu...@kvack.org if you depend on this functionality.
======================================================
WARNING: possible circular locking dependency detected
syzkaller #0 Not tainted
------------------------------------------------------
syz-executor/6047 is trying to acquire lock:
ffffffff8e04f760 (fs_reclaim){+.+.}-{0:0}, at: prepare_alloc_pages+0x152/0x650

but task is already holding lock:
ffffffff8e02dde8 (pcpu_alloc_mutex){+.+.}-{4:4}, at: pcpu_alloc_noprof+0x25b/0x1750

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (pcpu_alloc_mutex){+.+.}-{4:4}:
__mutex_lock+0x187/0x1350
pcpu_alloc_noprof+0x25b/0x1750
blk_stat_alloc_callback+0xd5/0x220
wbt_init+0xa3/0x500
wbt_enable_default+0x25d/0x350
blk_register_queue+0x36a/0x3f0
__add_disk+0x677/0xd50
add_disk_fwnode+0xfc/0x480
loop_add+0x7f0/0xad0
loop_init+0xd9/0x170
do_one_initcall+0x1fb/0x820
do_initcall_level+0x104/0x190
do_initcalls+0x59/0xa0
kernel_init_freeable+0x334/0x4b0
kernel_init+0x1d/0x1d0
ret_from_fork+0x599/0xb30
ret_from_fork_asm+0x1a/0x30

-> #1 (&q->q_usage_counter(io)#17){++++}-{0:0}:
blk_alloc_queue+0x538/0x620
__blk_mq_alloc_disk+0x15c/0x340
loop_add+0x411/0xad0
loop_init+0xd9/0x170
do_one_initcall+0x1fb/0x820
do_initcall_level+0x104/0x190
do_initcalls+0x59/0xa0
kernel_init_freeable+0x334/0x4b0
kernel_init+0x1d/0x1d0
ret_from_fork+0x599/0xb30
ret_from_fork_asm+0x1a/0x30

-> #0 (fs_reclaim){+.+.}-{0:0}:
__lock_acquire+0x15a6/0x2cf0
lock_acquire+0x117/0x340
fs_reclaim_acquire+0x72/0x100
prepare_alloc_pages+0x152/0x650
__alloc_frozen_pages_noprof+0x123/0x370
__alloc_pages_noprof+0xa/0x30
pcpu_populate_chunk+0x182/0xb30
pcpu_alloc_noprof+0xcb6/0x1750
xt_percpu_counter_alloc+0x161/0x220
translate_table+0x1323/0x2040
ip6t_register_table+0x106/0x7d0
ip6table_nat_table_init+0x43/0x2e0
xt_find_table_lock+0x30c/0x3e0
xt_request_find_table_lock+0x26/0x100
do_ip6t_get_ctl+0x730/0x1180
nf_getsockopt+0x26e/0x290
ipv6_getsockopt+0x1ed/0x290
do_sock_getsockopt+0x2b4/0x3d0
__x64_sys_getsockopt+0x1a5/0x250
do_syscall_64+0xfa/0xf80
entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

Chain exists of:
fs_reclaim --> &q->q_usage_counter(io)#17 --> pcpu_alloc_mutex

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(pcpu_alloc_mutex);
                               lock(&q->q_usage_counter(io)#17);
                               lock(pcpu_alloc_mutex);
  lock(fs_reclaim);

*** DEADLOCK ***

1 lock held by syz-executor/6047:
#0: ffffffff8e02dde8 (pcpu_alloc_mutex){+.+.}-{4:4}, at: pcpu_alloc_noprof+0x25b/0x1750

stack backtrace:
CPU: 0 UID: 0 PID: 6047 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x189/0x250
print_circular_bug+0x2e2/0x300
check_noncircular+0x12e/0x150
__lock_acquire+0x15a6/0x2cf0
lock_acquire+0x117/0x340
fs_reclaim_acquire+0x72/0x100
prepare_alloc_pages+0x152/0x650
__alloc_frozen_pages_noprof+0x123/0x370
__alloc_pages_noprof+0xa/0x30
pcpu_populate_chunk+0x182/0xb30
pcpu_alloc_noprof+0xcb6/0x1750
xt_percpu_counter_alloc+0x161/0x220
translate_table+0x1323/0x2040
ip6t_register_table+0x106/0x7d0
ip6table_nat_table_init+0x43/0x2e0
xt_find_table_lock+0x30c/0x3e0
xt_request_find_table_lock+0x26/0x100
do_ip6t_get_ctl+0x730/0x1180
nf_getsockopt+0x26e/0x290
ipv6_getsockopt+0x1ed/0x290
do_sock_getsockopt+0x2b4/0x3d0
__x64_sys_getsockopt+0x1a5/0x250
do_syscall_64+0xfa/0xf80
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7feba799150a
Code: ff c3 66 0f 1f 44 00 00 48 c7 c2 a8 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff eb b8 0f 1f 44 00 00 49 89 ca b8 37 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 06 c3 0f 1f 44 00 00 48 c7 c2 a8 ff ff ff f7
RSP: 002b:00007fff14c6a9e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000037
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007feba799150a
RDX: 0000000000000040 RSI: 0000000000000029 RDI: 0000000000000003
RBP: 0000000000000029 R08: 00007fff14c6aa0c R09: ffffffffff000000
R10: 00007feba7bb6368 R11: 0000000000000246 R12: 00007feba7a30907
R13: 00007feba7bb7e60 R14: 00007feba7bb6368 R15: 00007feba7bb6360

Aleksandr Nogikh

Nov 30, 2025, 5:08:22 AM
to syzbot ci, syzkaller-upst...@googlegroups.com, syz...@lists.linux.dev
#syz upstream