[syzbot] inconsistent lock state in padata_do_serial (2)

16 views
Skip to first unread message

syzbot

unread,
Sep 18, 2022, 3:43:45 PM9/18/22
to daniel....@oracle.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, steffen....@secunet.com, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 3245cb65fd91 Merge tag 'devicetree-fixes-for-6.0-2' of git..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16310dbf080000
kernel config: https://syzkaller.appspot.com/x/.config?x=98a30118ec9215e9
dashboard link: https://syzkaller.appspot.com/bug?extid=bc05445bc14148d51915
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
userspace arch: i386

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+bc0544...@syzkaller.appspotmail.com

================================
WARNING: inconsistent lock state
6.0.0-rc5-syzkaller-00025-g3245cb65fd91 #0 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
syz-executor.2/27685 [HC0[0]:SC1[1]:HE1:SE0] takes:
ffffe8ffffc7d280 (&pd_list->lock){+.?.}-{2:2}, at: spin_lock include/linux/spinlock.h:349 [inline]
ffffe8ffffc7d280 (&pd_list->lock){+.?.}-{2:2}, at: padata_do_serial+0x21e/0x4b0 kernel/padata.c:392
{SOFTIRQ-ON-W} state was registered at:
lock_acquire kernel/locking/lockdep.c:5666 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:349 [inline]
padata_do_serial+0x21e/0x4b0 kernel/padata.c:392
pcrypt_aead_enc+0x57/0x70 crypto/pcrypt.c:89
padata_do_parallel+0x87b/0xa10 kernel/padata.c:217
pcrypt_aead_encrypt+0x39f/0x4d0 crypto/pcrypt.c:117
crypto_aead_encrypt+0xaa/0xf0 crypto/aead.c:94
tls_do_encryption net/tls/tls_sw.c:529 [inline]
tls_push_record+0x13e8/0x3260 net/tls/tls_sw.c:762
bpf_exec_tx_verdict+0xd82/0x11a0 net/tls/tls_sw.c:802
tls_sw_sendmsg+0xa62/0x1820 net/tls/tls_sw.c:1014
inet6_sendmsg+0x99/0xe0 net/ipv6/af_inet6.c:653
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:734
sock_write_iter+0x291/0x3d0 net/socket.c:1108
call_write_iter include/linux/fs.h:2187 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x9e9/0xdd0 fs/read_write.c:578
ksys_write+0x1e8/0x250 fs/read_write.c:631
do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
__do_fast_syscall_32+0x65/0xf0 arch/x86/entry/common.c:178
do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:203
entry_SYSENTER_compat_after_hwframe+0x70/0x82
irq event stamp: 740
hardirqs last enabled at (740): [<ffffffff814919a0>] __local_bh_enable_ip+0xa0/0x120 kernel/softirq.c:401
hardirqs last disabled at (739): [<ffffffff814919c3>] __local_bh_enable_ip+0xc3/0x120 kernel/softirq.c:378
softirqs last enabled at (0): [<ffffffff8146f02e>] copy_process+0x213e/0x7090 kernel/fork.c:2202
softirqs last disabled at (717): [<ffffffff81491843>] invoke_softirq kernel/softirq.c:445 [inline]
softirqs last disabled at (717): [<ffffffff81491843>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&pd_list->lock);
<Interrupt>
lock(&pd_list->lock);

*** DEADLOCK ***

4 locks held by syz-executor.2/27685:
#0: ffff8880445f0460 (sb_writers#3){.+.+}-{0:0}, at: open_last_lookups fs/namei.c:3470 [inline]
#0: ffff8880445f0460 (sb_writers#3){.+.+}-{0:0}, at: path_openat+0x2613/0x28f0 fs/namei.c:3688
#1: ffff8880465111a0 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: inode_lock_shared include/linux/fs.h:766 [inline]
#1: ffff8880465111a0 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: open_last_lookups fs/namei.c:3480 [inline]
#1: ffff8880465111a0 (&sb->s_type->i_mutex_key#9){++++}-{3:3}, at: path_openat+0x1514/0x28f0 fs/namei.c:3688
#2: ffffc900001e0d70 ((&d->timer)){+.-.}-{0:0}, at: lockdep_copy_map include/linux/lockdep.h:31 [inline]
#2: ffffc900001e0d70 ((&d->timer)){+.-.}-{0:0}, at: call_timer_fn+0xd5/0x6b0 kernel/time/timer.c:1464
#3: ffffffff8bf89400 (rcu_read_lock){....}-{1:2}, at: buf_msg net/tipc/msg.h:202 [inline]
#3: ffffffff8bf89400 (rcu_read_lock){....}-{1:2}, at: tipc_bearer_xmit_skb+0x8c/0x410 net/tipc/bearer.c:550

stack backtrace:
CPU: 1 PID: 27685 Comm: syz-executor.2 Not tainted 6.0.0-rc5-syzkaller-00025-g3245cb65fd91 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_usage_bug kernel/locking/lockdep.c:3961 [inline]
valid_state kernel/locking/lockdep.c:3973 [inline]
mark_lock_irq kernel/locking/lockdep.c:4176 [inline]
mark_lock.part.0.cold+0x18/0xd8 kernel/locking/lockdep.c:4632
mark_lock kernel/locking/lockdep.c:4596 [inline]
mark_usage kernel/locking/lockdep.c:4527 [inline]
__lock_acquire+0x11d9/0x56d0 kernel/locking/lockdep.c:5007
lock_acquire kernel/locking/lockdep.c:5666 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:349 [inline]
padata_do_serial+0x21e/0x4b0 kernel/padata.c:392
pcrypt_aead_enc+0x57/0x70 crypto/pcrypt.c:89
padata_do_parallel+0x87b/0xa10 kernel/padata.c:217
pcrypt_aead_encrypt+0x39f/0x4d0 crypto/pcrypt.c:117
crypto_aead_encrypt+0xaa/0xf0 crypto/aead.c:94
tipc_aead_encrypt net/tipc/crypto.c:821 [inline]
tipc_crypto_xmit+0xf7a/0x2af0 net/tipc/crypto.c:1756
tipc_bearer_xmit_skb+0x1ed/0x410 net/tipc/bearer.c:557
tipc_disc_timeout+0x75e/0xcb0 net/tipc/discover.c:335
call_timer_fn+0x1a0/0x6b0 kernel/time/timer.c:1474
expire_timers kernel/time/timer.c:1519 [inline]
__run_timers.part.0+0x674/0xa80 kernel/time/timer.c:1790
__run_timers kernel/time/timer.c:1768 [inline]
run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1803
__do_softirq+0x1d3/0x9c6 kernel/softirq.c:571
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1106
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20 arch/x86/include/asm/idtentry.h:649
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x38/0x70 kernel/locking/spinlock.c:194
Code: 74 24 10 e8 6a 2e dc f7 48 89 ef e8 f2 af dc f7 81 e3 00 02 00 00 75 25 9c 58 f6 c4 02 75 2d 48 85 db 74 01 fb bf 01 00 00 00 <e8> a3 71 cf f7 65 8b 05 5c 27 7f 76 85 c0 74 0a 5b 5d c3 e8 60 38
RSP: 0018:ffffc90015e5f520 EFLAGS: 00000206
RAX: 0000000000000002 RBX: 0000000000000200 RCX: 1ffffffff21242a6
RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000001
RBP: ffff8880119db3c0 R08: 0000000000000001 R09: ffffffff908e49e7
R10: 0000000000000001 R11: 0000000000000000 R12: ffff888140007640
R13: ffff8880119db3c0 R14: ffffea0001194380 R15: ffff88804650ea48
spin_unlock_irqrestore include/linux/spinlock.h:404 [inline]
get_partial_node.part.0+0x1ad/0x220 mm/slub.c:2207
get_partial_node mm/slub.c:2175 [inline]
get_partial mm/slub.c:2287 [inline]
___slab_alloc+0x918/0xe10 mm/slub.c:3026
__slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3118
slab_alloc_node mm/slub.c:3209 [inline]
slab_alloc mm/slub.c:3251 [inline]
__kmem_cache_alloc_lru mm/slub.c:3258 [inline]
kmem_cache_alloc_lru+0x528/0x720 mm/slub.c:3275
__d_alloc+0x32/0x960 fs/dcache.c:1769
d_alloc+0x4a/0x230 fs/dcache.c:1849
d_alloc_parallel+0xe4/0x1400 fs/dcache.c:2647
lookup_open.isra.0+0x928/0x12a0 fs/namei.c:3338
open_last_lookups fs/namei.c:3481 [inline]
path_openat+0x996/0x28f0 fs/namei.c:3688
do_filp_open+0x1b6/0x400 fs/namei.c:3718
do_sys_openat2+0x16d/0x4c0 fs/open.c:1311
do_sys_open fs/open.c:1327 [inline]
__do_compat_sys_openat fs/open.c:1387 [inline]
__se_compat_sys_openat fs/open.c:1385 [inline]
__ia32_compat_sys_openat+0x13f/0x1f0 fs/open.c:1385
do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
__do_fast_syscall_32+0x65/0xf0 arch/x86/entry/common.c:178
do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:203
entry_SYSENTER_compat_after_hwframe+0x70/0x82
RIP: 0023:0xf7fc6549
Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
RSP: 002b:00000000f748f8e0 EFLAGS: 00000286 ORIG_RAX: 0000000000000127
RAX: ffffffffffffffda RBX: 00000000ffffff9c RCX: 00000000f6ea12ec
RDX: 0000000000080001 RSI: 0000000000000000 RDI: 00000000f6f37000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000286 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
----------------
Code disassembly (best guess):
0: 74 24 je 0x26
2: 10 e8 adc %ch,%al
4: 6a 2e pushq $0x2e
6: dc f7 fdiv %st,%st(7)
8: 48 89 ef mov %rbp,%rdi
b: e8 f2 af dc f7 callq 0xf7dcb002
10: 81 e3 00 02 00 00 and $0x200,%ebx
16: 75 25 jne 0x3d
18: 9c pushfq
19: 58 pop %rax
1a: f6 c4 02 test $0x2,%ah
1d: 75 2d jne 0x4c
1f: 48 85 db test %rbx,%rbx
22: 74 01 je 0x25
24: fb sti
25: bf 01 00 00 00 mov $0x1,%edi
* 2a: e8 a3 71 cf f7 callq 0xf7cf71d2 <-- trapping instruction
2f: 65 8b 05 5c 27 7f 76 mov %gs:0x767f275c(%rip),%eax # 0x767f2792
36: 85 c0 test %eax,%eax
38: 74 0a je 0x44
3a: 5b pop %rbx
3b: 5d pop %rbp
3c: c3 retq
3d: e8 .byte 0xe8
3e: 60 (bad)
3f: 38 .byte 0x38


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

ead...@sina.com

unread,
Sep 18, 2022, 9:06:33 PM9/18/22
to syzbot+bc0544...@syzkaller.appspotmail.com, daniel....@oracle.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, steffen....@secunet.com, syzkall...@googlegroups.com, Edward Adam Davis
From: Edward Adam Davis <ead...@sina.com>

Parallelized object serialization uses spin_unlock for unlocking a spin lock
that was previously locked with spin_lock.

This caused the following lockdep warning about an inconsistent lock
state:

inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.

We must use spin_lock_irqsave, because it is possible to trigger tipc
from an irq handler.
Signed-off-by: Edward Adam Davis <ead...@sina.com>
Reported-by: syzbot+bc0544...@syzkaller.appspotmail.com
---
kernel/padata.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index e5819bb8bd1d..38c7b17da796 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -388,14 +388,15 @@ void padata_do_serial(struct padata_priv *padata)
int hashed_cpu = padata_cpu_hash(pd, padata->seq_nr);
struct padata_list *reorder = per_cpu_ptr(pd->reorder_list, hashed_cpu);
struct padata_priv *cur;
+ unsigned long flags;

- spin_lock(&reorder->lock);
+ spin_lock_irqsave(&reorder->lock, flags);
/* Sort in ascending order of sequence number. */
list_for_each_entry_reverse(cur, &reorder->list, list)
if (cur->seq_nr < padata->seq_nr)
break;
list_add(&padata->list, &cur->list);
- spin_unlock(&reorder->lock);
+ spin_unlock_irqrestore(&reorder->lock, flags);

/*
* Ensure the addition to the reorder list is ordered correctly
--
2.37.2

Daniel Jordan

unread,
Sep 19, 2022, 11:12:57 AM9/19/22
to ead...@sina.com, syzbot+bc0544...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, steffen....@secunet.com, syzkall...@googlegroups.com
Hi Edward,

On Mon, Sep 19, 2022 at 09:05:55AM +0800, ead...@sina.com wrote:
> From: Edward Adam Davis <ead...@sina.com>
>
> Parallelized object serialization uses spin_unlock for unlocking a spin lock
> that was previously locked with spin_lock.

There's nothing unusual about that, though?

> This caused the following lockdep warning about an inconsistent lock
> state:
>
> inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.

Neither HARDIRQ-ON-W nor IN-HARDIRQ-W appear in the syzbot report, did
you mean SOFTIRQ-ON-W and IN-SOFTIRQ-W?

> We must use spin_lock_irqsave, because it is possible to trigger tipc
> from an irq handler.

A softirq handler, not a hardirq handler. I'd suggest using
spin_lock_bh() instead of _irqsave in your patch.


A Fixes tag would be helpful for stable and folks backporting this fix
to understand what kernel versions are affected.
The changelog doesn't explain the problem or why the proposed solution
fixes it.

If I can read these splats right, it seems lockdep is complaining about
how a task can take the reorder lock when softirqs are enabled
(SOFTIRQ-ON-W) as in the tls_push_record() stack, but also when it's in
softirq context (IN-SOFTIRQ-W), as in the tipc_disc_timeout() stack. So
it should be enough to disable softirq here.

Daniel Jordan

unread,
Sep 19, 2022, 11:27:49 AM9/19/22
to ead...@sina.com, syzbot+bc0544...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, steffen....@secunet.com, syzkall...@googlegroups.com
...to avoid a deadlock. Helpful to include the impact to the user too.

ead...@sina.com

unread,
Sep 19, 2022, 8:39:48 PM9/19/22
to daniel....@oracle.com, ead...@sina.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, steffen....@secunet.com, syzbot+bc0544...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
From: Edward Adam Davis <ead...@sina.com>

On Mon, 19 Sep 2022 11:12:48 -0400, Daniel Jordan wrote:
> Hi Edward,
>
> On Mon, Sep 19, 2022 at 09:05:55AM +0800, ead...@sina.com wrote:
> > From: Edward Adam Davis <ead...@sina.com>
> >
> > Parallelized object serialization uses spin_unlock for unlocking a spin lock
> > that was previously locked with spin_lock.
>
> There's nothing unusual about that, though?
>
> > This caused the following lockdep warning about an inconsistent lock
> > state:
> >
> > inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
>
> Neither HARDIRQ-ON-W nor IN-HARDIRQ-W appear in the syzbot report, did
> you mean SOFTIRQ-ON-W and IN-SOFTIRQ-W?
Yes, I want say: inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
>
> > We must use spin_lock_irqsave, because it is possible to trigger tipc
> > from an irq handler.
>
> A softirq handler, not a hardirq handler. I'd suggest using
> spin_lock_bh() instead of _irqsave in your patch.
I think _irqsave better than _bh, it can save the irq context, but _bh not,
and in tipc call trace contain SOFTIRQ-ON-W and IN-SOFTIRQ-W.
>
>
> A Fixes tag would be helpful for stable and folks backporting this fix
> to understand what kernel versions are affected.
Yes, I will add it, thanks for your suggestion, and add Cc for the "Fixes"
owner.
Yes, I agree with what you said earlier, But the softirq is already on before
the tipc_bearer_xmit_skb(), that is, (SOFTIRQ-ON-W) and (IN-SOFTIRQ-W) will
be included in the call trace of TIPC.

ead...@sina.com

unread,
Sep 19, 2022, 9:25:16 PM9/19/22
to daniel....@oracle.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, steffen....@secunet.com, syzbot+bc0544...@syzkaller.appspotmail.com, syzkall...@googlegroups.com, Edward Adam Davis
From: Edward Adam Davis <ead...@sina.com>

Parallelized object uses spin_unlock
for unlocking a spin lock that was previously locked with
spin_lock_irqsave.

This caused the following lockdep warning about an inconsistent lock
state:

inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.

We must use spin_lock_irqsave, because it is possible to trigger tipc
from an softirq handler.
Link: https://syzkaller.appspot.com/bug?extid=bc05445bc14148d51915
Reported-by: syzbot+bc0544...@syzkaller.appspotmail.com
Fixes: f601c725a6ac0726 ("padata: remove padata_parallel_queue")
Cc: Daniel Jordan <daniel....@oracle.com>
Signed-off-by: Edward Adam Davis <ead...@sina.com>
---
Changes in v2:
Add Fixes, Cc, And alert comments.

Daniel Jordan

unread,
Sep 19, 2022, 9:47:20 PM9/19/22
to ead...@sina.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, steffen....@secunet.com, syzbot+bc0544...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
On Tue, Sep 20, 2022 at 08:39:08AM +0800, ead...@sina.com wrote:
> From: Edward Adam Davis <ead...@sina.com>
>
> On Mon, 19 Sep 2022 11:12:48 -0400, Daniel Jordan wrote:
> > Hi Edward,
> >
> > On Mon, Sep 19, 2022 at 09:05:55AM +0800, ead...@sina.com wrote:
> > > From: Edward Adam Davis <ead...@sina.com>
> > >
> > > Parallelized object serialization uses spin_unlock for unlocking a spin lock
> > > that was previously locked with spin_lock.
> >
> > There's nothing unusual about that, though?
> >
> > > This caused the following lockdep warning about an inconsistent lock
> > > state:
> > >
> > > inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> >
> > Neither HARDIRQ-ON-W nor IN-HARDIRQ-W appear in the syzbot report, did
> > you mean SOFTIRQ-ON-W and IN-SOFTIRQ-W?
> Yes, I want say: inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> >
> > > We must use spin_lock_irqsave, because it is possible to trigger tipc
> > > from an irq handler.
> >
> > A softirq handler, not a hardirq handler. I'd suggest using
> > spin_lock_bh() instead of _irqsave in your patch.
> I think _irqsave better than _bh, it can save the irq context, but _bh not,
> and in tipc call trace contain SOFTIRQ-ON-W and IN-SOFTIRQ-W.

_irqsave saving the context is about handling nested hardirq disables.
It's not needed here since we don't need to care about disabling
hardirq.

_bh is for disabling softirq, a different context from hardirq. We want
_bh here since the deadlock happens when a CPU takes the lock in both
task and softirq context. padata uses _bh lock variants because it can
be called in softirq context but not hardirq. Let's be consistent and
do it in this case too.
I hope what I said above about hard vs soft irq clears this up.

Daniel Jordan

unread,
Sep 19, 2022, 9:48:52 PM9/19/22
to ead...@sina.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, steffen....@secunet.com, syzbot+bc0544...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
On Tue, Sep 20, 2022 at 09:25:08AM +0800, ead...@sina.com wrote:
> Changes in v2:
> Add Fixes, Cc, And alert comments.

I appreciate the feedback you addressed, but kindly wait until we've
reached a consensus on the last version before sending a new one.
Thanks.

Steffen Klassert

unread,
Sep 20, 2022, 1:54:46 AM9/20/22
to Daniel Jordan, ead...@sina.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzbot+bc0544...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
padata_do_serial is called with BHs off, so using spin_lock_bh should not
fix anything here. I guess the problem is that we call padata_find_next
after we enabled the BHs in padata_reorder.

Daniel Jordan

unread,
Sep 20, 2022, 10:11:06 AM9/20/22
to Steffen Klassert, ead...@sina.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzbot+bc0544...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
Hi Steffen,
Yeah, padata_do_serial can be called with BHs off, like in the tipc
stack, but there are also cases where BHs can be on, like lockdep said
here:

{SOFTIRQ-ON-W} state was registered at:
...
padata_do_serial+0x21e/0x4b0 kernel/padata.c:392
...

Line 392 is in _do_serial, not _reorder or _find_next.

ead...@sina.com

unread,
Sep 20, 2022, 6:34:19 PM9/20/22
to daniel....@oracle.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, steffen....@secunet.com, syzbot+bc0544...@syzkaller.appspotmail.com, syzkall...@googlegroups.com, Edward Adam Davis
From: Edward Adam Davis <ead...@sina.com>

Hi Daniel Jordan,
Yeah, The core of the problem is to resolve the context inconsistency before entering
the critical region, More specifically, the context should be IN-SOFTIRQ-W
or IN-HARDIRQ-W choose one before entering the critical section.

Steffen Klassert

unread,
Sep 21, 2022, 3:36:21 AM9/21/22
to Daniel Jordan, ead...@sina.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzbot+bc0544...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
padata_do_serial was designed to run with BHs off, it is a bug if it
runs with BHs on. But I don't see a case where this can happen. The
only user of padata_do_serial is pcrypt in its serialization callbacks
(pcrypt_aead_enc, pcrypt_aead_dec) and the async crypto callback
pcrypt_aead_done. pcrypt_aead_enc and pcrypt_aead_dec are issued via
the padata_serial_worker with the padata->serial call. BHs are
off here. The crypto callback also runs with BHs off.

What do I miss here?

Daniel Jordan

unread,
Sep 21, 2022, 2:51:48 PM9/21/22
to Steffen Klassert, ead...@sina.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzbot+bc0544...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
Ugh.. this newer, buggy part of padata_do_parallel:

/* Maximum works limit exceeded, run in the current task. */
padata->parallel(padata);

This skips the usual path in padata_parallel_worker, which disables BHs.
They should be left off in the above case too.

What about this?

---8<---

Subject: [PATCH] padata: always leave BHs disabled when running ->parallel()

A deadlock can happen when an overloaded system runs ->parallel() in the
context of the current task:

padata_do_parallel
->parallel()
pcrypt_aead_enc/dec
padata_do_serial
spin_lock(&reorder->lock) // BHs still enabled
<interrupt>
...
__do_softirq
...
padata_do_serial
spin_lock(&reorder->lock)

It's a bug for BHs to be on in _do_serial as Steffen points out, so
ensure they're off in the "current task" case like they are in
padata_parallel_worker to avoid this situation.

Reported-by: syzbot+bc0544...@syzkaller.appspotmail.com
Fixes: 4611ce224688 ("padata: allocate work structures for parallel jobs from a pool")
Signed-off-by: Daniel Jordan <daniel....@oracle.com>
---
kernel/padata.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index e5819bb8bd1d..97f51e0c1776 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -207,14 +207,16 @@ int padata_do_parallel(struct padata_shell *ps,
pw = padata_work_alloc();
spin_unlock(&padata_works_lock);

+ if (!pw) {
+ /* Maximum works limit exceeded, run in the current task. */
+ padata->parallel(padata);
+ }
+
rcu_read_unlock_bh();

if (pw) {
padata_work_init(pw, padata_parallel_worker, padata, 0);
queue_work(pinst->parallel_wq, &pw->pw_work);
- } else {
- /* Maximum works limit exceeded, run in the current task. */
- padata->parallel(padata);
}

return 0;
--
2.37.2

Steffen Klassert

unread,
Sep 22, 2022, 6:55:42 AM9/22/22
to Daniel Jordan, ead...@sina.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzbot+bc0544...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
On Wed, Sep 21, 2022 at 02:51:38PM -0400, Daniel Jordan wrote:
> On Wed, Sep 21, 2022 at 09:36:16AM +0200, Steffen Klassert wrote:
> > On Tue, Sep 20, 2022 at 10:10:57AM -0400, Daniel Jordan wrote:
> > > Yeah, padata_do_serial can be called with BHs off, like in the tipc
> > > stack, but there are also cases where BHs can be on, like lockdep said
> > > here:
> >
> > padata_do_serial was designed to run with BHs off, it is a bug if it
> > runs with BHs on. But I don't see a case where this can happen. The
> > only user of padata_do_serial is pcrypt in its serialization callbacks
> > (pcrypt_aead_enc, pcrypt_aead_dec) and the async crypto callback
> > pcrypt_aead_done. pcrypt_aead_enc and pcrypt_aead_dec are issued via
> > the padata_serial_worker with the padata->serial call. BHs are
> > off here. The crypto callback also runs with BHs off.
> >
> > What do I miss here?
>
> Ugh.. this newer, buggy part of padata_do_parallel:
>
> /* Maximum works limit exceeded, run in the current task. */
> padata->parallel(padata);

Oh well...

> This skips the usual path in padata_parallel_worker, which disables BHs.
> They should be left off in the above case too.
>
> What about this?
>
> ---8<---
>
> Subject: [PATCH] padata: always leave BHs disabled when running ->parallel()
>
> A deadlock can happen when an overloaded system runs ->parallel() in the
> context of the current task:
>
> padata_do_parallel
> ->parallel()
> pcrypt_aead_enc/dec
> padata_do_serial
> spin_lock(&reorder->lock) // BHs still enabled
> <interrupt>
> ...
> __do_softirq
> ...
> padata_do_serial
> spin_lock(&reorder->lock)
>
> It's a bug for BHs to be on in _do_serial as Steffen points out, so
> ensure they're off in the "current task" case like they are in
> padata_parallel_worker to avoid this situation.
>
> Reported-by: syzbot+bc0544...@syzkaller.appspotmail.com
> Fixes: 4611ce224688 ("padata: allocate work structures for parallel jobs from a pool")
> Signed-off-by: Daniel Jordan <daniel....@oracle.com>

Yes, that makes sense.

Acked-by: Steffen Klassert <steffen....@secunet.com>

But we also should look at the call to padata_find_next where BHs are
on. padata_find_next takes the same lock as padata_do_serial, so this
might be a candidate for a deadlock too.

ead...@sina.com

unread,
Sep 22, 2022, 11:17:57 AM9/22/22
to daniel....@oracle.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, steffen....@secunet.com, syzbot+bc0544...@syzkaller.appspotmail.com, syzkall...@googlegroups.com, Edward Adam Davis
From: Edward Adam Davis <ead...@sina.com>

Hi Daniel Jordan,

Wed, 21 Sep 2022 14:51:38 -0400, Daniel Jordan wrote:
> Subject: [PATCH] padata: always leave BHs disabled when running ->parallel()
>
> A deadlock can happen when an overloaded system runs ->parallel() in the
> context of the current task:
>
> padata_do_parallel
> ->parallel()
> pcrypt_aead_enc/dec
> padata_do_serial
> spin_lock(&reorder->lock) // BHs still enabled
> <interrupt>
> ...
> __do_softirq
> ...
> padata_do_serial
> spin_lock(&reorder->lock)
>
> It's a bug for BHs to be on in _do_serial as Steffen points out, so
> ensure they're off in the "current task" case like they are in
> padata_parallel_worker to avoid this situation.

Note that the splat clearly shows that padata_do_serial, whether entering from
TLS or TIPC, follows the same path, but tls is BHs on, but tipc Bhs is off:
...
spin_lock include/linux/spinlock.h:349 [inline]
padata_do_serial+0x21e/0x4b0 kernel/padata.c:392
pcrypt_aead_enc+0x57/0x70 crypto/pcrypt.c:89
padata_do_parallel+0x87b/0xa10 kernel/padata.c:217
...

So we didn't specify where the BHs was turned on before you acquired the lock.
It is therefore wise to turn off BHs when acquiring locks.

Thanks,
Edward Adam Davis

Daniel Jordan

unread,
Sep 22, 2022, 10:07:50 PM9/22/22
to Steffen Klassert, ead...@sina.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzbot+bc0544...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
Thanks.

> But we also should look at the call to padata_find_next where BHs are
> on. padata_find_next takes the same lock as padata_do_serial, so this
> might be a candidate for a deadlock too.

Yeah, that seems broken, it's now on my list of things to fix. Probably
worth staring at the rest of the locking for a bit too.
Reply all
Reply to author
Forward
0 new messages