[syzbot] inconsistent lock state in p9_req

syzbot

unread,

Aug 15, 2022, 6:24:24 PM8/15/22

to asma...@codewreck.org, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net

Hello,

syzbot found the following issue on:

HEAD commit: f6eb0fed6a39 Merge tag 'timers-urgent-2022-08-13' of git:/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1184fec3080000
kernel config: https://syzkaller.appspot.com/x/.config?x=ffbab52ef9fab60
dashboard link: https://syzkaller.appspot.com/bug?extid=2f20b523930c32c160cc
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+2f20b5...@syzkaller.appspotmail.com

================================
WARNING: inconsistent lock state
5.19.0-syzkaller-14264-gf6eb0fed6a39 #0 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
syz-executor.3/10062 [HC1[1]:SC1[1]:HE0:SE0] takes:
ffff88801ceb8c18 (&clnt->lock){?.+.}-{2:2}, at: p9_tag_remove net/9p/client.c:367 [inline]
ffff88801ceb8c18 (&clnt->lock){?.+.}-{2:2}, at: p9_req_put net/9p/client.c:375 [inline]
ffff88801ceb8c18 (&clnt->lock){?.+.}-{2:2}, at: p9_req_put+0xc6/0x250 net/9p/client.c:372
{HARDIRQ-ON-W} state was registered at:
lock_acquire kernel/locking/lockdep.c:5666 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:349 [inline]
p9_fd_request+0x85/0x330 net/9p/trans_fd.c:672
p9_client_rpc+0x2f0/0xce0 net/9p/client.c:660
p9_client_version net/9p/client.c:880 [inline]
p9_client_create+0xaec/0x1070 net/9p/client.c:985
v9fs_session_init+0x1e2/0x1810 fs/9p/v9fs.c:408
v9fs_mount+0xba/0xc90 fs/9p/vfs_super.c:126
legacy_get_tree+0x105/0x220 fs/fs_context.c:610
vfs_get_tree+0x89/0x2f0 fs/super.c:1530
do_new_mount fs/namespace.c:3040 [inline]
path_mount+0x1326/0x1e20 fs/namespace.c:3370
do_mount fs/namespace.c:3383 [inline]
__do_sys_mount fs/namespace.c:3591 [inline]
__se_sys_mount fs/namespace.c:3568 [inline]
__x64_sys_mount+0x27f/0x300 fs/namespace.c:3568
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
irq event stamp: 1139
hardirqs last enabled at (1138): [<ffffffff89c00190>] __do_softirq+0x190/0x9c6 kernel/softirq.c:555
hardirqs last disabled at (1139): [<ffffffff897eaf31>] common_interrupt+0x11/0xc0 arch/x86/kernel/irq.c:240
softirqs last enabled at (62): [<ffffffff81483e73>] invoke_softirq kernel/softirq.c:445 [inline]
softirqs last enabled at (62): [<ffffffff81483e73>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
softirqs last disabled at (1137): [<ffffffff81483e73>] invoke_softirq kernel/softirq.c:445 [inline]
softirqs last disabled at (1137): [<ffffffff81483e73>] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&clnt->lock);
<Interrupt>
lock(&clnt->lock);

*** DEADLOCK ***

3 locks held by syz-executor.3/10062:
#0: ffff888066ee2ab0 (&p->alloc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:349 [inline]
#0: ffff888066ee2ab0 (&p->alloc_lock){+.+.}-{2:2}, at: task_lock include/linux/sched/task.h:174 [inline]
#0: ffff888066ee2ab0 (&p->alloc_lock){+.+.}-{2:2}, at: exit_fs+0x5a/0x170 fs/fs_struct.c:101
#1: ffff88801b98ca20 (&fs->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:349 [inline]
#1: ffff88801b98ca20 (&fs->lock){+.+.}-{2:2}, at: exit_fs+0x62/0x170 fs/fs_struct.c:102
#2: ffff888022720020 (&chan->lock#2){-.-.}-{2:2}, at: req_done+0xcf/0x2e0 net/9p/trans_virtio.c:139

stack backtrace:
CPU: 0 PID: 10062 Comm: syz-executor.3 Not tainted 5.19.0-syzkaller-14264-gf6eb0fed6a39 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_usage_bug kernel/locking/lockdep.c:3961 [inline]
valid_state kernel/locking/lockdep.c:3973 [inline]
mark_lock_irq kernel/locking/lockdep.c:4176 [inline]
mark_lock.part.0.cold+0x18/0xd8 kernel/locking/lockdep.c:4632
mark_lock kernel/locking/lockdep.c:4596 [inline]
mark_usage kernel/locking/lockdep.c:4524 [inline]
__lock_acquire+0x14a2/0x56d0 kernel/locking/lockdep.c:5007
lock_acquire kernel/locking/lockdep.c:5666 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5631
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0x39/0x50 kernel/locking/spinlock.c:162
p9_tag_remove net/9p/client.c:367 [inline]
p9_req_put net/9p/client.c:375 [inline]
p9_req_put+0xc6/0x250 net/9p/client.c:372
req_done+0x1de/0x2e0 net/9p/trans_virtio.c:148
vring_interrupt drivers/virtio/virtio_ring.c:2454 [inline]
vring_interrupt+0x29d/0x3d0 drivers/virtio/virtio_ring.c:2429
__handle_irq_event_percpu+0x227/0x870 kernel/irq/handle.c:158
handle_irq_event_percpu kernel/irq/handle.c:193 [inline]
handle_irq_event+0xa7/0x1e0 kernel/irq/handle.c:210
handle_edge_irq+0x25f/0xd00 kernel/irq/chip.c:819
generic_handle_irq_desc include/linux/irqdesc.h:158 [inline]
handle_irq arch/x86/kernel/irq.c:231 [inline]
__common_interrupt+0x9d/0x210 arch/x86/kernel/irq.c:250
common_interrupt+0x4d/0xc0 arch/x86/kernel/irq.c:240
asm_common_interrupt+0x22/0x40 arch/x86/include/asm/idtentry.h:640
RIP: 0010:__do_softirq+0x196/0x9c6 kernel/softirq.c:557
Code: 00 48 01 f0 48 89 44 24 18 48 c7 c7 c0 41 eb 89 e8 5f ff be ff 65 66 c7 05 35 94 43 76 00 00 e8 70 ab c1 f7 fb b8 ff ff ff ff <48> c7 c3 c0 a0 c0 8b 41 0f bc c5 41 89 c7 41 83 c7 01 0f 85 ad 00
RSP: 0018:ffffc90000007f70 EFLAGS: 00000206
RAX: 00000000ffffffff RBX: ffff888066ee2140 RCX: 1ffffffff211cc2e
RDX: 0000000000000000 RSI: 0000000000000102 RDI: 0000000000000000
RBP: ffff8880125241c0 R08: 0000000000000001 R09: ffffffff908d9967
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000080 R14: 0000000000000000 R15: 0000000000000000
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1106
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20 arch/x86/include/asm/idtentry.h:649
RIP: 0010:lock_acquire+0x1ef/0x570 kernel/locking/lockdep.c:5634
Code: d2 a3 7e 83 f8 01 0f 85 e8 02 00 00 9c 58 f6 c4 02 0f 85 fb 02 00 00 48 83 7c 24 08 00 74 01 fb 48 b8 00 00 00 00 00 fc ff df <48> 01 c3 48 c7 03 00 00 00 00 48 c7 43 08 00 00 00 00 48 8b 84 24
RSP: 0018:ffffc9000357fa48 EFLAGS: 00000206
RAX: dffffc0000000000 RBX: 1ffff920006aff4b RCX: ffffffff815e513e
RDX: 1ffff1100cddc576 RSI: 0000000000000002 RDI: 0000000000000000
RBP: 0000000000000001 R08: 0000000000000000 R09: ffffffff908d9967
R10: fffffbfff211b32c R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: ffff88801b98ca20 R15: 0000000000000000
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:349 [inline]
exit_fs+0x62/0x170 fs/fs_struct.c:102
do_exit+0xaa6/0x29b0 kernel/exit.c:791
do_group_exit+0xd2/0x2f0 kernel/exit.c:925
get_signal+0x238c/0x2610 kernel/signal.c:2857
arch_do_signal_or_restart+0x82/0x2300 arch/x86/kernel/signal.c:869
exit_to_user_mode_loop kernel/entry/common.c:166 [inline]
exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:201
__syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline]
syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:294
do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f0b07489279
Code: Unable to access opcode bytes at RIP 0x7f0b0748924f.
RSP: 002b:00007f0b085a3218 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00007f0b0759bf88 RCX: 00007f0b07489279
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f0b0759bf88
RBP: 00007f0b0759bf80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f0b0759bf8c
R13: 00007ffdfa1bfd4f R14: 00007f0b085a3300 R15: 0000000000022000
</TASK>
----------------
Code disassembly (best guess):
0: 00 48 01 add %cl,0x1(%rax)
3: f0 48 89 44 24 18 lock mov %rax,0x18(%rsp)
9: 48 c7 c7 c0 41 eb 89 mov $0xffffffff89eb41c0,%rdi
10: e8 5f ff be ff callq 0xffbeff74
15: 65 66 c7 05 35 94 43 movw $0x0,%gs:0x76439435(%rip) # 0x76439454
1c: 76 00 00
1f: e8 70 ab c1 f7 callq 0xf7c1ab94
24: fb sti
25: b8 ff ff ff ff mov $0xffffffff,%eax
* 2a: 48 c7 c3 c0 a0 c0 8b mov $0xffffffff8bc0a0c0,%rbx <-- trapping instruction
31: 41 0f bc c5 bsf %r13d,%eax
35: 41 89 c7 mov %eax,%r15d
38: 41 83 c7 01 add $0x1,%r15d
3c: 0f .byte 0xf
3d: 85 .byte 0x85
3e: ad lods %ds:(%rsi),%eax

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Tetsuo Handa

unread,

Sep 4, 2022, 2:09:51 AM9/4/22

to Eric Van Hensbergen, Latchesar Ionkov, Dominique Martinet, Christian Schoenebeck, syzbot, v9fs-de...@lists.sourceforge.net, syzkall...@googlegroups.com, net...@vger.kernel.org

syzbot is reporting inconsistent lock state in p9_req_put(), for
p9_tag_remove() from p9_req_put() from IRQ context is using
spin_lock_irqsave() on "struct p9_client"->lock but other locations
not from IRQ context are using spin_lock().

Since spin_lock() => spin_lock_irqsave() conversion on this lock will
needlessly disable IRQ for infrequent event, and p9_tag_remove() needs
to disable IRQ only for modifying IDR (RCU read lock can be used for
reading IDR), let's introduce a spinlock dedicated for serializing
idr_alloc()/idr_alloc_u32()/idr_remove() calls. Since this spinlock
will be held as innermost lock, circular locking dependency problem
won't happen by adding this spinlock.

Link: https://syzkaller.appspot.com/bug?extid=2f20b523930c32c160cc [1]
Reported-by: syzbot <syzbot+2f20b5...@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
---
This patch is not tested, for reproducer is not available.

net/9p/client.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/net/9p/client.c b/net/9p/client.c
index 0a6110e15d0f..20f0a2d8dd38 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -28,6 +28,8 @@
#define CREATE_TRACE_POINTS
#include <trace/events/9p.h>

+static DEFINE_SPINLOCK(p9_idr_lock);
+
#define DEFAULT_MSIZE (128 * 1024)

/* Client Option Parsing (code inspired by NFS code)
@@ -283,14 +285,14 @@ p9_tag_alloc(struct p9_client *c, int8_t type, unsigned int max_size)
INIT_LIST_HEAD(&req->req_list);

idr_preload(GFP_NOFS);
- spin_lock_irq(&c->lock);
+ spin_lock_irq(&p9_idr_lock);
if (type == P9_TVERSION)
tag = idr_alloc(&c->reqs, req, P9_NOTAG, P9_NOTAG + 1,
GFP_NOWAIT);
else
tag = idr_alloc(&c->reqs, req, 0, P9_NOTAG, GFP_NOWAIT);
req->tc.tag = tag;
- spin_unlock_irq(&c->lock);
+ spin_unlock_irq(&p9_idr_lock);
idr_preload_end();
if (tag < 0)
goto free;
@@ -364,9 +366,9 @@ static void p9_tag_remove(struct p9_client *c, struct p9_req_t *r)
u16 tag = r->tc.tag;

p9_debug(P9_DEBUG_MUX, "freeing clnt %p req %p tag: %d\n", c, r, tag);
- spin_lock_irqsave(&c->lock, flags);
+ spin_lock_irqsave(&p9_idr_lock, flags);
idr_remove(&c->reqs, tag);
- spin_unlock_irqrestore(&c->lock, flags);
+ spin_unlock_irqrestore(&p9_idr_lock, flags);
}

int p9_req_put(struct p9_client *c, struct p9_req_t *r)
@@ -813,10 +815,10 @@ static struct p9_fid *p9_fid_create(struct p9_client *clnt)
refcount_set(&fid->count, 1);

idr_preload(GFP_KERNEL);
- spin_lock_irq(&clnt->lock);
+ spin_lock_irq(&p9_idr_lock);
ret = idr_alloc_u32(&clnt->fids, fid, &fid->fid, P9_NOFID - 1,
GFP_NOWAIT);
- spin_unlock_irq(&clnt->lock);
+ spin_unlock_irq(&p9_idr_lock);
idr_preload_end();
if (!ret) {
trace_9p_fid_ref(fid, P9_FID_REF_CREATE);
@@ -835,9 +837,9 @@ static void p9_fid_destroy(struct p9_fid *fid)
p9_debug(P9_DEBUG_FID, "fid %d\n", fid->fid);
trace_9p_fid_ref(fid, P9_FID_REF_DESTROY);
clnt = fid->clnt;
- spin_lock_irqsave(&clnt->lock, flags);
+ spin_lock_irqsave(&p9_idr_lock, flags);
idr_remove(&clnt->fids, fid->fid);
- spin_unlock_irqrestore(&clnt->lock, flags);
+ spin_unlock_irqrestore(&p9_idr_lock, flags);
kfree(fid->rdir);
kfree(fid);
}
--
2.34.1

Dominique Martinet

unread,

Sep 4, 2022, 2:36:39 AM9/4/22

to Tetsuo Handa, Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, syzbot, v9fs-de...@lists.sourceforge.net, syzkall...@googlegroups.com, net...@vger.kernel.org

Tetsuo Handa wrote on Sun, Sep 04, 2022 at 03:09:28PM +0900:
> syzbot is reporting inconsistent lock state in p9_req_put(), for
> p9_tag_remove() from p9_req_put() from IRQ context is using
> spin_lock_irqsave() on "struct p9_client"->lock but other locations
> not from IRQ context are using spin_lock().

Ah, I was wondering what this problem could have been, but it's yet
another instance of trans_fd abusing the client's lock when it really
should get its own...
I didn't realize mixing spin_lock_irq*() and spin_lock() was the
problem, thank you.

> Since spin_lock() => spin_lock_irqsave() conversion on this lock will
> needlessly disable IRQ for infrequent event, and p9_tag_remove() needs
> to disable IRQ only for modifying IDR (RCU read lock can be used for
> reading IDR), let's introduce a spinlock dedicated for serializing
> idr_alloc()/idr_alloc_u32()/idr_remove() calls. Since this spinlock
> will be held as innermost lock, circular locking dependency problem
> won't happen by adding this spinlock.

We have an idr per client though so this is needlessly adding contention
between multiple 9p mounts.

That probably doesn't matter too much in the general case, but adding a
different lock per connection is cheap so let's do it the other way
around: could you add a lock to the p9_conn struct in net/9p/trans_fd.c
and replace c->lock by it there?
That should have identical effect as other transports don't use client's
.lock ; and the locking in trans_fd.c is to protect req's status and
trans_fd's own lists which is orthogonal to client's lock that protects
tag allocations. I agree it should work in theory.

(If you don't have time to do this this patch has been helpful enough and
I can do it eventually)

Thanks,
--
Dominique

Tetsuo Handa

unread,

Sep 4, 2022, 3:06:56 AM9/4/22

to Dominique Martinet, Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, syzbot, v9fs-de...@lists.sourceforge.net, syzkall...@googlegroups.com, net...@vger.kernel.org

syzbot is reporting inconsistent lock state in p9_req_put(), for
p9_tag_remove() from p9_req_put() from IRQ context is using
spin_lock_irqsave() on "struct p9_client"->lock but other locations
not from IRQ context are using spin_lock().

Since spin_lock() => spin_lock_irqsave() conversion on this lock will
needlessly disable IRQ for infrequent event, and p9_tag_remove() needs
to disable IRQ only for modifying IDR (RCU read lock can be used for
reading IDR), let's introduce a spinlock dedicated for serializing
idr_alloc()/idr_alloc_u32()/idr_remove() calls. Since this spinlock
will be held as innermost lock, circular locking dependency problem
won't happen by adding this spinlock.

Link: https://syzkaller.appspot.com/bug?extid=2f20b523930c32c160cc [1]
Reported-by: syzbot <syzbot+2f20b5...@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
---

Changes in v2:
Make this spinlock per "struct p9_client", though I don't know how we
should update "@lock" when "@idr_lock" also protects @fids and @reqs...

/**
* struct p9_client - per client instance state
* @lock: protect @fids and @reqs
* @msize: maximum data size negotiated by protocol
* @proto_version: 9P protocol version to use
* @trans_mod: module API instantiated with this client
* @status: connection state
* @trans: tranport instance state and API
* @fids: All active FID handles
* @reqs: All active requests.
+ * @idr_lock: protect @fids and @reqs when modifying IDR
* @name: node name used as client id
*
* The client structure is used to keep track of various per-client
* state that has been instantiated.
*/

include/net/9p/client.h | 2 ++
net/9p/client.c | 17 +++++++++--------
2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/include/net/9p/client.h b/include/net/9p/client.h
index 78ebcf782ce5..5edb3bd144fc 100644
--- a/include/net/9p/client.h
+++ b/include/net/9p/client.h
@@ -94,6 +94,7 @@ struct p9_req_t {
* @trans: tranport instance state and API
* @fids: All active FID handles
* @reqs: All active requests.
+ * @idr_lock: protect @fids and @reqs when modifying IDR
* @name: node name used as client id
*
* The client structure is used to keep track of various per-client
@@ -122,6 +123,7 @@ struct p9_client {

struct idr fids;
struct idr reqs;
+ spinlock_t idr_lock;

char name[__NEW_UTS_LEN + 1];
};
diff --git a/net/9p/client.c b/net/9p/client.c
index 0a6110e15d0f..9c801b49431d 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -283,14 +283,14 @@ p9_tag_alloc(struct p9_client *c, int8_t type, unsigned int max_size)

INIT_LIST_HEAD(&req->req_list);

idr_preload(GFP_NOFS);
- spin_lock_irq(&c->lock);

+ spin_lock_irq(&c->idr_lock);

if (type == P9_TVERSION)
tag = idr_alloc(&c->reqs, req, P9_NOTAG, P9_NOTAG + 1,
GFP_NOWAIT);
else
tag = idr_alloc(&c->reqs, req, 0, P9_NOTAG, GFP_NOWAIT);
req->tc.tag = tag;
- spin_unlock_irq(&c->lock);

+ spin_unlock_irq(&c->idr_lock);

idr_preload_end();
if (tag < 0)
goto free;

@@ -364,9 +364,9 @@ static void p9_tag_remove(struct p9_client *c, struct p9_req_t *r)

u16 tag = r->tc.tag;

p9_debug(P9_DEBUG_MUX, "freeing clnt %p req %p tag: %d\n", c, r, tag);
- spin_lock_irqsave(&c->lock, flags);

+ spin_lock_irqsave(&c->idr_lock, flags);

idr_remove(&c->reqs, tag);
- spin_unlock_irqrestore(&c->lock, flags);

+ spin_unlock_irqrestore(&c->idr_lock, flags);

}

int p9_req_put(struct p9_client *c, struct p9_req_t *r)

@@ -813,10 +813,10 @@ static struct p9_fid *p9_fid_create(struct p9_client *clnt)

refcount_set(&fid->count, 1);

idr_preload(GFP_KERNEL);
- spin_lock_irq(&clnt->lock);

+ spin_lock_irq(&clnt->idr_lock);

ret = idr_alloc_u32(&clnt->fids, fid, &fid->fid, P9_NOFID - 1,
GFP_NOWAIT);
- spin_unlock_irq(&clnt->lock);

+ spin_unlock_irq(&clnt->idr_lock);

idr_preload_end();
if (!ret) {
trace_9p_fid_ref(fid, P9_FID_REF_CREATE);

@@ -835,9 +835,9 @@ static void p9_fid_destroy(struct p9_fid *fid)

p9_debug(P9_DEBUG_FID, "fid %d\n", fid->fid);
trace_9p_fid_ref(fid, P9_FID_REF_DESTROY);
clnt = fid->clnt;
- spin_lock_irqsave(&clnt->lock, flags);

+ spin_lock_irqsave(&clnt->idr_lock, flags);

idr_remove(&clnt->fids, fid->fid);
- spin_unlock_irqrestore(&clnt->lock, flags);

+ spin_unlock_irqrestore(&clnt->idr_lock, flags);
kfree(fid->rdir);
kfree(fid);
}
@@ -943,6 +943,7 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
memcpy(clnt->name, client_id, strlen(client_id) + 1);

spin_lock_init(&clnt->lock);
+ spin_lock_init(&clnt->idr_lock);
idr_init(&clnt->fids);
idr_init(&clnt->reqs);

--
2.34.1

Tetsuo Handa

unread,

Sep 4, 2022, 3:17:55 AM9/4/22

to Dominique Martinet, Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, syzbot, v9fs-de...@lists.sourceforge.net, syzkall...@googlegroups.com, net...@vger.kernel.org

On 2022/09/04 15:36, Dominique Martinet wrote:
> We have an idr per client though so this is needlessly adding contention
> between multiple 9p mounts.
>
> That probably doesn't matter too much in the general case,

I thought this is not a hot operation where contention matters.

> but adding a
> different lock per connection is cheap so let's do it the other way
> around: could you add a lock to the p9_conn struct in net/9p/trans_fd.c
> and replace c->lock by it there?

Not impossible, but may not be cheap for lockdep.

> That should have identical effect as other transports don't use client's
> .lock ; and the locking in trans_fd.c is to protect req's status and
> trans_fd's own lists which is orthogonal to client's lock that protects
> tag allocations. I agree it should work in theory.
>
> (If you don't have time to do this this patch has been helpful enough and
> I can do it eventually)

Sent as https://lkml.kernel.org/r/68540a56-6a5f-87e9...@I-love.SAKURA.ne.jp .

By the way, I think credit for the patch on https://syzkaller.appspot.com/bug?extid=50f7e8d06c3768dd97f3 goes to Hillf Danton...

Dominique Martinet

unread,

Sep 4, 2022, 3:55:51 AM9/4/22

to Tetsuo Handa, Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, syzbot, v9fs-de...@lists.sourceforge.net, syzkall...@googlegroups.com, net...@vger.kernel.org

Tetsuo Handa wrote on Sun, Sep 04, 2022 at 04:06:36PM +0900:
> Changes in v2:
> Make this spinlock per "struct p9_client", though I don't know how we
> should update "@lock" when "@idr_lock" also protects @fids and @reqs...

Sorry for the trouble, this is not what I meant: this v2 makes 'lock'
being unused except for trans_fd, which isn't optimal for other
transports like e.g. virtio or rdma.

In hindsight it's probably faster to just send a diff... Happy to keep
you as author if you'd like, or sign off or whatever you prefer -- I
wouldn't have guessed what that report meant without you.

(Reply to your other mail after the diff chunk)

---
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index ef5760971f1e..5b4807411281 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -91,6 +91,7 @@ struct p9_poll_wait {
* @mux_list: list link for mux to manage multiple connections (?)
* @client: reference to client instance for this connection
* @err: error state
+ * @req_lock: lock protecting req_list and requests statuses
* @req_list: accounting for requests which have been sent
* @unsent_req_list: accounting for requests that haven't been sent
* @rreq: read request
@@ -114,6 +115,7 @@ struct p9_conn {
struct list_head mux_list;
struct p9_client *client;
int err;
+ spinlock_t req_lock;
struct list_head req_list;
struct list_head unsent_req_list;
struct p9_req_t *rreq;
@@ -189,10 +191,10 @@ static void p9_conn_cancel(struct p9_conn *m, int err)

p9_debug(P9_DEBUG_ERROR, "mux %p err %d\n", m, err);

- spin_lock(&m->client->lock);
+ spin_lock(&m->req_lock);

if (m->err) {
- spin_unlock(&m->client->lock);
+ spin_unlock(&m->req_lock);
return;
}

@@ -205,7 +207,7 @@ static void p9_conn_cancel(struct p9_conn *m, int err)
list_move(&req->req_list, &cancel_list);
}

- spin_unlock(&m->client->lock);
+ spin_unlock(&m->req_lock);

list_for_each_entry_safe(req, rtmp, &cancel_list, req_list) {
p9_debug(P9_DEBUG_ERROR, "call back req %p\n", req);
@@ -360,7 +362,7 @@ static void p9_read_work(struct work_struct *work)
if ((m->rreq) && (m->rc.offset == m->rc.capacity)) {
p9_debug(P9_DEBUG_TRANS, "got new packet\n");
m->rreq->rc.size = m->rc.offset;
- spin_lock(&m->client->lock);
+ spin_lock(&m->req_lock);
if (m->rreq->status == REQ_STATUS_SENT) {
list_del(&m->rreq->req_list);
p9_client_cb(m->client, m->rreq, REQ_STATUS_RCVD);
@@ -369,14 +371,14 @@ static void p9_read_work(struct work_struct *work)
p9_debug(P9_DEBUG_TRANS,
"Ignore replies associated with a cancelled request\n");
} else {
- spin_unlock(&m->client->lock);
+ spin_unlock(&m->req_lock);
p9_debug(P9_DEBUG_ERROR,
"Request tag %d errored out while we were reading the reply\n",
m->rc.tag);
err = -EIO;
goto error;
}
- spin_unlock(&m->client->lock);
+ spin_unlock(&m->req_lock);
m->rc.sdata = NULL;
m->rc.offset = 0;
m->rc.capacity = 0;
@@ -454,10 +456,10 @@ static void p9_write_work(struct work_struct *work)
}

if (!m->wsize) {
- spin_lock(&m->client->lock);
+ spin_lock(&m->req_lock);
if (list_empty(&m->unsent_req_list)) {
clear_bit(Wworksched, &m->wsched);
- spin_unlock(&m->client->lock);
+ spin_unlock(&m->req_lock);
return;
}

@@ -472,7 +474,7 @@ static void p9_write_work(struct work_struct *work)
m->wpos = 0;
p9_req_get(req);
m->wreq = req;
- spin_unlock(&m->client->lock);
+ spin_unlock(&m->req_lock);
}

p9_debug(P9_DEBUG_TRANS, "mux %p pos %d size %d\n",
@@ -670,10 +672,10 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)
if (m->err < 0)
return m->err;

- spin_lock(&client->lock);
+ spin_lock(&m->req_lock);
req->status = REQ_STATUS_UNSENT;
list_add_tail(&req->req_list, &m->unsent_req_list);
- spin_unlock(&client->lock);
+ spin_unlock(&m->req_lock);

if (test_and_clear_bit(Wpending, &m->wsched))
n = EPOLLOUT;
@@ -688,11 +690,13 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)

static int p9_fd_cancel(struct p9_client *client, struct p9_req_t *req)
{
+ struct p9_trans_fd *ts = client->trans;
+ struct p9_conn *m = &ts->conn;
int ret = 1;

p9_debug(P9_DEBUG_TRANS, "client %p req %p\n", client, req);

- spin_lock(&client->lock);
+ spin_lock(&m->req_lock);

if (req->status == REQ_STATUS_UNSENT) {
list_del(&req->req_list);
@@ -700,21 +704,24 @@ static int p9_fd_cancel(struct p9_client *client, struct p9_req_t *req)
p9_req_put(client, req);
ret = 0;
}
- spin_unlock(&client->lock);
+ spin_unlock(&m->req_lock);

return ret;
}

static int p9_fd_cancelled(struct p9_client *client, struct p9_req_t *req)
{
+ struct p9_trans_fd *ts = client->trans;
+ struct p9_conn *m = &ts->conn;
+
p9_debug(P9_DEBUG_TRANS, "client %p req %p\n", client, req);

- spin_lock(&client->lock);
+ spin_lock(&m->req_lock);
/* Ignore cancelled request if message has been received
* before lock.
*/
if (req->status == REQ_STATUS_RCVD) {
- spin_unlock(&client->lock);
+ spin_unlock(&m->req_lock);
return 0;
}

@@ -723,7 +730,8 @@ static int p9_fd_cancelled(struct p9_client *client, struct p9_req_t *req)
*/
list_del(&req->req_list);
req->status = REQ_STATUS_FLSHD;
- spin_unlock(&client->lock);
+ spin_unlock(&m->req_lock);
+
p9_req_put(client, req);

return 0;
@@ -832,6 +840,7 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd)

client->trans = ts;
client->status = Connected;
+ spin_lock_init(&ts->conn.req_lock);

return 0;

@@ -866,6 +875,7 @@ static int p9_socket_open(struct p9_client *client, struct socket *csocket)
p->wr = p->rd = file;
client->trans = p;
client->status = Connected;
+ spin_lock_init(&p->conn.req_lock);

p->rd->f_flags |= O_NONBLOCK;

---

Tetsuo Handa wrote on Sun, Sep 04, 2022 at 04:17:37PM +0900:
> On 2022/09/04 15:36, Dominique Martinet wrote:
> > We have an idr per client though so this is needlessly adding contention
> > between multiple 9p mounts.
> >
> > That probably doesn't matter too much in the general case,
>
> I thought this is not a hot operation where contention matters.

Each IO on the filesystem will take this lock, so while I assume idr
will likely be orders of magnitude faster than any network call we might
as well keep it as separate as possible.
I don't know.

> > but adding a
> > different lock per connection is cheap so let's do it the other way
> > around: could you add a lock to the p9_conn struct in net/9p/trans_fd.c
> > and replace c->lock by it there?
>
> Not impossible, but may not be cheap for lockdep.

It's still a single lock per mount, for most syzcaller tests with a
single mount call it should be identical?

> > That should have identical effect as other transports don't use client's
> > .lock ; and the locking in trans_fd.c is to protect req's status and
> > trans_fd's own lists which is orthogonal to client's lock that protects
> > tag allocations. I agree it should work in theory.
> >
> > (If you don't have time to do this this patch has been helpful enough and
> > I can do it eventually)
>
> Sent as https://lkml.kernel.org/r/68540a56-6a5f-87e9...@I-love.SAKURA.ne.jp .
>
> By the way, I think credit for the patch on
> https://syzkaller.appspot.com/bug?extid=50f7e8d06c3768dd97f3 goes to
> Hillf Danton...

Eh, with your link I'd agree, but I never got any mail from him.

Looking at the patch link, he dropped v9fs-developer, netdev and all the
maintainers from his patch (kept syzcaller, linux-kernel@vger and
syzbot) so I don't think I can realisticly be expected to know he
submitted a patch...

As a matter of fact I've implemented the exact same solution on Aug 17 a
week later, and another first time contributor sent another patch on
Sept 1st as I didn't send a separate mail for it either; this is a bit
ridiculous... Would have saved me time if he'd just kept the bare
minimum of Ccs :/

Well, I honestly don't care -- he was first so sure, if he speaks up I
can change the author, but I'm definitely not going to go out of my way
for someone who didn't help me; I already don't have enough time for
9p...

--
Dominique

Tetsuo Handa

unread,

Sep 4, 2022, 6:02:22 AM9/4/22

to Dominique Martinet, Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, syzbot, v9fs-de...@lists.sourceforge.net, syzkall...@googlegroups.com, net...@vger.kernel.org

On 2022/09/04 16:55, Dominique Martinet wrote:
> Tetsuo Handa wrote on Sun, Sep 04, 2022 at 04:06:36PM +0900:
>> Changes in v2:
>> Make this spinlock per "struct p9_client", though I don't know how we
>> should update "@lock" when "@idr_lock" also protects @fids and @reqs...
>
> Sorry for the trouble, this is not what I meant: this v2 makes 'lock'
> being unused except for trans_fd, which isn't optimal for other
> transports like e.g. virtio or rdma.

v1 was smaller, and I thought frequency of calling
idr_alloc()/idr_alloc_u32()/idr_remove() is low enough to justify
use of global spinlock.

>
> In hindsight it's probably faster to just send a diff... Happy to keep
> you as author if you'd like, or sign off or whatever you prefer -- I
> wouldn't have guessed what that report meant without you.

This diff is bigger than I can guess correctness. Maybe v1 patch should be
applied as a fix for this problem (because including Reported-by: or Fixes:
likely makes patches be automatically picked up by stable kernel maintainers
before syzbot tests for a while) and your patch should be applied as an improvement
(i.e. omit Reported-by: and Fixes: ). You can manually request for inclusion into
stable kernels after syzbot tested for a while.

> Eh, with your link I'd agree, but I never got any mail from him.

Too bad. Hillf is proposing patches in many bugs, but it seems that
he does not try to propose as formal patches with description.

Dominique Martinet

unread,

Sep 4, 2022, 6:53:06 AM9/4/22

to Tetsuo Handa, Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, syzbot, v9fs-de...@lists.sourceforge.net, syzkall...@googlegroups.com, net...@vger.kernel.org

Tetsuo Handa wrote on Sun, Sep 04, 2022 at 07:02:04PM +0900:
> > In hindsight it's probably faster to just send a diff... Happy to keep
> > you as author if you'd like, or sign off or whatever you prefer -- I
> > wouldn't have guessed what that report meant without you.
>
> This diff is bigger than I can guess correctness. Maybe v1 patch should be
> applied as a fix for this problem (because including Reported-by: or Fixes:
> likely makes patches be automatically picked up by stable kernel maintainers
> before syzbot tests for a while) and your patch should be applied as an improvement
> (i.e. omit Reported-by: and Fixes: ). You can manually request for inclusion into
> stable kernels after syzbot tested for a while.

Hmm. The diff is bigger but the change really is equivalent: that
client->lock is only ever used in client.c and trans_fd.c, you replaced
all the occurences in client.c (3 locations + init) while I replaced all
the occurences in trans_fd.c (6 locations + init); the end result is
the same of splitting the two locks exactly at the same place; as far as
correctness goes the patches are identical.

The diff is a bit bigger but the result is more maintainable, and both
versions would require trivial context adjustments to backport anyway
because of bd873038aed5 ("net/9p: allocate appropriate reduced message
buffers") which conflict with either version...
I don't think this warrants the overhead of splitting the patch; sorry.

(and anyway Sasha Levin's autopicker seems to pick almost everything 9p,
said bd873038aed5 was backported down to 5.15 so these will have
backport for free on either version)

Back on topic, assuming you don't strongly oppose to keeping my version,
what tags should I add to the patch?
Reported-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
undersells your work, but I don't want to add something like
Co-authored-by without your permission.

Thanks,
--
Dominique

Tetsuo Handa

unread,

Sep 4, 2022, 6:59:12 AM9/4/22

to Dominique Martinet, Eric Van Hensbergen, Latchesar Ionkov, Christian Schoenebeck, syzbot, v9fs-de...@lists.sourceforge.net, syzkall...@googlegroups.com, net...@vger.kernel.org

On 2022/09/04 19:52, Dominique Martinet wrote:
> Back on topic, assuming you don't strongly oppose to keeping my version,
> what tags should I add to the patch?
> Reported-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
> undersells your work, but I don't want to add something like
> Co-authored-by without your permission.

Regarding this problem, Reported-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp> is sufficient.

Dominique Martinet

unread,

Sep 4, 2022, 7:29:41 AM9/4/22

to v9fs-de...@lists.sourceforge.net, Tetsuo Handa, syzkall...@googlegroups.com, linu...@crudebyte.com, linux-...@vger.kernel.org, Dominique Martinet, syzbot

Shamelessly copying the explanation from Tetsuo Handa's suggested
patch[1] (slightly reworded):
syzbot is reporting inconsistent lock state in p9_req_put()[2],

for p9_tag_remove() from p9_req_put() from IRQ context is using

spin_lock_irqsave() on "struct p9_client"->lock but trans_fd
(not from IRQ context) is using spin_lock().

Since the locks actually protect different things in client.c and in
trans_fd.c, just replace trans_fd.c's lock by a new one specific to the
transport instead of using spin_lock_irq* variants everywhere
(client.c's protect the idr for tag allocations, while
trans_fd.c's protects its own req list and request status field
that acts as the transport's state machine)

Link: https://lkml.kernel.org/r/2470e028-9b05-2013...@I-love.SAKURA.ne.jp [1]
Link: https://syzkaller.appspot.com/bug?extid=2f20b523930c32c160cc [2]
Reported-by: syzbot <syzbot+2f20b5...@syzkaller.appspotmail.com>
Reported-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Signed-off-by: Dominique Martinet <asma...@codewreck.org>
---
Tetsuo Handa-san, thank you very much!
I'm sorry for not respecting your opinion but it's been a pleasure to
have submissions from someone on JST :)

Both this and your previous patch only impact trans_fd which I can't
test super easily, so while I've sent the patch here I'll only queue it
to -next hopefully next week after I've had time to setup a compatible
server again...

net/9p/trans_fd.c | 42 ++++++++++++++++++++++++++----------------
1 file changed, 26 insertions(+), 16 deletions(-)

2.35.1

Tetsuo Handa

unread,

Sep 4, 2022, 9:04:05 AM9/4/22

to Dominique Martinet, v9fs-de...@lists.sourceforge.net, syzkall...@googlegroups.com, linu...@crudebyte.com, linux-...@vger.kernel.org, syzbot

On 2022/09/04 20:29, Dominique Martinet wrote:
> Since the locks actually protect different things in client.c and in
> trans_fd.c, just replace trans_fd.c's lock by a new one specific to the
> transport instead of using spin_lock_irq* variants everywhere
> (client.c's protect the idr for tag allocations, while
> trans_fd.c's protects its own req list and request status field
> that acts as the transport's state machine)

OK, I figured out what this patch changes.

$ grep -nF -- '->lock' *.[ch]
client.c:286: spin_lock_irq(&c->lock);
client.c:293: spin_unlock_irq(&c->lock);
client.c:367: spin_lock_irqsave(&c->lock, flags);
client.c:369: spin_unlock_irqrestore(&c->lock, flags);
client.c:816: spin_lock_irq(&clnt->lock);
client.c:819: spin_unlock_irq(&clnt->lock);
client.c:838: spin_lock_irqsave(&clnt->lock, flags);
client.c:840: spin_unlock_irqrestore(&clnt->lock, flags);
client.c:945: spin_lock_init(&clnt->lock);
trans_virtio.c:139: spin_lock_irqsave(&chan->lock, flags);
trans_virtio.c:151: spin_unlock_irqrestore(&chan->lock, flags);
trans_virtio.c:268: spin_lock_irqsave(&chan->lock, flags);
trans_virtio.c:287: spin_unlock_irqrestore(&chan->lock, flags);
trans_virtio.c:296: spin_unlock_irqrestore(&chan->lock, flags);
trans_virtio.c:303: spin_unlock_irqrestore(&chan->lock, flags);
trans_virtio.c:474: spin_lock_irqsave(&chan->lock, flags);
trans_virtio.c:515: spin_unlock_irqrestore(&chan->lock, flags);
trans_virtio.c:524: spin_unlock_irqrestore(&chan->lock, flags);
trans_virtio.c:532: spin_unlock_irqrestore(&chan->lock, flags);
trans_virtio.c:621: spin_lock_init(&chan->lock);
trans_xen.c:142: spin_lock_irqsave(&ring->lock, flags);
trans_xen.c:149: spin_unlock_irqrestore(&ring->lock, flags);
trans_xen.c:164: spin_unlock_irqrestore(&ring->lock, flags);
trans_xen.c:314: spin_lock_init(&ring->lock);

This patch changes "struct p9_client"->lock to be used for only
protecting idr, as explained at

* @lock: protect @fids and @reqs

line. Makes sense and looks correct.

> Tetsuo Handa-san, thank you very much!
> I'm sorry for not respecting your opinion but it's been a pleasure to
> have submissions from someone on JST :)

Thank you for responding quickly. :-)

Christian Schoenebeck

unread,

Oct 6, 2022, 9:17:20 AM10/6/22

to v9fs-de...@lists.sourceforge.net, Tetsuo Handa, Dominique Martinet, syzbot, syzkall...@googlegroups.com, linux-...@vger.kernel.org

Late on the party, sorry. Note that you already queued a slightly different
version than this patch here, anyway, one thing ...

Are you sure this is the right place for spin_lock_init()? Not rather in
p9_conn_create()?

> return 0;
>
> @@ -866,6 +875,7 @@ static int p9_socket_open(struct p9_client *client,
> struct socket *csocket) p->wr = p->rd = file;
> client->trans = p;
> client->status = Connected;
> + spin_lock_init(&p->conn.req_lock);

Same here.

>
> p->rd->f_flags |= O_NONBLOCK;

The rest LGTM.

Best regards,
Christian Schoenebeck

Dominique Martinet

unread,

Oct 6, 2022, 9:06:01 PM10/6/22

to Christian Schoenebeck, v9fs-de...@lists.sourceforge.net, Tetsuo Handa, syzbot, syzkall...@googlegroups.com, linux-...@vger.kernel.org

Christian Schoenebeck wrote on Thu, Oct 06, 2022 at 03:16:40PM +0200:
> > net/9p/trans_fd.c | 42 ++++++++++++++++++++++++++----------------
> > 1 file changed, 26 insertions(+), 16 deletions(-)
>
> Late on the party, sorry. Note that you already queued a slightly different
> version than this patch here, anyway, one thing ...

Did I? Oh, I apparently reworded the commit message a bit, sorry:

(where HEAD is this patch and 1622... the queued patch)

$ git range-diff HEAD^- 16228c9a4215^-
1: e35fb8546af2 ! 1: 16228c9a4215 net/9p: use a dedicated spinlock for trans_fd
@@ Commit message

Since the locks actually protect different things in client.c and in
trans_fd.c, just replace trans_fd.c's lock by a new one specific to the

- transport instead of using spin_lock_irq* variants everywhere
- (client.c's protect the idr for tag allocations, while
- trans_fd.c's protects its own req list and request status field
+ transport (client.c's protect the idr for fid/tag allocations,
+ while trans_fd.c's protects its own req list and request status field

that acts as the transport's state machine)

- Link: https://lkml.kernel.org/r/20220904112928.1...@codewreck.org
+ Link: https://lore.kernel.org/r/20220904112928.1...@codewreck.org
Link: https://lkml.kernel.org/r/2470e028-9b05-2013...@I-love.SAKURA.ne.jp [1]

Link: https://syzkaller.appspot.com/bug?extid=2f20b523930c32c160cc [2]
Reported-by: syzbot <syzbot+2f20b5...@syzkaller.appspotmail.com>

> > @@ -832,6 +840,7 @@ static int p9_fd_open(struct p9_client *client, int rfd,
> > int wfd)
> >
> > client->trans = ts;
> > client->status = Connected;
> > + spin_lock_init(&ts->conn.req_lock);
>
> Are you sure this is the right place for spin_lock_init()? Not rather in
> p9_conn_create()?

Good point, 'ts->conn' (named... m over there for some reason...) is
mostly initialized in p9_conn_create; it makes much more sense to move
it there despite being slightly further away from the allocation.

It's a bit of a pain to check as the code is spread over many paths (fd,
unix, tcp) but it looks equivalent to me:
- p9_fd_open is only called from p9_fd_create which immediately calls
p9_conn_create
- below p9_socket_open itself immediately calls p9_conn_create

I've moved the init and updated my next branch after very basic check
https://github.com/martinetd/linux/commit/e5cfd99e9ea6c13b3f0134585f269c509247ac0e:
----
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index 5b4807411281..d407f31bb49d 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -591,6 +591,7 @@ static void p9_conn_create(struct p9_client *client)
INIT_LIST_HEAD(&m->mux_list);
m->client = client;

+ spin_lock_init(&m->req_lock);
INIT_LIST_HEAD(&m->req_list);
INIT_LIST_HEAD(&m->unsent_req_list);
INIT_WORK(&m->rq, p9_read_work);
@@ -840,7 +841,6 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd)

client->trans = ts;
client->status = Connected;

- spin_lock_init(&ts->conn.req_lock);

return 0;

@@ -875,7 +875,6 @@ static int p9_socket_open(struct p9_client *client, struct socket *csocket)

p->wr = p->rd = file;
client->trans = p;
client->status = Connected;

- spin_lock_init(&p->conn.req_lock);

p->rd->f_flags |= O_NONBLOCK;

----

> The rest LGTM.

Thank you for the look :)

--
Dominique

Christian Schoenebeck

unread,

Oct 7, 2022, 5:29:21 AM10/7/22

to Dominique Martinet, v9fs-de...@lists.sourceforge.net, Tetsuo Handa, syzbot, syzkall...@googlegroups.com, linux-...@vger.kernel.org

No, that's not what I meant, but my bad, it was the following chunk that
didn't apply here:

diff a/net/9p/trans_fd.c b/net/9p/trans_fd.c (rejected hunks)

@@ -205,7 +207,7 @@ static void p9_conn_cancel(struct p9_conn *m, int err)
list_move(&req->req_list, &cancel_list);
}

- spin_unlock(&m->client->lock);
+ spin_unlock(&m->req_lock);

list_for_each_entry_safe(req, rtmp, &cancel_list, req_list) {
p9_debug(P9_DEBUG_ERROR, "call back req %p\n", req);

And that was because this patch was based on:
https://github.com/martinetd/linux/commit/52f1c45dde9136f964d

I usually tag patches depending on another patch not being merged yet (and not
being tied to the same series) like:

Based-on: <message-id>

> > > @@ -832,6 +840,7 @@ static int p9_fd_open(struct p9_client *client, int
> > > rfd, int wfd)
> > >
> > > client->trans = ts;
> > > client->status = Connected;
> > >
> > > + spin_lock_init(&ts->conn.req_lock);
> >
> > Are you sure this is the right place for spin_lock_init()? Not rather in
> > p9_conn_create()?
>
> Good point, 'ts->conn' (named... m over there for some reason...) is
> mostly initialized in p9_conn_create; it makes much more sense to move
> it there despite being slightly further away from the allocation.
>
> It's a bit of a pain to check as the code is spread over many paths (fd,
> unix, tcp) but it looks equivalent to me:
> - p9_fd_open is only called from p9_fd_create which immediately calls
> p9_conn_create
> - below p9_socket_open itself immediately calls p9_conn_create

Yeah, looks pretty much the same, but better to have init code at the same
place. Either or.

With that changed, you can add my RB-tag. :)

Thanks!

Best regards,
Christian Schoenebeck

Reply all

Reply to author

Forward

[syzbot] inconsistent lock state in p9_req_put

syzbot

Tetsuo Handa

Dominique Martinet

Tetsuo Handa

Tetsuo Handa

Dominique Martinet

Tetsuo Handa

Dominique Martinet

Tetsuo Handa

Dominique Martinet

Tetsuo Handa

Christian Schoenebeck

Dominique Martinet

Christian Schoenebeck