[syzbot] KASAN: use-after-free Read in rdma_close

24 views
Skip to first unread message

syzbot

unread,
Sep 25, 2022, 7:29:41 AM9/25/22
to asma...@codewreck.org, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net
Hello,

syzbot found the following issue on:

HEAD commit: 483fed3b5dc8 Add linux-next specific files for 20220921
git tree: linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=11450b0f080000
kernel config: https://syzkaller.appspot.com/x/.config?x=849cb9f70f15b1ba
dashboard link: https://syzkaller.appspot.com/bug?extid=67d13108d855f451cafc
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11c18ce4880000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12046b8c880000

Downloadable assets:
disk image: https://storage.googleapis.com/1cb3f4618323/disk-483fed3b.raw.xz
vmlinux: https://storage.googleapis.com/cc02cb30b495/vmlinux-483fed3b.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+67d131...@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: use-after-free in rdma_close+0xaf/0xc0 net/9p/trans_rdma.c:555
Read of size 8 at addr ffff888016c73408 by task syz-executor151/3608

CPU: 0 PID: 3608 Comm: syz-executor151 Not tainted 6.0.0-rc6-next-20220921-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/16/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:284 [inline]
print_report+0x15e/0x45d mm/kasan/report.c:395
kasan_report+0xbb/0x1f0 mm/kasan/report.c:495
rdma_close+0xaf/0xc0 net/9p/trans_rdma.c:555
p9_client_destroy+0xbe/0x370 net/9p/client.c:1040
p9_client_create+0x728/0xf20 net/9p/client.c:1027
v9fs_session_init+0x1e2/0x1810 fs/9p/v9fs.c:408
v9fs_mount+0xba/0xc90 fs/9p/vfs_super.c:126
legacy_get_tree+0x105/0x220 fs/fs_context.c:610
vfs_get_tree+0x89/0x2f0 fs/super.c:1530
do_new_mount fs/namespace.c:3040 [inline]
path_mount+0x1326/0x1e20 fs/namespace.c:3370
do_mount fs/namespace.c:3383 [inline]
__do_sys_mount fs/namespace.c:3591 [inline]
__se_sys_mount fs/namespace.c:3568 [inline]
__x64_sys_mount+0x27f/0x300 fs/namespace.c:3568
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f9a65969039
Code: 28 c3 e8 5a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffda4b921e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ffda4b921f8 RCX: 00007f9a65969039
RDX: 0000000020000080 RSI: 0000000020000040 RDI: 0000000020000000
RBP: 00007ffda4b921f0 R08: 00000000200000c0 R09: 00007f9a65927300
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>

Allocated by task 3608:
kasan_save_stack+0x1e/0x40 mm/kasan/common.c:45
kasan_set_track+0x21/0x30 mm/kasan/common.c:52
____kasan_kmalloc mm/kasan/common.c:371 [inline]
____kasan_kmalloc mm/kasan/common.c:330 [inline]
__kasan_kmalloc+0xa1/0xb0 mm/kasan/common.c:380
kmalloc include/linux/slab.h:559 [inline]
kzalloc include/linux/slab.h:695 [inline]
alloc_rdma net/9p/trans_rdma.c:567 [inline]
rdma_create_trans+0x24f/0x13d0 net/9p/trans_rdma.c:644
p9_client_create+0x7ef/0xf20 net/9p/client.c:992
v9fs_session_init+0x1e2/0x1810 fs/9p/v9fs.c:408
v9fs_mount+0xba/0xc90 fs/9p/vfs_super.c:126
legacy_get_tree+0x105/0x220 fs/fs_context.c:610
vfs_get_tree+0x89/0x2f0 fs/super.c:1530
do_new_mount fs/namespace.c:3040 [inline]
path_mount+0x1326/0x1e20 fs/namespace.c:3370
do_mount fs/namespace.c:3383 [inline]
__do_sys_mount fs/namespace.c:3591 [inline]
__se_sys_mount fs/namespace.c:3568 [inline]
__x64_sys_mount+0x27f/0x300 fs/namespace.c:3568
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 3608:
kasan_save_stack+0x1e/0x40 mm/kasan/common.c:45
kasan_set_track+0x21/0x30 mm/kasan/common.c:52
kasan_save_free_info+0x2a/0x40 mm/kasan/generic.c:511
____kasan_slab_free mm/kasan/common.c:236 [inline]
____kasan_slab_free+0x160/0x1c0 mm/kasan/common.c:200
kasan_slab_free include/linux/kasan.h:177 [inline]
slab_free_hook mm/slub.c:1669 [inline]
slab_free_freelist_hook+0x8b/0x1c0 mm/slub.c:1695
slab_free mm/slub.c:3599 [inline]
__kmem_cache_free+0xab/0x3b0 mm/slub.c:3612
rdma_destroy_trans+0x196/0x210 net/9p/trans_rdma.c:380
rdma_create_trans+0x1076/0x13d0 net/9p/trans_rdma.c:735
p9_client_create+0x7ef/0xf20 net/9p/client.c:992
v9fs_session_init+0x1e2/0x1810 fs/9p/v9fs.c:408
v9fs_mount+0xba/0xc90 fs/9p/vfs_super.c:126
legacy_get_tree+0x105/0x220 fs/fs_context.c:610
vfs_get_tree+0x89/0x2f0 fs/super.c:1530
do_new_mount fs/namespace.c:3040 [inline]
path_mount+0x1326/0x1e20 fs/namespace.c:3370
do_mount fs/namespace.c:3383 [inline]
__do_sys_mount fs/namespace.c:3591 [inline]
__se_sys_mount fs/namespace.c:3568 [inline]
__x64_sys_mount+0x27f/0x300 fs/namespace.c:3568
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888016c73400
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 8 bytes inside of
512-byte region [ffff888016c73400, ffff888016c73600)

The buggy address belongs to the physical page:
page:ffffea00005b1c00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x16c70
head:ffffea00005b1c00 order:2 compound_mapcount:0 compound_pincount:0
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000010200 ffff888011841c80 dead000080100010 0000000000000000
raw: 0000000000000000 dead000000000001 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 1, tgid 1 (swapper/0), ts 2303057531, free_ts 0
prep_new_page mm/page_alloc.c:2538 [inline]
get_page_from_freelist+0x1092/0x2d20 mm/page_alloc.c:4286
__alloc_pages+0x1c7/0x5a0 mm/page_alloc.c:5545
alloc_page_interleave+0x1e/0x200 mm/mempolicy.c:2113
alloc_pages+0x22f/0x270 mm/mempolicy.c:2275
alloc_slab_page mm/slub.c:1739 [inline]
allocate_slab+0x213/0x300 mm/slub.c:1884
new_slab mm/slub.c:1937 [inline]
___slab_alloc+0xac1/0x1430 mm/slub.c:3119
__slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3217
slab_alloc_node mm/slub.c:3302 [inline]
__kmem_cache_alloc_node+0x18a/0x3d0 mm/slub.c:3375
__do_kmalloc_node mm/slab_common.c:933 [inline]
__kmalloc+0x44/0xc0 mm/slab_common.c:947
kmalloc include/linux/slab.h:564 [inline]
kzalloc include/linux/slab.h:695 [inline]
alloc_workqueue+0x14b/0x1020 kernel/workqueue.c:4314
padata_alloc+0xc8/0x4d0 kernel/padata.c:984
pcrypt_init_padata+0x1b/0xf5 crypto/pcrypt.c:323
pcrypt_init+0x70/0xef crypto/pcrypt.c:348
do_one_initcall+0x13d/0x780 init/main.c:1307
do_initcall_level init/main.c:1382 [inline]
do_initcalls init/main.c:1398 [inline]
do_basic_setup init/main.c:1417 [inline]
kernel_init_freeable+0x6ff/0x788 init/main.c:1637
kernel_init+0x1a/0x1d0 init/main.c:1525
page_owner free stack trace missing

Memory state around the buggy address:
ffff888016c73300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888016c73380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888016c73400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888016c73480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888016c73500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Leon Romanovsky

unread,
Sep 28, 2022, 6:07:30 AM9/28/22
to syzbot, asma...@codewreck.org, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net
The bug is in commit 3ff51294a055 ("9p: p9_client_create: use p9_client_destroy on failure").
It is wrong to call to p9_client_destroy() if clnt->trans_mod->create fails.

Thanks

asma...@codewreck.org

unread,
Sep 28, 2022, 6:44:03 AM9/28/22
to Leon Romanovsky, syzbot, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net
Leon Romanovsky wrote on Wed, Sep 28, 2022 at 01:07:23PM +0300:
> The bug is in commit 3ff51294a055 ("9p: p9_client_create: use p9_client_destroy on failure").

Thanks for looking

> It is wrong to call to p9_client_destroy() if clnt->trans_mod->create fails.

hmm that's a bit broad :)

But I agree I did get that wrong: trans_mod->close() wasn't called if
create failed.
We do want the idr_for_each_entry() that is in p9_client_destroy so
rather than revert the commit (fix a bug, create a new one..) I'd rather
split it out in an internal function that takes a 'bool close' or
something to not duplicate the rest.
(Bit of a nitpick, sure)

I'll send a patch and credit you in Reported-by unless you don't want
to.

--
Dominique

Leon Romanovsky

unread,
Sep 28, 2022, 6:49:25 AM9/28/22
to asma...@codewreck.org, syzbot, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net
On Wed, Sep 28, 2022 at 07:43:38PM +0900, asma...@codewreck.org wrote:
> Leon Romanovsky wrote on Wed, Sep 28, 2022 at 01:07:23PM +0300:
> > The bug is in commit 3ff51294a055 ("9p: p9_client_create: use p9_client_destroy on failure").
>
> Thanks for looking
>
> > It is wrong to call to p9_client_destroy() if clnt->trans_mod->create fails.
>
> hmm that's a bit broad :)
>
> But I agree I did get that wrong: trans_mod->close() wasn't called if
> create failed.
> We do want the idr_for_each_entry() that is in p9_client_destroy so
> rather than revert the commit (fix a bug, create a new one..) I'd rather
> split it out in an internal function that takes a 'bool close' or
> something to not duplicate the rest.
> (Bit of a nitpick, sure)

Please do proper unwind without extra variable.

Proper unwind means that you will call to symmetrical functions in
destroy as you used in create:
alloc -> free
create -> close
e.t.c

When you use some global function like you did, there is huge chance
to see unwind bugs.

asma...@codewreck.org

unread,
Sep 28, 2022, 7:23:39 AM9/28/22
to Leon Romanovsky, syzbot, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net
Leon Romanovsky wrote on Wed, Sep 28, 2022 at 01:49:19PM +0300:
> > But I agree I did get that wrong: trans_mod->close() wasn't called if
> > create failed.
> > We do want the idr_for_each_entry() that is in p9_client_destroy so
> > rather than revert the commit (fix a bug, create a new one..) I'd rather
> > split it out in an internal function that takes a 'bool close' or
> > something to not duplicate the rest.
> > (Bit of a nitpick, sure)
>
> Please do proper unwind without extra variable.
>
> Proper unwind means that you will call to symmetrical functions in
> destroy as you used in create:
> alloc -> free
> create -> close
> e.t.c
>
> When you use some global function like you did, there is huge chance
> to see unwind bugs.

No.

Duplicating complicated cleanup code leads to leaks like we used to
have; that destroy function already frees up things in the right order.

And, well, frankly 9p is a mess anyway; the problem here is that
trans_mod->create() doesn't leave any trace we can rely on in a common
cleanup function, but the original "proper unwind" missed:
- p9_fid_destroy/tags cleanup for anything in the cache (because, yes,
apparently fuzzers managed to use the client before it's fully
initialized. I don't want to know.)
- fcall cache destory

I'm not duplicating all this mess. This is the only place that can call
destroy before trans_mod create has been called, I wish we'd have a
pattern like "clnt->trans = clnt->trans_mod->create()" so we could just
check if trans is null, but a destroy parameter will do.

... Well, I guess it's not like there are out of tree trans, I could
just change create() to do that if I had infinite time...

--
Dominique


Leon Romanovsky

unread,
Sep 28, 2022, 7:54:10 AM9/28/22
to asma...@codewreck.org, syzbot, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net
On Wed, Sep 28, 2022 at 08:23:14PM +0900, asma...@codewreck.org wrote:
> Leon Romanovsky wrote on Wed, Sep 28, 2022 at 01:49:19PM +0300:
> > > But I agree I did get that wrong: trans_mod->close() wasn't called if
> > > create failed.
> > > We do want the idr_for_each_entry() that is in p9_client_destroy so
> > > rather than revert the commit (fix a bug, create a new one..) I'd rather
> > > split it out in an internal function that takes a 'bool close' or
> > > something to not duplicate the rest.
> > > (Bit of a nitpick, sure)
> >
> > Please do proper unwind without extra variable.
> >
> > Proper unwind means that you will call to symmetrical functions in
> > destroy as you used in create:
> > alloc -> free
> > create -> close
> > e.t.c
> >
> > When you use some global function like you did, there is huge chance
> > to see unwind bugs.
>
> No.

Let's agree to disagree.

>
> Duplicating complicated cleanup code leads to leaks like we used to
> have; that destroy function already frees up things in the right order.

It is pretty straightforward code, nothing complex there.

Just pause for a minute, and ask yourself how totally random guy who
looked on this syzbot bug just because RDMA name in it, found the issue
so quickly.

I will give a hint, I saw not symmetrical error unwind in call trace.

Thanks

Christian Schoenebeck

unread,
Sep 28, 2022, 8:57:27 AM9/28/22
to asma...@codewreck.org, Leon Romanovsky, syzbot, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, linux-...@vger.kernel.org, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net
OK, maybe it's just me, but ask yourself Leon, if you were the only guy left
(i.e. Dominique) still actively taking care for 9p, would those exactly be
motivating phrases for your efforts? Just saying.

From technical perspective, yes, destruction in reverse order is usually the
better way to go. Whether I would carve that in stone, without any exception,
probably not.

Best regards,
Christian Schoenebeck


Dominique Martinet

unread,
Sep 28, 2022, 5:44:36 PM9/28/22
to v9fs-de...@lists.sourceforge.net, Leon Romanovsky, linu...@crudebyte.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com, Dominique Martinet, syzbot+67d131...@syzkaller.appspotmail.com
destroy code would incorrectly call close() if trans_mod exists after some
hasty code cleanup: we need to make sure we only call close after create

Link: https://lkml.kernel.org/r/00000000000015...@google.com
Reported-by: syzbot+67d131...@syzkaller.appspotmail.com
Reported-by: Leon Romanovsky <le...@kernel.org>
Fixes: 3ff51294a055 ("9p: p9_client_create: use p9_client_destroy on failure")
Signed-off-by: Dominique Martinet <asma...@codewreck.org>
---
I tried to make trans->create() return clnt->trans to assign directly
from there, but rdma callbacks need clnt->trans to be set early during
init and the diff was just too big for a simple fix.
This should work for all transports without any change, and ensures we
only call close if create succeeded.

net/9p/client.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/9p/client.c b/net/9p/client.c
index bfa80f01992e..40b59431a566 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -971,6 +971,7 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
spin_lock_init(&clnt->lock);
idr_init(&clnt->fids);
idr_init(&clnt->reqs);
+ clnt->trans = ERR_PTR(-EINVAL);

err = parse_opts(options, clnt);
if (err < 0)
@@ -992,6 +993,9 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
err = clnt->trans_mod->create(clnt, dev_name, options);
if (err)
goto out;
+ // ensure clnt->trans is initialized to call close() on destroy
+ if (IS_ERR(clnt->trans))
+ clnt->trans = NULL;

if (clnt->msize > clnt->trans_mod->maxsize) {
clnt->msize = clnt->trans_mod->maxsize;
@@ -1036,7 +1040,7 @@ void p9_client_destroy(struct p9_client *clnt)

p9_debug(P9_DEBUG_MUX, "clnt %p\n", clnt);

- if (clnt->trans_mod)
+ if (clnt->trans_mod && !IS_ERR(client->trans))
clnt->trans_mod->close(clnt);

v9fs_put_trans(clnt->trans_mod);
--
2.35.1

Dominique Martinet

unread,
Sep 28, 2022, 5:44:40 PM9/28/22
to v9fs-de...@lists.sourceforge.net, Leon Romanovsky, linu...@crudebyte.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com, Dominique Martinet
fcall cache was init'd last for some reason, but we actually allocate
some requests for the version check before that.
Moving the cache creation towards the start also makes p9_client_destroy's
order match the allocation order, which might be easier to think about

Signed-off-by: Dominique Martinet <asma...@codewreck.org>
---
net/9p/client.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/net/9p/client.c b/net/9p/client.c
index 40b59431a566..1ea326e6e271 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -977,6 +977,17 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
if (err < 0)
goto out;

+ /* P9_HDRSZ + 4 is the smallest packet header we can have that is
+ * followed by data accessed from userspace by read
+ * Note we do not check for failure here because the cache is
+ * optional; warning will be issued on dmesg for failure.
+ */
+ clnt->fcall_cache =
+ kmem_cache_create_usercopy("9p-fcall-cache", clnt->msize,
+ 0, 0, P9_HDRSZ + 4,
+ clnt->msize - (P9_HDRSZ + 4),
+ NULL);
+
if (!clnt->trans_mod)
clnt->trans_mod = v9fs_get_default_trans();

@@ -1016,15 +1027,6 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
if (err)
goto out;

- /* P9_HDRSZ + 4 is the smallest packet header we can have that is
- * followed by data accessed from userspace by read
- */
- clnt->fcall_cache =
- kmem_cache_create_usercopy("9p-fcall-cache", clnt->msize,
- 0, 0, P9_HDRSZ + 4,
- clnt->msize - (P9_HDRSZ + 4),
- NULL);
-
return clnt;

out:
--
2.35.1

asma...@codewreck.org

unread,
Sep 28, 2022, 5:53:20 PM9/28/22
to Christian Schoenebeck, Leon Romanovsky, syzbot, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, linux-...@vger.kernel.org, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net
Christian Schoenebeck wrote on Wed, Sep 28, 2022 at 02:57:07PM +0200:
> OK, maybe it's just me, but ask yourself Leon, if you were the only guy left
> (i.e. Dominique) still actively taking care for 9p, would those exactly be
> motivating phrases for your efforts? Just saying.

I didn't plan on replying (happy to disagree), but I'm actually grateful
for Leon to have taken the time to look here: Thank you!
While I probably would also have spotted the error (the change is
fresh), it saved me time even if you account for some bikeshedding.

(Not particularly happy with the amount of time I can allocate to 9p nor
the maintainance work I'm doing by the way, but I guess it's better than
leaving it completely unmaintained)

> From technical perspective, yes, destruction in reverse order is usually the
> better way to go. Whether I would carve that in stone, without any exception,
> probably not.

I think it's a tradeoff really.
Unrolling in place is great, don't get me wrong, but it's also easy to
miss things when adding code later on -- we actually just did that and
got another kasan report which made me factor things in to future-proof
the code.

Having a single place of truth that knows how to "untangle" and properly
free a struct, making sure it is noop for parts of the struct that
haven't been initialized yet, is less of a burden for me to think about.


... Just happened to be wrong about the "making sure it's noop" part
because I didn't check properly and my mental model had close functions
noop on NULL clnt->priv, like free functions...
(Uh, actually it is for RDMA, so the "problem" was that it left
clnt->trans set after later errors -- but conversely virtio's close
doesn't check so also had the problem and we really must ensure we don't
close something not open)

Anyway, I've sent a couple of patch (even fixing up the order to match
in create/destroy), I'll consider this closed.

--
Dominique

Dominique Martinet

unread,
Sep 28, 2022, 5:58:09 PM9/28/22
to v9fs-de...@lists.sourceforge.net, Leon Romanovsky, linu...@crudebyte.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com, Dominique Martinet, syzbot+67d131...@syzkaller.appspotmail.com
destroy code would incorrectly call close() if trans_mod exists after some
hasty code cleanup: we need to make sure we only call close after create

Link: https://lkml.kernel.org/r/00000000000015...@google.com
Reported-by: syzbot+67d131...@syzkaller.appspotmail.com
Reported-by: Leon Romanovsky <le...@kernel.org>
Fixes: 3ff51294a055 ("9p: p9_client_create: use p9_client_destroy on failure")
Signed-off-by: Dominique Martinet <asma...@codewreck.org>
---

As pointed out in later mail, rdma actually does set trans->priv then
fails, so we also need to reset clnt->trans on create errors.

That's getting uglier than I wish it'd be, but the cleanup code I just
trashed away really isn't pretty either so I guess it'll have to do...
At least close() should now really never be called on create failures.

net/9p/client.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/9p/client.c b/net/9p/client.c
index bfa80f01992e..8cf952f2de68 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -971,6 +971,7 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
spin_lock_init(&clnt->lock);
idr_init(&clnt->fids);
idr_init(&clnt->reqs);
+ clnt->trans = ERR_PTR(-EINVAL);

err = parse_opts(options, clnt);
if (err < 0)
@@ -990,8 +991,14 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
clnt, clnt->trans_mod, clnt->msize, clnt->proto_version);

err = clnt->trans_mod->create(clnt, dev_name, options);
- if (err)
+ // ensure clnt->trans is initialized to call close() on destroy
+ // if and only if create succeeded
+ if (err < 0) {
+ clnt->trans = PTR_ERR(err);
goto out;
+ }
+ if (IS_ERR(clnt->trans))
+ clnt->trans = NULL;

if (clnt->msize > clnt->trans_mod->maxsize) {
clnt->msize = clnt->trans_mod->maxsize;
@@ -1036,7 +1043,7 @@ void p9_client_destroy(struct p9_client *clnt)

Dominique Martinet

unread,
Sep 28, 2022, 6:19:35 PM9/28/22
to v9fs-de...@lists.sourceforge.net, Leon Romanovsky, linu...@crudebyte.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com, Dominique Martinet, syzbot+67d131...@syzkaller.appspotmail.com
destroy code would incorrectly call close() if trans_mod exists after some
hasty code cleanup: we need to make sure we only call close after create

Link: https://lkml.kernel.org/r/20220928214417.1...@codewreck.org
Link: https://lkml.kernel.org/r/00000000000015...@google.com
Reported-by: syzbot+67d131...@syzkaller.appspotmail.com
Reported-by: Leon Romanovsky <le...@kernel.org>
Fixes: 3ff51294a055 ("9p: p9_client_create: use p9_client_destroy on failure")
Signed-off-by: Dominique Martinet <asma...@codewreck.org>
---
v1->v2: also reset trans on create error
v2->v3: fix silly compile errors

Sorry for the multiple mails, that's what I get for not even doing basic
tests before talking...

net/9p/client.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/9p/client.c b/net/9p/client.c
index bfa80f01992e..41e825a8da7c 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -971,6 +971,7 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
spin_lock_init(&clnt->lock);
idr_init(&clnt->fids);
idr_init(&clnt->reqs);
+ clnt->trans = ERR_PTR(-EINVAL);

err = parse_opts(options, clnt);
if (err < 0)
@@ -990,8 +991,14 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
clnt, clnt->trans_mod, clnt->msize, clnt->proto_version);

err = clnt->trans_mod->create(clnt, dev_name, options);
- if (err)
+ // ensure clnt->trans is initialized to call close() on destroy
+ // if and only if create succeeded
+ if (err < 0) {
+ clnt->trans = ERR_PTR(err);
goto out;
+ }
+ if (IS_ERR(clnt->trans))
+ clnt->trans = NULL;

if (clnt->msize > clnt->trans_mod->maxsize) {
clnt->msize = clnt->trans_mod->maxsize;
@@ -1036,7 +1043,7 @@ void p9_client_destroy(struct p9_client *clnt)

p9_debug(P9_DEBUG_MUX, "clnt %p\n", clnt);

- if (clnt->trans_mod)
+ if (clnt->trans_mod && !IS_ERR(clnt->trans))

Leon Romanovsky

unread,
Sep 29, 2022, 1:54:17 AM9/29/22
to Dominique Martinet, v9fs-de...@lists.sourceforge.net, linu...@crudebyte.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot+67d131...@syzkaller.appspotmail.com
On Thu, Sep 29, 2022 at 07:19:23AM +0900, Dominique Martinet wrote:
> destroy code would incorrectly call close() if trans_mod exists after some
> hasty code cleanup: we need to make sure we only call close after create
>
> Link: https://lkml.kernel.org/r/20220928214417.1...@codewreck.org
> Link: https://lkml.kernel.org/r/00000000000015...@google.com
> Reported-by: syzbot+67d131...@syzkaller.appspotmail.com
> Reported-by: Leon Romanovsky <le...@kernel.org>
> Fixes: 3ff51294a055 ("9p: p9_client_create: use p9_client_destroy on failure")
> Signed-off-by: Dominique Martinet <asma...@codewreck.org>
> ---
> v1->v2: also reset trans on create error
> v2->v3: fix silly compile errors
>
> Sorry for the multiple mails, that's what I get for not even doing basic
> tests before talking...

Please always submit new patch versions as new one and don't reply-to.
It breaks flows of everyone who relies on proper email threading.

>
> net/9p/client.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/net/9p/client.c b/net/9p/client.c
> index bfa80f01992e..41e825a8da7c 100644
> --- a/net/9p/client.c
> +++ b/net/9p/client.c
> @@ -971,6 +971,7 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
> spin_lock_init(&clnt->lock);
> idr_init(&clnt->fids);
> idr_init(&clnt->reqs);
> + clnt->trans = ERR_PTR(-EINVAL);
>
> err = parse_opts(options, clnt);
> if (err < 0)
> @@ -990,8 +991,14 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
> clnt, clnt->trans_mod, clnt->msize, clnt->proto_version);
>
> err = clnt->trans_mod->create(clnt, dev_name, options);
> - if (err)
> + // ensure clnt->trans is initialized to call close() on destroy
> + // if and only if create succeeded

Please use /* */ comment style.

> + if (err < 0) {
> + clnt->trans = ERR_PTR(err);
> goto out;
> + }
> + if (IS_ERR(clnt->trans))
> + clnt->trans = NULL;
>
> if (clnt->msize > clnt->trans_mod->maxsize) {
> clnt->msize = clnt->trans_mod->maxsize;
> @@ -1036,7 +1043,7 @@ void p9_client_destroy(struct p9_client *clnt)
>
> p9_debug(P9_DEBUG_MUX, "clnt %p\n", clnt);
>
> - if (clnt->trans_mod)
> + if (clnt->trans_mod && !IS_ERR(clnt->trans))

It is completely no-go to rely on internal value inside of structure after
failure in ->create() callback.

Leon Romanovsky

unread,
Sep 29, 2022, 1:57:36 AM9/29/22
to Dominique Martinet, v9fs-de...@lists.sourceforge.net, linu...@crudebyte.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On Thu, Sep 29, 2022 at 06:44:17AM +0900, Dominique Martinet wrote:
> fcall cache was init'd last for some reason, but we actually allocate
> some requests for the version check before that.
> Moving the cache creation towards the start also makes p9_client_destroy's
> order match the allocation order, which might be easier to think about
>
> Signed-off-by: Dominique Martinet <asma...@codewreck.org>
> ---
> net/9p/client.c | 20 +++++++++++---------
> 1 file changed, 11 insertions(+), 9 deletions(-)

Please submit it properly. It is unclear to which series you are
referring.

>
> diff --git a/net/9p/client.c b/net/9p/client.c
> index 40b59431a566..1ea326e6e271 100644
> --- a/net/9p/client.c
> +++ b/net/9p/client.c
> @@ -977,6 +977,17 @@ struct p9_client *p9_client_create(const char *dev_name, char *options)
> if (err < 0)
> goto out;
>
> + /* P9_HDRSZ + 4 is the smallest packet header we can have that is
> + * followed by data accessed from userspace by read
> + * Note we do not check for failure here because the cache is
> + * optional; warning will be issued on dmesg for failure.
> + */
> + clnt->fcall_cache =
> + kmem_cache_create_usercopy("9p-fcall-cache", clnt->msize,
> + 0, 0, P9_HDRSZ + 4,
> + clnt->msize - (P9_HDRSZ + 4),
> + NULL);
> +

clnt->msize can be changed after call to ->create() if it is larger than clnt->trans_mod->maxsize).

Leon Romanovsky

unread,
Sep 29, 2022, 2:10:09 AM9/29/22
to asma...@codewreck.org, Christian Schoenebeck, syzbot, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, linux-...@vger.kernel.org, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net
On Thu, Sep 29, 2022 at 06:52:56AM +0900, asma...@codewreck.org wrote:

<...>

> > From technical perspective, yes, destruction in reverse order is usually the
> > better way to go. Whether I would carve that in stone, without any exception,
> > probably not.
>
> I think it's a tradeoff really.
> Unrolling in place is great, don't get me wrong, but it's also easy to
> miss things when adding code later on -- we actually just did that and
> got another kasan report which made me factor things in to future-proof
> the code.
>
> Having a single place of truth that knows how to "untangle" and properly
> free a struct, making sure it is noop for parts of the struct that
> haven't been initialized yet, is less of a burden for me to think about.

It is not bikeshedding or tradeoff, but matter of well-proven coding
patterns, which are very helpful for review and code maintaining.

Thanks

Dominique Martinet

unread,
Sep 29, 2022, 3:04:14 AM9/29/22
to Leon Romanovsky, v9fs-de...@lists.sourceforge.net, linu...@crudebyte.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Leon Romanovsky wrote on Thu, Sep 29, 2022 at 08:57:31AM +0300:
> > + * followed by data accessed from userspace by read
> > + * Note we do not check for failure here because the cache is
> > + * optional; warning will be issued on dmesg for failure.
> > + */
> > + clnt->fcall_cache =
> > + kmem_cache_create_usercopy("9p-fcall-cache", clnt->msize,
> > + 0, 0, P9_HDRSZ + 4,
> > + clnt->msize - (P9_HDRSZ + 4),
> > + NULL);
> > +
>
> clnt->msize can be changed after call to ->create() if it is larger than clnt->trans_mod->maxsize).

Ugh, that's what I get for doing this in a hurry because you're all
talking about order this order that...

This is absolutely correct, I'm dropping this patch.

--
Dominique

Dominique Martinet

unread,
Sep 29, 2022, 3:24:33 AM9/29/22
to Leon Romanovsky, v9fs-de...@lists.sourceforge.net, linu...@crudebyte.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com, syzbot+67d131...@syzkaller.appspotmail.com
Leon Romanovsky wrote on Thu, Sep 29, 2022 at 08:54:10AM +0300:
> > + // ensure clnt->trans is initialized to call close() on destroy
> > + // if and only if create succeeded
> > + if (err < 0) {
> > + clnt->trans = ERR_PTR(err);
> > goto out;
> > + }
> > + if (IS_ERR(clnt->trans))
> > + clnt->trans = NULL;
> >
> > if (clnt->msize > clnt->trans_mod->maxsize) {
> > clnt->msize = clnt->trans_mod->maxsize;
> > @@ -1036,7 +1043,7 @@ void p9_client_destroy(struct p9_client *clnt)
> >
> > p9_debug(P9_DEBUG_MUX, "clnt %p\n", clnt);
> >
> > - if (clnt->trans_mod)
> > + if (clnt->trans_mod && !IS_ERR(clnt->trans))
>
> It is completely no-go to rely on internal value inside of structure after
> failure in ->create() callback.

We're overriding the value after ->create(), in both cases that might
pose problem (success without setting trans/failure); I can't see how
that would possibly fail.

But as you've seen in the other commit I am in no shape to write any
decent code and consider the implications of what I'm changing in this
untangled mess, so you know what: I'll just leave this broken.

Please send me a patch if you care, I'm done here.
--
Dominique

syzbot

unread,
Dec 30, 2022, 5:34:35 AM12/30/22
to asma...@codewreck.org, dan.ca...@oracle.com, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, le...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net
This bug is marked as fixed by commit:
9p: client_create/destroy: only call trans_mod->close after create

But I can't find it in the tested trees[1] for more than 90 days.
Is it a correct commit? Please update it by replying:

#syz fix: exact-commit-title

Until then the bug is still considered open and new crashes with
the same signature are ignored.

Kernel: Linux
Dashboard link: https://syzkaller.appspot.com/bug?extid=67d13108d855f451cafc

---
[1] I expect the commit to be present in:

1. for-kernelci branch of
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git

2. master branch of
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git

3. master branch of
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

4. master branch of
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git

The full list of 10 trees can be found at
https://syzkaller.appspot.com/upstream/repos

syzbot

unread,
Jan 13, 2023, 5:35:33 AM1/13/23
to asma...@codewreck.org, dan.ca...@oracle.com, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, le...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net

syzbot

unread,
Jan 27, 2023, 5:36:32 AM1/27/23
to asma...@codewreck.org, dan.ca...@oracle.com, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, le...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net

syzbot

unread,
Feb 10, 2023, 5:36:37 AM2/10/23
to asma...@codewreck.org, dan.ca...@oracle.com, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, le...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net

syzbot

unread,
Feb 24, 2023, 5:37:38 AM2/24/23
to asma...@codewreck.org, dan.ca...@oracle.com, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, le...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net

syzbot

unread,
Mar 10, 2023, 5:38:33 AM3/10/23
to asma...@codewreck.org, dan.ca...@oracle.com, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, le...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net
This bug is marked as fixed by commit:
9p: client_create/destroy: only call trans_mod->close after create

But I can't find it in the tested trees[1] for more than 90 days.
Is it a correct commit? Please update it by replying:

#syz fix: exact-commit-title

Until then the bug is still considered open and new crashes with
the same signature are ignored.

Kernel: Linux
Dashboard link: https://syzkaller.appspot.com/bug?extid=67d13108d855f451cafc

---
[1] I expect the commit to be present in:

1. for-kernelci branch of
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git

2. master branch of
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git

3. master branch of
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

4. main branch of

syzbot

unread,
Mar 24, 2023, 6:38:40 AM3/24/23
to asma...@codewreck.org, dan.ca...@oracle.com, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, le...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net

syzbot

unread,
Apr 7, 2023, 6:38:48 AM4/7/23
to asma...@codewreck.org, dan.ca...@oracle.com, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, le...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net

syzbot

unread,
Apr 21, 2023, 6:39:35 AM4/21/23
to asma...@codewreck.org, dan.ca...@oracle.com, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, le...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net

syzbot

unread,
May 5, 2023, 6:40:46 AM5/5/23
to asma...@codewreck.org, dan.ca...@oracle.com, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, le...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net

syzbot

unread,
May 19, 2023, 6:40:50 AM5/19/23
to asma...@codewreck.org, dan.ca...@oracle.com, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, le...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net

syzbot

unread,
Jun 2, 2023, 6:42:37 AM6/2/23
to asma...@codewreck.org, dan.ca...@oracle.com, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, le...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net

Aleksandr Nogikh

unread,
Jun 2, 2023, 6:58:04 AM6/2/23
to syzbot, asma...@codewreck.org, dan.ca...@oracle.com, da...@davemloft.net, edum...@google.com, eri...@gmail.com, ku...@kernel.org, le...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net
Looks like the guilty patch was just edited -next.

As there's no fixing commit, let's just invalidate the bug:

#syz invalid
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bug...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/00000000000017cc6205fd233643%40google.com.
Reply all
Reply to author
Forward
0 new messages