[syzbot] possible deadlock in p9_write

syzbot

unread,

Mar 29, 2022, 5:23:18 PM3/29/22

to asma...@codewreck.org, da...@davemloft.net, eri...@gmail.com, ku...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net

Hello,

syzbot found the following issue on:

HEAD commit: 8515d05bf6bc Add linux-next specific files for 20220328
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=155abcc3700000
kernel config: https://syzkaller.appspot.com/x/.config?x=530c68bef4e2b8a8
dashboard link: https://syzkaller.appspot.com/bug?extid=bde0f89deacca7c765b8
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+bde0f8...@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
5.17.0-next-20220328-syzkaller #0 Not tainted
------------------------------------------------------
kworker/1:1/26 is trying to acquire lock:
ffff88807eece460 (sb_writers#3){.+.+}-{0:0}, at: p9_fd_write net/9p/trans_fd.c:428 [inline]
ffff88807eece460 (sb_writers#3){.+.+}-{0:0}, at: p9_write_work+0x25e/0xca0 net/9p/trans_fd.c:479

but task is already holding lock:
ffffc90000a1fda8 ((work_completion)(&m->wq)){+.+.}-{0:0}, at: process_one_work+0x8ae/0x1610 kernel/workqueue.c:2264

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 ((work_completion)(&m->wq)){+.+.}-{0:0}:
process_one_work+0x905/0x1610 kernel/workqueue.c:2265
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e9/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298

-> #2 ((wq_completion)events){+.+.}-{0:0}:
flush_workqueue+0x164/0x1440 kernel/workqueue.c:2831
flush_scheduled_work include/linux/workqueue.h:583 [inline]
ext4_put_super+0x99/0x1150 fs/ext4/super.c:1202
generic_shutdown_super+0x14c/0x400 fs/super.c:462
kill_block_super+0x97/0xf0 fs/super.c:1394
deactivate_locked_super+0x94/0x160 fs/super.c:332
deactivate_super+0xad/0xd0 fs/super.c:363
cleanup_mnt+0x3a2/0x540 fs/namespace.c:1186
task_work_run+0xdd/0x1a0 kernel/task_work.c:164
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
exit_to_user_mode_loop kernel/entry/common.c:183 [inline]
exit_to_user_mode_prepare+0x23c/0x250 kernel/entry/common.c:215
__syscall_exit_to_user_mode_work kernel/entry/common.c:297 [inline]
syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:308
do_syscall_64+0x42/0x80 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #1 (&type->s_umount_key#32){++++}-{3:3}:
down_read+0x98/0x440 kernel/locking/rwsem.c:1461
iterate_supers+0xdb/0x290 fs/super.c:692
drop_caches_sysctl_handler+0xdb/0x110 fs/drop_caches.c:62
proc_sys_call_handler+0x4a1/0x6e0 fs/proc/proc_sysctl.c:604
call_write_iter include/linux/fs.h:2080 [inline]
do_iter_readv_writev+0x3d1/0x640 fs/read_write.c:726
do_iter_write+0x182/0x700 fs/read_write.c:852
vfs_iter_write+0x70/0xa0 fs/read_write.c:893
iter_file_splice_write+0x723/0xc70 fs/splice.c:689
do_splice_from fs/splice.c:767 [inline]
direct_splice_actor+0x110/0x180 fs/splice.c:936
splice_direct_to_actor+0x34b/0x8c0 fs/splice.c:891
do_splice_direct+0x1a7/0x270 fs/splice.c:979
do_sendfile+0xae0/0x1240 fs/read_write.c:1246
__do_sys_sendfile64 fs/read_write.c:1305 [inline]
__se_sys_sendfile64 fs/read_write.c:1297 [inline]
__x64_sys_sendfile64+0x149/0x210 fs/read_write.c:1297
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #0 (sb_writers#3){.+.+}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3096 [inline]
check_prevs_add kernel/locking/lockdep.c:3219 [inline]
validate_chain kernel/locking/lockdep.c:3834 [inline]
__lock_acquire+0x2ac6/0x56c0 kernel/locking/lockdep.c:5060
lock_acquire kernel/locking/lockdep.c:5672 [inline]
lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5637
percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
__sb_start_write include/linux/fs.h:1728 [inline]
sb_start_write include/linux/fs.h:1798 [inline]
file_start_write include/linux/fs.h:2815 [inline]
kernel_write fs/read_write.c:564 [inline]
kernel_write+0x2ac/0x540 fs/read_write.c:555
p9_fd_write net/9p/trans_fd.c:428 [inline]
p9_write_work+0x25e/0xca0 net/9p/trans_fd.c:479
process_one_work+0x996/0x1610 kernel/workqueue.c:2289
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e9/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298

other info that might help us debug this:

Chain exists of:
sb_writers#3 --> (wq_completion)events --> (work_completion)(&m->wq)

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock((work_completion)(&m->wq));
lock((wq_completion)events);
lock((work_completion)(&m->wq));
lock(sb_writers#3);

*** DEADLOCK ***

2 locks held by kworker/1:1/26:
#0: ffff888010c64d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
#0: ffff888010c64d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic_long_set include/linux/atomic/atomic-long.h:41 [inline]
#0: ffff888010c64d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/linux/atomic/atomic-instrumented.h:1280 [inline]
#0: ffff888010c64d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:636 [inline]
#0: ffff888010c64d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:663 [inline]
#0: ffff888010c64d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x87a/0x1610 kernel/workqueue.c:2260
#1: ffffc90000a1fda8 ((work_completion)(&m->wq)){+.+.}-{0:0}, at: process_one_work+0x8ae/0x1610 kernel/workqueue.c:2264

stack backtrace:
CPU: 1 PID: 26 Comm: kworker/1:1 Not tainted 5.17.0-next-20220328-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events p9_write_work
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
check_noncircular+0x25f/0x2e0 kernel/locking/lockdep.c:2176
check_prev_add kernel/locking/lockdep.c:3096 [inline]
check_prevs_add kernel/locking/lockdep.c:3219 [inline]
validate_chain kernel/locking/lockdep.c:3834 [inline]
__lock_acquire+0x2ac6/0x56c0 kernel/locking/lockdep.c:5060
lock_acquire kernel/locking/lockdep.c:5672 [inline]
lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5637
percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
__sb_start_write include/linux/fs.h:1728 [inline]
sb_start_write include/linux/fs.h:1798 [inline]
file_start_write include/linux/fs.h:2815 [inline]
kernel_write fs/read_write.c:564 [inline]
kernel_write+0x2ac/0x540 fs/read_write.c:555
p9_fd_write net/9p/trans_fd.c:428 [inline]
p9_write_work+0x25e/0xca0 net/9p/trans_fd.c:479
process_one_work+0x996/0x1610 kernel/workqueue.c:2289
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e9/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
</TASK>
usb 4-1: new high-speed USB device number 72 using dummy_hcd
usb 4-1: New USB device found, idVendor=1b3d, idProduct=0193, bcdDevice= 8.4d
usb 4-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
usb 4-1: config 0 descriptor??
ftdi_sio 4-1:0.0: FTDI USB Serial Device converter detected
usb 4-1: Detected FT232RL
ftdi_sio ttyUSB0: Unable to read latency timer: -71
ftdi_sio ttyUSB0: Unable to write latency timer: -71
ftdi_sio 4-1:0.0: GPIO initialisation failed: -71
usb 4-1: FTDI USB Serial Device converter now attached to ttyUSB0
usb 4-1: USB disconnect, device number 72
ftdi_sio ttyUSB0: FTDI USB Serial Device converter now disconnected from ttyUSB0
ftdi_sio 4-1:0.0: device disconnected
usb 4-1: new high-speed USB device number 73 using dummy_hcd
usb 4-1: New USB device found, idVendor=1b3d, idProduct=0193, bcdDevice= 8.4d
usb 4-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
usb 4-1: config 0 descriptor??
ftdi_sio 4-1:0.0: FTDI USB Serial Device converter detected
usb 4-1: Detected FT232RL
ftdi_sio ttyUSB0: Unable to read latency timer: -71
ftdi_sio ttyUSB0: Unable to write latency timer: -71
ftdi_sio 4-1:0.0: GPIO initialisation failed: -71
usb 4-1: FTDI USB Serial Device converter now attached to ttyUSB0
usb 4-1: USB disconnect, device number 73
ftdi_sio ttyUSB0: FTDI USB Serial Device converter now disconnected from ttyUSB0
ftdi_sio 4-1:0.0: device disconnected

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

asma...@codewreck.org

unread,

Mar 29, 2022, 6:05:52 PM3/29/22

to syzbot, da...@davemloft.net, eri...@gmail.com, ku...@kernel.org, linux-...@vger.kernel.org, linu...@crudebyte.com, lu...@ionkov.net, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net

So p9_write_work cannot write because there's.. a backing ext4 umount (I
assume it's been mounted with trans fd with an ext4 file) and a
drop_caches stuck in parallel, and we just got caught in the crossfire ?

I'm not sure why it got stuck there but that doesn't look like anything
we can do about it, using trans fd with filesystem backed files isn't a
usage we care about in the first place, maybe there's a way to refuse
these and only keep sockets but I don't really see the point of
artificially limiting the interface (unless using a 9p mount with a file
could have security implications I don't see)

wontfix/dontcare for me,
--
Dominique

Tetsuo Handa

unread,

Mar 29, 2022, 6:36:10 PM3/29/22

to Andrew Perepechko, Andreas Dilger, Theodore Ts'o, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net, open list:EXT4 FILE SYSTEM

Hello.

This seems to be an example of https://lkml.kernel.org/r/49925af7-78a8-a3dd...@I-love.SAKURA.ne.jp
introduced by "ext4: truncate during setxattr leads to kernel panic".

Please don't use schedule_work() if you need to use flush_scheduled_work().

Dominique Martinet

unread,

Mar 29, 2022, 7:51:54 PM3/29/22

to Tetsuo Handa, Andrew Perepechko, Andreas Dilger, Theodore Ts'o, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net, open list:EXT4 FILE SYSTEM

Tetsuo Handa wrote on Wed, Mar 30, 2022 at 07:35:47AM +0900:
> This seems to be an example of
> https://lkml.kernel.org/r/49925af7-78a8-a3dd...@I-love.SAKURA.ne.jp
> introduced by "ext4: truncate during setxattr leads to kernel panic".

Thanks for the pointer

> Please don't use schedule_work() if you need to use flush_scheduled_work().

In this case we don't call flush_scheduled_work -- ext4 does.
The problem is mixing in the two subsystems when someone (e.g. syzbot)
opens an ext4 file and passes that fd to 9p when mounting with e.g.
mount -t 9p -o rfdno=<no>,wfdno=<no>

Frankly that's just not something I consider useful, interacting through
9p to a local file doesn't make sense except for testing.

If that is a real problem, the simplest way out would be to just forbid
non-socket FDs if it's something we can check.
--
Dominique

Tetsuo Handa

unread,

Mar 29, 2022, 9:57:26 PM3/29/22

to Dominique Martinet, Andrew Perepechko, Andreas Dilger, Theodore Ts'o, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net, open list:EXT4 FILE SYSTEM

On 2022/03/30 8:51, Dominique Martinet wrote:
> Tetsuo Handa wrote on Wed, Mar 30, 2022 at 07:35:47AM +0900:
>> This seems to be an example of
>> https://lkml.kernel.org/r/49925af7-78a8-a3dd...@I-love.SAKURA.ne.jp
>> introduced by "ext4: truncate during setxattr leads to kernel panic".
>
> Thanks for the pointer
>
>> Please don't use schedule_work() if you need to use flush_scheduled_work().
>
> In this case we don't call flush_scheduled_work -- ext4 does.

Yes, that's why I changed recipients to ext4 people.

> The problem is mixing in the two subsystems when someone (e.g. syzbot)
> opens an ext4 file and passes that fd to 9p when mounting with e.g.
> mount -t 9p -o rfdno=<no>,wfdno=<no>
>
> Frankly that's just not something I consider useful, interacting through
> 9p to a local file doesn't make sense except for testing.
>
> If that is a real problem, the simplest way out would be to just forbid
> non-socket FDs if it's something we can check.

Do you mean that p9_fd_open() in net/9p/trans_fd.c does not need to accept non-socket file descriptors?
Then, it's something you can check. You can use S_ISSOCK() like e.g. netlink_getsockbyfilp() does.

Dominique Martinet

unread,

Mar 29, 2022, 10:30:00 PM3/29/22

to Tetsuo Handa, Andrew Perepechko, Andreas Dilger, Theodore Ts'o, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net, open list:EXT4 FILE SYSTEM

Tetsuo Handa wrote on Wed, Mar 30, 2022 at 10:57:15AM +0900:
> >> Please don't use schedule_work() if you need to use flush_scheduled_work().
> >
> > In this case we don't call flush_scheduled_work -- ext4 does.
>
> Yes, that's why I changed recipients to ext4 people.

Sorry, I hadn't noticed.
9p is the one calling schedule_work, so ultimately it really is the
combinaison of the two, and not just ext4 that's wrong here.

> > The problem is mixing in the two subsystems when someone (e.g. syzbot)
> > opens an ext4 file and passes that fd to 9p when mounting with e.g.
> > mount -t 9p -o rfdno=<no>,wfdno=<no>
> >
> > Frankly that's just not something I consider useful, interacting through
> > 9p to a local file doesn't make sense except for testing.
> >
> > If that is a real problem, the simplest way out would be to just forbid
> > non-socket FDs if it's something we can check.
>
> Do you mean that p9_fd_open() in net/9p/trans_fd.c does not need to
> accept non-socket file descriptors?

Yes, I can't think of any valid usage that would involve non-socket fd
there.
It might be useful to leave as a test vector, but if it causes problems
I think it's perfectly OK to just refuse these.

> Then, it's something you can check. You can use S_ISSOCK() like
> e.g. netlink_getsockbyfilp() does

Thanks for the example
--
Dominique

Tetsuo Handa

unread,

Mar 29, 2022, 10:49:19 PM3/29/22

to Dominique Martinet, Andrew Perepechko, Andreas Dilger, Theodore Ts'o, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net, open list:EXT4 FILE SYSTEM

On 2022/03/30 11:29, Dominique Martinet wrote:
> Tetsuo Handa wrote on Wed, Mar 30, 2022 at 10:57:15AM +0900:
>>>> Please don't use schedule_work() if you need to use flush_scheduled_work().
>>>
>>> In this case we don't call flush_scheduled_work -- ext4 does.
>>
>> Yes, that's why I changed recipients to ext4 people.
>
> Sorry, I hadn't noticed.
> 9p is the one calling schedule_work, so ultimately it really is the
> combinaison of the two, and not just ext4 that's wrong here.

Calling schedule_work() itself does not cause troubles (unless there are
too many pending works to make progress). Calling flush_scheduled_work()
causes troubles because it waits for completion of all works even if
some of works cannot be completed due to locks held by the caller of
flush_scheduled_work(). 9p is innocent for this report.

Perepechko, Andrew

unread,

Mar 30, 2022, 12:56:54 PM3/30/22

to Tetsuo Handa, Dominique Martinet, Andreas Dilger, Theodore Ts'o, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net, open list:EXT4 FILE SYSTEM

Hello Tetsuo!

Thank you for your report.

I wonder if I can fix this issue by creating a separate per-superblock workqueue.

I may not fully understand the lockdep magic in process_one_work() so any advice is appreciated.

As I see it, if there's no shared locking between different workqueues, unmount should be able to flush only its own scheduled tasks (which should not conflict with any p9 tasks) and unblock the locking chain under similar conditions.

Thank you,

Andrew

From: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Sent: 30 March 2022 05:49
To: Dominique Martinet <asma...@codewreck.org>
Cc: Perepechko, Andrew <andrew.p...@hpe.com>; Andreas Dilger <adi...@dilger.ca>; Theodore Ts'o <ty...@mit.edu>; syzbot <syzbot+bde0f8...@syzkaller.appspotmail.com>; linux-...@vger.kernel.org <linux-...@vger.kernel.org>; syzkall...@googlegroups.com <syzkall...@googlegroups.com>; v9fs-de...@lists.sourceforge.net <v9fs-de...@lists.sourceforge.net>; open list:EXT4 FILE SYSTEM <linux...@vger.kernel.org>
Subject: Re: [syzbot] possible deadlock in p9_write_work

Tetsuo Handa

unread,

Mar 30, 2022, 7:43:49 PM3/30/22

to Perepechko, Andrew, Dominique Martinet, Andreas Dilger, Theodore Ts'o, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net, open list:EXT4 FILE SYSTEM, Tejun Heo

Hello.

Since "ext4: truncate during setxattr leads to kernel panic" did not choose
per-superblock WQ, ext4_put_super() for some ext4 superblock currently waits
for completion of iput() from delayed_iput_fn() from delayed_iput() from
ext4_xattr_set_entry() from all ext4 superblocks (in addition to other tasks
scheduled by unrelated subsystems).

If ext4_put_super() for some superblock wants to wait for only works from that
superblock, please use per-superblock WQ. Creating per-superblock WQ via
alloc_workqueue() without WQ_MEM_RECLAIM flag will not consume much resource.

If ext4_put_super() for some superblock can afford waiting for iput() from
other ext4 superblocks, you can use per-filesystem WQ.

Perepechko, Andrew

unread,

Mar 31, 2022, 2:51:51 AM3/31/22

to Tetsuo Handa, Dominique Martinet, Andreas Dilger, Theodore Ts'o, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, v9fs-de...@lists.sourceforge.net, open list:EXT4 FILE SYSTEM, Tejun Heo

Thank you very much, Tetsuo!

I will rework the patch and resend it.

From: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Sent: 31 March 2022 02:43
To: Perepechko, Andrew <andrew.p...@hpe.com>; Dominique Martinet <asma...@codewreck.org>
Cc: Andreas Dilger <adi...@dilger.ca>; Theodore Ts'o <ty...@mit.edu>; syzbot <syzbot+bde0f8...@syzkaller.appspotmail.com>; linux-...@vger.kernel.org <linux-...@vger.kernel.org>; syzkall...@googlegroups.com <syzkall...@googlegroups.com>; v9fs-de...@lists.sourceforge.net <v9fs-de...@lists.sourceforge.net>; open list:EXT4 FILE SYSTEM <linux...@vger.kernel.org>; Tejun Heo <t...@kernel.org>

Tetsuo Handa

unread,

May 26, 2022, 6:41:53 AM5/26/22

to syzbot, syzkall...@googlegroups.com

On 2022/03/30 7:35, Tetsuo Handa wrote:
> This seems to be an example of https://lkml.kernel.org/r/49925af7-78a8-a3dd...@I-love.SAKURA.ne.jp
> introduced by "ext4: truncate during setxattr leads to kernel panic".

That patch was removed from linux-next.git tree.

#syz invalid

Reply all

Reply to author

Forward