INFO: task hung in __blkdev_get

syzbot

Dec 9, 2017, 5:43:01 AM
to linux-...@vger.kernel.org, mi...@redhat.com, pet...@infradead.org, syzkall...@googlegroups.com
Hello,

syzkaller hit the following crash on
4131d5166185d0d75b5f1d4bf362a9e0bac05598
git://git.cmpxchg.org/linux-mmots.git/master
compiler: gcc (GCC) 7.1.1 20170620
.config is attached
Raw console output is attached.
C reproducer is attached
syzkaller reproducer is attached. See https://goo.gl/kgGztJ
for information about syzkaller reproducers


INFO: task blkid:3090 blocked for more than 120 seconds.
Not tainted 4.15.0-rc1-mm1+ #29
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
blkid D23696 3090 3062 0x00000004
Call Trace:
context_switch kernel/sched/core.c:2800 [inline]
__schedule+0x8eb/0x2060 kernel/sched/core.c:3376
schedule+0xf5/0x430 kernel/sched/core.c:3435
schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3493
__mutex_lock_common kernel/locking/mutex.c:833 [inline]
__mutex_lock+0xaad/0x1a80 kernel/locking/mutex.c:893
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
__blkdev_get+0x158/0x10e0 fs/block_dev.c:1439
blkdev_get+0x3a1/0xad0 fs/block_dev.c:1591
blkdev_open+0x1e4/0x270 fs/block_dev.c:1749
do_dentry_open+0x682/0xd70 fs/open.c:752
vfs_open+0x107/0x230 fs/open.c:866
do_last fs/namei.c:3379 [inline]
path_openat+0x1157/0x3530 fs/namei.c:3519
do_filp_open+0x25b/0x3b0 fs/namei.c:3554
do_sys_open+0x502/0x6d0 fs/open.c:1059
SYSC_open fs/open.c:1077 [inline]
SyS_open+0x2d/0x40 fs/open.c:1072
entry_SYSCALL_64_fastpath+0x1f/0x96
RIP: 0033:0x7fe5c5bfc120
RSP: 002b:00007ffde0d34f08 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe5c5bfc120
RDX: 00007ffde0d35f34 RSI: 0000000000000000 RDI: 00007ffde0d35f34
RBP: 0000000000000000 R08: 0000000000000078 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000211e030
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000005

Showing all locks held in the system:
2 locks held by khungtaskd/675:
#0: (rcu_read_lock){....}, at: [<000000007fb79bbe>]
check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
#0: (rcu_read_lock){....}, at: [<000000007fb79bbe>] watchdog+0x1c5/0xd60
kernel/hung_task.c:249
#1: (tasklist_lock){.+.+}, at: [<0000000035358c26>]
debug_show_all_locks+0xd3/0x400 kernel/locking/lockdep.c:4554
1 lock held by rsyslogd/2973:
#0: (&f->f_pos_lock){+.+.}, at: [<0000000001261d81>]
__fdget_pos+0x131/0x1a0 fs/file.c:770
2 locks held by getty/3055:
#0: (&tty->ldisc_sem){++++}, at: [<000000005cc11435>]
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<0000000008d2b0f7>]
n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
2 locks held by getty/3056:
#0: (&tty->ldisc_sem){++++}, at: [<000000005cc11435>]
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<0000000008d2b0f7>]
n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
2 locks held by getty/3057:
#0: (&tty->ldisc_sem){++++}, at: [<000000005cc11435>]
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<0000000008d2b0f7>]
n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
2 locks held by getty/3058:
#0: (&tty->ldisc_sem){++++}, at: [<000000005cc11435>]
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<0000000008d2b0f7>]
n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
2 locks held by getty/3059:
#0: (&tty->ldisc_sem){++++}, at: [<000000005cc11435>]
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<0000000008d2b0f7>]
n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
2 locks held by getty/3060:
#0: (&tty->ldisc_sem){++++}, at: [<000000005cc11435>]
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<0000000008d2b0f7>]
n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
2 locks held by getty/3061:
#0: (&tty->ldisc_sem){++++}, at: [<000000005cc11435>]
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (&ldata->atomic_read_lock){+.+.}, at: [<0000000008d2b0f7>]
n_tty_read+0x2f2/0x1a10 drivers/tty/n_tty.c:2131
1 lock held by syzkaller346228/3087:
#0: (&bdev->bd_mutex){+.+.}, at: [<0000000099fa8891>]
__blkdev_put+0xa7/0x7c0 fs/block_dev.c:1757
1 lock held by blkid/3090:
#0: (&bdev->bd_mutex){+.+.}, at: [<00000000d4caea9e>]
__blkdev_get+0x158/0x10e0 fs/block_dev.c:1439

=============================================

NMI backtrace for cpu 0
CPU: 0 PID: 675 Comm: khungtaskd Not tainted 4.15.0-rc1-mm1+ #29
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
nmi_cpu_backtrace+0x1d2/0x210 lib/nmi_backtrace.c:103
nmi_trigger_cpumask_backtrace+0x122/0x180 lib/nmi_backtrace.c:62
arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline]
check_hung_task kernel/hung_task.c:132 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline]
watchdog+0x90c/0xd60 kernel/hung_task.c:249
kthread+0x37a/0x440 kernel/kthread.c:238
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:517
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.15.0-rc1-mm1+ #29
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
task: 0000000064c7084b task.stack: 00000000bb7e08d1
RIP: 0010:sched_ttwu_pending+0x0/0x270 kernel/sched/core.c:1584
RSP: 0018:ffff8801db507d68 EFLAGS: 00000046
RAX: 0000000000000003 RBX: 0000000000027900 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801da3894a0
RBP: ffff8801db507e90 R08: 0000000000000001 R09: ffff88021fff8048
R10: ffff88021fff8050 R11: ffff88021fff805d R12: ffff8801da388300
R13: 1ffff1003b6a0fb1 R14: ffff8801db507e68 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff8801db500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe5c5bfc110 CR3: 00000001d1b96000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
smp_reschedule_interrupt+0xe6/0x670 arch/x86/kernel/smp.c:277
reschedule_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:931
</IRQ>
RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:54
RSP: 0018:ffff8801da397da8 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff02
RAX: dffffc0000000000 RBX: 1ffff1003b472fb8 RCX: 0000000000000000
RDX: 1ffffffff0bd9744 RSI: 0000000000000001 RDI: ffffffff85ecba20
RBP: ffff8801da397da8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
R13: ffff8801da397e60 R14: ffffffff865eca20 R15: 0000000000000000
arch_safe_halt arch/x86/include/asm/paravirt.h:93 [inline]
default_idle+0xbf/0x430 arch/x86/kernel/process.c:355
arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:346
default_idle_call+0x36/0x90 kernel/sched/idle.c:98
cpuidle_idle_call kernel/sched/idle.c:156 [inline]
do_idle+0x24a/0x3b0 kernel/sched/idle.c:246
cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:351
start_secondary+0x2dd/0x3e0 arch/x86/kernel/smpboot.c:277
secondary_startup_64+0xa5/0xb0 arch/x86/kernel/head_64.S:237
Code: ff ff 4c 89 85 70 ff ff ff e8 ad 12 5a 00 48 8b 95 68 ff ff ff 4c 8b
85 70 ff ff ff e9 19 fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 48 89 e5
41 57 41 56 41 55 4c 8d ad 78 ff ff ff 49 be 00 00


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzk...@googlegroups.com.
Please credit me with: Reported-by: syzbot <syzk...@googlegroups.com>

syzbot will keep track of this bug report.
Once a fix for this bug is merged into any tree, reply to this email with:
#syz fix: exact-commit-title
If you want to test a patch for this bug, please reply with:
#syz test: git://repo/address.git branch
and provide the patch inline or as an attachment.
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug
report.
Note: all commands must start from beginning of the line in the email body.
config.txt
raw.log
repro.txt
repro.c
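
The rule above that `#syz` commands must start at the beginning of a line can be illustrated with a small sketch. This is not syzbot's actual parser; `toy_syz_command` is a hypothetical helper that assumes only what the report states, namely that a command line begins with `#syz` in column 0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not syzbot code.
 * If `line` starts with "#syz " in column 0, return a pointer to the text
 * that follows the prefix; otherwise return NULL.
 */
static const char *toy_syz_command(const char *line)
{
	static const char prefix[] = "#syz ";
	size_t n = sizeof(prefix) - 1;

	if (strncmp(line, prefix, n) != 0)
		return NULL;
	return line + n;
}
```

Under this reading, a command that appears inside quoted reply text (prefixed with "> ") or after leading whitespace would not be treated as a command.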

Tetsuo Handa

Apr 5, 2018, 10:08:52 AM
to mi...@redhat.com, pet...@infradead.org, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, Dmitry Vyukov
I tried the reproducer in my environment, and it trivially reproduces a
hang. If the bug I'm observing is the one syzbot is reporting (I ran the
reproducer using the init= kernel command line option), then
__blkdev_get() is blocked waiting for bdev->bd_mutex because an exiting
thread cannot release the bdev->bd_mutex it took in __blkdev_put(): that
thread never returns from wait_event_interruptible() in
blk_queue_enter(). (By the way, since the exiting task is sleeping in
interruptible state, khungtaskd does not help; we need to use SysRq-t.)

----------------------------------------
[ 200.731135] a.out S 0 447 446 0x80000002
[ 200.732646] Call Trace:
[ 200.733566] ? __schedule+0x2a4/0x9f0
[ 200.734764] ? _raw_spin_unlock_irqrestore+0x40/0x50
[ 200.736251] schedule+0x34/0x80
[ 200.737353] schedule_timeout+0x1cd/0x530
[ 200.738595] ? collect_expired_timers+0xa0/0xa0
[ 200.739921] ? blk_queue_enter+0x7d/0x550
[ 200.741500] blk_queue_enter+0x275/0x550
[ 200.743017] ? wait_woken+0x80/0x80
[ 200.744273] generic_make_request+0xe3/0x2a0
[ 200.745621] ? submit_bio+0x67/0x130
[ 200.746831] submit_bio+0x67/0x130
[ 200.747915] ? guard_bio_eod+0xae/0x200
[ 200.749177] submit_bh_wbc+0x161/0x190
[ 200.750326] __block_write_full_page+0x15c/0x3c0
[ 200.751678] ? check_disk_change+0x60/0x60
[ 200.752923] __writepage+0x11/0x50
[ 200.754023] write_cache_pages+0x1ea/0x530
[ 200.755575] ? __test_set_page_writeback+0x440/0x440
[ 200.757039] ? __lock_acquire+0x38f/0x1a10
[ 200.758718] generic_writepages+0x5f/0xa0
[ 200.760128] ? do_writepages+0x12/0x50
[ 200.761281] do_writepages+0x12/0x50
[ 200.762369] __filemap_fdatawrite_range+0xc3/0x100
[ 200.763719] ? __mutex_lock+0x72/0x950
[ 200.764876] filemap_write_and_wait+0x25/0x60
[ 200.766168] __blkdev_put+0x71/0x200
[ 200.767261] blkdev_close+0x1c/0x20
[ 200.768333] __fput+0x95/0x1d0
[ 200.769339] task_work_run+0x84/0xa0
[ 200.770767] do_exit+0x301/0xbf0
[ 200.771877] ? __do_page_fault+0x2ca/0x510
[ 200.773086] do_group_exit+0x38/0xb0
[ 200.774260] SyS_exit_group+0xb/0x10
[ 200.776094] do_syscall_64+0x68/0x210
[ 200.777486] entry_SYSCALL_64_after_hwframe+0x42/0xb7

[ 200.815616] Showing all locks held in the system:
[ 200.817791] 2 locks held by kworker/0:1/38:
[ 200.819125] #0: 0000000057489670 ((wq_completion)"events_freezable_power_efficient"){+.+.}, at: process_one_work+0x1e2/0x690
[ 200.821878] #1: 00000000c9c74590 ((work_completion)(&(&ev->dwork)->work)){+.+.}, at: process_one_work+0x1e2/0x690
[ 200.824690] 1 lock held by a.out/447:
[ 200.826448] #0: 00000000bedfcca8 (&bdev->bd_mutex){+.+.}, at: __blkdev_put+0x3c/0x200


[ 270.495110] a.out S 0 447 446 0x80000002
[ 270.496770] Call Trace:
[ 270.497779] ? __schedule+0x2a4/0x9f0
[ 270.498930] ? _raw_spin_unlock_irqrestore+0x40/0x50
[ 270.500433] schedule+0x34/0x80
[ 270.501513] schedule_timeout+0x1cd/0x530
[ 270.502773] ? collect_expired_timers+0xa0/0xa0
[ 270.504125] ? blk_queue_enter+0x7d/0x550
[ 270.505706] blk_queue_enter+0x275/0x550
[ 270.507352] ? wait_woken+0x80/0x80
[ 270.508614] generic_make_request+0xe3/0x2a0
[ 270.509913] ? submit_bio+0x67/0x130
[ 270.511876] submit_bio+0x67/0x130
[ 270.513160] ? guard_bio_eod+0xae/0x200
[ 270.514430] submit_bh_wbc+0x161/0x190
[ 270.515691] __block_write_full_page+0x15c/0x3c0
[ 270.517024] ? check_disk_change+0x60/0x60
[ 270.518301] __writepage+0x11/0x50
[ 270.519420] write_cache_pages+0x1ea/0x530
[ 270.520691] ? __test_set_page_writeback+0x440/0x440
[ 270.522118] ? __lock_acquire+0x38f/0x1a10
[ 270.523377] generic_writepages+0x5f/0xa0
[ 270.524624] ? do_writepages+0x12/0x50
[ 270.525974] do_writepages+0x12/0x50
[ 270.527222] __filemap_fdatawrite_range+0xc3/0x100
[ 270.529115] ? __mutex_lock+0x72/0x950
[ 270.530378] filemap_write_and_wait+0x25/0x60
[ 270.531741] __blkdev_put+0x71/0x200
[ 270.532904] blkdev_close+0x1c/0x20
[ 270.534074] __fput+0x95/0x1d0
[ 270.535139] task_work_run+0x84/0xa0
[ 270.536331] do_exit+0x301/0xbf0
[ 270.537458] ? __do_page_fault+0x2ca/0x510
[ 270.538717] do_group_exit+0x38/0xb0
[ 270.539830] SyS_exit_group+0xb/0x10
[ 270.540963] do_syscall_64+0x68/0x210
[ 270.542097] entry_SYSCALL_64_after_hwframe+0x42/0xb7

[ 270.582465] Showing all locks held in the system:
[ 270.584584] 2 locks held by kworker/1:2/136:
[ 270.585939] #0: 0000000057489670 ((wq_completion)"events_freezable_power_efficient"){+.+.}, at: process_one_work+0x1e2/0x690
[ 270.589850] #1: 00000000e793cf79 ((work_completion)(&(&ev->dwork)->work)){+.+.}, at: process_one_work+0x1e2/0x690
[ 270.592813] 1 lock held by a.out/447:
[ 270.594237] #0: 00000000bedfcca8 (&bdev->bd_mutex){+.+.}, at: __blkdev_put+0x3c/0x200
----------------------------------------

I checked the variables using the patch below, but none of them changed over time.

----------------------------------------
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -857,10 +857,26 @@ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
 		 */
 		smp_rmb();

-		ret = wait_event_interruptible(q->mq_freeze_wq,
-				(atomic_read(&q->mq_freeze_depth) == 0 &&
-				 (preempt || !blk_queue_preempt_only(q))) ||
-				blk_queue_dying(q));
+		while (1) {
+			ret = wait_event_interruptible_timeout
+				(q->mq_freeze_wq,
+				 (atomic_read(&q->mq_freeze_depth) == 0 &&
+				  (preempt || !blk_queue_preempt_only(q))) ||
+				 blk_queue_dying(q), 5 * HZ);
+			if (ret < 0) {
+				break;
+			} else if (ret == 0) {
+				printk("q->mq_freeze_depth=%d preempt=%u "
+				       "blk_queue_preempt_only(q)=%u "
+				       "blk_queue_dying(q)=%u\n",
+				       atomic_read(&q->mq_freeze_depth),
+				       preempt, blk_queue_preempt_only(q),
+				       blk_queue_dying(q));
+			} else {
+				ret = 0;
+				break;
+			}
+		}
 		if (blk_queue_dying(q))
 			return -ENODEV;
 		if (ret)
----------------------------------------

[ 28.090809] loop: module loaded
[ 43.488409] q->mq_freeze_depth=1 preempt=0 blk_queue_preempt_only(q)=0 blk_queue_dying(q)=0
[ 48.608715] q->mq_freeze_depth=1 preempt=0 blk_queue_preempt_only(q)=0 blk_queue_dying(q)=0
[ 53.728281] q->mq_freeze_depth=1 preempt=0 blk_queue_preempt_only(q)=0 blk_queue_dying(q)=0
[ 58.848387] q->mq_freeze_depth=1 preempt=0 blk_queue_preempt_only(q)=0 blk_queue_dying(q)=0
(...snipped...)
[ 289.248694] q->mq_freeze_depth=1 preempt=0 blk_queue_preempt_only(q)=0 blk_queue_dying(q)=0
[ 294.368484] q->mq_freeze_depth=1 preempt=0 blk_queue_preempt_only(q)=0 blk_queue_dying(q)=0
[ 299.488896] q->mq_freeze_depth=1 preempt=0 blk_queue_preempt_only(q)=0 blk_queue_dying(q)=0
[ 304.608632] q->mq_freeze_depth=1 preempt=0 blk_queue_preempt_only(q)=0 blk_queue_dying(q)=0

This means that we deadlocked there.
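
The condition the waiter is stuck on can be modelled in userspace. The following is only a sketch under simplifying assumptions: `toy_queue`, `toy_can_proceed` and `toy_queue_enter` are hypothetical names, the preempt-only terms are dropped, and a bounded polling loop stands in for the 5 * HZ timeout used in the debug patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the relevant request_queue state. */
struct toy_queue {
	int freeze_depth;	/* plays the role of q->mq_freeze_depth */
	bool dying;		/* plays the role of blk_queue_dying(q) */
};

/* Simplified wake-up condition: queue not frozen, or queue dying. */
static bool toy_can_proceed(const struct toy_queue *q)
{
	return q->freeze_depth == 0 || q->dying;
}

/*
 * Poll the condition up to max_polls times, logging every miss in the
 * spirit of the debug printk. Returns 0 on entry, -ENODEV if the queue
 * is dying, and -ETIMEDOUT if the condition never became true (the toy
 * equivalent of the hang; the kernel code would keep sleeping instead).
 */
static int toy_queue_enter(const struct toy_queue *q, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (toy_can_proceed(q))
			return q->dying ? -ENODEV : 0;
		printf("freeze_depth=%d dying=%d\n", q->freeze_depth, q->dying);
	}
	return -ETIMEDOUT;
}
```

With freeze_depth stuck at 1 and the queue not dying, no number of polls lets the caller through, which matches the unchanging log lines above.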

Tetsuo Handa

Apr 6, 2018, 6:12:13 AM
to Dmitry Vyukov, Ming Lei, Jens Axboe, mi...@redhat.com, pet...@infradead.org, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
From fab524a1a8a67a8d6de1d486ff526ed2f18ee6fd Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Date: Fri, 6 Apr 2018 10:03:17 +0900
Subject: [PATCH] block/loop: fix deadlock after loop_set_status

syzbot is reporting deadlocks at __blkdev_get() [1].

----------------------------------------
[ 92.493919] systemd-udevd D12696 525 1 0x00000000
[ 92.495891] Call Trace:
[ 92.501560] schedule+0x23/0x80
[ 92.502923] schedule_preempt_disabled+0x5/0x10
[ 92.504645] __mutex_lock+0x416/0x9e0
[ 92.510760] __blkdev_get+0x73/0x4f0
[ 92.512220] blkdev_get+0x12e/0x390
[ 92.518151] do_dentry_open+0x1c3/0x2f0
[ 92.519815] path_openat+0x5d9/0xdc0
[ 92.521437] do_filp_open+0x7d/0xf0
[ 92.527365] do_sys_open+0x1b8/0x250
[ 92.528831] do_syscall_64+0x6e/0x270
[ 92.530341] entry_SYSCALL_64_after_hwframe+0x42/0xb7

[ 92.931922] 1 lock held by systemd-udevd/525:
[ 92.933642] #0: 00000000a2849e25 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x73/0x4f0
----------------------------------------

The deadlock turned out to be caused by wait_event_interruptible() in
blk_queue_enter() getting stuck, with bdev->bd_mutex held in
__blkdev_put(), because q->mq_freeze_depth == 1.

----------------------------------------
[ 92.787172] a.out S12584 634 633 0x80000002
[ 92.789120] Call Trace:
[ 92.796693] schedule+0x23/0x80
[ 92.797994] blk_queue_enter+0x3cb/0x540
[ 92.803272] generic_make_request+0xf0/0x3d0
[ 92.807970] submit_bio+0x67/0x130
[ 92.810928] submit_bh_wbc+0x15e/0x190
[ 92.812461] __block_write_full_page+0x218/0x460
[ 92.815792] __writepage+0x11/0x50
[ 92.817209] write_cache_pages+0x1ae/0x3d0
[ 92.825585] generic_writepages+0x5a/0x90
[ 92.831865] do_writepages+0x43/0xd0
[ 92.836972] __filemap_fdatawrite_range+0xc1/0x100
[ 92.838788] filemap_write_and_wait+0x24/0x70
[ 92.840491] __blkdev_put+0x69/0x1e0
[ 92.841949] blkdev_close+0x16/0x20
[ 92.843418] __fput+0xda/0x1f0
[ 92.844740] task_work_run+0x87/0xb0
[ 92.846215] do_exit+0x2f5/0xba0
[ 92.850528] do_group_exit+0x34/0xb0
[ 92.852018] SyS_exit_group+0xb/0x10
[ 92.853449] do_syscall_64+0x6e/0x270
[ 92.854944] entry_SYSCALL_64_after_hwframe+0x42/0xb7

[ 92.943530] 1 lock held by a.out/634:
[ 92.945105] #0: 00000000a2849e25 (&bdev->bd_mutex){+.+.}, at: __blkdev_put+0x3c/0x1e0
----------------------------------------

q->mq_freeze_depth == 1 in turn turned out to be because loop_set_status()
forgot to call blk_mq_unfreeze_queue() on the error paths taken when
info->lo_encrypt_type != 0.

----------------------------------------
[ 37.509497] CPU: 2 PID: 634 Comm: a.out Tainted: G W 4.16.0+ #457
[ 37.513608] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[ 37.518832] RIP: 0010:blk_freeze_queue_start+0x17/0x40
[ 37.521778] RSP: 0018:ffffb0c2013e7c60 EFLAGS: 00010246
[ 37.524078] RAX: 0000000000000000 RBX: ffff8b07b1519798 RCX: 0000000000000000
[ 37.527015] RDX: 0000000000000002 RSI: ffffb0c2013e7cc0 RDI: ffff8b07b1519798
[ 37.529934] RBP: ffffb0c2013e7cc0 R08: 0000000000000008 R09: 47a189966239b898
[ 37.532684] R10: dad78b99b278552f R11: 9332dca72259d5ef R12: ffff8b07acd73678
[ 37.535452] R13: 0000000000004c04 R14: 0000000000000000 R15: ffff8b07b841e940
[ 37.538186] FS: 00007fede33b9740(0000) GS:ffff8b07b8e80000(0000) knlGS:0000000000000000
[ 37.541168] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 37.543590] CR2: 00000000206fdf18 CR3: 0000000130b30006 CR4: 00000000000606e0
[ 37.546410] Call Trace:
[ 37.547902] blk_freeze_queue+0x9/0x30
[ 37.549968] loop_set_status+0x67/0x3c0 [loop]
[ 37.549975] loop_set_status64+0x3b/0x70 [loop]
[ 37.549986] lo_ioctl+0x223/0x810 [loop]
[ 37.549995] blkdev_ioctl+0x572/0x980
[ 37.550003] block_ioctl+0x34/0x40
[ 37.550006] do_vfs_ioctl+0xa7/0x6d0
[ 37.550017] ksys_ioctl+0x6b/0x80
[ 37.573076] SyS_ioctl+0x5/0x10
[ 37.574831] do_syscall_64+0x6e/0x270
[ 37.576769] entry_SYSCALL_64_after_hwframe+0x42/0xb7
----------------------------------------

[1] https://syzkaller.appspot.com/bug?id=cd662bc3f6022c0979d01a262c318fab2ee9b56f

Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Reported-by: syzbot <bot+48594378e9851eab70...@syzkaller.appspotmail.com>
Fixes: ecdd09597a572513 ("block/loop: fix race between I/O and set_status")
Cc: Ming Lei <tom.l...@gmail.com>
Cc: Dmitry Vyukov <dvy...@google.com>
Cc: stable <sta...@vger.kernel.org>
Cc: Jens Axboe <ax...@fb.com>
---
drivers/block/loop.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 264abaa..e5fc020 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1103,11 +1103,15 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 	if (info->lo_encrypt_type) {
 		unsigned int type = info->lo_encrypt_type;

-		if (type >= MAX_LO_CRYPT)
-			return -EINVAL;
+		if (type >= MAX_LO_CRYPT) {
+			err = -EINVAL;
+			goto exit;
+		}
 		xfer = xfer_funcs[type];
-		if (xfer == NULL)
-			return -EINVAL;
+		if (xfer == NULL) {
+			err = -EINVAL;
+			goto exit;
+		}
 	} else
 		xfer = NULL;

--
1.8.3.1
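
The shape of the bug and of the fix can be reproduced outside the kernel. The sketch below is a userspace toy, not the loop driver: `toy_queue`, `toy_xfer_funcs`, `TOY_MAX_CRYPT` and both `toy_set_status_*` functions are hypothetical, and the freeze is reduced to a counter. It only shows why an early return after the freeze leaks a freeze count while a single exit label does not.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_CRYPT 4	/* hypothetical stand-in for MAX_LO_CRYPT */

struct toy_queue {
	int freeze_depth;	/* plays the role of q->mq_freeze_depth */
};

static void toy_freeze(struct toy_queue *q)
{
	q->freeze_depth++;
}

static void toy_unfreeze(struct toy_queue *q)
{
	assert(q->freeze_depth > 0);
	q->freeze_depth--;
}

/* Hypothetical transfer table; only slot 1 is registered. */
static const char *const toy_xfer_funcs[TOY_MAX_CRYPT] = {
	NULL, "xor", NULL, NULL
};

/* Buggy shape: the early returns skip the unfreeze. */
static int toy_set_status_buggy(struct toy_queue *q, unsigned int type)
{
	toy_freeze(q);
	if (type >= TOY_MAX_CRYPT)
		return -EINVAL;		/* leaks one freeze */
	if (type && toy_xfer_funcs[type] == NULL)
		return -EINVAL;		/* leaks one freeze */
	toy_unfreeze(q);
	return 0;
}

/* Fixed shape: every path after the freeze leaves through "exit". */
static int toy_set_status_fixed(struct toy_queue *q, unsigned int type)
{
	int err = 0;

	toy_freeze(q);
	if (type >= TOY_MAX_CRYPT) {
		err = -EINVAL;
		goto exit;
	}
	if (type && toy_xfer_funcs[type] == NULL) {
		err = -EINVAL;
		goto exit;
	}
exit:
	toy_unfreeze(q);
	return err;
}
```

After one failed call to the buggy variant the counter stays at 1, which is the state the earlier debug output kept printing.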


Jens Axboe

Apr 6, 2018, 11:53:49 AM
to Tetsuo Handa, Dmitry Vyukov, Ming Lei, mi...@redhat.com, pet...@infradead.org, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On 4/6/18 4:12 AM, Tetsuo Handa wrote:
> ----------------------------------------
>
> The reason of q->mq_freeze_depth == 1 turned out that loop_set_status()
> forgot to call blk_mq_unfreeze_queue() at error paths for
> info->lo_encrypt_type != NULL case.

Thanks for finding this, applied.

--
Jens Axboe

Tetsuo Handa

Apr 10, 2018, 6:55:22 AM
to Jens Axboe, Dmitry Vyukov, Ming Lei, mi...@redhat.com, pet...@infradead.org, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com, Omar Sandoval
Hello.

Since syzbot is reporting so many hung task bugs which involve /dev/loopX,
would it be possible to "temporarily" apply the patch below for testing under
syzbot (after "block/loop: fix deadlock after loop_set_status" and
"loop: fix LOOP_GET_STATUS lock imbalance" in linux-block.git#for-linus
are merged into linux.git)?

I don't have a smoking gun from lockdep, but I noticed that

[upstream] INFO: task hung in lo_open (2)
https://syzkaller.appspot.com/bug?id=1f93b57f496d969efb9fb24167f6f9de5ee068fd

contained the "lo->lo_ctl_mutex => bdev->bd_mutex" locking order

2 locks held by syz-executor6/15084:
#0: (&lo->lo_ctl_mutex/1){+.+.}, at: [<00000000cc154b8d>] lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355
#1: (&bdev->bd_mutex){+.+.}, at: [<0000000058b7c5b5>] blkdev_reread_part+0x1e/0x40 block/ioctl.c:192

while commit f028f3b2f987ebc6 ("loop: fix circular locking in loop_clr_fd()")
says that

* Calling fput holding lo_ctl_mutex triggers a circular
* lock dependency possibility warning as fput can take
* bd_mutex which is usually taken before lo_ctl_mutex.

which implies that the locking order should be "bdev->bd_mutex => lo->lo_ctl_mutex".
It also says that use of the "_nested" version might mask other real bugs,
which could be what syzbot is frequently reporting as hung tasks...

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 264abaa..5559b15 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1360,7 +1360,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
 	struct loop_device *lo = bdev->bd_disk->private_data;
 	int err;

-	err = mutex_lock_killable_nested(&lo->lo_ctl_mutex, 1);
+	err = mutex_lock_killable(&lo->lo_ctl_mutex);
 	if (err)
 		goto out_unlocked;
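
The two conflicting orders described above can be shown with a very small order tracker. This is a rough sketch of the idea only, not how lockdep is implemented: `toy_lockdep`, `toy_acquire` and `toy_release` are hypothetical, only direct pairs are recorded (no transitive chains), and there is no notion of lock subclasses such as the one the "_nested" variant supplies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes named after the two mutexes discussed above. */
enum toy_lock { TOY_BD_MUTEX, TOY_LO_CTL_MUTEX, TOY_NR_LOCKS };

struct toy_lockdep {
	/* before[a][b] is true once a was held while b was acquired. */
	bool before[TOY_NR_LOCKS][TOY_NR_LOCKS];
	bool held[TOY_NR_LOCKS];
};

/*
 * Record the acquisition of lock l. Returns false if some currently held
 * lock h was previously acquired while l was held, i.e. the order h -> l
 * inverts an l -> h order seen earlier.
 */
static bool toy_acquire(struct toy_lockdep *ld, enum toy_lock l)
{
	bool ok = true;

	for (int h = 0; h < TOY_NR_LOCKS; h++) {
		if (!ld->held[h] || h == (int)l)
			continue;
		if (ld->before[l][h])
			ok = false;
		ld->before[h][l] = true;
	}
	ld->held[l] = true;
	return ok;
}

static void toy_release(struct toy_lockdep *ld, enum toy_lock l)
{
	ld->held[l] = false;
}
```

Feeding it the fput-style order (bd_mutex, then lo_ctl_mutex) followed by the lo_ioctl-style order (lo_ctl_mutex, then bd_mutex) makes the second bd_mutex acquisition report an inversion.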


Dmitry Vyukov

Apr 10, 2018, 7:42:08 AM
to Tetsuo Handa, Jens Axboe, Ming Lei, Ingo Molnar, Peter Zijlstra, syzbot, LKML, syzkaller-bugs, Omar Sandoval
On Tue, Apr 10, 2018 at 12:55 PM, Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
> Hello.
>
> Since syzbot is reporting so many hung up bug which involves /dev/loopX ,
> is it possible to "temporarily" apply below patch for testing under syzbot

Unfortunately it's not possible, for full explanation please see:
https://github.com/google/syzkaller/blob/master/docs/syzbot.md#no-custom-patches

Tetsuo Handa

Apr 10, 2018, 9:04:47 AM
to dvy...@google.com, ax...@kernel.dk, tom.l...@gmail.com, mi...@redhat.com, pet...@infradead.org, bot+48594378e9851eab70...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com, osa...@fb.com
Dmitry Vyukov wrote:
> On Tue, Apr 10, 2018 at 12:55 PM, Tetsuo Handa
> <penguin...@i-love.sakura.ne.jp> wrote:
> > Hello.
> >
> > Since syzbot is reporting so many hung up bug which involves /dev/loopX ,
> > is it possible to "temporarily" apply below patch for testing under syzbot
>
> Unfortunately it's not possible, for full explanation please see:
> https://github.com/google/syzkaller/blob/master/docs/syzbot.md#no-custom-patches
>

I mean sending the custom patch to linux.git during the -rc cycle and reverting
it before the final release. It shouldn't take long to get a result.

If syzbot can test an arbitrary git tree instead of linux.git, making a branch
which contains custom patches would be possible.

Dmitry Vyukov

Apr 10, 2018, 9:25:07 AM
to Tetsuo Handa, Jens Axboe, Ming Lei, Ingo Molnar, Peter Zijlstra, syzbot, LKML, syzkaller-bugs, Omar Sandoval
On Tue, Apr 10, 2018 at 3:04 PM, Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
> Dmitry Vyukov wrote:
>> On Tue, Apr 10, 2018 at 12:55 PM, Tetsuo Handa
>> <penguin...@i-love.sakura.ne.jp> wrote:
>> > Hello.
>> >
>> > Since syzbot is reporting so many hung up bug which involves /dev/loopX ,
>> > is it possible to "temporarily" apply below patch for testing under syzbot
>>
>> Unfortunately it's not possible, for full explanation please see:
>> https://github.com/google/syzkaller/blob/master/docs/syzbot.md#no-custom-patches
>>
>
> I mean, sending custom patch to linux.git for -rc and revert the custom patch
> before -final is released. It won't take so much period until we get the result.

Ah, I see, then I guess it wasn't a question for me.


> If syzbot can test arbitrary git tree instead of linux.git, making a branch
> which contains custom patches would be possible.

syzbot tests a set of trees (also net-next and bpf-next at the
moment). But see this reply to Takashi re a similar request:
https://groups.google.com/forum/#!msg/syzkaller-bugs/7ucgCkAJKSk/skZjgavRAQAJ
Note that a syzkaller instance will produce several hundred
different crashes within a day, and then the big question is what to do
with them.

What's perfectly possible though is running syzkaller locally, and you
can do it on just any tree you want. I've recently added a script that
sets up syzkaller end-to-end with the config, compiler and image that
syzbot uses:
https://github.com/google/syzkaller/blob/master/tools/demo_setup.sh
(that uses v4.13, but you can change this to any kernel version).

Tetsuo Handa

Apr 14, 2018, 11:27:06 AM
to dvy...@google.com, ax...@kernel.dk, tom.l...@gmail.com, mi...@redhat.com, pet...@infradead.org, bot+48594378e9851eab70...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com, osa...@fb.com
OK. The patch went into linux.git as commit 1e047eaab3bb5564.

#syz fix: block/loop: fix deadlock after loop_set_status

Dmitry Vyukov <dvy...@google.com> wrote:
> On Tue, Apr 10, 2018 at 3:04 PM, Tetsuo Handa
> <penguin...@i-love.sakura.ne.jp> wrote:
> > Dmitry Vyukov wrote:
> >> On Tue, Apr 10, 2018 at 12:55 PM, Tetsuo Handa
> >> <penguin...@i-love.sakura.ne.jp> wrote:
> >> > Hello.
> >> >
> >> > Since syzbot is reporting so many hung up bug which involves /dev/loopX ,
> >> > is it possible to "temporarily" apply below patch for testing under syzbot
> >>
> >> Unfortunately it's not possible, for full explanation please see:
> >> https://github.com/google/syzkaller/blob/master/docs/syzbot.md#no-custom-patches
> >>
> >
> > I mean, sending custom patch to linux.git for -rc and revert the custom patch
> > before -final is released. It won't take so much period until we get the result.
>
> Ah, I see, then I guess it wasn't a question to me.
>
I noticed that a lockdep report already exists under the entry

possible deadlock in blkdev_reread_part
https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d15889

and no patch has been proposed yet:

https://groups.google.com/forum/#!msg/syzkaller-bugs/2Rw8-OM6IbM/SI4DyK-1AQAJ

Dmitry Vyukov

Apr 16, 2018, 8:05:47 AM
to Tetsuo Handa, Jens Axboe, Ming Lei, Ingo Molnar, Peter Zijlstra, syzbot, LKML, syzkaller-bugs, Omar Sandoval
On Sat, Apr 14, 2018 at 5:26 PM, Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
> OK. The patch was sent to linux.git as commit 1e047eaab3bb5564.
>
> #syz fix: block/loop: fix deadlock after loop_set_status
>
> Dmitry Vyukov <dvy...@google.com>" wrote:
>> On Tue, Apr 10, 2018 at 3:04 PM, Tetsuo Handa
>> <penguin...@i-love.sakura.ne.jp> wrote:
>> > Dmitry Vyukov wrote:
>> >> On Tue, Apr 10, 2018 at 12:55 PM, Tetsuo Handa
>> >> <penguin...@i-love.sakura.ne.jp> wrote:
>> >> > Hello.
>> >> >
>> >> > Since syzbot is reporting so many hung up bug which involves /dev/loopX ,
>> >> > is it possible to "temporarily" apply below patch for testing under syzbot
>> >>
>> >> Unfortunately it's not possible, for full explanation please see:
>> >> https://github.com/google/syzkaller/blob/master/docs/syzbot.md#no-custom-patches
>> >>
>> >
>> > I mean, sending custom patch to linux.git for -rc and revert the custom patch
>> > before -final is released. It won't take so much period until we get the result.
>>
>> Ah, I see, then I guess it wasn't a question to me.
>>
> I noticed that there already is the lockdep report at
>
> possible deadlock in blkdev_reread_part
> https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d15889
>
> entry, and no patch is proposed yet:
>
> https://groups.google.com/forum/#!msg/syzkaller-bugs/2Rw8-OM6IbM/SI4DyK-1AQAJ

If it's a lock inversion, I wonder why there was no lockdep report
before the "task hung" report...
Various assorted "task hung" reports seem to be a plague for syzbot at
the moment. Lockdep reports should be far more actionable, based solely
on the inversion report.