floppy: GPF in floppy_rb0_cb

50 views
Skip to first unread message

Dmitry Vyukov

unread,
Jan 24, 2016, 8:13:04 AM1/24/16
to Jiri Kosina, NeilBrown, Takashi Iwai, Jens Axboe, Hannes Reinecke, Rasmus Villemoes, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Hello,

The following causes program causes multiple bugs and eventually machine death:

// autogenerated by syzkaller (http://github.com/google/syzkaller)
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/wait.h>

#define N 100

int main()
{
int i, status, pids[N];

for (;;) {
for (i = 0; i < N; i++) {
if ((pids[i] = fork()) == 0) {
open("/dev/fd0", O_RDWR);
exit(0);
}
}
for (i = 0; i < N; i++) {
while (waitpid(pids[i], &status, __WALL) != pids[i]) {
}
}
}
return 0;
}


------------[ cut here ]------------
WARNING: CPU: 0 PID: 6 at drivers/block/floppy.c:975 schedule_bh+0x55/0x60()
Modules linked in:
CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.4.0+ #276
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: floppy fd_timer_workfn
00000000ffffffff ffff88003df97ac0 ffffffff82999e2d 0000000000000000
ffff88003df32f80 ffffffff8687a0e0 ffff88003df97b00 ffffffff81352089
ffffffff8335dbb5 ffffffff8687a0e0 00000000000003cf ffffffff895cae20
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff82999e2d>] dump_stack+0x6f/0xa2 lib/dump_stack.c:50
[<ffffffff81352089>] warn_slowpath_common+0xd9/0x140 kernel/panic.c:482
[<ffffffff813522b9>] warn_slowpath_null+0x29/0x30 kernel/panic.c:515
[<ffffffff8335dbb5>] schedule_bh+0x55/0x60 drivers/block/floppy.c:975
[<ffffffff8336e1cf>] redo_fd_request+0x173f/0x39f0 drivers/block/floppy.c:2878
[< inline >] seek_floppy drivers/block/floppy.c:1572
[<ffffffff8336ad6c>] floppy_ready+0x106c/0x13f0 drivers/block/floppy.c:1911
[<ffffffff8335c9ff>] fd_timer_workfn+0xf/0x20 drivers/block/floppy.c:985
[<ffffffff813a0836>] process_one_work+0x796/0x1440 kernel/workqueue.c:2037
[<ffffffff813a15bb>] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2171
[<ffffffff813b4d4f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
[<ffffffff86336fef>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
---[ end trace 40047c23eabef132 ]---
------------[ cut here ]------------
WARNING: CPU: 1 PID: 10091 at kernel/locking/lockdep.c:3183
__lock_acquire+0xbc8/0x4700()
DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS)
Modules linked in:
[< inline >] process_fd_request drivers/block/floppy.c:2893
[<ffffffff8335df06>] __floppy_read_block_0+0x196/0x260
drivers/block/floppy.c:3822
[<ffffffff83364b93>] floppy_revalidate+0x573/0x770 drivers/block/floppy.c:3867
[<ffffffff8186ff91>] check_disk_change+0xf1/0x130 fs/block_dev.c:1135
[<ffffffff8335e958>] floppy_open+0x518/0x920 drivers/block/floppy.c:3713
[<ffffffff81871c88>] __blkdev_get+0x338/0x10e0 fs/block_dev.c:1213
[<ffffffff818732b0>] blkdev_get+0x310/0x960 fs/block_dev.c:1352
[<ffffffff81873b05>] blkdev_open+0x1a5/0x250 fs/block_dev.c:1507
[<ffffffff817a9c02>] do_dentry_open+0x6a2/0xcb0 fs/open.c:736
[<ffffffff817ad2db>] vfs_open+0x17b/0x1f0 fs/open.c:853
[< inline >] do_last fs/namei.c:3254
[<ffffffff817e00d9>] path_openat+0xde9/0x5e30 fs/namei.c:3386
[<ffffffff817e895e>] do_filp_open+0x18e/0x250 fs/namei.c:3421
[<ffffffff817ada5c>] do_sys_open+0x1fc/0x420 fs/open.c:1022
[< inline >] SYSC_open fs/open.c:1040
[<ffffffff817adcad>] SyS_open+0x2d/0x40 fs/open.c:1035
[<ffffffff86336c36>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185
---[ end trace 40047c23eabef13c ]---
CPU: 1 PID: 10091 Comm: kworker/u8:2 Tainted: G W 4.4.0+ #276
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: floppy fd_timer_workfn
00000000ffffffff ffff8800607f7650 ffffffff82999e2d ffff8800607f76c0
ffff88005b2f4740 ffffffff8642bc40 ffff8800607f7690 ffffffff81352089
ffffffff81454e08 ffffed000c0feed4 ffffffff8642bc40 0000000000000c6f
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff82999e2d>] dump_stack+0x6f/0xa2 lib/dump_stack.c:50
[<ffffffff81352089>] warn_slowpath_common+0xd9/0x140 kernel/panic.c:482
[<ffffffff81352199>] warn_slowpath_fmt+0xa9/0xd0 kernel/panic.c:494
[<ffffffff81454e08>] __lock_acquire+0xbc8/0x4700 kernel/locking/lockdep.c:3183
[<ffffffff8145ad8c>] lock_acquire+0x1dc/0x430 kernel/locking/lockdep.c:3585
[< inline >] __raw_spin_lock_irqsave
include/linux/spinlock_api_smp.h:112
[<ffffffff863365cf>] _raw_spin_lock_irqsave+0x9f/0xd0
kernel/locking/spinlock.c:159
[<ffffffff8143a678>] complete+0x18/0x70 kernel/sched/completion.c:33
[<ffffffff8335dd04>] floppy_rb0_cb+0x74/0xe0 drivers/block/floppy.c:3785
[<ffffffff828f41f7>] bio_endio+0x117/0x200 block/bio.c:1761
[< inline >] req_bio_endio block/blk-core.c:155
[<ffffffff82910533>] blk_update_request+0x1c3/0xbc0 block/blk-core.c:2632
[<ffffffff82910f5a>] blk_update_bidi_request+0x2a/0x160 block/blk-core.c:2686
[<ffffffff82913ee0>] __blk_end_bidi_request+0x30/0x60 block/blk-core.c:2802
[<ffffffff82913f37>] __blk_end_request+0x27/0x30 block/blk-core.c:2903
[<ffffffff83360076>] floppy_end_request+0x96/0x2a0 drivers/block/floppy.c:2213
[<ffffffff833606d2>] request_done+0x452/0x6d0 drivers/block/floppy.c:2266
[< inline >] seek_floppy drivers/block/floppy.c:1571
[<ffffffff8336ad40>] floppy_ready+0x1040/0x13f0 drivers/block/floppy.c:1911
[<ffffffff8335c9ff>] fd_timer_workfn+0xf/0x20 drivers/block/floppy.c:985
[<ffffffff813a0836>] process_one_work+0x796/0x1440 kernel/workqueue.c:2037
[<ffffffff813a15bb>] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2171
[<ffffffff813b4d4f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
[<ffffffff86336fef>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
---[ end trace 40047c23eabef13d ]---
BUG: unable to handle kernel NULL pointer dereference at 000000000000036b
IP: [<000000000000036b>] 0x36b
PGD 651b5067 PUD 63062067 PMD 0
Oops: 0010 [#1] SMP DEBUG_PAGEALLOC KASAN
Modules linked in:
CPU: 1 PID: 10091 Comm: kworker/u8:2 Tainted: G W 4.4.0+ #276
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: floppy fd_timer_workfn
task: ffff88005b2f4740 ti: ffff8800607f0000 task.ti: ffff8800607f0000
RIP: 0010:[<000000000000036b>] [<000000000000036b>] 0x36b
RSP: 0018:ffff8800607f7920 EFLAGS: 00010093
RAX: ffff88005eb775c8 RBX: 000000005eafc740 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff88005eb775c8
RBP: ffff8800607f7968 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000000036b R11: ffffed000fffec09 R12: ffff88005eb775b8
R13: dffffc0000000000 R14: ffff88005eb77608 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000036b CR3: 0000000065243000 CR4: 00000000000006e0
Stack:
ffffffff81438d28 ffff88005eb775c8 0000000100000086 0000000300000000
ffff88005eb77578 ffff88005eb77580 0000000000000086 dffffc0000000000
0000000000001000 ffff8800607f7978 ffffffff81438e1e ffff8800607f79a0
Call Trace:
[<ffffffff81438e1e>] __wake_up_locked+0xe/0x10 kernel/sched/wait.c:105
[<ffffffff8143a6ae>] complete+0x4e/0x70 kernel/sched/completion.c:35
[<ffffffff8335dd04>] floppy_rb0_cb+0x74/0xe0 drivers/block/floppy.c:3785
[<ffffffff828f41f7>] bio_endio+0x117/0x200 block/bio.c:1761
[< inline >] req_bio_endio block/blk-core.c:155
[<ffffffff82910533>] blk_update_request+0x1c3/0xbc0 block/blk-core.c:2632
[<ffffffff82910f5a>] blk_update_bidi_request+0x2a/0x160 block/blk-core.c:2686
[<ffffffff82913ee0>] __blk_end_bidi_request+0x30/0x60 block/blk-core.c:2802
[<ffffffff82913f37>] __blk_end_request+0x27/0x30 block/blk-core.c:2903
[<ffffffff83360076>] floppy_end_request+0x96/0x2a0 drivers/block/floppy.c:2213
[<ffffffff833606d2>] request_done+0x452/0x6d0 drivers/block/floppy.c:2266
[< inline >] seek_floppy drivers/block/floppy.c:1571
[<ffffffff8336ad40>] floppy_ready+0x1040/0x13f0 drivers/block/floppy.c:1911
[<ffffffff8335c9ff>] fd_timer_workfn+0xf/0x20 drivers/block/floppy.c:985
[<ffffffff813a0836>] process_one_work+0x796/0x1440 kernel/workqueue.c:2037
[<ffffffff813a15bb>] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2171
[<ffffffff813b4d4f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
[<ffffffff86336fef>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
Code: Bad RIP value.
RIP [<000000000000036b>] 0x36b
RSP <ffff8800607f7920>
CR2: 000000000000036b
---[ end trace 40047c23eabef13e ]---
Oops: 0000 [#2] SMP DEBUG_PAGEALLOC KASAN
Modules linked in:
CPU: 0 PID: 10091 Comm: kworker/u8:2 Tainted: G D W 4.4.0+ #276
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88005b2f4740 ti: ffff8800607f0000 task.ti: ffff8800607f0000
RIP: 0010:[<ffffffff813b632d>] [<ffffffff813b632d>] kthread_data+0x4d/0x70
RSP: 0018:ffff8800607f73d8 EFLAGS: 00010046
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffff88005b2f47e8
RDX: 1ffffffffffffff5 RSI: 0000000000000000 RDI: ffffffffffffffa8
RBP: ffff8800607f73e0 R08: ffff88003ec20b78 R09: 000000000252cb9d
R10: ffff88005b2f47c0 R11: ffff88003ec20270 R12: 0000000000000000
R13: 0000000000020140 R14: ffff88005b2f4784 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000028 CR3: 00000000075bb000 CR4: 00000000000006f0
Stack:
ffff88005b2f4740 ffff8800607f7400 ffffffff813a858a ffff88003ec20140
0000000000000040 ffff8800607f7488 ffffffff863275d6 0000000000000000
ffff8800607f7490 0000000000000286 ffff88003ec20af0 ffff88003ec20ac8
Call Trace:
[<ffffffff813a858a>] wq_worker_sleeping+0x1a/0x220 kernel/workqueue.c:850
[<ffffffff863275d6>] __schedule+0x1206/0x1c50 kernel/sched/core.c:3260
[<ffffffff863280b7>] schedule+0x97/0x1c0 kernel/sched/core.c:3311
[<ffffffff8135c521>] do_exit+0x1b61/0x2c60 kernel/exit.c:830
[<ffffffff811abe7f>] oops_end+0x9f/0xd0 arch/x86/kernel/dumpstack.c:250
[<ffffffff8127de6c>] no_context+0x2cc/0x870 arch/x86/mm/fault.c:728
[<ffffffff8127e68b>] __bad_area_nosemaphore+0x27b/0x460 arch/x86/mm/fault.c:808
[<ffffffff8127e89a>] bad_area_nosemaphore+0x2a/0x40 arch/x86/mm/fault.c:815
[<ffffffff8127ee0f>] __do_page_fault+0x18f/0x960 arch/x86/mm/fault.c:1180
[<ffffffff8127f738>] trace_do_page_fault+0xe8/0x420 arch/x86/mm/fault.c:1331
[<ffffffff812705c4>] do_async_page_fault+0x14/0xd0 arch/x86/kernel/kvm.c:264
[<ffffffff86338f78>] async_page_fault+0x28/0x30 arch/x86/entry/entry_64.S:986
[<ffffffff81438e1e>] __wake_up_locked+0xe/0x10 kernel/sched/wait.c:105
[<ffffffff8143a6ae>] complete+0x4e/0x70 kernel/sched/completion.c:35
[<ffffffff8335dd04>] floppy_rb0_cb+0x74/0xe0 drivers/block/floppy.c:3785
[<ffffffff828f41f7>] bio_endio+0x117/0x200 block/bio.c:1761
[< inline >] req_bio_endio block/blk-core.c:155
[<ffffffff82910533>] blk_update_request+0x1c3/0xbc0 block/blk-core.c:2632
[<ffffffff82910f5a>] blk_update_bidi_request+0x2a/0x160 block/blk-core.c:2686
[<ffffffff82913ee0>] __blk_end_bidi_request+0x30/0x60 block/blk-core.c:2802
[<ffffffff82913f37>] __blk_end_request+0x27/0x30 block/blk-core.c:2903
[<ffffffff83360076>] floppy_end_request+0x96/0x2a0 drivers/block/floppy.c:2213
[<ffffffff833606d2>] request_done+0x452/0x6d0 drivers/block/floppy.c:2266
[< inline >] seek_floppy drivers/block/floppy.c:1571
[<ffffffff8336ad40>] floppy_ready+0x1040/0x13f0 drivers/block/floppy.c:1911
[<ffffffff8335c9ff>] fd_timer_workfn+0xf/0x20 drivers/block/floppy.c:985
[<ffffffff813a0836>] process_one_work+0x796/0x1440 kernel/workqueue.c:2037
[<ffffffff813a15bb>] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2171
[<ffffffff813b4d4f>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
[<ffffffff86336fef>] ret_from_fork+0x3f/0x70 arch/x86/entry/entry_64.S:468
Code: c1 ea 03 80 3c 02 00 75 29 48 8b 9b 60 05 00 00 48 b8 00 00 00
00 00 fc ff df 48 8d 7b a8 48 89 fa 48 c1 ea 03 80 3c 02 00 75 0e <48>
8b 43 a8 5b 5d c3 e8 77 a6 3a 00 eb d0 e8 70 a6 3a 00 eb eb
RIP [<ffffffff813b632d>] kthread_data+0x4d/0x70 kernel/kthread.c:137
RSP <ffff8800607f73d8>
CR2: ffffffffffffffa8
---[ end trace 40047c23eabef13f ]---
Fixing recursive fault but reboot is needed!


I am testing in qemu, I think without a floppy drive:

$ qemu-system-x86_64 -hda wheezy.img -net
user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net nic -nographic -kernel
bzImage -append "console=ttyS0 root=/dev/sda debug earlyprintk=serial
slub_debug=FPZU" -enable-kvm -m 2G -numa node,nodeid=0,cpus=0-1 -numa
node,nodeid=1,cpus=2-3 -smp sockets=2,cores=2,threads=1 -usb
-usbdevice mouse -usbdevice tablet -soundhw all

On commit 30f05309bde49295e02e45c7e615f73aa4e0ccc2.

Jiri Kosina

unread,
Jan 25, 2016, 9:06:34 AM1/25/16
to Dmitry Vyukov, NeilBrown, Takashi Iwai, Jens Axboe, Hannes Reinecke, Rasmus Villemoes, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Sun, 24 Jan 2016, Dmitry Vyukov wrote:

> Hello,
>
> The following causes program causes multiple bugs and eventually machine death:
>
> // autogenerated by syzkaller (http://github.com/google/syzkaller)
> #include <unistd.h>
> #include <stdlib.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <sys/wait.h>
>
> #define N 100
>
> int main()
> {
> int i, status, pids[N];
>
> for (;;) {
> for (i = 0; i < N; i++) {
> if ((pids[i] = fork()) == 0) {
> open("/dev/fd0", O_RDWR);

Oh man, you are trying to wake up the beast everybody had hoped to have
vanish from this universe forever.
redo_fd_request() is complaining because previous work item hasn't
finished yet, and the whole handling is designed as single-threaded (since
commit 070ad7e793dc). So apparently floppy_wq is racing with itself. We'll
have to find out how exactly that happened, as it might be the root cause
to everything that follows.

> ---[ end trace 40047c23eabef132 ]---
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 10091 at kernel/locking/lockdep.c:3183
> __lock_acquire+0xbc8/0x4700()
> DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS)

Locked ran out of key bitmap. I.e. long convoy waiting for one process to
proceeed, I guess; very likely waiting for global floppy_lock.

> Modules linked in:
> [< inline >] process_fd_request drivers/block/floppy.c:2893
> [<ffffffff8335df06>] __floppy_read_block_0+0x196/0x260
> drivers/block/floppy.c:3822
> [<ffffffff83364b93>] floppy_revalidate+0x573/0x770 drivers/block/floppy.c:3867

And this happens while reading block #0 to perform size auto-sensing.

> [<ffffffff8186ff91>] check_disk_change+0xf1/0x130 fs/block_dev.c:1135
> [<ffffffff8335e958>] floppy_open+0x518/0x920 drivers/block/floppy.c:3713
> [<ffffffff81871c88>] __blkdev_get+0x338/0x10e0 fs/block_dev.c:1213
> [<ffffffff818732b0>] blkdev_get+0x310/0x960 fs/block_dev.c:1352
> [<ffffffff81873b05>] blkdev_open+0x1a5/0x250 fs/block_dev.c:1507
> [<ffffffff817a9c02>] do_dentry_open+0x6a2/0xcb0 fs/open.c:736
> [<ffffffff817ad2db>] vfs_open+0x17b/0x1f0 fs/open.c:853
> [< inline >] do_last fs/namei.c:3254
> [<ffffffff817e00d9>] path_openat+0xde9/0x5e30 fs/namei.c:3386
> [<ffffffff817e895e>] do_filp_open+0x18e/0x250 fs/namei.c:3421
> [<ffffffff817ada5c>] do_sys_open+0x1fc/0x420 fs/open.c:1022
> [< inline >] SYSC_open fs/open.c:1040
> [<ffffffff817adcad>] SyS_open+0x2d/0x40 fs/open.c:1035
> [<ffffffff86336c36>] entry_SYSCALL_64_fastpath+0x16/0x7a
> arch/x86/entry/entry_64.S:185
> ---[ end trace 40047c23eabef13c ]---
> CPU: 1 PID: 10091 Comm: kworker/u8:2 Tainted: G W 4.4.0+ #276
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: floppy fd_timer_workfn
> 00000000ffffffff ffff8800607f7650 ffffffff82999e2d ffff8800607f76c0
> ffff88005b2f4740 ffffffff8642bc40 ffff8800607f7690 ffffffff81352089
> ffffffff81454e08 ffffed000c0feed4 ffffffff8642bc40 0000000000000c6f

This trace looks incompletee, right? From the stacktraces it looks like it
might be another warning from lockdep, but the actually message doesn't
seem to have made it to your trace.
And here we seem to be trying to complete already freed completion.

There might be either race regarding block #0 reading (commits 7b7b68bba5
and 6314a108ec1), or the 070ad7e793dc rework itself.

Do you happen to have (serial) console logs from the kernel before it
crashed? Does it by any chance contain any messages from the floppy
driver?

Thanks,

--
Jiri Kosina
SUSE Labs

Dmitry Vyukov

unread,
Jan 25, 2016, 10:35:10 AM1/25/16
to Jiri Kosina, NeilBrown, Takashi Iwai, Jens Axboe, Hannes Reinecke, Rasmus Villemoes, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Hi,

Here is a complete log:
https://gist.githubusercontent.com/dvyukov/bd7923ece774521d6136/raw/5fae6326d90360d8bd49d6e06e0b3a7f132f00ac/gistfile1.txt
It is intermixed with executed programs, but you can grep for "] floppy".
There seems to be a bunch of "floppy: error -5 while reading block 0" errors.

Jiri Kosina

unread,
Jan 27, 2016, 11:55:16 AM1/27/16
to Dmitry Vyukov, NeilBrown, Takashi Iwai, Jens Axboe, Hannes Reinecke, Rasmus Villemoes, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Mon, 25 Jan 2016, Dmitry Vyukov wrote:

> Here is a complete log:
> https://gist.githubusercontent.com/dvyukov/bd7923ece774521d6136/raw/5fae6326d90360d8bd49d6e06e0b3a7f132f00ac/gistfile1.txt
> It is intermixed with executed programs, but you can grep for "] floppy".
> There seems to be a bunch of "floppy: error -5 while reading block 0" errors.

Alright, there is a serious issue with how floppy_revalidate() handles
error from lock_fdc() (in-depth explanation for the curious: it doesn't).

In case wait_for_event_interruptible() is actually interrupted without
succesfully claiming fdc, we proceed with revalidation, causing all kinds
of havoc (because another parallel request is still in progress).

The whole story with the 'interruptible' parameter to lock_fdc() is funny
as well, because we always wait interruptibly anyway. So it can be nuked.
Most of the lock_fdc() callsites do properly handle error (and propagate
EINTR), but floppy_revalidate() and floppy_check_events() don't.

Could you please let syzkaller retest with the patch below? It fixes all
the oopses I am able to trigger here.

Thanks.


diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 9e25120..b206115 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -866,7 +866,7 @@ static void set_fdc(int drive)
}

/* locks the driver */
-static int lock_fdc(int drive, bool interruptible)
+static int lock_fdc(int drive)
{
if (WARN(atomic_read(&usage_count) == 0,
"Trying to lock fdc while usage count=0\n"))
@@ -2173,7 +2173,7 @@ static int do_format(int drive, struct format_descr *tmp_format_req)
{
int ret;

- if (lock_fdc(drive, true))
+ if (lock_fdc(drive))
return -EINTR;

set_floppy(drive);
@@ -2960,7 +2960,7 @@ static int user_reset_fdc(int drive, int arg, bool interruptible)
{
int ret;

- if (lock_fdc(drive, interruptible))
+ if (lock_fdc(drive))
return -EINTR;

if (arg == FD_RESET_ALWAYS)
@@ -3243,7 +3243,7 @@ static int set_geometry(unsigned int cmd, struct floppy_struct *g,
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
mutex_lock(&open_lock);
- if (lock_fdc(drive, true)) {
+ if (lock_fdc(drive)) {
mutex_unlock(&open_lock);
return -EINTR;
}
@@ -3263,7 +3263,7 @@ static int set_geometry(unsigned int cmd, struct floppy_struct *g,
} else {
int oldStretch;

- if (lock_fdc(drive, true))
+ if (lock_fdc(drive))
return -EINTR;
if (cmd != FDDEFPRM) {
/* notice a disk change immediately, else
@@ -3349,7 +3349,7 @@ static int get_floppy_geometry(int drive, int type, struct floppy_struct **g)
if (type)
*g = &floppy_type[type];
else {
- if (lock_fdc(drive, false))
+ if (lock_fdc(drive))
return -EINTR;
if (poll_drive(false, 0) == -EINTR)
return -EINTR;
@@ -3433,7 +3433,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
if (UDRS->fd_ref != 1)
/* somebody else has this drive open */
return -EBUSY;
- if (lock_fdc(drive, true))
+ if (lock_fdc(drive))
return -EINTR;

/* do the actual eject. Fails on
@@ -3445,7 +3445,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
process_fd_request();
return ret;
case FDCLRPRM:
- if (lock_fdc(drive, true))
+ if (lock_fdc(drive))
return -EINTR;
current_type[drive] = NULL;
floppy_sizes[drive] = MAX_DISK_SIZE << 1;
@@ -3467,7 +3467,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
UDP->flags &= ~FTD_MSG;
return 0;
case FDFMTBEG:
- if (lock_fdc(drive, true))
+ if (lock_fdc(drive))
return -EINTR;
if (poll_drive(true, FD_RAW_NEED_DISK) == -EINTR)
return -EINTR;
@@ -3484,7 +3484,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
return do_format(drive, &inparam.f);
case FDFMTEND:
case FDFLUSH:
- if (lock_fdc(drive, true))
+ if (lock_fdc(drive))
return -EINTR;
return invalidate_drive(bdev);
case FDSETEMSGTRESH:
@@ -3507,7 +3507,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
outparam = UDP;
break;
case FDPOLLDRVSTAT:
- if (lock_fdc(drive, true))
+ if (lock_fdc(drive))
return -EINTR;
if (poll_drive(true, FD_RAW_NEED_DISK) == -EINTR)
return -EINTR;
@@ -3530,7 +3530,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
case FDRAWCMD:
if (type)
return -EINVAL;
- if (lock_fdc(drive, true))
+ if (lock_fdc(drive))
return -EINTR;
set_floppy(drive);
i = raw_cmd_ioctl(cmd, (void __user *)param);
@@ -3539,7 +3539,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
process_fd_request();
return i;
case FDTWADDLE:
- if (lock_fdc(drive, true))
+ if (lock_fdc(drive))
return -EINTR;
twaddle();
process_fd_request();
@@ -3748,7 +3748,8 @@ static unsigned int floppy_check_events(struct gendisk *disk,
return DISK_EVENT_MEDIA_CHANGE;

if (time_after(jiffies, UDRS->last_checked + UDP->checkfreq)) {
- lock_fdc(drive, false);
+ if (lock_fdc(drive))
+ return -EINTR;
poll_drive(false, 0);
process_fd_request();
}
@@ -3847,7 +3848,9 @@ static int floppy_revalidate(struct gendisk *disk)
"VFS: revalidate called on non-open device.\n"))
return -EFAULT;

- lock_fdc(drive, false);
+ res = lock_fdc(drive);
+ if (res)
+ return res;
cf = (test_bit(FD_DISK_CHANGED_BIT, &UDRS->flags) ||
test_bit(FD_VERIFY_BIT, &UDRS->flags));
if (!(cf || test_bit(drive, &fake_change) || drive_no_geom(drive))) {

Dmitry Vyukov

unread,
Jan 27, 2016, 12:17:25 PM1/27/16
to Jiri Kosina, NeilBrown, Takashi Iwai, Jens Axboe, Hannes Reinecke, Rasmus Villemoes, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Wed, Jan 27, 2016 at 5:55 PM, Jiri Kosina <ji...@kernel.org> wrote:
> On Mon, 25 Jan 2016, Dmitry Vyukov wrote:
>
>> Here is a complete log:
>> https://gist.githubusercontent.com/dvyukov/bd7923ece774521d6136/raw/5fae6326d90360d8bd49d6e06e0b3a7f132f00ac/gistfile1.txt
>> It is intermixed with executed programs, but you can grep for "] floppy".
>> There seems to be a bunch of "floppy: error -5 while reading block 0" errors.
>
> Alright, there is a serious issue with how floppy_revalidate() handles
> error from lock_fdc() (in-depth explanation for the curious: it doesn't).
>
> In case wait_for_event_interruptible() is actually interrupted without
> succesfully claiming fdc, we proceed with revalidation, causing all kinds
> of havoc (because another parallel request is still in progress).
>
> The whole story with the 'interruptible' parameter to lock_fdc() is funny
> as well, because we always wait interruptibly anyway. So it can be nuked.
> Most of the lock_fdc() callsites do properly handle error (and propagate
> EINTR), but floppy_revalidate() and floppy_check_events() don't.
>
> Could you please let syzkaller retest with the patch below? It fixes all
> the oopses I am able to trigger here.

Done. I will get back to you if I see any other oops.

Jiri Kosina

unread,
Jan 27, 2016, 3:27:16 PM1/27/16
to Dmitry Vyukov, NeilBrown, Takashi Iwai, Jens Axboe, Hannes Reinecke, Rasmus Villemoes, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Wed, 27 Jan 2016, Dmitry Vyukov wrote:

> > Alright, there is a serious issue with how floppy_revalidate() handles
> > error from lock_fdc() (in-depth explanation for the curious: it doesn't).
> >
> > In case wait_for_event_interruptible() is actually interrupted without
> > succesfully claiming fdc, we proceed with revalidation, causing all kinds
> > of havoc (because another parallel request is still in progress).
> >
> > The whole story with the 'interruptible' parameter to lock_fdc() is funny
> > as well, because we always wait interruptibly anyway. So it can be nuked.
> > Most of the lock_fdc() callsites do properly handle error (and propagate
> > EINTR), but floppy_revalidate() and floppy_check_events() don't.
> >
> > Could you please let syzkaller retest with the patch below? It fixes all
> > the oopses I am able to trigger here.
>
> Done. I will get back to you if I see any other oops.

Please also get to me in case you don't see any other oops after the
timeframe you have been able to reproduce the original problem, so that we
could consider the issue resolved and I can make proper patch from this
and send it upstream.

Dmitry Vyukov

unread,
Jan 28, 2016, 5:32:47 AM1/28/16
to Jiri Kosina, NeilBrown, Takashi Iwai, Jens Axboe, Hannes Reinecke, Rasmus Villemoes, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
I have not seen any /dev/fd0 related crashes over night. Previously I
seen tons of them, so I guess this bug is fixed.
Please send this upstream.
Thanks

Jiri Kosina

unread,
Jan 28, 2016, 5:43:38 AM1/28/16
to Dmitry Vyukov, Jens Axboe, NeilBrown, Takashi Iwai, Hannes Reinecke, Rasmus Villemoes, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Thu, 28 Jan 2016, Dmitry Vyukov wrote:

> I have not seen any /dev/fd0 related crashes over night. Previously I
> seen tons of them, so I guess this bug is fixed.
> Please send this upstream.
> Thanks

Thanks a lot for testing, Dmitry.

Jens, here is the patch with proper changelog. Could you please apply it
to your tree? Or would you prefer a pull request with this single patch in
it?
Thanks.




From: Jiri Kosina <jko...@suse.cz>
Subject: [PATCH] floppy: fix lock_fdc() signal handling

floppy_revalidate() doesn't perform any error handling on lock_fdc()
result. lock_fdc() might actually be interrupted by a signal (it waits for
fdc becoming non-busy interruptibly). In such case, floppy_revalidate()
proceeds as if it had claimed the lock, but it fact it doesn't.

In case of multiple threads trying to open("/dev/fdX"), this leads to
serious corruptions all over the place, because all of a sudden there is
no critical section protection (that'd otherwise be guaranteed by locked
fd) whatsoever.

While at this, fix the fact that the 'interruptible' parameter to
lock_fdc() doesn't make any sense whatsoever, because we always wait
interruptibly anyway.

Most of the lock_fdc() callsites do properly handle error (and propagate
EINTR), but floppy_revalidate() and floppy_check_events() don't. Fix this.

Spotted by 'syzkaller' tool.

Reported-by: Dmitry Vyukov <dvy...@google.com>
Tested-by: Dmitry Vyukov <dvy...@google.com>
Signed-off-by: Jiri Kosina <jko...@suse.cz>
---
drivers/block/floppy.c | 33 ++++++++++++++++++---------------
1 file changed, 18 insertions(+), 15 deletions(-)

Dmitry Vyukov

unread,
Jan 29, 2016, 4:22:29 AM1/29/16
to syzkaller, Jens Axboe, NeilBrown, Takashi Iwai, Hannes Reinecke, Rasmus Villemoes, LKML, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Thu, Jan 28, 2016 at 11:43 AM, Jiri Kosina <ji...@kernel.org> wrote:
> On Thu, 28 Jan 2016, Dmitry Vyukov wrote:
>
>> I have not seen any /dev/fd0 related crashes over night. Previously I
>> seen tons of them, so I guess this bug is fixed.
>> Please send this upstream.
>> Thanks
>
> Thanks a lot for testing, Dmitry.
>
> Jens, here is the patch with proper changelog. Could you please apply it
> to your tree? Or would you prefer a pull request with this single patch in
> it?
> Thanks.


Done
Replaced the old patch with this one.
Thanks for a quick fix
Reply all
Reply to author
Forward
0 new messages