general protection fault in lo_ioctl (2)

47 views
Skip to first unread message

syzbot

unread,
May 2, 2018, 3:33:02 AM5/2/18
to ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following crash on:

HEAD commit: fff75eb2a08c Merge tag 'errseq-v4.17' of
git://git.kernel.o...
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?id=5301511529693184
kernel config:
https://syzkaller.appspot.com/x/.config?id=6493557782959164711
dashboard link: https://syzkaller.appspot.com/bug?extid=bf89c128e05dd6c62523
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
userspace arch: i386
syzkaller
repro:https://syzkaller.appspot.com/x/repro.syz?id=5000693362458624

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+bf89c1...@syzkaller.appspotmail.com

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 10637 Comm: syz-executor1 Not tainted 4.17.0-rc3+ #52
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:loop_set_fd drivers/block/loop.c:901 [inline]
RIP: 0010:lo_ioctl+0x1ae8/0x2130 drivers/block/loop.c:1397
RSP: 0018:ffff8801bd6dfc08 EFLAGS: 00010202
RAX: 0000000000000037 RBX: ffff8801d296db00 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: ffffffff84b2cd57 RDI: 00000000000001b8
RBP: ffff8801bd6dfc80 R08: ffff8801c46943c0 R09: ffffed003b5c46c2
R10: 0000000000000003 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: ffff8801b58561a0 R15: ffff8801b58560c0
FS: 0000000000000000(0000) GS:ffff8801dae00000(0063) knlGS:00000000f7f2eb40
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 0000000008138024 CR3: 00000001d03df000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
lo_compat_ioctl+0xc1/0x170 drivers/block/loop.c:1602
compat_blkdev_ioctl+0x3c2/0x1b20 block/compat_ioctl.c:406
__do_compat_sys_ioctl fs/compat_ioctl.c:1461 [inline]
__se_compat_sys_ioctl fs/compat_ioctl.c:1407 [inline]
__ia32_compat_sys_ioctl+0x221/0x640 fs/compat_ioctl.c:1407
do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
RIP: 0023:0xf7f32cb9
RSP: 002b:00000000f7f2e0ac EFLAGS: 00000282 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 0000000000004c00
RDX: 0000000000000003 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000292 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Code: e8 03 80 3c 08 00 0f 85 2c 02 00 00 4d 8b ad f0 00 00 00 48 b9 00 00
00 00 00 fc ff df 49 8d bd b8 01 00 00 48 89 f8 48 c1 e8 03 <80> 3c 08 00
0f 85 f9 01 00 00 4d 8b ad b8 01 00 00 48 b9 00 00
RIP: loop_set_fd drivers/block/loop.c:901 [inline] RSP: ffff8801bd6dfc08
RIP: lo_ioctl+0x1ae8/0x2130 drivers/block/loop.c:1397 RSP: ffff8801bd6dfc08
---[ end trace a15583efe355602c ]---


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is
merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
If you want to test a patch for this bug, please reply with:
#syz test: git://repo/address.git branch
and provide the patch inline or as an attachment.
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug
report.
Note: all commands must start from beginning of the line in the email body.

Dmitry Vyukov

unread,
May 2, 2018, 7:23:45 AM5/2/18
to syzbot, Tetsuo Handa, Jens Axboe, linux...@vger.kernel.org, LKML, syzkaller-bugs
On Wed, May 2, 2018 at 9:33 AM, syzbot
<syzbot+bf89c1...@syzkaller.appspotmail.com> wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: fff75eb2a08c Merge tag 'errseq-v4.17' of
> git://git.kernel.o...
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?id=5301511529693184
> kernel config:
> https://syzkaller.appspot.com/x/.config?id=6493557782959164711
> dashboard link: https://syzkaller.appspot.com/bug?extid=bf89c128e05dd6c62523
> compiler: gcc (GCC) 8.0.1 20180413 (experimental)
> userspace arch: i386
> syzkaller
> repro:https://syzkaller.appspot.com/x/repro.syz?id=5000693362458624

#syz dup: INFO: rcu detected stall in blkdev_ioctl
> --
> You received this message because you are subscribed to the Google Groups
> "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to syzkaller-bug...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/syzkaller-bugs/000000000000cbaaf6056b341859%40google.com.
> For more options, visit https://groups.google.com/d/optout.

Tetsuo Handa

unread,
May 7, 2018, 4:56:30 PM5/7/18
to Dmitry Vyukov, syzbot, Jens Axboe, linux...@vger.kernel.org, LKML, syzkaller-bugs, Theodore Ts'o
On 2018/05/02 20:23, Dmitry Vyukov wrote:
> #syz dup: INFO: rcu detected stall in blkdev_ioctl

The cause of stall turned out to be ioctl(loop_fd, LOOP_CHANGE_FD, loop_fd).

But we haven't explained the cause of NULL pointer dereference which can
occur when raced with ioctl(LOOP_CLR_FD). Therefore,

#syz undup

syzbot

unread,
May 7, 2018, 11:36:01 PM5/7/18
to syzkall...@googlegroups.com, ty...@mit.edu
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger
crash:

Reported-and-tested-by:
syzbot+bf89c1...@syzkaller.appspotmail.com

Tested on:

commit: 170785a9cc72 loop: add recursion validation to LOOP_CHANGE..
git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git/loop-fix
kernel config: https://syzkaller.appspot.com/x/.config?x=5a1dc06635c10d27
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
userspace arch: i386

Note: testing is done by a robot and is best-effort only.

Tetsuo Handa

unread,
May 8, 2018, 7:05:22 AM5/8/18
to syzbot, Jens Axboe, syzkaller-bugs, Theodore Ts'o, Jan Kara, Milan Broz, Dmitry Vyukov, linux...@vger.kernel.org, LKML
Using sleep injection patch and reproducer shown below, I can reproduce
the crashes. It is a race between ioctl(loop_fd, LOOP_CLR_FD, 0) versus
ioctl(other_loop_fd, LOOP_SET_FD, loop_fd).

Unless we hold corresponding lo->lo_ctl_mutex (or keep corresponding
lo->lo_refcnt elevated) when traversing other loop devices,
"/* Avoid recursion */" loop from loop_set_fd()/loop_change_fd() will
suffer from races by loop_clr_fd().

So, it is time to think how to solve this race condition, as well as how to solve
lockdep's deadlock warning (and I guess that syzbot is actually hitting deadlocks).
An approach which serializes loop operations using global lock was proposed at
https://groups.google.com/d/msg/syzkaller-bugs/2Rw8-OM6IbM/PzdobV8kAgAJ .
Please respond...

------------------------------------------------------------
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -909,6 +909,9 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
error = -EINVAL;
goto out_putf;
}
+ pr_err("Start sleeping\n");
+ schedule_timeout_killable(3 * HZ);
+ pr_err("End sleeping\n");
f = l->lo_backing_file;
}

------------------------------------------------------------

------------------------------------------------------------
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <linux/loop.h>
#include <sys/ioctl.h>

int main(int argc, char *argv[])
{
int fd0 = open("/dev/loop0", O_RDONLY);
int fd1 = open("/dev/loop1", O_RDONLY);
int fd2 = open("/tmp/file", O_RDWR | O_CREAT | O_TRUNC, 0600);
ioctl(fd1, LOOP_SET_FD, fd2);
if (fork() == 0) {
sleep(1);
ioctl(fd1, LOOP_CLR_FD, 0);
_exit(0);
}
ioctl(fd0, LOOP_SET_FD, fd1);
return 0;
}
------------------------------------------------------------

------------------------------------------------------------
[ 14.119073] loop: module loaded
[ 17.363610] Start sleeping
[ 20.383442] End sleeping
[ 20.386511] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 20.394779] PGD 13377d067 P4D 13377d067 PUD 131509067 PMD 0
[ 20.400847] Oops: 0000 [#1] SMP
[ 20.403875] Modules linked in: loop
[ 20.406188] CPU: 6 PID: 6470 Comm: a.out Tainted: G T 4.17.0-rc4+ #540
[ 20.411266] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[ 20.418169] RIP: 0010:lo_ioctl+0x7ef/0x840 [loop]
[ 20.421272] RSP: 0018:ffffc90000bbbd88 EFLAGS: 00010282
[ 20.424661] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83679478
[ 20.429271] RDX: ffff8801332e9c00 RSI: 0000000000000086 RDI: 0000000000000286
[ 20.434517] RBP: ffffc90000bbbdd8 R08: 0000000000000638 R09: 0000000000000000
[ 20.436879] R10: 0000000000000190 R11: 0720072007200720 R12: ffff8801314ab118
[ 20.439076] R13: ffff880138deae40 R14: ffff8801311f7780 R15: ffff8801314ab000
[ 20.441144] FS: 00007f0b57743740(0000) GS:ffff88013a780000(0000) knlGS:0000000000000000
[ 20.443588] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 20.445284] CR2: 0000000000000008 CR3: 0000000138efb002 CR4: 00000000000606e0
[ 20.447381] Call Trace:
[ 20.448149] blkdev_ioctl+0x88d/0x950
[ 20.449237] block_ioctl+0x38/0x40
[ 20.450269] do_vfs_ioctl+0xaa/0x650
[ 20.451479] ? handle_mm_fault+0x108/0x250
[ 20.452704] ksys_ioctl+0x70/0x80
[ 20.453737] __x64_sys_ioctl+0x15/0x20
[ 20.454887] do_syscall_64+0x5d/0x100
[ 20.456014] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 20.457519] RIP: 0033:0x7f0b57267107
[ 20.458644] RSP: 002b:00007fff8a0fd698 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 20.460853] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f0b57267107
[ 20.462952] RDX: 0000000000000004 RSI: 0000000000004c00 RDI: 0000000000000003
[ 20.465023] RBP: 0000000000000003 R08: 00007f0b57743740 R09: 0000000000000000
[ 20.467091] R10: 00007f0b57743a10 R11: 0000000000000246 R12: 00000000004005ef
[ 20.469361] R13: 00007fff8a0fd790 R14: 0000000000000000 R15: 0000000000000000
[ 20.471657] Code: a0 48 89 55 d0 e8 e0 5f 1d e1 bf b8 0b 00 00 e8 78 9e 7c e2 48 c7 c7 a9 40 00 a0 e8 ca 5f 1d e1 48 8b 55 d0 48 8b 82 f0 00 00 00 <48> 8b 40 08 48 8b 40 68 48 85 c0 0f 84 15 fd ff ff 0f b7 90 b8
[ 20.477207] RIP: lo_ioctl+0x7ef/0x840 [loop] RSP: ffffc90000bbbd88
[ 20.479027] CR2: 0000000000000008
[ 20.480063] ---[ end trace 925bc1b992d96cb3 ]---
[ 20.481441] Kernel panic - not syncing: Fatal exception
[ 20.483119] Kernel Offset: disabled
[ 20.489564] Rebooting in 86400 seconds..
------------------------------------------------------------



Theodore Y. Ts'o

unread,
May 8, 2018, 8:37:38 AM5/8/18
to Tetsuo Handa, syzbot, Jens Axboe, syzkaller-bugs, Jan Kara, Milan Broz, Dmitry Vyukov, linux...@vger.kernel.org, LKML
On Tue, May 08, 2018 at 08:05:12PM +0900, Tetsuo Handa wrote:
>
> So, it is time to think how to solve this race condition, as well as how to solve
> lockdep's deadlock warning (and I guess that syzbot is actually hitting deadlocks).
> An approach which serializes loop operations using global lock was proposed at
> https://groups.google.com/d/msg/syzkaller-bugs/2Rw8-OM6IbM/PzdobV8kAgAJ .
> Please respond...

I'm looking at your patch which you proposed on this, and the locking
architecture still looks way too complex. Things like
loop_mutex_owner, and all of the infrastructure around
lo->ioctl_in_progress should be removed, if at all possible.

I believe it should be possible to do things with a single global
mutex, some code refactoring, and some unlocked versions of some of
the functions.

Again, this requires root, and it requires someone deliberately trying
to induce a race. So "it's time" is not necessarily the priority I
would set for this item. But if we are going to fix it, let's fix it
right, and not make the code more complex and less maintainable, all
in the name of trying to make a rare, not-likely-to-happen-in-real-life
syzbot reported problem to go away.

Cheers,

- Ted

Tetsuo Handa

unread,
May 8, 2018, 5:06:58 PM5/8/18
to ty...@mit.edu, syzbot+bf89c1...@syzkaller.appspotmail.com, ax...@kernel.dk, syzkall...@googlegroups.com, ja...@suse.cz, gmaz...@gmail.com, dvy...@google.com, linux...@vger.kernel.org, linux-...@vger.kernel.org
Theodore Y. Ts'o wrote:
> On Tue, May 08, 2018 at 08:05:12PM +0900, Tetsuo Handa wrote:
> >
> > So, it is time to think how to solve this race condition, as well as how to solve
> > lockdep's deadlock warning (and I guess that syzbot is actually hitting deadlocks).
> > An approach which serializes loop operations using global lock was proposed at
> > https://groups.google.com/d/msg/syzkaller-bugs/2Rw8-OM6IbM/PzdobV8kAgAJ .
> > Please respond...
>
> I'm looking at your patch which you proposed on this, and the locking
> architecture still looks way too complex. Things like
> loop_mutex_owner, and all of the infrastructure around
> lo->ioctl_in_progress should be removed, if at all possible.

The patch in the above link no longer uses "lo->ioctl_in_progress".
You looked at previous version rather than current version.

>
> I believe it should be possible to do things with a single global
> mutex, some code refactoring, and some unlocked versions of some of
> the functions.

The patch in the above link uses single global mutex "loop_mutex".

>
> Again, this requires root, and it requires someone deliberately trying
> to induce a race. So "it's time" is not necessarily the priority I
> would set for this item. But if we are going to fix it, let's fix it
> right, and not make the code more complex and less maintainable, all
> in the name of trying to make a rare, not-likely-to-happen-in-real-life
> syzbot reported problem to go away.

While NULL pointer dereference would be rare, deadlocks might not be rare
enough to postpone the patch. Deadlocks can cause pile of false-positive
hung task reports and can prevent syzbot from finding other bugs. That's
why I say "it is time to think".
Reply all
Reply to author
Forward
0 new messages