general protection fault in loop_validate_file (2)

15 views
Skip to first unread message

syzbot

unread,
Mar 17, 2019, 11:36:05 PM3/17/19
to ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following crash on:

HEAD commit: 9c7dc824 Merge tag '5.1-rc-smb3' of git://git.samba.org/sf..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=148a35fb200000
kernel config: https://syzkaller.appspot.com/x/.config?x=7e1aaa1cfbfe1abf
dashboard link: https://syzkaller.appspot.com/bug?extid=9bdc1adc1c55e7fe765b
compiler: gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+9bdc1a...@syzkaller.appspotmail.com

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 9499 Comm: syz-executor.4 Not tainted 5.0.0+ #25
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:loop_validate_file+0x23e/0x310 drivers/block/loop.c:652
Code: 00 48 89 f8 48 c1 e8 03 42 80 3c 38 00 0f 85 d4 00 00 00 4d 8b a4 24
f0 00 00 00 49 8d bc 24 b8 01 00 00 48 89 f8 48 c1 e8 03 <42> 80 3c 38 00
0f 85 a8 00 00 00 4d 8b a4 24 b8 01 00 00 4c 89 e0
RSP: 0018:ffff88804d2bfb18 EFLAGS: 00010202
RAX: 0000000000000037 RBX: ffff888095ffa3d8 RCX: ffffc9000e657000
RDX: 0000000000000077 RSI: ffffffff83e754ed RDI: 00000000000001b8
RBP: ffff88804d2bfb40 R08: ffff88808eab41c0 R09: fffffbfff11d981d
R10: ffff88804d2bfb40 R11: ffffffff88ecc0e7 R12: 0000000000000000
R13: 0000000000000002 R14: ffff888095ffa800 R15: dffffc0000000000
FS: 00007f04b0c91700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000625208 CR3: 00000000a7af8000 CR4: 00000000001406e0
Call Trace:
loop_set_fd drivers/block/loop.c:930 [inline]
lo_ioctl+0x99d/0x2150 drivers/block/loop.c:1542
__blkdev_driver_ioctl block/ioctl.c:303 [inline]
blkdev_ioctl+0xee8/0x1c40 block/ioctl.c:605
block_ioctl+0xee/0x130 fs/block_dev.c:1931
vfs_ioctl fs/ioctl.c:46 [inline]
file_ioctl fs/ioctl.c:509 [inline]
do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
__do_sys_ioctl fs/ioctl.c:720 [inline]
__se_sys_ioctl fs/ioctl.c:718 [inline]
__x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x458079
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f04b0c90c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458079
RDX: 0000000000000004 RSI: 0000000000004c00 RDI: 0000000000000003
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f04b0c916d4
R13: 00000000004c1252 R14: 00000000004d3160 R15: 00000000ffffffff
Modules linked in:
---[ end trace 81b29486bae7280a ]---
RIP: 0010:loop_validate_file+0x23e/0x310 drivers/block/loop.c:652
Code: 00 48 89 f8 48 c1 e8 03 42 80 3c 38 00 0f 85 d4 00 00 00 4d 8b a4 24
f0 00 00 00 49 8d bc 24 b8 01 00 00 48 89 f8 48 c1 e8 03 <42> 80 3c 38 00
0f 85 a8 00 00 00 4d 8b a4 24 b8 01 00 00 4c 89 e0
RSP: 0018:ffff88804d2bfb18 EFLAGS: 00010202
RAX: 0000000000000037 RBX: ffff888095ffa3d8 RCX: ffffc9000e657000
RDX: 0000000000000077 RSI: ffffffff83e754ed RDI: 00000000000001b8
RBP: ffff88804d2bfb40 R08: ffff88808eab41c0 R09: fffffbfff11d981d
R10: ffff88804d2bfb40 R11: ffffffff88ecc0e7 R12: 0000000000000000
R13: 0000000000000002 R14: ffff888095ffa800 R15: dffffc0000000000
FS: 00007f04b0c91700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000625208 CR3: 00000000a7af8000 CR4: 00000000001406e0


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.

Dongli Zhang

unread,
Mar 18, 2019, 2:48:10 AM3/18/19
to linux...@vger.kernel.org, ja...@suse.cz, syzbot, ax...@kernel.dk, linux-...@vger.kernel.org, syzkall...@googlegroups.com, Tetsuo Handa
As discussed with Tetsuo in other email thread, the issue can hit when the
backend of the loop is another loop device, e.g., loop0's backend is a file,
while loop1's backend is loop0.


loop0's backend is file loop1's backend is loop0

__loop_clr_fd
mutex_lock(&loop_ctl_mutex);
lo->lo_backing_file = NULL;
mutex_unlock(&loop_ctl_mutex);
loop_set_fd()
mutex_lock_killable(&loop_ctl_mutex);
loop_validate_file
f = l->lo_backing_file; --> access if
loop0 is not Lo_unbound
mutex_lock(&loop_ctl_mutex);
lo->lo_flags = 0;
mutex_unlock(&loop_ctl_mutex);



Here I post below for more feedback. We should use a loop device as backend only
when it is still bound.

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 1e6edd5..bf1c61c 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -656,7 +656,7 @@ static int loop_validate_file(struct file *file, struct
block_device *bdev)
return -EBADF;

l = f->f_mapping->host->i_bdev->bd_disk->private_data;
- if (l->lo_state == Lo_unbound) {
+ if (l->lo_state != Lo_bound) {
return -EINVAL;
}
f = l->lo_backing_file;


Thanks Tetsuo for the discussion and feedback!

Dongli Zhang

Jan Kara

unread,
Mar 18, 2019, 6:44:11 AM3/18/19
to Dongli Zhang, linux...@vger.kernel.org, ja...@suse.cz, syzbot, ax...@kernel.dk, linux-...@vger.kernel.org, syzkall...@googlegroups.com, Tetsuo Handa
Yes, this is the right fix. Thanks for writing it! In fact, the problem has
been introduced already in commit 7ccd0791d985 "loop: Push loop_ctl_mutex
down into loop_clr_fd()" after which loop_validate_file() could see devices
in Lo_rundown state with which it didn't count. It was harmless at that
point but still. So when submitting the fix, please include there:

Fixes: 7ccd0791d985 "loop: Push loop_ctl_mutex down into loop_clr_fd()"

And also appropriate Reported-by for syzbot. Something like:

Reported-by: syzbot+9bdc1a...@syzkaller.appspotmail.com

You can also add my:

Reviewed-by: Jan Kara <ja...@suse.cz>

Thanks!

Honza

>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 1e6edd5..bf1c61c 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -656,7 +656,7 @@ static int loop_validate_file(struct file *file, struct
> block_device *bdev)
> return -EBADF;
>
> l = f->f_mapping->host->i_bdev->bd_disk->private_data;
> - if (l->lo_state == Lo_unbound) {
> + if (l->lo_state != Lo_bound) {
> return -EINVAL;
> }
> f = l->lo_backing_file;
>
>
> Thanks Tetsuo for the discussion and feedback!
>
> Dongli Zhang
--
Jan Kara <ja...@suse.com>
SUSE Labs, CR

Dongli Zhang

unread,
Mar 18, 2019, 8:27:28 AM3/18/19
to Jan Kara, syzbot, ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, penguin...@i-love.sakura.ne.jp
Hi Jan,

Indeed there is another issue implicitly reported by below console output from
syzkaller:

[ 245.524424][ T9455] loop_reread_partitions: partition scan of loop0 () failed
(rc=-13)
[ 245.563340][ T9499] kasan: CONFIG_KASAN_INLINE enabled
[ 245.576412][ T9489] __loop_clr_fd: partition scan of loop0 failed (rc=-13)
[ 245.581275][ T9499] kasan: GPF could be caused by NULL-ptr deref or user
memory access
[ 245.602596][ T9499] general protection fault: 0000 [#1] PREEMPT SMP KASAN

I think rc=-13 is because of below code at line 168:

162 int __blkdev_reread_part(struct block_device *bdev)
163 {
164 struct gendisk *disk = bdev->bd_disk;
165
166 if (!disk_part_scan_enabled(disk) || bdev != bdev->bd_contains)
167 return -EINVAL;
168 if (!capable(CAP_SYS_ADMIN))
169 return -EACCES;
170
171 lockdep_assert_held(&bdev->bd_mutex);
172
173 return rescan_partitions(disk, bdev);
174 }
175 EXPORT_SYMBOL(__blkdev_reread_part);

I can reproduce this by 'chown username /dev/loop0' on my test machine.

Taking 'losetup -d /dev/loop0' as sample, as /dev/loop0 belongs to my username,
I am able to detach the loop without 'su'.

However, because of above line 168, the partition scan would fail.

Should we always assume the user should have admin privilege to detach the loop
and this is not a bug?

Thank you very much!

Dongli Zhang

Jan Kara

unread,
Mar 19, 2019, 5:41:38 AM3/19/19
to Dongli Zhang, Jan Kara, syzbot, ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, penguin...@i-love.sakura.ne.jp
Firstly, __blkdev_reread_part() is used for all devices so we have to be
*very* careful when we relax the permission checks there.

Secondly, I'm not convinced it is always safe to allow non-priviledged user
to force repartitioning of a device. That exposes the whole partition table
parsing code to non-priviledged user and thus increases possible attack
surface.

But in this specific case we call __blkdev_reread_part() only to tear down
partitions. So in this specific case calling it by unpriviledged user is
fine plus leaving stale partitions behind is certainly not nice. What we
could do is:

Export drop_partitions() functionality as a function like
__blkdev_drop_part() that would call drop_partitions(bdev->bd_disk, bdev).
It would expect (and assert) that &bdev->bd_mutex is already locked. It
should probably also replicate the check:

if (!disk_part_scan_enabled(disk) || bdev != bdev->bd_contains)
return -EINVAL;

from __blkdev_reread_part(). And we can call this from __loop_clr_fd().

Then we can just unexport __blkdev_reread_part() (since only loop.c is
using it) and fold it inside blkdev_reread_part().

What do you think?

Honza

syzbot

unread,
Mar 19, 2019, 7:32:11 AM3/19/19
to Tetsuo Handa, penguin...@i-love.sakura.ne.jp, syzkall...@googlegroups.com
> Commit f7c8a4120eedf24c in axboe/linux-block.git#for-linus .

> #syz fix: loop: access lo_backing_file only when the loop device is
> Lo_bound

Your 'fix:' command is accepted, but please keep
syzkall...@googlegroups.com mailing list in CC next time. It serves as
a history of what happened with each bug report. Thank you.


Dongli Zhang

unread,
Mar 20, 2019, 12:17:38 AM3/20/19
to Jan Kara, syzbot, ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, penguin...@i-love.sakura.ne.jp
The idea is to duplicate a new caller of drop_partitions() which is specifically
used by loop. I think it works. It is safe as well as the functionality is
limited within loop.

However, perhaps we should not regard this as a bug? Below is the sample to
reproduce this issue:

$ dd if=/dev/zero of=tmp.raw bs=1M count=100 oflag=direct

$ parted tmp.raw --script mklabel msdos mkpart primary 0% 50% mkpart primary 50%
100%

$ sudo chown zhang /dev/loop0

$ losetup -P /dev/loop0 tmp.raw

$ ls -l /dev | grep loop0
brw-rw---- 1 zhang disk 7, 0 Mar 20 11:42 loop0 --> user (zhang)
brw-rw---- 1 root disk 259, 0 Mar 20 11:42 loop0p1 --> root
brw-rw---- 1 root disk 259, 1 Mar 20 11:42 loop0p2 --> root

$ losetup -d /dev/loop

From above case, /dev/loop0 is owned by user (zhang), while the partitions are
owned by root.

Should the detach of loop owned by user also unmaps all related partitions owned
by root?

Perhaps we should assume the '-P' option is always used by root?


This is similar to the fact that the administrator should always use '-P' in
order to enforce partscan when the loop is detached. Otherwise, the partitions
belong to the loop would remain after 'losetup -d'.


Another example is from kpartx as below. If the partitions are mapped via
'kpartx -av', the user should always run 'kpartx -d' to remove all partition
mappings.

Otherwise, there would be dm-0 and dm-1 left.

$ sudo kpartx -av tmp.raw
add map loop0p1 (253:0): 0 102400 linear 7:0 1
add map loop0p2 (253:1): 0 102399 linear 7:0 102401

$ ls -l /dev/mapper/
total 0
crw------- 1 root root 10, 236 3月 20 11:16 control
lrwxrwxrwx 1 root root 7 3月 20 11:21 loop0p1 -> ../dm-0
lrwxrwxrwx 1 root root 7 3月 20 11:21 loop0p2 -> ../dm-1

$ sudo losetup -d /dev/loop0

$ ls -l /dev/mapper/
total 0
crw------- 1 root root 10, 236 3月 20 11:16 control
lrwxrwxrwx 1 root root 7 3月 20 11:21 loop0p1 -> ../dm-0
lrwxrwxrwx 1 root root 7 3月 20 11:21 loop0p2 -> ../dm-1

Dongli Zhang

Jan Kara

unread,
Mar 20, 2019, 9:44:01 AM3/20/19
to Dongli Zhang, Jan Kara, syzbot, ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, penguin...@i-love.sakura.ne.jp
Yes, that's another option. And since I'm not aware of any user reports of
the problem you've found I'd just say that non-root users don't use loop
devices with partitions.

> This is similar to the fact that the administrator should always use '-P'
> in order to enforce partscan when the loop is detached. Otherwise, the
> partitions belong to the loop would remain after 'losetup -d'.

Yeah. That seems like another slight catch for the users. Honestly, I don't
feel either of these issues is serious enough that I'd go and fix them. But
if someone wanted to spend the time with them, I won't object :).

Honza
Reply all
Reply to author
Forward
0 new messages