[syzbot] [block?] WARNING in blkdev_put (2)


syzbot

Feb 24, 2023, 2:25:54 AM
to ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: d2af0fa4bfa4 Add linux-next specific files for 20230220
git tree: linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=170d2ef0c80000
kernel config: https://syzkaller.appspot.com/x/.config?x=594e1a56901fd35d
dashboard link: https://syzkaller.appspot.com/bug?extid=2bcc0d79e548c4f62a59
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1227e837480000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=122d8ca0c80000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/83b78c113e8e/disk-d2af0fa4.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/d59f9b2c9091/vmlinux-d2af0fa4.xz
kernel image: https://storage.googleapis.com/syzbot-assets/2726c16c1d3b/bzImage-d2af0fa4.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+2bcc0d...@syzkaller.appspotmail.com

------------[ cut here ]------------
WARNING: CPU: 1 PID: 5080 at block/bdev.c:845 blkdev_put+0x6ca/0x770 block/bdev.c:845
Modules linked in:
CPU: 1 PID: 5080 Comm: syz-executor158 Not tainted 6.2.0-rc8-next-20230220-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/21/2023
RIP: 0010:blkdev_put+0x6ca/0x770 block/bdev.c:845
Code: 48 8b 3c 24 e8 b7 7c da fd e9 99 fa ff ff e8 8d 7c da fd e9 cf fb ff ff 4c 89 ff e8 80 7c da fd e9 80 fd ff ff e8 e6 ea 88 fd <0f> 0b e9 ef fc ff ff e8 8a 7c da fd e9 f3 fa ff ff 48 8b 3c 24 e8
RSP: 0018:ffffc90003cefc88 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff888144c49600 RCX: 0000000000000000
RDX: ffff88807c2f8000 RSI: ffffffff83fbb8da RDI: 0000000000000005
RBP: ffff888146bc0000 R08: 0000000000000005 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000000 R12: 00000000484e009f
R13: ffff888144c49628 R14: ffff888146bc0460 R15: ffff888144c49ab8
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb645428948 CR3: 000000000c571000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
blkdev_close+0x68/0x80 block/fops.c:507
__fput+0x27c/0xa90 fs/file_table.c:321
task_work_run+0x16f/0x270 kernel/task_work.c:179
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0xb42/0x2b60 kernel/exit.c:869
do_group_exit+0xd4/0x2a0 kernel/exit.c:1019
__do_sys_exit_group kernel/exit.c:1030 [inline]
__se_sys_exit_group kernel/exit.c:1028 [inline]
__x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1028
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fb6453e4639
Code: Unable to access opcode bytes at 0x7fb6453e460f.
RSP: 002b:00007ffcfacb3ec8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007fb645458270 RCX: 00007fb6453e4639
RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffffffffffc0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb645458270
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches
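
For readers following along, the warning site can be modeled in userspace. Going by the patch context quoted later in this thread, block/bdev.c:845 appears to be a check of the form WARN_ON_ONCE(--bdev->bd_holders < 0) in the FMODE_EXCL branch of blkdev_put(). The sketch below mirrors only that counter arithmetic. It is not kernel code: struct model_bdev, model_claim(), model_put_excl() and model_demo() are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the holder count kept in struct block_device. */
struct model_bdev {
	int holders;
};

/* Models a successful exclusive claim: one more holder. */
void model_claim(struct model_bdev *b)
{
	b->holders++;
}

/*
 * Models the exclusive branch of the put path: pre-decrement, then test.
 * Returns true where a check of the form WARN_ON_ONCE(--holders < 0)
 * would fire.
 */
bool model_put_excl(struct model_bdev *b)
{
	return --b->holders < 0;
}

/* Walks through one balanced put and one put without a matching claim. */
void model_demo(void)
{
	struct model_bdev b = { .holders = 0 };
	bool warned;

	model_claim(&b);
	warned = model_put_excl(&b);	/* balanced: 1 -> 0 */
	assert(!warned);

	warned = model_put_excl(&b);	/* no matching claim: 0 -> -1 */
	assert(warned);
}
```

The point of the model is that the check only fires when an exclusive put is not paired with an exclusive claim, which is the kind of imbalance the rest of the thread tracks down.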

Hillf Danton

Mar 2, 2023, 8:59:48 PM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On Thu, 23 Feb 2023 23:25:53 -0800
> HEAD commit: d2af0fa4bfa4 Add linux-next specific files for 20230220
> git tree: linux-next
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=122d8ca0c80000

Check if bd_holder is the correct one.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

--- x/block/bdev.c
+++ y/block/bdev.c
@@ -833,7 +833,7 @@ void blkdev_put(struct block_device *bde
 	mutex_lock(&disk->open_mutex);
 	if (mode & FMODE_EXCL) {
 		struct block_device *whole = bdev_whole(bdev);
-		bool bdev_free;
+		bool bdev_free = false;

 		/*
 		 * Release a claim on the device. The holder fields
@@ -842,6 +842,8 @@ void blkdev_put(struct block_device *bde
 		 */
 		spin_lock(&bdev_lock);

+		if (whole->bd_holder != bd_may_claim)
+			goto unlock;
 		WARN_ON_ONCE(--bdev->bd_holders < 0);
 		WARN_ON_ONCE(--whole->bd_holders < 0);

@@ -850,6 +852,7 @@ void blkdev_put(struct block_device *bde
 		if (!whole->bd_holders)
 			whole->bd_holder = NULL;

+unlock:
 		spin_unlock(&bdev_lock);

 		/*
--

syzbot

Mar 2, 2023, 9:14:35 PM
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
WARNING in blkdev_flush_mapping

------------[ cut here ]------------
WARNING: CPU: 1 PID: 5617 at block/bdev.c:582 blkdev_flush_mapping+0x293/0x310 block/bdev.c:582
Modules linked in:
CPU: 1 PID: 5617 Comm: syz-executor.0 Not tainted 6.2.0-syzkaller-13277-g2eb29d59ddf0-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/16/2023
RIP: 0010:blkdev_flush_mapping+0x293/0x310 block/bdev.c:582
Code: e8 a2 24 6c fd e9 5a ff ff ff e8 18 82 88 fd 48 89 ef 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f e9 32 f8 1e 06 e8 fd 81 88 fd <0f> 0b e9 bc fd ff ff e8 b1 fd d9 fd e9 9a fd ff ff 48 8b 3c 24 e8
RSP: 0018:ffffc90004b0fd10 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
RDX: ffff8880241f9d40 RSI: ffffffff83fc7843 RDI: 0000000000000005
RBP: ffff88801ea51001 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000002 R11: 0000000000000000 R12: 00000000484e009f
R13: ffff88801bd36328 R14: ffff88801bd36300 R15: 0000000000000000
FS: 000055555667d400(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd2b67cf70 CR3: 0000000029003000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
blkdev_put_whole+0xd1/0xf0 block/bdev.c:615
blkdev_put+0x224/0x7e0 block/bdev.c:878
blkdev_close+0x68/0x80 block/fops.c:507
__fput+0x27c/0xa90 fs/file_table.c:321
task_work_run+0x16f/0x270 kernel/task_work.c:179
resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0x23c/0x250 kernel/entry/common.c:203
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f657123dfab
Code: 0f 05 48 3d 00 f0 ff ff 77 45 c3 0f 1f 40 00 48 83 ec 18 89 7c 24 0c e8 63 fc ff ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 89 44 24 0c e8 a1 fc ff ff 8b 44
RSP: 002b:00007ffc3ff573b0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00007f657123dfab
RDX: 00007f6570e00120 RSI: ffffffffffffffff RDI: 0000000000000003
RBP: 00007f65713ad980 R08: 0000000000000000 R09: 00007f6570e00000
R10: 00007f6570e00128 R11: 0000000000000293 R12: 0000000000015b6b
R13: 00007ffc3ff574b0 R14: 00007f65713abf80 R15: 0000000000000032
</TASK>


Tested on:

commit: 2eb29d59 Merge tag 'drm-next-2023-03-03-1' of git://an..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=111f9404c80000
kernel config: https://syzkaller.appspot.com/x/.config?x=cab35c936731a347
dashboard link: https://syzkaller.appspot.com/bug?extid=2bcc0d79e548c4f62a59
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=12380f7f480000

Hillf Danton

Mar 2, 2023, 9:32:45 PM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On Thu, 23 Feb 2023 23:25:53 -0800
> HEAD commit: d2af0fa4bfa4 Add linux-next specific files for 20230220
> git tree: linux-next
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=122d8ca0c80000

Check if bd_holder is the correct one.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

--- x/block/bdev.c
+++ y/block/bdev.c
@@ -833,7 +833,7 @@ void blkdev_put(struct block_device *bde
 	mutex_lock(&disk->open_mutex);
 	if (mode & FMODE_EXCL) {
 		struct block_device *whole = bdev_whole(bdev);
-		bool bdev_free;
+		bool bdev_free = false;

 		/*
 		 * Release a claim on the device. The holder fields
@@ -842,6 +842,11 @@ void blkdev_put(struct block_device *bde
 		 */
 		spin_lock(&bdev_lock);

+		if (whole->bd_holder != bd_may_claim) {
+			bdev->bd_holders = 0;
+			whole->bd_holders = 0;
+			goto unlock;
+		}
 		WARN_ON_ONCE(--bdev->bd_holders < 0);
 		WARN_ON_ONCE(--whole->bd_holders < 0);

@@ -850,6 +855,7 @@ void blkdev_put(struct block_device *bde

syzbot

Mar 2, 2023, 10:03:28 PM
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+2bcc0d...@syzkaller.appspotmail.com

Tested on:

commit: 2eb29d59 Merge tag 'drm-next-2023-03-03-1' of git://an..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=10467122c80000
kernel config: https://syzkaller.appspot.com/x/.config?x=cab35c936731a347
dashboard link: https://syzkaller.appspot.com/bug?extid=2bcc0d79e548c4f62a59
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=10d3ef60c80000

Note: testing is done by a robot and is best-effort only.

Julian Ruess

Mar 6, 2023, 10:33:59 AM
to Alexander Egorenkov, syzbot+2bcc0d...@syzkaller.appspotmail.com, ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, ja...@suse.cz, yuk...@huawei.com, h...@lst.de, Niklas Schnelle, Gerd Bayer
On Thu, 2023-03-02 at 20:33 +0100, Alexander Egorenkov wrote:
>
> Hi,
>
> we are seeing a similar problem on the s390x architecture when partitioning
> an NVMe disk on linux-next.
>
>
>   [   70.403015]  nvme0n1: p1
>   [   70.403197] ------------[ cut here ]------------
>   [   70.403199] WARNING: CPU: 8 PID: 2452 at block/bdev.c:845 blkdev_put+0x280/0x298

...

> The problem appeared about a week ago.
>
> Regards
> Alex

Hi all,

I bisected this to:

commit e5cfefa97bccf956ea0bb6464c1f6c84fd7a8d9f
Author: Yu Kuai <yuk...@huawei.com>
Date: Fri Feb 17 10:22:00 2023 +0800

block: fix scan partition for exclusively open device again

As explained in commit 36369f46e917 ("block: Do not reread partition table
on exclusively open device"), reread partition on the device that is
exclusively opened by someone else is problematic.

This patch will make sure partition scan will only be proceed if current
thread open the device exclusively, or the device is not opened
exclusively, and in the later case, other scanners and exclusive openers
will be blocked temporarily until partition scan is done.

Fixes: 10c70d95c0f2 ("block: remove the bd_openers checks in blk_drop_partitions")
Cc: <sta...@vger.kernel.org>
Suggested-by: Jan Kara <ja...@suse.cz>
Signed-off-by: Yu Kuai <yuk...@huawei.com>
Reviewed-by: Christoph Hellwig <h...@lst.de>
Link: https://lore.kernel.org/r/20230217022200.3...@huaweicloud.com
Signed-off-by: Jens Axboe <ax...@kernel.dk>
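
The rule stated in that commit message can be written as a small decision function. What follows is only a sketch of the policy as described in the text, not the code from the commit: enum scan_decision, scan_policy() and its two boolean inputs are hypothetical names, and the temporary blocking of other openers is reduced to a return value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes of a partition scan request. */
enum scan_decision {
	SCAN_REFUSE,		/* someone else holds the device exclusively */
	SCAN_DIRECT,		/* the caller itself is the exclusive opener */
	SCAN_WITH_TEMP_CLAIM,	/* nobody holds it: block others while scanning */
};

/*
 * caller_excl:   the current thread opened the device exclusively.
 * held_by_other: a different opener holds the device exclusively.
 */
enum scan_decision scan_policy(bool caller_excl, bool held_by_other)
{
	if (caller_excl)
		return SCAN_DIRECT;
	if (held_by_other)
		return SCAN_REFUSE;
	return SCAN_WITH_TEMP_CLAIM;
}

/* Exercises the three cases named in the commit message. */
void scan_policy_demo(void)
{
	assert(scan_policy(true, false) == SCAN_DIRECT);
	assert(scan_policy(false, true) == SCAN_REFUSE);
	assert(scan_policy(false, false) == SCAN_WITH_TEMP_CLAIM);
}
```

In this reading, the third outcome is the "later case" of the commit message, where other scanners and exclusive openers are held off until the scan is done.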



Regards
Julian

--
Julian Ruess
Linux on IBM Z Development
IBM Deutschland Research & Development GmbH


Julian Ruess

Mar 7, 2023, 4:19:44 AM
to Yu Kuai, Alexander Egorenkov, syzbot+2bcc0d...@syzkaller.appspotmail.com, ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, ja...@suse.cz, h...@lst.de, Niklas Schnelle, Gerd Bayer, yukuai (C), jul...@linux.ibm.com
On Tue, 2023-03-07 at 09:42 +0800, Yu Kuai wrote:
> Hi,

>
> On 2023/03/06 23:00, Julian Ruess wrote:
> > On Thu, 2023-03-02 at 20:33 +0100, Alexander Egorenkov wrote:
> > >
> > > Hi,
> > >
> > > we are seeing a similar problem on the s390x architecture when partitioning
> > > an NVMe disk on linux-next.
> > >
> > >
> > >    [   70.403015]  nvme0n1: p1
> > >    [   70.403197] ------------[ cut here ]------------
> > >    [   70.403199] WARNING: CPU: 8 PID: 2452 at block/bdev.c:845 blkdev_put+0x280/0x298
> >
> > ...
> >
> > > The problem appeared about a week ago.
> > >
> > > Regards
> > > Alex
> >
> > Hi all,
> >
> > I bisected this to:
> >
> > commit e5cfefa97bccf956ea0bb6464c1f6c84fd7a8d9f
> > Author: Yu Kuai <yuk...@huawei.com>
> > Date:   Fri Feb 17 10:22:00 2023 +0800
> >
> >      block: fix scan partition for exclusively open device again
>
> Yes, thanks for the report. I figured out that I made a mistake here.
>
> The following patch should fix this problem:
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 3ee5577e1586..02d9cfb9e077 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -385,7 +385,7 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
>  	if (IS_ERR(bdev))
>  		ret =  PTR_ERR(bdev);
>  	else
> -		blkdev_put(bdev, mode);
> +		blkdev_put(bdev, mode & ~FMODE_EXCL);
>
> Thanks,
> Kuai

This patch works for me. Thanks!
@Jens Axboe: Will this be part of the next 6.3-rc?

Regards
Julian

Yu Kuai

Mar 7, 2023, 4:28:29 AM
to Julian Ruess, Alexander Egorenkov, syzbot+2bcc0d...@syzkaller.appspotmail.com, ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, ja...@suse.cz, h...@lst.de, Niklas Schnelle, Gerd Bayer, yukuai (C)
Hi,

On 2023/03/06 23:00, Julian Ruess wrote:
> On Thu, 2023-03-02 at 20:33 +0100, Alexander Egorenkov wrote:
>>
>> Hi,
>>
>> we are seeing a similar problem on the s390x architecture when partitioning
>> an NVMe disk on linux-next.
>>
>>
>>   [   70.403015]  nvme0n1: p1
>>   [   70.403197] ------------[ cut here ]------------
>>   [   70.403199] WARNING: CPU: 8 PID: 2452 at block/bdev.c:845 blkdev_put+0x280/0x298
>
> ...
>
>> The problem appeared about a week ago.
>>
>> Regards
>> Alex
>
> Hi all,
>
> I bisected this to:
>
> commit e5cfefa97bccf956ea0bb6464c1f6c84fd7a8d9f
> Author: Yu Kuai <yuk...@huawei.com>
> Date: Fri Feb 17 10:22:00 2023 +0800
>
> block: fix scan partition for exclusively open device again

Yes, thanks for the report. I figured out that I made a mistake here.

The following patch should fix this problem:

diff --git a/block/genhd.c b/block/genhd.c
index 3ee5577e1586..02d9cfb9e077 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -385,7 +385,7 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
 	if (IS_ERR(bdev))
 		ret =  PTR_ERR(bdev);
 	else
-		blkdev_put(bdev, mode);
+		blkdev_put(bdev, mode & ~FMODE_EXCL);

Thanks,
Kuai
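
To illustrate why the one-line change above matters, here is a userspace model of the open/put pairing. It rests on an assumption that the patch suggests but this thread does not show: that the scan's internal open is done with FMODE_EXCL cleared, so a put that passes the caller's unmasked mode drops a holder reference the open never took. All names (struct model_disk, model_get(), model_put(), model_scan(), model_scan_demo()) and the flag values are hypothetical and chosen only for the example.

```c
#include <assert.h>

/* Illustrative flag bits; treat the values as assumptions, not kernel ABI. */
#define MODEL_FMODE_READ 0x1u
#define MODEL_FMODE_EXCL 0x80u

/* Hypothetical stand-in for the per-device exclusive holder count. */
struct model_disk {
	int holders;
};

/* An open takes a holder reference only when it is exclusive. */
void model_get(struct model_disk *d, unsigned int mode)
{
	if (mode & MODEL_FMODE_EXCL)
		d->holders++;
}

/* A put drops a holder reference only when told the open was exclusive. */
void model_put(struct model_disk *d, unsigned int mode)
{
	if (mode & MODEL_FMODE_EXCL)
		d->holders--;
}

/*
 * Models the scan helper: the internal open clears the exclusive bit.
 * With fixed == 0 the put reuses the caller's mode (the reported bug);
 * with fixed != 0 the put uses the same masked mode as the open.
 * Returns the holder count after the scan.
 */
int model_scan(struct model_disk *d, unsigned int mode, int fixed)
{
	unsigned int open_mode = mode & ~MODEL_FMODE_EXCL;

	model_get(d, open_mode);
	model_put(d, fixed ? open_mode : mode);
	return d->holders;
}

/* The caller already holds the device exclusively, then asks for a scan. */
void model_scan_demo(void)
{
	unsigned int mode = MODEL_FMODE_READ | MODEL_FMODE_EXCL;
	struct model_disk buggy = { .holders = 1 };
	struct model_disk patched = { .holders = 1 };
	int after_buggy = model_scan(&buggy, mode, 0);
	int after_patched = model_scan(&patched, mode, 1);

	assert(after_buggy == 0);	/* the caller's claim was dropped early */
	assert(after_patched == 1);	/* the caller's claim is intact */
}
```

In the unpatched model the caller's own exclusive close then takes the count from 0 to -1, which would match a warning raised from blkdev_close() as in the original report. With the masked put, the count returns to 0 on that close.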