general protection fault in __queue_work (2)

35 views
Skip to first unread message

syzbot

unread,
Mar 6, 2020, 10:55:13 PM3/6/20
to ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following crash on:

HEAD commit: 770fbb32 Add linux-next specific files for 20200228
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=104fdfa1e00000
kernel config: https://syzkaller.appspot.com/x/.config?x=576314276bce4ad5
dashboard link: https://syzkaller.appspot.com/bug?extid=889cc963ed79ee90f74f
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=176d7f29e00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=134c481de00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+889cc9...@syzkaller.appspotmail.com

general protection fault, probably for non-canonical address 0xdffffc0000000030: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000180-0x0000000000000187]
CPU: 0 PID: 9989 Comm: blkid Not tainted 5.6.0-rc3-next-20200228-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__queue_work+0xe8/0x1280 kernel/workqueue.c:1409
Code: c6 00 bf 97 89 4c 89 e7 e8 15 14 48 02 49 8d 86 80 01 00 00 48 89 c2 48 89 44 24 18 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 94 0e 00 00 41 8b 9e 80 01 00
RSP: 0018:ffffc90002457078 EFLAGS: 00010006
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000030 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 0000000000000040 R08: ffff88809e21a180 R09: fffffbfff190c866
R10: fffffbfff190c865 R11: ffffffff8c86432b R12: ffff8880a677e518
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000040
FS: 00007fe98e0e6740(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000bd7058 CR3: 00000000a3381000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
queue_work_on+0x18b/0x200 kernel/workqueue.c:1515
queue_work include/linux/workqueue.h:507 [inline]
loop_queue_work drivers/block/loop.c:984 [inline]
loop_queue_rq+0x5ac/0x1050 drivers/block/loop.c:2038
blk_mq_dispatch_rq_list+0x997/0x17f0 block/blk-mq.c:1246
blk_mq_do_dispatch_sched+0x188/0x3f0 block/blk-mq-sched.c:115
blk_mq_sched_dispatch_requests+0x3cd/0x650 block/blk-mq-sched.c:211
__blk_mq_run_hw_queue+0x1b8/0x2c0 block/blk-mq.c:1382
__blk_mq_delay_run_hw_queue+0x522/0x5e0 block/blk-mq.c:1459
blk_mq_run_hw_queue+0x16c/0x2f0 block/blk-mq.c:1512
blk_mq_sched_insert_requests+0x2d4/0x5f0 block/blk-mq-sched.c:444
blk_mq_flush_plug_list+0x452/0x880 block/blk-mq.c:1758
blk_flush_plug_list+0x2ff/0x460 block/blk-core.c:1772
blk_finish_plug block/blk-core.c:1789 [inline]
blk_finish_plug+0x50/0x97 block/blk-core.c:1785
read_pages+0x125/0x610 mm/readahead.c:142
? 0xffffffff81000000
__do_page_cache_readahead+0x47c/0x570 mm/readahead.c:212
force_page_cache_readahead+0x1dc/0x320 mm/readahead.c:243
page_cache_sync_readahead mm/readahead.c:522 [inline]
page_cache_sync_readahead+0x4b8/0x520 mm/readahead.c:509
generic_file_buffered_read mm/filemap.c:2029 [inline]
generic_file_read_iter+0x1650/0x2a40 mm/filemap.c:2302
blkdev_read_iter+0x11b/0x180 fs/block_dev.c:2039
call_read_iter include/linux/fs.h:1895 [inline]
new_sync_read+0x4a2/0x790 fs/read_write.c:414
__vfs_read+0xc9/0x100 fs/read_write.c:427
vfs_read+0x1ea/0x430 fs/read_write.c:461
ksys_read+0x127/0x250 fs/read_write.c:587
do_syscall_64+0xf6/0x790 arch/x86/entry/common.c:295
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fe98d9ee310
Code: 73 01 c3 48 8b 0d 28 4b 2b 00 31 d2 48 29 c2 64 89 11 48 83 c8 ff eb ea 90 90 83 3d e5 a2 2b 00 00 75 10 b8 00 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 6e 8a 01 00 48 89 04 24
RSP: 002b:00007ffd4d1e1df8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe98d9ee310
RDX: 0000000000000400 RSI: 0000000000bd6c58 RDI: 0000000000000003
RBP: 0000000000bd6c30 R08: 0000000000000028 R09: 0000000001680000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000bd6030
R13: 0000000000000400 R14: 0000000000bd6080 R15: 0000000000bd6c48
Modules linked in:
---[ end trace 0e35dd0f272c5c88 ]---
RIP: 0010:__queue_work+0xe8/0x1280 kernel/workqueue.c:1409
Code: c6 00 bf 97 89 4c 89 e7 e8 15 14 48 02 49 8d 86 80 01 00 00 48 89 c2 48 89 44 24 18 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 94 0e 00 00 41 8b 9e 80 01 00
RSP: 0018:ffffc90002457078 EFLAGS: 00010006
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000030 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 0000000000000040 R08: ffff88809e21a180 R09: fffffbfff190c866
R10: fffffbfff190c865 R11: ffffffff8c86432b R12: ffff8880a677e518
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000040
FS: 00007fe98e0e6740(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000bd7058 CR3: 00000000a3381000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

syzbot

unread,
Mar 7, 2020, 12:55:03 PM3/7/20
to ak...@linux-foundation.org, ax...@kernel.dk, han...@cmpxchg.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, schatzb...@gmail.com, s...@canb.auug.org.au, syzkall...@googlegroups.com
syzbot has bisected this bug to:

commit 29dab2122492f6dbc0b895ca5bd047e166684d1a
Author: Dan Schatzberg <schatzb...@gmail.com>
Date: Tue Feb 25 04:14:07 2020 +0000

loop: use worker per cgroup instead of kworker

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14d6b01de00000
start commit: 770fbb32 Add linux-next specific files for 20200228
git tree: linux-next
final crash: https://syzkaller.appspot.com/x/report.txt?x=16d6b01de00000
console output: https://syzkaller.appspot.com/x/log.txt?x=12d6b01de00000
Reported-by: syzbot+889cc9...@syzkaller.appspotmail.com
Fixes: 29dab2122492 ("loop: use worker per cgroup instead of kworker")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Hillf Danton

unread,
Mar 8, 2020, 5:45:10 AM3/8/20
to syzbot, ak...@linux-foundation.org, ax...@kernel.dk, han...@cmpxchg.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, schatzb...@gmail.com, s...@canb.auug.org.au, syzkall...@googlegroups.com

Sat, 07 Mar 2020 09:55:02 -0800
> syzbot has bisected this bug to:
>
> commit 29dab2122492f6dbc0b895ca5bd047e166684d1a
> Author: Dan Schatzberg <schatzb...@gmail.com>
> Date: Tue Feb 25 04:14:07 2020 +0000
>
> loop: use worker per cgroup instead of kworker
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14d6b01de00000
> start commit: 770fbb32 Add linux-next specific files for 20200228
> git tree: linux-next
> Reported-by: syzbot+889cc9...@syzkaller.appspotmail.com
> Fixes: 29dab2122492 ("loop: use worker per cgroup instead of kworker")
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Enqueue work unless workqueue is destroyed.

--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -890,11 +890,6 @@ static void loop_config_discard(struct l
blk_queue_flag_set(QUEUE_FLAG_DISCARD, q);
}

-static void loop_unprepare_queue(struct loop_device *lo)
-{
- destroy_workqueue(lo->workqueue);
-}
-
static int loop_prepare_queue(struct loop_device *lo)
{
lo->workqueue = alloc_workqueue("loop%d",
@@ -921,7 +916,7 @@ static void loop_workfn(struct work_stru
static void loop_rootcg_workfn(struct work_struct *work);
static void loop_free_idle_workers(struct timer_list *timer);

-static void loop_queue_work(struct loop_device *lo, struct loop_cmd *cmd)
+static int loop_queue_work(struct loop_device *lo, struct loop_cmd *cmd)
{
struct rb_node **node = &(lo->worker_tree.rb_node), *parent = NULL;
struct loop_worker *cur_worker, *worker = NULL;
@@ -930,6 +925,13 @@ static void loop_queue_work(struct loop_

spin_lock_irq(&lo->lo_lock);

+ if (IS_ERR(lo->workqueue)) {
+ /* workqueue is destroyed */
+ int err = PTR_ERR(lo->workqueue);
+ spin_unlock_irq(&lo->lo_lock);
+ return err;
+ }
+
if (!cmd->css)
goto queue_work;

@@ -984,6 +986,7 @@ queue_work:
list_add_tail(&cmd->list_entry, cmd_list);
queue_work(lo->workqueue, work);
spin_unlock_irq(&lo->lo_lock);
+ return 0;
}

static void loop_update_rotational(struct loop_device *lo)
@@ -1208,8 +1211,16 @@ static int __loop_clr_fd(struct loop_dev
*
* 3) unlock, del_timer_sync so if timer raced it will be a no-op
*/
- loop_unprepare_queue(lo);
spin_lock_irq(&lo->lo_lock);
+ do {
+ struct workqueue_struct *wq = lo->workqueue;
+
+ lo->workqueue = ERR_PTR(-EINVAL);
+ spin_unlock_irq(&lo->lo_lock);
+ destroy_workqueue(wq);
+ spin_lock_irq(&lo->lo_lock);
+ } while (0);
+
lo->lo_backing_file = NULL;
list_for_each_entry_safe(worker, pos, &lo->idle_worker_list,
idle_list) {
@@ -2031,7 +2042,9 @@ static blk_status_t loop_queue_rq(struct
} else
#endif
cmd->css = NULL;
- loop_queue_work(lo, cmd);
+
+ if (loop_queue_work(lo, cmd))
+ return BLK_STS_IOERR;

return BLK_STS_OK;
}

Jens Axboe

unread,
Mar 8, 2020, 12:17:37 PM3/8/20
to Hillf Danton, syzbot, ak...@linux-foundation.org, han...@cmpxchg.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, schatzb...@gmail.com, s...@canb.auug.org.au, syzkall...@googlegroups.com
On 3/8/20 3:44 AM, Hillf Danton wrote:
> @@ -1208,8 +1211,16 @@ static int __loop_clr_fd(struct loop_dev
> *
> * 3) unlock, del_timer_sync so if timer raced it will be a no-op
> */
> - loop_unprepare_queue(lo);
> spin_lock_irq(&lo->lo_lock);
> + do {
> + struct workqueue_struct *wq = lo->workqueue;
> +
> + lo->workqueue = ERR_PTR(-EINVAL);
> + spin_unlock_irq(&lo->lo_lock);
> + destroy_workqueue(wq);
> + spin_lock_irq(&lo->lo_lock);
> + } while (0);

This looks highly suspicious, what's the point of this loop?

Also think this series a) might not be fully cooked, and b) really
should have gone through the block tree.

--
Jens Axboe

Hillf Danton

unread,
Mar 8, 2020, 10:09:17 PM3/8/20
to Jens Axboe, syzbot, ak...@linux-foundation.org, han...@cmpxchg.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, schatzb...@gmail.com, s...@canb.auug.org.au, syzkall...@googlegroups.com

On Sun, 8 Mar 2020 10:17:33 -0600 Jens Axboe wrote:
> On 3/8/20 3:44 AM, Hillf Danton wrote:
> > @@ -1208,8 +1211,16 @@ static int __loop_clr_fd(struct loop_dev
> > *
> > * 3) unlock, del_timer_sync so if timer raced it will be a no-op
> > */
> > - loop_unprepare_queue(lo);
> > spin_lock_irq(&lo->lo_lock);
> > + do {
> > + struct workqueue_struct *wq = lo->workqueue;
> > +
> > + lo->workqueue = ERR_PTR(-EINVAL);
> > + spin_unlock_irq(&lo->lo_lock);
> > + destroy_workqueue(wq);
> > + spin_lock_irq(&lo->lo_lock);
> > + } while (0);
>
> This looks highly suspicious, what's the point of this loop?

It is a while(0) loop that just gives me the chance for adding the
transient local variable wq.

> Also think this series a) might not be fully cooked, and b) really
> should have gone through the block tree.

Gavel in your hand, Sir.

Hillf

Jens Axboe

unread,
Mar 8, 2020, 10:17:05 PM3/8/20
to Hillf Danton, syzbot, ak...@linux-foundation.org, han...@cmpxchg.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, schatzb...@gmail.com, s...@canb.auug.org.au, syzkall...@googlegroups.com
On 3/8/20 8:09 PM, Hillf Danton wrote:
>
> On Sun, 8 Mar 2020 10:17:33 -0600 Jens Axboe wrote:
>> On 3/8/20 3:44 AM, Hillf Danton wrote:
>>> @@ -1208,8 +1211,16 @@ static int __loop_clr_fd(struct loop_dev
>>> *
>>> * 3) unlock, del_timer_sync so if timer raced it will be a no-op
>>> */
>>> - loop_unprepare_queue(lo);
>>> spin_lock_irq(&lo->lo_lock);
>>> + do {
>>> + struct workqueue_struct *wq = lo->workqueue;
>>> +
>>> + lo->workqueue = ERR_PTR(-EINVAL);
>>> + spin_unlock_irq(&lo->lo_lock);
>>> + destroy_workqueue(wq);
>>> + spin_lock_irq(&lo->lo_lock);
>>> + } while (0);
>>
>> This looks highly suspicious, what's the point of this loop?
>
> It is a while(0) loop that just gives me the chance for adding the
> transient local variable wq.

I think that adds more confusion than what is necessary, and I don't think
the approach is great to begin with as you now need various checks as well
for the workqueue pointer.

We're freezing the queue right after anyway, which will ensure that
nobody is going to hit an invalid workqueue pointer in terms of queueing.
This looks more like an ordering issue.

>> Also think this series a) might not be fully cooked, and b) really
>> should have gone through the block tree.
>
> Gavel in your hand, Sir.

Andrew, can you please drop this series so we can work out the kinks and
get it properly queued up after?

--
Jens Axboe

syzbot

unread,
Nov 16, 2022, 1:42:27 AM11/16/22
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
No recent activity, existing reproducers are no longer triggering the issue.
Reply all
Reply to author
Forward
0 new messages