[syzbot] upstream test error: BUG: sleeping function called from invalid context in stack_depot_save

syzbot

Jul 1, 2021, 7:00:20 AM
to ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: dbe69e43 Merge tag 'net-next-5.14' of git://git.kernel.org..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1216d478300000
kernel config: https://syzkaller.appspot.com/x/.config?x=47e4697be2f5b985
dashboard link: https://syzkaller.appspot.com/bug?extid=e45919db2eab5e837646

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+e45919...@syzkaller.appspotmail.com

BUG: sleeping function called from invalid context at mm/page_alloc.c:5179
in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 8436, name: syz-fuzzer
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff814406db>] copy_process+0x1e1b/0x74c0 kernel/fork.c:2061
softirqs last enabled at (0): [<ffffffff8144071c>] copy_process+0x1e5c/0x74c0 kernel/fork.c:2065
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 1 PID: 8436 Comm: syz-fuzzer Tainted: G W 5.13.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:96
___might_sleep.cold+0x1f1/0x237 kernel/sched/core.c:9153
prepare_alloc_pages+0x3da/0x580 mm/page_alloc.c:5179
__alloc_pages+0x12f/0x500 mm/page_alloc.c:5375
alloc_pages+0x18c/0x2a0 mm/mempolicy.c:2272
stack_depot_save+0x39d/0x4e0 lib/stackdepot.c:303
save_stack+0x15e/0x1e0 mm/page_owner.c:120
__set_page_owner+0x50/0x290 mm/page_owner.c:181
prep_new_page mm/page_alloc.c:2445 [inline]
__alloc_pages_bulk+0x8b9/0x1870 mm/page_alloc.c:5313
alloc_pages_bulk_array_node include/linux/gfp.h:557 [inline]
vm_area_alloc_pages mm/vmalloc.c:2775 [inline]
__vmalloc_area_node mm/vmalloc.c:2845 [inline]
__vmalloc_node_range+0x39d/0x960 mm/vmalloc.c:2947
__vmalloc_node mm/vmalloc.c:2996 [inline]
vzalloc+0x67/0x80 mm/vmalloc.c:3066
n_tty_open+0x16/0x170 drivers/tty/n_tty.c:1914
tty_ldisc_open+0x9b/0x110 drivers/tty/tty_ldisc.c:464
tty_ldisc_setup+0x43/0x100 drivers/tty/tty_ldisc.c:781
tty_init_dev.part.0+0x1f4/0x610 drivers/tty/tty_io.c:1461
tty_init_dev include/linux/err.h:36 [inline]
tty_open_by_driver drivers/tty/tty_io.c:2102 [inline]
tty_open+0xb16/0x1000 drivers/tty/tty_io.c:2150
chrdev_open+0x266/0x770 fs/char_dev.c:414
do_dentry_open+0x4c8/0x11c0 fs/open.c:826
do_open fs/namei.c:3361 [inline]
path_openat+0x1c0e/0x27e0 fs/namei.c:3494
do_filp_open+0x190/0x3d0 fs/namei.c:3521
do_sys_openat2+0x16d/0x420 fs/open.c:1195
do_sys_open fs/open.c:1211 [inline]
__do_sys_openat fs/open.c:1227 [inline]
__se_sys_openat fs/open.c:1222 [inline]
__x64_sys_openat+0x13f/0x1f0 fs/open.c:1222
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4af20a
Code: e8 3b 82 fb ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 4c 8b 54 24 28 4c 8b 44 24 30 4c 8b 4c 24 38 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 40 ff ff ff ff 48 c7 44 24 48
RSP: 002b:000000c0003293f8 EFLAGS: 00000216 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 000000c00001e800 RCX: 00000000004af20a
RDX: 0000000000000000 RSI: 000000c0001a5a50 RDI: ffffffffffffff9c
RBP: 000000c000329470 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000216 R12: 00000000000001a6
R13: 00000000000001a5 R14: 0000000000000200 R15: 000000c00029c280
can: request_module (can-proto-0) failed.
can: request_module (can-proto-0) failed.
can: request_module (can-proto-0) failed.


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Dmitry Vyukov

Jul 1, 2021, 7:10:49 AM
to syzbot, kasan-dev, ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com
On Thu, Jul 1, 2021 at 1:00 PM syzbot
<syzbot+e45919...@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: dbe69e43 Merge tag 'net-next-5.14' of git://git.kernel.org..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1216d478300000
> kernel config: https://syzkaller.appspot.com/x/.config?x=47e4697be2f5b985
> dashboard link: https://syzkaller.appspot.com/bug?extid=e45919db2eab5e837646
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+e45919...@syzkaller.appspotmail.com

+kasan-dev@ for the stack_depot_save warning

Hillf Danton

Jul 3, 2021, 12:13:10 AM
to Dmitry Vyukov, syzbot, Mel Gorman, kasan-dev, ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com
On Thu, 1 Jul 2021 13:10:37 +0200 Dmitry Vyukov wrote:
One quick fix is to move the preparation of new pages out of the section that
holds the local lock (with IRQs disabled), if it is difficult to change how the
stack is saved.

+++ x/mm/page_alloc.c
@@ -5231,6 +5231,7 @@ unsigned long __alloc_pages_bulk(gfp_t g
 	gfp_t alloc_gfp;
 	unsigned int alloc_flags = ALLOC_WMARK_LOW;
 	int nr_populated = 0, nr_account = 0;
+	LIST_HEAD(head);

 	if (unlikely(nr_pages <= 0))
 		return 0;
@@ -5308,17 +5309,29 @@ unsigned long __alloc_pages_bulk(gfp_t g
 			break;
 		}
 		nr_account++;
-
-		prep_new_page(page, 0, gfp, 0);
-		if (page_list)
-			list_add(&page->lru, page_list);
-		else
-			page_array[nr_populated] = page;
+		list_add(&page->lru, &head);
 		nr_populated++;
 	}

 	local_unlock_irqrestore(&pagesets.lock, flags);

+	list_for_each_entry(page, &head, lru)
+		prep_new_page(page, 0, gfp, 0);
+
+	if (page_list)
+		list_splice(&head, page_list);
+	else {
+		int i;
+
+		for (i = 0; i < nr_pages && !list_empty(&head); i++) {
+			/* Skip existing pages */
+			if (page_array[i])
+				continue;
+			page = list_first_entry(&head, struct page, lru);
+			list_del_init(&page->lru);
+			page_array[i] = page;
+		}
+	}
 	__count_zid_vm_events(PGALLOC, zone_idx(zone), nr_account);
 	zone_statistics(ac.preferred_zoneref->zone, zone, nr_account);
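
For readers who want to see the reordering in isolation, here is a minimal user-space sketch of the idea in the diff above. It is not kernel code: fake_page, irqs_off, sleep_violations, prep_page and both bulk_prep_* functions are hypothetical stand-ins, and a plain flag models "local lock held with IRQs disabled". The first function prepares each page inside the locked section, the second only collects pages there and prepares them after unlocking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are kernel APIs. */
struct fake_page {
	struct fake_page *next;	/* plays the role of page->lru */
	bool prepared;
};

static bool irqs_off;		/* models "pagesets.lock held, IRQs disabled" */
static int sleep_violations;	/* calls that would trip a might_sleep() check */

/* Models prep_new_page(): with page_owner enabled it may allocate and sleep. */
void prep_page(struct fake_page *page)
{
	if (irqs_off)
		sleep_violations++;
	page->prepared = true;
}

/* Original ordering: every page is prepared inside the locked section. */
size_t bulk_prep_inside(struct fake_page *pool, size_t n, struct fake_page **out)
{
	size_t i;

	irqs_off = true;
	for (i = 0; i < n; i++) {
		prep_page(&pool[i]);
		out[i] = &pool[i];
	}
	irqs_off = false;
	return n;
}

/* Deferred ordering: collect under the lock, prepare after unlocking. */
size_t bulk_prep_deferred(struct fake_page *pool, size_t n, struct fake_page **out)
{
	struct fake_page *head = NULL, *page;
	size_t i, filled = 0;

	irqs_off = true;
	for (i = 0; i < n; i++) {
		/* Push onto a private list, like list_add(&page->lru, &head). */
		pool[i].next = head;
		head = &pool[i];
	}
	irqs_off = false;

	/* The lock is dropped, so a sleeping prep step is now harmless. */
	for (page = head; page; page = page->next)
		prep_page(page);

	/* Hand the prepared pages to the caller and empty the private list. */
	while (head) {
		page = head;
		head = head->next;
		page->next = NULL;
		out[filled++] = page;
	}
	return filled;
}
```

Calling bulk_prep_inside() on a pool of four pages bumps sleep_violations four times, while bulk_prep_deferred() on the same pool leaves it untouched. The cost is a second pass over the collected pages.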

Desmond Cheong Zhi Xi

Jul 13, 2021, 5:51:25 AM
to Hillf Danton, Dmitry Vyukov, Mel Gorman, syzbot, kasan-dev, ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com
I believe this particular bug should be fixed by Mel Gorman's patch that
was added into Andrew Morton's -mm tree (mm/page_alloc: Avoid page
allocator recursion with pagesets.lock held):
https://lore.kernel.org/lkml/2021070808...@techsingularity.net/

With the patch, we avoid recursing into stack_depot_save while holding
onto the local lock.
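
A rough user-space sketch of one way to avoid that recursion, assuming the approach is to skip the bulk path whenever page-owner tracking is enabled and fall back to allocating one page at a time with no lock held. That reading is an assumption here, and every name below (owner_tracking, lock_held, record_owner, alloc_one, alloc_bulk) is hypothetical; the linked patch is the authority on what the kernel actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state; not kernel symbols. */
static bool owner_tracking;		/* models "page_owner is enabled" */
static bool lock_held;			/* models pagesets.lock held, IRQs off */
static int recursions_under_lock;	/* owner recordings made under the lock */

/* Models save_stack(): may call back into the allocator and sleep. */
void record_owner(void)
{
	if (lock_held)
		recursions_under_lock++;
}

/* Single-object path: no lock is held while the owner is recorded. */
bool alloc_one(int *slot)
{
	*slot = 1;
	if (owner_tracking)
		record_owner();
	return true;
}

/*
 * Bulk path. With fallback_when_tracking set, it refuses to run the
 * locked loop while tracking is on and allocates one object at a time.
 */
size_t alloc_bulk(int *slots, size_t n, bool fallback_when_tracking)
{
	size_t i, done = 0;

	if (fallback_when_tracking && owner_tracking) {
		for (i = 0; i < n; i++)
			if (alloc_one(&slots[i]))
				done++;
		return done;
	}

	lock_held = true;
	for (i = 0; i < n; i++) {
		slots[i] = 1;
		if (owner_tracking)
			record_owner();	/* the problematic call under the lock */
		done++;
	}
	lock_held = false;
	return done;
}
```

With owner_tracking set, alloc_bulk(slots, 3, false) records three recursions under the lock and alloc_bulk(slots, 3, true) records none. The trade-off in this model is that the batching benefit is lost whenever tracking is on.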

Best wishes,
Desmond

Hillf Danton

Jul 13, 2021, 6:18:10 AM
to Desmond Cheong Zhi Xi, Dmitry Vyukov, Mel Gorman, syzbot, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com
On Tue, 13 Jul 2021 17:51:19 +0800 Desmond Cheong Zhi Xi wrote:
>
>I believe this particular bug should be fixed by Mel Gorman's patch that
>was added into Andrew Morton's -mm tree (mm/page_alloc: Avoid page
>allocator recursion with pagesets.lock held):
>https://lore.kernel.org/lkml/2021070808...@techsingularity.net/
>
>With the patch, we avoid recursing into stack_depot_save while holding
>onto the local lock.

You are right. Different fixes of different tastes.

Thanks for taking a look at it.

Hillf

Desmond Cheong Zhi Xi

Jul 28, 2021, 9:45:02 AM
to syzbot, syzkall...@googlegroups.com
#syz fix: mm/page_alloc: avoid page allocator recursion with pagesets.lock held

The issue arises from recursing into stack_depot_save while holding onto
the local lock. This is fixed by the patch for the bug reported in [1].

Link: https://syzkaller.appspot.com/bug?extid=127fd7828d6eeb611703 [1]

Best,
Desmond

Dan Carpenter

Jul 28, 2021, 10:07:52 AM
to Desmond Cheong Zhi Xi, syzbot, syzkall...@googlegroups.com
On Wed, Jul 28, 2021 at 09:44:58PM +0800, Desmond Cheong Zhi Xi wrote:
>
> #syz fix: mm/page_alloc: avoid page allocator recursion with pagesets.lock held

I kind of wish the syz fix took a standard commit format with a git
hash so it was easier to review the fix.

#syz fix: 187ad460b841 ("mm/page_alloc: avoid page allocator recursion with pagesets.lock held")

regards,
dan carpenter

syzbot

Oct 26, 2021, 10:08:16 AM
to ak...@linux-foundation.org, dan.ca...@oracle.com, desmond...@gmail.com, dvy...@google.com, hda...@sina.com, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, mgo...@techsingularity.net, syzkall...@googlegroups.com, tonymaris...@yandex.com
This bug is marked as fixed by commit:
187ad460b841 ("mm/page_alloc: avoid page allocator recursion with pagesets.lock held")
But I can't find it in any tested tree for more than 90 days.
Is it a correct commit? Please update it by replying:
#syz fix: exact-commit-title
Until then the bug is still considered open and
new crashes with the same signature are ignored.

Marco Elver

Oct 26, 2021, 10:28:08 AM
to syzbot, ak...@linux-foundation.org, dan.ca...@oracle.com, desmond...@gmail.com, dvy...@google.com, hda...@sina.com, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, mgo...@techsingularity.net, syzkall...@googlegroups.com, tonymaris...@yandex.com
#syz fix: mm/page_alloc: avoid page allocator recursion with pagesets.lock held

On Tue, 26 Oct 2021 at 16:08, syzbot
<syzbot+e45919...@syzkaller.appspotmail.com> wrote:
>
> This bug is marked as fixed by commit:
> 187ad460b841 ("mm/page_alloc: avoid page allocator recursion with pagesets.lock held")

Looks like Dan's "#syz fix" made syzbot think that the title is the above.

The reason the commit title alone is preferred is that commits in
trees like -mm don't have stable hashes. If the hash is known to
persist, the alternative format could be useful.
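
To illustrate why the hash-prefixed form ended up recorded as the title, here is a small sketch of a parser that, like the behaviour seen above, takes everything after "#syz fix:" as the commit title. syz_fix_title is a hypothetical helper written for this example, not syzbot's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not syzbot's code: copy everything after
 * "#syz fix:" into out, minus surrounding whitespace. Returns false
 * if the prefix is missing, the title is empty, or out is too small.
 */
bool syz_fix_title(const char *line, char *out, size_t outlen)
{
	static const char prefix[] = "#syz fix:";
	const size_t plen = sizeof(prefix) - 1;
	size_t len;

	if (outlen == 0 || strncmp(line, prefix, plen) != 0)
		return false;

	/* Skip blanks between the command and the title. */
	line += plen;
	while (*line == ' ' || *line == '\t')
		line++;

	/* Trim trailing blanks and line endings. */
	len = strlen(line);
	while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t' ||
			   line[len - 1] == '\n' || line[len - 1] == '\r'))
		len--;

	if (len == 0 || len >= outlen)
		return false;

	memcpy(out, line, len);
	out[len] = '\0';
	return true;
}
```

Fed the title-only command, this yields the bare commit title. Fed the hash-prefixed command, it yields the whole string starting with 187ad460b841 ("mm/page_alloc: ..., which matches no commit title in any tree.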