I started suspecting a stack overflow, but I was afraid it might be a
KASAN artifact, as KASAN both increases stack usage and disables vmap
stacks.
However, I was able to reproduce this without KASAN and root-cause it at the same time.
I am on v4.20; the config (basically just defconfig+kvmconfig) is:
https://gist.githubusercontent.com/dvyukov/f8401c8da367088c789bfb953d42d3b3/raw/eac0e85d3db577ba68ec59acf916899b61741ee1/gistfile1.txt
Running the syzkaller program gave me:
Out of memory: Kill process 13971 (syz-executor) score 998 or sacrifice child
Killed process 13971 (syz-executor) total-vm:37512kB, anon-rss:92kB, file-rss:0kB, shmem-rss:0kB
oom_reaper: reaped process 13971 (syz-executor), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Kernel panic - not syncing: corrupted stack end detected inside scheduler
CPU: 3 PID: 2555 Comm: kworker/u12:3 Not tainted 4.20.0-rc7+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Workqueue: writeback wb_workfn (flush-8:0)
Call Trace:
dump_stack+0x1d4/0x2b5 lib/earlycpio.c:120
panic+0x25e/0x49c kernel/cpu.c:617
__schedule+0x1be8/0x21d0
preempt_schedule_common+0x35/0xe0
preempt_schedule+0x23/0x30
___preempt_schedule+0x16/0x18
_raw_spin_unlock_irq+0x75/0x80
mark_work_canceling kernel/workqueue.c:747 [inline]
__flush_work+0x4f5/0x970 kernel/workqueue.c:2996
flush_work+0x17/0x20 kernel/workqueue.c:3059
drain_all_pages+0x418/0x680 mm/page_alloc.c:4570
__alloc_pages_slowpath+0xb76/0x2c10 mm/page_alloc.c:4072
__alloc_pages_nodemask+0xa6c/0xe10 mm/page_alloc.c:5029
cache_grow_begin+0x9d/0x8a0
fallback_alloc+0x204/0x2e0
____cache_alloc_node+0x1cc/0x1f0
slab_alloc_node mm/slub.c:2710 [inline]
slab_alloc mm/slub.c:2752 [inline]
kmem_cache_alloc+0x296/0x720 mm/slub.c:2769
mempool_alloc_slab+0x44/0x60 mm/mempool.c:130
mempool_alloc+0x174/0x4e0 mm/mempool.c:433
bvec_alloc+0x150/0x2d0 block/bio.c:485
bio_alloc_bioset+0x44e/0x650 block/bio.c:1455
ext4_bio_write_page+0xc11/0x1780 fs/ext4/resize.c:76
mpage_add_bh_to_extent fs/ext4/inode.c:2300 [inline]
mpage_submit_page+0x138/0x230 fs/ext4/inode.c:2335
ext4_da_page_release_reservation fs/ext4/inode.c:1651 [inline]
mpage_process_page_bufs+0x429/0x500 fs/ext4/inode.c:3226
mpage_prepare_extent_to_map+0xb2a/0x1640 fs/ext4/inode.c:154
ext4_inode_journal_mode fs/ext4/ext4_jbd2.h:411 [inline]
ext4_should_journal_data fs/ext4/ext4_jbd2.h:427 [inline]
ext4_writepages+0x112c/0x3a20 fs/ext4/inode.c:2190
test_and_set_bit arch/x86/include/asm/bitops.h:220 [inline]
TestSetPageDirty include/linux/page-flags.h:287 [inline]
do_writepages+0xfc/0x170 mm/page-writeback.c:2383
mark_inode_dirty_sync include/linux/fs.h:2124 [inline]
__writeback_single_inode+0x1cd/0x12e0 fs/fs-writeback.c:1372
writeback_sb_inodes+0x6c7/0x1040 fs/fs-writeback.c:1795
__writeback_inodes_wb+0x1a3/0x310 fs/fs-writeback.c:1704
wb_writeback+0x92c/0xe10 include/trace/events/writeback.h:572
syz-executor invoked oom-killer:
gfp_mask=0x7080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), nodemask=(null), order=3, oom_score_adj=0
syz-executor cpuset=/ mems_allowed=0-1
wb_workfn+0xdf3/0x1600 fs/pnode.c:430
get_unbound_pool kernel/workqueue.c:3437 [inline]
process_one_work+0xcf3/0x1be0 kernel/workqueue.c:3612
worker_thread+0x17d/0x12f0 kernel/workqueue.c:2289
__write_once_size include/linux/compiler.h:218 [inline]
__list_del include/linux/list.h:106 [inline]
__list_del_entry include/linux/list.h:120 [inline]
list_del_init include/linux/list.h:159 [inline]
kthread+0x354/0x430 kernel/kthread.c:1010
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:358
CPU: 0 PID: 6768 Comm: syz-executor Not tainted 4.20.0-rc7+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
dump_stack+0x1d4/0x2b5 lib/earlycpio.c:120
dump_header+0x294/0xfaf
oom_killer_enable mm/oom_kill.c:715 [inline]
oom_kill_process+0xa3f/0xd20 mm/oom_kill.c:750
out_of_memory+0x88c/0x12a0 mm/fadvise.c:184
compound_order include/linux/mm.h:707 [inline]
page_hstate include/linux/hugetlb.h:469 [inline]
__alloc_pages_slowpath+0x1cfa/0x2c10 mm/page_alloc.c:7820
__alloc_pages_nodemask+0xa6c/0xe10 mm/page_alloc.c:5029
copy_process+0x94c/0x7b00
variable_test_bit arch/x86/include/asm/bitops.h:332 [inline]
cpumask_test_cpu include/linux/cpumask.h:344 [inline]
trace_sched_process_fork include/trace/events/sched.h:288 [inline]
_do_fork+0x191/0xf20 kernel/fork.c:2232
__x64_sys_clone+0xbf/0x150 kernel/fork.c:2340
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
do_syscall_32_irqs_on arch/x86/entry/common.c:341 [inline]
do_syscall_64+0x192/0x770 arch/x86/entry/common.c:349
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x45578b
Code: db 45 85 f6 0f 85 95 01 00 00 64 4c 8b 04 25 10 00 00 00 31 d2 4d 8d 90 d0 02 00 00 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 d6 00 00 00 85 c0 41 89 c5 0f 85 dd 00 00
RSP: 002b:00007fff9dc6ca20 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
RAX: ffffffffffffffda RBX: 00007fff9dc6ca20 RCX: 000000000045578b
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 00007fff9dc6ca70 R08: 0000000001d0d940 R09: 0000000000000000
R10: 0000000001d0dc10 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000020 R14: 0000000000000000 R15: 0000000000000000
And the second time:
[ 281.244340] Kernel panic - not syncing: corrupted stack end detected inside scheduler
[ 281.245754] CPU: 2 PID: 6265 Comm: kworker/u12:4 Not tainted 4.20.0-rc7+ #6
[ 281.246887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 281.248240] Workqueue: writeback wb_workfn (flush-8:0)
[ 281.248992] Call Trace:
[ 281.249364] dump_stack+0x1d4/0x2b5
[ 281.252261] panic+0x25e/0x49c
[ 281.255403] __schedule+0x1be8/0x21d0
[ 281.263754] preempt_schedule_common+0x35/0xe0
[ 281.264425] preempt_schedule+0x23/0x30
[ 281.265010] ___preempt_schedule+0x16/0x18
[ 281.265635] _raw_spin_unlock_irqrestore+0xbf/0xe0
[ 281.266357] __remove_mapping+0x77b/0x17e0
[ 281.291388] shrink_page_list+0x5232/0xa6b0
[ 281.414732] shrink_inactive_list+0x997/0x1ab0
[ 281.419009] shrink_node_memcg+0x9de/0x16a0
[ 281.424799] shrink_node+0x3af/0x1530
[ 281.433316] do_try_to_free_pages+0x3bc/0x1170
[ 281.435723] try_to_free_pages+0x43c/0x9e0
[ 281.442644] __alloc_pages_slowpath+0xa4c/0x2c10
[ 281.459197] __alloc_pages_nodemask+0xa6c/0xe10
[ 281.466504] alloc_pages_current+0xb6/0x1e0
[ 281.467326] __page_cache_alloc+0x332/0x560
[ 281.471049] pagecache_get_page+0x2af/0xdd0
[ 281.487360] __getblk_gfp+0x36e/0xd50
[ 281.497989] ext4_read_block_bitmap_nowait+0x2ed/0x1e10
[ 281.509111] ext4_read_block_bitmap+0x23/0x80
[ 281.509934] ext4_mb_mark_diskspace_used+0x180/0x10a0
[ 281.512755] ext4_mb_new_blocks+0xeb7/0x4260
[ 281.540189] ext4_ext_map_blocks+0x2776/0x5b00
[ 281.556040] ext4_map_blocks+0xcaa/0x1860
[ 281.559967] ext4_writepages+0x1e4c/0x3a20
[ 281.575738] do_writepages+0xfc/0x170
[ 281.578546] __writeback_single_inode+0x1cd/0x12e0
[ 281.592498] writeback_sb_inodes+0x6c7/0x1040
[ 281.598601] __writeback_inodes_wb+0x1a3/0x310
[ 281.600816] wb_writeback+0x92c/0xe10
[ 281.618064] wb_workfn+0xdf3/0x1600
[ 281.635970] process_one_work+0xcf3/0x1be0
[ 281.662614] worker_thread+0x17d/0x12f0
[ 281.680989] kthread+0x354/0x430
[ 281.682529] ret_from_fork+0x3a/0x50
One time it took about 10 seconds and another time it took 5 minutes.
Whom should we route this to? It looks both mm- and ext4-related.
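
For context on why this shows up as "corrupted stack end detected inside scheduler": as far as I understand it, the kernel keeps a magic word (STACK_END_MAGIC, 0x57AC6E9D) at the lowest address of each task stack, and with CONFIG_SCHED_STACK_END_CHECK the scheduler panics if that word has been overwritten. Below is a rough toy model of the idea in Python, not kernel code. The ToyStack class, its method names, the 4-byte canary width, and the 16 KiB stack size are all assumptions made for illustration.

```python
import struct

STACK_END_MAGIC = 0x57AC6E9D  # magic value the kernel stores at the stack end
THREAD_SIZE = 16 * 1024       # assumed stack size for this toy model


class ToyStack:
    """Toy model of a downward-growing stack with a canary at its lowest address."""

    def __init__(self, size=THREAD_SIZE):
        self.mem = bytearray(size)
        self.sp = size  # stack pointer starts at the top and moves down
        # Simplification: store the canary as a 4-byte little-endian word.
        struct.pack_into("<I", self.mem, 0, STACK_END_MAGIC)

    def push_frame(self, nbytes):
        """Reserve nbytes and scribble on them, as a function's locals would."""
        new_sp = max(self.sp - nbytes, 0)  # clamp: the model has nothing below the stack
        self.mem[new_sp:self.sp] = b"\xaa" * (self.sp - new_sp)
        self.sp = new_sp

    def pop_frame(self, nbytes):
        """Return from a function; the memory it wrote is not restored."""
        self.sp = min(self.sp + nbytes, len(self.mem))

    def stack_end_corrupted(self):
        """Analogue of the scheduler-time check: is the magic word still intact?"""
        (word,) = struct.unpack_from("<I", self.mem, 0)
        return word != STACK_END_MAGIC


if __name__ == "__main__":
    stack = ToyStack()
    stack.push_frame(8 * 1024)          # moderate call depth: canary untouched
    print(stack.stack_end_corrupted())
    stack.push_frame(8 * 1024)          # deep call chain reaches the stack end
    print(stack.stack_end_corrupted())
    stack.pop_frame(16 * 1024)          # unwinding does not repair the canary
    print(stack.stack_end_corrupted())
```

In this model the corruption is only noticed when stack_end_corrupted() is called, which matches the check running at schedule time rather than at the moment of the overflowing write.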