INFO: rcu detected stall in shmem

syzbot

Oct 9, 2018, 8:08:03 PM

to ak...@linux-foundation.org, gu...@fb.com, han...@cmpxchg.org, kirill....@linux.intel.com, linux-...@vger.kernel.org, linu...@kvack.org, mho...@kernel.org, penguin...@i-love.sakura.ne.jp, rien...@google.com, syzkall...@googlegroups.com, yan...@alibaba-inc.com

Hello,

syzbot found the following crash on:

HEAD commit: 570b7bdeaf18 Add linux-next specific files for 20181009
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=13eeb685400000
kernel config: https://syzkaller.appspot.com/x/.config?x=9b5a60e1381390c4
dashboard link: https://syzkaller.appspot.com/bug?extid=77e6b28a7a7106ad0def
compiler: gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+77e6b2...@syzkaller.appspotmail.com

RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000457579
RDX: 0000000000000002 RSI: 0000000000b36000 RDI: 0000000020000000
RBP: 000000000072bf00 R08: ffffffffffffffff R09: 0000000000000000
R10: 0000000000008031 R11: 0000000000000246 R12: 00007f9315bfc6d4
R13: 00000000004c284a R14: 00000000004d3bd0 R15: 00000000ffffffff
rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 0-....: (1 GPs behind) idle=cb6/1/0x4000000000000002 softirq=64368/64369 fqs=750
rcu: (t=10505 jiffies g=81341 q=1698)
NMI backtrace for cpu 0
CPU: 0 PID: 2050 Comm: syz-executor0 Not tainted 4.19.0-rc7-next-20181009+
#90
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x3ab lib/dump_stack.c:113
nmi_cpu_backtrace.cold.2+0x5c/0xa1 lib/nmi_backtrace.c:101
nmi_trigger_cpumask_backtrace+0x1e8/0x22a lib/nmi_backtrace.c:62
arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
trigger_single_cpu_backtrace include/linux/nmi.h:162 [inline]
rcu_dump_cpu_stacks+0x16f/0x1bc kernel/rcu/tree.c:1195
print_cpu_stall.cold.67+0x1f3/0x3c7 kernel/rcu/tree.c:1334
check_cpu_stall kernel/rcu/tree.c:1408 [inline]
rcu_pending kernel/rcu/tree.c:2961 [inline]
rcu_check_callbacks+0xf38/0x13f0 kernel/rcu/tree.c:2506
update_process_times+0x2d/0x70 kernel/time/timer.c:1636
tick_sched_handle+0x9f/0x180 kernel/time/tick-sched.c:164
tick_sched_timer+0x45/0x130 kernel/time/tick-sched.c:1274
__run_hrtimer kernel/time/hrtimer.c:1398 [inline]
__hrtimer_run_queues+0x412/0x10c0 kernel/time/hrtimer.c:1460
hrtimer_interrupt+0x313/0x780 kernel/time/hrtimer.c:1518
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1034 [inline]
smp_apic_timer_interrupt+0x1a1/0x750 arch/x86/kernel/apic/apic.c:1059
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:804
</IRQ>
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:761 [inline]
RIP: 0010:dump_stack+0x358/0x3ab lib/dump_stack.c:118
Code: 74 0c 48 c7 c7 f0 f5 31 89 e8 9f 0e 0e fa 48 83 3d 07 15 7d 01 00 0f 84 63 fe ff ff e8 1c 89 c9 f9 48 8b bd 70 ff ff ff 57 9d <0f> 1f 44 00 00 e8 09 89 c9 f9 48 8b 8d 68 ff ff ff b8 ff ff 37 00
RSP: 0018:ffff88017d3a5c70 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: 0000000000040000 RBX: 1ffffffff1263ebe RCX: ffffc90001e5a000
RDX: 0000000000040000 RSI: ffffffff87b4e0f4 RDI: 0000000000000246
RBP: ffff88017d3a5d18 R08: ffff8801d7e02480 R09: fffffbfff13da030
R10: fffffbfff13da030 R11: 0000000000000003 R12: 1ffff1002fa74b96
R13: 00000000ffffffff R14: 0000000000000200 R15: 0000000000000000
dump_header+0x27b/0xf72 mm/oom_kill.c:441
out_of_memory.cold.30+0xf/0x184 mm/oom_kill.c:1109
mem_cgroup_out_of_memory+0x15e/0x210 mm/memcontrol.c:1386
mem_cgroup_oom mm/memcontrol.c:1701 [inline]
try_charge+0xb7c/0x1710 mm/memcontrol.c:2260
mem_cgroup_try_charge+0x627/0xe20 mm/memcontrol.c:5892
mem_cgroup_try_charge_delay+0x1d/0xa0 mm/memcontrol.c:5907
shmem_getpage_gfp+0x186b/0x4840 mm/shmem.c:1784
shmem_fault+0x25f/0x960 mm/shmem.c:1982
__do_fault+0x100/0x6b0 mm/memory.c:2996
do_read_fault mm/memory.c:3408 [inline]
do_fault mm/memory.c:3531 [inline]
handle_pte_fault mm/memory.c:3762 [inline]
__handle_mm_fault+0x3d40/0x5a40 mm/memory.c:3886
handle_mm_fault+0x54f/0xc70 mm/memory.c:3923
faultin_page mm/gup.c:518 [inline]
__get_user_pages+0x806/0x1b30 mm/gup.c:718
populate_vma_page_range+0x2db/0x3d0 mm/gup.c:1222
__mm_populate+0x286/0x4d0 mm/gup.c:1270
mm_populate include/linux/mm.h:2311 [inline]
vm_mmap_pgoff+0x27f/0x2c0 mm/util.c:362
ksys_mmap_pgoff+0xf1/0x660 mm/mmap.c:1606
__do_sys_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
__se_sys_mmap arch/x86/kernel/sys_x86_64.c:91 [inline]
__x64_sys_mmap+0xe9/0x1b0 arch/x86/kernel/sys_x86_64.c:91
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457579
Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f9315bfbc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000457579
RDX: 0000000000000002 RSI: 0000000000b36000 RDI: 0000000020000000
RBP: 000000000072bf00 R08: ffffffffffffffff R09: 0000000000000000
R10: 0000000000008031 R11: 0000000000000246 R12: 00007f9315bfc6d4
R13: 00000000004c284a R14: 00000000004d3bd0 R15: 00000000ffffffff
Memory limit reached of cgroup /syz0
memory: usage 205164kB, limit 204800kB, failcnt 6901
memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /syz0: cache:680KB rss:176336KB rss_huge:163840KB
shmem:740KB mapped_file:660KB dirty:0KB writeback:0KB swap:0KB
inactive_anon:708KB active_anon:176448KB inactive_file:4KB active_file:0KB
unevictable:0KB
Out of memory and no killable processes...
syz-executor0 invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE),
nodemask=(null), order=0, oom_score_adj=-1000
syz-executor0 cpuset=syz0 mems_allowed=0
CPU: 0 PID: 2050 Comm: syz-executor0 Not tainted 4.19.0-rc7-next-20181009+
#90
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x3ab lib/dump_stack.c:113
dump_header+0x27b/0xf72 mm/oom_kill.c:441
out_of_memory.cold.30+0xf/0x184 mm/oom_kill.c:1109
mem_cgroup_out_of_memory+0x15e/0x210 mm/memcontrol.c:1386
mem_cgroup_oom mm/memcontrol.c:1701 [inline]
try_charge+0xb7c/0x1710 mm/memcontrol.c:2260
mem_cgroup_try_charge+0x627/0xe20 mm/memcontrol.c:5892
mem_cgroup_try_charge_delay+0x1d/0xa0 mm/memcontrol.c:5907
shmem_getpage_gfp+0x186b/0x4840 mm/shmem.c:1784
shmem_fault+0x25f/0x960 mm/shmem.c:1982
__do_fault+0x100/0x6b0 mm/memory.c:2996
do_read_fault mm/memory.c:3408 [inline]
do_fault mm/memory.c:3531 [inline]
handle_pte_fault mm/memory.c:3762 [inline]
__handle_mm_fault+0x3d40/0x5a40 mm/memory.c:3886
handle_mm_fault+0x54f/0xc70 mm/memory.c:3923
faultin_page mm/gup.c:518 [inline]
__get_user_pages+0x806/0x1b30 mm/gup.c:718
populate_vma_page_range+0x2db/0x3d0 mm/gup.c:1222
__mm_populate+0x286/0x4d0 mm/gup.c:1270
mm_populate include/linux/mm.h:2311 [inline]
vm_mmap_pgoff+0x27f/0x2c0 mm/util.c:362
ksys_mmap_pgoff+0xf1/0x660 mm/mmap.c:1606
__do_sys_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
__se_sys_mmap arch/x86/kernel/sys_x86_64.c:91 [inline]
__x64_sys_mmap+0xe9/0x1b0 arch/x86/kernel/sys_x86_64.c:91
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457579
Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f9315bfbc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000457579
RDX: 0000000000000002 RSI: 0000000000b36000 RDI: 0000000020000000
RBP: 000000000072bf00 R08: ffffffffffffffff R09: 0000000000000000
R10: 0000000000008031 R11: 0000000000000246 R12: 00007f9315bfc6d4
R13: 00000000004c284a R14: 00000000004d3bd0 R15: 00000000ffffffff
Memory limit reached of cgroup /syz0
memory: usage 205168kB, limit 204800kB, failcnt 6909
memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /syz0: cache:680KB rss:176336KB rss_huge:163840KB
shmem:740KB mapped_file:660KB dirty:0KB writeback:0KB swap:0KB
inactive_anon:712KB active_anon:176448KB inactive_file:0KB active_file:4KB
unevictable:0KB
Out of memory and no killable processes...
syz-executor0 invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE),
nodemask=(null), order=0, oom_score_adj=-1000
syz-executor0 cpuset=syz0 mems_allowed=0
CPU: 0 PID: 2050 Comm: syz-executor0 Not tainted 4.19.0-rc7-next-20181009+
#90
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x3ab lib/dump_stack.c:113
dump_header+0x27b/0xf72 mm/oom_kill.c:441
out_of_memory.cold.30+0xf/0x184 mm/oom_kill.c:1109
mem_cgroup_out_of_memory+0x15e/0x210 mm/memcontrol.c:1386
mem_cgroup_oom mm/memcontrol.c:1701 [inline]
try_charge+0xb7c/0x1710 mm/memcontrol.c:2260
mem_cgroup_try_charge+0x627/0xe20 mm/memcontrol.c:5892
mem_cgroup_try_charge_delay+0x1d/0xa0 mm/memcontrol.c:5907
shmem_getpage_gfp+0x186b/0x4840 mm/shmem.c:1784
shmem_fault+0x25f/0x960 mm/shmem.c:1982
__do_fault+0x100/0x6b0 mm/memory.c:2996
do_read_fault mm/memory.c:3408 [inline]
do_fault mm/memory.c:3531 [inline]
handle_pte_fault mm/memory.c:3762 [inline]
__handle_mm_fault+0x3d40/0x5a40 mm/memory.c:3886
handle_mm_fault+0x54f/0xc70 mm/memory.c:3923
faultin_page mm/gup.c:518 [inline]
__get_user_pages+0x806/0x1b30 mm/gup.c:718
populate_vma_page_range+0x2db/0x3d0 mm/gup.c:1222
__mm_populate+0x286/0x4d0 mm/gup.c:1270
mm_populate include/linux/mm.h:2311 [inline]
vm_mmap_pgoff+0x27f/0x2c0 mm/util.c:362
ksys_mmap_pgoff+0xf1/0x660 mm/mmap.c:1606
__do_sys_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
__se_sys_mmap arch/x86/kernel/sys_x86_64.c:91 [inline]
__x64_sys_mmap+0xe9/0x1b0 arch/x86/kernel/sys_x86_64.c:91
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457579
Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f9315bfbc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000457579
RDX: 0000000000000002 RSI: 0000000000b36000 RDI: 0000000020000000
RBP: 000000000072bf00 R08: ffffffffffffffff R09: 0000000000000000
R10: 0000000000008031 R11: 0000000000000246 R12: 00007f9315bfc6d4
R13: 00000000004c284a R14: 00000000004d3bd0 R15: 00000000ffffffff
Memory limit reached of cgroup /syz0
memory: usage 205172kB, limit 204800kB, failcnt 6917
memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /syz0: cache:680KB rss:176336KB rss_huge:163840KB
shmem:740KB mapped_file:792KB dirty:0KB writeback:0KB swap:0KB
inactive_anon:716KB active_anon:176448KB inactive_file:4KB active_file:0KB
unevictable:0KB
Out of memory and no killable processes...
syz-executor0 invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE),
nodemask=(null), order=0, oom_score_adj=-1000
syz-executor0 cpuset=syz0 mems_allowed=0
CPU: 0 PID: 2050 Comm: syz-executor0 Not tainted 4.19.0-rc7-next-20181009+
#90
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x3ab lib/dump_stack.c:113
dump_header+0x27b/0xf72 mm/oom_kill.c:441
out_of_memory.cold.30+0xf/0x184 mm/oom_kill.c:1109
mem_cgroup_out_of_memory+0x15e/0x210 mm/memcontrol.c:1386
mem_cgroup_oom mm/memcontrol.c:1701 [inline]
try_charge+0xb7c/0x1710 mm/memcontrol.c:2260
mem_cgroup_try_charge+0x627/0xe20 mm/memcontrol.c:5892
mem_cgroup_try_charge_delay+0x1d/0xa0 mm/memcontrol.c:5907
shmem_getpage_gfp+0x186b/0x4840 mm/shmem.c:1784
shmem_fault+0x25f/0x960 mm/shmem.c:1982
__do_fault+0x100/0x6b0 mm/memory.c:2996
do_read_fault mm/memory.c:3408 [inline]
do_fault mm/memory.c:3531 [inline]
handle_pte_fault mm/memory.c:3762 [inline]
__handle_mm_fault+0x3d40/0x5a40 mm/memory.c:3886
handle_mm_fault+0x54f/0xc70 mm/memory.c:3923
faultin_page mm/gup.c:518 [inline]
__get_user_pages+0x806/0x1b30 mm/gup.c:718
populate_vma_page_range+0x2db/0x3d0 mm/gup.c:1222
__mm_populate+0x286/0x4d0 mm/gup.c:1270
mm_populate include/linux/mm.h:2311 [inline]
vm_mmap_pgoff+0x27f/0x2c0 mm/util.c:362
ksys_mmap_pgoff+0xf1/0x660 mm/mmap.c:1606
__do_sys_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
__se_sys_mmap arch/x86/kernel/sys_x86_64.c:91 [inline]
__x64_sys_mmap+0xe9/0x1b0 arch/x86/kernel/sys_x86_64.c:91
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457579
Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f9315bfbc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000457579
RDX: 0000000000000002 RSI: 0000000000b36000 RDI: 0000000020000000
RBP: 000000000072bf00 R08: ffffffffffffffff R09: 0000000000000000
R10: 0000000000008031 R11: 0000000000000246 R12: 00007f9315bfc6d4
R13: 00000000004c284a R14: 00000000004d3bd0 R15: 00000000ffffffff
Memory limit reached of cgroup /syz0
memory: usage 205176kB, limit 204800kB, failcnt 6925
memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /syz0: cache:680KB rss:176336KB rss_huge:163840KB
shmem:740KB mapped_file:792KB dirty:0KB writeback:0KB swap:0KB
inactive_anon:720KB active_anon:176448KB inactive_file:0KB active_file:4KB
unevictable:0KB
Out of memory and no killable processes...
syz-executor0 invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE),
nodemask=(null), order=0, oom_score_adj=-1000
syz-executor0 cpuset=syz0 mems_allowed=0
CPU: 0 PID: 2050 Comm: syz-executor0 Not tainted 4.19.0-rc7-next-20181009+
#90
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x3ab lib/dump_stack.c:113
dump_header+0x27b/0xf72 mm/oom_kill.c:441
out_of_memory.cold.30+0xf/0x184 mm/oom_kill.c:1109
mem_cgroup_out_of_memory+0x15e/0x210 mm/memcontrol.c:1386
mem_cgroup_oom mm/memcontrol.c:1701 [inline]
try_charge+0xb7c/0x1710 mm/memcontrol.c:2260
mem_cgroup_try_charge+0x627/0xe20 mm/memcontrol.c:5892
mem_cgroup_try_charge_delay+0x1d/0xa0 mm/memcontrol.c:5907
shmem_getpage_gfp+0x186b/0x4840 mm/shmem.c:1784
shmem_fault+0x25f/0x960 mm/shmem.c:1982
__do_fault+0x100/0x6b0 mm/memory.c:2996
do_read_fault mm/memory.c:3408 [inline]
do_fault mm/memory.c:3531 [inline]
handle_pte_fault mm/memory.c:3762 [inline]
__handle_mm_fault+0x3d40/0x5a40 mm/memory.c:3886
handle_mm_fault+0x54f/0xc70 mm/memory.c:3923
faultin_page mm/gup.c:518 [inline]
__get_user_pages+0x806/0x1b30 mm/gup.c:718
populate_vma_page_range+0x2db/0x3d0 mm/gup.c:1222
__mm_populate+0x286/0x4d0 mm/gup.c:1270
mm_populate include/linux/mm.h:2311 [inline]
vm_mmap_pgoff+0x27f/0x2c0 mm/util.c:362
ksys_mmap_pgoff+0xf1/0x660 mm/mmap.c:1606
__do_sys_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
__se_sys_mmap arch/x86/kernel/sys_x86_64.c:91 [inline]
__x64_sys_mmap+0xe9/0x1b0 arch/x86/kernel/sys_x86_64.c:91
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457579
Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f9315bfbc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000457579
RDX: 0000000000000002 RSI: 0000000000b36000 RDI: 0000000020000000
RBP: 000000000072bf00 R08: ffffffffffffffff R09: 0000000000000000
R10: 0000000000008031 R11: 0000000000000246 R12: 00007f9315bfc6d4
R13: 00000000004c284a R14: 00000000004d3bd0 R15: 00000000ffffffff
Memory limit reached of cgroup /syz0
memory: usage 205180kB, limit 204800kB, failcnt 6933
memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /syz0: cache:680KB rss:176336KB rss_huge:163840KB
shmem:740KB mapped_file:792KB dirty:0KB writeback:0KB swap:0KB
inactive_anon:724KB active_anon:176448KB inactive_file:4KB active_file:0KB
unevictable:0KB
Out of memory and no killable processes...
syz-executor0 invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE),
nodemask=(null), order=0, oom_score_adj=-1000
syz-executor0 cpuset=syz0 mems_allowed=0
CPU: 0 PID: 2050 Comm: syz-executor0 Not tainted 4.19.0-rc7-next-20181009+
#90
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x3ab lib/dump_stack.c:113
dump_header+0x27b/0xf72 mm/oom_kill.c:441
out_of_memory.cold.30+0xf/0x184 mm/oom_kill.c:1109
mem_cgroup_out_of_memory+0x15e/0x210 mm/memcontrol.c:1386
mem_cgroup_oom mm/memcontrol.c:1701 [inline]
try_charge+0xb7c/0x1710 mm/memcontrol.c:2260
mem_cgroup_try_charge+0x627/0xe20 mm/memcontrol.c:5892
mem_cgroup_try_charge_delay+0x1d/0xa0 mm/memcontrol.c:5907
shmem_getpage_gfp+0x186b/0x4840 mm/shmem.c:1784
shmem_fault+0x25f/0x960 mm/shmem.c:1982
__do_fault+0x100/0x6b0 mm/memory.c:2996
do_read_fault mm/memory.c:3408 [inline]
do_fault mm/memory.c:3531 [inline]
handle_pte_fault mm/memory.c:3762 [inline]
__handle_mm_fault+0x3d40/0x5a40 mm/memory.c:3886
handle_mm_fault+0x54f/0xc70 mm/memory.c:3923
faultin_page mm/gup.c:518 [inline]
__get_user_pages+0x806/0x1b30 mm/gup.c:718
populate_vma_page_range+0x2db/0x3d0 mm/gup.c:1222
__mm_populate+0x286/0x4d0 mm/gup.c:1270
mm_populate include/linux/mm.h:2311 [inline]
vm_mmap_pgoff+0x27f/0x2c0 mm/util.c:362
ksys_mmap_pgoff+0xf1/0x660 mm/mmap.c:1606
__do_sys_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
__se_sys_mmap arch/x86/kernel/sys_x86_64.c:91 [inline]
__x64_sys_mmap+0xe9/0x1b0 arch/x86/kernel/sys_x86_64.c:91
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457579
Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f9315bfbc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000457579
RDX: 0000000000000002 RSI: 0000000000b36000 RDI: 0000000020000000
RBP: 000000000072bf00 R08: ffffffffffffffff R09: 0000000000000000
R10: 0000000000008031 R11: 0000000000000246 R12: 00007f9315bfc6d4
R13: 00000000004c284a R14: 00000000004d3bd0 R15: 00000000ffffffff
Memory limit reached of cgroup /syz0
memory: usage 205184kB, limit 204800kB, failcnt 6941
memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /syz0: cache:680KB rss:176336KB rss_huge:163840KB
shmem:740KB mapped_file:792KB dirty:0KB writeback:0KB swap:0KB
inactive_anon:728KB active_anon:176448KB inactive_file:0KB active_file:4KB
unevictable:0KB
Out of memory and no killable processes...
syz-executor0 invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE),
nodemask=(null), order=0, oom_score_adj=-1000
syz-executor0 cpuset=syz0 mems_allowed=0
CPU: 0 PID: 2050 Comm: syz-executor0 Not tainted 4.19.0-rc7-next-20181009+
#90
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x3ab lib/dump_stack.c:113
dump_header+0x27b/0xf72 mm/oom_kill.c:441
out_of_memory.cold.30+0xf/0x184 mm/oom_kill.c:1109
mem_cgroup_out_of_memory+0x15e/0x210 mm/memcontrol.c:1386
mem_cgroup_oom mm/memcontrol.c:1701 [inline]
try_charge+0xb7c/0x1710 mm/memcontrol.c:2260
mem_cgroup_try_charge+0x627/0xe20 mm/memcontrol.c:5892
mem_cgroup_try_charge_delay+0x1d/0xa0 mm/memcontrol.c:5907
shmem_getpage_gfp+0x186b/0x4840 mm/shmem.c:1784
shmem_fault+0x25f/0x960 mm/shmem.c:1982
__do_fault+0x100/0x6b0 mm/memory.c:2996
do_read_fault mm/memory.c:3408 [inline]
do_fault mm/memory.c:3531 [inline]
handle_pte_fault mm/memory.c:3762 [inline]
__handle_mm_fault+0x3d40/0x5a40 mm/memory.c:3886
handle_mm_fault+0x54f/0xc70 mm/memory.c:3923
faultin_page mm/gup.c:518 [inline]
__get_user_pages+0x806/0x1b30 mm/gup.c:718
populate_vma_page_range+0x2db/0x3d0 mm/gup.c:1222
__mm_populate+0x286/0x4d0 mm/gup.c:1270
mm_populate include/linux/mm.h:2311 [inline]
vm_mmap_pgoff+0x27f/0x2c0 mm/util.c:362
ksys_mmap_pgoff+0xf1/0x660 mm/mmap.c:1606
__do_sys_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
__se_sys_mmap arch/x86/kernel/sys_x86_64.c:91 [inline]
__x64_sys_mmap+0xe9/0x1b0 arch/x86/kernel/sys_x86_64.c:91
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457579
Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f9315bfbc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000457579
RDX: 0000000000000002 RSI: 0000000000b36000 RDI: 0000000020000000
RBP: 000000000072bf00 R08: ffffffffffffffff R09: 0000000000000000
R10: 0000000000008031 R11: 0000000000000246 R12: 00007f9315bfc6d4
R13: 00000000004c284a R14: 00000000004d3bd0 R15: 00000000ffffffff
Memory limit reached of cgroup /syz0
memory: usage 205188kB, limit 204800kB, failcnt 6949
memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /syz0: cache:680KB rss:176336KB rss_huge:163840KB
shmem:740KB mapped_file:792KB dirty:0KB writeback:0KB swap:0KB
inactive_anon:732KB active_anon:176448KB inactive_file:4KB active_file:0KB
unevictable:0KB
Out of memory and no killable processes...
syz-executor0 invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE),
nodemask=(null), order=0, oom_score_adj=-1000
syz-executor0 cpuset=syz0 mems_allowed=0
CPU: 0 PID: 2050 Comm: syz-executor0 Not tainted 4.19.0-rc7-next-20181009+
#90
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x3ab lib/dump_stack.c:113
dump_header+0x27b/0xf72 mm/oom_kill.c:441
out_of_memory.cold.30+0xf/0x184 mm/oom_kill.c:1109
mem_cgroup_out_of_memory+0x15e/0x210 mm/memcontrol.c:1386
mem_cgroup_oom mm/memcontrol.c:1701 [inline]
try_charge+0xb7c/0x1710 mm/memcontrol.c:2260
mem_cgroup_try_charge+0x627/0xe20 mm/memcontrol.c:5892
mem_cgroup_try_charge_delay+0x1d/0xa0 mm/memcontrol.c:5907
shmem_getpage_gfp+0x186b/0x4840 mm/shmem.c:1784
shmem_fault+0x25f/0x960 mm/shmem.c:1982
__do_fault+0x100/0x6b0 mm/memory.c:2996
do_read_fault mm/memory.c:3408 [inline]
do_fault mm/memory.c:3531 [inline]
handle_pte_fault mm/memory.c:3762 [inline]
__handle_mm_fault+0x3d40/0x5a40 mm/memory.c:3886
handle_mm_fault+0x54f/0xc70 mm/memory.c:3923
faultin_page mm/gup.c:518 [inline]
__get_user_pages+0x806/0x1b30 mm/gup.c:718
populate_vma_page_range+0x2db/0x3d0 mm/gup.c:1222
__mm_populate+0x286/0x4d0 mm/gup.c:1270
mm_populate include/linux/mm.h:2311 [inline]
vm_mmap_pgoff+0x27f/0x2c0 mm/util.c:362
ksys_mmap_pgoff+0xf1/0x660 mm/mmap.c:1606
__do_sys_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
__se_sys_mmap arch/x86/kernel/sys_x86_64.c:91 [inline]
__x64_sys_mmap+0xe9/0x1b0 arch/x86/kernel/sys_x86_64.c:91
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457579
Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f9315bfbc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000457579
RDX: 0000000000000002 RSI: 0000000000b36000 RDI: 0000000020000000
RBP: 000000000072bf00 R08: ffffffffffffffff R09: 0000000000000000
R10: 0000000000008031 R11: 0000000000000246 R12: 00007f9315bfc6d4
R13: 00000000004c284a R14: 00000000004d3bd0 R15: 00000000ffffffff
Memory limit reached of cgroup /syz0
memory: usage 205192kB, limit 204800kB, failcnt 6957
memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /syz0: cache:680KB rss:176336KB rss_huge:163840KB
shmem:740KB mapped_file:792KB dirty:0KB writeback:0KB swap:0KB
inactive_anon:736KB active_anon:176448KB inactive_file:0KB active_file:4KB
unevictable:0KB
Out of memory and no killable processes...
syz-executor0 invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE),
nodemask=(null), order=0, oom_score_adj=-1000
syz-executor0 cpuset=syz0 mems_allowed=0
CPU: 0 PID: 2050 Comm: syz-executor0 Not tainted 4.19.0-rc7-next-20181009+
#90
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x3ab lib/dump_stack.c:113
dump_header+0x27b/0xf72 mm/oom_kill.c:441
out_of_memory.cold.30+0xf/0x184 mm/oom_kill.c:1109
mem_cgroup_out_of_memory+0x15e/0x210 mm/memcontrol.c:1386
mem_cgroup_oom mm/memcontrol.c:1701 [inline]
try_charge+0xb7c/0x1710 mm/memcontrol.c:2260
mem_cgroup_try_charge+0x627/0xe20 mm/memcontrol.c:5892
mem_cgroup_try_charge_delay+0x1d/0xa0 mm/memcontrol.c:5907
shmem_getpage_gfp+0x186b/0x4840 mm/shmem.c:1784
shmem_fault+0x25f/0x960 mm/shmem.c:1982
__do_fault+0x100/0x6b0 mm/memory.c:2996
do_read_fault mm/memory.c:3408 [inline]
do_fault mm/memory.c:3531 [inline]
handle_pte_fault mm/memory.c:3762 [inline]
__handle_mm_fault+0x3d40/0x5a40 mm/memory.c:3886
handle_mm_fault+0x54f/0xc70 mm/memory.c:3923
faultin_page mm/gup.c:518 [inline]
__get_user_pages+0x806/0x1b30 mm/gup.c:718
populate_vma_page_range+0x2db/0x3d0 mm/gup.c:1222
__mm_populate+0x286/0x4d0 mm/gup.c:1270
mm_populate include/linux/mm.h:2311 [inline]
vm_mmap_pgoff+0x27f/0x2c0 mm/util.c:362
ksys_mmap_pgoff+0xf1/0x660 mm/mmap.c:1606
__do_sys_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
__se_sys_mmap arch/x86/kernel/sys_x86_64.c:91 [inline]
__x64_sys_mmap+0xe9/0x1b0 arch/x86/kernel/sys_x86_64.c:91
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457579
Code: 1d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 eb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f9315bfbc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000457579
RDX: 0000000000000002 RSI: 0000000000b36000 RDI: 0000000020000000
RBP: 000000000072bf00 R08: ffffffffffffffff R09: 0000000000000000
R10: 0000000000008031 R11: 0000000000000246 R12: 00007f9315bfc6d4
R13: 00000000004c284a R14: 00000000004d3bd0 R15: 00000000ffffffff
Memory limit reached of cgroup /syz0
memory: usage 205196kB, limit 204800kB, failcnt 6965
memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /syz0: cache:680KB rss:176336KB rss_huge:163840KB
shmem:740KB mapped_file:792KB dirty:0KB writeback:0KB swap:0KB
inactive_anon:740KB active_anon:176448KB inactive_file:4KB active_file:0KB
unevictable:0KB
Out of memory and no killable processes...
syz-executor0 invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE),
nodemask=(null), order=0, oom_score_adj=-1000
syz-executor0 cpuset=syz0 mems_allowed=0
CPU: 0 PID: 2050 Comm: syz-executor0 Not tainted 4.19.0-rc7-next-20181009+
#90
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x244/0x3ab lib/dump_stack.c:113
dump_header+0x27b/0xf72 mm/oom_kill.c:441
out_of_memory.cold.30+0xf/0x184 mm/oom_kill.c:1109
mem_cgroup_out_of_memory+0x15e/0x210 mm/memcontrol.c:1386
mem_cgroup_oom mm/memcontrol.c:1701 [inline]
try_charge+0xb7c/0x1710 mm/memcontrol.c:2260
mem_cgroup_try_charge+0x627/0xe20 mm/memcontrol.c:5892
mem_cgroup_try_charge_delay+0x1d/0xa0 mm/memcontrol.c:5907
shmem_getpage_gfp+0x186b/0x4840 mm/shmem.c:1784

---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.

Tetsuo Handa

Oct 9, 2018, 8:13:01 PM

to syzbot, han...@cmpxchg.org, mho...@kernel.org, ak...@linux-foundation.org, gu...@fb.com, kirill....@linux.intel.com, linux-...@vger.kernel.org, linu...@kvack.org, rien...@google.com, syzkall...@googlegroups.com, yan...@alibaba-inc.com

syzbot is hitting an RCU stall due to a memcg-OOM event.
https://syzkaller.appspot.com/bug?id=4ae3fff7fcf4c33a47c1192d2d62d2e03efffa64

What should we do if the memcg OOM killer finds no killable task because the
allocating task has oom_score_adj == -1000? Flooding printk() until the RCU stall
watchdog fires is not good behavior. The flooding seems to be caused by commit
3100dab2aa09dc6e ("mm: memcontrol: print proper OOM header when no eligible victim
left"), because syzbot was terminating the test upon the WARN(1) that this commit
removed.

syz-executor0 invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=-1000
syz-executor0 cpuset=syz0 mems_allowed=0
CPU: 0 PID: 2050 Comm: syz-executor0 Not tainted 4.19.0-rc7-next-20181009+ #90
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:

(...snipped...)
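
The flood described above can be quantified from the report itself. The following is a minimal sketch, not part of the original thread: the sample lines are copied from the first three OOM dumps in syzbot's log, and the helper names (parse_memcg_dumps, summarize) are hypothetical.

```python
import re

# Three consecutive "memory:" lines copied from the syzbot report.
SAMPLE_LOG = """\
memory: usage 205164kB, limit 204800kB, failcnt 6901
memory: usage 205168kB, limit 204800kB, failcnt 6909
memory: usage 205172kB, limit 204800kB, failcnt 6917
"""

# Matches only the plain "memory:" line, not "memory+swap:" or "kmem:".
MEMORY_LINE = re.compile(
    r"^memory: usage (\d+)kB, limit (\d+)kB, failcnt (\d+)$", re.MULTILINE
)


def parse_memcg_dumps(log_text):
    """Return one (usage_kb, limit_kb, failcnt) tuple per OOM dump."""
    return [
        tuple(int(group) for group in match.groups())
        for match in MEMORY_LINE.finditer(log_text)
    ]


def summarize(dumps):
    """Return (kB over the limit per dump, failcnt increase between dumps)."""
    over_limit_kb = [usage - limit for usage, limit, _ in dumps]
    failcnt_steps = [later[2] - earlier[2] for earlier, later in zip(dumps, dumps[1:])]
    return over_limit_kb, failcnt_steps


if __name__ == "__main__":
    over, steps = summarize(parse_memcg_dumps(SAMPLE_LOG))
    print("kB over limit per dump:", over)
    print("failcnt increase between dumps:", steps)
```

For these three lines the cgroup is 364, 368 and 372 kB over its limit and failcnt grows by 8 between dumps, which is consistent with the same task retrying the charge and printing a full dump each time without any memory being freed.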

David Rientjes

Oct 10, 2018, 12:11:51 AM

to Tetsuo Handa, syzbot, han...@cmpxchg.org, mho...@kernel.org, ak...@linux-foundation.org, gu...@fb.com, kirill....@linux.intel.com, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com, yan...@alibaba-inc.com

On Wed, 10 Oct 2018, Tetsuo Handa wrote:

> syzbot is hitting an RCU stall due to a memcg-OOM event.
> https://syzkaller.appspot.com/bug?id=4ae3fff7fcf4c33a47c1192d2d62d2e03efffa64
>
> What should we do if the memcg OOM killer finds no killable task because the
> allocating task has oom_score_adj == -1000? Flooding printk() until the RCU stall
> watchdog fires is not good behavior. The flooding seems to be caused by commit
> 3100dab2aa09dc6e ("mm: memcontrol: print proper OOM header when no eligible victim
> left"), because syzbot was terminating the test upon the WARN(1) that this commit
> removed.
>

Not printing anything would be the obvious solution but the ideal solution
would probably involve

- adding feedback to the memcg oom killer that there are no killable
processes,

- adding complete coverage for memcg_oom_recover() in all uncharge paths
where the oom memcg's page_counter is decremented, and

- having all processes stall until memcg_oom_recover() is called so
looping back into try_charge() has a reasonable expectation to succeed.
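
The three points above can be sketched as a toy model. This is only an illustration under stated assumptions, not kernel code: ToyMemcg and its methods are hypothetical names that borrow try_charge() and memcg_oom_recover() from the discussion, and a "task" is just a string.

```python
from collections import deque


class ToyMemcg:
    """Hypothetical stand-in for a memory cgroup; not the kernel's logic."""

    def __init__(self, limit_kb):
        self.limit_kb = limit_kb
        self.usage_kb = 0
        self.waiters = deque()  # tasks parked until memory is uncharged

    def try_charge(self, task, size_kb, has_killable_victim):
        """Return 'charged', 'retry' or 'wait'."""
        if self.usage_kb + size_kb <= self.limit_kb:
            self.usage_kb += size_kb
            return "charged"
        if has_killable_victim:
            # A victim exists, so retrying after the kill is reasonable.
            # (The kill itself is not modelled here.)
            return "retry"
        # Point 1: feedback that nothing is killable.
        # Point 3: park the task instead of looping and printing again.
        self.waiters.append(task)
        return "wait"

    def uncharge(self, size_kb):
        """Point 2: every uncharge path reaches oom_recover()."""
        self.usage_kb = max(0, self.usage_kb - size_kb)
        return self.oom_recover()

    def oom_recover(self):
        """Wake all parked tasks so they can call try_charge() again."""
        woken = list(self.waiters)
        self.waiters.clear()
        return woken
```

In this model a task that hits the limit with no killable victim gets "wait" once, and only retries after some uncharge has actually lowered the usage, so the retry has a reasonable chance to succeed.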

Dmitry Vyukov

Oct 10, 2018, 3:56:18 AM

to David Rientjes, Tetsuo Handa, syzbot, Johannes Weiner, Michal Hocko, Andrew Morton, gu...@fb.com, Kirill A. Shutemov, LKML, Linux-MM, syzkaller-bugs, Yang Shi

On Wed, Oct 10, 2018 at 6:11 AM, 'David Rientjes' via syzkaller-bugs
<syzkall...@googlegroups.com> wrote:
> On Wed, 10 Oct 2018, Tetsuo Handa wrote:
>
>> syzbot is hitting RCU stall due to memcg-OOM event.
>> https://syzkaller.appspot.com/bug?id=4ae3fff7fcf4c33a47c1192d2d62d2e03efffa64
>>
>> What should we do if memcg-OOM found no killable task because the allocating task
>> was oom_score_adj == -1000 ? Flooding printk() until RCU stall watchdog fires
>> (which seems to be caused by commit 3100dab2aa09dc6e ("mm: memcontrol: print proper
>> OOM header when no eligible victim left") because syzbot was terminating the test
>> upon WARN(1) removed by that commit) is not a good behavior.

You want to say that most of the recent hangs and stalls are actually
caused by our attempt to sandbox test processes with memory cgroup?
The process with oom_score_adj == -1000 is not supposed to consume any
significant memory; we have another (test) process with oom_score_adj
== 0 that's actually consuming memory.
But should we refrain from using -1000? Perhaps it would be better to
use -500/500 for control/test process, or -999/1000?

> Not printing anything would be the obvious solution but the ideal solution
> would probably involve
>
> - adding feedback to the memcg oom killer that there are no killable
> processes,
>
> - adding complete coverage for memcg_oom_recover() in all uncharge paths
> where the oom memcg's page_counter is decremented, and
>
> - having all processes stall until memcg_oom_recover() is called so
> looping back into try_charge() has a reasonable expectation to succeed.
>

Michal Hocko

Oct 10, 2018, 4:59:47 AM

to Tetsuo Handa, syzbot, han...@cmpxchg.org, ak...@linux-foundation.org, gu...@fb.com, kirill....@linux.intel.com, linux-...@vger.kernel.org, linu...@kvack.org, rien...@google.com, syzkall...@googlegroups.com, yan...@alibaba-inc.com

On Wed 10-10-18 09:12:45, Tetsuo Handa wrote:
> syzbot is hitting RCU stall due to memcg-OOM event.
> https://syzkaller.appspot.com/bug?id=4ae3fff7fcf4c33a47c1192d2d62d2e03efffa64

This is really interesting. If we do not have any eligible oom victim we
simply force the charge (allow to proceed and go over the hard limit)
and break the isolation. That means that the caller gets back to running
and releases all locks taken on the way. I am wondering how come we are
seeing the RCU stall. Who is holding the RCU lock? Certainly not the
charge path and neither should the caller because you have to be in a
sleepable context to trigger the OOM killer. So there must be something
more going on.

> What should we do if memcg-OOM found no killable task because the allocating task
> was oom_score_adj == -1000 ? Flooding printk() until RCU stall watchdog fires
> (which seems to be caused by commit 3100dab2aa09dc6e ("mm: memcontrol: print proper
> OOM header when no eligible victim left") because syzbot was terminating the test
> upon WARN(1) removed by that commit) is not a good behavior.

We definitely want to inform about ineligible oom victim. We might
consider some rate limiting for the memcg state but that is a valuable
information to see under normal situation (when you do not have floods
of these situations).
--
Michal Hocko
SUSE Labs

Michal Hocko

Oct 10, 2018, 5:02:40 AM

to David Rientjes, Tetsuo Handa, syzbot, han...@cmpxchg.org, ak...@linux-foundation.org, gu...@fb.com, kirill....@linux.intel.com, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com, yan...@alibaba-inc.com

On Tue 09-10-18 21:11:48, David Rientjes wrote:
> On Wed, 10 Oct 2018, Tetsuo Handa wrote:
>
> > syzbot is hitting RCU stall due to memcg-OOM event.
> > https://syzkaller.appspot.com/bug?id=4ae3fff7fcf4c33a47c1192d2d62d2e03efffa64
> >
> > What should we do if memcg-OOM found no killable task because the allocating task
> > was oom_score_adj == -1000 ? Flooding printk() until RCU stall watchdog fires
> > (which seems to be caused by commit 3100dab2aa09dc6e ("mm: memcontrol: print proper
> > OOM header when no eligible victim left") because syzbot was terminating the test
> > upon WARN(1) removed by that commit) is not a good behavior.
> >
>
> Not printing anything would be the obvious solution but the ideal solution
> would probably involve
>
> - adding feedback to the memcg oom killer that there are no killable
> processes,

We already have that: out_of_memory() returns false.

> - adding complete coverage for memcg_oom_recover() in all uncharge paths
> where the oom memcg's page_counter is decremented, and

Could you elaborate?

> - having all processes stall until memcg_oom_recover() is called so
> looping back into try_charge() has a reasonable expectation to succeed.

You cannot stall in the charge path waiting for others to make a forward
progress because we would be back to oom deadlocks when nobody can make
forward progress due to lock dependencies.

Right now we simply force the charge and allow for further progress when
a situation like this happens, because this shouldn't happen unless the
memcg is badly misconfigured.

Michal Hocko

Oct 10, 2018, 5:13:12 AM

to Dmitry Vyukov, David Rientjes, Tetsuo Handa, syzbot, Johannes Weiner, Andrew Morton, gu...@fb.com, Kirill A. Shutemov, LKML, Linux-MM, syzkaller-bugs, Yang Shi

On Wed 10-10-18 09:55:57, Dmitry Vyukov wrote:
> On Wed, Oct 10, 2018 at 6:11 AM, 'David Rientjes' via syzkaller-bugs
> <syzkall...@googlegroups.com> wrote:
> > On Wed, 10 Oct 2018, Tetsuo Handa wrote:
> >
> >> syzbot is hitting RCU stall due to memcg-OOM event.
> >> https://syzkaller.appspot.com/bug?id=4ae3fff7fcf4c33a47c1192d2d62d2e03efffa64
> >>
> >> What should we do if memcg-OOM found no killable task because the allocating task
> >> was oom_score_adj == -1000 ? Flooding printk() until RCU stall watchdog fires
> >> (which seems to be caused by commit 3100dab2aa09dc6e ("mm: memcontrol: print proper
> >> OOM header when no eligible victim left") because syzbot was terminating the test
> >> upon WARN(1) removed by that commit) is not a good behavior.
>
>
> You want to say that most of the recent hangs and stalls are actually
> caused by our attempt to sandbox test processes with memory cgroup?
> The process with oom_score_adj == -1000 is not supposed to consume any
> significant memory; we have another (test) process with oom_score_adj
> == 0 that's actually consuming memory.
> But should we refrain from using -1000? Perhaps it would be better to
> use -500/500 for control/test process, or -999/1000?

oom disable on a task (especially when this is the only task in the
memcg) is tricky. Look at the memcg report
[ 935.562389] Memory limit reached of cgroup /syz0
[ 935.567398] memory: usage 204808kB, limit 204800kB, failcnt 6081
[ 935.573768] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
[ 935.580650] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
[ 935.586923] Memory cgroup stats for /syz0: cache:152KB rss:176336KB rss_huge:163840KB shmem:344KB mapped_file:264KB dirty:0KB writeback:0KB swap:0KB inactive_anon:260KB active_anon:176448KB inactive_file:4KB active_file:0KB

There is still somebody holding anonymous (THP) memory. If there is no
other eligible oom victim then it must be some of the oom disabled ones.
You have suppressed the task list information so we do not know who that
might be though.

So it looks like there is some misconfiguration or a bug in the oom
victim selection.

Dmitry Vyukov

Oct 10, 2018, 5:33:35 AM

to Michal Hocko, David Rientjes, Tetsuo Handa, syzbot, Johannes Weiner, Andrew Morton, gu...@fb.com, Kirill A. Shutemov, LKML, Linux-MM, syzkaller-bugs, Yang Shi

I'm afraid KASAN can interfere with memory accounting/OOM killing too.
KASAN quarantines up to 1/32-th of physical memory (in our case
7.5GB/32 = 230MB) that is already freed by the task, but as far as I
understand is still accounted against memcg. So maybe making cgroup
limit >> quarantine size will help to resolve this too.

But of course there can be a plain memory leak too.
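
To make the sizes concrete, here is a small back-of-the-envelope sketch in C. It assumes the 1/32 quarantine fraction and the 7.5GB machine mentioned above (read here as 7.5GiB, so the result comes out slightly above the rounded 230MB figure) and the 204800kB memcg limit from the report quoted earlier in the thread. The helper names are made up for illustration and are not kernel or syzkaller APIs.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed fraction of physical memory that KASAN may quarantine (1/32). */
#define QUARANTINE_FRACTION 32

/* Hypothetical helper: upper bound of the quarantine size in bytes. */
static uint64_t quarantine_cap_bytes(uint64_t phys_mem_bytes)
{
	return phys_mem_bytes / QUARANTINE_FRACTION;
}

/*
 * Prints both sizes in MiB and returns 1 when the quarantine cap alone is
 * larger than the memcg limit, 0 otherwise.
 */
static int quarantine_exceeds_limit(uint64_t phys_mem_bytes,
				    uint64_t memcg_limit_bytes)
{
	uint64_t cap = quarantine_cap_bytes(phys_mem_bytes);

	printf("quarantine cap: %llu MiB, memcg limit: %llu MiB\n",
	       (unsigned long long)(cap >> 20),
	       (unsigned long long)(memcg_limit_bytes >> 20));
	return cap > memcg_limit_bytes;
}
```

Calling quarantine_exceeds_limit(7680ULL << 20, 204800ULL << 10) compares a cap of 240 MiB against a limit of 200 MiB, which is why a limit well above the quarantine size looks like the safer configuration.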

Tetsuo Handa

Oct 10, 2018, 6:43:58 AM

to Michal Hocko, syzbot, han...@cmpxchg.org, ak...@linux-foundation.org, gu...@fb.com, kirill....@linux.intel.com, linux-...@vger.kernel.org, linu...@kvack.org, rien...@google.com, syzkall...@googlegroups.com, yan...@alibaba-inc.com, Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek

On 2018/10/10 17:59, Michal Hocko wrote:
> On Wed 10-10-18 09:12:45, Tetsuo Handa wrote:
>> syzbot is hitting RCU stall due to memcg-OOM event.
>> https://syzkaller.appspot.com/bug?id=4ae3fff7fcf4c33a47c1192d2d62d2e03efffa64
>
> This is really interesting. If we do not have any eligible oom victim we
> simply force the charge (allow to proceed and go over the hard limit)
> and break the isolation. That means that the caller gets back to running
> and releases all locks taken on the way.

What happens if the caller continued trying to allocate more memory
because the caller cannot be noticed by SIGKILL from the OOM killer?

> I am wondering how come we are
> seeing the RCU stall. Who is holding the RCU lock? Certainly not the
> charge path and neither should the caller because you have to be in a
> sleepable context to trigger the OOM killer. So there must be something
> more going on.

>
>> What should we do if memcg-OOM found no killable task because the allocating task
>> was oom_score_adj == -1000 ? Flooding printk() until RCU stall watchdog fires
>> (which seems to be caused by commit 3100dab2aa09dc6e ("mm: memcontrol: print proper
>> OOM header when no eligible victim left") because syzbot was terminating the test
>> upon WARN(1) removed by that commit) is not a good behavior.
>
> We definitely want to inform about ineligible oom victim. We might
> consider some rate limiting for the memcg state but that is a valuable
> information to see under normal situation (when you do not have floods
> of these situations).
>

But if the caller cannot be noticed by SIGKILL from the OOM killer,
allowing the caller to trigger the OOM killer again and again (until
global OOM killer triggers) is bad.

Michal Hocko

Oct 10, 2018, 7:35:04 AM

to Tetsuo Handa, syzbot, han...@cmpxchg.org, ak...@linux-foundation.org, gu...@fb.com, kirill....@linux.intel.com, linux-...@vger.kernel.org, linu...@kvack.org, rien...@google.com, syzkall...@googlegroups.com, yan...@alibaba-inc.com, Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek

On Wed 10-10-18 19:43:38, Tetsuo Handa wrote:
> On 2018/10/10 17:59, Michal Hocko wrote:
> > On Wed 10-10-18 09:12:45, Tetsuo Handa wrote:
> >> syzbot is hitting RCU stall due to memcg-OOM event.
> >> https://syzkaller.appspot.com/bug?id=4ae3fff7fcf4c33a47c1192d2d62d2e03efffa64
> >
> > This is really interesting. If we do not have any eligible oom victim we
> > simply force the charge (allow to proceed and go over the hard limit)
> > and break the isolation. That means that the caller gets back to running
> > and releases all locks taken on the way.
>
> What happens if the caller continued trying to allocate more memory
> because the caller cannot be noticed by SIGKILL from the OOM killer?

It could eventually trigger the global OOM.

> > I am wondering how come we are
> > seeing the RCU stall. Who is holding the RCU lock? Certainly not the
> > charge path and neither should the caller because you have to be in a
> > sleepable context to trigger the OOM killer. So there must be something
> > more going on.
>
> Just flooding out of memory messages can trigger RCU stall problems.
> For example, a severe skbuff_head_cache or kmalloc-512 leak bug is causing

[...]

Quite some of them, indeed! I guess we want to rate limit the output.
What about the following?

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index f10aa5360616..4ee393c85e27 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -430,6 +430,9 @@ static void dump_tasks(struct mem_cgroup *memcg, const nodemask_t *nodemask)

 static void dump_header(struct oom_control *oc, struct task_struct *p)
 {
+	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
+					      DEFAULT_RATELIMIT_BURST);
+
 	pr_warn("%s invoked oom-killer: gfp_mask=%#x(%pGg), nodemask=%*pbl, order=%d, oom_score_adj=%hd\n",
 		current->comm, oc->gfp_mask, &oc->gfp_mask,
 		nodemask_pr_args(oc->nodemask), oc->order,
@@ -437,6 +440,9 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
 	if (!IS_ENABLED(CONFIG_COMPACTION) && oc->order)
 		pr_warn("COMPACTION is disabled!!!\n");

+	if (!__ratelimit(&oom_rs))
+		return;
+
 	cpuset_print_current_mems_allowed();
 	dump_stack();
 	if (is_memcg_oom(oc))
@@ -931,8 +937,6 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
 	struct task_struct *t;
 	struct mem_cgroup *oom_group;
 	unsigned int victim_points = 0;
-	static DEFINE_RATELIMIT_STATE(oom_rs, DEFAULT_RATELIMIT_INTERVAL,
-					      DEFAULT_RATELIMIT_BURST);

 	/*
 	 * If the task is already exiting, don't alarm the sysadmin or kill
@@ -949,8 +953,7 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
 	}
 	task_unlock(p);

-	if (__ratelimit(&oom_rs))
-		dump_header(oc, p);
+	dump_header(oc, p);

 	pr_err("%s: Kill process %d (%s) score %u or sacrifice child\n",
 		message, task_pid_nr(p), p->comm, points);

> >> What should we do if memcg-OOM found no killable task because the allocating task
> >> was oom_score_adj == -1000 ? Flooding printk() until RCU stall watchdog fires
> >> (which seems to be caused by commit 3100dab2aa09dc6e ("mm: memcontrol: print proper
> >> OOM header when no eligible victim left") because syzbot was terminating the test
> >> upon WARN(1) removed by that commit) is not a good behavior.
> >
> > We definitely want to inform about ineligible oom victim. We might
> > consider some rate limiting for the memcg state but that is a valuable
> > information to see under normal situation (when you do not have floods
> > of these situations).
> >
>
> But if the caller cannot be noticed by SIGKILL from the OOM killer,
> allowing the caller to trigger the OOM killer again and again (until
> global OOM killer triggers) is bad.

There is simply no other option. Well, except for failing the charge,
which has been considered and refused because it could trigger
unexpected error paths, and because breaking the isolation in rare cases
of misconfiguration is acceptable. We can reconsider that
but you should bring really good arguments on the table. I was very
successful doing that.

Sergey Senozhatsky

Oct 10, 2018, 7:48:50 AM

to Michal Hocko, Tetsuo Handa, syzbot, han...@cmpxchg.org, ak...@linux-foundation.org, gu...@fb.com, kirill....@linux.intel.com, linux-...@vger.kernel.org, linu...@kvack.org, rien...@google.com, syzkall...@googlegroups.com, yan...@alibaba-inc.com, Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek

On (10/10/18 13:35), Michal Hocko wrote:
> > Just flooding out of memory messages can trigger RCU stall problems.
> > For example, a severe skbuff_head_cache or kmalloc-512 leak bug is causing
>
> [...]
>
> Quite some of them, indeed! I guess we want to rate limit the output.
> What about the following?

A bit unrelated, but while we are at it:

I like it when we rate-limit printk-s that lock up the system.
But it seems that default rate-limit values are not always good enough,
DEFAULT_RATELIMIT_INTERVAL / DEFAULT_RATELIMIT_BURST can still be too
verbose. For instance, when we have a very slow IPMI emulated serial
console -- e.g. baud rate at 57600. DEFAULT_RATELIMIT_INTERVAL and
DEFAULT_RATELIMIT_BURST can add new OOM headers and backtraces faster
than we evict them.

Does it sound reasonable enough to use larger than default rate-limits
for printk-s in OOM print-outs? OOM reports tend to be somewhat large
and the reported numbers are not always *very* unique.

What do you think?

-ss

Michal Hocko

Oct 10, 2018, 8:25:43 AM

to Sergey Senozhatsky, Tetsuo Handa, syzbot, han...@cmpxchg.org, ak...@linux-foundation.org, gu...@fb.com, kirill....@linux.intel.com, linux-...@vger.kernel.org, linu...@kvack.org, rien...@google.com, syzkall...@googlegroups.com, yan...@alibaba-inc.com, Sergey Senozhatsky, Petr Mladek

I do not really care about the current interval/burst values. This change
should be done separately and ideally with some numbers.

Dmitry Vyukov

Oct 10, 2018, 8:30:01 AM

to Michal Hocko, Sergey Senozhatsky, Tetsuo Handa, syzbot, Johannes Weiner, Andrew Morton, gu...@fb.com, Kirill A. Shutemov, LKML, Linux-MM, David Rientjes, syzkaller-bugs, Yang Shi, Sergey Senozhatsky, Petr Mladek

I think Sergey meant that this place may need to use
larger-than-default values because it prints lots of output per
instance (whereas the default limit is more tuned for cases that print
just 1 line).

I've found at least 1 place that uses DEFAULT_RATELIMIT_INTERVAL*10:
https://elixir.bootlin.com/linux/latest/source/fs/btrfs/extent-tree.c#L8365
Probably we need something similar here.

Dmitry Vyukov

Oct 10, 2018, 8:36:50 AM

to Michal Hocko, Sergey Senozhatsky, Tetsuo Handa, syzbot, Johannes Weiner, Andrew Morton, gu...@fb.com, Kirill A. Shutemov, LKML, Linux-MM, David Rientjes, syzkaller-bugs, Yang Shi, Sergey Senozhatsky, Petr Mladek

In parallel with the kernel changes I've also made a change to
syzkaller that (1) makes it not use oom_score_adj=-1000, as this hard
killing limit looks like quite a risky thing, and (2) increases the memcg
size beyond the expected KASAN quarantine size:
https://github.com/google/syzkaller/commit/adedaf77a18f3d03d695723c86fc083c3551ff5b
If this will stop the flow of hang/stall reports, then we can just
close all old reports as invalid.

Tetsuo Handa

Oct 10, 2018, 9:10:59 AM

to Dmitry Vyukov, Michal Hocko, Sergey Senozhatsky, syzbot, Johannes Weiner, Andrew Morton, gu...@fb.com, Kirill A. Shutemov, LKML, Linux-MM, David Rientjes, syzkaller-bugs, Yang Shi, Sergey Senozhatsky, Petr Mladek

Yes. The OOM killer tends to print a lot of messages (and I suspect that
mutex_trylock(&oom_lock) makes this worse by wasting even more CPU time
through preemption).

>>
>> I've found at least 1 place that uses DEFAULT_RATELIMIT_INTERVAL*10:
>> https://elixir.bootlin.com/linux/latest/source/fs/btrfs/extent-tree.c#L8365
>> Probably we need something similar here.

Since printk() consumes a significant amount of CPU time, I think that what
we need to guarantee is that the interval between the end of one OOM killer
report and the beginning of the next is large enough. For example, set up a
timer with a 5-second timeout at the end of an OOM killer report and check
whether the timer has already fired at the beginning of the next OOM killer
report.
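
As a rough illustration of that idea, here is a user-space sketch in C. It is not kernel code: the struct and function names are hypothetical, and timestamps are passed in as plain seconds instead of using a real kernel timer. The point it shows is that the quiet period is measured from the end of the previous report, not from its start.

```c
#include <stdbool.h>

/* Hypothetical gate that enforces a quiet period between reports. */
struct oom_report_gate {
	long min_gap;   /* required quiet time after a report, in seconds */
	long last_end;  /* time at which the previous report finished */
	bool have_last; /* false until the first report has completed */
};

/* A new report may start only if the quiet period has fully elapsed. */
static bool report_may_begin(const struct oom_report_gate *g, long now)
{
	return !g->have_last || now - g->last_end >= g->min_gap;
}

/* Called once the last line of a report has been printed. */
static void report_ended(struct oom_report_gate *g, long now)
{
	g->last_end = now;
	g->have_last = true;
}
```

With a 5-second gap, a report that finishes at t=3 blocks the next one until t=8, no matter how long the report itself took to print.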

>
>
> In parallel with the kernel changes I've also made a change to
> syzkaller that (1) makes it not use oom_score_adj=-1000, as this hard
> killing limit looks like quite a risky thing, and (2) increases the memcg
> size beyond the expected KASAN quarantine size:
> https://github.com/google/syzkaller/commit/adedaf77a18f3d03d695723c86fc083c3551ff5b
> If this will stop the flow of hang/stall reports, then we can just
> close all old reports as invalid.

I don't think so. Only this report was different from others because printk()
in this report was from memcg OOM events without eligible tasks whereas printk()
in others are from global OOM events triggered by severe slab memory leak.

Dmitry Vyukov

Oct 10, 2018, 9:18:18 AM

to Tetsuo Handa, Michal Hocko, Sergey Senozhatsky, syzbot, Johannes Weiner, Andrew Morton, gu...@fb.com, Kirill A. Shutemov, LKML, Linux-MM, David Rientjes, syzkaller-bugs, Yang Shi, Sergey Senozhatsky, Petr Mladek

Ack.
I guess I just hoped deep down that we somehow magically get rid of
all these reports with some simple change like this :)

Tetsuo Handa

Oct 10, 2018, 10:19:46 AM

to Michal Hocko, syzbot, han...@cmpxchg.org, ak...@linux-foundation.org, gu...@fb.com, kirill....@linux.intel.com, linux-...@vger.kernel.org, linu...@kvack.org, rien...@google.com, syzkall...@googlegroups.com, yan...@alibaba-inc.com, Sergey Senozhatsky, Sergey Senozhatsky, Petr Mladek

On 2018/10/10 20:35, Michal Hocko wrote:
>>>> What should we do if memcg-OOM found no killable task because the allocating task
>>>> was oom_score_adj == -1000 ? Flooding printk() until RCU stall watchdog fires
>>>> (which seems to be caused by commit 3100dab2aa09dc6e ("mm: memcontrol: print proper
>>>> OOM header when no eligible victim left") because syzbot was terminating the test
>>>> upon WARN(1) removed by that commit) is not a good behavior.
>>>
>>> We definitely want to inform about ineligible oom victim. We might
>>> consider some rate limiting for the memcg state but that is a valuable
>>> information to see under normal situation (when you do not have floods
>>> of these situations).
>>>
>>
>> But if the caller cannot be noticed by SIGKILL from the OOM killer,
>> allowing the caller to trigger the OOM killer again and again (until
>> global OOM killer triggers) is bad.
>
> There is simply no other option. Well, except for failing the charge,
> which has been considered and refused because it could trigger
> unexpected error paths, and because breaking the isolation in rare cases
> of misconfiguration is acceptable. We can reconsider that
> but you should bring really good arguments on the table. I was very
> successful doing that.
>

By the way, how do we avoid this flooding? Something like this?

include/linux/sched.h | 1 +
mm/oom_kill.c | 11 +++++++++++
2 files changed, 12 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 977cb57..58eff50 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -723,6 +723,7 @@ struct task_struct {
 #endif
 #ifdef CONFIG_MEMCG
 	unsigned in_user_fault:1;
+	unsigned memcg_oom_no_eligible_warned:1;
 #ifdef CONFIG_MEMCG_KMEM
 	unsigned memcg_kmem_skip_account:1;
 #endif
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index f10aa53..ff0fa65 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1106,6 +1106,13 @@ bool out_of_memory(struct oom_control *oc)
 	select_bad_process(oc);
 	/* Found nothing?!?! */
 	if (!oc->chosen) {
+#ifdef CONFIG_MEMCG
+		if (is_memcg_oom(oc)) {
+			if (current->memcg_oom_no_eligible_warned)
+				return false;
+			current->memcg_oom_no_eligible_warned = 1;
+		}
+#endif
 		dump_header(oc, NULL);
 		pr_warn("Out of memory and no killable processes...\n");
 		/*
@@ -1115,6 +1122,10 @@ bool out_of_memory(struct oom_control *oc)
 		 */
 		if (!is_sysrq_oom(oc) && !is_memcg_oom(oc))
 			panic("System is deadlocked on memory\n");
+#ifdef CONFIG_MEMCG
+	} else if (is_memcg_oom(oc)) {
+		current->memcg_oom_no_eligible_warned = 0;
+#endif
 	}
 	if (oc->chosen && oc->chosen != (void *)-1UL)
 		oom_kill_process(oc, !is_memcg_oom(oc) ? "Out of memory" :
--
1.8.3.1
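
The warn-once behaviour of the patch above can be modelled in user space as follows. This is only a sketch of the flag logic: struct task_model and warn_no_eligible() are hypothetical stand-ins for the task_struct bit and the memcg branch in out_of_memory(), and nothing here touches real kernel state.

```c
#include <stdbool.h>

/* Stand-in for the per-task bit added to struct task_struct. */
struct task_model {
	bool no_eligible_warned;
};

/*
 * Models one memcg OOM event for the given task.
 * Returns true only when the "no killable processes" report would be
 * printed. Finding a victim re-arms the warning for the next time.
 */
static bool warn_no_eligible(struct task_model *t, bool victim_found)
{
	if (victim_found) {
		t->no_eligible_warned = false;
		return false;
	}
	if (t->no_eligible_warned)
		return false; /* already warned, stay quiet */
	t->no_eligible_warned = true;
	return true;
}
```

A task that repeatedly hits a memcg OOM with no eligible victim gets a single report, and only gets another one after some later OOM event did find a victim.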

Michal Hocko

Oct 10, 2018, 11:11:51 AM

to linu...@kvack.org, syzkall...@googlegroups.com, Michal Hocko, gu...@fb.com, han...@cmpxchg.org, kirill....@linux.intel.com, linux-...@vger.kernel.org, penguin...@i-love.sakura.ne.jp, rien...@google.com, yan...@alibaba-inc.com

From: Michal Hocko <mho...@suse.com>

syzbot has noticed that it can trigger RCU stalls from the memcg oom
path:

RIP: 0010:dump_stack+0x358/0x3ab lib/dump_stack.c:118
Code: 74 0c 48 c7 c7 f0 f5 31 89 e8 9f 0e 0e fa 48 83 3d 07 15 7d 01 00 0f
84 63 fe ff ff e8 1c 89 c9 f9 48 8b bd 70 ff ff ff 57 9d <0f> 1f 44 00 00
e8 09 89 c9 f9 48 8b 8d 68 ff ff ff b8 ff ff 37 00
RSP: 0018:ffff88017d3a5c70 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: 0000000000040000 RBX: 1ffffffff1263ebe RCX: ffffc90001e5a000
RDX: 0000000000040000 RSI: ffffffff87b4e0f4 RDI: 0000000000000246
RBP: ffff88017d3a5d18 R08: ffff8801d7e02480 R09: fffffbfff13da030
R10: fffffbfff13da030 R11: 0000000000000003 R12: 1ffff1002fa74b96
R13: 00000000ffffffff R14: 0000000000000200 R15: 0000000000000000
dump_header+0x27b/0xf72 mm/oom_kill.c:441
out_of_memory.cold.30+0xf/0x184 mm/oom_kill.c:1109
mem_cgroup_out_of_memory+0x15e/0x210 mm/memcontrol.c:1386
mem_cgroup_oom mm/memcontrol.c:1701 [inline]
try_charge+0xb7c/0x1710 mm/memcontrol.c:2260
mem_cgroup_try_charge+0x627/0xe20 mm/memcontrol.c:5892
mem_cgroup_try_charge_delay+0x1d/0xa0 mm/memcontrol.c:5907
shmem_getpage_gfp+0x186b/0x4840 mm/shmem.c:1784
shmem_fault+0x25f/0x960 mm/shmem.c:1982
__do_fault+0x100/0x6b0 mm/memory.c:2996
do_read_fault mm/memory.c:3408 [inline]
do_fault mm/memory.c:3531 [inline]

The primary reason for the stall lies in the expensive printk handling
of an oom report flood: a misconfiguration on the syzbot side meant
that there was simply no eligible task because they all have
OOM_SCORE_ADJ_MIN set. This generates the oom report for each allocation
from the memcg context.

While normal workloads should be much more careful about potential heavy
memory consumers that are OOM disabled, it makes some sense to rate limit
potentially expensive oom reports for cases when there is no eligible
victim found. Do that by moving the rate limit logic inside dump_header.
We no longer rely on the caller to do that. It was only oom_kill_process
which has been throttling. The other two call sites simply didn't have to
care because one just panicked on the OOM when configured that way and
no eligible task would panic for the global case as well. Memcg changed
the picture because we do not panic and we might have multiple sources
of the same event.

While we are here, make sure that the reason to trigger the OOM is
printed without ratelimiting because this is really valuable to
debug what happened.

Reported-by: syzbot+77e6b2...@syzkaller.appspotmail.com
Cc: gu...@fb.com
Cc: han...@cmpxchg.org
Cc: kirill....@linux.intel.com
Cc: linux-...@vger.kernel.org
Cc: penguin...@i-love.sakura.ne.jp
Cc: rien...@google.com
Cc: yan...@alibaba-inc.com
Signed-off-by: Michal Hocko <mho...@suse.com>
---
mm/oom_kill.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index f10aa5360616..4ee393c85e27 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c

--
2.19.0
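
The effect of moving the rate limit into dump_header can be sketched with a simplified user-space model. It is an assumption-laden illustration, not the kernel's ratelimit implementation: struct ratelimit_model, ratelimit_ok() and dump_header_model() are hypothetical names, time is passed in explicitly as seconds, and the report body is reduced to a single placeholder line.

```c
#include <stdbool.h>
#include <stdio.h>

/* Simplified window-based limiter: at most `burst` passes per window. */
struct ratelimit_model {
	long interval; /* window length, in seconds */
	int burst;     /* passes allowed per window */
	long begin;    /* start of the current window */
	int printed;   /* passes granted in the current window */
	bool started;  /* false until the first call */
};

static bool ratelimit_ok(struct ratelimit_model *rs, long now)
{
	/* Open a new window on first use or once the old one has expired. */
	if (!rs->started || now - rs->begin > rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->started = true;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	return false;
}

/*
 * Model of the patched dump_header(): the one-line reason is always
 * printed, the expensive part only when the limiter allows it.
 * Returns the number of lines printed (1 or 2 in this model).
 */
static int dump_header_model(struct ratelimit_model *rs, long now)
{
	printf("task invoked oom-killer (t=%ld)\n", now);
	if (!ratelimit_ok(rs, now))
		return 1;
	printf("  ... stack trace and memcg statistics ...\n");
	return 2;
}
```

With an interval of 5 and a burst of 10, a flood of 100 calls in the same second prints 100 one-line reasons but only 10 full reports, and the full report becomes available again once the window has expired.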

Sergey Senozhatsky

Oct 10, 2018, 11:17:43 AM

to Dmitry Vyukov, Michal Hocko, Sergey Senozhatsky, Tetsuo Handa, syzbot, Johannes Weiner, Andrew Morton, gu...@fb.com, Kirill A. Shutemov, LKML, Linux-MM, David Rientjes, syzkaller-bugs, Yang Shi, Sergey Senozhatsky, Petr Mladek

On (10/10/18 14:29), Dmitry Vyukov wrote:
> >> A bit unrelated, but while we are at it:
> >>
> >> I like it when we rate-limit printk-s that lock up the system.
> >> But it seems that default rate-limit values are not always good enough,
> >> DEFAULT_RATELIMIT_INTERVAL / DEFAULT_RATELIMIT_BURST can still be too
> >> verbose. For instance, when we have a very slow IPMI emulated serial
> >> console -- e.g. baud rate at 57600. DEFAULT_RATELIMIT_INTERVAL and
> >> DEFAULT_RATELIMIT_BURST can add new OOM headers and backtraces faster
> >> than we evict them.
> >>
> >> Does it sound reasonable enough to use larger than default rate-limits
> >> for printk-s in OOM print-outs? OOM reports tend to be somewhat large
> >> and the reported numbers are not always *very* unique.
> >>
> >> What do you think?
> >
> > I do not really care about the current interval/burst values. This change
> > should be done separately and ideally with some numbers.
>
> I think Sergey meant that this place may need to use
> larger-than-default values because it prints lots of output per
> instance (whereas the default limit is more tuned for cases that print
> just 1 line).
>
> I've found at least 1 place that uses DEFAULT_RATELIMIT_INTERVAL*10:
> https://elixir.bootlin.com/linux/latest/source/fs/btrfs/extent-tree.c#L8365
> Probably we need something similar here.

Yes, Dmitry, that's what I meant - to use something like
DEFAULT_RATELIMIT_INTERVAL * 10 in OOM. I didn't mean to change
the default values system wide.

---

We are not rate-limiting a single annoying printk() in OOM, but
functions that do a whole bunch of printks - OOM header, backtraces, etc.
Thus OOM report can be, I don't know, 50 or 70 or 100 lines (who knows).
So that's why rate-limit in OOM is more permissive in terms of number of
printed lines. When we rate-limit a single printk() we let 10 printk()s
/*10 lines*/ max every 5 seconds. While in OOM this transforms into
10 dump_header() + 10 oom_kill_process() every 5 seconds. Still can be
too many printk()-s, enough to lockup the system.
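
A quick estimate shows the scale of the problem on a slow console. The sketch below is plain arithmetic in C under stated assumptions: 8N1 serial framing (10 bits on the wire per character) and an average of 80 characters per line, neither of which comes from this thread. The function name is hypothetical.

```c
#include <stdio.h>

/*
 * Estimated time, in seconds, needed to push a batch of reports through
 * a serial console. Assumes 10 bits on the wire per character (8N1).
 */
static double drain_seconds(int reports, int lines_per_report,
			    int chars_per_line, int baud)
{
	double bytes_per_sec = baud / 10.0;
	double total_bytes = (double)reports * lines_per_report * chars_per_line;

	return total_bytes / bytes_per_sec;
}
```

Under these assumptions, drain_seconds(10, 100, 80, 57600) is roughly 14 seconds for ten 100-line reports, well over the 5-second window in which they are allowed, while ten single lines take a small fraction of a second.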

-ss

Sergey Senozhatsky

Oct 10, 2018, 9:17:36 PM

to Tetsuo Handa, Dmitry Vyukov, Michal Hocko, Sergey Senozhatsky, syzbot, Johannes Weiner, Andrew Morton, gu...@fb.com, Kirill A. Shutemov, LKML, Linux-MM, David Rientjes, syzkaller-bugs, Yang Shi, Sergey Senozhatsky, Petr Mladek

On (10/10/18 22:10), Tetsuo Handa wrote:
> >> I've found at least 1 place that uses DEFAULT_RATELIMIT_INTERVAL*10:
> >> https://elixir.bootlin.com/linux/latest/source/fs/btrfs/extent-tree.c#L8365
> >> Probably we need something similar here.
>
> Since printk() consumes a significant amount of CPU time, I think that what
> we need to guarantee is that the interval between the end of one OOM killer
> report and the beginning of the next is large enough. For example, set up a
> timer with a 5-second timeout at the end of an OOM killer report and check
> whether the timer has already fired at the beginning of the next OOM killer
> report.

Hmm, there is no way to make sure that previous OOM report made it to
consoles. So maybe timer approach will be as good as rate-limiting.

-ss
