INFO: rcu detected stall in sys_keyctl

79 views
Skip to first unread message

syzbot

unread,
Mar 4, 2020, 3:08:13 AM3/4/20
to da...@davemloft.net, her...@gondor.apana.org.au, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following crash on:

HEAD commit: 63623fd4 Merge tag 'for-linus' of git://git.kernel.org/pub..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15257ba1e00000
kernel config: https://syzkaller.appspot.com/x/.config?x=9833e26bab355358
dashboard link: https://syzkaller.appspot.com/bug?extid=0c5c2dbf76930df91489
compiler: gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+0c5c2d...@syzkaller.appspotmail.com

rcu: INFO: rcu_preempt self-detected stall on CPU
rcu: 0-....: (1 GPs behind) idle=576/1/0x4000000000000002 softirq=55718/56054 fqs=5235
(t=10500 jiffies g=63445 q=1523)
NMI backtrace for cpu 0
CPU: 0 PID: 18804 Comm: syz-executor.4 Not tainted 5.6.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x197/0x210 lib/dump_stack.c:118
nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
trigger_single_cpu_backtrace include/linux/nmi.h:164 [inline]
rcu_dump_cpu_stacks+0x183/0x1cf kernel/rcu/tree_stall.h:254
print_cpu_stall kernel/rcu/tree_stall.h:475 [inline]
check_cpu_stall kernel/rcu/tree_stall.h:549 [inline]
rcu_pending kernel/rcu/tree.c:3030 [inline]
rcu_sched_clock_irq.cold+0x51a/0xc37 kernel/rcu/tree.c:2276
update_process_times+0x2d/0x70 kernel/time/timer.c:1726
tick_sched_handle+0xa2/0x190 kernel/time/tick-sched.c:171
tick_sched_timer+0x53/0x140 kernel/time/tick-sched.c:1314
__run_hrtimer kernel/time/hrtimer.c:1517 [inline]
__hrtimer_run_queues+0x364/0xe40 kernel/time/hrtimer.c:1579
hrtimer_interrupt+0x314/0x770 kernel/time/hrtimer.c:1641
local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1119 [inline]
smp_apic_timer_interrupt+0x160/0x610 arch/x86/kernel/apic/apic.c:1144
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
</IRQ>
RIP: 0010:__sanitizer_cov_trace_const_cmp4+0x16/0x20 kernel/kcov.c:276
Code: 48 89 e5 48 8b 4d 08 e8 d8 fe ff ff 5d c3 66 0f 1f 44 00 00 55 89 f2 89 fe bf 05 00 00 00 48 89 e5 48 8b 4d 08 e8 ba fe ff ff <5d> c3 0f 1f 84 00 00 00 00 00 55 48 89 f2 48 89 fe bf 07 00 00 00
RSP: 0018:ffffc900053877d8 EFLAGS: 00000297 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000002 RBX: 584279fc973b765a RCX: ffffffff83b71a81
RDX: 00000000ffffff75 RSI: 0000000000000000 RDI: 0000000000000005
RBP: ffffc900053877d8 R08: ffff888041a265c0 R09: 0000000000000092
R10: ffffed1015d0707b R11: ffff8880ae8383db R12: ffff88809eb56398
R13: 000000003ab2c4e4 R14: 1b0d4377a72d08f5 R15: 00000000ffffff75
mpihelp_submul_1+0x161/0x1a0 lib/mpi/generic_mpih-mul3.c:45
mpihelp_divrem+0x1ce/0x1360 lib/mpi/mpih-div.c:209
mpi_powm+0xffb/0x1d20 lib/mpi/mpi-pow.c:205
_compute_val crypto/dh.c:39 [inline]
dh_compute_value+0x373/0x610 crypto/dh.c:178
crypto_kpp_generate_public_key include/crypto/kpp.h:315 [inline]
__keyctl_dh_compute+0x9ae/0x1470 security/keys/dh.c:367
keyctl_dh_compute+0xcf/0x12d security/keys/dh.c:422
__do_sys_keyctl security/keys/keyctl.c:1818 [inline]
__se_sys_keyctl security/keys/keyctl.c:1714 [inline]
__x64_sys_keyctl+0x159/0x470 security/keys/keyctl.c:1714
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x45c479
Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f872a51fc78 EFLAGS: 00000246 ORIG_RAX: 00000000000000fa
RAX: ffffffffffffffda RBX: 00007f872a5206d4 RCX: 000000000045c479
RDX: 0000000020002700 RSI: 0000000020000400 RDI: 0000000000000017
RBP: 000000000076bfc0 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000ffffff84 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000000006fa R14: 00000000004c9883 R15: 000000000076bfcc


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Dmitry Vyukov

unread,
Mar 4, 2020, 3:15:41 AM3/4/20
to syzbot, David Miller, Herbert Xu, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, LKML, syzkaller-bugs, Kate Stewart, Thomas Gleixner, Eric Biggers, all...@lohutok.net
+lib/mpi maintainers

I wonder if this can also be triggered by remote actors (tls, wifi, usb, etc).
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bug...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/000000000000dd909105a002ebe6%40google.com.

Dmitry Vyukov

unread,
Mar 4, 2020, 3:59:50 AM3/4/20
to Kris Karas, syzbot, David Miller, Herbert Xu, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, LKML, syzkaller-bugs, Kate Stewart, Thomas Gleixner, Eric Biggers, all...@lohutok.net
On Wed, Mar 4, 2020 at 9:41 AM Kris Karas <linux...@moonlit-rail.com> wrote:
>
> Resending this to all the original CCs per suggestion of Dmitry.
> I'm not a member of linux-crypto, no idea if it will bounce; in any
> case, the OOPS I saw does not appear to be crypto related.
>
> Dmitry Vyukov wrote:
> > +lib/mpi maintainers
> >
> > I wonder if this can also be triggered by remote actors (tls, wifi,
> > usb, etc).
> >
>
> This looks somewhat similar to an OOPS + rcu stall I reported earlier in
> reply to Greg KH's announcement of 5.5.7:
>
> rcu: INFO: rcu_sched self-detected stall on CPU
> rcu: 14-....: (20999 ticks this GP)
> idle=216/1/0x4000000000000002 softirq=454/454 fqs=5250
> (t=21004 jiffies g=-755 q=1327)
> NMI backtrace for cpu 14
> CPU: 14 PID: 520 Comm: pidof Tainted: G D 5.5.7 #1
> Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470
> Taichi, BIOS P3.50 07/18/2019
> Call Trace:
> <IRQ>
> dump_stack+0x50/0x70
> nmi_cpu_backtrace.cold+0x14/0x53
> ? lapic_can_unplug_cpu.cold+0x44/0x44
> nmi_trigger_cpumask_backtrace+0x7b/0x88
> rcu_dump_cpu_stacks+0x7b/0xa9
> rcu_sched_clock_irq.cold+0x152/0x39b
> update_process_times+0x1f/0x50
> tick_sched_timer+0x40/0x90
> ? tick_sched_do_timer+0x50/0x50
> __hrtimer_run_queues+0xdd/0x180
> hrtimer_interrupt+0x108/0x230
> smp_apic_timer_interrupt+0x53/0xa0
> apic_timer_interrupt+0xf/0x20
> </IRQ>
>
> I don't have a reproducer for it, either. It showed up in 5.5.7 (but
> might be from earlier as it reproduces so infrequently).

Hi Kris,

What follows after this stack? That's the most interesting part. The
part that you showed is common for all stalls and does not mean much,
besides the fact that there is a stall. These can well be very
different stalls in different parts of kernel.

Hillf Danton

unread,
Mar 4, 2020, 5:29:12 AM3/4/20
to syzbot, da...@davemloft.net, her...@gondor.apana.org.au, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com

On Wed, 04 Mar 2020 00:08:11 -0800
Cut the chance of being a cpu hog.

--- a/lib/mpi/mpih-div.c
+++ b/lib/mpi/mpih-div.c
@@ -215,6 +215,8 @@ q_test:

qp[i] = q;
n0 = np[dsize - 1];
+
+ cond_resched();
}
}
}

Dmitry Vyukov

unread,
Mar 4, 2020, 5:45:19 AM3/4/20
to Hillf Danton, syzbot, David Miller, Herbert Xu, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, LKML, syzkaller-bugs
Isn't it looping infinitely? cond_resched() won't help in such case.

Kris Karas

unread,
Mar 4, 2020, 7:25:02 AM3/4/20
to syzbot, David Miller, Herbert Xu, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, LKML, syzkaller-bugs, Kate Stewart, Thomas Gleixner, Eric Biggers, all...@lohutok.net
Resending this to all the original CCs per suggestion of Dmitry.
I'm not a member of linux-crypto, no idea if it will bounce; in any
case, the OOPS I saw does not appear to be crypto related.

Dmitry Vyukov wrote:
> syzbot wrote:
> +lib/mpi maintainers
>
> I wonder if this can also be triggered by remote actors (tls, wifi,
> usb, etc).
>

This looks somewhat similar to an OOPS + rcu stall I reported earlier in
reply to Greg KH's announcement of 5.5.7:

    rcu: INFO: rcu_sched self-detected stall on CPU
    rcu:    14-....: (20999 ticks this GP)
idle=216/1/0x4000000000000002 softirq=454/454 fqs=5250
            (t=21004 jiffies g=-755 q=1327)
    NMI backtrace for cpu 14
    CPU: 14 PID: 520 Comm: pidof Tainted: G      D           5.5.7 #1
    Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470
Taichi, BIOS P3.50 07/18/2019
    Call Trace:
     <IRQ>
     dump_stack+0x50/0x70
     nmi_cpu_backtrace.cold+0x14/0x53
     ? lapic_can_unplug_cpu.cold+0x44/0x44
     nmi_trigger_cpumask_backtrace+0x7b/0x88
     rcu_dump_cpu_stacks+0x7b/0xa9
     rcu_sched_clock_irq.cold+0x152/0x39b
     update_process_times+0x1f/0x50
     tick_sched_timer+0x40/0x90
     ? tick_sched_do_timer+0x50/0x50
     __hrtimer_run_queues+0xdd/0x180
     hrtimer_interrupt+0x108/0x230
     smp_apic_timer_interrupt+0x53/0xa0
     apic_timer_interrupt+0xf/0x20
     </IRQ>

I don't have a reproducer for it, either.  It showed up in 5.5.7 (but
might be from earlier as it reproduces so infrequently).

Kris


Kris Karas

unread,
Mar 4, 2020, 3:35:59 PM3/4/20
to Dmitry Vyukov, syzbot, David Miller, Herbert Xu, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, LKML, syzkaller-bugs, Kate Stewart, Thomas Gleixner, Eric Biggers, all...@lohutok.net
Dmitry Vyukov wrote:
> Kris Karas wrote:
>> [...]
>> rcu: INFO: rcu_sched self-detected stall on CPU
>> rcu: 14-....: (20999 ticks this GP)
>> idle=216/1/0x4000000000000002 softirq=454/454 fqs=5250
>> (t=21004 jiffies g=-755 q=1327)
>> NMI backtrace for cpu 14
>> CPU: 14 PID: 520 Comm: pidof Tainted: G D 5.5.7 #1
>> [...]
>> I don't have a reproducer for it, either. It showed up in 5.5.7 (but
>> might be from earlier as it reproduces so infrequently).
> Hi Kris,
>
> What follows after this stack? That's the most interesting part. The
> part that you showed is common for all stalls and does not mean much,
> besides the fact that there is a stall. These can well be very
> different stalls in different parts of kernel.

Hi Dmitry,

Sorry, dummy me, I should have found my original post in Lore and posted
a link to that.
Oh, here we go:

https://lore.kernel.org/lkml/6d4f9ac8-a478-2ae4...@moonlit-rail.com/

Given that the stall stack is not terribly useful, it would seem that
the OOPS I saw was probably unrelated to this one caught by syzbot,
though the stalled CPU does make me curious (as in all the OOPSen I've
encountered in the past 25 years have rarely mentioned an RCU stall). 
For convenience, I'll re-post everything I was able to salvage from
dmesg originally.

Kris

The OOPS in the dump, below, occurred while the machine was booting,
right about the time that /sbin/init switched from runstate S => 3. 
System daemons (haveged, named, syslogd, etc...) were starting. The OOPS
occurred in /bin/pidof, which is no doubt checking whether a daemon is
up before attempting to start it.  Under the OOPS, the filesystem was
functioning (at least well enough to save dmesg to a file), though many
things were hanging.  It required an Alt-SysRq-E to get me a login
prompt, and I lucked out in PAM and friends working well enough to give
me a functioning command prompt. Alt-SysRq-S,U,S,B was necessary to
reboot.  Without further ado...

BUG: kernel NULL pointer dereference, address: 00000000000000e8
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 7f6a50067 P4D 7f6a50067 PUD 7f6a51067 PMD 0
Oops: 0000 [#1] SMP
CPU: 3 PID: 516 Comm: pidof Not tainted 5.5.7 #1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi, BIOS P3.50 07/18/2019
RIP: 0010:cap_capable+0x13/0x70
Code: bf f4 ff ff ff 66 90 e9 01 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 4c 8b 87 88 00 00 00 4c 39 c6 74 39 45 8b 88 e8 00 00 00 <44> 39 8e e8 00 00 00 7e 18 48 8b 86 e0 00 00 00 4c 39 c0 74 12 48
RSP: 0018:ffffc90000777cb0 EFLAGS: 00010207
RAX: ffff8887f96ea000 RBX: 0000000000000002 RCX: 0000000000000002
RDX: 0000000000000013 RSI: 0000000000000000 RDI: ffff8887f9646480
RBP: 0000000000000013 R08: ffffffff82423da0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8887f9646480 R14: ffffffff822a7620 R15: ffff8887fae600c0
FS: 00007f8ee26cd740(0000) GS:ffff8887fecc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000e8 CR3: 00000007f8691000 CR4: 00000000003406e0
Call Trace:
security_capable+0x36/0x50
ptrace_has_cap+0x14/0x30
__ptrace_may_access+0x76/0x110
ptrace_may_access+0x28/0x50
do_task_stat+0x7b/0xd90
? do_filp_open+0xab/0x100
proc_single_show+0x54/0xc0
? __kmalloc+0x183/0x210
seq_read+0xbb/0x3c0
vfs_read+0xc6/0x150
ksys_read+0x6b/0x100
do_syscall_64+0x3d/0x120
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f8ee27d682e
Code: c0 e9 f6 fe ff ff 50 48 8d 3d b6 5d 0a 00 e8 e9 fd 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
RSP: 002b:00007ffdc7fdcf58 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 00007f8ee28ce958 RCX: 00007f8ee27d682e
RDX: 0000000000000400 RSI: 00000000017e2590 RDI: 0000000000000004
RBP: 00007f8ee28ce950 R08: 00007f8ee28ac120 R09: 00007ffdc7fdce00
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffdc7fddf02
Modules linked in:
CR2: 00000000000000e8
---[ end trace 9da0e81512fbb929 ]---
RIP: 0010:cap_capable+0x13/0x70
Code: bf f4 ff ff ff 66 90 e9 01 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 4c 8b 87 88 00 00 00 4c 39 c6 74 39 45 8b 88 e8 00 00 00 <44> 39 8e e8 00 00 00 7e 18 48 8b 86 e0 00 00 00 4c 39 c0 74 12 48
RSP: 0018:ffffc90000777cb0 EFLAGS: 00010207
RAX: ffff8887f96ea000 RBX: 0000000000000002 RCX: 0000000000000002
RDX: 0000000000000013 RSI: 0000000000000000 RDI: ffff8887f9646480
RBP: 0000000000000013 R08: ffffffff82423da0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8887f9646480 R14: ffffffff822a7620 R15: ffff8887fae600c0
FS: 00007f8ee26cd740(0000) GS:ffff8887fecc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000e8 CR3: 00000007f8691000 CR4: 00000000003406e0
udevd[518]: starting eudev-3.2.9
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 14-....: (20999 ticks this GP) idle=216/1/0x4000000000000002 softirq=454/454 fqs=5250
(t=21004 jiffies g=-755 q=1327)
NMI backtrace for cpu 14
CPU: 14 PID: 520 Comm: pidof Tainted: G D 5.5.7 #1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi, BIOS P3.50 07/18/2019
Call Trace:
<IRQ>
dump_stack+0x50/0x70
nmi_cpu_backtrace.cold+0x14/0x53
? lapic_can_unplug_cpu.cold+0x44/0x44
nmi_trigger_cpumask_backtrace+0x7b/0x88
rcu_dump_cpu_stacks+0x7b/0xa9
rcu_sched_clock_irq.cold+0x152/0x39b
update_process_times+0x1f/0x50
tick_sched_timer+0x40/0x90
? tick_sched_do_timer+0x50/0x50
__hrtimer_run_queues+0xdd/0x180
hrtimer_interrupt+0x108/0x230
smp_apic_timer_interrupt+0x53/0xa0
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:queued_spin_lock_slowpath+0x41/0x1a0
Code: 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 18 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 8b 07 84 c0 <75> f8 66 c7 07 01 00 c3 f6 c4 01 75 04 c6 47 01 00 48 c7 c0 40 29
RSP: 0018:ffffc90001e87d08 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000101 RBX: ffff8887f96f0000 RCX: ffff8887f96f0000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8887f96f0658
RBP: 000000000000000d R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887f50b7080
R13: ffffffff82424480 R14: ffffffff82424480 R15: ffff8887f50b70c0
ptrace_may_access+0x1e/0x50
do_task_stat+0x7b/0xd90
? do_filp_open+0xab/0x100
proc_single_show+0x54/0xc0
? __kmalloc+0x183/0x210
seq_read+0xbb/0x3c0
vfs_read+0xc6/0x150
ksys_read+0x6b/0x100
do_syscall_64+0x3d/0x120
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f666796082e
Code: c0 e9 f6 fe ff ff 50 48 8d 3d b6 5d 0a 00 e8 e9 fd 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
RSP: 002b:00007ffe26688b38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 00007f6667a58958 RCX: 00007f666796082e
RDX: 0000000000000400 RSI: 000000000153e590 RDI: 0000000000000004
RBP: 00007f6667a58950 R08: 00007f6667a36120 R09: 00007ffe266889e0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffe2668aee9
...
sysrq: Terminate All Tasks
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 14-....: (83876 ticks this GP) idle=216/1/0x4000000000000002 softirq=454/454 fqs=20970
(t=84003 jiffies g=-755 q=2695)
NMI backtrace for cpu 14
CPU: 14 PID: 520 Comm: pidof Tainted: G D 5.5.7 #1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi, BIOS P3.50 07/18/2019
Call Trace:
<IRQ>
dump_stack+0x50/0x70
nmi_cpu_backtrace.cold+0x14/0x53
? lapic_can_unplug_cpu.cold+0x44/0x44
nmi_trigger_cpumask_backtrace+0x7b/0x88
rcu_dump_cpu_stacks+0x7b/0xa9
rcu_sched_clock_irq.cold+0x152/0x39b
update_process_times+0x1f/0x50
tick_sched_timer+0x40/0x90
? tick_sched_do_timer+0x50/0x50
__hrtimer_run_queues+0xdd/0x180
hrtimer_interrupt+0x108/0x230
smp_apic_timer_interrupt+0x53/0xa0
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:queued_spin_lock_slowpath+0x3d/0x1a0
Code: 3e f0 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 18 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 66 c7 07 01 00 c3 f6 c4 01 75 04 c6 47 01 00 48
RSP: 0018:ffffc90001e87d08 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000101 RBX: ffff8887f96f0000 RCX: ffff8887f96f0000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8887f96f0658
RBP: 000000000000000d R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887f50b7080
R13: ffffffff82424480 R14: ffffffff82424480 R15: ffff8887f50b70c0
ptrace_may_access+0x1e/0x50
do_task_stat+0x7b/0xd90
? do_filp_open+0xab/0x100
proc_single_show+0x54/0xc0
? __kmalloc+0x183/0x210
seq_read+0xbb/0x3c0
vfs_read+0xc6/0x150
ksys_read+0x6b/0x100
do_syscall_64+0x3d/0x120
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f666796082e
Code: c0 e9 f6 fe ff ff 50 48 8d 3d b6 5d 0a 00 e8 e9 fd 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
RSP: 002b:00007ffe26688b38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 00007f6667a58958 RCX: 00007f666796082e
RDX: 0000000000000400 RSI: 000000000153e590 RDI: 0000000000000004
RBP: 00007f6667a58950 R08: 00007f6667a36120 R09: 00007ffe266889e0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffe2668aee9
rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 14-... } 21041 jiffies s: 45 root: 0x4000/.
rcu: blocking rcu_node structures:
Task dump for CPU 14:
pidof R running task 0 520 1 0x8000000c
Call Trace:
? do_syscall_64+0x3d/0x120
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 14-....: (146878 ticks this GP) idle=216/1/0x4000000000000002 softirq=454/454 fqs=36715
(t=147006 jiffies g=-755 q=3376)
NMI backtrace for cpu 14
CPU: 14 PID: 520 Comm: pidof Tainted: G D 5.5.7 #1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi, BIOS P3.50 07/18/2019
Call Trace:
<IRQ>
dump_stack+0x50/0x70
nmi_cpu_backtrace.cold+0x14/0x53
? lapic_can_unplug_cpu.cold+0x44/0x44
nmi_trigger_cpumask_backtrace+0x7b/0x88
rcu_dump_cpu_stacks+0x7b/0xa9
rcu_sched_clock_irq.cold+0x152/0x39b
update_process_times+0x1f/0x50
tick_sched_timer+0x40/0x90
? tick_sched_do_timer+0x50/0x50
__hrtimer_run_queues+0xdd/0x180
hrtimer_interrupt+0x108/0x230
smp_apic_timer_interrupt+0x53/0xa0
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:queued_spin_lock_slowpath+0x3d/0x1a0
Code: 3e f0 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 18 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 66 c7 07 01 00 c3 f6 c4 01 75 04 c6 47 01 00 48
RSP: 0018:ffffc90001e87d08 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000101 RBX: ffff8887f96f0000 RCX: ffff8887f96f0000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8887f96f0658
RBP: 000000000000000d R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887f50b7080
R13: ffffffff82424480 R14: ffffffff82424480 R15: ffff8887f50b70c0
ptrace_may_access+0x1e/0x50
do_task_stat+0x7b/0xd90
? do_filp_open+0xab/0x100
proc_single_show+0x54/0xc0
? __kmalloc+0x183/0x210
seq_read+0xbb/0x3c0
vfs_read+0xc6/0x150
ksys_read+0x6b/0x100
do_syscall_64+0x3d/0x120
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f666796082e
Code: c0 e9 f6 fe ff ff 50 48 8d 3d b6 5d 0a 00 e8 e9 fd 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
RSP: 002b:00007ffe26688b38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 00007f6667a58958 RCX: 00007f666796082e
RDX: 0000000000000400 RSI: 000000000153e590 RDI: 0000000000000004
RBP: 00007f6667a58950 R08: 00007f6667a36120 R09: 00007ffe266889e0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffe2668aee9
rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 14-... } 85041 jiffies s: 45 root: 0x4000/.
rcu: blocking rcu_node structures:
Task dump for CPU 14:
pidof R running task 0 520 1 0x8000000c
Call Trace:
? do_syscall_64+0x3d/0x120
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 14-....: (209792 ticks this GP) idle=216/1/0x4000000000000002 softirq=454/454 fqs=52439
(t=210009 jiffies g=-755 q=3747)
NMI backtrace for cpu 14
CPU: 14 PID: 520 Comm: pidof Tainted: G D 5.5.7 #1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi, BIOS P3.50 07/18/2019
Call Trace:
<IRQ>
dump_stack+0x50/0x70
nmi_cpu_backtrace.cold+0x14/0x53
? lapic_can_unplug_cpu.cold+0x44/0x44
nmi_trigger_cpumask_backtrace+0x7b/0x88
rcu_dump_cpu_stacks+0x7b/0xa9
rcu_sched_clock_irq.cold+0x152/0x39b
update_process_times+0x1f/0x50
tick_sched_timer+0x40/0x90
? tick_sched_do_timer+0x50/0x50
__hrtimer_run_queues+0xdd/0x180
hrtimer_interrupt+0x108/0x230
smp_apic_timer_interrupt+0x53/0xa0
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:queued_spin_lock_slowpath+0x3d/0x1a0
Code: 3e f0 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 18 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 <8b> 07 84 c0 75 f8 66 c7 07 01 00 c3 f6 c4 01 75 04 c6 47 01 00 48
RSP: 0018:ffffc90001e87d08 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000101 RBX: ffff8887f96f0000 RCX: ffff8887f96f0000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8887f96f0658
RBP: 000000000000000d R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887f50b7080
R13: ffffffff82424480 R14: ffffffff82424480 R15: ffff8887f50b70c0
ptrace_may_access+0x1e/0x50
do_task_stat+0x7b/0xd90
? do_filp_open+0xab/0x100
proc_single_show+0x54/0xc0
? __kmalloc+0x183/0x210
seq_read+0xbb/0x3c0
vfs_read+0xc6/0x150
ksys_read+0x6b/0x100
do_syscall_64+0x3d/0x120
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f666796082e
Code: c0 e9 f6 fe ff ff 50 48 8d 3d b6 5d 0a 00 e8 e9 fd 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
RSP: 002b:00007ffe26688b38 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 00007f6667a58958 RCX: 00007f666796082e
RDX: 0000000000000400 RSI: 000000000153e590 RDI: 0000000000000004
RBP: 00007f6667a58950 R08: 00007f6667a36120 R09: 00007ffe266889e0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffe2668aee9
rcu: INFO: rcu_sched detected expedited stalls on CPUs/tasks: { 14-... } 150577 jiffies s: 45 root: 0x4000/.
rcu: blocking rcu_node structures:
Task dump for CPU 14:
pidof R running task 0 520 1 0x8000000c
Call Trace:
? do_syscall_64+0x3d/0x120
? entry_SYSCALL_64_after_hwframe+0x44/0xa9



Dmitry Vyukov

unread,
Mar 5, 2020, 1:10:52 AM3/5/20
to Kris Karas, syzbot, David Miller, Herbert Xu, open list:HARDWARE RANDOM NUMBER GENERATOR CORE, LKML, syzkaller-bugs, Kate Stewart, Thomas Gleixner, Eric Biggers, all...@lohutok.net
On Wed, Mar 4, 2020 at 9:36 PM Kris Karas <linux...@moonlit-rail.com> wrote:
>
> Dmitry Vyukov wrote:
> > Kris Karas wrote:
> >> [...]
> >> rcu: INFO: rcu_sched self-detected stall on CPU
> >> rcu: 14-....: (20999 ticks this GP)
> >> idle=216/1/0x4000000000000002 softirq=454/454 fqs=5250
> >> (t=21004 jiffies g=-755 q=1327)
> >> NMI backtrace for cpu 14
> >> CPU: 14 PID: 520 Comm: pidof Tainted: G D 5.5.7 #1
> >> [...]
> >> I don't have a reproducer for it, either. It showed up in 5.5.7 (but
> >> might be from earlier as it reproduces so infrequently).
> > Hi Kris,
> >
> > What follows after this stack? That's the most interesting part. The
> > part that you showed is common for all stalls and does not mean much,
> > besides the fact that there is a stall. These can well be very
> > different stalls in different parts of kernel.
>
> Hi Dmitry,
>
> Sorry, dummy me, I should have found my original post in Lore and posted
> a link to that.
> Oh, here we go:
>
> https://lore.kernel.org/lkml/6d4f9ac8-a478-2ae4...@moonlit-rail.com/

These all mention ptrace_may_access, so that looks like a different bug to me.

> Given that the stall stack is not terribly useful, it would seem that
> the OOPS I saw was probably unrelated to this one caught by syzbot,
> though the stalled CPU does make me curious (as in all the OOPSen I've
> encountered in the past 25 years have rarely mentioned an RCU stall).

Well, it means you don't stress the kernel hard enough :)
Search for "rcu detected stall" here:
https://syzkaller.appspot.com/upstream
https://syzkaller.appspot.com/upstream/fixed
https://syzkaller.appspot.com/linux-4.19
https://syzkaller.appspot.com/linux-4.14
and these are all different _bugs_, some of them have tens of
thousands of crash instances. And that's just in 2.5 years :)

syzbot

unread,
Jul 2, 2020, 1:07:10 AM7/2/20
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
Crashes did not happen for a while, no reproducer and no activity.
Reply all
Reply to author
Forward
0 new messages