perf: stalls in perf_install_in_context/perf_remove_from_context

70 views
Skip to first unread message

Dmitry Vyukov

unread,
Dec 15, 2015, 8:01:52 AM12/15/15
to Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Hello,

The following program causes stalls in
perf_install_in_context/perf_remove_from_context (if run in a loop):

// autogenerated by syzkaller (http://github.com/google/syzkaller)
#include <syscall.h>
#include <string.h>
#include <stdint.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>
#include <grp.h>

void f(int pid, int cpu)
{
*(uint32_t*)0x20000000 = 0x1;
*(uint32_t*)0x20000004 = 0x70;
*(uint64_t*)0x20000008 = 0x2;
*(uint64_t*)0x20000010 = 0x9;
*(uint64_t*)0x20000018 = 0x1;
*(uint64_t*)0x20000020 = 0x7;
*(uint8_t*)0x20000028 = 0x8;
*(uint8_t*)0x20000029 = 0x5;
*(uint8_t*)0x2000002a = 0x03;
*(uint8_t*)0x2000002b = 0x0;
*(uint32_t*)0x2000002c = 0x0;
*(uint32_t*)0x20000030 = 0x0;
*(uint32_t*)0x20000034 = 0x4;
*(uint64_t*)0x20000038 = 0x5;
*(uint64_t*)0x20000040 = 0x9;
*(uint64_t*)0x20000048 = 0x4;
*(uint64_t*)0x20000050 = 0x2;
*(uint64_t*)0x20000058 = 0x5fbb;
*(uint32_t*)0x20000060 = 0x5;
*(uint64_t*)0x20000064 = 0x5;
*(uint32_t*)0x2000006c = 0x4;
syscall(SYS_perf_event_open, 0x20000000ul, pid, cpu%4,
0xfffffffffffffffful, 0x9ul, 0);
}

void *thr(void *arg)
{
for (int i = 0; i < 4; i++)
f(i%2 ? getpgid(0) : -1, i);
return 0;
}

int main()
{
setpgid(0, 0);
setgroups(0, NULL);
syscall(SYS_setresgid, 65534, 65534, 65534);
syscall(SYS_setresuid, 65534, 65534, 65534);
syscall(SYS_mmap, 0x20000000ul, 0x1000ul, 0x3ul, 0x32ul,
0xfffffffffffffffful, 0x0ul);
const int N = 4;
pthread_t th[N];
for (int i = 0; i < N; i++)
pthread_create(&th[i], 0, thr, 0);
for (int i = 0; i < N; i++)
pthread_join(th[i], 0);
return 0;
}


INFO: rcu_sched self-detected stall on CPU
0-...: (26000 ticks this GP) idle=1d5/140000000000001/0
softirq=342452/342452 fqs=8380
[ 844.885113] INFO: rcu_sched detected stalls on CPUs/tasks:
0-...: (26000 ticks this GP) idle=1d5/140000000000001/0
softirq=342452/342452 fqs=8380
(detected by 3, t=26002 jiffies, g=216456, c=216455, q=182)
Task dump for CPU 0:
a.out R running task 14768 4286 5666 0x0000000a
ffff88003c952a48 0000000000000000 0000000000000001 0000000000000000
0000000000000000 ffff8800719adb00 0000000000000006 0000000000000006
ffff8800719ae2e0 ffff8800719adb00 ffffffffffffff10 ffffffff823fde9e
Call Trace:
[< inline >] ? arch_local_irq_enable
./arch/x86/include/asm/paravirt.h:816
[< inline >] ? __raw_spin_unlock_irq
include/linux/spinlock_api_smp.h:170
[<ffffffff823fde9e>] ? _raw_spin_unlock_irq+0x2e/0x40
kernel/locking/spinlock.c:199
[< inline >] ? __raw_spin_unlock_irq
include/linux/spinlock_api_smp.h:170
[<ffffffff823fde97>] ? _raw_spin_unlock_irq+0x27/0x40
kernel/locking/spinlock.c:199
[<ffffffff8119c8bb>] ? perf_install_in_context+0x6b/0xf0
kernel/events/core.c:2173
[<ffffffff811a492c>] ? SYSC_perf_event_open+0x63c/0xe40
kernel/events/core.c:8565
[<ffffffff811a7739>] ? SyS_perf_event_open+0x9/0x10 kernel/events/core.c:8261
[<ffffffff823fe676>] ? entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185

(t=26008 jiffies g=216456 c=216455 q=182)
Task dump for CPU 0:
a.out R running task 14768 4286 5666 0x0000000a
0000000000001622 ffff88003ec03dc8 ffffffff810e5e38 ffffffff810e5d67
0000000000000000 ffffffff82bfd240 0000000000000092 ffff88003ec03de0
ffffffff810e8cd4 0000000000000001 ffff88003ec03e10 ffffffff81124435
Call Trace:
<IRQ> [<ffffffff810e5e38>] sched_show_task+0x148/0x250
kernel/sched/core.c:4934
[< inline >] ? stack_not_used include/linux/sched.h:2840
[<ffffffff810e5d67>] ? sched_show_task+0x77/0x250 kernel/sched/core.c:4922
[<ffffffff810e8cd4>] dump_cpu_task+0x34/0x40 kernel/sched/core.c:8611
[<ffffffff81124435>] rcu_dump_cpu_stacks+0x85/0xc0 kernel/rcu/tree.c:1221
[< inline >] print_cpu_stall kernel/rcu/tree.c:1328
[< inline >] check_cpu_stall kernel/rcu/tree.c:1392
[< inline >] __rcu_pending kernel/rcu/tree.c:3880
[< inline >] rcu_pending kernel/rcu/tree.c:3944
[<ffffffff8112825c>] rcu_check_callbacks+0x48c/0x770 kernel/rcu/tree.c:2784
[< inline >] ? __account_system_time kernel/sched/cputime.c:201
[<ffffffff810e974c>] ? account_system_time+0x7c/0x110
kernel/sched/cputime.c:228
[<ffffffff8113c990>] ? tick_sched_do_timer+0x30/0x30
kernel/time/tick-sched.c:132
[<ffffffff8112d164>] update_process_times+0x34/0x60 kernel/time/timer.c:1420
[<ffffffff8113c2d1>] tick_sched_handle.isra.17+0x31/0x40
kernel/time/tick-sched.c:151
[<ffffffff8113c9cb>] tick_sched_timer+0x3b/0x70 kernel/time/tick-sched.c:1070
[< inline >] __run_hrtimer kernel/time/hrtimer.c:1229
[<ffffffff8112dd49>] __hrtimer_run_queues+0xc9/0x270 kernel/time/hrtimer.c:1293
[<ffffffff8112e0a6>] hrtimer_interrupt+0xa6/0x1b0 kernel/time/hrtimer.c:1327
[<ffffffff81095d70>] local_apic_timer_interrupt+0x30/0x60
arch/x86/kernel/apic/apic.c:901
[<ffffffff81096828>] smp_apic_timer_interrupt+0x38/0x50
arch/x86/kernel/apic/apic.c:925
[<ffffffff823ff46c>] apic_timer_interrupt+0x8c/0xa0
arch/x86/entry/entry_64.S:520
<EOI> [< inline >] ? arch_local_irq_enable
./arch/x86/include/asm/paravirt.h:816
<EOI> [< inline >] ? __raw_spin_unlock_irq
include/linux/spinlock_api_smp.h:170
<EOI> [<ffffffff823fde9e>] ? _raw_spin_unlock_irq+0x2e/0x40
kernel/locking/spinlock.c:199
[< inline >] ? __raw_spin_unlock_irq
include/linux/spinlock_api_smp.h:170
[<ffffffff823fde97>] ? _raw_spin_unlock_irq+0x27/0x40
kernel/locking/spinlock.c:199
[<ffffffff8119c8bb>] perf_install_in_context+0x6b/0xf0
kernel/events/core.c:2173
[<ffffffff811a492c>] SYSC_perf_event_open+0x63c/0xe40 kernel/events/core.c:8565
[<ffffffff811a7739>] SyS_perf_event_open+0x9/0x10 kernel/events/core.c:8261
[<ffffffff823fe676>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185



I also frequently see similar stalls in perf_remove_from_context:

INFO: rcu_sched self-detected stall on CPU
2-...: (25911 ticks this GP) idle=839/140000000000001/0
softirq=2032313/2032313 fqs=8576
(t=26000 jiffies g=563525 c=563524 q=1996)
Task dump for CPU 2:
syzkaller_execu R running task 28992 14908 631 0x0000000a
ffff88002e0c7168 ffff88006da07c10 ffffffff8137ea59 0000000000000001
ffff88006da1ee00 0000000000000002 0000000000000000 dffffc0000000000
0000000000000004 ffff88006da07c30 ffffffff81386fe1 ffffffff86cb24c8
Call Trace:
<IRQ> [<ffffffff8137ea59>] sched_show_task+0x269/0x3b0
kernel/sched/core.c:4934
[<ffffffff81386fe1>] dump_cpu_task+0x71/0x90 kernel/sched/core.c:8611
[<ffffffff8143b257>] rcu_dump_cpu_stacks+0x1c7/0x280 kernel/rcu/tree.c:1222
[< inline >] print_cpu_stall kernel/rcu/tree.c:1329
[< inline >] check_cpu_stall kernel/rcu/tree.c:1393
[< inline >] __rcu_pending kernel/rcu/tree.c:3908
[< inline >] rcu_pending kernel/rcu/tree.c:3972
[<ffffffff81446a22>] rcu_check_callbacks+0xee2/0x19f0 kernel/rcu/tree.c:2785
[<ffffffff8151d7e3>] ? acct_account_cputime+0x63/0x80 kernel/tsacct.c:167
[<ffffffff8145767a>] update_process_times+0x3a/0x70 kernel/time/timer.c:1420
[<ffffffff8147f6af>] tick_sched_handle.isra.16+0xaf/0xe0
kernel/time/tick-sched.c:151
[<ffffffff81480d95>] tick_sched_timer+0x75/0x100 kernel/time/tick-sched.c:1070
[< inline >] __run_hrtimer kernel/time/hrtimer.c:1229
[<ffffffff81459443>] __hrtimer_run_queues+0x363/0xc10
kernel/time/hrtimer.c:1293
[<ffffffff81480d20>] ? tick_sched_do_timer+0xa0/0xa0
kernel/time/tick-sched.c:124
[<ffffffff814590e0>] ? retrigger_next_event+0x140/0x140
kernel/time/hrtimer.c:577
[<ffffffff8146d407>] ? ktime_get_update_offsets_now+0x1d7/0x2a0
kernel/time/timekeeping.c:1969
[<ffffffff8145b412>] hrtimer_interrupt+0x182/0x430 kernel/time/hrtimer.c:1327
[<ffffffff811f1149>] local_apic_timer_interrupt+0x79/0x100
arch/x86/kernel/apic/apic.c:901
[<ffffffff811f42be>] smp_apic_timer_interrupt+0x7e/0xa0
arch/x86/kernel/apic/apic.c:925
[<ffffffff85b6826c>] apic_timer_interrupt+0x8c/0xa0
arch/x86/entry/entry_64.S:520
<EOI> [< inline >] ? arch_local_irq_enable
./arch/x86/include/asm/paravirt.h:816
<EOI> [< inline >] ? __raw_spin_unlock_irq
include/linux/spinlock_api_smp.h:170
<EOI> [<ffffffff85b66a03>] ? _raw_spin_unlock_irq+0x33/0x40
kernel/locking/spinlock.c:199
[< inline >] ? __raw_spin_unlock_irq
include/linux/spinlock_api_smp.h:170
[<ffffffff85b669fc>] ? _raw_spin_unlock_irq+0x2c/0x40
kernel/locking/spinlock.c:199
[<ffffffff815a72b6>] perf_remove_from_context+0x1a6/0x360
kernel/events/core.c:1714
[<ffffffff815a7110>] ? perf_group_detach+0x560/0x560 kernel/events/core.c:1507
[< inline >] ? rcu_read_unlock include/linux/rcupdate.h:913
[<ffffffff815ae718>] ? perf_event_ctx_lock_nested+0x128/0x300
kernel/events/core.c:977
[<ffffffff815ae877>] ? perf_event_ctx_lock_nested+0x287/0x300
kernel/events/core.c:986
[<ffffffff815ae62f>] ? perf_event_ctx_lock_nested+0x3f/0x300
kernel/events/core.c:972
[<ffffffff815c14f0>] ? perf_event_release_kernel+0x20/0x20
kernel/events/core.c:3845
[<ffffffff815c13a5>] put_event+0x255/0x380 kernel/events/core.c:3835
[<ffffffff815c11bd>] ? put_event+0x6d/0x380 kernel/events/core.c:3818
[<ffffffff815c1527>] perf_release+0x37/0x50 kernel/events/core.c:3853
[<ffffffff81715f73>] __fput+0x233/0x780 fs/file_table.c:208
[<ffffffff81716545>] ____fput+0x15/0x20 fs/file_table.c:244
[<ffffffff8134679b>] task_work_run+0x16b/0x200 kernel/task_work.c:115
[< inline >] exit_task_work include/linux/task_work.h:21
[<ffffffff812f4d3b>] do_exit+0x8bb/0x2b20 kernel/exit.c:750
[<ffffffff812f4480>] ? mm_update_next_owner+0x6f0/0x6f0 kernel/exit.c:357
[<ffffffff813eabe0>] ? debug_check_no_locks_freed+0x310/0x310
kernel/locking/lockdep.c:4123
[< inline >] ? arch_send_call_function_single_ipi
./arch/x86/include/asm/smp.h:134
[<ffffffff8148cbe8>] ? generic_exec_single+0x158/0x340 kernel/smp.c:182
[<ffffffff814374fa>] ? debug_lockdep_rcu_enabled+0x9a/0xb0
kernel/rcu/update.c:242
[< inline >] ? arch_local_irq_restore
./arch/x86/include/asm/paravirt.h:806
[<ffffffff813f16a6>] ? lock_release+0x7f6/0xc80 kernel/locking/lockdep.c:3607
[<ffffffff828c6083>] ? delay_tsc+0x83/0xa0 arch/x86/lib/delay.c:84
[<ffffffff8130e36b>] ? recalc_sigpending_tsk+0x13b/0x180 kernel/signal.c:145
[<ffffffff812f7118>] do_group_exit+0x108/0x320 kernel/exit.c:880
[<ffffffff813196c4>] get_signal+0x5e4/0x1500 kernel/signal.c:2307
[<ffffffff811507a3>] do_signal+0x83/0x1c90 arch/x86/kernel/signal.c:712
[<ffffffff813ea582>] ? trace_hardirqs_on+0x12/0x20
kernel/locking/lockdep.c:2619
[<ffffffff81150720>] ? setup_sigcontext+0x780/0x780
arch/x86/kernel/signal.c:165
[<ffffffff8176dffd>] ? fd_install+0x4d/0x60 fs/file.c:626
[<ffffffff815c4393>] ? SYSC_perf_event_open+0x503/0x1fa0
kernel/events/core.c:8060
[<ffffffff815c3e90>] ? perf_event_set_output+0x3b0/0x3b0
kernel/events/core.c:8169
[<ffffffff81003901>] exit_to_usermode_loop+0xf1/0x1a0
arch/x86/entry/common.c:247
[< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:282
[<ffffffff8100631f>] syscall_return_slowpath+0x19f/0x210
arch/x86/entry/common.c:344
[<ffffffff85b675e2>] int_ret_from_sys_call+0x25/0x9f
arch/x86/entry/entry_64.S:281

After the stall message machine indeed hangs.

On commit 6764e5ebd5c62236d082f9ae030674467d0b2779 (Dec 9).

Peter Zijlstra

unread,
Dec 15, 2015, 8:39:51 AM12/15/15
to Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On Tue, Dec 15, 2015 at 02:01:32PM +0100, Dmitry Vyukov wrote:


> Call Trace:
> <IRQ> [<ffffffff810e5e38>] sched_show_task+0x148/0x250 kernel/sched/core.c:4934

<snip>

> [<ffffffff823ff46c>] apic_timer_interrupt+0x8c/0xa0 arch/x86/entry/entry_64.S:520
>
> <EOI> [< inline >] ? arch_local_irq_enable ./arch/x86/include/asm/paravirt.h:816
> <EOI> [< inline >] ? __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:170
> <EOI> [<ffffffff823fde9e>] ? _raw_spin_unlock_irq+0x2e/0x40 kernel/locking/spinlock.c:199
> [< inline >] ? __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:170
> [<ffffffff823fde97>] ? _raw_spin_unlock_irq+0x27/0x40 kernel/locking/spinlock.c:199
> [<ffffffff8119c8bb>] perf_install_in_context+0x6b/0xf0 kernel/events/core.c:2173
> [<ffffffff811a492c>] SYSC_perf_event_open+0x63c/0xe40 kernel/events/core.c:8565
> [<ffffffff811a7739>] SyS_perf_event_open+0x9/0x10 kernel/events/core.c:8261
> [<ffffffff823fe676>] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185


So this looks like its stuck in the retry loop in
perf_install_in_context(). We do the raw_spin_unlock_irq(), which allows
the above watchdog interrupt to happen, but make no progress.

This means the task_function_call() thing keeps failing, which is
extremely unlikely.

Let me have a play with your test-case to see if it reproduces.

Dmitry Vyukov

unread,
Dec 15, 2015, 8:41:48 AM12/15/15
to Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Yeah, both perf_install_in_context and perf_remove_from_context hang
in a very similar loops.

Dmitry Vyukov

unread,
Dec 15, 2015, 8:45:04 AM12/15/15
to Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
Note that the test uses cpu%4 because I have 4 cpus. Here is how I
start qemu: -numa node,nodeid=0,cpus=0-1 -numa node,nodeid=1,cpus=2-3
-smp sockets=2,cores=2,threads=1

Peter Zijlstra

unread,
Dec 15, 2015, 8:57:30 AM12/15/15
to Dmitry Vyukov, Ingo Molnar, Arnaldo Carvalho de Melo, LKML, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On Tue, Dec 15, 2015 at 02:44:43PM +0100, Dmitry Vyukov wrote:
> Note that the test uses cpu%4 because I have 4 cpus. Here is how I
> start qemu: -numa node,nodeid=0,cpus=0-1 -numa node,nodeid=1,cpus=2-3
> -smp sockets=2,cores=2,threads=1

I'll first try on real hardware. I distrust this virt muck :-)
Reply all
Reply to author
Forward
0 new messages