Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

perf sw_event related lockup

25 views
Skip to first unread message

Vince Weaver

unread,
Nov 13, 2013, 5:40:01 PM11/13/13
to
Hello

so with the perf_fuzzer modified to avoid the tracepoint issues, I've
triggered this software-event related soft lockup.

From what I can gather from the backtraces, they all map to the loop
in do_perf_sw_event() in kernel/events/core.c

hlist_for_each_entry_rcu(event, head, hlist_entry) {
if (perf_swevent_match(event, type, event_id, data, regs))
perf_swevent_event(event, nr, data, regs);
}

is it possible to get stuck in that as an infinite loop?

below is the dmesg from the lockup, I eventually had to reboot to clear
the problem:

[ 345.354950] NOHZ: local_softirq_pending 100
[ 416.043831] ------------[ cut here ]------------
[ 416.047607] WARNING: CPU: 0 PID: 2853 at arch/x86/kernel/cpu/perf_event.c:1076 x86_pmu_start+0x46/0xee()
[ 416.047607] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative f71882fg acpi_cpufreq mcs7830 psmouse usbnet evdev serio_raw coretemp pcspkr video wmi processor ohci_pci thermal_sys ohci_hcd button i2c_nforce2 sg ehci_pci ehci_hcd sd_mod usbcore usb_common
[ 416.047607] CPU: 0 PID: 2853 Comm: perf_fuzzer Not tainted 3.12.0 #2
[ 416.047607] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 416.047607] 0000000000000434 ffff88011ad81bc8 ffffffff8151aad6 0000000000000000
[ 416.047607] ffffffff8103cd68 ffff88011fc15b38 ffffffff810127b3 ffff88011b313400
[ 416.047607] 0000000000000021 0000000000000000 0000000000000000 ffff88011fc0b940
[ 416.047607] Call Trace:
[ 416.047607] [<ffffffff8151aad6>] ? dump_stack+0x41/0x53
[ 416.047607] [<ffffffff8103cd68>] ? warn_slowpath_common+0x81/0x9c
[ 416.047607] [<ffffffff810127b3>] ? x86_pmu_start+0x46/0xee
[ 416.047607] [<ffffffff810127b3>] ? x86_pmu_start+0x46/0xee
[ 416.047607] [<ffffffff81012a2e>] ? x86_pmu_enable+0x1d3/0x285
[ 416.047607] [<ffffffff810c79e0>] ? perf_event_context_sched_in+0x6d/0x8d
[ 416.047607] [<ffffffff810c7a21>] ? __perf_event_task_sched_in+0x21/0x108
[ 416.047607] [<ffffffff8106657b>] ? finish_task_switch+0x46/0x98
[ 416.047607] [<ffffffff8151c9ee>] ? __schedule+0x4f7/0x54b
[ 416.047607] [<ffffffff810667ad>] ? __cond_resched+0x2a/0x35
[ 416.047607] [<ffffffff8151cabd>] ? _cond_resched+0x2c/0x37
[ 416.047607] [<ffffffff8151afc1>] ? mutex_lock+0x16/0x37
[ 416.047607] [<ffffffff8131575b>] ? n_tty_write+0x2aa/0x3eb
[ 416.047607] [<ffffffff810664d6>] ? try_to_wake_up+0x1ab/0x1ab
[ 416.047607] [<ffffffff81312de6>] ? tty_write+0x19c/0x22e
[ 416.047607] [<ffffffff813154b1>] ? n_tty_ioctl+0xc5/0xc5
[ 416.047607] [<ffffffff81112a67>] ? vfs_write+0xb9/0x132
[ 416.047607] [<ffffffff81112bc2>] ? SyS_write+0x59/0x86
[ 416.047607] [<ffffffff81524e11>] ? tracesys+0xd4/0xd9
[ 416.047607] ---[ end trace ce7dd64597479b92 ]---
[ 416.755310] NOHZ: local_softirq_pending 100
[ 452.232000] BUG: soft lockup - CPU#1 stuck for 23s! [perf_fuzzer:7211]
[ 452.232000] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative f71882fg acpi_cpufreq mcs7830 psmouse usbnet evdev serio_raw coretemp pcspkr video wmi processor ohci_pci thermal_sys ohci_hcd button i2c_nforce2 sg ehci_pci ehci_hcd sd_mod usbcore usb_common
[ 452.232000] CPU: 1 PID: 7211 Comm: perf_fuzzer Tainted: G W 3.12.0 #2
[ 452.232000] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 452.232000] task: ffff8800c4bd6100 ti: ffff8800c42fa000 task.ti: ffff8800c42fa000
[ 452.232000] RIP: 0010:[<ffffffff810caa4c>] [<ffffffff810caa4c>] __perf_sw_event+0x9a/0x1a5
[ 452.232000] RSP: 0018:ffff8800c42fbc98 EFLAGS: 00000286
[ 452.232000] RAX: ffff880037f5ec16 RBX: ffff88011fc92ac0 RCX: ffff8800c42fbe88
[ 452.232000] RDX: ffff8800c42fbc98 RSI: 0000000000000001 RDI: ffff880037f5ec00
[ 452.232000] RBP: 0000000000000001 R08: 0000000000000018 R09: 0140000000000000
[ 452.232000] R10: ffff88011fc92ac0 R11: ffff88011b348010 R12: bf4bf357febfceb5
[ 452.232000] R13: 0000000000000001 R14: f9eed3fcf7377feb R15: fdd3ff677744feef
[ 452.232000] FS: 00007f01b4263700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[ 452.232000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 452.232000] CR2: 00007f01b42639d0 CR3: 00000000c40f1000 CR4: 00000000000407e0
[ 452.232000] DR0: 0000000001404010 DR1: 0000000080002706 DR2: 0000000100000000
[ 452.232000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 452.232000] Stack:
[ 452.232000] 79d2f9f7ffffefff bfb7cedef5dbdfee ffdfefcf57ffbf76 dcfe7f7fdddbef2f
[ 452.232000] 00007f01b42639d0 fed7dbefff6ef7dd f7ffffeef7bbf7f8 dbdf9f0fdf7ffff3
[ 452.232000] 0000000000000000 0000000000000000 afaecfbfbbe5eefb 0000000000000000
[ 452.232000] Call Trace:
[ 452.232000] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
[ 452.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 452.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 452.232000] [<ffffffff8151e232>] ? page_fault+0x22/0x30
[ 452.232000] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
[ 452.232000] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
[ 452.232000] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0
[ 452.232000] Code: 48 c7 84 24 80 00 00 00 00 00 00 00 48 c7 44 24 48 00 00 00 00 65 48 03 04 25 78 cd 00 00 48 8b 08 48 85 c9 75 68 e9 c9 00 00 00 <41> 83 bc 24 b0 00 00 00 01 75 48 4d 39 b4 24 b8 00 00 00 75 3e
[ 480.231999] BUG: soft lockup - CPU#1 stuck for 23s! [perf_fuzzer:7211]
[ 480.232000] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative f71882fg acpi_cpufreq mcs7830 psmouse usbnet evdev serio_raw coretemp pcspkr video wmi processor ohci_pci thermal_sys ohci_hcd button i2c_nforce2 sg ehci_pci ehci_hcd sd_mod usbcore usb_common
[ 480.232000] CPU: 1 PID: 7211 Comm: perf_fuzzer Tainted: G W 3.12.0 #2
[ 480.232000] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 480.232000] task: ffff8800c4bd6100 ti: ffff8800c42fa000 task.ti: ffff8800c42fa000
[ 480.232000] RIP: 0010:[<ffffffff810cab0c>] [<ffffffff810cab0c>] __perf_sw_event+0x15a/0x1a5
[ 480.232000] RSP: 0018:ffff8800c42fbc98 EFLAGS: 00000286
[ 480.232000] RAX: ffff880037f5ec16 RBX: ffff88011fc92ac0 RCX: ffff8800c42fbe88
[ 480.232000] RDX: ffff8800c42fbc98 RSI: 0000000000000001 RDI: ffff880037f5ec00
[ 480.232000] RBP: 0000000000000001 R08: 0000000000000018 R09: 0140000000000000
[ 480.232000] R10: ffff88011fc92ac0 R11: ffff88011b348010 R12: bf4bf357febfceb5
[ 480.232000] R13: 0000000000000001 R14: f9eed3fcf7377feb R15: fdd3ff677744feef
[ 480.232000] FS: 00007f01b4263700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[ 480.232000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 480.232000] CR2: 00007f01b42639d0 CR3: 00000000c40f1000 CR4: 00000000000407e0
[ 480.232000] DR0: 0000000001404010 DR1: 0000000080002706 DR2: 0000000100000000
[ 480.232000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 480.232000] Stack:
[ 480.232000] 79d2f9f7ffffefff bfb7cedef5dbdfee ffdfefcf57ffbf76 dcfe7f7fdddbef2f
[ 480.232000] 00007f01b42639d0 fed7dbefff6ef7dd f7ffffeef7bbf7f8 dbdf9f0fdf7ffff3
[ 480.232000] 0000000000000000 0000000000000000 afaecfbfbbe5eefb 0000000000000000
[ 480.232000] Call Trace:
[ 480.232000] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
[ 480.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 480.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 480.232000] [<ffffffff8151e232>] ? page_fault+0x22/0x30
[ 480.232000] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
[ 480.232000] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
[ 480.232000] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0
[ 480.232000] Code: 48 c1 e6 39 48 c1 e0 3f 4c 29 c2 48 01 fa 48 29 f2 48 8d 04 02 48 c1 e8 38 48 8d 04 c1 48 8b 00 48 8d 50 c0 48 85 c0 4c 0f 45 e2 <4d> 85 e4 0f 85 37 ff ff ff 48 c7 c0 90 ed 00 00 65 48 03 04 25
[ 486.527998] INFO: rcu_sched self-detected stall on CPU
[ 486.531999] INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, t=15002 jiffies, g=6562, c=6561, q=96)
[ 486.532000] sending NMI to all CPUs:
[ 486.528000] NMI backtrace for cpu 1
[ 486.528000] CPU: 1 PID: 7211 Comm: perf_fuzzer Tainted: G W 3.12.0 #2
[ 486.528000] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 486.528000] task: ffff8800c4bd6100 ti: ffff8800c42fa000 task.ti: ffff8800c42fa000
[ 486.528000] RIP: 0010:[<ffffffff8129c063>] [<ffffffff8129c063>] delay_tsc+0x23/0x50
[ 486.528000] RSP: 0018:ffff88011fc83ba8 EFLAGS: 00000046
[ 486.528000] RAX: 0000000018e375ef RBX: ffffffff81c86d40 RCX: 0000000018e3731d
[ 486.528000] RDX: 000000000000012b RSI: 0000000000000001 RDI: 00000000000009d2
[ 486.528000] RBP: ffff88011fc83ba8 R08: 0000000000000001 R09: 0000000300000003
[ 486.528000] R10: 0000000000000005 R11: ffff88011fc83f00 R12: 00000000000026f2
[ 486.528000] R13: 0000000000000020 R14: ffffffff81c04982 R15: ffffffff8132c081
[ 486.528000] FS: 00007f01b4263700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[ 486.528000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 486.528000] CR2: 00007f01b42639d0 CR3: 00000000c40f1000 CR4: 00000000000407e0
[ 486.528000] DR0: 0000000001404010 DR1: 0000000080002706 DR2: 0000000100000000
[ 486.528000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 486.528000] Stack:
[ 486.528000] ffff88011fc83bd8 ffffffff8132c039 0000000300000003 ffffffff81c86d40
[ 486.528000] 0000000000000046 0000000000000039 ffff88011fc83bf8 ffffffff8132c09d
[ 486.528000] ffffffff81c04970 ffffffff81c86d40 ffff88011fc83c38 ffffffff813284f9
[ 486.528000] Call Trace:
[ 486.528000] <IRQ> [<ffffffff8132c039>] ? wait_for_xmitr+0x4b/0x93
[ 486.528000] [<ffffffff8132c09d>] ? serial8250_console_putchar+0x1c/0x2e
[ 486.528000] [<ffffffff813284f9>] ? uart_console_write+0x45/0x5b
[ 486.528000] [<ffffffff8132deab>] ? serial8250_console_write+0xb4/0x102
[ 486.528000] [<ffffffff810779f8>] ? T.924+0xb9/0xcb
[ 486.528000] [<ffffffff81077b36>] ? console_unlock+0x12c/0x32c
[ 486.528000] [<ffffffff810782f5>] ? vprintk_emit+0x3ca/0x3f3
[ 486.528000] [<ffffffff8151a94d>] ? printk+0x4d/0x4f
[ 486.528000] [<ffffffff81067bd7>] ? sched_slice+0x80/0x82
[ 486.528000] [<ffffffff810a3beb>] ? rcu_check_callbacks+0x14e/0x5bf
[ 486.528000] [<ffffffff810471e8>] ? update_process_times+0x41/0x71
[ 486.528000] [<ffffffff8108729a>] ? tick_sched_handle+0x4a/0x58
[ 486.528000] [<ffffffff81087465>] ? tick_sched_timer+0x5d/0x7c
[ 486.528000] [<ffffffff8105be6e>] ? __run_hrtimer+0xba/0x145
[ 486.528000] [<ffffffff81087408>] ? tick_nohz_handler+0xb0/0xb0
[ 486.528000] [<ffffffff8105c143>] ? hrtimer_interrupt+0xd5/0x1c6
[ 486.528000] [<ffffffff815269e4>] ? smp_trace_apic_timer_interrupt+0x53/0x91
[ 486.528000] [<ffffffff8152597a>] ? trace_apic_timer_interrupt+0x6a/0x70
[ 486.528000] <EOI> [<ffffffff810cab38>] ? __perf_sw_event+0x186/0x1a5
[ 486.528000] [<ffffffff810cab40>] ? __perf_sw_event+0x18e/0x1a5
[ 486.528000] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
[ 486.528000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 486.528000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 486.528000] [<ffffffff8151e232>] ? page_fault+0x22/0x30
[ 486.528000] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
[ 486.528000] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
[ 486.528000] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0
[ 486.528000] Code: 8d 3c bf e9 ae ff ff ff 55 48 89 e5 66 66 66 66 90 65 8b 34 25 d0 b0 00 00 66 66 90 0f ae e8 0f 31 89 c1 66 66 90 0f ae e8 0f 31 <89> c2 29 ca 39 fa 73 23 f3 90 65 44 8b 04 25 d0 b0 00 00 44 39
[ 486.532000] NMI backtrace for cpu 0
[ 486.532000] CPU: 0 PID: 7215 Comm: perf_fuzzer Tainted: G W 3.12.0 #2
[ 486.532000] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 486.532000] task: ffff8800c425a180 ti: ffff8800c4316000 task.ti: ffff8800c4316000
[ 486.532000] RIP: 0010:[<ffffffff8129c003>] [<ffffffff8129c003>] __const_udelay+0x15/0x29
[ 486.532000] RSP: 0000:ffff88011fc03da8 EFLAGS: 00000092
[ 486.532000] RAX: 0000000001062560 RBX: 0000000000000001 RCX: 0000000000000003
[ 486.532000] RDX: 00000000009aa0a0 RSI: 0000000000000200 RDI: 0000000000418958
[ 486.532000] RBP: ffff88011fc03dc8 R08: 0000000000000000 R09: 0000000000000008
[ 486.532000] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88011fc0d9d0
[ 486.532000] R13: ffffffff81a39780 R14: 0000000000000001 R15: ffff8800c4316010
[ 486.532000] FS: 00007f01b4263700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[ 486.532000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 486.532000] CR2: 00007f9cc6ece188 CR3: 00000000c4780000 CR4: 00000000000407f0
[ 486.532000] DR0: 0000000100000000 DR1: 0000000080002706 DR2: 0000000100000000
[ 486.532000] DR3: 0000000080000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 486.532000] Stack:
[ 486.532000] ffffffff81a39780 ffffffff81028fa0 00000000000019a1 ffffffff81a39780
[ 486.532000] ffff88011fc03e38 ffffffff810a3ee2 00000000008efa7e ffffffff81a39780
[ 486.532000] 0000000000000060 ffffffff81a9bd70 0000000000000000 0000000000000083
[ 486.532000] Call Trace:
[ 486.532000] <IRQ> [<ffffffff81028fa0>] ? arch_trigger_all_cpu_backtrace+0x78/0x8c
[ 486.532000] [<ffffffff810a3ee2>] ? rcu_check_callbacks+0x445/0x5bf
[ 486.532000] [<ffffffff810471e8>] ? update_process_times+0x41/0x71
[ 486.532000] [<ffffffff8108729a>] ? tick_sched_handle+0x4a/0x58
[ 486.532000] [<ffffffff81087465>] ? tick_sched_timer+0x5d/0x7c
[ 486.532000] [<ffffffff8105be6e>] ? __run_hrtimer+0xba/0x145
[ 486.532000] [<ffffffff81087408>] ? tick_nohz_handler+0xb0/0xb0
[ 486.532000] [<ffffffff8105c143>] ? hrtimer_interrupt+0xd5/0x1c6
[ 486.532000] [<ffffffff815269e4>] ? smp_trace_apic_timer_interrupt+0x53/0x91
[ 486.532000] [<ffffffff8152597a>] ? trace_apic_timer_interrupt+0x6a/0x70
[ 486.532000] <EOI> Code: 17 c9 c3 55 48 89 e5 66 66 66 66 90 48 8b 05 ad 42 7c 00 c9 ff e0 48 83 ec 08 48 8d 04 bd 00 00 00 00 65 48 8b 14 25 e0 22 01 00 <48> 6b d2 3e f7 e2 48 8b 05 88 42 7c 00 48 8d 7a 01 5a ff e0 55
[ 486.528000] { 1} (t=15424 jiffies g=6562 c=6561 q=96)
[ 486.528000] sending NMI to all CPUs:
[ 486.528000] NMI backtrace for cpu 1
[ 486.528000] CPU: 1 PID: 7211 Comm: perf_fuzzer Tainted: G W 3.12.0 #2
[ 486.528000] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 486.528000] task: ffff8800c4bd6100 ti: ffff8800c42fa000 task.ti: ffff8800c42fa000
[ 486.528000] RIP: 0010:[<ffffffff8129c063>] [<ffffffff8129c063>] delay_tsc+0x23/0x50
[ 486.528000] RSP: 0018:ffff88011fc83da8 EFLAGS: 00000893
[ 486.528000] RAX: 000000001b2ef428 RBX: 0000000000000001 RCX: 000000001b2ef402
[ 486.528000] RDX: 000000000000012c RSI: 0000000000000001 RDI: 0000000000265910
[ 486.528000] RBP: ffff88011fc83da8 R08: 0000000000000000 R09: 0000000000000008
[ 486.528000] R10: ffff88011fc83be8 R11: ffff88011f01f400 R12: ffff88011fc8d9d0
[ 486.528000] R13: ffffffff81a39780 R14: ffffffff81a3b088 R15: ffff8800c42fa010
[ 486.528000] FS: 00007f01b4263700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[ 486.528000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 486.528000] CR2: 00007f01b42639d0 CR3: 00000000c40f1000 CR4: 00000000000407e0
[ 486.528000] DR0: 0000000001404010 DR1: 0000000080002706 DR2: 0000000100000000
[ 486.528000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 486.528000] Stack:
[ 486.528000] ffff88011fc83dc8 ffffffff81028fa0 0000000000000060 ffffffff81a39780
[ 486.528000] ffff88011fc83e38 ffffffff810a3c8e ffffffff81a14b60 0000000000000000
[ 486.528000] 0000000000000060 ffffffff81a9bd70 0000000000000001 0000000000000000
[ 486.528000] Call Trace:
[ 486.528000] <IRQ>
[ 486.528000] [<ffffffff81028fa0>] ? arch_trigger_all_cpu_backtrace+0x78/0x8c
[ 486.528000] [<ffffffff810a3c8e>] ? rcu_check_callbacks+0x1f1/0x5bf
[ 486.528000] [<ffffffff810471e8>] ? update_process_times+0x41/0x71
[ 486.528000] [<ffffffff8108729a>] ? tick_sched_handle+0x4a/0x58
[ 486.528000] [<ffffffff81087465>] ? tick_sched_timer+0x5d/0x7c
[ 486.528000] [<ffffffff8105be6e>] ? __run_hrtimer+0xba/0x145
[ 486.528000] [<ffffffff81087408>] ? tick_nohz_handler+0xb0/0xb0
[ 486.528000] [<ffffffff8105c143>] ? hrtimer_interrupt+0xd5/0x1c6
[ 486.528000] [<ffffffff815269e4>] ? smp_trace_apic_timer_interrupt+0x53/0x91
[ 486.528000] [<ffffffff8152597a>] ? trace_apic_timer_interrupt+0x6a/0x70
[ 486.528000] <EOI>
[ 486.528000] [<ffffffff810cab38>] ? __perf_sw_event+0x186/0x1a5
[ 486.528000] [<ffffffff810cab40>] ? __perf_sw_event+0x18e/0x1a5
[ 486.528000] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
[ 486.528000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 486.528000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 486.528000] [<ffffffff8151e232>] ? page_fault+0x22/0x30
[ 486.528000] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
[ 486.528000] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
[ 486.528000] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0
[ 486.528000] Code: 8d 3c bf e9 ae ff ff ff 55 48 89 e5 66 66 66 66 90 65 8b 34 25 d0 b0 00 00 66 66 90 0f ae e8 0f 31 89 c1 66 66 90 0f ae e8 0f 31 <89> c2 29 ca 39 fa 73 23 f3 90 65 44 8b 04 25 d0 b0 00 00 44 39
[ 488.244000] NMI backtrace for cpu 0
[ 486.528000] INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 828.947 msecs
[ 488.244000] CPU: 0 PID: 7215 Comm: perf_fuzzer Tainted: G W 3.12.0 #2
[ 488.244000] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 488.244000] task: ffff8800c425a180 ti: ffff8800c4316000 task.ti: ffff8800c4316000
[ 488.244000] RIP: 0033:[<0000000000406c67>] [<0000000000406c67>] 0x406c66
[ 488.244000] RSP: 002b:00007fff569a7dd8 EFLAGS: 00000206
[ 488.244000] RAX: 0000000000000000 RBX: 346dc5d63886594b RCX: 000000000000a7ca
[ 488.244000] RDX: 0000000000000000 RSI: 00007f01b404b640 RDI: 00007f01b404c240
[ 488.244000] RBP: 00000000ffffffff R08: 0000000000000000 R09: 00007f01b4263700
[ 488.244000] R10: 00007f01b42639d0 R11: 0000000000000246 R12: ffffffffffffffff
[ 488.244000] R13: 0000000000000033 R14: 0000000000000053 R15: 0000000000694a68
[ 488.244000] FS: 00007f01b4263700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[ 488.244000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 488.244000] CR2: 00007f9cc6ece188 CR3: 00000000c4780000 CR4: 00000000000407f0
[ 488.244000] DR0: 0000000100000000 DR1: 0000000080002706 DR2: 0000000100000000
[ 488.244000] DR3: 0000000080000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 488.244000]
[ 488.244000] INFO: NMI handler (arch_trigger_all_cpu_backtrace_handler) took too long to run: 1192.664 msecs
[ 512.231999] BUG: soft lockup - CPU#1 stuck for 22s! [perf_fuzzer:7211]
[ 512.232000] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative f71882fg acpi_cpufreq mcs7830 psmouse usbnet evdev serio_raw coretemp pcspkr video wmi processor ohci_pci thermal_sys ohci_hcd button i2c_nforce2 sg ehci_pci ehci_hcd sd_mod usbcore usb_common
[ 512.232000] CPU: 1 PID: 7211 Comm: perf_fuzzer Tainted: G W 3.12.0 #2
[ 512.232000] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 512.232000] task: ffff8800c4bd6100 ti: ffff8800c42fa000 task.ti: ffff8800c42fa000
[ 512.232000] RIP: 0010:[<ffffffff810caa8a>] [<ffffffff810caa8a>] __perf_sw_event+0xd8/0x1a5
[ 512.232000] RSP: 0018:ffff8800c42fbc98 EFLAGS: 00000246
[ 512.232000] RAX: ffff880037f5ec16 RBX: ffff88011fc92ac0 RCX: ffff8800c42fbe88
[ 512.232000] RDX: ffff8800c42fbc98 RSI: 0000000000000001 RDI: ffff880037f5ec00
[ 512.232000] RBP: 0000000000000001 R08: 0000000000000018 R09: 0140000000000000
[ 512.232000] R10: ffff88011fc92ac0 R11: ffff88011b348010 R12: bf4bf357febfceb5
[ 512.232000] R13: 0000000000000001 R14: f9eed3fcf7377feb R15: fdd3ff677744feef
[ 512.232000] FS: 00007f01b4263700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[ 512.232000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 512.232000] CR2: 00007f01b42639d0 CR3: 00000000c40f1000 CR4: 00000000000407e0
[ 512.232000] DR0: 0000000001404010 DR1: 0000000080002706 DR2: 0000000100000000
[ 512.232000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 512.232000] Stack:
[ 512.232000] 79d2f9f7ffffefff bfb7cedef5dbdfee ffdfefcf57ffbf76 dcfe7f7fdddbef2f
[ 512.232000] 00007f01b42639d0 fed7dbefff6ef7dd f7ffffeef7bbf7f8 dbdf9f0fdf7ffff3
[ 512.232000] 0000000000000000 0000000000000000 afaecfbfbbe5eefb 0000000000000000
[ 512.232000] Call Trace:
[ 512.232000] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
[ 512.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 512.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 512.232000] [<ffffffff8151e232>] ? page_fault+0x22/0x30
[ 512.232000] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
[ 512.232000] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
[ 512.232000] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0
[ 512.232000] Code: 75 3e 41 f6 84 24 80 01 00 00 01 75 33 48 85 db 0f 84 ba 00 00 00 41 8a 84 24 d8 00 00 00 a8 10 74 09 f6 83 88 00 00 00 03 75 15 <a8> 20 0f 84 9d 00 00 00 f6 83 88 00 00 00 03 0f 85 90 00 00 00
[ 540.232001] BUG: soft lockup - CPU#1 stuck for 23s! [perf_fuzzer:7211]
[ 540.232001] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative f71882fg acpi_cpufreq mcs7830 psmouse usbnet evdev serio_raw coretemp pcspkr video wmi processor ohci_pci thermal_sys ohci_hcd button i2c_nforce2 sg ehci_pci ehci_hcd sd_mod usbcore usb_common
[ 540.232001] CPU: 1 PID: 7211 Comm: perf_fuzzer Tainted: G W 3.12.0 #2
[ 540.232001] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 540.232001] task: ffff8800c4bd6100 ti: ffff8800c42fa000 task.ti: ffff8800c42fa000
[ 540.232001] RIP: 0010:[<ffffffff810caaad>] [<ffffffff810caaad>] __perf_sw_event+0xfb/0x1a5
[ 540.232001] RSP: 0018:ffff8800c42fbc98 EFLAGS: 00000286
[ 540.232001] RAX: ffff880037f5ec16 RBX: ffff88011fc92ac0 RCX: ffff8800c42fbe88
[ 540.232001] RDX: ffff8800c42fbc98 RSI: 0000000000000001 RDI: ffff880037f5ec00
[ 540.232001] RBP: 0000000000000001 R08: 0000000000000018 R09: 0140000000000000
[ 540.232001] R10: ffff88011fc92ac0 R11: ffff88011b348010 R12: bf4bf357febfceb5
[ 540.232001] R13: 0000000000000001 R14: f9eed3fcf7377feb R15: fdd3ff677744feef
[ 540.232001] FS: 00007f01b4263700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[ 540.232001] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 540.232001] CR2: 00007f01b42639d0 CR3: 00000000c40f1000 CR4: 00000000000407e0
[ 540.232001] DR0: 0000000001404010 DR1: 0000000080002706 DR2: 0000000100000000
[ 540.232001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 540.232001] Stack:
[ 540.232001] 79d2f9f7ffffefff bfb7cedef5dbdfee ffdfefcf57ffbf76 dcfe7f7fdddbef2f
[ 540.232001] 00007f01b42639d0 fed7dbefff6ef7dd f7ffffeef7bbf7f8 dbdf9f0fdf7ffff3
[ 540.232001] 0000000000000000 0000000000000000 afaecfbfbbe5eefb 0000000000000000
[ 540.232001] Call Trace:
[ 540.232001] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
[ 540.232001] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 540.232001] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 540.232001] [<ffffffff8151e232>] ? page_fault+0x22/0x30
[ 540.232001] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
[ 540.232001] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
[ 540.232001] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0
[ 540.232001] Code: 83 88 00 00 00 03 75 15 a8 20 0f 84 9d 00 00 00 f6 83 88 00 00 00 03 0f 85 90 00 00 00 4d 8b 64 24 40 4d 85 e4 74 6c 49 83 ec 40 <eb> 5d 45 89 f6 48 b8 00 00 00 00 01 00 00 00 48 ba 01 00 fc ff
[ 568.232001] BUG: soft lockup - CPU#1 stuck for 23s! [perf_fuzzer:7211]
[ 568.232001] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative f71882fg acpi_cpufreq mcs7830 psmouse usbnet evdev serio_raw coretemp pcspkr video wmi processor ohci_pci thermal_sys ohci_hcd button i2c_nforce2 sg ehci_pci ehci_hcd sd_mod usbcore usb_common
[ 568.232001] CPU: 1 PID: 7211 Comm: perf_fuzzer Tainted: G W 3.12.0 #2
[ 568.232001] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 568.232001] task: ffff8800c4bd6100 ti: ffff8800c42fa000 task.ti: ffff8800c42fa000
[ 568.232001] RIP: 0010:[<ffffffff810ca6a4>] [<ffffffff810ca6a4>] perf_swevent_overflow+0x58/0x58
[ 568.232001] RSP: 0018:ffff8800c42fbc90 EFLAGS: 00000246
[ 568.232001] RAX: ffff880037f5ec16 RBX: ffff88011fc92ac0 RCX: ffff8800c42fbe88
[ 568.232001] RDX: ffff8800c42fbc98 RSI: 0000000000000001 RDI: ffff880037f5ec00
[ 568.232001] RBP: 0000000000000001 R08: 0000000000000018 R09: 0140000000000000
[ 568.232001] R10: ffff88011fc92ac0 R11: ffff88011b348010 R12: bf4bf357febfceb5
[ 568.232001] R13: 0000000000000001 R14: f9eed3fcf7377feb R15: fdd3ff677744feef
[ 568.232001] FS: 00007f01b4263700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
[ 568.232001] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 568.232001] CR2: 00007f01b42639d0 CR3: 00000000c40f1000 CR4: 00000000000407e0
[ 568.232001] DR0: 0000000001404010 DR1: 0000000080002706 DR2: 0000000100000000
[ 568.232001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 568.232001] Stack:
[ 568.232001] ffffffff810cab40 79d2f9f7ffffefff bfb7cedef5dbdfee ffdfefcf57ffbf76
[ 568.232001] dcfe7f7fdddbef2f 00007f01b42639d0 fed7dbefff6ef7dd f7ffffeef7bbf7f8
[ 568.232001] dbdf9f0fdf7ffff3 0000000000000000 0000000000000000 afaecfbfbbe5eefb
[ 568.232001] Call Trace:
[ 568.232001] [<ffffffff810cab40>] ? __perf_sw_event+0x18e/0x1a5
[ 568.232001] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
[ 568.232001] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 568.232001] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
[ 568.232001] [<ffffffff8151e232>] ? page_fault+0x22/0x30
[ 568.232001] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
[ 568.232001] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
[ 568.232001] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0
[ 568.232001] Code: 75 1c eb 1f 4c 89 e9 4c 89 e2 48 89 ef e8 e9 fc ff ff 85 c0 75 0d 48 ff cb be 01 00 00 00 48 85 db 75 e1 5d 5b 5d 41 5c 41 5d c3 <48> 83 ec 08 48 01 77 70 48 85 c9 74 64 48 83 bf c0 00 00 00 00
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Peter Zijlstra

unread,
Nov 15, 2013, 10:10:02 AM11/15/13
to
On Wed, Nov 13, 2013 at 05:45:59PM -0500, Vince Weaver wrote:
> Hello
>
> so with the perf_fuzzer modified to avoid the tracepoint issues, I've
> triggered this software-event related soft lockup.
>
> From what I can gather from the backtraces, they all map to the loop
> in do_perf_sw_event() in kernel/events/core.c
>
> hlist_for_each_entry_rcu(event, head, hlist_entry) {
> if (perf_swevent_match(event, type, event_id, data, regs))
> perf_swevent_event(event, nr, data, regs);
> }
>
> is it possible to get stuck in that as an infinite loop?
>
> below is the dmesg from the lockup, I eventually had to reboot to clear
> the problem:

> [ 416.755310] NOHZ: local_softirq_pending 100
> [ 452.232000] BUG: soft lockup - CPU#1 stuck for 23s! [perf_fuzzer:7211]
> [ 452.232000] RIP: 0010:[<ffffffff810caa4c>] [<ffffffff810caa4c>] __perf_sw_event+0x9a/0x1a5
> [ 452.232000] Call Trace:
> [ 452.232000] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
> [ 452.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 452.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 452.232000] [<ffffffff8151e232>] ? page_fault+0x22/0x30
> [ 452.232000] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
> [ 452.232000] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
> [ 452.232000] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0

> [ 480.232000] RIP: 0010:[<ffffffff810cab0c>] [<ffffffff810cab0c>] __perf_sw_event+0x15a/0x1a5
> [ 480.232000] Call Trace:
> [ 480.232000] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
> [ 480.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 480.232000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 480.232000] [<ffffffff8151e232>] ? page_fault+0x22/0x30
> [ 480.232000] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
> [ 480.232000] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
> [ 480.232000] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0

> [ 486.528000] RIP: 0010:[<ffffffff8129c063>] [<ffffffff8129c063>] delay_tsc+0x23/0x50
> [ 486.528000] Call Trace:
> [ 486.528000] <EOI> [<ffffffff810cab38>] ? __perf_sw_event+0x186/0x1a5
> [ 486.528000] [<ffffffff810cab40>] ? __perf_sw_event+0x18e/0x1a5
> [ 486.528000] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
> [ 486.528000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 486.528000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 486.528000] [<ffffffff8151e232>] ? page_fault+0x22/0x30
> [ 486.528000] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
> [ 486.528000] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
> [ 486.528000] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0

> [ 486.528000] Call Trace:
> [ 486.528000] [<ffffffff810cab38>] ? __perf_sw_event+0x186/0x1a5
> [ 486.528000] [<ffffffff810cab40>] ? __perf_sw_event+0x18e/0x1a5
> [ 486.528000] [<ffffffff81520fa3>] ? __do_page_fault+0x191/0x3f5
> [ 486.528000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 486.528000] [<ffffffff81066edb>] ? sched_clock_local+0x13/0x76
> [ 486.528000] [<ffffffff8151e232>] ? page_fault+0x22/0x30
> [ 486.528000] [<ffffffff8129cfe0>] ? __put_user_4+0x20/0x30
> [ 486.528000] [<ffffffff81066629>] ? schedule_tail+0x5c/0x60
> [ 486.528000] [<ffffffff81524b7f>] ? ret_from_fork+0xf/0xb0

Please enable CONFIG_FRAME_POINTER to get better backtraces, but the
above suggests the pagefault swevent, will have a look.

Vince Weaver

unread,
Nov 15, 2013, 12:20:01 PM11/15/13
to
On Fri, 15 Nov 2013, Peter Zijlstra wrote:
>
> Please enable CONFIG_FRAME_POINTER to get better backtraces, but the
> above suggests the pagefault swevent, will have a look.

here it is again I think. I compiled with CONFIG_FRAME_POINTER but the
backtraces don't look that different.

Sorry about the truncation of the right edge of the dump, this is over a
serial console and I can't figure out how to get minicom configured
properly to not do that.


[ 132.648914] hrtimer: interrupt took 1372 ns
[ 236.300427] perf samples too long (2502 > 2500), lowering kernel.perf_event_0
[ 236.301947] INFO: NMI handler (perf_event_nmi_handler) took too long to run:s
[ 555.475509] perf samples too long (5002 > 5000), lowering kernel.perf_event_0
[ 555.589569] perf samples too long (10005 > 10000), lowering kernel.perf_even0
[ 555.593183] INFO: NMI handler (perf_event_nmi_handler) took too long to run:s
[ 571.903706] ------------[ cut here ]------------
[ 571.907549] WARNING: CPU: 1 PID: 3926 at arch/x86/kernel/cpu/perf_event.c:10)
[ 571.907549] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powern
[ 571.907549] CPU: 1 PID: 3926 Comm: perf_fuzzer Not tainted 3.12.0 #4
[ 571.907549] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BI2
[ 571.907549] 0000000000000434 ffff8800ca4afab8 ffffffff8151d8ec 0000000000004
[ 571.907549] 0000000000000000 ffff8800ca4afaf8 ffffffff8103cda9 0000000200003
[ 571.907549] ffffffff81012a69 ffff8800c813e800 ffff88011fc8b940 0000000000001
[ 571.907549] Call Trace:
[ 571.907549] [<ffffffff8151d8ec>] dump_stack+0x49/0x5d
[ 571.907549] [<ffffffff8103cda9>] warn_slowpath_common+0x81/0x9b
[ 571.907549] [<ffffffff81012a69>] ? x86_pmu_start+0x4f/0xf8
[ 571.907549] [<ffffffff8103cddd>] warn_slowpath_null+0x1a/0x1c
[ 571.907549] [<ffffffff81012a69>] x86_pmu_start+0x4f/0xf8
[ 571.907549] [<ffffffff81012cea>] x86_pmu_enable+0x1d8/0x27d
[ 571.907549] [<ffffffff810c5ea5>] perf_pmu_enable+0x21/0x23
[ 571.907549] [<ffffffff81012721>] x86_pmu_commit_txn+0x84/0x9a
[ 571.907549] [<ffffffff810db64c>] ? release_pages+0x18a/0x199
[ 571.907549] [<ffffffff810c88bc>] ? event_sched_in+0x7a/0x109
[ 571.907549] [<ffffffff810c89e3>] group_sched_in+0x98/0x147
[ 571.907549] [<ffffffff810c91e9>] __perf_event_enable+0x117/0x172
[ 571.907549] [<ffffffff810c956f>] ? perf_event_task_output+0x13d/0x13d
[ 571.907549] [<ffffffff810c95ab>] remote_function+0x3c/0x43
[ 571.907549] [<ffffffff8108c655>] smp_call_function_single+0x8a/0x101
[ 571.907549] [<ffffffff810c95f6>] task_function_call+0x44/0x4e
[ 571.907549] [<ffffffff810c90d2>] ? __perf_event_task_sched_in+0x10e/0x10e
[ 571.907549] [<ffffffff810c9b59>] perf_event_enable+0x112/0x150
[ 571.907549] [<ffffffff810c9a47>] ? perf_install_in_context+0xbc/0xbc
[ 571.907549] [<ffffffff810c7027>] perf_event_for_each_child+0x58/0x8b
[ 571.907549] [<ffffffff810c7122>] perf_event_task_enable+0x4b/0x7d
[ 571.907549] [<ffffffff8104ed5a>] SyS_prctl+0x151/0x341
[ 571.907549] [<ffffffff81527b56>] system_call_fastpath+0x1a/0x1f
[ 571.907549] ---[ end trace ee0dc4a441b301bf ]---
[ 573.054155] BUG: unable to handle kernel NULL pointer dereference at )
[ 573.057564] IP: [<ffffffff81520a44>] _raw_spin_lock_irqsave+0x11/0x2d
[ 573.057564] PGD c814b067 PUD cb183067 PMD 0
[ 573.057564] Oops: 0002 [#1] SMP
[ 573.057564] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powern
[ 573.057564] CPU: 1 PID: 3926 Comm: perf_fuzzer Tainted: G W 3.12.04
[ 573.057564] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BI2
[ 573.057564] task: ffff880037cfd0c0 ti: ffff8800ca4ae000 task.ti: ffff8800ca40
[ 573.057564] RIP: 0010:[<ffffffff81520a44>] [<ffffffff81520a44>] _raw_spin_ld
[ 573.057564] RSP: 0018:ffff8800ca4afa08 EFLAGS: 00010046
[ 573.057564] RAX: 0000000000000046 RBX: ffff8800c813e800 RCX: 0000000000000000
[ 573.057564] RDX: 0000000000010000 RSI: ffff8800c813e800 RDI: 0000000000000000
[ 573.057564] RBP: ffff8800ca4afa08 R08: ffff8800c813e800 R09: ffff8800ca4afdf8
[ 573.057564] R10: ffff88011a2261c0 R11: 0000000000000246 R12: ffff88011fc8b940
[ 573.057564] R13: 0000000000000000 R14: ffff8800c813e948 R15: 0000000000000000
[ 573.057564] FS: 00007fc238e92700(0000) GS:ffff88011fc80000(0000) knlGS:00000
[ 573.057564] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 573.057564] CR2: 0000000000000000 CR3: 00000000c88c4000 CR4: 00000000000407e0
[ 573.057564] DR0: 0000000000033c20 DR1: 0000000000000000 DR2: 0000000000000000
[ 573.057564] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 573.057564] Stack:
[ 573.057564] ffff8800ca4afa58 ffffffff810190b0 0000000000000000 0000000000000
[ 573.057564] ffff88011fff8680 0000000000000000 ffff8800c813e800 ffff88011fc80
[ 573.057564] 0000000000000040 0000000000000000 ffff8800ca4afa88 ffffffff81013
[ 573.057564] Call Trace:
[ 573.057564] [<ffffffff810190b0>] __intel_shared_reg_get_constraints+0x5e/0xc
[ 573.057564] [<ffffffff81019273>] intel_get_event_constraints+0x85/0x154
[ 573.057564] [<ffffffff81013bcc>] x86_schedule_events+0x54/0x185
[ 573.057564] [<ffffffff8101284f>] ? x86_pmu_add+0xf8/0x110
[ 573.057564] [<ffffffff810126ee>] x86_pmu_commit_txn+0x51/0x9a
[ 573.057564] [<ffffffff810d6fef>] ? __alloc_pages_nodemask+0x173/0x718
[ 573.057564] [<ffffffff810c88bc>] ? event_sched_in+0x7a/0x109
[ 573.057564] [<ffffffff810c89e3>] group_sched_in+0x98/0x147
[ 573.057564] [<ffffffff81067436>] ? sched_clock_cpu+0xc3/0xd1
[ 573.057564] [<ffffffff810c8bf0>] ctx_sched_in+0x15e/0x185
[ 573.057564] [<ffffffff810c8c80>] perf_event_sched_in+0x69/0x72
[ 573.057564] [<ffffffff810c8d51>] __perf_install_in_context+0xc8/0xf1
[ 573.057564] [<ffffffff810c956f>] ? perf_event_task_output+0x13d/0x13d
[ 573.057564] [<ffffffff810c95ab>] remote_function+0x3c/0x43
[ 573.057564] [<ffffffff8108c655>] smp_call_function_single+0x8a/0x101
[ 573.057564] [<ffffffff810c95f6>] task_function_call+0x44/0x4e
[ 573.057564] [<ffffffff810c8c89>] ? perf_event_sched_in+0x72/0x72
[ 573.057564] [<ffffffff810c9a0c>] perf_install_in_context+0x81/0xbc
[ 573.057564] [<ffffffff8151dddc>] ? mutex_lock+0x16/0x37
[ 573.057564] [<ffffffff810cd56b>] SYSC_perf_event_open+0x634/0x7e9
[ 573.057564] [<ffffffff810cd729>] SyS_perf_event_open+0x9/0xb
[ 573.057564] [<ffffffff81527b56>] system_call_fastpath+0x1a/0x1f
[ 573.057564] Code: c1 07 89 c2 c1 ea 10 66 39 c2 74 0c 66 8b 07 66 39 d0 74 0
[ 573.057564] RIP [<ffffffff81520a44>] _raw_spin_lock_irqsave+0x11/0x2d
[ 573.057564] RSP <ffff8800ca4afa08>
[ 573.057564] CR2: 0000000000000000
[ 573.057564] ---[ end trace ee0dc4a441b301c0 ]---
[ 577.000000] ------------[ cut here ]------------
[ 577.000000] WARNING: CPU: 0 PID: 15546 at kernel/watchdog.c:245 watchdog_ove)
[ 577.000000] Watchdog detected hard LOCKUP on cpu 0
[ 577.000000] Modules linked in: cpufreq_userspace cpufreq_stats cpufreq_powern
[ 577.000000] CPU: 0 PID: 15546 Comm: perf_fuzzer Tainted: G D W 3.12.4
[ 577.000000] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BI2
[ 577.000000] 00000000000000f5 ffff88011fc07bc8 ffffffff8151d8ec 0000000000005
[ 577.000000] ffff88011fc07c18 ffff88011fc07c08 ffffffff8103cda9 ffff88011fc08
[ 577.000000] ffffffff810a137f ffff88011b313400 0000000000000000 ffff88011fc08
[ 577.000000] Call Trace:
[ 577.000000] <NMI> [<ffffffff8151d8ec>] dump_stack+0x49/0x5d
[ 577.000000] [<ffffffff8103cda9>] warn_slowpath_common+0x81/0x9b
[ 577.000000] [<ffffffff810a137f>] ? watchdog_overflow_callback+0x9b/0xa6
[ 577.000000] [<ffffffff8103ce66>] warn_slowpath_fmt+0x46/0x48
[ 577.000000] [<ffffffff810a137f>] watchdog_overflow_callback+0x9b/0xa6
[ 577.000000] [<ffffffff810cbb94>] __perf_event_overflow+0x137/0x1c1
[ 577.000000] [<ffffffff81018991>] ? intel_pmu_enable_all+0x4c/0xa4
[ 577.000000] [<ffffffff810cc23a>] perf_event_overflow+0x14/0x16
[ 577.000000] [<ffffffff81018fbd>] intel_pmu_handle_irq+0x2b8/0x34d
[ 577.000000] [<ffffffff8151f82a>] ? __schedule+0x514/0x54b
[ 577.000000] [<ffffffff81522253>] perf_event_nmi_handler+0x2d/0x4a
[ 577.000000] [<ffffffff81521b86>] nmi_handle+0x5e/0x13a
[ 577.000000] [<ffffffff81521d0a>] do_nmi+0xa8/0x2c0
[ 577.000000] [<ffffffff81521337>] end_repeat_nmi+0x1e/0x2e
[ 577.000000] [<ffffffff81520abf>] ? _raw_spin_lock+0x26/0x2a
[ 577.000000] [<ffffffff81520abf>] ? _raw_spin_lock+0x26/0x2a
[ 577.000000] [<ffffffff81520abf>] ? _raw_spin_lock+0x26/0x2a
[ 577.000000] <<EOE>> <IRQ> [<ffffffff8106cf00>] sched_rt_period_timer+0x8be
[ 577.000000] [<ffffffff8105c156>] __run_hrtimer+0xba/0x145
[ 577.000000] [<ffffffff8106ce75>] ? init_rt_bandwidth+0x46/0x46
[ 577.000000] [<ffffffff8105c42b>] hrtimer_interrupt+0xd5/0x1c3
[ 577.000000] [<ffffffff81028469>] local_apic_timer_interrupt+0x58/0x5d
[ 577.000000] [<ffffffff8152983e>] smp_apic_timer_interrupt+0x2a/0x3c
[ 577.000000] [<ffffffff815287ca>] apic_timer_interrupt+0x6a/0x70
[ 577.000000] <EOI>
[ 577.000000] ---[ end trace ee0dc4a441b301c1 ]---
[ 577.000000] perf samples too long (5541124 > 20000), lowering kernel.perf_ev0
[ 577.000000] INFO: NMI handler (perf_event_nmi_handler) took too long to run:s
[ 577.000000] perf samples too long (5497840 > 40000), lowering kernel.perf_ev0
[ 577.000000] perf samples too long (5454894 > 76923), lowering kernel.perf_ev0
[ 577.000000] perf samples too long (5412282 > 142857), lowering kernel.perf_e0
[ 577.000000] perf samples too long (5370004 > 250000), lowering kernel.perf_e0
[ 577.000000] perf samples too long (5328056 > 500000), lowering kernel.perf_e0

Peter Zijlstra

unread,
Nov 18, 2013, 4:00:02 AM11/18/13
to
On Fri, Nov 15, 2013 at 12:26:16PM -0500, Vince Weaver wrote:
> On Fri, 15 Nov 2013, Peter Zijlstra wrote:
> >
> > Please enable CONFIG_FRAME_POINTER to get better backtraces, but the
> > above suggests the pagefault swevent, will have a look.
>
> here it is again I think. I compiled with CONFIG_FRAME_POINTER but the
> backtraces don't look that different.

Looks to be that other one -- the one where the intel constraints thing
goes belly up. I was hoping Stephane would look at that since I know
he's been rewriting that code.

Stephane Eranian

unread,
Nov 18, 2013, 10:20:02 AM11/18/13
to
Peter,

I missed the beginning of this thread somehow.
What's the initial problem?

Peter Zijlstra

unread,
Nov 18, 2013, 10:40:02 AM11/18/13
to
On Mon, Nov 18, 2013 at 04:18:34PM +0100, Stephane Eranian wrote:
> Peter,
>
> I missed the beginning of this thread somehow.
> What's the initial problem?

lkml.kernel.org/r/alpine.DEB.2.10.1...@vincent-weaver-1.um.maine.edu
lkml.kernel.org/r/alpine.DEB.2.10.1...@vincent-weaver-1.um.maine.edu

Vince managed to make the intel constraints code go *bang*.

Stephane Eranian

unread,
Nov 18, 2013, 10:50:01 AM11/18/13
to
Vince,

What's the config for this test?
Need to host CPU type and the event config.
Otherwise there is not much I can do with the crash dump.
Thanks.

Vince Weaver

unread,
Nov 18, 2013, 11:40:02 AM11/18/13
to
On Mon, 18 Nov 2013, Stephane Eranian wrote:

> What's the config for this test?
> Need to host CPU type and the event config.
> Otherwise there is not much I can do with the crash dump.

It's 3.12, on a core2 machine.

Sadly I don't have the event config. To reproduce I start my perf_fuzzer
running and typically it will hit the error/lockup within an hour or so
of fuzzing assuming it doesn't hit one of the other outstanding bugs
first.

I can probably narrow it down but that involves a lot of luck as well as a
lot of time and I might not get a chance to do so until next month at the
earliest.

Vince

Stephane Eranian

unread,
Nov 18, 2013, 12:00:01 PM11/18/13
to
On Mon, Nov 18, 2013 at 5:44 PM, Vince Weaver <vincent...@maine.edu> wrote:
> On Mon, 18 Nov 2013, Stephane Eranian wrote:
>
>> What's the config for this test?
>> Need to host CPU type and the event config.
>> Otherwise there is not much I can do with the crash dump.
>
> It's 3.12, on a core2 machine.
>
On core2, there is no offcore-response event, therefore you should not
get to the shared_reg_*() code OR it should return immediately. Looks
like here, it calls into it and finds some pointers NULL:
cpuc->shared_regs, I bet.

Now, I am trying to understand how we got there:

xreg = &event->hw.extra_reg;
if (xreg->idx != EXTRA_REG_NONE) {
c = __intel_shared_reg_get_constraints(cpuc, event, xreg);

}

breg = &event->hw.branch_reg;
if (breg->idx != EXTRA_REG_NONE) {
d = __intel_shared_reg_get_constraints(cpuc, event, breg);
}

So one of those tests is true.

That means breg->idx is pointing to a LBR_SELECT MSR but that does
not exist on Core2, or xreg->idx is pointing to a shared resource
which does not exist either.
For the former, it would mean it thinks there is hw LBR filtering, but
that's wrong. For the latter, it means it thinks that Core 2 has a
x86_pmu.extra_regs. I would check those two cases. Don't know
how this could happen.

I could debug this but I don't have a Core 2 machine that I can
easily crash. But maybe I can fake this on a IVB laptop. Just
would need the perf_event_attr struct.
0 new messages