Syzkaller found a bug: general protection fault in x86_pmu_enable_event

Sanan Hasanov

Nov 10, 2022, 1:47:40 PM

to pet...@infradead.org, mi...@redhat.com, ac...@kernel.org, mark.r...@arm.com, alexander...@linux.intel.com, jo...@kernel.org, namh...@kernel.org, tg...@linutronix.de, b...@alien8.de, dave....@linux.intel.com, x...@kernel.org, h...@zytor.com, linux-pe...@vger.kernel.org, linux-...@vger.kernel.org, pa...@pgazz.com, syzkaller

Good day, dear maintainers,

We found a bug using a modified version of the kernel configuration file used by syzbot.
We extended the coverage of the configuration file using our tool, krepair.
The config and reproducer files are attached.
Branch: https://github.com/torvalds/linux (HEAD detached at 33c9805860e58)
Reproducer can be executed as follows:
./syz-execprog -repeat=0 -procs=8 program
More info: https://github.com/google/syzkaller/blob/master/docs/executing_syzkaller_programs.md
Thank you!

general protection fault, probably for non-canonical address
0xdffffc0000000033: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000198-0x000000000000019f]
CPU: 0 PID: 7030 Comm: systemd-udevd Not tainted
6.1.0-rc1-00010-gbb1a1146467a #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:x86_pmu_enable_event+0x61/0x2b0
Code: 00 e8 93 95 3a 00 48 c7 c7 40 59 60 85 e8 47 17 a4 02 48 8d bb
98 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6
04 02 65 48 8b 2d 0b 46 a1 7d 84 c0 74 08 3c 03 0f 8e f3 01
RSP: 0000:ffff88811ae09ce8 EFLAGS: 00010016
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8260e90d
RDX: 0000000000000033 RSI: ffffffff85605940 RDI: 0000000000000198
RBP: ffff88811ae21e20 R08: 0000000000000001 R09: ffffed10235c43c5
R10: ffff88811ae21e27 R11: ffffed10235c43c4 R12: 0000000000000007
R13: ffff88811ae21c20 R14: fffffbfff0cf1d0f R15: 0000000000000004
FS: 00007f2607dbb8c0(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005606fbe35108 CR3: 000000010c384000 CR4: 0000000000350ef0
DR0: 0000000003be2348 DR1: 0000000003be2348 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000600
Call Trace:
<IRQ>
amd_pmu_enable_all+0x101/0x160
x86_pmu_enable+0x367/0xcb0
perf_pmu_enable+0xb6/0xf0
perf_mux_hrtimer_handler+0x4c0/0x880
__hrtimer_run_queues+0x2cf/0x6c0
hrtimer_interrupt+0x2f3/0x700
__sysvec_apic_timer_interrupt+0x114/0x370
sysvec_apic_timer_interrupt+0x89/0xc0
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20
RIP: 0010:__sanitizer_cov_trace_pc+0xd/0x70
Code: 00 00 00 e9 f5 b6 a4 02 48 89 f7 e9 cd fc ff ff 66 66 2e 0f 1f
84 00 00 00 00 00 66 90 65 8b 05 19 ee 66 7d 89 c6 48 8b 0c 24 <81> e6
00 01 00 00 65 48 8b 14 25 00 6d 02 00 a9 00 01 ff 00 74 0e
RSP: 0000:ffff888111adfd18 EFLAGS: 00000246
RAX: 0000000080000000 RBX: ffff88800a90ba00 RCX: ffffffff84f039aa
RDX: ffff88811137e480 RSI: 0000000080000000 RDI: ffff88800a90baf0
RBP: ffff88800a90ba08 R08: 0000000000000001 R09: ffffed10011c59a0
R10: ffff888008e2ccff R11: ffffed10011c599f R12: ffff88800a90ba1c
R13: 0000000000000003 R14: 00005606fbe35108 R15: dffffc0000000000
mt_find+0x23a/0x9a0
find_vma+0x77/0xa0
do_user_addr_fault+0x268/0xe80
exc_page_fault+0x78/0x120
asm_exc_page_fault+0x22/0x30
RIP: 0033:0x7f2606bcae74
Code: fa ff ff 48 3b 5a 28 0f 85 28 04 00 00 48 8b 4b 28 48 3b 59 20
0f 85 1a 04 00 00 48 83 78 20 00 0f 84 3a 04 00 00 48 8b 43 28 <48> 89
42 28 48 8b 43 28 48 89 50 20 e9 5a fa ff ff 0f 1f 00 49 8b
RSP: 002b:00007ffdefeb6a10 EFLAGS: 00010206
RAX: 00005606fbe350e0 RBX: 00005606fbe350e0 RCX: 00005606fbe350e0
RDX: 00005606fbe350e0 RSI: 0000000000000000 RDI: 00007f2606eecb00
RBP: 00007f2606eecb00 R08: 00000000ffffffff R09: 000000000000001e
R10: 00007f2607dbb8c0 R11: 00005606fa0729a0 R12: 0000000000000000
R13: 0000000000005140 R14: 00005606fbe3a220 R15: 0000000000001010
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:x86_pmu_enable_event+0x61/0x2b0
Code: 00 e8 93 95 3a 00 48 c7 c7 40 59 60 85 e8 47 17 a4 02 48 8d bb
98 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6
04 02 65 48 8b 2d 0b 46 a1 7d 84 c0 74 08 3c 03 0f 8e f3 01
RSP: 0000:ffff88811ae09ce8 EFLAGS: 00010016
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8260e90d
RDX: 0000000000000033 RSI: ffffffff85605940 RDI: 0000000000000198
RBP: ffff88811ae21e20 R08: 0000000000000001 R09: ffffed10235c43c5
R10: ffff88811ae21e27 R11: ffffed10235c43c4 R12: 0000000000000007
R13: ffff88811ae21c20 R14: fffffbfff0cf1d0f R15: 0000000000000004
FS: 00007f2607dbb8c0(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005606fbe35108 CR3: 000000010c384000 CR4: 0000000000350ef0
DR0: 0000000003be2348 DR1: 0000000003be2348 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000600
----------------
Code disassembly (best guess):
0: 00 e8 add %ch,%al
2: 93 xchg %eax,%ebx
3: 95 xchg %eax,%ebp
4: 3a 00 cmp (%rax),%al
6: 48 c7 c7 40 59 60 85 mov $0xffffffff85605940,%rdi
d: e8 47 17 a4 02 callq 0x2a41759
12: 48 8d bb 98 01 00 00 lea 0x198(%rbx),%rdi
19: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
20: fc ff df
23: 48 89 fa mov %rdi,%rdx
26: 48 c1 ea 03 shr $0x3,%rdx
* 2a: 0f b6 04 02 movzbl (%rdx,%rax,1),%eax <-- trapping instruction
2e: 65 48 8b 2d 0b 46 a1 mov %gs:0x7da1460b(%rip),%rbp # 0x7da14641
35: 7d
36: 84 c0 test %al,%al
38: 74 08 je 0x42
3a: 3c 03 cmp $0x3,%al
3c: 0f .byte 0xf
3d: 8e f3 mov %ebx,%?
3f: 01 .byte 0x1
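
The faulting address in the report is not the address the kernel meant to read; it is the KASAN shadow byte for that address. The arithmetic can be checked with a few lines of Python. This is only a sketch: it assumes generic KASAN on x86-64 with the shadow offset 0xdffffc0000000000 (the constant loaded by the movabs in the disassembly above) and one shadow byte per 8 bytes of memory, and the function names are made up for illustration.

```python
# Sketch of the generic-KASAN shadow mapping, under the assumptions stated above:
#   shadow_addr = (addr >> 3) + KASAN_SHADOW_OFFSET
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000  # constant loaded into RAX by the movabs
KASAN_SHADOW_SCALE_SHIFT = 3              # one shadow byte covers 8 bytes
U64_MASK = 0xFFFFFFFFFFFFFFFF


def shadow_address(addr):
    """Return the shadow byte address that KASAN checks for `addr`."""
    return ((addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET) & U64_MASK


def covered_range(shadow_addr):
    """Return the (first, last) real address covered by one shadow byte."""
    first = ((shadow_addr - KASAN_SHADOW_OFFSET) & U64_MASK) << KASAN_SHADOW_SCALE_SHIFT
    return first, first + 7


rbx = 0x0                  # RBX in the oops: the base pointer is NULL
rdi = rbx + 0x198          # lea 0x198(%rbx),%rdi
fault = shadow_address(rdi)

print(hex(fault))                              # 0xdffffc0000000033
print([hex(a) for a in covered_range(fault)])  # ['0x198', '0x19f']
```

With RBX = 0, the lea produces RDI = 0x198 and the shadow lookup lands on 0xdffffc0000000033. That matches RDI, RDX (0x33) and the "null-ptr-deref in range [0x198-0x19f]" line in the oops, so the trap comes from reading a field at offset 0x198 of a NULL pointer.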

core.c.config

repro.prog

Borislav Petkov

Nov 21, 2022, 4:45:29 AM

to Sanan Hasanov, pet...@infradead.org, mi...@redhat.com, ac...@kernel.org, mark.r...@arm.com, alexander...@linux.intel.com, jo...@kernel.org, namh...@kernel.org, tg...@linutronix.de, dave....@linux.intel.com, x...@kernel.org, h...@zytor.com, linux-pe...@vger.kernel.org, linux-...@vger.kernel.org, pa...@pgazz.com, syzkaller

Looks like

baa014b9543c ("perf/x86/amd: Fix crash due to race between amd_pmu_enable_all, perf NMI and throttling")

which just went to Linus.

You could try 6.1-rc6 when it releases tomorrow.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

George K

Jun 7, 2024, 3:29:36 PM

to syzkaller

This crash still shows up, on AMD machines only, with the latest upstream kernel (6.10.0-rc1).

[ 76.709676] perf: interrupt took too long (15854 > 15786), lowering kernel.perf_event_max_sample_rate to 12000
[ 76.716135] perf: interrupt took too long (19880 > 19817), lowering kernel.perf_event_max_sample_rate to 10000
[ 93.882311] hrtimer: interrupt took 1098800 ns
[ 174.292184] perf_duration_warn: 1 callbacks suppressed
[ 174.292248] perf: interrupt took too long (31354 > 31193), lowering kernel.perf_event_max_sample_rate to 6000
[ 181.134950] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000034: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 181.137003] KASAN: null-ptr-deref in range [0x00000000000001a0-0x00000000000001a7]
[ 181.137918] CPU: 0 PID: 1660 Comm: repro_x86_pmu_e Not tainted 6.10.0-rc1+ #39
[ 181.138799] Hardware name: Red Hat KVM, BIOS 1.16.0-4.module+el8.9.0+90052+d3bf71d8 04/01/2014
[ 181.139814] RIP: 0010:x86_pmu_enable_event (arch/x86/events/perf_event.h:1174 arch/x86/events/core.c:1427)
[ 181.142657] RSP: 0018:ffff888118a09578 EFLAGS: 00010012
[ 181.143295] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 181.144160] RDX: 0000000000000034 RSI: 0000000000000000 RDI: 00000000000001a0
[ 181.145016] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
[ 181.145865] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888118a2a230
[ 181.146723] R13: ffff888118a2a420 R14: 0000000000000001 R15: fffffbfff2afd837
[ 181.147571] FS: 00007fe9b2747740(0000) GS:ffff888118a00000(0000) knlGS:0000000000000000
[ 181.148544] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 181.149238] CR2: 00007ffc667ebdb8 CR3: 0000000025834000 CR4: 00000000000006f0
[ 181.150075] Call Trace:
[ 181.150387] <IRQ>
[ 181.154359] amd_pmu_enable_all (arch/x86/events/amd/core.c:1341)
[ 181.154899] x86_pmu_enable (arch/x86/events/core.c:1276 arch/x86/events/core.c:1335)
[ 181.156479] __pmu_ctx_sched_out (kernel/events/core.c:8314 (discriminator 1))
[ 181.157046] ctx_sched_out (kernel/events/core.c:8328)
[ 181.157530] __perf_install_in_context (kernel/events/core.c:6235)
[ 181.159488] remote_function (./arch/x86/include/asm/atomic64_64.h:20)
[ 181.160663] __flush_smp_call_function_queue (kernel/smp.c:189 (discriminator 20) kernel/smp.c:197 (discriminator 20) kernel/smp.c:540 (discriminator 20))
[ 181.161929] __sysvec_call_function_single (arch/x86/kernel/smp.c:193 (discriminator 1))
[ 181.162574] sysvec_call_function_single (lib/maple_tree.c:3155 (discriminator 2))
[ 181.163161] asm_sysvec_call_function_single (./arch/x86/include/asm/idtentry.h:709)
[ 181.163814] RIP: 0010:__sanitizer_cov_trace_pc (./arch/x86/include/asm/current.h:49 (discriminator 1) kernel/kcov.c:623 (discriminator 1))
[ 181.166674] RSP: 0018:ffff888118a098f0 EFLAGS: 00000287
[ 181.167310] RAX: 0000000080000102 RBX: ffffffff8e08cbbb RCX: 0000000000000000
[ 181.168168] RDX: ffff888101bc4d40 RSI: ffffffff814b2e15 RDI: ffffffff976aa2d0
[ 181.169031] RBP: 000000000029b692 R08: 0000000000000000 R09: 0000000000000000
[ 181.169905] R10: 0000000000000000 R11: 0000000000000000 R12: 000000000029b691
[ 181.170761] R13: 000000000029b691 R14: 00000000000d4001 R15: ffffffff972dfde2
[ 181.172592] orc_find.part.0 (arch/x86/kernel/unwind_orc.c:220 (discriminator 1))
[ 181.173548] unwind_next_frame (arch/x86/kernel/unwind_orc.c:202 arch/x86/kernel/unwind_orc.c:494)
[ 181.175676] arch_stack_walk (./include/asm-generic/bitops/instrumented-atomic.h:28 ./include/linux/cpumask.h:522 arch/x86/kernel/cpu/cacheinfo.c:1171)
[ 181.177152] stack_trace_save (kernel/stacktrace.c:89 kernel/stacktrace.c:101)
[ 181.178181] kasan_save_stack (mm/kasan/report.c:570)
[ 181.186586] kasan_save_track (mm/kasan/report.c:214 (discriminator 2) mm/kasan/report.c:590 (discriminator 2))
[ 181.187084] kasan_save_free_info (mm/kasan/quarantine.c:345)
[ 181.187632] poison_slab_object (mm/kasan/report.c:255 (discriminator 2) mm/kasan/report.c:482 (discriminator 2))
[ 181.188149] __kasan_slab_free (./arch/x86/include/asm/pgalloc.h:69)
[ 181.188643] kmem_cache_free (mm/slub.c:3675)
[ 181.189128] rcu_do_batch (./arch/x86/include/asm/preempt.h:79 (discriminator 18) ./include/linux/bottom_half.h:13 (discriminator 18) ./include/linux/bottom_half.h:20 (discriminator 18) kernel/rcu/tree.c:2558 (discriminator 18))
[ 181.191250] rcu_core (./arch/x86/include/asm/preempt.h:94 (discriminator 1) ./include/trace/events/rcu.h:27 (discriminator 1) ./include/trace/events/rcu.h:27 (discriminator 1) kernel/rcu/tree.c:2781 (discriminator 1))
[ 181.191670] handle_softirqs (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:555)
[ 181.193300] __irq_exit_rcu (kernel/softirq.c:589 kernel/softirq.c:428 kernel/softirq.c:637)
[ 181.193795] sysvec_apic_timer_interrupt (lib/maple_tree.c:427 (discriminator 1) lib/maple_tree.c:532 (discriminator 1) lib/maple_tree.c:1720 (discriminator 1))
[ 181.194373] </IRQ>

I sent this patch upstream, but there has not been much activity on it.

https://lore.kernel.org/lkml/1716990659-2427-1-git-s...@oracle.com/

Does anyone have any insight into the root cause? Again, this looks like an AMD-only issue.

The Syzkaller reproducer can be found in this link: https://lore.kernel.org/netdev/CAMt6jhyec7-TSFpr3F+_ikjp...@mail.gmail.com/T/#u

To get repro.c, run:

syz-prog2c -prog repro.prog -threaded -repeat 0 -procs 8 -slowdown 1 -sandbox none -sandbox_arg 0 -enable net_reset -enable net_dev -enable cgroups -tmpdir -enable binfmt_misc -enable close_fds -enable sysctl -segv >repro.c

Also, a similar check already exists in __intel_pmu_enable_all() in arch/x86/events/intel/core.c, except that it also does a WARN_ON_ONCE. See: https://elixir.bootlin.com/linux/v6.10-rc1/source/arch/x86/events/intel/core.c#L2256
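
The kind of guard described here can be illustrated with a toy model in Python. This is not the kernel code and not the patch itself: the class and function names are hypothetical, and the failure state it sets up (a counter marked active while its event slot is still empty) is an assumption inferred from the NULL base pointer in the oops and from the title of commit baa014b9543c.

```python
class ToyEvent:
    """Stand-in for a perf event; it only carries a config value."""
    def __init__(self, config):
        self.config = config


class ToyCpuHwEvents:
    """Stand-in for per-CPU PMU state: an active bitmask plus event slots."""
    def __init__(self, num_counters):
        self.active_mask = 0
        self.events = [None] * num_counters


def enable_all_unguarded(cpuc, enabled):
    """Enable every counter marked active, trusting that its slot is filled."""
    for idx, event in enumerate(cpuc.events):
        if not cpuc.active_mask & (1 << idx):
            continue
        enabled.append(event.config)  # raises AttributeError when event is None


def enable_all_guarded(cpuc, enabled):
    """Same loop, but skip empty slots and report which indexes were skipped."""
    skipped = []
    for idx, event in enumerate(cpuc.events):
        if not cpuc.active_mask & (1 << idx):
            continue
        if event is None:
            skipped.append(idx)  # roughly where a WARN_ON_ONCE would fire
            continue
        enabled.append(event.config)
    return skipped


def make_racy_state():
    """Assumed bad state: bit 1 is set in the mask, but events[1] is empty."""
    cpuc = ToyCpuHwEvents(4)
    cpuc.events[0] = ToyEvent(0x76)
    cpuc.active_mask = 0b0011
    return cpuc


if __name__ == "__main__":
    try:
        enable_all_unguarded(make_racy_state(), [])
    except AttributeError as exc:
        print("unguarded loop failed:", exc)

    enabled = []
    print("guarded loop skipped:", enable_all_guarded(make_racy_state(), enabled))
    print("guarded loop enabled:", [hex(c) for c in enabled])
```

In this model the unguarded loop fails on the empty slot, which is the Python analogue of the NULL dereference in the trace, while the guarded loop enables the valid event and records the skipped index. A guard like this only hides the symptom; it does not explain how the mask and the slot got out of sync, which is the root-cause question above.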

Thank you

George K

Jun 7, 2024, 4:03:43 PM

to syzkaller

This commit was supposed to fix the same issue:

baa014b9543c 2022-11-14 perf/x86/amd: Fix crash due to race between amd_pmu_enable_all, perf NMI and throttling
