load_segments() changes the segment registers, which invalidates the
GS base that KCOV relies on for per-CPU data. When CONFIG_KCOV is
enabled, any subsequent call into instrumented C code (e.g.
native_gdt_invalidate()) crashes the kernel in an endless loop.
To reproduce the problem, it's sufficient to do kexec on a
KCOV-instrumented kernel:
  $ kexec -l /boot/otherKernel
  $ kexec -e
The real-world context for this problem is enabling crash dump
collection in syzkaller. For this, the tool loads a panic kernel
before fuzzing and then calls makedumpfile after the panic. This
workflow requires both CONFIG_KEXEC and CONFIG_KCOV to be enabled
simultaneously.
Adding safeguards directly to the KCOV fast-path
(__sanitizer_cov_trace_pc()) is undesirable, as it would introduce
extra performance overhead.
Disabling instrumentation for individual functions would be too
fragile, so fix the bug by disabling KCOV instrumentation for
machine_kexec_64.c and physaddr.c in their entirety. If
coverage-guided fuzzing ever needs these components in the future, we
should consider other approaches.
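For reference, a sketch of the Makefile-level granularities that kbuild
offers for KCOV, as I understand them from the KCOV documentation. The
per-file form is the one this patch uses. The per-directory line is
shown only for comparison and is not part of this patch.

```makefile
# Per-file: exclude a single object from KCOV instrumentation
# (the form used by this patch).
KCOV_INSTRUMENT_machine_kexec_64.o := n

# Per-directory: exclude every object built by this Makefile.
# Shown for comparison only; this patch does not do this.
KCOV_INSTRUMENT := n
```

The per-file form keeps coverage for the rest of arch/x86/kernel and
arch/x86/mm, while not depending on annotating each function that may
run after load_segments().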
The problem is not relevant for 32-bit kernels, as CONFIG_KCOV is not
supported there.
Reviewed-by: Dmitry Vyukov <dvy...@google.com>
---
v2:
Updated the comments to explain the underlying context.
v1:
https://lore.kernel.org/all/20260216173716....@google.com/
---
arch/x86/kernel/Makefile | 10 ++++++++++
arch/x86/mm/Makefile | 10 ++++++++++
2 files changed, 20 insertions(+)
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index e9aeeeafad173..41b1333907ded 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -43,6 +43,16 @@ KCOV_INSTRUMENT_dumpstack_$(BITS).o := n
KCOV_INSTRUMENT_unwind_orc.o := n
KCOV_INSTRUMENT_unwind_frame.o := n
KCOV_INSTRUMENT_unwind_guess.o := n
+# Disable KCOV to prevent crashes during kexec: load_segments() invalidates
+# the GS base, which KCOV relies on for per-CPU data.
+# As KCOV && KEXEC compatibility should be preserved (e.g. syzkaller is
+# using it to collect crash dumps during kernel fuzzing), we could either
+# selectively disable KCOV instrumentation, which can be fragile, or add
+# more checks to KCOV, which would slow it down.
+# As a compromise solution, let's disable KCOV instrumentation for the
+# whole file. If its coverage is ever needed, we should consider other
+# approaches.
+KCOV_INSTRUMENT_machine_kexec_64.o := n
CFLAGS_head32.o := -fno-stack-protector
CFLAGS_head64.o := -fno-stack-protector
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 5b9908f13dcfd..ea3a31b54e49e 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -4,6 +4,16 @@ KCOV_INSTRUMENT_tlb.o := n
KCOV_INSTRUMENT_mem_encrypt.o := n
KCOV_INSTRUMENT_mem_encrypt_amd.o := n
KCOV_INSTRUMENT_pgprot.o := n
+# Disable KCOV to prevent crashes during kexec: load_segments() invalidates
+# the GS base, which KCOV relies on for per-CPU data.
+# As KCOV && KEXEC compatibility should be preserved (e.g. syzkaller is
+# using it to collect crash dumps during kernel fuzzing), we could either
+# selectively disable KCOV instrumentation, which can be fragile, or add
+# more checks to KCOV, which would slow it down.
+# As a compromise solution, let's disable KCOV instrumentation for the
+# whole file. If its coverage is ever needed, we should consider other
+# approaches.
+KCOV_INSTRUMENT_physaddr.o := n
KASAN_SANITIZE_mem_encrypt.o := n
KASAN_SANITIZE_mem_encrypt_amd.o := n
base-commit: f338e77383789c0cae23ca3d48adcc5e9e137e3c
--
2.53.0.959.g497ff81fa9-goog