[PATCH (draft)] kcov: fix potential kcov_mode corruption under CONFIG_PREEMPT_RT

Tetsuo Handa

May 4, 2026, 11:31:03 AM
to kasan-dev, LKML, Dmitry Vyukov, Alexander Potapenko
Problem Description:

syzbot has reported intermittent kcov_mode corruption when running with
CONFIG_PREEMPT_RT=y. Specifically, struct task_struct->kcov_mode is found
to be unexpectedly modified (e.g., changed from 0 to 2) during workqueue
execution, even when the work handler itself does not use KCOV [1].

Root Cause Analysis:

The issue stems from the ambiguity of in_task() under CONFIG_PREEMPT_RT.
Since commit 5ff3b30ab57d ("kcov: collect coverage from interrupts"),
kcov_remote_start() uses in_task() to decide whether to use a per-CPU
preallocated buffer (irq_area) or a task-specific buffer (allocated via
vmalloc).

In a CONFIG_PREEMPT_RT kernel, in_task() returns true even for threaded
softirqs (e.g., ksoftirqd or threaded IRQ handlers), unless they are
explicitly within a local_bh_disable() region at the time of the check.
This leads to several critical issues:

1. Context Mismatch: A threaded interrupt handler (like usb_giveback_urb_bh)
might be identified as a "task" context. If it starts a remote KCOV
session, it incorrectly modifies the kworker's task_struct->kcov_mode
using the logic intended for system calls.

2. State Leakage: If the result of in_task() differs between
kcov_remote_start() and kcov_remote_stop() due to the subtle state
changes in RT scheduling, the cleanup code (kcov_stop()) might be skipped
or applied to the wrong context. This leaves kcov_mode set to a non-zero
value (e.g., KCOV_MODE_TRACE_CMP) when the worker returns to
process_one_work().

3. Evidence: Debug logs show the pr_err from process_one_work()
carrying a [ C1] (CPU-based) prefix instead of a [Txxx] (Task-based)
prefix. This confirms that at the moment of the error,
in_serving_softirq() was true, causing in_task() to be false, which
contradicts the state when the work started.

Proposed Fix:

Introduce a stricter context check, in_task_really(), which
explicitly excludes any softirq context (including threaded ones) by
checking in_serving_softirq() regardless of the CONFIG_PREEMPT_RT
configuration. This ensures that threaded interrupts always use the
irq_area path and do not corrupt the task_struct state of the worker
threads they happen to preempt.

I wrote the body of this patch, but the patch description above was
generated using Google AI mode. I can't tell whether the above explanation
is correct. Therefore, please review but don't accept yet. For now:

Link: https://syzkaller.appspot.com/bug?extid=a21650c1666eae7b2aae [1]
Analyzed-by: https://share.google/aimode/5aL6wtYxiS3koWnAx (expires in 7 days)
Not-yet-signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
Reported-by: syzbot+3f51ad...@syzkaller.appspotmail.com
Maybe-closes: https://syzkaller.appspot.com/bug?extid=3f51ad7ac3ae57a6fdcc
---
kernel/kcov.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/kernel/kcov.c b/kernel/kcov.c
index 0b369e88c7c9..c20d3c2712d7 100644
--- a/kernel/kcov.c
+++ b/kernel/kcov.c
@@ -171,6 +171,15 @@ static __always_inline bool in_softirq_really(void)
return in_serving_softirq() && !in_hardirq() && !in_nmi();
}

+static __always_inline bool in_task_really(void)
+{
+#ifdef CONFIG_PREEMPT_RT
+ return !in_serving_softirq() && !in_hardirq() && !in_nmi();
+#else
+ return in_task();
+#endif
+}
+
static notrace bool check_kcov_mode(enum kcov_mode needed_mode, struct task_struct *t)
{
unsigned int mode;
@@ -903,7 +912,7 @@ void kcov_remote_start(u64 handle)
return;
}
kcov_debug("handle = %llx, context: %s\n", handle,
- in_task() ? "task" : "softirq");
+ in_task_really() ? "task" : "softirq");
kcov = remote->kcov;
/* Put in kcov_remote_stop(). */
kcov_get(kcov);
@@ -915,7 +924,7 @@ void kcov_remote_start(u64 handle)
*/
mode = context_unsafe(kcov->mode);
sequence = kcov->sequence;
- if (in_task()) {
+ if (in_task_really()) {
size = kcov->remote_size;
area = kcov_remote_area_get(size);
} else {
@@ -924,7 +933,7 @@ void kcov_remote_start(u64 handle)
}
spin_unlock(&kcov_remote_lock);

- /* Can only happen when in_task(). */
+ /* Can only happen when in_task_really(). */
if (!area) {
local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
area = vmalloc(size * sizeof(unsigned long));
@@ -1069,7 +1078,7 @@ void kcov_remote_stop(void)
kcov_move_area(kcov->mode, kcov->area, kcov->size, area);
spin_unlock(&kcov->lock);

- if (in_task()) {
+ if (in_task_really()) {
spin_lock(&kcov_remote_lock);
kcov_remote_area_put(area, size);
spin_unlock(&kcov_remote_lock);
--
2.47.3
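
To make the context checks in this draft easier to follow, here is a small userspace model of the buffer-selection decision. It is only a sketch: `struct ctx_model`, `enum area_choice`, and the `model_*` functions are hypothetical names, and the three booleans merely stand in for the results of in_nmi(), in_hardirq(), and in_serving_softirq(). It does not reproduce how the kernel derives those values from preempt_count.

```c
#include <assert.h>   /* used by the usage example below */
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's context state; not a kernel type. */
struct ctx_model {
	bool nmi;             /* models in_nmi() */
	bool hardirq;         /* models in_hardirq() */
	bool serving_softirq; /* models in_serving_softirq() */
};

/* Mirrors the CONFIG_PREEMPT_RT branch of in_task_really() in the diff. */
static bool model_in_task_really(const struct ctx_model *c)
{
	return !c->serving_softirq && !c->hardirq && !c->nmi;
}

/* Mirrors in_softirq_really() as shown in the hunk context of the diff. */
static bool model_in_softirq_really(const struct ctx_model *c)
{
	return c->serving_softirq && !c->hardirq && !c->nmi;
}

/* Which buffer kcov_remote_start() would pick in this model. */
enum area_choice { AREA_NONE, AREA_REMOTE_POOL, AREA_PERCPU_IRQ };

static enum area_choice model_pick_area(const struct ctx_model *c)
{
	if (model_in_task_really(c))
		return AREA_REMOTE_POOL;   /* kcov_remote_area_get() path */
	if (model_in_softirq_really(c))
		return AREA_PERCPU_IRQ;    /* per-CPU irq_area path */
	return AREA_NONE;                  /* hardirq or NMI: bail out early */
}
```

In this model a context with only `serving_softirq` set selects the per-CPU irq_area, a context with no flag set selects the remote pool, and anything with `hardirq` or `nmi` set selects nothing.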

Tetsuo Handa

May 4, 2026, 11:39:26 AM
to kasan-dev, LKML, Dmitry Vyukov, Alexander Potapenko
On 2026/05/05 0:30, Tetsuo Handa wrote:
> Link: https://syzkaller.appspot.com/bug?extid=a21650c1666eae7b2aae [1]

Oops. Link was wrong.

Link: https://syzkaller.appspot.com/bug?extid=e6686317bd9fe911591a [1]

Tetsuo Handa

May 5, 2026, 4:15:46 AM
to kasan-dev, LKML, Dmitry Vyukov, Alexander Potapenko
Problem Description:

syzbot has reported intermittent kcov_mode corruption when running with
CONFIG_PREEMPT_RT=y. Specifically, struct task_struct->kcov_mode is found
to be unexpectedly modified (e.g., changed from 0 to 2) during workqueue
execution, even when the work handler itself does not use KCOV.

[ 224.426704][ T5698] BUG: workqueue function defense_work_handler changed kcov_mode from 0 to 2
[ 224.441977][ C1] BUG: workqueue function usb_giveback_urb_bh changed kcov_mode from 2 to 0

Root Cause Analysis:

The issue stems from the ambiguity of in_task() under CONFIG_PREEMPT_RT.
Since commit 5ff3b30ab57d ("kcov: collect coverage from interrupts"),
kcov_remote_start() uses in_task() to decide whether to use a per-CPU
preallocated buffer (irq_area) or a task-specific buffer (allocated via
vmalloc).

In a CONFIG_PREEMPT_RT kernel, in_task() returns true even for threaded
softirqs (e.g., ksoftirqd or threaded IRQ handlers), unless they are
explicitly within a local_bh_disable() region at the time of the check.
This leads to several critical issues:

1. Context Mismatch: A threaded interrupt handler (like
usb_giveback_urb_bh()) might be identified as a "task" context. If it
starts a remote KCOV session, it incorrectly modifies the kworker's
task_struct->kcov_mode using the logic intended for system calls.

2. State Leakage: If the result of in_task() differs between
kcov_remote_start() and kcov_remote_stop() due to the subtle state
changes in RT scheduling, the cleanup code (kcov_stop()) might be
skipped or applied to the wrong context. This leaves kcov_mode set to
a non-zero value (e.g., KCOV_MODE_TRACE_CMP) when the worker returns
to process_one_work().

3. Evidence: The message quoted in Problem Description showed the
pr_err() from process_one_work() carrying a [ C1] (CPU-based) prefix
instead of a [Txxx] (Task-based) prefix. This confirms that at the
moment of the error, in_serving_softirq() was true, causing in_task()
to be false, which contradicts the state when the work started.

Proposed Fix:

Introduce a stricter context check, in_task_really(), which
explicitly excludes any softirq context (including threaded ones) by
checking in_serving_softirq() regardless of the CONFIG_PREEMPT_RT
configuration. This ensures that threaded interrupts always use the
irq_area path and do not corrupt the task_struct state of the worker
threads they happen to preempt.

Also, replace the !in_task() && !in_softirq_really() test in
kcov_remote_{start,stop}() with an in_hardirq() || in_nmi() test, because
relying on an unstable in_serving_softirq() result might break the
symmetry of kcov_remote_{start,stop}() calls.

Link: https://syzkaller.appspot.com/bug?extid=e6686317bd9fe911591a
Analyzed-by: Google AI mode (no mail address)
Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
---
kernel/kcov.c | 26 +++++++++++++++++++-------
1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/kernel/kcov.c b/kernel/kcov.c
index 0b369e88c7c9..f77bcd507701 100644
--- a/kernel/kcov.c
+++ b/kernel/kcov.c
@@ -171,6 +171,16 @@ static __always_inline bool in_softirq_really(void)
return in_serving_softirq() && !in_hardirq() && !in_nmi();
}

+static __always_inline bool in_task_really(void)
+{
+ /* Caller won't call this function if in_hardirq() || in_nmi(). */
+#ifdef CONFIG_PREEMPT_RT
+ return !in_serving_softirq();
+#else
+ return in_task();
+#endif
+}
+
static notrace bool check_kcov_mode(enum kcov_mode needed_mode, struct task_struct *t)
{
unsigned int mode;
@@ -869,9 +879,10 @@ void kcov_remote_start(u64 handle)
int sequence;
unsigned long flags;

- if (WARN_ON(!kcov_check_handle(handle, true, true, true)))
+ /* Don't use in_task() in order to allow consistent checks in RT kernels. */
+ if (in_hardirq() || in_nmi())
return;
- if (!in_task() && !in_softirq_really())
+ if (WARN_ON(!kcov_check_handle(handle, true, true, true)))
return;

local_lock_irqsave(&kcov_percpu_data.lock, flags);
@@ -903,7 +914,7 @@ void kcov_remote_start(u64 handle)
return;
}
kcov_debug("handle = %llx, context: %s\n", handle,
- in_task() ? "task" : "softirq");
+ in_task_really() ? "task" : "softirq");
kcov = remote->kcov;
/* Put in kcov_remote_stop(). */
kcov_get(kcov);
@@ -915,7 +926,7 @@ void kcov_remote_start(u64 handle)
*/
mode = context_unsafe(kcov->mode);
sequence = kcov->sequence;
- if (in_task()) {
+ if (in_task_really()) {
size = kcov->remote_size;
area = kcov_remote_area_get(size);
} else {
@@ -924,7 +935,7 @@ void kcov_remote_start(u64 handle)
}
spin_unlock(&kcov_remote_lock);

- /* Can only happen when in_task(). */
+ /* Can only happen when in_task_really(). */
if (!area) {
local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
area = vmalloc(size * sizeof(unsigned long));
@@ -1024,7 +1035,8 @@ void kcov_remote_stop(void)
int sequence;
unsigned long flags;

- if (!in_task() && !in_softirq_really())
+ /* Don't use in_task() in order to allow consistent checks in RT kernels. */
+ if (in_hardirq() || in_nmi())
return;

local_lock_irqsave(&kcov_percpu_data.lock, flags);
@@ -1069,7 +1081,7 @@ void kcov_remote_stop(void)

Tetsuo Handa

May 6, 2026, 7:56:03 AM
to kasan-dev, LKML, Dmitry Vyukov, Alexander Potapenko
Problem:
In CONFIG_PREEMPT_RT=y kernels, KCOV experiences logical errors and
WARNINGs (as reported by syzbot). The root cause is a twofold mismatch
between KCOV's design and the RT preemptive model:

1. Reentrancy on the same CPU: KCOV uses per_cpu variables
(kcov_percpu_data) to save/restore states and provide a temporary
irq_area. In RT kernels, local_lock_irqsave() does not disable
preemption. Thus, a task executing KCOV code can be preempted by a
threaded Softirq on the same CPU. If the Softirq also triggers KCOV,
it overwrites the per_cpu data, corrupting the preempted task's state.

2. Context Confusion: PREEMPT_RT often executes Softirqs within the
task_struct context of the currently running task (e.g., a kworker or
threaded IRQ handler). Since KCOV relies on in_task() to decide whether
to modify current->kcov_mode, it mistakenly modifies the kcov_mode of
an unrelated kworker when a Softirq "borrows" its context.

Solution:
This patch eliminates the use of per_cpu data structures for KCOV when
CONFIG_PREEMPT_RT is enabled, moving the necessary state into task_struct.

* Task-local Storage: Added kcov_saved_* fields to task_struct under
CONFIG_PREEMPT_RT. This ensures that even if a task is preempted during
remote coverage collection, its state remains isolated and travels with
the task itself.

* Eliminate irq_area Sharing: In RT kernels, kcov_remote_start() now
always utilizes kcov_remote_area_get() (the remote area pool) instead
of the shared per-cpu irq_area. This prevents buffer collision between
preempting contexts on the same CPU.

* Consistent Context Checks: Replaced unreliable in_task() checks with
explicit in_hardirq() || in_nmi() guards in critical paths to ensure
KCOV's state machine remains consistent regardless of whether a Softirq
is threaded or "borrowing" a task context.

* Conditional Lock Removal: Since all data is now task-local in the RT
case, the requirement for local_lock is removed for
CONFIG_PREEMPT_RT=y, reducing unnecessary locking overhead.

This change ensures that KCOV is fully reentrant and safe for the
PREEMPT_RT execution model without sacrificing the performance of
non-RT kernels.

Link: https://syzkaller.appspot.com/bug?extid=e6686317bd9fe911591a
Analyzed-by: AI Mode in Google Search (no mail address)
Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
---
Only compile tested. I don't have an environment to measure how ignoring
CONFIG_KCOV_IRQ_AREA_SIZE affects reliability or performance. We might
need to enforce CONFIG_KCOV_IRQ_AREA_SIZE for the in_serving_softirq()
case if remote_arg->area_size < CONFIG_KCOV_IRQ_AREA_SIZE in the
ioctl(KCOV_REMOTE_ENABLE) request. Please be sure to test this patch
with both CONFIG_PREEMPT_RT=y and CONFIG_PREEMPT_RT=n kernels in a local
syzkaller environment before sending it upstream.

include/linux/sched.h | 10 +++++
kernel/kcov.c | 87 ++++++++++++++++++++++++++++++++++---------
2 files changed, 79 insertions(+), 18 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 368c7b4d7cb5..2c963f4271d6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1522,6 +1522,16 @@ struct task_struct {

/* Collect coverage from softirq context: */
unsigned int kcov_softirq;
+
+#ifdef CONFIG_PREEMPT_RT
+ /* Temporary storage for preempting remote coverage collection: */
+ unsigned int kcov_saved_mode;
+ unsigned int kcov_saved_size;
+ void *kcov_saved_area;
+ struct kcov *kcov_saved_kcov;
+ int kcov_saved_sequence;
+#endif
+
#endif

#ifdef CONFIG_MEMCG_V1
diff --git a/kernel/kcov.c b/kernel/kcov.c
index 0b369e88c7c9..3178a0e03c3b 100644
--- a/kernel/kcov.c
+++ b/kernel/kcov.c
@@ -88,6 +88,7 @@ static DEFINE_SPINLOCK(kcov_remote_lock);
static DEFINE_HASHTABLE(kcov_remote_map, 4);
static struct list_head kcov_remote_areas = LIST_HEAD_INIT(kcov_remote_areas);

+#ifndef CONFIG_PREEMPT_RT
struct kcov_percpu_data {
void *irq_area;
local_lock_t lock;
@@ -102,6 +103,7 @@ struct kcov_percpu_data {
static DEFINE_PER_CPU(struct kcov_percpu_data, kcov_percpu_data) = {
.lock = INIT_LOCAL_LOCK(lock),
};
+#endif

/* Must be called with kcov_remote_lock locked. */
static struct kcov_remote *kcov_remote_find(u64 handle)
@@ -823,6 +825,44 @@ static inline bool kcov_mode_enabled(unsigned int mode)
return (mode & ~KCOV_IN_CTXSW) != KCOV_MODE_DISABLED;
}

+#ifdef CONFIG_PREEMPT_RT
+static inline void kcov_local_lock_irqsave(unsigned long flags) { }
+static inline void kcov_local_unlock_irqrestore(unsigned long flags) { }
+
+static void kcov_remote_softirq_start(struct task_struct *t)
+{
+ unsigned int mode;
+
+ mode = READ_ONCE(t->kcov_mode);
+ barrier();
+ if (kcov_mode_enabled(mode)) {
+ t->kcov_saved_mode = mode;
+ t->kcov_saved_size = t->kcov_size;
+ t->kcov_saved_area = t->kcov_area;
+ t->kcov_saved_sequence = t->kcov_sequence;
+ t->kcov_saved_kcov = t->kcov;
+ kcov_stop(t);
+ }
+}
+
+static void kcov_remote_softirq_stop(struct task_struct *t)
+{
+ if (t->kcov_saved_kcov) {
+ kcov_start(t, t->kcov_saved_kcov, t->kcov_saved_size,
+ t->kcov_saved_area, t->kcov_saved_mode,
+ t->kcov_saved_sequence);
+ t->kcov_saved_mode = 0;
+ t->kcov_saved_size = 0;
+ t->kcov_saved_area = NULL;
+ t->kcov_saved_sequence = 0;
+ t->kcov_saved_kcov = NULL;
+ }
+}
+
+#else
+#define kcov_local_lock_irqsave(flags) local_lock_irqsave(&kcov_percpu_data.lock, flags)
+#define kcov_local_unlock_irqrestore(flags) local_unlock_irqrestore(&kcov_percpu_data.lock, flags)
+
static void kcov_remote_softirq_start(struct task_struct *t)
__must_hold(&kcov_percpu_data.lock)
{
@@ -858,6 +898,8 @@ static void kcov_remote_softirq_stop(struct task_struct *t)
}
}

+#endif
+
void kcov_remote_start(u64 handle)
{
struct task_struct *t = current;
@@ -869,12 +911,13 @@ void kcov_remote_start(u64 handle)
int sequence;
unsigned long flags;

- if (WARN_ON(!kcov_check_handle(handle, true, true, true)))
+ /* Don't use in_task() in order to allow consistent checks in RT kernels. */
+ if (in_hardirq() || in_nmi())
return;
- if (!in_task() && !in_softirq_really())
+ if (WARN_ON(!kcov_check_handle(handle, true, true, true)))
return;

- local_lock_irqsave(&kcov_percpu_data.lock, flags);
+ kcov_local_lock_irqsave(flags);

/*
* Check that kcov_remote_start() is not called twice in background
@@ -882,7 +925,7 @@ void kcov_remote_start(u64 handle)
*/
mode = READ_ONCE(t->kcov_mode);
if (WARN_ON(in_task() && kcov_mode_enabled(mode))) {
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ kcov_local_unlock_irqrestore(flags);
return;
}
/*
@@ -891,7 +934,7 @@ void kcov_remote_start(u64 handle)
* happened while collecting coverage from a background thread.
*/
if (WARN_ON(in_serving_softirq() && t->kcov_softirq)) {
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ kcov_local_unlock_irqrestore(flags);
return;
}

@@ -899,11 +942,11 @@ void kcov_remote_start(u64 handle)
remote = kcov_remote_find(handle);
if (!remote) {
spin_unlock(&kcov_remote_lock);
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ kcov_local_unlock_irqrestore(flags);
return;
}
kcov_debug("handle = %llx, context: %s\n", handle,
- in_task() ? "task" : "softirq");
+ IS_ENABLED(CONFIG_PREEMPT_RT) || in_task() ? "task" : "softirq");
kcov = remote->kcov;
/* Put in kcov_remote_stop(). */
kcov_get(kcov);
@@ -915,6 +958,10 @@ void kcov_remote_start(u64 handle)
*/
mode = context_unsafe(kcov->mode);
sequence = kcov->sequence;
+#ifdef CONFIG_PREEMPT_RT
+ size = kcov->remote_size;
+ area = kcov_remote_area_get(size);
+#else
if (in_task()) {
size = kcov->remote_size;
area = kcov_remote_area_get(size);
@@ -922,17 +969,18 @@ void kcov_remote_start(u64 handle)
size = CONFIG_KCOV_IRQ_AREA_SIZE;
area = this_cpu_ptr(&kcov_percpu_data)->irq_area;
}
+#endif
spin_unlock(&kcov_remote_lock);

- /* Can only happen when in_task(). */
+ /* Can only happen when CONFIG_PREEMPT_RT=y or in_task(). */
if (!area) {
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ kcov_local_unlock_irqrestore(flags);
area = vmalloc(size * sizeof(unsigned long));
if (!area) {
kcov_put(kcov);
return;
}
- local_lock_irqsave(&kcov_percpu_data.lock, flags);
+ kcov_local_lock_irqsave(flags);
}

/* Reset coverage size. */
@@ -944,7 +992,7 @@ void kcov_remote_start(u64 handle)
}
kcov_start(t, kcov, size, area, mode, sequence);

- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ kcov_local_unlock_irqrestore(flags);

}
EXPORT_SYMBOL(kcov_remote_start);
@@ -1024,15 +1072,16 @@ void kcov_remote_stop(void)
int sequence;
unsigned long flags;

- if (!in_task() && !in_softirq_really())
+ /* Don't use in_task() in order to allow consistent checks in RT kernels. */
+ if (in_hardirq() || in_nmi())
return;

- local_lock_irqsave(&kcov_percpu_data.lock, flags);
+ kcov_local_lock_irqsave(flags);

mode = READ_ONCE(t->kcov_mode);
barrier();
if (!kcov_mode_enabled(mode)) {
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ kcov_local_unlock_irqrestore(flags);
return;
}
/*
@@ -1040,12 +1089,12 @@ void kcov_remote_stop(void)
* actually found the remote handle and started collecting coverage.
*/
if (in_serving_softirq() && !t->kcov_softirq) {
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ kcov_local_unlock_irqrestore(flags);
return;
}
/* Make sure that kcov_softirq is only set when in softirq. */
if (WARN_ON(!in_serving_softirq() && t->kcov_softirq)) {
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ kcov_local_unlock_irqrestore(flags);
return;
}

@@ -1069,13 +1118,13 @@ void kcov_remote_stop(void)
kcov_move_area(kcov->mode, kcov->area, kcov->size, area);
spin_unlock(&kcov->lock);

- if (in_task()) {
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) || in_task()) {
spin_lock(&kcov_remote_lock);
kcov_remote_area_put(area, size);
spin_unlock(&kcov_remote_lock);
}

- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ kcov_local_unlock_irqrestore(flags);

/* Get in kcov_remote_start(). */
kcov_put(kcov);
@@ -1119,6 +1168,7 @@ static void __init selftest(void)

static int __init kcov_init(void)
{
+#ifndef CONFIG_PREEMPT_RT
int cpu;

for_each_possible_cpu(cpu) {
@@ -1128,6 +1178,7 @@ static int __init kcov_init(void)
return -ENOMEM;
per_cpu_ptr(&kcov_percpu_data, cpu)->irq_area = area;
}
+#endif

/*
* The kcov debugfs file won't ever get removed and thus,
--
2.47.3
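
The save/restore scheme that this version moves into task_struct can be modeled in userspace as follows. This is a sketch under stated assumptions, not kernel code: `struct task_model` and the `model_*` functions are hypothetical, the mode check ignores the KCOV_IN_CTXSW bit that kcov_mode_enabled() masks out, and no locking or barriers are modeled.

```c
#include <assert.h>   /* used by the usage example below */
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MODE_DISABLED 0u /* stands in for KCOV_MODE_DISABLED */

/* Hypothetical, trimmed-down stand-in for the kcov fields of task_struct. */
struct task_model {
	unsigned int kcov_mode;
	unsigned int kcov_size;
	void *kcov_area;
	int kcov_sequence;
	void *kcov;              /* opaque; NULL means no kcov attached */

	/* Per-task slots corresponding to the kcov_saved_* fields in the diff. */
	unsigned int kcov_saved_mode;
	unsigned int kcov_saved_size;
	void *kcov_saved_area;
	int kcov_saved_sequence;
	void *kcov_saved_kcov;
};

static bool model_mode_enabled(unsigned int mode)
{
	return mode != MODEL_MODE_DISABLED;
}

static void model_kcov_start(struct task_model *t, void *kcov,
			     unsigned int size, void *area,
			     unsigned int mode, int sequence)
{
	t->kcov = kcov;
	t->kcov_size = size;
	t->kcov_area = area;
	t->kcov_sequence = sequence;
	t->kcov_mode = mode;
}

static void model_kcov_stop(struct task_model *t)
{
	t->kcov_mode = MODEL_MODE_DISABLED;
	t->kcov = NULL;
	t->kcov_size = 0;
	t->kcov_area = NULL;
}

/* Park the task's own collection before a softirq session starts. */
static void model_softirq_start(struct task_model *t)
{
	if (model_mode_enabled(t->kcov_mode)) {
		t->kcov_saved_mode = t->kcov_mode;
		t->kcov_saved_size = t->kcov_size;
		t->kcov_saved_area = t->kcov_area;
		t->kcov_saved_sequence = t->kcov_sequence;
		t->kcov_saved_kcov = t->kcov;
		model_kcov_stop(t);
	}
}

/* Resume the parked collection after the softirq session ends. */
static void model_softirq_stop(struct task_model *t)
{
	if (t->kcov_saved_kcov) {
		model_kcov_start(t, t->kcov_saved_kcov, t->kcov_saved_size,
				 t->kcov_saved_area, t->kcov_saved_mode,
				 t->kcov_saved_sequence);
		t->kcov_saved_mode = 0;
		t->kcov_saved_size = 0;
		t->kcov_saved_area = NULL;
		t->kcov_saved_sequence = 0;
		t->kcov_saved_kcov = NULL;
	}
}
```

Because the slots live in the task model rather than in one shared per-CPU structure, a second task running the same sequence uses its own slots and cannot overwrite the first task's parked state. Note that, as in the diff, each task has only one set of slots, so the model holds a single parked session per task.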


Tetsuo Handa

May 9, 2026, 4:10:00 AM
to kasan-dev, LKML, Dmitry Vyukov, Alexander Potapenko, Alan Stern, Andrew Morton, Andrey Konovalov, Andrey Konovalov, Clark Williams, Greg Kroah-Hartman, Marco Elver, Sebastian Andrzej Siewior
In CONFIG_PREEMPT_RT=y kernels, softirqs are executed in a per-CPU
ksoftirqd thread or in the context of the task that raised the softirq.
This means in_task() can return true even while serving softirqs. This
behavior causes KCOV to incorrectly identify the context, leading to state
corruption and various kcov-related warnings reported by syzbot.

Furthermore, commit d5d2c51f1e5f ("kcov: replace local_irq_save() with a
local_lock_t") introduced local_lock to protect per-CPU kcov_percpu_data.
However, the need for this physical CPU-level locking is questionable
because:

1. kcov_remote_start/stop() already bail out early if called from
in_hardirq() or in_nmi().
2. The core tracing function check_kcov_mode() also skips hardirq/NMI
contexts, meaning no KCOV state is accessed during hardirqs even if
they interrupt a KCOV-enabled task/softirq.
3. In non-RT kernels, softirqs do not nest, so no concurrent access to
per-CPU data occurs between softirqs.
4. In RT kernels, while softirqs can be preempted, this patch moves the
KCOV state from per-CPU variables to task_struct (per-task), eliminating
the contention on shared per-CPU resources.

By moving kcov_remote_data to task_struct for RT kernels and replacing
the in_task() check with !in_serving_softirq(), we ensure consistent
context detection. Since the data is now isolated per-task and not accessed
by hardirqs, the local_lock (and the original local_irq_save) is no longer
necessary and is removed to reduce overhead.

Changes:

* Move remote coverage state from kcov_percpu_data to task_struct under
CONFIG_PREEMPT_RT.
* Replace in_task() with !in_serving_softirq() in kcov_remote_start/stop()
for accurate context tracking.
* Remove local_lock and IRQ disabling from kcov_remote_start/stop() as the
state is now task-local and hardirqs are already excluded.
* Ensure CONFIG_PREEMPT_RT=y uses kcov_remote_area_get() (the
vmalloc-backed pool) instead of the single per-CPU irq_area.

Analyzed-by: AI Mode in Google Search (no mail address)
Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts")
---
Only compile tested. I don't have an environment to measure how not
preallocating buffers of CONFIG_KCOV_IRQ_AREA_SIZE words (and not adding
them to the global pool in kcov_init()) impacts reliability or performance.
Please be sure to test this patch with both CONFIG_PREEMPT_RT=y and
CONFIG_PREEMPT_RT=n kernels in a local syzkaller environment before
sending it upstream.

This patch is expected to address various KCOV-related reports.
However, I am not adding Closes: lines because we can't tell whether this
patch alone is sufficient to close them. We will find out some time after
the patch has been sent upstream.

https://syzkaller.appspot.com/bug?extid=90984d3713722683112e
https://syzkaller.appspot.com/bug?extid=47cf95ca1f9dcca872c8
https://syzkaller.appspot.com/bug?extid=8a173e13208949931dc7
https://syzkaller.appspot.com/bug?extid=3f51ad7ac3ae57a6fdcc
https://syzkaller.appspot.com/bug?extid=e6686317bd9fe911591a

include/linux/sched.h | 10 +++++
kernel/kcov.c | 94 +++++++++++++++++++++++++------------------
lib/Kconfig.debug | 1 +
3 files changed, 65 insertions(+), 40 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 368c7b4d7cb5..2c963f4271d6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1522,6 +1522,16 @@ struct task_struct {

/* Collect coverage from softirq context: */
unsigned int kcov_softirq;
+
+#ifdef CONFIG_PREEMPT_RT
+ /* Temporary storage for preempting remote coverage collection: */
+ unsigned int kcov_saved_mode;
+ unsigned int kcov_saved_size;
+ void *kcov_saved_area;
+ struct kcov *kcov_saved_kcov;
+ int kcov_saved_sequence;
+#endif
+
#endif

#ifdef CONFIG_MEMCG_V1
diff --git a/kernel/kcov.c b/kernel/kcov.c
index 0b369e88c7c9..965c11a75b36 100644
--- a/kernel/kcov.c
+++ b/kernel/kcov.c
@@ -88,9 +88,9 @@ static DEFINE_SPINLOCK(kcov_remote_lock);
static DEFINE_HASHTABLE(kcov_remote_map, 4);
static struct list_head kcov_remote_areas = LIST_HEAD_INIT(kcov_remote_areas);

+#ifndef CONFIG_PREEMPT_RT
struct kcov_percpu_data {
void *irq_area;
- local_lock_t lock;

unsigned int saved_mode;
unsigned int saved_size;
@@ -100,8 +100,8 @@ struct kcov_percpu_data {
};

static DEFINE_PER_CPU(struct kcov_percpu_data, kcov_percpu_data) = {
- .lock = INIT_LOCAL_LOCK(lock),
};
+#endif

/* Must be called with kcov_remote_lock locked. */
static struct kcov_remote *kcov_remote_find(u64 handle)
@@ -823,8 +823,38 @@ static inline bool kcov_mode_enabled(unsigned int mode)
return (mode & ~KCOV_IN_CTXSW) != KCOV_MODE_DISABLED;
}

+#ifdef CONFIG_PREEMPT_RT
+#else
static void kcov_remote_softirq_start(struct task_struct *t)
- __must_hold(&kcov_percpu_data.lock)
{
struct kcov_percpu_data *data = this_cpu_ptr(&kcov_percpu_data);
unsigned int mode;
@@ -842,7 +872,6 @@ static void kcov_remote_softirq_start(struct task_struct *t)
}

static void kcov_remote_softirq_stop(struct task_struct *t)
- __must_hold(&kcov_percpu_data.lock)
{
struct kcov_percpu_data *data = this_cpu_ptr(&kcov_percpu_data);

@@ -857,6 +886,7 @@ static void kcov_remote_softirq_stop(struct task_struct *t)
data->saved_kcov = NULL;
}
}
+#endif

void kcov_remote_start(u64 handle)
{
@@ -867,43 +897,35 @@ void kcov_remote_start(u64 handle)
void *area;
unsigned int size;
int sequence;
- unsigned long flags;

- if (WARN_ON(!kcov_check_handle(handle, true, true, true)))
+ /* Don't use in_task() in order to allow consistent checks in RT kernels. */
+ if (in_hardirq() || in_nmi())
return;
- if (!in_task() && !in_softirq_really())
+ if (WARN_ON(!kcov_check_handle(handle, true, true, true)))
return;

- local_lock_irqsave(&kcov_percpu_data.lock, flags);
-
/*
* Check that kcov_remote_start() is not called twice in background
* threads nor called by user tasks (with enabled kcov).
*/
- mode = READ_ONCE(t->kcov_mode);
- if (WARN_ON(in_task() && kcov_mode_enabled(mode))) {
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ if (WARN_ON(!in_serving_softirq() && kcov_mode_enabled(READ_ONCE(t->kcov_mode))))
return;
- }
/*
* Check that kcov_remote_start() is not called twice in softirqs.
* Note, that kcov_remote_start() can be called from a softirq that
* happened while collecting coverage from a background thread.
*/
- if (WARN_ON(in_serving_softirq() && t->kcov_softirq)) {
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ if (WARN_ON(in_serving_softirq() && t->kcov_softirq))
return;
- }

spin_lock(&kcov_remote_lock);
remote = kcov_remote_find(handle);
if (!remote) {
spin_unlock(&kcov_remote_lock);
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
return;
}
kcov_debug("handle = %llx, context: %s\n", handle,
- in_task() ? "task" : "softirq");
+ !in_serving_softirq() ? "task" : "softirq");
kcov = remote->kcov;
/* Put in kcov_remote_stop(). */
kcov_get(kcov);
@@ -915,6 +937,10 @@ void kcov_remote_start(u64 handle)
*/
mode = context_unsafe(kcov->mode);
sequence = kcov->sequence;
+#ifdef CONFIG_PREEMPT_RT
+ size = kcov->remote_size;
+ area = kcov_remote_area_get(size);
+#else
if (in_task()) {
size = kcov->remote_size;
area = kcov_remote_area_get(size);
@@ -922,17 +948,16 @@ void kcov_remote_start(u64 handle)
size = CONFIG_KCOV_IRQ_AREA_SIZE;
area = this_cpu_ptr(&kcov_percpu_data)->irq_area;
}
+#endif
spin_unlock(&kcov_remote_lock);

- /* Can only happen when in_task(). */
+ /* Can only happen when CONFIG_PREEMPT_RT=y or in_task(). */
if (!area) {
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
area = vmalloc(size * sizeof(unsigned long));
if (!area) {
kcov_put(kcov);
return;
}
- local_lock_irqsave(&kcov_percpu_data.lock, flags);
}

/* Reset coverage size. */
@@ -943,9 +968,6 @@ void kcov_remote_start(u64 handle)
t->kcov_softirq = 1;
}
kcov_start(t, kcov, size, area, mode, sequence);
-
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
-
}
EXPORT_SYMBOL(kcov_remote_start);

@@ -1022,32 +1044,24 @@ void kcov_remote_stop(void)
void *area;
unsigned int size;
int sequence;
- unsigned long flags;

- if (!in_task() && !in_softirq_really())
+ /* Don't use in_task() in order to allow consistent checks in RT kernels. */
+ if (in_hardirq() || in_nmi())
return;

- local_lock_irqsave(&kcov_percpu_data.lock, flags);
-
mode = READ_ONCE(t->kcov_mode);
barrier();
- if (!kcov_mode_enabled(mode)) {
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ if (!kcov_mode_enabled(mode))
return;
- }
/*
* When in softirq, check if the corresponding kcov_remote_start()
* actually found the remote handle and started collecting coverage.
*/
- if (in_serving_softirq() && !t->kcov_softirq) {
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ if (in_serving_softirq() && !t->kcov_softirq)
return;
- }
/* Make sure that kcov_softirq is only set when in softirq. */
- if (WARN_ON(!in_serving_softirq() && t->kcov_softirq)) {
- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
+ if (WARN_ON(!in_serving_softirq() && t->kcov_softirq))
return;
- }

kcov = t->kcov;
area = t->kcov_area;
@@ -1069,14 +1083,12 @@ void kcov_remote_stop(void)
kcov_move_area(kcov->mode, kcov->area, kcov->size, area);
spin_unlock(&kcov->lock);

- if (in_task()) {
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) || in_task()) {
spin_lock(&kcov_remote_lock);
kcov_remote_area_put(area, size);
spin_unlock(&kcov_remote_lock);
}

- local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
-
/* Get in kcov_remote_start(). */
kcov_put(kcov);
}
@@ -1119,6 +1131,7 @@ static void __init selftest(void)

static int __init kcov_init(void)
{
+#ifndef CONFIG_PREEMPT_RT
int cpu;

for_each_possible_cpu(cpu) {
@@ -1128,6 +1141,7 @@ static int __init kcov_init(void)
return -ENOMEM;
per_cpu_ptr(&kcov_percpu_data, cpu)->irq_area = area;
}
+#endif

/*
* The kcov debugfs file won't ever get removed and thus,
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 13f3297aa823..493be2c73c9d 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2247,6 +2247,7 @@ config KCOV_INSTRUMENT_ALL
config KCOV_IRQ_AREA_SIZE
hex "Size of interrupt coverage collection area in words"
depends on KCOV
+ depends on !PREEMPT_RT
default 0x40000
help
KCOV uses preallocated per-cpu areas to collect coverage from
--
2.47.3
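
Since this version makes CONFIG_PREEMPT_RT=y always take the kcov_remote_area_get() path, the "take from the pool, otherwise allocate, and return to the pool on stop" pattern is worth spelling out. The following is a userspace sketch only: `struct area_model`, `pool_head`, and the `model_*` functions are hypothetical, malloc() stands in for vmalloc(), the pool is assumed to match on exact size, and the kcov_remote_lock that protects the real pool is not modeled.

```c
#include <assert.h>   /* used by the usage example below */
#include <stdlib.h>

/* Hypothetical pool entry: a header followed by `size` words of payload. */
struct area_model {
	struct area_model *next;
	unsigned int size; /* capacity in words */
};

static struct area_model *pool_head; /* models the global list of free areas */

/* Remove and return a pooled area of exactly `size` words, or NULL. */
static struct area_model *model_area_get(unsigned int size)
{
	struct area_model **pp;

	for (pp = &pool_head; *pp; pp = &(*pp)->next) {
		if ((*pp)->size == size) {
			struct area_model *a = *pp;

			*pp = a->next;
			a->next = NULL;
			return a;
		}
	}
	return NULL;
}

/* Hand an area back to the pool so a later session can reuse it. */
static void model_area_put(struct area_model *a, unsigned int size)
{
	a->size = size;
	a->next = pool_head;
	pool_head = a;
}

/* Pool first; fall back to a fresh allocation when the pool has no match. */
static struct area_model *model_area_acquire(unsigned int size)
{
	struct area_model *a = model_area_get(size);

	if (!a) {
		a = malloc(sizeof(*a) + (size_t)size * sizeof(unsigned long));
		if (a) {
			a->next = NULL;
			a->size = size;
		}
	}
	return a; /* NULL only if the fallback allocation failed */
}
```

In this model the first session of a given size always pays for an allocation, and the pool only grows as areas are returned. Nothing bounds the number of areas, which is the sizing question raised in the note above about dropping the preallocated per-CPU buffers.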


Dmitry Vyukov

May 15, 2026, 3:10:59 AM
to Tetsuo Handa, Aleksandr Nogikh, kasan-dev, LKML, Alexander Potapenko, Alan Stern, Andrew Morton, Andrey Konovalov, Clark Williams, Greg Kroah-Hartman, Marco Elver, Sebastian Andrzej Siewior
+Aleksandr Nogikh

I don't see this patch on syzbot CI:
https://ci.syzbot.org/?name=kcov%3A+move+kcov_remote_data+to+task_struct+for+RT+and+remove+local_lock

How can we make it at least tested on syzbot CI? Otherwise it risks
breaking syzbot when merged.
Is it possible to unify the logic between RT and non-RT kernels (not
have any ifdef's)?
The logic is already rather complicated, having 2 work modes, and
limited testing will be problematic in the future.

Tetsuo Handa

6:33 AM
to Dmitry Vyukov, Aleksandr Nogikh, kasan-dev, LKML, Alexander Potapenko, Alan Stern, Andrew Morton, Andrey Konovalov, Clark Williams, Greg Kroah-Hartman, Marco Elver, Sebastian Andrzej Siewior
On 2026/05/15 16:10, Dmitry Vyukov wrote:
>> Only compile tested. I don't have environment to measure how not
>
> +Aleksandr Nogikh
>
> I don't see this patch on syzbot CI:
> https://ci.syzbot.org/?name=kcov%3A+move+kcov_remote_data+to+task_struct+for+RT+and+remove+local_lock

I didn't include a syzbot-related mail address.
What is needed for syzbot CI to detect this patch?

>
> How can we make it at least tested on syzbot CI? Otherwise it risks
> breaking syzbot when merged.

Someone who can run syzkaller will be able to test this patch.
If nobody can, I can try sending this patch to the linux-next tree only, via my tree.

>> +#ifdef CONFIG_PREEMPT_RT
>
> Is it possible to unify the logic between RT and non-RT kernels (not
> have any ifdef's)?
> The logic is already rather complicated, having 2 work modes, and
> limited testing will be problematic in the future.

Always using per-task_struct temporary storage would be possible if we don't mind
bloating sizeof(struct task_struct). But we can't unify the buffer allocation part,
because allocating a large buffer using kmalloc(GFP_ATOMIC) for CONFIG_PREEMPT_RT=n
is unreliable. Preallocating global non-percpu buffers for CONFIG_PREEMPT_RT=y
would be possible, but we don't know how many we should allocate, because
preemption is possible in CONFIG_PREEMPT_RT=y.

Sebastian Andrzej Siewior

11:13 AM
to Tetsuo Handa, kasan-dev, LKML, Dmitry Vyukov, Alexander Potapenko, Alan Stern, Andrew Morton, Andrey Konovalov, Andrey Konovalov, Clark Williams, Greg Kroah-Hartman, Marco Elver
On 2026-05-09 17:09:44 [+0900], Tetsuo Handa wrote:
> In CONFIG_PREEMPT_RT=y kernels, softirqs are executed in a per-CPU
> ksoftirqd thread or in the context of the task that raised the softirq.
> This means in_task() can return true even while serving softirqs. This
> behavior causes KCOV to incorrectly identify the context, leading to state
> corruption and various kcov-related warnings reported by syzbot.

I need to digest the remaining email, but in_task() should not return true
on PREEMPT_RT while a softirq is being served, regardless of whether that
happens in ksoftirqd or in the context of the task via local_bh_enable().

I will digest this later.

Sebastian