[PATCH v5 00/23] mm/ksw: Introduce real-time KStackWatch debugging tool


Jinchao Wang

Sep 24, 2025, 7:51:36 AM
to Andrew Morton, Masami Hiramatsu, Peter Zijlstra, Mike Rapoport, Alexander Potapenko, Randy Dunlap, Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x...@kernel.org, H. Peter Anvin, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko, Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt, Kees Cook, Alice Ryhl, Sami Tolvanen, Miguel Ojeda, Masahiro Yamada, Rong Xu, Naveen N Rao, David Kaplan, Andrii Nakryiko, Jinjie Ruan, Nam Cao, work...@vger.kernel.org, linu...@vger.kernel.org, linux-...@vger.kernel.org, linux-pe...@vger.kernel.org, linu...@kvack.org, ll...@lists.linux.dev, Andrey Ryabinin, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, kasa...@googlegroups.com, David S. Miller, Mathieu Desnoyers, linux-tra...@vger.kernel.org, Jinchao Wang
This patch series introduces KStackWatch, a lightweight debugging tool to detect
kernel stack corruption in real time. It installs a hardware breakpoint
(watchpoint) at a function's specified offset using `kprobe.post_handler` and
removes it in `fprobe.exit_handler`. This covers the full execution window and
reports corruption immediately with time, location, and a call stack.

The motivation comes from scenarios where corruption occurs silently in one
function but manifests later in another, without a direct call trace linking
the two. Such bugs are often extremely hard to debug with existing tools.
These scenarios are demonstrated in tests 3–5 (silent corruption test, patch 20).

Key features include:

* Immediate and precise corruption detection
* Support for multiple watchpoints on concurrently called functions
* Lockless design, usable in any context
* Depth filter for recursive calls
* Minimal impact on reproducibility
* Flexible procfs configuration with key=val syntax

To validate the approach, the series includes a test module and a test script.

A workflow example is described in detail in the documentation (patch 22).
Please read that document first if you want an overview.

---
Patches 1–3 of this series are also used in the wprobe work proposed by
Masami Hiramatsu, so there may be some overlap between our patches.
Patch 3 comes directly from Masami Hiramatsu (thanks).
---
Changelog

V5:
* Support key=value input format
* Support multiple watchpoints
* Support watching instruction inside loop
* Support recursion depth tracking with generation
* Ignore triggers from fprobe trampoline
* Split watch_on into watch_get and watch_on to fail fast
* Handle ksw_stack_prepare_watch error
* Rewrite silent corruption test
* Add multiple watchpoints test
* Add an example in documentation

V4:
https://lore.kernel.org/all/20250912101145.4657...@gmail.com/
* Solve the lockdep issues with:
* per-task KStackWatch context to track depth
* atomic flag to protect watched_addr
* Use refactored version of arch_reinstall_hw_breakpoint

V3:
https://lore.kernel.org/all/20250910052335.1151...@gmail.com/
* Use modify_wide_hw_breakpoint_local() (from Masami)
* Add atomic flag to restrict /proc/kstackwatch to a single opener
* Protect stack probe with an atomic PID flag
* Handle CPU hotplug for watchpoints
* Add preempt_disable/enable in ksw_watch_on_local_cpu()
* Introduce const struct ksw_config *ksw_get_config(void) and use it
* Switch to global watch_attr, remove struct watch_info
* Validate local_var_len in parser()
* Handle case when canary is not found
* Use dump_stack() instead of show_regs() to allow module build
* Reduce logging and comments
* Format logs with KBUILD_MODNAME
* Remove unused headers
* Add new document

V2:
https://lore.kernel.org/all/20250904002126.1514...@gmail.com/
* Make hardware breakpoint and stack operations architecture-independent.

V1:
https://lore.kernel.org/all/20250828073311.1116...@gmail.com/
* Replaced kretprobe with fprobe for function exit hooking, as suggested
by Masami Hiramatsu
* Introduced per-task depth logic to track recursion across scheduling
* Removed the use of workqueue for a more efficient corruption check
* Reordered patches for better logical flow
* Simplified and improved commit messages throughout the series
* Removed initial archcheck which should be improved later
* Replaced the multiple-thread test with silent corruption test
* Split self-tests into a separate patch to improve clarity.
* Added a new entry for KStackWatch to the MAINTAINERS file.

RFC:
https://lore.kernel.org/lkml/20250818122720.4349...@gmail.com/

---

The series is structured as follows:

Jinchao Wang (22):
x86/hw_breakpoint: Unify breakpoint install/uninstall
x86/hw_breakpoint: Add arch_reinstall_hw_breakpoint
mm/ksw: add build system support
mm/ksw: add ksw_config struct and parser
mm/ksw: add singleton /proc/kstackwatch interface
mm/ksw: add HWBP pre-allocation
mm/ksw: Add atomic watchpoint management api
mm/ksw: ignore false positives from exit trampolines
mm/ksw: support CPU hotplug
sched: add per-task context
mm/ksw: add entry kprobe and exit fprobe management
mm/ksw: add per-task ctx tracking
mm/ksw: resolve stack watch addr and len
mm/ksw: manage probe and HWBP lifecycle via procfs
mm/ksw: add self-debug helpers
mm/ksw: add test module
mm/ksw: add stack overflow test
mm/ksw: add recursive depth test
mm/ksw: add multi-thread corruption test cases
tools/ksw: add test script
docs: add KStackWatch document
MAINTAINERS: add entry for KStackWatch

Masami Hiramatsu (Google) (1):
HWBP: Add modify_wide_hw_breakpoint_local() API

Documentation/dev-tools/index.rst | 1 +
Documentation/dev-tools/kstackwatch.rst | 316 ++++++++++++++++++++++
MAINTAINERS | 8 +
arch/Kconfig | 10 +
arch/x86/Kconfig | 1 +
arch/x86/include/asm/hw_breakpoint.h | 8 +
arch/x86/kernel/hw_breakpoint.c | 148 ++++++-----
include/linux/hw_breakpoint.h | 6 +
include/linux/kstackwatch_types.h | 14 +
include/linux/sched.h | 5 +
kernel/events/hw_breakpoint.c | 37 +++
mm/Kconfig.debug | 18 ++
mm/Makefile | 1 +
mm/kstackwatch/Makefile | 8 +
mm/kstackwatch/kernel.c | 263 +++++++++++++++++++
mm/kstackwatch/kstackwatch.h | 58 +++++
mm/kstackwatch/stack.c | 240 +++++++++++++++++
mm/kstackwatch/test.c | 332 ++++++++++++++++++++++++
mm/kstackwatch/watch.c | 305 ++++++++++++++++++++++
tools/kstackwatch/kstackwatch_test.sh | 52 ++++
20 files changed, 1769 insertions(+), 62 deletions(-)
create mode 100644 Documentation/dev-tools/kstackwatch.rst
create mode 100644 include/linux/kstackwatch_types.h
create mode 100644 mm/kstackwatch/Makefile
create mode 100644 mm/kstackwatch/kernel.c
create mode 100644 mm/kstackwatch/kstackwatch.h
create mode 100644 mm/kstackwatch/stack.c
create mode 100644 mm/kstackwatch/test.c
create mode 100644 mm/kstackwatch/watch.c
create mode 100755 tools/kstackwatch/kstackwatch_test.sh

--
2.43.0

Jinchao Wang

Sep 24, 2025, 7:51:40 AM
Consolidate breakpoint management to reduce code duplication.
Since the diffstat is misleading, the stripped code size is compared
instead: after the refactoring, it shrinks from 11976 bytes to 11448
bytes on my x86_64 system built with clang.

This also makes it easier to introduce arch_reinstall_hw_breakpoint().

In addition, include linux/types.h to fix a missing build dependency.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
Reviewed-by: Masami Hiramatsu (Google) <mhir...@kernel.org>
---
arch/x86/include/asm/hw_breakpoint.h | 6 ++
arch/x86/kernel/hw_breakpoint.c | 141 +++++++++++++++------------
2 files changed, 84 insertions(+), 63 deletions(-)

diff --git a/arch/x86/include/asm/hw_breakpoint.h b/arch/x86/include/asm/hw_breakpoint.h
index 0bc931cd0698..aa6adac6c3a2 100644
--- a/arch/x86/include/asm/hw_breakpoint.h
+++ b/arch/x86/include/asm/hw_breakpoint.h
@@ -5,6 +5,7 @@
#include <uapi/asm/hw_breakpoint.h>

#define __ARCH_HW_BREAKPOINT_H
+#include <linux/types.h>

/*
* The name should probably be something dealt in
@@ -18,6 +19,11 @@ struct arch_hw_breakpoint {
u8 type;
};

+enum bp_slot_action {
+ BP_SLOT_ACTION_INSTALL,
+ BP_SLOT_ACTION_UNINSTALL,
+};
+
#include <linux/kdebug.h>
#include <linux/percpu.h>
#include <linux/list.h>
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index b01644c949b2..3658ace4bd8d 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -48,7 +48,6 @@ static DEFINE_PER_CPU(unsigned long, cpu_debugreg[HBP_NUM]);
*/
static DEFINE_PER_CPU(struct perf_event *, bp_per_reg[HBP_NUM]);

-
static inline unsigned long
__encode_dr7(int drnum, unsigned int len, unsigned int type)
{
@@ -85,96 +84,112 @@ int decode_dr7(unsigned long dr7, int bpnum, unsigned *len, unsigned *type)
}

/*
- * Install a perf counter breakpoint.
- *
- * We seek a free debug address register and use it for this
- * breakpoint. Eventually we enable it in the debug control register.
- *
- * Atomic: we hold the counter->ctx->lock and we only handle variables
- * and registers local to this cpu.
+ * We seek a slot and change it or keep it based on the action.
+ * Returns slot number on success, negative error on failure.
+ * Must be called with IRQs disabled.
*/
-int arch_install_hw_breakpoint(struct perf_event *bp)
+static int manage_bp_slot(struct perf_event *bp, enum bp_slot_action action)
{
- struct arch_hw_breakpoint *info = counter_arch_bp(bp);
- unsigned long *dr7;
- int i;
-
- lockdep_assert_irqs_disabled();
+ struct perf_event *old_bp;
+ struct perf_event *new_bp;
+ int slot;
+
+ switch (action) {
+ case BP_SLOT_ACTION_INSTALL:
+ old_bp = NULL;
+ new_bp = bp;
+ break;
+ case BP_SLOT_ACTION_UNINSTALL:
+ old_bp = bp;
+ new_bp = NULL;
+ break;
+ default:
+ return -EINVAL;
+ }

- for (i = 0; i < HBP_NUM; i++) {
- struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
+ for (slot = 0; slot < HBP_NUM; slot++) {
+ struct perf_event **curr = this_cpu_ptr(&bp_per_reg[slot]);

- if (!*slot) {
- *slot = bp;
- break;
+ if (*curr == old_bp) {
+ *curr = new_bp;
+ return slot;
}
}

- if (WARN_ONCE(i == HBP_NUM, "Can't find any breakpoint slot"))
- return -EBUSY;
+ if (old_bp) {
+ WARN_ONCE(1, "Can't find matching breakpoint slot");
+ return -EINVAL;
+ }
+
+ WARN_ONCE(1, "No free breakpoint slots");
+ return -EBUSY;
+}
+
+static void setup_hwbp(struct arch_hw_breakpoint *info, int slot, bool enable)
+{
+ unsigned long dr7;

- set_debugreg(info->address, i);
- __this_cpu_write(cpu_debugreg[i], info->address);
+ set_debugreg(info->address, slot);
+ __this_cpu_write(cpu_debugreg[slot], info->address);

- dr7 = this_cpu_ptr(&cpu_dr7);
- *dr7 |= encode_dr7(i, info->len, info->type);
+ dr7 = this_cpu_read(cpu_dr7);
+ if (enable)
+ dr7 |= encode_dr7(slot, info->len, info->type);
+ else
+ dr7 &= ~__encode_dr7(slot, info->len, info->type);

/*
- * Ensure we first write cpu_dr7 before we set the DR7 register.
- * This ensures an NMI never see cpu_dr7 0 when DR7 is not.
+ * Enabling:
+ * Ensure we first write cpu_dr7 before we set the DR7 register.
+ * This ensures an NMI never sees cpu_dr7 == 0 when DR7 is not.
*/
+ if (enable)
+ this_cpu_write(cpu_dr7, dr7);
+
barrier();

- set_debugreg(*dr7, 7);
+ set_debugreg(dr7, 7);
+
if (info->mask)
- amd_set_dr_addr_mask(info->mask, i);
+ amd_set_dr_addr_mask(enable ? info->mask : 0, slot);

- return 0;
+ /*
+ * Disabling:
+ * Ensure the write to cpu_dr7 is after we've set the DR7 register.
+ * This ensures an NMI never sees cpu_dr7 == 0 when DR7 is not.
+ */
+ if (!enable)
+ this_cpu_write(cpu_dr7, dr7);
}

/*
- * Uninstall the breakpoint contained in the given counter.
- *
- * First we search the debug address register it uses and then we disable
- * it.
- *
- * Atomic: we hold the counter->ctx->lock and we only handle variables
- * and registers local to this cpu.
+ * Find a suitable breakpoint slot and set it up based on the action.
*/
-void arch_uninstall_hw_breakpoint(struct perf_event *bp)
+static int arch_manage_bp(struct perf_event *bp, enum bp_slot_action action)
{
- struct arch_hw_breakpoint *info = counter_arch_bp(bp);
- unsigned long dr7;
- int i;
+ struct arch_hw_breakpoint *info;
+ int slot;

lockdep_assert_irqs_disabled();

- for (i = 0; i < HBP_NUM; i++) {
- struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
-
- if (*slot == bp) {
- *slot = NULL;
- break;
- }
- }
-
- if (WARN_ONCE(i == HBP_NUM, "Can't find any breakpoint slot"))
- return;
+ slot = manage_bp_slot(bp, action);
+ if (slot < 0)
+ return slot;

- dr7 = this_cpu_read(cpu_dr7);
- dr7 &= ~__encode_dr7(i, info->len, info->type);
+ info = counter_arch_bp(bp);
+ setup_hwbp(info, slot, action != BP_SLOT_ACTION_UNINSTALL);

- set_debugreg(dr7, 7);
- if (info->mask)
- amd_set_dr_addr_mask(0, i);
+ return 0;
+}

- /*
- * Ensure the write to cpu_dr7 is after we've set the DR7 register.
- * This ensures an NMI never see cpu_dr7 0 when DR7 is not.
- */
- barrier();
+int arch_install_hw_breakpoint(struct perf_event *bp)
+{
+ return arch_manage_bp(bp, BP_SLOT_ACTION_INSTALL);
+}

- this_cpu_write(cpu_dr7, dr7);
+void arch_uninstall_hw_breakpoint(struct perf_event *bp)
+{
+ arch_manage_bp(bp, BP_SLOT_ACTION_UNINSTALL);
}

static int arch_bp_generic_len(int x86_len)
--
2.43.0

Jinchao Wang

Sep 24, 2025, 7:51:43 AM
The new arch_reinstall_hw_breakpoint() function can be used in an
atomic context, unlike the more expensive free and re-allocation path.
This allows callers to efficiently re-establish an existing breakpoint.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
Reviewed-by: Masami Hiramatsu (Google) <mhir...@kernel.org>
---
arch/x86/include/asm/hw_breakpoint.h | 2 ++
arch/x86/kernel/hw_breakpoint.c | 9 +++++++++
2 files changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/hw_breakpoint.h b/arch/x86/include/asm/hw_breakpoint.h
index aa6adac6c3a2..c22cc4e87fc5 100644
--- a/arch/x86/include/asm/hw_breakpoint.h
+++ b/arch/x86/include/asm/hw_breakpoint.h
@@ -21,6 +21,7 @@ struct arch_hw_breakpoint {

enum bp_slot_action {
BP_SLOT_ACTION_INSTALL,
+ BP_SLOT_ACTION_REINSTALL,
BP_SLOT_ACTION_UNINSTALL,
};

@@ -65,6 +66,7 @@ extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,


int arch_install_hw_breakpoint(struct perf_event *bp);
+int arch_reinstall_hw_breakpoint(struct perf_event *bp);
void arch_uninstall_hw_breakpoint(struct perf_event *bp);
void hw_breakpoint_pmu_read(struct perf_event *bp);
void hw_breakpoint_pmu_unthrottle(struct perf_event *bp);
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 3658ace4bd8d..29c9369264d4 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -99,6 +99,10 @@ static int manage_bp_slot(struct perf_event *bp, enum bp_slot_action action)
old_bp = NULL;
new_bp = bp;
break;
+ case BP_SLOT_ACTION_REINSTALL:
+ old_bp = bp;
+ new_bp = bp;
+ break;
case BP_SLOT_ACTION_UNINSTALL:
old_bp = bp;
new_bp = NULL;
@@ -187,6 +191,11 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
return arch_manage_bp(bp, BP_SLOT_ACTION_INSTALL);
}

+int arch_reinstall_hw_breakpoint(struct perf_event *bp)
+{
+ return arch_manage_bp(bp, BP_SLOT_ACTION_REINSTALL);
+}
+
void arch_uninstall_hw_breakpoint(struct perf_event *bp)
{
arch_manage_bp(bp, BP_SLOT_ACTION_UNINSTALL);
--
2.43.0

Jinchao Wang

Sep 24, 2025, 7:51:47 AM
From: "Masami Hiramatsu (Google)" <mhir...@kernel.org>

Add the modify_wide_hw_breakpoint_local() arch-wide interface, which
allows hwbp users to update the watch address on-line. It is available
if the arch supports CONFIG_HAVE_REINSTALL_HW_BREAKPOINT.
Note that the type can only be changed to a compatible type, because
the hwbp slot is not released and re-reserved based on the type. For
instance, HW_BREAKPOINT_W can not be changed to HW_BREAKPOINT_X.

Signed-off-by: Masami Hiramatsu (Google) <mhir...@kernel.org>
---
arch/Kconfig | 10 ++++++++++
arch/x86/Kconfig | 1 +
include/linux/hw_breakpoint.h | 6 ++++++
kernel/events/hw_breakpoint.c | 37 +++++++++++++++++++++++++++++++++++
4 files changed, 54 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index d1b4ffd6e085..e4787fc814df 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -418,6 +418,16 @@ config HAVE_MIXED_BREAKPOINTS_REGS
Select this option if your arch implements breakpoints under the
latter fashion.

+config HAVE_REINSTALL_HW_BREAKPOINT
+ bool
+ depends on HAVE_HW_BREAKPOINT
+ help
+ Depending on the arch implementation of hardware breakpoints,
+ some of them can update the breakpoint configuration without
+ releasing and reserving the hardware breakpoint register.
+ Which configuration can be updated depends on the hardware and
+ software implementation.
+
config HAVE_USER_RETURN_NOTIFIER
bool

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 52c8910ba2ef..4ea313ef3e82 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -247,6 +247,7 @@ config X86
select HAVE_FUNCTION_TRACER
select HAVE_GCC_PLUGINS
select HAVE_HW_BREAKPOINT
+ select HAVE_REINSTALL_HW_BREAKPOINT
select HAVE_IOREMAP_PROT
select HAVE_IRQ_EXIT_ON_IRQ_STACK if X86_64
select HAVE_IRQ_TIME_ACCOUNTING
diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
index db199d653dd1..ea373f2587f8 100644
--- a/include/linux/hw_breakpoint.h
+++ b/include/linux/hw_breakpoint.h
@@ -81,6 +81,9 @@ register_wide_hw_breakpoint(struct perf_event_attr *attr,
perf_overflow_handler_t triggered,
void *context);

+extern int modify_wide_hw_breakpoint_local(struct perf_event *bp,
+ struct perf_event_attr *attr);
+
extern int register_perf_hw_breakpoint(struct perf_event *bp);
extern void unregister_hw_breakpoint(struct perf_event *bp);
extern void unregister_wide_hw_breakpoint(struct perf_event * __percpu *cpu_events);
@@ -124,6 +127,9 @@ register_wide_hw_breakpoint(struct perf_event_attr *attr,
perf_overflow_handler_t triggered,
void *context) { return NULL; }
static inline int
+modify_wide_hw_breakpoint_local(struct perf_event *bp,
+ struct perf_event_attr *attr) { return -ENOSYS; }
+static inline int
register_perf_hw_breakpoint(struct perf_event *bp) { return -ENOSYS; }
static inline void unregister_hw_breakpoint(struct perf_event *bp) { }
static inline void
diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
index 8ec2cb688903..5ee1522a99c9 100644
--- a/kernel/events/hw_breakpoint.c
+++ b/kernel/events/hw_breakpoint.c
@@ -887,6 +887,43 @@ void unregister_wide_hw_breakpoint(struct perf_event * __percpu *cpu_events)
}
EXPORT_SYMBOL_GPL(unregister_wide_hw_breakpoint);

+/**
+ * modify_wide_hw_breakpoint_local - update breakpoint config for local CPU
+ * @bp: the hwbp perf event for this CPU
+ * @attr: the new attribute for @bp
+ *
+ * This does not release and reserve the slot of a HWBP; it just reuses the
+ * current slot on the local CPU, so users must update the other CPUs by
+ * themselves.
+ * Also, since the slot is not released/reserved, this can not change the
+ * type of the HWBP to an incompatible type.
+ * Returns an error if @attr is invalid or the CPU fails to update the debug
+ * register for the new @attr.
+ */
+#ifdef CONFIG_HAVE_REINSTALL_HW_BREAKPOINT
+int modify_wide_hw_breakpoint_local(struct perf_event *bp,
+ struct perf_event_attr *attr)
+{
+ int ret;
+
+ if (find_slot_idx(bp->attr.bp_type) != find_slot_idx(attr->bp_type))
+ return -EINVAL;
+
+ ret = hw_breakpoint_arch_parse(bp, attr, counter_arch_bp(bp));
+ if (ret)
+ return ret;
+
+ return arch_reinstall_hw_breakpoint(bp);
+}
+#else
+int modify_wide_hw_breakpoint_local(struct perf_event *bp,
+ struct perf_event_attr *attr)
+{
+ return -EOPNOTSUPP;
+}
+#endif
+EXPORT_SYMBOL_GPL(modify_wide_hw_breakpoint_local);
+
/**
* hw_breakpoint_is_used - check if breakpoints are currently used
*
--
2.43.0

Jinchao Wang

Sep 24, 2025, 7:51:51 AM
Add Kconfig and Makefile infrastructure.

The implementation is located under `mm/kstackwatch/`.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/Kconfig.debug | 8 ++++++++
mm/Makefile | 1 +
mm/kstackwatch/Makefile | 2 ++
mm/kstackwatch/kernel.c | 23 +++++++++++++++++++++++
mm/kstackwatch/kstackwatch.h | 5 +++++
mm/kstackwatch/stack.c | 1 +
mm/kstackwatch/watch.c | 1 +
7 files changed, 41 insertions(+)
create mode 100644 mm/kstackwatch/Makefile
create mode 100644 mm/kstackwatch/kernel.c
create mode 100644 mm/kstackwatch/kstackwatch.h
create mode 100644 mm/kstackwatch/stack.c
create mode 100644 mm/kstackwatch/watch.c

diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index 32b65073d0cc..89be351c0be5 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -309,3 +309,11 @@ config PER_VMA_LOCK_STATS
overhead in the page fault path.

If in doubt, say N.
+
+config KSTACK_WATCH
+ bool "Kernel Stack Watch"
+ depends on HAVE_HW_BREAKPOINT && KPROBES && FPROBE && STACKTRACE
+ help
+ A lightweight real-time debugging tool to detect stack corruption.
+
+ If unsure, say N.
diff --git a/mm/Makefile b/mm/Makefile
index ef54aa615d9d..665c9f2bf987 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -92,6 +92,7 @@ obj-$(CONFIG_PAGE_POISONING) += page_poison.o
obj-$(CONFIG_KASAN) += kasan/
obj-$(CONFIG_KFENCE) += kfence/
obj-$(CONFIG_KMSAN) += kmsan/
+obj-$(CONFIG_KSTACK_WATCH) += kstackwatch/
obj-$(CONFIG_FAILSLAB) += failslab.o
obj-$(CONFIG_FAIL_PAGE_ALLOC) += fail_page_alloc.o
obj-$(CONFIG_MEMTEST) += memtest.o
diff --git a/mm/kstackwatch/Makefile b/mm/kstackwatch/Makefile
new file mode 100644
index 000000000000..84a46cb9a766
--- /dev/null
+++ b/mm/kstackwatch/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_KSTACK_WATCH) += kstackwatch.o
+kstackwatch-y := kernel.o stack.o watch.o
diff --git a/mm/kstackwatch/kernel.c b/mm/kstackwatch/kernel.c
new file mode 100644
index 000000000000..78f1d019225f
--- /dev/null
+++ b/mm/kstackwatch/kernel.c
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/module.h>
+
+static int __init kstackwatch_init(void)
+{
+ pr_info("module loaded\n");
+ return 0;
+}
+
+static void __exit kstackwatch_exit(void)
+{
+ pr_info("module unloaded\n");
+}
+
+module_init(kstackwatch_init);
+module_exit(kstackwatch_exit);
+
+MODULE_AUTHOR("Jinchao Wang");
+MODULE_DESCRIPTION("Kernel Stack Watch");
+MODULE_LICENSE("GPL");
+
diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
new file mode 100644
index 000000000000..0273ef478a26
--- /dev/null
+++ b/mm/kstackwatch/kstackwatch.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _KSTACKWATCH_H
+#define _KSTACKWATCH_H
+
+#endif /* _KSTACKWATCH_H */
diff --git a/mm/kstackwatch/stack.c b/mm/kstackwatch/stack.c
new file mode 100644
index 000000000000..cec594032515
--- /dev/null
+++ b/mm/kstackwatch/stack.c
@@ -0,0 +1 @@
+// SPDX-License-Identifier: GPL-2.0
diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
new file mode 100644
index 000000000000..cec594032515
--- /dev/null
+++ b/mm/kstackwatch/watch.c
@@ -0,0 +1 @@
+// SPDX-License-Identifier: GPL-2.0
--
2.43.0

Jinchao Wang

Sep 24, 2025, 7:51:54 AM
Add struct ksw_config and ksw_parse_config() to parse the user
configuration string.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kernel.c | 112 +++++++++++++++++++++++++++++++++++
mm/kstackwatch/kstackwatch.h | 27 +++++++++
2 files changed, 139 insertions(+)

diff --git a/mm/kstackwatch/kernel.c b/mm/kstackwatch/kernel.c
index 78f1d019225f..3b7009033dd4 100644
--- a/mm/kstackwatch/kernel.c
+++ b/mm/kstackwatch/kernel.c
@@ -1,16 +1,128 @@
// SPDX-License-Identifier: GPL-2.0
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

+#include <linux/kstrtox.h>
#include <linux/module.h>
+#include <linux/string.h>
+
+#include "kstackwatch.h"
+
+static struct ksw_config *ksw_config;
+
+struct param_map {
+ const char *name; /* long name */
+ const char *short_name; /* short name (2 letters) */
+ size_t offset; /* offsetof(struct ksw_config, field) */
+ bool is_string; /* true for string */
+};
+
+/* macro generates both long and short name automatically */
+#define PMAP(field, short, is_str) \
+ { #field, #short, offsetof(struct ksw_config, field), is_str }
+
+static const struct param_map ksw_params[] = {
+ PMAP(func_name, fn, true),
+ PMAP(func_offset, fo, false),
+ PMAP(depth, dp, false),
+ PMAP(max_watch, mw, false),
+ PMAP(sp_offset, so, false),
+ PMAP(watch_len, wl, false),
+};
+
+static int ksw_parse_param(struct ksw_config *config, const char *key,
+ const char *val)
+{
+ const struct param_map *pm = NULL;
+ int ret;
+
+ for (int i = 0; i < ARRAY_SIZE(ksw_params); i++) {
+ if (strcmp(key, ksw_params[i].name) == 0 ||
+ strcmp(key, ksw_params[i].short_name) == 0) {
+ pm = &ksw_params[i];
+ break;
+ }
+ }
+
+ if (!pm)
+ return -EINVAL;
+
+ if (pm->is_string) {
+ char **dst = (char **)((char *)config + pm->offset);
+ *dst = kstrdup(val, GFP_KERNEL);
+ if (!*dst)
+ return -ENOMEM;
+ } else {
+ ret = kstrtou16(val, 0, (u16 *)((char *)config + pm->offset));
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+/*
+ * Configuration string format:
+ * param_name=<value> [param_name=<value> ...]
+ *
+ * Required parameters:
+ * - func_name |fn (str) : target function name
+ * - func_offset|fo (u16) : instruction pointer offset
+ *
+ * Optional parameters:
+ * - depth |dp (u16) : recursion depth
+ * - max_watch |mw (u16) : maximum number of watchpoints
+ * - sp_offset |so (u16) : offset from stack pointer at func_offset
+ * - watch_len |wl (u16) : watch length (1,2,4,8)
+ */
+static int __maybe_unused ksw_parse_config(char *buf, struct ksw_config *config)
+{
+ char *part, *key, *val;
+ int ret;
+
+ kfree(config->func_name);
+ kfree(config->user_input);
+ memset(config, 0, sizeof(*config));
+
+ buf = strim(buf);
+ config->user_input = kstrdup(buf, GFP_KERNEL);
+ if (!config->user_input)
+ return -ENOMEM;
+
+ while ((part = strsep(&buf, " \t\n")) != NULL) {
+ if (*part == '\0')
+ continue;
+
+ key = strsep(&part, "=");
+ val = part;
+ if (!key || !val)
+ continue;
+ ret = ksw_parse_param(config, key, val);
+ if (ret)
+ pr_warn("unsupported param %s=%s\n", key, val);
+ }
+
+ if (!config->func_name || !config->func_offset) {
+ pr_err("Missing required parameters: func_name or func_offset\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}

static int __init kstackwatch_init(void)
{
+ ksw_config = kzalloc(sizeof(*ksw_config), GFP_KERNEL);
+ if (!ksw_config)
+ return -ENOMEM;
+
pr_info("module loaded\n");
return 0;
}

static void __exit kstackwatch_exit(void)
{
+ kfree(ksw_config);
+
pr_info("module unloaded\n");
}

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 0273ef478a26..a7bad207f863 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -2,4 +2,31 @@
#ifndef _KSTACKWATCH_H
#define _KSTACKWATCH_H

+#include <linux/types.h>
+
+#define MAX_CONFIG_STR_LEN 128
+
+struct ksw_config {
+ char *func_name;
+ u16 depth;
+
+ /*
+ * watched variable info:
+ * - func_offset : instruction offset in the function, typically the
+ * assignment of the watched variable, where ksw
+ * registers a kprobe post-handler.
+ * - sp_offset : offset from stack pointer at func_offset. Usually 0.
+ * - watch_len : size of the watched variable (1, 2, 4, or 8 bytes).
+ */
+ u16 func_offset;
+ u16 sp_offset;
+ u16 watch_len;
+
+ /* max number of hwbps that can be used */
+ u16 max_watch;
+
+ /* save to show */
+ char *user_input;
+};
+
#endif /* _KSTACKWATCH_H */
--
2.43.0

Jinchao Wang

Sep 24, 2025, 7:51:58 AM
Provide the /proc/kstackwatch file to read or update the configuration.
Only a single process may open this file at a time, enforced with the
atomic config_file_busy flag to prevent concurrent access.

ksw_get_config() exposes the configuration pointer as const.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kernel.c | 77 +++++++++++++++++++++++++++++++++++-
mm/kstackwatch/kstackwatch.h | 3 ++
2 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/kernel.c b/mm/kstackwatch/kernel.c
index 3b7009033dd4..4a06ddadd9c7 100644
--- a/mm/kstackwatch/kernel.c
+++ b/mm/kstackwatch/kernel.c
@@ -3,11 +3,15 @@

#include <linux/kstrtox.h>
#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
#include <linux/string.h>
+#include <linux/uaccess.h>

#include "kstackwatch.h"

static struct ksw_config *ksw_config;
+static atomic_t config_file_busy = ATOMIC_INIT(0);

struct param_map {
const char *name; /* long name */
@@ -74,7 +78,7 @@ static int ksw_parse_param(struct ksw_config *config, const char *key,
* - sp_offset |so (u16) : offset from stack pointer at func_offset
* - watch_len |wl (u16) : watch length (1,2,4,8)
*/
-static int __maybe_unused ksw_parse_config(char *buf, struct ksw_config *config)
+static int ksw_parse_config(char *buf, struct ksw_config *config)
{
char *part, *key, *val;
int ret;
@@ -109,18 +113,89 @@ static int __maybe_unused ksw_parse_config(char *buf, struct ksw_config *config)
return 0;
}

+static ssize_t kstackwatch_proc_write(struct file *file,
+ const char __user *buffer, size_t count,
+ loff_t *pos)
+{
+ char input[MAX_CONFIG_STR_LEN];
+ int ret;
+
+ if (count == 0 || count >= sizeof(input))
+ return -EINVAL;
+
+ if (copy_from_user(input, buffer, count))
+ return -EFAULT;
+
+ input[count] = '\0';
+ strim(input);
+
+ if (!strlen(input)) {
+ pr_info("config cleared\n");
+ return count;
+ }
+
+ ret = ksw_parse_config(input, ksw_config);
+ if (ret) {
+ pr_err("Failed to parse config %d\n", ret);
+ return ret;
+ }
+
+ return count;
+}
+
+static int kstackwatch_proc_show(struct seq_file *m, void *v)
+{
+ seq_printf(m, "%s\n", ksw_config->user_input);
+ return 0;
+}
+
+static int kstackwatch_proc_open(struct inode *inode, struct file *file)
+{
+ if (atomic_cmpxchg(&config_file_busy, 0, 1))
+ return -EBUSY;
+
+ return single_open(file, kstackwatch_proc_show, NULL);
+}
+
+static int kstackwatch_proc_release(struct inode *inode, struct file *file)
+{
+ atomic_set(&config_file_busy, 0);
+ return single_release(inode, file);
+}
+
+static const struct proc_ops kstackwatch_proc_ops = {
+ .proc_open = kstackwatch_proc_open,
+ .proc_read = seq_read,
+ .proc_write = kstackwatch_proc_write,
+ .proc_lseek = seq_lseek,
+ .proc_release = kstackwatch_proc_release,
+};
+
+const struct ksw_config *ksw_get_config(void)
+{
+ return ksw_config;
+}
+
static int __init kstackwatch_init(void)
{
ksw_config = kzalloc(sizeof(*ksw_config), GFP_KERNEL);
if (!ksw_config)
return -ENOMEM;

+ if (!proc_create("kstackwatch", 0600, NULL, &kstackwatch_proc_ops)) {
+ pr_err("failed to create /proc/kstackwatch\n");
+ kfree(ksw_config);
+ return -ENOMEM;
+ }
+
pr_info("module loaded\n");
return 0;
}

static void __exit kstackwatch_exit(void)
{
+ remove_proc_entry("kstackwatch", NULL);
+ kfree(ksw_config->func_name);
+ kfree(ksw_config->user_input);
kfree(ksw_config);

pr_info("module unloaded\n");
diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index a7bad207f863..983125d5cf18 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -29,4 +29,7 @@ struct ksw_config {
char *user_input;
};

+/* singleton, only modified in kernel.c */
+const struct ksw_config *ksw_get_config(void);
+
 #endif /* _KSTACKWATCH_H */
--
2.43.0

Jinchao Wang

Sep 24, 2025, 7:52:06 AM
Add three functions for atomic lifecycle management of watchpoints:
- ksw_watch_get(): Acquires a watchpoint from a llist.
- ksw_watch_on(): Enables the watchpoint on all online CPUs.
- ksw_watch_off(): Disables the watchpoint and returns it to the llist.

For cross-CPU synchronization, updates are propagated using direct
modification on the local CPU and asynchronous IPIs for remote CPUs.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kstackwatch.h | 4 ++
mm/kstackwatch/watch.c | 85 +++++++++++++++++++++++++++++++++++-
2 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 4eac1be3b325..850fc2b18a9c 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -38,11 +38,15 @@ const struct ksw_config *ksw_get_config(void);
/* watch management */
struct ksw_watchpoint {
struct perf_event *__percpu *event;
+ call_single_data_t __percpu *csd;
struct perf_event_attr attr;
struct llist_node node; // for atomic watch_on and off
struct list_head list; // for cpu online and offline
};
int ksw_watch_init(void);
void ksw_watch_exit(void);
+int ksw_watch_get(struct ksw_watchpoint **out_wp);
+int ksw_watch_on(struct ksw_watchpoint *wp, ulong watch_addr, u16 watch_len);
+int ksw_watch_off(struct ksw_watchpoint *wp);

#endif /* _KSTACKWATCH_H */
diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
index 1d8e24fede54..887cc13292dc 100644
--- a/mm/kstackwatch/watch.c
+++ b/mm/kstackwatch/watch.c
@@ -31,11 +31,83 @@ static void ksw_watch_handler(struct perf_event *bp,
panic("Stack corruption detected");
}

+static void ksw_watch_on_local_cpu(void *info)
+{
+ struct ksw_watchpoint *wp = info;
+ struct perf_event *bp;
+ ulong flags;
+ int cpu;
+ int ret;
+
+ local_irq_save(flags);
+ cpu = raw_smp_processor_id();
+ bp = per_cpu(*wp->event, cpu);
+ if (!bp) {
+ local_irq_restore(flags);
+ return;
+ }
+
+ ret = modify_wide_hw_breakpoint_local(bp, &wp->attr);
+ local_irq_restore(flags);
+ WARN(ret, "failed to reinstall HWBP on CPU%d ret %d", cpu, ret);
+}
+
+static void ksw_watch_update(struct ksw_watchpoint *wp, ulong addr, u16 len)
+{
+ call_single_data_t *csd;
+ int cur_cpu;
+ int cpu;
+
+ wp->attr.bp_addr = addr;
+ wp->attr.bp_len = len;
+
+ cur_cpu = raw_smp_processor_id();
+ for_each_online_cpu(cpu) {
+ /* remote cpu first */
+ if (cpu == cur_cpu)
+ continue;
+ csd = per_cpu_ptr(wp->csd, cpu);
+ smp_call_function_single_async(cpu, csd);
+ }
+ ksw_watch_on_local_cpu(wp);
+}
+
+int ksw_watch_get(struct ksw_watchpoint **out_wp)
+{
+ struct ksw_watchpoint *wp;
+ struct llist_node *node;
+
+ node = llist_del_first(&free_wp_list);
+ if (!node)
+ return -EBUSY;
+
+ wp = llist_entry(node, struct ksw_watchpoint, node);
+ WARN_ON_ONCE(wp->attr.bp_addr != (u64)&holder);
+
+ *out_wp = wp;
+ return 0;
+}
+
+int ksw_watch_on(struct ksw_watchpoint *wp, ulong watch_addr, u16 watch_len)
+{
+ ksw_watch_update(wp, watch_addr, watch_len);
+ return 0;
+}
+
+int ksw_watch_off(struct ksw_watchpoint *wp)
+{
+ WARN_ON_ONCE(wp->attr.bp_addr == (u64)&holder);
+ ksw_watch_update(wp, (ulong)&holder, sizeof(ulong));
+ llist_add(&wp->node, &free_wp_list);
+ return 0;
+}
+
static int ksw_watch_alloc(void)
{
int max_watch = ksw_get_config()->max_watch;
struct ksw_watchpoint *wp;
+ call_single_data_t *csd;
int success = 0;
+ int cpu;
int ret;

init_llist_head(&free_wp_list);
@@ -45,6 +117,16 @@ static int ksw_watch_alloc(void)
wp = kzalloc(sizeof(*wp), GFP_KERNEL);
if (!wp)
return success > 0 ? success : -EINVAL;
+ wp->csd = alloc_percpu(call_single_data_t);
+ if (!wp->csd) {
+ kfree(wp);
+ return success > 0 ? success : -EINVAL;
+ }
+
+ for_each_possible_cpu(cpu) {
+ csd = per_cpu_ptr(wp->csd, cpu);
+ INIT_CSD(csd, ksw_watch_on_local_cpu, wp);
+ }

hw_breakpoint_init(&wp->attr);
wp->attr.bp_addr = (ulong)&holder;
@@ -54,6 +136,7 @@ static int ksw_watch_alloc(void)
ksw_watch_handler, wp);
if (IS_ERR((void *)wp->event)) {
ret = PTR_ERR((void *)wp->event);
+ free_percpu(wp->csd);
kfree(wp);
return success > 0 ? success : ret;
}
@@ -75,6 +158,7 @@ static void ksw_watch_free(void)
list_for_each_entry_safe(wp, tmp, &all_wp_list, list) {
list_del(&wp->list);
unregister_wide_hw_breakpoint(wp->event);
+ free_percpu(wp->csd);
kfree(wp);
}
mutex_unlock(&all_wp_mutex);
@@ -88,7 +172,6 @@ int ksw_watch_init(void)
if (ret <= 0)
return -EBUSY;

-
return 0;
}

--
2.43.0

Jinchao Wang

Sep 24, 2025, 7:52:09 AM
Trampolines run after the watched function returns but before the
exit_handler is called, and they execute in the original stack frame, so
trampoline code may overwrite the watched stack address.

These false positives should be ignored. is_ftrace_trampoline() does
not cover all trampolines, so add a local check to handle the remaining
cases.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/watch.c | 38 ++++++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)

diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
index 887cc13292dc..722ffd9fda7c 100644
--- a/mm/kstackwatch/watch.c
+++ b/mm/kstackwatch/watch.c
@@ -2,6 +2,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <linux/cpuhotplug.h>
+#include <linux/ftrace.h>
#include <linux/hw_breakpoint.h>
#include <linux/irqflags.h>
#include <linux/mutex.h>
@@ -18,10 +19,46 @@ bool panic_on_catch;
module_param(panic_on_catch, bool, 0644);
MODULE_PARM_DESC(panic_on_catch, "panic immediately on corruption catch");

+#define TRAMPOLINE_NAME "return_to_handler"
+#define TRAMPOLINE_DEPTH 16
+
+/* Resolved once, then reused */
+static unsigned long tramp_start, tramp_end;
+
+static void ksw_watch_resolve_trampoline(void)
+{
+ unsigned long sz, off;
+
+ if (likely(tramp_start && tramp_end))
+ return;
+
+ tramp_start = kallsyms_lookup_name(TRAMPOLINE_NAME);
+ if (tramp_start && kallsyms_lookup_size_offset(tramp_start, &sz, &off))
+ tramp_end = tramp_start + sz;
+}
+
+static bool ksw_watch_in_trampoline(unsigned long ip)
+{
+ if (tramp_start && tramp_end && ip >= tramp_start && ip < tramp_end)
+ return true;
+ return false;
+}
+
static void ksw_watch_handler(struct perf_event *bp,
struct perf_sample_data *data,
struct pt_regs *regs)
{
+ unsigned long entries[TRAMPOLINE_DEPTH];
+ int i, nr = 0;
+
+ nr = stack_trace_save_regs(regs, entries, TRAMPOLINE_DEPTH, 0);
+ for (i = 0; i < nr; i++) {
+ /* ignore trampoline */
+ if (is_ftrace_trampoline(entries[i]))
+ return;
+ if (ksw_watch_in_trampoline(entries[i]))
+ return;
+ }
+
pr_err("========== KStackWatch: Caught stack corruption =======\n");
pr_err("config %s\n", ksw_get_config()->user_input);
dump_stack();
@@ -168,6 +205,7 @@ int ksw_watch_init(void)
{
int ret;

+ ksw_watch_resolve_trampoline();
ret = ksw_watch_alloc();
if (ret <= 0)
return -EBUSY;
--
2.43.0

Jinchao Wang

Sep 24, 2025, 7:52:13 AM
Register CPU online/offline callbacks via cpuhp_setup_state_nocalls()
so stack watches are installed/removed dynamically as CPUs come online
or go offline.

When a new CPU comes online, register a hardware breakpoint for the holder,
avoiding races with watch_on()/watch_off() that may run on another CPU. The
watch address will be updated the next time watch_on() is called.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/watch.c | 52 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 52 insertions(+)

diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
index 722ffd9fda7c..f32b1e46168c 100644
--- a/mm/kstackwatch/watch.c
+++ b/mm/kstackwatch/watch.c
@@ -89,6 +89,48 @@ static void ksw_watch_on_local_cpu(void *info)
WARN(ret, "fail to reinstall HWBP on CPU%d ret %d", cpu, ret);
}

+static int ksw_watch_cpu_online(unsigned int cpu)
+{
+ struct perf_event_attr attr;
+ struct ksw_watchpoint *wp;
+ call_single_data_t *csd;
+ struct perf_event *bp;
+
+ mutex_lock(&all_wp_mutex);
+ list_for_each_entry(wp, &all_wp_list, list) {
+ attr = wp->attr;
+ attr.bp_addr = (u64)&holder;
+ bp = perf_event_create_kernel_counter(&attr, cpu, NULL,
+ ksw_watch_handler, wp);
+ if (IS_ERR(bp)) {
+ pr_warn("%s failed to create watch on CPU %d: %ld\n",
+ __func__, cpu, PTR_ERR(bp));
+ continue;
+ }
+
+ per_cpu(*wp->event, cpu) = bp;
+ csd = per_cpu_ptr(wp->csd, cpu);
+ INIT_CSD(csd, ksw_watch_on_local_cpu, wp);
+ }
+ mutex_unlock(&all_wp_mutex);
+ return 0;
+}
+
+static int ksw_watch_cpu_offline(unsigned int cpu)
+{
+ struct ksw_watchpoint *wp;
+ struct perf_event *bp;
+
+ mutex_lock(&all_wp_mutex);
+ list_for_each_entry(wp, &all_wp_list, list) {
+ bp = per_cpu(*wp->event, cpu);
+ if (bp)
+ unregister_hw_breakpoint(bp);
+ }
+ mutex_unlock(&all_wp_mutex);
+ return 0;
+}
+
static void ksw_watch_update(struct ksw_watchpoint *wp, ulong addr, u16 len)
{
call_single_data_t *csd;
@@ -210,6 +252,16 @@ int ksw_watch_init(void)
if (ret <= 0)
return -EBUSY;

+ ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+ "kstackwatch:online",
+ ksw_watch_cpu_online,
+ ksw_watch_cpu_offline);
+ if (ret < 0) {
+ ksw_watch_free();
+ pr_err("Failed to register CPU hotplug notifier\n");
+ return ret;
+ }
+
return 0;
}

--
2.43.0

Jinchao Wang

Sep 24, 2025, 7:52:16 AM
Introduce struct ksw_ctx to enable lockless per-task state
tracking. This is required because KStackWatch operates in NMI context
(via kprobe handler) where traditional locking is unsafe.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
include/linux/kstackwatch_types.h | 14 ++++++++++++++
include/linux/sched.h | 5 +++++
2 files changed, 19 insertions(+)
create mode 100644 include/linux/kstackwatch_types.h

diff --git a/include/linux/kstackwatch_types.h b/include/linux/kstackwatch_types.h
new file mode 100644
index 000000000000..2b515c06a918
--- /dev/null
+++ b/include/linux/kstackwatch_types.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_KSTACK_WATCH_TYPES_H
+#define _LINUX_KSTACK_WATCH_TYPES_H
+#include <linux/types.h>
+
+struct ksw_watchpoint;
+struct ksw_ctx {
+ struct ksw_watchpoint *wp;
+ ulong sp;
+ u16 depth;
+ u16 generation;
+};
+
+#endif /* _LINUX_KSTACK_WATCH_TYPES_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index f8188b833350..6935ee51f855 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -22,6 +22,7 @@
#include <linux/sem_types.h>
#include <linux/shm.h>
#include <linux/kmsan_types.h>
+#include <linux/kstackwatch_types.h>
#include <linux/mutex_types.h>
#include <linux/plist_types.h>
#include <linux/hrtimer_types.h>
@@ -1481,6 +1482,10 @@ struct task_struct {
struct kmsan_ctx kmsan_ctx;
#endif

+#if IS_ENABLED(CONFIG_KSTACK_WATCH)
+ struct ksw_ctx ksw_ctx;
+#endif
+
#if IS_ENABLED(CONFIG_KUNIT)
struct kunit *kunit_test;
#endif
--
2.43.0

Jinchao Wang

Sep 24, 2025, 7:52:20 AM
Provide ksw_stack_init() and ksw_stack_exit() to manage entry and exit
probes for the target function from ksw_get_config().

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kstackwatch.h | 4 ++
mm/kstackwatch/stack.c | 101 +++++++++++++++++++++++++++++++++++
2 files changed, 105 insertions(+)

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 850fc2b18a9c..4045890e5652 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -35,6 +35,10 @@ struct ksw_config {
// singleton, only modified in kernel.c
const struct ksw_config *ksw_get_config(void);

+/* stack management */
+int ksw_stack_init(void);
+void ksw_stack_exit(void);
+
/* watch management */
struct ksw_watchpoint {
struct perf_event *__percpu *event;
diff --git a/mm/kstackwatch/stack.c b/mm/kstackwatch/stack.c
index cec594032515..9f59f41d954c 100644
--- a/mm/kstackwatch/stack.c
+++ b/mm/kstackwatch/stack.c
@@ -1 +1,102 @@
// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/atomic.h>
+#include <linux/fprobe.h>
+#include <linux/kprobes.h>
+#include <linux/kstackwatch_types.h>
+#include <linux/printk.h>
+
+#include "kstackwatch.h"
+
+static struct kprobe entry_probe;
+static struct fprobe exit_probe;
+
+static int ksw_stack_prepare_watch(struct pt_regs *regs,
+ const struct ksw_config *config,
+ ulong *watch_addr, u16 *watch_len)
+{
+ /* implementation will be added in following patches */
+ *watch_addr = 0;
+ *watch_len = 0;
+ return 0;
+}
+
+static void ksw_stack_entry_handler(struct kprobe *p, struct pt_regs *regs,
+ unsigned long flags)
+{
+ struct ksw_ctx *ctx = &current->ksw_ctx;
+ ulong watch_addr;
+ u16 watch_len;
+ int ret;
+
+ ret = ksw_watch_get(&ctx->wp);
+ if (ret)
+ return;
+
+ ret = ksw_stack_prepare_watch(regs, ksw_get_config(), &watch_addr,
+ &watch_len);
+ if (ret) {
+ ksw_watch_off(ctx->wp);
+ ctx->wp = NULL;
+ pr_err("failed to prepare watch target: %d\n", ret);
+ return;
+ }
+
+ ret = ksw_watch_on(ctx->wp, watch_addr, watch_len);
+ if (ret) {
+ pr_err("failed to watch on depth:%d addr:0x%lx len:%u %d\n",
+ ksw_get_config()->depth, watch_addr, watch_len, ret);
+ return;
+ }
+
+}
+
+static void ksw_stack_exit_handler(struct fprobe *fp, unsigned long ip,
+ unsigned long ret_ip,
+ struct ftrace_regs *regs, void *data)
+{
+ struct ksw_ctx *ctx = &current->ksw_ctx;
+
+
+ if (ctx->wp) {
+ ksw_watch_off(ctx->wp);
+ ctx->wp = NULL;
+ ctx->sp = 0;
+ }
+}
+
+int ksw_stack_init(void)
+{
+ int ret;
+ char *symbuf = NULL;
+
+ memset(&entry_probe, 0, sizeof(entry_probe));
+ entry_probe.symbol_name = ksw_get_config()->func_name;
+ entry_probe.offset = ksw_get_config()->func_offset;
+ entry_probe.post_handler = ksw_stack_entry_handler;
+ ret = register_kprobe(&entry_probe);
+ if (ret) {
+ pr_err("failed to register kprobe ret %d\n", ret);
+ return ret;
+ }
+
+ memset(&exit_probe, 0, sizeof(exit_probe));
+ exit_probe.exit_handler = ksw_stack_exit_handler;
+ symbuf = (char *)ksw_get_config()->func_name;
+
+ ret = register_fprobe_syms(&exit_probe, (const char **)&symbuf, 1);
+ if (ret < 0) {
+ pr_err("failed to register fprobe ret %d\n", ret);
+ unregister_kprobe(&entry_probe);
+ return ret;
+ }
+
+ return 0;
+}
+
+void ksw_stack_exit(void)
+{
+ unregister_fprobe(&exit_probe);
+ unregister_kprobe(&entry_probe);
+}
--
2.43.0

Jinchao Wang

Sep 24, 2025, 7:52:24 AM
Each task tracks its depth, stack pointer, and generation. A watchpoint is
enabled only when the configured depth is reached, and disabled on function
exit.

The context is reset when probes are disabled, generation changes, or exit
depth becomes inconsistent.

Duplicate arming on the same frame is skipped.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/stack.c | 67 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 67 insertions(+)

diff --git a/mm/kstackwatch/stack.c b/mm/kstackwatch/stack.c
index 9f59f41d954c..e596ef97222d 100644
--- a/mm/kstackwatch/stack.c
+++ b/mm/kstackwatch/stack.c
@@ -12,6 +12,53 @@
static struct kprobe entry_probe;
static struct fprobe exit_probe;

+static bool probe_enable;
+static u16 probe_generation;
+
+static void ksw_reset_ctx(void)
+{
+ struct ksw_ctx *ctx = &current->ksw_ctx;
+
+ if (ctx->wp)
+ ksw_watch_off(ctx->wp);
+
+ ctx->wp = NULL;
+ ctx->sp = 0;
+ ctx->depth = 0;
+ ctx->generation = READ_ONCE(probe_generation);
+}
+
+static bool ksw_stack_check_ctx(bool entry)
+{
+ struct ksw_ctx *ctx = &current->ksw_ctx;
+ u16 cur_enable = READ_ONCE(probe_enable);
+ u16 cur_generation = READ_ONCE(probe_generation);
+ u16 cur_depth, target_depth = ksw_get_config()->depth;
+
+ if (!cur_enable) {
+ ksw_reset_ctx();
+ return false;
+ }
+
+ if (ctx->generation != cur_generation)
+ ksw_reset_ctx();
+
+ if (!entry && !ctx->depth) {
+ ksw_reset_ctx();
+ return false;
+ }
+
+ if (entry)
+ cur_depth = ctx->depth++;
+ else
+ cur_depth = --ctx->depth;
+
+ if (cur_depth == target_depth)
+ return true;
+ else
+ return false;
+}
+
static int ksw_stack_prepare_watch(struct pt_regs *regs,
const struct ksw_config *config,
ulong *watch_addr, u16 *watch_len)
@@ -26,10 +73,22 @@ static void ksw_stack_entry_handler(struct kprobe *p, struct pt_regs *regs,
unsigned long flags)
{
struct ksw_ctx *ctx = &current->ksw_ctx;
+ ulong stack_pointer;
ulong watch_addr;
u16 watch_len;
int ret;

+ stack_pointer = kernel_stack_pointer(regs);
+
+ /*
+ * triggered more than once, may be in a loop
+ */
+ if (ctx->wp && ctx->sp == stack_pointer)
+ return;
+
+ if (!ksw_stack_check_ctx(true))
+ return;
+
ret = ksw_watch_get(&ctx->wp);
if (ret)
return;
@@ -50,6 +109,7 @@ static void ksw_stack_entry_handler(struct kprobe *p, struct pt_regs *regs,
return;
}

+ ctx->sp = stack_pointer;
}

static void ksw_stack_exit_handler(struct fprobe *fp, unsigned long ip,
@@ -58,6 +118,8 @@ static void ksw_stack_exit_handler(struct fprobe *fp, unsigned long ip,
{
struct ksw_ctx *ctx = &current->ksw_ctx;

+ if (!ksw_stack_check_ctx(false))
+ return;

if (ctx->wp) {
ksw_watch_off(ctx->wp);
@@ -92,11 +154,16 @@ int ksw_stack_init(void)
return ret;
}

+ WRITE_ONCE(probe_generation, READ_ONCE(probe_generation) + 1);
+ WRITE_ONCE(probe_enable, true);
+
return 0;
}

void ksw_stack_exit(void)
{
+ WRITE_ONCE(probe_enable, false);
+ WRITE_ONCE(probe_generation, READ_ONCE(probe_generation) + 1);
unregister_fprobe(&exit_probe);
unregister_kprobe(&entry_probe);
}
--
2.43.0

Jinchao Wang

Sep 24, 2025, 7:52:27 AM
Add helpers to find the stack canary or a local variable address and
length for the probed function based on ksw_get_config(). The canary
search is limited to a fixed number of steps to avoid scanning the
entire stack, and the computed address and length are validated to lie
within the kernel stack.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/stack.c | 77 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 74 insertions(+), 3 deletions(-)

diff --git a/mm/kstackwatch/stack.c b/mm/kstackwatch/stack.c
index e596ef97222d..3c4cb6d5b58a 100644
--- a/mm/kstackwatch/stack.c
+++ b/mm/kstackwatch/stack.c
@@ -9,6 +9,7 @@

#include "kstackwatch.h"

+#define MAX_CANARY_SEARCH_STEPS 128
static struct kprobe entry_probe;
static struct fprobe exit_probe;

@@ -59,13 +60,83 @@ static bool ksw_stack_check_ctx(bool entry)
return false;
}

+static unsigned long ksw_find_stack_canary_addr(struct pt_regs *regs)
+{
+ unsigned long *stack_ptr, *stack_end, *stack_base;
+ unsigned long expected_canary;
+ unsigned int i;
+
+ stack_ptr = (unsigned long *)kernel_stack_pointer(regs);
+
+ stack_base = (unsigned long *)(current->stack);
+
+ // TODO: limit it to the current frame
+ stack_end = (unsigned long *)((char *)current->stack + THREAD_SIZE);
+
+ expected_canary = current->stack_canary;
+
+ if (stack_ptr < stack_base || stack_ptr >= stack_end) {
+ pr_err("Stack pointer 0x%lx out of bounds [0x%lx, 0x%lx)\n",
+ (unsigned long)stack_ptr, (unsigned long)stack_base,
+ (unsigned long)stack_end);
+ return 0;
+ }
+
+ for (i = 0; i < MAX_CANARY_SEARCH_STEPS; i++) {
+ if (&stack_ptr[i] >= stack_end)
+ break;
+
+ if (stack_ptr[i] == expected_canary) {
+ pr_debug("canary found i:%d 0x%lx\n", i,
+ (unsigned long)&stack_ptr[i]);
+ return (unsigned long)&stack_ptr[i];
+ }
+ }
+
+ pr_debug("canary not found in first %d steps\n",
+ MAX_CANARY_SEARCH_STEPS);
+ return 0;
+}
+
+static int ksw_stack_validate_addr(unsigned long addr, size_t size)
+{
+ unsigned long stack_start, stack_end;
+
+ if (!addr || !size)
+ return -EINVAL;
+
+ stack_start = (unsigned long)current->stack;
+ stack_end = stack_start + THREAD_SIZE;
+
+ if (addr < stack_start || (addr + size) > stack_end)
+ return -ERANGE;
+
+ return 0;
+}
+
static int ksw_stack_prepare_watch(struct pt_regs *regs,
const struct ksw_config *config,
ulong *watch_addr, u16 *watch_len)
{
- /* implement logic will be added in following patches */
- *watch_addr = 0;
- *watch_len = 0;
+ ulong addr;
+ u16 len;
+
+ // default is to watch the canary
+ if (!ksw_get_config()->watch_len) {
+ addr = ksw_find_stack_canary_addr(regs);
+ len = sizeof(ulong);
+ } else {
+ addr = kernel_stack_pointer(regs) + ksw_get_config()->sp_offset;
+ len = ksw_get_config()->watch_len;
+ }
+
+ if (ksw_stack_validate_addr(addr, len)) {
+ pr_err("invalid stack addr:0x%lx len:%u\n", addr, len);
+ return -EINVAL;
+ }
+
+ *watch_addr = addr;
+ *watch_len = len;
return 0;
}

--
2.43.0

Jinchao Wang
Sep 24, 2025, 7:52:32 AM
Allow KStackWatch to be enabled and disabled dynamically through writes
to its proc interface. With this patch, the entire system becomes
functional.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kernel.c | 55 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/kernel.c b/mm/kstackwatch/kernel.c
index 4a06ddadd9c7..11aa06908ff1 100644
--- a/mm/kstackwatch/kernel.c
+++ b/mm/kstackwatch/kernel.c
@@ -13,6 +13,43 @@
static struct ksw_config *ksw_config;
static atomic_t config_file_busy = ATOMIC_INIT(0);

+static bool watching_active;
+
+static int ksw_start_watching(void)
+{
+ int ret;
+
+ /*
+ * Watch init will preallocate the HWBP,
+ * so it must happen before stack init
+ */
+ ret = ksw_watch_init();
+ if (ret) {
+ pr_err("ksw_watch_init ret: %d\n", ret);
+ return ret;
+ }
+
+ ret = ksw_stack_init();
+ if (ret) {
+ pr_err("ksw_stack_init ret: %d\n", ret);
+ ksw_watch_exit();
+ return ret;
+ }
+ watching_active = true;
+
+ pr_info("start watching: %s\n", ksw_config->user_input);
+ return 0;
+}
+
+static void ksw_stop_watching(void)
+{
+ ksw_stack_exit();
+ ksw_watch_exit();
+ watching_active = false;
+
+ pr_info("stop watching: %s\n", ksw_config->user_input);
+}
+
struct param_map {
const char *name; /* long name */
const char *short_name; /* short name (2 letters) */
@@ -126,6 +163,9 @@ static ssize_t kstackwatch_proc_write(struct file *file,
if (copy_from_user(input, buffer, count))
return -EFAULT;

+ if (watching_active)
+ ksw_stop_watching();
+
input[count] = '\0';
strim(input);

@@ -140,12 +180,22 @@ static ssize_t kstackwatch_proc_write(struct file *file,
return ret;
}

+ ret = ksw_start_watching();
+ if (ret) {
+ pr_err("Failed to start watching with %d\n", ret);
+ return ret;
+ }
+
return count;
}

static int kstackwatch_proc_show(struct seq_file *m, void *v)
{
- seq_printf(m, "%s\n", ksw_config->user_input);
+ if (watching_active)
+ seq_printf(m, "%s\n", ksw_config->user_input);
+ else
+ seq_puts(m, "not watching\n");
+
return 0;
}

@@ -193,6 +243,9 @@ static int __init kstackwatch_init(void)

static void __exit kstackwatch_exit(void)
{
+ if (watching_active)
+ ksw_stop_watching();
+
remove_proc_entry("kstackwatch", NULL);
kfree(ksw_config->func_name);
kfree(ksw_config->user_input);
--
2.43.0

Jinchao Wang
Sep 24, 2025, 8:00:16 AM
Provide two debug helpers:

- ksw_watch_show(): print the current watch target address and length.
- ksw_watch_fire(): intentionally trigger the watchpoint immediately
by writing to the watched address, useful for testing HWBP behavior.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kstackwatch.h | 2 ++
mm/kstackwatch/watch.c | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 36 insertions(+)

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 4045890e5652..528001534047 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -52,5 +52,7 @@ void ksw_watch_exit(void);
int ksw_watch_get(struct ksw_watchpoint **out_wp);
int ksw_watch_on(struct ksw_watchpoint *wp, ulong watch_addr, u16 watch_len);
int ksw_watch_off(struct ksw_watchpoint *wp);
+void ksw_watch_show(void);
+void ksw_watch_fire(void);

#endif /* _KSTACKWATCH_H */
diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
index f32b1e46168c..9837d6873d92 100644
--- a/mm/kstackwatch/watch.c
+++ b/mm/kstackwatch/watch.c
@@ -269,3 +269,37 @@ void ksw_watch_exit(void)
{
ksw_watch_free();
}
+
+/* self debug function */
+void ksw_watch_show(void)
+{
+ struct ksw_watchpoint *wp = current->ksw_ctx.wp;
+
+ if (!wp) {
+ pr_info("nothing to show\n");
+ return;
+ }
+
+ pr_info("watch target bp_addr: 0x%llx len:%llu\n", wp->attr.bp_addr,
+ wp->attr.bp_len);
+}
+EXPORT_SYMBOL_GPL(ksw_watch_show);
+
+/* self debug function */
+void ksw_watch_fire(void)
+{
+ struct ksw_watchpoint *wp;
+ char *ptr;
+
+ wp = current->ksw_ctx.wp;
+
+ if (!wp) {
+ pr_info("nothing to fire\n");
+ return;
+ }
+
+ ptr = (char *)wp->attr.bp_addr;
+ pr_warn("watch triggered immediately\n");
+ *ptr = 0x42; // This should trigger immediately for any bp_len
+}
+EXPORT_SYMBOL_GPL(ksw_watch_fire);
--
2.43.0

Jinchao Wang
Sep 24, 2025, 8:00:24 AM
Introduce a separate test module to validate functionality in controlled
scenarios.

The module provides a proc interface (/proc/kstackwatch_test) that allows
triggering specific test cases via simple commands:

echo test0 > /proc/kstackwatch_test

The test module is built with optimizations disabled to ensure
predictable behavior.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/Kconfig.debug | 10 ++++
mm/kstackwatch/Makefile | 6 ++
mm/kstackwatch/test.c | 122 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 138 insertions(+)
create mode 100644 mm/kstackwatch/test.c

diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index 89be351c0be5..291dd8a78b98 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -317,3 +317,13 @@ config KSTACK_WATCH
A lightweight real-time debugging tool to detect stack corruption.

If unsure, say N.
+
+config KSTACK_WATCH_TEST
+ tristate "KStackWatch Test Module"
+ depends on KSTACK_WATCH
+ help
+ This module provides controlled stack corruption scenarios to verify
+ the functionality of KStackWatch. It is useful for development and
+ validation of KStackWatch mechanism.
+
+ If unsure, say N.
diff --git a/mm/kstackwatch/Makefile b/mm/kstackwatch/Makefile
index 84a46cb9a766..d007b8dcd1c6 100644
--- a/mm/kstackwatch/Makefile
+++ b/mm/kstackwatch/Makefile
@@ -1,2 +1,8 @@
obj-$(CONFIG_KSTACK_WATCH) += kstackwatch.o
kstackwatch-y := kernel.o stack.o watch.o
+
+obj-$(CONFIG_KSTACK_WATCH_TEST) += kstackwatch_test.o
+kstackwatch_test-y := test.o
+CFLAGS_test.o := -fno-inline \
+ -fno-optimize-sibling-calls \
+ -fno-pic -fno-pie -O0 -Og
diff --git a/mm/kstackwatch/test.c b/mm/kstackwatch/test.c
new file mode 100644
index 000000000000..1ed98931cc51
--- /dev/null
+++ b/mm/kstackwatch/test.c
@@ -0,0 +1,122 @@
+// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/delay.h>
+#include <linux/kthread.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/prandom.h>
+#include <linux/printk.h>
+#include <linux/proc_fs.h>
+#include <linux/random.h>
+#include <linux/spinlock.h>
+#include <linux/string.h>
+#include <linux/uaccess.h>
+
+#include "kstackwatch.h"
+
+static struct proc_dir_entry *test_proc;
+
+#define BUFFER_SIZE 16
+#define MAX_DEPTH 6
+
+struct work_node {
+ ulong *ptr;
+ struct completion done;
+ struct list_head list;
+};
+
+static DECLARE_COMPLETION(work_res);
+static DEFINE_MUTEX(work_mutex);
+static LIST_HEAD(work_list);
+
+static void test_watch_fire(void)
+{
+ u64 buffer[BUFFER_SIZE] = { 0 };
+
+ pr_info("entry of %s\n", __func__);
+ ksw_watch_show();
+ ksw_watch_fire();
+ pr_info("buf[0]:%lld\n", buffer[0]);
+
+ barrier_data(buffer);
+ pr_info("exit of %s\n", __func__);
+}
+
+
+static ssize_t test_proc_write(struct file *file, const char __user *buffer,
+ size_t count, loff_t *pos)
+{
+ char cmd[256];
+ int test_num;
+
+ if (count >= sizeof(cmd))
+ return -EINVAL;
+
+ if (copy_from_user(cmd, buffer, count))
+ return -EFAULT;
+
+ cmd[count] = '\0';
+ strim(cmd);
+
+ pr_info("received command: %s\n", cmd);
+
+ if (sscanf(cmd, "test%d", &test_num) == 1) {
+ switch (test_num) {
+ case 0:
+ test_watch_fire();
+ break;
+ default:
+ pr_err("Unknown test number %d\n", test_num);
+ return -EINVAL;
+ }
+ } else {
+ pr_err("invalid command format. Use 'testN'.\n");
+ return -EINVAL;
+ }
+
+ return count;
+}
+
+static ssize_t test_proc_read(struct file *file, char __user *buffer,
+ size_t count, loff_t *pos)
+{
+ static const char usage[] = "KStackWatch Simplified Test Module\n"
+ "============ usage ==============\n"
+ "Usage:\n"
+ "echo test{i} > /proc/kstackwatch_test\n"
+ " test0 - test watch fire\n";
+
+ return simple_read_from_buffer(buffer, count, pos, usage,
+ strlen(usage));
+}
+
+static const struct proc_ops test_proc_ops = {
+ .proc_read = test_proc_read,
+ .proc_write = test_proc_write,
+};
+
+static int __init kstackwatch_test_init(void)
+{
+ test_proc = proc_create("kstackwatch_test", 0600, NULL, &test_proc_ops);
+ if (!test_proc) {
+ pr_err("Failed to create proc entry\n");
+ return -ENOMEM;
+ }
+ pr_info("module loaded\n");
+ return 0;
+}
+
+static void __exit kstackwatch_test_exit(void)
+{
+ if (test_proc)
+ remove_proc_entry("kstackwatch_test", NULL);
+ pr_info("module unloaded\n");
+}
+
+module_init(kstackwatch_test_init);
+module_exit(kstackwatch_test_exit);
+
+MODULE_AUTHOR("Jinchao Wang");
+MODULE_DESCRIPTION("Simple KStackWatch Test Module");
+MODULE_LICENSE("GPL");
--
2.43.0

Jinchao Wang
Sep 24, 2025, 8:00:30 AM
Extend the test module with a new test case (test1) that intentionally
overflows a local u64 buffer to corrupt the stack canary.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/test.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/test.c b/mm/kstackwatch/test.c
index 1ed98931cc51..740e3c11b3ef 100644
--- a/mm/kstackwatch/test.c
+++ b/mm/kstackwatch/test.c
@@ -43,6 +43,20 @@ static void test_watch_fire(void)
pr_info("exit of %s\n", __func__);
}

+static void test_canary_overflow(void)
+{
+ u64 buffer[BUFFER_SIZE];
+
+ pr_info("entry of %s\n", __func__);
+
+ /* intentionally overflow */
+ for (int i = BUFFER_SIZE; i < BUFFER_SIZE + 10; i++)
+ buffer[i] = 0xdeadbeefdeadbeef;
+ barrier_data(buffer);
+
+ pr_info("exit of %s\n", __func__);
+}
+

static ssize_t test_proc_write(struct file *file, const char __user *buffer,
size_t count, loff_t *pos)
@@ -66,6 +80,9 @@ static ssize_t test_proc_write(struct file *file, const char __user *buffer,
case 0:
test_watch_fire();
break;
+ case 1:
+ test_canary_overflow();
+ break;
default:
pr_err("Unknown test number %d\n", test_num);
return -EINVAL;
@@ -85,7 +102,8 @@ static ssize_t test_proc_read(struct file *file, char __user *buffer,
"============ usage ==============\n"
"Usage:\n"
"echo test{i} > /proc/kstackwatch_test\n"
- " test0 - test watch fire\n";
+ " test0 - test watch fire\n"
+ " test1 - test canary overflow\n";

return simple_read_from_buffer(buffer, count, pos, usage,
strlen(usage));
--
2.43.0

Jinchao Wang
Sep 24, 2025, 8:00:35 AM
Introduce a test that performs stack writes in recursive calls to exercise
stack watch at a specific recursion depth.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/test.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/test.c b/mm/kstackwatch/test.c
index 740e3c11b3ef..08e3d37c4c04 100644
--- a/mm/kstackwatch/test.c
+++ b/mm/kstackwatch/test.c
@@ -57,6 +57,20 @@ static void test_canary_overflow(void)
pr_info("exit of %s\n", __func__);
}

+static void test_recursive_depth(int depth)
+{
+ u64 buffer[BUFFER_SIZE];
+
+ pr_info("entry of %s depth:%d\n", __func__, depth);
+
+ if (depth < MAX_DEPTH)
+ test_recursive_depth(depth + 1);
+
+ buffer[0] = depth;
+ barrier_data(buffer);
+
+ pr_info("exit of %s depth:%d\n", __func__, depth);
+}

static ssize_t test_proc_write(struct file *file, const char __user *buffer,
size_t count, loff_t *pos)
@@ -83,6 +97,9 @@ static ssize_t test_proc_write(struct file *file, const char __user *buffer,
case 1:
test_canary_overflow();
break;
+ case 2:
+ test_recursive_depth(0);
+ break;
default:
pr_err("Unknown test number %d\n", test_num);
return -EINVAL;
@@ -103,7 +120,8 @@ static ssize_t test_proc_read(struct file *file, char __user *buffer,
"Usage:\n"
"echo test{i} > /proc/kstackwatch_test\n"
" test0 - test watch fire\n"
- " test1 - test canary overflow\n";
+ " test1 - test canary overflow\n"
+ " test2 - test recursive func\n";

Jinchao Wang
Sep 24, 2025, 8:00:42 AM
These tests share a common structure and are grouped together.

- buggy():
exposes the stack address to corrupting(); may omit waiting
- corrupting():
reads the exposed pointer and modifies memory;
if buggy() omits waiting, victim()'s buffer is corrupted
- victim():
initializes a local buffer and later verifies it;
reports an error if the buffer was unexpectedly modified

buggy() and victim() run in the worker() thread with similar stack frame
sizes to simplify testing. By adjusting fence_size in corrupting(), the
test can trigger either silent corruption or an overflow across threads.

- Test 3: one worker, 20 loops, silent corruption
- Test 4: 20 workers, one loop each, silent corruption
- Test 5: one worker, one loop, overflow corruption

Test 4 also exercises multiple watchpoint instances.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/test.c | 178 +++++++++++++++++++++++++++++++++++++++++-
1 file changed, 176 insertions(+), 2 deletions(-)

diff --git a/mm/kstackwatch/test.c b/mm/kstackwatch/test.c
index 08e3d37c4c04..859122bbbdeb 100644
--- a/mm/kstackwatch/test.c
+++ b/mm/kstackwatch/test.c
@@ -17,11 +17,12 @@

static struct proc_dir_entry *test_proc;

-#define BUFFER_SIZE 16
+#define BUFFER_SIZE 32
#define MAX_DEPTH 6

struct work_node {
ulong *ptr;
+ u64 start_ns;
struct completion done;
struct list_head list;
};
@@ -30,6 +31,9 @@ static DECLARE_COMPLETION(work_res);
static DEFINE_MUTEX(work_mutex);
static LIST_HEAD(work_list);

+static int global_fence_size;
+static int global_loop_count;
+
static void test_watch_fire(void)
{
u64 buffer[BUFFER_SIZE] = { 0 };
@@ -72,6 +76,164 @@ static void test_recursive_depth(int depth)
pr_info("exit of %s depth:%d\n", __func__, depth);
}

+static struct work_node *test_mthread_buggy(int thread_id, int seq_id)
+{
+ ulong buf[BUFFER_SIZE];
+ struct work_node *node;
+ bool trigger;
+
+ node = kmalloc(sizeof(*node), GFP_KERNEL);
+ if (!node)
+ return NULL;
+
+ init_completion(&node->done);
+ node->ptr = buf;
+ node->start_ns = ktime_get_ns();
+ mutex_lock(&work_mutex);
+ list_add(&node->list, &work_list);
+ mutex_unlock(&work_mutex);
+ complete(&work_res);
+
+ trigger = (get_random_u32() % 100) < 10;
+ if (trigger)
+ return node; /* let the caller handle cleanup */
+
+ wait_for_completion(&node->done);
+ kfree(node);
+ return NULL;
+}
+
+#define CORRUPTING_MINIOR_WAIT_NS (100000)
+#define VICTIM_MINIOR_WAIT_NS (300000)
+
+static inline void silent_wait_us(u64 start_ns, u64 min_wait_us)
+{
+ u64 diff_ns, remain_us;
+
+ diff_ns = ktime_get_ns() - start_ns;
+ if (diff_ns < min_wait_us * 1000ULL) {
+ remain_us = min_wait_us - diff_ns / 1000ULL;
+ usleep_range(remain_us, remain_us + 200);
+ }
+}
+
+static void test_mthread_victim(int thread_id, int seq_id, u64 start_ns)
+{
+ ulong buf[BUFFER_SIZE];
+
+ for (int j = 0; j < BUFFER_SIZE; j++)
+ buf[j] = 0xdeadbeef + seq_id;
+ if (start_ns)
+ silent_wait_us(start_ns, VICTIM_MINIOR_WAIT_NS);
+
+ for (int j = 0; j < BUFFER_SIZE; j++) {
+ if (buf[j] != (0xdeadbeef + seq_id)) {
+ pr_warn("victim[%d][%d]: unhappy buf[%d]=0x%lx\n",
+ thread_id, seq_id, j, buf[j]);
+ return;
+ }
+ }
+
+ pr_info("victim[%d][%d]: happy\n", thread_id, seq_id);
+}
+
+static int test_mthread_corrupting(void *data)
+{
+ struct work_node *node;
+ int fence_size;
+
+ while (!kthread_should_stop()) {
+ if (!wait_for_completion_timeout(&work_res, HZ))
+ continue;
+ while (true) {
+ mutex_lock(&work_mutex);
+ node = list_first_entry_or_null(&work_list,
+ struct work_node, list);
+ if (node)
+ list_del(&node->list);
+ mutex_unlock(&work_mutex);
+
+ if (!node)
+ break; /* no more nodes, exit inner loop */
+ silent_wait_us(node->start_ns,
+ CORRUPTING_MINIOR_WAIT_NS);
+
+ fence_size = READ_ONCE(global_fence_size);
+ for (int i = fence_size; i < BUFFER_SIZE - fence_size;
+ i++)
+ node->ptr[i] = 0xabcdabcd;
+
+ complete(&node->done);
+ }
+ }
+
+ return 0;
+}
+
+static int test_mthread_worker(void *data)
+{
+ int thread_id = (long)data;
+ int loop_count;
+ struct work_node *node;
+
+ loop_count = READ_ONCE(global_loop_count);
+
+ for (int i = 0; i < loop_count; i++) {
+ node = test_mthread_buggy(thread_id, i);
+
+ if (node)
+ test_mthread_victim(thread_id, i, node->start_ns);
+ else
+ test_mthread_victim(thread_id, i, 0);
+ if (node) {
+ wait_for_completion(&node->done);
+ kfree(node);
+ }
+ }
+ return 0;
+}
+
+static void test_mthread_case(int num_workers, int loop_count, int fence_size)
+{
+ static struct task_struct *corrupting;
+ static struct task_struct **workers;
+
+ WRITE_ONCE(global_loop_count, loop_count);
+ WRITE_ONCE(global_fence_size, fence_size);
+
+ init_completion(&work_res);
+ workers = kmalloc_array(num_workers, sizeof(void *), GFP_KERNEL);
+ memset(workers, 0, sizeof(struct task_struct *) * num_workers);
+
+ corrupting = kthread_run(test_mthread_corrupting, NULL, "corrupting");
+ if (IS_ERR(corrupting)) {
+ pr_err("failed to create corrupting thread\n");
+ return;
+ }
+
+ for (ulong i = 0; i < num_workers; i++) {
+ workers[i] = kthread_run(test_mthread_worker, (void *)i,
+ "worker_%ld", i);
+ if (IS_ERR(workers[i])) {
+ pr_err("failed to create worker thread %ld\n", i);
+ workers[i] = NULL;
+ }
+ }
+
+ for (ulong i = 0; i < num_workers; i++) {
+ if (workers[i] && workers[i]->__state != TASK_DEAD) {
+ usleep_range(1000, 2000);
+ i--;
+ }
+ }
+ kfree(workers);
+
+ if (corrupting && !IS_ERR(corrupting)) {
+ kthread_stop(corrupting);
+ corrupting = NULL;
+ }
+}
+
static ssize_t test_proc_write(struct file *file, const char __user *buffer,
size_t count, loff_t *pos)
{
@@ -100,6 +262,15 @@ static ssize_t test_proc_write(struct file *file, const char __user *buffer,
case 2:
test_recursive_depth(0);
break;
+ case 3:
+ test_mthread_case(1, 20, BUFFER_SIZE / 4);
+ break;
+ case 4:
+ test_mthread_case(20, 1, BUFFER_SIZE / 4);
+ break;
+ case 5:
+ test_mthread_case(1, 1, -3);
+ break;
default:
pr_err("Unknown test number %d\n", test_num);
return -EINVAL;
@@ -121,7 +292,10 @@ static ssize_t test_proc_read(struct file *file, char __user *buffer,
"echo test{i} > /proc/kstackwatch_test\n"
" test0 - test watch fire\n"
" test1 - test canary overflow\n"
- " test2 - test recursive func\n";
+ " test2 - test recursive func\n"
+ " test3 - test silent corruption\n"
+ " test4 - test multiple silent corruption\n"
+ " test5 - test prologue corruption\n";

Jinchao Wang
Sep 24, 2025, 8:00:50 AM
Provide a shell script to trigger test cases.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
tools/kstackwatch/kstackwatch_test.sh | 52 +++++++++++++++++++++++++++
1 file changed, 52 insertions(+)
create mode 100755 tools/kstackwatch/kstackwatch_test.sh

diff --git a/tools/kstackwatch/kstackwatch_test.sh b/tools/kstackwatch/kstackwatch_test.sh
new file mode 100755
index 000000000000..aede35dcb8b6
--- /dev/null
+++ b/tools/kstackwatch/kstackwatch_test.sh
@@ -0,0 +1,52 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+echo "IMPORTANT: Before running, make sure you have updated the config values!"
+
+usage() {
+ echo "Usage: $0 [0-5]"
+ echo " 0 - test watch fire"
+ echo " 1 - test canary overflow"
+ echo " 2 - test recursive depth"
+ echo " 3 - test silent corruption"
+ echo " 4 - test multi-threaded silent corruption"
+ echo " 5 - test multi-threaded overflow"
+}
+
+run_test() {
+ local test_num=$1
+ case "$test_num" in
+ 0) echo fn=test_watch_fire fo=0x29 wl=8 >/proc/kstackwatch
+ echo test0 > /proc/kstackwatch_test
+ ;;
+ 1) echo fn=test_canary_overflow fo=0x14 >/proc/kstackwatch
+ echo test1 >/proc/kstackwatch_test
+ ;;
+ 2) echo fn=test_recursive_depth fo=0x2f dp=3 wl=8 so=0 >/proc/kstackwatch
+ echo test2 >/proc/kstackwatch_test
+ ;;
+ 3) echo fn=test_mthread_victim fo=0x4c so=64 wl=8 >/proc/kstackwatch
+ echo test3 >/proc/kstackwatch_test
+ ;;
+ 4) echo fn=test_mthread_victim fo=0x4c so=64 wl=8 >/proc/kstackwatch
+ echo test4 >/proc/kstackwatch_test
+ ;;
+ 5) echo fn=test_mthread_buggy fo=0x16 so=0x100 wl=8 >/proc/kstackwatch
+ echo test5 >/proc/kstackwatch_test
+ ;;
+ *) usage
+ exit 1 ;;
+ esac
+ # Reset watch after test
+ echo >/proc/kstackwatch
+}
+
+# Check root and module
+[ "$EUID" -ne 0 ] && echo "Run as root" && exit 1
+for f in /proc/kstackwatch /proc/kstackwatch_test; do
+ [ ! -f "$f" ] && echo "$f not found" && exit 1
+done
+
+# Run
+[ -z "$1" ] && { usage; exit 0; }
+run_test "$1"
--
2.43.0

Jinchao Wang
Sep 24, 2025, 8:00:58 AM
Add documentation for KStackWatch under Documentation/.

It provides an overview, main features, usage details, configuration
parameters, and example scenarios with test cases. The document also
explains how to locate function offsets and interpret logs.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
Documentation/dev-tools/index.rst | 1 +
Documentation/dev-tools/kstackwatch.rst | 316 ++++++++++++++++++++++++
2 files changed, 317 insertions(+)
create mode 100644 Documentation/dev-tools/kstackwatch.rst

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 65c54b27a60b..45eb828d9d65 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -31,6 +31,7 @@ Documentation/process/debugging/index.rst
kcsan
kfence
kselftest
+ kstackwatch
kunit/index
ktap
checkuapi
diff --git a/Documentation/dev-tools/kstackwatch.rst b/Documentation/dev-tools/kstackwatch.rst
new file mode 100644
index 000000000000..7a9e018ddccb
--- /dev/null
+++ b/Documentation/dev-tools/kstackwatch.rst
@@ -0,0 +1,316 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================================
+KStackWatch: Kernel Stack Watch
+=================================
+
+Overview
+========
+
+KStackWatch is a lightweight debugging tool designed to detect kernel stack
+corruption in real time. It installs a hardware breakpoint (watchpoint)
+at a function's specified offset using *kprobe.post_handler* and
+removes it in *fprobe.exit_handler*. This covers the full execution
+window and reports corruption immediately with time, location, and
+call stack.
+
+Main features:
+
+* Immediate and precise detection
+* Supports concurrent calls to the watched function
+* Lockless design, usable in any context
+* Depth filter for recursive calls
+* Minimal impact on reproducibility
+* Flexible procfs configuration with key=value syntax
+
+Usage
+=====
+
+KStackWatch is configured through */proc/kstackwatch* using a key=value
+format. Both long and short forms are supported. Writing an empty string
+disables the watch.
+
+.. code-block:: bash
+
+ # long form
+ echo func_name=? func_offset=? ... > /proc/kstackwatch
+
+ # short form
+ echo fn=? fo=? ... > /proc/kstackwatch
+
+ # disable
+ echo > /proc/kstackwatch
+
+The function name and the instruction offset where the watchpoint should
+be placed must be known. This information can be obtained from
+*objdump* or other tools.
+
+Required parameters
+--------------------
+
++--------------+--------+-----------------------------------------+
+| Parameter    | Short  | Description                             |
++==============+========+=========================================+
+| func_name    | fn     | Name of the target function             |
++--------------+--------+-----------------------------------------+
+| func_offset  | fo     | Instruction pointer offset              |
++--------------+--------+-----------------------------------------+
+
+Optional parameters
+--------------------
+
+These parameters default to 0 and may be omitted.
+Both decimal and hexadecimal values are accepted.
+
++--------------+--------+------------------------------------------------+
+| Parameter    | Short  | Description                                    |
++==============+========+================================================+
+| depth        | dp     | Recursion depth filter                         |
++--------------+--------+------------------------------------------------+
+| max_watch    | mw     | Maximum number of concurrent watchpoints       |
+|              |        | (default 0, capped by available hardware       |
+|              |        | breakpoints)                                   |
++--------------+--------+------------------------------------------------+
+| sp_offset    | so     | Watched address offset from the stack pointer  |
++--------------+--------+------------------------------------------------+
+| watch_len    | wl     | Watch length in bytes (1, 2, 4, 8, or 0),      |
+|              |        | 0 means automatically watch the stack canary   |
+|              |        | and ignore the ``sp_offset`` parameter         |
++--------------+--------+------------------------------------------------+
+
+Workflow Example
+================
+
+Silent corruption
+-----------------
+
+Consider *test3* in *kstackwatch_test.sh*. Run it directly:
+
+.. code-block:: bash
+
+ echo test3 >/proc/kstackwatch_test
+
+Occasionally, *test_mthread_victim()* reports that it is unhappy:
+
+.. code-block:: bash
+
+ [ 7.807082] kstackwatch_test: victim[0][11]: unhappy buf[8]=0xabcdabcd
+
+Its source code is:
+
+.. code-block:: c
+
+   static void test_mthread_victim(int thread_id, int seq_id, u64 start_ns)
+   {
+           ulong buf[BUFFER_SIZE];
+
+           for (int j = 0; j < BUFFER_SIZE; j++)
+                   buf[j] = 0xdeadbeef + seq_id;
+
+           if (start_ns)
+                   silent_wait_us(start_ns, VICTIM_MINIOR_WAIT_NS);
+
+           for (int j = 0; j < BUFFER_SIZE; j++) {
+                   if (buf[j] != (0xdeadbeef + seq_id)) {
+                           pr_warn("victim[%d][%d]: unhappy buf[%d]=0x%lx\n",
+                                   thread_id, seq_id, j, buf[j]);
+                           return;
+                   }
+           }
+
+           pr_info("victim[%d][%d]: happy\n", thread_id, seq_id);
+   }
+
+From the source code, the report indicates buf[8] was unexpectedly modified,
+a case of silent corruption.
+
+Configuration
+-------------
+
+Since buf[8] is the corrupted variable, the following configuration shows
+how to use KStackWatch to detect its corruption.
+
+func_name
+~~~~~~~~~~~
+
+As seen above, buf[8] is initialized and checked in *test_mthread_victim()*,
+which determines *func_name*.
+
+func_offset & sp_offset
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The watchpoint should be armed after the assignment to buf[8], as close
+to it as possible; this determines *func_offset*.
+
+It should watch buf[8] itself, which determines *sp_offset*.
+
+Use the objdump output to disassemble the function:
+
+.. code-block:: bash
+
+ objdump -S --disassemble=test_mthread_victim vmlinux
+
+A shortened output is:
+
+.. code-block:: text
+
+ static void test_mthread_victim(int thread_id, int seq_id, u64 start_ns)
+ {
+ ffffffff815cb4e0: e8 5b 9b ca ff call ffffffff81275040 <__fentry__>
+ ffffffff815cb4e5: 55 push %rbp
+ ffffffff815cb4e6: 53 push %rbx
+ ffffffff815cb4e7: 48 81 ec 08 01 00 00 sub $0x108,%rsp
+ ffffffff815cb4ee: 89 fd mov %edi,%ebp
+ ffffffff815cb4f0: 89 f3 mov %esi,%ebx
+ ffffffff815cb4f2: 49 89 d0 mov %rdx,%r8
+ ffffffff815cb4f5: 65 48 8b 05 0b cb 80 mov %gs:0x280cb0b(%rip),%rax # ffffffff83dd8008 <__stack_chk_guard>
+ ffffffff815cb4fc: 02
+ ffffffff815cb4fd: 48 89 84 24 00 01 00 mov %rax,0x100(%rsp)
+ ffffffff815cb504: 00
+ ffffffff815cb505: 31 c0 xor %eax,%eax
+ ulong buf[BUFFER_SIZE];
+ ffffffff815cb507: 48 89 e2 mov %rsp,%rdx
+ ffffffff815cb50a: b9 20 00 00 00 mov $0x20,%ecx
+ ffffffff815cb50f: 48 89 d7 mov %rdx,%rdi
+ ffffffff815cb512: f3 48 ab rep stos %rax,%es:(%rdi)
+
+ for (int j = 0; j < BUFFER_SIZE; j++)
+ ffffffff815cb515: eb 10 jmp ffffffff815cb527 <test_mthread_victim+0x47>
+ buf[j] = 0xdeadbeef + seq_id;
+ ffffffff815cb517: 8d 93 ef be ad de lea -0x21524111(%rbx),%edx
+ ffffffff815cb51d: 48 63 c8 movslq %eax,%rcx
+ ffffffff815cb520: 48 89 14 cc mov %rdx,(%rsp,%rcx,8)
+ ffffffff815cb524: 83 c0 01 add $0x1,%eax
+ ffffffff815cb527: 83 f8 1f cmp $0x1f,%eax
+ ffffffff815cb52a: 7e eb jle ffffffff815cb517 <test_mthread_victim+0x37>
+ if (start_ns)
+ ffffffff815cb52c: 4d 85 c0 test %r8,%r8
+ ffffffff815cb52f: 75 21 jne ffffffff815cb552 <test_mthread_victim+0x72>
+ silent_wait_us(start_ns, VICTIM_MINIOR_WAIT_NS);
+ ...
+ ffffffff815cb571: 48 8b 84 24 00 01 00 mov 0x100(%rsp),%rax
+ ffffffff815cb579: 65 48 2b 05 87 ca 80 sub %gs:0x280ca87(%rip),%rax # ffffffff83dd8008 <__stack_chk_guard>
+ ...
+ ffffffff815cb5a1: eb ce jmp ffffffff815cb571 <test_mthread_victim+0x91>
+ }
+ ffffffff815cb5a3: e8 d8 86 f1 00 call ffffffff824e3c80 <__stack_chk_fail>
+
+
+func_offset
+^^^^^^^^^^^
+
+The function begins at ffffffff815cb4e0. The *buf* array is initialized in a loop.
+The instruction storing values into the array is at ffffffff815cb520, and the
+first instruction after the loop is at ffffffff815cb52c.
+
+Because KStackWatch uses *kprobe.post_handler*, the watchpoint could be
+armed right after ffffffff815cb520. However, that instruction executes on
+every loop iteration, so the watchpoint would be armed before buf[8] is
+fully assigned, and the loop's own stores would cause false positives.
+
+An alternative is to place the watchpoint at ffffffff815cb52c, right
+after the loop. This avoids false positives but leaves a small window
+for false negatives.
+
+In this document, ffffffff815cb52c is chosen for cleaner logs. If false
+negatives are suspected, repeat the test to catch the corruption.
+
+The required offset is calculated from the function start:
+
+*func_offset* is 0x4c (ffffffff815cb52c - ffffffff815cb4e0).
+
+sp_offset
+^^^^^^^^^^^
+
+From the disassembly, the buf array is at the top of the stack,
+meaning buf == rsp. Therefore, buf[8] sits at rsp + 8 * sizeof(ulong) =
+rsp + 64. Thus, *sp_offset* is 64.
+
+Other parameters
+~~~~~~~~~~~~~~~~~~
+
+* *depth* is 0, as test_mthread_victim() is not recursive
+* *max_watch* is 0 to use all available hardware breakpoints
+* *watch_len* is 8, the size of a ulong on x86_64
+
+Parameters with a value of 0 can be omitted as defaults.
+
+Configure the watch:
+
+.. code-block:: bash
+
+ echo "fn=test_mthread_victim fo=0x4c so=64 wl=8" > /proc/kstackwatch
+
+Now rerun the test:
+
+.. code-block:: bash
+
+ echo test3 >/proc/kstackwatch_test
+
+The dmesg log shows:
+
+.. code-block:: text
+
+ [ 7.607074] kstackwatch: ========== KStackWatch: Caught stack corruption =======
+ [ 7.607077] kstackwatch: config fn=test_mthread_victim fo=0x4c so=64 wl=8
+ [ 7.607080] CPU: 2 UID: 0 PID: 347 Comm: corrupting Not tainted 6.17.0-rc7-00022-g90270f3db80a-dirty #509 PREEMPT(voluntary)
+ [ 7.607083] Call Trace:
+ [ 7.607084] <#DB>
+ [ 7.607085] dump_stack_lvl+0x66/0xa0
+ [ 7.607091] ksw_watch_handler.part.0+0x2b/0x60
+ [ 7.607094] ksw_watch_handler+0xba/0xd0
+ [ 7.607095] ? test_mthread_corrupting+0x48/0xd0
+ [ 7.607097] ? kthread+0x10d/0x210
+ [ 7.607099] ? ret_from_fork+0x187/0x1e0
+ [ 7.607102] ? ret_from_fork_asm+0x1a/0x30
+ [ 7.607105] __perf_event_overflow+0x154/0x570
+ [ 7.607108] perf_bp_event+0xb4/0xc0
+ [ 7.607112] ? look_up_lock_class+0x59/0x150
+ [ 7.607115] hw_breakpoint_exceptions_notify+0xf7/0x110
+ [ 7.607117] notifier_call_chain+0x44/0x110
+ [ 7.607119] atomic_notifier_call_chain+0x5f/0x110
+ [ 7.607121] notify_die+0x4c/0xb0
+ [ 7.607123] exc_debug_kernel+0xaf/0x170
+ [ 7.607126] asm_exc_debug+0x1e/0x40
+ [ 7.607127] RIP: 0010:test_mthread_corrupting+0x48/0xd0
+ [ 7.607129] Code: c7 80 0a 24 83 e8 48 f1 f1 00 48 85 c0 74 dd eb 30 bb 00 00 00 00 eb 59 48 63 c2 48 c1 e0 03 48 03 03 be cd ab cd ab 48 89 30 <83> c2 01 b8 20 00 00 00 29 c8 39 d0 7f e0 48 8d 7b 10 e8 d1 86 d4
+ [ 7.607130] RSP: 0018:ffffc90000acfee0 EFLAGS: 00000286
+ [ 7.607132] RAX: ffffc90000a13de8 RBX: ffff888102d57580 RCX: 0000000000000008
+ [ 7.607132] RDX: 0000000000000008 RSI: 00000000abcdabcd RDI: ffffc90000acfe00
+ [ 7.607133] RBP: ffff8881085bc800 R08: 0000000000000001 R09: 0000000000000000
+ [ 7.607133] R10: 0000000000000001 R11: 0000000000000000 R12: ffff888105398000
+ [ 7.607134] R13: ffff8881085bc800 R14: ffffffff815cb660 R15: 0000000000000000
+ [ 7.607134] ? __pfx_test_mthread_corrupting+0x10/0x10
+ [ 7.607137] </#DB>
+ [ 7.607138] <TASK>
+ [ 7.607138] kthread+0x10d/0x210
+ [ 7.607140] ? __pfx_kthread+0x10/0x10
+ [ 7.607141] ret_from_fork+0x187/0x1e0
+ [ 7.607143] ? __pfx_kthread+0x10/0x10
+ [ 7.607144] ret_from_fork_asm+0x1a/0x30
+ [ 7.607147] </TASK>
+ [ 7.607147] kstackwatch: =================== KStackWatch End ===================
+ [ 7.807082] kstackwatch_test: victim[0][11]: unhappy buf[8]=0xabcdabcd
+
+The line ``RIP: 0010:test_mthread_corrupting+0x48/0xd0`` shows the exact
+location of the corrupting write. With ``test_mthread_corrupting()``
+identified, it is straightforward to trace back to the buggy store and
+fix the bug.
+
+
+More usage examples and corruption scenarios are provided in
+``kstackwatch_test.sh`` and ``mm/kstackwatch/test.c``.
+
+Limitations
+===========
+
+* Limited by available hardware breakpoints
+* Only one function can be watched at a time
+* Canary search is limited to 128 * sizeof(ulong) bytes from the current
+ stack pointer. This is sufficient for most cases, but has three limitations:
+
+ - If the stack frame is larger, the search may fail.
+ - If the function does not have a canary, the search may fail.
+ - If stack memory occasionally contains the same value as the canary,
+ it may be incorrectly matched.
+
+ In these cases, the user can provide the canary location using
+ ``sp_offset``, or treat any memory in the function prologue
+ as the canary.
--
2.43.0

Jinchao Wang

Sep 24, 2025, 8:01:03 AM (yesterday) Sep 24
Add a maintainer entry for Kernel Stack Watch.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
MAINTAINERS | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 520fb4e379a3..3d4811ff3631 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13362,6 +13362,14 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
F: Documentation/dev-tools/kselftest*
F: tools/testing/selftests/

+KERNEL STACK WATCH
+M: Jinchao Wang <wangjin...@gmail.com>
+S: Maintained
+F: Documentation/dev-tools/kstackwatch.rst
+F: include/linux/kstackwatch_types.h
+F: mm/kstackwatch/
+F: tools/kstackwatch/
+
KERNEL SMB3 SERVER (KSMBD)
M: Namjae Jeon <linki...@kernel.org>
M: Namjae Jeon <linki...@samba.org>
--
2.43.0

Jinchao Wang

Sep 24, 2025, 8:25:06 AM (yesterday) Sep 24
Pre-allocate per-CPU hardware breakpoints at init time with a placeholder
address, which is retargeted dynamically in the kprobe handler.
This avoids allocation in atomic context.

At most max_watch breakpoints are allocated (0 means no limit).

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kstackwatch.h | 13 +++++
mm/kstackwatch/watch.c | 97 ++++++++++++++++++++++++++++++++++++
2 files changed, 110 insertions(+)

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 983125d5cf18..4eac1be3b325 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -2,6 +2,9 @@
#ifndef _KSTACKWATCH_H
#define _KSTACKWATCH_H

+#include <linux/llist.h>
+#include <linux/percpu.h>
+#include <linux/perf_event.h>
#include <linux/types.h>

#define MAX_CONFIG_STR_LEN 128
@@ -32,4 +35,14 @@ struct ksw_config {
// singleton, only modified in kernel.c
const struct ksw_config *ksw_get_config(void);

+/* watch management */
+struct ksw_watchpoint {
+ struct perf_event *__percpu *event;
+ struct perf_event_attr attr;
+ struct llist_node node; // for atomic watch_on and off
+ struct list_head list; // for cpu online and offline
+};
+int ksw_watch_init(void);
+void ksw_watch_exit(void);
+
#endif /* _KSTACKWATCH_H */
diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
index cec594032515..1d8e24fede54 100644
--- a/mm/kstackwatch/watch.c
+++ b/mm/kstackwatch/watch.c
@@ -1 +1,98 @@
// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/cpuhotplug.h>
+#include <linux/hw_breakpoint.h>
+#include <linux/irqflags.h>
+#include <linux/mutex.h>
+#include <linux/printk.h>
+
+#include "kstackwatch.h"
+
+static LLIST_HEAD(free_wp_list);
+static LIST_HEAD(all_wp_list);
+static DEFINE_MUTEX(all_wp_mutex);
+
+static ulong holder;
+bool panic_on_catch;
+module_param(panic_on_catch, bool, 0644);
+MODULE_PARM_DESC(panic_on_catch, "panic immediately on corruption catch");
+
+static void ksw_watch_handler(struct perf_event *bp,
+			      struct perf_sample_data *data,
+			      struct pt_regs *regs)
+{
+	pr_err("========== KStackWatch: Caught stack corruption =======\n");
+	pr_err("config %s\n", ksw_get_config()->user_input);
+	dump_stack();
+	pr_err("=================== KStackWatch End ===================\n");
+
+	if (panic_on_catch)
+		panic("Stack corruption detected");
+}
+
+static int ksw_watch_alloc(void)
+{
+	int max_watch = ksw_get_config()->max_watch;
+	struct ksw_watchpoint *wp;
+	int success = 0;
+	int ret;
+
+	init_llist_head(&free_wp_list);
+
+	/* max_watch == 0 means no limit */
+	while (!max_watch || success < max_watch) {
+		wp = kzalloc(sizeof(*wp), GFP_KERNEL);
+		if (!wp)
+			return success > 0 ? success : -ENOMEM;
+
+		hw_breakpoint_init(&wp->attr);
+		wp->attr.bp_addr = (ulong)&holder;
+		wp->attr.bp_len = sizeof(ulong);
+		wp->attr.bp_type = HW_BREAKPOINT_W;
+		wp->event = register_wide_hw_breakpoint(&wp->attr,
+							ksw_watch_handler, wp);
+		if (IS_ERR((void *)wp->event)) {
+			ret = PTR_ERR((void *)wp->event);
+			kfree(wp);
+			return success > 0 ? success : ret;
+		}
+		llist_add(&wp->node, &free_wp_list);
+		mutex_lock(&all_wp_mutex);
+		list_add(&wp->list, &all_wp_list);
+		mutex_unlock(&all_wp_mutex);
+		success++;
+	}
+
+	return success;
+}
+
+static void ksw_watch_free(void)
+{
+	struct ksw_watchpoint *wp, *tmp;
+
+	mutex_lock(&all_wp_mutex);
+	list_for_each_entry_safe(wp, tmp, &all_wp_list, list) {
+		list_del(&wp->list);
+		unregister_wide_hw_breakpoint(wp->event);
+		kfree(wp);
+	}
+	mutex_unlock(&all_wp_mutex);
+}
+
+int ksw_watch_init(void)
+{
+	int ret;
+
+	ret = ksw_watch_alloc();
+	if (ret <= 0)
+		return -EBUSY;
+
+
+	return 0;
+}
+
+void ksw_watch_exit(void)
+{
+	ksw_watch_free();
+}
--
2.43.0

Randy Dunlap

Sep 24, 2025, 1:06:34 PM (22 hours ago) Sep 24
corruption.

> +
> + If unsure, say N.


--
~Randy

Marco Elver

Sep 24, 2025, 4:45:30 PM (19 hours ago) Sep 24
On Wed, 24 Sept 2025 at 14:00, Jinchao Wang <wangjin...@gmail.com> wrote:
>
> Introduce a separate test module to validate functionality in controlled
> scenarios.
>
> The module provides a proc interface (/proc/kstackwatch_test) that allows
> triggering specific test cases via simple commands:
>
> echo test0 > /proc/kstackwatch_test

This should not be in /proc/ - if anything, it should go into debugfs.

Marco Elver

Sep 24, 2025, 4:50:15 PM (19 hours ago) Sep 24
On Wed, 24 Sept 2025 at 13:51, Jinchao Wang <wangjin...@gmail.com> wrote:
>
> Provide the /proc/kstackwatch file to read or update the configuration.
> Only a single process can open this file at a time, enforced using atomic
> config_file_busy, to prevent concurrent access.

Why is this in /proc and not debugfs?

Jinchao Wang

Sep 24, 2025, 10:05:36 PM (13 hours ago) Sep 24
Thanks, will fix in next version.
>
> > +
> > + If unsure, say N.
>
>
> --
> ~Randy
>

--
Jinchao

Jinchao Wang

Sep 24, 2025, 10:06:52 PM (13 hours ago) Sep 24
On Wed, Sep 24, 2025 at 10:44:50PM +0200, Marco Elver wrote:
> On Wed, 24 Sept 2025 at 14:00, Jinchao Wang <wangjin...@gmail.com> wrote:
> >
> > Introduce a separate test module to validate functionality in controlled
> > scenarios.
> >
> > The module provides a proc interface (/proc/kstackwatch_test) that allows
> > triggering specific test cases via simple commands:
> >
> > echo test0 > /proc/kstackwatch_test
>
> This should not be in /proc/ - if anything, it should go into debugfs.
Thanks, will fix in next version.
>
--
Jinchao

Jinchao Wang

Sep 24, 2025, 10:07:31 PM (13 hours ago) Sep 24
On Wed, Sep 24, 2025 at 10:49:35PM +0200, Marco Elver wrote:
> On Wed, 24 Sept 2025 at 13:51, Jinchao Wang <wangjin...@gmail.com> wrote:
> >
> > Provide the /proc/kstackwatch file to read or update the configuration.
> > Only a single process can open this file at a time, enforced using atomic
> > config_file_busy, to prevent concurrent access.
>
> Why is this in /proc and not debugfs?
Thanks, will fix in next version.
>
--
Jinchao