[PATCH v3 00/19] mm/ksw: Introduce real-time Kernel Stack Watch debugging tool


Jinchao Wang

Sep 10, 2025, 1:24:10 AM
to Andrew Morton, Masami Hiramatsu, Peter Zijlstra, Mike Rapoport, Naveen N . Rao, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, kasa...@googlegroups.com, David S. Miller, Steven Rostedt, Mathieu Desnoyers, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan, Thomas Gleixner, Borislav Petkov, Dave Hansen, x...@kernel.org, H. Peter Anvin, linu...@kvack.org, linux-tra...@vger.kernel.org, linux-pe...@vger.kernel.org, linux-...@vger.kernel.org, Jinchao Wang
This patch series introduces **KStackWatch**, a lightweight kernel debugging tool
for detecting kernel stack corruption in real time.

The motivation comes from scenarios where corruption occurs silently in one function
but manifests later as a crash in another, with no direct call trace linking the two.
KASAN may fail to reproduce the issue because of its heavy overhead. Such bugs are
often extremely hard to debug with existing tools.
I demonstrate this scenario in **test2 (silent corruption test)**.

KStackWatch works by combining a hardware breakpoint with a kprobe and an fprobe.
It can watch a stack canary or a selected local variable and detect the moment the
corruption actually occurs. This allows developers to pinpoint the real source rather
than only observing the final crash.

Key features include:

- Lightweight overhead with minimal impact on bug reproducibility
- Real-time detection of stack corruption
- Simple configuration through `/proc/kstackwatch`
- Support for a recursion depth filter

To validate the approach, the patch includes a test module and a test script.
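For illustration, configuring the tool through /proc/kstackwatch could look like the
sketch below. The config-string format is the one defined later in the series
(`function+ip_offset[+depth] [local_var_offset:local_var_len]`); the function name and
offsets here are hypothetical:

```shell
# Hypothetical target: watch an 8-byte local variable at offset 16
# from the stack pointer, at my_func+0x10, recursion depth 2.
cfg="my_func+0x10+2 16:8"
echo "config: $cfg"

# On a kernel with CONFIG_KSTACK_WATCH, this would arm the watch
# (guarded so the sketch is safe to run anywhere):
if [ -w /proc/kstackwatch ]; then
    echo "$cfg" > /proc/kstackwatch
    cat /proc/kstackwatch   # shows the active config string
fi
```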

---
Changelog

V3:
Main changes:
* Use modify_wide_hw_breakpoint_local() (from Masami)
* Add atomic flag to restrict /proc/kstackwatch to a single opener
* Protect stack probe with an atomic PID flag
* Handle CPU hotplug for watchpoints
* Add preempt_disable/enable in ksw_watch_on_local_cpu()
* Introduce const struct ksw_config *ksw_get_config(void) and use it
* Switch to global watch_attr, remove struct watch_info
* Validate local_var_len in parser()
* Handle case when canary is not found
* Use dump_stack() instead of show_regs() to allow module build

Cleanups:
* Reduce logging and comments
* Format logs with KBUILD_MODNAME
* Remove unused headers

Documentation:
* Add new document

V2:
https://lore.kernel.org/all/20250904002126.1514...@gmail.com/
* Make hardware breakpoint and stack operations architecture-independent.

V1:
https://lore.kernel.org/all/20250828073311.1116...@gmail.com/
Core Implementation
* Replaced kretprobe with fprobe for function exit hooking, as suggested
by Masami Hiramatsu
* Introduced per-task depth logic to track recursion across scheduling
* Removed the use of workqueue for a more efficient corruption check
* Reordered patches for better logical flow
* Simplified and improved commit messages throughout the series
* Removed initial archcheck which should be improved later


Testing and Architecture

* Replaced the multiple-thread test with silent corruption test
* Split self-tests into a separate patch to improve clarity.

Maintenance
* Added a new entry for KStackWatch to the MAINTAINERS file.

RFC:
https://lore.kernel.org/lkml/20250818122720.4349...@gmail.com/
---

The series is structured as follows:

Jinchao Wang (18):
x86/hw_breakpoint: introduce arch_reinstall_hw_breakpoint() for atomic
context
mm/ksw: add build system support
mm/ksw: add ksw_config struct and parser
mm/ksw: add /proc/kstackwatch interface
mm/ksw: add HWBP pre-allocation
mm/ksw: add atomic watch on/off operations
mm/ksw: support CPU hotplug
mm/ksw: add probe management helpers
mm/ksw: resolve stack watch addr and len
mm/ksw: add recursive depth tracking
mm/ksw: manage start/stop of stack watching
mm/ksw: add self-debug helpers
mm/ksw: add test module
mm/ksw: add stack overflow test
mm/ksw: add silent corruption test case
mm/ksw: add recursive stack corruption test
tools/ksw: add test script
docs: add KStackWatch document

Masami Hiramatsu (Google) (1):
HWBP: Add modify_wide_hw_breakpoint_local() API

Documentation/dev-tools/kstackwatch.rst | 94 ++++++++
MAINTAINERS | 7 +
arch/Kconfig | 10 +
arch/x86/Kconfig | 1 +
arch/x86/include/asm/hw_breakpoint.h | 1 +
arch/x86/kernel/hw_breakpoint.c | 50 +++++
include/linux/hw_breakpoint.h | 6 +
kernel/events/hw_breakpoint.c | 36 ++++
mm/Kconfig.debug | 21 ++
mm/Makefile | 1 +
mm/kstackwatch/Makefile | 8 +
mm/kstackwatch/kernel.c | 239 ++++++++++++++++++++
mm/kstackwatch/kstackwatch.h | 53 +++++
mm/kstackwatch/stack.c | 276 ++++++++++++++++++++++++
mm/kstackwatch/test.c | 259 ++++++++++++++++++++++
mm/kstackwatch/watch.c | 205 ++++++++++++++++++
tools/kstackwatch/kstackwatch_test.sh | 40 ++++
17 files changed, 1307 insertions(+)
create mode 100644 Documentation/dev-tools/kstackwatch.rst
create mode 100644 mm/kstackwatch/Makefile
create mode 100644 mm/kstackwatch/kernel.c
create mode 100644 mm/kstackwatch/kstackwatch.h
create mode 100644 mm/kstackwatch/stack.c
create mode 100644 mm/kstackwatch/test.c
create mode 100644 mm/kstackwatch/watch.c
create mode 100755 tools/kstackwatch/kstackwatch_test.sh

--
2.43.0

Jinchao Wang

Sep 10, 2025, 1:24:38 AM
Introduce arch_reinstall_hw_breakpoint() to update hardware breakpoint
parameters (address, length, type) without freeing and reallocating the
debug register slot.

This allows atomic updates in contexts where memory allocation is not
permitted, such as kprobe handlers.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
arch/x86/include/asm/hw_breakpoint.h | 1 +
arch/x86/kernel/hw_breakpoint.c | 50 ++++++++++++++++++++++++++++
2 files changed, 51 insertions(+)

diff --git a/arch/x86/include/asm/hw_breakpoint.h b/arch/x86/include/asm/hw_breakpoint.h
index 0bc931cd0698..bb7c70ad22fe 100644
--- a/arch/x86/include/asm/hw_breakpoint.h
+++ b/arch/x86/include/asm/hw_breakpoint.h
@@ -59,6 +59,7 @@ extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,


int arch_install_hw_breakpoint(struct perf_event *bp);
+int arch_reinstall_hw_breakpoint(struct perf_event *bp);
void arch_uninstall_hw_breakpoint(struct perf_event *bp);
void hw_breakpoint_pmu_read(struct perf_event *bp);
void hw_breakpoint_pmu_unthrottle(struct perf_event *bp);
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index b01644c949b2..89135229ed21 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -132,6 +132,56 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
return 0;
}

+/*
+ * Reinstall a hardware breakpoint on the current CPU.
+ *
+ * This function is used to re-establish a perf counter hardware breakpoint.
+ * It finds the debug address register slot previously allocated for the
+ * breakpoint and re-enables it by writing the address to the debug register
+ * and setting the corresponding bits in the debug control register (DR7).
+ *
+ * It is expected that the breakpoint's event context lock is already held
+ * and interrupts are disabled, ensuring atomicity and safety from other
+ * event handlers.
+ */
+int arch_reinstall_hw_breakpoint(struct perf_event *bp)
+{
+ struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+ unsigned long *dr7;
+ int i;
+
+ lockdep_assert_irqs_disabled();
+
+ for (i = 0; i < HBP_NUM; i++) {
+ struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
+
+ if (*slot == bp)
+ break;
+ }
+
+ if (WARN_ONCE(i == HBP_NUM, "Can't find a matching breakpoint slot"))
+ return -EINVAL;
+
+ set_debugreg(info->address, i);
+ __this_cpu_write(cpu_debugreg[i], info->address);
+
+ dr7 = this_cpu_ptr(&cpu_dr7);
+ *dr7 |= encode_dr7(i, info->len, info->type);
+
+ /*
+ * Ensure we first write cpu_dr7 before we set the DR7 register.
+ * This ensures an NMI never sees cpu_dr7 == 0 while DR7 is set.
+ */
+ barrier();
+
+ set_debugreg(*dr7, 7);
+ if (info->mask)
+ amd_set_dr_addr_mask(info->mask, i);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(arch_reinstall_hw_breakpoint);
+
/*
* Uninstall the breakpoint contained in the given counter.
*
--
2.43.0

Jinchao Wang

Sep 10, 2025, 1:24:50 AM
From: "Masami Hiramatsu (Google)" <mhir...@kernel.org>

Add a modify_wide_hw_breakpoint_local() arch-wide interface which allows
hwbp users to update the watch address on-line. This is available if the
arch supports CONFIG_HAVE_REINSTALL_HW_BREAKPOINT.
Note that the type can only be changed to a compatible type, because this
does not release and re-reserve the hwbp slot based on the type.
For instance, you cannot change HW_BREAKPOINT_W to HW_BREAKPOINT_X.

Signed-off-by: Masami Hiramatsu (Google) <mhir...@kernel.org>
---
arch/Kconfig | 10 ++++++++++
arch/x86/Kconfig | 1 +
include/linux/hw_breakpoint.h | 6 ++++++
kernel/events/hw_breakpoint.c | 36 +++++++++++++++++++++++++++++++++++
4 files changed, 53 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index d1b4ffd6e085..e4787fc814df 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -418,6 +418,16 @@ config HAVE_MIXED_BREAKPOINTS_REGS
Select this option if your arch implements breakpoints under the
latter fashion.

+config HAVE_REINSTALL_HW_BREAKPOINT
+ bool
+ depends on HAVE_HW_BREAKPOINT
+ help
+ Depending on the arch implementation of hardware breakpoints,
+ some of them can update the breakpoint configuration without
+ releasing and re-reserving the hardware breakpoint register.
+ Which parts of the configuration can be updated depends on the
+ hardware and software implementation.
+
config HAVE_USER_RETURN_NOTIFIER
bool

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 58d890fe2100..49d4ce2af94c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -247,6 +247,7 @@ config X86
select HAVE_FUNCTION_TRACER
select HAVE_GCC_PLUGINS
select HAVE_HW_BREAKPOINT
+ select HAVE_REINSTALL_HW_BREAKPOINT
select HAVE_IOREMAP_PROT
select HAVE_IRQ_EXIT_ON_IRQ_STACK if X86_64
select HAVE_IRQ_TIME_ACCOUNTING
diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
index db199d653dd1..ea373f2587f8 100644
--- a/include/linux/hw_breakpoint.h
+++ b/include/linux/hw_breakpoint.h
@@ -81,6 +81,9 @@ register_wide_hw_breakpoint(struct perf_event_attr *attr,
perf_overflow_handler_t triggered,
void *context);

+extern int modify_wide_hw_breakpoint_local(struct perf_event *bp,
+ struct perf_event_attr *attr);
+
extern int register_perf_hw_breakpoint(struct perf_event *bp);
extern void unregister_hw_breakpoint(struct perf_event *bp);
extern void unregister_wide_hw_breakpoint(struct perf_event * __percpu *cpu_events);
@@ -124,6 +127,9 @@ register_wide_hw_breakpoint(struct perf_event_attr *attr,
perf_overflow_handler_t triggered,
void *context) { return NULL; }
static inline int
+modify_wide_hw_breakpoint_local(struct perf_event *bp,
+ struct perf_event_attr *attr) { return -ENOSYS; }
+static inline int
register_perf_hw_breakpoint(struct perf_event *bp) { return -ENOSYS; }
static inline void unregister_hw_breakpoint(struct perf_event *bp) { }
static inline void
diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
index 8ec2cb688903..ef9bab968b2c 100644
--- a/kernel/events/hw_breakpoint.c
+++ b/kernel/events/hw_breakpoint.c
@@ -887,6 +887,42 @@ void unregister_wide_hw_breakpoint(struct perf_event * __percpu *cpu_events)
}
EXPORT_SYMBOL_GPL(unregister_wide_hw_breakpoint);

+/**
+ * modify_wide_hw_breakpoint_local - update breakpoint config for local cpu
+ * @bp: the hwbp perf event for this cpu
+ * @attr: the new attribute for @bp
+ *
+ * This does not release and re-reserve the HWBP slot; it reuses the current
+ * slot on the local CPU, so users must update the other CPUs themselves.
+ * Also, since the slot is not released/reserved, the HWBP type can only be
+ * changed to a compatible type.
+ * Returns an error if @attr is invalid or the CPU fails to update the debug
+ * register for the new @attr.
+ */
+#ifdef CONFIG_HAVE_REINSTALL_HW_BREAKPOINT
+int modify_wide_hw_breakpoint_local(struct perf_event *bp,
+ struct perf_event_attr *attr)
+{
+ int ret;
+
+ if (find_slot_idx(bp->attr.bp_type) != find_slot_idx(attr->bp_type))
+ return -EINVAL;
+
+ ret = hw_breakpoint_arch_parse(bp, attr, counter_arch_bp(bp));
+ if (ret)
+ return ret;
+
+ return arch_reinstall_hw_breakpoint(bp);
+}
+#else
+int modify_wide_hw_breakpoint_local(struct perf_event *bp,
+ struct perf_event_attr *attr)
+{
+ return -EOPNOTSUPP;
+}
+#endif
+EXPORT_SYMBOL_GPL(modify_wide_hw_breakpoint_local);
+
/**
* hw_breakpoint_is_used - check if breakpoints are currently used
*
--
2.43.0

Jinchao Wang

Sep 10, 2025, 1:25:00 AM
Add Kconfig and Makefile infrastructure.

The implementation is located under `mm/kstackwatch/`.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/Kconfig.debug | 11 +++++++++++
mm/Makefile | 1 +
mm/kstackwatch/Makefile | 2 ++
mm/kstackwatch/kernel.c | 22 ++++++++++++++++++++++
mm/kstackwatch/kstackwatch.h | 5 +++++
mm/kstackwatch/stack.c | 1 +
mm/kstackwatch/watch.c | 1 +
7 files changed, 43 insertions(+)
create mode 100644 mm/kstackwatch/Makefile
create mode 100644 mm/kstackwatch/kernel.c
create mode 100644 mm/kstackwatch/kstackwatch.h
create mode 100644 mm/kstackwatch/stack.c
create mode 100644 mm/kstackwatch/watch.c

diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index 32b65073d0cc..fdfc6e6d0dec 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -309,3 +309,14 @@ config PER_VMA_LOCK_STATS
overhead in the page fault path.

If in doubt, say N.
+
+config KSTACK_WATCH
+ tristate "Kernel Stack Watch"
+ depends on HAVE_HW_BREAKPOINT && KPROBES && FPROBE
+ select HAVE_REINSTALL_HW_BREAKPOINT
+ help
+ A lightweight real-time debugging tool to detect stack corruption.
+ It can watch either the canary or local variable and tracks
+ the recursive depth of the monitored function.
+
+ If unsure, say N.
diff --git a/mm/Makefile b/mm/Makefile
index ef54aa615d9d..665c9f2bf987 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -92,6 +92,7 @@ obj-$(CONFIG_PAGE_POISONING) += page_poison.o
obj-$(CONFIG_KASAN) += kasan/
obj-$(CONFIG_KFENCE) += kfence/
obj-$(CONFIG_KMSAN) += kmsan/
+obj-$(CONFIG_KSTACK_WATCH) += kstackwatch/
obj-$(CONFIG_FAILSLAB) += failslab.o
obj-$(CONFIG_FAIL_PAGE_ALLOC) += fail_page_alloc.o
obj-$(CONFIG_MEMTEST) += memtest.o
diff --git a/mm/kstackwatch/Makefile b/mm/kstackwatch/Makefile
new file mode 100644
index 000000000000..84a46cb9a766
--- /dev/null
+++ b/mm/kstackwatch/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_KSTACK_WATCH) += kstackwatch.o
+kstackwatch-y := kernel.o stack.o watch.o
diff --git a/mm/kstackwatch/kernel.c b/mm/kstackwatch/kernel.c
new file mode 100644
index 000000000000..40aa7e9ff513
--- /dev/null
+++ b/mm/kstackwatch/kernel.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/module.h>
+
+MODULE_AUTHOR("Jinchao Wang");
+MODULE_DESCRIPTION("Kernel Stack Watch");
+MODULE_LICENSE("GPL");
+
+static int __init kstackwatch_init(void)
+{
+ pr_info("module loaded\n");
+ return 0;
+}
+
+static void __exit kstackwatch_exit(void)
+{
+ pr_info("module unloaded\n");
+}
+
+module_init(kstackwatch_init);
+module_exit(kstackwatch_exit);
diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
new file mode 100644
index 000000000000..0273ef478a26
--- /dev/null
+++ b/mm/kstackwatch/kstackwatch.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _KSTACKWATCH_H
+#define _KSTACKWATCH_H
+
+#endif /* _KSTACKWATCH_H */
diff --git a/mm/kstackwatch/stack.c b/mm/kstackwatch/stack.c
new file mode 100644
index 000000000000..cec594032515
--- /dev/null
+++ b/mm/kstackwatch/stack.c
@@ -0,0 +1 @@
+// SPDX-License-Identifier: GPL-2.0
diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
new file mode 100644
index 000000000000..cec594032515
--- /dev/null
+++ b/mm/kstackwatch/watch.c
@@ -0,0 +1 @@
+// SPDX-License-Identifier: GPL-2.0
--
2.43.0

Jinchao Wang

Sep 10, 2025, 1:25:12 AM
Add struct ksw_config and ksw_parse_config() to parse user string.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kernel.c | 91 ++++++++++++++++++++++++++++++++++++
mm/kstackwatch/kstackwatch.h | 33 +++++++++++++
2 files changed, 124 insertions(+)

diff --git a/mm/kstackwatch/kernel.c b/mm/kstackwatch/kernel.c
index 40aa7e9ff513..1502795e02af 100644
--- a/mm/kstackwatch/kernel.c
+++ b/mm/kstackwatch/kernel.c
@@ -1,20 +1,111 @@
// SPDX-License-Identifier: GPL-2.0
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

+#include <linux/kstrtox.h>
#include <linux/module.h>
+#include <linux/string.h>
+
+#include "kstackwatch.h"

MODULE_AUTHOR("Jinchao Wang");
MODULE_DESCRIPTION("Kernel Stack Watch");
MODULE_LICENSE("GPL");

+static struct ksw_config *ksw_config;
+
+/*
+ * Format of the configuration string:
+ * function+ip_offset[+depth] [local_var_offset:local_var_len]
+ *
+ * - function : name of the target function
+ * - ip_offset : instruction pointer offset within the function
+ * - depth : recursion depth to watch
+ * - local_var_offset : offset from the stack pointer at function+ip_offset
+ * - local_var_len : length of the local variable(1,2,4,8)
+ */
+static int __maybe_unused ksw_parse_config(char *buf, struct ksw_config *config)
+{
+ char *func_part, *local_var_part = NULL;
+ char *token;
+ u16 local_var_len;
+
+ memset(config, 0, sizeof(*config));
+
+ /* set the watch type to the default canary-based watching */
+ config->type = WATCH_CANARY;
+
+ func_part = strim(buf);
+ strscpy(config->config_str, func_part, MAX_CONFIG_STR_LEN);
+
+ local_var_part = strchr(func_part, ' ');
+ if (local_var_part) {
+ *local_var_part = '\0'; // terminate the function part
+ local_var_part = strim(local_var_part + 1);
+ }
+
+ /* parse the function part: function+ip_offset[+depth] */
+ token = strsep(&func_part, "+");
+ if (!token)
+ goto fail;
+
+ strscpy(config->function, token, MAX_FUNC_NAME_LEN - 1);
+
+ token = strsep(&func_part, "+");
+ if (!token || kstrtou16(token, 0, &config->ip_offset)) {
+ pr_err("failed to parse instruction offset\n");
+ goto fail;
+ }
+
+ token = strsep(&func_part, "+");
+ if (token && kstrtou16(token, 0, &config->depth)) {
+ pr_err("failed to parse depth\n");
+ goto fail;
+ }
+ if (!local_var_part || !(*local_var_part))
+ return 0;
+
+ /* parse the optional local var offset:len */
+ config->type = WATCH_LOCAL_VAR;
+ token = strsep(&local_var_part, ":");
+ if (!token || kstrtou16(token, 0, &config->local_var_offset)) {
+ pr_err("failed to parse local var offset\n");
+ goto fail;
+ }
+
+ if (!local_var_part || kstrtou16(local_var_part, 0, &local_var_len)) {
+ pr_err("failed to parse local var len\n");
+ goto fail;
+ }
+
+ if (local_var_len != 1 && local_var_len != 2 &&
+ local_var_len != 4 && local_var_len != 8) {
+ pr_err("invalid local var len %u (must be 1,2,4,8)\n",
+ local_var_len);
+ goto fail;
+ }
+ config->local_var_len = local_var_len;
+
+ return 0;
+fail:
+ pr_err("invalid input: %s\n", config->config_str);
+ config->config_str[0] = '\0';
+ return -EINVAL;
+}
+
static int __init kstackwatch_init(void)
{
+ ksw_config = kzalloc(sizeof(*ksw_config), GFP_KERNEL);
+ if (!ksw_config)
+ return -ENOMEM;
+
pr_info("module loaded\n");
return 0;
}

static void __exit kstackwatch_exit(void)
{
+ kfree(ksw_config);
+
pr_info("module unloaded\n");
}

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 0273ef478a26..7c595c5c24d1 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -2,4 +2,37 @@
#ifndef _KSTACKWATCH_H
#define _KSTACKWATCH_H

+#include <linux/types.h>
+
+#define MAX_FUNC_NAME_LEN 64
+#define MAX_CONFIG_STR_LEN 128
+
+enum watch_type {
+ WATCH_CANARY = 0,
+ WATCH_LOCAL_VAR,
+};
+
+struct ksw_config {
+ /* function part */
+ char function[MAX_FUNC_NAME_LEN];
+ u16 ip_offset;
+ u16 depth;
+
+ /* local var, useless for canary watch */
+ /* offset from rsp at function+ip_offset */
+ u16 local_var_offset;
+
+ /*
+ * local var size (1,2,4,8 bytes)
+ * it will be the watching len
+ */
+ u16 local_var_len;
+
+ /* derived from the config, kept for readability */
+ enum watch_type type;
+
+ /* saved so it can be shown via procfs */
+ char config_str[MAX_CONFIG_STR_LEN];
+};
+
#endif /* _KSTACKWATCH_H */
--
2.43.0

Jinchao Wang

Sep 10, 2025, 1:25:24 AM
Provide the /proc/kstackwatch file to read or update the configuration.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kernel.c | 75 +++++++++++++++++++++++++++++++++++-
mm/kstackwatch/kstackwatch.h | 3 ++
2 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/kernel.c b/mm/kstackwatch/kernel.c
index 1502795e02af..8e1dca45003e 100644
--- a/mm/kstackwatch/kernel.c
+++ b/mm/kstackwatch/kernel.c
@@ -3,7 +3,10 @@

#include <linux/kstrtox.h>
#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
#include <linux/string.h>
+#include <linux/uaccess.h>

#include "kstackwatch.h"

@@ -12,6 +15,7 @@ MODULE_DESCRIPTION("Kernel Stack Watch");
MODULE_LICENSE("GPL");

static struct ksw_config *ksw_config;
+static atomic_t config_file_busy = ATOMIC_INIT(0);

/*
* Format of the configuration string:
@@ -23,7 +27,7 @@ static struct ksw_config *ksw_config;
* - local_var_offset : offset from the stack pointer at function+ip_offset
* - local_var_len : length of the local variable(1,2,4,8)
*/
-static int __maybe_unused ksw_parse_config(char *buf, struct ksw_config *config)
+static int ksw_parse_config(char *buf, struct ksw_config *config)
{
char *func_part, *local_var_part = NULL;
char *token;
@@ -92,18 +96,87 @@ static int __maybe_unused ksw_parse_config(char *buf, struct ksw_config *config)
return -EINVAL;
}

+static ssize_t kstackwatch_proc_write(struct file *file,
+ const char __user *buffer, size_t count,
+ loff_t *pos)
+{
+ char input[MAX_CONFIG_STR_LEN];
+ int ret;
+
+ if (count == 0 || count >= sizeof(input))
+ return -EINVAL;
+
+ if (copy_from_user(input, buffer, count))
+ return -EFAULT;
+
+ input[count] = '\0';
+ strim(input);
+
+ if (!strlen(input)) {
+ pr_info("config cleared\n");
+ return count;
+ }
+
+ ret = ksw_parse_config(input, ksw_config);
+ if (ret) {
+ pr_err("Failed to parse config %d\n", ret);
+ return ret;
+ }
+
+ return count;
+}
+
+static int kstackwatch_proc_show(struct seq_file *m, void *v)
+{
+ seq_printf(m, "%s\n", ksw_config->config_str);
+ return 0;
+}
+
+static int kstackwatch_proc_open(struct inode *inode, struct file *file)
+{
+ if (atomic_cmpxchg(&config_file_busy, 0, 1))
+ return -EBUSY;
+
+ return single_open(file, kstackwatch_proc_show, NULL);
+}
+
+static int kstackwatch_proc_release(struct inode *inode, struct file *file)
+{
+ atomic_set(&config_file_busy, 0);
+ return single_release(inode, file);
+}
+
+static const struct proc_ops kstackwatch_proc_ops = {
+ .proc_open = kstackwatch_proc_open,
+ .proc_read = seq_read,
+ .proc_write = kstackwatch_proc_write,
+ .proc_lseek = seq_lseek,
+ .proc_release = kstackwatch_proc_release,
+};
+
+const struct ksw_config *ksw_get_config(void)
+{
+ return ksw_config;
+}
static int __init kstackwatch_init(void)
{
ksw_config = kzalloc(sizeof(*ksw_config), GFP_KERNEL);
if (!ksw_config)
return -ENOMEM;

+ if (!proc_create("kstackwatch", 0600, NULL, &kstackwatch_proc_ops)) {
+ pr_err("failed to create /proc/kstackwatch\n");
+ kfree(ksw_config);
+ return -ENOMEM;
+ }
+
pr_info("module loaded\n");
return 0;
}

static void __exit kstackwatch_exit(void)
{
+ remove_proc_entry("kstackwatch", NULL);
kfree(ksw_config);

pr_info("module unloaded\n");
diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 7c595c5c24d1..277b192f80fa 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -35,4 +35,7 @@ struct ksw_config {
char config_str[MAX_CONFIG_STR_LEN];
};

+// singleton, only modified in kernel.c
+const struct ksw_config *ksw_get_config(void);

Jinchao Wang

Sep 10, 2025, 1:25:40 AM
Pre-allocate per-CPU hardware breakpoints at init with a dummy address,
which will be retargeted dynamically in kprobe handler.
This avoids allocation in atomic contexts.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kstackwatch.h | 4 +++
mm/kstackwatch/watch.c | 55 ++++++++++++++++++++++++++++++++++++
2 files changed, 59 insertions(+)

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 277b192f80fa..3ea191370970 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -38,4 +38,8 @@ struct ksw_config {
// singleton, only modified in kernel.c
const struct ksw_config *ksw_get_config(void);

+/* watch management */
+int ksw_watch_init(void);
+void ksw_watch_exit(void);
+
#endif /* _KSTACKWATCH_H */
diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
index cec594032515..d3399ac840b2 100644
--- a/mm/kstackwatch/watch.c
+++ b/mm/kstackwatch/watch.c
@@ -1 +1,56 @@
// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/hw_breakpoint.h>
+#include <linux/perf_event.h>
+#include <linux/printk.h>
+
+#include "kstackwatch.h"
+
+static struct perf_event *__percpu *watch_events;
+
+static unsigned long watch_holder;
+
+static struct perf_event_attr watch_attr;
+
+bool panic_on_catch;
+module_param(panic_on_catch, bool, 0644);
+MODULE_PARM_DESC(panic_on_catch, "panic immediately on corruption catch");
+static void ksw_watch_handler(struct perf_event *bp,
+ struct perf_sample_data *data,
+ struct pt_regs *regs)
+{
+ pr_err("========== KStackWatch: Caught stack corruption =======\n");
+ pr_err("config %s\n", ksw_get_config()->config_str);
+ dump_stack();
+ pr_err("=================== KStackWatch End ===================\n");
+
+ if (panic_on_catch)
+ panic("Stack corruption detected");
+}
+
+int ksw_watch_init(void)
+{
+ int ret;
+
+ hw_breakpoint_init(&watch_attr);
+ watch_attr.bp_addr = (unsigned long)&watch_holder;
+ watch_attr.bp_len = sizeof(watch_holder);
+ watch_attr.bp_type = HW_BREAKPOINT_W;
+ watch_events = register_wide_hw_breakpoint(&watch_attr,
+ ksw_watch_handler,
+ NULL);
+ if (IS_ERR(watch_events)) {
+ ret = PTR_ERR(watch_events);
+ pr_err("failed to register wide hw breakpoint: %d\n", ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+void ksw_watch_exit(void)
+{
+ unregister_wide_hw_breakpoint(watch_events);
+ watch_events = NULL;
+}
--
2.43.0

Jinchao Wang

Sep 10, 2025, 1:25:55 AM
Add support to atomically turn the hardware watch on and off without
allocation overhead.

The watch is pre-allocated and later retargeted.
The current CPU is updated directly, while other CPUs are updated
asynchronously via smp_call_function_single_async().

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kstackwatch.h | 2 +
mm/kstackwatch/watch.c | 95 ++++++++++++++++++++++++++++++++++++
2 files changed, 97 insertions(+)

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 3ea191370970..2fa377843f17 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -41,5 +41,7 @@ const struct ksw_config *ksw_get_config(void);
/* watch management */
int ksw_watch_init(void);
void ksw_watch_exit(void);
+int ksw_watch_on(u64 watch_addr, u64 watch_len);
+void ksw_watch_off(void);

#endif /* _KSTACKWATCH_H */
diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
index d3399ac840b2..e02ffc3231ad 100644
--- a/mm/kstackwatch/watch.c
+++ b/mm/kstackwatch/watch.c
@@ -3,16 +3,23 @@

#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
+#include <linux/preempt.h>
#include <linux/printk.h>

#include "kstackwatch.h"

static struct perf_event *__percpu *watch_events;
+static DEFINE_SPINLOCK(watch_lock);

static unsigned long watch_holder;

static struct perf_event_attr watch_attr;

+static void ksw_watch_on_local_cpu(void *info);
+
+static DEFINE_PER_CPU(call_single_data_t,
+ watch_csd) = CSD_INIT(ksw_watch_on_local_cpu, NULL);
+
bool panic_on_catch;
module_param(panic_on_catch, bool, 0644);
MODULE_PARM_DESC(panic_on_catch, "panic immediately on corruption catch");
@@ -29,6 +36,94 @@ static void ksw_watch_handler(struct perf_event *bp,
panic("Stack corruption detected");
}

+static void ksw_watch_on_local_cpu(void *data)
+{
+ struct perf_event *bp;
+ int cpu;
+ int ret;
+
+ preempt_disable();
+ cpu = raw_smp_processor_id();
+ bp = *per_cpu_ptr(watch_events, cpu);
+ if (!bp) {
+ preempt_enable();
+ return;
+ }
+
+ ret = modify_wide_hw_breakpoint_local(bp, &watch_attr);
+ preempt_enable();
+
+ if (ret) {
+ pr_err("failed to reinstall HWBP on CPU %d ret %d\n", cpu,
+ ret);
+ return;
+ }
+
+ if (watch_attr.bp_addr == (unsigned long)&watch_holder) {
+ pr_debug("watch off CPU %d\n", cpu);
+ } else {
+ pr_debug("watch on CPU %d at 0x%llx (len %llu)\n", cpu,
+ watch_attr.bp_addr, watch_attr.bp_len);
+ }
+}
+
+int ksw_watch_on(u64 watch_addr, u64 watch_len)
+{
+ unsigned long flags;
+ int cpu;
+ call_single_data_t *csd;
+
+ if (!watch_addr) {
+ pr_err("watch with invalid address\n");
+ return -EINVAL;
+ }
+
+ spin_lock_irqsave(&watch_lock, flags);
+
+ /*
+ * enforce singleton watch:
+ * - if a watch is already active (bp_addr != &watch_holder),
+ * - and not asking to reset it (watch_addr != &watch_holder)
+ * then reject with -EBUSY.
+ */
+ if (watch_attr.bp_addr != (unsigned long)&watch_holder &&
+ watch_addr != (unsigned long)&watch_holder) {
+ spin_unlock_irqrestore(&watch_lock, flags);
+ return -EBUSY;
+ }
+
+ watch_attr.bp_addr = watch_addr;
+ watch_attr.bp_len = watch_len;
+
+ /* ensure watchpoint update is visible to other CPUs before IPI */
+ smp_wmb();
+
+ spin_unlock_irqrestore(&watch_lock, flags);
+
+ if (watch_addr == (unsigned long)&watch_holder)
+ pr_debug("watch off starting\n");
+ else
+ pr_debug("watch on starting\n");
+
+ cpus_read_lock();
+ for_each_online_cpu(cpu) {
+ if (cpu == raw_smp_processor_id()) {
+ ksw_watch_on_local_cpu(NULL);
+ } else {
+ csd = &per_cpu(watch_csd, cpu);
+ smp_call_function_single_async(cpu, csd);
+ }
+ }
+ cpus_read_unlock();
+
+ return 0;
+}
+
+void ksw_watch_off(void)
+{
+ ksw_watch_on((unsigned long)&watch_holder, sizeof(watch_holder));
+}
+
int ksw_watch_init(void)
{
int ret;
--
2.43.0

Jinchao Wang
Sep 10, 2025, 1:26:06 AM
Register CPU online/offline callbacks via cpuhp_setup_state_nocalls()
so stack watches are installed/removed dynamically as CPUs come online
or go offline.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/watch.c | 36 ++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)

diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
index e02ffc3231ad..d95efefdffe9 100644
--- a/mm/kstackwatch/watch.c
+++ b/mm/kstackwatch/watch.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

+#include <linux/cpuhotplug.h>
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <linux/preempt.h>
@@ -67,6 +68,32 @@ static void ksw_watch_on_local_cpu(void *data)
}
}

+static int ksw_cpu_online(unsigned int cpu)
+{
+ struct perf_event *bp;
+
+ bp = perf_event_create_kernel_counter(&watch_attr, cpu, NULL,
+ ksw_watch_handler, NULL);
+ if (IS_ERR(bp)) {
+ pr_err("Failed to create watch on CPU %d: %ld\n", cpu,
+ PTR_ERR(bp));
+ return PTR_ERR(bp);
+ }
+
+ per_cpu(*watch_events, cpu) = bp;
+ per_cpu(watch_csd, cpu) = CSD_INIT(ksw_watch_on_local_cpu, NULL);
+ return 0;
+}
+
+static int ksw_cpu_offline(unsigned int cpu)
+{
+ struct perf_event *bp = per_cpu(*watch_events, cpu);
+
+ if (bp)
+ unregister_hw_breakpoint(bp);
+ return 0;
+}
+
int ksw_watch_on(u64 watch_addr, u64 watch_len)
{
unsigned long flags;
@@ -141,6 +168,15 @@ int ksw_watch_init(void)
return ret;
}

+ ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+ "kstackwatch:online", ksw_cpu_online,
+ ksw_cpu_offline);
+ if (ret < 0) {
+ unregister_wide_hw_breakpoint(watch_events);
+ pr_err("Failed to register CPU hotplug notifier\n");
+ return ret;
+ }
+
return 0;
}

--
2.43.0

Jinchao Wang
Sep 10, 2025, 1:32:12 AM
Provide ksw_stack_init() and ksw_stack_exit() to manage entry and
exit probes for the target function from ksw_get_config().
Use atomic PID tracking to ensure a single active watch.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kstackwatch.h | 4 ++
mm/kstackwatch/stack.c | 99 ++++++++++++++++++++++++++++++++++++
2 files changed, 103 insertions(+)

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 2fa377843f17..79ca40e69268 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -38,6 +38,10 @@ struct ksw_config {
// singleton, only modified in kernel.c
const struct ksw_config *ksw_get_config(void);

+/* stack management */
+int ksw_stack_init(void);
+void ksw_stack_exit(void);
+
/* watch management */
int ksw_watch_init(void);
void ksw_watch_exit(void);
diff --git a/mm/kstackwatch/stack.c b/mm/kstackwatch/stack.c
index cec594032515..72409156458f 100644
--- a/mm/kstackwatch/stack.c
+++ b/mm/kstackwatch/stack.c
@@ -1 +1,100 @@
// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/atomic.h>
+#include <linux/fprobe.h>
+#include <linux/kprobes.h>
+#include <linux/printk.h>
+#include <linux/spinlock.h>
+
+#include "kstackwatch.h"
+
+static struct kprobe entry_probe;
+static struct fprobe exit_probe;
+#define INVALID_PID -1
+static atomic_t ksw_stack_pid = ATOMIC_INIT(INVALID_PID);
+
+static int ksw_stack_prepare_watch(struct pt_regs *regs,
+ const struct ksw_config *config,
+ u64 *watch_addr, u64 *watch_len)
+{
+ /* implementation logic will be added in following patches */
+ *watch_addr = 0;
+ *watch_len = 0;
+ return 0;
+}
+
+static void ksw_stack_entry_handler(struct kprobe *p, struct pt_regs *regs,
+ unsigned long flags)
+{
+ u64 watch_addr;
+ u64 watch_len;
+ int ret;
+
+ if (atomic_cmpxchg(&ksw_stack_pid, INVALID_PID, current->pid) !=
+ INVALID_PID)
+ return;
+
+ ret = ksw_stack_prepare_watch(regs, ksw_get_config(), &watch_addr,
+ &watch_len);
+ if (ret) {
+ atomic_set(&ksw_stack_pid, INVALID_PID);
+ pr_err("failed to prepare watch target: %d\n", ret);
+ return;
+ }
+
+ ret = ksw_watch_on(watch_addr, watch_len);
+ if (ret) {
+ atomic_set(&ksw_stack_pid, INVALID_PID);
+ pr_err("failed to watch on addr:0x%llx len:%llu %d\n",
+ watch_addr, watch_len, ret);
+ return;
+ }
+}
+
+static void ksw_stack_exit_handler(struct fprobe *fp, unsigned long ip,
+ unsigned long ret_ip,
+ struct ftrace_regs *regs, void *data)
+{
+ if (atomic_read(&ksw_stack_pid) != current->pid)
+ return;
+
+ ksw_watch_off();
+
+ atomic_set(&ksw_stack_pid, INVALID_PID);
+}
+
+int ksw_stack_init(void)
+{
+ int ret;
+ char *symbuf = NULL;
+
+ memset(&entry_probe, 0, sizeof(entry_probe));
+ entry_probe.symbol_name = ksw_get_config()->function;
+ entry_probe.offset = ksw_get_config()->ip_offset;
+ entry_probe.post_handler = ksw_stack_entry_handler;
+ ret = register_kprobe(&entry_probe);
+ if (ret) {
+ pr_err("Failed to register kprobe ret %d\n", ret);
+ return ret;
+ }
+
+ memset(&exit_probe, 0, sizeof(exit_probe));
+ exit_probe.exit_handler = ksw_stack_exit_handler;
+ symbuf = (char *)ksw_get_config()->function;
+
+ ret = register_fprobe_syms(&exit_probe, (const char **)&symbuf, 1);
+ if (ret < 0) {
+ pr_err("register_fprobe_syms fail %d\n", ret);
+ unregister_kprobe(&entry_probe);
+ return ret;
+ }
+
+ return 0;
+}
+
+void ksw_stack_exit(void)
+{
+ unregister_fprobe(&exit_probe);
+ unregister_kprobe(&entry_probe);
+}
--
2.43.0

Jinchao Wang
Sep 10, 2025, 1:32:21 AM
Add helpers to find the address and length of the stack canary or of a
selected local variable for the probed function, based on
ksw_get_config(). The canary search is limited to a fixed number of
steps to avoid scanning the entire stack, and the computed address and
length are validated to lie within the kernel stack.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/stack.c | 86 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 83 insertions(+), 3 deletions(-)

diff --git a/mm/kstackwatch/stack.c b/mm/kstackwatch/stack.c
index 72409156458f..3ea0f9de698e 100644
--- a/mm/kstackwatch/stack.c
+++ b/mm/kstackwatch/stack.c
@@ -13,14 +13,94 @@ static struct kprobe entry_probe;
static struct fprobe exit_probe;
#define INVALID_PID -1
static atomic_t ksw_stack_pid = ATOMIC_INIT(INVALID_PID);
+#define MAX_CANARY_SEARCH_STEPS 128
+
+static unsigned long ksw_find_stack_canary_addr(struct pt_regs *regs)
+{
+ unsigned long *stack_ptr, *stack_end, *stack_base;
+ unsigned long expected_canary;
+ unsigned int i;
+
+ stack_ptr = (unsigned long *)kernel_stack_pointer(regs);
+
+ stack_base = (unsigned long *)(current->stack);
+
+ // TODO: limit it to the current frame
+ stack_end = (unsigned long *)((char *)current->stack + THREAD_SIZE);
+
+ expected_canary = current->stack_canary;
+
+ if (stack_ptr < stack_base || stack_ptr >= stack_end) {
+ pr_err("Stack pointer 0x%lx out of bounds [0x%lx, 0x%lx)\n",
+ (unsigned long)stack_ptr, (unsigned long)stack_base,
+ (unsigned long)stack_end);
+ return 0;
+ }
+
+ for (i = 0; i < MAX_CANARY_SEARCH_STEPS; i++) {
+ if (&stack_ptr[i] >= stack_end)
+ break;
+
+ if (stack_ptr[i] == expected_canary) {
+ pr_debug("canary found i:%d 0x%lx\n", i,
+ (unsigned long)&stack_ptr[i]);
+ return (unsigned long)&stack_ptr[i];
+ }
+ }
+
+ pr_debug("canary not found in first %d steps\n",
+ MAX_CANARY_SEARCH_STEPS);
+ return 0;
+}
+
+static int ksw_stack_validate_addr(unsigned long addr, size_t size)
+{
+ unsigned long stack_start, stack_end;
+
+ if (!addr || !size)
+ return -EINVAL;
+
+ stack_start = (unsigned long)current->stack;
+ stack_end = stack_start + THREAD_SIZE;
+
+ if (addr < stack_start || (addr + size) > stack_end)
+ return -ERANGE;
+
+ return 0;
+}

static int ksw_stack_prepare_watch(struct pt_regs *regs,
const struct ksw_config *config,
u64 *watch_addr, u64 *watch_len)
{
- /* implementation logic will be added in following patches */
- *watch_addr = 0;
- *watch_len = 0;
+ u64 addr;
+ u64 len;
+
+ /* Resolve addresses for all active watches */
+ switch (ksw_get_config()->type) {
+ case WATCH_CANARY:
+ addr = ksw_find_stack_canary_addr(regs);
+ len = sizeof(unsigned long);
+ break;
+
+ case WATCH_LOCAL_VAR:
+ addr = kernel_stack_pointer(regs) +
+ ksw_get_config()->local_var_offset;
+ len = ksw_get_config()->local_var_len;
+ break;
+
+ default:
+ pr_err("Unknown watch type %d\n", ksw_get_config()->type);
+ return -EINVAL;
+ }
+
+ if (ksw_stack_validate_addr(addr, len)) {
+ pr_err("invalid stack addr:0x%llx len:%llu\n", addr, len);
+ return -EINVAL;
+ }
+
+ *watch_addr = addr;
+ *watch_len = len;
return 0;
}

--
2.43.0

Jinchao Wang
Sep 10, 2025, 1:32:34 AM
Track per-task recursion depth using a simple hashtable keyed by PID.
Entry/exit handlers update the depth, triggering only at the configured
recursion level.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/stack.c | 100 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 98 insertions(+), 2 deletions(-)

diff --git a/mm/kstackwatch/stack.c b/mm/kstackwatch/stack.c
index 3ea0f9de698e..669876057f0b 100644
--- a/mm/kstackwatch/stack.c
+++ b/mm/kstackwatch/stack.c
@@ -3,6 +3,8 @@

#include <linux/atomic.h>
#include <linux/fprobe.h>
+#include <linux/hash.h>
+#include <linux/hashtable.h>
#include <linux/kprobes.h>
#include <linux/printk.h>
#include <linux/spinlock.h>
@@ -15,6 +17,83 @@ static struct fprobe exit_probe;
static atomic_t ksw_stack_pid = ATOMIC_INIT(INVALID_PID);
#define MAX_CANARY_SEARCH_STEPS 128

+struct depth_entry {
+ pid_t pid;
+ int depth; /* starts from 0 */
+ struct hlist_node node;
+};
+
+#define DEPTH_HASH_BITS 8
+#define DEPTH_HASH_SIZE BIT(DEPTH_HASH_BITS)
+static DEFINE_HASHTABLE(depth_hash, DEPTH_HASH_BITS);
+static DEFINE_SPINLOCK(depth_hash_lock);
+
+static int get_recursive_depth(void)
+{
+ struct depth_entry *entry;
+ pid_t pid = current->pid;
+ int depth = 0;
+
+ spin_lock(&depth_hash_lock);
+ hash_for_each_possible(depth_hash, entry, node, pid) {
+ if (entry->pid == pid) {
+ depth = entry->depth;
+ break;
+ }
+ }
+ spin_unlock(&depth_hash_lock);
+ return depth;
+}
+
+static void set_recursive_depth(int depth)
+{
+ struct depth_entry *entry;
+ pid_t pid = current->pid;
+ bool found = false;
+
+ spin_lock(&depth_hash_lock);
+ hash_for_each_possible(depth_hash, entry, node, pid) {
+ if (entry->pid == pid) {
+ entry->depth = depth;
+ found = true;
+ break;
+ }
+ }
+
+ if (found) {
+ // last exit handler
+ if (depth == 0) {
+ hash_del(&entry->node);
+ kfree(entry);
+ }
+ goto unlock;
+ }
+
+ WARN_ONCE(depth != 1, "new entry depth %d should be 1", depth);
+ entry = kmalloc(sizeof(*entry), GFP_ATOMIC);
+ if (entry) {
+ entry->pid = pid;
+ entry->depth = depth;
+ hash_add(depth_hash, &entry->node, pid);
+ }
+unlock:
+ spin_unlock(&depth_hash_lock);
+}
+
+static void reset_recursive_depth(void)
+{
+ struct depth_entry *entry;
+ struct hlist_node *tmp;
+ int bkt;
+
+ spin_lock(&depth_hash_lock);
+ hash_for_each_safe(depth_hash, bkt, tmp, entry, node) {
+ hash_del(&entry->node);
+ kfree(entry);
+ }
+ spin_unlock(&depth_hash_lock);
+}
+
static unsigned long ksw_find_stack_canary_addr(struct pt_regs *regs)
{
unsigned long *stack_ptr, *stack_end, *stack_base;
@@ -109,8 +188,15 @@ static void ksw_stack_entry_handler(struct kprobe *p, struct pt_regs *regs,
{
u64 watch_addr;
u64 watch_len;
+ int cur_depth;
int ret;

+ cur_depth = get_recursive_depth();
+ set_recursive_depth(cur_depth + 1);
+
+ if (cur_depth != ksw_get_config()->depth)
+ return;
+
if (atomic_cmpxchg(&ksw_stack_pid, INVALID_PID, current->pid) !=
INVALID_PID)
return;
@@ -126,8 +212,8 @@ static void ksw_stack_entry_handler(struct kprobe *p, struct pt_regs *regs,
ret = ksw_watch_on(watch_addr, watch_len);
if (ret) {
atomic_set(&ksw_stack_pid, INVALID_PID);
- pr_err("failed to watch on addr:0x%llx len:%llu %d\n",
- watch_addr, watch_len, ret);
+ pr_err("failed to watch on depth:%d addr:0x%llx len:%llu %d\n",
+ cur_depth, watch_addr, watch_len, ret);
return;
}
}
@@ -136,6 +222,14 @@ static void ksw_stack_exit_handler(struct fprobe *fp, unsigned long ip,
unsigned long ret_ip,
struct ftrace_regs *regs, void *data)
{
+ int cur_depth;
+
+ cur_depth = get_recursive_depth() - 1;
+ set_recursive_depth(cur_depth);
+
+ if (cur_depth != ksw_get_config()->depth)
+ return;
+
if (atomic_read(&ksw_stack_pid) != current->pid)
return;

@@ -149,6 +243,8 @@ int ksw_stack_init(void)
int ret;
char *symbuf = NULL;

+ reset_recursive_depth();
+
memset(&entry_probe, 0, sizeof(entry_probe));
entry_probe.symbol_name = ksw_get_config()->function;
entry_probe.offset = ksw_get_config()->ip_offset;
--
2.43.0

Jinchao Wang
Sep 10, 2025, 1:32:45 AM
Introduce helper functions to start and stop watching the configured
function. These handle initialization/cleanup of both stack and watch
components, and maintain a `watching_active` flag to track current state.

Ensure procfs write triggers proper stop/start sequence, and show handler
indicates watching status.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kernel.c | 55 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/kernel.c b/mm/kstackwatch/kernel.c
index 8e1dca45003e..9ef969f28e29 100644
--- a/mm/kstackwatch/kernel.c
+++ b/mm/kstackwatch/kernel.c
@@ -17,6 +17,43 @@ MODULE_LICENSE("GPL");
static struct ksw_config *ksw_config;
static atomic_t config_file_busy = ATOMIC_INIT(0);

+static bool watching_active;
+
+static int ksw_start_watching(void)
+{
+ int ret;
+
+ /*
+ * Watch init will preallocate the HWBP,
+ * so it must happen before stack init
+ */
+ ret = ksw_watch_init();
+ if (ret) {
+ pr_err("ksw_watch_init ret: %d\n", ret);
+ return ret;
+ }
+
+ ret = ksw_stack_init();
+ if (ret) {
+ pr_err("ksw_stack_init ret: %d\n", ret);
+ ksw_watch_exit();
+ return ret;
+ }
+ watching_active = true;
+
+ pr_info("start watching: %s\n", ksw_config->config_str);
+ return 0;
+}
+
+static void ksw_stop_watching(void)
+{
+ ksw_stack_exit();
+ ksw_watch_exit();
+ watching_active = false;
+
+ pr_info("stop watching: %s\n", ksw_config->config_str);
+}
+
/*
* Format of the configuration string:
* function+ip_offset[+depth] [local_var_offset:local_var_len]
@@ -109,6 +146,9 @@ static ssize_t kstackwatch_proc_write(struct file *file,
if (copy_from_user(input, buffer, count))
return -EFAULT;

+ if (watching_active)
+ ksw_stop_watching();
+
input[count] = '\0';
strim(input);

@@ -123,12 +163,22 @@ static ssize_t kstackwatch_proc_write(struct file *file,
return ret;
}

+ ret = ksw_start_watching();
+ if (ret) {
+ pr_err("Failed to start watching with %d\n", ret);
+ return ret;
+ }
+
return count;
}

static int kstackwatch_proc_show(struct seq_file *m, void *v)
{
- seq_printf(m, "%s\n", ksw_config->config_str);
+ if (watching_active)
+ seq_printf(m, "%s\n", ksw_config->config_str);
+ else
+ seq_puts(m, "not watching\n");
+
return 0;
}

@@ -176,6 +226,9 @@ static int __init kstackwatch_init(void)

static void __exit kstackwatch_exit(void)
{
+ if (watching_active)
+ ksw_stop_watching();
+
remove_proc_entry("kstackwatch", NULL);
kfree(ksw_config);

--
2.43.0

Jinchao Wang
Sep 10, 2025, 1:32:57 AM
Provide two debug helpers:

- ksw_watch_show(): print the current watch target address and length.
- ksw_watch_fire(): intentionally trigger the watchpoint immediately
by writing to the watched address, useful for testing HWBP behavior.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kstackwatch.h | 2 ++
mm/kstackwatch/watch.c | 18 ++++++++++++++++++
2 files changed, 20 insertions(+)

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 79ca40e69268..8632b43b6a33 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -47,5 +47,7 @@ int ksw_watch_init(void);
void ksw_watch_exit(void);
int ksw_watch_on(u64 watch_addr, u64 watch_len);
void ksw_watch_off(void);
+void ksw_watch_show(void);
+void ksw_watch_fire(void);

#endif /* _KSTACKWATCH_H */
diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
index d95efefdffe9..87bbe54bb5d3 100644
--- a/mm/kstackwatch/watch.c
+++ b/mm/kstackwatch/watch.c
@@ -185,3 +185,21 @@ void ksw_watch_exit(void)
unregister_wide_hw_breakpoint(watch_events);
watch_events = NULL;
}
+
+/* self debug function */
+void ksw_watch_show(void)
+{
+ pr_info("watch target bp_addr: 0x%llx len:%llu\n", watch_attr.bp_addr,
+ watch_attr.bp_len);
+}
+EXPORT_SYMBOL_GPL(ksw_watch_show);
+
+/* self debug function */
+void ksw_watch_fire(void)
+{
+ char *ptr = (char *)watch_attr.bp_addr;
+
+ pr_warn("watch triggered immediately\n");
+ *ptr = 0x42; // This should trigger immediately for any bp_len
+}
+EXPORT_SYMBOL_GPL(ksw_watch_fire);
--
2.43.0

Jinchao Wang
Sep 10, 2025, 1:33:08 AM
Introduce a separate test module to validate functionality in controlled
scenarios, such as stack canary writes and simulated corruption.

The module provides a proc interface (/proc/kstackwatch_test) that allows
triggering specific test cases via simple commands:

- test0: directly corrupt the canary to verify watch/fire behavior

The test module is built with optimizations disabled to ensure
predictable behavior.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/Kconfig.debug | 10 ++++
mm/kstackwatch/Makefile | 6 +++
mm/kstackwatch/test.c | 115 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 131 insertions(+)
create mode 100644 mm/kstackwatch/test.c

diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index fdfc6e6d0dec..46c280280980 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -320,3 +320,13 @@ config KSTACK_WATCH
the recursive depth of the monitored function.

If unsure, say N.
+
+config KSTACK_WATCH_TEST
+ tristate "KStackWatch Test Module"
+ depends on KSTACK_WATCH
+ help
+ This module provides controlled stack exhaustion and overflow scenarios
+ to verify the functionality of KStackWatch. It is particularly useful
+ for development and validation of the KStackWatch mechanism.
+
+ If unsure, say N.
diff --git a/mm/kstackwatch/Makefile b/mm/kstackwatch/Makefile
index 84a46cb9a766..d007b8dcd1c6 100644
--- a/mm/kstackwatch/Makefile
+++ b/mm/kstackwatch/Makefile
@@ -1,2 +1,8 @@
obj-$(CONFIG_KSTACK_WATCH) += kstackwatch.o
kstackwatch-y := kernel.o stack.o watch.o
+
+obj-$(CONFIG_KSTACK_WATCH_TEST) += kstackwatch_test.o
+kstackwatch_test-y := test.o
+CFLAGS_test.o := -fno-inline \
+ -fno-optimize-sibling-calls \
+ -fno-pic -fno-pie -O0 -Og
diff --git a/mm/kstackwatch/test.c b/mm/kstackwatch/test.c
new file mode 100644
index 000000000000..76dbfb042067
--- /dev/null
+++ b/mm/kstackwatch/test.c
@@ -0,0 +1,115 @@
+// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/delay.h>
+#include <linux/kthread.h>
+#include <linux/module.h>
+#include <linux/prandom.h>
+#include <linux/printk.h>
+#include <linux/proc_fs.h>
+#include <linux/string.h>
+#include <linux/uaccess.h>
+
+#include "kstackwatch.h"
+
+MODULE_AUTHOR("Jinchao Wang");
+MODULE_DESCRIPTION("Simple KStackWatch Test Module");
+MODULE_LICENSE("GPL");
+
+static struct proc_dir_entry *test_proc;
+#define BUFFER_SIZE 4
+#define MAX_DEPTH 6
+
+/*
+ * Test Case 0: Write to the canary position directly (Canary Test)
+ * use a u64 buffer array to ensure the canary will be placed
+ * corrupt the stack canary using the debug function
+ */
+static void canary_test_write(void)
+{
+ u64 buffer[BUFFER_SIZE];
+
+ pr_info("starting %s\n", __func__);
+ ksw_watch_show();
+ ksw_watch_fire();
+
+ buffer[0] = 0;
+
+ /* make sure the compiler does not drop the assignment */
+ barrier_data(buffer);
+ pr_info("canary write test completed\n");
+}
+
+static ssize_t test_proc_write(struct file *file, const char __user *buffer,
+ size_t count, loff_t *pos)
+{
+ char cmd[256];
+ int test_num;
+
+ if (count >= sizeof(cmd))
+ return -EINVAL;
+
+ if (copy_from_user(cmd, buffer, count))
+ return -EFAULT;
+
+ cmd[count] = '\0';
+ strim(cmd);
+
+ pr_info("received command: %s\n", cmd);
+
+ if (sscanf(cmd, "test%d", &test_num) == 1) {
+ switch (test_num) {
+ case 0:
+ pr_info("triggering canary write test\n");
+ canary_test_write();
+ break;
+ default:
+ pr_err("Unknown test number %d\n", test_num);
+ return -EINVAL;
+ }
+ } else {
+ pr_err("invalid command format, expected 'test<N>'\n");
+ return -EINVAL;
+ }
+
+ return count;
+}
+
+static ssize_t test_proc_read(struct file *file, char __user *buffer,
+ size_t count, loff_t *pos)
+{
+ static const char usage[] =
+ "KStackWatch Simplified Test Module\n"
+ "==================================\n"
+ "Usage:\n"
+ " echo 'test0' > /proc/kstackwatch_test - Canary write test\n";
+
+ return simple_read_from_buffer(buffer, count, pos, usage,
+ strlen(usage));
+}
+
+static const struct proc_ops test_proc_ops = {
+ .proc_read = test_proc_read,
+ .proc_write = test_proc_write,
+};
+
+static int __init kstackwatch_test_init(void)
+{
+ test_proc = proc_create("kstackwatch_test", 0600, NULL, &test_proc_ops);
+ if (!test_proc) {
+ pr_err("Failed to create proc entry\n");
+ return -ENOMEM;
+ }
+ pr_info("module loaded\n");
+ return 0;
+}
+
+static void __exit kstackwatch_test_exit(void)
+{
+ if (test_proc)
+ remove_proc_entry("kstackwatch_test", NULL);
+ pr_info("module unloaded\n");
+}
+
+module_init(kstackwatch_test_init);
+module_exit(kstackwatch_test_exit);
--
2.43.0

Jinchao Wang
Sep 10, 2025, 1:33:20 AM
Extend the test module with a new test case (test1) that intentionally
overflows a local u64 buffer to corrupt the stack canary. This helps
validate detection of stack corruption under overflow conditions.

The proc interface is updated to document the new test:

- test1: stack canary overflow test

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/test.c | 28 +++++++++++++++++++++++++++-
1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/test.c b/mm/kstackwatch/test.c
index 76dbfb042067..ab1a3f92b5e8 100644
--- a/mm/kstackwatch/test.c
+++ b/mm/kstackwatch/test.c
@@ -40,6 +40,27 @@ static void canary_test_write(void)
pr_info("canary write test completed\n");
}

+/*
+ * Test Case 1: Stack Overflow (Canary Test)
+ * This function performs a single 64-bit write just past a u64 buffer
+ * to corrupt the stack canary in one operation
+ */
+static void canary_test_overflow(void)
+{
+ u64 buffer[BUFFER_SIZE];
+
+ pr_info("starting %s\n", __func__);
+ pr_info("buffer 0x%lx\n", (unsigned long)buffer);
+
+ /* intentionally overflow the u64 buffer. */
+ ((u64 *)buffer + BUFFER_SIZE)[0] = 0xdeadbeefdeadbeef;
+
+ /* make sure the compiler does not drop the assignment */
+ barrier_data(buffer);
+
+ pr_info("canary overflow test completed\n");
+}
+
static ssize_t test_proc_write(struct file *file, const char __user *buffer,
size_t count, loff_t *pos)
{
@@ -63,6 +84,10 @@ static ssize_t test_proc_write(struct file *file, const char __user *buffer,
pr_info("triggering canary write test\n");
canary_test_write();
break;
+ case 1:
+ pr_info("triggering canary overflow test\n");
+ canary_test_overflow();
+ break;
default:
pr_err("Unknown test number %d\n", test_num);
return -EINVAL;
@@ -82,7 +107,8 @@ static ssize_t test_proc_read(struct file *file, char __user *buffer,
"KStackWatch Simplified Test Module\n"
"==================================\n"
"Usage:\n"
- " echo 'test0' > /proc/kstackwatch_test - Canary write test\n";
+ " echo 'test0' > /proc/kstackwatch_test - Canary write test\n"
+ " echo 'test1' > /proc/kstackwatch_test - Canary overflow test\n";

return simple_read_from_buffer(buffer, count, pos, usage,
strlen(usage));
--
2.43.0

Jinchao Wang
Sep 10, 2025, 1:33:32 AM
Introduce a new test scenario to simulate silent stack corruption:

- silent_corruption_buggy():
exposes a local variable address globally without resetting it.
- silent_corruption_unwitting():
reads the exposed pointer and modifies the memory, simulating a routine
that unknowingly writes to another stack frame.
- silent_corruption_victim():
demonstrates the effect of silent corruption on unrelated local variables.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/test.c | 93 ++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 92 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/test.c b/mm/kstackwatch/test.c
index ab1a3f92b5e8..b10465381089 100644
--- a/mm/kstackwatch/test.c
+++ b/mm/kstackwatch/test.c
@@ -20,6 +20,9 @@ static struct proc_dir_entry *test_proc;
#define BUFFER_SIZE 4
#define MAX_DEPTH 6

+/* global variables for Silent corruption test */
+static u64 *g_corrupt_ptr;
+
/*
* Test Case 0: Write to the canary position directly (Canary Test)
* use a u64 buffer array to ensure the canary will be placed
@@ -61,6 +64,89 @@ static void canary_test_overflow(void)
pr_info("canary overflow test completed\n");
}

+static void do_something(int min_ms, int max_ms)
+{
+ u32 rand;
+
+ get_random_bytes(&rand, sizeof(rand));
+ rand = min_ms + rand % (max_ms - min_ms + 1);
+ msleep(rand);
+}
+
+static void silent_corruption_buggy(int i)
+{
+ u64 local_var;
+
+ pr_info("starting %s\n", __func__);
+
+ pr_info("%s %d local_var addr: 0x%lx\n", __func__, i,
+ (unsigned long)&local_var);
+ WRITE_ONCE(g_corrupt_ptr, &local_var);
+ do_something(0, 300);
+ /* buggy: return without resetting g_corrupt_ptr */
+}
+
+static int silent_corruption_unwitting(void *data)
+{
+ u64 *local_ptr;
+
+ pr_debug("starting %s\n", __func__);
+
+ do {
+ local_ptr = READ_ONCE(g_corrupt_ptr);
+ do_something(0, 300);
+ } while (!local_ptr);
+
+ local_ptr[0] = 0;
+
+ return 0;
+}
+
+static void silent_corruption_victim(int i)
+{
+ u64 local_var;
+
+ pr_debug("starting %s %dth\n", __func__, i);
+
+ /* local_var random in [0xff0000, 0xffffff] */
+ get_random_bytes(&local_var, sizeof(local_var));
+ local_var = 0xff0000 + (local_var & 0xffff);
+
+ pr_debug("%s local_var addr: 0x%lx\n", __func__,
+ (unsigned long)&local_var);
+
+ do_something(0, 100);
+
+ if (local_var >= 0xff0000 && local_var <= 0xffffff)
+ pr_info("%s %d happy with 0x%llx\n", __func__, i, local_var);
+ else
+ pr_info("%s %d unhappy with 0x%llx\n", __func__, i, local_var);
+}
+
+/*
+ * Test Case 2: Silent Corruption
+ * buggy() does not protect its local var correctly
+ * unwitting() simply does its intended work
+ * victim() is unaware of what happened
+ */
+static void silent_corruption_test(void)
+{
+ struct task_struct *unwitting;
+
+ pr_info("starting %s\n", __func__);
+ WRITE_ONCE(g_corrupt_ptr, NULL);
+
+ unwitting = kthread_run(silent_corruption_unwitting, NULL, "unwitting");
+ if (IS_ERR(unwitting)) {
+ pr_err("failed to create unwitting thread\n");
+ return;
+ }
+
+ silent_corruption_buggy(0);
+ for (int i = 0; i < 10; i++)
+ silent_corruption_victim(i);
+}
+
static ssize_t test_proc_write(struct file *file, const char __user *buffer,
size_t count, loff_t *pos)
{
@@ -88,6 +174,10 @@ static ssize_t test_proc_write(struct file *file, const char __user *buffer,
pr_info("triggering canary overflow test\n");
canary_test_overflow();
break;
+ case 2:
+ pr_info("triggering silent corruption test\n");
+ silent_corruption_test();
+ break;
default:
pr_err("Unknown test number %d\n", test_num);
return -EINVAL;
@@ -108,7 +198,8 @@ static ssize_t test_proc_read(struct file *file, char __user *buffer,
"==================================\n"
"Usage:\n"
" echo 'test0' > /proc/kstackwatch_test - Canary write test\n"
- " echo 'test1' > /proc/kstackwatch_test - Canary overflow test\n";
+ " echo 'test1' > /proc/kstackwatch_test - Canary overflow test\n"
+ " echo 'test2' > /proc/kstackwatch_test - Silent corruption test\n";

Jinchao Wang

Sep 10, 2025, 1:33:43 AM
Add a test that triggers stack writes across recursive calls, verifying
detection at specific recursion depths.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/test.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/test.c b/mm/kstackwatch/test.c
index b10465381089..6a75cd3e313d 100644
--- a/mm/kstackwatch/test.c
+++ b/mm/kstackwatch/test.c
@@ -147,6 +147,27 @@ static void silent_corruption_test(void)
silent_corruption_victim(i);
}

+/*
+ * Test Case 3: Recursive Call Corruption
+ * Test corruption detection at specified recursion depth
+ */
+static void recursive_corruption_test(int depth)
+{
+ u64 buffer[BUFFER_SIZE];
+
+ pr_info("recursive call at depth %d\n", depth);
+ pr_info("buffer 0x%lx\n", (unsigned long)buffer);
+ if (depth <= MAX_DEPTH)
+ recursive_corruption_test(depth + 1);
+
+ buffer[0] = depth;
+
+ /* make sure the compiler does not drop the assignment */
+ barrier_data(buffer);
+
+ pr_info("returning from depth %d\n", depth);
+}
+
static ssize_t test_proc_write(struct file *file, const char __user *buffer,
size_t count, loff_t *pos)
{
@@ -178,6 +199,11 @@ static ssize_t test_proc_write(struct file *file, const char __user *buffer,
pr_info("triggering silent corruption test\n");
silent_corruption_test();
break;
+ case 3:
+ pr_info("triggering recursive corruption test\n");
+ /* depth starts at 0 */
+ recursive_corruption_test(0);
+ break;
default:
pr_err("Unknown test number %d\n", test_num);
return -EINVAL;
@@ -199,7 +225,8 @@ static ssize_t test_proc_read(struct file *file, char __user *buffer,
"Usage:\n"
" echo 'test0' > /proc/kstackwatch_test - Canary write test\n"
" echo 'test1' > /proc/kstackwatch_test - Canary overflow test\n"
- " echo 'test2' > /proc/kstackwatch_test - Silent corruption test\n";
+ " echo 'test2' > /proc/kstackwatch_test - Silent corruption test\n"
+ " echo 'test3' > /proc/kstackwatch_test - Recursive corruption test\n";

Jinchao Wang

Sep 10, 2025, 1:33:56 AM
Provide a shell script to trigger test cases.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
tools/kstackwatch/kstackwatch_test.sh | 40 +++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
create mode 100755 tools/kstackwatch/kstackwatch_test.sh

diff --git a/tools/kstackwatch/kstackwatch_test.sh b/tools/kstackwatch/kstackwatch_test.sh
new file mode 100755
index 000000000000..61e171439ab6
--- /dev/null
+++ b/tools/kstackwatch/kstackwatch_test.sh
@@ -0,0 +1,40 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+echo "IMPORTANT: Before running, make sure you have updated the offset values!"
+
+usage() {
+ echo "Usage: $0 [0-3]"
+ echo " 0 - Canary Write Test"
+ echo " 1 - Canary Overflow Test"
+ echo " 2 - Silent Corruption Test"
+ echo " 3 - Recursive Corruption Test"
+}
+
+run_test() {
+ local test_num=$1
+ case "$test_num" in
+ 0) echo "canary_test_write+0x19" >/proc/kstackwatch
+ echo "test0" >/proc/kstackwatch_test ;;
+ 1) echo "canary_test_overflow+0x1a" >/proc/kstackwatch
+ echo "test1" >/proc/kstackwatch_test ;;
+ 2) echo "silent_corruption_victim+0x32 0:8" >/proc/kstackwatch
+ echo "test2" >/proc/kstackwatch_test ;;
+ 3) echo "recursive_corruption_test+0x21+3 0:8" >/proc/kstackwatch
+ echo "test3" >/proc/kstackwatch_test ;;
+ *) usage
+ exit 1 ;;
+ esac
+ # Reset watch after test
+ echo >/proc/kstackwatch
+}
+
+# Check root and module
+[ "$EUID" -ne 0 ] && echo "Run as root" && exit 1
+for f in /proc/kstackwatch /proc/kstackwatch_test; do
+ [ ! -f "$f" ] && echo "$f not found" && exit 1
+done
+
+# Run
+[ -z "$1" ] && { usage; exit 0; }
+run_test "$1"
--
2.43.0

Jinchao Wang

Sep 10, 2025, 1:34:08 AM
Add a new documentation file for KStackWatch, explaining its
purpose, motivation, key features, configuration format, module parameters,
implementation notes, limitations, and testing instructions.

Update MAINTAINERS to include Jinchao Wang as the maintainer for associated
files.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
Documentation/dev-tools/kstackwatch.rst | 94 +++++++++++++++++++++++++
MAINTAINERS | 7 ++
2 files changed, 101 insertions(+)
create mode 100644 Documentation/dev-tools/kstackwatch.rst

diff --git a/Documentation/dev-tools/kstackwatch.rst b/Documentation/dev-tools/kstackwatch.rst
new file mode 100644
index 000000000000..f741de08ca56
--- /dev/null
+++ b/Documentation/dev-tools/kstackwatch.rst
@@ -0,0 +1,94 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================================
+KStackWatch: Kernel Stack Watch
+====================================
+
+Overview
+========
+KStackWatch is a lightweight debugging tool designed to detect
+kernel stack corruption in real time. It helps developers capture the
+moment corruption occurs, rather than only observing a later crash.
+
+Motivation
+==========
+Stack corruption may originate in one function but manifest much later
+with no direct call trace linking the two. This makes such issues
+extremely difficult to diagnose. KStackWatch addresses this by combining
+hardware breakpoints with kprobe and fprobe instrumentation, monitoring
+stack canaries or local variables at the point of corruption.
+
+Key Features
+============
+- Lightweight overhead:
+ Minimal runtime cost, preserving bug reproducibility.
+- Real-time detection:
+ Detect stack corruption immediately.
+- Flexible configuration:
+ Control via a procfs interface.
+- Depth filtering:
+ Optional recursion depth tracking per task.
+
+Configuration
+=============
+The control file is created at::
+
+ /proc/kstackwatch
+
+To configure, write a string in the following format::
+
+ function+ip_offset[+depth] [local_var_offset:local_var_len]
+ - function : name of the target function
+ - ip_offset : instruction pointer offset within the function
+ - depth : recursion depth to watch, starting from 0
+ - local_var_offset : offset from the stack pointer at function+ip_offset
+ - local_var_len : length of the local variable (1, 2, 4, 8)
+
+Fields
+------
+- ``function``:
+ Name of the target function to watch.
+- ``ip_offset``:
+ Instruction pointer offset within the function.
+- ``depth`` (optional):
+ Maximum recursion depth for the watch.
+- ``local_var_offset:local_var_len`` (optional):
+ A region of a local variable to monitor, relative to the stack pointer.
+ If not given, KStackWatch monitors the stack canary by default.
+
+Examples
+--------
+1. Watch the canary at the entry of ``canary_test_write``::
+
+ echo 'canary_test_write+0x12' > /proc/kstackwatch
+
+2. Watch a local variable of 8 bytes at offset 0 in
+ ``silent_corruption_victim``::
+
+ echo 'silent_corruption_victim+0x7f 0:8' > /proc/kstackwatch
+
+Module Parameters
+=================
+``panic_on_catch`` (bool)
+ - If true, trigger a kernel panic immediately on detecting stack
+ corruption.
+ - Default is false (log a message only).
+
+Implementation Notes
+====================
+- Hardware breakpoints are preallocated at watch start.
+- Function exit is monitored using ``fprobe``.
+- Per-task depth tracking is used to handle recursion across scheduling.
+- The procfs interface allows dynamic reconfiguration at runtime.
+- Active state is cleared before applying new settings.
+
+Limitations
+===========
+- Only one active watch can be configured at a time (singleton).
+- Local variable offset and size must be known in advance.
+
+Testing
+=======
+KStackWatch includes a companion test module (`kstackwatch_test`) and
+a helper script (`kstackwatch_test.sh`) to exercise different stack
+corruption scenarios:
diff --git a/MAINTAINERS b/MAINTAINERS
index cd7ff55b5d32..076512afddcc 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13355,6 +13355,13 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
F: Documentation/dev-tools/kselftest*
F: tools/testing/selftests/

+KERNEL STACK WATCH
+M: Jinchao Wang <wangjin...@gmail.com>
+S: Maintained
+F: Documentation/dev-tools/kstackwatch.rst
+F: mm/kstackwatch/
+F: tools/kstackwatch/
+
KERNEL SMB3 SERVER (KSMBD)
M: Namjae Jeon <linki...@kernel.org>
M: Namjae Jeon <linki...@samba.org>
--
2.43.0

Masami Hiramatsu

Sep 10, 2025, 8:46:20 PM
to Jinchao Wang, Andrew Morton, Peter Zijlstra, Mike Rapoport, Naveen N . Rao, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, kasa...@googlegroups.com, David S. Miller, Steven Rostedt, Mathieu Desnoyers, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan, Thomas Gleixner, Borislav Petkov, Dave Hansen, x...@kernel.org, H. Peter Anvin, linu...@kvack.org, linux-tra...@vger.kernel.org, linux-pe...@vger.kernel.org, linux-...@vger.kernel.org
Hi Jinchao,
Please do not expose the arch dependent symbol. Instead, you should
expose an arch independent wrapper.

Anyway, you also need to share the same code with arch_install_hw_breakpoint()
like below;

Thanks,


diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 89135229ed21..2f3c5406999e 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -84,6 +84,28 @@ int decode_dr7(unsigned long dr7, int bpnum, unsigned *len, unsigned *type)
return (dr7 >> (bpnum * DR_ENABLE_SIZE)) & 0x3;
}

+static void __arch_install_hw_breakpoint(struct perf_event *bp, int regno)
+{
+ struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+ unsigned long *dr7;
+
+ set_debugreg(info->address, regno);
+ __this_cpu_write(cpu_debugreg[regno], info->address);
+
+ dr7 = this_cpu_ptr(&cpu_dr7);
+ *dr7 |= encode_dr7(regno, info->len, info->type);
+
+ /*
+ * Ensure we first write cpu_dr7 before we set the DR7 register.
+ * This ensures an NMI never see cpu_dr7 0 when DR7 is not.
+ */
+ barrier();
+
+ set_debugreg(*dr7, 7);
+ if (info->mask)
+ amd_set_dr_addr_mask(info->mask, regno);
+}
+
/*
* Install a perf counter breakpoint.
*
@@ -95,8 +117,6 @@ int decode_dr7(unsigned long dr7, int bpnum, unsigned *len, unsigned *type)
*/
int arch_install_hw_breakpoint(struct perf_event *bp)
{
- struct arch_hw_breakpoint *info = counter_arch_bp(bp);
- unsigned long *dr7;
int i;

lockdep_assert_irqs_disabled();
@@ -113,22 +133,7 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
if (WARN_ONCE(i == HBP_NUM, "Can't find any breakpoint slot"))
return -EBUSY;

- set_debugreg(info->address, i);
- __this_cpu_write(cpu_debugreg[i], info->address);
-
- dr7 = this_cpu_ptr(&cpu_dr7);
- *dr7 |= encode_dr7(i, info->len, info->type);
-
- /*
- * Ensure we first write cpu_dr7 before we set the DR7 register.
- * This ensures an NMI never see cpu_dr7 0 when DR7 is not.
- */
- barrier();
-
- set_debugreg(*dr7, 7);
- if (info->mask)
- amd_set_dr_addr_mask(info->mask, i);
-
+ __arch_install_hw_breakpoint(bp, i);
return 0;
}

@@ -146,8 +151,6 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
*/
int arch_reinstall_hw_breakpoint(struct perf_event *bp)
{
- struct arch_hw_breakpoint *info = counter_arch_bp(bp);
- unsigned long *dr7;
int i;

lockdep_assert_irqs_disabled();
@@ -162,22 +165,7 @@ int arch_reinstall_hw_breakpoint(struct perf_event *bp)
if (WARN_ONCE(i == HBP_NUM, "Can't find a matching breakpoint slot"))
return -EINVAL;

- set_debugreg(info->address, i);
- __this_cpu_write(cpu_debugreg[i], info->address);
-
- dr7 = this_cpu_ptr(&cpu_dr7);
- *dr7 |= encode_dr7(i, info->len, info->type);
-
- /*
- * Ensure we first write cpu_dr7 before we set the DR7 register.
- * This ensures an NMI never see cpu_dr7 0 when DR7 is not.
- */
- barrier();
-
- set_debugreg(*dr7, 7);
- if (info->mask)
- amd_set_dr_addr_mask(info->mask, i);
-
+ __arch_install_hw_breakpoint(bp, i);
return 0;
}
EXPORT_SYMBOL_GPL(arch_reinstall_hw_breakpoint);

--
Masami Hiramatsu (Google) <mhir...@kernel.org>

Jinchao Wang

Sep 10, 2025, 9:02:09 PM
to Masami Hiramatsu (Google), Andrew Morton, Peter Zijlstra, Mike Rapoport, Naveen N . Rao, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, kasa...@googlegroups.com, David S. Miller, Steven Rostedt, Mathieu Desnoyers, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan, Thomas Gleixner, Borislav Petkov, Dave Hansen, x...@kernel.org, H. Peter Anvin, linu...@kvack.org, linux-tra...@vger.kernel.org, linux-pe...@vger.kernel.org, linux-...@vger.kernel.org
You are right. The arch-dependent symbol has been removed and the code
is shared with arch_install_hw_breakpoint() in the next version of the patch.

https://lore.kernel.org/lkml/20250910093951.1330...@gmail.com
Thanks,
Jinchao

Jinchao Wang

Sep 12, 2025, 1:51:51 AM
FYI: The current patchset contains lockdep issues due to the kprobe handler
running in NMI context. Please do not spend time reviewing this version.
Thanks.
--
Jinchao

Alexander Potapenko

Sep 12, 2025, 2:42:33 AM
to Jinchao Wang, Andrew Morton, Masami Hiramatsu, Peter Zijlstra, Mike Rapoport, Naveen N . Rao, Andrey Ryabinin, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, kasa...@googlegroups.com, David S. Miller, Steven Rostedt, Mathieu Desnoyers, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan, Thomas Gleixner, Borislav Petkov, Dave Hansen, x...@kernel.org, H. Peter Anvin, linu...@kvack.org, linux-tra...@vger.kernel.org, linux-pe...@vger.kernel.org, linux-...@vger.kernel.org
Hi Jinchao,

In the next version, could you please elaborate more on the user
workflow of this tool?
It occurs to me that in order to detect the corruption the user has to
know precisely in which function the corruption is happening, which is
usually the hardest part.

--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Liana Sebastian
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

Jinchao Wang

Sep 12, 2025, 4:15:45 AM
to Alexander Potapenko, Andrew Morton, Masami Hiramatsu, Peter Zijlstra, Mike Rapoport, Naveen N . Rao, Andrey Ryabinin, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, kasa...@googlegroups.com, David S. Miller, Steven Rostedt, Mathieu Desnoyers, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan, Thomas Gleixner, Borislav Petkov, Dave Hansen, x...@kernel.org, H. Peter Anvin, linu...@kvack.org, linux-tra...@vger.kernel.org, linux-pe...@vger.kernel.org, linux-...@vger.kernel.org
On 9/12/25 14:41, Alexander Potapenko wrote:
> On Fri, Sep 12, 2025 at 7:51 AM Jinchao Wang <wangjin...@gmail.com> wrote:
>>
>> FYI: The current patchset contains lockdep issues due to the kprobe handler
>> running in NMI context. Please do not spend time reviewing this version.
>> Thanks.
>> --
>> Jinchao
>
> Hi Jinchao,
>
> In the next version, could you please elaborate more on the user
> workflow of this tool?
> It occurs to me that in order to detect the corruption the user has to
> know precisely in which function the corruption is happening, which is
> usually the hardest part.
>

Hi Alexander,

Thank you for the question. I agree with your observation about the
challenge of detecting stack corruption.

Stack corruption debugging typically involves three steps:
1. Detect the corruption
2. Find the root cause
3. Fix the issue

Your question addresses step 1, which is indeed a challenging
part. Currently, we have several approaches for detection:

- Compile with CONFIG_STACKPROTECTOR_STRONG to add stack canaries
and trigger __stack_chk_fail() on corruption
- Manual detection when local variables are unexpectedly modified
(though this is quite difficult in practice)

However, KStackWatch is specifically designed for step 2 rather than
step 1. Let me illustrate with a complex scenario:

In one actual case, the corruption path was:
- A calls B (the buggy function) through N1 call levels
- B stores its stack variable L1's address in P (through a global
variable or queue or list...)
- C (the victim), called by A through N2 levels, unexpectedly has a
canary or local variable L2 whose address overlaps with L1
- D uses P in a separate task (N3 call levels deep), which modifies
the value of L1, and L2 is corrupted
- C finds the corruption

The only clue might be identifying function D first, which then leads
us to B through P.

Key advantages of KStackWatch:
- Lightweight overhead that doesn't reduce reproduction probability
- Real-time capability to identify corruption exactly when it happens
- Precise location tracking of where corruptions occur

KStackWatch helps identify function D directly, bypassing the complex
call chains (N1, N2, N3) and intermediate functions. Once we locate D,
we can trace back through the corruption path and resolve the issue.

Does this clarify the tool's intended workflow?

--
Jinchao