[PATCH v4 00/21] mm/ksw: Introduce real-time KStackWatch debugging tool

Jinchao Wang

Sep 12, 2025, 6:12:00 AM
to Andrew Morton, Masami Hiramatsu, Peter Zijlstra, Mike Rapoport, Alexander Potapenko, Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x...@kernel.org, H. Peter Anvin, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko, Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt, Kees Cook, Alice Ryhl, Sami Tolvanen, Miguel Ojeda, Masahiro Yamada, Rong Xu, Naveen N Rao, David Kaplan, Andrii Nakryiko, Jinjie Ruan, Nam Cao, work...@vger.kernel.org, linu...@vger.kernel.org, linux-...@vger.kernel.org, linux-pe...@vger.kernel.org, linu...@kvack.org, ll...@lists.linux.dev, Andrey Ryabinin, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, kasa...@googlegroups.com, David S. Miller, Mathieu Desnoyers, linux-tra...@vger.kernel.org, Jinchao Wang
This patch series introduces KStackWatch, a lightweight kernel debugging tool
for detecting kernel stack corruption in real time.

The motivation comes from scenarios where corruption occurs silently in one function
but manifests later as a crash in another, with no direct call trace linking the two.
Other tools may fail to reproduce the issue because of their heavy overhead. Such bugs
are often extremely hard to debug with existing tools.

I demonstrate this scenario in test2 (silent corruption test).

KStackWatch works by combining a hardware breakpoint with kprobe and fprobe.
It can watch a stack canary or a selected local variable and detect the moment the
corruption actually occurs. This allows developers to pinpoint the real source rather
than only observing the final crash.

Key features include:

- Lightweight overhead with minimal impact on bug reproducibility
- Real-time detection of stack corruption
- Simple configuration through `/proc/kstackwatch`
- Support for a recursion depth filter

To validate the approach, the patch includes a test module and a test script.

---
Changelog

V4:
* Solve the lockdep issues with:
  * per-task KStackWatch context to track depth
  * atomic flag to protect watched_addr
* Use refactored version of arch_reinstall_hw_breakpoint

Patches 1–3 of this series are also used in the wprobe work proposed by
Masami Hiramatsu, so there may be some overlap between our patches.
Patch 3 comes directly from Masami Hiramatsu (thanks).

V3:
Main changes:
* Use modify_wide_hw_breakpoint_local() (from Masami)
* Add atomic flag to restrict /proc/kstackwatch to a single opener
* Protect stack probe with an atomic PID flag
* Handle CPU hotplug for watchpoints
* Add preempt_disable/enable in ksw_watch_on_local_cpu()
* Introduce const struct ksw_config *ksw_get_config(void) and use it
* Switch to global watch_attr, remove struct watch_info
* Validate local_var_len in parser()
* Handle case when canary is not found
* Use dump_stack() instead of show_regs() to allow module build

Cleanups:
* Reduce logging and comments
* Format logs with KBUILD_MODNAME
* Remove unused headers

Documentation:
* Add new document

V2:
https://lore.kernel.org/all/20250904002126.1514...@gmail.com/
* Make hardware breakpoint and stack operations architecture-independent.

V1:
https://lore.kernel.org/all/20250828073311.1116...@gmail.com/
Core Implementation
* Replaced kretprobe with fprobe for function exit hooking, as suggested
by Masami Hiramatsu
* Introduced per-task depth logic to track recursion across scheduling
* Removed the use of workqueue for a more efficient corruption check
* Reordered patches for better logical flow
* Simplified and improved commit messages throughout the series
* Removed initial archcheck which should be improved later


Testing and Architecture

* Replaced the multiple-thread test with silent corruption test
* Split self-tests into a separate patch to improve clarity.

Maintenance
* Added a new entry for KStackWatch to the MAINTAINERS file.

RFC:
https://lore.kernel.org/lkml/20250818122720.4349...@gmail.com/
---

The series is structured as follows:

Jinchao Wang (20):
x86/hw_breakpoint: Unify breakpoint install/uninstall
x86/hw_breakpoint: Add arch_reinstall_hw_breakpoint
mm/ksw: add build system support
mm/ksw: add ksw_config struct and parser
mm/ksw: add singleton /proc/kstackwatch interface
mm/ksw: add HWBP pre-allocation
mm/ksw: Add atomic ksw_watch_on() and ksw_watch_off()
mm/ksw: support CPU hotplug
sched: add per-task KStackWatch context
mm/ksw: add probe management helpers
mm/ksw: resolve stack watch addr and len
mm/ksw: manage probe and HWBP lifecycle via procfs
mm/ksw: add self-debug helpers
mm/ksw: add test module
mm/ksw: add stack overflow test
mm/ksw: add silent corruption test case
mm/ksw: add recursive stack corruption test
tools/ksw: add test script
docs: add KStackWatch document
MAINTAINERS: add entry for KStackWatch

Masami Hiramatsu (Google) (1):
HWBP: Add modify_wide_hw_breakpoint_local() API

Documentation/dev-tools/kstackwatch.rst | 94 +++++++++
MAINTAINERS | 8 +
arch/Kconfig | 10 +
arch/x86/Kconfig | 1 +
arch/x86/include/asm/hw_breakpoint.h | 8 +
arch/x86/kernel/hw_breakpoint.c | 148 +++++++------
include/linux/hw_breakpoint.h | 6 +
include/linux/kstackwatch_types.h | 13 ++
include/linux/sched.h | 5 +
kernel/events/hw_breakpoint.c | 36 ++++
mm/Kconfig.debug | 21 ++
mm/Makefile | 1 +
mm/kstackwatch/Makefile | 8 +
mm/kstackwatch/kernel.c | 239 +++++++++++++++++++++
mm/kstackwatch/kstackwatch.h | 53 +++++
mm/kstackwatch/stack.c | 194 ++++++++++++++++++
mm/kstackwatch/test.c | 262 ++++++++++++++++++++++++
mm/kstackwatch/watch.c | 181 ++++++++++++++++
tools/kstackwatch/kstackwatch_test.sh | 40 ++++
19 files changed, 1266 insertions(+), 62 deletions(-)
create mode 100644 Documentation/dev-tools/kstackwatch.rst
create mode 100644 include/linux/kstackwatch_types.h
create mode 100644 mm/kstackwatch/Makefile
create mode 100644 mm/kstackwatch/kernel.c
create mode 100644 mm/kstackwatch/kstackwatch.h
create mode 100644 mm/kstackwatch/stack.c
create mode 100644 mm/kstackwatch/test.c
create mode 100644 mm/kstackwatch/watch.c
create mode 100755 tools/kstackwatch/kstackwatch_test.sh

--
2.43.0

Jinchao Wang

Sep 12, 2025, 6:12:04 AM
Consolidate breakpoint management to reduce code duplication.
The diffstat was misleading, so the stripped code size is compared instead.
After refactoring, it is reduced from 11976 bytes to 11448 bytes on my
x86_64 system built with clang.

This also makes it easier to introduce arch_reinstall_hw_breakpoint().

In addition, include linux/types.h to fix a missing build dependency.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
arch/x86/include/asm/hw_breakpoint.h | 6 ++
arch/x86/kernel/hw_breakpoint.c | 141 +++++++++++++++------------
2 files changed, 84 insertions(+), 63 deletions(-)

diff --git a/arch/x86/include/asm/hw_breakpoint.h b/arch/x86/include/asm/hw_breakpoint.h
index 0bc931cd0698..aa6adac6c3a2 100644
--- a/arch/x86/include/asm/hw_breakpoint.h
+++ b/arch/x86/include/asm/hw_breakpoint.h
@@ -5,6 +5,7 @@
#include <uapi/asm/hw_breakpoint.h>

#define __ARCH_HW_BREAKPOINT_H
+#include <linux/types.h>

/*
* The name should probably be something dealt in
@@ -18,6 +19,11 @@ struct arch_hw_breakpoint {
u8 type;
};

+enum bp_slot_action {
+ BP_SLOT_ACTION_INSTALL,
+ BP_SLOT_ACTION_UNINSTALL,
+};
+
#include <linux/kdebug.h>
#include <linux/percpu.h>
#include <linux/list.h>
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index b01644c949b2..3658ace4bd8d 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -48,7 +48,6 @@ static DEFINE_PER_CPU(unsigned long, cpu_debugreg[HBP_NUM]);
*/
static DEFINE_PER_CPU(struct perf_event *, bp_per_reg[HBP_NUM]);

-
static inline unsigned long
__encode_dr7(int drnum, unsigned int len, unsigned int type)
{
@@ -85,96 +84,112 @@ int decode_dr7(unsigned long dr7, int bpnum, unsigned *len, unsigned *type)
}

/*
- * Install a perf counter breakpoint.
- *
- * We seek a free debug address register and use it for this
- * breakpoint. Eventually we enable it in the debug control register.
- *
- * Atomic: we hold the counter->ctx->lock and we only handle variables
- * and registers local to this cpu.
+ * We seek a slot and change it or keep it based on the action.
+ * Returns slot number on success, negative error on failure.
+ * Must be called with IRQs disabled.
*/
-int arch_install_hw_breakpoint(struct perf_event *bp)
+static int manage_bp_slot(struct perf_event *bp, enum bp_slot_action action)
{
- struct arch_hw_breakpoint *info = counter_arch_bp(bp);
- unsigned long *dr7;
- int i;
-
- lockdep_assert_irqs_disabled();
+ struct perf_event *old_bp;
+ struct perf_event *new_bp;
+ int slot;
+
+ switch (action) {
+ case BP_SLOT_ACTION_INSTALL:
+ old_bp = NULL;
+ new_bp = bp;
+ break;
+ case BP_SLOT_ACTION_UNINSTALL:
+ old_bp = bp;
+ new_bp = NULL;
+ break;
+ default:
+ return -EINVAL;
+ }

- for (i = 0; i < HBP_NUM; i++) {
- struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
+ for (slot = 0; slot < HBP_NUM; slot++) {
+ struct perf_event **curr = this_cpu_ptr(&bp_per_reg[slot]);

- if (!*slot) {
- *slot = bp;
- break;
+ if (*curr == old_bp) {
+ *curr = new_bp;
+ return slot;
}
}

- if (WARN_ONCE(i == HBP_NUM, "Can't find any breakpoint slot"))
- return -EBUSY;
+ if (old_bp) {
+ WARN_ONCE(1, "Can't find matching breakpoint slot");
+ return -EINVAL;
+ }
+
+ WARN_ONCE(1, "No free breakpoint slots");
+ return -EBUSY;
+}
+
+static void setup_hwbp(struct arch_hw_breakpoint *info, int slot, bool enable)
+{
+ unsigned long dr7;

- set_debugreg(info->address, i);
- __this_cpu_write(cpu_debugreg[i], info->address);
+ set_debugreg(info->address, slot);
+ __this_cpu_write(cpu_debugreg[slot], info->address);

- dr7 = this_cpu_ptr(&cpu_dr7);
- *dr7 |= encode_dr7(i, info->len, info->type);
+ dr7 = this_cpu_read(cpu_dr7);
+ if (enable)
+ dr7 |= encode_dr7(slot, info->len, info->type);
+ else
+ dr7 &= ~__encode_dr7(slot, info->len, info->type);

/*
- * Ensure we first write cpu_dr7 before we set the DR7 register.
- * This ensures an NMI never see cpu_dr7 0 when DR7 is not.
+ * Enabling:
+ * Ensure we first write cpu_dr7 before we set the DR7 register.
+ * This ensures an NMI never see cpu_dr7 0 when DR7 is not.
*/
+ if (enable)
+ this_cpu_write(cpu_dr7, dr7);
+
barrier();

- set_debugreg(*dr7, 7);
+ set_debugreg(dr7, 7);
+
if (info->mask)
- amd_set_dr_addr_mask(info->mask, i);
+ amd_set_dr_addr_mask(enable ? info->mask : 0, slot);

- return 0;
+ /*
+ * Disabling:
+ * Ensure the write to cpu_dr7 is after we've set the DR7 register.
+ * This ensures an NMI never see cpu_dr7 0 when DR7 is not.
+ */
+ if (!enable)
+ this_cpu_write(cpu_dr7, dr7);
}

/*
- * Uninstall the breakpoint contained in the given counter.
- *
- * First we search the debug address register it uses and then we disable
- * it.
- *
- * Atomic: we hold the counter->ctx->lock and we only handle variables
- * and registers local to this cpu.
+ * find suitable breakpoint slot and set it up based on the action
*/
-void arch_uninstall_hw_breakpoint(struct perf_event *bp)
+static int arch_manage_bp(struct perf_event *bp, enum bp_slot_action action)
{
- struct arch_hw_breakpoint *info = counter_arch_bp(bp);
- unsigned long dr7;
- int i;
+ struct arch_hw_breakpoint *info;
+ int slot;

lockdep_assert_irqs_disabled();

- for (i = 0; i < HBP_NUM; i++) {
- struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
-
- if (*slot == bp) {
- *slot = NULL;
- break;
- }
- }
-
- if (WARN_ONCE(i == HBP_NUM, "Can't find any breakpoint slot"))
- return;
+ slot = manage_bp_slot(bp, action);
+ if (slot < 0)
+ return slot;

- dr7 = this_cpu_read(cpu_dr7);
- dr7 &= ~__encode_dr7(i, info->len, info->type);
+ info = counter_arch_bp(bp);
+ setup_hwbp(info, slot, action != BP_SLOT_ACTION_UNINSTALL);

- set_debugreg(dr7, 7);
- if (info->mask)
- amd_set_dr_addr_mask(0, i);
+ return 0;
+}

- /*
- * Ensure the write to cpu_dr7 is after we've set the DR7 register.
- * This ensures an NMI never see cpu_dr7 0 when DR7 is not.
- */
- barrier();
+int arch_install_hw_breakpoint(struct perf_event *bp)
+{
+ return arch_manage_bp(bp, BP_SLOT_ACTION_INSTALL);
+}

- this_cpu_write(cpu_dr7, dr7);
+void arch_uninstall_hw_breakpoint(struct perf_event *bp)
+{
+ arch_manage_bp(bp, BP_SLOT_ACTION_UNINSTALL);
}

static int arch_bp_generic_len(int x86_len)
--
2.43.0
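
The enable/disable handling in setup_hwbp() comes down to setting or clearing one slot's bits in DR7. The following is a minimal userspace sketch of that bit arithmetic, not the kernel code itself. The constants are assumed to mirror the x86 debug register definitions, and the names dr7_slot_bits() and dr7_update() are hypothetical. The len and type arguments are the already-encoded x86 values, of which only the low four bits reach DR7.

```c
#include <assert.h>

/* Assumed to mirror the x86 debug register layout. */
#define DR_CONTROL_SHIFT   16      /* control fields start at bit 16 */
#define DR_CONTROL_SIZE    4       /* four control bits per slot */
#define DR_ENABLE_SIZE     2       /* two enable bits per slot */
#define DR_GLOBAL_ENABLE   0x2UL   /* global enable bit within a slot's pair */
#define DR_GLOBAL_SLOWDOWN 0x200UL /* global exact-match bit */

/* Bits owned by one slot: its len/type control field plus its enable bit. */
static unsigned long dr7_slot_bits(int slot, unsigned int len, unsigned int type)
{
	unsigned long bits = (len | type) & 0xf;

	assert(slot >= 0 && slot < 4);
	bits <<= DR_CONTROL_SHIFT + slot * DR_CONTROL_SIZE;
	bits |= DR_GLOBAL_ENABLE << (slot * DR_ENABLE_SIZE);
	return bits;
}

/*
 * Return the new DR7 value after enabling or disabling one slot.
 * Enabling also sets the slowdown bit; disabling leaves it untouched.
 */
static unsigned long dr7_update(unsigned long dr7, int slot, unsigned int len,
				unsigned int type, int enable)
{
	if (enable)
		return dr7 | dr7_slot_bits(slot, len, type) | DR_GLOBAL_SLOWDOWN;
	return dr7 & ~dr7_slot_bits(slot, len, type);
}
```

The sketch only computes values. The ordering of the cpu_dr7 write relative to the real DR7 write, which the patch handles separately for the enable and disable paths, is outside its scope.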

Jinchao Wang

Sep 12, 2025, 6:12:10 AM
The new arch_reinstall_hw_breakpoint() function can be used in an
atomic context, unlike the more expensive free and re-allocation path.
This allows callers to efficiently re-establish an existing breakpoint.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
arch/x86/include/asm/hw_breakpoint.h | 2 ++
arch/x86/kernel/hw_breakpoint.c | 9 +++++++++
2 files changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/hw_breakpoint.h b/arch/x86/include/asm/hw_breakpoint.h
index aa6adac6c3a2..c22cc4e87fc5 100644
--- a/arch/x86/include/asm/hw_breakpoint.h
+++ b/arch/x86/include/asm/hw_breakpoint.h
@@ -21,6 +21,7 @@ struct arch_hw_breakpoint {

enum bp_slot_action {
BP_SLOT_ACTION_INSTALL,
+ BP_SLOT_ACTION_REINSTALL,
BP_SLOT_ACTION_UNINSTALL,
};

@@ -65,6 +66,7 @@ extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,


int arch_install_hw_breakpoint(struct perf_event *bp);
+int arch_reinstall_hw_breakpoint(struct perf_event *bp);
void arch_uninstall_hw_breakpoint(struct perf_event *bp);
void hw_breakpoint_pmu_read(struct perf_event *bp);
void hw_breakpoint_pmu_unthrottle(struct perf_event *bp);
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 3658ace4bd8d..29c9369264d4 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -99,6 +99,10 @@ static int manage_bp_slot(struct perf_event *bp, enum bp_slot_action action)
old_bp = NULL;
new_bp = bp;
break;
+ case BP_SLOT_ACTION_REINSTALL:
+ old_bp = bp;
+ new_bp = bp;
+ break;
case BP_SLOT_ACTION_UNINSTALL:
old_bp = bp;
new_bp = NULL;
@@ -187,6 +191,11 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
return arch_manage_bp(bp, BP_SLOT_ACTION_INSTALL);
}

+int arch_reinstall_hw_breakpoint(struct perf_event *bp)
+{
+ return arch_manage_bp(bp, BP_SLOT_ACTION_REINSTALL);
+}
+
void arch_uninstall_hw_breakpoint(struct perf_event *bp)
{
arch_manage_bp(bp, BP_SLOT_ACTION_UNINSTALL);
--
2.43.0
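
With all three actions in place, the slot search in manage_bp_slot() can be read as "find the slot that currently holds old_bp and store new_bp there". A minimal userspace sketch of that idea follows. A plain array stands in for the per-CPU bp_per_reg[] table, NUM_SLOTS stands in for HBP_NUM, and the names manage_slot() and slot_action are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NUM_SLOTS 4 /* stand-in for HBP_NUM */

enum slot_action { SLOT_INSTALL, SLOT_REINSTALL, SLOT_UNINSTALL };

/* Stand-in for the per-CPU slot table; NULL marks a free slot. */
static const void *slots[NUM_SLOTS];

/*
 * Replace the first slot that holds old_owner with new_owner.
 * Returns the slot index, -EBUSY if an install finds no free slot,
 * or -EINVAL if the breakpoint to reinstall or uninstall is not present.
 */
static int manage_slot(const void *bp, enum slot_action action)
{
	const void *old_owner, *new_owner;
	int i;

	assert(bp != NULL);

	switch (action) {
	case SLOT_INSTALL:	/* claim a free slot */
		old_owner = NULL;
		new_owner = bp;
		break;
	case SLOT_REINSTALL:	/* keep the slot already held */
		old_owner = bp;
		new_owner = bp;
		break;
	case SLOT_UNINSTALL:	/* release the slot held */
		old_owner = bp;
		new_owner = NULL;
		break;
	default:
		return -EINVAL;
	}

	for (i = 0; i < NUM_SLOTS; i++) {
		if (slots[i] == old_owner) {
			slots[i] = new_owner;
			return i;
		}
	}

	return old_owner ? -EINVAL : -EBUSY;
}
```

In this model a reinstall never changes slot ownership. It only locates the existing slot so that its registers can be rewritten.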

Jinchao Wang

Sep 12, 2025, 6:12:14 AM
From: "Masami Hiramatsu (Google)" <mhir...@kernel.org>

Add modify_wide_hw_breakpoint_local(), an arch-wide interface that lets
hwbp users update the watch address on-line. It is available if the
arch supports CONFIG_HAVE_REINSTALL_HW_BREAKPOINT.
Note that the type can only be changed to a compatible type,
because this does not release and reserve the hwbp slot based on type.
For instance, you cannot change HW_BREAKPOINT_W to HW_BREAKPOINT_X.

Signed-off-by: Masami Hiramatsu (Google) <mhir...@kernel.org>
---
arch/Kconfig | 10 ++++++++++
arch/x86/Kconfig | 1 +
include/linux/hw_breakpoint.h | 6 ++++++
kernel/events/hw_breakpoint.c | 36 +++++++++++++++++++++++++++++++++++
4 files changed, 53 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index d1b4ffd6e085..e4787fc814df 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -418,6 +418,16 @@ config HAVE_MIXED_BREAKPOINTS_REGS
Select this option if your arch implements breakpoints under the
latter fashion.

+config HAVE_REINSTALL_HW_BREAKPOINT
+ bool
+ depends on HAVE_HW_BREAKPOINT
+ help
+ Depending on the arch implementation of hardware breakpoints,
+ some of them are able to update the breakpoint configuration
+ without release and reserve the hardware breakpoint register.
+ What configuration is able to update depends on hardware and
+ software implementation.
+
config HAVE_USER_RETURN_NOTIFIER
bool

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 58d890fe2100..49d4ce2af94c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -247,6 +247,7 @@ config X86
select HAVE_FUNCTION_TRACER
select HAVE_GCC_PLUGINS
select HAVE_HW_BREAKPOINT
+ select HAVE_REINSTALL_HW_BREAKPOINT
select HAVE_IOREMAP_PROT
select HAVE_IRQ_EXIT_ON_IRQ_STACK if X86_64
select HAVE_IRQ_TIME_ACCOUNTING
diff --git a/include/linux/hw_breakpoint.h b/include/linux/hw_breakpoint.h
index db199d653dd1..ea373f2587f8 100644
--- a/include/linux/hw_breakpoint.h
+++ b/include/linux/hw_breakpoint.h
@@ -81,6 +81,9 @@ register_wide_hw_breakpoint(struct perf_event_attr *attr,
perf_overflow_handler_t triggered,
void *context);

+extern int modify_wide_hw_breakpoint_local(struct perf_event *bp,
+ struct perf_event_attr *attr);
+
extern int register_perf_hw_breakpoint(struct perf_event *bp);
extern void unregister_hw_breakpoint(struct perf_event *bp);
extern void unregister_wide_hw_breakpoint(struct perf_event * __percpu *cpu_events);
@@ -124,6 +127,9 @@ register_wide_hw_breakpoint(struct perf_event_attr *attr,
perf_overflow_handler_t triggered,
void *context) { return NULL; }
static inline int
+modify_wide_hw_breakpoint_local(struct perf_event *bp,
+ struct perf_event_attr *attr) { return -ENOSYS; }
+static inline int
register_perf_hw_breakpoint(struct perf_event *bp) { return -ENOSYS; }
static inline void unregister_hw_breakpoint(struct perf_event *bp) { }
static inline void
diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
index 8ec2cb688903..ef9bab968b2c 100644
--- a/kernel/events/hw_breakpoint.c
+++ b/kernel/events/hw_breakpoint.c
@@ -887,6 +887,42 @@ void unregister_wide_hw_breakpoint(struct perf_event * __percpu *cpu_events)
}
EXPORT_SYMBOL_GPL(unregister_wide_hw_breakpoint);

+/**
+ * modify_wide_hw_breakpoint_local - update breakpoint config for local cpu
+ * @bp: the hwbp perf event for this cpu
+ * @attr: the new attribute for @bp
+ *
+ * This does not release and reserve the slot of HWBP, just reuse the current
+ * slot on local CPU. So the users must update the other CPUs by themselves.
+ * Also, since this does not release/reserve the slot, this can not change the
+ * type to incompatible type of the HWBP.
+ * Return err if attr is invalid or the cpu fails to update debug register
+ * for new @attr.
+ */
+#ifdef CONFIG_HAVE_REINSTALL_HW_BREAKPOINT
+int modify_wide_hw_breakpoint_local(struct perf_event *bp,
+ struct perf_event_attr *attr)
+{
+ int ret;
+
+ if (find_slot_idx(bp->attr.bp_type) != find_slot_idx(attr->bp_type))
+ return -EINVAL;
+
+ ret = hw_breakpoint_arch_parse(bp, attr, counter_arch_bp(bp));
+ if (ret)
+ return ret;
+
+ return arch_reinstall_hw_breakpoint(bp);
+}
+#else
+int modify_wide_hw_breakpoint_local(struct perf_event *bp,
+ struct perf_event_attr *attr)
+{
+ return -EOPNOTSUPP;
+}
+#endif
+EXPORT_SYMBOL_GPL(modify_wide_hw_breakpoint_local);
+
/**
* hw_breakpoint_is_used - check if breakpoints are currently used
*
--
2.43.0
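
The type-compatibility rule from the commit message can be modelled in userspace as a comparison of slot classes. This is a sketch under stated assumptions, not the kernel's find_slot_idx(). The HW_BREAKPOINT_* values are assumed to match the perf UAPI header. The split into separate instruction and data classes is an assumption that may not hold on architectures whose breakpoint registers are shared between both kinds. The names slot_class_of() and check_type_change() are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Assumed to match the perf UAPI breakpoint type flags. */
enum {
	HW_BREAKPOINT_R  = 1,
	HW_BREAKPOINT_W  = 2,
	HW_BREAKPOINT_RW = HW_BREAKPOINT_R | HW_BREAKPOINT_W,
	HW_BREAKPOINT_X  = 4,
};

/* Hypothetical slot classes: one for execute, one for data access. */
enum slot_class { SLOT_CLASS_INST, SLOT_CLASS_DATA };

/* Any read or write flag makes this a data breakpoint. */
static enum slot_class slot_class_of(unsigned int bp_type)
{
	return (bp_type & HW_BREAKPOINT_RW) ? SLOT_CLASS_DATA : SLOT_CLASS_INST;
}

/*
 * A type change is allowed only when both types use the same slot class,
 * because no slot is released or reserved during the update.
 */
static int check_type_change(unsigned int old_type, unsigned int new_type)
{
	if (slot_class_of(old_type) != slot_class_of(new_type))
		return -EINVAL;
	return 0;
}
```

Under this model a change from HW_BREAKPOINT_W to HW_BREAKPOINT_RW passes the check, while a change from HW_BREAKPOINT_W to HW_BREAKPOINT_X is rejected.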

Jinchao Wang

Sep 12, 2025, 6:12:20 AM
Add Kconfig and Makefile infrastructure.

The implementation is located under mm/kstackwatch/.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/Kconfig.debug | 11 +++++++++++
mm/Makefile | 1 +
mm/kstackwatch/Makefile | 2 ++
mm/kstackwatch/kernel.c | 22 ++++++++++++++++++++++
mm/kstackwatch/kstackwatch.h | 5 +++++
mm/kstackwatch/stack.c | 1 +
mm/kstackwatch/watch.c | 1 +
7 files changed, 43 insertions(+)
create mode 100644 mm/kstackwatch/Makefile
create mode 100644 mm/kstackwatch/kernel.c
create mode 100644 mm/kstackwatch/kstackwatch.h
create mode 100644 mm/kstackwatch/stack.c
create mode 100644 mm/kstackwatch/watch.c

diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index 32b65073d0cc..fdfc6e6d0dec 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -309,3 +309,14 @@ config PER_VMA_LOCK_STATS
overhead in the page fault path.

If in doubt, say N.
+
+config KSTACK_WATCH
+ tristate "Kernel Stack Watch"
+ depends on HAVE_HW_BREAKPOINT && KPROBES && FPROBE
+ select HAVE_REINSTALL_HW_BREAKPOINT
+ help
+ A lightweight real-time debugging tool to detect stack corruption.
+ It can watch either the canary or local variable and tracks
+ the recursive depth of the monitored function.
+
+ If unsure, say N.
diff --git a/mm/Makefile b/mm/Makefile
index ef54aa615d9d..665c9f2bf987 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -92,6 +92,7 @@ obj-$(CONFIG_PAGE_POISONING) += page_poison.o
obj-$(CONFIG_KASAN) += kasan/
obj-$(CONFIG_KFENCE) += kfence/
obj-$(CONFIG_KMSAN) += kmsan/
+obj-$(CONFIG_KSTACK_WATCH) += kstackwatch/
obj-$(CONFIG_FAILSLAB) += failslab.o
obj-$(CONFIG_FAIL_PAGE_ALLOC) += fail_page_alloc.o
obj-$(CONFIG_MEMTEST) += memtest.o
diff --git a/mm/kstackwatch/Makefile b/mm/kstackwatch/Makefile
new file mode 100644
index 000000000000..84a46cb9a766
--- /dev/null
+++ b/mm/kstackwatch/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_KSTACK_WATCH) += kstackwatch.o
+kstackwatch-y := kernel.o stack.o watch.o
diff --git a/mm/kstackwatch/kernel.c b/mm/kstackwatch/kernel.c
new file mode 100644
index 000000000000..40aa7e9ff513
--- /dev/null
+++ b/mm/kstackwatch/kernel.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/module.h>
+
+MODULE_AUTHOR("Jinchao Wang");
+MODULE_DESCRIPTION("Kernel Stack Watch");
+MODULE_LICENSE("GPL");
+
+static int __init kstackwatch_init(void)
+{
+ pr_info("module loaded\n");
+ return 0;
+}
+
+static void __exit kstackwatch_exit(void)
+{
+ pr_info("module unloaded\n");
+}
+
+module_init(kstackwatch_init);
+module_exit(kstackwatch_exit);
diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
new file mode 100644
index 000000000000..0273ef478a26
--- /dev/null
+++ b/mm/kstackwatch/kstackwatch.h
@@ -0,0 +1,5 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _KSTACKWATCH_H
+#define _KSTACKWATCH_H
+
+#endif /* _KSTACKWATCH_H */
diff --git a/mm/kstackwatch/stack.c b/mm/kstackwatch/stack.c
new file mode 100644
index 000000000000..cec594032515
--- /dev/null
+++ b/mm/kstackwatch/stack.c
@@ -0,0 +1 @@
+// SPDX-License-Identifier: GPL-2.0
diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
new file mode 100644
index 000000000000..cec594032515
--- /dev/null
+++ b/mm/kstackwatch/watch.c
@@ -0,0 +1 @@
+// SPDX-License-Identifier: GPL-2.0
--
2.43.0

Jinchao Wang

Sep 12, 2025, 6:12:23 AM
Add struct ksw_config and ksw_parse_config() to parse the user-supplied
configuration string.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kernel.c | 91 ++++++++++++++++++++++++++++++++++++
mm/kstackwatch/kstackwatch.h | 33 +++++++++++++
2 files changed, 124 insertions(+)

diff --git a/mm/kstackwatch/kernel.c b/mm/kstackwatch/kernel.c
index 40aa7e9ff513..1502795e02af 100644
--- a/mm/kstackwatch/kernel.c
+++ b/mm/kstackwatch/kernel.c
@@ -1,20 +1,111 @@
// SPDX-License-Identifier: GPL-2.0
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

+#include <linux/kstrtox.h>
#include <linux/module.h>
+#include <linux/string.h>
+
+#include "kstackwatch.h"

MODULE_AUTHOR("Jinchao Wang");
MODULE_DESCRIPTION("Kernel Stack Watch");
MODULE_LICENSE("GPL");

+static struct ksw_config *ksw_config;
+
+/*
+ * Format of the configuration string:
+ * function+ip_offset[+depth] [local_var_offset:local_var_len]
+ *
+ * - function : name of the target function
+ * - ip_offset : instruction pointer offset within the function
+ * - depth : recursion depth to watch
+ * - local_var_offset : offset from the stack pointer at function+ip_offset
+ * - local_var_len : length of the local variable(1,2,4,8)
+ */
+static int __maybe_unused ksw_parse_config(char *buf, struct ksw_config *config)
+{
+ char *func_part, *local_var_part = NULL;
+ char *token;
+ u16 local_var_len;
+
+ memset(config, 0, sizeof(*config));
+
+ /* set the watch type to the default canary-based watching */
+ config->type = WATCH_CANARY;
+
+ func_part = strim(buf);
+ strscpy(config->config_str, func_part, MAX_CONFIG_STR_LEN);
+
+ local_var_part = strchr(func_part, ' ');
+ if (local_var_part) {
+ *local_var_part = '\0'; // terminate the function part
+ local_var_part = strim(local_var_part + 1);
+ }
+
+ /* parse the function part: function+ip_offset[+depth] */
+ token = strsep(&func_part, "+");
+ if (!token)
+ goto fail;
+
+ strscpy(config->function, token, MAX_FUNC_NAME_LEN - 1);
+
+ token = strsep(&func_part, "+");
+ if (!token || kstrtou16(token, 0, &config->ip_offset)) {
+ pr_err("failed to parse instruction offset\n");
+ goto fail;
+ }
+
+ token = strsep(&func_part, "+");
+ if (token && kstrtou16(token, 0, &config->depth)) {
+ pr_err("failed to parse depth\n");
+ goto fail;
+ }
+ if (!local_var_part || !(*local_var_part))
+ return 0;
+
+ /* parse the optional local var offset:len */
+ config->type = WATCH_LOCAL_VAR;
+ token = strsep(&local_var_part, ":");
+ if (!token || kstrtou16(token, 0, &config->local_var_offset)) {
+ pr_err("failed to parse local var offset\n");
+ goto fail;
+ }
+
+ if (!local_var_part || kstrtou16(local_var_part, 0, &local_var_len)) {
+ pr_err("failed to parse local var len\n");
+ goto fail;
+ }
+
+ if (local_var_len != 1 && local_var_len != 2 &&
+ local_var_len != 4 && local_var_len != 8) {
+ pr_err("invalid local var len %u (must be 1,2,4,8)\n",
+ local_var_len);
+ goto fail;
+ }
+ config->local_var_len = local_var_len;
+
+ return 0;
+fail:
+ pr_err("invalid input: %s\n", config->config_str);
+ config->config_str[0] = '\0';
+ return -EINVAL;
+}
+
static int __init kstackwatch_init(void)
{
+ ksw_config = kzalloc(sizeof(*ksw_config), GFP_KERNEL);
+ if (!ksw_config)
+ return -ENOMEM;
+
pr_info("module loaded\n");
return 0;
}

static void __exit kstackwatch_exit(void)
{
+ kfree(ksw_config);
+
pr_info("module unloaded\n");
}

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 0273ef478a26..7c595c5c24d1 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -2,4 +2,37 @@
#ifndef _KSTACKWATCH_H
#define _KSTACKWATCH_H

+#include <linux/types.h>
+
+#define MAX_FUNC_NAME_LEN 64
+#define MAX_CONFIG_STR_LEN 128
+
+enum watch_type {
+ WATCH_CANARY = 0,
+ WATCH_LOCAL_VAR,
+};
+
+struct ksw_config {
+ /* function part */
+ char function[MAX_FUNC_NAME_LEN];
+ u16 ip_offset;
+ u16 depth;
+
+ /* local var, useless for canary watch */
+ /* offset from rsp at function+ip_offset */
+ u16 local_var_offset;
+
+ /*
+ * local var size (1,2,4,8 bytes)
+ * it will be the watching len
+ */
+ u16 local_var_len;
+
+ /* watch type, kept explicitly for readability */
+ enum watch_type type;
+
+ /* save to show */
+ char config_str[MAX_CONFIG_STR_LEN];
+};
+
#endif /* _KSTACKWATCH_H */
--
2.43.0

Jinchao Wang

Sep 12, 2025, 6:12:28 AM
Provide the /proc/kstackwatch file to read or update the configuration.
Only one process can hold this file open at a time. This is enforced with
the atomic flag config_file_busy, which prevents concurrent access.

ksw_get_config() exposes the configuration pointer as const.
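
For readers less familiar with the pattern, the single-opener rule can be
sketched in userspace with C11 atomics. This is only an illustration, not
the code in this patch: busy_flag, guard_open() and guard_release() are
hypothetical names, and atomic_compare_exchange_strong() stands in for the
kernel's atomic_cmpxchg().

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the patch's config_file_busy flag. */
static atomic_int busy_flag;

/* Mirrors the open path: only the caller that flips 0 -> 1 succeeds. */
int guard_open(void)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong(&busy_flag, &expected, 1))
		return -EBUSY;
	return 0;
}

/* Mirrors the release path: clear the flag so the next open can succeed. */
void guard_release(void)
{
	assert(atomic_load(&busy_flag) == 1);
	atomic_store(&busy_flag, 0);
}
```

A second guard_open() before guard_release() returns -EBUSY, which is the
behaviour a second opener of /proc/kstackwatch should observe.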

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kernel.c | 75 +++++++++++++++++++++++++++++++++++-
mm/kstackwatch/kstackwatch.h | 3 ++
2 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/kernel.c b/mm/kstackwatch/kernel.c
index 1502795e02af..8e1dca45003e 100644
--- a/mm/kstackwatch/kernel.c
+++ b/mm/kstackwatch/kernel.c
@@ -3,7 +3,10 @@

#include <linux/kstrtox.h>
#include <linux/module.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
#include <linux/string.h>
+#include <linux/uaccess.h>

#include "kstackwatch.h"

@@ -12,6 +15,7 @@ MODULE_DESCRIPTION("Kernel Stack Watch");
MODULE_LICENSE("GPL");

static struct ksw_config *ksw_config;
+static atomic_t config_file_busy = ATOMIC_INIT(0);

/*
* Format of the configuration string:
@@ -23,7 +27,7 @@ static struct ksw_config *ksw_config;
* - local_var_offset : offset from the stack pointer at function+ip_offset
* - local_var_len : length of the local variable(1,2,4,8)
*/
-static int __maybe_unused ksw_parse_config(char *buf, struct ksw_config *config)
+static int ksw_parse_config(char *buf, struct ksw_config *config)
{
char *func_part, *local_var_part = NULL;
char *token;
@@ -92,18 +96,87 @@ static int __maybe_unused ksw_parse_config(char *buf, struct ksw_config *config)
return -EINVAL;
}

+static ssize_t kstackwatch_proc_write(struct file *file,
+ const char __user *buffer, size_t count,
+ loff_t *pos)
+{
+ char input[MAX_CONFIG_STR_LEN];
+ int ret;
+
+ if (count == 0 || count >= sizeof(input))
+ return -EINVAL;
+
+ if (copy_from_user(input, buffer, count))
+ return -EFAULT;
+
+ input[count] = '\0';
+ strim(input);
+
+ if (!strlen(input)) {
+ pr_info("config cleared\n");
+ return count;
+ }
+
+ ret = ksw_parse_config(input, ksw_config);
+ if (ret) {
+ pr_err("Failed to parse config %d\n", ret);
+ return ret;
+ }
+
+ return count;
+}
+
+static int kstackwatch_proc_show(struct seq_file *m, void *v)
+{
+ seq_printf(m, "%s\n", ksw_config->config_str);
+ return 0;
+}
+
+static int kstackwatch_proc_open(struct inode *inode, struct file *file)
+{
+ if (atomic_cmpxchg(&config_file_busy, 0, 1))
+ return -EBUSY;
+
+ return single_open(file, kstackwatch_proc_show, NULL);
+}
+
+static int kstackwatch_proc_release(struct inode *inode, struct file *file)
+{
+ atomic_set(&config_file_busy, 0);
+ return single_release(inode, file);
+}
+
+static const struct proc_ops kstackwatch_proc_ops = {
+ .proc_open = kstackwatch_proc_open,
+ .proc_read = seq_read,
+ .proc_write = kstackwatch_proc_write,
+ .proc_lseek = seq_lseek,
+ .proc_release = kstackwatch_proc_release,
+};
+
+const struct ksw_config *ksw_get_config(void)
+{
+ return ksw_config;
+}
static int __init kstackwatch_init(void)
{
ksw_config = kzalloc(sizeof(*ksw_config), GFP_KERNEL);
if (!ksw_config)
return -ENOMEM;

+ if (!proc_create("kstackwatch", 0600, NULL, &kstackwatch_proc_ops)) {
+ pr_err("failed to create /proc/kstackwatch\n");
+ kfree(ksw_config);
+ return -ENOMEM;
+ }
+
pr_info("module loaded\n");
return 0;
}

static void __exit kstackwatch_exit(void)
{
+ remove_proc_entry("kstackwatch", NULL);
kfree(ksw_config);

pr_info("module unloaded\n");
diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 7c595c5c24d1..277b192f80fa 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -35,4 +35,7 @@ struct ksw_config {
char config_str[MAX_CONFIG_STR_LEN];
};

+// singleton, only modified in kernel.c
+const struct ksw_config *ksw_get_config(void);

Jinchao Wang

Sep 12, 2025, 6:12:33 AM
Pre-allocate per-CPU hardware breakpoints at init with a placeholder
address, which will be retargeted dynamically in the kprobe handler.
This avoids allocation in atomic context.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kstackwatch.h | 4 +++
mm/kstackwatch/watch.c | 55 ++++++++++++++++++++++++++++++++++++
2 files changed, 59 insertions(+)

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 277b192f80fa..3ea191370970 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -38,4 +38,8 @@ struct ksw_config {
// singleton, only modified in kernel.c
const struct ksw_config *ksw_get_config(void);

+/* watch management */
+int ksw_watch_init(void);
+void ksw_watch_exit(void);
+
#endif /* _KSTACKWATCH_H */
diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
index cec594032515..d3399ac840b2 100644
--- a/mm/kstackwatch/watch.c
+++ b/mm/kstackwatch/watch.c
@@ -1 +1,56 @@
// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/hw_breakpoint.h>
+#include <linux/perf_event.h>
+#include <linux/printk.h>
+
+#include "kstackwatch.h"
+
+static struct perf_event *__percpu *watch_events;
+
+static unsigned long watch_holder;
+
+static struct perf_event_attr watch_attr;
+
+bool panic_on_catch;
+module_param(panic_on_catch, bool, 0644);
+MODULE_PARM_DESC(panic_on_catch, "panic immediately on corruption catch");
+static void ksw_watch_handler(struct perf_event *bp,
+ struct perf_sample_data *data,
+ struct pt_regs *regs)
+{
+ pr_err("========== KStackWatch: Caught stack corruption =======\n");
+ pr_err("config %s\n", ksw_get_config()->config_str);
+ dump_stack();
+ pr_err("=================== KStackWatch End ===================\n");
+
+ if (panic_on_catch)
+ panic("Stack corruption detected");
+}
+
+int ksw_watch_init(void)
+{
+ int ret;
+
+ hw_breakpoint_init(&watch_attr);
+ watch_attr.bp_addr = (unsigned long)&watch_holder;
+ watch_attr.bp_len = sizeof(watch_holder);
+ watch_attr.bp_type = HW_BREAKPOINT_W;
+ watch_events = register_wide_hw_breakpoint(&watch_attr,
+ ksw_watch_handler,
+ NULL);
+ if (IS_ERR(watch_events)) {
+ ret = PTR_ERR(watch_events);
+ pr_err("failed to register wide hw breakpoint: %d\n", ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+void ksw_watch_exit(void)
+{
+ unregister_wide_hw_breakpoint(watch_events);
+ watch_events = NULL;
+}
--
2.43.0

Jinchao Wang

Sep 12, 2025, 6:12:38 AM
atomic_long_cmpxchg() ensures that at most one watchpoint is active at
any time: ksw_watch_on() succeeds only when no watch is active (the
current address is the placeholder), and ksw_watch_off() succeeds only
when the caller passes the address of the active watch.

For cross-CPU synchronization, updates are propagated using direct
modification on the local CPU and asynchronous IPIs for remote CPUs.
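
The on/off rule can be sketched in userspace with C11 atomics. This is a
simplified model under stated assumptions, not the patch itself: the names
placeholder, watched, sketch_init(), sketch_watch_on() and
sketch_watch_off() are hypothetical, and the per-CPU breakpoint update and
IPIs are left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for watch_holder and watched_addr in watch.c. */
static unsigned long placeholder;
static atomic_uintptr_t watched;

/* Call once before use; the kernel code uses a static initializer instead. */
void sketch_init(void)
{
	atomic_store(&watched, (uintptr_t)&placeholder);
}

/* Swap old_addr -> new_addr only if the slot currently holds old_addr. */
static int sketch_retarget(uintptr_t old_addr, uintptr_t new_addr)
{
	uintptr_t expected = old_addr;

	if (!atomic_compare_exchange_strong(&watched, &expected, new_addr))
		return -EINVAL;
	return 0;
}

/* Succeeds only while the slot still points at the placeholder. */
int sketch_watch_on(uintptr_t addr)
{
	assert(addr != (uintptr_t)&placeholder);
	return sketch_retarget((uintptr_t)&placeholder, addr);
}

/* Succeeds only if addr is the currently watched address. */
int sketch_watch_off(uintptr_t addr)
{
	return sketch_retarget(addr, (uintptr_t)&placeholder);
}
```

With this model a second sketch_watch_on() fails until the owner of the
first watch calls sketch_watch_off() with the matching address.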

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kstackwatch.h | 2 +
mm/kstackwatch/watch.c | 73 +++++++++++++++++++++++++++++++++++-
2 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 3ea191370970..0786fa961011 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -41,5 +41,7 @@ const struct ksw_config *ksw_get_config(void);
/* watch management */
int ksw_watch_init(void);
void ksw_watch_exit(void);
+int ksw_watch_on(ulong watch_addr, u16 watch_len);
+int ksw_watch_off(ulong watch_addr, u16 watch_len);

#endif /* _KSTACKWATCH_H */
diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
index d3399ac840b2..14549e02faf1 100644
--- a/mm/kstackwatch/watch.c
+++ b/mm/kstackwatch/watch.c
@@ -2,6 +2,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <linux/hw_breakpoint.h>
+#include <linux/irqflags.h>
#include <linux/perf_event.h>
#include <linux/printk.h>

@@ -9,10 +10,16 @@

static struct perf_event *__percpu *watch_events;

-static unsigned long watch_holder;
+static ulong watch_holder;
+static atomic_long_t watched_addr = ATOMIC_LONG_INIT((ulong)&watch_holder);

static struct perf_event_attr watch_attr;

+static void ksw_watch_on_local_cpu(void *info);
+
+static DEFINE_PER_CPU(call_single_data_t,
+ watch_csd) = CSD_INIT(ksw_watch_on_local_cpu, NULL);
+
bool panic_on_catch;
module_param(panic_on_catch, bool, 0644);
MODULE_PARM_DESC(panic_on_catch, "panic immediately on corruption catch");
@@ -29,6 +36,70 @@ static void ksw_watch_handler(struct perf_event *bp,
panic("Stack corruption detected");
}

+static void ksw_watch_on_local_cpu(void *data)
+{
+ struct perf_event *bp;
+ ulong flags;
+ int cpu;
+ int ret;
+
+ local_irq_save(flags);
+ cpu = raw_smp_processor_id();
+ bp = *per_cpu_ptr(watch_events, cpu);
+ if (!bp) {
+ local_irq_restore(flags);
+ return;
+ }
+
+ ret = modify_wide_hw_breakpoint_local(bp, &watch_attr);
+ local_irq_restore(flags);
+
+ if (ret) {
+ pr_err("failed to reinstall HWBP on CPU %d ret %d\n", cpu,
+ ret);
+ return;
+ }
+}
+
+static void __ksw_watch_target(ulong addr, u16 len)
+{
+ int cpu;
+ call_single_data_t *csd;
+
+ watch_attr.bp_addr = addr;
+ watch_attr.bp_len = len;
+
+ /* ensure watchpoint update is visible to other CPUs before IPI */
+ smp_wmb();
+
+ for_each_online_cpu(cpu) {
+ if (cpu == raw_smp_processor_id()) {
+ ksw_watch_on_local_cpu(NULL);
+ } else {
+ csd = &per_cpu(watch_csd, cpu);
+ smp_call_function_single_async(cpu, csd);
+ }
+ }
+}
+
+static int ksw_watch_target(ulong old_addr, ulong new_addr, u16 watch_len)
+{
+ if (atomic_long_cmpxchg(&watched_addr, old_addr, new_addr) != old_addr)
+ return -EINVAL;
+ __ksw_watch_target(new_addr, watch_len);
+ return 0;
+}
+
+int ksw_watch_on(ulong watch_addr, u16 watch_len)
+{
+ return ksw_watch_target((ulong)&watch_holder, watch_addr, watch_len);
+}
+
+int ksw_watch_off(ulong watch_addr, u16 watch_len)
+{
+ return ksw_watch_target(watch_addr, (ulong)&watch_holder, watch_len);
+}
+
int ksw_watch_init(void)
{
int ret;
--
2.43.0

Jinchao Wang

Sep 12, 2025, 6:12:44 AM
Register CPU online/offline callbacks via cpuhp_setup_state_nocalls()
so stack watches are installed/removed dynamically as CPUs come online
or go offline.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/watch.c | 36 ++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)

diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
index 14549e02faf1..795e779792da 100644
--- a/mm/kstackwatch/watch.c
+++ b/mm/kstackwatch/watch.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

+#include <linux/cpuhotplug.h>
#include <linux/hw_breakpoint.h>
#include <linux/irqflags.h>
#include <linux/perf_event.h>
@@ -61,6 +62,32 @@ static void ksw_watch_on_local_cpu(void *data)
}
}

+static int ksw_cpu_online(unsigned int cpu)
+{
+ struct perf_event *bp;
+
+ bp = perf_event_create_kernel_counter(&watch_attr, cpu, NULL,
+ ksw_watch_handler, NULL);
+ if (IS_ERR(bp)) {
+ pr_err("Failed to create watch on CPU %d: %ld\n", cpu,
+ PTR_ERR(bp));
+ return PTR_ERR(bp);
+ }
+
+ per_cpu(*watch_events, cpu) = bp;
+ per_cpu(watch_csd, cpu) = CSD_INIT(ksw_watch_on_local_cpu, NULL);
+ return 0;
+}
+
+static int ksw_cpu_offline(unsigned int cpu)
+{
+ struct perf_event *bp = per_cpu(*watch_events, cpu);
+
+ if (bp)
+ unregister_hw_breakpoint(bp);
+ return 0;
+}
+
static void __ksw_watch_target(ulong addr, u16 len)
{
int cpu;
@@ -117,6 +144,15 @@ int ksw_watch_init(void)
return ret;
}

+ ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+ "kstackwatch:online", ksw_cpu_online,
+ ksw_cpu_offline);
+ if (ret < 0) {
+ unregister_wide_hw_breakpoint(watch_events);
+ pr_err("Failed to register CPU hotplug notifier\n");
+ return ret;
+ }
+
return 0;
}

--
2.43.0

Jinchao Wang

Sep 12, 2025, 6:12:47 AM
Introduce struct kstackwatch_ctx to enable lockless per-task state
tracking. This is required because KStackWatch operates in NMI context
(via kprobe handler) where traditional locking is unsafe.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
include/linux/kstackwatch_types.h | 13 +++++++++++++
include/linux/sched.h | 5 +++++
2 files changed, 18 insertions(+)
create mode 100644 include/linux/kstackwatch_types.h

diff --git a/include/linux/kstackwatch_types.h b/include/linux/kstackwatch_types.h
new file mode 100644
index 000000000000..93855fcc7981
--- /dev/null
+++ b/include/linux/kstackwatch_types.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_KSTACK_WATCH_TYPES_H
+#define _LINUX_KSTACK_WATCH_TYPES_H
+#include <linux/types.h>
+
+struct kstackwatch_ctx {
+ ulong watch_addr;
+ u16 watch_len;
+ u16 depth;
+ bool watch_on;
+};
+
+#endif /* _LINUX_KSTACK_WATCH_TYPES_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index f8188b833350..1b324b458309 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -22,6 +22,7 @@
#include <linux/sem_types.h>
#include <linux/shm.h>
#include <linux/kmsan_types.h>
+#include <linux/kstackwatch_types.h>
#include <linux/mutex_types.h>
#include <linux/plist_types.h>
#include <linux/hrtimer_types.h>
@@ -1481,6 +1482,10 @@ struct task_struct {
struct kmsan_ctx kmsan_ctx;
#endif

+#if IS_ENABLED(CONFIG_KSTACK_WATCH)
+ struct kstackwatch_ctx kstackwatch_ctx;
+#endif
+
#if IS_ENABLED(CONFIG_KUNIT)
struct kunit *kunit_test;
#endif
--
2.43.0

Jinchao Wang

Sep 12, 2025, 6:12:53 AM
Provide ksw_stack_init() and ksw_stack_exit() to manage entry and
exit probes for the target function from ksw_get_config().

The entry/exit probe handlers use atomic ksw_stack_pid to ensure a
singleton watch and current->kstackwatch_ctx.depth to track
recursion depth. A watch is set up only when depth reaches the
configured value.
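
The depth bookkeeping can be modelled in a few lines of userspace C. This
is a sketch only: struct depth_ctx, depth_entry() and depth_exit() are
hypothetical names, and the ksw_stack_pid ownership check and the actual
ksw_watch_on()/ksw_watch_off() calls are omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task state, loosely modelled on struct kstackwatch_ctx. */
struct depth_ctx {
	unsigned int depth;	/* current nesting level of the probed function */
	bool watch_on;		/* true while this task owns the watch */
};

/* Entry: arm only when the pre-increment depth equals the target. */
void depth_entry(struct depth_ctx *ctx, unsigned int target_depth)
{
	if (ctx->depth++ != target_depth)
		return;
	ctx->watch_on = true;
}

/* Exit: disarm when unwinding back to the frame that armed the watch. */
void depth_exit(struct depth_ctx *ctx, unsigned int target_depth)
{
	assert(ctx->depth > 0);
	if (--ctx->depth != target_depth)
		return;
	ctx->watch_on = false;
}
```

Because the comparison uses the pre-increment value on entry and the
post-decrement value on exit, a target depth of 0 selects the outermost
call and a target depth of 1 selects the first recursive call.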

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kstackwatch.h | 4 ++
mm/kstackwatch/stack.c | 113 +++++++++++++++++++++++++++++++++++
2 files changed, 117 insertions(+)

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 0786fa961011..5ea2db76cdfb 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -38,6 +38,10 @@ struct ksw_config {
// singleton, only modified in kernel.c
const struct ksw_config *ksw_get_config(void);

+/* stack management */
+int ksw_stack_init(void);
+void ksw_stack_exit(void);
+
/* watch management */
int ksw_watch_init(void);
void ksw_watch_exit(void);
diff --git a/mm/kstackwatch/stack.c b/mm/kstackwatch/stack.c
index cec594032515..ac52a9f81486 100644
--- a/mm/kstackwatch/stack.c
+++ b/mm/kstackwatch/stack.c
@@ -1 +1,114 @@
// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/atomic.h>
+#include <linux/fprobe.h>
+#include <linux/kprobes.h>
+#include <linux/kstackwatch_types.h>
+#include <linux/printk.h>
+
+#include "kstackwatch.h"
+
+static struct kprobe entry_probe;
+static struct fprobe exit_probe;
+#define INVALID_PID -1
+static atomic_t ksw_stack_pid = ATOMIC_INIT(INVALID_PID);
+
+static int ksw_stack_prepare_watch(struct pt_regs *regs,
+ const struct ksw_config *config,
+ ulong *watch_addr, u16 *watch_len)
+{
+ /* implement logic will be added in following patches */
+ *watch_addr = 0;
+ *watch_len = 0;
+ return 0;
+}
+
+static void ksw_stack_entry_handler(struct kprobe *p, struct pt_regs *regs,
+ unsigned long flags)
+{
+ struct kstackwatch_ctx *ctx = &current->kstackwatch_ctx;
+ ulong watch_addr;
+ u16 watch_len;
+ int ret;
+
+ if (ctx->depth++ != ksw_get_config()->depth)
+ return;
+
+ if (atomic_cmpxchg(&ksw_stack_pid, INVALID_PID, current->pid) !=
+ INVALID_PID)
+ return;
+
+ ret = ksw_stack_prepare_watch(regs, ksw_get_config(), &watch_addr,
+ &watch_len);
+ if (ret) {
+ atomic_set(&ksw_stack_pid, INVALID_PID);
+ pr_err("failed to prepare watch target: %d\n", ret);
+ return;
+ }
+
+ ret = ksw_watch_on(watch_addr, watch_len);
+ if (ret) {
+ atomic_set(&ksw_stack_pid, INVALID_PID);
+ pr_err("failed to watch on depth:%d addr:0x%lx len:%u %d\n",
+ ksw_get_config()->depth, watch_addr, watch_len, ret);
+ return;
+ }
+
+ ctx->watch_addr = watch_addr;
+ ctx->watch_len = watch_len;
+ ctx->watch_on = true;
+}
+
+static void ksw_stack_exit_handler(struct fprobe *fp, unsigned long ip,
+ unsigned long ret_ip,
+ struct ftrace_regs *regs, void *data)
+{
+ struct kstackwatch_ctx *ctx = &current->kstackwatch_ctx;
+
+ if (--ctx->depth != ksw_get_config()->depth)
+ return;
+
+ if (atomic_read(&ksw_stack_pid) != current->pid)
+ return;
+ WARN_ON_ONCE(!ctx->watch_on);
+ WARN_ON_ONCE(ksw_watch_off(ctx->watch_addr, ctx->watch_len));
+ ctx->watch_on = false;
+
+ atomic_set(&ksw_stack_pid, INVALID_PID);
+}
+
+int ksw_stack_init(void)
+{
+ int ret;
+ char *symbuf = NULL;
+
+ memset(&entry_probe, 0, sizeof(entry_probe));
+ entry_probe.symbol_name = ksw_get_config()->function;
+ entry_probe.offset = ksw_get_config()->ip_offset;
+ entry_probe.post_handler = ksw_stack_entry_handler;
+ ret = register_kprobe(&entry_probe);
+ if (ret) {
+ pr_err("Failed to register kprobe ret %d\n", ret);
+ return ret;
+ }
+
+ memset(&exit_probe, 0, sizeof(exit_probe));
+ exit_probe.exit_handler = ksw_stack_exit_handler;
+ symbuf = (char *)ksw_get_config()->function;
+
+ ret = register_fprobe_syms(&exit_probe, (const char **)&symbuf, 1);
+ if (ret < 0) {
+ pr_err("register_fprobe_syms fail %d\n", ret);
+ unregister_kprobe(&entry_probe);
+ return ret;
+ }
+
+ return 0;
+}
+
+void ksw_stack_exit(void)
+{
+ unregister_fprobe(&exit_probe);
+ unregister_kprobe(&entry_probe);
+}
--
2.43.0

Jinchao Wang

Sep 12, 2025, 6:12:58 AM
Add helpers to find the address and length of the stack canary or of a
local variable for the probed function, based on ksw_get_config(). The
canary search is limited to a fixed number of steps to avoid scanning the
entire stack. The computed address and length are validated to lie within
the kernel stack.
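
A userspace sketch of the two helpers, for illustration only. The names
scan_for_canary() and range_in_stack() are hypothetical, and the caller
passes the stack bounds explicitly instead of reading current->stack and
THREAD_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_STEPS 128	/* same bound as MAX_CANARY_SEARCH_STEPS */

/*
 * Scan upward from sp for a word equal to canary, stopping at end or after
 * SKETCH_MAX_STEPS words. Returns a pointer to the match, or NULL.
 */
const unsigned long *scan_for_canary(const unsigned long *sp,
				     const unsigned long *end,
				     unsigned long canary)
{
	size_t i;

	for (i = 0; i < SKETCH_MAX_STEPS && &sp[i] < end; i++) {
		if (sp[i] == canary)
			return &sp[i];
	}
	return NULL;
}

/* Return 1 if [addr, addr + size) lies inside [start, start + span). */
int range_in_stack(uintptr_t addr, size_t size, uintptr_t start, size_t span)
{
	assert(span > 0);
	if (!addr || !size || size > span)
		return 0;
	return addr >= start && addr - start <= span - size;
}
```

The range check here is written as a subtraction so that it cannot wrap
around. That is a choice made for this sketch; the patch compares
addr + size against the end of the stack directly.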

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/stack.c | 88 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 84 insertions(+), 4 deletions(-)

diff --git a/mm/kstackwatch/stack.c b/mm/kstackwatch/stack.c
index ac52a9f81486..65a97309e028 100644
--- a/mm/kstackwatch/stack.c
+++ b/mm/kstackwatch/stack.c
@@ -9,18 +9,98 @@

#include "kstackwatch.h"

+#define INVALID_PID -1
+#define MAX_CANARY_SEARCH_STEPS 128
static struct kprobe entry_probe;
static struct fprobe exit_probe;
-#define INVALID_PID -1
static atomic_t ksw_stack_pid = ATOMIC_INIT(INVALID_PID);

+static unsigned long ksw_find_stack_canary_addr(struct pt_regs *regs)
+{
+ unsigned long *stack_ptr, *stack_end, *stack_base;
+ unsigned long expected_canary;
+ unsigned int i;
+
+ stack_ptr = (unsigned long *)kernel_stack_pointer(regs);
+
+ stack_base = (unsigned long *)(current->stack);
+
+ // TODO: limit it to the current frame
+ stack_end = (unsigned long *)((char *)current->stack + THREAD_SIZE);
+
+ expected_canary = current->stack_canary;
+
+ if (stack_ptr < stack_base || stack_ptr >= stack_end) {
+ pr_err("Stack pointer 0x%lx out of bounds [0x%lx, 0x%lx)\n",
+ (unsigned long)stack_ptr, (unsigned long)stack_base,
+ (unsigned long)stack_end);
+ return 0;
+ }
+
+ for (i = 0; i < MAX_CANARY_SEARCH_STEPS; i++) {
+ if (&stack_ptr[i] >= stack_end)
+ break;
+
+ if (stack_ptr[i] == expected_canary) {
+ pr_debug("canary found i:%d 0x%lx\n", i,
+ (unsigned long)&stack_ptr[i]);
+ return (unsigned long)&stack_ptr[i];
+ }
+ }
+
+ pr_debug("canary not found in first %d steps\n",
+ MAX_CANARY_SEARCH_STEPS);
+ return 0;
+}
+
+static int ksw_stack_validate_addr(unsigned long addr, size_t size)
+{
+ unsigned long stack_start, stack_end;
+
+ if (!addr || !size)
+ return -EINVAL;
+
+ stack_start = (unsigned long)current->stack;
+ stack_end = stack_start + THREAD_SIZE;
+
+ if (addr < stack_start || (addr + size) > stack_end)
+ return -ERANGE;
+
+ return 0;
+}
+
static int ksw_stack_prepare_watch(struct pt_regs *regs,
const struct ksw_config *config,
ulong *watch_addr, u16 *watch_len)
{
- /* implement logic will be added in following patches */
- *watch_addr = 0;
- *watch_len = 0;
+ ulong addr;
+ u16 len;
+
+ /* Resolve addresses for all active watches */
+ switch (ksw_get_config()->type) {
+ case WATCH_CANARY:
+ addr = ksw_find_stack_canary_addr(regs);
+ len = sizeof(unsigned long);
+ break;
+
+ case WATCH_LOCAL_VAR:
+ addr = kernel_stack_pointer(regs) +
+ ksw_get_config()->local_var_offset;
+ len = ksw_get_config()->local_var_len;
+ break;
+
+ default:
+ pr_err("Unknown watch type %d\n", ksw_get_config()->type);
+ return -EINVAL;
+ }
+
+ if (ksw_stack_validate_addr(addr, len)) {
+ pr_err("invalid stack addr:0x%lx len :%u\n", addr, len);
+ return -EINVAL;
+ }
+
+ *watch_addr = addr;
+ *watch_len = len;
return 0;
}

--
2.43.0

Jinchao Wang

Sep 12, 2025, 6:13:04 AM
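Allow kstackwatch to be enabled and disabled dynamically by writing to
/proc/kstackwatch. A valid configuration string starts watching; any write
first stops an active watch, and an empty write leaves it stopped.
With this patch, the entire system becomes functional.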
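
As a usage aid, a userspace helper that builds a configuration string in
the documented format "function+ip_offset[+depth]
[local_var_offset:local_var_len]" might look like the sketch below. The
helper name sketch_format_config() and its conventions (var_len == 0 means
canary watch, ip_offset printed in hex, depth always emitted) are
assumptions of this example and not part of the series.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "function+ip_offset+depth[ local_var_offset:local_var_len]".
 * var_len == 0 means "watch the canary", so the local-var part is omitted.
 * Returns 0 on success, -1 if the arguments or the buffer size are invalid.
 */
int sketch_format_config(char *buf, size_t size, const char *func,
			 unsigned int ip_offset, unsigned int depth,
			 unsigned int var_offset, unsigned int var_len)
{
	int n;

	assert(buf && func);

	/* The parser accepts only 1, 2, 4 or 8 as the local var length. */
	if (var_len != 0 && var_len != 1 && var_len != 2 &&
	    var_len != 4 && var_len != 8)
		return -1;

	if (var_len)
		n = snprintf(buf, size, "%s+0x%x+%u %u:%u", func, ip_offset,
			     depth, var_offset, var_len);
	else
		n = snprintf(buf, size, "%s+0x%x+%u", func, ip_offset, depth);

	/* Reject truncated output instead of writing a partial config. */
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The resulting string is what would be written to /proc/kstackwatch. Since
the write handler rejects input of MAX_CONFIG_STR_LEN (128) bytes or more,
a 128-byte buffer is a natural size for buf.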

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kernel.c | 55 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/kernel.c b/mm/kstackwatch/kernel.c
index 8e1dca45003e..9ef969f28e29 100644
--- a/mm/kstackwatch/kernel.c
+++ b/mm/kstackwatch/kernel.c
@@ -17,6 +17,43 @@ MODULE_LICENSE("GPL");
static struct ksw_config *ksw_config;
static atomic_t config_file_busy = ATOMIC_INIT(0);

+static bool watching_active;
+
+static int ksw_start_watching(void)
+{
+ int ret;
+
+ /*
+ * Watch init will preallocate the HWBP,
+ * so it must happen before stack init
+ */
+ ret = ksw_watch_init();
+ if (ret) {
+ pr_err("ksw_watch_init ret: %d\n", ret);
+ return ret;
+ }
+
+ ret = ksw_stack_init();
+ if (ret) {
+ pr_err("ksw_stack_init ret: %d\n", ret);
+ ksw_watch_exit();
+ return ret;
+ }
+ watching_active = true;
+
+ pr_info("start watching: %s\n", ksw_config->config_str);
+ return 0;
+}
+
+static void ksw_stop_watching(void)
+{
+ ksw_stack_exit();
+ ksw_watch_exit();
+ watching_active = false;
+
+ pr_info("stop watching: %s\n", ksw_config->config_str);
+}
+
/*
* Format of the configuration string:
* function+ip_offset[+depth] [local_var_offset:local_var_len]
@@ -109,6 +146,9 @@ static ssize_t kstackwatch_proc_write(struct file *file,
if (copy_from_user(input, buffer, count))
return -EFAULT;

+ if (watching_active)
+ ksw_stop_watching();
+
input[count] = '\0';
strim(input);

@@ -123,12 +163,22 @@ static ssize_t kstackwatch_proc_write(struct file *file,
return ret;
}

+ ret = ksw_start_watching();
+ if (ret) {
+ pr_err("Failed to start watching with %d\n", ret);
+ return ret;
+ }
+
return count;
}

static int kstackwatch_proc_show(struct seq_file *m, void *v)
{
- seq_printf(m, "%s\n", ksw_config->config_str);
+ if (watching_active)
+ seq_printf(m, "%s\n", ksw_config->config_str);
+ else
+ seq_puts(m, "not watching\n");
+
return 0;
}

@@ -176,6 +226,9 @@ static int __init kstackwatch_init(void)

static void __exit kstackwatch_exit(void)
{
+ if (watching_active)
+ ksw_stop_watching();
+
remove_proc_entry("kstackwatch", NULL);
kfree(ksw_config);

--
2.43.0

Jinchao Wang

Sep 12, 2025, 6:13:08 AM
Provide two debug helpers:

- ksw_watch_show(): print the current watch target address and length.
- ksw_watch_fire(): intentionally trigger the watchpoint immediately
by writing to the watched address, useful for testing HWBP behavior.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/kstackwatch.h | 2 ++
mm/kstackwatch/watch.c | 18 ++++++++++++++++++
2 files changed, 20 insertions(+)

diff --git a/mm/kstackwatch/kstackwatch.h b/mm/kstackwatch/kstackwatch.h
index 5ea2db76cdfb..9a4900df8ff8 100644
--- a/mm/kstackwatch/kstackwatch.h
+++ b/mm/kstackwatch/kstackwatch.h
@@ -47,5 +47,7 @@ int ksw_watch_init(void);
void ksw_watch_exit(void);
int ksw_watch_on(ulong watch_addr, u16 watch_len);
int ksw_watch_off(ulong watch_addr, u16 watch_len);
+void ksw_watch_show(void);
+void ksw_watch_fire(void);

#endif /* _KSTACKWATCH_H */
diff --git a/mm/kstackwatch/watch.c b/mm/kstackwatch/watch.c
index 795e779792da..2e9294595bf3 100644
--- a/mm/kstackwatch/watch.c
+++ b/mm/kstackwatch/watch.c
@@ -161,3 +161,21 @@ void ksw_watch_exit(void)
unregister_wide_hw_breakpoint(watch_events);
watch_events = NULL;
}
+
+/* self debug function */
+void ksw_watch_show(void)
+{
+ pr_info("watch target bp_addr: 0x%llx len:%llu\n", watch_attr.bp_addr,
+ watch_attr.bp_len);
+}
+EXPORT_SYMBOL_GPL(ksw_watch_show);
+
+/* self debug function */
+void ksw_watch_fire(void)
+{
+ char *ptr = (char *)watch_attr.bp_addr;
+
+ pr_warn("watch triggered immediately\n");
+ *ptr = 0x42; // This should trigger immediately for any bp_len
+}
+EXPORT_SYMBOL_GPL(ksw_watch_fire);
--
2.43.0
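
A note for readers on why the one-byte store in ksw_watch_fire() is expected to fire for any bp_len: the store starts at bp_addr, so it always overlaps the watched range. The following is a minimal userspace sketch of that reasoning under a simplified overlap model. The function name is hypothetical, and real debug-register matching (alignment, access type) is architecture-specific and not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model (an assumption, not the hardware definition): a write
 * hits the watchpoint when its byte range overlaps the watched byte range.
 */
static bool write_hits_watch(uint64_t watch_addr, uint64_t watch_len,
			     uint64_t write_addr, uint64_t write_len)
{
	return write_addr < watch_addr + watch_len &&
	       watch_addr < write_addr + write_len;
}
```

Under this model a one-byte write at the watch base hits for lengths 1, 2, 4 and 8, while a write that starts just past the watched range does not.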

Jinchao Wang

Sep 12, 2025, 6:13:14 AM
Introduce a separate test module to validate functionality in controlled
scenarios, such as stack canary writes and simulated corruption.

The module provides a proc interface (/proc/kstackwatch_test) that allows
triggering specific test cases via simple commands:

- test0: directly corrupt the canary to verify watch/fire behavior

The test module is built with optimizations disabled to ensure predictable
behavior.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/Kconfig.debug | 10 ++++
mm/kstackwatch/Makefile | 6 +++
mm/kstackwatch/test.c | 115 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 131 insertions(+)
create mode 100644 mm/kstackwatch/test.c

diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index fdfc6e6d0dec..46c280280980 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -320,3 +320,13 @@ config KSTACK_WATCH
the recursive depth of the monitored function.

If unsure, say N.
+
+config KSTACK_WATCH_TEST
+ tristate "KStackWatch Test Module"
+ depends on KSTACK_WATCH
+ help
+ This module provides controlled stack exhaustion and overflow scenarios
+ to verify the functionality of KStackWatch. It is particularly useful
+ for development and validation of the KStackWatch mechanism.
+
+ If unsure, say N.
diff --git a/mm/kstackwatch/Makefile b/mm/kstackwatch/Makefile
index 84a46cb9a766..d007b8dcd1c6 100644
--- a/mm/kstackwatch/Makefile
+++ b/mm/kstackwatch/Makefile
@@ -1,2 +1,8 @@
obj-$(CONFIG_KSTACK_WATCH) += kstackwatch.o
kstackwatch-y := kernel.o stack.o watch.o
+
+obj-$(CONFIG_KSTACK_WATCH_TEST) += kstackwatch_test.o
+kstackwatch_test-y := test.o
+CFLAGS_test.o := -fno-inline \
+ -fno-optimize-sibling-calls \
+ -fno-pic -fno-pie -O0 -Og
diff --git a/mm/kstackwatch/test.c b/mm/kstackwatch/test.c
new file mode 100644
index 000000000000..76dbfb042067
--- /dev/null
+++ b/mm/kstackwatch/test.c
@@ -0,0 +1,115 @@
+// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/delay.h>
+#include <linux/kthread.h>
+#include <linux/module.h>
+#include <linux/prandom.h>
+#include <linux/printk.h>
+#include <linux/proc_fs.h>
+#include <linux/string.h>
+#include <linux/uaccess.h>
+
+#include "kstackwatch.h"
+
+MODULE_AUTHOR("Jinchao Wang");
+MODULE_DESCRIPTION("Simple KStackWatch Test Module");
+MODULE_LICENSE("GPL");
+
+static struct proc_dir_entry *test_proc;
+#define BUFFER_SIZE 4
+#define MAX_DEPTH 6
+
+/*
+ * Test Case 0: Write to the canary position directly (Canary Test)
+ * use a u64 buffer array to ensure the canary will be placed
+ * corrupt the stack canary using the debug function
+ */
+static void canary_test_write(void)
+{
+ u64 buffer[BUFFER_SIZE];
+
+ pr_info("starting %s\n", __func__);
+ ksw_watch_show();
+ ksw_watch_fire();
+
+ buffer[0] = 0;
+
+ /* make sure the compiler does not drop the assignment */
+ barrier_data(buffer);
+ pr_info("canary write test completed\n");
+}
+
+static ssize_t test_proc_write(struct file *file, const char __user *buffer,
+ size_t count, loff_t *pos)
+{
+ char cmd[256];
+ int test_num;
+
+ if (count >= sizeof(cmd))
+ return -EINVAL;
+
+ if (copy_from_user(cmd, buffer, count))
+ return -EFAULT;
+
+ cmd[count] = '\0';
+ strim(cmd);
+
+ pr_info("received command: %s\n", cmd);
+
+ if (sscanf(cmd, "test%d", &test_num) == 1) {
+ switch (test_num) {
+ case 0:
+ pr_info("triggering canary write test\n");
+ canary_test_write();
+ break;
+ default:
+ pr_err("Unknown test number %d\n", test_num);
+ return -EINVAL;
+ }
+ } else {
+ pr_err("invalid command format. Use 'test<N>', e.g. 'test0'.\n");
+ return -EINVAL;
+ }
+
+ return count;
+}
+
+static ssize_t test_proc_read(struct file *file, char __user *buffer,
+ size_t count, loff_t *pos)
+{
+ static const char usage[] =
+ "KStackWatch Simplified Test Module\n"
+ "==================================\n"
+ "Usage:\n"
+ " echo 'test0' > /proc/kstackwatch_test - Canary write test\n";
+
+ return simple_read_from_buffer(buffer, count, pos, usage,
+ strlen(usage));
+}
+
+static const struct proc_ops test_proc_ops = {
+ .proc_read = test_proc_read,
+ .proc_write = test_proc_write,
+};
+
+static int __init kstackwatch_test_init(void)
+{
+ test_proc = proc_create("kstackwatch_test", 0600, NULL, &test_proc_ops);
+ if (!test_proc) {
+ pr_err("Failed to create proc entry\n");
+ return -ENOMEM;
+ }
+ pr_info("module loaded\n");
+ return 0;
+}
+
+static void __exit kstackwatch_test_exit(void)
+{
+ if (test_proc)
+ remove_proc_entry("kstackwatch_test", NULL);
+ pr_info("module unloaded\n");
+}
+
+module_init(kstackwatch_test_init);
+module_exit(kstackwatch_test_exit);
--
2.43.0
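
For readers who want to try the command format outside the kernel, here is a minimal userspace sketch of the "test<N>" parsing step used by test_proc_write() above. The helper name parse_test_cmd() is hypothetical. Rejecting negative numbers is an addition of this sketch, not something the patch does.

```c
#include <stdio.h>

/*
 * Hypothetical helper: returns the test number, or -1 if cmd does not
 * start with "test" followed by a non-negative decimal number.
 */
static int parse_test_cmd(const char *cmd)
{
	int n;

	/* sscanf() returns the number of converted fields, or EOF */
	if (sscanf(cmd, "test%d", &n) != 1 || n < 0)
		return -1;
	return n;
}
```

As in the patch, trailing characters after the number are not checked, so a stricter parser would need an extra test on what follows the digits.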

Jinchao Wang

Sep 12, 2025, 6:13:20 AM
Extend the test module with a new test case (test1) that intentionally
overflows a local u64 buffer to corrupt the stack canary. This helps
validate detection of stack corruption under overflow conditions.

The proc interface is updated to document the new test:

- test1: stack canary overflow test

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/test.c | 28 +++++++++++++++++++++++++++-
1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/test.c b/mm/kstackwatch/test.c
index 76dbfb042067..ab1a3f92b5e8 100644
--- a/mm/kstackwatch/test.c
+++ b/mm/kstackwatch/test.c
@@ -40,6 +40,27 @@ static void canary_test_write(void)
pr_info("canary write test completed\n");
}

+/*
+ * Test Case 1: Stack Overflow (Canary Test)
+ * This function performs a single 64-bit write just past the end of
+ * a u64 buffer to corrupt the stack canary in one operation
+ */
+static void canary_test_overflow(void)
+{
+ u64 buffer[BUFFER_SIZE];
+
+ pr_info("starting %s\n", __func__);
+ pr_info("buffer 0x%lx\n", (unsigned long)buffer);
+
+ /* intentionally overflow the u64 buffer. */
+ ((u64 *)buffer + BUFFER_SIZE)[0] = 0xdeadbeefdeadbeef;
+
+ /* make sure the compiler does not drop the assignment */
+ barrier_data(buffer);
+
+ pr_info("canary overflow test completed\n");
+}
+
static ssize_t test_proc_write(struct file *file, const char __user *buffer,
size_t count, loff_t *pos)
{
@@ -63,6 +84,10 @@ static ssize_t test_proc_write(struct file *file, const char __user *buffer,
pr_info("triggering canary write test\n");
canary_test_write();
break;
+ case 1:
+ pr_info("triggering canary overflow test\n");
+ canary_test_overflow();
+ break;
default:
pr_err("Unknown test number %d\n", test_num);
return -EINVAL;
@@ -82,7 +107,8 @@ static ssize_t test_proc_read(struct file *file, char __user *buffer,
"KStackWatch Simplified Test Module\n"
"==================================\n"
"Usage:\n"
- " echo 'test0' > /proc/kstackwatch_test - Canary write test\n";
+ " echo 'test0' > /proc/kstackwatch_test - Canary write test\n"
+ " echo 'test1' > /proc/kstackwatch_test - Canary overflow test\n";

return simple_read_from_buffer(buffer, count, pos, usage,
strlen(usage));
--
2.43.0
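
The overflow in canary_test_overflow() relies on the canary sitting directly above the buffer, which depends on the compiler and its options. A minimal userspace sketch of the idea follows. It assumes that layout and models the frame as one array whose last slot stands in for the canary, so the "overflow" stays inside a valid object. All names other than BUFFER_SIZE are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define BUFFER_SIZE 4

/*
 * Model a frame as BUFFER_SIZE buffer slots followed by one canary slot.
 * Returns 1 if the write one element past the buffer changed the canary.
 */
static int overflow_hits_canary(void)
{
	const uint64_t canary = 0x1122334455667788ULL;
	uint64_t frame[BUFFER_SIZE + 1];
	uint64_t *buffer = frame;	/* the "local buffer" view of the frame */

	memset(frame, 0, BUFFER_SIZE * sizeof(frame[0]));
	frame[BUFFER_SIZE] = canary;	/* stand-in for the stack canary */

	/* same index expression as the test: one element past the buffer */
	buffer[BUFFER_SIZE] = 0xdeadbeefdeadbeefULL;

	return frame[BUFFER_SIZE] != canary;
}
```

In the kernel test the write is a real out-of-bounds store, which is why the module is built with the special CFLAGS and why the watch offset has to be checked against the generated code.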

Jinchao Wang

Sep 12, 2025, 6:13:25 AM
Introduce a new test scenario to simulate silent stack corruption:

- silent_corruption_buggy():
exposes a local variable address globally without resetting it.
- silent_corruption_unwitting():
reads the exposed pointer and modifies the memory, simulating a routine
that unknowingly writes to another stack frame.
- silent_corruption_victim():
demonstrates the effect of silent corruption on unrelated local variables.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/test.c | 96 ++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 95 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/test.c b/mm/kstackwatch/test.c
index ab1a3f92b5e8..2b196f72ffd7 100644
--- a/mm/kstackwatch/test.c
+++ b/mm/kstackwatch/test.c
@@ -20,6 +20,9 @@ static struct proc_dir_entry *test_proc;
#define BUFFER_SIZE 4
#define MAX_DEPTH 6

+/* global variables for Silent corruption test */
+static u64 *g_corrupt_ptr;
+
/*
* Test Case 0: Write to the canary position directly (Canary Test)
* use a u64 buffer array to ensure the canary will be placed
@@ -61,6 +64,92 @@ static void canary_test_overflow(void)
pr_info("canary overflow test completed\n");
}

+static void do_something(int min_ms, int max_ms)
+{
+ u32 rand;
+
+ get_random_bytes(&rand, sizeof(rand));
+ rand = min_ms + rand % (max_ms - min_ms + 1);
+ msleep(rand);
+}
+
+static void silent_corruption_buggy(int i)
+{
+ u64 local_var;
+
+ pr_info("starting %s\n", __func__);
+
+ pr_info("%s %d local_var addr: 0x%lx\n", __func__, i,
+ (unsigned long)&local_var);
+ WRITE_ONCE(g_corrupt_ptr, &local_var);
+
+ do_something(50, 150);
+ //buggy: return without resetting g_corrupt_ptr
+}
+
+static void silent_corruption_victim(int i)
+{
+ u64 local_var;
+
+ local_var = 0xdeadbeef;
+ pr_info("starting %s %dth\n", __func__, i);
+ pr_info("%s local_var addr: 0x%lx\n", __func__,
+ (unsigned long)&local_var);
+
+ do_something(50, 150);
+
+ if (local_var != 0)
+ pr_info("%s %d happy with 0x%llx\n", __func__, i, local_var);
+ else
+ pr_info("%s %d unhappy with 0x%llx\n", __func__, i, local_var);
+}
+
+static int silent_corruption_unwitting(void *data)
+{
+ u64 *local_ptr;
+
+ pr_info("starting %s\n", __func__);
+
+ do {
+ local_ptr = READ_ONCE(g_corrupt_ptr);
+ do_something(500, 1000);
+ } while (!local_ptr);
+
+ local_ptr[0] = 0;
+
+ return 0;
+}
+
+/*
+ * Test Case 2: Silent Corruption
+ * buggy() does not protect its local var correctly
+ * unwitting() simply does its intended work
+ * victim() is unaware of what happened
+ */
+static void silent_corruption_test(void)
+{
+ struct task_struct *unwitting;
+
+ pr_info("starting %s\n", __func__);
+ WRITE_ONCE(g_corrupt_ptr, NULL);
+
+ unwitting = kthread_run(silent_corruption_unwitting, NULL, "unwitting");
+ if (IS_ERR(unwitting)) {
+ pr_err("failed to create unwitting thread\n");
+ return;
+ }
+
+ silent_corruption_buggy(0);
+
+ /*
+ * The unwitting thread writes through the stale pointer at an arbitrary
+ * time. Running the victim repeatedly makes it likely that the write
+ * lands in a live victim frame rather than in the caller's frame.
+ */
+ for (int i = 0; i < 20; i++)
+ silent_corruption_victim(i);
+}
+
static ssize_t test_proc_write(struct file *file, const char __user *buffer,
size_t count, loff_t *pos)
{
@@ -88,6 +177,10 @@ static ssize_t test_proc_write(struct file *file, const char __user *buffer,
pr_info("triggering canary overflow test\n");
canary_test_overflow();
break;
+ case 2:
+ pr_info("triggering silent corruption test\n");
+ silent_corruption_test();
+ break;
default:
pr_err("Unknown test number %d\n", test_num);
return -EINVAL;
@@ -108,7 +201,8 @@ static ssize_t test_proc_read(struct file *file, char __user *buffer,
"==================================\n"
"Usage:\n"
" echo 'test0' > /proc/kstackwatch_test - Canary write test\n"
- " echo 'test1' > /proc/kstackwatch_test - Canary overflow test\n";
+ " echo 'test1' > /proc/kstackwatch_test - Canary overflow test\n"
+ " echo 'test2' > /proc/kstackwatch_test - Silent corruption test\n";
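
The buggy/unwitting/victim interaction described in this patch can be illustrated without threads or real stack reuse. The following is a minimal userspace sketch that models the stack as an array with an explicit stack pointer, so that no undefined behavior is involved. All names are hypothetical, and the sequential call order stands in for the kthread timing of the real test.

```c
#include <stddef.h>
#include <stdint.h>

static uint64_t fake_stack[8];	/* stand-in for a kernel stack */
static size_t fake_sp;		/* index of the next free slot */
static uint64_t *g_stale_ptr;	/* plays the role of g_corrupt_ptr */

/* Publishes the address of its "local" and returns without clearing it. */
static void buggy(void)
{
	uint64_t *local = &fake_stack[fake_sp++];

	g_stale_ptr = local;
	fake_sp--;		/* frame released, pointer left dangling */
}

/* Writes through whatever pointer was published, unaware it is stale. */
static void unwitting(void)
{
	if (g_stale_ptr)
		*g_stale_ptr = 0;
}

/* Reuses the released slot; returns the value it sees after the callback. */
static uint64_t victim(void (*interloper)(void))
{
	uint64_t *local = &fake_stack[fake_sp++];
	uint64_t seen;

	*local = 0xdeadbeef;
	if (interloper)
		interloper();
	seen = *local;
	fake_sp--;
	return seen;
}
```

After buggy() has run, victim(NULL) still sees 0xdeadbeef, while victim(unwitting) sees 0 even though nothing in its own code wrote that value. That is the situation a watchpoint on the victim's local variable is meant to catch at the moment of the write.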

Jinchao Wang

Sep 12, 2025, 6:13:29 AM
Add a test that triggers stack writes across recursive calls, verifying
detection at specific recursion depths.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
mm/kstackwatch/test.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/mm/kstackwatch/test.c b/mm/kstackwatch/test.c
index 2b196f72ffd7..3e867d778e91 100644
--- a/mm/kstackwatch/test.c
+++ b/mm/kstackwatch/test.c
@@ -150,6 +150,27 @@ static void silent_corruption_test(void)
silent_corruption_victim(i);
}

+/*
+ * Test Case 3: Recursive Call Corruption
+ * Test corruption detection at specified recursion depth
+ */
+static void recursive_corruption_test(int depth)
+{
+ u64 buffer[BUFFER_SIZE];
+
+ pr_info("recursive call at depth %d\n", depth);
+ pr_info("buffer 0x%lx\n", (unsigned long)buffer);
+ if (depth <= MAX_DEPTH)
+ recursive_corruption_test(depth + 1);
+
+ buffer[0] = depth;
+
+ /* make sure the compiler does not drop the assignment */
+ barrier_data(buffer);
+
+ pr_info("returning from depth %d\n", depth);
+}
+
static ssize_t test_proc_write(struct file *file, const char __user *buffer,
size_t count, loff_t *pos)
{
@@ -181,6 +202,11 @@ static ssize_t test_proc_write(struct file *file, const char __user *buffer,
pr_info("triggering silent corruption test\n");
silent_corruption_test();
break;
+ case 3:
+ pr_info("triggering recursive corruption test\n");
+ /* depth starts at 0 */
+ recursive_corruption_test(0);
+ break;
default:
pr_err("Unknown test number %d\n", test_num);
return -EINVAL;
@@ -202,7 +228,8 @@ static ssize_t test_proc_read(struct file *file, char __user *buffer,
"Usage:\n"
" echo 'test0' > /proc/kstackwatch_test - Canary write test\n"
" echo 'test1' > /proc/kstackwatch_test - Canary overflow test\n"
- " echo 'test2' > /proc/kstackwatch_test - Silent corruption test\n";
+ " echo 'test2' > /proc/kstackwatch_test - Silent corruption test\n"
+ " echo 'test3' > /proc/kstackwatch_test - Recursive corruption test\n";
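
To make "detection at a specific recursion depth" concrete, here is a minimal userspace sketch of depth filtering with an entry/exit counter. It is only an assumption about how such filtering can work, not the series' implementation (which lives in patches not shown here). All names are hypothetical; the recursion bound mirrors the `depth <= MAX_DEPTH` check in the test.

```c
static int cur_depth;		/* nesting level of the watched function */
static int armed;		/* 1 while the watch would be active */
static int arm_count;		/* how many times the watch was armed */
static int armed_at = -1;	/* depth at which it was armed */

/* Entry hook: arm only when the current nesting level is the target. */
static void on_entry(int target)
{
	if (cur_depth == target) {
		armed = 1;
		armed_at = cur_depth;
		arm_count++;
	}
	cur_depth++;
}

/* Exit hook: disarm when the target-level invocation returns. */
static void on_exit(int target)
{
	cur_depth--;
	if (cur_depth == target)
		armed = 0;
}

/* Same recursion shape as the test: recurse while depth <= max_depth. */
static void recurse(int depth, int max_depth, int target)
{
	on_entry(target);
	if (depth <= max_depth)
		recurse(depth + 1, max_depth, target);
	on_exit(target);
}
```

With a target of 3 and a bound of 6, the watch is armed exactly once, at depth 3, and is disarmed again by the time the outermost call returns.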

Jinchao Wang

Sep 12, 2025, 6:13:34 AM
Provide a shell script to trigger test cases.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
tools/kstackwatch/kstackwatch_test.sh | 40 +++++++++++++++++++++++++++
1 file changed, 40 insertions(+)
create mode 100755 tools/kstackwatch/kstackwatch_test.sh

diff --git a/tools/kstackwatch/kstackwatch_test.sh b/tools/kstackwatch/kstackwatch_test.sh
new file mode 100755
index 000000000000..61e171439ab6
--- /dev/null
+++ b/tools/kstackwatch/kstackwatch_test.sh
@@ -0,0 +1,40 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+echo "IMPORTANT: Before running, make sure you have updated the offset values!"
+
+usage() {
+ echo "Usage: $0 [0-3]"
+ echo " 0 - Canary Write Test"
+ echo " 1 - Canary Overflow Test"
+ echo " 2 - Silent Corruption Test"
+ echo " 3 - Recursive Corruption Test"
+}
+
+run_test() {
+ local test_num=$1
+ case "$test_num" in
+ 0) echo "canary_test_write+0x19" >/proc/kstackwatch
+ echo "test0" >/proc/kstackwatch_test ;;
+ 1) echo "canary_test_overflow+0x1a" >/proc/kstackwatch
+ echo "test1" >/proc/kstackwatch_test ;;
+ 2) echo "silent_corruption_victim+0x32 0:8" >/proc/kstackwatch
+ echo "test2" >/proc/kstackwatch_test ;;
+ 3) echo "recursive_corruption_test+0x21+3 0:8" >/proc/kstackwatch
+ echo "test3" >/proc/kstackwatch_test ;;
+ *) usage
+ exit 1 ;;
+ esac
+ # Reset watch after test
+ echo >/proc/kstackwatch
+}
+
+# Check root and module
+[ "$EUID" -ne 0 ] && echo "Run as root" && exit 1
+for f in /proc/kstackwatch /proc/kstackwatch_test; do
+ [ ! -f "$f" ] && echo "$f not found" && exit 1
+done
+
+# Run
+[ -z "$1" ] && { usage; exit 0; }
+run_test "$1"
--
2.43.0

Jinchao Wang

Sep 12, 2025, 6:13:40 AM
Add a new documentation file for KStackWatch, explaining its purpose,
motivation, key features, configuration format, module parameters,
implementation notes, limitations, and testing instructions.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
Documentation/dev-tools/kstackwatch.rst | 94 +++++++++++++++++++++++++
1 file changed, 94 insertions(+)
create mode 100644 Documentation/dev-tools/kstackwatch.rst

diff --git a/Documentation/dev-tools/kstackwatch.rst b/Documentation/dev-tools/kstackwatch.rst
new file mode 100644
index 000000000000..f741de08ca56
--- /dev/null
+++ b/Documentation/dev-tools/kstackwatch.rst
@@ -0,0 +1,94 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================================
+KStackWatch: Kernel Stack Watch
+====================================
+
+Overview
+========
+KStackWatch is a lightweight debugging tool designed to detect
+kernel stack corruption in real time. It helps developers capture the
+moment corruption occurs, rather than only observing a later crash.
+
+Motivation
+==========
+Stack corruption may originate in one function but manifest much later
+with no direct call trace linking the two. This makes such issues
+extremely difficult to diagnose. KStackWatch addresses this by combining
+hardware breakpoints with kprobe and fprobe instrumentation, monitoring
+stack canaries or local variables at the point of corruption.
+
+Key Features
+============
+- Lightweight overhead:
+ Minimal runtime cost, preserving bug reproducibility.
+- Real-time detection:
+ Detect stack corruption immediately.
+- Flexible configuration:
+ Control via a procfs interface.
+- Depth filtering:
+ Optional recursion depth tracking per task.
+
+Configuration
+=============
+The control file is created at::
+
+ /proc/kstackwatch
+
+To configure, write a string in the following format::
+
+ function+ip_offset[+depth] [local_var_offset:local_var_len]
+ - function : name of the target function
+ - ip_offset : instruction pointer offset within the function
+ - depth : recursion depth to watch, starting from 0
+ - local_var_offset : offset from the stack pointer at function+ip_offset
+ - local_var_len : length of the local variable (1, 2, 4, or 8 bytes)
+
+Fields
+------
+- ``function``:
+ Name of the target function to watch.
+- ``ip_offset``:
+ Instruction pointer offset within the function.
+- ``depth`` (optional):
+ Recursion depth at which to watch, starting from 0.
+- ``local_var_offset:local_var_len`` (optional):
+ A region of a local variable to monitor, relative to the stack pointer.
+ If not given, KStackWatch monitors the stack canary by default.
+
+Examples
+--------
+1. Watch the canary at the entry of ``canary_test_write``::
+
+ echo 'canary_test_write+0x12' > /proc/kstackwatch
+
+2. Watch a local variable of 8 bytes at offset 0 in
+ ``silent_corruption_victim``::
+
+ echo 'silent_corruption_victim+0x7f 0:8' > /proc/kstackwatch
+
+Module Parameters
+=================
+``panic_on_catch`` (bool)
+ - If true, trigger a kernel panic immediately on detecting stack
+ corruption.
+ - Default is false (log a message only).
+
+Implementation Notes
+====================
+- Hardware breakpoints are preallocated at watch start.
+- Function exit is monitored using ``fprobe``.
+- Per-task depth tracking is used to handle recursion across scheduling.
+- The procfs interface allows dynamic reconfiguration at runtime.
+- Active state is cleared before applying new settings.
+
+Limitations
+===========
+- Only one active watch can be configured at a time (singleton).
+- Local variable offset and size must be known in advance.
+
+Testing
+=======
+KStackWatch includes a companion test module (``kstackwatch_test``) and
+a helper script (``kstackwatch_test.sh``) to exercise different stack
+corruption scenarios.
--
2.43.0
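
The configuration format documented above, `function+ip_offset[+depth] [local_var_offset:local_var_len]`, can be tried out in userspace. The following is a minimal parsing sketch, not the kernel parser from this series. The struct, its field names and ksw_parse_sketch() are hypothetical. It assumes numbers may be given in decimal or with a 0x prefix, as in the examples.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace mirror of one watch configuration. */
struct ksw_cfg_sketch {
	char func[64];
	unsigned long ip_offset;
	unsigned int depth;		/* 0 when omitted */
	unsigned long var_offset;	/* valid only if has_var is set */
	unsigned int var_len;
	int has_var;			/* 0 means "watch the canary" */
};

/* Returns 0 on success, -1 on a malformed string. */
static int ksw_parse_sketch(const char *s, struct ksw_cfg_sketch *c)
{
	char buf[128], *sp, *p, *end;

	memset(c, 0, sizeof(*c));
	if (strlen(s) >= sizeof(buf))
		return -1;
	strcpy(buf, s);

	/* split off the optional "offset:len" part at the first space */
	sp = strchr(buf, ' ');
	if (sp)
		*sp++ = '\0';

	/* function name: everything before the first '+' */
	p = strchr(buf, '+');
	if (!p || p == buf || (size_t)(p - buf) >= sizeof(c->func))
		return -1;
	memcpy(c->func, buf, p - buf);

	/* ip_offset, then an optional "+depth" */
	c->ip_offset = strtoul(p + 1, &end, 0);
	if (end == p + 1)
		return -1;
	if (*end == '+') {
		p = end + 1;
		c->depth = (unsigned int)strtoul(p, &end, 0);
		if (end == p)
			return -1;
	}
	if (*end != '\0')
		return -1;

	if (sp) {
		c->var_offset = strtoul(sp, &end, 0);
		if (end == sp || *end != ':')
			return -1;
		p = end + 1;
		c->var_len = (unsigned int)strtoul(p, &end, 0);
		if (end == p || *end != '\0')
			return -1;
		if (c->var_len != 1 && c->var_len != 2 &&
		    c->var_len != 4 && c->var_len != 8)
			return -1;
		c->has_var = 1;
	}
	return 0;
}
```

For example, "recursive_corruption_test+0x21+3 0:8" from the helper script parses to function recursive_corruption_test, ip_offset 0x21, depth 3 and an 8-byte region at offset 0, while "canary_test_write+0x19" leaves has_var clear, which corresponds to the documented default of watching the stack canary.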

Jinchao Wang

Sep 12, 2025, 6:13:45 AM
Add a maintainer entry for Kernel Stack Watch.

Signed-off-by: Jinchao Wang <wangjin...@gmail.com>
---
MAINTAINERS | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index cd7ff55b5d32..1baa989abf2d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13355,6 +13355,14 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
F: Documentation/dev-tools/kselftest*
F: tools/testing/selftests/

+KERNEL STACK WATCH
+M: Jinchao Wang <wangjin...@gmail.com>
+S: Maintained
+F: Documentation/dev-tools/kstackwatch.rst
+F: include/linux/kstackwatch_types.h
+F: mm/kstackwatch/
+F: tools/kstackwatch/
+
KERNEL SMB3 SERVER (KSMBD)
M: Namjae Jeon <linki...@kernel.org>
M: Namjae Jeon <linki...@samba.org>
--
2.43.0

Randy Dunlap

Sep 13, 2025, 12:07:52 AM
to Jinchao Wang, Andrew Morton, Masami Hiramatsu, Peter Zijlstra, Mike Rapoport, Alexander Potapenko, Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x...@kernel.org, H. Peter Anvin, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, Liang, Kan, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko, Nathan Chancellor, Nick Desaulniers, Bill Wendling, Justin Stitt, Kees Cook, Alice Ryhl, Sami Tolvanen, Miguel Ojeda, Masahiro Yamada, Rong Xu, Naveen N Rao, David Kaplan, Andrii Nakryiko, Jinjie Ruan, Nam Cao, work...@vger.kernel.org, linu...@vger.kernel.org, linux-...@vger.kernel.org, linux-pe...@vger.kernel.org, linu...@kvack.org, ll...@lists.linux.dev, Andrey Ryabinin, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, kasa...@googlegroups.com, David S. Miller, Mathieu Desnoyers, linux-tra...@vger.kernel.org


On 9/12/25 3:11 AM, Jinchao Wang wrote:
> diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
> index fdfc6e6d0dec..46c280280980 100644
> --- a/mm/Kconfig.debug
> +++ b/mm/Kconfig.debug
> @@ -320,3 +320,13 @@ config KSTACK_WATCH
> the recursive depth of the monitored function.
>
> If unsure, say N.
> +
> +config KSTACK_WATCH_TEST
> + tristate "KStackWatch Test Module"
> + depends on KSTACK_WATCH
> + help
> + This module provides controlled stack exhaustion and overflow scenarios
> + to verify the functionality of KStackWatch. It is particularly useful
> + for development and validation of the KStachWatch mechanism.

typo: s/KStachWatch/KStackWatch/

> +
> + If unsure, say N.

--
~Randy

Randy Dunlap

Sep 13, 2025, 12:13:23 AM


On 9/12/25 3:11 AM, Jinchao Wang wrote:
> +/**
> + * modify_wide_hw_breakpoint_local - update breakpoint config for local cpu
> + * @bp: the hwbp perf event for this cpu
> + * @attr: the new attribute for @bp
> + *
> + * This does not release and reserve the slot of HWBP, just reuse the current

of a HWBP; it just reuses

and preferable s/cpu/CPU/ in comments.

> + * slot on local CPU. So the users must update the other CPUs by themselves.
> + * Also, since this does not release/reserve the slot, this can not change the
> + * type to incompatible type of the HWBP.
> + * Return err if attr is invalid or the cpu fails to update debug register
> + * for new @attr.
> + */
> +#ifdef CONFIG_HAVE_REINSTALL_HW_BREAKPOINT
> +int modify_wide_hw_breakpoint_local(struct perf_event *bp,
> + struct perf_event_attr *attr)
> +{

--
~Randy
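
The slot-reuse behaviour described in the kernel-doc comment above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the patch's implementation: every name (toy_slot, toy_reserve, toy_modify_local), the rule that data and instruction breakpoints are mutually incompatible, and the accepted lengths are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy userspace model; none of these names come from the patch or the kernel. */
enum toy_bp_type { TOY_BP_WRITE, TOY_BP_RW, TOY_BP_EXEC };

struct toy_slot {
	bool reserved;          /* set once by toy_reserve() */
	enum toy_bp_type type;  /* current breakpoint type */
	uintptr_t addr;         /* watched address */
	size_t len;             /* watched length in bytes */
};

/* Assumption of this model: data and instruction types cannot be swapped. */
static bool toy_types_compatible(enum toy_bp_type a, enum toy_bp_type b)
{
	return (a == TOY_BP_EXEC) == (b == TOY_BP_EXEC);
}

/* Assumption of this model: only power-of-two lengths up to 8 are valid. */
static bool toy_len_valid(size_t len)
{
	return len == 1 || len == 2 || len == 4 || len == 8;
}

/* The "expensive" path: claim the slot for the first time. */
static int toy_reserve(struct toy_slot *slot, enum toy_bp_type type,
		       uintptr_t addr, size_t len)
{
	if (slot->reserved)
		return -EBUSY;
	if (!toy_len_valid(len))
		return -EINVAL;
	slot->reserved = true;
	slot->type = type;
	slot->addr = addr;
	slot->len = len;
	return 0;
}

/*
 * The "local modify" path: reuse the already reserved slot. It never
 * releases or reserves anything, so it refuses an incompatible type
 * and leaves the slot untouched on any error.
 */
static int toy_modify_local(struct toy_slot *slot, enum toy_bp_type type,
			    uintptr_t addr, size_t len)
{
	if (!slot->reserved)
		return -ENOENT;
	if (!toy_len_valid(len))
		return -EINVAL;
	if (!toy_types_compatible(slot->type, type))
		return -EINVAL;
	slot->type = type;
	slot->addr = addr;
	slot->len = len;
	return 0;
}
```

In this model one toy_slot stands for the slot on a single CPU, so a caller that wants the change on all CPUs has to repeat toy_modify_local() for each CPU's slot itself, mirroring the "users must update the other CPUs by themselves" note in the comment.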

Masami Hiramatsu

Sep 14, 2025, 9:03:01 AM
On Fri, 12 Sep 2025 21:13:07 -0700
Randy Dunlap <rdu...@infradead.org> wrote:

>
>
> On 9/12/25 3:11 AM, Jinchao Wang wrote:
> > +/**
> > + * modify_wide_hw_breakpoint_local - update breakpoint config for local cpu
> > + * @bp: the hwbp perf event for this cpu
> > + * @attr: the new attribute for @bp
> > + *
> > + * This does not release and reserve the slot of HWBP, just reuse the current
>
> of a HWBP; it just reuses

OK,

>
> and preferably s/cpu/CPU/ in comments.

OK.

Thanks for the review!

>
> > + * slot on local CPU. So the users must update the other CPUs by themselves.
> > + * Also, since this does not release/reserve the slot, this can not change the
> > + * type to incompatible type of the HWBP.
> > + * Return err if attr is invalid or the cpu fails to update debug register
> > + * for new @attr.
> > + */
> > +#ifdef CONFIG_HAVE_REINSTALL_HW_BREAKPOINT
> > +int modify_wide_hw_breakpoint_local(struct perf_event *bp,
> > + struct perf_event_attr *attr)
> > +{
>
> --
> ~Randy
>


--
Masami Hiramatsu (Google) <mhir...@kernel.org>

Masami Hiramatsu

Sep 14, 2025, 9:53:02 AM
On Fri, 12 Sep 2025 18:11:11 +0800
Jinchao Wang <wangjin...@gmail.com> wrote:

> Consolidate breakpoint management to reduce code duplication.
> The diffstat was misleading, so the stripped code size is compared instead.
> After refactoring, it is reduced from 11976 bytes to 11448 bytes on my
> x86_64 system built with clang.
>
> This also makes it easier to introduce arch_reinstall_hw_breakpoint().
>
> In addition, include linux/types.h to fix a missing build dependency.
>

Looks good to me.

Reviewed-by: Masami Hiramatsu (Google) <mhir...@kernel.org>

Thanks,

Masami Hiramatsu

Sep 14, 2025, 9:53:23 AM
On Fri, 12 Sep 2025 18:11:12 +0800
Jinchao Wang <wangjin...@gmail.com> wrote:

> The new arch_reinstall_hw_breakpoint() function can be used in an
> atomic context, unlike the more expensive free and re-allocation path.
> This allows callers to efficiently re-establish an existing breakpoint.
>

Looks good to me.

Reviewed-by: Masami Hiramatsu (Google) <mhir...@kernel.org>

Thanks!

Jinchao Wang

Sep 14, 2025, 10:03:16 PM
Thanks, will be fixed in the next version.
>
> > +
> > + If unsure, say N.
>
> --
> ~Randy
>

--
Jinchao