[PATCH v4 00/35] kmsan: Enable on s390


Ilya Leoshkevich

Jun 13, 2024, 11:39:46 AM
to Alexander Gordeev, Alexander Potapenko, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle, Ilya Leoshkevich
v3: https://lore.kernel.org/lkml/2023121323360...@linux.ibm.com/
v3 -> v4: Rebase.
Elaborate why the ftrace_ops_list_func() change is needed on
x86_64 (Steven).
Add a comment to the DFLTCC patch (Alexander P.).
Simplify diag224();
Improve __arch_local_irq_attributes style;
Use IS_ENABLED(CONFIG_KMSAN) for vmalloc area (Heiko).
Align vmalloc area on _SEGMENT_SIZE (Alexander G.).

v2: https://lore.kernel.org/lkml/20231121220155...@linux.ibm.com/
v2 -> v3: Drop kmsan_memmove_metadata() and strlcpy() patches;
Remove kmsan_get_metadata() stub;
Move kmsan_enable_current() and kmsan_disable_current() to
include/linux/kmsan.h, explain why a counter is needed;
Drop the memset_no_sanitize_memory() patch;
Use __memset() in the SLAB_POISON patch;
Add kmsan-checks.h to the DFLTCC patch;
Add recursion check to the arch_kmsan_get_meta_or_null()
patch (Alexander P.).

Fix inline + __no_kmsan_checks issues.
New patch for s390/irqflags, that resolves a lockdep warning.
New patch for s390/diag, that resolves a false positive when
running on an LPAR.
New patch for STCCTM, same as above.
New patch for check_bytes_and_report() that resolves a false
positive that occurs even on Intel.

v1: https://lore.kernel.org/lkml/20231115203401...@linux.ibm.com/
v1 -> v2: Add comments, sort #includes, introduce
memset_no_sanitize_memory() and use it to avoid unpoisoning
of redzones, change vmalloc alignment to _REGION3_SIZE, add
R-bs (Alexander P.).

Fix building
[PATCH 28/33] s390/string: Add KMSAN support
with FORTIFY_SOURCE.
Reported-by: kernel test robot <l...@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311170550...@intel.com/

Hi,

This series provides minimal support for the Kernel Memory Sanitizer
(KMSAN) on s390. KMSAN is clang-only instrumentation for finding
accesses to uninitialized memory. The clang support for s390 has already
been merged [1].

With this series, I can successfully boot s390 defconfig and
debug_defconfig with kmsan.panic=1. The tool found one real
s390-specific bug (fixed in master).

Best regards,
Ilya

[1] https://reviews.llvm.org/D148596

Ilya Leoshkevich (35):
ftrace: Unpoison ftrace_regs in ftrace_ops_list_func()
kmsan: Make the tests compatible with kmsan.panic=1
kmsan: Disable KMSAN when DEFERRED_STRUCT_PAGE_INIT is enabled
kmsan: Increase the maximum store size to 4096
kmsan: Fix is_bad_asm_addr() on arches with overlapping address spaces
kmsan: Fix kmsan_copy_to_user() on arches with overlapping address spaces
kmsan: Remove a useless assignment from kmsan_vmap_pages_range_noflush()
kmsan: Remove an x86-specific #include from kmsan.h
kmsan: Expose kmsan_get_metadata()
kmsan: Export panic_on_kmsan
kmsan: Allow disabling KMSAN checks for the current task
kmsan: Support SLAB_POISON
kmsan: Use ALIGN_DOWN() in kmsan_get_metadata()
kmsan: Do not round up pg_data_t size
mm: slub: Let KMSAN access metadata
mm: slub: Unpoison the memchr_inv() return value
mm: kfence: Disable KMSAN when checking the canary
lib/zlib: Unpoison DFLTCC output buffers
kmsan: Accept ranges starting with 0 on s390
s390/boot: Turn off KMSAN
s390: Use a larger stack for KMSAN
s390/boot: Add the KMSAN runtime stub
s390/checksum: Add a KMSAN check
s390/cpacf: Unpoison the results of cpacf_trng()
s390/cpumf: Unpoison STCCTM output buffer
s390/diag: Unpoison diag224() output buffer
s390/ftrace: Unpoison ftrace_regs in kprobe_ftrace_handler()
s390/irqflags: Do not instrument arch_local_irq_*() with KMSAN
s390/mm: Define KMSAN metadata for vmalloc and modules
s390/string: Add KMSAN support
s390/traps: Unpoison the kernel_stack_overflow()'s pt_regs
s390/uaccess: Add KMSAN support to put_user() and get_user()
s390/unwind: Disable KMSAN checks
s390: Implement the architecture-specific KMSAN functions
kmsan: Enable on s390

Documentation/dev-tools/kmsan.rst | 4 +-
arch/s390/Kconfig | 1 +
arch/s390/Makefile | 2 +-
arch/s390/boot/Makefile | 3 +
arch/s390/boot/kmsan.c | 6 ++
arch/s390/boot/startup.c | 7 ++
arch/s390/boot/string.c | 16 ++++
arch/s390/include/asm/checksum.h | 2 +
arch/s390/include/asm/cpacf.h | 3 +
arch/s390/include/asm/cpu_mf.h | 6 ++
arch/s390/include/asm/irqflags.h | 17 ++++-
arch/s390/include/asm/kmsan.h | 43 +++++++++++
arch/s390/include/asm/pgtable.h | 8 ++
arch/s390/include/asm/string.h | 20 +++--
arch/s390/include/asm/thread_info.h | 2 +-
arch/s390/include/asm/uaccess.h | 111 ++++++++++++++++++++--------
arch/s390/kernel/diag.c | 10 ++-
arch/s390/kernel/ftrace.c | 2 +
arch/s390/kernel/traps.c | 6 ++
arch/s390/kernel/unwind_bc.c | 4 +
drivers/s390/char/sclp.c | 2 +-
include/linux/kmsan.h | 33 +++++++++
include/linux/kmsan_types.h | 2 +-
kernel/trace/ftrace.c | 1 +
lib/zlib_dfltcc/dfltcc.h | 1 +
lib/zlib_dfltcc/dfltcc_util.h | 28 +++++++
mm/Kconfig | 1 +
mm/kfence/core.c | 11 ++-
mm/kmsan/core.c | 1 -
mm/kmsan/hooks.c | 23 ++++--
mm/kmsan/init.c | 7 +-
mm/kmsan/instrumentation.c | 11 +--
mm/kmsan/kmsan.h | 9 +--
mm/kmsan/kmsan_test.c | 5 ++
mm/kmsan/report.c | 8 +-
mm/kmsan/shadow.c | 9 +--
mm/slub.c | 17 ++++-
tools/objtool/check.c | 2 +
38 files changed, 361 insertions(+), 83 deletions(-)
create mode 100644 arch/s390/boot/kmsan.c
create mode 100644 arch/s390/include/asm/kmsan.h

--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:47 AM
Architectures use assembly code to initialize ftrace_regs and call
ftrace_ops_list_func(). Therefore, from KMSAN's point of view,
ftrace_regs is poisoned on entry to ftrace_ops_list_func(). This causes
KMSAN warnings when running the ftrace testsuite.

Fix by trusting the architecture-specific assembly code and always
unpoisoning ftrace_regs in ftrace_ops_list_func().

So far the issue was not encountered on x86_64 only by accident: the
assembly-allocated ftrace_regs happened to overlap a stale, partially
unpoisoned stack frame. Poisoning stack frames before returns [1]
makes the issue appear on x86_64 as well.

[1] https://github.com/iii-i/llvm-project/commits/msan-poison-allocas-before-returning-2024-06-12/
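
A minimal sketch of the underlying pattern (my illustration, not part of
the patch; the asm helper name is hypothetical): KMSAN only observes
stores made by instrumented C code, so a structure filled in by assembly
must be unpoisoned by hand before it is read.

    struct ftrace_regs fregs;

    init_fregs_in_asm(&fregs);                    /* opaque to KMSAN */
    kmsan_unpoison_memory(&fregs, sizeof(fregs)); /* trust the asm code */
    /* subsequent reads of fregs no longer produce false positives */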

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
kernel/trace/ftrace.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 65208d3b5ed9..c35ad4362d71 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -7407,6 +7407,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
void arch_ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct ftrace_regs *fregs)
{
+ kmsan_unpoison_memory(fregs, sizeof(*fregs));
__ftrace_ops_list_func(ip, parent_ip, NULL, fregs);
}
#else
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:48 AM
It's useful to have both the tests and kmsan.panic=1 enabled during
development, but right now the warnings that the tests trigger lead to
kernel panics.

Temporarily set kmsan.panic=0 for the duration of the KMSAN testing.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/kmsan_test.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
index 07d3a3a5a9c5..9bfd11674fe3 100644
--- a/mm/kmsan/kmsan_test.c
+++ b/mm/kmsan/kmsan_test.c
@@ -659,9 +659,13 @@ static void test_exit(struct kunit *test)
{
}

+static int orig_panic_on_kmsan;
+
static int kmsan_suite_init(struct kunit_suite *suite)
{
register_trace_console(probe_console, NULL);
+ orig_panic_on_kmsan = panic_on_kmsan;
+ panic_on_kmsan = 0;
return 0;
}

@@ -669,6 +673,7 @@ static void kmsan_suite_exit(struct kunit_suite *suite)
{
unregister_trace_console(probe_console, NULL);
tracepoint_synchronize_unregister();
+ panic_on_kmsan = orig_panic_on_kmsan;
}

static struct kunit_suite kmsan_test_suite = {
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:49 AM
Comparing pointers with TASK_SIZE does not make sense when kernel and
userspace overlap. Skip the comparison when this is the case.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/instrumentation.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/kmsan/instrumentation.c b/mm/kmsan/instrumentation.c
index 470b0b4afcc4..8a1bbbc723ab 100644
--- a/mm/kmsan/instrumentation.c
+++ b/mm/kmsan/instrumentation.c
@@ -20,7 +20,8 @@

static inline bool is_bad_asm_addr(void *addr, uintptr_t size, bool is_store)
{
- if ((u64)addr < TASK_SIZE)
+ if (IS_ENABLED(CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE) &&
+ (u64)addr < TASK_SIZE)
return true;
if (!kmsan_get_metadata(addr, KMSAN_META_SHADOW))
return true;
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:49 AM
Comparing pointers with TASK_SIZE does not make sense when kernel and
userspace overlap. Assume that we are handling user memory access in
this case.

Reported-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/hooks.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/kmsan/hooks.c b/mm/kmsan/hooks.c
index 22e8657800ef..b408714f9ba3 100644
--- a/mm/kmsan/hooks.c
+++ b/mm/kmsan/hooks.c
@@ -267,7 +267,8 @@ void kmsan_copy_to_user(void __user *to, const void *from, size_t to_copy,
return;

ua_flags = user_access_save();
- if ((u64)to < TASK_SIZE) {
+ if (!IS_ENABLED(CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE) ||
+ (u64)to < TASK_SIZE) {
/* This is a user memory access, check it. */
kmsan_internal_check_memory((void *)from, to_copy - left, to,
REASON_COPY_TO_USER);
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:49 AM
The inline assembly block in s390's chsc() stores up to 4096 bytes,
which is above the current limit.
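
For context (summarized from the diff below): clang instruments asm
output operands with calls like

    __msan_instrument_asm_store(addr, size);

and the KMSAN runtime sanity-checks that size against an upper bound.
Since chsc() legitimately stores a 4 KiB block, the bound is raised
from 512 to 4096.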

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/instrumentation.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/kmsan/instrumentation.c b/mm/kmsan/instrumentation.c
index cc3907a9c33a..470b0b4afcc4 100644
--- a/mm/kmsan/instrumentation.c
+++ b/mm/kmsan/instrumentation.c
@@ -110,11 +110,10 @@ void __msan_instrument_asm_store(void *addr, uintptr_t size)

ua_flags = user_access_save();
/*
- * Most of the accesses are below 32 bytes. The two exceptions so far
- * are clwb() (64 bytes) and FPU state (512 bytes).
- * It's unlikely that the assembly will touch more than 512 bytes.
+ * Most of the accesses are below 32 bytes. The exceptions so far are
+ * clwb() (64 bytes), FPU state (512 bytes) and chsc() (4096 bytes).
*/
- if (size > 512) {
+ if (size > 4096) {
WARN_ONCE(1, "assembly store size too big: %ld\n", size);
size = 8;
}
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:50 AM
KMSAN relies on memblock returning all available pages to it
(see kmsan_memblock_free_pages()). It partitions these pages into 3
categories: pages available to the buddy allocator, shadow pages and
origin pages. This partitioning is static.

If new pages appear after kmsan_init_runtime(), it is considered
an error. DEFERRED_STRUCT_PAGE_INIT causes this, so mark it as
incompatible with KMSAN.
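
Roughly, the static partitioning looks like this (an illustration, not
code from the patch):

    /*
     * For every page handed over by memblock during early init:
     *   page A -> returned to the buddy allocator
     *   page B -> becomes the shadow of page A
     *   page C -> becomes the origin storage of page A
     */

A page that only materializes after kmsan_init_runtime() has no shadow
or origin to pair with, hence the incompatibility.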

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index b4cb45255a54..9791fce5d0a7 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -946,6 +946,7 @@ config DEFERRED_STRUCT_PAGE_INIT
depends on SPARSEMEM
depends on !NEED_PER_CPU_KM
depends on 64BIT
+ depends on !KMSAN
select PADATA
help
Ordinarily all struct pages are initialised during early boot in a
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:51 AM
Replace the x86-specific asm/pgtable_64_types.h #include with the
linux/pgtable.h one, which all architectures have.

While at it, sort the headers alphabetically for the sake of
consistency with other KMSAN code.

Fixes: f80be4571b19 ("kmsan: add KMSAN runtime core")
Suggested-by: Heiko Carstens <h...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/kmsan.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/kmsan/kmsan.h b/mm/kmsan/kmsan.h
index a14744205435..adf443bcffe8 100644
--- a/mm/kmsan/kmsan.h
+++ b/mm/kmsan/kmsan.h
@@ -10,14 +10,14 @@
#ifndef __MM_KMSAN_KMSAN_H
#define __MM_KMSAN_KMSAN_H

-#include <asm/pgtable_64_types.h>
#include <linux/irqflags.h>
+#include <linux/mm.h>
+#include <linux/nmi.h>
+#include <linux/pgtable.h>
+#include <linux/printk.h>
#include <linux/sched.h>
#include <linux/stackdepot.h>
#include <linux/stacktrace.h>
-#include <linux/nmi.h>
-#include <linux/mm.h>
-#include <linux/printk.h>

#define KMSAN_ALLOCA_MAGIC_ORIGIN 0xabcd0100
#define KMSAN_CHAIN_MAGIC_ORIGIN 0xabcd0200
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:51 AM
Each s390 CPU has lowcore pages associated with it. Each CPU sees its
own lowcore at virtual address 0 through a hardware mechanism called
prefixing. Additionally, all lowcores are mapped to non-0 virtual
addresses stored in the lowcore_ptr[] array.

When lowcore is accessed through virtual address 0, one needs to
resolve metadata for lowcore_ptr[raw_smp_processor_id()].

Expose kmsan_get_metadata() to make it possible to do this from the
arch code.
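
A minimal sketch of the arch hook this export enables (my paraphrase;
the real implementation, including a recursion check, comes with the
architecture-specific KMSAN functions later in this series):

    static inline void *arch_kmsan_get_meta_or_null(void *addr, bool is_origin)
    {
            if (addr < (void *)sizeof(struct lowcore)) {
                    /* Prefixed access: redirect to this CPU's real lowcore. */
                    void *lc = lowcore_ptr[raw_smp_processor_id()];

                    return kmsan_get_metadata(lc + (unsigned long)addr, is_origin);
            }
            return NULL;    /* fall back to the generic lookup */
    }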

Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
include/linux/kmsan.h | 9 +++++++++
mm/kmsan/instrumentation.c | 1 +
mm/kmsan/kmsan.h | 1 -
3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/linux/kmsan.h b/include/linux/kmsan.h
index e0c23a32cdf0..fe6c2212bdb1 100644
--- a/include/linux/kmsan.h
+++ b/include/linux/kmsan.h
@@ -230,6 +230,15 @@ void kmsan_handle_urb(const struct urb *urb, bool is_out);
*/
void kmsan_unpoison_entry_regs(const struct pt_regs *regs);

+/**
+ * kmsan_get_metadata() - Return a pointer to KMSAN shadow or origins.
+ * @addr: kernel address.
+ * @is_origin: whether to return origins or shadow.
+ *
+ * Return NULL if metadata cannot be found.
+ */
+void *kmsan_get_metadata(void *addr, bool is_origin);
+
#else

static inline void kmsan_init_shadow(void)
diff --git a/mm/kmsan/instrumentation.c b/mm/kmsan/instrumentation.c
index 8a1bbbc723ab..94b49fac9d8b 100644
--- a/mm/kmsan/instrumentation.c
+++ b/mm/kmsan/instrumentation.c
@@ -14,6 +14,7 @@

#include "kmsan.h"
#include <linux/gfp.h>
+#include <linux/kmsan.h>
#include <linux/kmsan_string.h>
#include <linux/mm.h>
#include <linux/uaccess.h>
diff --git a/mm/kmsan/kmsan.h b/mm/kmsan/kmsan.h
index adf443bcffe8..34b83c301d57 100644
--- a/mm/kmsan/kmsan.h
+++ b/mm/kmsan/kmsan.h
@@ -66,7 +66,6 @@ struct shadow_origin_ptr {

struct shadow_origin_ptr kmsan_get_shadow_origin_ptr(void *addr, u64 size,
bool store);
-void *kmsan_get_metadata(void *addr, bool is_origin);
void __init kmsan_init_alloc_meta_for_range(void *start, void *end);

enum kmsan_bug_reason {
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:51 AM
When building the kmsan test as a module, modpost fails with the
following error message:

ERROR: modpost: "panic_on_kmsan" [mm/kmsan/kmsan_test.ko] undefined!

Export panic_on_kmsan in order to improve the KMSAN usability for
modules.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/report.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/mm/kmsan/report.c b/mm/kmsan/report.c
index 02736ec757f2..c79d3b0d2d0d 100644
--- a/mm/kmsan/report.c
+++ b/mm/kmsan/report.c
@@ -20,6 +20,7 @@ static DEFINE_RAW_SPINLOCK(kmsan_report_lock);
/* Protected by kmsan_report_lock */
static char report_local_descr[DESCR_SIZE];
int panic_on_kmsan __read_mostly;
+EXPORT_SYMBOL_GPL(panic_on_kmsan);

#ifdef MODULE_PARAM_PREFIX
#undef MODULE_PARAM_PREFIX
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:51 AM
Like for KASAN, it's useful to temporarily disable KMSAN checks around,
e.g., redzone accesses. Introduce kmsan_disable_current() and
kmsan_enable_current(), which are similar to their KASAN counterparts.

Make them reentrant in order to handle memory allocations in interrupt
context. Repurpose the allow_reporting field for this.
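
Usage sketch (an example of mine, not taken from the patch): the counter
makes nested pairs safe, e.g. when an interrupt allocates memory in the
middle of a disabled section:

    kmsan_disable_current();                  /* depth: 0 -> 1 */
    fault = memchr_inv(start, value, bytes);  /* reports suppressed */
    /* an IRQ here may nest its own disable/enable pair: 1 -> 2 -> 1 */
    kmsan_enable_current();                   /* depth: 1 -> 0, reports back on */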

Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
Documentation/dev-tools/kmsan.rst | 4 ++--
include/linux/kmsan.h | 24 ++++++++++++++++++++++++
include/linux/kmsan_types.h | 2 +-
mm/kmsan/core.c | 1 -
mm/kmsan/hooks.c | 18 +++++++++++++++---
mm/kmsan/report.c | 7 ++++---
tools/objtool/check.c | 2 ++
7 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/Documentation/dev-tools/kmsan.rst b/Documentation/dev-tools/kmsan.rst
index 323eedad53cd..022a823f5f1b 100644
--- a/Documentation/dev-tools/kmsan.rst
+++ b/Documentation/dev-tools/kmsan.rst
@@ -338,11 +338,11 @@ Per-task KMSAN state
~~~~~~~~~~~~~~~~~~~~

Every task_struct has an associated KMSAN task state that holds the KMSAN
-context (see above) and a per-task flag disallowing KMSAN reports::
+context (see above) and a per-task counter disallowing KMSAN reports::

struct kmsan_context {
...
- bool allow_reporting;
+ unsigned int depth;
struct kmsan_context_state cstate;
...
}
diff --git a/include/linux/kmsan.h b/include/linux/kmsan.h
index fe6c2212bdb1..23de1b3d6aee 100644
--- a/include/linux/kmsan.h
+++ b/include/linux/kmsan.h
@@ -239,6 +239,22 @@ void kmsan_unpoison_entry_regs(const struct pt_regs *regs);
*/
void *kmsan_get_metadata(void *addr, bool is_origin);

+/*
+ * kmsan_enable_current(): Enable KMSAN for the current task.
+ *
+ * Each kmsan_enable_current() call must be preceded by a
+ * kmsan_disable_current() call. These call pairs may be nested.
+ */
+void kmsan_enable_current(void);
+
+/*
+ * kmsan_disable_current(): Disable KMSAN for the current task.
+ *
+ * Each kmsan_disable_current() call must be followed by a
+ * kmsan_enable_current() call. These call pairs may be nested.
+ */
+void kmsan_disable_current(void);
+
#else

static inline void kmsan_init_shadow(void)
@@ -338,6 +354,14 @@ static inline void kmsan_unpoison_entry_regs(const struct pt_regs *regs)
{
}

+static inline void kmsan_enable_current(void)
+{
+}
+
+static inline void kmsan_disable_current(void)
+{
+}
+
#endif

#endif /* _LINUX_KMSAN_H */
diff --git a/include/linux/kmsan_types.h b/include/linux/kmsan_types.h
index 929287981afe..dfc59918b3c0 100644
--- a/include/linux/kmsan_types.h
+++ b/include/linux/kmsan_types.h
@@ -31,7 +31,7 @@ struct kmsan_context_state {
struct kmsan_ctx {
struct kmsan_context_state cstate;
int kmsan_in_runtime;
- bool allow_reporting;
+ unsigned int depth;
};

#endif /* _LINUX_KMSAN_TYPES_H */
diff --git a/mm/kmsan/core.c b/mm/kmsan/core.c
index 95f859e38c53..81b22220711a 100644
--- a/mm/kmsan/core.c
+++ b/mm/kmsan/core.c
@@ -43,7 +43,6 @@ void kmsan_internal_task_create(struct task_struct *task)
struct thread_info *info = current_thread_info();

__memset(ctx, 0, sizeof(*ctx));
- ctx->allow_reporting = true;
kmsan_internal_unpoison_memory(info, sizeof(*info), false);
}

diff --git a/mm/kmsan/hooks.c b/mm/kmsan/hooks.c
index b408714f9ba3..267d0afa2e8b 100644
--- a/mm/kmsan/hooks.c
+++ b/mm/kmsan/hooks.c
@@ -39,12 +39,10 @@ void kmsan_task_create(struct task_struct *task)

void kmsan_task_exit(struct task_struct *task)
{
- struct kmsan_ctx *ctx = &task->kmsan_ctx;
-
if (!kmsan_enabled || kmsan_in_runtime())
return;

- ctx->allow_reporting = false;
+ kmsan_disable_current();
}

void kmsan_slab_alloc(struct kmem_cache *s, void *object, gfp_t flags)
@@ -424,3 +422,17 @@ void kmsan_check_memory(const void *addr, size_t size)
REASON_ANY);
}
EXPORT_SYMBOL(kmsan_check_memory);
+
+void kmsan_enable_current(void)
+{
+ KMSAN_WARN_ON(current->kmsan_ctx.depth == 0);
+ current->kmsan_ctx.depth--;
+}
+EXPORT_SYMBOL(kmsan_enable_current);
+
+void kmsan_disable_current(void)
+{
+ current->kmsan_ctx.depth++;
+ KMSAN_WARN_ON(current->kmsan_ctx.depth == 0);
+}
+EXPORT_SYMBOL(kmsan_disable_current);
diff --git a/mm/kmsan/report.c b/mm/kmsan/report.c
index c79d3b0d2d0d..92e73ec61435 100644
--- a/mm/kmsan/report.c
+++ b/mm/kmsan/report.c
@@ -8,6 +8,7 @@
*/

#include <linux/console.h>
+#include <linux/kmsan.h>
#include <linux/moduleparam.h>
#include <linux/stackdepot.h>
#include <linux/stacktrace.h>
@@ -158,12 +159,12 @@ void kmsan_report(depot_stack_handle_t origin, void *address, int size,

if (!kmsan_enabled)
return;
- if (!current->kmsan_ctx.allow_reporting)
+ if (current->kmsan_ctx.depth)
return;
if (!origin)
return;

- current->kmsan_ctx.allow_reporting = false;
+ kmsan_disable_current();
ua_flags = user_access_save();
raw_spin_lock(&kmsan_report_lock);
pr_err("=====================================================\n");
@@ -216,5 +217,5 @@ void kmsan_report(depot_stack_handle_t origin, void *address, int size,
if (panic_on_kmsan)
panic("kmsan.panic set ...\n");
user_access_restore(ua_flags);
- current->kmsan_ctx.allow_reporting = true;
+ kmsan_enable_current();
}
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 0a33d9195b7a..01237d167223 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1202,6 +1202,8 @@ static const char *uaccess_safe_builtin[] = {
"__sanitizer_cov_trace_switch",
/* KMSAN */
"kmsan_copy_to_user",
+ "kmsan_disable_current",
+ "kmsan_enable_current",
"kmsan_report",
"kmsan_unpoison_entry_regs",
"kmsan_unpoison_memory",
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:52 AM
x86's alloc_node_data() rounds up the node data size to PAGE_SIZE. It's
not explained why this is needed, but it's most likely for performance
reasons, since the padding bytes are not used anywhere. Some other
architectures do it as well, e.g., mips rounds it up to the cache line
size.

kmsan_init_shadow() initializes metadata for each node data and assumes
the x86 rounding, which does not match other architectures. This may
cause the range end to overshoot the end of available memory, in turn
causing virt_to_page_or_null() in kmsan_init_alloc_meta_for_range() to
return NULL, which leads to kernel panic shortly after.

Since the padding bytes are not used, drop the rounding.

Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/init.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/kmsan/init.c b/mm/kmsan/init.c
index 3ac3b8921d36..9de76ac7062c 100644
--- a/mm/kmsan/init.c
+++ b/mm/kmsan/init.c
@@ -72,7 +72,7 @@ static void __init kmsan_record_future_shadow_range(void *start, void *end)
*/
void __init kmsan_init_shadow(void)
{
- const size_t nd_size = roundup(sizeof(pg_data_t), PAGE_SIZE);
+ const size_t nd_size = sizeof(pg_data_t);
phys_addr_t p_start, p_end;
u64 loop;
int nid;
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:52 AM
The value assigned to prot is immediately overwritten on the next line
with PAGE_KERNEL. The right hand side of the assignment has no
side-effects.

Fixes: b073d7f8aee4 ("mm: kmsan: maintain KMSAN metadata for page operations")
Suggested-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/shadow.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/mm/kmsan/shadow.c b/mm/kmsan/shadow.c
index b9d05aff313e..2d57408c78ae 100644
--- a/mm/kmsan/shadow.c
+++ b/mm/kmsan/shadow.c
@@ -243,7 +243,6 @@ int kmsan_vmap_pages_range_noflush(unsigned long start, unsigned long end,
s_pages[i] = shadow_page_for(pages[i]);
o_pages[i] = origin_page_for(pages[i]);
}
- prot = __pgprot(pgprot_val(prot) | _PAGE_NX);
prot = PAGE_KERNEL;

origin_start = vmalloc_meta((void *)start, KMSAN_META_ORIGIN);
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:52 AM
Avoid false KMSAN negatives with SLUB_DEBUG by allowing
kmsan_slab_free() to poison the freed memory, and by preventing
init_object() from unpoisoning new allocations by using __memset().

There are two alternatives to this approach. First, init_object()
can be marked with __no_sanitize_memory. This annotation should be used
with great care, because it drops all instrumentation from the
function, and any shadow writes will be lost. Even though this is not a
concern with the current init_object() implementation, this may change
in the future.

Second, kmsan_poison_memory() calls may be added after memset() calls.
The downside is that init_object() is called from
free_debug_processing(), in which case poisoning will erase the
distinction between simply uninitialized memory and UAF.
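
The distinction in one line (my summary, not code from the patch):

    memset(p, POISON_FREE, n);    /* instrumented: also marks the KMSAN
                                     shadow of p..p+n as initialized */
    __memset(p, POISON_FREE, n);  /* raw libcall: the bytes change, but
                                     the shadow stays poisoned */

Keeping the shadow poisoned is what lets a later read of the freed
object still be reported, instead of the poison pattern counting as
initialized data.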

Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/hooks.c | 2 +-
mm/slub.c | 13 +++++++++----
2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/mm/kmsan/hooks.c b/mm/kmsan/hooks.c
index 267d0afa2e8b..26d86dfdc819 100644
--- a/mm/kmsan/hooks.c
+++ b/mm/kmsan/hooks.c
@@ -74,7 +74,7 @@ void kmsan_slab_free(struct kmem_cache *s, void *object)
return;

/* RCU slabs could be legally used after free within the RCU period */
- if (unlikely(s->flags & (SLAB_TYPESAFE_BY_RCU | SLAB_POISON)))
+ if (unlikely(s->flags & SLAB_TYPESAFE_BY_RCU))
return;
/*
* If there's a constructor, freed memory must remain in the same state
diff --git a/mm/slub.c b/mm/slub.c
index 1373ac365a46..4dd55cabe701 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1139,7 +1139,12 @@ static void init_object(struct kmem_cache *s, void *object, u8 val)
unsigned int poison_size = s->object_size;

if (s->flags & SLAB_RED_ZONE) {
- memset(p - s->red_left_pad, val, s->red_left_pad);
+ /*
+ * Use __memset() here and below in order to avoid overwriting
+ * the KMSAN shadow. Keeping the shadow makes it possible to
+ * distinguish uninit-value from use-after-free.
+ */
+ __memset(p - s->red_left_pad, val, s->red_left_pad);

if (slub_debug_orig_size(s) && val == SLUB_RED_ACTIVE) {
/*
@@ -1152,12 +1157,12 @@ static void init_object(struct kmem_cache *s, void *object, u8 val)
}

if (s->flags & __OBJECT_POISON) {
- memset(p, POISON_FREE, poison_size - 1);
- p[poison_size - 1] = POISON_END;
+ __memset(p, POISON_FREE, poison_size - 1);
+ __memset(p + poison_size - 1, POISON_END, 1);
}

if (s->flags & SLAB_RED_ZONE)
- memset(p + poison_size, val, s->inuse - poison_size);
+ __memset(p + poison_size, val, s->inuse - poison_size);
}

static void restore_bytes(struct kmem_cache *s, char *message, u8 data,
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:53 AM
Building the kernel with CONFIG_SLUB_DEBUG and CONFIG_KMSAN causes
KMSAN to complain about touching redzones in kfree().

Fix by extending the existing KASAN-related metadata_access_enable()
and metadata_access_disable() functions to KMSAN.

Acked-by: Vlastimil Babka <vba...@suse.cz>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/slub.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index 4dd55cabe701..a290f6c63e7b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -829,10 +829,12 @@ static int disable_higher_order_debug;
static inline void metadata_access_enable(void)
{
kasan_disable_current();
+ kmsan_disable_current();
}

static inline void metadata_access_disable(void)
{
+ kmsan_enable_current();
kasan_enable_current();
}

--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:53 AM
The constraints of the DFLTCC inline assembly are not precise: they
do not communicate the size of the output buffers to the compiler, so
KMSAN cannot instrument the stores automatically.

Add the manual kmsan_unpoison_memory() calls for the output buffers.
The logic is the same as in [1].

[1] https://github.com/zlib-ng/zlib-ng/commit/1f5ddcc009ac3511e99fc88736a9e1a6381168c5

Reported-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
lib/zlib_dfltcc/dfltcc.h | 1 +
lib/zlib_dfltcc/dfltcc_util.h | 28 ++++++++++++++++++++++++++++
2 files changed, 29 insertions(+)

diff --git a/lib/zlib_dfltcc/dfltcc.h b/lib/zlib_dfltcc/dfltcc.h
index b96232bdd44d..0f2a16d7a48a 100644
--- a/lib/zlib_dfltcc/dfltcc.h
+++ b/lib/zlib_dfltcc/dfltcc.h
@@ -80,6 +80,7 @@ struct dfltcc_param_v0 {
uint8_t csb[1152];
};

+static_assert(offsetof(struct dfltcc_param_v0, csb) == 384);
static_assert(sizeof(struct dfltcc_param_v0) == 1536);

#define CVT_CRC32 0
diff --git a/lib/zlib_dfltcc/dfltcc_util.h b/lib/zlib_dfltcc/dfltcc_util.h
index 4a46b5009f0d..10509270d822 100644
--- a/lib/zlib_dfltcc/dfltcc_util.h
+++ b/lib/zlib_dfltcc/dfltcc_util.h
@@ -2,6 +2,8 @@
#ifndef DFLTCC_UTIL_H
#define DFLTCC_UTIL_H

+#include "dfltcc.h"
+#include <linux/kmsan-checks.h>
#include <linux/zutil.h>

/*
@@ -20,6 +22,7 @@ typedef enum {
#define DFLTCC_CMPR 2
#define DFLTCC_XPND 4
#define HBT_CIRCULAR (1 << 7)
+#define DFLTCC_FN_MASK ((1 << 7) - 1)
#define HB_BITS 15
#define HB_SIZE (1 << HB_BITS)

@@ -34,6 +37,7 @@ static inline dfltcc_cc dfltcc(
)
{
Byte *t2 = op1 ? *op1 : NULL;
+ unsigned char *orig_t2 = t2;
size_t t3 = len1 ? *len1 : 0;
const Byte *t4 = op2 ? *op2 : NULL;
size_t t5 = len2 ? *len2 : 0;
@@ -59,6 +63,30 @@ static inline dfltcc_cc dfltcc(
: "cc", "memory");
t2 = r2; t3 = r3; t4 = r4; t5 = r5;

+ /*
+ * Unpoison the parameter block and the output buffer.
+ * This is a no-op in non-KMSAN builds.
+ */
+ switch (fn & DFLTCC_FN_MASK) {
+ case DFLTCC_QAF:
+ kmsan_unpoison_memory(param, sizeof(struct dfltcc_qaf_param));
+ break;
+ case DFLTCC_GDHT:
+ kmsan_unpoison_memory(param, offsetof(struct dfltcc_param_v0, csb));
+ break;
+ case DFLTCC_CMPR:
+ kmsan_unpoison_memory(param, sizeof(struct dfltcc_param_v0));
+ kmsan_unpoison_memory(
+ orig_t2,
+ t2 - orig_t2 +
+ (((struct dfltcc_param_v0 *)param)->sbb == 0 ? 0 : 1));
+ break;
+ case DFLTCC_XPND:
+ kmsan_unpoison_memory(param, sizeof(struct dfltcc_param_v0));
+ kmsan_unpoison_memory(orig_t2, t2 - orig_t2);
+ break;
+ }
+
if (op1)
*op1 = t2;
if (len1)
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:53 AM
Even though the KMSAN warnings generated by memchr_inv() are suppressed
by metadata_access_enable(), its return value may still be poisoned.

The reason is that the last iteration of memchr_inv() returns
`*start != value ? start : NULL`, where *start is poisoned. Because of
this, somewhat counterintuitively, the shadow value computed by
visitSelectInst() is equal to `(uintptr_t)start`.
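
Expanded reasoning (a paraphrase of the MSan select rule, not text from
the patch): for `res = cond ? a : b` with a poisoned cond, the shadow of
res also picks up the bits in which the two operands differ, roughly:

    shadow(res) ~= select(cond, shadow(a), shadow(b))
                   | shadow(cond) & ((a ^ b) | shadow(a) | shadow(b))
                 = 0 | ((uintptr_t)start ^ 0)
                 = (uintptr_t)start

So even though start and NULL are both fully initialized values, the
returned pointer carries a nonzero shadow.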

The intention behind wrapping memchr_inv() in metadata_access_enable()
and metadata_access_disable() is to touch poisoned metadata without
triggering KMSAN, so unpoison its return value explicitly.

Acked-by: Vlastimil Babka <vba...@suse.cz>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/slub.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index a290f6c63e7b..b9101b2dc9aa 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1185,6 +1185,7 @@ static int check_bytes_and_report(struct kmem_cache *s, struct slab *slab,
metadata_access_enable();
fault = memchr_inv(kasan_reset_tag(start), value, bytes);
metadata_access_disable();
+ kmsan_unpoison_memory(&fault, sizeof(fault));
if (!fault)
return 1;

@@ -1291,6 +1292,7 @@ static void slab_pad_check(struct kmem_cache *s, struct slab *slab)
metadata_access_enable();
fault = memchr_inv(kasan_reset_tag(pad), POISON_INUSE, remainder);
metadata_access_disable();
+ kmsan_unpoison_memory(&fault, sizeof(fault));
if (!fault)
return;
while (end > fault && end[-1] == POISON_INUSE)
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:54 AM
Improve the readability by replacing the custom aligning logic with
ALIGN_DOWN(). Unlike other places where a similar sequence is used,
there is no size parameter that needs to be adjusted, so the standard
macro fits.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/shadow.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/kmsan/shadow.c b/mm/kmsan/shadow.c
index 2d57408c78ae..9c58f081d84f 100644
--- a/mm/kmsan/shadow.c
+++ b/mm/kmsan/shadow.c
@@ -123,14 +123,12 @@ struct shadow_origin_ptr kmsan_get_shadow_origin_ptr(void *address, u64 size,
*/
void *kmsan_get_metadata(void *address, bool is_origin)
{
- u64 addr = (u64)address, pad, off;
+ u64 addr = (u64)address, off;
struct page *page;
void *ret;

- if (is_origin && !IS_ALIGNED(addr, KMSAN_ORIGIN_SIZE)) {
- pad = addr % KMSAN_ORIGIN_SIZE;
- addr -= pad;
- }
+ if (is_origin)
+ addr = ALIGN_DOWN(addr, KMSAN_ORIGIN_SIZE);
address = (void *)addr;
if (kmsan_internal_is_vmalloc_addr(address) ||
kmsan_internal_is_module_addr(address))
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:57 AM
All other sanitizers are disabled for boot as well. While at it, add a
comment explaining why we need this.

Reviewed-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/boot/Makefile | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
index 070c9b2e905f..526ed20b9d31 100644
--- a/arch/s390/boot/Makefile
+++ b/arch/s390/boot/Makefile
@@ -3,11 +3,13 @@
# Makefile for the linux s390-specific parts of the memory manager.
#

+# Tooling runtimes are unavailable and cannot be linked for early boot code
KCOV_INSTRUMENT := n
GCOV_PROFILE := n
UBSAN_SANITIZE := n
KASAN_SANITIZE := n
KCSAN_SANITIZE := n
+KMSAN_SANITIZE := n

KBUILD_AFLAGS := $(KBUILD_AFLAGS_DECOMPRESSOR)
KBUILD_CFLAGS := $(KBUILD_CFLAGS_DECOMPRESSOR)
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:57 AM
On s390 the virtual address 0 is valid (current CPU's lowcore is mapped
there), therefore KMSAN should not complain about it.

Disable the respective check on s390. There doesn't seem to be a
Kconfig option to describe this situation, so explicitly check for
s390.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/init.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/kmsan/init.c b/mm/kmsan/init.c
index 9de76ac7062c..3f8b1bbb9060 100644
--- a/mm/kmsan/init.c
+++ b/mm/kmsan/init.c
@@ -33,7 +33,10 @@ static void __init kmsan_record_future_shadow_range(void *start, void *end)
bool merged = false;

KMSAN_WARN_ON(future_index == NUM_FUTURE_RANGES);
- KMSAN_WARN_ON((nstart >= nend) || !nstart || !nend);
+ KMSAN_WARN_ON((nstart >= nend) ||
+ /* Virtual address 0 is valid on s390. */
+ (!IS_ENABLED(CONFIG_S390) && !nstart) ||
+ !nend);
nstart = ALIGN_DOWN(nstart, PAGE_SIZE);
nend = ALIGN(nend, PAGE_SIZE);

--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:57 AM
Adjust the stack size for the KMSAN-enabled kernel like it was done
for the KASAN-enabled one in commit 7fef92ccadd7 ("s390/kasan: double
the stack size"). Both tools have similar requirements.

Reviewed-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/Makefile | 2 +-
arch/s390/include/asm/thread_info.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/s390/Makefile b/arch/s390/Makefile
index f2b21c7a70ef..7fd57398221e 100644
--- a/arch/s390/Makefile
+++ b/arch/s390/Makefile
@@ -36,7 +36,7 @@ KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_DEBUG_INFO_DWARF4), $(call cc-option
KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_CC_NO_ARRAY_BOUNDS),-Wno-array-bounds)

UTS_MACHINE := s390x
-STACK_SIZE := $(if $(CONFIG_KASAN),65536,16384)
+STACK_SIZE := $(if $(CONFIG_KASAN),65536,$(if $(CONFIG_KMSAN),65536,16384))
CHECKFLAGS += -D__s390__ -D__s390x__

export LD_BFD
diff --git a/arch/s390/include/asm/thread_info.h b/arch/s390/include/asm/thread_info.h
index a674c7d25da5..d02a709717b8 100644
--- a/arch/s390/include/asm/thread_info.h
+++ b/arch/s390/include/asm/thread_info.h
@@ -16,7 +16,7 @@
/*
* General size of kernel stacks
*/
-#ifdef CONFIG_KASAN
+#if defined(CONFIG_KASAN) || defined(CONFIG_KMSAN)
#define THREAD_SIZE_ORDER 4
#else
#define THREAD_SIZE_ORDER 2
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:57 AM
KMSAN warns about check_canary() accessing the canary.

The reason is that, even though set_canary() is properly instrumented
and sets shadow, slub explicitly poisons the canary's address range
afterwards.

Unpoisoning the canary is not the right thing to do: only
check_canary() is supposed to ever touch it. Instead, disable KMSAN
checks around canary read accesses.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Tested-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kfence/core.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 964b8482275b..cce330d5b797 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -305,8 +305,14 @@ metadata_update_state(struct kfence_metadata *meta, enum kfence_object_state nex
WRITE_ONCE(meta->state, next);
}

+#ifdef CONFIG_KMSAN
+#define CHECK_CANARY_ATTRIBUTES noinline __no_kmsan_checks
+#else
+#define CHECK_CANARY_ATTRIBUTES inline
+#endif
+
/* Check canary byte at @addr. */
-static inline bool check_canary_byte(u8 *addr)
+static CHECK_CANARY_ATTRIBUTES bool check_canary_byte(u8 *addr)
{
struct kfence_metadata *meta;
unsigned long flags;
@@ -341,7 +347,8 @@ static inline void set_canary(const struct kfence_metadata *meta)
*((u64 *)addr) = KFENCE_CANARY_PATTERN_U64;
}

-static inline void check_canary(const struct kfence_metadata *meta)
+static CHECK_CANARY_ATTRIBUTES void
+check_canary(const struct kfence_metadata *meta)
{
const unsigned long pageaddr = ALIGN_DOWN(meta->addr, PAGE_SIZE);
unsigned long addr = pageaddr;
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:58 AM
Diagnose 224 stores 4k bytes, which currently cannot be deduced from
the inline assembly constraints. This leads to KMSAN false positives.

Fix the constraints by using a 4k-sized struct instead of a raw
pointer. While at it, prettify them too.

Suggested-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/kernel/diag.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/s390/kernel/diag.c b/arch/s390/kernel/diag.c
index 8dee9aa0ec95..8a7009618ba7 100644
--- a/arch/s390/kernel/diag.c
+++ b/arch/s390/kernel/diag.c
@@ -278,12 +278,14 @@ int diag224(void *ptr)
int rc = -EOPNOTSUPP;

diag_stat_inc(DIAG_STAT_X224);
- asm volatile(
- " diag %1,%2,0x224\n"
- "0: lhi %0,0x0\n"
+ asm volatile("\n"
+ " diag %[type],%[addr],0x224\n"
+ "0: lhi %[rc],0\n"
"1:\n"
EX_TABLE(0b,1b)
- : "+d" (rc) :"d" (0), "d" (addr) : "memory");
+ : [rc] "+d" (rc)
+ , "=m" (*(struct { char buf[PAGE_SIZE]; } *)ptr)
+ : [type] "d" (0), [addr] "d" (addr));
return rc;
}
EXPORT_SYMBOL(diag224);
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:58 AM
Prevent KMSAN from complaining about buffers filled by cpacf_trng()
being uninitialized.

Tested-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/include/asm/cpacf.h | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/s390/include/asm/cpacf.h b/arch/s390/include/asm/cpacf.h
index c786538e397c..dae8843b164f 100644
--- a/arch/s390/include/asm/cpacf.h
+++ b/arch/s390/include/asm/cpacf.h
@@ -12,6 +12,7 @@
#define _ASM_S390_CPACF_H

#include <asm/facility.h>
+#include <linux/kmsan-checks.h>

/*
* Instruction opcodes for the CPACF instructions
@@ -542,6 +543,8 @@ static inline void cpacf_trng(u8 *ucbuf, unsigned long ucbuf_len,
: [ucbuf] "+&d" (u.pair), [cbuf] "+&d" (c.pair)
: [fc] "K" (CPACF_PRNO_TRNG), [opc] "i" (CPACF_PRNO)
: "cc", "memory", "0");
+ kmsan_unpoison_memory(ucbuf, ucbuf_len);
+ kmsan_unpoison_memory(cbuf, cbuf_len);
}

/**
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:58 AM
It should be possible to have inline functions in the s390 header
files that call kmsan_unpoison_memory(). The problem is that these
header files might be included by the decompressor, which does not
contain KMSAN runtime, causing linker errors.

Not compiling these calls if __SANITIZE_MEMORY__ is not defined -
either by changing kmsan-checks.h or at the call sites - may cause
unintended side effects, since calling these functions from
uninstrumented code that is linked into the kernel is a valid use case.

One might want to explicitly distinguish between the kernel and the
decompressor. Checking for a decompressor-specific #define is quite
heavy-handed, and will have to be done at all call sites.

A more generic approach is to provide a dummy kmsan_unpoison_memory()
definition. This produces some runtime overhead, but only when building
with CONFIG_KMSAN. The benefit is that it does not disturb the existing
KMSAN build logic and call sites don't need to be changed.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/boot/Makefile | 1 +
arch/s390/boot/kmsan.c | 6 ++++++
2 files changed, 7 insertions(+)
create mode 100644 arch/s390/boot/kmsan.c

diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
index 526ed20b9d31..e7658997452b 100644
--- a/arch/s390/boot/Makefile
+++ b/arch/s390/boot/Makefile
@@ -44,6 +44,7 @@ obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE)) +=
obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
obj-y += $(if $(CONFIG_KERNEL_UNCOMPRESSED),,decompressor.o) info.o
obj-$(CONFIG_KERNEL_ZSTD) += clz_ctz.o
+obj-$(CONFIG_KMSAN) += kmsan.o
obj-all := $(obj-y) piggy.o syms.o

targets := bzImage section_cmp.boot.data section_cmp.boot.preserved.data $(obj-y)
diff --git a/arch/s390/boot/kmsan.c b/arch/s390/boot/kmsan.c
new file mode 100644
index 000000000000..e7b3ac48143e
--- /dev/null
+++ b/arch/s390/boot/kmsan.c
@@ -0,0 +1,6 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kmsan-checks.h>
+
+void kmsan_unpoison_memory(const void *address, size_t size)
+{
+}
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:39:59 AM
stcctm() uses the "Q" constraint for dest; therefore, KMSAN does not
understand that it fills multiple doublewords pointed to by dest, not
just one. This results in false positives.

Unpoison the whole dest manually with kmsan_unpoison_memory().

Reported-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/include/asm/cpu_mf.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/s390/include/asm/cpu_mf.h b/arch/s390/include/asm/cpu_mf.h
index a0de5b9b02ea..9e4bbc3e53f8 100644
--- a/arch/s390/include/asm/cpu_mf.h
+++ b/arch/s390/include/asm/cpu_mf.h
@@ -10,6 +10,7 @@
#define _ASM_S390_CPU_MF_H

#include <linux/errno.h>
+#include <linux/kmsan-checks.h>
#include <asm/asm-extable.h>
#include <asm/facility.h>

@@ -239,6 +240,11 @@ static __always_inline int stcctm(enum stcctm_ctr_set set, u64 range, u64 *dest)
: "=d" (cc)
: "Q" (*dest), "d" (range), "i" (set)
: "cc", "memory");
+ /*
+ * If cc == 2, less than RANGE counters are stored, but it's not easy
+ * to tell how many. Always unpoison the whole range for simplicity.
+ */
+ kmsan_unpoison_memory(dest, range * sizeof(u64));
return cc;
}

--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:40:00 AM
Add a KMSAN check to the CKSM inline assembly, similar to how it was
done for ASAN in commit e42ac7789df6 ("s390/checksum: always use cksm
instruction").

Acked-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/include/asm/checksum.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/s390/include/asm/checksum.h b/arch/s390/include/asm/checksum.h
index b89159591ca0..46f5c9660616 100644
--- a/arch/s390/include/asm/checksum.h
+++ b/arch/s390/include/asm/checksum.h
@@ -13,6 +13,7 @@
#define _S390_CHECKSUM_H

#include <linux/instrumented.h>
+#include <linux/kmsan-checks.h>
#include <linux/in6.h>

static inline __wsum cksm(const void *buff, int len, __wsum sum)
@@ -23,6 +24,7 @@ static inline __wsum cksm(const void *buff, int len, __wsum sum)
};

instrument_read(buff, len);
+ kmsan_check_memory(buff, len);
asm volatile("\n"
"0: cksm %[sum],%[rp]\n"
" jo 0b\n"
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:40:01 AM
Unpoisoning pt_regs is normally done by the generic entry code, but the
kernel_stack_overflow() flow bypasses it.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/kernel/traps.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/s390/kernel/traps.c b/arch/s390/kernel/traps.c
index 52578b5cecbd..dde69d2a64f0 100644
--- a/arch/s390/kernel/traps.c
+++ b/arch/s390/kernel/traps.c
@@ -27,6 +27,7 @@
#include <linux/uaccess.h>
#include <linux/cpu.h>
#include <linux/entry-common.h>
+#include <linux/kmsan.h>
#include <asm/asm-extable.h>
#include <asm/vtime.h>
#include <asm/fpu.h>
@@ -262,6 +263,11 @@ static void monitor_event_exception(struct pt_regs *regs)

void kernel_stack_overflow(struct pt_regs *regs)
{
+ /*
+ * Normally regs are unpoisoned by the generic entry code, but
+ * kernel_stack_overflow() is a rare case that is called bypassing it.
+ */
+ kmsan_unpoison_entry_regs(regs);
bust_spinlocks(1);
printk("Kernel stack overflow.\n");
show_regs(regs);
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:40:01 AM
Add KMSAN support for the s390 implementations of the string functions.
Do this similarly to how it's already done for KASAN, except that the
optimized memset{16,32,64}() functions need to be disabled: KMSAN must
be able to see that they initialize memory.

The way the boot code is built with regard to string functions is
problematic: most files are compiled as if sanitizers were enabled, but
boot/string.c is not. This creates various problems with the
memset64() definitions, depending on whether the code is built with
sanitizers or fortify. This should probably be streamlined, but in the
meantime resolve the issues by introducing the IN_BOOT_STRING_C macro,
similar to the existing IN_ARCH_STRING_C macro.
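
A sketch of the problem with the optimized variants (do_something() is
a hypothetical stand-in for any read of the buffer):

	u64 buf[4];

	__memset64(buf, 0, sizeof(buf));	/* assembly: stores invisible to KMSAN */
	do_something(buf[0]);			/* shadow still poisoned: false positive */

With __HAVE_ARCH_MEMSET64 left undefined under KMSAN, the instrumented
generic C implementation is used instead, and the shadow is updated
correctly.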

Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/boot/string.c | 16 ++++++++++++++++
arch/s390/include/asm/string.h | 20 +++++++++++++++-----
2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/arch/s390/boot/string.c b/arch/s390/boot/string.c
index faccb33b462c..f6b9b1df48a8 100644
--- a/arch/s390/boot/string.c
+++ b/arch/s390/boot/string.c
@@ -1,11 +1,18 @@
// SPDX-License-Identifier: GPL-2.0
+#define IN_BOOT_STRING_C 1
#include <linux/ctype.h>
#include <linux/kernel.h>
#include <linux/errno.h>
#undef CONFIG_KASAN
#undef CONFIG_KASAN_GENERIC
+#undef CONFIG_KMSAN
#include "../lib/string.c"

+/*
+ * Duplicate some functions from the common lib/string.c
+ * instead of fully including it.
+ */
+
int strncmp(const char *cs, const char *ct, size_t count)
{
unsigned char c1, c2;
@@ -22,6 +29,15 @@ int strncmp(const char *cs, const char *ct, size_t count)
return 0;
}

+void *memset64(uint64_t *s, uint64_t v, size_t count)
+{
+ uint64_t *xs = s;
+
+ while (count--)
+ *xs++ = v;
+ return s;
+}
+
char *skip_spaces(const char *str)
{
while (isspace(*str))
diff --git a/arch/s390/include/asm/string.h b/arch/s390/include/asm/string.h
index 351685de53d2..2ab868cbae6c 100644
--- a/arch/s390/include/asm/string.h
+++ b/arch/s390/include/asm/string.h
@@ -15,15 +15,12 @@
#define __HAVE_ARCH_MEMCPY /* gcc builtin & arch function */
#define __HAVE_ARCH_MEMMOVE /* gcc builtin & arch function */
#define __HAVE_ARCH_MEMSET /* gcc builtin & arch function */
-#define __HAVE_ARCH_MEMSET16 /* arch function */
-#define __HAVE_ARCH_MEMSET32 /* arch function */
-#define __HAVE_ARCH_MEMSET64 /* arch function */

void *memcpy(void *dest, const void *src, size_t n);
void *memset(void *s, int c, size_t n);
void *memmove(void *dest, const void *src, size_t n);

-#ifndef CONFIG_KASAN
+#if !defined(CONFIG_KASAN) && !defined(CONFIG_KMSAN)
#define __HAVE_ARCH_MEMCHR /* inline & arch function */
#define __HAVE_ARCH_MEMCMP /* arch function */
#define __HAVE_ARCH_MEMSCAN /* inline & arch function */
@@ -36,6 +33,9 @@ void *memmove(void *dest, const void *src, size_t n);
#define __HAVE_ARCH_STRNCPY /* arch function */
#define __HAVE_ARCH_STRNLEN /* inline & arch function */
#define __HAVE_ARCH_STRSTR /* arch function */
+#define __HAVE_ARCH_MEMSET16 /* arch function */
+#define __HAVE_ARCH_MEMSET32 /* arch function */
+#define __HAVE_ARCH_MEMSET64 /* arch function */

/* Prototypes for non-inlined arch strings functions. */
int memcmp(const void *s1, const void *s2, size_t n);
@@ -44,7 +44,7 @@ size_t strlcat(char *dest, const char *src, size_t n);
char *strncat(char *dest, const char *src, size_t n);
char *strncpy(char *dest, const char *src, size_t n);
char *strstr(const char *s1, const char *s2);
-#endif /* !CONFIG_KASAN */
+#endif /* !defined(CONFIG_KASAN) && !defined(CONFIG_KMSAN) */

#undef __HAVE_ARCH_STRCHR
#undef __HAVE_ARCH_STRNCHR
@@ -74,20 +74,30 @@ void *__memset16(uint16_t *s, uint16_t v, size_t count);
void *__memset32(uint32_t *s, uint32_t v, size_t count);
void *__memset64(uint64_t *s, uint64_t v, size_t count);

+#ifdef __HAVE_ARCH_MEMSET16
static inline void *memset16(uint16_t *s, uint16_t v, size_t count)
{
return __memset16(s, v, count * sizeof(v));
}
+#endif

+#ifdef __HAVE_ARCH_MEMSET32
static inline void *memset32(uint32_t *s, uint32_t v, size_t count)
{
return __memset32(s, v, count * sizeof(v));
}
+#endif

+#ifdef __HAVE_ARCH_MEMSET64
+#ifdef IN_BOOT_STRING_C
+void *memset64(uint64_t *s, uint64_t v, size_t count);
+#else
static inline void *memset64(uint64_t *s, uint64_t v, size_t count)
{
return __memset64(s, v, count * sizeof(v));
}
+#endif
+#endif

#if !defined(IN_ARCH_STRING_C) && (!defined(CONFIG_FORTIFY_SOURCE) || defined(__NO_FORTIFY))

--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:40:02 AM
Lockdep generates the following false positives with KMSAN on s390x:

[ 6.063666] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
[ ...]
[ 6.577050] Call Trace:
[ 6.619637] [<000000000690d2de>] check_flags+0x1fe/0x210
[ 6.665411] ([<000000000690d2da>] check_flags+0x1fa/0x210)
[ 6.707478] [<00000000006cec1a>] lock_acquire+0x2ca/0xce0
[ 6.749959] [<00000000069820ea>] _raw_spin_lock_irqsave+0xea/0x190
[ 6.794912] [<00000000041fc988>] __stack_depot_save+0x218/0x5b0
[ 6.838420] [<000000000197affe>] __msan_poison_alloca+0xfe/0x1a0
[ 6.882985] [<0000000007c5827c>] start_kernel+0x70c/0xd50
[ 6.927454] [<0000000000100036>] startup_continue+0x36/0x40

Between trace_hardirqs_on() and `stosm __mask, 3`, lockdep thinks that
interrupts are on, but on the CPU they are still off. KMSAN
instrumentation takes spinlocks, giving lockdep a chance to see and
complain about this discrepancy.

KMSAN instrumentation is inserted in order to poison the __mask
variable. Disable instrumentation in the respective functions. They are
very small and it's easy to see that no important metadata updates are
lost because of this.
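
A simplified sketch of the sequence (the instrumentation call shown is
illustrative):

	/*
	 * local_irq_enable()
	 *   trace_hardirqs_on()             lockdep: IRQs are "on" now
	 *   arch_local_irq_enable()
	 *     __msan_poison_alloca(__mask)   takes a spinlock; lockdep's
	 *                                    check_flags() sees IRQs still off
	 *     stosm __mask,3                 IRQs actually enabled here
	 */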

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/include/asm/irqflags.h | 17 ++++++++++++++---
drivers/s390/char/sclp.c | 2 +-
2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/s390/include/asm/irqflags.h b/arch/s390/include/asm/irqflags.h
index 02427b205c11..bcab456dfb80 100644
--- a/arch/s390/include/asm/irqflags.h
+++ b/arch/s390/include/asm/irqflags.h
@@ -37,12 +37,18 @@ static __always_inline void __arch_local_irq_ssm(unsigned long flags)
asm volatile("ssm %0" : : "Q" (flags) : "memory");
}

-static __always_inline unsigned long arch_local_save_flags(void)
+#ifdef CONFIG_KMSAN
+#define arch_local_irq_attributes noinline notrace __no_sanitize_memory __maybe_unused
+#else
+#define arch_local_irq_attributes __always_inline
+#endif
+
+static arch_local_irq_attributes unsigned long arch_local_save_flags(void)
{
return __arch_local_irq_stnsm(0xff);
}

-static __always_inline unsigned long arch_local_irq_save(void)
+static arch_local_irq_attributes unsigned long arch_local_irq_save(void)
{
return __arch_local_irq_stnsm(0xfc);
}
@@ -52,7 +58,12 @@ static __always_inline void arch_local_irq_disable(void)
arch_local_irq_save();
}

-static __always_inline void arch_local_irq_enable(void)
+static arch_local_irq_attributes void arch_local_irq_enable_external(void)
+{
+ __arch_local_irq_stosm(0x01);
+}
+
+static arch_local_irq_attributes void arch_local_irq_enable(void)
{
__arch_local_irq_stosm(0x03);
}
diff --git a/drivers/s390/char/sclp.c b/drivers/s390/char/sclp.c
index d53ee34d398f..fb1d9949adca 100644
--- a/drivers/s390/char/sclp.c
+++ b/drivers/s390/char/sclp.c
@@ -736,7 +736,7 @@ sclp_sync_wait(void)
cr0_sync.val = cr0.val & ~CR0_IRQ_SUBCLASS_MASK;
cr0_sync.val |= 1UL << (63 - 54);
local_ctl_load(0, &cr0_sync);
- __arch_local_irq_stosm(0x01);
+ arch_local_irq_enable_external();
/* Loop until driver state indicates finished request */
while (sclp_running_state != sclp_running_state_idle) {
/* Check for expired request timer */
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:40:02 AM
s390 uses assembly code to initialize ftrace_regs and call
kprobe_ftrace_handler(). Therefore, from KMSAN's point of view,
ftrace_regs is poisoned on kprobe_ftrace_handler() entry. This causes
KMSAN warnings when running the ftrace testsuite.

Fix by trusting the assembly code and always unpoisoning ftrace_regs in
kprobe_ftrace_handler().

Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/kernel/ftrace.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index ddf2ee47cb87..0bd6adc40a34 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -12,6 +12,7 @@
#include <linux/ftrace.h>
#include <linux/kernel.h>
#include <linux/types.h>
+#include <linux/kmsan-checks.h>
#include <linux/kprobes.h>
#include <linux/execmem.h>
#include <trace/syscall.h>
@@ -303,6 +304,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
if (bit < 0)
return;

+ kmsan_unpoison_memory(fregs, sizeof(*fregs));
regs = ftrace_get_regs(fregs);
p = get_kprobe((kprobe_opcode_t *)ip);
if (!regs || unlikely(!p) || kprobe_disabled(p))
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:40:03 AM
Now that everything else is in place, enable KMSAN in Kconfig.

Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index c59d2b54df49..3cba4993d7c7 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -158,6 +158,7 @@ config S390
select HAVE_ARCH_KASAN
select HAVE_ARCH_KASAN_VMALLOC
select HAVE_ARCH_KCSAN
+ select HAVE_ARCH_KMSAN
select HAVE_ARCH_KFENCE
select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
select HAVE_ARCH_SECCOMP_FILTER
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:40:03 AM
put_user() uses inline assembly with precise constraints, so Clang is
in principle capable of instrumenting it automatically. Unfortunately,
one of the constraints contains a dereferenced user pointer, and Clang
does not currently distinguish user and kernel pointers. Therefore
KMSAN attempts to access shadow for user pointers, which is not the right
thing to do.

An obvious fix to add __no_sanitize_memory to __put_user_fn() does not
work, since it's __always_inline. And __always_inline cannot be removed
due to the __put_user_bad() trick.

A different obvious fix of using the "a" instead of the "+Q" constraint
degrades the code quality, which is very important here, since it's a
hot path.

Instead, repurpose the __put_user_asm() macro to define
__put_user_{char,short,int,long}_noinstr() functions and mark them with
__no_sanitize_memory. For the non-KMSAN builds make them
__always_inline in order to keep the generated code quality. Also
define __put_user_{char,short,int,long}() functions, which call the
aforementioned ones and which *are* instrumented, because they call
KMSAN hooks, which may be implemented as macros.

The same applies to get_user() as well.
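
The resulting call chain for a 4-byte access looks like this (sketch):

	/*
	 * put_user(val, uptr)
	 *   __put_user_fn(&val, uptr, 4)      instrumented, __always_inline
	 *     __put_user_int(uptr, &val, 4)   instrumented, calls instrument_put_user()
	 *       __put_user_int_noinstr(...)   __no_sanitize_memory, runs the asm
	 */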

Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/include/asm/uaccess.h | 111 +++++++++++++++++++++++---------
1 file changed, 79 insertions(+), 32 deletions(-)

diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index 81ae8a98e7ec..c3c26dd1fc04 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -78,13 +78,24 @@ union oac {

int __noreturn __put_user_bad(void);

-#define __put_user_asm(to, from, size) \
-({ \
+#ifdef CONFIG_KMSAN
+#define GET_PUT_USER_NOINSTR_ATTRIBUTES \
+ noinline __maybe_unused __no_sanitize_memory
+#else
+#define GET_PUT_USER_NOINSTR_ATTRIBUTES __always_inline
+#endif
+
+#define DEFINE_PUT_USER(type) \
+static GET_PUT_USER_NOINSTR_ATTRIBUTES int \
+__put_user_##type##_noinstr(unsigned type __user *to, \
+ unsigned type *from, \
+ unsigned long size) \
+{ \
union oac __oac_spec = { \
.oac1.as = PSW_BITS_AS_SECONDARY, \
.oac1.a = 1, \
}; \
- int __rc; \
+ int rc; \
\
asm volatile( \
" lr 0,%[spec]\n" \
@@ -93,12 +104,28 @@ int __noreturn __put_user_bad(void);
"2:\n" \
EX_TABLE_UA_STORE(0b, 2b, %[rc]) \
EX_TABLE_UA_STORE(1b, 2b, %[rc]) \
- : [rc] "=&d" (__rc), [_to] "+Q" (*(to)) \
+ : [rc] "=&d" (rc), [_to] "+Q" (*(to)) \
: [_size] "d" (size), [_from] "Q" (*(from)), \
[spec] "d" (__oac_spec.val) \
: "cc", "0"); \
- __rc; \
-})
+ return rc; \
+} \
+ \
+static __always_inline int \
+__put_user_##type(unsigned type __user *to, unsigned type *from, \
+ unsigned long size) \
+{ \
+ int rc; \
+ \
+ rc = __put_user_##type##_noinstr(to, from, size); \
+ instrument_put_user(*from, to, size); \
+ return rc; \
+}
+
+DEFINE_PUT_USER(char);
+DEFINE_PUT_USER(short);
+DEFINE_PUT_USER(int);
+DEFINE_PUT_USER(long);

static __always_inline int __put_user_fn(void *x, void __user *ptr, unsigned long size)
{
@@ -106,24 +133,24 @@ static __always_inline int __put_user_fn(void *x, void __user *ptr, unsigned lon

switch (size) {
case 1:
- rc = __put_user_asm((unsigned char __user *)ptr,
- (unsigned char *)x,
- size);
+ rc = __put_user_char((unsigned char __user *)ptr,
+ (unsigned char *)x,
+ size);
break;
case 2:
- rc = __put_user_asm((unsigned short __user *)ptr,
- (unsigned short *)x,
- size);
+ rc = __put_user_short((unsigned short __user *)ptr,
+ (unsigned short *)x,
+ size);
break;
case 4:
- rc = __put_user_asm((unsigned int __user *)ptr,
+ rc = __put_user_int((unsigned int __user *)ptr,
(unsigned int *)x,
size);
break;
case 8:
- rc = __put_user_asm((unsigned long __user *)ptr,
- (unsigned long *)x,
- size);
+ rc = __put_user_long((unsigned long __user *)ptr,
+ (unsigned long *)x,
+ size);
break;
default:
__put_user_bad();
@@ -134,13 +161,17 @@ static __always_inline int __put_user_fn(void *x, void __user *ptr, unsigned lon

int __noreturn __get_user_bad(void);

-#define __get_user_asm(to, from, size) \
-({ \
+#define DEFINE_GET_USER(type) \
+static GET_PUT_USER_NOINSTR_ATTRIBUTES int \
+__get_user_##type##_noinstr(unsigned type *to, \
+ unsigned type __user *from, \
+ unsigned long size) \
+{ \
union oac __oac_spec = { \
.oac2.as = PSW_BITS_AS_SECONDARY, \
.oac2.a = 1, \
}; \
- int __rc; \
+ int rc; \
\
asm volatile( \
" lr 0,%[spec]\n" \
@@ -149,13 +180,29 @@ int __noreturn __get_user_bad(void);
"2:\n" \
EX_TABLE_UA_LOAD_MEM(0b, 2b, %[rc], %[_to], %[_ksize]) \
EX_TABLE_UA_LOAD_MEM(1b, 2b, %[rc], %[_to], %[_ksize]) \
- : [rc] "=&d" (__rc), "=Q" (*(to)) \
+ : [rc] "=&d" (rc), "=Q" (*(to)) \
: [_size] "d" (size), [_from] "Q" (*(from)), \
[spec] "d" (__oac_spec.val), [_to] "a" (to), \
[_ksize] "K" (size) \
: "cc", "0"); \
- __rc; \
-})
+ return rc; \
+} \
+ \
+static __always_inline int \
+__get_user_##type(unsigned type *to, unsigned type __user *from, \
+ unsigned long size) \
+{ \
+ int rc; \
+ \
+ rc = __get_user_##type##_noinstr(to, from, size); \
+ instrument_get_user(*to); \
+ return rc; \
+}
+
+DEFINE_GET_USER(char);
+DEFINE_GET_USER(short);
+DEFINE_GET_USER(int);
+DEFINE_GET_USER(long);

static __always_inline int __get_user_fn(void *x, const void __user *ptr, unsigned long size)
{
@@ -163,24 +210,24 @@ static __always_inline int __get_user_fn(void *x, const void __user *ptr, unsign

switch (size) {
case 1:
- rc = __get_user_asm((unsigned char *)x,
- (unsigned char __user *)ptr,
- size);
+ rc = __get_user_char((unsigned char *)x,
+ (unsigned char __user *)ptr,
+ size);
break;
case 2:
- rc = __get_user_asm((unsigned short *)x,
- (unsigned short __user *)ptr,
- size);
+ rc = __get_user_short((unsigned short *)x,
+ (unsigned short __user *)ptr,
+ size);
break;
case 4:
- rc = __get_user_asm((unsigned int *)x,
+ rc = __get_user_int((unsigned int *)x,
(unsigned int __user *)ptr,
size);
break;
case 8:
- rc = __get_user_asm((unsigned long *)x,
- (unsigned long __user *)ptr,
- size);
+ rc = __get_user_long((unsigned long *)x,
+ (unsigned long __user *)ptr,
+ size);
break;
default:
__get_user_bad();
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:40:03 AM
The pages for the KMSAN metadata associated with most kernel mappings
are taken from memblock by the common code. However, vmalloc and module
metadata needs to be defined by the architectures.

Be a little bit more careful than x86: allocate exactly MODULES_LEN
for the module shadow and origins, and then take 2/3 of vmalloc for
the vmalloc shadow and origins. This ensures that users passing small
vmalloc= values on the command line do not cause module metadata
collisions.
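
The resulting layout, in ascending address order (a sketch derived from
the two hunks below; KMSAN_VMALLOC_SHADOW_START equals VMALLOC_END):

	/*
	 * VMALLOC_START               vmalloc area     vmalloc_size
	 * VMALLOC_END                 vmalloc shadow   vmalloc_size
	 * KMSAN_VMALLOC_ORIGIN_START  vmalloc origins  vmalloc_size
	 * KMSAN_MODULES_SHADOW_START  module shadow    MODULES_LEN
	 * KMSAN_MODULES_ORIGIN_START  module origins   MODULES_LEN
	 * MODULES_VADDR               modules          MODULES_LEN
	 * MODULES_END
	 */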

Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Alexander Gordeev <agor...@linux.ibm.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/boot/startup.c | 7 +++++++
arch/s390/include/asm/pgtable.h | 8 ++++++++
2 files changed, 15 insertions(+)

diff --git a/arch/s390/boot/startup.c b/arch/s390/boot/startup.c
index 182aac6a0f77..93775142322d 100644
--- a/arch/s390/boot/startup.c
+++ b/arch/s390/boot/startup.c
@@ -301,11 +301,18 @@ static unsigned long setup_kernel_memory_layout(unsigned long kernel_size)
MODULES_END = round_down(kernel_start, _SEGMENT_SIZE);
MODULES_VADDR = MODULES_END - MODULES_LEN;
VMALLOC_END = MODULES_VADDR;
+ if (IS_ENABLED(CONFIG_KMSAN))
+ VMALLOC_END -= MODULES_LEN * 2;

/* allow vmalloc area to occupy up to about 1/2 of the rest virtual space left */
vsize = (VMALLOC_END - FIXMAP_SIZE) / 2;
vsize = round_down(vsize, _SEGMENT_SIZE);
vmalloc_size = min(vmalloc_size, vsize);
+ if (IS_ENABLED(CONFIG_KMSAN)) {
+ /* take 2/3 of vmalloc area for KMSAN shadow and origins */
+ vmalloc_size = round_down(vmalloc_size / 3, _SEGMENT_SIZE);
+ VMALLOC_END -= vmalloc_size * 2;
+ }
VMALLOC_START = VMALLOC_END - vmalloc_size;

__memcpy_real_area = round_down(VMALLOC_START - MEMCPY_REAL_SIZE, PAGE_SIZE);
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 70b6ee557eb2..2f44c23efec0 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -107,6 +107,14 @@ static inline int is_module_addr(void *addr)
return 1;
}

+#ifdef CONFIG_KMSAN
+#define KMSAN_VMALLOC_SIZE (VMALLOC_END - VMALLOC_START)
+#define KMSAN_VMALLOC_SHADOW_START VMALLOC_END
+#define KMSAN_VMALLOC_ORIGIN_START (KMSAN_VMALLOC_SHADOW_START + KMSAN_VMALLOC_SIZE)
+#define KMSAN_MODULES_SHADOW_START (KMSAN_VMALLOC_ORIGIN_START + KMSAN_VMALLOC_SIZE)
+#define KMSAN_MODULES_ORIGIN_START (KMSAN_MODULES_SHADOW_START + MODULES_LEN)
+#endif
+
#ifdef CONFIG_RANDOMIZE_BASE
#define KASLR_LEN (1UL << 31)
#else
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:40:05 AM
arch_kmsan_get_meta_or_null() finds the lowcore shadow by querying the
prefix and calling kmsan_get_metadata() again.

kmsan_virt_addr_valid() delegates to virt_addr_valid().

Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/include/asm/kmsan.h | 43 +++++++++++++++++++++++++++++++++++
1 file changed, 43 insertions(+)
create mode 100644 arch/s390/include/asm/kmsan.h

diff --git a/arch/s390/include/asm/kmsan.h b/arch/s390/include/asm/kmsan.h
new file mode 100644
index 000000000000..e572686d340c
--- /dev/null
+++ b/arch/s390/include/asm/kmsan.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_S390_KMSAN_H
+#define _ASM_S390_KMSAN_H
+
+#include <asm/lowcore.h>
+#include <asm/page.h>
+#include <linux/kmsan.h>
+#include <linux/mmzone.h>
+#include <linux/stddef.h>
+
+#ifndef MODULE
+
+static inline bool is_lowcore_addr(void *addr)
+{
+ return addr >= (void *)&S390_lowcore &&
+ addr < (void *)(&S390_lowcore + 1);
+}
+
+static inline void *arch_kmsan_get_meta_or_null(void *addr, bool is_origin)
+{
+ if (is_lowcore_addr(addr)) {
+ /*
+ * Different lowcores accessed via S390_lowcore are described
+ * by the same struct page. Resolve the prefix manually in
+ * order to get a distinct struct page.
+ */
+ addr += (void *)lowcore_ptr[raw_smp_processor_id()] -
+ (void *)&S390_lowcore;
+ if (WARN_ON_ONCE(is_lowcore_addr(addr)))
+ return NULL;
+ return kmsan_get_metadata(addr, is_origin);
+ }
+ return NULL;
+}
+
+static inline bool kmsan_virt_addr_valid(void *addr)
+{
+ return virt_addr_valid(addr);
+}
+
+#endif /* !MODULE */
+
+#endif /* _ASM_S390_KMSAN_H */
--
2.45.1

Ilya Leoshkevich

Jun 13, 2024, 11:40:07 AM
The unwind code can read uninitialized frames. Furthermore, even in
the good case, KMSAN does not emit shadow for backchains. Therefore
disable KMSAN checks for the unwinding functions.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/kernel/unwind_bc.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/s390/kernel/unwind_bc.c b/arch/s390/kernel/unwind_bc.c
index 0ece156fdd7c..cd44be2b6ce8 100644
--- a/arch/s390/kernel/unwind_bc.c
+++ b/arch/s390/kernel/unwind_bc.c
@@ -49,6 +49,8 @@ static inline bool is_final_pt_regs(struct unwind_state *state,
READ_ONCE_NOCHECK(regs->psw.mask) & PSW_MASK_PSTATE;
}

+/* Avoid KMSAN false positives from touching uninitialized frames. */
+__no_kmsan_checks
bool unwind_next_frame(struct unwind_state *state)
{
struct stack_info *info = &state->stack_info;
@@ -118,6 +120,8 @@ bool unwind_next_frame(struct unwind_state *state)
}
EXPORT_SYMBOL_GPL(unwind_next_frame);

+/* Avoid KMSAN false positives from touching uninitialized frames. */
+__no_kmsan_checks
void __unwind_start(struct unwind_state *state, struct task_struct *task,
struct pt_regs *regs, unsigned long first_frame)
{
--
2.45.1

Steven Rostedt

Jun 13, 2024, 12:21:16 PM
On Thu, 13 Jun 2024 17:34:03 +0200
Ilya Leoshkevich <i...@linux.ibm.com> wrote:

> Architectures use assembly code to initialize ftrace_regs and call
> ftrace_ops_list_func(). Therefore, from KMSAN's point of view,
> ftrace_regs is poisoned on ftrace_ops_list_func() entry. This causes
> KMSAN warnings when running the ftrace testsuite.
>
> Fix by trusting the architecture-specific assembly code and always
> unpoisoning ftrace_regs in ftrace_ops_list_func.
>
> The issue has gone unnoticed on x86_64 so far only by accident:
> assembly-allocated ftrace_regs was overlapping a stale partially
> unpoisoned stack frame. Poisoning stack frames before returns [1]
> makes the issue appear on x86_64 as well.
>
> [1] https://github.com/iii-i/llvm-project/commits/msan-poison-allocas-before-returning-2024-06-12/
>
> Reviewed-by: Alexander Potapenko <gli...@google.com>
> Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
> ---

Acked-by: Steven Rostedt (Google) <ros...@goodmis.org>

-- Steve

> kernel/trace/ftrace.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 65208d3b5ed9..c35ad4362d71 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -7407,6 +7407,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
> void arch_ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
> struct ftrace_ops *op, struct ftrace_regs *fregs)
> {
> + kmsan_unpoison_memory(fregs, sizeof(*fregs));
> __ftrace_ops_list_func(ip, parent_ip, NULL, fregs);
> }
> #else

SeongJae Park

Jun 13, 2024, 7:30:52 PM
Hi Ilya,

On Thu, 13 Jun 2024 17:34:14 +0200 Ilya Leoshkevich <i...@linux.ibm.com> wrote:

> Avoid false KMSAN negatives with SLUB_DEBUG by allowing
> kmsan_slab_free() to poison the freed memory, and by preventing
> init_object() from unpoisoning new allocations by using __memset().
>
> There are two alternatives to this approach. First, init_object()
> can be marked with __no_sanitize_memory. This annotation should be used
> with great care, because it drops all instrumentation from the
> function, and any shadow writes will be lost. Even though this is not a
> concern with the current init_object() implementation, this may change
> in the future.
>
> Second, kmsan_poison_memory() calls may be added after memset() calls.
> The downside is that init_object() is called from
> free_debug_processing(), in which case poisoning will erase the
> distinction between simply uninitialized memory and UAF.
>
> Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
> ---
> mm/kmsan/hooks.c | 2 +-
> mm/slub.c | 13 +++++++++----
> 2 files changed, 10 insertions(+), 5 deletions(-)
>
[...]
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1139,7 +1139,12 @@ static void init_object(struct kmem_cache *s, void *object, u8 val)
> unsigned int poison_size = s->object_size;
>
> if (s->flags & SLAB_RED_ZONE) {
> - memset(p - s->red_left_pad, val, s->red_left_pad);
> + /*
> + * Use __memset() here and below in order to avoid overwriting
> + * the KMSAN shadow. Keeping the shadow makes it possible to
> + * distinguish uninit-value from use-after-free.
> + */
> + __memset(p - s->red_left_pad, val, s->red_left_pad);

I found my build test[1] fails with below error on latest mm-unstable branch.
'git bisect' points me this patch.

CC mm/slub.o
/mm/slub.c: In function 'init_object':
/mm/slub.c:1147:17: error: implicit declaration of function '__memset'; did you mean 'memset'? [-Werror=implicit-function-declaration]
1147 | __memset(p - s->red_left_pad, val, s->red_left_pad);
| ^~~~~~~~
| memset
cc1: some warnings being treated as errors

I haven't looked into it deeply, but reporting it first. Do you have any idea?

[1] https://github.com/awslabs/damon-tests/blob/next/corr/tests/build_m68k.sh


Thanks,
SJ

[...]

Ilya Leoshkevich

Jun 13, 2024, 7:44:56 PM
Thanks for the report.

Apparently not all architectures have __memset(). We should probably go
back to memset_no_sanitize_memory() [1], but this time mark it with
noinline __maybe_unused __no_sanitize_memory, like it's done in, e.g.,
32/35.

Alexander, what do you think?

[1]
https://lore.kernel.org/lkml/20231121220155...@linux.ibm.com/

Alexander Potapenko

Jun 18, 2024, 5:25:33 AM
On Thu, Jun 13, 2024 at 5:39 PM Ilya Leoshkevich <i...@linux.ibm.com> wrote:
>
> put_user() uses inline assembly with precise constraints, so Clang is
> in principle capable of instrumenting it automatically. Unfortunately,
> one of the constraints contains a dereferenced user pointer, and Clang
> does not currently distinguish user and kernel pointers. Therefore
> KMSAN attempts to access shadow for user pointers, which is not the right
> thing to do.
>
> An obvious fix to add __no_sanitize_memory to __put_user_fn() does not
> work, since it's __always_inline. And __always_inline cannot be removed
> due to the __put_user_bad() trick.
>
> A different obvious fix of using the "a" instead of the "+Q" constraint
> degrades the code quality, which is very important here, since it's a
> hot path.
>
> Instead, repurpose the __put_user_asm() macro to define
> __put_user_{char,short,int,long}_noinstr() functions and mark them with
> __no_sanitize_memory. For the non-KMSAN builds make them
> __always_inline in order to keep the generated code quality. Also
> define __put_user_{char,short,int,long}() functions, which call the
> aforementioned ones and which *are* instrumented, because they call
> KMSAN hooks, which may be implemented as macros.

I am not really familiar with s390 assembly, but I think you still
need to call kmsan_copy_to_user() and kmsan_copy_from_user() to
properly initialize the copied data and report infoleaks.
Would it be possible to insert calls to linux/instrumented.h hooks
into uaccess functions?

Alexander Potapenko

Jun 18, 2024, 5:27:07 AM
On Thu, Jun 13, 2024 at 5:40 PM Ilya Leoshkevich <i...@linux.ibm.com> wrote:
>
> Diagnose 224 stores 4k bytes, which currently cannot be deduced from
> the inline assembly constraints. This leads to KMSAN false positives.
>
> Fix the constraints by using a 4k-sized struct instead of a raw
> pointer. While at it, prettify them too.
>
> Suggested-by: Heiko Carstens <h...@linux.ibm.com>
> Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>

Ilya Leoshkevich

Jun 18, 2024, 5:40:17 AM
Aren't the existing instrument_get_user() / instrument_put_user() calls
sufficient?

Alexander Potapenko

Jun 18, 2024, 5:52:42 AM
Oh, sorry, I overlooked them. Yes, those should be sufficient.
But you don't include linux/instrumented.h, do you?

Ilya Leoshkevich

Jun 18, 2024, 5:56:22 AM
No, apparently we get this include from somewhere else by accident.
I will add it in a separate patch.

Alexander Potapenko

Jun 18, 2024, 8:23:27 AM
On Thu, Jun 13, 2024 at 5:39 PM Ilya Leoshkevich <i...@linux.ibm.com> wrote:
>
> Like for KASAN, it's useful to temporarily disable KMSAN checks around,
> e.g., redzone accesses. Introduce kmsan_disable_current() and
> kmsan_enable_current(), which are similar to their KASAN counterparts.
>
> Make them reentrant in order to handle memory allocations in interrupt
> context. Repurpose the allow_reporting field for this.

I am still a bit reluctant, because these nested counters always end
up being inconsistent.
But your patch series fixes support for SLUB_DEBUG, and I don't have
better ideas how to do this.

Could you please extend "Disabling the instrumentation" in kmsan.rst
so that it explains the new enable/disable API?
I think we should mention that the users need to be careful with it,
keeping the regions short and preferring other ways to disable
instrumentation, where possible.

Alexander Potapenko

Jun 18, 2024, 10:22:42 AM
On Thu, Jun 13, 2024 at 5:39 PM Ilya Leoshkevich <i...@linux.ibm.com> wrote:
>
> Building the kernel with CONFIG_SLUB_DEBUG and CONFIG_KMSAN causes
> KMSAN to complain about touching redzones in kfree().
>
> Fix by extending the existing KASAN-related metadata_access_enable()
> and metadata_access_disable() functions to KMSAN.
>
> Acked-by: Vlastimil Babka <vba...@suse.cz>

Alexander Potapenko

Jun 18, 2024, 10:38:34 AM
On Thu, Jun 13, 2024 at 5:39 PM Ilya Leoshkevich <i...@linux.ibm.com> wrote:
>
> x86's alloc_node_data() rounds up node data size to PAGE_SIZE. It's not
> explained why it's needed, but it's most likely for performance
> reasons, since the padding bytes are not used anywhere. Some other
> architectures do it as well, e.g., mips rounds it up to the cache line
> size.
>
> kmsan_init_shadow() initializes metadata for each node data and assumes
> the x86 rounding, which does not match other architectures. This may
> cause the range end to overshoot the end of available memory, in turn
> causing virt_to_page_or_null() in kmsan_init_alloc_meta_for_range() to
> return NULL, which leads to kernel panic shortly after.
>
> Since the padding bytes are not used, drop the rounding.

Nice catch, thanks!

Alexander Potapenko

Jun 18, 2024, 10:39:10 AM
On Thu, Jun 13, 2024 at 5:39 PM Ilya Leoshkevich <i...@linux.ibm.com> wrote:
>
> Even though the KMSAN warnings generated by memchr_inv() are suppressed
> by metadata_access_enable(), its return value may still be poisoned.
>
> The reason is that the last iteration of memchr_inv() returns
> `*start != value ? start : NULL`, where *start is poisoned. Because of
> this, somewhat counterintuitively, the shadow value computed by
> visitSelectInst() is equal to `(uintptr_t)start`.
>
> The intention behind guarding memchr_inv() behind
> metadata_access_enable() is to touch poisoned metadata without
> triggering KMSAN, so unpoison its return value.
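
Annotated sketch of the code being discussed:

	/* Last iteration of memchr_inv(): */
	return *start != value ? start : NULL;
	/*
	 * The condition reads the poisoned *start; propagating shadow
	 * through the select with operands `start` and NULL yields
	 * (uintptr_t)start, i.e. the returned pointer itself is poisoned.
	 */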

What do you think about applying __no_kmsan_checks to these functions instead?

Alexander Potapenko

Jun 18, 2024, 11:05:48 AM
We could probably go without __no_sanitize_memory assuming that
platforms supporting KMSAN always have __memset():

#if defined(CONFIG_KMSAN)
static inline void *memset_no_sanitize_memory(void *s, int c, size_t n)
{
return __memset(s, c, n);
}
#else
static inline void *memset_no_sanitize_memory(void *s, int c, size_t n)
{
return memset(s, c, n);
}
#endif
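
The SLUB call site from the report would then read (sketch):

	memset_no_sanitize_memory(p - s->red_left_pad, val, s->red_left_pad);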

Alexander Potapenko

Jun 18, 2024, 11:36:18 AM
On Thu, Jun 13, 2024 at 5:39 PM Ilya Leoshkevich <i...@linux.ibm.com> wrote:
>
> Each s390 CPU has lowcore pages associated with it. Each CPU sees its
> own lowcore at virtual address 0 through a hardware mechanism called
> prefixing. Additionally, all lowcores are mapped to non-0 virtual
> addresses stored in the lowcore_ptr[] array.
>
> When lowcore is accessed through virtual address 0, one needs to
> resolve metadata for lowcore_ptr[raw_smp_processor_id()].
>
> Expose kmsan_get_metadata() to make it possible to do this from the
> arch code.

Alexander Potapenko

Jun 18, 2024, 11:36:43 AM
On Thu, Jun 13, 2024 at 5:40 PM Ilya Leoshkevich <i...@linux.ibm.com> wrote:
>
> Now that everything else is in place, enable KMSAN in Kconfig.
>
> Acked-by: Heiko Carstens <h...@linux.ibm.com>
> Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>

Ilya Leoshkevich

Jun 19, 2024, 7:47:05 AM
Ok, will do. The __no_kmsan_checks approach is already taken by
"mm: kfence: Disable KMSAN when checking the canary", so we might as
well be consistent in how we fix these issues.

Ilya Leoshkevich

Jun 19, 2024, 11:45:46 AM
v4: https://lore.kernel.org/lkml/2024061315392...@linux.ibm.com/
v4 -> v5: Fix the __memset() build issue.
Change the attribute #defines to lowercase in order to match
the existing code style.
Fix the kmsan_virt_addr_valid() implementation to avoid
recursion in debug builds, like it's done on x86_64 - dropped
R-bs, please take another look.
Add kmsan_disable_current()/kmsan_enable_current() doc;
Fix the poisoned memchr_inv() value in a different way;
Add the missing linux/instrumented.h #include;
(Alexander P.).
Patches that need review:
- [PATCH 12/37] kmsan: Introduce memset_no_sanitize_memory()
- [PATCH 13/37] kmsan: Support SLAB_POISON
- [PATCH 17/37] mm: slub: Disable KMSAN when checking the padding bytes
- [PATCH 36/37] s390/kmsan: Implement the architecture-specific functions

v3: https://lore.kernel.org/lkml/2023121323360...@linux.ibm.com/
v3 -> v4: Rebase.
Elaborate why ftrace_ops_list_func() change is needed on
x86_64 (Steven).
Add a comment to the DFLTCC patch (Alexander P.).
Simplify diag224();
Improve __arch_local_irq_attributes style;
Use IS_ENABLED(CONFIG_KMSAN) for vmalloc area (Heiko).
Align vmalloc area on _SEGMENT_SIZE (Alexander G.).

v2: https://lore.kernel.org/lkml/20231121220155...@linux.ibm.com/
v2 -> v3: Drop kmsan_memmove_metadata() and strlcpy() patches;
Remove kmsan_get_metadata() stub;
Move kmsan_enable_current() and kmsan_disable_current() to
include/linux/kmsan.h, explain why a counter is needed;
Drop the memset_no_sanitize_memory() patch;
Use __memset() in the SLAB_POISON patch;
Add kmsan-checks.h to the DFLTCC patch;
Add recursion check to the arch_kmsan_get_meta_or_null()
patch (Alexander P.).

Fix inline + __no_kmsan_checks issues.
New patch for s390/irqflags, that resolves a lockdep warning.
New patch for s390/diag, that resolves a false positive when
running on an LPAR.
New patch for STCCTM, same as above.
New patch for check_bytes_and_report() that resolves a false
positive that occurs even on Intel.

v1: https://lore.kernel.org/lkml/20231115203401...@linux.ibm.com/
v1 -> v2: Add comments, sort #includes, introduce
memset_no_sanitize_memory() and use it to avoid unpoisoning
of redzones, change vmalloc alignment to _REGION3_SIZE, add
R-bs (Alexander P.).

Fix building
[PATCH 28/33] s390/string: Add KMSAN support
with FORTIFY_SOURCE.
Reported-by: kernel test robot <l...@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311170550...@intel.com/

Hi,

This series provides the minimal support for Kernel Memory Sanitizer on
s390. Kernel Memory Sanitizer is clang-only instrumentation for finding
accesses to uninitialized memory. The clang support for s390 has already
been merged [1].

With this series, I can successfully boot s390 defconfig and
debug_defconfig with kmsan.panic=1. The tool found one real
s390-specific bug (fixed in master).

Best regards,
Ilya

[1] https://reviews.llvm.org/D148596

Ilya Leoshkevich (37):
ftrace: Unpoison ftrace_regs in ftrace_ops_list_func()
kmsan: Make the tests compatible with kmsan.panic=1
kmsan: Disable KMSAN when DEFERRED_STRUCT_PAGE_INIT is enabled
kmsan: Increase the maximum store size to 4096
kmsan: Fix is_bad_asm_addr() on arches with overlapping address spaces
kmsan: Fix kmsan_copy_to_user() on arches with overlapping address
spaces
kmsan: Remove a useless assignment from
kmsan_vmap_pages_range_noflush()
kmsan: Remove an x86-specific #include from kmsan.h
kmsan: Expose kmsan_get_metadata()
kmsan: Export panic_on_kmsan
kmsan: Allow disabling KMSAN checks for the current task
kmsan: Introduce memset_no_sanitize_memory()
kmsan: Support SLAB_POISON
kmsan: Use ALIGN_DOWN() in kmsan_get_metadata()
kmsan: Do not round up pg_data_t size
mm: slub: Let KMSAN access metadata
mm: slub: Disable KMSAN when checking the padding bytes
mm: kfence: Disable KMSAN when checking the canary
lib/zlib: Unpoison DFLTCC output buffers
kmsan: Accept ranges starting with 0 on s390
s390/boot: Turn off KMSAN
s390: Use a larger stack for KMSAN
s390/boot: Add the KMSAN runtime stub
s390/checksum: Add a KMSAN check
s390/cpacf: Unpoison the results of cpacf_trng()
s390/cpumf: Unpoison STCCTM output buffer
s390/diag: Unpoison diag224() output buffer
s390/ftrace: Unpoison ftrace_regs in kprobe_ftrace_handler()
s390/irqflags: Do not instrument arch_local_irq_*() with KMSAN
s390/mm: Define KMSAN metadata for vmalloc and modules
s390/string: Add KMSAN support
s390/traps: Unpoison the kernel_stack_overflow()'s pt_regs
s390/uaccess: Add KMSAN support to put_user() and get_user()
s390/uaccess: Add the missing linux/instrumented.h #include
s390/unwind: Disable KMSAN checks
s390/kmsan: Implement the architecture-specific functions
kmsan: Enable on s390

Documentation/dev-tools/kmsan.rst | 11 ++-
arch/s390/Kconfig | 1 +
arch/s390/Makefile | 2 +-
arch/s390/boot/Makefile | 3 +
arch/s390/boot/kmsan.c | 6 ++
arch/s390/boot/startup.c | 7 ++
arch/s390/boot/string.c | 16 ++++
arch/s390/include/asm/checksum.h | 2 +
arch/s390/include/asm/cpacf.h | 3 +
arch/s390/include/asm/cpu_mf.h | 6 ++
arch/s390/include/asm/irqflags.h | 17 ++++-
arch/s390/include/asm/kmsan.h | 59 +++++++++++++++
arch/s390/include/asm/pgtable.h | 8 ++
arch/s390/include/asm/string.h | 20 +++--
arch/s390/include/asm/thread_info.h | 2 +-
arch/s390/include/asm/uaccess.h | 112 ++++++++++++++++++++--------
arch/s390/kernel/diag.c | 10 ++-
arch/s390/kernel/ftrace.c | 2 +
arch/s390/kernel/traps.c | 6 ++
arch/s390/kernel/unwind_bc.c | 4 +
drivers/s390/char/sclp.c | 2 +-
include/linux/kmsan.h | 46 ++++++++++++
include/linux/kmsan_types.h | 2 +-
kernel/trace/ftrace.c | 1 +
lib/zlib_dfltcc/dfltcc.h | 1 +
lib/zlib_dfltcc/dfltcc_util.h | 28 +++++++
mm/Kconfig | 1 +
mm/kfence/core.c | 11 ++-
mm/kmsan/core.c | 1 -
mm/kmsan/hooks.c | 23 ++++--
mm/kmsan/init.c | 7 +-
mm/kmsan/instrumentation.c | 11 +--
mm/kmsan/kmsan.h | 9 +--
mm/kmsan/kmsan_test.c | 5 ++
mm/kmsan/report.c | 8 +-
mm/kmsan/shadow.c | 9 +--
mm/slub.c | 33 ++++++--
tools/objtool/check.c | 2 +
38 files changed, 410 insertions(+), 87 deletions(-)
create mode 100644 arch/s390/boot/kmsan.c
create mode 100644 arch/s390/include/asm/kmsan.h

--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:47 AM
KMSAN relies on memblock returning all available pages to it
(see kmsan_memblock_free_pages()). It partitions these pages into 3
categories: pages available to the buddy allocator, shadow pages and
origin pages. This partitioning is static.

If new pages appear after kmsan_init_runtime(), it is considered
an error. DEFERRED_STRUCT_PAGE_INIT causes this, so mark it as
incompatible with KMSAN.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index b4cb45255a54..9791fce5d0a7 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -946,6 +946,7 @@ config DEFERRED_STRUCT_PAGE_INIT
depends on SPARSEMEM
depends on !NEED_PER_CPU_KM
depends on 64BIT
+ depends on !KMSAN
select PADATA
help
Ordinarily all struct pages are initialised during early boot in a
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:48 AM
It's useful to have both tests and kmsan.panic=1 during development,
but right now the warnings that the tests cause lead to kernel
panics.

Temporarily set kmsan.panic=0 for the duration of the KMSAN testing.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/kmsan_test.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
index 07d3a3a5a9c5..9bfd11674fe3 100644
--- a/mm/kmsan/kmsan_test.c
+++ b/mm/kmsan/kmsan_test.c
@@ -659,9 +659,13 @@ static void test_exit(struct kunit *test)
{
}

+static int orig_panic_on_kmsan;
+
static int kmsan_suite_init(struct kunit_suite *suite)
{
register_trace_console(probe_console, NULL);
+ orig_panic_on_kmsan = panic_on_kmsan;
+ panic_on_kmsan = 0;
return 0;
}

@@ -669,6 +673,7 @@ static void kmsan_suite_exit(struct kunit_suite *suite)
{
unregister_trace_console(probe_console, NULL);
tracepoint_synchronize_unregister();
+ panic_on_kmsan = orig_panic_on_kmsan;
}

static struct kunit_suite kmsan_test_suite = {
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:49 AM
Like for KASAN, it's useful to temporarily disable KMSAN checks around,
e.g., redzone accesses. Introduce kmsan_disable_current() and
kmsan_enable_current(), which are similar to their KASAN counterparts.

Make them reentrant in order to handle memory allocations in interrupt
context. Repurpose the allow_reporting field for this.
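
A typical usage pattern (sketch; check_redzone() is a hypothetical
stand-in for code that legitimately reads poisoned bytes):

	kmsan_disable_current();
	check_redzone(object);		/* may read poisoned redzone bytes */
	kmsan_enable_current();		/* reports re-enabled; calls may nest */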

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
Documentation/dev-tools/kmsan.rst | 11 +++++++++--
include/linux/kmsan.h | 24 ++++++++++++++++++++++++
include/linux/kmsan_types.h | 2 +-
mm/kmsan/core.c | 1 -
mm/kmsan/hooks.c | 18 +++++++++++++++---
mm/kmsan/report.c | 7 ++++---
tools/objtool/check.c | 2 ++
7 files changed, 55 insertions(+), 10 deletions(-)

diff --git a/Documentation/dev-tools/kmsan.rst b/Documentation/dev-tools/kmsan.rst
index 323eedad53cd..6a48d96c5c85 100644
--- a/Documentation/dev-tools/kmsan.rst
+++ b/Documentation/dev-tools/kmsan.rst
@@ -110,6 +110,13 @@ in the Makefile. Think of this as applying ``__no_sanitize_memory`` to every
function in the file or directory. Most users won't need KMSAN_SANITIZE, unless
their code gets broken by KMSAN (e.g. runs at early boot time).

+KMSAN checks can also be temporarily disabled for the current task using
+``kmsan_disable_current()`` and ``kmsan_enable_current()`` calls. Each
+``kmsan_enable_current()`` call must be preceded by a
+``kmsan_disable_current()`` call; these call pairs may be nested. One needs to
+be careful with these calls, keeping the regions short and preferring other
+ways to disable instrumentation, where possible.
+
Support
=======

@@ -338,11 +345,11 @@ Per-task KMSAN state
~~~~~~~~~~~~~~~~~~~~

Every task_struct has an associated KMSAN task state that holds the KMSAN
-context (see above) and a per-task flag disallowing KMSAN reports::
+context (see above) and a per-task counter disallowing KMSAN reports::

struct kmsan_context {
...
- bool allow_reporting;
+ unsigned int depth;
struct kmsan_context_state cstate;
...
}
diff --git a/include/linux/kmsan.h b/include/linux/kmsan.h
index fe6c2212bdb1..23de1b3d6aee 100644
--- a/include/linux/kmsan.h
+++ b/include/linux/kmsan.h
@@ -239,6 +239,22 @@ void kmsan_unpoison_entry_regs(const struct pt_regs *regs);
*/
void *kmsan_get_metadata(void *addr, bool is_origin);

+/*
+ * kmsan_enable_current(): Enable KMSAN for the current task.
+ *
+ * Each kmsan_enable_current() call must be preceded by a
+ * kmsan_disable_current() call. These call pairs may be nested.
+ */
+void kmsan_enable_current(void);
+
+/*
+ * kmsan_disable_current(): Disable KMSAN for the current task.
+ *
+ * Each kmsan_disable_current() call must be followed by a
+ * kmsan_enable_current() call. These call pairs may be nested.
+ */
+void kmsan_disable_current(void);
+
#else

static inline void kmsan_init_shadow(void)
@@ -338,6 +354,14 @@ static inline void kmsan_unpoison_entry_regs(const struct pt_regs *regs)
{
}

+static inline void kmsan_enable_current(void)
+{
+}
+
+static inline void kmsan_disable_current(void)
+{
+}
+
#endif

#endif /* _LINUX_KMSAN_H */
diff --git a/include/linux/kmsan_types.h b/include/linux/kmsan_types.h
index 929287981afe..dfc59918b3c0 100644
--- a/include/linux/kmsan_types.h
+++ b/include/linux/kmsan_types.h
@@ -31,7 +31,7 @@ struct kmsan_context_state {
struct kmsan_ctx {
struct kmsan_context_state cstate;
int kmsan_in_runtime;
- bool allow_reporting;
+ unsigned int depth;
};

#endif /* _LINUX_KMSAN_TYPES_H */
diff --git a/mm/kmsan/core.c b/mm/kmsan/core.c
index 95f859e38c53..81b22220711a 100644
--- a/mm/kmsan/core.c
+++ b/mm/kmsan/core.c
@@ -43,7 +43,6 @@ void kmsan_internal_task_create(struct task_struct *task)
struct thread_info *info = current_thread_info();

__memset(ctx, 0, sizeof(*ctx));
- ctx->allow_reporting = true;
kmsan_internal_unpoison_memory(info, sizeof(*info), false);
}

diff --git a/mm/kmsan/hooks.c b/mm/kmsan/hooks.c
index b408714f9ba3..267d0afa2e8b 100644
--- a/mm/kmsan/hooks.c
+++ b/mm/kmsan/hooks.c
@@ -39,12 +39,10 @@ void kmsan_task_create(struct task_struct *task)

void kmsan_task_exit(struct task_struct *task)
{
- struct kmsan_ctx *ctx = &task->kmsan_ctx;
-
if (!kmsan_enabled || kmsan_in_runtime())
return;

- ctx->allow_reporting = false;
+ kmsan_disable_current();
}

void kmsan_slab_alloc(struct kmem_cache *s, void *object, gfp_t flags)
@@ -424,3 +422,17 @@ void kmsan_check_memory(const void *addr, size_t size)
REASON_ANY);
}
EXPORT_SYMBOL(kmsan_check_memory);
+
+void kmsan_enable_current(void)
+{
+ KMSAN_WARN_ON(current->kmsan_ctx.depth == 0);
+ current->kmsan_ctx.depth--;
+}
+EXPORT_SYMBOL(kmsan_enable_current);
+
+void kmsan_disable_current(void)
+{
+ current->kmsan_ctx.depth++;
+ KMSAN_WARN_ON(current->kmsan_ctx.depth == 0);
+}
+EXPORT_SYMBOL(kmsan_disable_current);
diff --git a/mm/kmsan/report.c b/mm/kmsan/report.c
index c79d3b0d2d0d..92e73ec61435 100644
--- a/mm/kmsan/report.c
+++ b/mm/kmsan/report.c
@@ -8,6 +8,7 @@
*/

#include <linux/console.h>
+#include <linux/kmsan.h>
#include <linux/moduleparam.h>
#include <linux/stackdepot.h>
#include <linux/stacktrace.h>
@@ -158,12 +159,12 @@ void kmsan_report(depot_stack_handle_t origin, void *address, int size,

if (!kmsan_enabled)
return;
- if (!current->kmsan_ctx.allow_reporting)
+ if (current->kmsan_ctx.depth)
return;
if (!origin)
return;

- current->kmsan_ctx.allow_reporting = false;
+ kmsan_disable_current();
ua_flags = user_access_save();
raw_spin_lock(&kmsan_report_lock);
pr_err("=====================================================\n");
@@ -216,5 +217,5 @@ void kmsan_report(depot_stack_handle_t origin, void *address, int size,
if (panic_on_kmsan)
panic("kmsan.panic set ...\n");
user_access_restore(ua_flags);
- current->kmsan_ctx.allow_reporting = true;
+ kmsan_enable_current();
}
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 0a33d9195b7a..01237d167223 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1202,6 +1202,8 @@ static const char *uaccess_safe_builtin[] = {
"__sanitizer_cov_trace_switch",
/* KMSAN */
"kmsan_copy_to_user",
+ "kmsan_disable_current",
+ "kmsan_enable_current",
"kmsan_report",
"kmsan_unpoison_entry_regs",
"kmsan_unpoison_memory",
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:49 AM
The value assigned to prot is immediately overwritten on the next line
with PAGE_KERNEL. The right-hand side of the assignment has no
side effects.

Fixes: b073d7f8aee4 ("mm: kmsan: maintain KMSAN metadata for page operations")
Suggested-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/shadow.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/mm/kmsan/shadow.c b/mm/kmsan/shadow.c
index b9d05aff313e..2d57408c78ae 100644
--- a/mm/kmsan/shadow.c
+++ b/mm/kmsan/shadow.c
@@ -243,7 +243,6 @@ int kmsan_vmap_pages_range_noflush(unsigned long start, unsigned long end,
s_pages[i] = shadow_page_for(pages[i]);
o_pages[i] = origin_page_for(pages[i]);
}
- prot = __pgprot(pgprot_val(prot) | _PAGE_NX);
prot = PAGE_KERNEL;

origin_start = vmalloc_meta((void *)start, KMSAN_META_ORIGIN);
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:49 AM
Comparing pointers with TASK_SIZE does not make sense when the kernel
and userspace address spaces overlap. Assume that we are handling a
user memory access in that case.

Reported-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/hooks.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/kmsan/hooks.c b/mm/kmsan/hooks.c
index 22e8657800ef..b408714f9ba3 100644
--- a/mm/kmsan/hooks.c
+++ b/mm/kmsan/hooks.c
@@ -267,7 +267,8 @@ void kmsan_copy_to_user(void __user *to, const void *from, size_t to_copy,
return;

ua_flags = user_access_save();
- if ((u64)to < TASK_SIZE) {
+ if (!IS_ENABLED(CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE) ||
+ (u64)to < TASK_SIZE) {
/* This is a user memory access, check it. */
kmsan_internal_check_memory((void *)from, to_copy - left, to,
REASON_COPY_TO_USER);
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:49 AM
When building the kmsan test as a module, modpost fails with the
following error message:

ERROR: modpost: "panic_on_kmsan" [mm/kmsan/kmsan_test.ko] undefined!

Export panic_on_kmsan in order to improve KMSAN usability for
modules.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/report.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/mm/kmsan/report.c b/mm/kmsan/report.c
index 02736ec757f2..c79d3b0d2d0d 100644
--- a/mm/kmsan/report.c
+++ b/mm/kmsan/report.c
@@ -20,6 +20,7 @@ static DEFINE_RAW_SPINLOCK(kmsan_report_lock);
/* Protected by kmsan_report_lock */
static char report_local_descr[DESCR_SIZE];
int panic_on_kmsan __read_mostly;
+EXPORT_SYMBOL_GPL(panic_on_kmsan);

#ifdef MODULE_PARAM_PREFIX
#undef MODULE_PARAM_PREFIX
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:50 AM
Add a wrapper for memset() that prevents unpoisoning. This is useful
for filling memory allocator redzones.
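
For example, the SLUB patch later in this series uses it to fill
redzones without marking them as initialized:

  /* Write the redzone pattern while keeping the KMSAN shadow intact. */
  memset_no_sanitize_memory(p - s->red_left_pad, val, s->red_left_pad);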

Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
include/linux/kmsan.h | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/include/linux/kmsan.h b/include/linux/kmsan.h
index 23de1b3d6aee..5f50885f2023 100644
--- a/include/linux/kmsan.h
+++ b/include/linux/kmsan.h
@@ -255,6 +255,14 @@ void kmsan_enable_current(void);
*/
void kmsan_disable_current(void);

+/*
+ * memset_no_sanitize_memory(): memset() without KMSAN instrumentation.
+ */
+static inline void *memset_no_sanitize_memory(void *s, int c, size_t n)
+{
+ return __memset(s, c, n);
+}
+
#else

static inline void kmsan_init_shadow(void)
@@ -362,6 +370,11 @@ static inline void kmsan_disable_current(void)
{
}

+static inline void *memset_no_sanitize_memory(void *s, int c, size_t n)
+{
+ return memset(s, c, n);
+}
+
#endif

#endif /* _LINUX_KMSAN_H */
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:50 AM
Architectures use assembly code to initialize ftrace_regs and call
ftrace_ops_list_func(). Therefore, from KMSAN's point of view,
ftrace_regs is poisoned on entry to ftrace_ops_list_func(). This causes
KMSAN warnings when running the ftrace testsuite.

Fix by trusting the architecture-specific assembly code and always
unpoisoning ftrace_regs in ftrace_ops_list_func().

That the issue has not been encountered on x86_64 so far is merely an
accident: the assembly-allocated ftrace_regs happened to overlap a
stale, partially unpoisoned stack frame. Poisoning stack frames before
returns [1] makes the issue appear on x86_64 as well.

[1] https://github.com/iii-i/llvm-project/commits/msan-poison-allocas-before-returning-2024-06-12/

Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Steven Rostedt (Google) <ros...@goodmis.org>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
kernel/trace/ftrace.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 65208d3b5ed9..c35ad4362d71 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -7407,6 +7407,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
void arch_ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct ftrace_regs *fregs)
{
+ kmsan_unpoison_memory(fregs, sizeof(*fregs));
__ftrace_ops_list_func(ip, parent_ip, NULL, fregs);
}
#else
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:50 AM
The inline assembly block in s390's chsc() stores up to 4096 bytes.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/instrumentation.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/kmsan/instrumentation.c b/mm/kmsan/instrumentation.c
index cc3907a9c33a..470b0b4afcc4 100644
--- a/mm/kmsan/instrumentation.c
+++ b/mm/kmsan/instrumentation.c
@@ -110,11 +110,10 @@ void __msan_instrument_asm_store(void *addr, uintptr_t size)

ua_flags = user_access_save();
/*
- * Most of the accesses are below 32 bytes. The two exceptions so far
- * are clwb() (64 bytes) and FPU state (512 bytes).
- * It's unlikely that the assembly will touch more than 512 bytes.
+ * Most of the accesses are below 32 bytes. The exceptions so far are
+ * clwb() (64 bytes), FPU state (512 bytes) and chsc() (4096 bytes).
*/
- if (size > 512) {
+ if (size > 4096) {
WARN_ONCE(1, "assembly store size too big: %ld\n", size);
size = 8;
}
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:51 AM
Avoid KMSAN false negatives with SLUB_DEBUG by allowing
kmsan_slab_free() to poison the freed memory, and by preventing
init_object() from unpoisoning new allocations by using
memset_no_sanitize_memory() instead of memset().

There are two alternatives to this approach. First, init_object()
can be marked with __no_sanitize_memory. This annotation should be used
with great care, because it drops all instrumentation from the
function, and any shadow writes will be lost. Even though this is not a
concern with the current init_object() implementation, this may change
in the future.

Second, kmsan_poison_memory() calls may be added after memset() calls.
The downside is that init_object() is called from
free_debug_processing(), in which case poisoning will erase the
distinction between simply uninitialized memory and UAF.

Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/hooks.c | 2 +-
mm/slub.c | 15 +++++++++++----
2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/mm/kmsan/hooks.c b/mm/kmsan/hooks.c
index 267d0afa2e8b..26d86dfdc819 100644
--- a/mm/kmsan/hooks.c
+++ b/mm/kmsan/hooks.c
@@ -74,7 +74,7 @@ void kmsan_slab_free(struct kmem_cache *s, void *object)
return;

/* RCU slabs could be legally used after free within the RCU period */
- if (unlikely(s->flags & (SLAB_TYPESAFE_BY_RCU | SLAB_POISON)))
+ if (unlikely(s->flags & SLAB_TYPESAFE_BY_RCU))
return;
/*
* If there's a constructor, freed memory must remain in the same state
diff --git a/mm/slub.c b/mm/slub.c
index 1373ac365a46..1134091abac5 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1139,7 +1139,13 @@ static void init_object(struct kmem_cache *s, void *object, u8 val)
unsigned int poison_size = s->object_size;

if (s->flags & SLAB_RED_ZONE) {
- memset(p - s->red_left_pad, val, s->red_left_pad);
+ /*
+ * Here and below, avoid overwriting the KMSAN shadow. Keeping
+ * the shadow makes it possible to distinguish uninit-value
+ * from use-after-free.
+ */
+ memset_no_sanitize_memory(p - s->red_left_pad, val,
+ s->red_left_pad);

if (slub_debug_orig_size(s) && val == SLUB_RED_ACTIVE) {
/*
@@ -1152,12 +1158,13 @@ static void init_object(struct kmem_cache *s, void *object, u8 val)
}

if (s->flags & __OBJECT_POISON) {
- memset(p, POISON_FREE, poison_size - 1);
- p[poison_size - 1] = POISON_END;
+ memset_no_sanitize_memory(p, POISON_FREE, poison_size - 1);
+ memset_no_sanitize_memory(p + poison_size - 1, POISON_END, 1);
}

if (s->flags & SLAB_RED_ZONE)
- memset(p + poison_size, val, s->inuse - poison_size);
+ memset_no_sanitize_memory(p + poison_size, val,
+ s->inuse - poison_size);
}

static void restore_bytes(struct kmem_cache *s, char *message, u8 data,
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:51 AM
Building the kernel with CONFIG_SLUB_DEBUG and CONFIG_KMSAN causes
KMSAN to complain about touching redzones in kfree().

Fix by extending the existing KASAN-related metadata_access_enable()
and metadata_access_disable() functions to KMSAN.

Acked-by: Vlastimil Babka <vba...@suse.cz>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/slub.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index 1134091abac5..b050e528112c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -829,10 +829,12 @@ static int disable_higher_order_debug;
static inline void metadata_access_enable(void)
{
kasan_disable_current();
+ kmsan_disable_current();
}

static inline void metadata_access_disable(void)
{
+ kmsan_enable_current();
kasan_enable_current();
}

--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:51 AM
Each s390 CPU has lowcore pages associated with it. Each CPU sees its
own lowcore at virtual address 0 through a hardware mechanism called
prefixing. Additionally, all lowcores are mapped to non-0 virtual
addresses stored in the lowcore_ptr[] array.

When lowcore is accessed through virtual address 0, one needs to
resolve metadata for lowcore_ptr[raw_smp_processor_id()].

Expose kmsan_get_metadata() to make it possible to do this from the
arch code.
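
A simplified sketch of the intended arch-side usage (the full version
is in the arch_kmsan_get_meta_or_null() patch later in this series):

  if (is_lowcore_addr(addr)) {
          /* Resolve the prefix: switch to this CPU's real lowcore. */
          addr += (void *)lowcore_ptr[raw_smp_processor_id()] -
                  (void *)&S390_lowcore;
          return kmsan_get_metadata(addr, is_origin);
  }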

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
include/linux/kmsan.h | 9 +++++++++
mm/kmsan/instrumentation.c | 1 +
mm/kmsan/kmsan.h | 1 -
3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/linux/kmsan.h b/include/linux/kmsan.h
index e0c23a32cdf0..fe6c2212bdb1 100644
--- a/include/linux/kmsan.h
+++ b/include/linux/kmsan.h
@@ -230,6 +230,15 @@ void kmsan_handle_urb(const struct urb *urb, bool is_out);
*/
void kmsan_unpoison_entry_regs(const struct pt_regs *regs);

+/**
+ * kmsan_get_metadata() - Return a pointer to KMSAN shadow or origins.
+ * @addr: kernel address.
+ * @is_origin: whether to return origins or shadow.
+ *
+ * Return NULL if metadata cannot be found.
+ */
+void *kmsan_get_metadata(void *addr, bool is_origin);
+
#else

static inline void kmsan_init_shadow(void)
diff --git a/mm/kmsan/instrumentation.c b/mm/kmsan/instrumentation.c
index 8a1bbbc723ab..94b49fac9d8b 100644
--- a/mm/kmsan/instrumentation.c
+++ b/mm/kmsan/instrumentation.c
@@ -14,6 +14,7 @@

#include "kmsan.h"
#include <linux/gfp.h>
+#include <linux/kmsan.h>
#include <linux/kmsan_string.h>
#include <linux/mm.h>
#include <linux/uaccess.h>
diff --git a/mm/kmsan/kmsan.h b/mm/kmsan/kmsan.h
index adf443bcffe8..34b83c301d57 100644
--- a/mm/kmsan/kmsan.h
+++ b/mm/kmsan/kmsan.h
@@ -66,7 +66,6 @@ struct shadow_origin_ptr {

struct shadow_origin_ptr kmsan_get_shadow_origin_ptr(void *addr, u64 size,
bool store);
-void *kmsan_get_metadata(void *addr, bool is_origin);
void __init kmsan_init_alloc_meta_for_range(void *start, void *end);

enum kmsan_bug_reason {
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:51 AM
Replace the x86-specific asm/pgtable_64_types.h #include with the
linux/pgtable.h one, which all architectures have.

While at it, sort the headers alphabetically for the sake of
consistency with other KMSAN code.

Fixes: f80be4571b19 ("kmsan: add KMSAN runtime core")
Suggested-by: Heiko Carstens <h...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/kmsan.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/kmsan/kmsan.h b/mm/kmsan/kmsan.h
index a14744205435..adf443bcffe8 100644
--- a/mm/kmsan/kmsan.h
+++ b/mm/kmsan/kmsan.h
@@ -10,14 +10,14 @@
#ifndef __MM_KMSAN_KMSAN_H
#define __MM_KMSAN_KMSAN_H

-#include <asm/pgtable_64_types.h>
#include <linux/irqflags.h>
+#include <linux/mm.h>
+#include <linux/nmi.h>
+#include <linux/pgtable.h>
+#include <linux/printk.h>
#include <linux/sched.h>
#include <linux/stackdepot.h>
#include <linux/stacktrace.h>
-#include <linux/nmi.h>
-#include <linux/mm.h>
-#include <linux/printk.h>

#define KMSAN_ALLOCA_MAGIC_ORIGIN 0xabcd0100
#define KMSAN_CHAIN_MAGIC_ORIGIN 0xabcd0200
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:52 AM
All other sanitizers are disabled for the boot code as well. While at
it, add a comment explaining why this is needed.

Reviewed-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/boot/Makefile | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
index 070c9b2e905f..526ed20b9d31 100644
--- a/arch/s390/boot/Makefile
+++ b/arch/s390/boot/Makefile
@@ -3,11 +3,13 @@
# Makefile for the linux s390-specific parts of the memory manager.
#

+# Tooling runtimes are unavailable and cannot be linked for early boot code
KCOV_INSTRUMENT := n
GCOV_PROFILE := n
UBSAN_SANITIZE := n
KASAN_SANITIZE := n
KCSAN_SANITIZE := n
+KMSAN_SANITIZE := n

KBUILD_AFLAGS := $(KBUILD_AFLAGS_DECOMPRESSOR)
KBUILD_CFLAGS := $(KBUILD_CFLAGS_DECOMPRESSOR)
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:53 AM
Even though the KMSAN warnings generated by memchr_inv() are suppressed
by metadata_access_enable(), its return value may still be poisoned.

The reason is that the last iteration of memchr_inv() returns
`*start != value ? start : NULL`, where *start is poisoned. Because of
this, somewhat counterintuitively, the shadow value computed by
visitSelectInst() is equal to `(uintptr_t)start`.
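
A condensed model of the problematic pattern (a sketch, not the actual
kernel code):

  /*
   * clang lowers this to a select. With *start poisoned, KMSAN
   * computes the result shadow as roughly (start ^ NULL), i.e.
   * (uintptr_t)start, so the returned pointer itself is considered
   * uninitialized.
   */
  return *start != value ? start : NULL;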

One possibility to fix this would be to unpoison the return value,
since the intention behind guarding memchr_inv() with
metadata_access_enable() is to touch poisoned metadata without
triggering KMSAN. However, this approach is too fragile, so simply
disable the KMSAN checks in the respective functions instead.

Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/slub.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index b050e528112c..fcd68fcea4ab 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1176,9 +1176,16 @@ static void restore_bytes(struct kmem_cache *s, char *message, u8 data,
memset(from, data, to - from);
}

-static int check_bytes_and_report(struct kmem_cache *s, struct slab *slab,
- u8 *object, char *what,
- u8 *start, unsigned int value, unsigned int bytes)
+#ifdef CONFIG_KMSAN
+#define pad_check_attributes noinline __no_kmsan_checks
+#else
+#define pad_check_attributes
+#endif
+
+static pad_check_attributes int
+check_bytes_and_report(struct kmem_cache *s, struct slab *slab,
+ u8 *object, char *what,
+ u8 *start, unsigned int value, unsigned int bytes)
{
u8 *fault;
u8 *end;
@@ -1270,7 +1277,8 @@ static int check_pad_bytes(struct kmem_cache *s, struct slab *slab, u8 *p)
}

/* Check the pad bytes at the end of a slab page */
-static void slab_pad_check(struct kmem_cache *s, struct slab *slab)
+static pad_check_attributes void
+slab_pad_check(struct kmem_cache *s, struct slab *slab)
{
u8 *start;
u8 *fault;
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:53 AM
Improve readability by replacing the custom alignment logic with
ALIGN_DOWN(). Unlike in other places where a similar sequence is used,
there is no size parameter that needs to be adjusted, so the standard
macro fits.
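
For concreteness (KMSAN_ORIGIN_SIZE is 4 in mm/kmsan/kmsan.h):

  addr = ALIGN_DOWN(0x1007, KMSAN_ORIGIN_SIZE); /* yields 0x1004 */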

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/shadow.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/kmsan/shadow.c b/mm/kmsan/shadow.c
index 2d57408c78ae..9c58f081d84f 100644
--- a/mm/kmsan/shadow.c
+++ b/mm/kmsan/shadow.c
@@ -123,14 +123,12 @@ struct shadow_origin_ptr kmsan_get_shadow_origin_ptr(void *address, u64 size,
*/
void *kmsan_get_metadata(void *address, bool is_origin)
{
- u64 addr = (u64)address, pad, off;
+ u64 addr = (u64)address, off;
struct page *page;
void *ret;

- if (is_origin && !IS_ALIGNED(addr, KMSAN_ORIGIN_SIZE)) {
- pad = addr % KMSAN_ORIGIN_SIZE;
- addr -= pad;
- }
+ if (is_origin)
+ addr = ALIGN_DOWN(addr, KMSAN_ORIGIN_SIZE);
address = (void *)addr;
if (kmsan_internal_is_vmalloc_addr(address) ||
kmsan_internal_is_module_addr(address))
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:53 AM
The constraints of the DFLTCC inline assembly are not precise: they
do not communicate the size of the output buffers to the compiler, so
KMSAN cannot instrument the stores automatically.

Add the manual kmsan_unpoison_memory() calls for the output buffers.
The logic is the same as in [1].

[1] https://github.com/zlib-ng/zlib-ng/commit/1f5ddcc009ac3511e99fc88736a9e1a6381168c5

Reported-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
lib/zlib_dfltcc/dfltcc.h | 1 +
lib/zlib_dfltcc/dfltcc_util.h | 28 ++++++++++++++++++++++++++++
2 files changed, 29 insertions(+)

diff --git a/lib/zlib_dfltcc/dfltcc.h b/lib/zlib_dfltcc/dfltcc.h
index b96232bdd44d..0f2a16d7a48a 100644
--- a/lib/zlib_dfltcc/dfltcc.h
+++ b/lib/zlib_dfltcc/dfltcc.h
@@ -80,6 +80,7 @@ struct dfltcc_param_v0 {
uint8_t csb[1152];
};

+static_assert(offsetof(struct dfltcc_param_v0, csb) == 384);
static_assert(sizeof(struct dfltcc_param_v0) == 1536);

#define CVT_CRC32 0
diff --git a/lib/zlib_dfltcc/dfltcc_util.h b/lib/zlib_dfltcc/dfltcc_util.h
index 4a46b5009f0d..10509270d822 100644
--- a/lib/zlib_dfltcc/dfltcc_util.h
+++ b/lib/zlib_dfltcc/dfltcc_util.h
@@ -2,6 +2,8 @@
#ifndef DFLTCC_UTIL_H
#define DFLTCC_UTIL_H

+#include "dfltcc.h"
+#include <linux/kmsan-checks.h>
#include <linux/zutil.h>

/*
@@ -20,6 +22,7 @@ typedef enum {
#define DFLTCC_CMPR 2
#define DFLTCC_XPND 4
#define HBT_CIRCULAR (1 << 7)
+#define DFLTCC_FN_MASK ((1 << 7) - 1)
#define HB_BITS 15
#define HB_SIZE (1 << HB_BITS)

@@ -34,6 +37,7 @@ static inline dfltcc_cc dfltcc(
)
{
Byte *t2 = op1 ? *op1 : NULL;
+ unsigned char *orig_t2 = t2;
size_t t3 = len1 ? *len1 : 0;
const Byte *t4 = op2 ? *op2 : NULL;
size_t t5 = len2 ? *len2 : 0;
@@ -59,6 +63,30 @@ static inline dfltcc_cc dfltcc(
: "cc", "memory");
t2 = r2; t3 = r3; t4 = r4; t5 = r5;

+ /*
+ * Unpoison the parameter block and the output buffer.
+ * This is a no-op in non-KMSAN builds.
+ */
+ switch (fn & DFLTCC_FN_MASK) {
+ case DFLTCC_QAF:
+ kmsan_unpoison_memory(param, sizeof(struct dfltcc_qaf_param));
+ break;
+ case DFLTCC_GDHT:
+ kmsan_unpoison_memory(param, offsetof(struct dfltcc_param_v0, csb));
+ break;
+ case DFLTCC_CMPR:
+ kmsan_unpoison_memory(param, sizeof(struct dfltcc_param_v0));
+ kmsan_unpoison_memory(
+ orig_t2,
+ t2 - orig_t2 +
+ (((struct dfltcc_param_v0 *)param)->sbb == 0 ? 0 : 1));
+ break;
+ case DFLTCC_XPND:
+ kmsan_unpoison_memory(param, sizeof(struct dfltcc_param_v0));
+ kmsan_unpoison_memory(orig_t2, t2 - orig_t2);
+ break;
+ }
+
if (op1)
*op1 = t2;
if (len1)
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:53 AM
Adjust the stack size for the KMSAN-enabled kernel as was done
for the KASAN-enabled one in commit 7fef92ccadd7 ("s390/kasan: double
the stack size"). Both tools have similar requirements.

Reviewed-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/Makefile | 2 +-
arch/s390/include/asm/thread_info.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/s390/Makefile b/arch/s390/Makefile
index f2b21c7a70ef..7fd57398221e 100644
--- a/arch/s390/Makefile
+++ b/arch/s390/Makefile
@@ -36,7 +36,7 @@ KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_DEBUG_INFO_DWARF4), $(call cc-option
KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_CC_NO_ARRAY_BOUNDS),-Wno-array-bounds)

UTS_MACHINE := s390x
-STACK_SIZE := $(if $(CONFIG_KASAN),65536,16384)
+STACK_SIZE := $(if $(CONFIG_KASAN),65536,$(if $(CONFIG_KMSAN),65536,16384))
CHECKFLAGS += -D__s390__ -D__s390x__

export LD_BFD
diff --git a/arch/s390/include/asm/thread_info.h b/arch/s390/include/asm/thread_info.h
index a674c7d25da5..d02a709717b8 100644
--- a/arch/s390/include/asm/thread_info.h
+++ b/arch/s390/include/asm/thread_info.h
@@ -16,7 +16,7 @@
/*
* General size of kernel stacks
*/
-#ifdef CONFIG_KASAN
+#if defined(CONFIG_KASAN) || defined(CONFIG_KMSAN)
#define THREAD_SIZE_ORDER 4
#else
#define THREAD_SIZE_ORDER 2
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:53 AM
KMSAN warns about check_canary() accessing the canary.

The reason is that, even though set_canary() is properly instrumented
and sets shadow, slub explicitly poisons the canary's address range
afterwards.

Unpoisoning the canary is not the right thing to do: only
check_canary() is supposed to ever touch it. Instead, disable KMSAN
checks around canary read accesses.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Tested-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kfence/core.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 964b8482275b..83f8e78827c0 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -305,8 +305,14 @@ metadata_update_state(struct kfence_metadata *meta, enum kfence_object_state nex
WRITE_ONCE(meta->state, next);
}

+#ifdef CONFIG_KMSAN
+#define check_canary_attributes noinline __no_kmsan_checks
+#else
+#define check_canary_attributes inline
+#endif
+
/* Check canary byte at @addr. */
-static inline bool check_canary_byte(u8 *addr)
+static check_canary_attributes bool check_canary_byte(u8 *addr)
{
struct kfence_metadata *meta;
unsigned long flags;
@@ -341,7 +347,8 @@ static inline void set_canary(const struct kfence_metadata *meta)
*((u64 *)addr) = KFENCE_CANARY_PATTERN_U64;
}

-static inline void check_canary(const struct kfence_metadata *meta)
+static check_canary_attributes void
+check_canary(const struct kfence_metadata *meta)
{
const unsigned long pageaddr = ALIGN_DOWN(meta->addr, PAGE_SIZE);
unsigned long addr = pageaddr;
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:53 AM
On s390, virtual address 0 is valid (the current CPU's lowcore is
mapped there); therefore, KMSAN should not complain about it.

Disable the respective check on s390. There doesn't seem to be a
Kconfig option to describe this situation, so explicitly check for
s390.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/init.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/kmsan/init.c b/mm/kmsan/init.c
index 9de76ac7062c..3f8b1bbb9060 100644
--- a/mm/kmsan/init.c
+++ b/mm/kmsan/init.c
@@ -33,7 +33,10 @@ static void __init kmsan_record_future_shadow_range(void *start, void *end)
bool merged = false;

KMSAN_WARN_ON(future_index == NUM_FUTURE_RANGES);
- KMSAN_WARN_ON((nstart >= nend) || !nstart || !nend);
+ KMSAN_WARN_ON((nstart >= nend) ||
+ /* Virtual address 0 is valid on s390. */
+ (!IS_ENABLED(CONFIG_S390) && !nstart) ||
+ !nend);
nstart = ALIGN_DOWN(nstart, PAGE_SIZE);
nend = ALIGN(nend, PAGE_SIZE);

--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:54 AM
Prevent KMSAN from complaining about buffers filled by cpacf_trng()
being uninitialized.

Tested-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/include/asm/cpacf.h | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/s390/include/asm/cpacf.h b/arch/s390/include/asm/cpacf.h
index c786538e397c..dae8843b164f 100644
--- a/arch/s390/include/asm/cpacf.h
+++ b/arch/s390/include/asm/cpacf.h
@@ -12,6 +12,7 @@
#define _ASM_S390_CPACF_H

#include <asm/facility.h>
+#include <linux/kmsan-checks.h>

/*
* Instruction opcodes for the CPACF instructions
@@ -542,6 +543,8 @@ static inline void cpacf_trng(u8 *ucbuf, unsigned long ucbuf_len,
: [ucbuf] "+&d" (u.pair), [cbuf] "+&d" (c.pair)
: [fc] "K" (CPACF_PRNO_TRNG), [opc] "i" (CPACF_PRNO)
: "cc", "memory", "0");
+ kmsan_unpoison_memory(ucbuf, ucbuf_len);
+ kmsan_unpoison_memory(cbuf, cbuf_len);
}

/**
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:54 AM
Add a KMSAN check to the CKSM inline assembly, similar to how it was
done for ASAN in commit e42ac7789df6 ("s390/checksum: always use cksm
instruction").

Acked-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/include/asm/checksum.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/s390/include/asm/checksum.h b/arch/s390/include/asm/checksum.h
index b89159591ca0..46f5c9660616 100644
--- a/arch/s390/include/asm/checksum.h
+++ b/arch/s390/include/asm/checksum.h
@@ -13,6 +13,7 @@
#define _S390_CHECKSUM_H

#include <linux/instrumented.h>
+#include <linux/kmsan-checks.h>
#include <linux/in6.h>

static inline __wsum cksm(const void *buff, int len, __wsum sum)
@@ -23,6 +24,7 @@ static inline __wsum cksm(const void *buff, int len, __wsum sum)
};

instrument_read(buff, len);
+ kmsan_check_memory(buff, len);
asm volatile("\n"
"0: cksm %[sum],%[rp]\n"
" jo 0b\n"
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:54 AM
Comparing pointers with TASK_SIZE does not make sense when the kernel
and userspace address spaces overlap. Skip the comparison in that case.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/instrumentation.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/kmsan/instrumentation.c b/mm/kmsan/instrumentation.c
index 470b0b4afcc4..8a1bbbc723ab 100644
--- a/mm/kmsan/instrumentation.c
+++ b/mm/kmsan/instrumentation.c
@@ -20,7 +20,8 @@

static inline bool is_bad_asm_addr(void *addr, uintptr_t size, bool is_store)
{
- if ((u64)addr < TASK_SIZE)
+ if (IS_ENABLED(CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE) &&
+ (u64)addr < TASK_SIZE)
return true;
if (!kmsan_get_metadata(addr, KMSAN_META_SHADOW))
return true;
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:54 AM
x86's alloc_node_data() rounds up the node data size to PAGE_SIZE. It
is not documented why this is needed, but it is most likely done for
performance reasons, since the padding bytes are not used anywhere. Some other
architectures do it as well, e.g., mips rounds it up to the cache line
size.

kmsan_init_shadow() initializes metadata for each node data and assumes
the x86 rounding, which does not match other architectures. This may
cause the range end to overshoot the end of available memory, in turn
causing virt_to_page_or_null() in kmsan_init_alloc_meta_for_range() to
return NULL, which leads to kernel panic shortly after.

Since the padding bytes are not used, drop the rounding.
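
A worked example with made-up numbers: suppose memory ends at
0x10000000 and sizeof(pg_data_t) == 0x1a00:

  node data at:                0x0fffe400
  actual end:                  0x0ffffe00  (within memory)
  end rounded up to PAGE_SIZE: 0x10000400  (past the end of memory)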

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
mm/kmsan/init.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/kmsan/init.c b/mm/kmsan/init.c
index 3ac3b8921d36..9de76ac7062c 100644
--- a/mm/kmsan/init.c
+++ b/mm/kmsan/init.c
@@ -72,7 +72,7 @@ static void __init kmsan_record_future_shadow_range(void *start, void *end)
*/
void __init kmsan_init_shadow(void)
{
- const size_t nd_size = roundup(sizeof(pg_data_t), PAGE_SIZE);
+ const size_t nd_size = sizeof(pg_data_t);
phys_addr_t p_start, p_end;
u64 loop;
int nid;
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:55 AM
stcctm() uses the "Q" constraint for dest, therefore KMSAN does not
understand that it fills multiple doublewords pointed to by dest, not
just one. This results in false positives.

Unpoison the whole dest manually with kmsan_unpoison_memory().

Reported-by: Alexander Gordeev <agor...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/include/asm/cpu_mf.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/s390/include/asm/cpu_mf.h b/arch/s390/include/asm/cpu_mf.h
index a0de5b9b02ea..9e4bbc3e53f8 100644
--- a/arch/s390/include/asm/cpu_mf.h
+++ b/arch/s390/include/asm/cpu_mf.h
@@ -10,6 +10,7 @@
#define _ASM_S390_CPU_MF_H

#include <linux/errno.h>
+#include <linux/kmsan-checks.h>
#include <asm/asm-extable.h>
#include <asm/facility.h>

@@ -239,6 +240,11 @@ static __always_inline int stcctm(enum stcctm_ctr_set set, u64 range, u64 *dest)
: "=d" (cc)
: "Q" (*dest), "d" (range), "i" (set)
: "cc", "memory");
+ /*
+ * If cc == 2, less than RANGE counters are stored, but it's not easy
+ * to tell how many. Always unpoison the whole range for simplicity.
+ */
+ kmsan_unpoison_memory(dest, range * sizeof(u64));
return cc;
}

--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:55 AM
It should be possible to have inline functions in the s390 header
files that call kmsan_unpoison_memory(). The problem is that these
header files might be included by the decompressor, which does not
contain the KMSAN runtime, causing linker errors.

Not compiling these calls if __SANITIZE_MEMORY__ is not defined -
either by changing kmsan-checks.h or at the call sites - may cause
unintended side effects, since calling these functions from
uninstrumented code that is linked into the kernel is a valid use case.

One might want to explicitly distinguish between the kernel and the
decompressor. Checking for a decompressor-specific #define is quite
heavy-handed, and will have to be done at all call sites.

A more generic approach is to provide a dummy kmsan_unpoison_memory()
definition. This produces some runtime overhead, but only when building
with CONFIG_KMSAN. The benefit is that it does not disturb the existing
KMSAN build logic and call sites don't need to be changed.
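
A sketch of the pattern that the dummy definition enables (header and
function names are illustrative; see the cpacf_trng() patch above for
a real instance):

  /* An s390 header inline, included by both kernel and decompressor. */
  static inline void read_hw_data(u8 *buf, unsigned long len)
  {
          /* ... inline asm fills buf behind KMSAN's back ... */
          /* Undefined symbol in the decompressor without the dummy: */
          kmsan_unpoison_memory(buf, len);
  }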

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/boot/Makefile | 1 +
arch/s390/boot/kmsan.c | 6 ++++++
2 files changed, 7 insertions(+)
create mode 100644 arch/s390/boot/kmsan.c

diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
index 526ed20b9d31..e7658997452b 100644
--- a/arch/s390/boot/Makefile
+++ b/arch/s390/boot/Makefile
@@ -44,6 +44,7 @@ obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE)) +=
obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
obj-y += $(if $(CONFIG_KERNEL_UNCOMPRESSED),,decompressor.o) info.o
obj-$(CONFIG_KERNEL_ZSTD) += clz_ctz.o
+obj-$(CONFIG_KMSAN) += kmsan.o
obj-all := $(obj-y) piggy.o syms.o

targets := bzImage section_cmp.boot.data section_cmp.boot.preserved.data $(obj-y)
diff --git a/arch/s390/boot/kmsan.c b/arch/s390/boot/kmsan.c
new file mode 100644
index 000000000000..e7b3ac48143e
--- /dev/null
+++ b/arch/s390/boot/kmsan.c
@@ -0,0 +1,6 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/kmsan-checks.h>
+
+void kmsan_unpoison_memory(const void *address, size_t size)
+{
+}
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:55 AM
Lockdep generates the following false positives with KMSAN on s390x:

[ 6.063666] DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
[ ...]
[ 6.577050] Call Trace:
[ 6.619637] [<000000000690d2de>] check_flags+0x1fe/0x210
[ 6.665411] ([<000000000690d2da>] check_flags+0x1fa/0x210)
[ 6.707478] [<00000000006cec1a>] lock_acquire+0x2ca/0xce0
[ 6.749959] [<00000000069820ea>] _raw_spin_lock_irqsave+0xea/0x190
[ 6.794912] [<00000000041fc988>] __stack_depot_save+0x218/0x5b0
[ 6.838420] [<000000000197affe>] __msan_poison_alloca+0xfe/0x1a0
[ 6.882985] [<0000000007c5827c>] start_kernel+0x70c/0xd50
[ 6.927454] [<0000000000100036>] startup_continue+0x36/0x40

Between trace_hardirqs_on() and `stosm __mask, 3`, lockdep thinks that
interrupts are enabled, while on the CPU they are still disabled. KMSAN
instrumentation takes spinlocks, giving lockdep a chance to see and
complain about this discrepancy.

KMSAN instrumentation is inserted in order to poison the __mask
variable. Disable instrumentation in the respective functions. They are
very small and it's easy to see that no important metadata updates are
lost because of this.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/include/asm/irqflags.h | 17 ++++++++++++++---
drivers/s390/char/sclp.c | 2 +-
2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/s390/include/asm/irqflags.h b/arch/s390/include/asm/irqflags.h
index 02427b205c11..bcab456dfb80 100644
--- a/arch/s390/include/asm/irqflags.h
+++ b/arch/s390/include/asm/irqflags.h
@@ -37,12 +37,18 @@ static __always_inline void __arch_local_irq_ssm(unsigned long flags)
asm volatile("ssm %0" : : "Q" (flags) : "memory");
}

-static __always_inline unsigned long arch_local_save_flags(void)
+#ifdef CONFIG_KMSAN
+#define arch_local_irq_attributes noinline notrace __no_sanitize_memory __maybe_unused
+#else
+#define arch_local_irq_attributes __always_inline
+#endif
+
+static arch_local_irq_attributes unsigned long arch_local_save_flags(void)
{
return __arch_local_irq_stnsm(0xff);
}

-static __always_inline unsigned long arch_local_irq_save(void)
+static arch_local_irq_attributes unsigned long arch_local_irq_save(void)
{
return __arch_local_irq_stnsm(0xfc);
}
@@ -52,7 +58,12 @@ static __always_inline void arch_local_irq_disable(void)
arch_local_irq_save();
}

-static __always_inline void arch_local_irq_enable(void)
+static arch_local_irq_attributes void arch_local_irq_enable_external(void)
+{
+ __arch_local_irq_stosm(0x01);
+}
+
+static arch_local_irq_attributes void arch_local_irq_enable(void)
{
__arch_local_irq_stosm(0x03);
}
diff --git a/drivers/s390/char/sclp.c b/drivers/s390/char/sclp.c
index d53ee34d398f..fb1d9949adca 100644
--- a/drivers/s390/char/sclp.c
+++ b/drivers/s390/char/sclp.c
@@ -736,7 +736,7 @@ sclp_sync_wait(void)
cr0_sync.val = cr0.val & ~CR0_IRQ_SUBCLASS_MASK;
cr0_sync.val |= 1UL << (63 - 54);
local_ctl_load(0, &cr0_sync);
- __arch_local_irq_stosm(0x01);
+ arch_local_irq_enable_external();
/* Loop until driver state indicates finished request */
while (sclp_running_state != sclp_running_state_idle) {
/* Check for expired request timer */
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:56 AM
s390 uses assembly code to initialize ftrace_regs and call
kprobe_ftrace_handler(). Therefore, from KMSAN's point of view,
ftrace_regs is poisoned on kprobe_ftrace_handler() entry. This causes
KMSAN warnings when running the ftrace testsuite.

Fix by trusting the assembly code and always unpoisoning ftrace_regs in
kprobe_ftrace_handler().

Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/kernel/ftrace.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index ddf2ee47cb87..0bd6adc40a34 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -12,6 +12,7 @@
#include <linux/ftrace.h>
#include <linux/kernel.h>
#include <linux/types.h>
+#include <linux/kmsan-checks.h>
#include <linux/kprobes.h>
#include <linux/execmem.h>
#include <trace/syscall.h>
@@ -303,6 +304,7 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
if (bit < 0)
return;

+ kmsan_unpoison_memory(fregs, sizeof(*fregs));
regs = ftrace_get_regs(fregs);
p = get_kprobe((kprobe_opcode_t *)ip);
if (!regs || unlikely(!p) || kprobe_disabled(p))
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:57 AM
arch_kmsan_get_meta_or_null() finds the lowcore shadow by querying the
prefix and calling kmsan_get_metadata() again.

kmsan_virt_addr_valid() delegates to virt_addr_valid().

Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/include/asm/kmsan.h | 59 +++++++++++++++++++++++++++++++++++
1 file changed, 59 insertions(+)
create mode 100644 arch/s390/include/asm/kmsan.h

diff --git a/arch/s390/include/asm/kmsan.h b/arch/s390/include/asm/kmsan.h
new file mode 100644
index 000000000000..eb850c942204
--- /dev/null
+++ b/arch/s390/include/asm/kmsan.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_S390_KMSAN_H
+#define _ASM_S390_KMSAN_H
+
+#include <asm/lowcore.h>
+#include <asm/page.h>
+#include <linux/kmsan.h>
+#include <linux/mmzone.h>
+#include <linux/stddef.h>
+
+#ifndef MODULE
+
+static inline bool is_lowcore_addr(void *addr)
+{
+ return addr >= (void *)&S390_lowcore &&
+ addr < (void *)(&S390_lowcore + 1);
+}
+
+static inline void *arch_kmsan_get_meta_or_null(void *addr, bool is_origin)
+{
+ if (is_lowcore_addr(addr)) {
+ /*
+ * Different lowcores accessed via S390_lowcore are described
+ * by the same struct page. Resolve the prefix manually in
+ * order to get a distinct struct page.
+ */
+ addr += (void *)lowcore_ptr[raw_smp_processor_id()] -
+ (void *)&S390_lowcore;
+ if (WARN_ON_ONCE(is_lowcore_addr(addr)))
+ return NULL;
+ return kmsan_get_metadata(addr, is_origin);
+ }
+ return NULL;
+}
+
+static inline bool kmsan_virt_addr_valid(void *addr)
+{
+ bool ret;
+
+ /*
+ * pfn_valid() relies on RCU, and may call into the scheduler on exiting
+ * the critical section. However, this would result in recursion with
+ * KMSAN. Therefore, disable preemption here, and re-enable preemption
+ * below while suppressing reschedules to avoid recursion.
+ *
+ * Note, this sacrifices occasionally breaking scheduling guarantees.
+ * Although, a kernel compiled with KMSAN has already given up on any
+ * performance guarantees due to being heavily instrumented.
+ */
+ preempt_disable();
+ ret = virt_addr_valid(addr);
+ preempt_enable_no_resched();
+
+ return ret;
+}
+
+#endif /* !MODULE */
+
+#endif /* _ASM_S390_KMSAN_H */
--
2.45.1

Ilya Leoshkevich

Jun 19, 2024, 11:45:57 AM
Diagnose 224 stores 4k bytes, which currently cannot be deduced from
the inline assembly constraints. This leads to KMSAN false positives.

Fix the constraints by using a 4k-sized struct instead of a raw
pointer. While at it, prettify them too.

Suggested-by: Heiko Carstens <h...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/kernel/diag.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/s390/kernel/diag.c b/arch/s390/kernel/diag.c
index 8dee9aa0ec95..8a7009618ba7 100644
--- a/arch/s390/kernel/diag.c
+++ b/arch/s390/kernel/diag.c
@@ -278,12 +278,14 @@ int diag224(void *ptr)
int rc = -EOPNOTSUPP;

diag_stat_inc(DIAG_STAT_X224);
- asm volatile(
- " diag %1,%2,0x224\n"
- "0: lhi %0,0x0\n"
+ asm volatile("\n"
+ " diag %[type],%[addr],0x224\n"
+ "0: lhi %[rc],0\n"
"1:\n"
EX_TABLE(0b,1b)
- : "+d" (rc) :"d" (0), "d" (addr) : "memory");
+ : [rc] "+d" (rc)
+ , "=m" (*(struct { char buf[PAGE_SIZE]; } *)ptr)
+ : [type] "d" (0), [addr] "d" (addr));
return rc;
}
EXPORT_SYMBOL(diag224);
--
2.45.1
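The underlying idiom, shown as a standalone sketch (illustrative names,
assuming GNU inline asm in kernel context): describing the written
buffer with a sized "=m" output operand, instead of passing only a raw
pointer input, lets both the compiler and KMSAN see that the asm
initializes all 4k bytes.

/* Sketch: "=m" over a page-sized struct tells the compiler (and
 * KMSAN's instrumentation) that the asm writes the whole buffer. */
static inline void store_4k_sketch(void *ptr)
{
	asm volatile(
		"" /* an instruction storing PAGE_SIZE bytes would go here */
		: "=m" (*(struct { char buf[PAGE_SIZE]; } *)ptr));
}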

Ilya Leoshkevich

unread,
Jun 19, 2024, 11:45:57 AM6/19/24
to Alexander Gordeev, Alexander Potapenko, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle, Ilya Leoshkevich
Unpoisoning the pt_regs is normally done by the generic entry code, but
the kernel_stack_overflow() flow bypasses it.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/kernel/traps.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/s390/kernel/traps.c b/arch/s390/kernel/traps.c
index 52578b5cecbd..dde69d2a64f0 100644
--- a/arch/s390/kernel/traps.c
+++ b/arch/s390/kernel/traps.c
@@ -27,6 +27,7 @@
#include <linux/uaccess.h>
#include <linux/cpu.h>
#include <linux/entry-common.h>
+#include <linux/kmsan.h>
#include <asm/asm-extable.h>
#include <asm/vtime.h>
#include <asm/fpu.h>
@@ -262,6 +263,11 @@ static void monitor_event_exception(struct pt_regs *regs)

void kernel_stack_overflow(struct pt_regs *regs)
{
+ /*
+ * Normally regs are unpoisoned by the generic entry code, but
+ * kernel_stack_overflow() is a rare case that bypasses it.
+ */
+ kmsan_unpoison_entry_regs(regs);
bust_spinlocks(1);
printk("Kernel stack overflow.\n");
show_regs(regs);
--
2.45.1

Ilya Leoshkevich

unread,
Jun 19, 2024, 11:45:57 AM6/19/24
to Alexander Gordeev, Alexander Potapenko, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle, Ilya Leoshkevich
The pages for the KMSAN metadata associated with most kernel mappings
are taken from memblock by the common code. However, vmalloc and module
metadata needs to be defined by the architectures.

Be a little bit more careful than x86: allocate exactly MODULES_LEN
for the module shadow and origins, and then take 2/3 of vmalloc for
the vmalloc shadow and origins. This ensures that users passing small
vmalloc= values on the command line do not cause module metadata
collisions.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Alexander Gordeev <agor...@linux.ibm.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/boot/startup.c | 7 +++++++
arch/s390/include/asm/pgtable.h | 8 ++++++++
2 files changed, 15 insertions(+)

diff --git a/arch/s390/boot/startup.c b/arch/s390/boot/startup.c
index 48ef5fe5c08a..d6b0d114939a 100644
--- a/arch/s390/boot/startup.c
+++ b/arch/s390/boot/startup.c
@@ -301,11 +301,18 @@ static unsigned long setup_kernel_memory_layout(unsigned long kernel_size)
MODULES_END = round_down(kernel_start, _SEGMENT_SIZE);
MODULES_VADDR = MODULES_END - MODULES_LEN;
VMALLOC_END = MODULES_VADDR;
+ if (IS_ENABLED(CONFIG_KMSAN))
+ VMALLOC_END -= MODULES_LEN * 2;

/* allow vmalloc area to occupy up to about 1/2 of the rest virtual space left */
vsize = (VMALLOC_END - FIXMAP_SIZE) / 2;
vsize = round_down(vsize, _SEGMENT_SIZE);
vmalloc_size = min(vmalloc_size, vsize);
+ if (IS_ENABLED(CONFIG_KMSAN)) {
+ /* take 2/3 of vmalloc area for KMSAN shadow and origins */
+ vmalloc_size = round_down(vmalloc_size / 3, _SEGMENT_SIZE);
+ VMALLOC_END -= vmalloc_size * 2;
+ }
VMALLOC_START = VMALLOC_END - vmalloc_size;

__memcpy_real_area = round_down(VMALLOC_START - MEMCPY_REAL_SIZE, PAGE_SIZE);
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 70b6ee557eb2..2f44c23efec0 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -107,6 +107,14 @@ static inline int is_module_addr(void *addr)
return 1;
}

+#ifdef CONFIG_KMSAN
+#define KMSAN_VMALLOC_SIZE (VMALLOC_END - VMALLOC_START)
+#define KMSAN_VMALLOC_SHADOW_START VMALLOC_END
+#define KMSAN_VMALLOC_ORIGIN_START (KMSAN_VMALLOC_SHADOW_START + KMSAN_VMALLOC_SIZE)
+#define KMSAN_MODULES_SHADOW_START (KMSAN_VMALLOC_ORIGIN_START + KMSAN_VMALLOC_SIZE)
+#define KMSAN_MODULES_ORIGIN_START (KMSAN_MODULES_SHADOW_START + MODULES_LEN)
+#endif
+
#ifdef CONFIG_RANDOMIZE_BASE
#define KASLR_LEN (1UL << 31)
#else
--
2.45.1
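A worked example of the resulting arithmetic, with made-up numbers
(sketch only, not from the patch):

/*
 * Illustrative only. Suppose MODULES_LEN is 512 MB and 3 GB of virtual
 * space remain for vmalloc:
 *
 *   VMALLOC_END  -= 2 * MODULES_LEN;           // module shadow + origins
 *   vmalloc_size  = round_down(3 GB / 3, _SEGMENT_SIZE);   // 1 GB
 *   VMALLOC_END  -= 2 * vmalloc_size;          // vmalloc shadow + origins
 *   VMALLOC_START = VMALLOC_END - vmalloc_size;
 *
 * Shadow and origins each mirror the final 1 GB vmalloc area, so even a
 * tiny vmalloc= on the command line cannot make the vmalloc metadata
 * collide with the module metadata.
 */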

Ilya Leoshkevich

unread,
Jun 19, 2024, 11:45:58 AM6/19/24
to Alexander Gordeev, Alexander Potapenko, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle, Ilya Leoshkevich
put_user() uses inline assembly with precise constraints, so Clang is
in principle capable of instrumenting it automatically. Unfortunately,
one of the constraints contains a dereferenced user pointer, and Clang
does not currently distinguish user and kernel pointers. Therefore
KMSAN attempts to access shadow for user pointers, which is not the
right thing to do.

An obvious fix to add __no_sanitize_memory to __put_user_fn() does not
work, since it's __always_inline. And __always_inline cannot be removed
due to the __put_user_bad() trick.

A different obvious fix, using the "a" constraint instead of "+Q",
degrades the quality of the generated code, which is critical here,
since this is a hot path.

Instead, repurpose the __put_user_asm() macro to define
__put_user_{char,short,int,long}_noinstr() functions and mark them with
__no_sanitize_memory. For the non-KMSAN builds make them
__always_inline in order to keep the generated code quality. Also
define __put_user_{char,short,int,long}() functions, which call the
aforementioned ones and which *are* instrumented, because they call
KMSAN hooks, which may be implemented as macros.

The same applies to get_user() as well.

Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/include/asm/uaccess.h | 111 +++++++++++++++++++++++---------
1 file changed, 79 insertions(+), 32 deletions(-)

diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index 81ae8a98e7ec..70f0edc00c2a 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -78,13 +78,24 @@ union oac {

int __noreturn __put_user_bad(void);

-#define __put_user_asm(to, from, size) \
-({ \
+#ifdef CONFIG_KMSAN
+#define get_put_user_noinstr_attributes \
+ noinline __maybe_unused __no_sanitize_memory
+#else
+#define get_put_user_noinstr_attributes __always_inline
+#endif
+
+#define DEFINE_PUT_USER(type) \
+static get_put_user_noinstr_attributes int \
+__put_user_##type##_noinstr(unsigned type __user *to, \
+ unsigned type *from, \
+ unsigned long size) \
+{ \
union oac __oac_spec = { \
.oac1.as = PSW_BITS_AS_SECONDARY, \
.oac1.a = 1, \
}; \
- int __rc; \
+ int rc; \
\
asm volatile( \
" lr 0,%[spec]\n" \
@@ -93,12 +104,28 @@ int __noreturn __put_user_bad(void);
"2:\n" \
EX_TABLE_UA_STORE(0b, 2b, %[rc]) \
EX_TABLE_UA_STORE(1b, 2b, %[rc]) \
- : [rc] "=&d" (__rc), [_to] "+Q" (*(to)) \
+ : [rc] "=&d" (rc), [_to] "+Q" (*(to)) \
: [_size] "d" (size), [_from] "Q" (*(from)), \
[spec] "d" (__oac_spec.val) \
: "cc", "0"); \
- __rc; \
-})
+ return rc; \
+} \
+ \
+static __always_inline int \
+__put_user_##type(unsigned type __user *to, unsigned type *from, \
+ unsigned long size) \
+{ \
+ int rc; \
+ \
+ rc = __put_user_##type##_noinstr(to, from, size); \
+ instrument_put_user(*from, to, size); \
+ return rc; \
+}
+
+DEFINE_PUT_USER(char);
+DEFINE_PUT_USER(short);
+DEFINE_PUT_USER(int);
+DEFINE_PUT_USER(long);

static __always_inline int __put_user_fn(void *x, void __user *ptr, unsigned long size)
{
@@ -106,24 +133,24 @@ static __always_inline int __put_user_fn(void *x, void __user *ptr, unsigned long size)

switch (size) {
case 1:
- rc = __put_user_asm((unsigned char __user *)ptr,
- (unsigned char *)x,
- size);
+ rc = __put_user_char((unsigned char __user *)ptr,
+ (unsigned char *)x,
+ size);
break;
case 2:
- rc = __put_user_asm((unsigned short __user *)ptr,
- (unsigned short *)x,
- size);
+ rc = __put_user_short((unsigned short __user *)ptr,
+ (unsigned short *)x,
+ size);
break;
case 4:
- rc = __put_user_asm((unsigned int __user *)ptr,
+ rc = __put_user_int((unsigned int __user *)ptr,
(unsigned int *)x,
size);
break;
case 8:
- rc = __put_user_asm((unsigned long __user *)ptr,
- (unsigned long *)x,
- size);
+ rc = __put_user_long((unsigned long __user *)ptr,
+ (unsigned long *)x,
+ size);
break;
default:
__put_user_bad();
@@ -134,13 +161,17 @@ static __always_inline int __put_user_fn(void *x, void __user *ptr, unsigned lon

int __noreturn __get_user_bad(void);

-#define __get_user_asm(to, from, size) \
-({ \
+#define DEFINE_GET_USER(type) \
+static get_put_user_noinstr_attributes int \
+__get_user_##type##_noinstr(unsigned type *to, \
+ unsigned type __user *from, \
+ unsigned long size) \
+{ \
union oac __oac_spec = { \
.oac2.as = PSW_BITS_AS_SECONDARY, \
.oac2.a = 1, \
}; \
- int __rc; \
+ int rc; \
\
asm volatile( \
" lr 0,%[spec]\n" \
@@ -149,13 +180,29 @@ int __noreturn __get_user_bad(void);
"2:\n" \
EX_TABLE_UA_LOAD_MEM(0b, 2b, %[rc], %[_to], %[_ksize]) \
EX_TABLE_UA_LOAD_MEM(1b, 2b, %[rc], %[_to], %[_ksize]) \
- : [rc] "=&d" (__rc), "=Q" (*(to)) \
+ : [rc] "=&d" (rc), "=Q" (*(to)) \
: [_size] "d" (size), [_from] "Q" (*(from)), \
[spec] "d" (__oac_spec.val), [_to] "a" (to), \
[_ksize] "K" (size) \
: "cc", "0"); \
- __rc; \
-})
+ return rc; \
+} \
+ \
+static __always_inline int \
+__get_user_##type(unsigned type *to, unsigned type __user *from, \
+ unsigned long size) \
+{ \
+ int rc; \
+ \
+ rc = __get_user_##type##_noinstr(to, from, size); \
+ instrument_get_user(*to); \
+ return rc; \
+}
+
+DEFINE_GET_USER(char);
+DEFINE_GET_USER(short);
+DEFINE_GET_USER(int);
+DEFINE_GET_USER(long);

static __always_inline int __get_user_fn(void *x, const void __user *ptr, unsigned long size)
{
@@ -163,24 +210,24 @@ static __always_inline int __get_user_fn(void *x, const void __user *ptr, unsigned long size)

switch (size) {
case 1:
- rc = __get_user_asm((unsigned char *)x,
- (unsigned char __user *)ptr,
- size);
+ rc = __get_user_char((unsigned char *)x,
+ (unsigned char __user *)ptr,
+ size);
break;
case 2:
- rc = __get_user_asm((unsigned short *)x,
- (unsigned short __user *)ptr,
- size);
+ rc = __get_user_short((unsigned short *)x,
+ (unsigned short __user *)ptr,
+ size);
break;
case 4:
- rc = __get_user_asm((unsigned int *)x,
+ rc = __get_user_int((unsigned int *)x,
(unsigned int __user *)ptr,
size);
break;
case 8:
- rc = __get_user_asm((unsigned long *)x,
- (unsigned long __user *)ptr,
- size);
+ rc = __get_user_long((unsigned long *)x,
+ (unsigned long __user *)ptr,
+ size);
break;
default:
__get_user_bad();
--
2.45.1
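The pattern boils down to the following shape (a condensed sketch with
invented names, not the actual macro expansion):

#ifdef CONFIG_KMSAN
#define noinstr_attrs noinline __maybe_unused __no_sanitize_memory
#else
#define noinstr_attrs __always_inline
#endif

/* Uninstrumented inner helper: KMSAN must not touch the user pointer. */
static noinstr_attrs int put_user_int_noinstr_sketch(unsigned int __user *to,
						     unsigned int *from)
{
	/* the inline asm storing *from to *to and setting rc goes here */
	return 0;
}

/* Instrumented wrapper: does the KMSAN bookkeeping on kernel data only. */
static __always_inline int put_user_int_sketch(unsigned int __user *to,
					       unsigned int *from)
{
	int rc = put_user_int_noinstr_sketch(to, from);

	instrument_put_user(*from, to, sizeof(*from));
	return rc;
}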

Ilya Leoshkevich

unread,
Jun 19, 2024, 11:45:58 AM6/19/24
to Alexander Gordeev, Alexander Potapenko, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle, Ilya Leoshkevich
Add KMSAN support for the s390 implementations of the string functions.
Do this similar to how it's already done for KASAN, except that the
optimized memset{16,32,64}() functions need to be disabled: it's
important for KMSAN to know that they initialized something.

The way the boot code is built with regard to string functions is
problematic: most files are compiled as if sanitizers were enabled, but
boot/string.c is not. This creates conflicting memset64() definitions,
depending on whether a given file is built with sanitizers or
FORTIFY_SOURCE. This should probably be streamlined, but in the
meantime resolve the issues by introducing the IN_BOOT_STRING_C macro,
similar to the existing IN_ARCH_STRING_C macro.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/boot/string.c | 16 ++++++++++++++++
arch/s390/include/asm/string.h | 20 +++++++++++++++-----
2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/arch/s390/boot/string.c b/arch/s390/boot/string.c
index faccb33b462c..f6b9b1df48a8 100644
--- a/arch/s390/boot/string.c
+++ b/arch/s390/boot/string.c
@@ -1,11 +1,18 @@
// SPDX-License-Identifier: GPL-2.0
+#define IN_BOOT_STRING_C 1
#include <linux/ctype.h>
#include <linux/kernel.h>
#include <linux/errno.h>
#undef CONFIG_KASAN
#undef CONFIG_KASAN_GENERIC
+#undef CONFIG_KMSAN
#include "../lib/string.c"

+/*
+ * Duplicate some functions from the common lib/string.c
+ * instead of fully including it.
+ */
+
int strncmp(const char *cs, const char *ct, size_t count)
{
unsigned char c1, c2;
@@ -22,6 +29,15 @@ int strncmp(const char *cs, const char *ct, size_t count)
return 0;
}

+void *memset64(uint64_t *s, uint64_t v, size_t count)
+{
+ uint64_t *xs = s;
+
+ while (count--)
+ *xs++ = v;
+ return s;
+}
+
char *skip_spaces(const char *str)
{
while (isspace(*str))
diff --git a/arch/s390/include/asm/string.h b/arch/s390/include/asm/string.h
index 351685de53d2..2ab868cbae6c 100644
--- a/arch/s390/include/asm/string.h
+++ b/arch/s390/include/asm/string.h
@@ -15,15 +15,12 @@
#define __HAVE_ARCH_MEMCPY /* gcc builtin & arch function */
#define __HAVE_ARCH_MEMMOVE /* gcc builtin & arch function */
#define __HAVE_ARCH_MEMSET /* gcc builtin & arch function */
-#define __HAVE_ARCH_MEMSET16 /* arch function */
-#define __HAVE_ARCH_MEMSET32 /* arch function */
-#define __HAVE_ARCH_MEMSET64 /* arch function */

void *memcpy(void *dest, const void *src, size_t n);
void *memset(void *s, int c, size_t n);
void *memmove(void *dest, const void *src, size_t n);

-#ifndef CONFIG_KASAN
+#if !defined(CONFIG_KASAN) && !defined(CONFIG_KMSAN)
#define __HAVE_ARCH_MEMCHR /* inline & arch function */
#define __HAVE_ARCH_MEMCMP /* arch function */
#define __HAVE_ARCH_MEMSCAN /* inline & arch function */
@@ -36,6 +33,9 @@ void *memmove(void *dest, const void *src, size_t n);
#define __HAVE_ARCH_STRNCPY /* arch function */
#define __HAVE_ARCH_STRNLEN /* inline & arch function */
#define __HAVE_ARCH_STRSTR /* arch function */
+#define __HAVE_ARCH_MEMSET16 /* arch function */
+#define __HAVE_ARCH_MEMSET32 /* arch function */
+#define __HAVE_ARCH_MEMSET64 /* arch function */

/* Prototypes for non-inlined arch strings functions. */
int memcmp(const void *s1, const void *s2, size_t n);
@@ -44,7 +44,7 @@ size_t strlcat(char *dest, const char *src, size_t n);
char *strncat(char *dest, const char *src, size_t n);
char *strncpy(char *dest, const char *src, size_t n);
char *strstr(const char *s1, const char *s2);
-#endif /* !CONFIG_KASAN */
+#endif /* !defined(CONFIG_KASAN) && !defined(CONFIG_KMSAN) */

#undef __HAVE_ARCH_STRCHR
#undef __HAVE_ARCH_STRNCHR
@@ -74,20 +74,30 @@ void *__memset16(uint16_t *s, uint16_t v, size_t count);
void *__memset32(uint32_t *s, uint32_t v, size_t count);
void *__memset64(uint64_t *s, uint64_t v, size_t count);

+#ifdef __HAVE_ARCH_MEMSET16
static inline void *memset16(uint16_t *s, uint16_t v, size_t count)
{
return __memset16(s, v, count * sizeof(v));
}
+#endif

+#ifdef __HAVE_ARCH_MEMSET32
static inline void *memset32(uint32_t *s, uint32_t v, size_t count)
{
return __memset32(s, v, count * sizeof(v));
}
+#endif

+#ifdef __HAVE_ARCH_MEMSET64
+#ifdef IN_BOOT_STRING_C
+void *memset64(uint64_t *s, uint64_t v, size_t count);
+#else
static inline void *memset64(uint64_t *s, uint64_t v, size_t count)
{
return __memset64(s, v, count * sizeof(v));
}
+#endif
+#endif

#if !defined(IN_ARCH_STRING_C) && (!defined(CONFIG_FORTIFY_SOURCE) || defined(__NO_FORTIFY))

--
2.45.1
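To see why the optimized memsets must be disabled, consider this sketch
(invented function name; behavior as described above):

static void memset64_kmsan_sketch(void)
{
	u64 buf[4];

	memset64(buf, 0x0101010101010101ULL, ARRAY_SIZE(buf));

	/* With the generic, instrumented memset64() KMSAN marks buf as
	 * initialized here. The uninstrumented __memset64() assembly
	 * would leave the shadow untouched, and later reads of buf
	 * could be reported as uninitialized. */
}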

Ilya Leoshkevich

unread,
Jun 19, 2024, 11:45:59 AM6/19/24
to Alexander Gordeev, Alexander Potapenko, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle, Ilya Leoshkevich
uaccess.h uses instrument_get_user() and instrument_put_user(), which
are defined in linux/instrumented.h. Currently we get this header from
somewhere else by accident; prefer to be explicit about it and include
it directly.

Suggested-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/include/asm/uaccess.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/s390/include/asm/uaccess.h b/arch/s390/include/asm/uaccess.h
index 70f0edc00c2a..9213be0529ee 100644
--- a/arch/s390/include/asm/uaccess.h
+++ b/arch/s390/include/asm/uaccess.h
@@ -18,6 +18,7 @@
#include <asm/extable.h>
#include <asm/facility.h>
#include <asm-generic/access_ok.h>
+#include <linux/instrumented.h>

void debug_user_asce(int exit);

--
2.45.1

Ilya Leoshkevich

unread,
Jun 19, 2024, 11:45:59 AM6/19/24
to Alexander Gordeev, Alexander Potapenko, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle, Ilya Leoshkevich
The unwind code can read uninitialized frames. Furthermore, even in
the good case, KMSAN does not emit shadow for backchains. Therefore
disable it for the unwinding functions.

Reviewed-by: Alexander Potapenko <gli...@google.com>
Acked-by: Heiko Carstens <h...@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/kernel/unwind_bc.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/s390/kernel/unwind_bc.c b/arch/s390/kernel/unwind_bc.c
index 0ece156fdd7c..cd44be2b6ce8 100644
--- a/arch/s390/kernel/unwind_bc.c
+++ b/arch/s390/kernel/unwind_bc.c
@@ -49,6 +49,8 @@ static inline bool is_final_pt_regs(struct unwind_state *state,
READ_ONCE_NOCHECK(regs->psw.mask) & PSW_MASK_PSTATE;
}

+/* Avoid KMSAN false positives from touching uninitialized frames. */
+__no_kmsan_checks
bool unwind_next_frame(struct unwind_state *state)
{
struct stack_info *info = &state->stack_info;
@@ -118,6 +120,8 @@ bool unwind_next_frame(struct unwind_state *state)
}
EXPORT_SYMBOL_GPL(unwind_next_frame);

+/* Avoid KMSAN false positives from touching uninitialized frames. */
+__no_kmsan_checks
void __unwind_start(struct unwind_state *state, struct task_struct *task,
struct pt_regs *regs, unsigned long first_frame)
{
--
2.45.1

Ilya Leoshkevich

unread,
Jun 19, 2024, 11:45:59 AM6/19/24
to Alexander Gordeev, Alexander Potapenko, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle, Ilya Leoshkevich
Now that everything else is in place, enable KMSAN in Kconfig.

Acked-by: Heiko Carstens <h...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>
Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
---
arch/s390/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index c59d2b54df49..3cba4993d7c7 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -158,6 +158,7 @@ config S390
select HAVE_ARCH_KASAN
select HAVE_ARCH_KASAN_VMALLOC
select HAVE_ARCH_KCSAN
+ select HAVE_ARCH_KMSAN
select HAVE_ARCH_KFENCE
select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
select HAVE_ARCH_SECCOMP_FILTER
--
2.45.1

Alexander Potapenko

unread,
Jun 20, 2024, 4:15:11 AM6/20/24
to Ilya Leoshkevich, Alexander Gordeev, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle
On Wed, Jun 19, 2024 at 5:45 PM Ilya Leoshkevich <i...@linux.ibm.com> wrote:
>
> Add a wrapper for memset() that prevents unpoisoning. This is useful
> for filling memory allocator redzones.
>
> Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>

> ---
> include/linux/kmsan.h | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/include/linux/kmsan.h b/include/linux/kmsan.h
> index 23de1b3d6aee..5f50885f2023 100644
> --- a/include/linux/kmsan.h
> +++ b/include/linux/kmsan.h
> @@ -255,6 +255,14 @@ void kmsan_enable_current(void);
> */
> void kmsan_disable_current(void);
>
> +/*
> + * memset_no_sanitize_memory(): memset() without KMSAN instrumentation.
> + */
Please make this a doc comment, like in the rest of the file.
(Please also fix kmsan_enable_current/kmsan_disable_current in the
respective patch)
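
For reference, a kernel-doc comment for the helper would look roughly
like this (wording is a sketch, not the final patch):

/**
 * memset_no_sanitize_memory - Fill memory without KMSAN instrumentation.
 * @s: destination address.
 * @c: fill byte.
 * @n: number of bytes to fill.
 *
 * Unlike an instrumented memset(), this does not unpoison the
 * destination, which makes it suitable for filling allocator redzones.
 */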

Alexander Potapenko

unread,
Jun 20, 2024, 4:16:25 AM6/20/24
to Ilya Leoshkevich, Alexander Gordeev, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle
On Wed, Jun 19, 2024 at 5:45 PM Ilya Leoshkevich <i...@linux.ibm.com> wrote:
>
> uaccess.h uses instrument_get_user() and instrument_put_user(), which
> are defined in linux/instrumented.h. Currently we get this header from
> somewhere else by accident; prefer to be explicit about it and include
> it directly.
>
> Suggested-by: Alexander Potapenko <gli...@google.com>
> Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
Reviewed-by: Alexander Potapenko <gli...@google.com>

Alexander Potapenko

unread,
Jun 20, 2024, 4:36:52 AM6/20/24
to Ilya Leoshkevich, Alexander Gordeev, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle
On Wed, Jun 19, 2024 at 5:45 PM Ilya Leoshkevich <i...@linux.ibm.com> wrote:
>
> put_user() uses inline assembly with precise constraints, so Clang is
> in principle capable of instrumenting it automatically. Unfortunately,
> one of the constraints contains a dereferenced user pointer, and Clang
> does not currently distinguish user and kernel pointers. Therefore
> KMSAN attempts to access shadow for user pointers, which is not the
> right thing to do.

By the way, how does this problem manifest?
I was expecting KMSAN to generate dummy shadow accesses in this case,
and reading/writing 1-8 bytes from dummy shadow shouldn't be a
problem.

(On the other hand, not inlining the get_user/put_user functions is
probably still faster than retrieving the dummy shadow, so I'm fine
either way)



--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Liana Sebastian
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

Alexander Potapenko

unread,
Jun 20, 2024, 5:01:18 AM6/20/24
to Ilya Leoshkevich, Alexander Gordeev, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle
On Wed, Jun 19, 2024 at 5:45 PM Ilya Leoshkevich <i...@linux.ibm.com> wrote:
>
> Even though the KMSAN warnings generated by memchr_inv() are suppressed
> by metadata_access_enable(), its return value may still be poisoned.
>
> The reason is that the last iteration of memchr_inv() returns
> `*start != value ? start : NULL`, where *start is poisoned. Because of
> this, somewhat counterintuitively, the shadow value computed by
> visitSelectInst() is equal to `(uintptr_t)start`.
>
> Since the intention behind guarding memchr_inv() with
> metadata_access_enable() is to touch poisoned metadata without
> triggering KMSAN, one possible fix is to unpoison its return value.
> However, this approach is too fragile. So simply disable the KMSAN
> checks in the respective functions.
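
In code, the chosen fix is just the annotation (sketch; the helper name
is illustrative, the real patch annotates the slub functions involved):

/* Sketch: suppress KMSAN checks in a helper that deliberately
 * inspects possibly-poisoned slab metadata. */
__no_kmsan_checks
static bool all_bytes_equal_sketch(const void *s, int c, size_t n)
{
	/* memchr_inv()'s return value may carry poison from the last
	 * byte it inspected; with checks disabled that is harmless. */
	return !memchr_inv(s, c, n);
}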

Alexander Gordeev

unread,
Jun 20, 2024, 5:25:31 AM6/20/24
to Ilya Leoshkevich, Alexander Potapenko, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle
On Wed, Jun 19, 2024 at 05:44:11PM +0200, Ilya Leoshkevich wrote:

Hi Ilya,

> +static inline bool is_lowcore_addr(void *addr)
> +{
> + return addr >= (void *)&S390_lowcore &&
> + addr < (void *)(&S390_lowcore + 1);
> +}
> +
> +static inline void *arch_kmsan_get_meta_or_null(void *addr, bool is_origin)
> +{
> + if (is_lowcore_addr(addr)) {
> + /*
> + * Different lowcores accessed via S390_lowcore are described
> + * by the same struct page. Resolve the prefix manually in
> + * order to get a distinct struct page.
> + */

> + addr += (void *)lowcore_ptr[raw_smp_processor_id()] -
> + (void *)&S390_lowcore;

If I am not mistaken, neither raw_smp_processor_id() itself nor
lowcore_ptr[raw_smp_processor_id()] is atomic. Should preemption
be disabled while the addr is calculated?

But then the question arises: how meaningful is the returned value?
AFAICT kmsan_get_metadata() is called from a preemptible context.
So if the CPU changes, how useful is the previous CPU's lowcore meta?

Is it a memory block that needs to be ignored instead?

> + if (WARN_ON_ONCE(is_lowcore_addr(addr)))
> + return NULL;

lowcore_ptr[] pointing into S390_lowcore is rather a bug.

> + return kmsan_get_metadata(addr, is_origin);
> + }
> + return NULL;
> +}

Thanks!

Ilya Leoshkevich

unread,
Jun 20, 2024, 7:19:48 AM6/20/24
to Alexander Potapenko, Alexander Gordeev, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle
On Thu, 2024-06-20 at 10:36 +0200, Alexander Potapenko wrote:
> On Wed, Jun 19, 2024 at 5:45 PM Ilya Leoshkevich <i...@linux.ibm.com>
> wrote:
> >
> > put_user() uses inline assembly with precise constraints, so Clang
> > is
> > in principle capable of instrumenting it automatically.
> > Unfortunately,
> > one of the constraints contains a dereferenced user pointer, and
> > Clang
> > does not currently distinguish user and kernel pointers. Therefore
> > KMSAN attempts to access shadow for user pointers, which is not the
> > right thing to do.
>
> By the way, how does this problem manifest?
> I was expecting KMSAN to generate dummy shadow accesses in this case,
> and reading/writing 1-8 bytes from dummy shadow shouldn't be a
> problem.
>
> (On the other hand, not inlining the get_user/put_user functions is
> probably still faster than retrieving the dummy shadow, so I'm fine
> either way)

We have two problems here: not only can Clang not distinguish user and
kernel pointers, the KMSAN runtime - which is supposed to clean that
up - cannot do so either, due to the overlapping kernel and user
address spaces on s390. So the instrumentation ultimately tries to
access the real shadow.

I forgot what the consequences of that were exactly, so I reverted the
patch and now I get:

Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 000003fed25fa000 TEID: 000003fed25fa403
Fault in home space mode while using kernel ASCE.
AS:0000000005a70007 R3:00000000824d8007 S:0000000000000020
Oops: 0010 ilc:2 [#1] SMP
Modules linked in:
CPU: 3 PID: 1 Comm: init Tainted: G B N 6.10.0-rc4-g8aadb00f495e #11
Hardware name: IBM 3931 A01 704 (KVM/Linux)
Krnl PSW : 0704c00180000000 000003ffe288975a (memset+0x3a/0xa0)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000000 000003fed25fa180 000003fed25fa180 000003ffe28897a6
           0000000000000007 000003ffe0000000 0000000000000000 000002ee06e68190
           000002ee06f19000 000003fed25fa180 000003ffd25fa180 000003ffd25fa180
           0000000000000008 0000000000000000 000003ffe17262e0 0000037ee000f730
Krnl Code: 000003ffe288974c: 41101100		la	%r1,256(%r1)
           000003ffe2889750: a737fffb		brctg	%r3,000003ffe2889746
          #000003ffe2889754: c03000000029	larl	%r3,000003ffe28897a6
          >000003ffe288975a: 44403000		ex	%r4,0(%r3)
           000003ffe288975e: 07fe		bcr	15,%r14
           000003ffe2889760: a74f0001		cghi	%r4,1
           000003ffe2889764: b9040012		lgr	%r1,%r2
           000003ffe2889768: a784001c		brc	8,000003ffe28897a0
Call Trace:
[<000003ffe288975a>] memset+0x3a/0xa0
([<000003ffe17262bc>] kmsan_internal_set_shadow_origin+0x21c/0x3a0)
[<000003ffe1725fb6>] kmsan_internal_unpoison_memory+0x26/0x30
[<000003ffe1c1c646>] create_elf_tables+0x13c6/0x2620
[<000003ffe1c0ebaa>] load_elf_binary+0x50da/0x68f0
[<000003ffe18c41fc>] bprm_execve+0x201c/0x2f40
[<000003ffe18bff9a>] kernel_execve+0x2cda/0x2d00
[<000003ffe49b745a>] kernel_init+0x9ba/0x1630
[<000003ffe000cd5c>] __ret_from_fork+0xbc/0x180
[<000003ffe4a1907a>] ret_from_fork+0xa/0x30
Last Breaking-Event-Address:
[<000003ffe2889742>] memset+0x22/0xa0
Kernel panic - not syncing: Fatal exception: panic_on_oops

So is_bad_asm_addr() returned false for a userspace address.
Why? Because it happened to collide with the kernel modules area:
precisely the effect of overlapping.

VMALLOC_START: 0x37ee0000000
VMALLOC_END: 0x3a960000000
MODULES_VADDR: 0x3ff60000000
Address: 0x3ffd157a580
MODULES_END: 0x3ffe0000000

Now the question is, why do we crash when accessing shadow for modules?
I'll need to investigate, this does not look normal. But even if that
worked, we clearly wouldn't want userspace accesses to pollute module
shadow, so I think we need this patch in its current form.

[...]

Alexander Potapenko

unread,
Jun 20, 2024, 7:47:40 AM6/20/24
to Ilya Leoshkevich, Alexander Gordeev, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle
I see, thanks for the clarification!

> Now the question is, why do we crash when accessing shadow for modules?
> I'll need to investigate, this does not look normal. But even if that
> worked, we clearly wouldn't want userspace accesses to pollute module
> shadow, so I think we need this patch in its current form.

Ok, it indeed makes sense.

Reviewed-by: Alexander Potapenko <gli...@google.com>

Ilya Leoshkevich

unread,
Jun 20, 2024, 9:38:46 AM6/20/24
to Alexander Gordeev, Alexander Potapenko, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle
This code path will only be triggered by instrumented code that
accesses lowcore. That code is supposed to disable preemption;
if it didn't, it's a bug in that code and it should be fixed there.

>
> Is it a memory block that needs to be ignored instead?
>
> > + if (WARN_ON_ONCE(is_lowcore_addr(addr)))
> > + return NULL;
>
> lowcore_ptr[] pointing into S390_lowcore is rather a bug.

Right, but AFAIK BUG() calls are discouraged. I guess in a debug tool
the rules are more relaxed, but we can recover from this condition here
easily, which is why I still went for WARN_ON_ONCE().

Alexander Gordeev

unread,
Jun 20, 2024, 9:59:20 AM6/20/24
to Ilya Leoshkevich, Alexander Potapenko, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle
On Wed, Jun 19, 2024 at 05:44:11PM +0200, Ilya Leoshkevich wrote:
> arch_kmsan_get_meta_or_null() finds the lowcore shadow by querying the
> prefix and calling kmsan_get_metadata() again.
>
> kmsan_virt_addr_valid() delegates to virt_addr_valid().
>
> Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com>
> ---
> arch/s390/include/asm/kmsan.h | 59 +++++++++++++++++++++++++++++++++++
> 1 file changed, 59 insertions(+)
> create mode 100644 arch/s390/include/asm/kmsan.h


Acked-by: Alexander Gordeev <agor...@linux.ibm.com>

Alexander Potapenko

unread,
Jun 20, 2024, 10:18:53 AM6/20/24
to Ilya Leoshkevich, Alexander Gordeev, Andrew Morton, Christoph Lameter, David Rientjes, Heiko Carstens, Joonsoo Kim, Marco Elver, Masami Hiramatsu, Pekka Enberg, Steven Rostedt, Vasily Gorbik, Vlastimil Babka, Christian Borntraeger, Dmitry Vyukov, Hyeonggon Yoo, kasa...@googlegroups.com, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, linux-tra...@vger.kernel.org, Mark Rutland, Roman Gushchin, Sven Schnelle
We have KMSAN_WARN_ON() for that, sorry for not pointing it out
earlier: https://elixir.bootlin.com/linux/latest/source/mm/kmsan/kmsan.h#L46