[PATCH 1/3] x86: mm: disable KMSAN instrumentation for physaddr.c

0 views
Skip to first unread message

Alexander Potapenko

unread,
Jun 21, 2024, 5:49:08 AM (12 days ago) Jun 21
to gli...@google.com, el...@google.com, dvy...@google.com, dave....@linux.intel.com, pet...@infradead.org, ak...@linux-foundation.org, x...@kernel.org, linux-...@vger.kernel.org, kasa...@googlegroups.com, Kirill A . Shutemov
Enabling CONFIG_DEBUG_VIRTUAL=y together with KMSAN led to infinite
recursion, because kmsan_get_metadata() ended up calling instrumented
__pfn_valid() from arch/x86/mm/physaddr.c.

Prevent it by disabling instrumentation of the whole file.

Reported-by: Kirill A. Shutemov <kirill....@linux.intel.com>
Closes: https://github.com/google/kmsan/issues/95
Signed-off-by: Alexander Potapenko <gli...@google.com>
---
arch/x86/mm/Makefile | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 8d3a00e5c528e..d3b27a383127d 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -17,6 +17,7 @@ KCSAN_SANITIZE := n
# Avoid recursion by not calling KMSAN hooks for CEA code.
KMSAN_SANITIZE_cpu_entry_area.o := n
KMSAN_SANITIZE_mem_encrypt_identity.o := n
+KMSAN_SANITIZE_physaddr.o := n

ifdef CONFIG_FUNCTION_TRACER
CFLAGS_REMOVE_mem_encrypt.o = -pg
--
2.45.2.741.gdbec12cfda-goog

Alexander Potapenko

unread,
Jun 21, 2024, 5:49:11 AM (12 days ago) Jun 21
to gli...@google.com, el...@google.com, dvy...@google.com, dave....@linux.intel.com, pet...@infradead.org, ak...@linux-foundation.org, x...@kernel.org, linux-...@vger.kernel.org, kasa...@googlegroups.com, Kirill A . Shutemov
At least on x86 KMSAN is seriously slown down by lockdep, as every
pfn_valid() call (which is done on every instrumented memory access
in the kernel) performs several lockdep checks, all of which, in turn,
perform additional memory accesses and call KMSAN instrumentation.

Right now lockdep overflows the stack under KMSAN, but even if we use
reentrancy counters to avoid the recursion on the KMSAN side, the slowdown
from lockdep remains big enough for the kernel to become unusable.

Reported-by: Kirill A. Shutemov <kirill....@linux.intel.com>
Closes: https://github.com/google/kmsan/issues/94
Link: https://groups.google.com/g/kasan-dev/c/ZBiGzZL36-I/m/WtNuKqP9EQAJ
Signed-off-by: Alexander Potapenko <gli...@google.com>
---
lib/Kconfig.debug | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 59b6765d86b8f..036905cf1dbe9 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1339,7 +1339,7 @@ menu "Lock Debugging (spinlocks, mutexes, etc...)"

config LOCK_DEBUGGING_SUPPORT
bool
- depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
+ depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT && !KMSAN
default y

config PROVE_LOCKING
--
2.45.2.741.gdbec12cfda-goog

Alexander Potapenko

unread,
Jun 21, 2024, 5:49:15 AM (12 days ago) Jun 21
to gli...@google.com, el...@google.com, dvy...@google.com, dave....@linux.intel.com, pet...@infradead.org, ak...@linux-foundation.org, x...@kernel.org, linux-...@vger.kernel.org, kasa...@googlegroups.com, Kirill A . Shutemov
Because handle_bug() is a noinstr function, call to
kmsan_unpoison_entry_regs() should be happening within the
instrumentation_begin()/instrumentation_end() region.

Fortunately, the same noinstr annotation lets us dereference @regs
in handle_bug() without unpoisoning them, so we don't have to move the
`is_valid_bugaddr(regs->ip)` check below instrumentation_begin().

Reported-by: Kirill A. Shutemov <kirill....@linux.intel.com>
Link: https://groups.google.com/g/kasan-dev/c/ZBiGzZL36-I/m/WtNuKqP9EQAJ
Signed-off-by: Alexander Potapenko <gli...@google.com>
---
arch/x86/kernel/traps.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 4fa0b17e5043a..e8f330d9ba5d4 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -217,12 +217,6 @@ static noinstr bool handle_bug(struct pt_regs *regs)
{
bool handled = false;

- /*
- * Normally @regs are unpoisoned by irqentry_enter(), but handle_bug()
- * is a rare case that uses @regs without passing them to
- * irqentry_enter().
- */
- kmsan_unpoison_entry_regs(regs);
if (!is_valid_bugaddr(regs->ip))
return handled;

@@ -230,6 +224,15 @@ static noinstr bool handle_bug(struct pt_regs *regs)
* All lies, just get the WARN/BUG out.
*/
instrumentation_begin();
+ /*
+ * Normally @regs are unpoisoned by irqentry_enter(), but handle_bug()
+ * is a rare case that uses @regs without passing them to
+ * irqentry_enter().
+ * Unpoisoning of @regs should be done before the first access to it,
+ * but because this is a noinstr function it is fine to postpone
+ * unpoisoning until the call of instrumentation_begin().
+ */
+ kmsan_unpoison_entry_regs(regs);
/*
* Since we're emulating a CALL with exceptions, restore the interrupt
* state to what it was at the exception site.
--
2.45.2.741.gdbec12cfda-goog

Kirill A . Shutemov

unread,
Jun 21, 2024, 10:57:46 AM (12 days ago) Jun 21
to Alexander Potapenko, el...@google.com, dvy...@google.com, dave....@linux.intel.com, pet...@infradead.org, ak...@linux-foundation.org, x...@kernel.org, linux-...@vger.kernel.org, kasa...@googlegroups.com
On Fri, Jun 21, 2024 at 11:48:59AM +0200, Alexander Potapenko wrote:
> Enabling CONFIG_DEBUG_VIRTUAL=y together with KMSAN led to infinite
> recursion, because kmsan_get_metadata() ended up calling instrumented
> __pfn_valid() from arch/x86/mm/physaddr.c.
>
> Prevent it by disabling instrumentation of the whole file.
>
> Reported-by: Kirill A. Shutemov <kirill....@linux.intel.com>
> Closes: https://github.com/google/kmsan/issues/95
> Signed-off-by: Alexander Potapenko <gli...@google.com>

Reviewed-by: Kirill A. Shutemov <kirill....@linux.intel.com>

--
Kiryl Shutsemau / Kirill A. Shutemov

Kirill A . Shutemov

unread,
Jun 21, 2024, 11:02:43 AM (12 days ago) Jun 21
to Alexander Potapenko, el...@google.com, dvy...@google.com, dave....@linux.intel.com, pet...@infradead.org, ak...@linux-foundation.org, x...@kernel.org, linux-...@vger.kernel.org, kasa...@googlegroups.com
Nit: no need to pile everything in one line. Add "depends on !KMSAN" on a
separate line.

Otherwise, looks sane.

Kirill A . Shutemov

unread,
Jun 21, 2024, 11:09:07 AM (12 days ago) Jun 21
to Alexander Potapenko, el...@google.com, dvy...@google.com, dave....@linux.intel.com, pet...@infradead.org, ak...@linux-foundation.org, x...@kernel.org, linux-...@vger.kernel.org, kasa...@googlegroups.com
On Fri, Jun 21, 2024 at 11:49:01AM +0200, Alexander Potapenko wrote:
> Because handle_bug() is a noinstr function, call to
> kmsan_unpoison_entry_regs() should be happening within the
> instrumentation_begin()/instrumentation_end() region.
>
> Fortunately, the same noinstr annotation lets us dereference @regs
> in handle_bug() without unpoisoning them, so we don't have to move the
> `is_valid_bugaddr(regs->ip)` check below instrumentation_begin().

Imperative mood, please. And capitalize "fix" in the subject.

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/tree/Documentation/process/maintainer-tip.rst#n134
>
> Reported-by: Kirill A. Shutemov <kirill....@linux.intel.com>
> Link: https://groups.google.com/g/kasan-dev/c/ZBiGzZL36-I/m/WtNuKqP9EQAJ
> Signed-off-by: Alexander Potapenko <gli...@google.com>

Otherwise, looks good.

Dave Hansen

unread,
Jun 21, 2024, 12:23:28 PM (12 days ago) Jun 21
to Alexander Potapenko, el...@google.com, dvy...@google.com, dave....@linux.intel.com, ak...@linux-foundation.org, x...@kernel.org, linux-...@vger.kernel.org, kasa...@googlegroups.com, Kirill A . Shutemov, Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long, Boqun Feng
On 6/21/24 02:49, Alexander Potapenko wrote:
> config LOCK_DEBUGGING_SUPPORT
> bool
> - depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
> + depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT && !KMSAN
> default y

This kinda stinks. Practically, it'll mean that anyone turning on KMSAN
will accidentally turn off lockdep. That's really nasty, especially for
folks who are turning on debug options left and right to track down
nasty bugs.

I'd *MUCH* rather hide KMSAN:

config KMSAN
bool "KMSAN: detector of uninitialized values use"
depends on HAVE_ARCH_KMSAN && HAVE_KMSAN_COMPILER
depends on DEBUG_KERNEL && !KASAN && !KCSAN
depends on !PREEMPT_RT
+ depends on !LOCKDEP

Because, frankly, lockdep is way more important than KMSAN.

But ideally, we'd allow them to coexist somehow. Have we even discussed
the problem with the lockdep folks? For instance, I'd much rather have
a relaxed lockdep with no checking in pfn_valid() than no lockdep at all.

Dave Hansen

unread,
Jun 21, 2024, 12:40:28 PM (12 days ago) Jun 21
to Alexander Potapenko, el...@google.com, dvy...@google.com, dave....@linux.intel.com, pet...@infradead.org, ak...@linux-foundation.org, x...@kernel.org, linux-...@vger.kernel.org, kasa...@googlegroups.com, Kirill A . Shutemov
On 6/21/24 02:48, Alexander Potapenko wrote:
> Enabling CONFIG_DEBUG_VIRTUAL=y together with KMSAN led to infinite
> recursion, because kmsan_get_metadata() ended up calling instrumented
> __pfn_valid() from arch/x86/mm/physaddr.c.
>
> Prevent it by disabling instrumentation of the whole file.

This does seem rather ad-hoc. It's the same basic reason we have
"noinstr": code instrumentation infrastructure uses generally can't be
instrumented itself.

How hard would it be to make sure that kmsan_get_metadata() and friends
don't call any symbols that were compiled with -fsanitize=kernel-memory?

I do also think I'd much rather see __no_kmsan_checks on the functions
than doing whole files. I *guarantee* if code gets moved around that
whoever does it will miss the KMSAN_SANITIZE_physaddr.o in the makefile.

Boqun Feng

unread,
Jun 25, 2024, 2:52:04 PM (7 days ago) Jun 25
to Dave Hansen, Alexander Potapenko, el...@google.com, dvy...@google.com, dave....@linux.intel.com, pet...@infradead.org, ak...@linux-foundation.org, x...@kernel.org, linux-...@vger.kernel.org, kasa...@googlegroups.com, Kirill A . Shutemov, Ingo Molnar, Will Deacon, Waiman Long, Paul E. McKenney
The only locks used in pfn_valid() are rcu_read_lock_sched(), right? If
so, could you try (don't tell Paul ;-)) replace rcu_read_lock_sched()
with preempt_disable() and rcu_read_unlock_sched() with
preempt_enable()? That would avoid calling into lockdep. If that works
for KMSAN, we can either have a special rcu_read_lock_sched() or call
lockdep_recursion_inc() in instrumented pfn_valid() to disable lockdep
temporarily.

[Cc Paul]

Regards,
Boqun

Paul E. McKenney

unread,
Jun 25, 2024, 3:06:57 PM (7 days ago) Jun 25
to Boqun Feng, Dave Hansen, Alexander Potapenko, el...@google.com, dvy...@google.com, dave....@linux.intel.com, pet...@infradead.org, ak...@linux-foundation.org, x...@kernel.org, linux-...@vger.kernel.org, kasa...@googlegroups.com, Kirill A . Shutemov, Ingo Molnar, Will Deacon, Waiman Long
Don't tell me what? ;-)

An alternative is to use rcu_read_lock_sched_notrace() and
rcu_read_unlock_sched_notrace(). If you really want to use
preempt_disable() and preempt_enable() instead, you will likely want
the _notrace() variants.

Thanx, Paul

Boqun Feng

unread,
Jun 25, 2024, 3:38:19 PM (7 days ago) Jun 25
to Paul E. McKenney, Dave Hansen, Alexander Potapenko, el...@google.com, dvy...@google.com, dave....@linux.intel.com, pet...@infradead.org, ak...@linux-foundation.org, x...@kernel.org, linux-...@vger.kernel.org, kasa...@googlegroups.com, Kirill A . Shutemov, Ingo Molnar, Will Deacon, Waiman Long
Turn out that telling you is a good idea ;-)

> An alternative is to use rcu_read_lock_sched_notrace() and
> rcu_read_unlock_sched_notrace(). If you really want to use

Yes, I think this is better than what I proposed.

Regards,
Boqun

Alexander Potapenko

unread,
Jun 26, 2024, 4:36:06 AM (7 days ago) Jun 26
to Boqun Feng, Paul E. McKenney, Dave Hansen, el...@google.com, dvy...@google.com, dave....@linux.intel.com, pet...@infradead.org, ak...@linux-foundation.org, x...@kernel.org, linux-...@vger.kernel.org, kasa...@googlegroups.com, Kirill A . Shutemov, Ingo Molnar, Will Deacon, Waiman Long
Thanks for your comments!
Yes, that's what I was actually looking into after Dave's answer on
the other thread
(https://groups.google.com/g/kasan-dev/c/ZBiGzZL36-I/m/WtNuKqP9EQAJ)
I'll still need to rework the code calling virt_to_page() to avoid
deadlocks from there though.

Dave Hansen

unread,
Jul 1, 2024, 10:43:25 AM (2 days ago) Jul 1
to Alexander Potapenko, el...@google.com, dvy...@google.com, dave....@linux.intel.com, pet...@infradead.org, ak...@linux-foundation.org, x...@kernel.org, linux-...@vger.kernel.org, kasa...@googlegroups.com, Kirill A . Shutemov
On 6/21/24 02:49, Alexander Potapenko wrote:
> config LOCK_DEBUGGING_SUPPORT
> bool
> - depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
> + depends on TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT && !KMSAN
> default y

This kinda stinks. It ends up doubling the amount of work that ks
Reply all
Reply to author
Forward
0 new messages