Re: [PATCH v3 3/3] locking/lockdep: Disable KASAN instrumentation of lockdep.c

Boqun Feng

Feb 12, 2025, 12:57:42 AM
to Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, linux-...@vger.kernel.org, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, kasa...@googlegroups.com
[Cc KASAN]

A Reviewed-by or Acked-by from KASAN would be nice, thanks!

Regards,
Boqun

On Sun, Feb 09, 2025 at 11:26:12PM -0500, Waiman Long wrote:
> Both KASAN and LOCKDEP are commonly enabled in building a debug kernel.
> Each of them can significantly slow down the speed of a debug kernel.
> Enabling KASAN instrumentation of the LOCKDEP code will further slow
> things down.
>
> Since LOCKDEP is a high overhead debugging tool, it will never get
> enabled in a production kernel. The LOCKDEP code is also pretty mature
> and is unlikely to get major changes. There is also a possibility of
> recursion similar to KCSAN.
>
> To evaluate the performance impact of disabling KASAN instrumentation
> of lockdep.c, the time to do a parallel build of the Linux defconfig
> kernel was used as the benchmark. Two x86-64 systems (Skylake & Zen 2)
> and an arm64 system were used as test beds. Two sets of non-RT and RT
> kernels with similar configurations except mainly CONFIG_PREEMPT_RT
> were used for evaluation.
>
> For the Skylake system:
>
> Kernel                        Run time          Sys time
> ------                        --------          --------
> Non-debug kernel (baseline)   0m47.642s         4m19.811s
> Debug kernel                  2m11.108s (x2.8)  38m20.467s (x8.9)
> Debug kernel (patched)        1m49.602s (x2.3)  31m28.501s (x7.3)
> Debug kernel
> (patched + mitigations=off)   1m30.988s (x1.9)  26m41.993s (x6.2)
>
> RT kernel (baseline)          0m54.871s         7m15.340s
> RT debug kernel               6m07.151s (x6.7)  135m47.428s (x18.7)
> RT debug kernel (patched)     3m42.434s (x4.1)  74m51.636s (x10.3)
> RT debug kernel
> (patched + mitigations=off)   2m40.383s (x2.9)  57m54.369s (x8.0)
>
> For the Zen 2 system:
>
> Kernel                        Run time           Sys time
> ------                        --------           --------
> Non-debug kernel (baseline)   1m42.806s          39m48.714s
> Debug kernel                  4m04.524s (x2.4)   125m35.904s (x3.2)
> Debug kernel (patched)        3m56.241s (x2.3)   127m22.378s (x3.2)
> Debug kernel
> (patched + mitigations=off)   2m38.157s (x1.5)   92m35.680s (x2.3)
>
> RT kernel (baseline)          1m51.500s          14m56.322s
> RT debug kernel               16m04.962s (x8.7)  244m36.463s (x16.4)
> RT debug kernel (patched)     9m09.073s (x4.9)   129m28.439s (x8.7)
> RT debug kernel
> (patched + mitigations=off)   3m31.662s (x1.9)   51m01.391s (x3.4)
>
> For the arm64 system:
>
> Kernel                        Run time           Sys time
> ------                        --------           --------
> Non-debug kernel (baseline)   1m56.844s          8m47.150s
> Debug kernel                  3m54.774s (x2.0)   92m30.098s (x10.5)
> Debug kernel (patched)        3m32.429s (x1.8)   77m40.779s (x8.8)
>
> RT kernel (baseline)          4m01.641s          18m16.777s
> RT debug kernel               19m32.977s (x4.9)  304m23.965s (x16.7)
> RT debug kernel (patched)     16m28.354s (x4.1)  234m18.149s (x12.8)
>
> Turning the mitigations off doesn't seem to have any noticeable impact
> on the performance of the arm64 system, so the mitigations=off entries
> aren't included.
>
> For the x86 CPUs, CPU mitigations have a much bigger impact on
> performance, especially for the RT debug kernel. The SRSO mitigation
> on Zen 2 has an especially big impact on the debug kernel and accounts
> for the majority of the slowdown with mitigations on, because the
> patched ret instruction slows down function returns. A lot of helper
> functions that are normally compiled out or inlined may become real
> function calls in the debug kernel. The KASAN instrumentation inserts
> a lot of __asan_loadX*() and __kasan_check_read() function calls into
> the memory access portions of the code. The lockdep __lock_acquire()
> function, for instance, has 66 __asan_loadX*() and 6 __kasan_check_read()
> calls added by KASAN instrumentation. Of course, the actual numbers may
> vary depending on the compiler used and the exact version of the lockdep
> code.
>
> With the newly added rtmutex and lockdep lock events, the relevant
> event counts for the test runs with the Skylake system were:
>
> Event type          Debug kernel   RT debug kernel
> ----------          ------------   ---------------
> lockdep_acquire    1,968,663,277     5,425,313,953
> rtlock_slowlock                -       401,701,156
> rtmutex_slowlock               -           139,672
>
> The __lock_acquire() calls in the RT debug kernel are 2.8 times those
> of the non-RT debug kernel with the same workload. Since the
> __lock_acquire() function is a big hitter in terms of performance
> slowdown, this makes the RT debug kernel much slower than the non-RT
> one. The average lock nesting depth is likely to be higher in the RT
> debug kernel too, leading to longer execution time in the
> __lock_acquire() function.
>
> As the small advantage of enabling KASAN instrumentation to catch
> potential memory access errors in the lockdep debugging tool is probably
> not worth the drawback of further slowing down a debug kernel, disable
> KASAN instrumentation in the lockdep code to allow the debug kernels
> to regain some performance, especially the RT debug kernels.
>
> Signed-off-by: Waiman Long <lon...@redhat.com>
> ---
> kernel/locking/Makefile | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
> index 0db4093d17b8..a114949eeed5 100644
> --- a/kernel/locking/Makefile
> +++ b/kernel/locking/Makefile
> @@ -5,7 +5,8 @@ KCOV_INSTRUMENT := n
>
> obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o
>
> -# Avoid recursion lockdep -> sanitizer -> ... -> lockdep.
> +# Avoid recursion lockdep -> sanitizer -> ... -> lockdep & improve performance.
> +KASAN_SANITIZE_lockdep.o := n
> KCSAN_SANITIZE_lockdep.o := n
>
> ifdef CONFIG_FUNCTION_TRACER
> --
> 2.48.1
>

Marco Elver

Feb 12, 2025, 6:31:19 AM
to Boqun Feng, Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, linux-...@vger.kernel.org, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, kasa...@googlegroups.com
For completeness' sake, we'd also have to compare with
CONFIG_KASAN_INLINE=y, which gets rid of the __asan_ calls (but not the
explicit __kasan_ checks). But I leave it up to you - I'm aware it
results in slow-downs, too. ;-)

> > With the newly added rtmutex and lockdep lock events, the relevant
> > event counts for the test runs with the Skylake system were:
> >
> > Event type          Debug kernel   RT debug kernel
> > ----------          ------------   ---------------
> > lockdep_acquire    1,968,663,277     5,425,313,953
> > rtlock_slowlock                -       401,701,156
> > rtmutex_slowlock               -           139,672
> >
> > The __lock_acquire() calls in the RT debug kernel are 2.8 times those
> > of the non-RT debug kernel with the same workload. Since the
> > __lock_acquire() function is a big hitter in terms of performance
> > slowdown, this makes the RT debug kernel much slower than the non-RT
> > one. The average lock nesting depth is likely to be higher in the RT
> > debug kernel too, leading to longer execution time in the
> > __lock_acquire() function.
> >
> > As the small advantage of enabling KASAN instrumentation to catch
> > potential memory access errors in the lockdep debugging tool is probably
> > not worth the drawback of further slowing down a debug kernel, disable
> > KASAN instrumentation in the lockdep code to allow the debug kernels
> > to regain some performance, especially the RT debug kernels.

It's not about catching a bug in the lockdep code, but rather guarding
against bugs in code that allocated the storage for some synchronization
object. Since lockdep state is embedded in each synchronization object,
the lockdep checking code may be passed a reference to garbage data, e.g.
on use-after-free (or even out-of-bounds if there's an array of sync
objects). In that case, all bets are off and lockdep may produce random
false reports. Sure, the system is already in a bad state at that point,
but it's going to make debugging much harder.

Our approach has always been to ensure that as soon as an error state
is detected, it is reported, before it results in random failures as
execution continues (e.g. bad lock reports).

To guard against that, I would propose adding carefully placed
kasan_check_byte() calls in the lockdep code.

Waiman Long

Feb 12, 2025, 9:20:18 AM
to Marco Elver, Boqun Feng, Peter Zijlstra, Ingo Molnar, Will Deacon, linux-...@vger.kernel.org, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, kasa...@googlegroups.com
I see. I didn't realize that there was such a Kconfig option. Will try it out to see how it works.

With CONFIG_LOCKDEP on, the lock_acquire() function is usually the first call before a lock is acquired, so it is likely the one that reports these memory bugs. However, the lock itself will eventually be accessed, and KASAN instrumentation there should be able to catch the same problem.

> Our approach has always been to ensure that as soon as an error state
> is detected, it is reported, before it results in random failures as
> execution continues (e.g. bad lock reports).
>
> To guard against that, I would propose adding carefully placed
> kasan_check_byte() calls in the lockdep code.

OK, will look into that.

Thanks,
Longman

Waiman Long

Feb 12, 2025, 11:57:35 AM
to Marco Elver, Boqun Feng, Peter Zijlstra, Ingo Molnar, Will Deacon, linux-...@vger.kernel.org, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, kasa...@googlegroups.com
On 2/12/25 6:30 AM, Marco Elver wrote:
I just realized that my config file for the non-RT debug kernel does have
CONFIG_KASAN_INLINE=y set, though the RT debug kernel does not. For the
non-RT debug kernel, the __asan_report_load* functions are still being
called because lockdep.c is very big (> 6k lines of code). So
"call_threshold := 10000" in scripts/Makefile.kasan is probably not
enough for lockdep.c.
Will take a look at that.

Cheers,
Longman

Waiman Long

Feb 13, 2025, 3:02:46 PM
to Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, Marco Elver, linux-...@vger.kernel.org, kasa...@googlegroups.com, Waiman Long
v3:
- Add another patch to insert lock events into lockdep.c.
- Rerun all the tests with the simpler defconfig kernel build and do
  further analysis of the performance difference between the RT and
  non-RT debug kernels.

v4:
- Update test results in patch 3 after incorporating CONFIG_KASAN_INLINE
into the test matrix.
- Add patch 4 to call kasan_check_byte() in lock_acquire.

It is found that disabling KASAN instrumentation when compiling
lockdep.c can significantly improve the performance of an RT debug
kernel, while the performance benefit for a non-RT debug kernel is
relatively modest.

This series also includes patches that add locking events to the rtmutex
slow paths and the lockdep code for better analysis of the different
performance behavior between the RT and non-RT debug kernels.

Waiman Long (4):
locking/lock_events: Add locking events for rtmutex slow paths
locking/lock_events: Add locking events for lockdep
locking/lockdep: Disable KASAN instrumentation of lockdep.c
locking/lockdep: Add kasan_check_byte() check in lock_acquire()

kernel/locking/Makefile | 3 ++-
kernel/locking/lock_events_list.h | 29 +++++++++++++++++++++++++++++
kernel/locking/lockdep.c | 22 +++++++++++++++++++++-
kernel/locking/rtmutex.c | 29 ++++++++++++++++++++++++-----
4 files changed, 76 insertions(+), 7 deletions(-)

--
2.48.1

Waiman Long

Feb 13, 2025, 3:02:50 PM
to Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, Marco Elver, linux-...@vger.kernel.org, kasa...@googlegroups.com, Waiman Long
Add locking events for rtlock_slowlock() and rt_mutex_slowlock() for
profiling the slow path behavior of rt_spin_lock() and rt_mutex_lock().

Signed-off-by: Waiman Long <lon...@redhat.com>
---
kernel/locking/lock_events_list.h | 21 +++++++++++++++++++++
kernel/locking/rtmutex.c | 29 ++++++++++++++++++++++++-----
2 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/lock_events_list.h b/kernel/locking/lock_events_list.h
index 97fb6f3f840a..80b11f194c9f 100644
--- a/kernel/locking/lock_events_list.h
+++ b/kernel/locking/lock_events_list.h
@@ -67,3 +67,24 @@ LOCK_EVENT(rwsem_rlock_handoff) /* # of read lock handoffs */
LOCK_EVENT(rwsem_wlock) /* # of write locks acquired */
LOCK_EVENT(rwsem_wlock_fail) /* # of failed write lock acquisitions */
LOCK_EVENT(rwsem_wlock_handoff) /* # of write lock handoffs */
+
+/*
+ * Locking events for rtlock_slowlock()
+ */
+LOCK_EVENT(rtlock_slowlock) /* # of rtlock_slowlock() calls */
+LOCK_EVENT(rtlock_slow_acq1) /* # of locks acquired after wait_lock */
+LOCK_EVENT(rtlock_slow_acq2) /* # of locks acquired in for loop */
+LOCK_EVENT(rtlock_slow_sleep) /* # of sleeps */
+LOCK_EVENT(rtlock_slow_wake) /* # of wakeup's */
+
+/*
+ * Locking events for rt_mutex_slowlock()
+ */
+LOCK_EVENT(rtmutex_slowlock) /* # of rt_mutex_slowlock() calls */
+LOCK_EVENT(rtmutex_slow_block) /* # of rt_mutex_slowlock_block() calls */
+LOCK_EVENT(rtmutex_slow_acq1) /* # of locks acquired after wait_lock */
+LOCK_EVENT(rtmutex_slow_acq2) /* # of locks acquired at the end */
+LOCK_EVENT(rtmutex_slow_acq3) /* # of locks acquired in *block() */
+LOCK_EVENT(rtmutex_slow_sleep) /* # of sleeps */
+LOCK_EVENT(rtmutex_slow_wake) /* # of wakeup's */
+LOCK_EVENT(rtmutex_deadlock) /* # of rt_mutex_handle_deadlock()'s */
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 4a8df1800cbb..c80902eacd79 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -27,6 +27,7 @@
#include <trace/events/lock.h>

#include "rtmutex_common.h"
+#include "lock_events.h"

#ifndef WW_RT
# define build_ww_mutex() (false)
@@ -1612,10 +1613,13 @@ static int __sched rt_mutex_slowlock_block(struct rt_mutex_base *lock,
struct task_struct *owner;
int ret = 0;

+ lockevent_inc(rtmutex_slow_block);
for (;;) {
/* Try to acquire the lock: */
- if (try_to_take_rt_mutex(lock, current, waiter))
+ if (try_to_take_rt_mutex(lock, current, waiter)) {
+ lockevent_inc(rtmutex_slow_acq3);
break;
+ }

if (timeout && !timeout->task) {
ret = -ETIMEDOUT;
@@ -1638,8 +1642,10 @@ static int __sched rt_mutex_slowlock_block(struct rt_mutex_base *lock,
owner = NULL;
raw_spin_unlock_irq_wake(&lock->wait_lock, wake_q);

- if (!owner || !rtmutex_spin_on_owner(lock, waiter, owner))
+ if (!owner || !rtmutex_spin_on_owner(lock, waiter, owner)) {
+ lockevent_inc(rtmutex_slow_sleep);
rt_mutex_schedule();
+ }

raw_spin_lock_irq(&lock->wait_lock);
set_current_state(state);
@@ -1694,6 +1700,7 @@ static int __sched __rt_mutex_slowlock(struct rt_mutex_base *lock,
int ret;

lockdep_assert_held(&lock->wait_lock);
+ lockevent_inc(rtmutex_slowlock);

/* Try to acquire the lock again: */
if (try_to_take_rt_mutex(lock, current, NULL)) {
@@ -1701,6 +1708,7 @@ static int __sched __rt_mutex_slowlock(struct rt_mutex_base *lock,
__ww_mutex_check_waiters(rtm, ww_ctx, wake_q);
ww_mutex_lock_acquired(ww, ww_ctx);
}
+ lockevent_inc(rtmutex_slow_acq1);
return 0;
}

@@ -1719,10 +1727,12 @@ static int __sched __rt_mutex_slowlock(struct rt_mutex_base *lock,
__ww_mutex_check_waiters(rtm, ww_ctx, wake_q);
ww_mutex_lock_acquired(ww, ww_ctx);
}
+ lockevent_inc(rtmutex_slow_acq2);
} else {
__set_current_state(TASK_RUNNING);
remove_waiter(lock, waiter);
rt_mutex_handle_deadlock(ret, chwalk, lock, waiter);
+ lockevent_inc(rtmutex_deadlock);
}

/*
@@ -1751,6 +1761,7 @@ static inline int __rt_mutex_slowlock_locked(struct rt_mutex_base *lock,
&waiter, wake_q);

debug_rt_mutex_free_waiter(&waiter);
+ lockevent_cond_inc(rtmutex_slow_wake, !wake_q_empty(wake_q));
return ret;
}

@@ -1823,9 +1834,12 @@ static void __sched rtlock_slowlock_locked(struct rt_mutex_base *lock,
struct task_struct *owner;

lockdep_assert_held(&lock->wait_lock);
+ lockevent_inc(rtlock_slowlock);

- if (try_to_take_rt_mutex(lock, current, NULL))
+ if (try_to_take_rt_mutex(lock, current, NULL)) {
+ lockevent_inc(rtlock_slow_acq1);
return;
+ }

rt_mutex_init_rtlock_waiter(&waiter);

@@ -1838,8 +1852,10 @@ static void __sched rtlock_slowlock_locked(struct rt_mutex_base *lock,

for (;;) {
/* Try to acquire the lock again */
- if (try_to_take_rt_mutex(lock, current, &waiter))
+ if (try_to_take_rt_mutex(lock, current, &waiter)) {
+ lockevent_inc(rtlock_slow_acq2);
break;
+ }

if (&waiter == rt_mutex_top_waiter(lock))
owner = rt_mutex_owner(lock);
@@ -1847,8 +1863,10 @@ static void __sched rtlock_slowlock_locked(struct rt_mutex_base *lock,
owner = NULL;
raw_spin_unlock_irq_wake(&lock->wait_lock, wake_q);

- if (!owner || !rtmutex_spin_on_owner(lock, &waiter, owner))
+ if (!owner || !rtmutex_spin_on_owner(lock, &waiter, owner)) {
+ lockevent_inc(rtlock_slow_sleep);
schedule_rtlock();
+ }

raw_spin_lock_irq(&lock->wait_lock);
set_current_state(TASK_RTLOCK_WAIT);
@@ -1865,6 +1883,7 @@ static void __sched rtlock_slowlock_locked(struct rt_mutex_base *lock,
debug_rt_mutex_free_waiter(&waiter);

trace_contention_end(lock, 0);
+ lockevent_cond_inc(rtlock_slow_wake, !wake_q_empty(wake_q));
}

static __always_inline void __sched rtlock_slowlock(struct rt_mutex_base *lock)
--
2.48.1

Waiman Long

Feb 13, 2025, 3:02:53 PM
to Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, Marco Elver, linux-...@vger.kernel.org, kasa...@googlegroups.com, Waiman Long
Add some lock events to the lockdep for profiling its behavior.

Signed-off-by: Waiman Long <lon...@redhat.com>
---
kernel/locking/lock_events_list.h | 7 +++++++
kernel/locking/lockdep.c | 8 +++++++-
2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/lock_events_list.h b/kernel/locking/lock_events_list.h
index 80b11f194c9f..9ef9850aeebe 100644
--- a/kernel/locking/lock_events_list.h
+++ b/kernel/locking/lock_events_list.h
@@ -88,3 +88,10 @@ LOCK_EVENT(rtmutex_slow_acq3) /* # of locks acquired in *block() */
LOCK_EVENT(rtmutex_slow_sleep) /* # of sleeps */
LOCK_EVENT(rtmutex_slow_wake) /* # of wakeup's */
LOCK_EVENT(rtmutex_deadlock) /* # of rt_mutex_handle_deadlock()'s */
+
+/*
+ * Locking events for lockdep
+ */
+LOCK_EVENT(lockdep_acquire)
+LOCK_EVENT(lockdep_lock)
+LOCK_EVENT(lockdep_nocheck)
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 4470680f0226..8436f017c74d 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -61,6 +61,7 @@
#include <asm/sections.h>

#include "lockdep_internals.h"
+#include "lock_events.h"

#include <trace/events/lock.h>

@@ -170,6 +171,7 @@ static struct task_struct *lockdep_selftest_task_struct;
static int graph_lock(void)
{
lockdep_lock();
+ lockevent_inc(lockdep_lock);
/*
* Make sure that if another CPU detected a bug while
* walking the graph we dont change it (while the other
@@ -5091,8 +5093,12 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
if (unlikely(lock->key == &__lockdep_no_track__))
return 0;

- if (!prove_locking || lock->key == &__lockdep_no_validate__)
+ lockevent_inc(lockdep_acquire);
+
+ if (!prove_locking || lock->key == &__lockdep_no_validate__) {
check = 0;
+ lockevent_inc(lockdep_nocheck);
+ }

if (subclass < NR_LOCKDEP_CACHING_CLASSES)
class = lock->class_cache[subclass];
--
2.48.1

Waiman Long

Feb 13, 2025, 3:02:54 PM
to Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, Marco Elver, linux-...@vger.kernel.org, kasa...@googlegroups.com, Waiman Long
Both KASAN and LOCKDEP are commonly enabled in building a debug kernel.
Each of them can significantly slow down the speed of a debug kernel.
Enabling KASAN instrumentation of the LOCKDEP code will further slow
things down.

Since LOCKDEP is a high overhead debugging tool, it will never get
enabled in a production kernel. The LOCKDEP code is also pretty mature
and is unlikely to get major changes. There is also a possibility of
recursion similar to KCSAN.

To evaluate the performance impact of disabling KASAN instrumentation
of lockdep.c, the time to do a parallel build of the Linux defconfig
kernel was used as the benchmark. Two x86-64 systems (Skylake & Zen 2)
and an arm64 system were used as test beds. Two sets of non-RT and RT
kernels with similar configurations except mainly CONFIG_PREEMPT_RT
were used for evaluation.

For the Skylake system:

Kernel                        Run time          Sys time
------                        --------          --------
Non-debug kernel (baseline)   0m47.642s         4m19.811s

[CONFIG_KASAN_INLINE=y]
Debug kernel                  2m11.108s (x2.8)  38m20.467s (x8.9)
Debug kernel (patched)        1m49.602s (x2.3)  31m28.501s (x7.3)
Debug kernel
(patched + mitigations=off)   1m30.988s (x1.9)  26m41.993s (x6.2)

RT kernel (baseline)          0m54.871s         7m15.340s

[CONFIG_KASAN_INLINE=n]
RT debug kernel               6m07.151s (x6.7)  135m47.428s (x18.7)
RT debug kernel (patched)     3m42.434s (x4.1)  74m51.636s (x10.3)
RT debug kernel
(patched + mitigations=off)   2m40.383s (x2.9)  57m54.369s (x8.0)

[CONFIG_KASAN_INLINE=y]
RT debug kernel               3m22.155s (x3.7)  77m53.018s (x10.7)
RT debug kernel (patched)     2m36.700s (x2.9)  54m31.195s (x7.5)
RT debug kernel
(patched + mitigations=off)   2m06.110s (x2.3)  45m49.493s (x6.3)

For the Zen 2 system:

Kernel                        Run time           Sys time
------                        --------           --------
Non-debug kernel (baseline)   1m42.806s          39m48.714s

[CONFIG_KASAN_INLINE=y]
Debug kernel                  4m04.524s (x2.4)   125m35.904s (x3.2)
Debug kernel (patched)        3m56.241s (x2.3)   127m22.378s (x3.2)
Debug kernel
(patched + mitigations=off)   2m38.157s (x1.5)   92m35.680s (x2.3)

RT kernel (baseline)          1m51.500s          14m56.322s

[CONFIG_KASAN_INLINE=n]
RT debug kernel               16m04.962s (x8.7)  244m36.463s (x16.4)
RT debug kernel (patched)     9m09.073s (x4.9)   129m28.439s (x8.7)
RT debug kernel
(patched + mitigations=off)   3m31.662s (x1.9)   51m01.391s (x3.4)

For the arm64 system:

Kernel                        Run time           Sys time
------                        --------           --------
Non-debug kernel (baseline)   1m56.844s          8m47.150s
Debug kernel                  3m54.774s (x2.0)   92m30.098s (x10.5)
Debug kernel (patched)        3m32.429s (x1.8)   77m40.779s (x8.8)

RT kernel (baseline)          4m01.641s          18m16.777s

[CONFIG_KASAN_INLINE=n]
RT debug kernel               19m32.977s (x4.9)  304m23.965s (x16.7)
RT debug kernel (patched)     16m28.354s (x4.1)  234m18.149s (x12.8)

Turning the mitigations off doesn't seem to have any noticeable impact
on the performance of the arm64 system, so the mitigations=off entries
aren't included.

For the x86 CPUs, CPU mitigations have a much bigger
impact on performance, especially for the RT debug kernel with
CONFIG_KASAN_INLINE=n. The SRSO mitigation on Zen 2 has an especially
big impact on the debug kernel and accounts for the majority of the
slowdown with mitigations on, because the patched ret instruction slows
down function returns. A lot of helper functions that are normally
compiled out or inlined may become real function calls in the debug
kernel.

With CONFIG_KASAN_INLINE=n, the KASAN instrumentation inserts a lot
of __asan_loadX*() and __kasan_check_read() function calls into the
memory access portions of the code. The lockdep __lock_acquire()
function, for instance, has 66 __asan_loadX*() and 6 __kasan_check_read()
calls added by KASAN instrumentation. Of course, the actual numbers may
vary depending on the compiler used and the exact version of the lockdep
code.

With the Skylake test system, the reductions in parallel kernel build
time for the RT debug kernel with this patch are:

CONFIG_KASAN_INLINE=n: -37%
CONFIG_KASAN_INLINE=y: -22%

The time reduction is less with CONFIG_KASAN_INLINE=y, but it is still
significant.

Setting CONFIG_KASAN_INLINE=y can result in a significant performance
improvement. The major drawback is a large increase in kernel text size:
the vmlinux text size increases from 45997948 to 67606807 bytes, a 47%
increase (about 21 Mbytes). The size increase of other kernel modules
should be similar.

With the newly added rtmutex and lockdep lock events, the relevant
event counts for the test runs with the Skylake system were:

Event type          Debug kernel   RT debug kernel
----------          ------------   ---------------
lockdep_acquire    1,968,663,277     5,425,313,953
rtlock_slowlock                -       401,701,156
rtmutex_slowlock               -           139,672

The __lock_acquire() calls in the RT debug kernel are 2.8 times those
of the non-RT debug kernel with the same workload. Since the
__lock_acquire() function is a big hitter in terms of performance
slowdown, this makes the RT debug kernel much slower than the non-RT
one. The average lock nesting depth is likely to be higher in the RT
debug kernel too, leading to longer execution time in the
__lock_acquire() function.

As the small advantage of enabling KASAN instrumentation to catch
potential memory access errors in the lockdep debugging tool is probably
not worth the drawback of further slowing down a debug kernel, disable
KASAN instrumentation in the lockdep code to allow the debug kernels
to regain some performance, especially the RT debug kernels.

Signed-off-by: Waiman Long <lon...@redhat.com>
---

Waiman Long

Feb 13, 2025, 3:02:58 PM
to Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, Marco Elver, linux-...@vger.kernel.org, kasa...@googlegroups.com, Waiman Long
KASAN instrumentation of lockdep has been disabled as we don't need
KASAN to check the validity of lockdep internal data structures and
incur unnecessary performance overhead. However, the lockdep_map pointer
passed in externally may not be valid (e.g. use-after-free) and we run
the risk of using garbage data, resulting in false lockdep reports. Add
a kasan_check_byte() call in lock_acquire() for lockdep_map objects that
are not kernel core data to catch an invalid lockdep_map and abort
lockdep processing if the input data isn't valid.

Suggested-by: Marco Elver <el...@google.com>
Signed-off-by: Waiman Long <lon...@redhat.com>
---
kernel/locking/lock_events_list.h | 1 +
kernel/locking/lockdep.c | 14 ++++++++++++++
2 files changed, 15 insertions(+)

diff --git a/kernel/locking/lock_events_list.h b/kernel/locking/lock_events_list.h
index 9ef9850aeebe..bed59b2195c7 100644
--- a/kernel/locking/lock_events_list.h
+++ b/kernel/locking/lock_events_list.h
@@ -95,3 +95,4 @@ LOCK_EVENT(rtmutex_deadlock) /* # of rt_mutex_handle_deadlock()'s */
LOCK_EVENT(lockdep_acquire)
LOCK_EVENT(lockdep_lock)
LOCK_EVENT(lockdep_nocheck)
+LOCK_EVENT(lockdep_kasan_fail)
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 8436f017c74d..98dd0455d4be 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -57,6 +57,7 @@
#include <linux/lockdep.h>
#include <linux/context_tracking.h>
#include <linux/console.h>
+#include <linux/kasan.h>

#include <asm/sections.h>

@@ -5830,6 +5831,19 @@ void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
if (!debug_locks)
return;

+ /*
+ * As KASAN instrumentation is disabled and lock_acquire() is usually
+ * the first lockdep call when a task tries to acquire a lock, add
+ * kasan_check_byte() here to check for use-after-free of non kernel
+ * core lockdep_map data to avoid referencing garbage data.
+ */
+ if (unlikely(IS_ENABLED(CONFIG_KASAN) &&
+ !is_kernel_core_data((unsigned long)lock) &&
+ !kasan_check_byte(lock))) {
+ lockevent_inc(lockdep_kasan_fail);
+ return;
+ }
+
if (unlikely(!lockdep_enabled())) {
/* XXX allow trylock from NMI ?!? */
if (lockdep_nmi() && !trylock) {
--
2.48.1

Waiman Long

Feb 13, 2025, 3:06:22 PM
to Marco Elver, Boqun Feng, Peter Zijlstra, Ingo Molnar, Will Deacon, linux-...@vger.kernel.org, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, kasa...@googlegroups.com
That is not correct. Setting CONFIG_KASAN_INLINE=y does have an effect
in lockdep.c, reducing the number of __asan_* calls. I have posted the
v4 series with the updated test results. I have also added a new patch
to do KASAN checking in lock_acquire().

Cheers,
Longman

Marco Elver

Feb 14, 2025, 5:45:03 AM
to Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, linux-...@vger.kernel.org, kasa...@googlegroups.com
On Thu, 13 Feb 2025 at 21:02, Waiman Long <lon...@redhat.com> wrote:
>
> KASAN instrumentation of lockdep has been disabled as we don't need
> KASAN to check the validity of lockdep internal data structures and
> incur unnecessary performance overhead. However, the lockdep_map pointer
> passed in externally may not be valid (e.g. use-after-free) and we run
> the risk of using garbage data resulting in false lockdep reports. Add
> kasan_check_byte() call in lock_acquire() for non kernel core data
> object to catch invalid lockdep_map and abort lockdep processing if
> input data isn't valid.
>
> Suggested-by: Marco Elver <el...@google.com>
> Signed-off-by: Waiman Long <lon...@redhat.com>

Reviewed-by: Marco Elver <el...@google.com>

but double-check if the below can be simplified.
The IS_ENABLED(CONFIG_KASAN) check is not needed - kasan_check_byte()
will always return true if KASAN is disabled or not compiled in.

> + !is_kernel_core_data((unsigned long)lock) &&

Why use !is_kernel_core_data()? Is it to improve performance?

> + !kasan_check_byte(lock))) {
> + lockevent_inc(lockdep_kasan_fail);
> + return;
> + }
> +
> if (unlikely(!lockdep_enabled())) {
> /* XXX allow trylock from NMI ?!? */
> if (lockdep_nmi() && !trylock) {
> --
> 2.48.1
>

Waiman Long

Feb 14, 2025, 9:09:46 AM
to Marco Elver, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, linux-...@vger.kernel.org, kasa...@googlegroups.com
I added this check because of the is_kernel_core_data() call.
>
>> + !is_kernel_core_data((unsigned long)lock) &&
> Why use !is_kernel_core_data()? Is it to improve performance?

Not exactly. In my testing, just using kasan_check_byte() doesn't quite
work out. It seems to return false positives in some cases, causing
lockdep splats. I didn't look into exactly why this happens, so I added
the is_kernel_core_data() call to work around that.

Cheers,
Longman

Marco Elver

Feb 14, 2025, 9:45:06 AM
to Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, linux-...@vger.kernel.org, kasa...@googlegroups.com
Globals should have their shadow memory unpoisoned by default, so
that's definitely odd.

Out of curiosity, do you have such a false positive splat? Wondering
which data it's accessing. Maybe that'll tell us more about what's
wrong.

Waiman Long

Feb 14, 2025, 10:30:26 AM
to Marco Elver, Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, linux-...@vger.kernel.org, kasa...@googlegroups.com
Will do more investigation about this and let you know the result.

Cheers,
Longman

>

Waiman Long

Feb 14, 2025, 11:19:01 AM
to Marco Elver, Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, linux-...@vger.kernel.org, kasa...@googlegroups.com
The kasan_check_byte() failure happens very early in the boot cycle.
There is no KASAN report, but the API returns false. I inserted a
WARN_ON(1) to dump out the stack.

[    0.000046] ------------[ cut here ]------------
[    0.000047] WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:5817
lock_acquire.part.0+0x22c/0x280
[    0.000057] Modules linked in:
[    0.000062] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted
6.12.0-el10-test+ #15
[    0.000066] Hardware name: HPE ProLiant DL560 Gen10/ProLiant DL560
Gen10, BIOS U34 01/16/2025
[    0.000068] RIP: 0010:lock_acquire.part.0+0x22c/0x280
[    0.000073] Code: 69 d1 04 85 c0 0f 85 fc fe ff ff 65 48 8b 3d 2b d8
c1 75 b9 0a 00 00 00 ba 08 00 00 00 4c 89 ee e8 19 e3 ff ff e9 dd fe ff
ff <0f>
0b 65 48 ff 05 ca 5f c0 75 e9 ce fe ff ff 4c 89 14 24 e8 bc f8
[    0.000076] RSP: 0000:ffffffff8e407c98 EFLAGS: 00010046 ORIG_RAX:
0000000000000000
[    0.000079] RAX: 0000000000000000 RBX: ffffffff8e54fe70 RCX:
0000000000000000
[    0.000081] RDX: 0000000000000000 RSI: 0000000000000001 RDI:
ffffffff8e407c40
[    0.000083] RBP: 0000000000000000 R08: 0000000000000001 R09:
0000000000000000
[    0.000084] R10: ffffffff8a43af29 R11: 00000000002087cc R12:
0000000000000001
[    0.000087] R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000
[    0.000088] FS:  0000000000000000(0000) GS:ffffffff8fb88000(0000)
knlGS:0000000000000000
[    0.000090] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.000093] CR2: ffff888000000413 CR3: 0000001fc96e0000 CR4:
00000000000000f0
[    0.000095] Call Trace:
[    0.000096]  <TASK>
[    0.000101]  ? show_trace_log_lvl+0x1b0/0x2f0
[    0.000105]  ? show_trace_log_lvl+0x1b0/0x2f0
[    0.000119]  ? lock_acquire.part.0+0x22c/0x280
[    0.000124]  ? __warn.cold+0x5b/0xe5
[    0.000133]  ? lock_acquire.part.0+0x22c/0x280
[    0.000138]  ? report_bug+0x1f0/0x390
[    0.000146]  ? early_fixup_exception+0x145/0x230
[    0.000154]  ? early_idt_handler_common+0x2f/0x3a
[    0.000164]  ? request_resource+0x29/0x2b0
[    0.000172]  ? lock_acquire.part.0+0x22c/0x280
[    0.000177]  ? lock_acquire.part.0+0x3f/0x280
[    0.000182]  ? rcu_is_watching+0x15/0xb0
[    0.000187]  ? __pfx___might_resched+0x10/0x10
[    0.000192]  ? lock_acquire+0x120/0x170
[    0.000195]  ? request_resource+0x29/0x2b0
[    0.000201]  ? rt_write_lock+0x7d/0x110
[    0.000208]  ? request_resource+0x29/0x2b0
[    0.000211]  ? request_resource+0x29/0x2b0
[    0.000217]  ? probe_roms+0x150/0x370
[    0.000222]  ? __pfx_probe_roms+0x10/0x10
[    0.000226]  ? __lock_release.isra.0+0x120/0x2c0
[    0.000231]  ? setup_arch+0x92d/0x1180
[    0.000238]  ? setup_arch+0x95c/0x1180
[    0.000243]  ? __pfx_setup_arch+0x10/0x10
[    0.000246]  ? _printk+0xcc/0x102
[    0.000254]  ? __pfx__printk+0x10/0x10
[    0.000259]  ? cgroup_init_early+0x26a/0x290
[    0.000268]  ? cgroup_init_early+0x26a/0x290
[    0.000271]  ? cgroup_init_early+0x1af/0x290
[    0.000279]  ? start_kernel+0x68/0x3b0
[    0.000285]  ? x86_64_start_reservations+0x24/0x30
[    0.000288]  ? x86_64_start_kernel+0x9c/0xa0
[    0.000292]  ? common_startup_64+0x13e/0x141
[    0.000309]  </TASK>
[    0.000311] irq event stamp: 0
[    0.000312] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[    0.000316] hardirqs last disabled at (0): [<0000000000000000>] 0x0
[    0.000318] softirqs last  enabled at (0): [<0000000000000000>] 0x0
[    0.000320] softirqs last disabled at (0): [<0000000000000000>] 0x0
[    0.000322] ---[ end trace 0000000000000000 ]---
[    0.000331] ------------[ cut here ]------------
[    0.000332] WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:5817
lock_acquire.part.0+0x22c/0x280
[    0.000336] Modules linked in:
[    0.000339] CPU: 0 UID: 0 PID: 0 Comm: swapper Tainted: G       
W         -------  ---  6.12.0-el10-test+ #15
[    0.000343] Tainted: [W]=WARN
[    0.000345] Hardware name: HPE ProLiant DL560 Gen10/ProLiant DL560
Gen10, BIOS U34 01/16/2025
[    0.000346] RIP: 0010:lock_acquire.part.0+0x22c/0x280
[    0.000350] Code: 69 d1 04 85 c0 0f 85 fc fe ff ff 65 48 8b 3d 2b d8
c1 75 b9 0a 00 00 00 ba 08 00 00 00 4c 89 ee e8 19 e3 ff ff e9 dd fe ff
ff <0f>
0b 65 48 ff 05 ca 5f c0 75 e9 ce fe ff ff 4c 89 14 24 e8 bc f8
[    0.000352] RSP: 0000:ffffffff8e407c20 EFLAGS: 00010046 ORIG_RAX:
0000000000000000
[    0.000354] RAX: 0000000000000000 RBX: ffffffff8e54fe20 RCX:
0000000000000000
[    0.000356] RDX: 0000000000000000 RSI: 0000000000000001 RDI:
ffffffff8e407bc8
[    0.000357] RBP: 0000000000000000 R08: 0000000000000001 R09:
0000000000000000
[    0.000359] R10: ffffffff8ccf84d2 R11: 00000000002087cc R12:
0000000000000001
[    0.000360] R13: 0000000000000000 R14: 0000000000000000 R15:
0000000000000000
[    0.000362] FS:  0000000000000000(0000) GS:ffffffff8fb88000(0000)
knlGS:0000000000000000
[    0.000364] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.000365] CR2: ffff888000000413 CR3: 0000001fc96e0000 CR4:
00000000000000f0
[    0.000367] Call Trace:
[    0.000368]  <TASK>
[    0.000369]  ? show_trace_log_lvl+0x1b0/0x2f0
[    0.000373]  ? show_trace_log_lvl+0x1b0/0x2f0
[    0.000386]  ? lock_acquire.part.0+0x22c/0x280
[    0.000391]  ? __warn.cold+0x5b/0xe5
[    0.000396]  ? lock_acquire.part.0+0x22c/0x280
[    0.000400]  ? report_bug+0x1f0/0x390
[    0.000407]  ? early_fixup_exception+0x145/0x230
[    0.000412]  ? early_idt_handler_common+0x2f/0x3a
[    0.000419]  ? rwbase_write_lock.constprop.0.isra.0+0x22/0x5f0
[    0.000427]  ? lock_acquire.part.0+0x22c/0x280
[    0.000434]  ? rcu_is_watching+0x15/0xb0
[    0.000438]  ? lock_acquire+0x120/0x170
[    0.000441]  ? rwbase_write_lock.constprop.0.isra.0+0x22/0x5f0
[    0.000448]  ? _raw_spin_lock_irqsave+0x46/0x90
[    0.000451]  ? rwbase_write_lock.constprop.0.isra.0+0x22/0x5f0
[    0.000456]  ? rwbase_write_lock.constprop.0.isra.0+0x22/0x5f0
[    0.000459]  ? lock_acquire+0x120/0x170
[    0.000462]  ? request_resource+0x29/0x2b0
[    0.000468]  ? rt_write_lock+0x85/0x110
[    0.000471]  ? request_resource+0x29/0x2b0
[    0.000475]  ? request_resource+0x29/0x2b0
[    0.000480]  ? probe_roms+0x150/0x370
[    0.000484]  ? __pfx_probe_roms+0x10/0x10
[    0.000488]  ? __lock_release.isra.0+0x120/0x2c0
[    0.000493]  ? setup_arch+0x92d/0x1180
[    0.000500]  ? setup_arch+0x95c/0x1180
[    0.000505]  ? __pfx_setup_arch+0x10/0x10
[    0.000508]  ? _printk+0xcc/0x102
[    0.000513]  ? __pfx__printk+0x10/0x10
[    0.000517]  ? cgroup_init_early+0x26a/0x290
[    0.000525]  ? cgroup_init_early+0x26a/0x290
[    0.000528]  ? cgroup_init_early+0x1af/0x290
[    0.000535]  ? start_kernel+0x68/0x3b0
[    0.000539]  ? x86_64_start_reservations+0x24/0x30
[    0.000543]  ? x86_64_start_kernel+0x9c/0xa0
[    0.000547]  ? common_startup_64+0x13e/0x141
[    0.000561]  </TASK>
[    0.000562] irq event stamp: 0
[    0.000563] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[    0.000565] hardirqs last disabled at (0): [<0000000000000000>] 0x0
[    0.000567] softirqs last  enabled at (0): [<0000000000000000>] 0x0
[    0.000569] softirqs last disabled at (0): [<0000000000000000>] 0x0
[    0.000571] ---[ end trace 0000000000000000 ]---

Cheers,
Longman

Marco Elver

unread,
Feb 14, 2025, 11:44:15 AM2/14/25
to Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, linux-...@vger.kernel.org, kasa...@googlegroups.com
I see - I suspect this is before ctors had a chance to run, which is
the way globals are registered with KASAN.

I think it'd be fair to just remove the lockdep_kasan_fail event,
given KASAN would produce its own report on a real error anyway.

I.e. just do the kasan_check_byte(), and don't bail even if it returns
false. The KASAN report would appear before everything else (incl. a
bad lockdep report due to possible corrupted memory) and I think
that's all we need to be able to debug a real bug.

Waiman Long

unread,
Feb 14, 2025, 12:18:59 PM2/14/25
to Marco Elver, Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, linux-...@vger.kernel.org, kasa...@googlegroups.com
Fair, will update the patch.

Cheers,
Longman


Waiman Long

unread,
Feb 14, 2025, 2:53:13 PM2/14/25
to Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, Marco Elver, linux-...@vger.kernel.org, kasa...@googlegroups.com, Waiman Long
KASAN instrumentation of lockdep has been disabled as we don't need
KASAN to check the validity of lockdep's internal data structures and
incur unnecessary performance overhead. However, the lockdep_map pointer
passed in externally may not be valid (e.g. use-after-free) and we run
the risk of using garbage data, resulting in false lockdep reports.

Add a kasan_check_byte() call in lock_acquire() for non-kernel-core data
objects to catch an invalid lockdep_map and print out a KASAN report
before any lockdep splat.

Suggested-by: Marco Elver <el...@google.com>
Signed-off-by: Waiman Long <lon...@redhat.com>
---
kernel/locking/lockdep.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 8436f017c74d..b15757e63626 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -57,6 +57,7 @@
#include <linux/lockdep.h>
#include <linux/context_tracking.h>
#include <linux/console.h>
+#include <linux/kasan.h>

#include <asm/sections.h>

@@ -5830,6 +5831,14 @@ void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
if (!debug_locks)
return;

+ /*
+ * As KASAN instrumentation is disabled and lock_acquire() is usually
+ * the first lockdep call when a task tries to acquire a lock, add
+ * kasan_check_byte() here to check for use-after-free and other
+ * memory errors.
+ */
+ kasan_check_byte(lock);

Marco Elver

unread,
Feb 17, 2025, 2:00:27 AM2/17/25
to Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, linux-...@vger.kernel.org, kasa...@googlegroups.com
On Fri, 14 Feb 2025 at 20:53, Waiman Long <lon...@redhat.com> wrote:
>
> KASAN instrumentation of lockdep has been disabled as we don't need
> KASAN to check the validity of lockdep internal data structures and
> incur unnecessary performance overhead. However, the lockdep_map pointer
> passed in externally may not be valid (e.g. use-after-free) and we run
> the risk of using garbage data resulting in false lockdep reports.
>
> Add kasan_check_byte() call in lock_acquire() for non kernel core data
> object to catch invalid lockdep_map and print out a KASAN report before
> any lockdep splat, if any.
>
> Suggested-by: Marco Elver <el...@google.com>
> Signed-off-by: Waiman Long <lon...@redhat.com>

Reviewed-by: Marco Elver <el...@google.com>

Marco Elver

unread,
Feb 17, 2025, 2:01:53 AM2/17/25
to Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, linux-...@vger.kernel.org, kasa...@googlegroups.com
On Thu, 13 Feb 2025 at 21:02, Waiman Long <lon...@redhat.com> wrote:
>
Reviewed-by: Marco Elver <el...@google.com>

Andrey Konovalov

unread,
Feb 17, 2025, 11:51:51 AM2/17/25
to Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Vincenzo Frascino, Marco Elver, linux-...@vger.kernel.org, kasa...@googlegroups.com
I wonder if kasan_check_read/write() would be a better fit here. Those
are intended for the compiler-based modes and are no-ops for HW_TAGS.
But I assume lockdep will access this lock variable anyway, so HW_TAGS
will detect memory errors.

On the other hand, detecting a bug earlier is better, so
kasan_check_byte() seems the better choice. And lockdep is not
intended to be fast / used on production anyway, so the extra
instructions added by kasan_check_byte() for HW_TAGS don't matter.

I guess we can change this later, if there's ever a reason to do so.

Reviewed-by: Andrey Konovalov <andre...@gmail.com>

Andrey Konovalov

unread,
Feb 17, 2025, 11:53:21 AM2/17/25
to Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Vincenzo Frascino, Marco Elver, linux-...@vger.kernel.org, kasa...@googlegroups.com
Reviewed-by: Andrey Konovalov <andre...@gmail.com>

Boqun Feng

unread,
Feb 23, 2025, 9:11:46 PM2/23/25
to Waiman Long, Peter Zijlstra, Ingo Molnar, Will Deacon, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, Marco Elver, linux-...@vger.kernel.org, kasa...@googlegroups.com
On Thu, Feb 13, 2025 at 03:02:24PM -0500, Waiman Long wrote:
> v3:
> - Add another patch to insert lock events into lockdep.c.
> - Rerun all the tests with the simpler defconfig kernel build and do
> further analysis of the performance difference between the RT and
> non-RT debug kernels.
>
> v4:
> - Update test results in patch 3 after incorporating CONFIG_KASAN_INLINE
> into the test matrix.
> - Add patch 4 to call kasan_check_byte() in lock_acquire.
>
> It is found that disabling KASAN instrumentation when compiling
> lockdep.c can significantly improve the performance of the RT debug
> kernel, while the performance benefit for the non-RT debug kernel is
> relatively modest.
>
> This series also includes patches that add locking events to the
> rtmutex slow paths and the lockdep code for better analysis of the
> different performance behavior between the RT and non-RT debug
> kernels.
>

Thank you, and thank Marco and Andrey for the reviews. Queued for v6.15.

Regards,
Boqun