Re: [PATCH] powerpc/kasan/book3s_64: warn when running with hash MMU

1 view
Skip to first unread message

Christophe Leroy

unread,
Oct 6, 2022, 1:04:55 AM10/6/22
to Michael Ellerman, Nathan Lynch, linuxp...@lists.ozlabs.org, kasan-dev
+ KASAN list

Le 06/10/2022 à 06:10, Michael Ellerman a écrit :
> Nathan Lynch <nat...@linux.ibm.com> writes:
>> kasan is known to crash at boot on book3s_64 with non-radix MMU. As
>> noted in commit 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only
>> KASAN support"):
>>
>> A kernel with CONFIG_KASAN=y will crash during boot on a machine
>> using HPT translation because not all the entry points to the
>> generic KASAN code are protected with a call to kasan_arch_is_ready().
>
> I guess I thought there was some plan to fix that.

I was thinking the same.

Do we have a list of the said entry points to the generic code that are
lacking a call to kasan_arch_is_ready() ?

Typically, the BUG dump below shows that kasan_byte_accessible() is
lacking the check. It should be straight forward to add
kasan_arch_is_ready() check to kasan_byte_accessible(), shouldn't it ?

>
> But maybe I'm misremembering. Looking now it's not entirely straight
> forward with the way the headers are structured. So I guess I'm wrong
> about that.
>
>> Such crashes look like this:
>>
>> BUG: Unable to handle kernel data access at 0xc00e00000308b100
>> Faulting instruction address: 0xc0000000006d0fcc
>> Oops: Kernel access of bad area, sig: 11 [#1]
>> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc5-02183-g3ab165dea2a2 #13
>> [...regs...]
>> NIP [c0000000006d0fcc] kasan_byte_accessible+0xc/0x20
>> LR [c0000000006cd9cc] __kasan_check_byte+0x2c/0xa0
>> Call Trace:
...
>>
>> Change init_book3s_64.c::kasan_init() to emit a warning backtrace and
>> taint the kernel when not running on radix. When the kernel likely
>> oopses later, the 'W' taint flag in the report should help minimize
>> developer time spent trying to understand what's gone wrong.
>
> Should we just panic() directly?

But then you loose any sight that the problem is in
kasan_byte_accessible() and have to be fixed there.

Christophe

Michael Ellerman

unread,
Oct 7, 2022, 6:41:25 AM10/7/22
to Christophe Leroy, Nathan Lynch, linuxp...@lists.ozlabs.org, kasan-dev
Christophe Leroy <christop...@csgroup.eu> writes:
> + KASAN list
>
> Le 06/10/2022 à 06:10, Michael Ellerman a écrit :
>> Nathan Lynch <nat...@linux.ibm.com> writes:
>>> kasan is known to crash at boot on book3s_64 with non-radix MMU. As
>>> noted in commit 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only
>>> KASAN support"):
>>>
>>> A kernel with CONFIG_KASAN=y will crash during boot on a machine
>>> using HPT translation because not all the entry points to the
>>> generic KASAN code are protected with a call to kasan_arch_is_ready().
>>
>> I guess I thought there was some plan to fix that.
>
> I was thinking the same.
>
> Do we have a list of the said entry points to the generic code that are
> lacking a call to kasan_arch_is_ready() ?
>
> Typically, the BUG dump below shows that kasan_byte_accessible() is
> lacking the check. It should be straight forward to add
> kasan_arch_is_ready() check to kasan_byte_accessible(), shouldn't it ?

Yes :)

And one other spot, but the patch below boots OK for me. I'll leave it
running for a while just in case there's a path I've missed.

cheers


diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 69f583855c8b..5def0118f2cd 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -377,6 +377,9 @@ bool __kasan_slab_free(struct kmem_cache *cache, void *object,

static inline bool ____kasan_kfree_large(void *ptr, unsigned long ip)
{
+ if (!kasan_arch_is_ready())
+ return false;
+
if (ptr != page_address(virt_to_head_page(ptr))) {
kasan_report_invalid_free(ptr, ip, KASAN_REPORT_INVALID_FREE);
return true;
diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
index 437fcc7e77cf..017d3c69e3b3 100644
--- a/mm/kasan/generic.c
+++ b/mm/kasan/generic.c
@@ -191,7 +191,12 @@ bool kasan_check_range(unsigned long addr, size_t size, bool write,

bool kasan_byte_accessible(const void *addr)
{
- s8 shadow_byte = READ_ONCE(*(s8 *)kasan_mem_to_shadow(addr));
+ s8 shadow_byte;
+
+ if (!kasan_arch_is_ready())
+ return true;
+
+ shadow_byte = READ_ONCE(*(s8 *)kasan_mem_to_shadow(addr));

return shadow_byte >= 0 && shadow_byte < KASAN_GRANULE_SIZE;
}

Nathan Lynch

unread,
Oct 10, 2022, 10:10:46 AM10/10/22
to Michael Ellerman, Christophe Leroy, linuxp...@lists.ozlabs.org, kasan-dev
Michael Ellerman <m...@ellerman.id.au> writes:
> Christophe Leroy <christop...@csgroup.eu> writes:
>> + KASAN list
>>
>> Le 06/10/2022 à 06:10, Michael Ellerman a écrit :
>>> Nathan Lynch <nat...@linux.ibm.com> writes:
>>>> kasan is known to crash at boot on book3s_64 with non-radix MMU. As
>>>> noted in commit 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only
>>>> KASAN support"):
>>>>
>>>> A kernel with CONFIG_KASAN=y will crash during boot on a machine
>>>> using HPT translation because not all the entry points to the
>>>> generic KASAN code are protected with a call to kasan_arch_is_ready().
>>>
>>> I guess I thought there was some plan to fix that.
>>
>> I was thinking the same.
>>
>> Do we have a list of the said entry points to the generic code that are
>> lacking a call to kasan_arch_is_ready() ?
>>
>> Typically, the BUG dump below shows that kasan_byte_accessible() is
>> lacking the check. It should be straight forward to add
>> kasan_arch_is_ready() check to kasan_byte_accessible(), shouldn't it ?
>
> Yes :)
>
> And one other spot, but the patch below boots OK for me. I'll leave it
> running for a while just in case there's a path I've missed.

It works for me too, thanks (p8 pseries qemu).

This avoids the boot-time oops, but kasan remains unimplemented for hash
mmu. Raising the question: with the trivial crashes addressed, is the
current message ('KASAN not enabled as it requires radix!') sufficient
to notify developers (such as me, a week ago) who mean to use kasan on a
book3s platform, unaware that it's radix-only? Would a WARN or something
more prominent still be justified?

I guess people will figure it out as soon as they think to search the
kernel log for 'KASAN'...

Christophe Leroy

unread,
Oct 10, 2022, 1:03:52 PM10/10/22
to Nathan Lynch, Michael Ellerman, linuxp...@lists.ozlabs.org, kasan-dev
I don't think the big hammer WARN would be justified.
WARN is supposed to be used only with unexpected conditions.

KASAN not working with hash-MMU is expected. A pr_warn() should be enough.

Someone who has a kernel with KASAN built in but who is not interested
by KASAN and who is booting it one a HASH-MMU will be terrified by a WARN.

Michael Ellerman

unread,
Oct 11, 2022, 6:00:31 AM10/11/22
to Nathan Lynch, Christophe Leroy, linuxp...@lists.ozlabs.org, kasan-dev
Nathan Lynch <nat...@linux.ibm.com> writes:
> Michael Ellerman <m...@ellerman.id.au> writes:
>> Christophe Leroy <christop...@csgroup.eu> writes:
>>> + KASAN list
>>>
>>> Le 06/10/2022 à 06:10, Michael Ellerman a écrit :
>>>> Nathan Lynch <nat...@linux.ibm.com> writes:
>>>>> kasan is known to crash at boot on book3s_64 with non-radix MMU. As
>>>>> noted in commit 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only
>>>>> KASAN support"):
>>>>>
>>>>> A kernel with CONFIG_KASAN=y will crash during boot on a machine
>>>>> using HPT translation because not all the entry points to the
>>>>> generic KASAN code are protected with a call to kasan_arch_is_ready().
>>>>
>>>> I guess I thought there was some plan to fix that.
>>>
>>> I was thinking the same.
>>>
>>> Do we have a list of the said entry points to the generic code that are
>>> lacking a call to kasan_arch_is_ready() ?
>>>
>>> Typically, the BUG dump below shows that kasan_byte_accessible() is
>>> lacking the check. It should be straight forward to add
>>> kasan_arch_is_ready() check to kasan_byte_accessible(), shouldn't it ?
>>
>> Yes :)
>>
>> And one other spot, but the patch below boots OK for me. I'll leave it
>> running for a while just in case there's a path I've missed.
>
> It works for me too, thanks (p8 pseries qemu).

It works but I still see the kasan shadow getting mapped, which we would
ideally avoid.

From PTDUMP:

---[ kasan shadow mem start ]---
0xc00f000000000000-0xc00f00000006ffff 0x00000000045e0000 448K r w pte valid present dirty accessed
0xc00f3ffffffe0000-0xc00f3fffffffffff 0x0000000004d80000 128K r w pte valid present dirty accessed

I haven't worked out how those are getting mapped.

> This avoids the boot-time oops, but kasan remains unimplemented for hash
> mmu. Raising the question: with the trivial crashes addressed, is the
> current message ('KASAN not enabled as it requires radix!') sufficient
> to notify developers (such as me, a week ago) who mean to use kasan on a
> book3s platform, unaware that it's radix-only? Would a WARN or something
> more prominent still be justified?
>
> I guess people will figure it out as soon as they think to search the
> kernel log for 'KASAN'...

Yeah, I think a warning is a bit too strong. I think that's more likely
to lead to bug reports than anything :)

cheers

Christophe Leroy

unread,
Oct 11, 2022, 6:25:07 AM10/11/22
to Michael Ellerman, Nathan Lynch, linuxp...@lists.ozlabs.org, kasan-dev
kasan_populate_vmalloc() maybe ?

Christophe

Christophe Leroy

unread,
Jan 26, 2023, 2:11:59 AM1/26/23
to Michael Ellerman, Nathan Lynch, linuxp...@lists.ozlabs.org, kasan-dev
Reply all
Reply to author
Forward
0 new messages