KASAN isn't catching rd/wr underflow bugs on static global memory?

Kaiwan N Billimoria

Nov 15, 2021, 11:54:30 PM
to kasa...@googlegroups.com, Chi-Thanh Hoang
Hello all,

I'm facing some issues when testing for read/write underflow ('left OOB') defects via KASAN, and am requesting your help...
Briefly, KASAN does not seem to catch the read/write underflow ('left OOB') on a static global memory buffer.
First off, is this a known limitation?

More details follow, requesting your patience in reading through...

1. Test Env:
x86_64 Ubuntu 20.04 LTS guest VM
Custom 'debug' kernel: ver 5.10.60
CONFIG_KASAN=y
CONFIG_UBSAN=y

Detailed configs:

$ grep KASAN /boot/config-5.10.60-dbg02
CONFIG_KASAN_SHADOW_OFFSET=0xdffffc0000000000
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
CONFIG_KASAN_OUTLINE=y
# CONFIG_KASAN_INLINE is not set
CONFIG_KASAN_STACK=1
CONFIG_KASAN_VMALLOC=y
CONFIG_KASAN_KUNIT_TEST=m
CONFIG_TEST_KASAN_MODULE=m
$

$ grep UBSAN /boot/config-5.10.60-dbg02
CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
CONFIG_UBSAN=y
# CONFIG_UBSAN_TRAP is not set
CONFIG_UBSAN_BOUNDS=y
CONFIG_UBSAN_MISC=y
CONFIG_UBSAN_SANITIZE_ALL=y
# CONFIG_UBSAN_ALIGNMENT is not set
CONFIG_TEST_UBSAN=m
$

2. I've written a module to perform simple test cases:
https://github.com/PacktPublishing/Linux-Kernel-Debugging/tree/main/ch7/kmembugs_test
(the book is in dev :-)...

It provides an interactive way to run various memory-related (and other) test cases; please have a look (and try it!).

Here are the test cases that KASAN does NOT seem to catch:
# 4.3 and 4.4: OOB (out-of-bounds) access on a static global memory buffer.
I'm unsure why...

Here's the relevant code for the testcase (as of now):
https://github.com/PacktPublishing/Linux-Kernel-Debugging/blob/81a2873275bd400fd235dc51cdac352d9d5fb03a/ch7/kmembugs_test/kmembugs_test.c#L185

My testing shows that UBSAN clearly catches both the read and write underflow bugs, but KASAN doesn't. All other defects are correctly caught by KASAN.
(I am having other issues with UBSAN, which we can discuss on another thread :).

3. Also, I tried a similar testcase in userspace w/ ASAN and the global memory underflow bugs were caught... (with both gcc and clang).

Any help is appreciated!
TIA and Regards,
Kaiwan.

Marco Elver

Nov 16, 2021, 6:52:32 AM
to Kaiwan N Billimoria, kasa...@googlegroups.com, Chi-Thanh Hoang
On Tue, 16 Nov 2021 at 05:54, Kaiwan N Billimoria
<kaiwan.b...@gmail.com> wrote:

> I'm facing some issues when testing for read/write underflow ('left OOB') defects via KASAN, and am requesting your help...
> Briefly, KASAN does not seem to catch the read/write underflow ('left OOB') on a static global memory buffer.
> First off, is this a known limitation?

KASAN globals support used to be limited in Clang. This was fixed in
Clang 11. I'm not sure about GCC.

> More details follow, requesting your patience in reading through...
>
> 1. Test Env:
> x86_64 Ubuntu 20.04 LTS guest VM
> Custom 'debug' kernel: ver 5.10.60

Which compiler versions are you using? This is probably the most
important piece to the puzzle.

[...]
> 2. I've written a module to perform simple test cases:
> https://github.com/PacktPublishing/Linux-Kernel-Debugging/tree/main/ch7/kmembugs_test
> (the book is in dev :-)...
>
> It provides an interactive way to run various memory-related (and other) test cases; pl have a look (and try!)
>
> Here's the test cases that KASAN does NOT seem to catch:
> # 4.3 and 4.4 : OOB (Out oF Bounds) access on static global memory buffer.
> I'm unsure why...
>
> Here's the relevant code for the testcase (as of now):
> https://github.com/PacktPublishing/Linux-Kernel-Debugging/blob/81a2873275bd400fd235dc51cdac352d9d5fb03a/ch7/kmembugs_test/kmembugs_test.c#L185

FWIW, the kernel has its own KASAN test suite in lib/test_kasan.c.
There are a few things to not make the compiler optimize away
explicitly buggy code, so I'd also suggest you embed your test in
test_kasan and see if it changes anything (unlikely but worth a shot).

If you are using GCC, can you try again with Clang 11 or 12?

Thanks,
-- Marco

Kaiwan N Billimoria

Nov 16, 2021, 9:17:03 AM
to Marco Elver, kasa...@googlegroups.com, Chi-Thanh Hoang
On Tue, 2021-11-16 at 12:52 +0100, Marco Elver wrote:
>
> KASAN globals support used to be limited in Clang. This was fixed in
> Clang 11. I'm not sure about GCC.
...
> > Which compiler versions are you using? This is probably the most
> important piece to the puzzle.
>
Right! This is the primary issue, I think, thanks!
I am currently using gcc 9.3.0.

So, my Ubuntu system had clang-10; I installed clang-11 on top of it...
(this causes some issues?). Updated the Makefile to use clang-11, and it did build.

But when running these tests, *only* UBSAN was triggered; KASAN stayed silent.
So I then rebuilt the 5.10.60 kernel with UBSAN removed from the config and retried (same module rebuilt w/ clang 11).
This time UBSAN didn't fire, but neither did KASAN! (For the same rd/wr underflow testcases)...
My script + dmesg:
...
(Type in the testcase number to run):
4.4
Running testcase "4.4" via test module now...
[ 371.368096] testcase to run: 4.4
$

This implies the bug went unnoticed.

To show the difference, here's my testcase #4.1- Read (right) overflow on global memory - output:

Running testcase "4.1" via test module now...
[ 1372.401484] testcase to run: 4.1
[ 1372.401515] ==================================================================
[ 1372.402284] BUG: KASAN: global-out-of-bounds in static_mem_oob_right+0xaf/0x160 [test_kmembugs]
[ 1372.402851] Read of size 1 at addr ffffffffc088dfcc by task run_tests/1656

[ 1372.403428] CPU: 2 PID: 1656 Comm: run_tests Tainted: G B O 5.10.60-dbg02 #14
[ 1372.403442] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 1372.403454] Call Trace:
[ 1372.403486] dump_stack+0xbd/0xfa

[... lots more, as expected ...]

So, am puzzled... why isn't KASAN catching the underflow...

A couple of caveats:
1) I had to manually set up a soft link to llvm-objdump (it was installed as llvm-objdump-11)
2) the module build initially failed with
/bin/sh: 1: ld.lld: not found
So I installed the 'lld' package; then the build worked..

Any thoughts?
...

>
> FWIW, the kernel has its own KASAN test suite in lib/test_kasan.c.
> There are a few things to not make the compiler optimize away
> explicitly buggy code, so I'd also suggest you embed your test in
> test_kasan and see if it changes anything (unlikely but worth a shot).
I have studied it, and essentially copied its techniques where required... Interestingly, the kernel's test_kasan module does _not_ have a test case for this: underflow on global memory! :-)

Thanks,
Kaiwan.

Marco Elver

Nov 16, 2021, 11:37:33 AM
to Kaiwan N Billimoria, kasa...@googlegroups.com, Chi-Thanh Hoang
On Tue, Nov 16, 2021 at 07:46PM +0530, Kaiwan N Billimoria wrote:
> On Tue, 2021-11-16 at 12:52 +0100, Marco Elver wrote:
> >
> > KASAN globals support used to be limited in Clang. This was fixed in
> > Clang 11. I'm not sure about GCC.
> ...
> > > Which compiler versions are you using? This is probably the most
> > important piece to the puzzle.
> >
> Right! This is the primary issue i think, thanks!
> am currently using gcc 9.3.0.
>
> So, my Ubuntu system had clang-10; I installed clang-11 on top of it...
> (this causes some issues?). Updated the Makefile to use clang-11, and it did build.

Only the test or the whole kernel? You need to build the whole kernel
and your module with the same compiler, otherwise all bets are off wrt
things like KASAN.

> But when running these tests, *only* UBSAN was triggered, KASAN unseen.
> So: I then rebuilt the 5.10.60 kernel removing UBSAN config and retried (same module rebuilt w/ clang 11).
> This time UBSAN didn't pop up but nor did KASAN ! (For the same rd/wr underflow testcases)...
> My script + dmesg:
> ...
> (Type in the testcase number to run):
> 4.4
> Running testcase "4.4" via test module now...
> [ 371.368096] testcase to run: 4.4
> $
>
> This implies it escaped unnoticed..
>
> To show the difference, here's my testcase #4.1- Read (right) overflow on global memory - output:
>
> Running testcase "4.1" via test module now...
> [ 1372.401484] testcase to run: 4.1
> [ 1372.401515] ==================================================================
> [ 1372.402284] BUG: KASAN: global-out-of-bounds in static_mem_oob_right+0xaf/0x160 [test_kmembugs]
> [ 1372.402851] Read of size 1 at addr ffffffffc088dfcc by task run_tests/1656
>
> [ 1372.403428] CPU: 2 PID: 1656 Comm: run_tests Tainted: G B O 5.10.60-dbg02 #14
> [ 1372.403442] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> [ 1372.403454] Call Trace:
> [ 1372.403486] dump_stack+0xbd/0xfa
>
> [... lots more, as expected ...]
>
> So, am puzzled... why isn't KASAN catching the underflow...

Please take a look at the paragraph at:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/test_kasan.c#n706

I think your test is giving the compiler opportunities to miscompile
your code, because, well it has undefined behaviour (negative index)
that it very clearly can see. I think you need to put more effort into
hiding the UB from the optimizer like we do in test_kasan.c.

If you want to know in detail what's happening I recommend you
disassemble your compiled code and check if the negative dereferences
are still there.

> A couple of caveats:
> 1) I had to manually setup a soft link to llvm-objdump (it was installed as llvm-objdump-11)
> 2) the module build initially failed with
> /bin/sh: 1: ld.lld: not found
> So I installed the 'lld' package; then the build worked..
>
> Any thoughts?

Is this with "make LLVM=1"? Yeah, if there's a version suffix it's known to
be problematic.

You can just build the kernel with "make CC=clang" and it'll use
binutils ld, which works as well.

> > FWIW, the kernel has its own KASAN test suite in lib/test_kasan.c.
> > There are a few things to not make the compiler optimize away
> > explicitly buggy code, so I'd also suggest you embed your test in
> > test_kasan and see if it changes anything (unlikely but worth a shot).
> I have studied it, and essentially copied its techniques where required... Interestingly, the kernel's test_kasan module does _not_ have a test case for this: underflow on global memory! :-)

I just added such a test (below) and it passes just fine with clang 11
(I'll probably send it as a real patch later). Notice that the address
itself ("array") is a volatile, so that the compiler cannot make any
assumptions about it.

Thanks,
-- Marco

------ >8 ------

diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index 67ed689a0b1b..e56c9eb3f16e 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -700,7 +700,7 @@ static void kmem_cache_bulk(struct kunit *test)
 
 static char global_array[10];
 
-static void kasan_global_oob(struct kunit *test)
+static void kasan_global_oob_right(struct kunit *test)
 {
 	/*
 	 * Deliberate out-of-bounds access. To prevent CONFIG_UBSAN_LOCAL_BOUNDS
@@ -723,6 +723,15 @@ static void kasan_global_oob(struct kunit *test)
 	KUNIT_EXPECT_KASAN_FAIL(test, *(volatile char *)p);
 }
 
+static void kasan_global_oob_left(struct kunit *test)
+{
+	char *volatile array = global_array;
+	char *p = array - 3;
+
+	KASAN_TEST_NEEDS_CONFIG_ON(test, CONFIG_KASAN_GENERIC);
+	KUNIT_EXPECT_KASAN_FAIL(test, *(volatile char *)p);
+}
+
 /* Check that ksize() makes the whole object accessible. */
 static void ksize_unpoisons_memory(struct kunit *test)
 {
@@ -1160,7 +1169,8 @@ static struct kunit_case kasan_kunit_test_cases[] = {
 	KUNIT_CASE(kmem_cache_oob),
 	KUNIT_CASE(kmem_cache_accounted),
 	KUNIT_CASE(kmem_cache_bulk),
-	KUNIT_CASE(kasan_global_oob),
+	KUNIT_CASE(kasan_global_oob_right),
+	KUNIT_CASE(kasan_global_oob_left),
 	KUNIT_CASE(kasan_stack_oob),
 	KUNIT_CASE(kasan_alloca_oob_left),
 	KUNIT_CASE(kasan_alloca_oob_right),
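
As a rough userspace illustration of the "volatile address" idea in the patch (a sketch, not kernel code): loading the base pointer through a volatile-qualified variable means the compiler has to treat the address as unknown at compile time, so it cannot fold or discard the offset access. The names backing, ARRAY_START and read_left_of are made up for this sketch, and the access is deliberately kept inside a larger buffer so that the example itself has no undefined behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backing store: the "array" under test starts 16 bytes in,
 * so stepping a few bytes to the left stays inside a valid object. */
static char backing[32];
#define ARRAY_START 16

/* Read the byte `n` positions to the left of the array start. */
static char read_left_of(size_t n)
{
	/* The pointer variable itself is volatile (as in the patch), so the
	 * compiler must reload it and cannot assume which object it points to. */
	char *volatile array = backing + ARRAY_START;
	char *p;

	assert(n <= ARRAY_START);	/* keep this demo within bounds */
	p = array - n;
	/* The volatile cast keeps the load itself from being optimized away. */
	return *(volatile char *)p;
}
```

In a real KASAN test the pointer would step outside the object on purpose; the point here is only that the base address and the load are both opaque to the optimizer.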

Kaiwan N Billimoria

Nov 17, 2021, 2:23:42 AM
to Marco Elver, kasa...@googlegroups.com, Chi-Thanh Hoang


On Tue, 16 Nov 2021, 22:07 Marco Elver, <el...@google.com> wrote:
> On Tue, Nov 16, 2021 at 07:46PM +0530, Kaiwan N Billimoria wrote:
> > On Tue, 2021-11-16 at 12:52 +0100, Marco Elver wrote:
> > >
> > > KASAN globals support used to be limited in Clang. This was fixed in
> > > Clang 11. I'm not sure about GCC.
> > ...
> > > > Which compiler versions are you using? This is probably the most
> > > important piece to the puzzle.
> > >
> > Right! This is the primary issue i think, thanks!
> > am currently using gcc 9.3.0.
> >
> > So, my Ubuntu system had clang-10; I installed clang-11 on top of it...
> > (this causes some issues?). Updated the Makefile to use clang-11, and it did build.
>
> Only the test or the whole kernel? You need to build the whole kernel
> and your module with the same compiler, otherwise all bets are off wrt
> things like KASAN.

Ah, will do so and let you know, thanks!
Will recheck...

Thanks, Kaiwan. 

Chi-Thanh Hoang

Nov 17, 2021, 9:11:48 AM
to Kaiwan N Billimoria, Marco Elver, kasa...@googlegroups.com
I managed to figure out why the global OOB-left is not being detected, and found a workaround 8-)
I am still using gcc 9.3.0.
I notice that KASAN detects the OOB fine in the overflow case. The KASAN report shows the state of the shadow memory around the OOB access: there is no redzone before a global's memory, only after it. So if the global is the first object declared in .bss, for example, there is no redzone in front of it and the shadow memory there is zero; that is why KASAN did not detect the underflow.
I then did the following: I declared 3 global arrays in .bss and tested the OOB underflow on the second array. KASAN does detect this, as an access at -1 falls into the redzone of the first object.
I agree this is kind of a corner case, but to fix it I guess we need to provide a redzone in front of the first global either in .bss or .data, and if possible make the size of that redzone configurable.

ffffffffa07a6580 is the start of .bss. In the log below there are 3 arrays of 10 bytes each (00 02 in shadow memory); the fault is detected, as shown, on the 2nd array when I do a -1 reference.
[25768.140717] Memory state around the buggy address:
[25768.140721]  ffffffffa07a6480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[25768.140725]  ffffffffa07a6500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  <<<<< Here are zero value in shadow mem so access is good
[25768.140730] >ffffffffa07a6580: 00 02 f9 f9 f9 f9 f9 f9 00 02 f9 f9 f9 f9 f9 f9
[25768.140733]                                         ^
[25768.140737]  ffffffffa07a6600: 00 02 f9 f9 f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9
[25768.140741]  ffffffffa07a6680: 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
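
The shadow bytes in the report above can be read with a small model of generic KASAN's one-byte check. This is a userspace sketch, not the kernel's code: one shadow byte describes 8 bytes of memory, 00 means all 8 are accessible, 01..07 means only that many leading bytes are, and f9 marks a global redzone. The toy layout (a 64-byte untracked area followed by three 10-byte arrays in 64-byte slots) and all function names are assumptions chosen to mirror the dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHADOW_SCALE_SHIFT 3	/* one shadow byte covers 8 bytes of memory */
#define GLOBAL_REDZONE     0xF9	/* shows up as "f9" in the report */
#define MEM_SIZE           256	/* hypothetical toy "section" size */

static uint8_t shadow[MEM_SIZE >> SHADOW_SCALE_SHIFT];

/* Describe a global of `size` bytes at offset `off`; the rest of its
 * `slot` bytes become a redzone to the RIGHT of the object only. */
static void register_global(size_t off, size_t size, size_t slot)
{
	size_t first = off >> SHADOW_SCALE_SHIFT;
	size_t full = size >> SHADOW_SCALE_SHIFT;
	size_t tail = size & 7;
	size_t end = (off + slot) >> SHADOW_SCALE_SHIFT;
	size_t i = first;

	assert(off % 8 == 0 && slot % 8 == 0 && off + slot <= MEM_SIZE);
	for (; i < first + full; i++)
		shadow[i] = 0;			/* fully accessible granule */
	if (tail)
		shadow[i++] = (uint8_t)tail;	/* partially accessible granule */
	for (; i < end; i++)
		shadow[i] = GLOBAL_REDZONE;	/* poisoned */
}

/* Model of a 1-byte access check: true means "would be reported". */
static bool access_is_bad(size_t off)
{
	int8_t s = (int8_t)shadow[off >> SHADOW_SCALE_SHIFT];

	if (s == 0)
		return false;
	/* Redzone values are negative as int8_t, so any offset is "bad". */
	return (int8_t)(off & 7) >= s;
}

/* Offsets 0..63 model whatever precedes the section: nothing registered. */
static void setup_layout(void)
{
	memset(shadow, 0, sizeof(shadow));
	register_global(64, 10, 64);	/* first array in the section */
	register_global(128, 10, 64);	/* second array */
	register_global(192, 10, 64);	/* third array */
}
```

After setup_layout(), access_is_bad(64 - 1) (one byte left of the first array) is not flagged because the shadow byte covering it is zero, whereas access_is_bad(128 - 1) lands in the first array's redzone and is flagged.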


Marco Elver

Nov 17, 2021, 9:14:34 AM
to Chi-Thanh Hoang, Kaiwan N Billimoria, kasa...@googlegroups.com
On Wed, 17 Nov 2021 at 15:11, Chi-Thanh Hoang <chithan...@gmail.com> wrote:
>
> I managed to figure out why the global OOB-left is not being detected and work around the issue 8-)
> I am still using gcc 9.3.0.

Yeah, gcc is doing worse here. I just filed:
https://bugzilla.kernel.org/show_bug.cgi?id=215051

Clang 11+ doesn't have this issue.

Please, if you can, post your findings to the bugzilla bug above. Then
we can perhaps take it to gcc devs and ask them to do the same as
clang or fix it some other way.

Thanks,
-- Marco

Chi-Thanh Hoang

Nov 17, 2021, 9:59:28 PM
to Marco Elver, Kaiwan N Billimoria, kasa...@googlegroups.com
Thanks Marco for creating the bugzilla.
I will post my findings.
I found the Clang compiler quite smart when comparing its generated code with gcc's, i.e. clang would not bother generating code for accesses that are OOB when indexing with [ ].

Kaiwan N Billimoria

Nov 18, 2021, 12:56:57 AM
to Chi-Thanh Hoang, Marco Elver, kasa...@googlegroups.com
On Thu, Nov 18, 2021 at 8:29 AM Chi-Thanh Hoang
<chithan...@gmail.com> wrote:
>
> Thanks Marco for creating the bugzilla.
> I will post my findings.

Super. Thanks Chi-Thanh, Marco, very helpful insights.
Also, Marco, am glad to see your latest patch to the test_kasan module
covering the left OOB on global data..

> I found the Clang compiler quite smart when comparing code generated vs gcc, i.e. clang would not bother generating code that are OOB when indexing [ ].

Really good to know!
I think I am facing this "issue" - my supposedly buggy code isn't
causing bugs (when built with clang) :-)
Specifically, the OOB accesses upon global memory..
>>
>>
>> Please, if you can, post your findings to the bugzilla bug above. Then
>> we can perhaps take it to gcc devs and ask them to do the same as
>> clang or fix it some other way.

That would be great...

>>
>> Thanks,
>> -- Marco
>>
>> > I notice KASAN detects fine when OOB happen in overflow, KASAN shown the status of shadow memory around the OOB, I see there is no redzone for the global before the allocated memory, there is redzone after, if the global is the first declared object in the .bss example, there is no redzone in front of it so shadow memory are zero, that is why KASAN did not detect.
>> > I then do the following, I declare 3 globals array in .bss, and test the OOB underflow on the second array and KASAN does detect as doing -1 will fall into the redzone of the first object.
>> > I agree this is kind of a corner case, but to fix this I guess we need to provide redzone in front of the first global either in .bss or .data, and if possible to configure the size of such redzone.
>> >
>> > at ffffffffa07a6580 is start of .bss, in the log below there is 3 arrays of 10 bytes (00 02 from shadow mem), the fault detected as shown on the 2nd array when I do a -1 reference.
>> > [25768.140717] Memory state around the buggy address:
>> > [25768.140721] ffffffffa07a6480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> > [25768.140725] ffffffffa07a6500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <<<<< Here are zero value in shadow mem so access is good
>> > [25768.140730] >ffffffffa07a6580: 00 02 f9 f9 f9 f9 f9 f9 00 02 f9 f9 f9 f9 f9 f9
>> > [25768.140733] ^
>> > [25768.140737] ffffffffa07a6600: 00 02 f9 f9 f9 f9 f9 f9 01 f9 f9 f9 f9 f9 f9 f9
>> > [25768.140741] ffffffffa07a6680: 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
>> >

Really interesting! Am trying to replicate along similar lines but it
doesn't trigger !

static char global_arr1[100];
static int global_arr2[10];
static char global_arr3[10];
...
int global_mem_oob_left(int mode)
{
	volatile char w;
	char *volatile array = global_arr3;
	char *p = array - 3; // invalid, not within bounds

	w = *(volatile char *)p;
	...
}

I also find that the global arrays seem to be laid out "in reverse",
i.e., if i print their kernel va's:
test_kmembugs:global_mem_oob_left(): global_arr1=ffffffffc07db8e0
global_arr2=ffffffffc07db900 global_arr3=ffffffffc07db8c0

And the last one, global_arr3, coincides with the BSS start:

$ sudo cat /sys/module/test_kmembugs/sections/.bss
0xffffffffc07db8c0

Can we infer anything here?

Thanks Marco, Chi-Thanh,

Regards,
Kaiwan.

Marco Elver

Nov 18, 2021, 3:36:39 AM
to Kaiwan N Billimoria, Chi-Thanh Hoang, kasa...@googlegroups.com
On Thu, 18 Nov 2021 at 06:56, Kaiwan N Billimoria
<kaiwan.b...@gmail.com> wrote:
>
> On Thu, Nov 18, 2021 at 8:29 AM Chi-Thanh Hoang
> <chithan...@gmail.com> wrote:
> >
> > Thanks Marco for creating the bugzilla.
> > I will post my findings.

Thanks for adding your findings.

[...]
>
> Really interesting! Am trying to replicate along similar lines but it
> doesn't trigger !
>
> static char global_arr1[100];
> static int global_arr2[10];
> static char global_arr3[10];
> ...
> int global_mem_oob_left(int mode)
> {
> volatile char w;
> char *volatile array = global_arr3;
> char *p = array - 3; // invalid, not within bounds
>
> w = *(volatile char *)p;
> ...
> }
>
> I also find that the global arrays seem to be laid out "in reverse",
> i.e., if i print their kernel va's:
> test_kmembugs:global_mem_oob_left(): global_arr1=ffffffffc07db8e0
> global_arr2=ffffffffc07db900 global_arr3=ffffffffc07db8c0
>
> And the last one, global_arr3, coincides with the BSS start:
>
> $ sudo cat /sys/module/test_kmembugs/sections/.bss
> 0xffffffffc07db8c0
>
> Can we infer anything here?

Infer why it's broken? Not really, there's no guaranteed order how
globals are laid out in memory. It's entirely up to the linker (except
if you explicitly put the symbol in some section).

The reason why GCC is not detecting this is because last I checked its
implementation of adding globals redzones is based on increasing
alignment of globals, which is really not the most reliable way to
ensure there's always padding. Clang explicitly adds data after a
global and doesn't rely on alignment.
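
To see why alignment alone is a fragile way to get padding, here is a toy model (an illustration only, not GCC's or Clang's actual algorithm; the function names and the 32-byte figures used with them are assumptions). If objects are merely placed at aligned addresses, the gap after an object is whatever is left up to the next aligned address, and that can be zero; appending explicit padding always leaves a gap.

```c
#include <assert.h>
#include <stddef.h>

/* Gap left after an object of `size` bytes when the next object simply
 * starts at the next multiple of `align` (alignment-only model). */
static size_t gap_from_alignment(size_t size, size_t align)
{
	assert(align != 0);
	return (align - size % align) % align;
}

/* Gap left when `pad` bytes are explicitly appended after the object and
 * the next object is then aligned as usual (explicit-padding model). */
static size_t gap_from_explicit_padding(size_t size, size_t pad, size_t align)
{
	return pad + gap_from_alignment(size + pad, align);
}
```

For example, a 10-byte object with 32-byte alignment gets a 22-byte gap from alignment alone, but a 32-byte object gets none, while explicit padding of 32 bytes leaves at least 32 bytes in both cases.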

Kaiwan N Billimoria

Nov 18, 2021, 3:54:22 AM
to Marco Elver, Chi-Thanh Hoang, kasa...@googlegroups.com
Ok..

> The reason why GCC is not detecting this is because last I checked its
> implementation of adding globals redzones is based on increasing
> alignment of globals, which is really not the most reliable way to
> ensure there's always padding. Clang explicitly adds data after a
> global and doesn't rely on alignment.

Ok. But, this had been done on a kernel and module compiled with clang.
Thanks, Kaiwan.