Re: [PATCH] Enable '-Werror' by default for all kernel builds

Nathan Chancellor

Sep 8, 2021, 4:55:07 PM
to Arnd Bergmann, Linus Torvalds, Guenter Roeck, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux...@lists.infradead.org, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasa...@googlegroups.com
Hi Arnd,

On Tue, Sep 07, 2021 at 11:11:17AM +0200, Arnd Bergmann wrote:
> On Tue, Sep 7, 2021 at 4:32 AM Nathan Chancellor <nat...@kernel.org> wrote:
> >
> > arm32-allmodconfig.log: crypto/wp512.c:782:13: error: stack frame size (1176) exceeds limit (1024) in function 'wp512_process_buffer' [-Werror,-Wframe-larger-than]
> > arm32-allmodconfig.log: drivers/firmware/tegra/bpmp-debugfs.c:294:12: error: stack frame size (1256) exceeds limit (1024) in function 'bpmp_debug_show' [-Werror,-Wframe-larger-than]
> > arm32-allmodconfig.log: drivers/firmware/tegra/bpmp-debugfs.c:357:16: error: stack frame size (1264) exceeds limit (1024) in function 'bpmp_debug_store' [-Werror,-Wframe-larger-than]
> > arm32-allmodconfig.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:3043:6: error: stack frame size (1384) exceeds limit (1024) in function 'bw_calcs' [-Werror,-Wframe-larger-than]
> > arm32-allmodconfig.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:77:13: error: stack frame size (5560) exceeds limit (1024) in function 'calculate_bandwidth' [-Werror,-Wframe-larger-than]
> > arm32-allmodconfig.log: drivers/mtd/chips/cfi_cmdset_0001.c:1872:12: error: stack frame size (1064) exceeds limit (1024) in function 'cfi_intelext_writev' [-Werror,-Wframe-larger-than]
> > arm32-allmodconfig.log: drivers/ntb/hw/idt/ntb_hw_idt.c:1041:27: error: stack frame size (1032) exceeds limit (1024) in function 'idt_scan_mws' [-Werror,-Wframe-larger-than]
> > arm32-allmodconfig.log: drivers/staging/fbtft/fbtft-core.c:902:12: error: stack frame size (1072) exceeds limit (1024) in function 'fbtft_init_display_from_property' [-Werror,-Wframe-larger-than]
> > arm32-allmodconfig.log: drivers/staging/fbtft/fbtft-core.c:992:5: error: stack frame size (1064) exceeds limit (1024) in function 'fbtft_init_display' [-Werror,-Wframe-larger-than]
> > arm32-allmodconfig.log: drivers/staging/rtl8723bs/core/rtw_security.c:1288:5: error: stack frame size (1040) exceeds limit (1024) in function 'rtw_aes_decrypt' [-Werror,-Wframe-larger-than]
> > arm32-fedora.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:3043:6: error: stack frame size (1376) exceeds limit (1024) in function 'bw_calcs' [-Werror,-Wframe-larger-than]
> > arm32-fedora.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:77:13: error: stack frame size (5384) exceeds limit (1024) in function 'calculate_bandwidth' [-Werror,-Wframe-larger-than]
> >
> > Aside from the dce_calcs.c warnings, these do not seem too bad. I
> > believe allmodconfig turns on UBSAN but it could also be aggressive
> > inlining by clang. I intend to look at all -Wframe-larger-than warnings
> > closely later.
>
> I've had them close to zero in the past, but a couple of new ones came in.
>
> The amdgpu ones are probably not fixable unless they stop using 64-bit
> floats in the kernel for random calculations. The crypto/* ones tend to
> be compiler bugs, but hard to fix

I have started taking a look at these. Most of the allmodconfig ones
appear to be related to CONFIG_KASAN, which is now supported for
CONFIG_ARM.

The two in bpmp-debugfs.c appear regardless of CONFIG_KASAN and it turns
out that you actually submitted a patch for these:

https://lore.kernel.org/r/20201204193714...@kernel.org/

Is it worth resending or pinging that?

The dce_calcs.c ones also appear without CONFIG_KASAN, which you noted
is probably unavoidable.

The other ones only appear with CONFIG_KASAN. I have not investigated
each instance to see exactly how much KASAN makes the stack blow up.
Perhaps it is worth setting the default of CONFIG_FRAME_WARN to a higher
value with clang+COMPILE_TEST+KASAN?

> > It appears that both Arch Linux and Fedora define CONFIG_FRAME_WARN
> > as 1024, below its default of 2048. I am not sure these look particularly
> > scary (although there are some that are rather large that need to be
> > looked at).
>
> For 64-bit, you usually need 1280 bytes stack space to get a
> reasonably clean build; anything that uses more than that tends to be
> a bug in the code, but we never warned about those by default as the
> default warning limit in defconfig is 2048.
>
> I think the distros using 1024 did that because they use a common base config
> for 32-bit and 64-bit targets.

That is a fair explanation.

> > I suspect this is a backend problem because these do not really appear
> > in any other configurations (might also be something with a sanitizer?)
>
> Agreed. Someone needs to bisect the .config or the compiler flags to see what
> triggers them.

For other people following along, there were a lot of
-Wframe-larger-than instances from RISC-V allmodconfig.

Turns out this is because CONFIG_KASAN_STACK is not respected with
RISC-V. They do not set CONFIG_KASAN_SHADOW_OFFSET so following along in
scripts/Makefile.kasan, CFLAGS_KASAN_SHADOW does not get set to
anything, which means that only '-fsanitize=kernel-address' gets added
to the command line, with none of the other parameters.

I guess there are a couple of ways to tackle this:

1. RISC-V could implement CONFIG_KASAN_SHADOW_OFFSET. They mention that
   the logic of KASAN_SHADOW_OFFSET was taken from arm64 but they did
   not borrow the Kconfig logic it seems.

2. asan-stack could be hoisted out of the else branch so that it is
   always enabled/disabled regardless of KASAN_SHADOW_OFFSET being
   defined, which resolved all of these warnings for me in my testing.
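
The behaviour described above, and what option 2 would change, can be modelled in a few lines. This is a simplified Python sketch, not the contents of scripts/Makefile.kasan: the function name `kasan_cflags`, its parameters, the `hoist_stack` switch and the `shadow-offset=` placeholder are all hypothetical, the asan-stack parameter is written as a plain `asan-stack=N` string (its real command-line spelling differs between gcc and clang), and the other parameters of the else branch are left out.

```python
def kasan_cflags(shadow_offset, kasan_stack, hoist_stack=False):
    """Return the KASAN-related flags for one configuration (simplified model)."""
    stack_param = f"asan-stack={1 if kasan_stack else 0}"
    flags = ["-fsanitize=kernel-address"]

    if shadow_offset is None:
        # No CONFIG_KASAN_SHADOW_OFFSET (the RISC-V case in this thread):
        # only the minimal flag is used, so CONFIG_KASAN_STACK is ignored
        # unless asan-stack is hoisted out of the else branch (option 2).
        if hoist_stack:
            flags.append(stack_param)
        return flags

    # Shadow offset known: the offset and the extra parameters are passed.
    # "shadow-offset=" is a placeholder, not a real compiler option name.
    flags.append(f"shadow-offset={shadow_offset:#x}")
    flags.append(stack_param)
    return flags
```

With `shadow_offset=None` and `kasan_stack=False` the model returns only `-fsanitize=kernel-address`, so the compiler's own default for stack instrumentation applies even though CONFIG_KASAN_STACK is off. With `hoist_stack=True` the `asan-stack=0` parameter is passed in that case as well.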

I am adding the KASAN and RISC-V folks to CC for this reason.

Cheers,
Nathan

Guenter Roeck

Sep 8, 2021, 5:17:34 PM
to Nathan Chancellor, Arnd Bergmann, Linus Torvalds, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux...@lists.infradead.org, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasa...@googlegroups.com
Would it make sense to make KASAN depend on !COMPILE_TEST ?
After all, the point of KASAN is runtime testing, not build testing.

Guenter

Marco Elver

Sep 8, 2021, 5:59:04 PM
to Guenter Roeck, Nathan Chancellor, Arnd Bergmann, Linus Torvalds, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux...@lists.infradead.org, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasa...@googlegroups.com
On Wed, Sep 08, 2021 at 02:16PM -0700, Guenter Roeck wrote:
> On 9/8/21 1:55 PM, Nathan Chancellor wrote:
[...]
> > I have started taking a look at these. Most of the allmodconfig ones
> > appear to be related to CONFIG_KASAN, which is now supported for
> > CONFIG_ARM.
> >
>
> Would it make sense to make KASAN depend on !COMPILE_TEST ?
> After all, the point of KASAN is runtime testing, not build testing.

It'd be good to avoid. It has helped uncover build issues with KASAN in
the past. Or at least make it dependent on the problematic architecture.
For example if arm is a problem, something like this:

--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -71,7 +71,7 @@ config ARM
 	select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6
 	select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU
 	select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU
-	select HAVE_ARCH_KASAN if MMU && !XIP_KERNEL
+	select HAVE_ARCH_KASAN if MMU && !XIP_KERNEL && (!COMPILE_TEST || !CC_IS_CLANG)
 	select HAVE_ARCH_MMAP_RND_BITS if MMU
 	select HAVE_ARCH_PFN_VALID
 	select HAVE_ARCH_SECCOMP
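
The condition added in the diff above only excludes one combination: compile-testing with clang. A minimal Python sketch of that boolean expression follows; the function name `have_arch_kasan` and its parameters are hypothetical stand-ins for the Kconfig symbols.

```python
def have_arch_kasan(mmu, xip_kernel, compile_test, cc_is_clang):
    """Model of: MMU && !XIP_KERNEL && (!COMPILE_TEST || !CC_IS_CLANG)."""
    return mmu and not xip_kernel and (not compile_test or not cc_is_clang)
```

Under this model a gcc compile-test build and a clang build without COMPILE_TEST both keep KASAN available on arm; only clang with COMPILE_TEST loses it.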

More generally, with clang, the problem is known and due to KASAN stack
instrumentation (CONFIG_KASAN_STACK):

| config KASAN_STACK
| 	bool "Enable stack instrumentation (unsafe)" if CC_IS_CLANG && !COMPILE_TEST
| 	depends on KASAN_GENERIC || KASAN_SW_TAGS
| 	depends on !ARCH_DISABLE_KASAN_INLINE
| 	default y if CC_IS_GCC
| 	help
| 	  The LLVM stack address sanitizer has a know problem that
| 	  causes excessive stack usage in a lot of functions, see
| 	  https://bugs.llvm.org/show_bug.cgi?id=38809
| 	  Disabling asan-stack makes it safe to run kernels build
| 	  with clang-8 with KASAN enabled, though it loses some of
| 	  the functionality.
| 	  This feature is always disabled when compile-testing with clang
| 	  to avoid cluttering the output in stack overflow warnings,
| 	  but clang users can still enable it for builds without
| 	  CONFIG_COMPILE_TEST. On gcc it is assumed to always be safe
| 	  to use and enabled by default.
| 	  If the architecture disables inline instrumentation, stack
| 	  instrumentation is also disabled as it adds inline-style
| 	  instrumentation that is run unconditionally.

This is already disabled if COMPILE_TEST and building with clang. As
far as I know, there's no easy fix for clang and it's been discussed
many times over with LLVM devs.
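
How the quoted KASAN_STACK entry resolves for each compiler can be sketched as a small Python model. It is an illustration of the prompt/default rules in the entry above, not Kconfig itself: the function name `kasan_stack_enabled` and its parameters are hypothetical, and it assumes the compiler is either gcc or clang.

```python
def kasan_stack_enabled(cc_is_clang, compile_test, user_choice=False,
                        deps_met=True):
    """Model the value of KASAN_STACK from the quoted Kconfig entry."""
    if not deps_met:
        # The 'depends on' lines are not satisfied: the symbol is off.
        return False
    cc_is_gcc = not cc_is_clang  # assumption: only gcc and clang exist
    prompt_visible = cc_is_clang and not compile_test
    if prompt_visible:
        # clang without COMPILE_TEST: the user decides, default is n.
        return user_choice
    # No prompt: the value comes from the default, which is y only on gcc.
    return cc_is_gcc
```

In this model clang with COMPILE_TEST always yields `False` regardless of `user_choice`, which corresponds to "already disabled if COMPILE_TEST and building with clang", while gcc always yields `True` once the dependencies are met.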

Thanks,
-- Marco

Christoph Hellwig

Sep 9, 2021, 2:02:07 AM
to Marco Elver, Guenter Roeck, Nathan Chancellor, Arnd Bergmann, Linus Torvalds, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux...@lists.infradead.org, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasa...@googlegroups.com, Christian König, Pan, Xinhui, amd...@lists.freedesktop.org
On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
> It'd be good to avoid. It has helped uncover build issues with KASAN in
> the past. Or at least make it dependent on the problematic architecture.
> For example if arm is a problem, something like this:

I'm also seeing quite a few stack size warnings with KASAN on x86_64
without COMPILE_TEST using gcc 10.2.1 from Debian. In fact there are a
few warnings without KASAN, but with KASAN there are a lot more.
I'll try to find some time to dig into them.

While we're at it, with -Werror something like this is really futile:

drivers/gpu/drm/amd/amdgpu/amdgpu_object.c: In function ‘amdgpu_bo_support_uswc’:
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:493:2: warning: #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance thanks to write-combining [-Wcpp]
  493 | #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance \
      |  ^~~~~~~

Guenter Roeck

Sep 9, 2021, 2:08:00 AM
to Christoph Hellwig, Marco Elver, Nathan Chancellor, Arnd Bergmann, Linus Torvalds, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux...@lists.infradead.org, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasa...@googlegroups.com, Christian König, Pan, Xinhui, amd...@lists.freedesktop.org
I have been wondering if all those #warning "errors" should either
be removed or be replaced with "#pragma message".

Guenter

Christian König

Sep 9, 2021, 3:30:38 AM
to Guenter Roeck, Christoph Hellwig, Marco Elver, Nathan Chancellor, Arnd Bergmann, Linus Torvalds, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux...@lists.infradead.org, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasa...@googlegroups.com, Pan, Xinhui, amd...@lists.freedesktop.org
Am 09.09.21 um 08:07 schrieb Guenter Roeck:
> On 9/8/21 10:58 PM, Christoph Hellwig wrote:
>> On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
>>> It'd be good to avoid. It has helped uncover build issues with KASAN in
>>> the past. Or at least make it dependent on the problematic
>>> architecture.
>>> For example if arm is a problem, something like this:
>>
>> I'm also seeing quite a few stack size warnings with KASAN on x86_64
>> without COMPILE_TEST using gcc 10.2.1 from Debian.  In fact there are a
>> few warnings without KASAN, but with KASAN there are a lot more.
>> I'll try to find some time to dig into them.
>>
>> While we're at it, with -Werror something like this is really futile:
>>
>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c: In function
>> ‘amdgpu_bo_support_uswc’:
>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:493:2: warning: #warning
>> Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance
>> thanks to write-combining [-Wcpp
>>    493 | #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for
>> better performance \
>>        |  ^~~~~~~

Ah, yes good point!

>
> I have been wondering if all those #warning "errors" should either
> be removed or be replaced with "#pragma message".

Well, we started to add those warnings because people compiled their
kernel without CONFIG_MTRR and CONFIG_X86_PAT and were then wondering why
the performance of the display driver was so crappy.

When those warnings now generate an error which you have to disable
explicitly, then that might not be bad at all.

It at least points people to this setting and makes it really clear that
they are doing something very unusual and need to keep in mind that it
might not have the desired result.

Regards,
Christian.

>
> Guenter

Marco Elver

Sep 9, 2021, 6:53:15 AM
to Christoph Hellwig, Guenter Roeck, Nathan Chancellor, Arnd Bergmann, Linus Torvalds, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux...@lists.infradead.org, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasa...@googlegroups.com, Christian König, Pan, Xinhui, amd...@lists.freedesktop.org
On Thu, 9 Sept 2021 at 07:59, Christoph Hellwig <h...@infradead.org> wrote:
> On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
> > It'd be good to avoid. It has helped uncover build issues with KASAN in
> > the past. Or at least make it dependent on the problematic architecture.
> > For example if arm is a problem, something like this:
>
> I'm also seeing quite a few stack size warnings with KASAN on x86_64
> without COMPILE_TEST using gcc 10.2.1 from Debian. In fact there are a
> few warnings without KASAN, but with KASAN there are a lot more.
> I'll try to find some time to dig into them.

Right, this reminded me that we actually at least double the real
stack size for KASAN builds, because it inherently requires more stack
space. I think we need Wframe-larger-than to match that, otherwise
we'll just keep having this problem:

https://lkml.kernel.org/r/20210909104925...@google.com

Arnd Bergmann

Sep 9, 2021, 7:00:44 AM
to Marco Elver, Christoph Hellwig, Guenter Roeck, Nathan Chancellor, Linus Torvalds, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasan-dev, Christian König, Pan, Xinhui, amd-gfx list
On Thu, Sep 9, 2021 at 12:54 PM Marco Elver <el...@google.com> wrote:
> On Thu, 9 Sept 2021 at 07:59, Christoph Hellwig <h...@infradead.org> wrote:
> > On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
> > > It'd be good to avoid. It has helped uncover build issues with KASAN in
> > > the past. Or at least make it dependent on the problematic architecture.
> > > For example if arm is a problem, something like this:
> >
> > I'm also seeing quite a few stack size warnings with KASAN on x86_64
> > without COMPILE_TEST using gcc 10.2.1 from Debian. In fact there are a
> > few warnings without KASAN, but with KASAN there are a lot more.
> > I'll try to find some time to dig into them.
>
> Right, this reminded me that we actually at least double the real
> stack size for KASAN builds, because it inherently requires more stack
> space. I think we need Wframe-larger-than to match that, otherwise
> we'll just keep having this problem:
>
> https://lkml.kernel.org/r/20210909104925...@google.com

The problem with this is that it completely defeats the point of the
stack size warnings in allmodconfig kernels when they have KASAN
enabled and end up missing obvious code bugs in drivers that put
large structures on the stack. Let's not go there.

Arnd

Marco Elver

Sep 9, 2021, 7:43:57 AM
to Arnd Bergmann, Christoph Hellwig, Guenter Roeck, Nathan Chancellor, Linus Torvalds, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasan-dev, Christian König, Pan, Xinhui, amd-gfx list
Sure, but the reality is that the real stack size is already doubled
for KASAN. And that should be reflected in Wframe-larger-than.

Either that, or we just have to live with the occasional warning (that
is likely benign). But with WERROR we're now forced to make the
defaults as sane as possible. If the worry is allmodconfig, maybe we
do have to make KASAN dependent on !COMPILE_TEST, even though that's
not great either because it has caught real issues in the past (it'll
also mean doing the same for all other instrumentation-based tools,
like KCSAN, UBSAN, etc.).

Arnd Bergmann

Sep 9, 2021, 8:55:35 AM
to Marco Elver, Christoph Hellwig, Guenter Roeck, Nathan Chancellor, Linus Torvalds, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasan-dev, Christian König, Pan, Xinhui, amd-gfx list
On Thu, Sep 9, 2021 at 1:43 PM Marco Elver <el...@google.com> wrote:
> On Thu, 9 Sept 2021 at 13:00, Arnd Bergmann <ar...@kernel.org> wrote:
> > On Thu, Sep 9, 2021 at 12:54 PM Marco Elver <el...@google.com> wrote:
> > > On Thu, 9 Sept 2021 at 07:59, Christoph Hellwig <h...@infradead.org> wrote:
> > > > On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
> > > > > It'd be good to avoid. It has helped uncover build issues with KASAN in
> > > > > the past. Or at least make it dependent on the problematic architecture.
> > > > > For example if arm is a problem, something like this:
> > > >
> > > > I'm also seeing quite a few stack size warnings with KASAN on x86_64
> > > > without COMPILE_TEST using gcc 10.2.1 from Debian. In fact there are a
> > > > few warnings without KASAN, but with KASAN there are a lot more.
> > > > I'll try to find some time to dig into them.
> > >
> > > Right, this reminded me that we actually at least double the real
> > > stack size for KASAN builds, because it inherently requires more stack
> > > space. I think we need Wframe-larger-than to match that, otherwise
> > > we'll just keep having this problem:
> > >
> > > https://lkml.kernel.org/r/20210909104925...@google.com
> >
> > The problem with this is that it completely defeats the point of the
> > stack size warnings in allmodconfig kernels when they have KASAN
> > enabled and end up missing obvious code bugs in drivers that put
> > large structures on the stack. Let's not go there.
>
> Sure, but the reality is that the real stack size is already doubled
> for KASAN. And that should be reflected in Wframe-larger-than.

I don't think "double" is an accurate description of what is going on;
it's much more complex than this. There are some functions
that completely explode with KASAN_STACK enabled on clang,
and many other function instances that don't grow much at all.

I've been building randconfig kernels for a long time with KASAN_STACK
enabled on gcc, with the limit increased to 1440 bytes for 32-bit
and not increased beyond the normal 2048 bytes for 64-bit. I have
some patches to address the outliers and should go through and
resend some of those.

With the same limits and patches using clang, and KASAN=y but
KASAN_STACK=n I also get no warnings in randconfig builds,
but KASAN_STACK on clang doesn't really seem to have a good
limit that would make an allmodconfig kernel build with no warnings.

These are the worst offenders I see based on configuration, using
a 32-bit ARM allmodconfig with my fixups:

gcc-11, KASAN, no KASAN_STACK, FRAME_WARN=1024:
(nothing)

gcc-11, KASAN_STACK:
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:782:1: warning: the frame size of 1416 bytes is larger than 1024 bytes [-Wframe-larger-than=]
drivers/media/dvb-frontends/mxl5xx.c:1575:1: warning: the frame size of 1240 bytes is larger than 1024 bytes [-Wframe-larger-than=]
drivers/mtd/nftlcore.c:468:1: warning: the frame size of 1232 bytes is larger than 1024 bytes [-Wframe-larger-than=]
drivers/char/ipmi/ipmi_msghandler.c:4880:1: warning: the frame size of 1232 bytes is larger than 1024 bytes [-Wframe-larger-than=]
drivers/mtd/chips/cfi_cmdset_0001.c:1870:1: warning: the frame size of 1224 bytes is larger than 1024 bytes [-Wframe-larger-than=]
drivers/net/wireless/ath/ath9k/ar9003_paprd.c:749:1: warning: the frame size of 1216 bytes is larger than 1024 bytes [-Wframe-larger-than=]
drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c:136:1: warning: the frame size of 1216 bytes is larger than 1024 bytes [-Wframe-larger-than=]
drivers/ntb/hw/idt/ntb_hw_idt.c:1116:1: warning: the frame size of 1200 bytes is larger than 1024 bytes [-Wframe-larger-than=]
net/dcb/dcbnl.c:1172:1: warning: the frame size of 1192 bytes is larger than 1024 bytes [-Wframe-larger-than=]
fs/select.c:1042:1: warning: the frame size of 1192 bytes is larger than 1024 bytes [-Wframe-larger-than=]

clang-12 KASAN, no KASAN_STACK, FRAME_WARN=1024:

kernel/trace/trace_events_hist.c:4601:13: error: stack frame size 1384 exceeds limit 1024 in function 'hist_trigger_print_key' [-Werror,-Wframe-larger-than]
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:3045:6: error: stack frame size 1384 exceeds limit 1024 in function 'bw_calcs' [-Werror,-Wframe-larger-than]
drivers/staging/fbtft/fbtft-core.c:992:5: error: stack frame size 1208 exceeds limit 1024 in function 'fbtft_init_display' [-Werror,-Wframe-larger-than]
crypto/wp512.c:782:13: error: stack frame size 1176 exceeds limit 1024 in function 'wp512_process_buffer' [-Werror,-Wframe-larger-than]
drivers/staging/fbtft/fbtft-core.c:902:12: error: stack frame size 1080 exceeds limit 1024 in function 'fbtft_init_display_from_property' [-Werror,-Wframe-larger-than]
drivers/mtd/chips/cfi_cmdset_0001.c:1872:12: error: stack frame size 1064 exceeds limit 1024 in function 'cfi_intelext_writev' [-Werror,-Wframe-larger-than]
drivers/staging/rtl8723bs/core/rtw_security.c:1288:5: error: stack frame size 1040 exceeds limit 1024 in function 'rtw_aes_decrypt' [-Werror,-Wframe-larger-than]
drivers/ntb/hw/idt/ntb_hw_idt.c:1041:27: error: stack frame size 1032 exceeds limit 1024 in function 'idt_scan_mws' [-Werror,-Wframe-larger-than]

clang-12, KASAN_STACK:

drivers/infiniband/hw/ocrdma/ocrdma_stats.c:686:16: error: stack frame size 20608 exceeds limit 1024 in function 'ocrdma_dbgfs_ops_read' [-Werror,-Wframe-larger-than]
lib/bitfield_kunit.c:60:20: error: stack frame size 10336 exceeds limit 10240 in function 'test_bitfields_constants' [-Werror,-Wframe-larger-than]
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9012:13: error: stack frame size 9952 exceeds limit 1024 in function 'rt2800_init_rfcsr' [-Werror,-Wframe-larger-than]
drivers/net/usb/r8152.c:7486:13: error: stack frame size 8768 exceeds limit 1024 in function 'r8156b_hw_phy_cfg' [-Werror,-Wframe-larger-than]
drivers/media/dvb-frontends/nxt200x.c:915:12: error: stack frame size 8192 exceeds limit 1024 in function 'nxt2004_init' [-Werror,-Wframe-larger-than]
drivers/net/wan/slic_ds26522.c:203:12: error: stack frame size 8064 exceeds limit 1024 in function 'slic_ds26522_probe' [-Werror,-Wframe-larger-than]
drivers/firmware/broadcom/bcm47xx_sprom.c:188:13: error: stack frame size 8064 exceeds limit 1024 in function 'bcm47xx_sprom_fill_auto' [-Werror,-Wframe-larger-than]
drivers/media/dvb-frontends/drxd_hard.c:2857:12: error: stack frame size 7584 exceeds limit 1024 in function 'drxd_set_frontend' [-Werror,-Wframe-larger-than]
drivers/media/dvb-frontends/nxt200x.c:519:12: error: stack frame size 6848 exceeds limit 1024 in function 'nxt200x_setup_frontend_parameters' [-Werror,-Wframe-larger-than]
drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c:17019:13: error: stack frame size 6560 exceeds limit 1024 in function 'wlc_phy_workarounds_nphy' [-Werror,-Wframe-larger-than]
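
Lists like the ones above can be produced from a raw build log with a short script. The following is a minimal Python sketch, not a tool used in this thread: the names `CLANG_RE`, `GCC_RE` and `parse_frame_warnings` are hypothetical, and the two patterns only cover the gcc and clang message formats that appear in this thread (clang with or without parentheses around the numbers). Whitespace is collapsed first so that messages wrapped by a mail client still match.

```python
import re

# clang: "<file>:<line>:<col>: error: stack frame size N exceeds limit M in function 'f'"
CLANG_RE = re.compile(
    r"(?P<file>\S+?):(?P<line>\d+):\d+:\s+error:\s+stack frame size "
    r"\(?(?P<size>\d+)\)? exceeds limit \(?(?P<limit>\d+)\)? "
    r"in function '(?P<func>[^']+)'"
)
# gcc: "<file>:<line>:<col>: warning: the frame size of N bytes is larger than M bytes"
GCC_RE = re.compile(
    r"(?P<file>\S+?):(?P<line>\d+):\d+:\s+warning:\s+the frame size of "
    r"(?P<size>\d+) bytes is larger than (?P<limit>\d+) bytes"
)


def parse_frame_warnings(log_text):
    """Return (frame_size, file, function) tuples, largest frame first."""
    flat = " ".join(log_text.split())  # undo line wrapping
    hits = []
    for regex in (CLANG_RE, GCC_RE):
        for m in regex.finditer(flat):
            # gcc does not name the function in this message, so use "".
            func = m.groupdict().get("func") or ""
            hits.append((int(m["size"]), m["file"], func))
    return sorted(hits, reverse=True)
```

Calling `parse_frame_warnings(open("build.log").read())` on a saved log (the file name is a placeholder) gives the worst offenders at the top, which makes it easier to compare configurations such as gcc versus clang or KASAN_STACK on versus off.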

> Either that, or we just have to live with the occasional warning (that
> is likely benign). But with WERROR we're now forced to make the
> defaults as sane as possible. If the worry is allmodconfig, maybe we
> do have to make KASAN dependent on !COMPILE_TEST, even though that's
> not great either because it has caught real issues in the past (it'll
> also mean doing the same for all other instrumentation-based tools,
> like KCSAN, UBSAN, etc.).

I would prefer going back to marking KASAN_STACK as broken on clang; it does
not seem like the warnings on the symbol were enough to stop people from
attempting to use it, and the remaining warnings seem fixable with a small
increase of FRAME_WARN when using KASAN with clang but no KASAN_STACK,
or when using KASAN_STACK with gcc.

Arnd

Guenter Roeck

Sep 9, 2021, 11:00:00 AM
to Christian König, Christoph Hellwig, Marco Elver, Nathan Chancellor, Arnd Bergmann, Linus Torvalds, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux...@lists.infradead.org, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasa...@googlegroups.com, Pan, Xinhui, amd...@lists.freedesktop.org
That specific warning is surrounded with "#ifndef CONFIG_COMPILE_TEST"
so it doesn't really matter because it doesn't cause test build failures.
Of course, we could do the same for any #warning which does now
cause a test build failure.

Guenter

Linus Torvalds

Sep 9, 2021, 12:46:46 PM
to Christoph Hellwig, Marco Elver, Guenter Roeck, Nathan Chancellor, Arnd Bergmann, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux...@lists.infradead.org, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasan-dev, Christian König, Pan, Xinhui, amd...@lists.freedesktop.org
On Wed, Sep 8, 2021 at 10:59 PM Christoph Hellwig <h...@infradead.org> wrote:
>
> While we're at it, with -Werror something like this is really futile:

Yeah, I'm thinking we could do

-Wno-error=cpp

to at least allow the cpp warnings to come through without being fatal.

Because while they can be annoying too, they are most definitely under
our direct control, so..

I didn't actually test that, but I think it should work.

That said, maybe they should just be removed. They might be better off
just as Kconfig rules, rather than as a "hey, you screwed up your
Kconfig" warning after the fact.

Linus

Linus Torvalds

Sep 9, 2021, 1:00:11 PM
to Marco Elver, Arnd Bergmann, Christoph Hellwig, Guenter Roeck, Nathan Chancellor, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasan-dev, Christian König, Pan, Xinhui, amd-gfx list
On Thu, Sep 9, 2021 at 4:43 AM Marco Elver <el...@google.com> wrote:
>
> Sure, but the reality is that the real stack size is already doubled
> for KASAN. And that should be reflected in Wframe-larger-than.

I don't think that's true.

Quite the reverse, in fact.

Yes, the *dynamic* stack size is doubled due to KASAN, because it will
cause much deeper callchains.

But the individual frames don't grow that much apart from compilers
doing stupid things (ie apparently clang and KASAN_STACK), and if
anything, the deeper dynamic call chains means that the individual
frame size being small is even *more* important, but we do compensate
for the deeper stacks by making THREAD_SIZE_ORDER bigger at least on
x86.

Honestly, I am not even happy with the current "2048 bytes for
64-bit". The excuse has been that 64-bit needs more stack, but all it
ever did was clearly to just allow people to just do bad things.

Because a 1kB stack frame is horrendous even in 64-bit. That's not
"spill some registers" kind of stack frame. That's "put a big
structure on the stack" kind of stack frame regardless of any other
issues.

And no, "but we have 16kB of stack and we'll switch stacks on
interrupts" is not an excuse for one single level to use up 1kB, much
less 2kB. Does anybody seriously believe that we don't quite normally
have stacks that are easily tens of frames deep?

Without having some true "this is the full callchain" information, the
best we can do is just limit individual stack frames. And 2kB is
*excessive*.

Linus
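
The argument above is easy to check as arithmetic. The sketch below is illustrative Python only: the function names are hypothetical, the 16kB stack and the 1kB and 2kB frame sizes come from the message, and the depth of 30 frames is an assumed stand-in for "tens of frames deep".

```python
def frames_that_fit(stack_bytes, frame_bytes):
    """Number of frames of one size that fit if nothing else uses the stack."""
    return stack_bytes // frame_bytes


def max_average_frame(stack_bytes, depth):
    """Largest average frame size (bytes) that a call chain of 'depth' allows."""
    return stack_bytes // depth


STACK_BYTES = 16 * 1024  # the 16kB stack mentioned in the message
```

Here `frames_that_fit(STACK_BYTES, 2048)` is 8 and `frames_that_fit(STACK_BYTES, 1024)` is 16, while `max_average_frame(STACK_BYTES, 30)` is 546. A call chain of 30 frames therefore only fits if the average frame stays near half a kilobyte, and a single 2kB frame consumes about as much as four average frames in that chain.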

Arnd Bergmann

Sep 21, 2021, 11:42:18 AM
to Nathan Chancellor, Linus Torvalds, Guenter Roeck, Linux Kernel Mailing List, ll...@lists.linux.dev, Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasan-dev, Harry Wentland, Alex Deucher, Christian König, xinhui pan, amd-gfx list
On Wed, Sep 8, 2021 at 10:55 PM Nathan Chancellor <nat...@kernel.org> wrote:
> On Tue, Sep 07, 2021 at 11:11:17AM +0200, Arnd Bergmann wrote:
> > On Tue, Sep 7, 2021 at 4:32 AM Nathan Chancellor <nat...@kernel.org> wrote:
[...]
> > > arm32-fedora.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:3043:6: error: stack frame size (1376) exceeds limit (1024) in function 'bw_calcs' [-Werror,-Wframe-larger-than]
> > > arm32-fedora.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:77:13: error: stack frame size (5384) exceeds limit (1024) in function 'calculate_bandwidth' [-Werror,-Wframe-larger-than]
> > >
> > > Aside from the dce_calcs.c warnings, these do not seem too bad. I
> > > believe allmodconfig turns on UBSAN but it could also be aggressive
> > > inlining by clang. I intend to look at all -Wframe-larger-than warnings
> > > closely later.
> >
> > I've had them close to zero in the past, but a couple of new ones came in.
> >
> > The amdgpu ones are probably not fixable unless they stop using 64-bit
> > floats in the kernel for random calculations. The crypto/* ones tend to
> > be compiler bugs, but hard to fix
>
> I have started taking a look at these. Most of the allmodconfig ones
> appear to be related to CONFIG_KASAN, which is now supported for
> CONFIG_ARM.
>
> The two in bpmp-debugfs.c appear regardless of CONFIG_KASAN and it turns
> out that you actually submitted a patch for these:
>
> https://lore.kernel.org/r/20201204193714...@kernel.org/
>
> Is it worth resending or pinging that?

I'm now restarting from a clean tree for my randconfig patches to see which
ones are actually needed, will hopefully get to that.

> The dce_calcs.c ones also appear without CONFIG_KASAN, which you noted
> is probably unavoidable.

(adding amdgpu folks to Cc here)

Harry Wentland did a nice rework for dcn_calcs.c that should also be
portable to dce_calcs.c, I hope that he will be able to get to that as well.

Looking at my older patches now, I found that I had only suppressed that one
and given up fixing it, but I did put my analysis into
https://bugs.llvm.org/show_bug.cgi?id=42551, which should be helpful
for addressing it in either the kernel or the compiler.

Arnd