[syzbot] upstream build error (23)

syzbot

Jul 29, 2025, 9:43:37 AM
to b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
Hello,

syzbot found the following issue on:

HEAD commit: 86aa72182095 Merge tag 'chrome-platform-v6.17' of git://gi..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=171674a2580000
kernel config: https://syzkaller.appspot.com/x/.config?x=3816ffa0a2bab886
dashboard link: https://syzkaller.appspot.com/bug?extid=5245cb609175fb6e8122
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+5245cb...@syzkaller.appspotmail.com

arch/x86/kernel/setup.c:1251: undefined reference to `efi_mem_type'
ld: arch/x86/kernel/setup.c:987: undefined reference to `efi_init'
ld: arch/x86/kernel/setup.c:971: undefined reference to `efi_memblock_x86_reserve_range'
arch/x86/kernel/cpu/mshyperv.c:496: undefined reference to `isolation_type_tdx'
ld: arch/x86/kernel/cpu/mshyperv.c:494: undefined reference to `isolation_type_snp'
arch/x86/kernel/kvm.c:600: undefined reference to `efi'
ld: arch/x86/kernel/kvm.c:600: undefined reference to `efi'
drivers/acpi/osl.c:210: undefined reference to `efi'
ld: drivers/acpi/osl.c:210: undefined reference to `efi'

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Borislav Petkov

Jul 29, 2025, 10:25:38 AM
to syzbot, Ard Biesheuvel, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
+ Ard.

On Tue, Jul 29, 2025 at 06:43:32AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 86aa72182095 Merge tag 'chrome-platform-v6.17' of git://gi..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=171674a2580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=3816ffa0a2bab886
> dashboard link: https://syzkaller.appspot.com/bug?extid=5245cb609175fb6e8122
> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+5245cb...@syzkaller.appspotmail.com
>
> arch/x86/kernel/setup.c:1251: undefined reference to `efi_mem_type'
> ld: arch/x86/kernel/setup.c:987: undefined reference to `efi_init'
> ld: arch/x86/kernel/setup.c:971: undefined reference to `efi_memblock_x86_reserve_range'
> arch/x86/kernel/cpu/mshyperv.c:496: undefined reference to `isolation_type_tdx'
> ld: arch/x86/kernel/cpu/mshyperv.c:494: undefined reference to `isolation_type_snp'
> arch/x86/kernel/kvm.c:600: undefined reference to `efi'
> ld: arch/x86/kernel/kvm.c:600: undefined reference to `efi'
> drivers/acpi/osl.c:210: undefined reference to `efi'
> ld: drivers/acpi/osl.c:210: undefined reference to `efi'

# CONFIG_EFI is not set

If that's a random config, why do we care?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Aleksandr Nogikh

Jul 29, 2025, 10:32:54 AM
to Borislav Petkov, syzbot, Ard Biesheuvel, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
It's not a random config - syzbot uses a set of fixed kernel configs
(most without CONFIG_EFI, as in this report) which used to work well
before today.

--
Aleksandr

Borislav Petkov

Jul 29, 2025, 10:49:25 AM
to Aleksandr Nogikh, syzbot, Ard Biesheuvel, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, x...@kernel.org
On Tue, Jul 29, 2025 at 04:32:39PM +0200, Aleksandr Nogikh wrote:
> It's not a random config - syzbot uses a set of fixed kernel configs
> (most without CONFIG_EFI, as in this report) which used to work well
> before today.

Can you bisect perhaps?

I'm assuming 6.16 is fine with that .config and if so, that has likely been
introduced by some of the latest merges of the currently open merge window...

Thx.

Thomas Gleixner

Jul 29, 2025, 3:36:19 PM
to syzbot, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, x...@kernel.org, Linus Torvalds
On Tue, Jul 29 2025 at 06:43, syzbot wrote:
> syzbot found the following issue on:
>
> HEAD commit: 86aa72182095 Merge tag 'chrome-platform-v6.17' of git://gi..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=171674a2580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=3816ffa0a2bab886
> dashboard link: https://syzkaller.appspot.com/bug?extid=5245cb609175fb6e8122
> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+5245cb...@syzkaller.appspotmail.com
>
> arch/x86/kernel/setup.c:1251: undefined reference to `efi_mem_type'
> ld: arch/x86/kernel/setup.c:987: undefined reference to `efi_init'

Cute. So the code has:

        if (efi_enabled(EFI_BOOT))
                efi_init();

in the CONFIG_EFI=n case:

        static inline bool efi_enabled(int feature)
        {
                return false;
        }

efi_init() has an unconditional forward declaration:

        extern void efi_init (void);

This has been the case forever and has been optimized out because
efi_enabled() evaluates to a constant.

I haven't checked which sanitizer option causes GCC to compile this
into:

00000000000000d0 <efi_enabled.constprop.0>:
}
extern void efi_find_mirror(void);
#else
static inline bool efi_enabled(int feature)
{
        return false;
  d0:   e8 00 00 00 00          call   d5 <efi_enabled.constprop.0+0x5>
}
  d5:   31 c0                   xor    %eax,%eax
  d7:   e9 00 00 00 00          jmp    dc <efi_enabled.constprop.0+0xc>

and to keep the call for efi_init() as a symbol for the linker to
resolve, which obviously fails.

If I change the efi_enabled() stub to __always_inline, it's optimized
out.

Disabling CONFIG_KCOV_INSTRUMENT_ALL makes it go away. So GCC confuses
the optimizer when CONFIG_KCOV_INSTRUMENT_ALL is on.

The kernel is full of such inline (not __always_inline) stub
conditionals which evaluate to a constant....

Thanks,

tglx
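
A minimal sketch of the stub pattern discussed in this message, in plain userspace C rather than kernel code. The names feature_enabled(), feature_init(), setup_sketch() and init_calls are hypothetical stand-ins for efi_enabled(), efi_init() and the call site in arch/x86/kernel/setup.c. Unlike the CONFIG_EFI=n kernel build, feature_init() is defined here so that the sketch links at any optimization level; in the kernel the symbol is only declared, so the link succeeds only if the compiler folds the stub to a constant and drops the call.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's __always_inline. */
#define ALWAYS_INLINE inline __attribute__((__always_inline__))

/* Counts calls so a caller can observe whether the guarded branch ran. */
static int init_calls;

/* Unconditional forward declaration, mirroring "extern void efi_init(void);". */
void feature_init(void);

/*
 * Assumption: defined here only so the sketch links everywhere. With
 * CONFIG_EFI=n the kernel has just the declaration above and no body.
 */
void feature_init(void)
{
	init_calls++;
}

/*
 * Stand-in for the CONFIG_EFI=n stub. always_inline asks the compiler to
 * fold the constant into every caller instead of emitting an out-of-line
 * copy of the stub that the call site would then have to call.
 */
static ALWAYS_INLINE bool feature_enabled(int feature)
{
	(void)feature;
	return false;
}

/* Stand-in for the call site: the branch is dead once the stub is inlined. */
int setup_sketch(void)
{
	if (feature_enabled(1))
		feature_init();
	return init_calls;
}
```

With a plain "inline" stub the compiler is free to keep an out-of-line copy, as in the disassembly above; the call site then no longer sees a constant and the reference to the guarded function survives until link time.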




Thomas Gleixner

Jul 29, 2025, 5:17:47 PM
to syzbot, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, x...@kernel.org, Kees Cook, Linus Torvalds
On Tue, Jul 29 2025 at 21:36, Thomas Gleixner wrote:
> On Tue, Jul 29 2025 at 06:43, syzbot wrote:
> and to keep the call for efi_init() as a symbol for the linker to
> resolve, which obviously fails.
>
> If I change the efi_enabled() stub to __always_inline, it's optimized
> out.

Kees has addressed similar problems in:

8245d47cfaba ("x86: Handle KCOV __init vs inline mismatches")
65c430906eff ("arm64: Handle KCOV __init vs inline mismatches")
c64d6be1a6f8 ("s390: Handle KCOV __init vs inline mismatches")
2424fe1cac4f ("arm: Handle KCOV __init vs inline mismatches")
d01daf9d95c9 ("mips: Handle KCOV __init vs inline mismatch")

> Disabling CONFIG_KCOV_INSTRUMENT_ALL makes it go away. So GCC confuses
> the optimizer when CONFIG_KCOV_INSTRUMENT_ALL is on.

Seems to be GCC 12 specific. GCC13 does not have that problem.

> The kernel is full of such inline (not __always_inline) stub
> conditionals which evaluate to a constant....

And chasing all those stubs and converting them to __always_inline seems
to be a whack-a-mole game.

Can we just stop pretending that GCC12 is KCOV capable?

Thanks,

tglx


Linus Torvalds

Jul 29, 2025, 5:39:14 PM
to Thomas Gleixner, syzbot, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, x...@kernel.org, Kees Cook
On Tue, 29 Jul 2025 at 14:17, Thomas Gleixner <tg...@linutronix.de> wrote:
>
> Can we just stop pretending that GCC12 is KCOV capable?

Yeah, that does seem to be the right thing to do. KCOV just isn't
important enough to

(a) play constant whack-a-mole with

(b) pretend we support broken compilers for

and people who want KCOV can damn well get a fixed compiler.

We already have *some* amount of compiler dependency there, since KCOV has this:

        depends on !ARCH_WANTS_NO_INSTR || HAVE_NOINSTR_HACK || \
                   GCC_VERSION >= 120000 || CC_IS_CLANG

but clearly that allows for gcc-12 - and allows for other versions too
for that NOINSTR thing.

And x86 sets "HAVE_NOINSTR_HACK" because of some argument that objtool
fixes whatever problems there were.

So it's not just about changing that GCC_VERSION number - there's some
interaction with other crazy KCOV hacks, in particular I think the
whole NOINSTR hack is about 0f1441b44e82 ("objtool: Fix noinstr vs
KCOV")

I'd personally be perfectly happy just saying "gcc-13 is required" and
presumably that allows just removing the NOINSTR_HACK thing too.

But I would want somebody to test that and verify that gcc-13 really does do ok.

Linus

Borislav Petkov

Jul 29, 2025, 5:52:54 PM
to Linus Torvalds, Thomas Gleixner, syzbot, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, x...@kernel.org, Kees Cook
On Tue, Jul 29, 2025 at 02:38:50PM -0700, Linus Torvalds wrote:
> I'd personally be perfectly happy just saying "gcc-13 is required" and
> presumably that allows just removing the NOINSTR_HACK thing too.
>
> But I would want somebody to test that and verify that gcc-13 really does do ok.

I was just typing a reply to tglx and saw your mail snow in...

So:

I triggered the same thing today with:

$ gcc-13 --version
gcc-13 (Debian 13.2.0-25) 13.2.0

And with

gcc (Debian 13.3.0-15) 13.3.0

on the other machine.

I'm thinking if this has worked before, then it must be something coming in
during the merge window...

Because 6.16 with the same compiler and kernel builds fine!

So it is something during the merge window *plus* gcc-13!

Your current master which fails with gcc-13 here builds fine with:

gcc (Debian 14.2.0-16) 14.2.0
Copyright (C) 2024 Free Software Foundation, Inc.

I'll run those again tomorrow on a clear head to confirm but it sure sounds
more nasty than just gcc-13 is fine.

Linus Torvalds

Jul 29, 2025, 6:12:13 PM
to Borislav Petkov, Thomas Gleixner, syzbot, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, x...@kernel.org, Kees Cook
On Tue, 29 Jul 2025 at 14:52, Borislav Petkov <b...@alien8.de> wrote:
>
> I triggered the same thing today with:
>
> $ gcc-13 --version
> gcc-13 (Debian 13.2.0-25) 13.2.0

Bah. I should have connected the dots and looked at my own compiler
version, because I saw a variation of this same thing yesterday that
caused

section mismatch in reference: volume_set_software_mute+0x6f
(section: .text.unlikely) -> tpacpi_is_lenovo (section: .init.text)

due to gcc not inlining a single-instruction function.

And yes, KCOV was part of it.

And I have gcc version 15.1.1, so clearly "upgrade gcc" isn't the answer.

> I'm thinking if this has worked before, then it must be something coming in
> during the merge window...

The thing that triggered it is apparently commit 381a38ea53d2
("init.h: Disable sanitizer coverage for __init and __head")

Which is supposed to _lessen_ the sanitizer coverage by adding the
__attribute__((no_sanitize("coverage"))), but it's clearly causing
more problems and making gcc just do crazy things.

Linus

Kees Cook

Jul 29, 2025, 6:12:23 PM
to Thomas Gleixner, syzbot, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, x...@kernel.org, Linus Torvalds, Arnd Bergmann
On Tue, Jul 29, 2025 at 11:17:41PM +0200, Thomas Gleixner wrote:
> On Tue, Jul 29 2025 at 21:36, Thomas Gleixner wrote:
> > On Tue, Jul 29 2025 at 06:43, syzbot wrote:
> > and to keep the call for efi_init() as a symbol for the linker to
> > resolve, which obviously fails.
> >
> > If I change the efi_enabled() stub to __always_inline, it's optimized
> > out.
>
> Kees has addressed similar problems in:

The change that I made that is triggering these warnings is:
381a38ea53d2 ("init.h: Disable sanitizer coverage for __init and __head")

The __init vs inline warnings I saw as I was working on this and had
been tackling them. Arnd found a couple more recently, too:
https://lore.kernel.org/all/f8bcf5ce-8b8b-4555...@app.fastmail.com/

> > Disabling CONFIG_KCOV_INSTRUMENT_ALL makes it go away. So GCC confuses
> > the optimizer when CONFIG_KCOV_INSTRUMENT_ALL is on.
>
> Seems to be GCC 12 specific. GCC13 does not have that problem.

Now this 'efi' issue got noticed too, and it seemed to be a preexisting
problem with GCC 12:
https://lore.kernel.org/all/202507272255.50254C0C@keescook/

There were the same linking problems even before 381a38ea53d2.

> > The kernel is full of such inline (not __always_inline) stub
> > conditionals which evaluate to a constant....
>
> And chasing all those stubs and converting them to __always_inline seems
> to be a whack-a-mole game.
>
> Can we just stop pretending that GCC12 is KCOV capable?

That's fine by me, but I do think something weirder is happening here.
Those efi linkages should be entirely DCE'ed?

--
Kees Cook

Kees Cook

Jul 29, 2025, 6:27:10 PM
to Linus Torvalds, Borislav Petkov, Thomas Gleixner, syzbot, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, x...@kernel.org
On Tue, Jul 29, 2025 at 03:11:50PM -0700, Linus Torvalds wrote:
> Which is supposed to _lessen_ the sanitizer coverage by adding the
> __attribute__((no_sanitize("coverage"))), but it's clearly causing
> more problems and making gcc just do crazy things.

Since this change was only made for Clang's stack depth coverage
analysis, let's drop it from GCC builds? I'm testing this currently:

diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
index 6bfdaeddbae8..5a68e9db6518 100644
--- a/arch/x86/include/asm/init.h
+++ b/arch/x86/include/asm/init.h
@@ -5,7 +5,7 @@
 #if defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION < 170000
 #define __head __section(".head.text") __no_sanitize_undefined __no_stack_protector
 #else
-#define __head __section(".head.text") __no_sanitize_undefined __no_sanitize_coverage
+#define __head __section(".head.text") __no_sanitize_undefined __no_kstack_erase
 #endif

 struct x86_mapping_info {
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 2b77d12e07b2..89e2c01fc8b1 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -378,6 +378,13 @@ struct ftrace_likely_data {
 # define __signed_wrap
 #endif

+/* GCC does not like splitting sanitizer coverage across section inlines */
+#ifdef CC_IS_CLANG
+#define __no_kstack_erase __no_sanitize_coverage
+#else
+#define __no_kstack_erase
+#endif
+
 /* Section for code which can't be instrumented at all */
 #define __noinstr_section(section) \
         noinline notrace __attribute((__section__(section))) \
diff --git a/include/linux/init.h b/include/linux/init.h
index c65a050d52a7..a60d32d227ee 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -51,7 +51,7 @@
    discard it in modules) */
 #define __init __section(".init.text") __cold __latent_entropy \
                __noinitretpoline \
-               __no_sanitize_coverage
+               __no_kstack_erase
 #define __initdata __section(".init.data")
 #define __initconst __section(".init.rodata")
 #define __exitdata __section(".exit.data")

--
Kees Cook

Thomas Gleixner

Jul 30, 2025, 6:44:39 AM
to Kees Cook, syzbot, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, linux-...@vger.kernel.org, mi...@redhat.com, syzkall...@googlegroups.com, x...@kernel.org, Linus Torvalds, Arnd Bergmann
Of course.

Though it un-inlines the stub function and slaps the sanitizer call into
it, which seems to prevent DCE from dropping it:

        .type   efi_enabled.constprop.0, @function
efi_enabled.constprop.0:
.LASANPC6082:
.LFB6082:
        .file 5 "/home/tglx/work/kernel/linus/linux/include/linux/efi.h"
        .loc 5 891 20 is_stmt 1 view -0
        .cfi_startproc
.LVL13:
        .loc 5 893 2 view .LVU43
        .loc 5 893 9 is_stmt 0 view .LVU44
        call    __sanitizer_cov_trace_pc
.LVL14:
        .loc 5 894 1 view .LVU45
        xorl    %eax, %eax
        jmp     __x86_return_thunk
        .cfi_endproc

We had similar issues with function tracing in the past where different
GCC versions decided un-inlining at random places, so we ended up adding
notrace to the inline define.

Adding __no_sanitize_coverage as well is curing it for good.

Thanks,

tglx
---
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 2b77d12e07b2..46f7722039c3 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -233,7 +233,7 @@ struct ftrace_likely_data {
  * of extern inline functions at link time.
  * A lot of inline functions can cause havoc with function tracing.
  */
-#define inline inline __gnu_inline __inline_maybe_unused notrace
+#define inline inline __gnu_inline __inline_maybe_unused notrace __no_sanitize_coverage

 /*
  * gcc provides both __inline__ and __inline as alternate spellings of
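
A rough userspace sketch of what the change above amounts to for a single stub, assuming a compiler that understands __has_attribute. NO_SANITIZE_COVERAGE, feature_enabled() and count_enabled() are hypothetical names; the kernel's real __no_sanitize_coverage is defined in its compiler headers and may be spelled differently per compiler. Where the attribute is unavailable, the macro expands to nothing and the code still builds.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's __no_sanitize_coverage.
 * Falls back to an empty definition when the attribute is not supported.
 */
#if defined(__has_attribute)
# if __has_attribute(__no_sanitize_coverage__)
#  define NO_SANITIZE_COVERAGE __attribute__((__no_sanitize_coverage__))
# endif
#endif
#ifndef NO_SANITIZE_COVERAGE
# define NO_SANITIZE_COVERAGE
#endif

/*
 * A constant stub in the style of the CONFIG_EFI=n efi_enabled(). The
 * attribute asks the compiler not to insert coverage callbacks here, so
 * an out-of-line copy, if one is emitted, contains no tracing call.
 */
static inline NO_SANITIZE_COVERAGE bool feature_enabled(int feature)
{
	(void)feature;
	return false;
}

/* Counts how many of the first four feature bits the stub reports as set. */
int count_enabled(void)
{
	int n = 0;

	for (int f = 0; f < 4; f++)
		if (feature_enabled(f))
			n++;
	return n;
}
```

Whether the attribute also restores inlining and dead-code elimination of the guarded call depends on the compiler version, which is the open question in this thread.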