[PATCH v5 00/36] Warn on orphan section placement

Kees Cook

Jul 31, 2020, 7:08:34 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=linker/orphans/warn/v5

v5:
- rebase from -rc2 to -rc7 to avoid build failures on Clang vs binutils
- include Arvind's GOT fix-up series[3], since it touches many similar areas
- add PGO/AutoFDO section patch[4]
- split up x86 and arm changes into more digestible steps
- move several sections out of DISCARD and into zero-size asserts
- introduce COMMON_DISCARDS to improve ARM's linker scripts
v4: https://lore.kernel.org/lkml/20200629061840.4...@chromium.org/
v3: https://lore.kernel.org/lkml/20200624014940.1...@chromium.org/
v2: https://lore.kernel.org/lkml/20200622205815.2...@chromium.org/
v1: https://lore.kernel.org/lkml/20200228002244....@chromium.org/

A recent bug[1] was solved for builds linked with ld.lld, and tracking
it down took way longer than it needed to (a year). Ultimately, it
boiled down to differences between ld.bfd and ld.lld's handling of
orphan sections. Similar situations have continued to recur, and it's
clear the kernel build needs to be much more explicit about linker
sections. Similarly, the recent FGKASLR series brought up orphan section
handling too[2]. In all cases, it would have been nice if the linker had been
running with --orphan-handling=warn so that surprise sections wouldn't
silently get mapped into the kernel image at locations up to the whim
of the linker's orphan handling logic. Instead, all desired sections
should be explicitly identified in the linker script (to be either kept,
discarded, or verified to be zero-sized) with any orphans throwing a
warning. The powerpc architecture has actually been doing this for some
time, so this series just extends that coverage to x86, arm, and arm64.
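
To make those three options concrete, here is a minimal linker script
sketch (illustrative only; the section names are examples rather than
lines taken verbatim from this series). Every input section that would
otherwise be an orphan gets one of the three treatments:

    SECTIONS
    {
        /* kept: explicitly placed into the output image */
        .text : { *(.text .text.*) }

        /* discarded: never emitted into the output */
        /DISCARD/ : { *(.discard) *(.discard.*) }

        /* collected only so the assertion below can verify it stays empty */
        .plt : { *(.plt) *(.plt.*) }
    }
    ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!")

With every expected section named in one of these ways,
--orphan-handling=warn only fires for sections nobody anticipated.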

This has gotten successful build testing under the following matrix:

compiler/linker: gcc+ld.bfd, clang+ld.lld
targets: defconfig, allmodconfig
architectures: x86_64, i386, arm64, arm
versions: v5.8-rc7, next-20200731 (with various build fixes[7][8])

Two known-failure exceptions (unchanged by this series) are:
- clang+arm/arm64 needs CONFIG_CPU_BIG_ENDIAN=n to pass allmodconfig[5]
- clang+i386 only builds in -next, which was recently fixed[6]

All three architectures depend on the first several commits to
vmlinux.lds.h. x86 depends on Arvind's GOT series[3], so I collected it
into this version of my series, as it hadn't been taken into -tip yet.
arm64 depends on the efi/libstub patch. As such, I'd like to land this
series as a whole. Given that two thirds of it is in the arm universe,
perhaps this can land via the arm64 tree? If x86 -tip is preferred, that
works too. If I don't hear otherwise, I will just carry this myself in
-next. In all cases, I would really appreciate reviews/acks/etc. :)

Thanks!

-Kees

[1] https://github.com/ClangBuiltLinux/linux/issues/282
[2] https://lore.kernel.org/lkml/202002242122.AA4D1B8@keescook/
[3] https://lore.kernel.org/lkml/20200715004133.1...@alum.mit.edu/
[4] https://lore.kernel.org/lkml/20200625184752.73...@google.com/
[5] https://github.com/ClangBuiltLinux/linux/issues/1071
[6] https://github.com/ClangBuiltLinux/linux/issues/194
[7] https://lore.kernel.org/lkml/1596166744-2954-2-gi...@mediatek.com/
[8] https://lore.kernel.org/lkml/82f750c4-d423-1ed8...@huawei.com/


Ard Biesheuvel (3):
x86/boot/compressed: Move .got.plt entries out of the .got section
x86/boot/compressed: Force hidden visibility for all symbol references
x86/boot/compressed: Get rid of GOT fixup code

Arvind Sankar (4):
x86/boot: Add .text.* to setup.ld
x86/boot: Remove run-time relocations from .head.text code
x86/boot: Remove run-time relocations from head_{32,64}.S
x86/boot: Check that there are no run-time relocations

Kees Cook (28):
vmlinux.lds.h: Create COMMON_DISCARDS
vmlinux.lds.h: Add .gnu.version* to COMMON_DISCARDS
vmlinux.lds.h: Avoid KASAN and KCSAN's unwanted sections
vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUG
vmlinux.lds.h: Add .symtab, .strtab, and .shstrtab to ELF_DETAILS
efi/libstub: Disable -mbranch-protection
arm64/mm: Remove needless section quotes
arm64/kernel: Remove needless Call Frame Information annotations
arm64/build: Remove .eh_frame* sections due to unwind tables
arm64/build: Use common DISCARDS in linker script
arm64/build: Add missing DWARF sections
arm64/build: Assert for unwanted sections
arm64/build: Warn on orphan section placement
arm/build: Refactor linker script headers
arm/build: Explicitly keep .ARM.attributes sections
arm/build: Add missing sections
arm/build: Warn on orphan section placement
arm/boot: Handle all sections explicitly
arm/boot: Warn on orphan section placement
x86/asm: Avoid generating unused kprobe sections
x86/build: Enforce an empty .got.plt section
x86/build: Assert for unwanted sections
x86/build: Warn on orphan section placement
x86/boot/compressed: Reorganize zero-size section asserts
x86/boot/compressed: Remove, discard, or assert for unwanted sections
x86/boot/compressed: Add missing debugging sections to output
x86/boot/compressed: Warn on orphan section placement
arm/build: Assert for unwanted sections

Nick Desaulniers (1):
vmlinux.lds.h: add PGO and AutoFDO input sections

arch/alpha/kernel/vmlinux.lds.S | 1 +
arch/arc/kernel/vmlinux.lds.S | 1 +
arch/arm/Makefile | 4 +
arch/arm/boot/compressed/Makefile | 2 +
arch/arm/boot/compressed/vmlinux.lds.S | 20 +--
.../arm/{kernel => include/asm}/vmlinux.lds.h | 29 ++-
arch/arm/kernel/vmlinux-xip.lds.S | 8 +-
arch/arm/kernel/vmlinux.lds.S | 8 +-
arch/arm64/Makefile | 9 +-
arch/arm64/kernel/smccc-call.S | 2 -
arch/arm64/kernel/vmlinux.lds.S | 28 ++-
arch/arm64/mm/mmu.c | 2 +-
arch/csky/kernel/vmlinux.lds.S | 1 +
arch/hexagon/kernel/vmlinux.lds.S | 1 +
arch/ia64/kernel/vmlinux.lds.S | 1 +
arch/mips/kernel/vmlinux.lds.S | 1 +
arch/nds32/kernel/vmlinux.lds.S | 1 +
arch/nios2/kernel/vmlinux.lds.S | 1 +
arch/openrisc/kernel/vmlinux.lds.S | 1 +
arch/parisc/boot/compressed/vmlinux.lds.S | 1 +
arch/parisc/kernel/vmlinux.lds.S | 1 +
arch/powerpc/kernel/vmlinux.lds.S | 2 +-
arch/riscv/kernel/vmlinux.lds.S | 1 +
arch/s390/kernel/vmlinux.lds.S | 1 +
arch/sh/kernel/vmlinux.lds.S | 1 +
arch/sparc/kernel/vmlinux.lds.S | 1 +
arch/um/kernel/dyn.lds.S | 2 +-
arch/um/kernel/uml.lds.S | 2 +-
arch/unicore32/kernel/vmlinux.lds.S | 1 +
arch/x86/Makefile | 4 +
arch/x86/boot/compressed/Makefile | 41 +----
arch/x86/boot/compressed/head_32.S | 99 ++++-------
arch/x86/boot/compressed/head_64.S | 165 +++++++-----------
arch/x86/boot/compressed/mkpiggy.c | 6 +
arch/x86/boot/compressed/vmlinux.lds.S | 48 ++++-
arch/x86/boot/setup.ld | 2 +-
arch/x86/include/asm/asm.h | 6 +-
arch/x86/kernel/vmlinux.lds.S | 39 ++++-
drivers/firmware/efi/libstub/Makefile | 11 +-
drivers/firmware/efi/libstub/hidden.h | 6 -
include/asm-generic/vmlinux.lds.h | 49 +++++-
include/linux/hidden.h | 19 ++
42 files changed, 377 insertions(+), 252 deletions(-)
rename arch/arm/{kernel => include/asm}/vmlinux.lds.h (84%)
delete mode 100644 drivers/firmware/efi/libstub/hidden.h
create mode 100644 include/linux/hidden.h

--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:35 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Ard Biesheuvel, Arvind Sankar, Sedat Dilek, Nick Desaulniers, Catalin Marinas, Mark Rutland, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

The .got.plt section contains the part of the GOT which is used by PLT
entries, and which gets updated lazily by the dynamic loader when
function calls are dispatched through those PLT entries.

On fully linked binaries such as the kernel proper or the decompressor,
this never happens, and so in practice, the .got.plt section consists
only of the first 3 magic entries that are meant to point at the _DYNAMIC
section and at the fixup routine in the loader. However, since we don't
use a dynamic loader, those entries are never populated or used.

This means that treating those entries like ordinary GOT entries, and
updating their values based on the actual placement of the executable in
memory is completely pointless, and we can just ignore the .got.plt
section entirely, provided that it has no additional entries beyond
the first 3.

So add an assertion in the linker script to ensure that this assumption
holds (the 3 reserved entries are 8 bytes each on 64-bit and 4 bytes each
on 32-bit, hence the 0x18 and 0xc size limits in the assertion), and move
the contents out of the [_got, _egot) memory range that is modified by
the GOT fixup routines.

While at it, drop the KEEP(), since it has no effect on the contents
of output sections that are created by the linker itself.

Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
Tested-by: Sedat Dilek <sedat...@gmail.com>
Tested-by: Nick Desaulniers <ndesau...@google.com>
Reviewed-by: Kees Cook <kees...@chromium.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/x86/boot/compressed/vmlinux.lds.S | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 8f1025d1f681..b17d218ccdf9 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -44,10 +44,13 @@ SECTIONS
}
.got : {
_got = .;
- KEEP(*(.got.plt))
KEEP(*(.got))
_egot = .;
}
+ .got.plt : {
+ *(.got.plt)
+ }
+
.data : {
_data = . ;
*(.data)
@@ -77,3 +80,9 @@ SECTIONS

DISCARDS
}
+
+#ifdef CONFIG_X86_64
+ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
+#else
+ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
+#endif
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:36 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Ard Biesheuvel, Nick Desaulniers, Arvind Sankar, Sedat Dilek, Catalin Marinas, Mark Rutland, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

In a previous patch, we eliminated GOT entries from the decompressor
binary and added an assertion that the .got section is empty. This means
that the GOT fixup routines that exist in both the 32-bit and 64-bit
startup routines have become dead code, and can be removed.

While at it, drop the KEEP() from the linker script, as it has no effect
on the contents of output sections that are created by the linker itself.

Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Tested-by: Nick Desaulniers <ndesau...@google.com>
Reviewed-by: Kees Cook <kees...@chromium.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
Tested-by: Sedat Dilek <sedat...@gmail.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/x86/boot/compressed/head_32.S | 24 ++---------
arch/x86/boot/compressed/head_64.S | 57 --------------------------
arch/x86/boot/compressed/vmlinux.lds.S | 4 +-
3 files changed, 5 insertions(+), 80 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 03557f2174bf..39f0bb43218f 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -49,16 +49,13 @@
* Position Independent Executable (PIE) so that linker won't optimize
* R_386_GOT32X relocation to its fixed symbol address. Older
* linkers generate R_386_32 relocations against locally defined symbols,
- * _bss, _ebss, _got, _egot and _end, in PIE. It isn't wrong, just less
- * optimal than R_386_RELATIVE. But the x86 kernel fails to properly handle
- * R_386_32 relocations when relocating the kernel. To generate
- * R_386_RELATIVE relocations, we mark _bss, _ebss, _got, _egot and _end as
- * hidden:
+ * _bss, _ebss and _end, in PIE. It isn't wrong, just less optimal than
+ * R_386_RELATIVE. But the x86 kernel fails to properly handle R_386_32
+ * relocations when relocating the kernel. To generate R_386_RELATIVE
+ * relocations, we mark _bss, _ebss and _end as hidden:
*/
.hidden _bss
.hidden _ebss
- .hidden _got
- .hidden _egot
.hidden _end

__HEAD
@@ -192,19 +189,6 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
shrl $2, %ecx
rep stosl

-/*
- * Adjust our own GOT
- */
- leal _got(%ebx), %edx
- leal _egot(%ebx), %ecx
-1:
- cmpl %ecx, %edx
- jae 2f
- addl %ebx, (%edx)
- addl $4, %edx
- jmp 1b
-2:
-
/*
* Do the extraction, and jump to the new kernel..
*/
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 97d37f0a34f5..bf1ab30acc5b 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -40,8 +40,6 @@
*/
.hidden _bss
.hidden _ebss
- .hidden _got
- .hidden _egot
.hidden _end

__HEAD
@@ -353,25 +351,6 @@ SYM_CODE_START(startup_64)
/* Set up the stack */
leaq boot_stack_end(%rbx), %rsp

- /*
- * paging_prepare() and cleanup_trampoline() below can have GOT
- * references. Adjust the table with address we are running at.
- *
- * Zero RAX for adjust_got: the GOT was not adjusted before;
- * there's no adjustment to undo.
- */
- xorq %rax, %rax
-
- /*
- * Calculate the address the binary is loaded at and use it as
- * a GOT adjustment.
- */
- call 1f
-1: popq %rdi
- subq $1b, %rdi
-
- call .Ladjust_got
-
/*
* At this point we are in long mode with 4-level paging enabled,
* but we might want to enable 5-level paging or vice versa.
@@ -464,21 +443,6 @@ trampoline_return:
pushq $0
popfq

- /*
- * Previously we've adjusted the GOT with address the binary was
- * loaded at. Now we need to re-adjust for relocation address.
- *
- * Calculate the address the binary is loaded at, so that we can
- * undo the previous GOT adjustment.
- */
- call 1f
-1: popq %rax
- subq $1b, %rax
-
- /* The new adjustment is the relocation address */
- movq %rbx, %rdi
- call .Ladjust_got
-
/*
* Copy the compressed kernel to the end of our buffer
* where decompression in place becomes safe.
@@ -556,27 +520,6 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
jmp *%rax
SYM_FUNC_END(.Lrelocated)

-/*
- * Adjust the global offset table
- *
- * RAX is the previous adjustment of the table to undo (use 0 if it's the
- * first time we touch GOT).
- * RDI is the new adjustment to apply.
- */
-.Ladjust_got:
- /* Walk through the GOT adding the address to the entries */
- leaq _got(%rip), %rdx
- leaq _egot(%rip), %rcx
-1:
- cmpq %rcx, %rdx
- jae 2f
- subq %rax, (%rdx) /* Undo previous adjustment */
- addq %rdi, (%rdx) /* Apply the new adjustment */
- addq $8, %rdx
- jmp 1b
-2:
- ret
-
.code32
/*
* This is the 32-bit trampoline that will be copied over to low memory.
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 4bcc943842ab..a4a4a59a2628 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -43,9 +43,7 @@ SECTIONS
_erodata = . ;
}
.got : {
- _got = .;
- KEEP(*(.got))
- _egot = .;
+ *(.got)
}
.got.plt : {
*(.got.plt)
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:37 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Arvind Sankar, Nick Desaulniers, Ard Biesheuvel, Fangrui Song, Sedat Dilek, Catalin Marinas, Mark Rutland, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
From: Arvind Sankar <nive...@alum.mit.edu>

The BFD linker generates run-time relocations for z_input_len and
z_output_len, even though they are absolute symbols.

This is fixed for binutils-2.35 [1]. Work around this for earlier
versions by defining two variables, input_len and output_len, in addition
to the symbols, and using them via position-independent references.

This eliminates the last two run-time relocations in the head code and
allows us to drop the -z noreloc-overflow flag to the linker.

Move the -pie and --no-dynamic-linker LDFLAGS to LDFLAGS_vmlinux instead
of KBUILD_LDFLAGS. There shouldn't be anything else getting linked, but
this is the more logical location for these flags, and modversions might
call the linker if an EXPORT_SYMBOL is left over accidentally in one of
the decompressors.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=25754

Tested-by: Nick Desaulniers <ndesau...@google.com>
Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Tested-by: Sedat Dilek <sedat...@gmail.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/x86/boot/compressed/Makefile | 12 ++----------
arch/x86/boot/compressed/head_32.S | 17 ++++++++---------
arch/x86/boot/compressed/head_64.S | 4 ++--
arch/x86/boot/compressed/mkpiggy.c | 6 ++++++
4 files changed, 18 insertions(+), 21 deletions(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 489fea16bcfb..7db0102a573d 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -51,16 +51,8 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-ifeq ($(CONFIG_X86_32),y)
-KBUILD_LDFLAGS += $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
-else
-# To build 64-bit compressed kernel as PIE, we disable relocation
-# overflow check to avoid relocation overflow error with a new linker
-# command-line option, -z noreloc-overflow.
-KBUILD_LDFLAGS += $(shell $(LD) --help 2>&1 | grep -q "\-z noreloc-overflow" \
- && echo "-z noreloc-overflow -pie --no-dynamic-linker")
-endif
-LDFLAGS_vmlinux := -T
+LDFLAGS_vmlinux := $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
+LDFLAGS_vmlinux += -T

hostprogs := mkpiggy
HOST_EXTRACFLAGS += -I$(srctree)/tools/include
diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 8c1a4f5610f5..659fad53ca82 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -178,18 +178,17 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
/*
* Do the extraction, and jump to the new kernel..
*/
- /* push arguments for extract_kernel: */
- pushl $z_output_len /* decompressed length, end of relocs */
+ /* push arguments for extract_kernel: */

- pushl %ebp /* output address */
-
- pushl $z_input_len /* input_len */
+ pushl output_len@GOTOFF(%ebx) /* decompressed length, end of relocs */
+ pushl %ebp /* output address */
+ pushl input_len@GOTOFF(%ebx) /* input_len */
leal input_data@GOTOFF(%ebx), %eax
- pushl %eax /* input_data */
+ pushl %eax /* input_data */
leal boot_heap@GOTOFF(%ebx), %eax
- pushl %eax /* heap area */
- pushl %esi /* real mode pointer */
- call extract_kernel /* returns kernel location in %eax */
+ pushl %eax /* heap area */
+ pushl %esi /* real mode pointer */
+ call extract_kernel /* returns kernel location in %eax */
addl $24, %esp

/*
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 11429092c224..9e46729cf162 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -534,9 +534,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
movq %rsi, %rdi /* real mode address */
leaq boot_heap(%rip), %rsi /* malloc area for uncompression */
leaq input_data(%rip), %rdx /* input_data */
- movl $z_input_len, %ecx /* input_len */
+ movl input_len(%rip), %ecx /* input_len */
movq %rbp, %r8 /* output target address */
- movl $z_output_len, %r9d /* decompressed length, end of relocs */
+ movl output_len(%rip), %r9d /* decompressed length, end of relocs */
call extract_kernel /* returns kernel location in %rax */
popq %rsi

diff --git a/arch/x86/boot/compressed/mkpiggy.c b/arch/x86/boot/compressed/mkpiggy.c
index 7e01248765b2..52aa56cdbacc 100644
--- a/arch/x86/boot/compressed/mkpiggy.c
+++ b/arch/x86/boot/compressed/mkpiggy.c
@@ -60,6 +60,12 @@ int main(int argc, char *argv[])
printf(".incbin \"%s\"\n", argv[1]);
printf("input_data_end:\n");

+ printf(".section \".rodata\",\"a\",@progbits\n");
+ printf(".globl input_len\n");
+ printf("input_len:\n\t.long %lu\n", ilen);
+ printf(".globl output_len\n");
+ printf("output_len:\n\t.long %lu\n", (unsigned long)olen);
+
retval = 0;
bail:
if (f)
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:38 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
Collect the common DISCARD sections for architectures that need more
specialized discard control than what the standard DISCARDS section
provides.
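
As a rough sketch of the intended use (the arch-specific entry below is
only an example, not necessarily one handled by this series), an
architecture that maintains its own /DISCARD/ output section in a
preprocessed *.lds.S file can list its private discards and then pull in
the shared set:

    /DISCARD/ : {
        /* architecture-specific discards first... */
        *(.ARM.exidx.exit.text)
        /* ...then the entries shared with the generic DISCARDS definition */
        COMMON_DISCARDS
    }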

Signed-off-by: Kees Cook <kees...@chromium.org>
---
include/asm-generic/vmlinux.lds.h | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 052e0f05a984..ff65a20faf4c 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -930,13 +930,16 @@
EXIT_DATA
#endif

+#define COMMON_DISCARDS \
+ *(.discard) \
+ *(.discard.*) \
+ *(.modinfo)
+
#define DISCARDS \
/DISCARD/ : { \
EXIT_DISCARDS \
EXIT_CALL \
- *(.discard) \
- *(.discard.*) \
- *(.modinfo) \
+ COMMON_DISCARDS \
}

/**
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:39 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Arvind Sankar, Nick Desaulniers, Ard Biesheuvel, Fangrui Song, Sedat Dilek, Catalin Marinas, Mark Rutland, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
From: Arvind Sankar <nive...@alum.mit.edu>

Add a linker script check that there are no run-time relocations, and
remove the old one that tries to check by looking for specially-named
sections in the object files.

Drop the tests for -fPIE compiler option and -pie linker option, as they
are available in all supported gcc and binutils versions (as well as
clang and lld).

Tested-by: Nick Desaulniers <ndesau...@google.com>
Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Reviewed-by: Sedat Dilek <sedat...@gmail.com>
Tested-by: Sedat Dilek <sedat...@gmail.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/x86/boot/compressed/Makefile | 28 +++-----------------------
arch/x86/boot/compressed/vmlinux.lds.S | 8 ++++++++
2 files changed, 11 insertions(+), 25 deletions(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 7db0102a573d..96d53e300ab6 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -29,7 +29,7 @@ targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma \
vmlinux.bin.xz vmlinux.bin.lzo vmlinux.bin.lz4

KBUILD_CFLAGS := -m$(BITS) -O2
-KBUILD_CFLAGS += -fno-strict-aliasing $(call cc-option, -fPIE, -fPIC)
+KBUILD_CFLAGS += -fno-strict-aliasing -fPIE
KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
cflags-$(CONFIG_X86_32) := -march=i386
cflags-$(CONFIG_X86_64) := -mcmodel=small
@@ -51,7 +51,7 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-LDFLAGS_vmlinux := $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
+LDFLAGS_vmlinux := -pie $(call ld-option, --no-dynamic-linker)
LDFLAGS_vmlinux += -T

hostprogs := mkpiggy
@@ -86,30 +86,8 @@ vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
efi-obj-$(CONFIG_EFI_STUB) = $(objtree)/drivers/firmware/efi/libstub/lib.a

-# The compressed kernel is built with -fPIC/-fPIE so that a boot loader
-# can place it anywhere in memory and it will still run. However, since
-# it is executed as-is without any ELF relocation processing performed
-# (and has already had all relocation sections stripped from the binary),
-# none of the code can use data relocations (e.g. static assignments of
-# pointer values), since they will be meaningless at runtime. This check
-# will refuse to link the vmlinux if any of these relocations are found.
-quiet_cmd_check_data_rel = DATAREL $@
-define cmd_check_data_rel
- for obj in $(filter %.o,$^); do \
- $(READELF) -S $$obj | grep -qF .rel.local && { \
- echo "error: $$obj has data relocations!" >&2; \
- exit 1; \
- } || true; \
- done
-endef
-
-# We need to run two commands under "if_changed", so merge them into a
-# single invocation.
-quiet_cmd_check-and-link-vmlinux = LD $@
- cmd_check-and-link-vmlinux = $(cmd_check_data_rel); $(cmd_ld)
-
$(obj)/vmlinux: $(vmlinux-objs-y) $(efi-obj-y) FORCE
- $(call if_changed,check-and-link-vmlinux)
+ $(call if_changed,ld)

OBJCOPYFLAGS_vmlinux.bin := -R .comment -S
$(obj)/vmlinux.bin: vmlinux FORCE
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index a4a4a59a2628..29df99b6cc64 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -42,6 +42,12 @@ SECTIONS
*(.rodata.*)
_erodata = . ;
}
+ .rel.dyn : {
+ *(.rel.*)
+ }
+ .rela.dyn : {
+ *(.rela.*)
+ }
.got : {
*(.got)
}
@@ -85,3 +91,5 @@ ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT en
#else
ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
#endif
+
+ASSERT(SIZEOF(.rel.dyn) == 0 && SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations detected!")
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:40 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Arvind Sankar, Nick Desaulniers, Ard Biesheuvel, Fangrui Song, Sedat Dilek, Catalin Marinas, Mark Rutland, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
From: Arvind Sankar <nive...@alum.mit.edu>

The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.

On 32-bit, with lld, Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output

With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object

On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.

Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing; however, the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.

Tested-by: Nick Desaulniers <ndesau...@google.com>
Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Tested-by: Sedat Dilek <sedat...@gmail.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/x86/boot/compressed/head_32.S | 64 +++++++-----------
arch/x86/boot/compressed/head_64.S | 104 ++++++++++++++++++-----------
2 files changed, 90 insertions(+), 78 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 39f0bb43218f..8c1a4f5610f5 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -33,26 +33,10 @@
#include <asm/bootparam.h>

/*
- * The 32-bit x86 assembler in binutils 2.26 will generate R_386_GOT32X
- * relocation to get the symbol address in PIC. When the compressed x86
- * kernel isn't built as PIC, the linker optimizes R_386_GOT32X
- * relocations to their fixed symbol addresses. However, when the
- * compressed x86 kernel is loaded at a different address, it leads
- * to the following load failure:
- *
- * Failed to allocate space for phdrs
- *
- * during the decompression stage.
- *
- * If the compressed x86 kernel is relocatable at run-time, it should be
- * compiled with -fPIE, instead of -fPIC, if possible and should be built as
- * Position Independent Executable (PIE) so that linker won't optimize
- * R_386_GOT32X relocation to its fixed symbol address. Older
- * linkers generate R_386_32 relocations against locally defined symbols,
- * _bss, _ebss and _end, in PIE. It isn't wrong, just less optimal than
- * R_386_RELATIVE. But the x86 kernel fails to properly handle R_386_32
- * relocations when relocating the kernel. To generate R_386_RELATIVE
- * relocations, we mark _bss, _ebss and _end as hidden:
+ * These symbols needed to be marked as .hidden to prevent the BFD linker from
+ * generating R_386_32 (rather than R_386_RELATIVE) relocations for them when
+ * the 32-bit compressed kernel is linked as PIE. This is no longer necessary,
+ * but it doesn't hurt to keep them .hidden.
*/
.hidden _bss
.hidden _ebss
@@ -74,10 +58,10 @@ SYM_FUNC_START(startup_32)
leal (BP_scratch+4)(%esi), %esp
call 1f
1: popl %edx
- subl $1b, %edx
+ addl $_GLOBAL_OFFSET_TABLE_+(.-1b), %edx

/* Load new GDT */
- leal gdt(%edx), %eax
+ leal gdt@GOTOFF(%edx), %eax
movl %eax, 2(%eax)
lgdt (%eax)

@@ -90,14 +74,16 @@ SYM_FUNC_START(startup_32)
movl %eax, %ss

/*
- * %edx contains the address we are loaded at by the boot loader and %ebx
- * contains the address where we should move the kernel image temporarily
- * for safe in-place decompression. %ebp contains the address that the kernel
- * will be decompressed to.
+ * %edx contains the address we are loaded at by the boot loader (plus the
+ * offset to the GOT). The below code calculates %ebx to be the address where
+ * we should move the kernel image temporarily for safe in-place decompression
+ * (again, plus the offset to the GOT).
+ *
+ * %ebp is calculated to be the address that the kernel will be decompressed to.
*/

#ifdef CONFIG_RELOCATABLE
- movl %edx, %ebx
+ leal startup_32@GOTOFF(%edx), %ebx

#ifdef CONFIG_EFI_STUB
/*
@@ -108,7 +94,7 @@ SYM_FUNC_START(startup_32)
* image_offset = startup_32 - image_base
* Otherwise image_offset will be zero and has no effect on the calculations.
*/
- subl image_offset(%edx), %ebx
+ subl image_offset@GOTOFF(%edx), %ebx
#endif

movl BP_kernel_alignment(%esi), %eax
@@ -125,10 +111,10 @@ SYM_FUNC_START(startup_32)
movl %ebx, %ebp // Save the output address for later
/* Target address to relocate to for decompression */
addl BP_init_size(%esi), %ebx
- subl $_end, %ebx
+ subl $_end@GOTOFF, %ebx

/* Set up the stack */
- leal boot_stack_end(%ebx), %esp
+ leal boot_stack_end@GOTOFF(%ebx), %esp

/* Zero EFLAGS */
pushl $0
@@ -139,8 +125,8 @@ SYM_FUNC_START(startup_32)
* where decompression in place becomes safe.
*/
pushl %esi
- leal (_bss-4)(%edx), %esi
- leal (_bss-4)(%ebx), %edi
+ leal (_bss@GOTOFF-4)(%edx), %esi
+ leal (_bss@GOTOFF-4)(%ebx), %edi
movl $(_bss - startup_32), %ecx
shrl $2, %ecx
std
@@ -153,14 +139,14 @@ SYM_FUNC_START(startup_32)
* during extract_kernel below. To avoid any issues, repoint the GDTR
* to the new copy of the GDT.
*/
- leal gdt(%ebx), %eax
+ leal gdt@GOTOFF(%ebx), %eax
movl %eax, 2(%eax)
lgdt (%eax)

/*
* Jump to the relocated address.
*/
- leal .Lrelocated(%ebx), %eax
+ leal .Lrelocated@GOTOFF(%ebx), %eax
jmp *%eax
SYM_FUNC_END(startup_32)

@@ -170,7 +156,7 @@ SYM_FUNC_START_ALIAS(efi_stub_entry)
add $0x4, %esp
movl 8(%esp), %esi /* save boot_params pointer */
call efi_main
- leal startup_32(%eax), %eax
+ /* efi_main returns the possibly relocated address of startup_32 */
jmp *%eax
SYM_FUNC_END(efi32_stub_entry)
SYM_FUNC_END_ALIAS(efi_stub_entry)
@@ -183,8 +169,8 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
* Clear BSS (stack is currently empty)
*/
xorl %eax, %eax
- leal _bss(%ebx), %edi
- leal _ebss(%ebx), %ecx
+ leal _bss@GOTOFF(%ebx), %edi
+ leal _ebss@GOTOFF(%ebx), %ecx
subl %edi, %ecx
shrl $2, %ecx
rep stosl
@@ -198,9 +184,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
pushl %ebp /* output address */

pushl $z_input_len /* input_len */
- leal input_data(%ebx), %eax
+ leal input_data@GOTOFF(%ebx), %eax
pushl %eax /* input_data */
- leal boot_heap(%ebx), %eax
+ leal boot_heap@GOTOFF(%ebx), %eax
pushl %eax /* heap area */
pushl %esi /* real mode pointer */
call extract_kernel /* returns kernel location in %eax */
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index bf1ab30acc5b..11429092c224 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -43,6 +43,32 @@
.hidden _end

__HEAD
+
+/*
+ * This macro gives the relative virtual address of X, i.e. the offset of X
+ * from startup_32. This is the same as the link-time virtual address of X,
+ * since startup_32 is at 0, but defining it this way tells the
+ * assembler/linker that we do not want the actual run-time address of X. This
+ * prevents the linker from trying to create unwanted run-time relocation
+ * entries for the reference when the compressed kernel is linked as PIE.
+ *
+ * A reference X(%reg) will result in the link-time VA of X being stored with
+ * the instruction, and a run-time R_X86_64_RELATIVE relocation entry that
+ * adds the 64-bit base address where the kernel is loaded.
+ *
+ * Replacing it with (X-startup_32)(%reg) results in the offset being stored,
+ * and no run-time relocation.
+ *
+ * The macro should be used as a displacement with a base register containing
+ * the run-time address of startup_32 [i.e. rva(X)(%reg)], or as an immediate
+ * [$ rva(X)].
+ *
+ * This macro can only be used from within the .head.text section, since the
+ * expression requires startup_32 to be in the same section as the code being
+ * assembled.
+ */
+#define rva(X) ((X) - startup_32)
+
.code32
SYM_FUNC_START(startup_32)
/*
@@ -65,10 +91,10 @@ SYM_FUNC_START(startup_32)
leal (BP_scratch+4)(%esi), %esp
call 1f
1: popl %ebp
- subl $1b, %ebp
+ subl $ rva(1b), %ebp

/* Load new GDT with the 64bit segments using 32bit descriptor */
- leal gdt(%ebp), %eax
+ leal rva(gdt)(%ebp), %eax
movl %eax, 2(%eax)
lgdt (%eax)

@@ -81,7 +107,7 @@ SYM_FUNC_START(startup_32)
movl %eax, %ss

/* setup a stack and make sure cpu supports long mode. */
- leal boot_stack_end(%ebp), %esp
+ leal rva(boot_stack_end)(%ebp), %esp

call verify_cpu
testl %eax, %eax
@@ -108,7 +134,7 @@ SYM_FUNC_START(startup_32)
* image_offset = startup_32 - image_base
* Otherwise image_offset will be zero and has no effect on the calculations.
*/
- subl image_offset(%ebp), %ebx
+ subl rva(image_offset)(%ebp), %ebx
#endif

movl BP_kernel_alignment(%esi), %eax
@@ -124,7 +150,7 @@ SYM_FUNC_START(startup_32)

/* Target address to relocate to for decompression */
addl BP_init_size(%esi), %ebx
- subl $_end, %ebx
+ subl $ rva(_end), %ebx

/*
* Prepare for entering 64 bit mode
@@ -152,19 +178,19 @@ SYM_FUNC_START(startup_32)
1:

/* Initialize Page tables to 0 */
- leal pgtable(%ebx), %edi
+ leal rva(pgtable)(%ebx), %edi
xorl %eax, %eax
movl $(BOOT_INIT_PGT_SIZE/4), %ecx
rep stosl

/* Build Level 4 */
- leal pgtable + 0(%ebx), %edi
+ leal rva(pgtable + 0)(%ebx), %edi
leal 0x1007 (%edi), %eax
movl %eax, 0(%edi)
addl %edx, 4(%edi)

/* Build Level 3 */
- leal pgtable + 0x1000(%ebx), %edi
+ leal rva(pgtable + 0x1000)(%ebx), %edi
leal 0x1007(%edi), %eax
movl $4, %ecx
1: movl %eax, 0x00(%edi)
@@ -175,7 +201,7 @@ SYM_FUNC_START(startup_32)
jnz 1b

/* Build Level 2 */
- leal pgtable + 0x2000(%ebx), %edi
+ leal rva(pgtable + 0x2000)(%ebx), %edi
movl $0x00000183, %eax
movl $2048, %ecx
1: movl %eax, 0(%edi)
@@ -186,7 +212,7 @@ SYM_FUNC_START(startup_32)
jnz 1b

/* Enable the boot page tables */
- leal pgtable(%ebx), %eax
+ leal rva(pgtable)(%ebx), %eax
movl %eax, %cr3

/* Enable Long mode in EFER (Extended Feature Enable Register) */
@@ -211,14 +237,14 @@ SYM_FUNC_START(startup_32)
* We place all of the values on our mini stack so lret can
* used to perform that far jump.
*/
- leal startup_64(%ebp), %eax
+ leal rva(startup_64)(%ebp), %eax
#ifdef CONFIG_EFI_MIXED
- movl efi32_boot_args(%ebp), %edi
+ movl rva(efi32_boot_args)(%ebp), %edi
cmp $0, %edi
jz 1f
- leal efi64_stub_entry(%ebp), %eax
- movl efi32_boot_args+4(%ebp), %esi
- movl efi32_boot_args+8(%ebp), %edx // saved bootparams pointer
+ leal rva(efi64_stub_entry)(%ebp), %eax
+ movl rva(efi32_boot_args+4)(%ebp), %esi
+ movl rva(efi32_boot_args+8)(%ebp), %edx // saved bootparams pointer
cmpl $0, %edx
jnz 1f
/*
@@ -229,7 +255,7 @@ SYM_FUNC_START(startup_32)
* the correct stack alignment for entry.
*/
subl $40, %esp
- leal efi_pe_entry(%ebp), %eax
+ leal rva(efi_pe_entry)(%ebp), %eax
movl %edi, %ecx // MS calling convention
movl %esi, %edx
1:
@@ -255,18 +281,18 @@ SYM_FUNC_START(efi32_stub_entry)

call 1f
1: pop %ebp
- subl $1b, %ebp
+ subl $ rva(1b), %ebp

- movl %esi, efi32_boot_args+8(%ebp)
+ movl %esi, rva(efi32_boot_args+8)(%ebp)
SYM_INNER_LABEL(efi32_pe_stub_entry, SYM_L_LOCAL)
- movl %ecx, efi32_boot_args(%ebp)
- movl %edx, efi32_boot_args+4(%ebp)
- movb $0, efi_is64(%ebp)
+ movl %ecx, rva(efi32_boot_args)(%ebp)
+ movl %edx, rva(efi32_boot_args+4)(%ebp)
+ movb $0, rva(efi_is64)(%ebp)

/* Save firmware GDTR and code/data selectors */
- sgdtl efi32_boot_gdt(%ebp)
- movw %cs, efi32_boot_cs(%ebp)
- movw %ds, efi32_boot_ds(%ebp)
+ sgdtl rva(efi32_boot_gdt)(%ebp)
+ movw %cs, rva(efi32_boot_cs)(%ebp)
+ movw %ds, rva(efi32_boot_ds)(%ebp)

/* Disable paging */
movl %cr0, %eax
@@ -345,11 +371,11 @@ SYM_CODE_START(startup_64)

/* Target address to relocate to for decompression */
movl BP_init_size(%rsi), %ebx
- subl $_end, %ebx
+ subl $ rva(_end), %ebx
addq %rbp, %rbx

/* Set up the stack */
- leaq boot_stack_end(%rbx), %rsp
+ leaq rva(boot_stack_end)(%rbx), %rsp

/*
* At this point we are in long mode with 4-level paging enabled,
@@ -423,7 +449,7 @@ SYM_CODE_START(startup_64)
lretq
trampoline_return:
/* Restore the stack, the 32-bit trampoline uses its own stack */
- leaq boot_stack_end(%rbx), %rsp
+ leaq rva(boot_stack_end)(%rbx), %rsp

/*
* cleanup_trampoline() would restore trampoline memory.
@@ -435,7 +461,7 @@ trampoline_return:
* this function call.
*/
pushq %rsi
- leaq top_pgtable(%rbx), %rdi
+ leaq rva(top_pgtable)(%rbx), %rdi
call cleanup_trampoline
popq %rsi

@@ -449,9 +475,9 @@ trampoline_return:
*/
pushq %rsi
leaq (_bss-8)(%rip), %rsi
- leaq (_bss-8)(%rbx), %rdi
- movq $_bss /* - $startup_32 */, %rcx
- shrq $3, %rcx
+ leaq rva(_bss-8)(%rbx), %rdi
+ movl $(_bss - startup_32), %ecx
+ shrl $3, %ecx
std
rep movsq
cld
@@ -462,15 +488,15 @@ trampoline_return:
* during extract_kernel below. To avoid any issues, repoint the GDTR
* to the new copy of the GDT.
*/
- leaq gdt64(%rbx), %rax
- leaq gdt(%rbx), %rdx
+ leaq rva(gdt64)(%rbx), %rax
+ leaq rva(gdt)(%rbx), %rdx
movq %rdx, 2(%rax)
lgdt (%rax)

/*
* Jump to the relocated address.
*/
- leaq .Lrelocated(%rbx), %rax
+ leaq rva(.Lrelocated)(%rbx), %rax
jmp *%rax
SYM_CODE_END(startup_64)

@@ -482,7 +508,7 @@ SYM_FUNC_START_ALIAS(efi_stub_entry)
movq %rdx, %rbx /* save boot_params pointer */
call efi_main
movq %rbx,%rsi
- leaq startup_64(%rax), %rax
+ leaq rva(startup_64)(%rax), %rax
jmp *%rax
SYM_FUNC_END(efi64_stub_entry)
SYM_FUNC_END_ALIAS(efi_stub_entry)
@@ -645,7 +671,7 @@ SYM_DATA(efi_is64, .byte 1)
#define BS32_handle_protocol 88 // offsetof(efi_boot_services_32_t, handle_protocol)
#define LI32_image_base 32 // offsetof(efi_loaded_image_32_t, image_base)

- .text
+ __HEAD
.code32
SYM_FUNC_START(efi32_pe_entry)
/*
@@ -667,12 +693,12 @@ SYM_FUNC_START(efi32_pe_entry)

call 1f
1: pop %ebx
- subl $1b, %ebx
+ subl $ rva(1b), %ebx

/* Get the loaded image protocol pointer from the image handle */
leal -4(%ebp), %eax
pushl %eax // &loaded_image
- leal loaded_image_proto(%ebx), %eax
+ leal rva(loaded_image_proto)(%ebx), %eax
pushl %eax // pass the GUID address
pushl 8(%ebp) // pass the image handle

@@ -707,7 +733,7 @@ SYM_FUNC_START(efi32_pe_entry)
* use it before we get to the 64-bit efi_pe_entry() in C code.
*/
subl %esi, %ebx
- movl %ebx, image_offset(%ebp) // save image_offset
+ movl %ebx, rva(image_offset)(%ebp) // save image_offset
jmp efi32_pe_stub_entry

2: popl %edi // restore callee-save registers
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:41 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Fangrui Song, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
For vmlinux linking, no architecture uses the .gnu.version* sections,
so remove them via the COMMON_DISCARDS macro in preparation for adding
--orphan-handling=warn more widely. This is a work-around for what
appears to be a bug[1] in ld.bfd which warns for this synthetic section
even when none is found in input objects, and even when no section is
emitted for an output object[2].

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=26153
[2] https://lore.kernel.org/lkml/202006221524.CEB86E036B@keescook/

Reviewed-by: Fangrui Song <mas...@google.com>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
include/asm-generic/vmlinux.lds.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index ff65a20faf4c..22985cf02130 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -933,7 +933,9 @@
#define COMMON_DISCARDS \
*(.discard) \
*(.discard.*) \
- *(.modinfo)
+ *(.modinfo) \
+ /* ld.bfd warns about .gnu.version* even when not emitted */ \
+ *(.gnu.version*) \

#define DISCARDS \
/DISCARD/ : { \
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:42 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Fangrui Song, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
When linking vmlinux with LLD, the synthetic sections .symtab, .strtab,
and .shstrtab are listed as orphaned. Add them to the ELF_DETAILS section
so there will be no warnings when --orphan-handling=warn is used more
widely. (They are added above the .comment section, as that is the more
common order[1].)

ld.lld: warning: <internal>:(.symtab) is being placed in '.symtab'
ld.lld: warning: <internal>:(.shstrtab) is being placed in '.shstrtab'
ld.lld: warning: <internal>:(.strtab) is being placed in '.strtab'

[1] https://lore.kernel.org/lkml/20200622224928....@google.com/

Reported-by: Fangrui Song <mas...@google.com>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
include/asm-generic/vmlinux.lds.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 22c9a68c02ae..2593957f6e8b 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -799,7 +799,10 @@

/* Required sections not related to debugging. */
#define ELF_DETAILS \
- .comment 0 : { *(.comment) }
+ .comment 0 : { *(.comment) } \
+ .symtab 0 : { *(.symtab) } \
+ .strtab 0 : { *(.strtab) } \
+ .shstrtab 0 : { *(.shstrtab) }

#ifdef CONFIG_GENERIC_BUG
#define BUG_TABLE \
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:42 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Marco Elver, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
KASAN (-fsanitize=kernel-address) and KCSAN (-fsanitize=thread)
produce unwanted[1] .eh_frame and .init_array.* sections. Add them to
COMMON_DISCARDS, except with CONFIG_CONSTRUCTORS, which wants to keep
.init_array.* sections.

[1] https://bugs.llvm.org/show_bug.cgi?id=46478

Tested-by: Marco Elver <el...@google.com>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
include/asm-generic/vmlinux.lds.h | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 22985cf02130..f236cf0fa779 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -930,7 +930,27 @@
EXIT_DATA
#endif

+/*
+ * Clang's -fsanitize=kernel-address and -fsanitize=thread produce
+ * unwanted sections (.eh_frame and .init_array.*), but
+ * CONFIG_CONSTRUCTORS wants to keep any .init_array.* sections.
+ * https://bugs.llvm.org/show_bug.cgi?id=46478
+ */
+#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KCSAN)
+# ifdef CONFIG_CONSTRUCTORS
+# define SANITIZER_DISCARDS \
+ *(.eh_frame)
+# else
+# define SANITIZER_DISCARDS \
+ *(.init_array) *(.init_array.*) \
+ *(.eh_frame)
+# endif
+#else
+# define SANITIZER_DISCARDS
+#endif
+
#define COMMON_DISCARDS \
+ SANITIZER_DISCARDS \
*(.discard) \
*(.discard.*) \
*(.modinfo) \
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:43 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
The .comment section doesn't belong in STABS_DEBUG. Split it out into a
new macro named ELF_DETAILS. This will gain other non-debug sections
that need to be accounted for when linking with --orphan-handling=warn.

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/alpha/kernel/vmlinux.lds.S | 1 +
arch/arc/kernel/vmlinux.lds.S | 1 +
arch/arm/kernel/vmlinux-xip.lds.S | 1 +
arch/arm/kernel/vmlinux.lds.S | 1 +
arch/arm64/kernel/vmlinux.lds.S | 1 +
arch/csky/kernel/vmlinux.lds.S | 1 +
arch/hexagon/kernel/vmlinux.lds.S | 1 +
arch/ia64/kernel/vmlinux.lds.S | 1 +
arch/mips/kernel/vmlinux.lds.S | 1 +
arch/nds32/kernel/vmlinux.lds.S | 1 +
arch/nios2/kernel/vmlinux.lds.S | 1 +
arch/openrisc/kernel/vmlinux.lds.S | 1 +
arch/parisc/boot/compressed/vmlinux.lds.S | 1 +
arch/parisc/kernel/vmlinux.lds.S | 1 +
arch/powerpc/kernel/vmlinux.lds.S | 2 +-
arch/riscv/kernel/vmlinux.lds.S | 1 +
arch/s390/kernel/vmlinux.lds.S | 1 +
arch/sh/kernel/vmlinux.lds.S | 1 +
arch/sparc/kernel/vmlinux.lds.S | 1 +
arch/um/kernel/dyn.lds.S | 2 +-
arch/um/kernel/uml.lds.S | 2 +-
arch/unicore32/kernel/vmlinux.lds.S | 1 +
arch/x86/boot/compressed/vmlinux.lds.S | 2 ++
arch/x86/kernel/vmlinux.lds.S | 1 +
include/asm-generic/vmlinux.lds.h | 8 ++++++--
25 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/arch/alpha/kernel/vmlinux.lds.S b/arch/alpha/kernel/vmlinux.lds.S
index bc6f727278fd..5b78d640725d 100644
--- a/arch/alpha/kernel/vmlinux.lds.S
+++ b/arch/alpha/kernel/vmlinux.lds.S
@@ -72,6 +72,7 @@ SECTIONS

STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

DISCARDS
}
diff --git a/arch/arc/kernel/vmlinux.lds.S b/arch/arc/kernel/vmlinux.lds.S
index 54139a6f469b..33ce59d91461 100644
--- a/arch/arc/kernel/vmlinux.lds.S
+++ b/arch/arc/kernel/vmlinux.lds.S
@@ -122,6 +122,7 @@ SECTIONS
_end = . ;

STABS_DEBUG
+ ELF_DETAILS
DISCARDS

.arcextmap 0 : {
diff --git a/arch/arm/kernel/vmlinux-xip.lds.S b/arch/arm/kernel/vmlinux-xip.lds.S
index 6d2be994ae58..3d4e88f08196 100644
--- a/arch/arm/kernel/vmlinux-xip.lds.S
+++ b/arch/arm/kernel/vmlinux-xip.lds.S
@@ -152,6 +152,7 @@ SECTIONS
_end = .;

STABS_DEBUG
+ ELF_DETAILS
}

/*
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 7f24bc08403e..5592f14b7e35 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -151,6 +151,7 @@ SECTIONS
_end = .;

STABS_DEBUG
+ ELF_DETAILS
}

#ifdef CONFIG_STRICT_KERNEL_RWX
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 5423ffe0a987..df2916b25ee0 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -242,6 +242,7 @@ SECTIONS
_end = .;

STABS_DEBUG
+ ELF_DETAILS

HEAD_SYMBOLS
}
diff --git a/arch/csky/kernel/vmlinux.lds.S b/arch/csky/kernel/vmlinux.lds.S
index f05b413df328..f03033e17c29 100644
--- a/arch/csky/kernel/vmlinux.lds.S
+++ b/arch/csky/kernel/vmlinux.lds.S
@@ -109,6 +109,7 @@ SECTIONS

STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

DISCARDS
}
diff --git a/arch/hexagon/kernel/vmlinux.lds.S b/arch/hexagon/kernel/vmlinux.lds.S
index 0ca2471ddb9f..35b18e55eae8 100644
--- a/arch/hexagon/kernel/vmlinux.lds.S
+++ b/arch/hexagon/kernel/vmlinux.lds.S
@@ -67,5 +67,6 @@ SECTIONS

STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

}
diff --git a/arch/ia64/kernel/vmlinux.lds.S b/arch/ia64/kernel/vmlinux.lds.S
index d259690eb91a..9b265783be6a 100644
--- a/arch/ia64/kernel/vmlinux.lds.S
+++ b/arch/ia64/kernel/vmlinux.lds.S
@@ -218,6 +218,7 @@ SECTIONS {

STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

/* Default discards */
DISCARDS
diff --git a/arch/mips/kernel/vmlinux.lds.S b/arch/mips/kernel/vmlinux.lds.S
index f185a85a27c1..5e97e9d02f98 100644
--- a/arch/mips/kernel/vmlinux.lds.S
+++ b/arch/mips/kernel/vmlinux.lds.S
@@ -202,6 +202,7 @@ SECTIONS

STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

/* These must appear regardless of . */
.gptab.sdata : {
diff --git a/arch/nds32/kernel/vmlinux.lds.S b/arch/nds32/kernel/vmlinux.lds.S
index 7a6c1cefe3fe..6a91b965fb1e 100644
--- a/arch/nds32/kernel/vmlinux.lds.S
+++ b/arch/nds32/kernel/vmlinux.lds.S
@@ -64,6 +64,7 @@ SECTIONS

STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

DISCARDS
}
diff --git a/arch/nios2/kernel/vmlinux.lds.S b/arch/nios2/kernel/vmlinux.lds.S
index c55a7cfa1075..126e114744cb 100644
--- a/arch/nios2/kernel/vmlinux.lds.S
+++ b/arch/nios2/kernel/vmlinux.lds.S
@@ -58,6 +58,7 @@ SECTIONS

STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

DISCARDS
}
diff --git a/arch/openrisc/kernel/vmlinux.lds.S b/arch/openrisc/kernel/vmlinux.lds.S
index 60449fd7f16f..d287dbb84d0f 100644
--- a/arch/openrisc/kernel/vmlinux.lds.S
+++ b/arch/openrisc/kernel/vmlinux.lds.S
@@ -115,6 +115,7 @@ SECTIONS
/* Throw in the debugging sections */
STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

/* Sections to be discarded -- must be last */
DISCARDS
diff --git a/arch/parisc/boot/compressed/vmlinux.lds.S b/arch/parisc/boot/compressed/vmlinux.lds.S
index 2ac3a643f2eb..ab7b43990857 100644
--- a/arch/parisc/boot/compressed/vmlinux.lds.S
+++ b/arch/parisc/boot/compressed/vmlinux.lds.S
@@ -84,6 +84,7 @@ SECTIONS
}

STABS_DEBUG
+ ELF_DETAILS
.note 0 : { *(.note) }

/* Sections to be discarded */
diff --git a/arch/parisc/kernel/vmlinux.lds.S b/arch/parisc/kernel/vmlinux.lds.S
index 53e29d88f99c..2769eb991f58 100644
--- a/arch/parisc/kernel/vmlinux.lds.S
+++ b/arch/parisc/kernel/vmlinux.lds.S
@@ -164,6 +164,7 @@ SECTIONS
_end = . ;

STABS_DEBUG
+ ELF_DETAILS
.note 0 : { *(.note) }

/* Sections to be discarded */
diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 326e113d2e45..e0548b4950de 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -360,8 +360,8 @@ SECTIONS
PROVIDE32 (end = .);

STABS_DEBUG
-
DWARF_DEBUG
+ ELF_DETAILS

DISCARDS
/DISCARD/ : {
diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
index e6f8016b366a..00a325289a26 100644
--- a/arch/riscv/kernel/vmlinux.lds.S
+++ b/arch/riscv/kernel/vmlinux.lds.S
@@ -97,6 +97,7 @@ SECTIONS

STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

DISCARDS
}
diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S
index 37695499717d..177ccfbda40a 100644
--- a/arch/s390/kernel/vmlinux.lds.S
+++ b/arch/s390/kernel/vmlinux.lds.S
@@ -181,6 +181,7 @@ SECTIONS
/* Debugging sections. */
STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

/* Sections to be discarded */
DISCARDS
diff --git a/arch/sh/kernel/vmlinux.lds.S b/arch/sh/kernel/vmlinux.lds.S
index bde7a6c01aaf..3161b9ccd2a5 100644
--- a/arch/sh/kernel/vmlinux.lds.S
+++ b/arch/sh/kernel/vmlinux.lds.S
@@ -76,6 +76,7 @@ SECTIONS

STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

DISCARDS
}
diff --git a/arch/sparc/kernel/vmlinux.lds.S b/arch/sparc/kernel/vmlinux.lds.S
index f99e99e58075..d55ae65a07ad 100644
--- a/arch/sparc/kernel/vmlinux.lds.S
+++ b/arch/sparc/kernel/vmlinux.lds.S
@@ -187,6 +187,7 @@ SECTIONS

STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

DISCARDS
}
diff --git a/arch/um/kernel/dyn.lds.S b/arch/um/kernel/dyn.lds.S
index f5001481010c..dacbfabf66d8 100644
--- a/arch/um/kernel/dyn.lds.S
+++ b/arch/um/kernel/dyn.lds.S
@@ -164,8 +164,8 @@ SECTIONS
PROVIDE (end = .);

STABS_DEBUG
-
DWARF_DEBUG
+ ELF_DETAILS

DISCARDS
}
diff --git a/arch/um/kernel/uml.lds.S b/arch/um/kernel/uml.lds.S
index 3b6dab3d4501..45d957d7004c 100644
--- a/arch/um/kernel/uml.lds.S
+++ b/arch/um/kernel/uml.lds.S
@@ -108,8 +108,8 @@ SECTIONS
PROVIDE (end = .);

STABS_DEBUG
-
DWARF_DEBUG
+ ELF_DETAILS

DISCARDS
}
diff --git a/arch/unicore32/kernel/vmlinux.lds.S b/arch/unicore32/kernel/vmlinux.lds.S
index 6fb320b337ef..22eb642c7280 100644
--- a/arch/unicore32/kernel/vmlinux.lds.S
+++ b/arch/unicore32/kernel/vmlinux.lds.S
@@ -54,6 +54,7 @@ SECTIONS

STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

DISCARDS /* Exit code and data */
}
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 29df99b6cc64..3c2ee9a5bf43 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -82,6 +82,8 @@ SECTIONS
. = ALIGN(PAGE_SIZE); /* keep ZO size page aligned */
_end = .;

+ ELF_DETAILS
+
DISCARDS
}

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 9a03e5b23135..0cc035cb15f1 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -411,6 +411,7 @@ SECTIONS

STABS_DEBUG
DWARF_DEBUG
+ ELF_DETAILS

DISCARDS
}
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index f236cf0fa779..22c9a68c02ae 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -34,6 +34,7 @@
*
* STABS_DEBUG
* DWARF_DEBUG
+ * ELF_DETAILS
*
* DISCARDS // must be the last
* }
@@ -787,14 +788,17 @@
.debug_macro 0 : { *(.debug_macro) } \
.debug_addr 0 : { *(.debug_addr) }

- /* Stabs debugging sections. */
+/* Stabs debugging sections. */
#define STABS_DEBUG \
.stab 0 : { *(.stab) } \
.stabstr 0 : { *(.stabstr) } \
.stab.excl 0 : { *(.stab.excl) } \
.stab.exclstr 0 : { *(.stab.exclstr) } \
.stab.index 0 : { *(.stab.index) } \
- .stab.indexstr 0 : { *(.stab.indexstr) } \
+ .stab.indexstr 0 : { *(.stab.indexstr) }
+
+/* Required sections not related to debugging. */
+#define ELF_DETAILS \
.comment 0 : { *(.comment) }

#ifdef CONFIG_GENERIC_BUG
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:43 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Ard Biesheuvel, Catalin Marinas, Mark Rutland, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
Remove the last instance of an .eh_frame section by removing the needless Call
Frame Information annotations, which were likely leftovers from 32-bit arm.

Suggested-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Will Deacon <wi...@kernel.org>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/arm64/kernel/smccc-call.S | 2 --
1 file changed, 2 deletions(-)

diff --git a/arch/arm64/kernel/smccc-call.S b/arch/arm64/kernel/smccc-call.S
index 1f93809528a4..d62447964ed9 100644
--- a/arch/arm64/kernel/smccc-call.S
+++ b/arch/arm64/kernel/smccc-call.S
@@ -9,7 +9,6 @@
#include <asm/assembler.h>

.macro SMCCC instr
- .cfi_startproc
\instr #0
ldr x4, [sp]
stp x0, x1, [x4, #ARM_SMCCC_RES_X0_OFFS]
@@ -21,7 +20,6 @@
b.ne 1f
str x6, [x4, ARM_SMCCC_QUIRK_STATE_OFFS]
1: ret
- .cfi_endproc
.endm

/*
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:44 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Arvind Sankar, Atish Patra, linu...@vger.kernel.org, Ard Biesheuvel, Nick Desaulniers, Catalin Marinas, Mark Rutland, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
In preparation for adding --orphan-handling=warn to more architectures,
disable -mbranch-protection, as EFI does not yet support it[1]. This was
noticed because it produces unwanted .note.gnu.property sections (prefixed
with .init due to the objcopy build step).

However, we must also work around a bug in Clang where the section is
still emitted for code-less object files[2], so also remove the section
during the objcopy.

[1] https://lore.kernel.org/lkml/CAMj1kXHck12juGi=E=P4hWP_8vQhQ+-x3v...@mail.gmail.com
[2] https://bugs.llvm.org/show_bug.cgi?id=46480

Cc: Arvind Sankar <nive...@alum.mit.edu>
Cc: Atish Patra <atish...@wdc.com>
Cc: linu...@vger.kernel.org
Acked-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Nick Desaulniers <ndesau...@google.com>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
drivers/firmware/efi/libstub/Makefile | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index b4f8c80cc591..d7d395ede89f 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -18,7 +18,8 @@ cflags-$(CONFIG_X86) += -m$(BITS) -D__KERNEL__ \
# arm64 uses the full KBUILD_CFLAGS so it's necessary to explicitly
# disable the stackleak plugin
cflags-$(CONFIG_ARM64) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
- -fpie $(DISABLE_STACKLEAK_PLUGIN)
+ -fpie $(DISABLE_STACKLEAK_PLUGIN) \
+ $(call cc-option,-mbranch-protection=none)
cflags-$(CONFIG_ARM) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
-fno-builtin -fpic \
$(call cc-option,-mno-single-pic-base)
@@ -66,6 +67,12 @@ lib-$(CONFIG_X86) += x86-stub.o
CFLAGS_arm32-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET)
CFLAGS_arm64-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET)

+# Even when -mbranch-protection=none is set, Clang will generate a
+# .note.gnu.property for code-less object files (like lib/ctype.c),
+# so work around this by explicitly removing the unwanted section.
+# https://bugs.llvm.org/show_bug.cgi?id=46480
+STUBCOPY_FLAGS-y += --remove-section=.note.gnu.property
+
#
# For x86, bootloaders like systemd-boot or grub-efi do not zero-initialize the
# .bss section, so the .bss section of the EFI stub needs to be included in the
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:45 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Nick Desaulniers, Jian Cai, Fāng-ruì Sòng, Luis Lozano, Manoj Gupta, sta...@vger.kernel.org, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
From: Nick Desaulniers <ndesau...@google.com>

Basically, consider .text.{hot|unlikely|unknown}.* part of .text, too.

When compiling with profiling information (collected via PGO
instrumentation or AutoFDO sampling), Clang will separate code into
.text.hot, .text.unlikely, or .text.unknown sections based on profiling
information. After D79600 (clang-11), these sections will have a
trailing `.` suffix, i.e. .text.hot., .text.unlikely., .text.unknown..

When using -ffunction-sections together with profiling information,
either explicitly (FGKASLR) or implicitly (LTO), code may be placed in
sections following the convention:
.text.hot.<foo>, .text.unlikely.<bar>, .text.unknown.<baz>
where <foo>, <bar>, and <baz> are functions. (This produces one section
per function; we generally try to merge these all back via linker script
so that we don't have 50k sections).

For the above cases, we need to teach our linker scripts that such
sections might exist and that we'd explicitly like them grouped
together, otherwise we can wind up with code outside of the
_stext/_etext boundaries that might not be mapped properly for some
architectures, resulting in boot failures.
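
To make the grouping concrete, a minimal stand-alone sketch of what the
linker script needs to express looks roughly like this (illustrative
only; in the kernel this is what the TEXT_TEXT hunk below does, with
TEXT_MAIN normally expanding to plain .text):

	SECTIONS {
		.text : {
			*(.text.hot .text.hot.*)
			*(.text)
			*(.text.unlikely .text.unlikely.*)
			*(.text.unknown .text.unknown.*)
		}
	}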

If the linker script is not told about possible input sections, then
where the section is placed as output is a heuristic-laden mess that's
non-portable between linkers (i.e. BFD and LLD), and has resulted in
many hard-to-debug bugs. Kees Cook is working on cleaning this up by
adding the --orphan-handling=warn linker flag, used in ARCH=powerpc, to
additional architectures. In the case of linker scripts, borrowing from
the Zen of Python: explicit is better than implicit.

Also, ld.bfd's internal linker script considers .text.hot AND
.text.hot.* to be part of .text, as well as .text.unlikely and
.text.unlikely.*. I didn't see support for .text.unknown.*, and didn't
see Clang producing such code in our kernel builds, but I see code in
LLVM that can produce such section names if profiling information is
missing. That may point to a larger issue with generating or collecting
profiles, but I would much rather be safe and explicit than have to
debug yet another issue related to orphan section placement.

Reported-by: Jian Cai <jia...@google.com>
Suggested-by: Fāng-ruì Sòng <mas...@google.com>
Tested-by: Luis Lozano <llo...@google.com>
Tested-by: Manoj Gupta <manoj...@google.com>
Acked-by: Kees Cook <kees...@chromium.org>
Cc: sta...@vger.kernel.org
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=add44f8d5c5c05e08b11e033127a744d61c26aee
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=1de778ed23ce7492c523d5850c6c6dbb34152655
Link: https://reviews.llvm.org/D79600
Link: https://bugs.chromium.org/p/chromium/issues/detail?id=1084760
Debugged-by: Luis Lozano <llo...@google.com>
Signed-off-by: Nick Desaulniers <ndesau...@google.com>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
include/asm-generic/vmlinux.lds.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 2593957f6e8b..af5211ca857c 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -561,7 +561,10 @@
*/
#define TEXT_TEXT \
ALIGN_FUNCTION(); \
- *(.text.hot TEXT_MAIN .text.fixup .text.unlikely) \
+ *(.text.hot .text.hot.*) \
+ *(TEXT_MAIN .text.fixup) \
+ *(.text.unlikely .text.unlikely.*) \
+ *(.text.unknown .text.unknown.*) \
NOINSTR_TEXT \
*(.text..refcount) \
*(.ref.text) \
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:46 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
In preparation for warning on orphan sections, enforce other
expected-to-be-zero-sized sections (since discarding them might hide
problems with them suddenly gaining unexpected entries).
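
The idiom is the same everywhere in this series: collect the inputs into
a named output section, then assert that the result stayed empty. As a
minimal sketch of the pattern (section names taken from the .got
handling in the diff below):

	.got (NOLOAD) : {
		*(.got) *(.igot.*)
	}
	ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")

The (NOLOAD) marking is mostly there for good measure; the ASSERT is
what actually catches a regression at link time.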

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/x86/kernel/vmlinux.lds.S | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 7faffe7414d6..d8792f9c536f 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -415,6 +415,15 @@ SECTIONS

DISCARDS

+ /*
+ * Sections that should stay zero sized, which is safer to
+ * explicitly check instead of blindly discarding.
+ */
+ .got (NOLOAD) : {
+ *(.got) *(.igot.*)
+ }
+ ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")
+
/*
* Make sure that the .got.plt is either completely empty or it
* contains only the lazy dispatch entries.
@@ -427,6 +436,21 @@ SECTIONS
SIZEOF(.got.plt) == 0xc,
#endif
"Unexpected GOT/PLT entries detected!")
+
+ .plt (NOLOAD) : {
+ *(.plt) *(.plt.*) *(.iplt)
+ }
+ ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!")
+
+ /* ld.lld does not like .rel* sections being made "NOLOAD". */
+ .rel.dyn : {
+ *(.rel.*) *(.rel_*)
+ }
+ ASSERT(SIZEOF(.rel.dyn) == 0, "Unexpected run-time relocations (.rel) detected!")
+ .rela.dyn : {
+ *(.rela.*) *(.rela_*)
+ }
+ ASSERT(SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations (.rela) detected!")
}

#ifdef CONFIG_X86_32
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:46 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
We don't want to depend on the linker's orphan section placement
heuristics as these can vary between linkers, and may change between
versions. All sections need to be explicitly handled in the linker
script.

With all sections now handled, enable orphan section warnings.
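
For reference, once the warning is enabled, every input section the
linker sees has to be explicitly handled in the script in one of three
ways, roughly sketched below (the .discard.example name is purely
illustrative):

	/* placed in the output */
	.comment 0 : { *(.comment) }

	/* discarded at link time */
	/DISCARD/ : { *(.discard.example) }

	/* or collected and asserted to stay empty */
	.plt (NOLOAD) : { *(.plt) }
	ASSERT(SIZEOF(.plt) == 0, "Unexpected PLT entries detected!")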

Acked-by: Will Deacon <wi...@kernel.org>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/arm64/Makefile | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 35de43c29873..b8a3142db0dd 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -29,6 +29,10 @@ LDFLAGS_vmlinux += --fix-cortex-a53-843419
endif
endif

+# We never want expected sections to be placed heuristically by the
+# linker. All sections should be explicitly named in the linker script.
+LDFLAGS_vmlinux += --orphan-handling=warn
+
ifeq ($(CONFIG_ARM64_USE_LSE_ATOMICS), y)
ifneq ($(CONFIG_ARM64_LSE_ATOMICS), y)
$(warning LSE atomics not supported by binutils)
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:47 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Ard Biesheuvel, Catalin Marinas, Mark Rutland, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
In preparation for warning on orphan sections, discard
unwanted non-zero-sized generated sections, and enforce other
expected-to-be-zero-sized sections (since discarding them might hide
problems with them suddenly gaining unexpected entries).

Suggested-by: Ard Biesheuvel <ar...@kernel.org>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/arm64/kernel/vmlinux.lds.S | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 4cf825301c3a..01485941ed35 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -122,6 +122,14 @@ SECTIONS
*(.got) /* Global offset table */
}

+ /*
+ * Make sure that the .got.plt is either completely empty or it
+ * contains only the lazy dispatch entries.
+ */
+ .got.plt : { *(.got.plt) }
+ ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18,
+ "Unexpected GOT/PLT entries detected!")
+
. = ALIGN(SEGMENT_ALIGN);
_etext = .; /* End of text section */

@@ -244,6 +252,18 @@ SECTIONS
ELF_DETAILS

HEAD_SYMBOLS
+
+ /*
+ * Sections that should stay zero sized, which is safer to
+ * explicitly check instead of blindly discarding.
+ */
+ .plt (NOLOAD) : {
+ *(.plt) *(.plt.*) *(.iplt) *(.igot)
+ }
+ ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!")
+
+ .data.rel.ro (NOLOAD) : { *(.data.rel.ro) }
+ ASSERT(SIZEOF(.data.rel.ro) == 0, "Unexpected RELRO detected!")
}

#include "image-vars.h"
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:48 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
We don't want to depend on the linker's orphan section placement
heuristics as these can vary between linkers, and may change between
versions. All sections need to be explicitly handled in the linker script.

Now that all sections are explicitly handled, enable orphan section
warnings.

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/x86/Makefile | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 00e378de8bc0..f8a5b2333729 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -51,6 +51,10 @@ ifdef CONFIG_X86_NEED_RELOCS
LDFLAGS_vmlinux := --emit-relocs --discard-none
endif

+# We never want expected sections to be placed heuristically by the
+# linker. All sections should be explicitly named in the linker script.
+LDFLAGS_vmlinux += --orphan-handling=warn
+
#
# Prevent GCC from generating any FP code by mistake.
#
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:49 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
For readability, move the zero-sized sections to the end after DISCARDS
and mark them NOLOAD for good measure.

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/x86/boot/compressed/vmlinux.lds.S | 42 +++++++++++++++-----------
1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 3c2ee9a5bf43..42dea70a5091 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -42,18 +42,16 @@ SECTIONS
*(.rodata.*)
_erodata = . ;
}
- .rel.dyn : {
- *(.rel.*)
- }
- .rela.dyn : {
- *(.rela.*)
- }
- .got : {
- *(.got)
- }
.got.plt : {
*(.got.plt)
}
+ ASSERT(SIZEOF(.got.plt) == 0 ||
+#ifdef CONFIG_X86_64
+ SIZEOF(.got.plt) == 0x18,
+#else
+ SIZEOF(.got.plt) == 0xc,
+#endif
+ "Unexpected GOT/PLT entries detected!")

.data : {
_data = . ;
@@ -85,13 +83,23 @@ SECTIONS
ELF_DETAILS

DISCARDS
-}

-ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")
-#ifdef CONFIG_X86_64
-ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
-#else
-ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
-#endif
+ /*
+ * Sections that should stay zero sized, which is safer to
+ * explicitly check instead of blindly discarding.
+ */
+ .got (NOLOAD) : {
+ *(.got)
+ }
+ ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")

-ASSERT(SIZEOF(.rel.dyn) == 0 && SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations detected!")
+ /* ld.lld does not like .rel* sections being made "NOLOAD". */
+ .rel.dyn : {
+ *(.rel.*)
+ }
+ ASSERT(SIZEOF(.rel.dyn) == 0, "Unexpected run-time relocations (.rel) detected!")
+ .rela.dyn : {
+ *(.rela.*)
+ }
+ ASSERT(SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations (.rela) detected!")
+}
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:49 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
In preparation for warning on orphan sections, stop the linker from
generating the .eh_frame* sections, discard unwanted non-zero-sized
generated sections, and enforce other expected-to-be-zero-sized sections
(since discarding them might hide problems with them suddenly gaining
unexpected entries).

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/x86/boot/compressed/Makefile | 1 +
arch/x86/boot/compressed/vmlinux.lds.S | 14 ++++++++++++--
2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 96d53e300ab6..43b49e1f5b6d 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -49,6 +49,7 @@ GCOV_PROFILE := n
UBSAN_SANITIZE :=n

KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
+KBUILD_LDFLAGS += $(call ld-option,--no-ld-generated-unwind-info)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
LDFLAGS_vmlinux := -pie $(call ld-option, --no-dynamic-linker)
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 42dea70a5091..1fb9809a9e61 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -83,6 +83,11 @@ SECTIONS
ELF_DETAILS

DISCARDS
+ /DISCARD/ : {
+ *(.dynamic) *(.dynsym) *(.dynstr) *(.dynbss)
+ *(.hash) *(.gnu.hash)
+ *(.note.*)
+ }

/*
* Sections that should stay zero sized, which is safer to
@@ -93,13 +98,18 @@ SECTIONS
}
ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")

+ .plt (NOLOAD) : {
+ *(.plt) *(.plt.*)
+ }
+ ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!")
+
/* ld.lld does not like .rel* sections being made "NOLOAD". */
.rel.dyn : {
- *(.rel.*)
+ *(.rel.*) *(.rel_*)
}
ASSERT(SIZEOF(.rel.dyn) == 0, "Unexpected run-time relocations (.rel) detected!")
.rela.dyn : {
- *(.rela.*)
+ *(.rela.*) *(.rela_*)
}
ASSERT(SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations (.rela) detected!")
}
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:50 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
Include the missing DWARF and STABS sections in the compressed image,
when they are present.

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/x86/boot/compressed/vmlinux.lds.S | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 1fb9809a9e61..a7a68415b999 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -80,6 +80,8 @@ SECTIONS
. = ALIGN(PAGE_SIZE); /* keep ZO size page aligned */
_end = .;

+ STABS_DEBUG
+ DWARF_DEBUG
ELF_DETAILS

DISCARDS
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:51 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
We don't want to depend on the linker's orphan section placement
heuristics as these can vary between linkers, and may change between
versions. All sections need to be explicitly handled in the linker
script.

Now that all sections are explicitly handled, enable orphan section
warnings.

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/x86/boot/compressed/Makefile | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 43b49e1f5b6d..f8270d924858 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -53,6 +53,7 @@ KBUILD_LDFLAGS += $(call ld-option,--no-ld-generated-unwind-info)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
LDFLAGS_vmlinux := -pie $(call ld-option, --no-dynamic-linker)
+LDFLAGS_vmlinux += --orphan-handling=warn
LDFLAGS_vmlinux += -T

hostprogs := mkpiggy
--
2.25.1

Kees Cook

Jul 31, 2020, 7:08:52 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
In preparation for warning on orphan sections, enforce
expected-to-be-zero-sized sections (since discarding them might hide
problems with them suddenly gaining unexpected entries).

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/arm/include/asm/vmlinux.lds.h | 10 ++++++++++
arch/arm/kernel/vmlinux-xip.lds.S | 2 ++
arch/arm/kernel/vmlinux.lds.S | 2 ++
3 files changed, 14 insertions(+)

diff --git a/arch/arm/include/asm/vmlinux.lds.h b/arch/arm/include/asm/vmlinux.lds.h
index 6624dd97475c..e0d49fd756f7 100644
--- a/arch/arm/include/asm/vmlinux.lds.h
+++ b/arch/arm/include/asm/vmlinux.lds.h
@@ -52,6 +52,16 @@
ARM_MMU_DISCARD(*(__ex_table)) \
COMMON_DISCARDS

+/*
+ * Sections that should stay zero sized, which is safer to explicitly
+ * check instead of blindly discarding.
+ */
+#define ARM_ASSERTS \
+ .plt (NOLOAD) : { \
+ *(.iplt) *(.rel.iplt) *(.iplt) *(.igot.plt) \
+ } \
+ ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!")
+
#define ARM_DETAILS \
ELF_DETAILS \
.ARM.attributes 0 : { *(.ARM.attributes) }
diff --git a/arch/arm/kernel/vmlinux-xip.lds.S b/arch/arm/kernel/vmlinux-xip.lds.S
index 11ffa79751da..50136828f5b5 100644
--- a/arch/arm/kernel/vmlinux-xip.lds.S
+++ b/arch/arm/kernel/vmlinux-xip.lds.S
@@ -152,6 +152,8 @@ SECTIONS
STABS_DEBUG
DWARF_DEBUG
ARM_DETAILS
+
+ ARM_ASSERTS
}

/*
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index dc672fe35de3..5f4922e858d0 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -151,6 +151,8 @@ SECTIONS
STABS_DEBUG
DWARF_DEBUG
ARM_DETAILS
+
+ ARM_ASSERTS
}

#ifdef CONFIG_STRICT_KERNEL_RWX
--
2.25.1

Kees Cook

Jul 31, 2020, 7:18:12 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
When !CONFIG_KPROBES, do not generate kprobe sections. This makes sure
the linker scripts never encounter unexpected kprobe sections.

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/x86/include/asm/asm.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 0f63585edf5f..92feec0f0a12 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -138,11 +138,15 @@
# define _ASM_EXTABLE_FAULT(from, to) \
_ASM_EXTABLE_HANDLE(from, to, ex_handler_fault)

-# define _ASM_NOKPROBE(entry) \
+# ifdef CONFIG_KPROBES
+# define _ASM_NOKPROBE(entry) \
.pushsection "_kprobe_blacklist","aw" ; \
_ASM_ALIGN ; \
_ASM_PTR (entry); \
.popsection
+# else
+# define _ASM_NOKPROBE(entry)
+# endif

#else
# define _EXPAND_EXTABLE_HANDLE(x) #x
--
2.25.1

Kees Cook

Jul 31, 2020, 7:18:13 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
Explicitly include DWARF sections when they're present in the build.

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/arm64/kernel/vmlinux.lds.S | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 5c1960406b08..4cf825301c3a 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -240,6 +240,7 @@ SECTIONS
_end = .;

STABS_DEBUG
+ DWARF_DEBUG
ELF_DETAILS

HEAD_SYMBOLS
--
2.25.1

Kees Cook

Jul 31, 2020, 7:18:14 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
The .got.plt section should always be zero-sized (or contain only the
linker-generated lazy dispatch entries). Enforce this with an assert and
mark the section as NOLOAD. This is a more sensitive check than just
blindly discarding the section.
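
If I am reading the ELF conventions right, the only non-zero size we
should ever see here comes from the three slots the linker reserves at
the start of .got.plt for lazy dispatch: 3 * 8 bytes == 0x18 on 64-bit
and 3 * 4 bytes == 0xc on 32-bit, which is exactly what the assert
below allows.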

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/x86/kernel/vmlinux.lds.S | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 0cc035cb15f1..7faffe7414d6 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -414,8 +414,20 @@ SECTIONS
ELF_DETAILS

DISCARDS
-}

+ /*
+ * Make sure that the .got.plt is either completely empty or it
+ * contains only the lazy dispatch entries.
+ */
+ .got.plt (NOLOAD) : { *(.got.plt) }
+ ASSERT(SIZEOF(.got.plt) == 0 ||
+#ifdef CONFIG_X86_64
+ SIZEOF(.got.plt) == 0x18,
+#else
+ SIZEOF(.got.plt) == 0xc,
+#endif
+ "Unexpected GOT/PLT entries detected!")
+}

#ifdef CONFIG_X86_32
/*
--
2.25.1

Kees Cook

Jul 31, 2020, 7:18:15 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
We don't want to depend on the linker's orphan section placement
heuristics as these can vary between linkers, and may change between
versions. All sections need to be explicitly handled in the linker script.

With all sections now handled, enable orphan section warnings.

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/arm/boot/compressed/Makefile | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/arm/boot/compressed/Makefile b/arch/arm/boot/compressed/Makefile
index 00602a6fba04..b8a97d81662d 100644
--- a/arch/arm/boot/compressed/Makefile
+++ b/arch/arm/boot/compressed/Makefile
@@ -128,6 +128,8 @@ endif
LDFLAGS_vmlinux += --no-undefined
# Delete all temporary local symbols
LDFLAGS_vmlinux += -X
+# Report orphan sections
+LDFLAGS_vmlinux += --orphan-handling=warn
# Next argument is a linker script
LDFLAGS_vmlinux += -T

--
2.25.1

Kees Cook

Jul 31, 2020, 7:18:15 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
Use the common DISCARDS rule for the linker script in an effort to
regularize it in preparation for warning on orphaned sections.
Additionally, clean up left-over no-op macros.

Signed-off-by: Kees Cook <kees...@chromium.org>
Acked-by: Will Deacon <wi...@kernel.org>
---
arch/arm64/kernel/vmlinux.lds.S | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index b29081d16a70..5c1960406b08 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -6,6 +6,7 @@
*/

#define RO_EXCEPTION_TABLE_ALIGN 8
+#define RUNTIME_DISCARD_EXIT

#include <asm-generic/vmlinux.lds.h>
#include <asm/cache.h>
@@ -89,10 +90,8 @@ SECTIONS
* matching the same input section name. There is no documented
* order of matching.
*/
+ DISCARDS
/DISCARD/ : {
- EXIT_CALL
- *(.discard)
- *(.discard.*)
*(.interp .dynamic)
*(.dynsym .dynstr .hash .gnu.hash)
}
--
2.25.1

Kees Cook

Jul 31, 2020, 7:18:17 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Ard Biesheuvel, Catalin Marinas, Mark Rutland, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
Avoid .eh_frame* section generation by making sure both CFLAGS and AFLAGS
contain -fno-asynchronous-unwind-tables and -fno-unwind-tables.

With all sources of .eh_frame now removed from the build, drop this
DISCARD so we can be alerted in the future if it returns unexpectedly
once orphan section warnings have been enabled.

Suggested-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Will Deacon <wi...@kernel.org>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/arm64/Makefile | 5 ++++-
arch/arm64/kernel/vmlinux.lds.S | 1 -
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 70f5905954dd..35de43c29873 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -47,13 +47,16 @@ endif

KBUILD_CFLAGS += -mgeneral-regs-only \
$(compat_vdso) $(cc_has_k_constraint)
-KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
KBUILD_CFLAGS += $(call cc-disable-warning, psabi)
KBUILD_AFLAGS += $(compat_vdso)

KBUILD_CFLAGS += $(call cc-option,-mabi=lp64)
KBUILD_AFLAGS += $(call cc-option,-mabi=lp64)

+# Avoid generating .eh_frame* sections.
+KBUILD_CFLAGS += -fno-asynchronous-unwind-tables -fno-unwind-tables
+KBUILD_AFLAGS += -fno-asynchronous-unwind-tables -fno-unwind-tables
+
ifeq ($(CONFIG_STACKPROTECTOR_PER_TASK),y)
prepare: stack_protector_prepare
stack_protector_prepare: prepare0
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index df2916b25ee0..b29081d16a70 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -95,7 +95,6 @@ SECTIONS
*(.discard.*)
*(.interp .dynamic)
*(.dynsym .dynstr .hash .gnu.hash)
- *(.eh_frame)
}

. = KIMAGE_VADDR + TEXT_OFFSET;
--
2.25.1

Kees Cook

Jul 31, 2020, 7:18:17 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Nick Desaulniers, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
Fix a case of needless quotes in __section(), which Clang doesn't like.

Acked-by: Will Deacon <wi...@kernel.org>
Reviewed-by: Nick Desaulniers <ndesau...@google.com>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/arm64/mm/mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 1df25f26571d..dce024ea6084 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -42,7 +42,7 @@
u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
u64 idmap_ptrs_per_pgd = PTRS_PER_PGD;

-u64 __section(".mmuoff.data.write") vabits_actual;
+u64 __section(.mmuoff.data.write) vabits_actual;
EXPORT_SYMBOL(vabits_actual);

u64 kimage_voffset __ro_after_init;
--
2.25.1

Kees Cook

unread,
Jul 31, 2020, 7:18:18 PM7/31/20
to Thomas Gleixner, Will Deacon, Kees Cook, Nick Desaulniers, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
We don't want to depend on the linker's orphan section placement
heuristics as these can vary between linkers, and may change between
versions. All sections need to be explicitly handled in the linker
script.

Specifically, this would have made a recently fixed bug very obvious:

ld: warning: orphan section `.fixup' from `arch/arm/lib/copy_from_user.o' being placed in section `.fixup'

With all sections handled, enable orphan section warnings.

Reviewed-by: Nick Desaulniers <ndesau...@google.com>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/arm/Makefile | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 59fde2d598d8..e414e3732b3a 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -16,6 +16,10 @@ LDFLAGS_vmlinux += --be8
KBUILD_LDFLAGS_MODULE += --be8
endif

+# We never want expected sections to be placed heuristically by the
+# linker. All sections should be explicitly named in the linker script.
+LDFLAGS_vmlinux += --orphan-handling=warn
+
ifeq ($(CONFIG_ARM_MODULE_PLTS),y)
KBUILD_LDS_MODULE += $(srctree)/arch/arm/kernel/module.lds
endif
--
2.25.1

Kees Cook

Jul 31, 2020, 7:18:18 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
In preparation for warning on orphan sections, use common macros for
debug sections, discards, and text stubs. Add discards for the unwanted
.note and .rel sections.

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/arm/boot/compressed/vmlinux.lds.S | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/arm/boot/compressed/vmlinux.lds.S b/arch/arm/boot/compressed/vmlinux.lds.S
index 09ac33f52814..b914be3a207b 100644
--- a/arch/arm/boot/compressed/vmlinux.lds.S
+++ b/arch/arm/boot/compressed/vmlinux.lds.S
@@ -2,6 +2,7 @@
/*
* Copyright (C) 2000 Russell King
*/
+#include <asm/vmlinux.lds.h>

#ifdef CONFIG_CPU_ENDIAN_BE8
#define ZIMAGE_MAGIC(x) ( (((x) >> 24) & 0x000000ff) | \
@@ -17,8 +18,11 @@ ENTRY(_start)
SECTIONS
{
/DISCARD/ : {
+ COMMON_DISCARDS
*(.ARM.exidx*)
*(.ARM.extab*)
+ *(.note.*)
+ *(.rel.*)
/*
* Discard any r/w data - this produces a link error if we have any,
* which is required for PIC decompression. Local data generates
@@ -36,9 +40,7 @@ SECTIONS
*(.start)
*(.text)
*(.text.*)
- *(.gnu.warning)
- *(.glue_7t)
- *(.glue_7)
+ ARM_STUBS_TEXT
}
.table : ALIGN(4) {
_table_start = .;
@@ -128,12 +130,10 @@ SECTIONS
PROVIDE(__pecoff_data_size = ALIGN(512) - ADDR(.data));
PROVIDE(__pecoff_end = ALIGN(512));

- .stab 0 : { *(.stab) }
- .stabstr 0 : { *(.stabstr) }
- .stab.excl 0 : { *(.stab.excl) }
- .stab.exclstr 0 : { *(.stab.exclstr) }
- .stab.index 0 : { *(.stab.index) }
- .stab.indexstr 0 : { *(.stab.indexstr) }
- .comment 0 : { *(.comment) }
+ STABS_DEBUG
+ DWARF_DEBUG
+ ARM_DETAILS
+
+ ARM_ASSERTS
}
ASSERT(_edata_real == _edata, "error: zImage file size is incorrect");
--
2.25.1

Kees Cook

Jul 31, 2020, 7:18:19 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Nick Desaulniers, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
In preparation for adding --orphan-handling=warn, explicitly keep the
.ARM.attributes section by wrapping the existing ELF_DETAILS macro in a
new ARM_DETAILS macro.

Suggested-by: Nick Desaulniers <ndesau...@google.com>
Link: https://lore.kernel.org/lkml/CAKwvOdk-racgq5pxsoGS6Vti...@mail.gmail.com/
Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/arm/include/asm/vmlinux.lds.h | 4 ++++
arch/arm/kernel/vmlinux-xip.lds.S | 2 +-
arch/arm/kernel/vmlinux.lds.S | 2 +-
3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/vmlinux.lds.h b/arch/arm/include/asm/vmlinux.lds.h
index a08f4301b718..c4af5182ab48 100644
--- a/arch/arm/include/asm/vmlinux.lds.h
+++ b/arch/arm/include/asm/vmlinux.lds.h
@@ -52,6 +52,10 @@
ARM_MMU_DISCARD(*(__ex_table)) \
COMMON_DISCARDS

+#define ARM_DETAILS \
+ ELF_DETAILS \
+ .ARM.attributes 0 : { *(.ARM.attributes) }
+
#define ARM_STUBS_TEXT \
*(.gnu.warning) \
*(.glue_7) \
diff --git a/arch/arm/kernel/vmlinux-xip.lds.S b/arch/arm/kernel/vmlinux-xip.lds.S
index 904c31fa20ed..57fcbf55f913 100644
--- a/arch/arm/kernel/vmlinux-xip.lds.S
+++ b/arch/arm/kernel/vmlinux-xip.lds.S
@@ -150,7 +150,7 @@ SECTIONS
_end = .;

STABS_DEBUG
- ELF_DETAILS
+ ARM_DETAILS
}

/*
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index bb950c896a67..1d3d3b599635 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -149,7 +149,7 @@ SECTIONS
_end = .;

STABS_DEBUG
- ELF_DETAILS
+ ARM_DETAILS
}

#ifdef CONFIG_STRICT_KERNEL_RWX
--
2.25.1

Kees Cook

Jul 31, 2020, 7:18:20 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
In preparation for adding --orphan-handling=warn, refactor the linker
script header includes, and extract common macros.

Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/arm/{kernel => include/asm}/vmlinux.lds.h | 13 ++++++++-----
arch/arm/kernel/vmlinux-xip.lds.S | 4 +---
arch/arm/kernel/vmlinux.lds.S | 4 +---
3 files changed, 10 insertions(+), 11 deletions(-)
rename arch/arm/{kernel => include/asm}/vmlinux.lds.h (96%)

diff --git a/arch/arm/kernel/vmlinux.lds.h b/arch/arm/include/asm/vmlinux.lds.h
similarity index 96%
rename from arch/arm/kernel/vmlinux.lds.h
rename to arch/arm/include/asm/vmlinux.lds.h
index 381a8e105fa5..a08f4301b718 100644
--- a/arch/arm/kernel/vmlinux.lds.h
+++ b/arch/arm/include/asm/vmlinux.lds.h
@@ -1,4 +1,5 @@
/* SPDX-License-Identifier: GPL-2.0 */
+#include <asm-generic/vmlinux.lds.h>

#ifdef CONFIG_HOTPLUG_CPU
#define ARM_CPU_DISCARD(x)
@@ -49,8 +50,12 @@
EXIT_CALL \
ARM_MMU_DISCARD(*(.text.fixup)) \
ARM_MMU_DISCARD(*(__ex_table)) \
- *(.discard) \
- *(.discard.*)
+ COMMON_DISCARDS
+
+#define ARM_STUBS_TEXT \
+ *(.gnu.warning) \
+ *(.glue_7) \
+ *(.glue_7t)

#define ARM_TEXT \
IDMAP_TEXT \
@@ -64,9 +69,7 @@
CPUIDLE_TEXT \
LOCK_TEXT \
KPROBES_TEXT \
- *(.gnu.warning) \
- *(.glue_7) \
- *(.glue_7t) \
+ ARM_STUBS_TEXT \
. = ALIGN(4); \
*(.got) /* Global offset table */ \
ARM_CPU_KEEP(PROC_INFO)
diff --git a/arch/arm/kernel/vmlinux-xip.lds.S b/arch/arm/kernel/vmlinux-xip.lds.S
index 3d4e88f08196..904c31fa20ed 100644
--- a/arch/arm/kernel/vmlinux-xip.lds.S
+++ b/arch/arm/kernel/vmlinux-xip.lds.S
@@ -9,15 +9,13 @@

#include <linux/sizes.h>

-#include <asm-generic/vmlinux.lds.h>
+#include <asm/vmlinux.lds.h>
#include <asm/cache.h>
#include <asm/thread_info.h>
#include <asm/memory.h>
#include <asm/mpu.h>
#include <asm/page.h>

-#include "vmlinux.lds.h"
-
OUTPUT_ARCH(arm)
ENTRY(stext)

diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 5592f14b7e35..bb950c896a67 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -9,15 +9,13 @@
#else

#include <linux/pgtable.h>
-#include <asm-generic/vmlinux.lds.h>
+#include <asm/vmlinux.lds.h>
#include <asm/cache.h>
#include <asm/thread_info.h>
#include <asm/memory.h>
#include <asm/mpu.h>
#include <asm/page.h>

-#include "vmlinux.lds.h"
-
OUTPUT_ARCH(arm)
ENTRY(stext)

--
2.25.1

Kees Cook

Jul 31, 2020, 7:18:20 PM
to Thomas Gleixner, Will Deacon, Kees Cook, Nick Desaulniers, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
Add missing text stub sections .vfp11_veneer and .v4_bx, as well as
missing DWARF sections, when present in the build.

Reviewed-by: Nick Desaulniers <ndesau...@google.com>
Signed-off-by: Kees Cook <kees...@chromium.org>
---
arch/arm/include/asm/vmlinux.lds.h | 4 +++-
arch/arm/kernel/vmlinux-xip.lds.S | 1 +
arch/arm/kernel/vmlinux.lds.S | 1 +
3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/vmlinux.lds.h b/arch/arm/include/asm/vmlinux.lds.h
index c4af5182ab48..6624dd97475c 100644
--- a/arch/arm/include/asm/vmlinux.lds.h
+++ b/arch/arm/include/asm/vmlinux.lds.h
@@ -59,7 +59,9 @@
#define ARM_STUBS_TEXT \
*(.gnu.warning) \
*(.glue_7) \
- *(.glue_7t)
+ *(.glue_7t) \
+ *(.vfp11_veneer) \
+ *(.v4_bx)

#define ARM_TEXT \
IDMAP_TEXT \
diff --git a/arch/arm/kernel/vmlinux-xip.lds.S b/arch/arm/kernel/vmlinux-xip.lds.S
index 57fcbf55f913..11ffa79751da 100644
--- a/arch/arm/kernel/vmlinux-xip.lds.S
+++ b/arch/arm/kernel/vmlinux-xip.lds.S
@@ -150,6 +150,7 @@ SECTIONS
_end = .;

STABS_DEBUG
+ DWARF_DEBUG
ARM_DETAILS
}

diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 1d3d3b599635..dc672fe35de3 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -149,6 +149,7 @@ SECTIONS
_end = .;

STABS_DEBUG
+ DWARF_DEBUG
ARM_DETAILS
}

--
2.25.1

Nick Desaulniers

Jul 31, 2020, 7:42:29 PM
to Kees Cook, Thomas Gleixner, Will Deacon, Arvind Sankar, Ard Biesheuvel, Fangrui Song, Sedat Dilek, Catalin Marinas, Mark Rutland, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), clang-built-linux, linux-arch, linux-efi, Linux ARM, LKML, Dmitry Golovin
On Fri, Jul 31, 2020 at 4:08 PM Kees Cook <kees...@chromium.org> wrote:
>
> From: Arvind Sankar <nive...@alum.mit.edu>
>
> The assembly code in head_{32,64}.S, while meant to be
> position-independent, generates run-time relocations because it uses
> instructions such as
> leal gdt(%edx), %eax
> which make the assembler and linker think that the code is using %edx as
> an index into gdt, and hence gdt needs to be relocated to its run-time
> address.
>
> On 32-bit, with lld Dmitry Golovin reports that this results in a

^ If it's not too late, you could give Dima a shout-out with a Reported-by tag?

Reported-by: Dmitry Golovin <di...@golovin.in>

> link-time error with default options (i.e. unless -z notext is
> explicitly passed):
> LD arch/x86/boot/compressed/vmlinux
> ld.lld: error: can't create dynamic relocation R_386_32 against local
> symbol in readonly segment; recompile object files with -fPIC or pass
> '-Wl,-z,notext' to allow text relocations in the output
>
> With the BFD linker, this generates a warning during the build, if
> --warn-shared-textrel is enabled, which at least Gentoo enables by
> default:
> LD arch/x86/boot/compressed/vmlinux
> ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
> ld: warning: creating a DT_TEXTREL in object
>
> On 64-bit, it is not possible to link the kernel as -pie with lld, and
> it is only possible with a BFD linker that supports -z noreloc-overflow,
> i.e. versions >2.26. This is because these instructions cannot really be
> relocated: the displacement field is only 32-bits wide, and thus cannot
> be relocated for a 64-bit load address. The -z noreloc-overflow option
> simply overrides the linker error, and results in R_X86_64_RELATIVE
> relocations that apply a 64-bit relocation to a 32-bit field anyway.
> This happens to work because nothing will process these run-time
> relocations.
>
> Start fixing this by removing relocations from .head.text:
> - On 32-bit, use a base register that holds the address of the GOT and
> reference symbol addresses using @GOTOFF, i.e.
> leal gdt@GOTOFF(%edx), %eax
> - On 64-bit, most of the code can (and already does) use %rip-relative
> addressing, however the .code32 bits can't, and the 64-bit code also
> needs to reference symbol addresses as they will be after moving the
> compressed kernel to the end of the decompression buffer.
> For these cases, reference the symbols as an offset to startup_32 to
> avoid creating relocations, i.e.
> leal (gdt-startup_32)(%bp), %eax
> This only works in .head.text as the subtraction cannot be represented
> as a PC-relative relocation unless startup_32 is in the same section
> as the code. Move efi32_pe_entry into .head.text so that it can use
> the same method to avoid relocations.
>
> Tested-by: Nick Desaulniers <ndesau...@google.com>
> Reviewed-by: Kees Cook <kees...@chromium.org>
> Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
> Reviewed-by: Fangrui Song <mas...@google.com>
> Tested-by: Sedat Dilek <sedat...@gmail.com>
> Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
> Signed-off-by: Kees Cook <kees...@chromium.org>
> ---
> arch/x86/boot/compressed/head_32.S | 64 +++++++-----------
> arch/x86/boot/compressed/head_64.S | 104 ++++++++++++++++++-----------
> 2 files changed, 90 insertions(+), 78 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
> index 39f0bb43218f..8c1a4f5610f5 100644
> --- a/arch/x86/boot/compressed/head_32.S
> +++ b/arch/x86/boot/compressed/head_32.S
> @@ -33,26 +33,10 @@
> #include <asm/bootparam.h>
>
> /*
> - * The 32-bit x86 assembler in binutils 2.26 will generate R_386_GOT32X
> - * relocation to get the symbol address in PIC. When the compressed x86
> - * kernel isn't built as PIC, the linker optimizes R_386_GOT32X
> - * relocations to their fixed symbol addresses. However, when the
> - * compressed x86 kernel is loaded at a different address, it leads
> - * to the following load failure:
> - *
> - * Failed to allocate space for phdrs
> - *
> - * during the decompression stage.
> - *
> - * If the compressed x86 kernel is relocatable at run-time, it should be
> - * compiled with -fPIE, instead of -fPIC, if possible and should be built as
> - * Position Independent Executable (PIE) so that linker won't optimize
> - * R_386_GOT32X relocation to its fixed symbol address. Older
> - * linkers generate R_386_32 relocations against locally defined symbols,
> - * _bss, _ebss and _end, in PIE. It isn't wrong, just less optimal than
> - * R_386_RELATIVE. But the x86 kernel fails to properly handle R_386_32
> - * relocations when relocating the kernel. To generate R_386_RELATIVE
> - * relocations, we mark _bss, _ebss and _end as hidden:
> + * These symbols needed to be marked as .hidden to prevent the BFD linker from
> + * generating R_386_32 (rather than R_386_RELATIVE) relocations for them when
> + * the 32-bit compressed kernel is linked as PIE. This is no longer necessary,
> + * but it doesn't hurt to keep them .hidden.
> */
> .hidden _bss
> .hidden _ebss
> @@ -74,10 +58,10 @@ SYM_FUNC_START(startup_32)
> leal (BP_scratch+4)(%esi), %esp
> call 1f
> 1: popl %edx
> - subl $1b, %edx
> + addl $_GLOBAL_OFFSET_TABLE_+(.-1b), %edx
>
> /* Load new GDT */
> - leal gdt(%edx), %eax
> + leal gdt@GOTOFF(%edx), %eax
> movl %eax, 2(%eax)
> lgdt (%eax)
>
> @@ -90,14 +74,16 @@ SYM_FUNC_START(startup_32)
> movl %eax, %ss
>
> /*
> - * %edx contains the address we are loaded at by the boot loader and %ebx
> - * contains the address where we should move the kernel image temporarily
> - * for safe in-place decompression. %ebp contains the address that the kernel
> - * will be decompressed to.
> + * %edx contains the address we are loaded at by the boot loader (plus the
> + * offset to the GOT). The below code calculates %ebx to be the address where
> + * we should move the kernel image temporarily for safe in-place decompression
> + * (again, plus the offset to the GOT).
> + *
> + * %ebp is calculated to be the address that the kernel will be decompressed to.
> */
>
> #ifdef CONFIG_RELOCATABLE
> - movl %edx, %ebx
> + leal startup_32@GOTOFF(%edx), %ebx
>
> #ifdef CONFIG_EFI_STUB
> /*
> @@ -108,7 +94,7 @@ SYM_FUNC_START(startup_32)
> * image_offset = startup_32 - image_base
> * Otherwise image_offset will be zero and has no effect on the calculations.
> */
> - subl image_offset(%edx), %ebx
> + subl image_offset@GOTOFF(%edx), %ebx
> #endif
>
> movl BP_kernel_alignment(%esi), %eax
> @@ -125,10 +111,10 @@ SYM_FUNC_START(startup_32)
> movl %ebx, %ebp // Save the output address for later
> /* Target address to relocate to for decompression */
> addl BP_init_size(%esi), %ebx
> - subl $_end, %ebx
> + subl $_end@GOTOFF, %ebx
>
> /* Set up the stack */
> - leal boot_stack_end(%ebx), %esp
> + leal boot_stack_end@GOTOFF(%ebx), %esp
>
> /* Zero EFLAGS */
> pushl $0
> @@ -139,8 +125,8 @@ SYM_FUNC_START(startup_32)
> * where decompression in place becomes safe.
> */
> pushl %esi
> - leal (_bss-4)(%edx), %esi
> - leal (_bss-4)(%ebx), %edi
> + leal (_bss@GOTOFF-4)(%edx), %esi
> + leal (_bss@GOTOFF-4)(%ebx), %edi
> movl $(_bss - startup_32), %ecx
> shrl $2, %ecx
> std
> @@ -153,14 +139,14 @@ SYM_FUNC_START(startup_32)
> * during extract_kernel below. To avoid any issues, repoint the GDTR
> * to the new copy of the GDT.
> */
> - leal gdt(%ebx), %eax
> + leal gdt@GOTOFF(%ebx), %eax
> movl %eax, 2(%eax)
> lgdt (%eax)
>
> /*
> * Jump to the relocated address.
> */
> - leal .Lrelocated(%ebx), %eax
> + leal .Lrelocated@GOTOFF(%ebx), %eax
> jmp *%eax
> SYM_FUNC_END(startup_32)
>
> @@ -170,7 +156,7 @@ SYM_FUNC_START_ALIAS(efi_stub_entry)
> add $0x4, %esp
> movl 8(%esp), %esi /* save boot_params pointer */
> call efi_main
> - leal startup_32(%eax), %eax
> + /* efi_main returns the possibly relocated address of startup_32 */
> jmp *%eax
> SYM_FUNC_END(efi32_stub_entry)
> SYM_FUNC_END_ALIAS(efi_stub_entry)
> @@ -183,8 +169,8 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
> * Clear BSS (stack is currently empty)
> */
> xorl %eax, %eax
> - leal _bss(%ebx), %edi
> - leal _ebss(%ebx), %ecx
> + leal _bss@GOTOFF(%ebx), %edi
> + leal _ebss@GOTOFF(%ebx), %ecx
> subl %edi, %ecx
> shrl $2, %ecx
> rep stosl
> @@ -198,9 +184,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
> pushl %ebp /* output address */
>
> pushl $z_input_len /* input_len */
> - leal input_data(%ebx), %eax
> + leal input_data@GOTOFF(%ebx), %eax
> pushl %eax /* input_data */
> - leal boot_heap(%ebx), %eax
> + leal boot_heap@GOTOFF(%ebx), %eax
> pushl %eax /* heap area */
> pushl %esi /* real mode pointer */
> call extract_kernel /* returns kernel location in %eax */
> diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> index bf1ab30acc5b..11429092c224 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S
> @@ -43,6 +43,32 @@
> .hidden _end
>
> __HEAD
> +
> +/*
> + * This macro gives the relative virtual address of X, i.e. the offset of X
> + * from startup_32. This is the same as the link-time virtual address of X,
> + * since startup_32 is at 0, but defining it this way tells the
> + * assembler/linker that we do not want the actual run-time address of X. This
> + * prevents the linker from trying to create unwanted run-time relocation
> + * entries for the reference when the compressed kernel is linked as PIE.
> + *
> + * A reference X(%reg) will result in the link-time VA of X being stored with
> + * the instruction, and a run-time R_X86_64_RELATIVE relocation entry that
> + * adds the 64-bit base address where the kernel is loaded.
> + *
> + * Replacing it with (X-startup_32)(%reg) results in the offset being stored,
> + * and no run-time relocation.
> + *
> + * The macro should be used as a displacement with a base register containing
> + * the run-time address of startup_32 [i.e. rva(X)(%reg)], or as an immediate
> + * [$ rva(X)].
> + *
> + * This macro can only be used from within the .head.text section, since the
> + * expression requires startup_32 to be in the same section as the code being
> + * assembled.
> + */
> +#define rva(X) ((X) - startup_32)
> +
> .code32
> SYM_FUNC_START(startup_32)
> /*
> @@ -65,10 +91,10 @@ SYM_FUNC_START(startup_32)
> leal (BP_scratch+4)(%esi), %esp
> call 1f
> 1: popl %ebp
> - subl $1b, %ebp
> + subl $ rva(1b), %ebp
>
> /* Load new GDT with the 64bit segments using 32bit descriptor */
> - leal gdt(%ebp), %eax
> + leal rva(gdt)(%ebp), %eax
> movl %eax, 2(%eax)
> lgdt (%eax)
>
> @@ -81,7 +107,7 @@ SYM_FUNC_START(startup_32)
> movl %eax, %ss
>
> /* setup a stack and make sure cpu supports long mode. */
> - leal boot_stack_end(%ebp), %esp
> + leal rva(boot_stack_end)(%ebp), %esp
>
> call verify_cpu
> testl %eax, %eax
> @@ -108,7 +134,7 @@ SYM_FUNC_START(startup_32)
> * image_offset = startup_32 - image_base
> * Otherwise image_offset will be zero and has no effect on the calculations.
> */
> - subl image_offset(%ebp), %ebx
> + subl rva(image_offset)(%ebp), %ebx
> #endif
>
> movl BP_kernel_alignment(%esi), %eax
> @@ -124,7 +150,7 @@ SYM_FUNC_START(startup_32)
>
> /* Target address to relocate to for decompression */
> addl BP_init_size(%esi), %ebx
> - subl $_end, %ebx
> + subl $ rva(_end), %ebx
>
> /*
> * Prepare for entering 64 bit mode
> @@ -152,19 +178,19 @@ SYM_FUNC_START(startup_32)
> 1:
>
> /* Initialize Page tables to 0 */
> - leal pgtable(%ebx), %edi
> + leal rva(pgtable)(%ebx), %edi
> xorl %eax, %eax
> movl $(BOOT_INIT_PGT_SIZE/4), %ecx
> rep stosl
>
> /* Build Level 4 */
> - leal pgtable + 0(%ebx), %edi
> + leal rva(pgtable + 0)(%ebx), %edi
> leal 0x1007 (%edi), %eax
> movl %eax, 0(%edi)
> addl %edx, 4(%edi)
>
> /* Build Level 3 */
> - leal pgtable + 0x1000(%ebx), %edi
> + leal rva(pgtable + 0x1000)(%ebx), %edi
> leal 0x1007(%edi), %eax
> movl $4, %ecx
> 1: movl %eax, 0x00(%edi)
> @@ -175,7 +201,7 @@ SYM_FUNC_START(startup_32)
> jnz 1b
>
> /* Build Level 2 */
> - leal pgtable + 0x2000(%ebx), %edi
> + leal rva(pgtable + 0x2000)(%ebx), %edi
> movl $0x00000183, %eax
> movl $2048, %ecx
> 1: movl %eax, 0(%edi)
> @@ -186,7 +212,7 @@ SYM_FUNC_START(startup_32)
> jnz 1b
>
> /* Enable the boot page tables */
> - leal pgtable(%ebx), %eax
> + leal rva(pgtable)(%ebx), %eax
> movl %eax, %cr3
>
> /* Enable Long mode in EFER (Extended Feature Enable Register) */
> @@ -211,14 +237,14 @@ SYM_FUNC_START(startup_32)
> * We place all of the values on our mini stack so lret can
> * used to perform that far jump.
> */
> - leal startup_64(%ebp), %eax
> + leal rva(startup_64)(%ebp), %eax
> #ifdef CONFIG_EFI_MIXED
> - movl efi32_boot_args(%ebp), %edi
> + movl rva(efi32_boot_args)(%ebp), %edi
> cmp $0, %edi
> jz 1f
> - leal efi64_stub_entry(%ebp), %eax
> - movl efi32_boot_args+4(%ebp), %esi
> - movl efi32_boot_args+8(%ebp), %edx // saved bootparams pointer
> + leal rva(efi64_stub_entry)(%ebp), %eax
> + movl rva(efi32_boot_args+4)(%ebp), %esi
> + movl rva(efi32_boot_args+8)(%ebp), %edx // saved bootparams pointer
> cmpl $0, %edx
> jnz 1f
> /*
> @@ -229,7 +255,7 @@ SYM_FUNC_START(startup_32)
> * the correct stack alignment for entry.
> */
> subl $40, %esp
> - leal efi_pe_entry(%ebp), %eax
> + leal rva(efi_pe_entry)(%ebp), %eax
> movl %edi, %ecx // MS calling convention
> movl %esi, %edx
> 1:
> @@ -255,18 +281,18 @@ SYM_FUNC_START(efi32_stub_entry)
>
> call 1f
> 1: pop %ebp
> - subl $1b, %ebp
> + subl $ rva(1b), %ebp
>
> - movl %esi, efi32_boot_args+8(%ebp)
> + movl %esi, rva(efi32_boot_args+8)(%ebp)
> SYM_INNER_LABEL(efi32_pe_stub_entry, SYM_L_LOCAL)
> - movl %ecx, efi32_boot_args(%ebp)
> - movl %edx, efi32_boot_args+4(%ebp)
> - movb $0, efi_is64(%ebp)
> + movl %ecx, rva(efi32_boot_args)(%ebp)
> + movl %edx, rva(efi32_boot_args+4)(%ebp)
> + movb $0, rva(efi_is64)(%ebp)
>
> /* Save firmware GDTR and code/data selectors */
> - sgdtl efi32_boot_gdt(%ebp)
> - movw %cs, efi32_boot_cs(%ebp)
> - movw %ds, efi32_boot_ds(%ebp)
> + sgdtl rva(efi32_boot_gdt)(%ebp)
> + movw %cs, rva(efi32_boot_cs)(%ebp)
> + movw %ds, rva(efi32_boot_ds)(%ebp)
>
> /* Disable paging */
> movl %cr0, %eax
> @@ -345,11 +371,11 @@ SYM_CODE_START(startup_64)
>
> /* Target address to relocate to for decompression */
> movl BP_init_size(%rsi), %ebx
> - subl $_end, %ebx
> + subl $ rva(_end), %ebx
> addq %rbp, %rbx
>
> /* Set up the stack */
> - leaq boot_stack_end(%rbx), %rsp
> + leaq rva(boot_stack_end)(%rbx), %rsp
>
> /*
> * At this point we are in long mode with 4-level paging enabled,
> @@ -423,7 +449,7 @@ SYM_CODE_START(startup_64)
> lretq
> trampoline_return:
> /* Restore the stack, the 32-bit trampoline uses its own stack */
> - leaq boot_stack_end(%rbx), %rsp
> + leaq rva(boot_stack_end)(%rbx), %rsp
>
> /*
> * cleanup_trampoline() would restore trampoline memory.
> @@ -435,7 +461,7 @@ trampoline_return:
> * this function call.
> */
> pushq %rsi
> - leaq top_pgtable(%rbx), %rdi
> + leaq rva(top_pgtable)(%rbx), %rdi
> call cleanup_trampoline
> popq %rsi
>
> @@ -449,9 +475,9 @@ trampoline_return:
> */
> pushq %rsi
> leaq (_bss-8)(%rip), %rsi
> - leaq (_bss-8)(%rbx), %rdi
> - movq $_bss /* - $startup_32 */, %rcx
> - shrq $3, %rcx
> + leaq rva(_bss-8)(%rbx), %rdi
> + movl $(_bss - startup_32), %ecx
> + shrl $3, %ecx
> std
> rep movsq
> cld
> @@ -462,15 +488,15 @@ trampoline_return:
> * during extract_kernel below. To avoid any issues, repoint the GDTR
> * to the new copy of the GDT.
> */
> - leaq gdt64(%rbx), %rax
> - leaq gdt(%rbx), %rdx
> + leaq rva(gdt64)(%rbx), %rax
> + leaq rva(gdt)(%rbx), %rdx
> movq %rdx, 2(%rax)
> lgdt (%rax)
>
> /*
> * Jump to the relocated address.
> */
> - leaq .Lrelocated(%rbx), %rax
> + leaq rva(.Lrelocated)(%rbx), %rax
> jmp *%rax
> SYM_CODE_END(startup_64)
>
> @@ -482,7 +508,7 @@ SYM_FUNC_START_ALIAS(efi_stub_entry)
> movq %rdx, %rbx /* save boot_params pointer */
> call efi_main
> movq %rbx,%rsi
> - leaq startup_64(%rax), %rax
> + leaq rva(startup_64)(%rax), %rax
> jmp *%rax
> SYM_FUNC_END(efi64_stub_entry)
> SYM_FUNC_END_ALIAS(efi_stub_entry)
> @@ -645,7 +671,7 @@ SYM_DATA(efi_is64, .byte 1)
> #define BS32_handle_protocol 88 // offsetof(efi_boot_services_32_t, handle_protocol)
> #define LI32_image_base 32 // offsetof(efi_loaded_image_32_t, image_base)
>
> - .text
> + __HEAD
> .code32
> SYM_FUNC_START(efi32_pe_entry)
> /*
> @@ -667,12 +693,12 @@ SYM_FUNC_START(efi32_pe_entry)
>
> call 1f
> 1: pop %ebx
> - subl $1b, %ebx
> + subl $ rva(1b), %ebx
>
> /* Get the loaded image protocol pointer from the image handle */
> leal -4(%ebp), %eax
> pushl %eax // &loaded_image
> - leal loaded_image_proto(%ebx), %eax
> + leal rva(loaded_image_proto)(%ebx), %eax
> pushl %eax // pass the GUID address
> pushl 8(%ebp) // pass the image handle
>
> @@ -707,7 +733,7 @@ SYM_FUNC_START(efi32_pe_entry)
> * use it before we get to the 64-bit efi_pe_entry() in C code.
> */
> subl %esi, %ebx
> - movl %ebx, image_offset(%ebp) // save image_offset
> + movl %ebx, rva(image_offset)(%ebp) // save image_offset
> jmp efi32_pe_stub_entry
>
> 2: popl %edi // restore callee-save registers
> --
> 2.25.1
>


--
Thanks,
~Nick Desaulniers

Arvind Sankar

unread,
Jul 31, 2020, 9:47:59 PM7/31/20
to Kees Cook, Thomas Gleixner, Will Deacon, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
There's no point in marking zero-size sections NOLOAD -- if the ASSERT
passes, they won't be present in the file at all anyway.

The only section for which there might be a point is .got.plt, which is
non-empty on 32-bit, and only if it is first moved to the end. That
saves a few bytes.
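
As a concrete illustration, a minimal linker-script sketch of the pattern
being discussed (not the actual kernel script; the section name and the
assert message are only examples):

/* Keep the section explicitly and fail the link if it ever gains content.
 * If the ASSERT passes, the output section is zero-sized and contributes
 * nothing to the image, so NOLOAD would change nothing. */
.got : {
	*(.got)
}
ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")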

Arvind Sankar

unread,
Jul 31, 2020, 10:12:51 PM7/31/20
to Kees Cook, Thomas Gleixner, Will Deacon, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
Is this actually needed? vmlinux is a position-dependent executable, and
it doesn't get linked with any shared libraries, so it should never have
a .got or .got.plt at all I think? Does it show up as an orphan without
this?

Arvind Sankar

unread,
Jul 31, 2020, 10:53:29 PM7/31/20
to Arvind Sankar, Kees Cook, Thomas Gleixner, Will Deacon, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
Btw, you should move .got.plt also to the end anyway for readability;
it's unused even if non-empty. And with the ASSERT being placed
immediately after it, it's even more distracting from the actual section
layout.

Arvind Sankar

unread,
Jul 31, 2020, 11:51:32 PM7/31/20
to Kees Cook, Thomas Gleixner, Will Deacon, Nick Desaulniers, Jian Cai, Fāng-ruì Sòng, Luis Lozano, Manoj Gupta, sta...@vger.kernel.org, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, Andi Kleen, Michal Marek
On Fri, Jul 31, 2020 at 04:07:57PM -0700, Kees Cook wrote:
> From: Nick Desaulniers <ndesau...@google.com>
>
> Basically, consider .text.{hot|unlikely|unknown}.* part of .text, too.
>
> When compiling with profiling information (collected via PGO
> instrumentations or AutoFDO sampling), Clang will separate code into
> .text.hot, .text.unlikely, or .text.unknown sections based on profiling
> information. After D79600 (clang-11), these sections will have a
> trailing `.` suffix, ie. .text.hot., .text.unlikely., .text.unknown..
>
> When using -ffunction-sections together with profiling information,
> either explicitly (FGKASLR) or implicitly (LTO), code may be placed in
> sections following the convention:
> .text.hot.<foo>, .text.unlikely.<bar>, .text.unknown.<baz>
> where <foo>, <bar>, and <baz> are functions. (This produces one section
> per function; we generally try to merge these all back via linker script
> so that we don't have 50k sections).
>
> For the above cases, we need to teach our linker scripts that such
> sections might exist and that we'd explicitly like them grouped
> together, otherwise we can wind up with code outside of the
> _stext/_etext boundaries that might not be mapped properly for some
> architectures, resulting in boot failures.
>
> If the linker script is not told about possible input sections, then
> where the section is placed as output is a heuristic-laden mess that's
> non-portable between linkers (ie. BFD and LLD), and has resulted in many
> hard to debug bugs. Kees Cook is working on cleaning this up by adding
> --orphan-handling=warn linker flag used in ARCH=powerpc to additional
> architectures. In the case of linker scripts, borrowing from the Zen of
> Python: explicit is better than implicit.
>
> Also, ld.bfd's internal linker script considers .text.hot AND
> .text.hot.* to be part of .text, as well as .text.unlikely and
> .text.unlikely.*. I didn't see support for .text.unknown.*, and didn't
> see Clang producing such code in our kernel builds, but I see code in
> LLVM that can produce such section names if profiling information is
> missing. That may point to a larger issue with generating or collecting
> profiles, but I would much rather be safe and explicit than have to
> debug yet another issue related to orphan section placement.
>
> Reported-by: Jian Cai <jia...@google.com>
> Suggested-by: Fāng-ruì Sòng <mas...@google.com>
> Tested-by: Luis Lozano <llo...@google.com>
> Tested-by: Manoj Gupta <manoj...@google.com>
> Acked-by: Kees Cook <kees...@chromium.org>
> Cc: sta...@vger.kernel.org
> Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=add44f8d5c5c05e08b11e033127a744d61c26aee
> Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=1de778ed23ce7492c523d5850c6c6dbb34152655
> Link: https://reviews.llvm.org/D79600
> Link: https://bugs.chromium.org/p/chromium/issues/detail?id=1084760
> Debugged-by: Luis Lozano <llo...@google.com>
> Signed-off-by: Nick Desaulniers <ndesau...@google.com>
> Signed-off-by: Kees Cook <kees...@chromium.org>
> ---
> include/asm-generic/vmlinux.lds.h | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 2593957f6e8b..af5211ca857c 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -561,7 +561,10 @@
> */
> #define TEXT_TEXT \
> ALIGN_FUNCTION(); \
> - *(.text.hot TEXT_MAIN .text.fixup .text.unlikely) \
> + *(.text.hot .text.hot.*) \
> + *(TEXT_MAIN .text.fixup) \
> + *(.text.unlikely .text.unlikely.*) \
> + *(.text.unknown .text.unknown.*) \
> NOINSTR_TEXT \
> *(.text..refcount) \
> *(.ref.text) \
> --
> 2.25.1
>

This also changes the ordering to place all hot (resp. unlikely) sections separate
from other text, while currently it places the hot/unlikely bits of each file
together with the rest of the code in that file. That seems like a reasonable
change and should be mentioned in the commit message.

However, the history of their being together comes from

9bebe9e5b0f3 ("kbuild: Fix .text.unlikely placement")

which seems to indicate there was some problem with having them separated out,
although I don't quite understand what the issue was from the commit message.

Cc Andi and Michal to see if they remember.

Kees Cook

unread,
Aug 1, 2020, 1:33:01 AM8/1/20
to Arvind Sankar, Thomas Gleixner, Will Deacon, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
When I switched from DISCARD to 0-assert, I tested all of these, but given
so many combinations, perhaps I made a mistake. I will double-check and
report back.

--
Kees Cook

Kees Cook

unread,
Aug 1, 2020, 1:35:16 AM8/1/20
to Arvind Sankar, Thomas Gleixner, Will Deacon, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
On Fri, Jul 31, 2020 at 09:47:55PM -0400, Arvind Sankar wrote:
I did not find that universally true. I found some sections being written
out with a 0 size. Some I could remove from disk with NOLOAD, others did
not like that so much.

> The only section for which there might be a point is .got.plt, which is
> non-empty on 32-bit, and only if it is first moved to the end. That
> saves a few bytes.

What do you mean about "only if it is first moved to the end"? Would it
be zero-sized if it was closer to .text?

--
Kees Cook

Kees Cook

unread,
Aug 1, 2020, 1:36:03 AM8/1/20
to Arvind Sankar, Thomas Gleixner, Will Deacon, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
ld.bfd (if I'm remembering correctly) was extraordinarily upset about it
being at the end. I will retest and report back.

--
Kees Cook

Kees Cook

unread,
Aug 1, 2020, 2:18:05 AM8/1/20
to Arvind Sankar, Thomas Gleixner, Will Deacon, Nick Desaulniers, Jian Cai, Fāng-ruì Sòng, Luis Lozano, Manoj Gupta, sta...@vger.kernel.org, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, Andi Kleen, Michal Marek, Kristen Carlson Accardi
Oh, hmm, yes, we aren't explicitly using SORT() here. Does that mean the
input sections are entirely ordered in compilation unit link order,
even in the case of orphan sections? (And I think either way, the answer
isn't the same between bfd and lld.) I actually thought the like-named
input sections were collected together first with lld, but bfd strictly
appended to the output section. I guess it's time for me to stare at -M
output from ld...

Regardless, this patch is attempting to fix the problem where bfd and lld
lay out the orphans differently (as mentioned above, lld seems to sort
them in a way that is not strictly appended, and bfd seems to sort them
strictly appended). In the case of being appended to the .text output
section, this would cause boot failures due to _etext not covering the
resulting sections (which this[1] also encountered and fixed to be more
robust for such appended collection -- that series actually _depends_ on
orphan handling doing the appending, because there is no current way
to map wildcard input sections to their own separate output sections).

> change and should be mentioned in the commit message.
>
> However, the history of their being together comes from
>
> 9bebe9e5b0f3 ("kbuild: Fix .text.unlikely placement")
>
> which seems to indicate there was some problem with having them separated out,
> although I don't quite understand what the issue was from the commit message.

Looking at this again, I actually wonder if we have bigger issues here
with dead code elimination:

#ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
#define TEXT_MAIN .text .text.[0-9a-zA-Z_]*
...

that would catch: .text.hot .text.fixup .text.unlikely and .text.unknown
but not .text.hot.*, etc (i.e. the third dot isn't matched, which is,
I assume, why Clang switched to adding a trailing dot). However, this
patch lists .text.hot .text.hot.* first, so they'd get pulled to the
front correctly, but the trailing ones (with 2 dots) would not, since
they'd match the TEXT_MAIN wildcard first. (This problem actually existed
before this patch too, and is not the fault of 9bebe9e5b0f3, but rather
the addition of TEXT_MAIN, which could potentially match .text.unlikely
and .text.fixup)

Unless I'm totally wrong and the bfd docs don't match the behavior? e.g.
if I have a link order of ".foo.before", ".foo.after", and ".foo.middle",
and this rule:

.foo : { *(.foo.before .foo.* .foo.after) }

do I get this (first match):

.foo.before
.foo.after
.foo.middle

or (most specific match):

.foo.before
.foo.middle
.foo.after

?

As I said, now that I'm able to better articulate these questions, I'll
go get answers from -M output. :)

Perhaps we need to fix TEXT_MAIN not TEXT_TEXT? TEXT_TEXT is for
collecting .text, .text.[^\.]* and *.text, where, effectively,
.text and .text[^\.]* are defined by TEXT_MAIN. i.e. adding 3-dot "text"
input sections needs to likely be included in TEXT_MAIN

Anyway, I'll keep looking at this...

(In the meantime, perhaps we can take Arvind's series, and the earlier
portions of the orphan series where asm-generic/vmlinux.lds.h and other
things are cleaned up...)

-Kees

[1] https://lore.kernel.org/lkml/20200717170008...@linux.intel.com/

--
Kees Cook

Arvind Sankar

unread,
Aug 1, 2020, 1:00:53 PM8/1/20
to Kees Cook, Arvind Sankar, Thomas Gleixner, Will Deacon, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
Neither LLD nor BFD is creating 0-sized .got or .rel sections in my
builds. In any case, if they're 0-sized, NOLOAD shouldn't affect
anything: it doesn't remove them from disk, it just stops them from being
loaded into memory, which is a no-op for a 0-sized section.
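
For reference, the syntax under discussion looks like this (the section name
is illustrative):

/* (NOLOAD) turns the output section into SHT_NOBITS: it still gets a header
 * and an address, but occupies no space in the loaded image -- which is
 * already the case for a zero-sized section. */
.example (NOLOAD) : {
	*(.example)
}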

>
> > The only section for which there might be a point is .got.plt, which is
> > non-empty on 32-bit, and only if it is first moved to the end. That
> > saves a few bytes.
>
> What do you mean about "only if it is first moved to the end"? Would it
> be zero-sized if it was closer to .text?
>
> --
> Kees Cook

Sorry, my sentence is confusingly worded: it's always non-empty on
x86-32. I meant, move .got.plt to the end (after _end), add (INFO) to
it, and it might save a few bytes, or not, depending on alignment
padding. If it's left in the middle, it still pushes the addresses of
the remaining sections out, so it doesn't save anything.
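
A rough sketch of that layout (hedged -- this is not the posted patch; the
placement and the size check are illustrative):

/* Move .got.plt past _end and mark it (INFO) so its few reserved entries
 * neither occupy load space nor push the following sections out. */
_end = .;
.got.plt (INFO) : {
	*(.got.plt)
}
/* On 32-bit, .got.plt normally holds only the three reserved 4-byte slots. */
ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")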

I'd tested that out the last time we talked about this, but left it out
of my series as Fangrui was negative about the idea.

I tested with NOLOAD instead of INFO, and at least ld.bfd actually
errors out if .got.plt is marked NOLOAD, no matter where it's located.

LDS arch/x86/boot/compressed/vmlinux.lds
LD arch/x86/boot/compressed/vmlinux
ld: final link failed: section has no contents

Side note: I also discovered something peculiar with the gcc/lld combo.
On x86-64, it turns out that this still generates a .got.plt section,
which was unexpected as _GLOBAL_OFFSET_TABLE_ shouldn't be referenced on
64-bit. It turns out that when gcc (or even clang) generates an
out-of-line call to memcpy from a __builtin_memcpy call, it doesn't
declare the memcpy as hidden even with Ard's hidden.h, or even if memcpy
was explicitly declared with hidden visibility. It uses memcpy@PLT
instead of memcpy, and this generates a reference to
_GLOBAL_OFFSET_TABLE_ in the .o file. The linker later converts this to
a direct call to the function, but LLD leaves .got.plt in the
executable, while BFD strips it out.

It also turns out that clang's integrated assembler, unlike gas, does
not generate a reference to _GLOBAL_OFFSET_TABLE_ for a foo@PLT call. And
because we redefine KBUILD_CFLAGS in boot/compressed Makefile, we lose
the -no-integrated-as option, and clang is using its integrated
assembler when building the compressed kernel.

Arvind Sankar

unread,
Aug 1, 2020, 1:12:28 PM8/1/20
to Kees Cook, Arvind Sankar, Thomas Gleixner, Will Deacon, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
Actually, moving it to the end also requires marking it INFO or
stripping it out when creating the bzImage. Otherwise we get back to
that old problem of materializing .bss/.pgtable in the bzImage.

Arvind Sankar

unread,
Aug 1, 2020, 1:27:49 PM8/1/20
to Kees Cook, Arvind Sankar, Thomas Gleixner, Will Deacon, Nick Desaulniers, Jian Cai, Fāng-ruì Sòng, Luis Lozano, Manoj Gupta, sta...@vger.kernel.org, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, Andi Kleen, Michal Marek, Kristen Carlson Accardi
On Fri, Jul 31, 2020 at 11:18:02PM -0700, Kees Cook wrote:
> On Fri, Jul 31, 2020 at 11:51:28PM -0400, Arvind Sankar wrote:
> >
> > This also changes the ordering to place all hot resp unlikely sections separate
> > from other text, while currently it places the hot/unlikely bits of each file
> > together with the rest of the code in that file. That seems like a reasonable
>
> Oh, hmm, yes, we aren't explicitly using SORT() here. Does that mean the
> input sections were entirely be ordered in compilation unit link order,
> even in the case of orphan sections? (And I think either way, the answer
> isn't the same between bfd and lld.) I actually thought the like-named
> input sections were collected together first with lld, but bfd strictly
> appended to the output section. I guess it's time for me to stare at -M
> output from ld...

I don't know what happened to the orphans previously. But .text.hot and
.text.unlikely will now change ordering. It sounds from below like this
wasn't intentional? Though it does seem to be how BFD's default linker
scripts lay it out.
The existing comment on TEXT_TEXT mentions that issue. However, note
that the dead code stuff is only available currently on mips and ppc,
and is hidden behind EXPERT for those, so I'm not sure if anyone
actually uses it.

9bebe9e5b0f3 predates LD_DEAD_CODE_DATA_ELIMINATION, and there were no
wildcards I can see in .text at the time, which is why I don't
understand what problem is referred to in the commit message.

Btw, for the FGKASLR stuff, instead of keeping the output sections per
function, couldn't you generate a table of functions with sizes, and use
that when randomizing the order? Then the sections themselves could be
collected into .text explicitly.
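
A hedged sketch of what the linker-script side of that might look like (the
wildcards and symbols are illustrative, not taken from any posted FGKASLR
patch):

/* Fold the -ffunction-sections output back into a single .text; the
 * randomizer would then work from a separate table of function offsets and
 * sizes rather than from per-function output sections. */
.text : {
	_stext = .;
	*(.text.hot .text.hot.*)
	*(.text .text.[0-9a-zA-Z_]*)
	*(.text.unlikely .text.unlikely.*)
	_etext = .;
}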

Nick Desaulniers

unread,
Aug 3, 2020, 3:02:19 PM8/3/20
to Kees Cook, Thomas Gleixner, Will Deacon, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nathan Chancellor, Arnd Bergmann, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), clang-built-linux, linux-arch, linux-efi, Linux ARM, LKML
On Fri, Jul 31, 2020 at 4:18 PM Kees Cook <kees...@chromium.org> wrote:
>
> In preparation for adding --orphan-handling=warn, explicitly keep the
> .ARM.attributes section by expanding the existing ELF_DETAILS macro into
> ARM_DETAILS.
>
> Suggested-by: Nick Desaulniers <ndesau...@google.com>
> Link: https://lore.kernel.org/lkml/CAKwvOdk-racgq5pxsoGS6Vti...@mail.gmail.com/
> Signed-off-by: Kees Cook <kees...@chromium.org>
> ---
> arch/arm/include/asm/vmlinux.lds.h | 4 ++++
> arch/arm/kernel/vmlinux-xip.lds.S | 2 +-
> arch/arm/kernel/vmlinux.lds.S | 2 +-
> 3 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/include/asm/vmlinux.lds.h b/arch/arm/include/asm/vmlinux.lds.h
> index a08f4301b718..c4af5182ab48 100644
> --- a/arch/arm/include/asm/vmlinux.lds.h
> +++ b/arch/arm/include/asm/vmlinux.lds.h
> @@ -52,6 +52,10 @@
> ARM_MMU_DISCARD(*(__ex_table)) \
> COMMON_DISCARDS
>
> +#define ARM_DETAILS \
> + ELF_DETAILS \
> + .ARM.attributes 0 : { *(.ARM.attributes) }

I had to look up what the `0` meant:
https://sourceware.org/binutils/docs/ld/Output-Section-Attributes.html#Output-Section-Attributes
mentions it's an "address" and
https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_chapter/ld_3.html#SEC21
mentions it as "start" (an address).
Unless we need those, can we drop them? (Sorry for the resulting churn
that would cause.) I think the NOLOAD stuff makes more sense, but
I'm curious if the kernel checks for that.
--
Thanks,
~Nick Desaulniers

Andi Kleen

unread,
Aug 3, 2020, 3:05:10 PM8/3/20
to Arvind Sankar, Kees Cook, Thomas Gleixner, Will Deacon, Nick Desaulniers, Jian Cai, Fāng-ruì Sòng, Luis Lozano, Manoj Gupta, sta...@vger.kernel.org, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, Michal Marek
> However, the history of their being together comes from
>
> 9bebe9e5b0f3 ("kbuild: Fix .text.unlikely placement")
>
> which seems to indicate there was some problem with having them separated out,
> although I don't quite understand what the issue was from the commit message.

Separating it out is less efficient. It gives worse packing for the hot part
if they are not aligned to 64-byte boundaries, which they are usually not.

It also improves packing of the cold part, but that probably doesn't matter.

-Andi

Arvind Sankar

unread,
Aug 3, 2020, 4:15:29 PM8/3/20
to Andi Kleen, Arvind Sankar, Kees Cook, Thomas Gleixner, Will Deacon, Nick Desaulniers, Jian Cai, Fāng-ruì Sòng, Luis Lozano, Manoj Gupta, sta...@vger.kernel.org, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, Michal Marek
Why is that? Both .text and .text.hot have alignment of 2^4 (default
function alignment on x86) by default, so it doesn't seem like it should
matter for packing density. Avoiding interspersing cold text among
regular/hot text seems like it should be a net win.

That old commit doesn't reference efficiency -- it says there was some
problem with matching when they were separated out, but there were no
wildcard section names back then.

commit 9bebe9e5b0f3109a14000df25308c2971f872605
Author: Andi Kleen <a...@linux.intel.com>
Date: Sun Jul 19 18:01:19 2015 -0700

kbuild: Fix .text.unlikely placement

When building a kernel with .text.unlikely text the unlikely text for
each translation unit was put next to the main .text code in the
final vmlinux.

The problem is that the linker doesn't allow more specific submatches
of a section name in a different linker script statement after the
main match.

So we need to move them all into one line. With that change
.text.unlikely is at the end of everything again.

I also moved .text.hot into the same statement though, even though
that's not strictly needed.

Signed-off-by: Andi Kleen <a...@linux.intel.com>
Signed-off-by: Michal Marek <mma...@suse.com>

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 8bd374d3cf21..1781e54ea6d3 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -412,12 +412,10 @@
* during second ld run in second ld pass when generating System.map */
#define TEXT_TEXT \
ALIGN_FUNCTION(); \
- *(.text.hot) \
- *(.text .text.fixup) \
+ *(.text.hot .text .text.fixup .text.unlikely) \
*(.ref.text) \
MEM_KEEP(init.text) \
MEM_KEEP(exit.text) \
- *(.text.unlikely)


/* sched.text is aling to function alignment to secure we have same

Fāng-ruì Sòng

unread,
Aug 3, 2020, 9:19:40 PM8/3/20
to Arvind Sankar, Kees Cook, Andi Kleen, Thomas Gleixner, Will Deacon, Nick Desaulniers, Jian Cai, Luis Lozano, Manoj Gupta, sta...@vger.kernel.org, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, Michal Marek
On 2020-08-03, Arvind Sankar wrote:
>On Mon, Aug 03, 2020 at 12:05:06PM -0700, Andi Kleen wrote:
>> > However, the history of their being together comes from
>> >
>> > 9bebe9e5b0f3 ("kbuild: Fix .text.unlikely placement")
>> >
>> > which seems to indicate there was some problem with having them separated out,
>> > although I don't quite understand what the issue was from the commit message.
>>
>> Separating it out is less efficient. Gives worse packing for the hot part
>> if they are not aligned to 64byte boundaries, which they are usually not.
>>
>> It also improves packing of the cold part, but that probably doesn't matter.
>>
>> -Andi
>
>Why is that? Both .text and .text.hot have alignment of 2^4 (default
>function alignment on x86) by default, so it doesn't seem like it should
>matter for packing density. Avoiding interspersing cold text among
>regular/hot text seems like it should be a net win.
>
>That old commit doesn't reference efficiency -- it says there was some
>problem with matching when they were separated out, but there were no
>wildcard section names back then.

I just want to share some context. GNU ld's internal linker script does
impose a particular input section order by specifying separate input section descriptions:

.text :
{
*(.text.unlikely .text.*_unlikely .text.unlikely.*)
*(.text.exit .text.exit.*)
*(.text.startup .text.startup.*)
*(.text.hot .text.hot.*)
*(SORT(.text.sorted.*)) # binutils 5fa5f8f5fe494ba4fe98c11899a5464cd164ec75, invented for GCC's call graph profiling. LLVM doesn't use it
*(.text .stub .text.* .gnu.linkonce.t.*)
...

This order is a bit arbitrary. gold and LLD have -z keep-text-section-prefix. With the option,
there can be several output sections, with the '.unlikely'/'.exit'/'.startup'/etc suffix.
This has the advantage that the hot/unlikely/exit/etc attribution of a particular function is more obvious:

[ 2] .text PROGBITS 000000000040007c 00007c 000003 00 AX 0 0 4
[ 3] .text.startup PROGBITS 000000000040007f 00007f 000001 00 AX 0 0 1
[ 4] .text.exit PROGBITS 0000000000400080 000080 000002 00 AX 0 0 1
[ 5] .text.unlikely PROGBITS 0000000000400082 000082 000001 00 AX 0 0 1
...

In our case we only need one output section.

If we place all text sections in one input section description:

*(.text.unlikely .text.*_unlikely .text.exit .text.exit.* .text.startup .text.startup.* .text.hot .text.hot.* ... )

In many cases the input sections are laid out in the input order. In LLD there are two ordering cases:

* If clang PGO (-fprofile-use=) is enabled, .llvm.call-graph-profile will be created automatically.
LLD can perform reordering **within an input section description**. The ordering is quite complex,
you can read https://github.com/llvm/llvm-project/blob/master/lld/ELF/CallGraphSort.cpp#L9 if you are curious:)

I don't know the performance improvement of this heuristic. (I don't think the original paper
cgo2017-hfsort-final1.pdf took ThinLTO into account, so the result might not
reflect realistic workloads where both ThinLTO and PGO are used.) This, if it matters at all,
likely only matters for very large executables, which is not the case for the kernel.
* On some RISC architectures (ARM/AArch64/PowerPC),
the ordered sections (due to either .llvm.call-graph-profile or
--symbol-reordering-file=; the two can't be used together) are placed in a
suitable place in the input section description
( http://reviews.llvm.org/D44969 )


In summary, using one (large) input section description may have some
performance improvement with LLD but I don't think it will be significant.
There may be some size improvement for ARM/AArch64/PowerPC if someone wants to test.

Andi Kleen

unread,
Aug 4, 2020, 12:45:33 AM8/4/20
to Arvind Sankar, Kees Cook, Thomas Gleixner, Will Deacon, Nick Desaulniers, Jian Cai, Fāng-ruì Sòng, Luis Lozano, Manoj Gupta, sta...@vger.kernel.org, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, Michal Marek
> Why is that? Both .text and .text.hot have alignment of 2^4 (default
> function alignment on x86) by default, so it doesn't seem like it should
> matter for packing density. Avoiding interspersing cold text among

You may lose part of a cache line on each unit boundary. Linux has
a lot of units, some of them small. All these bytes add up.

It's bad for TLB locality too. Sadly with all the fine grained protection
changes the 2MB coverage is eroding anyways, but this makes it even worse.

> regular/hot text seems like it should be a net win.

>
> That old commit doesn't reference efficiency -- it says there was some
> problem with matching when they were separated out, but there were no
> wildcard section names back then.

It was about efficiency.

-Andi

Fāng-ruì Sòng

unread,
Aug 4, 2020, 1:32:32 AM8/4/20
to Andi Kleen, Arvind Sankar, Kees Cook, Thomas Gleixner, Will Deacon, Nick Desaulniers, Jian Cai, Luis Lozano, Manoj Gupta, sta...@vger.kernel.org, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, Michal Marek
On 2020-08-03, Andi Kleen wrote:
>> Why is that? Both .text and .text.hot have alignment of 2^4 (default
>> function alignment on x86) by default, so it doesn't seem like it should
>> matter for packing density. Avoiding interspersing cold text among
>
>You may lose part of a cache line on each unit boundary. Linux has
>a lot of units, some of them small. All these bytes add up.
>
>It's bad for TLB locality too. Sadly with all the fine grained protection
>changes the 2MB coverage is eroding anyways, but this makes it even worse.

> Gives worse packing for the hot part
> if they are not aligned to 64byte boundaries, which they are usually
> not.

I do not see how the 64-byte argument is related to this patch.
If a function requires 64-byte alignment to be efficient, the compiler
should communicate this fact by setting the alignment of its containing
section to 64 bytes or above.

If a text section has a 16-byte alignment, the linker can reorder it to
an address which is a multiple of 16 but not a multiple of 64.

I agree with your other statement that having a single input section
description might be helpful. With more than one input section
description, the linker has to respect the ordering requirement. With
just one input section description, the linker has more freedom to order
sections if profitable. For example, LLD performs two ordering
heuristics as my previous reply mentions.

It'd be good if someone can measure the benefit. Personally I don't
think this kind of ordering has significant benefit. (For
arm/aarch64/powerpc there might be some size benefit due to fewer range
extension thunks)

>> regular/hot text seems like it should be a net win.
>
>>
>> That old commit doesn't reference efficiency -- it says there was some
>> problem with matching when they were separated out, but there were no
>> wildcard section names back then.
>
>It was about efficiency.
>
>-Andi
>

Arvind Sankar

unread,
Aug 4, 2020, 12:06:53 PM8/4/20
to Andi Kleen, Arvind Sankar, Kees Cook, Thomas Gleixner, Will Deacon, Nick Desaulniers, Jian Cai, Fāng-ruì Sòng, Luis Lozano, Manoj Gupta, sta...@vger.kernel.org, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, Michal Marek
On Mon, Aug 03, 2020 at 09:45:32PM -0700, Andi Kleen wrote:
> > Why is that? Both .text and .text.hot have alignment of 2^4 (default
> > function alignment on x86) by default, so it doesn't seem like it should
> > matter for packing density. Avoiding interspersing cold text among
>
> You may lose part of a cache line on each unit boundary. Linux has
> a lot of units, some of them small. All these bytes add up.

Separating out .text.unlikely, which isn't aligned, slightly _reduces_
this loss, but not by much -- just over 1K on a defconfig. More
importantly, it moves cold code out of line (~320k on a defconfig),
giving better code density for the hot code.

For .text and .text.hot, you lose the alignment padding on every
function boundary, not unit boundary, because of the 16-byte alignment.
Whether .text.hot and .text are arranged by translation unit or not
makes no difference.

With *(.text.hot) *(.text) you get HHTT, with *(.text.hot .text) you get
HTHT, but in both cases the individual chunks are already aligned to 16
bytes. If .text.hot _had_ different alignment requirements to .text, the
HHTT should actually give better packing in general, I think.
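
To make the two alternatives concrete (these are two separate ways of
writing the output section, shown side by side for illustration only):

/* Separate input section descriptions: all hot text first, then all regular
 * text (HHTT). */
.text : {
	*(.text.hot)
	*(.text)
}

/* A single input section description: hot and regular text interleaved in
 * input order, typically per object file (HTHT). */
.text : {
	*(.text.hot .text)
}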

>
> It's bad for TLB locality too. Sadly with all the fine grained protection
> changes the 2MB coverage is eroding anyways, but this makes it even worse.
>

Yes, that could be true for .text.hot, depending on whether the hot
functions are called from all over the kernel (in which case putting
them together ought to be better) or mostly from regular text within the
unit in which they appeared (in which case it would be better together
with that code).

Nick Desaulniers

unread,
Aug 7, 2020, 2:12:42 PM8/7/20
to Kees Cook, Arvind Sankar, Thomas Gleixner, Will Deacon, Ard Biesheuvel, Fangrui Song, Sedat Dilek, Catalin Marinas, Mark Rutland, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), clang-built-linux, linux-arch, linux-efi, Linux ARM, LKML
On Fri, Jul 31, 2020 at 4:08 PM Kees Cook <kees...@chromium.org> wrote:
>
> From: Arvind Sankar <nive...@alum.mit.edu>
>
> The BFD linker generates run-time relocations for z_input_len and
> z_output_len, even though they are absolute symbols.
>
> This is fixed for binutils-2.35 [1]. Work around this for earlier
> versions by defining two variables input_len and output_len in addition
> to the symbols, and use them via position-independent references.
>
> This eliminates the last two run-time relocations in the head code and
> allows us to drop the -z noreloc-overflow flag to the linker.
>
> Move the -pie and --no-dynamic-linker LDFLAGS to LDFLAGS_vmlinux instead
> of KBUILD_LDFLAGS. There shouldn't be anything else getting linked, but
> this is the more logical location for these flags, and modversions might
> call the linker if an EXPORT_SYMBOL is left over accidentally in one of
> the decompressors.
>
> [1] https://sourceware.org/bugzilla/show_bug.cgi?id=25754
>
> Tested-by: Nick Desaulniers <ndesau...@google.com>
> Reviewed-by: Kees Cook <kees...@chromium.org>
> Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
> Reviewed-by: Fangrui Song <mas...@google.com>
> Tested-by: Sedat Dilek <sedat...@gmail.com>
> Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
> Signed-off-by: Kees Cook <kees...@chromium.org>
> ---
> arch/x86/boot/compressed/Makefile | 12 ++----------
> arch/x86/boot/compressed/head_32.S | 17 ++++++++---------
> arch/x86/boot/compressed/head_64.S | 4 ++--
> arch/x86/boot/compressed/mkpiggy.c | 6 ++++++
> 4 files changed, 18 insertions(+), 21 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
> index 489fea16bcfb..7db0102a573d 100644
> --- a/arch/x86/boot/compressed/Makefile
> +++ b/arch/x86/boot/compressed/Makefile
> @@ -51,16 +51,8 @@ UBSAN_SANITIZE :=n
> KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
> # Compressed kernel should be built as PIE since it may be loaded at any
> # address by the bootloader.
> -ifeq ($(CONFIG_X86_32),y)
> -KBUILD_LDFLAGS += $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
> -else
> -# To build 64-bit compressed kernel as PIE, we disable relocation
> -# overflow check to avoid relocation overflow error with a new linker
> -# command-line option, -z noreloc-overflow.
> -KBUILD_LDFLAGS += $(shell $(LD) --help 2>&1 | grep -q "\-z noreloc-overflow" \
> - && echo "-z noreloc-overflow -pie --no-dynamic-linker")
> -endif
> -LDFLAGS_vmlinux := -T
> +LDFLAGS_vmlinux := $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)

Oh, do these still need ld-option? bfd and lld both support these
flags. (Though in their --help, they mention single hyphen and double
hyphen, respectively.) Also, if we don't build this as PIE because the
linker doesn't support the option, we probably want to fail the build?

> +LDFLAGS_vmlinux += -T
>
> hostprogs := mkpiggy
> HOST_EXTRACFLAGS += -I$(srctree)/tools/include
> diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
> index 8c1a4f5610f5..659fad53ca82 100644
> --- a/arch/x86/boot/compressed/head_32.S
> +++ b/arch/x86/boot/compressed/head_32.S
> @@ -178,18 +178,17 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
> /*
> * Do the extraction, and jump to the new kernel..
> */
> - /* push arguments for extract_kernel: */
> - pushl $z_output_len /* decompressed length, end of relocs */
> + /* push arguments for extract_kernel: */
>
> - pushl %ebp /* output address */
> -
> - pushl $z_input_len /* input_len */
> + pushl output_len@GOTOFF(%ebx) /* decompressed length, end of relocs */
> + pushl %ebp /* output address */
> + pushl input_len@GOTOFF(%ebx) /* input_len */
> leal input_data@GOTOFF(%ebx), %eax
> - pushl %eax /* input_data */
> + pushl %eax /* input_data */
> leal boot_heap@GOTOFF(%ebx), %eax
> - pushl %eax /* heap area */
> - pushl %esi /* real mode pointer */
> - call extract_kernel /* returns kernel location in %eax */
> + pushl %eax /* heap area */
> + pushl %esi /* real mode pointer */
> + call extract_kernel /* returns kernel location in %eax */
> addl $24, %esp
>
> /*
> diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> index 11429092c224..9e46729cf162 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S
> @@ -534,9 +534,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
> movq %rsi, %rdi /* real mode address */
> leaq boot_heap(%rip), %rsi /* malloc area for uncompression */
> leaq input_data(%rip), %rdx /* input_data */
> - movl $z_input_len, %ecx /* input_len */
> + movl input_len(%rip), %ecx /* input_len */
> movq %rbp, %r8 /* output target address */
> - movl $z_output_len, %r9d /* decompressed length, end of relocs */
> + movl output_len(%rip), %r9d /* decompressed length, end of relocs */
> call extract_kernel /* returns kernel location in %rax */
> popq %rsi
>
> diff --git a/arch/x86/boot/compressed/mkpiggy.c b/arch/x86/boot/compressed/mkpiggy.c
> index 7e01248765b2..52aa56cdbacc 100644
> --- a/arch/x86/boot/compressed/mkpiggy.c
> +++ b/arch/x86/boot/compressed/mkpiggy.c
> @@ -60,6 +60,12 @@ int main(int argc, char *argv[])
> printf(".incbin \"%s\"\n", argv[1]);
> printf("input_data_end:\n");
>
> + printf(".section \".rodata\",\"a\",@progbits\n");
> + printf(".globl input_len\n");
> + printf("input_len:\n\t.long %lu\n", ilen);
> + printf(".globl output_len\n");
> + printf("output_len:\n\t.long %lu\n", (unsigned long)olen);
> +
> retval = 0;
> bail:
> if (f)
> --
> 2.25.1
>



--
Thanks,
~Nick Desaulniers

Arvind Sankar

unread,
Aug 7, 2020, 4:21:00 PM8/7/20
to Nick Desaulniers, Kees Cook, Arvind Sankar, Thomas Gleixner, Will Deacon, Ard Biesheuvel, Fangrui Song, Sedat Dilek, Catalin Marinas, Mark Rutland, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), clang-built-linux, linux-arch, linux-efi, Linux ARM, LKML
The check for -pie doesn't; it's dropped in the next patch and -pie is
used unconditionally.

no-dynamic-linker still needs the check as it was only supported from
binutils-2.26.

Fangrui Song

unread,
Aug 17, 2020, 6:06:34 PM8/17/20
to Nick Desaulniers, Kees Cook, Thomas Gleixner, Will Deacon, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Arvind Sankar, Nathan Chancellor, Arnd Bergmann, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), clang-built-linux, linux-arch, linux-efi, Linux ARM, LKML
NOLOAD means SHT_NOBITS (usually SHF_ALLOC). .ARM.attributes is a
non-SHF_ALLOC section.

An explicit 0 (output section address) is good - GNU ld's internal
linker scripts (ld --verbose output) use 0 for such non-SHF_ALLOC sections.
Without the 0, the section may get a non-zero address, which is not
wrong - but probably does not look good. See https://reviews.llvm.org/D85867 for details.
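
A short sketch of the convention, in the style of the internal scripts
(section list abbreviated):

/* Non-SHF_ALLOC metadata sections get an explicit address of 0 so they do
 * not inherit the current location counter. */
.comment        0 : { *(.comment) }
.ARM.attributes 0 : { *(.ARM.attributes) }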


Reviewed-by: Fangrui Song <mas...@google.com>

Kees Cook

unread,
Aug 21, 2020, 1:49:05 PM8/21/20
to Arvind Sankar, Thomas Gleixner, Will Deacon, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
On Fri, Jul 31, 2020 at 10:12:48PM -0400, Arvind Sankar wrote:
Yup, I see this:

/usr/bin/ld.bfd: warning: orphan section `.got.plt' from `arch/x86/kernel/head_64.o' being placed in section `.got.plt'


--
Kees Cook

Kees Cook

unread,
Aug 21, 2020, 2:19:09 PM8/21/20
to Arvind Sankar, Thomas Gleixner, Will Deacon, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
On Fri, Jul 31, 2020 at 09:47:55PM -0400, Arvind Sankar wrote:
That's a good point, yes. I will adjust this.

> The only section for which there might be a point is .got.plt, which is
> non-empty on 32-bit, and only if it is first moved to the end. That
> saves a few bytes.

Yeah, also this. :) I will try it.

--
Kees Cook

Kees Cook

unread,
Aug 21, 2020, 2:24:28 PM8/21/20
to Arvind Sankar, Thomas Gleixner, Will Deacon, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nick Desaulniers, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org
On Sat, Aug 01, 2020 at 01:12:25PM -0400, Arvind Sankar wrote:
> Actually, moving it to the end also requires marking it INFO or
> stripping it out when creating the bzImage. Otherwise we get back to
> that old problem of materializing .bss/.pgtable in the bzImage.

Yeah -- I wonder what the best way is to make sure we avoid the
.bss appearing again...

--
Kees Cook

Kees Cook

unread,
Aug 21, 2020, 3:18:13 PM8/21/20
to Arvind Sankar, Andi Kleen, Thomas Gleixner, Will Deacon, Nick Desaulniers, Jian Cai, Fāng-ruì Sòng, Luis Lozano, Manoj Gupta, sta...@vger.kernel.org, Catalin Marinas, Mark Rutland, Ard Biesheuvel, Peter Collingbourne, James Morse, Borislav Petkov, Ingo Molnar, Russell King, Masahiro Yamada, Nathan Chancellor, Arnd Bergmann, x...@kernel.org, clang-bu...@googlegroups.com, linux...@vger.kernel.org, linu...@vger.kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, Michal Marek
On Tue, Aug 04, 2020 at 12:06:49PM -0400, Arvind Sankar wrote:
> On Mon, Aug 03, 2020 at 09:45:32PM -0700, Andi Kleen wrote:
> > > Why is that? Both .text and .text.hot have alignment of 2^4 (default
> > > function alignment on x86) by default, so it doesn't seem like it should
> > > matter for packing density. Avoiding interspersing cold text among
> >
> > You may lose part of a cache line on each unit boundary. Linux has
> > a lot of units, some of them small. All these bytes add up.
>
> Separating out .text.unlikely, which isn't aligned, slightly _reduces_
> this loss, but not by much -- just over 1K on a defconfig. More
> importantly, it moves cold code out of line (~320k on a defconfig),
> giving better code density for the hot code.
>
> For .text and .text.hot, you lose the alignment padding on every
> function boundary, not unit boundary, because of the 16-byte alignment.
> Whether .text.hot and .text are arranged by translation unit or not
> makes no difference.
>
> With *(.text.hot) *(.text) you get HHTT, with *(.text.hot .text) you get
> HTHT, but in both cases the individual chunks are already aligned to 16
> bytes. If .text.hot _had_ different alignment requirements to .text, the
> HHTT should actually give better packing in general, I think.

Okay, so at the end of the conversation, I think it looks like this
patch is correct: it collects the hot, unlikely, etc. sections into their own
areas (e.g. HHTTUU is more correct than HTUHTU), so this patch stands
as-is.

--
Kees Cook