[PATCH v3 0/7] x86/boot: Remove runtime relocations from compressed kernel

14 views
Skip to first unread message

Arvind Sankar

unread,
Jun 29, 2020, 10:09:31 AM6/29/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
The compressed kernel currently contains bogus runtime relocations in
the startup code in head_{32,64}.S, which are generated by the linker,
but must not actually be processed at runtime.

This generates warnings when linking with the BFD linker, and errors
with LLD, which defaults to erroring on runtime relocations in read-only
sections. It also requires the -z noreloc-overflow hack for the 64-bit
kernel, which prevents us from linking it as -pie on an older BFD linker
(<= 2.26) or on LLD, because the locations that are to be apparently
relocated are only 32-bits in size and so cannot really have
R_X86_64_RELATIVE relocations.

This series aims to get rid of these relocations. I've build- and
boot-tested with combinations of clang/gcc-10 with lld/bfd-2.34, and
gcc-4.8.5 with bfd-2.23, skipping clang on 32-bit because it currently
has other issues [0].

The first three patches by Ard remove indirection via the GOT from the
compressed kernel code.

The next patch is an independent fix for LLD, to avoid an orphan
section in arch/x86/boot/setup.elf.

The fifth patch gets rid of almost all the relocations. It uses
standard PIC addressing technique for 32-bit, i.e. loading a register
with the address of _GLOBAL_OFFSET_TABLE_ and then using GOTOFF
references to access variables. For 64-bit, there is 32-bit code that
cannot use RIP-relative addressing, and also cannot use the 32-bit
method, since GOTOFF references are 64-bit only. This is instead handled
using a macro to replace a reference like gdt with (gdt-startup_32)
instead. The assembler will generate a PC32 relocation entry, with
addend set to (.-startup_32), and these will be replaced with constants
at link time. This works as long as all the code using such references
lives in the same section as startup_32, i.e. in .head.text.

The sixth patch addresses a remaining issue with the BFD linker, which
generates runtime relocations for absolute symbols. We use z_input_len
and z_output_len, defined in the generated piggy.S file, as symbols
whose absolute "addresses" are actually the size of the compressed
payload and the size of the decompressed kernel image respectively. LLD
does not generate relocations for these two symbols, but the BFD linker
does, prior to the upcoming 2.35. To get around this, piggy.S is
extended to also define two u32 variables (in .rodata) with the lengths,
and the head code is modified to use those instead of the symbol
addresses.

An alternative way to handle z_input_len/z_output_len would be to just
include piggy.S in head_{32,64}.S instead of as a separate object file,
since the GNU assembler doesn't generate relocations for symbols set to
constants.

The last patch adds a check in the linker script to ensure that no
runtime relocations get reintroduced.

[0] https://lore.kernel.org/lkml/20200504230309.237...@google.com/

Changes from v2:
- Incorporate Ard's patches for eliminating GOT references into this
series
- Rebase on v5.8-rc3

v2: https://lore.kernel.org/lkml/20200525225918.1...@alum.mit.edu/

Changes from v1:
- Add .text.* to setup.ld instead of just .text.startup
- Rename the la() macro introduced in the second patch for 64-bit to
rva(), and rework the explanatory comment.
- In the last patch, check both .rel.dyn and .rela.dyn, instead of just
one per arch.

Ard Biesheuvel (3):
x86/boot/compressed: Move .got.plt entries out of the .got section
x86/boot/compressed: Force hidden visibility for all symbol references
x86/boot/compressed: Get rid of GOT fixup code

Arvind Sankar (4):
x86/boot: Add .text.* to setup.ld
x86/boot: Remove run-time relocations from .head.text code
x86/boot: Remove runtime relocations from head_{32,64}.S
x86/boot: Check that there are no runtime relocations

arch/x86/boot/compressed/Makefile | 37 +-----
arch/x86/boot/compressed/head_32.S | 99 +++++----------
arch/x86/boot/compressed/head_64.S | 165 ++++++++++---------------
arch/x86/boot/compressed/hidden.h | 19 +++
arch/x86/boot/compressed/mkpiggy.c | 6 +
arch/x86/boot/compressed/vmlinux.lds.S | 24 +++-
arch/x86/boot/setup.ld | 2 +-
7 files changed, 151 insertions(+), 201 deletions(-)
create mode 100644 arch/x86/boot/compressed/hidden.h


base-commit: 9ebcfadb0610322ac537dd7aa5d9cbc2b2894c68
--
2.26.2

Arvind Sankar

unread,
Jun 29, 2020, 10:09:32 AM6/29/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

The .got.plt section contains the part of the GOT which is used by PLT
entries, and which gets updated lazily by the dynamic loader when
function calls are dispatched through those PLT entries.

On fully linked binaries such as the kernel proper or the decompressor,
this never happens, and so in practice, the .got.plt section consists
only of the first 3 magic entries that are meant to point at the _DYNAMIC
section and at the fixup routine in the loader. However, since we don't
use a dynamic loader, those entries are never populated or used.

This means that treating those entries like ordinary GOT entries, and
updating their values based on the actual placement of the executable in
memory is completely pointless, and we can just ignore the .got.plt
section entirely, provided that it has no additional entries beyond
the first 3 ones.

So add an assertion in the linker script to ensure that this assumption
holds, and move the contents out of the [_got, _egot) memory range that
is modified by the GOT fixup routines.

While at it, drop the KEEP(), since it has no effect on the contents
of output sections that are created by the linker itself.

Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
From: Ard Biesheuvel <ar...@kernel.org>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
---
arch/x86/boot/compressed/vmlinux.lds.S | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 8f1025d1f681..b17d218ccdf9 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -44,10 +44,13 @@ SECTIONS
}
.got : {
_got = .;
- KEEP(*(.got.plt))
KEEP(*(.got))
_egot = .;
}
+ .got.plt : {
+ *(.got.plt)
+ }
+
.data : {
_data = . ;
*(.data)
@@ -77,3 +80,9 @@ SECTIONS

DISCARDS
}
+
+#ifdef CONFIG_X86_64
+ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
+#else
+ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
+#endif
--
2.26.2

Arvind Sankar

unread,
Jun 29, 2020, 10:09:33 AM6/29/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

Eliminate all GOT entries in the decompressor binary, by forcing hidden
visibility for all symbol references, which informs the compiler that
such references will be resolved at link time without the need for
allocating GOT entries.

To ensure that no GOT entries will creep back in, add an assertion to
the decompressor linker script that will fire if the .got section has
a non-zero size.

[Arvind: fixup -include hidden.h to -include $(srctree)/$(src)/hidden.h]

Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
From: Ard Biesheuvel <ar...@kernel.org>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
---
arch/x86/boot/compressed/Makefile | 1 +
arch/x86/boot/compressed/hidden.h | 19 +++++++++++++++++++
arch/x86/boot/compressed/vmlinux.lds.S | 1 +
3 files changed, 21 insertions(+)
create mode 100644 arch/x86/boot/compressed/hidden.h

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 7619742f91c9..b01c8aed0f23 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -42,6 +42,7 @@ KBUILD_CFLAGS += $(call cc-disable-warning, gnu)
KBUILD_CFLAGS += -Wno-pointer-sign
KBUILD_CFLAGS += $(call cc-option,-fmacro-prefix-map=$(srctree)/=)
KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
+KBUILD_CFLAGS += -include $(srctree)/$(src)/hidden.h

KBUILD_AFLAGS := $(KBUILD_CFLAGS) -D__ASSEMBLY__
GCOV_PROFILE := n
diff --git a/arch/x86/boot/compressed/hidden.h b/arch/x86/boot/compressed/hidden.h
new file mode 100644
index 000000000000..49a17b6b5962
--- /dev/null
+++ b/arch/x86/boot/compressed/hidden.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * When building position independent code with GCC using the -fPIC option,
+ * (or even the -fPIE one on older versions), it will assume that we are
+ * building a dynamic object (either a shared library or an executable) that
+ * may have symbol references that can only be resolved at load time. For a
+ * variety of reasons (ELF symbol preemption, the CoW footprint of the section
+ * that is modified by the loader), this results in all references to symbols
+ * with external linkage to go via entries in the Global Offset Table (GOT),
+ * which carries absolute addresses which need to be fixed up when the
+ * executable image is loaded at an offset which is different from its link
+ * time offset.
+ *
+ * Fortunately, there is a way to inform the compiler that such symbol
+ * references will be satisfied at link time rather than at load time, by
+ * giving them 'hidden' visibility.
+ */
+
+#pragma GCC visibility push(hidden)
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index b17d218ccdf9..4bcc943842ab 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -81,6 +81,7 @@ SECTIONS
DISCARDS
}

+ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")
#ifdef CONFIG_X86_64
ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
#else
--
2.26.2

Arvind Sankar

unread,
Jun 29, 2020, 10:09:34 AM6/29/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

In a previous patch, we have eliminated GOT entries from the decompressor
binary and added an assertion that the .got section is empty. This means
that the GOT fixup routines that exist in both the 32-bit and 64-bit
startup routines have become dead code, and can be removed.

While at it, drop the KEEP() from the linker script, as it has no effect
on the contents of output sections that are created by the linker itself.

Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
From: Ard Biesheuvel <ar...@kernel.org>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
---
arch/x86/boot/compressed/head_32.S | 24 ++---------
arch/x86/boot/compressed/head_64.S | 57 --------------------------
arch/x86/boot/compressed/vmlinux.lds.S | 4 +-
3 files changed, 5 insertions(+), 80 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 03557f2174bf..39f0bb43218f 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -49,16 +49,13 @@
* Position Independent Executable (PIE) so that linker won't optimize
* R_386_GOT32X relocation to its fixed symbol address. Older
* linkers generate R_386_32 relocations against locally defined symbols,
- * _bss, _ebss, _got, _egot and _end, in PIE. It isn't wrong, just less
- * optimal than R_386_RELATIVE. But the x86 kernel fails to properly handle
- * R_386_32 relocations when relocating the kernel. To generate
- * R_386_RELATIVE relocations, we mark _bss, _ebss, _got, _egot and _end as
- * hidden:
+ * _bss, _ebss and _end, in PIE. It isn't wrong, just less optimal than
+ * R_386_RELATIVE. But the x86 kernel fails to properly handle R_386_32
+ * relocations when relocating the kernel. To generate R_386_RELATIVE
+ * relocations, we mark _bss, _ebss and _end as hidden:
*/
.hidden _bss
.hidden _ebss
- .hidden _got
- .hidden _egot
.hidden _end

__HEAD
@@ -192,19 +189,6 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
shrl $2, %ecx
rep stosl

-/*
- * Adjust our own GOT
- */
- leal _got(%ebx), %edx
- leal _egot(%ebx), %ecx
-1:
- cmpl %ecx, %edx
- jae 2f
- addl %ebx, (%edx)
- addl $4, %edx
- jmp 1b
-2:
-
/*
* Do the extraction, and jump to the new kernel..
*/
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 97d37f0a34f5..bf1ab30acc5b 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -40,8 +40,6 @@
*/
.hidden _bss
.hidden _ebss
- .hidden _got
- .hidden _egot
.hidden _end

__HEAD
@@ -353,25 +351,6 @@ SYM_CODE_START(startup_64)
/* Set up the stack */
leaq boot_stack_end(%rbx), %rsp

- /*
- * paging_prepare() and cleanup_trampoline() below can have GOT
- * references. Adjust the table with address we are running at.
- *
- * Zero RAX for adjust_got: the GOT was not adjusted before;
- * there's no adjustment to undo.
- */
- xorq %rax, %rax
-
- /*
- * Calculate the address the binary is loaded at and use it as
- * a GOT adjustment.
- */
- call 1f
-1: popq %rdi
- subq $1b, %rdi
-
- call .Ladjust_got
-
/*
* At this point we are in long mode with 4-level paging enabled,
* but we might want to enable 5-level paging or vice versa.
@@ -464,21 +443,6 @@ trampoline_return:
pushq $0
popfq

- /*
- * Previously we've adjusted the GOT with address the binary was
- * loaded at. Now we need to re-adjust for relocation address.
- *
- * Calculate the address the binary is loaded at, so that we can
- * undo the previous GOT adjustment.
- */
- call 1f
-1: popq %rax
- subq $1b, %rax
-
- /* The new adjustment is the relocation address */
- movq %rbx, %rdi
- call .Ladjust_got
-
/*
* Copy the compressed kernel to the end of our buffer
* where decompression in place becomes safe.
@@ -556,27 +520,6 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
jmp *%rax
SYM_FUNC_END(.Lrelocated)

-/*
- * Adjust the global offset table
- *
- * RAX is the previous adjustment of the table to undo (use 0 if it's the
- * first time we touch GOT).
- * RDI is the new adjustment to apply.
- */
-.Ladjust_got:
- /* Walk through the GOT adding the address to the entries */
- leaq _got(%rip), %rdx
- leaq _egot(%rip), %rcx
-1:
- cmpq %rcx, %rdx
- jae 2f
- subq %rax, (%rdx) /* Undo previous adjustment */
- addq %rdi, (%rdx) /* Apply the new adjustment */
- addq $8, %rdx
- jmp 1b
-2:
- ret
-
.code32
/*
* This is the 32-bit trampoline that will be copied over to low memory.
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 4bcc943842ab..a4a4a59a2628 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -43,9 +43,7 @@ SECTIONS
_erodata = . ;
}
.got : {
- _got = .;
- KEEP(*(.got))
- _egot = .;
+ *(.got)
}
.got.plt : {
*(.got.plt)
--
2.26.2

Arvind Sankar

unread,
Jun 29, 2020, 10:09:35 AM6/29/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
gcc puts the main function into .text.startup when compiled with -Os (or
-O2). This results in arch/x86/boot/main.c having a .text.startup
section which is currently not included explicitly in the linker script
setup.ld in the same directory.

The BFD linker places this orphan section immediately after .text, so
this still works. However, LLD git, since [1], is choosing to place it
immediately after the .bstext section instead (this is the first code
section). This plays havoc with the section layout that setup.elf
requires to create the setup header, for eg on 64-bit:

LD arch/x86/boot/setup.elf
ld.lld: error: section .text.startup file range overlaps with .header
>>> .text.startup range is [0x200040, 0x2001FE]
>>> .header range is [0x2001EF, 0x20026B]

ld.lld: error: section .header file range overlaps with .bsdata
>>> .header range is [0x2001EF, 0x20026B]
>>> .bsdata range is [0x2001FF, 0x200398]

ld.lld: error: section .bsdata file range overlaps with .entrytext
>>> .bsdata range is [0x2001FF, 0x200398]
>>> .entrytext range is [0x20026C, 0x2002D3]

ld.lld: error: section .text.startup virtual address range overlaps
with .header
>>> .text.startup range is [0x40, 0x1FE]
>>> .header range is [0x1EF, 0x26B]

ld.lld: error: section .header virtual address range overlaps with
.bsdata
>>> .header range is [0x1EF, 0x26B]
>>> .bsdata range is [0x1FF, 0x398]

ld.lld: error: section .bsdata virtual address range overlaps with
.entrytext
>>> .bsdata range is [0x1FF, 0x398]
>>> .entrytext range is [0x26C, 0x2D3]

ld.lld: error: section .text.startup load address range overlaps with
.header
>>> .text.startup range is [0x40, 0x1FE]
>>> .header range is [0x1EF, 0x26B]

ld.lld: error: section .header load address range overlaps with
.bsdata
>>> .header range is [0x1EF, 0x26B]
>>> .bsdata range is [0x1FF, 0x398]

ld.lld: error: section .bsdata load address range overlaps with
.entrytext
>>> .bsdata range is [0x1FF, 0x398]
>>> .entrytext range is [0x26C, 0x2D3]

Add .text.* to the .text output section to fix this, and also prevent
any future surprises if the compiler decides to create other such
sections.

[1] https://reviews.llvm.org/D75225

Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
Reviewed-by: Fangrui Song <mas...@google.com>
---
arch/x86/boot/setup.ld | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld
index 24c95522f231..49546c247ae2 100644
--- a/arch/x86/boot/setup.ld
+++ b/arch/x86/boot/setup.ld
@@ -20,7 +20,7 @@ SECTIONS
.initdata : { *(.initdata) }
__end_init = .;

- .text : { *(.text) }
+ .text : { *(.text .text.*) }
.text32 : { *(.text32) }

. = ALIGN(16);
--
2.26.2

Arvind Sankar

unread,
Jun 29, 2020, 10:09:36 AM6/29/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.

On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output

With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object

On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.

Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.

Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
---
arch/x86/boot/compressed/head_32.S | 64 +++++++-----------
arch/x86/boot/compressed/head_64.S | 104 ++++++++++++++++++-----------
2 files changed, 90 insertions(+), 78 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 39f0bb43218f..8c1a4f5610f5 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -33,26 +33,10 @@
#include <asm/bootparam.h>

/*
- * The 32-bit x86 assembler in binutils 2.26 will generate R_386_GOT32X
- * relocation to get the symbol address in PIC. When the compressed x86
- * kernel isn't built as PIC, the linker optimizes R_386_GOT32X
- * relocations to their fixed symbol addresses. However, when the
- * compressed x86 kernel is loaded at a different address, it leads
- * to the following load failure:
- *
- * Failed to allocate space for phdrs
- *
- * during the decompression stage.
- *
- * If the compressed x86 kernel is relocatable at run-time, it should be
- * compiled with -fPIE, instead of -fPIC, if possible and should be built as
- * Position Independent Executable (PIE) so that linker won't optimize
- * R_386_GOT32X relocation to its fixed symbol address. Older
- * linkers generate R_386_32 relocations against locally defined symbols,
- * _bss, _ebss and _end, in PIE. It isn't wrong, just less optimal than
- * R_386_RELATIVE. But the x86 kernel fails to properly handle R_386_32
- * relocations when relocating the kernel. To generate R_386_RELATIVE
- * relocations, we mark _bss, _ebss and _end as hidden:
+ * These symbols needed to be marked as .hidden to prevent the BFD linker from
+ * generating R_386_32 (rather than R_386_RELATIVE) relocations for them when
+ * the 32-bit compressed kernel is linked as PIE. This is no longer necessary,
+ * but it doesn't hurt to keep them .hidden.
*/
.hidden _bss
.hidden _ebss
@@ -74,10 +58,10 @@ SYM_FUNC_START(startup_32)
leal (BP_scratch+4)(%esi), %esp
call 1f
1: popl %edx
- subl $1b, %edx
+ addl $_GLOBAL_OFFSET_TABLE_+(.-1b), %edx

/* Load new GDT */
- leal gdt(%edx), %eax
+ leal gdt@GOTOFF(%edx), %eax
movl %eax, 2(%eax)
lgdt (%eax)

@@ -90,14 +74,16 @@ SYM_FUNC_START(startup_32)
movl %eax, %ss

/*
- * %edx contains the address we are loaded at by the boot loader and %ebx
- * contains the address where we should move the kernel image temporarily
- * for safe in-place decompression. %ebp contains the address that the kernel
- * will be decompressed to.
+ * %edx contains the address we are loaded at by the boot loader (plus the
+ * offset to the GOT). The below code calculates %ebx to be the address where
+ * we should move the kernel image temporarily for safe in-place decompression
+ * (again, plus the offset to the GOT).
+ *
+ * %ebp is calculated to be the address that the kernel will be decompressed to.
*/

#ifdef CONFIG_RELOCATABLE
- movl %edx, %ebx
+ leal startup_32@GOTOFF(%edx), %ebx

#ifdef CONFIG_EFI_STUB
/*
@@ -108,7 +94,7 @@ SYM_FUNC_START(startup_32)
* image_offset = startup_32 - image_base
* Otherwise image_offset will be zero and has no effect on the calculations.
*/
- subl image_offset(%edx), %ebx
+ subl image_offset@GOTOFF(%edx), %ebx
#endif

movl BP_kernel_alignment(%esi), %eax
@@ -125,10 +111,10 @@ SYM_FUNC_START(startup_32)
movl %ebx, %ebp // Save the output address for later
/* Target address to relocate to for decompression */
addl BP_init_size(%esi), %ebx
- subl $_end, %ebx
+ subl $_end@GOTOFF, %ebx

/* Set up the stack */
- leal boot_stack_end(%ebx), %esp
+ leal boot_stack_end@GOTOFF(%ebx), %esp

/* Zero EFLAGS */
pushl $0
@@ -139,8 +125,8 @@ SYM_FUNC_START(startup_32)
* where decompression in place becomes safe.
*/
pushl %esi
- leal (_bss-4)(%edx), %esi
- leal (_bss-4)(%ebx), %edi
+ leal (_bss@GOTOFF-4)(%edx), %esi
+ leal (_bss@GOTOFF-4)(%ebx), %edi
movl $(_bss - startup_32), %ecx
shrl $2, %ecx
std
@@ -153,14 +139,14 @@ SYM_FUNC_START(startup_32)
* during extract_kernel below. To avoid any issues, repoint the GDTR
* to the new copy of the GDT.
*/
- leal gdt(%ebx), %eax
+ leal gdt@GOTOFF(%ebx), %eax
movl %eax, 2(%eax)
lgdt (%eax)

/*
* Jump to the relocated address.
*/
- leal .Lrelocated(%ebx), %eax
+ leal .Lrelocated@GOTOFF(%ebx), %eax
jmp *%eax
SYM_FUNC_END(startup_32)

@@ -170,7 +156,7 @@ SYM_FUNC_START_ALIAS(efi_stub_entry)
add $0x4, %esp
movl 8(%esp), %esi /* save boot_params pointer */
call efi_main
- leal startup_32(%eax), %eax
+ /* efi_main returns the possibly relocated address of startup_32 */
jmp *%eax
SYM_FUNC_END(efi32_stub_entry)
SYM_FUNC_END_ALIAS(efi_stub_entry)
@@ -183,8 +169,8 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
* Clear BSS (stack is currently empty)
*/
xorl %eax, %eax
- leal _bss(%ebx), %edi
- leal _ebss(%ebx), %ecx
+ leal _bss@GOTOFF(%ebx), %edi
+ leal _ebss@GOTOFF(%ebx), %ecx
subl %edi, %ecx
shrl $2, %ecx
rep stosl
@@ -198,9 +184,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
pushl %ebp /* output address */

pushl $z_input_len /* input_len */
- leal input_data(%ebx), %eax
+ leal input_data@GOTOFF(%ebx), %eax
pushl %eax /* input_data */
- leal boot_heap(%ebx), %eax
+ leal boot_heap@GOTOFF(%ebx), %eax
pushl %eax /* heap area */
pushl %esi /* real mode pointer */
call extract_kernel /* returns kernel location in %eax */
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index bf1ab30acc5b..11429092c224 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -43,6 +43,32 @@
.hidden _end

__HEAD
+
+/*
+ * This macro gives the relative virtual address of X, i.e. the offset of X
+ * from startup_32. This is the same as the link-time virtual address of X,
+ * since startup_32 is at 0, but defining it this way tells the
+ * assembler/linker that we do not want the actual run-time address of X. This
+ * prevents the linker from trying to create unwanted run-time relocation
+ * entries for the reference when the compressed kernel is linked as PIE.
+ *
+ * A reference X(%reg) will result in the link-time VA of X being stored with
+ * the instruction, and a run-time R_X86_64_RELATIVE relocation entry that
+ * adds the 64-bit base address where the kernel is loaded.
+ *
+ * Replacing it with (X-startup_32)(%reg) results in the offset being stored,
+ * and no run-time relocation.
+ *
+ * The macro should be used as a displacement with a base register containing
+ * the run-time address of startup_32 [i.e. rva(X)(%reg)], or as an immediate
+ * [$ rva(X)].
+ *
+ * This macro can only be used from within the .head.text section, since the
+ * expression requires startup_32 to be in the same section as the code being
+ * assembled.
+ */
+#define rva(X) ((X) - startup_32)
+
.code32
SYM_FUNC_START(startup_32)
/*
@@ -65,10 +91,10 @@ SYM_FUNC_START(startup_32)
leal (BP_scratch+4)(%esi), %esp
call 1f
1: popl %ebp
- subl $1b, %ebp
+ subl $ rva(1b), %ebp

/* Load new GDT with the 64bit segments using 32bit descriptor */
- leal gdt(%ebp), %eax
+ leal rva(gdt)(%ebp), %eax
movl %eax, 2(%eax)
lgdt (%eax)

@@ -81,7 +107,7 @@ SYM_FUNC_START(startup_32)
movl %eax, %ss

/* setup a stack and make sure cpu supports long mode. */
- leal boot_stack_end(%ebp), %esp
+ leal rva(boot_stack_end)(%ebp), %esp

call verify_cpu
testl %eax, %eax
@@ -108,7 +134,7 @@ SYM_FUNC_START(startup_32)
* image_offset = startup_32 - image_base
* Otherwise image_offset will be zero and has no effect on the calculations.
*/
- subl image_offset(%ebp), %ebx
+ subl rva(image_offset)(%ebp), %ebx
#endif

movl BP_kernel_alignment(%esi), %eax
@@ -124,7 +150,7 @@ SYM_FUNC_START(startup_32)

/* Target address to relocate to for decompression */
addl BP_init_size(%esi), %ebx
- subl $_end, %ebx
+ subl $ rva(_end), %ebx

/*
* Prepare for entering 64 bit mode
@@ -152,19 +178,19 @@ SYM_FUNC_START(startup_32)
1:

/* Initialize Page tables to 0 */
- leal pgtable(%ebx), %edi
+ leal rva(pgtable)(%ebx), %edi
xorl %eax, %eax
movl $(BOOT_INIT_PGT_SIZE/4), %ecx
rep stosl

/* Build Level 4 */
- leal pgtable + 0(%ebx), %edi
+ leal rva(pgtable + 0)(%ebx), %edi
leal 0x1007 (%edi), %eax
movl %eax, 0(%edi)
addl %edx, 4(%edi)

/* Build Level 3 */
- leal pgtable + 0x1000(%ebx), %edi
+ leal rva(pgtable + 0x1000)(%ebx), %edi
leal 0x1007(%edi), %eax
movl $4, %ecx
1: movl %eax, 0x00(%edi)
@@ -175,7 +201,7 @@ SYM_FUNC_START(startup_32)
jnz 1b

/* Build Level 2 */
- leal pgtable + 0x2000(%ebx), %edi
+ leal rva(pgtable + 0x2000)(%ebx), %edi
movl $0x00000183, %eax
movl $2048, %ecx
1: movl %eax, 0(%edi)
@@ -186,7 +212,7 @@ SYM_FUNC_START(startup_32)
jnz 1b

/* Enable the boot page tables */
- leal pgtable(%ebx), %eax
+ leal rva(pgtable)(%ebx), %eax
movl %eax, %cr3

/* Enable Long mode in EFER (Extended Feature Enable Register) */
@@ -211,14 +237,14 @@ SYM_FUNC_START(startup_32)
* We place all of the values on our mini stack so lret can
* used to perform that far jump.
*/
- leal startup_64(%ebp), %eax
+ leal rva(startup_64)(%ebp), %eax
#ifdef CONFIG_EFI_MIXED
- movl efi32_boot_args(%ebp), %edi
+ movl rva(efi32_boot_args)(%ebp), %edi
cmp $0, %edi
jz 1f
- leal efi64_stub_entry(%ebp), %eax
- movl efi32_boot_args+4(%ebp), %esi
- movl efi32_boot_args+8(%ebp), %edx // saved bootparams pointer
+ leal rva(efi64_stub_entry)(%ebp), %eax
+ movl rva(efi32_boot_args+4)(%ebp), %esi
+ movl rva(efi32_boot_args+8)(%ebp), %edx // saved bootparams pointer
cmpl $0, %edx
jnz 1f
/*
@@ -229,7 +255,7 @@ SYM_FUNC_START(startup_32)
* the correct stack alignment for entry.
*/
subl $40, %esp
- leal efi_pe_entry(%ebp), %eax
+ leal rva(efi_pe_entry)(%ebp), %eax
movl %edi, %ecx // MS calling convention
movl %esi, %edx
1:
@@ -255,18 +281,18 @@ SYM_FUNC_START(efi32_stub_entry)

call 1f
1: pop %ebp
- subl $1b, %ebp
+ subl $ rva(1b), %ebp

- movl %esi, efi32_boot_args+8(%ebp)
+ movl %esi, rva(efi32_boot_args+8)(%ebp)
SYM_INNER_LABEL(efi32_pe_stub_entry, SYM_L_LOCAL)
- movl %ecx, efi32_boot_args(%ebp)
- movl %edx, efi32_boot_args+4(%ebp)
- movb $0, efi_is64(%ebp)
+ movl %ecx, rva(efi32_boot_args)(%ebp)
+ movl %edx, rva(efi32_boot_args+4)(%ebp)
+ movb $0, rva(efi_is64)(%ebp)

/* Save firmware GDTR and code/data selectors */
- sgdtl efi32_boot_gdt(%ebp)
- movw %cs, efi32_boot_cs(%ebp)
- movw %ds, efi32_boot_ds(%ebp)
+ sgdtl rva(efi32_boot_gdt)(%ebp)
+ movw %cs, rva(efi32_boot_cs)(%ebp)
+ movw %ds, rva(efi32_boot_ds)(%ebp)

/* Disable paging */
movl %cr0, %eax
@@ -345,11 +371,11 @@ SYM_CODE_START(startup_64)

/* Target address to relocate to for decompression */
movl BP_init_size(%rsi), %ebx
- subl $_end, %ebx
+ subl $ rva(_end), %ebx
addq %rbp, %rbx

/* Set up the stack */
- leaq boot_stack_end(%rbx), %rsp
+ leaq rva(boot_stack_end)(%rbx), %rsp

/*
* At this point we are in long mode with 4-level paging enabled,
@@ -423,7 +449,7 @@ SYM_CODE_START(startup_64)
lretq
trampoline_return:
/* Restore the stack, the 32-bit trampoline uses its own stack */
- leaq boot_stack_end(%rbx), %rsp
+ leaq rva(boot_stack_end)(%rbx), %rsp

/*
* cleanup_trampoline() would restore trampoline memory.
@@ -435,7 +461,7 @@ trampoline_return:
* this function call.
*/
pushq %rsi
- leaq top_pgtable(%rbx), %rdi
+ leaq rva(top_pgtable)(%rbx), %rdi
call cleanup_trampoline
popq %rsi

@@ -449,9 +475,9 @@ trampoline_return:
*/
pushq %rsi
leaq (_bss-8)(%rip), %rsi
- leaq (_bss-8)(%rbx), %rdi
- movq $_bss /* - $startup_32 */, %rcx
- shrq $3, %rcx
+ leaq rva(_bss-8)(%rbx), %rdi
+ movl $(_bss - startup_32), %ecx
+ shrl $3, %ecx
std
rep movsq
cld
@@ -462,15 +488,15 @@ trampoline_return:
* during extract_kernel below. To avoid any issues, repoint the GDTR
* to the new copy of the GDT.
*/
- leaq gdt64(%rbx), %rax
- leaq gdt(%rbx), %rdx
+ leaq rva(gdt64)(%rbx), %rax
+ leaq rva(gdt)(%rbx), %rdx
movq %rdx, 2(%rax)
lgdt (%rax)

/*
* Jump to the relocated address.
*/
- leaq .Lrelocated(%rbx), %rax
+ leaq rva(.Lrelocated)(%rbx), %rax
jmp *%rax
SYM_CODE_END(startup_64)

@@ -482,7 +508,7 @@ SYM_FUNC_START_ALIAS(efi_stub_entry)
movq %rdx, %rbx /* save boot_params pointer */
call efi_main
movq %rbx,%rsi
- leaq startup_64(%rax), %rax
+ leaq rva(startup_64)(%rax), %rax
jmp *%rax
SYM_FUNC_END(efi64_stub_entry)
SYM_FUNC_END_ALIAS(efi_stub_entry)
@@ -645,7 +671,7 @@ SYM_DATA(efi_is64, .byte 1)
#define BS32_handle_protocol 88 // offsetof(efi_boot_services_32_t, handle_protocol)
#define LI32_image_base 32 // offsetof(efi_loaded_image_32_t, image_base)

- .text
+ __HEAD
.code32
SYM_FUNC_START(efi32_pe_entry)
/*
@@ -667,12 +693,12 @@ SYM_FUNC_START(efi32_pe_entry)

call 1f
1: pop %ebx
- subl $1b, %ebx
+ subl $ rva(1b), %ebx

/* Get the loaded image protocol pointer from the image handle */
leal -4(%ebp), %eax
pushl %eax // &loaded_image
- leal loaded_image_proto(%ebx), %eax
+ leal rva(loaded_image_proto)(%ebx), %eax
pushl %eax // pass the GUID address
pushl 8(%ebp) // pass the image handle

@@ -707,7 +733,7 @@ SYM_FUNC_START(efi32_pe_entry)
* use it before we get to the 64-bit efi_pe_entry() in C code.
*/
subl %esi, %ebx
- movl %ebx, image_offset(%ebp) // save image_offset
+ movl %ebx, rva(image_offset)(%ebp) // save image_offset
jmp efi32_pe_stub_entry

2: popl %edi // restore callee-save registers
--
2.26.2

Arvind Sankar

unread,
Jun 29, 2020, 10:09:37 AM6/29/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
The BFD linker generates runtime relocations for z_input_len and
z_output_len, even though they are absolute symbols.

This is fixed for binutils-2.35 [1]. Work around this for earlier
versions by defining two variables input_len and output_len in addition
to the symbols, and use them via position-independent references.

This eliminates the last two runtime relocations in the head code and
allows us to drop the -z noreloc-overflow flag to the linker.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=25754

Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
---
arch/x86/boot/compressed/Makefile | 8 --------
arch/x86/boot/compressed/head_32.S | 17 ++++++++---------
arch/x86/boot/compressed/head_64.S | 4 ++--
arch/x86/boot/compressed/mkpiggy.c | 6 ++++++
4 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index b01c8aed0f23..242b291037e8 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -51,15 +51,7 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-ifeq ($(CONFIG_X86_32),y)
KBUILD_LDFLAGS += $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
-else
-# To build 64-bit compressed kernel as PIE, we disable relocation
-# overflow check to avoid relocation overflow error with a new linker
-# command-line option, -z noreloc-overflow.
-KBUILD_LDFLAGS += $(shell $(LD) --help 2>&1 | grep -q "\-z noreloc-overflow" \
- && echo "-z noreloc-overflow -pie --no-dynamic-linker")
-endif
LDFLAGS_vmlinux := -T

hostprogs := mkpiggy
diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 8c1a4f5610f5..659fad53ca82 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -178,18 +178,17 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
/*
* Do the extraction, and jump to the new kernel..
*/
- /* push arguments for extract_kernel: */
- pushl $z_output_len /* decompressed length, end of relocs */
+ /* push arguments for extract_kernel: */

- pushl %ebp /* output address */
-
- pushl $z_input_len /* input_len */
+ pushl output_len@GOTOFF(%ebx) /* decompressed length, end of relocs */
+ pushl %ebp /* output address */
+ pushl input_len@GOTOFF(%ebx) /* input_len */
leal input_data@GOTOFF(%ebx), %eax
- pushl %eax /* input_data */
+ pushl %eax /* input_data */
leal boot_heap@GOTOFF(%ebx), %eax
- pushl %eax /* heap area */
- pushl %esi /* real mode pointer */
- call extract_kernel /* returns kernel location in %eax */
+ pushl %eax /* heap area */
+ pushl %esi /* real mode pointer */
+ call extract_kernel /* returns kernel location in %eax */
addl $24, %esp

/*
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 11429092c224..9e46729cf162 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -534,9 +534,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
movq %rsi, %rdi /* real mode address */
leaq boot_heap(%rip), %rsi /* malloc area for uncompression */
leaq input_data(%rip), %rdx /* input_data */
- movl $z_input_len, %ecx /* input_len */
+ movl input_len(%rip), %ecx /* input_len */
movq %rbp, %r8 /* output target address */
- movl $z_output_len, %r9d /* decompressed length, end of relocs */
+ movl output_len(%rip), %r9d /* decompressed length, end of relocs */
call extract_kernel /* returns kernel location in %rax */
popq %rsi

diff --git a/arch/x86/boot/compressed/mkpiggy.c b/arch/x86/boot/compressed/mkpiggy.c
index 7e01248765b2..52aa56cdbacc 100644
--- a/arch/x86/boot/compressed/mkpiggy.c
+++ b/arch/x86/boot/compressed/mkpiggy.c
@@ -60,6 +60,12 @@ int main(int argc, char *argv[])
printf(".incbin \"%s\"\n", argv[1]);
printf("input_data_end:\n");

+ printf(".section \".rodata\",\"a\",@progbits\n");
+ printf(".globl input_len\n");
+ printf("input_len:\n\t.long %lu\n", ilen);
+ printf(".globl output_len\n");
+ printf("output_len:\n\t.long %lu\n", (unsigned long)olen);
+
retval = 0;
bail:
if (f)
--
2.26.2

Arvind Sankar

unread,
Jun 29, 2020, 10:09:39 AM6/29/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Add a linker script check that there are no runtime relocations, and
remove the old one that tries to check via looking for specially-named
sections in the object files.

Drop the tests for -fPIE compiler option and -pie linker option, as they
are available in all supported gcc and binutils versions (as well as
clang and lld).

Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
---
arch/x86/boot/compressed/Makefile | 28 +++-----------------------
arch/x86/boot/compressed/vmlinux.lds.S | 8 ++++++++
2 files changed, 11 insertions(+), 25 deletions(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 242b291037e8..9b064ea55537 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -29,7 +29,7 @@ targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma \
vmlinux.bin.xz vmlinux.bin.lzo vmlinux.bin.lz4

KBUILD_CFLAGS := -m$(BITS) -O2
-KBUILD_CFLAGS += -fno-strict-aliasing $(call cc-option, -fPIE, -fPIC)
+KBUILD_CFLAGS += -fno-strict-aliasing -fPIE
KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
cflags-$(CONFIG_X86_32) := -march=i386
cflags-$(CONFIG_X86_64) := -mcmodel=small
@@ -51,7 +51,7 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-KBUILD_LDFLAGS += $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
+KBUILD_LDFLAGS += -pie $(call ld-option, --no-dynamic-linker)
LDFLAGS_vmlinux := -T

hostprogs := mkpiggy
@@ -86,30 +86,8 @@ vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
vmlinux-objs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o

-# The compressed kernel is built with -fPIC/-fPIE so that a boot loader
-# can place it anywhere in memory and it will still run. However, since
-# it is executed as-is without any ELF relocation processing performed
-# (and has already had all relocation sections stripped from the binary),
-# none of the code can use data relocations (e.g. static assignments of
-# pointer values), since they will be meaningless at runtime. This check
-# will refuse to link the vmlinux if any of these relocations are found.
-quiet_cmd_check_data_rel = DATAREL $@
-define cmd_check_data_rel
- for obj in $(filter %.o,$^); do \
- $(READELF) -S $$obj | grep -qF .rel.local && { \
- echo "error: $$obj has data relocations!" >&2; \
- exit 1; \
- } || true; \
- done
-endef
-
-# We need to run two commands under "if_changed", so merge them into a
-# single invocation.
-quiet_cmd_check-and-link-vmlinux = LD $@
- cmd_check-and-link-vmlinux = $(cmd_check_data_rel); $(cmd_ld)
-
$(obj)/vmlinux: $(vmlinux-objs-y) FORCE
- $(call if_changed,check-and-link-vmlinux)
+ $(call if_changed,ld)

OBJCOPYFLAGS_vmlinux.bin := -R .comment -S
$(obj)/vmlinux.bin: vmlinux FORCE
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index a4a4a59a2628..a78510046eec 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -42,6 +42,12 @@ SECTIONS
*(.rodata.*)
_erodata = . ;
}
+ .rel.dyn : {
+ *(.rel.*)
+ }
+ .rela.dyn : {
+ *(.rela.*)
+ }
.got : {
*(.got)
}
@@ -85,3 +91,5 @@ ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT en
#else
ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
#endif
+
+ASSERT(SIZEOF(.rel.dyn) == 0 && SIZEOF(.rela.dyn) == 0, "Unexpected runtime relocations detected!")
--
2.26.2

Kees Cook

unread,
Jun 29, 2020, 11:48:09 AM6/29/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
This is also being done on arm64, and the section was specified slightly
differently (with INFO) which maybe should be done here too?

.got.plt (INFO) : { *(.got.plt) }
ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, ".got.plt not empty")

Otherwise, yes, looks good.

Reviewed-by: Kees Cook <kees...@chromium.org>

> +
> .data : {
> _data = . ;
> *(.data)
> @@ -77,3 +80,9 @@ SECTIONS
>
> DISCARDS
> }
> +
> +#ifdef CONFIG_X86_64
> +ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
> +#else
> +ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
> +#endif
> --
> 2.26.2
>

--
Kees Cook

Arvind Sankar

unread,
Jun 29, 2020, 11:50:14 AM6/29/20
to Kees Cook, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Mon, Jun 29, 2020 at 08:48:05AM -0700, Kees Cook wrote:
> On Mon, Jun 29, 2020 at 10:09:22AM -0400, Arvind Sankar wrote:
>
> This is also being done on arm64, and the section was specified slightly
> differently (with INFO) which maybe should be done here too?

I was actually just about to email you to ask what that INFO is for :)
What does it do?

Kees Cook

unread,
Jun 29, 2020, 11:51:02 AM6/29/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Is this recognized by Clang? I'm assuming so, since I see this already
being used in drivers/firmware/efi/libstub/hidden.h

> diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
> index b17d218ccdf9..4bcc943842ab 100644
> --- a/arch/x86/boot/compressed/vmlinux.lds.S
> +++ b/arch/x86/boot/compressed/vmlinux.lds.S
> @@ -81,6 +81,7 @@ SECTIONS
> DISCARDS
> }
>
> +ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")
> #ifdef CONFIG_X86_64
> ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
> #else

Reviewed-by: Kees Cook <kees...@chromium.org>

--
Kees Cook

Ard Biesheuvel

unread,
Jun 29, 2020, 11:51:15 AM6/29/20
to Arvind Sankar, Kees Cook, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
On Mon, 29 Jun 2020 at 17:50, Arvind Sankar <nive...@alum.mit.edu> wrote:
>
> On Mon, Jun 29, 2020 at 08:48:05AM -0700, Kees Cook wrote:
> > On Mon, Jun 29, 2020 at 10:09:22AM -0400, Arvind Sankar wrote:
> >
> > This is also being done on arm64, and the section was specified slightly
> > differently (with INFO) which maybe should be done here too?
>
> I was actually just about to email you to ask what that INFO is for :)
> What does it do?
>

It makes the section non-allocatable, so it is dropped from the final image

Kees Cook

unread,
Jun 29, 2020, 11:53:09 AM6/29/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Mon, Jun 29, 2020 at 10:09:24AM -0400, Arvind Sankar wrote:
> From: Ard Biesheuvel <ar...@kernel.org>
>
> In a previous patch, we have eliminated GOT entries from the decompressor
> binary and added an assertion that the .got section is empty. This means
> that the GOT fixup routines that exist in both the 32-bit and 64-bit
> startup routines have become dead code, and can be removed.
>
> While at it, drop the KEEP() from the linker script, as it has no effect
> on the contents of output sections that are created by the linker itself.
>
> Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
> Acked-by: Arvind Sankar <nive...@alum.mit.edu>
> Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
> From: Ard Biesheuvel <ar...@kernel.org>
> Link: https://lore.kernel.org/r/2020052312002...@kernel.org
> ---
> arch/x86/boot/compressed/head_32.S | 24 ++---------
> arch/x86/boot/compressed/head_64.S | 57 --------------------------
> arch/x86/boot/compressed/vmlinux.lds.S | 4 +-
> 3 files changed, 5 insertions(+), 80 deletions(-)

This makes me very happy! This code always bugged me. ;)

Kees Cook

unread,
Jun 29, 2020, 11:55:56 AM6/29/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Mon, Jun 29, 2020 at 10:09:25AM -0400, Arvind Sankar wrote:
> gcc puts the main function into .text.startup when compiled with -Os (or
> -O2). This results in arch/x86/boot/main.c having a .text.startup
> section which is currently not included explicitly in the linker script
> setup.ld in the same directory.
>
> The BFD linker places this orphan section immediately after .text, so
> this still works. However, LLD git, since [1], is choosing to place it
> immediately after the .bstext section instead (this is the first code
> section). This plays havoc with the section layout that setup.elf
> requires to create the setup header, for eg on 64-bit:

Eep! I guess this was a linking case I missed for adding
--orphan-handling=warn to my orphan handling series. (I will add that.)

Kees Cook

unread,
Jun 29, 2020, 12:04:08 PM6/29/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Are any of Thomas Garnier's PIE fixes useful here too? He had a lot of
fixes to make changes for PC-relative addressing in the various
assembly bits:
https://lore.kernel.org/lkml/20200228000105.1...@chromium.org/

>
> Start fixing this by removing relocations from .head.text:
> - On 32-bit, use a base register that holds the address of the GOT and
> reference symbol addresses using @GOTOFF, i.e.
> leal gdt@GOTOFF(%edx), %eax
> - On 64-bit, most of the code can (and already does) use %rip-relative
> addressing, however the .code32 bits can't, and the 64-bit code also
> needs to reference symbol addresses as they will be after moving the
> compressed kernel to the end of the decompression buffer.
> For these cases, reference the symbols as an offset to startup_32 to
> avoid creating relocations, i.e.
> leal (gdt-startup_32)(%bp), %eax
> This only works in .head.text as the subtraction cannot be represented
> as a PC-relative relocation unless startup_32 is in the same section
> as the code. Move efi32_pe_entry into .head.text so that it can use
> the same method to avoid relocations.
>
> Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
> Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
> Reviewed-by: Fangrui Song <mas...@google.com>

Nice.

Kees Cook

unread,
Jun 29, 2020, 12:06:09 PM6/29/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Mon, Jun 29, 2020 at 10:09:27AM -0400, Arvind Sankar wrote:
> The BFD linker generates runtime relocations for z_input_len and
> z_output_len, even though they are absolute symbols.
>
> This is fixed for binutils-2.35 [1]. Work around this for earlier
> versions by defining two variables input_len and output_len in addition
> to the symbols, and use them via position-independent references.
>
> This eliminates the last two runtime relocations in the head code and
> allows us to drop the -z noreloc-overflow flag to the linker.
>
> [1] https://sourceware.org/bugzilla/show_bug.cgi?id=25754
>
> Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
> Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
> Reviewed-by: Fangrui Song <mas...@google.com>
> ---
> arch/x86/boot/compressed/Makefile | 8 --------
> arch/x86/boot/compressed/head_32.S | 17 ++++++++---------
> arch/x86/boot/compressed/head_64.S | 4 ++--
> arch/x86/boot/compressed/mkpiggy.c | 6 ++++++
> 4 files changed, 16 insertions(+), 19 deletions(-)

I continue to really enjoy the smaller code. Anything that makes
mkpiggy.c smaller is a win. :)

Kees Cook

unread,
Jun 29, 2020, 12:09:19 PM6/29/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Mon, Jun 29, 2020 at 10:09:28AM -0400, Arvind Sankar wrote:
> Add a linker script check that there are no runtime relocations, and
> remove the old one that tries to check via looking for specially-named
> sections in the object files.
>
> Drop the tests for -fPIE compiler option and -pie linker option, as they
> are available in all supported gcc and binutils versions (as well as
> clang and lld).
>
> Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
> Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
> Reviewed-by: Fangrui Song <mas...@google.com>
> ---
> arch/x86/boot/compressed/Makefile | 28 +++-----------------------
> arch/x86/boot/compressed/vmlinux.lds.S | 8 ++++++++
> 2 files changed, 11 insertions(+), 25 deletions(-)

Reviewed-by: Kees Cook <kees...@chromium.org>

question below ...

> diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
> index a4a4a59a2628..a78510046eec 100644
> --- a/arch/x86/boot/compressed/vmlinux.lds.S
> +++ b/arch/x86/boot/compressed/vmlinux.lds.S
> @@ -42,6 +42,12 @@ SECTIONS
> *(.rodata.*)
> _erodata = . ;
> }
> + .rel.dyn : {
> + *(.rel.*)
> + }
> + .rela.dyn : {
> + *(.rela.*)
> + }
> .got : {
> *(.got)
> }

Should these be marked (INFO) as well?

> @@ -85,3 +91,5 @@ ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT en
> #else
> ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
> #endif
> +
> +ASSERT(SIZEOF(.rel.dyn) == 0 && SIZEOF(.rela.dyn) == 0, "Unexpected runtime relocations detected!")

I think I should be doing this same ASSERT style for other explicit
DISCARDS in my orphan series so we'll notice if they change, instead
of being silently dropped if they grow.

--
Kees Cook

Kees Cook

unread,
Jun 29, 2020, 12:10:07 PM6/29/20
to Ard Biesheuvel, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
On Mon, Jun 29, 2020 at 05:51:00PM +0200, Ard Biesheuvel wrote:
> On Mon, 29 Jun 2020 at 17:50, Arvind Sankar <nive...@alum.mit.edu> wrote:
> >
> > On Mon, Jun 29, 2020 at 08:48:05AM -0700, Kees Cook wrote:
> > > On Mon, Jun 29, 2020 at 10:09:22AM -0400, Arvind Sankar wrote:
> > >
> > > This is also being done on arm64, and the section was specified slightly
> > > differently (with INFO) which maybe should be done here too?
> >
> > I was actually just about to email you to ask what that INFO is for :)
> > What does it do?
> >
>
> It makes the section non-allocatable, so it is dropped from the final image

i.e. takes no disk space, but the ASSERT can still be done.

--
Kees Cook

Ard Biesheuvel

unread,
Jun 29, 2020, 12:12:12 PM6/29/20
to Kees Cook, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
Given that sections marked as (INFO) will still be emitted into the
ELF image, it does not really make a difference to do this for zero
sized sections.

Kees Cook

unread,
Jun 29, 2020, 12:20:34 PM6/29/20
to Ard Biesheuvel, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
Oh, I misunderstood -- I though they were _not_ emitted; I see now what
you said was not allocated. So, disk space used for the .got.plt case,
but not memory space used. Sorry for the confusion!

-Kees

>
> > > @@ -85,3 +91,5 @@ ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT en
> > > #else
> > > ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
> > > #endif
> > > +
> > > +ASSERT(SIZEOF(.rel.dyn) == 0 && SIZEOF(.rela.dyn) == 0, "Unexpected runtime relocations detected!")
> >
> > I think I should be doing this same ASSERT style for other explicit
> > DISCARDS in my orphan series so we'll notice if they change, instead
> > of being silently dropped if they grow.
> >

--
Kees Cook

Arvind Sankar

unread,
Jun 29, 2020, 12:52:13 PM6/29/20
to Kees Cook, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
It doesn't actually reduce the size? mkpiggy.c _could_ I think be
replaced with some carefully written shell code, but this doesn't do
anything like that.

Arvind Sankar

unread,
Jun 29, 2020, 12:56:07 PM6/29/20
to Kees Cook, Ard Biesheuvel, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
In the case of the REL[A] and .got sections, they are actually already
not emitted at all into the ELF file now that they are zero size.

For .got.plt, it is only emitted for 32-bit (with the 3 reserved
entries), the 64-bit linker seems to get rid of it.

Arvind Sankar

unread,
Jun 29, 2020, 1:01:42 PM6/29/20
to Kees Cook, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
The case there is somewhat different -- he needed to convert non-PIE
code into PIE code, and has to worry about the distinction between
physical and virtual addresses. Here we actually already have PIE code,
so the assembly doesn't really have to change. It's just a matter of
being more precise so that the toolchain understands that it's PIE, and
everything is identity-mapped so it's simpler.

Fangrui Song

unread,
Jun 29, 2020, 1:37:39 PM6/29/20
to Kees Cook, Arvind Sankar, Ard Biesheuvel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
About output section type (INFO):
https://sourceware.org/binutils/docs/ld/Output-Section-Type.html#Output-Section-Type
says "These type names are supported for backward compatibility, and are
rarely used."

If all input section don't have the SHF_ALLOC flag, the output section
will not have this flag as well. This type is not useful...

If .got and .got.plt were used, they should be considered dynamic
relocations which should be part of the loadable image. So they should
have the SHF_ALLOC flag. (INFO) will not be applicable anyway.

SHT_REL[A] may be allocable or not. Usually .rel[a].dyn and .rel[a].plt
are linker created allocable sections. (INFO) does not make sense for them.

Ard Biesheuvel

unread,
Jun 29, 2020, 2:12:01 PM6/29/20
to Fangrui Song, Kees Cook, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
I don't care deeply either way, but Kees indicated that he would like
to get rid of the 24 bytes of .got.plt magic entries that we have no
need for.

In fact, a lot of this mangling is caused by the fact that the linker
is creating a relocatable binary, and assumes that it is a hosted
binary that is loaded by a dynamic loader. It would actually be much
better if the compiler and linker would take -ffreestanding into
account, and suppress GOT entries, PLTs, dynamic program headers for
shared libraries altogether.

Arvind Sankar

unread,
Jun 29, 2020, 2:43:39 PM6/29/20
to Kees Cook, Ard Biesheuvel, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
On Mon, Jun 29, 2020 at 09:20:31AM -0700, Kees Cook wrote:
> On Mon, Jun 29, 2020 at 06:11:59PM +0200, Ard Biesheuvel wrote:
> > On Mon, 29 Jun 2020 at 18:09, Kees Cook <kees...@chromium.org> wrote:
> > >
> > > Should these be marked (INFO) as well?
> > >
> >
> > Given that sections marked as (INFO) will still be emitted into the
> > ELF image, it does not really make a difference to do this for zero
> > sized sections.
>
> Oh, I misunderstood -- I though they were _not_ emitted; I see now what
> you said was not allocated. So, disk space used for the .got.plt case,
> but not memory space used. Sorry for the confusion!
>
> -Kees
>

To confuse the issue a bit more, there are subtleties around "disk space
used" :) The section will be present in the ELF format file, but at
least on x86, that file is then converted into binary format via
objcopy. At this point a non-allocated section at the end of the file
will be stripped off. So on 32-bit x86, moving .got.plt to the end and
marking it INFO will shave 16 bytes off the bzImage kernel.

Fangrui Song

unread,
Jun 29, 2020, 7:34:11 PM6/29/20
to Ard Biesheuvel, Kees Cook, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
Linkers (GNU ld and LLD) don't create .got or .got.plt just because the linker
command line has -pie or -shared. They create .got or .got.plt if there are
specific needs.

For .got.plt, if there is (1) any .plt/.iplt entry, (2) any .got.plt based
relocation (e.g. R_X86_64_GOTPC32 on x86-64), or (3) if _GLOBAL_OFFSET_TABLE_ is
referenced, .got.plt will be created (both GNU ld and LLD) with usually 3
entries (for ld.so purposes).

If (1) is not satisfied, the created .got.plt is just served as an anchor for
things that want to reference (the distance from GOT base to some point). The
linker will still reserve 3 words but the words are likely not needed.

I don't think there is a specific need for another option to teach the linker
(GNU ld or LLD) that this is a kernel link. For -ffreestanding builds, cc
-static (ld -no-pie))/-static-pie (-pie) already work quite well.

Ard Biesheuvel

unread,
Jun 30, 2020, 12:26:56 PM6/30/20
to Fangrui Song, Kees Cook, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
This is not the case for AArch64. There, __GLOBAL_OFFSET_TABLE__ is
always emitted, along with the magic .got.plt entries, regardless of
the input.

As for the input objects - why is '#pragma GCC visibility(hidden)' not
the default for -ffreestanding builds? This suppresses any GOT entries
emitted by the compiler, but the only way to get this behavior is
through the #pragma, which is how we ended up with '-include hidden.h'
in a couple of places.

IOW, if the toolchain behavior was not 100% geared towards shared
executables as it is today, we would not need the hacks that we need
to apply to get a relocatable bare metal binary like we need for the
KASLR kernel.


> If (1) is not satisfied, the created .got.plt is just served as an anchor for
> things that want to reference (the distance from GOT base to some point). The
> linker will still reserve 3 words but the words are likely not needed.
>
> I don't think there is a specific need for another option to teach the linker
> (GNU ld or LLD) that this is a kernel link. For -ffreestanding builds, cc
> -static (ld -no-pie))/-static-pie (-pie) already work quite well.

You mean 'ld -static -pie' right? That seems to work. Is that a recent
invention?

Arvind Sankar

unread,
Jun 30, 2020, 1:54:13 PM6/30/20
to Ard Biesheuvel, Fangrui Song, Kees Cook, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
gcc -static-pie is fairly recent [0], but it just influences how the
linker is invoked AFAIK (at least for gcc) -- in addition to passing
some linker flags, it will change what startup files get linked in (for
non-freestanding). It does not even imply -fPIE to the compiler, which
is confusing as hell. It _would_ be nice if this also told the compiler
that all symbols (perhaps unless explicitly marked) will be resolved at
static link time, so there is no need to use the GOT or PLT for globals.

As it stands, the executable can still have relocations, GOT and PLT, it
just needs to have startup code to handle them (provided by libc
typically) instead of relying on an external dynamic linker.

I don't think it's really relevant for the kernel build -- all we get is
ld -static --no-dynamic-linker, all -static does is prevent searching
shared libraries, and we already pass --no-dynamic-linker if it's
supported.

[0] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81498

Fangrui Song

unread,
Jun 30, 2020, 6:08:45 PM6/30/20
to Arvind Sankar, Ard Biesheuvel, Kees Cook, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
* Ard Biesheuvel
> On Tue, 30 Jun 2020 at 01:34, Fangrui Song <mas...@google.com> wrote:
> >
A -ffreestanding build may provide a shared object used by other
applications. For example, musl (libc)'s CFLAGS has -ffreestanding.

> IOW, if the toolchain behavior was not 100% geared towards shared
> executables as it is today, we would not need the hacks that we need
> to apply to get a relocatable bare metal binary like we need for the
> KASLR kernel.

My mere concern regarding "geared towards shared objects" is that ELF
assumes symbols of default visibility are preemptible by default.

So unfortunately it is difficult to make -fno-semantic-interposition the
default.

> If (1) is not satisfied, the created .got.plt is just served as an anchor for
> things that want to reference (the distance from GOT base to some point). The
> linker will still reserve 3 words but the words are likely not needed.
>
> I don't think there is a specific need for another option to teach the linker
> (GNU ld or LLD) that this is a kernel link. For -ffreestanding builds, cc
> -static (ld -no-pie))/-static-pie (-pie) already work quite well.

On 2020-06-30, Arvind Sankar wrote:
>On Tue, Jun 30, 2020 at 06:26:43PM +0200, Ard Biesheuvel wrote:
>> On Tue, 30 Jun 2020 at 01:34, Fangrui Song <mas...@google.com> wrote:
>>
>> > If (1) is not satisfied, the created .got.plt is just served as an anchor for
>> > things that want to reference (the distance from GOT base to some point). The
>> > linker will still reserve 3 words but the words are likely not needed.
>> >
>> > I don't think there is a specific need for another option to teach the linker
>> > (GNU ld or LLD) that this is a kernel link. For -ffreestanding builds, cc
>> > -static (ld -no-pie))/-static-pie (-pie) already work quite well.
>>
>> You mean 'ld -static -pie' right? That seems to work. Is that a recent
>> invention?
>
>gcc -static-pie is fairly recent [0], but it just influences how the
>linker is invoked AFAIK (at least for gcc) -- in addition to passing
>some linker flags, it will change what startup files get linked in (for
>non-freestanding). It does not even imply -fPIE to the compiler, which
>is confusing as hell. It _would_ be nice if this also told the compiler
>that all symbols (perhaps unless explicitly marked) will be resolved at
>static link time, so there is no need to use the GOT or PLT for globals.
>
>As it stands, the executable can still have relocations, GOT and PLT, it
>just needs to have startup code to handle them (provided by libc
>typically) instead of relying on an external dynamic linker.

If the executable is purely static, it does not need to have PLT. All
calls to a PLT can be redirected to the function itself. Some range
extension thunks (other terms: stub groups, veneers, etc) may still be
needed if the distance is too large.

There are cases where a GOT cannot be avoided, e.g.

extern char foo[] __attribute__((weak, visibility("hidden")));
char *fun() { return foo; }

If foo is a SHN_ABS, `movq foo@GOTPCREL(%rip), %rax` can't be optimized
by GOTPCRELX (https://sourceware.org/bugzilla/show_bug.cgi?id=25749 binutils>=2.35 will be good)
Many other architectures don't even have a GOT optimization.

Arvind Sankar

unread,
Jun 30, 2020, 7:28:02 PM6/30/20
to Fangrui Song, Arvind Sankar, Ard Biesheuvel, Kees Cook, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
Urk -- the example given in that bug report isn't even weak. Are you
guys proposing to pessimize every access to a global symbol, regardless
of visibility, by going through the GOT on the off chance that somebody
might define one of them as SHN_ABS? Can we at least gate it behind
something like __attribute__((might_be_shn_abs))?

Ard Biesheuvel

unread,
Jul 1, 2020, 2:45:09 AM7/1/20
to Arvind Sankar, Fangrui Song, Kees Cook, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
On Wed, 1 Jul 2020 at 01:28, Arvind Sankar <nive...@alum.mit.edu> wrote:
>
> On Tue, Jun 30, 2020 at 03:00:43PM -0700, Fangrui Song wrote:
> > * Ard Biesheuvel
> > > On Tue, 30 Jun 2020 at 01:34, Fangrui Song <mas...@google.com> wrote:
> >
> > If the executable is purely static, it does not need to have PLT. All
> > calls to a PLT can be redirected to the function itself. Some range
> > extension thunks (other terms: stub groups, veneers, etc) may still be
> > needed if the distance is too large.
> >
> > There are cases where a GOT cannot be avoided, e.g.
> >
> > extern char foo[] __attribute__((weak, visibility("hidden")));
> > char *fun() { return foo; }
> >
> > If foo is a SHN_ABS, `movq foo@GOTPCREL(%rip), %rax` can't be optimized
> > by GOTPCRELX (https://sourceware.org/bugzilla/show_bug.cgi?id=25749 binutils>=2.35 will be good)
> > Many other architectures don't even have a GOT optimization.
>
> Urk -- the example given in that bug report isn't even weak. Are you
> guys proposing to pessimize every access to a global symbol, regardless
> of visibility, by going through the GOT on the off chance that somebody
> might define one of them as SHN_ABS? Can we at least gate it behind
> something like __attribute__((might_be_shn_abs))?
>

SHN_ABS is typically only used for constants emitted by the linker
script, so I don't think this is a huge deal.

The example above is not that different from having a statically
initialized function pointer in your object (which might be NULL), and
that is something we already have to deal with anyway.

What I was talking about is the tendency of the compiler to assume
that every function symbol with external linkage is preemptible, and
the only way to suppress this behavior is to issue a #pragma that can
be done in code only, not on the compiler command line.

Arvind Sankar

unread,
Jul 1, 2020, 10:42:35 AM7/1/20
to Ard Biesheuvel, Arvind Sankar, Fangrui Song, Kees Cook, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
Yes, SHN_ABS is rare. But supporting it without an explicit annotation
for the compiler requires pessimizing the common case where the symbol
is _not_ SHN_ABS, because the compiler would have to assume that
everything might be defined as SHN_ABS. This includes accessing a simple
extern int variable.

It would have to generate the GOT-referencing code in all cases where it
doesn't see a strong definition, even with hidden visibility. And on
x86, we'd have to bump the toolchain requirement to at least 2.26 so we
can get the linker relaxations, otherwise you'd have GOT references in
the final code.

Arvind Sankar

unread,
Jul 13, 2020, 10:38:40 PM7/13/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
The compressed kernel currently contains bogus runtime relocations in
the startup code in head_{32,64}.S, which are generated by the linker,
but must not actually be processed at runtime.

This generates warnings when linking with the BFD linker, and errors
with LLD, which defaults to erroring on runtime relocations in read-only
sections. It also requires the -z noreloc-overflow hack for the 64-bit
kernel, which prevents us from linking it as -pie on an older BFD linker
(<= 2.26) or on LLD, because the locations that are to be apparently
relocated are only 32-bits in size and so cannot really have
R_X86_64_RELATIVE relocations.

This series aims to get rid of these relocations. I've build- and
boot-tested with combinations of clang/gcc-10 with lld/bfd-2.34, and
gcc-4.9.0 with bfd-2.24, skipping clang on 32-bit because it currently
has other issues [0].

The first three patches by Ard remove indirection via the GOT from the
compressed kernel code.

The next patch is an independent fix for LLD, to avoid an orphan
section in arch/x86/boot/setup.elf.

The fifth patch gets rid of almost all the relocations. It uses
standard PIC addressing technique for 32-bit, i.e. loading a register
with the address of _GLOBAL_OFFSET_TABLE_ and then using GOTOFF
references to access variables. For 64-bit, there is 32-bit code that
cannot use RIP-relative addressing, and also cannot use the 32-bit
method, since GOTOFF references are 64-bit only. This is instead handled
using a macro to replace a reference like gdt with (gdt-startup_32)
instead. The assembler will generate a PC32 relocation entry, with
addend set to (.-startup_32), and these will be replaced with constants
at link time. This works as long as all the code using such references
lives in the same section as startup_32, i.e. in .head.text.

The sixth patch addresses a remaining issue with the BFD linker, which
generates runtime relocations for absolute symbols. We use z_input_len
and z_output_len, defined in the generated piggy.S file, as symbols
whose absolute "addresses" are actually the size of the compressed
payload and the size of the decompressed kernel image respectively. LLD
does not generate relocations for these two symbols, but the BFD linker
does, prior to the upcoming 2.35. To get around this, piggy.S is
extended to also define two u32 variables (in .rodata) with the lengths,
and the head code is modified to use those instead of the symbol
addresses.

An alternative way to handle z_input_len/z_output_len would be to just
include piggy.S in head_{32,64}.S instead of as a separate object file,
since the GNU assembler doesn't generate relocations for symbols set to
constants.

The last patch adds a check in the linker script to ensure that no
runtime relocations get reintroduced.

[0] https://lore.kernel.org/lkml/20200504230309.237...@google.com/

Changes from v3:
- Move hidden.h to include/linux so the EFI stub and the compressed
kernel can share the same file

Changes from v2:
- Incorporate Ard's patches for eliminating GOT references into this
series
- Rebase on v5.8-rc3

v2: https://lore.kernel.org/lkml/20200525225918.1...@alum.mit.edu/

Changes from v1:
- Add .text.* to setup.ld instead of just .text.startup
- Rename the la() macro introduced in the second patch for 64-bit to
rva(), and rework the explanatory comment.
- In the last patch, check both .rel.dyn and .rela.dyn, instead of just
one per arch.


Ard Biesheuvel (3):
x86/boot/compressed: Move .got.plt entries out of the .got section
x86/boot/compressed: Force hidden visibility for all symbol references
x86/boot/compressed: Get rid of GOT fixup code

Arvind Sankar (4):
x86/boot: Add .text.* to setup.ld
x86/boot: Remove run-time relocations from .head.text code
x86/boot: Remove runtime relocations from head_{32,64}.S
x86/boot: Check that there are no runtime relocations

arch/x86/boot/compressed/Makefile | 37 +-----
arch/x86/boot/compressed/head_32.S | 99 +++++----------
arch/x86/boot/compressed/head_64.S | 165 ++++++++++---------------
arch/x86/boot/compressed/mkpiggy.c | 6 +
arch/x86/boot/compressed/vmlinux.lds.S | 24 +++-
arch/x86/boot/setup.ld | 2 +-
drivers/firmware/efi/libstub/Makefile | 2 +-
drivers/firmware/efi/libstub/hidden.h | 6 -
include/linux/hidden.h | 19 +++
9 files changed, 152 insertions(+), 208 deletions(-)
delete mode 100644 drivers/firmware/efi/libstub/hidden.h
create mode 100644 include/linux/hidden.h


base-commit: 11ba468877bb23f28956a35e896356252d63c983
--
2.26.2

Arvind Sankar

unread,
Jul 13, 2020, 10:38:40 PM7/13/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

The .got.plt section contains the part of the GOT which is used by PLT
entries, and which gets updated lazily by the dynamic loader when
function calls are dispatched through those PLT entries.

On fully linked binaries such as the kernel proper or the decompressor,
this never happens, and so in practice, the .got.plt section consists
only of the first 3 magic entries that are meant to point at the _DYNAMIC
section and at the fixup routine in the loader. However, since we don't
use a dynamic loader, those entries are never populated or used.

This means that treating those entries like ordinary GOT entries, and
updating their values based on the actual placement of the executable in
memory is completely pointless, and we can just ignore the .got.plt
section entirely, provided that it has no additional entries beyond
the first 3 ones.

So add an assertion in the linker script to ensure that this assumption
holds, and move the contents out of the [_got, _egot) memory range that
is modified by the GOT fixup routines.

While at it, drop the KEEP(), since it has no effect on the contents
of output sections that are created by the linker itself.

Reviewed-by: Kees Cook <kees...@chromium.org>
Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
From: Ard Biesheuvel <ar...@kernel.org>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
---
arch/x86/boot/compressed/vmlinux.lds.S | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 8f1025d1f681..b17d218ccdf9 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -44,10 +44,13 @@ SECTIONS
}
.got : {
_got = .;
- KEEP(*(.got.plt))
KEEP(*(.got))
_egot = .;
}
+ .got.plt : {
+ *(.got.plt)
+ }
+
.data : {
_data = . ;
*(.data)
@@ -77,3 +80,9 @@ SECTIONS

DISCARDS
}
+
+#ifdef CONFIG_X86_64
+ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
+#else
+ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
+#endif
--
2.26.2

Arvind Sankar

unread,
Jul 13, 2020, 10:38:41 PM7/13/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

Eliminate all GOT entries in the decompressor binary, by forcing hidden
visibility for all symbol references, which informs the compiler that
such references will be resolved at link time without the need for
allocating GOT entries.

To ensure that no GOT entries will creep back in, add an assertion to
the decompressor linker script that will fire if the .got section has
a non-zero size.

[Arvind: move hidden.h to include/linux instead of making a copy]

Reviewed-by: Kees Cook <kees...@chromium.org>
Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
From: Ard Biesheuvel <ar...@kernel.org>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
---
arch/x86/boot/compressed/Makefile | 1 +
arch/x86/boot/compressed/vmlinux.lds.S | 1 +
drivers/firmware/efi/libstub/Makefile | 2 +-
drivers/firmware/efi/libstub/hidden.h | 6 ------
include/linux/hidden.h | 19 +++++++++++++++++++
5 files changed, 22 insertions(+), 7 deletions(-)
delete mode 100644 drivers/firmware/efi/libstub/hidden.h
create mode 100644 include/linux/hidden.h

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 7619742f91c9..c829d874dcac 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -42,6 +42,7 @@ KBUILD_CFLAGS += $(call cc-disable-warning, gnu)
KBUILD_CFLAGS += -Wno-pointer-sign
KBUILD_CFLAGS += $(call cc-option,-fmacro-prefix-map=$(srctree)/=)
KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
+KBUILD_CFLAGS += -include $(srctree)/include/linux/hidden.h

KBUILD_AFLAGS := $(KBUILD_CFLAGS) -D__ASSEMBLY__
GCOV_PROFILE := n
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index b17d218ccdf9..4bcc943842ab 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -81,6 +81,7 @@ SECTIONS
DISCARDS
}

+ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")
#ifdef CONFIG_X86_64
ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
#else
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 4cce372edaf4..609157a40493 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -27,7 +27,7 @@ cflags-$(CONFIG_ARM) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt

KBUILD_CFLAGS := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \
- -include $(srctree)/drivers/firmware/efi/libstub/hidden.h \
+ -include $(srctree)/include/linux/hidden.h \
-D__NO_FORTIFY \
$(call cc-option,-ffreestanding) \
$(call cc-option,-fno-stack-protector) \
diff --git a/drivers/firmware/efi/libstub/hidden.h b/drivers/firmware/efi/libstub/hidden.h
deleted file mode 100644
index 3493b041f419..000000000000
--- a/drivers/firmware/efi/libstub/hidden.h
+++ /dev/null
@@ -1,6 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * To prevent the compiler from emitting GOT-indirected (and thus absolute)
- * references to any global symbols, override their visibility as 'hidden'
- */
-#pragma GCC visibility push(hidden)
diff --git a/include/linux/hidden.h b/include/linux/hidden.h
new file mode 100644
index 000000000000..49a17b6b5962
--- /dev/null
+++ b/include/linux/hidden.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * When building position independent code with GCC using the -fPIC option,
+ * (or even the -fPIE one on older versions), it will assume that we are
+ * building a dynamic object (either a shared library or an executable) that
+ * may have symbol references that can only be resolved at load time. For a
+ * variety of reasons (ELF symbol preemption, the CoW footprint of the section
+ * that is modified by the loader), this results in all references to symbols
+ * with external linkage to go via entries in the Global Offset Table (GOT),
+ * which carries absolute addresses which need to be fixed up when the
+ * executable image is loaded at an offset which is different from its link
+ * time offset.
+ *
+ * Fortunately, there is a way to inform the compiler that such symbol
+ * references will be satisfied at link time rather than at load time, by
+ * giving them 'hidden' visibility.
+ */
+
+#pragma GCC visibility push(hidden)
--
2.26.2

Arvind Sankar

unread,
Jul 13, 2020, 10:38:43 PM7/13/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

In a previous patch, we have eliminated GOT entries from the decompressor
binary and added an assertion that the .got section is empty. This means
that the GOT fixup routines that exist in both the 32-bit and 64-bit
startup routines have become dead code, and can be removed.

While at it, drop the KEEP() from the linker script, as it has no effect
on the contents of output sections that are created by the linker itself.

Reviewed-by: Kees Cook <kees...@chromium.org>
Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
From: Ard Biesheuvel <ar...@kernel.org>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
---
arch/x86/boot/compressed/head_32.S | 24 ++---------
arch/x86/boot/compressed/head_64.S | 57 --------------------------
arch/x86/boot/compressed/vmlinux.lds.S | 4 +-
3 files changed, 5 insertions(+), 80 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 03557f2174bf..39f0bb43218f 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -49,16 +49,13 @@
* Position Independent Executable (PIE) so that linker won't optimize
* R_386_GOT32X relocation to its fixed symbol address. Older
* linkers generate R_386_32 relocations against locally defined symbols,
- * _bss, _ebss, _got, _egot and _end, in PIE. It isn't wrong, just less
- * optimal than R_386_RELATIVE. But the x86 kernel fails to properly handle
- * R_386_32 relocations when relocating the kernel. To generate
- * R_386_RELATIVE relocations, we mark _bss, _ebss, _got, _egot and _end as
- * hidden:
+ * _bss, _ebss and _end, in PIE. It isn't wrong, just less optimal than
+ * R_386_RELATIVE. But the x86 kernel fails to properly handle R_386_32
+ * relocations when relocating the kernel. To generate R_386_RELATIVE
+ * relocations, we mark _bss, _ebss and _end as hidden:
*/
.hidden _bss
.hidden _ebss
- .hidden _got
- .hidden _egot
.hidden _end

__HEAD
@@ -192,19 +189,6 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
shrl $2, %ecx
rep stosl

-/*
- * Adjust our own GOT
- */
- leal _got(%ebx), %edx
- leal _egot(%ebx), %ecx
-1:
- cmpl %ecx, %edx
- jae 2f
- addl %ebx, (%edx)
- addl $4, %edx
- jmp 1b
-2:
-
/*
* Do the extraction, and jump to the new kernel..
*/
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 97d37f0a34f5..bf1ab30acc5b 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -40,8 +40,6 @@
*/
.hidden _bss
.hidden _ebss
- .hidden _got
- .hidden _egot
.hidden _end

__HEAD
@@ -353,25 +351,6 @@ SYM_CODE_START(startup_64)
/* Set up the stack */
leaq boot_stack_end(%rbx), %rsp

- /*
- * paging_prepare() and cleanup_trampoline() below can have GOT
- * references. Adjust the table with address we are running at.
- *
- * Zero RAX for adjust_got: the GOT was not adjusted before;
- * there's no adjustment to undo.
- */
- xorq %rax, %rax
-
- /*
- * Calculate the address the binary is loaded at and use it as
- * a GOT adjustment.
- */
- call 1f
-1: popq %rdi
- subq $1b, %rdi
-
- call .Ladjust_got
-
/*
* At this point we are in long mode with 4-level paging enabled,
* but we might want to enable 5-level paging or vice versa.
@@ -464,21 +443,6 @@ trampoline_return:
pushq $0
popfq

- /*
- * Previously we've adjusted the GOT with address the binary was
- * loaded at. Now we need to re-adjust for relocation address.
- *
- * Calculate the address the binary is loaded at, so that we can
- * undo the previous GOT adjustment.
- */
- call 1f
-1: popq %rax
- subq $1b, %rax
-
- /* The new adjustment is the relocation address */
- movq %rbx, %rdi
- call .Ladjust_got
-
/*
* Copy the compressed kernel to the end of our buffer
* where decompression in place becomes safe.
@@ -556,27 +520,6 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
jmp *%rax
SYM_FUNC_END(.Lrelocated)

-/*
- * Adjust the global offset table
- *
- * RAX is the previous adjustment of the table to undo (use 0 if it's the
- * first time we touch GOT).
- * RDI is the new adjustment to apply.
- */
-.Ladjust_got:
- /* Walk through the GOT adding the address to the entries */
- leaq _got(%rip), %rdx
- leaq _egot(%rip), %rcx
-1:
- cmpq %rcx, %rdx
- jae 2f
- subq %rax, (%rdx) /* Undo previous adjustment */
- addq %rdi, (%rdx) /* Apply the new adjustment */
- addq $8, %rdx
- jmp 1b
-2:
- ret
-
.code32
/*
* This is the 32-bit trampoline that will be copied over to low memory.
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 4bcc943842ab..a4a4a59a2628 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -43,9 +43,7 @@ SECTIONS
_erodata = . ;
}
.got : {
- _got = .;
- KEEP(*(.got))
- _egot = .;
+ *(.got)
}
.got.plt : {
*(.got.plt)
--
2.26.2

Arvind Sankar

unread,
Jul 13, 2020, 10:38:44 PM7/13/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
gcc puts the main function into .text.startup when compiled with -Os (or
-O2). This results in arch/x86/boot/main.c having a .text.startup
section which is currently not included explicitly in the linker script
setup.ld in the same directory.

The BFD linker places this orphan section immediately after .text, so
this still works. However, LLD git, since [1], is choosing to place it
immediately after the .bstext section instead (this is the first code
section). This plays havoc with the section layout that setup.elf
requires to create the setup header, for eg on 64-bit:

LD arch/x86/boot/setup.elf
ld.lld: error: section .text.startup file range overlaps with .header
>>> .text.startup range is [0x200040, 0x2001FE]
>>> .header range is [0x2001EF, 0x20026B]

ld.lld: error: section .header file range overlaps with .bsdata
>>> .header range is [0x2001EF, 0x20026B]
>>> .bsdata range is [0x2001FF, 0x200398]

ld.lld: error: section .bsdata file range overlaps with .entrytext
>>> .bsdata range is [0x2001FF, 0x200398]
>>> .entrytext range is [0x20026C, 0x2002D3]

ld.lld: error: section .text.startup virtual address range overlaps
with .header
>>> .text.startup range is [0x40, 0x1FE]
>>> .header range is [0x1EF, 0x26B]

ld.lld: error: section .header virtual address range overlaps with
.bsdata
>>> .header range is [0x1EF, 0x26B]
>>> .bsdata range is [0x1FF, 0x398]

ld.lld: error: section .bsdata virtual address range overlaps with
.entrytext
>>> .bsdata range is [0x1FF, 0x398]
>>> .entrytext range is [0x26C, 0x2D3]

ld.lld: error: section .text.startup load address range overlaps with
.header
>>> .text.startup range is [0x40, 0x1FE]
>>> .header range is [0x1EF, 0x26B]

ld.lld: error: section .header load address range overlaps with
.bsdata
>>> .header range is [0x1EF, 0x26B]
>>> .bsdata range is [0x1FF, 0x398]

ld.lld: error: section .bsdata load address range overlaps with
.entrytext
>>> .bsdata range is [0x1FF, 0x398]
>>> .entrytext range is [0x26C, 0x2D3]

Add .text.* to the .text output section to fix this, and also prevent
any future surprises if the compiler decides to create other such
sections.

[1] https://reviews.llvm.org/D75225

Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
---
arch/x86/boot/setup.ld | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld
index 24c95522f231..49546c247ae2 100644
--- a/arch/x86/boot/setup.ld
+++ b/arch/x86/boot/setup.ld
@@ -20,7 +20,7 @@ SECTIONS
.initdata : { *(.initdata) }
__end_init = .;

- .text : { *(.text) }
+ .text : { *(.text .text.*) }
.text32 : { *(.text32) }

. = ALIGN(16);
--
2.26.2

Arvind Sankar

unread,
Jul 13, 2020, 10:38:45 PM7/13/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.

Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
---
arch/x86/boot/compressed/head_32.S | 64 +++++++-----------
arch/x86/boot/compressed/head_64.S | 104 ++++++++++++++++++-----------
2 files changed, 90 insertions(+), 78 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 39f0bb43218f..8c1a4f5610f5 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -33,26 +33,10 @@
#include <asm/bootparam.h>

/*
- * The 32-bit x86 assembler in binutils 2.26 will generate R_386_GOT32X
- * relocation to get the symbol address in PIC. When the compressed x86
- * kernel isn't built as PIC, the linker optimizes R_386_GOT32X
- * relocations to their fixed symbol addresses. However, when the
- * compressed x86 kernel is loaded at a different address, it leads
- * to the following load failure:
- *
- * Failed to allocate space for phdrs
- *
- * during the decompression stage.
- *
- * If the compressed x86 kernel is relocatable at run-time, it should be
- * compiled with -fPIE, instead of -fPIC, if possible and should be built as
- * Position Independent Executable (PIE) so that linker won't optimize
- * R_386_GOT32X relocation to its fixed symbol address. Older
- * linkers generate R_386_32 relocations against locally defined symbols,
- * _bss, _ebss and _end, in PIE. It isn't wrong, just less optimal than
- * R_386_RELATIVE. But the x86 kernel fails to properly handle R_386_32
- * relocations when relocating the kernel. To generate R_386_RELATIVE
- * relocations, we mark _bss, _ebss and _end as hidden:
+ * These symbols needed to be marked as .hidden to prevent the BFD linker from
+ * generating R_386_32 (rather than R_386_RELATIVE) relocations for them when
+ * the 32-bit compressed kernel is linked as PIE. This is no longer necessary,
+ * but it doesn't hurt to keep them .hidden.
*/
.hidden _bss
.hidden _ebss
@@ -74,10 +58,10 @@ SYM_FUNC_START(startup_32)
leal (BP_scratch+4)(%esi), %esp
call 1f
1: popl %edx
- subl $1b, %edx
+ addl $_GLOBAL_OFFSET_TABLE_+(.-1b), %edx

/* Load new GDT */
- leal gdt(%edx), %eax
+ leal gdt@GOTOFF(%edx), %eax
movl %eax, 2(%eax)
lgdt (%eax)

@@ -90,14 +74,16 @@ SYM_FUNC_START(startup_32)
movl %eax, %ss

/*
- * %edx contains the address we are loaded at by the boot loader and %ebx
- * contains the address where we should move the kernel image temporarily
- * for safe in-place decompression. %ebp contains the address that the kernel
- * will be decompressed to.
+ * %edx contains the address we are loaded at by the boot loader (plus the
+ * offset to the GOT). The below code calculates %ebx to be the address where
+ * we should move the kernel image temporarily for safe in-place decompression
+ * (again, plus the offset to the GOT).
+ *
+ * %ebp is calculated to be the address that the kernel will be decompressed to.
*/

#ifdef CONFIG_RELOCATABLE
- movl %edx, %ebx
+ leal startup_32@GOTOFF(%edx), %ebx

#ifdef CONFIG_EFI_STUB
/*
@@ -108,7 +94,7 @@ SYM_FUNC_START(startup_32)
* image_offset = startup_32 - image_base
* Otherwise image_offset will be zero and has no effect on the calculations.
*/
- subl image_offset(%edx), %ebx
+ subl image_offset@GOTOFF(%edx), %ebx
#endif

movl BP_kernel_alignment(%esi), %eax
@@ -125,10 +111,10 @@ SYM_FUNC_START(startup_32)
movl %ebx, %ebp // Save the output address for later
/* Target address to relocate to for decompression */
addl BP_init_size(%esi), %ebx
- subl $_end, %ebx
+ subl $_end@GOTOFF, %ebx

/* Set up the stack */
- leal boot_stack_end(%ebx), %esp
+ leal boot_stack_end@GOTOFF(%ebx), %esp

/* Zero EFLAGS */
pushl $0
@@ -139,8 +125,8 @@ SYM_FUNC_START(startup_32)
* where decompression in place becomes safe.
*/
pushl %esi
- leal (_bss-4)(%edx), %esi
- leal (_bss-4)(%ebx), %edi
+ leal (_bss@GOTOFF-4)(%edx), %esi
+ leal (_bss@GOTOFF-4)(%ebx), %edi
movl $(_bss - startup_32), %ecx
shrl $2, %ecx
std
@@ -153,14 +139,14 @@ SYM_FUNC_START(startup_32)
* during extract_kernel below. To avoid any issues, repoint the GDTR
* to the new copy of the GDT.
*/
- leal gdt(%ebx), %eax
+ leal gdt@GOTOFF(%ebx), %eax
movl %eax, 2(%eax)
lgdt (%eax)

/*
* Jump to the relocated address.
*/
- leal .Lrelocated(%ebx), %eax
+ leal .Lrelocated@GOTOFF(%ebx), %eax
jmp *%eax
SYM_FUNC_END(startup_32)

@@ -170,7 +156,7 @@ SYM_FUNC_START_ALIAS(efi_stub_entry)
add $0x4, %esp
movl 8(%esp), %esi /* save boot_params pointer */
call efi_main
- leal startup_32(%eax), %eax
+ /* efi_main returns the possibly relocated address of startup_32 */
jmp *%eax
SYM_FUNC_END(efi32_stub_entry)
SYM_FUNC_END_ALIAS(efi_stub_entry)
@@ -183,8 +169,8 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
* Clear BSS (stack is currently empty)
*/
xorl %eax, %eax
- leal _bss(%ebx), %edi
- leal _ebss(%ebx), %ecx
+ leal _bss@GOTOFF(%ebx), %edi
+ leal _ebss@GOTOFF(%ebx), %ecx
subl %edi, %ecx
shrl $2, %ecx
rep stosl
@@ -198,9 +184,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
pushl %ebp /* output address */

pushl $z_input_len /* input_len */
- leal input_data(%ebx), %eax
+ leal input_data@GOTOFF(%ebx), %eax
pushl %eax /* input_data */
- leal boot_heap(%ebx), %eax
+ leal boot_heap@GOTOFF(%ebx), %eax
pushl %eax /* heap area */
pushl %esi /* real mode pointer */
call extract_kernel /* returns kernel location in %eax */
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index bf1ab30acc5b..11429092c224 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -43,6 +43,32 @@
.hidden _end

__HEAD
+
+/*
+ * This macro gives the relative virtual address of X, i.e. the offset of X
+ * from startup_32. This is the same as the link-time virtual address of X,
+ * since startup_32 is at 0, but defining it this way tells the
+ * assembler/linker that we do not want the actual run-time address of X. This
+ * prevents the linker from trying to create unwanted run-time relocation
+ * entries for the reference when the compressed kernel is linked as PIE.
+ *
+ * A reference X(%reg) will result in the link-time VA of X being stored with
+ * the instruction, and a run-time R_X86_64_RELATIVE relocation entry that
+ * adds the 64-bit base address where the kernel is loaded.
+ *
+ * Replacing it with (X-startup_32)(%reg) results in the offset being stored,
+ * and no run-time relocation.
+ *
+ * The macro should be used as a displacement with a base register containing
+ * the run-time address of startup_32 [i.e. rva(X)(%reg)], or as an immediate
+ * [$ rva(X)].
+ *
+ * This macro can only be used from within the .head.text section, since the
+ * expression requires startup_32 to be in the same section as the code being
+ * assembled.
+ */
+#define rva(X) ((X) - startup_32)
+
.code32
SYM_FUNC_START(startup_32)
/*
@@ -65,10 +91,10 @@ SYM_FUNC_START(startup_32)
leal (BP_scratch+4)(%esi), %esp
call 1f
1: popl %ebp
- subl $1b, %ebp
+ subl $ rva(1b), %ebp

/* Load new GDT with the 64bit segments using 32bit descriptor */
- leal gdt(%ebp), %eax
+ leal rva(gdt)(%ebp), %eax
movl %eax, 2(%eax)
lgdt (%eax)

@@ -81,7 +107,7 @@ SYM_FUNC_START(startup_32)
movl %eax, %ss

/* setup a stack and make sure cpu supports long mode. */
- leal boot_stack_end(%ebp), %esp
+ leal rva(boot_stack_end)(%ebp), %esp

call verify_cpu
testl %eax, %eax
@@ -108,7 +134,7 @@ SYM_FUNC_START(startup_32)
* image_offset = startup_32 - image_base
* Otherwise image_offset will be zero and has no effect on the calculations.
*/
- subl image_offset(%ebp), %ebx
+ subl rva(image_offset)(%ebp), %ebx
#endif

movl BP_kernel_alignment(%esi), %eax
@@ -124,7 +150,7 @@ SYM_FUNC_START(startup_32)

/* Target address to relocate to for decompression */
addl BP_init_size(%esi), %ebx
- subl $_end, %ebx
+ subl $ rva(_end), %ebx

/*
* Prepare for entering 64 bit mode
@@ -152,19 +178,19 @@ SYM_FUNC_START(startup_32)
1:

/* Initialize Page tables to 0 */
- leal pgtable(%ebx), %edi
+ leal rva(pgtable)(%ebx), %edi
xorl %eax, %eax
movl $(BOOT_INIT_PGT_SIZE/4), %ecx
rep stosl

/* Build Level 4 */
- leal pgtable + 0(%ebx), %edi
+ leal rva(pgtable + 0)(%ebx), %edi
leal 0x1007 (%edi), %eax
movl %eax, 0(%edi)
addl %edx, 4(%edi)

/* Build Level 3 */
- leal pgtable + 0x1000(%ebx), %edi
+ leal rva(pgtable + 0x1000)(%ebx), %edi
leal 0x1007(%edi), %eax
movl $4, %ecx
1: movl %eax, 0x00(%edi)
@@ -175,7 +201,7 @@ SYM_FUNC_START(startup_32)
jnz 1b

/* Build Level 2 */
- leal pgtable + 0x2000(%ebx), %edi
+ leal rva(pgtable + 0x2000)(%ebx), %edi
movl $0x00000183, %eax
movl $2048, %ecx
1: movl %eax, 0(%edi)
@@ -186,7 +212,7 @@ SYM_FUNC_START(startup_32)
jnz 1b

/* Enable the boot page tables */
- leal pgtable(%ebx), %eax
+ leal rva(pgtable)(%ebx), %eax
movl %eax, %cr3

/* Enable Long mode in EFER (Extended Feature Enable Register) */
@@ -211,14 +237,14 @@ SYM_FUNC_START(startup_32)
* We place all of the values on our mini stack so lret can
* used to perform that far jump.
*/
- leal startup_64(%ebp), %eax
+ leal rva(startup_64)(%ebp), %eax
#ifdef CONFIG_EFI_MIXED
- movl efi32_boot_args(%ebp), %edi
+ movl rva(efi32_boot_args)(%ebp), %edi
cmp $0, %edi
jz 1f
- leal efi64_stub_entry(%ebp), %eax
- movl efi32_boot_args+4(%ebp), %esi
- movl efi32_boot_args+8(%ebp), %edx // saved bootparams pointer
+ leal rva(efi64_stub_entry)(%ebp), %eax
+ movl rva(efi32_boot_args+4)(%ebp), %esi
+ movl rva(efi32_boot_args+8)(%ebp), %edx // saved bootparams pointer
cmpl $0, %edx
jnz 1f
/*
@@ -229,7 +255,7 @@ SYM_FUNC_START(startup_32)
* the correct stack alignment for entry.
*/
subl $40, %esp
- leal efi_pe_entry(%ebp), %eax
+ leal rva(efi_pe_entry)(%ebp), %eax
movl %edi, %ecx // MS calling convention
movl %esi, %edx
1:
@@ -255,18 +281,18 @@ SYM_FUNC_START(efi32_stub_entry)

call 1f
1: pop %ebp
- subl $1b, %ebp
+ subl $ rva(1b), %ebp

- movl %esi, efi32_boot_args+8(%ebp)
+ movl %esi, rva(efi32_boot_args+8)(%ebp)
SYM_INNER_LABEL(efi32_pe_stub_entry, SYM_L_LOCAL)
- movl %ecx, efi32_boot_args(%ebp)
- movl %edx, efi32_boot_args+4(%ebp)
- movb $0, efi_is64(%ebp)
+ movl %ecx, rva(efi32_boot_args)(%ebp)
+ movl %edx, rva(efi32_boot_args+4)(%ebp)
+ movb $0, rva(efi_is64)(%ebp)

/* Save firmware GDTR and code/data selectors */
- sgdtl efi32_boot_gdt(%ebp)
- movw %cs, efi32_boot_cs(%ebp)
- movw %ds, efi32_boot_ds(%ebp)
+ sgdtl rva(efi32_boot_gdt)(%ebp)
+ movw %cs, rva(efi32_boot_cs)(%ebp)
+ movw %ds, rva(efi32_boot_ds)(%ebp)

/* Disable paging */
movl %cr0, %eax
@@ -345,11 +371,11 @@ SYM_CODE_START(startup_64)

/* Target address to relocate to for decompression */
movl BP_init_size(%rsi), %ebx
- subl $_end, %ebx
+ subl $ rva(_end), %ebx
addq %rbp, %rbx

/* Set up the stack */
- leaq boot_stack_end(%rbx), %rsp
+ leaq rva(boot_stack_end)(%rbx), %rsp

/*
* At this point we are in long mode with 4-level paging enabled,
@@ -423,7 +449,7 @@ SYM_CODE_START(startup_64)
lretq
trampoline_return:
/* Restore the stack, the 32-bit trampoline uses its own stack */
- leaq boot_stack_end(%rbx), %rsp
+ leaq rva(boot_stack_end)(%rbx), %rsp

/*
* cleanup_trampoline() would restore trampoline memory.
@@ -435,7 +461,7 @@ trampoline_return:
* this function call.
*/
pushq %rsi
- leaq top_pgtable(%rbx), %rdi
+ leaq rva(top_pgtable)(%rbx), %rdi
call cleanup_trampoline
popq %rsi

@@ -449,9 +475,9 @@ trampoline_return:
*/
pushq %rsi
leaq (_bss-8)(%rip), %rsi
- leaq (_bss-8)(%rbx), %rdi
- movq $_bss /* - $startup_32 */, %rcx
- shrq $3, %rcx
+ leaq rva(_bss-8)(%rbx), %rdi
+ movl $(_bss - startup_32), %ecx
+ shrl $3, %ecx
std
rep movsq
cld
@@ -462,15 +488,15 @@ trampoline_return:
* during extract_kernel below. To avoid any issues, repoint the GDTR
* to the new copy of the GDT.
*/
- leaq gdt64(%rbx), %rax
- leaq gdt(%rbx), %rdx
+ leaq rva(gdt64)(%rbx), %rax
+ leaq rva(gdt)(%rbx), %rdx
movq %rdx, 2(%rax)
lgdt (%rax)

/*
* Jump to the relocated address.
*/
- leaq .Lrelocated(%rbx), %rax
+ leaq rva(.Lrelocated)(%rbx), %rax
jmp *%rax
SYM_CODE_END(startup_64)

@@ -482,7 +508,7 @@ SYM_FUNC_START_ALIAS(efi_stub_entry)
movq %rdx, %rbx /* save boot_params pointer */
call efi_main
movq %rbx,%rsi
- leaq startup_64(%rax), %rax
+ leaq rva(startup_64)(%rax), %rax
jmp *%rax
SYM_FUNC_END(efi64_stub_entry)
SYM_FUNC_END_ALIAS(efi_stub_entry)
@@ -645,7 +671,7 @@ SYM_DATA(efi_is64, .byte 1)
#define BS32_handle_protocol 88 // offsetof(efi_boot_services_32_t, handle_protocol)
#define LI32_image_base 32 // offsetof(efi_loaded_image_32_t, image_base)

- .text
+ __HEAD
.code32
SYM_FUNC_START(efi32_pe_entry)
/*
@@ -667,12 +693,12 @@ SYM_FUNC_START(efi32_pe_entry)

call 1f
1: pop %ebx
- subl $1b, %ebx
+ subl $ rva(1b), %ebx

/* Get the loaded image protocol pointer from the image handle */
leal -4(%ebp), %eax
pushl %eax // &loaded_image
- leal loaded_image_proto(%ebx), %eax
+ leal rva(loaded_image_proto)(%ebx), %eax
pushl %eax // pass the GUID address
pushl 8(%ebp) // pass the image handle

@@ -707,7 +733,7 @@ SYM_FUNC_START(efi32_pe_entry)
* use it before we get to the 64-bit efi_pe_entry() in C code.
*/
subl %esi, %ebx
- movl %ebx, image_offset(%ebp) // save image_offset
+ movl %ebx, rva(image_offset)(%ebp) // save image_offset
jmp efi32_pe_stub_entry

2: popl %edi // restore callee-save registers
--
2.26.2

Arvind Sankar

unread,
Jul 13, 2020, 10:38:46 PM7/13/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
The BFD linker generates runtime relocations for z_input_len and
z_output_len, even though they are absolute symbols.

This is fixed for binutils-2.35 [1]. Work around this for earlier
versions by defining two variables input_len and output_len in addition
to the symbols, and use them via position-independent references.

This eliminates the last two runtime relocations in the head code and
allows us to drop the -z noreloc-overflow flag to the linker.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=25754

Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
---
arch/x86/boot/compressed/Makefile | 8 --------
arch/x86/boot/compressed/head_32.S | 17 ++++++++---------
arch/x86/boot/compressed/head_64.S | 4 ++--
arch/x86/boot/compressed/mkpiggy.c | 6 ++++++
4 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index c829d874dcac..7cd9a2870f7c 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -51,15 +51,7 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-ifeq ($(CONFIG_X86_32),y)
KBUILD_LDFLAGS += $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
-else
-# To build 64-bit compressed kernel as PIE, we disable relocation
-# overflow check to avoid relocation overflow error with a new linker
-# command-line option, -z noreloc-overflow.
-KBUILD_LDFLAGS += $(shell $(LD) --help 2>&1 | grep -q "\-z noreloc-overflow" \
- && echo "-z noreloc-overflow -pie --no-dynamic-linker")
-endif
LDFLAGS_vmlinux := -T

hostprogs := mkpiggy
diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 8c1a4f5610f5..659fad53ca82 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -178,18 +178,17 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
/*
* Do the extraction, and jump to the new kernel..
*/
- /* push arguments for extract_kernel: */
- pushl $z_output_len /* decompressed length, end of relocs */
+ /* push arguments for extract_kernel: */

- pushl %ebp /* output address */
-
- pushl $z_input_len /* input_len */
+ pushl output_len@GOTOFF(%ebx) /* decompressed length, end of relocs */
+ pushl %ebp /* output address */
+ pushl input_len@GOTOFF(%ebx) /* input_len */
leal input_data@GOTOFF(%ebx), %eax
- pushl %eax /* input_data */
+ pushl %eax /* input_data */
leal boot_heap@GOTOFF(%ebx), %eax
- pushl %eax /* heap area */
- pushl %esi /* real mode pointer */
- call extract_kernel /* returns kernel location in %eax */
+ pushl %eax /* heap area */
+ pushl %esi /* real mode pointer */
+ call extract_kernel /* returns kernel location in %eax */
addl $24, %esp

/*
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 11429092c224..9e46729cf162 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -534,9 +534,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
movq %rsi, %rdi /* real mode address */
leaq boot_heap(%rip), %rsi /* malloc area for uncompression */
leaq input_data(%rip), %rdx /* input_data */
- movl $z_input_len, %ecx /* input_len */
+ movl input_len(%rip), %ecx /* input_len */
movq %rbp, %r8 /* output target address */
- movl $z_output_len, %r9d /* decompressed length, end of relocs */
+ movl output_len(%rip), %r9d /* decompressed length, end of relocs */
call extract_kernel /* returns kernel location in %rax */
popq %rsi

diff --git a/arch/x86/boot/compressed/mkpiggy.c b/arch/x86/boot/compressed/mkpiggy.c
index 7e01248765b2..52aa56cdbacc 100644
--- a/arch/x86/boot/compressed/mkpiggy.c
+++ b/arch/x86/boot/compressed/mkpiggy.c
@@ -60,6 +60,12 @@ int main(int argc, char *argv[])
printf(".incbin \"%s\"\n", argv[1]);
printf("input_data_end:\n");

+ printf(".section \".rodata\",\"a\",@progbits\n");
+ printf(".globl input_len\n");
+ printf("input_len:\n\t.long %lu\n", ilen);
+ printf(".globl output_len\n");
+ printf("output_len:\n\t.long %lu\n", (unsigned long)olen);
+
retval = 0;
bail:
if (f)
--
2.26.2

Arvind Sankar

unread,
Jul 13, 2020, 10:38:47 PM7/13/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Add a linker script check that there are no runtime relocations, and
remove the old one that tries to check via looking for specially-named
sections in the object files.

Drop the tests for -fPIE compiler option and -pie linker option, as they
are available in all supported gcc and binutils versions (as well as
clang and lld).

Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
---
arch/x86/boot/compressed/Makefile | 28 +++-----------------------
arch/x86/boot/compressed/vmlinux.lds.S | 8 ++++++++
2 files changed, 11 insertions(+), 25 deletions(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 7cd9a2870f7c..cd286bb90423 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -29,7 +29,7 @@ targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma \
vmlinux.bin.xz vmlinux.bin.lzo vmlinux.bin.lz4

KBUILD_CFLAGS := -m$(BITS) -O2
-KBUILD_CFLAGS += -fno-strict-aliasing $(call cc-option, -fPIE, -fPIC)
+KBUILD_CFLAGS += -fno-strict-aliasing -fPIE
KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
cflags-$(CONFIG_X86_32) := -march=i386
cflags-$(CONFIG_X86_64) := -mcmodel=small
@@ -51,7 +51,7 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-KBUILD_LDFLAGS += $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
+KBUILD_LDFLAGS += -pie $(call ld-option, --no-dynamic-linker)
LDFLAGS_vmlinux := -T

hostprogs := mkpiggy
@@ -86,30 +86,8 @@ vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
vmlinux-objs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o

-# The compressed kernel is built with -fPIC/-fPIE so that a boot loader
-# can place it anywhere in memory and it will still run. However, since
-# it is executed as-is without any ELF relocation processing performed
-# (and has already had all relocation sections stripped from the binary),
-# none of the code can use data relocations (e.g. static assignments of
-# pointer values), since they will be meaningless at runtime. This check
-# will refuse to link the vmlinux if any of these relocations are found.
-quiet_cmd_check_data_rel = DATAREL $@
-define cmd_check_data_rel
- for obj in $(filter %.o,$^); do \
- $(READELF) -S $$obj | grep -qF .rel.local && { \
- echo "error: $$obj has data relocations!" >&2; \
- exit 1; \
- } || true; \
- done
-endef
-
-# We need to run two commands under "if_changed", so merge them into a
-# single invocation.
-quiet_cmd_check-and-link-vmlinux = LD $@
- cmd_check-and-link-vmlinux = $(cmd_check_data_rel); $(cmd_ld)
-
$(obj)/vmlinux: $(vmlinux-objs-y) FORCE
- $(call if_changed,check-and-link-vmlinux)
+ $(call if_changed,ld)

OBJCOPYFLAGS_vmlinux.bin := -R .comment -S
$(obj)/vmlinux.bin: vmlinux FORCE
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index a4a4a59a2628..a78510046eec 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -42,6 +42,12 @@ SECTIONS
*(.rodata.*)
_erodata = . ;
}
+ .rel.dyn : {
+ *(.rel.*)
+ }
+ .rela.dyn : {
+ *(.rela.*)
+ }
.got : {
*(.got)
}
@@ -85,3 +91,5 @@ ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT en
#else
ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
#endif
+
+ASSERT(SIZEOF(.rel.dyn) == 0 && SIZEOF(.rela.dyn) == 0, "Unexpected runtime relocations detected!")
--
2.26.2

Sedat Dilek

unread,
Jul 14, 2020, 5:21:09 AM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Mon, Jun 29, 2020 at 4:09 PM Arvind Sankar <nive...@alum.mit.edu> wrote:
>
> From: Ard Biesheuvel <ar...@kernel.org>
>
> Eliminate all GOT entries in the decompressor binary, by forcing hidden
> visibility for all symbol references, which informs the compiler that
> such references will be resolved at link time without the need for
> allocating GOT entries.
>
> To ensure that no GOT entries will creep back in, add an assertion to
> the decompressor linker script that will fire if the .got section has
> a non-zero size.
>
> [Arvind: fixup -include hidden.h to -include $(srctree)/$(src)/hidden.h]
>

Thanks for your v3 patchset.

I tried your initial patchset and informed you about the <hidden.h>
include file handling.
Dropped your patchset against Linux v5.7 as I got no (satisfying) replies.
For me this one is missing a Reported-by of mine.

As I want to test the whole v3 series, I will report later.

- Sedat -

> Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
> Acked-by: Arvind Sankar <nive...@alum.mit.edu>
> Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
> From: Ard Biesheuvel <ar...@kernel.org>
> Link: https://lore.kernel.org/r/2020052312002...@kernel.org
> ---
> arch/x86/boot/compressed/Makefile | 1 +
> arch/x86/boot/compressed/hidden.h | 19 +++++++++++++++++++
> arch/x86/boot/compressed/vmlinux.lds.S | 1 +
> 3 files changed, 21 insertions(+)
> create mode 100644 arch/x86/boot/compressed/hidden.h
>
> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
> index 7619742f91c9..b01c8aed0f23 100644
> --- a/arch/x86/boot/compressed/Makefile
> +++ b/arch/x86/boot/compressed/Makefile
> @@ -42,6 +42,7 @@ KBUILD_CFLAGS += $(call cc-disable-warning, gnu)
> KBUILD_CFLAGS += -Wno-pointer-sign
> KBUILD_CFLAGS += $(call cc-option,-fmacro-prefix-map=$(srctree)/=)
> KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
> +KBUILD_CFLAGS += -include $(srctree)/$(src)/hidden.h
>
> KBUILD_AFLAGS := $(KBUILD_CFLAGS) -D__ASSEMBLY__
> GCOV_PROFILE := n
> diff --git a/arch/x86/boot/compressed/hidden.h b/arch/x86/boot/compressed/hidden.h
> new file mode 100644
> index 000000000000..49a17b6b5962
> --- /dev/null
> +++ b/arch/x86/boot/compressed/hidden.h
> @@ -0,0 +1,19 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * When building position independent code with GCC using the -fPIC option,
> + * (or even the -fPIE one on older versions), it will assume that we are
> + * building a dynamic object (either a shared library or an executable) that
> + * may have symbol references that can only be resolved at load time. For a
> + * variety of reasons (ELF symbol preemption, the CoW footprint of the section
> + * that is modified by the loader), this results in all references to symbols
> + * with external linkage to go via entries in the Global Offset Table (GOT),
> + * which carries absolute addresses which need to be fixed up when the
> + * executable image is loaded at an offset which is different from its link
> + * time offset.
> + *
> + * Fortunately, there is a way to inform the compiler that such symbol
> + * references will be satisfied at link time rather than at load time, by
> + * giving them 'hidden' visibility.
> + */
> +
> +#pragma GCC visibility push(hidden)
> diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
> index b17d218ccdf9..4bcc943842ab 100644
> --- a/arch/x86/boot/compressed/vmlinux.lds.S
> +++ b/arch/x86/boot/compressed/vmlinux.lds.S
> @@ -81,6 +81,7 @@ SECTIONS
> DISCARDS
> }
>
> +ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")
> #ifdef CONFIG_X86_64
> ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
> #else
> --
> 2.26.2
>

Ard Biesheuvel

unread,
Jul 14, 2020, 5:47:50 AM7/14/20
to Sedat Dilek, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
On Tue, 14 Jul 2020 at 12:21, Sedat Dilek <sedat...@gmail.com> wrote:
>
> On Mon, Jun 29, 2020 at 4:09 PM Arvind Sankar <nive...@alum.mit.edu> wrote:
> >
> > From: Ard Biesheuvel <ar...@kernel.org>
> >
> > Eliminate all GOT entries in the decompressor binary, by forcing hidden
> > visibility for all symbol references, which informs the compiler that
> > such references will be resolved at link time without the need for
> > allocating GOT entries.
> >
> > To ensure that no GOT entries will creep back in, add an assertion to
> > the decompressor linker script that will fire if the .got section has
> > a non-zero size.
> >
> > [Arvind: fixup -include hidden.h to -include $(srctree)/$(src)/hidden.h]
> >
>
> Thanks for your v3 patchset.
>
> I tried your initial patchset and informed you about the <hidden.h>
> include file handling.
> Dropped your patchset against Linux v5.7 as I got no (satisfying) replies.

Dropped from where? This series should be taken through the -tip tree.

> For me this one is missing a Reported-by of mine.
>

We don't usually add reported-by lines for issues that were resolved
before the series was merged, given that the reported issue never
existed in mainline to begin with.

> As I want to test the whole v3 series, I will report later.
>

A tested-by is always appreciated.

Sedat Dilek

unread,
Jul 14, 2020, 9:16:02 AM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
How to test this series without building a full new kernel?

make $make_opts vmlinux

- Sedat -

Sedat Dilek

unread,
Jul 14, 2020, 9:21:04 AM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
> Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
> Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
> Reviewed-by: Fangrui Song <mas...@google.com>

Just a small nit:
You use "runtime" in another patch's patch subject-line, here "run-time"?

- Sedat -

> ---
> arch/x86/boot/compressed/head_32.S | 64 +++++++-----------
> arch/x86/boot/compressed/head_64.S | 104 ++++++++++++++++++-----------
> 2 files changed, 90 insertions(+), 78 deletions(-)
>
> diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
> index 39f0bb43218f..8c1a4f5610f5 100644
> --- a/arch/x86/boot/compressed/head_32.S
> +++ b/arch/x86/boot/compressed/head_32.S
> pushl %ebp /* output address */
>
> pushl $z_input_len /* input_len */
> - leal input_data(%ebx), %eax
> + leal input_data@GOTOFF(%ebx), %eax
> pushl %eax /* input_data */
> - leal boot_heap(%ebx), %eax
> + leal boot_heap@GOTOFF(%ebx), %eax
> pushl %eax /* heap area */
> pushl %esi /* real mode pointer */
> call extract_kernel /* returns kernel location in %eax */
> diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> index bf1ab30acc5b..11429092c224 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S

Arvind Sankar

unread,
Jul 14, 2020, 10:15:53 AM7/14/20
to Sedat Dilek, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Tue, Jul 14, 2020 at 03:15:49PM +0200, Sedat Dilek wrote:
>
> How to test this series without building a full new kernel?
>
> make $make_opts vmlinux
>
> - Sedat -
>

Not sure I understood the question: you do have to build a full new
kernel with this patch series to test it. Since the changes are to the
compressed kernel, you would need to make bzImage, which is the end
product of the full kernel build.

Thanks.

Sedat Dilek

unread,
Jul 14, 2020, 2:13:26 PM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Thanks for the informations.

make bzImage

...was the last I tried - gasped 15mins on this - after gasping 30mins on

make vmlinux

I did a full new build...

...and it fails with ld.lld-11 as linker:

ld.lld-11 -m elf_x86_64 -pie --no-dynamic-linker -r -o
arch/x86/boot/compressed/.tmp_misc.o arch/x86/boot/compressed/misc.o
-T arch/x86/boot/compressed/.tmp_misc.ver; mv -f
arch/x86/boot/compressed/.tmp_misc.o arch/x86/boot/compressed/misc.o;
rm -f arch/x86/boot/compressed/.tmp_misc.ver; fi
*** ld.lld-11: error: -r and -pie may not be used together ***
make[5]: *** [scripts/Makefile.build:281:
arch/x86/boot/compressed/misc.o] Error 1

It's annoying to fail on the last minutes of a build.
Sorry for being very honest.

- Sedat -

Sedat Dilek

unread,
Jul 14, 2020, 2:30:26 PM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
> I did a full new build...
>
> ...and it fails with ld.lld-11 as linker:
>
> ld.lld-11 -m elf_x86_64 -pie --no-dynamic-linker -r -o
> arch/x86/boot/compressed/.tmp_misc.o arch/x86/boot/compressed/misc.o
> -T arch/x86/boot/compressed/.tmp_misc.ver; mv -f
> arch/x86/boot/compressed/.tmp_misc.o arch/x86/boot/compressed/misc.o;
> rm -f arch/x86/boot/compressed/.tmp_misc.ver; fi
> *** ld.lld-11: error: -r and -pie may not be used together ***
> make[5]: *** [scripts/Makefile.build:281:
> arch/x86/boot/compressed/misc.o] Error 1
>
> It's annoying to fail on the last minutes of a build.
> Sorry for being very honest.
>

I applied this diff...

$ git diff arch/x86/boot/compressed/Makefile
diff --git a/arch/x86/boot/compressed/Makefile
b/arch/x86/boot/compressed/Makefile
index 789d5d14d8b0..9ba52a656838 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -51,7 +51,10 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
+# LLD linker does not allow -r and -pie options to be used together.
+ifndef CONFIG_LD_IS_LLD
KBUILD_LDFLAGS += -pie $(call ld-option, --no-dynamic-linker)
+endif
LDFLAGS_vmlinux := -T

hostprogs := mkpiggy

...and was able to build, assemble, link arch/x86/boot/compressed/*.

- Sedat -

Sedat Dilek

unread,
Jul 14, 2020, 2:33:32 PM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
I checked my last succesfull build without your patchset:

$ grep no-dynamic-linker build-log_5.8.0-rc5-1-amd64-llvm11-ias.txt
[ EMPTY ]

- Sedat -

Sedat Dilek

unread,
Jul 14, 2020, 3:21:22 PM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
I was able to boot on bare metal.

Feel free to add my...

Tested-by: Sedat Dilek <sedat...@gmail.com>

...when you restore "old" behaviour when CONFIG_LD_IS_LLD=y (apply or
fold-in my diff with comment)...

...and use one "runtime" in your subject-line:

$ git log --oneline
v5.8-rc5..for-5.8/x86-boot-compressed-remove-runtime-relocations-nivedita-v4
| egrep 'runtime|run-time'
9547f8f08689 x86/boot: Check that there are no runtime relocations
ede02a307b30 x86/boot: Remove runtime relocations from head_{32,64}.S
525a67ac7ea9 x86/boot: Remove run-time relocations from .head.text code

Thanks.

- Sedat -

Arvind Sankar

unread,
Jul 14, 2020, 3:29:59 PM7/14/20
to Sedat Dilek, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Thanks for the test.

Can you share your .config? The error messages look like they're coming
from running modversions on misc.o, which is unexpected as it shouldn't
have any exported symbols, and it doesn't in my builds.

In any case, I think the right fix here would be to add -pie and
--no-dynamic-linker to LDFLAGS_vmlinux instead of KBUILD_LDFLAGS.

Sedat Dilek

unread,
Jul 14, 2020, 3:53:33 PM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Hmm, you might be right with moving to LDFLAGS_vmlinux.

Attached are my linux-config and dmesg-output.

- Sedat -
dmesg-T_5.8.0-rc5-2-amd64-llvm11-ias.txt
config-5.8.0-rc5-2-amd64-llvm11-ias

Arvind Sankar

unread,
Jul 14, 2020, 4:07:32 PM7/14/20
to Sedat Dilek, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Tue, Jul 14, 2020 at 09:53:19PM +0200, Sedat Dilek wrote:
>
> Hmm, you might be right with moving to LDFLAGS_vmlinux.
>
> Attached are my linux-config and dmesg-output.
>
> - Sedat -

Which tree are you building against? I notice you have KERNEL_ZSTD
enabled, which hasn't been merged yet.

Are you using Nick's series [v7]?

Also from the naming -- are you using LLVM_IAS=1 with some patches to
make it work?

[v7] https://lore.kernel.org/lkml/20200708185024.276...@gmail.com/

Sedat Dilek

unread,
Jul 14, 2020, 4:08:17 PM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
We will need the "ifndef CONFIG_LD_IS_LLD" as -r and -pie cannot be
used together.
Is that the or not the fact when moving to LDFLAGS_vmlinux?

I cannot test as I modified my local Git and re-invoking my
build-script is doing a whole new build-dance.

- Sedat -

Sedat Dilek

unread,
Jul 14, 2020, 4:10:26 PM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Sorry for not telling you the full story.
Yes, I have some additional patches like Nick T. "zstd-v7" which
should IMHO not touch this area.

- Sedat -

Arvind Sankar

unread,
Jul 14, 2020, 4:14:54 PM7/14/20
to Sedat Dilek, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
That series does touch boot/compressed, since the point is to add
support for zstd-compressed kernels. I'll need to test it out, the
__DISABLE_EXPORTS he's now using must not be working for some reason.

Sedat Dilek

unread,
Jul 14, 2020, 4:17:46 PM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
I have seen this in the between zstd-v6 and zstd-v7 - diff attached.

- Sedat -
zstd-initramfs-terrelln-v6-VS-zstd-initramfs-terrelln-v7.diff

Arvind Sankar

unread,
Jul 14, 2020, 4:21:04 PM7/14/20
to Sedat Dilek, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Tue, Jul 14, 2020 at 10:08:04PM +0200, Sedat Dilek wrote:
> > >
> > > In any case, I think the right fix here would be to add -pie and
> > > --no-dynamic-linker to LDFLAGS_vmlinux instead of KBUILD_LDFLAGS.
> >
> > Hmm, you might be right with moving to LDFLAGS_vmlinux.
> >
>
> We will need the "ifndef CONFIG_LD_IS_LLD" as -r and -pie cannot be
> used together.
> Is that the or not the fact when moving to LDFLAGS_vmlinux?

LDFLAGS_vmlinux will only be used to link boot/compressed/vmlinux,
whereas KBUILD_LDFLAGS is used for all linker invocations, and in
particular the little link to do the modversions stuff ends up using it.

So once we move -pie --no-dynamic-linker to the more correct
LDFLAGS_vmlinux and/or stop modversions running, we'll be fine. Being
able to use -pie with lld is one of the goals of this series.

Sedat Dilek

unread,
Jul 14, 2020, 4:24:20 PM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Tue, Jul 14, 2020 at 10:21 PM Arvind Sankar <nive...@alum.mit.edu> wrote:
>
> On Tue, Jul 14, 2020 at 10:08:04PM +0200, Sedat Dilek wrote:
> > > >
> > > > In any case, I think the right fix here would be to add -pie and
> > > > --no-dynamic-linker to LDFLAGS_vmlinux instead of KBUILD_LDFLAGS.
> > >
> > > Hmm, you might be right with moving to LDFLAGS_vmlinux.
> > >
> >
> > We will need the "ifndef CONFIG_LD_IS_LLD" as -r and -pie cannot be
> > used together.
> > Is that the or not the fact when moving to LDFLAGS_vmlinux?
>
> LDFLAGS_vmlinux will only be used to link boot/compressed/vmlinux,
> whereas KBUILD_LDFLAGS is used for all linker invocations, and in
> particular the little link to do the modversions stuff ends up using it.
>
> So once we move -pie --no-dynamic-linker to the more correct
> LDFLAGS_vmlinux and/or stop modversions running, we'll be fine. Being
> able to use -pie with lld is one of the goals of this series.
>

OK, I am doing a new full kernel build with:

$ git diff arch/x86/boot/compressed/Makefile
diff --git a/arch/x86/boot/compressed/Makefile
b/arch/x86/boot/compressed/Makefile
index 789d5d14d8b0..056a738e47c6 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -51,7 +51,7 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-KBUILD_LDFLAGS += -pie $(call ld-option, --no-dynamic-linker)
+LDFLAGS_vmlinux += -pie $(call ld-option, --no-dynamic-linker)
LDFLAGS_vmlinux := -T

hostprogs := mkpiggy

- Sedat -

Sedat Dilek

unread,
Jul 14, 2020, 4:27:40 PM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Not my day - today.

$ git diff arch/x86/boot/compressed/Makefile
diff --git a/arch/x86/boot/compressed/Makefile
b/arch/x86/boot/compressed/Makefile
index 789d5d14d8b0..9784ed37a5d7 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -51,8 +51,8 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-KBUILD_LDFLAGS += -pie $(call ld-option, --no-dynamic-linker)
LDFLAGS_vmlinux := -T
+LDFLAGS_vmlinux += -pie $(call ld-option, --no-dynamic-linker)

hostprogs := mkpiggy
HOST_EXTRACFLAGS += -I$(srctree)/tools/include

- Sedat -

Arvind Sankar

unread,
Jul 14, 2020, 4:33:30 PM7/14/20
to Sedat Dilek, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
That needs to be in just one line (or with the += on the -T line)
LDFLAGS_vmlinux := -pie $(call ld-option, --no-dynamic-linker) -T

Your second assignment will blow away the -pie.. flags.

Arvind Sankar

unread,
Jul 14, 2020, 4:35:40 PM7/14/20
to Sedat Dilek, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Nope -- -T needs to be last, since it's (trickily) put together with the
first prerequisite $(obj)/vmlinux.lds.

Sedat Dilek

unread,
Jul 14, 2020, 4:43:24 PM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Good I read this.

Checked the previous build-log:

ld.lld-11 -m elf_x86_64 -T arch/x86/boot/compressed/vmlinux.lds
arch/x86/boot/compressed/kernel_info.o
arch/x86/boot/compressed/head_64.o arch/x86/boot/compressed/misc.o
arch/x86/boot/compressed/string.o arch/x86/boot/compressed/cmdline.o
arch/x86/boot/compressed/error.o arch/x86/boot/compressed/piggy.o
arch/x86/boot/compressed/cpuflags.o
arch/x86/boot/compressed/early_serial_console.o
arch/x86/boot/compressed/kaslr.o arch/x86/boot/compressed/kaslr_64.o
arch/x86/boot/compressed/mem_encrypt.o
arch/x86/boot/compressed/pgtable_64.o arch/x86/boot/compressed/acpi.o
drivers/firmware/efi/libstub/lib.a
arch/x86/boot/compressed/efi_thunk_64.o -o
arch/x86/boot/compressed/vmlinux

So there is no -r option in this line.

If we move to LDFLAGS_vmlinux we can drop the "call ld-option" as both
linker GNU/ld.bfd and LLVM/lld.ld support this?

Do we need to adjust the comments?
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader

That's the minimal change needed:

$ git diff arch/x86/boot/compressed/Makefile
diff --git a/arch/x86/boot/compressed/Makefile
b/arch/x86/boot/compressed/Makefile
index 789d5d14d8b0..d0aafcd8cf6c 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -51,8 +51,8 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-KBUILD_LDFLAGS += -pie $(call ld-option, --no-dynamic-linker)
-LDFLAGS_vmlinux := -T
+LDFLAGS_vmlinux := -pie $(call ld-option, --no-dynamic-linker)
+LDFLAGS_vmlinux += -T

Arvind Sankar

unread,
Jul 14, 2020, 5:07:49 PM7/14/20
to Sedat Dilek, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Tue, Jul 14, 2020 at 10:43:11PM +0200, Sedat Dilek wrote:
> If we move to LDFLAGS_vmlinux we can drop the "call ld-option" as both
> linker GNU/ld.bfd and LLVM/lld.ld support this?

No, because ld.bfd only started supporting it from v2.26, and the kernel
aims to be buildable with v2.23.

>
> Do we need to adjust the comments?
> # Compressed kernel should be built as PIE since it may be loaded at any
> # address by the bootloader
>

It looks fine, no?

Sedat Dilek

unread,
Jul 14, 2020, 5:25:14 PM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org


Arvind Sankar <nive...@alum.mit.edu> schrieb am Di., 14. Juli 2020, 23:07:
On Tue, Jul 14, 2020 at 10:43:11PM +0200, Sedat Dilek wrote:
> If we move to LDFLAGS_vmlinux we can drop the "call ld-option" as both
> linker GNU/ld.bfd and LLVM/lld.ld support this?

No, because ld.bfd only started supporting it from v2.26, and the kernel
aims to be buildable with v2.23.

OK.
I cannot say much to ld.bfd and ld.lld with which version they support this.

>
> Do we need to adjust the comments?
>  # Compressed kernel should be built as PIE since it may be loaded at any
>  # address by the bootloader
>

It looks fine, no?

"My head is empty."
K. Anan

Today is not my day - no good ideas.

- Sedat -

Arvind Sankar

unread,
Jul 14, 2020, 8:41:36 PM7/14/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
The compressed kernel currently contains bogus run-time relocations in
the startup code in head_{32,64}.S, which are generated by the linker,
but must not actually be processed at run-time.

This generates warnings when linking with the BFD linker, and errors
with LLD, which defaults to erroring on run-time relocations in read-only
generates run-time relocations for absolute symbols. We use z_input_len
and z_output_len, defined in the generated piggy.S file, as symbols
whose absolute "addresses" are actually the size of the compressed
payload and the size of the decompressed kernel image respectively. LLD
does not generate relocations for these two symbols, but the BFD linker
does, prior to the upcoming 2.35. To get around this, piggy.S is
extended to also define two u32 variables (in .rodata) with the lengths,
and the head code is modified to use those instead of the symbol
addresses.

An alternative way to handle z_input_len/z_output_len would be to just
include piggy.S in head_{32,64}.S instead of as a separate object file,
since the GNU assembler doesn't generate relocations for symbols set to
constants.

The last patch adds a check in the linker script to ensure that no
run-time relocations get reintroduced.

[0] https://lore.kernel.org/lkml/20200504230309.237...@google.com/

Changes from v4:
- Move -pie --no-dynamic-linker from KBUILD_LDFLAGS to LDFLAGS_vmlinux
Sedat: I'm not clear on whether you tested with the final LDFLAGS,
could you confirm: i.e. if you tested with -pie passed to LLD?
- Replace runtime -> run-time to be consistent in wording

Changes from v3:
- Move hidden.h to include/linux so the EFI stub and the compressed
kernel can share the same file

Changes from v2:
- Incorporate Ard's patches for eliminating GOT references into this
series
- Rebase on v5.8-rc3

v2: https://lore.kernel.org/lkml/20200525225918.1...@alum.mit.edu/

Changes from v1:
- Add .text.* to setup.ld instead of just .text.startup
- Rename the la() macro introduced in the second patch for 64-bit to
rva(), and rework the explanatory comment.
- In the last patch, check both .rel.dyn and .rela.dyn, instead of just
one per arch.

Ard Biesheuvel (3):
x86/boot/compressed: Move .got.plt entries out of the .got section
x86/boot/compressed: Force hidden visibility for all symbol references
x86/boot/compressed: Get rid of GOT fixup code

Arvind Sankar (4):
x86/boot: Add .text.* to setup.ld
x86/boot: Remove run-time relocations from .head.text code
x86/boot: Remove run-time relocations from head_{32,64}.S
x86/boot: Check that there are no run-time relocations

arch/x86/boot/compressed/Makefile | 39 +-----
arch/x86/boot/compressed/head_32.S | 99 +++++----------
arch/x86/boot/compressed/head_64.S | 165 ++++++++++---------------
arch/x86/boot/compressed/mkpiggy.c | 6 +
arch/x86/boot/compressed/vmlinux.lds.S | 24 +++-
arch/x86/boot/setup.ld | 2 +-
drivers/firmware/efi/libstub/Makefile | 2 +-
drivers/firmware/efi/libstub/hidden.h | 6 -
include/linux/hidden.h | 19 +++
9 files changed, 153 insertions(+), 209 deletions(-)
delete mode 100644 drivers/firmware/efi/libstub/hidden.h
create mode 100644 include/linux/hidden.h


base-commit: e9919e11e219eaa5e8041b7b1a196839143e9125
--
2.26.2

Arvind Sankar

unread,
Jul 14, 2020, 8:41:37 PM7/14/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

The .got.plt section contains the part of the GOT which is used by PLT
entries, and which gets updated lazily by the dynamic loader when
function calls are dispatched through those PLT entries.

On fully linked binaries such as the kernel proper or the decompressor,
this never happens, and so in practice, the .got.plt section consists
only of the first 3 magic entries that are meant to point at the _DYNAMIC
section and at the fixup routine in the loader. However, since we don't
use a dynamic loader, those entries are never populated or used.

This means that treating those entries like ordinary GOT entries, and
updating their values based on the actual placement of the executable in
memory is completely pointless, and we can just ignore the .got.plt
section entirely, provided that it has no additional entries beyond
the first 3 ones.

So add an assertion in the linker script to ensure that this assumption
holds, and move the contents out of the [_got, _egot) memory range that
is modified by the GOT fixup routines.

While at it, drop the KEEP(), since it has no effect on the contents
of output sections that are created by the linker itself.

Reviewed-by: Kees Cook <kees...@chromium.org>
Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
From: Ard Biesheuvel <ar...@kernel.org>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
---
arch/x86/boot/compressed/vmlinux.lds.S | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 8f1025d1f681..b17d218ccdf9 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -44,10 +44,13 @@ SECTIONS
}
.got : {
_got = .;
- KEEP(*(.got.plt))
KEEP(*(.got))
_egot = .;
}
+ .got.plt : {
+ *(.got.plt)
+ }
+
.data : {
_data = . ;
*(.data)
@@ -77,3 +80,9 @@ SECTIONS

DISCARDS
}
+
+#ifdef CONFIG_X86_64
+ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
+#else
+ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
+#endif
--
2.26.2

Arvind Sankar

unread,
Jul 14, 2020, 8:41:38 PM7/14/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

Eliminate all GOT entries in the decompressor binary, by forcing hidden
visibility for all symbol references, which informs the compiler that
such references will be resolved at link time without the need for
allocating GOT entries.

To ensure that no GOT entries will creep back in, add an assertion to
the decompressor linker script that will fire if the .got section has
a non-zero size.

[Arvind: move hidden.h to include/linux instead of making a copy]

Reviewed-by: Kees Cook <kees...@chromium.org>
Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
From: Ard Biesheuvel <ar...@kernel.org>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
---
arch/x86/boot/compressed/Makefile | 1 +
arch/x86/boot/compressed/vmlinux.lds.S | 1 +
drivers/firmware/efi/libstub/Makefile | 2 +-
drivers/firmware/efi/libstub/hidden.h | 6 ------
include/linux/hidden.h | 19 +++++++++++++++++++
5 files changed, 22 insertions(+), 7 deletions(-)
delete mode 100644 drivers/firmware/efi/libstub/hidden.h
create mode 100644 include/linux/hidden.h

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 7619742f91c9..c829d874dcac 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -42,6 +42,7 @@ KBUILD_CFLAGS += $(call cc-disable-warning, gnu)
KBUILD_CFLAGS += -Wno-pointer-sign
KBUILD_CFLAGS += $(call cc-option,-fmacro-prefix-map=$(srctree)/=)
KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
+KBUILD_CFLAGS += -include $(srctree)/include/linux/hidden.h

KBUILD_AFLAGS := $(KBUILD_CFLAGS) -D__ASSEMBLY__
GCOV_PROFILE := n
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index b17d218ccdf9..4bcc943842ab 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -81,6 +81,7 @@ SECTIONS
DISCARDS
}

+ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")
#ifdef CONFIG_X86_64
ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
#else
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 4cce372edaf4..609157a40493 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -27,7 +27,7 @@ cflags-$(CONFIG_ARM) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt

KBUILD_CFLAGS := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \
- -include $(srctree)/drivers/firmware/efi/libstub/hidden.h \
+ -include $(srctree)/include/linux/hidden.h \
-D__NO_FORTIFY \
$(call cc-option,-ffreestanding) \
$(call cc-option,-fno-stack-protector) \
diff --git a/drivers/firmware/efi/libstub/hidden.h b/drivers/firmware/efi/libstub/hidden.h
deleted file mode 100644
index 3493b041f419..000000000000
--- a/drivers/firmware/efi/libstub/hidden.h
+++ /dev/null
@@ -1,6 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * To prevent the compiler from emitting GOT-indirected (and thus absolute)
- * references to any global symbols, override their visibility as 'hidden'
- */
-#pragma GCC visibility push(hidden)
diff --git a/include/linux/hidden.h b/include/linux/hidden.h
new file mode 100644
index 000000000000..49a17b6b5962
--- /dev/null
+++ b/include/linux/hidden.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * When building position independent code with GCC using the -fPIC option,
+ * (or even the -fPIE one on older versions), it will assume that we are
+ * building a dynamic object (either a shared library or an executable) that
+ * may have symbol references that can only be resolved at load time. For a
+ * variety of reasons (ELF symbol preemption, the CoW footprint of the section
+ * that is modified by the loader), this results in all references to symbols
+ * with external linkage to go via entries in the Global Offset Table (GOT),
+ * which carries absolute addresses which need to be fixed up when the
+ * executable image is loaded at an offset which is different from its link
+ * time offset.
+ *
+ * Fortunately, there is a way to inform the compiler that such symbol
+ * references will be satisfied at link time rather than at load time, by
+ * giving them 'hidden' visibility.
+ */
+
+#pragma GCC visibility push(hidden)
--
2.26.2

Arvind Sankar

unread,
Jul 14, 2020, 8:41:39 PM7/14/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

In a previous patch, we have eliminated GOT entries from the decompressor
binary and added an assertion that the .got section is empty. This means
that the GOT fixup routines that exist in both the 32-bit and 64-bit
startup routines have become dead code, and can be removed.

While at it, drop the KEEP() from the linker script, as it has no effect
on the contents of output sections that are created by the linker itself.

Reviewed-by: Kees Cook <kees...@chromium.org>
Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
From: Ard Biesheuvel <ar...@kernel.org>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
---
arch/x86/boot/compressed/head_32.S | 24 ++---------
arch/x86/boot/compressed/head_64.S | 57 --------------------------
arch/x86/boot/compressed/vmlinux.lds.S | 4 +-
3 files changed, 5 insertions(+), 80 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 03557f2174bf..39f0bb43218f 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -49,16 +49,13 @@
* Position Independent Executable (PIE) so that linker won't optimize
* R_386_GOT32X relocation to its fixed symbol address. Older
* linkers generate R_386_32 relocations against locally defined symbols,
- * _bss, _ebss, _got, _egot and _end, in PIE. It isn't wrong, just less
- * optimal than R_386_RELATIVE. But the x86 kernel fails to properly handle
- * R_386_32 relocations when relocating the kernel. To generate
- * R_386_RELATIVE relocations, we mark _bss, _ebss, _got, _egot and _end as
- * hidden:
+ * _bss, _ebss and _end, in PIE. It isn't wrong, just less optimal than
+ * R_386_RELATIVE. But the x86 kernel fails to properly handle R_386_32
+ * relocations when relocating the kernel. To generate R_386_RELATIVE
+ * relocations, we mark _bss, _ebss and _end as hidden:
*/
.hidden _bss
.hidden _ebss
- .hidden _got
- .hidden _egot
.hidden _end

__HEAD
@@ -192,19 +189,6 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
shrl $2, %ecx
rep stosl

-/*
- * Adjust our own GOT
- */
- leal _got(%ebx), %edx
- leal _egot(%ebx), %ecx
-1:
- cmpl %ecx, %edx
- jae 2f
- addl %ebx, (%edx)
- addl $4, %edx
- jmp 1b
-2:
-
/*
* Do the extraction, and jump to the new kernel..
*/
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 97d37f0a34f5..bf1ab30acc5b 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -40,8 +40,6 @@
*/
.hidden _bss
.hidden _ebss
- .hidden _got
- .hidden _egot
.hidden _end

__HEAD
@@ -353,25 +351,6 @@ SYM_CODE_START(startup_64)
/* Set up the stack */
leaq boot_stack_end(%rbx), %rsp

- /*
- * paging_prepare() and cleanup_trampoline() below can have GOT
- * references. Adjust the table with address we are running at.
- *
- * Zero RAX for adjust_got: the GOT was not adjusted before;
- * there's no adjustment to undo.
- */
- xorq %rax, %rax
-
- /*
- * Calculate the address the binary is loaded at and use it as
- * a GOT adjustment.
- */
- call 1f
-1: popq %rdi
- subq $1b, %rdi
-
- call .Ladjust_got
-
/*
* At this point we are in long mode with 4-level paging enabled,
* but we might want to enable 5-level paging or vice versa.
@@ -464,21 +443,6 @@ trampoline_return:
pushq $0
popfq

- /*
- * Previously we've adjusted the GOT with address the binary was
- * loaded at. Now we need to re-adjust for relocation address.
- *
- * Calculate the address the binary is loaded at, so that we can
- * undo the previous GOT adjustment.
- */
- call 1f
-1: popq %rax
- subq $1b, %rax
-
- /* The new adjustment is the relocation address */
- movq %rbx, %rdi
- call .Ladjust_got
-
/*
* Copy the compressed kernel to the end of our buffer
* where decompression in place becomes safe.
@@ -556,27 +520,6 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
jmp *%rax
SYM_FUNC_END(.Lrelocated)

-/*
- * Adjust the global offset table
- *
- * RAX is the previous adjustment of the table to undo (use 0 if it's the
- * first time we touch GOT).
- * RDI is the new adjustment to apply.
- */
-.Ladjust_got:
- /* Walk through the GOT adding the address to the entries */
- leaq _got(%rip), %rdx
- leaq _egot(%rip), %rcx
-1:
- cmpq %rcx, %rdx
- jae 2f
- subq %rax, (%rdx) /* Undo previous adjustment */
- addq %rdi, (%rdx) /* Apply the new adjustment */
- addq $8, %rdx
- jmp 1b
-2:
- ret
-
.code32
/*
* This is the 32-bit trampoline that will be copied over to low memory.
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 4bcc943842ab..a4a4a59a2628 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -43,9 +43,7 @@ SECTIONS
_erodata = . ;
}
.got : {
- _got = .;
- KEEP(*(.got))
- _egot = .;
+ *(.got)
}
.got.plt : {
*(.got.plt)
--
2.26.2

Arvind Sankar

unread,
Jul 14, 2020, 8:41:39 PM7/14/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
gcc puts the main function into .text.startup when compiled with -Os (or
-O2). This results in arch/x86/boot/main.c having a .text.startup
section which is currently not included explicitly in the linker script
setup.ld in the same directory.

The BFD linker places this orphan section immediately after .text, so
this still works. However, LLD git, since [1], is choosing to place it
immediately after the .bstext section instead (this is the first code
section). This plays havoc with the section layout that setup.elf
requires to create the setup header, for eg on 64-bit:

LD arch/x86/boot/setup.elf
ld.lld: error: section .text.startup file range overlaps with .header
>>> .text.startup range is [0x200040, 0x2001FE]
>>> .header range is [0x2001EF, 0x20026B]

ld.lld: error: section .header file range overlaps with .bsdata
>>> .header range is [0x2001EF, 0x20026B]
>>> .bsdata range is [0x2001FF, 0x200398]

ld.lld: error: section .bsdata file range overlaps with .entrytext
>>> .bsdata range is [0x2001FF, 0x200398]
>>> .entrytext range is [0x20026C, 0x2002D3]

ld.lld: error: section .text.startup virtual address range overlaps
with .header
>>> .text.startup range is [0x40, 0x1FE]
>>> .header range is [0x1EF, 0x26B]

ld.lld: error: section .header virtual address range overlaps with
.bsdata
>>> .header range is [0x1EF, 0x26B]
>>> .bsdata range is [0x1FF, 0x398]

ld.lld: error: section .bsdata virtual address range overlaps with
.entrytext
>>> .bsdata range is [0x1FF, 0x398]
>>> .entrytext range is [0x26C, 0x2D3]

ld.lld: error: section .text.startup load address range overlaps with
.header
>>> .text.startup range is [0x40, 0x1FE]
>>> .header range is [0x1EF, 0x26B]

ld.lld: error: section .header load address range overlaps with
.bsdata
>>> .header range is [0x1EF, 0x26B]
>>> .bsdata range is [0x1FF, 0x398]

ld.lld: error: section .bsdata load address range overlaps with
.entrytext
>>> .bsdata range is [0x1FF, 0x398]
>>> .entrytext range is [0x26C, 0x2D3]

Add .text.* to the .text output section to fix this, and also prevent
any future surprises if the compiler decides to create other such
sections.

[1] https://reviews.llvm.org/D75225

Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
---
arch/x86/boot/setup.ld | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld
index 24c95522f231..49546c247ae2 100644
--- a/arch/x86/boot/setup.ld
+++ b/arch/x86/boot/setup.ld
@@ -20,7 +20,7 @@ SECTIONS
.initdata : { *(.initdata) }
__end_init = .;

- .text : { *(.text) }
+ .text : { *(.text .text.*) }
.text32 : { *(.text32) }

. = ALIGN(16);
--
2.26.2

Arvind Sankar

unread,
Jul 14, 2020, 8:41:41 PM7/14/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.

On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output

With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object

On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.

Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.

Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
---
arch/x86/boot/compressed/head_32.S | 64 +++++++-----------
arch/x86/boot/compressed/head_64.S | 104 ++++++++++++++++++-----------
2 files changed, 90 insertions(+), 78 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 39f0bb43218f..8c1a4f5610f5 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -33,26 +33,10 @@
#include <asm/bootparam.h>

/*
- * The 32-bit x86 assembler in binutils 2.26 will generate R_386_GOT32X
- * relocation to get the symbol address in PIC. When the compressed x86
- * kernel isn't built as PIC, the linker optimizes R_386_GOT32X
- * relocations to their fixed symbol addresses. However, when the
- * compressed x86 kernel is loaded at a different address, it leads
- * to the following load failure:
- *
- * Failed to allocate space for phdrs
- *
- * during the decompression stage.
- *
- * If the compressed x86 kernel is relocatable at run-time, it should be
- * compiled with -fPIE, instead of -fPIC, if possible and should be built as
- * Position Independent Executable (PIE) so that linker won't optimize
- * R_386_GOT32X relocation to its fixed symbol address. Older
- * linkers generate R_386_32 relocations against locally defined symbols,
- * _bss, _ebss and _end, in PIE. It isn't wrong, just less optimal than
- * R_386_RELATIVE. But the x86 kernel fails to properly handle R_386_32
- * relocations when relocating the kernel. To generate R_386_RELATIVE
- * relocations, we mark _bss, _ebss and _end as hidden:
+ * These symbols needed to be marked as .hidden to prevent the BFD linker from
+ * generating R_386_32 (rather than R_386_RELATIVE) relocations for them when
+ * the 32-bit compressed kernel is linked as PIE. This is no longer necessary,
+ * but it doesn't hurt to keep them .hidden.
*/
.hidden _bss
.hidden _ebss
@@ -74,10 +58,10 @@ SYM_FUNC_START(startup_32)
leal (BP_scratch+4)(%esi), %esp
call 1f
1: popl %edx
- subl $1b, %edx
+ addl $_GLOBAL_OFFSET_TABLE_+(.-1b), %edx

/* Load new GDT */
- leal gdt(%edx), %eax
+ leal gdt@GOTOFF(%edx), %eax
movl %eax, 2(%eax)
lgdt (%eax)

@@ -90,14 +74,16 @@ SYM_FUNC_START(startup_32)
movl %eax, %ss

/*
- * %edx contains the address we are loaded at by the boot loader and %ebx
- * contains the address where we should move the kernel image temporarily
- * for safe in-place decompression. %ebp contains the address that the kernel
- * will be decompressed to.
+ * %edx contains the address we are loaded at by the boot loader (plus the
+ * offset to the GOT). The below code calculates %ebx to be the address where
+ * we should move the kernel image temporarily for safe in-place decompression
+ * (again, plus the offset to the GOT).
+ *
+ * %ebp is calculated to be the address that the kernel will be decompressed to.
*/

#ifdef CONFIG_RELOCATABLE
- movl %edx, %ebx
+ leal startup_32@GOTOFF(%edx), %ebx

#ifdef CONFIG_EFI_STUB
/*
@@ -108,7 +94,7 @@ SYM_FUNC_START(startup_32)
* image_offset = startup_32 - image_base
* Otherwise image_offset will be zero and has no effect on the calculations.
*/
- subl image_offset(%edx), %ebx
+ subl image_offset@GOTOFF(%edx), %ebx
#endif

movl BP_kernel_alignment(%esi), %eax
@@ -125,10 +111,10 @@ SYM_FUNC_START(startup_32)
movl %ebx, %ebp // Save the output address for later
/* Target address to relocate to for decompression */
addl BP_init_size(%esi), %ebx
- subl $_end, %ebx
+ subl $_end@GOTOFF, %ebx

/* Set up the stack */
- leal boot_stack_end(%ebx), %esp
+ leal boot_stack_end@GOTOFF(%ebx), %esp

/* Zero EFLAGS */
pushl $0
@@ -139,8 +125,8 @@ SYM_FUNC_START(startup_32)
* where decompression in place becomes safe.
*/
pushl %esi
- leal (_bss-4)(%edx), %esi
- leal (_bss-4)(%ebx), %edi
+ leal (_bss@GOTOFF-4)(%edx), %esi
+ leal (_bss@GOTOFF-4)(%ebx), %edi
movl $(_bss - startup_32), %ecx
shrl $2, %ecx
std
@@ -153,14 +139,14 @@ SYM_FUNC_START(startup_32)
* during extract_kernel below. To avoid any issues, repoint the GDTR
* to the new copy of the GDT.
*/
- leal gdt(%ebx), %eax
+ leal gdt@GOTOFF(%ebx), %eax
movl %eax, 2(%eax)
lgdt (%eax)

/*
* Jump to the relocated address.
*/
- leal .Lrelocated(%ebx), %eax
+ leal .Lrelocated@GOTOFF(%ebx), %eax
jmp *%eax
SYM_FUNC_END(startup_32)

@@ -170,7 +156,7 @@ SYM_FUNC_START_ALIAS(efi_stub_entry)
add $0x4, %esp
movl 8(%esp), %esi /* save boot_params pointer */
call efi_main
- leal startup_32(%eax), %eax
+ /* efi_main returns the possibly relocated address of startup_32 */
jmp *%eax
SYM_FUNC_END(efi32_stub_entry)
SYM_FUNC_END_ALIAS(efi_stub_entry)
@@ -183,8 +169,8 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
* Clear BSS (stack is currently empty)
*/
xorl %eax, %eax
- leal _bss(%ebx), %edi
- leal _ebss(%ebx), %ecx
+ leal _bss@GOTOFF(%ebx), %edi
+ leal _ebss@GOTOFF(%ebx), %ecx
subl %edi, %ecx
shrl $2, %ecx
rep stosl
@@ -198,9 +184,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
pushl %ebp /* output address */

pushl $z_input_len /* input_len */
- leal input_data(%ebx), %eax
+ leal input_data@GOTOFF(%ebx), %eax
pushl %eax /* input_data */
- leal boot_heap(%ebx), %eax
+ leal boot_heap@GOTOFF(%ebx), %eax
pushl %eax /* heap area */
pushl %esi /* real mode pointer */
call extract_kernel /* returns kernel location in %eax */
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index bf1ab30acc5b..11429092c224 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -43,6 +43,32 @@
.hidden _end

__HEAD
+
+/*
+ * This macro gives the relative virtual address of X, i.e. the offset of X
+ * from startup_32. This is the same as the link-time virtual address of X,
+ * since startup_32 is at 0, but defining it this way tells the
+ * assembler/linker that we do not want the actual run-time address of X. This
+ * prevents the linker from trying to create unwanted run-time relocation
+ * entries for the reference when the compressed kernel is linked as PIE.
+ *
+ * A reference X(%reg) will result in the link-time VA of X being stored with
+ * the instruction, and a run-time R_X86_64_RELATIVE relocation entry that
+ * adds the 64-bit base address where the kernel is loaded.
+ *
+ * Replacing it with (X-startup_32)(%reg) results in the offset being stored,
+ * and no run-time relocation.
+ *
+ * The macro should be used as a displacement with a base register containing
+ * the run-time address of startup_32 [i.e. rva(X)(%reg)], or as an immediate
+ * [$ rva(X)].
+ *
+ * This macro can only be used from within the .head.text section, since the
+ * expression requires startup_32 to be in the same section as the code being
+ * assembled.
+ */
+#define rva(X) ((X) - startup_32)
+
.code32
SYM_FUNC_START(startup_32)
/*
@@ -65,10 +91,10 @@ SYM_FUNC_START(startup_32)
leal (BP_scratch+4)(%esi), %esp
call 1f
1: popl %ebp
- subl $1b, %ebp
+ subl $ rva(1b), %ebp

/* Load new GDT with the 64bit segments using 32bit descriptor */
- leal gdt(%ebp), %eax
+ leal rva(gdt)(%ebp), %eax
movl %eax, 2(%eax)
lgdt (%eax)

@@ -81,7 +107,7 @@ SYM_FUNC_START(startup_32)
movl %eax, %ss

/* setup a stack and make sure cpu supports long mode. */
- leal boot_stack_end(%ebp), %esp
+ leal rva(boot_stack_end)(%ebp), %esp

call verify_cpu
testl %eax, %eax
@@ -108,7 +134,7 @@ SYM_FUNC_START(startup_32)
* image_offset = startup_32 - image_base
* Otherwise image_offset will be zero and has no effect on the calculations.
*/
- subl image_offset(%ebp), %ebx
+ subl rva(image_offset)(%ebp), %ebx
#endif

movl BP_kernel_alignment(%esi), %eax
@@ -124,7 +150,7 @@ SYM_FUNC_START(startup_32)

/* Target address to relocate to for decompression */
addl BP_init_size(%esi), %ebx
- subl $_end, %ebx
+ subl $ rva(_end), %ebx

/*
* Prepare for entering 64 bit mode
@@ -152,19 +178,19 @@ SYM_FUNC_START(startup_32)
1:

/* Initialize Page tables to 0 */
- leal pgtable(%ebx), %edi
+ leal rva(pgtable)(%ebx), %edi
xorl %eax, %eax
movl $(BOOT_INIT_PGT_SIZE/4), %ecx
rep stosl

/* Build Level 4 */
- leal pgtable + 0(%ebx), %edi
+ leal rva(pgtable + 0)(%ebx), %edi
leal 0x1007 (%edi), %eax
movl %eax, 0(%edi)
addl %edx, 4(%edi)

/* Build Level 3 */
- leal pgtable + 0x1000(%ebx), %edi
+ leal rva(pgtable + 0x1000)(%ebx), %edi
leal 0x1007(%edi), %eax
movl $4, %ecx
1: movl %eax, 0x00(%edi)
@@ -175,7 +201,7 @@ SYM_FUNC_START(startup_32)
jnz 1b

/* Build Level 2 */
- leal pgtable + 0x2000(%ebx), %edi
+ leal rva(pgtable + 0x2000)(%ebx), %edi
movl $0x00000183, %eax
movl $2048, %ecx
1: movl %eax, 0(%edi)
@@ -186,7 +212,7 @@ SYM_FUNC_START(startup_32)
jnz 1b

/* Enable the boot page tables */
- leal pgtable(%ebx), %eax
+ leal rva(pgtable)(%ebx), %eax
movl %eax, %cr3

/* Enable Long mode in EFER (Extended Feature Enable Register) */
@@ -211,14 +237,14 @@ SYM_FUNC_START(startup_32)
* We place all of the values on our mini stack so lret can
* used to perform that far jump.
*/
- leal startup_64(%ebp), %eax
+ leal rva(startup_64)(%ebp), %eax
#ifdef CONFIG_EFI_MIXED
- movl efi32_boot_args(%ebp), %edi
+ movl rva(efi32_boot_args)(%ebp), %edi
cmp $0, %edi
jz 1f
- leal efi64_stub_entry(%ebp), %eax
- movl efi32_boot_args+4(%ebp), %esi
- movl efi32_boot_args+8(%ebp), %edx // saved bootparams pointer
+ leal rva(efi64_stub_entry)(%ebp), %eax
+ movl rva(efi32_boot_args+4)(%ebp), %esi
+ movl rva(efi32_boot_args+8)(%ebp), %edx // saved bootparams pointer
cmpl $0, %edx
jnz 1f
/*
@@ -229,7 +255,7 @@ SYM_FUNC_START(startup_32)
* the correct stack alignment for entry.
*/
subl $40, %esp
- leal efi_pe_entry(%ebp), %eax
+ leal rva(efi_pe_entry)(%ebp), %eax
movl %edi, %ecx // MS calling convention
movl %esi, %edx
1:
@@ -255,18 +281,18 @@ SYM_FUNC_START(efi32_stub_entry)

call 1f
1: pop %ebp
- subl $1b, %ebp
+ subl $ rva(1b), %ebp

- movl %esi, efi32_boot_args+8(%ebp)
+ movl %esi, rva(efi32_boot_args+8)(%ebp)
SYM_INNER_LABEL(efi32_pe_stub_entry, SYM_L_LOCAL)
- movl %ecx, efi32_boot_args(%ebp)
- movl %edx, efi32_boot_args+4(%ebp)
- movb $0, efi_is64(%ebp)
+ movl %ecx, rva(efi32_boot_args)(%ebp)
+ movl %edx, rva(efi32_boot_args+4)(%ebp)
+ movb $0, rva(efi_is64)(%ebp)

/* Save firmware GDTR and code/data selectors */
- sgdtl efi32_boot_gdt(%ebp)
- movw %cs, efi32_boot_cs(%ebp)
- movw %ds, efi32_boot_ds(%ebp)
+ sgdtl rva(efi32_boot_gdt)(%ebp)
+ movw %cs, rva(efi32_boot_cs)(%ebp)
+ movw %ds, rva(efi32_boot_ds)(%ebp)

/* Disable paging */
movl %cr0, %eax
@@ -345,11 +371,11 @@ SYM_CODE_START(startup_64)

/* Target address to relocate to for decompression */
movl BP_init_size(%rsi), %ebx
- subl $_end, %ebx
+ subl $ rva(_end), %ebx
addq %rbp, %rbx

/* Set up the stack */
- leaq boot_stack_end(%rbx), %rsp
+ leaq rva(boot_stack_end)(%rbx), %rsp

/*
* At this point we are in long mode with 4-level paging enabled,
@@ -423,7 +449,7 @@ SYM_CODE_START(startup_64)
lretq
trampoline_return:
/* Restore the stack, the 32-bit trampoline uses its own stack */
- leaq boot_stack_end(%rbx), %rsp
+ leaq rva(boot_stack_end)(%rbx), %rsp

/*
* cleanup_trampoline() would restore trampoline memory.
@@ -435,7 +461,7 @@ trampoline_return:
* this function call.
*/
pushq %rsi
- leaq top_pgtable(%rbx), %rdi
+ leaq rva(top_pgtable)(%rbx), %rdi
call cleanup_trampoline
popq %rsi

@@ -449,9 +475,9 @@ trampoline_return:
*/
pushq %rsi
leaq (_bss-8)(%rip), %rsi
- leaq (_bss-8)(%rbx), %rdi
- movq $_bss /* - $startup_32 */, %rcx
- shrq $3, %rcx
+ leaq rva(_bss-8)(%rbx), %rdi
+ movl $(_bss - startup_32), %ecx
+ shrl $3, %ecx
std
rep movsq
cld
@@ -462,15 +488,15 @@ trampoline_return:
* during extract_kernel below. To avoid any issues, repoint the GDTR
* to the new copy of the GDT.
*/
- leaq gdt64(%rbx), %rax
- leaq gdt(%rbx), %rdx
+ leaq rva(gdt64)(%rbx), %rax
+ leaq rva(gdt)(%rbx), %rdx
movq %rdx, 2(%rax)
lgdt (%rax)

/*
* Jump to the relocated address.
*/
- leaq .Lrelocated(%rbx), %rax
+ leaq rva(.Lrelocated)(%rbx), %rax
jmp *%rax
SYM_CODE_END(startup_64)

@@ -482,7 +508,7 @@ SYM_FUNC_START_ALIAS(efi_stub_entry)
movq %rdx, %rbx /* save boot_params pointer */
call efi_main
movq %rbx,%rsi
- leaq startup_64(%rax), %rax
+ leaq rva(startup_64)(%rax), %rax
jmp *%rax
SYM_FUNC_END(efi64_stub_entry)
SYM_FUNC_END_ALIAS(efi_stub_entry)
@@ -645,7 +671,7 @@ SYM_DATA(efi_is64, .byte 1)
#define BS32_handle_protocol 88 // offsetof(efi_boot_services_32_t, handle_protocol)
#define LI32_image_base 32 // offsetof(efi_loaded_image_32_t, image_base)

- .text
+ __HEAD
.code32
SYM_FUNC_START(efi32_pe_entry)
/*
@@ -667,12 +693,12 @@ SYM_FUNC_START(efi32_pe_entry)

call 1f
1: pop %ebx
- subl $1b, %ebx
+ subl $ rva(1b), %ebx

/* Get the loaded image protocol pointer from the image handle */
leal -4(%ebp), %eax
pushl %eax // &loaded_image
- leal loaded_image_proto(%ebx), %eax
+ leal rva(loaded_image_proto)(%ebx), %eax
pushl %eax // pass the GUID address
pushl 8(%ebp) // pass the image handle

@@ -707,7 +733,7 @@ SYM_FUNC_START(efi32_pe_entry)
* use it before we get to the 64-bit efi_pe_entry() in C code.
*/
subl %esi, %ebx
- movl %ebx, image_offset(%ebp) // save image_offset
+ movl %ebx, rva(image_offset)(%ebp) // save image_offset
jmp efi32_pe_stub_entry

2: popl %edi // restore callee-save registers
--
2.26.2

Arvind Sankar

unread,
Jul 14, 2020, 8:41:42 PM7/14/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
The BFD linker generates run-time relocations for z_input_len and
z_output_len, even though they are absolute symbols.

This is fixed for binutils-2.35 [1]. Work around this for earlier
versions by defining two variables input_len and output_len in addition
to the symbols, and use them via position-independent references.

This eliminates the last two run-time relocations in the head code and
allows us to drop the -z noreloc-overflow flag to the linker.

Move the -pie and --no-dynamic-linker LDFLAGS to LDFLAGS_vmlinux instead
of KBUILD_LDFLAGS. There shouldn't be anything else getting linked, but
this is the more logical location for these flags, and modversions might
call the linker if an EXPORT_SYMBOL is left over accidentally in one of
the decompressors.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=25754

Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
---
arch/x86/boot/compressed/Makefile | 12 ++----------
arch/x86/boot/compressed/head_32.S | 17 ++++++++---------
arch/x86/boot/compressed/head_64.S | 4 ++--
arch/x86/boot/compressed/mkpiggy.c | 6 ++++++
4 files changed, 18 insertions(+), 21 deletions(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index c829d874dcac..ae2c0dc98a6a 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -51,16 +51,8 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-ifeq ($(CONFIG_X86_32),y)
-KBUILD_LDFLAGS += $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
-else
-# To build 64-bit compressed kernel as PIE, we disable relocation
-# overflow check to avoid relocation overflow error with a new linker
-# command-line option, -z noreloc-overflow.
-KBUILD_LDFLAGS += $(shell $(LD) --help 2>&1 | grep -q "\-z noreloc-overflow" \
- && echo "-z noreloc-overflow -pie --no-dynamic-linker")
-endif
-LDFLAGS_vmlinux := -T
+LDFLAGS_vmlinux := $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
+LDFLAGS_vmlinux += -T

hostprogs := mkpiggy
HOST_EXTRACFLAGS += -I$(srctree)/tools/include
diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 8c1a4f5610f5..659fad53ca82 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -178,18 +178,17 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
/*
* Do the extraction, and jump to the new kernel..
*/
- /* push arguments for extract_kernel: */
- pushl $z_output_len /* decompressed length, end of relocs */
+ /* push arguments for extract_kernel: */

- pushl %ebp /* output address */
-
- pushl $z_input_len /* input_len */
+ pushl output_len@GOTOFF(%ebx) /* decompressed length, end of relocs */
+ pushl %ebp /* output address */
+ pushl input_len@GOTOFF(%ebx) /* input_len */
leal input_data@GOTOFF(%ebx), %eax
- pushl %eax /* input_data */
+ pushl %eax /* input_data */
leal boot_heap@GOTOFF(%ebx), %eax
- pushl %eax /* heap area */
- pushl %esi /* real mode pointer */
- call extract_kernel /* returns kernel location in %eax */
+ pushl %eax /* heap area */
+ pushl %esi /* real mode pointer */
+ call extract_kernel /* returns kernel location in %eax */
addl $24, %esp

/*
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 11429092c224..9e46729cf162 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -534,9 +534,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
movq %rsi, %rdi /* real mode address */
leaq boot_heap(%rip), %rsi /* malloc area for uncompression */
leaq input_data(%rip), %rdx /* input_data */
- movl $z_input_len, %ecx /* input_len */
+ movl input_len(%rip), %ecx /* input_len */
movq %rbp, %r8 /* output target address */
- movl $z_output_len, %r9d /* decompressed length, end of relocs */
+ movl output_len(%rip), %r9d /* decompressed length, end of relocs */
call extract_kernel /* returns kernel location in %rax */
popq %rsi

diff --git a/arch/x86/boot/compressed/mkpiggy.c b/arch/x86/boot/compressed/mkpiggy.c
index 7e01248765b2..52aa56cdbacc 100644
--- a/arch/x86/boot/compressed/mkpiggy.c
+++ b/arch/x86/boot/compressed/mkpiggy.c
@@ -60,6 +60,12 @@ int main(int argc, char *argv[])
printf(".incbin \"%s\"\n", argv[1]);
printf("input_data_end:\n");

+ printf(".section \".rodata\",\"a\",@progbits\n");
+ printf(".globl input_len\n");
+ printf("input_len:\n\t.long %lu\n", ilen);
+ printf(".globl output_len\n");
+ printf("output_len:\n\t.long %lu\n", (unsigned long)olen);
+
retval = 0;
bail:
if (f)
--
2.26.2

Arvind Sankar

unread,
Jul 14, 2020, 8:41:42 PM7/14/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Add a linker script check that there are no run-time relocations, and
remove the old one that tries to check via looking for specially-named
sections in the object files.

Drop the tests for -fPIE compiler option and -pie linker option, as they
are available in all supported gcc and binutils versions (as well as
clang and lld).

Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
---
arch/x86/boot/compressed/Makefile | 28 +++-----------------------
arch/x86/boot/compressed/vmlinux.lds.S | 8 ++++++++
2 files changed, 11 insertions(+), 25 deletions(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index ae2c0dc98a6a..a9e082b8c720 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -29,7 +29,7 @@ targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma \
vmlinux.bin.xz vmlinux.bin.lzo vmlinux.bin.lz4

KBUILD_CFLAGS := -m$(BITS) -O2
-KBUILD_CFLAGS += -fno-strict-aliasing $(call cc-option, -fPIE, -fPIC)
+KBUILD_CFLAGS += -fno-strict-aliasing -fPIE
KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
cflags-$(CONFIG_X86_32) := -march=i386
cflags-$(CONFIG_X86_64) := -mcmodel=small
@@ -51,7 +51,7 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-LDFLAGS_vmlinux := $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
+LDFLAGS_vmlinux := -pie $(call ld-option, --no-dynamic-linker)
LDFLAGS_vmlinux += -T

hostprogs := mkpiggy
@@ -86,30 +86,8 @@ vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
vmlinux-objs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o

-# The compressed kernel is built with -fPIC/-fPIE so that a boot loader
-# can place it anywhere in memory and it will still run. However, since
-# it is executed as-is without any ELF relocation processing performed
-# (and has already had all relocation sections stripped from the binary),
-# none of the code can use data relocations (e.g. static assignments of
-# pointer values), since they will be meaningless at runtime. This check
-# will refuse to link the vmlinux if any of these relocations are found.
-quiet_cmd_check_data_rel = DATAREL $@
-define cmd_check_data_rel
- for obj in $(filter %.o,$^); do \
- $(READELF) -S $$obj | grep -qF .rel.local && { \
- echo "error: $$obj has data relocations!" >&2; \
- exit 1; \
- } || true; \
- done
-endef
-
-# We need to run two commands under "if_changed", so merge them into a
-# single invocation.
-quiet_cmd_check-and-link-vmlinux = LD $@
- cmd_check-and-link-vmlinux = $(cmd_check_data_rel); $(cmd_ld)
-
$(obj)/vmlinux: $(vmlinux-objs-y) FORCE
- $(call if_changed,check-and-link-vmlinux)
+ $(call if_changed,ld)

OBJCOPYFLAGS_vmlinux.bin := -R .comment -S
$(obj)/vmlinux.bin: vmlinux FORCE
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index a4a4a59a2628..29df99b6cc64 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -42,6 +42,12 @@ SECTIONS
*(.rodata.*)
_erodata = . ;
}
+ .rel.dyn : {
+ *(.rel.*)
+ }
+ .rela.dyn : {
+ *(.rela.*)
+ }
.got : {
*(.got)
}
@@ -85,3 +91,5 @@ ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT en
#else
ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
#endif
+
+ASSERT(SIZEOF(.rel.dyn) == 0 && SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations detected!")
--
2.26.2

Sedat Dilek

unread,
Jul 14, 2020, 9:47:11 PM7/14/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Hi Arvind,

thanks for v5.

With my diff applied against your patchset *v4*:

diff --git a/arch/x86/boot/compressed/Makefile
b/arch/x86/boot/compressed/Makefile
index 789d5d14d8b0..d0aafcd8cf6c 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -51,8 +51,8 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-KBUILD_LDFLAGS += -pie $(call ld-option, --no-dynamic-linker)
-LDFLAGS_vmlinux := -T
+LDFLAGS_vmlinux := -pie $(call ld-option, --no-dynamic-linker)
+LDFLAGS_vmlinux += -T

hostprogs := mkpiggy
HOST_EXTRACFLAGS += -I$(srctree)/tools/include

I was able to build/assemble with LLVM/Clang v11.0.0-git+ffee8040534
and boot on bare metal.

Note:
I have applied some additional patches to be compliant with LLVM_IAS=1
(Clang's Integrated Assembler) and LLVM=1 means LLVM utilities.
( As pointed out zstd-v7. )

- Sedat -

P.S.: Check my build-log

$ grep 'arch/x86/boot/compressed/vmlinux'
build-log_5.8.0-rc5-3-amd64-llvm11-ias.txt
make -f ./scripts/Makefile.build obj=arch/x86/boot/compressed
arch/x86/boot/compressed/vmlinux
clang-11 -E -Wp,-MMD,arch/x86/boot/compressed/.vmlinux.lds.d
-nostdinc -isystem /usr/lib/llvm-11/lib/clang/11.0.0/include
-I./arch/x86/include -I./arch/x86/include/generated -I./include
-I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi
-I./include/uapi -I./include/generated/uapi -include
./include/linux/kconfig.h -D__KERNEL__ -Qunused-arguments -P -Ux86
-D__ASSEMBLY__ -DLINKER_SCRIPT -o arch/x86/boot/compressed/vmlinux.lds
arch/x86/boot/compressed/vmlinux.lds.S
llvm-objcopy-11 -R .comment -S vmlinux arch/x86/boot/compressed/vmlinux.bin
arch/x86/tools/relocs vmlinux >
arch/x86/boot/compressed/vmlinux.relocs;arch/x86/tools/relocs
--abs-relocs vmlinux
{ cat arch/x86/boot/compressed/vmlinux.bin
arch/x86/boot/compressed/vmlinux.relocs | zstd -22 --ultra; printf
\114\015\315\001; } > arch/x86/boot/compressed/vmlinux.bin.zst
arch/x86/boot/compressed/mkpiggy
arch/x86/boot/compressed/vmlinux.bin.zst >
arch/x86/boot/compressed/piggy.S
ld.lld-11 -m elf_x86_64 -pie --no-dynamic-linker -T
arch/x86/boot/compressed/vmlinux.lds
arch/x86/boot/compressed/kernel_info.o
arch/x86/boot/compressed/head_64.o arch/x86/boot/compressed/misc.o
arch/x86/boot/compressed/string.o arch/x86/boot/compressed/cmdline.o
arch/x86/boot/compressed/error.o arch/x86/boot/compressed/piggy.o
arch/x86/boot/compressed/cpuflags.o
arch/x86/boot/compressed/early_serial_console.o
arch/x86/boot/compressed/kaslr.o arch/x86/boot/compressed/kaslr_64.o
arch/x86/boot/compressed/mem_encrypt.o
arch/x86/boot/compressed/pgtable_64.o arch/x86/boot/compressed/acpi.o
drivers/firmware/efi/libstub/lib.a
arch/x86/boot/compressed/efi_thunk_64.o -o
arch/x86/boot/compressed/vmlinux
llvm-nm-11 arch/x86/boot/compressed/vmlinux | sed -n -e
's/^\([0-9a-fA-F]*\) [a-zA-Z]
\(startup_32\|startup_64\|efi32_stub_entry\|efi64_stub_entry\|efi_pe_entry\|efi32_pe_entry\|input_data\|kernel_info\|_end\|_ehead\|_text\|z_.*\)$/#define
ZO_ 0x/p' > arch/x86/boot/zoffset.h
llvm-objcopy-11 -O binary -R .note -R .comment -S
arch/x86/boot/compressed/vmlinux arch/x86/boot/vmlinux.bin

- EOT -

Sedat Dilek

unread,
Jul 15, 2020, 3:11:18 AM7/15/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Re-tested v5 of your patchset, feel free to add appropriate credits.

- Sedat -

Sedat Dilek

unread,
Jul 15, 2020, 4:52:33 AM7/15/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Wed, Jul 15, 2020 at 2:41 AM Arvind Sankar <nive...@alum.mit.edu> wrote:
>
Tested-by: Sedat Dilek <sedat...@gmail.com>

- Sedat -

Sedat Dilek

unread,
Jul 15, 2020, 4:54:32 AM7/15/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Wed, Jul 15, 2020 at 2:41 AM Arvind Sankar <nive...@alum.mit.edu> wrote:
>
> From: Ard Biesheuvel <ar...@kernel.org>
>
> Eliminate all GOT entries in the decompressor binary, by forcing hidden
> visibility for all symbol references, which informs the compiler that
> such references will be resolved at link time without the need for
> allocating GOT entries.
>
> To ensure that no GOT entries will creep back in, add an assertion to
> the decompressor linker script that will fire if the .got section has
> a non-zero size.
>
> [Arvind: move hidden.h to include/linux instead of making a copy]
>

Tested-by: Sedat Dilek <sedat...@gmail.com>
Reported-by: Sedat Dilek <sedat...@gmail.com>

Reported hidden.h breakage and asked for a follow-up.

- Sedat -

[1] https://marc.info/?l=linux-kernel&m=159056070321982&w=2

Sedat Dilek

unread,
Jul 15, 2020, 4:55:01 AM7/15/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Wed, Jul 15, 2020 at 2:41 AM Arvind Sankar <nive...@alum.mit.edu> wrote:
>
> From: Ard Biesheuvel <ar...@kernel.org>
>
> In a previous patch, we have eliminated GOT entries from the decompressor
> binary and added an assertion that the .got section is empty. This means
> that the GOT fixup routines that exist in both the 32-bit and 64-bit
> startup routines have become dead code, and can be removed.
>
> While at it, drop the KEEP() from the linker script, as it has no effect
> on the contents of output sections that are created by the linker itself.
>
> Reviewed-by: Kees Cook <kees...@chromium.org>
> Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
> Acked-by: Arvind Sankar <nive...@alum.mit.edu>
> Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
> From: Ard Biesheuvel <ar...@kernel.org>
> Link: https://lore.kernel.org/r/2020052312002...@kernel.org

Tested-by: Sedat Dilek <sedat...@gmail.com>

- Sedat -

Sedat Dilek

unread,
Jul 15, 2020, 4:55:44 AM7/15/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Wed, Jul 15, 2020 at 2:41 AM Arvind Sankar <nive...@alum.mit.edu> wrote:
>
Tested-by: Sedat Dilek <sedat...@gmail.com>

- Sedat -

> ---
> arch/x86/boot/setup.ld | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld
> index 24c95522f231..49546c247ae2 100644
> --- a/arch/x86/boot/setup.ld
> +++ b/arch/x86/boot/setup.ld
> @@ -20,7 +20,7 @@ SECTIONS
> .initdata : { *(.initdata) }
> __end_init = .;
>
> - .text : { *(.text) }
> + .text : { *(.text .text.*) }
> .text32 : { *(.text32) }
>
> . = ALIGN(16);
> --
> 2.26.2
>
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-li...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20200715004133.1430068-5-nivedita%40alum.mit.edu.

Sedat Dilek

unread,
Jul 15, 2020, 4:56:24 AM7/15/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Wed, Jul 15, 2020 at 2:41 AM Arvind Sankar <nive...@alum.mit.edu> wrote:
>
Tested-by: Sedat Dilek <sedat...@gmail.com>

- Sedat -

Sedat Dilek

unread,
Jul 15, 2020, 4:58:13 AM7/15/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Wed, Jul 15, 2020 at 2:41 AM Arvind Sankar <nive...@alum.mit.edu> wrote:
>
> The BFD linker generates run-time relocations for z_input_len and
> z_output_len, even though they are absolute symbols.
>
> This is fixed for binutils-2.35 [1]. Work around this for earlier
> versions by defining two variables input_len and output_len in addition
> to the symbols, and use them via position-independent references.
>
> This eliminates the last two run-time relocations in the head code and
> allows us to drop the -z noreloc-overflow flag to the linker.
>
> Move the -pie and --no-dynamic-linker LDFLAGS to LDFLAGS_vmlinux instead
> of KBUILD_LDFLAGS. There shouldn't be anything else getting linked, but
> this is the more logical location for these flags, and modversions might
> call the linker if an EXPORT_SYMBOL is left over accidentally in one of
> the decompressors.
>

Tested-by: Sedat Dilek <sedat...@gmail.com>
Reported-by: Sedat Dilek <sedat...@gmail.com>

Reported breakage with LLD in previous patchset version.

- Sedat -

Sedat Dilek

unread,
Jul 15, 2020, 5:00:22 AM7/15/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Wed, Jul 15, 2020 at 2:41 AM Arvind Sankar <nive...@alum.mit.edu> wrote:
>
> Add a linker script check that there are no run-time relocations, and
> remove the old one that tries to check via looking for specially-named
> sections in the object files.
>
> Drop the tests for -fPIE compiler option and -pie linker option, as they
> are available in all supported gcc and binutils versions (as well as
> clang and lld).
>

Tested-by: Sedat Dilek <sedat...@gmail.com>
Reported-by: Sedat Dilek <sedat...@gmail.com>

Here I reported the breakage with LLD and tested the move to
LDFLAGS_vmlinux with a previous version of the patch.

- Sedat -

Ard Biesheuvel

unread,
Jul 15, 2020, 5:03:21 AM7/15/20
to Sedat Dilek, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
On Wed, 15 Jul 2020 at 11:58, Sedat Dilek <sedat...@gmail.com> wrote:
>
> On Wed, Jul 15, 2020 at 2:41 AM Arvind Sankar <nive...@alum.mit.edu> wrote:
> >
> > The BFD linker generates run-time relocations for z_input_len and
> > z_output_len, even though they are absolute symbols.
> >
> > This is fixed for binutils-2.35 [1]. Work around this for earlier
> > versions by defining two variables input_len and output_len in addition
> > to the symbols, and use them via position-independent references.
> >
> > This eliminates the last two run-time relocations in the head code and
> > allows us to drop the -z noreloc-overflow flag to the linker.
> >
> > Move the -pie and --no-dynamic-linker LDFLAGS to LDFLAGS_vmlinux instead
> > of KBUILD_LDFLAGS. There shouldn't be anything else getting linked, but
> > this is the more logical location for these flags, and modversions might
> > call the linker if an EXPORT_SYMBOL is left over accidentally in one of
> > the decompressors.
> >
>
> Tested-by: Sedat Dilek <sedat...@gmail.com>
> Reported-by: Sedat Dilek <sedat...@gmail.com>
>
> Reported breakage with LLD in previous patchset version.
>

Please drop the bogus reported-bys. The patch does *not* address an
issue you reported, so recording this in the commit log would be
incorrect.

Your review and testing is appreciated, and resulted in substantial
improvements. So feel free to give your reviewed-by in addition to
your tested-by. But reported-by is inappropriate here.

Sedat Dilek

unread,
Jul 15, 2020, 5:10:45 AM7/15/20
to Ard Biesheuvel, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, X86 ML, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, Linux Kernel Mailing List
On Wed, Jul 15, 2020 at 11:03 AM Ard Biesheuvel <ar...@kernel.org> wrote:
>
> On Wed, 15 Jul 2020 at 11:58, Sedat Dilek <sedat...@gmail.com> wrote:
> >
> > On Wed, Jul 15, 2020 at 2:41 AM Arvind Sankar <nive...@alum.mit.edu> wrote:
> > >
> > > The BFD linker generates run-time relocations for z_input_len and
> > > z_output_len, even though they are absolute symbols.
> > >
> > > This is fixed for binutils-2.35 [1]. Work around this for earlier
> > > versions by defining two variables input_len and output_len in addition
> > > to the symbols, and use them via position-independent references.
> > >
> > > This eliminates the last two run-time relocations in the head code and
> > > allows us to drop the -z noreloc-overflow flag to the linker.
> > >
> > > Move the -pie and --no-dynamic-linker LDFLAGS to LDFLAGS_vmlinux instead
> > > of KBUILD_LDFLAGS. There shouldn't be anything else getting linked, but
> > > this is the more logical location for these flags, and modversions might
> > > call the linker if an EXPORT_SYMBOL is left over accidentally in one of
> > > the decompressors.
> > >
> >
> > Tested-by: Sedat Dilek <sedat...@gmail.com>
> > Reported-by: Sedat Dilek <sedat...@gmail.com>
> >
> > Reported breakage with LLD in previous patchset version.
> >
>
> Please drop the bogus reported-bys. The patch does *not* address an
> issue you reported, so recording this in the commit log would be
> incorrect.
>
> Your review and testing is appreciated, and resulted in substantial
> improvements. So feel free to give your reviewed-by in addition to
> your tested-by. But reported-by is inappropriate here.
>

Correct, in a hurry I mixed it up as said in the next patch 7/7.
Please drop this Reported-by.

- Sedat -

Sedat Dilek

unread,
Jul 15, 2020, 5:12:27 AM7/15/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, Clang-Built-Linux ML, Ard Biesheuvel, Masahiro Yamada, Daniel Kiper, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Wed, Jul 15, 2020 at 11:00 AM Sedat Dilek <sedat...@gmail.com> wrote:
>
> On Wed, Jul 15, 2020 at 2:41 AM Arvind Sankar <nive...@alum.mit.edu> wrote:
> >
> > Add a linker script check that there are no run-time relocations, and
> > remove the old one that tries to check via looking for specially-named
> > sections in the object files.
> >
> > Drop the tests for -fPIE compiler option and -pie linker option, as they
> > are available in all supported gcc and binutils versions (as well as
> > clang and lld).
> >
>
> Tested-by: Sedat Dilek <sedat...@gmail.com>
> Reported-by: Sedat Dilek <sedat...@gmail.com>
>
> Here I reported the breakage with LLD and tested the move to
> LDFLAGS_vmlinux with a previous version of the patch.
>

Feel free to also add my:

Reviewed-by: Sedat Dilek <sedat...@gmail.com>

Arvind Sankar

unread,
Jul 17, 2020, 9:46:58 AM7/17/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
On Tue, Jul 14, 2020 at 08:41:26PM -0400, Arvind Sankar wrote:
> The compressed kernel currently contains bogus run-time relocations in
> the startup code in head_{32,64}.S, which are generated by the linker,
> but must not actually be processed at run-time.
>
> This generates warnings when linking with the BFD linker, and errors
> with LLD, which defaults to erroring on run-time relocations in read-only
> sections. It also requires the -z noreloc-overflow hack for the 64-bit
> kernel, which prevents us from linking it as -pie on an older BFD linker
> (<= 2.26) or on LLD, because the locations that are to be apparently
> relocated are only 32-bits in size and so cannot really have
> R_X86_64_RELATIVE relocations.
>
> This series aims to get rid of these relocations. I've build- and
> boot-tested with combinations of clang/gcc-10 with lld/bfd-2.34, and
> gcc-4.9.0 with bfd-2.24, skipping clang on 32-bit because it currently
> has other issues [0].
>

Hi Thomas, Ingo, Borislav, would you be able to take a look over this
series in time for 5.9?

Thanks.

Nick Desaulniers

unread,
Jul 17, 2020, 2:17:09 PM7/17/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Fangrui Song, Dmitry Golovin, clang-built-linux, Ard Biesheuvel, Masahiro Yamada, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, LKML
Hi Arvind, thanks for the series; I'm behind on testing. When I try
to apply this series on top of linux-next, I get a collision in
drivers/firmware/efi/libstub/Makefile:27 when applying "0002
x86/boot/compressed: Force hidden visibility for all symbol
references". Would you mind refreshing the series to avoid that
collision?
--
Thanks,
~Nick Desaulniers

Sedat Dilek

unread,
Jul 17, 2020, 2:21:25 PM7/17/20
to Nick Desaulniers, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Fangrui Song, Dmitry Golovin, clang-built-linux, Ard Biesheuvel, Masahiro Yamada, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, LKML
I guess taking a Linux-next release as a new base is not a good idea.
With the next Linux-next release... new troubles.
Please, keep the base on recent Linux v5.8-rcX.

- Sedat -

Arvind Sankar

unread,
Jul 17, 2020, 4:18:04 PM7/17/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Same as v5 previously posted, but rebased onto next-20200717.

v5: https://lore.kernel.org/lkml/20200715004133.1...@alum.mit.edu/

Ard Biesheuvel (3):
x86/boot/compressed: Move .got.plt entries out of the .got section
x86/boot/compressed: Force hidden visibility for all symbol references
x86/boot/compressed: Get rid of GOT fixup code

Arvind Sankar (4):
x86/boot: Add .text.* to setup.ld
x86/boot: Remove run-time relocations from .head.text code
x86/boot: Remove run-time relocations from head_{32,64}.S
x86/boot: Check that there are no run-time relocations

arch/x86/boot/compressed/Makefile | 39 +-----
arch/x86/boot/compressed/head_32.S | 99 +++++----------
arch/x86/boot/compressed/head_64.S | 165 ++++++++++---------------
arch/x86/boot/compressed/mkpiggy.c | 6 +
arch/x86/boot/compressed/vmlinux.lds.S | 24 +++-
arch/x86/boot/setup.ld | 2 +-
drivers/firmware/efi/libstub/Makefile | 2 +-
drivers/firmware/efi/libstub/hidden.h | 6 -
include/linux/hidden.h | 19 +++
9 files changed, 153 insertions(+), 209 deletions(-)
delete mode 100644 drivers/firmware/efi/libstub/hidden.h
create mode 100644 include/linux/hidden.h


base-commit: aab7ee9f8ff0110bfcd594b33dc33748dc1baf46
--
2.26.2

Arvind Sankar

unread,
Jul 17, 2020, 4:18:05 PM7/17/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

The .got.plt section contains the part of the GOT which is used by PLT
entries, and which gets updated lazily by the dynamic loader when
function calls are dispatched through those PLT entries.

On fully linked binaries such as the kernel proper or the decompressor,
this never happens, and so in practice, the .got.plt section consists
only of the first 3 magic entries that are meant to point at the _DYNAMIC
section and at the fixup routine in the loader. However, since we don't
use a dynamic loader, those entries are never populated or used.

This means that treating those entries like ordinary GOT entries, and
updating their values based on the actual placement of the executable in
memory is completely pointless, and we can just ignore the .got.plt
section entirely, provided that it has no additional entries beyond
the first 3 ones.

So add an assertion in the linker script to ensure that this assumption
holds, and move the contents out of the [_got, _egot) memory range that
is modified by the GOT fixup routines.

While at it, drop the KEEP(), since it has no effect on the contents
of output sections that are created by the linker itself.

Reviewed-by: Kees Cook <kees...@chromium.org>
Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
From: Ard Biesheuvel <ar...@kernel.org>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
---
arch/x86/boot/compressed/vmlinux.lds.S | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 8f1025d1f681..b17d218ccdf9 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -44,10 +44,13 @@ SECTIONS
}
.got : {
_got = .;
- KEEP(*(.got.plt))
KEEP(*(.got))
_egot = .;
}
+ .got.plt : {
+ *(.got.plt)
+ }
+
.data : {
_data = . ;
*(.data)
@@ -77,3 +80,9 @@ SECTIONS

DISCARDS
}
+
+#ifdef CONFIG_X86_64
+ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
+#else
+ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0xc, "Unexpected GOT/PLT entries detected!")
+#endif
--
2.26.2

Arvind Sankar

unread,
Jul 17, 2020, 4:18:06 PM7/17/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

Eliminate all GOT entries in the decompressor binary, by forcing hidden
visibility for all symbol references, which informs the compiler that
such references will be resolved at link time without the need for
allocating GOT entries.

To ensure that no GOT entries will creep back in, add an assertion to
the decompressor linker script that will fire if the .got section has
a non-zero size.

[Arvind: move hidden.h to include/linux instead of making a copy]

Reviewed-by: Kees Cook <kees...@chromium.org>
Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
From: Ard Biesheuvel <ar...@kernel.org>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
---
arch/x86/boot/compressed/Makefile | 1 +
arch/x86/boot/compressed/vmlinux.lds.S | 1 +
drivers/firmware/efi/libstub/Makefile | 2 +-
drivers/firmware/efi/libstub/hidden.h | 6 ------
include/linux/hidden.h | 19 +++++++++++++++++++
5 files changed, 22 insertions(+), 7 deletions(-)
delete mode 100644 drivers/firmware/efi/libstub/hidden.h
create mode 100644 include/linux/hidden.h

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index b7beabecef8a..b6d7caaaef9e 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -42,6 +42,7 @@ KBUILD_CFLAGS += $(call cc-disable-warning, gnu)
KBUILD_CFLAGS += -Wno-pointer-sign
KBUILD_CFLAGS += $(call cc-option,-fmacro-prefix-map=$(srctree)/=)
KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
+KBUILD_CFLAGS += -include $(srctree)/include/linux/hidden.h

KBUILD_AFLAGS := $(KBUILD_CFLAGS) -D__ASSEMBLY__
GCOV_PROFILE := n
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index b17d218ccdf9..4bcc943842ab 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -81,6 +81,7 @@ SECTIONS
DISCARDS
}

+ASSERT(SIZEOF(.got) == 0, "Unexpected GOT entries detected!")
#ifdef CONFIG_X86_64
ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!")
#else
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 2a156f7fec3b..8b350e5a65bc 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -27,7 +27,7 @@ cflags-$(CONFIG_ARM) := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) \
cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt

KBUILD_CFLAGS := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \
- -include $(srctree)/drivers/firmware/efi/libstub/hidden.h \
+ -include $(srctree)/include/linux/hidden.h \
-D__NO_FORTIFY \
-ffreestanding \
-fno-stack-protector \

Arvind Sankar

unread,
Jul 17, 2020, 4:18:08 PM7/17/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
From: Ard Biesheuvel <ar...@kernel.org>

In a previous patch, we have eliminated GOT entries from the decompressor
binary and added an assertion that the .got section is empty. This means
that the GOT fixup routines that exist in both the 32-bit and 64-bit
startup routines have become dead code, and can be removed.

While at it, drop the KEEP() from the linker script, as it has no effect
on the contents of output sections that are created by the linker itself.

Reviewed-by: Kees Cook <kees...@chromium.org>
Signed-off-by: Ard Biesheuvel <ar...@kernel.org>
Acked-by: Arvind Sankar <nive...@alum.mit.edu>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
From: Ard Biesheuvel <ar...@kernel.org>
Link: https://lore.kernel.org/r/2020052312002...@kernel.org
---
arch/x86/boot/compressed/head_32.S | 24 ++---------
arch/x86/boot/compressed/head_64.S | 57 --------------------------
arch/x86/boot/compressed/vmlinux.lds.S | 4 +-
3 files changed, 5 insertions(+), 80 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 03557f2174bf..39f0bb43218f 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -49,16 +49,13 @@
* Position Independent Executable (PIE) so that linker won't optimize
* R_386_GOT32X relocation to its fixed symbol address. Older
* linkers generate R_386_32 relocations against locally defined symbols,
- * _bss, _ebss, _got, _egot and _end, in PIE. It isn't wrong, just less
- * optimal than R_386_RELATIVE. But the x86 kernel fails to properly handle
- * R_386_32 relocations when relocating the kernel. To generate
- * R_386_RELATIVE relocations, we mark _bss, _ebss, _got, _egot and _end as
- * hidden:
+ * _bss, _ebss and _end, in PIE. It isn't wrong, just less optimal than
+ * R_386_RELATIVE. But the x86 kernel fails to properly handle R_386_32
+ * relocations when relocating the kernel. To generate R_386_RELATIVE
+ * relocations, we mark _bss, _ebss and _end as hidden:
*/
.hidden _bss
.hidden _ebss
- .hidden _got
- .hidden _egot
.hidden _end

__HEAD
@@ -192,19 +189,6 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
shrl $2, %ecx
rep stosl

-/*
- * Adjust our own GOT
- */
- leal _got(%ebx), %edx
- leal _egot(%ebx), %ecx
-1:
- cmpl %ecx, %edx
- jae 2f
- addl %ebx, (%edx)
- addl $4, %edx
- jmp 1b
-2:
-
/*
* Do the extraction, and jump to the new kernel..
*/
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 97d37f0a34f5..bf1ab30acc5b 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -40,8 +40,6 @@
*/
.hidden _bss
.hidden _ebss
- .hidden _got
- .hidden _egot
.hidden _end

__HEAD
@@ -353,25 +351,6 @@ SYM_CODE_START(startup_64)
/* Set up the stack */
leaq boot_stack_end(%rbx), %rsp

- /*
- * paging_prepare() and cleanup_trampoline() below can have GOT
- * references. Adjust the table with address we are running at.
- *
- * Zero RAX for adjust_got: the GOT was not adjusted before;
- * there's no adjustment to undo.
- */
- xorq %rax, %rax
-
- /*
- * Calculate the address the binary is loaded at and use it as
- * a GOT adjustment.
- */
- call 1f
-1: popq %rdi
- subq $1b, %rdi
-
- call .Ladjust_got
-
/*
* At this point we are in long mode with 4-level paging enabled,
* but we might want to enable 5-level paging or vice versa.
@@ -464,21 +443,6 @@ trampoline_return:
pushq $0
popfq

- /*
- * Previously we've adjusted the GOT with address the binary was
- * loaded at. Now we need to re-adjust for relocation address.
- *
- * Calculate the address the binary is loaded at, so that we can
- * undo the previous GOT adjustment.
- */
- call 1f
-1: popq %rax
- subq $1b, %rax
-
- /* The new adjustment is the relocation address */
- movq %rbx, %rdi
- call .Ladjust_got
-
/*
* Copy the compressed kernel to the end of our buffer
* where decompression in place becomes safe.
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 4bcc943842ab..a4a4a59a2628 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S

Arvind Sankar

unread,
Jul 17, 2020, 4:18:08 PM7/17/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
---

Arvind Sankar

unread,
Jul 17, 2020, 4:18:10 PM7/17/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
---
arch/x86/boot/compressed/head_32.S | 64 +++++++-----------
arch/x86/boot/compressed/head_64.S | 104 ++++++++++++++++++-----------
2 files changed, 90 insertions(+), 78 deletions(-)

diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 39f0bb43218f..8c1a4f5610f5 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
*/
.hidden _bss
.hidden _ebss
/* Set up the stack */
- leal boot_stack_end(%ebx), %esp
+ leal boot_stack_end@GOTOFF(%ebx), %esp

/* Zero EFLAGS */
pushl $0
@@ -139,8 +125,8 @@ SYM_FUNC_START(startup_32)
* where decompression in place becomes safe.
shrl $2, %ecx
rep stosl
@@ -198,9 +184,9 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
pushl %ebp /* output address */

pushl $z_input_len /* input_len */
- leal input_data(%ebx), %eax
+ leal input_data@GOTOFF(%ebx), %eax
pushl %eax /* input_data */
- leal boot_heap(%ebx), %eax
+ leal boot_heap@GOTOFF(%ebx), %eax
pushl %eax /* heap area */
pushl %esi /* real mode pointer */
call extract_kernel /* returns kernel location in %eax */
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index bf1ab30acc5b..11429092c224 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
/* Set up the stack */
- leaq boot_stack_end(%rbx), %rsp
+ leaq rva(boot_stack_end)(%rbx), %rsp

/*
* At this point we are in long mode with 4-level paging enabled,

Arvind Sankar

unread,
Jul 17, 2020, 4:18:10 PM7/17/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
The BFD linker generates run-time relocations for z_input_len and
z_output_len, even though they are absolute symbols.

This is fixed for binutils-2.35 [1]. Work around this for earlier
versions by defining two variables input_len and output_len in addition
to the symbols, and use them via position-independent references.

This eliminates the last two run-time relocations in the head code and
allows us to drop the -z noreloc-overflow flag to the linker.

Move the -pie and --no-dynamic-linker LDFLAGS to LDFLAGS_vmlinux instead
of KBUILD_LDFLAGS. There shouldn't be anything else getting linked, but
this is the more logical location for these flags, and modversions might
call the linker if an EXPORT_SYMBOL is left over accidentally in one of
the decompressors.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=25754

Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
---
arch/x86/boot/compressed/Makefile | 12 ++----------
arch/x86/boot/compressed/head_32.S | 17 ++++++++---------
arch/x86/boot/compressed/head_64.S | 4 ++--
arch/x86/boot/compressed/mkpiggy.c | 6 ++++++
4 files changed, 18 insertions(+), 21 deletions(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index b6d7caaaef9e..4a3953a596e0 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -51,16 +51,8 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-ifeq ($(CONFIG_X86_32),y)
-KBUILD_LDFLAGS += $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
-else
-# To build 64-bit compressed kernel as PIE, we disable relocation
-# overflow check to avoid relocation overflow error with a new linker
-# command-line option, -z noreloc-overflow.
-KBUILD_LDFLAGS += $(shell $(LD) --help 2>&1 | grep -q "\-z noreloc-overflow" \
- && echo "-z noreloc-overflow -pie --no-dynamic-linker")
-endif
-LDFLAGS_vmlinux := -T
+LDFLAGS_vmlinux := $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
+LDFLAGS_vmlinux += -T

hostprogs := mkpiggy
HOST_EXTRACFLAGS += -I$(srctree)/tools/include
diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 8c1a4f5610f5..659fad53ca82 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -178,18 +178,17 @@ SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
/*
* Do the extraction, and jump to the new kernel..
*/
- /* push arguments for extract_kernel: */
- pushl $z_output_len /* decompressed length, end of relocs */
+ /* push arguments for extract_kernel: */

- pushl %ebp /* output address */
-
- pushl $z_input_len /* input_len */
+ pushl output_len@GOTOFF(%ebx) /* decompressed length, end of relocs */
+ pushl %ebp /* output address */
+ pushl input_len@GOTOFF(%ebx) /* input_len */
leal input_data@GOTOFF(%ebx), %eax
- pushl %eax /* input_data */
+ pushl %eax /* input_data */
leal boot_heap@GOTOFF(%ebx), %eax
- pushl %eax /* heap area */
- pushl %esi /* real mode pointer */
- call extract_kernel /* returns kernel location in %eax */
+ pushl %eax /* heap area */
+ pushl %esi /* real mode pointer */
+ call extract_kernel /* returns kernel location in %eax */
addl $24, %esp

/*
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 11429092c224..9e46729cf162 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S

Arvind Sankar

unread,
Jul 17, 2020, 4:18:12 PM7/17/20
to Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x...@kernel.org, Nick Desaulniers, Fangrui Song, Dmitry Golovin, clang-bu...@googlegroups.com, Ard Biesheuvel, Masahiro Yamada, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, linux-...@vger.kernel.org
Add a linker script check that there are no run-time relocations, and
remove the old one that tries to check via looking for specially-named
sections in the object files.

Drop the tests for -fPIE compiler option and -pie linker option, as they
are available in all supported gcc and binutils versions (as well as
clang and lld).

Reviewed-by: Kees Cook <kees...@chromium.org>
Reviewed-by: Ard Biesheuvel <ar...@kernel.org>
Reviewed-by: Fangrui Song <mas...@google.com>
Signed-off-by: Arvind Sankar <nive...@alum.mit.edu>
---
arch/x86/boot/compressed/Makefile | 28 +++-----------------------
arch/x86/boot/compressed/vmlinux.lds.S | 8 ++++++++
2 files changed, 11 insertions(+), 25 deletions(-)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 4a3953a596e0..271cc933c59c 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -29,7 +29,7 @@ targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma \
vmlinux.bin.xz vmlinux.bin.lzo vmlinux.bin.lz4

KBUILD_CFLAGS := -m$(BITS) -O2
-KBUILD_CFLAGS += -fno-strict-aliasing $(call cc-option, -fPIE, -fPIC)
+KBUILD_CFLAGS += -fno-strict-aliasing -fPIE
KBUILD_CFLAGS += -DDISABLE_BRANCH_PROFILING
cflags-$(CONFIG_X86_32) := -march=i386
cflags-$(CONFIG_X86_64) := -mcmodel=small
@@ -51,7 +51,7 @@ UBSAN_SANITIZE :=n
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
# Compressed kernel should be built as PIE since it may be loaded at any
# address by the bootloader.
-LDFLAGS_vmlinux := $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
+LDFLAGS_vmlinux := -pie $(call ld-option, --no-dynamic-linker)
LDFLAGS_vmlinux += -T

hostprogs := mkpiggy
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index a4a4a59a2628..29df99b6cc64 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S

Nick Desaulniers

unread,
Jul 17, 2020, 7:46:41 PM7/17/20
to Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Fangrui Song, Dmitry Golovin, clang-built-linux, Ard Biesheuvel, Masahiro Yamada, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, LKML
On Fri, Jul 17, 2020 at 1:18 PM Arvind Sankar <nive...@alum.mit.edu> wrote:
>
> Same as v5 previously posted, but rebased onto next-20200717.
>
> v5: https://lore.kernel.org/lkml/20200715004133.1...@alum.mit.edu/
>
> Ard Biesheuvel (3):
> x86/boot/compressed: Move .got.plt entries out of the .got section
> x86/boot/compressed: Force hidden visibility for all symbol references
> x86/boot/compressed: Get rid of GOT fixup code
>
> Arvind Sankar (4):
> x86/boot: Add .text.* to setup.ld
> x86/boot: Remove run-time relocations from .head.text code
> x86/boot: Remove run-time relocations from head_{32,64}.S
> x86/boot: Check that there are no run-time relocations

Thank you Arvind for the series. I did quick build+boot tests of x86
defconfigs with `make LLVM=1` which look good.

Further, I was able to build+boot i386 defconfigs with `make LLVM=1`
with your series, Mr. Rothwell's proposed fixed for today's linux-next
breakage: https://lore.kernel.org/linux-next/20200717213...@canb.auug.org.au/,
and Mr. Gerst's series for i386+clang support. So that looks like
with this series, we can now link i386 defconfig with LLD! Nice!

Tested-by: Nick Desaulniers <ndesau...@google.com>

>
> arch/x86/boot/compressed/Makefile | 39 +-----
> arch/x86/boot/compressed/head_32.S | 99 +++++----------
> arch/x86/boot/compressed/head_64.S | 165 ++++++++++---------------
> arch/x86/boot/compressed/mkpiggy.c | 6 +
> arch/x86/boot/compressed/vmlinux.lds.S | 24 +++-
> arch/x86/boot/setup.ld | 2 +-
> drivers/firmware/efi/libstub/Makefile | 2 +-
> drivers/firmware/efi/libstub/hidden.h | 6 -
> include/linux/hidden.h | 19 +++
> 9 files changed, 153 insertions(+), 209 deletions(-)
> delete mode 100644 drivers/firmware/efi/libstub/hidden.h
> create mode 100644 include/linux/hidden.h
>
>
> base-commit: aab7ee9f8ff0110bfcd594b33dc33748dc1baf46
> --
> 2.26.2
>


--
Thanks,
~Nick Desaulniers

Ard Biesheuvel

unread,
Jul 18, 2020, 1:45:03 AM7/18/20
to Nick Desaulniers, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Fangrui Song, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Sedat Dilek, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, LKML
That is not the right way to deal with conflicts against -next.

This series targets the -tip tree, and applies fine against it. If you
want to apply it on some other tree and test it, that is fine, and
highly appreciated, but 'refreshing' the series against -next means it
no longer applies to -tip, and may be based on unidentified conflict
resolutions performed by Stephen that the maintainers will have to
deal with.

Boris, Ingo, Thomas,

Mind taking v5 of this series? (With Nick's Tested-by) I think these
patches have been simmering long enough. Do note there is a conflict
against the kbuild tree, but the resolution should be straightforward.

Sedat Dilek

unread,
Jul 18, 2020, 3:01:50 AM7/18/20
to Ard Biesheuvel, Nick Desaulniers, Arvind Sankar, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Fangrui Song, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Kees Cook, Nathan Chancellor, Arnd Bergmann, H . J . Lu, LKML
Agreed with that approach.

v5 misses also my credits - Tested-by for the whole series is sufficient.

- Sedat -

Kees Cook

unread,
Jul 24, 2020, 7:25:24 PM7/24/20
to Ard Biesheuvel, Thomas Gleixner, Borislav Petkov, Nick Desaulniers, Arvind Sankar, Ingo Molnar, H. Peter Anvin, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT), Fangrui Song, Dmitry Golovin, clang-built-linux, Masahiro Yamada, Sedat Dilek, Nathan Chancellor, Arnd Bergmann, H . J . Lu, LKML
I would love that; I need to rebase my orphan series on this too...

--
Kees Cook
It is loading more messages.
0 new messages