[PATCH 00/21] objtool: vmlinux.o and CLANG LTO support

6 views
Skip to first unread message

Josh Poimboeuf

unread,
Jan 14, 2021, 2:40:28 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
Add support for proper vmlinux.o validation, which will be needed for
Sami's upcoming x86 LTO set. (And vmlinux validation is the future for
objtool anyway, for other reasons.)

This isn't 100% done -- most notably, crypto still needs to be supported
-- but I think this gets us most of the way there.

This can also be found at

git://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git objtool-vmlinux

And for more testing it can be combined with Sami's x86 LTO patches:

https://github.com/samitolvanen/linux clang-lto



Josh Poimboeuf (21):
objtool: Fix seg fault in BT_FUNC() with fake jump
objtool: Fix error handling for STD/CLD warnings
objtool: Fix retpoline detection in asm code
objtool: Fix ".cold" section suffix check for newer versions of GCC
objtool: Support retpoline jump detection for vmlinux.o
x86/ftrace: Add UNWIND_HINT_FUNC annotation for ftrace_stub
objtool: Assume only ELF functions do sibling calls
objtool: Add asm version of STACK_FRAME_NON_STANDARD
objtool: Combine UNWIND_HINT_RET_OFFSET and UNWIND_HINT_FUNC
objtool: Add xen_start_kernel() to noreturn list
objtool: Move unsuffixed symbol conversion to a helper function
objtool: Add CONFIG_CFI_CLANG support
x86/xen: Support objtool validation in xen-asm.S
x86/xen: Support objtool vmlinux.o validation in xen-head.S
x86/xen/pvh: Convert indirect jump to retpoline
x86/ftrace: Support objtool vmlinux.o validation in ftrace_64.S
x86/acpi: Convert indirect jump to retpoline
x86/acpi: Support objtool validation in wakeup_64.S
x86/power: Convert indirect jumps to retpolines
x86/power: Move restore_registers() to top of the file
x86/power: Support objtool validation in hibernate_asm_64.S

arch/x86/include/asm/unwind_hints.h | 13 +---
arch/x86/kernel/acpi/Makefile | 1 -
arch/x86/kernel/acpi/wakeup_64.S | 5 +-
arch/x86/kernel/ftrace_64.S | 8 +--
arch/x86/lib/retpoline.S | 2 +-
arch/x86/platform/pvh/head.S | 3 +-
arch/x86/power/Makefile | 1 -
arch/x86/power/hibernate_asm_64.S | 105 ++++++++++++++--------------
arch/x86/xen/Makefile | 1 -
arch/x86/xen/xen-asm.S | 29 +++++---
arch/x86/xen/xen-head.S | 5 +-
include/linux/objtool.h | 13 +++-
tools/include/linux/objtool.h | 13 +++-
tools/objtool/arch/x86/decode.c | 4 +-
tools/objtool/arch/x86/special.c | 2 +-
tools/objtool/check.c | 91 +++++++++++++-----------
tools/objtool/check.h | 12 +++-
tools/objtool/elf.c | 87 +++++++++++++++++------
tools/objtool/elf.h | 2 +-
19 files changed, 241 insertions(+), 156 deletions(-)

--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:40:28 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
Objtool appends a temporary fake jump at the end of alternative
replacement instructions. If the replacement code is empty -- resulting
in patched nops -- the fake jump doesn't have a section. When running
objtool with '--backtrace', the fake jump's missing section can cause
BT_FUNC() to trigger a seg fault when the NULL insn->sec is passed to
offstr().

Fix it by ensuring fake jumps always have a section.

Fixes: 7697eee3ddd7 ("objtool: Add --backtrace support")
Reported-by: Sami Tolvanen <samito...@google.com>
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
tools/objtool/check.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 5f8d3eed78a1..ed26c22c8244 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1017,7 +1017,7 @@ static int handle_group_alt(struct objtool_file *file,
INIT_LIST_HEAD(&fake_jump->stack_ops);
init_cfi_state(&fake_jump->cfi);

- fake_jump->sec = special_alt->new_sec;
+ fake_jump->sec = special_alt->new_sec ? : orig_insn->sec;
fake_jump->offset = FAKE_JUMP_OFFSET;
fake_jump->type = INSN_JUMP_UNCONDITIONAL;
fake_jump->jump_dest = list_next_entry(last_orig_insn, list);
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:40:33 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
Actually return an error (and display a backtrace, if requested) for
directional bit warnings.

Fixes: 2f0f9e9ad7b3 ("objtool: Add Direction Flag validation")
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
tools/objtool/check.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index ed26c22c8244..93c3937d1d6a 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2592,15 +2592,19 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
break;

case INSN_STD:
- if (state.df)
+ if (state.df) {
WARN_FUNC("recursive STD", sec, insn->offset);
+ return 1;
+ }

state.df = true;
break;

case INSN_CLD:
- if (!state.df && func)
+ if (!state.df && func) {
WARN_FUNC("redundant CLD", sec, insn->offset);
+ return 1;
+ }

state.df = false;
break;
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:40:36 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
The JMP_NOSPEC macro branches to __x86_retpoline_*() rather than the
__x86_indirect_thunk_*() wrappers used by C code. Detect jumps to
__x86_retpoline_*() as retpoline dynamic jumps.

Presumably this doesn't trigger a user-visible bug. I only found it
when testing vmlinux.o validation.

Fixes: 39b735332cb8 ("objtool: Detect jumps to retpoline thunks")
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
tools/objtool/arch/x86/special.c | 2 +-
tools/objtool/check.c | 3 ++-
tools/objtool/check.h | 11 +++++++++++
3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/objtool/arch/x86/special.c b/tools/objtool/arch/x86/special.c
index fd4af88c0ea5..151b13d0a267 100644
--- a/tools/objtool/arch/x86/special.c
+++ b/tools/objtool/arch/x86/special.c
@@ -48,7 +48,7 @@ bool arch_support_alt_relocation(struct special_alt *special_alt,
* replacement group.
*/
return insn->offset == special_alt->new_off &&
- (insn->type == INSN_CALL || is_static_jump(insn));
+ (insn->type == INSN_CALL || is_jump(insn));
}

/*
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 93c3937d1d6a..00969eac5f7b 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -789,7 +789,8 @@ static int add_jump_destinations(struct objtool_file *file)
dest_sec = reloc->sym->sec;
dest_off = reloc->sym->sym.st_value +
arch_dest_reloc_offset(reloc->addend);
- } else if (strstr(reloc->sym->name, "_indirect_thunk_")) {
+ } else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21) ||
+ !strncmp(reloc->sym->name, "__x86_retpoline_", 16)) {
/*
* Retpoline jumps are really dynamic jumps in
* disguise, so convert them accordingly.
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index 5ec00a4b891b..2804848e628e 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -54,6 +54,17 @@ static inline bool is_static_jump(struct instruction *insn)
insn->type == INSN_JUMP_UNCONDITIONAL;
}

+static inline bool is_dynamic_jump(struct instruction *insn)
+{
+ return insn->type == INSN_JUMP_DYNAMIC ||
+ insn->type == INSN_JUMP_DYNAMIC_CONDITIONAL;
+}
+
+static inline bool is_jump(struct instruction *insn)
+{
+ return is_static_jump(insn) || is_dynamic_jump(insn);
+}
+
struct instruction *find_insn(struct objtool_file *file,
struct section *sec, unsigned long offset);

--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:40:40 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
With my version of GCC 9.3.1 the ".cold" subfunctions no longer have a
numbered suffix, so the trailing period is no longer there.

Presumably this doesn't yet trigger a user-visible bug since most of the
subfunction detection logic is duplicated. I only found it when
testing vmlinux.o validation.

Fixes: 54262aa28301 ("objtool: Fix sibling call detection")
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
tools/objtool/check.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 00969eac5f7b..12d48db41626 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -850,8 +850,8 @@ static int add_jump_destinations(struct objtool_file *file)
* case where the parent function's only reference to a
* subfunction is through a jump table.
*/
- if (!strstr(insn->func->name, ".cold.") &&
- strstr(insn->jump_dest->func->name, ".cold.")) {
+ if (!strstr(insn->func->name, ".cold") &&
+ strstr(insn->jump_dest->func->name, ".cold")) {
insn->func->cfunc = insn->jump_dest->func;
insn->jump_dest->func->pfunc = insn->func;

--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:40:43 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
Objtool converts direct retpoline jumps to type INSN_JUMP_DYNAMIC, since
that's what they are semantically.

That conversion doesn't work in vmlinux.o validation because the
indirect thunk function is present in the object, so the intra-object
jump check succeeds before the retpoline jump check gets a chance.

Rearrange the checks: check for a retpoline jump before checking for an
intra-object jump.

Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
tools/objtool/check.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 12d48db41626..b1327789d049 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -785,10 +785,6 @@ static int add_jump_destinations(struct objtool_file *file)
} else if (reloc->sym->type == STT_SECTION) {
dest_sec = reloc->sym->sec;
dest_off = arch_dest_reloc_offset(reloc->addend);
- } else if (reloc->sym->sec->idx) {
- dest_sec = reloc->sym->sec;
- dest_off = reloc->sym->sym.st_value +
- arch_dest_reloc_offset(reloc->addend);
} else if (!strncmp(reloc->sym->name, "__x86_indirect_thunk_", 21) ||
!strncmp(reloc->sym->name, "__x86_retpoline_", 16)) {
/*
@@ -802,6 +798,10 @@ static int add_jump_destinations(struct objtool_file *file)

insn->retpoline_safe = true;
continue;
+ } else if (reloc->sym->sec->idx) {
+ dest_sec = reloc->sym->sec;
+ dest_off = reloc->sym->sym.st_value +
+ arch_dest_reloc_offset(reloc->addend);
} else {
/* external sibling call */
insn->call_dest = reloc->sym;
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:40:52 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes, Steven Rostedt
Prevent an unreachable objtool warning after the sibling call detection
gets improved. ftrace_stub() is basically a function, annotate it as
such.

Cc: Steven Rostedt <ros...@goodmis.org>
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
arch/x86/kernel/ftrace_64.S | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 0d54099c2a3a..58d125ed9d1a 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -184,6 +184,7 @@ SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL)
* It is also used to copy the retq for trampolines.
*/
SYM_INNER_LABEL_ALIGN(ftrace_stub, SYM_L_WEAK)
+ UNWIND_HINT_FUNC
retq
SYM_FUNC_END(ftrace_epilogue)

--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:05 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
There's an inconsistency in how sibling calls are detected in
non-function asm code, depending on the scope of the object. If the
target code is external to the object, objtool considers it a sibling
call. If the target code is internal but not a function, objtool
*doesn't* consider it a sibling call.

This can cause some inconsistencies between per-object and vmlinux.o
validation.

Instead, assume only ELF functions can do sibling calls. This generally
matches existing reality, and makes sibling call validation consistent
between vmlinux.o and per-object.

Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
tools/objtool/check.c | 36 ++++++++++++++++++++++--------------
1 file changed, 22 insertions(+), 14 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index b1327789d049..d833855f3aa8 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -111,15 +111,20 @@ static struct instruction *prev_insn_same_sym(struct objtool_file *file,

static bool is_sibling_call(struct instruction *insn)
{
+ /*
+ * Assume only ELF functions can make sibling calls. This ensures
+ * sibling call detection consistency between vmlinux.o and individual
+ * objects.
+ */
+ if (!insn->func)
+ return false;
+
/* An indirect jump is either a sibling call or a jump to a table. */
if (insn->type == INSN_JUMP_DYNAMIC)
return list_empty(&insn->alts);

- if (!is_static_jump(insn))
- return false;
-
/* add_jump_destinations() sets insn->call_dest for sibling calls. */
- return !!insn->call_dest;
+ return (is_static_jump(insn) && insn->call_dest);
}

/*
@@ -778,7 +783,7 @@ static int add_jump_destinations(struct objtool_file *file)
continue;

reloc = find_reloc_by_dest_range(file->elf, insn->sec,
- insn->offset, insn->len);
+ insn->offset, insn->len);
if (!reloc) {
dest_sec = insn->sec;
dest_off = arch_jump_destination(insn);
@@ -798,18 +803,21 @@ static int add_jump_destinations(struct objtool_file *file)

insn->retpoline_safe = true;
continue;
- } else if (reloc->sym->sec->idx) {
- dest_sec = reloc->sym->sec;
- dest_off = reloc->sym->sym.st_value +
- arch_dest_reloc_offset(reloc->addend);
- } else {
- /* external sibling call */
+ } else if (insn->func) {
+ /* internal or external sibling call (with reloc) */
insn->call_dest = reloc->sym;
if (insn->call_dest->static_call_tramp) {
list_add_tail(&insn->static_call_node,
&file->static_call_list);
}
continue;
+ } else if (reloc->sym->sec->idx) {
+ dest_sec = reloc->sym->sec;
+ dest_off = reloc->sym->sym.st_value +
+ arch_dest_reloc_offset(reloc->addend);
+ } else {
+ /* non-func asm code jumping to another file */
+ continue;
}

insn->jump_dest = find_insn(file, dest_sec, dest_off);
@@ -858,7 +866,7 @@ static int add_jump_destinations(struct objtool_file *file)
} else if (insn->jump_dest->func->pfunc != insn->func->pfunc &&
insn->jump_dest->offset == insn->jump_dest->func->offset) {

- /* internal sibling call */
+ /* internal sibling call (without reloc) */
insn->call_dest = insn->jump_dest->func;
if (insn->call_dest->static_call_tramp) {
list_add_tail(&insn->static_call_node,
@@ -2528,7 +2536,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,

case INSN_JUMP_CONDITIONAL:
case INSN_JUMP_UNCONDITIONAL:
- if (func && is_sibling_call(insn)) {
+ if (is_sibling_call(insn)) {
ret = validate_sibling_call(insn, &state);
if (ret)
return ret;
@@ -2550,7 +2558,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,

case INSN_JUMP_DYNAMIC:
case INSN_JUMP_DYNAMIC_CONDITIONAL:
- if (func && is_sibling_call(insn)) {
+ if (is_sibling_call(insn)) {
ret = validate_sibling_call(insn, &state);
if (ret)
return ret;
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:06 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
To be used for adding asm functions to the ignore list. The "aw" is
needed to help the ELF section metadata match GCC-created sections.
Otherwise the linker creates duplicate sections instead of combining
them.

Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
include/linux/objtool.h | 8 ++++++++
tools/include/linux/objtool.h | 8 ++++++++
2 files changed, 16 insertions(+)

diff --git a/include/linux/objtool.h b/include/linux/objtool.h
index 577f51436cf9..add1c6eb157e 100644
--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -109,6 +109,12 @@ struct unwind_hint {
.popsection
.endm

+.macro STACK_FRAME_NON_STANDARD func:req
+ .pushsection .discard.func_stack_frame_non_standard, "aw"
+ .long \func - .
+ .popsection
+.endm
+
#endif /* __ASSEMBLY__ */

#else /* !CONFIG_STACK_VALIDATION */
@@ -122,6 +128,8 @@ struct unwind_hint {
#define ANNOTATE_INTRA_FUNCTION_CALL
.macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
.endm
+.macro STACK_FRAME_NON_STANDARD func:req
+.endm
#endif

#endif /* CONFIG_STACK_VALIDATION */
diff --git a/tools/include/linux/objtool.h b/tools/include/linux/objtool.h
index 577f51436cf9..add1c6eb157e 100644
--- a/tools/include/linux/objtool.h
+++ b/tools/include/linux/objtool.h
@@ -109,6 +109,12 @@ struct unwind_hint {
.popsection
.endm

+.macro STACK_FRAME_NON_STANDARD func:req
+ .pushsection .discard.func_stack_frame_non_standard, "aw"
+ .long \func - .
+ .popsection
+.endm
+
#endif /* __ASSEMBLY__ */

#else /* !CONFIG_STACK_VALIDATION */
@@ -122,6 +128,8 @@ struct unwind_hint {
#define ANNOTATE_INTRA_FUNCTION_CALL
.macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
.endm
+.macro STACK_FRAME_NON_STANDARD func:req
+.endm
#endif

#endif /* CONFIG_STACK_VALIDATION */
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:13 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
The ORC metadata generated for UNWIND_HINT_FUNC isn't actually very
func-like. With certain usages it can cause stack state mismatches
because it doesn't set the return address (CFI_RA).

Also, users of UNWIND_HINT_RET_OFFSET no longer need to set a custom
return stack offset. Instead they just need to specify a func-like
situation, so the current ret_offset code is hacky for no good reason.

Solve both problems by simplifying the RET_OFFSET handling and
converting it into a more useful UNWIND_HINT_FUNC.

If we end up needing the old 'ret_offset' functionality again in the
future, we should be able to support it pretty easily with the addition
of a custom 'sp_offset' in UNWIND_HINT_FUNC.

Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
arch/x86/include/asm/unwind_hints.h | 13 ++--------
arch/x86/kernel/ftrace_64.S | 2 +-
arch/x86/lib/retpoline.S | 2 +-
include/linux/objtool.h | 5 +++-
tools/include/linux/objtool.h | 5 +++-
tools/objtool/arch/x86/decode.c | 4 ++--
tools/objtool/check.c | 37 ++++++++++++-----------------
tools/objtool/check.h | 1 -
8 files changed, 29 insertions(+), 40 deletions(-)

diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h
index 664d4610d700..8e574c0afef8 100644
--- a/arch/x86/include/asm/unwind_hints.h
+++ b/arch/x86/include/asm/unwind_hints.h
@@ -48,17 +48,8 @@
UNWIND_HINT_REGS base=\base offset=\offset partial=1
.endm

-.macro UNWIND_HINT_FUNC sp_offset=8
- UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=\sp_offset type=UNWIND_HINT_TYPE_CALL
-.endm
-
-/*
- * RET_OFFSET: Used on instructions that terminate a function; mostly RETURN
- * and sibling calls. On these, sp_offset denotes the expected offset from
- * initial_func_cfi.
- */
-.macro UNWIND_HINT_RET_OFFSET sp_offset=8
- UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_RET_OFFSET sp_offset=\sp_offset
+.macro UNWIND_HINT_FUNC
+ UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=8 type=UNWIND_HINT_TYPE_FUNC
.endm

#endif /* __ASSEMBLY__ */
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 58d125ed9d1a..1bf568d901b1 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -277,7 +277,7 @@ SYM_INNER_LABEL(ftrace_regs_caller_end, SYM_L_GLOBAL)
restore_mcount_regs 8
/* Restore flags */
popfq
- UNWIND_HINT_RET_OFFSET
+ UNWIND_HINT_FUNC
jmp ftrace_epilogue

SYM_FUNC_END(ftrace_regs_caller)
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index b4c43a9b1483..f6fb1d218dcc 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -28,7 +28,7 @@ SYM_FUNC_START_NOALIGN(__x86_retpoline_\reg)
jmp .Lspec_trap_\@
.Ldo_rop_\@:
mov %\reg, (%_ASM_SP)
- UNWIND_HINT_RET_OFFSET
+ UNWIND_HINT_FUNC
ret
SYM_FUNC_END(__x86_retpoline_\reg)

diff --git a/include/linux/objtool.h b/include/linux/objtool.h
index add1c6eb157e..7e72d975cb76 100644
--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -29,11 +29,14 @@ struct unwind_hint {
*
* UNWIND_HINT_TYPE_REGS_PARTIAL: Used in entry code to indicate that
* sp_reg+sp_offset points to the iret return frame.
+ *
+ * UNWIND_HINT_FUNC: Generate the unwind metadata of a callable function.
+ * Useful for code which doesn't have an ELF function annotation.
*/
#define UNWIND_HINT_TYPE_CALL 0
#define UNWIND_HINT_TYPE_REGS 1
#define UNWIND_HINT_TYPE_REGS_PARTIAL 2
-#define UNWIND_HINT_TYPE_RET_OFFSET 3
+#define UNWIND_HINT_TYPE_FUNC 3

#ifdef CONFIG_STACK_VALIDATION

diff --git a/tools/include/linux/objtool.h b/tools/include/linux/objtool.h
index add1c6eb157e..7e72d975cb76 100644
--- a/tools/include/linux/objtool.h
+++ b/tools/include/linux/objtool.h
@@ -29,11 +29,14 @@ struct unwind_hint {
*
* UNWIND_HINT_TYPE_REGS_PARTIAL: Used in entry code to indicate that
* sp_reg+sp_offset points to the iret return frame.
+ *
+ * UNWIND_HINT_FUNC: Generate the unwind metadata of a callable function.
+ * Useful for code which doesn't have an ELF function annotation.
*/
#define UNWIND_HINT_TYPE_CALL 0
#define UNWIND_HINT_TYPE_REGS 1
#define UNWIND_HINT_TYPE_REGS_PARTIAL 2
-#define UNWIND_HINT_TYPE_RET_OFFSET 3
+#define UNWIND_HINT_TYPE_FUNC 3

#ifdef CONFIG_STACK_VALIDATION

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index cde9c36e40ae..6f5a45754ad8 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -563,8 +563,8 @@ void arch_initial_func_cfi_state(struct cfi_init_state *state)
state->cfa.offset = 8;

/* initial RA (return address) */
- state->regs[16].base = CFI_CFA;
- state->regs[16].offset = -8;
+ state->regs[CFI_RA].base = CFI_CFA;
+ state->regs[CFI_RA].offset = -8;
}

const char *arch_nop_insn(int len)
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index d833855f3aa8..6636f4fd694a 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -1400,13 +1400,20 @@ static int add_jump_table_alts(struct objtool_file *file)
return 0;
}

+static void set_func_state(struct cfi_state *state)
+{
+ state->cfa = initial_func_cfi.cfa;
+ memcpy(&state->regs, &initial_func_cfi.regs,
+ CFI_NUM_REGS * sizeof(struct cfi_reg));
+ state->stack_size = initial_func_cfi.cfa.offset;
+}
+
static int read_unwind_hints(struct objtool_file *file)
{
struct section *sec, *relocsec;
struct reloc *reloc;
struct unwind_hint *hint;
struct instruction *insn;
- struct cfi_reg *cfa;
int i;

sec = find_section_by_name(file->elf, ".discard.unwind_hints");
@@ -1441,22 +1448,20 @@ static int read_unwind_hints(struct objtool_file *file)
return -1;
}

- cfa = &insn->cfi.cfa;
+ insn->hint = true;

- if (hint->type == UNWIND_HINT_TYPE_RET_OFFSET) {
- insn->ret_offset = hint->sp_offset;
+ if (hint->type == UNWIND_HINT_TYPE_FUNC) {
+ set_func_state(&insn->cfi);
continue;
}

- insn->hint = true;
-
if (arch_decode_hint_reg(insn, hint->sp_reg)) {
WARN_FUNC("unsupported unwind_hint sp base reg %d",
insn->sec, insn->offset, hint->sp_reg);
return -1;
}

- cfa->offset = hint->sp_offset;
+ insn->cfi.cfa.offset = hint->sp_offset;
insn->cfi.type = hint->type;
insn->cfi.end = hint->end;
}
@@ -1712,27 +1717,18 @@ static bool is_fentry_call(struct instruction *insn)

static bool has_modified_stack_frame(struct instruction *insn, struct insn_state *state)
{
- u8 ret_offset = insn->ret_offset;
struct cfi_state *cfi = &state->cfi;
int i;

if (cfi->cfa.base != initial_func_cfi.cfa.base || cfi->drap)
return true;

- if (cfi->cfa.offset != initial_func_cfi.cfa.offset + ret_offset)
+ if (cfi->cfa.offset != initial_func_cfi.cfa.offset)
return true;

- if (cfi->stack_size != initial_func_cfi.cfa.offset + ret_offset)
+ if (cfi->stack_size != initial_func_cfi.cfa.offset)
return true;

- /*
- * If there is a ret offset hint then don't check registers
- * because a callee-saved register might have been pushed on
- * the stack.
- */
- if (ret_offset)
- return false;
-
for (i = 0; i < CFI_NUM_REGS; i++) {
if (cfi->regs[i].base != initial_func_cfi.regs[i].base ||
cfi->regs[i].offset != initial_func_cfi.regs[i].offset)
@@ -2824,10 +2820,7 @@ static int validate_section(struct objtool_file *file, struct section *sec)
continue;

init_insn_state(&state, sec);
- state.cfi.cfa = initial_func_cfi.cfa;
- memcpy(&state.cfi.regs, &initial_func_cfi.regs,
- CFI_NUM_REGS * sizeof(struct cfi_reg));
- state.cfi.stack_size = initial_func_cfi.cfa.offset;
+ set_func_state(&state.cfi);

warnings += validate_symbol(file, sec, func, &state);
}
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index 2804848e628e..4615ca210fae 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -33,7 +33,6 @@ struct instruction {
bool retpoline_safe;
s8 instr;
u8 visited;
- u8 ret_offset;
int alt_group;
struct symbol *call_dest;
struct instruction *jump_dest;
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:17 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
xen_start_kernel() doesn't return. Annotate it as such so objtool can
follow the code flow.

Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
tools/objtool/check.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 6636f4fd694a..a430eaacd7aa 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -161,6 +161,7 @@ static bool __dead_end_function(struct objtool_file *file, struct symbol *func,
"machine_real_restart",
"rewind_stack_do_exit",
"kunit_try_catch_throw",
+ "xen_start_kernel",
};

if (!func)
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:19 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
This logic will also be needed for the CONFIG_CFI_CLANG support.

Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
tools/objtool/elf.c | 59 ++++++++++++++++++++++++++++-----------------
1 file changed, 37 insertions(+), 22 deletions(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index be89c741ba9a..292f015f7ec6 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -262,6 +262,38 @@ struct reloc *find_reloc_by_dest(const struct elf *elf, struct section *sec, uns
return find_reloc_by_dest_range(elf, sec, offset, 1);
}

+static int find_unsuffixed_func(const struct elf *elf, struct symbol *sym,
+ const char *suffix, struct symbol **func)
+{
+ char name[MAX_NAME_LEN + 1];
+ const char *loc;
+ size_t len;
+
+ *func = NULL;
+
+ loc = strstr(sym->name, suffix);
+ if (!loc)
+ return 0;
+
+ len = loc - sym->name;
+ if (len > MAX_NAME_LEN) {
+ WARN("%s(): unsuffixed function name exceeds maximum length of %d characters",
+ sym->name, MAX_NAME_LEN);
+ return -1;
+ }
+
+ strncpy(name, sym->name, len);
+ name[len] = '\0';
+
+ *func = find_symbol_by_name(elf, name);
+ if (!*func || (*func)->type != STT_FUNC) {
+ WARN("%s(): can't find unsuffixed function", sym->name);
+ return -1;
+ }
+
+ return 0;
+}
+
void insn_to_reloc_sym_addend(struct section *sec, unsigned long offset,
struct reloc *reloc)
{
@@ -456,37 +488,20 @@ static int read_symbols(struct elf *elf)
/* Create parent/child links for any cold subfunctions */
list_for_each_entry(sec, &elf->sections, list) {
list_for_each_entry(sym, &sec->symbol_list, list) {
- char pname[MAX_NAME_LEN + 1];
- size_t pnamelen;
if (sym->type != STT_FUNC)
continue;

- if (sym->pfunc == NULL)
+ if (!sym->pfunc)
sym->pfunc = sym;

- if (sym->cfunc == NULL)
+ if (!sym->cfunc)
sym->cfunc = sym;

- coldstr = strstr(sym->name, ".cold");
- if (!coldstr)
- continue;
-
- pnamelen = coldstr - sym->name;
- if (pnamelen > MAX_NAME_LEN) {
- WARN("%s(): parent function name exceeds maximum length of %d characters",
- sym->name, MAX_NAME_LEN);
+ if (find_unsuffixed_func(elf, sym, ".cold", &pfunc))
return -1;
- }

- strncpy(pname, sym->name, pnamelen);
- pname[pnamelen] = '\0';
- pfunc = find_symbol_by_name(elf, pname);
-
- if (!pfunc) {
- WARN("%s(): can't find parent function",
- sym->name);
- return -1;
- }
+ if (!pfunc)
+ continue;

sym->pfunc = pfunc;
pfunc->cfunc = sym;
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:25 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
The upcoming CONFIG_CFI_CLANG support uses -fsanitize=cfi, the
non-canonical version of which hijacks function entry by changing
function relocation references to point to an intermediary jump table.

For example:

Relocation section '.rela.discard.func_stack_frame_non_standard' at offset 0x37e018 contains 6 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000000 0002944700000002 R_X86_64_PC32 00000000000023f0 do_suspend_lowlevel + 0
0000000000000008 0003c11900000001 R_X86_64_64 0000000000000008 xen_cpuid$e69bc59f4fade3b6f2b579b3934137df.cfi_jt + 0
0000000000000010 0003980900000001 R_X86_64_64 0000000000000060 machine_real_restart.cfi_jt + 0
0000000000000018 0003962b00000001 R_X86_64_64 0000000000000e18 kretprobe_trampoline.cfi_jt + 0
0000000000000020 000028f300000001 R_X86_64_64 0000000000000000 .rodata + 12
0000000000000028 000349f400000001 R_X86_64_64 0000000000000018 __crash_kexec.cfi_jt + 0

0000000000000060 <machine_real_restart.cfi_jt>:
60: e9 00 00 00 00 jmpq 65 <machine_real_restart.cfi_jt+0x5>
61: R_X86_64_PLT32 machine_real_restart-0x4
65: cc int3
66: cc int3
67: cc int3

This breaks objtool vmlinux validation in many ways, including static
call site detection and the STACK_FRAME_NON_STANDARD() macro.

Fix it by converting those relocations' symbol references back to their
original non-jump-table versions. Note this doesn't change the actual
relocations in the object itself, it just changes objtool's view of
them.

Reported-by: Sedat Dilek <sedat...@gmail.com>
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
tools/objtool/elf.c | 28 ++++++++++++++++++++++++++++
tools/objtool/elf.h | 2 +-
2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 292f015f7ec6..e357dc34cd7a 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -382,6 +382,11 @@ static int read_sections(struct elf *elf)
}
sec->len = sec->sh.sh_size;

+ /* Detect -fsanitize=cfi related sections */
+ if (!strcmp(sec->name, ".text.__cfi_check") ||
+ !strncmp(sec->name, ".text..L.cfi.jumptable", 22))
+ sec->cfi_jt = true;
+
list_add_tail(&sec->list, &elf->sections);
elf_hash_add(elf->section_hash, &sec->hash, sec->idx);
elf_hash_add(elf->section_name_hash, &sec->name_hash, str_hash(sec->name));
@@ -614,6 +619,29 @@ static int read_relocs(struct elf *elf)
return -1;
}

+ /*
+ * Deal with -fsanitize=cfi (CONFIG_CFI_CLANG), which
+ * hijacks function entry by arbitrarily changing a lot
+ * of relocation symbol references to refer to an
+ * intermediate jump table. Undo that conversion so
+ * objtool can make sense of things.
+ */
+ if (reloc->sym->sec->cfi_jt) {
+ struct symbol *func, *sym;
+
+ if (sym->type == STT_SECTION)
+ sym = find_func_by_offset(sym->sec,
+ reloc->addend);
+ else
+ sym = reloc->sym;
+
+ if (find_unsuffixed_func(elf, sym, ".cfi_jt", &func))
+ return -1;
+
+ if (func)
+ reloc->sym = func;
+ }
+
elf_add_reloc(elf, reloc);
nr_reloc++;
}
diff --git a/tools/objtool/elf.h b/tools/objtool/elf.h
index e6890cc70a25..bcc524d73f51 100644
--- a/tools/objtool/elf.h
+++ b/tools/objtool/elf.h
@@ -39,7 +39,7 @@ struct section {
char *name;
int idx;
unsigned int len;
- bool changed, text, rodata, noinstr;
+ bool changed, text, rodata, noinstr, cfi_jt;
};

struct symbol {
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:27 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes, Boris Ostrovsky, Juergen Gross
The OBJECT_FILES_NON_STANDARD annotation is used to tell objtool to
ignore a file. File-level ignores won't work when validating vmlinux.o.

Tweak the ELF metadata and unwind hints to allow objtool to follow the
code.

Cc: Boris Ostrovsky <boris.o...@oracle.com>
Cc: Juergen Gross <jgr...@suse.com>
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
arch/x86/xen/Makefile | 1 -
arch/x86/xen/xen-asm.S | 29 +++++++++++++++++++----------
2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index fc5c5ba4aacb..40b5779fce21 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -1,5 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
-OBJECT_FILES_NON_STANDARD_xen-asm.o := y

ifdef CONFIG_FUNCTION_TRACER
# Do not profile debug and lowlevel utilities
diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
index 1cb0e84b9161..a05e80b552c0 100644
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -14,6 +14,7 @@
#include <asm/thread_info.h>
#include <asm/asm.h>
#include <asm/frame.h>
+#include <asm/unwind_hints.h>

#include <xen/interface/xen.h>

@@ -146,6 +147,7 @@ SYM_FUNC_END(xen_read_cr2_direct);

.macro xen_pv_trap name
SYM_CODE_START(xen_\name)
+ UNWIND_HINT_EMPTY
pop %rcx
pop %r11
jmp \name
@@ -184,6 +186,7 @@ xen_pv_trap asm_exc_xen_hypervisor_callback
SYM_CODE_START(xen_early_idt_handler_array)
i = 0
.rept NUM_EXCEPTION_VECTORS
+ UNWIND_HINT_EMPTY
pop %rcx
pop %r11
jmp early_idt_handler_array + i*EARLY_IDT_HANDLER_SIZE
@@ -210,11 +213,13 @@ hypercall_iret = hypercall_page + __HYPERVISOR_iret * 32
* rsp->rax }
*/
SYM_CODE_START(xen_iret)
+ UNWIND_HINT_EMPTY
pushq $0
jmp hypercall_iret
SYM_CODE_END(xen_iret)

SYM_CODE_START(xen_sysret64)
+ UNWIND_HINT_EMPTY
/*
* We're already on the usermode stack at this point, but
* still with the kernel gs, so we can easily switch back.
@@ -250,7 +255,8 @@ SYM_CODE_END(xen_sysret64)
*/

/* Normal 64-bit system call target */
-SYM_FUNC_START(xen_syscall_target)
+SYM_CODE_START(xen_syscall_target)
+ UNWIND_HINT_EMPTY
popq %rcx
popq %r11

@@ -263,12 +269,13 @@ SYM_FUNC_START(xen_syscall_target)
movq $__USER_CS, 1*8(%rsp)

jmp entry_SYSCALL_64_after_hwframe
-SYM_FUNC_END(xen_syscall_target)
+SYM_CODE_END(xen_syscall_target)

#ifdef CONFIG_IA32_EMULATION

/* 32-bit compat syscall target */
-SYM_FUNC_START(xen_syscall32_target)
+SYM_CODE_START(xen_syscall32_target)
+ UNWIND_HINT_EMPTY
popq %rcx
popq %r11

@@ -281,10 +288,11 @@ SYM_FUNC_START(xen_syscall32_target)
movq $__USER32_CS, 1*8(%rsp)

jmp entry_SYSCALL_compat_after_hwframe
-SYM_FUNC_END(xen_syscall32_target)
+SYM_CODE_END(xen_syscall32_target)

/* 32-bit compat sysenter target */
-SYM_FUNC_START(xen_sysenter_target)
+SYM_CODE_START(xen_sysenter_target)
+ UNWIND_HINT_EMPTY
/*
* NB: Xen is polite and clears TF from EFLAGS for us. This means
* that we don't need to guard against single step exceptions here.
@@ -301,17 +309,18 @@ SYM_FUNC_START(xen_sysenter_target)
movq $__USER32_CS, 1*8(%rsp)

jmp entry_SYSENTER_compat_after_hwframe
-SYM_FUNC_END(xen_sysenter_target)
+SYM_CODE_END(xen_sysenter_target)

#else /* !CONFIG_IA32_EMULATION */

-SYM_FUNC_START_ALIAS(xen_syscall32_target)
-SYM_FUNC_START(xen_sysenter_target)
+SYM_CODE_START(xen_syscall32_target)
+SYM_CODE_START(xen_sysenter_target)
+ UNWIND_HINT_EMPTY
lea 16(%rsp), %rsp /* strip %rcx, %r11 */
mov $-ENOSYS, %rax
pushq $0
jmp hypercall_iret
-SYM_FUNC_END(xen_sysenter_target)
-SYM_FUNC_END_ALIAS(xen_syscall32_target)
+SYM_CODE_END(xen_sysenter_target)
+SYM_CODE_END(xen_syscall32_target)

#endif /* CONFIG_IA32_EMULATION */
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:29 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes, Boris Ostrovsky, Juergen Gross
The Xen hypercall page is filled with zeros, causing objtool to fall
through all the empty hypercall functions until it reaches a real
function, resulting in a stack state mismatch.

The build-time contents of the hypercall page don't matter, since it
gets mapped to the hypervisor. Make it more palatable to objtool by
making each hypervisor function a true empty function, with nops and a
return.

Cc: Boris Ostrovsky <boris.o...@oracle.com>
Cc: Juergen Gross <jgr...@suse.com>
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
arch/x86/xen/xen-head.S | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 2d7c8f34f56c..cb6538ae2fe0 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -68,8 +68,9 @@ SYM_CODE_END(asm_cpu_bringup_and_idle)
.balign PAGE_SIZE
SYM_CODE_START(hypercall_page)
.rept (PAGE_SIZE / 32)
- UNWIND_HINT_EMPTY
- .skip 32
+ UNWIND_HINT_FUNC
+ .skip 31, 0x90
+ ret
.endr

#define HYPERCALL(n) \
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:35 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes, Boris Ostrovsky, Juergen Gross
It's kernel policy to not have (unannotated) indirect jumps because of
Spectre v2. This one's probably harmless, but better safe than sorry.
Convert it to a retpoline.

Cc: Boris Ostrovsky <boris.o...@oracle.com>
Cc: Juergen Gross <jgr...@suse.com>
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
arch/x86/platform/pvh/head.S | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index 43b4d864817e..d87cebd08d32 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -16,6 +16,7 @@
#include <asm/boot.h>
#include <asm/processor-flags.h>
#include <asm/msr.h>
+#include <asm/nospec-branch.h>
#include <xen/interface/elfnote.h>

__HEAD
@@ -105,7 +106,7 @@ SYM_CODE_START_LOCAL(pvh_start_xen)
/* startup_64 expects boot_params in %rsi. */
mov $_pa(pvh_bootparams), %rsi
mov $_pa(startup_64), %rax
- jmp *%rax
+ JMP_NOSPEC rax

#else /* CONFIG_X86_64 */

--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:36 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes, Steven Rostedt
With objtool vmlinux.o validation of return_to_handler(), now that
objtool has visibility inside the retpoline, jumping from EMPTY state to
a proper function state results in a stack state mismatch.

return_to_handler() is actually quite normal despite the underlying
magic. Just annotate it as a normal function.

Cc: Steven Rostedt <ros...@goodmis.org>
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
arch/x86/kernel/ftrace_64.S | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 1bf568d901b1..7c273846c687 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -334,8 +334,7 @@ SYM_FUNC_START(ftrace_graph_caller)
retq
SYM_FUNC_END(ftrace_graph_caller)

-SYM_CODE_START(return_to_handler)
- UNWIND_HINT_EMPTY
+SYM_FUNC_START(return_to_handler)
subq $24, %rsp

/* Save the return values */
@@ -350,5 +349,5 @@ SYM_CODE_START(return_to_handler)
movq (%rsp), %rax
addq $24, %rsp
JMP_NOSPEC rdi
-SYM_CODE_END(return_to_handler)
+SYM_FUNC_END(return_to_handler)
#endif
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:44 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes, Rafael J. Wysocki, Len Brown, Pavel Machek
It's kernel policy to not have (unannotated) indirect jumps because of
Spectre v2. This one's probably harmless, but better safe than sorry.
Convert it to a retpoline.

Cc: "Rafael J. Wysocki" <r...@rjwysocki.net>
Cc: Len Brown <len....@intel.com>
Cc: Pavel Machek <pa...@ucw.cz>
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
arch/x86/kernel/acpi/wakeup_64.S | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S
index 5d3a0b8fd379..0b371580e620 100644
--- a/arch/x86/kernel/acpi/wakeup_64.S
+++ b/arch/x86/kernel/acpi/wakeup_64.S
@@ -7,6 +7,7 @@
#include <asm/msr.h>
#include <asm/asm-offsets.h>
#include <asm/frame.h>
+#include <asm/nospec-branch.h>

# Copyright 2003 Pavel Machek <pa...@suse.cz

@@ -39,7 +40,7 @@ SYM_FUNC_START(wakeup_long64)
movq saved_rbp, %rbp

movq saved_rip, %rax
- jmp *%rax
+ JMP_NOSPEC rax
SYM_FUNC_END(wakeup_long64)

SYM_FUNC_START(do_suspend_lowlevel)
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:50 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes, Rafael J. Wysocki, Len Brown, Pavel Machek
The OBJECT_FILES_NON_STANDARD annotation is used to tell objtool to
ignore a file. File-level ignores won't work when validating vmlinux.o.

Instead, tell objtool to ignore do_suspend_lowlevel() directly with the
STACK_FRAME_NON_STANDARD annotation.

Cc: "Rafael J. Wysocki" <r...@rjwysocki.net>
Cc: Len Brown <len....@intel.com>
Cc: Pavel Machek <pa...@ucw.cz>
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
arch/x86/kernel/acpi/Makefile | 1 -
arch/x86/kernel/acpi/wakeup_64.S | 2 ++
2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/acpi/Makefile b/arch/x86/kernel/acpi/Makefile
index f1bb57b0e41e..cf340d85946a 100644
--- a/arch/x86/kernel/acpi/Makefile
+++ b/arch/x86/kernel/acpi/Makefile
@@ -1,5 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
-OBJECT_FILES_NON_STANDARD_wakeup_$(BITS).o := y

obj-$(CONFIG_ACPI) += boot.o
obj-$(CONFIG_ACPI_SLEEP) += sleep.o wakeup_$(BITS).o
diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S
index 0b371580e620..cc6846b35cab 100644
--- a/arch/x86/kernel/acpi/wakeup_64.S
+++ b/arch/x86/kernel/acpi/wakeup_64.S
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0-only */
.text
#include <linux/linkage.h>
+#include <linux/objtool.h>
#include <asm/segment.h>
#include <asm/pgtable_types.h>
#include <asm/page_types.h>
@@ -127,6 +128,7 @@ SYM_FUNC_START(do_suspend_lowlevel)
FRAME_END
jmp restore_processor_state
SYM_FUNC_END(do_suspend_lowlevel)
+STACK_FRAME_NON_STANDARD do_suspend_lowlevel

.data
saved_rbp: .quad 0
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:51 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes, Rafael J. Wysocki, Pavel Machek
It's kernel policy to not have (unannotated) indirect jumps because of
Spectre v2. These are probably harmless, but better safe than sorry.
Convert them to retpolines.

Cc: "Rafael J. Wysocki" <r...@rjwysocki.net>
Cc: Pavel Machek <pa...@ucw.cz>
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
arch/x86/power/hibernate_asm_64.S | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S
index 7918b8415f13..24d971911c9d 100644
--- a/arch/x86/power/hibernate_asm_64.S
+++ b/arch/x86/power/hibernate_asm_64.S
@@ -21,6 +21,7 @@
#include <asm/asm-offsets.h>
#include <asm/processor-flags.h>
#include <asm/frame.h>
+#include <asm/nospec-branch.h>

SYM_FUNC_START(swsusp_arch_suspend)
movq $saved_context, %rax
@@ -66,7 +67,7 @@ SYM_CODE_START(restore_image)

/* jump to relocated restore code */
movq relocated_restore_code(%rip), %rcx
- jmpq *%rcx
+ JMP_NOSPEC rcx
SYM_CODE_END(restore_image)

/* code below has been relocated to a safe page */
@@ -97,7 +98,7 @@ SYM_CODE_START(core_restore_code)

.Ldone:
/* jump to the restore_registers address from the image header */
- jmpq *%r8
+ JMP_NOSPEC r8
SYM_CODE_END(core_restore_code)

/* code below belongs to the image kernel */
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:41:57 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes, Rafael J. Wysocki, Pavel Machek
Because restore_registers() is page-aligned, the assembler inexplicably
adds an unreachable jump from after the end of the previous function to
the beginning of restore_registers().

That confuses objtool, understandably. It also creates significant text
fragmentation. As a result, most of the object file is wasted text
(nops).

Move restore_registers() to the beginning of the file to both prevent
the text fragmentation and avoid the dead jump instruction.

$ size /tmp/hibernate_asm_64.before.o /tmp/hibernate_asm_64.after.o
text data bss dec hex filename
4415 0 0 4415 113f /tmp/hibernate_asm_64.before.o
524 0 0 524 20c /tmp/hibernate_asm_64.after.o

Cc: "Rafael J. Wysocki" <r...@rjwysocki.net>
Cc: Pavel Machek <pa...@ucw.cz>
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
arch/x86/power/hibernate_asm_64.S | 92 +++++++++++++++----------------
1 file changed, 46 insertions(+), 46 deletions(-)

diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S
index 24d971911c9d..4ca6d68b0293 100644
--- a/arch/x86/power/hibernate_asm_64.S
+++ b/arch/x86/power/hibernate_asm_64.S
@@ -23,6 +23,52 @@
#include <asm/frame.h>
#include <asm/nospec-branch.h>

+ /* code below belongs to the image kernel */
+ .align PAGE_SIZE
+SYM_FUNC_START(restore_registers)
+ /* go back to the original page tables */
+ movq %r9, %cr3
+
+ /* Flush TLB, including "global" things (vmalloc) */
+ movq mmu_cr4_features(%rip), %rax
+ movq %rax, %rdx
+ andq $~(X86_CR4_PGE), %rdx
+ movq %rdx, %cr4; # turn off PGE
+ movq %cr3, %rcx; # flush TLB
+ movq %rcx, %cr3
+ movq %rax, %cr4; # turn PGE back on
+
+ /* We don't restore %rax, it must be 0 anyway */
+ movq $saved_context, %rax
+ movq pt_regs_sp(%rax), %rsp
+ movq pt_regs_bp(%rax), %rbp
+ movq pt_regs_si(%rax), %rsi
+ movq pt_regs_di(%rax), %rdi
+ movq pt_regs_bx(%rax), %rbx
+ movq pt_regs_cx(%rax), %rcx
+ movq pt_regs_dx(%rax), %rdx
+ movq pt_regs_r8(%rax), %r8
+ movq pt_regs_r9(%rax), %r9
+ movq pt_regs_r10(%rax), %r10
+ movq pt_regs_r11(%rax), %r11
+ movq pt_regs_r12(%rax), %r12
+ movq pt_regs_r13(%rax), %r13
+ movq pt_regs_r14(%rax), %r14
+ movq pt_regs_r15(%rax), %r15
+ pushq pt_regs_flags(%rax)
+ popfq
+
+ /* Saved in save_processor_state. */
+ lgdt saved_context_gdt_desc(%rax)
+
+ xorl %eax, %eax
+
+ /* tell the hibernation core that we've just restored the memory */
+ movq %rax, in_suspend(%rip)
+
+ ret
+SYM_FUNC_END(restore_registers)
+
SYM_FUNC_START(swsusp_arch_suspend)
movq $saved_context, %rax
movq %rsp, pt_regs_sp(%rax)
@@ -100,49 +146,3 @@ SYM_CODE_START(core_restore_code)
/* jump to the restore_registers address from the image header */
JMP_NOSPEC r8
SYM_CODE_END(core_restore_code)
-
- /* code below belongs to the image kernel */
- .align PAGE_SIZE
-SYM_FUNC_START(restore_registers)
- /* go back to the original page tables */
- movq %r9, %cr3
-
- /* Flush TLB, including "global" things (vmalloc) */
- movq mmu_cr4_features(%rip), %rax
- movq %rax, %rdx
- andq $~(X86_CR4_PGE), %rdx
- movq %rdx, %cr4; # turn off PGE
- movq %cr3, %rcx; # flush TLB
- movq %rcx, %cr3
- movq %rax, %cr4; # turn PGE back on
-
- /* We don't restore %rax, it must be 0 anyway */
- movq $saved_context, %rax
- movq pt_regs_sp(%rax), %rsp
- movq pt_regs_bp(%rax), %rbp
- movq pt_regs_si(%rax), %rsi
- movq pt_regs_di(%rax), %rdi
- movq pt_regs_bx(%rax), %rbx
- movq pt_regs_cx(%rax), %rcx
- movq pt_regs_dx(%rax), %rdx
- movq pt_regs_r8(%rax), %r8
- movq pt_regs_r9(%rax), %r9
- movq pt_regs_r10(%rax), %r10
- movq pt_regs_r11(%rax), %r11
- movq pt_regs_r12(%rax), %r12
- movq pt_regs_r13(%rax), %r13
- movq pt_regs_r14(%rax), %r14
- movq pt_regs_r15(%rax), %r15
- pushq pt_regs_flags(%rax)
- popfq
-
- /* Saved in save_processor_state. */
- lgdt saved_context_gdt_desc(%rax)
-
- xorl %eax, %eax
-
- /* tell the hibernation core that we've just restored the memory */
- movq %rax, in_suspend(%rip)
-
- ret
-SYM_FUNC_END(restore_registers)
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 2:42:00 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes, Rafael J. Wysocki, Pavel Machek
The OBJECT_FILES_NON_STANDARD annotation is used to tell objtool to
ignore a file. File-level ignores won't work when validating vmlinux.o.

Instead, convert restore_image() and core_restore_code() to be ELF
functions. Their code is conventional enough for objtool to be able to
understand them.

Cc: "Rafael J. Wysocki" <r...@rjwysocki.net>
Cc: Pavel Machek <pa...@ucw.cz>
Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
arch/x86/power/Makefile | 1 -
arch/x86/power/hibernate_asm_64.S | 8 ++++----
2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/power/Makefile b/arch/x86/power/Makefile
index 6907b523e856..3ff80156f21a 100644
--- a/arch/x86/power/Makefile
+++ b/arch/x86/power/Makefile
@@ -1,5 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
-OBJECT_FILES_NON_STANDARD_hibernate_asm_$(BITS).o := y

# __restore_processor_state() restores %gs after S3 resume and so should not
# itself be stack-protected
diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S
index 4ca6d68b0293..fce9ed03e939 100644
--- a/arch/x86/power/hibernate_asm_64.S
+++ b/arch/x86/power/hibernate_asm_64.S
@@ -99,7 +99,7 @@ SYM_FUNC_START(swsusp_arch_suspend)
ret
SYM_FUNC_END(swsusp_arch_suspend)

-SYM_CODE_START(restore_image)
+SYM_FUNC_START(restore_image)
/* prepare to jump to the image kernel */
movq restore_jump_address(%rip), %r8
movq restore_cr3(%rip), %r9
@@ -114,10 +114,10 @@ SYM_CODE_START(restore_image)
/* jump to relocated restore code */
movq relocated_restore_code(%rip), %rcx
JMP_NOSPEC rcx
-SYM_CODE_END(restore_image)
+SYM_FUNC_END(restore_image)

/* code below has been relocated to a safe page */
-SYM_CODE_START(core_restore_code)
+SYM_FUNC_START(core_restore_code)
/* switch to temporary page tables */
movq %rax, %cr3
/* flush TLB */
@@ -145,4 +145,4 @@ SYM_CODE_START(core_restore_code)
.Ldone:
/* jump to the restore_registers address from the image header */
JMP_NOSPEC r8
-SYM_CODE_END(core_restore_code)
+SYM_FUNC_END(core_restore_code)
--
2.29.2

Josh Poimboeuf

unread,
Jan 14, 2021, 3:04:12 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
On Thu, Jan 14, 2021 at 01:39:57PM -0600, Josh Poimboeuf wrote:
> Objtool appends a temporary fake jump at the end of alternative
> replacement instructions. If the replacement code is empty -- resulting
> in patched nops -- the fake jump doesn't have a section. When running
> objtool with '--backtrace', the fake jump's missing section can cause
> BT_FUNC() to trigger a seg fault when the NULL insn->sec is passed to
> offstr().
>
> Fix it by ensuring fake jumps always have a section.
>
> Fixes: 7697eee3ddd7 ("objtool: Add --backtrace support")
> Reported-by: Sami Tolvanen <samito...@google.com>
> Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>

This patch will probably end up getting dropped because fake jumps are
going away \o/

--
Josh

Steven Rostedt

unread,
Jan 14, 2021, 3:42:25 PM1/14/21
to Josh Poimboeuf, x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
On Thu, 14 Jan 2021 13:40:02 -0600
Josh Poimboeuf <jpoi...@redhat.com> wrote:

> Prevent an unreachable objtool warning after the sibling call detection
> gets improved. ftrace_stub() is basically a function, annotate it as
> such.
>
> Cc: Steven Rostedt <ros...@goodmis.org>

Acked-by: Steven Rostedt (VMware) <ros...@goodmis.org>

-- Steve

Steven Rostedt

unread,
Jan 14, 2021, 3:42:53 PM1/14/21
to Josh Poimboeuf, x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
On Thu, 14 Jan 2021 13:40:12 -0600
Josh Poimboeuf <jpoi...@redhat.com> wrote:

> With objtool vmlinux.o validation of return_to_handler(), now that
> objtool has visibility inside the retpoline, jumping from EMPTY state to
> a proper function state results in a stack state mismatch.
>
> return_to_handler() is actually quite normal despite the underlying
> magic. Just annotate it as a normal function.

If you say so ;-)

Acked-by: Steven Rostedt (VMware) <ros...@goodmis.org>

-- Steve

>

Sami Tolvanen

unread,
Jan 14, 2021, 3:49:41 PM1/14/21
to Josh Poimboeuf, X86 ML, LKML, Peter Zijlstra, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-built-linux, Miroslav Benes
Hi Josh,
Clang points out that sym is uninitialized here. Should these be
reloc->sym instead?

Sami

Josh Poimboeuf

unread,
Jan 14, 2021, 3:52:25 PM1/14/21
to Sami Tolvanen, X86 ML, LKML, Peter Zijlstra, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-built-linux, Miroslav Benes
Yup, I somehow managed to bork this patch (and the one before it) right
before I sent it. Will send updated versions of these two.

--
Josh

Josh Poimboeuf

unread,
Jan 14, 2021, 3:55:58 PM1/14/21
to x...@kernel.org, linux-...@vger.kernel.org, Peter Zijlstra, Sami Tolvanen, Sedat Dilek, Kees Cook, Nick Desaulniers, clang-bu...@googlegroups.com, Miroslav Benes
This logic will also be needed for the CONFIG_CFI_CLANG support.

Signed-off-by: Josh Poimboeuf <jpoi...@redhat.com>
---
tools/objtool/elf.c | 60 ++++++++++++++++++++++++++++-----------------
1 file changed, 37 insertions(+), 23 deletions(-)

diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index be89c741ba9a..6d248a19e2c6 100644
@@ -374,7 +406,6 @@ static int read_symbols(struct elf *elf)
struct list_head *entry;
struct rb_node *pnode;
int symbols_nr, i;
- char *coldstr;
Elf_Data *shndx_data = NULL;
Elf32_Word shndx;

@@ -456,37 +487,20 @@ static int read_symbols(struct elf *elf)

Josh Poimboeuf

unread,
Jan 14, 2021, 3:56:34 PM1/14/21