Hi Ard,
I'm likely missing/confusing terminology below, so apologies in advance for
that!
Ah, sorry; I hadn't spotted that.
> The issue seems to be where exactly the veneers for the entire image
> are dumped, and when this is right after .idmap.text (being the last
> input section with the EXEC ELF attribute), it pushes the
> __idmap_text_end symbol over the next 4k boundary.
I'm a bit confused as to why veneers for other sections would be dropped in
here. Note that these branches (for-next/core and for-kernelci) have your patch
moving IDMAP_TEXT into .rodata, so that's no longer the last input section with
the EXEC attribute, unless I've misunderstood?
Perhaps that's meant to be read as the last input section *within a given
output section* ?
If marking the .idmap.text section as non-executable prevents other veneers
from being collected there, then I reckon that doing that along with the
indirect branches using adr_l + blr should be sufficient? That way there should
be no veneers dropped into .idmap.text, and the cost of ADRP+ADD+BLR in a
one-time boot path isn't going to be noticeable.
FWIW, If I modify head.S to replace:
.section ".idmap.text","a"
With:
.section ".idmap.text","a"
Then the only branches which don't get PLTs are the ones I mentioned earlier,
and the linker complains about those becoming truncated:
| [mark@lakrids:~/src/linux]% usekorg 12.1.0 make ARCH=arm64 CROSS_COMPILE=aarch64-linux- -j50 Image
| CALL scripts/checksyscalls.sh
| LDS arch/arm64/kernel/vmlinux.lds
| AS arch/arm64/kernel/head.o
| AR arch/arm64/kernel/built-in.a
| AR arch/arm64/built-in.a
| AR built-in.a
| AR vmlinux.a
| LD vmlinux.o
| OBJCOPY modules.builtin.modinfo
| GEN modules.builtin
| MODPOST vmlinux.symvers
| UPD include/generated/utsversion.h
| CC init/version-timestamp.o
| LD .tmp_vmlinux.kallsyms1
| arch/arm64/kernel/head.o: in function `primary_entry':
| (.idmap.text+0x1c): relocation truncated to fit: R_AARCH64_CALL26 against symbol `dcache_clean_poc' defined in .text section in arch/arm64/mm/cache.o
| arch/arm64/kernel/head.o: in function `init_el2':
| (.idmap.text+0x88): relocation truncated to fit: R_AARCH64_CALL26 against symbol `dcache_clean_poc' defined in .text section in arch/arm64/mm/cache.o
| make[1]: *** [scripts/Makefile.vmlinux:34: vmlinux] Error 1
| make: *** [Makefile:1252: vmlinux] Error 2
Is there any reason we need the "wx" attributs for .idmap.text?
With that in mind, does the below look ok? It builds cleanly and boots fine for
me with GCC 12.1.0 atop for-next/core.
We could wrap the adr_l + blr into a blr_l macro for legibility.
Thanks,
Mark.
---->8----
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 212d93aca5e61..b98970907226b 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -70,7 +70,7 @@
__EFI_PE_HEADER
- .section ".idmap.text","awx"
+ .section ".idmap.text","a"
/*
* The following callee saved general purpose registers are used on the
@@ -99,7 +99,8 @@ SYM_CODE_START(primary_entry)
cbz x19, 0f
adrp x0, __idmap_text_start
adr_l x1, __idmap_text_end
- bl dcache_clean_poc
+ adr_l x2, dcache_clean_poc
+ blr x2
0: mov x0, x19
bl init_kernel_el // w0=cpu_boot_mode
mov x20, x0
@@ -527,7 +528,7 @@ SYM_FUNC_END(__primary_switched)
* end early head section, begin head code that is also used for
* hotplug and needs to have the same protections as the text region
*/
- .section ".idmap.text","awx"
+ .section ".idmap.text","a"
/*
* Starting from EL2 or EL1, configure the CPU to execute at the highest
@@ -566,7 +567,8 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
cbz x0, 0f
adrp x0, __hyp_idmap_text_start
adr_l x1, __hyp_text_end
- bl dcache_clean_poc
+ adr_l x2, dcache_clean_poc
+ blr x2
0:
mov_q x0, HCR_HOST_NVHE_FLAGS
msr hcr_el2, x0
@@ -728,7 +730,7 @@ SYM_FUNC_END(set_cpu_boot_mode_flag)
* Checks if the selected granule size is supported by the CPU.
* If it isn't, park the CPU
*/
- .section ".idmap.text","awx"
+ .section ".idmap.text","a"
SYM_FUNC_START(__enable_mmu)
mrs x3, ID_AA64MMFR0_EL1
ubfx x3, x3, #ID_AA64MMFR0_EL1_TGRAN_SHIFT, 4