[RFC][PATCH] APX: Add R_X86_64_CODE_4_GOTPCRELX

H.J. Lu

Aug 15, 2023, 3:11:40 PM
to x86-6...@googlegroups.com
Intel Advanced Performance Extensions:

https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html

adds the REX2 prefix for the additional general-purpose registers, r16-r31,
in

mov name@GOTPCREL(%rip), %reg
test %reg, name@GOTPCREL(%rip)
binop name@GOTPCREL(%rip), %reg

where binop is one of adc, add, and, cmp, or, sbb, sub, xor instructions.
Add

# define R_X86_64_CODE_4_GOTPCRELX 43

for use when the instruction starts 4 bytes before the relocation offset.
It is similar to R_X86_64_GOTPCRELX. The linker can treat
R_X86_64_CODE_4_GOTPCRELX as R_X86_64_GOTPCREL or convert the above
instructions to

lea name(%rip), %reg
mov $name, %reg
test $name, %reg
binop $name, %reg

when possible, provided that the first byte of the instruction, at the
relocation offset - 4, is 0xd5.
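
The mov-to-lea case of the conversion described above could be sketched
roughly as follows. This is only an illustration under stated assumptions,
not the implementation of any existing linker: the function name and its
parameters are hypothetical, and it assumes the byte layout REX2 prefix
(0xd5), REX2 payload, opcode, ModRM, followed by the 32-bit field that the
relocation offset points at.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REX2_PREFIX     0xd5 /* first byte of a REX2-prefixed instruction */
#define OPCODE_MOV_LOAD 0x8b /* mov r/m, reg */
#define OPCODE_LEA      0x8d /* lea m, reg */

/* Hypothetical helper: try to rewrite
 *     mov name@GOTPCREL(%rip), %reg
 * into
 *     lea name(%rip), %reg
 * for a relocation whose 32-bit field starts at r_offset.
 * Returns true if the opcode byte was rewritten.  The caller is assumed
 * to have checked that the symbol is defined locally and that the addend
 * is -4, and to resolve the field against the symbol itself afterwards. */
bool relax_code_4_mov_to_lea(uint8_t *code, size_t size, size_t r_offset)
{
    /* Need 4 instruction bytes before the field and 4 field bytes. */
    if (r_offset < 4 || r_offset > size || size - r_offset < 4)
        return false;

    /* The instruction must start with the REX2 prefix at r_offset - 4. */
    if (code[r_offset - 4] != REX2_PREFIX)
        return false;

    /* Only the load form of mov is handled in this sketch. */
    if (code[r_offset - 2] != OPCODE_MOV_LOAD)
        return false;

    /* ModRM must select RIP-relative addressing: mod == 00, r/m == 101. */
    if ((code[r_offset - 1] & 0xc7) != 0x05)
        return false;

    /* Same prefix, payload and ModRM; only the opcode changes. */
    code[r_offset - 2] = OPCODE_LEA;
    return true;
}
```

A real linker would need further validation (for example of the REX2
payload byte) and would handle the test and binop forms separately, since
those change the ModRM byte and use an immediate operand.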
---
x86-64-ABI/linker-optimization.tex | 10 ++++++----
x86-64-ABI/object-files.tex | 10 ++++++----
2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/x86-64-ABI/linker-optimization.tex b/x86-64-ABI/linker-optimization.tex
index 246350c..c03afe3 100644
--- a/x86-64-ABI/linker-optimization.tex
+++ b/x86-64-ABI/linker-optimization.tex
@@ -66,10 +66,12 @@ into an infinite loop at run-time.
\label{opt_gotpcrelx}

The \xARCH instruction encoding supports converting certain instructions
-on memory operand with \texttt{R_X86_64_GOTPCRELX} or
-\texttt{R_X86_64_REX_GOTPCRELX} relocations against symbol, \texttt{foo},
-into a different form on immediate operand if \texttt{foo} is defined
-locally and the relocation addend is -4.
+on memory operand with \texttt{R_X86_64_GOTPCRELX},
+\texttt{R_X86_64_REX_GOTPCRELX} or \texttt{R_X86_64_CODE_4_GOTPCRELX}
+if the first byte of the instruction at the relocation offset - 4 is 0xd5,
+relocations against symbol, \texttt{foo}, into a different form on
+immediate operand if
+\texttt{foo} is defined locally and the relocation addend is -4.

\begin{description}
\item[\textindex{Convert call and jmp}]
diff --git a/x86-64-ABI/object-files.tex b/x86-64-ABI/object-files.tex
index 7f20c0c..2a2e315 100644
--- a/x86-64-ABI/object-files.tex
+++ b/x86-64-ABI/object-files.tex
@@ -486,6 +486,7 @@ or \texttt{Elf32_Rel} relocation entries.
\texttt{Deprecated} & 40 & & \\
\texttt{R_X86_64_GOTPCRELX} & 41 & \textit{word32} & \texttt{G + GOT + A - P} \\
\texttt{R_X86_64_REX_GOTPCRELX} & 42 & \textit{word32} & \texttt{G + GOT + A - P} \\
+ \texttt{R_X86_64_CODE_4_GOTPCRELX} & 43 & \textit{word32} & \texttt{G + GOT + A - P} \\
\cline{1-4}
\multicolumn{4}{l}{\small $^\dagger$ This relocation is used only for LP64.}\\
\multicolumn{4}{l}{\small $^{\dagger\dagger}$ This relocation only
@@ -538,10 +539,11 @@ instructions:
where \code{binop} is one of \code{adc}, \code{add}, \code{and},
\code{cmp}, \code{or}, \code{sbb}, \code{sub}, \code{xor}
instructions, the \texttt{R_X86_64_GOTPCRELX} relocation,
-or the \texttt{R_X86_64_REX_GOTPCRELX} relocation if the
-\code{REX} prefix is present, should be generated,
-instead of the \texttt{R_X86_64_GOTPCREL} relocation. See also
-section~\ref{opt_gotpcrelx}.
+the \texttt{R_X86_64_REX_GOTPCRELX} relocation if the
+\code{REX} prefix is present, or the \texttt{R_X86_64_CODE_4_GOTPCRELX}
+relocation if the instruction starts at 4 bytes before the relocation
+offset, should be generated, instead of the \texttt{R_X86_64_GOTPCREL}
+relocation. See also section~\ref{opt_gotpcrelx}.
\end{sloppypar}

\begin{sloppypar}
--
2.41.0

Jan Beulich

Aug 16, 2023, 2:18:51 AM
to H.J. Lu, x86-6...@googlegroups.com
On 15.08.2023 21:11, H.J. Lu wrote:
> --- a/x86-64-ABI/linker-optimization.tex
> +++ b/x86-64-ABI/linker-optimization.tex
> @@ -66,10 +66,12 @@ into an infinite loop at run-time.
> \label{opt_gotpcrelx}
>
> The \xARCH instruction encoding supports converting certain instructions
> -on memory operand with \texttt{R_X86_64_GOTPCRELX} or
> -\texttt{R_X86_64_REX_GOTPCRELX} relocations against symbol, \texttt{foo},
> -into a different form on immediate operand if \texttt{foo} is defined
> -locally and the relocation addend is -4.
> +on memory operand with \texttt{R_X86_64_GOTPCRELX},
> +\texttt{R_X86_64_REX_GOTPCRELX} or \texttt{R_X86_64_CODE_4_GOTPCRELX}
> +if the first byte of the instruction at the relocation offset - 4 is 0xd5,
> +relocations against symbol, \texttt{foo}, into a different form on
> +immediate operand if
> +\texttt{foo} is defined locally and the relocation addend is -4.

Should we introduce a uniform set of names also covering the pre-existing
relocs, retaining the prior names only for backwards compatibility?

Jan

Michael Matz

Aug 16, 2023, 9:50:23 AM
to Jan Beulich, H.J. Lu, x86-6...@googlegroups.com
Hello,
I would like that, yes.


Ciao,
Michael.

H.J. Lu

Aug 16, 2023, 11:25:52 AM
to Michael Matz, Jan Beulich, x86-6...@googlegroups.com
It may be OK for R_X86_64_GOTPCRELX since there is no different
encoding and the linker doesn't need to check the first byte of the
instruction. But R_X86_64_REX_GOTPCRELX may be a problem since existing
linkers assume that the first byte is the REX byte when rewriting
instructions to perform relaxation. If we rename it to
R_X86_64_CODE_3_GOTPCRELX and there is a different prefix for
a different encoding scheme, it won't be compatible with existing
linkers.
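
To make the compatibility concern concrete, here is a rough sketch of the
per-relocation assumptions about the bytes in front of the relocated
field. The relocation numbers come from the ABI table in the patch; the
function names and the exact checks are hypothetical and only illustrate
the idea, they are not taken from any existing linker.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define R_X86_64_GOTPCRELX        41
#define R_X86_64_REX_GOTPCRELX    42
#define R_X86_64_CODE_4_GOTPCRELX 43

/* Number of bytes the linker looks back from the relocated field.
 * Returns 0 for relocation types this sketch does not know about. */
size_t gotpcrelx_lookback(unsigned r_type)
{
    switch (r_type) {
    case R_X86_64_GOTPCRELX:        return 2; /* opcode, ModRM */
    case R_X86_64_REX_GOTPCRELX:    return 3; /* REX, opcode, ModRM */
    case R_X86_64_CODE_4_GOTPCRELX: return 4; /* 0xd5, payload, opcode, ModRM */
    default:                        return 0;
    }
}

/* Check whether the byte at the assumed instruction start is the prefix
 * that the relocation type implies. */
bool gotpcrelx_prefix_ok(const uint8_t *code, size_t r_offset, unsigned r_type)
{
    size_t n = gotpcrelx_lookback(r_type);
    if (n == 0 || r_offset < n)
        return false;

    uint8_t first = code[r_offset - n];
    switch (r_type) {
    case R_X86_64_REX_GOTPCRELX:
        return (first & 0xf0) == 0x40; /* REX bytes are 0x40..0x4f */
    case R_X86_64_CODE_4_GOTPCRELX:
        return first == 0xd5;          /* REX2 prefix */
    default:
        return true;                   /* no prefix byte to check */
    }
}
```

In this sketch, a linker that only knows relocation 42 treats whatever
byte sits 3 bytes before the field as a REX byte, which is why reusing
that number for a different prefix would not be safe.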


--
H.J.

H.J. Lu

Aug 24, 2023, 11:54:46 AM
to Michael Matz, Jan Beulich, x86-6...@googlegroups.com

Jan Beulich

Sep 11, 2023, 2:13:48 AM
to H.J. Lu, x86-6...@googlegroups.com
On 15.08.2023 21:11, H.J. Lu wrote:
> Intel Advanced Performance Extensions:
>
> https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html
>
> adds the REX2 prefix for the additional general-purpose registers, r16-r31,
> in
>
> mov name@GOTPCREL(%rip), %reg
> test %reg, name@GOTPCREL(%rip)
> binop name@GOTPCREL(%rip), %reg
>
> where binop is one of adc, add, and, cmp, or, sbb, sub, xor instructions.
> Add
>
> # define R_X86_64_CODE_4_GOTPCRELX 43

So what about EVEX-encoded <binop>, which may be preferable to use here (in
its NF and/or ND flavors)? Don't we need R_X86_64_CODE_6_GOTPCRELX right
away as well, and then also not just for %r16...%r31?

Jan

H.J. Lu

Sep 11, 2023, 1:22:38 PM
to Jan Beulich, x86-6...@googlegroups.com
We can add it if it turns out useful.


--
H.J.

Jan Beulich

Sep 12, 2023, 2:30:03 AM
to H.J. Lu, x86-6...@googlegroups.com
> We can add it if it turns out useful.

Assuming the NF and ND forms are being added because they are deemed useful,
is there any question that they will be wanted here (except perhaps under
-Os) in preference to the REX2 ones? Even the TEST and CMP in your set
might easily be CTESTscc and CCMPscc.

Btw, as to the set of <binop>: What are the underlying criteria for which
insns are "eligible" to use this specific kind of relocation?

Jan