[PATCH] Clarify R_X86_64_REX_GOTPCRELX transformation

H.J. Lu

Apr 28, 2021, 4:31:27 PM
to x86-6...@googlegroups.com
https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/16

H.J.
--
Instructions on memory operand with GOTPCRELX relocations against symbol,
foo, can be transformed into a different form on immediate operand if foo
is defined locally and the relocation addend is -4:

movl foo@GOTPCREL(%rip), %eax

For

movl foo@GOTPCREL+4(%rip), %eax

The transformation is invalid since the relocation addend is 0.
---
x86-64-ABI/linker-optimization.tex | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/x86-64-ABI/linker-optimization.tex b/x86-64-ABI/linker-optimization.tex
index dcefa09..246350c 100644
--- a/x86-64-ABI/linker-optimization.tex
+++ b/x86-64-ABI/linker-optimization.tex
@@ -69,7 +69,7 @@ The \xARCH instruction encoding supports converting certain instructions
 on memory operand with \texttt{R_X86_64_GOTPCRELX} or
 \texttt{R_X86_64_REX_GOTPCRELX} relocations against symbol, \texttt{foo},
 into a different form on immediate operand if \texttt{foo} is defined
-locally.
+locally and the relocation addend is -4.
 
 \begin{description}
 \item[\textindex{Convert call and jmp}]
--
2.31.1

Jan Beulich

Apr 29, 2021, 5:11:01 AM
to H.J. Lu, x86-6...@googlegroups.com
On 28.04.2021 22:31, H.J. Lu wrote:
> https://gitlab.com/x86-psABIs/x86-64-ABI/-/merge_requests/16
>
> H.J.
> --
> Instructions on memory operand with GOTPCRELX relocations against symbol,
> foo, can be transformed into a different form on immediate operand if foo
> is defined locally and the relocation addend is -4:
>
> movl foo@GOTPCREL(%rip), %eax
>
> For
>
> movl foo@GOTPCREL+4(%rip), %eax
>
> The transformation is invalid since the relocation addend is 0.

Just for my own understanding: Addends other than -4 aren't
considered valid because no insn with an immediate is deemed to
be sensibly usable with @GOTPCREL?

Jan

H.J. Lu

Apr 29, 2021, 7:41:36 AM
to Jan Beulich, x86-64-abi
Correct.

--
H.J.

Michael Matz

May 4, 2021, 10:24:14 AM
to H.J. Lu, Jan Beulich, x86-64-abi
Hello,
I'm still confused. Why would addends other than -4 not be supportable?
If the relocation isn't resolved then the RELA record will contain the
addend (whatever it is) and needs to be applied by whatever ultimately
applies the relocation.

If the relocation is resolved then the addend was folded into the
computation that resolved it and placed into the appropriate immediate or
mem-offset field of the instruction. Either way the addend doesn't
matter anymore.

If the transformation from one instruction form to another, with a
different relocation type, moves the relocated field, the addend needs
to be adjusted if the relocation is PC-relative.

For instance, for the example above, you are probably thinking about the
mov to mov transformation with non-PIC code confined to low 32bit, i.e.

movl foo@GOTPCREL+4(%rip), %eax

into

movl $foo+4, %eax

Of course the checking for 32bit range needs to be applied to the full
value of foo+4, i.e. including the addend, but that's not different from
any other reloc processing. Even if the target insn is going to be lea,
it's then:

lea foo+4(%rip), %eax

which still is fine (of course foo-P+4 needs to fit the memoffset field,
like normally foo-4-P+4 needs to fit for the usual -4 addends).

If you think there are issues with any of this I would appreciate a
specific example of input instruction plus relocs, and candidate
transformation that is not going to work.
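
To make the arithmetic above concrete, here is a minimal sketch in C. The
formulas G + GOT + A - P (GOTPCREL/GOTPCRELX) and S + A - P (PC32) are the
psABI's; the function names are made up for illustration, and the sketch
assumes the 4-byte field is the last part of the instruction, so the CPU
adds the displacement to P + 4.

```c
#include <stdint.h>

/* Hypothetical helper names; only the formulas come from the psABI.
   got_slot stands for G + GOT, the address of foo's GOT entry. */

/* R_X86_64_GOTPCREL / R_X86_64_[REX_]GOTPCRELX: G + GOT + A - P. */
static int32_t gotpcrel_field(uint64_t got_slot, int64_t addend, uint64_t place)
{
    return (int32_t)(got_slot + (uint64_t)addend - place);
}

/* R_X86_64_PC32, e.g. after mov has been rewritten to lea: S + A - P. */
static int32_t pc32_field(uint64_t sym, int64_t addend, uint64_t place)
{
    return (int32_t)(sym + (uint64_t)addend - place);
}

/* Address a %rip-relative operand refers to. Assumption: the 32-bit
   field ends the instruction, so %rip is place + 4 when it is used. */
static uint64_t effective_address(uint64_t place, int32_t field)
{
    return place + 4 + (uint64_t)(int64_t)field;
}
```

With the usual addend of -4 the GOTPCREL field makes the operand point
exactly at foo's GOT slot, and the PC32 field makes it point at foo. With
addend 0 (the foo@GOTPCREL+4 case) the same formulas give GOT slot + 4 and
foo + 4 respectively, which is the "lea foo+4(%rip)" result described above.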


Ciao,
Michael.

H.J. Lu

May 4, 2021, 10:44:40 AM
to Michael Matz, Jan Beulich, x86-64-abi
movl foo@GOTPCREL+4(%rip), %eax uses the address from 2 GOT entries:
32 bits of one entry and 32 bits of another entry. The linker can't
optimize it.

--
H.J.

Michael Matz

May 4, 2021, 11:38:40 AM
to H.J. Lu, Jan Beulich, x86-64-abi
Hello,
Sure it can, and that it currently mistransforms it can be considered a
normal bug. You need a GOT slot for the value "foo+4", and use that in
the instruction.

That of course hinges a bit on the definition of which value the
@GOT... decorations apply to.

If it's defined in the traditional way (i.e. the above +4 is applied
outside the decoration), the input instruction was already asking for
trouble in this case: it explicitly wanted to access some random content
of .got. So I don't see why the linker can't transform it into an
equivalently random other access.

So, if you are worried about accessing GOT slots piecewise, shouldn't we
rather define the relocations themselves to be valid only for certain
addends, instead of restricting only some optimizations on them?

But I'm of the opinion that we don't need to put restrictions about this
in the psABI; if the user really wanted to access garbage we should let
her.

So, I'd still like to know a case where the input was meaningful but the
output of a transformation (without the -4 restriction) would be broken.


Ciao,
Michael.

H.J. Lu

May 4, 2021, 11:47:20 AM
to Michael Matz, Jan Beulich, x86-64-abi
The current linker has

/* With the local symbol, foo, we convert
   mov foo@GOTPCREL(%rip), %reg
   to
   lea foo(%rip), %reg
   and convert
   call/jmp *foo@GOTPCREL(%rip)
   to
   nop call foo/jmp foo nop
   When PIC is false, convert
   test %reg, foo@GOTPCREL(%rip)
   to
   test $foo, %reg
   and convert
   binop foo@GOTPCREL(%rip), %reg
   to
   binop $foo, %reg
   where binop is one of adc, add, and, cmp, or, sbb, sub, xor
   instructions. */

static bool
elf_x86_64_convert_load_reloc (bfd *abfd,
                               bfd_byte *contents,
                               unsigned int *r_type_p,
                               Elf_Internal_Rela *irel,
                               struct elf_link_hash_entry *h,
                               bool *converted,
                               struct bfd_link_info *link_info)
{
  struct elf_x86_link_hash_table *htab;
  bool is_pic;
  bool no_overflow;
  bool relocx;
  bool to_reloc_pc32;
  bool abs_symbol;
  bool local_ref;
  asection *tsec;
  bfd_signed_vma raddend;
  unsigned int opcode;
  unsigned int modrm;
  unsigned int r_type = *r_type_p;
  unsigned int r_symndx;
  bfd_vma roff = irel->r_offset;
  bfd_vma abs_relocation;

  if (roff < (r_type == R_X86_64_REX_GOTPCRELX ? 3 : 2))
    return true;

  raddend = irel->r_addend;
  /* Addend for 32-bit PC-relative relocation must be -4. */
  if (raddend != -4)
    return true;

If addend != -4, the linker optimization is disabled. I don't believe
that the linker should optimize the addend != -4 case.
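
The two early exits quoted above can be read as a single gate. Below is a
minimal stand-alone sketch of that gate in C; it is not the binutils
function. The name may_try_gotpcrelx_relax is hypothetical, and unlike the
quoted code (where "return true" means "no error, leave the instruction
alone") it returns whether a conversion may be attempted at all. The
relocation numbers 41 and 42 are the psABI values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Relocation type numbers as assigned by the x86-64 psABI. */
enum {
    R_X86_64_GOTPCRELX = 41,
    R_X86_64_REX_GOTPCRELX = 42
};

/* Hypothetical predicate mirroring the quoted early exits:
   true means "the addend and offset do not rule out a conversion". */
static bool may_try_gotpcrelx_relax(unsigned r_type, uint64_t r_offset,
                                    int64_t r_addend)
{
    if (r_type != R_X86_64_GOTPCRELX && r_type != R_X86_64_REX_GOTPCRELX)
        return false;

    /* The opcode and ModRM bytes (plus the REX prefix for the REX
       variant) must precede the relocated field in the section. */
    if (r_offset < (r_type == R_X86_64_REX_GOTPCRELX ? 3u : 2u))
        return false;

    /* Only a plain foo@GOTPCREL operand, i.e. addend -4, qualifies. */
    return r_addend == -4;
}
```

Under this gate, foo@GOTPCREL+4 (addend 0) is simply left as a GOT load; the
relocation is still applied with its addend, only the rewrite is skipped.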

--
H.J.

Michael Matz

May 4, 2021, 12:15:16 PM
to H.J. Lu, Jan Beulich, x86-64-abi
>...
> raddend = irel->r_addend;
> /* Addend for 32-bit PC-relative relocation must be -4. */
> if (raddend != -4)
> return true;
>
> If addend != -4, the linker optimization is disabled. I don't believe
> that linker should optimize the addend != -4 case.

And that's fine. It essentially restricts itself to cases where the input
was a decorated symbol without any addend (and this all applies to GOT
slot relocs only, not to "32bit PC-relative relocs" as the comment
indicates). A linker can of course choose to restrict when it applies
any transformation.

But we're discussing the psABI here, and there the question always needs
to be whether the transformation has no possibly useful meaning at all. I
don't see that here: the meaning follows quite naturally from any other
reloc processing. If the input contains garbage (like a GOT access with
addend) the output can contain garbage as well (but garbage constructed in
a reliable and obvious way!); presumably the author wanted that, and then
it isn't in fact garbage.

So, before putting a restriction into the psABI here I would like to see
an indication that something goes obviously wrong (i.e. something where
the input is not already questionable).

I.e. I could also ask why you've put the above restriction into ld: are
you worried about specific cases (which ones?), or were you just being
cautious because thinking through the ramifications of lifting the
restriction would be too much work for too little gain? And if you were
worried about specific cases: why haven't you put the addend checking into
the assembler when parsing operands that have GOTPCREL decorations (or any
other GOT slot accesses)?


Ciao,
Michael.

Fangrui Song

May 4, 2021, 3:29:29 PM
to Michael Matz, H.J. Lu, Jan Beulich, x86-64-abi
In any case, the psABI change does not affect clang.
In https://reviews.llvm.org/D92114, I changed movq mov@GOTPCREL+nonzero(%rip), %rax
to use R_X86_64_GOTPCREL instead of GOTPCRELX, which is needed to work around
gold<2.36 (https://sourceware.org/bugzilla/show_bug.cgi?id=26939) and ld.lld<12.
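
The assembler-side choice described here can be sketched roughly as follows.
This is an illustration in C, not LLVM's actual code: pick_gotpcrel_type and
its parameters are hypothetical, "relaxable" is assumed to mean that the
opcode is one the psABI lists for GOTPCRELX, and expr_offset is the N in
foo@GOTPCREL+N as written in the source. The relocation numbers are the
psABI values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Relocation type numbers as assigned by the x86-64 psABI. */
enum {
    R_X86_64_GOTPCREL = 9,
    R_X86_64_GOTPCRELX = 41,
    R_X86_64_REX_GOTPCRELX = 42
};

/* Hypothetical helper: fall back to the non-relaxable relocation
   whenever the operand carries a nonzero offset, so that no linker
   is invited to rewrite the instruction. */
static unsigned pick_gotpcrel_type(bool relaxable, bool has_rex,
                                   int64_t expr_offset)
{
    if (!relaxable || expr_offset != 0)
        return R_X86_64_GOTPCREL;
    return has_rex ? R_X86_64_REX_GOTPCRELX : R_X86_64_GOTPCRELX;
}
```

With this choice a linker never sees a GOTPCRELX relocation whose addend
differs from -4, so the outcome does not depend on how a given linker
handles that case.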

H.J. Lu

May 4, 2021, 3:55:09 PM
to Michael Matz, Jan Beulich, x86-64-abi
The whole section of "Optimize GOTPCRELX Relocations" assumes that
the relocation addend is -4. My change only makes this requirement explicit.

--
H.J.