Switching x86-64 to GNU2 TLS descriptors

Florian Weimer

Jul 7, 2025, 4:37:57 AM
to libc-...@sourceware.org, g...@gcc.gnu.org, binu...@sourceware.org, x86-6...@googlegroups.com, H.J. Lu
H.J. proposed to switch the default for GCC 16 (turning on
-mtls-dialect=gnu2 by default). This is a bit tricky because when we
tried to make the switch in Fedora (for eventual implementation), we hit
an ABI compatibility problem:

_dl_tlsdesc_dynamic doesn't preserve all caller-saved registers
<https://sourceware.org/bugzilla/show_bug.cgi?id=31372>

This means that changing the defaults can have backwards compatibility
impact with older distributions.

(a) Do nothing special and switch the default. Maybe try to
backport the glibc fix to more release branches and distributions. I
think we implicitly decided to follow this path when we decided this was
a glibc bug and not a GCC bug. The downside is that missing the bug fix
can result in unexpected, difficult-to-diagnose behavior. However, when
we rebuilt Fedora, the problem was exceedingly rare (we observed one
single failure, if I recall correctly).

(b) Introduce binary markup to indicate that binaries may need the glibc
fix, and that glibc has the fix.

[PATCH] x86-64: Add GLIBC_ABI_GNU2_TLS [BZ #33129]
<https://inbox.sourceware.org/libc-alpha/20250704205341.1...@gmail.com/>

This requires changes to all linkers, GCC and glibc.

(c) Introduce a new relocation type with the same behavior as
R_X86_64_TLSDESC. Unpatched glibc will not support it and error out
during relocation processing. Requires linker changes, GCC and glibc
changes. Does not produce a nice error message, unlike the
GLIBC_ABI_GNU2_TLS change. Ideally would need package manager changes
to produce the right dependencies (with GLIBC_ABI_GNU2_TLS, this could
happen automatically).

(d) Make the GCC default conditional on the glibc version used at GCC
build time. Add __memcmpeq support to GCC 16. Maybe add
errno@@GLIBC_2.43 to glibc 2.43. Even today, it is likely that binaries
contain at least one symbol version reference to something that is
relatively recent, and the __memcmpeq and errno changes would increase
this effect. Combined with the backport mentioned under (a), that could
be enough to force glibc upgrades in pretty much all cases. We have
__libc_start_main@@GLIBC_2.34, so if the glibc backports go back to 2.34
(or even 2.31), only shared objects suffer from this issue. Among the
Fedora binaries, the outliers without dependencies on recent glibc are
mostly Perl modules, and I expect the errno and __memcmpeq would cover
at least some of these. This is not as clean as (b) and (c), but only
needs glibc and GCC changes (for __memcmpeq). It does not achieve 100%
bug prevention, but given that bugs seem to be rare, this may be good
enough.

(e) Skip over GNU2 TLS altogether and implement inline TLS sequences
(GNU3 descriptors?) that do not have the dlopen incompatibility of
initial-exec TLS. This is currently vaporware. It requires nontrivial
glibc changes, GCC changes, linker changes, and x86-64 psABI work to
define new relocation types and perhaps relaxations. This is probably
what we want long-term. User experience is similar to (c), but with
more implementation sequences.

For comparison with an initial-exec TLS read,

movq threadvar@gottpoff(%rip), %rax
movl %fs:(%rax), %eax

this could look like this:

movl threadvar@gottpslot, %eax
movq %fs:(%rax), %rax
movl threadvar@gottlsslotoff, %ecx
movl (%rcx, %rax), %eax

Or with the descriptor in one word:

movq threadvar@gottpslotoff, %rax
movq %rax, %rdx
movq %fs:(%eax), %rax
shrq $32, %rdx
movl (%rax, %rdx), %eax

Or with slightly shorter instructions, using a 32-bit descriptor (which
still could cover at least 3 GiB of TLS data per thread):

movl threadvar@gottpslotoff, %eax
movzbl %al, %edx
shr $8, %eax
movq %fs:64(%edx), %rdx
mov (%rdx, %rax), %eax

And if we want a negative TLS slot index (which glibc would not use, and
I think it's incompatible with local-exec TLS anyway):

movq threadvar@gottpslotoff, %rax
movslq %eax, %rdx
shrq $32, %rax
movq %fs:(%rdx), %rdx
movl (%rdx, %rax), %eax

There might be other variant sequences.

Implementing this on the glibc side would require fundamental changes to
the TLS allocator, which is why this isn't straightforward.
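
To make the descriptor layouts in the sequences under (e) easier to follow, here is a small C model of the bit manipulation only. The field names (slot_offset, block_offset) and the function names are made up for illustration, and the layouts are read off the instruction sequences above; nothing here is an agreed ABI.

```c
#include <stdint.h>

/* Hypothetical names: the byte offset of the TLS slot relative to the
   thread pointer, and the offset of the variable inside the block that
   the slot points to.  */
struct tls_slot_ref
{
  int64_t slot_offset;
  uint64_t block_offset;
};

/* One-word 64-bit descriptor: the low 32 bits are used zero-extended
   as the slot offset (as in %fs:(%eax)), the high 32 bits are the
   block offset (shrq $32).  */
struct tls_slot_ref
decode_desc64 (uint64_t desc)
{
  struct tls_slot_ref r;
  r.slot_offset = (int64_t) (uint32_t) desc;
  r.block_offset = desc >> 32;
  return r;
}

/* Variant with a negative slot index: the low 32 bits are
   sign-extended (movslq) instead of zero-extended.  */
struct tls_slot_ref
decode_desc64_signed (uint64_t desc)
{
  struct tls_slot_ref r;
  r.slot_offset = (int64_t) (uint32_t) desc;
  if (desc & UINT64_C (0x80000000))
    r.slot_offset -= INT64_C (1) << 32;
  r.block_offset = desc >> 32;
  return r;
}

/* 32-bit descriptor: the low 8 bits select the slot (movzbl %al),
   the remaining 24 bits are the block offset (shr $8).  */
struct tls_slot_ref
decode_desc32 (uint32_t desc)
{
  struct tls_slot_ref r;
  r.slot_offset = desc & 0xff;
  r.block_offset = desc >> 8;
  return r;
}
```

The actual load would then read the block pointer from the slot and add block_offset to it, which is what the final two instructions of each sequence do.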

(f) A less ambitious variant of (e): A new TLS descriptor callback that
returns the address of the TLS variable, and not the offset from the
thread pointer. This is much easier to implement on the glibc side.
The current GNU2 TLS descriptor callback is optimized for static TLS
access. We can avoid a memory access in the static TLS callback if we
use the RDFSBASE instruction (if glibc detects run-time support). It's
a new relocation type, so this too needs GCC, linker, ABI changes.
However, these changes are largely mechanical (except perhaps for the
relaxation support). Basically, TLS accesses would change from

leaq threadvar@TLSDESC(%rip), %rax
call *threadvar@TLSCALL(%rax)
movl %fs:(%rax), %eax

to:

leaq threadvar@TLSDESC2(%rip), %rax
call *threadvar@TLSCALL2(%rax)
movl (%rax), %eax

And the implementation of the static TLS case would change from

endbr64
movq 8(%rax), %rax
retq

to:

endbr64
rdfsbase %rax
addq %rsi, %rax
retq

But I don't think this detour is worth it if we eventually want to land
on (e).
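
The difference between the current callback and the one sketched under (f) can be modeled in plain C. This is only a sketch under assumptions: the struct and function names are hypothetical, the thread pointer is passed in as an ordinary argument (standing in for what %fs or RDFSBASE would supply), and only the static TLS case is covered.

```c
#include <stddef.h>
#include <string.h>

/* Model of a two-word descriptor: the callback pointer in the first
   word and its argument in the second word (the word that
   "movq 8(%rax), %rax" reads on x86-64).  */
struct tlsdesc_model
{
  void *entry;     /* Unused in this model.  */
  ptrdiff_t arg;   /* Static TLS offset of the variable.  */
};

/* Current style: the callback returns the offset from the thread
   pointer.  The caller applies the thread pointer itself.  */
ptrdiff_t
static_callback_offset (const struct tlsdesc_model *desc)
{
  return desc->arg;
}

/* Style from (f): the callback returns the address of the variable.
   TP stands in for the thread pointer that RDFSBASE would produce.  */
void *
static_callback_address (const struct tlsdesc_model *desc, char *tp)
{
  return tp + desc->arg;
}

/* Caller side for the current style (movl %fs:(%rax), %eax).  */
int
load_via_offset (const struct tlsdesc_model *desc, char *tp)
{
  int value;
  memcpy (&value, tp + static_callback_offset (desc), sizeof value);
  return value;
}

/* Caller side for the (f) style (movl (%rax), %eax).  */
int
load_via_address (const struct tlsdesc_model *desc, char *tp)
{
  int value;
  memcpy (&value, static_callback_address (desc, tp), sizeof value);
  return value;
}
```

Both callers end up reading the same memory; the only thing that moves is where the thread pointer gets added.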


I'm personally leaning towards (d) or (a) for GCC 16. I dislike (b).
And (e) is unrealistic in the short term.
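
Option (d) comes down to comparing glibc versions, for example the one GCC was configured against versus some minimum. A minimal sketch of such a comparison in C follows; the function names are hypothetical, and the minimum version is left as a parameter because it depends on how far the fix gets backported.

```c
#include <stdbool.h>
#include <stdio.h>

/* Parse a "major.minor" version string such as "2.34".  Returns false
   if the string does not start with two dot-separated decimal
   numbers.  Anything after the minor number is ignored.  */
bool
parse_glibc_version (const char *s, int *major, int *minor)
{
  return sscanf (s, "%d.%d", major, minor) == 2;
}

/* Return true if the version string S is at least
   REQ_MAJOR.REQ_MINOR.  Strings that cannot be parsed are treated as
   too old.  */
bool
glibc_version_at_least (const char *s, int req_major, int req_minor)
{
  int major, minor;
  if (!parse_glibc_version (s, &major, &minor))
    return false;
  return major > req_major
         || (major == req_major && minor >= req_minor);
}
```

On glibc, the run-time version string is available from gnu_get_libc_version() in <gnu/libc-version.h>, so glibc_version_at_least (gnu_get_libc_version (), 2, 34) would be one way to use this.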

Thanks,
Florian

H.J. Lu

Jul 7, 2025, 4:47:06 AM
to Florian Weimer, libc-...@sourceware.org, g...@gcc.gnu.org, binu...@sourceware.org, x86-6...@googlegroups.com
On Mon, Jul 7, 2025 at 4:37 PM Florian Weimer <fwe...@redhat.com> wrote:
>
> H.J. proposed to switch the default for GCC 16 (turning on
> -mtls-dialect=gnu2 by default). This is a bit tricky because when we
> tried to make the switch in Fedora (for eventual implementation), we hit
> an ABI compatibility problem:
>
> _dl_tlsdesc_dynamic doesn't preserve all caller-saved registers
> <https://sourceware.org/bugzilla/show_bug.cgi?id=31372>
>
> This means that changing the defaults can have backwards compatibility
> impact with older distributions.
>
> (a) Do nothing special and switch the default. Maybe try to
> backport the glibc fix to more release branches and distributions. I
> think we implicitly decided to follow this path when we decided this was
> a glibc bug and not a GCC bug. The downside is that missing the bug fix
> can result in unexpected, difficult-to-diagnose behavior. However, when
> we rebuilt Fedora, the problem was exceedingly rare (we observed one
> single failure, if I recall correctly).
>
> (b) Introduce binary markup to indicate that binaries may need the glibc
> fix, and that glibc has the fix.
>
> [PATCH] x86-64: Add GLIBC_ABI_GNU2_TLS [BZ #33129]
> <https://inbox.sourceware.org/libc-alpha/20250704205341.1...@gmail.com/>
>
> This requires changes to all linkers, GCC and glibc.

This option is independent of GCC. Only glibc and linker changes
are needed. It just introduces a glibc version dependency whenever
GNU2 TLS is used, regardless whether it is the default or not.
--
H.J.

Florian Weimer

Jul 7, 2025, 5:07:06 AM
to Richard Biener, libc-...@sourceware.org, g...@gcc.gnu.org, binu...@sourceware.org, x86-6...@googlegroups.com, H.J. Lu
* Richard Biener:

> I think both (a) or (d) are reasonable, though I am missing a
> configure time flag to override the changed default. Even with
> glibc fixed we likely do not want to have this change in older
> enterprise code streams given there might be unknown external
> tooling that might be confused.

Yes, a configure flag makes sense.

> Oh, and what exactly is the advantage of GNU TLS2 descriptors?

The GNU2 TLS descriptor callback preserves most registers, and does not
need to save many registers on its fast path. This isn't true for
__tls_get_addr, which follows the standard calling convention. The
descriptors can be specialized based on the DSO that defines the TLS
variable. So GNU2 TLS descriptors are expected to be a bit
faster.
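
To see the difference on a particular toolchain, a small translation unit like the following can be compiled twice, once with -mtls-dialect=gnu and once with -mtls-dialect=gnu2 (for example with gcc -O2 -fPIC -S), and the assembly compared. The file contents and function names are just an illustration; -fPIC is assumed so that the dynamic TLS model is used for the externally visible variable, and the exact output depends on the compiler version.

```c
/* Thread-local variable with external linkage.  When built as
   position-independent code, accesses use a dynamic TLS model:
   a __tls_get_addr call with the gnu dialect, a TLS descriptor
   call with the gnu2 dialect.  */
__thread int threadvar;

/* Read the calling thread's copy of threadvar.  */
int
read_threadvar (void)
{
  return threadvar;
}

/* Write the calling thread's copy of threadvar.  */
void
write_threadvar (int value)
{
  threadvar = value;
}
```

With the gnu2 dialect, the read function should contain a sequence along the lines of the leaq/call/movl triple using @TLSDESC and @TLSCALL.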

Thanks,
Florian

Florian Weimer

Jul 14, 2025, 7:55:32 PM
to Adhemerval Zanella Netto, libc-...@sourceware.org, g...@gcc.gnu.org, binu...@sourceware.org, x86-6...@googlegroups.com, H.J. Lu
* Adhemerval Zanella Netto:

>> (b) Introduce binary markup to indicate that binaries may need the glibc
>> fix, and that glibc has the fix.
>>
>> [PATCH] x86-64: Add GLIBC_ABI_GNU2_TLS [BZ #33129]
>> <https://inbox.sourceware.org/libc-alpha/20250704205341.1...@gmail.com/>
>>
>> This requires changes to all linkers, GCC and glibc.

>> I'm personally leaning towards (d) or (a) for GCC 16. I dislike (b).
>> And (e) is unrealistic in the short term.
>
> We did something similar to (b) for DT_RELR, so it is not
> unprecedented.

But DT_RELR was a new feature. We expected wide-spread,
hard-to-diagnose breakage if it were rolled out without proper markup.
The TLSDESC bug was only found by massive, systematic testing, using a
huge codebase. Not being the default option contributed to the bug staying
hidden for so long, but the overall triggering conditions also seem to
be complicated to meet. This is really different from DT_RELR without
markup, I think.

We could treat TLSDESC without clobbers more like DT_RELR, but in my
opinion that requires revisiting the question of whose ABI was right.
The (b) approach expresses that callee-saved vector registers are a new
feature, and that without it, GCC should generate code that assumes
vector registers are clobbered across TLSDESC. But we sort of decided
against this approach when we said this was a glibc bug and not a GCC
bug.

> The (b) might generate some attrition when users try to deploy binaries
> built with recent gcc on older systems; even though running with the new
> TLS variant is subject to breakage. Either users will fallback to use
> the old tls dialect or employ some hacks like remove the
> GLIBC_ABI_GNU2_TLS mark (and yeah, there are multiple cases like this,
> like Firefox with DT_RELR). But at least the toolchain does provide
> a way to work around it.
>
> As H.J has noted, GLIBC_ABI_GNU2_TLS only requires glibc and linker
> issues. This does not help when recent gcc are used with old binutils,
> but I take this is not really a common setup.

For (b), GCC would still need to know when it is safe to rely on
callee-saved vector registers for TLSDESC, no? If the linker is too
old, it won't know about the need to generate the ABI marker.

Thanks,
Florian

H.J. Lu

Jul 14, 2025, 8:41:11 PM
to Florian Weimer, Adhemerval Zanella Netto, libc-...@sourceware.org, g...@gcc.gnu.org, binu...@sourceware.org, x86-6...@googlegroups.com
Compilers will never know since the build-time glibc is independent of
the run-time glibc. If compilers want to be 100% sure that the run-time
is GNU2 TLS bug-free, they can require linkers which generate the
GLIBC_ABI_GNU2_TLS dependency.

--
H.J.

Florian Weimer

Jul 14, 2025, 8:47:20 PM
to H.J. Lu, Adhemerval Zanella Netto, libc-...@sourceware.org, g...@gcc.gnu.org, binu...@sourceware.org, x86-6...@googlegroups.com
* H. J. Lu:

> Compilers will never know since the build-time glibc is independent of
> the run-time glibc. If compilers want to be 100% sure that the run-time
> is GNU2 TLS bug-free, they can require linkers which generate the
> GLIBC_ABI_GNU2_TLS dependency.

(Such a linker requirement could be enforced by requiring that the
linker recognizes a command option specific to GLIBC_ABI_GNU2_TLS, and
that current linkers treat as an error.)

Would such an unconditional requirement be acceptable to GCC 16? I
don't think so. So we'd still have to design a configure option for it.
And that requires further compiler changes. It's also not entirely
clear how this would interact with -fuse-ld.

I just don't think option (b) is trivial from a compiler perspective.

Thanks,
Florian

H.J. Lu

Jul 14, 2025, 9:24:08 PM
to Florian Weimer, Adhemerval Zanella Netto, libc-...@sourceware.org, g...@gcc.gnu.org, binu...@sourceware.org, x86-6...@googlegroups.com
It depends on what we want. If we want a 100% guarantee that the glibc
run-time is free of the GNU2 TLS bug, GCC can pass -z gnu2-tls to the
linker whenever GNU2 TLS is used with glibc. It is an error if the
linker doesn't support -z gnu2-tls. We can provide a GCC option to
disable -z gnu2-tls.

--
H.J.

Florian Weimer

Jul 15, 2025, 12:02:24 AM
to H.J. Lu, Adhemerval Zanella Netto, libc-...@sourceware.org, g...@gcc.gnu.org, binu...@sourceware.org, x86-6...@googlegroups.com
* H. J. Lu:
Unknown -z options are ignored by default:

$ ld -z gnu2-tls
ld: warning: -z gnu2-tls ignored

It would have to be a regular option. But that's just a minor detail.

Thanks,
Florian
