Make `_Alignof(_BitInt(128))` == `_Alignof(__int128)`

118 views
Skip to first unread message

Trevor Gross

unread,
Aug 11, 2023, 6:03:49 PM8/11/23
to X86-64 System V Application Binary Interface
Hello all,

I originally posted this on gitlab [1] but it is better to have here.

I think that there needs to be a change to make `_BitInt(128)` and
`__int128` agree on alignment; currently BitInt is align 8 and int is
align 16. Intuitively, `#define __int128 _BitInt(128)` _should_ be
acceptable and it seems like a lot of users will assume this, which
will lead to subtle interop bugs. You have to read unnecessarily deep
to learn why something that seems straightforward only works 90% of
the time - those are the bugs that everyone hates to come across.

GCC and LLVM currently disagree on the alignment & passing of __int128
and it has led to a lot of these quiet bugs (varargs segfault [2],
just using the stack segfaults [3], gcc and clang read bytes in
reverse order [4]) - and this is just now being fixed on the LLVM
side. If __int128 != _BinInt(128), anyone who uses both is probably
going to run across a similar  class of problems at some point with
the difference being that this can now happen within one's own code
(rather than mostly only across compilers) - quite the papercut.

Other than just those problems, supporting two different 128-bit
integers means redundant code and more confusing choices for languages
that don't support both (which alignment do I pick?). Mildly better
performance for `_BitInt(128)` if it were 16-aligned would be a bonus
[5] (see also that comment for the mention of `AArch` making alignment
match) for their implementation.

Fixing the spec is uncomfortable but I think gcc is still working on
their implementation, and LLVM can likely adjust theirs in a similar
way to the ongoing __int128 fix - the time to make this change is
really now or never. And I think that picking "never" might be a
choice that winds up being regretted down the line, once BitInts get
more popular.

Cheers,
Trevor

Link: Initial discussion on adding `_BitInt(N) [6]

[1]: https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/11
[2]: https://github.com/llvm/llvm-project/issues/20283
[3]: https://bugs.llvm.org/show_bug.cgi?id=50198
[4]: https://github.com/tgross35/quick-abi-check
[5]: https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/11#note_1289055948
[6]: https://groups.google.com/u/1/g/x86-64-abi/c/XQiSj-zU3w8/m/qzKc-T-5AwAJ


connor horman

unread,
Aug 11, 2023, 10:39:28 PM8/11/23
to Trevor Gross, X86-64 System V Application Binary Interface
I would like this adjustment as well.

In the compiler I'm working on, __int128 and _BitInt(128) lower to the same type, int(128), in the IR. Having them be consistent is very much desireable.

IMO, the sanest way to do scalar alignment is to pick a max align value, and align everything to next_power_of_two, clamped at maxalign. This is the currently implemented algorithm in the compiler I am working on.

--
You received this message because you are subscribed to the Google Groups "X86-64 System V Application Binary Interface" group.
To unsubscribe from this group and stop receiving emails from it, send an email to x86-64-abi+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/x86-64-abi/c4c699e4-e8b6-4fee-8ee5-570df1011a54n%40googlegroups.com.

Jan Beulich

unread,
Aug 14, 2023, 2:27:03 AM8/14/23
to Trevor Gross, X86-64 System V Application Binary Interface
On 12.08.2023 00:03, Trevor Gross wrote:
> I think that there needs to be a change to make `_BitInt(128)` and
> `__int128` agree on alignment; currently BitInt is align 8 and int is
> align 16. Intuitively, `#define __int128 _BitInt(128)` _should_ be
> acceptable and it seems like a lot of users will assume this, which
> will lead to subtle interop bugs. You have to read unnecessarily deep
> to learn why something that seems straightforward only works 90% of
> the time - those are the bugs that everyone hates to come across.
>
> GCC and LLVM currently disagree on the alignment & passing of __int128
> and it has led to a lot of these quiet bugs (varargs segfault [2],
> just using the stack segfaults [3], gcc and clang read bytes in
> reverse order [4]) - and this is just now being fixed on the LLVM
> side. If __int128 != _BinInt(128), anyone who uses both is probably
> going to run across a similar class of problems at some point with
> the difference being that this can now happen within one's own code
> (rather than mostly only across compilers) - quite the papercut.
>
> Other than just those problems, supporting two different 128-bit
> integers means redundant code and more confusing choices for languages
> that don't support both (which alignment do I pick?). Mildly better
> performance for `_BitInt(128)` if it were 16-aligned would be a bonus
> [5] (see also that comment for the mention of `AArch` making alignment
> match) for their implementation.

So you truly mean to special case just the single N = 128 case? I'd
consider this irritating as well, while I also agree that the
present situation isn't nice. What about making a more intrusive
change and specifying alignment as that of the next power-of-2 number
of bytes, for N > 64? That would then also accommodate a possible
__int256_t as well as e.g. permit the same 16-byte aligned load/store
insns to be used for items with 64 < N < 128 (should the compiler
elect to use SSE/AVX insns). And it would further make things
consistent with N <= 64.

That said, I'm generally wary of any ABI changes that have left draft
state, ...

> Fixing the spec is uncomfortable but I think gcc is still working on
> their implementation, and LLVM can likely adjust theirs in a similar
> way to the ongoing __int128 fix - the time to make this change is
> really now or never. And I think that picking "never" might be a
> choice that winds up being regretted down the line, once BitInts get
> more popular.

... which you also express here. For the psABI, though. I'm afraid I
don't really know how drafts are to be told from "official" versions.

Jan

Florian Weimer

unread,
Aug 14, 2023, 3:19:40 AM8/14/23
to 'Jan Beulich' via X86-64 System V Application Binary Interface, Trevor Gross, Jan Beulich
* via:

> So you truly mean to special case just the single N = 128 case? I'd
> consider this irritating as well, while I also agree that the
> present situation isn't nice. What about making a more intrusive
> change and specifying alignment as that of the next power-of-2 number
> of bytes, for N > 64? That would then also accommodate a possible
> __int256_t as well as e.g. permit the same 16-byte aligned load/store
> insns to be used for items with 64 < N < 128 (should the compiler
> elect to use SSE/AVX insns). And it would further make things
> consistent with N <= 64.

Does this mean 32-byte alignment for __int256_t? That would mean that
these types cannot be allocated using malloc, which seems rather
problematic.

Most mallocs already provide 16-byte alignment, at least for allocations
of 16 bytes or more, so the change for _BitInt(128) would be harmless in
that regard at least.

Thanks,
Florian

Jan Beulich

unread,
Aug 14, 2023, 3:24:17 AM8/14/23
to Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Trevor Gross
On 14.08.2023 09:19, Florian Weimer wrote:
> * via:
>
>> So you truly mean to special case just the single N = 128 case? I'd
>> consider this irritating as well, while I also agree that the
>> present situation isn't nice. What about making a more intrusive
>> change and specifying alignment as that of the next power-of-2 number
>> of bytes, for N > 64? That would then also accommodate a possible
>> __int256_t as well as e.g. permit the same 16-byte aligned load/store
>> insns to be used for items with 64 < N < 128 (should the compiler
>> elect to use SSE/AVX insns). And it would further make things
>> consistent with N <= 64.
>
> Does this mean 32-byte alignment for __int256_t? That would mean that
> these types cannot be allocated using malloc, which seems rather
> problematic.

That's not nice, I agree, but also no different from __m256 or __m512.

Jan

Florian Weimer

unread,
Aug 14, 2023, 3:29:51 AM8/14/23
to Jan Beulich, 'Jan Beulich' via X86-64 System V Application Binary Interface, Trevor Gross
* Jan Beulich:

> On 14.08.2023 09:19, Florian Weimer wrote:
>> * via:
>>
>>> So you truly mean to special case just the single N = 128 case? I'd
>>> consider this irritating as well, while I also agree that the
>>> present situation isn't nice. What about making a more intrusive
>>> change and specifying alignment as that of the next power-of-2 number
>>> of bytes, for N > 64? That would then also accommodate a possible
>>> __int256_t as well as e.g. permit the same 16-byte aligned load/store
>>> insns to be used for items with 64 < N < 128 (should the compiler
>>> elect to use SSE/AVX insns). And it would further make things
>>> consistent with N <= 64.
>>
>> Does this mean 32-byte alignment for __int256_t? That would mean that
>> these types cannot be allocated using malloc, which seems rather
>> problematic.
>
> That's not nice, I agree, but also no different from __m256 or __m512.

The difference is that it's less obvious that alignment would change if
you go from _BitInt(128) to _BitInt(129). (We have the same thing for
char[15] and char[16], but again that's covered by the 16-byte minimum
malloc alignment.)

Thanks,
Florian

Trevor Gross

unread,
Aug 14, 2023, 4:08:51 PM8/14/23
to Florian Weimer, Jan Beulich, 'Jan Beulich' via X86-64 System V Application Binary Interface
> So you truly mean to special case just the single N = 128 case? I'd
> consider this irritating as well, while I also agree that the
> present situation isn't nice. What about making a more intrusive
> change and specifying alignment as that of the next power-of-2 number
> of bytes, for N > 64? That would then also accommodate a possible
> __int256_t as well as e.g. permit the same 16-byte aligned load/store
> insns to be used for items with 64 < N < 128 (should the compiler
> elect to use SSE/AVX insns). And it would further make things
> consistent with N <= 64.

This does seem reasonable to me - but do we know if this is the
pattern sysv will follow if __int512_t or larger are ever added? Or is
it possible that alignment may be capped to `2 * sizeof(size_t)` for
standard ints, even though __m256 and __m512 currently violate it?
Guessing the first would be true, but I'm not sure where this would be
discussed. I indeed did not intend to special case N = 128, just
forgot to include a followup point about this.

> That said, I'm generally wary of any ABI changes that have left draft
> state, ...
>
>> Fixing the spec is uncomfortable but I think gcc is still working on
>> their implementation, and LLVM can likely adjust theirs in a similar
>> way to the ongoing __int128 fix - the time to make this change is
>> really now or never. And I think that picking "never" might be a
>> choice that winds up being regretted down the line, once BitInts get
>> more popular.
>
> ... which you also express here. For the psABI, though. I'm afraid I
> don't really know how drafts are to be told from "official" versions.

Absolutely correct to be weary of course; I only bring this up because
we are still reasonably early (low implementation / usage) and pretty
strong motivation. It is interesting that in a quick search [1] some
of the more common lines are:

typedef unsigned _BitInt(128) uint128_t;
typedef signed _BitInt(128) int128_t;
typedef unsigned _BitInt(256) uint256_t;
typedef unsigned _BitInt(512) uint512_t;

... which, as we've been discussing, is currently problematic.

Being that I don't see any strong opposition to making this change, I
think that maybe the best thing to do is alert GCC's implementation
thread and follow up on the clang issue, so anyone involved can bring
up their concerns here if needed. I will do this.

> Does this mean 32-byte alignment for __int256_t? That would mean that
> these types cannot be allocated using malloc, which seems rather
> problematic.

Malloc is a good point, but I am not sure to what extent it is a
problem (more below).

>>> Most mallocs already provide 16-byte alignment, at least for allocations
>>> of 16 bytes or more, so the change for _BitInt(128) would be harmless in
>>> that regard at least.
>> [ ... ]
>> That's not nice, I agree, but also no different from __m256 or __m512.
>
> The difference is that it's less obvious that alignment would change if
> you go from _BitInt(128) to _BitInt(129). (We have the same thing for
> char[15] and char[16], but again that's covered by the 16-byte minimum
> malloc alignment.)

I don't think that there is anything less obvious about alignment
changing compared to what already happens when you go from _BitInt(16)
to _BitInt(17) or similar - if the rule is that _BitInt alignment
changes at every power of 2, then that is easy to follow. Unless you
mean less obvious that malloc may not provide correct alignment?

I also don't believe there is any reason to avoid alignments greater
than `2 * sizeof(size_t)` on malloc's behalf. SIMD types on both 32-
and 64-bit already do, as mentioned above, as will (probably?)
int256_t+. Any implementation documentation needs to make the
alignment of `_BitInt` quite clear, and should include a warning that
larger sizes should use alligned_alloc instead of malloc. Glibc
already suggests this in their docs [2], I wish the malloc manpages
said something more useful about alignment than just "...suitably
aligned for any type that fits into the requested size or less".

[1]: https://grep.app/search?q=_BitInt%28&filter[lang][0]=C&filter[lang][1]=C%2B%2B
[2]: https://ftp.gnu.org/old-gnu/Manuals/glibc-2.2.3/html_node/libc_30.html

Joseph Myers

unread,
Aug 14, 2023, 5:29:34 PM8/14/23
to Trevor Gross, Florian Weimer, Jan Beulich, 'Jan Beulich' via X86-64 System V Application Binary Interface
On Mon, 14 Aug 2023, Trevor Gross wrote:

> I also don't believe there is any reason to avoid alignments greater
> than `2 * sizeof(size_t)` on malloc's behalf. SIMD types on both 32-

_BItInt types are specified to be basic types, which means the standard
doesn't permit them to have alignment requirements greater than
max_align_t.

--
Joseph S. Myers
jos...@codesourcery.com

Jan Beulich

unread,
Aug 15, 2023, 2:27:28 AM8/15/23
to Joseph Myers, Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Trevor Gross
On 14.08.2023 23:29, Joseph Myers wrote:
> On Mon, 14 Aug 2023, Trevor Gross wrote:
>
>> I also don't believe there is any reason to avoid alignments greater
>> than `2 * sizeof(size_t)` on malloc's behalf. SIMD types on both 32-
>
> _BItInt types are specified to be basic types, which means the standard
> doesn't permit them to have alignment requirements greater than
> max_align_t.

Hmm, indeed. This effectively precludes adding new (wider) extended
integer types in a new revision of any psABI. IOW alignment cutoff
point needs to be (and remain) 128 bits here, including _BitInt().

Jan

Jubilee Young

unread,
Aug 15, 2023, 2:45:22 AM8/15/23
to X86-64 System V Application Binary Interface
Hello everyone, sorry for not following up on this earlier.

Yes, max_align_t is part of why I think the specification should be it reaches 16 alignment,
in addition to __int128 existing. That would make many things nicely consistent, I think,
since it allows reasoning exactly up to that value, even if mostly by coincidence.

As far as the compatibility concerns noted:
As already said, the ABI of __int128 is inconsistent between major compilers (gcc and clang).
This has already resulted in much grief, so I think "But we already shipped the spec!" is... silly.
In order to fix anything, one compiler already has to accept being inconsistent between versions.
And that isn't actually the only ABI disagreement to be fixed, just the currently relevant one.
If the x86-64 psABI can't unify the Big FOSS Compilers, what good is it?
If the Big FOSS Compilers cannot handle psABI complexity, what use additional complexity?
It's not a secret compilers would prefer to limit reasoning about "different" integers of equal size,
at least, during codegen anyways, which is, er, exactly when they need to implement a psABI.
They wind up having to do so nonetheless to implement programming language features,
but having to special-case it just for a CPU's psABI should require exceptional justification.

Michael Matz

unread,
Aug 15, 2023, 8:59:48 AM8/15/23
to Jan Beulich, Joseph Myers, Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Trevor Gross
Hello,

On Tue, 15 Aug 2023, 'Jan Beulich' via X86-64 System V Application Binary Interface wrote:

> On 14.08.2023 23:29, Joseph Myers wrote:
> > On Mon, 14 Aug 2023, Trevor Gross wrote:
> >
> >> I also don't believe there is any reason to avoid alignments greater
> >> than `2 * sizeof(size_t)` on malloc's behalf. SIMD types on both 32-
> >
> > _BItInt types are specified to be basic types, which means the standard
> > doesn't permit them to have alignment requirements greater than
> > max_align_t.
>
> Hmm, indeed. This effectively precludes adding new (wider) extended
> integer types in a new revision of any psABI.

As long as those aren't basic integer types that would be fine.

> IOW alignment cutoff
> point needs to be (and remain) 128 bits here, including _BitInt().

But that shows a problem with the proposal of forcing
_BitInt(128)==__int128 by extension: imagine we will have __int256
eventually. No doubt people will expect that _BitInt(256) will be layout
compatible with it (certainly if we set precedent now with ensuring this
for 128). That would mean an alignment of 16 bytes for __int256 only
as well. This may be the right choice at that future time, but maybe
people would also like it to align to 32, at which point it can't possibly
be _BitInt(256) anymore.

Do note that increasing the alignment also means increasing the sizeof: a
_BitInt(129) will have a sizeof of 32 then, not only of 24. Or we would
do the alignment increase really _only_ for N==128, as special case.
Seems a bit strange, though.

FWIW, I do think we can still change wording regarding _BitInt in the
psABI. The current layout is in there since August 2021. But ideally I'd
like to see support for a change like this from more camps than just GGCC
and clang.


Ciao,
Michael.

Jan Beulich

unread,
Aug 15, 2023, 9:46:40 AM8/15/23
to Michael Matz, Joseph Myers, Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Trevor Gross
On 15.08.2023 14:59, Michael Matz wrote:
> On Tue, 15 Aug 2023, 'Jan Beulich' via X86-64 System V Application Binary Interface wrote:
>> On 14.08.2023 23:29, Joseph Myers wrote:
>>> On Mon, 14 Aug 2023, Trevor Gross wrote:
>>>
>>>> I also don't believe there is any reason to avoid alignments greater
>>>> than `2 * sizeof(size_t)` on malloc's behalf. SIMD types on both 32-
>>>
>>> _BItInt types are specified to be basic types, which means the standard
>>> doesn't permit them to have alignment requirements greater than
>>> max_align_t.
>>
>> Hmm, indeed. This effectively precludes adding new (wider) extended
>> integer types in a new revision of any psABI.
>
> As long as those aren't basic integer types that would be fine.

Right, but that then again makes impossible to have _BitInt(<N>) and
int<N>_t be equivalent (for such <N> where int<N>_t exists in the
first place, of course).

Jan

Ballman, Aaron

unread,
Aug 15, 2023, 10:07:30 AM8/15/23
to Beulich, Jan, Michael Matz, Joseph Myers, Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Trevor Gross
>> As long as those aren't basic integer types that would be fine.
> Right, but that then again makes impossible to have _BitInt(<N>) and int<N>_t be equivalent (for such <N> where int<N>_t exists in the first place, of course).

That's already impossible for any N < sizeof(int) * CHAR_BIT because _BitInt doesn't undergo integer promotions while int<N>_t does.

~Aaron
--
You received this message because you are subscribed to the Google Groups "X86-64 System V Application Binary Interface" group.
To unsubscribe from this group and stop receiving emails from it, send an email to x86-64-abi+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/x86-64-abi/63eca3cc-1f0d-cbb7-b341-1ac378971a17%40suse.com.

connor horman

unread,
Aug 15, 2023, 10:09:11 AM8/15/23
to Jan Beulich, X86-64 System V Application Binary Interface, Florian Weimer, Joseph Myers, Michael Matz, Trevor Gross

Frankly, I'd expect int256_t to also have 16 alignment, which would be compatible with _BitInt(256).


Michael Matz

unread,
Aug 15, 2023, 10:25:08 AM8/15/23
to Ballman, Aaron, Beulich, Jan, Joseph Myers, Florian Weimer, 'Jan Beulich' via X86-64 System V Application Binary Interface, Trevor Gross
Hi,

On Tue, 15 Aug 2023, Ballman, Aaron wrote:

> >> As long as those aren't basic integer types that would be fine.
> > Right, but that then again makes impossible to have _BitInt(<N>) and int<N>_t be equivalent (for such <N> where int<N>_t exists in the first place, of course).
>
> That's already impossible for any N < sizeof(int) * CHAR_BIT because
> _BitInt doesn't undergo integer promotions while int<N>_t does.

Correct, I was sloppy in my wording. Obviously they can't be equivalent.
I was only talking about layout, i.e. size, alignment, byte and bit order,
and padding location. (Which then still can't be the same between
hypothetical __int256 and _BitInt(256), except by constraining alignment
of __int256).


Ciao,
Michael.

Jubilee Young

unread,
Aug 15, 2023, 3:05:03 PM8/15/23
to X86-64 System V Application Binary Interface
On Tuesday, August 15, 2023 at 7:25:08 AM UTC-7 Michael Matz wrote:
Hi,

On Tue, 15 Aug 2023, Ballman, Aaron wrote:

> >> As long as those aren't basic integer types that would be fine.
> > Right, but that then again makes impossible to have _BitInt(<N>) and int<N>_t be equivalent (for such <N> where int<N>_t exists in the first place, of course).
>
> That's already impossible for any N < sizeof(int) * CHAR_BIT because
> _BitInt doesn't undergo integer promotions while int<N>_t does.

Correct, I was sloppy in my wording. Obviously they can't be equivalent.
I was only talking about layout, i.e. size, alignment, byte and bit order,
and padding location. (Which then still can't be the same between
hypothetical __int256 and _BitInt(256), except by constraining alignment
of __int256).

You can always decide, if this theoretical concern is big enough, that the psABI
should in fact specify alignof(max_align_t) must be 64, or whatever you prefer.
Currently it is unspecified by the psABI, despite being... somewhat relevant, thus
everyone is just coasting on the assumption that it would be preserved as 16.
As far as I understand things, we have no such promise:
Not from C, not from C++, not from the compilers, and not from the psABI.


Reply all
Reply to author
Forward
0 new messages