> So you truly mean to special case just the single N = 128 case? I'd
> consider this irritating as well, while I also agree that the
> present situation isn't nice. What about making a more intrusive
> change and specifying alignment as that of the next power-of-2 number
> of bytes, for N > 64? That would then also accommodate a possible
> __int256_t as well as e.g. permit the same 16-byte aligned load/store
> insns to be used for items with 64 < N < 128 (should the compiler
> elect to use SSE/AVX insns). And it would further make things
> consistent with N <= 64.
This does seem reasonable to me - but do we know if this is the
pattern SysV will follow if __int512_t or larger are ever added? Or is
it possible that alignment may be capped at `2 * sizeof(size_t)` for
standard ints, even though __m256 and __m512 currently exceed it?
I'd guess the former, but I'm not sure where this would be
discussed. I indeed did not intend to special case N = 128; I just
forgot to include a followup point about this.
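To make the suggested rule concrete, here is a minimal sketch in C of
"alignment is the next power-of-2 number of bytes". The helper name
proposed_bitint_align is hypothetical, and the sketch assumes the
uncapped variant (no `2 * sizeof(size_t)` limit), which is exactly the
open question above; it is not text from any psABI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of any ABI document: alignment in bytes
 * that _BitInt(n_bits) would get if alignment were the storage size in
 * bytes rounded up to the next power of two, with no upper cap. */
size_t proposed_bitint_align(size_t n_bits)
{
    assert(n_bits > 0);

    /* Whole bytes needed to hold n_bits. */
    size_t bytes = (n_bits + 7) / 8;

    /* Round up to the next power of two. */
    size_t align = 1;
    while (align < bytes)
        align <<= 1;
    return align;
}
```

Under this assumption the alignment steps at N = 8, 16, 32, 64, 128,
256 and so on, so 64 < N <= 128 shares 16-byte alignment and
_BitInt(129) through _BitInt(256) would need 32 bytes.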
> That said, I'm generally wary of any ABI changes that have left draft
> state, ...
>
>> Fixing the spec is uncomfortable but I think gcc is still working on
>> their implementation, and LLVM can likely adjust theirs in a similar
>> way to the ongoing __int128 fix - the time to make this change is
>> really now or never. And I think that picking "never" might be a
>> choice that winds up being regretted down the line, once BitInts get
>> more popular.
>
> ... which you also express here. For the psABI, though. I'm afraid I
> don't really know how drafts are to be told from "official" versions.
Absolutely correct to be wary, of course; I only bring this up because
we are still reasonably early (low implementation / usage) and there is
pretty strong motivation. It is interesting that in a quick search [1]
some of the more common lines are:
    typedef unsigned _BitInt(128) uint128_t;
    typedef signed _BitInt(128) int128_t;
    typedef unsigned _BitInt(256) uint256_t;
    typedef unsigned _BitInt(512) uint512_t;
... which, as we've been discussing, is currently problematic.
Since I don't see any strong opposition to making this change, I
think the best thing to do is to alert GCC's implementation thread
and follow up on the clang issue, so that anyone involved can bring
up their concerns here if needed. I will do this.
> Does this mean 32-byte alignment for __int256_t? That would mean that
> these types cannot be allocated using malloc, which seems rather
> problematic.
Malloc is a good point, but I am not sure to what extent it is a
problem (more below).
>>> Most mallocs already provide 16-byte alignment, at least for allocations
>>> of 16 bytes or more, so the change for _BitInt(128) would be harmless in
>>> that regard at least.
>> [ ... ]
>> That's not nice, I agree, but also no different from __m256 or __m512.
>
> The difference is that it's less obvious that alignment would change if
> you go from _BitInt(128) to _BitInt(129). (We have the same thing for
> char[15] and char[16], but again that's covered by the 16-byte minimum
> malloc alignment.)
I don't think that there is anything less obvious about alignment
changing compared to what already happens when you go from _BitInt(16)
to _BitInt(17) or similar - if the rule is that _BitInt alignment
changes at every power of 2, then that is easy to follow. Unless you
mean less obvious that malloc may not provide correct alignment?
I also don't believe there is any reason to avoid alignments greater
than `2 * sizeof(size_t)` on malloc's behalf. SIMD types on both 32-
and 64-bit targets already exceed it, as mentioned above, as will
(probably?) int256_t and larger. Any implementation documentation
needs to make the alignment of `_BitInt` quite clear, and should
include a warning that larger sizes should use aligned_alloc instead
of malloc. Glibc already suggests this in its docs [2]; I wish the
malloc manpages said something more useful about alignment than just
"...suitably aligned for any type that fits into the requested size
or less".
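As a rough illustration of what such a warning could point people at,
here is a sketch using C11 aligned_alloc. The wrapper name
alloc_overaligned is made up for this example, and the 32-byte figure
in the usage note assumes the power-of-2 rule discussed above is
adopted for _BitInt(256).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocate at least `size` bytes at `align`-byte
 * alignment. Returns NULL on invalid arguments or allocation failure.
 * The caller releases the block with free(). */
void *alloc_overaligned(size_t align, size_t size)
{
    /* Alignment must be a nonzero power of two. */
    if (align == 0 || (align & (align - 1)) != 0)
        return NULL;

    /* Guard against overflow when rounding the size up. */
    if (size > SIZE_MAX - (align - 1))
        return NULL;

    /* C11 required the size to be a multiple of the alignment, so
     * round up to stay portable. */
    size_t rounded = (size + align - 1) & ~(align - 1);

    void *p = aligned_alloc(align, rounded);
    assert(p == NULL || (uintptr_t)p % align == 0);
    return p;
}
```

With this, storage for a 32-byte-aligned type would be requested as
alloc_overaligned(32, sizeof(T)) rather than malloc(sizeof(T)), which
is only guaranteed to satisfy fundamental alignments.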
[1]:
https://grep.app/search?q=_BitInt%28&filter[lang][0]=C&filter[lang][1]=C%2B%2B
[2]:
https://ftp.gnu.org/old-gnu/Manuals/glibc-2.2.3/html_node/libc_30.html