Hi,
Under the current __BitInt(N) ABI, alignment for N > 64 is 8 bytes. With 8-byte alignment, even a 16-byte object can be split across two cache lines.
__BitInt(N) data split across a cache-line boundary can cause unpredictable performance degradation, depending on how the data happens to be laid out in memory.
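To make the split concrete, here is a minimal sketch in C. The helper names (straddles_cache_line, count_split_offsets) are hypothetical and only for illustration, and the 64-byte line size is the assumption used throughout this mail.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the object occupying [addr, addr + size)
 * touches more than one cache line of `line` bytes. */
static bool straddles_cache_line(uintptr_t addr, size_t size, size_t line)
{
    return (addr / line) != ((addr + size - 1) / line);
}

/* Hypothetical helper: among all `align`-aligned start offsets inside one
 * cache line, count those at which a `size`-byte object straddles a line. */
static unsigned count_split_offsets(size_t size, size_t align, size_t line)
{
    unsigned n = 0;
    for (size_t off = 0; off < line; off += align)
        if (straddles_cache_line((uintptr_t)off, size, line))
            n++;
    return n;
}
```

For a 16-byte object and a 64-byte line, count_split_offsets(16, 8, 64) finds one straddling offset (56, covering bytes 56..71), while count_split_offsets(16, 16, 64) finds none.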
Since most platforms have a 64-byte cache line, I propose the following size and alignment for __BitInt(N) with N > 64. This ensures that __BitInt(N) data up to 64 bytes fits in a single 64-byte cache line, and that __BitInt(N) data larger than 64 bytes is aligned to a 64-byte cache line.
Proposed ABI change for __BitInt(N):
• For N <= 128, __BitInt(N) has the same size and alignment as the smallest of the (signed and unsigned) char, short, int, long, long long and __int128 types that can contain it.
• For N > 128, __BitInt(N) is treated as a struct of 128-bit integer chunks.
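One possible reading of the two rules above, sketched in C. The struct and function names are hypothetical. The sketch assumes 8/16/32/64/128-bit widths for char, short, int, long long and __int128, and it assumes that "struct of 128-bit integer chunks" means ceil(N/128) chunks of 16 bytes with 16-byte alignment. That last part is my interpretation, not something the proposal text spells out.

```c
#include <stddef.h>

/* Hypothetical result type: size and alignment in bytes. */
struct bitint_layout {
    size_t size;
    size_t align;
};

/* Sketch of the proposed rule for __BitInt(n_bits), under the
 * assumptions stated above. */
static struct bitint_layout proposed_bitint_layout(unsigned n_bits)
{
    struct bitint_layout l;

    if (n_bits > 128) {
        /* Assumed: round up to whole 128-bit chunks, aligned like one chunk. */
        l.size = (((size_t)n_bits + 127) / 128) * 16;
        l.align = 16;
        return l;
    }

    /* Smallest standard integer width that can contain n_bits. */
    if (n_bits <= 8)       l.size = 1;
    else if (n_bits <= 16) l.size = 2;
    else if (n_bits <= 32) l.size = 4;
    else if (n_bits <= 64) l.size = 8;
    else                   l.size = 16;

    l.align = l.size;
    return l;
}
```

Under this reading, __BitInt(65) would take 16 bytes with 16-byte alignment, and __BitInt(129) would take 32 bytes with 16-byte alignment.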
--Sunil
My understanding is that implementations follow the ABI. Can you please describe the technical trade-offs in detail? I understand that size increases with higher alignment, but I am not aware of any other trade-off.
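As a small illustration of the size trade-off mentioned above, the sketch below uses two hypothetical 16-byte stand-in types that differ only in alignment. They are byte arrays with _Alignas, not real __BitInt types, so this only shows the padding effect.

```c
#include <stddef.h>

/* Hypothetical 16-byte payloads, aligned to 8 and to 16 bytes. */
struct chunk8  { _Alignas(8)  unsigned char bytes[16]; };
struct chunk16 { _Alignas(16) unsigned char bytes[16]; };

/* The same record shape with each payload: one tag byte, then the payload. */
struct rec8  { char tag; struct chunk8  v; };
struct rec16 { char tag; struct chunk16 v; };
```

With the 8-byte-aligned payload the record is 24 bytes (7 bytes of padding after tag). With the 16-byte-aligned payload it is 32 bytes (15 bytes of padding).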
on any specific implementation. If using malloc with the increased __BitInt alignment
affects malloc's other uses, that should be addressed separately.
__BitInt is a newly introduced feature; it will be adopted over time on its own
merits.
I see the following performance benefit of increasing alignment up to the cache-line
size.