_BitInt(N) size and alignment change request for N > 64

Sunil Pandey

Dec 23, 2023, 1:55:43 AM
to X86-64 System V Application Binary Interface

Hi,

As per the current _BitInt(N) ABI, for N > 64, alignment is 8 bytes. With 8-byte alignment, even 16-byte data can potentially be split across 2 cache lines.

_BitInt(N) data split across cache lines can cause unpredictable performance degradation depending on memory data layout.
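
To make the split concrete, here is a minimal sketch, assuming 64-byte cache lines. The names `CACHE_LINE_SIZE` and `straddles_cache_line` are illustrative only and are not part of any ABI or library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, as on most current x86 processors. */
#define CACHE_LINE_SIZE 64u

static_assert((CACHE_LINE_SIZE & (CACHE_LINE_SIZE - 1u)) == 0,
              "cache line size must be a power of two");

/* Returns true if the object occupying [addr, addr + size) touches
 * more than one cache line. */
static inline bool straddles_cache_line(uintptr_t addr, size_t size)
{
    if (size == 0)
        return false;
    uintptr_t first_line = addr / CACHE_LINE_SIZE;
    uintptr_t last_line = (addr + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}
```

Under these assumptions, a 16-byte object placed at offset 56 (which is 8-byte aligned) spans two lines, while the same object at offset 48 or 64 (16-byte aligned) stays within one.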

Since most platforms have 64-byte-wide cache lines, I propose the following size/alignment for _BitInt(N) with N > 64. This size/alignment ensures that _BitInt(N) data up to 64 bytes will fit in a single 64-byte cache line. It also ensures that _BitInt(N) data greater than 64 bytes is aligned to a 64-byte cache line.

Proposed ABI change for _BitInt(N)

• For N <= 128, they have the same size and alignment as the smallest of (signed and unsigned) char, short, int, long, long long and __int128 types that can contain them.

• For N > 128, they are treated as a struct of 128-bit integer chunks.

  1. For 128 < N <= 256, _BitInt(N) types are aligned to 256 bits (32 bytes). The size of these types is 256 bits.
  2. For N > 256, _BitInt(N) types are aligned to 512 bits (64 bytes). The size of these types is the smallest multiple of 512 bits greater than or equal to N.
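
The proposed rules can be sketched as a small function, purely as an illustration. `struct bitint_layout` and `proposed_bitint_layout` are hypothetical names, and the N <= 128 branch assumes the x86-64 widths of 8, 16, 32, 64 and 128 bits for char, short, int, long/long long and __int128.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of a type's size and alignment, both in bits. */
struct bitint_layout {
    size_t size_bits;
    size_t align_bits;
};

/* Size and alignment of _BitInt(n) under the proposal in this thread. */
static inline struct bitint_layout proposed_bitint_layout(size_t n)
{
    struct bitint_layout layout;

    assert(n >= 1); /* _BitInt widths start at 1 bit */

    if (n <= 128) {
        /* Smallest of 8, 16, 32, 64, 128 bits that can contain n. */
        size_t width = 8;
        while (width < n)
            width *= 2;
        layout.size_bits = width;
        layout.align_bits = width;
    } else if (n <= 256) {
        layout.size_bits = 256;
        layout.align_bits = 256;
    } else {
        /* Smallest multiple of 512 bits that is >= n. */
        layout.size_bits = (n + 511) / 512 * 512;
        layout.align_bits = 512;
    }
    return layout;
}
```

For example, under this sketch _BitInt(129) occupies 256 bits at 256-bit alignment, and _BitInt(513) occupies 1024 bits at 512-bit alignment.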


--Sunil

Florian Weimer

Dec 23, 2023, 5:08:46 PM
to Sunil Pandey, X86-64 System V Application Binary Interface
* Sunil Pandey:

> *Proposed ABI change for __BitInt(N)*
>
> • For N <= 128, they have the same size and alignment as the smallest of
> (signed and unsigned) char, short, int, long, long long and __int128 types
> that can contain them.
>
> • For N > 128, they are treated as struct of 128-bit integer chunks.
>
> 1. For 256 >= N > 128, _BitInt(N) types are byte-aligned to 256 bits.
> The size of these types is 256 bits.
> 2. For N > 256, _BitInt(N) types are byte-aligned to 512 bits. The size
> of these types is the smallest multiple of 512 bits greater than or equal
> to N.

For the N > 256 case, storage for objects containing these types cannot
be allocated by malloc. I do not think this is a good trade-off, sorry.
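
As a rough illustration of this point: malloc only promises alignment suitable for types up to `alignof(max_align_t)` (typically 16 on x86-64 Linux), so 64-byte-aligned storage has to come from an explicit call such as C11 `aligned_alloc`. The helper names and `PROPOSED_ALIGN` below are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: the 64-byte (512-bit) alignment proposed for N > 256. */
enum { PROPOSED_ALIGN = 64 };

/* The alignment that malloc is required to provide for large enough sizes. */
static inline size_t malloc_guaranteed_alignment(void)
{
    return alignof(max_align_t);
}

/* Returns nonzero if p is a multiple of align. */
static inline int is_aligned(const void *p, size_t align)
{
    assert(align != 0);
    return ((uintptr_t)p % align) == 0;
}

/* Hypothetical helper: allocate storage at the proposed alignment.
 * C11 aligned_alloc expects the size to be a multiple of the alignment,
 * so the request is rounded up first. Release the result with free(). */
static inline void *alloc_cache_aligned(size_t size)
{
    size_t rounded = (size + PROPOSED_ALIGN - 1) / PROPOSED_ALIGN * PROPOSED_ALIGN;
    return aligned_alloc(PROPOSED_ALIGN, rounded);
}
```

A pointer returned by plain malloc may happen to be 64-byte aligned, but nothing guarantees it unless `malloc_guaranteed_alignment()` is at least 64.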

Sunil Pandey

Dec 25, 2023, 1:40:02 AM
to X86-64 System V Application Binary Interface

My understanding is that implementations follow the ABI. Can you please describe the technical trade-off in detail? I understand that size increases with higher alignment, but I am not aware of any other trade-off.

Joseph Myers

Dec 28, 2023, 7:51:24 PM
to Sunil Pandey, X86-64 System V Application Binary Interface
On Sun, 24 Dec 2023, Sunil Pandey wrote:

> My understanding, implementation follow ABI. Can you please describe technical
> trade-off in detail. I understand, size increase with higher alignment, but
> not aware of any other trade-off.

The size increase would adversely affect all other uses of malloc
for larger sizes, to benefit only the limited use case of _BitInt.
(_BitInt types are basic types, so malloc is required to return memory
suitably aligned for them, if the size passed is large enough, and
max_align_t is required to be defined to be at least as aligned as all
such types.)
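
One way to picture the cost is a deliberately simplified model, not how any particular allocator works: if every sufficiently large block must start on an alignment boundary and blocks are packed back to back, each request is effectively rounded up to a multiple of that alignment. `round_up` and `padding_waste` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds size up to the next multiple of align. */
static inline size_t round_up(size_t size, size_t align)
{
    assert(align != 0);
    return (size + align - 1) / align * align;
}

/* Bytes lost to padding when a request of `size` bytes is rounded up to
 * `align`. Simplified model: real allocators also have headers, size
 * classes and other policies that this ignores. */
static inline size_t padding_waste(size_t size, size_t align)
{
    return round_up(size, align) - size;
}
```

In this model a 72-byte request loses 8 bytes to padding at 16-byte alignment but 56 bytes at 64-byte alignment, regardless of whether the caller ever stores a _BitInt in it.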

--
Joseph S. Myers
jos...@codesourcery.com

Sunil Pandey

Dec 30, 2023, 7:51:04 PM
to X86-64 System V Application Binary Interface
malloc is just one implementation; we shouldn’t make an ABI decision based
on any specific implementation. If increased alignment for _BitInt in malloc
affects its other uses, that should be addressed separately.

_BitInt is a newly introduced feature; it will be adopted over time on its own
merit.

I see the following performance benefits of increasing alignment up to the
cache line size:

  • _BitInt data won’t be unnecessarily split across cache lines (64-byte cache line).
  • It can eliminate alignment-related run-to-run performance variation.
  • It enables efficient 128/256/512-bit _BitInt vectorization.
  • It reduces alignment-related compatibility overhead if the alignment is guaranteed by the ABI.

Jan Beulich

Jan 4, 2024, 3:48:14 AM
to Sunil Pandey, X86-64 System V Application Binary Interface
On 31.12.2023 01:51, Sunil Pandey wrote:
> On Thursday, December 28, 2023 at 4:51:24 PM UTC-8 jos...@codesourcery.com
> wrote:
>> On Sun, 24 Dec 2023, Sunil Pandey wrote:
>>
>>> My understanding, implementation follow ABI. Can you please describe
>>> technical trade-off in detail. I understand, size increase with higher
>>> alignment, but not aware of any other trade-off.
>>
>> The fact of size increase would adverse affects all other uses of malloc
>> for larger sizes to benefit only the limited use case of _BitInt.
>> (_BitInt types are basic types, so malloc is required to return memory
>> suitably aligned for them, if the size passed is large enough, and
>> max_align_t is required to be defined to be at least as aligned as all
>> such types.)
>
> malloc is just one implementation, we shouldn’t make ABI decision based
> on any specific implementation. If malloc use for __BitInt increased
> alignment affects its other usage; it should be addressed separately.

Otoh I don't think we should make an ABI change which cannot reasonably be
matched in even a fundamental language like C, all the more since _BitInt()
is there just because of C (aiui). We certainly can't expect the C spec,
especially ones already finalized, to be changed just because of one
specific architecture's requirements. Then again I expect more suitable
alignment of wider _BitInt() would also benefit other architectures, so
maybe the whole situation needs approaching from the C spec side anyway?

Jan

Sunil Pandey

Jan 7, 2024, 11:55:30 AM
to X86-64 System V Application Binary Interface
The motivation behind this change is architecture-specific: most x86 processors
have 64-byte cache lines and vectorization capability, which may not be true for
other architectures.

The C spec may not be the right place for this change, as forcing higher
alignment in the C spec may adversely affect other architectures, depending on
their cache line size and vectorization capability.
