Hmm, why is '.align 5' used in the kernels for arm and arm64? Can somebody explain?

Guodong Xu

Jul 13, 2020, 5:16:00 AM
to OpenBLAS-dev
Hi,

While going through the .S assembly code in kernel/arm and kernel/arm64 [2], I noticed that many of the files use '.align 5', which I read as aligning the code to addresses that are multiples of 5. That's quite strange in this binary world: each Arm instruction is 32 bits (4 bytes) long.

I don't think the Arm/Arm64 architecture performs any better when instructions are aligned to addresses that are multiples of 5.

Can somebody provide a reasonable explanation?

By contrast, in Arm Inc.'s 'optimized routines' (which I take to be their official code), I can see they use '.p2align 4', which means aligning to a 2^4 = 16-byte address boundary. [1]


Thanks a lot. 

Guodong Xu

Jul 14, 2020, 1:25:12 AM
to OpenBLAS-dev
It turns out that, at least with the GNU assembler (as), '.align 5' on arm/arm64 means aligning to 2^5 = 32 bytes. So, we are good.
Sorry for the false alarm.

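The meaning of '.align' differs from system to system: on some targets the argument is a byte count, so the address becomes a multiple of that number; on others it is an exponent, so the address becomes a multiple of 2 to the power of that number. This is explained in the GNU assembler (GAS) documentation. [3]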

"The way the required alignment is specified varies from system to system. For the a29k, hppa, m68k, m88k, w65, sparc, and Hitachi SH, and i386 using ELF format, the first expression is the alignment request in bytes. For example `.align 8' advances the location counter until it is a multiple of 8. If the location counter is already a multiple of 8, no change is needed.

For other systems, including the i386 using a.out format, it is the number of low-order zero bits the location counter must have after advancement. For example `.align 3' advances the location counter until it a multiple of 8. If the location counter is already a multiple of 8, no change is needed.

This inconsistency is due to the different behaviors of the various native assemblers for these systems which GAS must emulate. GAS also provides .balign and .p2align directives, described later, which have a consistent behavior across all architectures (but are specific to GAS)."
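To make the two readings concrete, here is a small Python sketch of the arithmetic only. It is not assembler code, and the function names (align_up, align_bytes, align_pow2) are made up for illustration. It assumes the boundary is a power of two, which is why reading '.align 5' as "multiple of 5 bytes" is rejected rather than computed.

```python
def align_up(location: int, boundary: int) -> int:
    """Advance `location` to the next multiple of `boundary`.

    Assumes `boundary` is a positive power of two; an address that is
    already aligned is returned unchanged.
    """
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a positive power of two")
    return (location + boundary - 1) & ~(boundary - 1)


def align_bytes(location: int, n: int) -> int:
    """Byte-count reading: the argument is the alignment in bytes
    (the behaviour '.balign n' always has)."""
    return align_up(location, n)


def align_pow2(location: int, n: int) -> int:
    """Power-of-two reading: the argument is the number of low-order
    zero bits (the behaviour '.p2align n' always has, and what
    '.align n' means on arm/arm64 according to this thread)."""
    return align_up(location, 1 << n)


if __name__ == "__main__":
    loc = 0x1004
    print(hex(align_pow2(loc, 5)))   # '.align 5' on arm/arm64: 32-byte boundary
    print(hex(align_pow2(loc, 4)))   # '.p2align 4': 16-byte boundary
    print(hex(align_bytes(loc, 8)))  # byte-count reading of '.align 8'
```

Under the power-of-two reading, '.align 5' moves 0x1004 up to 0x1020, which is the 32-byte boundary discussed above. When writing new code, '.balign' or '.p2align' avoids the ambiguity, as the quoted documentation notes.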

Zhang Xianyi

Jul 14, 2020, 2:56:45 AM
to Guodong Xu, OpenBLAS-dev
Yes, it is 2^5 on ARM.

Xianyi

On Tue, Jul 14, 2020 at 1:25 PM, Guodong Xu <guodo...@linaro.org> wrote: