NASM for Windows ARM builds?

Jeffrey Walton

Nov 15, 2016, 5:47:17 AM
to Crypto++ Users
Hi Everyone,

What are your thoughts on optionally using NASM for Windows ARM builds?

The first part of the problem is ASM for Microsoft platforms. We can inline for Win32/X86. For X64, we have to provide a separate source file and use masm64. X64 is the reason for x64dll.asm and friends. For ARM, there's nothing because Microsoft tools don't support ARM ASM. For ARM, we rely on intrinsics at the moment.

The second part of the problem is that intrinsics may not be meeting performance expectations. We got a private report over the weekend that ARM64 was performing orders of magnitude slower than C/C++. The code was intrinsics-based, and it was running on Linux. Part of the reason we used intrinsics was that they were available to Windows clients.

My testing did not reveal the ARM64 gap, but I think the situation is fairly typical - a user simply builds the sources under their favorite toolchain and something pops out. The toolchain may (or may not) set appropriate flags and the user may (or may not) set appropriate flags. I can live with "intrinsics usually run faster than C/C++" or "intrinsics do not run slower". But I don't like "intrinsics sometimes run slower". Some of these devices are too hard for me to test (like Windows Phone and tablet), so I don't want to risk it for Windows users.
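
One way to see what a given build actually enabled is to look at the compiler's predefined macros. The sketch below is an illustration under assumptions, not Crypto++ code: `CompiledArmFeatures` is a hypothetical helper, and it only reports what the compiler was told at build time (through ACLE macros such as `__ARM_NEON` and `__ARM_FEATURE_CRYPTO`), not what the CPU supports at run time.

```cpp
#include <string>

// Hypothetical helper: lists the ARM-related features the compiler enabled
// for this translation unit, based on predefined macros only.
std::string CompiledArmFeatures()
{
    std::string features;

#if defined(__aarch64__) || defined(_M_ARM64)
    features += "arm64 ";
#elif defined(__arm__) || defined(_M_ARM)
    features += "arm32 ";
#else
    features += "non-arm ";
#endif

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    features += "neon ";      // NEON/ASIMD intrinsics are available
#endif
#if defined(__ARM_FEATURE_CRC32)
    features += "crc32 ";     // CRC32 instructions were enabled by the flags
#endif
#if defined(__ARM_FEATURE_CRYPTO)
    features += "crypto ";    // AES/SHA instructions were enabled by the flags
#endif

    return features;
}
```

If the string lacks `neon` or `crypto` on an ARM build, the intrinsics path was probably compiled out or compiled without the flags it needs, which is one plausible source of an unexpected slowdown.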

I think we can address the gap, but we need to expand our repertoire with a cross-platform assembler. As far as I know, the cross-platform assembler is NASM. Does anyone know of any others?

NASM has another hidden benefit for distros and app suppliers. Using NASM, we could provide, say, NASM_SSE2_SomeFunc and NASM_AVX_SomeFunc from a single source file. We can't currently do that because when we enable AVX for NASM_AVX_SomeFunc, then SSE4 and AVX could cross-pollinate into NASM_SSE2_SomeFunc. That will result in an "Illegal Instruction" exception.
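
To make the single-source idea concrete, here is a minimal C++ sketch of run-time dispatch between the two routines. It is an illustration under assumptions, not Crypto++ code: the two function bodies are plain C++ stand-ins for routines that would come out of the assembler, and `HasAVX`, `SelectSomeFunc`, and `SomeFunc` are hypothetical names. The probe uses the GCC/Clang builtin `__builtin_cpu_supports` and simply reports false on other compilers or architectures.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the SSE2 routine. In the scheme described above this would be
// an extern "C" symbol produced by the assembler; here it is plain C++.
static std::uint32_t NASM_SSE2_SomeFunc(const std::uint8_t* p, std::size_t n)
{
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += p[i];
    return sum;
}

// Stand-in for the AVX routine. It computes the same byte sum with a
// four-way unrolled loop; a real build would link the assembled AVX code.
static std::uint32_t NASM_AVX_SomeFunc(const std::uint8_t* p, std::size_t n)
{
    std::uint32_t sum = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        sum += p[i] + p[i + 1] + p[i + 2] + p[i + 3];
    for (; i < n; ++i)
        sum += p[i];
    return sum;
}

// Hypothetical feature probe (assumption: GCC or Clang on x86/x64).
static bool HasAVX()
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx") != 0;
#else
    return false;  // unknown toolchain or CPU family: take the baseline path
#endif
}

typedef std::uint32_t (*SomeFuncPtr)(const std::uint8_t*, std::size_t);

// Pick the widest routine the running CPU can execute.
static SomeFuncPtr SelectSomeFunc()
{
    return HasAVX() ? NASM_AVX_SomeFunc : NASM_SSE2_SomeFunc;
}

// Public entry point: the selection runs once, on the first call.
std::uint32_t SomeFunc(const std::uint8_t* p, std::size_t n)
{
    static const SomeFuncPtr impl = SelectSomeFunc();
    return impl(p, n);
}
```

A caller only ever sees `SomeFunc`. Because the AVX routine is reached only after the probe succeeds, a CPU without AVX never executes it, and because each routine is assembled on its own terms, the compiler has no chance to leak wider instructions into the SSE2 path.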

If things work well, then we could extend NASM to all platforms, like X86 and X64 on Windows and Linux. But that's just wishful thinking at the moment.

There is no free lunch. Development and maintenance costs increase for us. I pinged Wei about it, and he left it up to us.

Jeff

Jeffrey Walton

Nov 15, 2016, 6:06:07 AM
to Crypto++ Users

NASM has another hidden benefit for distros and app suppliers. Using NASM, we could provide, say, NASM_SSE2_SomeFunc and NASM_AVX_SomeFunc from a single source file. We can't currently do that because when we enable AVX for NASM_AVX_SomeFunc, then SSE4 and AVX could cross-pollinate into NASM_SSE2_SomeFunc. That will result in an "Illegal Instruction" exception.

If things work well, then we could extend NASM to all platforms, like X86 and X64 on Windows and Linux. But that's just wishful thinking at the moment.

By the way, this experiment has been privately running in my test rigs for some time now because RDRAND and RDSEED are in a similar setup. It's the reason rdrand.cpp is set up like it is:

 * C++ function that calls ASM routine
      * can provide intrinsic-based routine
      * https://github.com/weidai11/cryptopp/blob/master/rdrand.cpp
 * Microsoft MASM ASM routines
      * https://github.com/weidai11/cryptopp/blob/master/rdrand.asm
 * Linux ASM routines
      * https://github.com/weidai11/cryptopp/blob/master/rdrand.S
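
The layering in the list above can be sketched roughly as follows. This is not the code in rdrand.cpp; it is a minimal illustration under assumptions. `SKETCH_USE_EXTERNAL_ASM`, `SKETCH_ASM_GenerateWord`, `GenerateWord`, and `FillWords` are hypothetical names. The intrinsic path uses `_rdrand32_step` from `<immintrin.h>` and is compiled only when the compiler defines `__RDRND__` (for example with `-mrdrnd`); otherwise the wrapper reports that no hardware generator is available.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(SKETCH_USE_EXTERNAL_ASM)
// Hypothetical symbol provided by a separately assembled .asm or .S file.
// Assumed contract: returns nonzero on success and writes one word to *out.
extern "C" int SKETCH_ASM_GenerateWord(std::uint32_t* out);
#elif defined(__RDRND__)
#include <immintrin.h>
#endif

// C++ wrapper: one call site, three possible backends chosen at build time.
static bool GenerateWord(std::uint32_t& out)
{
#if defined(SKETCH_USE_EXTERNAL_ASM)
    return SKETCH_ASM_GenerateWord(&out) != 0;
#elif defined(__RDRND__)
    // RDRAND can fail transiently, so retry a bounded number of times.
    for (int attempt = 0; attempt < 10; ++attempt) {
        unsigned int value = 0;
        if (_rdrand32_step(&value) != 0) {
            out = value;
            return true;
        }
    }
    return false;
#else
    (void)out;
    return false;  // neither ASM nor intrinsics were built in
#endif
}

// Fills up to count words and returns how many were actually produced.
std::size_t FillWords(std::uint32_t* words, std::size_t count)
{
    std::size_t filled = 0;
    while (filled < count && GenerateWord(words[filled]))
        ++filled;
    return filled;
}
```

The point of the shape is that callers depend only on the C++ wrapper, so swapping the MASM file, the `.S` file, or the intrinsic path is a build-time decision that does not touch calling code.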

As I was explaining to Wei, part of the problem that I see in testing is: too many choices. We can use C/C++ with intrinsics; we can use MS MASM (rdrand.asm); or we can use the NASM assembler (rdrand.S).

It would be very appealing to fold rdrand.asm into rdrand.S, and then use rdrand.S everywhere.

Jeff

Andrew Marlow

Nov 16, 2016, 7:21:24 AM
to Crypto++ Users


On Tuesday, 15 November 2016 10:47:17 UTC, Jeffrey Walton wrote:
Hi Everyone,

What are your thoughts on optionally using NASM for Windows ARM builds?
[snip]
 As far as I know, the cross-platform assembler is NASM. Does anyone know of any others?

There is yasm (https://en.wikipedia.org/wiki/Yasm), which is a rewrite of NASM under the BSD license.

Mouse

Nov 16, 2016, 12:26:50 PM
to Andrew Marlow, Crypto++ Users
I found YASM developer(s) to be very unresponsive to comments, bug reports and such - but have to admit that YASM in general works, and is a good substitute for NASM on most platforms (for example, I'm using it on Mac).

--
Regards,
Mouse

Jeffrey Walton

Nov 16, 2016, 1:54:28 PM
to Crypto++ Users

Oh man, it looks like I miscalculated. I looked at both the NASM and YASM pages. Neither supports ARM.

Andrew Marlow

Nov 17, 2016, 7:53:18 AM
to Crypto++ Users
On Wednesday, 16 November 2016 18:54:28 UTC, Jeffrey Walton wrote:

What are your thoughts on optionally using NASM for Windows ARM builds?
[snip]
 As far as I know, the cross-platform assembler is NASM. Does anyone know of any others?

There is yasm (https://en.wikipedia.org/wiki/Yasm), which is a rewrite of NASM under the BSD license.

Oh man, it looks like I miscalculated. I looked at both the NASM and YASM pages. Neither supports ARM.

True, but flat assembler does via a cross-assembler; see https://arm.flatassembler.net/.
 