Thoughts on future ARM64 support for emsdk?

Brion Vibber

Jul 9, 2020, 2:30:51 PM
to emscripten Mailing List
I've been a bit of an ARM64 enthusiast of late, trying out Linux, Windows 10, and iOS devices that run on the ARM64/AArch64 family of processors. Emscripten works fine on these machines if one cares to do some light development work on them, but since there are no binaries built by CI, the standard emsdk can only install by building from source -- which can take hours on a middleweight portable machine.

Now that Apple is switching their Mac product line to ARM64 processors over the next two years, it will likely become much more common next year for people to have ARM64-based laptop and desktop computers, and some of them will need to build something with emscripten in their workflows either on macOS or on a virtualized Linux in Docker etc.

From what I've seen presented at WWDC, the ARM64 Macs will support emulated processes, so it may work to ship the x86_64 binaries with the caveat that they will run much slower than native builds.

Virtualized Linux builds would also need native ARM64 binaries to run, or else they'd have to sit there for a couple hours compiling after every upgrade.

And of course there are already Windows 10 and Linux computers available with ARM64 processors, on sale for a couple of years now and used in the wild in modest numbers.

I get the impression that the biggest roadblock to explicit ARM64 support in emscripten is getting it into the CI infrastructure:
* Linux/ARM64 builds and testing?
* macOS/ARM64 builds and testing?
* Windows/ARM64 builds and testing?

It's too soon to start on macOS since it's in beta, dev kits aren't shipped yet, and there's no obvious way yet to figure out how to run tests on a macOS ARM system in CI. :)

And I'm less sure how important Win/ARM64 is, given you can use the Linux/ARM64 version in WSL virtualization. But some folks prefer to develop on native Windows, too.

If there's any way we can start talking about Linux/ARM64 builds and testing, I would be very happy about it! I would even kick in a few bucks for a VM or something if that would help any. ;)

Thanks for your time and your consideration!

-- brion vibber (brion @ pobox.com / brion @ wikimedia.org)
Wikimedia Foundation

Thomas Lively

Jul 9, 2020, 2:43:12 PM
to emscripte...@googlegroups.com
Oh wow, it didn't even cross my mind that we would be affected by Apple switching to ARM. I assume ARM macOS is something we will want to support natively at some point, but I don't know whether the best solution is to just wait for our CI providers to add support. You're also right that ideally we would support ARM64 on all three OSes, but that would double our testing burden even if all our CI providers supported it. I'm not sure what the best path forward is.

Sam Clegg

Jul 9, 2020, 3:52:23 PM
to emscripte...@googlegroups.com
I don't think it is very likely that we will want to add these new architectures to the ones that we pre-build and test on our infrastructure.

The exception being macOS, but as you say that is a ways out still.

I'm afraid I view linux/arm64 a lot like FreeBSD or some other Unix with relatively few users. I'm a big fan of these niche platforms, but until there is a large enough userbase I don't think it is worth our time to try to support them officially. Bear in mind that we don't even provide pre-built binaries for linux/x86_32 (which I imagine has way more users than linux/arm64).

Having said that, it seems reasonable that we could allow emsdk to support these platforms if community members such as yourself want to take the time to build and upload binary packages for them. For example, you could put your binaries in a GCS/S3 bucket and emsdk could look there when running on linux/arm64. Do you think that could work?
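To make the bucket idea concrete, here is a minimal sketch of how an installer might choose where to download from based on the host. It is not emsdk's actual logic: the two base URLs, the path layout, and the `archive_url` helper are all hypothetical. The only fact it relies on is that Python's `platform.machine()` reports `aarch64` on ARM64 Linux and `arm64` on ARM64 macOS.

```python
import platform

# Hypothetical bucket locations; neither URL is a real emsdk endpoint.
OFFICIAL_BASE = "https://example.com/official-builds"
COMMUNITY_BASE = "https://example.com/community-builds"

# platform.machine() reports "aarch64" on ARM64 Linux and "arm64" on ARM64 macOS.
ARM64_NAMES = {"aarch64", "arm64"}


def archive_url(system, machine, release_hash):
    """Return a download URL for a prebuilt archive, or None if there is none."""
    system = system.lower()
    machine = machine.lower()
    if system == "linux" and machine in ARM64_NAMES:
        # Community-maintained binaries would live in a separate bucket.
        return f"{COMMUNITY_BASE}/linux-arm64/{release_hash}/wasm-binaries.tbz2"
    if system == "linux" and machine in {"x86_64", "amd64"}:
        return f"{OFFICIAL_BASE}/linux/{release_hash}/wasm-binaries.tbz2"
    # No prebuilt archive known: the caller would fall back to a source build.
    return None


if __name__ == "__main__":
    print(archive_url(platform.system(), platform.machine(), "0123abcd"))
```

Returning `None` for unknown hosts keeps the existing build-from-source path as the fallback, so a missing community upload would not break installation.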

cheers,
sam





Sam Clegg

Jul 9, 2020, 3:55:09 PM
to emscripte...@googlegroups.com
One other thing that might help here is https://github.com/emscripten-core/emscripten/issues/11362. If we allow for stable LLVM versions then you can get clang from the Linux distro (using apt-get or whatever), which would significantly reduce the cost of building emsdk from source. This could either mean that the cost is low enough for you that you are happy to build from source, or low enough for us that we could consider adding it to our infra (at least the build part). WDYT?

cheers,
sam

Alon Zakai

Jul 9, 2020, 3:58:37 PM
to emscripte...@googlegroups.com
At the risk of sounding like a broken record - I've been talking about the next idea a lot recently ;) - I think this is something wasm2c can help with. I just gave a talk about how:


The idea is that we add a single additional build target, "universal C", which compiles LLVM and Binaryen to portable C code that can then be compiled on practically any platform. So this new "build" would be C, and people would download those C files and run a simple command to build them locally. After the local build you end up with a normal executable that just works.

There is still that build step locally, but it's the simplest build possible - no build system is needed, no special local setup, no cmake or configure, just run gcc or clang on a self-contained C file. That should be trivial on ARM64 MacOS or Linux as I believe they have a system C compiler installed by default.

I'm not opposed to adding "proper" ARM64 builds, though - ARM64 builds would have some benefits over wasm2c. But wasm2c builds do cover the long tail of less-common platforms, and we can probably set them up much quicker too. I hope to do this for Binaryen soon anyhow.

- Alon

Thomas Lively

Jul 9, 2020, 4:01:34 PM
to emscripte...@googlegroups.com
How does the wasm2c build work with something as large as LLVM? I would assume that at some point the C file would get so large that the C compiler would fall over.

Alon Zakai

Jul 9, 2020, 4:14:41 PM
to emscripte...@googlegroups.com
Yeah, that's an issue with wasm2c. It takes a few minutes to build (optimized) wasm-opt, and clang would be much worse, and maybe fail as you said.

I don't think anyone's looked into this, but ideas include:

* Get wasm2c to emit separate files, each containing one or more functions. Then you just need a Makefile or such and can even build in parallel.

* Write a little python script that splits up the file automatically. The format is pretty simple so that's easy. And the python could also run gcc/clang in a process pool for you.
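As a rough sketch of the second idea, the following covers only the "run the compiler in a pool" half. It assumes the wasm2c output has already been split into files named `part_*.c`, which is a hypothetical naming scheme; the splitting itself is not shown because it depends on the exact output format. It uses a thread pool driving compiler subprocesses rather than a process pool, since each worker just waits on a child process.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def compile_command(source, cc="cc", opt="-O2"):
    """Build the argv that compiles one C file to an object file next to it."""
    source = Path(source)
    return [cc, opt, "-c", str(source), "-o", str(source.with_suffix(".o"))]


def compile_all(sources, jobs=4, cc="cc"):
    """Compile every source file, running up to `jobs` compilers at once."""
    def run(source):
        # check=True raises CalledProcessError if the compiler fails.
        subprocess.run(compile_command(source, cc=cc), check=True)
        return Path(source).with_suffix(".o")

    # Threads are enough here: each worker only waits on a child process.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run, sources))


if __name__ == "__main__":
    # Assumes the wasm2c output was already split into part_*.c files.
    objects = compile_all(sorted(Path(".").glob("part_*.c")))
    print(f"compiled {len(objects)} object files")
```

The resulting object files would still need a final link step, which is omitted here.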

- Alon



Brion Vibber

Jul 9, 2020, 4:25:25 PM
to emscripten Mailing List
That's an extremely neat idea, and dividing into smaller files should help the compilation to run to completion in limited memory scenarios like virtual machines and low-end laptops. I think this would also be required to make use of multiple CPU cores in the C compiler; running a 6-core machine on only one core is going to be much slower than it should be and will result in a lot of wasted time for the user. 

It would also delay other projects' CI runs that install a fresh emsdk, wouldn't it? By delaying work that could be done once at build time to happen thousands of times, once at every installation?

As a developer who uses a variety of computers to work, I personally would appreciate the performance improvements and the energy savings of making one build per release per platform and distributing the appropriate ones for direct use with no additional compilation stage. 

-- brion

Brion Vibber

Jul 9, 2020, 4:36:06 PM
to emscripten Mailing List
What would be needed for emsdk distribution is automated builds made during the release process, requiring no manual intervention.

If I personally must see the release announcement, run a build, and upload it before people can update and install, then we have failed because there will be frequent delays in the best case.

So if I can set up a VM that is pinged by your system when you're creating a new release and no manual tweaking, naming, testing, or command line invocations are necessary, but the build will go straight into where emsdk downloads from and no delays are present between Linux/x86_64 availability and Linux/ARM64 availability, then I'm happy to rent a VM to make it happen. :)

-- brion

Alon Zakai

Jul 9, 2020, 5:00:44 PM
to emscripte...@googlegroups.com
On Thu, Jul 9, 2020 at 1:25 PM Brion Vibber <br...@pobox.com> wrote:
That's an extremely neat idea, and dividing into smaller files should help the compilation to run to completion in limited memory scenarios like virtual machines and low-end laptops. I think this would also be required to make use of multiple CPU cores in the C compiler; running a 6-core machine on only one core is going to be much slower than it should be and will result in a lot of wasted time for the user. 

It would also delay other projects' CI runs that install a fresh emsdk, wouldn't it? By delaying work that could be done once at build time to happen thousands of times, once at every installation?


Yes, this is the main downside of the wasm2c approach. It's not a true replacement for a proper build. (But it can help rare platforms that have no other build.)

Sam Clegg

Jul 9, 2020, 5:18:33 PM
to emscripte...@googlegroups.com
On Thu, Jul 9, 2020 at 1:36 PM Brion Vibber <br...@pobox.com> wrote:
What would be needed for emsdk distribution is automated builds made during the release process, requiring no manual intervention.

If I personally must see the release announcement, run a build, and upload it before people can update and install, then we have failed because there will be frequent delays in the best case.

So if I can set up a VM that is pinged by your system when you're creating a new release and no manual tweaking, naming, testing, or command line invocations are necessary, but the build will go straight into where emsdk downloads from and no delays are present between Linux/x86_64 availability and Linux/ARM64 availability, then I'm happy to rent a VM to make it happen. :)

When releases happen we commit a change to `emscripten-releases-tags.txt` in the emsdk repo, so presumably it should be possible today to simply write a script that polls that repo every hour or so, then builds and uploads the results?

The downside is that the binaries we release are actually built not by the emsdk repo but by a Python script in the waterfall repo: https://github.com/WebAssembly/waterfall. So even though emsdk can build from source, the binaries it expects are not built by that same code. This is kind of a sad state of affairs, but it should still be possible to set up a builder that uses those two repos to do your build and upload.
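A polling builder along these lines could look roughly like the sketch below. It detects a release by hashing `emscripten-releases-tags.txt` and comparing against the hash saved on the previous poll, so it makes no assumption about the file's internal format. The raw URL, the state file name, and the `on_release` hook are assumptions for illustration, and the build and upload steps are left to that hook.

```python
import hashlib
import time
import urllib.request
from pathlib import Path

# Assumed raw URL for the tags file; check the emsdk repo for the real location.
TAGS_URL = ("https://raw.githubusercontent.com/emscripten-core/emsdk/"
            "master/emscripten-releases-tags.txt")


def fetch(url=TAGS_URL):
    """Download the current contents of the tags file."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def changed_since_last_poll(content, state_file):
    """Return True if `content` differs from what the last poll recorded.

    The very first poll (no state file yet) also counts as a change.
    """
    digest = hashlib.sha256(content).hexdigest()
    state_file = Path(state_file)
    previous = state_file.read_text() if state_file.exists() else None
    if digest == previous:
        return False
    state_file.write_text(digest)
    return True


def poll_forever(on_release, state_file="last-tags.sha256", interval=3600):
    """Check once per `interval` seconds and call `on_release` on a change."""
    while True:
        if changed_since_last_poll(fetch(), state_file):
            on_release()  # hypothetical hook: run the build, then upload
        time.sleep(interval)
```

Because any edit to the file triggers the hook, a real builder would probably also check which release was added before starting a multi-hour build.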

We could start by doing one manually if you like? You could upload a linux/arm64 tarball and we can modify emsdk such that it is installable?

cheers,
sam







 

Brion Vibber

Jul 9, 2020, 5:24:41 PM
to emscripten Mailing List
Yeah, polling for new branches will do in a pinch. :)

I'll see what I can get running on my local machine and then we'll see if we can get emsdk to download it...

Shall I move detailed discussion on this over to a GitHub issue or private email chain?

And thanks! I'm very excited, this is gonna be helpful for me for now ;) and hopefully more people down the road!

-- brion

Sam Clegg

Jul 9, 2020, 8:12:51 PM
to emscripte...@googlegroups.com
Yeah, let's create an emsdk issue for this.

Floh

Jul 10, 2020, 8:44:55 AM
to emscripten-discuss
Was cross-compilation mentioned in the thread yet? (I didn't find anything when quickly glancing over the posts.)

Since there was mention of universal binaries in the WWDC presentations, maybe it will actually be fairly simple to cross-compile binaries for ARM Macs on existing x86 build machines?

(it is already trivial for binaries running on iOS devices versus on the iOS simulator, since the simulator is running x86 code, you just give it a different target triple, e.g. "-target arm64-apple-ios13.5" vs "-target x86_64-apple-ios13.5-simulator", so maybe cross-compiling for Macs will be equally simple...)

Cheers,
-Floh.

Brion Vibber

Jul 10, 2020, 8:54:51 AM
to emscripten Mailing List
Cross compilation for macOS should be straightforward once the final developer tools land. It wouldn't be possible to run the ARM build on an Intel Mac for test runs, but I expect build issues are going to be more common than ARM-specific runtime issues. :)

Similarly Linux arm builds could be done with cross compilation, with the same caveat that running is more difficult unless you play games with qemu. 

-- brion
