Performance comparison of WASM and NaCl


Amol Wagh

May 4, 2021, 9:27:50 AM
to emscripten-discuss

Hello All,

We have a Chrome extension written in C++ and JavaScript. We recently changed it to use a WebAssembly module instead of a NaCl module.
However, the WebAssembly build is on average 30% slower than the NaCl build, despite compiling with `-O3`.
The .wasm is built with the Emscripten compiler, v2.0.7. For the .nexe, the C++ code is compiled with the `pnacl-clang++` compiler, and `pnacl-translate` is then used to generate the .nexe file (x86_64 arch).

OS and compiler details are as follows:
NaCl SDK with `Pepper 50`
Emscripten SDK version 2.0.7
Ubuntu 18.04.5 LTS
Chrome version 88.0.4324.182 (64-bit)
Processor : Intel® Core™ i5-5200U CPU @ 2.20GHz × 4

My concern is this:
Right now, Emscripten only supports building 32-bit .wasm, whereas the .nexe is 64-bit. Are the two executables the right candidates for a performance comparison?
There might be differences in several places.
E.g., as mentioned in the WebAssembly design spec, `long double` operations are software-emulated.
Note that I can build a 32-bit .nexe, but I can't use it in 64-bit Chrome. The following error occurs in that case:
NaCl module load failed: ELF file for wrong architecture.

Regards,
Amol

Alon Zakai

May 4, 2021, 5:14:46 PM
to emscripte...@googlegroups.com
Many things could cause such a difference, including that PNaCl uses LLVM on the client (which sometimes beats web VMs), the use of SIMD (which must be enabled explicitly in wasm, but I believe not in PNaCl - you may be getting autovectorizing), presence of exceptions and setjmp (currently much slower in wasm, but wasm will fix that soon), etc.
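To make the SIMD point concrete, here is a minimal sketch of the kind of loop an auto-vectorizer can typically turn into SIMD code. The function name `saxpy` and the build commands in the comments are illustrative assumptions, not taken from the extension; the idea is to build the same source with and without `-msimd128` and compare timings.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel: y[i] = a * x[i] + y[i].
// Assumed builds to compare (the file name is made up):
//   em++ -O3 saxpy.cpp            -> scalar wasm code
//   em++ -O3 -msimd128 saxpy.cpp  -> wasm SIMD enabled, loop may be vectorized
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const float* xp = x.data();
    float* yp = y.data();
    // Simple unit-stride loop with no cross-iteration dependency,
    // which is the shape auto-vectorizers handle best.
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

Whether the compiler actually vectorizes a given loop depends on the code and the flags, so the generated wasm (or a profile) is the thing to check.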

If this difference is important, you can investigate it using profiling traces and inspection of particularly hot functions, etc. If you can share the wasm benchmark we can take a look and see if there is anything obvious.
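As a complement to profiling traces, a suspected hot function can also be timed in isolation and the same source built for both targets. The sketch below is only an illustration: `hot_function` is a hypothetical stand-in (64-bit bit twiddling), and `best_time_us` is a made-up helper name.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for a function that shows up hot in a profile.
std::uint64_t hot_function(std::uint64_t seed, int iterations) {
    std::uint64_t x = seed;
    for (int i = 0; i < iterations; ++i) {
        // xorshift-style mixing on 64-bit integers
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
    }
    return x;
}

// Calls `fn` `runs` times and returns the fastest wall-clock time
// in microseconds. Taking the minimum reduces noise from the scheduler.
template <typename Fn>
double best_time_us(Fn&& fn, int runs) {
    assert(runs > 0);
    double best = -1.0;
    for (int r = 0; r < runs; ++r) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        const double us =
            std::chrono::duration<double, std::micro>(stop - start).count();
        if (best < 0.0 || us < best) best = us;
    }
    return best;
}
```

A typical call would store the result in a `volatile` variable so the optimizer cannot drop the work, for example `volatile std::uint64_t sink; double t = best_time_us([&] { sink = hot_function(1, 10000000); }, 5);`.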


Floh

May 6, 2021, 10:12:51 AM
to emscripten-discuss
>  Right now, Emscripten only supports building 32 bit .wasm. Whereas, .nexe is 64 bit.

Note that the 32 in WASM32 is only about the pointer size. It has "proper" 64-bit integers; only pointers (and thus the maximum heap size) are 32 bits. Theoretically this should even yield better performance than 64-bit pointers.

"long double" seems to be a 128-bit floating-point type, and I don't think it has much relevance. Regular doubles (64 bits wide) are supported natively by WASM.

A 30% performance difference doesn't sound all that surprising TBH and might result from low-level code generation differences between the WASM and PNaCl LLVM backends (for instance differences in register allocation, or calling conventions).
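The pointer-size and `long double` points can be checked directly by printing type sizes from the same source built for each target. This is a minimal sketch; `TypeSizes` and the function names are made up for illustration. On a wasm32 build the pointer size would be expected to be 4 bytes while the 64-bit integer stays at 8; the `long double` size is platform-dependent, so it is only printed, not assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Sizes (in bytes) of the types discussed in this thread.
struct TypeSizes {
    std::size_t pointer_bytes;
    std::size_t int64_bytes;
    std::size_t double_bytes;
    std::size_t long_double_bytes;
};

// std::int64_t is exactly 64 bits wide wherever it exists.
static_assert(sizeof(std::int64_t) * 8 == 64, "std::int64_t must be 64 bits");

TypeSizes query_type_sizes() {
    // Assumption: double is an 8-byte IEEE-754 type on the targets compared.
    assert(sizeof(double) == 8);
    return {sizeof(void*), sizeof(std::int64_t), sizeof(double),
            sizeof(long double)};
}

// True when the build uses 32-bit pointers, as a wasm32 build does.
bool has_32bit_pointers(const TypeSizes& s) {
    return s.pointer_bytes == 4;
}

void print_type_sizes() {
    const TypeSizes s = query_type_sizes();
    std::printf("pointer:     %zu bytes\n", s.pointer_bytes);
    std::printf("int64_t:     %zu bytes\n", s.int64_bytes);
    std::printf("double:      %zu bytes\n", s.double_bytes);
    std::printf("long double: %zu bytes\n", s.long_double_bytes);
}
```

Calling `print_type_sizes()` from `main` in both the wasm and the NaCl build shows which types actually differ between the two executables.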

Floh

May 6, 2021, 10:20:29 AM
to emscripten-discuss
PS: FWIW the difference you're seeing is quite similar to the difference between the native and wasm versions of my 8-bit home computer emulators. For instance, on my mid-2014 13" MBP, the C64 emulator's (https://floooh.github.io/tiny8bit/c64-ui.html) frame duration is between 4 and 5ms in the native version, and between 6 and 7ms in the WASM version (in Chrome). The emulator code is fairly straightforward single-threaded portable C code with heavy bit twiddling on 64-bit integers.

Amol Wagh

May 6, 2021, 4:19:17 PM
to emscripten-discuss
Hello Alon and Floh,

Thank you so much for the reply.

Just to give a brief introduction to the extension: it allows users to view and edit Office documents. We use 3 different wasm modules (Word, Sheet and Point), of which only one makes use of exceptions.
Yes, we are investigating the performance differences using profiling traces. The findings so far are as follows.
  1. In wasm, file I/O is slower if it is done on another pthread instead of the main application thread. This issue on Emscripten also talks about it. After moving the file I/O logic to the `main application thread` and separating the `main application thread` from the `main browser thread`, performance of the wasm build improved. However, it is still 30% slower than NaCl.
  2. As you mentioned, exception handling is slower in wasm than in NaCl. Functions like `get` and `invoke_*` (from Emscripten's glue code, the .js file) were found to be time-consuming. After disabling exceptions, the performance of wasm improves and almost matches that of NaCl. However, this would make the extension unstable, so I can't consider it a solution.
    I haven't tried `-fwasm-exceptions` yet. I will try it too and keep you posted.
  3. The extension uses the `pixman` library, which makes use of SSE instructions. So far I haven't used the flags `-msimd128` and `-msse`. I will try them and let you know.
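For finding 2, one way to measure how much the exception machinery itself costs is to compare the same loop written with `try`/`catch` and with return-value error reporting. The sketch below is an assumption-laden illustration, not code from the extension: all names are hypothetical, and swapping exceptions for error codes is an alternative technique shown only for comparison. Timing both variants under the default exception mode and under `-fwasm-exceptions` would show where the overhead comes from.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

struct Totals {
    long long sum = 0;  // sum of successfully processed values
    int failures = 0;   // number of rejected values
};

// Hypothetical processing step that reports bad input by throwing.
int process_throwing(int value) {
    if (value < 0) throw std::invalid_argument("negative value");
    return value * 2;
}

// Same step, reporting bad input through the return value instead.
bool process_checked(int value, int& out) noexcept {
    if (value < 0) return false;
    out = value * 2;
    return true;
}

// Variant A: every call sits inside a try block.
Totals run_throwing(const std::vector<int>& input) {
    Totals t;
    for (int v : input) {
        try {
            t.sum += process_throwing(v);
        } catch (const std::invalid_argument&) {
            ++t.failures;
        }
    }
    return t;
}

// Variant B: no exceptions on the hot path.
Totals run_checked(const std::vector<int>& input) {
    Totals t;
    for (int v : input) {
        int out = 0;
        if (process_checked(v, out)) {
            t.sum += out;
        } else {
            ++t.failures;
        }
    }
    return t;
}

// Sanity check that both variants compute the same result.
void check_equivalent(const std::vector<int>& input) {
    const Totals a = run_throwing(input);
    const Totals b = run_checked(input);
    assert(a.sum == b.sum && a.failures == b.failures);
}
```

If variant A is much slower than variant B even when nothing is thrown, the cost is in entering `try` regions rather than in the throws themselves, which would be consistent with the `invoke_*` glue functions showing up in the profile.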

Regards,
Amol