Problem with SIMD emulation (LLVM ERROR: Unsupported integer vector type with numElems: 2, primitiveSize: 64!)


Flix

Apr 17, 2019, 11:27:43 AM
to emscripten-discuss
Hi everybody,
I'm using emscripten version 1.38.30.

I've noticed that the SIMD headers are missing from that distribution (and besides that, WebAssembly SIMD is probably not working with emscripten).

I wanted to compile the demo in https://github.com/rasmusbarr/nudge using emscripten, and since it requires SIMD, I'm using this library: https://github.com/nemequ/simde to emulate it (you can read more about the steps I took at the end of this thread: https://github.com/nemequ/simde/issues/37).

Result: using desktop gcc, the program compiles and runs correctly in SIMD-emulated mode, but emscripten still outputs:
LLVM ERROR: Unsupported integer vector type with numElems: 2, primitiveSize: 64!

Any idea why this happens?
Thank you in advance.


Thomas Lively

Apr 17, 2019, 4:05:20 PM
to emscripte...@googlegroups.com
Yes, the SSE headers were removed from emscripten because they could not actually increase performance, so it was nearly always a bad idea to be using them. No JS engine in use today supports SIMD.js natively and polyfilling SIMD code is slower than simply recompiling an application to not use SIMD in the first place. In other words, using a SIMD-optimized library with emscripten will actually be slower than using a library that is not SIMD-optimized (or at least a non-SIMD build of the same library).

WebAssembly SIMD will of course fix this so that using SIMD can have performance benefits. Emscripten actually does support WebAssembly SIMD when using the LLVM backend, but it is very bleeding edge and there are still a few bugs. And because the WebAssembly SIMD proposal is still in an early stage, the instructions it contains are subject to change.

As for the error you are seeing, it is telling you that Fastcomp does not support 2 x i64 vectors. Unfortunately Fastcomp does not know how to lower unsupported vector types into supported vector types, although the LLVM backend does. Until SIMD stabilizes in the LLVM backend and ships on JS engines, the best way around this error is going to be to remove all uses of 2 x i64 vectors (and 2 x f64 vectors) from your code.
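As a rough illustration of what removing a 2 x i64 vector can look like, the sketch below replaces a two-lane 64-bit vector with a plain struct and scalar lane-wise code. The names `Vec2i64` and `add_lanes` are hypothetical and do not come from nudge, simde, or emscripten, and the sketch assumes the compiler's auto-vectorizer does not reintroduce vector types on its own.

```cpp
#include <cstdint>

// Hypothetical stand-in for a 2 x i64 vector: two independent scalar lanes.
struct Vec2i64 {
    std::int64_t lane[2];
};

// Lane-wise addition written as ordinary scalar code, so the source itself
// never names a 128-bit vector type with 64-bit elements.
inline Vec2i64 add_lanes(Vec2i64 a, Vec2i64 b) {
    return Vec2i64{{a.lane[0] + b.lane[0], a.lane[1] + b.lane[1]}};
}
```

For example, `add_lanes(Vec2i64{{1, 2}}, Vec2i64{{10, 20}})` yields lanes 11 and 22. The same pattern would apply to 2 x f64 vectors with `double` lanes.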

--
You received this message because you are subscribed to the Google Groups "emscripten-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to emscripten-disc...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Flix

Apr 18, 2019, 6:08:23 AM
to emscripten-discuss

> WebAssembly SIMD will of course fix this so that using SIMD can have performance benefits. Emscripten actually does support WebAssembly SIMD when using the LLVM backend, but it is very bleeding edge and there are still a few bugs. And because the WebAssembly SIMD proposal is still in an early stage, the instructions it contains are subject to change.
>
> As for the error you are seeing, it is telling you that Fastcomp does not support 2 x i64 vectors. Unfortunately Fastcomp does not know how to lower unsupported vector types into supported vector types, although the LLVM backend does. Until SIMD stabilizes in the LLVM backend and ships on JS engines, the best way around this error is going to be to remove all uses of 2 x i64 vectors (and 2 x f64 vectors) from your code.

OK. Thanks for your explanation.

Flix

Aug 3, 2019, 7:06:23 AM
to emscripten-discuss
Hey! I've just discovered that by using the LLVM backend (latest-upstream, currently 1.38.40-upstream), SIMD emulation works out of the box (well, I just had to get rid of --closure 1)!

So Thomas was right when he said:
> As for the error you are seeing, it is telling you that Fastcomp does not support 2 x i64 vectors. Unfortunately Fastcomp does not know how to lower unsupported vector types into supported vector types, although the LLVM backend does.

I'm very happy that the time I spent making nudge work with simde (see https://github.com/nemequ/simde/issues/37) was not wasted and that I can now use nudge in the browser (... at least with SIMD emulated by simde)!


As for the proper SIMD version, I've seen that the SIMD headers have come back into emscripten, but I'm still experiencing compilation errors:
> em++ -O2 -msse2 -fno-rtti -fno-exceptions [...other stuff here...]
clang-10: warning: argument unused during compilation: '-msse2' [-Wunused-command-line-argument]
In file included from main_no_ffp_with_shadows.cpp:53:
In file included from /.../emsdk/upstream/lib/clang/10.0.0/include/immintrin.h:14:
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:33:5: error: use of undeclared identifier '__builtin_ia32_emms'; did you mean '__builtin_isless'?
    __builtin_ia32_emms();
    ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:33:5: note: '__builtin_isless' declared here
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:33:25: error: too few arguments to function call, expected 2, have 0
    __builtin_ia32_emms();
                        ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:50:19: error: use of undeclared identifier '__builtin_ia32_vec_init_v2si'
    return (__m64)__builtin_ia32_vec_init_v2si(__i, 0);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:67:12: error: use of undeclared identifier '__builtin_ia32_vec_ext_v2si'
    return __builtin_ia32_vec_ext_v2si((__v2si)__m, 0);
           ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:129:19: error: use of undeclared identifier '__builtin_ia32_packsswb'
    return (__m64)__builtin_ia32_packsswb((__v4hi)__m1, (__v4hi)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:159:19: error: use of undeclared identifier '__builtin_ia32_packssdw'
    return (__m64)__builtin_ia32_packssdw((__v2si)__m1, (__v2si)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:189:19: error: use of undeclared identifier '__builtin_ia32_packuswb'
    return (__m64)__builtin_ia32_packuswb((__v4hi)__m1, (__v4hi)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:216:19: error: use of undeclared identifier '__builtin_ia32_punpckhbw'
    return (__m64)__builtin_ia32_punpckhbw((__v8qi)__m1, (__v8qi)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:239:19: error: use of undeclared identifier '__builtin_ia32_punpckhwd'
    return (__m64)__builtin_ia32_punpckhwd((__v4hi)__m1, (__v4hi)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:260:19: error: use of undeclared identifier '__builtin_ia32_punpckhdq'
    return (__m64)__builtin_ia32_punpckhdq((__v2si)__m1, (__v2si)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:287:19: error: use of undeclared identifier '__builtin_ia32_punpcklbw'
    return (__m64)__builtin_ia32_punpcklbw((__v8qi)__m1, (__v8qi)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:310:19: error: use of undeclared identifier '__builtin_ia32_punpcklwd'
    return (__m64)__builtin_ia32_punpcklwd((__v4hi)__m1, (__v4hi)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:331:19: error: use of undeclared identifier '__builtin_ia32_punpckldq'
    return (__m64)__builtin_ia32_punpckldq((__v2si)__m1, (__v2si)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:352:19: error: use of undeclared identifier '__builtin_ia32_paddb'
    return (__m64)__builtin_ia32_paddb((__v8qi)__m1, (__v8qi)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:373:19: error: use of undeclared identifier '__builtin_ia32_paddw'
    return (__m64)__builtin_ia32_paddw((__v4hi)__m1, (__v4hi)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:394:19: error: use of undeclared identifier '__builtin_ia32_paddd'
    return (__m64)__builtin_ia32_paddd((__v2si)__m1, (__v2si)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:416:19: error: use of undeclared identifier '__builtin_ia32_paddsb'
    return (__m64)__builtin_ia32_paddsb((__v8qi)__m1, (__v8qi)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:439:19: error: use of undeclared identifier '__builtin_ia32_paddsw'
    return (__m64)__builtin_ia32_paddsw((__v4hi)__m1, (__v4hi)__m2);
                  ^
/.../emsdk/upstream/lib/clang/10.0.0/include/mmintrin.h:461:19: error: use of undeclared identifier '__builtin_ia32_paddusb'
    return (__m64)__builtin_ia32_paddusb((__v8qi)__m1, (__v8qi)__m2);
                  ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.

I'm not sure why clang-10 gives me '-msse2' [-Wunused-command-line-argument], but I suppose that the SIMD stuff is still WIP in emscripten (and maybe still unsupported by browsers in general), so I'm just happy with what I got now!

Thanks again for your help.

Flix

Aug 3, 2019, 11:02:09 AM
to emscripten-discuss

I've just added all the related stuff to my nudge fork here https://github.com/Flix01/nudge (in case someone is interested).

Thomas Lively

Aug 3, 2019, 4:03:25 PM
to emscripte...@googlegroups.com
Hi, unfortunately those headers you are using have nothing to do with WebAssembly and will not enable you to compile code using x86 intrinsics and targeting WebAssembly. I've filed an issue for this here: https://github.com/emscripten-core/emsdk/issues/309. Clang is giving you a warning about `-msse2` because that flag only works for x86 targets; SSE2 is an x86 feature, not a WebAssembly feature. Notice that you are also getting a large number of warnings about unrecognized builtin functions like `__builtin_ia32_emms`. These builtin functions are used by the mmintrin.h header you included but only exist when targeting x86, not WebAssembly.

You can read more about using WebAssembly SIMD intrinsics here: https://emscripten.org/docs/porting/simd.html#porting-simd-code-targeting-webassembly. Please also keep in mind that using emulated SIMD is slower than not using SIMD at all.


Flix

Aug 4, 2019, 11:39:46 AM
to emscripten-discuss

Thanks again Thomas for your precious feedback!

> Please also keep in mind that using emulated SIMD is slower than not using SIMD at all.

Yes, I know, but the Nudge physics library I use (https://github.com/rasmusbarr/nudge) requires SIMD.
I remember that last year, using an old version of emscripten, I managed to compile it (with SIMD+asm.js+polyfill), but it was extremely slow when running inside the browser.
Now, using emscripten + simde (https://github.com/nemequ/simde), it's much faster.
 
And, as far as the (non-emulated) SIMD version is concerned:

> Hi, unfortunately those headers you are using have nothing to do with WebAssembly and will not enable you to compile code using x86 intrinsics and targeting WebAssembly. I've filed an issue for this here: https://github.com/emscripten-core/emsdk/issues/309. Clang is giving you a warning about `-msse2` because that flag only works for x86 targets; SSE2 is an x86 feature, not a WebAssembly feature. Notice that you are also getting a large number of warnings about unrecognized builtin functions like `__builtin_ia32_emms`. These builtin functions are used by the mmintrin.h header you included but only exist when targeting x86, not WebAssembly.
>
> You can read more about using WebAssembly SIMD intrinsics here: https://emscripten.org/docs/porting/simd.html#porting-simd-code-targeting-webassembly.

Thank you. I really needed this!

However, it seems that https://github.com/emscripten-core/emscripten/blob/incoming/system/include/wasm_simd128.h can't be used to convert existing SIMD code to WebAssembly SIMD.
For example, if I replace -msse2 with -msimd128 and #include <immintrin.h> with #include <wasm_simd128.h>, I get:
../nudge.cpp:141:9: error: unknown type name '__m128'
typedef __m128 simd4_float;
        ^
../nudge.cpp:142:9: error: unknown type name '__m128i'
typedef __m128i simd4_int32;
        ^
../nudge.cpp:145:20: error: unknown type name '__m128'
        NUDGE_FORCEINLINE __m128 unpacklo32(__m128 x, __m128 y) {
                          ^
../nudge.cpp:145:38: error: unknown type name '__m128'
        NUDGE_FORCEINLINE __m128 unpacklo32(__m128 x, __m128 y) {
                                            ^
../nudge.cpp:145:48: error: unknown type name '__m128'
        NUDGE_FORCEINLINE __m128 unpacklo32(__m128 x, __m128 y) {
                                                      ^
../nudge.cpp:149:20: error: unknown type name '__m128'
        NUDGE_FORCEINLINE __m128 unpackhi32(__m128 x, __m128 y) {
                          ^
../nudge.cpp:149:38: error: unknown type name '__m128'
        NUDGE_FORCEINLINE __m128 unpackhi32(__m128 x, __m128 y) {
                                            ^
../nudge.cpp:149:48: error: unknown type name '__m128'
        NUDGE_FORCEINLINE __m128 unpackhi32(__m128 x, __m128 y) {
                                                      ^
../nudge.cpp:153:20: error: unknown type name '__m128i'
        NUDGE_FORCEINLINE __m128i unpacklo32(__m128i x, __m128i y) {
                          ^
../nudge.cpp:153:39: error: unknown type name '__m128i'
        NUDGE_FORCEINLINE __m128i unpacklo32(__m128i x, __m128i y) {
                                             ^
../nudge.cpp:153:50: error: unknown type name '__m128i'
        NUDGE_FORCEINLINE __m128i unpacklo32(__m128i x, __m128i y) {
                                                        ^
../nudge.cpp:157:20: error: unknown type name '__m128i'
        NUDGE_FORCEINLINE __m128i unpackhi32(__m128i x, __m128i y) {
                          ^
../nudge.cpp:157:39: error: unknown type name '__m128i'
        NUDGE_FORCEINLINE __m128i unpackhi32(__m128i x, __m128i y) {
                                             ^
../nudge.cpp:157:50: error: unknown type name '__m128i'
        NUDGE_FORCEINLINE __m128i unpackhi32(__m128i x, __m128i y) {
                                                        ^
../nudge.cpp:162:20: error: unknown type name '__m128'
        NUDGE_FORCEINLINE __m128 concat2x32(__m128 x, __m128 y) {
                          ^
../nudge.cpp:162:38: error: unknown type name '__m128'
        NUDGE_FORCEINLINE __m128 concat2x32(__m128 x, __m128 y) {
                                            ^
../nudge.cpp:162:48: error: unknown type name '__m128'
        NUDGE_FORCEINLINE __m128 concat2x32(__m128 x, __m128 y) {
                                                      ^
../nudge.cpp:163:31: error: use of undeclared identifier '_MM_SHUFFLE'
                return _mm_shuffle_ps(x, y, _MM_SHUFFLE(y1, y0, x1, x0));
                                            ^
../nudge.cpp:167:20: error: unknown type name '__m128'
        NUDGE_FORCEINLINE __m128 shuffle32(__m128 x) {
                          ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.

because the type __m128 is not detected.
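The errors above are consistent with wasm_simd128.h exposing its own vector type, `v128_t`, rather than the SSE types `__m128` and `__m128i`. One possible way to bridge the gap is a thin per-target alias layer, sketched below. `simd4_float` is the typedef name from nudge.cpp; the helpers `simd4_load`, `simd4_store`, and `simd4_add` are hypothetical names introduced only for this illustration, and the sketch covers a single operation rather than the full set nudge needs.

```cpp
#if defined(__wasm_simd128__)
  // Building with -msimd128: use the WebAssembly SIMD intrinsics.
  #include <wasm_simd128.h>
  typedef v128_t simd4_float;
  inline simd4_float simd4_load(const float* p) { return wasm_v128_load(p); }
  inline void simd4_store(float* p, simd4_float v) { wasm_v128_store(p, v); }
  inline simd4_float simd4_add(simd4_float a, simd4_float b) { return wasm_f32x4_add(a, b); }
#elif defined(__SSE2__)
  // Native x86 build: keep the original SSE types and intrinsics.
  #include <emmintrin.h>
  typedef __m128 simd4_float;
  inline simd4_float simd4_load(const float* p) { return _mm_loadu_ps(p); }
  inline void simd4_store(float* p, simd4_float v) { _mm_storeu_ps(p, v); }
  inline simd4_float simd4_add(simd4_float a, simd4_float b) { return _mm_add_ps(a, b); }
#else
  // No SIMD target available: plain scalar fallback with four float lanes.
  struct simd4_float { float lane[4]; };
  inline simd4_float simd4_load(const float* p) {
      return simd4_float{{p[0], p[1], p[2], p[3]}};
  }
  inline void simd4_store(float* p, simd4_float v) {
      for (int i = 0; i < 4; ++i) p[i] = v.lane[i];
  }
  inline simd4_float simd4_add(simd4_float a, simd4_float b) {
      simd4_float r;
      for (int i = 0; i < 4; ++i) r.lane[i] = a.lane[i] + b.lane[i];
      return r;
  }
#endif
```

Code written against the helper names compiles unchanged on all three paths, but every SSE intrinsic used by the library would need a counterpart in each branch, which is the porting effort in question.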

It would be too much work for me to do a full WebAssembly SIMD conversion for now, so I'm happy with the emulated version.

Thank you again!

 

Jukka Jylänki

Aug 12, 2019, 7:33:18 AM
to emscripte...@googlegroups.com
On Sun, Aug 4, 2019 at 6:39 PM Flix (filip...@gmail.com) wrote:

>> Please also keep in mind that using emulated SIMD is slower than not using SIMD at all.

This is not always true. It depends on what kind of emulation we are
talking about. We can emulate SSE on top of scalar C code, or SSE on
top of WebAssembly SIMD. Emulating SSE on top of scalar C code can be
as fast as hand-writing the scalar code itself, depending on
what the original SSE code looked like. (Of course, in many cases there
are extra swizzles that would be redundant, but depending on the
problem, the code can end up looking nearly identical.)
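To make "SSE on top of scalar C code" concrete, here is a minimal sketch of two unpack helpers written as scalar lane shuffles. It assumes they follow the lane order of `_mm_unpacklo_ps` and `_mm_unpackhi_ps`; the type `ScalarFloat4` and the function names are hypothetical, and this is not simde's actual implementation.

```cpp
// Hypothetical scalar stand-in for a 4 x f32 SSE register.
struct ScalarFloat4 {
    float v[4];
};

// Interleave the low halves, same lane order as _mm_unpacklo_ps:
// result = {x0, y0, x1, y1}.
inline ScalarFloat4 scalar_unpacklo32(ScalarFloat4 x, ScalarFloat4 y) {
    return ScalarFloat4{{x.v[0], y.v[0], x.v[1], y.v[1]}};
}

// Interleave the high halves, same lane order as _mm_unpackhi_ps:
// result = {x2, y2, x3, y3}.
inline ScalarFloat4 scalar_unpackhi32(ScalarFloat4 x, ScalarFloat4 y) {
    return ScalarFloat4{{x.v[2], y.v[2], x.v[3], y.v[3]}};
}
```

After inlining, helpers like these are only lane copies, which is why hand-written scalar code for the same algorithm can end up looking much the same. The redundant swizzles mentioned above appear when such shuffles exist only to suit the SIMD layout.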

Emulating SSE on top of Wasm SIMD can be faster than not having used
SIMD at all.

> I remember that last year, using an old version of emscripten, I managed to compile it (with SIMD+asm.js+polyfill), but it was extremely slow when running inside the browser.
> Now, using emscripten + simde (https://github.com/nemequ/simde), it's much faster.

The reason it was extremely slow is that no browser ended up
shipping simd.js support, and the polyfill was implemented in
hand-written JavaScript. If you had run it in Firefox Nightly, which did
implement simd.js on top of hardware SIMD, you would have gotten the same
result as you are now getting with simde: both emulate SSE on top of
SIMD.js/Wasm SIMD. That is, Emscripten with SSE+asm.js+simd.js used to
map directly onto hardware, but the SSE backing support was dropped
when Wasm SIMD support was added.

I am really glad to find that the simde project exists; it will
patch up the missing support in Emscripten. If the license is
appropriate, I think it would be great to bundle simde directly into
Emscripten so that -msse and -msse2 etc. work out of the box.

> However it seems that https://github.com/emscripten-core/emscripten/blob/incoming/system/include/wasm_simd128.h can't be used to convert existing SIMD code to WebAssembly SIMD.

> It would be too much work for me to make a full WebAssembly-SIMD conversion for now, so I'm glad with the emulated version.

This has been a long-recurring point of conversation that has not yet
been resolved (long historical thread at
https://github.com/tc39/ecmascript_simd/issues/59). It makes no sense
to always have to rewrite SSE code to Wasm SIMD just to have the
browser undo that Wasm SIMD back to SSE on the fly when running on
x86 hardware. Developers want to be able to write page_sse.wasm and
page_neon.wasm files and feature-check to run the appropriate one
depending on the target (and they also want to be able to write a
page_simd.wasm that simultaneously targets both SSE and NEON). But
there is some hope of introducing proper SSE* instruction sets in a
future revision such as WebAssembly SIMD v2 or similar.

Steven Johnson

Aug 12, 2019, 1:37:11 PM
to emscripte...@googlegroups.com
Err, maybe I'm missing something, but if developers writing for wasm have to think about whether the target has SSEx (specifically) or Neon (specifically), we're doing things terribly wrong. The only thing they should need to check is whether the *wasm* engine supports wasm-simd128, and target that; if developers need to detect underlying implementation details, that should be a huuuuuge red flag. (I'd argue that even being *able* to detect the underlying SIMD architecture from wasm-land is a bad idea, lest people try to game their current compiler to squeak out marginal performance improvements based on that.)