I’m pleased to announce the availability of the latest release of SIMD Everywhere (SIMDe), version 0.7.6, representing more than two years of work by over 30 developers since version 0.7.2. (I also released 0.7.4 several weeks ago, but it needed a few more fixes; thanks go to the early adopters who helped me out.)
SIMDe is a permissively-licensed (MIT) header-only library which provides fast, portable implementations of SIMD intrinsics for platforms which aren’t natively supported by the API in question.
For example, with SIMDe you can use SSE, SSE2, SSE3, SSE4.1 and 4.2, AVX, AVX2, and many AVX-512 intrinsics on ARM, POWER, WebAssembly, or almost any platform with a C compiler. That includes, of course, x86 CPUs which don’t support the ISA extension in question (e.g., calling AVX-512F functions on a CPU which doesn’t natively support them).
If the target natively supports the SIMD extension in question there is no performance penalty for using SIMDe. Otherwise, accelerated implementations, such as NEON on ARM, AltiVec on POWER, WASM SIMD on WebAssembly, etc., are used when available to provide good performance.
SIMDe has already been used to port several packages to additional architectures through either upstream support or distribution packages, particularly on Debian.
What’s new in 0.7.4 / 0.7.6As always, we have an extensive test suite to verify our implementations.
For a complete list of changes, check out the 0.7.4 and 0.7.6 release notes.
Below are some additional highlights:
X86There are a total of 7470 SIMD functions on x86, 2971 (39.77%) of which have been implemented in SIMDe so far. Specifically for AVX-512, of the 5270 functions currently in AVX-512, SIMDe implements 1439 (27.31%)
Completely supported functions families Newly added function familiesSIMDe currently implements 56.46% of the ARM NEON functions (3766 out of 6670). If you don’t count 16-bit floats and poly types, it’s 75.95% (3766 / 4969).
Newly added familiesOverall, SIMDe implementents 40 of 533 (7.50%) functions from MSA.
What is coming nextWork on SIMDe is proceeding rapidly, but there are a lot of functions to implement… x86 alone has about 8,000 SIMD functions, and we’ve implemented about 3,000 of them. We will keep adding more functions and improving the implementations we already have.
If you’re interested in using SIMDe but need some specific functions to be implemented first, please file an issue and we may be able to prioritize those functions.
Getting InvolvedIf you’re interested in helping out please get in touch. We have a chat room on Matrix/Element which is fairly active if you have questions, or of course you can just dive right in on the issue tracker.