Great news. I have made a new vector class library for x86 and
x86-64 that makes it easier to use the vector instruction sets
from SSE2 to AVX and AVX2. It's a C++ library that defines a lot
of vector classes, functions and operators. Adding two vectors is
as simple as writing a + sign instead of using assembly code or
intrinsic functions. This is useful where the compiler doesn't
vectorize your code automatically. The resulting code has no extra
overhead when compiled with an optimizing compiler.
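
To give an idea of what "writing a + sign instead of intrinsic functions" means, here is a minimal sketch. It is not the library's code. The names Float4 and mul_add are made up for this illustration, and the library's real class names and interface are described in its manual. The sketch assumes an x86 or x86-64 compiler with SSE2 enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <emmintrin.h>  // SSE2 intrinsics (also provides the SSE float intrinsics)

// Hypothetical wrapper around one 128-bit register holding 4 floats.
// Illustration only; this is not a class from the library.
class Float4 {
    __m128 v;
public:
    Float4() : v(_mm_setzero_ps()) {}
    Float4(__m128 x) : v(x) {}                        // wrap a raw register
    operator __m128() const { return v; }             // unwrap for intrinsics
    Float4& load(const float* p) { v = _mm_loadu_ps(p); return *this; }  // unaligned load
    void store(float* p) const { _mm_storeu_ps(p, v); }                  // unaligned store
};

// Each operator forwards to a single intrinsic, so an optimizing
// compiler can inline it down to one instruction.
inline Float4 operator+(Float4 a, Float4 b) { return _mm_add_ps(a, b); }
inline Float4 operator*(Float4 a, Float4 b) { return _mm_mul_ps(a, b); }

// out[i] = a[i] * b[i] + c[i], four elements per iteration.
// Assumption for brevity: n is a multiple of 4.
inline void mul_add(const float* a, const float* b, const float* c,
                    float* out, std::size_t n) {
    assert(n % 4 == 0);
    for (std::size_t i = 0; i < n; i += 4) {
        Float4 va, vb, vc;
        va.load(a + i);
        vb.load(b + i);
        vc.load(c + i);
        (va * vb + vc).store(out + i);  // reads like scalar code
    }
}
```

The loop body reads like ordinary arithmetic, while each operator is only a thin inline wrapper around one intrinsic.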
This library has many more features than Intel's vector classes:
Features
- vectors of 8, 16, 32 and 64-bit integers, signed and unsigned
- vectors of single and double precision floating point numbers
- total vector size 128 or 256 bits
- defines almost all common operators
- boolean operations and branches on vector elements
- defines many arithmetic functions
- permute, blend and table-lookup functions
- fast integer division
- many mathematical functions (requires external library)
- can build code for different instruction sets from the same
source code
- CPU dispatching to utilize higher instruction sets when
available
- uses metaprogramming (including preprocessing directives and
templates) to find the best implementation for the selected
instruction set and parameter values of a given operator or
function
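
The point about building for different instruction sets from the same source can be sketched roughly as follows. This is only an assumption about the general technique, not the library's actual implementation. The names Float8 and add_arrays are hypothetical. The sketch relies on the compiler defining the __AVX__ macro when AVX code generation is enabled (for example with -mavx on GCC and Clang).

```cpp
#include <cassert>
#include <cstddef>
#include <immintrin.h>  // SSE and AVX intrinsics

// Hypothetical 8-float vector. The same interface is backed by one
// 256-bit register when AVX is enabled, or by two 128-bit registers
// otherwise. Illustration only; this is not a class from the library.
class Float8 {
#if defined(__AVX__)
    __m256 v;
public:
    Float8& load(const float* p) { v = _mm256_loadu_ps(p); return *this; }
    void store(float* p) const { _mm256_storeu_ps(p, v); }
    friend Float8 operator+(Float8 a, Float8 b) {
        Float8 r;
        r.v = _mm256_add_ps(a.v, b.v);   // one 256-bit add
        return r;
    }
#else
    __m128 lo, hi;
public:
    Float8& load(const float* p) {
        lo = _mm_loadu_ps(p);
        hi = _mm_loadu_ps(p + 4);
        return *this;
    }
    void store(float* p) const {
        _mm_storeu_ps(p, lo);
        _mm_storeu_ps(p + 4, hi);
    }
    friend Float8 operator+(Float8 a, Float8 b) {
        Float8 r;
        r.lo = _mm_add_ps(a.lo, b.lo);   // two 128-bit adds
        r.hi = _mm_add_ps(a.hi, b.hi);
        return r;
    }
#endif
};

// User code is identical for both builds.
// Assumption for brevity: n is a multiple of 8.
inline void add_arrays(const float* a, const float* b,
                       float* out, std::size_t n) {
    assert(n % 8 == 0);
    for (std::size_t i = 0; i < n; i += 8) {
        Float8 va, vb;
        va.load(a + i);
        vb.load(b + i);
        (va + vb).store(out + i);
    }
}
```

Compiling the same file with and without AVX enabled gives two versions of add_arrays that produce the same results. Selecting between such versions at run time is what the CPU dispatching item refers to, and that part is not shown here.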
Take a look at
www.agner.org/optimize/#vectorclass
and have fun!