NSIMD is a vectorization library. More specifically, it is a wrapper around SIMD instruction sets such as Intel SSE, AVX, AVX-512, ARM NEON, SVE…
All major compilers can make use of SIMD instruction sets on their own; this is known as autovectorization. But compilers are bad at autovectorizing code: it is easy to write really simple pieces of code that no compiler is able to autovectorize. Autovectorization is an optimization step in compilers and depends heavily on the compiler manufacturer, version, … Therefore it is hard to have a predictable execution time for a given piece of code.
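As an illustration (our own example, not taken from the NSIMD benchmarks), consider a plain sum of floats. GCC and Clang at their default floating-point settings generally leave this loop scalar, because a SIMD reduction adds the elements in a different order and floating-point addition is not associative. The sketch below writes both orderings by hand; the function names are hypothetical.

```cpp
#include <cstddef>

// Sequential sum: additions happen strictly from left to right.
float sum_sequential(const float* x, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += x[i];
    return s;
}

// The same sum in the order a 4-lane SIMD reduction would use:
// four independent partial sums, combined at the end.
float sum_four_lanes(const float* x, std::size_t n) {
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        for (int k = 0; k < 4; ++k) lane[k] += x[i + k];
    float s = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    for (; i < n; ++i) s += x[i];  // leftover elements
    return s;
}
```

With the input {1e8f, 1, 1, 1, -1e8f, 1, 1, 1}, and assuming float arithmetic is evaluated in single precision (as on x86-64 or AArch64), the sequential version returns 3 while the 4-lane version returns 6. A compiler that must preserve the sequential result cannot vectorize the loop unless a relaxed floating-point mode is requested.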
Writing explicit SIMD code is necessary in order to reach peak performance and to have predictable behavior across compilers and compiler versions. But writing SIMD code by hand is cumbersome. It makes code almost unreadable and non-portable, and therefore really difficult to maintain.
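To give an idea of what hand-written SIMD code looks like, here is a minimal sketch of an element-wise addition of two float arrays using raw intrinsics. The function name `add_f32` is hypothetical; the intrinsics are the standard SSE and NEON ones. Even for such a trivial operation, each instruction set needs its own branch, plus a scalar loop for the leftover elements.

```cpp
#include <cstddef>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // Intel SSE/SSE2 intrinsics
#elif defined(__ARM_NEON)
#include <arm_neon.h>   // ARM NEON intrinsics
#endif

// out[i] = a[i] + b[i] for i in [0, n)
void add_f32(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
#if defined(__SSE2__) || defined(_M_X64)
    // SSE: 4 floats per 128-bit register, unaligned loads and stores.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#elif defined(__ARM_NEON)
    // NEON: same idea, entirely different types and function names.
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vaddq_f32(va, vb));
    }
#endif
    // Scalar tail; also the whole loop when no SIMD branch was compiled in.
    for (; i < n; ++i) out[i] = a[i] + b[i];
}
```

Supporting AVX, AVX-512 or SVE would mean adding yet another branch for each, with different register widths and intrinsic names.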
NSIMD tackles all these problems by offering several portable APIs in C89, C++98, C++11 and beyond that work on many compilers. It guarantees that the developer gets code that is portable, predictable, and easy to write and read.
We run benchmarks on all the hardware we can get our hands on. We produce one PDF for each piece of software we provide and for each hardware platform.
In the benchmarks page you will find a set of PDF documents describing how NSIMD performs against other vectorization libraries on several hardware platforms, including Intel AVX-512, ARM AARCH64 and AMD EPYC.
For NSIMD, we benchmark most of the functions we provide in small loops. A version of each NSIMD function is benchmarked for every supported type: 8-, 16-, 32- and 64-bit integers and 32- and 64-bit floating-point numbers. For each loop, the PDF gives its C++ source code and the corresponding assembly code. Other libraries such as Sleef, MIPP and the standard library are benchmarked as well, and all running times are compared to those of NSIMD.
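The sketch below gives a rough idea of what timing one small loop per element type can look like. It is not the actual NSIMD benchmark harness: `add_loop` and `bench_add` are hypothetical names, and a plain scalar addition stands in for the function under test.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical kernel standing in for one benchmarked function.
template <typename T>
void add_loop(const T* a, const T* b, T* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = static_cast<T>(a[i] + b[i]);
}

// Run the kernel `repeats` times on n elements of type T and return
// the best running time in nanoseconds (-1 if repeats <= 0).
template <typename T>
long long bench_add(std::size_t n, int repeats) {
    std::vector<T> a(n, T(1)), b(n, T(2)), out(n, T(0));
    long long best = -1;
    for (int r = 0; r < repeats; ++r) {
        auto t0 = std::chrono::steady_clock::now();
        add_loop(a.data(), b.data(), out.data(), n);
        auto t1 = std::chrono::steady_clock::now();
        long long ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
        if (best < 0 || ns < best) best = ns;
    }
    // Read one result so the compiler cannot discard the loop entirely.
    if (n > 0) {
        volatile T sink = out[n / 2];
        (void)sink;
    }
    return best;
}
```

Calling `bench_add<std::int8_t>(4096, 100)`, `bench_add<float>(4096, 100)` and so on gives one timing per type, which can then be compared against the same loop written with another library.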