Add intel simd #1703
Another CI flow should be added for compiling with SIMD flags. Not all together, but one with each:
|
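For the per-flag CI builds, something like the following could let each job confirm that its flag actually took effect. This is only a sketch: `simd_level_name` and `simd_level_is` are hypothetical helpers, not part of the library, and it relies on the predefined macros that GCC and Clang set for `-msse2`, `-mavx2` and `-mavx512f`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: names the widest x86 SIMD extension the compiler
 * was told to target, based on GCC/Clang predefined macros. */
static const char *simd_level_name(void)
{
#if defined(__AVX512F__)
    return "avx512f";
#elif defined(__AVX2__)
    return "avx2";
#elif defined(__SSE2__)
    return "sse2";
#else
    return "scalar";
#endif
}

/* Hypothetical helper: returns 1 when the build matches the level
 * that a given CI job expects, 0 otherwise. */
static int simd_level_is(const char *expected)
{
    assert(expected != NULL);
    return strcmp(simd_level_name(), expected) == 0;
}
```

A job built with `-mavx2` could then fail early if `simd_level_is("avx2")` returns 0, instead of silently testing the scalar path.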
ARM has a different SIMD instruction set, called NEON (https://developer.arm.com/architectures/instruction-sets/intrinsics/). |
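To give an idea of how the two instruction sets line up, here is a rough sketch of a 16-byte broadcast written once for NEON and once for SSE2, with a scalar fallback. `broadcast16` is a hypothetical name used only for illustration; the intrinsics themselves (`vdupq_n_u8`, `vst1q_u8`, `_mm_set1_epi8`, `_mm_storeu_si128`) come from `arm_neon.h` and `emmintrin.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#elif defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: writes 16 copies of `byte` into `out`. */
static void broadcast16(uint8_t out[16], uint8_t byte)
{
    assert(out != NULL);
#if defined(__ARM_NEON)
    /* NEON: duplicate the byte into all 16 lanes, then store. */
    vst1q_u8(out, vdupq_n_u8(byte));
#elif defined(__SSE2__)
    /* SSE2: same idea, using an unaligned store. */
    _mm_storeu_si128((__m128i *)(void *)out, _mm_set1_epi8((char)byte));
#else
    /* Scalar fallback when neither extension is enabled. */
    for (int i = 0; i < 16; i++)
        out[i] = byte;
#endif
}
```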
To precompute SIMD constants at startup, the best solution I found was something like this:
#include <immintrin.h>
#ifdef __AVX512F__
static __m512i _512_vec_ones;
static __m512i _512_vec_zeros;
static __m512i _512_vec_equals;
#endif
#ifdef __AVX2__
static __m256i _256_vec_ones;
static __m256i _256_vec_zeros;
#endif
#ifdef __SSE2__
static __m128i _128_vec_ones;
static __m128i _128_vec_zeros;
#endif
CONSTRUCTOR void ff_deserializer_init(void)
{
#ifdef __AVX512F__
_512_vec_ones = _mm512_set1_epi8('1');
_512_vec_zeros = _mm512_set1_epi8('0');
_512_vec_equals = _mm512_set1_epi8('=');
#endif
#ifdef __AVX2__
_256_vec_ones = _mm256_set1_epi8('1');
_256_vec_zeros = _mm256_set1_epi8('0');
#endif
#ifdef __SSE2__
_128_vec_ones = _mm_set1_epi8('1');
_128_vec_zeros = _mm_set1_epi8('0');
#endif
}
where |
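For reference, a self-contained sketch of the same pattern. In C, a call such as `_mm_set1_epi8('1')` is not a constant expression, so it cannot be used directly as a static initializer, which is why the constants are filled in at load time. The definition of `CONSTRUCTOR` below is an assumption (the GCC/Clang `constructor` attribute), and all `demo_*` names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
typedef __m128i demo_vec;
#else
typedef struct { uint8_t b[16]; } demo_vec;   /* scalar stand-in */
#endif

/* Assumption: CONSTRUCTOR maps to the GCC/Clang constructor attribute. */
#if defined(__GNUC__)
#define CONSTRUCTOR __attribute__((constructor))
#else
#define CONSTRUCTOR
#endif

static demo_vec demo_vec_ones;
static int demo_ready;

/* With GCC/Clang this runs before main(). */
CONSTRUCTOR static void demo_init(void)
{
#if defined(__SSE2__)
    demo_vec_ones = _mm_set1_epi8('1');
#else
    memset(&demo_vec_ones, '1', sizeof demo_vec_ones);
#endif
    demo_ready = 1;
}

/* Returns 1 if all 16 lanes hold '1'. Initializes explicitly when
 * no constructor has run (compilers without the attribute). */
static int demo_ones_ok(void)
{
    uint8_t bytes[16];
    if (!demo_ready)
        demo_init();
    memcpy(bytes, &demo_vec_ones, sizeof bytes);
    for (int i = 0; i < 16; i++)
        if (bytes[i] != '1')
            return 0;
    return 1;
}
```

The explicit fallback in `demo_ones_ok` is there only so the sketch still behaves on compilers where the attribute is unavailable.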
I'm constantly getting these warnings. Apparently they're harmless since I always use
warning: cast increases required alignment of target type [-Wcast-align]
653 | _mm256_storeu_si256((__m256i *)r->v, out);
The only fixes I found are:
|
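One commonly used way to quiet `-Wcast-align` on GCC and Clang, which may or may not be among the options considered above, is to route the cast through `void *` so the compiler treats the alignment change as intentional. The sketch below assumes the destination is a plain byte array; `demo_result` and `demo_fill` are hypothetical stand-ins for the struct behind `r->v`. The unaligned store is still required, since the cast does nothing to make the memory aligned.

```c
#include <stdint.h>
#include <string.h>

#if defined(__AVX2__)
#include <immintrin.h>
#elif defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical stand-in for the struct that owns the output buffer. */
typedef struct { uint8_t v[32]; } demo_result;

/* Fills r->v with `byte`, using the widest store that is enabled. */
static void demo_fill(demo_result *r, uint8_t byte)
{
#if defined(__AVX2__)
    __m256i out = _mm256_set1_epi8((char)byte);
    /* Cast via void * so -Wcast-align does not fire. */
    _mm256_storeu_si256((__m256i *)(void *)r->v, out);
#elif defined(__SSE2__)
    __m128i out = _mm_set1_epi8((char)byte);
    _mm_storeu_si128((__m128i *)(void *)r->v, out);
    _mm_storeu_si128((__m128i *)(void *)(r->v + 16), out);
#else
    memset(r->v, byte, sizeof r->v);
#endif
}
```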
This adds SSE2, AVX2 and AVX-512 support to the library in general, wherever the benchmarks show an improvement.
As discussed in #1700.
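To illustrate the general shape of such a change, here is a minimal sketch of a guarded fast path with a scalar fallback. It counts `'1'` characters in a buffer and is not taken from the library: `demo_count_ones` and `demo_popcount16` are hypothetical names, and only the SSE2 tier is shown (AVX2 and AVX-512 tiers would follow the same pattern with wider registers).

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>

/* Portable population count for the 16-bit compare mask. */
static unsigned demo_popcount16(unsigned m)
{
    unsigned c = 0;
    while (m) {
        m &= m - 1;   /* clear the lowest set bit */
        c++;
    }
    return c;
}
#endif

/* Counts how many of the first n bytes of s are the character '1'. */
static size_t demo_count_ones(const char *s, size_t n)
{
    size_t i = 0, count = 0;
#if defined(__SSE2__)
    const __m128i ones = _mm_set1_epi8('1');
    /* Process 16 bytes per iteration with an unaligned load. */
    for (; i + 16 <= n; i += 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)(const void *)(s + i));
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, ones));
        count += demo_popcount16((unsigned)mask);
    }
#endif
    /* Scalar tail, and the whole input when SSE2 is disabled. */
    for (; i < n; i++)
        count += (s[i] == '1');
    return count;
}
```

Keeping the scalar loop as the tail handler means the SIMD and non-SIMD builds share one code path for short inputs, which makes it easier to compare results between the two in tests.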