Skip to content

Enable MSVC (cl.exe) build on Windows ARM64#1285

Open
pdeep854 wants to merge 1 commit intoxtensor-stack:masterfrom
pdeep854:msvc-arm64-enablement
Open

Enable MSVC (cl.exe) build on Windows ARM64#1285
pdeep854 wants to merge 1 commit intoxtensor-stack:masterfrom
pdeep854:msvc-arm64-enablement

Conversation

@pdeep854
Copy link
Copy Markdown

@pdeep854 pdeep854 commented Apr 1, 2026

PR: Enable MSVC (cl.exe) build on Windows ARM64

Summary

This PR restores full build compatibility for xsimd with the MSVC compiler (cl.exe) on
Windows ARM64 (_M_ARM64). Prior to this change, the project failed to compile with MSVC
on ARM64 due to fundamental differences in how MSVC exposes ARM NEON intrinsics compared
to GCC and Clang.

All changes are strictly additive and backward-compatible: every existing GCC/Clang code
path is preserved unchanged inside #else branches. No behaviour is altered for any
non-MSVC-ARM64 target.

Testing

The build was verified locally on a Windows ARM64 machine using Visual Studio 2022:

cmake .. -DBUILD_TESTS=ON -DDOWNLOAD_DOCTEST=ON -DBUILD_BENCHMARK=ON ^
         -DBUILD_EXAMPLES=ON -DCMAKE_BUILD_TYPE=Release ^
         -G "Visual Studio 17 2022"
cmake --build . --config Release

Result: All three targets build successfully with zero errors:

benchmark_xsimd.vcxproj -> ...\Release\benchmark_xsimd.exe
mandelbrot.vcxproj      -> ...\Release\mandelbrot.exe
test_xsimd.vcxproj      -> ...\Release\test_xsimd.exe
image

Copy link
Copy Markdown
Contributor

@serge-sans-paille serge-sans-paille left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First set of review, more to come later but at least it will trigger the discussion

#include <algorithm>
#include <array>
#include <complex>
#include <cstdio>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused header, probably a leftover from a debug session.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks will remove

for (size_t i = 0; i < sizeof...(Is); ++i)
if ((bitmask >> i) & 1u)
std::swap(mask_buffer[inserted++], mask_buffer[i]);
// Fill remaining positions with the last valid index to avoid undefined behavior
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not always setting zero?

Copy link
Copy Markdown
Author

@pdeep854 pdeep854 Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after swapping some unused entries at the end of mask_buffer still contain old index values. This mask is later used by swizzle() to gather vector elements. on MSVC arm64, swizzle() returns 0 for invalid indices, so these leftover values cause incorrect results.
to fix this the unused entries are filled with the last valid index, which acts as a safe placeholder. can also set it to 0

compress_mask.store_aligned(&mask_out[0]);
alignas(A::alignment()) T z_out[size];
z.store_aligned(&z_out[0]);
auto res = swizzle(z, compress_mask);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if the overload we were using previously actually exists?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks will remove it as it was added for debugging

alignas(16) T data[] = { static_cast<T>(args)... };
if constexpr (sizeof(T) == 1)
{
if constexpr (std::is_unsigned<T>::value)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xsimd requires only C++14 as of this PR :-/

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will make changes as per c++14

{
#if defined(_MSC_VER) && defined(_M_ARM64)
alignas(16) double data[] = { d0, d1 };
return vld1q_f64(data);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could make this the default if it's more prtable: https://godbolt.org/z/fP3Tvjv5f

static_cast<unsigned_type>(b0 ? -1LL : 0LL),
static_cast<unsigned_type>(b1 ? -1LL : 0LL)
};
return vld1q_u64(data);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants