@willieyz willieyz commented Sep 3, 2025

  • Resolves: Enable and test AArch64 backend on Arm Windows machines #1132
  • This PR does the following two things:
    • Adds the windows-11-arm label in base.yml to enable CI testing on windows-11-arm
    • Enables the Arm (AArch64) and x64 native backends on Windows:
      1. First, change the nmake build so that it builds the AArch64 backend only
      2. Later, build both the C and the AArch64 backend
      3. Finally, make sure this does not break on x86

@willieyz willieyz marked this pull request as ready for review September 4, 2025 09:56
@willieyz willieyz requested a review from a team as a code owner September 4, 2025 09:56
@willieyz willieyz requested a review from rod-chapman September 4, 2025 09:56
@mkannwischer mkannwischer marked this pull request as draft September 4, 2025 09:57
@willieyz willieyz marked this pull request as ready for review September 4, 2025 09:57
@willieyz willieyz marked this pull request as draft September 4, 2025 09:57
@mkannwischer

@rod-chapman - no need to review this one. @willieyz meant #1176

willieyz commented Sep 4, 2025

> @rod-chapman - no need to review this one. @willieyz meant #1176

Yes, thank you for the heads-up. My apologies for the accidental click.

@willieyz willieyz changed the title Enable and test AArch64 backend on Arm Windows machines WIP: Enable and test AArch64 backend on Arm Windows machines Sep 5, 2025
@@ -175,7 +175,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        system: [windows-2025, windows-2022]
+        system: [windows-2025, windows-2022, windows-11-arm]

Can we use MLK_FORCE_AARCH64 on windows-11-arm to double-check that the system is detected as Arm?

@@ -175,7 +175,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        system: [windows-2025, windows-2022]
+        system: [windows-2025, windows-2022, windows-11-arm]
     name: Quickcheck ${{ matrix.system }}
     runs-on: ${{ matrix.system }}
     steps:

Below, windows-latest is used for mingw. Should we split this now and test with x86-Windows and Arm-Windows?

@hanno-becker hanno-becker Sep 5, 2025

I tested forcing windows-11-arm for the mingw test and setting MLK_FORCE_AARCH64, but it fails. Any idea why? @mkannwischer @willieyz


After dumping all predefined preprocessor macros into the GitHub summary of the mingw action, I doubt this is actually picking up an Arm instance: https://github.com/pq-code-package/mlkem-native/actions/runs/17483331582

Compiler location: /c/mingw64/bin/gcc
Compiler: gcc.exe (x86_64-posix-seh-rev2, Built by MinGW-W64 project) 12.2.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Compiler directives:
#define DBL_MIN_EXP (-1021)
#define UINT_LEAST16_MAX 0xffff
#define FLT16_HAS_QUIET_NAN 1
#define __ATOMIC_ACQUIRE 2
...
#define __x86_64 1
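
The dump above ends with `__x86_64` being defined, which means the compiler targets x86_64 regardless of what the runner hardware is. A minimal sketch of how such a check could be made explicit at compile time is shown here. The predefined macros are the usual GCC/Clang and MSVC ones; `example_target_arch` and `EXAMPLE_FORCE_AARCH64` are hypothetical names used only for illustration (the latter stands in for a switch like MLK_FORCE_AARCH64, whose exact behaviour is not shown in this thread).

```c
/* Detect the *target* architecture from compiler-predefined macros.
 * __aarch64__ is set by GCC/Clang and _M_ARM64 by MSVC when targeting
 * AArch64; __x86_64__/__x86_64 (GCC/Clang) and _M_X64 (MSVC) indicate
 * an x86_64 target. */
#if defined(__aarch64__) || defined(_M_ARM64)
#define EXAMPLE_SYS_AARCH64 1
#elif defined(__x86_64__) || defined(__x86_64) || defined(_M_X64)
#define EXAMPLE_SYS_X86_64 1
#endif

/* Hypothetical "force" switch: fail the build if AArch64 was requested
 * but the compiler does not actually target it. */
#if defined(EXAMPLE_FORCE_AARCH64) && !defined(EXAMPLE_SYS_AARCH64)
#error "EXAMPLE_FORCE_AARCH64 is set, but the compiler does not target AArch64"
#endif

/* Returns a short name for the architecture this file was compiled for. */
const char *example_target_arch(void)
{
#if defined(EXAMPLE_SYS_AARCH64)
  return "aarch64";
#elif defined(EXAMPLE_SYS_X86_64)
  return "x86_64";
#else
  return "unknown";
#endif
}
```

Printing the result of `example_target_arch()` in a CI step would report the compiler's target, not the host CPU, which is the distinction that matters when x86_64 binaries are transparently emulated.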

@hanno-becker hanno-becker left a comment

See the comments -- I am not seeing any evidence yet that windows-11-arm is actually picking up an Arm runner.

hanno-becker commented Sep 5, 2025

It appears that with the current MSVC setup job, an x86_64->x86_64 compiler is set up, which silently just works on windows-11-arm because x86_64 binaries are emulated automatically.

To build an arm64 binary, one can pass amd64_arm64 to the ilammy/msvc-dev action, which sets up an x86_64 compiler targeting arm64. This 'cross' compiler is then emulated to produce native binaries. I don't think ilammy/msvc-dev supports a native arm64 compiler.

I'm not sure if mingw64 supports Arm.

@willieyz willieyz force-pushed the add-aarch64Windows-backend-ci branch from 73596b0 to 059364a Compare September 9, 2025 02:02
@willieyz willieyz force-pushed the add-aarch64Windows-backend-ci branch from 9283c9c to a618270 Compare September 9, 2025 02:27
willieyz commented Sep 9, 2025

> It appears that with the current MSVC setup job, an x86_64->x86_64 compiler is set up, which silently just works on windows-11-arm because x86_64 binaries are emulated automatically.
>
> To build an arm64 binary, one can pass amd64_arm64 to the ilammy/msvc-dev action, which sets up an x86_64 compiler targeting arm64. This 'cross' compiler is then emulated to produce native binaries. I don't think ilammy/msvc-dev supports a native arm64 compiler.
>
> I'm not sure if mingw64 supports Arm.

Hello @mkannwischer, @hanno-becker,

This is a progress report on “Enable and test AArch64 backend on Arm Windows machines.”

Thanks to Hanno’s suggestion, I added windows-11-arm to our quickcheck-windows CI job, passed amd64_arm64 to ilammy/msvc-dev, and successfully verified that armasm64 works. armasm64 is the AArch64 assembler for Windows on Arm.

To verify that armasm64 works, we first built a simple Arm assembly example (test.asm):

        AREA    |.text|, CODE, READONLY, ALIGN=4

        EXPORT  add_two_numbers

add_two_numbers PROC
        ADD     w0, w0, w1    ; return w0 + w1 (32-bit add)
        RET
add_two_numbers ENDP

        END

Assembling it with

armasm64 test.asm test.obj

worked fine.
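
To go one step beyond "it assembles", the resulting object file could be linked against a small C caller. The following is only a sketch under stated assumptions: the prototype using `uint32_t` is inferred from the use of the `w0`/`w1` registers in test.asm, and `EXAMPLE_USE_ASM` and `example_check_add` are hypothetical names. When `EXAMPLE_USE_ASM` is not defined, a C reference with the same wrapping semantics is used so the harness can be tried on any machine.

```c
#include <stdint.h>

#if defined(EXAMPLE_USE_ASM)
/* Provided by test.obj, assembled from test.asm (assumed prototype). */
uint32_t add_two_numbers(uint32_t a, uint32_t b);
#else
/* Portable reference: 32-bit wrapping add, like ADD w0, w0, w1. */
uint32_t add_two_numbers(uint32_t a, uint32_t b)
{
  return a + b;
}
#endif

/* Returns 0 if add_two_numbers behaves as expected, nonzero otherwise. */
int example_check_add(void)
{
  if (add_two_numbers(2u, 3u) != 5u)
  {
    return 1;
  }
  /* A 32-bit add wraps around on overflow. */
  if (add_two_numbers(UINT32_MAX, 1u) != 0u)
  {
    return 2;
  }
  return 0;
}
```

On the Arm runner, a caller with a `main` that returns `example_check_add()` could presumably be built with `EXAMPLE_USE_ASM` defined and test.obj passed to the linker; the exact command line is untested here.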

Note: GNU-style Arm assembly code is not accepted by armasm64; it must be rewritten in the armasm-style syntax that armasm64 expects (called "MASM style" below).

Next, we took poly_reduce_asm.S as a minimal example and rewrote it in MASM style (poly_reduce_asm_MASM.asm). However, armasm64 could not assemble this file because of two issues (one solved, one still open):

  1. Loop labels
    In MASM style, labels cannot end with a colon, so the label has to be rewritten without the trailing colon (thanks to @mkannwischer for the suggestion). For example:
Lpoly_reduce_loop: ---> Lpoly_reduce_loop

  2. Preprocessing C syntax
    To preprocess the assembly, we tried:

cl /EP poly_reduce_asm_MASM.asm > poly_reduce_asm_MASM_processed.asm

    But the output was not as expected: it contained many blank lines and unrelated content (e.g., #pragma directives and typedefs such as typedef unsigned __int64 size_t;).

The following link contains the compiler option list for cl:
https://learn.microsoft.com/en-us/cpp/build/reference/compiler-options-listed-by-category?view=msvc-170

I am not sure yet whether any of these options is useful for our case.
The minimal example test.asm, as well as poly_reduce_asm_MASM.asm and poly_reduce_asm_MASM_processed.asm, can be found in the following zip file:
MASM_examples.zip

@hanno-becker

@willieyz Can you check in a minimal example if __ASSEMBLER__ is set when you preprocess with cl? If it is, you should not see C constructs like typedef appearing in the preprocessed assembly.
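
A minimal example for this check could look like the sketch below. `__ASSEMBLER__` is predefined by GCC and Clang when they preprocess assembly files; I am not aware of cl defining it on its own, so with cl it would presumably have to be passed explicitly (for example with the /D option), which is an assumption not verified here. All `example_` names and `EXAMPLE_Q` are hypothetical.

```c
/* Part shared between C and assembly: plain constants only. */
#define EXAMPLE_Q 3329

#if !defined(__ASSEMBLER__)
/* C-only part: system headers, typedefs and functions must be hidden
 * from the assembler, otherwise they end up in the preprocessed
 * assembly output. */
#include <stdint.h>

typedef int16_t example_coeff_t;

/* Only exists when this file is processed as C. */
int example_c_part_visible(void)
{
  return 1;
}
#endif /* !__ASSEMBLER__ */
```

If the macro is defined while preprocessing, only the `#define` line should survive and no typedef should appear in the output; if typedefs still show up, the guard is not active for that invocation.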
