Skip to content

Conversation

@Sbozzolo
Copy link
Member

@Sbozzolo Sbozzolo commented Dec 5, 2025

This commit runs the spack workflow on larger/GPU-accelerated runners.

Updated description: #578 (comment)

OLD description, new description in #578 (comment)

I added a full test matrix, but I could not get everything to work (despite fixing several bugs). Some of this work conflicts with #550, so I will address more of that there. I tried quick fixes, but I could not get the builds to work, so I am leaving this for future work.

To be noted:

There is this annoying bug with oneAPI and spack where the version of ifx is incorrectly reported, breaking everything. I asked about this on slack, and I was told that I should manually merge the two compiler entries in the package definition (as a workaround). I added a step to take care of this, but I hope we can remove it soon.

Good news that (after a week of fixing bugs), lots of variants are working, including OpenMP cases, cases with mixed compilers, GPU+LLVM, newer dependencies (e.g., eigen 5), et cetera.

The progression I see is:
(#576 at any time)

@Sbozzolo Sbozzolo added build Related to building ci Related to continuous integration (CI) and/or GitHub Actions GPU Related to GPU support spack labels Dec 5, 2025
@Sbozzolo Sbozzolo force-pushed the gbozzola/even_larger_runners branch from f5c241d to c48c530 Compare December 5, 2025 21:58
}

TEST_CASE("2D libCEED Interpolators", "[libCEED][Interpolator][Serial][Parallel][GPU]")
TEST_CASE("2D libCEED Interpolators", "[libCEED][Interpolator][Serial][Parallel]")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed because of #516

end

abstol = 2.0e-12
abstol = 1.0e-11
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember which case led me to change this, but I found it necessary

@Sbozzolo Sbozzolo force-pushed the gbozzola/even_larger_runners branch from c48c530 to 6c7fe66 Compare December 5, 2025 22:01
Comment on lines 266 to 279
strumpack_packages = ["ParMETIS", "METIS", "LAPACK", "BLAS", "MPI", "MPI_Fortran"]
if self.spec.satisfies("+openmp"):
strumpack_packages.append("OpenMP")
args.append(self.define("STRUMPACK_REQUIRED_PACKAGES", ";".join(strumpack_packages)))
scalapack_libs = self.spec["scalapack"].libs
fortran_libs = ""
if "gfortran" in self.compiler.fc:
fortran_libs = "gfortran"
elif "ifort" in self.compiler.fc or "ifx" in self.compiler.fc:
fortran_libs = "ifport;ifcore"
# For other compilers (flang, etc.), don't add extra libs
strumpack_libs = str(scalapack_libs).replace(" ", ";")
if fortran_libs:
strumpack_libs += ";" + fortran_libs
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this entire section might away with #550

@Sbozzolo Sbozzolo requested a review from hughcars December 8, 2025 22:32
Copy link
Collaborator

@hughcars hughcars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small question about usage of the cache, but generally LGTM.

@Sbozzolo Sbozzolo force-pushed the gbozzola/even_larger_runners branch from 6c7fe66 to f724e1b Compare December 9, 2025 19:42
@Sbozzolo Sbozzolo force-pushed the gbozzola/even_larger_runners branch from f724e1b to 656cce1 Compare December 9, 2025 21:48
@Sbozzolo Sbozzolo changed the base branch from main to gbozzola/hypre3 December 29, 2025 16:16
@Sbozzolo Sbozzolo force-pushed the gbozzola/even_larger_runners branch from a0ff316 to 9330759 Compare December 29, 2025 16:23
@Sbozzolo
Copy link
Member Author

Sbozzolo commented Dec 29, 2025

I updated the PR with the following changes:

More importantly, I changed from a allpairs type of testing to a more curated test matrix. allpairs was generating combinations that were valid but unnatural. Instead, here I test:

  • x64, with GCC/LLVM and OpenBlas, and with the Intel suite
  • arm64, with GCC/LLVM and OpenBlas, and with GCC and ARMPL
  • gpu with GCC
  • a combination where I turn unusual options on (static builds, openmp, int64)

For some builds, CI is now broken down in two steps: build (e.g., build-x64-gcc) and test (test-x64-gcc). The test build is a matrix run that re-uses the build artifacts generated in the build step and is run on several smaller runners (since we cannot take advantage of large runners in our regression suite, because of the small size of the problem). The build step compiles Palace with everything turned on (so that we can reuse the artifact). Currently, the test matrix is parametrized on the linear solver, but we can easily extend that.

To allow this workflow, I extended the runtest script to take command line arguments and overwrite the config files with the passed arguments. Environment variables are still accepted (but get overwritten if the corresponding command line argument is passed). I updated the documentation for this.

I also reorganized the workflows so that they are in repo-local actions. One of the undersired side effects of this is that the GitHub action log is not as neatly broken down in steps that show the individual time (which is very useful to optimize the runtime). I checked if there was a way around this, and could not find one. In the future, I will add back timing information.

I also tried adding macOS in the matrix, but found lots of problems:

  • Spack seems incapable to correctly handle armpl (partially because Spack's armpl has a constraint on gcc for some reason). Adding armpl in the mix prevents spack from concretizing
  • Python has to be compiled with apple-clang and cannot be compiled with gcc
  • OpenBlas 0.3.30 does not compile of Apple silicon
  • Strumpack forces OpenBlas with OpenMP, but apple-clang does not support OpenMP
  • Even when mostly using GCC, Spack often cannot concretize the environment

This commit runs the spack workflow on larger/GPU-accelerated runners.

I added a full test matrix, but I could not get everything to work. In
particular, the static builds have several linking errors, which
indicates some issue in the build system.

I tried quick fixes, but I could not get the builds to work, so I am
leaving this for future work.
At the end, we pay the cost of compiling them only once (with the
buildcache)
@Sbozzolo Sbozzolo force-pushed the gbozzola/even_larger_runners branch from 9330759 to 65dcf6f Compare December 29, 2025 16:44
Copy link
Collaborator

@hughcars hughcars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small question about how to disable failing tests in future if using matrices, i.e. one entry in a 3x3 matrix for example. But generally this looks good to me.

Comment on lines +61 to +64
# Intel
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \
| gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these be placed within the Setup actions straight after? Or is there value in having this broken out? I imagine they're fairly quick so the granularity is probably helpful even if it adds them on runs without these enabled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this separately to have only one apt update. apt update pull from the internet and takes O(10) seconds. I moved it out for etiquette (I don't want to make too many server calls if I can avoid that) and shave off that tiny amount of time. This is a micro-optimization and I can move this to the steps below if you think that would be better.

Comment on lines +92 to +103
- name: Install Spack
uses: actions/checkout@v6
with:
repository: spack/spack
path: spack
ref: ${{ inputs.spack-version }}

- name: Setup Spack environment
shell: bash
run: |
echo "$(realpath spack/bin)" >> "$GITHUB_PATH"
echo "SPACK_DISABLE_LOCAL_CONFIG=1" >> "$GITHUB_ENV"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this done for the cmake builds too? Again, probably not that big a deal as I bet it's very fast.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, this file is not used for the cmake builds. We might need a couple of small tweaks for that.

Comment on lines 134 to 147
test-x64-gcc:
needs: [filter, build-x64-gcc]
if: success()
strategy:
fail-fast: false
matrix:
solver: [MUMPS, SuperLU, STRUMPACK, Default]
runs-on: [self-hosted, x64, 4xlarge]
steps:
- uses: actions/checkout@v6
- uses: ./.github/actions/run-regression-tests
with:
toolchain: gcc
solver: ${{ matrix.solver }}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this matrix approach, if we have a failing test for one entry in the tensor product that we want to disable, how would we do that? I.e. only the STRUMPACK run on 4xlarge.

Copy link
Member Author

@Sbozzolo Sbozzolo Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The action defines three separate build-xxxxxx jobs and three corresponding test-xxxxxx jobs. The test jobs define a matrix (all of which running on 4xlarge). If you want to disable a specific combination, you can simply comment that out. For example, if you want to disable STRUMPACK for openmp, we would just have

  test-x64-gcc-openmp:
    needs: [filter, build-x64-gcc-openmp]
    if: success()
    strategy:
      fail-fast: false
      matrix:
        include: 
         - solver: MUMPS
         - solver: SuperLU
        # - solver: STRUMPACK
         - solver: Default
    runs-on: [self-hosted, x64, 4xlarge]
    steps:
      - uses: actions/checkout@v6
      - uses: ./.github/actions/run-regression-tests
        with:
          toolchain: gcc
          variant: "+openmp+int64~shared"
          solver: ${{ matrix.solver }}

# Find Umpire and Camp (needed when CUDA/HIP is enabled)
if(PALACE_WITH_CUDA OR PALACE_WITH_HIP)
find_package(umpire REQUIRED CONFIG)
find_package(camp REQUIRED CONFIG)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does camp do? I've not heard of it before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cuda_variant = f"+cuda cuda_arch={arch}"

# TODO: Remove me after blt > 0.7.1 is released
depends_on(f"blt@develop")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be modeled as a conflict for 0.7.1 and below? I.e. I think that will always pull develop if the conflict covers the most recent version, and then this will not require updating again (I think we're doing this for libceed for instance, depending on a version that doesn't exist yet).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can just say depends_on("[email protected]:").

depends_on("libxsmm+debug", when="build_type=Debug")
depends_on("libceed+libxsmm", when="@0.14:")
# NOTE: libxsmm builds on MacOS have linker issues
# https://github.com/libxsmm/libxsmm/issues/883
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the 0.15.0 spack recipe PR is in, I think you should open another one with these changes in addition. This seems like a really good upgrade of the general capability and robustness of the spack release.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We change other stuff in #550, but yes, I agree we should update the spack recipe.

Comment on lines +318 to +328
#### Command Line Arguments

- `PALACE_TEST`: Path to *Palace* executable and optional arguments (default: "`palace`")
- `NUM_PROC_TEST`: Number of MPI processes (default: number of physical cores)
- `OMP_NUM_THREADS`: Number of OpenMP threads (default: 1)
- `TEST_CASES`: Space-separated list of test cases to run (default: all examples)
The test runner supports command line arguments for configuration. Each argument can also be set via environment variables as fallbacks.

**Key Options:**

- `--palace-test`: Path to *Palace* executable and optional arguments (default: "`palace`")
- `--num-proc-test`: Number of MPI processes (default: number of physical cores)
- `--test-cases`: Space-separated list of test cases to run (default: all examples)

Run `julia --project runtests.jl --help` to see all available options with descriptions and defaults.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A much nicer approach

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

build Related to building ci Related to continuous integration (CI) and/or GitHub Actions GPU Related to GPU support spack

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants