Flamefire
Contributor

Allows installing as many of the given easyconfigs as possible, without stopping when a single one fails.

The main function will call sys.exit with the returned exit code, as if the run had failed with an EasyBuildError, so eb --keep-going ec1.eb ec2.eb still returns an error code that can be checked in scripts.
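The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `build_all`, `fake_build`, and the exit code value `1` are hypothetical names and assumptions made for the example only.

```python
import sys
from typing import Callable, Iterable, List, Tuple


def build_all(names: Iterable[str], build_one: Callable[[str], None],
              keep_going: bool = False) -> Tuple[List[Tuple[str, bool]], int]:
    """Build each item; return (per-item results, exit code for the process)."""
    results = []
    for name in names:
        try:
            build_one(name)
            results.append((name, True))
        except Exception as err:  # stand-in for a failed build
            print(f"== FAILED: {name}: {err}", file=sys.stderr)
            results.append((name, False))
            if not keep_going:
                break  # without keep_going, stop at the first failure
    # Assumption: any failure maps to a non-zero exit code (1 here).
    exit_code = 0 if all(ok for _, ok in results) else 1
    return results, exit_code


def fake_build(name: str) -> None:
    # Hypothetical builder: fails for any name containing "bad".
    if "bad" in name:
        raise RuntimeError("build step failed")


results, code = build_all(["bad-1.0.eb", "good-2.0.eb"], fake_build, keep_going=True)
print(results, code)
```

A caller would pass `code` to `sys.exit`, so that a wrapping script still sees a failure even though the later builds were attempted.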

Flamefire and others added 2 commits October 9, 2025 16:48
Co-authored-by: Jan André Reuter <[email protected]>
@Thyre
Collaborator

Thyre commented Oct 12, 2025

Here's the output for a very simple test, trying to install Intel compilers on aarch64 and some other EasyConfig at the same time:

Used command:

[reuter1@jrc0900 ~]$ eb --configfile=../jedi/.config/easybuild/config.cfg --keep-going intel-compilers-2025.2.0.eb M4-1.4.20.eb --rebuild --accept-eula-for=".*" --force-download sources

Output:

== Temporary log file in case of crash /tmp/eb-r600p1g2/easybuild-s_pf9cns.log
== processing EasyBuild easyconfig /p/project1/cswmanage/reuter1/EasyBuild/prog/easybuild/easyconfigs/i/intel-compilers/intel-compilers-2025.2.0.eb
== building and installing Core/intel-compilers/2025.2.0...
  >> installation prefix: /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/software/intel-compilers/2025.2.0
== fetching files and verifying checksums...

WARNING: Found file intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh at /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/i/intel-compilers/intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh, but re-downloading it anyway...

  >> download succeeded: https://registrationcenter-download.intel.com/akdlm/IRC_NAS/39c79383-66bf-4f44-a6dd-14366e34e255/intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh

WARNING: Found file intel-fortran-compiler-2025.2.0.534_offline.sh at /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/i/intel-compilers/intel-fortran-compiler-2025.2.0.534_offline.sh, but re-downloading it anyway...

  >> download succeeded: https://registrationcenter-download.intel.com/akdlm/IRC_NAS/2c69ab6a-dfff-4d8f-ae1c-8368c79a1709/intel-fortran-compiler-2025.2.0.534_offline.sh
  >> sources:
  >> /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/i/intel-compilers/intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh [SHA256: aea3c1ccb97728db138b4f11f771411264292ba7bbec313782229510c9b831bc]
  >> /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/i/intel-compilers/intel-fortran-compiler-2025.2.0.534_offline.sh [SHA256: 3808000bbcef15f17b608156b956e0114393a1b64ee6d9fb29be06450fa40083]
== ... (took 42 secs)
== creating build dir, resetting environment...
  >> build dir: /dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system
== ... (took < 1 sec)
== unpacking...
  >> running shell command:
        cp -dR /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/i/intel-compilers/intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh .
        [started at: 2025-10-12 14:16:49]
        [working dir: /dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/cp-jqz62zt8]
  >> command completed: exit 0, ran in < 1s
  >> running shell command:
        cp -dR /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/i/intel-compilers/intel-fortran-compiler-2025.2.0.534_offline.sh .
        [started at: 2025-10-12 14:16:49]
        [working dir: /dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/cp-2afxutgh]
  >> command completed: exit 0, ran in < 1s
== ... (took < 1 sec)
== patching...
== ... (took < 1 sec)
== preparing...
  >> (no build dependencies specified)
  >> loading modules for (runtime) dependencies:
  >>  * GCCcore/14.3.0
  >>  * binutils/2.44
== ... (took < 1 sec)
== configuring...
== ... (took < 1 sec)
== building...
== ... (took < 1 sec)
== testing...
== ... (took < 1 sec)
== installing...
== installing part 1/2 (intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh)...
  >> running shell command:
        HOME=/dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system  ./intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh -a --action install --silent --eula accept --install-dir /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/software/intel-compilers/2025.2.0
        [started at: 2025-10-12 14:16:49]
        [working dir: /dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/system-system-ml9cm8kw]

ERROR: Shell command failed!
    full command              ->  HOME=/dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system  ./intel-dpcpp-cpp-compiler-2025.2.0.527_offline.sh -a --action install --silent --eula accept --install-dir /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/software/intel-compilers/2025.2.0
    exit code                 ->  126
    called from               ->  'install_step_oneapi' function in /p/project1/cswmanage/reuter1/EasyBuild/prog/lib/python3.9/site-packages/easybuild/easyblocks/generic/intelbase.py (line 441)
    working directory         ->  /dev/shm/reuter1/easybuild/build/intelcompilers/2025.2.0/system-system
    output (stdout + stderr)  ->  /tmp/eb-r600p1g2/run-shell-cmd-output/system-system-ml9cm8kw/out.txt
    interactive shell script  ->  /tmp/eb-r600p1g2/run-shell-cmd-output/system-system-ml9cm8kw/cmd.sh

== ... (took 4 secs)
== FAILED: Installation ended unsuccessfully: shell command 'system-system ...' failed with exit code 126 in install step for intel-compilers-2025.2.0.eb (took 47 secs)
== Results of the build can be found in the log file(s) /tmp/eb-r600p1g2/easybuild-intel-compilers-2025.2.0-20251012.141606.eGYin.log
== processing EasyBuild easyconfig /p/project1/cswmanage/reuter1/EasyBuild/prog/easybuild/easyconfigs/m/M4/M4-1.4.20.eb
== building and installing Core/M4/1.4.20...
  >> installation prefix: /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/software/M4/1.4.20
== fetching files and verifying checksums...

WARNING: Found file m4-1.4.20.tar.gz at /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/m/M4/m4-1.4.20.tar.gz, but re-downloading it anyway...

  >> download succeeded: https://ftpmirror.gnu.org/gnu/m4/m4-1.4.20.tar.gz
  >> sources:
  >> /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/m/M4/m4-1.4.20.tar.gz [SHA256: 6ac4fc31ce440debe63987c2ebbf9d7b6634e67a7c3279257dc7361de8bdb3ef]

WARNING: Found file config.guess at /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/generic/eb_v5.1.3.dev0/ConfigureMake/config.guess, but re-downloading it anyway...

  >> download succeeded: https://git.savannah.gnu.org/cgit/config.git/plain/config.guess?id=28ea239c53a2d5d8800c472bc2452eaa16e37af2
== ... (took 2 secs)
== creating build dir, resetting environment...
  >> build dir: /dev/shm/reuter1/easybuild/build/M4/1.4.20/system-system
== ... (took < 1 sec)
== unpacking...
  >> running shell command:
        tar xzf /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/m/M4/m4-1.4.20.tar.gz
        [started at: 2025-10-12 14:16:56]
        [working dir: /dev/shm/reuter1/easybuild/build/M4/1.4.20/system-system]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/tar-bwj1k9qe]
  >> command completed: exit 0, ran in < 1s
== ... (took < 1 sec)
== patching...
== ... (took < 1 sec)
== preparing...
== ... (took < 1 sec)
== configuring...
  >> running shell command:
        /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/sources/generic/eb_v5.1.3.dev0/ConfigureMake/config.guess
        [started at: 2025-10-12 14:16:57]
        [working dir: /dev/shm/reuter1/easybuild/build/M4/1.4.20/system-system/m4-1.4.20]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/configguess-xz_q3btp]
  >> command completed: exit 0, ran in < 1s
  >> running shell command:
        ./configure --prefix=/p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/software/M4/1.4.20  --build=aarch64-unknown-linux-gnu  --host=aarch64-unknown-linux-gnu --enable-c++ CPPFLAGS=-fgnu89-inline
        [started at: 2025-10-12 14:16:57]
        [working dir: /dev/shm/reuter1/easybuild/build/M4/1.4.20/system-system/m4-1.4.20]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/configure-73f346ds]
  >> command completed: exit 0, ran in 00h00m41s
== ... (took 41 secs)
== building...
  >> running shell command:
        make  -j 16
        [started at: 2025-10-12 14:17:38]
        [working dir: /dev/shm/reuter1/easybuild/build/M4/1.4.20/system-system/m4-1.4.20]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/make-rn8l8dka]
  >> command completed: exit 0, ran in 00h00m01s
== ... (took 1 secs)
== testing...
== ... (took < 1 sec)
== installing...
  >> running shell command:
        make install
        [started at: 2025-10-12 14:17:41]
        [working dir: /dev/shm/reuter1/easybuild/build/M4/1.4.20/system-system/m4-1.4.20]
        [output and state saved to /tmp/eb-r600p1g2/run-shell-cmd-output/make-kpai64u_]
  >> command completed: exit 0, ran in < 1s
== ... (took < 1 sec)
== taking care of extensions...
== ... (took < 1 sec)
== restore after iterating...
== ... (took < 1 sec)
== postprocessing...
== ... (took < 1 sec)
== sanity checking...
  >> file 'bin/m4' found: OK
  >> loading modules: M4/1.4.20...
== ... (took < 1 sec)
== cleaning up...
== ... (took < 1 sec)
== creating module...
  >> generating module file @ /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/modules/all/Core/M4/1.4.20.lua
== ... (took < 1 sec)
== permissions...
== ... (took < 1 sec)
== packaging...
== ... (took < 1 sec)
== COMPLETED: Installation ended successfully (took 47 secs)
== Results of the build can be found in the log file(s) /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/software/M4/1.4.20/easybuild/easybuild-M4-1.4.20-20251012.141742.log
== Build succeeded for 1 out of 2
== Summary:
   * [FAILED]  Core/intel-compilers/2025.2.0
   * [SUCCESS] Core/M4/1.4.20

So this is working as expected. Might come in very handy in preparing 2025a on this system for PR testing...
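For scripted use, the summary at the end of the output above could be picked apart with something like the following. This is a sketch that assumes the `* [FAILED]` / `* [SUCCESS]` line format shown in this log stays stable, which is not guaranteed; `parse_summary` is a hypothetical helper, not part of EasyBuild.

```python
import re
from typing import Dict

# Matches lines such as "   * [FAILED]  Core/intel-compilers/2025.2.0"
SUMMARY_LINE = re.compile(r"^\s*\*\s+\[(FAILED|SUCCESS)\]\s+(\S+)\s*$")


def parse_summary(output: str) -> Dict[str, bool]:
    """Map each module name in the summary to True (success) or False (failed)."""
    status = {}
    for line in output.splitlines():
        match = SUMMARY_LINE.match(line)
        if match:
            status[match.group(2)] = match.group(1) == "SUCCESS"
    return status


sample = """\
== Build succeeded for 1 out of 2
== Summary:
   * [FAILED]  Core/intel-compilers/2025.2.0
   * [SUCCESS] Core/M4/1.4.20
"""
failed = [name for name, ok in parse_summary(sample).items() if not ok]
print(failed)  # ['Core/intel-compilers/2025.2.0']
```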

@Thyre
Collaborator

Thyre commented Oct 13, 2025

There are errors which this doesn't catch. Mainly, I've encountered these two so far:

ERROR: Failed to process easyconfig /p/project1/cswmanage/reuter1/EasyBuild/prog/easybuild/easyconfigs/n/nvtop/nvtop-3.2.0-GCCcore-14.2.0.eb: One or more OS dependencies were not found: [('libsystemd-dev', 'libudev-dev', 'systemd-devel')]
== Temporary log file in case of crash /p/project1/cswmanage/reuter1/EasyBuild/jedi/apps/log/easybuild-t5ux6ild.log
ERROR: One or more files not found: non_existent_config.eb (search paths: /p/project1/cswmanage/reuter1/EasyBuild/prog/easybuild/easyconfigs)

The second case is contrived, but I ran into it because the EasyStack I used had been created with an EasyConfig that doesn't exist upstream. I don't know how easy it would be to catch such cases as well, since at the moment they also abort --upload-test-report and --dump-test-report.

This honestly isn't a blocker though, as one can still use flags to work around the former and should make sure that the files exist for the latter.

@Flamefire
Contributor Author

I think having no or missing easyconfigs should error in any case as it might hint that you mistyped an option or similar issue where it might not do what you expected.

And we can argue that this option is documented to continue on a failed build and if it fails to parse it is a different issue.

@Thyre
Collaborator

Thyre commented Oct 14, 2025

> I think having no or missing easyconfigs should error in any case as it might hint that you mistyped an option or similar issue where it might not do what you expected.
>
> And we can argue that this option is documented to continue on a failed build and if it fails to parse it is a different issue.

Absolutely, a missing EasyConfig should error out in any case.
Missing OS deps (i.e. fails to parse) is a separate issue, and I think this should be handled in a separate PR, if at all.

I'm fine with keeping the current behavior for all three (--upload-test-report, --dump-test-report & --keep-going).

@Flamefire
Contributor Author

> Missing OS deps (i.e. fails to parse) is a separate issue, and I think this should be handled in a separate PR, if at all.

IIRC we have --ignore-osdeps for that.

And failing to parse an easyconfig could just as well mean that you accidentally passed an easyblock or patch instead of an easyconfig, so again it isn't a build issue of the kind we want to ignore with this option. If this can be made clearer in the description, we could do that. But as it fails right at the start, I think it is fine as-is, hence I wouldn't handle that and would just let it fail.
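To illustrate the distinction drawn here: input problems are checked once, up front, and abort the whole run before any build starts, so a keep-going mode never gets a chance to skip over them. The sketch below is only an illustration under that assumption; `check_inputs` and `MissingEasyconfigError` are hypothetical names, not EasyBuild's.

```python
import os
import tempfile
from typing import List


class MissingEasyconfigError(Exception):
    """Hypothetical error for input problems detected before any build."""


def check_inputs(paths: List[str]) -> List[str]:
    """Return paths unchanged if all exist; otherwise raise before building anything."""
    missing = [p for p in paths if not os.path.isfile(p)]
    if missing:
        names = ", ".join(os.path.basename(p) for p in missing)
        raise MissingEasyconfigError("One or more files not found: " + names)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "M4-1.4.20.eb")
    open(present, "w").close()  # create an empty placeholder file
    absent = os.path.join(tmp, "non_existent_config.eb")
    try:
        check_inputs([present, absent])
        message = ""
    except MissingEasyconfigError as err:
        message = str(err)

print(message)  # One or more files not found: non_existent_config.eb
```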
