[wip][ci] Use GA container and sccache. #11952
base: master
- Use the GA container.
- Use sccache. This currently applies only to the CMake-based build scripts.
This reverts commit 73ed5a7.
Remove the parameter.
Pull request overview
Updates CI to run more jobs inside GitHub Actions containers (GA container support) while adding/expanding sccache usage, consolidating workflows, and splitting Python wheel tests into distinct CPU vs GPU jobs.
Changes:
- Migrate multiple pipeline scripts from `docker_run.py` wrappers to “run inside container” execution and remove now-redundant companion `*-impl.sh` scripts.
- Add/expand `sccache` integration across CMake-based builds (JVM GPU build, macOS JVM build, CPU build variants).
- Restructure GitHub workflows: merge i386/docs workflows into `misc.yml`, add `ci-configure` jobs, and split Python wheel tests into `test-python-wheel-gpu` and `test-python-wheel-cpu`.
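As a rough illustration of the sccache integration mentioned in the list above, CMake-based builds can route compiler invocations through sccache using CMake's `CMAKE_<LANG>_COMPILER_LAUNCHER` cache variables. The helper name `sccache_cmake_flags` and its `cuda` argument are assumptions made for this sketch; the scripts in this PR may wire the flags differently.

```shell
# Sketch only: sccache_cmake_flags is a hypothetical helper, not from the PR.
# It prints the CMake cache options that route compilers through sccache.
sccache_cmake_flags() {
  # $1: pass "cuda" to also wrap the CUDA compiler; otherwise only C/C++.
  printf '%s\n' \
    "-DCMAKE_C_COMPILER_LAUNCHER=sccache" \
    "-DCMAKE_CXX_COMPILER_LAUNCHER=sccache"
  if [ "${1:-}" = "cuda" ]; then
    printf '%s\n' "-DCMAKE_CUDA_COMPILER_LAUNCHER=sccache"
  fi
}

# Example use (not executed here):
#   cmake -S . -B build $(sccache_cmake_flags cuda)
```

Printing one flag per line keeps the helper easy to inspect in CI logs; the flags contain no spaces, so unquoted command substitution is safe here.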
Reviewed changes
Copilot reviewed 29 out of 29 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| ops/pipeline/test-python-wheel.sh | Make --cuda-version optional for CPU suites and required only for GPU suites. |
| ops/pipeline/test-jvm-gpu.sh | Removed legacy wrapper script that ran JVM GPU tests via docker_run.py. |
| ops/pipeline/test-cpp-i386.sh | Switch i386 build/test entrypoint to reuse build-cpu.sh i386. |
| ops/pipeline/test-cpp-i386-impl.sh | Removed legacy i386 implementation script. |
| ops/pipeline/get-image-tag.sh | Add shebang and change CI image tag selection. |
| ops/pipeline/deploy-jvm-packages.sh | Move deploy logic into the main script (no container wrapper) and simplify inputs. |
| ops/pipeline/deploy-jvm-packages-impl.sh | Removed legacy deploy implementation script. |
| ops/pipeline/build-test-jvm-packages.sh | Convert to “runs inside container” script; add Spark compatibility checks and env-driven CUDA behavior. |
| ops/pipeline/build-test-jvm-packages-impl.sh | Removed legacy JVM build/test implementation script. |
| ops/pipeline/build-test-cpu-nonomp.sh | Removed legacy non-OpenMP CPU build script in favor of build-cpu.sh cpu-nonomp. |
| ops/pipeline/build-r-docs.sh | Convert to “runs inside container” R doc build and tarball packaging. |
| ops/pipeline/build-r-docs-impl.sh | Removed legacy R docs implementation script. |
| ops/pipeline/build-jvm-macos.sh | Unify macOS JVM dylib build and add sccache compiler launchers. |
| ops/pipeline/build-jvm-macos-intel.sh | Removed separate Intel macOS JVM build script. |
| ops/pipeline/build-jvm-gpu.sh | Convert to “runs inside container” CUDA JVM build and add sccache launchers including CUDA. |
| ops/pipeline/build-jvm-doc.sh | Convert to “runs inside container” JVM docs build and tarball packaging. |
| ops/pipeline/build-jvm-doc-impl.sh | Removed legacy JVM docs implementation script. |
| ops/pipeline/build-gpu-rpkg.sh | Convert to “runs inside container” GPU R package build and tarball packaging; add sccache launchers. |
| ops/pipeline/build-gpu-rpkg-impl.sh | Removed legacy GPU R package implementation script. |
| ops/pipeline/build-cpu.sh | Extend CPU build script with cpu-nonomp and i386 suites to replace removed scripts. |
| .github/workflows/r_tests.yml | Rename workflow. |
| .github/workflows/python_tests.yml | Rename workflow. |
| .github/workflows/misc.yml | Add ci-configure, fold in i386 + docs jobs, and update non-OpenMP build to use build-cpu.sh. |
| .github/workflows/main.yml | Use container + sccache for GPU R pkg build; split Python wheel tests into separate GPU/CPU jobs; remove Docker daemon restarts. |
| .github/workflows/lint.yml | Remove Docker daemon restart step. |
| .github/workflows/jvm_tests.yml | Add ci-configure, migrate to container + sccache, unify macOS build, and adjust JVM GPU testing/deploy flow. |
| .github/workflows/i386.yml | Remove standalone i386 workflow (moved into misc.yml). |
| .github/workflows/doc.yml | Remove standalone docs workflow (moved into misc.yml). |
| .github/workflows/cccl_nightly.yml | Remove Docker daemon restart steps. |
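The `test-python-wheel.sh` row in the table above describes `--cuda-version` as optional for CPU suites and required only for GPU suites. A minimal sketch of that kind of check follows. The function name `require_cuda_version`, the suite prefixes `gpu`/`cpu`, and the error message are all assumptions for illustration and are not taken from the actual script.

```shell
# Sketch only: require_cuda_version and the "gpu*" suite pattern are
# hypothetical; the real script's suite names and messages may differ.
require_cuda_version() {
  suite="$1"
  cuda_version="${2:-}"
  case "$suite" in
    gpu*)
      # GPU suites need a CUDA version; fail early with a clear message.
      if [ -z "$cuda_version" ]; then
        echo "error: --cuda-version is required for suite '$suite'" >&2
        return 1
      fi
      ;;
  esac
  # CPU (and any other) suites accept a missing CUDA version.
  return 0
}
```

Called as `require_cuda_version cpu`, the check succeeds without a version; called as `require_cuda_version gpu` with no second argument, it returns a non-zero status.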
    ## See https://xgboost.readthedocs.io/en/latest/contrib/ci.html#making-changes-to-ci-containers

    -IMAGE_TAG=main
    +IMAGE_TAG=PR-75
Copilot AI, Jan 24, 2026
IMAGE_TAG is hardcoded to "PR-75", which will force all workflows sourcing this script to pull that tag (and likely fail once PR-75 is gone). Consider defaulting to a stable tag (e.g., main) and/or allowing an override via env var or workflow input, rather than committing a PR-specific tag.
    -IMAGE_TAG=PR-75
    +IMAGE_TAG="${IMAGE_TAG:-main}"
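The suggested change relies on the shell's `${VAR:-default}` expansion, which substitutes the default when the variable is unset or empty. The sketch below shows that behavior in isolation; `resolve_image_tag` is a hypothetical helper used only for demonstration and does not appear in `get-image-tag.sh`.

```shell
# Sketch only: resolve_image_tag is a hypothetical helper, not part of the PR.
resolve_image_tag() {
  # Use IMAGE_TAG from the environment when set and non-empty; else "main".
  printf '%s\n' "${IMAGE_TAG:-main}"
}
```

With this pattern, a workflow that needs a PR-specific container can set `IMAGE_TAG` for one run, while every other caller falls back to `main`.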
The CI failure is unrelated to the changes here. We unfroze the dask/distributed version in the xgboost-devop, which updates Dask and causes the flaky error here due to an empty DMatrix (among other things). I will open a different PR for a workaround.
todos:
Example runs: