trivialfis (Member) commented Jan 23, 2026

  • Use GA container and sccache for JVM, R, and macOS builds.
  • Merge some of the workflows to reduce logins.
  • Split the Python tests into CPU jobs and GPU jobs so that the CPU tests are no longer tagged with "CUDA".

todos:

Example runs:

Copilot AI (Contributor) left a comment

Pull request overview

Updates CI to run more jobs inside GitHub Actions containers (GA container support) while adding/expanding sccache usage, consolidating workflows, and splitting Python wheel tests into distinct CPU vs GPU jobs.

Changes:

  • Migrate multiple pipeline scripts from docker_run.py wrappers to “run inside container” execution and remove now-redundant companion *-impl.sh scripts.
  • Add/expand sccache integration across CMake-based builds (JVM GPU build, macOS JVM build, CPU build variants).
  • Restructure GitHub workflows: merge i386/docs workflows into misc.yml, add ci-configure jobs, and split Python wheel tests into test-python-wheel-gpu and test-python-wheel-cpu.
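The sccache integration mentioned above typically comes down to passing compiler-launcher variables to CMake. The following is a minimal sketch, not the PR's actual script: the helper name `sccache_cmake_args` and its single 0/1 argument are hypothetical, while `CMAKE_<LANG>_COMPILER_LAUNCHER` is the standard CMake variable family for wrapping compiler invocations.

```shell
#!/usr/bin/env bash
# Sketch: emit CMake flags that route compiler invocations through sccache.
# The helper name and its argument are hypothetical, not taken from the PR.

sccache_cmake_args() {
  # $1 = 1 to also wrap the CUDA compiler, anything else for a CPU-only build.
  local use_cuda="${1:-0}"
  local args=(
    "-DCMAKE_C_COMPILER_LAUNCHER=sccache"
    "-DCMAKE_CXX_COMPILER_LAUNCHER=sccache"
  )
  if [[ "${use_cuda}" == "1" ]]; then
    args+=("-DCMAKE_CUDA_COMPILER_LAUNCHER=sccache")
  fi
  # One flag per line so callers can read them into an array.
  printf '%s\n' "${args[@]}"
}

# Example usage (commented out so the sketch has no side effects):
# mapfile -t launcher_args < <(sccache_cmake_args 1)
# cmake -B build -S . -GNinja "${launcher_args[@]}"
```

Printing one flag per line keeps the helper usable from both CPU and CUDA build scripts without duplicating the flag list.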

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| ops/pipeline/test-python-wheel.sh | Make --cuda-version optional for CPU suites and required only for GPU suites. |
| ops/pipeline/test-jvm-gpu.sh | Removed legacy wrapper script that ran JVM GPU tests via docker_run.py. |
| ops/pipeline/test-cpp-i386.sh | Switch i386 build/test entrypoint to reuse build-cpu.sh i386. |
| ops/pipeline/test-cpp-i386-impl.sh | Removed legacy i386 implementation script. |
| ops/pipeline/get-image-tag.sh | Add shebang and change CI image tag selection. |
| ops/pipeline/deploy-jvm-packages.sh | Move deploy logic into the main script (no container wrapper) and simplify inputs. |
| ops/pipeline/deploy-jvm-packages-impl.sh | Removed legacy deploy implementation script. |
| ops/pipeline/build-test-jvm-packages.sh | Convert to “runs inside container” script; add Spark compatibility checks and env-driven CUDA behavior. |
| ops/pipeline/build-test-jvm-packages-impl.sh | Removed legacy JVM build/test implementation script. |
| ops/pipeline/build-test-cpu-nonomp.sh | Removed legacy non-OpenMP CPU build script in favor of build-cpu.sh cpu-nonomp. |
| ops/pipeline/build-r-docs.sh | Convert to “runs inside container” R doc build and tarball packaging. |
| ops/pipeline/build-r-docs-impl.sh | Removed legacy R docs implementation script. |
| ops/pipeline/build-jvm-macos.sh | Unify macOS JVM dylib build and add sccache compiler launchers. |
| ops/pipeline/build-jvm-macos-intel.sh | Removed separate Intel macOS JVM build script. |
| ops/pipeline/build-jvm-gpu.sh | Convert to “runs inside container” CUDA JVM build and add sccache launchers including CUDA. |
| ops/pipeline/build-jvm-doc.sh | Convert to “runs inside container” JVM docs build and tarball packaging. |
| ops/pipeline/build-jvm-doc-impl.sh | Removed legacy JVM docs implementation script. |
| ops/pipeline/build-gpu-rpkg.sh | Convert to “runs inside container” GPU R package build and tarball packaging; add sccache launchers. |
| ops/pipeline/build-gpu-rpkg-impl.sh | Removed legacy GPU R package implementation script. |
| ops/pipeline/build-cpu.sh | Extend CPU build script with cpu-nonomp and i386 suites to replace removed scripts. |
| .github/workflows/r_tests.yml | Rename workflow. |
| .github/workflows/python_tests.yml | Rename workflow. |
| .github/workflows/misc.yml | Add ci-configure, fold in i386 + docs jobs, and update non-OpenMP build to use build-cpu.sh. |
| .github/workflows/main.yml | Use container + sccache for GPU R pkg build; split Python wheel tests into separate GPU/CPU jobs; remove Docker daemon restarts. |
| .github/workflows/lint.yml | Remove Docker daemon restart step. |
| .github/workflows/jvm_tests.yml | Add ci-configure, migrate to container + sccache, unify macOS build, and adjust JVM GPU testing/deploy flow. |
| .github/workflows/i386.yml | Remove standalone i386 workflow (moved into misc.yml). |
| .github/workflows/doc.yml | Remove standalone docs workflow (moved into misc.yml). |
| .github/workflows/cccl_nightly.yml | Remove Docker daemon restart steps. |
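
The summary for ops/pipeline/test-python-wheel.sh describes --cuda-version as optional for CPU suites and required only for GPU suites. A rough sketch of that validation rule might look like the following. The function name, the suite names (cpu, gpu, mgpu), and the argument layout are all assumptions for illustration; the real script's interface may differ.

```shell
#!/usr/bin/env bash
# Sketch only: suite names (cpu, gpu, mgpu) and the argument layout are
# assumptions, not the interface of the real test-python-wheel.sh.

# Prints "<suite> <cuda-version-or-none>" on success; returns 1 on bad input.
parse_wheel_test_args() {
  local suite="" cuda_version=""
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --cuda-version)
        if [[ $# -lt 2 ]]; then
          echo "error: --cuda-version needs a value" >&2
          return 1
        fi
        cuda_version="$2"
        shift 2
        ;;
      -*)
        echo "error: unknown option: $1" >&2
        return 1
        ;;
      *)
        suite="$1"
        shift
        ;;
    esac
  done

  case "${suite}" in
    gpu|mgpu)
      # GPU suites cannot pick a CUDA toolchain without an explicit version.
      if [[ -z "${cuda_version}" ]]; then
        echo "error: --cuda-version is required for suite '${suite}'" >&2
        return 1
      fi
      ;;
    cpu)
      ;;  # CPU suite: the CUDA version is optional and may stay empty.
    *)
      echo "error: unknown or missing suite: '${suite}'" >&2
      return 1
      ;;
  esac
  echo "${suite} ${cuda_version:-none}"
}
```

Validating after the parsing loop, rather than inside it, lets the option appear before or after the suite name.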


## See https://xgboost.readthedocs.io/en/latest/contrib/ci.html#making-changes-to-ci-containers

-IMAGE_TAG=main
+IMAGE_TAG=PR-75
Copilot AI Jan 24, 2026

IMAGE_TAG is hardcoded to "PR-75", which will force all workflows sourcing this script to pull that tag (and likely fail once PR-75 is gone). Consider defaulting to a stable tag (e.g., main) and/or allowing an override via env var or workflow input, rather than committing a PR-specific tag.

Suggested change
-IMAGE_TAG=PR-75
+IMAGE_TAG="${IMAGE_TAG:-main}"
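
To illustrate how the suggested `${IMAGE_TAG:-main}` expansion behaves, here is a small sketch. The wrapper function `resolve_image_tag` is hypothetical and exists only to make the behavior easy to exercise; the suggestion itself is a single assignment.

```shell
#!/usr/bin/env bash
# Sketch: resolve the CI image tag, defaulting to "main" unless the caller
# already set IMAGE_TAG. The function wrapper is hypothetical.

resolve_image_tag() {
  # ${VAR:-default} expands to the default when VAR is unset or empty.
  IMAGE_TAG="${IMAGE_TAG:-main}"
  echo "${IMAGE_TAG}"
}
```

With this form, a workflow that needs a PR-specific container can set IMAGE_TAG=PR-75 in its environment before sourcing the script, while every other caller falls back to main.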

trivialfis (Member, Author) commented Jan 24, 2026

The CI failure is unrelated to the changes here. We unfroze the dask/distributed version in the xgboost-devop, which updates Dask and causes the flaky error here due to an empty DMatrix (among other things).

I will open a different PR for a workaround.
