[CI][BUILD] update dockerfile and build script#170

Open
jikunshang wants to merge 3 commits into vllm-project:main from jikunshang:kunshang/umd-control

@jikunshang jikunshang commented Mar 4, 2026

…ild wheel script

Essential Elements of an Effective PR Description Checklist

  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS ABOVE HAVE BEEN CONSIDERED.

Purpose

  1. Use a config file to control which UMD/IGC version is installed when building the Docker image.
  2. Add a script for building the wheel.

Example usage:

./build_script/build_wheel_docker.sh --output-dir /tmp/wheels --version 1.0.0
./build_script/build_wheel_docker.sh --no-cache --max-jobs 8
# build a wheel using specific igc/umd version
./build_script/build_wheel_docker.sh --gpu-profile igc-2.27.10-cr-26.01

The default profile should use IGC 2.22 or 2.24 because of known IGC issues in other versions:

  1. 2.11 does not support g31.
  2. 2.27: kernel builds hang.
  3. 2.28: some Triton kernels fail in vllm-xpu CI.

Test Plan

Test Result

(Optional) Documentation Update

BEFORE SUBMITTING, PLEASE READ https://docs.vllm.ai/en/latest/contributing (anything written below this line will be removed by GitHub Actions)

…ild wheel script

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Copilot AI review requested due to automatic review settings March 4, 2026 01:24

Copilot AI left a comment

Pull request overview

This PR updates the XPU Docker build to install Intel GPU runtime packages via a selectable JSON “profile”, and adds a helper script to build the vllm-xpu-kernels wheel inside Docker for reproducibility.

Changes:

  • Added a profile-driven GPU runtime installer script + JSON config of runtime package URLs.
  • Updated Dockerfile.xpu to install the GPU runtime using the new profile mechanism (GPU_RUNTIME_PROFILE build-arg).
  • Added a build_wheel_docker.sh script to build and export wheels from a Docker container.
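
As a rough illustration of the profile mechanism summarized above, a wrapper script might forward the selected profile to `docker build` through the GPU_RUNTIME_PROFILE build-arg. This is a minimal sketch, not the PR's actual build_wheel_docker.sh: the helper name `build_docker_args` and the image tag `vllm-xpu-wheel-build` are hypothetical, and the arguments are only printed so they can be inspected without Docker.

```shell
# Hypothetical helper: assemble `docker build` arguments for Dockerfile.xpu.
# The body runs in a subshell so its variables do not leak into the caller.
# Prints one argument per line.
build_docker_args() (
    image_name="$1"
    profile="${2:-default}"
    no_cache="${3:-}"
    # GPU_RUNTIME_PROFILE is the build-arg named in the overview above.
    set -- build -f Dockerfile.xpu -t "${image_name}" \
        --build-arg "GPU_RUNTIME_PROFILE=${profile}"
    if [ -n "${no_cache}" ]; then
        set -- "$@" --no-cache
    fi
    # Build context: the current directory (an assumption of this sketch).
    set -- "$@" .
    printf '%s\n' "$@"
)

# Example: inspect the arguments for a specific profile.
build_docker_args vllm-xpu-wheel-build igc-2.27.10-cr-26.01
```

Printing the arguments instead of invoking Docker keeps the sketch easy to check; a real script would pass the same list to `docker` directly.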

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 10 comments.

File | Description
build_script/install_gpu_runtime.sh | New installer that selects a profile from JSON and installs matching .deb packages.
build_script/gpu_runtime_packages.json | Defines available GPU runtime profiles and their package URLs.
build_script/build_wheel_docker.sh | New script to build wheels inside the Dockerfile.xpu-built image and copy artifacts to host.
Dockerfile.xpu | Switches GPU runtime install from hardcoded URLs to the new configurable installer/profile.

Comment on lines +49 to +62
# Download and install
WORKDIR=$(mktemp -d)
echo "==> Downloading packages to ${WORKDIR} ..."

while IFS= read -r url; do
    echo " ${url}"
    wget -q -P "${WORKDIR}" "${url}"
done <<< "${URLS}"

echo "==> Installing .deb packages ..."
dpkg -i "${WORKDIR}"/*.deb

echo "==> Cleaning up ..."
rm -rf "${WORKDIR}"

Copilot AI Mar 4, 2026

WORKDIR=$(mktemp -d) is only cleaned up on the success path. With set -e, any failure (wget/dpkg/python) will leak the temp directory. Add a trap to remove ${WORKDIR} on EXIT/ERR after creation.
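
One possible shape for the cleanup described above is sketched here. It assumes the download and install steps can be wrapped in a command that receives the temp directory as its last argument; `with_tempdir` is a hypothetical helper, not part of the PR. An EXIT trap also fires when `set -e` aborts the shell, so this sketch does not add a separate ERR trap.

```shell
# Hypothetical helper: run a command with a fresh temp directory appended as
# its last argument, and remove that directory on success or failure.
# The function body is a subshell, so the EXIT trap is scoped to this call.
with_tempdir() (
    set -e
    workdir="$(mktemp -d)"
    # Fires whenever the subshell exits, including exits caused by `set -e`.
    trap 'rm -rf "${workdir}"' EXIT
    "$@" "${workdir}"
)

# Example: the directory exists while the command runs and is gone afterwards.
with_tempdir ls -d
```

Scoping the trap to a subshell avoids overwriting any EXIT trap the calling script may already have installed.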

Copilot uses AI. Check for mistakes.
Comment on lines +26 to +27
COPY build_script/gpu_runtime_packages.json build_script/install_gpu_runtime.sh /tmp/
RUN bash /tmp/install_gpu_runtime.sh /tmp/gpu_runtime_packages.json ${GPU_RUNTIME_PROFILE}

Copilot AI Mar 4, 2026

The copied config file name is /tmp/gpu_runtime_packages.json, but the RUN line passes /tmp/gpu_runtime_packages.json (missing gpu_runtime_). As written, install_gpu_runtime.sh will fail with “Config file not found”. Fix the path (or rename the copied file) so it matches the actual filename.

Suggested change
- COPY build_script/gpu_runtime_packages.json build_script/install_gpu_runtime.sh /tmp/
- RUN bash /tmp/install_gpu_runtime.sh /tmp/gpu_runtime_packages.json ${GPU_RUNTIME_PROFILE}
+ COPY build_script/gpu_runtime_packages.json /tmp/gpu_packages.json
+ COPY build_script/install_gpu_runtime.sh /tmp/
+ RUN bash /tmp/install_gpu_runtime.sh /tmp/gpu_packages.json ${GPU_RUNTIME_PROFILE}

Comment on lines +24 to +43
PROFILE=$(python3 -c "
import json, sys
cfg = json.load(open('${CONFIG_FILE}'))
print(cfg['default'])
")
fi

echo "==> Installing GPU runtime packages from profile: ${PROFILE}"

# Extract the list of URLs for the selected profile
URLS=$(python3 -c "
import json, sys
cfg = json.load(open('${CONFIG_FILE}'))
profile = cfg['profiles'].get('${PROFILE}')
if profile is None:
    print(f'ERROR: Profile \"${PROFILE}\" not found. Available: {list(cfg[\"profiles\"].keys())}', file=sys.stderr)
    sys.exit(1)
print('\n'.join(profile['packages']))
")

Copilot AI Mar 4, 2026

The python snippets interpolate ${CONFIG_FILE} / ${PROFILE} directly into Python source code inside a double-quoted python3 -c "..." string. If PROFILE (or the file path) contains quotes/newlines, this can break parsing and can allow code injection during the build. Prefer passing values via environment variables / argv (e.g., python3 -c '...' -- "$CONFIG_FILE" "$PROFILE") or using a here-doc with no shell interpolation.

Suggested change
-     PROFILE=$(python3 -c "
- import json, sys
- cfg = json.load(open('${CONFIG_FILE}'))
- print(cfg['default'])
- ")
- fi
- echo "==> Installing GPU runtime packages from profile: ${PROFILE}"
- # Extract the list of URLs for the selected profile
- URLS=$(python3 -c "
- import json, sys
- cfg = json.load(open('${CONFIG_FILE}'))
- profile = cfg['profiles'].get('${PROFILE}')
- if profile is None:
-     print(f'ERROR: Profile \"${PROFILE}\" not found. Available: {list(cfg[\"profiles\"].keys())}', file=sys.stderr)
-     sys.exit(1)
- print('\n'.join(profile['packages']))
- ")
+     PROFILE=$(python3 - "$CONFIG_FILE" << 'PY'
+ import json
+ import sys
+ config_path = sys.argv[1]
+ with open(config_path, 'r') as f:
+     cfg = json.load(f)
+ print(cfg['default'])
+ PY
+ )
+ fi
+ echo "==> Installing GPU runtime packages from profile: ${PROFILE}"
+ # Extract the list of URLs for the selected profile
+ URLS=$(python3 - "$CONFIG_FILE" "$PROFILE" << 'PY'
+ import json
+ import sys
+ config_path = sys.argv[1]
+ profile_name = sys.argv[2]
+ with open(config_path, 'r') as f:
+     cfg = json.load(f)
+ profiles = cfg.get('profiles', {})
+ profile = profiles.get(profile_name)
+ if profile is None:
+     available = list(profiles.keys())
+     print(f'ERROR: Profile "{profile_name}" not found. Available: {available}', file=sys.stderr)
+     sys.exit(1)
+ packages = profile.get('packages', [])
+ print('\n'.join(packages))
+ PY
+ )

done <<< "${URLS}"

echo "==> Installing .deb packages ..."
dpkg -i "${WORKDIR}"/*.deb

Copilot AI Mar 4, 2026

Installing the downloaded .debs via dpkg -i can leave the image in a broken state if dependencies are missing (dpkg doesn’t resolve deps). Prefer apt-get install -y ./*.deb (after apt-get update) or follow dpkg -i with apt-get -f install -y to reliably resolve dependencies.

Suggested change
- dpkg -i "${WORKDIR}"/*.deb
+ apt-get update && apt-get install -y "${WORKDIR}"/*.deb

"igc-2.24.8-cr-25.48": {
"description": "IGC 2.24.8 + Compute Runtime 25.48.36300.8 + Level Zero v1.26.0",
"packages": [
"https://github.com/intel/intel-graphics-compiler/releases/download/v2.24.8/intel-igc-core-2_2.24.8+20344_amd64.deb ",

Copilot AI Mar 4, 2026

One of the package URLs has a trailing space. Because install_gpu_runtime.sh uses IFS= with read -r, that whitespace will be preserved and passed to wget, causing the download to fail. Please remove the trailing space (or trim URLs in the installer).
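
The alternative mentioned above, trimming URLs in the installer, might look like the following sketch. `trim_urls` is a hypothetical helper name and the example.com URLs are placeholders; the installer's real loop may need a different integration point.

```shell
# Hypothetical helper: read URLs on stdin (one per line), strip leading and
# trailing whitespace from each, and drop lines that end up empty.
trim_urls() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Example with placeholder URLs: one has a trailing space, one a leading indent.
printf '%s\n' 'https://example.com/a.deb ' '  https://example.com/b.deb' | trim_urls
```

Filtering the list once, before the download loop, leaves the loop itself unchanged and makes the installer tolerant of stray whitespace in the JSON.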

Suggested change
- "https://github.com/intel/intel-graphics-compiler/releases/download/v2.24.8/intel-igc-core-2_2.24.8+20344_amd64.deb ",
+ "https://github.com/intel/intel-graphics-compiler/releases/download/v2.24.8/intel-igc-core-2_2.24.8+20344_amd64.deb",

Comment on lines +43 to +50
"igc-2.27.10-cr-26.01": {
"description": "IGC 2.27.10 + Compute Runtime 25.53.36486.8 + Level Zero v1.26.2",
"packages": [
"https://github.com/intel/intel-graphics-compiler/releases/download/v2.27.10/intel-igc-core-2_2.27.10+20617_amd64.deb",
"https://github.com/intel/intel-graphics-compiler/releases/download/v2.27.10/intel-igc-opencl-2_2.27.10+20617_amd64.deb",
"https://github.com/intel/compute-runtime/releases/download/26.01.36711.4/intel-ocloc_26.01.36711.4-0_amd64.deb",
"https://github.com/intel/compute-runtime/releases/download/26.01.36711.4/intel-opencl-icd_26.01.36711.4-0_amd64.deb",
"https://github.com/intel/compute-runtime/releases/download/26.01.36711.4/libigdgmm12_22.9.0_amd64.deb",

Copilot AI Mar 4, 2026

The profile description for igc-2.27.10-cr-26.01 states a Compute Runtime version 25.53.36486.8, but the package URLs in this profile are for 26.01.36711.4. Update the description (or URLs) so the metadata matches what will actually be installed.

Comment on lines +112 to +118
git commit -q -m "build" --allow-empty
git tag -a v0.0.0 -m "placeholder" 2>/dev/null || true
}
else
git init -q
git add -A
git commit -q -m "build" --allow-empty

Copilot AI Mar 4, 2026

The in-container git reinitialization path runs git commit, which will fail if user.name / user.email aren’t configured in the container (common in CI/base images). Configure them locally before committing (or use git -c user.name=... -c user.email=... commit ...) to make this script robust.

Suggested change
- git commit -q -m "build" --allow-empty
- git tag -a v0.0.0 -m "placeholder" 2>/dev/null || true
- }
- else
- git init -q
- git add -A
- git commit -q -m "build" --allow-empty
+ git -c user.name="vllm-build-bot" -c user.email="vllm-build-bot@example.com" commit -q -m "build" --allow-empty
+ git tag -a v0.0.0 -m "placeholder" 2>/dev/null || true
+ }
+ else
+ git init -q
+ git add -A
+ git -c user.name="vllm-build-bot" -c user.email="vllm-build-bot@example.com" commit -q -m "build" --allow-empty

Comment on lines +40 to +58
# ─── Parse arguments ───────────────────────────────────────────────────────
while [[ $# -gt 0 ]]; do
    case "$1" in
        --output-dir)
            OUTPUT_DIR="$2"; shift 2 ;;
        --image-name)
            IMAGE_NAME="$2"; shift 2 ;;
        --gpu-profile)
            GPU_RUNTIME_PROFILE="$2"; shift 2 ;;
        --version)
            VLLM_VERSION_OVERRIDE="$2"; shift 2 ;;
        --max-jobs)
            MAX_JOBS="$2"; shift 2 ;;
        --no-cache)
            DOCKER_NO_CACHE="--no-cache"; shift ;;
        -h|--help)
            sed -n '2,/^$/p' "$0" | sed 's/^# \?//'; exit 0 ;;
        *)
            echo "ERROR: Unknown option: $1" >&2; exit 1 ;;

Copilot AI Mar 4, 2026

Argument parsing assumes every option that takes a value is followed by $2. If a user passes e.g. --output-dir as the last argument, the script will error under set -u with a less helpful message. Add an explicit check for missing option values and print a clear usage error.

Suggested change
- # ─── Parse arguments ───────────────────────────────────────────────────────
- while [[ $# -gt 0 ]]; do
-     case "$1" in
-         --output-dir)
-             OUTPUT_DIR="$2"; shift 2 ;;
-         --image-name)
-             IMAGE_NAME="$2"; shift 2 ;;
-         --gpu-profile)
-             GPU_RUNTIME_PROFILE="$2"; shift 2 ;;
-         --version)
-             VLLM_VERSION_OVERRIDE="$2"; shift 2 ;;
-         --max-jobs)
-             MAX_JOBS="$2"; shift 2 ;;
-         --no-cache)
-             DOCKER_NO_CACHE="--no-cache"; shift ;;
-         -h|--help)
-             sed -n '2,/^$/p' "$0" | sed 's/^# \?//'; exit 0 ;;
-         *)
-             echo "ERROR: Unknown option: $1" >&2; exit 1 ;;
+ usage() {
+     sed -n '2,/^$/p' "$0" | sed 's/^# \?//'
+ }
+ # ─── Parse arguments ───────────────────────────────────────────────────────
+ while [[ $# -gt 0 ]]; do
+     case "$1" in
+         --output-dir)
+             if [[ $# -lt 2 ]]; then
+                 echo "ERROR: Missing value for option '$1'." >&2
+                 usage
+                 exit 1
+             fi
+             OUTPUT_DIR="$2"; shift 2 ;;
+         --image-name)
+             if [[ $# -lt 2 ]]; then
+                 echo "ERROR: Missing value for option '$1'." >&2
+                 usage
+                 exit 1
+             fi
+             IMAGE_NAME="$2"; shift 2 ;;
+         --gpu-profile)
+             if [[ $# -lt 2 ]]; then
+                 echo "ERROR: Missing value for option '$1'." >&2
+                 usage
+                 exit 1
+             fi
+             GPU_RUNTIME_PROFILE="$2"; shift 2 ;;
+         --version)
+             if [[ $# -lt 2 ]]; then
+                 echo "ERROR: Missing value for option '$1'." >&2
+                 usage
+                 exit 1
+             fi
+             VLLM_VERSION_OVERRIDE="$2"; shift 2 ;;
+         --max-jobs)
+             if [[ $# -lt 2 ]]; then
+                 echo "ERROR: Missing value for option '$1'." >&2
+                 usage
+                 exit 1
+             fi
+             MAX_JOBS="$2"; shift 2 ;;
+         --no-cache)
+             DOCKER_NO_CACHE="--no-cache"; shift ;;
+         -h|--help)
+             usage
+             exit 0 ;;
+         *)
+             echo "ERROR: Unknown option: $1" >&2
+             usage
+             exit 1 ;;
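
The repeated check in the suggestion could also be factored into a single helper. The sketch below is one way to do that, not the PR's code: `require_value` and `parse_args` are hypothetical names, only two of the options are shown, and errors are returned rather than calling `exit` so the functions can be exercised in isolation.

```shell
# Hypothetical helper: fail with a clear message when an option has no value.
# $1 is the option name, $2 is the number of arguments still available
# (counting the option itself).
require_value() {
    if [ "$2" -lt 2 ]; then
        echo "ERROR: Missing value for option '$1'." >&2
        return 1
    fi
}

# Reduced parser covering two options; the remaining options from the
# suggestion would follow the same pattern.
parse_args() {
    OUTPUT_DIR=""
    MAX_JOBS=""
    while [ $# -gt 0 ]; do
        case "$1" in
            --output-dir)
                require_value "$1" $# || return 1
                OUTPUT_DIR="$2"; shift 2 ;;
            --max-jobs)
                require_value "$1" $# || return 1
                MAX_JOBS="$2"; shift 2 ;;
            *)
                echo "ERROR: Unknown option: $1" >&2
                return 1 ;;
        esac
    done
}

# Example: both values are captured.
parse_args --output-dir /tmp/wheels --max-jobs 8
echo "output=${OUTPUT_DIR} jobs=${MAX_JOBS}"
```

Keeping the check in one place means a new value-taking option needs only one extra line instead of a copied five-line block.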

Comment on lines +8 to +16
# RUN bash /tmp/install_gpu_runtime.sh [PROFILE_NAME]
#
# If PROFILE_NAME is omitted or "default", the default profile from the JSON is used.

set -euo pipefail

CONFIG_FILE="${1:-/tmp/gpu_runtime_packages.json}"
PROFILE="${2:-default}"

Copilot AI Mar 4, 2026

The usage comment doesn’t match the script’s argument order. The script treats $1 as CONFIG_FILE and $2 as PROFILE, but the header suggests calling it with only a profile name. Either update the usage/docs or add argument detection (e.g., if $1 is an existing file, treat it as CONFIG_FILE; otherwise treat it as PROFILE).

Suggested change
- # RUN bash /tmp/install_gpu_runtime.sh [PROFILE_NAME]
- #
- # If PROFILE_NAME is omitted or "default", the default profile from the JSON is used.
- set -euo pipefail
- CONFIG_FILE="${1:-/tmp/gpu_runtime_packages.json}"
- PROFILE="${2:-default}"
+ # # Use default config file (/tmp/gpu_runtime_packages.json) and optional profile:
+ # RUN bash /tmp/install_gpu_runtime.sh [PROFILE_NAME]
+ # # Or specify an explicit config file and optional profile:
+ # RUN bash /tmp/install_gpu_runtime.sh [CONFIG_FILE] [PROFILE_NAME]
+ #
+ # If PROFILE_NAME is omitted or "default", the default profile from the JSON is used.
+ set -euo pipefail
+ CONFIG_FILE_DEFAULT="/tmp/gpu_runtime_packages.json"
+ if [ $# -ge 1 ] && [ -f "$1" ]; then
+     # First argument is a config file path; optional second argument is the profile name.
+     CONFIG_FILE="$1"
+     PROFILE="${2:-default}"
+ elif [ $# -ge 1 ]; then
+     # First argument is treated as the profile name; use the default config file path.
+     CONFIG_FILE="${CONFIG_FILE_DEFAULT}"
+     PROFILE="$1"
+ else
+     # No arguments: use default config file and default profile.
+     CONFIG_FILE="${CONFIG_FILE_DEFAULT}"
+     PROFILE="default"
+ fi

Comment on lines +22 to +36
# Resolve "default" to the actual profile name
if [ "${PROFILE}" = "default" ]; then
PROFILE=$(python3 -c "
import json, sys
cfg = json.load(open('${CONFIG_FILE}'))
print(cfg['default'])
")
fi

echo "==> Installing GPU runtime packages from profile: ${PROFILE}"

# Extract the list of URLs for the selected profile
URLS=$(python3 -c "
import json, sys
cfg = json.load(open('${CONFIG_FILE}'))

Copilot AI Mar 4, 2026

This script invokes python3, but Dockerfile.xpu only installs python3.12 (no explicit python3 package). To avoid a build-time failure if /usr/bin/python3 isn’t present in the base image, either call python3.12 here or ensure python3 is installed in the image before running this script.
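
A third option, sketched below under the assumption that either interpreter name may be present, is to resolve the interpreter at run time. `find_first_command` is a hypothetical helper, and whether the base image actually provides `python3` is not confirmed here.

```shell
# Hypothetical helper: print the first command from the argument list that
# exists on PATH, or fail with an error if none is found.
find_first_command() {
    for candidate in "$@"; do
        if command -v "${candidate}" >/dev/null 2>&1; then
            printf '%s\n' "${candidate}"
            return 0
        fi
    done
    echo "ERROR: none of these commands were found: $*" >&2
    return 1
}

# Example (commented out because the result depends on the image):
# prefer python3 and fall back to python3.12.
# PYTHON_BIN="$(find_first_command python3 python3.12)"
```

The installer could then call `"${PYTHON_BIN}"` wherever it currently calls `python3`, and fail early with a readable message if neither name resolves.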

jikunshang and others added 2 commits March 4, 2026 01:37
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by: Kunshang Ji <jikunshang95@gmail.com>
@jikunshang jikunshang mentioned this pull request Mar 7, 2026
4 tasks