Summary
Enable ze_ipc (Intel Level Zero inter-process IPC) as a UCX transport for KV-cache transfer in single-node Prefill-Decode disaggregation on XPU (Intel GPU / Ponte Vecchio), providing zero-copy VRAM→VRAM transfer without host-memory staging.
Background
The current single-node XPU PD path uses `kv_buffer_device=xpu` with `UCX_TLS=ze_copy,ib,rc`, which requires an InfiniBand NIC even for loopback; there is no viable no-IB alternative:
| Path | UCX_TLS | Limitation |
|---|---|---|
| Single-node, no IB | `ze_copy,cma` | ❌ `cma` has no AM capability → `NIXL_ERR_BACKEND` |
| Single-node, IB loopback | `ib,rc,ze_copy` | ❌ requires an IB NIC even intra-node |
| Target | `ze_ipc,ze_copy,tcp` | ✅ zero-copy IPC, no IB needed |
ze_ipc maps the producer's VRAM buffer directly into the consumer process via Level Zero IPC handles — analogous to CUDA IPC (cudaIpcGetMemHandle). This eliminates the host-staging round-trip (VRAM → host buffer → VRAM) that ze_copy alone requires, reducing latency and PCIe bandwidth pressure.
Note:
`ze_copy` is kept alongside `ze_ipc` as a fallback for memory types that `ze_ipc` cannot handle. `tcp` is required as the Active Message control plane — NIXL's intra-agent initialization requires at least one AM-capable transport, and `ze_ipc`/`ze_copy` alone do not provide AM.
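For concreteness, the target transport stack can be expressed as environment variables set before engine startup (a sketch only — the variable names and values are the ones quoted in this issue, not a final integration point):

```python
import os

# Sketch: the transport stack proposed in this issue, set before engine startup.
# ze_ipc  -> zero-copy VRAM-to-VRAM via Level Zero IPC handles
# ze_copy -> fallback for memory types ze_ipc cannot handle
# tcp     -> AM-capable control plane required by NIXL initialization
os.environ["UCX_TLS"] = "ze_ipc,ze_copy,tcp"

# Required for correct VRAM memory-type detection by UCX
# (already set by vllm/platforms/xpu.py, per this issue).
os.environ["UCX_MEMTYPE_CACHE"] = "n"
```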
Validation Status
Tested on Intel Ponte Vecchio (PVC) B60, single-node 1P1D:

- `UCX_TLS=ze_ipc,ze_copy,tcp` with `kv_buffer_device=xpu` — ✅ engine initializes, NIXL backend comes up
- `UCX_TLS=ze_ipc,ze_copy,sm` — ❌ `sm` expands to `sysv`/`posix`/`cma`, none of which have AM bcopy → same `NIXL_ERR_BACKEND` as `ze_copy,cma`
The `ze_ipc` UCX transport DSO (`libuct_ze.so`, symbol `uct_ze_ipc_tl`) is present in the UCX build used by NIXL.
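One quick way to probe whether a given UCX build exposes the `ze_ipc` transport is to look for the transport symbol in the DSO. The helper below is hypothetical (`has_ze_ipc` is not part of any existing codebase); the library and symbol names are the ones quoted above:

```python
import ctypes

def has_ze_ipc(lib_name: str = "libuct_ze.so") -> bool:
    """Return True if the UCX ZE DSO can be loaded and exposes the
    uct_ze_ipc_tl transport symbol (hypothetical helper; a sketch)."""
    try:
        lib = ctypes.CDLL(lib_name)
    except OSError:
        return False  # DSO not present on this system
    # Attribute access on a CDLL performs a dlsym() lookup, so hasattr
    # reports whether the transport symbol exists in the library.
    return hasattr(lib, "uct_ze_ipc_tl")
```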
What needs to be done
- UCX build: Confirm the `ze_ipc` transport is compiled into the UCX version pinned by NIXL (`-DUCX_BUILD_ZE=ON`). The DSO is present in the current test environment.
- NixlConnector / vLLM: When `kv_buffer_device=xpu` and no IB NIC is available (single-node), auto-select `UCX_TLS=ze_ipc,ze_copy,tcp` instead of falling back to `ze_copy,cma`. Relevant code: `vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py`.
- Documentation / CI: Add a single-node xpu-buffer scenario to the PD test matrix using `ze_ipc,ze_copy,tcp`.
Related
- UCT/ZE/ZE_IPC: enable level zero ipc support for Intel GPUs openucx/ucx#11218 — upstream UCX PR adding `ze_ipc` support (the transport DSO already ships in recent UCX builds)
- PR [XPU][NIXL] Add GPUDirect RDMA support for XPU #35270 — added `kv_buffer_device=xpu` support for XPU
- `UCX_MEMTYPE_CACHE=n` is already set by `vllm/platforms/xpu.py` since [XPU][NIXL] Add GPUDirect RDMA support for XPU #35270, which is required for correct VRAM memory-type detection by UCX
Environment
- Hardware: Intel Data Center GPU Max (Ponte Vecchio) B60
- UCX: custom build with `libuct_ze.so` (both `uct_ze_ipc_tl` and `uct_ze_copy_tl` present)
- NIXL: v0.3+
- vLLM: `xpu-minimax-m2.5-pd` branch