
[XPU][NixlConnector] Add ze_ipc transport support for single-node PD disaggregation #36625

@Yanli2190

Description


Summary

Enable ze_ipc (Intel Level Zero inter-process IPC) as a UCX transport for KV-cache transfer in single-node Prefill-Decode disaggregation on XPU (Intel GPU / Ponte Vecchio), providing zero-copy VRAM→VRAM transfer without host-memory staging.

Background

The current single-node XPU PD path either uses kv_buffer_device=xpu with UCX ze_copy,ib,rc (which requires an InfiniBand NIC even for loopback) or has no viable no-IB alternative:

| Path | UCX_TLS | Limitation |
| --- | --- | --- |
| Single-node, no IB | ze_copy,cma | cma has no AM capability → NIXL_ERR_BACKEND |
| Single-node, IB loopback | ib,rc,ze_copy | ❌ requires an IB NIC even intra-node |
| Target | ze_ipc,ze_copy,tcp | ✅ zero-copy IPC, no IB needed |

ze_ipc maps the producer's VRAM buffer directly into the consumer process via Level Zero IPC handles — analogous to CUDA IPC (cudaIpcGetMemHandle). This eliminates the host-staging round-trip (VRAM → host buffer → VRAM) that ze_copy alone requires, reducing latency and PCIe bandwidth pressure.

Note: ze_copy is kept alongside ze_ipc as a fallback for memory types that ze_ipc cannot handle. tcp is required as the Active Messages control plane — NIXL's intra-agent initialization requires at least one AM-capable transport, and ze_ipc/ze_copy alone do not provide AM.
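With the target transport list, a single-node 1P1D environment could be set up roughly as follows. This is a sketch: the UCX_TLS value is from this issue, while the vLLM launch line and kv-transfer-config JSON are illustrative of a typical NixlConnector PD launch and may differ in your setup.

```shell
# Data plane: ze_ipc (zero-copy Level Zero IPC), ze_copy as fallback.
# Control plane: tcp provides the AM capability NIXL needs for init.
export UCX_TLS=ze_ipc,ze_copy,tcp

# Prefill-side launch (sketch; decode side is analogous with kv_role=kv_consumer)
vllm serve <model> \
  --kv-transfer-config \
  '{"kv_connector":"NixlConnector","kv_role":"kv_producer","kv_buffer_device":"xpu"}'
```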

Validation Status

Tested on Intel Ponte Vecchio (PVC) B60, single-node 1P1D:

  • UCX_TLS=ze_ipc,ze_copy,tcp with kv_buffer_device=xpu — ✅ engine initializes, NIXL backend comes up
  • UCX_TLS=ze_ipc,ze_copy,sm — ❌ sm expands to sysv/posix/cma, none have AM bcopy → same NIXL_ERR_BACKEND as ze_copy,cma

The ze_ipc UCX transport DSO (libuct_ze.so, symbol uct_ze_ipc_tl) is present in the UCX build used by NIXL.
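One way to double-check this in a given environment is UCX's own introspection tool (assuming the UCX CLI utilities are installed alongside the library); the ZE memory domain and its transports should show up in the device listing:

```shell
# List UCX memory domains and transports; look for the ze entries
# (ze_ipc / ze_copy) in the output.
ucx_info -d | grep -i ze
```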

What needs to be done

  1. UCX build: Confirm ze_ipc transport is compiled in the UCX version pinned by NIXL (-DUCX_BUILD_ZE=ON). The DSO is present in the current test environment.

  2. NixlConnector / vllm: When kv_buffer_device=xpu and no IB NIC is available (single-node), auto-select UCX_TLS=ze_ipc,ze_copy,tcp instead of falling back to ze_copy,cma. Relevant code: vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py.
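The proposed auto-selection in step 2 could be sketched as follows. The function and helper names here are hypothetical, not the actual nixl_connector.py API; the point is only the decision: prefer ze_ipc,ze_copy,tcp on XPU when no IB NIC is present, rather than falling back to ze_copy,cma.

```python
import os


def select_ucx_tls(kv_buffer_device: str, has_ib_nic: bool) -> str:
    """Pick a UCX transport list for the NIXL backend (illustrative sketch).

    On XPU without an IB NIC, prefer ze_ipc (zero-copy Level Zero IPC)
    with ze_copy as a fallback and tcp as the AM control plane, instead
    of the ze_copy,cma combination that fails with NIXL_ERR_BACKEND.
    """
    if kv_buffer_device == "xpu":
        if has_ib_nic:
            return "ze_copy,ib,rc"       # existing IB-based path
        return "ze_ipc,ze_copy,tcp"      # proposed single-node, no-IB path
    return "tcp"                         # conservative default for other devices


# Respect an explicit user override; only fill in UCX_TLS if unset.
os.environ.setdefault("UCX_TLS", select_ucx_tls("xpu", has_ib_nic=False))
```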

  3. Documentation / CI: Add a single-node xpu-buffer scenario to the PD test matrix using ze_ipc,ze_copy,tcp.

Environment

  • Hardware: Intel Data Center GPU Max (Ponte Vecchio) B60
  • UCX: custom build with libuct_ze.so (both uct_ze_ipc_tl and uct_ze_copy_tl present)
  • NIXL: v0.3+
  • vLLM: xpu-minimax-m2.5-pd branch
