[CI] Make DBO tests non-optional#34450

Open
LucasWilkinson wants to merge 1 commit into vllm-project:main from neuralmagic:lwikinson/make-dbo-non-optional

@LucasWilkinson
Collaborator

The DBO tests have been catching a lot of non-DBO-related MoE bugs lately on the nightly/daily runs. Make them non-optional for now so we catch these bugs sooner, until we have better distributed MoE test coverage.

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
@LucasWilkinson added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Feb 12, 2026
@mergify bot added the ci/build label on Feb 12, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request makes the DBO (Dual Batch Overlap) tests non-optional to improve bug detection for MoE (Mixture of Experts) models. This is achieved by splitting a larger, optional test step into two more specific steps: one for MoE/DBO tests which is now mandatory, and another for context parallel tests which remains optional. This is a sensible change that aligns with the goal of catching regressions earlier. I have one suggestion to improve the test coverage of the newly promoted DBO test.

num_devices: 2
commands:
- VLLM_USE_DEEP_GEMM=1 VLLM_LOGGING_LEVEL=DEBUG python3 examples/offline_inference/data_parallel.py --model=Qwen/Qwen1.5-MoE-A2.7B -tp=1 -dp=2 --max-model-len=2048 --all2all-backend=deepep_high_throughput
- pytest -v -s tests/v1/distributed/test_dbo.py

high

The test_dbo.py test is being promoted to a non-optional CI step because it's effective at catching MoE-related bugs. However, this pytest command is missing the VLLM_USE_DEEP_GEMM=1 environment variable, which is set for the other MoE test command in this same step. This means the DBO test won't cover MoE execution with DeepGEMM kernels, which is a significant gap. To ensure consistent and thorough testing, and to aid in debugging, I recommend adding both VLLM_USE_DEEP_GEMM=1 and VLLM_LOGGING_LEVEL=DEBUG.

  - VLLM_USE_DEEP_GEMM=1 VLLM_LOGGING_LEVEL=DEBUG pytest -v -s tests/v1/distributed/test_dbo.py
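
To illustrate why the prefix has to be repeated on the pytest line: in a POSIX shell, a `VAR=value command` assignment is placed only in the environment of that one command and does not carry over to the next line. The sketch below is a minimal demonstration of that behavior, not the CI step itself. It uses `sh -c` as a hypothetical stand-in for the real `python3` and `pytest` invocations, and it assumes the variable is not already exported by the surrounding CI environment (hence the `unset`).

```shell
# Assumption: the variable is not exported globally by the CI environment.
unset VLLM_USE_DEEP_GEMM

# Stand-in for the data_parallel.py line: the prefix applies to this command only.
first=$(VLLM_USE_DEEP_GEMM=1 sh -c 'echo "${VLLM_USE_DEEP_GEMM:-unset}"')

# Stand-in for the pytest line without the prefix: the variable is not inherited.
second=$(sh -c 'echo "${VLLM_USE_DEEP_GEMM:-unset}"')

echo "first=$first second=$second"
```

If the step instead exported the variable once for all of its commands, the prefix on individual lines would be redundant; whether the pipeline offers a per-step environment setting is not shown in this diff.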
