[Test] Test ge graph use DeepSeek-V2-Lite model #3842
Conversation
Signed-off-by: CodeNine-CJ <[email protected]>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request adds end-to-end tests for the DeepSeek-V3-Lite model in GE Graph mode. The implementation introduces a new test fixture that is largely a duplication of an existing one, which impacts maintainability. More critically, the new test appears to use incorrect golden data copied from another test, which undermines the validity of the test itself. I've provided two comments: one critical issue regarding the test data and a high-severity issue about code duplication.
```python
    # NOTE: vllm-ascend/DeepSeek-V3-Pruning is a random weight of
    # DeepSeek-V3 with 2 hidden layers, thus the golden results seems
    # inaccurate. This will only change if accuracy improves with the
    # official weights of DeepSeek-V3.
    golden_results = [
        'Hello, my name is下载早点向前很有่อง',
        'The president of the United States isSender)## physiological Albany',
        'The capital of France is Rocky转角 hospitalizedinterval sparked',
        'The future of AI is её asegο BIOS一扫',
    ]
```
The `golden_results` and the accompanying comment appear to be copied from the test for `vllm-ascend/DeepSeek-V3-Pruning` without being updated for the `deepseek-ai/DeepSeek-V3-Lite` model.

- Incorrect comment: the comment at lines 263-266 refers to `vllm-ascend/DeepSeek-V3-Pruning`, which is not the model under test here. This is misleading.
- Suspicious golden results: the `golden_results` are identical to those for `vllm-ascend/DeepSeek-V3-Pruning`. It is extremely unlikely that two different models produce the exact same output, which strongly suggests the golden values are incorrect for `deepseek-ai/DeepSeek-V3-Lite`.

A test with incorrect golden data is critically flawed: it doesn't validate the model's behavior and can mask regressions. Please regenerate the `golden_results` from the actual output of `deepseek-ai/DeepSeek-V3-Lite` and update the comment to be relevant to this model, or remove it if the output is accurate.
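For illustration, the comparison that such golden-data tests perform can be sketched independently of vLLM. The helper `check_golden` and the stub outputs below are hypothetical names for this sketch; in the real test, `vllm_output` comes from `VllmRunner.generate_greedy`, which yields `(token_ids, text)` pairs:

```python
# Sketch: validating generated texts against model-specific golden results.
# `check_golden` is a hypothetical helper, not part of the project's API.

def check_golden(vllm_output, golden_results):
    """Assert each generated text matches its golden counterpart."""
    assert len(golden_results) == len(vllm_output), (
        f"expected {len(golden_results)} outputs, got {len(vllm_output)}")
    for i, (_, text) in enumerate(vllm_output):
        assert golden_results[i] == text, (
            f"prompt {i}: expected {golden_results[i]!r}, got {text!r}")

# Stub outputs standing in for real model generations.
stub_output = [
    ([1, 2], "Hello, my name is Alice"),
    ([3, 4], "The capital of France is Paris"),
]
golden = ["Hello, my name is Alice", "The capital of France is Paris"]
check_golden(stub_output, golden)  # passes silently when all texts match
```

The point of keeping golden data per model is exactly what this sketch makes explicit: the assertion is only meaningful when `golden` was actually produced by the model under test.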
```python
def _deepseek_v3_lite_torchair_test_fixure(
    additional_config: Dict,
    *,
    tensor_parallel_size=2,
    use_v1_schduler=False,
):
    example_prompts = [
        "Hello, my name is",
        "The president of the United States is",
        "The capital of France is",
        "The future of AI is",
    ]

    kwargs = {}
    if not use_v1_schduler:
        kwargs = {
            "ascend_scheduler_config": {
                "enable": True,
            },
            "refresh": True,
        }
    additional_config.update(**kwargs)

    with VllmRunner(
            "deepseek-ai/DeepSeek-V3-Lite",
            dtype="half",
            tensor_parallel_size=tensor_parallel_size,
            distributed_executor_backend="mp",
            additional_config=additional_config,
    ) as vllm_model:
        vllm_output = vllm_model.generate_greedy(example_prompts, 5)

    # NOTE: vllm-ascend/DeepSeek-V3-Pruning is a random weight of
    # DeepSeek-V3 with 2 hidden layers, thus the golden results seems
    # inaccurate. This will only change if accuracy improves with the
    # official weights of DeepSeek-V3.
    golden_results = [
        'Hello, my name is下载早点向前很有่อง',
        'The president of the United States isSender)## physiological Albany',
        'The capital of France is Rocky转角 hospitalizedinterval sparked',
        'The future of AI is её asegο BIOS一扫',
    ]

    assert len(golden_results) == len(vllm_output)
    for i in range(len(vllm_output)):
        assert golden_results[i] == vllm_output[i][1]
        print(f"Generated text: {vllm_output[i][1]!r}")
```
This new test fixture `_deepseek_v3_lite_torchair_test_fixure` is almost an exact copy of the existing `_deepseek_torchair_test_fixture` function (lines 32-79). The main difference is the model name passed to `VllmRunner`.

To improve maintainability and avoid code duplication, this should be refactored. You could parameterize the existing `_deepseek_torchair_test_fixture` to accept the model name and golden results as arguments. This would eliminate the need for the new `_deepseek_v3_lite_torchair_test_fixure` function entirely, making the code cleaner and easier to maintain.
For example:

```python
def _deepseek_torchair_test_fixture(
    model_name: str,
    golden_results: list[str],
    additional_config: dict,
    # ... other params
):
    # ... existing logic ...
    with VllmRunner(
            model_name,  # Use the new parameter
            # ...
    ) as vllm_model:
        vllm_output = vllm_model.generate_greedy(example_prompts, 5)
    # ... assertion logic using golden_results ...

# Then update call sites:
def test_e2e_deepseekv3_with_torchair():
    # ...
    _deepseek_torchair_test_fixture(
        "vllm-ascend/DeepSeek-V3-Pruning",
        DEEPSEEK_V3_PRUNING_GOLDEN_RESULTS,
        additional_config,
    )

def test_e2e_deepseekv3lite_with_torchair():
    # ...
    _deepseek_torchair_test_fixture(
        "deepseek-ai/DeepSeek-V3-Lite",
        DEEPSEEK_V3_LITE_GOLDEN_RESULTS,
        additional_config,
    )
```

Addressing the code duplication will also make it easier to manage model-specific details like the `golden_results` and explanatory comments.
What this PR does / why we need it?
The existing tests lack coverage for the DeepSeek-V3-Lite model under GE Graph mode; this PR adds supplementary end-to-end tests for it.
Does this PR introduce any user-facing change?
No
How was this patch tested?