Commit 347dd9d

add ds v3 bf16 chunked_prefill on hopper
Signed-off-by: Ivy Zhang <[email protected]>
1 parent c3c9573 commit 347dd9d

2 files changed, +2 -1 lines changed

tests/integration/defs/accuracy/test_llm_api_pytorch.py

Lines changed: 1 addition & 1 deletion
@@ -1132,7 +1132,7 @@ class TestDeepSeekV3Lite(LlmapiAccuracyTestHarness):
     # Chunked Prefill for MLA can only be enabled on SM100
     @parametrize_with_ids(
         "enable_chunked_prefill",
-        [False, pytest.param(True, marks=skip_pre_blackwell)])
+        [False, pytest.param(True, marks=skip_pre_hopper)])
     @parametrize_with_ids("torch_compile", [False, True])
     @parametrize_with_ids("attention_dp,cuda_graph,overlap_scheduler",
                           [(False, False, False), (True, False, False),
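
The effect of swapping `skip_pre_blackwell` for `skip_pre_hopper` can be sketched with a small stand-alone model. This is only an illustration. It assumes each marker skips a test when the GPU's SM version is below a threshold (90 for Hopper, 100 for Blackwell). The `MIN_SM` table and the `is_skipped` helper are hypothetical names, not part of the repository. Note that the unchanged context comment in the hunk still mentions SM100.

```python
# Hypothetical model of the two skip markers; the real markers are defined
# in the repository's test utilities and may be implemented differently.
MIN_SM = {
    "skip_pre_hopper": 90,      # Hopper GPUs report compute capability 9.0
    "skip_pre_blackwell": 100,  # Blackwell GPUs report compute capability 10.0
}


def is_skipped(marker: str, sm_version: int) -> bool:
    """Return True if a test guarded by `marker` would be skipped on this GPU.

    Assumption: a marker skips whenever the SM version is below its threshold.
    """
    return sm_version < MIN_SM[marker]


# Before the change: enable_chunked_prefill=True is skipped on Hopper (SM90).
assert is_skipped("skip_pre_blackwell", 90)
# After the change: the same case runs on Hopper and is skipped only on older GPUs.
assert not is_skipped("skip_pre_hopper", 90)
assert is_skipped("skip_pre_hopper", 89)
```

Under this model, the `enable_chunked_prefill=True` variants of the test become runnable on SM90 hardware, while the `False` variants are unaffected because they carry no marker.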

tests/integration/test_lists/qa/llm_function_full.txt

Lines changed: 1 addition & 0 deletions
@@ -510,6 +510,7 @@ accuracy/test_llm_api_pytorch.py::TestLlama4ScoutInstruct::test_fp4_chunked_pref
 accuracy/test_llm_api_pytorch.py::TestMixtral8x7B::test_fp8_tp2
 accuracy/test_llm_api_pytorch.py::TestMixtral8x7B::test_nvfp4_tp2
 accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_bfloat16[mtp_nextn=0-attention_dp=False-cuda_graph=False-overlap_scheduler=False-torch_compile=False-enable_chunked_prefill=False]
+accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_bfloat16[mtp_nextn=2-attention_dp=True-cuda_graph=True-overlap_scheduler=True-torch_compile=False-enable_chunked_prefill=True]
 accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_fp8_block_scales[mtp=disable-fp8kv=False-attention_dp=False-cuda_graph=False-overlap_scheduler=False-torch_compile=False]
 accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_nvfp4[moe_backend=CUTLASS-mtp_nextn=0-fp8kv=False-attention_dp=False-cuda_graph=False-overlap_scheduler=False-torch_compile=False]
 accuracy/test_llm_api_pytorch.py::TestDeepSeekV3Lite::test_nvfp4[moe_backend=CUTLASS-mtp_nextn=2-fp8kv=False-attention_dp=False-cuda_graph=False-overlap_scheduler=False-torch_compile=False]
