Commit 49e6983

[Test] Add accuracy test for qwen3-30b-a3b-w8a8 (#3807)

### What this PR does / why we need it?
Add an accuracy test for qwen3-30b-a3b-w8a8. This PR depends on #3799.

### How was this patch tested?
qwen3-30b-a3b-w8a8 accuracy test passed:
https://github.com/vllm-project/vllm-ascend/actions/runs/19062045267/job/54443732877?pr=3807

- vLLM version: v0.11.0
- vLLM main: vllm-project/vllm@83f478b

Signed-off-by: hfadzxy <[email protected]>
1 parent: 5fed166 · commit: 49e6983

File tree

2 files changed (+18, −0 lines)


.github/workflows/accuracy_test.yaml

Lines changed: 2 additions & 0 deletions

```diff
@@ -68,6 +68,8 @@ jobs:
             model_name: Qwen2.5-Omni-7B
           - runner: a2-1
             model_name: Meta-Llama-3.1-8B-Instruct
+          - runner: a2-4
+            model_name: Qwen3-30B-A3B-W8A8
       fail-fast: false
     # test will be triggered when tag 'accuracy-test' & 'ready-for-test'
     if: >-
```

Lines changed: 16 additions & 0 deletions

```diff
@@ -0,0 +1,16 @@
+model_name: "vllm-ascend/Qwen3-30B-A3B-W8A8"
+hardware: "Atlas A2 Series"
+tasks:
+- name: "gsm8k"
+  metrics:
+  - name: "exact_match,strict-match"
+    value: 0.9
+  - name: "exact_match,flexible-extract"
+    value: 0.8
+num_fewshot: 5
+gpu_memory_utilization: 0.7
+enable_expert_parallel: True
+tensor_parallel_size: 2
+apply_chat_template: False
+fewshot_as_multiturn: False
+quantization: ascend
```
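
The fields in this config map onto an lm-eval-harness run over a vLLM backend (task name, few-shot count, chat-template flags, tensor-parallel and memory settings, and expected metric baselines). The sketch below is a minimal illustration of how such a config could drive an evaluation and threshold check; it is not the repository's actual test driver, the local config path is hypothetical, and the forwarding of engine args like `quantization` and `enable_expert_parallel` is an assumption that depends on the harness version.

```python
# Minimal sketch, assuming lm-eval-harness with the vLLM backend.
# Not the repository's actual accuracy-test driver; the config path and
# the threshold-checking loop are illustrative assumptions.
import lm_eval
import yaml

# Hypothetical local copy of the accuracy-test config shown above.
with open("Qwen3-30B-A3B-W8A8.yaml") as f:
    cfg = yaml.safe_load(f)

results = lm_eval.simple_evaluate(
    model="vllm",
    model_args={
        "pretrained": cfg["model_name"],
        "tensor_parallel_size": cfg["tensor_parallel_size"],
        "gpu_memory_utilization": cfg["gpu_memory_utilization"],
        # Extra engine args are assumed to be forwarded to the vLLM engine;
        # the exact plumbing depends on the harness version.
        "quantization": cfg["quantization"],
        "enable_expert_parallel": cfg["enable_expert_parallel"],
    },
    tasks=[t["name"] for t in cfg["tasks"]],
    num_fewshot=cfg["num_fewshot"],
    apply_chat_template=cfg["apply_chat_template"],
    fewshot_as_multiturn=cfg["fewshot_as_multiturn"],
)

# Compare each reported metric against the expected baseline from the config.
for task in cfg["tasks"]:
    for metric in task["metrics"]:
        measured = results["results"][task["name"]][metric["name"]]
        assert measured >= metric["value"], (
            f"{task['name']} {metric['name']}: {measured} < {metric['value']}"
        )
```

The expected baselines checked here (0.9 strict-match and 0.8 flexible-extract on gsm8k with 5-shot prompting) come directly from the config added in this commit.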
