2 changes: 2 additions & 0 deletions .github/workflows/accuracy_test.yaml
@@ -59,6 +59,8 @@ jobs:
model_name: DeepSeek-V2-Lite
- runner: a2-4
model_name: Qwen3-Next-80B-A3B-Instruct
- runner: a2-1
model_name: Phi-4-mini-instruct
fail-fast: false
# test will be triggered when tag 'accuracy-test' & 'ready-for-test'
if: >-
14 changes: 14 additions & 0 deletions tests/e2e/models/configs/Phi-4-mini-instruct.yaml
@@ -0,0 +1,14 @@
model_name: "LLM-Research/Phi-4-mini-instruct"
critical

The model name LLM-Research/Phi-4-mini-instruct appears to be incorrect. There are no public Phi-4 models from Microsoft at this time, and the LLM-Research organization does not seem to exist on the Hugging Face Hub. This will likely cause the test to fail when it attempts to download the model.

It's possible you intended to use microsoft/phi-3-mini-4k-instruct. If that's the case, please update the model name. For consistency, you should also consider renaming this YAML file to phi-3-mini-4k-instruct.yaml and updating its entry in tests/e2e/models/configs/accuracy.txt.

model_name: "microsoft/phi-3-mini-4k-instruct"

runner: "linux-aarch64-a2-1"
hardware: "Atlas A2 Series"
tasks:
  - name: "gsm8k"
    metrics:
      - name: "exact_match,strict-match"
        value: 0.81
      - name: "exact_match,flexible-extract"
        value: 0.81
trust_remote_code: True
num_fewshot: 5
batch_size: 32
gpu_memory_utilization: 0.8
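
For reviewers unfamiliar with how baseline values like these are typically used, here is a minimal sketch of a threshold check. The expected values are copied from the config above. The function name `check_metrics` and the 3% relative tolerance are hypothetical and are not taken from this repository's test harness, which may compare results differently.

```python
# Expected gsm8k baselines copied from the config above.
EXPECTED = {
    "exact_match,strict-match": 0.81,
    "exact_match,flexible-extract": 0.81,
}


def check_metrics(measured, expected=EXPECTED, rtol=0.03):
    """Return the names of metrics that are missing or fall more than
    `rtol` (relative) below their expected value.

    `rtol` is an assumed tolerance chosen for illustration only.
    """
    failures = []
    for name, target in expected.items():
        value = measured.get(name)
        if value is None or value < target * (1 - rtol):
            failures.append(name)
    return failures
```

An empty list means every metric met its baseline within the assumed tolerance; any returned name identifies a metric that regressed or was not reported.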
1 change: 1 addition & 0 deletions tests/e2e/models/configs/accuracy.txt
@@ -6,3 +6,4 @@ Qwen2-7B.yaml
Qwen2-VL-7B-Instruct.yaml
Qwen2-Audio-7B-Instruct.yaml
Qwen3-VL-30B-A3B-Instruct.yaml
Phi-4-mini-instruct.yaml
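
Because the review comment proposes renaming the config file, the entry in accuracy.txt and the file name on disk have to stay in sync. The sketch below shows one way to detect a mismatch. The helper name `missing_configs` is hypothetical and is not part of this repository.

```python
def missing_configs(listing_text, available):
    """Return entries listed in accuracy.txt that have no matching config file.

    listing_text: the contents of accuracy.txt, one file name per line.
    available: an iterable of config file names that actually exist.
    """
    listed = [line.strip() for line in listing_text.splitlines() if line.strip()]
    have = set(available)
    return [name for name in listed if name not in have]
```

In practice `available` could be built with `pathlib`, for example `[p.name for p in Path("tests/e2e/models/configs").glob("*.yaml")]`, and `listing_text` read from `tests/e2e/models/configs/accuracy.txt`.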