Commit b9ed3c7

fix: update ds-r1 truncate max-output-len to 20k (was 32k) (#2290)
* update reference max-osl config
* Update README.md
* 20k not 20*1024
* update thresholds
* keep readme to 4 digit precision
* update tok-len threshold
1 parent 6481ff4 commit b9ed3c7

3 files changed: 6 additions and 6 deletions

language/deepseek-r1/README.md

Lines changed: 3 additions & 3 deletions
@@ -116,7 +116,7 @@ The setup script creates a virtual environment and configures it differently bas

 ### PyTorch Backend (Distributed)

-> ⚠️ **IMPORTANT NOTE**: The PyTorch reference implementation takes approximately 8 days to run on an H200x8 system. This is because large max-OSL (32K) limits concurrency (max-BS=16), and unoptimized pytorch forward and decode logics.
+> ⚠️ **IMPORTANT NOTE**: The PyTorch reference implementation takes approximately upto 8 days to run on an H200x8 system. This is because large max-OSL (20K) limits concurrency (max-BS=16), and unoptimized pytorch forward and decode logics.

 PyTorch backend uses distributed execution with `torchrun` and `run_eval_mpi.py`:

@@ -222,8 +222,8 @@ Pytorch reference scores:

 ```bash
 Evaluation Results: {
-"mean-accuracy": 81.67730173199635,
-"mean-output-tok-len": 4043.449863263446,
+"mean-accuracy": 81.3582,
+"mean-output-tok-len": 3886.2274,
 "num-samples": 4388
 }
 ```
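
The commit message asks to "keep readme to 4 digit precision". A minimal sketch of that rounding step follows, applied to the full-precision scores this hunk removes; the full-precision values behind the new scores are not shown in the commit. The `round_scores` helper is hypothetical and is not part of the repository.

```python
import json

# Full-precision scores as they appeared in the README before this commit.
raw_results = {
    "mean-accuracy": 81.67730173199635,
    "mean-output-tok-len": 4043.449863263446,
    "num-samples": 4388,
}


def round_scores(results: dict, ndigits: int = 4) -> dict:
    """Hypothetical helper: round float metrics, leave integer counts untouched."""
    return {
        key: round(value, ndigits) if isinstance(value, float) else value
        for key, value in results.items()
    }


# Print in the same shape the README uses for its reference scores.
print("Evaluation Results:", json.dumps(round_scores(raw_results), indent=4))
```

Rounded this way, the old token length becomes 4043.4499, whereas the old checker thresholds used the truncated value 4043.449.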

language/deepseek-r1/utils/backend_registry.py

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@

 # Configuration constants
 MAX_ISL = 3136 # max input sequence length
-MAX_OSL = 32 * 1024 # max output sequence length
+MAX_OSL = 20 * 1000 # max output sequence length
 MAX_TEMPLATE_TOKS = 4 # max template tokens
 MODEL_REVISION = "56d4cbbb4d29f4355bab4b9a39ccb717a14ad5ad"
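
The commit message stresses "20k not 20*1024": the new cap is a decimal 20,000 tokens rather than a binary 20 × 1024. The sketch below works through that arithmetic using only the constants visible in this hunk. The `worst_case_sequence_len` helper is hypothetical, and it assumes the worst-case sequence is simply the sum of the three maxima, which the source does not confirm.

```python
# Constants copied from the backend_registry.py hunk.
MAX_ISL = 3136               # max input sequence length
MAX_TEMPLATE_TOKS = 4        # max template tokens
OLD_MAX_OSL = 32 * 1024      # value before this commit
NEW_MAX_OSL = 20 * 1000      # value after this commit


def worst_case_sequence_len(max_osl: int) -> int:
    """Hypothetical helper: assumes input, template and output maxima simply add up."""
    return MAX_ISL + MAX_TEMPLATE_TOKS + max_osl


print("old max OSL:", OLD_MAX_OSL)                  # 32768
print("new max OSL:", NEW_MAX_OSL)                  # 20000
print("20 * 1024 would be:", 20 * 1024)             # 20480
print("reduction in output budget:", OLD_MAX_OSL - NEW_MAX_OSL)  # 12768
print("old worst-case length:", worst_case_sequence_len(OLD_MAX_OSL))  # 35908
print("new worst-case length:", worst_case_sequence_len(NEW_MAX_OSL))  # 23140
```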

tools/submission/submission_checker.py

Lines changed: 2 additions & 2 deletions
@@ -497,7 +497,7 @@
 ),
 "rgat": ("acc", 0.7286 * 0.99),
 "pointpainting": ("mAP", 0.5425 * 0.999),
-"deepseek-r1": ("exact_match", 0.99 * 81.6773, "TOKENS_PER_SAMPLE", 0.9 * 4043.449),
+"deepseek-r1": ("exact_match", 0.99 * 81.3582, "TOKENS_PER_SAMPLE", 0.9 * 3886.2274),
 "whisper": ("ACCURACY", (100.0 - 2.0671) * 0.99),
 },
 "accuracy-upper-limit": {
@@ -513,7 +513,7 @@
 "llama3.1-405b": ("TOKENS_PER_SAMPLE", 684.68 * 1.1),
 "llama3.1-8b": ("GEN_LEN", 8167644 * 1.1),
 "llama3.1-8b-edge": ("GEN_LEN", 8167644 * 1.1),
-"deepseek-r1": ("TOKENS_PER_SAMPLE", 1.1 * 4043.449)
+"deepseek-r1": ("TOKENS_PER_SAMPLE", 1.1 * 3886.2274)
 },
 "accuracy-delta-perc": {
 "stable-diffusion-xl": {"CLIP_SCORE": 1, "FID_SCORE": 2}

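The two `submission_checker.py` hunks together define an acceptance band for `deepseek-r1`: a floor on `exact_match` at 99% of the reference accuracy, and a window of ±10% around the reference `TOKENS_PER_SAMPLE`. The sketch below computes the resulting bounds from the new reference scores. The `within_limits` function is hypothetical, and its inclusive comparisons are an assumption; the checker's actual comparison logic is not shown in this commit.

```python
# New reference scores from the README hunk of this commit.
REF_ACCURACY = 81.3582
REF_TOKENS_PER_SAMPLE = 3886.2274

# Multipliers copied from the submission_checker.py hunks.
ACC_LOWER = 0.99 * REF_ACCURACY           # about 80.5446
TOK_LOWER = 0.9 * REF_TOKENS_PER_SAMPLE   # about 3497.60
TOK_UPPER = 1.1 * REF_TOKENS_PER_SAMPLE   # about 4274.85


def within_limits(exact_match: float, tokens_per_sample: float) -> bool:
    """Hypothetical check; inclusive bounds are an assumption, not the checker's code."""
    accuracy_ok = exact_match >= ACC_LOWER
    length_ok = TOK_LOWER <= tokens_per_sample <= TOK_UPPER
    return accuracy_ok and length_ok


print(f"exact_match floor:        {ACC_LOWER:.4f}")
print(f"TOKENS_PER_SAMPLE window: [{TOK_LOWER:.4f}, {TOK_UPPER:.4f}]")
print("reference run passes:", within_limits(REF_ACCURACY, REF_TOKENS_PER_SAMPLE))
```

Under these assumptions, a run scoring 80.0 on `exact_match`, or averaging 4300 output tokens per sample, would fall outside the band.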