1
- # Mlperf Inference DeepSeek Reference Implementation
1
+ # MLPerf Inference DeepSeek Reference Implementation
2
2
3
3
## Automated command to run the benchmark via MLFlow
4
4
@@ -13,6 +13,22 @@ You can also do pip install mlc-scripts and then use `mlcr` commands for downloa
13
13
- DeepSeek-R1 model is automatically downloaded as part of setup
14
14
- Checkpoint conversion is done transparently when needed.
15
15
16
+ **Using the MLC R2 Downloader**
17
+
18
+ Download the model using the MLCommons R2 Downloader:
19
+
20
+ ``` bash
21
+ bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
22
+ https://inference.mlcommons-storage.org/metadata/deepseek-r1-0528.uri
23
+ ```
24
+
25
+ To specify a custom download directory, use the `-d` flag:
26
+ ``` bash
27
+ bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
28
+ -d /path/to/download/directory \
29
+ https://inference.mlcommons-storage.org/metadata/deepseek-r1-0528.uri
30
+ ```
31
+
16
32
## Dataset Download
17
33
18
34
The dataset is an ensemble of the datasets: AIME, MATH500, gpqa, MMLU-Pro, livecodebench(code_generation_lite). They are covered by the following licenses:
@@ -23,24 +39,18 @@ The dataset is an ensemble of the datasets: AIME, MATH500, gpqa, MMLU-Pro, livec
23
39
- MMLU-Pro: [MIT](https://opensource.org/license/mit)
24
40
- livecodebench(code_generation_lite): [CC](https://creativecommons.org/share-your-work/cclicenses/)
25
41
26
- ### Preprocessed
27
-
28
- **Using MLCFlow Automation**
42
+ ### Preprocessed & Calibration
29
43
30
- ```
31
- mlcr get,dataset,whisper,_preprocessed,_mlc,_rclone --outdirname=<path to download> -j
32
- ```
44
+ **Using the MLC R2 Downloader**
33
45
34
- **Using Native method**
35
-
36
- Download the preprocessed dataset using the MLCommons downloader:
46
+ Download the full preprocessed dataset and calibration dataset using the MLCommons R2 Downloader:
37
47
38
48
``` bash
39
49
bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
40
- https://inference.mlcommons-storage.org/metadata/deepseek-r1-datasets-fp8-eval.uri
50
+ -d ./ https://inference.mlcommons-storage.org/metadata/deepseek-r1-datasets-fp8-eval.uri
41
51
```
42
52
43
- This will download the dataset file `mlperf_deepseek_r1_dataset_4388_fp8_eval.pkl`.
53
+ This will download the full preprocessed dataset file (`mlperf_deepseek_r1_dataset_4388_fp8_eval.pkl`) and the calibration dataset file (`mlperf_deepseek_r1_calibration_dataset_500_fp8_eval.pkl`).
44
54
45
55
To specify a custom download directory, use the `-d` flag:
46
56
``` bash
@@ -49,30 +59,20 @@ bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/he
49
59
https://inference.mlcommons-storage.org/metadata/deepseek-r1-datasets-fp8-eval.uri
50
60
```
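After the download finishes, it can be useful to confirm that both files named above are present. The following is a minimal sketch, not part of the reference implementation: the function name `check_deepseek_datasets` is hypothetical, and it assumes the downloader writes the two `.pkl` files somewhere under the chosen directory (it searches recursively, since the exact layout is not stated here).

```bash
# Hypothetical helper: check that both dataset files exist under a directory.
# Assumption: the files keep the names given in this README.
check_deepseek_datasets() {
  local dir="${1:-.}"   # download directory; defaults to the current directory
  local missing=0
  local f
  for f in \
    mlperf_deepseek_r1_dataset_4388_fp8_eval.pkl \
    mlperf_deepseek_r1_calibration_dataset_500_fp8_eval.pkl
  do
    # Search recursively in case the downloader nests files in a subdirectory.
    if find "$dir" -type f -name "$f" | grep -q .; then
      echo "found: $f"
    else
      echo "missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"   # 0 if both files were found, 1 otherwise
}
```

For example, `check_deepseek_datasets /path/to/download/directory` returns a non-zero status if either file is absent, so it can gate later steps in a script.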
51
61
52
- ### Calibration
62
+ ### Preprocessed
53
63
54
64
**Using MLCFlow Automation**
55
65
56
66
```
57
- mlcr get,preprocessed,dataset,deepseek-r1,_calibration,_mlc,_rclone --outdirname=<path to download> -j
67
+ mlcr get,dataset,whisper,_preprocessed,_mlc,_rclone --outdirname=<path to download> -j
58
68
```
59
69
60
- **Using Native method**
70
+ ### Calibration
61
71
62
- Download the calibration dataset using the MLCommons downloader:
72
+ **Using MLCFlow Automation**
63
73
64
- ``` bash
65
- bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
66
- https://inference.mlcommons-storage.org/metadata/deepseek-r1-0528.uri
67
74
```
68
-
69
- This will download the calibration dataset file `mlperf_deepseek_r1_calibration_dataset_500_fp8_eval.pkl`.
70
-
71
- To specify a custom download directory, use the `-d` flag:
72
- ``` bash
73
- bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
74
- -d /path/to/download/directory \
75
- https://inference.mlcommons-storage.org/metadata/deepseek-r1-0528.uri
75
+ mlcr get,preprocessed,dataset,deepseek-r1,_calibration,_mlc,_rclone --outdirname=<path to download> -j
76
76
```
77
77
78
78
## Docker