@@ -13,6 +13,22 @@ You can also do pip install mlc-scripts and then use `mlcr` commands for downloa
 - DeepSeek-R1 model is automatically downloaded as part of setup
 - Checkpoint conversion is done transparently when needed.

+**Using MLC R2 Downloader**
+
+Download the model using the MLC R2 Downloader:
+
+```bash
+bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
+https://inference.mlcommons-storage.org/metadata/deepseek-r1-0528.uri
+```
+
+To specify a custom download directory, use the `-d` flag:
+```bash
+bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
+-d /path/to/download/directory \
+https://inference.mlcommons-storage.org/metadata/deepseek-r1-0528.uri
+```
+
 ## Dataset Download

 The dataset is an ensemble of the datasets: AIME, MATH500, gpqa, MMLU-Pro, livecodebench(code_generation_lite). They are covered by the following licenses:
@@ -23,24 +39,17 @@ The dataset is an ensemble of the datasets: AIME, MATH500, gpqa, MMLU-Pro, livec
 - MMLU-Pro: [MIT](https://opensource.org/license/mit)
 - livecodebench(code_generation_lite): [CC](https://creativecommons.org/share-your-work/cclicenses/)

-### Preprocessed
-
-**Using MLCFlow Automation**
-
-```
-mlcr get,dataset,whisper,_preprocessed,_mlc,_rclone --outdirname=<path to download> -j
-```
+### Preprocessed & Calibration

-**Using Native method**
+**Using MLC R2 Downloader**

 Download the preprocessed dataset using the MLCommons downloader:

 ```bash
-bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
-https://inference.mlcommons-storage.org/metadata/deepseek-r1-datasets-fp8-eval.uri
+bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) -d ./ https://inference.mlcommons-storage.org/metadata/deepseek-r1-datasets-fp8-eval.uri
 ```

-This will download the dataset file `mlperf_deepseek_r1_dataset_4388_fp8_eval.pkl`.
+This will download both the full preprocessed dataset (`mlperf_deepseek_r1_dataset_4388_fp8_eval.pkl`) and the calibration dataset (`mlperf_deepseek_r1_calibration_dataset_500_fp8_eval.pkl`).

 To specify a custom download directory, use the `-d` flag:
 ```bash
@@ -49,30 +58,20 @@ bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/he
 https://inference.mlcommons-storage.org/metadata/deepseek-r1-datasets-fp8-eval.uri
 ```

-### Calibration
+### Preprocessed

 **Using MLCFlow Automation**

 ```
-mlcr get,preprocessed,dataset,deepseek-r1,_calibration,_mlc,_rclone --outdirname=<path to download> -j
+mlcr get,dataset,whisper,_preprocessed,_mlc,_rclone --outdirname=<path to download> -j
 ```

-**Using Native method**
+### Calibration

-Download the calibration dataset using the MLCommons downloader:
+**Using MLCFlow Automation**

-```bash
-bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
-https://inference.mlcommons-storage.org/metadata/deepseek-r1-0528.uri
 ```
-
-This will download the calibration dataset file `mlperf_deepseek_r1_calibration_dataset_500_fp8_eval.pkl`.
-
-To specify a custom download directory, use the `-d` flag:
-```bash
-bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) \
--d /path/to/download/directory \
-https://inference.mlcommons-storage.org/metadata/deepseek-r1-0528.uri
+mlcr get,preprocessed,dataset,deepseek-r1,_calibration,_mlc,_rclone --outdirname=<path to download> -j
 ```

 ## Docker