
Commit 4784d0d

update LLM support models (#3489)
* update LLM support models
* correct src build argument
* remove cpuid dependency

Co-authored-by: Chunyuan WU <[email protected]>
1 parent 1eb99eb commit 4784d0d

File tree

5 files changed: +124 -121 lines changed

README.md

Lines changed: 43 additions & 40 deletions
@@ -18,46 +18,49 @@ In the current technological landscape, Generative AI (GenAI) workloads and mode
 
 | MODEL FAMILY | MODEL NAME (Huggingface hub) | FP32 | BF16 | Static quantization INT8 | Weight only quantization INT8 | Weight only quantization INT4 |
 |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
-|LLAMA| meta-llama/Llama-2-7b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Llama-2-13b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Llama-2-70b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Meta-Llama-3-8B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Meta-Llama-3-70B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Meta-Llama-3.1-8B-Instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Llama-3.2-3B-Instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Llama-3.2-11B-Vision-Instruct | 🟩 | 🟩 | | 🟩 | |
-|GPT-J| EleutherAI/gpt-j-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|GPT-NEOX| EleutherAI/gpt-neox-20b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|DOLLY| databricks/dolly-v2-12b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|FALCON| tiiuae/falcon-7b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|FALCON| tiiuae/falcon-11b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|FALCON| tiiuae/falcon-40b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|OPT| facebook/opt-30b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|OPT| facebook/opt-1.3b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Bloom| bigscience/bloom-1b7 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|CodeGen| Salesforce/codegen-2B-multi | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Baichuan| baichuan-inc/Baichuan2-7B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Baichuan| baichuan-inc/Baichuan2-13B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Baichuan| baichuan-inc/Baichuan-13B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|ChatGLM| THUDM/chatglm3-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|ChatGLM| THUDM/chatglm2-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|GPTBigCode| bigcode/starcoder | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|T5| google/flan-t5-xl | 🟩 | 🟩 | 🟩 | 🟩 | |
-|MPT| mosaicml/mpt-7b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Mistral| mistralai/Mistral-7B-v0.1 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Mixtral| mistralai/Mixtral-8x7B-v0.1 | 🟩 | 🟩 | | 🟩 | 🟩 |
-|Stablelm| stabilityai/stablelm-2-1_6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Qwen| Qwen/Qwen-7B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Qwen| Qwen/Qwen2-7B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLaVA| liuhaotian/llava-v1.5-7b | 🟩 | 🟩 | | 🟩 | 🟩 |
-|GIT| microsoft/git-base | 🟩 | 🟩 | | 🟩 | |
-|Yuan| IEITYuan/Yuan2-102B-hf | 🟩 | 🟩 | | 🟩 | |
-|Phi| microsoft/phi-2 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Phi| microsoft/Phi-3-mini-4k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Phi| microsoft/Phi-3-mini-128k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Phi| microsoft/Phi-3-medium-4k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Phi| microsoft/Phi-3-medium-128k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Whisper| openai/whisper-large-v2 | 🟩 | 🟩 | 🟩 | 🟩 | |
+|LLAMA| meta-llama/Llama-2-7b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Llama-2-13b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Llama-2-70b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Meta-Llama-3-8B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Meta-Llama-3-70B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Meta-Llama-3.1-8B-Instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Llama-3.2-3B-Instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Llama-3.2-11B-Vision-Instruct | 🟩 | 🟩 | | 🟩 | 🟩 |
+|GPT-J| EleutherAI/gpt-j-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|GPT-NEOX| EleutherAI/gpt-neox-20b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|DOLLY| databricks/dolly-v2-12b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|FALCON| tiiuae/falcon-7b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|FALCON| tiiuae/falcon-11b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|FALCON| tiiuae/falcon-40b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|OPT| facebook/opt-30b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|OPT| facebook/opt-1.3b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Bloom| bigscience/bloom-1b7 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|CodeGen| Salesforce/codegen-2B-multi | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Baichuan| baichuan-inc/Baichuan2-7B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Baichuan| baichuan-inc/Baichuan2-13B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Baichuan| baichuan-inc/Baichuan-13B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|ChatGLM| THUDM/chatglm3-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|ChatGLM| THUDM/chatglm2-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|GPTBigCode| bigcode/starcoder | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|T5| google/flan-t5-xl | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|MPT| mosaicml/mpt-7b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Mistral| mistralai/Mistral-7B-v0.1 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Mixtral| mistralai/Mixtral-8x7B-v0.1 | 🟩 | 🟩 | | 🟩 | 🟩 |
+|Stablelm| stabilityai/stablelm-2-1_6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Qwen| Qwen/Qwen-7B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Qwen| Qwen/Qwen2-7B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLaVA| liuhaotian/llava-v1.5-7b | 🟩 | 🟩 | | 🟩 | 🟩 |
+|GIT| microsoft/git-base | 🟩 | 🟩 | | 🟩 | 🟩 |
+|Yuan| IEITYuan/Yuan2-102B-hf | 🟩 | 🟩 | | 🟩 | |
+|Phi| microsoft/phi-2 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Phi| microsoft/Phi-3-mini-4k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Phi| microsoft/Phi-3-mini-128k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Phi| microsoft/Phi-3-medium-4k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Phi| microsoft/Phi-3-medium-128k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Whisper| openai/whisper-large-v2 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Maira| microsoft/maira-2 | 🟩 | 🟩 | | 🟩 | 🟩 |
+|Jamba| ai21labs/Jamba-v0.1 | 🟩 | 🟩 | | 🟩 | 🟩 |
+|DeepSeek| deepseek-ai/DeepSeek-V2.5-1210 | 🟩 | 🟩 | | 🟩 | 🟩 |
 
 *Note*: The above verified models (including other models in the same model family, like "codellama/CodeLlama-7b-hf" from LLAMA family) are well supported with all optimizations like indirect access KV cache, fused ROPE, and customized linear kernels.
 We are working in progress to better support the models in the tables with various data types. In addition, more models will be optimized in the future.
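The support-matrix rows above are plain markdown, so they can be consumed programmatically. A minimal sketch (the `parse_row` helper and `COLUMNS` list are illustrative, not part of the repository):

```python
# Parse one markdown row of the support matrix into a dict.
# COLUMNS mirrors the table header shown in the diff above.
COLUMNS = [
    "FP32",
    "BF16",
    "Static quantization INT8",
    "Weight only quantization INT8",
    "Weight only quantization INT4",
]

def parse_row(row: str) -> dict:
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    family, model, marks = cells[0], cells[1], cells[2:]
    return {
        "family": family,
        "model": model,
        # 🟩 marks a verified configuration; an empty cell means not verified
        "support": {col: mark == "🟩" for col, mark in zip(COLUMNS, marks)},
    }

row = "|LLAMA| meta-llama/Llama-2-7b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |"
print(parse_row(row)["support"]["Weight only quantization INT4"])  # → True
```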

docs/tutorials/examples.md

Lines changed: 2 additions & 2 deletions
@@ -314,6 +314,6 @@ $ ldd example-app
 
 ## Intel® AI Reference Models
 
-Use cases that have already been optimized by Intel engineers are available at [Intel® AI Reference Models](https://github.com/intel/ai-reference-models/tree/pytorch-r2.5-models) (former Model Zoo).
-The lists of PyTorch use cases with links to sample codes are available in the [use case tables](https://github.com/intel/ai-reference-models/tree/pytorch-r2.5-models?tab=readme-ov-file#use-cases).
+Use cases that have already been optimized by Intel engineers are available at [Intel® AI Reference Models](https://github.com/intel/ai-reference-models/tree/pytorch-r2.6-models) (former Model Zoo).
+The lists of PyTorch use cases with links to sample codes are available in the [use case tables](https://github.com/intel/ai-reference-models/tree/pytorch-r2.6-models?tab=readme-ov-file#use-cases).
 You can get performance benefits out-of-the-box by simply running scripts in the Intel® AI Reference Models.

examples/cpu/llm/README.md

Lines changed: 1 addition & 1 deletion
@@ -102,7 +102,7 @@ conda activate llm
 
 # Setup the environment with the provided script
 cd examples/cpu/llm
-bash ./tools/env_setup.sh 8
+bash ./tools/env_setup.sh 11
 
 # Activate environment variables
 # set bash script argument to "inference" or "fine-tuning" for different usages
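The only change in this hunk bumps the `env_setup.sh` mode argument from 8 to 11 (the commit message calls this "correct src build argument"). Assuming the argument is read as a bit mask of setup options — an assumption; `tools/env_setup.sh` is the authoritative reference — the bump keeps the existing high bit and turns on two additional low bits:

```python
# Illustrative only: compare the old and new mode values bit by bit,
# under the assumption that env_setup.sh treats its argument as a bit mask.
old_mode, new_mode = 8, 11
print(f"old={old_mode:04b} new={new_mode:04b}")  # old=1000 new=1011
newly_set = new_mode & ~old_mode
print([bit for bit in range(4) if (newly_set >> bit) & 1])  # → [0, 1]
```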

examples/cpu/llm/inference/README.md

Lines changed: 78 additions & 77 deletions
@@ -4,88 +4,89 @@
 
 | MODEL FAMILY | MODEL NAME (Huggingface hub) | FP32 | BF16 | Static quantization INT8 | Weight only quantization INT8 | Weight only quantization INT4 |
 |:---:|:---:|:---:|:---:|:---:|:---:|:---:|
-|LLAMA| meta-llama/Llama-2-7b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Llama-2-13b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Llama-2-70b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Meta-Llama-3-8B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Meta-Llama-3-70B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Meta-Llama-3.1-8B-Instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Llama-3.2-3B-Instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLAMA| meta-llama/Llama-3.2-11B-Vision-Instruct | 🟩 | 🟩 | | 🟩 | |
-|GPT-J| EleutherAI/gpt-j-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|GPT-NEOX| EleutherAI/gpt-neox-20b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|DOLLY| databricks/dolly-v2-12b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|FALCON| tiiuae/falcon-7b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|FALCON| tiiuae/falcon-11b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|FALCON| tiiuae/falcon-40b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|OPT| facebook/opt-30b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|OPT| facebook/opt-1.3b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Bloom| bigscience/bloom-1b7 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|CodeGen| Salesforce/codegen-2B-multi | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Baichuan| baichuan-inc/Baichuan2-7B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Baichuan| baichuan-inc/Baichuan2-13B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Baichuan| baichuan-inc/Baichuan-13B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|ChatGLM| THUDM/chatglm3-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|ChatGLM| THUDM/chatglm2-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|GPTBigCode| bigcode/starcoder | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|T5| google/flan-t5-xl | 🟩 | 🟩 | 🟩 | 🟩 | |
-|MPT| mosaicml/mpt-7b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Mistral| mistralai/Mistral-7B-v0.1 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Mixtral| mistralai/Mixtral-8x7B-v0.1 | 🟩 | 🟩 | | 🟩 | 🟩 |
-|Stablelm| stabilityai/stablelm-2-1_6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Qwen| Qwen/Qwen-7B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Qwen| Qwen/Qwen2-7B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|LLaVA| liuhaotian/llava-v1.5-7b | 🟩 | 🟩 | | 🟩 | 🟩 |
-|GIT| microsoft/git-base | 🟩 | 🟩 | | 🟩 | |
-|Yuan| IEITYuan/Yuan2-102B-hf | 🟩 | 🟩 | | 🟩 | |
-|Phi| microsoft/phi-2 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Phi| microsoft/Phi-3-mini-4k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Phi| microsoft/Phi-3-mini-128k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Phi| microsoft/Phi-3-medium-4k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Phi| microsoft/Phi-3-medium-128k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
-|Whisper| openai/whisper-large-v2 | 🟩 | 🟩 | 🟩 | 🟩 | |
-|Maira| microsoft/maira-2 | 🟩 | 🟩 | | 🟩 | |
-|Jamba| ai21labs/Jamba-v0.1 | 🟩 | 🟩 | | 🟩 | |
+|LLAMA| meta-llama/Llama-2-7b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Llama-2-13b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Llama-2-70b-hf | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Meta-Llama-3-8B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Meta-Llama-3-70B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Meta-Llama-3.1-8B-Instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Llama-3.2-3B-Instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Llama-3.2-11B-Vision-Instruct | 🟩 | 🟩 | | 🟩 | 🟩 |
+|GPT-J| EleutherAI/gpt-j-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|GPT-NEOX| EleutherAI/gpt-neox-20b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|DOLLY| databricks/dolly-v2-12b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|FALCON| tiiuae/falcon-7b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|FALCON| tiiuae/falcon-11b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|FALCON| tiiuae/falcon-40b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|OPT| facebook/opt-30b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|OPT| facebook/opt-1.3b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Bloom| bigscience/bloom-1b7 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|CodeGen| Salesforce/codegen-2B-multi | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Baichuan| baichuan-inc/Baichuan2-7B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Baichuan| baichuan-inc/Baichuan2-13B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Baichuan| baichuan-inc/Baichuan-13B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|ChatGLM| THUDM/chatglm3-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|ChatGLM| THUDM/chatglm2-6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|GPTBigCode| bigcode/starcoder | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|T5| google/flan-t5-xl | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|MPT| mosaicml/mpt-7b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Mistral| mistralai/Mistral-7B-v0.1 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Mixtral| mistralai/Mixtral-8x7B-v0.1 | 🟩 | 🟩 | | 🟩 | 🟩 |
+|Stablelm| stabilityai/stablelm-2-1_6b | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Qwen| Qwen/Qwen-7B-Chat | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Qwen| Qwen/Qwen2-7B | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|LLaVA| liuhaotian/llava-v1.5-7b | 🟩 | 🟩 | | 🟩 | 🟩 |
+|GIT| microsoft/git-base | 🟩 | 🟩 | | 🟩 | 🟩 |
+|Yuan| IEITYuan/Yuan2-102B-hf | 🟩 | 🟩 | | 🟩 | |
+|Phi| microsoft/phi-2 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Phi| microsoft/Phi-3-mini-4k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Phi| microsoft/Phi-3-mini-128k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Phi| microsoft/Phi-3-medium-4k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Phi| microsoft/Phi-3-medium-128k-instruct | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Whisper| openai/whisper-large-v2 | 🟩 | 🟩 | 🟩 | 🟩 | 🟩 |
+|Maira| microsoft/maira-2 | 🟩 | 🟩 | | 🟩 | 🟩 |
+|Jamba| ai21labs/Jamba-v0.1 | 🟩 | 🟩 | | 🟩 | 🟩 |
+|DeepSeek| deepseek-ai/DeepSeek-V2.5-1210 | 🟩 | 🟩 | | 🟩 | 🟩 |
 
 ## 1.2 Verified for distributed inference mode via DeepSpeed
 
 | MODEL FAMILY | MODEL NAME (Huggingface hub) | BF16 | Weight only quantization INT8 |
 |:---:|:---:|:---:|:---:|
-|LLAMA| meta-llama/Llama-2-7b-hf | 🟩 | 🟩 |
-|LLAMA| meta-llama/Llama-2-13b-hf | 🟩 | 🟩 |
-|LLAMA| meta-llama/Llama-2-70b-hf | 🟩 | 🟩 |
-|LLAMA| meta-llama/Meta-Llama-3-8B | 🟩 | 🟩 |
-|LLAMA| meta-llama/Meta-Llama-3-70B | 🟩 | 🟩 |
-|LLAMA| meta-llama/Meta-Llama-3.1-8B-Instruct | 🟩 | 🟩 |
-|LLAMA| meta-llama/Llama-3.2-3B-Instruct | 🟩 | 🟩 |
-|LLAMA| meta-llama/Llama-3.2-11B-Vision-Instruct | 🟩 | 🟩 |
-|GPT-J| EleutherAI/gpt-j-6b | 🟩 | 🟩 |
-|GPT-NEOX| EleutherAI/gpt-neox-20b | 🟩 | 🟩 |
-|DOLLY| databricks/dolly-v2-12b | 🟩 | 🟩 |
-|FALCON| tiiuae/falcon-11b | 🟩 | 🟩 |
-|FALCON| tiiuae/falcon-40b | 🟩 | 🟩 |
-|OPT| facebook/opt-30b | 🟩 | 🟩 |
-|OPT| facebook/opt-1.3b | 🟩 | 🟩 |
-|Bloom| bigscience/bloom-1b7 | 🟩 | 🟩 |
-|CodeGen| Salesforce/codegen-2B-multi | 🟩 | 🟩 |
-|Baichuan| baichuan-inc/Baichuan2-7B-Chat | 🟩 | 🟩 |
-|Baichuan| baichuan-inc/Baichuan2-13B-Chat | 🟩 | 🟩 |
-|Baichuan| baichuan-inc/Baichuan-13B-Chat | 🟩 | 🟩 |
-|GPTBigCode| bigcode/starcoder | 🟩 | 🟩 |
-|T5| google/flan-t5-xl | 🟩 | 🟩 |
-|Mistral| mistralai/Mistral-7B-v0.1 | 🟩 | 🟩 |
-|Mistral| mistralai/Mixtral-8x7B-v0.1 | 🟩 | 🟩 |
-|MPT| mosaicml/mpt-7b | 🟩 | 🟩 |
-|Stablelm| stabilityai/stablelm-2-1_6b | 🟩 | 🟩 |
-|Qwen| Qwen/Qwen-7B-Chat | 🟩 | 🟩 |
-|Qwen| Qwen/Qwen2-7B | 🟩 | 🟩 |
-|GIT| microsoft/git-base | 🟩 | 🟩 |
-|Phi| microsoft/phi-2 | 🟩 | 🟩 |
-|Phi| microsoft/Phi-3-mini-4k-instruct | 🟩 | 🟩 |
-|Phi| microsoft/Phi-3-mini-128k-instruct | 🟩 | 🟩 |
-|Phi| microsoft/Phi-3-medium-4k-instruct | 🟩 | 🟩 |
-|Phi| microsoft/Phi-3-medium-128k-instruct | 🟩 | 🟩 |
-|Whisper| openai/whisper-large-v2 | 🟩 | 🟩 |
+|LLAMA| meta-llama/Llama-2-7b-hf | 🟩 | 🟩 |
+|LLAMA| meta-llama/Llama-2-13b-hf | 🟩 | 🟩 |
+|LLAMA| meta-llama/Llama-2-70b-hf | 🟩 | 🟩 |
+|LLAMA| meta-llama/Meta-Llama-3-8B | 🟩 | 🟩 |
+|LLAMA| meta-llama/Meta-Llama-3-70B | 🟩 | 🟩 |
+|LLAMA| meta-llama/Meta-Llama-3.1-8B-Instruct | 🟩 | 🟩 |
+|LLAMA| meta-llama/Llama-3.2-3B-Instruct | 🟩 | 🟩 |
+|LLAMA| meta-llama/Llama-3.2-11B-Vision-Instruct | 🟩 | 🟩 |
+|GPT-J| EleutherAI/gpt-j-6b | 🟩 | 🟩 |
+|GPT-NEOX| EleutherAI/gpt-neox-20b | 🟩 | 🟩 |
+|DOLLY| databricks/dolly-v2-12b | 🟩 | 🟩 |
+|FALCON| tiiuae/falcon-11b | 🟩 | 🟩 |
+|FALCON| tiiuae/falcon-40b | 🟩 | 🟩 |
+|OPT| facebook/opt-30b | 🟩 | 🟩 |
+|OPT| facebook/opt-1.3b | 🟩 | 🟩 |
+|Bloom| bigscience/bloom-1b7 | 🟩 | 🟩 |
+|CodeGen| Salesforce/codegen-2B-multi | 🟩 | 🟩 |
+|Baichuan| baichuan-inc/Baichuan2-7B-Chat | 🟩 | 🟩 |
+|Baichuan| baichuan-inc/Baichuan2-13B-Chat | 🟩 | 🟩 |
+|Baichuan| baichuan-inc/Baichuan-13B-Chat | 🟩 | 🟩 |
+|GPTBigCode| bigcode/starcoder | 🟩 | 🟩 |
+|T5| google/flan-t5-xl | 🟩 | 🟩 |
+|Mistral| mistralai/Mistral-7B-v0.1 | 🟩 | 🟩 |
+|Mistral| mistralai/Mixtral-8x7B-v0.1 | 🟩 | 🟩 |
+|MPT| mosaicml/mpt-7b | 🟩 | 🟩 |
+|Stablelm| stabilityai/stablelm-2-1_6b | 🟩 | 🟩 |
+|Qwen| Qwen/Qwen-7B-Chat | 🟩 | 🟩 |
+|Qwen| Qwen/Qwen2-7B | 🟩 | 🟩 |
+|GIT| microsoft/git-base | 🟩 | 🟩 |
+|Phi| microsoft/phi-2 | 🟩 | 🟩 |
+|Phi| microsoft/Phi-3-mini-4k-instruct | 🟩 | 🟩 |
+|Phi| microsoft/Phi-3-mini-128k-instruct | 🟩 | 🟩 |
+|Phi| microsoft/Phi-3-medium-4k-instruct | 🟩 | 🟩 |
+|Phi| microsoft/Phi-3-medium-128k-instruct | 🟩 | 🟩 |
+|Whisper| openai/whisper-large-v2 | 🟩 | 🟩 |
 
 *Note*: The above verified models (including other models in the same model family, like "codellama/CodeLlama-7b-hf" from LLAMA family)
 are well supported with all optimizations like indirect access KV cache, fused ROPE, and customized linear kernels.

examples/cpu/llm/requirements.txt

Lines changed: 0 additions & 1 deletion
@@ -1,4 +1,3 @@
-cpuid
 accelerate
 datasets==2.21.0
 sentencepiece
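Per the commit message, the `cpuid` package is dropped as a dependency. For reference, CPU feature flags — e.g. whether the AVX-512/AMX kernels behind the quantization columns above are usable — can be read on Linux without any third-party package. A hedged sketch; the `cpu_flags` helper is hypothetical, not part of the repository:

```python
def cpu_flags() -> set:
    """Return the CPU feature-flag set from /proc/cpuinfo (empty off-Linux)."""
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()

flags = cpu_flags()
# AVX-512 / AMX availability is what the INT8/INT4 paths ultimately rely on
print("avx512f" in flags, "amx_tile" in flags)
```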
