Commit ab71413
[Doc] Update compatibility matrix for pooling and multimodal models (#21831)
Signed-off-by: DarkLight1337 <[email protected]>
1 parent 755fa8b commit ab71413

1 file changed

+7
-5
lines changed
docs/features/compatibility_matrix.md

Lines changed: 7 additions & 5 deletions
@@ -34,23 +34,25 @@ th:not(:first-child) {
 }
 </style>
 
-| Feature | [CP][chunked-prefill] | [APC](automatic_prefix_caching.md) | [LoRA](lora.md) | [SD](spec_decode.md) | CUDA graph | <abbr title="Pooling Models">pooling</abbr> | <abbr title="Encoder-Decoder Models">enc-dec</abbr> | <abbr title="Logprobs">logP</abbr> | <abbr title="Prompt Logprobs">prmpt logP</abbr> | <abbr title="Async Output Processing">async output</abbr> | multi-step | <abbr title="Multimodal Inputs">mm</abbr> | best-of | beam-search |
+| Feature | [CP][chunked-prefill] | [APC](automatic_prefix_caching.md) | [LoRA](lora.md) | [SD](spec_decode.md) | CUDA graph | [pooling](../models/pooling_models.md) | <abbr title="Encoder-Decoder Models">enc-dec</abbr> | <abbr title="Logprobs">logP</abbr> | <abbr title="Prompt Logprobs">prmpt logP</abbr> | <abbr title="Async Output Processing">async output</abbr> | multi-step | <abbr title="Multimodal Inputs">mm</abbr> | best-of | beam-search |
 |---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
 | [CP][chunked-prefill] || | | | | | | | | | | | | | |
 | [APC](automatic_prefix_caching.md) ||| | | | | | | | | | | | | |
 | [LoRA](lora.md) |||| | | | | | | | | | | | |
 | [SD](spec_decode.md) ||||| | | | | | | | | | |
 | CUDA graph |||||| | | | | | | | | |
-| <abbr title="Pooling Models">pooling</abbr> | | | || || | | | | | | | |
+| [pooling](../models/pooling_models.md) | \* | \* | || || | | | | | | | |
 | <abbr title="Encoder-Decoder Models">enc-dec</abbr> || [](gh-issue:7366) || [](gh-issue:7366) |||| | | | | | | |
 | <abbr title="Logprobs">logP</abbr> ||||||||| | | | | | |
 | <abbr title="Prompt Logprobs">prmpt logP</abbr> |||||||||| | | | | |
 | <abbr title="Async Output Processing">async output</abbr> ||||||||||| | | | |
 | multi-step |||||||||||| | | |
-| <abbr title="Multimodal Inputs">mm</abbr> || [🟠](gh-pr:8348) | [🟠](gh-pr:4194) |||||||||| | |
+| [mm](multimodal_inputs.md) || | [🟠](gh-pr:4194) |||||||||| | |
 | best-of |||| [](gh-issue:6137) ||||||| [](gh-issue:7968) ||| |
 | beam-search |||| [](gh-issue:6137) ||||||| [](gh-issue:7968) ||||
 
+\* Chunked prefill and prefix caching are only applicable to last-token pooling.
+
 [](){ #feature-x-hardware }
 
 ## Feature x Hardware
@@ -62,9 +64,9 @@ th:not(:first-child) {
 | [LoRA](lora.md) |||||||||
 | [SD](spec_decode.md) |||||||||
 | CUDA graph |||||||||
-| <abbr title="Pooling Models">pooling</abbr> ||||||| ||
+| [pooling](../models/pooling_models.md) ||||||| ||
 | <abbr title="Encoder-Decoder Models">enc-dec</abbr> |||||||||
-| <abbr title="Multimodal Inputs">mm</abbr> |||||||||
+| [mm](multimodal_inputs.md) |||||||||
 | <abbr title="Logprobs">logP</abbr> |||||||||
 | <abbr title="Prompt Logprobs">prmpt logP</abbr> |||||||||
 | <abbr title="Async Output Processing">async output</abbr> |||||||||