
Commit 42f4d00

tdoublep and amd-xiaoyu12 authored and committed
[Docs] [V1] [Hybrid] Update docs to remove FlashInfer constraint for hybrid models (vllm-project#23665)
Signed-off-by: Thomas Parnell <[email protected]>
Signed-off-by: Xiao Yu <[email protected]>
1 parent 091d2fb commit 42f4d00

File tree: 1 file changed (+2, -3 lines)


docs/usage/v1_guide.md

Lines changed: 2 additions & 3 deletions
@@ -111,11 +111,10 @@ Models that use Mamba-2 and Mamba-1 layers (e.g., `Mamba2ForCausalLM`, `MambaFor
 
 Models that combine Mamba-2 and Mamba-1 layers with standard attention layers are also supported (e.g., `BambaForCausalLM`,
 `Zamba2ForCausalLM`, `NemotronHForCausalLM`, `FalconH1ForCausalLM` and `GraniteMoeHybridForCausalLM`, `JambaForCausalLM`). Please note that
-these models currently require disabling prefix caching and using the FlashInfer attention backend in V1.
+these models currently require disabling prefix caching in V1.
 
 Hybrid models with mechanisms different to Mamba are also supported (e.g, `MiniMaxText01ForCausalLM`, `MiniMaxM1ForCausalLM`).
-Please note that these models currently require disabling prefix caching, enforcing eager mode, and using the FlashInfer
-attention backend in V1.
+Please note that these models currently require disabling prefix caching and enforcing eager mode in V1.
 
 #### Encoder-Decoder Models
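For context, the remaining constraints in the updated docs map onto two standard vLLM engine arguments. The following is a minimal sketch (not part of the commit) of launching one of the Mamba-attention hybrid models with the offline `LLM` API; the checkpoint name is illustrative, and the keyword arguments are the usual engine args assumed to behave the same under V1:

```python
# Sketch only: running a hybrid (Mamba-2 + attention) model under the
# constraints described in the updated docs. The checkpoint name is
# illustrative; enable_prefix_caching / enforce_eager are standard vLLM
# engine arguments.
from vllm import LLM, SamplingParams

llm = LLM(
    model="ibm-ai-platform/Bamba-9B",  # illustrative hybrid checkpoint
    enable_prefix_caching=False,       # required for the hybrid models listed above
    enforce_eager=True,                # only needed for the non-Mamba hybrids (MiniMax*)
)

outputs = llm.generate(
    ["The key advantage of hybrid Mamba-attention models is"],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```

For the OpenAI-compatible server, the corresponding flags should be `--no-enable-prefix-caching` and `--enforce-eager`, though exact flag names may vary between releases.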
