[vLLM IR] Port RoPE ops to IR #38756
Open
Labels
vllm-ir (vLLM IR: intermediate representation and kernel registration)
Description
There are many flavors of RoPE, but some of them have only a native implementation and do not need to be ported. Additionally, the sin_cos_cache initialization logic should remain in the layer. At the very least, the following should be ported:
- RotaryEmbedding
- DeepseekScalingRotaryEmbedding
However, we should carefully inspect the semantics to see whether any of the ops can be consolidated, especially by using simple bool params. This will help us reduce the maintenance burden and increase the coverage of RoPE+cache related fusions.
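As a rough illustration of what consolidation behind a bool param could look like, here is a minimal pure-Python sketch of one reference function that covers both the NeoX-style and GPT-J-style pairings. The function name `rope_reference`, its signature, and the single-vector list layout are assumptions made for illustration; this is not the vLLM IR op definition or the existing kernel interface.

```python
import math


def rope_reference(x, pos, base=10000.0, is_neox_style=True):
    """Rotate one head vector `x` (even-length list of floats) at position `pos`.

    Hypothetical reference op: a single bool selects the pairing layout, so
    two variants share one definition instead of being two separate ops.
    """
    d = len(x)
    half = d // 2
    out = list(x)  # functional: the input is left untouched
    for i in range(half):
        theta = pos * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        # NeoX style pairs element i with i + half;
        # GPT-J style pairs the adjacent elements 2i and 2i + 1.
        a, b = (i, i + half) if is_neox_style else (2 * i, 2 * i + 1)
        out[a] = x[a] * c - x[b] * s
        out[b] = x[a] * s + x[b] * c
    return out


x = [1.0, 2.0, 3.0, 4.0]
neox = rope_reference(x, pos=5, is_neox_style=True)
gptj = rope_reference(x, pos=5, is_neox_style=False)
```

Both branches apply the same rotation and differ only in which elements are paired, which is the kind of difference a bool param can absorb. Variants that change the frequency or scaling computation would need a closer look before being folded in the same way.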
The final challenge for RoPE will be the in-place semantics: the _C implementation is fully in-place and its arguments are views, which will complicate the aliasing analysis for clone elimination after the lowering pass (see #36823).
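The aliasing problem can be shown with a small stdlib-only sketch. It uses `array` and `memoryview` as stand-ins for a tensor and a view into it; `rope_inplace_` and `rope_functional` are hypothetical names, and none of this is the actual _C kernel or the lowering pass.

```python
import math
from array import array


def rope_inplace_(buf, pos, base=10000.0):
    """Hypothetical in-place NeoX-style rotation: mutates `buf`, returns None."""
    d = len(buf)
    half = d // 2
    for i in range(half):
        theta = pos * base ** (-2.0 * i / d)
        c, s = math.cos(theta), math.sin(theta)
        a, b = buf[i], buf[i + half]
        buf[i] = a * c - b * s
        buf[i + half] = a * s + b * c


def rope_functional(buf, pos):
    """Functional form: clone first, then run the in-place op on the clone."""
    out = array("d", list(buf))
    rope_inplace_(out, pos)
    return out


# A fused buffer; q_view shares storage with it, like a slice of a tensor.
qkv = array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
q_view = memoryview(qkv)[0:4]
before = list(qkv)

q_new = rope_functional(q_view, pos=3)
base_untouched = list(qkv) == before   # the clone protected the base buffer

rope_inplace_(q_view, pos=3)           # what dropping the clone amounts to
base_mutated = list(qkv) != before     # the write went through the view
```

Dropping the clone is only safe if nothing else reads the original contents of the base buffer afterwards. Because the argument is a view, that check has to reason about the base storage and every other view into it, not just the argument itself.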