
[vLLM IR] Port RoPE ops to IR #38756

@ProExpertProg

There are many flavors of RoPE, but some of them have only a native implementation; those do not need to be ported. Additionally, the sin_cos_cache initialization logic should remain in the layer. At the very least, the following should be ported:

  • RotaryEmbedding
  • DeepseekScalingRotaryEmbedding

However, we should carefully inspect the semantics to see whether any of the ops can be consolidated, especially via simple bool params. Consolidation will help us reduce the maintenance burden and increase the coverage of rope+cache related fusions.
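To make the consolidation idea concrete, here is a minimal sketch of how two rotation layouts could share one op behind a single bool. It is pure Python for illustration only; the function name `rope_rotate` and the flag name `is_neox_style` are assumptions of this sketch and not a proposed IR signature.

```python
import math

def rope_rotate(x, cos, sin, is_neox_style=True):
    """Rotate one head vector `x` (even length d) by d // 2 angles.

    `cos` and `sin` hold one value per rotated pair. The bool only
    selects which elements form a pair; the rotation math is shared.
    Hypothetical helper, not vLLM's actual op.
    """
    half = len(x) // 2
    out = [0.0] * len(x)
    for i in range(half):
        if is_neox_style:
            a, b = i, i + half        # pair first half with second half
        else:
            a, b = 2 * i, 2 * i + 1   # pair adjacent elements
        out[a] = x[a] * cos[i] - x[b] * sin[i]
        out[b] = x[b] * cos[i] + x[a] * sin[i]
    return out

# Rotating every pair by 90 degrees (cos = 0, sin = 1) shows the two layouts.
x = [1.0, 2.0, 3.0, 4.0]
neox = rope_rotate(x, [0.0, 0.0], [1.0, 1.0], is_neox_style=True)
gptj = rope_rotate(x, [0.0, 0.0], [1.0, 1.0], is_neox_style=False)
```

Whether the real variants differ by only such a flag is exactly the semantic question the paragraph above raises; the sketch just shows the shape a consolidated op would take if they do.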

The final challenge for RoPE will be the in-place semantics: the _C implementation is fully in-place and its arguments are views, which will complicate the aliasing analysis for clone elimination after the lowering pass (see #36823).
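The aliasing concern can be illustrated with a small stdlib analogy. This is not vLLM code: `scale_inplace` and `scale_functional` are hypothetical stand-ins for an in-place kernel and its functional counterpart, and `memoryview` slices play the role of tensor views into a packed buffer.

```python
from array import array

def scale_inplace(view, factor):
    """Stand-in for a fully in-place kernel: mutates its argument."""
    for i in range(len(view)):
        view[i] = view[i] * factor

def scale_functional(view, factor):
    """Functional counterpart: leaves the input alone, returns a new buffer."""
    return array("d", (v * factor for v in view))

# Packed buffer; q and k are views into it, not copies.
qkv = array("d", [1.0, 2.0, 3.0, 4.0])
buf = memoryview(qkv)
q, k = buf[0:2], buf[2:4]

# In-place form: writing through the view changes the base buffer.
scale_inplace(q, 10.0)
after_inplace = qkv.tolist()

# Functional form: the fresh result has to be copied back into the view
# to reproduce the same observable state of the base buffer.
qkv2 = array("d", [1.0, 2.0, 3.0, 4.0])
q2 = memoryview(qkv2)[0:2]
q2[:] = scale_functional(q2, 10.0)
after_functional = qkv2.tolist()
```

In this toy setting the extra buffer and copy-back are removable only if one can prove that nothing else reads the old contents of the base buffer through another alias, which is the kind of reasoning the clone elimination would need once the arguments are views.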

Labels: vllm-ir (vLLM IR: intermediate representation and kernel registration)
