AttentionMetadata Preparation for Encoder-only Models #145

slokesha · 2025-09-09T09:39:47Z

Upstream vLLM implements an encoder-only attention layer that initializes its own AttentionMetadataBuilder.

On GPU, the AttentionMetadataBuilder is the standard way to create attention metadata.

On Gaudi, however, attention metadata is created through the make_prefill_metadata function. This causes encoder-only models to bypass the builder logic and fail in scenarios where no KV cache is used.

This PR introduces encoder-only attention metadata support for Gaudi by aligning with the upstream behavior while handling Gaudi-specific paths. It also handles edge cases where no KV cache is present.
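
For illustration only, here is a minimal sketch of what preparing metadata for an encoder-only pass without a KV cache could look like. It is not the code from this PR: `EncoderOnlyMetadataSketch` and `build_encoder_only_metadata` are hypothetical names, and the fields are assumptions chosen to show the idea (per-sequence lengths, cumulative offsets, non-causal attention, and an optional block list that stays `None` when no KV cache exists).

```python
from dataclasses import dataclass
from itertools import accumulate
from typing import List, Optional


@dataclass
class EncoderOnlyMetadataSketch:
    """Hypothetical container; not the real vLLM or Gaudi metadata class."""
    seq_lens: List[int]              # number of tokens in each sequence
    cu_seq_lens: List[int]           # cumulative offsets into the flattened batch
    block_list: Optional[List[int]]  # None when no KV cache is allocated
    is_causal: bool = False          # encoder-only attention is bidirectional


def build_encoder_only_metadata(
    seq_lens: List[int],
    block_list: Optional[List[int]] = None,
) -> EncoderOnlyMetadataSketch:
    """Build metadata from sequence lengths alone (assumed helper).

    The KV cache is optional here: when block_list is None, the metadata
    is still valid, which is the edge case described above.
    """
    if not seq_lens or any(n <= 0 for n in seq_lens):
        raise ValueError("seq_lens must be a non-empty list of positive ints")
    # Offsets start at 0 and end at the total token count.
    cu_seq_lens = [0] + list(accumulate(seq_lens))
    return EncoderOnlyMetadataSketch(
        seq_lens=list(seq_lens),
        cu_seq_lens=cu_seq_lens,
        block_list=block_list,
    )


# Usage: three sequences, no KV cache.
meta = build_encoder_only_metadata([3, 5, 2])
```

The point of the sketch is that nothing in the construction depends on KV cache state, so a path that assumes cached blocks always exist would need an explicit no-cache branch.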

Depends on upstream PR vllm-project/vllm#24612

Signed-off-by: slokesha <[email protected]>

slokesha added 9 commits September 8, 2025 21:25

Encoder-only attention support

404da7f

Signed-off-by: slokesha <[email protected]>

Fixed input_batch error

381b95f

Signed-off-by: slokesha <[email protected]>

Refactored

abd4470

Signed-off-by: slokesha <[email protected]>

Merge branch 'main' into encoder_only_attn

a5c433c

Handled no kv cache msg

407627f

Signed-off-by: slokesha <[email protected]>

Merge branch 'main' into encoder_only_attn

717ee1e

pre-commit refactor

653050a

Signed-off-by: slokesha <[email protected]>

Added UnitTest

5f631f6

Signed-off-by: slokesha <[email protected]>

pre commit changes

d75e23e

Signed-off-by: slokesha <[email protected]>

slokesha marked this pull request as ready for review September 11, 2025 20:30

slokesha requested review from kzawora-intel, xuechendi, mswiniarsk and adobrzyn as code owners September 11, 2025 20:30

xuechendi and others added 5 commits September 11, 2025 16:08

Merge branch 'main' into encoder_only_attn

a76a513

Merge branch 'main' into encoder_only_attn

bf15361

Merge branch 'main' into encoder_only_attn

b723086

Pre-commit Fix

7f6a8fd

Signed-off-by: slokesha <[email protected]>

Merge branch 'main' into encoder_only_attn

2d924b2
