
Conversation

@yeonsily commented Oct 3, 2025

No description provided.

SupreetSinghPalne and others added 15 commits September 18, 2025 12:37
Fixes HPU graph issues for gemma3 vision inputs

Text warmup now includes attn_mask info, so vision+text inputs can reuse the
language-model graph that has already been warmed up.
Change slicing to index_select for multimodal bucketing on HPU: slicing does
not produce the same hash for the HPU graph even when the input shape is the
same.
Use buckets for the vision tower as well to reduce graph-compiler (GC)
recompiles (see the first sketch after this commit list).
Fix an accuracy bug by cloning the output of the multimodal projector (see
the second sketch after this commit list).
Validated with the MuirBench dataset.
Add missing modelscope package - the `VLLM_USE_MODELSCOPE` env var doesn't
work without it.
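
A minimal sketch of the bucketing and index_select changes, with illustrative
bucket sizes and tensor names (nothing below is the PR's actual code): an
explicit index tensor keeps the op signature stable on HPU, so a graph warmed
up for a bucket shape gets reused, while plain slicing can hash differently
and force a recompile.

```python
import bisect
import torch

# Illustrative vision buckets; the real bucket config lives in the HPU fork.
VISION_BUCKETS = [1, 2, 4, 8, 16]

def round_up_to_bucket(n: int) -> int:
    # Pad the image count up to the nearest bucket so the vision tower only
    # sees a few distinct input shapes (fewer graph compiles). Counts above
    # the largest bucket are clamped in this sketch.
    idx = bisect.bisect_left(VISION_BUCKETS, n)
    return VISION_BUCKETS[min(idx, len(VISION_BUCKETS) - 1)]

num_real = 5                                     # images actually present
bucket = round_up_to_bucket(num_real)            # -> 8
pixel_batch = torch.randn(bucket, 3, 224, 224)   # padded to the bucket

# Instead of `pixel_batch[:num_real]` (slicing), use index_select with an
# explicit index tensor: same result, but a stable graph hash on HPU.
indices = torch.arange(num_real)
real_images = pixel_batch.index_select(0, indices)
assert real_images.shape[0] == num_real
```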
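A second sketch for the projector-output cloning fix, again with hypothetical
names, assuming the problem is buffer aliasing: cloning ensures the embeddings
handed to the language model do not alias a buffer that a later HPU graph
replay may overwrite.

```python
import torch

def project_multimodal(projector: torch.nn.Module,
                       vision_feats: torch.Tensor) -> torch.Tensor:
    out = projector(vision_feats)
    # Without .clone(), `out` may alias memory that a subsequent graph replay
    # reuses, silently corrupting the embeddings and hurting accuracy.
    return out.clone()

# Usage with a stand-in projector:
projector = torch.nn.Linear(1024, 4096)
feats = torch.randn(8, 1024)
embeds = project_multimodal(projector, feats)
```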
## Essential Elements of an Effective PR Description Checklist
- [ ] The purpose of the PR, such as "Fix some issue (link existing
issues this PR will resolve)".
- [ ] The test plan, such as providing test command.
- [ ] The test results, such as pasting the results comparison before
and after, or e2e results


## Purpose

## Test Plan

## Test Result

<!--- pyml disable-next-line no-emphasis-as-heading -->

---------

Signed-off-by: Libin Tang <[email protected]>
Co-authored-by: Libin Tang <[email protected]>
Co-authored-by: Libin Tang <[email protected]>
Introduce VLLM_WARMUP_WITH_PENALTY to also call the apply-penalty code in the
sampler during warmup.

https://github.com/HabanaAI/vllm-fork/blob/libint/intervl_bucket/vllm/model_executor/layers/sampler.py#L280
is not called during warmup, which causes an extra graph compile at runtime
once penalties are actually enabled in a real run. See the sketch below.
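
A rough sketch of the idea behind the flag, assuming a simplified sampler
interface (only the VLLM_WARMUP_WITH_PENALTY name comes from this PR; every
other name here is illustrative):

```python
import os
import torch

WARMUP_WITH_PENALTY = os.environ.get("VLLM_WARMUP_WITH_PENALTY", "0") == "1"

def apply_penalties(logits: torch.Tensor, penalty: float) -> torch.Tensor:
    # Stand-in for the real penalty math in sampler.py.
    return logits - penalty

def sample(logits: torch.Tensor, do_penalties: bool,
           is_warmup: bool = False) -> torch.Tensor:
    # Take the penalty branch during warmup too, so the graph compiled at
    # warmup time matches the one needed when penalties are on at runtime.
    if do_penalties or (is_warmup and WARMUP_WITH_PENALTY):
        logits = apply_penalties(logits, penalty=0.5)
    return torch.argmax(logits, dim=-1)

# Warmup pass compiles the penalty-enabled graph shape up front.
_ = sample(torch.randn(4, 32000), do_penalties=False, is_warmup=True)
```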
In sampling, if penalties are applied, prompt_tokens is regenerated for every
decode step, which takes time. Instead we can cache it and reset the cache
when the set of requests changes. See the sketch below.
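
A hedged sketch of that caching scheme, keyed on the frozen set of request
IDs; all names are illustrative rather than the PR's actual code:

```python
import torch

class PromptTokensCache:
    """Rebuild the padded prompt_tokens tensor only when the request set changes."""

    def __init__(self) -> None:
        self._key = None     # frozenset of request IDs for the cached tensor
        self._tensor = None  # cached right-padded prompt token matrix

    def get(self, requests: dict[str, list[int]]) -> torch.Tensor:
        key = frozenset(requests)
        if key != self._key:
            # Request set changed: rebuild the padded token matrix once, then
            # reuse it for every decode step of this batch.
            max_len = max(len(toks) for toks in requests.values())
            rows = [toks + [0] * (max_len - len(toks))
                    for toks in requests.values()]
            self._tensor = torch.tensor(rows, dtype=torch.long)
            self._key = key
        return self._tensor

cache = PromptTokensCache()
batch = {"req-1": [101, 7592], "req-2": [101, 2088, 999]}
a = cache.get(batch)   # built on the first decode step
b = cache.get(batch)   # reused on subsequent steps
assert a is b
```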
@michalkuligowski

/run-gaudi-tests

@michalkuligowski

/run-gaudi-tests

@wpyszka left a comment

approved

@wpyszka merged commit 62856c7 into v1.23.0_next Oct 16, 2025
46 checks passed
@wpyszka deleted the yeonsily/1.23_internvl branch October 16, 2025 12:19