forked from vllm-project/vllm
Enable internVL #1997
Merged
+961
−180
Conversation
Fixes HPU graph issues for gemma3 vision inputs:
- Text warmup now includes attn_mask info, so vision+text data can reuse the graph already warmed up for the language model.
- Changed slicing to index_select for multimodal bucketing on HPU; slicing does not produce the same hash for the HPU graph even with the same input shape (see the sketch below).
- Use buckets for the vision tower as well, to reduce graph-compiler recompilation.
- Accuracy bug fix by cloning the output of the multimodal projector. Validated with the MuirBench dataset.
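A minimal sketch of the slicing-to-index_select change described above, assuming a bucket-padded tensor on HPU; the function and variable names are illustrative, not the PR's actual code:

```python
import torch

def gather_valid_rows(padded: torch.Tensor, num_valid: int) -> torch.Tensor:
    """Select the first `num_valid` rows from a bucket-padded tensor.

    Plain slicing (padded[:num_valid]) can hash differently across HPU graph
    captures even for the same input shape; index_select with an explicit
    index tensor keeps the captured graph reusable.
    """
    idx = torch.arange(num_valid, device=padded.device)
    return padded.index_select(0, idx)

# Cloning the multimodal-projector output before handing it to the language
# model (e.g. vision_embeds = projector_out.clone()) detaches it from any
# buffer a cached graph might later overwrite, which is the accuracy fix
# mentioned above.
```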
Add the missing modelscope package - the `VLLM_USE_MODELSCOPE` environment variable does not work without it.
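For reference, a minimal offline-inference sketch that relies on the ModelScope path; the model id is illustrative, and the environment variable must be set before vLLM is imported:

```python
import os

# VLLM_USE_MODELSCOPE tells vLLM to fetch weights from ModelScope instead of
# the Hugging Face Hub; it does not work if the `modelscope` package is not
# installed (pip install modelscope), hence the added dependency.
os.environ["VLLM_USE_MODELSCOPE"] = "True"

from vllm import LLM  # imported after the env var is set

# Illustrative model id only; any ModelScope-hosted InternVL checkpoint works.
# llm = LLM(model="OpenGVLab/InternVL2-8B", trust_remote_code=True)
```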
## Essential Elements of an Effective PR Description Checklist
- [ ] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
- [ ] The test plan, such as providing test command.
- [ ] The test results, such as pasting the results comparison before and after, or e2e results.

## Purpose

## Test Plan

## Test Result

Signed-off-by: Libin Tang <[email protected]>
Co-authored-by: Libin Tang <[email protected]>
Introduce `VLLM_WARMUP_WITH_PENALTY` so the penalty-application code in the sampler is also exercised during warmup. https://github.com/HabanaAI/vllm-fork/blob/libint/intervl_bucket/vllm/model_executor/layers/sampler.py#L280 is not called during warmup, which causes an extra graph compile at runtime when penalties switch on for a real run.
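A hedged sketch of the idea, not the actual sampler change; the flag name follows the PR, and the helper function is hypothetical:

```python
import os

# Force the penalty branch of the sampler to be taken during warmup so the
# corresponding HPU graph is compiled ahead of time, instead of triggering a
# recompile the first time a real request enables penalties.
VLLM_WARMUP_WITH_PENALTY = os.environ.get(
    "VLLM_WARMUP_WITH_PENALTY", "0") in ("1", "true", "True")

def apply_penalties_flag(do_penalties: bool, is_warmup: bool) -> bool:
    # Hypothetical helper: decide whether the penalty code path should run
    # for this forward pass.
    return do_penalties or (is_warmup and VLLM_WARMUP_WITH_PENALTY)
```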
In sampling, when penalties are applied, the `prompt_tokens` tensor is regenerated for every decode step, which takes time. Instead, we can cache it and reset the cache when the set of requests changes.
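A minimal sketch of the caching idea, assuming the penalty path currently rebuilds a padded prompt-token tensor on every decode step; the class and method names are illustrative, not vLLM's API:

```python
from typing import Callable, Optional, Sequence

import torch

class PromptTokensCache:
    """Rebuild the prompt-token tensor only when the request set changes."""

    def __init__(self) -> None:
        self._key: Optional[frozenset] = None
        self._tensor: Optional[torch.Tensor] = None

    def get(self, request_ids: Sequence[str],
            build_fn: Callable[[], torch.Tensor]) -> torch.Tensor:
        key = frozenset(request_ids)
        if key != self._key:
            # Request set changed: invalidate and rebuild the padded tensor
            # used by the repetition/presence/frequency penalty computation.
            self._key = key
            self._tensor = build_fn()
        return self._tensor
```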
Co-authored-by: Libin Tang <[email protected]>
libinta reviewed Oct 3, 2025
libinta reviewed Oct 3, 2025
libinta reviewed Oct 3, 2025
/run-gaudi-tests
/run-gaudi-tests
michalkuligowski approved these changes Oct 10, 2025
wpyszka approved these changes Oct 16, 2025
approved