Continuous Batching for VLMs #610

asmigosw · 2025-11-05T09:22:50Z

Adds continuous batching (CB) support for the following VLMs:

Llava
Gemma3
Mistral3
InternVL2_5
InternVL3_5
Molmo

Signed-off-by: Asmita Goswami <[email protected]>

quic-dhirajku

LGTM

quic-mamta · 2025-11-28T05:29:14Z

QEfficient/generation/embedding_handler.py

        Args:
            vision_session: QAICInferenceSession for vision model
            processor: AutoImageProcessor for image preprocessing
+            tokenizer: AutoTokenizer for text tokenization


Please also document the image height and width arguments in this Args section, and check and update the Args sections in other places as well.

quic-mamta · 2025-11-28T05:30:19Z

QEfficient/generation/embedding_handler.py

+        Args:
+            image_url: URL or path to image
+            query: Text query to process with image
+        prompt = [query]


Is this required here?

quic-mamta · 2025-11-28T05:34:13Z

QEfficient/generation/embedding_handler.py

+                image = Image.open(requests.get(image_url, stream=True).raw)
+            else:
+                image = Image.open(image_url)
+            image = image.resize((536, 354))


Should we check self._image_height and self._image_width first, and resize to these default dimensions only when they were not passed?
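
One way to read this suggestion, as a minimal sketch: prefer the caller-supplied dimensions and fall back to the hard-coded 536 x 354 only when they are missing. The helper name `resolve_resize_dims` and its parameters are hypothetical and not part of QEfficient. The sketch also assumes the tuple in the diff is ordered (width, height), which is the order PIL's `Image.resize` expects.

```python
from typing import Optional, Tuple

# Defaults copied from the hard-coded resize in the diff,
# assumed here to be ordered (width, height).
DEFAULT_WIDTH = 536
DEFAULT_HEIGHT = 354


def resolve_resize_dims(
    image_height: Optional[int] = None,
    image_width: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (width, height), using the default for any dimension not passed."""
    width = image_width if image_width is not None else DEFAULT_WIDTH
    height = image_height if image_height is not None else DEFAULT_HEIGHT
    if width <= 0 or height <= 0:
        raise ValueError(f"image dimensions must be positive, got {width}x{height}")
    return (width, height)
```

Inside the handler this could be called as `image = image.resize(resolve_resize_dims(self._image_height, self._image_width))`, assuming those attributes are `None` when the caller did not set them.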

quic-mamta · 2025-11-28T05:35:42Z

QEfficient/generation/embedding_handler.py

            else:
                image = Image.open(image_url)

+            if "mistral3" in self._qeff_model.model.config.model_type:


Same as above. Also, please update the Args section of this function's docstring.

quic-mamta · 2025-11-28T06:02:33Z

tests/transformers/models/test_image_text_to_text_models.py

+    full_batch_size = 4
+    queries = [query] * full_batch_size
+
+    pytorch_hf_tokens = [pytorch_hf_tokens] * 4


Is there any reason we are not using run_vlm_hf_model_pytorch_CB here?
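
Separately from the choice of helper, the snippet above sizes `queries` with `full_batch_size` but repeats the reference tokens with a literal `4`, so the two would drift apart if the batch size ever changed. A minimal sketch of keeping them in sync follows. `replicate_for_batch` is a hypothetical helper, not an existing test utility, and it assumes the reference tokens are a flat sequence.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def replicate_for_batch(
    query: str,
    reference_tokens: Sequence[T],
    full_batch_size: int,
) -> Tuple[List[str], List[List[T]]]:
    """Repeat one query and its reference tokens once per batch slot."""
    if full_batch_size < 1:
        raise ValueError("full_batch_size must be at least 1")
    queries = [query] * full_batch_size
    # Copy the tokens per slot so that rows do not alias the same list object.
    expected_tokens = [list(reference_tokens) for _ in range(full_batch_size)]
    return queries, expected_tokens
```

With this shape, the test would derive both lists from the single `full_batch_size` value instead of repeating the number.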

asmigosw requested review from ochougul, quic-amitraj, quic-hemagnih and quic-rishinr as code owners November 5, 2025 09:22

asmigosw marked this pull request as draft November 10, 2025 12:29

asmigosw added 5 commits November 16, 2025 16:19

Continuous Batching for VLMs

999068b

Signed-off-by: Asmita Goswami <[email protected]>

Added CB support for InternVL

1220cf9

Signed-off-by: Asmita Goswami <[email protected]>

Added CB support for Mistral3

c39ae01

Signed-off-by: Asmita Goswami <[email protected]>

Updated test_image_text_to_text for CB tests

39f5c16

Signed-off-by: Asmita Goswami <[email protected]>

Ruff format

9a42a08

Signed-off-by: Asmita Goswami <[email protected]>

asmigosw force-pushed the CB_VLM_update branch from b89ea66 to 9a42a08 Compare November 16, 2025 16:30

asmigosw added 3 commits November 16, 2025 16:54

Added CB update for Molmo

c1465c8

Signed-off-by: Asmita Goswami <[email protected]>

Added mistral CB support

a6f1182

Signed-off-by: Asmita Goswami <[email protected]>

Merge branch 'main' into CB_VLM_update

9e658bc

asmigosw marked this pull request as ready for review November 19, 2025 09:30

quic-rishinr requested review from quic-dhirajku and quic-mamta November 19, 2025 10:07

Merge branch 'main' into CB_VLM_update

a6ee63f

asmigosw marked this pull request as draft November 20, 2025 05:58

asmigosw added 2 commits November 20, 2025 10:14

Added CB Test for InternVL

94552e0

Signed-off-by: Asmita Goswami <[email protected]>

Ruff format

e8af917

Signed-off-by: Asmita Goswami <[email protected]>

asmigosw marked this pull request as ready for review November 20, 2025 10:19

quic-xiyushi mentioned this pull request Nov 20, 2025

Extend on-device sampling support for dual QPC VLMs #597

Open

Merge branch 'main' into CB_VLM_update

f8d67e4

quic-dhirajku approved these changes Nov 25, 2025

asmigosw added 5 commits November 25, 2025 11:48

Merge branch 'main' into CB_VLM_update

7ed78bc

Resolving CI issues

eea2ffa

Signed-off-by: Asmita Goswami <[email protected]>

Added InternVL example file for CB

ee54215

Signed-off-by: Asmita Goswami <[email protected]>

Merge branch 'main' into CB_VLM_update

542d60f

Merge branch 'main' into CB_VLM_update

77d07ea

Merge branch 'main' into CB_VLM_update

b8b2299

quic-mamta reviewed Nov 28, 2025
