@asmigosw asmigosw commented Nov 5, 2025

Adding continuous batching (CB) support for the following VLMs:

  1. Llava
  2. Gemma3
  3. Mistral3
  4. InternVL2_5
  5. InternVL3_5
  6. Molmo

Signed-off-by: Asmita Goswami <[email protected]>
@asmigosw asmigosw marked this pull request as ready for review November 19, 2025 09:30
@asmigosw asmigosw marked this pull request as draft November 20, 2025 05:58
Signed-off-by: Asmita Goswami <[email protected]>
@asmigosw asmigosw marked this pull request as ready for review November 20, 2025 10:19

@quic-dhirajku quic-dhirajku left a comment

LGTM

Args:
    vision_session: QAICInferenceSession for vision model
    processor: AutoImageProcessor for image preprocessing
    tokenizer: AutoTokenizer for text tokenization

@quic-mamta quic-mamta Nov 28, 2025

Please add the image height and width arguments to this docstring, and check and update the args in the other places as well.

Args:
    image_url: URL or path to image
    query: Text query to process with image
prompt = [query]

Is this required here?

    image = Image.open(requests.get(image_url, stream=True).raw)
else:
    image = Image.open(image_url)
image = image.resize((536, 354))

Should we check self._image_height and self._image_width first, and resize to this default shape only if they were not passed?
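
A minimal sketch of that suggestion, not the code in this PR. It assumes the two attributes are None when the caller did not pass them; the helper name pick_resize_shape and the two DEFAULT_* constants are hypothetical. The default of 536 by 354 is taken from the hard-coded resize in the diff above, read as (width, height) because that is the order PIL's Image.resize expects.

```python
from typing import Optional, Tuple

# Default shape taken from the hard-coded resize in the diff: (width, height).
DEFAULT_WIDTH, DEFAULT_HEIGHT = 536, 354


def pick_resize_shape(
    image_height: Optional[int], image_width: Optional[int]
) -> Tuple[int, int]:
    """Return (width, height), the order PIL's Image.resize expects.

    Assumption: a value of None means the caller did not pass that dimension.
    The default is used unless both dimensions were provided.
    """
    if image_height is not None and image_width is not None:
        return (image_width, image_height)
    return (DEFAULT_WIDTH, DEFAULT_HEIGHT)


# Possible call site, assuming `image` is a PIL.Image.Image:
#   image = image.resize(pick_resize_shape(self._image_height, self._image_width))
```

Falling back to the default when only one dimension is given is a design choice of this sketch; raising an error in that case would be an equally reasonable option.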

else:
    image = Image.open(image_url)

if "mistral3" in self._qeff_model.model.config.model_type:

@quic-mamta quic-mamta Nov 28, 2025

Same as above. Also, please update the args in this function's docstring.

full_batch_size = 4
queries = [query] * full_batch_size

pytorch_hf_tokens = [pytorch_hf_tokens] * 4

Is there any reason we are not using run_vlm_hf_model_pytorch_CB here?
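
If the single-prompt reference is kept and replicated, one option is to tie the replication to full_batch_size instead of the literal 4. This is a rough sketch only: the helper replicate_reference, the example query, and the token values are hypothetical, and it does not stand in for run_vlm_hf_model_pytorch_CB, whose signature is not shown in this conversation.

```python
from typing import List, Sequence


def replicate_reference(tokens: Sequence[int], full_batch_size: int) -> List[List[int]]:
    """Return one independent copy of the reference tokens per batch slot."""
    if full_batch_size < 1:
        raise ValueError("full_batch_size must be at least 1")
    # Copy per slot so that mutating one entry cannot affect the others.
    return [list(tokens) for _ in range(full_batch_size)]


full_batch_size = 4
query = "Describe the image."       # placeholder prompt
queries = [query] * full_batch_size
single_prompt_tokens = [101, 7, 42]  # placeholder token ids
pytorch_hf_tokens = replicate_reference(single_prompt_tokens, full_batch_size)

# The expected outputs now always match the number of queries.
assert len(pytorch_hf_tokens) == len(queries)
```

This only holds while every slot receives the same query and image; with distinct prompts per slot, each one needs its own reference run.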

3 participants