Add colqwen3 embed #651

bulutyigit · 2025-12-30T07:39:25Z

PR Description

What
Adds ColQwen3 (Tomoro tomoro-colqwen3-embed-8b) support to mlx-vlm for multimodal retrieval embeddings.

Why
Tomoro ColQwen3 is a ColBERT-style multi-vector embedding model on top of a Qwen3-VL backbone.
Without native support, MLX users cannot convert/load the checkpoint or produce token-level embeddings.

Changes

- Adds colqwen3 model type integration for convert/load
- Implements embedding_proj_layer and encode() to output token-level embeddings [B, T, D]
- Adds helper APIs:
  - encode_queries(processor, texts)
  - encode_images(processor, images) (returns visual-token embeddings; ideal for PDF patches)
  - maxsim(q, d) for ColBERT MaxSim scoring
- Adds weight-key sanitization to map Tomoro/HF keys to MLX module names (vlm.model.* → vlm.*)
- Fixes the hidden forward path (the embedding extraction path) so that it correctly respects masks
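
For readers unfamiliar with ColBERT-style scoring, here is a minimal NumPy sketch of what MaxSim computes. It is an illustration only, not the code in this PR: the function names `l2_normalize` and `maxsim_score` are hypothetical, it works on a single query/document pair rather than a batch, and it assumes the token embeddings are L2-normalized before scoring.

```python
import numpy as np


def l2_normalize(x: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each row (one token embedding) to unit length."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def maxsim_score(q: np.ndarray, d: np.ndarray) -> float:
    """ColBERT-style late-interaction score for one query and one document.

    q: [Tq, D] query token embeddings
    d: [Td, D] document (text or visual) token embeddings
    """
    sim = q @ d.T                 # [Tq, Td] token-to-token similarities
    best_per_query = sim.max(1)   # best-matching document token per query token
    return float(best_per_query.sum())


# Toy example: 2 query tokens, 3 document tokens, D = 2.
q = l2_normalize(np.array([[1.0, 0.0], [0.0, 2.0]]))
d = l2_normalize(np.array([[3.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))
score = maxsim_score(q, d)
```

Each query token contributes the similarity of its single best-matching document token, so a document scores highly only if it covers every query token somewhere among its tokens or patches.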

Testing

- Verified load(<HF repo>, trust_remote_code=True) works
- Verified text embeddings + image embeddings produce valid shapes and MaxSim scores

Notes
This PR focuses on embedding usage, not generation. No changes to public generation APIs expected.

- Add colqwen3 model type for mlx_vlm.convert/load
- Implement ColBERT-style multi-vector embedding via embedding_proj_layer
- Add weight-key sanitization for Tomoro checkpoints (vlm.model.* -> vlm.*)
- Provide encode/encode_queries/encode_images helpers and MaxSim scoring
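
The key sanitization amounts to a prefix rename over the checkpoint's weight dictionary. A minimal sketch, assuming a plain prefix replacement is enough; the function name `sanitize_keys` and the example key names are hypothetical, and the actual PR code may handle more cases:

```python
def sanitize_keys(weights: dict, old_prefix: str = "vlm.model.", new_prefix: str = "vlm.") -> dict:
    """Return a new dict with `old_prefix` replaced by `new_prefix` on matching keys.

    Keys that do not start with `old_prefix` are passed through unchanged.
    """
    renamed = {}
    for key, value in weights.items():
        if key.startswith(old_prefix):
            key = new_prefix + key[len(old_prefix):]
        renamed[key] = value
    return renamed


# Hypothetical checkpoint keys; the values stand in for weight arrays.
checkpoint = {
    "vlm.model.layers.0.weight": "w0",
    "embedding_proj_layer.weight": "w1",
}
sanitized = sanitize_keys(checkpoint)
```

Only the leading prefix is rewritten, so a key that merely contains `model.` further along its path is left alone.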

Blaizzy · 2026-01-02T13:16:52Z

Hey @bulutyigit,
Happy new year, this is an awesome addition!
I actually built a package called mlx-embeddings which would be the perfect home for this port. Would you mind redirecting the PR there? I'll review and merge it there.
Thanks for the contribution!

bulutyigit added 3 commits December 28, 2025 12:32

colqwen3 embed add

059f5ce

debug

068447b
