-
Notifications
You must be signed in to change notification settings - Fork 99
Add multi-modal search #1998
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add multi-modal search #1998
Conversation
WalkthroughExpands runtime experimental feature flags, adds types for media and embedder fragments, introduces multimodal end-to-end tests and fixtures, updates Vite test env loading for tests, and adds a documentation code sample for multimodal/hybrid search. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Test as Test Suite
participant SDK as JS SDK
participant MS as Meilisearch
rect rgba(232,240,255,0.6)
Test->>SDK: PATCH /experimental-features { multimodal: true }
SDK->>MS: PUT /experimental-features
MS-->>SDK: 200 OK
end
rect rgba(255,250,220,0.6)
Test->>SDK: PATCH /indexes/{uid}/settings (embedder with indexingFragments & searchFragments)
SDK->>MS: PATCH /indexes/{uid}/settings
MS-->>SDK: 202 Accepted
end
rect rgba(255,238,255,0.6)
Test->>SDK: POST /indexes/{uid}/documents (movies fixture)
SDK->>MS: POST /indexes/{uid}/documents
MS-->>SDK: 202 Accepted
end
rect rgba(221,255,221,0.6)
Test->>SDK: POST /indexes/{uid}/search (q and/or media payload)
SDK->>MS: POST /indexes/{uid}/search (media + fragments)
MS-->>SDK: 200 OK (multimodal-ranked hits)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Assessment against linked issues
Assessment against linked issues: Out-of-scope changes
Possibly related PRs
Suggested reviewers
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 💡 Knowledge Base configuration:
You can enable these sources in your CodeRabbit configuration. 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1998 +/- ##
=======================================
Coverage 98.83% 98.83%
=======================================
Files 19 19
Lines 1550 1550
Branches 334 334
=======================================
Hits 1532 1532
Misses 18 18 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Waiting for #2000 to fix tests |
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (6)
src/types/types.ts (2)
240-244
: Clarify and alias the media payload type to reduce ambiguity and improve reuseCurrent shape is correct but opaque. Consider aliasing and documenting the expected forms (URL vs base64 blob) and that
data
should be base64-encoded. This avoids repetition and helps downstream users.Apply within-range change:
- media?: Record< - string, - Record<string, string | { mime: string; data: string }> - >; + media?: MediaPayload;Add these type aliases near other shared types:
// Media payloads can be either: // - string: typically an absolute URL or data: URL // - object: base64 data with explicit MIME export type MediaBinary = { mime: string; data: string }; export type MediaPayload = Record<string, Record<string, string | MediaBinary>>;
594-606
: Deduplicate fragment shapes with a shared alias and add a brief docBoth fragment maps share the same structure. Use a single alias to keep them in sync and add a short comment to clarify intent.
Apply within-range change:
- indexingFragments?: Record< - string, - { - value: RecordAny; - } - >; - searchFragments?: Record< - string, - { - value: RecordAny; - } - >; + indexingFragments?: EmbedderFragments; + searchFragments?: EmbedderFragments;Add alongside other embedder types:
/** Name → JSON fragment injected into the REST embedder's request */ export type EmbedderFragments = Record<string, { value: RecordAny }>;tests/fixtures/movies.json (1)
1-90
: Avoid network flakiness from remote poster URLs in testsExternal TMDB URLs can cause nondeterministic failures (rate limits, outages). Prefer local fixture images or embedded base64 for stability.
If keeping URLs, gate network usage behind a flag and fall back to local placeholders when offline.
tests/multi_modal_search.test.ts (3)
168-174
: Keep search API usage consistent; avoid mixingq
in options with positional query.Prefer passing the query via the first argument (as done in other tests) for clarity.
-const response = await searchClient.index(INDEX_UID).search(null, { - q: query, +const response = await searchClient.index(INDEX_UID).search(query, {Reference: Search endpoint uses
q
in the HTTP body; SDK maps the first arg to it. (meilisearch.com)
119-133
: Clean up test artifacts.Add an
afterAll
to delete the index so repeated CI runs don’t accumulate test indexes.describe.skipIf(!VOYAGE_API_KEY)("Multi-modal search", () => { let searchClient: Meilisearch; @@ beforeAll(async () => { const client = await getClient("Admin"); @@ searchClient = await getClient("Search"); }); + + afterAll(async () => { + const client = await getClient("Admin"); + await client.index(INDEX_UID).delete(); + });
116-149
: Flakiness guard looks good; consider recording why expectations map to these movies.Assertions depend on the movies fixture semantics. If the fixture changes, tests may fail. Consider pinning the exact records (ids) in comments to aid maintenance.
Also applies to: 151-166, 168-189
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
tests/fixtures/master-yoda.jpeg
is excluded by!**/*.jpeg
📒 Files selected for processing (6)
src/types/experimental-features.ts
(1 hunks)src/types/types.ts
(2 hunks)tests/experimental-features.test.ts
(1 hunks)tests/fixtures/movies.json
(1 hunks)tests/multi_modal_search.test.ts
(1 hunks)vite.config.ts
(2 hunks)
🧰 Additional context used
🪛 Biome (2.1.2)
tests/multi_modal_search.test.ts
[error] 4-4: Expected a semicolon or an implicit semicolon after a statement, but found none
An explicit or implicit semicolon is expected here...
...Which is required to end this statement
(parse)
[error] 4-4: Expected a semicolon or an implicit semicolon after a statement, but found none
An explicit or implicit semicolon is expected here...
...Which is required to end this statement
(parse)
🔇 Additional comments (7)
vite.config.ts (1)
1-1
: Import looks goodThe new loadEnv import is appropriate for Vitest env hydration.
src/types/experimental-features.ts (1)
7-16
: New toggles align with 1.16 featuresAdditions for chatCompletions, compositeEmbedders, and multimodal look consistent and non-breaking.
tests/experimental-features.test.ts (2)
9-18
: Teardown coverage is completeDisabling all toggles via a mapped type ensures future additions surface as compile errors. Good pattern.
23-32
: Enablement matrix LGTMThe positive matrix mirrors the teardown and validates round-trip correctness.
tests/multi_modal_search.test.ts (3)
24-24
: Embedding dimension 1024 matches voyage-multimodal-3.Good choice; the model’s embedding dimension is 1024.
Reference: Voyage docs list
voyage-multimodal-3
embedding dimension as 1024. (docs.voyageai.com)
64-100
: Fragment shapes look correct for Voyage (text, image_url, image_base64).
indexingFragments
/searchFragments
use the rightcontent
schema and types for Voyage. Ensuremedia
in queries matches exactly one search fragment at a time to avoid server errors.Reference: Meilisearch multimodal fragments and
media
matching rules. (meilisearch.dev, meilisearch.com)
65-77
: Confirmimage_base64
usage matches Voyage’s expected format.The
type: "image_base64"
with adata:{mime};base64,{data}
string is valid for Voyage; just ensure the base64 is prefixed as you do here.Reference: Voyage FAQ example uses
type: "image_base64"
with a data-URL string. (docs.voyageai.com)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
tests/multi_modal_search.test.ts (1)
106-110
: REST embedder usesinput
, notinputs
— request will 400.Voyage’s endpoint expects
input
. Usinginputs
breaks requests and the fragments won’t be embedded.request: { // This request object matches the Voyage API request object - inputs: ["{{fragment}}", "{{..}}"], + input: ["{{fragment}}", "{{..}}"], model: "voyage-multimodal-3", },
🧹 Nitpick comments (7)
src/types/types.ts (2)
214-231
: Clarify MediaBinary payload (no data URL prefix).To avoid misuse, explicitly state that
data
must be raw base64 without adata:
prefix (you already prepend it in fragments).export type MediaBinary = { /** MIME type of the file */ mime: string; - /** Base64-encoded data of the file */ + /** Base64-encoded payload only (no `data:` prefix). */ data: string; };
601-626
: Make EmbedderFragments generic for better type inference.Keeps backward compatibility while letting consumers narrow
value
to a model-specific shape.-export type EmbedderFragments = Record<string, { value: RecordAny }>; +export type EmbedderFragments<T = RecordAny> = Record<string, { value: T }>;tests/multi_modal_search.test.ts (5)
4-4
: Keep JSON import assertion for Node 18; fix Biome parsing instead.Node 18 requires
assert { type: "json" }
. Address the Biome parse error via tooling (upgrade Biome to a version that supports import attributes or exclude this test file from Biome parsing), rather than changing the import.Example biome.json tweak (pick one):
- Ignore tests:
{ "files": { "ignore": ["tests/**/*.test.ts"] } }
- Or upgrade Biome and enable import attributes support in parser if available in your version.
125-133
: Make index deletion idempotent to avoid flakiness on fresh runs.- // Delete the index if it already exists - await client.index(INDEX_UID).delete().waitTask(); + // Delete the index if it already exists (ignore 404) + try { + await client.index(INDEX_UID).delete().waitTask(); + } catch (e) { + if ((e as any)?.code !== "index_not_found") throw e; + }
125-141
: Clean up experimental flag after tests to avoid leaking state.Add this after the suite:
afterAll(async () => { const client = await getClient("Admin"); try { await client.updateExperimentalFeatures({ multimodal: false }); } catch {} try { await client.index(INDEX_UID).delete().waitTask(); } catch (e) { if ((e as any)?.code !== "index_not_found") throw e; } });
176-195
: Be consistent: pass the query via the first argument instead of mixingq
in options.- const response = await searchClient.index(INDEX_UID).search(null, { - q: query, + const response = await searchClient.index(INDEX_UID).search(query, { media: { textAndPoster: { text: query, image: { mime: "image/jpeg", data: masterYodaBase64, }, }, }, hybrid: { embedder: EMBEDDER_NAME, semanticRatio: 1, }, });
95-104
: Minor naming nit: preferurl
over repeatingposter.poster
.Purely cosmetic, but
media.poster.url
reads clearer thanmedia.poster.poster
. If you agree, rename the property and update the correspondingimage_url
template.Also applies to: 163-167
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
src/types/types.ts
(4 hunks)tests/multi_modal_search.test.ts
(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-03T09:36:56.618Z
Learnt from: Strift
PR: meilisearch/meilisearch-js#1998
File: tests/multi_modal_search.test.ts:4-4
Timestamp: 2025-09-03T09:36:56.618Z
Learning: Import assertions with `assert { type: "json" }` are required for JSON imports in Node.js 18 and should not be removed for Node.js 18 compatibility.
Applied to files:
tests/multi_modal_search.test.ts
🧬 Code graph analysis (1)
tests/multi_modal_search.test.ts (1)
src/types/types.ts (1)
Embedder
(661-668)
🪛 Biome (2.1.2)
tests/multi_modal_search.test.ts
[error] 4-4: Expected a semicolon or an implicit semicolon after a statement, but found none
An explicit or implicit semicolon is expected here...
...Which is required to end this statement
(parse)
[error] 4-4: Expected a semicolon or an implicit semicolon after a statement, but found none
An explicit or implicit semicolon is expected here...
...Which is required to end this statement
(parse)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: integration-tests (Node.js 20)
- GitHub Check: integration-tests (Node.js 22)
🔇 Additional comments (4)
src/types/types.ts (3)
233-235
: MediaPayload shape is appropriate and flexible.
262-262
: Addingmedia
toSearchParams
is correct; omitting it from GET params is prudent.
639-641
: Fragments supported on RestEmbedder look good.This aligns with 1.16 multi-modal; no issues spotted.
tests/multi_modal_search.test.ts (1)
12-21
: Good ESM-safe file path resolution.Using
dirname(fileURLToPath(import.meta.url))
avoids__dirname
pitfalls under ESM/Vitest.
Pull Request
Related issue
Fixes #1995
What does this PR do?
indexingFragments
searchFragments
VITE_VOYAGE_API_KEY
from VoyageAI)PR checklist
Please check if your PR fulfills the following requirements:
Thank you so much for contributing to Meilisearch!
Summary by CodeRabbit
New Features
Tests
Documentation
Chores