Local-first vector similarity search for Jazz
Store and query high-dimensional vectors directly on-device.
Built on Jazz, you get:
- Local-first sync across devices
- End-to-end encryption
- Real-time multiplayer
When paired with a local embeddings model:
- Works fully offline
// -- schema.ts --
export const JournalEntry = co.map({
  text: z.string(),
  simpleEmbedding: coV.vector(384), // <--- Define CoVector
});
// -- app.tsx --
const { search, isSearching } = useCoVectorSearch(
  journalEntries,
  (entry) => entry.simpleEmbedding,
  queryEmbedding
);
- Semantic search: Find notes, docs, or messages by meaning
- Personalization: On-device recommendations and adaptive UIs
- Knowledge management: Organize personal wikis, journals, or research by concept rather than keyword
- Information matching: Connect datasets or peers through embeddings
- Context-aware assistants: Build local-first AI helpers that understand the user’s data while keeping it private
- Cross-device continuity: Carry embeddings seamlessly across phone, tablet, and desktop without a cloud backend
- Creative apps: Enable music, art, or writing tools that find related ideas, motifs, or inspirations
More cool use cases, as per AI
Search & organization
- Semantic search: Find notes, docs, emails, or messages by meaning
- Smart tagging & clustering: Auto-group related items and generate topic labels
- Near-duplicate detection: Merge similar notes, photos, or files
- Cross-app search: Index clipboard, screenshots, and files for one-place recall
Personalization & recommendations
- On-device recommendations: Rank feeds, reading lists, or media without sending data to the cloud
- Context-aware shortcuts: Suggest next actions based on what the user is doing
- Session re-ranking: Personalize command palettes, search results, or menus
Retrieval for AI (local RAG)
- Context fetching for LLMs: Retrieve relevant chunks from local docs to ground responses
- Conversation memory: Pull past chats or notes that match the current topic
- Snippet linking: Auto-link related passages across notebooks or PDFs
Media & sensors
- Photo & screenshot search: “Find images with whiteboard notes from last week”
- Audio similarity: Locate related voice notes, music snippets, or sound effects
Collaboration & P2P
- Peer-to-peer matching: Align embeddings between devices to find shared interests or files
- Team knowledge linking: Connect related docs across teammates without centralizing raw data
- Federated discovery: Share pointers/IDs instead of content; keep source data private
Productivity & dev workflows
- Code search: Semantic lookup of functions, symbols, and snippets in local repos
- Issue triage: Match new bugs to similar past reports or fixes
- Research assistants: Cluster and surface related papers, highlights, and annotations
Safety & housekeeping
- Content filtering: On-device NSFW/spam heuristics using similarity
- Anomaly detection: Spot outliers in logs or metrics locally
- Storage hygiene: Identify stale or redundant items to archive
Install the package from npm:
npm i jazz-vector
Requires jazz-tools (minimum 0.17.4) to be already installed.
Jazz Vector only deals with storage and search. You generate the vectors with any model you like (OpenAI, Hugging Face, or custom), then feed the vectors in.
On-device option (recommended): Use Transformers.js to run models locally for offline, private embedding:
- Xenova/all-MiniLM-L6-v2: 384-dim, ~23 MB
- More models →
Alternatively, you can call a server-side model (your own or a commercial one like OpenAI), but note that this removes offline support and may affect user privacy.
Jazz Vector exposes a new CoVector value type that you should use to define vector embeddings.
It expects the number of dimensions of your embeddings.
export const Embedding = coV.vector(384);
Currently, you can perform a vector search only across a CoList of CoMaps that contain an embedding property. For other data structures, see the “Manual Index” pattern.
// schema.ts
import { co, z } from "jazz-tools";
import { coV } from "jazz-vector";
// 1) Define an embedding vector schema with expected dimension count
export const Embedding = coV.vector(384);
export const JournalEntry = co.map({
  text: z.string(),
  // 2) Use an embedding schema inside an entity
  embedding: Embedding,
});
// 3) Define a searchable CoList of items containing embeddings property
export const JournalEntryList = co.list(JournalEntry);
Since CoVector is a simple wrapper around Jazz's built-in FileStream, all the FileStream patterns apply (permissions, loading, etc.).
It is recommended to generate the embedding vector at the time of writing a CoValue, because:
- the writer naturally owns the data
- the new CoValue will be automatically indexed for all subsequent reader peers
Alternatively, you can create the embeddings in a server worker after the CoValue has been created; Jazz will sync them automatically.
Instantiate a CoVector using the createFrom method:
await Embedding.createFrom([0.018676748499274254, -0.06785402446985245,...])
A CoVector instance can then be assigned wherever the schema expects one.
// create.ts
import { JournalEntry, Embedding } from "./schema.ts";
import { createEmbedding } from "./your-code";
// 1) Generate embeddings (bring your own embeddings model)
const vector: number[] = await createEmbedding("Text");
const journalEntry = JournalEntry.create({
  text: "Text",
  // 2) Instantiate and assign a `CoVector` from a specific vector (`number[]`)
  embedding: await Embedding.createFrom(vector),
});
journalEntries.push(journalEntry);
The vector search is performed locally in memory on top of Jazz's CoList.
As such, you need to first load the CoList you wish to search across manually.
// app.tsx
import { useCoState } from "jazz-tools/react";
import { useCoVectorSearch } from "jazz-vector/react";
import { JournalEntryList } from "./schema.ts";
// 1) Load a searchable list (that has elements containing embeddings)
const journalEntries = useCoState(
  JournalEntryList,
  me.root.journalEntries.id,
  { resolve: { $each: true } }
);
Then, pass the searchable list along with:
- a getter for the embedding vector property on the list item
- the embedding vector for the search query (or null, which passes your list through unchanged)
// 2) Search the list
const { search, isSearching } = useCoVectorSearch(
  journalEntries,              // <- loaded list to search in
  (entry) => entry.embedding,  // <- embedding property getter on each list item
  queryEmbedding               // <- embeddings of search query (number[]), or null to pass through
);
You can filter the data before passing it to CoVectorSearch to search on a subset of your list.
There are 2 search functions available:
- useCoVectorSearch hook for React apps
- searchCoVector function for server workers or vanilla JS
Currently, vector search works only across a CoList of CoMaps that contain an embedding property. To search data stored in a different data structure (or across multiple ones), you'll need to construct and maintain a searchable list manually.
For example, suppose you have a recursive Block schema.
// -- schema.ts
import { co, z } from "jazz-tools";
// Recursive data structure
const Block = co.map({
  text: z.string(),
  get childBlock() {
    return Block.optional();
  },
  get parentBlock() {
    return Block.optional();
  },
});
You can construct a simplified list of searchable objects that hold the embedding vector and a reference to the original Block instance.
// -- schema.ts
import { co, z } from "jazz-tools";
import { coV } from "jazz-vector";
const Block = co.map({ ... });
// Simple embedding + reference
export const SearchableBlock = co.map({
  block: Block,
  embedding: coV.vector(1536),
});
// Flat searchable list of references with embeddings
export const BlocksIndex = co.list(SearchableBlock);
// -- query.tsx
const { search, isSearching } = useCoVectorSearch(
  searchableBlocksList,
  (block) => block.embedding,
  queryEmbedding
);
// `search.results` returns results over `SearchableBlock`
search.results.map(searchResult => {
  const searchableBlock = searchResult.value
  const block = searchableBlock.block // derefs and loads the `Block` instance
})
This pattern of manually constructing a single “index” is also useful for searching across various data types inside your app (e.g. notes, photos, messages).
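The flattening step behind a manual index can be sketched in plain TypeScript. This is only an illustration under stated assumptions: `PlainBlock`, `PlainSearchableBlock`, `buildBlocksIndex`, and `toyEmbed` are hypothetical names, plain objects stand in for the Jazz CoValues, and the toy embedding function is not a real model. In a real app you would create `SearchableBlock` CoMaps with embeddings from your own model instead.

```typescript
// Plain-object stand-ins for the Jazz schemas above (hypothetical, for illustration only).
type PlainBlock = { text: string; childBlock?: PlainBlock };
type PlainSearchableBlock = { block: PlainBlock; embedding: number[] };

// Walk the chain of child blocks and emit one flat index entry per block.
function buildBlocksIndex(
  root: PlainBlock,
  embed: (text: string) => number[],
): PlainSearchableBlock[] {
  const index: PlainSearchableBlock[] = [];
  const seen = new Set<PlainBlock>(); // guards against accidental cycles
  let block: PlainBlock | undefined = root;
  while (block !== undefined && !seen.has(block)) {
    seen.add(block);
    index.push({ block, embedding: embed(block.text) });
    block = block.childBlock;
  }
  return index;
}

// Toy embedding function: NOT a real model, just text length and vowel count.
const toyEmbed = (text: string): number[] => [
  text.length,
  (text.match(/[aeiou]/g) ?? []).length,
];

const rootBlock: PlainBlock = {
  text: "intro",
  childBlock: { text: "details", childBlock: { text: "end" } },
};
const blocksIndex = buildBlocksIndex(rootBlock, toyEmbed);
```

Each entry keeps a reference to its source block, so a search hit can be traced back to the original item.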
The lib expects you to bring your own embeddings, so you're free to use either a local or a server-side model.
Using a server-side embedding model makes sense when, for example:
- you want to optimize client app package size
- you want to offload client CPU cycles when creating embeddings for huge amounts of data
- you want larger, specialized, or proprietary models
- you want easier centralized upgrades.
The trade-offs are:
- loss of offline capability
- higher latency and failure modes (network/timeouts)
- per-request cost/rate limits
- privacy implications because user text leaves the device.
You can put embedding vectors of various dimensions on a single CoValue.
This allows you to use different embedding models for search tasks of varying difficulties, for example:
- use small simple embeddings models on the client to power the on-device search feature
- use powerful commercial embeddings models on the server for RAG
// schema.ts
export const JournalEntry = co.map({
  text: z.string(),
  simpleEmbedding: coV.vector(384),
  largeEmbedding: coV.vector(3072),
});
CoVectorSearch dereferences and loads the actual CoVector value (the embedding vector) only upon search.
// query.tsx (on the client device)
const { queryEmbeddings } = useSimpleEmbeddings(...) // returns 384-dimensional vector
const { search, isSearching } = useCoVectorSearch(
  journalEntries,
  (entry) => entry.simpleEmbedding,
  queryEmbeddings
);
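// search.ts (on the server)
const response = await openai.embeddings.create(...) // request a 3072-dimensional embedding
const queryEmbeddings = response.data[0].embedding; // the raw vector (number[])
// searchCoVector is asynchronous, so await the result
const searchResults = await searchCoVector(
  journalEntries,
  (entry) => entry.largeEmbedding,
  queryEmbeddings
);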
Defines a CoVector schema in the Jazz storage schema.
Parameter | Type | Description |
---|---|---|
dimensions | Number | The number of embedding vector dimensions (length) |
- CoVector schema (extension of Jazz's built-in FileStream schema with .createFrom method)
Creates an instance of CoVector from a CoVector schema.
Parameter | Type | Description |
---|---|---|
vector | Array of Number; or Float32Array | The raw vector data. Must have the exact dimension (length) as defined in the schema. |
options | Jazz Ownership Object (see) | Native Jazz's ownership options |
- CoVector (Jazz FileStream)
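Because the raw vector must match the schema's dimension exactly, it can help to check it before calling `createFrom`. The helper below is a hypothetical sketch (`toValidatedVector` is not part of jazz-vector). It assumes you want to fail early with a descriptive error rather than rely on whatever the library reports.

```typescript
// Hypothetical pre-check, not part of jazz-vector: validate a raw vector
// before passing it to `createFrom`.
function toValidatedVector(
  vector: ArrayLike<number>,
  dimensions: number,
): Float32Array {
  if (vector.length !== dimensions) {
    throw new RangeError(
      `Expected ${dimensions} dimensions, got ${vector.length}`,
    );
  }
  const result = Float32Array.from(vector);
  for (let i = 0; i < result.length; i++) {
    // NaN or Infinity usually indicates a bug in the embedding step.
    if (!Number.isFinite(result[i])) {
      throw new RangeError(`Non-finite value at index ${i}`);
    }
  }
  return result;
}

const checkedVector = toValidatedVector([0.5, -0.25, 1], 3);
```

The returned `Float32Array` is one of the two input types the table above lists for `vector`.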
Performs a vector search on a CoList. React hook.
Automatically recalculates the results when the searched list or query changes.
Parameter | Type | Description |
---|---|---|
list | CoList or undefined or null | An instance of CoList to search in. |
embeddingGetter | Function | Getter function for the embedding property on each list item. |
queryEmbeddings | number[] or Float32Array or null | Embedding vector for the search query. When query is null, the entire list will be passed through. |
filterOptions | { limit: N } or { similarityThreshold: N } or { similarityTopPercent: N } | Controls how many results are returned. limit sets the maximum exact number of results; similarityThreshold filters by minimum similarity score; similarityTopPercent filters N% top percents based on the highest score. Default { limit: 10 } |
Parameter | Type | Description |
---|---|---|
isSearching | Boolean | Determines whether a search is currently pending. |
search | CoVectorSearchResult (see details) | Search results. |
error | String (optional) | Eventual error from the search |
Performs a vector search on a CoList. Asynchronous function to be used in a server worker or in vanilla JS code.
Parameter | Type | Description |
---|---|---|
list | CoList or undefined or null | An instance of CoList to search in. |
embeddingGetter | Function | Getter function for the embedding property on each list item. |
queryEmbeddings | number[] or Float32Array or null | Embedding vector for the search query. When query is null, the entire list will be passed through. |
options | Object (optional) | |
filterOptions | { limit: N } or { similarityThreshold: N } or { similarityTopPercent: N } | Controls how many results are returned. limit sets the maximum exact number of results; similarityThreshold filters by minimum similarity score; similarityTopPercent filters N% top percents based on the highest score. Default { limit: 10 } |
abortSignal | AbortSignal | Adds ability to abort the search |
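To make the effect of `filterOptions` concrete, here is a rough sketch of how two of the documented options could be applied to an already sorted result list. This is not the library's code: `ScoredItem`, `SketchFilterOptions`, and `applyFilter` are hypothetical names, and treating the threshold as inclusive is an assumption. `similarityTopPercent` is left out because its exact semantics are not spelled out here.

```typescript
// Hypothetical types for this sketch only.
type ScoredItem<T> = { value: T; similarity: number };
type SketchFilterOptions = { limit: number } | { similarityThreshold: number };

// Apply a filter to results that are already sorted from highest to lowest similarity.
function applyFilter<T>(
  sortedResults: ScoredItem<T>[],
  options: SketchFilterOptions = { limit: 10 }, // mirrors the documented default
): ScoredItem<T>[] {
  if ("limit" in options) {
    return sortedResults.slice(0, options.limit);
  }
  // Assumption: a score equal to the threshold is kept.
  const threshold = options.similarityThreshold;
  return sortedResults.filter((result) => result.similarity >= threshold);
}

const ranked: ScoredItem<string>[] = [
  { value: "a", similarity: 0.9 },
  { value: "b", similarity: 0.6 },
  { value: "c", similarity: 0.2 },
];
const topTwo = applyFilter(ranked, { limit: 2 });
const confident = applyFilter(ranked, { similarityThreshold: 0.5 });
```

A fixed `limit` suits UI lists of known size, while a threshold suits cases where an empty result is better than a weak match.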
Result of the vector search call.
Has 3 variants based on input list:
- undefined when input list is undefined
- null when input list is null
- Object (see below) when input list has data
Parameter | Type | Description |
---|---|---|
didSearch | Boolean | Determines whether the search was performed or not. |
durationMs | Number (optional) | Duration of the vector search in milliseconds. |
results | Array | Array of results, sorted by similarity from highest to lowest; or the original list data if query was null |
value | CoList item type | The original item from the CoList |
similarity | Number (optional) between -1 and 1 | Similarity score of this value to the query. Will be present if the search was performed (input query was set) |
When the input query is null, the search will pass through all of the original data wrapped in the CoVectorSearchResult type (with didSearch: false and a results array without a similarity score).
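Handling the three result variants and the pass-through case might look like the sketch below. The types are local approximations of the shape described above, written for this example rather than imported from jazz-vector, and `describeOutcome` is a hypothetical helper.

```typescript
// Local types mirroring the result shape described above.
// They are assumptions for this sketch, not types imported from jazz-vector.
type ResultItem<T> = { value: T; similarity?: number };
type SearchOutcome<T> =
  | undefined
  | null
  | { didSearch: boolean; durationMs?: number; results: ResultItem<T>[] };

// Turn any of the three variants into display strings.
function describeOutcome<T>(
  outcome: SearchOutcome<T>,
  label: (item: T) => string,
): string[] {
  if (outcome === undefined) return ["(list is undefined)"];
  if (outcome === null) return ["(list is null)"];
  const { didSearch, results } = outcome;
  // Only results of a performed search carry a similarity score.
  return results.map(({ value, similarity }) =>
    didSearch && similarity !== undefined
      ? `${label(value)} (${similarity.toFixed(2)})`
      : label(value),
  );
}

const searched = describeOutcome(
  { didSearch: true, durationMs: 3, results: [{ value: "tea", similarity: 0.5 }] },
  (item) => item,
);
const passedThrough = describeOutcome(
  { didSearch: false, results: [{ value: "tea" }] },
  (item) => item,
);
```

Checking `didSearch` lets the same rendering code serve both an empty search box (pass-through) and an active query.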
The current version is the first, most basic (even naive), unoptimized implementation of vector storage and search.
The search simply loads all vectors one by one, then calculates similarity scores, and sorts the results. The performance is poor: a search across only 1,500 384-dim vectors takes ~115 ms in Safari on an M1 Pro.
However, it is a fully working semantic search.
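The load, score, and sort approach described above can be sketched as a brute-force search. This is a simplified illustration, not the library's actual code. It assumes cosine similarity (consistent with the documented score range of -1 to 1, but not confirmed here), and `cosineSimilarity` and `bruteForceSearch` are hypothetical names.

```typescript
// A minimal brute-force search sketch; not the library's actual code.
type Scored<T> = { value: T; similarity: number };

// Cosine similarity of two equal-length vectors, in the range [-1, 1].
function cosineSimilarity(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator; // zero vectors score 0
}

// Score every item against the query, then sort from most to least similar.
function bruteForceSearch<T>(
  items: T[],
  getVector: (item: T) => ArrayLike<number>,
  query: ArrayLike<number>,
  limit = 10,
): Scored<T>[] {
  return items
    .map((value) => ({
      value,
      similarity: cosineSimilarity(getVector(value), query),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}

const notes = [
  { text: "a", embedding: [1, 0] },
  { text: "b", embedding: [0, 1] },
  { text: "c", embedding: [1, 1] },
];
const hits = bruteForceSearch(notes, (note) => note.embedding, [1, 0], 2);
```

Every query touches every vector, so the cost grows linearly with the list size, which is why a true vector index is the natural next step.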
Next steps for this lib:
- build a true vector index
- first milestone: search across 100k vectors within 100 ms
- performance optimizations for calculating similarity scores
- see TODOs in code
- build bindings for Svelte, etc (looking for contributors!)
npm install
npm run build
MIT License