[ENH] Add Morph embedding functions #5183
Co-authored-by: propel-code-bot[bot] <203372662+propel-code-bot[bot]@users.noreply.github.com>
Reviewer Checklist
Please leverage this checklist to ensure your code review is thorough before approving:
Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
Summary: 1 successful workflow, 1 pending workflow
Last updated: 2025-08-01 21:34:08 UTC
Add Morph Embedding Functions (Python & TypeScript) with Full Integration
This PR introduces Morph as a first-class embedding function in both the Python and TypeScript clients in Chroma. It provides implementations, integration into the existing embedding registries, schema validation, tests, and documentation updates for Morph, an OpenAI-compatible, code-focused embedding model. The change includes build and configuration plumbing, package registration, and support for supplying the API key either through an environment variable or directly. Documentation and examples are included for both languages.
Key Changes
• Implements MorphEmbeddingFunction for Python and TypeScript, using Morph's OpenAI-compatible embedding API.
Affected Areas
• Python: chromadb/utils/embedding_functions/morph_embedding_function.py
This summary was automatically generated by @propel-code-bot
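The summary mentions that the API key can come from an environment variable or be passed directly. A minimal sketch of that lookup order follows. It is not the PR's code: the helper name `resolve_api_key`, the `environ` parameter, and the default variable name `MORPH_API_KEY` are all assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    api_key: Optional[str] = None,
    api_key_env_var: str = "MORPH_API_KEY",  # assumed name, not confirmed by the PR text
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the directly supplied key, else fall back to the environment."""
    # Allow a mapping to be injected so the lookup can be exercised without
    # touching the real process environment.
    env = os.environ if environ is None else environ
    key = api_key or env.get(api_key_env_var)
    if not key:
        raise ValueError(
            f"No API key was passed and {api_key_env_var} is not set."
        )
    return key
```

A directly supplied key takes precedence here, so a caller can override whatever is set in the environment. Whether the PR uses the same precedence is not stated in the summary.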
docs/docs.trychroma.com/public/llms-integrations-embedding-models-morph.txt
    def validate_config_update(
        self, old_config: Dict[str, Any], new_config: Dict[str, Any]
    ) -> None:
        if "model_name" in new_config:
            raise ValueError(
                "The model name cannot be changed after the embedding function has been initialized."
            )
[BestPractice]
The current implementation of validate_config_update prevents any update that includes model_name, even if it's the same value. This should be changed to only raise an error if the model_name is different from the existing one.
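One way to apply this suggestion is sketched below. It is written as a standalone function rather than a method so it can be run on its own, and it assumes the existing model name is stored under the `model_name` key of `old_config`; the actual class may hold it elsewhere.

```python
from typing import Any, Dict


def validate_config_update(
    old_config: Dict[str, Any], new_config: Dict[str, Any]
) -> None:
    """Reject an update only when it would actually change model_name."""
    # Assumption: the current model name lives in old_config["model_name"].
    if (
        "model_name" in new_config
        and new_config["model_name"] != old_config.get("model_name")
    ):
        raise ValueError(
            "The model name cannot be changed after the embedding function has been initialized."
        )
```

With this check, an update that repeats the existing `model_name` passes, while one that names a different model still raises `ValueError`.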
docs/docs.trychroma.com/markdoc/content/integrations/embedding-models/morph.md
docs/docs.trychroma.com/public/llms-integrations-embedding-models-morph.txt
    // use directly
    const embeddings = embedder.generate(["function calculate(a, b) { return a + b; }", "class User { constructor(name) { this.name = name; } }"])

    // pass documents to query for .add and .query
[Documentation]
Rephrase for clarity: change this comment to "// Pass documents to the .add and .query methods".
Supersedes chroma-core#5043
---------
Co-authored-by: bhaktatejas922 <[email protected]>
Co-authored-by: propel-code-bot[bot] <203372662+propel-code-bot[bot]@users.noreply.github.com>
Co-authored-by: Jeffrey Huber <[email protected]>
Supersedes #5043