Feat/build index from arrays #289
Adds a new method that accepts numpy arrays directly, avoiding the pickle intermediary. Refactors build_index_from_embeddings to delegate to it. Useful for MLX, GPU, or database-sourced embeddings. Includes 7 tests covering basic usage, regression, dimension/count validation, and placeholder generation. Co-Authored-By: Cora <cora-2f1e43dc@sjstargetedsolutions.co.nz>
The build error doesn't have anything to do with Python (and the code change was purely Python): CMake Error at /tmp/tmpmh94a5_1/build/_deps/protobuf-src/cmake/libprotobuf-lite.cmake:27 (target_link_libraries): ... but the target was not found. Possible reasons include: ... Call Stack (most recent call first): ...
Thanks for the contribution. The failing job is in ... I’ve seen the same failure pattern on other PRs: ...
I’m handling this as a separate backend/CI fix. Once that lands (or after you rebase onto a commit that includes it), please re-run the checks on this PR. In short: the feature itself is not a blocker; the current red CI is an infra/backend build issue.
There was a rebuild attempt. It failed with the same or a very similar error.
All checks passed!
Adds a new method that accepts numpy arrays directly, avoiding the pickle intermediary. Refactors build_index_from_embeddings to delegate to it. Useful for MLX, GPU, or database-sourced embeddings.
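For reviewers who want the shape of the change without opening the diff, here is a minimal sketch of the delegation described above. It is not the code in this PR: the function names, the `expected_dim` parameter, the placeholder text format, and the assumed pickle layout (an `(ids, embeddings)` tuple) are all hypothetical, and the "index" is just a dict standing in for the real backend.

```python
import pickle

import numpy as np


def build_from_arrays(ids, embeddings, texts=None, expected_dim=None):
    """Hypothetical array-based entry point (illustrative, not the PR's code)."""
    embeddings = np.asarray(embeddings, dtype=np.float32)

    # Dimension validation: one row per item, one column per feature.
    if embeddings.ndim != 2:
        raise ValueError(f"expected a 2-D array, got {embeddings.ndim}-D")
    if expected_dim is not None and embeddings.shape[1] != expected_dim:
        raise ValueError(
            f"expected dimension {expected_dim}, got {embeddings.shape[1]}"
        )

    # Count validation: ids and rows must line up.
    if len(ids) != embeddings.shape[0]:
        raise ValueError(
            f"{len(ids)} ids but {embeddings.shape[0]} embedding rows"
        )

    # Placeholder generation when no passage text is supplied
    # (the text format here is an assumption).
    if texts is None:
        texts = [f"Document {i}" for i in ids]
    elif len(texts) != len(ids):
        raise ValueError(f"{len(texts)} texts but {len(ids)} ids")

    # Stand-in for handing the data to the real index backend.
    return {"ids": [str(i) for i in ids], "embeddings": embeddings, "texts": texts}


def build_from_pickle(path):
    """Pickle-based entry point, reduced to load-then-delegate.

    Assumes the file holds an (ids, embeddings) tuple.
    """
    with open(path, "rb") as f:
        ids, embeddings = pickle.load(f)
    return build_from_arrays(ids, embeddings)
```

With this split, callers that already hold arrays in memory (from MLX, a GPU, or a database query) skip the file round trip, while the pickle path keeps its behaviour because it runs the same validation.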
It also enables incremental embedding ahead of the final indexing step. For example, the input, thinking, and output turns of an LLM conversation can each be embedded as they arrive, and a long-lived conversation with many compaction/compression cycles can add only the embeddings from the latest cycle rather than recalculating all of them every cycle. When the embeddings are stored in an ACID database that has a vector data type, this also reduces the risk of a failed modification (read/write/insert/write) to a pickle file.
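The incremental use case could look roughly like the sketch below. `EmbeddingBuffer` is a hypothetical helper, not part of this PR or the library; it only shows how per-turn embeddings might be collected in memory and then handed to an array-based build method in one call.

```python
import numpy as np


class EmbeddingBuffer:
    """Hypothetical accumulator for embeddings produced a few rows at a time."""

    def __init__(self, dim):
        self.dim = dim
        self.ids = []
        self.chunks = []

    def add(self, ids, vectors):
        """Append one batch, e.g. the embeddings of a single conversation turn."""
        vectors = np.asarray(vectors, dtype=np.float32)
        if vectors.ndim != 2 or vectors.shape[1] != self.dim:
            raise ValueError(f"expected shape (n, {self.dim}), got {vectors.shape}")
        if len(ids) != vectors.shape[0]:
            raise ValueError(f"{len(ids)} ids but {vectors.shape[0]} rows")
        self.ids.extend(ids)
        self.chunks.append(vectors)

    def as_arrays(self):
        """Return (ids, embeddings) ready to pass to an array-based builder."""
        if not self.chunks:
            return [], np.empty((0, self.dim), dtype=np.float32)
        return list(self.ids), np.concatenate(self.chunks, axis=0)


# Usage: embed each turn once, then index everything at the end.
buffer = EmbeddingBuffer(dim=4)
buffer.add(["turn-1-input"], np.ones((1, 4)))
buffer.add(["turn-1-thinking", "turn-1-output"], np.zeros((2, 4)))
all_ids, all_embeddings = buffer.as_arrays()
```

Earlier batches are never recomputed or rewritten; only the final concatenated array is passed on for indexing.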
Includes 7 tests covering basic usage, regression, dimension/count validation, and placeholder generation.
What does this PR do?
Adds a new method that accepts numpy arrays directly, avoiding the pickle intermediary. Refactors build_index_from_embeddings to delegate to it.
Related Issues
Fixes #
Checklist
(uv run pytest)
(ruff format and ruff check)
(pre-commit run --all-files)