Feat/build index from arrays#289

Merged
ASuresh0524 merged 3 commits into yichuan-w:main from sjswerdloff:feat/build-index-from-arrays
Apr 13, 2026

Conversation

@sjswerdloff
Contributor

Adds a new method, build_index_from_arrays, that accepts numpy arrays directly, avoiding the pickle intermediary. Refactors build_index_from_embeddings to delegate to it. Useful for MLX, GPU, or database-sourced embeddings.
It also enables incremental embedding prior to final indexing, e.g. embedding the input, thinking, and output turns of an LLM conversation as they arrive, or adding only the embeddings from one compaction/compression cycle of a long-lived, many-cycle conversation rather than recalculating all of the embeddings for each cycle. When the embeddings are kept in an ACID database that has a vector data type, this also removes the risk of a failed modification (read/insert/write) of a pickle file.
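To make the incremental, database-backed workflow concrete, here is a minimal sketch of collecting embeddings one cycle at a time and loading them back as a single array. It is an illustration under assumptions, not code from this PR: SQLite with a BLOB column stands in for a database with a real vector type, and the table layout, `DIM`, `append_turn`, and `load_arrays` are all hypothetical names.

```python
import sqlite3

import numpy as np

DIM = 4  # hypothetical embedding width, chosen only for this example

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (id TEXT PRIMARY KEY, vec BLOB NOT NULL)")


def append_turn(conn, rows):
    """Insert one cycle's (id, vector) pairs in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT INTO embeddings (id, vec) VALUES (?, ?)",
            [(i, np.asarray(v, dtype=np.float32).tobytes()) for i, v in rows],
        )


def load_arrays(conn):
    """Read every stored vector back as (ids, float32 array of shape (n, DIM))."""
    rows = conn.execute("SELECT id, vec FROM embeddings ORDER BY rowid").fetchall()
    ids = [r[0] for r in rows]
    if not rows:
        return ids, np.empty((0, DIM), dtype=np.float32)
    embeddings = np.vstack([np.frombuffer(r[1], dtype=np.float32) for r in rows])
    return ids, embeddings


# Each conversation cycle appends only its own embeddings.
append_turn(conn, [("turn-1-input", [0.1, 0.2, 0.3, 0.4])])
append_turn(
    conn,
    [
        ("turn-1-thinking", [0.5, 0.6, 0.7, 0.8]),
        ("turn-1-output", [0.9, 1.0, 1.1, 1.2]),
    ],
)

ids, embeddings = load_arrays(conn)
```

The resulting `ids` and `embeddings` are the kind of in-memory inputs an array-based build method could take directly, with no pickle file written or rewritten along the way. The exact signature of the method added here is not shown in this conversation, so the sketch stops before that call.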

Includes 7 tests covering basic usage, regression, dimension/count validation, and placeholder generation.
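For readers unfamiliar with what dimension/count validation means here, the following is a rough sketch of that kind of check. It is not the PR's implementation: `validate_arrays` and `placeholder_texts` are hypothetical helpers, and the placeholder format is an assumption about what "placeholder generation" could look like when only ids and vectors are supplied.

```python
import numpy as np


def validate_arrays(ids, embeddings, expected_dim=None):
    """Hypothetical pre-build check: returns a contiguous float32 (n, dim) array."""
    arr = np.ascontiguousarray(embeddings, dtype=np.float32)
    if arr.ndim != 2:
        raise ValueError(f"embeddings must be 2-D, got {arr.ndim}-D")
    if len(ids) != arr.shape[0]:
        raise ValueError(f"{len(ids)} ids but {arr.shape[0]} embedding rows")
    if expected_dim is not None and arr.shape[1] != expected_dim:
        raise ValueError(f"expected dimension {expected_dim}, got {arr.shape[1]}")
    return arr


def placeholder_texts(ids):
    """Assumed stand-in passage text for entries that have no source text."""
    return [f"<passage {i}>" for i in ids]


checked = validate_arrays(["a", "b"], [[1, 2, 3], [4, 5, 6]], expected_dim=3)
texts = placeholder_texts(["a", "b"])
```

Failing fast on a count or dimension mismatch keeps a malformed array from reaching the native backend, where the resulting error would be much harder to read.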

What does this PR do?

Adds a new method that accepts numpy arrays directly, avoiding the pickle intermediary. Refactors build_index_from_embeddings to delegate to it.

Related Issues

Fixes #

Checklist

  • Tests pass (uv run pytest)
  • Code formatted (ruff format and ruff check)
  • Pre-commit hooks pass (pre-commit run --all-files)

sjswerdloff and others added 2 commits March 19, 2026 16:44
Adds a new method that accepts numpy arrays directly, avoiding the
pickle intermediary. Refactors build_index_from_embeddings to delegate
to it. Useful for MLX, GPU, or database-sourced embeddings.

Includes 7 tests covering basic usage, regression, dimension/count
validation, and placeholder generation.

Co-Authored-By: Cora <cora-2f1e43dc@sjstargetedsolutions.co.nz>
@sjswerdloff
Contributor Author

sjswerdloff commented Mar 19, 2026

The build error has nothing to do with Python (and the code change was purely Python).
I'll check whether previous PRs are hitting similar problems...
Any suggestions would be much appreciated.

CMake Error at /tmp/tmpmh94a5_1/build/_deps/protobuf-src/cmake/libprotobuf-lite.cmake:27 (target_link_libraries):
Target "libprotobuf-lite" links to:

absl::absl_check

but the target was not found. Possible reasons include:

* There is a typo in the target name.
* A find_package call is missing for an IMPORTED target.
* An ALIAS target is missing.

Call Stack (most recent call first):
/tmp/tmpmh94a5_1/build/_deps/protobuf-src/CMakeLists.txt:278 (include)

@ASuresh0524
Collaborator

Thanks for the contribution. The build_index_from_arrays changes look good, and this failure appears unrelated to your Python changes.

The failing job is in packages/leann-backend-diskann CMake config (tcmalloc / absl::* target resolution), which is an environment/dependency issue in the DiskANN native build path, not in api.py / test logic from this PR.

I’ve seen the same failure pattern on other PRs:

  • absl::absl_check / absl::nullability not found
  • failure inside tcmalloc/protobuf CMake generation
  • matrix cancellations after the first platform failure

I’m handling this as a separate backend/CI fix. Once that lands (or after rebasing onto a commit that includes it), please re-run checks on this PR.

In short: no blocker from the feature itself; current red CI is an infra/backend build issue.

@sjswerdloff
Contributor Author

There was a rebuild attempt. It failed with the same or a very similar error:
CMake Error at /tmp/tmpyhuyq3g3/build/_deps/protobuf-src/cmake/libprotobuf-lite.cmake:27 (target_link_libraries):
Target "libprotobuf-lite" links to:

absl::absl_check

@sjswerdloff
Contributor Author

All checks passed!
Please merge at your earliest convenience.

@ASuresh0524 ASuresh0524 merged commit d8fa507 into yichuan-w:main Apr 13, 2026
31 checks passed