lerman25
Collaborator

@lerman25 lerman25 commented Jul 1, 2025

This PR refactors the serialization logic for vector indexes by introducing a clear abstraction layer via the Serializer base class. Both HNSWSerializer and SVSSerializer now inherit from this interface, enabling:

  • Unified structure for save/load operations across index types
  • Encoding version support for backward/forward compatibility
  • Cleaner separation of shared vs. index-specific serialization logic
  • Easier extension for future index types

Each index (e.g., HNSWIndex, SVSIndex) implements its own saveIndexFields method to handle template- and implementation-specific data.
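The split between shared and index-specific logic described above can be sketched as follows. This is a minimal illustration only, not the code from this PR: `ToyHNSWIndex`, `ToySVSIndex`, the `writePOD`/`readPOD` helpers, the stream-based `saveIndex` signature, and the version numbers are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>

// Hypothetical sketch: names follow the PR description, not the real headers.
class Serializer {
public:
    virtual ~Serializer() = default;

    // Helpers for writing/reading trivially copyable values as raw bytes.
    template <typename T>
    static void writePOD(std::ostream &out, const T &value) {
        out.write(reinterpret_cast<const char *>(&value), sizeof(T));
    }

    template <typename T>
    static T readPOD(std::istream &in) {
        T value{};
        in.read(reinterpret_cast<char *>(&value), sizeof(T));
        if (!in) throw std::runtime_error("unexpected end of stream");
        return value;
    }

    // Shared logic: every index writes its encoding version first,
    // then delegates to its own field layout.
    void saveIndex(std::ostream &out) const {
        const std::uint32_t version = encodingVersion();
        writePOD(out, version);
        saveIndexFields(out);
        assert(out.good()); // a failed stream is a bug in this toy example
    }

protected:
    virtual std::uint32_t encodingVersion() const = 0;
    virtual void saveIndexFields(std::ostream &out) const = 0;
};

// Toy stand-in for an HNSW-style index with two parameters.
class ToyHNSWIndex : public Serializer {
public:
    ToyHNSWIndex(std::uint64_t m, std::uint64_t efConstruction)
        : m_(m), efConstruction_(efConstruction) {}

protected:
    std::uint32_t encodingVersion() const override { return 4; } // arbitrary value
    void saveIndexFields(std::ostream &out) const override {
        writePOD(out, m_);
        writePOD(out, efConstruction_);
    }

private:
    std::uint64_t m_;
    std::uint64_t efConstruction_;
};

// Toy stand-in for an SVS-style index with a single parameter.
class ToySVSIndex : public Serializer {
public:
    explicit ToySVSIndex(std::uint64_t graphMaxDegree) : graphMaxDegree_(graphMaxDegree) {}

protected:
    std::uint32_t encodingVersion() const override { return 1; } // arbitrary value
    void saveIndexFields(std::ostream &out) const override {
        writePOD(out, graphMaxDegree_);
    }

private:
    std::uint64_t graphMaxDegree_;
};
```

Under these assumptions, adding a new index type only means deriving from `Serializer` and overriding the two protected methods; the version header stays in one place.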


This PR also introduces support for saving/loading SVS indexes and validating their internal consistency.
Specifically:

  • Implements the Serializer interface for SVS, enabling the saveIndex() and loadIndex() methods (based on a commit by @rfsaliev).
  • Adds a checkIntegrity() method for runtime validation of index structure and metadata.
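A save/load round trip with validation might look roughly like the sketch below. It is not the SVS implementation from this PR: `ToyIndex`, its field layout, the `expectedDim` parameter, and the specific checks performed by `checkIntegrity()` are assumptions chosen to illustrate the idea of comparing metadata on load and validating structure afterwards.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <set>
#include <sstream>
#include <stdexcept>
#include <vector>

// Raw-byte helpers for trivially copyable values (hypothetical names).
template <typename T>
void writePOD(std::ostream &out, const T &value) {
    out.write(reinterpret_cast<const char *>(&value), sizeof(T));
}

template <typename T>
T readPOD(std::istream &in) {
    T value{};
    in.read(reinterpret_cast<char *>(&value), sizeof(T));
    if (!in) throw std::runtime_error("truncated index file");
    return value;
}

// Hypothetical toy index; the real SVSIndex stores far more than this.
struct ToyIndex {
    static constexpr std::uint32_t kEncodingVersion = 1; // assumed value

    std::uint64_t dim = 0;
    std::vector<std::uint64_t> labels;
    std::vector<float> data; // labels.size() * dim floats, row-major

    // Structural validation: sizes agree and labels are unique.
    bool checkIntegrity() const {
        if (dim == 0 || data.size() != labels.size() * dim) return false;
        const std::set<std::uint64_t> unique(labels.begin(), labels.end());
        return unique.size() == labels.size();
    }

    void saveIndex(std::ostream &out) const {
        assert(checkIntegrity()); // never persist an inconsistent index
        const std::uint32_t version = kEncodingVersion;
        writePOD(out, version);
        writePOD(out, dim);
        writePOD(out, static_cast<std::uint64_t>(labels.size()));
        out.write(reinterpret_cast<const char *>(labels.data()),
                  static_cast<std::streamsize>(labels.size() * sizeof(std::uint64_t)));
        out.write(reinterpret_cast<const char *>(data.data()),
                  static_cast<std::streamsize>(data.size() * sizeof(float)));
    }

    // Loads an index and rejects files whose metadata does not match
    // what the caller expects.
    static ToyIndex loadIndex(std::istream &in, std::uint64_t expectedDim) {
        if (readPOD<std::uint32_t>(in) != kEncodingVersion)
            throw std::runtime_error("unsupported encoding version");
        ToyIndex index;
        index.dim = readPOD<std::uint64_t>(in);
        if (index.dim != expectedDim)
            throw std::runtime_error("metadata mismatch: dim");
        const auto count = readPOD<std::uint64_t>(in);
        index.labels.resize(count);
        index.data.resize(count * index.dim);
        in.read(reinterpret_cast<char *>(index.labels.data()),
                static_cast<std::streamsize>(count * sizeof(std::uint64_t)));
        in.read(reinterpret_cast<char *>(index.data.data()),
                static_cast<std::streamsize>(index.data.size() * sizeof(float)));
        if (!in) throw std::runtime_error("truncated index file");
        if (!index.checkIntegrity())
            throw std::runtime_error("integrity check failed");
        return index;
    }
};
```

In this sketch the loader fails fast on a version or metadata mismatch before reading any vector data, and runs the integrity check last so a corrupted file never yields a usable index object.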

@CLAassistant

CLAassistant commented Jul 1, 2025

CLA assistant check
All committers have signed the CLA.

@lerman25 lerman25 marked this pull request as draft July 1, 2025 11:58
@lerman25 lerman25 force-pushed the omer-add-save-load-check branch from a3ee719 to 7af26da Compare July 1, 2025 12:15
@lerman25 lerman25 marked this pull request as ready for review July 2, 2025 10:55
@lerman25 lerman25 requested a review from meiravgri July 2, 2025 11:18
Collaborator

@meiravgri meiravgri left a comment

Awesome!
My main comments are about visibility :)
Still missing a review of svs_extensions.h and svs_utils.h; I will go over them with @rfsaliev.

lerman25 added a commit that referenced this pull request Jul 3, 2025
@alonre24 alonre24 changed the title [MOD-7022] Add serialization to SVS index [MOD-10236] Add serialization to SVS index Jul 6, 2025
@lerman25 lerman25 force-pushed the omer-add-save-load-check branch from 7170d55 to 650c333 Compare September 7, 2025 10:51
@lerman25 lerman25 requested a review from meiravgri September 7, 2025 10:51
@lerman25 lerman25 force-pushed the omer-add-save-load-check branch from 884a54c to c167405 Compare September 14, 2025 12:01
meiravgri previously approved these changes Sep 14, 2025
@lerman25 lerman25 enabled auto-merge September 14, 2025 12:04
@lerman25 lerman25 added this pull request to the merge queue Sep 14, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 14, 2025
meiravgri previously approved these changes Sep 16, 2025
@lerman25 lerman25 enabled auto-merge September 16, 2025 12:50
@lerman25 lerman25 added this pull request to the merge queue Sep 16, 2025
Merged via the queue into main with commit f68bb6b Sep 16, 2025
19 checks passed
@lerman25 lerman25 deleted the omer-add-save-load-check branch September 16, 2025 16:24
Successfully created backport PR for 8.2:

github-actions bot pushed a commit that referenced this pull request Sep 16, 2025
* generalize

* remove serializer.cpp from cmake

* prepare merge with  rafik commit

* [SVS] Implement Save/Load + test

* separate hnsw_serializer to h and cpp

* remove get version impl

* save impl

* add load

* change camelcase

* for mat

* generalize saveIndexFields

* format

* compare metadata on load

* Add checkIntegrity with error

* checkIntegrity

* remove duplicate verification in compare meta data

* format

* svs serialization version testing

* Revert "svs serialization version testing"

This reverts commit 9ed7730.

* common serializer test

* remove changes_num from metadata

* Add location c'tor

* Add location ctor and to test

* Remove outdated comment from serializer header

* Enhance documentation for loadIndex function in SVSIndex

* Add comments

* format + remove test

* enable tests

* serializer test

* format

* reset SVS to master

* add logging to test_svs

* format

* remove duplicate NewIndexImpl

* expose loadIndex in VecSimIndex, add BUILD_TEST guard

* remove string ctor from SVSIndex

* format

* fix BUILD_TEST in svs_factory

* document loadIndex

* move loadIndex to serializer

* remove excess declarations

* remove extra ;

* compatable -> compatible

* remove redundant params from test

* remove comments from threadpool_handle

* remove error context comments

* add checkIntegrity

* update checkIntegrity and format

* move loadIndex to SVSSerializer

* update bindings

* format

* add test

* add single

* adjust labels

* Refactor save_load test to simplify vector generation logic

* add HAVE_SVS guard

* Add missing include for <sstream> in svs_serializer.h

* free faulty index

* Free index

* Improve error message for index loading failure in NewIndex function

* format

---------

Co-authored-by: Rafik Saliev <[email protected]>
(cherry picked from commit f68bb6b)
lerman25 added a commit that referenced this pull request Sep 17, 2025
github-merge-queue bot pushed a commit that referenced this pull request Sep 17, 2025
[MOD-10236] Add serialization to SVS index (#716)


5 participants