Skip to content

Conversation

christian-bromann
Copy link
Member

Description

This PR fixes a bug in the Milvus vector store implementation where upsert operations would fail when autoId is set to false. The issue was that the primary field was being incorrectly removed from the fields array during collection initialization, causing documents to be added instead of updated during upsert operations.

Root Cause

The bug was in the grabCollectionFields() method where fields with autoID: true were being removed from this.fields regardless of the user's autoId setting. This caused the primary field to be missing during document processing in the addVectors() method, making upsert operations fail.

Solution

Modified the field removal logic in grabCollectionFields() to only remove autoID fields when both conditions are true:

  • field.autoID is true (database schema property)
  • this.autoId is true (user's preference)

This ensures that when autoId: false, the primary field remains in this.fields and gets processed correctly for upsert operations.

Testing

Added comprehensive unit tests covering:

  • Primary field inclusion/exclusion based on autoId setting
  • Document processing with primary field from metadata
  • Error handling when primary field is missing
  • Explicit ID handling vs metadata IDs

Note for reviewers: The unit tests will show harmless error logs during execution due to the MilvusClient constructor attempting connections during instantiation. This is expected behavior and doesn't affect test validity. The tests use a TestMilvus class that extends the main class to expose internal methods for testing without requiring network connections. Consider migrating to Vitest for better ESM mocking capabilities in the future.

Fixes #8495

When autoId is set to false, the primary field was being incorrectly
removed from the fields array during collection initialization, causing
upsert operations to fail and documents to be added instead of updated.

The issue was in grabCollectionFields() where fields with autoID: true
were being removed regardless of the user's autoId setting. This fix
ensures that autoID fields are only removed when both the field has
autoID: true AND the user has set autoId: true.

Changes:
- Modified field removal logic in grabCollectionFields() to check both
  field.autoID and this.autoId before removing fields
- Added comprehensive unit tests covering various autoId scenarios
- Tests verify primary field handling, document processing, and error cases

Fixes upsert functionality when users provide their own document IDs
through metadata with autoId: false.
Copy link

vercel bot commented Jul 14, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
langchainjs-docs Ready Ready Preview Sep 1, 2025 6:25am
1 Skipped Deployment
Project Deployment Preview Comments Updated (UTC)
langchainjs-api-refs Ignored Ignored Sep 1, 2025 6:25am

@hntrl hntrl merged commit c7f9da0 into langchain-ai:main Sep 1, 2025
32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Upsert method missing ID field
2 participants