@karenyrx karenyrx commented Nov 9, 2025

Description

A series of gRPC bulk API fixes/optimizations:

  1. Fix update operation to use the doc field instead of the object field (still falls back to object for backward compatibility)
  2. Fix the default value of fetchSource to null to match REST
  3. Add support for allowExplicitIndex setting
  4. Add pipeline support for upsert
  5. Optimize document bytes copying to pass bytes reference
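
As a rough illustration of items 1 and 5, the sketch below prefers `doc`, falls back to `object`, and hands back a view over the same bytes instead of a copy. The class and method names are hypothetical, and plain `byte[]` with `java.nio.ByteBuffer` stands in for whatever protobuf and bytes-reference types the PR actually uses; treat it as a minimal sketch under those assumptions.

```java
import java.nio.ByteBuffer;

public class UpdateDocSource {
    /**
     * Hypothetical helper: picks the partial document for an update operation.
     * A null argument means "field not set on the request".
     */
    public static ByteBuffer resolveUpdateDoc(byte[] doc, byte[] legacyObject) {
        // Prefer the doc field; fall back to object for older clients.
        byte[] source = doc != null ? doc : legacyObject;
        if (source == null) {
            // Neither field set: leave it to later validation to decide if that is an error.
            return null;
        }
        // wrap() shares the backing array, so no document bytes are copied here.
        return ByteBuffer.wrap(source);
    }
}
```

Returning a wrapper over the original array is the point of the fifth item: the caller passes a reference along rather than duplicating the document.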

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…llsback to object for bwc)

bytes optimize

Set the default value of source to null to match REST

Support allowExplicitIndex setting

Signed-off-by: Karen Xu <[email protected]>
github-actions bot commented Nov 9, 2025

❌ Gradle check result for f3818ca: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Karen Xu <[email protected]>
Signed-off-by: Karen Xu <[email protected]>
@github-actions

❕ Gradle check result for 34bbaf4: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

codecov bot commented Nov 10, 2025

Codecov Report

❌ Patch coverage is 73.46939% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.25%. Comparing base (022d594) to head (34bbaf4).
⚠️ Report is 5 commits behind head on main.

Files with missing lines | Patch % | Lines
...est/document/bulk/BulkRequestParserProtoUtils.java | 69.04% | 5 Missing and 8 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #19937      +/-   ##
============================================
- Coverage     73.27%   73.25%   -0.03%     
+ Complexity    71563    71558       -5     
============================================
  Files          5785     5785              
  Lines        326822   326845      +23     
  Branches      47294    47301       +7     
============================================
- Hits         239484   239432      -52     
- Misses        68111    68156      +45     
- Partials      19227    19257      +30     

@karenyrx karenyrx marked this pull request as ready for review November 10, 2025 21:50
@karenyrx karenyrx requested a review from a team as a code owner November 10, 2025 21:50
Signed-off-by: Karen Xu <[email protected]>
@github-actions

❌ Gradle check result for 6d897b8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions

❌ Gradle check result for 6d897b8: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

index = createOperation.hasXIndex() ? createOperation.getXIndex() : index;
// Check explicit index (matches REST BulkRequestParser lines 218-221)
if (createOperation.hasXIndex()) {
    if (!allowExplicitIndex && defaultIndex != null) {

Does allowExplicitIndex have a use case in the context of gRPC? For REST, one use case could be forcing users to provide the target index in the URI, which is easier to filter on than a parsed request body. For gRPC, though, this information will always be in the request body, and allowExplicitIndex would only control which index fields you are allowed to use.

Contributor Author

Good point, let me remove it to decouple it from this PR for now.
I'm not sure of the original intent of allowExplicitIndex: is it just for security concerns, or could it also help reduce the size of request payloads? Erring on the side of caution, I had added support for it in the gRPC APIs to maintain parity, but I can create a separate issue to track it later: #19962
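
For anyone picking up the follow-up issue, here is a minimal sketch of the check under discussion. The condition mirrors the quoted diff (`!allowExplicitIndex && defaultIndex != null`); the class name, method name, and the choice to throw `IllegalArgumentException` with this message are assumptions, since the body of the branch is not shown in this thread.

```java
public class ExplicitIndexCheck {
    /**
     * Hypothetical helper: returns the index an operation should target.
     * explicitIndex is the per-operation index (null if unset);
     * defaultIndex is the request-level index (null if unset).
     */
    public static String resolveIndex(String defaultIndex, String explicitIndex, boolean allowExplicitIndex) {
        if (explicitIndex != null) {
            // Same guard as the quoted diff; rejecting by throwing is an assumption.
            if (!allowExplicitIndex && defaultIndex != null) {
                throw new IllegalArgumentException("explicit index in bulk is not allowed");
            }
            return explicitIndex;
        }
        return defaultIndex;
    }
}
```

With this shape, the setting only decides whether the per-operation index field may override the request-level one, which is the limitation the review comment points out for gRPC.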

updateRequest.scriptedUpsert(updateAction.getScriptedUpsert());
}

// 3. upsert

It seems like any given BulkRequestBody has 3 different locations to place its document bytes (object, doc, and upsert). Is there a need to distinguish between these? What happens if multiple are set?

@karenyrx karenyrx Nov 11, 2025

Yes, the three are distinct:

  • object is used for index and create ops. This extra wrapper exists because in the HTTP APIs the document for index/create ops is specified on the next line (for NDJSON bulk ops), whereas in protobuf every field needs a key name, so the otherwise unnamed document field is called "object".
  • doc and upsert are used in the same way as in the HTTP APIs, as specified in the Bulk API documentation. (I'm updating the documentation website to clarify usage of the upsert field, for reference: https://github.com/opensearch-project/documentation-website/pull/11506/files)

There is a BulkRequest.validate() function that will validate whether an incorrect combination of parameters is set. It is shared by both the HTTP and gRPC paths.
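
To make the mapping above concrete, here is a small sketch of which document fields apply to which operation type. The enums and method are hypothetical and exist only for illustration; this does not reproduce the rules in BulkRequest.validate(), and it leaves out the legacy fallback to object for updates mentioned in the PR description.

```java
import java.util.EnumSet;
import java.util.Set;

public class BulkBodyFields {
    public enum OpType { INDEX, CREATE, UPDATE, DELETE }
    public enum Field { OBJECT, DOC, UPSERT }

    /** Hypothetical lookup: document-carrying fields that are meaningful for an op type. */
    public static Set<Field> fieldsFor(OpType op) {
        switch (op) {
            case INDEX:
            case CREATE:
                // The full document travels in the named "object" field.
                return EnumSet.of(Field.OBJECT);
            case UPDATE:
                // Partial document and/or upsert document, as in the HTTP Bulk API.
                return EnumSet.of(Field.DOC, Field.UPSERT);
            default:
                // DELETE carries no document.
                return EnumSet.noneOf(Field.class);
        }
    }
}
```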
