Add ACL-aware routing processors for multi-tenant document routing #18834
Shard Placement Actually add the files Signed-off-by: Atri Sharma <[email protected]>
Introduces two new processors to enable document routing based on ACL metadata:

- AclRoutingProcessor (ingest pipeline):
  * Extracts ACL field value and generates deterministic routing using MurmurHash3
  * Configurable options: acl_field, target_field (default: _routing), ignore_missing, override_existing
  * Ensures documents with same ACL values colocate on same shard
- AclRoutingSearchProcessor (search pipeline):
  * Automatically extracts ACL values from term/terms/bool queries
  * Sets routing on search requests to target specific shards
  * Supports nested bool query traversal (must/filter/should clauses)
  * Configurable extraction with extract_from_query flag

Implementation details:
- Both processors use identical MurmurHash3.hash128() with Base64 encoding for consistent routing value generation
- Registered in IngestCommonModulePlugin and SearchPipelineCommonModulePlugin
- Comprehensive unit tests and integration tests for both processors
- Follows existing processor patterns (similar to HierarchicalRoutingProcessor)

Use case: Improves query performance in multi-tenant environments by ensuring tenant-specific documents are colocated and queries are routed to relevant shards.

Resolves opensearch-project#18829

Signed-off-by: Atri Sharma <[email protected]>
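For readers who want to see the ingest-side idea in isolation, here is a minimal sketch of "hash the ACL value, Base64-encode it, write it to the routing field". It is not the PR's code: the class and method names are hypothetical, the document is modeled as a plain Map, and SHA-256 truncated to 128 bits stands in for MurmurHash3.hash128() so that the example runs on the JDK alone. The exact semantics of ignore_missing and override_existing shown here are assumptions based on the option names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the AclRoutingProcessor from the PR.
public class AclRoutingSketch {

    // Derives a deterministic routing value from an ACL value.
    // Stand-in hash: SHA-256 truncated to 128 bits (the PR uses MurmurHash3.hash128()).
    static String routingFor(String aclValue) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] full = digest.digest(aclValue.getBytes(StandardCharsets.UTF_8));
            byte[] truncated = Arrays.copyOf(full, 16); // keep 128 bits
            return Base64.getUrlEncoder().withoutPadding().encodeToString(truncated);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    // Writes the routing value into targetField.
    // Assumed semantics: ignoreMissing skips documents without the ACL field,
    // overrideExisting controls whether an existing routing value is replaced.
    static void apply(Map<String, Object> doc, String aclField, String targetField,
                      boolean ignoreMissing, boolean overrideExisting) {
        Object acl = doc.get(aclField);
        if (acl == null) {
            if (ignoreMissing) {
                return;
            }
            throw new IllegalArgumentException("field [" + aclField + "] is missing");
        }
        if (doc.containsKey(targetField) && !overrideExisting) {
            return;
        }
        doc.put(targetField, routingFor(acl.toString()));
    }

    public static void main(String[] args) {
        Map<String, Object> first = new HashMap<>();
        first.put("acl", "team-a");
        Map<String, Object> second = new HashMap<>();
        second.put("acl", "team-a");

        apply(first, "acl", "_routing", false, true);
        apply(second, "acl", "_routing", false, true);

        // Same ACL value, same routing value, hence the same shard.
        System.out.println(first.get("_routing").equals(second.get("_routing")));
    }
}
```

Because the routing value depends only on the ACL value, the search side can recompute it from the query and reach the same shard, which is why the description stresses that both processors use identical hashing.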
Signed-off-by: Atri Sharma <[email protected]>
❌ Gradle check result for 3a85104: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@             Coverage Diff              @@
##               main   #18834      +/-   ##
============================================
+ Coverage     72.75%   72.78%   +0.02%
- Complexity    68520    68567      +47
============================================
  Files          5570     5574       +4
  Lines        314998   315254     +256
  Branches      45697    45754      +57
============================================
+ Hits         229185   229449     +264
+ Misses        67260    67205      -55
- Partials      18553    18600      +47
modules/ingest-common/src/main/java/org/opensearch/ingest/common/AclRoutingProcessor.java
modules/ingest-common/src/main/java/org/opensearch/ingest/common/AclRoutingProcessor.java
...ne-common/src/main/java/org/opensearch/search/pipeline/common/AclRoutingSearchProcessor.java
...ne-common/src/main/java/org/opensearch/search/pipeline/common/AclRoutingSearchProcessor.java
Signed-off-by: Atri Sharma <[email protected]>
❌ Gradle check result for fb10eca: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Signed-off-by: Atri Sharma <[email protected]>
❌ Gradle check result for 684f3cd: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Flaky tests #14509
Looks good, @atris!
I added a comment on the search processor just to capture my thoughts on the query visitor. The conclusion is that your implementation is good, but I figure if I had to think about it, I should write down those thoughts. (I'll resolve my own comment before merging.)
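As an illustration of the kind of query traversal under discussion, the following is a rough sketch of collecting ACL values from term, terms, and nested bool queries. It does not use the OpenSearch query builder or visitor classes: Query, Term, Terms, and Bool are hypothetical stand-in types, and the behavior (walking must, filter, and should clauses and collecting every value for the ACL field) is an assumption drawn from the PR description rather than from the merged code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; the types below are stand-ins, not OpenSearch classes.
public class AclQueryExtractionSketch {

    interface Query {}

    static final class Term implements Query {
        final String field;
        final String value;
        Term(String field, String value) {
            this.field = field;
            this.value = value;
        }
    }

    static final class Terms implements Query {
        final String field;
        final List<String> values;
        Terms(String field, List<String> values) {
            this.field = field;
            this.values = values;
        }
    }

    static final class Bool implements Query {
        final List<? extends Query> must;
        final List<? extends Query> filter;
        final List<? extends Query> should;
        Bool(List<? extends Query> must, List<? extends Query> filter, List<? extends Query> should) {
            this.must = must;
            this.filter = filter;
            this.should = should;
        }
    }

    // Returns every value that a term/terms clause on aclField mentions,
    // in encounter order, descending through nested bool queries.
    static Set<String> extractAclValues(Query query, String aclField) {
        Set<String> values = new LinkedHashSet<>();
        collect(query, aclField, values);
        return values;
    }

    static void collect(Query query, String aclField, Set<String> out) {
        if (query instanceof Term) {
            Term term = (Term) query;
            if (term.field.equals(aclField)) {
                out.add(term.value);
            }
        } else if (query instanceof Terms) {
            Terms terms = (Terms) query;
            if (terms.field.equals(aclField)) {
                out.addAll(terms.values);
            }
        } else if (query instanceof Bool) {
            Bool bool = (Bool) query;
            for (List<? extends Query> clauses : Arrays.asList(bool.must, bool.filter, bool.should)) {
                for (Query clause : clauses) {
                    collect(clause, aclField, out); // recurse into nested clauses
                }
            }
        }
    }

    public static void main(String[] args) {
        Query nested = new Bool(
            Collections.emptyList(),
            Arrays.asList(new Terms("acl", Arrays.asList("team-a", "team-b"))),
            Collections.emptyList());
        Query query = new Bool(
            Arrays.asList(new Term("status", "open")),
            Arrays.asList(nested),
            Collections.emptyList());
        System.out.println(extractAclValues(query, "acl"));
    }
}
```

Each extracted value would then be hashed the same way as on the ingest side to produce the routing for the search request.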
...ne-common/src/main/java/org/opensearch/search/pipeline/common/AclRoutingSearchProcessor.java
…pensearch-project#18834)

Introduces two new processors to enable document routing based on ACL metadata:

- AclRoutingProcessor (ingest pipeline):
  * Extracts ACL field value and generates deterministic routing using MurmurHash3
  * Configurable options: acl_field, target_field (default: _routing), ignore_missing, override_existing
  * Ensures documents with same ACL values colocate on same shard
- AclRoutingSearchProcessor (search pipeline):
  * Automatically extracts ACL values from term/terms/bool queries
  * Sets routing on search requests to target specific shards
  * Supports nested bool query traversal (must/filter/should clauses)
  * Configurable extraction with extract_from_query flag

Implementation details:
- Both processors use identical MurmurHash3.hash128() with Base64 encoding for consistent routing value generation
- Registered in IngestCommonModulePlugin and SearchPipelineCommonModulePlugin
- Comprehensive unit tests and integration tests for both processors
- Follows existing processor patterns (similar to HierarchicalRoutingProcessor)

Use case: Improves query performance in multi-tenant environments by ensuring tenant-specific documents are colocated and queries are routed to relevant shards.

Resolves opensearch-project#18829

--------

Signed-off-by: Atri Sharma <[email protected]>
Signed-off-by: sunqijun.jun <[email protected]>