SOLR-17319 : Combined Query Feature for Multi Query Execution #3418

ercsonusharma · 2025-07-04T09:19:21Z

https://issues.apache.org/jira/browse/SOLR-17319

Description

This feature executes multiple queries of different kinds across the shards of a collection and combines their results using an algorithm (such as Reciprocal Rank Fusion). It also helps resolve the issues discussed on the previous PR, mainly around merging documents across shards. It extends the JSON Query DSL to make querying more flexible, ultimately enabling Hybrid Search in a pure way and addressing those shortcomings.

Note: This feature is currently unsupported for non-distributed and grouping queries.

Solution

Extended QueryComponent to create a new CombinedQueryComponent, and ResponseBuilder to create a new CombinedQueryResponseBuilder, which supports multiple response builders to hold the state and execute multiple queries.
In the JSON Query DSL, a parameter is added to identify a Combined Query request; based on it, the new CombinedQueryComponent is invoked.
CombinedQueryComponent has one response builder assigned to each query. The queries are first executed at the SolrIndexSearcher level and then combined, using RRF for now.
At the shard level, the responses for the multiple queries are merged as well.
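For readers unfamiliar with Reciprocal Rank Fusion, a minimal stand-alone sketch follows. It is not the ReciprocalRankFusion class from this PR: the class name RrfSketch, the fuse method and the choice of k are assumptions made for illustration. Each document receives 1 / (k + rank) from every ranked list it appears in, and the sums decide the fused order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only; a hypothetical helper, not the PR's ReciprocalRankFusion. */
final class RrfSketch {

  /**
   * Fuses several ranked lists of document ids.
   * k is an assumed smoothing constant (60 is a commonly used value).
   */
  static List<String> fuse(List<List<String>> rankedLists, int k) {
    Map<String, Double> scores = new HashMap<>();
    for (List<String> ranked : rankedLists) {
      for (int i = 0; i < ranked.size(); i++) {
        int rank = i + 1; // ranks are 1-based
        scores.merge(ranked.get(i), 1.0 / (k + rank), Double::sum);
      }
    }
    List<String> fused = new ArrayList<>(scores.keySet());
    // Highest fused score first; ties broken by id so the output is deterministic.
    fused.sort((a, b) -> {
      int byScore = Double.compare(scores.get(b), scores.get(a));
      return byScore != 0 ? byScore : a.compareTo(b);
    });
    return fused;
  }
}
```

With the lists [a, b, c] and [b, c, d] and k = 60, documents b and c appear in both lists and therefore rank ahead of a and d.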

Tests

Added tests for the RRF logic in isolation.
Added tests for search-index-level and distributed requests.
Added tests asserting the existing behaviour of the search handler's QueryComponent as well as the newly added CombinedQueryComponent, based on the flag in the JSON Query DSL.

Checklist

Please review the following and check all that apply:

I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
I have created a Jira issue and added the issue ID to my pull request title.
I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
I have developed this patch against the main branch.
I have run ./gradlew check.
I have added tests for my changes.
I have added documentation for the Reference Guide

ercsonusharma · 2025-07-09T17:21:12Z

@alessandrobenedetti @dsmiley, please help review it whenever you can. Thanks!

dsmiley

Really glad to see this work began by acknowledging the existing work and trying to address the pitfalls!

alessandrobenedetti · 2025-07-10T09:54:16Z

Hi @ercsonusharma, thanks for resurrecting this. I didn't have time to dedicate to the feature in the last few months, so it's good to see some movement!

In the next couple of weeks, I should be able to give it a go and review it!

cpoerschke · 2025-09-01T17:13:39Z

solr/core/src/java/org/apache/solr/handler/component/CombinedQueryComponent.java

+    // save these results in a private area so we can access them
+    // again when retrieving stored fields.
+    // TODO: use ResponseBuilder (w/ comments) or the request context?
+    rb.resultIds = createShardResult(rb, shardDocMap, responseDocs);


If as per https://github.com/apache/solr/pull/3418/files#r2314366325 the maxScore setting were to be removed then I think here we could simplify like this ...

Suggested change

-    rb.resultIds = createShardResult(rb, shardDocMap, responseDocs);
+    rb.resultIds = createShardResult(rb, shardDocMap);
+    for (int i = 0; i < rb.resultIds.size(); i++) responseDocs.add(null);

... and then somehow (still thinking about that) something createShardResult-like could be factored out in the QueryComponent base class and overridden here.

No, it cannot be removed. maxScore is shown as part of the SolrDocument result, and it has to be updated with the latest maxScore after RRF.

cpoerschke · 2025-09-02T12:29:13Z

solr/core/src/java/org/apache/solr/handler/component/CombinedQueryComponent.java

+        ShardDoc shardDoc = new ShardDoc();
+        shardDoc.id = id;
+        shardDoc.shard = srsp.getShard();
+        shardDoc.orderInShard = i;
+        Object scoreObj = doc.getFieldValue(SolrReturnFields.SCORE);
+        if (scoreObj != null) {
+          if (scoreObj instanceof String) {
+            shardDoc.score = Float.parseFloat((String) scoreObj);
+          } else {
+            shardDoc.score = ((Number) scoreObj).floatValue();
+          }
+        }
+        if (!scoreDependentFields.isEmpty()) {
+          shardDoc.scoreDependentFields = doc.getSubsetOfFields(scoreDependentFields);
+        }
+
+        shardDoc.sortFieldValues = unmarshalledSortFieldValues;
+        shardDocMap.computeIfAbsent(srsp.getShard(), list -> new ArrayList<>()).add(shardDoc);
+        String prevShard = uniqueDoc.put(id, srsp.getShard());
+        if (prevShard != null) {
+          // duplicate detected
+          numFound--;
+        }


observations: the QueryComponent equivalent to this block of code is https://github.com/apache/solr/blob/releases/solr/9.9.0/solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java#L1122-L1156 but there are differences:

in QueryComponent duplicates are unusual and will be omitted (and the numFound counter decremented)

in CombinedQueryComponent duplicates are possible and must not be omitted (but the numFound counter will be decremented)

Added 4dcbb57 dev increment with this in mind i.e. supportive of both scenarios.

Yes, it doesn't duplicate all of the method's code, rather about half of it.
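The two duplicate-handling behaviours described in this thread can be sketched as below. This is a stand-alone illustration, not the ShardDocQueue code from commit 4dcbb57: the Hit and Merged records and the keepDuplicates flag are hypothetical names. Only the uniqueDoc map and the numFound decrement mirror the snippet quoted above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only; Hit, Merged and keepDuplicates are hypothetical names. */
final class DuplicateHandlingSketch {

  record Hit(String id, String shard) {}

  record Merged(List<Hit> kept, long numFound) {}

  /**
   * keepDuplicates = false models the QueryComponent behaviour (a repeated id is omitted);
   * keepDuplicates = true models the CombinedQueryComponent behaviour (a repeated id is kept).
   * In both cases numFound is decremented once per repeated id.
   */
  static Merged merge(List<Hit> hits, long numFound, boolean keepDuplicates) {
    Map<String, String> uniqueDoc = new HashMap<>();
    List<Hit> kept = new ArrayList<>();
    for (Hit hit : hits) {
      String prevShard = uniqueDoc.put(hit.id(), hit.shard());
      if (prevShard != null) {
        numFound--; // duplicate detected
        if (!keepDuplicates) {
          continue; // omit the repeated document
        }
      }
      kept.add(hit);
    }
    return new Merged(kept, numFound);
  }
}
```

Given three hits where id "1" is returned twice, both modes report numFound as 2, but only the keepDuplicates mode retains all three entries for later combining.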

dsmiley · 2025-09-02T13:28:26Z

Can you please "resolve" any conversations you think were addressed? This is a long PR with many conversations, making it hard to catch up with the current state.

dsmiley · 2025-09-04T05:50:02Z

solr/core/src/java/org/apache/solr/handler/component/CombinedQueryComponent.java

+        final var unparsedQuery = params.get(queryKey);
+        ResponseBuilder rbNew = new ResponseBuilder(rb.req, new SolrQueryResponse(), rb.components);
+        rbNew.setQueryString(unparsedQuery);
+        super.prepare(rbNew);


Wouldn't we want to manipulate the sort spec so that we get all docs up to offset (AKA the "start" param) + rows, since the RRF/combiner is going to want to see all docs/rankings up to offset+rows? Otherwise our combiner is blind to the "offset" docs. Assuming you agree, we basically need to apply paging at this layer (our component) instead of letting the subquery do it.

It's already happening here anyway.

That's for distributed-search but not single-core search.

I think user-managed/standalone vs SolrCloud is orthogonal. This is about a single shard working correctly (in whatever Solr mode). IMO it's not optional for basic paging parameters to work correctly with one shard.

I could imagine we'd prefer a mechanism for a SearchComponent to force the "shortCircuit"=false thereby ensuring there's always a distributed phase. Maybe that could be done by re-ordering SearchHandler's call to getAndPrepShardHandler to be after prepareComponents (swap adjacent lines)? Then the prepare method of this component could force distrib and add the shortCircuit=false or something like that. And/or maybe a component should have a more elegant callback to communicate that it forces distributed search (even when there's one shard/core). Overall this would simplify the component: it would no longer need to handle paging in process() and would instead handle it for distributed search only.
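The paging concern raised in this thread can be sketched as follows. This is a stand-alone illustration, not code from the PR: CombinerPagingSketch, subQueryRows and page are hypothetical names. The idea is that every subquery is asked for offset + rows documents starting at 0, the combiner fuses those full lists, and only then is the requested page sliced out.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; hypothetical helper names, not part of the PR. */
final class CombinerPagingSketch {

  /** Rows each subquery should return (from start=0) so the combiner sees every ranking up to offset+rows. */
  static int subQueryRows(int offset, int rows) {
    return Math.addExact(offset, rows);
  }

  /** Applies offset/rows to the already combined list, i.e. paging happens after fusion. */
  static <T> List<T> page(List<T> combined, int offset, int rows) {
    int from = Math.min(offset, combined.size());
    int to = (int) Math.min((long) from + rows, combined.size());
    return new ArrayList<>(combined.subList(from, to));
  }
}
```

For example, with start=10 and rows=5 each subquery would be asked for 15 documents, and the page is taken from positions 10 to 14 of the fused list. If the subqueries applied the offset themselves, the combiner would never see the first 10 rankings.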

dsmiley

The beauty/wisdom of BaseDistributedSearchTestCase is that it tests consistency between single shard and multi-shard. I think it's brilliant; that is the point of this base class. Doing so requires that you use the correct utility methods it provides. I noticed your test calls queryServer instead of query. If you look at their impls, you'll see what I'm getting at. You'll see other subclass tests using the various methods to do these tests.

I suspect there's a single-shard pagination bug. If so, then correct usage of this base class would surface it without you having to write more tests.

dsmiley · 2025-09-04T16:33:50Z

The beauty/wisdom of BaseDistributedSearchTestCase is that it tests consistency between single shard and multi-shard. I think it's brilliant; that is the point of this base class.

Yet this PR/approach will not be able to comply since unlike most (all?) components, its results are affected substantially by distributed-search. The (unsaid?) vision of sharding / distributed-search was getting the same results as a single shard, and Solr does the work to pull off that trick, with plenty of tests demonstrating it does. In fact I'd say, with great disappointment, that the observed (by a user) results of this component will not be RRF when there's distributed search over shards.
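One way to see this concern is the small worked example below. It assumes, for illustration only, that each shard fuses its own results using shard-local ranks and that the coordinator then orders documents by those fused scores; whether this matches the PR's exact merge logic is not confirmed here. The class name, the document ids and k = 60 are all hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only; hypothetical data and names, assuming shard-local ranks are used for fusion. */
final class ShardLocalRrfSketch {

  static final int K = 60; // assumed RRF constant

  /** Sums 1 / (K + rank) for each document over all ranked lists (ranks are 1-based). */
  static Map<String, Double> rrfScores(List<List<String>> rankedLists) {
    Map<String, Double> scores = new HashMap<>();
    for (List<String> ranked : rankedLists) {
      for (int i = 0; i < ranked.size(); i++) {
        scores.merge(ranked.get(i), 1.0 / (K + i + 1), Double::sum);
      }
    }
    return scores;
  }

  /** RRF over the whole-collection rankings of two queries, as a single shard would compute it. */
  static Map<String, Double> global() {
    return rrfScores(List.of(
        List.of("c", "d", "a", "b", "e"),
        List.of("c", "d", "a", "e", "b")));
  }

  /** RRF computed per shard: shard1 holds {a, b}, shard2 holds {c, d, e}; ranks restart at 1 on each shard. */
  static Map<String, Double> shardLocal() {
    Map<String, Double> merged = new HashMap<>();
    merged.putAll(rrfScores(List.of(List.of("a", "b"), List.of("a", "b"))));
    merged.putAll(rrfScores(List.of(List.of("c", "d", "e"), List.of("c", "d", "e"))));
    return merged;
  }
}
```

In the global fusion, d (ranked second by both queries) scores above a (ranked third by both). With shard-local ranks, a is first on its shard and d is second on its shard, so a scores above d and the final order no longer matches what RRF over the whole collection would produce.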

ercsonusharma · 2025-09-08T02:57:56Z

Yet this PR/approach will not be able to comply since unlike most (all?) components, its results are affected substantially by distributed-search. The (unsaid?) vision of sharding / distributed-search was getting the same results as a single shard, and Solr does the work to pull off that trick, with plenty of tests demonstrating it does. In fact I'd say, with great disappointment, that the observed (by a user) results of this component will not be RRF when there's distributed search over shards.

I pushed a change to the PR that adds an option for the user to choose which combiner method to use: Way 1 (pre) or Way 2 (post).

Sonu Sharma added 4 commits July 4, 2025 14:24

Combined Query Feature for Multi Query Execution

bf3cd5d

Tests: Combined Query Feature for Multi Query Execution

182bec9

Tests: Combined Query Feature for Multi Query Execution

b884f0e

Tests: Combined Query Feature for Multi Query Execution

29e8aea

github-actions bot added client:solrj tests cat:search module:clustering labels Jul 4, 2025

Improve: Fix typo

c113799

cpoerschke reviewed Jul 4, 2025

solr/core/src/java/org/apache/solr/handler/component/CombinedQueryComponent.java

cpoerschke reviewed Jul 4, 2025

solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java (outdated)

cpoerschke reviewed Jul 4, 2025

solr/core/src/java/org/apache/solr/handler/component/CombinedQueryComponent.java (outdated)

ercsonusharma added 2 commits July 4, 2025 22:58

Tests: Fix errors

3600ed3

Review comments: implementation

9b0c76e

atris requested changes Jul 9, 2025

dsmiley reviewed Jul 9, 2025

ercsonusharma added 3 commits July 12, 2025 14:23

Code review changes

a841bc7

Code review changes

91f8e09

Code review changes

cace1f7

github-actions bot removed the module:clustering label Jul 12, 2025

ercsonusharma added 3 commits July 13, 2025 21:35

Code review changes

299db43

Code review changes

840070e

Improvement and fixes

d2feefc

ercsonusharma requested a review from atris July 16, 2025 18:45

cpoerschke reviewed Jul 25, 2025

solr/core/src/java/org/apache/solr/handler/component/CombinedQuerySearchHandler.java (outdated)

cpoerschke reviewed Jul 25, 2025

solr/core/src/java/org/apache/solr/handler/component/CombinedQueryComponent.java (outdated)

cpoerschke reviewed Jul 25, 2025

solr/core/src/java/org/apache/solr/handler/component/CombinedQueryComponent.java (outdated)

cpoerschke reviewed Sep 1, 2025

solr/core/src/java/org/apache/solr/handler/component/CombinedQueryComponent.java

dev increment: add uniqueDoc map-and-logic to ShardDocQueue

4dcbb57

cpoerschke reviewed Sep 2, 2025

ercsonusharma and others added 4 commits September 2, 2025 18:20

review comment fix

8a65023

micro dev increment: replace unnecessary local resultSize use in QueryComponent.mergeIds

006b8c2

dev increment: factor out ShardDocQueue.resultIds method

771089b

dev increment: remove no-longer-used ShardDocQueue.(pop,size) methods

460e8cd

dsmiley reviewed Sep 2, 2025

solr/core/src/java/org/apache/solr/handler/component/CombinedQueryComponent.java (outdated)

solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java (outdated)

cpoerschke reviewed Sep 2, 2025

solr/solr-ref-guide/modules/query-guide/pages/json-combined-query-dsl.adoc

cpoerschke reviewed Sep 2, 2025

solr/solr-ref-guide/modules/query-guide/pages/json-combined-query-dsl.adoc

ercsonusharma and others added 5 commits September 3, 2025 10:30

review comment fix

ac85d2f

review comment fix

7b0593c

review comment enhancement

c03c0f7

simplification/consolidation: protected QueryComponent.newShardDocQueue instead of implementMergeIds-taking-ShardDocQueueFactory

a52dd22

factor out protected QueryComponent.setResultIdsAndResponseDocs method

195f3f1

dsmiley reviewed Sep 3, 2025

solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java (outdated)

solr/core/src/java/org/apache/solr/handler/component/HighlightComponent.java (outdated)

ercsonusharma added 3 commits September 3, 2025 19:57

review comment enhancement

c1f5501

Merge branch 'feat_combined_query' of https://github.com/ercsonusharma/solr into feat_combined_query

3649d3e

refactor to reduce cyclometric complexity

4eedbed

dsmiley reviewed Sep 4, 2025

review comment fixes

0990e7f

dsmiley reviewed Sep 4, 2025

solr/core/src/java/org/apache/solr/handler/component/CombinedQueryComponent.java (outdated)

dsmiley reviewed Sep 4, 2025

solr/core/src/java/org/apache/solr/search/combine/ReciprocalRankFusion.java (outdated)

solr/core/src/test/org/apache/solr/handler/component/DistributedCombinedQueryComponentTest.java (outdated)

debug params fix and rrf shard sort order

14ff5e1

ercsonusharma added 2 commits September 5, 2025 13:58

test cases fix and rrf shard sort order

bd637b7

introducing combiner methods as pre and post

2958599
