Conversation

benchaplin (Contributor):

Resolves #134151, #130821.

Background

A bug was introduced in #121885 by the following code, which handles batched query exceptions arising from a batched partial reduction failure:

```java
@Override
public void handleException(TransportException e) {
    Exception cause = (Exception) ExceptionsHelper.unwrapCause(e);
    logger.debug("handling node search exception coming from [" + nodeId + "]", cause);
    if (e instanceof SendRequestTransportException || cause instanceof TaskCancelledException) {
        // two possible special cases here where we do not want to fail the phase:
        // failure to send out the request -> handle things the same way a shard would fail with unbatched execution
        // as this could be a transient failure and partial results we may have are still valid
        // cancellation of the whole batched request on the remote -> maybe we timed out or so, partial results may
        // still be valid
        onNodeQueryFailure(e, request, routing);
    } else {
        // Remote failure that wasn't due to networking or cancellation means that the data node was unable to reduce
        // its local results. Failure to reduce always fails the phase without exception so we fail the phase here.
        if (results instanceof QueryPhaseResultConsumer queryPhaseResultConsumer) {
            queryPhaseResultConsumer.failure.compareAndSet(null, cause);
        }
        onPhaseFailure(getName(), "", cause);
    }
}
```

Raising a phase failure in this way leads to a couple of issues:

  1. It can be called more than once (as seen in #134151: [Search] Exceptions in datanodes leading to assertFirstRun() failures).
  2. The subsequent freeing of contexts can miss concurrent in-flight queries, resulting in open contexts after the failure (as seen in #130821: [CI] SearchWithRejectionsIT testOpenContextsAfterRejections failing).

Solution

Problem 1 could be resolved with a simple flag, as proposed in #131085. Problem 2 could be resolved with some careful use of the same flag to clean up contexts upon receiving stale query results. However, in the interest of stability, I propose a solution that more closely resembles how a reduction failure is handled by a non-batched query phase. In non-batched execution, a reduction failure is held in the QueryPhaseResultConsumer until shard fanout is complete. Only later, during the final reduction at the beginning of the fetch phase, do we fail the search (sketched below).
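As an illustration of this deferred-failure pattern, here is a minimal sketch with simplified names (these are not the actual Elasticsearch classes):

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal sketch of deferring a reduction failure: the consumer records the
// first failure and keeps accepting shard results; the failure only surfaces
// when the fetch phase asks for the final reduce.
class DeferredFailureConsumer {

    // compareAndSet keeps only the first failure, mirroring
    // queryPhaseResultConsumer.failure.compareAndSet(null, cause) above
    private final AtomicReference<Exception> failure = new AtomicReference<>();

    void onPartialReduceFailure(Exception cause) {
        failure.compareAndSet(null, cause);
        // no phase failure here: shard fanout continues undisturbed
    }

    Object reduce() throws Exception {
        Exception f = failure.get();
        if (f != null) {
            throw f; // the search fails exactly once, at the start of the fetch phase
        }
        return mergeResults();
    }

    private Object mergeResults() {
        return new Object(); // placeholder for the real merge
    }
}
```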

Fast failure + proper task cancellation are worthy goals for the future. I am tracking these as follow-up improvements for after the release of batched query execution.

This PR:

  1. Alters the batched query request to respond with shard results even in the case of a reduction failure on the data node (the failure is now conditionally included in the NodeQueryResponse; see the write-side sketch after this list).
  2. Removes the early phase failure on the coordinating node. The coordinator's QueryPhaseResultConsumer will hold onto the failure and eventually fail during the fetch phase, same as non-batched execution.
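
For illustration, the write side implied by the new read path (quoted later in this thread) might look roughly like this; the field names and the Writeable counterparts are assumptions, not the PR's exact code:

```java
@Override
public void writeTo(StreamOutput out) throws IOException {
    // each slot is either a QuerySearchResult or a per-shard exception,
    // tagged with a boolean so the reader knows which branch to take
    out.writeArray((o, result) -> {
        if (result instanceof QuerySearchResult querySearchResult) {
            o.writeBoolean(true);
            querySearchResult.writeTo(o);
        } else {
            o.writeBoolean(false);
            o.writeException((Exception) result);
        }
    }, results);
    mergeResult.writeTo(out);
    topDocsStats.writeTo(out);
    // presence flag: only serialize a reduction failure when one occurred
    Exception reductionFailure = this.reductionFailure;
    out.writeBoolean(reductionFailure != null);
    if (reductionFailure != null) {
        out.writeException(reductionFailure);
    }
}
```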

@benchaplin added the >bug, Team:Search Foundations, :Search Foundations/Search, and v9.1.7 labels on Oct 21, 2025
elasticsearchmachine (Collaborator):

Pinging @elastic/es-search-foundations (Team:Search Foundations)

elasticsearchmachine (Collaborator):

Hi @benchaplin, I've created a changelog YAML for you.

The review comments below are anchored on the new deserialization code:

```java
// each array slot is either a QuerySearchResult or a per-shard exception,
// distinguished by a boolean tag
this.results = in.readArray(i -> i.readBoolean() ? new QuerySearchResult(i) : i.readException(), Object[]::new);
this.mergeResult = QueryPhaseResultConsumer.MergeResult.readFrom(in);
this.topDocsStats = SearchPhaseController.TopDocsStats.readFrom(in);
// new presence flag: a reduction failure is only included when one occurred
boolean hasReductionFailure = in.readBoolean();
```
A reviewer (Contributor) asked:
Since we're changing the shape of this message, do we need to create a new transport version or is that taken care of for us?

benchaplin (Contributor, Author) replied:
Yes I believe I do, once I learn how 😂
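
For context, the usual Elasticsearch pattern is to register a new constant in TransportVersions and gate the new field on it, so that older nodes never try to read bytes that were never sent. A sketch of the read side (the constant name here is invented for illustration):

```java
// hypothetical version gate; the real constant would be registered in TransportVersions
if (in.getTransportVersion().onOrAfter(TransportVersions.BATCHED_QUERY_REDUCTION_FAILURE)) {
    boolean hasReductionFailure = in.readBoolean();
    this.reductionFailure = hasReductionFailure ? in.readException() : null;
} else {
    this.reductionFailure = null;
}
```

The write side would check `out.getTransportVersion()` symmetrically before writing the flag and the exception.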

@chrisparrinello (Contributor) reviewed and left a comment:

LGTM

@benchaplin added the auto-backport and v9.2.1 labels on Oct 22, 2025
@benchaplin marked this pull request as draft on October 22, 2025, 21:50

Labels

auto-backport, >bug, :Search Foundations/Search, Team:Search Foundations, v9.1.7, v9.2.1, v9.3.0
