💥 Use Temporal Failures for Nexus Error Serialization #2773

Open
Quinn-With-Two-Ns wants to merge 12 commits into temporalio:master from Quinn-With-Two-Ns:NEXUS-38

@Quinn-With-Two-Ns Quinn-With-Two-Ns commented Feb 5, 2026

💥 Use Temporal Failures for Nexus Error Serialization. The Java SDK now responds to Nexus operation failures and task failures with plain Temporal Failures. This change also allows OperationException and HandlerException to have their own message and stack trace, independent of their cause. If the server does not support the new format, the SDK converts back to the old format before sending the response.

Requires:


Note

Medium Risk
Behavior changes in how Nexus failures are serialized/deserialized and sent to the server; while guarded by capability checks and extensive tests, regressions could affect Nexus error propagation and metrics tagging across server versions.

Overview
Nexus task failure reporting is switched to the new wire format that returns plain Temporal Failure protos for both operation failures (in StartOperationResponse) and handler failures (in RespondNexusTaskFailedRequest), enabling independent message/stacktrace on OperationException/HandlerException and preserving stack traces through conversions.

Adds compatibility shims: SDK detects server support via request capabilities and can down-convert to legacy UnsuccessfulOperationError/HandlerError (also forceable via temporal.nexus.forceOldFailureFormat), while the in-memory test server is updated to advertise/support the new format and still accept the old one. Includes updated/expanded tests and bumps nexus dependency + CI Temporal CLI version.
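
The format selection described in the overview can be sketched roughly as follows. This is an illustration only, not the SDK's code: the class, enum, and method names are hypothetical, and reading temporal.nexus.forceOldFailureFormat as a JVM system property is an assumption, since the PR text gives only the name.

```java
/** Hypothetical sketch; none of these type or method names come from the Temporal Java SDK. */
final class FailureFormatSelector {
    enum WireFormat { TEMPORAL_FAILURE, LEGACY_NEXUS_ERROR }

    // Assumption: the override is a JVM system property. The PR only states its name.
    static final String FORCE_OLD_FORMAT_PROPERTY = "temporal.nexus.forceOldFailureFormat";

    /** Uses the new format only when the server supports it and the override is off. */
    static WireFormat select(boolean serverSupportsTemporalFailure, boolean forceOldFormat) {
        if (forceOldFormat || !serverSupportsTemporalFailure) {
            return WireFormat.LEGACY_NEXUS_ERROR;
        }
        return WireFormat.TEMPORAL_FAILURE;
    }

    /** Convenience overload that reads the override from the (assumed) system property. */
    static WireFormat select(boolean serverSupportsTemporalFailure) {
        return select(serverSupportsTemporalFailure, Boolean.getBoolean(FORCE_OLD_FORMAT_PROPERTY));
    }
}
```

In this sketch the override always wins, so a worker talking to a newer server can still be pinned to the legacy format while testing compatibility.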

Written by Cursor Bugbot for commit 62f038b. This will update automatically on new commits.

@Quinn-With-Two-Ns Quinn-With-Two-Ns requested a review from a team as a code owner February 5, 2026 22:21
@Quinn-With-Two-Ns Quinn-With-Two-Ns marked this pull request as draft February 5, 2026 22:22
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b2a0c78ca

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@Quinn-With-Two-Ns Quinn-With-Two-Ns marked this pull request as ready for review February 19, 2026 00:27
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 349588549b


return new HandlerException(info.getType(), cause, retryBehavior);
if (failure
.getMessage()
.startsWith(String.format("handler error (%s)", info.getType()))) {
Contributor

I understand this is here to allow proper decoding of legacy errors, but for which languages exactly? If we mean to support legacy errors coming from Go or other SDKs, are we sure the casing returned by .getType() here would match the error type used by those SDKs? Or should we make this condition a little more resilient?

Contributor Author

Any old failure serialized by Go or Java. This is the handler error type, so it should match the Nexus spec and be consistent across Go and Java.
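
To make the casing question concrete, here is a small sketch of the prefix check from the snippet under review, next to a more lenient variant. The class and method names are hypothetical, and the case-insensitive variant is only an illustration of what "more resilient" could mean; it is not something the PR adopts.

```java
/** Hypothetical sketch of the legacy-message check; names are not from the SDK. */
final class LegacyHandlerErrorPrefix {
    /** Builds the prefix quoted in the diff: "handler error (<type>)". */
    static String prefixFor(String handlerErrorType) {
        return String.format("handler error (%s)", handlerErrorType);
    }

    /** Exact, case-sensitive match, as in the reviewed snippet. */
    static boolean isLegacyMessage(String message, String handlerErrorType) {
        return message != null && message.startsWith(prefixFor(handlerErrorType));
    }

    /** Assumed alternative: same prefix test, but ignoring letter case. */
    static boolean isLegacyMessageIgnoreCase(String message, String handlerErrorType) {
        if (message == null) {
            return false;
        }
        String prefix = prefixFor(handlerErrorType);
        // regionMatches returns false when the message is shorter than the prefix.
        return message.regionMatches(true, 0, prefix, 0, prefix.length());
    }
}
```

If the handler error type strings really are identical across SDKs, as the author states, the exact match is sufficient and the lenient variant adds nothing.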

}

@SuppressWarnings("deprecation")
private void sendReply(
Contributor

The structure of this function is very hard to reason about.

Contributor Author

If we do this, are you still concerned about the structure?

Contributor

That one is certainly the most patent culprit, but I'd also try to restructure this to avoid mixing happy paths and failure cases.

Like, instead of:

      if (taskResponse != null) {
        if (!supportTemporalFailure && taskResponse.getStartOperation().hasFailure()) {
          // some failure case on old server
          return;
        }
        // happy path AND some failure case on newer servers
      }
      // other failure cases

I'd suggest restructuring the function along the lines of:

if (taskResponse != null && !taskResponse.getStartOperation().hasFailure()) {
    // Operation succeeded

} else if (taskResponse != null) {
    // Operation failed - OperationError

} else if (response.getHandlerException() != null) {
    // Operation failed - HandlerError

} else {
    // throw ...
}
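
As a self-contained illustration of the branching suggested above, the sketch below classifies a task result into one of three outcomes before any reply is built. Everything here is hypothetical: TaskResult, Outcome, and classify are stand-ins invented for the example and do not reflect the actual types or the final shape of sendReply.

```java
/** Hypothetical sketch of the suggested control flow; not the SDK's sendReply. */
final class ReplyRouting {
    enum Outcome { OPERATION_SUCCEEDED, OPERATION_FAILED, HANDLER_FAILED }

    /** Stand-in for the real task result; the three flags are assumptions for illustration. */
    static final class TaskResult {
        final boolean hasTaskResponse;
        final boolean startOperationHasFailure;
        final boolean hasHandlerException;

        TaskResult(boolean hasTaskResponse, boolean startOperationHasFailure, boolean hasHandlerException) {
            this.hasTaskResponse = hasTaskResponse;
            this.startOperationHasFailure = startOperationHasFailure;
            this.hasHandlerException = hasHandlerException;
        }
    }

    /** Each branch handles exactly one case, so success and failure paths never mix. */
    static Outcome classify(TaskResult result) {
        if (result.hasTaskResponse && !result.startOperationHasFailure) {
            return Outcome.OPERATION_SUCCEEDED;
        } else if (result.hasTaskResponse) {
            return Outcome.OPERATION_FAILED;
        } else if (result.hasHandlerException) {
            return Outcome.HANDLER_FAILED;
        }
        throw new IllegalStateException("task produced neither a response nor a handler exception");
    }
}
```

Separating classification from sending is one way to keep the branches flat even when two outcomes end up using the same gRPC method.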

Contributor Author

I'm not sure that will really be clearer, since we use the same gRPC method to respond to the server in the first and second cases here.

Contributor Author

I have an attempt at a refactor to make it clearer. Let me push it and we can discuss.

@VegetarianOrc VegetarianOrc left a comment

LGTM, just a couple of minor things

Member

@bergundy bergundy left a comment

Overall LGTM, left a few small comments.

@Quinn-With-Two-Ns
Contributor Author

Just released the new Nexus Java SDK; it takes a bit to propagate.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

}

// Create a copy without the message before serializing
FailureInfo failureCopy = FailureInfo.newBuilder(failureInfo).setMessage("").build();

Stack trace not cleared in nexus failure metadata payload

Low Severity

nexusFailureMetadataToPayloads creates a FailureInfo copy that only clears the message via setMessage("") before serializing into the details payload, but does not also clear the stack trace. The sibling methods temporalFailureToNexusFailure and temporalFailureToNexusFailureInfo both strip message and stack trace from details. This inconsistency was also noted in review — Go and Python clear both.
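
A minimal sketch of the behavior the report asks for, clearing both fields on the copy before it is serialized into details, might look like this. SimpleFailure is a hypothetical stand-in defined only for this example; it is not the SDK's FailureInfo, and the real builder may expose the stack trace differently.

```java
/** Hypothetical sketch; SimpleFailure is a stand-in, not the SDK's FailureInfo. */
final class FailureDetailsCopy {
    static final class SimpleFailure {
        final String message;
        final String stackTrace;
        final String details;

        SimpleFailure(String message, String stackTrace, String details) {
            this.message = message;
            this.stackTrace = stackTrace;
            this.details = details;
        }
    }

    /**
     * Returns a copy with message AND stack trace blanked, keeping the remaining data,
     * so the two fields are not duplicated inside the serialized details payload.
     */
    static SimpleFailure stripForDetails(SimpleFailure original) {
        return new SimpleFailure("", "", original.details);
    }
}
```

The original object is left untouched, so the outer failure can still carry the message and stack trace on its own fields.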


4 participants