Member

@bergundy bergundy commented Aug 7, 2025

Also add a failure representation of the Nexus SDK's OperationError.
The main motivations for this change are consistency with the rest of the Temporal APIs and simpler payload visiting in proxies.

I do not intend to merge this PR until I have the corresponding Go SDK and server PRs.
I do want to get early feedback on this direction before I polish the work on the two remaining PRs.

@bergundy bergundy requested review from a team as code owners August 7, 2025 00:57
ChildWorkflowExecutionFailureInfo child_workflow_execution_failure_info = 12;
NexusOperationFailureInfo nexus_operation_execution_failure_info = 13;
NexusHandlerFailureInfo nexus_handler_failure_info = 14;
NexusSDKOperationFailureInfo nexus_sdk_operation_failure_info = 15;
Contributor

I am worried about adding this, can we confirm this error won't end up as the cause of a nexus operation failure in history?

Member Author

Yeah, the idea is that it would only be used in non-workflow callers and in handlers. It may still show up in the cause chain in a workflow, just not as the top-level failure.

Contributor

When would this be part of the cause chain in a workflow?

Member Author

Since any failure can appear as a cause, there's nothing preventing users from setting OperationError as the cause of an ApplicationError or HandlerError. I don't expect this to typically happen though.
One case might be when we have non-workflow callers: your operation handler uses the client to forward the operation, and the handler wraps the resulting error. Here's an example:

func (*myHandler) StartOperation(...) (...) {
  sc := temporalnexus.GetClient(...).NexusServiceClient(...)
  _, err := sc.StartOperation(...)
  return fmt.Errorf("failed to forward call: %w", err)
}
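
To make the cause-chain point concrete, here is a minimal, self-contained sketch using only the standard library. The `operationError` type and the `startRemote` function are hypothetical stand-ins (not the Nexus SDK's actual types or client calls); the sketch only shows that wrapping with `%w` keeps the original error reachable through `errors.As` while it is no longer the top-level error.

```go
package main

import (
	"errors"
	"fmt"
)

// operationError is a hypothetical stand-in for the Nexus SDK's OperationError.
type operationError struct {
	State string // "failed" or "canceled" per the Nexus spec
	Cause error
}

func (e *operationError) Error() string {
	return fmt.Sprintf("operation %s: %v", e.State, e.Cause)
}

// Unwrap exposes the underlying cause to errors.Is and errors.As.
func (e *operationError) Unwrap() error { return e.Cause }

// startRemote is a hypothetical stand-in for the forwarded StartOperation call.
func startRemote() error {
	return &operationError{State: "failed", Cause: errors.New("boom")}
}

// forward mirrors the handler above, but only wraps when the call failed.
func forward() error {
	if err := startRemote(); err != nil {
		return fmt.Errorf("failed to forward call: %w", err)
	}
	return nil
}

// findOperationError walks the cause chain looking for an operationError.
func findOperationError(err error) (*operationError, bool) {
	var opErr *operationError
	ok := errors.As(err, &opErr)
	return opErr, ok
}

func main() {
	err := forward()
	if opErr, ok := findOperationError(err); ok {
		fmt.Println("found in cause chain, state:", opErr.State)
	}
}
```

In this sketch the wrapped error is a plain `fmt` wrapper at the top level, and the operation error only appears further down the chain, which is the situation described above.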

Contributor

Yes, that's true. My main concern is that we don't change the types in the error chains users currently see in workflows. Can we confirm that will remain the same?

Member Author

Yup, those will stay the same, with one exception: we won't construct an implicit ApplicationError as the cause when users construct a handler error with a message and both caller and handler have been upgraded.


// Representation of the Nexus SDK OperationError object.
message NexusSDKOperationFailureInfo {
string state = 1;
Member

What is "state" here? May need just a quick comment about what's accepted or what it refers to.

Member Author

It's failed | canceled per the Nexus spec. We leave it as a string to allow adding statuses on the Nexus side without requiring a Temporal upgrade.
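
As a rough illustration of that trade-off, the sketch below shows how a consumer might handle the string so that an unrecognized state degrades gracefully instead of failing. The `describeState` helper and the constant names are hypothetical and not part of any SDK; only the `failed` and `canceled` values come from the discussion above.

```go
package main

import "fmt"

// Known state values at the time of this discussion.
const (
	stateFailed   = "failed"
	stateCanceled = "canceled"
)

// describeState is a hypothetical helper that tolerates states it does not
// recognize, which is what keeping the field a string is meant to allow.
func describeState(state string) string {
	switch state {
	case stateFailed:
		return "operation failed"
	case stateCanceled:
		return "operation canceled"
	default:
		// A state added later on the Nexus side still passes through.
		return fmt.Sprintf("operation ended unsuccessfully (unrecognized state %q)", state)
	}
}

func main() {
	for _, s := range []string{"failed", "canceled", "some-future-state"} {
		fmt.Println(describeState(s))
	}
}
```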

Member

👍 Worth adding a comment here saying this is an enum defined in the Nexus spec

// aip.dev/not-precedent: Not following linter rules. --)
google.protobuf.Timestamp scheduled_time = 2;

Capabilities capabilities = 100;
Member

Who is responsible for setting this? I assume the server, correct? So is this Nexus Request message only ever expected to be constructed by the server? If this is a Temporal-server-set-only thing, should it instead be on PollNexusTaskQueueResponse?

Member Author

Both of these are constructed only by the server. No strong opinion on where this should go.

Member

I like the idea of it being on the poll response myself, but I similarly do not have a strong opinion.

UnsuccessfulOperationError operation_error = 3;
// The operation completed unsuccessfully (failed or canceled).
// Failure object must contain a NexusSDKOperationFailureInfo object.
temporal.api.failure.v1.Failure failure = 4;
Member

May want to say that if this is set, the operation_error is ignored, but it's not that important.

Member Author

It's already a oneof, so setting both isn't possible.
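
For readers less familiar with oneof semantics, here is a hand-written Go sketch of the idea. It loosely mimics the interface-plus-wrapper shape that Go protobuf code commonly uses for oneof groups, but every type and field name here is hypothetical and not the generated Temporal API code. The point is only that a single slot holds the variant, so assigning one replaces the other.

```go
package main

import "fmt"

// outcome is a hypothetical marker interface for the oneof group.
type outcome interface{ isOutcome() }

// Hypothetical wrapper types, one per oneof member.
type operationErrorVariant struct{ State string }
type failureVariant struct{ Message string }

func (operationErrorVariant) isOutcome() {}
func (failureVariant) isOutcome()        {}

// response has a single slot, so at most one variant can be set at a time.
type response struct {
	Variant outcome
}

// describe reports which variant, if any, is currently set.
func describe(r response) string {
	switch v := r.Variant.(type) {
	case operationErrorVariant:
		return "operation_error: " + v.State
	case failureVariant:
		return "failure: " + v.Message
	default:
		return "unset"
	}
}

func main() {
	r := response{Variant: operationErrorVariant{State: "failed"}}
	// Assigning the other member replaces the first; both cannot coexist.
	r.Variant = failureVariant{Message: "boom"}
	fmt.Println(describe(r))
}
```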

@bergundy bergundy force-pushed the nexus-error-attributes branch from 0066531 to 225b5bc Compare August 16, 2025 13:49
@bergundy bergundy force-pushed the nexus-error-attributes branch from 225b5bc to 7b57017 Compare September 10, 2025 17:14

3 participants