Skip to content

Conversation

@bergundy
Copy link
Member

@bergundy bergundy commented Sep 17, 2025

What changed?

Added first revision of APIs for standalone activities.

Missing API support for:

  • PauseActivity
  • UnpauseActivity
  • ResetActivity
  • UpdateActivityOptions

Notes for reviewers

  • See the open TODOs, some decisions need to be finalized.
  • Do we want to have an activity specific ID reuse policy and conflict policy or can we make this generic?
    • There are already other cases planned where we would want to use the same definitions.
  • Do we need a generic form of ActivityExecution? What would it be called? EntityExecution?
    • Decided not to do this for now and go with a flat API.
  • Does pause info need to be extracted out?
  • How does this integrate with versioning? Can the client specify any versioning constraints on the StartActivityExecutionRequest?
  • Do we want to use ActivityOptions in StartActivityExecutionRequest? Would more of the fields in the request be considered options in the future?

Server PR

Not done yet, this will be the next step.

ACTIVITY_EXECUTION_STATUS_UNSPECIFIED = 0;
// The activity is not in a terminal status. This does not necessarily mean that there is a currently running
// attempt. The activity may be backing off between attempts or waiting for a worker to pick it up.
ACTIVITY_EXECUTION_STATUS_RUNNING = 1;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will likely add a PAUSED status here since this is the direction workflow pause is going.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think either are a wise direction. What is happening between "running" and "done" are not traditionally execution statuses. Are you going to have a "backing off" status too? Execution status has traditionally been "running or done", and this changes those expectations unnecessarily. Can you use a separate enum to provide more detailed information about the state the running activity/workflow is in and leave the execution status alone?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what we are doing for workflow pause, we should be consistent for any type of execution.

Copy link
Member

@cretz cretz Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I am saying we shouldn't do it for workflow pause either (and have mentioned it in those situations too). It's the same reason that "backing off" doesn't need to be a status. Sure it is effectively paused during backoff, but that doesn't affect that Temporal defines activities and workflows as running if they are not completed.

There are many many cases where the expectations would be violated. For instance, people expect ID reuse policy to only apply to not-running workflows, and ID conflict policy to only apply to running workflows. This violates both of those. People expect ExecutionStatus != Running query to give them all completed workflows. I could come up with many more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd really nice to be able to see paused executions and their counts in the main list view. I don't think "backing off" is the same as pause, we don't update visibility and don't have enough of an incentive to do that for intermediate transient states.

In any case, we shouldn't be having this debate here. Whatever we decide to do for workflows we will copy to activities. For now, we are not even implementing standalone activity pause. That will likely come in the second phase of this project.

Copy link
Member

@cretz cretz Oct 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd really nice to be able to see paused executions and their counts in the main list view

Sure, but that doesn't mean affecting existing fields/concepts like status that have a clear defined expectation and always have

In any case, we shouldn't be having this debate here

👍

Comment on lines 23 to 25
// When StartActivityExecution uses the ID_CONFLICT_POLICY_USE_EXISTING and there is already an existing running
// activity, OnConflictOptions defines actions to be taken on the existing running activity, updating its state.
message OnConflictOptions {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be ok not doing this for the first version. If we are to keep this, the comments above the fields need to be updated to not refer to workflows.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this will only be added when we do the Nexus integration. I put it here for completeness.

}

// Limited activity information returned in the list response.
message ActivityListInfo {
Copy link
Member

@cretz cretz Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be consistent with workflow, this one should be called ActivityExecutionInfo and we can add ActivityExecutionExtendedInfo or something if you need things that are not in visibility. I cover this in a later comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was not done correctly for workflows.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree, and even if I did agree, there's no harm in being consistent here and making list a subset of describe

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the consistency argument is that strong here. We have seen multiple times that we assume something needs to only be in describe or list and realize we actually want it in both. The two separate messages don't allow for repurposing the fields without deprecation or duplication.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking that ActivityExecutionListInfo is a more consistent name for this. The underlying entity in both cases is an activity execution; it's the amount of info that differs. Thus

  • ActivityExecutionInfo: full info for an activity execution, returned by describe
  • ActivityExecutionListInfo: reduced info for an activity execution, returned by list

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, list should be a subset of describe. I assume the omission of task queue is an oversight.

Task queue is in activity options.

And maybe the next developer doing a future CHASM archetype decides to be inconsistent in their way that they think represents the future and has plenty of justification at that time.

That's why we are having these reviews to agree on a direction. Being consistent with a single API, for the sake of consistency, is not a strong argument IMHO. We have a chance to make things better, which Dan and I believe this is achieving.

This is too small of a concern to completely upend model reuse for everyone. This can be used as justification to never use a model in more than one place. We could just inline/copy all fields everywhere.

Why is it too small of a concern? There are definitely places where this doesn't apply, nobody says we need to take this guideline as a strict rule, it should be applied as necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolution here (i.e. matching latest PR state, which is pending SDK approval) is that the API will use independent response structs:

// Information about a standalone activity.
message ActivityExecutionInfo {

// Limited activity information returned in the list response.
message ActivityExecutionListInfo {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify whether you resolve to make sure that everything that is in list is also in non-list? Anything beyond relying on close PR scrutiny to ensure that?

Copy link
Contributor

@dandavison dandavison Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My view is that all the information in list should be in non-list. However, it may not be a simple subset, as the example of task_queue showed (its present in non-list as part of ActivityOptions). I've added the following comment:

// When adding fields here, ensure that it is also present in ActivityExecutionInfo (note that it
// may already be present in ActivityExecutionInfo but not at the top-level).

So there's that that, and PR scrutiny.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server could have a test that verifies that every field in list-info can be computed from info. It wouldn't stop PRs here merging but it would ensure the server is not released violating this.

Comment on lines 2685 to 2706
message DescribeActivityExecutionResponse {
temporal.api.activity.v1.ActivityExecutionInfo info = 1;

// A token that can be passed in via a subsequent `DescribeActivityExecutionRequest` to long poll on the activity
// state as it makes progress.
bytes long_poll_token = 2;
}

message ListActivityExecutionsRequest {
string namespace = 1;
// Max number of executions to return per page.
int32 page_size = 2;
// Token returned in ListActivityExecutionsResponse.
bytes next_page_token = 3;
// Visibility query, see https://docs.temporal.io/list-filter for the syntax.
string query = 4;
}

message ListActivityExecutionsResponse {
repeated temporal.api.activity.v1.ActivityListInfo executions = 1;
// Token to use to fetch the next page. If empty, there is no next page.
bytes next_page_token = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To match workflow, describe should be a superset of list response, not a completely different structure. This makes sense with many of our SDKs where describe returns a class that extends what list returns. IMO this should be (ignoring long poll stuff):

message DescribeActivityExecutionResponse {
    temporal.api.activity.v1.ActivityExecutionInfo info = 1;
    temporal.api.activity.v1.ActivityExecutionExtendedInfo extended_info = 2;

    // ...
}

// ...

message ListActivityExecutionsResponse {
    repeated temporal.api.activity.v1.ActivityExecutionInfo executions = 1;
    // ...

If there are some things that don't feel like they overlap, e.g. task queue or other activity options on the shared info only represent what it was started with, then those fields can be named/documented appropriately. It makes no sense to get more information from list. Describe should always have at least what visibility does.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the consistency here is important TBH. Using the same info types for workflow list and describe was an oversight that we recognize today. I would be open to changing this if message reuse simplifies anything significantly on the SDK side.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it was an oversight. Do we believe that describe data is a superset of the list data? If so, we need to model it that way. The SDKs (that have types for list and describe) are modeled that way very intentionally (via inheritance or similar) because it makes perfect sense that visibility is a subset of the data.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented on this above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we at least agree that describe data is a superset of the list data?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we at least agree that describe data is a superset of the list data?

I agree with that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolution here (i.e. matching latest PR state, which is pending SDK approval) is that the API will use independent response structs:

// Information about a standalone activity.
message ActivityExecutionInfo {

// Limited activity information returned in the list response.
message ActivityExecutionListInfo {

Copy link
Member

@cretz cretz Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with that.

The model already violates this. For instance, we have decided for list response we are giving a field for execution_duration. Same for close_time too unless I'm missing where these two fields are defined.

How are you planning to uphold this assurance in the absence of normal model reuse? My concern is the only way it will be upheld is if I personally review every single field very closely. I think we can expect for this to happen plenty if it's happening already before the first PR lands (again, unsure, maybe it's in a model I missed somewhere, hard to know without proper model reuse).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are you planning to uphold this assurance in the absence of normal model reuse? My concern is the only way it will be upheld is if I personally review every single field very closely.

The server could have a test that verifies that every field in list-info can be computed from info.

I think we can expect for this to happen plenty if it's happening already before the first PR lands

(Actually, I'd say this PR will be subject to more stringent validation when we start implementing with it than now when we're just eyeballing it)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test server side would make me much more comfortable, thanks

ACTIVITY_EXECUTION_STATUS_UNSPECIFIED = 0;
// The activity is not in a terminal status. This does not necessarily mean that there is a currently running
// attempt. The activity may be backing off between attempts or waiting for a worker to pick it up.
ACTIVITY_EXECUTION_STATUS_RUNNING = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think either are a wise direction. What is happening between "running" and "done" are not traditionally execution statuses. Are you going to have a "backing off" status too? Execution status has traditionally been "running or done", and this changes those expectations unnecessarily. Can you use a separate enum to provide more detailed information about the state the running activity/workflow is in and leave the execution status alone?

// If the request is denied, the server returns an `ExecutionAlreadyStarted` error.
//
// See `IdConflictPolicy` for handling ID duplication with a *running* execution.
enum IdReusePolicy {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem general purpose, it seems each top-level Temporal component has its own ID reuse/conflict policy enumerates. Shouldn't this be ActivityIdReusePolicy and ActivityIdConflictPolicy? I think this actually may have value if/when we need something different for one component or the other. From an SDK POV, I would expect these to be separately named enums too because we like consistency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already know that I want this for "standalone" Nexus operations, and generally these reuse and conflict policies are framework capabilities. I am already at the point of wanting to generalize this and not having to redefine for every new primitive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 3 sets of enums here for their distinct uses is better than being inconsistent (even if they happen to have the same values so far). How do you expect the SDKs to model this? To be inconsistent with themselves, or are SDKs held to a higher standard of consistency?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're going to get tired of redefining these enums everywhere, and for the most part, SDKs can alias types for the workflow flavors of these enums.

Copy link
Member

@cretz cretz Oct 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are already redefining them in other user-facing places. SDKs can not always alias "workflow flavors", not every language works like that. It is important to see this on the whole and understand all user interfaces, not assume what user-facing aspects can do. We're not going to get tired of these two enums on a situational basis any more than we are with cancellation types or other things where we have done this. The SDKs will at least remain as consistent as possible, even if the API sadly continues to choose not to.

I am now growing convinced we need to see user-facing SDK designs before we can understand the APIs needed for them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm now leaning towards having a specific one for activities just to avoid mis-predicting the future where we would want to allow different archetypes to have different ID reuse and conflict policies. It's worth the duplication IMHO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #650 targeting this PR switching to activity-specific reuse and conflict policy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolution here (i.e. matching latest PR state, which is pending SDK approval) is that the API will use activity-specific reuse and conflict policies, as opposed to policies intended to be shared between different execution types.

Comment on lines 2681 to 2682
// This token is returned as part of the `DescribeActivityExecutionResponse`.
bytes long_poll_token = 5;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One wonders if there is value in "activity event" ledger. People are going to long poll and wonder why they can't get updated for each attempt that has an issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this and decided not to add it for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when the activity state changes

Let's define "activity state" in the comment here. (What is the intended definition here?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolution here (i.e. matching latest PR state, which is pending SDK approval) is that server is only going to return latest state; it will not persist a ledger/history. Added documentation:

// returning the new state. Note that activity state may change multiple times between
// requests, therefore it is not guaranteed that a client making a sequence of long-poll
// requests will see a complete sequence of state changes.

message CountActivityExecutionsRequest {
string namespace = 1;
// Visibility query, see https://docs.temporal.io/list-filter for the syntax.
string query = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the built-in search attributes for activities still TBD?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not fully fleshed out yet. I believe this will be the rough set of attributes:

ActivityId
RunId
ActivityType
TaskQueue
StartTime
ExecutionTime
CloseTime
ExecutionStatus
ExecutionDuration
PauseInfo -- when we add pause support

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also adding StateTransitionCount. I think that may be the list we end up going with.

// If set, turns this request into a long poll that is unblocked when the activity reaches a terminal status.
// The wait duration is capped by the request's context deadline or by the maximum enforced long poll interval
// allowed by the server.
bool wait = 4;
Copy link
Member

@cretz cretz Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit confusing from a UI/CLI POV. So if I want to wait on an update for anything but result I have to use one call, and to wait on result, I have to use this call separately? For workflows, interfaces can use the same call to wait on any kind of update, whether its result or events or whatever. I think we should consider doing the same for activities. I can give some suggestions on ideas if we don't want a full activity ledger, but many boil down to having one RPC that can opt-in to whether it cares about only result or any update, and that doesn't have to have been called first before it can be long polled. This can be mashed into describe if you must, but I like describe to be a fast synchronous call for all components.

I recommend something like:

message GetActivityExecutionStatusRequest {
    string namespace = 1;
    string activity_id = 2;
    string run_id = 3;

    bool include_info = 4;
    bool wait_info = 5; // Cannot be true if include_info is false
    bytes continue_info_wait_token = 6; // Cannot have a value if either of the above two are false

    bool include_outcome = 7;
    bool wait_outcome = 8; // Cannot be true if include_outcome is false
}

message GetActivityExecutionStatusRequest {
    // These three only set if include_info is true (and if it's true, they are _always_ all set)
    temporal.api.activity.v1.ActivityExecutionInfo current_info = 1;
    temporal.api.activity.v1.ActivityExecutionExtendedInfo current_extended_info = 2;
    bytes continue_info_wait_token = 3;

    // This is only set if is activity completed and include_outcome is true
    oneof result {
        temporal.api.common.v1.Payloads result = 4;
        temporal.api.failure.v1.Failure failure = 5;
    }
}

This is important to reduce interface polling calls that need to know both of these things. I could see an SDK however splitting these into separate calls. It also allows us to keep describe as a simple synchronous getter like it is everywhere else.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea. Let me take a stab at it. This just means that we won't have Describe at all but I'm fine with that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should absolutely have describe at least from a user POV. I understand the API is not held to a very high standard of consistency/clarity, but our SDKs are, so I guess we can reuse this API call and hide our unfortunate lack of API consistency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just don't see a good enough reason to duplicate the APIs apart from the consistency argument, which I don't think is super important. Workflows have both history and mutable state, activities only have mutable state. You don't need two APIs here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just don't see a good enough reason to duplicate the APIs apart from the consistency argument

If you discount the good enough reason, then yes it will be hard to see a good enough reason. I was referring to the SDK at least being consisten.

I am now convinced this API cannot be designed well without designing the SDK APIs that will use it. SDK will have a synchronous describe and it sounds like API will not, so we need to see how it fits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the proposal here -- that the API should have a single endpoint with the option for long-poll for various stages including completed, thus allowing result to be retrieved and / or other status info ("describe"). The SDKs can present separate APIs if they choose to.

Copy link
Contributor

@dandavison dandavison Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolution here (i.e. matching latest PR state, which is pending SDK approval) is that the API will offer a method named PollActivityExecutionRequest to poll for state and/or result, optionally including input, with no-wait and long-poll variants. The name and functionality aims for clarity and increased consistency (with WorkflowExecutionUpdate specifically):

  • "Poll" is accurate: the method offers two forms of poll and long-poll
  • The functionality is similar to PollWorkflowExecutionUpdate and the name thus emphasizes this consistency.
  • It feels better IMO to use "poll" to wait for a result than to use "describe" for that.
  • It makes sense IMO that an SDK would implement "describe" and "get result" in terms of a lower-level "poll" API

message RequestCancelActivityExecutionResponse {
}

message TerminateActivityExecutionRequest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're a bit inconsistent on whether our side-effecting calls need a request ID for dedupe. I understand that a terminate is final and therefore a successive call is "safe", but, a user could see an error for the second terminate call I assume. We do not need to solve in this PR since it's a general problem, but IMO we should see if we can make a policy of "request ID on all side-effecting calls" though I understand it causes server side effort/persistence to dedupe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied this from workflow terminate, I wonder if it's worth having consistency there or not. It's not much trouble to store the terminate request ID and return an error if the activity is already terminated if the request ID does not match.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can tackle the "idempotency keys for all side-effecting requests" requirement/project separately, but would definitely be nice to have.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you think of other requests that we would want to do this with? Probably pause, unpause, update options...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything side-effecting tbh (which includes things that have no user-facing side effects, like queries, because they technically do things), but can be opened as a separate issue

message RequestCancelActivityExecutionResponse {
}

message TerminateActivityExecutionRequest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to note that it is a bit confusing seeing PauseActivityExecution work for multiple forms of activity, but TerminateActivityExecution (and cancel and get result and describe and others) only work for one kind of activity. In the SDK we're going to have to be careful to structure it in a way that is less inconsistent than our API has chosen to structure it here. I do wish we could be clearer in the naming and/or have a pathway to really be generic in these concepts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentional. We don't want to add these APIs for workflow activities yet but they are structured in a way that it would be possible to add workflow_id later to operate on a workflow activity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yet

If there are plans to, that makes sense then at least

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's inconclusive, but I didn't think we'd need to support describe for workflow activities initially either. It turns out that if we have list, we need describe. Just don't want to get into a corner where it's harder to add it later.

Copy link
Member Author

@bergundy bergundy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback @cretz!

// If the request is denied, the server returns an `ExecutionAlreadyStarted` error.
//
// See `IdConflictPolicy` for handling ID duplication with a *running* execution.
enum IdReusePolicy {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already know that I want this for "standalone" Nexus operations, and generally these reuse and conflict policies are framework capabilities. I am already at the point of wanting to generalize this and not having to redefine for every new primitive.

Comment on lines +2643 to +2647
// Callbacks to be called by the server when this activity reaches a terminal status.
// Callback addresses must be whitelisted in the server's dynamic configuration.
repeated temporal.api.common.v1.Callback completion_callbacks = 14;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I put it here for completeness. It won't make it into the first release.

Comment on lines +2643 to +2647
// Callbacks to be called by the server when this activity reaches a terminal status.
// Callback addresses must be whitelisted in the server's dynamic configuration.
repeated temporal.api.common.v1.Callback completion_callbacks = 14;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove these before merging.

// activity task to be eagerly executed.
// The caller is expected to have a worker available to process the task.
PollActivityTaskQueueResponse eager_task = 3;
// Link to the workflow event.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I will review the comments. Thanks for catching it.

Comment on lines 2676 to 2677
// If true, the activity input is returned in the response.
bool include_input = 4;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just because there is no other way of getting updates when the activity is running, unlike workflow where you can wait for history events. But I 100% agree that we want this for workflows too, we're just focusing on standalone activities right now.

Comment on lines 2681 to 2682
// This token is returned as part of the `DescribeActivityExecutionResponse`.
bytes long_poll_token = 5;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this and decided not to add it for now.

message CountActivityExecutionsRequest {
string namespace = 1;
// Visibility query, see https://docs.temporal.io/list-filter for the syntax.
string query = 2;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not fully fleshed out yet. I believe this will be the rough set of attributes:

ActivityId
RunId
ActivityType
TaskQueue
StartTime
ExecutionTime
CloseTime
ExecutionStatus
ExecutionDuration
PauseInfo -- when we add pause support

// If set, turns this request into a long poll that is unblocked when the activity reaches a terminal status.
// The wait duration is capped by the request's context deadline or by the maximum enforced long poll interval
// allowed by the server.
bool wait = 4;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea. Let me take a stab at it. This just means that we won't have Describe at all but I'm fine with that.

message RequestCancelActivityExecutionResponse {
}

message TerminateActivityExecutionRequest {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied this from workflow terminate, I wonder if it's worth having consistency there or not. It's not much trouble to store the terminate request ID and return an error if the activity is already terminated if the request ID does not match.

message RequestCancelActivityExecutionResponse {
}

message TerminateActivityExecutionRequest {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentional. We don't want to add these APIs for workflow activities yet but they are structured in a way that it would be possible to add workflow_id later to operate on a workflow activity.

temporal.api.common.v1.Payloads result = 2;
// The failure if the activity completed unsuccessfully.
temporal.api.failure.v1.Failure failure = 3;
}
Copy link
Contributor

@dandavison dandavison Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth comparing the comments in PollWorkflowExecutionUpdateResponse. When we introduced that there was some discussion, leading to a decision to use a non-error response to signify that a long-polling client should retry.

Comment on lines +190 to +192
// Updated on terminal status.
int64 state_transition_count = 10;
// Updated once on scheduled and once on terminal status.
int64 state_size_bytes = 11;
Copy link
Member

@cretz cretz Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why inconsistently make these as fields instead of search attributes like workflow? Are they also in search attributes, or are we just forbidding the use of this information in queries for activities even though we allow it for workflows? By making them fields, we are ensuring that list and describe field behaviors diverge for the same field which is confusing to have to explain for the same field name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is available for workflows though...

string task_queue = 13;
int64 state_transition_count = 14;
int64 history_size_bytes = 15;

Search attributes returned here are custom ones provided by the user in the start request.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I was confused by the name list info here and not there (I probably won't be the last to be confused by this inconsistency). There are other mismatched fields though, but can bring those up separately.

@dandavison dandavison force-pushed the standalone-activity branch 3 times, most recently from 9a8c86d to 6a3e9b1 Compare October 24, 2025 19:17
@dandavison
Copy link
Contributor

@cretz the PR has changed substantially, would you like to review again?

temporal.api.deployment.v1.WorkerDeploymentOptions deployment_options = 6;
}

message PollActivityTaskQueueResponse {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want a standalone activity to have access to its own run ID? If so then PollActivityTaskQueueResponse should be extended to include it.

// Activity run ID. If empty the request targets the latest run.
string run_id = 3;
// Exclude the info field from the response.
bool exclude_info = 4;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this field exclude when others are include? If it's to make returning info the default for this operation, I don't think it's the right choice as I imagine the most common use case for polling execution would be waiting for activity completion and getting only the result.

Copy link
Member

@cretz cretz Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine the most common use case for polling execution would be waiting for activity completion and getting only the result.

We make long poll opt-in on long-poll-optional calls. So the default here actually represents what a "describe activity" would do, but they aren't making a describe activity, so they are just making this behave like that by default. The "poll for result" which they've also put inside of this same call is the opt-in alternative call use. It makes sense from a caller/HTTP API POV to have the non-poll "describe" behavior be the default behavior despite the RPC name.

Copy link
Contributor

@dandavison dandavison Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this field exclude when others are include? If it's to make returning info the default for this operation, I don't think it's the right choice as I imagine the most common use case for polling execution would be waiting for activity completion and getting only the result.

Good point @maciejdudko. We made it exclude_info because in an earlier version of this PR the method was named DescribeActivityExecution. However, I subsequently renamed it PollActivityExecution in line with our aim for consistency and clarity in the gRPC API. The new name reflects the fact that this is a low-level API, which SDKs will build functionality such as DescribeActivity and FetchResult(longPoll: bool) on top of. As such, I think you're right, that the fields should now all have include semantics and that it is appropriate for the low-level "poll" gRPC API to default to not including any of the bulky data. Clients (such as SDK authors) making use of the API will not have difficulty understanding that they have to set one of the include fields.

Copy link
Member

@cretz cretz Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clients (such as SDK authors) making use of the API will not have difficulty understanding that they have to set one of the include fields.

Disagree. I think curl https://my-temporal-host/namespaces/my-ns/activities/my-activity should return info like curl https://my-temporal-host/namespaces/my-ns/workflows/my-workflow, curl https://my-temporal-host/namespaces/my-ns/schedules/my-schedule, etc. I think we're regressing and making our APIs harder to use than we did in the past. I don't think we have to put the onus on users and clients to simplify complex APIs, we can develop non-complex APIs from the get-go.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cretz makes a good point with curl. I'm on the side of leaving it as exclude_info now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet