[hyperactor] simplify supervision propagation; unhandled events always cause failure #784


Closed
wants to merge 5 commits into from

@mariusae mariusae commented Aug 6, 2025

Stack from ghstack (oldest at bottom):

Currently (local) supervision can propagate events without also killing an intermediate actor.

This is 1) wrong; and 2) complicated.

Instead, we treat an unhandled supervision event as an actor failure, and then reduce the propagation paths to one: that of an actor failing.

In order to retain accurate attribution, we add a "caused_by" field to the actor supervision events.
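
The single propagation path and the "caused_by" attribution described above can be sketched roughly as follows. This is a minimal illustration, not hyperactor's actual implementation: the `SupervisionEvent` layout, the `root_cause` helper, the `propagate` function, and the `(actor_id, handles_event)` encoding of ancestors are all hypothetical names chosen for this example.

```rust
/// Hypothetical, simplified supervision event (not hyperactor's real type).
#[derive(Debug, Clone, PartialEq)]
pub struct SupervisionEvent {
    /// The actor that failed.
    pub actor_id: String,
    /// Human-readable description of the failure.
    pub reason: String,
    /// The event that made this actor fail, if the failure was not original.
    pub caused_by: Option<Box<SupervisionEvent>>,
}

impl SupervisionEvent {
    /// Follow the `caused_by` chain back to the original failure.
    pub fn root_cause(&self) -> &SupervisionEvent {
        let mut event = self;
        while let Some(cause) = event.caused_by.as_deref() {
            event = cause;
        }
        event
    }
}

/// Deliver `event` to `ancestors`, ordered from nearest parent to root.
/// Each entry is `(actor_id, handles_event)`; both fields are assumptions
/// made for this sketch.
///
/// An ancestor that does not handle the event fails itself, and its own
/// failure (wrapping the incoming event in `caused_by`) is what continues
/// upward. Returns the actors that failed along the way and the event that
/// escaped the root, if any.
pub fn propagate(
    mut event: SupervisionEvent,
    ancestors: &[(&str, bool)],
) -> (Vec<String>, Option<SupervisionEvent>) {
    let mut failed = Vec::new();
    for &(actor_id, handles_event) in ancestors {
        if handles_event {
            // Handled: propagation stops and this ancestor stays alive.
            return (failed, None);
        }
        // Unhandled: the ancestor itself fails, keeping attribution intact.
        failed.push(actor_id.to_string());
        event = SupervisionEvent {
            actor_id: actor_id.to_string(),
            reason: "unhandled supervision event".to_string(),
            caused_by: Some(Box::new(event)),
        };
    }
    (failed, Some(event))
}
```

In this sketch there is no way for an event to pass through an ancestor that stays alive: either the ancestor handles the event or it becomes the next failure in the chain, and `root_cause` still recovers the actor that failed first.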

Differential Revision: D79702385

NOTE FOR REVIEWERS: This PR has internal Meta-specific changes or comments, please review them on Phabricator!

mariusae added a commit that referenced this pull request Aug 6, 2025
ghstack-source-id: 301209925
Pull Request resolved: #784
@meta-cla (bot) added the "CLA Signed" label on Aug 6, 2025
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D79702385

@facebook-github-bot

This pull request has been merged in b5ecd8f.

Labels: CLA Signed, fb-exported, Merged
2 participants