@ivoson ivoson commented Dec 2, 2025

What changes were proposed in this pull request?

Roll back shuffle map stages when a shuffle checksum mismatch is detected:

  • cancel and resubmit the stage if it is running;
  • clean up the shuffle status to ensure the stage will be resubmitted;
  • mark the rollback attemptId and ignore results from older attempts, which may have consumed inconsistent data.
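
The three steps above can be illustrated with a minimal, self-contained sketch. This is not the scheduler code from this PR: `StageState`, `submit`, `roll_back`, and `on_task_success` are hypothetical names chosen for illustration, and the bookkeeping is reduced to a single stage so the interaction between the rollback attempt id and late task results is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class StageState:
    """Hypothetical bookkeeping for one shuffle map stage (not Spark's API)."""
    stage_id: int
    latest_attempt: int = -1   # no attempt submitted yet
    running: bool = False
    # Attempts with an id <= this value are treated as rolled back.
    rollback_attempt: int = -1
    # partition -> attempt id that produced the registered map output
    map_outputs: dict = field(default_factory=dict)


def submit(stage: StageState) -> int:
    """Start a new attempt of the stage and return its attempt id."""
    stage.latest_attempt += 1
    stage.running = True
    return stage.latest_attempt


def roll_back(stage: StageState) -> None:
    """Apply the three rollback steps to a single stage."""
    was_running = stage.running
    # Mark every attempt submitted so far as rolled back.
    stage.rollback_attempt = stage.latest_attempt
    # Clean up the shuffle status so the stage counts as unfinished.
    stage.map_outputs.clear()
    # Cancel the stage and resubmit it if it was running.
    stage.running = False
    if was_running:
        submit(stage)


def on_task_success(stage: StageState, attempt: int, partition: int) -> bool:
    """Register a map output unless it comes from a rolled-back attempt."""
    if attempt <= stage.rollback_attempt:
        return False  # result may be based on inconsistent data; ignore it
    stage.map_outputs[partition] = attempt
    return True
```

In this sketch, a task from attempt 0 that finishes after `roll_back` is rejected by `on_task_success`, while the same partition computed by the resubmitted attempt 1 is accepted.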

Why are the changes needed?

To ensure that all succeeding stages are resubmitted and fully retried when a shuffle checksum mismatch is detected.
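
As a rough illustration of what "all succeeding stages" means, the sketch below collects every stage reachable from the stage whose shuffle output changed. It assumes the stage graph is available as a plain mapping from a stage id to its child stage ids; the function name `succeeding_stages` and this representation are assumptions made for the example, not the data structures used in the PR.

```python
from collections import deque


def succeeding_stages(children: dict, start: int) -> list:
    """Return every stage reachable from `start` via child edges, in BFS order.

    `children` maps a stage id to the ids of stages that read its output
    (a hypothetical representation of the stage graph).
    """
    seen = set()
    order = []
    queue = deque(children.get(start, []))
    while queue:
        stage = queue.popleft()
        if stage in seen:
            continue  # reached through more than one parent
        seen.add(stage)
        order.append(stage)
        queue.extend(children.get(stage, []))
    return order
```

Each stage in the returned list is a candidate for rollback, since it may have consumed output from the stage whose checksum no longer matches.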

Does this PR introduce any user-facing change?

No

How was this patch tested?

Unit tests added.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the CORE label Dec 2, 2025
@ivoson ivoson marked this pull request as ready for review December 2, 2025 01:53