@hstove hstove commented Jun 19, 2025

(Opening as a draft - the test could be improved, and I'm not very confident in the "rules" implemented here)

This PR implements a 'failsafe' for transaction replay: if there have been 2 burn blocks since the new fork tip, clear the replay set. While this is very imperfect, it prioritizes liveness of the chain over guarantees that replay is executed as expected. Most of the time, this shouldn't make a difference anyway. A new config field, `reset_replay_set_after_fork_blocks`, allows changing this value (which defaults to 2).

I've also refactored many of the transaction replay tests to do shallower forks, which aligns much more with reality. This actually caught a bug in the fork detection logic, which we were getting away with due to the tests using deeper forks. We now use a descendency check to determine whether a new burn block is a fork, where we previously did a simple check against the height of a new burn block.
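To illustrate why a height-only comparison can miss a shallow fork while a descendency check catches it, here is a minimal Rust sketch. The `BurnBlock` type, the integer hashes, and every function name below are hypothetical stand-ins for illustration, not the stacks-core types or the code in this PR, and the exact form of the previous height check is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified burn block header (not the stacks-core type).
#[derive(Clone, Debug)]
struct BurnBlock {
    hash: u32,
    parent: Option<u32>,
    height: u64,
}

/// Walk parent links from `candidate` back toward genesis and report
/// whether `ancestor` lies on that path (a block descends from itself).
fn descends_from(blocks: &HashMap<u32, BurnBlock>, candidate: u32, ancestor: u32) -> bool {
    let mut cursor = Some(candidate);
    while let Some(hash) = cursor {
        if hash == ancestor {
            return true;
        }
        cursor = blocks.get(&hash).and_then(|b| b.parent);
    }
    false
}

/// Assumed height-only heuristic: treat the new block as a fork only if
/// it is not higher than the current tip.
fn is_fork_by_height(tip: &BurnBlock, new_block: &BurnBlock) -> bool {
    new_block.height <= tip.height
}

/// Descendency-based rule: the new block is a fork if it does not build
/// on the current tip.
fn is_fork_by_descendency(blocks: &HashMap<u32, BurnBlock>, tip: u32, new_block: u32) -> bool {
    !descends_from(blocks, new_block, tip)
}

/// Example chain: 1 <- 2 <- 3 (current tip), plus a sibling branch
/// 2 <- 4 <- 5 that overtakes the tip in height.
fn sample_chain() -> HashMap<u32, BurnBlock> {
    let raw: [(u32, Option<u32>, u64); 5] = [
        (1, None, 1),
        (2, Some(1), 2),
        (3, Some(2), 3),
        (4, Some(2), 3),
        (5, Some(4), 4),
    ];
    raw.iter()
        .map(|&(hash, parent, height)| (hash, BurnBlock { hash, parent, height }))
        .collect()
}
```

In this example chain, block 5 is higher than tip 3, so the height-only heuristic does not flag it, while the descendency check does because 5 builds on 4 rather than on 3.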

@hstove hstove requested review from kantai, jferrant and fdefelici June 19, 2025 22:29
@aldur aldur moved this to Status: 💻 In Progress in Stacks Core Eng Jun 20, 2025
@aldur aldur added this to the 3.1.0.0.13 milestone Jun 20, 2025
hstove commented Jun 20, 2025

After some discussion, we're going to update this to use the rule of "once there are 2 burn blocks past the previous tip, clear the replay set if we're still in it". We'll use a config value for the "2 burn blocks" value.
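A rough Rust sketch of that rule, under stated assumptions: `ReplayState`, its fields, and its methods are hypothetical names invented for this illustration, and reading "2 burn blocks past the previous tip" as `new_height >= previous_tip_height + 2` is an interpretation, not something confirmed by the PR. Only the default of 2 and the config field name `reset_replay_set_after_fork_blocks` come from the description above.

```rust
/// Default named in the PR description for `reset_replay_set_after_fork_blocks`.
const DEFAULT_RESET_REPLAY_SET_AFTER_FORK_BLOCKS: u64 = 2;

/// Hypothetical, simplified signer replay state (not the stacks-signer type).
#[derive(Debug, Default)]
struct ReplayState {
    /// Transaction ids still waiting to be replayed; empty means "not in replay".
    replay_set: Vec<String>,
    /// Burn height of the tip that was active before the fork.
    prev_tip_height: Option<u64>,
}

impl ReplayState {
    /// Enter replay after a fork, remembering the pre-fork tip height.
    fn begin_replay(&mut self, prev_tip_height: u64, txs: Vec<String>) {
        self.prev_tip_height = Some(prev_tip_height);
        self.replay_set = txs;
    }

    fn in_replay(&self) -> bool {
        !self.replay_set.is_empty()
    }

    /// Apply the failsafe on each new burn block. Returns true if the
    /// replay set was cleared because the chain has moved `reset_after`
    /// burn blocks past the previous tip while replay was still pending.
    fn on_burn_block(&mut self, new_height: u64, reset_after: u64) -> bool {
        let prev_tip = match self.prev_tip_height {
            Some(height) if self.in_replay() => height,
            _ => return false,
        };
        if new_height >= prev_tip.saturating_add(reset_after) {
            self.replay_set.clear();
            self.prev_tip_height = None;
            return true;
        }
        false
    }
}
```

With a previous tip at height 100 and the default threshold, a burn block at height 101 leaves the replay set intact and a burn block at height 102 clears it, which trades replay guarantees for liveness as described in the PR.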

codecov bot commented Jun 22, 2025

Codecov Report

❌ Patch coverage is 9.23077% with 649 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.57%. Comparing base (0752841) to head (5d0c27e).
⚠️ Report is 37 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| stacks-node/src/tests/signer/v0.rs | 0.17% | 576 Missing ⚠️ |
| stacks-node/src/tests/signer/mod.rs | 5.00% | 38 Missing ⚠️ |
| stacks-signer/src/v0/signer_state.rs | 65.27% | 25 Missing ⚠️ |
| stackslib/src/net/api/postblock_proposal.rs | 0.00% | 10 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6212      +/-   ##
===========================================
+ Coverage    75.33%   80.57%   +5.24%     
===========================================
  Files          555      555              
  Lines       350915   350983      +68     
===========================================
+ Hits        264358   282821   +18463     
+ Misses       86557    68162   -18395     

| Files with missing lines | Coverage Δ |
|---|---|
| stacks-node/src/tests/nakamoto_integrations.rs | 85.00% <100.00%> (+11.15%) ⬆️ |
| stacks-signer/src/chainstate/mod.rs | 90.74% <100.00%> (+0.03%) ⬆️ |
| stacks-signer/src/chainstate/tests/v1.rs | 100.00% <100.00%> (+10.85%) ⬆️ |
| stacks-signer/src/chainstate/tests/v2.rs | 100.00% <100.00%> (+3.67%) ⬆️ |
| stacks-signer/src/client/mod.rs | 99.24% <100.00%> (+<0.01%) ⬆️ |
| stacks-signer/src/config.rs | 91.38% <100.00%> (+0.13%) ⬆️ |
| stacks-signer/src/runloop.rs | 86.16% <100.00%> (+9.48%) ⬆️ |
| stacks-signer/src/tests/signer_state.rs | 98.88% <100.00%> (+46.22%) ⬆️ |
| stacks-signer/src/v0/signer.rs | 72.10% <100.00%> (+1.90%) ⬆️ |
| stackslib/src/net/api/postblock_proposal.rs | 65.35% <0.00%> (+2.90%) ⬆️ |

... and 3 more

... and 237 files with indirect coverage changes



@fdefelici fdefelici left a comment

Overall the implementation looks fine!

As expected, all tx_replay_* tests are currently failing due to the two-tenure limit, for the reasons reported in the PR description. Once the current approach is finalized, these tests will likely need to be revisited and properly adjusted to align with the new logic.

I've included a few remarks throughout the review suggesting possible improvements for readability and maintainability.

@obycode obycode modified the milestones: 3.1.0.0.13, 3.1.0.0.14 Jul 1, 2025
@hstove hstove requested review from fdefelici, rdeioris and kantai July 29, 2025 18:22
hstove commented Jul 29, 2025

I've re-requested reviews here - after the latest round of test flakiness fixes from develop, CI is looking good here.

fdefelici previously approved these changes Jul 30, 2025
@aldur aldur moved this from Status: 💻 In Progress to Status: In Review in Stacks Core Eng Aug 4, 2025
jferrant commented Aug 5, 2025

LGTM aside from all the failing tests. I saw similar tests failing in my two-phase commit PR. Maybe take a look there and see if my fixes also resolve the issues you are seeing (they are subtle fixes for timing issues that are probably also present on develop but don't always manifest).

jferrant previously approved these changes Aug 5, 2025
The return type was `Result<bool>`, but it only needed to be `bool`
@hstove hstove dismissed stale reviews from jferrant and fdefelici via f52ab1f August 6, 2025 14:35
@hstove hstove requested review from jferrant and fdefelici August 6, 2025 14:42
@fdefelici fdefelici left a comment

LGTM!

Just a small remark about an unused method.

PS: I don't know if it is by chance, but all tests are green... I'll take a screenshot :)

@hstove hstove requested a review from fdefelici August 18, 2025 20:06
@fdefelici fdefelici left a comment

LGTM!

Failing tests don't seem related to this change.

@github-project-automation github-project-automation bot moved this from Status: In Review to Status: 💻 In Progress in Stacks Core Eng Aug 20, 2025
@hstove hstove added this pull request to the merge queue Aug 21, 2025
Merged via the queue into stacks-network:develop with commit a2ff29c Aug 21, 2025
552 of 568 checks passed
@hstove hstove deleted the feat/tx-replay-failsafe branch August 21, 2025 14:40
@github-project-automation github-project-automation bot moved this from Status: 💻 In Progress to Status: ✅ Done in Stacks Core Eng Aug 21, 2025
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 29, 2025
Successfully merging this pull request may close these issues.

TX Replay: Failsafe logic in signer state
7 participants