feat: use longest common prefix for determining tx replay set #6353

Open
wants to merge 2 commits into
base: develop

@hstove hstove commented Aug 5, 2025

This PR adds the ability for the global state machine evaluator to use a longest common prefix algorithm to determine the global transaction replay set after a fork.

For example, if one replay set is [A,B,C] with 50% weight and another is [A,B] with 30% weight, then we will use [A,B] as the global replay set: it is a prefix of both sets, so it carries their combined 80% weight.

To ensure we end up with a deterministic global replay set, candidates are compared by signer weight, then by length, and then by txid.
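For readers skimming the thread, here is a minimal sketch of the idea, not the code in this PR. The names `TxId` and `longest_supported_prefix` are hypothetical, transactions are stood in for by plain integers, and the sketch assumes that "at least `threshold_pct` of total weight" counts as enough support (the PR may use a strict comparison). It simply picks the longest qualifying prefix and breaks any remaining tie by comparing ids, which is simpler than the weight, length, txid ordering described above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a transaction id.
type TxId = u32;

/// Returns the longest prefix supported by at least `threshold_pct` percent
/// of the total weight. Each entry is (signer weight, that signer's replay set).
/// Returns an empty list when no prefix has enough support.
fn longest_supported_prefix(views: &[(u32, Vec<TxId>)], threshold_pct: u64) -> Vec<TxId> {
    let total: u64 = views.iter().map(|(w, _)| u64::from(*w)).sum();

    // Every non-empty prefix of a signer's replay set receives that signer's weight.
    let mut support: HashMap<&[TxId], u64> = HashMap::new();
    for (weight, txs) in views {
        for len in 1..=txs.len() {
            *support.entry(&txs[..len]).or_insert(0) += u64::from(*weight);
        }
    }

    support
        .into_iter()
        // Assumption: ">=" is the threshold comparison.
        .filter(|(_, w)| *w * 100 >= total * threshold_pct)
        // Longest prefix wins; ties fall back to comparing the ids themselves
        // so the result does not depend on hash map iteration order.
        .max_by(|(a, _), (b, _)| a.len().cmp(&b.len()).then_with(|| a.cmp(b)))
        .map(|(prefix, _)| prefix.to_vec())
        .unwrap_or_default()
}
```

With views `[A,B,C]` at weight 50 and `[A,B]` at weight 30 (plus an unrelated set at weight 20), the prefix `[A,B]` accumulates weight 80 and is returned for a 70% threshold, while `[A,B,C]` only reaches 50.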

Comment on lines +576 to +579
assert_eq!(transactions.len(), 2);
assert_eq!(transactions[0], state_test.tx_a); // Order matters!
assert_eq!(transactions[1], state_test.tx_b);
assert!(!transactions.contains(&state_test.tx_c));

nit: the last assertion, assert!(!transactions.contains(&state_test.tx_c)), appears to be redundant. Is the intention to make the concept explicit?

}

#[test]
fn test_replay_set_common_prefix_coalescing() {

This test seems to duplicate test_replay_set_common_prefix_coalescing_demo(). Could we remove one of them?

On test naming: test_replay_set_common_prefix_coalescing would be my preference.

Comment on lines +605 to +608
assert_eq!(transactions.len(), 2);
assert_eq!(transactions[0], state_test.tx_a); // Order matters!
assert_eq!(transactions[1], state_test.tx_b);
assert!(!transactions.contains(&state_test.tx_c));

nit: the last assertion, assert!(!transactions.contains(&state_test.tx_c)), appears to be redundant. Is the intention to make the concept explicit?

Comment on lines +698 to +725
#[test]
/// Case: [A,B,C] vs [A,B,D] should find common prefix [A,B]
fn test_replay_set_partial_prefix_match() {
    let mut state_test = SignerStateTest::new(4);

    // Signer 0: [A,B,C] (25% weight - not enough alone)
    state_test.update_signers(
        &[0],
        vec![
            state_test.tx_a.clone(),
            state_test.tx_b.clone(),
            state_test.tx_c.clone(),
        ],
    );

    // Signers 1, 2, 3: [A,B] only (75% weight - above threshold)
    state_test.update_signers(
        &[1, 2, 3],
        vec![state_test.tx_a.clone(), state_test.tx_b.clone()],
    );

    let transactions = state_test.get_global_replay_set();

    // Should find [A,B] as the longest common prefix with majority support
    assert_eq!(transactions.len(), 2);
    assert_eq!(transactions[0], state_test.tx_a);
    assert_eq!(transactions[1], state_test.tx_b);
}

It seems the test description does not fully align with the actual test logic:

  • The description says: [A, B, C] vs [A, B, D]
  • But the actual test uses:
    • Signer 0 (25% weight) → [A, B, C]
    • Signers 1, 2, 3 (75% weight) → [A, B] (not [A, B, D])

Thus, [A, B] is selected not because it's the common prefix, but because it's the replay set supported by a quorum (>70%).

Is my understanding correct?
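For comparison, the case named in the test description ([A,B,C] vs [A,B,D]) is one where the two sets diverge after a shared prefix. A minimal, self-contained sketch of that pairwise case follows. The helper `common_prefix` is hypothetical and is not taken from this PR; it ignores signer weights entirely and only shows the prefix computation.

```rust
/// Longest common prefix of two ordered transaction lists.
/// Comparison stops at the first position where the lists differ.
fn common_prefix<T: PartialEq + Clone>(a: &[T], b: &[T]) -> Vec<T> {
    a.iter()
        .zip(b.iter())
        .take_while(|(x, y)| x == y)
        .map(|(x, _)| x.clone())
        .collect()
}
```

Here `common_prefix(&["A", "B", "C"], &["A", "B", "D"])` yields `["A", "B"]`, whereas `[A,B]` and `[B,A]` share no prefix because order matters.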

@@ -7,6 +7,10 @@ and this project adheres to the versioning scheme outlined in the [README.md](RE

## Unreleased

### Added

- When determining a global transaction replay set, the state evaluator now uses a longest-common-prefix algorithm to find a replay set in the case where a single replay set has less than 70% of sigher weight.

Nit. Typo "sigher" -> "signer"

/// Find the longest common prefix of replay sets that has majority support.
/// This implements the longest common prefix (LCP) strategy where if one signer's replay set
/// is [A,B,C] and another is [A,B], we should use [A,B] as the replay set.
/// Order matters for transaction replay - [A,B] and [B,A] have no common prefix.

Hahaha this looks like a leetcode question XD

@jferrant jferrant left a comment

LGTM: only some minor nits.

Successfully merging this pull request may close these issues.

Tx Replay: when agreeing on transaction replay set, use the biggest subset of transactions
3 participants