
Conversation

@iurii-ssv
Contributor

@iurii-ssv iurii-ssv commented Dec 24, 2025

Resolves #2632

This PR changes the validation-rules treatment of duplicate messages. A duplicate message is one that should not be applied more than once from the QBFT-logic perspective (and this is "additionally" enforced via message-validation rules). The SSV node wants to detect duplicate messages in order to:

  • reduce the amount of unnecessary traffic in the p2p network as much as possible
  • actively punish peers (through the libp2p reputation system) who don't follow protocol rules

The behavior this PR aims to change:

  • previously, a duplicate message would not be detected correctly if it came from a different peer (a peer with a different peerID) ... and so the SSV node would accept such a message and broadcast it to other peers, getting punished for it
  • with this PR, a duplicate message is always detected, and then, depending on whether it comes from the peer who already sent that/similar message to us or from another/different peer, it is rejected or ignored respectively (punishing only those peers who "knowingly" broadcast duplicate messages; the "knowingly" part comes from the libp2p property that every node in p2p pubsub is fully responsible for validating the messages it chooses to re-broadcast ... so it must validate that a message isn't a duplicate before deciding to re-broadcast it)

In order to be able to tell whether a duplicate p2p message is coming from a new/different peer (or from the same peer we've received that/similar message from in the past), we would need to store & update a message->peer mapping ... that is, however, somewhat expensive (and doesn't fit well with the current minimalistic/optimized implementation we have for message validation). Instead, in this PR, we keep track of "seen validation-rule violations":

  • 1 violation is always allowed (if some peer has already sent some message and then sends the same/similar message 1 more time, they will not be punished by message rejection that 1 time; the receiving SSV node will simply ignore such a message the 1st time)
  • 2nd+ violations are spotted via a check against "seen validation-rule violations" (the new structure we now track per peer, on an as-needed basis): if the violating message comes from a peer who has already been observed to commit this/similar kind of violation in the past (for this Operator+DutyType+slot), we reject the message; otherwise we ignore it, since it is the 1st violation of this type for this peer (a minimal sketch of this logic follows below)
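
A minimal sketch of this violation-tracking idea (illustrative only: the real code lives in message/validation and exposes this as the IgnoreOrReject method; ViolationKind and shouldReject here are hypothetical stand-ins):

```go
package validation

import "github.com/libp2p/go-libp2p/core/peer"

// ViolationKind identifies a class of validation-rule violation,
// e.g. "duplicated message" (a stand-in for the PR's error kinds).
type ViolationKind string

// SignerState sketch: only the violation-tracking part is shown.
type SignerState struct {
	// SeenViolations records, per peer, which violation kinds we have
	// already observed; lazily initialized to reduce memory consumption.
	SeenViolations map[peer.ID]map[ViolationKind]struct{}
}

// shouldReject reports whether this violation by this peer is a repeat
// (reject) or a first occurrence (ignore), recording it either way.
func (s *SignerState) shouldReject(p peer.ID, kind ViolationKind) bool {
	if s.SeenViolations == nil {
		s.SeenViolations = make(map[peer.ID]map[ViolationKind]struct{})
	}
	seen, ok := s.SeenViolations[p]
	if !ok {
		seen = make(map[ViolationKind]struct{})
		s.SeenViolations[p] = seen
	}
	if _, repeated := seen[kind]; repeated {
		return true // 2nd+ violation of this kind from this peer: reject
	}
	seen[kind] = struct{}{} // 1st violation: record it, but only ignore
	return false
}
```

Either way the violating message is dropped; the reject path additionally lets libp2p downscore the sending peer.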

Additional considerations:

  • previously, messageValidator.state and messageValidator.validationLockCache maps were keyed by spectypes.MessageID+peerID ... that doesn't look correct to me because the "state" doesn't/shouldn't know or depend on which peer a message is received from
  • in this PR, messageValidator.state and messageValidator.validationLockCache maps are keyed by spectypes.MessageID (which is basically = Operator+DutyType) only, meaning messages with the same spectypes.MessageID targeting different slots will acquire the same validation lock and will be validated sequentially (see the sketch after this list) ... I think we could potentially optimize that further by making validation for messages targeting different slots run concurrently, but it would be hard to implement correctly on top of the existing validation code (and it is out of scope for this PR anyway)
  • SignerState.SeenMsgTypes keeps track of both QBFT-instance-related messages and pre-consensus phase messages, and upon round change the SignerState.Reset call fully resets not only QBFT-instance-related state but pre-consensus phase messages as well. This means we are "allowing the receipt of pre-consensus phase messages again once a round change happened", which doesn't make much sense. But it's not really abusable either, so there is no need to address it - just wanted to document that behavior
  • we also have a bunch of errors this PR doesn't touch that we classify as ignore (and never as reject); as a potential future improvement, we might want to penalize peers sending lots of duplicate messages that end up as those errors as well
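
For illustration, the per-MessageID locking could look roughly like this (a sketch assuming a plain map; the actual validationLockCache is a cache and also handles eviction):

```go
package validation

import (
	"sync"

	spectypes "github.com/ssvlabs/ssv-spec/types"
)

// validationLocks is an illustrative stand-in for validationLockCache:
// one mutex per spectypes.MessageID, with no peerID in the key, so that
// duplicates are detected no matter which peer sends the message.
type validationLocks struct {
	mu    sync.Mutex
	locks map[spectypes.MessageID]*sync.Mutex
}

// lockFor returns the mutex guarding validation for the given message ID,
// creating it on first use; messages sharing a MessageID (even when they
// target different slots) are therefore validated sequentially.
func (v *validationLocks) lockFor(id spectypes.MessageID) *sync.Mutex {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.locks == nil {
		v.locks = make(map[spectypes.MessageID]*sync.Mutex)
	}
	l, ok := v.locks[id]
	if !ok {
		l = &sync.Mutex{}
		v.locks[id] = l
	}
	return l
}
```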

@iurii-ssv iurii-ssv requested review from a team as code owners December 24, 2025 17:24
@iurii-ssv iurii-ssv marked this pull request as draft December 24, 2025 17:24
@greptile-apps
Contributor

greptile-apps bot commented Dec 24, 2025

Greptile Summary

Reworked duplicate message detection to be peer-independent while selectively punishing repeat offenders. Changed validation state keying from peerID+messageID to just messageID, ensuring duplicates are detected regardless of which peer sends them. Introduced IgnoreOrReject mechanism that ignores first-time violations (since detecting the original sender is expensive) but rejects subsequent violations from the same peer (easily detectable via SeenViolations tracking).

Key changes:

  • Validation lock and state now keyed by spectypes.MessageID only (not peerID+messageID)
  • New SignerState.SeenViolations map tracks which peers have sent which violation types
  • IgnoreOrReject method returns ignore error for first violation, reject error for repeats
  • Moved ErrDuplicatedMessage, ErrDifferentProposalData, ErrDecidedWithSameSigners, and ErrTooManyPartialSigMessage from reject to ignore category
  • Applied ignore-then-reject pattern to duplicate consensus messages, different proposal data, duplicate decided messages, and excessive partial signatures
  • Method renames for clarity: Signer → OperatorState, GetSignerState → GetSignerStateForSlot

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The changes are well-structured with clear logic for handling duplicate messages. The ignore-then-reject pattern correctly balances protection against malicious peers while avoiding false positives from network issues. Method renames improve code clarity without changing behavior. The validation state keying change is the core fix that enables peer-independent duplicate detection.
  • No files require special attention

Important Files Changed

| Filename | Overview |
|---|---|
| message/validation/signer_state.go | Added SeenViolations tracking and IgnoreOrReject method to distinguish first-time violations (ignored) from repeated violations by the same peer (rejected) |
| message/validation/errors.go | Moved duplicate message errors from reject to ignore category, renamed error constants for clarity |
| message/validation/validation.go | Changed validation lock key from peerID+messageID to just messageID, ensuring duplicate detection is peer-independent |
| message/validation/consensus_validation.go | Updated duplicate message validation to use IgnoreOrReject for different proposal data, decided messages with same signers, and consensus message limits |
| message/validation/partial_validation.go | Applied IgnoreOrReject pattern to partial signature message validation, renamed error constants for consistency |

Sequence Diagram

```mermaid
sequenceDiagram
    participant Peer1 as Peer 1 (Node1A)
    participant Node as SSV Node
    participant Peer2 as Peer 2 (Node1B)
    participant State as SignerState
    
    Note over Node,State: First message arrives
    Peer1->>Node: Send message (msgID=Op+Duty+Slot+Round)
    Node->>State: Check if duplicate via SeenMsgTypes
    State-->>Node: Not seen yet
    Node->>State: Record message type
    Node->>State: Track violation: none
    Node->>Peer1: ACCEPT & broadcast
    
    Note over Node,State: Duplicate from SAME peer
    Peer1->>Node: Send same message again
    Node->>State: Check if duplicate via SeenMsgTypes
    State-->>Node: Already seen (limit reached)
    Node->>State: Check SeenViolations[Peer1][ErrDuplicatedMessage]
    State-->>Node: Not seen from this peer before
    Node->>State: Record SeenViolations[Peer1][ErrDuplicatedMessage]
    Node->>Peer1: IGNORE (first violation)
    
    Note over Node,State: Second duplicate from SAME peer
    Peer1->>Node: Send same message third time
    Node->>State: Check if duplicate via SeenMsgTypes
    State-->>Node: Already seen (limit reached)
    Node->>State: Check SeenViolations[Peer1][ErrDuplicatedMessage]
    State-->>Node: Already seen from this peer!
    Node->>Peer1: REJECT (repeated violation)
    
    Note over Node,State: Duplicate from DIFFERENT peer
    Peer2->>Node: Send same message (different peerID)
    Node->>State: Check if duplicate via SeenMsgTypes
    State-->>Node: Already seen (limit reached)
    Node->>State: Check SeenViolations[Peer2][ErrDuplicatedMessage]
    State-->>Node: Not seen from this peer before
    Node->>State: Record SeenViolations[Peer2][ErrDuplicatedMessage]
    Node->>Peer2: IGNORE (first violation for this peer)
```

Contributor

@greptile-apps greptile-apps bot left a comment


13 files reviewed, 1 comment


@codecov

codecov bot commented Dec 27, 2025

Codecov Report

❌ Patch coverage is 82.75862% with 25 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.1%. Comparing base (b6a5fe0) to head (e3f2cfd).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| message/validation/partial_validation.go | 41.1% | 18 Missing and 2 partials ⚠️ |
| network/topics/msg_id.go | 54.5% | 3 Missing and 2 partials ⚠️ |


@iurii-ssv iurii-ssv marked this pull request as ready for review December 29, 2025 14:44


@vyzo vyzo left a comment


first pass, approach is pretty reasonable, I will do another pass.

Left a comment.


"github.com/attestantio/go-eth2-client/spec/phase0"
"github.com/libp2p/go-libp2p/core/peer"



keep this line please, it is good practice to separate our own libraries from external deps. makes audit a tad easier.
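
For context, the separation being asked for is just a blank line between import groups - stdlib, then external deps, then our own packages - e.g. (the ssv path below is illustrative):

```go
import (
	"crypto/sha256"

	"github.com/attestantio/go-eth2-client/spec/phase0"
	"github.com/libp2p/go-libp2p/core/peer"

	"github.com/ssvlabs/ssv/message/validation"
)
```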

Contributor Author

@iurii-ssv iurii-ssv Dec 31, 2025


This is actually still consistent with the current version of our make format command:

  • if you run it after removing all the newlines, it won't add them back,
  • but once those newlines are already there, it doesn't remove them

so we can end up with 2 valid but different-looking import sections (both of which are valid from the make format & make lint perspective)

Note also that currently we classify ssv-spec as any other 3rd-party repo (we don't bundle ssv-spec and ssv packages together under one import group).

I've pushed a commit to revert these imports-affecting changes for now, but in general I think we shouldn't worry much about it (or maybe use something like gci to enforce certain rules) - e3f2cfd

```go
signerCount := len(signedSSVMessage.OperatorIDs)
if signerCount > 1 {
	if signerState.SeenSigners == nil {
		signerState.SeenSigners = make(map[SignersBitMask]struct{}) // lazy init on demand to reduce mem consumption
```


do we need to garbage collect cold entries? Maybe also limit size?

I think it might be a good idea, I am concerned it might open a DoS attack vector by making us use potentially unlimited memory in this map.

Contributor Author


For SeenSigners and SeenViolations we basically drop the reference(s) to those in SignerState.Reset, letting the Go GC collect the maps we used in the past.

So it is something to consider, but from what I see in the code it looks like it's working correctly.
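
Roughly, the mechanism is just this (a sketch; the real Reset resets more state, and the field types follow the snippets above):

```go
// Reset drops the only references to the maps, making them unreachable
// so the Go GC can reclaim them; per-state memory is thus bounded
// across rounds/slots rather than growing without limit.
func (s *SignerState) Reset() {
	s.SeenSigners = nil    // map[SignersBitMask]struct{}, see snippet above
	s.SeenViolations = nil // per-peer violation tracking added in this PR
}
```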



ok, this limits the attack vector considerably. I think it should be ok too, I just get uneasy when I see maps reachable from the network.

@y0sher y0sher requested a review from MatheusFranco99 January 5, 2026 12:30
@MatheusFranco99
Contributor

> 1 violation is always allowed (if some peer has already sent some message and then sends the same/similar message 1 more time, they will not be punished by message rejection that 1 time; the receiving SSV node will simply ignore such a message the 1st time)

Hmm, I don't see a clear reason for allowing this. I think that's exactly what we want to avoid: the same peer sending the same logical message twice ("duplicated"). I think it could be rejected already on its first duplicate.


> • previously, messageValidator.state and messageValidator.validationLockCache maps were keyed by spectypes.MessageID+peerID ... that doesn't look correct to me because the "state" doesn't/shouldn't know or depend on which peer a message is received from
> • in this PR, messageValidator.state and messageValidator.validationLockCache maps are keyed by spectypes.MessageID (which is basically = Operator+DutyType) only, meaning messages with the same spectypes.MessageID targeting different slots will acquire the same validation lock and will be validated sequentially ... I think we could potentially optimize that further by making validation for messages targeting different slots run concurrently, but it would be hard to implement correctly on top of the existing validation code (and it is out of scope for this PR anyway)

Btw, this is a critical change. Per-peer state (btw, maybe it's easier to reason about it as a "network view") was introduced so that we could penalize peers more accurately, as each peer is only responsible for its own state.
Still, at least, I think having a common state could only hurt the logical rules (semantic rules don't matter here). For example, these come to mind:

(from knowledge-base)

QBFT Logic

| Verification | Error | Classification | Description |
|---|---|---|---|
| Double Proposal | ErrDuplicatedMessage | Reject | Signer already sent a proposal for round. |
| = for Prepare, Commit, and RC | - | - | - |
| Already advanced round | ErrRoundAlreadyAdvanced | Ignore | Signer is already in a future round. |

PSig Logic

| Verification | Error | Classification | Description |
|---|---|---|---|
| Invalid signature type count | ErrInvalidPartialSignatureTypeCount | Reject | Only the following are allowed: 1 PostConsensusPartialSig for Committee duty; 1 RandaoPartialSig and 1 PostConsensusPartialSig for Proposer; 1 SelectionProofPartialSig and 1 PostConsensusPartialSig for Aggregator; 1 SelectionProofPartialSig and 1 PostConsensusPartialSig for Sync committee contribution; 1 ValidatorRegistrationPartialSig for Validator Registration; 1 VoluntaryExitPartialSig for Voluntary Exit. |

Duty Logic

| Verification | Error | Classification | Description |
|---|---|---|---|
| Already advanced slot | ErrSlotAlreadyAdvanced | Ignore (Non-committee roles) | Signer already advanced to a later slot. |
| Too many duties per epoch | ErrTooManyDutiesPerEpoch | Ignore | If the role is aggregator, voluntary exit, or validator registration, 2 duties per epoch are allowed. Else if committee, 2*V (if no validator is doing sync committee). Else accept. |

For the duplicated cases, we are already solving it with this change. For the *AlreadyAdvanced and ErrTooManyDutiesPerEpoch cases, I don't think an attacker can do much (e.g. trying to populate slots for a small committee so that it wouldn't be able to send more msgs), as the msgs also need to be correctly signed, so the attacker would need to be in the committee itself.

I think it may be good for @GalRogozinski to take a look here as well

@iurii-ssv
Contributor Author

iurii-ssv commented Jan 6, 2026

@MatheusFranco99 thanks for the review,

> Hmm, I don't see a clear reason for allowing this. I think that's exactly what we want to avoid: the same peer sending the same logical message twice ("duplicated"). I think it could be rejected already on its first duplicate.

As I mentioned in the PR description, it would be hard/expensive to implement the necessary logic to track & correctly punish the very 1st violation. So instead, we just "record" the violation the 1st time it happens and only "punish" (by message reject) if it happens again.

Does that make sense? I agree that it would be better not to go for this "shortcut", but it's a good practical trade-off to take, I think (it's not like it can be abused by an attacker).

> Still, at least, I think having a common state could only hurt the logical rules (semantic rules don't matter here)

I'm not sure I understand what this ^ part refers to, could you expand? Are the changes in this PR fine to do or not?

@MatheusFranco99
Contributor

> it would be hard/expensive to implement the necessary logic to track & correctly punish the very 1st violation. So instead, we just "record" the violation the 1st time it happens and only "punish" (by message reject) if it happens again.

Hmm, sorry, I'm a bit lost here. When you say we just "record" the violation the 1st time it happens, isn't that already the necessary tracking logic being used? If not, how do you detect it? Or is this detection done by some other "approximation" mechanism?

I don't think I completely understood it by looking at the code, but is this duplication counter (for a certain peer) per logical message step (like committee: X, duty type: D, slot: S, round: R, prepare), or is it a counter for all messages?

> I'm not sure I understand what this ^ part refers to, could you expand? Are the changes in this PR fine to do or not?

This is just a side-thought/concern; I'm adding it here for others to also think about it and confirm I didn't miss something

@iurii-ssv
Contributor Author

iurii-ssv commented Jan 6, 2026

> Hmm, sorry, I'm a bit lost here. When you say we just "record" the violation the 1st time it happens, isn't that already the necessary tracking logic being used? If not, how do you detect it? Or is this detection done by some other "approximation" mechanism?

Yeah, sorry, it's a bit confusing; basically:

  • in my terminology, a violation is a validation-logic error before we classify that error as ignore or reject
  • so we need to figure out whether the violation is intentional (the same peer knowingly sending us the bad message 2+ times, in which case we want to reject the 2nd, 3rd, etc. messages) or unintentional (due to the way libp2p works, any peer can unknowingly send you a violating message 1 time, so we need to ignore it as long as it is their 1st time)
  • prior violations are tracked via the SeenViolations structure in this PR
  • also note, this PR targets (properly classifies between ignore and reject, instead of just always doing ignore) only a subset of the possible duplicate messages an SSV node might receive (specifically those observed to be sent when running a duplicate SSV node during my testing) ... ideally we'd want to treat all errors with the same approach, but maybe it's just not necessary in practice

Contributor

@GalRogozinski GalRogozinski left a comment


I am rejecting not because the change isn't good; I actually don't know if it is good.
It is simply something the spec team needs to schedule time for a correct analysis.

The rejection is just to hold off the merge for now

cc @Tom-ssvlabs

Contributor

@oleg-ssvlabs oleg-ssvlabs left a comment


gj!
Left a couple of comments.

```diff
 if len(signedSSVMessage.FullData) != 0 && signerState.HashedProposalData != nil {
-	if *signerState.HashedProposalData != sha256.Sum256(signedSSVMessage.FullData) {
-		return ErrDifferentProposalData
+	msgHashedProposalData := sha256.Sum256(signedSSVMessage.FullData)
```
Contributor


Here there is a validation which ensures consensusMessage.Root equals the hash of FullData (the same SHA256 hashing, done under the specqbft.HashDataRoot() method). That validation runs before this code.

This basically means you don't need to hash FullData here for the comparison, you can do this instead:

```go
*signerState.HashedProposalData != consensusMessage.Root
```

Hashing is not very expensive, but FullData's max size is 8 MB (considering the ssz-max tag), so this minor optimization might bring some performance improvement.

Contributor


You can also change here from:

```go
fullDataHash := sha256.Sum256(signedSSVMessage.FullData)
signerState.HashedProposalData = &fullDataHash
```

to:

```go
signerState.HashedProposalData = &consensusMessage.Root
```
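
Taken together, the two suggestions would make the duplicate-proposal handling look roughly like this (a sketch only; the surrounding validation flow and the exact IgnoreOrReject signature used here are assumptions, not the PR's literal code):

```go
// consensusMessage.Root has already been validated (earlier in the flow)
// to equal sha256.Sum256(signedSSVMessage.FullData), so it can stand in
// for re-hashing the up-to-8 MB FullData.
if len(signedSSVMessage.FullData) != 0 {
	if signerState.HashedProposalData != nil {
		if *signerState.HashedProposalData != consensusMessage.Root {
			// same signer and round, but different proposal data
			return signerState.IgnoreOrReject(peerID, ErrDifferentProposalData)
		}
	} else {
		signerState.HashedProposalData = &consensusMessage.Root
	}
}
```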
