p2p: rework duplicate message rejections #2635
Conversation
Greptile Summary: Reworked duplicate message detection to be peer-independent while selectively punishing repeat offenders. Changed validation state keying from `spectypes.MessageID` + peerID to `spectypes.MessageID` only. Key changes:
Confidence Score: 5/5
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant Peer1 as Peer 1 (Node1A)
    participant Node as SSV Node
    participant Peer2 as Peer 2 (Node1B)
    participant State as SignerState
    Note over Node,State: First message arrives
    Peer1->>Node: Send message (msgID=Op+Duty+Slot+Round)
    Node->>State: Check if duplicate via SeenMsgTypes
    State-->>Node: Not seen yet
    Node->>State: Record message type
    Node->>State: Track violation: none
    Node->>Peer1: ACCEPT & broadcast
    Note over Node,State: Duplicate from SAME peer
    Peer1->>Node: Send same message again
    Node->>State: Check if duplicate via SeenMsgTypes
    State-->>Node: Already seen (limit reached)
    Node->>State: Check SeenViolations[Peer1][ErrDuplicatedMessage]
    State-->>Node: Not seen from this peer before
    Node->>State: Record SeenViolations[Peer1][ErrDuplicatedMessage]
    Node->>Peer1: IGNORE (first violation)
    Note over Node,State: Second duplicate from SAME peer
    Peer1->>Node: Send same message third time
    Node->>State: Check if duplicate via SeenMsgTypes
    State-->>Node: Already seen (limit reached)
    Node->>State: Check SeenViolations[Peer1][ErrDuplicatedMessage]
    State-->>Node: Already seen from this peer!
    Node->>Peer1: REJECT (repeated violation)
    Note over Node,State: Duplicate from DIFFERENT peer
    Peer2->>Node: Send same message (different peerID)
    Node->>State: Check if duplicate via SeenMsgTypes
    State-->>Node: Already seen (limit reached)
    Node->>State: Check SeenViolations[Peer2][ErrDuplicatedMessage]
    State-->>Node: Not seen from this peer before
    Node->>State: Record SeenViolations[Peer2][ErrDuplicatedMessage]
    Node->>Peer2: IGNORE (first violation for this peer)
```
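The accept/ignore/reject flow in the diagram can be sketched in Go. This is a minimal illustration of the technique, not the PR's actual code: `msgKey`, `violation`, `Verdict`, and `Process` are hypothetical simplifications of the real `SignerState` fields.

```go
package main

import "fmt"

// Verdict mirrors the three possible outcomes in the diagram.
type Verdict int

const (
	Accept Verdict = iota
	Ignore
	Reject
)

// msgKey is an illustrative stand-in for the peer-independent message identity.
type msgKey struct {
	slot, round uint64
	msgType     string
}

// violation tracks "this peer committed this violation before".
type violation struct {
	peer string
	err  string
}

// SignerState sketch: SeenMsgTypes is peer-independent, SeenViolations is per-peer.
type SignerState struct {
	SeenMsgTypes   map[msgKey]struct{}
	SeenViolations map[violation]struct{}
}

func NewSignerState() *SignerState {
	return &SignerState{
		SeenMsgTypes:   make(map[msgKey]struct{}),
		SeenViolations: make(map[violation]struct{}),
	}
}

const errDuplicatedMessage = "duplicated message"

// Process implements the diagram's decision flow: first sighting is accepted;
// a duplicate is ignored on a peer's first offense and rejected on repeats.
func (s *SignerState) Process(peerID string, k msgKey) Verdict {
	if _, dup := s.SeenMsgTypes[k]; !dup {
		s.SeenMsgTypes[k] = struct{}{}
		return Accept
	}
	v := violation{peer: peerID, err: errDuplicatedMessage}
	if _, seen := s.SeenViolations[v]; seen {
		return Reject // repeated violation from this peer
	}
	s.SeenViolations[v] = struct{}{}
	return Ignore // first violation: record it, don't punish yet
}

func main() {
	s := NewSignerState()
	k := msgKey{slot: 1, round: 1, msgType: "prepare"}
	fmt.Println(s.Process("peer1", k)) // 0 (Accept)
	fmt.Println(s.Process("peer1", k)) // 1 (Ignore: first violation)
	fmt.Println(s.Process("peer1", k)) // 2 (Reject: repeated violation)
	fmt.Println(s.Process("peer2", k)) // 1 (Ignore: first violation for this peer)
}
```

Note how a duplicate from a brand-new peer is still only ignored, matching the last section of the diagram.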
13 files reviewed, 1 comment
vyzo
left a comment
first pass, approach is pretty reasonable, i will do another pass.
Left a comment.
```go
	"github.com/attestantio/go-eth2-client/spec/phase0"
	"github.com/libp2p/go-libp2p/core/peer"
```
keep this line please, it is good practice to separate our own libraries from external deps. makes audit a tad easier.
This is actually still consistent with the current version of our `make format` command:
- if you run it after removing all the newlines, it won't add them back,
- but once those newlines are already there, it doesn't remove them,

so we can end up with two valid but different-looking imports sections (both of which are valid from the `make format` & `make lint` perspective).
Note also that we currently classify ssv-spec as any other third-party repo (we don't bundle ssv-spec and ssv packages together under one imports section).
I've pushed a commit to revert these imports-affecting changes for now, but in general I think we shouldn't worry much about it (or maybe use something like gci to enforce certain rules) - e3f2cfd
```go
signerCount := len(signedSSVMessage.OperatorIDs)
if signerCount > 1 {
	if signerState.SeenSigners == nil {
		signerState.SeenSigners = make(map[SignersBitMask]struct{}) // lazy init on demand to reduce mem consumption
```
Do we need to garbage collect cold entries? Maybe also limit the size?
I think it might be a good idea; I'm concerned it might open a DoS attack vector by making us use potentially unlimited memory in this map.
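One way to bound such a map is a size-capped set that refuses new entries once full. This is only a sketch of that idea, not the PR's code: `boundedSet`, `add`, and the cap value are hypothetical, and whether refusing new entries (versus evicting old ones) is acceptable depends on the validation semantics.

```go
package main

import "fmt"

// boundedSet caps the number of entries to limit memory reachable from the
// network. Names and behavior are illustrative, not from the PR.
type boundedSet[K comparable] struct {
	max int
	m   map[K]struct{}
}

func newBoundedSet[K comparable](max int) *boundedSet[K] {
	return &boundedSet[K]{max: max, m: make(map[K]struct{})}
}

// add returns false when the set is full and the key is new, so the caller
// can treat the message as suspicious instead of growing memory unboundedly.
func (b *boundedSet[K]) add(k K) bool {
	if _, ok := b.m[k]; ok {
		return true // already present, no growth
	}
	if len(b.m) >= b.max {
		return false // at capacity: refuse rather than allocate
	}
	b.m[k] = struct{}{}
	return true
}

func main() {
	s := newBoundedSet[string](2)
	fmt.Println(s.add("a"), s.add("b"), s.add("c"), s.add("a")) // true true false true
}
```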
For SeenSigners and SeenViolations we basically drop the reference(s) to those in SignerState.Reset, letting the Go GC collect the maps we used in the past.
So it is something to consider, but from what I see in the code it looks like it's working correctly.
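A minimal sketch of the lazy-init plus Reset pattern described here; field and type names are illustrative simplifications of the real `SignerState`:

```go
package main

import "fmt"

// SignerState sketch: the map is nil until first use.
type SignerState struct {
	SeenSigners map[uint64]struct{} // lazily initialized
}

func (s *SignerState) recordSigners(bitmask uint64) {
	if s.SeenSigners == nil {
		s.SeenSigners = make(map[uint64]struct{}) // lazy init on demand
	}
	s.SeenSigners[bitmask] = struct{}{}
}

// Reset drops the map reference; the old map becomes unreachable and the Go
// GC reclaims it, bounding the lifetime of network-reachable state.
func (s *SignerState) Reset() {
	s.SeenSigners = nil
}

func main() {
	var s SignerState
	s.recordSigners(0b101)
	fmt.Println(len(s.SeenSigners)) // 1
	s.Reset()
	fmt.Println(len(s.SeenSigners)) // 0 (len of a nil map is 0)
}
```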
ok, this limits the attack vector considerably. I think it should be ok too; I just get uneasy when I see maps reachable from the network.
Hmm, I don't see a clear reason for allowing this. I think that's already what we want to avoid: the same peer sending the same logical message twice ("duplicated"). I think it could be rejected already on its first duplicate.
Btw, this is a critical change. Per-peer state (btw maybe it's easier to reason about it as a "network view") was introduced so that we could penalize peers more accurately, as each peer is only responsible for its own state. (from knowledge-base)
- QBFT Logic
- PSig Logic
- Duty Logic

For the duplicated cases, we are already solving it with this change. I think it may be good for @GalRogozinski to take a look here as well
@MatheusFranco99 thanks for the review,
As I mentioned in the PR description, it would be hard/expensive to implement the necessary logic to track & correctly punish the very first violation. So instead, we just "record" the violation the first time it happens and only "punish" (by message reject) if it happens again. Does that make sense? I agree that it would be better to not go for this "shortcut", but I think it's a good practical trade-off to take (it's not like it can be abused by an attacker).
I'm not sure I understand what this ^ part refers to, could you expand? Are the changes in this PR fine to do or not?
Hmm, sorry, I'm a bit lost here. I think I didn't completely understand it by looking at the code, but is this duplication counter (for a certain peer) per logical message step (like
This is just a side-thought/concern, just adding it here for others to also think about and confirm I didn't miss something
Yeah, sorry, it's a bit confusing, basically
GalRogozinski
left a comment
I am rejecting not because the change isn't good; I actually don't know if it is good.
It is simply something the spec team needs to schedule time for correct analysis.
The rejection is just to hold off the merge for now.
cc @Tom-ssvlabs
oleg-ssvlabs
left a comment
gj!
Left a couple of comments
```go
if len(signedSSVMessage.FullData) != 0 && signerState.HashedProposalData != nil {
	if *signerState.HashedProposalData != sha256.Sum256(signedSSVMessage.FullData) {
		return ErrDifferentProposalData
	}
	msgHashedProposalData := sha256.Sum256(signedSSVMessage.FullData)
```
Here there is a validation which ensures `consensusMessage.Root` equals the hash of `FullData` (same SHA256 hashing under the `specqbft.HashDataRoot()` method). This validation runs before this code.
This basically means you don't need to hash `FullData` here for comparison; you can do this instead:
`*signerState.HashedProposalData != consensusMessage.Root`
Hashing is not very expensive, but `FullData` max size is 8 MB (considering the ssz-max tag), so this minor optimization might introduce some performance improvements.
You can also change here from:

```go
fullDataHash := sha256.Sum256(signedSSVMessage.FullData)
signerState.HashedProposalData = &fullDataHash
```

to:

```go
signerState.HashedProposalData = &consensusMessage.Root
```
Resolves #2632
This PR changes the validation-rules treatment of duplicate messages. A duplicate message is a message that should not be applied more than once from the QBFT-logic perspective (and it is "additionally" enforced via message-validation rules). The SSV node wants to detect duplicate messages in order to:
The behavior this PR is aiming to change:
In order to be able to tell whether a duplicate p2p message is coming from a new/different peer (or from the same peer we've received that/similar message from in the past) we need to store & update a message->peer mapping ... that is however somewhat expensive (and doesn't fit well with the current minimalistic/optimized implementation we have for message-validation). Instead, in this PR, we keep track of "seen validation-rule violations":

Additional considerations:
- `messageValidator.state` and `messageValidator.validationLockCache` maps were keyed by `spectypes.MessageID` + peerID ... that doesn't look correct to me because the "state" doesn't/shouldn't know/depend on which peer a message is received from
- `messageValidator.state` and `messageValidator.validationLockCache` maps are keyed by `spectypes.MessageID` (which is basically = Operator + DutyType) only, meaning messages with the same `spectypes.MessageID` targeting different slots will acquire the same validation-lock and will be validated sequentially ... I think we could potentially optimize that further, making validation for messages targeting different slots run concurrently, but it would be hard to implement this correctly on top of the existing validation code (and it is out of scope of this PR anyway)
- `SignerState.SeenMsgTypes` keeps track of both messages related to a QBFT instance and pre-consensus phase messages, and upon round change the `SignerState.Reset` call fully resets not only QBFT-instance-related state but pre-consensus phase messages as well. This means we are "allowing the receival of pre-consensus phase messages again once a round change happened", which doesn't make much sense. But it's not really abusable either, so there is no need to address that - just wanted to document that behavior
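The validation-lock keying described above can be sketched like this. The `validator`/`lockFor` names and string keys are illustrative (the real code keys by `spectypes.MessageID`); the point is that two messages with the same MessageID map to one lock regardless of sending peer, so they validate sequentially.

```go
package main

import (
	"fmt"
	"sync"
)

// validator sketch: one lock per MessageID, shared across all peers.
type validator struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex // keyed by MessageID only, not MessageID+peerID
}

// lockFor returns the validation lock for a MessageID, creating it on first
// use. Callers lock it around validation so same-ID messages serialize.
func (v *validator) lockFor(msgID string) *sync.Mutex {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.locks == nil {
		v.locks = make(map[string]*sync.Mutex)
	}
	l, ok := v.locks[msgID]
	if !ok {
		l = &sync.Mutex{}
		v.locks[msgID] = l
	}
	return l
}

func main() {
	v := &validator{}
	// Same MessageID from two different peers: the same lock instance.
	fmt.Println(v.lockFor("op1+attester") == v.lockFor("op1+attester")) // true
	// Different MessageIDs get independent locks.
	fmt.Println(v.lockFor("op1+attester") == v.lockFor("op2+proposer")) // false
}
```

This also makes the noted limitation concrete: messages with the same MessageID but different slots still share one lock, which is the sequential-validation behavior the description flags as a possible future optimization.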
messageValidator.stateandmessageValidator.validationLockCachemaps were keyed byspectypes.MessageID+peerID... that doesn't look correct to me because the "state" doesn't/shouldn't know/depend on which peer a message is received frommessageValidator.stateandmessageValidator.validationLockCachemaps are keyed byspectypes.MessageID(which is basically = Operator+DutyType) only, meaning messages with the samespectypes.MessageIDtargeting different slots will acquire the same validation-lock and will be validated sequentially ... I think we could potentially optimize that further making validation for messages targeting different slots run concurrently, but it would be hard to implement this correctly on top of existing validation code (and this is out of scope of this PR anyway)SignerState.SeenMsgTypeskeeps track of both: messages related to QBFT-instance as well as pre-consensus phase messages, and upon round-changeSignerState.Resetcall fully resets not only QBFT-instance-related stuff but also pre-consensus phase messages as well. This means we are "allowing the receival of pre-consensus phase messages again once round-change happened" which doesn't make much sense. But it's not really abusable either, so there is no need to address that - just wanted to document that behavior