Conversation

@siv2r (Contributor) commented Mar 27, 2025

Addresses #25

This pull request updates ChillDKG APIs to serialize messages between participants and the coordinator as bytes. Specifically, these API input/output types have been updated to use bytes:

  • ParticipantMsg1
  • ParticipantMsg2
  • CoordinatorMsg1
  • CoordinatorMsg2
  • CoordinatorInvestigationMsg

To support this, I've implemented to_bytes() and from_bytes() (or from_bytes_and_n()) methods for each named tuple. Additionally, I've introduced the GE.from_bytes_with_infinity() method (EDIT: created a PR for this function in upstream).
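
For illustration, the general shape of such a method pair might look like the sketch below; the field name, sizes, and layout are hypothetical and not the actual ChillDKG message format:

from typing import List, NamedTuple

class ExampleMsg(NamedTuple):
    # Hypothetical message carrying one 33-byte public key per participant.
    pubkeys: List[bytes]

    def to_bytes(self) -> bytes:
        return b"".join(self.pubkeys)

    @classmethod
    def from_bytes_and_n(cls, b: bytes, n: int) -> "ExampleMsg":
        # n is passed explicitly rather than inferred from len(b).
        if len(b) != 33 * n:
            raise ValueError
        return cls([b[i : i + 33] for i in range(0, 33 * n, 33)])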

Alternative Designs

  • Including n in serialization: Instead of using from_bytes_and_n(), we could encode n within the first four bytes of the serialization. This approach would remove the explicit n parameter. However, since we'd have to repeat this encoding at multiple layers (chilldkg, encpedpop, simplpedpop), it would result in redundant and less clean serialization.

  • Type Hinting: Currently, messages in API input/output are directly typed as bytes, which isn't ideal for type clarity. An alternative would be to create explicit aliases like ParticipantMsg1Bytes and CoordinatorMsg1Bytes, similar to what we have with RecoveryData. Another option would be to rename the current structured types (e.g., to ParticipantMsg1Internal or ParticipantMsg1Struct) and use the existing names as byte aliases. Both options are sketched below.
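
For concreteness, a rough sketch of both aliasing options (illustrative only; whether to use a plain alias or typing.NewType would follow whatever RecoveryData already does):

from typing import NewType

# Option A: byte aliases alongside the structured types
ParticipantMsg1Bytes = NewType("ParticipantMsg1Bytes", bytes)
CoordinatorMsg1Bytes = NewType("CoordinatorMsg1Bytes", bytes)

# Option B: rename the structured type and reuse the current name for bytes
# class ParticipantMsg1Internal(NamedTuple): ...
# ParticipantMsg1 = NewType("ParticipantMsg1", bytes)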

Questions

  • Why does VSSCommitment.from_bytes_and_t explicitly require t? Can't we infer t by computing len(b) // 33 like we do in deserialize_recovery_data? If this explicit approach was intentional, should we consistently implement explicit parameters (from_bytes_and_t) for other similar types, even when t can be inferred (e.g., simplpedpop.ParticipantMsg.from_bytes)?

  • Currently, I've also implemented serialization/deserialization methods for DKGOutput and SessionParams. However, these types aren't actually exchanged between participants and the coordinator, so their serialization isn't used within the API inputs/outputs. Shall I remove these unused methods? (EDIT: removed these (de)ser methods)

@jonasnick (Collaborator) left a comment

Including n in serialization: [...] we could encode n within the first four bytes of the serialization. [...] However, since we'd have to repeat this encoding at multiple layers (chilldkg, encpedpop, simplpedpop), it would result in redundant and less clean serialization.

I think that's a good reason for not going with this alternative.

Questions

Why does VSSCommitment.from_bytes_and_t explicitly require t? Can't we infer t by computing len(b) // 33 like we do in deserialize_recovery_data? If this explicit approach was intentional, should we consistently implement explicit parameters (from_bytes_and_t) for other similar types, even when t can be inferred (e.g., simplpedpop.ParticipantMsg.from_bytes)?

I think the current implementation of computing t in simplpedpop.ParticipantMsg.from_bytes from the length of the data itself is broken. That's because it's never checked that this t is actually correct. The coordinator_step function gets t as an argument, but it's never checked that it actually matches t in the participant messages. AFAICS this will lead to a crash in the computation of sum_coms_to_nonconst_terms.

Similarly, if the t computed in the CoordinatorMsg's from_bytes function is wrong, then the assertion assert len(sum_coms_to_nonconst_terms) == t - 1 fires, instead of raising a ValueError as for other parsing errors.

These kinds of bugs are a good argument for doing as much error handling as possible in the parsing functions themselves. So, I'd suggest to "consistently implement explicit parameters" (such as t) in the from_bytes functions and make sure that the error is handled inside the function. Then we can, for example, also remove the assertion len(sum_coms_to_nonconst_terms) == t - 1.
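
Roughly, the idea (using a hypothetical helper, not the actual code): t is passed in explicitly and validated during parsing, so a mismatch surfaces as a parsing error rather than a later assertion failure.

from typing import List

def parse_coms(b: bytes, t: int) -> List[bytes]:
    # The caller states the expected t; a length mismatch is rejected here.
    if len(b) != 33 * t:
        raise ValueError
    return [b[i : i + 33] for i in range(0, 33 * t, 33)]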

Currently, I've also implemented serialization/deserialization methods for DKGOutput and SessionParams. However, these types aren't actually exchanged between participants and the coordinator, so their serialization isn't used within the API inputs/outputs. Shall I remove these unused methods?

I'd remove them because:

  • I'd expect applications will typically have their own way of sending and collecting SessionParams.
  • DKGOutput is not a protocol message that is sent between spec-compliant participants.

Comment on lines 77 to 81
# Read com (33*t bytes)
if len(rest) < 33 * t:
    raise ValueError
com, rest = (
    VSSCommitment.from_bytes_and_t(rest[: 33 * t], t),
    rest[33 * t :],
)

if len(rest) != 0:
    raise ValueError
Collaborator:

It seems like both these ValueErrors are never raised, because after t is computed via divmod we have

t == (len(rest) - remainder) / 33   # by definition of divmod (and t is an integer)
t == len(rest) / 33                 # because we fail if remainder != 0
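
Spelled out next to the divmod call (a sketch of the reasoning, not the actual code):

t, remainder = divmod(len(rest), 33)
if remainder != 0:
    raise ValueError
# From here on, len(rest) == 33 * t exactly, so `len(rest) < 33 * t` can
# never hold, and after slicing off 33 * t bytes the final
# `len(rest) != 0` check can never fire either.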

It seems like there are more cases of unnecessary conditions in other parsing functions.

Contributor Author:

Good catch, thanks! I modeled this function after the deserialize_recovery_data implementation. By that same reasoning, the ValueErrors below that come after divmod won't be triggered either, correct?

# Compute n
n, remainder = divmod(len(rest), (33 + 33 + 32 + 64))
if remainder != 0:
    raise ValueError
# Read hostpubkeys (33*n bytes)
if len(rest) < 33 * n:
    raise ValueError
hostpubkeys, rest = [rest[i : i + 33] for i in range(0, 33 * n, 33)], rest[33 * n :]
# Read pubnonces (33*n bytes)
if len(rest) < 33 * n:
    raise ValueError
pubnonces, rest = [rest[i : i + 33] for i in range(0, 33 * n, 33)], rest[33 * n :]
# Read enc_secshares (32*n bytes)
if len(rest) < 32 * n:
    raise ValueError
enc_secshares, rest = (
    [Scalar.from_bytes_checked(rest[i : i + 32]) for i in range(0, 32 * n, 32)],
    rest[32 * n :],
)
# Read cert
cert_len = certeq_cert_len(n)
if len(rest) < cert_len:
    raise ValueError
cert, rest = rest[:cert_len], rest[cert_len:]
if len(rest) != 0:
    raise ValueError
return (t, sum_coms, hostpubkeys, pubnonces, enc_secshares, cert)

Collaborator:

I think you're right, but I'd still keep it because it's easier to reason about than when the ValueError is removed:
We know that after the check that remainder == 0, the following holds in the integers (and n is positive):

n = len(rest)/(33 + 33 + 32 + 64)

So

(33 + 33 + 32 + 64)*n = len(rest)

Since (33 + 33 + 32 + 64)*n >= 33*n, we have that len(rest) >= 33*n.


# Read Pop (64 bytes)
if len(rest) < 64:
    raise ValueError
Collaborator:

Maybe we should introduce a dedicated parsing error (this can be a later PR).

Contributor Author:

I’d like to add this parsing error in a new commit under the same PR, since it needs to be considered when writing the test vectors.

real-or-random added a commit to secp256k1lab/secp256k1lab that referenced this pull request Aug 4, 2025
fa4e7c6 Introduce `GE.from_bytes_compressed_with_infinity` (siv2r)

Pull request description:

  I'm working on [adding serialization/deserialization methods to ChillDKG](BlockstreamResearch/bip-frost-dkg#88) and need a helper function for deserializing coordinator messages. The function I need would map the 33 zero bytes to the infinity point.

  For context: BlockstreamResearch/bip-frost-dkg#88 (comment)

  I named the method `from_bytes_with_infinity` to match the existing `to_bytes_with_infinity` method in GE, but `from_zero_bytes` might be a more accurate name.

ACKs for top commit:
  theStack:
    ACK fa4e7c6
  real-or-random:
    utACK fa4e7c6

Tree-SHA512: 128d18a3868eec82c09655cbffb13f8f146e9fc328637cce648ef7a320eb01647e2491618bd506d8e20d57659b4ae0ec3e213b05fa93cb06c2c43830fcaf5510
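
For context, the behavior of that helper can be sketched as follows; the import path and the use of GE() for the point at infinity are assumptions here, not copied from the merged commit:

# A hedged sketch, not the secp256k1lab implementation.
from secp256k1lab.secp256k1 import GE

def from_bytes_compressed_with_infinity(b: bytes) -> GE:
    if b == bytes(33):
        return GE()  # 33 zero bytes map to the point at infinity
    return GE.from_bytes_compressed(b)
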
The (de)ser methods were added for the following types:
- ParticipantMsg1
- ParticipantMsg2
- CoordinatorMsg1
- CoordinatorMsg2
- CoordinatorInvestigationMsg

These message types are now represented as bytes in the user-facing APIs.

Executed `./all.sh` to update the README file with the new message types.
@siv2r (Contributor Author) commented Sep 16, 2025

I'd suggest to "consistently implement explicit parameters" (such as t) in the from_bytes functions

I’ve updated the deserialization functions to follow this style. The code won’t compile yet because it needs the GE.from_bytes_with_infinity method, which isn’t available in the current secp256k1lab version. The method exists upstream, so once we update secp256k1lab to the latest version, everything should work.

@siv2r (Contributor Author) commented Sep 17, 2025

I’ve been thinking about how to structure custom parsing errors and came up with a few possible designs. I’m leaning toward Design 3, with Design 2 (using string arguments) under the hood. I’d like to hear your thoughts on this.

Design 1: Layer-specific subclasses

# util.py
class MessageParseError(ValueError):
    """Base class for message parsing errors."""

# simplpedpop.py
class ParticipantMessageError(MessageParseError): 
    pass

class CoordinatorMessageError(MessageParseError): 
    pass

class CoordinatorInvestigationMessageError(MessageParseError): 
    pass

# (same pattern repeated in encpedpop.py and chilldkg.py)

This design fits the current style and makes type-specific catching easy. The downside is boilerplate: nine new classes across three layers.

Design 2: Message type as an argument

# util.py
from typing import Type

class MessageParseError(ValueError):
    """Base class for message parsing errors."""
    def __init__(self, message_type: Type, msg: str):
        self.message_type = message_type
        super().__init__(msg)

# Usage (sketch)
def from_bytes_and_t(b: bytes, t: int) -> ParticipantMsg:
    ...
    if len(rest) < 64:
        raise MessageParseError(
            simplpedpop.ParticipantMsg,
            "message too short",
        )

# We could also pass a string, e.g. raise MessageParseError("ParticipantMsg", ...), instead of the type.
# Either way, exceptions caught at lower layers don't need to be converted before propagating them to higher layers.

This design keeps the number of classes small. The downside is that it deviates from the current design pattern.

Design 3: Unified base for all parsing errors

# util.py
class ParseError(ValueError):
    """Base for all parsing errors in the protocol."""

class MessageParseError(ParseError):
    """Base for message parsing errors."""
    # Could use Design 1 or 2 here

class RecoveryDataError(ParseError):
    """Failed to parse recovery data."""

This option creates a single umbrella for all parsing errors.

@jonasnick (Collaborator) left a comment

I think we also need to look top-down: what errors do we need to produce to give the user the information they need and to allow us to test the API? For example, in the participant functions of ChillDKG it's sufficient to raise a generic message parsing error, because there's always just one message argument in the API that can fail to parse. On the other hand, the coordinator functions in chilldkg get multiple messages, and we want to raise an error containing the index of the failing message. What design best facilitates this?

@siv2r (Contributor Author) commented Sep 27, 2025

Ahh, I see. The requirement here is similar to how FaultyParticipantError and FaultyCoordinatorError are implemented. We can create a base class MessageError("error message") that gets raised inside every *from_bytes* method. Then we promote this to role-specific errors (ParticipantMessageError with index, or CoordinatorMessageError without index) at the public API entry points (participant_step*, coordinator_step*, coordinator_investigate, etc.).

I'll implement this design in a new commit, and we can iterate from there.
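
A sketch of that shape (all names here are placeholders, not the final API):

from typing import Callable, List

class MsgParseError(ValueError):
    """Raised inside a from_bytes method when a message fails to parse."""

class ParticipantMsgParseError(MsgParseError):
    """Role-specific error carrying the index of the faulty message."""
    def __init__(self, participant: int, msg: str):
        self.participant = participant
        super().__init__(msg)

def parse_all(pmsgs_bytes: List[bytes], from_bytes: Callable) -> List:
    # Coordinator-side promotion: the generic parsing error becomes one that
    # names the offending participant at the public API boundary.
    out = []
    for i, b in enumerate(pmsgs_bytes):
        try:
            out.append(from_bytes(b))
        except MsgParseError as e:
            raise ParticipantMsgParseError(i, str(e)) from e
    return out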

Also, remove redundant parsing in the coordinator_investigate() fn in chilldkg.py
@siv2r (Contributor Author) commented Sep 29, 2025

I've updated the parsing code to raise MsgParseError("error msg") instead of a plain ValueError. I've also left out simplpedpop.ParticipantMsg and encpedpop.CoordinatorMsg from the error message string since I don't think the test function needs to know which layer the parsing fails at. More importantly, other developers probably won't implement ChillDKG in a wrapper style (right?), so including this info in the error message didn't seem that useful.
