AGENTS.md

Project overview
Directory structure
- Specs
- Tests
Key conventions
Important commands
Writing tests
Common tasks
Important notes
- Verify current spec behavior
- Fork inheritance

Project overview

This repository contains the Ethereum Proof-of-Stake Consensus Specifications. It serves as:

Formal specifications in human-readable markdown with embedded Python
Executable reference implementation (Python code generated from markdown)
Reference test generator for client implementations
Protocol development platform organized by network upgrades (forks)

The specifications define how Ethereum's consensus layer (beacon chain) operates.

Directory structure

Specs

/specs/
  phase0/     # The genesis specs
  altair/     # The 1st upgrade (starts with A)
  bellatrix/  # The 2nd upgrade (starts with B)
  capella/    # The 3rd upgrade (starts with C)
  ...
  _features/  # Features which have not been scheduled for inclusion

Tests

/tests/
  core/pyspec/eth_consensus_specs/
    <fork>/              # Assembled pyspec (do not edit)
    test/<fork>/         # Test cases organized by fork
      block_processing/
      epoch_processing/
      sanity/
      ...
    test/helpers/        # Shared test helpers
      <fork>/            # Fork-specific test helpers
  generators/            # Reference test generators
  formats/               # Test format specifications

Key conventions

Specification files

Structure

Title
Table of contents
Introduction
Types
Constants
Presets
Configuration
Containers
Functions

Warnings

Check the README to see which forks are stable vs in-development. In-development specs must include a warning at the top:

*Note*: This document is a work-in-progress for researchers and implementers.

This line should be removed once the spec becomes stable.

Table of contents

The table of contents is auto-generated by make lint, so do not update it manually. New spec files need these TOC markers added by hand:

<!-- mdformat-toc start --slug=github --no-anchors --maxlevel=6 --minlevel=2 -->
<!-- mdformat-toc end -->

Markdown directives

Special HTML comments control spec parsing:

 - Skip the next code block (used for non-executable specs like p2p-interface.md)
 - Type is defined externally
 - Constant is predefined or function-dependent

Python code in markdown

Python code is embedded in fenced code blocks and must be:

Valid, executable Python
Fully type-hinted
Documented with docstrings for functions (wrapped at 80 chars, proper punctuation, double backticks for inline code)

Example:

def get_active_validator_indices(state: BeaconState, epoch: Epoch) -> Sequence[ValidatorIndex]:
    """
    Return the sequence of active validator indices at ``epoch``.
    """
    return [
        ValidatorIndex(i) for i, v in enumerate(state.validators) if is_active_validator(v, epoch)
    ]
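
Because the Python lives inside markdown, tooling has to pull the fenced blocks out before anything can run. The sketch below is a toy illustration of that idea only; it is not the repository's actual spec builder, and the names FENCE, PYTHON_BLOCK, and extract_python_blocks are hypothetical.

```python
import re
from typing import List

# Built programmatically so this example can itself sit inside a fenced block.
FENCE = "`" * 3

# Matches a fenced block tagged "python" and captures its body (non-greedy).
PYTHON_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def extract_python_blocks(markdown: str) -> List[str]:
    """
    Return the body of every fenced Python block in ``markdown``.
    """
    return PYTHON_BLOCK.findall(markdown)
```

A real builder also has to honor the markdown directives and merge blocks across forks, which this sketch ignores.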

Code style

The specs are implemented by many clients in different languages. Keep code simple and readable:

Avoid Python-specific features like map, filter, lambda, or complex comprehensions
Prefer simple for loops over clever one-liners when it aids readability
Use straightforward control flow that translates easily to other languages
Be concise and avoid unnecessary verbosity
Match the style of existing spec code
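
To make the guidance above concrete, the sketch below computes the same value twice: once with map, filter, and lambda, and once with a plain loop. The Validator class and both function names are hypothetical stand-ins for illustration, not spec code.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Validator:
    # Hypothetical, simplified stand-in for the spec's Validator container.
    effective_balance: int
    slashed: bool


def total_unslashed_balance_clever(validators: Sequence[Validator]) -> int:
    # Discouraged: map/filter/lambda do not translate cleanly to every language.
    return sum(map(lambda v: v.effective_balance, filter(lambda v: not v.slashed, validators)))


def total_unslashed_balance_simple(validators: Sequence[Validator]) -> int:
    # Preferred: a plain loop with explicit control flow.
    total = 0
    for validator in validators:
        if not validator.slashed:
            total += validator.effective_balance
    return total
```

Both functions return the same result; the second is longer but maps line for line onto most client languages.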

Fork comments

Fork comments mark specific changes within a function or container. Place each comment directly above the line it describes, not above the entire definition.

Example: function parameters:

Note: The "# Removed field" pattern documents function parameters or class fields that existed in a previous fork but were removed in this one.

def process_execution_payload(
    state: BeaconState,
    # [Modified in Gloas:EIP7732]
    # Removed `body`
    # [New in Gloas:EIP7732]
    signed_envelope: SignedExecutionPayloadEnvelope,
    execution_engine: ExecutionEngine,
    # [New in Gloas:EIP7732]
    verify: bool = True,
) -> None:
    pass

Example: modified container:

class DataColumnSidecar(Container):
    index: ColumnIndex
    column: List[Cell, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    # [Modified in Gloas:EIP7732]
    # Removed `kzg_commitments`
    kzg_proofs: List[KZGProof, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    # [Modified in Gloas:EIP7732]
    # Removed `signed_block_header`
    # [New in Gloas:EIP7732]
    slot: Slot
    # [New in Gloas:EIP7732]
    beacon_block_root: Root

Example: new container with multiple EIPs:

Note: If a change is associated with one or more EIPs, list all of them. Only omit the EIP if the change is unrelated to any EIP (more common in older forks).

class ExecutionRequests(Container):
    # [New in Electra:EIP6110]
    deposits: List[DepositRequest, MAX_DEPOSIT_REQUESTS_PER_PAYLOAD]
    # [New in Electra:EIP7002:EIP7251]
    withdrawals: List[WithdrawalRequest, MAX_WITHDRAWAL_REQUESTS_PER_PAYLOAD]
    # [New in Electra:EIP7251]
    consolidations: List[ConsolidationRequest, MAX_CONSOLIDATION_REQUESTS_PER_PAYLOAD]

Container definitions

SSZ containers use Python dataclass-style syntax:

class PendingConsolidation(Container):
    source_index: ValidatorIndex
    target_index: ValidatorIndex

Type system

Custom types: Slot, Epoch, Gwei, Root, BLSPubkey, ValidatorIndex, etc.
SSZ primitives: uint8, uint16, uint32, uint64, uint256, boolean, Bytes32
SSZ composites: Container, List[T, N], Vector[T, N], Bitlist[N], Bitvector[N]
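
As a rough illustration of how custom types layer on top of SSZ primitives, the sketch below models uint64 as a range-checked int and derives Slot and Epoch from it. These classes are simplified stand-ins written for this example and are not the pyspec's SSZ implementation. SLOTS_PER_EPOCH is hard-coded to the mainnet value for the example.

```python
class uint64(int):
    # Simplified stand-in for the SSZ uint64 type: an int with a range check.
    def __new__(cls, value: int) -> "uint64":
        if not 0 <= value < 2**64:
            raise ValueError(f"value {value} out of range for uint64")
        return super().__new__(cls, value)


class Slot(uint64):
    pass


class Epoch(uint64):
    pass


# Mainnet preset value, fixed here only for the sake of the example.
SLOTS_PER_EPOCH = 32


def compute_epoch_at_slot(slot: Slot) -> Epoch:
    """
    Return the epoch number at ``slot``.
    """
    return Epoch(slot // SLOTS_PER_EPOCH)
```

Wrapping results in the custom type, as in the return statement, keeps signatures self-documenting and is the pattern to follow in spec functions.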

Presets vs configs

Presets (compile-time) define protocol limits that affect type sizes:

Located in presets/mainnet/ and presets/minimal/
Examples: MAX_VALIDATORS_PER_COMMITTEE, SLOTS_PER_EPOCH
Changing these requires recompiling

Configs (runtime) define network-specific parameters:

Located in configs/
Examples: GENESIS_FORK_VERSION, ELECTRA_FORK_EPOCH, DEPOSIT_CONTRACT_ADDRESS
Can be changed without recompilation

When adding new constants, determine if they affect type sizes (preset) or are just network parameters (config). Preset values go in both mainnet/ and minimal/ directories.

Important: Do not create presets or configs that are derived from other presets or configs. Each value should be defined independently.
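
Since preset values must exist for both mainnet and minimal, a small check can catch a key that was added to only one of them. The sketch below assumes the presets have already been loaded into dictionaries; the dictionary contents and the function name keys_missing_from are hypothetical, and MINIMAL_PRESET is left incomplete on purpose to show a failure.

```python
from typing import Dict, List

# Hypothetical in-memory copies of two preset files (layout is an assumption).
MAINNET_PRESET: Dict[str, int] = {
    "SLOTS_PER_EPOCH": 32,
    "MAX_VALIDATORS_PER_COMMITTEE": 2048,
}
# Deliberately incomplete to demonstrate the check.
MINIMAL_PRESET: Dict[str, int] = {
    "SLOTS_PER_EPOCH": 8,
}


def keys_missing_from(preset: Dict[str, int], reference: Dict[str, int]) -> List[str]:
    """
    Return the keys defined in ``reference`` but absent from ``preset``.
    """
    missing = []
    for key in reference:
        if key not in preset:
            missing.append(key)
    return sorted(missing)
```

Here keys_missing_from(MINIMAL_PRESET, MAINNET_PRESET) reports MAX_VALIDATORS_PER_COMMITTEE, which signals that the minimal preset still needs that value.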

Important commands

Everything is done through the Makefile. Run make help verbose=true for full documentation.

Linting

make lint

This command runs all linters, formatters, and checks for the repository. It covers Python code style, markdown formatting, table of contents validation, and spec-specific checks. Always run this before committing to ensure changes meet the project standards.

Running tests

# Run all minimal preset tests (~30 minutes)
make test

# Run all mainnet preset tests (~5 hours)
make test preset=mainnet

# Run tests for a specific fork
make test fork=deneb

# Run tests matching a pattern (partial match)
make test k=deposit  # Runs all tests with "deposit" in the name

# Combine options
make test preset=mainnet fork=deneb k=test_verify_kzg_proof

When testing, focus on what the changes might have affected. Running the full test suite is slow, so target the relevant areas. For example, a bug fix in Electra deposit handling should be tested with the deposit tests for Electra and every later fork. This may require multiple commands with different fork=<fork> options, as there is currently no single command to run tests for a given fork and all subsequent forks.

Use preset=minimal (the default) while developing and iterating on changes. Once everything works, run the same targeted tests with preset=mainnet as a final sanity check before committing.
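
Because no single command covers a fork and everything after it, one option is to generate the individual commands. The helper below is a hypothetical convenience sketch, not part of the repository. FORK_ORDER is an assumed ordering that should be checked against the README before use.

```python
from typing import List, Optional

# Assumed fork order; verify against the README before relying on it.
FORK_ORDER: List[str] = [
    "phase0", "altair", "bellatrix", "capella", "deneb", "electra", "fulu", "gloas",
]


def build_test_commands(
    first_fork: str, k: Optional[str] = None, preset: str = "minimal"
) -> List[str]:
    """
    Return one ``make test`` command per fork from ``first_fork`` onward.
    """
    # Raises ValueError if the fork name is unknown.
    start = FORK_ORDER.index(first_fork)
    commands = []
    for fork in FORK_ORDER[start:]:
        command = f"make test preset={preset} fork={fork}"
        if k is not None:
            command += f" k={k}"
        commands.append(command)
    return commands
```

For example, build_test_commands("electra", k="deposit") yields one command each for electra and every later fork in FORK_ORDER, which can then be run in sequence.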

Generating reference tests

# Generate all reference tests (runs both presets by default)
make reftests

# AI agents should use verbose mode (default view uses dynamic tables)
make reftests verbose=true

# Generate tests for a specific fork
make reftests fork=electra verbose=true

# Generate tests matching a pattern (omit the "test_" prefix)
make reftests k=verify_kzg_proof verbose=true

# Generate a specific test runner's suite
make reftests runner=bls verbose=true

# Combine options
make reftests preset=mainnet fork=deneb k=verify_kzg_proof verbose=true

Reference tests are written to the ../consensus-spec-tests directory, which is created automatically.

If a full suite of tests has been generated and an issue is identified, there is no need to regenerate everything. Only the affected test cases need to be regenerated; the framework will delete the individual test case directories before regenerating them.

Note that if a test case is removed from the framework, make reftests will not delete previously generated reference tests for that case. The corresponding directories must be deleted manually, or the entire ../consensus-spec-tests directory can be removed if regenerating everything is acceptable.

To see available runners:

find tests/generators/runners -maxdepth 1 -type f -name '*.py' ! -name '__init__.py' -exec basename {} .py \;
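
For environments where find is unavailable, a rough Python equivalent of the command above might look like the following. The function name list_runners is hypothetical. Unlike find, it returns the names sorted.

```python
from pathlib import Path
from typing import List


def list_runners(runners_dir: Path) -> List[str]:
    """
    Return the module names of the top-level ``.py`` files in
    ``runners_dir``, excluding ``__init__.py``.
    """
    names = []
    # iterdir() does not recurse, mirroring -maxdepth 1.
    for path in runners_dir.iterdir():
        if path.is_file() and path.suffix == ".py" and path.name != "__init__.py":
            names.append(path.stem)
    return sorted(names)
```

Calling list_runners(Path("tests/generators/runners")) from the repository root should list the same names that the find command prints.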

Cleaning

make clean

This command deletes all untracked files in the repository. Any untracked files that should be preserved must be staged with git add before running this command.

Writing tests

Reference tests vs unittests

Reference tests generate test vectors that client implementations use. Prefer reference tests when possible since they benefit the entire ecosystem.

Unittests are internal-only tests that don't produce reference files. Use these when reference tests are not feasible (e.g., testing internal helpers or edge cases that do not map to client behavior). Unittests are located in unittests directories.

Reference test formats

Reference test format specifications are located in tests/formats/. These define the directory and file structure for generated reference tests, documenting the expected inputs, outputs, and file organization for each test category (e.g., operations/, sanity/, epoch_processing/). Client implementations use these specifications to parse and run the reference tests.

Test decorators

When writing tests, use these decorators:

@with_all_phases - Run on all forks
@with_phases([DENEB, FULU]) - Run on specific forks
@with_deneb_and_later - Run on Deneb and all subsequent forks
@with_electra_and_later - Run on Electra and all subsequent forks
@spec_state_test - State transition test
@spec_test - General spec test
@always_bls - Always enable BLS verification

Test pattern

Tests yield their outputs for reference test generation:

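from eth_consensus_specs.test.context import spec_state_test, with_all_phases
from eth_consensus_specs.test.helpers.block import build_empty_block_for_next_slot
from eth_consensus_specs.test.helpers.state import state_transition_and_sign_block


@with_all_phases
@spec_state_test
def test_example(spec, state):
    # Setup: record the state before any changes
    yield "pre", state

    # Execute: build an empty block and apply it to the state
    block = build_empty_block_for_next_slot(spec, state)
    signed_block = state_transition_and_sign_block(spec, state, block)

    # Output: the applied blocks and the resulting state
    yield "blocks", [signed_block]
    yield "post", state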
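
To see why tests yield named values, it can help to look at a toy consumer of such a generator. The sketch below is illustrative only and does not reproduce how the real framework collects or serializes outputs. Here collect_outputs and toy_test are hypothetical names, and a plain dict stands in for the beacon state.

```python
from typing import Any, Callable, Dict, Iterator, Tuple


def collect_outputs(
    test_fn: Callable[..., Iterator[Tuple[str, Any]]], *args: Any
) -> Dict[str, Any]:
    """
    Run a generator-style test and gather its yielded parts by name.
    """
    outputs = {}
    for name, value in test_fn(*args):
        outputs[name] = value
    return outputs


def toy_test(state: Dict[str, int]) -> Iterator[Tuple[str, Any]]:
    # A dict stands in for the state; copy it so the snapshot is not mutated later.
    yield "pre", dict(state)
    state["slot"] += 1
    yield "post", dict(state)
```

Calling collect_outputs(toy_test, {"slot": 0}) gathers a "pre" and a "post" snapshot, which is the shape of data a reference test generator would write to disk.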

Common tasks

Adding a new helper function

Add the Python function to the appropriate spec markdown file
Add tests in tests/core/pyspec/eth_consensus_specs/test/
Run make lint to run checks

Modifying an existing function

Find the function in the spec markdown
Make the necessary changes, adding "fork comments" above changed lines
Run make lint to run checks

Adding a new container field

Add the field to the container definition in the spec markdown
Update any functions that construct or use the container
Update preset/config if needed
Run make lint to run checks

Adding a new fork or feature

Adding a new fork (e.g., "foobar") requires updates to many files:

1. Build system:

Makefile - Add to ALL_EXECUTABLE_SPEC_NAMES

2. GitHub automation:

.github/labeler.yml - Add label config for auto-labeling PRs
.github/release-drafter.yml - Add category for release notes

3. Spec generation (pysetup/):

pysetup/constants.py - Add FOOBAR = "foobar"
pysetup/md_doc_paths.py - Import constant, add to PREVIOUS_FORK_OF
pysetup/spec_builders/foobar.py - Create SpecBuilder class
pysetup/spec_builders/__init__.py - Import and register the SpecBuilder

4. Test infrastructure (tests/core/pyspec/eth_consensus_specs/test/):

helpers/constants.py - Add constant, update ALL_PHASES, PREVIOUS_FORK_OF, POST_FORK_OF
helpers/forks.py - Add is_post_foobar(spec) function
context.py - Add with_foobar_and_later decorator

5. Spec files:

specs/foobar/ - For scheduled forks
specs/_features/eipNNNN/ - For experimental features (must start with "eip")

6. Presets (if the fork has preset values):

presets/mainnet/foobar.yaml
presets/minimal/foobar.yaml
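
With this many touch points, a quick existence check can help confirm that nothing fork-specific was forgotten. The sketch below covers only the paths from the checklist above that contain the fork name; edits to shared files such as the Makefile still need manual review. Both function names are hypothetical.

```python
from pathlib import Path
from typing import List


def new_fork_paths(fork: str) -> List[str]:
    """
    Return the fork-specific paths named in the checklist for ``fork``.
    """
    return [
        f"pysetup/spec_builders/{fork}.py",
        f"specs/{fork}",
        # Preset files are only required if the fork has preset values.
        f"presets/mainnet/{fork}.yaml",
        f"presets/minimal/{fork}.yaml",
    ]


def missing_fork_paths(repo_root: Path, fork: str) -> List[str]:
    """
    Return the checklist paths that do not yet exist under ``repo_root``.
    """
    missing = []
    for relative in new_fork_paths(fork):
        if not (repo_root / relative).exists():
            missing.append(relative)
    return missing
```

Running missing_fork_paths(Path("."), "foobar") from the repository root returns an empty list once every fork-specific file is in place.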

Important notes

Verify current spec behavior

This is an evolving specification. Do not rely on prior knowledge or cached context when modifying the spec. Always read the current spec files to verify how functions, containers, and logic actually behave before making changes.

Fork inheritance

Each fork inherits all specs from the previous fork. The chain is defined in pysetup/md_doc_paths.py via PREVIOUS_FORK_OF. When generating a fork's spec, all markdown files from ancestor forks are loaded first.
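
The inheritance chain can be pictured as a walk backwards through the mapping until the genesis fork is reached. The sketch below uses a shortened toy copy of the mapping, assuming it maps each fork to its predecessor with None for phase0. The function name fork_ancestry is hypothetical.

```python
from typing import Dict, List, Optional

# Toy, shortened copy for illustration; the real mapping lives in
# pysetup/md_doc_paths.py and its exact shape is an assumption here.
PREVIOUS_FORK_OF: Dict[str, Optional[str]] = {
    "phase0": None,
    "altair": "phase0",
    "bellatrix": "altair",
    "capella": "bellatrix",
}


def fork_ancestry(fork: str) -> List[str]:
    """
    Return the forks from genesis up to and including ``fork``, in the
    order their markdown files would be loaded.
    """
    chain = []
    current: Optional[str] = fork
    while current is not None:
        chain.append(current)
        current = PREVIOUS_FORK_OF[current]
    # The walk runs newest to oldest, so reverse it for load order.
    chain.reverse()
    return chain
```

For example, fork_ancestry("capella") lists phase0, altair, bellatrix, and capella in that order, which is why a change in an older fork can surface in every fork built on top of it.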

When adding to a new fork:

Reference the previous fork at the top of the spec
Only include new or modified sections
Include an upgrade_to_<fork> function that converts the previous fork's BeaconState to the new fork's state

Changes to an older fork (functions, containers, constants, etc.) may require updates to newer forks as well, if those elements are used or modified in later forks.
