MIR Codegen: Assembler Label & Jump Resolution #698

@gakonst

Summary

Implement correct label, jump, and PUSH-width resolution so scheduled EVM instruction sequences become final bytecode.

Parent issue: #687

Context

Currently, jumps in the codegen are placeholders. We need a proper assembler that:

  1. Assigns symbolic labels to basic blocks
  2. Computes concrete byte offsets
  3. Handles variable-width PUSH instructions (the target offset affects the PUSH size, which affects all subsequent offsets)
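
To make the width dependency in point 3 concrete, here is a minimal sketch of how the byte width of a PUSH could be derived from the offset it carries. The helper names `push_width` and `push_size` are hypothetical, and the sketch assumes zero is encoded as `PUSH1 0x00` rather than `PUSH0`.

```rust
/// Number of immediate bytes needed to encode `value` in a PUSHn instruction.
/// Assumption: zero is still encoded with one byte (PUSH1 0x00), not PUSH0.
fn push_width(value: usize) -> usize {
    // Count the significant bits, then round up to whole bytes.
    let bits = usize::BITS - value.leading_zeros();
    std::cmp::max(1, ((bits + 7) / 8) as usize)
}

/// Total encoded size of `PUSHn <value>`: one opcode byte plus the immediate.
fn push_size(value: usize) -> usize {
    1 + push_width(value)
}
```

A jump target at offset 0xFF fits in a 2-byte `PUSH1`, while a target at 0x100 needs a 3-byte `PUSH2`. That extra byte shifts every later instruction, which is why offsets and widths have to be solved together.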

Tasks

Label abstraction

  • Represent basic block labels as symbolic IDs
  • JUMP/JUMPI instructions target labels, not numeric offsets
  • Track label definitions and references
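
One possible shape for the label abstraction is sketched below. The `Label`, `Inst`, `LabelError`, and `check_labels` names are hypothetical and not taken from the existing codebase; the sketch only shows jumps referring to symbolic IDs and a pass that tracks definitions and references.

```rust
use std::collections::HashMap;

/// Symbolic ID of a basic block; the numbering scheme is an assumption.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Label(u32);

/// Assembler-level instruction before resolution (hypothetical shape).
#[derive(Clone, Debug, PartialEq)]
enum Inst {
    /// Plain opcode without an immediate, e.g. JUMP (0x56) or JUMPI (0x57).
    Op(u8),
    /// PUSH of literal bytes whose size is already known.
    Push(Vec<u8>),
    /// PUSH of a label's offset; width and bytes are filled in later.
    PushLabel(Label),
    /// Label definition; marks the start of a basic block.
    Def(Label),
}

#[derive(Debug, PartialEq)]
enum LabelError {
    Duplicate(Label),
    Undefined(Label),
}

/// Records where each label is defined (by instruction index) and rejects
/// duplicate definitions and references to labels that are never defined.
fn check_labels(insts: &[Inst]) -> Result<HashMap<Label, usize>, LabelError> {
    let mut defs: HashMap<Label, usize> = HashMap::new();
    let mut refs: Vec<Label> = Vec::new();
    for (idx, inst) in insts.iter().enumerate() {
        match inst {
            Inst::Def(label) => {
                if defs.insert(*label, idx).is_some() {
                    return Err(LabelError::Duplicate(*label));
                }
            }
            Inst::PushLabel(label) => refs.push(*label),
            Inst::Op(_) | Inst::Push(_) => {}
        }
    }
    // Report the first reference (in program order) that has no definition.
    for label in refs {
        if !defs.contains_key(&label) {
            return Err(LabelError::Undefined(label));
        }
    }
    Ok(defs)
}
```

Keeping the jump opcode as a plain `Op` and putting the label on the preceding push mirrors how the EVM takes the jump target from the stack, so `JUMP`/`JUMPI` themselves never carry an immediate.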

Two-pass assembly

  • Pass 1: Linearize instructions, assign tentative offsets, track label positions
  • Compute sizes of variable-width instructions (PUSHn)
  • Pass 2: Re-run the layout if any PUSH width changed because its target offset no longer fits
  • Iterate until a fixed point is reached (typically 2-3 iterations)
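
The iteration above can be sketched as follows. This is an illustrative layout loop, not the planned implementation: `Item`, `min_width`, and `layout` are hypothetical names, label IDs are assumed to be dense indices, and every label-carrying PUSH starts at one byte and is only ever widened, which guarantees that the loop terminates.

```rust
#[derive(Clone, Copy, Debug)]
enum Item {
    /// Any instruction whose encoded size does not depend on label offsets.
    Fixed(usize),
    /// `PUSHn <label>`; n is chosen by the layout loop.
    PushLabel(usize),
    /// Label definition, emitted as a 1-byte JUMPDEST.
    Label(usize),
}

/// Minimal number of immediate bytes for `value` (at least one).
fn min_width(mut value: usize) -> usize {
    let mut n = 1;
    while value > 0xFF {
        value >>= 8;
        n += 1;
    }
    n
}

/// Returns (label offsets, per-item push widths, number of passes).
/// `widths[i]` is only meaningful where `items[i]` is a `PushLabel`.
fn layout(items: &[Item], num_labels: usize) -> (Vec<usize>, Vec<usize>, usize) {
    let mut widths = vec![1usize; items.len()];
    let mut offsets = vec![0usize; num_labels];
    let mut passes = 0;
    loop {
        passes += 1;
        // Assign offsets using the current width guesses.
        let mut pc = 0;
        for (i, item) in items.iter().enumerate() {
            match *item {
                Item::Fixed(size) => pc += size,
                Item::PushLabel(_) => pc += 1 + widths[i],
                Item::Label(id) => {
                    offsets[id] = pc;
                    pc += 1;
                }
            }
        }
        // Widen any push whose target no longer fits; never shrink.
        let mut changed = false;
        for (i, item) in items.iter().enumerate() {
            if let Item::PushLabel(id) = *item {
                let need = min_width(offsets[id]);
                if need > widths[i] {
                    widths[i] = need;
                    changed = true;
                }
            }
        }
        if !changed {
            return (offsets, widths, passes);
        }
    }
}
```

For a forward jump over 300 bytes of fixed-size code, the first pass places the label at 302 and widens the push to two bytes; the second pass moves the label to 303, finds that nothing else changed, and stops.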

Jump resolution

  • Replace each label reference with concrete offset bytes
  • Emit correct PUSHn followed by JUMP/JUMPI
  • Handle forward and backward jumps
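
Once offsets and widths are known, emitting a resolved jump is mechanical. The sketch below assumes the width has already been chosen by an earlier layout step; `emit_push` and `emit_jump` are hypothetical helper names, and only offsets up to 8 bytes are handled.

```rust
const PUSH0: u8 = 0x5F; // PUSHn is encoded as PUSH0 + n for n in 1..=32
const JUMP: u8 = 0x56;
const JUMPI: u8 = 0x57;

/// Appends `PUSHn <value>` with exactly `width` big-endian immediate bytes.
fn emit_push(out: &mut Vec<u8>, value: u64, width: usize) {
    assert!((1..=8).contains(&width), "sketch handles offsets up to 8 bytes");
    assert!(width == 8 || value >> (8 * width) == 0, "value does not fit in width");
    out.push(PUSH0 + width as u8);
    // Keep only the low `width` bytes of the big-endian encoding.
    out.extend_from_slice(&value.to_be_bytes()[8 - width..]);
}

/// Appends `PUSHn <target>` followed by JUMP or JUMPI.
fn emit_jump(out: &mut Vec<u8>, target: u64, width: usize, conditional: bool) {
    emit_push(out, target, width);
    out.push(if conditional { JUMPI } else { JUMP });
}
```

Forward and backward jumps go through the same path here, since by this point every label already has a concrete offset.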

JUMPDEST handling

  • Ensure every jump target begins with a JUMPDEST (the label resolves to the offset of the JUMPDEST itself)
  • Maintain mapping label → JUMPDEST offset
  • Insert JUMPDEST opcodes during linearization
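
A minimal sketch of inserting JUMPDESTs during linearization is shown below. It is deliberately simplified: `Block` and `linearize` are hypothetical names, and block bodies are assumed to be already-encoded bytes with no label references, so only the label → JUMPDEST offset bookkeeping is shown.

```rust
use std::collections::HashMap;

const JUMPDEST: u8 = 0x5B;

/// A basic block with a symbolic label and an already-encoded body.
/// Assumption: the body contains no label references in this sketch.
struct Block {
    label: u32,
    body: Vec<u8>,
}

/// Concatenates blocks, prefixing each with JUMPDEST and recording the
/// offset of that JUMPDEST as the resolved position of the block's label.
fn linearize(blocks: &[Block]) -> (Vec<u8>, HashMap<u32, usize>) {
    let mut code = Vec::new();
    let mut label_offsets = HashMap::new();
    for block in blocks {
        label_offsets.insert(block.label, code.len());
        code.push(JUMPDEST);
        code.extend_from_slice(&block.body);
    }
    (code, label_offsets)
}
```

This version emits a JUMPDEST for every block, including ones that are never jumped to. A real assembler could skip those to save a byte per block, at the cost of tracking which labels are actually referenced.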

Example

; Symbolic assembly
  PUSH label_else
  JUMPI
  PUSH 42
  PUSH label_end
  JUMP
label_else:
  JUMPDEST
  PUSH 0
label_end:
  JUMPDEST
  RETURN

; After resolution (offsets counted from 0x00)
  PUSH1 0x08    ; offset of label_else
  JUMPI
  PUSH1 0x2A    ; 42
  PUSH1 0x0B    ; offset of label_end
  JUMP
  JUMPDEST      ; 0x08: label_else
  PUSH1 0x00
  JUMPDEST      ; 0x0B: label_end
  RETURN

Patterns to follow

From Venom:

  • Structured assembler with labels and separate patching step

From Sonatina:

  • Clean separation between IR and assembly, symbol resolution passes

Acceptance Criteria

  • Assembly tests: branches, loops, function dispatch have correct JUMP destinations
  • Round-trip tests: assemble → disassemble → compare structure
  • No invalid jumps or missing JUMPDESTs under EVM verification
  • Variable-width PUSH correctly sized (PUSH1 vs PUSH2 vs PUSH3)
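
As a rough idea of what the "no invalid jumps" check could look like in a test, the sketch below scans assembled bytecode and flags every `PUSHn <target>; JUMP/JUMPI` pair whose target is not a valid JUMPDEST. The function names are hypothetical, and only jumps whose target is pushed immediately before the jump are checked; dynamic jumps are out of scope.

```rust
/// Length of the immediate for PUSH1..=PUSH32 (0x60..=0x7F), else 0.
fn push_len(op: u8) -> usize {
    if (0x60..=0x7F).contains(&op) {
        (op - 0x5F) as usize
    } else {
        0
    }
}

/// Returns the offsets of PUSH instructions that feed a JUMP/JUMPI
/// directly and whose immediate is not the offset of a valid JUMPDEST.
fn invalid_static_jumps(code: &[u8]) -> Vec<usize> {
    // First scan: mark JUMPDEST opcodes, skipping over PUSH immediates so
    // that a 0x5B byte inside push data is not treated as a destination.
    let mut is_dest = vec![false; code.len()];
    let mut pc = 0;
    while pc < code.len() {
        let op = code[pc];
        if op == 0x5B {
            is_dest[pc] = true;
        }
        pc += 1 + push_len(op);
    }
    // Second scan: check each PUSH that is immediately followed by a jump.
    let mut bad = Vec::new();
    let mut pc = 0;
    while pc < code.len() {
        let n = push_len(code[pc]);
        let next = pc + 1 + n;
        if n > 0 && next < code.len() && (code[next] == 0x56 || code[next] == 0x57) {
            // Decode the big-endian immediate; oversized values saturate
            // and are then rejected as out of range.
            let target = code[pc + 1..next]
                .iter()
                .fold(0usize, |acc, &b| acc.saturating_mul(256).saturating_add(b as usize));
            if is_dest.get(target) != Some(&true) {
                bad.push(pc);
            }
        }
        pc = next;
    }
    bad
}
```

An assembly test could assert that `invalid_static_jumps` returns an empty list for each assembled branch, loop, and dispatch fixture.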

Estimated Complexity

Small-Medium - Well-understood algorithm but requires careful offset tracking

Dependencies

Metadata

Labels

  • A-codegen (Area: code generation and MIR)
  • C-enhancement (Category: an issue proposing an enhancement or a PR with one)
  • E-medium (Call for participation: Medium difficulty. Experience needed to fix: Intermediate.)
