brittonr/chaoscontrol

ChaosControl — Deterministic VMM

A deterministic Virtual Machine Monitor (VMM) for x86_64 built with KVM and the rust-vmm crate ecosystem. Designed for simulation testing of distributed systems where reproducibility is essential.

This is an experiment built with Claude + Pi.dev. Use at your own risk.

Features

Deterministic Execution

  • CPUID filtering: Comprehensive filtering removes RDRAND, RDSEED, RDTSCP, optionally AVX2/AVX-512, and hides hypervisor presence
  • Pinned TSC: Fixed time-stamp counter frequency (default 3.0 GHz) for reproducible timing across hosts
  • Virtual TSC: Software TSC counter that advances only on VM exits, enabling fully deterministic time progression
  • Fixed processor identity: Optional model/family/stepping override for cross-host reproducibility
  • SMP support: Multi-vCPU VMs with serialized execution (Antithesis-style), deterministic round-robin or randomized scheduling
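Serialized SMP execution boils down to a scheduler that picks exactly one vCPU to run at a time, in either a fixed or a seeded-random order. A minimal sketch of that idea, assuming hypothetical names (this is not the actual chaoscontrol-vmm scheduler, and a plain xorshift64 stands in for whatever PRNG the real one uses):

```rust
/// Deterministic vCPU scheduler sketch: only one vCPU runs at a time,
/// picked either round-robin or by a seeded PRNG so even the
/// "randomized" order replays identically for a given seed.
pub struct SerialScheduler {
    num_vcpus: usize,
    next: usize,
    rng_state: u64, // seeded xorshift64 state
}

impl SerialScheduler {
    pub fn new(num_vcpus: usize, seed: u64) -> Self {
        Self { num_vcpus, next: 0, rng_state: seed.max(1) }
    }

    /// Strict round-robin: 0, 1, 2, 0, 1, 2, ...
    pub fn pick_round_robin(&mut self) -> usize {
        let id = self.next;
        self.next = (self.next + 1) % self.num_vcpus;
        id
    }

    /// Random-looking but fully reproducible pick for a given seed.
    pub fn pick_random(&mut self) -> usize {
        self.rng_state ^= self.rng_state << 13;
        self.rng_state ^= self.rng_state >> 7;
        self.rng_state ^= self.rng_state << 17;
        (self.rng_state % self.num_vcpus as u64) as usize
    }
}
```

Because the pick sequence depends only on the seed, two runs with the same seed interleave vCPUs identically, which is what makes multi-vCPU execution replayable.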

VM Infrastructure

  • x86_64 boot: Full long mode setup with GDT, identity-mapped page tables (1 GB via 2 MB pages), and Linux boot protocol support
  • In-kernel IRQ chip: PIC, IOAPIC, and LAPIC via KVM
  • Serial console: COM1 with interrupt-driven I/O and output capture
  • Linux kernel support: Loads ELF kernels via linux-loader
  • ACPI tables: RSDP/RSDT/MADT for SMP CPU topology

Snapshot / Restore

  • Complete state capture: CPU registers, FPU, debug registers, LAPIC, XCRs, IRQ chip (PIC master/slave, IOAPIC), PIT, KVM clock, and full guest memory
  • Instant restore: Resume execution from any captured checkpoint
  • Fork support: Create divergent execution paths from a single snapshot point
  • Copy-on-write block device: Snapshots share the base disk image via Arc; only dirty 4 KB pages are cloned — a 512 MB disk with 1 MB of writes costs ~1 MB per snapshot, not 512 MB
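The copy-on-write sharing described above can be sketched with std types alone: snapshots share the base image through an Arc and keep a private map of cloned pages. The names here are hypothetical, not the crate's actual API:

```rust
use std::collections::HashMap;
use std::sync::Arc;

const PAGE: usize = 4096;

/// Copy-on-write disk view (sketch): `base` is shared read-only via Arc;
/// only pages that are actually written get a private 4 KB copy.
#[derive(Clone)]
pub struct CowDisk {
    base: Arc<Vec<u8>>,             // shared base image, never mutated
    dirty: HashMap<usize, Vec<u8>>, // page index -> private page copy
}

impl CowDisk {
    pub fn new(base: Vec<u8>) -> Self {
        Self { base: Arc::new(base), dirty: HashMap::new() }
    }

    pub fn read(&self, offset: usize) -> u8 {
        let (page, off) = (offset / PAGE, offset % PAGE);
        match self.dirty.get(&page) {
            Some(p) => p[off],
            None => self.base[offset],
        }
    }

    pub fn write(&mut self, offset: usize, val: u8) {
        let (page, off) = (offset / PAGE, offset % PAGE);
        let p = self.dirty.entry(page).or_insert_with(|| {
            // First write to a page clones just that 4 KB page.
            self.base[page * PAGE..(page + 1) * PAGE].to_vec()
        });
        p[off] = val;
    }

    /// A snapshot is just a clone: the Arc refcount bumps, dirty pages copy.
    pub fn snapshot(&self) -> CowDisk {
        self.clone()
    }
}
```

This is why a 512 MB disk with 1 MB of writes costs roughly 1 MB per snapshot: the snapshot's cost is proportional to the dirty-page map, not the base image.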

Deterministic Devices

  • Entropy: Seeded ChaCha20 PRNG replacing hardware RNG, with snapshot/restore and reseed for exploration
  • Block: Copy-on-write block device with optional disk image file backing (--disk-image). Supports fault injection (read errors, write errors, torn writes, corruption)
  • Network: Simulated network with RX/TX queues, latency, jitter, bandwidth limiting, packet loss/corruption/reorder/duplication for fully controlled packet delivery between VMs
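The deterministic-entropy idea above is simple to picture: a seeded PRNG whose entire state can be captured, restored, or reseeded. The real device uses seeded ChaCha20 via rand_chacha; this dependency-free sketch substitutes a xorshift64 so the snapshot/restore/reseed mechanics are visible (names hypothetical):

```rust
/// Deterministic entropy source sketch. A xorshift64 stands in for the
/// real device's seeded ChaCha20; the point is that the whole PRNG
/// state serializes into a snapshot and restores exactly.
pub struct SeededRng {
    state: u64,
}

impl SeededRng {
    pub fn new(seed: u64) -> Self {
        Self { state: seed.max(1) } // xorshift state must be nonzero
    }

    pub fn next_u64(&mut self) -> u64 {
        self.state ^= self.state << 13;
        self.state ^= self.state >> 7;
        self.state ^= self.state << 17;
        self.state
    }

    /// Capture the PRNG state as part of a VM snapshot.
    pub fn snapshot(&self) -> u64 {
        self.state
    }

    /// Restore puts the random stream back exactly where it was.
    pub fn restore(&mut self, snap: u64) {
        self.state = snap;
    }

    /// Reseed to make a forked execution path diverge.
    pub fn reseed(&mut self, seed: u64) {
        self.state = seed.max(1);
    }
}
```

Restore-then-replay yields the identical byte stream, while reseeding a fork gives exploration a fresh but still reproducible stream.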

Exploration & Bug Finding

  • Coverage-guided exploration: AFL-style edge coverage bitmaps, fork-from-snapshot branching, frontier-based search
  • Three exploration modes: fault-schedule mutation, input-tree branching at random_choice() points, or hybrid
  • Fault schedule minimization: Delta debugging (ddmin) to find the smallest schedule that triggers a bug
  • Bug reproduction: Replay a bug report to verify it still triggers
  • Assertion catalog: Compile-time registration of all assertion sites via linkme; reports show which assertions are exercised/unexercised
  • Per-round history: Coverage growth curves, plateau detection, bug discovery timeline

Determinism Logging

  • Binary dlog format: Per-exit event log for diagnosing non-determinism
  • Structural diff: Compare two runs ignoring data payloads
  • Register dumps: Periodic full-register snapshots in dlog
  • Memory hashing: CRC32 page hashes at snapshot boundaries
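The per-page memory hashing can be sketched as a plain bitwise CRC32 over each 4 KB page (standard IEEE polynomial, reflected form); comparing the page-hash lists of two runs at the same snapshot boundary points at the first diverging page. A table-driven version would be faster, but the bitwise form keeps the sketch short:

```rust
/// CRC32 (IEEE, polynomial 0xEDB88320), computed bitwise for brevity.
/// Hashing each guest page at snapshot boundaries lets two runs be
/// compared page-by-page to localize non-determinism.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // mask is all-ones when the low bit is set, else zero
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}
```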

Project Structure

chaoscontrol/
├── flake.nix                              # Nix development environment
├── Cargo.toml                             # Workspace root
└── crates/
    ├── chaoscontrol-protocol/             # SDK ↔ VMM wire protocol (no_std)
    ├── chaoscontrol-sdk/                  # Guest-side SDK (Antithesis-style)
    ├── chaoscontrol-fault/                # Host-side fault injection engine
    ├── chaoscontrol-vmm/                  # VMM implementation
    ├── chaoscontrol-explore/              # Coverage-guided exploration engine
    ├── chaoscontrol-replay/               # Recording, replay, time-travel debugger
    ├── chaoscontrol-trace/                # eBPF-based KVM tracing
    ├── chaoscontrol-guest/                # Minimal SDK-instrumented guest binary
    ├── chaoscontrol-raft-guest/           # 3-node Raft consensus guest (35 assertions)
    ├── chaoscontrol-guest-net/            # Network guest library (smoltcp)
    └── chaoscontrol-net-guest/            # Network demo guest binary

Kernel Coverage (KCOV)

When the guest kernel is built with CONFIG_KCOV=y, the SDK automatically collects kernel code coverage and merges it into the same AFL-style bitmap used by userspace SanCov. This gives the explorer visibility into kernel code paths exercised by different fault schedules — filesystem error handling, network stack branches, scheduler decisions, etc.

# Build KCOV-enabled kernel (first time takes ~20 min)
nix build .#kcov-vmlinux -o result-kcov

# Run exploration with kernel coverage
cargo run --release --bin chaoscontrol-explore -- run \
  --kernel result-kcov/vmlinux --initrd guest/initrd-raft.gz \
  --vms 3 --rounds 200 --branches 16

# Guest SDK auto-detects KCOV — no code changes needed

On a standard kernel (without CONFIG_KCOV), the SDK gracefully falls back to userspace-only coverage — no crash, no error.

Building

# Enter development environment
nix develop

# Build VMM + tools
cargo build

# Run tests (827 unit + doc tests)
cargo test

# Build guest binaries (statically linked, musl)
nix build .#guest-sdk    # → result/bin/chaoscontrol-guest
nix build .#guest-raft   # → result/bin/chaoscontrol-raft-guest
nix build .#guest-net    # → result/bin/chaoscontrol-net-guest

# Build initrd images (from guest binaries)
nix build .#initrd-sdk   # → result (gzipped cpio)
nix build .#initrd-raft
nix build .#initrd-net

# Build custom kernels
nix build .#net-vmlinux       # virtio-net enabled
nix build .#kcov-vmlinux      # KCOV coverage
nix build .#kcov-net-vmlinux  # both

# Boot a kernel
cargo run --bin boot -- <kernel-path> [initrd-path]

# Snapshot demo
cargo run --release --bin snapshot_demo -- <kernel-path> <initrd-path>

CLI Tools

Quick Start (Nix)

# Run Raft exploration with one command (builds kernel + guest + initrd)
nix run .#explore-raft

# Run with custom args (appended after defaults)
nix run .#explore-raft -- --output results/ --rounds 200

Exploration

# Coverage-guided exploration
cargo run --release --bin chaoscontrol-explore -- run \
  --kernel <kernel-path> --initrd <initrd-path> \
  --vms 3 --rounds 200 --branches 16 --output results/

# With persistent disk image
cargo run --release --bin chaoscontrol-explore -- run \
  --kernel <kernel-path> --initrd <initrd-path> \
  --disk-image <path-to-ext4.img> \
  --vms 3 --rounds 200 --branches 16 --output results/

# Input-tree mode (branch at random_choice() points)
cargo run --release --bin chaoscontrol-explore -- run \
  --kernel <kernel-path> --initrd <initrd-path> \
  --mode input-tree --output results/

# Resume from checkpoint
cargo run --release --bin chaoscontrol-explore -- resume \
  --corpus results/ --rounds 500

Output directory contains:

  • checkpoint.json — resumable exploration state
  • report.txt — human-readable report with per-round history
  • assertions.json — per-assertion verdicts and hit counts
  • bug_N.json — bug reports (consumable by minimize/reproduce)

Bug Workflow

# 1. Explore — find bugs
cargo run --release --bin chaoscontrol-explore -- run \
  --kernel vmlinux --initrd initrd.gz \
  --vms 3 --rounds 100 --output results/

# 2. Minimize — shrink the fault schedule
cargo run --release --bin chaoscontrol-explore -- minimize \
  --kernel vmlinux --initrd initrd.gz \
  --bug results/bug_0.json --output minimized.json

# 3. Reproduce — verify the bug
cargo run --release --bin chaoscontrol-explore -- reproduce \
  --kernel vmlinux --initrd initrd.gz \
  --bug minimized.json --serial

Replay & Debugging

# Replay a recorded session
cargo run --release --bin chaoscontrol-replay -- replay \
  --recording session.json --ticks 5000

# Triage — generate bug report from recording
cargo run --release --bin chaoscontrol-replay -- triage \
  --recording session.json --bug-id 1 --format markdown

# Show recording metadata
cargo run --release --bin chaoscontrol-replay -- info \
  --recording session.json

# Determinism log tools
cargo run --release --bin chaoscontrol-replay -- dlog diff a.dlog b.dlog
cargo run --release --bin chaoscontrol-replay -- dlog dump run.dlog
cargo run --release --bin chaoscontrol-replay -- dlog stats run.dlog

Live Dashboard

# Run exploration with live web dashboard
cargo run --release --bin chaoscontrol-explore --features dashboard -- run \
  --kernel vmlinux --initrd initrd.gz \
  --vms 3 --rounds 100 --dashboard

# Custom dashboard port
cargo run --release --bin chaoscontrol-explore --features dashboard -- run \
  --kernel vmlinux --initrd initrd.gz \
  --dashboard --dashboard-port 9090

# Review past results (standalone mode)
cargo run --release --bin chaoscontrol-dashboard -- serve --corpus results/

The dashboard shows:

  • Coverage growth chart with bug discovery markers
  • Per-assertion status table (failed/passed/unexercised)
  • Round-by-round progress table
  • Network fabric statistics
  • Live updates via Server-Sent Events

Open http://localhost:8080 in a browser while exploration runs.

eBPF Tracing

# Live KVM trace (requires sudo)
sudo chaoscontrol-trace live --pid <VMM_PID> --output trace.json

# Verify determinism between two traces
chaoscontrol-trace verify --trace-a run1.json --trace-b run2.json

Architecture

VM Setup (vm.rs)

DeterministicVm is the main entry point, configured via VmConfig:

use chaoscontrol_vmm::vm::{DeterministicVm, VmConfig};
use chaoscontrol_vmm::cpu::CpuConfig;

let config = VmConfig {
    memory_size: 256 * 1024 * 1024,
    cpu: CpuConfig {
        tsc_khz: 3_000_000,
        seed: 42,
        ..CpuConfig::default()
    },
    ..VmConfig::default()
};

let mut vm = DeterministicVm::new(config)?;
vm.load_kernel("vmlinux", Some("initrd.gz"))?;
vm.run()?;

CPU Determinism (cpu.rs)

Comprehensive CPUID filtering:

| CPUID Leaf  | What's Filtered                      | Why                            |
|-------------|--------------------------------------|--------------------------------|
| 0x1         | RDRAND, TSC-Deadline, hypervisor bit | Hardware RNG, timer jitter     |
| 0x7         | RDSEED, AVX2, AVX-512                | Hardware RNG, ISA variation    |
| 0x15        | TSC frequency info                   | Fixed crystal clock ratio      |
| 0x16        | Processor frequency                  | Consistent MHz reporting       |
| 0x40000000+ | KVM paravirt leaves                  | Hide hypervisor presence       |
| 0x80000001  | RDTSCP                               | Bypasses MSR-trap path         |
| 0x80000007  | Invariant TSC                        | Guest shouldn't assume host TSC |
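Mechanically, the filtering is just masking feature bits out of the registers KVM would otherwise return for each leaf. A sketch of a few of the masks above, using the bit positions from the Intel SDM (this is not the actual cpu.rs code, and it covers only a subset of the table):

```rust
/// Clear a subset of the feature bits listed above (bit positions per
/// the Intel SDM). Sketch only; the real filter covers every leaf in
/// the table and runs over KVM's supported-CPUID array.
pub fn filter_cpuid(leaf: u32, ebx: &mut u32, ecx: &mut u32, edx: &mut u32) {
    match leaf {
        0x1 => {
            *ecx &= !(1 << 30); // RDRAND
            *ecx &= !(1 << 24); // TSC-Deadline timer
            *ecx &= !(1 << 31); // hypervisor-present bit
        }
        0x7 => {
            *ebx &= !(1 << 18); // RDSEED
            *ebx &= !(1 << 5);  // AVX2 (optional filtering)
        }
        0x8000_0001 => {
            *edx &= !(1 << 27); // RDTSCP
        }
        _ => {}
    }
}
```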

Virtual TSC for fully deterministic time:

use chaoscontrol_vmm::cpu::VirtualTsc;

let mut vtsc = VirtualTsc::new(3_000_000, 1_000);
vtsc.tick();                    // Advance by 1000 counts
let ns = vtsc.elapsed_ns();    // Convert to nanoseconds
let snap = vtsc.snapshot();    // Serialize for checkpoints

Guest SDK (Antithesis-style)

The chaoscontrol-sdk crate provides a guest-side testing API inspired by Antithesis. Guest code uses these to annotate properties and receive guided random values:

use chaoscontrol_sdk::prelude::*;

chaoscontrol_init();

// Signal setup complete — faults may begin
lifecycle::setup_complete(&[("nodes", "3")]);

// Safety property: must always hold
cc_assert_always!(leader < num_nodes, "valid leader");

// Liveness property: must hold at least once across all runs
cc_assert_sometimes!(write_ok, "write succeeded");

// Reachability
cc_assert_reachable!("leader elected");
cc_assert_unreachable!("split brain");

// Guided random choice for exploration
let action = random::random_choice(3);

All assertion sites are registered at compile time via linkme and reported to the VMM at startup. The exploration report shows which assertions were exercised, passed, failed, or never reached.
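Conceptually, the catalog is a registry of sites that exists before any assertion fires, so "never reached" is a reportable verdict. The real SDK builds this registry at link time with linkme's distributed slices; the stand-in below uses a runtime Vec so the sketch is dependency-free (all names hypothetical):

```rust
/// Assertion-catalog sketch. The real SDK registers sites at compile
/// time via linkme; a runtime Vec stands in here to show the verdict
/// logic: pre-registered sites start Unexercised, and for an "always"
/// assertion one false observation is a permanent failure.
#[derive(Debug, Clone, PartialEq)]
pub enum Verdict { Unexercised, Passed, Failed }

struct AssertionSite {
    message: &'static str,
    verdict: Verdict,
    hits: u64,
}

pub struct Catalog {
    sites: Vec<AssertionSite>,
}

impl Catalog {
    pub fn new() -> Self {
        Self { sites: Vec::new() }
    }

    /// Register a site up front so unexercised sites still appear in reports.
    pub fn register(&mut self, message: &'static str) -> usize {
        self.sites.push(AssertionSite { message, verdict: Verdict::Unexercised, hits: 0 });
        self.sites.len() - 1
    }

    /// "Always" semantics: any false observation fails the site for good.
    pub fn assert_always(&mut self, id: usize, cond: bool) {
        let site = &mut self.sites[id];
        site.hits += 1;
        if !cond {
            site.verdict = Verdict::Failed;
        } else if site.verdict == Verdict::Unexercised {
            site.verdict = Verdict::Passed;
        }
    }

    pub fn verdict(&self, id: usize) -> Verdict {
        self.sites[id].verdict.clone()
    }

    pub fn unexercised(&self) -> Vec<&'static str> {
        self.sites.iter()
            .filter(|s| s.verdict == Verdict::Unexercised)
            .map(|s| s.message)
            .collect()
    }
}
```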

Fault Injection Engine

use chaoscontrol_fault::schedule::FaultScheduleBuilder;
use chaoscontrol_fault::faults::Fault;

let schedule = FaultScheduleBuilder::new()
    .at_ns(1_000_000_000, Fault::NetworkPartition {
        side_a: vec![0],
        side_b: vec![1, 2],
    })
    .at_ns(5_000_000_000, Fault::NetworkHeal)
    .at_ns(8_000_000_000, Fault::ProcessKill { target: 1 })
    .at_ns(10_000_000_000, Fault::InjectInterrupt { target: 0, irq: 5 })
    .build();

27 fault types across 6 categories: network (partition, latency, jitter, bandwidth, loss, corruption, reorder, duplication, heal), disk (I/O errors, torn writes, corruption, full), process (kill, pause, restart), clock (skew, jump), resource (memory pressure), interrupt (IRQ injection, NMI).


Run Loop

The VM run loop handles exits and advances the virtual TSC deterministically:

  • IoIn/IoOut: Serial port I/O, device access, SDK hypercalls
  • Hlt: VM halted — fast-forward TSC + inject timer IRQ
  • MmioRead/MmioWrite: Virtio MMIO, HPET, ACPI PM timer
  • Hypercall: VMCALL-based SDK transport (preferred over port I/O)
  • Every exit increments the virtual TSC by a fixed amount

Execution modes:

  • run() — run until halt/shutdown
  • run_until(pattern) — run until serial output matches
  • run_bounded(max_exits) — run for N exits (deterministic scheduling)
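The exit-handling rules above can be sketched with a mock exit type standing in for kvm-ioctls' VcpuExit (the handler names and constants here are illustrative, not the real run loop):

```rust
/// Mock VM exit standing in for kvm-ioctls' VcpuExit variants.
pub enum Exit {
    IoOut(u16, u8), // (port, value)
    Hlt,
    Shutdown,
}

/// Run-loop sketch: time is the virtual TSC, which advances only as a
/// function of the exit sequence, never of wall-clock time.
pub struct RunLoop {
    pub vtsc: u64,
    tsc_per_exit: u64,
    pub serial_out: Vec<u8>,
}

impl RunLoop {
    pub fn new(tsc_per_exit: u64) -> Self {
        Self { vtsc: 0, tsc_per_exit, serial_out: Vec::new() }
    }

    /// Handle one exit; returns false when the VM should stop.
    pub fn handle(&mut self, exit: Exit) -> bool {
        // Every exit advances the virtual TSC by a fixed amount, so
        // guest-visible time depends only on the exit sequence.
        self.vtsc += self.tsc_per_exit;
        match exit {
            Exit::IoOut(0x3F8, byte) => {
                self.serial_out.push(byte); // COM1 data port
                true
            }
            Exit::IoOut(_, _) => true, // other device ports elided
            Exit::Hlt => {
                self.vtsc += 1_000_000; // fast-forward past the idle period
                true
            }
            Exit::Shutdown => false,
        }
    }
}
```

The same idea carries over to run_bounded: counting exits rather than wall-clock time is what makes "run for N exits" a deterministic scheduling quantum.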

Dependencies

kvm-ioctls = "0.19"       # KVM API
kvm-bindings = "0.10"     # KVM structures
vm-memory = "0.17"        # Guest memory management
linux-loader = "0.13"     # Kernel loading (ELF)
vm-superio = "0.8"        # Serial port emulation
vmm-sys-util = "0.12"     # EventFd, utilities
rand_chacha = "0.3"       # Seeded PRNG
linkme = "0.3"            # Compile-time assertion catalog
snafu = "0.8"             # Error handling

Roadmap

  • Boot Linux kernel in single-vCPU KVM VM
  • CPUID filtering (RDRAND, RDSEED, RDTSCP, AVX, hypervisor)
  • TSC pinning + virtual TSC tracking
  • Complete snapshot/restore (CPU + memory + devices)
  • Deterministic entropy (seeded ChaCha20)
  • Deterministic block device with fault injection
  • Deterministic network (simulated queues)
  • Guest SDK (Antithesis-style assertions + guided randomness)
  • Fault injection engine (network, disk, process, clock faults)
  • Property oracle (cross-run assertion tracking + verdicts)
  • VMM ↔ SDK hypercall integration (VMCALL + port I/O fallback)
  • Virtio transport layer (MMIO-based, blk + net + rng)
  • Multi-VM simulation controller with network fabric
  • Deterministic scheduling across VMs
  • SMP — multi-vCPU with serialized execution
  • Coverage-guided exploration (AFL-style edge bitmaps)
  • Input tree exploration — branch at random_choice() decision points
  • Network simulation fidelity (jitter, bandwidth, duplication)
  • Kernel coverage (KCOV) — kernel code path visibility
  • Assertion catalog — compile-time registration via linkme
  • Fault schedule minimization — delta debugging
  • Bug reproduction from JSON reports
  • Determinism logging (dlog) — binary event log + diff + stats
  • Time-travel debugger with counterfactual analysis
  • Per-round exploration history and plateau detection
  • Per-assertion detail reports with JSON export
  • Multi-VM networking (virtio-net + smoltcp TCP/IP)
  • Interrupt injection faults (IRQ + NMI)
  • Core pinning for reduced scheduling jitter
  • Nix-native build pipeline (guest packages, initrd builder, kernel composer)
  • Declarative simulation tests via mkChaosTest

Using ChaosControl from Your Flake

Add ChaosControl as a flake input and define simulation tests for your own guest binaries:

{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    chaoscontrol.url = "github:user/chaoscontrol";
  };

  outputs = { self, nixpkgs, chaoscontrol, ... }:
    let
      system = "x86_64-linux";
      cc = chaoscontrol.lib.${system};
      pkgs = nixpkgs.legacyPackages.${system};
    in {
      # Define a simulation test as a flake check
      checks.${system}.my-consensus-test = cc.mkChaosTest {
        name = "my-consensus";
        kernel = cc.mkChaosKernel { virtioNet = true; };
        initrd = cc.mkChaosInitrd {
          init = self.packages.${system}.my-guest;
        };
        vms = 3;
        rounds = 100;
        branches = 8;
        seed = 42;
      };

      # Use pre-built kernels to skip kernel compilation
      checks.${system}.quick-test = cc.mkChaosTest {
        name = "quick";
        kernel = chaoscontrol.packages.${system}.net-vmlinux;
        initrd = cc.mkChaosInitrd {
          init = self.packages.${system}.my-guest;
        };
        rounds = 10;
      };
    };
}

Run with nix flake check (requires system-features = kvm in nix.conf for the builder).

About

Experimental deterministic simulation testing hypervisor with harness and Rust SDK.
