mini-firecracker — architecture

A running map between this project's packages and the parts of upstream Firecracker (and the KVM kernel driver underneath it) they correspond to. The goal is that a reader flipping between mini-firecracker/ and ../firecracker/ can orient quickly — "the thing I just read in mini-fc lives over there, roughly, in the real thing."

The mapping is structural, not exact: mini-firecracker is deliberately small (no jailer, no PCI, no seccomp, no REST API), so many upstream modules have no counterpart here yet, and a few will never.

One-line shape

mini-firecracker is a single Go binary that opens /dev/kvm, creates one VM, maps some guest RAM, creates vCPU threads, and runs them while handling the small set of KVM_RUN exits that matter for a minimal microVM. That's it. Everything else — virtio, the boot protocol, the snapshot logic — is scaffolding on that core loop.
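The ioctls named in this document are plain integers built with the kernel's _IO/_IOW macros. The sketch below recomputes a few of them in Go so the request numbers are not magic. The Go identifiers (ioc, userspaceMemoryRegion, and the variable names) are illustrative and not taken from pkg/kvm/; the numeric values and the 32-byte struct layout come from linux/kvm.h.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Field layout of an ioctl request number, from asm-generic/ioctl.h.
const (
	iocNone  = 0 // _IOC_NONE: no argument struct
	iocWrite = 1 // _IOC_WRITE: userspace passes a struct to the kernel

	typeShift = 8
	sizeShift = 16
	dirShift  = 30

	kvmIO = 0xAE // KVMIO, the "type" byte shared by every KVM ioctl
)

// userspaceMemoryRegion mirrors struct kvm_userspace_memory_region.
type userspaceMemoryRegion struct {
	Slot          uint32
	Flags         uint32
	GuestPhysAddr uint64
	MemorySize    uint64
	UserspaceAddr uint64 // host virtual address of the mmap'd backing
}

// ioc assembles a request number the way the kernel's _IOC() macro does.
func ioc(dir, nr, size uintptr) uintptr {
	return dir<<dirShift | size<<sizeShift | kvmIO<<typeShift | nr
}

var (
	kvmCreateVM            = ioc(iocNone, 0x01, 0)
	kvmCreateVCPU          = ioc(iocNone, 0x41, 0)
	kvmSetUserMemoryRegion = ioc(iocWrite, 0x46, unsafe.Sizeof(userspaceMemoryRegion{}))
	kvmSetTSSAddr          = ioc(iocNone, 0x47, 0)
	kvmCreateIRQChip       = ioc(iocNone, 0x60, 0)
	kvmRun                 = ioc(iocNone, 0x80, 0)
)

func main() {
	fmt.Printf("KVM_CREATE_VM              = %#x\n", kvmCreateVM)
	fmt.Printf("KVM_CREATE_VCPU            = %#x\n", kvmCreateVCPU)
	fmt.Printf("KVM_SET_USER_MEMORY_REGION = %#x\n", kvmSetUserMemoryRegion)
	fmt.Printf("KVM_SET_TSS_ADDR           = %#x\n", kvmSetTSSAddr)
	fmt.Printf("KVM_CREATE_IRQCHIP         = %#x\n", kvmCreateIRQChip)
	fmt.Printf("KVM_RUN                    = %#x\n", kvmRun)
}
```

The argument-less requests come out as 0xae01, 0xae41, and so on, while KVM_SET_USER_MEMORY_REGION is 0x4020ae46 because the struct size (0x20) and the write direction are folded into the number.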

Package map

| mini-firecracker package | Real Firecracker location | What it owns |
| --- | --- | --- |
| pkg/kvm/ | src/vmm/src/vstate/vm.rs, vcpu/mod.rs | /dev/kvm + ioctl wrappers; capability probe; eventually KVM_CREATE_VM / KVM_CREATE_VCPU / KVM_RUN. |
| pkg/boot/ | src/vmm/src/arch/x86_64/{boot,layout}.rs | bzImage parsing, setup header, zero-page, long-mode entry, page tables. |
| pkg/virtio/ | src/vmm/src/devices/virtio/ | virtio-MMIO transport + block / net / console devices. |
| pkg/vmm/ | src/vmm/src/lib.rs + builder.rs | The main loop: vCPU run threads, epoll for device fds, exit dispatch. |
| pkg/snapshot/ | src/vmm/src/persist.rs | (Phase 4) memory + vCPU + device state capture & restore. |
| cmd/mini-fc/ | src/firecracker/src/main.rs | CLI entry. No REST API, flags only. |
| internal/hostcheck/ | (no upstream analog) | mini-fc check, the Phase-0 probe. Firecracker assumes the host is fine and panics loudly if not; we put a polite probe in front instead. |
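As a rough idea of what a polite probe can look like, the sketch below opens the device node and asks for the KVM API version, returning an error instead of panicking. The function name probe and the error wording are hypothetical, not the actual internal/hostcheck/ code; the ioctl number and the expected version 12 come from the KVM API documentation. It is Linux-only.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

const (
	kvmGetAPIVersion = 0xAE00 // _IO(KVMIO, 0x00)
	stableAPIVersion = 12     // the version the KVM API docs tell userspace to require
)

// probe opens path (normally /dev/kvm) and checks the KVM API version.
// It reports problems as errors so the caller can print a friendly message.
func probe(path string) error {
	f, err := os.OpenFile(path, os.O_RDWR, 0)
	if err != nil {
		return fmt.Errorf("open %s: %w", path, err)
	}
	defer f.Close()

	v, _, errno := syscall.Syscall(syscall.SYS_IOCTL, f.Fd(), kvmGetAPIVersion, 0)
	if errno != 0 {
		return fmt.Errorf("KVM_GET_API_VERSION on %s: %w", path, errno)
	}
	if v != stableAPIVersion {
		return fmt.Errorf("KVM API version %d, want %d", v, stableAPIVersion)
	}
	return nil
}

func main() {
	if err := probe("/dev/kvm"); err != nil {
		fmt.Println("host check failed:", err)
		return
	}
	fmt.Println("host check ok")
}
```

A fuller probe would go on to query individual capabilities with KVM_CHECK_EXTENSION; that part is left out here.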

Real Firecracker also has a jailer (src/jailer/), a REST API (src/firecracker/src/api_server/), metrics, CPU templates, block-IO rate limiters, and an MMDS. None of those are in mini-firecracker's roadmap — see ../firecracker-experiments-PLAN.md §2 (non-goals).

How an execution flows, Phase by Phase

Phase 1 — mini-fc --kernel bzImage

pkg/kvm.Open()           open("/dev/kvm", O_RDWR)
      │
      ▼
Device.CreateVM()        ioctl(KVM_CREATE_VM) → vm fd
      │
      ▼
VM.SetTSSAddr()          ioctl(KVM_SET_TSS_ADDR, 0xfffbd000)    // legacy Intel VT-x quirk
VM.CreateIRQChip()       ioctl(KVM_CREATE_IRQCHIP)              // in-kernel PIC/IOAPIC/LAPIC
      │
      ▼
VM.SetUserMemoryRegion() mmap 128 MiB anonymous → KVM_SET_USER_MEMORY_REGION
      │
      ▼
boot.LoadBzImage()       parse setup_header; copy kernel to 0x100000
boot.LoadZeroPage()      cmdline pointer, memory map, boot_params at 0x7000
      │
      ▼
VM.CreateVCPU(0)         ioctl(KVM_CREATE_VCPU) → vcpu fd; mmap kvm_run
vcpu.SetRegs/SetSRegs    CR0.PE|PG, CR4.PAE, EFER.LME|LMA; 4-level PTs; RIP = entry
vcpu.SetCPUID2           copy-through KVM_GET_SUPPORTED_CPUID minus a few entries
      │
      ▼
vmm.Run()                for { ioctl(KVM_RUN); switch kvm_run.exit_reason { ... } }
      │                              │
      │                              ├─ EXIT_IO  0x3f8..0x3ff (16550A)  → stdout / LSR stub
      │                              ├─ EXIT_HLT, EXIT_SHUTDOWN         → teardown
      │                              └─ anything else                   → panic (for now)
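To make the SetSRegs step in the flow above concrete, here is a minimal sketch of the long-mode setup: identity-map the first 1 GiB with 2 MiB pages and compute the control-register values. The table addresses (0x9000, 0xa000, 0xb000) follow upstream Firecracker's layout and are an assumption about what pkg/boot/ will use; the sregs struct and function names are illustrative. With 2 MiB pages, 4-level paging only needs three table levels populated.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	pml4Addr = 0x9000 // assumed guest-physical address of the PML4
	pdptAddr = 0xa000 // assumed address of the page-directory-pointer table
	pdAddr   = 0xb000 // assumed address of the page directory

	ptePresentRW = 0x03 // present | writable
	ptePageSize  = 0x80 // PS bit: this PD entry maps a 2 MiB page

	cr0PE   = 1 << 0  // protected mode enable
	cr0PG   = 1 << 31 // paging enable
	cr4PAE  = 1 << 5  // physical address extension
	eferLME = 1 << 8  // long mode enable
	eferLMA = 1 << 10 // long mode active
)

// sregs holds only the special registers this sketch cares about.
type sregs struct{ CR0, CR3, CR4, EFER uint64 }

// setupLongMode writes identity-mapping page tables into guest memory and
// returns the register values that turn on 64-bit paging.
func setupLongMode(mem []byte) sregs {
	le := binary.LittleEndian
	le.PutUint64(mem[pml4Addr:], pdptAddr|ptePresentRW)
	le.PutUint64(mem[pdptAddr:], pdAddr|ptePresentRW)
	for i := uint64(0); i < 512; i++ {
		// Entry i maps guest-physical [i*2MiB, (i+1)*2MiB) onto itself.
		le.PutUint64(mem[pdAddr+i*8:], i<<21|ptePresentRW|ptePageSize)
	}
	return sregs{
		CR0:  cr0PE | cr0PG,
		CR3:  pml4Addr,
		CR4:  cr4PAE,
		EFER: eferLME | eferLMA,
	}
}

// entry reads one 64-bit page-table entry back out of guest memory.
func entry(mem []byte, addr uint64) uint64 {
	return binary.LittleEndian.Uint64(mem[addr:])
}

func main() {
	mem := make([]byte, 0x10000) // stand-in for the mmap'd guest RAM
	s := setupLongMode(mem)
	fmt.Printf("CR0=%#x CR3=%#x CR4=%#x EFER=%#x\n", s.CR0, s.CR3, s.CR4, s.EFER)
	fmt.Printf("PD[1]=%#x\n", entry(mem, pdAddr+8))
}
```

A real SetSRegs call also has to load 64-bit code and data segment descriptors; that is omitted to keep the sketch short.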

Milestone: kernel banner reaches stdout, kernel panics with "no working init" — proof the interception layer works, same feeling as mini-sentry's first trapped execve.
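The path from a guest write to stdout can be sketched without touching KVM by putting the vCPU behind an interface and feeding it scripted exits. Everything named here (exit, vcpu, scripted, run, handleSerial) is hypothetical and only illustrates the dispatch shape; the exit-reason numbers and the 16550A register offsets are the real ones. Two deliberate simplifications: unknown exits return an error rather than panic, and I/O on ports outside COM1 is silently ignored.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// Exit reasons from linux/kvm.h.
const (
	exitIO       = 2 // KVM_EXIT_IO
	exitHLT      = 5 // KVM_EXIT_HLT
	exitShutdown = 8 // KVM_EXIT_SHUTDOWN
)

// COM1 register window and the two registers the stub cares about.
const (
	com1Base = 0x3f8
	com1End  = 0x3ff
	regTHR   = 0    // transmit holding register (on write)
	regLSR   = 5    // line status register (on read)
	lsrIdle  = 0x60 // THR empty | transmitter empty: "ready for more"
)

// exit is a simplified view of what kvm_run reports after KVM_RUN returns.
type exit struct {
	Reason uint32
	Port   uint16
	Out    bool   // true for guest OUT, false for guest IN
	Data   []byte // bytes written by the guest, or the buffer to fill on IN
}

// vcpu hides ioctl(KVM_RUN) so the loop can be driven by a fake.
type vcpu interface{ Run() (exit, error) }

// handleSerial emulates just enough of a 16550A (DLAB is ignored).
func handleSerial(e exit, w io.Writer) error {
	reg := e.Port - com1Base
	switch {
	case e.Out && reg == regTHR:
		_, err := w.Write(e.Data)
		return err
	case !e.Out && reg == regLSR:
		for i := range e.Data {
			e.Data[i] = lsrIdle
		}
	case !e.Out:
		for i := range e.Data {
			e.Data[i] = 0
		}
	}
	return nil
}

// run loops until the guest halts, shuts down, or produces an unknown exit.
func run(v vcpu, w io.Writer) error {
	for {
		e, err := v.Run()
		if err != nil {
			return err
		}
		switch e.Reason {
		case exitIO:
			if e.Port < com1Base || e.Port > com1End {
				continue // assumption: ignore ports we do not emulate
			}
			if err := handleSerial(e, w); err != nil {
				return err
			}
		case exitHLT, exitShutdown:
			return nil
		default:
			return fmt.Errorf("unhandled exit reason %d", e.Reason)
		}
	}
}

// scripted replays a fixed list of exits in place of a real vCPU.
type scripted struct{ exits []exit }

func (s *scripted) Run() (exit, error) {
	if len(s.exits) == 0 {
		return exit{}, io.EOF
	}
	e := s.exits[0]
	s.exits = s.exits[1:]
	return e, nil
}

// demo plays a guest that writes "ok\n" to COM1 and then halts.
func demo() (string, error) {
	var script []exit
	for _, c := range []byte("ok\n") {
		script = append(script, exit{Reason: exitIO, Port: com1Base + regTHR, Out: true, Data: []byte{c}})
	}
	script = append(script, exit{Reason: exitHLT})
	var out bytes.Buffer
	err := run(&scripted{exits: script}, &out)
	return out.String(), err
}

func main() {
	out, err := demo()
	fmt.Printf("guest wrote %q, err=%v\n", out, err)
}
```

Keeping the vCPU behind an interface means the dispatch logic can be unit-tested on hosts without /dev/kvm.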

Phase 2 — virtio-blk + rootfs → shell

pkg/virtio/ gets a virtio-MMIO transport and a block device backed by a host file. The kernel cmdline picks up root=/dev/vda rw; the guest reads its rootfs via virtqueues. Host stdin is wired to the console device. You type ls, it lists /.

Where the complexity lives: the split descriptor ring, avail/used indices, indirect descriptors, the kick-notify contract. See pkg/virtio/queue.go (when it exists) and the virtio v1.2 spec §2.6.
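A minimal sketch of the device side of a split virtqueue follows: pop the next head from the avail ring, walk the descriptor chain via the NEXT flag, then publish a used element. The struct layouts follow the virtio spec; the Go types and helpers (queue, pop, pushUsed, newDemoQueue) are hypothetical and not the planned pkg/virtio/queue.go. Indirect descriptors, memory barriers, and notification suppression are left out.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	descSize  = 16 // sizeof(struct virtq_desc)
	flagNext  = 1  // VIRTQ_DESC_F_NEXT: chain continues at Next
	flagWrite = 2  // VIRTQ_DESC_F_WRITE: device writes into this buffer
)

var (
	le       = binary.LittleEndian
	errEmpty = errors.New("virtqueue: no available buffers")
)

// desc mirrors struct virtq_desc.
type desc struct {
	Addr  uint64
	Len   uint32
	Flags uint16
	Next  uint16
}

// queue is the device's view of one split virtqueue in guest memory.
type queue struct {
	mem                    []byte
	size                   uint16 // number of descriptors, a power of two
	descTable, avail, used uint64 // guest-physical offsets of the three areas
	lastAvail              uint16 // next avail.idx value we have not consumed
}

func (q *queue) readDesc(i uint16) desc {
	b := q.mem[q.descTable+uint64(i)*descSize:]
	return desc{Addr: le.Uint64(b), Len: le.Uint32(b[8:]), Flags: le.Uint16(b[12:]), Next: le.Uint16(b[14:])}
}

// pop returns the head index and descriptors of the next available chain.
func (q *queue) pop() (uint16, []desc, error) {
	if le.Uint16(q.mem[q.avail+2:]) == q.lastAvail {
		return 0, nil, errEmpty
	}
	head := le.Uint16(q.mem[q.avail+4+uint64(q.lastAvail%q.size)*2:])
	q.lastAvail++
	var chain []desc
	for i, n := head, 0; ; n++ {
		if i >= q.size || n >= int(q.size) {
			return 0, nil, errors.New("virtqueue: bad or looping descriptor chain")
		}
		d := q.readDesc(i)
		chain = append(chain, d)
		if d.Flags&flagNext == 0 {
			return head, chain, nil
		}
		i = d.Next
	}
}

// pushUsed hands a chain back to the driver with the byte count written.
func (q *queue) pushUsed(head uint16, written uint32) {
	idx := le.Uint16(q.mem[q.used+2:])
	elem := q.mem[q.used+4+uint64(idx%q.size)*8:]
	le.PutUint32(elem, uint32(head))
	le.PutUint32(elem[4:], written)
	le.PutUint16(q.mem[q.used+2:], idx+1)
}

func writeDesc(mem []byte, table uint64, i uint16, d desc) {
	b := mem[table+uint64(i)*descSize:]
	le.PutUint64(b, d.Addr)
	le.PutUint32(b[8:], d.Len)
	le.PutUint16(b[12:], d.Flags)
	le.PutUint16(b[14:], d.Next)
}

// newDemoQueue plays the driver: one two-descriptor chain made available.
func newDemoQueue() *queue {
	q := &queue{mem: make([]byte, 0x100), size: 4, descTable: 0x00, avail: 0x40, used: 0x80}
	writeDesc(q.mem, q.descTable, 0, desc{Addr: 0x1000, Len: 16, Flags: flagNext, Next: 1})
	writeDesc(q.mem, q.descTable, 1, desc{Addr: 0x2000, Len: 512, Flags: flagWrite})
	le.PutUint16(q.mem[q.avail+4:], 0) // avail.ring[0] = head descriptor 0
	le.PutUint16(q.mem[q.avail+2:], 1) // avail.idx = 1
	return q
}

func main() {
	q := newDemoQueue()
	head, chain, err := q.pop()
	fmt.Println(head, len(chain), err)
	q.pushUsed(head, 512)
}
```

The chain-length guard matters because the descriptor table lives in guest memory: a buggy or hostile guest can build a loop, and the device must not spin on it.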

Phase 3 — virtio-net via tap

A second device on the MMIO bus; two virtqueues (rx, tx); backed by a tap fd on the host. The VMM's main loop becomes an epoll driver: vCPU-thread KVM_RUN exits on one side, tap POLLIN on the other, both drive the same Queue objects.
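The epoll side of that loop might look roughly like the sketch below. It uses pipes as stand-ins for the tap fd and for a guest-kick fd so it runs without privileges; the poller type and demo function are hypothetical, and only the standard-library syscall wrappers are real. Linux-only.

```go
package main

import (
	"fmt"
	"syscall"
)

// poller wraps one epoll instance watching a set of fds for readability.
type poller struct{ epfd int }

func newPoller(fds ...int) (*poller, error) {
	epfd, err := syscall.EpollCreate1(syscall.EPOLL_CLOEXEC)
	if err != nil {
		return nil, err
	}
	for _, fd := range fds {
		ev := syscall.EpollEvent{Events: syscall.EPOLLIN, Fd: int32(fd)}
		if err := syscall.EpollCtl(epfd, syscall.EPOLL_CTL_ADD, fd, &ev); err != nil {
			syscall.Close(epfd)
			return nil, err
		}
	}
	return &poller{epfd: epfd}, nil
}

// wait returns the fds that became readable within timeoutMs milliseconds.
func (p *poller) wait(timeoutMs int) ([]int, error) {
	events := make([]syscall.EpollEvent, 8)
	for {
		n, err := syscall.EpollWait(p.epfd, events, timeoutMs)
		if err == syscall.EINTR {
			continue // interrupted by a signal: retry
		}
		if err != nil {
			return nil, err
		}
		ready := make([]int, 0, n)
		for _, ev := range events[:n] {
			ready = append(ready, int(ev.Fd))
		}
		return ready, nil
	}
}

// demo makes the fake tap readable and reports which source woke the loop.
func demo() (string, error) {
	var tap, kick [2]int // pipes standing in for the tap fd and a kick fd
	if err := syscall.Pipe(tap[:]); err != nil {
		return "", err
	}
	defer syscall.Close(tap[0])
	defer syscall.Close(tap[1])
	if err := syscall.Pipe(kick[:]); err != nil {
		return "", err
	}
	defer syscall.Close(kick[0])
	defer syscall.Close(kick[1])

	p, err := newPoller(tap[0], kick[0])
	if err != nil {
		return "", err
	}
	defer syscall.Close(p.epfd)

	names := map[int]string{tap[0]: "tap", kick[0]: "kick"}
	if _, err := syscall.Write(tap[1], []byte("frame")); err != nil {
		return "", err
	}
	ready, err := p.wait(1000)
	if err != nil {
		return "", err
	}
	if len(ready) != 1 {
		return "", fmt.Errorf("expected one ready fd, got %d", len(ready))
	}
	return names[ready[0]], nil
}

func main() {
	name, err := demo()
	fmt.Println(name, err)
}
```

In the real loop the handler for the tap fd would fill rx virtqueue buffers, and the handler for the kick fd would drain the tx queue.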

Phase 4 — snapshot / restore

pkg/snapshot/ serializes: the guest memory region (dump the mmap'd backing to a file), the vCPU state (KVM_GET_*), and the device state (queue indices, config space, interrupt lines). Restore recreates the VM, mmaps the memory file back into the slot, re-applies the vCPU and device state, and calls KVM_RUN again so the guest resumes at exactly the next instruction.
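One possible on-disk shape, sketched under heavy assumptions: a raw memory file next to a small versioned state file. The state struct, file names, and the choice of encoding/gob are all hypothetical and are not Firecracker's format or a committed mini-firecracker design. A real restore would mmap the memory file instead of reading it into a slice.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"os"
	"path/filepath"
)

const formatVersion = 1 // hypothetical snapshot format version

// state is everything except guest RAM. The blobs are opaque here; a real
// implementation would store the structs returned by the KVM_GET_* ioctls.
type state struct {
	Version uint32
	VCPU    []byte
	Devices map[string][]byte // keyed by a hypothetical device name
}

// save writes guest memory and the state file into dir.
func save(dir string, s state, guestMem []byte) error {
	if err := os.WriteFile(filepath.Join(dir, "memory"), guestMem, 0o600); err != nil {
		return err
	}
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(s); err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(dir, "state"), buf.Bytes(), 0o600)
}

// restore reads both files back and rejects unknown format versions.
func restore(dir string) (state, []byte, error) {
	var s state
	raw, err := os.ReadFile(filepath.Join(dir, "state"))
	if err != nil {
		return s, nil, err
	}
	if err := gob.NewDecoder(bytes.NewReader(raw)).Decode(&s); err != nil {
		return s, nil, err
	}
	if s.Version != formatVersion {
		return s, nil, fmt.Errorf("snapshot version %d, want %d", s.Version, formatVersion)
	}
	mem, err := os.ReadFile(filepath.Join(dir, "memory"))
	return s, mem, err
}

// roundTrip saves a tiny fake VM and checks that restore returns the same bytes.
func roundTrip() error {
	dir, err := os.MkdirTemp("", "mini-fc-snap")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	mem := make([]byte, 4096)
	mem[0x100] = 0xAA
	in := state{
		Version: formatVersion,
		VCPU:    []byte{1, 2, 3},
		Devices: map[string][]byte{"virtio-blk": {0x10, 0x00}},
	}
	if err := save(dir, in, mem); err != nil {
		return err
	}
	out, gotMem, err := restore(dir)
	if err != nil {
		return err
	}
	if !bytes.Equal(gotMem, mem) || !bytes.Equal(out.VCPU, in.VCPU) ||
		!bytes.Equal(out.Devices["virtio-blk"], in.Devices["virtio-blk"]) {
		return fmt.Errorf("restored snapshot differs from the saved one")
	}
	return nil
}

func main() {
	fmt.Println("round trip error:", roundTrip())
}
```

Keeping memory in its own file is what lets restore map it straight back into the slot rather than copying it.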

This is where the largest surface area lives — Firecracker's persist.rs is the canonical reference and worth reading end-to-end before we start writing.

Threading model (future)

Once pkg/vmm/ exists it will use Firecracker's shape:

  • one vCPU thread per vCPU, each blocked in KVM_RUN;
  • one epoll thread running device I/O, tap fd, and the (optional) control socket;
  • an EventFd linking the two sides so an MMIO write from the guest can wake the epoll loop.

mini-firecracker Phases 1–2 get away with a single OS thread because the VM has exactly one vCPU and no background I/O. The epoll restructure happens when virtio-net lands in Phase 3.
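In Go, the three-part shape above might translate roughly as follows: one goroutine per vCPU pinned to its OS thread, one device goroutine, and a channel standing in for the eventfd. All names are hypothetical and the KVM_RUN call is replaced by a counter. The runtime.LockOSThread call reflects the KVM API guidance that vCPU ioctls should be issued from the thread that created the vCPU.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runVCPU simulates a vCPU thread: each loop iteration stands in for one
// KVM_RUN exit that needs the device side to do some work.
func runVCPU(id, exits int, kick chan<- int, wg *sync.WaitGroup) {
	defer wg.Done()
	// Pin the goroutine so every ioctl for this vCPU uses one OS thread.
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()
	for i := 0; i < exits; i++ {
		// A real loop would block in ioctl(KVM_RUN) here.
		kick <- id // the eventfd write: wake the device loop
	}
}

// runDevices simulates the epoll thread: it wakes on every kick and
// reports how many it handled once the channel is closed.
func runDevices(kick <-chan int, done chan<- int) {
	handled := 0
	for range kick {
		handled++ // a real loop would service the virtqueues here
	}
	done <- handled
}

// simulate wires the two sides together and returns the number of kicks
// the device loop processed.
func simulate(vcpus, exitsPerVCPU int) int {
	kick := make(chan int)
	done := make(chan int)
	var wg sync.WaitGroup

	go runDevices(kick, done)
	for id := 0; id < vcpus; id++ {
		wg.Add(1)
		go runVCPU(id, exitsPerVCPU, kick, &wg)
	}
	wg.Wait()   // all vCPUs have stopped
	close(kick) // lets the device loop drain and exit
	return <-done
}

func main() {
	fmt.Println("kicks handled:", simulate(2, 3))
}
```

A channel is only a stand-in: a real eventfd can also be registered with KVM as an ioeventfd, so the guest's MMIO write wakes the epoll loop without a round trip through the vCPU thread.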

What we don't copy from upstream

  • Jailer. src/jailer/ is a chroot + namespaces + cgroups wrapper that drops privileges before exec'ing the Firecracker process (the seccomp filters are installed by Firecracker itself). Important for real deployment (see firecracker-primer.md Part II on the jailer); not part of the learning axis mini-firecracker is on. If we ever add it, it goes alongside, not inside, pkg/vmm/.
  • REST API. mini-fc is a CLI. Firecracker's API surface lives in src/firecracker/src/api_server/; we read it rather than re-implement it.
  • Rate limiters, MMDS, metrics, CPU templates. Orthogonal to the core VMM loop; noise for the pedagogical goal.

Upstream reference index

Key paths in ../firecracker/ worth reading alongside each phase:

  • Boot + arch: src/vmm/src/arch/x86_64/{mod,layout,boot}.rs
  • KVM glue: src/vmm/src/vstate/{vm,vcpu/mod}.rs
  • Main builder: src/vmm/src/builder.rs
  • virtio: src/vmm/src/devices/virtio/{block,net,console}/
  • Snapshot: src/vmm/src/persist.rs
  • vCPU exit dispatch: search for match exit_reason in vstate/vcpu/mod.rs

Complementary in ../kata/ and ../firecracker-containerd/: both drive Firecracker as a separate process over its API socket and can be read for how a container runtime orchestrates the VMM from the outside.