An open-source hardware implementation of a Predictive Coding Network (PCN) in the SkyWater 130 nm process (Sky130A), verified by SPICE simulation and synthesised RTL place-and-route.
Weights are stored as voltages on 200 fF capacitors inside a five-transistor OTA. Inference is computed by transconductance and Kirchhoff's current law. Hebbian weight updates happen on-chip — no weight data crosses the chip boundary during inference or learning.
The draft of a companion preprint, intended for arXiv, is in paper/main_v3rev.tex (not yet on arXiv).
Architecture designed by Saul Dobney; coded and validated using Claude Code.
| Metric | Value | Method |
|---|---|---|
| Single-cell gain | 6.82 V/V | SPICE (Sky130A TT 27 °C) |
| 4-layer spatial cascade gains | 1.43 / 0.45 / 0.70 / 0.54 V/V | SPICE |
| Temporal reuse gain (per VL) | 6.68 V/V | SPICE |
| ADC/DAC round-trip error | 0.50 LSB | SPICE |
| GHA prediction error reduction | 484× (single step) | Software sim |
| Template selectivity | 1.00 | Software sim |
| Digital timing slack @ 50 MHz | +8.46 ns | OpenLane P&R (Sky130A TT) |
| Projected efficiency at 28 nm | ~320 GOPS/W (~0.32 TOPS/W) | Estimate |
| Projected efficiency at 7 nm | ~3.8 TOPS/W | Estimate |
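A few of the figures in the table above can be cross-checked with simple arithmetic. The sketch below is illustrative only. It assumes that the four cascade gains multiply and that the reported slack is measured against the full 50 MHz clock period with no extra margin. The variable names are not taken from the repository.

```python
# Back-of-envelope checks on the headline metrics (values copied from the table).

cascade_gains = [1.43, 0.45, 0.70, 0.54]    # V/V per layer, from SPICE

# Assumption: stage gains multiply, i.e. inter-stage loading is already
# included in each reported gain.
end_to_end_gain = 1.0
for g in cascade_gains:
    end_to_end_gain *= g

clock_mhz = 50.0
slack_ns = 8.46
period_ns = 1e3 / clock_mhz                  # 20 ns at 50 MHz
critical_path_ns = period_ns - slack_ns      # delay used by the slowest path

# Assumption: slack is taken against the whole period, so this is only a
# rough upper bound on the achievable clock rate.
fmax_mhz = 1e3 / critical_path_ns

gops_per_w_28nm = 320.0
tops_per_w_28nm = gops_per_w_28nm / 1e3      # same quantity expressed in TOPS/W

print(f"end-to-end cascade gain ~ {end_to_end_gain:.3f} V/V")
print(f"critical path ~ {critical_path_ns:.2f} ns, fmax ~ {fmax_mhz:.1f} MHz")
print(f"28 nm projection = {tops_per_w_28nm:.2f} TOPS/W")
```

The cascade product is well below unity, which is consistent with the per-layer gains listed for the spatial stack being much lower than the single-cell gain.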
    ┌─────────────────────────────────────────────────────┐
    │                PCN Module (16 × 16)                 │
    │                                                     │
    │  inp[0..15] ──► MAC cell array ──► iout[0..15]      │
    │                 (5T OTA × 256)                      │
    │                 Vw = f(SRAM → DAC → Cw)             │
    │                 Hebbian: ΔVw per pulse              │
    └────────────────────────┬────────────────────────────┘
                             │ PMOS source-follower level shift (+0.67 V)
                             ▼
                 (next layer or ADC save)
Temporal reuse: one physical array × N_virt virtual layers
Phase 1 — ADC saves activations to SRAM
Phase 2 — Weight DAC loads next VL weights
Phase 3 — Input DAC replays activations (resets common mode)
Phase 4 — Array computes; Hebbian gate fires if |ε| > θ
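The four phases can be sketched as a loop over virtual layers. This is a minimal behavioural model, not the repository's `sim/pcn_core.py`. The 1.8 V full scale, the ideal converters, the normalised weighted-average "array" and the error signal `eps` are all placeholder assumptions chosen to keep the example short.

```python
import random

N_BITS = 8
FULL_SCALE = 2**N_BITS - 1       # highest 8-bit code
V_REF = 1.8                      # assumed converter full-scale voltage (hypothetical)
LSB = V_REF / FULL_SCALE

def adc(v):
    """Ideal 8-bit ADC: round to the nearest code, then clamp."""
    return min(FULL_SCALE, max(0, round(v / LSB)))

def dac(code):
    """Ideal 8-bit DAC: code back to a voltage."""
    return code * LSB

def run_temporal_reuse(x, weight_codes, theta=0.05):
    """Pass activations x through len(weight_codes) virtual layers on one array.

    weight_codes holds one square matrix of 8-bit codes per virtual layer.
    Returns the final activations and, per layer, whether the gate fired.
    """
    v = list(x)
    gate_log = []
    for codes in weight_codes:
        act_sram = [adc(vi) for vi in v]                       # Phase 1: save activations
        w = [[dac(c) / V_REF for c in row] for row in codes]   # Phase 2: load weights (0..1)
        v_in = [dac(c) for c in act_sram]                      # Phase 3: replay activations
        v = [sum(wij * vj for wij, vj in zip(row, v_in)) / len(v_in)
             for row in w]                                     # Phase 4: array computes
        # Placeholder error: output minus input. NOT the chip's definition of epsilon.
        eps = [vo - vi for vo, vi in zip(v, v_in)]
        gate_log.append(any(abs(e) > theta for e in eps))      # gate fires if |eps| > theta
    return v, gate_log

random.seed(0)
x = [random.uniform(0.0, V_REF) for _ in range(16)]
weights = [[[random.randrange(256) for _ in range(16)] for _ in range(16)]
           for _ in range(3)]
v_out, gates = run_temporal_reuse(x, weights)
round_trip_err_lsb = max(abs(dac(adc(xi)) - xi) for xi in x) / LSB
print(len(v_out), gates, f"{round_trip_err_lsb:.2f} LSB")
```

With ideal rounding converters the save/replay round trip can never exceed half an LSB, which is the bound the ADC/DAC round-trip metric is measured against.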
MAC cell transistors: MN1/MN2 differential pair (W=2/L=0.35 µm), MP1/MP2 PMOS current mirror load (W=4/L=0.7 µm, CLM-optimised), MN3 tail (W=10/L=0.35 µm, gate=Vw — weight sets Gm), MN4+MP4 CMOS transmission gate (Cw access), Cw=200 fF.
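To illustrate how a weight voltage on the tail gate sets the cell's transconductance, and how currents sum on a shared output node, here is a textbook square-law sketch. Only the W/L ratios come from the cell description above. `KP_N` and `VTH_N` are hypothetical values, not extracted from the Sky130A models, and the choice of which array index shares an output node is an assumption.

```python
import math

# Hypothetical square-law parameters, for illustration only.
KP_N = 200e-6          # A/V^2, assumed NMOS process transconductance
VTH_N = 0.5            # V, assumed NMOS threshold voltage
WL_PAIR = 2 / 0.35     # MN1/MN2 W/L, from the cell description
WL_TAIL = 10 / 0.35    # MN3 W/L, from the cell description

def cell_gm(vw):
    """Small-signal Gm (A/V) of one cell for weight voltage vw on the tail gate."""
    vov = max(0.0, vw - VTH_N)                  # tail overdrive; cell is off below threshold
    i_tail = 0.5 * KP_N * WL_TAIL * vov**2      # MN3 saturation current
    # Each pair device carries i_tail / 2, so gm = sqrt(2 * beta * i_tail / 2).
    return math.sqrt(KP_N * WL_PAIR * i_tail)

def array_currents(v_in, vw):
    """Output current per column: cells on a column sum their currents (KCL).

    v_in[r] is the small-signal input voltage on row r.
    vw[r][c] is the weight voltage of the cell at row r, column c.
    """
    n_cols = len(vw[0])
    return [sum(cell_gm(vw[r][c]) * v_in[r] for r in range(len(v_in)))
            for c in range(n_cols)]

v_in = [0.01, -0.02, 0.015, 0.0]                # volts, small signal
vw = [[0.6, 0.9], [0.7, 0.5], [0.8, 0.6], [0.9, 0.7]]
i_out = array_currents(v_in, vw)
print([f"{i * 1e6:.2f} uA" for i in i_out])
```

In this simplified model Gm grows linearly with the tail overdrive, so doubling `vw - VTH_N` doubles the cell's contribution. The real cell's behaviour is set by the SPICE models and will deviate from this.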
PCNchip_design/
│
├── pcn_mac_cell.spice # 5T OTA MAC cell (production netlist)
├── pcn_mac_cell_v3b.spice # Characterisation variants (Path A study)
├── pcn_array_*.spice # Generated 4×4 / 16×16 / 16×32 / 32×32 / 32×64 arrays
├── pcn_module_*.spice # Arrays with peripheral bias ports
├── pcn_chip_4layer.spice # 4-layer 16×16 spatial stack
├── layer_link.spice # PMOS SF inter-layer level shifter
├── layer_link_route_*.spice # Routing-weight variants (4/16/32 routing cells)
├── bias_gen.spice # Vbias_n / Vcm / Vπ generator
├── weight_dac.spice # 8-bit R-2R weight DAC
├── inp_dac.spice # 8-bit R-2R activation (input) DAC
│
├── tb_pcn_4layer.spice # 4-layer cascade testbench (T1/T2/T3)
├── tb_pcn_4col_2vl.spice # 4-column 2-VL GHA timing testbench (all PASS)
├── tb_temporal_reuse.spice # 3-VL temporal reuse (gain 6.82 V/V)
├── tb_temporal_full.spice # Full round-trip: inp_dac + weight_dac + array
├── tb_sram_reload.spice # Save/load path verification
├── tb_pcn_2layer_*.spice # 2-layer integration testbenches
├── tb_pcn_route_test.spice # Dynamic routing weight testbench
├── tb_bias_gen.spice # Bias generator testbench
├── tb_dvw_pulse.spice # Hebbian pulse characterisation
│
├── gen_array.py # Generates pcn_array_*.spice and pcn_module_*.spice
├── gen_tb_2layer.py # Generates 2-layer testbench SPICE
├── gen_tb_4layer.py # Generates 4-layer testbench SPICE
├── run_sim.sh # ngspice runner (--netlist flag for non-default)
├── plot_results.py # Plots output CSV data
│
├── rtl/ # Synthesisable Verilog RTL
│ ├── pcn_digital_top.v # Top-level integration
│ ├── weight_fsm.v # 22-state FSM (weight load, temporal reuse, ADC sweep)
│ ├── hebb_ctrl.v # Per-row Hebbian enable, gated by HEBB_ROW_MASK
│ ├── pcn_wb_regs.v # Wishbone register file (13 registers, base 0x3000_0000)
│ ├── sar_adc.v # 8-bit SAR ADC (10-cycle latency)
│ ├── act_sram.v # Activation register file (N_cells × 8-bit)
│ ├── sram_if.v # SRAM interface
│ ├── sram_blackbox.v # OpenRAM macro black-box wrapper
│ ├── power_fsm.v # Sleep/wake power controller
│ ├── tb_pcn_digital_top.v # Digital integration testbench (13/13 tests PASS)
│ └── tb_sar_adc.v # SAR ADC testbench (11/11 tests PASS)
│
├── sim/ # Python software simulation
│ ├── pcn_core.py # PCNLayer + PCNTemporalStack (v1/BCM/v2 modes, 8-bit quant)
│ ├── tasks.py # Benchmark tasks: Gaussian PCA, templates, bars/stripes
│ ├── train.py # Training loops: train_layer, compare_modes, train_temporal
│ ├── run_sim.py # Runs E1–E4 experiments; saves figures to sim/results/
│ ├── circuit_sim.py # Hardware-calibrated model; 7 circuit experiments C1–C7
│ ├── run_circuit_sim.py # Runs C1–C7; cross-validates against SPICE
│ ├── pcn_predict.py # 2-layer GHA multi-cell PCN; 484× pred_err reduction
│ ├── pcn_mnist.py # MNIST digit classification using hardware-faithful PCN simulation
│ └── results/ # Output figures (c1–c7, e1–e4, p_predictive_network)
│
├── pnr/ # OpenLane place-and-route
│ ├── config.yaml # OpenLane 2 configuration
│ ├── macro_placement.cfg # OpenRAM SRAM macro placement
│ └── src/ # RTL sources for P&R (mirrors rtl/ minus testbenches)
│
├── synth_output/ # Yosys synthesis results
│ ├── pcn_digital_top_synth.v # Post-synthesis netlist
│ └── SYNTHESIS_RESULTS.md # Cell count, area, timing summary
│
├── magic/ # Analog layout (in progress)
│ ├── mac_cell.mag # MAC cell Magic layout (layer names corrected)
│ ├── mac_cell_seed.tcl # Seed script for Magic (3200×2200 units = 16×11 µm)
│ ├── Makefile # make seed → opens Magic; make drc / make lvs
│ └── SETUP.md # Magic version requirements and build instructions
│
├── xschem/ # Schematic capture
│ ├── mac_cell.sch # MAC cell schematic
│ ├── mac_cell.sym # MAC cell symbol
│ └── mac_cell_tb.sch # MAC cell testbench schematic
│
├── paper/ # ArXiv preprint
│ ├── main_v3rev.tex # Current submission draft (IEEEtran, 10 pages)
│ ├── refs.bib # Bibliography (19 entries)
│ └── main_v3rev.pdf # Compiled PDF
│
├── pred_code_networks.md # Full design journal (~10,500 lines; §1–§75)
├── FeFET_7nm_discussion.md # Weight storage options: MIM cap, FeFET, PCM, RRAM, WSI
├── sky130_summary.md # Sky130A process design overview
├── simulations_summary.md # Python software simulation summary: MNIST, EMNIST
├── PCN_versus_spiking.md # Comparison of PCN and Spiking Neural Nets approaches
├── quickstart.md # Quick-start guide for running simulations
└── config.json # OpenLane project configuration
Install ngspice:

    sudo apt install ngspice

Requires the SkyWater 130 nm PDK installed via volare:

    pip install volare
    volare enable --pdk sky130 0fe599b2
    export PDK_ROOT=~/.volare

Install Icarus Verilog:

    sudo apt install iverilog

Install the Python dependencies:

    pip install numpy matplotlib scipy

OpenLane 2 via Docker:

    docker pull ghcr.io/efabless/openlane2:2.3.10

Magic 8.3.411 or later (build from source; see magic/SETUP.md):

    git clone https://github.com/RTimothyEdwards/magic
    cd magic && ./configure && make -j$(nproc) && sudo make install

Run a SPICE testbench directly:

    cd PCNchip_design
    PDK=~/.volare/sky130/versions/0fe599b2/sky130A
    sed "s|\$PDK_ROOT|$PDK|g" tb_pcn_4layer.spice | ngspice -b

Or use the runner script:

    bash run_sim.sh # default testbench
    bash run_sim.sh --netlist tb_temporal_reuse.spice

Run the digital integration testbench:

    cd rtl
    iverilog -o tb_pcn.vvp tb_pcn_digital_top.v pcn_digital_top.v \
        weight_fsm.v hebb_ctrl.v pcn_wb_regs.v sar_adc.v \
        act_sram.v sram_if.v sram_blackbox.v power_fsm.v
    vvp tb_pcn.vvp

Run the software simulations:

    cd sim
    python run_sim.py # Gaussian PCA, template learning, quantisation, temporal reuse
    python run_circuit_sim.py # Circuit-level validation C1–C7 (hardware-calibrated model)
    python pcn_predict.py # 2-layer GHA multi-cell PCN

Results are written to sim/results/.
| Offset | Name | Width | Function |
|---|---|---|---|
| 0x00 | WEIGHT_DATA | 8 | Weight byte to write |
| 0x04 | CELL_ADDR | 16 | Target cell/column address |
| 0x08 | CTRL | 7 | start_load, load_all, hebb_en, sleep_req, rst_weights, start_temporal, start_adc_sweep |
| 0x0C | STATUS | 4 | sleep_ack, hebb_actv, busy, ready |
| 0x10 | HEBB_MASK | N_rows | Static per-row Hebbian enable |
| 0x14 | HEBB_PW | 16 | Hebbian pulse width (cycles) |
| 0x18 | SRAM_DATA | 8 | Direct SRAM r/w |
| 0x20 | N_VIRT_LAYERS | 4 | Virtual layer count (1–8) |
| 0x24 | HEBB_ROW_MASK | N_rows | Dynamic per-row GHA mask |
| 0x28 | IERR_DIG | N_rows | Read-only precision-gate flags |
| 0x2C | INP_DAC_DATA | 8 | Direct input DAC write (GHA residual reload) |
| 0x30 | ACT_SRAM_DATA | 8 | Direct activation SRAM r/w |
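The register map above can be captured in a small host-side helper. The offsets are taken from the table, and the base address is the one listed for `pcn_wb_regs.v`. The CTRL bit positions (LSB-first in the order the table lists them) and the three-write weight-load sequence are assumptions made for illustration and should be checked against the RTL.

```python
from enum import IntEnum, IntFlag

WB_BASE = 0x3000_0000     # Wishbone base address listed for pcn_wb_regs.v

class Reg(IntEnum):
    """Register offsets, copied from the table."""
    WEIGHT_DATA = 0x00
    CELL_ADDR = 0x04
    CTRL = 0x08
    STATUS = 0x0C
    HEBB_MASK = 0x10
    HEBB_PW = 0x14
    SRAM_DATA = 0x18
    N_VIRT_LAYERS = 0x20
    HEBB_ROW_MASK = 0x24
    IERR_DIG = 0x28
    INP_DAC_DATA = 0x2C
    ACT_SRAM_DATA = 0x30

class Ctrl(IntFlag):
    """CTRL fields. Assumption: packed LSB-first in the listed order."""
    START_LOAD = 1 << 0
    LOAD_ALL = 1 << 1
    HEBB_EN = 1 << 2
    SLEEP_REQ = 1 << 3
    RST_WEIGHTS = 1 << 4
    START_TEMPORAL = 1 << 5
    START_ADC_SWEEP = 1 << 6

def reg_addr(reg):
    """Absolute bus address of a register."""
    return WB_BASE + int(reg)

def load_weight_writes(cell, weight):
    """Return (address, value) pairs for loading one weight byte.

    Hypothetical sequence: set the cell address, set the data, pulse start_load.
    """
    if not 0 <= cell <= 0xFFFF:
        raise ValueError("cell address must fit in 16 bits")
    if not 0 <= weight <= 0xFF:
        raise ValueError("weight must fit in 8 bits")
    return [
        (reg_addr(Reg.CELL_ADDR), cell),
        (reg_addr(Reg.WEIGHT_DATA), weight),
        (reg_addr(Reg.CTRL), int(Ctrl.START_LOAD)),
    ]

for addr, value in load_weight_writes(cell=5, weight=0x80):
    print(f"write 0x{addr:08X} <- 0x{value:02X}")
```

Returning the writes as data rather than performing them keeps the helper independent of whichever bus driver or testbench issues the transactions.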
| Configuration | Effective weights | Comparable scale |
|---|---|---|
| 1 chip, Sky130A, N=100 | ~200 K | small feature extractor |
| 1 chip, 28 nm, N=100 | ~1.6 M | — |
| 100 chips, 28 nm, N=20 | ~320 M | ResNet-50 class |
| 100 chips, 7 nm, N=100 | ~10 B | GPT-2 class |
All inter-chip interfaces are digital SPI — no analog signals cross chip boundaries. Weight bandwidth is zero: weights never leave the chip.
paper/main_v3rev.tex: Scalable Modular Analog VLSI for Predictive Coding Networks: Temporal Multiplexing, Learned Routing, and Zero Weight Bandwidth. Saul Dobney, 2026.
IEEEtran format, targeting cs.AR + cs.NE cross-listing.
| Domain | Status |
|---|---|
| MAC cell SPICE | Complete — all parameters verified |
| Analog peripheral circuits | Complete — bias gen, weight DAC, input DAC, PMOS SF |
| 4-layer spatial cascade | Complete — gains verified by SPICE |
| Temporal reuse | Complete — 3-VL SPICE + full round-trip |
| Digital RTL | Complete — 22-state FSM, 13/13 tests pass |
| Place and route | Complete — OpenLane 2, 0 DRC violations, +8.46 ns slack |
| Software simulation | Complete — E1–E4 + C1–C7 + GHA multi-cell |
| Analog layout (Magic) | In progress — blocked on Magic 8.3.411+ build |
| Tape-out | Not started |
Design files, RTL, and simulation scripts are released under Apache 2.0.
The SkyWater 130 nm PDK is subject to its own licence terms.