Add FPGA board cards, DMA+PCIe simulation, Wishbone bus wrapper, timing closure #46
Open
devin-ai-integration[bot] wants to merge 4 commits into
…imates

- Add fpga_cards/ directory with Lattice ECP5 45K-CABGA381 card JSON (all values from official Lattice datasheets)
- Add tg2hdl/fpga_card.py: FPGACard dataclass, load_card(), list_cards()
- Remove PCIeModel class and hardcoded FPGA constants from report.py
- Update benchmark() to accept FPGACard instead of PCIeModel
- Update benchmark.py to read clock/power from FPGA card
- Update compare_inference.py to use card for synthesis params and BRAM
- Replace Xilinx RAMB36 constant with card-derived BRAM block size
- Remove all FIXME comments related to estimated FPGA values
- Export FPGACard and helpers from tg2hdl.__init__

Co-Authored-By: Alessandro Ferrari <alessandro.ferrari.2004@gmail.com>
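For reviewers who want a picture of the card-loading pattern this commit describes, here is a minimal sketch of a frozen dataclass populated from a JSON file. It is not the PR's code: the field names other than `synth_typical_fmax_mhz`, the loader name `load_card_sketch`, and all numeric values are assumptions made for illustration, and the real `FPGACard` has more fields.

```python
import json
import math
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class FPGACard:
    # Illustrative subset of fields; names are assumptions, not the PR's schema.
    name: str
    bram_block_bits: int
    pcie_bandwidth_bytes_per_s: float
    synth_typical_fmax_mhz: float

    def bram_blocks_for_bits(self, bits: int) -> int:
        """Whole BRAM blocks needed to hold `bits` bits, rounded up."""
        return math.ceil(bits / self.bram_block_bits)

    def pcie_xfer_s(self, n_bytes: int) -> float:
        """Seconds to move `n_bytes` at the card's PCIe bandwidth."""
        return n_bytes / self.pcie_bandwidth_bytes_per_s


def load_card_sketch(path) -> FPGACard:
    """Hypothetical loader: the dataclass constructor is the only validation,
    so a missing or unexpected JSON key raises TypeError."""
    with open(path) as f:
        return FPGACard(**json.load(f))


# Usage with placeholder numbers (not taken from any datasheet).
with tempfile.TemporaryDirectory() as tmp:
    card_path = Path(tmp) / "example_card.json"
    card_path.write_text(json.dumps({
        "name": "example-card",
        "bram_block_bits": 18432,
        "pcie_bandwidth_bytes_per_s": 250e6,
        "synth_typical_fmax_mhz": 50.0,
    }))
    card = load_card_sketch(card_path)

print(card.name, card.bram_blocks_for_bits(40_000))
```

Freezing the dataclass means downstream code (benchmark, report, comparison) cannot mutate board parameters by accident, which fits the goal of a single datasheet-derived source of truth.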
- Add optional card: FPGACard parameter to synthesis_stats()
- When card is provided, device, package, yosys_target, resource types, and fpga_family are all read from the card
- Update callers in report.py and compare_inference.py to pass card=card instead of device=/package= kwargs
- Backward-compatible: existing callers without a card still work via the original device/package string defaults

Co-Authored-By: Alessandro Ferrari <alessandro.ferrari.2004@gmail.com>
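The backward-compatible fallback described in this commit can be sketched as follows. This is an assumption-laden illustration, not the actual `synthesis_stats()` body: `CardStub`, its attribute names, the helper `resolve_target`, and the example device strings in the usage lines are hypothetical; only the `"45k"`/`"CABGA381"` defaults and the `synth_ecp5` pass name come from this PR's description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CardStub:
    """Stand-in for FPGACard; attribute names are assumptions."""
    device: str
    package: str
    yosys_target: str


def resolve_target(device: str = "45k",
                   package: str = "CABGA381",
                   card: Optional[CardStub] = None) -> dict:
    """Pick synthesis parameters: the card wins when given, otherwise the
    legacy string defaults are used so old call sites keep working."""
    if card is not None:
        return {
            "device": card.device,
            "package": card.package,
            "yosys_target": card.yosys_target,
        }
    return {"device": device, "package": package, "yosys_target": "synth_ecp5"}


# Old-style call site: no card, defaults apply.
print(resolve_target())
# New-style call site: everything comes from the card (values are made up).
print(resolve_target(card=CardStub("85k", "CABGA756", "synth_ecp5")))
```

One consequence of this shape is that explicit `device=`/`package=` arguments are silently ignored when a card is also passed; whether that should raise instead is a design choice left to the reviewer.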
…_binary to card

- compiler/pcie_dma.py: New module with simulate_top_with_pcie() that throttles input/output transfers to match card PCIe bandwidth and adds per-direction DMA setup latency cycles
- compiler/utils.py: synthesis_stats() now returns target_mhz and timing_met fields for timing closure flagging; reads nextpnr binary name from card instead of hardcoding
- tg2hdl/report.py: benchmark() runs both ideal and DMA-aware sims; HTML report shows DMA timing breakdown table; estimates section documents the DMA simulation methodology
- fpga_cards/lattice_ecp5_45k_cabga381.json: Added nextpnr_binary field
- tg2hdl/fpga_card.py: Added synth_nextpnr_binary and synth_toolchain
- compare_inference.py: Uses card.synth_toolchain for display label

Co-Authored-By: Alessandro Ferrari <alessandro.ferrari.2004@gmail.com>
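To make the throttling idea concrete, the arithmetic behind a bandwidth-limited transfer can be sketched like this. It is a simplified model under stated assumptions, not the logic of `simulate_top_with_pcie()`: the function name, the parameter names, and the choice to add a fixed setup latency to every transfer are all hypothetical, and the numbers in the usage lines are placeholders.

```python
def dma_transfer_cycles(n_bytes: int,
                        bandwidth_bytes_per_s: int,
                        clock_hz: int,
                        setup_cycles: int) -> int:
    """Fabric clock cycles for one DMA direction.

    The link moves bandwidth/clock bytes per fabric cycle, so the data phase
    takes ceil(n_bytes * clock_hz / bandwidth) cycles; the per-direction setup
    latency is added on top. Integer ceiling division avoids float rounding.
    """
    numerator = n_bytes * clock_hz
    data_cycles = (numerator + bandwidth_bytes_per_s - 1) // bandwidth_bytes_per_s
    return setup_cycles + data_cycles


# Placeholder figures: 1000 bytes over a 250 MB/s link, 50 MHz fabric clock,
# 100 cycles of setup latency.
print(dma_transfer_cycles(1000, 250_000_000, 50_000_000, 100))
```

A simulation that stalls input loading and output readback for this many cycles per direction would reproduce the kind of DMA timing breakdown the report table shows, alongside the ideal (unthrottled) cycle count.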
- compiler/wishbone_wrapper.py: WishboneTopWrapper Elaboratable that memory-maps TopModule's input/output buffers and control registers over a standard Wishbone B4 bus. Register map: CTRL (0x0000), STATUS (0x0004), CYCLE_CNT (0x0008), input region (0x1000+), output region (0x8000+). simulate_wishbone() drives TopModule through the bus interface with cycle-accurate bus transactions.
- compiler/__init__.py: Export WishboneTopWrapper and simulate_wishbone
- compiler/pcie_dma.py: Remove unused amaranth.hdl imports
- tg2hdl/report.py: benchmark() runs Wishbone simulation alongside ideal and DMA sims; HTML report adds Wishbone timing breakdown table; BenchmarkArtifact gains wb_* fields; estimates section documents Wishbone simulation methodology

Co-Authored-By: Alessandro Ferrari <alessandro.ferrari.2004@gmail.com>
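The register map above can be checked against a small software model of the address decoder. The sketch below is plain Python rather than Amaranth and is not taken from `WishboneTopWrapper`: it assumes byte addressing, assumes the input region ends where the output region begins, and puts no upper bound on the output region. The function name `decode` is hypothetical.

```python
# Addresses as listed in the commit message.
CTRL_ADDR = 0x0000
STATUS_ADDR = 0x0004
CYCLE_CNT_ADDR = 0x0008
INPUT_BASE = 0x1000
OUTPUT_BASE = 0x8000


def decode(addr: int):
    """Map a bus address to (region name, byte offset within the region).

    Unmapped addresses raise ValueError; a hardware decoder would more likely
    return zero or signal a bus error, which this model does not cover.
    """
    if addr == CTRL_ADDR:
        return ("CTRL", 0)
    if addr == STATUS_ADDR:
        return ("STATUS", 0)
    if addr == CYCLE_CNT_ADDR:
        return ("CYCLE_CNT", 0)
    if addr >= OUTPUT_BASE:  # checked first so 0x8000+ is never seen as input
        return ("OUTPUT", addr - OUTPUT_BASE)
    if addr >= INPUT_BASE:
        return ("INPUT", addr - INPUT_BASE)
    raise ValueError(f"unmapped address 0x{addr:04X}")


for a in (0x0000, 0x0004, 0x1010, 0x8000):
    print(hex(a), decode(a))
```

A model like this is handy as a reference when reviewing the wrapper's FSM: each simulated bus access can be compared against `decode()` to confirm it lands in the intended buffer.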
Summary
Introduces FPGA board "cards": JSON files under `fpga_cards/` that capture all board-level parameters (fabric resources, BRAM geometry, PCIe bandwidth, power draw, synthesis flags) sourced from vendor datasheets. The benchmark, report, and comparison pipelines now read from an `FPGACard` dataclass instead of using hardcoded constants and the `PCIeModel` class.

Additionally adds:
- `synthesis_stats()` reports whether nextpnr met the target Fmax
- An `Elaboratable` that memory-maps TopModule over a standard Wishbone bus, modelling the LiteX SoC integration path

New files
- `fpga_cards/lattice_ecp5_45k_cabga381.json`: first card, all values from the Lattice ECP5 Family Data Sheet (FPGA-DS-02012-3.4) and related app notes
- `tg2hdl/fpga_card.py`: frozen `FPGACard` dataclass, `load_card()` / `load_card_from_path()` / `list_cards()` helpers, and derived methods (`pcie_xfer_s`, `bram_blocks_for_bits`, etc.)
- `compiler/pcie_dma.py`: `simulate_top_with_pcie()` runs a second Amaranth simulation that throttles input loading and output readback to match the card's PCIe bandwidth, with per-direction DMA setup latency
- `compiler/wishbone_wrapper.py`: `WishboneTopWrapper` Elaboratable that wraps TopModule as a Wishbone B4 slave with memory-mapped registers: CTRL (0x0000), STATUS (0x0004), CYCLE_CNT (0x0008), input buffers (0x1000+), output buffers (0x8000+). `simulate_wishbone()` drives TopModule through the bus interface with cycle-accurate 2-cycle Wishbone transactions (strobe + ack)

Modified files
- `tg2hdl/report.py`: removes the `PCIeModel` class and the `FPGA_FAMILY` / `FPGA_DEVICE` / `FPGA_PACKAGE` constants; `benchmark()` now accepts `card: FPGACard` and runs three simulations (ideal, DMA-throttled, Wishbone-wrapped); HTML report includes "DMA + PCIe" and "Wishbone Bus Simulation" timing tables; `BenchmarkArtifact` gains `dma_*` and `wb_*` fields; estimates section generated from `_estimates_for_card(card)` with Wishbone methodology documented
- `benchmark.py`: clock speeds, power values, and scaling estimates read from the card instead of a hardcoded multi-FPGA table
- `compare_inference.py`: replaces the Xilinx-specific `_RAMB36_BITS` constant with a card-derived BRAM block size; synthesis calls pass `card=card`; display label uses `card.synth_toolchain`
- `compiler/utils.py`: `synthesis_stats()` accepts optional `card: FPGACard`; returns new `target_mhz` and `timing_met` fields for timing closure flagging; reads the `nextpnr_binary` name from the card instead of hardcoding `"nextpnr-ecp5"`
- `compiler/__init__.py`: exports `simulate_top_with_pcie`, `WishboneTopWrapper`, `simulate_wishbone`
- `tg2hdl/__init__.py`: exports `FPGACard`, `load_card`, `load_card_from_path`, `list_cards`; drops the `PCIeModel` export

All FIXME comments related to estimated FPGA values have been removed or replaced with notes referencing the card datasheet source.
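As an illustration of the timing closure flag, the comparison behind `target_mhz` and `timing_met` might look like the sketch below. The helper name `timing_fields`, the `fmax_mhz` and `slack_mhz` keys, and the use of `>=` are assumptions; the PR only states that the achieved Fmax is compared against the card's `synth_typical_fmax_mhz`.

```python
def timing_fields(achieved_fmax_mhz: float, target_mhz: float) -> dict:
    """Build the timing-closure part of a synthesis stats dict.

    `target_mhz` is assumed to come from card.synth_typical_fmax_mhz;
    timing is considered met when the achieved Fmax reaches the target.
    """
    return {
        "fmax_mhz": achieved_fmax_mhz,   # hypothetical key name
        "target_mhz": target_mhz,
        "timing_met": achieved_fmax_mhz >= target_mhz,
        "slack_mhz": achieved_fmax_mhz - target_mhz,  # extra, for illustration
    }


# Placeholder frequencies, not measured results.
print(timing_fields(62.5, 50.0))
print(timing_fields(48.0, 50.0))
```

If the target later becomes a user-configurable constraint rather than the card's typical Fmax, only the value passed as `target_mhz` would need to change in a helper of this shape.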
Review & Testing Checklist for Human
- `WishboneTopWrapper` is a ~300-line Elaboratable that has not been synthesized or tested end-to-end. The FSM, address decoding, and buffer mapping logic should be reviewed carefully. In particular, verify that the input buffer sequential layout (0x1000 + offset) and output buffer region (0x8000+) correctly map to TopModule's `ext_write_ports` and `output_rport`.
- `benchmark()` now runs three full Amaranth simulations (ideal + DMA-throttled + Wishbone-wrapped). There is no flag to skip any of them. Verify this runtime is acceptable or consider adding opt-out parameters.
- `simulate_wishbone()` output correctness: the Wishbone simulation should produce identical numerical results to the ideal simulation. Run `benchmark()` and verify that the Wishbone output buffer contents match the ideal simulation output.
- `benchmark()` signature changed from `pcie: PCIeModel` to `card: FPGACard`, and `PCIeModel` is deleted. Verify no external callers or notebooks use the old `pcie=` kwarg or import `PCIeModel`.
- `timing_met` semantics: `synthesis_stats()` compares achieved Fmax against `card.synth_typical_fmax_mhz` (the card's typical achievable frequency, not a user-specified constraint). Decide if this is the right target or if it should be a separate user-configurable parameter.
- Run `benchmark()` end-to-end and open the generated `index.html` to verify the three timing tables (ideal, DMA, Wishbone) all render correctly with card-derived values.

Notes
- `compiler/backend.py` has an unrelated FIXME about the analytical cycle model, and `benchmark.py` retains a FIXME about GPU latency estimates. Both are intentionally left as-is since they aren't FPGA board-level estimates.
- … `FPGACard`, `load_card()`, `simulate_top_with_pcie()`, `WishboneTopWrapper`, or `simulate_wishbone()`. The JSON schema is validated only by the dataclass constructor at load time.
- `synthesis_stats()` retains its original `device="45k"` / `package="CABGA381"` default arguments for backward compatibility, but all callers in this PR now pass `card=card` instead.
- The `"mapped"` stage in `show_hardware()` still uses a hardcoded `synth_ecp5` Yosys pass; it has not been updated to use the card's `yosys_target`.
- `synthesis_stats()` still returns ECP5-specific key names (`dp16kd`, `mult18`, `comb`, `ff`). Future non-ECP5 cards may need generic key names.

Link to Devin session: https://app.devin.ai/sessions/c6cba5b3aae14406a970cb500224341c
Requested by: @Ferryistaken