
Akita DRAM Model Evaluation for HBM3 #252

@syifan

Description


Summary

An evaluation of the Akita DRAM model (akita/v4/mem/dram/) was conducted to assess its suitability for modeling HBM3 memory as used on the AMD MI300A. While the model has a solid architectural foundation (bank state machines, 4-level timing tables, transaction splitting), several critical features are missing or incorrectly modeled for HBM3.

Akita DRAM Model Architecture (Brief)

The model follows a standard academic DRAM controller pipeline:

Request → Transaction → SubTransSplitter → SubTransactionQueue (FCFS)
→ CommandCreator (close-page) → CommandQueue (per-rank, round-robin)
→ Channel (Banks[Rank][BankGroup][Bank]) → Bank state machine → Response

Supported protocols: DDR3, DDR4, GDDR5, GDDR5X, GDDR6, LPDDR, LPDDR3, LPDDR4, HBM, HBM2, HMC.

Key HBM3 Features Missing or Incorrectly Modeled

1. No HBM3 Protocol Constant (only HBM/HBM2)

The model defines no HBM3 protocol constant. HBM3 differs significantly from HBM/HBM2:

  • Higher data rate (up to 6.4 Gbps/pin; MI300A uses 5.2 Gbps/pin)
  • Independent pseudo-channels as a first-class concept
  • Different timing parameters and refresh schemes (per-bank refresh is default)
  • In-line ECC

Recommendation: Add an HBM3 protocol constant with appropriate protocol-specific timing behaviors.
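A minimal sketch of what the addition could look like. The identifiers below are illustrative, not the actual `akita/v4/mem/dram` names:

```go
package main

import "fmt"

// Protocol enumerates supported DRAM standards. The existing model stops at
// HBM2/HMC; HBM3 below is the proposed addition (names are hypothetical).
type Protocol int

const (
	DDR3 Protocol = iota
	DDR4
	GDDR6
	HBM
	HBM2
	HBM3 // proposed: distinct timings, per-bank refresh default, pseudo-channels
)

// isHBMFamily gates HBM-specific timing behavior (e.g. tPPD, pseudo-channels),
// so HBM3 inherits the HBM-family rules while allowing its own overrides.
func isHBMFamily(p Protocol) bool {
	return p == HBM || p == HBM2 || p == HBM3
}

func main() {
	fmt.Println(isHBMFamily(HBM3)) // true: HBM3 shares HBM-family behavior
}
```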

2. No Pseudo-Channel Modeling

HBM3 splits each 128-bit channel into two independent 64-bit pseudo-channels, each with its own command/address bus, bank groups, banks, and independent row buffers. The Akita model has no concept of pseudo-channels — it treats the channel as a monolithic unit.

Impact: Two requests to different pseudo-channels should proceed independently, but in the model they share the channel's timing constraints.

Recommendation: Add pseudo-channel support as a first-class feature in the channel hierarchy, or model each pseudo-channel as a separate DRAM controller instance.
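A sketch of the first option, under the assumption of 64-byte interleaving on address bit 6 (both the types and the interleave bit are illustrative, not Akita's actual layout):

```go
package main

import "fmt"

// Each pseudo-channel tracks its own command-bus availability, so traffic on
// one never delays the other -- the independence the monolithic model loses.
type pseudoChannel struct {
	busyUntil uint64 // cycle at which this PC's command bus frees up
}

type hbm3Channel struct {
	pcs [2]pseudoChannel
}

// pcIndex picks the pseudo-channel from an interleave bit of the address
// (bit 6 here, i.e. 64-byte interleaving -- an assumption for illustration).
func pcIndex(addr uint64) int {
	return int((addr >> 6) & 1)
}

// issue schedules a command on the owning pseudo-channel only and returns
// its start cycle; the sibling pseudo-channel's timing is untouched.
func (c *hbm3Channel) issue(addr, now, busCycles uint64) uint64 {
	pc := &c.pcs[pcIndex(addr)]
	start := now
	if pc.busyUntil > start {
		start = pc.busyUntil
	}
	pc.busyUntil = start + busCycles
	return start
}

func main() {
	ch := &hbm3Channel{}
	// Back-to-back requests to different pseudo-channels start the same cycle
	// instead of serializing.
	fmt.Println(ch.issue(0x000, 10, 4), ch.issue(0x040, 10, 4)) // 10 10
}
```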

3. Single Command Per Cycle Bottleneck

The issue() function issues at most ONE command per tick. The subTransactionQueue.Tick() also processes at most ONE sub-transaction per tick. Real HBM3 can issue multiple independent commands per cycle (one per pseudo-channel, multiple banks active simultaneously).

Recommendation: Allow multiple commands to be issued per tick when timing constraints allow. At minimum, one command per pseudo-channel.
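The essence of the change is removing the early return after the first issued command. A toy sketch (queue type and the "one per pseudo-channel" cap are illustrative):

```go
package main

import "fmt"

// cmdQueue stands in for a per-pseudo-channel command queue.
type cmdQueue struct {
	pending int
}

// tickChannel issues up to one command per pseudo-channel queue this cycle
// and reports how many were issued. The key difference from the current
// issue() is that it keeps scanning instead of returning after one command.
func tickChannel(queues []*cmdQueue) int {
	issued := 0
	for _, q := range queues {
		if q.pending > 0 { // real code would also check timing constraints here
			q.pending--
			issued++
			// no early return: the other pseudo-channels still get a slot
		}
	}
	return issued
}

func main() {
	qs := []*cmdQueue{{pending: 3}, {pending: 1}}
	fmt.Println(tickChannel(qs)) // 2: both pseudo-channels issue
	fmt.Println(tickChannel(qs)) // 1: only the first still has work
}
```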

4. Close-Page Policy Only (No Open-Page)

The model hardcodes ClosePageCommandCreator, always generating ReadPrecharge/WritePrecharge commands. Real HBM3 controllers use open-page or adaptive policies to exploit row buffer locality, which is critical for GPU streaming workloads. The bank state machine already supports open row tracking — only a proper command creator is needed.

Recommendation: Implement an OpenPageCommandCreator or AdaptiveCommandCreator.
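Since the bank state machine already tracks the open row, the new creator only needs to branch on hit/miss/conflict. A sketch with hypothetical identifiers (not the actual akita/v4 command-creator interface):

```go
package main

import "fmt"

// bank mirrors the open-row tracking the existing state machine provides.
type bank struct {
	rowOpen bool
	openRow uint64
}

// commandsFor is the core of an open-page creator: emit a bare Read on a row
// hit and leave the row open, instead of always appending a precharge.
func commandsFor(b *bank, row uint64) []string {
	switch {
	case b.rowOpen && b.openRow == row:
		return []string{"Read"} // row hit: exploit row-buffer locality
	case b.rowOpen:
		b.openRow = row
		return []string{"Precharge", "Activate", "Read"} // row conflict
	default:
		b.rowOpen, b.openRow = true, row
		return []string{"Activate", "Read"} // bank closed: row miss
	}
}

func main() {
	b := &bank{}
	fmt.Println(commandsFor(b, 7)) // [Activate Read]
	fmt.Println(commandsFor(b, 7)) // [Read] -- the hit a close-page policy never sees
	fmt.Println(commandsFor(b, 9)) // [Precharge Activate Read]
}
```

An AdaptiveCommandCreator would wrap the same branch with a predictor that decides whether to leave the row open after each access.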

5. No Per-Bank Refresh

The model defines refresh commands but no component actually generates them. No refresh controller or scheduler exists. In real HBM3, per-bank refresh (REFpb) is the default mode. Refresh interference can reduce effective bandwidth by 5-15%.

Recommendation: Implement a refresh scheduler. For HBM3, per-bank refresh should be the default mode.
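A minimal per-bank refresh scheduler could emit a REFpb to one bank in round-robin order every per-bank refresh interval, so only one bank is blocked at a time. The interval and bank count below are illustrative:

```go
package main

import "fmt"

// refreshScheduler issues REFpb commands round-robin across banks, one per
// tREFIpb interval. Field names are hypothetical, not Akita identifiers.
type refreshScheduler struct {
	tREFIpb  uint64 // per-bank refresh interval, in controller cycles
	numBanks int
	nextBank int
	nextDue  uint64
}

// tick returns the bank index to refresh this cycle, or -1 if none is due.
// The DRAM controller would arbitrate this against demand commands.
func (r *refreshScheduler) tick(now uint64) int {
	if now < r.nextDue {
		return -1
	}
	b := r.nextBank
	r.nextBank = (r.nextBank + 1) % r.numBanks
	r.nextDue = now + r.tREFIpb
	return b
}

func main() {
	rs := &refreshScheduler{tREFIpb: 5, numBanks: 4}
	for cycle := uint64(0); cycle < 12; cycle++ {
		if b := rs.tick(cycle); b >= 0 {
			fmt.Printf("cycle %d: REFpb bank %d\n", cycle, b)
		}
	}
}
```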

6. Missing tPPD for HBM

The tPPD (precharge-to-precharge delay) is only applied for GDDR and LPDDR4 protocols in the timing generation code, not for HBM. HBM3 requires tPPD. This appears to be a bug in the timing table generation.

Recommendation: Enable tPPD in the timing tables for HBM/HBM2/HBM3 protocols.
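The fix amounts to extending the protocol check in the timing-table generation. A simplified sketch of the intent (Akita's actual tables are the 4-level structures described above, not a switch on strings):

```go
package main

import "fmt"

// prechargeToPrechargeDelay models the same-bank-group PRE->PRE entry.
// Today only the GDDR/LPDDR4 arm applies tPPD; the HBM arm is the proposed
// fix. Protocol names and the fallback value of 1 are illustrative.
func prechargeToPrechargeDelay(protocol string, tPPD int) int {
	switch protocol {
	case "GDDR5", "GDDR5X", "GDDR6", "LPDDR4":
		return tPPD // existing behavior
	case "HBM", "HBM2", "HBM3":
		return tPPD // proposed: HBM3 requires tPPD as well
	default:
		return 1 // otherwise back-to-back precharges are only bus-limited
	}
}

func main() {
	fmt.Println(prechargeToPrechargeDelay("HBM3", 2)) // 2 with the fix
	fmt.Println(prechargeToPrechargeDelay("DDR4", 2)) // 1, unchanged
}
```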

Additional Gaps (Lower Priority)

  • No bus turnaround delay modeling at the channel level (read↔write switching)
  • Command queue is per-rank only — per-bank or per-bank-group queues would improve parallelism
  • Address mapping order not configurable through the builder API; no HBM3-optimized defaults
  • No power-down state management (states defined but never entered)
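For the first gap, bus turnaround can be captured with a small amount of state on the channel's data bus. A sketch with illustrative types and penalty values:

```go
package main

import "fmt"

// dataBus tracks the direction of the last burst so that a read<->write
// switch pays an extra tRTW/tWTR-style gap. Names are hypothetical.
type dataBus struct {
	lastWasWrite bool
	freeAt       uint64
}

// schedule returns the start cycle of a burst, adding a turnaround penalty
// whenever the direction flips relative to the previous burst.
func (b *dataBus) schedule(now uint64, isWrite bool, burst, turnaround uint64) uint64 {
	start := now
	if b.freeAt > start {
		start = b.freeAt
	}
	if b.lastWasWrite != isWrite {
		start += turnaround // direction switch: read<->write gap
	}
	b.lastWasWrite = isWrite
	b.freeAt = start + burst
	return start
}

func main() {
	bus := &dataBus{}
	r1 := bus.schedule(0, false, 4, 3) // read starts immediately
	w1 := bus.schedule(0, true, 4, 3)  // write waits for the bus, then +3 turnaround
	fmt.Println(r1, w1)                // 0 7
}
```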

Current Workaround

We are using SimpleBankedMemory as an interim DRAM model for MI300A timing configuration, with tuned pipeline depth, stage latency, and buffer sizes to approximate the expected bandwidth characteristics. This sidesteps the DRAM model limitations but sacrifices detailed timing accuracy.

Summary Table

| Feature | Status | HBM3 Need | Severity |
|---|---|---|---|
| HBM3 protocol | ❌ Missing | HBM3-specific behavior | 🔴 Critical |
| Pseudo-channels | ❌ Not modeled | Independent 64-bit channels | 🔴 Critical |
| Page policy | Close-page only | Open/Close/Adaptive | 🔴 Critical |
| Commands/cycle | 1 (hardcoded) | Multiple per bank/pseudo-ch | 🔴 Critical |
| Refresh | ❌ Not implemented | Per-bank refresh (3.9 μs) | 🔴 Critical |
| tPPD for HBM | ❌ Not applied | Required | 🔴 Critical |
| Bus turnaround | ❌ Not modeled | R/W turnaround penalty | ⚠️ Important |
| Command queue | Per-rank | Per-bank/bank-group | ⚠️ Important |
| Bank state machine | ✅ OK | Open/Closed/SRef | ✅ OK |
| Timing tables | ✅ OK | 4-level hierarchy | ✅ OK |
| Row buffer tracking | ✅ OK | Per-bank open row | ✅ OK |
