Conversation

@nemecad nemecad commented Sep 4, 2025

This PR addresses some previous feedback, most notably:

  • Set-associative TLB (machine::TLB): Implements a set-associative Translation Lookaside Buffer (TLB) frontend over physical memory, handling virtual to physical translation, flush, and replacement policy.
  • Pluggable Replacement Policies (machine::TLBPolicy): Abstract TLB replacement policy interface & implementations (RAND, LRU, LFU, PLRU) for set-associative tables.
  • SV32 Page-Table Walker (machine::PageTableWalker): Performs multi-level page-table walks (SV32) in memory to resolve a virtual address to a physical one.
  • Sv32Pte Bitfield Helpers (sv32.h): SV32-specific definitions: page-table entry (PTE) bitfields, shifts/masks, and PTE to physical address helpers.
  • VirtualAddress (virtual_address.h): Lightweight VirtualAddress wrapper offering raw access, alignment checks, arithmetic, and comparisons.
  • Add supervisor CSRs and sstatus handling: supervisor CSRs (sstatus, stvec, sscratch, sepc, scause, stval, satp) and a write handler that presents sstatus as a masked view of mstatus so supervisor-visible bits stay in sync.
  • Store current privilege level in CoreState: tracking of the hart's current privilege level in CoreState so exception/return handling and visualization can read/update it from the central CoreState structure.
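For readers unfamiliar with the structure, the following is a minimal sketch of a set-associative TLB with LRU replacement. It is not the machine::TLB or machine::TLBPolicy code from this PR; `TinyTlb`, `TlbEntry`, `lookup`, `insert`, `flush` and `lru_demo` are hypothetical names, and ASIDs, permissions and statistics are left out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified model; not the machine::TLB class from this PR.
struct TlbEntry {
    bool valid = false;
    uint32_t vpn = 0;      // virtual page number (address >> 12)
    uint32_t ppn = 0;      // physical page number
    uint32_t last_use = 0; // timestamp consulted by the LRU policy
};

template<std::size_t Sets, std::size_t Ways>
class TinyTlb {
public:
    // Returns true and writes the physical address on a hit.
    constexpr bool lookup(uint32_t vaddr, uint64_t &paddr) {
        const uint32_t vpn = vaddr >> 12;
        auto &set = sets_[vpn % Sets];
        for (auto &e : set) {
            if (e.valid && e.vpn == vpn) {
                e.last_use = ++clock_;
                paddr = (uint64_t(e.ppn) << 12) | (vaddr & 0xFFFu);
                return true;
            }
        }
        return false;
    }

    // Fills an invalid way if one exists, otherwise evicts the LRU way.
    constexpr void insert(uint32_t vpn, uint32_t ppn) {
        auto &set = sets_[vpn % Sets];
        std::size_t victim = 0;
        for (std::size_t w = 0; w < Ways; ++w) {
            if (!set[w].valid) { victim = w; break; }
            if (set[w].last_use < set[victim].last_use) { victim = w; }
        }
        set[victim] = TlbEntry{true, vpn, ppn, ++clock_};
    }

    // Invalidates every entry (what an SFENCE.VMA without operands would need).
    constexpr void flush() {
        for (auto &set : sets_) {
            for (auto &e : set) { e.valid = false; }
        }
    }

private:
    std::array<std::array<TlbEntry, Ways>, Sets> sets_{};
    uint32_t clock_ = 0;
};

// Usage sketch: with 2 ways, touching page A before inserting C evicts B.
constexpr bool lru_demo() {
    TinyTlb<1, 2> tlb;
    uint64_t pa = 0;
    tlb.insert(0xA, 0x100);
    tlb.insert(0xB, 0x200);
    tlb.lookup(0xA000, pa); // A becomes the most recently used entry
    tlb.insert(0xC, 0x300); // evicts B, the least recently used entry
    return tlb.lookup(0xA123, pa) && pa == 0x100123 && !tlb.lookup(0xB000, pa);
}
```

A timestamp per entry keeps the policy easy to follow; a real implementation would more likely keep the policy state in a separate object so that RAND, LFU or PLRU can be swapped in.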

Tests:

  • Add SV32 page-table + TLB integration tests: a set of small assembly tests that exercise the SV32 page-table walker, SATP enablement and the new TLB code. The tests create a root page table and map a virtual page at 0xC4000000, then exercise several scenarios. The tests verify page-table walker behaviour, SATP switching and TLB caching/flush logic. Tests were written based on the consultation.
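To make the test scenario concrete, here is a hedged sketch of how an Sv32 virtual address such as 0xC4000000 is split and resolved by a two-level walk. The field layout follows the RISC-V privileged specification; the function names, the array-based memory model and the physical addresses in `demo_walk` are assumptions for illustration and are not taken from sv32.h or the PR's tests. Permission, A/D and U-bit checks are omitted.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Illustrative names only; not the helpers from sv32.h in this PR.
constexpr uint32_t PTE_V = 1u << 0, PTE_R = 1u << 1, PTE_W = 1u << 2, PTE_X = 1u << 3;

constexpr uint32_t vpn1(uint32_t va) { return (va >> 22) & 0x3FFu; }
constexpr uint32_t vpn0(uint32_t va) { return (va >> 12) & 0x3FFu; }
constexpr uint64_t pte_ppn(uint32_t pte) { return pte >> 10; }

struct WalkResult { bool ok; uint64_t paddr; };

// `mem` models physical memory as 32-bit words; root_ppn comes from satp.PPN.
template<std::size_t N>
constexpr WalkResult walk_sv32(const std::array<uint32_t, N> &mem, uint32_t root_ppn, uint32_t va) {
    uint64_t table = uint64_t(root_ppn) << 12;
    for (int level = 1; level >= 0; --level) {
        const uint32_t idx = (level == 1) ? vpn1(va) : vpn0(va);
        const uint64_t pte_addr = table + idx * 4u;
        if (pte_addr / 4 >= N) return {false, 0}; // outside the modelled memory
        const uint32_t pte = mem[pte_addr / 4];
        // Invalid entry or reserved W-without-R encoding: page fault.
        if (!(pte & PTE_V) || (!(pte & PTE_R) && (pte & PTE_W))) return {false, 0};
        if (pte & (PTE_R | PTE_X)) { // leaf PTE
            if (level == 1) {
                if (pte_ppn(pte) & 0x3FFu) return {false, 0}; // misaligned 4 MiB superpage
                return {true, (pte_ppn(pte) << 12) | (va & 0x3FFFFFu)};
            }
            return {true, (pte_ppn(pte) << 12) | (va & 0xFFFu)};
        }
        table = pte_ppn(pte) << 12; // pointer to the next-level table
    }
    return {false, 0}; // non-leaf PTE at the last level
}

// Usage sketch: root table at 0x1000, second-level table at 0x2000,
// virtual page 0xC4000000 mapped to the (arbitrary) physical page 0x80000.
constexpr WalkResult demo_walk(uint32_t va) {
    std::array<uint32_t, 3072> mem{}; // 12 KiB of zeroed memory
    mem[0x1000 / 4 + vpn1(0xC4000000u)] = (0x2u << 10) | PTE_V;
    mem[0x2000 / 4 + vpn0(0xC4000000u)] = (0x80u << 10) | PTE_V | PTE_R | PTE_W;
    return walk_sv32(mem, 0x1, va);
}
```

For 0xC4000000 the root-table index VPN[1] is 784 and the second-level index VPN[0] is 0, so only one entry in each table needs to be populated.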

UI Components:

  • Show current privilege level in core state view:
Screenshot 2025-09-04 115127
  • Virtual memory configuration in NewDialog:
Screenshot 2025-09-04 115904
  • TLB visualization and statistics dock:
Screenshot 2025-09-04 115521
  • VM toggle and "As CPU" memory access view:
Screenshot 2025-09-04 120034

@jdupak jdupak self-requested a review September 28, 2025 17:17
Collaborator

jdupak commented Sep 28, 2025

I am getting this weird zoom.
image

Collaborator

jdupak commented Sep 28, 2025

Notice that address sanitizer is failing in CI.

void tlb_update(unsigned way, unsigned set, bool valid, unsigned asid, quint64 vpn, quint64 phys, bool write);

private:
const machine::TLB *tlb;
Collaborator

Who owns this pointer?

#include <cstdint>

namespace machine {
enum TLBType { PROGRAM, DATA };
Collaborator

Why does the TLB need to know this?

Member

ppisa commented Sep 29, 2025

@jdupak thanks for reviewing the interfacing to the memory model architecture.

Member

ppisa commented Sep 29, 2025

From my side: the changes to the processor pipeline diagram have been applied directly to the SVG files (src/gui/windows/coreview/schemas), but the current design uses the DRAW.IO source (extras/core_graphics) as the authoritative source of the pipeline visualization, and the SVGs are generated from this file. So the commit with the SVG change should include the extras/core_graphics/diagram.drawio change as well, or the extras/core_graphics/diagram.drawio change should be committed before the commit that regenerates the SVG files. In the long term, I would lean towards a single SVG file with tags for conditional rendering, but we have not got to that state yet, and the current solution implemented by @jdupak is based on DRAW.IO with exports controlled by tagging (some documentation is in docs/developer/coreview-graphics/using-drawio-diagram.md).

Member

ppisa commented Sep 29, 2025

For the memory view, I would not complicate it with a Show virtual checkbox. I would only switch between As CPU (VMA), Cached and Raw.

@nemecad nemecad force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch 5 times, most recently from 0bb04d1 to ca4300b Compare October 5, 2025 19:16
Author

nemecad commented Oct 5, 2025

@ppisa @jdupak Thank you for your detailed feedback. I appreciate it and have made some changes based on your review. I would be grateful for any further feedback.

@jdupak jdupak force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch from ca4300b to a6cbf71 Compare October 19, 2025 14:48
Collaborator

There is a typo in the dir name

Collaborator

There is no cmake logic to run these tests. I think we want to run them as cli tests.

Author

Thank you for the comment. I’ve added the CMake logic to run these as CLI tests in the commit 7a204cf.

@jdupak jdupak force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch from a6cbf71 to bc32933 Compare October 19, 2025 16:13
Collaborator

jdupak commented Oct 19, 2025

I pushed some slight edits. Barring the issue with new tests not being run I am fine with merging this.

Member

ppisa commented Oct 29, 2025

I am going through the code. I have one overall remark: there are a lot of formatting changes included in the functional changes. I am not opposed to formatting changes, even though I think that formatting left by a human, for example to align some case lines into columns, sometimes has value. But formatting changes unrelated to the functional changes make review harder. So I would keep the patches as they are, but I would suggest separating the formatting changes, even across all files modified later in the series, from the functional changes.

Member

ppisa commented Oct 29, 2025

I do not like the is_mmio_region() and bypass_mmio() concept. Peripheral accesses should go through regular address translation. It is the responsibility of the OS to map I/O regions into the virtual address space of the kernel and/or even of a user application, i.e. for mmap()-like accesses.

As for enabling the cache for accesses, there is a hack in QtRvSim which enforces the following uncached region:

Cache::Cache
    uncached_start(0xf0000000_addr)
    uncached_last(0xfffffffe_addr)

In the longer term, cacheability should be controlled from page tables. But PBMT (Page-Based Memory Types) are supported only for Sv39 and bigger translation configurations, see Chapter 14. "Svpbmt" Extension for Page-Based Memory Types

Mode   Value   Requested Memory Attributes
PMA    0       None
NC     1       Non-cacheable, idempotent, weakly-ordered (RVWMO), main memory
IO     2       Non-cacheable, non-idempotent, strongly-ordered (I/O ordering), I/O
-      3       Reserved for future standard use

But the physical region marked to skip caching in cache implementation (current state) should be enough for now.
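As a small illustration of the two approaches discussed here, the sketch below shows a fixed physical-range check using the bounds from the Cache::Cache snippet, next to a decoder for the Svpbmt field that a future Sv39 implementation could consult instead. The function and enum names are hypothetical; the PBMT position (bits 62:61 of a leaf PTE) and the encodings come from the Svpbmt chapter of the privileged specification.

```cpp
#include <cstdint>

// Bounds copied from the Cache::Cache snippet in this comment; the names
// below are hypothetical and not part of QtRvSim.
constexpr uint64_t UNCACHED_START = 0xf0000000u;
constexpr uint64_t UNCACHED_LAST = 0xfffffffeu;

// Current approach: a fixed physical window that always bypasses the cache.
constexpr bool bypasses_cache(uint64_t phys_addr) {
    return phys_addr >= UNCACHED_START && phys_addr <= UNCACHED_LAST;
}

// Possible future approach for Sv39 and larger: read the PBMT field
// (bits 62:61) from a 64-bit leaf PTE.
enum class Pbmt : uint8_t { PMA = 0, NC = 1, IO = 2, Reserved = 3 };

constexpr Pbmt pbmt_of(uint64_t pte) {
    return static_cast<Pbmt>((pte >> 61) & 0x3u);
}

constexpr bool pbmt_is_cacheable(Pbmt mode) {
    return mode == Pbmt::PMA; // PMA defers to the platform's physical attributes
}
```

Sv32 PTEs are only 32 bits wide, so there is no room for this field, which is why the range check remains the practical option for this PR.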

Member

ppisa commented Oct 29, 2025

Not so critical for now, but it should be solved in the longer term. SRET can be executed even in M mode, so the type of the return should be propagated to control_state->exception_return in Core::memory(const ExecuteInterstage &dt). The question is whether to add a signal which goes through all stages (more readable) or to use a bit from the instruction to decode the type locally. MRET should not be allowed in supervisor mode. In general, I think that the current version does not mark system-level instructions and accesses to the supervisor and machine mode CSRs as invalid in U mode, so some masking will be required at the decode level in the future. Some more flags need to be added to enum InstructionFlags to allow that checking, and flags_to_check should then be updated on mode transition.

The behavior of the xRET instructions is described in 3.1.6.1. Privilege and Global Interrupt-Enable Stack in mstatus register. When SRET is executed in M mode, it executes the same as in S mode, but it should also clear MPRV (MPRV=0). This allows machine-level code to emulate some system-level operations.
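The mstatus update described in that section can be sketched as follows. This is an illustration of the specification's rules, not the ControlState::exception_return code from the PR; `do_xret`, `XretResult` and `Priv` are hypothetical names, U mode is assumed to be the least-privileged supported mode, and updating the PC from mepc/sepc is left out.

```cpp
#include <cstdint>

enum class Priv : uint32_t { U = 0, S = 1, M = 3 };

// mstatus bit positions from the RISC-V privileged specification.
constexpr uint32_t MSTATUS_SIE = 1u << 1, MSTATUS_MIE = 1u << 3;
constexpr uint32_t MSTATUS_SPIE = 1u << 5, MSTATUS_MPIE = 1u << 7;
constexpr uint32_t MSTATUS_SPP = 1u << 8, MSTATUS_MPP = 3u << 11;
constexpr uint32_t MSTATUS_MPRV = 1u << 17;

struct XretResult { uint32_t mstatus; Priv new_priv; };

// The result depends only on which xRET is executed, not on the mode it is
// executed in, so SRET in M mode takes the same path as SRET in S mode.
constexpr XretResult do_xret(uint32_t mstatus, bool is_mret) {
    const uint32_t ie = is_mret ? MSTATUS_MIE : MSTATUS_SIE;
    const uint32_t pie = is_mret ? MSTATUS_MPIE : MSTATUS_SPIE;
    const Priv target = is_mret ? Priv((mstatus & MSTATUS_MPP) >> 11)
                                : ((mstatus & MSTATUS_SPP) ? Priv::S : Priv::U);
    // xIE <- xPIE, then xPIE <- 1
    mstatus = (mstatus & ~ie) | ((mstatus & pie) ? ie : 0u);
    mstatus |= pie;
    // xPP <- U (assumed to be the least-privileged supported mode)
    mstatus &= ~(is_mret ? MSTATUS_MPP : MSTATUS_SPP);
    // Returning to any mode below M clears MPRV.
    if (target != Priv::M) { mstatus &= ~MSTATUS_MPRV; }
    return {mstatus, target};
}
```

Because SRET can only ever return to S or U mode, it always clears MPRV, which is the case the comment highlights for SRET executed from M mode.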

Member

ppisa commented Oct 29, 2025

It seems that the TLBs are updated from the start of the system. The TLB and its updates should be enabled only when the root register is set, and they should not be updated in M mode at all.
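The condition being asked for can be expressed as a single predicate. The sketch below assumes Sv32, where bit 31 of satp is the MODE field; `translation_active` and `Priv` are hypothetical names, and the MPRV case (M-mode loads and stores translated as if in the MPP mode) is deliberately ignored.

```cpp
#include <cstdint>

enum class Priv : uint32_t { U = 0, S = 1, M = 3 };

// Sv32: satp.MODE is bit 31 (0 = Bare, 1 = Sv32).
constexpr bool satp_mode_sv32(uint32_t satp) { return (satp >> 31) != 0; }

// The TLB should be consulted and filled only when this returns true.
// Simplification: mstatus.MPRV is not modelled here.
constexpr bool translation_active(uint32_t satp, Priv priv) {
    return priv != Priv::M && satp_mode_sv32(satp);
}
```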

Collaborator

jdupak commented Oct 29, 2025

There is one actual issue from CI: you cannot use ftruncate. It fails compilation on Win.

Collaborator

jdupak commented Oct 30, 2025

There is one actual issue from CI: you cannot use ftruncate. It fails compilation on Win.

Never mind, this is broken on master. I will fix that. It does not block this PR.

@nemecad nemecad force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch 3 times, most recently from c846a9e to 442a091 Compare November 2, 2025 16:54
@nemecad nemecad force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch 2 times, most recently from 3238c4f to 62777b5 Compare November 9, 2025 16:02
@jdupak jdupak force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch from 62777b5 to 6c37dd9 Compare November 10, 2025 19:23
Collaborator

jdupak commented Nov 10, 2025

@nemecad notice that I force pushed your branch - there is zero diff at the end but all spurious format changes should be gone now. My apologies for introducing them.

CC @ppisa should be now easier to review

@jdupak jdupak force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch from 6c37dd9 to 861f836 Compare November 11, 2025 07:16
Member

@ppisa ppisa left a comment

Thanks to @nemecad for the virtual memory implementation. The code is clean and readable in general. Thanks to @jdupak for review and formatting.

There are some minor issues to resolve or discuss, as well as some suggestions for history cleanup, but I think that we can merge the code soon.

To speed up the discussion, I would like to meet or have a call with @nemecad.

But congratulations on a good job overall. As the next step, I would like to discuss the possibility of working on Sv39, which would allow testing some more realistic operating system scenarios.


namespace machine {

static bool is_mmio_region(uint64_t virt) {
Member

This function should be removed from the history. The logic is corrected by commit 667bd4f, but it would be much better if it did not appear in the history at all.

return false;
}

static Address bypass_mmio(Address vaddr) {
Member

Ditto.

mem = machine->cache_data();
}
} else {
if (access_through_cache == 2) {
Member

The modes should be changed to a proper enum to make the code readable.

The field name access_through_cache should be adjusted as well, to something like mem_access_kind, mem_access_level, or some better name.

Member

Here is the proposed enum. When I think about its use, it could be interesting to have an option to look at memory at the virtual level even when the CPU is in machine mode, because then you can observe what the hypervisor or SBI does with system memory.

enum MemoryAccessAtLevel {
    MEM_ACC_AS_CPU = 0,
    MEM_ACC_VIRT_ADDR = 1,
    MEM_ACC_PHYS_ADDR = 2,
    MEM_ACC_PHYS_ADDR_SKIP_CACHES = 3,
    MEM_ACC_AS_MACHINE = 4,
};

Member

This commit and the previous one should either be squashed, or kept separate but with the test files already introduced at their final location in the previous commit. The move is redundant in the history of a new component.

flags = (enum InstructionFlags)im.flags;
alu_op = im.alu;
mem_ctl = im.mem_ctl;
if (flags & IMF_CSR) {
Member

I agree with this additional logic to map CSR to required privilege level. But see proposed testing remark.

ExceptionCause excause = dt.excause;

dt.inst.flags_alu_op_mem_ctl(flags, alu_op, mem_ctl);
auto current_priv = state.current_privilege();
Member

I would suggest using some masking there, ideally implemented directly through check_inst_flags_val and check_inst_flags_mask, which would be updated when the privilege level changes. This would allow a speedup and even a check at the level of individual instructions, since mret, sret, etc. can be annotated with the required privilege level directly in the instruction table. This common illegal-instruction processing is possible because a privilege violation should lead to a regular illegal-instruction exception. See

2.1. CSR Address Mapping Conventions

Instructions that access a non-existent CSR are reserved. Attempts to access a CSR without appropriate privilege level raise illegal-instruction exceptions or, as described in Section 21.6.1, virtual-instruction exceptions. Attempts to write a read-only register raise illegal-instruction exceptions. A read/write register might also contain some bits that are read-only, in which case writes to the read-only bits are ignored.
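The address convention from that section makes the privilege and read-only checks a matter of inspecting the top four bits of the 12-bit CSR number. The sketch below shows the idea; the function names are hypothetical, and it does not cover non-existent CSRs, counter-enable registers, or virtual-instruction exceptions.

```cpp
#include <cstdint>

enum class Priv : uint32_t { U = 0, S = 1, H = 2, M = 3 };

// csr[9:8] encodes the lowest privilege level allowed to access the CSR.
constexpr Priv csr_min_priv(uint32_t csr_addr) {
    return Priv((csr_addr >> 8) & 0x3u);
}

// csr[11:10] == 0b11 marks the CSR as read-only.
constexpr bool csr_read_only(uint32_t csr_addr) {
    return ((csr_addr >> 10) & 0x3u) == 0x3u;
}

// True when the access should raise an illegal-instruction exception
// according to the address convention alone.
constexpr bool csr_access_illegal(uint32_t csr_addr, Priv current, bool is_write) {
    if (static_cast<uint32_t>(current) < static_cast<uint32_t>(csr_min_priv(csr_addr))) {
        return true;
    }
    return is_write && csr_read_only(csr_addr);
}
```

For example, satp (0x180) requires at least S mode, mstatus (0x300) requires M mode, and mhartid (0xF14) is read-only even in M mode.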

jdupak and others added 4 commits November 13, 2025 10:26
Add tracking of the hart's current privilege level to the core state so code
handling exceptions/returns and visualization can read/update it from the
central CoreState structure.
The following supervisor CSRs have been added:
  sstatus, stvec, sscratch, sepc, scause, stval, satp
A write handler has been added as well. It presents sstatus
as a masked view of mstatus so supervisor-visible bits stay
in sync.
@nemecad nemecad force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch from 861f836 to b5940e8 Compare November 14, 2025 12:11
Member

@ppisa ppisa left a comment

I have added comments and documented some that had already been expressed in discussion.

I have noticed some problems in the rv32ui-p-fence_i official RISC-V test. It fails in the cached variant regardless of the pipeline/single-cycle and 32/64-bit variants. @jdupak it is strange that the failure of a single official test does not propagate to a failure of the whole test series.

The problem seems to appear in some change after the "Machine: add supervisor CSRs and status handling" commit, or it could have been introduced by my rearrangement of the changes.

rv32ui-p-fence_i: ERROR
[INFO]  machine.ProgramLoader:	Loaded executable: 32bit
[INFO]  machine.TLB:	TLB[I] constructed; sets=16 way=1
[INFO]  machine.TLB:	TLB[D] constructed; sets=16 way=1
[INFO]  machine.TLB:	TLB: SATP changed → flushed all; new SATP=0x00000000
[INFO]  machine.TLB:	TLB: SATP changed → flushed all; new SATP=0x00000000
[INFO]  machine.TLB:	TLB: SATP changed → flushed all; new SATP=0x00000000
[INFO]  machine.TLB:	TLB: SATP changed → flushed all; new SATP=0x00000000
[INFO]  machine.BranchPredictor:	Initialized branch predictor: None
[INFO]  machine.TLB:	TLB[D]: flushed all entries
[INFO]  machine.TLB:	TLB[I]: flushed all entries
[DEBUG] machine.core:	Exception cause 11 instruction PC 0x80000180 next PC 0x80000184 jump branch PC 0x8000017cregisters PC 0x80000184 mem ref 0x00000000

Machine stopped on ECALL_M exception.

}
if (auto data_tlb = dynamic_cast<TLB *>(mem_data)) {
data_tlb->on_privilege_changed(restored);
}
Member

Dynamic casts are the last resort, and the core should (ideally) know nothing about the TLB except for propagating some control instructions as commands.

One option is to add a standard (synchronous) signal emitted in set_current_privilege in the Core (it is a QObject) and connect this signal to the TLBs.

But when I think about it, the right solution is to modify the memory access functions FrontendMemory::write_ctl and FrontendMemory::read_ctl to propagate some control signals, probably through a pointer which can be null, or maybe with default parameters when passed by value (some struct which fits into 32 bits, or a uint in such a case). These additional signals should carry the privilege level and the current ASID. This is how it is done on real CPUs, e.g. when processor chips exposed the bus to an external MMU (68020), or when the buses are routed into FPGA fabric today. The ASID should be held in the core state and synchronized by some signal from CSR writes...

TLB::on_privilege_changed should not be needed, and it certainly should not flush TLB entries. That would cause extreme overhead for system calls and machine exceptions. TLB flushes are managed by the operating system when page tables are modified or when the mapping of memory contexts to ASIDs changes. Even a switch to another memory context does not need a flush when ASIDs are unique.
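One way to picture the suggestion is a small side-band control word that travels with each access, plus TLB entries tagged with an ASID so that a context switch needs no flush. Everything in this sketch (`AccessControl`, `TaggedEntry`, `entry_matches`) is hypothetical and is not the FrontendMemory interface; only the satp.ASID position (bits 30:22 in Sv32) and the meaning of the global bit come from the privileged specification.

```cpp
#include <cstdint>

// Hypothetical side-band signals passed along with each memory access.
struct AccessControl {
    uint16_t asid;    // Sv32 satp.ASID is 9 bits wide
    uint8_t priv;     // 0 = U, 1 = S, 3 = M
    uint8_t reserved; // padding, keeps the struct at 32 bits
};
static_assert(sizeof(AccessControl) == 4, "control word should fit in 32 bits");

// Sv32: satp.ASID occupies bits 30:22.
constexpr uint16_t satp_asid(uint32_t satp) {
    return static_cast<uint16_t>((satp >> 22) & 0x1FFu);
}

struct TaggedEntry {
    bool valid;
    bool global;   // mappings marked global match every ASID
    uint16_t asid;
    uint32_t vpn;
    uint32_t ppn;
};

// An entry hits only for the address space it was filled for, so entries of
// other contexts can stay in the TLB across a context switch.
constexpr bool entry_matches(const TaggedEntry &e, uint32_t vpn, AccessControl ctl) {
    return e.valid && e.vpn == vpn && (e.global || e.asid == ctl.asid);
}
```

With this shape the TLB never has to be told about privilege changes; it just sees the control word of each access.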


#include "common/logging.h"
#include "execute/alu.h"
#include "memory/tlb/tlb.h"
Member

The TLB integration has to be solved in such a way that the internal core logic does not need to know how it works and how it is implemented. The same goes for tests etc.

TLBType type;
const TLBConfig tlb_config;
uint32_t current_satp_raw = 0;
CSR::PrivilegeLevel current_priv_ = CSR::PrivilegeLevel::MACHINE;
Member

The logic should be solved in such a way that this field is not needed.

Member

There can be a use for keeping the last access privilege level and ASID, or something similar, for visualization purposes. But certainly not for real work.

namespace machine {

inline bool is_mode_enabled_in_satp(uint32_t satp_raw) {
return (satp_raw & (1u << 31)) != 0;
Member

I do not like this inline here. It should probably go into the TLB header.

return InstructionFlags(flags_to_check);
}

static CSR::PrivilegeLevel decode_xret_type_from_inst(const Instruction &inst) {
Member

This should be solved some other way. I would suggest not solving this at the decode level at all and leaving the decision to ControlState::read and write, or at least to the memory stage, where the illegal-instruction exception would be raised. It cannot go through the standard exception signal from the CSR; that would be too late. It has to be done by a return value or some other way, such as an optional pointer to a status return. The illegal-instruction exception should be raised even when a write to a read-only register is attempted and even when a non-existent register is addressed. All this information cannot be gathered at the decode stage; it would cost too much.

There is a related discussion about the RISC-V standard, which allows some situations where accesses to non-existent/unspecified CSRs are reserved, but the conclusion is that they should result in an illegal-instruction exception as well, except for some exotic arrangements:

riscv/riscv-isa-manual#1116

// Mark illegal if current privilege is lower than encoded xRET type (e.g. MRET executed in S-mode)
if (state.current_privilege() < inst_xret_priv) {
excause = EXCAUSE_INSN_ILLEGAL;
}
Member

This change should not be needed. It should be solved by manipulating

const InstructionFlags check_inst_flags_val;
const InstructionFlags check_inst_flags_mask;

in set_current_privilege and by annotating the instructions with the required mode (IMF_PRIV_S, IMF_PRIV_H, IMF_PRIV_M) in the decoding tables.
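A minimal sketch of this mask/value technique follows. The flag names IMF_PRIV_S, IMF_PRIV_H and IMF_PRIV_M are taken from the comment; the bit positions, the `IMF_SUPPORTED` bit as used here, and the helpers `check_for` and `is_legal` are assumptions made for illustration and may not match the simulator's actual definitions.

```cpp
#include <cstdint>

// Bit positions are assumed for this sketch only.
enum InstructionFlags : uint32_t {
    IMF_SUPPORTED = 1u << 0, // instruction exists in the configured ISA
    IMF_PRIV_S = 1u << 1,    // requires at least supervisor mode
    IMF_PRIV_H = 1u << 2,    // requires at least hypervisor mode
    IMF_PRIV_M = 1u << 3,    // requires machine mode
};

enum class Priv : uint32_t { U = 0, S = 1, H = 2, M = 3 };

struct FlagCheck { uint32_t mask; uint32_t val; };

// Recomputed on every privilege change: each IMF_PRIV_* bit above the
// current level joins the mask and must read as zero in a legal instruction.
constexpr FlagCheck check_for(Priv p) {
    uint32_t forbidden = 0;
    if (p < Priv::S) { forbidden |= IMF_PRIV_S; }
    if (p < Priv::H) { forbidden |= IMF_PRIV_H; }
    if (p < Priv::M) { forbidden |= IMF_PRIV_M; }
    return {IMF_SUPPORTED | forbidden, IMF_SUPPORTED};
}

// One AND and one compare per instruction, independent of the opcode.
constexpr bool is_legal(uint32_t inst_flags, FlagCheck c) {
    return (inst_flags & c.mask) == c.val;
}
```

The appeal of the design is that the per-instruction check stays a single masked compare, while all privilege-dependent work happens once per mode transition.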

Author

Thank you for your detailed feedback. I appreciate it and have made some changes based on your review. I would be grateful for any further feedback.

jdupak and others added 8 commits November 19, 2025 20:23
Implements a set-associative Translation Lookaside Buffer (TLB)
with replacement policies, Page-Table Walker,
and adds SV32-specific definitions.
Add privilege level mapping to the GUI so the current
hart privilege (UNPRIV, SUPERV, HYPERV, MACHINE) is displayed
in core state visualization.
Extend NewDialog with controls for virtual memory setup,
including TLB number of sets, associativity, and replacement
policy.
Introduce new components for displaying and tracking TLB
state similar to cache. TLBViewBlock and TLBAddressBlock
render per-set and per-way TLB contents, updated on tlb_update
signals. TLBViewScene assembles these views based on associativity.
TLBDock integrates into the GUI, showing hit/miss counts, memory
accesses, stall cycles, hit rate, and speed improvement, with live updates from the TLB.
Introduce an "As CPU (VMA)" access option in the cached
access selector to render memory contents as observed
by the CPU through the frontend interface.
Add a set of small assembly tests that exercise the SV32 page-table walker,
SATP enablement and the new TLB code. The tests create a root page
table and map a virtual page at 0xC4000000, then exercise several scenarios.
The tests verify page-table walker behaviour, SATP switching and TLB
caching/flush logic. Tests were written based on the consultation.
Ensure that TLBs are only updated when the root register is set,
and disable TLB updates while running in Machine mode.
…ion checks

Decode MRET/SRET/URET in the decode stage, carry the return
type through the interstage registers, and pass it
to ControlState::exception_return in the memory stage.
Extend instruction metadata with privilege flags (IMF_PRIV_M/H/S)
for privileged operations and use them for masking.
@nemecad nemecad force-pushed the feature/sv32-vm-tlb-ptw-cleanup branch from b5940e8 to 0fc627f Compare November 19, 2025 19:40

3 participants