Persistent memory ranges #370

@diegonehab

Overview

Add a --persistent-memory CLI option. Unlike --flash-drive, it does not expose the range as a block device (/dev/pmemN). Instead, it uses Linux's UIO (Userspace I/O) framework to give userspace programs direct access to the physical memory range. Writes go straight to physical memory -- no page cache, no msync.

Also migrate the cmio rx/tx buffers to UIO.

cmio buffer migration

There are two options for the cmio rx/tx buffers:

Option A: Leave cmio as-is. The cmio driver keeps its current device tree node, buffer management (remap_pfn_range, IOCTL_CMIO_SETUP), and exclusive access control. Only persistent memories use UIO. This is the conservative choice -- no risk, no driver changes, but the cmio buffer access path remains separate from persistent memory.

Option B: Migrate cmio buffers to UIO. Replace the cmio device tree node with two generic-uio nodes. The cmio driver drops all buffer management and shrinks to just yield. Buffer discovery unifies with flash drives and persistent memories under /run/cartesi-label/. The tradeoff is loss of exclusive access: UIO has no built-in exclusivity mechanism, so any process with sufficient permissions can open and mmap the buffers concurrently. In practice this is low risk -- the emulator runs a single application and well-behaved guest software has no reason to map cmio buffers from multiple processes -- but it is a real loss compared to today's driver.

Design

memory_range_config changes (machine-config.h)

Add a label field to memory_range_config:

struct memory_range_config final {
    std::string label;                         ///< Label for identification (NEW)
    uint64_t start{0xffffffffffffffffUL};      ///< Memory range start position
    uint64_t length{0xffffffffffffffffUL};     ///< Memory range length
    bool read_only{false};                     ///< Make memory range read-only to host
    backing_store_config backing_store;        ///< Backing store
};

Flash drives and persistent memories both use this struct. The label is written into the device tree (ctsi,label) and used for the info file under /run/cartesi-label/.

Labels are mandatory for both flash drives and persistent memories. They must satisfy the following constraints, all validated by the machine constructor:

  • Non-empty
  • Characters restricted to alphanumeric ([a-zA-Z0-9]), hyphen (-), and underscore (_)
  • Must not start with "ctsi" -- this prefix is reserved for internal use (e.g., ctsi-cmio-rx-buffer, ctsi-cmio-tx-buffer)
  • Must be unique across all flash drives and persistent memories
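
The constraints above could be checked along the following lines. This is a minimal sketch, not the machine constructor's actual code: the names validate_label, validate_labels and is_label_char are hypothetical, and the labels are passed as plain string vectors rather than as memory_range_config entries.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: true if `c` is allowed in a label ([a-zA-Z0-9], '-', '_').
static bool is_label_char(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '-' || c == '_';
}

// Throws std::invalid_argument if a single label breaks one of the rules.
void validate_label(const std::string &label) {
    if (label.empty()) {
        throw std::invalid_argument("label must not be empty");
    }
    for (char c : label) {
        if (!is_label_char(c)) {
            throw std::invalid_argument("label '" + label + "' has an invalid character");
        }
    }
    // rfind(prefix, 0) == 0 is a "starts with" test.
    if (label.rfind("ctsi", 0) == 0) {
        throw std::invalid_argument("label '" + label + "' uses the reserved prefix 'ctsi'");
    }
}

// Validates every label and rejects duplicates across both lists.
void validate_labels(const std::vector<std::string> &flash_drive_labels,
                     const std::vector<std::string> &persistent_memory_labels) {
    std::set<std::string> seen;
    for (const auto *labels : {&flash_drive_labels, &persistent_memory_labels}) {
        for (const auto &label : *labels) {
            validate_label(label);
            if (!seen.insert(label).second) {
                throw std::invalid_argument("duplicate label '" + label + "'");
            }
        }
    }
}
```

Sharing one `seen` set between the two lists is what makes uniqueness hold across flash drives and persistent memories, not just within each kind.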

New PMA device ID (pmas-defines.h / pmas-constants.h)

Add a new PMA_ISTART_DID for persistent memory. In pmas-defines.h:

#define PMA_PERSISTENT_MEMORY_DID_DEF 11    ///< Device ID for persistent memory

In pmas-constants.h, add to the PMA_ISTART_DID enum:

persistent_memory = PMA_PERSISTENT_MEMORY_DID_DEF,  ///< DID for persistent memory

This distinguishes persistent memory from flash drives (PMA_FLASH_DRIVE_DID_DEF = 3) in the PMA istart flags, which matters for the state access layer and step verification.

Device tree changes (dtb.cpp)

Flash drives -- add ctsi,label

Currently (lines 184-190):

for (const auto &f : c.flash_drive) {
    fdt.begin_node_num("pmem", f.start);
    fdt.prop_string("compatible", "pmem-region");
    fdt.prop_u64_list<2>("reg", {f.start, f.length});
    fdt.prop_empty("volatile");
    fdt.end_node();
}

Change: add ctsi,label to each node:

for (const auto &f : c.flash_drive) {
    fdt.begin_node_num("pmem", f.start);
    fdt.prop_string("compatible", "pmem-region");
    fdt.prop_u64_list<2>("reg", {f.start, f.length});
    fdt.prop_empty("volatile");
    fdt.prop_string("ctsi,label", f.label);
    fdt.end_node();
}

Persistent memory -- new UIO nodes

Add a loop for persistent memory entries, generating generic-uio nodes:

for (const auto &m : c.persistent_memory) {
    fdt.begin_node_num("uio", m.start);
    fdt.prop_string("compatible", "generic-uio");
    fdt.prop_u64_list<2>("reg", {m.start, m.length});
    fdt.prop_string("ctsi,label", m.label);
    fdt.end_node();
}

cmio rx/tx buffers (Option B only)

If Option B is chosen, replace the current cmio device tree block (lines 192-207) with UIO nodes:

fdt.begin_node_num("uio", AR_CMIO_RX_BUFFER_START);
fdt.prop_string("compatible", "generic-uio");
fdt.prop_u64_list<2>("reg", {AR_CMIO_RX_BUFFER_START, AR_CMIO_RX_BUFFER_LENGTH});
fdt.prop_string("ctsi,label", "ctsi-cmio-rx-buffer");
fdt.end_node();

fdt.begin_node_num("uio", AR_CMIO_TX_BUFFER_START);
fdt.prop_string("compatible", "generic-uio");
fdt.prop_u64_list<2>("reg", {AR_CMIO_TX_BUFFER_START, AR_CMIO_TX_BUFFER_LENGTH});
fdt.prop_string("ctsi,label", "ctsi-cmio-tx-buffer");
fdt.end_node();

The cmio parent node with compatible = "ctsi-cmio" is removed. The yield node (lines 210-220) is unrelated and stays as-is.

CARTESI_MACHINE_IO_DRIVER changes (cmio.c) (Option B only)

If Option A is chosen, the cmio driver is unchanged.

If Option B is chosen:

  1. Remove buffer management: Drop struct cmio_buffer, struct cmio_setup, setup_buffer(), cmio_mmap(), and IOCTL_CMIO_SETUP.

  2. Keep yield functionality: IOCTL_CMIO_YIELD and the yield device tree node remain unchanged. The driver should switch to matching ctsi-yield since the cmio parent node is removed.

  3. Userspace impact: Programs discover buffers through /run/cartesi-label/ctsi-cmio-rx-buffer and /run/cartesi-label/ctsi-cmio-tx-buffer, then open the corresponding /dev/uioN and mmap directly.
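
From the guest program's side, "open the corresponding /dev/uioN and mmap directly" might look like the sketch below. The helper names map_range and unmap_range are hypothetical. The device path and length are assumed to come from the label file, and offset 0 is used on the assumption that each generic-uio node exposes its single reg entry as the first UIO mapping (map0).

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Maps `length` bytes of the device at `path` (e.g. "/dev/uio0") into the
// calling process. UIO selects mapping N with offset N * page size, so
// offset 0 is map0.
void *map_range(const std::string &path, std::size_t length, bool writable) {
    const int fd = open(path.c_str(), writable ? O_RDWR : O_RDONLY);
    if (fd < 0) {
        throw std::runtime_error("cannot open " + path);
    }
    const int prot = PROT_READ | (writable ? PROT_WRITE : 0);
    void *p = mmap(nullptr, length, prot, MAP_SHARED, fd, 0);
    close(fd); // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) {
        throw std::runtime_error("cannot mmap " + path);
    }
    return p;
}

// Releases a mapping obtained from map_range.
void unmap_range(void *p, std::size_t length) {
    munmap(p, length);
}
```

A program would call something like `map_range("/dev/uio0", length, true)`, use the returned pointer as ordinary memory, and call `unmap_range` when done.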

cartesi-machine.lua changes

Replace /run/drive-label/ with /run/cartesi-label/

Currently (lines 2151-2158), the init script creates per-device label files:

do -- create a map of the label in /run/drive-label for flashdrive tool
    local cmd = table.concat({
        'busybox mkdir -p /run/drive-label && echo "',
        label,
        '" > /run/drive-label/',
        devname,
    })
    config.dtb.init = config.dtb.init .. cmd .. "\n"
end

Replace with per-label files under /run/cartesi-label/. Each file is named by label and contains <device> <start> <length>.

Emit busybox mkdir -p /run/cartesi-label once before any entries.

Flash drive entries

For each flash drive:

local cmd = string.format(
    'echo "/dev/%s 0x%x 0x%x" > "/run/cartesi-label/%s"',
    devname,
    config.flash_drive[#config.flash_drive].start,
    config.flash_drive[#config.flash_drive].length,
    label
)
config.dtb.init = config.dtb.init .. cmd .. "\n"

Persistent memory entries

For each persistent memory (new --persistent-memory option):

local devname = "uio" .. uio_index
local cmd = string.format(
    'echo "/dev/%s 0x%x 0x%x" > "/run/cartesi-label/%s"',
    devname,
    config.persistent_memory[#config.persistent_memory].start,
    config.persistent_memory[#config.persistent_memory].length,
    label
)
config.dtb.init = config.dtb.init .. cmd .. "\n"

UIO device permissions

UIO device files are created with root-only permissions by default. The init script must set permissions so userspace programs can access them.

For persistent memory entries, based on the read_only flag:

if pm_read_only[label] then
    config.dtb.init = config.dtb.init .. string.format('chmod 0444 /dev/%s\n', devname)
else
    config.dtb.init = config.dtb.init .. string.format('chmod 0666 /dev/%s\n', devname)
end

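This enforces read-only at the guest kernel level for unprivileged processes: a process that can open the UIO device only for reading cannot create a shared writable mapping, because mmap with MAP_SHARED and PROT_WRITE requires a descriptor opened read-write. Root bypasses file modes, and even without the chmod the emulator would still enforce read-only via PMA flags (store access fault), but chmod gives a cleaner error path.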

cmio buffer entries (Option B only)

If Option B is chosen, also set permissions and create label files for cmio buffers:

-- uio_index_rx / uio_index_tx are the UIO indices assigned to the cmio buffers
config.dtb.init = config.dtb.init .. string.format('chmod 0666 /dev/uio%d\n', uio_index_rx)
config.dtb.init = config.dtb.init .. string.format('chmod 0666 /dev/uio%d\n', uio_index_tx)
config.dtb.init = config.dtb.init .. string.format(
    'echo "/dev/uio%d 0x%x 0x%x" > /run/cartesi-label/ctsi-cmio-rx-buffer\n',
    uio_index_rx, AR_CMIO_RX_BUFFER_START, AR_CMIO_RX_BUFFER_LENGTH
)
config.dtb.init = config.dtb.init .. string.format(
    'echo "/dev/uio%d 0x%x 0x%x" > /run/cartesi-label/ctsi-cmio-tx-buffer\n',
    uio_index_tx, AR_CMIO_TX_BUFFER_START, AR_CMIO_TX_BUFFER_LENGTH
)

UIO device index tracking

UIO devices are numbered by the kernel in probe order. With Option A, all UIO devices are persistent memories (uio0..N in config order). With Option B, cmio buffers come first (cmio rx, cmio tx, then persistent memories in config order), since the device tree is generated in that order and probe order follows DTB order for generic-uio nodes.
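
The numbering rule can be written down as a small function. This is only an illustration of the ordering described above, assuming the kernel really does probe in DTB order. The function name assign_uio_indices and the migrate_cmio flag (true for Option B) are hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

// Returns the expected /dev/uioN index for each label.
// With migrate_cmio (Option B), the two cmio buffers take uio0 and uio1
// and persistent memories follow in config order. Otherwise (Option A),
// persistent memories start at uio0.
std::map<std::string, int> assign_uio_indices(
    const std::vector<std::string> &persistent_memory_labels, bool migrate_cmio) {
    std::map<std::string, int> index;
    int next = 0;
    if (migrate_cmio) {
        index["ctsi-cmio-rx-buffer"] = next++;
        index["ctsi-cmio-tx-buffer"] = next++;
    }
    for (const auto &label : persistent_memory_labels) {
        index[label] = next++;
    }
    return index;
}
```

Whatever computes the indices on the host side has to agree with the order in which dtb.cpp emits the nodes, since the label files are written before the guest kernel has probed anything.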

Label validation

The "ctsi" prefix is reserved for internal use, and labels must be unique across all flash drives and persistent memories. The machine constructor (C++ side) must reject labels that start with "ctsi" or that are duplicated.

New --persistent-memory CLI option

Parse --persistent-memory=<key>:<value>[,...] with keys:

  • label (required) -- the name used in ctsi,label and the info file
  • data_filename -- backing file
  • dht_filename -- dense hash tree file
  • dpt_filename -- dirty page tree file
  • shared -- whether changes persist to backing file
  • create -- create backing file if missing
  • truncate -- truncate backing file to correct size
  • read_only -- make read-only
  • user -- ownership for the /dev/uioN device
  • start -- explicit starting memory address
  • length -- explicit length

No mke2fs or mount keys (no filesystem layer).
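
As an illustration of the option syntax, the sketch below splits the argument into key/value pairs and rejects keys outside the list above. The real parser lives in cartesi-machine.lua, so this C++ version only mirrors the intended behaviour. The name parse_persistent_memory is hypothetical, and treating a bare key such as `shared` as "true" is an assumption.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>

// Parses "<key>:<value>[,...]" into a map of strings.
std::map<std::string, std::string> parse_persistent_memory(const std::string &arg) {
    static const std::set<std::string> known = {
        "label", "data_filename", "dht_filename", "dpt_filename", "shared", "create",
        "truncate", "read_only", "user", "start", "length"};
    std::map<std::string, std::string> opts;
    std::istringstream in(arg);
    std::string item;
    while (std::getline(in, item, ',')) {
        if (item.empty()) {
            continue;
        }
        // Split on the first ':' only, so values may contain further colons.
        const auto colon = item.find(':');
        const std::string key = item.substr(0, colon);
        if (known.count(key) == 0) {
            throw std::invalid_argument("unknown key '" + key + "'");
        }
        // Assumption: a key given without a value acts as a boolean flag.
        opts[key] = (colon == std::string::npos) ? "true" : item.substr(colon + 1);
    }
    if (opts.count("label") == 0 || opts["label"].empty()) {
        throw std::invalid_argument("label is required");
    }
    return opts;
}
```

Because mke2fs and mount are not in the known set, passing either one fails the same way any other unknown key does.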

Guest-side flashdrive / persistentmemory utilities

Current state

The existing flashdrive script (context/machine-guest-tools/sys-utils/misc/flashdrive) takes a label argument and scans /dev/pmem* entries against /run/drive-label/ to find the matching device. It prints the device path and exits.

New design

Replace flashdrive with a single memoryrange script that acts as both flashdrive and persistentmemory, depending on the name it is invoked as (busybox-style argv[0] dispatch).

The script reads from /run/cartesi-label/<label>, which contains <device> <start> <length>.

Output modes (selected by flag):

  • --device -- print device path (default if no flag given)
  • --start -- print start address
  • --length -- print length

Invocation name determines device filtering:

  • flashdrive -- only matches /dev/pmem* devices
  • persistentmemory -- only matches /dev/uio* devices

If the label exists in /run/cartesi-label/ but the device doesn't match the expected type, the command fails (exit 1). This prevents flashdrive mylabel from returning a UIO device and vice versa.
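
The lookup and filtering rules can be sketched as follows. The actual tool is a script, so this C++ version only models its logic. The names range_info, parse_info and lookup are hypothetical, and printing --start and --length in hexadecimal is an assumption. The caller is expected to have read the single line of /run/cartesi-label/<label> already.

```cpp
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>

struct range_info {
    std::string device;
    uint64_t start{0};
    uint64_t length{0};
};

// Parses "<device> <start> <length>". Returns false on malformed input.
bool parse_info(const std::string &line, range_info &info) {
    std::istringstream in(line);
    std::string start, length, extra;
    if (!(in >> info.device >> start >> length) || (in >> extra)) {
        return false;
    }
    try {
        // Base 0 accepts both decimal and 0x-prefixed hexadecimal.
        info.start = std::stoull(start, nullptr, 0);
        info.length = std::stoull(length, nullptr, 0);
    } catch (const std::exception &) {
        return false;
    }
    return true;
}

// Returns false (exit 1 in the script) if the invocation name is unknown,
// the line is malformed, or the device is not of the expected type.
bool lookup(const std::string &invoked_as, const std::string &info_line,
            const std::string &flag, std::string &out) {
    std::string prefix;
    if (invoked_as == "flashdrive") {
        prefix = "/dev/pmem";
    } else if (invoked_as == "persistentmemory") {
        prefix = "/dev/uio";
    } else {
        return false;
    }
    range_info info;
    if (!parse_info(info_line, info) || info.device.rfind(prefix, 0) != 0) {
        return false;
    }
    std::ostringstream s;
    if (flag == "--device") {
        s << info.device;
    } else if (flag == "--start") {
        s << "0x" << std::hex << info.start;
    } else if (flag == "--length") {
        s << "0x" << std::hex << info.length;
    } else {
        return false;
    }
    out = s.str();
    return true;
}
```

The device-type check runs before any output is produced, so `flashdrive mylabel` never prints a /dev/uio* path even when the label file exists.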

File layout (context/machine-guest-tools/sys-utils/misc/):

  • memoryrange -- the actual script
  • flashdrive -- symlink to memoryrange
  • persistentmemory -- symlink to memoryrange

Makefile changes: Install memoryrange and create the two symlinks in $(DESTDIR)$(PREFIX)/bin/.

Guest kernel config

Add to context/linux/arch/riscv/configs/cartesi_defconfig:

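CONFIG_UIO=y

Note: CONFIG_UIO=y builds only the UIO core. Binding the generic-uio device tree nodes likely also needs the platform driver (CONFIG_UIO_PDRV_GENIRQ=y). In mainline, that driver takes its compatible string from the of_id module parameter (for example, uio_pdrv_genirq.of_id=generic-uio on the kernel command line), so this needs to be set unless the Cartesi kernel handles it some other way.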

Rejected alternatives

  • DAX on pmem: Unstable on RISC-V (ZONE_DEVICE patch/revert cycles through kernel 6.15)
  • Raw pmem without DAX: Requires msync due to page cache
  • /dev/mem: Blocked by CONFIG_STRICT_DEVMEM=y; no address discovery mechanism
  • Custom kernel driver: Would support lseek for size, but adds maintenance burden; UIO is mainline
