[RFC] Introduce Page Cache in NGO #267

@lucassong-mh

  • Feature Name: Introduce Page Cache
  • Start Date: 2022-06-21

Summary

page-cache is a newly designed and implemented crate that will be added to NGO.

Page Cache provides a caching mechanism for block devices. Similar to the Linux buffer cache, its goal is to minimize disk I/O by keeping data (at page/block granularity) in physical memory, so that accesses that would otherwise go to disk can be served from memory.

In NGO, Page Cache sits between the Async FS and the Block Device and caches the data read and written between them. It also uses Rust asynchronous programming to gain better performance.

Background

Please refer to:

Design doc: #238

Async Filesystem: #265

High-level Design

page-cache mainly provides two structs for its users (filesystems): PageCache, which manages pages with an LRU strategy, and CachedDisk, a wrapper that makes the cache convenient to use.
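To make the LRU strategy concrete, here is a minimal sketch of least-recently-used eviction keyed by block ID. It is not the crate's code: `ToyLru` and its methods are hypothetical names, the sketch is synchronous, and it uses a linear scan for simplicity, whereas the real PageCache builds on LruCache.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical toy LRU map from block ID to page contents.
pub struct ToyLru {
    capacity: usize,
    pages: HashMap<u64, Vec<u8>>,
    // Front = least recently used, back = most recently used.
    order: VecDeque<u64>,
}

impl ToyLru {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { capacity, pages: HashMap::new(), order: VecDeque::new() }
    }

    /// Moves `id` to the most-recently-used end of the order list.
    fn touch(&mut self, id: u64) {
        if let Some(pos) = self.order.iter().position(|&k| k == id) {
            let _ = self.order.remove(pos);
        }
        self.order.push_back(id);
    }

    /// Returns the cached page, if present, and marks it most recently used.
    pub fn get(&mut self, id: u64) -> Option<&Vec<u8>> {
        if self.pages.contains_key(&id) {
            self.touch(id);
        }
        self.pages.get(&id)
    }

    /// Inserts a page and returns the ID of the evicted block, if any.
    pub fn put(&mut self, id: u64, page: Vec<u8>) -> Option<u64> {
        let mut evicted = None;
        // Evict only when a new key would exceed the capacity.
        if !self.pages.contains_key(&id) && self.pages.len() == self.capacity {
            if let Some(victim) = self.order.pop_front() {
                self.pages.remove(&victim);
                evicted = Some(victim);
            }
        }
        self.pages.insert(id, page);
        self.touch(id);
        evicted
    }
}
```

With a capacity of two, inserting blocks 1 and 2, reading block 1, and then inserting block 3 evicts block 2, because block 1 was touched more recently.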

page-cache

API

Public types:

PageCache<K: PageKey, A: PageAlloc>: Manage the cached pages for a domain of keys (e.g., block IDs). Built mainly on LruCache.

PageState: Indicate the state of a cached page.

PageHandle<K: PageKey, A: PageAlloc>: The handle to a cached page acquired from the page cache. Further operations on the page (such as changing its state or reading/writing its content) must first call lock() to get the corresponding PageHandleGuard.

FixedSizePageAlloc: A page allocator with fixed total size.

CachedDisk<A: PageAlloc>: A virtual disk with a backing disk and a page cache. Let filesystems access the page cache just as they would access the disk. Define a CachedDiskFlusher (an implementation of PageCacheFlusher) for the inner page cache.
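The handle-then-lock pattern described for PageHandle can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Toy*` names, the three state variants, and the use of a blocking `std::sync::Mutex` (the real crate is asynchronous and its PageState variants are not listed in this RFC).

```rust
use std::sync::{Arc, Mutex, MutexGuard};

/// Hypothetical page states; the real `PageState` variants may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToyPageState {
    Uninit,
    UpToDate,
    Dirty,
}

struct Inner {
    state: ToyPageState,
    data: Vec<u8>,
}

/// Cheaply clonable handle; all clones refer to the same page.
#[derive(Clone)]
pub struct ToyPageHandle {
    key: u64,
    inner: Arc<Mutex<Inner>>,
}

/// Guard that grants access to the page while the lock is held.
pub struct ToyPageGuard<'a>(MutexGuard<'a, Inner>);

impl ToyPageHandle {
    pub fn new(key: u64, size: usize) -> Self {
        let inner = Inner { state: ToyPageState::Uninit, data: vec![0; size] };
        Self { key, inner: Arc::new(Mutex::new(inner)) }
    }

    pub fn key(&self) -> u64 {
        self.key
    }

    /// State and content are reachable only through the returned guard.
    pub fn lock(&self) -> ToyPageGuard<'_> {
        ToyPageGuard(self.inner.lock().unwrap())
    }
}

impl ToyPageGuard<'_> {
    pub fn state(&self) -> ToyPageState {
        self.0.state
    }

    pub fn set_state(&mut self, state: ToyPageState) {
        self.0.state = state;
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.0.data
    }

    pub fn as_slice_mut(&mut self) -> &mut [u8] {
        &mut self.0.data
    }
}
```

Routing every state change and content access through the guard means a page cannot be observed half-updated: a writer fills the buffer and sets the state to Dirty under a single lock acquisition.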

Private types:

Page<A: PageAlloc>: A block of memory (same size as BLOCK_SIZE) obtained from an allocator which implements PageAlloc.

PageEvictor<K: PageKey, A: PageAlloc>: Spawn a task to flush and evict pages of all instances of PageCache<K, A> when the memory of A is low.

Traits:

PageAlloc: A trait for a page allocator that can monitor the amount of free memory.

PageCacheFlusher: Allow the owner (CachedDisk or filesystem) of a page cache to supply its own I/O logic for flushing the dirty pages of the page cache.

PageKey: A trait to define the domain of keys for a page cache.
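As a rough illustration of how a key trait and a memory-monitoring allocator trait could be shaped, consider the sketch below. The trait and method names (`ToyPageKey`, `ToyPageAlloc`, `try_alloc`, `free`, `is_memory_low`) and the 10% low-memory threshold are assumptions, not the crate's actual definitions; see the cargo doc for the real signatures.

```rust
use std::fmt::Debug;
use std::hash::Hash;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for `PageKey`: anything usable as a cache key.
pub trait ToyPageKey: Copy + Eq + Hash + Debug + Send + Sync + 'static {}
impl ToyPageKey for u64 {}

/// Hypothetical stand-in for `PageAlloc`: hands out pages and reports pressure.
pub trait ToyPageAlloc {
    /// Tries to reserve one page; returns false when the budget is exhausted.
    fn try_alloc(&self) -> bool;
    /// Returns one page to the budget.
    fn free(&self);
    /// Reports whether free memory is running low.
    fn is_memory_low(&self) -> bool;
}

/// Allocator with a fixed total number of pages, counted atomically.
pub struct ToyFixedAlloc {
    total: usize,
    used: AtomicUsize,
}

impl ToyFixedAlloc {
    pub fn new(total_pages: usize) -> Self {
        Self { total: total_pages, used: AtomicUsize::new(0) }
    }
}

impl ToyPageAlloc for ToyFixedAlloc {
    fn try_alloc(&self) -> bool {
        // Increment only while the count is below the budget.
        self.used
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |u| {
                if u < self.total { Some(u + 1) } else { None }
            })
            .is_ok()
    }

    fn free(&self) {
        // checked_sub stops a stray free from underflowing the counter.
        let _ = self
            .used
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |u| u.checked_sub(1));
    }

    fn is_memory_low(&self) -> bool {
        // Arbitrary threshold for this sketch: fewer than 10% of pages free.
        let free = self.total - self.used.load(Ordering::Acquire);
        free * 10 < self.total
    }
}
```

A background evictor could poll `is_memory_low()` and flush and evict pages until it returns false, which is the role the RFC assigns to PageEvictor.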

Detail-level explanation

See the cargo doc of page-cache.

Performance improvement

fio-pagecache

AFS+PageCache outperforms SEFS by 110.9% on average.
The seq-write result is especially strong thanks to the batch write-back optimization in CachedDisk's flush().
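One way batch write-back can help sequential writes is by merging adjacent dirty blocks into a single larger request. The sketch below shows only that grouping step under this assumption; `coalesce_dirty_blocks` is a hypothetical helper, and the actual batching logic in CachedDisk's flush() may differ.

```rust
/// Groups dirty block IDs into contiguous (start_block, block_count) runs,
/// so that each run could be written back with one request.
pub fn coalesce_dirty_blocks(mut ids: Vec<u64>) -> Vec<(u64, usize)> {
    // Sort and drop duplicates so adjacent IDs end up next to each other.
    ids.sort_unstable();
    ids.dedup();

    let mut runs: Vec<(u64, usize)> = Vec::new();
    for id in ids {
        if let Some((start, len)) = runs.last_mut() {
            // Extend the current run when `id` directly follows its last block.
            if *start + *len as u64 == id {
                *len += 1;
                continue;
            }
        }
        // Otherwise start a new run of length one.
        runs.push((id, 1));
    }
    runs
}
```

For example, dirty blocks 5, 1, 2, 3, 9, 10 collapse into three runs: (1, 3), (5, 1), and (9, 2). A fully sequential write therefore becomes a single run.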

Future work

  • Batch-read blocks from the block device in CachedDisk's read().
  • Two-List Strategy (aka LRU/2): addresses the only-used-once failure of plain LRU, where pages that are touched only once push out frequently used pages. Keep two lists: the active list and the inactive list. Pages on the active list are considered "hot" and are not available for eviction. Pages on the inactive list are available for cache eviction.
  • Implement BlockDevice for CachedDisk.
  • Try to integrate memory-mapped files.
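The two-list strategy listed above can be sketched as follows. This is a simplified model under stated assumptions: `TwoListLru`, `access`, `evict`, and the `max_active` bound are hypothetical names, only keys are tracked, and promotion happens on the second access.

```rust
use std::collections::VecDeque;

/// Hypothetical two-list tracker; front of each list is the oldest entry.
pub struct TwoListLru {
    active: VecDeque<u64>,
    inactive: VecDeque<u64>,
    max_active: usize,
}

impl TwoListLru {
    pub fn new(max_active: usize) -> Self {
        Self { active: VecDeque::new(), inactive: VecDeque::new(), max_active }
    }

    /// Records an access to page `id`.
    pub fn access(&mut self, id: u64) {
        if let Some(pos) = self.active.iter().position(|&k| k == id) {
            // Already hot: refresh its position on the active list.
            let _ = self.active.remove(pos);
            self.active.push_back(id);
        } else if let Some(pos) = self.inactive.iter().position(|&k| k == id) {
            // Second access: promote from inactive to active.
            let _ = self.inactive.remove(pos);
            self.active.push_back(id);
            // Keep the active list bounded by demoting its oldest entry.
            if self.active.len() > self.max_active {
                if let Some(old) = self.active.pop_front() {
                    self.inactive.push_back(old);
                }
            }
        } else {
            // First access: the page starts out on the inactive list.
            self.inactive.push_back(id);
        }
    }

    /// Picks an eviction victim from the inactive list only.
    pub fn evict(&mut self) -> Option<u64> {
        self.inactive.pop_front()
    }
}
```

In this model, a page accessed twice survives a scan of many pages that are each accessed once, since the scanned pages only ever enter the inactive list and are the only eviction candidates.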
