[![Build](https://github.com/cacheMon/libCacheSim-python/actions/workflows/build.yml/badge.svg)](https://github.com/cacheMon/libCacheSim-python/actions/workflows/build.yml)
[![Documentation](https://github.com/cacheMon/libCacheSim-python/actions/workflows/docs.yml/badge.svg)](https://docs.libcachesim.com/python)

Python bindings for [libCacheSim](https://github.com/1a1a11a/libCacheSim), a high-performance cache simulator and analysis library.

libCacheSim is fast, inheriting the performance of the [underlying libCacheSim library](https://github.com/1a1a11a/libCacheSim):

- **High performance** - over 20M requests/sec when replaying realistic traces
- **High memory efficiency** - predictable and small memory footprint
- **Parallelism out of the box** - uses many CPU cores to speed up trace analysis and cache simulations

libCacheSim is also flexible and easy to use, providing:

- **Seamless integration** with the [open-source cache dataset](https://github.com/cacheMon/cache_dataset), which consists of thousands of traces hosted on S3
- **High-throughput simulation** backed by the [underlying libCacheSim library](https://github.com/1a1a11a/libCacheSim)
- **Detailed control** over cache requests and other internal data
- **Custom plugin cache development** without any compilation

## Prerequisites

- OS: Linux / macOS
- Python: 3.9 -- 3.13

## Installation

### Quick Install

Binary installers for the latest released version are available at the [Python Package Index (PyPI)](https://pypi.org/project/libcachesim).

```bash
pip install libcachesim
```

### Recommended Installation with uv

We recommend [uv](https://docs.astral.sh/uv/), a very fast Python environment manager, for creating and managing Python environments:

```bash
uv venv --python 3.12 --seed
source .venv/bin/activate
uv pip install libcachesim
```

### Advanced Features Installation

For users who want to run the LRB, ThreeLCache, and GLCache eviction algorithms:

!!! important
    If `uv` cannot find prebuilt wheels for your machine, the build system skips these algorithms by default.

To enable them, first install all third-party dependencies:

```bash
git clone https://github.com/cacheMon/libCacheSim-python.git
cd libCacheSim-python
bash scripts/install_deps.sh

# If you cannot install software directly (e.g., no sudo access)
bash scripts/install_deps_user.sh
```

Then reinstall libcachesim with the following commands (you may need to add `--no-cache-dir` to force a build from scratch):

```bash
# Enable LRB
CMAKE_ARGS="-DENABLE_LRB=ON" uv pip install libcachesim
# Enable ThreeLCache
CMAKE_ARGS="-DENABLE_3L_CACHE=ON" uv pip install libcachesim
# Enable GLCache
CMAKE_ARGS="-DENABLE_GLCACHE=ON" uv pip install libcachesim
```

### Installation from sources

If there are no wheels suitable for your environment, consider building from source.
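
A typical from-source build looks roughly like the following; treat it as a sketch rather than the project's canonical instructions (the editable `pip install -e .` step is an assumption):

```bash
# Clone the repository and build the extension locally (sketch)
git clone https://github.com/cacheMon/libCacheSim-python.git
cd libCacheSim-python
pip install -e .        # assumption: pip drives the CMake-based build backend

# Run the test suite to verify the build
python -m pytest tests/
```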

## Quick Start

### Cache Simulation

With libcachesim installed, you can run a cache simulation with any supported eviction algorithm on a cache trace:

```python
import libcachesim as lcs

# Step 1: Get one trace from the S3 bucket
URI = "cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst"
dl = lcs.DataLoader()
dl.load(URI)

# Step 2: Open the trace
reader = lcs.TraceReader(
    trace=dl.get_cache_path(URI),
    trace_type=lcs.TraceType.ORACLE_GENERAL_TRACE,
    reader_init_params=lcs.ReaderInitParam(ignore_obj_size=False)
)

# Step 3: Initialize a cache
cache = lcs.S3FIFO(cache_size=1024 * 1024)

# Step 4: Process the entire trace efficiently (C++ backend)
obj_miss_ratio, byte_miss_ratio = cache.process_trace(reader)
print(f"Object miss ratio: {obj_miss_ratio:.4f}, Byte miss ratio: {byte_miss_ratio:.4f}")

# Step 4.1: Process only a limited number of requests
cache = lcs.S3FIFO(cache_size=1024 * 1024)
obj_miss_ratio, byte_miss_ratio = cache.process_trace(
    reader,
    start_req=0,
    max_req=1000
)
print(f"Object miss ratio: {obj_miss_ratio:.4f}, Byte miss ratio: {byte_miss_ratio:.4f}")
```

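Because `process_trace` returns the miss ratios directly, sweeping several eviction algorithms over the same trace is a short loop. Continuing from the snippet above, here is a sketch that assumes each cache class accepts `cache_size` the way `S3FIFO` does and that the reader can be reused across runs, as in Step 4.1:

```python
# Sweep a few built-in eviction algorithms over the same trace (sketch)
for algo in (lcs.LRU, lcs.S3FIFO):
    cache = algo(cache_size=1024 * 1024)
    obj_mr, byte_mr = cache.process_trace(reader)
    print(f"{algo.__name__}: object miss ratio {obj_mr:.4f}, byte miss ratio {byte_mr:.4f}")
```
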
### Basic Usage

You can also drive a cache one request at a time. The setup below is a sketch that assumes `lcs.LRU` and a `Request` with assignable `obj_id` and `obj_size` fields:

```python
import libcachesim as lcs

cache = lcs.LRU(cache_size=1024 * 1024)  # assumed: any eviction algorithm works here

req = lcs.Request()   # assumed: default-constructible Request
req.obj_id = 1
req.obj_size = 100

print(cache.get(req))  # False (first access)
print(cache.get(req))  # True (second access)
```

### Trace Analysis

Here is an example demonstrating how to use `TraceAnalyzer`:

```python
import libcachesim as lcs

# Step 1: Get one trace from the S3 bucket
URI = "cache_dataset_oracleGeneral/2007_msr/msr_hm_0.oracleGeneral.zst"
dl = lcs.DataLoader()
dl.load(URI)

# Step 2: Open the trace
reader = lcs.TraceReader(
    trace=dl.get_cache_path(URI),
    trace_type=lcs.TraceType.ORACLE_GENERAL_TRACE,
    reader_init_params=lcs.ReaderInitParam(ignore_obj_size=False)
)

# Step 3: Configure the analysis
analysis_option = lcs.AnalysisOption(
    req_rate=True,                   # Keep basic request rate analysis
    access_pattern=False,            # Disable access pattern analysis
    size=True,                       # Keep size analysis
    reuse=False,                     # Disable reuse analysis for small datasets
    popularity=False,                # Disable popularity analysis for small datasets (< 200 objects)
    ttl=False,                       # Disable TTL analysis
    popularity_decay=False,          # Disable popularity decay analysis
    lifetime=False,                  # Disable lifetime analysis
    create_future_reuse_ccdf=False,  # Disable experimental features
    prob_at_age=False,               # Disable experimental features
    size_change=False,               # Disable size change analysis
)

analysis_param = lcs.AnalysisParam()

# Step 4: Create the analyzer and run it
analyzer = lcs.TraceAnalyzer(
    reader, "example_analysis", analysis_option=analysis_option, analysis_param=analysis_param
)

analyzer.run()
```

## Plugin System

libCacheSim lets you develop your own cache eviction algorithms and test them through the plugin system, with no C/C++ compilation required.

### Plugin Cache Overview

The `PluginCache` allows you to define custom caching behavior through Python callback functions. You need to implement the following callbacks (their signatures match the LRU example below):

| Hook | Signature | Description |
|------|-----------|-------------|
| `init_hook` | `(params: CommonCacheParams) -> Any` | Create and return the cache's internal data structure |
| `hit_hook` | `(data: Any, req: Request) -> None` | Update state on a cache hit |
| `miss_hook` | `(data: Any, req: Request) -> None` | Insert the requested object on a cache miss |
| `eviction_hook` | `(data: Any, req: Request) -> int` | Choose and return the object id to evict |
| `remove_hook` | `(data: Any, obj_id: int) -> None` | Clean up when an object is removed |
| `free_hook` | `(data: Any) -> None` | [Optional] Final cleanup |

### Example: Implementing LRU via the Plugin System

```python
from collections import OrderedDict
from typing import Any

from libcachesim import PluginCache, LRU, CommonCacheParams, Request, SyntheticReader

def init_hook(_: CommonCacheParams) -> Any:
    # Cache state: an ordered dict mapping obj_id -> obj_size in LRU order
    return OrderedDict()

def hit_hook(data: Any, req: Request) -> None:
    # Move the hit object to the MRU position
    data.move_to_end(req.obj_id, last=True)

def miss_hook(data: Any, req: Request) -> None:
    # Insert the missed object at the MRU position
    data[req.obj_id] = req.obj_size

def eviction_hook(data: Any, _: Request) -> int:
    # Evict the least recently used object and return its id
    return data.popitem(last=False)[0]

def remove_hook(data: Any, obj_id: int) -> None:
    data.pop(obj_id, None)

def free_hook(data: Any) -> None:
    data.clear()

plugin_lru_cache = PluginCache(
    cache_size=128,
    cache_init_hook=init_hook,
    cache_hit_hook=hit_hook,
    cache_miss_hook=miss_hook,
    cache_eviction_hook=eviction_hook,
    cache_remove_hook=remove_hook,
    cache_free_hook=free_hook,
    cache_name="Plugin_LRU",
)

# Compare the plugin cache against the built-in LRU on a synthetic trace
reader = SyntheticReader(num_objects=1000, num_of_req=10000, obj_size=1)
req_miss_ratio, byte_miss_ratio = plugin_lru_cache.process_trace(reader)
ref_req_miss_ratio, ref_byte_miss_ratio = LRU(128).process_trace(reader)
print(f"plugin req miss ratio {req_miss_ratio}, ref req miss ratio {ref_req_miss_ratio}")
print(f"plugin byte miss ratio {byte_miss_ratio}, ref byte miss ratio {ref_byte_miss_ratio}")
```

By defining custom hook functions for cache initialization, hit, miss, eviction, removal, and cleanup, you can easily prototype and test your own cache eviction algorithms.

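Other policies need only small changes to the hooks. For example, FIFO keeps objects in insertion order and ignores hits; a minimal sketch reusing the hooks defined above (the `Plugin_FIFO` name is just illustrative):

```python
# FIFO: reuse the OrderedDict-backed hooks, but do nothing on a hit so objects
# keep their insertion order; eviction then pops the oldest inserted object.
plugin_fifo_cache = PluginCache(
    cache_size=128,
    cache_init_hook=init_hook,
    cache_hit_hook=lambda data, req: None,  # a hit does not reorder anything
    cache_miss_hook=miss_hook,
    cache_eviction_hook=eviction_hook,
    cache_remove_hook=remove_hook,
    cache_free_hook=free_hook,
    cache_name="Plugin_FIFO",
)
```
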
### Getting Help


---

## License

See [LICENSE](LICENSE) for details.