
Implement chunk eviction#47

Merged
johnramsden merged 28 commits into main from sam/chunk-eviction-fix on Sep 11, 2025

Conversation

2a46m4 (Collaborator) commented Aug 20, 2025

This PR (re)-implements chunk eviction.

The original approach special-cased the write-back of valid chunks during chunk eviction. The write-back occurred on the same zone, and the zone list had a special function that would accept a zone with a non-full number of chunks to be returned. This approach was somewhat error-prone and also caused a lot of issues with too many active zones, as opening the zone to be re-written to wasn't accounted for in the zone_list's bookkeeping.

With the approach in this PR, the reader part is mostly the same, but the writer part now queues the valid chunks onto the writer pool. This allows the zone_list to reclaim the zone while still keeping track of the number of active zones properly. Since the zone goes from full to empty when being reset, the number of active zones is not increased. Instead, the active-zone count is only increased when the zone_list actually hands out a chunk to be written to.

There is some change in behaviour. Valid chunks from a reset zone are not immediately written back and might take a while to get through the write_pool, even if they are active. This can be a problem if they are very active. Since the valid chunks go through the write_pool path, they will update their slot in the eviction policy, and the cache measurements will also account for them.

One thing to note is that eviction no longer blocks when calling clean_zone_and_update_map. This avoids what would presumably be a deadlock: if clean_zone_and_update_map blocked during eviction, and the cache ran out of space again while writing back the chunks, the cache would have to evict again to make space for those chunks. However, only one eviction can occur at once, so a deadlock would occur. We spawn green threads to handle the write-back so that the eviction thread can return immediately and further eviction can proceed.
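A minimal sketch of this non-blocking pattern, using a plain OS thread and channel as stand-ins for the real tokio green threads and writer pool (the `ValidChunk` type and signature here are hypothetical, not the PR's actual API):

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for the PR's valid-chunk data.
struct ValidChunk { zone: u32 }

fn clean_zone_and_update_map(write_pool: mpsc::Sender<ValidChunk>, valid: Vec<ValidChunk>) {
    // Spawn instead of blocking: if the writer path triggers another
    // eviction because the cache fills up again, that eviction can run
    // because this function has already returned.
    thread::spawn(move || {
        for chunk in valid {
            write_pool.send(chunk).ok();
        }
    });
}

fn main() {
    let (tx, rx) = mpsc::channel();
    clean_zone_and_update_map(tx, vec![ValidChunk { zone: 7 }]);
    // The call returned immediately; the chunk arrives asynchronously.
    assert_eq!(rx.recv().unwrap().zone, 7);
}
```

The key property is that the evicting thread never waits on the write-back completing, so a second eviction triggered by the write-back cannot deadlock against the first.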

I don't have measurements, but this might be slightly slower than the original implementation, because it has to go through the write_pool path.

@johnramsden (Owner)

I think we need to find a solution for this:

> There is some change in behaviour. Valid chunks from a reset zone are not immediately written back and might take a while to get through the write_pool, even if they are active. This can be a problem if they are very active. Since the valid chunks go through the write_pool path, they will update their slot in the eviction policy, and the cache measurements will also account for them.

Perhaps by passing in some additional information that says not to update the LRU.
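One way this suggestion could look, as a sketch (the `WriteRequest`/`Policy` names and fields are assumptions for illustration, not the repo's types): tag each write-back request with a flag so the eviction policy skips the LRU update for chunks re-queued by eviction itself.

```rust
#[derive(Debug, PartialEq)]
struct ChunkLocation { zone: u32, index: u32 }

struct WriteRequest {
    loc: ChunkLocation,
    // false for eviction write-backs, true for normal client writes
    update_lru: bool,
}

struct Policy { lru: Vec<ChunkLocation> } // stand-in for the real LRU

impl Policy {
    fn on_write(&mut self, req: WriteRequest) {
        if req.update_lru {
            self.lru.push(req.loc);
        }
        // ...the actual write-back would happen either way...
    }
}

fn main() {
    let mut policy = Policy { lru: Vec::new() };
    // Normal client write: refreshes the LRU.
    policy.on_write(WriteRequest { loc: ChunkLocation { zone: 2, index: 1 }, update_lru: true });
    // Eviction write-back: skips the LRU update.
    policy.on_write(WriteRequest { loc: ChunkLocation { zone: 2, index: 2 }, update_lru: false });
    assert_eq!(policy.lru.len(), 1);
}
```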

2a46m4 added 9 commits August 20, 2025 17:56
We need to use an aligned vector because chunk eviction needs to read valid chunks from the device and then write them back. The write-back expects an aligned vector.
The RWLock allows multiple threads to append to the zone at the same
time if the chunk size is larger than the zone append size, and only
one thread to append if it's smaller. In addition, since the policy
doesn't check whether the zone is full before evicting it, we can
actually evict zones that are still being written to. Thus we need to
hold the lock while resetting the zone to make sure that we have
exclusive access to the zone.
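The locking described in this commit can be sketched with `std::sync::RwLock` (the `Zone` struct here is a simplified stand-in for the real zone state): appenders share the read lock, and a reset takes the write lock so it waits for all in-flight appends.

```rust
use std::sync::RwLock;

struct Zone {
    // Stand-in for the zone's append state; the real code manages a write pointer.
    state: RwLock<Vec<u8>>,
}

impl Zone {
    fn append(&self, data: &[u8]) {
        // Shared lock: multiple appenders may run concurrently.
        let _guard = self.state.read().unwrap();
        // ...issue the zone-append I/O here...
        let _ = data;
    }

    fn reset(&self) {
        // Exclusive lock: blocks until all in-flight appends finish, so
        // eviction can't reset a zone out from under an active writer.
        let mut state = self.state.write().unwrap();
        state.clear();
    }
}

fn main() {
    let zone = Zone { state: RwLock::new(vec![1, 2, 3]) };
    zone.append(&[4]);
    zone.reset();
    assert!(zone.state.read().unwrap().is_empty());
}
```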
Continuing from the previous commit, if the zone being reset is an open zone, this isn't actually a logic error, since we can evict zones that are not full. In practice this is rare but still possible. This is fine as long as we remove it and then put it back in the free zone list.
When evicting, we need to remove stale elements from the LRU queue. An
example:

LRU (Zone, Chunk) state:
[(1, 1), (1, 2), (2, 1), (1, 4), (3, 3)]

We choose to evict (1, 1) and (1, 2):
[(2, 1), (1, 4), (3, 3)]

(1, 4) is stale, because we reset the entire zone. This applies
whether we write in-place or not.
[(2, 1), (1, 4) { STALE }, (3, 3)]

So we filter out all entries in zone 1.
[(2, 1), (3, 3)]

When we write it back to the device, we need to update the
LRU. Suppose we write it back to (2, 2). Then the LRU will look like:
[(2, 1), (3, 3), (2, 2)]
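The example above can be run as a small sketch (plain `(zone, chunk)` tuples standing in for the real `ChunkLocation` type):

```rust
// Drop every LRU entry whose zone was reset, since its location is stale.
fn filter_stale(lru: Vec<(u32, u32)>, evicted_zone: u32) -> Vec<(u32, u32)> {
    lru.into_iter().filter(|(z, _)| *z != evicted_zone).collect()
}

fn main() {
    // State after choosing (1,1) and (1,2) for eviction:
    let lru = vec![(2, 1), (1, 4), (3, 3)];
    // (1,4) is stale because zone 1 was reset, so drop every zone-1 entry.
    let mut lru = filter_stale(lru, 1);
    assert_eq!(lru, vec![(2, 1), (3, 3)]);
    // The write-back lands at (2,2) and re-enters the LRU at the MRU end.
    lru.push((2, 2));
    assert_eq!(lru, vec![(2, 1), (3, 3), (2, 2)]);
}
```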
Need to spawn it on the runtime.
2a46m4 (Collaborator, Author) commented Aug 24, 2025

Implemented a few fixes

Comment thread oxcache/src/device.rs
Comment thread oxcache/src/eviction.rs
Comment on lines +249 to +266
```rust
let mut clean_targets = self.pq.remove_if_thresh_met();
clean_targets.sort_unstable();

// Search in the LRU for items which exist in the clean
// targets. These are valid chunks, but must be removed
// because their location is invalidated when the zone is
// reset.
let new_lru_list = self.lru.iter().rev().filter(|lru_item| {
    clean_targets.binary_search(&lru_item.0.zone).is_err()
}).map(|(k, _)| k.clone()).collect::<Vec<ChunkLocation>>();

self.lru.clear();

for k in new_lru_list {
    self.lru.put(k, ());
}

clean_targets
```
johnramsden (Owner):
Isn't this expensive? Have to repopulate the whole lru every time we call get targets?

2a46m4 (Collaborator, Author):

Yes, it's more of a temporary solution to get it working before I try anything fancy. A better solution would be to store the chunks in a hash map.
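A sketch of what that hash-map alternative might look like (the `ZoneIndex` type is hypothetical, not code from the PR): keep a secondary index from zone to chunk entries, so invalidating a reset zone is proportional to that zone's entries instead of rebuilding the whole LRU.

```rust
use std::collections::HashMap;

// Hypothetical secondary index: zone -> chunk indices currently in the LRU.
struct ZoneIndex {
    by_zone: HashMap<u32, Vec<u32>>,
}

impl ZoneIndex {
    fn new() -> Self {
        Self { by_zone: HashMap::new() }
    }

    fn insert(&mut self, zone: u32, chunk: u32) {
        self.by_zone.entry(zone).or_default().push(chunk);
    }

    // Returns the chunks whose LRU slots must be dropped when the zone resets.
    fn invalidate_zone(&mut self, zone: u32) -> Vec<u32> {
        self.by_zone.remove(&zone).unwrap_or_default()
    }
}

fn main() {
    let mut index = ZoneIndex::new();
    index.insert(1, 1);
    index.insert(1, 4);
    index.insert(2, 1);
    // Resetting zone 1 removes exactly its entries, without a full LRU rebuild.
    assert_eq!(index.invalidate_zone(1), vec![1, 4]);
    assert_eq!(index.invalidate_zone(3), Vec::<u32>::new());
}
```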

Comment thread oxcache/src/zone_state/zone_list.rs
Eviction doesn't occur on Cortes when we have enough throughput, but it does work with low throughput.

We were stuck looping on "cache is full".

The issue comes down to eviction not being able to keep up with all the requests when we drain the channel, so it doesn't do enough eviction. Basically, the issue occurs if you have enough throughput that we have to do multiple eviction attempts until there is enough space for all the channels, and only then drain.

Once I fixed that, eviction does occur; however, now we get a validation error: invalidated read response. So I assume some data is being written back incorrectly or something. Potentially reads occurring from old data? Likely an issue with the map.
@johnramsden (Owner)

> Eviction doesn't occur on Cortes when we have enough throughput, but it does work with low throughput.
>
> We were stuck looping on "cache is full".
>
> The issue comes down to eviction not being able to keep up with all the requests when we drain the channel, so it doesn't do enough eviction. Basically, the issue occurs if you have enough throughput that we have to do multiple eviction attempts until there is enough space for all the channels, and only then drain.
>
> Once I fixed that, eviction does occur; however, now we get a validation error: invalidated read response. So I assume some data is being written back incorrectly or something. Potentially reads occurring from old data? Likely an issue with the map.

Pushed a patch for eviction, now we need to resolve the correctness.

Config used:

```toml
log_level = "debug"

[server]
socket = "/tmp/oxcache.sock"
disk = "/dev/nvme0n2"
writer_threads = 14
reader_threads = 14
chunk_size = 536870912
block_zone_capacity = 1129316352
max_write_size = 262144
max_zones = 20

[remote]
remote_type = "emulated" # emulated | S3
bucket = "S3_BUCKET"
remote_artificial_delay_microsec = 40632

[eviction]
eviction_policy = "chunk"
high_water_evict = 16 # Number remaining from end, evicts if reaches here
low_water_evict = 32  # Evict until below mark
eviction_interval = 1 # Evict every 1s
high_water_clean = 16 # Number remaining from end, cleans if reaches here (only used with chunk)
low_water_clean = 8   # Clean until below mark (only used with chunk)

[metrics]
ip_addr = "127.0.0.1"
port = 9000
file_metrics_directory = "./logs"
```
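For reference, the watermark semantics implied by the config comments can be sketched like this (an assumed reading of high/low watermarks, not the repo's actual eviction code): eviction kicks in when the number of free chunks falls to the high mark and runs until the low mark's worth of space is free again.

```rust
// Assumed watermark rule: evict nothing above the high mark; at or below
// it, free up enough chunks to reach the low mark.
fn chunks_to_evict(free: u32, high_water: u32, low_water: u32) -> u32 {
    if free <= high_water {
        low_water.saturating_sub(free)
    } else {
        0
    }
}

fn main() {
    // With high_water_evict = 16 and low_water_evict = 32 from the config:
    assert_eq!(chunks_to_evict(16, 16, 32), 16); // at the high mark: evict
    assert_eq!(chunks_to_evict(20, 16, 32), 0);  // above it: do nothing
}
```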

@johnramsden (Owner)

Validation failure occurs here:

```rust
debug_assert_eq!(u64::from_be_bytes(buffer), key_hash);
```
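A minimal reconstruction of what that check verifies, assuming the chunk's first 8 bytes carry the big-endian key hash (hash value below is made up for illustration):

```rust
// A read is validated by comparing the stored big-endian key hash
// against the expected one; a read served from a reset or rewritten
// zone returns other bytes and fails this check.
fn validate(buffer: [u8; 8], key_hash: u64) -> bool {
    u64::from_be_bytes(buffer) == key_hash
}

fn main() {
    let key_hash: u64 = 0xDEAD_BEEF_CAFE_F00D;
    let buffer = key_hash.to_be_bytes();
    assert!(validate(buffer, key_hash));
    // Stale data fails, which is what the debug_assert_eq! above catches.
    assert!(!validate([0u8; 8], key_hash));
}
```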

Comment thread oxcache/src/eviction.rs Outdated
Comment thread oxcache/src/eviction.rs Outdated
2a46m4 (Collaborator, Author) commented Aug 27, 2025

There is a difference between the cache map and the eviction policy LRU queue which caused eviction to loop endlessly. This was because eviction was failing silently when evicted chunks were not found in the cache map. That, in turn, was caused by read LRU updates occurring after the zone was reset, so the chunk was registered in the LRU even though it wasn't active anymore.

I fixed the eviction looping by not registering any read LRU updates if the chunk is not in the queue, but the original problem, i.e. reads after resets, is still there. This causes chunk validation assertions to fail. I can think of two possible areas where that could be happening: either a race condition caused a read to slip through while eviction was occurring, or there is some buggy behaviour around multiple evictions occurring at the same time (this can happen because the clean-zones path is non-blocking).
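The looping fix described here can be sketched as a touch-only LRU update (the `Lru` type below is a simplified stand-in for the policy's queue): a read refreshes an entry only if it still exists, and never (re)inserts one, so a read racing a zone reset can't resurrect a stale slot.

```rust
use std::collections::HashMap;

// Simplified stand-in for the eviction policy's LRU bookkeeping.
struct Lru {
    slots: HashMap<(u32, u32), u64>, // (zone, chunk) -> last-touch tick
    tick: u64,
}

impl Lru {
    fn touch_if_present(&mut self, loc: (u32, u32)) -> bool {
        self.tick += 1;
        match self.slots.get_mut(&loc) {
            Some(t) => {
                *t = self.tick; // live chunk: refresh its recency
                true
            }
            None => false, // chunk's zone was reset: ignore the read update
        }
    }
}

fn main() {
    let mut lru = Lru { slots: HashMap::from([((2, 1), 0)]), tick: 0 };
    assert!(lru.touch_if_present((2, 1)));  // live chunk: refreshed
    assert!(!lru.touch_if_present((1, 4))); // reset zone: not re-registered
    assert_eq!(lru.slots.len(), 1);
}
```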

sam cheng and others added 2 commits August 28, 2025 19:51
We had a race condition due to the inconsistency of the bucket map and the underlying data.

The bucket map would have had to be held during reads and writes to avoid this race condition.

Instead, implement a "pin counter"; this allows us to check if the data is currently being used, and we only evict when it reaches zero.

The issue is the inconsistency between the entry and the bucket map. By doing it this way, any intermediate locations will remain correct until they are done reading.

If they read while we are in the process of evicting, they will wait on the waiting state.
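The pin counter described in this commit can be sketched with an atomic counter (a minimal stand-in, not the repo's actual entry type): readers pin the entry for the duration of the read, and eviction only proceeds once the count returns to zero.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Minimal stand-in for a cache entry guarded by a pin counter.
struct Entry {
    pins: AtomicUsize,
}

impl Entry {
    fn pin(&self) {
        self.pins.fetch_add(1, Ordering::AcqRel); // reader starts using the chunk
    }
    fn unpin(&self) {
        self.pins.fetch_sub(1, Ordering::AcqRel); // reader is done
    }
    fn can_evict(&self) -> bool {
        self.pins.load(Ordering::Acquire) == 0
    }
}

fn main() {
    let entry = Entry { pins: AtomicUsize::new(0) };
    entry.pin();
    assert!(!entry.can_evict()); // an active reader blocks eviction
    entry.unpin();
    assert!(entry.can_evict()); // now safe to reset the zone
}
```

In the real code, eviction would presumably also have to mark the entry as evicting so new readers wait rather than re-pinning it indefinitely, matching the "waiting state" mentioned above.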
johnramsden merged commit a06c89a into main on Sep 11, 2025
2 checks passed
johnramsden deleted the sam/chunk-eviction-fix branch on September 11, 2025 01:16