[refactor][ml] Replace cache eviction algorithm with centralized removal queue and job #24363
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #24363 +/- ##
============================================
+ Coverage 73.57% 74.36% +0.78%
- Complexity 32624 32651 +27
============================================
Files 1877 1878 +1
Lines 139502 146301 +6799
Branches 15299 16772 +1473
============================================
+ Hits 102638 108796 +6158
+ Misses 28908 28883 -25
- Partials 7956 8622 +666
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/cache/RangeCache.java
b72bc4f to
9dd5726
Compare
merlimat
left a comment
My only question is about eviction from the stash queue: since insertion might not be in order, how will eviction happen when there is memory pressure?
Or is the size of this stash already bounded, so that it would only require time-based expiration?
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/cache/CachedEntry.java
...edger/src/main/java/org/apache/bookkeeper/mledger/impl/cache/RangeEntryCacheManagerImpl.java
The merge conflicts will be resolved by first merging #24552 to the master branch.
For the implementation in this PR, all entries are added to the removal queue in order, and the timestamp is set when the entry is added to the cache and the removal queue. Therefore, the timestamps are in order.
The eviction will become more complicated later. In PIP-430, there's a need to put entries aside and keep them in the cache for a longer period of time. Currently PIP-430 addresses that by having 2 separate settings for expiration, which will simplify the solution. Cache entries will continue to be ordered by their maximum TTL in the removal queue, but another data structure is needed for the case where size-based eviction would remove an entry that is prioritized to be skipped and kept until it expires. Handling that challenge is out of scope for this PR. After this PR has been merged, it will be possible to build upon it and expand the solution further to implement PIP-430 (I have a WIP PR in my own fork, but it doesn't match the PIP high-level design).
One more detail about the eviction handling in this PR: the central queue is used for eviction by size (to keep all cached items under the limit) or by TTL. When entries are evicted directly from the RangeCache (for a complete ledger or up to the markdelete position), the entry wrapper held in the range cache is cleared and the same instance remains in the queue until it gets processed. Empty entry wrappers are held in the queue for at most the TTL in the worst case. The entry wrapper gets recycled when it gets processed from the queue.
I hope this answer covers your question @merlimat
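To make the wrapper lifecycle described above concrete, here is a minimal single-threaded sketch. It is not the code in this PR: `EntryWrapperSketch` and `WrapperQueueSketch` are hypothetical names, a plain `ArrayDeque` stands in for the real queue, and "recycling" is reduced to dropping the wrapper once it is polled.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical stand-in for the entry wrapper shared by the range cache and the queue. */
final class EntryWrapperSketch {
    private byte[] payload;        // null once the wrapper has been cleared
    final long timestampNanos;     // set once, when the entry is added

    EntryWrapperSketch(byte[] payload, long timestampNanos) {
        this.payload = payload;
        this.timestampNanos = timestampNanos;
    }

    /** Releases the payload; returns the number of bytes released (0 if already empty). */
    int clear() {
        int released = payload == null ? 0 : payload.length;
        payload = null;
        return released;
    }

    boolean isEmpty() {
        return payload == null;
    }
}

/** Hypothetical queue: wrappers are appended in timestamp order and processed from the head. */
final class WrapperQueueSketch {
    private final Queue<EntryWrapperSketch> queue = new ArrayDeque<>();
    private long cachedBytes;

    EntryWrapperSketch add(byte[] payload, long nowNanos) {
        EntryWrapperSketch wrapper = new EntryWrapperSketch(payload, nowNanos);
        queue.add(wrapper);
        cachedBytes += payload.length;
        return wrapper;
    }

    /** Direct eviction (e.g. a whole ledger): clear the wrapper but leave it in the queue. */
    void evictDirectly(EntryWrapperSketch wrapper) {
        cachedBytes -= wrapper.clear();
    }

    /** TTL processing: poll expired wrappers from the head; empty ones are simply dropped. */
    int expire(long nowNanos, long ttlNanos) {
        int processed = 0;
        EntryWrapperSketch head;
        while ((head = queue.peek()) != null && nowNanos - head.timestampNanos >= ttlNanos) {
            queue.poll();
            cachedBytes -= head.clear(); // releases 0 bytes for an already-cleared wrapper
            processed++;
        }
        return processed;
    }

    int queuedWrappers() {
        return queue.size();
    }

    long cachedBytes() {
        return cachedBytes;
    }
}
```

In this sketch a directly evicted entry stops counting toward the cache size immediately, while its empty wrapper stays queued until TTL processing reaches it, which is the "at most the TTL" bound mentioned above.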
… and replace with ReferenceCountedEntry - ReferenceCountedEntry is already implemented by EntryImpl - Separate interface to avoid changing and breaking existing Entry interface
…ction-optimization
PIP-430 vote has passed. I'll merge this PR to master.
…val queue and job (apache#24363)
Motivation
This PR fixes fundamental inefficiencies and correctness issues in the current Pulsar broker entry cache eviction algorithm. The current implementation has flawed size-based eviction that doesn't remove the oldest entries and incorrect timestamp-based eviction with high CPU overhead. These fixes ensure that size-based eviction properly removes the oldest entries and timestamp-based eviction works correctly. Additionally, this PR serves as a foundation for future improvements to efficiently handle catch-up reads and Key_Shared subscription scenarios.
Mailing list discussion about this PR: https://lists.apache.org/thread/ddzzc17b0c218ozq9tx0r3rx5sgljfb0
UPDATE: This change is covered in "PIP-430: Pulsar Broker cache improvements: refactoring eviction and adding a new cache strategy based on expected read count", #24444
Problems with the Current Broker Entry Cache Implementation
- Size-Based Eviction doesn't remove oldest entries: The existing EntryCacheDefaultEvictionPolicy uses an algorithm for keeping the cache size under the limit but cannot guarantee removal of the oldest entries from the cache. The algorithm:
  - … (PercentOfSizeToConsiderForEviction, default 0.5)
- Inefficient and Incorrect Timestamp-Based Eviction: The current timestamp eviction has both performance and correctness issues:
  - … (managedLedgerCacheEvictionIntervalMs=10) - 100 times per second!
- Limited Cache Scope: The original RangeCache was designed for tailing reads. Later changes added support for backlogged cursors, but the eviction algorithms weren't updated to handle mixed read patterns effectively.
- Unnecessary Complexity: Generic type parameters in RangeCache add complexity without providing value, as the cache is only used for entry storage.
Modifications
1. Centralized Removal Queue (RangeCacheRemovalQueue)
- … MpscUnboundedArrayQueue to maintain entry insertion order
2. Simplified Cache Implementation
- … RangeCache to reduce complexity
3. Foundation for Future Improvements
The existing broker cache has limitations:
- … (cacheEvictionByMarkDeletedPosition=true)
This refactoring prepares the cache system for:
Algorithm Comparison
Before (EntryCacheDefaultEvictionPolicy)
Size Based Eviction
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/cache/EntryCacheDefaultEvictionPolicy.java
Lines 41 to 93 in 82237d3
Timestamp eviction
- … (managedLedgerCacheEvictionIntervalMs=10)
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerFactoryImpl.java
Lines 306 to 317 in eccc6b6
After (RangeCacheRemovalQueue)
https://github.com/apache/pulsar/blob/b72bc4ff3aa5c9c45d9233d2d000429b3cf0ce1a/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/cache/RangeCacheRemovalQueue.java
Note: There's a single shared removal queue for all ManagedLedgerImpl instances instead of having to do the check in multiple instances.
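As a rough illustration of the single shared queue idea, the sketch below keeps entries from several caches in one insertion-ordered queue and evicts from the head either by age or by total size. It is a simplified assumption of how such a queue can work, not the linked RangeCacheRemovalQueue: the class name, method names, and the use of `ArrayDeque` (instead of a concurrent MPSC queue) are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical sketch: one insertion-ordered queue shared by all caches. */
final class SharedRemovalQueueSketch {

    /** Minimal bookkeeping for one cached entry. */
    static final class Item {
        final String cacheName;     // which ledger's cache the entry belongs to
        final long entryId;
        final int sizeBytes;
        final long timestampMillis; // assigned at insertion, so the queue is time-ordered

        Item(String cacheName, long entryId, int sizeBytes, long timestampMillis) {
            this.cacheName = cacheName;
            this.entryId = entryId;
            this.sizeBytes = sizeBytes;
            this.timestampMillis = timestampMillis;
        }
    }

    private final Queue<Item> queue = new ArrayDeque<>();
    private long totalBytes;

    void add(String cacheName, long entryId, int sizeBytes, long nowMillis) {
        queue.add(new Item(cacheName, entryId, sizeBytes, nowMillis));
        totalBytes += sizeBytes;
    }

    /** TTL eviction: stop at the first entry that is still fresh, since later ones are newer. */
    long evictOlderThan(long nowMillis, long ttlMillis) {
        long freed = 0;
        Item head;
        while ((head = queue.peek()) != null && nowMillis - head.timestampMillis >= ttlMillis) {
            queue.poll();
            freed += head.sizeBytes;
        }
        totalBytes -= freed;
        return freed;
    }

    /** Size eviction: remove the oldest entries, across all caches, until under the limit. */
    long evictToSize(long maxBytes) {
        long freed = 0;
        while (totalBytes - freed > maxBytes && !queue.isEmpty()) {
            freed += queue.poll().sizeBytes;
        }
        totalBytes -= freed;
        return freed;
    }

    long totalBytes() {
        return totalBytes;
    }

    int size() {
        return queue.size();
    }
}
```

Because every entry passes through the same queue in insertion order, both checks only ever look at the head, and size-based eviction removes the globally oldest entries regardless of which cache they belong to.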
Verifying this change
This change is already covered by existing tests:
- RangeCacheTest validates the new removal queue functionality
- EntryCacheManagerTest verifies eviction behavior remains correct
Documentation
- doc
- doc-required
- doc-not-needed
- doc-complete