[Blog Post] Adaptive Refresh for Resilient Segment Replication #4000
Conversation
Signed-off-by: Vigya Sharma <[email protected]>
Thank you for submitting a blog post! The blog post review process is: Submit a PR -> (Optional) Peer review -> Doc review -> Editorial review -> Marketing review -> Published.

Hi @vigyasharma, it looks like you're adding a new blog post but don't have an issue mentioned. Please link this PR to an open issue using a closing keyword in the PR description.
If an issue hasn't been created yet, please create one and then link it to this PR.
Signed-off-by: Vigya Sharma <[email protected]>
@@ -0,0 +1,212 @@
---
layout: post
title: "Adaptive Refresh for Resilient Segment Replication"
Please use sentence case - "Adaptive refresh for resilient segment replication."
A central challenge in replicated systems is how to propagate index changes across all replicas. Real-world document sets change over time: product catalogs, availability, and prices change in e-commerce search; documents are added and updated in enterprise document search; flight availability changes in airline ticket search. Most commercial search engines require near-real-time updates.
## Propagating Changes in Replicated Systems
Sentence case for subhead
With this approach, a document is indexed only once. We save precious computation resources on replicas and instead leverage the network to copy (replicate) index changes across the entire fleet. By tracking replication checkpoints, we can make sure that each replica is on the same point-in-time view of the index, as opposed to document replication, where each replica may index at its own pace. There's also the nice ability to roll back to a known older "good checkpoint" should some bad index changes make their way to production.
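As a rough sketch of the idea (the `Checkpoint` record, `Replica` class, and `copyMissingFiles` helper below are illustrative names, not Lucene or OpenSearch APIs), checkpoint tracking on a replica might look like this:

```java
import java.io.IOException;
import java.util.Set;

// Illustrative sketch only: Checkpoint, Replica, and copyMissingFiles are
// hypothetical names, not real library APIs.
record Checkpoint(long generation, Set<String> segmentFiles) {}

class Replica {
    private long appliedGeneration = -1;

    void sync(Checkpoint latest) throws IOException {
        if (latest.generation() <= appliedGeneration) {
            return; // already on this point-in-time view (or a newer one)
        }
        // Copy only the segment files this replica doesn't have yet: the
        // network does the work instead of re-indexing every document locally.
        copyMissingFiles(latest.segmentFiles());
        appliedGeneration = latest.generation();
    }

    private void copyMissingFiles(Set<String> files) throws IOException {
        // fetch missing files from the primary or a remote store (omitted)
    }
}
```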
## A Typical Segment Replication Setup
Sentence case for subhead
Before we go any further, it's worth understanding how Lucene manages searcher refreshes, which forms the basis of segment-replicated systems.
## Understanding Lucene's Refresh Mechanism
Sentence case for subhead
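In brief: an `IndexWriter` buffers changes, and a refresh opens a new point-in-time reader over the latest segments. Here is a minimal example using standard Lucene APIs (the field name and in-memory directory are just for illustration):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.store.ByteBuffersDirectory;

public class RefreshDemo {
    public static void main(String[] args) throws Exception {
        var dir = new ByteBuffersDirectory();
        var writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));
        var manager = new SearcherManager(writer, null); // null = default SearcherFactory

        var doc = new Document();
        doc.add(new TextField("body", "hello segment replication", Field.Store.YES));
        writer.addDocument(doc); // buffered, not yet visible to searchers

        manager.maybeRefresh(); // opens a new point-in-time view of the index
        IndexSearcher searcher = manager.acquire();
        try {
            System.out.println(searcher.count(new MatchAllDocsQuery())); // prints 1
        } finally {
            manager.release(searcher); // readers are reference-counted
        }
        manager.close();
        writer.close();
    }
}
```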
## Real World Challenges at Scale
Sentence case for subhead
The single-checkpoint method forces each replica to absorb all accumulated changes in one step. It's simple and correct, but when the gap between checkpoints absorbed by the replica is large, you pay the whole cost at once. This cliff-like behavior is exactly what we set out to smooth out.
## Bite-Sized Commits and Adaptive Refresh
Sentence case for subhead
Adding @natebower for final editorial review.
Signed-off-by: Vigya Sharma <[email protected]>
Thanks for the review, @pajuric. I've updated all headings to sentence case.
natebower
left a comment
Editorial review
With low OS page cache churn, we see fewer page faults and more stable latency for search requests. The entire process is idempotent, retry-able, and resilient to transient failures. Since all intermediate states are valid Lucene commits, you can resume refreshing from the last commit point. Additionally, refreshing again on a commit point that the searcher is already on is a no-op and does not impact the system.
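To make the resume and no-op behavior concrete, here is a small sketch using real Lucene APIs (`DirectoryReader.listCommits`, `DirectoryReader.open(IndexCommit)`, and `openIfChanged`); it assumes `dir` is a `Directory` whose writer retains multiple commits via a non-default deletion policy:

```java
// Fragment; assumes imports of org.apache.lucene.index.DirectoryReader,
// org.apache.lucene.index.IndexCommit, and a Directory `dir` whose index
// retains old commits (e.g., via SnapshotDeletionPolicy or NoDeletionPolicy).
List<IndexCommit> commits = DirectoryReader.listCommits(dir);
IndexCommit target = commits.get(commits.size() - 1); // or any retained commit
DirectoryReader readerAtCommit = DirectoryReader.open(target);

// Refreshing onto a commit point the reader is already on is a no-op:
// openIfChanged returns null when there is nothing newer to open.
DirectoryReader newer = DirectoryReader.openIfChanged(readerAtCommit);
```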
Astute readers might wonder why we don't simply checkpoint more frequently and continue using the single-commit checkpoints. This is done to enable refresh efficiencies. Single commit checkpoints (even with small commits) require replicas to iterate through all checkpoints as it catches up to the latest changes. With multiple commits in the same checkpoint, we can evaluate them together using common checkpoint metadata and intelligently decide the commits to refresh on.

Suggested change:
Astute readers might wonder why we don't simply checkpoint more frequently and continue using the single-commit checkpoints. This is done to enable refresh efficiencies. Single-commit checkpoints (even with small commits) require replicas to iterate through all checkpoints as they catch up to the latest changes. With multiple commits in the same checkpoint, we can evaluate them together using common checkpoint metadata and intelligently decide which commits to refresh on.
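As a rough illustration of what "intelligently decide" could mean (the `CommitInfo` record and byte-delta heuristic below are hypothetical, not the actual implementation):

```java
import java.util.List;

// Hypothetical checkpoint metadata: commits ordered oldest to newest, each
// annotated with how many bytes the replica would need to copy to reach it.
record CommitInfo(long generation, long deltaBytesFromCurrent) {}

class AdaptiveRefreshPolicy {
    // Pick the newest commit whose copy cost still fits the refresh budget,
    // so each refresh step stays within safe resource bounds.
    static CommitInfo pickNextCommit(List<CommitInfo> window, long maxDeltaBytes) {
        CommitInfo candidate = null;
        for (CommitInfo commit : window) { // oldest -> newest, deltas grow
            if (commit.deltaBytesFromCurrent() <= maxDeltaBytes) {
                candidate = commit;
            }
        }
        // null means even the oldest commit exceeds the budget; a replica
        // could then fall back to the oldest commit to keep making progress.
        return candidate;
    }
}
```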
## Conclusion
Large checkpoint jumps in segment-replicated systems create a fundamental tension: you want replicas to catch up quickly, but absorbing too much change at once can destabilize the system. The traditional approach of always replicating the latest commit creates a cliff — when replicas fall behind, they must pay the full cost in one painful step, with page faults, latency spikes, and ultimately, timed out search requests.

Suggested change:
Large checkpoint jumps in segment-replicated systems create a fundamental tension: you want replicas to catch up quickly, but absorbing too much change at once can destabilize the system. The traditional approach of always replicating the latest commit creates a cliff---when replicas fall behind, they must pay the full cost in one painful step, with page faults, latency spikes, and, ultimately, timed out search requests.
Bite-sized commits and adaptive refresh transform this cliff into a staircase. By maintaining a rolling history of commits and letting replicas step through them incrementally, we allow replicas to catch up at their own sustainable pace while maintaining predictable performance characteristics. Each refresh stays within safe resource bounds, page cache churn remains low, and search latency stays stable, even during update bursts or network hiccups.
The elegance of this approach lies in its simplicity. There's no complex coordination protocol, no expensive distributed consensus, just intelligent use of what Lucene already gives us: immutable segments and atomic refresh semantics. Replicas make local decisions about which commit to refresh on next, using simple heuristics like delta size thresholds. The system remains fully idempotent and retry-able; if anything fails mid-refresh, you simply resume from the last successful commit. Each bite-sized commit is actually an incremental backup of the index. As a nice side effect, we get fine grained point-in-time checkpoints to recover your index from, in case of outages or data corruption events.

Suggested change:
The elegance of this approach lies in its simplicity. There's no complex coordination protocol, no expensive distributed consensus---just intelligent use of what Lucene already gives us: immutable segments and atomic refresh semantics. Replicas make local decisions about which commit to refresh on next, using simple heuristics like delta size thresholds. The system remains fully idempotent and retry-able; if anything fails mid-refresh, you simply resume from the last successful commit. Each bite-sized commit is actually an incremental backup of the index. As a nice side effect, we get fine-grained point-in-time checkpoints from which to recover your index in case of outages or data corruption events.
In production environments spanning multiple geographic regions with varying network conditions, this matters significantly. A replica in a distant data center experiencing bandwidth constraints can make steady progress without falling dangerously behind. A replica recovering from a restart can catch up incrementally rather than attempting one massive refresh.
At the same time, it is worth noting that this set up will increase your remote storage costs. Since we now store a sliding window of more frequent commits, we capture some transient segments that would've otherwise been skipped with less frequent checkpoints. This increase in storage is directly controlled by the window and frequency of checkpoints we chose to maintain, longer windows consume more storage. It is important to configure a remote storage clean up policy that periodically deletes older, obsolete checkpoints.

Suggested change:
At the same time, it is worth noting that this setup will increase your remote storage costs. Since you now store a sliding window of more frequent commits, you capture some transient segments that would've otherwise been skipped with less frequent checkpoints. This increase in storage is directly controlled by the window and frequency of checkpoints you choose to maintain---longer windows consume more storage. It is important to configure a remote storage cleanup policy that periodically deletes older, obsolete checkpoints.
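For example, a retention sweep might keep only the newest few checkpoints. The `RemoteStore` interface below is hypothetical (any object-store client would do), and it reuses the illustrative `Checkpoint` record from the earlier sketch:

```java
import java.util.List;

// Hypothetical remote-store client; listCheckpoints() is assumed to return
// checkpoints ordered newest first.
interface RemoteStore {
    List<Checkpoint> listCheckpoints();
    void delete(Checkpoint checkpoint);
}

class CheckpointJanitor {
    // Keep the newest `keep` checkpoints; delete everything older.
    static void prune(RemoteStore store, int keep) {
        List<Checkpoint> all = store.listCheckpoints();
        for (Checkpoint old : all.subList(Math.min(keep, all.size()), all.size())) {
            store.delete(old);
        }
    }
}
```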
Support for this architecture is now available in Lucene 10.3, providing high-throughput, geographically distributed search systems with a proven path to more stable replication. If your replicas are experiencing latency spikes during refresh, or if you're dealing with cross-region replication challenges, adaptive refresh might exactly be the resilient replication strategy you've been looking for.

Suggested change:
Support for this architecture is now available in Lucene 10.3, providing high-throughput, geographically distributed search systems with a proven path to more stable replication. If your replicas are experiencing latency spikes during refresh, or if you're dealing with cross-region replication challenges, adaptive refresh might be exactly the resilient replication strategy you've been looking for.
natebower
left a comment
Editorial review
Signed-off-by: Nathan Bower <[email protected]>
natebower
left a comment
Thanks @vigyasharma! LGTM
@pajuric This should be ready to publish.
Description
Blog Post about Adaptive Refresh for Resilient Segment Replication
Issues Resolved
Resolves #3971
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.