
Conversation

@vigyasharma vigyasharma commented Nov 13, 2025

Description

Blog Post about Adaptive Refresh for Resilient Segment Replication

Issues Resolved

Resolves #3971

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.

Signed-off-by: Vigya Sharma <[email protected]>
@github-actions

Thank you for submitting a blog post!

The blog post review process is: Submit a PR -> (Optional) Peer review -> Doc review -> Editorial review -> Marketing review -> Published.

@github-actions

Hi @vigyasharma,

It looks like you're adding a new blog post but don't have an issue mentioned. Please link this PR to an open issue using one of these keywords in the PR description:

  • Closes #issue-number
  • Fixes #issue-number
  • Resolves #issue-number

If an issue hasn't been created yet, please create one and then link it to this PR.

Signed-off-by: Vigya Sharma <[email protected]>
@@ -0,0 +1,212 @@
---
layout: post
title: "Adaptive Refresh for Resilient Segment Replication"

Please use sentence case - "Adaptive refresh for resilient segment replication."

A central challenge in replicated systems is how to propagate index changes across all replicas. Real-world document sets change over time: product catalogs, availability, and prices change in e-commerce search; documents are added and updated in enterprise document search; flight availability changes in airline ticket search. Most commercial search engines require near-real-time updates.


## Propagating Changes in Replicated Systems

Sentence case for subhead

With this approach, a document is indexed only once. We save precious computation resources on replicas and instead leverage the network to copy (replicate) index changes across the entire fleet. By tracking replication checkpoints, we can make sure that each replica is on the same point-in-time view of the index, as opposed to document replication, where each replica may index at its own pace. There's also the nice ability to roll back to a known older "good checkpoint" should some bad index changes make their way to production.
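
To make the checkpoint idea concrete: a replication checkpoint is essentially a named, immutable snapshot of the segment files that make up a commit. The record below is a hypothetical sketch of what such a checkpoint could carry; it is illustrative only and not the actual Lucene or OpenSearch data structure.

```java
import java.util.Map;

// Hypothetical checkpoint metadata, for illustration only.
public record ReplicationCheckpoint(
        long primaryTerm,                    // which primary produced these segments
        long generation,                     // Lucene commit generation (segments_N)
        long version,                        // monotonically increasing checkpoint version
        Map<String, Long> fileNameToLength   // segment files in the commit and their sizes
) {}
```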


## A Typical Segment Replication Setup

Sentence case for subhead

Before we go any further, it's worth understanding Lucene's searcher refresh mechanism, which forms the basis of segment-replicated systems.
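
For readers less familiar with Lucene, the minimal sketch below (not part of the post) shows the standard refresh flow: an `IndexWriter` accepts changes, and a `SearcherManager.maybeRefresh()` call is what makes those changes visible to new searchers. The in-memory directory and analyzer used here are arbitrary placeholders.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class RefreshSketch {
    public static void main(String[] args) throws Exception {
        Directory dir = new ByteBuffersDirectory();
        IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));

        // SearcherManager hands out point-in-time IndexSearchers over the index.
        SearcherManager manager = new SearcherManager(writer, null);

        Document doc = new Document();
        doc.add(new TextField("body", "hello segment replication", Field.Store.NO));
        writer.addDocument(doc);

        // Until a refresh happens, searchers do not see the new document.
        manager.maybeRefresh(); // reopens the underlying reader if the index changed

        IndexSearcher searcher = manager.acquire();
        try {
            // run queries against a consistent point-in-time view of the index
        } finally {
            manager.release(searcher);
        }
        manager.close();
        writer.close();
    }
}
```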


## Understanding Lucene's Refresh Mechanism

Sentence case for subhead




## Real World Challenges at Scale

Sentence case for subhead

The single-checkpoint method forces each replica to absorb all accumulated changes in one step. It's simple and correct, but when the gap between checkpoints absorbed by the replica is large, you pay the whole cost at once. This cliff-like behavior is exactly what we set out to smooth out.


## Bite-Sized Commits and Adaptive Refresh

Sentence case for subhead

@pajuric pajuric added the Lucene (Lucene related content) label Nov 15, 2025

pajuric commented Nov 15, 2025

Adding @natebower for final editorial review.

@vigyasharma (Author)

Thanks for the review, @pajuric . I've updated all headings to sentence case.

@natebower natebower (Collaborator) left a comment

Editorial review


With low OS page cache churn, we see fewer page faults and more stable latency for search requests. The entire process is idempotent, retry-able, and resilient to transient failures. Since all intermediate states are valid Lucene commits, you can resume refreshing from the last commit point. Additionally, refreshing again on a commit point that the searcher is already on is a no-op and does not impact the system.
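
As a rough illustration of what a replica-side catch-up step could look like, the sketch below walks the available commits and refreshes onto the furthest commit whose cumulative delta stays within a size budget. The `MAX_DELTA_BYTES` threshold and the `refreshOneStep` helper are hypothetical names; this is a simplified sketch of the idea using standard Lucene APIs, not the actual implementation described in the post.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.store.Directory;

public class AdaptiveRefreshSketch {
    // Hypothetical per-refresh budget; real deployments would tune this.
    private static final long MAX_DELTA_BYTES = 512L * 1024 * 1024;

    /** Refresh onto the furthest-ahead commit whose cumulative delta stays within the budget. */
    static DirectoryReader refreshOneStep(Directory dir, DirectoryReader current) throws IOException {
        List<IndexCommit> commits = DirectoryReader.listCommits(dir); // ordered oldest to newest
        long currentGen = current.getIndexCommit().getGeneration();
        Set<String> haveFiles = new HashSet<>(current.getIndexCommit().getFileNames());

        IndexCommit target = null;
        long deltaBytes = 0;
        for (IndexCommit commit : commits) {
            if (commit.getGeneration() <= currentGen) {
                continue; // already on or past this commit; refreshing on it again would be a no-op
            }
            long commitDelta = 0;
            for (String file : commit.getFileNames()) {
                if (!haveFiles.contains(file)) {
                    commitDelta += dir.fileLength(file); // new segment files this commit would pull in
                }
            }
            if (target != null && deltaBytes + commitDelta > MAX_DELTA_BYTES) {
                break; // absorbing this commit too would exceed the per-refresh budget
            }
            target = commit; // always take at least one step forward so the replica keeps progressing
            deltaBytes += commitDelta;
            haveFiles.addAll(commit.getFileNames());
        }
        if (target == null) {
            return current; // nothing newer to refresh on
        }
        // Open the reader at the chosen commit; intermediate commits remain valid resume points.
        DirectoryReader next = DirectoryReader.openIfChanged(current, target);
        if (next != null) {
            current.close();
            return next;
        }
        return current;
    }
}
```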

Astute readers might wonder why we don’t simply checkpoint more frequently and continue using the single-commit checkpoints. This is done to enable refresh efficiencies. Single commit checkpoints (even with small commits) require replicas to iterate through all checkpoints as it catches up to the latest changes. With multiple commits in the same checkpoint, we can evaluate them together using common checkpoint metadata and intelligently decide the commits to refresh on.

Suggested change
Astute readers might wonder why we don’t simply checkpoint more frequently and continue using the single-commit checkpoints. This is done to enable refresh efficiencies. Single commit checkpoints (even with small commits) require replicas to iterate through all checkpoints as it catches up to the latest changes. With multiple commits in the same checkpoint, we can evaluate them together using common checkpoint metadata and intelligently decide the commits to refresh on.
Astute readers might wonder why we don't simply checkpoint more frequently and continue using the single-commit checkpoints. This is done to enable refresh efficiencies. Single-commit checkpoints (even with small commits) require replicas to iterate through all checkpoints as they catch up to the latest changes. With multiple commits in the same checkpoint, we can evaluate them together using common checkpoint metadata and intelligently decide which commits to refresh on.


## Conclusion

Large checkpoint jumps in segment-replicated systems create a fundamental tension: you want replicas to catch up quickly, but absorbing too much change at once can destabilize the system. The traditional approach of always replicating the latest commit creates a cliff — when replicas fall behind, they must pay the full cost in one painful step, with page faults, latency spikes, and ultimately, timed out search requests.

Suggested change
Large checkpoint jumps in segment-replicated systems create a fundamental tension: you want replicas to catch up quickly, but absorbing too much change at once can destabilize the system. The traditional approach of always replicating the latest commit creates a cliff — when replicas fall behind, they must pay the full cost in one painful step, with page faults, latency spikes, and ultimately, timed out search requests.
Large checkpoint jumps in segment-replicated systems create a fundamental tension: you want replicas to catch up quickly, but absorbing too much change at once can destabilize the system. The traditional approach of always replicating the latest commit creates a cliff---when replicas fall behind, they must pay the full cost in one painful step, with page faults, latency spikes, and, ultimately, timed out search requests.


Bite-sized commits and adaptive refresh transform this cliff into a staircase. By maintaining a rolling history of commits and letting replicas step through them incrementally, we allow replicas to catch up at their own sustainable pace while maintaining predictable performance characteristics. Each refresh stays within safe resource bounds, page cache churn remains low, and search latency stays stable, even during update bursts or network hiccups.

The elegance of this approach lies in its simplicity. There's no complex coordination protocol, no expensive distributed consensus, just intelligent use of what Lucene already gives us: immutable segments and atomic refresh semantics. Replicas make local decisions about which commit to refresh on next, using simple heuristics like delta size thresholds. The system remains fully idempotent and retry-able; if anything fails mid-refresh, you simply resume from the last successful commit. Each bite-sized commit is actually an incremental backup of the index. As a nice side effect, we get fine grained point-in-time checkpoints to recover your index from, in case of outages or data corruption events.

Suggested change
The elegance of this approach lies in its simplicity. There's no complex coordination protocol, no expensive distributed consensus, just intelligent use of what Lucene already gives us: immutable segments and atomic refresh semantics. Replicas make local decisions about which commit to refresh on next, using simple heuristics like delta size thresholds. The system remains fully idempotent and retry-able; if anything fails mid-refresh, you simply resume from the last successful commit. Each bite-sized commit is actually an incremental backup of the index. As a nice side effect, we get fine grained point-in-time checkpoints to recover your index from, in case of outages or data corruption events.
The elegance of this approach lies in its simplicity. There's no complex coordination protocol, no expensive distributed consensus---just intelligent use of what Lucene already gives us: immutable segments and atomic refresh semantics. Replicas make local decisions about which commit to refresh on next, using simple heuristics like delta size thresholds. The system remains fully idempotent and retry-able; if anything fails mid-refresh, you simply resume from the last successful commit. Each bite-sized commit is actually an incremental backup of the index. As a nice side effect, we get fine-grained point-in-time checkpoints from which to recover your index in case of outages or data corruption events.


In production environments spanning multiple geographic regions with varying network conditions, this matters significantly. A replica in a distant data center experiencing bandwidth constraints can make steady progress without falling dangerously behind. A replica recovering from a restart can catch up incrementally rather than attempting one massive refresh.

At the same time, it is worth noting that this set up will increase your remote storage costs. Since we now store a sliding window of more frequent commits, we capture some transient segments that would’ve otherwise been skipped with less frequent checkpoints. This increase in storage is directly controlled by the window and frequency of checkpoints we chose to maintain, longer windows consume more storage. It is important to configure a remote storage clean up policy that periodically deletes older, obsolete checkpoints.

Suggested change
At the same time, it is worth noting that this set up will increase your remote storage costs. Since we now store a sliding window of more frequent commits, we capture some transient segments that would’ve otherwise been skipped with less frequent checkpoints. This increase in storage is directly controlled by the window and frequency of checkpoints we chose to maintain, longer windows consume more storage. It is important to configure a remote storage clean up policy that periodically deletes older, obsolete checkpoints.
At the same time, it is worth noting that this setup will increase your remote storage costs. Since you now store a sliding window of more frequent commits, you capture some transient segments that would've otherwise been skipped with less frequent checkpoints. This increase in storage is directly controlled by the window and frequency of checkpoints you choose to maintain---longer windows consume more storage. It is important to configure a remote storage cleanup policy that periodically deletes older, obsolete checkpoints.


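On the local Lucene side, keeping a sliding window of recent commits (so that older ones become eligible for cleanup) is typically done with a custom `IndexDeletionPolicy`. The sketch below keeps only the newest N commits; the window size, and any analogous cleanup policy for the remote store, are deployment-specific assumptions rather than Lucene defaults.

```java
import java.util.List;

import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.IndexDeletionPolicy;

/** Keeps only the newest N commits on disk; older commits become eligible for deletion. */
public class SlidingWindowDeletionPolicy extends IndexDeletionPolicy {
    private final int commitsToKeep;

    public SlidingWindowDeletionPolicy(int commitsToKeep) {
        this.commitsToKeep = commitsToKeep;
    }

    @Override
    public void onInit(List<? extends IndexCommit> commits) {
        onCommit(commits);
    }

    @Override
    public void onCommit(List<? extends IndexCommit> commits) {
        // Commits are ordered oldest to newest; delete everything outside the window.
        for (int i = 0; i < commits.size() - commitsToKeep; i++) {
            commits.get(i).delete();
        }
    }
}
```

Such a policy would be wired in via `IndexWriterConfig.setIndexDeletionPolicy(new SlidingWindowDeletionPolicy(10))`; cleaning up obsolete checkpoints in the remote store itself still requires a separate, deployment-specific policy.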

Support for this architecture is now available in Lucene 10.3, providing high-throughput, geographically distributed search systems with a proven path to more stable replication. If your replicas are experiencing latency spikes during refresh, or if you're dealing with cross-region replication challenges, adaptive refresh might exactly be the resilient replication strategy you've been looking for.

Suggested change
Support for this architecture is now available in Lucene 10.3, providing high-throughput, geographically distributed search systems with a proven path to more stable replication. If your replicas are experiencing latency spikes during refresh, or if you're dealing with cross-region replication challenges, adaptive refresh might exactly be the resilient replication strategy you've been looking for.
Support for this architecture is now available in Lucene 10.3, providing high-throughput, geographically distributed search systems with a proven path to more stable replication. If your replicas are experiencing latency spikes during refresh, or if you're dealing with cross-region replication challenges, adaptive refresh might be exactly the resilient replication strategy you've been looking for.

@natebower natebower (Collaborator) left a comment

Thanks @vigyasharma! LGTM

@pajuric This should be ready to publish.

@natebower natebower added the Done and ready to publish (The blog is approved and ready to publish) label Nov 17, 2025

Labels

Done and ready to publish (The blog is approved and ready to publish), Lucene (Lucene related content)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BLOG] Adaptive Refresh for Resilient Segment Replication

3 participants