README.md (4 changes: 2 additions & 2 deletions)
@@ -30,7 +30,7 @@ pnpm benchmark:export

## Benchmark (n = 100K, Rust)

-| k | FullSort (µs) | BIS (µs) | ESM (µs) | DeltaSort (µs) |
+| k | FullSort (µs) | BIS (µs) | ESM (µs) | **DeltaSort** (µs) |
| ------------- | ------------- | --------------- | --------------- | --------------------- |
| 1 (0.001%) | 1215.0 ±0.3% | 113.4 ±1.5% 🪶 | 797.8 ±0.4% | **15.7 ±4.3%** ⚡ |
| 10 (0.01%) | 2012.6 ±0.5% | 1127.8 ±1.1% 🪶 | 1006.8 ±0.6% | **98.2 ±3.0%** ⚡ |
@@ -57,7 +57,7 @@ The key insight: pre-sorting dirty values creates _segments_ that can be fixed _
```
paper/ — LaTeX source for the paper
rust/ — Rust implementation + benchmarks
-js/      — JavaScript implementation
+js/      — JavaScript implementation + benchmarks
```
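The README's "key insight" line in the hunk above is easiest to see in code. Below is a minimal Rust sketch of Phase 1 only (pre-sorting the dirty values in place); the names and the toy example are ours for illustration, not the repo's implementation, and Phase 2's local segment fixes are elided:

```rust
/// Phase 1 of the DeltaSort idea: sort the updated ("dirty") values
/// among themselves and write them back at the dirty positions. This
/// is what creates segments that can then be fixed locally.
fn presort_dirty(arr: &mut [i32], dirty: &mut [usize]) {
    dirty.sort_unstable(); // visit dirty positions left to right
    let mut vals: Vec<i32> = dirty.iter().map(|&i| arr[i]).collect();
    vals.sort_unstable(); // O(k log k)
    for (&i, v) in dirty.iter().zip(vals) {
        arr[i] = v; // dirty values are now mutually ordered in place
    }
}

fn main() {
    // A previously sorted array where indices 1 and 3 were overwritten.
    let mut arr = vec![10, 35, 30, 5, 50];
    let mut dirty = vec![1, 3];
    presort_dirty(&mut arr, &mut dirty);
    assert_eq!(arr, [10, 5, 30, 35, 50]);
    // After Phase 1 each remaining fix is local: 5 moves one slot left
    // within its segment, and 35 is already in place.
}
```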

## Feedback Welcome
paper/figures/algorithm-comparison.tex (2 changes: 1 addition & 1 deletion)
@@ -144,6 +144,6 @@
\draw[arrow] (\xM + \arrayw + 0.1, \rowC) -- (8.8, \rowC) -- (8.8, \rowB) -- (\xO - \arrowgap, \rowB);
\node[black, anchor=west] at (8.0, \rowC + 0.25) {\scriptsize\bfseries 6+6};
\end{tikzpicture}
-\caption{Data movement comparison for 3 updated values (yellow). BIS first uses 9 moves to shift updated values to the end (since insertion requires a fully sorted prefix), then 10 (0+4+6) moves to insert each into the sorted prefix. ESM uses 6 moves to sort the extracted values, then writes all 8 values to produce the merged output. DeltaSort uses 6 moves to sort updated values, then just 6 (2+2+2) moves to place each value correctly within its segment (see~\defref{def:segment}), \textbf{without having to form a fully sorted prefix upfront} like BIS.}
+\caption{Data movement comparison for 3 updated values (yellow). BIS first uses 9 moves to shift updated values to the end (since insertion requires a fully sorted prefix), then 10 (0+4+6) moves to insert each updated value into the sorted prefix. ESM uses 6 moves to sort the extracted values, then writes all 8 values to produce the merged output. DeltaSort uses 6 moves to sort updated values, then just 6 (2+2+2) moves to place each value correctly within its segment (see~\defref{def:segment}), \textbf{without having to form a fully sorted prefix upfront} like BIS.}
\label{fig:algorithm-comparison}
\end{figure}
paper/figures/delta-sort-example.tex (18 changes: 15 additions & 3 deletions)
@@ -6,6 +6,7 @@
cellup/.style={cell, fill=yellow!40, font=\scriptsize\bfseries},
cellL/.style={cell, fill=blue!25, font=\scriptsize\bfseries},
cellR/.style={cell, fill=red!20, font=\scriptsize\bfseries},
+cellgreen/.style={cell, fill=green!30, font=\scriptsize\bfseries},
larrow/.style={->, >=stealth, thick, blue!70!black},
rarrow/.style={->, >=stealth, thick, red!70!black},
segbrace/.style={decorate, decoration={brace, amplitude=5pt}},
@@ -70,9 +71,20 @@
\node[font=\small\bfseries, anchor=east] at (-0.3, -4.7) {After Phase 2:};

% Final sorted: 1, 2, 4, 16, 24, 28, 32, 35, 45, 56, 58, 64, 70
-\foreach \i/\v in {0/1, 1/2, 2/4, 3/16, 4/24, 5/28, 6/32, 7/35, 8/45, 9/56, 10/58, 11/64, 12/70} {
-\node[cell] (f\i) at (\i*0.58, -4.7) {\v};
-}
+% Green cells are the updated values in their final positions
+\node[cellgreen] (f0) at (0*0.58, -4.7) {1};
+\node[cellgreen] (f1) at (1*0.58, -4.7) {2};
+\node[cell] (f2) at (2*0.58, -4.7) {4};
+\node[cell] (f3) at (3*0.58, -4.7) {16};
+\node[cell] (f4) at (4*0.58, -4.7) {24};
+\node[cellgreen] (f5) at (5*0.58, -4.7) {28};
+\node[cellgreen] (f6) at (6*0.58, -4.7) {32};
+\node[cell] (f7) at (7*0.58, -4.7) {35};
+\node[cell] (f8) at (8*0.58, -4.7) {45};
+\node[cell] (f9) at (9*0.58, -4.7) {56};
+\node[cellgreen] (f10) at (10*0.58, -4.7) {58};
+\node[cell] (f11) at (11*0.58, -4.7) {64};
+\node[cellgreen] (f12) at (12*0.58, -4.7) {70};

% Movement arrows from Phase 1 to Phase 2 (updated values only)
% Labels show: #fix_number (leftBound, rightBound) - placed beside arrows
paper/figures/incremental-sorting-algorithms.tex (6 changes: 3 additions & 3 deletions)
@@ -6,9 +6,9 @@
\toprule
Algorithm & Time (Expected / Worst) & Space \\
\midrule
-Binary-Insertion-Sort$^{\dagger}$ & $O(kn)$ & $O(1)$ \\
-Extract–Sort–Merge$^{*}$$^{\dagger}$ & $O(k \log k + n)$ & $O(n)$ \\
-\textbf{DeltaSort}$^{*}$ & $O(n\sqrt{k})\,/\,O(kn)$ & $O(k)$ \\
+Binary-Insertion-Sort$^{\dagger}$ & $O(nk)$ & $O(1)$ \\
+Extract–Sort–Merge$^{*}$$^{\dagger}$ & $O(n + k \log k)$ & $O(n)$ \\
+\textbf{DeltaSort}$^{*}$ & $O(n\sqrt{k})\,/\,O(nk)$ & $O(k)$ \\
\bottomrule
\end{tabular}

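For a sense of scale of the time bounds in this table, take the benchmark's $n = 10^5$ with $k = 10^2$ updates (0.1%); the arithmetic below is ours, derived from the stated bounds:

```latex
% Time at n = 10^5, k = 10^2 (expected case for DeltaSort):
\[
  \underbrace{nk = 10^5 \cdot 10^2 = 10^7}_{\text{BIS}}
  \qquad \text{vs.} \qquad
  \underbrace{n\sqrt{k} = 10^5 \cdot \sqrt{10^2} = 10^6}_{\text{DeltaSort}}
\]
```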
paper/figures/sorting-algorithms.tex (6 changes: 3 additions & 3 deletions)
@@ -7,8 +7,8 @@
Algorithm & Comparisons (C) & Movement (M) & Space (S) \\
\midrule
FullSort$^{*}$ $(\texttt{sort\_by})$ & $O(n \log n)$ & - & - \\
-BIS & $O(k \log n)$ & $O(kn)$ & $O(1)$ \\
-ESM & $O(k \log k + n)$ & $O(n)$ & $O(n)$ \\
+BIS & $O(k \log n)$ & $O(nk)$ & $O(1)$ \\
+ESM & $O(n + k \log k)$ & $O(n)$ & $O(n)$ \\
\textbf{DeltaSort} & $O(k \log n)$ & $O(n\sqrt{k})^{\dagger}$ & $O(k)$ \\
\bottomrule
\end{tabular}
@@ -17,5 +17,5 @@
{\small $^{*}$Not update-aware; hence complexity is a function of only $n$. The exact movement and space complexities are implementation-dependent and not required for the present discussion.}


-{\small $^{\dagger}$Expected. Worst-case time is $O(kn)$.}
+{\small $^{\dagger}$Expected. Worst-case time is $O(nk)$.}
\end{table}
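The comparison counts in this table separate ESM from the binary-search-based approaches even at small $k$. With the same $n = 10^5$ and $k = 10^2$ (our arithmetic), this is the gap the experiments section later attributes to comparator cost:

```latex
% Comparisons at n = 10^5, k = 10^2:
\[
  \text{ESM: } n + k\log_2 k \approx 10^5 + 10^2 \cdot 6.6 \approx 1.0 \times 10^5
  \qquad \text{vs.} \qquad
  \text{BIS / DeltaSort: } k\log_2 n \approx 10^2 \cdot 16.6 \approx 1.7 \times 10^3
\]
```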
paper/sections/abstract.tex (2 changes: 1 addition & 1 deletion)
@@ -1,2 +1,2 @@
% Abstract
-Sorting values or records is a fundamental operation. When records need to be read in a particular order, sorting at read time incurs repeated $O(n \log n)$ cost and can become a bottleneck in read-heavy workloads. A common solution is to maintain a derived sorted read-replica that is kept updated as the underlying system-of-record changes. For updating read-replicas stored as arrays, existing approaches rely on either full re-sorting or incremental techniques such as repeated binary insertion, which incurs high data movement, or a merge-based technique, which requires linear auxiliary space. In this paper, we study incremental sorting under a model in which the sorting routine is explicitly informed of the indices of values updated since the previous sort—a setting that naturally arises in systems that track update deltas. Under this model, we present \emph{DeltaSort}, a new algorithm for incremental sorting that occupies a distinct point in the time-space trade-off spectrum. We provide theoretical analysis and experimental evidence showing that, for random update distributions, DeltaSort achieves \emph{lower execution time than the insertion-based approach} while using \emph{substantially less auxiliary space than the merge-based approach}.
+Sorting values or records is a fundamental operation. When records need to be read in a particular order, sorting at read time incurs repeated $O(n \log n)$ cost and can become a bottleneck in read-heavy workloads. A common solution is to maintain a derived sorted read-replica that is kept updated as the underlying system-of-record changes. For updating read-replicas stored as arrays, existing approaches rely on either full re-sorting or incremental techniques such as repeated binary insertion, which incurs high data movement, or a merge-based technique, which requires linear auxiliary space. In this paper, we study incremental sorting under a model in which the sorting routine is explicitly informed of the indices of values updated since the previous sort—a setting that naturally arises in systems that track update deltas. Under this model, we present \emph{DeltaSort}, a new algorithm for incremental sorting that occupies a distinct point in the time-space trade-off spectrum. We provide theoretical analysis and experimental evidence showing that, for random update distributions, DeltaSort achieves \emph{lower execution time than the insertion-based approach} ($O(n\sqrt{k})$ vs. $O(nk)$) while using \emph{substantially less auxiliary space than the merge-based approach} ($O(k)$ vs. $O(n)$).
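The update model the abstract describes can be pictured as the following Rust interface; the trait and names are our sketch for orientation, not an API taken from the paper or the repo:

```rust
/// Update-aware incremental sorting, as modeled in the abstract: `arr`
/// was in sorted order, the values at `updated` (indices into `arr`)
/// were then overwritten, and `resort` must restore sorted order,
/// ideally at a cost well below a full re-sort.
trait UpdateAwareSort {
    fn resort<T: Ord>(arr: &mut [T], updated: &[usize]);
}
```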
paper/sections/algorithm.tex (2 changes: 1 addition & 1 deletion)
@@ -138,7 +138,7 @@ \subsection{Complexity Analysis}
\end{proof}

\begin{remark}[Worst Case Movement]
-While the expected movement is $O(n\sqrt{k})$, the worst-case movement can be as high as $O(kn)$. This occurs when updated values form a single segment spanning the entire array. This happens when updates \emph{cluster monotonically} at the start or end of the array (see~\appref{sec:appendix-worst-case} for an illustration). Hence, in a practical setting, a hybrid algorithm will need to fall back to ESM or a full re-sort based on the number of segments.
+While the expected movement is $O(n\sqrt{k})$, the worst-case movement can be as high as $O(nk)$. This occurs when updated values form a single segment spanning the entire array. This happens when updates \emph{cluster monotonically} at the start or end of the array (see~\appref{sec:appendix-worst-case} for an illustration). Hence, in a practical setting, a hybrid algorithm will need to fall back to ESM or a full re-sort based on the number of segments.
\end{remark}
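One concrete way to realize the clustered-monotonic worst case from the remark above (our construction, for illustration; \figref{fig:worst-case} in the appendix is the paper's own example):

```rust
/// Build a sorted array, then overwrite its first k positions with
/// values larger than every clean value. The dirty values cluster at
/// the front but all belong at the far end, so each fix shifts about
/// (n - k) clean values: total movement ~ k(n - k), i.e. O(nk).
fn clustered_worst_case(n: usize, k: usize) -> (Vec<i64>, Vec<usize>) {
    assert!(k <= n);
    let mut arr: Vec<i64> = (0..n as i64).collect(); // sorted baseline
    for i in 0..k {
        arr[i] = (n + i) as i64; // now larger than every untouched value
    }
    (arr, (0..k).collect()) // the k updated indices: 0..k
}
```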

\begin{lemma}[Comparison Count]
paper/sections/appendix.tex (2 changes: 1 addition & 1 deletion)
@@ -2,7 +2,7 @@

\section{DeltaSort Worst-Case Movement}
\label{sec:appendix-worst-case}
-DeltaSort can exhibit worst-case movement of $O(kn)$ under clustered monotonic updates as shown in \figref{fig:worst-case}. In this case, all left-moving values must be shifted to the start of the array, and all right-moving values to the end of the array, resulting in maximal data movement of $O(kn)$.
+DeltaSort can exhibit worst-case movement of $O(nk)$ under clustered monotonic updates as shown in \figref{fig:worst-case}. In this case, all left-moving values must be shifted to the start of the array, and all right-moving values to the end of the array, resulting in maximal data movement of $O(nk)$.

\input{\figdir/worst-case}

paper/sections/experiments.tex (4 changes: 2 additions & 2 deletions)
@@ -23,7 +23,7 @@ \subsection{Results}
\figref{fig:rust-performance}(a) shows execution time (in \textmu s) for $n = 100$K values as a function of the percentage of updated values on a log--log scale. We use a log--log scale to highlight interesting behavior at the lower range of $k$, which is the most practically relevant range, since updates typically affect only a small fraction of an array at once. As $k$ increases, all update-aware algorithms eventually lose to FullSort at a critical crossover threshold $k_c$, as the overhead of processing updates overshadows any benefit from knowing what was updated. \figref{fig:rust-performance}(b) shows how $k_c$ varies for each algorithm across various scales. Several observations emerge:

\begin{enumerate}
-\item The asymptotic behavior for each algorithm aligns with theory. BIS exhibits steep growth consistent with its $O(kn)$ movement cost, quickly becoming impractical as $k$ increases. ESM is relatively flat for small $k$, where linear merge cost dominates and is largely independent of the number of updates. DeltaSort exhibits intermediate growth.
+\item The asymptotic behavior for each algorithm aligns with theory. BIS exhibits steep growth consistent with its $O(nk)$ movement cost, quickly becoming impractical as $k$ increases. ESM is relatively flat for small $k$, where linear merge cost dominates and is largely independent of the number of updates. DeltaSort exhibits intermediate growth.

\item \textbf{DeltaSort is the fastest algorithm for} $k \lesssim 1\%$. For example, at $k = 0.1\%$, DeltaSort is $\sim$10$\times$ faster than FullSort, $\sim$3$\times$ faster than ESM, and $\sim$15$\times$ faster than BIS. It uses more auxiliary space than BIS ($O(k)$ vs. $O(1)$) but substantially less than ESM ($O(k)$ vs. $O(n)$). Even though $1\%$ is small in absolute terms, it aligns well with practical workloads, where updates usually affect only a small percentage of the full dataset.

@@ -32,7 +32,7 @@ \subsection{Results}
\item The crossover threshold trend is consistent and largely stable across all scales (DeltaSort 20--40\%, ESM 60--85\%), indicating that \textbf{the above observations hold across scales}. BIS has a very low threshold due to its quadratic data movement cost, whereas ESM has the highest threshold due to its efficient linear merge pass. DeltaSort exhibits an intermediate threshold, in line with its intermediate time complexity.
\end{enumerate}

-These observations suggest that, much like hybrid blind sorting algorithms (e.g., TimSort~\cite{timsort}, DriftSort~\cite{driftsort}), it would be beneficial to construct \emph{adaptive update-aware} strategies. As an example, for the Rust implementation evaluated here, an adaptive strategy for an environment that requires fast execution without excessive space usage could be: use DeltaSort for $k \lesssim 10\%$, ESM for $20\% \lesssim k \lesssim 70\%$, and FullSort for $k \gtrsim 70\%$. The optimal cut-off points for a scenario would depend on several factors such as memory availability, the distribution of update sizes, and comparator cost. For example, as the comparator cost grows, DeltaSort preserves its advantage over BIS because both have a similar comparison count of $O(k \log n)$, while widening its gap relative to ESM, which has substantially higher comparison overhead. Hence, the crossover thresholds would shift in favor of DeltaSort, expanding the range of update sizes for which it is the preferred strategy.
+These observations suggest that, much like hybrid blind sorting algorithms (e.g., TimSort~\cite{timsort}, DriftSort~\cite{driftsort}), it would be beneficial to construct \emph{adaptive update-aware} strategies. As an example, for the Rust implementation evaluated here, an adaptive strategy for an environment that requires fast execution without excessive space usage could be: use DeltaSort for $k \lesssim 10\%$, ESM for $20\% \lesssim k \lesssim 70\%$, and FullSort for $k \gtrsim 70\%$. The optimal cut-off points for a scenario would depend on several factors such as memory availability, the distribution of update sizes, and comparator cost. For example, as the comparator cost grows, DeltaSort preserves its advantage over BIS because both have a similar comparison count of $O(k \log n)$, while widening its gap relative to ESM, which has substantially higher comparison overhead. Hence, the \emph{crossover thresholds would shift in favor of DeltaSort}, expanding the range of update sizes for which it is the preferred strategy.
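Read literally, the adaptive strategy above is a threshold dispatch on the update fraction. A hypothetical Rust sketch with the quoted cut-offs (the unspecified 10--20% band is assigned to ESM here; real thresholds would be tuned per workload, as the paragraph notes):

```rust
/// Which algorithm to run for a batch of k updated values out of n.
enum Strategy {
    DeltaSort, // fastest when few values changed
    Esm,       // linear merge wins in the mid range
    FullSort,  // update-awareness no longer pays off
}

fn pick_strategy(n: usize, k: usize) -> Strategy {
    let frac = k as f64 / n as f64;
    if frac <= 0.10 {
        Strategy::DeltaSort
    } else if frac <= 0.70 {
        Strategy::Esm
    } else {
        Strategy::FullSort
    }
}
```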

\subsection{Performance in V8 runtime}

paper/sections/future.tex (6 changes: 3 additions & 3 deletions)
@@ -3,10 +3,10 @@ \section{Future Work}
\label{sec:future}
%==============================================================================

-While our analysis focuses on a random bounded-range update model, many practical workloads exhibit additional structure. Examples include bounded-rank displacement or clustered updates. These models may improve or worsen the movement and comparison costs of DeltaSort, potentially impacting the range in which it is applicable. Hence, a more systematic study of such update models is important.
+While our analysis focuses on a random bounded-range update model, many practical workloads exhibit additional structure. Examples include \emph{bounded-rank} or \emph{clustered} updates. These models may improve or worsen the movement and comparison costs of DeltaSort, potentially impacting the range in which it is applicable. Hence, a more systematic study of such update models is important.

-Additionally, we showed how segments can be fixed \emph{locally} and \emph{independently}. This strongly suggests opportunities for parallel execution, where different segments could be fixed concurrently without interference. In contrast, BIS is inherently sequential due to overlapping in-place shifts, and even though ESM parallelizes quite well, it uses substantial space. Exploring the parallel variant of DeltaSort and understanding its scalability on multi-core systems is a promising direction for future work.
+Additionally, we showed how segments can be fixed \emph{locally} and \emph{independently}. This strongly suggests opportunities for \emph{parallel execution}, where different segments could be fixed concurrently without interference. In contrast, BIS is inherently sequential due to overlapping in-place shifts, and even though ESM parallelizes quite well, it uses substantial space. Exploring the parallel variant of DeltaSort and understanding its scalability on multi-core systems is a promising direction for future work.

We also saw that DeltaSort's performance advantage relies on predictable memory movement costs, an assumption that does not hold in runtimes like V8. Further study is needed to understand whether this is a general limitation or specific to V8.

-Finally, while this paper treats DeltaSort as a standalone algorithm for analysis, practical systems would benefit from hybrid adaptive strategies that select among BIS, DeltaSort, ESM, and full re-sorting based on update sizes and system constraints. Designing heuristic-based, low-overhead mechanisms for such dynamic selection is an open challenge.
+Finally, while this paper treats DeltaSort as a standalone algorithm for analysis, practical systems would benefit from \emph{hybrid adaptive strategies} that select among BIS, DeltaSort, ESM, and full re-sorting based on update sizes and system constraints. Designing heuristic-based, low-overhead mechanisms for such dynamic selection is an open challenge.
paper/sections/introduction.tex (4 changes: 2 additions & 2 deletions)
@@ -13,8 +13,8 @@ \section{Introduction}

Existing approaches for incrementally sorting arrays force a choice between two extremes:
\begin{enumerate}
-\item Binary-Insertion-Sort (\emph{BIS}): Collect all updated values from the array and insert them back one by one using binary search. This approach uses $O(1)$ space but incurs $O(kn)$ data movement for $k$ updates in an array of size $n$, making it suitable only for very small update batches. This is the \emph{space-efficient but time-inefficient} option.
-\item Extract–Sort–Merge (\emph{ESM}): Extract all updated values into a new array, sort them using an efficient $O(k \log k)$ algorithm, and then merge them back into the original array. This approach uses $O(k \log k + n)$ time but $O(n)$ space, even for small $k$, making it suitable only for larger update batches. This is the \emph{time-efficient but space-inefficient} option.
+\item Binary-Insertion-Sort (\emph{BIS}): Collect all updated values from the array and insert them back one by one using binary search. This approach uses $O(1)$ space but incurs $O(nk)$ data movement for $k$ updates in an array of size $n$, making it suitable only for very small update batches. This is the \emph{space-efficient but time-inefficient} option.
+\item Extract–Sort–Merge (\emph{ESM}): Extract all updated values into a new array, sort them using an efficient $O(k \log k)$ algorithm, and then merge them back into the original array. This approach uses $O(n + k \log k)$ time but $O(n)$ space, even for small $k$, making it suitable only for larger update batches. This is the \emph{time-efficient but space-inefficient} option.
\end{enumerate}
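For concreteness, here are minimal Rust sketches of the two baselines enumerated above, under the stated model (a sorted array whose values at `updated` indices were overwritten); the function names and shapes are ours, not the paper's reference implementations:

```rust
/// BIS: pull each updated value out, then re-insert it into the sorted
/// remainder via binary search: O(1) auxiliary space, O(nk) movement.
fn bis(arr: &mut Vec<i64>, updated: &[usize]) {
    let mut idxs = updated.to_vec();
    idxs.sort_unstable();
    let mut pending = Vec::with_capacity(idxs.len());
    for &i in idxs.iter().rev() {
        pending.push(arr.remove(i)); // remove high-to-low; O(n) shift each
    }
    for v in pending {
        // The remainder stayed sorted, so binary search finds the slot.
        let pos = arr.partition_point(|&x| x < v);
        arr.insert(pos, v); // O(n) shift per insertion -> O(nk) total
    }
}

/// ESM: extract updated values, sort them in O(k log k), then one merge
/// pass over the (still sorted) clean values: O(n) time, O(n) space.
fn esm(arr: &[i64], updated: &[usize]) -> Vec<i64> {
    let dirty: std::collections::HashSet<usize> = updated.iter().copied().collect();
    let mut extracted: Vec<i64> = updated.iter().map(|&i| arr[i]).collect();
    extracted.sort_unstable();
    let clean: Vec<i64> = arr
        .iter()
        .enumerate()
        .filter(|(i, _)| !dirty.contains(i))
        .map(|(_, &v)| v)
        .collect();
    let mut out = Vec::with_capacity(arr.len());
    let (mut i, mut j) = (0, 0);
    while i < extracted.len() && j < clean.len() {
        if extracted[i] <= clean[j] {
            out.push(extracted[i]);
            i += 1;
        } else {
            out.push(clean[j]);
            j += 1;
        }
    }
    out.extend_from_slice(&extracted[i..]);
    out.extend_from_slice(&clean[j..]);
    out
}

fn main() {
    // Sorted [1, 3, 5, 7, 9]; indices 0 and 4 were then overwritten.
    let mut a = vec![8, 3, 5, 7, 2];
    bis(&mut a, &[0, 4]);
    assert_eq!(a, [2, 3, 5, 7, 8]);
    assert_eq!(esm(&[8, 3, 5, 7, 2], &[0, 4]), vec![2, 3, 5, 7, 8]);
}
```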

This raises the question: are there other algorithms that offer \emph{intermediate} trade-offs? To our knowledge, no prior work exploits knowledge of updated indices to achieve a better time-space trade-off than the two extremes listed above. In this paper, we fill that gap. Specifically, this paper makes the following contributions: