Only update TopK dynamic filters if the new ones are more selective #16433
Conversation
/// Filter matching the state of the sort for dynamic filter pushdown.
/// If `fetch` is `Some`, this will also be set and a TopK operator may be used.
/// If `fetch` is `None`, this will be `None`.
filter: Option<TopKDynamicFilters>,
I feel like there's some further refactoring that could happen here, e.g. splitting up SortExec, but I'm leaving that for another day.
pub fn with_filter(mut self, filter: Arc<DynamicFilterPhysicalExpr>) -> Self {
    self.filter = Some(filter);
    self
}
This was unused and had slipped through the cracks. I can make a new PR to just remove these methods.
self.filter
    .as_ref()
    .expect("Filter should be set when fetch is Some")
    .clone(),
I refactored so that the `TopK` struct always expects this parameter, which better reflects the reality of execution. But since it's strangely tied to the `fetch` param I am doing an `expect` assertion here. It should never fail at runtime.
I recommend turning this into an internal error so that if someone does hit this for some reason the symptom is less severe
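A minimal sketch of that suggestion, assuming the surrounding function returns a `Result`; the helper name is hypothetical, and `DataFusionError::Internal` (or DataFusion's `internal_err!` macro) stands in for the `expect`:

```rust
use datafusion_common::DataFusionError;

/// Hypothetical helper: surface a violated invariant as an internal error
/// instead of panicking, so a bug shows up as a query error rather than a crash.
fn require_filter<T: Clone>(filter: Option<&T>) -> Result<T, DataFusionError> {
    filter.cloned().ok_or_else(|| {
        DataFusionError::Internal(
            "TopK filter should be set when fetch is Some".to_string(),
        )
    })
}
```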
It seems in some cases it's faster:
And in other cases it's mixed / slower:
Seems like a bug in my implementation, right? I'd be surprised if the update checks I added are that heavy compared to other work...
let Some(thresholds) = self.heap.get_threshold_values(&self.expr)? else {
    return Ok(());
};

// Are the new thresholds more selective than our existing ones?
let should_update = {
    let mut current = self.filter.thresholds.write();
Perhaps this lock takes too long, plus there's the overhead of doing this for all updates and all partitions quite often? We could also store a `Row` instead of `thresholds` to make the comparison much quicker (should also be able to avoid allocations)?
I think this would be a good idea anyway to simplify the code.
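A tiny sketch of the byte-comparison idea (the function name is made up, and a plain `&[u8]` stands in for the encoded sort row); the diff further down in this thread uses a similar byte-wise comparison:

```rust
use std::cmp::Ordering;

/// Hypothetical check: with the threshold kept as encoded row bytes, deciding
/// whether to update is a single byte-wise comparison, with no ScalarValue
/// materialization or extra allocation.
fn is_more_selective(current: Option<&[u8]>, candidate: &[u8]) -> bool {
    match current {
        // No threshold published yet, so any candidate is an improvement.
        None => true,
        // In the TopK heap, a row that sorts lower is more selective.
        Some(cur) => cur.cmp(candidate) == Ordering::Greater,
    }
}
```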
Hm, my earlier benchmarks didn't seem correct. Not sure where the earlier run came from 🤔
In hindsight, I think another reason we don't see TopK being as effective with more partitions is that spreading the rows over partitions essentially makes synchronizing based on the per-partition heaps less effective.
(Removed a non-working proposal.) I am wondering if we can somehow synchronize the values in the heap efficiently in order to make the filter for a given partition more selective.
I think the only way to do that would be to make a global heap and put a lock on it, similar to what we do with the filters?
What do the current ones show? Not much improvement?
Yes, about the same compared to main.
We could try a shared heap. It might work? I guess it will be a sort of balance between lock contention and better selectivity. Maybe we can balance it by having distinct heaps for writes with no locks, but read-only references to all of them, so that when we do reads we compute the "combined" heap on the fly? Then we don't need any locks. The cost is that computations on the heap are larger, but as long as K times the number of partitions stays small that should be fine.
How would you compute the shared heap on the fly? I was thinking something similar: each partition writes to its own heap, and only values that updated the local heap get written to the shared heap (limiting the lock access time).
I was thinking you'd compute the top K of the top K * partitions on the fly. But maybe your proposal makes more sense.
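A rough sketch of the "top K of the top K * partitions" idea, with `u64` keys standing in for encoded sort rows (all names here are hypothetical, not from the PR):

```rust
/// Hypothetical merge: given each partition's current top-k values, the global
/// threshold is the k-th smallest value across all of them.
fn combined_threshold(per_partition: &[Vec<u64>], k: usize) -> Option<u64> {
    let mut all: Vec<u64> = per_partition.iter().flatten().copied().collect();
    if all.len() < k {
        // Not enough rows seen yet to bound the global top k.
        return None;
    }
    all.sort_unstable();
    Some(all[k - 1])
}
```

The per-read cost grows with k * partitions, which is the trade-off against lock contention discussed above.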
Marking as draft until we sort out #16501.
I did a bench run, confounding results:
Mostly... slower.
I am taking a look now to see if I can find anything.
if let Some(current_row) = threshold_guard.as_ref() {
    match current_row.as_slice().cmp(new_threshold_row) {
        Ordering::Greater => {
            // new < current, so new threshold is more selective
I think this was wrong before @adriangb - in the heap, lower means more selective.
The current solution seems roughly on par with main; I don't observe a speedup.
I guess we should try changing it to the global heap approach: write updates to the shared/global heap and update the filter based on the global heap.
I tried some stuff. Still not an obvious improvement:
ClickBench results for partitioned datasets look much better; the only query that is truly slower is a very fast one anyway (so probably noise). But it also seems like most of these should not be faster 🤔 so maybe my measurement is off.
@alamb I think you've reviewed but not approved; I thought you had approved. Can you take another look at this PR when you get a chance? I think I've addressed all of the feedback.
Looking at it now -- I kicked off the benchmarks again after making some changes to my GCP machine that will hopefully make the results more consistent.
🤖: Benchmark completed
🤖: Benchmark completed
Thank you @adriangb -- I went over this carefully again.
As long as the benchmark runs look good I think this PR is ready to go. A very nice optimization 👏
I found one small potential improvement (which you already merged :) )
I am working on a follow-on PR to extract the early stream stop logic.
Again, really nice work, and thank you!
let new_threshold_row = &max_row.row;

// Extract scalar values BEFORE acquiring lock to reduce critical section
let thresholds = match self.heap.get_threshold_values(&self.expr)? {
I think we can move this down after the check for update too:
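As a generic sketch of that kind of reordering (a closure stands in for `heap.get_threshold_values`; all names are hypothetical, not the reviewer's actual suggestion):

```rust
use std::cmp::Ordering;

/// Hypothetical ordering: decide whether the new threshold row is more
/// selective before doing the expensive scalar extraction, so the common
/// "no update needed" path stays cheap.
fn maybe_extract_thresholds<T>(
    current_row: Option<&[u8]>,
    new_row: &[u8],
    extract_scalars: impl FnOnce() -> Option<T>,
) -> Option<T> {
    let more_selective = match current_row {
        None => true,
        Some(cur) => cur.cmp(new_row) == Ordering::Greater,
    };
    if !more_selective {
        return None; // skip the scalar extraction entirely
    }
    extract_scalars()
}
```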
    .and_then(|b| schema_mapping.map_batch(b).map_err(Into::into))
});
// Create a stateful stream that can check pruning after each batch
let adapted = {
I found this code somewhat 🤯 (and this function is already 100s of lines long). I spent some time refactoring it into its own stream for readability, and I also understand it better now. I'll put up a follow-on PR to extract this logic -- no need to do it in this one.
🤖: Benchmark completed
Benchmarks look good: several faster queries, no queries really slower! I'll merge this in the next couple of hours if you don't first, Andrew.
Closes #16432
The idea here is to introduce a global thresholds reference that gets updated across all partitions.
This could drastically speed up early termination and will also avoid re-evaluating file-level statistics pruning in ParquetOpener.
I've also swapped our use of `Arc<RwLock<T>>` to `Arc<ArcSwap<T>>`, which should offer some perf improvements.
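A minimal sketch of what the `ArcSwap`-based shared threshold could look like (type and field names are hypothetical, and the real code stores richer state than raw row bytes): readers call `load()` without blocking, and writers publish a whole new value atomically.

```rust
use arc_swap::ArcSwap;
use std::sync::Arc;

/// Hypothetical shared threshold published across all partitions.
struct SharedThreshold {
    /// Encoded sort-row bytes; `None` means no threshold has been set yet.
    current: ArcSwap<Option<Vec<u8>>>,
}

impl SharedThreshold {
    fn new() -> Self {
        Self {
            current: ArcSwap::from_pointee(None),
        }
    }

    /// Publish `candidate` only if it sorts lower (is more selective) than the
    /// current threshold. The load-then-store pair is not atomic, so two
    /// writers can race; the worst case is briefly keeping a slightly less
    /// selective threshold, which is still correct.
    fn maybe_update(&self, candidate: Vec<u8>) {
        let more_selective = match &**self.current.load() {
            None => true,
            Some(existing) => candidate.as_slice() < existing.as_slice(),
        };
        if more_selective {
            self.current.store(Arc::new(Some(candidate)));
        }
    }

    /// Lock-free read, e.g. for pruning decisions in a Parquet opener.
    fn snapshot(&self) -> Arc<Option<Vec<u8>>> {
        self.current.load_full()
    }
}
```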