@adriangb
Contributor

This is a WIP PR opened only to run some benchmarks; it is not ready for review.
It was mostly AI-generated, with the goal of running benchmarks rather than producing a mergeable change.

@github-actions github-actions bot added core Core DataFusion crate physical-plan Changes to the physical-plan crate labels Oct 27, 2025
@adriangb
Contributor Author

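Here are some initial results. The first run below is the branch build (via cargo run --release), the second is an installed datafusion-cli v50.0.0 for comparison, and the last is a bench.sh comparison against main on tpch_sf10: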

❯ cargo run --release -p datafusion-cli -- -f q.sql
    Finished `release` profile [optimized] target(s) in 0.47s
     Running `target/release/datafusion-cli -f q.sql`
DataFusion CLI v50.3.0
+-------+
| count |
+-------+
| 1000  |
+-------+
1 row(s) fetched. 
Elapsed 0.006 seconds.

0 row(s) fetched. 
Elapsed 0.001 seconds.

+-----------+
| count     |
+-----------+
| 100000000 |
+-----------+
1 row(s) fetched. 
Elapsed 1.746 seconds.

0 row(s) fetched. 
Elapsed 0.001 seconds.

+----+----+----+
| k  | v  | k  |
+----+----+----+
| 50 | 50 | 50 |
| 51 | 51 | 51 |
| 52 | 52 | 52 |
| 53 | 53 | 53 |
| 54 | 54 | 54 |
| 55 | 55 | 55 |
| 56 | 56 | 56 |
| 57 | 57 | 57 |
| 58 | 58 | 58 |
| 59 | 59 | 59 |
| 60 | 60 | 60 |
| 61 | 61 | 61 |
| 62 | 62 | 62 |
| 63 | 63 | 63 |
| 64 | 64 | 64 |
| 65 | 65 | 65 |
| 66 | 66 | 66 |
| 67 | 67 | 67 |
| 68 | 68 | 68 |
| 69 | 69 | 69 |
| 70 | 70 | 70 |
| 71 | 71 | 71 |
| 72 | 72 | 72 |
| 73 | 73 | 73 |
| 74 | 74 | 74 |
| 75 | 75 | 75 |
| 76 | 76 | 76 |
| 77 | 77 | 77 |
| 78 | 78 | 78 |
| 79 | 79 | 79 |
| 80 | 80 | 80 |
| 81 | 81 | 81 |
| 82 | 82 | 82 |
| 83 | 83 | 83 |
| 84 | 84 | 84 |
| 85 | 85 | 85 |
| 86 | 86 | 86 |
| 87 | 87 | 87 |
| 88 | 88 | 88 |
| 89 | 89 | 89 |
| .            |
| .            |
| .            |
+----+----+----+
951 row(s) fetched. (First 40 displayed. Use --maxrows to adjust)
Elapsed 0.004 seconds.


datafusion on pushdown-hashes-case [?] is 📦 v50.3.0 via 🐍 v3.13.7 (datafusion-benchmark) via 🦀 v1.90.0 took 3s
❯ datafusion-cli -f q.sql                          
DataFusion CLI v50.0.0
+-------+
| count |
+-------+
| 1000  |
+-------+
1 row(s) fetched. 
Elapsed 0.065 seconds.

0 row(s) fetched. 
Elapsed 0.004 seconds.

+-----------+
| count     |
+-----------+
| 100000000 |
+-----------+
1 row(s) fetched. 
Elapsed 1.531 seconds.

0 row(s) fetched. 
Elapsed 0.001 seconds.

+----+----+----+
| k  | v  | k  |
+----+----+----+
| 50 | 50 | 50 |
| 51 | 51 | 51 |
| 52 | 52 | 52 |
| 53 | 53 | 53 |
| 54 | 54 | 54 |
| 55 | 55 | 55 |
| 56 | 56 | 56 |
| 57 | 57 | 57 |
| 58 | 58 | 58 |
| 59 | 59 | 59 |
| 60 | 60 | 60 |
| 61 | 61 | 61 |
| 62 | 62 | 62 |
| 63 | 63 | 63 |
| 64 | 64 | 64 |
| 65 | 65 | 65 |
| 66 | 66 | 66 |
| 67 | 67 | 67 |
| 68 | 68 | 68 |
| 69 | 69 | 69 |
| 70 | 70 | 70 |
| 71 | 71 | 71 |
| 72 | 72 | 72 |
| 73 | 73 | 73 |
| 74 | 74 | 74 |
| 75 | 75 | 75 |
| 76 | 76 | 76 |
| 77 | 77 | 77 |
| 78 | 78 | 78 |
| 79 | 79 | 79 |
| 80 | 80 | 80 |
| 81 | 81 | 81 |
| 82 | 82 | 82 |
| 83 | 83 | 83 |
| 84 | 84 | 84 |
| 85 | 85 | 85 |
| 86 | 86 | 86 |
| 87 | 87 | 87 |
| 88 | 88 | 88 |
| 89 | 89 | 89 |
| .            |
| .            |
| .            |
+----+----+----+
951 row(s) fetched. (First 40 displayed. Use --maxrows to adjust)
Elapsed 0.136 seconds.


datafusion on pushdown-hashes-case [?] is 📦 v50.3.0 via 🐍 v3.13.7 (datafusion-benchmark) via 🦀 v1.90.0
❯ ./benchmarks/bench.sh compare main pushdown-hashes-case
Comparing main and pushdown-hashes-case
--------------------
Benchmark tpch_sf10.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃       main ┃ pushdown-hashes-case ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  495.39 ms │            443.15 ms │ +1.12x faster │
│ QQuery 2     │   94.35 ms │             86.27 ms │ +1.09x faster │
│ QQuery 3     │  255.05 ms │            248.12 ms │     no change │
│ QQuery 4     │  209.80 ms │            196.09 ms │ +1.07x faster │
│ QQuery 5     │  375.67 ms │            349.27 ms │ +1.08x faster │
│ QQuery 6     │  146.43 ms │            133.02 ms │ +1.10x faster │
│ QQuery 7     │  573.03 ms │            505.02 ms │ +1.13x faster │
│ QQuery 8     │  434.82 ms │            366.27 ms │ +1.19x faster │
│ QQuery 9     │  653.18 ms │            558.85 ms │ +1.17x faster │
│ QQuery 10    │  359.55 ms │            330.17 ms │ +1.09x faster │
│ QQuery 11    │   77.48 ms │             66.45 ms │ +1.17x faster │
│ QQuery 12    │  204.63 ms │            188.96 ms │ +1.08x faster │
│ QQuery 13    │  359.42 ms │            345.62 ms │     no change │
│ QQuery 14    │  178.11 ms │            178.12 ms │     no change │
│ QQuery 15    │  260.67 ms │            258.49 ms │     no change │
│ QQuery 16    │   64.22 ms │             66.97 ms │     no change │
│ QQuery 17    │  656.44 ms │            703.65 ms │  1.07x slower │
│ QQuery 18    │ 1193.44 ms │           1072.72 ms │ +1.11x faster │
│ QQuery 19    │  293.48 ms │            285.36 ms │     no change │
│ QQuery 20    │  236.95 ms │            235.57 ms │     no change │
│ QQuery 21    │  706.11 ms │            692.01 ms │     no change │
│ QQuery 22    │   84.61 ms │             84.55 ms │     no change │
└──────────────┴────────────┴──────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                   ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (main)                   │ 7912.82ms │
│ Total Time (pushdown-hashes-case)   │ 7394.69ms │
│ Average Time (main)                 │  359.67ms │
│ Average Time (pushdown-hashes-case) │  336.12ms │
│ Queries Faster                      │        12 │
│ Queries Slower                      │         1 │
│ Queries with No Change              │         9 │
│ Queries with Failure                │         0 │
└─────────────────────────────────────┴───────────┘

Here, q.sql is essentially the hash join example from https://datafusion.apache.org/blog/2025/09/10/dynamic-filters/#hash-join-dynamic-filters
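
For anyone who wants to sanity-check the Change column and the summary totals, here is a minimal Python sketch of the arithmetic. The 5% "no change" band is an assumption inferred from the rows in the table (for example, Query 13 at roughly 4% is reported as no change while Query 4 at roughly 7% is reported as faster); it is not taken from the comparison script itself, and the helper name `describe_change` is made up for illustration.

```python
# Totals copied from the Benchmark Summary table above (milliseconds).
TOTAL_MAIN_MS = 7912.82
TOTAL_BRANCH_MS = 7394.69

# ASSUMPTION: differences within +/-5% are reported as "no change".
# This band is inferred from the table rows, not read from bench.sh.
NO_CHANGE_BAND = 0.05


def describe_change(baseline_ms, candidate_ms, band=NO_CHANGE_BAND):
    """Hypothetical helper: label a timing pair the way the Change column does."""
    speedup = baseline_ms / candidate_ms
    if speedup > 1 + band:
        return f"+{speedup:.2f}x faster"
    if speedup < 1 - band:
        # Report slowdowns as the inverse ratio, e.g. "1.07x slower".
        return f"{1 / speedup:.2f}x slower"
    return "no change"


# Overall comparison derived from the two totals.
overall = describe_change(TOTAL_MAIN_MS, TOTAL_BRANCH_MS)
saved_ms = TOTAL_MAIN_MS - TOTAL_BRANCH_MS
saved_pct = 100 * saved_ms / TOTAL_MAIN_MS

print(f"overall: {overall}, saved {saved_ms:.2f} ms ({saved_pct:.1f}%)")

# Spot checks against individual rows of the table.
print("Query 8: ", describe_change(434.82, 366.27))
print("Query 17:", describe_change(656.44, 703.65))
print("Query 14:", describe_change(178.11, 178.12))
```

Under that assumed band, the totals work out to about a 1.07x overall speedup, which is in line with the per-query numbers: most improvements are in the 1.07x to 1.19x range, with Query 17 as the only regression.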

@adriangb adriangb force-pushed the pushdown-hashes-case branch from 3d68a39 to 3332644 Compare October 30, 2025 21:44
@adriangb
Contributor Author

A version of this that supports InList as well: #18393

@adriangb
Contributor Author

adriangb commented Nov 4, 2025

Closing in favor of #18393 (comment)

@adriangb adriangb closed this Nov 4, 2025