apache · alamb · Aug 8, 2025 · Aug 7, 2025 · Aug 8, 2025 · Aug 8, 2025
diff --git a/dev/update_config_docs.sh b/dev/update_config_docs.sh
@@ -149,6 +149,32 @@ SET datafusion.execution.target_partitions = '1';
 
 [`ListingTable`]: https://docs.rs/datafusion/latest/datafusion/datasource/listing/struct.ListingTable.html
 
+## Memory-limited Queries
+
+When executing a memory-consuming query under a tight memory limit, DataFusion 
+will spill intermediate results to disk.
+
+When the [`FairSpillPool`] is used, memory is divided evenly among partitions. 
+The higher the value of `datafusion.execution.target_partitions`, the less memory 
+is allocated to each partition, and the out-of-core execution path may trigger 
+more frequently, possibly slowing down execution.
+
+Additionally, while spilling, data is read back in `datafusion.execution.batch_size` size batches.
+The larger this value, the fewer spilled sorted runs can be merged. Decreasing this setting
+can help reduce the number of subsequent spills required. 
+
+In conclusion, for queries under a very tight memory limit, it's recommended to
+set `target_partitions` and `batch_size` to smaller values.
+
+```sql
+-- Query still gets paralleized, but each partition will have more memory to use
+SET datafusion.execution.target_partitions = 4;
+-- Smaller than the default '8192', while still keep the benefit of vectorized execution
+SET datafusion.execution.batch_size = 1024;
+```
+
+[`FairSpillPool`]: https://docs.rs/datafusion/latest/datafusion/execution/memory_pool/struct.FairSpillPool.html
+
 EOF
 
 

diff --git a/docs/source/user-guide/configs.md b/docs/source/user-guide/configs.md
@@ -221,3 +221,29 @@ SET datafusion.execution.target_partitions = '1';
 ```
 
 [`listingtable`]: https://docs.rs/datafusion/latest/datafusion/datasource/listing/struct.ListingTable.html
+
+## Memory-limited Queries
+
+When executing a memory-consuming query under a tight memory limit, DataFusion
+will spill intermediate results to disk.
+
+When the [`FairSpillPool`] is used, memory is divided evenly among partitions.
+The higher the value of `datafusion.execution.target_partitions`, the less memory
+is allocated to each partition, and the out-of-core execution path may trigger
+more frequently, possibly slowing down execution.
+
+Additionally, while spilling, data is read back in `datafusion.execution.batch_size` size batches.
+The larger this value, the fewer spilled sorted runs can be merged. Decreasing this setting
+can help reduce the number of subsequent spills required.
+
+In conclusion, for queries under a very tight memory limit, it's recommended to
+set `target_partitions` and `batch_size` to smaller values.
+
+```sql
+-- Query still gets paralleized, but each partition will have more memory to use
+SET datafusion.execution.target_partitions = 4;
+-- Smaller than the default '8192', while still keep the benefit of vectorized execution
+SET datafusion.execution.batch_size = 1024;
+```
+
+[`fairspillpool`]: https://docs.rs/datafusion/latest/datafusion/execution/memory_pool/struct.FairSpillPool.html