Description
Is your feature request related to a problem or challenge?
Using the tooling that @BlakeOrth has been working on for instrumenting datafusion-cli, we can see which requests are actually made when we query remote files:
DataFusion CLI v50.2.0
> \object_store_profiling trace
ObjectStore Profile mode set to Trace
> SELECT COUNT(*) from 'https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_1.parquet' where "SearchPhrase" <> '';
+----------+
| count(*) |
+----------+
| 131559 |
+----------+
1 row(s) fetched.
Elapsed 0.606 seconds.
Object Store Profiling
Instrumented Object Store: instrument_mode: Trace, inner: HttpStore
2025-10-17T08:55:22.202041+00:00 operation=Get duration=0.025994s size=8 range: bytes=174965036-174965043 path=hits_compatible/athena_partitioned/hits_1.parquet
2025-10-17T08:55:22.228064+00:00 operation=Get duration=0.028127s size=34322 range: bytes=174930714-174965035 path=hits_compatible/athena_partitioned/hits_1.parquet
2025-10-17T08:55:22.295696+00:00 operation=Get duration=0.032303s size=15503 range: bytes=5120273-5135775 path=hits_compatible/athena_partitioned/hits_1.parquet
2025-10-17T08:55:22.296663+00:00 operation=Get duration=0.060797s size=3895852 range: bytes=145483536-149379387 path=hits_compatible/athena_partitioned/hits_1.parquet
2025-10-17T08:55:22.330266+00:00 operation=Get duration=0.040970s size=61815 range: bytes=46392516-46454330 path=hits_compatible/athena_partitioned/hits_1.parquet
There are 5!! requests made to read this file, annotated:
* operation=Get size=8 range: bytes=174965036-174965043 <-- reads the last 8 bytes to find metadata size
* operation=Get size=34322 range: bytes=174930714-174965035 <-- Footer Metadata
* operation=Get size=15503 range: bytes=5120273-5135775 <-- "SearchPhrase" data pages
* operation=Get size=3895852 range: bytes=145483536-149379387 <-- "SearchPhrase" data pages
* operation=Get size=61815 range: bytes=46392516-46454330
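To make the first two requests concrete, here is a minimal Python sketch of how a Parquet reader locates the footer: the file ends with a 4-byte little-endian metadata length followed by the 4-byte magic `PAR1`, so a reader with no size hint reads those 8 bytes first and then the metadata itself. The function names are hypothetical (this is not DataFusion's code), the file size is inferred from the trace (last byte offset + 1), and the tail bytes are synthetic rather than fetched from the real file.

```python
import struct

PARQUET_MAGIC = b"PAR1"
FOOTER_TAIL_LEN = 8  # 4-byte little-endian metadata length + 4-byte magic


def footer_tail_range(file_size):
    """Inclusive byte range of the 8-byte tail at the end of the file."""
    return (file_size - FOOTER_TAIL_LEN, file_size - 1)


def parse_footer_tail(tail):
    """Return the metadata length encoded in the 8-byte tail."""
    if len(tail) != FOOTER_TAIL_LEN or tail[4:] != PARQUET_MAGIC:
        raise ValueError("not a Parquet footer tail")
    (metadata_len,) = struct.unpack("<I", tail[:4])
    return metadata_len


def metadata_range(file_size, metadata_len):
    """Inclusive byte range of the footer metadata, which sits just before the tail."""
    end = file_size - FOOTER_TAIL_LEN  # exclusive end of the metadata
    return (end - metadata_len, end - 1)


# Assumption: file size inferred from the trace (last byte offset 174965043 + 1).
file_size = 174965044
# Synthetic tail standing in for the result of the first request.
tail = struct.pack("<I", 34322) + PARQUET_MAGIC

print(footer_tail_range(file_size))
print(metadata_range(file_size, parse_footer_tail(tail)))
```

Under these assumptions the two computed ranges line up with the first two entries in the trace above (`bytes=174965036-174965043` and `bytes=174930714-174965035`).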
Describe the solution you'd like
I would like to avoid the initial 8-byte request, which adds an entire extra object store request (and thus additional latency and cost).
Describe alternatives you've considered
I recommend changing the default of datafusion.execution.parquet.metadata_size_hint to 512k or 1MB.
It turns out there is already a setting that avoids this first 8-byte request: datafusion.execution.parquet.metadata_size_hint. When set, the reader prefetches a larger initial range from the end of the file and only makes a second request if that first range does not contain the whole footer.
Here is an example of using metadata_size_hint and reducing the number of requests to 4:
DataFusion CLI v50.2.0
> \object_store_profiling trace
ObjectStore Profile mode set to Trace
> set datafusion.execution.parquet.metadata_size_hint = 500000;
0 row(s) fetched.
Elapsed 0.001 seconds.
Object Store Profiling
> SELECT COUNT(*) from 'https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_1.parquet' where "SearchPhrase" <> '';
+----------+
| count(*) |
+----------+
| 131559 |
+----------+
1 row(s) fetched.
Elapsed 0.573 seconds.
Object Store Profiling
Instrumented Object Store: instrument_mode: Trace, inner: HttpStore
2025-10-17T09:11:51.870079+00:00 operation=Get duration=0.031349s size=500000 range: bytes=174465044-174965043 path=hits_compatible/athena_partitioned/hits_1.parquet
2025-10-17T09:11:51.986178+00:00 operation=Get duration=0.025578s size=15503 range: bytes=5120273-5135775 path=hits_compatible/athena_partitioned/hits_1.parquet
2025-10-17T09:11:51.986345+00:00 operation=Get duration=0.064672s size=3895852 range: bytes=145483536-149379387 path=hits_compatible/athena_partitioned/hits_1.parquet
2025-10-17T09:11:52.012529+00:00 operation=Get duration=0.064541s size=61815 range: bytes=46392516-46454330 path=hits_compatible/athena_partitioned/hits_1.parquet
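The effect of the hint on the footer requests can be sketched as follows. This is a simplified Python model, not DataFusion's implementation: `plan_metadata_requests` is a hypothetical helper, the file size is inferred from the trace, and the fallback range used when the hint is too small (re-reading the full metadata range) is an assumption, since the issue does not show that case.

```python
FOOTER_TAIL_LEN = 8  # 4-byte metadata length + 4-byte "PAR1" magic


def plan_metadata_requests(file_size, metadata_len, size_hint=None):
    """Return the inclusive byte ranges needed to load the footer metadata.

    Hypothetical model: without a hint, read the 8-byte tail and then the
    metadata; with a hint, read the last `size_hint` bytes and fall back to
    a second request only if the metadata does not fit in that prefetch.
    """
    needed = metadata_len + FOOTER_TAIL_LEN
    full_metadata = (file_size - needed, file_size - FOOTER_TAIL_LEN - 1)

    if size_hint is None:
        tail = (file_size - FOOTER_TAIL_LEN, file_size - 1)
        return [tail, full_metadata]

    # The prefetch must cover at least the tail and cannot exceed the file.
    first_len = min(max(size_hint, FOOTER_TAIL_LEN), file_size)
    first = (file_size - first_len, file_size - 1)
    if first_len >= needed:
        return [first]  # the whole footer arrived in one request
    # Assumption: the fallback re-reads the full metadata range.
    return [first, full_metadata]


file_size = 174965044  # assumption: last byte offset in the trace + 1
print(plan_metadata_requests(file_size, 34322))                    # two requests
print(plan_metadata_requests(file_size, 34322, size_hint=500000))  # one request
```

With the 500000-byte hint the model yields the single range `174465044-174965043`, which matches the first request in the trace above. The trade-off is that each file open transfers up to the hint size even when the footer is much smaller (34322 bytes here).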
Additional context
_No response_