Skip to content

Conversation

krinart
Copy link

@krinart krinart commented Sep 15, 2025

Which issue does this PR close?

N/A

What changes are included in this PR?

Previously _limit was ignored in IcebergTableProvider::scan:

_state: &dyn Session,
projection: Option<&Vec<usize>>,
filters: &[Expr],
_limit: Option<usize>,
) -> DFResult<Arc<dyn ExecutionPlan>> {
Ok(Arc::new(IcebergTableScan::new(
self.table.clone(),
self.snapshot_id,
self.schema.clone(),
projection,
filters,
)))
}
fn supports_filters_pushdown(

This PR propagates limit all the way to the ArrowReaderBuilder.

Note: limit push down is only applied to each batch which means that IcebergTableProvider::scan may potentially return more records than specified by limit.

Which is OK according to TableProvider::scan documentation:

If limit is specified, must only produce at least this many rows, (though it may return more).

Are these changes tested?

Unit tests

record_batch_stream_builder.with_row_groups(selected_row_group_indices);
}

if let Some(limit) = task.limit {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I extended should_load_page_index logic and ArrowReaderOptions is initialized with with_page_index(should_load_page_index).

Copy link
Contributor

@ZENOTME ZENOTME left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @krinart

@krinart
Copy link
Author

krinart commented Sep 19, 2025

Happy to help @ZENOTME and thanks for the feedback!
Any suggestions on how to proceed and merge this PR?

sgrebnov added a commit to spiceai/iceberg-rust that referenced this pull request Sep 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants