Implement limit push down for IcebergTableProvider
#1673
base: main
record_batch_stream_builder.with_row_groups(selected_row_group_indices);
}

if let Some(limit) = task.limit {
Should we enable `with_page_index` as suggested by the doc: https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.ArrowReaderBuilder.html#method.with_limit
Thanks! I extended the `should_load_page_index` logic, and `ArrowReaderOptions` is now initialized with `with_page_index(should_load_page_index)`.
Thanks @krinart
Happy to help @ZENOTME and thanks for the feedback!
Original PR: #19 Upstream PR: apache#1673
Which issue does this PR close?
N/A
What changes are included in this PR?
Previously `_limit` was ignored in `IcebergTableProvider::scan`:
iceberg-rust/crates/integrations/datafusion/src/table/mod.rs
Lines 149 to 163 in aad9e2e
This PR propagates limit all the way to the `ArrowReaderBuilder`.
Note: limit push down is only applied to each batch which means that `IcebergTableProvider::scan` may potentially return more records than specified by limit.
Which is OK according to `TableProvider::scan` documentation:
Are these changes tested?
Unit tests
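To illustrate the note above about the scan possibly returning more records than the limit, here is a minimal, std-only sketch. It is not the code from this PR: `ScanTask`, `read_task`, and `scan` are hypothetical stand-ins, and it assumes the limit is applied independently to each unit being read (modeled here as one task per data file) with nothing enforcing a global cap.

```rust
/// Hypothetical stand-in for one unit of reading: the rows of a data file
/// plus the limit that was pushed down to it.
struct ScanTask {
    rows: Vec<i64>,
    limit: Option<usize>,
}

/// Read a single task, stopping early once its own limit is reached.
fn read_task(task: &ScanTask) -> Vec<i64> {
    match task.limit {
        Some(limit) => task.rows.iter().copied().take(limit).collect(),
        None => task.rows.clone(),
    }
}

/// Concatenate the output of every task. No global limit is enforced here,
/// so the total can exceed the per-task limit.
fn scan(tasks: &[ScanTask]) -> Vec<i64> {
    tasks.iter().flat_map(read_task).collect()
}

fn main() {
    let limit = Some(2);
    let tasks = vec![
        ScanTask { rows: vec![1, 2, 3], limit },
        ScanTask { rows: vec![4, 5, 6], limit },
    ];

    // Each task stops after 2 rows, but together they yield 4.
    let rows = scan(&tasks);
    println!("scan returned {} rows for limit 2", rows.len());

    // The caller is still responsible for the final, exact limit.
    let limited: Vec<i64> = rows.into_iter().take(2).collect();
    println!("after the final limit: {:?}", limited);
}
```

In this sketch the pushed-down limit only reduces how much each task reads; the exact row count is still enforced by whoever consumes the scan output, which is the step the last two lines of `main` stand in for.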