-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Add support for table_changes function for Lakehouse #26882
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add support for table_changes function for Lakehouse #26882
Conversation
Reviewer's GuideThis PR adds a generic table_changes function to the Lakehouse connector by introducing a delegating TableChangesFunction, registering it via Guice in the connector and modules, and wiring a FunctionProvider to route calls to the appropriate DeltaLake or Iceberg implementation, with new smoke tests validating its behavior. File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `plugin/trino-lakehouse/src/test/java/io/trino/plugin/lakehouse/TestLakehouseIcebergConnectorSmokeTest.java:55-56` </location>
<code_context>
)\\E""");
}
+
+ @Test
+ public void testTableChangesFunction()
+ {
+ String tableName = "test_table_changes_function_" + randomNameSuffix();
</code_context>
<issue_to_address>
**suggestion (testing):** Missing test cases for error scenarios and edge cases in table_changes function.
Please add tests for querying non-existent tables, invalid snapshot IDs, and cases with no changes between snapshots to improve coverage of error handling and edge cases.
</issue_to_address>
### Comment 2
<location> `plugin/trino-lakehouse/src/test/java/io/trino/plugin/lakehouse/TestLakehouseDeltaConnectorSmokeTest.java:65-66` </location>
<code_context>
)\\E""");
}
+
+ @Test
+ public void testTableChangesFunction()
+ {
+ String tableName = "test_table_changes_function_" + randomNameSuffix();
</code_context>
<issue_to_address>
**suggestion (testing):** Missing test for table_changes on non-Delta tables and invalid arguments.
Please include tests for non-Delta tables and invalid arguments to ensure error handling is properly verified.
</issue_to_address>
### Comment 3
<location> `plugin/trino-lakehouse/src/test/java/io/trino/plugin/lakehouse/TestLakehouseDeltaConnectorSmokeTest.java:116-118` </location>
<code_context>
+ """);
+ }
+
+ private void assertTableChangesQuery(@Language("SQL") String sql, @Language("SQL") String expectedResult)
+ {
+ assertThat(query(sql))
+ .result()
+ .exceptColumns("_commit_timestamp")
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding assertions for error messages and empty results.
Adding assertions for empty results and expected errors will improve test coverage and ensure the function behaves correctly in all scenarios.
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
@Test | ||
public void testTableChangesFunction() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Missing test cases for error scenarios and edge cases in table_changes function.
Please add tests for querying non-existent tables, invalid snapshot IDs, and cases with no changes between snapshots to improve coverage of error handling and edge cases.
@Test | ||
public void testTableChangesFunction() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Missing test for table_changes on non-Delta tables and invalid arguments.
Please include tests for non-Delta tables and invalid arguments to ensure error handling is properly verified.
private void assertTableChangesQuery(@Language("SQL") String sql, @Language("SQL") String expectedResult) | ||
{ | ||
assertThat(query(sql)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Consider adding assertions for error messages and empty results.
Adding assertions for empty results and expected errors will improve test coverage and ensure the function behaves correctly in all scenarios.
9f5c6a1
to
7bbeb9b
Compare
The table_changes function is supported by DeltaLake and Iceberg modules. The TableChangesFunction does the initial analisys. The Lakehouse TableChangesFunction will delegate to the DeltaLake TableChangesFunction, or, if the table is not found, to the Iceberg TableChangesFunction. The LakehouseFunctionProvider returns the TableFunctionProcessorProviderFactory from DeltaLak or Iceberg, depending on the previously functionHandle generated by the TableChangesFunction.
7bbeb9b
to
3c4a62f
Compare
Add support for table_changes function for Lakehouse
Fixes #26756
Description
The table_changes function is supported by DeltaLake and Iceberg modules.
The TableChangesFunction does the initial analysis.
The Lakehouse TableChangesFunction will delegate to the DeltaLake TableChangesFunction, or, if the table is not found, to the Iceberg TableChangesFunction.
The LakehouseFunctionProvider returns the TableFunctionProcessorProviderFactory from DeltaLake or Iceberg, depending on the previously functionHandle generated by the TableChangesFunction.
The function for DeltaLake has 3 parameters:
schema_name, table_name, since_version (optional)
The function for Iceberg has 4 parameters (all required):
schema_name, table_name, start_snapshot_id, end_snapshot_id
The function for Lakehouse will have 5 parameters:
schema_name, table_name, start_snapshot_id, end_snapshot_id, since_version (optional)
start_snapshot_id and end_snapshot_id are optional for DeltaLake, but required for Iceberg module
For DeltaLake, the function is called like this:
or
For Iceberg, the function is called like this:
or
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
(X) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:
Summary by Sourcery
Add support for the system.table_changes function to the Lakehouse connector by delegating SQL analysis and execution to the DeltaLake or Iceberg implementations.
New Features:
Enhancements:
Tests: