Fix `schema_adapter` integration tests not running #16835

kosiew · 2025-07-21T03:51:37Z

Which issue does this PR close?

Closes Schema adapter Integration tests are not being run #16801.

Rationale for this change

This PR restructures and refactors the schema adapter integration tests to improve maintainability, clarity, and test isolation. It separates the test logic into a dedicated schema_adapter module under the integration_tests directory, aligning with other modular test patterns in the codebase.

What changes are included in this PR?

- Removed `schema_adapter_integration_tests.rs` from the `integration_tests` directory.
- Created a new module `schema_adapter` and moved the tests there.
- Added `mod schema_adapter;` to `core_integration.rs` to include the new module.
- Enhanced the schema adapter test suite to:
  - Write and read test data using `InMemory` object store.
  - Validate consistent behavior of the `UppercaseAdapterFactory` across `ParquetSource`, `ArrowSource`, `CsvSource`, and `JsonSource`.
  - Confirm schema mapping behavior and adapter output schemas.
- Added missing `use` imports and corrected adapter error handling in existing test files.
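
As a rough illustration of the "one factory reused across several sources" idea, here is a minimal std-only sketch. It does not use the DataFusion traits: `AdapterFactory`, `UppercaseFactory`, `StubSource`, and `adapted_names_per_format` are hypothetical stand-ins invented for this example, and the real `SchemaAdapterFactory` API is different.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a schema adapter factory (NOT the DataFusion trait).
trait AdapterFactory {
    /// Returns the output column names for the given input column names.
    fn adapt_names(&self, input: &[&str]) -> Vec<String>;
}

#[derive(Debug, PartialEq)]
struct UppercaseFactory;

impl AdapterFactory for UppercaseFactory {
    fn adapt_names(&self, input: &[&str]) -> Vec<String> {
        input.iter().map(|name| name.to_uppercase()).collect()
    }
}

// Hypothetical file source that optionally holds a shared factory.
struct StubSource {
    format: &'static str,
    factory: Option<Arc<dyn AdapterFactory>>,
}

impl StubSource {
    fn new(format: &'static str) -> Self {
        StubSource { format, factory: None }
    }

    fn with_factory(mut self, factory: Arc<dyn AdapterFactory>) -> Self {
        self.factory = Some(factory);
        self
    }

    /// Column names after adaptation; unchanged if no factory is set.
    fn output_names(&self, input: &[&str]) -> Vec<String> {
        match &self.factory {
            Some(factory) => factory.adapt_names(input),
            None => input.iter().map(|name| name.to_string()).collect(),
        }
    }
}

/// Attaches the same factory instance to four stub sources and collects
/// the adapted column names reported by each one.
fn adapted_names_per_format() -> Vec<(&'static str, Vec<String>)> {
    let factory: Arc<dyn AdapterFactory> = Arc::new(UppercaseFactory);
    ["parquet", "arrow", "csv", "json"]
        .iter()
        .map(|&format| {
            let source = StubSource::new(format).with_factory(Arc::clone(&factory));
            (source.format, source.output_names(&["id", "name"]))
        })
        .collect()
}
```

Every stub source should report the same adapted names, which is the property the multi-source test in this PR is described as checking.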

Are these changes tested?

✅ Yes, this PR includes comprehensive unit and integration tests for:

- Adapter correctness and schema transformation behavior.
- Reusability of `SchemaAdapterFactory` across file sources.
- Compatibility with object stores and batch collection.

Are there any user-facing changes?

No, these changes are internal to the testing framework. There are no user-facing changes or breaking API changes introduced in this PR.

…ma adaptation

- Updated SchemaAdapterFactory create method signature to accept projected and table schema refs.
- Implemented map_column_index and map_schema methods in UppercaseAdapter to support case-insensitive column name mapping and schema projection.
- Added UppercaseSchemaMapper to handle the mapping of RecordBatch columns and column statistics according to the projection.
- Refactored adapt and output_schema methods accordingly.
- This enables correct schema and data mapping for adapters that change column names (e.g., to uppercase) in integration tests.
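
The case-insensitive column mapping described in this commit message could look roughly like the following. This is a simplified sketch over plain string slices, not the actual `UppercaseAdapter` code: the two free functions borrow the names `map_column_index` and `map_schema` from the message, but their signatures and return types here are assumptions made for illustration.

```rust
/// Hypothetical helper: index of `table_column` within `file_columns`,
/// compared while ignoring ASCII case.
fn map_column_index(table_column: &str, file_columns: &[&str]) -> Option<usize> {
    file_columns
        .iter()
        .position(|file_column| file_column.eq_ignore_ascii_case(table_column))
}

/// Hypothetical helper: for each table column found in the file, returns the
/// uppercased output name and the file column index to project.
/// Table columns that are missing from the file are skipped.
fn map_schema(table_columns: &[&str], file_columns: &[&str]) -> (Vec<String>, Vec<usize>) {
    let mut output_names = Vec::new();
    let mut projection = Vec::new();
    for table_column in table_columns {
        if let Some(index) = map_column_index(table_column, file_columns) {
            output_names.push(table_column.to_uppercase());
            projection.push(index);
        }
    }
    (output_names, projection)
}
```

The projection preserves table-schema order, so a mapper built on it can pick file columns by index and emit them under the uppercased names.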

…_tests.rs file and consolidating struct and implementation blocks for TestSchemaAdapterFactory, TestSchemaAdapter, and TestSchemaMapping. Update imports and adjust test configurations for ParquetSource and CsvSource.

alamb

Thank you @kosiew

alamb · 2025-07-25T19:35:31Z

datafusion/core/tests/parquet/schema_adapter.rs

+#[derive(Debug, PartialEq)]
+struct UppercaseAdapterFactory {}
+
+impl SchemaAdapterFactory for UppercaseAdapterFactory {


Moving the tests here sort of implies they are only related to parquet -- don't we apply schema adapter to other formats too?

~~However, since all the tests use parquet this seems like a good place to put them~~

Update: they don't all use parquet

alamb · 2025-07-25T19:39:11Z

datafusion/core/tests/parquet/schema_adapter.rs

+}
+
+#[tokio::test]
+async fn test_multi_source_schema_adapter_reuse() -> Result<()> {


Actually, I missed this one before -- given this is testing formats other than parquet, I think we should move it back into core_integration.

Here is a suggestion how: #16801 (comment)

Moved them to
datafusion/core/tests/integration_tests/schema_adapter/schema_adapter_integration_tests.rs

I think we should have a different approach

- Moved existing schema adapter integration tests from `schema_adaptation/schema_adapter_integration_tests.rs` to a new module in `datafusion/core/tests/integration_tests/schema_adapter/schema_adapter_integration_tests.rs`.
- Created a new file `schema_adapter.rs` in the integration tests folder to run and organize the tests under the schema adapter directory.
- The tests validate the functionality of a schema adapter that transforms column names to uppercase, ensuring compatibility across different file sources.
- Ensured proper organization of tests for future maintainability and clearer directory structure.

alamb · 2025-07-27T10:32:42Z

datafusion/core/tests/schema_adapter_integration.rs

@@ -0,0 +1,21 @@
+// Licensed to the Apache Software Foundation (ASF) under one


I think this effectively means we will have a new integration test binary that gets run like

cargo test --test schema_adapter_integration

Each test binary takes up significant space, and in the past we had problems with the runners' disk space filling up.

In this case, the new binary takes 188MB on my machine, so it would probably add the same to most CI runs:

(venv) andrewlamb@Andrews-MacBook-Pro-3:~/Software/datafusion$ du -s -h target/debug/deps/schema_adapter_integration-2b9fa3c8791a7c77
188M target/debug/deps/schema_adapter_integration-2b9fa3c8791a7c77

Here is a proposed PR to add it to the existing core_integration binary, so it would get run like this:

cargo test --test core_integration -- schema_adapter

Move schema adapter tests to the core_integration bianary kosiew/datafusion#27

And not add a new binary
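
For context, Cargo compiles each top-level file under `tests/` into its own test binary, while files pulled in through `mod` become part of the declaring binary. The sketch below shows the second arrangement in a single file. It assumes a `core_integration.rs` root; the real file would declare `mod schema_adapter;` and keep the tests in a separate module file. `uppercase_columns` and the test name are hypothetical, made up for this example.

```rust
// Sketch of a test-binary root file such as tests/core_integration.rs.
// The module is written inline only so the example is self-contained.
mod schema_adapter {
    /// Hypothetical helper standing in for the adapter under test.
    pub fn uppercase_columns(columns: &[&str]) -> Vec<String> {
        columns.iter().map(|column| column.to_uppercase()).collect()
    }

    // The full test path is `schema_adapter::uppercases_column_names`,
    // so a name filter of `schema_adapter` selects it.
    #[test]
    fn uppercases_column_names() {
        assert_eq!(uppercase_columns(&["a", "b"]), vec!["A", "B"]);
    }
}
```

Because the test harness matches the filter argument against the full test path, `cargo test --test core_integration -- schema_adapter` would run only the tests in this module without building an extra binary.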

alamb

Thank you @kosiew

I think this PR does run the test now, so I think it could be merged as is

However, I think it is worth considering using an existing test binary rather than making a new one, namely this PR:

kosiew#27

kosiew · 2025-07-27T13:15:13Z

Thanks @alamb , @findepi for your reviews.

alamb · 2025-07-27T13:27:59Z

Thank you for sticking with this @kosiew

kosiew added 22 commits July 21, 2025 11:49

Add integration test configuration for schema adapter

0c31907

Add integration tests for schema adapter functionality and create mod…

8bb5d1a

…ule structure

Remove duplicate struct and implementation blocks for TestSchemaAdapt…

355cb95

…erFactory, TestSchemaAdapter, and TestSchemaMapping in schema adapter integration tests.

Merge branch 'main' into integration-16801

a164ca8

Update schema adapter integration tests path in Cargo.toml to point t…

2acf1e4

…o the directory instead of a specific file

Remove schema_adapter_integration_tests block from Cargo.toml in data…

4c41b0c

…fusion/core

rename integration_tests folder to schema_adaptation

6cf9654

Refactor physical_optimizer module imports to use the correct path

a206e6f

Add end-to-end tests for schema-related functionality in schema.rs

c7e6b74

Update expected schema column name in parquet integration test

593b4b4

Move schema adapter tests

29e0bca

relocate schema adapter tests into the parquet suite reference new location in schema.rs remove old schema_adaptation tests

test: remove deprecated schema.rs test file

f08d5f5

Deleted the outdated end-to-end schema test file `schema.rs` from core tests, as schema adaptation tests have been moved to `parquet/schema_adapter.rs`.

refactor: simplify schema mapping and remove unused temporary directo…

0e554db

…ry in parquet integration tests

test: update expected schema column names in parquet integration test

458fc88

fix test_multi_source_schema_adapter_reuse

c985968

feat: add as_any method to schema adapters for downcasting support

8ee6d34

fix test_multi_source_schema_adapter_reuse

1a4e66e

test: update schema name assertions and enhance source adapter tests …

0c6dafe

…for Arrow, Parquet, Csv, and Json

test: enhance multi-source schema adapter reuse tests and update Test…

210260a

…SchemaAdapterFactory for equality comparison

fix: test_parquet_integration_with_schema_adapter

bb25948

github-actions bot added the core (Core DataFusion crate) and datasource (Changes to the datasource crate) labels Jul 21, 2025

kosiew added 6 commits July 22, 2025 08:28

Merge branch 'main' into integration-16801

15f5cab

refactor(schema_adapter): remove dead code and clean up whitespace

5f2f703

feat(schema_adapter): add as_any method for dynamic type access

761a07f

refactor tests, extract helper functions

414de48

Revert "refactor tests, extract helper functions"

f00cb42

This reverts commit 414de48.

refactor(schema_adapter): remove outdated comments from test file

06d4ea3

github-actions bot removed the datasource (Changes to the datasource crate) label Jul 25, 2025

kosiew added 2 commits July 25, 2025 10:47

Refactor schema adapter tests to remove unused as_any method and im…

a362bcf

…prove type checking

Merge branch 'main' into integration-16801

37f75e9

kosiew force-pushed the integration-16801 branch from be97092 to 37f75e9 Compare July 25, 2025 02:47

Fix fmt errors

0e8f15f

alamb previously approved these changes Jul 25, 2025

alamb reviewed Jul 25, 2025

kosiew mentioned this pull request Jul 27, 2025

Schema adapter Integration tests are not being run #16801

Closed

kosiew and others added 9 commits July 27, 2025 15:50

refactor: move schema adapter integration tests

e34b2bc

move integration tests from parquet/schema_adapter.rs add new integration_tests/schema_adapter module add root driver schema_adapter_integration.rs

chore: update license header in schema adapter integration tests

ad5e92b

Merge branch 'main' into integration-16801

36ddeea

Add Apache License header to schema adapter integration tests file

85e29be

Merge branch 'main' into integration-16801

a054c46

chore: add Apache License header to schema adapter integration tests …

67213f4

…file

Clippy

112f8b6

Merge remote-tracking branch 'apache/main' into integration-16801

54b94a7

alamb mentioned this pull request Jul 27, 2025

Move schema adapter tests to the core_integration bianary kosiew/datafusion#27

Merged

alamb reviewed Jul 27, 2025

alamb approved these changes Jul 27, 2025

alamb changed the title ~~Fix integration tests not running~~ Fix schema_adapter integration tests not running Jul 27, 2025

alamb added the development-process (Related to development process of DataFusion) label Jul 27, 2025

Move schema adapter tests to the core_integration binary

74d8a6d

github-actions bot removed the development-process (Related to development process of DataFusion) label Jul 27, 2025

kosiew merged commit ff777ea into apache:main Jul 27, 2025
28 checks passed
