Open
Description
What
`normalize_column_name()` does not perform case conversion on non-Snowflake databases, which breaks idempotency in `inject_missing_columns`, `remove_columns_not_in_database`, and `synchronize_data_types` when combined with the `output-to-upper` or `output-to-lower` settings.
Reproduction scenario
PostgreSQL + `output-to-upper: true`:
- First run: the DB returns `zebra` → the `get_columns()` key is `zebra` (normalize is a no-op) → `inject_missing_columns` adds `node.columns["ZEBRA"]` (output-to-upper applied)
- Second run: `current_columns = {normalize("ZEBRA", "postgres")}` = `{"ZEBRA"}`, `incoming_name = "zebra"` → `"zebra" not in {"ZEBRA"}` → True → re-added (overwritten)
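The two runs above can be simulated with a minimal, self-contained sketch. The `normalize` and `inject_missing` functions here are simplified, hypothetical stand-ins written for illustration, not the actual dbt-osmosis implementations; they only assume the behavior described in this issue (case-preserving normalization outside Snowflake, upper-casing applied when the column is written).

```python
def normalize(column: str, adapter: str) -> str:
    """Hypothetical stand-in: upper-cases only for Snowflake, else preserves case."""
    stripped = column.strip('"')
    return stripped.upper() if adapter == "snowflake" else stripped


def inject_missing(node_columns: dict, db_columns: list, adapter: str, to_upper: bool) -> list:
    """Hypothetical stand-in for the membership check described in this issue."""
    # Keys already stored on the node, passed through the normalizer.
    current = {normalize(name, adapter) for name in node_columns}
    added = []
    for incoming in db_columns:
        if normalize(incoming, adapter) not in current:
            # The output case setting is applied only when writing the key.
            key = incoming.upper() if to_upper else incoming
            node_columns[key] = {"name": key}
            added.append(key)
    return added


node_columns = {}
first_run = inject_missing(node_columns, ["zebra"], "postgres", to_upper=True)
second_run = inject_missing(node_columns, ["zebra"], "postgres", to_upper=True)
print(first_run, second_run)
```

An idempotent implementation would add nothing on the second run; in this sketch the second run adds `ZEBRA` again because `"zebra"` is compared against `{"ZEBRA"}`.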
Root cause
Both `current_columns` and `incoming_columns` are compared using `normalize_column_name`, but `normalize_column_name` is case-preserving on non-Snowflake databases. This means column keys transformed by `output-to-upper`/`output-to-lower` won't match the DB-derived normalized names.
Affected locations
- `src/dbt_osmosis/core/transforms.py`: `inject_missing_columns` (L333-336)
- `src/dbt_osmosis/core/transforms.py`: `remove_columns_not_in_database` (L379-382)
- `src/dbt_osmosis/core/transforms.py`: `synchronize_data_types` (L535-538)
Notes
- A naive case-insensitive comparison could break databases that distinguish quoted columns (`"Foo"` vs `"foo"`)
- This likely requires either rethinking `normalize_column_name` or introducing a separate comparison normalizer that accounts for output case settings
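One possible shape for a separate comparison normalizer is sketched below. The function name `comparison_key` and its `output_case` parameter are hypothetical and do not exist in dbt-osmosis. The sketch assumes that quoted identifiers reach the comparison with their double quotes intact, which may not hold for names returned by a database catalog.

```python
from typing import Optional


def comparison_key(column: str, output_case: Optional[str] = None) -> str:
    """Hypothetical helper: build a key used only for membership comparisons."""
    # Quoted identifiers are treated as case-sensitive and left untouched.
    if len(column) >= 2 and column[0] == '"' and column[-1] == '"':
        return column
    # Unquoted identifiers are folded the same way the output setting folds them.
    if output_case == "upper":
        return column.upper()
    if output_case == "lower":
        return column.lower()
    return column


same = comparison_key("zebra", "upper") == comparison_key("ZEBRA", "upper")
distinct = comparison_key('"Foo"', "upper") != comparison_key('"foo"', "upper")
print(same, distinct)
```

Under these assumptions the DB-derived `zebra` and the stored `ZEBRA` compare equal when `output-to-upper` is active, while `"Foo"` and `"foo"` stay distinct. The key is meant for comparison only, so the names written to the node would still follow the existing output settings.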