-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Description
Is this your first time submitting a feature request?
- I have read the expectations for open source contributors
- I have searched the existing issues, and I could not find an existing issue for this feature
- I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion
Describe the feature
When running a unit test against a model, the data types of any inputs are ultimately derived by the adapter.get_columns_in_relation macro called by the get_fixture_sql. This requires that the inputs to already exist on the target database to derive the column names and types. This potentially requiring a large number of upstream models to be built before a model can be unit tested, if it is towards the end of a dependency chain.
If the upstream models are contracted, they cannot be run unless the schemas match those defined in their YAML configuration files. Given they cannot exist without matching the contracts, the contracted data types may as well be used, removing the need for the upstream models to exist.
For a project that is fully contracted, this removes the need to run any models prior to unit testing, allowing for CI to run the unit tests first and fail faster if they are incorrect.
Describe alternatives you've considered
Leaving logic as is.
Who will this benefit?
Anyone unit testing models where any of the inputs are contracted, as fewer database calls will need to be made.
Are you interested in contributing this feature?
I have a working PoC (code examples below), but am unable to commit to the timeframes for getting a PR through.
Anything else?
Proposed changes:
-
dbt-adapters/src/dbt/include/global_project/macros/unit_test_sql/get_fixture_sql.sql-get_fixture_sql:Remove the
column_name_to_quoteddictionary and replace withadapter.quotecalls, as the macro currently fails ifcolumn_name_to_data_typesis supplied ascolumn_name_to_quotedis not instantiated.{% macro get_fixture_sql(rows, column_name_to_data_types) %} -- Fixture for {{ model.name }} {% set default_row = {} %} {%- if not column_name_to_data_types -%} {#-- Use defer_relation IFF it is available in the manifest and 'this' is missing from the database --#} {%- set this_or_defer_relation = defer_relation if (defer_relation and not load_relation(this)) else this -%} {%- set columns_in_relation = adapter.get_columns_in_relation(this_or_defer_relation) -%} {%- set column_name_to_data_types = {} -%} {%- for column in columns_in_relation -%} {#-- This needs to be a case-insensitive comparison --#} {%- do column_name_to_data_types.update({column.name|lower: column.data_type}) -%} {%- endfor -%} {%- endif -%} {%- if not column_name_to_data_types -%} {{ exceptions.raise_compiler_error("Not able to get columns for unit test '" ~ model.name ~ "' from relation " ~ this ~ " because the relation doesn't exist") }} {%- endif -%} {%- for column_name, column_type in column_name_to_data_types.items() -%} {%- do default_row.update({column_name: (safe_cast("null", column_type) | trim )}) -%} {%- endfor -%} {{ validate_fixture_rows(rows, row_number) }} {%- for row in rows -%} {%- set formatted_row = format_row(row, column_name_to_data_types) -%} {%- set default_row_copy = default_row.copy() -%} {%- do default_row_copy.update(formatted_row) -%} select {%- for column_name, column_value in default_row_copy.items() %} {{ column_value }} as {{ adapter.quote(column_name) }}{% if not loop.last -%}, {%- endif %} {%- endfor %} {%- if not loop.last %} union all {% endif %} {%- endfor -%} {%- if (rows | length) == 0 -%} select {%- for column_name, column_value in default_row.items() %} {{ column_value }} as {{ adapter.quote(column_name) }}{% if not loop.last -%},{%- endif %} {%- endfor %} limit 0 {%- endif -%} {% endmacro %} -
core/dbt/parser/unit_tests.py-_build_fixture_raw_codeAdd in a section to identify the
column_name_to_data_typesfrom a supplied node if it has not been supplied.def _build_fixture_raw_code(self, rows, column_name_to_data_types, fixture_format, node): if fixture_format == UnitTestFormat.SQL: return rows if column_name_to_data_types is None and node: column_name_to_data_types = { col.name: col.data_type for col in node.columns.values() if col.data_type } return ("{{{{ get_fixture_sql({rows}, {column_name_to_data_types}) }}}}").format( rows=rows, column_name_to_data_types=column_name_to_data_types )
-
core/dbt/parser/unit_tests.py-parse_unit_test_casePass the original input node into
_build_fixture_raw_codewhen defining thecommon_fieldsdict.common_fields = { "resource_type": NodeType.Model, # root directory for input and output fixtures "original_file_path": unit_test_node.original_file_path, "config": ModelConfig(materialized="ephemeral"), "database": original_input_node.database, "alias": original_input_node.identifier, "schema": original_input_node.schema, "fqn": original_input_node.fqn, "checksum": FileHash.empty(), "raw_code": self._build_fixture_raw_code(given.rows, None, given.format, original_input_node), "package_name": original_input_node.package_name, "unique_id": f"model.{original_input_node.package_name}.{input_name}", "name": input_name, "path": f"{input_name}.sql", }