feat(snowflake): add scheduler support for dynamic tables #1710
Add support for the Snowflake dynamic table `scheduler` property, which controls whether the internal refresh scheduler is ENABLE or DISABLE. This is distinct from `scheduling_state` (ACTIVE/SUSPENDED), which is a Snowflake-managed runtime status.

Key changes:
- Add `Scheduler` enum (ENABLE/DISABLE) and make `target_lag` optional in `SnowflakeDynamicTableConfig` to support scheduler=DISABLE without a target lag specification.
- Apply implicit scheduler defaults in both `parse_relation_config` and `parse_relation_results`: when scheduler is not explicitly set, default to ENABLE if target_lag is present, DISABLE otherwise. This aligns the Python-side change detection logic with the Jinja macro defaults.
- Add scheduler change detection in `dynamic_table_config_changeset` and prevent target_lag from being included in ALTER when its new value is None (avoids generating `target_lag = 'None'` in SQL).
- Update CREATE, REPLACE, and ALTER macros to render `scheduler` in DDL with correct ENABLE/DISABLE defaults. Guard ALTER target_lag rendering to skip when context is None.
- Update materialization to issue `ALTER DYNAMIC TABLE ... REFRESH` when scheduler is DISABLE (explicit or implicit via missing target_lag), since Snowflake won't auto-refresh in that case.
- Read the `scheduler` column from SHOW DYNAMIC TABLES when available.
- Fix FileSystemLoader in test_alter_relation_comment_macro.py to use an absolute path, avoiding TemplateNotFound errors from different CWDs.
- Add comprehensive unit tests for scheduler config parsing, changeset detection, and change detection logic, plus functional tests covering scheduler in CREATE/ALTER/REPLACE DDL and refresh behavior.

Made-with: Cursor
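For readers skimming the description, the changeset rule (detect scheduler changes, but never carry a `None` target lag into ALTER) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `DynamicTableState` and `detect_changes` are hypothetical names standing in for `SnowflakeDynamicTableConfig` and `dynamic_table_config_changeset`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DynamicTableState:
    """Hypothetical, simplified stand-in for SnowflakeDynamicTableConfig."""

    scheduler: str
    target_lag: Optional[str] = None


def detect_changes(existing: DynamicTableState, new: DynamicTableState) -> Dict[str, str]:
    """Return only the properties an ALTER statement would need to set."""
    changes: Dict[str, str] = {}
    if new.scheduler != existing.scheduler:
        changes["scheduler"] = new.scheduler
    # Skip target_lag when the new value is None, so that the ALTER never
    # ends up rendering the literal string 'None'.
    if new.target_lag is not None and new.target_lag != existing.target_lag:
        changes["target_lag"] = new.target_lag
    return changes
```

Under these assumptions, moving from `scheduler=ENABLE` with a target lag to `scheduler=DISABLE` without one yields a changeset that contains only the scheduler change.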
Thank you for your pull request! We could not find a changelog entry for this change in the dbt-snowflake package. For details on how to document a change, see the Contributing Guide.
    if scheduler := relation_config.config.extra.get("scheduler"):  # type:ignore
        config_dict["scheduler"] = scheduler.upper()
    elif config_dict.get("target_lag"):
        config_dict["scheduler"] = "ENABLE"
Can we use the enum values?
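One way the suggestion could look, sketched outside the adapter: the defaults return enum members instead of string literals. This assumes a two-member `Scheduler` enum as described in the PR; the `str`/`Enum` mix-in is used only so the sketch runs without the adapter's `StrEnum`, and `resolve_scheduler` is a hypothetical helper name.

```python
from enum import Enum
from typing import Optional


class Scheduler(str, Enum):
    # Stand-in for the PR's StrEnum-based Scheduler (assumption).
    ENABLE = "ENABLE"
    DISABLE = "DISABLE"


def resolve_scheduler(explicit: Optional[str], target_lag: Optional[str]) -> Scheduler:
    """Pick the explicit scheduler if set, else default based on target_lag."""
    if explicit:
        # Scheduler(...) raises ValueError for anything other than ENABLE/DISABLE.
        return Scheduler(explicit.upper())
    return Scheduler.ENABLE if target_lag else Scheduler.DISABLE
```

A side effect of going through the enum is that an unsupported value fails at parse time rather than being passed through to the DDL.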
    scheduler = dynamic_table.get("scheduler")
    target_lag = dynamic_table.get("target_lag")
    if scheduler is None:
        scheduler = "ENABLE" if target_lag else "DISABLE"
    query: str
    target_lag: str
    snowflake_warehouse: str
    target_lag: Optional[str] = None
`target_lag` is still required if the scheduler is enabled, right? We should raise a validation exception in that case.
target_lag is no longer mandatory in every case.
With scheduler = DISABLE, target_lag is not allowed (we treat dynamic tables as a unit of incrementalization, completely disconnected from the dynamic table's built-in scheduler).
With scheduler = ENABLE, target_lag is still mandatory.
PS
I was debating whether dbt needs to validate this, as opposed to letting Snowflake produce the appropriate, precise error. Implementing it here seems like extra validation effort that doesn't add much value and makes changes hard to version if Snowflake adds new modes.
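For reference while weighing that trade-off, the adapter-side check under discussion would be small. The sketch below is an illustration only: `SchedulerConfigError` and `validate_scheduler_config` are hypothetical names, and a real implementation would presumably raise one of dbt's own exception types.

```python
from typing import Optional


class SchedulerConfigError(ValueError):
    """Hypothetical error type used only for this sketch."""


def validate_scheduler_config(scheduler: str, target_lag: Optional[str]) -> None:
    """Enforce the two rules stated above; return None when the config is valid."""
    if scheduler == "ENABLE" and not target_lag:
        raise SchedulerConfigError("target_lag is required when scheduler is ENABLE")
    if scheduler == "DISABLE" and target_lag:
        raise SchedulerConfigError("target_lag is not allowed when scheduler is DISABLE")
```

The cost is the versioning concern raised above: any new Snowflake scheduler mode would require updating this check, whereas deferring to Snowflake's error requires nothing.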
        return cls("ON_CREATE")


    class Scheduler(StrEnum):
I wonder if there are actually three states:
- DISABLE - we don't want the MV to be updated
- ENABLE - requires target_lag (the default state today)
- MANUAL - dbt runs the refresh manually on every run
There are two states:
ENABLE (the default): dynamic tables/MVs work as they do today, scheduled by Snowflake either directly or as part of a downstream schedule.
DISABLE: only manual refreshes are allowed; the dynamic table scheduler never touches the table, either directly or through an upstream dependency. => DTs
I do not see a meaningful scenario that justifies a third mode in which a dynamic table is excluded from both dbt refreshes and the Snowflake dynamic table scheduler.
(I guess target_lag = downstream on all DTs in the pipeline achieves a functionally similar effect, but I am not seeing a meaningful scenario for that case either.)
Moreover, two modes give users a really simple syntax: they only need to specify the warehouse, and the absence of a target lag implies disabling the dynamic table scheduler and having dbt run the refreshes.
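The two-mode rule can be sketched as a single decision: dbt issues the refresh exactly when the scheduler is DISABLE, whether set explicitly or implied by a missing target lag. The helper names below are hypothetical and this is not the materialization code from the PR; the statement shape follows the `ALTER DYNAMIC TABLE ... REFRESH` form named in the description.

```python
from typing import Optional


def needs_dbt_refresh(scheduler: Optional[str], target_lag: Optional[str]) -> bool:
    """True when Snowflake will not auto-refresh, so dbt has to."""
    if scheduler is not None:
        return scheduler.upper() == "DISABLE"
    # No explicit scheduler: a missing target lag implies DISABLE.
    return not target_lag


def refresh_statement(relation_name: str) -> str:
    """Build the manual refresh statement for a fully qualified relation name."""
    return f"alter dynamic table {relation_name} refresh"
```

With this rule, a model configured with only a warehouse is refreshed by dbt on every run, while one that sets a target lag is left to Snowflake's scheduler.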