Metric Filters Inherit Parent Query Granularity#1965
Metric Filters Inherit Parent Query Granularity#1965wtremml18 wants to merge 4 commits intodbt-labs:mainfrom
Conversation
Adds support for including `metric_time` in metric filter group_by clauses, ensuring filter metrics align with the parent query's time grain. Updates parsing, spec resolution, and query rendering to handle metric_time as a special case, and adds related tests and example metrics. Also includes minor refactoring and test fixture updates to support the new behavior.
|
I don't have much context on the customer request, but some thoughts that came to mind:
|
|
Hi Paul - thanks for the quick reply!
When the parent query doesn’t group by So nothing is discarded per se; the resolver just never adds the metric_time column because the user didn’t request any time grain.
I see
While Also note that this is fully backwards compatible, so it only affects users who go out of their way to intentionally use a metric filter intra-period.
I tested a few different combinations of edge-cases: Two metric filters, one with The generated SQL only projects metric_time__month once in the outer SELECT. Inside the joins, both the base metric and the filter metric use metric_time__month, but MetricFlow aliases everything. There’s no column-name collision because the final projection is the parent query’s metric_time__month. There is an unexpected behavior where the result functionally prioritizes the I can add a warning on-parse if that's a better failure mode - looks like a pretty basic addition! |
Issues with the currently proposed solution
To expand on the point above, our main concern is how this fits into the overall consistency of the query interface.
Another way to evaluate this is from a documentation perspective. Ideally, this behavior should be explainable in documentation without relying on conditional rules (“if… then…”). That constraint is important for keeping the mental model simple and predictable for users.
Can you share the generated SQL for a filter case like the following? It would help clarify how the two metric subqueries are used. For the example metric in the PR: How does Thoughts on a path forwardFollowing up on an internal discussion, we see the value in the proposed feature, but we still have concerns about the interface. One possibility is that this use case fits better into a more general solution. Several users have asked about defining metrics with query-time parameters (e.g. a cumulative metric where the window is specified at query time). The grain for A concrete approach we would support right now is requiring an explicit grain for While this isn’t ideal due to repetition, it lets us make progress on your request without introducing additional semantics for If you’re planning to update the PR, can you also split the monolithic commit into smaller, logical commits to make review easier? For example: https://github.com/dbt-labs/metricflow/pull/1966/commits |
|
@wtremml18 Checking to see if you have thoughts on the above? |
Summary
This PR makes metric filters time-aware by allowing
metric_timein filtergroup_byand wiring that through MetricFlow’s resolution pipeline. This addresses a real analytics gap where filters were applied “all‑time” even when the parent query is time‑bucketed.This is an update to a previous draft pr (#1658) and resolves this issue (#1659)
Example use case (now supported):
{{ Metric('filter_metric', group_by=['klaviyo_account_id','metric_time']) }} > 0When the query uses
--group-by metric_time__month, the filter is evaluated per month instead of all‑time.Motivation / User Impact
Without this, queries like “ARR from accounts sending push in that month” were incorrectly calculated as “ARR from accounts that ever sent push.” This impacts any metric filtered by a cohort or event metric when the query is time‑bucketed.
Key Changes
metric_time.metric_timeis present, the filter subquery joins on the parent query’s time grain.metric_timeitem, with optionalmetric_time. Invalid inputs are rejected with a clear error.Behavior (Expected)
metric_timeis in the filter, the filter is evaluated at the query grain (day/week/month/etc.).metric_timeis ignored (backward compatible).Edge Cases Tested (Local Project)
These cases were exercised with a real project (
ns_push_and_in_app_messaging) to validate behavior:filter_metricto monthly-only time grain.--group-by metric_time__dayfails at query resolution. ❌ intended outcome--group-by pk_parent_model__parent_dimensionsucceeds.--group-by pk_filter_model__filter_only_dimensionfails: no join path. ❌ intended outcomeklaviyo_account_id,metric_time, plus another dimension.metric_time__monthbut query is dailyfilter_only_dimension.parent_only_entity_identity to base semantic model.--group-by filter_only_dimensionfails (no join path). ❌ intended outcomeAPI Implications
Queryable time grains should effectively be intersected with the filter metric’s time grain. If the API exposes
queryable_time_granularities, it should not advertise grains that the filter metric can’t support (or queries will fail at resolution).Testing
hatch run dev-env:pytest tests_metricflow/query_rendering/test_metric_filter_rendering.pyhatch run dev-env:pytest tests_metricflow/dataflow/builder/test_dataflow_plan_builder.py -k metric_filterhatch run dev-env:pytest tests_metricflow/integration/query_output/test_metric_filter_output.pyDBT_PROJECT_DIR=~/dbt DBT_PROFILES_DIR="$HOME/.dbt" hatch run dev-env:dbt parse(frommetricflow/dbt-metricflow)DBT_PROJECT_DIR=~/dbt DBT_PROFILES_DIR="$HOME/.dbt" hatch run dev-env:mf query --metrics ns_push_and_in_app_messaging --group-by metric_time__month --explain --limit 5(frommetricflow/dbt-metricflow)Risks / Considerations
metric_timegroup_by protects against ambiguous or multi‑hop joins.Notes
metricflow,metricflow-semantics, anddbt-metricflowdev envs use a localdbt-semantic-interfacescheckout (editable). Dependency warnings are expected due to pin differences.dbt parsein the local project emits pre‑existing warnings unrelated to this change.- A separate PR to dbt-semantic-interfaces is forthcoming.