Skip to content

Add 'metric_time' compatibility to metric filters#1658

Closed
wtremml18 wants to merge 6 commits intodbt-labs:mainfrom
wtremml18:main
Closed

Add 'metric_time' compatibility to metric filters#1658
wtremml18 wants to merge 6 commits intodbt-labs:mainfrom
wtremml18:main

Conversation

@wtremml18
Copy link

Description
This Pull Request updates metric-filter handling and associated tests to support filters on metrics that include multiple group_by items—especially the special case where one of them is metric_time. For example:

Filter: {{ Metric('bookings', group_by=['listing','metric_time']) }} > 0

This allows the sl query --metrics <metric_with_filter> --group-by metric_time__month to return time-bucketed results rather than filtering for all-time. This is especially important for business metrics filtered to something like "paid users"

Key Changes
Allow Multiple Items in group_by:

  • Updated GroupByMetricPattern.from_call_parameter_set in typed_patterns.py to parse all group_by entries, no longer forcing a single item. Likely need to validate that this doesn't cause fan-out joins.
  • If metric_time is among the group_by items, we treat it as a valid dimension link for the metric.

This is the using the "Everything is an Entity" approach :/

How to Use
With these changes, you can now write filters or expressions that respect the parent metric's group_by granularity:

{{ Metric('my_metric', group_by=['account_id','metric_time']) }} > 100

Potential Shortcomings / Future Considerations
Verbose Column Names:

  • If a user lists ['metric_time','listing'], the final column might include both references, plus metric_time again if your code uses subquery links. If you want shorter names (like "listing__metric_time__my_metric"), you may need a more refined name-building approach.

Performance of Multiple Group‑Bys:

  • Adding many items to group_by can increase the complexity of the joins. It may be worth confirming that the multi‑group‑by logic in typed_patterns.py does not cause unexpected query bloat.
    metric_time Semantics:
  • Currently, metric_time is still treated as if it were just another entity. If your pipeline needs time‑aware behaviors (like date truncation, time grain pushdown, etc.), you should confirm that these references are recognized and optimized.

Backward Compatibility:

  • Existing single-item filters should continue to function the same. But if your code or documentation assumes only one group_by is possible, you’ll want to update docs / help text.

Testing

  • test_metric_in_filter_with_metric_time ensures that 'listing' plus 'metric_time' in group_by is resolved properly.

Note that I wanted to get feedback before extending my changes in 'metricflow-semantics' before extending it to 'metricflow'.

After reviewing a few PRs, it's not quite clear which new tests are required. I see the original PR for enabling metrics in filters was pretty sparse as compared with the extensive integration tests in other PRs.

@cla-bot
Copy link

cla-bot bot commented Jan 30, 2025

Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA.

In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above above. If you have questions about the CLA, or if you believe you've received this message in error, please reach out through a comment on this PR.

CLA has not been signed by users: @wtremml18

@cla-bot cla-bot bot added the cla:yes label Feb 6, 2025
@courtneyholcomb
Copy link
Contributor

Hey @wtremml18! Thanks so much for the contribution!
A couple of things that might be helpful:

  • Integration tests that would be useful to add live in metricflow/tests_metricflow/query_rendering/test_metric_filter_rendering.py and metricflow/tests_metricflow/query_rendering/test_query_rendering.py. Writing & running those is an easy way to see what will break at each stage.
  • We'll need to add support for dimensions in the MetricCallParameterSet.group_by attribute (this lives in dbt-semantic-interfaces). You should be able to work around that in the meantime, we can likely pick that up on our side before this is finalized.

I think you're looking to unblock a specific use case, can you clarify what that is so we can help make sure this is shaping up to the right result? I.e., what is an example query and expected result you're looking for?

@wtremml18
Copy link
Author

wtremml18 commented Feb 11, 2025

Thanks @courtneyholcomb ! I have more details in this issue

The gist is that we need metric filters to respect past dates so we can calculate things like:
[Weekly/Monthly/Quarterly] [Metric] filtered to [Entities] that have done [Metric] within that timeframe :

  • Email Volume for SMS-Active Accounts:
    Metric:{sum_emails_sent}
    Filter:{sum_sms_sent > 0}
  • Support Ticket Volume from Mobile App Users
    Metric:{sum_support_tickets}
    Filter:{mobile_app_usage > 0}
  • Refund Rate for Accounts Contacting Support:
    Metric:{sum_refunds}
    Filter:{sum_support_tickets > 0}

These are a couple examples, but you can see how querying metrics like these would lead to unexpected results because the metric filters use 'all-time'

@wtremml18
Copy link
Author

@courtneyholcomb - almost a year later, I have two PRs! These are passing locally and behave as expected in both jaffle-sl-template and Klaviyo's production dbt project.

MetricFlow PR: #1965
dbt-Semantic-Interfaces PR: dbt-labs/dbt-semantic-interfaces#464

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants