-
Notifications
You must be signed in to change notification settings - Fork 348
feat: Add MV granularities and infer config from SummingMergeTree #1550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
🦋 Changeset detectedLatest commit: 60dff26 The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Code Review✅ No critical issues found. The PR successfully adds MV granularities and SummingMergeTree support with comprehensive test coverage. The implementation follows HyperDX patterns correctly. Minor observations:
Recommendation: Approve and merge ✅ |
E2E Test Results✅ All tests passed • 56 passed • 4 skipped • 731s
Tests ran across 4 shards in parallel. |
c9c0a2e to
d514c50
Compare
d514c50 to
77a1aa9
Compare
| .toLowerCase() | ||
| .includes('count') | ||
| ? 'count' | ||
| : 'sum'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we check other agg functions here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For AggregatingMergeTree, we do. But for SummingMergeTree, the behavior is that the summedColumnNames (which are the arguments passed to the engine eg. SummingMergeTree([...columns])) are all summed. That's only valid for sums and counts - so we handle count as a special case and assume others are sums.
SummingMergeTree does technically support other aggregations if the Column has AggregateFunction type - in which case the .map above will cover that case. I'll update the test case to show that.
| function getSourceTableColumn( | ||
| mvColumnName: string, | ||
| aggFn: string, | ||
| sourceTableColumnNames: Set<string>, | ||
| ) { | ||
| // By convention: MV Columns are named "<aggFn>__<sourceColumn>" | ||
| const nameSuffix = mvColumnName.split('__')[1]; | ||
| return sourceTableColumnNames.has(nameSuffix) && aggFn !== 'count' | ||
| ? nameSuffix | ||
| : ''; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could use mv DDL to figure out the mapping cuz users don't need to follow these conventions to map source table columns, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can in most cases! I've updated getSourceTableColumn to extract the column definitions from the materialized view DDL and try to match up source and target column names, if the convention-based matching fails. The user always has the option to explicitly configure the source column name, in case the inference fails though (such as due to multiple MVs writing to the table, or unexpectedly complex expressions).
41d734e to
e5e98fd
Compare
e5e98fd to
99b7b07
Compare
99b7b07 to
1c7e72e
Compare
wrn14897
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Closes HDX-3064
Summary
This PR makes two improvements to MV configuration
aggFn__sourceColumnconvention.Also, tests have been updated and additional tests have been backported from the EE repo.
Demo
Screen.Recording.2026-01-05.at.10.22.42.AM.mov
Testing
To test the Summing Merge Tree inference, create a materialized view like so: