Summary:
The MAP_SUBSET performance warning was triggering incorrectly in many
cases:
1. Warning on transform_values (which transforms values rather than
filtering keys)
2. Warning on map_filter when lambda uses values (e.g., v > 2)
3. Warning on map_filter with non-membership key comparisons (e.g., k >
0)
4. Warning on all map columns, even when not related to features
This diff tightens the detection logic to only warn when ALL conditions
are met:
- Function is map_filter (not transform_values or other map functions)
- Lambda does NOT reference the value argument
- Lambda uses simple key membership tests (k = c, k IN (...), OR
combinations, CONTAINS)
- Column name contains "features" (the main motivation for this rule)
This eliminates false positives while focusing on the intended use case
of optimizing feature map filtering operations.
Implementation details:
- Added isKeyOnlyMembershipFilter() to validate lambda only uses keys
- Added expressionReferencesName() to detect value argument usage
- Added isSimpleKeyEquality() to validate membership-only comparisons
- Added containsFeatures() to limit warnings to feature-related columns
- Updated test cases to reflect correct warning behavior
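The four conditions above can be sketched roughly as follows. This is an illustrative model, not the code in this diff: the helper names come from the list above, but their signatures, the toy `Expr` node type, the `shouldWarn` entry point, and the case-insensitive "features" match are all assumptions made for the example. CONTAINS handling is omitted, and only the `k = c` operand order is recognized.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class MapSubsetWarningSketch
{
    // Hypothetical toy AST node; the real analyzer works on Presto's own expression tree.
    // kind is "name", "literal", or an operator such as "=", "IN", "OR", ">", "+".
    public static final class Expr
    {
        final String kind;
        final String text; // identifier or literal text; null for operators
        final List<Expr> children;

        Expr(String kind, String text, Expr... children)
        {
            this.kind = kind;
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    public static Expr name(String n) { return new Expr("name", n); }
    public static Expr lit(String v) { return new Expr("literal", v); }
    public static Expr op(String kind, Expr... children) { return new Expr(kind, null, children); }

    // True if any node in the tree is an identifier equal to the given name.
    public static boolean expressionReferencesName(Expr e, String name)
    {
        if (e.kind.equals("name") && e.text.equals(name)) {
            return true;
        }
        return e.children.stream().anyMatch(c -> expressionReferencesName(c, name));
    }

    // True for `k = c`, `k IN (c1, ...)`, and OR combinations of those.
    public static boolean isSimpleKeyEquality(Expr e, String key)
    {
        switch (e.kind) {
            case "OR":
                return e.children.stream().allMatch(c -> isSimpleKeyEquality(c, key));
            case "=":
                return e.children.size() == 2
                        && isKey(e.children.get(0), key)
                        && e.children.get(1).kind.equals("literal");
            case "IN":
                return e.children.size() >= 2
                        && isKey(e.children.get(0), key)
                        && e.children.subList(1, e.children.size()).stream()
                                .allMatch(c -> c.kind.equals("literal"));
            default:
                return false; // range comparisons, arithmetic, etc. are not membership tests
        }
    }

    private static boolean isKey(Expr e, String key)
    {
        return e.kind.equals("name") && e.text.equals(key);
    }

    // The lambda body must ignore the value argument and test key membership only.
    public static boolean isKeyOnlyMembershipFilter(Expr body, String keyArg, String valueArg)
    {
        return !expressionReferencesName(body, valueArg) && isSimpleKeyEquality(body, keyArg);
    }

    // Assumption: the match is case-insensitive; the summary only says "contains".
    public static boolean containsFeatures(String columnName)
    {
        return columnName.toLowerCase(Locale.ROOT).contains("features");
    }

    // Warn only when all conditions hold.
    public static boolean shouldWarn(String function, String column, String keyArg, String valueArg, Expr body)
    {
        return function.equals("map_filter")
                && containsFeatures(column)
                && isKeyOnlyMembershipFilter(body, keyArg, valueArg);
    }

    public static void main(String[] args)
    {
        Expr keyEquals = op("=", name("k"), lit("2"));
        Expr valueGreater = op(">", name("v"), lit("1"));
        System.out.println(shouldWarn("map_filter", "user_features", "k", "v", keyEquals));
        System.out.println(shouldWarn("map_filter", "user_features", "k", "v", valueGreater));
        System.out.println(shouldWarn("map_filter", "x", "k", "v", keyEquals));
    }
}
```

Under these assumptions, only the first call in `main` should report a warning, which matches the updated expectations in TestAnalyzer.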
Differential Revision: D84627028
```
== NO RELEASE NOTE ==
```
presto-main-base/src/test/java/com/facebook/presto/sql/analyzer/TestAnalyzer.java
11 additions & 14 deletions
@@ -168,30 +168,27 @@ void testNoORWarning()
     @Test
     public void testMapFilterWarnings()
     {
-        assertHasWarning(
-                analyzeWithWarnings("SELECT map_filter(x, (k, v) -> v > 1) FROM (VALUES (map(ARRAY[1,2], ARRAY[2,3]))) AS t(x)"),
-                PERFORMANCE_WARNING,
-                "Function 'presto.default.map_filter' uses a lambda on large maps which is expensive. Consider using map_subset");
+        assertNoWarning(analyzeWithWarnings("SELECT map_filter(user_features, (k, v) -> v > 1) FROM (VALUES (map(ARRAY[1,2], ARRAY[2,3]))) AS t(user_features)"));
 
         assertHasWarning(
-                analyzeWithWarnings("SELECT map_filter(x, (k, v) -> k = 2) FROM (VALUES (map(ARRAY[1,2,3], ARRAY[10,20,30]))) AS t(x)"),
+                analyzeWithWarnings("SELECT map_filter(user_features, (k, v) -> k = 2) FROM (VALUES (map(ARRAY[1,2,3], ARRAY[10,20,30]))) AS t(user_features)"),
                 PERFORMANCE_WARNING,
                 "Function 'presto.default.map_filter' uses a lambda on large maps which is expensive. Consider using map_subset");
 
         assertHasWarning(
-                analyzeWithWarnings("SELECT map_filter(x, (k, v) -> k IN (1, 3)) FROM (VALUES (map(ARRAY[1,2,3], ARRAY[10,20,30]))) AS t(x)"),
+                analyzeWithWarnings("SELECT map_filter(user_features, (k, v) -> k IN (1, 3)) FROM (VALUES (map(ARRAY[1,2,3], ARRAY[10,20,30]))) AS t(user_features)"),
                 PERFORMANCE_WARNING,
                 "Function 'presto.default.map_filter' uses a lambda on large maps which is expensive. Consider using map_subset");
 
-        assertHasWarning(
-                analyzeWithWarnings("SELECT map_filter(x, (k, v) -> v IN (20, 30)) FROM (VALUES (map(ARRAY[1,2,3], ARRAY[10,20,30]))) AS t(x)"),
-                PERFORMANCE_WARNING,
-                "Function 'presto.default.map_filter' uses a lambda on large maps which is expensive. Consider using map_subset");
+        assertNoWarning(analyzeWithWarnings("SELECT map_filter(user_features, (k, v) -> v IN (20, 30)) FROM (VALUES (map(ARRAY[1,2,3], ARRAY[10,20,30]))) AS t(user_features)"));
 
-        assertHasWarning(
-                analyzeWithWarnings("SELECT map_filter(x, (k, v) -> k + v > 25) FROM (VALUES (map(ARRAY[1,2,3], ARRAY[10,20,30]))) AS t(x)"),
-                PERFORMANCE_WARNING,
-                "Function 'presto.default.map_filter' uses a lambda on large maps which is expensive. Consider using map_subset");
+        assertNoWarning(analyzeWithWarnings("SELECT map_filter(user_features, (k, v) -> k + v > 25) FROM (VALUES (map(ARRAY[1,2,3], ARRAY[10,20,30]))) AS t(user_features)"));
+
+        assertNoWarning(analyzeWithWarnings("SELECT map_filter(user_features, (k, v) -> k > 2) FROM (VALUES (map(ARRAY[1,2,3], ARRAY[10,20,30]))) AS t(user_features)"));
+
+        assertNoWarning(analyzeWithWarnings("SELECT transform_values(user_features, (k, v) -> v * 2) FROM (VALUES (map(ARRAY[1,2], ARRAY[2,3]))) AS t(user_features)"));
+
+        assertNoWarning(analyzeWithWarnings("SELECT map_filter(x, (k, v) -> k = 2) FROM (VALUES (map(ARRAY[1,2,3], ARRAY[10,20,30]))) AS t(x)"));