planner: enable cd-c algorithm by default for join reorder | tidb-test=pr/2676#66349
planner: enable cd-c algorithm by default for join reorder | tidb-test=pr/2676#66349ti-chi-bot[bot] merged 55 commits intopingcap:masterfrom
Conversation
Signed-off-by: guo-shaoge <shaoge1994@163.com>
|
I've accepted your request and will start reviewing the pull request. I'll conduct a thorough review covering code quality, potential issues, and implementation details. ⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.
|
|
remainder: add inTest check if has remaining edge |
| } | ||
| }) | ||
| if p.SCtx().GetSessionVars().EnableOuterJoinReorder { | ||
| p, err := joinorder.Optimize(p) |
There was a problem hiding this comment.
P1 (see pkg/planner/core/joinorder/join_order.go:507): With default tidb_opt_cartesian_join_order_threshold=0 (CartesianJoinOrderThreshold), CD-C’s 2nd pass forces allowNoEQ=true and then multiplies cumCost by cartesianFactor (0), making cartesian/no-EQ joins cost 0. This can drive severe plan regressions/OOM once this path is default; please ensure the penalty never reduces cost (e.g., clamp factor >=1 / additive penalty) and/or don’t run the 2nd pass when cartesian is disabled.
There was a problem hiding this comment.
Thanks for the update. The next push will be reviewed to confirm the cartesian factor clamping/penalty fix is in place and no longer causes cost collapse to 0.
ℹ️ Learn more details on Pantheon AI.
There was a problem hiding this comment.
PTAL, I've updated the code related to cumCost and cartesianFactor
There was a problem hiding this comment.
Confirmed fixed ✅ — the clamp at lines 485–486 of the latest commit correctly sets cartesianFactor = 1 before the second-pass greedyConnectJoinNodes call whenever cartesianFactor <= 0, preventing cumCost * 0 = 0. This resolves the cost-collapse concern. Thanks for addressing it.
ℹ️ Learn more details on Pantheon AI.
| }) | ||
| if p.SCtx().GetSessionVars().EnableOuterJoinReorder { | ||
| p, err := joinorder.Optimize(p) | ||
| return p, false, err |
There was a problem hiding this comment.
P2 (see pkg/planner/core/joinorder/join_order.go:273): tidb_opt_join_reorder_threshold is effectively ignored (threshold check commented out; useGreedy := true), making DP unreachable. This is a compatibility/operability regression for users relying on DP join reorder with threshold > 0.
There was a problem hiding this comment.
DP will be implemented in future pr
There was a problem hiding this comment.
Understood — please consider adding a comment in the code noting that tidb_opt_join_reorder_threshold is temporarily non-functional (DP path pending a follow-up PR), so operators and future contributors are not surprised by the ignored variable. A tracking issue link would also help.
ℹ️ Learn more details on Pantheon AI.
| failpoint.Return(p2, false, err) | ||
| } | ||
| }) | ||
| if p.SCtx().GetSessionVars().EnableOuterJoinReorder { |
There was a problem hiding this comment.
P2: EnableOuterJoinReorder now selects which join reorder engine runs (joinorder.Optimize vs optimizeRecursive), impacting pure-inner-join queries too. That changes the var’s operational meaning beyond “whether outer joins participate”; consider a separate session var/flag for engine selection and keep this var scoped to outer-join behavior.
There was a problem hiding this comment.
outdated comment
There was a problem hiding this comment.
Acknowledged — if the engine-selection logic has been refactored in the latest commits, this concern no longer applies. Thanks for the update.
ℹ️ Learn more details on Pantheon AI.
|
@pantheon-ai[bot]: adding LGTM is restricted to approvers and reviewers in OWNERS files. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
| failpoint.Return(p2, false, err) | ||
| } | ||
| }) | ||
| if p.SCtx().GetSessionVars().EnableOuterJoinReorder { |
There was a problem hiding this comment.
P2: Routing the default join-reorder through joinorder.Optimize (via EnableOuterJoinReorder) bypasses the legacy extractJoinGroup path that checks tidb_opt_join_reorder_through_sel (TiDBOptJoinReorderThroughSel, around L53). This makes that knob ineffective in default configs. Should the new joinorder path honor it (or document/rename it as legacy-only)?
There was a problem hiding this comment.
outdated comment. TiDBOptJoinReorderThroughSel has been supported in the new impl
There was a problem hiding this comment.
Acknowledged — good to know TiDBOptJoinReorderThroughSel is now handled natively in the new implementation. The related P1 concern about selection predicates potentially being dropped when that knob is ON (filed separately at join_order.go:85/94) still stands and should be verified once addressed.
ℹ️ Learn more details on Pantheon AI.
|
/retest |
Signed-off-by: guo-shaoge <shaoge1994@163.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR introduces schema-aligned join condition handling and replaces failpoint-based control flow with session variable gating for outer join reordering. Core planner logic is enhanced with condition alignment functions, and test expectations across multiple suites are updated to reflect the resulting query plan changes. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip Migrating from UI to YAML configuration.Use the |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
pkg/planner/core/rule_join_reorder.go (2)
295-300: Remove stale failpoint placeholder block.The commented failpoint block is dead code and now just adds confusion around the active control path.
🧹 Proposed cleanup
- // failpoint.Inject("enableCDCJoinReorder", func(val failpoint.Value) { - // if val.(bool) { - // p2, err := joinorder.Optimize(p) - // failpoint.Return(p2, false, err) - // } - // })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/rule_join_reorder.go` around lines 295 - 300, Remove the stale commented failpoint block that references failpoint.Inject("enableCDCJoinReorder") and the inner call to joinorder.Optimize(p) (variables p and p2); simply delete the entire commented-out block so the active control flow is not obscured by dead code, and ensure no other references to enableCDCJoinReorder remain in this scope.
291-303: Please attach targeted planner test evidence for this optimizer-path switch.Given this changes join reorder dispatch logic, include the exact planner and integration test commands/results in the PR thread before merge.
As per coding guidelines
pkg/planner/**: Run targeted planner unit tests (go test -run <TestName> -tags=intest,deadlock) and update rule testdata when needed for changes to planner rules or logical/physical plans.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/rule_join_reorder.go` around lines 291 - 303, Add targeted planner unit and integration tests that exercise both branches of the join reorder dispatch (when p.SCtx().GetSessionVars().EnableOuterJoinReorder is true and false) so CI and reviewers can see exact planner outputs; create/modify tests (e.g., in a new or existing pkg/planner/*_test.go such as rule_join_reorder_test.go) that invoke the planner on representative queries and assert or capture the produced logical/physical plans for the Optimize and optimizeRecursive paths, update any rule testdata expected outputs under pkg/planner/testdata as needed, and run the commands to produce evidence: go test -run <TestName> -tags=intest,deadlock (and the relevant unit test runs) then paste the exact command lines and their outputs (planner and integration results) into the PR thread for review.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/planner/core/rule_join_reorder.go`:
- Around line 295-300: Remove the stale commented failpoint block that
references failpoint.Inject("enableCDCJoinReorder") and the inner call to
joinorder.Optimize(p) (variables p and p2); simply delete the entire
commented-out block so the active control flow is not obscured by dead code, and
ensure no other references to enableCDCJoinReorder remain in this scope.
- Around line 291-303: Add targeted planner unit and integration tests that
exercise both branches of the join reorder dispatch (when
p.SCtx().GetSessionVars().EnableOuterJoinReorder is true and false) so CI and
reviewers can see exact planner outputs; create/modify tests (e.g., in a new or
existing pkg/planner/*_test.go such as rule_join_reorder_test.go) that invoke
the planner on representative queries and assert or capture the produced
logical/physical plans for the Optimize and optimizeRecursive paths, update any
rule testdata expected outputs under pkg/planner/testdata as needed, and run the
commands to produce evidence: go test -run <TestName> -tags=intest,deadlock (and
the relevant unit test runs) then paste the exact command lines and their
outputs (planner and integration results) into the PR thread for review.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (26)
pkg/planner/core/casetest/binaryplan/testdata/binary_plan_suite_out.jsonpkg/planner/core/casetest/binaryplan/testdata/binary_plan_suite_xut.jsonpkg/planner/core/casetest/cbotest/testdata/analyze_suite_out.jsonpkg/planner/core/casetest/cbotest/testdata/analyze_suite_xut.jsonpkg/planner/core/casetest/enforcempp/BUILD.bazelpkg/planner/core/casetest/enforcempp/enforce_mpp_test.gopkg/planner/core/casetest/hint/testdata/integration_suite_out.jsonpkg/planner/core/casetest/hint/testdata/integration_suite_xut.jsonpkg/planner/core/casetest/mpp/testdata/integration_suite_out.jsonpkg/planner/core/casetest/mpp/testdata/integration_suite_xut.jsonpkg/planner/core/casetest/rule/rule_outer2inner_test.gopkg/planner/core/casetest/rule/testdata/join_reorder_suite_out.jsonpkg/planner/core/casetest/rule/testdata/outer2inner_out.jsonpkg/planner/core/casetest/rule/testdata/outer2inner_xut.jsonpkg/planner/core/casetest/testdata/json_plan_suite_out.jsonpkg/planner/core/casetest/testdata/json_plan_suite_xut.jsonpkg/planner/core/casetest/tpcds/testdata/tpcds_suite_out.jsonpkg/planner/core/casetest/tpcds/testdata/tpcds_suite_xut.jsonpkg/planner/core/rule_join_reorder.gotests/integrationtest/r/planner/core/casetest/index/index.resulttests/integrationtest/r/planner/core/casetest/rule/rule_join_reorder.resulttests/integrationtest/r/planner/core/issuetest/planner_issue.resulttests/integrationtest/r/planner/core/join_reorder2.resulttests/integrationtest/r/planner/core/plan_cache.resulttests/integrationtest/r/planner/core/rule_join_reorder.resulttests/integrationtest/r/select.result
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/integrationtest/r/tpch.result`:
- Line 447: The EXPLAIN predicate printer is dropping relation/alias qualifiers,
producing unsatisfiable text like eq(tpch50.nation.n_name, "JAPAN") for joined
instances; update the predicate/expr formatter (e.g., the function that formats
scalar expressions / column references used by the HashJoin explain path) to
include the per-source alias or qualifier (the join input alias such as n1/n2)
when rendering ColumnRef nodes so the predicate shows eq(n1.n_name, "JAPAN") and
eq(n2.n_name, "INDIA"); locate the formatter used by HashJoin explain output and
change it to prefer the relation alias (or join alias) over the fully qualified
catalog.table name when an alias is present.
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #66349 +/- ##
================================================
+ Coverage 77.6965% 78.1128% +0.4162%
================================================
Files 2012 1942 -70
Lines 550857 538480 -12377
================================================
- Hits 427997 420622 -7375
+ Misses 121144 117414 -3730
+ Partials 1716 444 -1272
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Signed-off-by: guo-shaoge <shaoge1994@163.com>
|
Review Complete Findings: 0 issues ℹ️ Learn more details on Pantheon AI. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/planner/core/joinorder/conflict_detector.go`:
- Around line 785-802: The cloned ScalarFunction's cached hash becomes stale
after you mutate its arguments in the block handling *expression.ScalarFunction
(where you call e.Clone(), assign newFunc.GetArgs()[i] = newArgs[i] after
alignExprNotNullWithSchema), so update the code to invalidate/recompute the hash
cache: after the mutation loop call expression.ReHashCode(newFunc) (or rebuild
the function via expression.NewFunctionInternal(...) instead of mutating the
clone) so the canonicalhashcode reflects the new arguments.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
pkg/planner/core/casetest/rule/testdata/outer_to_semi_join_suite_out.jsonpkg/planner/core/casetest/rule/testdata/outer_to_semi_join_suite_xut.jsonpkg/planner/core/casetest/tpch/testdata/tpch_suite_out.jsonpkg/planner/core/casetest/tpch/testdata/tpch_suite_xut.jsonpkg/planner/core/joinorder/conflict_detector.go
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
|
/retest |
1 similar comment
|
/retest |
|
/unhold |
Signed-off-by: guo-shaoge <shaoge1994@163.com>
|
/retest |
1 similar comment
|
/retest |
| └─HashJoin root inner join, equal:[eq(tpch50.orders.o_custkey, tpch50.customer.c_custkey) eq(tpch50.nation.n_nationkey, tpch50.customer.c_nationkey)] | ||
| ├─TableReader(Build) root data:TableFullScan | ||
| │ └─TableFullScan cop[tikv] table:customer keep order:false | ||
| └─HashJoin(Probe) root inner join, equal:[eq(tpch50.lineitem.l_orderkey, tpch50.orders.o_orderkey)] | ||
| └─HashJoin root inner join, equal:[eq(tpch50.customer.c_nationkey, tpch50.nation.n_nationkey)], other cond:or(and(eq(tpch50.nation.n_name, "JAPAN"), eq(tpch50.nation.n_name, "INDIA")), and(eq(tpch50.nation.n_name, "INDIA"), eq(tpch50.nation.n_name, "JAPAN"))) | ||
| ├─TableReader(Build) root data:Selection | ||
| │ └─Selection cop[tikv] or(eq(tpch50.nation.n_name, "INDIA"), eq(tpch50.nation.n_name, "JAPAN")) | ||
| │ └─TableFullScan cop[tikv] table:n2 keep order:false | ||
| └─HashJoin(Probe) root inner join, equal:[eq(tpch50.orders.o_custkey, tpch50.customer.c_custkey)] | ||
| ├─TableReader(Build) root data:TableFullScan | ||
| │ └─TableFullScan cop[tikv] table:orders keep order:false | ||
| └─HashJoin(Probe) root inner join, equal:[eq(tpch50.supplier.s_suppkey, tpch50.lineitem.l_suppkey)] | ||
| ├─HashJoin(Build) root inner join, equal:[eq(tpch50.nation.n_nationkey, tpch50.supplier.s_nationkey)] | ||
| │ ├─HashJoin(Build) root CARTESIAN inner join, other cond:or(and(eq(tpch50.nation.n_name, "JAPAN"), eq(tpch50.nation.n_name, "INDIA")), and(eq(tpch50.nation.n_name, "INDIA"), eq(tpch50.nation.n_name, "JAPAN"))) |
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
|
In response to a cherrypick label: new pull request created to branch |
What problem does this PR solve?
Issue Number: close #63887
Problem Summary:
What changed and how does it work?
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
Bug Fixes
Tests