[SPARK-52841][SQL][TESTS] Fix PlanStabilitySuite id normalization #51534

Open: wants to merge 2 commits into base: master

Contributor

@peter-toth peter-toth commented Jul 17, 2025

What changes were proposed in this pull request?

This PR fixes the id normalization in PlanStabilitySuites.

Why are the changes needed?

Currently, if we run only a subset of the PlanStabilitySuites, the run might fail due to conflicts between plan ids and expression ids:

build/sbt "sql/testOnly *TPCDSV1_4_PlanStabilitySuite"
...
[info] - check simplified (tpcds-v1.4/q9) *** FAILED *** (81 milliseconds)

This is because the plan id is formatted as id=#... in SubqueryExec nodes: the expr id regex doesn't skip that form, and the plan id regex doesn't take it into account.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing UTs + manual subset run.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Jul 17, 2025

// Normalize the plan id in Exchange nodes. See `Exchange.stringArgs`.
// Normalize the plan ids in Exchange and Subquery nodes.
Member

Thank you for supporting Subquery correctly.

Member

@dongjoon-hyun dongjoon-hyun left a comment

@dongjoon-hyun dongjoon-hyun changed the title from "[SPARK-52841][SQL] Fix PlanStabilitySuite id normalization" to "[SPARK-52841][SQL][TESTS] Fix PlanStabilitySuite id normalization" on Jul 17, 2025
@dongjoon-hyun
Member

I added [TESTS] since this is a test-only PR.

@@ -78,8 +78,8 @@ trait PlanStabilitySuite extends DisableAdaptiveExecutionSuite {
}

private val referenceRegex = "#\\d+".r
private val normalizeRegex = "#\\d+L?".r
private val planIdRegex = "plan_id=\\d+".r
private val exprIdRegexp = "(?<prefix>(?<!id=)#)\\d+L?".r
Contributor

I'm not an expert on regex. Can you explain the fix here in a bit more detail? So the test suite does normalize the expr IDs but misses some places?

Contributor Author

@peter-toth peter-toth Jul 17, 2025

The (?<!id=) part is a negative lookbehind, so the regex does not match id=#123 but still matches every other #123, such as expr ids. id=#123 like ids are actually plan ids in SubqueryExec nodes.

The (?<prefix>...) named capture group just simplifies the replacement.
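
To illustrate the lookbehind, here is a minimal standalone sketch that applies the exprIdRegexp pattern from the diff to a made-up plan fragment. The object name, the sample string, and the `x` placeholder are assumptions for illustration only; none of them come from the suite.

```scala
object ExprIdRegexDemo {
  // Pattern copied from the diff under review.
  val exprIdRegexp = "(?<prefix>(?<!id=)#)\\d+L?".r

  // Hypothetical plan fragment: `scalar-subquery#7` and `sum#12L` carry
  // expression ids, while `id=#12` stands for a plan id.
  val sample = "Subquery scalar-subquery#7, [id=#12] sum#12L"

  // The lookbehind rejects any `#` that is directly preceded by `id=`.
  val matched: List[String] = exprIdRegexp.findAllIn(sample).toList

  // `${prefix}` refers to the named group, so the `#` is kept and only the
  // digits (and an optional trailing `L`) are replaced.
  val masked: String = exprIdRegexp.replaceAllIn(sample, "${prefix}x")

  def main(args: Array[String]): Unit = {
    println(matched) // List(#7, #12L)
    println(masked)  // Subquery scalar-subquery#x, [id=#12] sum#x
  }
}
```

Note that `id=#12` is left untouched even though the same number also appears as an expression id in `sum#12L`.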

Contributor Author

@peter-toth peter-toth Jul 17, 2025

The problem in some subset test runs was that the same id showed up both as a plan id and as an expression id, so the previous exprIdRegexp replaced all occurrences with the same normalized id.

In the golden run, however, the ids were different, so exprIdRegexp and planIdRegex replaced them with two different normalized ids.
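
The collision can be reproduced in isolation with a simplified renumbering helper. This is only a sketch under stated assumptions: `renumber`, `normalize`, the `planIdRegex` pattern used here, and both sample strings are hypothetical and are not the suite's actual implementation. Only the two expression id patterns are taken from the diff.

```scala
import scala.collection.mutable
import scala.util.matching.Regex

object IdCollisionDemo {
  val oldExprIdRegex: Regex = "#\\d+L?".r                    // pre-fix pattern from the diff
  val newExprIdRegex: Regex = "(?<prefix>(?<!id=)#)\\d+L?".r // post-fix pattern from the diff
  val planIdRegex: Regex = "id=#\\d+".r                      // hypothetical, for this sketch only

  // Replace each distinct match with `prefix` plus a counter assigned in
  // first-seen order, so equal raw ids map to equal normalized ids.
  def renumber(plan: String, regex: Regex, prefix: String): String = {
    val seen = mutable.LinkedHashMap.empty[String, Int]
    regex.replaceAllIn(plan, m => prefix + seen.getOrElseUpdate(m.matched, seen.size + 1))
  }

  // Normalize expression ids first, then plan ids.
  def normalize(plan: String, exprIdRegex: Regex): String =
    renumber(renumber(plan, exprIdRegex, "#"), planIdRegex, "id=#")

  // Hypothetical fragments: same plan shape, but in the subset run the
  // plan id happens to equal the expression id.
  val goldenRun = "Subquery [id=#40] max#7"
  val subsetRun = "Subquery [id=#7] max#7"

  def main(args: Array[String]): Unit = {
    println(normalize(goldenRun, oldExprIdRegex)) // Subquery [id=#1] max#2
    println(normalize(subsetRun, oldExprIdRegex)) // Subquery [id=#1] max#1
    println(normalize(goldenRun, newExprIdRegex)) // Subquery [id=#1] max#1
    println(normalize(subsetRun, newExprIdRegex)) // Subquery [id=#1] max#1
  }
}
```

With the old pattern the two runs normalize differently, which is the kind of mismatch against the golden file described above. With the lookbehind in place, plan ids and expression ids are numbered independently and both runs agree.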

Contributor

id=#123 like ids are actually plan ids in SubqueryExec nodes.

Ah, I didn't know that. Can we leave a code comment to explain it?

Contributor Author

Sure, added in 7e1cbcf.

3 participants