[SPARK-54560][SQL] Safe type casting in `QueryPlan._subqueries` #53272

liuzqt · 2025-12-02T00:35:24Z

What changes were proposed in this pull request?

Safe type casting in QueryPlan._subqueries

Why simply removing `e.plan.asInstanceOf[PlanType]` is not enough

Consider the `foreachWithSubqueries` API below:

  def foreachWithSubqueries(f: PlanType => Unit): Unit = {
    def actualFunc(plan: PlanType): Unit = {
      f(plan)
      plan.subqueries.foreach(_.foreachWithSubqueries(f))
    }
    foreach(actualFunc)
  }

PlanExpression’s type parameter is erased at runtime. When we matched on

case planExpression: PlanExpression[PlanType @unchecked] =>
  planExpression.plan

the JVM only checked that the object was a PlanExpression; it did not verify that its plan was really a PlanType (e.g. SparkPlan). Because of @unchecked, the compiler suppressed the unchecked-type warning and happily treated the result as a PlanType. Later, when foreachWithSubqueries invokes the lambda f: SparkPlan => Unit, the compiler-inserted cast to SparkPlan runs, and we still end up with a ClassCastException if the actual object is a logical plan.
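The erasure behavior described above can be reproduced outside Spark. The following is a minimal sketch, not Spark code: `Plan`, `Logical`, `Physical`, `Holder`, and `extract` are hypothetical stand-ins for the plan hierarchy, `PlanExpression`, and the collection logic.

```scala
// Hypothetical stand-ins for Spark's plan hierarchy; these are not the real classes.
sealed trait Plan
final case class Logical(name: String) extends Plan
final case class Physical(name: String) extends Plan

// Stand-in for PlanExpression[T]: the type parameter T is erased at runtime.
final case class Holder[T <: Plan](plan: T)

object ErasureDemo {
  // Mirrors the unchecked pattern: at runtime only the Holder class is tested,
  // never the type of the wrapped plan.
  def extract[T <: Plan](exprs: Seq[Any]): Seq[T] = exprs.collect {
    case h: Holder[T @unchecked] => h.plan
  }

  def main(args: Array[String]): Unit = {
    val exprs: Seq[Any] = Seq(Holder(Physical("scan")), Holder(Logical("project")))

    // No exception here: both holders come back, statically typed as Physical.
    val plans: Seq[Physical] = extract[Physical](exprs)
    println(plans.size)

    // The cast only runs where an element is actually used as a Physical.
    val f: Physical => Unit = p => println(p.name)
    try plans.foreach(f)
    catch { case _: ClassCastException => println("ClassCastException at use site") }
  }
}
```

The point of the sketch is that the failure surfaces far from the pattern match, at whichever call site first treats the element as the expected type.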

Why are the changes needed?

QueryPlan._subqueries is dangerous because it forces a type cast (code pointer)

Imagine a SparkPlan instance invoking this API while some of its subqueries are still LogicalPlans. This can happen in AQE, where logical-to-physical planning happens separately for the main query and its subqueries, so the two can be out of sync at a given point.
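One way to guard against this mismatch is to check the runtime class of each plan before casting. The diff is not fully reproduced here, so the following is only a sketch under assumptions and not necessarily what this PR implements: `Node`, `LogicalNode`, `PhysicalNode`, `SubqueryExpr`, and `safeExtract` are hypothetical names, and a `ClassTag` is assumed to be available for the expected plan type.

```scala
import scala.reflect.ClassTag

// Hypothetical stand-ins; these are not Spark's classes.
sealed trait Node
final case class LogicalNode(name: String) extends Node
final case class PhysicalNode(name: String) extends Node
final case class SubqueryExpr[T <: Node](plan: T)

object SafeSubqueries {
  // Keeps only plans whose runtime class matches T; mismatches are dropped
  // instead of being cast blindly. The ClassTag is an assumption of this sketch.
  def safeExtract[T <: Node](exprs: Seq[Any])(implicit ct: ClassTag[T]): Seq[T] =
    exprs.collect {
      case e: SubqueryExpr[_] if ct.runtimeClass.isInstance(e.plan) =>
        e.plan.asInstanceOf[T] // safe: guarded by the runtime check above
    }
}
```

With this shape, a physical plan whose subqueries are all still logical gets an empty sequence back rather than a deferred ClassCastException.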

Reasoning why we don't need a new API
Although this API is on the critical path of Spark SQL as a whole, there is no need to create a separate API: if we hit this class cast error, the whole query simply fails, so it is always better to fix the issue than to preserve the buggy failure behavior.

Does this PR introduce any user-facing change?

NO

How was this patch tested?

Existing UTs.

Was this patch authored or co-authored using generative AI tooling?

NO

VindhyaG · 2025-12-02T05:09:54Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala

  private val _subqueries = new TransientBestEffortLazyVal(() =>
-    expressions.filter(_.containsPattern(PLAN_EXPRESSION)).flatMap(_.collect {
-      case e: PlanExpression[_] => e.plan.asInstanceOf[PlanType]
-    })


As far as I understand, the earlier logic never returned an empty result when there were subqueries, regardless of the plan type. Now there is a chance that an empty sequence is returned, right? If so, shouldn't we have UTs for that?

fix

2609083

github-actions bot added the SQL label Dec 2, 2025

update

30a40d5

VindhyaG reviewed Dec 2, 2025
