@asl3 (Contributor) commented Dec 2, 2025

What changes were proposed in this pull request?

Revert the existing implementation, which does not correctly handle sourceSide child nodes that lack the numOutputRows metric. The implementation will be re-targeted to a later Spark release.

Why are the changes needed?

The current implementation may pick up an incorrect numOutputRows metric if there is an intermediary node (such as a custom Spark operator) that does not support the metric.

This happens because we target the first sourceSide child node that has numOutputRows. If a SparkExtension node does not expose this metric but transforms the source table, the search can continue all the way down to the source table and pick up its metric, which no longer reflects the rows that node actually produced.
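The failure mode can be sketched with a toy model. This is not Spark code: `PlanNode`, `first_num_output_rows`, the node names, and the row counts are all hypothetical, chosen only to illustrate the "first descendant with numOutputRows" lookup described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PlanNode:
    """Hypothetical stand-in for a physical plan node with a single child."""
    name: str
    metrics: dict = field(default_factory=dict)
    child: Optional["PlanNode"] = None


def first_num_output_rows(node: Optional[PlanNode]) -> Optional[int]:
    """Walk down the chain and return numOutputRows from the first node that reports it."""
    while node is not None:
        if "numOutputRows" in node.metrics:
            return node.metrics["numOutputRows"]
        node = node.child
    return None


# The source scan reports 1000 rows (made-up number).
scan = PlanNode("SourceTableScan", {"numOutputRows": 1000})

# A custom operator that filters the source but reports no numOutputRows.
# Assume it actually emits far fewer rows than the scan.
custom_filter = PlanNode("CustomExtensionFilter", {}, child=scan)

# The lookup skips the custom operator and returns the scan's count.
print(first_num_output_rows(custom_filter))  # 1000
```

In this sketch the lookup returns the scan's 1000 rows even though the custom operator sits between the scan and the consumer, which is the kind of mismatch the revert is meant to avoid.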

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing CI, as this is a revert

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Dec 2, 2025
@szehon-ho (Member) left a comment

Thank you @asl3! Yes, I think it is better to be safe than to risk an incorrect metric if we find a node that does not have numOutputRows but changes the number of output rows.
