@manuzhang
Member

No description provided.

@manuzhang
Member Author

manuzhang commented Sep 23, 2025

Created apache/spark#52423 on the Spark side to fix the tests that fail with Spark 4.1.0-preview1.

2025-09-22T16:49:13.6462602Z TestCreateActions > testAddColumnOnMigratedTableAtEnd() > catalogName = spark_catalog, implementation = org.apache.iceberg.spark.SparkSessionCatalog, config = {type=hive, default-namespace=default, parquet-enabled=true, cache-enabled=false}, type = hive FAILED
2025-09-22T16:49:13.6466422Z     org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 431.0 failed 1 times, most recent failure: Lost task 0.0 in stage 431.0 (TID 3270) (localhost executor driver): java.lang.NullPointerException: Cannot invoke "org.apache.spark.sql.types.DataType.transformRecursively(scala.PartialFunction)" because "type" is null
2025-09-22T16:49:13.6469913Z     	at org.apache.spark.sql.vectorized.ColumnVector.<init>(ColumnVector.java:341)
2025-09-22T16:49:13.6471238Z     	at org.apache.iceberg.spark.data.vectorized.ConstantColumnVector.<init>(ConstantColumnVector.java:41)

dongjoon-hyun pushed a commit to apache/spark that referenced this pull request Sep 24, 2025
…ith null DataType

### What changes were proposed in this pull request?
Check whether the `DataType` parameter is null in the `ColumnVector` constructor before transforming it.

### Why are the changes needed?
A subclass of `ColumnVector`, e.g. Iceberg's [ConstantColumnVector](https://github.com/apache/iceberg/blob/main/spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ConstantColumnVector.java#L41), can be created with a null `DataType`. Since #51349, the constructor throws a `NullPointerException` in that case, as shown by the failed tests in [integrating Spark 4.1.0-preview1 in Iceberg](apache/iceberg#14155).
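
To illustrate the failure mode and the null check described above, here is a minimal, self-contained sketch. It does not use Spark's classes: `SimpleType`, `SketchVector`, `ConstantVector`, `unguardedResolve`, and `guardedResolve` are hypothetical stand-ins invented for this example, and the actual change in the Spark PR may look different.

```java
import java.util.function.UnaryOperator;

public class NullSafeColumnVectorSketch {

    // Hypothetical stand-in for Spark's DataType; not the real class.
    static final class SimpleType {
        final String name;

        SimpleType(String name) {
            this.name = name;
        }

        // Stand-in for a transformation applied to the type in the constructor.
        SimpleType transform(UnaryOperator<SimpleType> rule) {
            return rule.apply(this);
        }
    }

    // Unguarded: dereferences the type directly, so a null argument throws an NPE.
    static SimpleType unguardedResolve(SimpleType type) {
        return type.transform(UnaryOperator.identity());
    }

    // Guarded: checks for null first and passes it through untouched.
    static SimpleType guardedResolve(SimpleType type) {
        return type == null ? null : type.transform(UnaryOperator.identity());
    }

    // Hypothetical base class whose constructor applies the guarded transformation.
    static abstract class SketchVector {
        protected final SimpleType type;

        protected SketchVector(SimpleType type) {
            this.type = guardedResolve(type);
        }

        final SimpleType dataType() {
            return type;
        }
    }

    // Hypothetical subclass that supplies a null type to the base constructor.
    static final class ConstantVector extends SketchVector {
        ConstantVector() {
            super(null);
        }
    }

    public static void main(String[] args) {
        try {
            unguardedResolve(null);
        } catch (NullPointerException e) {
            System.out.println("unguarded: NullPointerException");
        }
        System.out.println("guarded: type = " + new ConstantVector().dataType());
    }
}
```

With the guard in place, constructing the subclass with a null type succeeds and the type simply stays null, which is the behavior the subclass relied on before the transformation was introduced.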

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
UT.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #52423 from manuzhang/SPARK-53678.

Authored-by: manuzhang <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
@manuzhang manuzhang changed the title Spark: Add support for Spark 4.1 (4.1.0-preview1) Spark: Add support for Spark 4.1 Sep 24, 2025
@manuzhang manuzhang force-pushed the spark4.1-preview branch 5 times, most recently from 7df730b to 030d9e8 Compare September 26, 2025 02:05
@manuzhang manuzhang force-pushed the spark4.1-preview branch 2 times, most recently from 6efbecc to cac12c7 Compare September 30, 2025 06:49
@manuzhang manuzhang force-pushed the spark4.1-preview branch 2 times, most recently from 4388c21 to c3f856b Compare October 27, 2025 08:47