Commit d50e9b7

harshmotw-db and cloud-fan authored and committed
[SPARK-54454][SQL] Enable variant shredding and variant logical type annotation configs by default
### What changes were proposed in this pull request?

This PR enables, by default, the annotation of variant Parquet data with the variant logical type, as well as shredded variant writes and reads.

### Why are the changes needed?

1. Annotating variant data with the variant logical type is required by the Parquet variant spec ([source](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#variant-in-parquet)), so this change is necessary to adhere to the spec.
2. Variant shredding brings significant performance optimizations over regular unshredded variants and should be the default mode.

### Does this PR introduce _any_ user-facing change?

Yes. Variant data written by Spark will be annotated with the variant logical type, and variant shredding will be enabled by default.

### How was this patch tested?

Existing tests.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #53164 from harshmotw-db/harshmotw-db/enable_variant_shredding.

Lead-authored-by: Harsh Motwani <[email protected]>
Co-authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 3a06297)
Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent 8ee56f1 commit d50e9b7

File tree

1 file changed: +4 −4 lines changed
  • sql/catalyst/src/main/scala/org/apache/spark/sql/internal


sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

Lines changed: 4 additions & 4 deletions
```diff
@@ -1598,7 +1598,7 @@ object SQLConf {
         "variant logical type.")
       .version("4.1.0")
       .booleanConf
-      .createWithDefault(false)
+      .createWithDefault(true)

   val PARQUET_IGNORE_VARIANT_ANNOTATION =
     buildConf("spark.sql.parquet.ignoreVariantAnnotation")
@@ -5526,15 +5526,15 @@ object SQLConf {
         "requested fields.")
       .version("4.0.0")
       .booleanConf
-      .createWithDefault(false)
+      .createWithDefault(true)

   val VARIANT_WRITE_SHREDDING_ENABLED =
     buildConf("spark.sql.variant.writeShredding.enabled")
       .internal()
       .doc("When true, the Parquet writer is allowed to write shredded variant. ")
       .version("4.0.0")
       .booleanConf
-      .createWithDefault(false)
+      .createWithDefault(true)

   val VARIANT_FORCE_SHREDDING_SCHEMA_FOR_TEST =
     buildConf("spark.sql.variant.forceShreddingSchemaForTest")
@@ -5567,7 +5567,7 @@ object SQLConf {
       .doc("Infer shredding schema when writing Variant columns in Parquet tables.")
       .version("4.1.0")
       .booleanConf
-      .createWithDefault(false)
+      .createWithDefault(true)

   val LEGACY_CSV_ENABLE_DATE_TIME_PARSING_FALLBACK =
     buildConf("spark.sql.legacy.csv.enableDateTimeParsingFallback")
```
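Since these flags now default to `true`, users who need the previous behavior (e.g. writing Parquet that must be readable by tools that do not yet understand shredded variant layouts) can flip the relevant config back per session. A minimal sketch: of the four configs whose defaults changed, only `spark.sql.variant.writeShredding.enabled` has its full name visible in the diff above, so it is the only one shown here.

```sql
-- Revert shredded variant writes to the pre-SPARK-54454 default for this
-- session only; the table-level data already written is unaffected.
SET spark.sql.variant.writeShredding.enabled = false;
```

Note that this config is marked `.internal()` in `SQLConf.scala`, so it is intended for advanced use and will not appear in the documented configuration list.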
