
Commit e00cd4e

vinodkc authored and dongjoon-hyun committed
[SPARK-54206][CONNECT][FOLLOWUP] Use VARBINARY type and reasonable max length for BinaryType
### What changes were proposed in this pull request?
This PR improves the JDBC type mapping for BinaryType in the Spark Connect JDBC client.

### Why are the changes needed?
- **Semantic correctness**: Types.VARBINARY (variable-length) better matches Spark's BinaryType semantics.
- **Industry alignment**: The SQL Server dialect already uses VARBINARY(MAX) for BinaryType, the Trino JDBC driver uses VARBINARY with a maximum of 1 GB, and the MariaDB JDBC driver uses VARBINARY/LONGVARBINARY for blob types.

### Does this PR introduce _any_ user-facing change?
Yes, but with minimal impact: both BINARY and VARBINARY map to byte array types, and the precision change stays within reasonable bounds.

### How was this patch tested?
Existing tests: all tests in `SparkConnectJdbcDataTypeSuite` pass.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #53252 from vinodkc/br_SPARK-54206_followup_fix.

Authored-by: vinodkc <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 87a8b56)
Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent d50e9b7 commit e00cd4e
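For context, a minimal sketch of how this mapping is expected to surface to a JDBC client. The connection URL and object name here are hypothetical placeholders; only the two metadata assertions are taken from the test changes in the diff below.

```scala
import java.sql.{DriverManager, Types}

// Minimal sketch, assuming the Spark Connect JDBC driver is on the classpath and a
// server is reachable at this hypothetical URL (adjust host/port for your setup).
object VarbinaryMappingSketch {
  def main(args: Array[String]): Unit = {
    val conn = DriverManager.getConnection("jdbc:sc://localhost:15002")
    try {
      val rs = conn.createStatement().executeQuery("SELECT cast('abc' AS binary) AS bin_col")
      val md = rs.getMetaData
      // After this change, BinaryType columns report VARBINARY instead of BINARY...
      assert(md.getColumnType(1) == Types.VARBINARY)
      // ...while values are still read back as Java byte arrays.
      assert(md.getColumnClassName(1) == "[B")
    } finally {
      conn.close()
    }
  }
}
```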

File tree: 2 files changed, +6 -6 lines changed


sql/connect/client/jdbc/src/main/scala/org/apache/spark/sql/connect/client/jdbc/util/JdbcTypeUtils.scala

Lines changed: 2 additions & 2 deletions

```diff
@@ -39,7 +39,7 @@ private[jdbc] object JdbcTypeUtils {
     case DateType => Types.DATE
     case TimestampType => Types.TIMESTAMP
     case TimestampNTZType => Types.TIMESTAMP
-    case BinaryType => Types.BINARY
+    case BinaryType => Types.VARBINARY
     case _: TimeType => Types.TIME
     case other =>
       throw new SQLFeatureNotSupportedException(s"DataType $other is not supported yet.")
@@ -83,7 +83,7 @@ private[jdbc] object JdbcTypeUtils {
     case LongType => 19
     case FloatType => 7
     case DoubleType => 15
-    case StringType => 255
+    case StringType => Int.MaxValue
     case DecimalType.Fixed(p, _) => p
     case DateType => 10
     case TimestampType => 29
```
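The second hunk widens the reported precision for StringType from 255 to Int.MaxValue. A REPL-style sketch of what a client would now observe (again with a hypothetical connection URL):

```scala
import java.sql.DriverManager

val conn = DriverManager.getConnection("jdbc:sc://localhost:15002") // hypothetical URL
val rs = conn.createStatement().executeQuery("SELECT 'hello' AS s")
val md = rs.getMetaData
// StringType precision and display size are now reported as Int.MaxValue (2147483647)
// rather than the previous fixed value of 255.
println(md.getPrecision(1))
println(md.getColumnDisplaySize(1))
conn.close()
```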

sql/connect/client/jdbc/src/test/scala/org/apache/spark/sql/connect/client/jdbc/SparkConnectJdbcDataTypeSuite.scala

Lines changed: 4 additions & 4 deletions

```diff
@@ -216,9 +216,9 @@ class SparkConnectJdbcDataTypeSuite extends ConnectFunSuite with RemoteSparkSession
       assert(metaData.getColumnTypeName(1) === "STRING")
       assert(metaData.getColumnClassName(1) === "java.lang.String")
       assert(metaData.isSigned(1) === false)
-      assert(metaData.getPrecision(1) === 255)
+      assert(metaData.getPrecision(1) === Int.MaxValue)
       assert(metaData.getScale(1) === 0)
-      assert(metaData.getColumnDisplaySize(1) === 255)
+      assert(metaData.getColumnDisplaySize(1) === Int.MaxValue)
     }
   }

@@ -389,7 +389,7 @@ class SparkConnectJdbcDataTypeSuite extends ConnectFunSuite with RemoteSparkSession

       val metaData = rs.getMetaData
       assert(metaData.getColumnCount === 1)
-      assert(metaData.getColumnType(1) === Types.BINARY)
+      assert(metaData.getColumnType(1) === Types.VARBINARY)
       assert(metaData.getColumnTypeName(1) === "BINARY")
       assert(metaData.getColumnClassName(1) === "[B")
       assert(metaData.isSigned(1) === false)
@@ -405,7 +405,7 @@ class SparkConnectJdbcDataTypeSuite extends ConnectFunSuite with RemoteSparkSession

       val metaData = rs.getMetaData
       assert(metaData.getColumnCount === 1)
-      assert(metaData.getColumnType(1) === Types.BINARY)
+      assert(metaData.getColumnType(1) === Types.VARBINARY)
       assert(metaData.getColumnTypeName(1) === "BINARY")
       assert(metaData.getColumnClassName(1) === "[B")
     }
```
