pan3793 (Member) commented Dec 2, 2025

What changes were proposed in this pull request?

Change SparkBuildInfo to load spark-version-info.properties with its own classloader instead of the thread context classloader.
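The idea can be sketched in Java as follows. This is only an illustration of the pattern, not the actual patch: the class name BuildInfoSketch and the method loadWithOwnClassLoader are hypothetical, and the real SparkBuildInfo is a Scala object whose code is not shown in this PR description.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical helper, for illustration only; not part of Spark.
final class BuildInfoSketch {
    private BuildInfoSketch() {}

    /**
     * Loads a properties resource through the classloader that defined `owner`,
     * instead of Thread.currentThread().getContextClassLoader().
     */
    static Properties loadWithOwnClassLoader(Class<?> owner, String resourceName)
            throws IOException {
        ClassLoader loader = owner.getClassLoader();
        // getClassLoader() returns null for classes defined by the bootstrap loader,
        // so fall back to the system classloader lookup in that case.
        InputStream in = (loader != null)
                ? loader.getResourceAsStream(resourceName)
                : ClassLoader.getSystemResourceAsStream(resourceName);
        if (in == null) {
            throw new IllegalStateException("Could not find " + resourceName);
        }
        try (InputStream stream = in) {
            Properties props = new Properties();
            props.load(stream);
            return props;
        }
    }
}
```

Because the resource ships in the same jar as the class, the loader that defined the class should always be able to see it, regardless of what the calling thread has set as its context classloader.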

Why are the changes needed?

I hit an issue while integrating the Connect JDBC driver with JetBrains DataGrip.

2025-11-25 18:48:09,475 [  55114]   WARN - #c.i.d.d.BaseDatabaseErrorHandler$MissingDriverClassErrorInfo - Exception org.apache.spark.SparkException: Could not find spark-version-info.properties [in thread "RMI TCP Connection(3)-127.0.0.1"]
java.lang.ExceptionInInitializerError: Exception org.apache.spark.SparkException: Could not find spark-version-info.properties [in thread "RMI TCP Connection(3)-127.0.0.1"]
	at org.apache.spark.SparkBuildInfo$.<clinit>(SparkBuildInfo.scala:35)
	at org.apache.spark.sql.connect.client.SparkConnectClient$.org$apache$spark$sql$connect$client$SparkConnectClient$$genUserAgent(SparkConnectClient.scala:978)
	at org.apache.spark.sql.connect.client.SparkConnectClient$Configuration$.apply$default$8(SparkConnectClient.scala:999)
	at org.apache.spark.sql.connect.client.SparkConnectClient$Builder.<init>(SparkConnectClient.scala:683)
	at org.apache.spark.sql.connect.client.SparkConnectClient$.builder(SparkConnectClient.scala:676)
	at org.apache.spark.sql.connect.client.jdbc.SparkConnectConnection.<init>(SparkConnectConnection.scala:31)
	at org.apache.spark.sql.connect.client.jdbc.NonRegisteringSparkConnectDriver.connect(NonRegisteringSparkConnectDriver.scala:36)
	at com.intellij.database.remote.jdbc.helpers.JdbcHelperImpl.connect(JdbcHelperImpl.java:786)
	at com.intellij.database.remote.jdbc.impl.RemoteDriverImpl.connect(RemoteDriverImpl.java:47)

After adding some debug messages, I found it was caused by using the wrong classloader: the thread context classloader is not the classloader that loaded SparkBuildInfo.

c.i.e.r.RemoteProcessSupport - ContextClassLoader: com.intellij.database.remote.jdbc.impl.JdbcClassLoader$1@559cc356
c.i.e.r.RemoteProcessSupport - SparkBuildInfo ClassLoader: com.intellij.database.remote.jdbc.impl.JdbcClassLoader$JdbcClassLoaderImpl@62e93ea8
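The mismatch shown in these two log lines can be reproduced in isolation. The following is a minimal sketch under stated assumptions: ClassLoaderMismatchDemo, demo-version-info.properties, and the two URLClassLoader instances are hypothetical stand-ins for DataGrip's JdbcClassLoader setup, which is not reproduced here.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo, for illustration only.
final class ClassLoaderMismatchDemo {
    static final String RESOURCE = "demo-version-info.properties";

    /** Returns {foundViaContextLoader, foundViaLibraryLoader}. */
    static boolean[] run() throws IOException {
        // A temp directory stands in for the jar that holds the library and its resource.
        Path dir = Files.createTempDirectory("demo-lib");
        Path file = dir.resolve(RESOURCE);
        Files.write(file, "version=0.0.0-demo\n".getBytes(StandardCharsets.UTF_8));

        Thread thread = Thread.currentThread();
        ClassLoader original = thread.getContextClassLoader();
        try (URLClassLoader libraryLoader =
                     new URLClassLoader(new URL[] {dir.toUri().toURL()}, null);
             URLClassLoader unrelatedLoader = new URLClassLoader(new URL[0], null)) {
            // Simulate a host application that installs a context classloader
            // which cannot see the library's resources.
            thread.setContextClassLoader(unrelatedLoader);
            boolean viaContext = thread.getContextClassLoader().getResource(RESOURCE) != null;
            boolean viaLibrary = libraryLoader.getResource(RESOURCE) != null;
            return new boolean[] {viaContext, viaLibrary};
        } finally {
            // Restore the original context classloader and clean up the temp files.
            thread.setContextClassLoader(original);
            Files.deleteIfExists(file);
            Files.deleteIfExists(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] result = run();
        System.out.println("context loader finds resource: " + result[0]);
        System.out.println("library loader finds resource: " + result[1]);
    }
}
```

In this sketch the lookup through the context classloader fails while the lookup through the loader that owns the resource succeeds, which is the same shape as the failure in the stack trace above.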

A similar issue affecting the Hive JDBC driver and Spark's isolated classloader (see SPARK-32256) was fixed by HADOOP-14067.

Does this PR introduce any user-facing change?

This fixes corner cases where an application loads Spark libraries through multiple classloaders.

How was this patch tested?

GHA passes, which ensures the change breaks nothing. I also manually verified the Connect JDBC driver & JetBrains DataGrip integration.

Was this patch authored or co-authored using generative AI tooling?

No.

pan3793 (Member, Author) commented Dec 2, 2025

cc @zsxwing as you worked on SPARK-32256

pan3793 (Member, Author) commented Dec 2, 2025

also cc @LuciferYang and @dongjoon-hyun as this bug affects Connect JDBC driver use cases

dongjoon-hyun (Member) commented

Thank you for pinging me, @pan3793 . BTW, given the affected versions, I don't think this is a blocker-level issue.

pan3793 (Member, Author) commented Dec 2, 2025

@dongjoon-hyun, yes, I classify it as a regular bug, not a blocker
