@pan3793 pan3793 commented Sep 30, 2024

Spark 3.4.x is EOL. Spark 3.5.0 was released on September 13th, 2023, and will be maintained for 31 months, until April 12th, 2026.

https://spark.apache.org/versioning-policy.html

rm -rf /var/lib/apt/lists/* ; \
update-ca-certificates -f ; \
JAVA_8=`update-java-alternatives --list | grep java-1.8.0-openjdk | awk '{print $NF}'` ; \
update-java-alternatives --set $JAVA_8 ; \

Use update-java-alternatives so that the other JDK commands, such as javac and jstack, are switched as well.

ln -snf /usr/local/spark-${APACHE_SPARK_VERSION}-bin-${APACHE_SPARK_CUSTOM_NAME} /usr/local/spark
RUN if [ "$SCALA_VERSION" = "2.13" ]; then APACHE_SPARK_CUSTOM_NAME=hadoop3-scala2.13; else APACHE_SPARK_CUSTOM_NAME=hadoop3; fi && \
SPARK_TGZ_NAME=spark-${APACHE_SPARK_VERSION}-bin-${APACHE_SPARK_CUSTOM_NAME} && \
if [ ! -d "/usr/local/$SPARK_TGZ_NAME" ]; then \

Install the requested version of Spark only when the base image does not already include it.
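
The check can be sketched as two standalone shell functions. This is a minimal sketch, not the actual Dockerfile step: the function names `spark_dist_name` and `spark_needs_install` and the `prefix` argument are hypothetical, while the naming scheme and the directory test follow the diff.

```shell
# Hypothetical helper: derive the distribution directory name the same way
# the RUN step in the diff does (Scala 2.13 builds carry an extra suffix).
spark_dist_name() {
    spark_version="$1"
    scala_version="$2"
    if [ "$scala_version" = "2.13" ]; then
        custom_name="hadoop3-scala2.13"
    else
        custom_name="hadoop3"
    fi
    echo "spark-${spark_version}-bin-${custom_name}"
}

# Hypothetical helper: succeed (exit status 0) only when the distribution
# directory is missing under the given prefix, i.e. a download is needed.
spark_needs_install() {
    prefix="$1"
    dist_name="$2"
    [ ! -d "${prefix}/${dist_name}" ]
}
```

For example, `spark_needs_install /usr/local "$(spark_dist_name 3.5.0 2.13)"` succeeds only on a base image that has no `/usr/local/spark-3.5.0-bin-hadoop3-scala2.13` directory.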

@pan3793 pan3793 marked this pull request as ready for review October 7, 2025 02:31
@pan3793 pan3793 marked this pull request as draft October 7, 2025 03:14
val asm = "org.ow2.asm" % "asm" % asmVersion // Apache v2
val asmCommons = "org.ow2.asm" % "asm-commons" % asmVersion // Apache v2
val asmUtil = "org.ow2.asm" % "asm-util" % asmVersion // Apache v2
val clapper = "org.clapper" %% "classutil" % "1.5.1" // Apache v2, used for detecting plugins

https://github.com/bmc/classutil

Version 1.5.0 and on are licensed under the Apache License, version 2.0.

* @return The new class finder
*/
protected def newClassFinder(): ClassFinder = ClassFinder(classpath)
protected def newClassFinder(): ClassFinder = ClassFinder(classpath, Some(Opcodes.ASM9))

@pan3793 pan3793 Oct 7, 2025

Some modern libraries, such as Jackson, ship multi-release JARs: the main classes are built against Java 8, but the JAR also contains optional classes compiled for newer JDK versions. We must use a newer ASM library so that ClassFinder can read them.

See also bmc/classutil#45.
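
In a multi-release JAR, the version-specific classes are stored under `META-INF/versions/<N>/`, so a scanner that walks every entry in the archive will also encounter them. A minimal sketch for spotting such entries in a JAR listing; the function name `versioned_entries` is hypothetical, and `some-library.jar` in the usage note is a placeholder rather than a real artifact.

```shell
# Hypothetical helper: read JAR entry names on stdin (one per line) and print
# only the class files stored under META-INF/versions/<N>/, which is where a
# multi-release JAR keeps classes intended for Java N and newer.
versioned_entries() {
    grep -E '^META-INF/versions/[0-9]+/.*\.class$'
}
```

It could be used as `jar tf some-library.jar | versioned_entries`; an empty result suggests the JAR has no version-specific classes.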

val pekkoTestkit = "org.apache.pekko" %% "pekko-testkit" % pekkoVersion // Apache v2

val clapper = "org.clapper" %% "classutil" % "1.5.1" // BSD 3-clause license, used for detecting plugins
val asmVersion = "9.9"

@pan3793 pan3793 Oct 7, 2025

Upgrade the ASM libraries used by org.clapper:classutil to fix the following failure:

25/10/07 03:10:47 WARN Main$$anon$1: No external magics provided to PluginManager!
Exception in thread "main" java.lang.IllegalArgumentException: Unsupported class file major version 61
	at shadeasm.org.objectweb.asm.ClassReader.<init>(ClassReader.java:195)
	at shadeasm.org.objectweb.asm.ClassReader.<init>(ClassReader.java:176)
	at shadeasm.org.objectweb.asm.ClassReader.<init>(ClassReader.java:162)
	at shadeasm.org.objectweb.asm.ClassReader.<init>(ClassReader.java:283)
	at shadeclapper.org.clapper.classutil.asm.ClassFile$.load(ClassFinderImpl.scala:222)
	at shadeclapper.org.clapper.classutil.ClassFinder.classData(ClassFinder.scala:404)
	at shadeclapper.org.clapper.classutil.ClassFinder.$anonfun$processOpenZip$2(ClassFinder.scala:359)
	at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
	at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
	at scala.collection.Iterator.toStream(Iterator.scala:1417)
	at scala.collection.Iterator.toStream$(Iterator.scala:1416)
	at scala.collection.AbstractIterator.toStream(Iterator.scala:1431)
	at scala.collection.Iterator.$anonfun$toStream$1(Iterator.scala:1417)
	at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1173)
	at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1163)
	at scala.collection.immutable.Stream.$anonfun$$plus$plus$1(Stream.scala:372)
	at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1173)
	at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1163)
	at scala.collection.immutable.StreamIterator.$anonfun$next$1(Stream.scala:1061)
	at scala.collection.immutable.StreamIterator$LazyCell.v$lzycompute(Stream.scala:1050)
	at scala.collection.immutable.StreamIterator$LazyCell.v(Stream.scala:1050)
	at scala.collection.immutable.StreamIterator.hasNext(Stream.scala:1055)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at scala.collection.Iterator.foreach(Iterator.scala:943)
	at scala.collection.Iterator.foreach$(Iterator.scala:943)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
	at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
	at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
	at scala.collection.immutable.Map$MapBuilderImpl.$plus$plus$eq(Map.scala:648)
	at scala.collection.immutable.Map$MapBuilderImpl.$plus$plus$eq(Map.scala:595)
	at scala.collection.TraversableOnce.toMap(TraversableOnce.scala:372)
	at scala.collection.TraversableOnce.toMap$(TraversableOnce.scala:370)
	at scala.collection.AbstractIterator.toMap(Iterator.scala:1431)
	at shadeclapper.org.clapper.classutil.ClassFinder$.classInfoMap(ClassFinder.scala:445)
	at org.apache.toree.plugins.PluginSearcher.loadClassMap(PluginSearcher.scala:80)
	at org.apache.toree.plugins.PluginSearcher.internalClassInfo$lzycompute(PluginSearcher.scala:36)
	at org.apache.toree.plugins.PluginSearcher.internalClassInfo(PluginSearcher.scala:35)
	at org.apache.toree.plugins.PluginSearcher.internal$lzycompute(PluginSearcher.scala:39)
	at org.apache.toree.plugins.PluginSearcher.internal(PluginSearcher.scala:39)
	at org.apache.toree.plugins.PluginManager.internalPlugins$lzycompute(PluginManager.scala:45)
	at org.apache.toree.plugins.PluginManager.internalPlugins(PluginManager.scala:44)
	at org.apache.toree.plugins.PluginManager.initialize(PluginManager.scala:80)
	at org.apache.toree.boot.layer.StandardComponentInitialization.initializePlugins(ComponentInitialization.scala:219)
	at org.apache.toree.boot.layer.StandardComponentInitialization.initializeComponents(ComponentInitialization.scala:83)
	at org.apache.toree.boot.layer.StandardComponentInitialization.initializeComponents$(ComponentInitialization.scala:69)
	at org.apache.toree.Main$$anon$1.initializeComponents(Main.scala:35)
	at org.apache.toree.boot.KernelBootstrap.initialize(KernelBootstrap.scala:102)
	at org.apache.toree.Main$.delayedEndpoint$org$apache$toree$Main$1(Main.scala:35)
	at org.apache.toree.Main$delayedInit$body.apply(Main.scala:24)
	at scala.Function0.apply$mcV$sp(Function0.scala:39)
	at scala.Function0.apply$mcV$sp$(Function0.scala:39)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
	at scala.App.$anonfun$main$1$adapted(App.scala:80)
	at scala.collection.immutable.List.foreach(List.scala:431)
	at scala.App.main(App.scala:80)
	at scala.App.main$(App.scala:78)
	at org.apache.toree.Main$.main(Main.scala:24)
	at org.apache.toree.Main.main(Main.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1034)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:199)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:222)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1125)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1134)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
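
The "major version 61" in the trace is the class file format version stored in the header of every `.class` file. Major version 61 corresponds to Java 17, while Java 8 produces major version 52. A minimal sketch for inspecting that field from the shell, assuming a POSIX `od`; both function names are hypothetical.

```shell
# Hypothetical helper: print the major version of a .class file.
# Header layout: bytes 0-3 are the magic number 0xCAFEBABE, bytes 4-5 the
# minor version, and bytes 6-7 the major version (big-endian).
class_major_version() {
    # od prints the two bytes as unsigned decimals; reuse the positional
    # parameters to split them into high and low byte.
    set -- $(od -An -j6 -N2 -tu1 "$1")
    echo $(( $1 * 256 + $2 ))
}

# Hypothetical helper: map a major version to the Java release that
# introduced it (major version = Java release + 44).
java_release_for_major() {
    echo $(( $1 - 44 ))
}
```

Running `class_major_version` on a class extracted from the offending JAR shows which entries are newer than the ASM version in use can parse.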

@pan3793 pan3793 marked this pull request as ready for review October 7, 2025 03:48
SPARK_TGZ_NAME=spark-${APACHE_SPARK_VERSION}-bin-${APACHE_SPARK_CUSTOM_NAME} && \
if [ ! -d "/usr/local/$SPARK_TGZ_NAME" ]; then \
cd /tmp ; \
wget -q https://www.apache.org/dyn/closer.lua/spark/spark-${APACHE_SPARK_VERSION}/${SPARK_TGZ_NAME}.tgz?action=download -O ${SPARK_TGZ_NAME}.tgz ; \

ASF Infra recommends closer.lua. It prefers to download the tgz from the dlcdn site and falls back to the archive site if the file is unavailable there.

https://infra.apache.org/release-download-pages.html#download-scripts
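
The URL shape used in the diff can be factored into a small function. This is a minimal sketch: `spark_download_url` is a hypothetical name, and the URL pattern is copied from the `wget` line in the diff.

```shell
# Hypothetical helper: build the closer.lua download URL for a Spark tgz.
# $1 is the Spark version, $2 the distribution name without the .tgz suffix.
spark_download_url() {
    spark_version="$1"
    tgz_name="$2"
    echo "https://www.apache.org/dyn/closer.lua/spark/spark-${spark_version}/${tgz_name}.tgz?action=download"
}
```

A possible call is `wget -q "$(spark_download_url "$APACHE_SPARK_VERSION" "$SPARK_TGZ_NAME")" -O "${SPARK_TGZ_NAME}.tgz"`. Quoting the URL keeps the shell from treating the `?` as a glob character.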
