My classpath is missing the Serializable and Cloneable classes, and I am not sure how to fix this.
I have an sbt application whose build file looks like this:
name := "realtime-spark-streaming"

version := "0.1"

resolvers += "confluent" at "https://packages.confluent.io/maven/"
resolvers += "Public Maven Repository" at "https://repository.com/content/repositories/pangaea_releases"

val sparkVersion = "3.2.0"

// https://mvnrepository.com/artifact/org.apache.spark/spark-core
libraryDependencies += "org.apache.spark" %% "spark-core" % "3.2.0"
// https://mvnrepository.com/artifact/org.apache.spark/spark-streaming
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "3.2.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.2.0"
libraryDependencies += "com.walmart.grcaml" % "us-aml-commons" % "latest.release"
libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion
//libraryDependencies += "org.apache.spark" % "spark-streaming-kafka-0-10_2.11" % "3.2.0" % "2.1.3"
//libraryDependencies += "org.slf4j" % "slf4j-simple" % "1.7.12"
// https://mvnrepository.com/artifact/org.apache.kafka/kafka
libraryDependencies += "org.apache.kafka" %% "kafka" % "6.1.0-ccs"

resolvers += Resolver.mavenLocal

scalaVersion := "2.13.6"
When I do an sbt build, I get:
Symbol 'type scala.package.Serializable' is missing from the classpath. This symbol is required by 'class org.apache.spark.sql.SparkSession'. Make sure that type Serializable is in your classpath and check for conflicting dependencies with `-Ylog-classpath`. A full rebuild may help if 'SparkSession.class' was compiled against an incompatible version of scala.package.
  import org.apache.spark.sql.{DataFrame, SparkSession}

Symbol 'type scala.package.Serializable' is missing from the classpath. This symbol is required by 'class org.apache.spark.sql.Dataset'. Make sure that type Serializable is in your classpath and check for conflicting dependencies with `-Ylog-classpath`. A full rebuild may help if 'Dataset.class' was compiled against an incompatible version of scala.package.
  def extractData(spark: SparkSession, configDetails: ReadProperties, pcSql: String, query: String): DataFrame = {
My dependency tree only shows jars, but this looks like a class/package conflict or a missing class/package.
Answer
You’re using an incompatible Scala version (2.13.6). From the Spark documentation:
Spark runs on Java 8/11, Scala 2.12, Python 3.6+ and R 3.5+. Python 3.6 support is deprecated as of Spark 3.2.0. Java 8 prior to version 8u201 support is deprecated as of Spark 3.2.0. For the Scala API, Spark 3.2.0 uses Scala 2.12. You will need to use a compatible Scala version (2.12.x).
If you use a Scala version from the 2.12.x family, you should be fine.
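As a minimal sketch of the change (assuming the rest of your build.sbt stays as posted, and using 2.12.15 purely as an example 2.12.x release):

// Example only: any 2.12.x release should work; 2.12.15 is an assumed choice here.
scalaVersion := "2.12.15"

// Because the Spark dependencies are declared with %%, sbt appends the Scala
// binary version automatically, so they will resolve to the _2.12 artifacts
// once scalaVersion is changed:
libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion
libraryDependencies += "org.apache.spark" %% "spark-streaming" % sparkVersion
libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion

As the compiler message itself suggests, a full rebuild may be needed after changing the Scala version; running sbt clean before the next build ensures no classes compiled against 2.13 remain in the output directory.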