How to resolve java.lang.ClassNotFoundException: com.mongodb.spark.sql.DefaultSource.DefaultSource in PySpark (using PyCharm)




When I run my PySpark job in PyCharm, I get this error: java.lang.ClassNotFoundException: com.mongodb.spark.sql.DefaultSource.DefaultSource. How can I resolve it?

I tried:

spark = SparkSession.builder.config("spark.jars", "/Users/diwakarkumar/spark-2.4.6-bin-hadoop2.7/jars/").appName(
    "my_job").master("local[*]").getOrCreate()

I also tried setting the classpath for the jars in .bash_profile:

export CLASSPATH=~/my_jars/

I had many jars in my_jars but still didn’t get it to work. I keep getting the same error.

Answer

spark.jars expects a comma-separated list of jar files, not a directory path. List each jar explicitly:

spark = SparkSession.builder.config("spark.jars", "/Users/diwakarkumar/spark-2.4.6-bin-hadoop2.7/jars/jar1,/Users/diwakarkumar/spark-2.4.6-bin-hadoop2.7/jars/jar2").appName(
    "my_job").master("local[*]").getOrCreate()

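If the directory holds many jars, typing each path by hand gets tedious. Here is a minimal sketch that builds the comma-separated list from a directory. The helper names jars_in_dir and build_session are made up for illustration and are not part of PySpark; the sketch assumes the MongoDB connector jar and its dependencies are among the files in that directory.

```python
import glob
import os


def jars_in_dir(jar_dir):
    """Return a comma-separated string of every .jar file directly inside jar_dir."""
    pattern = os.path.join(os.path.expanduser(jar_dir), "*.jar")
    # Sorted so the resulting string is stable between runs.
    return ",".join(sorted(glob.glob(pattern)))


def build_session(jar_dir, app_name="my_job"):
    # Imported here so jars_in_dir can be used without pyspark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .config("spark.jars", jars_in_dir(jar_dir))
        .appName(app_name)
        .master("local[*]")
        .getOrCreate()
    )
```

Calling build_session("~/my_jars") would then pass every jar in that folder to spark.jars. Note that spark.jars only takes effect if it is set before the first SparkSession is created in the process.
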
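Alternatively, you can use the spark.jars.packages option, which takes Maven coordinates and lets Spark download the connector and its dependencies for you.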
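
A rough sketch of the package route follows. The coordinate used in the usage note is an assumption: it supposes the Scala 2.11 build that matches the prebuilt Spark 2.4.6 download and a 2.4.x connector release, so check Maven Central for the version that fits your setup. The helper names maven_coordinate and build_session_with_package are hypothetical.

```python
def maven_coordinate(group, artifact, scala_version, version):
    """Build a 'group:artifact_scalaVersion:version' Maven coordinate string."""
    return "{}:{}_{}:{}".format(group, artifact, scala_version, version)


def build_session_with_package(coordinate, app_name="my_job"):
    # Imported here so maven_coordinate can be used without pyspark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        # Spark resolves and downloads this artifact when the session starts.
        .config("spark.jars.packages", coordinate)
        .appName(app_name)
        .master("local[*]")
        .getOrCreate()
    )
```

For example, maven_coordinate("org.mongodb.spark", "mongo-spark-connector", "2.11", "2.4.2") produces the string to pass to build_session_with_package (the version numbers here are assumed, not taken from the question). This route needs network access the first time so the artifact can be downloaded.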



Source: stackoverflow