
Tag: apache-spark

Adjust classpath / change Spring version in Azure Databricks

I’m trying to use the Apache Spark/Ignite integration in Azure Databricks. I installed the org.apache.ignite:ignite-spark-2.4:2.9.0 Maven library using the Databricks UI, and I get an error while accessing my Ignite caches: Here AbstractApplicationContext is compiled against the ReflectionUtils of a different Spring version. I can see that spring-core-4.3.26.RELEASE.jar is installed under /dbfs/FileStore/jars/maven/org/springframework during the org.apache.ignite:ignite-spark-2.4:2.9.0 installation, and there are no other Spring …
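
A quick way to confirm which Spring version actually wins on the driver classpath is to check where the clashing classes are loaded from. The following Scala sketch is not from the original question; it assumes the Ignite library (and its transitive spring-core) is already attached to the cluster:

```scala
// Diagnostic sketch: print the JAR each clashing Spring class is loaded from on the
// Databricks driver, to confirm whether spring-core-4.3.26.RELEASE.jar is being
// shadowed by a different Spring version elsewhere on the classpath.
import org.springframework.context.support.AbstractApplicationContext
import org.springframework.util.ReflectionUtils

def jarOf(cls: Class[_]): String =
  Option(cls.getProtectionDomain.getCodeSource)
    .map(_.getLocation.toString)
    .getOrElse("(bootstrap / unknown location)")

println(s"AbstractApplicationContext loaded from: ${jarOf(classOf[AbstractApplicationContext])}")
println(s"ReflectionUtils loaded from:            ${jarOf(classOf[ReflectionUtils])}")
```

If the two locations differ, the version mismatch the question describes is confirmed, and pinning or excluding one of the Spring artifacts when installing the Maven coordinate is the usual next step.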

How to resolve (java.lang.ClassNotFoundException: com.mongodb.spark.sql.DefaultSource.DefaultSource) in PySpark when using PyCharm

With PyCharm I’m getting this error: java.lang.ClassNotFoundException: com.mongodb.spark.sql.DefaultSource.DefaultSource. How can I resolve this issue? I tried: I also tried setting the classpath for the jars in .bash_profile: I had many jars in my_jars but still didn’t get it to work; I keep getting the same error. Answer: Provide comma-separated jar files instead of a directory path in spark.jars. Alternatively you can …
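
The answer’s point about spark.jars can be sketched as follows. This is not the original poster’s code (it is in Scala rather than PySpark, and the jar paths, versions, and MongoDB URI are placeholders), but the same comma-separated spark.jars value applies from either language:

```scala
// Minimal sketch: list the individual connector jars, comma-separated, in spark.jars
// instead of pointing it at a directory. Paths, versions, and the MongoDB URI below
// are placeholders for whatever is installed locally.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("mongo-spark-example")
  .master("local[*]")
  .config("spark.jars",
    "/path/to/mongo-spark-connector_2.11-2.4.1.jar," +   // comma-separated, no spaces
    "/path/to/mongo-java-driver-3.12.10.jar")
  .config("spark.mongodb.input.uri", "mongodb://localhost:27017/mydb.mycol")
  .getOrCreate()

// With the connector jars on the classpath, the DefaultSource class resolves:
val df = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()
df.printSchema()
```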

Apache Spark and Scala: error while executing queries

I am working with a dataset, a sample of which is as follows: I have executed the following commands successfully: I am getting the following error: java.lang.RuntimeException: Error while encoding: java.lang.RuntimeException: java.lang.Character is not a valid external type for schema of string. I get the same error when executing any query against the data. Can you please have a look and provide …
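
For context, this particular error usually means a Char ended up where the declared schema expects a String. The sketch below is a hedged reconstruction, not the original dataset or commands; the column names and sample rows are invented purely to reproduce and then avoid the error:

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

val spark = SparkSession.builder().appName("char-vs-string").master("local[*]").getOrCreate()

val raw = Seq("alice,F", "bob,M")                       // invented sample rows
val schema = StructType(Seq(
  StructField("name", StringType),
  StructField("gender", StringType)))

// Fails at runtime: line.split(",")(1)(0) is a java.lang.Character, but the schema
// declares gender as string, giving
//   "java.lang.Character is not a valid external type for schema of string"
// val badRows = raw.map { line => val f = line.split(","); Row(f(0), f(1)(0)) }
// spark.createDataFrame(spark.sparkContext.parallelize(badRows), schema).show()

// Works: convert the Char to a String before it reaches the Row.
val okRows = raw.map { line =>
  val f = line.split(",")
  Row(f(0), f(1)(0).toString)
}
spark.createDataFrame(spark.sparkContext.parallelize(okRows), schema).show()
```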

CSV file from HDFS to Oracle BLOB using Spark

I’m working on a Java app that uses Spark 2.3.1 to load data from Oracle to HDFS and vice versa. I want to create a CSV file in HDFS and then load it into an Oracle (12.2) BLOB. The code: I’m new to Spark, so any ideas on how to convert a JavaRDD to a BufferedInputStream, or to get rid of the mess above and put the Dataset …
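
One way around the JavaRDD-to-stream problem is to let Spark write the CSV to HDFS first and then stream the resulting file into the BLOB column over plain JDBC. The sketch below is not the original author’s code and is written in Scala rather than Java; the table name, column names, JDBC URL, and output path are all placeholders:

```scala
import java.io.BufferedInputStream
import java.sql.DriverManager
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("csv-to-oracle-blob").getOrCreate()
import spark.implicits._

// Stand-in for the Dataset the question builds from Oracle.
val dataset = Seq((1, "a"), (2, "b")).toDF("id", "value")

// 1. Write the CSV to HDFS as a single part file.
val outDir = "hdfs:///tmp/export_csv"
dataset.coalesce(1).write.option("header", "true").csv(outDir)

// 2. Locate the part file Spark produced and open it as a stream.
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
val partFile = fs.globStatus(new Path(s"$outDir/part-*"))(0).getPath
val length = fs.getFileStatus(partFile).getLen

// 3. Stream the HDFS file straight into the BLOB column via JDBC.
val conn = DriverManager.getConnection("jdbc:oracle:thin:@//dbhost:1521/service", "user", "pass")
val in = new BufferedInputStream(fs.open(partFile))
try {
  val ps = conn.prepareStatement("INSERT INTO csv_files (name, content) VALUES (?, ?)")
  ps.setString(1, partFile.getName)
  ps.setBinaryStream(2, in, length)   // JDBC streams the bytes into the BLOB column
  ps.executeUpdate()
  ps.close()
} finally {
  in.close()
  conn.close()
}
```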

ClassNotFoundException: Failed to find data source: bigquery

I’m trying to load data from Google BigQuery into Spark running on Google Dataproc (I’m using Java). I tried to follow the instructions here: https://cloud.google.com/dataproc/docs/tutorials/bigquery-connector-spark-example I get the error: “ClassNotFoundException: Failed to find data source: bigquery.” My pom.xml looks like this: After adding the dependency to my pom.xml, it was downloading a lot to build the .jar, so I think …
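
On Dataproc, the usual fix is to supply the spark-bigquery connector at run time (for example via the --jars flag of gcloud dataproc jobs submit spark) rather than shading it into the application jar. The snippet below is a minimal sketch in Scala rather than Java; the connector version and the public sample table are assumptions, not taken from the question:

```scala
// Minimal sketch, assuming the connector is provided at run time, e.g.
//   gcloud dataproc jobs submit spark \
//     --jars=gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.27.1.jar ...
// (the connector coordinates/version above are an assumption).
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("bigquery-example").getOrCreate()

// With the connector on the classpath, the "bigquery" data source resolves.
// The public sample table below is a stand-in; use your own project.dataset.table.
val df = spark.read
  .format("bigquery")
  .option("table", "bigquery-public-data.samples.shakespeare")
  .load()

df.select("word", "word_count").show(10)
```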
