Tag: apache-spark

Adjust classpath / change spring version in azure databricks

apache-spark azure-databricks ignite java spring

I’m trying to use Apache Spark/Ignite integration in Azure Databricks. I install the org.apache.ignite:ignite-spark-2.4:2.9.0 maven library using the Databricks UI. And I have an error while accessing my ignite cahces: Here the AbstractApplicationContext is compiled with ReflectionUtils of different spr…

How to get corresponding quarter of previous year in Scala

apache-spark date java localdate scala

I have a date string with me in the format – “20202” [“yyyyQ”]. Is there a way to get the corresponding quarter of previous year ? ex- for 20202 , it should be 20192 Answer An alternative to the other answers is using my lib Time4J and its class CalendarQuarter. Example: Two main…

Read data saved by spark redis using Java

apache-spark java redis spark-redis spring-data-redis

I using spark-redis to save Dataset to Redis. Then I read this data by using Spring data redis: This object I save to redis: Save object by using spark-redis: Repository: I can’t read this data have been saved in Redis by using Spring data redis because structure data saved by spark-redis and spring dat…

How to resolve (java.lang.ClassNotFoundException: com.mongodb.spark.sql.DefaultSource.DefaultSource) in pyspark i’m using pycharm

apache-spark java pyspark python

With Pycharm I’m getting this error: java.lang.ClassNotFoundException: com.mongodb.spark.sql.DefaultSource.DefaultSource How can I resolve this issue? I tried: I also tried setting the classpath of the jars also .bash_profile: I had many jars in my_jars but still didn’t get it to work. I keep gett…

Apache spark and scala, error while executing queries

apache-spark java scala

I am working with a dataset whose sample is as follows: I have executed the following commands successfully: I am getting following error: java.lang.RuntimeException: Error while encoding: java.lang.RuntimeException: java.lang.Character is not a valid external type for schema of string I am getting the same e…

CSV file from HDFS to Oracle BLOB using Spark

apache-spark java oracle

I’m working on Java app that uses Spark 2.3.1 to load data from Oracle to HDFS and vice versa. I want to create CSV file in HDFS and then load it to Oracle (12.2) BLOB. The code.. I’m new to Spark.. so any ideas please how to convert JavaRDD to BufferedInputStream, or get rid of mess above and put…

spark-submit error: Invalid maximum heap size: -Xmx4g –jars, but enough of memory on the system

apache-spark hail java

I am running a spark job: And the command gives an error: Invalid maximum heap size: -Xmx4g –jars Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. I checked memory: So, it seems to be fine. I checked java: Then I checked in Chrome whether spark …

ClassNotFoundException: Failed to find data source: bigquery

apache-spark google-bigquery google-cloud-dataproc java maven

I’m trying to load data from Google BigQuery into Spark running on Google Dataproc (I’m using Java). I tried to follow instructions on here: https://cloud.google.com/dataproc/docs/tutorials/bigquery-connector-spark-example I get the error: “ClassNotFoundException: Failed to find data source:…

Spark – Transforming Complex Data Types

apache-spark apache-spark-sql java user-defined-functions

Goal The goal I want to achieve is to read a CSV file (OK) encode it to Dataset<Person>, where Person object has a nested object Address[]. (Throws an exception) The Person CSV file In a file called person.csv, there is the following data describing some persons: The first line is the schema and address…

version conflict, current version [2] is different than the one provided [1]

apache-kafka apache-spark elasticsearch java

I have a Kafka topic and a Spark application. The Spark application gets data from Kafka topic, pre aggregates it and stores it in Elastic Search. Sounds simple, right? Everything works fine as expected, but the minute I set “spark.cores” property something other than 1, I start getting After rese…