CSV file from HDFS to Oracle BLOB using Spark

I’m working on a Java app that uses Spark 2.3.1 to load data from Oracle to HDFS and vice versa. I want to create a CSV file in HDFS and then load it into an Oracle (12.2) BLOB.

The code:

JavaScript

I’m new to Spark, so any ideas on how to convert a JavaRDD to a BufferedInputStream, or on how to get rid of the mess above and put the Dataset into an Oracle BLOB in a saner way, would be appreciated.

Thanks

Answer

Finally, after a couple of days of fighting with Oracle, Hadoop and Spark, I found a solution for my task:

JavaScript

Writing a 2 GB CSV from a Spark Dataset to HDFS, and then reading that CSV from HDFS into an Oracle BLOB, took about 5 minutes.
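For readers who want to see the shape of the second step (stream in, BLOB out), here is a minimal sketch using only standard JDBC and `java.io`. It is not the original answer's code. The class name `CsvToBlob`, the helper names, the buffer size and the `insertSql` parameter are all hypothetical. It assumes the CSV is available as a plain `InputStream` (with Hadoop that would typically come from `FileSystem.open(path)`), and that the JDBC driver supports `Connection.createBlob()`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Blob;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper class; names are illustrative, not from the original post.
class CsvToBlob {

    /** Copies everything from in to out in fixed-size chunks and returns the byte count. */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    /**
     * Streams csv into a new BLOB and binds it to parameter 1 of insertSql.
     * Assumption: insertSql has exactly one placeholder, for the BLOB column,
     * e.g. "INSERT INTO some_table (csv_data) VALUES (?)" (hypothetical table).
     */
    static long insertAsBlob(Connection conn, String insertSql, InputStream csv)
            throws SQLException, IOException {
        Blob blob = conn.createBlob();
        try {
            long written;
            // BLOB positions are 1-based in JDBC.
            try (OutputStream out = blob.setBinaryStream(1)) {
                written = copy(csv, out, 8192);
            }
            try (PreparedStatement ps = conn.prepareStatement(insertSql)) {
                ps.setBlob(1, blob);
                ps.executeUpdate();
            }
            return written;
        } finally {
            blob.free(); // release resources held by the BLOB object
        }
    }
}
```

The point of copying in chunks is that the file is never held in memory as a whole, which matters for a CSV of this size. Note also that Spark normally writes a CSV as a directory of part files, so the stream passed in would have to be opened on the actual part file, not on the output directory.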

User contributions licensed under: CC BY-SA