Hadoop NumberFormatException on string “ ”

20.2 on Windows with Cygwin (for a class project). I’m not sure why, but I cannot run any jobs; I just get a NumberFormatException. I think it’s an issue with my machine because I cannot even run …

MapReduce filtering to get customers not in order list?

I am currently learning MapReduce and trying to figure out how to code this in Java. There are two input files, customers.txt and car_orders.txt: customers.txt =================== 12345 Peter 12346 …

Caused by: java.lang.ClassNotFoundException: play.api.libs.functional.syntax.package

I am getting the following error (Caused by: java.lang.ClassNotFoundException: play.api.libs.functional.syntax.package) when I try to run my code. I have the right dependencies and added the right Jar …

Read a file from Google Storage in Dataproc

I’m trying to migrate a Scala Spark job from a Hadoop cluster to GCP. I have this snippet of code that reads a file and creates an ArrayBuffer[String]: import java.io._ import org.apache.hadoop….

Create file with WebHDFS

I would like to create a file in HDFS with WebHDFS. I wrote the function below: public ResponseEntity createFile(MultipartFile f) throws URISyntaxException { URI uriPut = new URI( …

Unable to submit concurrent Hadoop jobs

I am running Hadoop 2.7 on my local machine, along with HBase 1.4 and Phoenix 4.15. I have written an application that submits MapReduce jobs that delete data in HBase through Phoenix. Each job is …

Reducer doesn’t call reduce method when using my own class as output value MapReduce Hadoop

I was trying to use my own class object as the output value of my Mapper and use it inside the Reducer, but the reduce() method isn’t called and my app terminates if I remove the …

Exception in thread “main” java.lang.NoClassDefFoundError: org/apache/hadoop/tracing/SpanReceiverHost

I am running Hadoop 2.8.1 and Hive 2.3.0. I am trying to read values from a table created in Hive, and the current exception is java.lang.ClassNotFoundException: org.apache.hadoop.tracing….

How to do CopyMerge in Hadoop 3.0?

I know Hadoop 2.7’s FileUtil has the copyMerge function, which merges multiple files into a new one, but copyMerge is no longer supported per the API in version 3.0. Any …

Spark SASL not working on EMR with YARN

First, I want to say the only thing I have seen that addresses this issue is here: Spark 1.6.1 SASL. However, after adding the configuration for Spark and YARN authentication, it is still not working. …