
Tag: hadoop

Unable to submit concurrent Hadoop jobs

I am running Hadoop 2.7 on my local machine, along with HBase 1.4 and Phoenix 4.15. I have written an application that submits MapReduce jobs which delete data in HBase through Phoenix. Each job is run by an individual thread of a ThreadPoolExecutor and looks like this: Everything is fine if there is only one thread in the ThreadPoolExecutor.
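A common pitfall in this situation is sharing a single Configuration or Job instance across threads; each submission should build its own. Below is a minimal sketch of the pattern, not the asker's actual code: the table names and the job setup comment are placeholders.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ConcurrentJobs {
    public static void main(String[] args) throws Exception {
        // Each worker builds its own Job and Configuration; these objects
        // are not safe to share across concurrent submissions.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (String table : new String[] {"T1", "T2", "T3"}) { // placeholder tables
            pool.submit(() -> {
                Job job = Job.getInstance(new Configuration(), "delete-" + table);
                // ... configure the Phoenix delete job (mapper, input, output) here ...
                return job.waitForCompletion(true);
            });
        }
        pool.shutdown();
    }
}
```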

jps command for Hadoop processes

I have Hadoop 2.4.1 running on Ubuntu. Executing the jps command, I get this output: is it normal to get “3794 org.eclipse.equinox.launcher_1.5.0.v20180512-1130.jar” along with the rest of the jps output? I am asking because I didn’t get it before; suddenly, jps started giving this result. Answer: jps lists all Java processes on your machine. It is not specific to
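For comparison, the JDK’s Attach API can enumerate local JVMs much as jps does; a rough sketch (requires the jdk.attach module on JDK 9+):

```java
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;

public class ListJvms {
    public static void main(String[] args) {
        // Lists JVMs started by the current user, similar to jps output:
        // process id followed by the main class or launcher jar.
        for (VirtualMachineDescriptor vm : VirtualMachine.list()) {
            System.out.println(vm.id() + " " + vm.displayName());
        }
    }
}
```

The Equinox launcher jar shows up simply because Eclipse itself runs as a Java process.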

java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.createDirectoryWithMode0

I cannot solve this exception; I’ve read the Hadoop documentation and all related Stack Overflow questions that I could find. My fileSystem.mkdirs(***) throws: I am including the following dependencies in my app (via Maven pom.xml), all in version 2.6.0-cdh5.13.0: hadoop-common, hadoop-hdfs, hadoop-client, hadoop-minicluster. My filesystem variable is a valid (hadoop-common) FileSystem (org.apache.hadoop.fs.FileSystem). I downloaded the Hadoop files from https://github.com/steveloughran/winutils/tree/master/hadoop-2.6.0/bin. I stored
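On Windows, this UnsatisfiedLinkError usually means Hadoop cannot locate winutils.exe and hadoop.dll. A minimal sketch, assuming the downloaded bin directory was unpacked under C:\hadoop\bin (the path is an assumption for illustration; hadoop.dll must also be reachable via java.library.path):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MkdirsExample {
    public static void main(String[] args) throws Exception {
        // Tell Hadoop where the native Windows binaries live *before*
        // the first FileSystem call; expects C:\hadoop\bin\winutils.exe.
        System.setProperty("hadoop.home.dir", "C:\\hadoop");
        FileSystem fs = FileSystem.get(new Configuration());
        fs.mkdirs(new Path("/tmp/example")); // the call that threw above
        System.out.println("mkdirs succeeded");
    }
}
```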

Exception in thread “main” java.lang.NoClassDefFoundError: org/apache/hadoop/tracing/SpanReceiverHost

I am running Hadoop 2.8.1 and Hive 2.3.0. I am trying to read values from a table created in Hive, and the current exception is: And here is the code that I have used to read the table. And here is the pom file that I have used, with these dependencies: org.apache.hive.hcatalog:hive-hcatalog-core:2.3.0, org.apache.hive.hcatalog:hive-hcatalog:0.13.1-cdh5.3.5, org.apache.hive:hive-common:2.3.0. Answer
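An error like this often points to mixed artifact versions on the classpath; note the 0.13.1-cdh5.3.5 jar sitting next to the 2.3.0 ones above. One quick diagnostic, not specific to this question, is to print which jar actually supplied a Hadoop class at runtime:

```java
public class WhichJar {
    public static void main(String[] args) {
        // Prints the jar that the classloader resolved FileSystem from;
        // a stale or mismatched jar here is a common cause of
        // NoClassDefFoundError such as the SpanReceiverHost one above.
        System.out.println(org.apache.hadoop.fs.FileSystem.class
                .getProtectionDomain().getCodeSource().getLocation());
    }
}
```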

Hadoop Error starting ResourceManager and NodeManager

I’m trying to set up Hadoop 3 (alpha3) as a Single Node Cluster (pseudo-distributed), using the Apache guide to do so. I’ve tried running the example MapReduce job, but every time the connection is refused. After running sbin/start-all.sh I’ve been seeing these exceptions in the ResourceManager log (and similarly in the NodeManager log): And then later in the file: For reference, my
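Since the job fails with “connection refused”, a quick sanity check is whether the ResourceManager is listening at all. A sketch assuming the default client port 8032 (yarn.resourcemanager.address); adjust host and port if your yarn-site.xml overrides them:

```java
import java.net.InetSocketAddress;
import java.net.Socket;

public class RmPortCheck {
    public static void main(String[] args) throws Exception {
        // If this fails, the ResourceManager never came up, and the
        // exceptions in its log are the place to look, not the client.
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress("localhost", 8032), 2000);
            System.out.println("ResourceManager is listening on 8032");
        }
    }
}
```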

Unable to connect to Phoenix using JDBC

I have a Hadoop cluster set up with HBase and Phoenix, and I’m trying to connect to Phoenix using JDBC, but I am unable to get a successful connection. I ultimately want to use JDBC from Python 3.x, but for simple test purposes I set up a connection using Java in Eclipse. I was originally using
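For reference, a bare-bones Phoenix JDBC connection from Java looks roughly like this; the ZooKeeper quorum, port, and znode in the URL are placeholders for the actual cluster:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixJdbcTest {
    public static void main(String[] args) throws Exception {
        // URL format: jdbc:phoenix:<zookeeper quorum>:<port>:<hbase znode>
        String url = "jdbc:phoenix:localhost:2181:/hbase"; // placeholder values
        try (Connection conn = DriverManager.getConnection(url);
             Statement st = conn.createStatement();
             // SYSTEM.CATALOG always exists, so this verifies the connection
             ResultSet rs = st.executeQuery(
                     "SELECT TABLE_NAME FROM SYSTEM.CATALOG LIMIT 5")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```

The phoenix-client jar ships a JDBC service descriptor, so an explicit Class.forName is normally unnecessary.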

How to do CopyMerge in Hadoop 3.0?

I know that Hadoop 2.7’s FileUtil has a copyMerge function that merges multiple files into a new one, but copyMerge is no longer supported in the 3.0 API. Any ideas on how to merge all files within a directory into a single new file in Hadoop 3.0? Answer: The FileUtil#copyMerge method has been
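One way to fill the gap is to re-implement the merge with listStatus and IOUtils.copyBytes. A minimal sketch that merges within a single FileSystem (the old helper also supported separate source/destination filesystems, a delete-source flag, and a separator string):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class CopyMerge {
    // Concatenates every file in srcDir into dstFile, roughly what
    // FileUtil.copyMerge did before it was removed.
    public static void copyMerge(FileSystem fs, Path srcDir, Path dstFile,
                                 Configuration conf) throws IOException {
        FileStatus[] parts = fs.listStatus(srcDir);
        Arrays.sort(parts); // FileStatus sorts by path, matching the old behavior
        try (OutputStream out = fs.create(dstFile)) {
            for (FileStatus part : parts) {
                if (part.isFile()) {
                    try (FSDataInputStream in = fs.open(part.getPath())) {
                        IOUtils.copyBytes(in, out, conf, false);
                    }
                }
            }
        }
    }
}
```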
