Unable to submit concurrent Hadoop jobs

I am running Hadoop 2.7 on my local machine, along with HBase 1.4 and Phoenix 4.15. I have written an application that submits MapReduce jobs which delete data in HBase through Phoenix. Each job is run by its own thread in a ThreadPoolExecutor and looks like this:

public class MRDeleteTask extends Task {

    private final Logger LOGGER = LoggerFactory.getLogger(MRDeleteTask.class);
    private String query;
    public MRDeleteTask(int id, String q) {
        this.setId(id);
        this.query = q;
    }

    @Override
    public void run() {
        LOGGER.info("Running Task: " + getId());
        try {
            Configuration configuration = HBaseConfiguration.create();
            Job job = Job.getInstance(configuration, "phoenix-mr-job-"+getId());
            LOGGER.info("mapper input: " + this.query);
            // Use the query passed to the constructor (no QUERY constant is defined in this class).
            PhoenixMapReduceUtil.setInput(job, DeleteMR.PhoenixDBWritable.class, "Table", this.query);
            job.setMapperClass(DeleteMR.DeleteMapper.class);
            job.setJarByClass(DeleteMR.class);
            job.setNumReduceTasks(0);
            job.setOutputFormatClass(NullOutputFormat.class);
            job.setOutputKeyClass(ImmutableBytesWritable.class);
            job.setOutputValueClass(Writable.class);
            TableMapReduceUtil.addDependencyJars(job);
            boolean result = job.waitForCompletion(true);
            LOGGER.info("Task " + getId() + " completed successfully: " + result);
        }
        catch (Exception e) {
            LOGGER.info(e.getMessage());
        }
    }
}

Everything works if there is only one thread in the ThreadPoolExecutor. If more than one such Hadoop job is submitted concurrently, nothing happens. According to the logs, the errors look like this:

4439 [pool-1-thread-2] INFO  MRDeleteTask  - java.util.concurrent.ExecutionException: java.io.IOException: Unable to rename file: [/tmp/hadoop-user/mapred/local/1595274269610_tmp/tmp_phoenix-4.15.0-HBase-1.4-client.jar] to [/tmp/hadoop-user/mapred/local/1595274269610_tmp/phoenix-4.15.0-HBase-1.4-client.jar]

4439 [pool-1-thread-1] INFO  MRDeleteTask  - java.util.concurrent.ExecutionException: ExitCodeException exitCode=1: chmod: /private/tmp/hadoop-user/mapred/local/1595274269610_tmp/phoenix-4.15.0-HBase-1.4-client.jar: No such file or directory

The tasks are submitted using ThreadPoolExecutor.submit(), and their status is checked by calling isDone() on the returned Future.
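
A side note on the status check: run() above catches every exception and only logs it, so the Future completes normally and isDone() returns true whether or not the job worked. A minimal JDK-only sketch of the difference follows. The class FutureStatusDemo and its two tasks are hypothetical stand-ins, not the MRDeleteTask from the question, and the IOException message is just borrowed from the log above.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureStatusDemo {

    /** Hypothetical task that swallows its own exception, like run() in the question. */
    static Runnable swallowingTask() {
        return () -> {
            try {
                throw new IOException("Unable to rename file");
            } catch (Exception e) {
                // Logged only; the executor never learns that the task failed.
                System.out.println("task logged: " + e.getMessage());
            }
        };
    }

    /** Hypothetical task that lets the exception propagate to its Future. */
    static Callable<Boolean> propagatingTask() {
        return () -> {
            throw new IOException("Unable to rename file");
        };
    }

    /** Returns "ok" if the task completed normally, otherwise the cause's message. */
    static String outcome(Future<?> future) throws InterruptedException {
        try {
            future.get(); // blocks until the task finishes and rethrows its failure
            return "ok";
        } catch (ExecutionException e) {
            return "failed: " + e.getCause().getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<?> swallowed = pool.submit(swallowingTask());
            Future<Boolean> propagated = pool.submit(propagatingTask());
            System.out.println(outcome(swallowed));
            System.out.println(outcome(propagated));
            // isDone() is true for both; it does not distinguish success from failure.
            System.out.println(swallowed.isDone() + " " + propagated.isDone());
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch only the task that propagates its exception reports a failure through get(); the swallowing task looks successful to the caller, which may be why the failure was visible only in the logs.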

Answer

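The jobs were not being submitted to YARN; they were running locally from IntelliJ. When mapreduce.framework.name is left at its default value of local, each job runs in-process through the local job runner. Both log lines in the question refer to the same directory, /tmp/hadoop-user/mapred/local/1595274269610_tmp, which suggests that the concurrent local jobs collided while staging the Phoenix client jar. Adding the following to the job configuration solved the issue. Set it before calling Job.getInstance, because the job takes its own copy of the configuration:

conf.set("mapreduce.framework.name", "yarn");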
User contributions licensed under: CC BY-SA