MapReduce filtering to get customers not in order list?

I'm currently learning MapReduce and trying to figure out how to code this in Java.

Two input files, called customers.txt and car_orders.txt:

JavaScript

The idea is to apply MapReduce and output the customers who did not make a car order; in the scenario above, that is Emily.

JavaScript

This is what I have in mind:

JavaScript

Any help in the form of pseudocode for this application will be greatly appreciated.

Answer

It's basically a reduce-side join where you discard the keys that have both sides filled, just as you put it in your pseudocode.

The code for that in Hadoop MapReduce would look like this:

JavaScript

That would emit:

JavaScript

So the reducer would look fairly simple:

JavaScript

And that should just emit Emily.
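
To make the flow concrete without a Hadoop cluster, here is a minimal plain-Java simulation of the same reduce-side join. It is only a sketch under assumptions: the input formats (one name per line in customers.txt, `name,car` lines in car_orders.txt), the sample names other than Emily, the `C`/`O` tags, and the class and method names are all hypothetical. The `TreeMap` stands in for the shuffle phase that Hadoop performs between map and reduce.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CustomersWithoutOrders {
    // Hypothetical tags marking which input file a record came from.
    static final String CUSTOMER = "C";
    static final String ORDER = "O";

    // Map phase: emit (customerName, tag) for every non-empty input line.
    // Assumed formats: customers.txt -> "name", car_orders.txt -> "name,car".
    static void map(String line, boolean fromOrders, Map<String, List<String>> shuffle) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return;
        }
        String key = fromOrders ? trimmed.split(",")[0].trim() : trimmed;
        String tag = fromOrders ? ORDER : CUSTOMER;
        // Grouping values by key simulates the shuffle step.
        shuffle.computeIfAbsent(key, k -> new ArrayList<>()).add(tag);
    }

    // Reduce phase: keep a key only if it has a customer record and no order record.
    static boolean reduce(List<String> tags) {
        return tags.contains(CUSTOMER) && !tags.contains(ORDER);
    }

    static List<String> run(List<String> customers, List<String> orders) {
        Map<String, List<String>> shuffle = new TreeMap<>();
        for (String line : customers) {
            map(line, false, shuffle);
        }
        for (String line : orders) {
            map(line, true, shuffle);
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : shuffle.entrySet()) {
            if (reduce(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data; only "Emily" comes from the question.
        List<String> customers = List.of("Adam", "Emily", "John");
        List<String> orders = List.of("Adam,Audi", "John,Ford", "Adam,BMW");
        System.out.println(run(customers, orders)); // prints [Emily]
    }
}
```

In a real Hadoop job the mapper would need some way to tell the two files apart, for example by registering a separate mapper class per input path with `MultipleInputs`, and the reducer would write the surviving key to its context instead of returning a boolean.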

User contributions licensed under: CC BY-SA