Hadoop Combiner
The Hadoop combiner is used to reduce the network bandwidth consumed by a MapReduce job.
The combiner sits between the Mapper and the Reducer, so the workflow becomes Mapper -> Combiner -> Reducer.
The combiner acts as a mini-reducer: the output of the Mapper is sent to the Combiner, and the Combiner's output is sent to the Reducer.
Example: suppose a Mapper emits the word-count pair ("Hello", 1) three times. Instead of passing the three pairs ("Hello", 1), ("Hello", 1), ("Hello", 1) across the network to the reducer, the combiner passes a single ("Hello", 3), which reduces both the data shuffled and the overhead on the reducer.
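The snippet below is a minimal sketch of the classic WordCount job with a combiner enabled, written against the Hadoop MapReduce Java API (the class name WordCountWithCombiner is just an illustrative choice). The key line is job.setCombinerClass(IntSumReducer.class), which reuses the sum reducer as the map-side combiner so that partial counts like ("Hello", 3) are produced before the shuffle.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountWithCombiner {

    // Mapper: emits ("Hello", 1) for every occurrence of a word.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Sum reducer: because addition is commutative and associative,
    // the same class can safely be used as the combiner.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count with combiner");
        job.setJarByClass(WordCountWithCombiner.class);
        job.setMapperClass(TokenizerMapper.class);
        // Map-side mini-reduce: ("Hello",1) x3 -> ("Hello",3) before the shuffle.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Note that Hadoop may run the combiner zero, one, or several times per map task, so the job must produce the same final result whether or not the combiner runs.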
Difference between Combiner and Reducer
- Combiners can only be used for operations that are commutative and associative (e.g. a + b == b + a), such as sums or maxima, so that combining partial results gives the same answer as reducing all values at once; a counter-example is sketched after this list.
- A reducer can receive input from multiple mappers, but a combiner receives input from only one mapper.
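As a rough illustration of the first point, the small standalone sketch below (plain Java, no Hadoop dependency) shows why an operation like averaging cannot safely be pre-aggregated by a combiner, while summing can: the average of partial averages is not the overall average.

```java
public class AverageCombinerPitfall {

    // Plain arithmetic mean of the given values.
    static double avg(double... xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    public static void main(String[] args) {
        // Reducer seeing all values at once: avg(1, 2, 3) = 2.0
        System.out.println("true average        = " + avg(1, 2, 3));

        // If a combiner pre-averaged each mapper's values, the reducer
        // would average the partial averages: avg(avg(1, 2), avg(3))
        // = avg(1.5, 3.0) = 2.25, which is wrong.
        System.out.println("average of averages = " + avg(avg(1, 2), avg(3)));

        // Summing, by contrast, is safe: (1 + 2) + 3 == 1 + 2 + 3.
    }
}
```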