Hadoop Combiner
The Hadoop combiner is used to reduce the network bandwidth consumed by a MapReduce job.
The combiner sits between the Mapper and the Reducer, so the workflow becomes Mapper -> Combiner -> Reducer.
The combiner acts as a mini-reducer: the output of the Mapper is sent to the Combiner, and the Combiner's output is sent to the Reducer.
Example: suppose a Mapper emits the word-count pair ("Hello", 1) three times. Instead of passing the three pairs ("Hello", 1), ("Hello", 1), ("Hello", 1) across the network to the reducer, the combiner passes a single ("Hello", 3), which reduces both the data shuffled and the overhead on the reducer.
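The snippet below is a minimal sketch of the classic WordCount job with a combiner enabled, written against the Hadoop MapReduce Java API (the class name WordCountWithCombiner is just an illustrative choice). The key line is job.setCombinerClass(IntSumReducer.class), which reuses the sum reducer as the map-side combiner so that partial counts like ("Hello", 3) are produced before the shuffle.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountWithCombiner {

    // Mapper: emits ("Hello", 1) for every occurrence of a word.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Sum reducer: because addition is commutative and associative,
    // the same class can safely be used as the combiner.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count with combiner");
        job.setJarByClass(WordCountWithCombiner.class);
        job.setMapperClass(TokenizerMapper.class);
        // Map-side mini-reduce: ("Hello",1) x3 -> ("Hello",3) before the shuffle.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Note that Hadoop may run the combiner zero, one, or several times per map task, so the job must produce the same final result whether or not the combiner runs.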
Difference between Combiner and Reducer
- Combiners can only be used for operations that are commutative and associative (e.g. a + b == b + a), such as sums or maxima, so that combining partial results gives the same answer as reducing all values at once; a counter-example is sketched after this list.
- A reducer can receive input from multiple mappers, but a combiner receives input from only one mapper.
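As a rough illustration of the first point, the small standalone sketch below (plain Java, no Hadoop dependency) shows why an operation like averaging cannot safely be pre-aggregated by a combiner, while summing can: the average of partial averages is not the overall average.

```java
public class AverageCombinerPitfall {

    // Plain arithmetic mean of the given values.
    static double avg(double... xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    public static void main(String[] args) {
        // Reducer seeing all values at once: avg(1, 2, 3) = 2.0
        System.out.println("true average        = " + avg(1, 2, 3));

        // If a combiner pre-averaged each mapper's values, the reducer
        // would average the partial averages: avg(avg(1, 2), avg(3))
        // = avg(1.5, 3.0) = 2.25, which is wrong.
        System.out.println("average of averages = " + avg(avg(1, 2), avg(3)));

        // Summing, by contrast, is safe: (1 + 2) + 3 == 1 + 2 + 3.
    }
}
```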