mapreduce - hadoop map reduce secondary sorting -

can 1 explain me how secondary sorting works in hadoop ?
why must 1 use groupingcomparator , how work in hadoop ?

i going through link given below , got doubt on how groupcomapator works.
can 1 explain me how grouping comparator works?

http://www.bigdataspeak.com/2013/02/hadoop-how-to-do-secondary-sort-on_25.html

grouping comparator

once data reaches reducer, data grouped key. since have composite key, need make sure records grouped solely natural key. accomplished writing custom grouppartitioner. have comparator object considering yearmonth field of temperaturepair class purposes of grouping records together.

public class yearmonthgroupingcomparator extends writablecomparator {      public yearmonthgroupingcomparator() {         super(temperaturepair.class, true);     }      @override     public int compare(writablecomparable tp1, writablecomparable tp2) {         temperaturepair temperaturepair = (temperaturepair) tp1;         temperaturepair temperaturepair2 = (temperaturepair) tp2;         return temperaturepair.getyearmonth().compareto(temperaturepair2.getyearmonth());     } }

here results of running our secondary sort job:

new-host-2:sbin bbejeck$ hdfs dfs -cat secondary-sort/part-r-00000

190101 -206

190102 -333

190103 -272

190104 -61

190105 -33

190106 44

190107 72

190108 44

190109 17

190110 -33

190111 -217

190112 -300

while sorting data value may not common need, it’s nice tool have in pocket when needed. also, have been able take deeper @ inner workings of hadoop working custom partitioners , group partitioners. refer link also..what use of grouping comparator in hadoop map reduce

Story

Search This Blog

mapreduce - hadoop map reduce secondary sorting -