Iterate Twice in MapReduce

I have written a Reducer job in which both the key and the value are composite. I need to iterate through the values twice, so I am trying to cache them, but the same value is getting repeated. Please help me out.

Below is my Reducer class.

    import java.io.IOException;
    import java.text.DateFormat;
    import java.text.SimpleDateFormat;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

    public static class Reducerclass extends Reducer<Text, Text, Text, Text> {
        DateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd hh:mm:ss a");

        private MultipleOutputs<Text, Text> multipleOutputs;

        @Override
        public void setup(Context context) {
            multipleOutputs = new MultipleOutputs<Text, Text>(context);
        }

        @Override
        public void reduce(Text rkey, Iterable<Text> rvalue, Context context)
                throws IOException, InterruptedException {
            // Hadoop reuses the same Text instance for every value,
            // so each value is copied before it is cached.
            List<Text> cachedValues = new ArrayList<Text>();

            // First pass: consume the iterator and cache a copy of each value.
            for (Text value : rvalue) {
                System.out.println("first iteration: " + value);
                cachedValues.add(new Text(value));
                context.write(new Text(rkey + ", "), new Text(value + "--> first iteration"));
            }

            // Second pass: replay the cached copies.
            int size = cachedValues.size();
            for (Text value : cachedValues) {
                System.out.println("second iteration: " + value);
                context.write(new Text(rkey + ", "),
                        new Text(value + "--> Second iteration--->" + "Array Size -->" + size));
            }
        }
    }

Input File:

1509075052824 13.0619798 80.1468367
1509075112825 13.07537311 80.19612851
1509073985114 13.0507832 80.25069245
1509075072824 12.91690859 80.06168244

Expected Output:

first iteration: 1509075052824 13.0619798 80.1468367
first iteration: 1509075112825 13.07537311 80.19612851
first iteration: 1509073985114 13.0507832 80.25069245
first iteration: 1509075072824 12.91690859 80.06168244

second iteration: 1509075052824 13.0619798 80.1468367
second iteration: 1509075112825 13.07537311 80.19612851
second iteration: 1509073985114 13.0507832 80.25069245
second iteration: 1509075072824 12.91690859 80.06168244

Current Output:

1509075042823 12.91877675 80.0466234--> first iteration
1509075042823 12.91877675 80.0466234--> Second iteration--->Array Size -->1
1509074972821 12.91738175 80.05294765--> first iteration
1509074972821 12.91738175 80.05294765--> Second iteration--->Array Size -->1
1509073795109 13.05561879 80.11920979--> first iteration
1509073795109 13.05561879 80.11920979--> Second iteration--->Array Size -->1
1509075132826 12.97988349 80.16310309--> first iteration
1509075132826 12.97988349 80.16310309--> Second iteration--->Array Size -->1
1509073885111 13.06640175 80.2457003--> first iteration
1509073885111 13.06640175 80.2457003--> Second iteration--->Array Size -->1

Thanks in Advance!



Solution 1:[1]

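Your current output shows an array size of 1 for every key, so each reduce() call receives exactly one value. If you want all the values collected into one single ArrayList, they all need to reach the same Reducer call.

To get that, your Mapper needs to always output the same rkey.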
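The effect of the key choice can be sketched without Hadoop. The snippet below is only a simplified simulation of how values are grouped before reduce() is called: `GroupingDemo`, `groupByKey`, and the constant key `"all"` are hypothetical names, and using the timestamp as the key is an assumption about the question's mapper, not something the question states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupingDemo {
    // Simplified stand-in for the shuffle: values are grouped by key,
    // and each distinct key would produce exactly one reduce() call.
    static Map<String, List<String>> groupByKey(List<String[]> pairs) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String[] pair : pairs) {
            groups.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] lines = {
            "1509075052824 13.0619798 80.1468367",
            "1509075112825 13.07537311 80.19612851",
            "1509073985114 13.0507832 80.25069245",
            "1509075072824 12.91690859 80.06168244"
        };

        List<String[]> distinctKeys = new ArrayList<>();
        List<String[]> constantKey = new ArrayList<>();
        for (String line : lines) {
            // Assumption: the mapper currently keys each record by its timestamp.
            distinctKeys.add(new String[] {line.split(" ")[0], line});
            // Alternative: every record is emitted under one hypothetical constant key.
            constantKey.add(new String[] {"all", line});
        }

        // One group per timestamp, each holding a single value.
        System.out.println("groups with distinct keys: " + groupByKey(distinctKeys).size());
        // One group holding every value, so a cached list would contain all of them.
        System.out.println("values under constant key: " + groupByKey(constantKey).get("all").size());
    }
}
```

With distinct keys, every group holds one value, which matches the "Array Size -->1" lines in the question. With a constant key, all four lines end up in the same group.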

Solution 2:[2]

I iterated twice by marking the iterator first and then resetting it after the first pass. The key point is to use ReduceContext.ValueIterator:

    ReduceContext.ValueIterator<Text> iterator = (ReduceContext.ValueIterator<Text>) rvalue.iterator();
    iterator.mark(); // mark the first position
    // first iteration
    while (iterator.hasNext()) {
        Text value = iterator.next();
        // do your things
    }
    // rewind to the mark
    iterator.reset();
    // second iteration
    while (iterator.hasNext()) {
        Text value = iterator.next();
        // do your things
    }

Once the mark has been set, the task writes to a backup file (back_0_0.out) on every call to next(). The mark cannot be cleared during the first iteration, because reset() needs it. It can be cleared once the iterator has been reset:

    ReduceContext.ValueIterator<Text> iterator = (ReduceContext.ValueIterator<Text>) rvalue.iterator();
    iterator.mark(); // mark the first position
    // first iteration
    while (iterator.hasNext()) {
        Text value = iterator.next();
        // do your things
    }
    // rewind to the mark
    iterator.reset();
    // the mark is no longer needed
    iterator.clearMark();
    // second iteration
    while (iterator.hasNext()) {
        Text value = iterator.next();
        // do your things
    }

    // The same Iterable instance is used for every key group, so reset() is called
    // once more to clear the clearMarkFlag; the exception it throws is ignored.
    try {
        iterator.reset();
    } catch (Exception ignored) {
    }
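To make the mark / reset / clearMark sequence above easier to follow, here is a toy model in plain Java. It is not Hadoop's implementation: `ReplayIterator` and `drain` are hypothetical names, and the sketch buffers values in memory, whereas the answer above describes the real task writing a backup file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Toy model of a markable iterator over a single-pass source (hypothetical, not Hadoop code).
class ReplayIterator<T> implements Iterator<T> {
    private final Iterator<T> source;                 // can only be consumed once
    private final List<T> buffer = new ArrayList<>(); // values recorded since mark()
    private boolean marked = false;
    private int replayPos = 0;                        // next buffered value to replay

    ReplayIterator(Iterator<T> source) {
        this.source = source;
    }

    // Start recording from the current position, dropping values already replayed.
    void mark() {
        buffer.subList(0, replayPos).clear();
        replayPos = 0;
        marked = true;
    }

    // Rewind to the marked position; fails if no mark is active.
    void reset() {
        if (!marked) {
            throw new IllegalStateException("reset() called without a mark");
        }
        replayPos = 0;
    }

    // Stop recording; values already buffered can still be replayed once.
    void clearMark() {
        marked = false;
    }

    @Override
    public boolean hasNext() {
        return replayPos < buffer.size() || source.hasNext();
    }

    @Override
    public T next() {
        if (replayPos < buffer.size()) {
            return buffer.get(replayPos++);
        }
        T value = source.next();
        if (marked) {
            buffer.add(value);
            replayPos++;
        }
        return value;
    }

    // Helper that consumes the rest of an iterator into a list.
    static <T> List<T> drain(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        ReplayIterator<String> it = new ReplayIterator<>(Arrays.asList("a", "b", "c").iterator());
        it.mark();
        System.out.println("first pass:  " + drain(it));
        it.reset();
        it.clearMark();
        System.out.println("second pass: " + drain(it));
    }
}
```

Both passes see the same three values. In this toy model, calling reset() after clearMark() throws, which is the same kind of situation the try/catch in the answer works around.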

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 OneCricketeer
Solution 2