Merging 4 sorted Arrays into one

I have this method to merge 2 sorted arrays into one sorted array:

    public void merge(T[] a, int l1, int r1, T[] b, int l2, int r2, T[] c, int l3) {
        while (l1 < r1 && l2 < r2) {
            if (a[l1].compareTo(b[l2]) < 0) {
                c[l3++] = a[l1++];
            } else
                c[l3++] = b[l2++];
        }

        while (l1 < r1)
            c[l3++] = a[l1++];
        while (l2 < r2)
            c[l3++] = b[l2++];
    }

But now I want to do it with 4 arrays at once.

I tried for a long time to come up with a solution, but without success. Does anybody have an idea how to do it?



Solution 1:[1]

I've generalized the problem to "merging N sorted arrays into a single sorted array".

The code provided in the question uses generics, which introduces a problem: generic arrays are not type-safe. In short, arrays and generics behave very differently: arrays are covariant and check their element type at run time, whereas generics are invariant and are checked only at compile time. Because of that, the compiler is not able to detect certain problems when generics and arrays are mixed. It is good practice to avoid generic arrays.
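A minimal sketch of that difference, using only the JDK (the class name CovarianceDemo and the helper storeIntoObjectArrayFails are made up for illustration): the array assignment compiles and only fails at run time, while the equivalent assignment with generic lists is rejected by the compiler.

```java
import java.util.ArrayList;
import java.util.List;

class CovarianceDemo {

    // Returns true if storing an Integer through an Object[] view of a String[] fails at run time.
    static boolean storeIntoObjectArrayFails() {
        Object[] objects = new String[1]; // compiles: arrays are covariant
        try {
            objects[0] = 42;              // compiles too, but the JVM checks the real element type
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeIntoObjectArrayFails()); // prints true

        List<String> strings = new ArrayList<>();
        // List<Object> objects = strings; // does not compile: generics are invariant
        strings.add("ok");
        System.out.println(strings);      // prints [ok]
    }
}
```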

Also, since this is clearly an algorithmic problem (so its audience is broader than readers with the deep insight into Java that a generic-based implementation requires), I've decided to provide two flavors of the solution: one using arrays exclusively, the other using generics and the Collections framework.

Non-generic version

Below is the description of how to merge an arbitrary number of sorted arrays of primitives:

  • find the total number of elements and create a resulting array based on it;
  • define an array that will maintain a current position in each of the source arrays;
  • using a nested for loop for each position in the resulting array, pick the lowest value of all currently accessible values.

The time complexity of this algorithm is O(n * m), where n is the total number of elements in all arrays and m is the number of arrays.

The implementation might look like this:

public static int[] mergeNSorted(int[]... arrays) {
    int[] result = new int[getTotalLength(arrays)];
    int[] positions = new int[arrays.length]; // current position in each source array
    
    for (int pos = 0; pos < result.length; pos++) {
        int curArr = -1; // index of the array holding the smallest current value
        for (int i = 0; i < arrays.length; i++) {
            if (positions[i] >= arrays[i].length) continue; // array i is exhausted
            // comparing against the current candidate (not a MAX_VALUE sentinel)
            // keeps the bookkeeping right even if the input contains Integer.MAX_VALUE
            if (curArr == -1 || arrays[i][positions[i]] < arrays[curArr][positions[curArr]]) {
                curArr = i;
            }
        }
        result[pos] = arrays[curArr][positions[curArr]++];
    }
    return result;
}

public static int getTotalLength(int[][] arrays) {
    long totalLen = 0;
    for (int[] arr : arrays) totalLen += arr.length;
    
    if (totalLen > Integer.MAX_VALUE) throw new IllegalArgumentException("total length exceeded Integer.MAX_VALUE");
    return (int) totalLen;
}

main() - demo

public static void main(String[] args) {
    int[][] input =
        {{1, 3}, {}, {2, 6, 7}, {10}, {4, 5, 8, 9}};

    System.out.println(Arrays.toString(mergeNSorted(input)));
}

Output

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Generic version

In this version, the input is considered to be a list containing multiple lists of generic type T, which is expected to implement the Comparable interface.

This solution enhances the array-based implementation provided above, reducing the overall time complexity to O(n * log m), where n is the total number of elements in all lists and m is the number of lists.

Instead of performing m iterations for each resulting element, it maintains a PriorityQueue, which in this case represents a min-heap (i.e. when the head element is retrieved from it, it has the lowest value of all the elements present in the queue).

Every element in the queue wraps the value of a particular element retrieved from one of the given lists, as well as the data regarding the source of this value (i.e. the index of the list and the position inside this list).

This wrapper over the element of the nested list can be represented by the class shown below.

public class ElementWrapper<V extends Comparable<V>> implements Comparable<ElementWrapper<V>> {
    private V value;
    private int listIndex;
    private int position;
    
    public ElementWrapper(V value, int listIndex, int position) {
        this.value = value;
        this.listIndex = listIndex;
        this.position = position;
    }
    
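    public V getValue() { return value; }
    public int getListIndex() { return listIndex; }
    public int getPosition() { return position; }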
    
    @Override
    public int compareTo(ElementWrapper<V> o) {
        return value.compareTo(o.getValue());
    }
}

Note that this class implements the Comparable interface based on the value of the wrapped list element.

The queue is prepopulated with the first element of each non-empty list. Then, until the queue is empty, its lowest element is removed and added to the resulting list. Also, if the list that the latest retrieved element points to has more elements, the next of them is added to the queue.

Note that, according to the documentation, both adding a new element to the priority queue with add() and removing its head element with remove() cost O(log n) time (where n is the number of elements in the queue).

The same time complexity can be achieved by utilizing a TreeSet instead (with an ordering that breaks ties, since a TreeSet discards elements that compare as equal), but in practice PriorityQueue will perform better because a heap is easier to maintain than a red-black tree.

The code might look like this:

public static <T extends Comparable<T>> List<T> mergeNSorted(List<List<T>> lists) {
    List<T> result = new ArrayList<>();
    Queue<ElementWrapper<T>> queue = getInitializedQueue(lists);
    
    while (!queue.isEmpty()) {
        ElementWrapper<T> next = queue.remove();
        result.add(next.getValue());
        
        if (next.getPosition() + 1 < lists.get(next.getListIndex()).size()) {
            queue.add(new ElementWrapper<>(lists.get(next.getListIndex()).get(next.getPosition() + 1),
                                           next.getListIndex(),
                                           next.getPosition() + 1));
        }
    }
    return result;
}

public static <T extends Comparable<T>> Queue<ElementWrapper<T>> getInitializedQueue(List<List<T>> lists) {
    Queue<ElementWrapper<T>> queue = new PriorityQueue<>();
    for (int i = 0; i < lists.size(); i++) {
        if (lists.get(i).isEmpty()) continue;
        queue.add(new ElementWrapper<>(lists.get(i).get(0), i, 0));
    }
    return queue;
}

main() - demo

public static void main(String[] args) {
    List<List<Integer>> genericInput =
        List.of(List.of(1, 3), List.of(), List.of(2, 6, 7), List.of(10), List.of(4, 5, 8, 9));
    
    System.out.println(mergeNSorted(genericInput));
}

Output

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
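
For comparison, the TreeSet alternative mentioned above might be sketched roughly like this. It is only an illustration under a few assumptions: the class name TreeSetMerge and the representation of each entry as an int[] {listIndex, position} are made up here, and the comparator falls back to the list index so that equal values from different lists are not silently dropped by the set.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class TreeSetMerge {

    // Each set entry is {listIndex, position}; this representation is an assumption of the sketch.
    static <T extends Comparable<T>> List<T> mergeNSorted(List<List<T>> lists) {
        // Order by the referenced value, then by list index so that ties are kept apart.
        Comparator<int[]> byValue =
            (x, y) -> lists.get(x[0]).get(x[1]).compareTo(lists.get(y[0]).get(y[1]));
        TreeSet<int[]> entries = new TreeSet<>(byValue.thenComparingInt(e -> e[0]));

        for (int i = 0; i < lists.size(); i++) {
            if (!lists.get(i).isEmpty()) entries.add(new int[] {i, 0});
        }

        List<T> result = new ArrayList<>();
        while (!entries.isEmpty()) {
            int[] entry = entries.pollFirst(); // entry with the smallest current value
            List<T> source = lists.get(entry[0]);
            result.add(source.get(entry[1]));
            if (entry[1] + 1 < source.size()) {
                entries.add(new int[] {entry[0], entry[1] + 1});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> input =
            List.of(List.of(1, 3), List.of(), List.of(1, 2, 6), List.of(3));
        System.out.println(mergeNSorted(input)); // prints [1, 1, 2, 3, 3, 6]
    }
}
```

At most one entry per list is in the set at any time, which is why the list index is enough as a tie-breaker.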

Solution 2:[2]

There is a much simpler way using Java 8 streams than doing this by hand:

  1. combine all arrays into one stream (I've used 2, but you can use as many as you want):

int[] arr1 = {1, 7, 10};
int[] arr2 = {1, 2, 4, 9};

Stream<int[]> ints = Stream.of(arr1, arr2);

  2. then flatMap and sort them in a stream:

IntStream intStream = ints.flatMapToInt(Arrays::stream).sorted();

When you print them, you will see all the numbers sorted:

intStream.forEach(System.out::println);

1
1
2
4
7
9
10

Combined in a function, it could look something like this:

public int[] merge(int[]... arrays) {
    return Stream.of(arrays)
                 .flatMapToInt(Arrays::stream)
                 .sorted()
                 .toArray();
}

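EDIT: The advantage of streams is that you can further modify the values as you like, e.g. by leveraging the distinct function you can easily remove duplicates. Since a stream can be consumed only once, distinct() has to be applied to a fresh stream, before the terminal forEach:

intStream = Stream.of(arr1, arr2).flatMapToInt(Arrays::stream).sorted().distinct();
intStream.forEach(System.out::println);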

1
2
4
7
9
10

Solution 3:[3]

I'm not a Java programmer so I'll just give Pythonesque pseudo-code.

First turn each non-empty array into a triplet:

(next_value, index, array)

Now put those into a priority queue sorted by next value.

while 0 < queue.size():
    (next_value, index, array) = queue.poll()
    answer.append(next_value)
    if index+1 < array.length:
        queue.add((array[index+1], index+1, array))

If you have k arrays, this will take O(log(k)) comparisons per element produced.
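
For Java readers, the pseudo-code above could be translated roughly as follows. This is one possible sketch, not the answer's own code: the class name KWayMerge, the method name merge, and the choice to store each triplet as an int[] {value, arrayIndex, position} are assumptions made for illustration, and the total length is assumed to fit in an int.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

class KWayMerge {

    // Merges already-sorted int arrays; each queue entry is {nextValue, arrayIndex, position}.
    static int[] merge(int[]... arrays) {
        int total = 0;
        for (int[] arr : arrays) total += arr.length;

        // Min-heap ordered by the value stored in slot 0 of each entry.
        PriorityQueue<int[]> queue =
            new PriorityQueue<>((x, y) -> Integer.compare(x[0], y[0]));
        for (int i = 0; i < arrays.length; i++) {
            if (arrays[i].length > 0) queue.add(new int[] {arrays[i][0], i, 0});
        }

        int[] answer = new int[total];
        int out = 0;
        while (!queue.isEmpty()) {
            int[] entry = queue.poll();
            answer[out++] = entry[0];
            int[] source = arrays[entry[1]];
            int next = entry[2] + 1;
            if (next < source.length) {
                queue.add(new int[] {source[next], entry[1], next});
            }
        }
        return answer;
    }

    public static void main(String[] args) {
        int[] merged = merge(new int[] {1, 7, 10}, new int[] {1, 2, 4, 9},
                             new int[] {}, new int[] {3});
        System.out.println(Arrays.toString(merged)); // prints [1, 1, 2, 3, 4, 7, 9, 10]
    }
}
```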

Sadly, Java does not seem to have anything corresponding to a swaptop method. In practice, if one array has a run of values, using .peek() to get the top element and then .swaptop(...), where it is available, lets you go through those runs with O(1) work per element.
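
Since java.util.PriorityQueue offers no such operation, one way to get the same effect is a small hand-written heap. The sketch below is only an illustration of the idea: the class ReplaceTopMerge and its helpers are hypothetical names, the heap stores array indices rather than triplets, and the "replace top" step is done by advancing the top array's position in place and sifting it down.

```java
import java.util.Arrays;

class ReplaceTopMerge {

    // Merges already-sorted int arrays with a hand-rolled min-heap of array indices.
    // The top entry is advanced in place and sifted down instead of poll() + add();
    // the sift stops after two comparisons while the same array keeps the smallest value.
    static int[] merge(int[]... arrays) {
        int[] pos = new int[arrays.length];  // next unread position in each array
        int[] heap = new int[arrays.length]; // array indices, ordered by current value
        int size = 0;
        int total = 0;
        for (int i = 0; i < arrays.length; i++) {
            total += arrays[i].length;
            if (arrays[i].length > 0) heap[size++] = i;
        }
        for (int i = size / 2 - 1; i >= 0; i--) siftDown(heap, size, i, arrays, pos);

        int[] answer = new int[total];
        int out = 0;
        while (size > 0) {
            int top = heap[0];
            answer[out++] = arrays[top][pos[top]++];
            if (pos[top] == arrays[top].length) {
                heap[0] = heap[--size];      // array exhausted: drop it from the heap
            }
            siftDown(heap, size, 0, arrays, pos); // the "replace top" step
        }
        return answer;
    }

    private static void siftDown(int[] heap, int size, int i, int[][] arrays, int[] pos) {
        while (true) {
            int left = 2 * i + 1;
            int right = left + 1;
            int smallest = i;
            if (left < size && current(heap[left], arrays, pos) < current(heap[smallest], arrays, pos)) {
                smallest = left;
            }
            if (right < size && current(heap[right], arrays, pos) < current(heap[smallest], arrays, pos)) {
                smallest = right;
            }
            if (smallest == i) return;
            int tmp = heap[i];
            heap[i] = heap[smallest];
            heap[smallest] = tmp;
            i = smallest;
        }
    }

    // Current (smallest unread) value of the array with the given index.
    private static int current(int idx, int[][] arrays, int[] pos) {
        return arrays[idx][pos[idx]];
    }

    public static void main(String[] args) {
        int[] merged = merge(new int[] {1, 2, 3, 4}, new int[] {10}, new int[] {5, 6});
        System.out.println(Arrays.toString(merged)); // prints [1, 2, 3, 4, 5, 6, 10]
    }
}
```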

Solution 4:[4]

This could also be a good example, using List<String> in addition to int[]:

import org.testng.annotations.Test;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TestClass {

    public static List<String> list(String... elems) {

        return new ArrayList<>(Arrays.asList(elems));
    }

    public static List<String> mergedListSorted(List<String>... listsVarArgs) {

        return Stream.of(listsVarArgs).flatMap(List::stream).sorted().collect(Collectors.toList());
    }

    @Test
    public void sortedListsTest() {

        // Sorted sub lists
        List<String> AGMS = list("A", "G", "M", "S");
        List<String> BHNT = list("B", "H", "N", "T");
        List<String> CIOU = list("C", "I", "O", "U");
        List<String> DJPV = list("D", "J", "P", "V");
        List<String> EKQW = list("E", "K", "Q", "W");
        List<String> FLRX = list("F", "L", "R", "X");

        System.out.println(mergedListSorted(AGMS, BHNT, CIOU, DJPV, EKQW, FLRX));
        System.out.println(mergedListSorted(BHNT, BHNT, CIOU, BHNT));

    }

}

The corresponding output of the two examples:

[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X]
[B, B, B, C, H, H, H, I, N, N, N, O, T, T, T, U]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3 btilly
Solution 4 DFB_Altintop