'How to find Z highest subset Sums from the array of N elements

I have an array of numbers, now I want to find all possible subset sums and get the top z elements in it.

Example:

Input : arr[] = {2, 4, 5}, z = 3
SubsetSums : 0 2 4 5 6 7 9 11
output : Top z=3 elements = 11,9,7

Here is my code:

static List<Long> subsetSums(int arr[], int n)
    {
 
        // There are total 2^n subsets
        int total = 1 << n;
        List<Long> list = new ArrayList<>(); 
        // Consider all numbers from 0 to 2^n - 1
        for (int i = 0; i < total; i++) {
            int sum = 0;
 
            // Consider binary representation of
            // current i to decide which elements
            // to pick.
            for (int j = 0; j < n; j++)
                if ((i & (1 << j)) != 0)
                    sum += arr[j];
 
            // Print sum of picked elements.
            System.out.print(sum + " ");
            list.add(sum);
        }
        list.sort((a,b) -> b.compareTo(a));
        List<Long> response = new ArrayList<>();
        for(int i=0; i<k && i<list.size(); i++) {
           response.add(list.get(i));  
        }
        return response;
    }

I have taken part of this code from GeeksforGeeks.

The time complexity of this program is O(n*2^n)

constraints:

input size is 1 to 10^5
values in input array is -10^9 to 10^9
z is in range 1 to 2000

I have used this during one of the Hackerrank exams, but out of 15 test cases, only 7 test cases passed with the remaining all timed out because input size n can be very large.

How to solve this problem with lower time complexity?



Solution 1:[1]

For that, you don't need to generate all the possible combinations (unless z is equal to n). To minimize time complexity, you need to find the maximum sum of positive elements, then determine the logarithm base 2 of z of elements having the lowest absolute values and construct z max sums based on them.

When z is negligible in comparison to n, e.g. z=3 and n=100,000, then n will be the major contributor to the time complexity which will be close to O(n). In the general case, the overall time complexity will be O(n*log2(z)) because we need to perform partial sorting in order to find log2(z) elements with the lowest modulus (all costs of this algorithm are listed below).

Because it's an algorithmic problem, I will deliberately avoid using Java-specific implementations of data structures (except for PriorityQueue) and will use arrays in this solution, so that it will be more clear for anyone who has experience with C-like languages. I guess it will be fairly easy to reimplement the code with the Java collections framework if needed. And as far as you have a massive input and performance is concerned, collection can't bit an array of primitives.

In order to obtain a list of Long from an array long[] you can use the following statement: LongStream.of(array).boxed().toList();

Implementation Steps

Let's describe the steps we need to take to address this problem:

  • Find the minimum number of source array elements required to generate z minimum sum combinations. Which will be equal to logarithm base 2 of z rounded up to the nearest whole number ( ceiling(log2(z)) ). So that 3 will give 2 (not 1), 5 will give 3. There's no built-in function for log2 in Java, in the solution below I'm using functionality of the Integer class to find the position of the right most significant bit to calculate log2 and total count of significant bits to round it (rounding required if z is not a power of 2, i.e. bit-count is greater than 1). The cost of this operation is O(1).

  • Iterate over the source array and determine the largest possible sum of positive elements (denoted as total. If all elements are negative, the greatest possible sum would be 0 (sum of a subset of size 0). And it'll cost O(n) time, nothing unexpected.

  • Simultaneously with the previous step (while iterating the source array) we should find ceiling(log2(z)) elements having the lowest absolute value. PriorityQueue class which represent a heap data structure is used for that purpose. For the purpose of selecting the lowest elements from the source array, we need a Max-heap (a heap in which all elements are less or equal to the head element). To achieve that, the reversed-order comparator is being passed as an argument into the constructor of PriorityQueue. Since each operation of enqueuing and dequeuing will be done in O(log2(z)) time (reminder, target size of the queue is log2(z)) the worse case scenario is when each iteration over the source array will result in inserting/removing of elements from the queue, which will happen if the source array is sorted in descending order based on the absolute values of elements, e.g. 8,-7,5,-4,3,2,1. The overall worse case time complexity of this step is O(n*log2(z)).

  • The next step is to generate all possible sum combinations from the values of stored in the queue (see example below). And return combinations as an unsorted array. Since the number of the sum combinations is roughly equal to z (it'll be slightly greater for z values that are not power of 2, e.g. 8 combinations for z=5). And will give a time complexity of O(z).

  • Sort the array of the lowest subset sums created in the previous step and cut off all element that are not needed (when z isn't a power of 2). This step will cost O(z log(z)) time.

  • And finally, obtain the result - z the highest sums of subsets by subtract each element of the array of the lowest subset sums from the maximum sum of the positive elements in the source array. This operation will have a time complexity of O(z).

Schemes and Examples

Let's assume that the sorted set of values from the source array can be represented as follows:

enter image description here

Subsets of elements having the highest sums would be the following:

  • All positive elements, the sum equals to total;
  • All positive elements, except of the element having a value of 1, sum equals to total-1;
  • All positive elements, with addition of the negative element having a value of -1, the sum is equal to total-1;
  • All positive elements, except of the element having a value of 2, the sum is equal to total-2;
  • All positive elements, with addition of the negative element having a value of -2, the sum equals to total-2;
  • etc.

The result of excluding a positive element from the subset or including a negative element having the same absolute value is identical. Therefore, there's no need to discriminate between them while storing into the queue, we need only their modulus.


The following example illustrates how the process of generating of all possible sum combinations from the elements of a queue containing the lowest absolute values of the source array.

enter image description here

The resulting array of the sum combinations need to be sorted in order to extract the lowest values from it, regardless of the order from which elements are being fetched from the queue, as it can be observed from the example above. I.e. since the resulting set of sums constructed from the sorted set of element (see example) appears to be unordered, it doesn't make sense to remove elements from the queue using dequeue operation which will cost O(log z) time, but instead it would be more efficient to iterate over the underlying array of the PriorityQueue and process its elements in the arbitrary order.

Implementation

The code might look like this:

public static long[] getMaxSums(int[] source, int z) {
    long total = 0;
    int limit = 31 - Integer.numberOfLeadingZeros(z) + (z == 1 || Integer.bitCount(z) > 1 ? 1 : 0);
    Queue<Integer> queue = new PriorityQueue<>(Comparator.reverseOrder()); // will contain up to `limit` lowest positive values
    for (int next : source) {
        tryAdd(queue, Math.abs(next), limit);
        if (next > 0) total += next;
    }
    return getHighestSumComb(queue, total, z);
}

private static void tryAdd(Queue<Integer> queue, int next, int limit) {
    if (queue.size() == limit && queue.element() > next) queue.remove(); // if next absolute value is less than the highest element in the queue and max size has been exceeded the highest element needs to be removed from the queue
    if (queue.size() < limit) queue.add(next);
}

private static long[] getHighestSumComb(Queue<Integer> queue, long total, int z) {
    long[] lowestSums = getLowestAbsSumComb(queue);

    long[] result;
    if (z == lowestSums.length) { // i.e. if Z is a power of 2, and we need all sum combinations
        result = lowestSums;
    } else {
        Arrays.sort(lowestSums);
        result = Arrays.copyOf(lowestSums, z);
    }
    
    for (int i = 0; i < result.length; i++) {
        result[i] = total - result[i];
    }
    return result;
}

private static long[] getLowestAbsSumComb(Queue<Integer> queue) {
    long[] sumComb = new long[(int) Math.pow(2, queue.size())];
    int pos = 1; // current position in the resulting array is initialized to 1 because the first element has to have value of 0
    for (int next: queue) {
        int prevPos = pos;
        for (int i = 0; i < prevPos; i++) {
            sumComb[pos++] = sumComb[i] + next;
        }
    }
    return sumComb;
}

main() - demo

public static void main(String[] args) {
    System.out.println(Arrays.toString(getMaxSums(new int[]{2, 4, 5}, 3)));
    System.out.println(Arrays.toString(getMaxSums(new int[]{-2, 5, -3, 7, 9}, 5)));
}

Output

[11, 9, 7]
[21, 19, 18, 16, 16]

Solution 2:[2]

If you first sort the input array, you don't have to calculate all those sums:

  • The largest sum is the sum of all values
  • The next largest sum is the (sum of all values - the smallest element)
  • The next largest sum is the (sum of all values - Min(the sum of the 2 smallest elements, the next smallest element))
  • ...

When z is a lot smaller than n^2, this will speed up your solution a lot.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2