'Find all subsets of length k in an array

Given a set {1,2,3,4,5...n} of n elements, we need to find all subsets of length k .

For example, if n = 4 and k = 2, the output would be {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}.

I am not even able to figure out how to start. We don't have to use the inbuilt library functions like next_permutation etc.

Need the algorithm and implementation in either C/C++ or Java.



Solution 1:[1]

Recursion is your friend for this task.

For each element - "guess" if it is in the current subset, and recursively invoke with the guess and a smaller superset you can select from. Doing so for both the "yes" and "no" guesses - will result in all possible subsets.
Restraining yourself to a certain length can be easily done in a stop clause.

Java code:

private static void getSubsets(List<Integer> superSet, int k, int idx, Set<Integer> current,List<Set<Integer>> solution) {
    //successful stop clause
    if (current.size() == k) {
        solution.add(new HashSet<>(current));
        return;
    }
    //unseccessful stop clause
    if (idx == superSet.size()) return;
    Integer x = superSet.get(idx);
    current.add(x);
    //"guess" x is in the subset
    getSubsets(superSet, k, idx+1, current, solution);
    current.remove(x);
    //"guess" x is not in the subset
    getSubsets(superSet, k, idx+1, current, solution);
}

public static List<Set<Integer>> getSubsets(List<Integer> superSet, int k) {
    List<Set<Integer>> res = new ArrayList<>();
    getSubsets(superSet, k, 0, new HashSet<Integer>(), res);
    return res;
}

Invoking with:

List<Integer> superSet = new ArrayList<>();
superSet.add(1);
superSet.add(2);
superSet.add(3);
superSet.add(4);
System.out.println(getSubsets(superSet,2));

Will yield:

[[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]

Solution 2:[2]

Use a bit vector representation of the set, and use an algorithm similar to what std::next_permutation does on 0000.1111 (n-k zeroes, k ones). Each permutation corresponds to a subset of size k.

Solution 3:[3]

This is python. Sorry for the spanish ;)

from pprint import pprint
conjunto = [1,2,3,4, 5,6,7,8,9,10]
k = 3
lista = []
iteraciones = [0]
def subconjuntos(l, k):
    if k == len(l):
        if not l in lista:
            lista.append(l)
        return
    for i in l:
        aux = l[:]
        aux.remove(i)
        result = subconjuntos(aux, k)
        iteraciones[0] += 1
        if not result in lista and result:
            lista.append( result)

subconjuntos(conjunto, k)
print (lista)
print ('cant iteraciones: ' + str(iteraciones[0]))

Solution 4:[4]

Check out my solution

import java.util.ArrayList;
import java.util.HashSet;
import java.util.Set;


 public class Subset_K {
public static void main(String[]args)
{
    Set<String> x;
    int n=4;
    int k=2;
    int arr[]={1,2,3,4};
    StringBuilder sb=new StringBuilder();
    for(int i=1;i<=(n-k);i++)
        sb.append("0");
    for(int i=1;i<=k;i++)
        sb.append("1");
    String bin=sb.toString();
    x=generatePerm(bin);
    Set<ArrayList <Integer>> outer=new HashSet<ArrayList <Integer>>();
    for(String s:x){
        int dec=Integer.parseInt(s,2);
        ArrayList<Integer> inner=new ArrayList<Integer>();
        for(int j=0;j<n;j++){
            if((dec&(1<<j))>0)
                inner.add(arr[j]);
        }
        outer.add(inner);
    }
    for(ArrayList<?> z:outer){
        System.out.println(z);
    }
}

    public static Set<String> generatePerm(String input)
{
    Set<String> set = new HashSet<String>();
    if (input == "")
        return set;

    Character a = input.charAt(0);

    if (input.length() > 1)
    {
        input = input.substring(1);

        Set<String> permSet = generatePerm(input);

        for (String x : permSet)
        {
            for (int i = 0; i <= x.length(); i++)
            {
                set.add(x.substring(0, i) + a + x.substring(i));
            }
        }
    }
    else
    {
        set.add(a + "");
    }
    return set;
}
}

I am working on a 4 element set for test purpose and using k=2. What I try to do is initially generate a binary string where k bits are set and n-k bits are not set. Now using this string I find all the possible permutations of this string. And then using these permutations I output the respective element in the set. Would be great if someone could tell me about the complexity of this problem.

Solution 5:[5]

   #include<iostream>
   #include<cstdio>
   #include<vector>
   using namespace std;
   vector<int> v;
   vector<vector<int> > result;

  void subset(int arr[],int k,int n,int idx){
  if(idx==n)
 return;

if(k==1){
    for(int i=idx;i<n;i++)
     {
        v.push_back(arr[i]);
        result.push_back(v);
        v.pop_back();
     }
}

 for(int j=idx;j<n;j++) {
  v.push_back(arr[j]);
  subset(arr,k-1,n,j+1);
  v.pop_back();
  }
 }

int main(){
int arr[] = {1,2,3,4,5,6,7};
int k = 4;
int n =sizeof(arr)/sizeof(arr[0]);
subset(arr,k,n,0);

for(int i = 0;i<result.size();i++)
 { 
  for(int j = 0;j<result[i].size();j++)
   {
     cout << result[i][j] << " ";
   }
   cout << endl;
 }
}

Solution 6:[6]

Another intresting solution.

#include<bits/stdc++.h>
using namespace std;
long factorial(int n) { return (n==1|| n==0|| n < 0) ? 1 : n *factorial(n-1) ;}
void printS(int set[],int n,int k) 
{ 

   long noofsubset =  factorial(n) / (factorial(n-k)*factorial(k));
   bitset<32> z ((1 << (k)) - 1);
   string s = z.to_string();
    int i = 0;
        while(i<noofsubset)
        { 
                  for (int j = 0; j  < n;j++)
                  {
                      if(s[(32-n)+j] == '1')
                        cout << set[j]<<" ";
                  }
                    cout << endl;
                next_permutation(s.begin(),s.end());
                i++;
        } 
}

void printSubsetsOfArray(int input[], int size) {
    int k  = 3;
  printS(input,size,k)  ;
}

Solution 7:[7]

Slight improvement for @amit top voted answer:

His code keep checking combinations even when there won't be any chance for them to reach the wanted length. We can stop creating combinations much earlier:

e.g. for [1,2,3,4,5,6,7,8,9,10] , length = 8 , the code will still try all combinations of length 7,6,5,4,3,2,1 although they will obviously just be thrown away, halting only when idx reaches the end of the list.

We can improve the running time by stopping earlier, when we already know the set we build + the optional remaining digits will still be too short.

change :

//unsuccessful stop clause
if (idx == superSet.size()) return;

into:

// unsuccessful stop clause
Integer maxFutureElements = superSet.size() - idx;
if (current.size() + maxFutureElements < length) return;

Solution 8:[8]

Here's a short python algorithm. I haven't used any predefined functions as such so I believe it could be easily translated to Java/C

def subs(l,n):

if(len(l)<k):

return []

elif(k==0):

return [[]]

else:

lis=[[l[0]]+b for b in (subs(l[1:],k-1))]

return (lis+subs(l[1:],k))

Here l is the list [1,2,...,m]

Solution 9:[9]

Please check my solution:-

private static void printPermutations(List<Integer> list, int subSetSize) {
    List<Integer> prefixList = new ArrayList<Integer>();
    printPermutations(prefixList, list, subSetSize);
}

private static void printPermutations(List<Integer> prefixList, List<Integer> list, int subSetSize) {
    if (prefixList.size() == subSetSize) {
        System.out.println(prefixList);
    } else {
        for (int i = 0; i < list.size(); i++) {
            Integer removed = list.remove(i);
            prefixList.add(removed);
            printPermutations(prefixList, list, subSetSize);
            prefixList.remove(removed);
            list.add(i, removed);
        }
    }
}

This is similar to String permutations:-

private static void printPermutations(String str) {
    printAllPermutations("", str);
}

private static void printAllPermutations(String prefix, String restOfTheString) {
    int len = restOfTheString.length();
    System.out.println(prefix);
    for (int i = 0; i < len; i++) {
        printAllPermutations(prefix + restOfTheString.charAt(i), restOfTheString.substring(0, i) + restOfTheString.substring(i + 1, len));
    }
}

Solution 10:[10]

This is an implemation in F#:

// allSubsets: int -> int -> Set<Set<int>>
let rec allSubsets n k =
    match n, k with
    | _, 0 -> Set.empty.Add(Set.empty)
    | 0, _ -> Set.empty
    | n, k -> Set.union (Set.map (fun s -> Set.add n s) (allSubsets (n-1) (k-1)))
                        (allSubsets (n-1) k)

You can try it in the F# REPL:

> allSubsets 3 2;;

val it : Set<Set<int>> = set [set [1; 2]; set [1; 3]; set [2; 3]]

> allSubsets 4 2;;

val it : Set<Set<int>> = set [set [1; 2]; set [1; 3]; set [1; 4]; set [2; 3]; set [2; 4]; set [3; 4]]

This Java class implements the same algorithm:

import java.util.HashSet;
import java.util.Set;

public class AllSubsets {

    public static Set<Set<Integer>> allSubsets(int setSize, int subsetSize) {
        if (subsetSize == 0) {
            HashSet<Set<Integer>> result = new HashSet<>();
            result.add(new HashSet<>());
            return result;
        }
        if (setSize == 0) {
            return new HashSet<>();
        }
        Set<Set<Integer>> sets1 = allSubsets((setSize - 1), (subsetSize - 1));
        for (Set<Integer> set : sets1) {
            set.add(setSize);
        }
        Set<Set<Integer>> sets2 = allSubsets((setSize - 1), subsetSize);
        sets1.addAll(sets2);
        return sets1;
    }
}

If you do not like F# or Java then visit this website. It lists solutions to your particular problem in various programming languages:

http://rosettacode.org/wiki/Combinations

Solution 11:[11]

JavaScript implementation:

var subsetArray = (function() {
  return {
    getResult: getResult
  }

  function getResult(array, n) {

    function isBigEnough(value) {
      return value.length === n;
    }

    var ps = [
      []
    ];
    for (var i = 0; i < array.length; i++) {
      for (var j = 0, len = ps.length; j < len; j++) {
        ps.push(ps[j].concat(array[i]));
      }
    }
    return ps.filter(isBigEnough);
  }
})();



 var arr = [1, 2, 3, 4,5,6,7,8,9];
 console.log(subsetArray.getResult(arr,2));

Solution 12:[12]

Here is an iterative version in python. Essence of it is increment_counters() function which returns all possible combinations. We know it needs to be called C(n,r) times.

def nchooser(n,r):
    """Calculate the n choose r manual way"""
    import math
    f = math.factorial
    return f(n) / f(n-r) / f(r)

def increment_counters(rc,r,n):
    """This is the essense of the algorithm. It generates all possible indexes.
    Ex: for n = 4, r = 2, rc will have values (0,1),(0,2),(0,3),(1,2),(1,3),(2,3).
    You may have better understanding if you print all possible 35 values for
    n = 7, r = 3."""

    rc[r-1] += 1     # first increment the least significant counter
    if rc[r-1] < n:  # if it does not overflow, return
        return

    # overflow at the last counter may cause some of previous counters to overflow
    # find where it stops (ex: in n=7,r=3 case, 1,2,3 will follow 0,5,6)
    for i in range(r-2,-1,-1): # from r-2 to 0 inclusive
        if rc[i] < i+n-r:
            break
    # we found that rc[i] will not overflow. So, increment it and reset the
    # counters right to it. 
    rc[i] += 1
    for j in range(i+1,r):
        rc[j] = rc[j-1] + 1

def combinations(lst, r):
    """Return all different sub-lists of size r"""
    n = len(lst)
    rc = [ i for i in range(r) ]  # initialize counters
    res = []
    for i in range(nchooser(n,r)): # increment the counters max possible times 
        res.append(tuple(map(lambda k: lst[k],rc)))
        increment_counters(rc,r,n)

    return res

Solution 13:[13]

Here is a Java version of what I think Simple is talking about, using a binary representation of all sets in the power set. It's similar to how Abhiroop Sarkar did it, but I think a boolean array makes more sense than a string when you are just representing binary values.

private ArrayList<ArrayList<Object>> getSubsets(int m, Object[] objects){
    // m = size of subset, objects = superset of objects
    ArrayList<ArrayList<Object>> subsets = new ArrayList<>();
    ArrayList<Integer> pot = new ArrayList<>();
    int n = objects.length;
    int p = 1;
    if(m==0)
        return subsets;
    for(int i=0; i<=n; i++){
        pot.add(p);
        p*=2;
    }
    for(int i=1; i<p; i++){
        boolean[] binArray = new boolean[n];
        Arrays.fill(binArray, false);
        int y = i;
        int sum = 0;
        for(int j = n-1; j>=0; j--){
            int currentPot = pot.get(j);
            if(y >= currentPot){
                binArray[j] = true;
                y -= currentPot;
                sum++;
            }
            if(y<=0)
                break;
        }
        if(sum==m){
            ArrayList<Object> subsubset = new ArrayList<>();
            for(int j=0; j < n; j++){
                if(binArray[j]){
                    subsubset.add(objects[j]);
                }
            }
            subsets.add(subsubset);
        }
    }

    return subsets;
}

Solution 14:[14]

If you are looking for Iterator pattern answer then here you go.

public static <T> Iterable<List<T>> getList(final Iterable<? extends T> list) {

    List<List<T>> listOfList = new ArrayList<>();

    for (T t: list)
        listOfList.add(Collections.singletonList(t));

    return listOfList;
}
public static <T> Iterable<List<T>> getIterable(final Iterable<? extends T> list, final int size) {

    final List<T> vals = new ArrayList<>();
    int numElements = 0;
    for (T t : list) {
        vals.add(t);
        numElements++;
    }

    if (size == 1) {
        return getList(vals);
    }
    if (size == numElements) {
        return Collections.singletonList(vals);
    }

    return new Iterable<List<T>>() {

        @Override
        public Iterator<List<T>> iterator() {
            return new Iterator<List<T>>() {

                int currPos = 0;                    
                Iterator<List<T>> nextIterator = getIterable(
                    vals.subList(this.currPos + 1, vals.size()), size - 1).iterator();

                @Override
                public boolean hasNext() {
                    if ((this.currPos < vals.size()-2) && (this.currPos+size < vals.size()))
                        return true;
                    return false;
                }

                @Override
                public List<T> next() {
                    if (!nextIterator.hasNext()) {
                        this.currPos++;
                        nextIterator = getIterable(vals.subList(this.currPos+1, vals.size()), size-1).iterator();
                    }
                    final List<T> ret = new ArrayList<>(nextIterator.next());
                    ret.add(0, vals.get(this.currPos));
                    return ret;
                }
            };
        }
    };
}

Solution 15:[15]

Here is a simple algorithm to enumerate all k-subsets of [n]={0,...,n-1} in lexicographic order. That is, the first of these subsets is S0=(0,1,2...,k-1), and the last is Slast=(n-k, n-k+1,...,n-1). For any k-subset S and for any 0 < j < k, we have S[j-1] < S[j] <= n+j-k.

For example, if n=10 and k=4, S0=(0,1,2,3) and Slast=(6,7,8,9). Notice, for example, that no combination can have S[1]>7 (in which case we'd have S[j]>n+j-k), since then there would be not enough values left to fill thr remaining positions j=2..3.

The idea of the algorithm is to start with the first combination S0, and then call next() repeatedly to generate the next k-subset each time. The function next() traverses the current k-subset backwards, starting from the last position j=k-1 down to 0, until it finds an entry S[j] that has not yet reached its maximum allowed value n+j-k and can thus be increased. Then it increases this position by one and fills the remaining positions, j+1..k-1 with consecutive values from S[j]+1. The algorithm stops as soon as no position can be further increased.

For example, suppose we have S=(3,7,8,9). Starting from j=3, we see that S[3],S[2],S[1] have reached their maximum values. Thus, the rightmost position that can still be increased is S[0]. This value is updated to S[0]+1=4, and the following positions are updated to 5,6,7. Hence the next k-subset will be S=(4,5,6,7).

#include <stdlib.h>
#include <stdbool.h>
#include <stdio.h>


bool next(int *S, int k, int n) {
    int j = k-1;
    while (j >= 0 && S[j] == n + j - k) 
        j--;
    if (j < 0) return false;
    S[j] += 1;
    for (int i = j+1; i < k ; i++) 
        S[i] = S[i-1] + 1;
    return true;
}

int main(int argc, char *argv[])
{
    int n = 10;
    int k = 4;
    int *S = (int *)calloc(k, sizeof(int));
    for (int j = 0; j < k; S[++j] = j); //first k-subset
    int no = 0;
    do {
        printf("subset #%d: ",no++);
        for (int j=0; j < k; j++) {
            printf("%d ", S[j]);
        }
        printf("\n");
    } while(next(S, k, n));
    return 0;
}