'Find optimum in combination of groups out of individuals due to certain criteria

I would like to form random groups out of a number of individuals. For the individuals I have information about their location (two possibilities) and the team at the location they belong to. It is not yet clear, how many individuals will join. The groups should be formed on the following restrictions:

  • Groups should in general consist out of 3 - 4 people
  • Within a group, the share between locations (2 possibilities) should be as equal as possible, meaning in a group of 4 people 2 of each location (2:2) and in a group of 3 people 2 of one location and 1 of the other location (2:1)
  • The people within a group from the same location should belong to different teams
  • Applying the script/algorithm again should lead to a different composition of the groups, but be an equally optimal result (this criteria does not necessarily be fullfilled)

Very basic example
I have 8 individuals. 4 of them are from location x, 4 are from location y. Within the 4 individuals from location x two are from team 1 (x-1) and two are from team 2 (x-2). Same for the 4 people from location y (y-1, y-1, y-2, y-2). The ideal solution would be two groups of 4 people with the following members:

  • Group 1: [x-1, x-2, y-1, y-2]

  • Group 2: [x-1, x-2, y-1, y-2] The groupings have an equal share between locations and within the individuals from the same location the team differs.

The worst solution for this case would be the following:

  • Group 1: [x-1, x-1, x-2, x-2]
  • Group 2: [y-1, y-1, y-2, y-2]

Within a group all individuals are from the same location. Also multiple individuals from the same location belong to the same team.

I'm currently using python to solve this but am really happy to use an approach with R as well.

The number and size of groups I basically build through if .. else / case_when clauses. With a minimum of three individuals, N as floor(number of individuals/4) and R as remainder/left over of the division number of individuals / 4 the cases are:

  1. R = 0 --> N groups with 4 individuals each
  2. R = 1 AND N < 2 --> 1 group with 5 individuals
  3. R = 1 AND N >= 2 --> 3 groups with 3 individuals and N-3 groups of 4 individuals
  4. R = 2 --> 2 groups with 3 individuals and N-1 groups of 4 individuals
  5. R = 3 --> 1 group with 3 individuals and N groups of 4 individuals

First I'm calculating all possible combinations for groups out of four/three individuals using itertools.combinations(). Afterwards I rate those groups based on the stated criteria and remove groups which have insufficient ratings.

Based on how many groups of size four/three I need I create a nested list which consists of multiple bins of the possible 3-pax- and 4-pax-groups which the aim to use itertools.product() to find the possible combinations out of x groups. In the next step I remove combinations which contain the same group more than once.

Currently I'm stuck with the approach how to remove combinations with groups having an overlap within the individuals. I thought about customizing the itertools.product()-method and include restrictions, but so far I haven't been successful.

Also this approach needs a lot of ressources (runtime), which increase exponential by the number of individuals. So any hints on different, more efficient approaches are very welcome.

import pandas as pd
import math
import itertools
import collections

# Example dataset
df = pd.DataFrame(data={"ID": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                        "Name": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"],
                        "Location": ["y", "x", "y", "x", "x", "x", "y", "x", "x", "y"],
                        "Team": ['1', '1', '1', '2', '3', '2', '2', '1', '3', '2']})


# Defining a class for individuals
class Person:
    def __init__(self, personid, location, team, group):
        self.personid = personid
        self.location = location
        self.team = team
        self.group = group

# Create List of Individuals
individuals = []
for n in df.to_dict('records'):
    individuals.append(Person(personid=n['ID'], location=n['Location'], team=n['Team'], group='n/a'))


# Get permutations of four
combinations = list(itertools.combinations(individuals,4))

# store them in a dictionary
dict_combinations_four = {}
counter = 0

for n in combinations:
    counter += 1
    dict_combinations_four["4_"+str(counter)] = n  # resulting in 4_1, 4_2 ... 4_n as Key for combinations

# calculate points for combinations
dict_comb_four_points = {}

for n in dict_combinations_four.keys():
    list_location = []
    list_team = []
    points = 0
    for i in dict_combinations_four[n]:
        list_location.append(i.location)
        list_team.append(i.location + "_" + i.team)
    # 1. different locations:
    cLocation = list(collections.Counter(list_location).items())
    if len(cLocation) == 2:
        if cLocation[0][1] == cLocation[1][1]: # length has to be two, otherwise there is only one location, count has to be 2 == 2 for equal share;
            points += 3
        elif cLocation[0][1] == 3 or cLocation[1][1] == 3: # for share 3:1 or 1:3
            points += 1
    else:
        points += 0
    # 2. different teams within location
    cTeam = list(collections.Counter(list_team).items())
    if len(cTeam) == 4:# in this case there are four different location / team combinations
        points += 3
    elif len(cTeam) == 3:
        points += 1
    else:
        points += 0

    dict_comb_four_points[n] = points

# remove combinations which reach insufficient points
for i in dict_comb_four_points.keys():
    if dict_comb_four_points[i] <=3:
        dict_combinations_four.pop(i)


# get permutations of three
combinations = list(itertools.combinations(individuals,3))

# store them in a dictionary
dict_combinations_three = {}
counter = 0

for n in combinations:
    counter += 1
    dict_combinations_three["3_" + str(counter)] = n  # resulting in 3_1, 3_2 ... 3_n as Key for combinations


# function to get unique values: https://www.geeksforgeeks.org/python-get-unique-values-list/
def unique(list1):
    # insert the list to the set
    list_set = set(list1)
    # convert the set to the list
    unique_list = (list(list_set))
    return(unique_list)

# calculate points for combinations

dict_comb_three_points = {}

for n in dict_combinations_three.keys():
    list_location = []
    list_team = []
    points = 0
    for i in dict_combinations_three[n]:
        list_location.append(i.location)
        list_team.append(i.location + "_" + i.team)
    # 1. different locations:
    if len(unique(list_location)) >=2:
        points += 3
    # 2. different teams within location
    if len(unique(list_team)) == 4:
        points += 3
    elif len(unique(list_team)) == 3:
        points += 1

    dict_comb_three_points[n] = points

# remove combinations which reach insufficient points
for i in dict_comb_three_points.keys():
    if dict_comb_three_points[i] <= 3:
        dict_combinations_three.pop(i)


# How many groups with what sizes are needed? Define Class for groupings:
class Groupings:
    def __init__(self, participants):
        self.participants = participants
        self.num_pax = len(self.participants)
        num_helper = self.num_pax/4
        fl_num_helper = math.floor(num_helper)
        rest = (num_helper - math.floor(num_helper)) * 4
        self.group_four_pax = fl_num_helper if rest == 0 else fl_num_helper - 2 if rest == 1 and fl_num_helper >= 2 else 0 if rest == 1 and fl_num_helper < 2 else fl_num_helper - 1 if rest == 2 else fl_num_helper
        self.group_three_pax = 0 if rest == 0 else 3 if rest == 1 and fl_num_helper >= 2 else 0 if rest == 0 and fl_num_helper < 2 else 2 if rest == 2 else 1
        self.group_five_pax = 1 if rest == 1 and fl_num_helper < 2 else 0
        self.groupsizes = {"four_pax": self.group_four_pax, "three_pax": self.group_three_pax, "five_pax": self.group_five_pax}

# Assign participants to instance of Class, numbers are contained within instance
mysterygroups = Groupings(individuals)


# Create list with repeated group_lists
list_grouping_bins = []
for i in range(mysterygroups.group_four_pax):
    list_grouping_bins.append(list(dict_combinations_four.keys()))
for i in range(mysterygroups.group_three_pax):
    list_grouping_bins.append(list(dict_combinations_three.keys()))


# Find possible combinations of groups:
comb_groups = itertools.product(*list_grouping_bins)

# Remove combinations which contain the same group more than once
comp_groups_clean = []
comp_groups_unclean = []
for i in comb_groups:
    cI = list(collections.Counter(i).items())
    if len(i) == len(cI):
        comp_groups_clean.append(i)
    else:
        comp_groups_unclean.append(i)


Solution 1:[1]

Here is a heuristic that will allow you to randomly create lots of solutions, assuming that there aren't any potential problems. Potential problems that may pose a challenge include too many locations with 1 person, one location with too many people, or a large location completely dominated by one team.

The key idea is a maximum weight matching, which can be found with the Blossom algorithm. That is, if you take a graph, you can assign each edge a weight of 1 and it will find a matching of as many pairs as possible. If you instead assign each edge a weight of 1 plus a small random factor, which of the possible maximum matchings you'll find is dependent on that small random factor.

So first construct a graph where 2 people are connected if and only if they are at the same location and different teams. A maximum matching now gives you lots of pairs. Of course matching as many pairs as possible will lead to a lot of teams of size 4, and few of size 3. If you want to adjust that, you can add to your graph a number of blank slots, connected to lots of people. And now some people will pair up with a blank slot, giving you more singletons.

After this pairing phase, you have groups of 2 and 1. If you have an odd number of groups, randomly split one pair to get an even one.

And now you construct a new graph. 2 groups are connected if they are at different locations and at least one is a pair. Now find a maximal matching. If all goes well, everyone will now be in groups of 3-4, which are either 2-1 or 2-2, where each 2 consists of a pair of people of different teams at the same location.

Every time you run it, you'll get a solution. If you assign random weights, they will always be different. If you assign a penalty for any pairing that puts together people who were together in a previous run, you'll get lots of groupings AND they will work to be different from the previous runs.

But the heart of it all is the Blossom algorithm.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 btilly