'Python Knapsack problem - using the average value of the selected items as a constraint?

I'm still relatively new to python so I'm having trouble figuring out how to accomplish a certain feat.

What I'm trying to do: I have an Excel file with two columns - Sell and margin. There could be a variable number of rows. I need to find a "best fit" where the sum of n# of Sell rows is within 5% of a target sell amount AND where the avg margin is also within 5% of a target margin % (if there is a solution, if not increase the % tolerances and run again. But I can figure that out later).

From my googling, I learned this a multi-constraint knapsack problem and I was able to find some examples to work off of. The code I borrowed is from here: https://towardsdatascience.com/solving-the-multiple-knapsack-problem-with-google-or-tools-e961dfc8288e

My problem: I can't figure out how to set a constraint where the average margin of the items placed in the bag are near the target average margin.

Here's my modified code from the above site where I'm using random data for sell and margin:

from ortools.linear_solver import pywraplp

solver = solver = pywraplp.Solver.CreateSolver('SCIP')

data = {}
sell = [8520,9600,5340,8379,846,1098,1510,4954,1039,620,795,3260,75,200,75]
margin = [25, 25, 25, 34, 25, 25, 25, 25, 25, 25, 27, 20, 100, 27, 100]

assert len(sell) == len(margin)
data['values'] = values
data['sell'] = sell
data['margin'] = margin
data['items'] = list(range(len(sell)))
data['num_items'] = len(sell)
number_bags = 1 #All have the same capacity of 50 pounds
data['target_amount'] = [9262]
data['avg_margin'] = [27] 
data['bags'] = list(range(number_bags))
assert len(data['target_amount']) == number_bags
assert len(data['target_amount']) == len(data['avg_margin'])
print('sell:',*data['sell'])
print('margin:',*data['margin'])
print("Number of Items:", data['num_items'])
print("Number of Knapsacks:" , number_bags)

x = {}
for i in data['items']:
    for j in data['bags']:
        x[(i,j)] = solver.IntVar(0,1,'x_%i_%i' % (i, j))

#Constraint for an item being placed in 1 knapsack
for i in data['items']:
    solver.Add(sum(x[i,j] for j in data['bags'])<=1)
    
#Knapsack Capacity Constraint
for j in data['bags']:
    solver.Add(sum(x[(i,j)]*data['sell'][i] 
                  for i in data['items']) <= data['target_amount'][j])
#margin Constraint
#for j in data['bags']:
#    solver.Add(sum(x[(i,j)]*data['margin'][i]
#                  for i in data['items']) <= data['target_amount'][j])

#objective function
objective = solver.Objective()
for i in data['items']:
    for j in data['bags']:
        objective.SetCoefficient(x[(i,j)], data['sell'][i])
objective.SetMaximization()

solv = solver.Solve()
if solv == pywraplp.Solver.OPTIMAL:
    print('Total Packed Value:', objective.Value())
    total_Sell = 0
    for j in data['bags']:
        bag_Sell = 0
        avg_margin= 0
        count = 0
        print('\n','Bag', j+1 , '\n')
        for i in data['items']:
            if x[i,j].solution_value()>0:
                print('Line:', i , 
                      'Sell', data['sell'][i], 
                      'margin',data['margin'][i],
                     )
                bag_Sell += data['sell'][i]
                avg_margin += data['margin'][i]
                count += 1
        print('Packed Knapsack Sell: ', bag_Sell)
        print('Packed Knapsack margin: ', round(avg_margin / count, 2))
else:
    print("There is no optimal solution")


Is this even possible? Am I just way out of my league?

Thanks


Solution 1:[1]

Adding an average Margin constraint is not very difficult, once you write things down.

Constraint to keep the Average margin within two bounds

To your model you would add the following:

min_acceptable_margin = data['target_margin'][0] * 0.95
max_acceptable_margin = data['target_margin'][0] * 1.05

#Average Margin Constraints
for j in data['bags']:
   solver.Add(sum(x[(i,j)]*data['margin'][i] - x[(i,j)] * min_acceptable_margin
                 for i in data['items']) >= 0 , name=f"GT_Target_Avg")
   solver.Add(sum(x[(i,j)]*data['margin'][i] - x[(i,j)] * max_acceptable_margin
                 for i in data['items']) <= 0 , name=f"LT_Target_Avg")

As you can see, we write out the terms to keep the constraint Linear in our LP.

Since you say you are new to Python, I am providing the full code:

Full Working Code (OR-Tools LP)

from ortools.linear_solver import pywraplp
solver = solver = pywraplp.Solver.CreateSolver('SCIP')

sell = [8520,9600,5340,8379,846,1098,1510,4954,1039,620,795,3260,75,200,75]
margin = [25, 25, 25, 34, 25, 25, 25, 25, 25, 25, 27, 20, 100, 27, 100]
target_margin = [40]
bag_capacities = [9262]


def prepare_data():
  data = {}
  assert len(sell) == len(margin)
  data['sell'] = sell
  data['margin'] = margin
  data['items'] = list(range(len(sell)))
  data['num_items'] = len(sell)
  number_bags = 1 
  data['target_amount'] = bag_capacities
  data['target_margin'] = target_margin
  data['bags'] = list(range(number_bags))
  assert len(data['target_amount']) == number_bags
  assert len(data['target_amount']) == len(data['target_margin'])
  print('sell:',*data['sell'])
  print('margin:',*data['margin'])
  print("Number of Items:", data['num_items'])
  print("Number of Knapsacks:" , number_bags)

  return data


data = prepare_data()
print(data)


#Start Formulating the problem
x = {}
for i in data['items']:
    for j in data['bags']:
        x[(i,j)] = solver.IntVar(0,1,'x_%i_%i' % (i, j))

#Constraint for an item being placed in 1 knapsack
for i in data['items']:
    solver.Add(sum(x[i,j] for j in data['bags'])<=1, name="Max_One_Item"+str(i))

#Knapsack Capacity Constraint
for j in data['bags']:
    solver.Add(sum(x[(i,j)]*data['sell'][i] 
                  for i in data['items']) <= data['target_amount'][j], name=f"BagCapacity_{i}")

#Average Margin Constraints
min_acceptable_margin = data['target_margin'][0] * 0.95
max_acceptable_margin = data['target_margin'][0] * 1.05

for j in data['bags']:
   solver.Add(sum(x[(i,j)]*data['margin'][i] - x[(i,j)] * min_acceptable_margin
                 for i in data['items']) >= 0 , name=f"GT_Target_Avg")
   solver.Add(sum(x[(i,j)]*data['margin'][i] - x[(i,j)] * max_acceptable_margin
                 for i in data['items']) <= 0 , name=f"LT_Target_Avg")

#objective function
objective = solver.Objective()
for i in data['items']:
    for j in data['bags']:
        objective.SetCoefficient(x[(i,j)], data['sell'][i])
objective.SetMaximization()

#Call the solver
solv = solver.Solve()

#Print the results
if solv == pywraplp.Solver.OPTIMAL:
    print('Total Packed Value:', objective.Value())
    total_Sell = 0
    for j in data['bags']:
        bag_Sell = 0
        avg_margin= 0
        count = 0
        print('\n','Bag', j+1 , '\n')
        for i in data['items']:
            if x[i,j].solution_value()>0:
                print('Selected:', i , 
                      'Sell', data['sell'][i], 
                      'margin',data['margin'][i],
                     )
                bag_Sell += data['sell'][i]
                avg_margin += data['margin'][i]
                count += 1
        print('Packed Knapsack Sell: ', bag_Sell)
        print('Avg Packed Knapsack margin: ', round(avg_margin / count, 2))
        print(f'Target Margin {data["target_margin"][0]} (Min {min_acceptable_margin} Max {max_acceptable_margin})')
else:
    print("There is no optimal solution")

When I ran it with the numbers above, I got the following result:

Total Packed Value: 9109.0

 Bag 1 

Selected: 7 Sell 4954 margin 25

Selected: 9 Sell 620 margin 25
Selected: 11 Sell 3260 margin 20
Selected: 12 Sell 75 margin 100
Selected: 13 Sell 200 margin 27
Packed Knapsack Sell:  9109
Avg Packed Knapsack margin:  39.4
Target Margin 40 (Min 38.0 Max 42.0)

Useful Commands when debugging

A couple of handy commands to know which will help you debug your LP:

print('Number of variables =', solver.NumVariables())

print('Number of constraints =', solver.NumConstraints())

Finally, one last very useful trick. It is good to write out the formulation that the solver is seeing.

#Use this to print out the ORTOOLS Formulation
print(solver.ExportModelAsLpFormat(False).replace('\\', '').replace(',_', ','), sep='\n')

Hope that helps you move forward.

Solution 2:[2]

Instead of using code it is usually better to use math (and a piece of paper). So we have:

 sum(i,x[i]*m[i])
 ---------------- ? Target
   sum(i,x[i])

Step 1 would be to get rid of the division:

 sum(i,x[i]*m[i]) ? Target * sum(i,x[i])

Or

 |sum(i,x[i]*m[i]) - Target * sum(i,x[i])| <= MaxDeviation

This can be linearized as:

 -MaxDeviation <= sum(i,x[i]*m[i]) - Target * sum(i,x[i]) <= MaxDeviation

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Ram Narasimhan
Solution 2 Erwin Kalvelagen