Dynamic bin per row in a dataset in Pandas

I am having trouble dynamically binning my dataset for further calculation. My goal is to have specific bins/labels for each individual row in my dataframe, generated by a function, and to have the corresponding label assigned to the column 'action'.

My dataset is:

id  value1  value2  type  length  amount
1   0.9     1.0     X     10      ['A', 'B']
2   2.0     1.6     Y     80      ['A']
3   0.3     0.5     X     29      ['A', 'C']

The function is as follows:

    import numpy as np
    import pandas as pd

    def bin_label_generator(amount):
        # treat rows with fewer than 2 items as if they had 2
        if amount < 2:
            amount = 2
        lower_bound = 1.0 - (1.0 / amount)
        mid_bound = 1.0
        upper_bound = 1.0 + (1.0 / amount)
        thresholds = {
            'bins': [-np.inf, lower_bound, mid_bound, upper_bound, np.inf],
            'labels': [0, 1.0, 2.0, 3.0]
        }
        return thresholds
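To make the effect of `amount` on the bounds concrete, here is a minimal sketch in plain Python. The helpers `bounds_for` and `label_for` are hypothetical stand-ins (not part of the original code) that mirror `bin_label_generator` using `math.inf`, and the lookup assumes right-closed intervals, which is the default convention of `pd.cut`, and non-negative finite values.

```python
import bisect
import math

LABELS = [0, 1.0, 2.0, 3.0]

def bounds_for(amount):
    # Hypothetical stand-in for the 'bins' list built by bin_label_generator
    amount = max(amount, 2)
    return [-math.inf, 1.0 - 1.0 / amount, 1.0, 1.0 + 1.0 / amount, math.inf]

def label_for(value, amount):
    # Intervals are treated as (low, high], like pd.cut with right=True
    edges = bounds_for(amount)
    return LABELS[bisect.bisect_left(edges, value) - 1]

print(bounds_for(2))      # [-inf, 0.5, 1.0, 1.5, inf]
print(bounds_for(4))      # [-inf, 0.75, 1.0, 1.25, inf]
print(label_for(1.0, 2))  # 1.0
print(label_for(1.2, 4))  # 2.0
```

A larger `amount` narrows the band around 1.0, so the same value can receive a different label depending on the row.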

This is my current code, but it requires me to specify a single row whose bins are then used for the whole cut. I want each row to be cut automatically with the bins/labels dictionary generated for that row.

    # filter on type
    filter_type_series = df['type'].str.contains('X')

    # get the number of items in each amount list
    amount_series = df[filter_type_series]['amount'].str.len()

    # generate bins for each row in the series
    bins_series = amount_series.apply(bin_label_generator)

    # get the max absolute values to use for binning
    max_values = df[filter_type_series].loc[:, ['value1', 'value2']].abs().max(axis=1)

    # the following line requires a row index, which I do not want
    df['action'] = pd.cut(max_values, bins=bins_series[0]['bins'], labels=bins_series[0]['labels'])


Solution 1:[1]

I found a fix myself: iterate over every row of the filtered dataframe, cut that row's value with its own bins/labels, and write the resulting label into the 'action' column of the actual df.

    item_type = 'X'

    first_df = df.copy()
    type_series = first_df['type'].str.contains(item_type)

    # loop over every row to dynamically use pd.cut with bins/labels from that specific row
    for index, row in first_df[type_series].iterrows():
        # get the max absolute value from the row
        max_val = row[['value1', 'value2']].abs().max()

        # get amount of cables (length of the amount list)
        amount = len(row['amount'])

        # get bins and labels for this specific row
        bins_label_dict = bin_label_generator(amount)
        bins = bins_label_dict['bins']
        labels = bins_label_dict['labels']

        # pd.cut returns a one-element Categorical; take its label and assign it to the row
        first_df.loc[index, 'action'] = pd.cut([max_val], bins=bins, labels=labels)[0]
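If the loop becomes slow on a large dataframe, one possible alternative is to skip `pd.cut` and compute the label arithmetically. The sketch below is not the method from the answer above. It assumes the bins always have the shape produced by `bin_label_generator` (three finite thresholds, labels 0 to 3) and that intervals are right-closed as in the default `pd.cut`, so the label equals the number of thresholds strictly below the value. The sample dataframe is rebuilt from the question's table.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    'id': [1, 2, 3],
    'value1': [0.9, 2.0, 0.3],
    'value2': [1.0, 1.6, 0.5],
    'type': ['X', 'Y', 'X'],
    'length': [10, 80, 29],
    'amount': [['A', 'B'], ['A'], ['A', 'C']],
})

mask = df['type'].str.contains('X')

# Per-row inputs: list length (floored at 2, as in bin_label_generator)
# and the largest absolute value of value1/value2
amount = df.loc[mask, 'amount'].str.len().clip(lower=2)
max_val = df.loc[mask, ['value1', 'value2']].abs().max(axis=1)

# Per-row thresholds, same formulas as in bin_label_generator
lower = 1.0 - 1.0 / amount
upper = 1.0 + 1.0 / amount

# Label = number of thresholds strictly below the value (intervals are (low, high])
action = (max_val > lower).astype(float) + (max_val > 1.0) + (max_val > upper)

# Rows that do not match the type filter stay NaN
df.loc[mask, 'action'] = action
print(df[['id', 'type', 'action']])
```

This only works because the labels happen to be 0, 1, 2, 3 in threshold order; with arbitrary labels or a varying number of bins, the row-wise loop remains the more general option.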

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mmhverheijden