Creating a new column in a dataframe with values from another column in the same dataframe [duplicate]

I am a scientific researcher and a beginner in Python.

I am trying to make a new column in the following dataframe:

                            x      y      z   bat      gradient
date                                                       
2022-04-15 10:17:14.721  0.125  0.016  1.032  NaN    0.0320
2022-04-15 10:17:39.721  0.125 -0.016  1.032  NaN    0.0000
2022-04-15 10:18:04.721  0.125  0.016  1.032  NaN    0.0000
2022-04-15 10:18:29.721  0.125 -0.016  1.032  NaN    0.0000
2022-04-15 10:18:54.721  0.125  0.016  1.032  NaN    0.0160
                       ...    ...    ...  ...       ...
2022-05-02 17:03:04.721 -0.750 -0.016  0.710  NaN    0.7855
2022-05-02 17:03:29.721 -0.750 -0.016  0.710  NaN    1.4420
2022-05-02 17:03:54.721  0.719 -0.302 -0.419  NaN    0.8690
2022-05-02 17:04:19.721 -0.625 -0.048 -0.871  NaN    1.1965
2022-05-02 17:04:44.721 -0.969  0.016 -0.032  NaN    1.2470

And I have certain limits/intervals (whiskers from a boxplot):

limit_start_A = 0.15
limit_end_A = 0.20

limit_start_B = 0.20
limit_end_B = 0.40

limit_start_C = 0.40
limit_end_C = 0.90

limit_start_D = 0.90
limit_end_D = 1.1

I would like to make a new column named "result" based on the values in the "gradient" column. For example, when the gradient lies in the interval from "limit_start_B" to "limit_end_B", the row should get the letter "B" in the new "result" column, and likewise for A, C and D.



Solution 1:[1]

Don't use so many variables; use a list and pandas.cut instead:

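import pandas as pd

limits = [0.15, 0.20, 0.40, 0.90, 1.1]
labels = ['A', 'B', 'C', 'D']

# each pair of consecutive limits forms one bin; values outside all bins become NaN
df['result'] = pd.cut(df['gradient'], bins=limits, labels=labels)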

output:

                             x      y      z  bat  gradient result
date                                                              
2022-04-15 10:17:14.721  0.125  0.016  1.032  NaN    0.0320    NaN
2022-04-15 10:17:39.721  0.125 -0.016  1.032  NaN    0.0000    NaN
2022-04-15 10:18:04.721  0.125  0.016  1.032  NaN    0.0000    NaN
2022-04-15 10:18:29.721  0.125 -0.016  1.032  NaN    0.0000    NaN
2022-04-15 10:18:54.721  0.125  0.016  1.032  NaN    0.0160    NaN
2022-05-02 17:03:04.721 -0.750 -0.016  0.710  NaN    0.7855      C
2022-05-02 17:03:29.721 -0.750 -0.016  0.710  NaN    1.4420    NaN
2022-05-02 17:03:54.721  0.719 -0.302 -0.419  NaN    0.8690      C
2022-05-02 17:04:19.721 -0.625 -0.048 -0.871  NaN    1.1965    NaN
2022-05-02 17:04:44.721 -0.969  0.016 -0.032  NaN    1.2470    NaN
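
If you want to try this without the full dataset, here is a minimal self-contained sketch. It assumes a dataframe holding only the "gradient" values visible in the question, plus one extra value (0.20) that is not in the original data and is added purely to show how a value sitting exactly on a limit is handled. By default pd.cut uses right-closed intervals such as (0.15, 0.20]; passing right=False makes them left-closed instead. The column name "result_left_closed" is made up for this illustration.

```python
import pandas as pd

# Gradient values copied from the rows shown in the question.
# The last value (0.20) is NOT from the question; it is a hypothetical
# boundary value added to illustrate which bin an exact limit falls into.
df = pd.DataFrame(
    {"gradient": [0.0320, 0.0000, 0.0160, 0.7855, 1.4420, 0.8690, 1.1965, 1.2470, 0.20]}
)

limits = [0.15, 0.20, 0.40, 0.90, 1.1]
labels = ["A", "B", "C", "D"]

# Default behaviour: right-closed bins, i.e. (0.15, 0.20], (0.20, 0.40], ...
df["result"] = pd.cut(df["gradient"], bins=limits, labels=labels)

# right=False: left-closed bins, i.e. [0.15, 0.20), [0.20, 0.40), ...
df["result_left_closed"] = pd.cut(
    df["gradient"], bins=limits, labels=labels, right=False
)

print(df)
```

With the default settings the boundary value 0.20 is labelled "A", and with right=False it is labelled "B". Gradients below 0.15 or above 1.1 fall outside every bin and stay NaN in both columns, which is why most rows in the output above have no letter.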

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1