Drop duplicates, ignoring specific columns, and keep the lowest value

I have this example dataset

CPU_Sub_Series  RAM     Screen_Size   Resolution   Price
Intel i5         8      15.6          1920x1080    699
Intel i5         8      15.6          1920x1080    569
Intel i5         8      15.6          1920x1080    789
Ryzen 5          16     16.0          2560x1600    999
Ryzen 5          32     16.0          2560x1600    1299

I want to find the rows that are duplicates in every column except Price, drop them, and keep only the row with the lowest Price in each group of duplicates.
The expected output looks like this:

CPU_Sub_Series  RAM     Screen_Size   Resolution   Price
Intel i5         8      15.6          1920x1080    569
Ryzen 5          16     16.0          2560x1600    999
Ryzen 5          32     16.0          2560x1600    1299

Should I sort by price first, e.g. df.sort_values('Price')? And then what?



Solution 1:[1]

In addition to @Daniele Bianco's answer, you can also get the result with groupby: group on every column except Price and take the minimum Price within each group (a similar approach in a slightly different form):

import pandas as pd

df = pd.DataFrame({
    'CPU_Sub_Series': ['Intel i5', 'Intel i5', 'Intel i5', 'Ryzen 5', 'Ryzen 5'],
    'RAM': [8, 8, 8, 16, 32],
    'Screen_Size': [15.6, 15.6, 15.6, 16.0, 16.0],
    'Resolution': ['1920x1080', '1920x1080', '1920x1080', '2560x1600', '2560x1600'],
    'Price': [699, 569, 789, 999, 1299]
})

# Group rows that match on every non-price column, then keep the lowest Price per group
df = df.groupby(["CPU_Sub_Series", "RAM", "Screen_Size", "Resolution"])['Price'].min().reset_index()
print(df)
#  CPU_Sub_Series  RAM  Screen_Size Resolution  Price
#0       Intel i5    8         15.6  1920x1080    569
#1        Ryzen 5   16         16.0  2560x1600    999
#2        Ryzen 5   32         16.0  2560x1600   1299
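
To pick up the sort-first idea from the question: a minimal sketch, assuming the same example data, that sorts by Price and then drops duplicates on every other column, so the first (cheapest) row of each group survives. The names key_cols and result are only illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'CPU_Sub_Series': ['Intel i5', 'Intel i5', 'Intel i5', 'Ryzen 5', 'Ryzen 5'],
    'RAM': [8, 8, 8, 16, 32],
    'Screen_Size': [15.6, 15.6, 15.6, 16.0, 16.0],
    'Resolution': ['1920x1080', '1920x1080', '1920x1080', '2560x1600', '2560x1600'],
    'Price': [699, 569, 789, 999, 1299]
})

# Every column except Price defines what counts as a duplicate (illustrative name)
key_cols = [c for c in df.columns if c != 'Price']

result = (
    df.sort_values('Price')                           # cheapest rows come first
      .drop_duplicates(subset=key_cols, keep='first') # keep the cheapest row per group
      .sort_values(key_cols)                          # restore a stable, readable order
      .reset_index(drop=True)
)
print(result)
```

Unlike the groupby version, this keeps whole rows, so it would also carry along any extra columns (not present in this example) that belong to the cheapest row.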

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Park