Change values of column in df using conditional in two columns
I'm having the following problem: I'm working with a dataset that can be found at https://www.kaggle.com/datasets/ricardomattos05/jogos-do-campeonato-brasileiro
The column 'home_team' contains two different teams with the same name ('Atlético'), and the only difference between them is the column 'home_team_state'. How can I change the names of these teams in the data frame?
I tried to use some conditional like
df['home_team'] = np.where([df.home_team] == 'Atletico' and df['home_team_state'] == 'MG', 'Atlético MG', df.home_team)
but it does not work. So, basically, my question is: Can I change specific values of a column in a DataFrame using conditions on 2 columns?
Any help is appreciated!
Solution 1:[1]
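You can do this with a boolean mask in df.loc. One approach is to write the result to a new column and then replace or rename it. Wrap each condition in parentheses and combine them with & (and) or | (or); Python's `and` does not work element-wise on a Series, and df.home_team should not be wrapped in a list.
In your case you can create 'new_home_team':
df.loc[(df['home_team'] == 'Atletico') & (df['home_team_state'] == 'MG'), 'new_home_team'] = "<whatever you want, True or False, YES OR NO, HOME TEAM..>"
Then, if needed, replace the values in home_team with the ones from new_home_team, or rename the column new_home_team to home_team, for example.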
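As a minimal sketch of the same idea applied to the columns from the question, the snippet below renames only the rows where both conditions hold. The three rows are made up for illustration (they are not taken from the Kaggle dataset), and the spelling 'Atlético' and the state codes are assumptions, so adjust them to the values actually stored in your data.

```python
import numpy as np
import pandas as pd

# Illustrative rows only; the real dataset's spellings may differ.
original = pd.DataFrame({
    "home_team": ["Atlético", "Atlético", "Flamengo"],
    "home_team_state": ["MG", "PR", "RJ"],
})

# Each condition is parenthesised and combined element-wise with &.
mask = (original["home_team"] == "Atlético") & (original["home_team_state"] == "MG")

# Option 1: assign in place through df.loc with the boolean mask.
df = original.copy()
df.loc[mask, "home_team"] = "Atlético MG"

# Option 2: np.where keeps the old value wherever the mask is False.
alt = original.copy()
alt["home_team"] = np.where(mask, "Atlético MG", alt["home_team"])

print(df)
```

Both options leave the rows that do not match the mask untouched, so the second 'Atlético' (state PR in this made-up frame) keeps its name until you apply a second mask for it.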
Example:
data | cts | op |
---|---|---|
2017-04-24 | 3 | B |
2022-04-18 | 6 | S |
2017-04-14 | 10 | S |
2022-04-13 | 4 | S |
Code:
df.loc[(df['cts'] > 5) & (df['op'] == 'S'), 'obs'] = 'flip'
And the output is:
data | cts | op | obs |
---|---|---|---|
2017-04-24 | 3 | B | NaN |
2022-04-18 | 6 | S | flip |
2017-04-14 | 10 | S | flip |
2022-04-13 | 4 | S | NaN |
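For reference, here is a self-contained sketch that rebuilds the example table and applies the mask with both conditions joined by &, which is the combination that matches the output shown (the last row has op == 'S' but cts is not greater than 5, so it stays NaN). The column values are copied from the table; nothing else is assumed.

```python
import pandas as pd

# Rebuild the example table from the answer.
df = pd.DataFrame({
    "data": ["2017-04-24", "2022-04-18", "2017-04-14", "2022-04-13"],
    "cts": [3, 6, 10, 4],
    "op": ["B", "S", "S", "S"],
})

# Both conditions must hold, so they are joined with & (not |).
mask = (df["cts"] > 5) & (df["op"] == "S")

# Assigning through df.loc creates 'obs' and leaves NaN where the mask is False.
df.loc[mask, "obs"] = "flip"

print(df)
```

With | instead of &, the last row would also be marked 'flip', because it satisfies op == 'S' on its own.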
Hope it helps :)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | marc_s |