Use of replace() in a Python DataFrame for multiple columns with the same value
Query: I need to replace one old value with one new value in a subset of columns (not all columns) of a DataFrame. The question is about syntax: is there a shorter way to write it?
Sample DataFrame:
import pandas as pd
df = pd.DataFrame({'A': [0,1,2,3,4],
                   'B': [5,6,7,0,9],
                   'C': [2,0,9,3,0],
                   'D': [1,3,0,5,2]})
I need every 0 replaced with 10 in the above df, but only in columns A and C (not in B or D).
Code that I use to do this:
Method 1: Two separate commands.
df['A'].replace({0:10},inplace=True)
df['C'].replace({0:10},inplace=True)
Method 2: One command using dictionary in dictionary
df.replace({'A': {0:10}, 'C': {0:10}},inplace=True)
Method 3: Keeping new value out of dictionary
df.replace({'A':0,'C':0},10,inplace=True)
Expected Outcome:
    A  B   C  D
0  10  5   2  1
1   1  6  10  3
2   2  7   9  0
3   3  0   3  5
4   4  9  10  2
All three methods give the expected outcome. My question is: can we pass a list of columns and enter the old and new values for the replacement only once?
Something like:
df.replace({['col_ref'...]:{'old':'new'})
#OR
df['col_ref'...].replace()
In my scenario, 26 of the 52 columns need replacing, and the replacement is done through a regex. I can store the regex mapping in a variable and use Method 2, but that still means typing the variable name 26 times. Is there a shorter way where I enter these 26 columns and the regex replacement {'r':'r2'} only once?
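One way to state the column list and the old/new pair only once is to build the dictionary of Method 3 from the list. The following is a minimal sketch on the sample frame from the question; the names `cols` and `result` are introduced here for illustration and are not part of the original post.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 1, 2, 3, 4],
                   'B': [5, 6, 7, 0, 9],
                   'C': [2, 0, 9, 3, 0],
                   'D': [1, 3, 0, 5, 2]})

# Hypothetical variable: the columns that should be touched, listed once.
cols = ['A', 'C']

# dict.fromkeys(cols, 0) builds {'A': 0, 'C': 0}, i.e. the Method 3 dictionary.
# replace() returns a new DataFrame; df itself is left unchanged.
result = df.replace(dict.fromkeys(cols, 0), 10)

print(result)
```

Only columns A and C have their zeros replaced; the zeros in B and D are kept, which matches the expected outcome in the question.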
Solution 1:[1]
I tried this.
for col in [list of columns]:  # placeholder: put the actual column names here
    df.replace({col:{'r':'r2'}},regex=True,inplace=True)
This is the shortest version I could come up with in terms of characters typed.
However, if there is a faster way, other answers are welcome.
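The loop in Solution 1 can also be folded into a single call by building the nested dictionary of Method 2 with a dict comprehension. This is a sketch under assumptions: the small string frame, `cols`, and `mapping` are made up for illustration, and only the `{'r': 'r2'}` pair comes from the question.

```python
import pandas as pd

# Made-up string data, just to show the regex replacement at work.
df = pd.DataFrame({'A': ['r-1', 'x'],
                   'B': ['r-2', 'r'],
                   'C': ['ar', 'b']})

cols = ['A', 'C']        # hypothetical list of columns to process
mapping = {'r': 'r2'}    # regex pattern -> replacement, stored once

# Builds {'A': {'r': 'r2'}, 'C': {'r': 'r2'}} and applies it in one call.
# With regex=True the pattern is matched inside each string, so 'ar'
# becomes 'ar2'. Column B is not in the dictionary and stays untouched.
df = df.replace({col: mapping for col in cols}, regex=True)

print(df)
```

Both the column list and the mapping are written exactly once, so extending this to 26 columns only means lengthening `cols`.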
Solution 2:[2]
I was looking for a quicker way to do this myself this week and found this approach, which avoids the for loop:
col_list = ['A', 'B']
# replace() returns a new frame here; do not pass inplace=True, because that makes it return None
df[col_list] = df[col_list].replace(0, 10)
If you are using a regex on string values:
col_list = ['A', 'B']
# raw string so the backslash reaches the regex engine unchanged
df[col_list] = df[col_list].replace(r'[\$,]', '', regex=True)
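As a small worked example of the column-subset approach with a regex, the sketch below strips dollar signs and thousands separators from two of three columns. The sample values are invented for illustration; note that `replace` is called without `inplace=True`, so it returns the modified subset, which is then assigned back.

```python
import pandas as pd

# Invented currency-like strings for illustration only.
df = pd.DataFrame({'A': ['$1,000', '$20'],
                   'B': ['$3,500', '7'],
                   'C': ['$5', '$6']})

col_list = ['A', 'B']

# Select the subset, strip every '$' and ',' via regex, assign the result back.
# Column C is outside col_list and keeps its '$' characters.
df[col_list] = df[col_list].replace(r'[\$,]', '', regex=True)

print(df)
```

The values remain strings after the replacement; converting them to numbers would be a separate step.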
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Meet |
| Solution 2 | tylerjames |