'compare multiple columns of pandas dataframe with one column
I have a dataframe: df-
A B C D E
0 V 10 5 18 20
1 W 9 18 11 13
2 X 8 7 12 5
3 Y 7 9 7 8
4 Z 6 5 3 90
I want to add a column 'Result' which should return 1 if the value in column 'E' is greater than the values in B, C & D columns else return 0.
Output should be:
A B C D E Result
0 V 10 5 18 20 1
1 W 9 18 11 13 0
2 X 8 7 12 5 0
3 Y 7 9 7 8 0
4 Z 6 5 3 90 1
For few columns, i would use logic like : if(and(E>B,E>C,E>D),1,0), But I have to compare around 20 columns (from B to U) with column name 'V'. Additionally, the dataframe has around 100 thousand rows.
I am using
df['Result']=np.where((df.ix[:,1:20])<df['V']).all(1),1,0)
And it gives a Memory error.
Solution 1:[1]
One possible solution is compare in numpy
and last convert boolean mask to int
s:
df['Result'] = (df.iloc[:, 1:4].values < df[['E']].values).all(axis=1).astype(int)
print (df)
A B C D E Result
0 V 10 5 18 20 1
1 W 9 18 11 13 0
2 X 8 7 12 5 0
3 Y 7 9 7 8 0
4 Z 6 5 3 90 1
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | jezrael |