Remove rows in dataframe which are not all 1 or all 0
I need to retain only the rows of the dataframe whose values are all 0 or all 1.
import numpy as np
import pandas as pd
a = np.repeat(0, 10)
b = np.repeat(1, 10)
ab = pd.DataFrame({'col1': a, 'col2': b}).transpose()
Solution 1:[1]
A possible solution is the following:
# pip install pandas
import pandas as pd
# create test dataframe
df = pd.DataFrame({'col1':[0,0,0,0],'col2':[1,1,1,1],'col3':[0,1,0,1],'col4':['a','b',0,1],'col5':['a','a','a','a']}).transpose()
df
# filter rows of dataframe
df = df[df.eq(0).all(axis=1) | df.eq(1).all(axis=1)]
df
Returns only the rows col1 and col2.
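The same filter can be wrapped in a small function and checked on purely numeric data. This is only a sketch: the helper name `keep_all_zero_or_all_one` and the sample frame are made up for illustration and are not part of the original answer.

```python
import pandas as pd


def keep_all_zero_or_all_one(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: keep rows whose values are all 0 or all 1."""
    all_zero = df.eq(0).all(axis=1)  # True where every value in the row is 0
    all_one = df.eq(1).all(axis=1)   # True where every value in the row is 1
    return df[all_zero | all_one]


# Illustrative data: two rows to keep, two rows to drop
sample = pd.DataFrame(
    [[0, 0, 0], [1, 1, 1], [0, 1, 0], [2, 2, 2]],
    index=["zeros", "ones", "mixed", "twos"],
)

result = keep_all_zero_or_all_one(sample)
print(result.index.tolist())  # ['zeros', 'ones']
```

Because the comparison is against 0 and 1 explicitly, a constant row of any other value (such as the `twos` row) is dropped as well.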
Solution 2:[2]
One option is to take the difference between consecutive values in each row and check that it is always 0:
import numpy as np
np.all(np.diff(ab.values, 1)==0, 1)
Output:
array([ True, True])
Then use this to slice:
ab[np.all(np.diff(ab.values, 1)==0, 1)]
Another option is to use nunique:
ab[ab.nunique(1).eq(1)]
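Note that both the diff check and nunique test whether a row is constant, not whether it is specifically all 0 or all 1. With the question's `ab` that makes no difference, but on other data a row such as all 2s would pass too. A minimal sketch of combining the constancy check with a 0/1 check, using made-up sample data:

```python
import numpy as np
import pandas as pd

# Illustrative data (not from the question): includes a constant row of 2s
sample = pd.DataFrame(
    [[0, 0, 0], [1, 1, 1], [0, 1, 0], [2, 2, 2]],
    index=["zeros", "ones", "mixed", "twos"],
)

# Constant rows: exactly one distinct value per row
constant = sample.nunique(axis=1).eq(1)

# Equivalent constancy check via differences between neighbouring columns
constant_diff = np.all(np.diff(sample.values, axis=1) == 0, axis=1)

# Rows that contain nothing but 0 and 1
binary = sample.isin([0, 1]).all(axis=1)

only_constant = sample[constant]             # still includes the "twos" row
constant_binary = sample[constant & binary]  # only all-0 and all-1 rows

print(only_constant.index.tolist())    # ['zeros', 'ones', 'twos']
print(constant_binary.index.tolist())  # ['zeros', 'ones']
```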
Solution 3:[3]
I am currently using this, which also seems to work as long as every value in the dataframe is 0 or 1:
Df= Df[(Df.sum(axis=1)==0) | (Df.sum(axis=1)==Df.shape[1])]
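The sum-based test relies on the values being restricted to 0 and 1. If other numbers can appear, a row may add up to 0 or to the number of columns by coincidence. The sketch below, on made-up data, shows such a case and one possible guard using `isin`:

```python
import pandas as pd

# Illustrative data: the last row sums to 3 (the column count) without being all 1
sample = pd.DataFrame(
    [[0, 0, 0], [1, 1, 1], [2, 1, 0]],
    index=["zeros", "ones", "sums_to_3"],
)

row_sum = sample.sum(axis=1)
sum_matches = (row_sum == 0) | (row_sum == sample.shape[1])

# Sum check alone: the coincidental row slips through
by_sum = sample[sum_matches]

# Guarded version: additionally require every value to be 0 or 1
binary = sample.isin([0, 1]).all(axis=1)
guarded = sample[sum_matches & binary]

print(by_sum.index.tolist())   # ['zeros', 'ones', 'sums_to_3']
print(guarded.index.tolist())  # ['zeros', 'ones']
```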
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |
Solution 3 | Nandu Menon |