Remove rows that contain False in a column of pandas dataframe
I assume this is an easy fix and I'm not sure what I'm missing. I have a data frame as such:
index c1 c2 c3
2015-03-07 01:27:05 False False True
2015-03-07 01:27:10 False False True
2015-03-07 01:27:15 False False False
2015-03-07 01:27:20 False False True
2015-03-07 01:27:25 False False False
2015-03-07 01:27:30 False False True
I want to remove any rows that contain False in c3. c3 is a dtype=bool column. I keep running into problems because it's a boolean and not a string/int/etc., which I haven't handled before.
Solution 1:[1]
Pandas deals with booleans in a really neat, straightforward manner:
df = df[df.c3]
The same selection can be written with .loc, which makes the row and column indexing explicit (filtering with a boolean mask returns a new object either way):
df = df.loc[df.c3, :]
When you're filtering dataframes using df[...], you often write an expression that evaluates to a boolean Series (like df.x > 2). But in this case, since the column is already boolean, you can just put df.c3 in on its own, which will get you all the rows where it is True.
If you wanted to get the opposite (as the original title to your question implied), you could use df[~df.c3] or df.loc[~df.c3, :], where the ~ inverts the booleans.
For more on boolean indexing in Pandas, see the docs. Thanks to @Mr_and_Mrs_D for the suggestion about .loc.
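As a minimal sketch of the boolean indexing described above, the snippet below rebuilds the frame from the question (the construction code is an assumption; only the values come from the post) and applies each form of the filter.

```python
import pandas as pd

# Rebuild the example frame from the question; the constructor call is
# illustrative, the values are copied from the post.
index = pd.to_datetime([
    "2015-03-07 01:27:05",
    "2015-03-07 01:27:10",
    "2015-03-07 01:27:15",
    "2015-03-07 01:27:20",
    "2015-03-07 01:27:25",
    "2015-03-07 01:27:30",
])
df = pd.DataFrame(
    {
        "c1": [False] * 6,
        "c2": [False] * 6,
        "c3": [True, True, False, True, False, True],
    },
    index=index,
)

kept = df[df.c3]              # rows where c3 is True
kept_loc = df.loc[df.c3, :]   # same selection, written with .loc
inverted = df[~df.c3]         # rows where c3 is False

print(kept)
print(inverted)
```

Note that attribute access (df.c3) only works when the column name is a valid Python identifier; df["c3"] is the general form.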
Solution 2:[2]
Well the question's title and the question itself are the exact opposite, but:
df = df[df['c3'] == True] # df will have only rows with True in c3
Solution 3:[3]
df.drop(df[df['c3'] == False].index, inplace=True)
This explicitly drops rows where 'c3' is False, rather than just keeping rows that evaluate to True.
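A small sketch of the drop-based approach, using a made-up two-column frame (the names df_unique and df_dupes are hypothetical). One thing worth keeping in mind is that drop removes rows by index label, so with a non-unique index it may remove more rows than the mask selects.

```python
import pandas as pd

# Unique index: drop behaves the same as filtering with the boolean mask.
df_unique = pd.DataFrame({"c3": [True, False, True]}, index=["a", "b", "c"])
dropped_unique = df_unique.drop(df_unique[df_unique["c3"] == False].index)

# Duplicate labels: drop works by label, so every row labelled "a" goes,
# including the one where c3 is True.
df_dupes = pd.DataFrame({"c3": [True, False, True]}, index=["a", "a", "b"])
dropped_dupes = df_dupes.drop(df_dupes[df_dupes["c3"] == False].index)
masked_dupes = df_dupes[df_dupes["c3"]]

print(dropped_unique)
print(dropped_dupes)
print(masked_dupes)
```

With a unique index, such as the timestamps in the question, both approaches should give the same rows.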
Solution 4:[4]
Consider DataFrame.query. This allows a chained operation, so you don't have to refer to the dataframe by its variable name.
filtered_df = df.query('my_col')
This should return rows where my_col evaluates to true. To invert the results, use query('~my_col') instead.
To do this in-place instead:
df.query('my_col', inplace=True)
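Applied to the column from the question, the query calls above might look like the following sketch. The frame here is a shortened stand-in for the original data, and the variable names are illustrative.

```python
import pandas as pd

# Shortened stand-in for the frame in the question.
df = pd.DataFrame({
    "c1": [False, False, False],
    "c3": [True, False, True],
})

true_rows = df.query("c3")     # keep rows where c3 is True
false_rows = df.query("~c3")   # keep rows where c3 is False

# In-place variant: modifies the frame it is called on and returns None.
df_inplace = df.copy()
df_inplace.query("c3", inplace=True)

print(true_rows)
print(false_rows)
```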
Solution 5:[5]
Another option is to use pipe:
df.pipe(lambda x: x[x['c3']])
It also works in a method chain like query, but also with a Series:
df['c3'].pipe(lambda x: x[x])
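To illustrate the method-chain point, here is a rough sketch where the filter sits between two other steps. The frame and the extra column "value" are invented for the example.

```python
import pandas as pd

# Invented example data; "value" is not part of the original question.
df = pd.DataFrame({
    "value": [1, 2, 3, 4],
    "c3": [True, False, True, False],
})

# Filter in the middle of a chain without naming an intermediate variable.
result = (
    df
    .assign(doubled=lambda x: x["value"] * 2)
    .pipe(lambda x: x[x["c3"]])
    .reset_index(drop=True)
)

# The Series form keeps only the entries that are True.
true_only = df["c3"].pipe(lambda s: s[s])

print(result)
print(true_only)
```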
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | DeepSpace |
Solution 3 | piRSquared |
Solution 4 | |
Solution 5 | nocibambi |