Filtering a dataframe with several or statements
I have a dataframe with several columns that I want to filter on in various combinations. I want to keep all rows where any one set of conditions is met.
For instance, if the four conditions are
- city = "NYC" and weather = "Rainy"
- city = "Philly" and weather = "Sunny" and time = "Day"
- city = "Philly" and weather = "Rainy" and time = "Night"
- city = "Albany" and time = "Night"
I want to keep all rows where any of those four conditions is met.
Writing that out with a data.iloc["city"] and a bunch of ands and ors sounds messy, and there is room for error as my conditions grow.
What do you think is the best way to handle this?
For clarification, the dataframe below is the input before running the procedure:
City | Weather | Time |
---|---|---|
NYC | Sunny | Day |
NYC | Rainy | Night |
Philly | Sunny | Day |
Philly | Rainy | Day |
Philly | Rainy | Night |
Seattle | Windy | Day |
Albany | Rainy | Night |
Albany | Sunny | Day |
The following is the resulting dataframe
City | Weather | Time |
---|---|---|
NYC | Rainy | Night |
Philly | Sunny | Day |
Philly | Rainy | Night |
Albany | Rainy | Night |
Solution 1:[1]
You could use '&' and '|' like this:
# Each parenthesised group is one AND condition; the groups are OR-ed together.
df[
    ((df['City']=='NYC') & (df['Weather']=='Rainy'))
    | ((df['City']=='Philly') & (df['Weather']=='Sunny') & (df['Time']=='Day'))
    | ((df['City']=='Philly') & (df['Weather']=='Rainy') & (df['Time']=='Night'))
    | ((df['City']=='Albany') & (df['Time']=='Night'))
]
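If the list of conditions keeps growing, one option is to store each condition as data and build the mask in a loop instead of writing every comparison by hand. The following is only a sketch of that idea, not part of the original answer: the `conditions` list and the `build_mask` helper are hypothetical names introduced here, and the dataframe is rebuilt from the example table in the question. It assumes every condition is a plain equality test and that there is at least one condition.

```python
from functools import reduce
import operator

import pandas as pd

# Example data copied from the question's "before" table.
df = pd.DataFrame(
    {
        "City": ["NYC", "NYC", "Philly", "Philly", "Philly", "Seattle", "Albany", "Albany"],
        "Weather": ["Sunny", "Rainy", "Sunny", "Rainy", "Rainy", "Windy", "Rainy", "Sunny"],
        "Time": ["Day", "Night", "Day", "Day", "Night", "Day", "Night", "Day"],
    }
)

# Hypothetical representation: each dict is one AND-group (column -> required value).
conditions = [
    {"City": "NYC", "Weather": "Rainy"},
    {"City": "Philly", "Weather": "Sunny", "Time": "Day"},
    {"City": "Philly", "Weather": "Rainy", "Time": "Night"},
    {"City": "Albany", "Time": "Night"},
]


def build_mask(frame, rules):
    """Return a boolean Series that is True where at least one rule matches."""
    group_masks = [
        # AND together the equality tests inside a single rule.
        reduce(operator.and_, (frame[col] == value for col, value in rule.items()))
        for rule in rules
    ]
    # OR the per-rule masks together.
    return reduce(operator.or_, group_masks)


result = df[build_mask(df, conditions)]
print(result)
```

Adding a fifth condition then means appending one more dict to `conditions`, which leaves less room for a misplaced parenthesis than extending the chained expression.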
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |