Check for existence of multiple columns
Is there a more sophisticated way to check whether a dataframe df contains two columns named Column 1 and Column 2 than the following?
if numpy.all(map(lambda c: c in df.columns, ['Column 1', 'Column 2'])):
    do_something()
Solution 1:[1]
You can use Index.isin:
import pandas as pd

df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})
print (df)
   A  B  C  D  E  F
0  1  4  7  1  5  7
1  2  5  8  3  3  4
2  3  6  9  5  6  3
If you need to check that at least one of the names is a column, use any:
cols = ['A', 'B']
print (df.columns.isin(cols).any())
True
cols = ['W', 'B']
print (df.columns.isin(cols).any())
True
cols = ['W', 'Z']
print (df.columns.isin(cols).any())
False
If you need to check all values, use all (this tests whether every column of df appears in cols):
cols = ['A', 'B', 'C','D','E','F']
print (df.columns.isin(cols).all())
True
cols = ['W', 'Z']
print (df.columns.isin(cols).all())
False
Solution 2:[2]
I know it's an old post...
From this answer:
if set(['Column 1', 'Column 2']).issubset(df.columns):
    do_something()
Or, a little more elegant:
if {'Column 1', 'Column 2'}.issubset(df.columns):
    do_something()
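When the check fails, it usually helps to know which columns are absent. Here is a minimal sketch of that idea built on the same set-based test; the helper name missing_columns and the sample dataframe are my own assumptions for illustration, not part of the original answer or of pandas.

```python
import pandas as pd

def missing_columns(df, required):
    """Hypothetical helper: return the names in `required` that df lacks,
    preserving the order in which they were given."""
    present = set(df.columns)
    return [name for name in required if name not in present]

# Illustrative data (assumed, not from the original post)
df = pd.DataFrame({'Column 1': [1, 2], 'Column 2': [3, 4]})

# Same subset test as in the answer above
has_all = {'Column 1', 'Column 2'}.issubset(df.columns)

# Report what is missing when the requirement is not met
missing = missing_columns(df, ['Column 1', 'Column 3'])
```

An empty list from missing_columns means the same thing as issubset returning True, so either form can drive the if statement.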
Solution 3:[3]
The one issue with the given answer (and maybe it works for the OP) is that it tests to see if all of the dataframe's columns are in a given list - but not that all of the given list's items are in the dataframe columns.
My solution was:
test = all([i in df.columns for i in ['A', 'B']])
where test is a simple True or False.
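To make the difference between the two directions concrete, here is a small sketch. The three-column dataframe and the variable names are assumptions chosen for illustration only.

```python
import pandas as pd

# Illustrative dataframe (assumed): columns A, B and C
df = pd.DataFrame({'A': [1], 'B': [2], 'C': [3]})

required = ['A', 'B']

# Asks: is every column of df in `required`? 'C' is not, so this is False
# even though both required columns exist.
columns_in_list = df.columns.isin(required).all()

# Asks: is every name in `required` a column of df? This is the intended check.
list_in_columns = all(name in df.columns for name in required)

# The reverse failure: 'D' is missing, yet the first form reports True
# because every column of df appears in the longer list.
too_many = ['A', 'B', 'C', 'D']
false_positive = df.columns.isin(too_many).all()
actually_present = all(name in df.columns for name in too_many)
```

The two forms agree only when the list and the dataframe's columns contain exactly the same names.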
Solution 4:[4]
To check that every item of a list exists in a dataframe's columns while still using isin, you can do the following:
col_list = ['A', 'B']
pd.Index(col_list).isin(df.columns).all()
As explained in the accepted answer, .all() checks whether all items in col_list are present in the columns, while .any() tests for the presence of any of them.
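Because isin returns one boolean per item of the list, the same mask can also be used to pick out the names that are missing. A rough sketch, with an assumed dataframe and assumed variable names:

```python
import pandas as pd

# Illustrative dataframe (assumed): columns A, B and C
df = pd.DataFrame({'A': [1], 'B': [2], 'C': [3]})

# Wrap the wanted names in an Index so that isin is available on them
required = pd.Index(['A', 'W'])

# One boolean per wanted name: is it among df's columns?
mask = required.isin(df.columns)

all_present = mask.all()   # every wanted name is a column
any_present = mask.any()   # at least one wanted name is a column

# Names that were asked for but are not columns of df
missing = required[~mask].tolist()
```

Here the mask is aligned with the list of wanted names rather than with df.columns, which is why .all() answers the original question.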
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | nick |
Solution 3 | elPastor |
Solution 4 | |