'How to select top level columns in multi header pandas dataframe

I have a multi header dataframe and it looks like that:

              SPY                           ARKW                     
              Open            High           Open          High       
               Bid     Ask     Bid     Ask    Bid    Ask    Bid    Ask
Date                                                                  
2018-07-03  272.86  272.87  272.97  272.98  56.00  56.35  56.24  56.63
2018-07-02  269.49  269.51  272.04  272.05  54.87  55.21  56.04  56.09
2018-06-29  272.09  272.12  273.66  273.67  55.58  55.95  55.95  56.07

I want to iterate over all top level columns so in that case SPY and ARKW so I tried:

for column in df:
    print(column)

but it iterates also over lower level columns:

('SPY', 'Open', 'Bid')
('SPY', 'Open', 'Ask')
('SPY', 'High', 'Bid')
('SPY', 'High', 'Ask')
('ARKW', 'Open', 'Bid')
('ARKW', 'Open', 'Ask')
('ARKW', 'High', 'Bid')
('ARKW', 'High', 'Ask')

How can I iterate over top level columns and then for each select Open Bid column?



Solution 1:[1]

I think loop is not necessary, for selecting use slicers:

idx = pd.IndexSlice
df1 = df.loc[:, idx[:, 'Open', 'Bid']]
print (df1)
               SPY   ARKW
              Open   Open
               Bid    Bid
2018-07-03  272.86  56.00
2018-07-02  269.49  54.87
2018-06-29  272.09  55.58

Or use DataFrame.xs:

df1 = df.xs(('Open', 'Bid'), level=[1, 2], axis=1) 
print (df1)
               SPY   ARKW
2018-07-03  272.86  56.00
2018-07-02  269.49  54.87
2018-06-29  272.09  55.58

df1 = df.xs(('Open', 'Bid'), level=[1, 2], axis=1, drop_level=False) 
print (df1)
               SPY   ARKW
              Open   Open
               Bid    Bid
2018-07-03  272.86  56.00
2018-07-02  269.49  54.87
2018-06-29  272.09  55.58

EDIT:

For better performance all values in MultiIndex has to be sorted:

df = df.sort_index(axis=1)

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1