Retrieve only months with at least 28 sample days - pandas dataframe

Hello to the people of the web,

I have a dataframe with 'DATE' (datetime) as the index and TMAX as a column of values.

What I'm trying to do is count, for every month of each year, the number of samples (each TMAX value counts as one sample). If a month has fewer than 28 samples, I want to drop that particular month (of that particular year) and all its samples.

I have the following code:

import pandas as pd

if __name__ == '__main__':
    df = pd.read_csv("2961941.csv")

    # parse 'DATE' as datetime and set it as the index (set_index drops the column, avoiding repetition)
    # infer_datetime_format=True speeds up parsing (deprecated since pandas 2.0, where it is the default behaviour).
    df['DATE'] = pd.to_datetime(df['DATE'], infer_datetime_format=True)
    df.set_index('DATE', inplace=True)

    # create new table out of 'DATE' and 'TMAX'
    tmax = df.filter(['DATE', 'TMAX'], axis=1)
    # erase rows with missing data (dropna returns a new dataframe, so reassign it)
    tmax = tmax.dropna()
    # create snow table & delete rows with missing info
    snow = df.filter(['DATE', 'SNOW']).dropna()
    # for index, row in tmax.iterrows():

Thanks for the help.



Solution 1:[1]

I can suggest trying the following. Here I store the number of samples in each month in a variable 'a'; transform('count') broadcasts that count back to every row. Then I keep only the rows whose month has at least 28 samples. It worked for me.

a = df.groupby(pd.Grouper(level='DATE', freq="M")).transform('count')
print(df[a['TMAX'] >= 28])
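
To make the idea easier to check, here is a minimal self-contained sketch of the same count-then-filter approach on made-up data. The sample dates and values are hypothetical (they are not from the asker's CSV), and grouping by `index.to_period("M")` is swapped in for `pd.Grouper(freq="M")` as one way to get a single key per year and month.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one TMAX value per day for Jan-Mar 2021
dates = pd.date_range("2021-01-01", "2021-03-31", freq="D", name="DATE")
df = pd.DataFrame({"TMAX": np.arange(len(dates), dtype=float)}, index=dates)

# Keep only the first 10 days of February, so that month has too few samples
df = df[~((df.index.month == 2) & (df.index.day > 10))]

# A missing value should not count as a sample
df.loc[pd.Timestamp("2021-03-05"), "TMAX"] = np.nan
tmax = df[["TMAX"]].dropna()

# Count samples per (year, month) and broadcast the count back to every row
counts = tmax.groupby(tmax.index.to_period("M"))["TMAX"].transform("count")

# Keep only rows that belong to a month with at least 28 samples
filtered = tmax[counts >= 28]
print(filtered.index.to_period("M").unique())
```

In this sketch January and March survive and February is dropped. An alternative with the same result would be `tmax.groupby(tmax.index.to_period("M")).filter(lambda g: len(g) >= 28)`, which is often slower on large data because it calls the Python function once per group.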

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1