Resampling raises ValueError: Values falls before first bin

I don't understand when and why this error is raised.

From my understanding, resample should create as many bins as needed in order to bin all the timestamps of the index. So the message "Values falls before first bin" does not make much sense to me.

Example/actual output:

>>> df = pd.DataFrame(index=pd.date_range(start='2021-04-22 01:00:00', end='2021-04-28 01:00', freq='1d'), data = [1]*7)
>>> df 
                     0
2021-04-22 01:00:00  1
2021-04-23 01:00:00  1
2021-04-24 01:00:00  1
2021-04-25 01:00:00  1
2021-04-26 01:00:00  1
2021-04-27 01:00:00  1
2021-04-28 01:00:00  1
>>> df.resample(rule='7d', origin='2021-04-29 00:00:00', closed='right', label='right').sum()
[...]
ValueError: Values falls before first bin

Expected output:

>>> df.resample(rule='7d', origin='2021-04-29 00:00:00', closed='right', label='right').sum() 
            0
2021-04-29  7 # bin (2021-04-22 00:00:00, 2021-04-29 00:00:00]

I'm using pandas 1.3.5

Solution 1:^[1]

From this question I learned that the timestamps are likely truncated with respect to the unit given in the rule argument before they are sorted into the correct bin.

This means that

2021-04-22 01:00:00 is truncated to 2021-04-22 00:00:00
2021-04-22 00:00:00 does not fit into the bin (2021-04-22 00:00:00, 2021-04-29 00:00:00], because that bin is open on the left, which leads to the ValueError
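
The reasoning above can be checked by hand. This is only a sketch of the hypothesis, assuming truncation to the day; it uses `Timestamp.floor` and `pd.Interval` to mimic the suspected behaviour and does not call into the resampler itself. The variable names are made up for illustration.

```python
import pandas as pd

# First timestamp of the example index.
ts = pd.Timestamp("2021-04-22 01:00:00")

# Assumption: the timestamp is truncated to the day before binning.
truncated = ts.floor("D")

# The single bin the question expects, closed on the right.
expected_bin = pd.Interval(
    pd.Timestamp("2021-04-22 00:00:00"),
    pd.Timestamp("2021-04-29 00:00:00"),
    closed="right",
)

original_in_bin = ts in expected_bin
truncated_in_bin = truncated in expected_bin

print(truncated)
print(original_in_bin, truncated_in_bin)
```

The untruncated timestamp lies inside the bin, while the truncated one sits exactly on the open left edge and is therefore excluded.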

To my eyes this looks like a bug or misfeature. At least one of "truncate timestamps before binning" or "don't add bins as needed, instead raise error" seems to be wrong.

Solution 2:^[2]

I found that normalizing the timestamps to midnight before resampling helps: time = time.dt.normalize()
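
Applied to the DataFrame from the question, this might look like the sketch below. It assumes the timestamps live in the index, so it uses `DatetimeIndex.normalize()` instead of the `.dt` accessor, and the name `normalized` is only illustrative. Be aware that normalizing moves 01:00 to midnight, so a row that ends up exactly on a bin edge can land in the preceding bin; the result is not necessarily the single-row output expected in the question.

```python
import pandas as pd

df = pd.DataFrame(
    index=pd.date_range(
        start="2021-04-22 01:00:00", end="2021-04-28 01:00:00", freq="1D"
    ),
    data=[1] * 7,
)

# Work on a copy and drop the time-of-day part of every timestamp.
normalized = df.copy()
normalized.index = normalized.index.normalize()

# Same resample call as in the question, now on the normalized index.
result = normalized.resample(
    rule="7D", origin="2021-04-29 00:00:00", closed="right", label="right"
).sum()

print(result)
```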

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	actual_panda
Solution 2	Hanan Shteingart
