'Pandas - Take value n month before

I am working with datetime. Is there anyway to get a value of n months before.

For example, the data look like:

dft = pd.DataFrame(
    np.random.randn(100, 1),
    columns=["A"],
    index=pd.date_range("20130101", periods=100, freq="M"),
)

dft

Then:

  • For every Jul of each year, we take value of December in previous year and apply it to June next year
  • For other month left (from Aug this year to June next year), we take value of previous month
  • For example: that value from Jul-2000 to June-2001 will be the same and equal to value of Dec-1999.

What I've been trying to do is:

dft['B'] = np.where(dft.index.month == 7, 
                    dft['A'].shift(7, freq='M') ,
                    dft['A'].shift(1, freq='M'))

However, the result is simply a copy of column A. I don't know why. But when I tried for single line of code :

dft['C'] = dft['A'].shift(7, freq='M')

then everything is shifted as expected. I don't know what is the issue here



Solution 1:[1]

The issue is index alignment. This shift that you performed acts on the index, but using numpy.where you convert to arrays and lose the index.

Use pandas' where or mask instead, everything will remain as Series and the index will be preserved:

dft['B'] = (dft['A'].shift(1, freq='M')
            .mask(dft.index.month == 7,  dft['A'].shift(7, freq='M'))
            )

output:

                   A         B
2013-01-31 -2.202668       NaN
2013-02-28  0.878792 -2.202668
2013-03-31 -0.982540  0.878792
2013-04-30  0.119029 -0.982540
2013-05-31 -0.119644  0.119029
2013-06-30 -1.038124 -0.119644
2013-07-31  0.177794 -1.038124
2013-08-31  0.206593 -2.202668  <- correct
2013-09-30  0.188426  0.206593
2013-10-31  0.764086  0.188426
...              ...       ...
2020-12-31  1.382249 -1.413214
2021-01-31 -0.303696  1.382249
2021-02-28 -1.622287 -0.303696
2021-03-31 -0.763898 -1.622287
2021-04-30  0.420844 -0.763898

[100 rows x 2 columns]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1