Median in pandas dropping center value
I am working in pandas and want to implement an algorithm that requires a modified centered median over a rolling window, one that omits the middle value of each window. For instance, the unmodified version might be:
ser = pd.Series(data=[0.,1.,2.,4.5,5.,6.,8.,9])
med = ser.rolling(5,center=True).median()
print(med)
and I would like the result for med[3] to be 3.5 (the median of 1., 2., 5., 6.) rather than 4.5, which is the ordinary windowed median. Is there an economical way to do this?
Solution 1:[1]
Try:
import numpy as np
import pandas as pd
ser = pd.Series(data=[0.,1.,2.,4.5,5.,6.,8.,9])
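# Take a trailing window of 5, drop its middle element (position 2), take the median,
# then shift by -2 so each result lines up with the center of its window.
# raw=True passes each window as a NumPy array, so the slices are purely positional.
med = ser.rolling(5).apply(lambda x: np.median(np.concatenate([x[0:2],x[3:5]])), raw=True).shift(-2)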
print(med)
With output:
0 NaN
1 NaN
2 2.75
3 3.50
4 5.25
5 6.50
6 NaN
7 NaN
More generally, for any odd window size:
rolling_size = 5  # must be odd so the window has a single middle value
half = rolling_size // 2
med = ser.rolling(rolling_size).apply(lambda x: np.median(np.concatenate([x[:half],x[half+1:]])), raw=True).shift(-half)
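The general form above can also be wrapped in a small helper. This is only a sketch: the function name `rolling_median_without_center` is made up for illustration, it assumes an odd window size, and it uses `center=True` in place of the trailing window plus `shift` so the alignment is handled by pandas.

```python
import numpy as np
import pandas as pd


def rolling_median_without_center(ser, window=5):
    """Centered rolling median that ignores the middle value of each window.

    Hypothetical helper for illustration; assumes `window` is odd.
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    half = window // 2
    # raw=True hands each window to the lambda as a NumPy array,
    # so np.delete can drop the middle element by position.
    return ser.rolling(window, center=True).apply(
        lambda x: np.median(np.delete(x, half)), raw=True
    )


ser = pd.Series(data=[0., 1., 2., 4.5, 5., 6., 8., 9])
med = rolling_median_without_center(ser)
print(med)
```

With the default `min_periods`, the first and last `window // 2` positions stay NaN, as in the output shown above.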
Solution 2:[2]
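ser = pd.Series(data=[0.,1.,2.,4.5,5.,6.,8.,9])
def median(series, window = 2):
    # Pair each value with the one `window` positions later
    df = pd.DataFrame(series[window:].reset_index(drop=True))
    df[1] = series[:-window]
    # The median of two values is simply their mean
    df = df.apply(lambda x: x.mean(), axis=1)
    # With the default window=2, this labels each result with the index of the skipped middle value
    df.index += window - 1
    return df
median(ser)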
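I think this is simpler. Note that it averages the two immediate neighbors of each point, which matches the five-point window result only when the data is sorted, as it is here. It also returns values at indices 1 and 6, where Solution 1 gives NaN.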
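If calling a Python function for every window becomes a bottleneck, the same center-dropping median can be computed in one vectorized pass. The following is a minimal sketch, assuming NumPy 1.20 or later (for `sliding_window_view`) and an odd window size; the function name `median_excluding_center` is made up for illustration and is not part of either answer.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def median_excluding_center(ser, window=5):
    """Vectorized centered median that skips the middle value of each window.

    Hypothetical helper for illustration; assumes `window` is odd.
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    half = window // 2
    values = ser.to_numpy(dtype=float)
    out = np.full(len(values), np.nan)  # edges without a full window stay NaN
    if len(values) >= window:
        # One row per full window: shape (n - window + 1, window)
        windows = sliding_window_view(values, window)
        # Remove the middle column, then take the median of each row
        trimmed = np.delete(windows, half, axis=1)
        out[half:len(values) - half] = np.median(trimmed, axis=1)
    return pd.Series(out, index=ser.index)


ser = pd.Series(data=[0., 1., 2., 4.5, 5., 6., 8., 9])
med = median_excluding_center(ser)
print(med)
```

Because each window keeps all of its remaining values, this version does not rely on the input being sorted.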
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Avi Thaker |
Solution 2 | MoRe |