'Sum of different slices rows and column
I have pandas DataFrame df and three arrays columns_list, lower_boarder and upper_boarder all have the same shape. I want to find array with shape as input arrays. For i-th sum in array I need sum of i-th column from i-th lower_boarder and i-th upper_boarder. Is it possible to do in pandas way?
df = pd.DataFrame(np.arange(0, 10).repeat(5).reshape(10, 5), columns=['col1', 'col2', 'col3', 'col4', 'col5'])
col1 col2 col3 col4 col5
0 0 0 0 0 0
1 1 1 1 1 1
2 2 2 2 2 2
3 3 3 3 3 3
4 4 4 4 4 4
5 5 5 5 5 5
6 6 6 6 6 6
7 7 7 7 7 7
8 8 8 8 8 8
9 9 9 9 9 9
columns_list = ['col1', 'col2', 'col3', 'col1']
lower_boarder = [2, 3, 5, 6]
upper_boarder = [4, 5, 6, 9]
sum = []
for i in range(len(columns_list)):
sum.append(df.loc[lower_boarder[i]:upper_boarder[i], columns_list[i]].sum())
How can I do it without itteration? If shapes of columns_list is different?
Solution 1:[1]
Create DataFrame only by columns by columns_list
by DataFrame.reindex
and then compare arange
by length of DataFrame by lower_boarder, upper_boarder
with numpy broadcasting, if not match generate 0
by DataFrame.where
, last sum columns to Series
:
df1 = df.reindex(columns_list, axis=1)
m1 = np.arange(len(df1))[:, None] >= np.array(lower_boarder)
m2 = np.arange(len(df1))[:, None] <= np.array(upper_boarder)
#if default index
m1 = df1.index.to_numpy()[:, None] >= np.array(lower_boarder)
m2 = df1.index.to_numpy()[:, None] <= np.array(upper_boarder)
s = df1.where(m1 & m2, 0).sum()
print (s)
col1 9
col2 12
col3 11
col1 30
dtype: int64
Numpy alternative with numpy.where
:
a = np.where(m1 & m2, df1, 0).sum(axis=0).tolist()
print (a)
[9, 12, 11, 30]
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |