How to duplicate each row of a pandas DataFrame, changing only one column in the copy?
I have a large dataset, and I want to duplicate each row directly below the original, changing just one column value in the copy.
Wherever the input shows "same", I want to copy the value from the previous row, and the last column (F) of the duplicated row should take the value of column C.
import numpy as np
import pandas as pd

df = pd.DataFrame([[45, 20, 'A1', 46, 20, 'A2'],
                   [45, 20, 'B2', 46, 20, 'B1'],
                   [46, 20, 'A2', 47, 20, 'A1'],
                   [46, 20, 'B1', 47, 20, 'B2']], columns=['A', 'B', 'C', 'D', 'E', 'F'])

# Attempt: append a filler row of zeros after every row
new_row = {"A": 0, "B": 0, "C": 0, "D": 0, "E": 0, "F": 0}
s = pd.DataFrame([new_row], columns=df.columns)
# DataFrame.append was removed in pandas 2.0, so pd.concat is used here instead
f = lambda d: pd.concat([d, s], ignore_index=True)
grp = np.arange(len(df))  # one group per row
df.groupby(grp, group_keys=False).apply(f).reset_index(drop=True)
input:
expected output:
Solution 1:[1]
Assuming this input:
A B C D E F
45 20 A1 46 20 A2
45 20 B2 46 20 B1
46 20 A2 47 20 A1
46 20 B1 47 20 B2
and that you want to duplicate each row, with column F of the copy taking the value of column C:
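# Stack the original rows on top of copies whose F is replaced by C, then
# sort by index with a stable sort so each copy lands right below its original
out = (pd.concat([df, df.assign(F=df['C'])])
         .sort_index(kind='stable').reset_index(drop=True)
      )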
output:
    A   B   C   D   E   F
0  45  20  A1  46  20  A2
1  45  20  A1  46  20  A1
2  45  20  B2  46  20  B1
3  45  20  B2  46  20  B2
4  46  20  A2  47  20  A1
5  46  20  A2  47  20  A2
6  46  20  B1  47  20  B2
7  46  20  B1  47  20  B1
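For comparison, here is a minimal sketch of another way to get the same interleaved result. It is not part of the original answer. It assumes the frame has a unique index, and the names `out2` and `dup` are purely illustrative. Instead of concatenating and sorting, it repeats every row with `Index.repeat` and then overwrites F on the duplicated rows only.

```python
import pandas as pd

df = pd.DataFrame(
    [[45, 20, 'A1', 46, 20, 'A2'],
     [45, 20, 'B2', 46, 20, 'B1'],
     [46, 20, 'A2', 47, 20, 'A1'],
     [46, 20, 'B1', 47, 20, 'B2']],
    columns=['A', 'B', 'C', 'D', 'E', 'F'],
)

# Repeat every row twice; each copy sits directly below its original.
# (Assumes the index labels are unique, as they are for a default RangeIndex.)
out2 = df.loc[df.index.repeat(2)].reset_index(drop=True)

# After the reset, odd positions (1, 3, 5, ...) hold the duplicates.
dup = out2.index % 2 == 1

# On the duplicated rows only, set F to the value of C from the same row.
out2.loc[dup, 'F'] = out2.loc[dup, 'C']

print(out2)
```

This avoids the sort step, which may matter on a large frame, but it relies on the positional odd/even pattern, so it should be applied before any further reordering.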
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | mozway |