'Add a duplicate row and change the value of the duplicated row based on some other value in Pandas

I want to merge 2 columns of the same dataframe, and add a duplicate row using the same values as it has in the other columns.

consider the following dataframe:

Column A Column B Column C
ABC '' 1
GHI XYZ 2
'' PQR 3
'' '' 4

The conditions are:

  1. If the Column A has a alphanumeric value and the Column B has a Nan value or a '' (empty string) -> the Result column should only consider the value from Number-first
  2. If the Column A has a Nan or '' (empty string) value and the Column B has a alphanumeric value -> the Result column should only consider the value from Number-second
  3. If the values from both the columns are alphanumeric the result column should duplicate itself where the first value should be Column A and the second value should be Column B
  4. If both the Columns have Nan or empty string values, the result should consist of a '' (empty string) value

Following would be the output for the above dataframe:

Column A Column B Column C Result
ABC '' 1 ABC
GHI XYZ 2 GHI
GHI XYZ 2 XYZ
'' PQR 3 PQR
'' '' 4 ''

I have been unsuccessful in making it work.



Solution 1:[1]

You can create a list of values from Column A and Column B columns then explode it:

result = df[['A', 'B']].replace('', np.nan).stack().groupby(level=0).apply(list)
df = df.assign(Result=result.fillna('')).explode('Result')
print(df)

# Output
     A    B  C Result
0  ABC       1    ABC
1  GHI  XYZ  2    GHI
1  GHI  XYZ  2    XYZ
2       PQR  3    PQR
3            4       

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Corralien