'Python Dataframes - Breaking out single rows with duplicate columns into multiple rows and fewer columns

I have a data frame like this:

A B C Date1 Time1 Value1 Date2 Time2 Value2
abc def ghi 01-01-2000 15:00:00 100 01-01-2000 19:00:00 110

There are duplicate columns for Date/Time/Value and I would like to merge them into a single column. The tricky part is that I would like the values of columns A/B/C to be shown for each new row added. Something like this:

A B C Date Time Value
abc def ghi 01-01-2000 15:00:00 100
abc def ghi 01-01-2000 19:00:00 110

The only approach I could think of was to take a subset of the original dataframe such as:

A B C Date1 Time1 Value1
abc def ghi 01-01-2000 15:00:00 100

AND

A B C Date2 Time2 Value2
abc def ghi 01-01-2000 19:00:00 110

Then try to merge them together. There has to be a faster way to do this. Any help would be appreciated



Solution 1:[1]

pd.wide_to_long works fine for this; however, if your indices are not unique, pivot_longer from pyjanitor can be helpful here, and is efficient:

# pip install pyjanitor
import pandas as pd
import janitor

df.pivot_longer(
     index = ['A','B','C'], 
     names_to = '.value', 
     names_pattern = r"(\D+)\d")

     A    B    C        Date      Time  Value
0  abc  def  ghi  01-01-2000  15:00:00    100
1  abc  def  ghi  01-01-2000  19:00:00    110

The .value here determines which part of the columns remain as column headers; this is determined from the grouped regex in named_pattern.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 sammywemmy