How to append two cell values located in the same dataframe column?

    Disclosure  Source
35      
36      
37      
38      
39  202-1        GRI 202: Market Presence
40               2016
41  
42      
43      

The Source column has empty values. Before removing them, I would like to know how I can merge "GRI 202: Market Presence" with "2016".

For example: GRI 202: Market Presence 2016

When you check the whole DataFrame, you will see many values split like this, and they are supposed to be in the same cell. I have spent hours reading the docs but have not been able to find the best approach.



Solution 1:[1]

With the following toy dataframe:

import pandas as pd

df = pd.DataFrame(
    {
        "Disclosure": ["", "", "202-1", "", "", "", "202-2", "", ""],
        "Source": [
            "",
            "",
            "GRI 202: Market Presence",
            "2016",
            "",
            "",
            "GRI 202: Market Presence",
            "2017",
            "",
        ],
    }
)

print(df)
# Output
  Disclosure                    Source
0
1
2      202-1  GRI 202: Market Presence
3                                 2016
4
5
6      202-2  GRI 202: Market Presence
7                                 2017
8

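Here is one way to do it. It assumes the continuation row holds a number (such as a year): the numeric values are shifted up one row so that each lands beside the label it belongs to, and the remaining rows are dropped: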

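df = (
    # Parse "Source" as numbers (text and empty strings become NaN), then
    # shift up one row so each year sits beside the label above it.
    df.assign(temp=pd.to_numeric(df["Source"], errors="coerce").shift(-1))
    # Drop rows containing NaN, i.e. rows whose next row is not a year.
    .dropna()
    .pipe(
        # Append the year to the label, then remove the helper column.
        lambda d: d.assign(
            Source=d["Source"] + " " + d["temp"].astype(int).astype(str)
        ).drop(columns="temp")
    )
    .reset_index(drop=True)
)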

print(df)
# Output
  Disclosure                         Source
0      202-1  GRI 202: Market Presence 2016
1      202-2  GRI 202: Market Presence 2017
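
If the continuation rows are not always numeric, a grouping approach may be easier to adapt. The following is only a sketch, not part of the original answer, and it rests on two assumptions: empty cells are empty strings (not NaN), and every record starts on a row where "Disclosure" is non-empty. The names group_id and merged are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Disclosure": ["", "", "202-1", "", "", "", "202-2", "", ""],
        "Source": [
            "", "", "GRI 202: Market Presence", "2016", "",
            "", "GRI 202: Market Presence", "2017", "",
        ],
    }
)

# Assumption: a non-empty "Disclosure" cell marks the start of a record.
# The running count gives every row the id of the record it belongs to;
# rows before the first record get id 0.
group_id = df["Disclosure"].ne("").cumsum().rename("group")
keep = group_id > 0

merged = (
    df[keep]
    .groupby(group_id[keep])
    .agg(
        Disclosure=("Disclosure", "first"),
        # Join the non-empty "Source" fragments of the record with spaces.
        Source=("Source", lambda s: " ".join(v for v in s if v)),
    )
    .reset_index(drop=True)
)

print(merged)
```

On the toy dataframe this yields the same two rows as the solution above. If the real data stores empty cells as NaN, they would need to be replaced first, for instance with df.fillna("").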

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Laurent