Sort column values based on floats inside a string, then concat

I'm working with a pretty messy DataFrame. It looks like this, but with 30 columns:

a                                                  b
some text (other text) : 56.3% (text again: 40%)   again text (not same text) : 33% (text text: 60.1%)
text (always text) : 26.6% (aaand text: 80%)       still text (too much text) : 86% (last text: 10%)

What I'm trying to do is create another column, c, which concatenates a and b, but the concatenated parts must be ordered by the first number in each cell (I don't want to change the row order). Expected result:

c
some text (other text) : 56% (text again: 40%) again text (not same text) : 33% (text text: 60%)
still text (too much text) : 86% (last text: 10%) text (always text) : 26% (aaand text: 80%)

Any ideas?



Solution 1:[1]

You can try applying a custom function to each row:

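def concat(row):
    # First number followed by '%' in each cell, converted to float
    # (raw string so the regex escapes are not treated as string escapes)
    keys = row.str.extract(r'(\d+\.?\d*)%')[0].astype(float).tolist()
    # Order the cells of this row by that number, smallest first
    row = [x for _, x in sorted(zip(keys, row.tolist()))]
    return ' '.join(row)

# axis=1 calls concat once per row, so the row order is unchanged
df['c'] = df.apply(concat, axis=1)
print(df)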

                                                  a                                                  b
0  some text (other text) : 56.3% (text again: 40%)  again text (not same text) : 33% (text text: 6...
1      text (always text) : 26.6% (aaand text: 80%)  still text (too much text) : 86% (last text: 10%)
                                                  a  \
0  some text (other text) : 56.3% (text again: 40%)
1      text (always text) : 26.6% (aaand text: 80%)

                                                     b  \
0  again text (not same text) : 33% (text text: 60.1%)
1    still text (too much text) : 86% (last text: 10%)

                                                                                                      c
0  again text (not same text) : 33% (text text: 60.1%) some text (other text) : 56.3% (text again: 40%)
1        text (always text) : 26.6% (aaand text: 80%) still text (too much text) : 86% (last text: 10%)
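The output above is ordered smallest first, whereas the expected result in the question appears to put the cell with the larger first number first. Below is a minimal, self-contained sketch of a descending variant, assuming every cell is a string. The names `first_percent`, `concat_sorted`, `descending` and `cols` are hypothetical and only used for illustration; the sample data is copied from the question.

```python
import re

import pandas as pd

# Sample data copied from the question (two of the ~30 columns).
df = pd.DataFrame({
    "a": ["some text (other text) : 56.3% (text again: 40%)",
          "text (always text) : 26.6% (aaand text: 80%)"],
    "b": ["again text (not same text) : 33% (text text: 60.1%)",
          "still text (too much text) : 86% (last text: 10%)"],
})

# Matches an integer or decimal number immediately followed by '%'.
FIRST_PERCENT = re.compile(r"(\d+(?:\.\d+)?)%")


def first_percent(text):
    """Return the first percentage in `text` as a float, or -inf if there is none."""
    match = FIRST_PERCENT.search(text)
    return float(match.group(1)) if match else float("-inf")


def concat_sorted(row, descending=True):
    """Join the cells of one row, ordered by the first percentage in each cell."""
    # sorted() is stable, so cells with equal keys keep their column order.
    ordered = sorted(row.tolist(), key=first_percent, reverse=descending)
    return " ".join(ordered)


cols = ["a", "b"]  # assumption: list here the columns you want to combine
df["c"] = df[cols].apply(concat_sorted, axis=1)
print(df["c"].tolist())
```

Restricting the `apply` call to `df[cols]` keeps unrelated columns out of the concatenation, and passing `descending=False` (for example via `apply(concat_sorted, axis=1, descending=False)`) gives the smallest-first order shown in the output above.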

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1