'BigQuery sort a CONCAT of multiple string columns in SELECT

First I will present example of what I'm trying to achieve, and then will share a bit more info. This is a relatively straightforward task to explain, however I am new to BigQuery and am not sure if the task is simple to implement or not.

I have a dataset of names and am attempting to concat the names into a single column, however the names need to be sorted first. The dataset looks as such:

p1    p2    p3    p4
Nick  Tom   Joe   Chris
Sal   Bill  Tom   Joe
Nick  Joe   Chris Sal
Nick  Joe   Joe   Chris

I'd like to create a 5th column that concats p1, p2, p3 and p4 in a sorted manner so that the output is as such:

p1    p2    p3    p4    concat_col
Nick  Tom   Joe   Chris Chris_Joe_Nick_Tom
Sal   Bill  Tom   Joe   Bill_Joe_Sal_Top
Nick  Joe   Chris Sal   Chris_Joe_Nick_Sal
Nick  Joe   Tom   Chris Chris_Joe_Nick_Top

Currently I have the following

SELECT 
  p1, p2, p3, p4,
  concat(p1, '_', p2, '_', p3, '_', p4) as concat_col
FROM 
  my_table

...and this concats the columns but obviously doesn't do so in a sorted manner.



Solution 1:[1]

You can put the strings into an array, unnest them and re-aggregate:

select t.*,
       (select string_agg(el, '_' order by el)
        from unnest(array[t.p1, t.p2, t.p3, t.p4]) el
       ) as str
from (select 'Nick' as p1, 'Tom' as p2, 'Joe' as p3, 'Chris' as p4 union all
      select 'Sal', 'Bill', 'Tom', 'Joe' union all
      select 'Nick', 'Joe', 'Chris', 'Sal' union all
      select 'Nick', 'Joe', 'Joe', 'Chris'
     ) t ;

Solution 2:[2]

In some use cases - you would want to not to have dependency on column names - below generic code does help with this

#standardSQL
SELECT *, 
  (
    SELECT STRING_AGG(TRIM(item), '_' ORDER BY TRIM(item))
    FROM UNNEST(REGEXP_EXTRACT_ALL(FORMAT('%t', t), r'[^(),]+')) item
  ) AS concat_col
FROM my_table t   

As you can see, above code does not call out column name still produces expected result

Row p1      p2      p3      p4      concat_col   
1   Nick    Tom     Joe     Chris   Chris_Joe_Nick_Tom   
2   Sal     Bill    Tom     Joe     Bill_Joe_Sal_Tom     
3   Nick    Joe     Chris   Sal     Chris_Joe_Nick_Sal   
4   Nick    Joe     Joe     Chris   Chris_Joe_Joe_Nick   

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Gordon Linoff
Solution 2 Mikhail Berlyant