'BigQuery sort a CONCAT of multiple string columns in SELECT
First I will present example of what I'm trying to achieve, and then will share a bit more info. This is a relatively straightforward task to explain, however I am new to BigQuery and am not sure if the task is simple to implement or not.
I have a dataset of names and am attempting to concat the names into a single column, however the names need to be sorted first. The dataset looks as such:
p1 p2 p3 p4
Nick Tom Joe Chris
Sal Bill Tom Joe
Nick Joe Chris Sal
Nick Joe Joe Chris
I'd like to create a 5th column that concats p1, p2, p3 and p4 in a sorted manner so that the output is as such:
p1 p2 p3 p4 concat_col
Nick Tom Joe Chris Chris_Joe_Nick_Tom
Sal Bill Tom Joe Bill_Joe_Sal_Top
Nick Joe Chris Sal Chris_Joe_Nick_Sal
Nick Joe Tom Chris Chris_Joe_Nick_Top
Currently I have the following
SELECT
p1, p2, p3, p4,
concat(p1, '_', p2, '_', p3, '_', p4) as concat_col
FROM
my_table
...and this concats the columns but obviously doesn't do so in a sorted manner.
Solution 1:[1]
You can put the strings into an array, unnest them and re-aggregate:
select t.*,
(select string_agg(el, '_' order by el)
from unnest(array[t.p1, t.p2, t.p3, t.p4]) el
) as str
from (select 'Nick' as p1, 'Tom' as p2, 'Joe' as p3, 'Chris' as p4 union all
select 'Sal', 'Bill', 'Tom', 'Joe' union all
select 'Nick', 'Joe', 'Chris', 'Sal' union all
select 'Nick', 'Joe', 'Joe', 'Chris'
) t ;
Solution 2:[2]
In some use cases - you would want to not to have dependency on column names - below generic code does help with this
#standardSQL
SELECT *,
(
SELECT STRING_AGG(TRIM(item), '_' ORDER BY TRIM(item))
FROM UNNEST(REGEXP_EXTRACT_ALL(FORMAT('%t', t), r'[^(),]+')) item
) AS concat_col
FROM my_table t
As you can see, above code does not call out column name still produces expected result
Row p1 p2 p3 p4 concat_col
1 Nick Tom Joe Chris Chris_Joe_Nick_Tom
2 Sal Bill Tom Joe Bill_Joe_Sal_Tom
3 Nick Joe Chris Sal Chris_Joe_Nick_Sal
4 Nick Joe Joe Chris Chris_Joe_Joe_Nick
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Gordon Linoff |
Solution 2 | Mikhail Berlyant |