'Unpivot multiple columns in Snowflake
I have a table that looks as follows:
I need to unpivot the Rating and the Comments as follows:
What is the best way to do this in Snowflake?
Note: there are some cells in the comment columns that are NULL
Adding details:
create or replace table reviews(name varchar(50), acting_rating int, acting_comments text, comedy_rating int, comedy_comments text);
insert into reviews values
('abc', 4, NULL, 1, 'NO'),
('xyz', 3, 'some', 1, 'haha'),
('lmn', 1, 'what', 4, NULL);
select * from reviews;
select name, skill, skill_rating, comments
from reviews
unpivot(skill_rating for skill in (acting_rating, comedy_rating))
unpivot(comments for skill_comments in (acting_comments,comedy_comments))
--Following where clause is added to filter the irrelevant comments due to multiple unpivots
where substr(skill,1,position('_',skill)-1) = substr(skill_comments,1,position('_',skill_comments)-1)
order by name;
will produce produce the desired results, but with data that has NULLs, the unpivoted rows that have NULLs go missing from the output:
NAME SKILL SKILL_RATING COMMENTS
abc COMEDY_RATING 1 NO
lmn ACTING_RATING 1 what
xyz ACTING_RATING 3 some
xyz COMEDY_RATING 1 haha
Solution 1:[1]
If all you need to solve is for the table specified in the question - you can do it manually with a set of UNION ALL
:
select NAME
, 'ACTING_RATING' as SKILL, ACTING_RATING as SKILL_RATING, ACTING_COMMENTS as SKILL_COMMENTS
from DATA
union all
select NAME
, 'COMEDY_RATING', COMEDY_RATING, COMEDY_COMMENTS
from DATA
union all
select NAME
, 'MUSICAL_PERFORMANCE_RATING', MUSICAL_PERFORMANCE_RATING, MUSICAL_PERFORMANCE_COMMENTS
from DATA
Solution 2:[2]
This is a basic script and should give the desired output
create or replace table reviews(name varchar(50), acting_rating int, acting_comments text, comedy_rating int, comedy_comments text);
insert into reviews values
('abc', 4, 'something', 1, 'NO'),
('xyz', 3, 'some', 1, 'haha'),
('lmn', 1, 'what', 4, 'hahaha');
select * from reviews;
select name, skill, skill_rating, comments
from reviews
unpivot(skill_rating for skill in (acting_rating, comedy_rating))
unpivot(comments for skill_comments in (acting_comments,comedy_comments))
--Following where clause is added to filter the irrelevant comments due to multiple unpivots
where substr(skill,1,position('_',skill)-1) = substr(skill_comments,1,position('_',skill_comments)-1)
order by name;
Solution 3:[3]
I've had same problem, Here is my solution for unpivoting by two categories AND keeping nulls:
First you replace NULL
's with some string, for example: 'NULL'
Then brake the two unpivots into two separate cte's and create common category column to join them again later, 'skill' in your case.
Lastly, join the two cte's by name and skill category, replace the 'NULL' string with actual NULL
create or replace table reviews(name varchar(50), acting_rating int, acting_comments text, comedy_rating int, comedy_comments text);
insert into reviews values
('abc', 4, 'something', 1, 'NO'),
('xyz', 3, 'some', 1, 'haha'),
('lmn', 1, 'what', 4, 'hahaha');
WITH base AS (SELECT name
, acting_rating
, IFNULL(acting_comments, 'NULL') AS acting_comments
, comedy_rating
, IFNULL(comedy_comments, 'NULL') AS comedy_comments
FROM reviews
)
, skill_rating AS (SELECT name
, REPLACE(skill, '_RATING', '') AS skill
, skill_rating
FROM base
UNPIVOT (skill_rating FOR skill IN (acting_rating, comedy_rating))
)
, comments AS (SELECT name
, REPLACE(skill_comments, '_COMMENTS', '') AS skill
, comments
FROM base
UNPIVOT (comments FOR skill_comments IN (acting_comments,comedy_comments))
)
SELECT s.name
, s.skill
, s.skill_rating
, NULLIF(c.comments, 'NULL') AS comments
FROM skill_rating AS s
JOIN comments AS c
ON s.name = c.name
AND s.skill = c.skill
ORDER BY name;
The result:
name skill skill_rating comments
abc ACTING 4 <null>
abc COMEDY 1 NO
lmn ACTING 1 what
lmn COMEDY 4 <null>
xyz ACTING 3 some
xyz COMEDY 1 haha
Solution 4:[4]
If the goal is to store the unpivoted result as a table then INSERT ALL
could be used to unpivot mutliple columns at once:
Setup:
create or replace table reviews(
name varchar(50), acting_rating int,
acting_comments text, comedy_rating int, comedy_comments text);
insert into reviews values
('abc', 4, NULL, 1, 'NO'),
('xyz', 3, 'some', 1, 'haha'),
('lmn', 1, 'what', 4, NULL);
select * from reviews;
Query:
CREATE OR REPLACE TABLE reviews_transposed(
name VARCHAR(50)
,skill TEXT
,skill_rating INT
,skill_comments TEXT
);
INSERT ALL
INTO reviews_transposed(name, skill, skill_rating, skill_comments)
VALUES (name, 'ACTING_RATING', acting_rating, acting_comments)
INTO reviews_transposed(name, skill, skill_rating, skill_comments)
VALUES (name, 'COMEDY_RATING', comedy_rating, comedy_comments)
SELECT *
FROM reviews;
SELECT *
FROM reviews_transposed;
Before:
After:
This approach has one significant advantage over UNION ALL
approach proposed by Felippe, when saving into table (the number of table scans and thus partition read is growing for each UNION ALL
wheareas INSERT ALL
scans source table only once.
INSERT INTO reviews_transposed
select NAME
, 'ACTING_RATING' as SKILL, ACTING_RATING as SKILL_RATING, ACTING_COMMENTS as SKILL_COMMENTS
from reviews
union all
select NAME
, 'COMEDY_RATING', COMEDY_RATING, COMEDY_COMMENTS
from reviews;
vs INSERT ALL
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Iqra Ijaz |
Solution 3 | Alex Mirkin |
Solution 4 |