SQL - Count how many unique IDs would be left without a category if a "category bucket" was removed
I am using Snowflake for this SQL question, so if there are any Snowflake-specific functions I can use, please point me to them!
I have a data set with unique IDs, other attributes that aren't important, and a list of categories (~22) that each unique ID could fall into (denoted by a 1 if it's in the category and a 0 if not).
I am trying to figure out how to check, across all the categories, whether removing a given category would leave any unique IDs without a category at all, and then count how many IDs in total would be left category-less.
In the example below, unique ID Jshshsv is only in CatAA, but ID Hairbdb is in both CatY and CatAA. If CatAA were dropped, how many IDs would be left with no category?
UniqueID | Sum across Categories | CatX | CatY | CatZ | CatAA |
---|---|---|---|---|---|
Hairbdb | 2 | 0 | 1 | 0 | 1 |
Jshshsv | 1 | 0 | 0 | 0 | 1 |
For some reason I just cannot figure out how to do this in a manageable way in SQL with so many category buckets. Any tips or things to try would be appreciated.
Solution 1:[1]
If you are storing the categories in columns (not a good design, though), you could try this:
-- Sample rows from the question; replace the inline subquery with the real table.
SELECT UniqueID,
       SUM(CatX + CatY + CatZ + CatAA) OVER (PARTITION BY UniqueID) AS "Sum across Categories",
       CatX, CatY, CatZ, CatAA
FROM (
    SELECT 'Hairbdb' AS UniqueID, 0 AS CatX, 1 AS CatY, 0 AS CatZ, 1 AS CatAA FROM dual
    UNION ALL
    SELECT 'Jshshsv', 0, 0, 0, 1 FROM dual
);
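To get from the per-ID sum to the count the question asks for, something along these lines might work. It is only a rough sketch, assuming one row per UniqueID and flags that are strictly 0 or 1 (the question does not confirm either). Under those assumptions, an ID is left without a category exactly when the removed bucket is its only one, so filtering to rows whose flags sum to 1 and then summing each column gives the count per removed category. The `ids` CTE and the `left_without_if_*` aliases are made up for illustration.

```sql
-- Hypothetical stand-in for the real table: the two sample rows from the question.
WITH ids AS (
    SELECT 'Hairbdb' AS UniqueID, 0 AS CatX, 1 AS CatY, 0 AS CatZ, 1 AS CatAA
    UNION ALL
    SELECT 'Jshshsv', 0, 0, 0, 1
)
SELECT
    -- Each SUM counts the IDs whose single category is that column,
    -- i.e. the IDs that would have no category left if it were dropped.
    SUM(CatX)  AS left_without_if_CatX_removed,
    SUM(CatY)  AS left_without_if_CatY_removed,
    SUM(CatZ)  AS left_without_if_CatZ_removed,
    SUM(CatAA) AS left_without_if_CatAA_removed
FROM ids
-- Keep only IDs that belong to exactly one category (assumes 0/1 flags).
WHERE CatX + CatY + CatZ + CatAA = 1;
```

For the two sample rows this returns 0, 0, 0 and 1: only Jshshsv is lost, and only when CatAA is removed. With ~22 buckets the column list gets long, but it is still a single pass over the table. If one row per category is preferred, Snowflake's UNPIVOT may be worth a look as an alternative.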
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Himanshu Kandpal |