'check for any column contains provided values after group by
I have a data set like this.
CREATE TABLE EntityItems (
Id INT NOT NULL,
EnitityItemId INT NOT NULL,
PRIMARY KEY (Id, EnitityItemId)
);
INSERT INTO EntityItems
VALUES
(1, 1),
(1, 2),
(1, 3),
(2, 1),
(2, 2),
(3, 1),
(3, 4)
;
I want to filter this table where any of the group contains both 1 and 2 values(in EntityItemId column) after grouping by Id column.
let's say I am checking for both 1 and 2 values to be there in a group. I am expecting the result as
Id
1
2
Please let me know if you need clarification. Thanks
Solution 1:[1]
You can aggregate and check, if the distinct count of entity item IDs is equal to 2 in a HAVING
clause.
SELECT id
FROM elbat
WHERE enitityitemid IN (1, 2)
GROUP BY id
HAVING count(DISTINCT enitityitemid) = 2;
Solution 2:[2]
you can try a query like below
select Id
from
yourtable where
EnitityitemId in (1,2)-- this is your condition
group by Id
having
count(1)=2
Solution 3:[3]
One approach to filtering groups based on multiple values is to use self-joins and apply the condition for each value on each of the twin tables. With 2 distinct values, you can use a single join:
SELECT ei0.Id
FROM EntityItems AS ei0
JOIN EntityItems AS ei1 ON ei0.id = ei1.id
WHERE ei0.EntityitemId = 1
AND ei1.EntityitemId = 2
;
Depending on the exact nature of conditions, the query may need a GROUP BY
clause on the selected columns, though it's unnecessary for the sample case.
For the sample, this query performs well given the appropriate indices. The query plan with the sample data:
With a larger table, I'd expect the index seek to have a smaller impact and the scan to dominate. Compare with the execution plan for the group-and-count approach from sticky bit's answer, which also has a single scan:
The group-and-count can more succinctly be adapted to work with additional values (add them to the IN
list and change the count used in the COUNT
comparison); the join-based approach will require an additional join for each value. Also, if there isn't an appropriate index (or the condition precludes using and index), the performance of the join-based approach will be worse than the group-and-count. On the other hand, the join-based approach is possibly more readily adapted to some complex conditions.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | sticky bit |
Solution 2 | DhruvJoshi |
Solution 3 | outis |