'SQL: Count more than average HAVING vs. WHERE
I had a fairly straightforward interview question: return all countries that have more customers than the average number of customers of all cities.
country
table:
id country_name
1 Austria
2 Germany
3 United Kingdom
city
table:
id city_name country_id
1 Wien 1
2 Berlin 2
3 Hamburg 2
4 London 3
customer
table:
id customer_name city_id
1 cust1 1
2 cust2 4
3 cust3 3
4 cust4 1
5 cust5 2
6 cust6 1
7 cust7 4
8 cust8 2
Below is my solution. My answer was close but didn't pass the test case because mine returned more countries than the expected output. Do you know why, or how would you approach this problem?
SELECT
co.country_name, COUNT(customer_name)
FROM
country co
JOIN
city ci ON co.id = ci.country_id
JOIN
customer cu ON ci.id = cu.city_id
GROUP BY co.country_name
HAVING COUNT(customer_name) > (SELECT
AVG(temp.cnt)
FROM
(SELECT
ci.id, COUNT(customer_name) AS cnt
FROM
city ci
JOIN customer cu ON ci.id = cu.city_id
GROUP BY ci.id) temp)
ORDER BY co.country_name ASC;
Users in this thread SQL: Count values higher than average for a group are using WHERE
instead of HAVING
to select value bigger than average. I wonder if HAVING
is not suitable for this type of problem.
Solution 1:[1]
Your current query is one valid way to solve this problem, but maybe the recipient of your answer was expecting a different solution. A more modern way of doing this might involve analytic functions:
WITH cte AS (
SELECT co.country_name, COUNT(customer_name) cnt,
AVG(COUNT(customer_name)) OVER () avg_cnt
FROM country co
INNER JOIN city ci ON co.id = ci.country_id
INNER JOIN customer cu ON ci.id = cu.city_id
GROUP BY co.country_name
)
SELECT country_name, cnt
FROM cte
WHERE cnt > avg_cnt;
In the CTE above, as we turn out the counts for each country name, we also compute the average count across your entire original result set. Then, all we have to is subquery and compare each count against the average.
Solution 2:[2]
The logic is same, fetching queries differ.
Write a query which will return all city name, country name and number of customers with more customers than the average number of customers of all cities in ascending order.
SELECT city.city_name, country.country_name, COUNT(customer.id) AS
num_customer
FROM city
JOIN country ON city.country_id = country.id
JOIN customer ON city.id = customer.city_id
GROUP BY city.id
HAVING num_customer > (SELECT AVG(num_customer) FROM (SELECT city.id,
COUNT(customer.id) AS num_customer
FROM city
JOIN country ON city.country_id = country.id
JOIN customer ON city.id = customer.city_id
GROUP BY city.id) AS temp)
ORDER BY country.country_name;
Solution 3:[3]
I tried below query and it worked for me.
WITH CTE AS(
SELECT DISTINCT CO.country_name, Ci.city_name,COUNT(city_id) AS cnt
, AVG(COUNT(city_id)) OVER() avg_cnt
FROM COUNTRY CO
INNER JOIN City CI ON CI.country_id = CO.id
INNER JOIN Customer CU ON CU.city_id = CI.id
GROUP BY CO.country_name,CI.city_name
)
SELECT country_name,city_name, cnt
FROM CTE WHERE cnt > avg_cnt ORDER BY country_name
Let me know if this works for you. Thanks
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Tim Biegeleisen |
Solution 2 | Lokesh Patel |
Solution 3 | Kumar |