'SQL: Count more than average HAVING vs. WHERE

I had a fairly straightforward interview question: return all countries that have more customers than the average number of customers of all cities.

country table:

id country_name
1  Austria
2  Germany
3  United Kingdom

city table:

id city_name country_id
1  Wien      1
2  Berlin    2
3  Hamburg   2
4  London    3

customer table:

id customer_name city_id
1  cust1         1
2  cust2         4
3  cust3         3
4  cust4         1
5  cust5         2
6  cust6         1
7  cust7         4
8  cust8         2

Below is my solution. My answer was close but didn't pass the test case because mine returned more countries than the expected output. Do you know why, or how would you approach this problem?

SELECT 
    co.country_name, COUNT(customer_name)
FROM
    country co
        JOIN
    city ci ON co.id = ci.country_id
        JOIN
    customer cu ON ci.id = cu.city_id
GROUP BY co.country_name
HAVING COUNT(customer_name) > (SELECT 
        AVG(temp.cnt)
    FROM
        (SELECT 
            ci.id, COUNT(customer_name) AS cnt
        FROM
            city ci
        JOIN customer cu ON ci.id = cu.city_id
        GROUP BY ci.id) temp)
ORDER BY co.country_name ASC;

Users in this thread SQL: Count values higher than average for a group are using WHERE instead of HAVING to select value bigger than average. I wonder if HAVING is not suitable for this type of problem.

sql


Solution 1:[1]

Your current query is one valid way to solve this problem, but maybe the recipient of your answer was expecting a different solution. A more modern way of doing this might involve analytic functions:

WITH cte AS (
    SELECT co.country_name, COUNT(customer_name) cnt,
           AVG(COUNT(customer_name)) OVER () avg_cnt
    FROM country co
    INNER JOIN city ci ON co.id = ci.country_id
    INNER JOIN customer cu ON ci.id = cu.city_id
    GROUP BY co.country_name
)

SELECT country_name, cnt
FROM cte
WHERE cnt > avg_cnt;

In the CTE above, as we turn out the counts for each country name, we also compute the average count across your entire original result set. Then, all we have to is subquery and compare each count against the average.

Solution 2:[2]

The logic is same, fetching queries differ.

Write a query which will return all city name, country name and number of customers with more customers than the average number of customers of all cities in ascending order.

SELECT city.city_name, country.country_name, COUNT(customer.id) AS 
num_customer
FROM city
JOIN country ON city.country_id = country.id
JOIN customer ON city.id = customer.city_id
GROUP BY city.id
HAVING num_customer > (SELECT AVG(num_customer) FROM (SELECT city.id, 
COUNT(customer.id) AS num_customer
FROM city
JOIN country ON city.country_id = country.id
JOIN customer ON city.id = customer.city_id
GROUP BY city.id) AS temp)
ORDER BY country.country_name;

Solution 3:[3]

I tried below query and it worked for me.

WITH CTE AS(
SELECT DISTINCT CO.country_name, Ci.city_name,COUNT(city_id) AS cnt
, AVG(COUNT(city_id)) OVER() avg_cnt
FROM COUNTRY CO
INNER JOIN City CI ON CI.country_id = CO.id
INNER JOIN Customer CU ON CU.city_id = CI.id
GROUP BY CO.country_name,CI.city_name
)

SELECT country_name,city_name, cnt
FROM CTE WHERE cnt > avg_cnt ORDER BY country_name

Let me know if this works for you. Thanks

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Tim Biegeleisen
Solution 2 Lokesh Patel
Solution 3 Kumar