'PostgreSQL slow nested queries
I'm trying to solve a slow query in PostgreSQL. I have a table "element" and a table "relation" The table relation enables to put any items of the table "element" in relation with other items of the same table "element". Another table "subtype" describes the type of the element. I list here only the most important columns for clarity.
Table: element(id, id_subtype, identification_number)
Table: relation(id, type_source, id_source, type_destination, id_destination)
Table: subtype(id, name, code)
I want to list all entries of the table "element" with the following columns:
- Id, identification_number
- a concatenated string of all its relations to other elements
- a concatenated string of all its relations to other elements of the subtype with code = "zone"
- a concatenated string of all its relations to other elements of the subtype with code = "secteur"
I have this query so far
SELECT
e.id, e.name,
string_agg(distinct(elem_identification_number), ', ') as rel_element_string,
string_agg(distinct(elem_zone_identification_number), ', ') as rel_zone_element_string,
string_agg(distinct(elem_sector_identification_number), ', ') as rel_sector_element_string
FROM(
SELECT e.id,
CASE
WHEN elem.id is null THEN null
ELSE concat(s.name, ' ', elem.identification_number)
END AS elem_identification_number,
CASE
WHEN s_zone.id is null THEN null
ELSE elem_zone.identification_number
END AS elem_zone_identification_number,
CASE
WHEN s_sector.id is null THEN null
ELSE elem_sector.identification_number
END AS elem_sector_identification_number
FROM element e
LEFT JOIN relation re ON re.id_source = e.id AND re.type_source = 'element' AND re.type_destination = 'element'
LEFT JOIN element elem ON re.id_destination = elem.id
LEFT JOIN subtype s ON elem.id_subtype = s.id
LEFT JOIN relation re_zone ON re_zone.id_source = e.id AND re_zone.type_source = 'element' AND re_zone.type_destination = 'element' AND re_zone.is_deleted = false
LEFT JOIN element elem_zone ON re_zone.id_destination = elem_zone.id
LEFT JOIN subtype s_zone ON elem_zone.id_subtype = s_zone.id AND s_zone.code = 'zone'
LEFT JOIN relation re_sector ON re_sector.id_source = e.id AND re_sector.type_source = 'element' AND re_sector.type_destination = 'element' AND re_sector.is_deleted = false
LEFT JOIN element elem_sector ON re_sector.id_destination = elem_sector.id
LEFT JOIN subtype s_sector ON elem_sector.id_subtype = s_sector.id AND s_sector.code = 'secteur'
WHERE e.is_deleted = false AND e.id_subtype = 18
UNION ALL
/* Same query but with reveresed id_source - id_destination */
) as e
GROUP BY id, e.identification_number, ...
ORDER BY id DESC";
The query plan of the full request (with all columns) looks like this with the "explain"
https://explain.depesz.com/s/Lk9h
I also have 2 indexes on table "relation"
CREATE INDEX idx_relation
ON public.relation USING btree
(id_chantier ASC NULLS LAST, type_source COLLATE pg_catalog."default" ASC NULLS LAST, id_source ASC NULLS LAST)
CREATE INDEX idx_relation_dest
ON public.relation USING btree
(id_chantier ASC NULLS LAST, type_destination COLLATE pg_catalog."default" ASC NULLS LAST, id_destination ASC NULLS LAST)
Any idea how I can improve the query? Thank you!
Solution 1:[1]
You have a combinatorial explosion here. For example, if each of your string_aggs produces a list of a 100 things for each e, you first have a dataset of 100^3, or a million things, per e before the distinct compacts it back down again.
The way to avoid that is to not write one 10-way join, but rather write 3 correlated subqueries where each subquery has a 3-way join plus a reference to the outer table. Something like:
select e.*,
(select string_agg(...) from relation, element, subtype ...) rel_element_string,
(select string_agg(...) from relation, element, subtype ...) rel_zone_element_string,
(select string_agg(...) from relation, element, subtype ...) rel_sector_element_string
from elements e
WHERE e.is_deleted = false AND e.id_subtype = 18
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | jjanes |