'Query Slower with different value for a parameter postgress

I´m with an issue where hibernate generates the following query on an existing view.

select
    count(proclistvw0_.proc_id) as col_0_0_
from proc_list_vw proclistvw0_
where 1=1
  and (    proclistvw0_.sub_group_edp_company = $1
           or
           proclistvw0_.sub_group_edp_company = $2
      )
  and (proclistvw0_.proc_country in ($3))
  and proclistvw0_.sub_group_edp_company = $4
  ;

Keeping the parameters 1,2 and 4 as constants changing only the 3 from 'ESP' to to 'BRA' the query will return almost the same amount of results but the time to do so will increase from 3 seconds to 5 min one thing that I could not understand for the slow query is that for some reason it seems to do all the work in 7 seconds but it hangs for the rest of the time.

Here the analyse plan for the slow query

Aggregate  (cost=165310.40..165310.41 rows=1 width=8) (actual time=379605.458..379605.486 rows=1 loops=1)
  ->  Nested Loop  (cost=10712.71..165310.40 rows=1 width=4) (actual time=2385.256..379605.125 rows=146 loops=1)
        ->  Nested Loop  (cost=10712.56..165310.24 rows=1 width=8) (actual time=2385.225..379603.816 rows=146 loops=1)
              ->  Nested Loop  (cost=10712.29..165309.94 rows=1 width=13) (actual time=2385.178..379600.458 rows=146 loops=1)
                    Join Filter: (pp.id = pqr.proc_id)
                    Rows Removed by Join Filter: 32081164
                    ->  Nested Loop Left Join  (cost=10711.28..19598.32 rows=1 width=8) (actual time=308.508..7085.120 rows=146 loops=1)
                          Join Filter: (pp.id = d.proc_id)
                          Rows Removed by Join Filter: 394052
                          ->  Hash Join  (cost=10710.10..15633.72 rows=1 width=8) (actual time=262.425..350.010 rows=146 loops=1)
                                Hash Cond: (t4cp_1.proc_id = pp.id)
                                ->  HashAggregate  (cost=10696.18..12872.36 rows=217618 width=36) (actual time=262.094..333.606 rows=219734 loops=1)
                                      Group Key: t4cp_1.proc_id
                                      Batches: 1  Memory Usage: 28689kB
                                      ->  Hash Right Join  (cost=6055.89..10131.17 rows=226002 width=4) (actual time=45.110..178.231 rows=261886 loops=1)
                                            Hash Cond: (t4cp_1.per_id = t4.id)
                                            ->  Seq Scan on process_person t4cp_1  (cost=0.00..3482.02 rows=226002 width=8) (actual time=0.012..15.687 rows=226012 loops=1)
                                            ->  Hash  (cost=3579.72..3579.72 rows=198093 width=4) (actual time=44.427..44.428 rows=198124 loops=1)
                                                  Buckets: 262144  Batches: 1  Memory Usage: 9014kB
                                                  ->  Index Only Scan using person_id_index1 on person t4  (cost=0.42..3579.72 rows=198093 width=4) (actual time=0.011..15.006 rows=198124 loops=1)
                                                        Heap Fetches: 448
                                ->  Hash  (cost=13.91..13.91 rows=1 width=4) (actual time=0.276..0.279 rows=146 loops=1)
                                      Buckets: 1024  Batches: 1  Memory Usage: 14kB
                                      ->  Bitmap Heap Scan on process pp  (cost=12.79..13.91 rows=1 width=4) (actual time=0.125..0.258 rows=146 loops=1)
                                            Recheck Cond: (((country_cd)::text = 'BRA'::text) AND ((sub_group_edp_company_cd)::text = 'EDPR'::text))
"                                            Filter: ((COALESCE(status_cd, ' '::character varying))::text <> 'CANCELLED'::text)"
                                            Rows Removed by Filter: 26
                                            Heap Blocks: exact=107
                                            ->  BitmapAnd  (cost=12.79..12.79 rows=1 width=0) (actual time=0.111..0.113 rows=0 loops=1)
                                                  ->  Bitmap Index Scan on process_country_cd_index1  (cost=0.00..2.75 rows=180 width=0) (actual time=0.023..0.023 rows=172 loops=1)
                                                        Index Cond: ((country_cd)::text = 'BRA'::text)
                                                  ->  Bitmap Index Scan on process_sub_group_edp_company_cd_index1  (cost=0.00..9.79 rows=1120 width=0) (actual time=0.084..0.084 rows=1121 loops=1)
                                                        Index Cond: ((sub_group_edp_company_cd)::text = 'EDPR'::text)
                          ->  Subquery Scan on d  (cost=1.18..3962.26 rows=187 width=4) (actual time=0.077..45.913 rows=2699 loops=146)
                                Filter: (d.seq = 1)
                                Rows Removed by Filter: 34680
                                ->  WindowAgg  (cost=1.18..3495.04 rows=37378 width=379) (actual time=0.076..43.617 rows=37379 loops=146)
                                      ->  Incremental Sort  (cost=1.18..2654.03 rows=37378 width=17) (actual time=0.073..19.633 rows=37379 loops=146)
"                                            Sort Key: tb.proc_id, tb.period_year DESC, tb.period_id DESC"
                                            Presorted Key: tb.proc_id
                                            Full-sort Groups: 132568  Sort Method: quicksort  Average Memory: 27kB  Peak Memory: 27kB
                                            ->  Index Only Scan using total_balance_idx_2 on total_balance tb  (cost=0.41..864.58 rows=37378 width=17) (actual time=0.018..5.726 rows=37379 loops=146)
                                                  Index Cond: (risk_type = 'PROB'::text)
                                                  Heap Fetches: 2190
                    ->  Subquery Scan on pqr  (cost=1.01..145648.77 rows=5028 width=13) (actual time=0.096..2534.723 rows=219735 loops=146)
                          Filter: (pqr.seq = 1)
                          Rows Removed by Filter: 786096
                          ->  WindowAgg  (cost=1.01..133078.51 rows=1005621 width=5109) (actual time=0.096..2466.654 rows=1005831 loops=146)
                                ->  Incremental Sort  (cost=1.01..112966.09 rows=1005621 width=25) (actual time=0.092..1305.446 rows=1005831 loops=146)
"                                      Sort Key: qualitative_record.proc_id, qualitative_record.value_date DESC, qualitative_record.cre_dt DESC"
                                      Presorted Key: qualitative_record.proc_id
                                      Full-sort Groups: 4159102  Sort Method: quicksort  Average Memory: 27kB  Peak Memory: 27kB
                                      Pre-sorted Groups: 30514  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
                                      ->  Index Scan using idx_pqr_proc_id on qualitative_record  (cost=0.42..71733.43 rows=1005621 width=25) (actual time=0.014..1063.333 rows=1005831 loops=146)
"              ->  Index Only Scan using ""COMPANY_EDP_pkey"" on company_edp c  (cost=0.28..0.30 rows=1 width=7) (actual time=0.015..0.015 rows=1 loops=146)"
                    Index Cond: (code = (pqr.comp_code)::text)
                    Heap Fetches: 0
        ->  Index Only Scan using stakeholder_id_index1 on stakeholder sh  (cost=0.15..0.17 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=146)
"              Index Cond: (id = COALESCE(pqr.stkh_resp, 0))"
              Heap Fetches: 146
Planning Time: 3.656 ms
Execution Time: 379609.145 ms

the Analyse for the fast query

Aggregate  (cost=165595.31..165595.32 rows=1 width=8) (actual time=2896.812..2896.821 rows=1 loops=1)
  ->  Nested Loop  (cost=15971.78..165595.31 rows=3 width=4) (actual time=1501.960..2896.729 rows=160 loops=1)
        ->  Nested Loop  (cost=15971.63..165594.81 rows=3 width=8) (actual time=1501.922..2896.228 rows=160 loops=1)
              ->  Merge Right Join  (cost=15971.35..165593.92 rows=3 width=13) (actual time=1501.876..2895.188 rows=160 loops=1)
                    Merge Cond: (d.proc_id = pp.id)
                    ->  Subquery Scan on d  (cost=1.18..3962.26 rows=187 width=4) (actual time=0.076..48.049 rows=2693 loops=1)
                          Filter: (d.seq = 1)
                          Rows Removed by Filter: 34680
                          ->  WindowAgg  (cost=1.18..3495.04 rows=37378 width=379) (actual time=0.075..45.719 rows=37373 loops=1)
                                ->  Incremental Sort  (cost=1.18..2654.03 rows=37378 width=17) (actual time=0.064..21.045 rows=37374 loops=1)
"                                      Sort Key: tb.proc_id, tb.period_year DESC, tb.period_id DESC"
                                      Presorted Key: tb.proc_id
                                      Full-sort Groups: 907  Sort Method: quicksort  Average Memory: 27kB  Peak Memory: 27kB
                                      ->  Index Only Scan using total_balance_idx_2 on total_balance tb  (cost=0.41..864.58 rows=37378 width=17) (actual time=0.027..5.981 rows=37378 loops=1)
                                            Index Cond: (risk_type = 'PROB'::text)
                                            Heap Fetches: 14
                    ->  Materialize  (cost=15970.18..161631.17 rows=3 width=17) (actual time=1482.552..2846.829 rows=160 loops=1)
                          ->  Merge Join  (cost=15970.18..161631.17 rows=3 width=17) (actual time=1482.551..2846.750 rows=160 loops=1)
                                Merge Cond: (pqr.proc_id = pp.id)
                                ->  Subquery Scan on pqr  (cost=1.01..145648.77 rows=5028 width=13) (actual time=0.111..2496.949 rows=219605 loops=1)
                                      Filter: (pqr.seq = 1)
                                      Rows Removed by Filter: 785840
                                      ->  WindowAgg  (cost=1.01..133078.51 rows=1005621 width=5109) (actual time=0.110..2427.881 rows=1005445 loops=1)
                                            ->  Incremental Sort  (cost=1.01..112966.09 rows=1005621 width=25) (actual time=0.101..1265.385 rows=1005446 loops=1)
"                                                  Sort Key: qualitative_record.proc_id, qualitative_record.value_date DESC, qualitative_record.cre_dt DESC"
                                                  Presorted Key: qualitative_record.proc_id
                                                  Full-sort Groups: 28476  Sort Method: quicksort  Average Memory: 27kB  Peak Memory: 27kB
                                                  Pre-sorted Groups: 209  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
                                                  ->  Index Scan using idx_pqr_proc_id on qualitative_record  (cost=0.42..71733.43 rows=1005621 width=25) (actual time=0.015..1019.817 rows=1005471 loops=1)
                                ->  Sort  (cost=15969.17..15969.48 rows=126 width=8) (actual time=337.498..337.561 rows=160 loops=1)
                                      Sort Key: pp.id
                                      Sort Method: quicksort  Memory: 32kB
                                      ->  Hash Join  (cost=11041.15..15964.77 rows=126 width=8) (actual time=270.708..337.422 rows=160 loops=1)
                                            Hash Cond: (t4cp_1.proc_id = pp.id)
                                            ->  HashAggregate  (cost=10696.18..12872.36 rows=217618 width=36) (actual time=269.564..323.144 rows=219734 loops=1)
                                                  Group Key: t4cp_1.proc_id
                                                  Batches: 1  Memory Usage: 28689kB
                                                  ->  Hash Right Join  (cost=6055.89..10131.17 rows=226002 width=4) (actual time=54.069..184.369 rows=261886 loops=1)
                                                        Hash Cond: (t4cp_1.per_id = t4.id)
                                                        ->  Seq Scan on process_person t4cp_1  (cost=0.00..3482.02 rows=226002 width=8) (actual time=0.011..15.682 rows=226012 loops=1)
                                                        ->  Hash  (cost=3579.72..3579.72 rows=198093 width=4) (actual time=53.309..53.310 rows=198124 loops=1)
                                                              Buckets: 262144  Batches: 1  Memory Usage: 9014kB
                                                              ->  Index Only Scan using person_id_index1 on person t4  (cost=0.42..3579.72 rows=198093 width=4) (actual time=0.016..17.161 rows=198124 loops=1)
                                                                    Heap Fetches: 448
                                            ->  Hash  (cost=343.39..343.39 rows=127 width=4) (actual time=0.889..0.890 rows=160 loops=1)
                                                  Buckets: 1024  Batches: 1  Memory Usage: 14kB
                                                  ->  Index Scan using process_sub_group_edp_company_cd_index1 on process pp  (cost=0.29..343.39 rows=127 width=4) (actual time=0.027..0.865 rows=160 loops=1)
                                                        Index Cond: ((sub_group_edp_company_cd)::text = 'EDPR'::text)
"                                                        Filter: (((COALESCE(status_cd, ' '::character varying))::text <> 'CANCELLED'::text) AND ((country_cd)::text = 'ESP'::text))"
                                                        Rows Removed by Filter: 960
"              ->  Index Only Scan using ""COMPANY_EDP_pkey"" on company_edp c  (cost=0.28..0.30 rows=1 width=7) (actual time=0.005..0.005 rows=1 loops=160)"
                    Index Cond: (code = (pqr.comp_code)::text)
                    Heap Fetches: 0
        ->  Index Only Scan using stakeholder_id_index1 on stakeholder sh  (cost=0.15..0.17 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=160)
"              Index Cond: (id = COALESCE(pqr.stkh_resp, 0))"
              Heap Fetches: 160
Planning Time: 3.860 ms
Execution Time: 2900.956 ms


Solution 1:[1]

Just to share what I found out, In the end, in my case, the solution was an "order by id" on the temporary tables before the left join by id in the view. With this, the time stabilized around 2.500 ms.

Hope that this could help someone in the future.

Solution 2:[2]

The planner thinks country_cd and sub_group_edp_company_cd are independent, but in fact they are not, as almost everything with country_cd = 'BRA' also has sub_group_edp_company_cd = 'EDPR'. So it thinks there will be 1 qualifying row, when in fact there are 146 of them. This leads to poor choices elsewhere in the plan. If it knew it would need to execute "Subquery Scan on pqr" 146 times rather than one time, then it would have taken steps to make it faster, like hashing it or materializing it.

Creating a multicolumn statistics on those two columns might be enough to fix the problem.

for some reason it seems to do all the work in 7 seconds but it hangs for the rest of the time.

You are misunderstanding the plan. But without knowing why you think that, I don't know what part of it you are misunderstanding.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 D Saggin
Solution 2 jjanes