DEV Community

PostgreSQL query planner parameters and prepared statements

PostgreSQL query planner parameters and prepared statements

PostgreSQL provides several planner configuration parameters, such as enable_seqscan and enable_indexscan, that influence how execution plans are generated. These settings affect planning, not the execution of an already-generated plan. With prepared statements, this raises an interesting question: should planner settings be applied before PREPARE, before EXECUTE, or both?

Let's look at a simple example: a "tasks" table with a due date and a "done" status:

\c
drop table if exists tasks;

-- a table of tasks with status (done or not) and due date
create table tasks (
  id bigint generated always as identity primary key,
  due timestamptz,
  done boolean
);

-- insert 500 tasks, with 1% not done
insert into tasks (due, done)
select now() + interval '1 day' * n, 42 != n % 100
from generate_series(1, 500) n;

-- index the todo (partial index)
create index on tasks (due, id) where done = false;

vacuum analyze tasks;

With a partial index, I indexed only the tasks that are not yet done (done = false) because that's my most frequent query pattern:

postgres=# explain select id, due, done from tasks where done = false and id > 0 order by due limit 1;
                                        QUERY PLAN
---------------------------------------------------------------------------------------
 Limit  (cost=0.13..3.60 rows=1 width=17)
   ->  Index Scan using tasks_due_id_idx1 on tasks  (cost=0.13..17.47 rows=5 width=17)
         Index Cond: (id > 0)
(3 rows)

With partial indexes, the condition covered by the index is not even visible in the execution plan because the index itself enforces the condition.

Prepared statement

I decided to use a prepared statement with all values as parameters. It is probably not a good idea in this case. When a parameter can have only a few different values and you expect different cardinalities for each, you should probably define one query per value, using literals. I'm doing this to illustrate what can happen, with a simple, extreme example:

postgres=# prepare c (boolean, int) as
  select id, due, done from tasks where done = $1 and id > 0 order by due limit $2;
PREPARE

postgres=# explain (analyze, settings) execute c (false, 1);
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.13..3.60 rows=1 width=17) (actual time=0.087..0.088 rows=1.00 loops=1)
   Buffers: shared hit=4 read=1
   ->  Index Scan using tasks_due_id_idx on tasks  (cost=0.13..17.47 rows=5 width=17) (actual time=0.087..0.087 rows=1.00 loops=1)
         Index Cond: (id > 0)
         Index Searches: 1
         Buffers: shared hit=4 read=1
 Planning:
   Buffers: shared hit=3
 Planning Time: 0.148 ms
 Execution Time: 0.099 ms
(10 rows)

With the same prepared statement, I disabled the Index Scan before the next execution:

postgres=# set enable_indexscan to off;
SET

postgres=# explain (analyze, settings) execute c (false, 1);
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Limit  (cost=10.28..10.28 rows=1 width=17) (actual time=0.043..0.043 rows=1.00 loops=1)
   Buffers: shared hit=4
   ->  Sort  (cost=10.28..10.29 rows=5 width=17) (actual time=0.042..0.042 rows=1.00 loops=1)
         Sort Key: due
         Sort Method: top-N heapsort  Memory: 25kB
         Buffers: shared hit=4
         ->  Seq Scan on tasks  (cost=0.00..10.25 rows=5 width=17) (actual time=0.012..0.037 rows=5.00 loops=1)
               Filter: ((NOT done) AND (id > 0))
               Rows Removed by Filter: 495
               Buffers: shared hit=4
 Settings: enable_indexscan = 'off'
 Planning:
   Buffers: shared hit=12
 Planning Time: 0.174 ms
 Execution Time: 0.055 ms
(15 rows)

I was still using the same prepared statement, yet the execution plan had changed. PostgreSQL generated a new custom plan for this execution, so it used the planner setting active at EXECUTE time (enable_indexscan = 'off') rather than any setting active when the statement was prepared.

With plan_cache_mode set to the default auto, each execution of the prepared statement goes through the planning phase like a non-prepared statement for at least the first five executions, so the query planner parameters at EXECUTE time drive the planning. The result is a custom plan. After five executions, it may switch to a generic plan if the optimizer estimates it's worth it, comparing the generic plan's cost with the average execution cost of the previous custom plans (including planning overhead).

Because this decision depends on cost estimates, previous execution history, and the generic-versus-custom plan heuristic, plan selection may be less predictable than when plan_cache_mode is explicitly set. I recommend not relying on the auto behavior and instead deciding whether a prepared statement should be generic or custom by setting plan_cache_mode accordingly. Ideally, you should use parameters only when a generic plan is acceptable, and use different prepared statements with literals when the value matters for choosing the optimal access path.

Generic plan

You might expect a generic plan to permanently preserve the planner environment that was in effect at the time the plan was created. Here is the same example with plan_cache_mode set to force_generic_plan:

--- reset the session
postgres=# \c
You are now connected to database "postgres" as user "postgres".

postgres=# \dconfig enable*scan*
List of configuration parameters
     Parameter      | Value
--------------------+-------
 enable_bitmapscan  | on
 enable_indexonlyscan | on
 enable_indexscan   | on
 enable_seqscan     | on
 enable_tidscan     | on
(5 rows)

-- disable auto plan cache mode and set it to generic
postgres=# set plan_cache_mode = force_generic_plan;
SET

postgres=# prepare c (boolean, int) as
  select id, due, done from tasks where done = $1 and id > 0 order by due limit $2;
PREPARE

postgres=# explain (analyze, settings) execute c (false, 1);
                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Limit  (cost=21.46..21.52 rows=25 width=17) (actual time=0.057..0.057 rows=1.00 loops=1)
   Buffers: shared hit=7
   ->  Sort  (cost=21.46..22.08 rows=250 width=17) (actual time=0.055..0.055 rows=1.00 loops=1)
         Sort Key: due
         Sort Method: top-N heapsort  Memory: 25kB
         Buffers: shared hit=7
         ->  Seq Scan on tasks  (cost=0.00..11.50 rows=250 width=17) (actual time=0.010..0.040 rows=5.00 loops=1)
               Filter: ((id > 0) AND (done = $1))
               Rows Removed by Filter: 495
               Buffers: shared hit=4
 Settings: plan_cache_mode = 'force_generic_plan'
 Planning:
   Buffers: shared hit=122
 Planning Time: 0.453 ms
 Execution Time: 0.073 ms
(15 rows)

With the predicate on "done" that can take any value, a generic plan cannot use the partial index that contains entries only for the true value, so the query planner falls back to a Seq Scan. I disable Seq Scan for the next execution:

postgres=# set enable_seqscan to off;
SET

postgres=# explain (analyze, settings) execute c (false, 1);
                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Limit  (cost=21.46..21.52 rows=25 width=17) (actual time=0.059..0.060 rows=1.00 loops=1)
   Buffers: shared hit=4
   ->  Sort  (cost=21.46..22.08 rows=250 width=17) (actual time=0.057..0.057 rows=1.00 loops=1)
         Sort Key: due
         Sort Method: top-N heapsort  Memory: 25kB
         Buffers: shared hit=4
         ->  Seq Scan on tasks  (cost=0.00..11.50 rows=250 width=17) (actual time=0.018..0.051 rows=5.00 loops=1)
               Filter: ((id > 0) AND (done = $1))
               Rows Removed by Filter: 495
               Buffers: shared hit=4
 Settings: plan_cache_mode = 'force_generic_plan', enable_seqscan = 'off'
 Planning Time: 0.013 ms
 Execution Time: 0.080 ms
(13 rows)

The statement was not re-planned. While there's no direct proof, several clues suggest it: Seq Scan persisted even when disabled, even though an alternative access method, such as using the primary key index, is available and would respect the directive, even if not optimal. The Planning section was absent, as seen in the initial EXECUTE after PREPARE, which showed Buffers: shared hit related to catalog lookups. The Planning Time was brief, only covering the time to retrieve the plan from cache. There was no Disabled: true indicator or a very high cost noted in earlier PostgreSQL versions for Seq Scan, indicating that enable_seqscan = 'off' was ineffective in this case.

A potential source of confusion is that enable_seqscan = 'off' appears in the Settings section, even though it was not used to produce the displayed plan. The Settings section shows planner-related GUC values active during EXPLAIN execution, which may differ from those active when a cached generic plan was created.

DDL invalidation

Prepared statements continue to use cached plans when query parameters change, but generic plans might be re-created if invalidated. DDL statements, such as adding a column, invalidate cached plans regardless of whether they depended on that column:

postgres=# alter table tasks add column description text;
ALTER TABLE

postgres=# explain (analyze, settings) execute c (false, 1);
                                                         QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=41.54..41.60 rows=25 width=17) (actual time=0.113..0.114 rows=1.00 loops=1)
   Buffers: shared hit=10
   ->  Sort  (cost=41.54..42.17 rows=250 width=17) (actual time=0.111..0.112 rows=1.00 loops=1)
         Sort Key: due
         Sort Method: top-N heapsort  Memory: 25kB
         Buffers: shared hit=10
         ->  Bitmap Heap Scan on tasks  (cost=20.09..31.59 rows=250 width=17) (actual time=0.069..0.105 rows=5.00 loops=1)
               Recheck Cond: (id > 0)
               Filter: (done = $1)
               Rows Removed by Filter: 495
               Heap Blocks: exact=4
               Buffers: shared hit=10
               ->  Bitmap Index Scan on tasks_pkey  (cost=0.00..20.02 rows=500 width=0) (actual time=0.050..0.050 rows=500.00 loops=1)
                     Index Cond: (id > 0)
                     Index Searches: 1
                     Buffers: shared hit=6
 Settings: plan_cache_mode = 'force_generic_plan', enable_seqscan = 'off'
 Planning:
   Buffers: shared hit=23
 Planning Time: 0.300 ms
 Execution Time: 0.130 ms
(21 rows)

This time, enable_seqscan = 'off' was used because the prepared statement was re-planned, which effectively skipped Seq Scan in favor of a Bitmap Heap Scan. Because the plan is generic, PostgreSQL cannot assume that $1 will always meet the partial index's predicate. Therefore, the partial index cannot be used, but the primary key index tasks_pkey contains entries for all rows and can be used when sequential scan is disabled.

Disabled: true (PostgreSQL 18)

I've run another DDL to remove the primary key and, consequently, the index, and the prepared statement is re-planned:

postgres=# alter table tasks drop constraint tasks_pkey;
ALTER TABLE

postgres=# explain (analyze, settings) execute c (false, 1);
                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Limit  (cost=21.46..21.52 rows=25 width=17) (actual time=0.054..0.055 rows=1.00 loops=1)
   Buffers: shared hit=4
   ->  Sort  (cost=21.46..22.08 rows=250 width=17) (actual time=0.052..0.052 rows=1.00 loops=1)
         Sort Key: due
         Sort Method: top-N heapsort  Memory: 25kB
         Buffers: shared hit=4
         ->  Seq Scan on tasks  (cost=0.00..11.50 rows=250 width=17) (actual time=0.014..0.046 rows=5.00 loops=1)
               Disabled: true
               Filter: ((id > 0) AND (done = $1))
               Rows Removed by Filter: 495
               Buffers: shared hit=4
 Settings: plan_cache_mode = 'force_generic_plan', enable_seqscan = 'off'
 Planning:
   Buffers: shared hit=16 dirtied=2
 Planning Time: 0.251 ms
 Execution Time: 0.069 ms
(16 rows)

The mention of Disabled: true indicates that the disabled node was still in use due to the absence of an alternative. The only index on this table is a partial index, which can be used only with a custom plan when parameter $1 is false, not with a standard plan.

In PostgreSQL 17, instead of Disabled: true, you would see an extremely high cost, indicating that disabled scans are deprioritized:

                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=10000000021.46..10000000021.52 rows=25 width=17) (actual time=0.144..0.145 rows=1 loops=1)
   ->  Sort  (cost=10000000021.46..10000000022.08 rows=250 width=17) (actual time=0.143..0.143 rows=1 loops=1)
         Sort Key: due
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Seq Scan on tasks  (cost=10000000000.00..10000000011.50 rows=250 width=17) (actual time=0.098..0.136 rows=5 loops=1)
               Filter: ((id > 0) AND (done = $1))
               Rows Removed by Filter: 495
 Settings: plan_cache_mode = 'force_generic_plan', enable_seqscan = 'off', jit = 'off'
 Planning Time: 0.014 ms
 Execution Time: 0.167 ms
(10 rows)

It's important to determine if the EXECUTE-time parameters were applied to the plan, as the Settings section can be misleading. It displays the parameters set during the execution explanation, but they only affected the plan if a re-planning occurred. EXPLAIN does not reveal the parameters that established the cached plan.

In an upcoming blog post about pg_plan_advice, I'll share a different approach.

Comments

No comments yet. Start the discussion.