Crisp5 min readGo deeper →

Query planner and EXPLAIN

Postgres compiles your SQL into a tree of physical operators chosen by a cost model. EXPLAIN ANALYZE shows you what it actually ran and how long each node took.

Postgres takes your SQL, parses it, rewrites it, then picks the cheapest execution plan from millions of possibilities. The chosen plan is a tree of operators: scans, joins, sorts, aggregates. EXPLAIN shows the plan. EXPLAIN ANALYZE runs it and shows actual time per node.

The pipeline

Parse: SQL string to parse tree
Rewrite: apply views, rules
Plan: cost-based optimizer picks operator tree
Execute: pull rows up through the tree

The planner uses statistics from pg_statistic (collected by ANALYZE) to estimate selectivity. It picks the lowest-cost plan based on seq_page_cost, random_page_cost, cpu_tuple_cost, etc.

Reading EXPLAIN

EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 42;

Index Scan using idx_orders_user on orders  (cost=0.43..8.45 rows=1 width=64) (actual time=0.012..0.025 rows=2 loops=1)
  Index Cond: (user_id = 42)
Planning Time: 0.1 ms
Execution Time: 0.05 ms

Read inside out:

Index Scan: walked the index, fetched rows
cost=0.43..8.45: estimated startup cost..total cost in arbitrary units
rows=1: estimated row count
actual rows=2: real row count. Big gap means stats are off.
loops=1: how many times this node ran (matters for nested loops)

Postgres query lifecycle

What to look for

Estimated vs actual rows: 10x divergence means ANALYZE is needed or stats target is too low
Seq Scan on big table: usually bad. Maybe missing index, maybe planner correct.
Nested Loop with high outer rows: O(n*m) blow-up
Sort spilling to disk: Sort Method: external merge means work_mem too low
Bitmap Heap Scan with high lossy: page-level bitmap dropped detail, doing recheck on whole pages

Useful options

EXPLAIN (ANALYZE, BUFFERS, VERBOSE, FORMAT JSON) SELECT ...

ANALYZE: actually run the query
BUFFERS: show shared buffer hits and reads (cache efficiency)
VERBOSE: show output columns per node
FORMAT JSON: machine readable, paste into depesz.com

The interview line

"Postgres compiles SQL into a tree of physical operators. The planner picks the tree using a cost model based on statistics from ANALYZE. EXPLAIN ANALYZE shows the chosen plan with actual row counts and times. The key things I look for are estimated vs actual row mismatches, sort spills to disk, and unexpected sequential scans."

Learn more

Docs
PostgreSQL: Using EXPLAINPostgreSQL
Article
explain.depesz.comDepesz