three postgres tricks i keep forgetting

Things I re-discover every few months and wish I had written down.

1. generate_series for test data

Stop writing Python scripts to seed tables. Postgres can do it inline:

INSERT INTO events (user_id, created_at)
SELECT
  (random() * 1000)::int,
  now() - (random() * interval '90 days')
FROM generate_series(1, 100000);

Runs in under a second for 100k rows. Adjust the expression for whatever shape your data needs.
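
The series value itself is handy when you want deterministic rows instead of random ones. A sketch against a hypothetical users table:

-- g is the value produced by generate_series, here used as the primary key
INSERT INTO users (id, email)
SELECT g, 'user_' || g || '@example.com'
FROM generate_series(1, 1000) AS g;

generate_series also works directly over timestamps, e.g. generate_series(now() - interval '90 days', now(), interval '1 hour'), which gives you a one-row-per-bucket skeleton for time-series tests.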

2. explain (analyze, buffers)

EXPLAIN ANALYZE gives you timing. Adding BUFFERS shows how many pages the plan found in shared_buffers versus how many it had to read from outside it:

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders WHERE user_id = 42 ORDER BY created_at DESC LIMIT 20;

Look for a line like Buffers: shared hit=N read=M. hit counts pages found in shared_buffers; read counts pages fetched from outside it (the OS page cache or actual disk). If read dominates, either your working set doesn't fit in shared_buffers, or the query is touching far more pages than its row count suggests, which means the plan or the indexes deserve a closer look.
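
Relatedly, if track_io_timing is enabled, the same EXPLAIN also reports how long those reads actually took, which separates a slow disk from a merely cold cache:

SET track_io_timing = on;

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 42 ORDER BY created_at DESC LIMIT 20;

-- plan nodes that read pages now carry a line like:
--   I/O Timings: read=12.345
-- (exact label varies a little between Postgres versions)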

3. lateral joins for per-row subqueries

For "latest N records per group" queries, LATERAL is cleaner than a window function and often faster:

SELECT u.id, recent.*
FROM users u
CROSS JOIN LATERAL (
  SELECT id, created_at
  FROM events
  WHERE user_id = u.id
  ORDER BY created_at DESC
  LIMIT 5
) recent;

The key is that the subquery can reference columns from the outer query. A row_number() window function can answer the same question, but LATERAL makes the intent obvious, and the planner will often satisfy each per-row lookup with a small scan of an index on (user_id, created_at). One gotcha: CROSS JOIN drops users with no events; use LEFT JOIN LATERAL ... ON true if you want to keep them.
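
For comparison, a roughly equivalent window-function version of the same top-5-per-user query scans all of events once and filters afterwards:

SELECT user_id, id, created_at
FROM (
  SELECT user_id, id, created_at,
         -- rank events newest-first within each user
         row_number() OVER (PARTITION BY user_id ORDER BY created_at DESC) AS rn
  FROM events
) ranked
WHERE rn <= 5;

With an index on events (user_id, created_at DESC), the LATERAL form can instead do one short index scan per user, which is usually where its speed advantage comes from.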