Database PostgreSQL

Postgres Covering Indexes and the Visibility Map

6 min read

Aug 6, 2020 12:00:00 AM

Covering indexes in Postgres came up in our weekly tech call because of a query optimization review one of the teams has in progress. It seemed like some covering indexes weren't working. I've been reading a book called Curious Moon, a work of fiction about Postgres (it's as delightful as it sounds). Since I had the enceladus dataset already loaded, I used it to answer some questions about this kind of index.

Understanding Index-Only Scans and Visibility

Postgres handles covering indexes through the "index-only scans" feature. The requirements are similar to those of most RDBMSs. The index type matters (currently supported only in btree or gist indexes) and the index has to fully cover the columns requested by the query. However, there's an additional requirement related to a query optimization. The page block has to be fully "visible" to the query's mvcc snapshot. Data that hasn't changed recently has a better chance of remaining stable among older and newer transactions on the table. This information is stored in a visibility map adjacent to the table data. Postgres of course has a robust set of extensions, and pg_visibility helps give us insight into the stats around the visibility map. No surprises on installing the extension. This is also one that is available in Cloud SQL for Postgres.

enceladus=# create extension pg_visibility;

I'll import the master_plan data from the Cassini mission:

enceladus=# COPY master_plan FROM '.../data/master_plan.csv' WITH DELIMITER ',' HEADER CSV;
 COPY 61873
 
 enceladus=# select * from master_plan limit 1;
 -[ RECORD 1 ]------+---------------------------
 start_time_utc | 2004-135T18:40:00
 duration | 000T09:22:00
 date | 14-May-04
 team | CAPS
 spass_type | Non-SPASS
 target | Saturn
 request_name | SURVEY
 library_definition | Magnetospheric survey
 title | MAPS Survey
 description | MAPS magnetospheric survey

Managing Visibility with Vacuum

At this point, all the visibility flags are off. This makes sense, because we just performed a bunch of inserts. The vacuum process turns them all on:

enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count | all_visible
 -------+-------------
 1959 | f
 
 enceladus=# vacuum master_plan ;
 VACUUM
 
 enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count | all_visible
 -------+-------------
 1959 | t

Let's perform a lot of inserts. I'm using the "watch" option here so I don't have to type as much. Afterwards, a lot of the visibility flags are marked as off:

enceladus=# COPY master_plan FROM '.../data/master_plan.csv' WITH DELIMITER ',' HEADER CSV;
 COPY 61873
 
 enceladus=# \watch
 Fri Jul 24 16:26:52 2020 (every 2s)
 ...
 enceladus=# select count(*) from master_plan;
 -[ RECORD 1 ]--
 count | 3093650
 
 enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count | all_visible
 -------+-------------
 97868 | f
 1941 | t

Analyzing the Query Plan

Before I add the covering index, I'll run my intended query. By the way, don't take the absolute times into account here. This is a test on a puny laptop. I'm mainly trying to demonstrate the absence or presence of "index only scan" in the explain plan (notice the "Parallel Seq Scan" below):

time psql -U postgres -d enceladus -c"explain analyze select team, spass_type, target from master_plan where team='CDA' or team='RSS' order by team, target"
 
 QUERY PLAN
 --------------------------------------------------------------------------------------------------------------------------------------------
 Gather Merge (cost=133280.23..161724.62 rows=243792 width=22) (actual time=5075.047..5244.599 rows=286500 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 -> Sort (cost=132280.21..132584.95 rows=121896 width=22) (actual time=5069.007..5103.744 rows=95500 loops=3)
 Sort Key: team, target
 Sort Method: external merge Disk: 3352kB
 Worker 0: Sort Method: external merge Disk: 3448kB
 Worker 1: Sort Method: external merge Disk: 3432kB
 -> Parallel Seq Scan on master_plan (cost=0.00..119480.37 rows=121896 width=22) (actual time=0.996..4877.452 rows=95500 loops=3)
 Filter: ((team = 'CDA'::text) OR (team = 'RSS'::text))
 Rows Removed by Filter: 935717
 Planning Time: 0.419 ms
 Execution Time: 5272.250 ms
 (13 rows)
 real 0m6.450s
 user 0m0.012s
 sys 0m0.027s

Let's add an index we hope to provide index-only queries:

CREATE INDEX idx_tpt ON public.master_plan USING btree (team, spass_type, target);

It doesn't look like it helped. I still see "Parallel Seq Scan" below:

time psql -U postgres -d enceladus -c"explain analyze select team, spass_type, target from master_plan where team='CDA' or team='RSS' order by team, target"
 
 QUERY PLAN
 --------------------------------------------------------------------------------------------------------------------------------------------
 Gather Merge (cost=132711.41..160669.96 rows=239628 width=22) (actual time=4746.679..4915.888 rows=286500 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 -> Sort (cost=131711.39..132010.92 rows=119814 width=22) (actual time=4722.466..4757.252 rows=95500 loops=3)
 Sort Key: team, target
 Sort Method: external merge Disk: 3424kB
 Worker 0: Sort Method: external merge Disk: 3416kB
 Worker 1: Sort Method: external merge Disk: 3392kB
 -> Parallel Seq Scan on master_plan (cost=0.00..119144.31 rows=119814 width=22) (actual time=3.395..4533.381 rows=95500 loops=3)
 Filter: ((team = 'CDA'::text) OR (team = 'RSS'::text))
 Rows Removed by Filter: 935717
 Planning Time: 21.649 ms
 Execution Time: 4943.530 ms
 (13 rows)
 real 0m7.130s
 user 0m0.013s
 sys 0m0.038s

Restoring Performance with pg_visibility

Remember that Postgres-specific requirement regarding the visibility flag for covering indexes?

enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count | all_visible
 -------+-------------
 97868 | f
 1941 | t

Clean up the flags:

enceladus=# vacuum master_plan ;
 VACUUM
 
 enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count | all_visible
 -------+-------------
 97947 | t

Now we see the index-only scan show up (notice "Index Only Scan" below):

time psql -U postgres -d enceladus -c"explain analyze select team, spass_type, target from master_plan where team='CDA' or team='RSS' order by team, target"
 
 QUERY PLAN
 -----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather Merge (cost=114722.12..142667.60 rows=239516 width=22) (actual time=461.861..612.337 rows=286500 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 -> Sort (cost=113722.09..114021.49 rows=119758 width=22) (actual time=455.925..482.057 rows=95500 loops=3)
 Sort Key: team, target
 Sort Method: external merge Disk: 4080kB
 Worker 0: Sort Method: external merge Disk: 3080kB
 Worker 1: Sort Method: external merge Disk: 3072kB
 -> Parallel Index Only Scan using idx_tpt on master_plan (cost=0.43..101163.65 rows=119758 width=22) (actual time=24.874..358.271 rows=95500 loops=3)
 Filter: ((team = 'CDA'::text) OR (team = 'RSS'::text))
 Rows Removed by Filter: 935717
 Heap Fetches: 0
 Planning Time: 0.538 ms
 Execution Time: 640.655 ms
 (14 rows)
 real 0m1.065s
 user 0m0.009s
 sys 0m0.012s

The problem returns if you perform a lot of updates. Back to "Seq Scan" below:

enceladus=# update master_plan set target='Saturn2' where target='Saturn';
 UPDATE 847900
 
 enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count | all_visible
 --------+-------------
 119439 | f
 4627 | t
 
 time psql -U postgres -d enceladus -c"explain analyze select team, spass_type, target from master_plan where team='CDA' or team='RSS' order by team, target"
 
 QUERY PLAN
 ---------------------------------------------------------------------------------------------------------------------------------------------
 Gather Merge (cost=165711.17..201108.67 rows=303386 width=22) (actual time=14423.062..14960.570 rows=286500 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 -> Sort (cost=164711.14..165090.37 rows=151693 width=22) (actual time=14414.109..14449.936 rows=95500 loops=3)
 Sort Key: team, target
 Sort Method: external merge Disk: 3456kB
 Worker 0: Sort Method: external merge Disk: 3432kB
 Worker 1: Sort Method: external merge Disk: 3360kB
 -> Parallel Seq Scan on master_plan (cost=0.00..148545.86 rows=151693 width=22) (actual time=0.113..14210.193 rows=95500 loops=3)
 Filter: ((team = 'CDA'::text) OR (team = 'RSS'::text))
 Rows Removed by Filter: 935717
 Planning Time: 8.331 ms
 Execution Time: 15011.649 ms
 (13 rows)
 real 0m17.167s
 user 0m0.014s
 sys 0m0.044s

We can clean it up again and get our index-only scans (and performance) back:

enceladus=# vacuum master_plan ;
 VACUUM
 
 enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count | all_visible
 --------+-------------
 124066 | t
 
 time psql -U postgres -d enceladus -c"explain analyze select team, spass_type, target from master_plan where team='CDA' or team='RSS' order by team, target"
 
 QUERY PLAN
 ------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather Merge (cost=135747.09..163705.64 rows=239628 width=22) (actual time=1822.618..1963.040 rows=286500 loops=1)
 Workers Planned: 2
 Workers Launched: 2
 -> Sort (cost=134747.07..135046.60 rows=119814 width=22) (actual time=1811.738..1835.136 rows=95500 loops=3)
 Sort Key: team, target
 Sort Method: external merge Disk: 3640kB
 Worker 0: Sort Method: external merge Disk: 3336kB
 Worker 1: Sort Method: external merge Disk: 3280kB
 -> Parallel Index Only Scan using idx_tpt on master_plan (cost=0.43..122179.99 rows=119814 width=22) (actual time=50.554..1716.339 rows=95500 loops=3)
 Filter: ((team = 'CDA'::text) OR (team = 'RSS'::text))
 Rows Removed by Filter: 935717
 Heap Fetches: 0
 Planning Time: 4.707 ms
 Execution Time: 2005.764 ms
 (14 rows)
 real 0m2.710s
 user 0m0.011s
 sys 0m0.015s

Best Practices and Maintenance

Moral: You knew this already, but make sure you're vacuuming frequently enough, especially if you're doing a lot of inserts or updates. The default vacuum ratio is generally too high, so you could lower that and autovacuum will likely help you out. If you are seeing that covering indexes don't seem to be working, a likely reason is that you aren't running vacuums frequently enough on quickly changing datasets. Here are some other queries to use with the extension. This one does an in-depth check to find out if the visibility map is corrupt (maybe due to a previous database crash). We want zero rows found:

enceladus=# select * from pg_check_visible('master_plan');
 
 t_ctid
 --------
 (0 rows)

If you do discover that a visibility map is corrupt, rebuild it using the pg_truncate_visibility_map function, and then vacuum.

enceladus=# select pg_truncate_visibility_map('master_plan');
 
 enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count | all_visible
 --------+-------------
 124066 | f
 
 enceladus=# vacuum master_plan ;
 VACUUM
 
 enceladus=# select count(blkno), all_visible from pg_visibility_map('master_plan') group by all_visible;
 count | all_visible
 --------+-------------
 124066 | t