How Do I Estimate the VACUUM Cleanup Rate or How Long It Will Take?

The execution of AUTOVACUUM is divided into several steps. The execution duration of each step is different. The following figure shows the simplified process.

Figure 1 VACUUM process
Click to enlarge

As shown in the preceding figure, the time required to perform VACUUM on a single table is related to the number of table pages, number of index pages, number of global indexes, number of partitions, and I/O control time.

The formula for estimating the speed of VACUUM on a single table is as follows:

Non-partitioned tables
Number of pages = Number of data pages + Number of index pages

Cleanup time = Number of pages × Cleanup duration + Number of pages/Evaluation cost × Sleep time

Cleanup speed = Total number of dead rows/Cleanup time
Partitioned Table
Number of pages = Number of data pages + Number of ordinary index pages + (Total number of global index pages) × Number of partitions

Cleanup time = Number of pages × Cleanup duration + Number of pages/Evaluation cost × Sleep time

Cleanup speed = Total number of dead rows/Cleanup time

You can run the following SQL statement to evaluate the speed of VACUUM on a single table:

Non-partitioned table

select 
((heap_pages * prune_time) + (index_pages * prune_time) * nindexs + ((heap_pages+index_pages) * hit_percent * Vacuum_cost_page_hit + (heap_pages+index_pages) * miss_percent * Vacuum_cost_page_miss + (heap_pages+index_pages) * dirty_percent * Vacuum_cost_page_dirty) * Vacuum_cost_delay / Vacuum_cost_limit) as totaltime, deadtuples/totaltime as speed
from 
(select relpages as heap_pages,  b.n_dead_tup as deadtuples, b.n_dead_tup/(reltuples+b.n_dead_tup +1) as dirty_percent from pg_class a, pg_stat_all_tables b where a.relname = 't1' and b.relname = a.relname) as abc , (select sum(relpages) as index_pages, count(*) as nindexs from pg_class where relname in (select indexname from pg_indexes where tablename = 't1') and relkind = 'i') as bcd , (select 0.07 as prune_time, 0.05 as miss_percent , 0.95 as hit_percent , 1 as Vacuum_cost_page_hit,10 as Vacuum_cost_page_miss , 200 as Vacuum_cost_limit ,20 as Vacuum_cost_delay , 20 as Vacuum_cost_page_dirty);

Partitioned table

select 
((heap_pages * prune_time) + (local_index_pages * prune_time) * nlocalindexs + (gpi_index_pages * prune_time) * npartitions * ngpiindexs + ((heap_pages + local_index_pages + gpi_index_pages) * hit_percent * Vacuum_cost_page_hit + (heap_pages + local_index_pages + gpi_index_pages) * miss_percent * Vacuum_cost_page_miss + (heap_pages + local_index_pages + gpi_index_pages) * dirty_percent * Vacuum_cost_page_dirty) * Vacuum_cost_delay / Vacuum_cost_limit) as total_time,
 dead_tuples/total_time as speed
 from
 (select relpages as heap_pages, b.n_dead_tup as dead_tuples, b.n_dead_tup/(reltuples+b.n_dead_tup+1) as dirty_percent from pg_class a, pg_stat_all_tables b where a.relname = 't1' and b.relname = a.relname) as abc , 
 (select sum(relpages) as local_index_pages, 
 count(*) as nlocalindexs 
 from 
 pg_class where oid in (select indexrelid from pg_index where indrelid=(select oid from pg_class where relname = 't1')) and relkind='i') as bcd , (select sum(relpages) as gpi_index_pages, count(*) as ngpiindexs  from pg_class where oid in (select indexrelid from pg_index where indrelid=(select oid from pg_class where relname = 't1')) and relkind='I') as fenqu , (select count(*) as npartitions from pg_partition where parentid in (select oid from pg_class where relname = 't1'))as qqq ,(select 0.07 as prune_time, 0.05 as miss_percent , 0.95 as hit_percent , 1 as Vacuum_cost_page_hit,10 as Vacuum_cost_page_miss , 200 as Vacuum_cost_limit ,20 as Vacuum_cost_delay , 20 as Vacuum_cost_page_dirty) as yyy;

In the formula, prune_time indicates the time required for cleaning up a tuple. The value is related to the hardware I/O capability. For SSDs, the value is around 0.07 ms. For NVMe SSDs with higher I/O capabilities, set this parameter to a smaller value. For HDDs with lower I/O capabilities, set this parameter to a larger value.
miss_percent and hit_percent indicate the cache hit and miss rates. In scenarios right after data import, the cache hit rate is relatively low. After the system becomes stable, the miss rate is around 0.5 and the hit rate is around 0.95.
The value of vacuum_cost_delay is the same as the value of the GUC parameter autovacuum_vacuum_cost_delay.

In addition, the following progress analysis view is introduced in the current version to obtain the AUTOVACUUM progress and efficiency.

gaussdb=#select * from gs_stat_progress_vacuum(140123447490304);
       pid       | dataname | relid | relname |     phase     |
         detail
-----------------+----------+-------+---------+---------------+---------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------
 140123447490304 | postgres | 24602 | t1      | scanning heap | vacuum db "postgres" starts at 2025-02-09 14:10:17.863838-05, vacuumed 12/242 relations, current r
elation "t1"; heap: scanned 55298/73216 blocks, skipped 0 blocks; info: found 0 dead tuples, already run 610s, delay 608s.
(1 row)

Parent topic: FAQ

Previous topic: How Do I Evaluate Whether the Current Table or Database Requires VACUUM?

Next topic: How Do I Locate Inaccurate Statistics Such as Number of Dead Rows in the Database?