How Do I Estimate the VACUUM Cleanup Rate or How Long It Will Take?
The execution of AUTOVACUUM is divided into several steps. The execution duration of each step is different. The following figure shows the simplified process.

As shown in the preceding figure, the time required to perform VACUUM on a single table is related to the number of table pages, number of index pages, number of global indexes, number of partitions, and I/O control time.
The formula for estimating the speed of VACUUM on a single table is as follows:
- Non-partitioned tables
Number of pages = Number of data pages + Number of index pages
Cleanup time = Number of pages × Cleanup duration + Number of pages/Evaluation cost × Sleep time
Cleanup speed = Total number of dead rows/Cleanup time
- Partitioned Table
Number of pages = Number of data pages + Number of ordinary index pages + (Total number of global index pages) × Number of partitions
Cleanup time = Number of pages × Cleanup duration + Number of pages/Evaluation cost × Sleep time
Cleanup speed = Total number of dead rows/Cleanup time
You can run the following SQL statement to evaluate the speed of VACUUM on a single table:
- Non-partitioned table
select ((heap_pages * prune_time) + (index_pages * prune_time) * nindexs + ((heap_pages+index_pages) * hit_percent * Vacuum_cost_page_hit + (heap_pages+index_pages) * miss_percent * Vacuum_cost_page_miss + (heap_pages+index_pages) * dirty_percent * Vacuum_cost_page_dirty) * Vacuum_cost_delay / Vacuum_cost_limit) as totaltime, deadtuples/totaltime as speed from (select relpages as heap_pages, b.n_dead_tup as deadtuples, b.n_dead_tup/(reltuples+b.n_dead_tup +1) as dirty_percent from pg_class a, pg_stat_all_tables b where a.relname = 't1' and b.relname = a.relname) as abc , (select sum(relpages) as index_pages, count(*) as nindexs from pg_class where relname in (select indexname from pg_indexes where tablename = 't1') and relkind = 'i') as bcd , (select 0.07 as prune_time, 0.05 as miss_percent , 0.95 as hit_percent , 1 as Vacuum_cost_page_hit,10 as Vacuum_cost_page_miss , 200 as Vacuum_cost_limit ,20 as Vacuum_cost_delay , 20 as Vacuum_cost_page_dirty);
- Partitioned table
select ((heap_pages * prune_time) + (local_index_pages * prune_time) * nlocalindexs + (gpi_index_pages * prune_time) * npartitions * ngpiindexs + ((heap_pages + local_index_pages + gpi_index_pages) * hit_percent * Vacuum_cost_page_hit + (heap_pages + local_index_pages + gpi_index_pages) * miss_percent * Vacuum_cost_page_miss + (heap_pages + local_index_pages + gpi_index_pages) * dirty_percent * Vacuum_cost_page_dirty) * Vacuum_cost_delay / Vacuum_cost_limit) as total_time, dead_tuples/total_time as speed from (select relpages as heap_pages, b.n_dead_tup as dead_tuples, b.n_dead_tup/(reltuples+b.n_dead_tup+1) as dirty_percent from pg_class a, pg_stat_all_tables b where a.relname = 't1' and b.relname = a.relname) as abc , (select sum(relpages) as local_index_pages, count(*) as nlocalindexs from pg_class where oid in (select indexrelid from pg_index where indrelid=(select oid from pg_class where relname = 't1')) and relkind='i') as bcd , (select sum(relpages) as gpi_index_pages, count(*) as ngpiindexs from pg_class where oid in (select indexrelid from pg_index where indrelid=(select oid from pg_class where relname = 't1')) and relkind='I') as fenqu , (select count(*) as npartitions from pg_partition where parentid in (select oid from pg_class where relname = 't1'))as qqq ,(select 0.07 as prune_time, 0.05 as miss_percent , 0.95 as hit_percent , 1 as Vacuum_cost_page_hit,10 as Vacuum_cost_page_miss , 200 as Vacuum_cost_limit ,20 as Vacuum_cost_delay , 20 as Vacuum_cost_page_dirty) as yyy;

- In the formula, prune_time indicates the time required for cleaning up a tuple. The value is related to the hardware I/O capability. For SSDs, the value is around 0.07 ms. For NVMe SSDs with higher I/O capabilities, set this parameter to a smaller value. For HDDs with lower I/O capabilities, set this parameter to a larger value.
- miss_percent and hit_percent indicate the cache hit and miss rates. In scenarios right after data import, the cache hit rate is relatively low. After the system becomes stable, the miss rate is around 0.5 and the hit rate is around 0.95.
- The value of vacuum_cost_delay is the same as the value of the GUC parameter autovacuum_vacuum_cost_delay.
In addition, the following progress analysis view is introduced in the current version to obtain the AUTOVACUUM progress and efficiency.
gaussdb=# select * from gs_stat_progress_vacuum(140123447490304);
pid | dataname | relid | relname | phase |
detail
-----------------+----------+-------+---------+---------------+---------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------
140123447490304 | postgres | 24602 | t1 | scanning heap | vacuum db "postgres" starts at 2025-02-09 14:10:17.863838-05, vacuumed 12/242 relations, current r
elation "t1"; heap: scanned 55298/73216 blocks, skipped 0 blocks; info: found 0 dead tuples, already run 610s, delay 608s.
(1 row)
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot