VACUUM
Description
VACUUM recycles storage space occupied by tables or B-Tree indexes. In normal database operation, rows that have been deleted are not physically removed from their table; they remain present until a VACUUM is done. Therefore, it is necessary to do VACUUM periodically, especially on frequently-updated tables.
Precautions
- If no table is specified, VACUUM processes the tables on which the user has the corresponding permission in the current database. With a parameter, VACUUM processes only that table.
- To perform VACUUM operation on a table, you must be a table owner or a user granted the VACUUM permission on the table. When separation of duties is disabled, system administrators have this permission by default. However, database owners are allowed to VACUUM all tables in their databases, except shared catalogs. (The restriction for shared catalogs means that a true database-wide VACUUM can only be executed by a system administrator). VACUUM skips over any tables that the calling user does not have the permission to vacuum.
- VACUUM cannot be executed inside a transaction block.
- It is recommended that active production databases be vacuumed frequently (at least nightly), in order to remove dead rows. After adding or deleting a large number of rows, it might be a good idea to run VACUUM ANALYZE for the affected table. This will update the system catalogs with the results of all recent changes, and allow the query planner to make better choices in planning queries.
- FULL is recommended only in special scenarios. For example, you wish to physically narrow the table to decrease the occupied disk space after deleting most rows of a table. VACUUM FULL usually shrinks a table more than VACUUM does. If the physical space usage does not decrease after you run the statement, check whether there are other active transactions (that have started before you delete data transactions and not ended before you run VACUUM FULL). If there are such transactions, run this statement again when the transactions quit.
- VACUUM FULL returns the free space in the table to the tablespace by rebuilding the table. During the rebuilding, extra storage space equivalent to the valid data size in the table is required. For a non-segment-page table, after VACUUM FULL is executed, the physical files occupied by the original table are deleted, and the physical file space occupied by the original table is returned to the OS. For a segment-page table, after VACUUM FULL is executed, the physical space occupied by the original table is returned to the segment-page data file instead of the OS.
- VACUUM causes a substantial increase in I/O traffic, which might cause poor performance for other active sessions. Therefore, it is sometimes advisable to use the cost-based VACUUM delay feature.
- When VERBOSE is specified, VACUUM prints progress messages to indicate which table is currently being processed. Various statistics about the tables are printed as well.
- When the option list is surrounded by parentheses, the options can be written in any order. If there are no brackets, the options must be given in the order displayed in the syntax.
- VACUUM and VACUUM FULL clear deleted tuples after the delay specified by vacuum_defer_cleanup_age.
- VACUUM ANALYZE executes a VACUUM operation and then an ANALYZE operation for each selected table. This is a handy combination form for routine maintenance scripts.
- Plain VACUUM (without FULL) simply recycles space and makes it available for reuse. This form of statement can operate concurrently with normal reading and writing of the table, as an exclusive lock is not obtained. VACUUM FULL executes wider processing, including moving rows across blocks to compress tables so they occupy the minimum number of disk blocks. This form is much slower and requires an exclusive lock on each table while it is being processed.
- If the xc_maintenance_mode parameter is not enabled, VACUUM FULL skips all system catalogs.
- If you run VACUUM FULL immediately after running DELETE, the space will not be recycled. After running DELETE, execute 1000 non-SELECT transactions, or wait for 1s and then execute one transaction. Then, run VACUUM FULL to the space.
- During VACUUM FULL, an exclusive lock is added to the table. Therefore, you are advised not to run VACUUM FULL during peak hours. Otherwise, the lock waiting time is too long or a deadlock occurs.
- To ensure performance and statistics accuracy, do not run ANALYZE-related commands, such as VACUUM ANALYZE, AUTOANALYZE, and manual ANALYZE, at the same time or frequently.
- For Ustore, the behavior of manual VACUUM is the same as that in Astore. Locks are obtained to clear heap tables and indexes. In Ustore, AUTOVACUUM only clears GPIs of partitioned tables, updates FSMs of heap tables, and recycles index pages.
- When VACUUM FULL is executed, partitions are traversed for clearance and GPIs are rebuilt after partition clearance. Therefore, if there are a large number of partitions, you are advised to delete GPIs first and rebuild indexes after VACUUM FULL is executed. In this way, the execution time of VACUUM FULL is reduced.
- You cannot perform VACUUM FULL online for database-level, index-level, partitioned table, secondary partitioned table, segment-page table, hash/range bucket table, temporary table, and unlogged table objects. If an object involves the preceding unsupported types, the ONLINE keyword does not take effect. By default, VACUUM FULL is executed offline, and a message is displayed indicating that the VACUUM FULL statement cannot be executed online.
- When performing online VACUUM FULL operations, reserve sufficient disk space, at least equal to the total size of the tables to be operated and the accessories such as indexes on the tables.
- If the online VACUUM FULL execution is canceled, the residual data clearance process starts first. If the execution is also canceled again, the clearance process ends and residual data remains.
- Resource contention exists between online VACUUM FULL operations and user services. Therefore, online VACUUM FULL operations are not suitable for heavy-traffic scenarios. In scenarios where resources are limited, fine-grained resource control is recommended. Parallel acceleration is not recommended.
- Online VACUUM FULL operations are not supported in M-compatible mode.
- Online VACUUM FULL execution does not support the FREEZE keyword. If both FREEZE and ONLINE keywords are specified, VACUUM FULL is executed offline by default, and a message is displayed indicating that the VACUUM FULL statement cannot be executed online.
- Online VACUUM FULL operations cannot be performed in a transaction or a stored procedure.
- If online VACUUM FULL fails, significant table space bloat may occur. The bloat degree is proportional to the online VACUUM FULL execution duration and the number of concurrent write operations. Long transactions may block online VACUUM FULL. You are advised not to perform online VACUUM FULL when long transactions exist.
- After VACUUM FULL is executed online, DQL operations are blocked and an error is reported for concurrent DML operations.
- During online VACUUM FULL execution, do not execute DDL statements concurrently with other DDL statements. Otherwise, deadlocks or errors may occur. In this case, reconnect to the client and try again.
- A schema named online$$ddl$$[hash value] is generated during online VACUUM FULL execution.
- By default, the owner of the schema is the initial user. You are advised not to perform the operations on the schema during online VACUUM FULL execution. Otherwise, the online VACUUM FULL execution may be terminated and an unknown error may occur. You are advised not to query tables in this schema. By default, gs_dump will not export this schema.
- You cannot create a schema whose prefix is online$$. Otherwise, an error notice is displayed. If schemas with the same name conflict with each other, online VACUUM FULL execution does not take effect. By default, VACUUM FULL is executed offline, and a message is displayed indicating that the VACUUM FULL statement cannot be executed online due to the schema name conflict. -
- If a GSI exists in the object table, the ONLINE keyword does not take effect. By default, VACUUM FULL is executed offline, and a message is displayed indicating that VACUUM FULL cannot be executed online. -
Syntax
- Recycle space and update statistics information, without requirements for keyword orders.
1 2 3
VACUUM [ ( { FULL | FREEZE | VERBOSE | { ANALYZE | ANALYSE } } [, ... ] ) ] [ table_name [ (column_name [, ...] ) ] [ PARTITION ( partition_name ) ] ] [ OFFLINE | [ ONLINE [ WITH ( { online_parameter = value } [, ... ] ) ] ] ];
- Recycle space, without updating statistics information.
1 2
VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ table_name [ PARTITION ( partition_name ) ] ] [ OFFLINE | [ ONLINE [ WITH ( { online_parameter = value } [, ... ] ) ] ] ];
- Recycle space and update statistics information, and require keywords in order.
1 2 3
VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] { ANALYZE | ANALYSE } [ VERBOSE ] [ table_name [ ( column_name [, ...] ) ] ] [ PARTITION ( partition_name ) ] [ OFFLINE | [ ONLINE [ WITH ( { online_parameter = value } [, ... ] ) ] ] ];
- Recycle the space of a specified bucket and update the bucketxid column in the pg_hashbucket system catalog. Statistics are not updated. It can be called by administrators only during scale-out.
1
VACUUM FREEZE BUCKETS (bucketid [, ...]);
Parameters
- FULL
Selects "FULL" vacuum, which can recycle more space, but takes much longer and exclusively locks the table.
Using FULL will cause statistics information missing. To collect statistics information, add the keyword ANALYZE to the VACUUM FULL statement.
- FREEZE
Is equivalent to running VACUUM with the vacuum_freeze_min_age parameter set to 0.
- VERBOSE
Prints a detailed VACUUM activity report for each table.
- ANALYZE | ANALYSE
Updates statistics used by the planner to determine the most efficient way to execute a query.
- table_name
Specifies the name (optionally schema-qualified) of a specific table to vacuum.
Value range: name of a specific table to vacuum. The default value indicates all tables in the current database.
- column_name
Specifies the name of the column to be analyzed. This parameter must be used together with ANALYZE.
Value range: name of a specific column to analyze. The default value indicates all columns.
The mechanism of the VACUUM ANALYZE statement is to execute VACUUM and ANALYZE in sequence. Therefore, if column_name is incorrect, VACUUM may be successfully executed but ANALYZE may fail to be executed. For a partitioned table, ANALYZE may fail to be executed after VACUUM is successfully executed on a partition.
- partition_name
Specifies the partition name of the table to be cleared. The default value indicates all partitions.
- ONLINE | OFFLINE
Executes VACUUM online or offline. This parameter is used to execute VACUUM FULL online and therefore the read and write services are not blocked for a long time during VACUUM FULL.
The keywords ONLINE and OFFLINE are used.- ONLINE: executes online. If the VACUUM FULL statement contains scenarios (such as the FREEZE keyword) that are not supported by the current feature, the statement is executed offline and a NOTICE is displayed, indicating that the ONLINE keyword is ignored.
- OFFLINE: executes offline.
Value range: OFFLINE or ONLINE, which indicates offline execution and online execution respectively. The value is OFFLINE by default.
If VACUUM FULL fails to be executed online due to user instruction termination commands, lock timeout, or database faults, the system automatically clears residual objects. Automatic cleanup may fail due to reasons such as lock timeout. In this case, manual cleanup is required. For severe faults, such as FATAL, PANIC, and database faults, residual temporary schema objects may exist, affecting subsequent online VACUUM FULL and online DDL execution. In this case, you need to use the online_ddl_cleanup() function to manually clear the objects. If no residual data exists, you are advised not to use the online_ddl_cleanup() function. Otherwise, online VACUUM FULL may be terminated and an unknown error may occur.
- online_parameter
Specifies the online VACUUM FULL option when the ONLINE keyword is specified. This parameter does not take effect if the ONLINE keyword is not specified or conditions for online VACUUM FULL execution are not met.
Value range:
- parallel_threads: controls multi-thread parallel operations to achieve higher performance when resources are sufficient. Value range: an integer ranging from 1 to 32. The default value is 1.
- When a user enables resource control, for example, I/O resource control (for example, io_limits > 0 or io_priority = high/medium/low), parallel_threads ignores resource control under the user or session. In addition, a message is displayed, indicating that the current resource control is invalid and multi-thread acceleration is enabled. The online VACUUM FULL multi-thread execution complies with the resource control of the initial user.
- This parameter can take effect together with the index parallel parameter parallel_workers. parallel_workers controls the degree of parallelism for index building during VACUUM FULL, and parallel_threads controls the degree of parallelism for catchups during VACUUM FULL.
- max_catchup_times: controls the maximum number of catchups. This parameter is used to prevent the execution of VACUUM FULL from taking a long time. Value range: an integer ranging from 1 to 50. The default value is 50.
Online VACUUM FULL execution records the IUD incremental data generated during data reconstruction to a temporary table and reconstructs the incremental data in multiple rounds of catchups. To prevent VACUUM FULL catchup failures due to a large amount of concurrent IUD data, you can set this parameter to limit the number of catchups.
- parallel_threads: controls multi-thread parallel operations to achieve higher performance when resources are sufficient. Value range: an integer ranging from 1 to 32. The default value is 1.
Examples
- VACUUM
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
-- Create the table tbl_test and insert data into the tables. gaussdb=# CREATE TABLE tbl_test(c1 int); gaussdb=# INSERT INTO tbl_test VALUES (1); -- View the data and the CTID of the data. gaussdb=# SELECT ctid,* FROM tbl_test; ctid | c1 -------+---- (0,1) | 1 (1 row) -- Delete the data record. gaussdb=# DELETE FROM tbl_test; -- Insert a data record again. It is found that a new ctid is used. gaussdb=# INSERT INTO tbl_test VALUES (2); gaussdb=# SELECT ctid,* FROM tbl_test; ctid | c1 -------+---- (0,2) | 2 (1 row) -- After the VACUUM statement is executed, the old space is reused when data is inserted. gaussdb=# VACUUM ANALYZE tbl_test; gaussdb=# INSERT INTO tbl_test VALUES (3); gaussdb=# SELECT ctid,* FROM tbl_test; ctid | c1 -------+---- (0,1) | 3 (0,2) | 2 (2 rows) -- Drop the table. gaussdb=# DROP TABLE tbl_test;
- VACUUM FULL
-- Create a table. gaussdb=# CREATE TABLE tbl_test2(c1 int); -- Insert 100,000 data records and check the table size. gaussdb=# INSERT INTO tbl_test2 VALUES (generate_series(1,100000)); gaussdb=# SELECT 'tbl_test2' AS tablename, pg_size_pretty(pg_relation_size('tbl_test2')) AS size; tablename | size -----------+--------- tbl_test2 | 3048 kB (1 row) -- Delete data and check the table size. gaussdb=# DELETE FROM tbl_test2; gaussdb=# SELECT 'tbl_test2' AS tablename, pg_size_pretty(pg_relation_size('tbl_test2')) AS size; tablename | size -----------+--------- tbl_test2 | 3048 kB (1 row) -- Use VACUUM FULL to reclaim space and check the table size. gaussdb=# VACUUM FULL ANALYZE tbl_test2; gaussdb=# SELECT 'tbl_test2' AS tablename, pg_size_pretty(pg_relation_size('tbl_test2')) AS size; tablename | size -----------+--------- tbl_test2 | 0 bytes (1 row) -- Use VACUUM FULL to recycle spaces online. gaussdb=# VACUUM FULL ANALYZE tbl_test2 ONLINE; -- Drop. gaussdb=# DROP TABLE tbl_test2;
Suggestions
- vacuum
- VACUUM cannot be executed inside a transaction block.
- It is recommended that active production databases be vacuumed frequently (at least nightly), in order to remove dead rows. It is strongly recommended that you run VACUUM ANALYZE after adding or deleting a large number of records.
- FULL is recommended only in special scenarios. For example, you wish to physically narrow the table to decrease the occupied disk space after deleting most rows of a table.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot