Updated on 2025-04-21 GMT+08:00

VACUUM

Function

This command removes from the table directory all files that are not managed by Delta, and deletes data files that are no longer referenced by the latest state of the table transaction log and are older than the retention threshold. The default threshold is 7 days (168 hours).

Precautions

RETAIN num HOURS sets the retention threshold in hours. A threshold of at least 7 days (168 hours) is recommended.

After you run VACUUM on a Delta table, you can no longer view table versions that are older than the specified retention period.
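Because older versions become unavailable after a vacuum, it can help to check which versions exist beforehand. The following is a minimal sketch, assuming the engine supports Delta Lake's DESCRIBE HISTORY command and VERSION AS OF time travel syntax; this page does not confirm either. The table name is taken from the examples on this page, and the version number 3 is hypothetical.

```sql
-- List the commit history of the table (version, timestamp, operation).
-- Assumption: DESCRIBE HISTORY is available in this environment.
DESCRIBE HISTORY delta_table0;

-- Read an older version while its data files still exist.
-- The version number 3 is a hypothetical example.
SELECT * FROM delta_table0 VERSION AS OF 3;
```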

Delta Lake includes a safety check that prevents dangerous VACUUM commands: the command reports an error if the specified retention threshold is less than 168 hours. If you are sure that a shorter retention threshold is safe for your table, you can disable this check by setting the Spark configuration property spark.databricks.delta.retentionDurationCheck.enabled to false.
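As an illustration, a vacuum with a threshold below 168 hours might look like the following. This sketch assumes that the property can be set with a Spark SQL SET statement in the same session; depending on how jobs are submitted, it may instead need to be supplied as a job-level Spark configuration. The table name is taken from the examples on this page.

```sql
-- Disable the retention safety check for the current session.
-- Assumption: SET is accepted here; otherwise pass the property as job configuration.
SET spark.databricks.delta.retentionDurationCheck.enabled = false;

-- 48 hours is below the 168-hour limit, so this statement
-- reports an error while the safety check is enabled.
VACUUM delta_table0 RETAIN 48 HOURS;

-- Re-enable the safety check afterwards.
SET spark.databricks.delta.retentionDurationCheck.enabled = true;
```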

Syntax

VACUUM [database_name.]table_name|DELTA.`obs://bucket_name/tbl_path` [RETAIN num HOURS];

You can simulate a vacuum operation with the DRY RUN parameter, which returns a list of the files that the vacuum would delete without deleting them:

VACUUM [database_name.]table_name|DELTA.`obs://bucket_name/tbl_path` [RETAIN num HOURS] [DRY RUN];
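One possible workflow is to preview the deletion first and then run the same statement without DRY RUN. This is a sketch only; the table name is taken from the examples on this page.

```sql
-- Step 1: list the files that would be deleted. Nothing is removed.
VACUUM delta_table0 RETAIN 168 HOURS DRY RUN;

-- Step 2: after reviewing the list, run the same statement without DRY RUN.
VACUUM delta_table0 RETAIN 168 HOURS;
```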

Parameter Description

Table 1 Parameter descriptions of VACUUM

Parameter        Description

database_name    Name of the database, consisting of letters, numbers, and underscores (_)

table_name       Name of the table in the database, consisting of letters, numbers, and underscores (_)

bucket_name      OBS bucket name

tbl_path         Storage location of the Delta table in the OBS bucket

num              Retention threshold, in hours

Required Permissions

  • SQL permissions
Table 2 Permissions required for executing VACUUM

Permission Description

UPDATE permission on a table

  • Fine-grained permission: dli:table:update
  • Metadata services provided by LakeFormation: refer to the LakeFormation documentation for details on permission configuration.

Examples

VACUUM delta_table0 RETAIN 168 HOURS;

VACUUM delta_table0 RETAIN 48 HOURS DRY RUN;

VACUUM delta.`obs://bucket_name0/db0/delta_table0` RETAIN 168 HOURS;
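The syntax also allows a database-qualified table name. The sketch below assumes a database named db0 (hypothetical, borrowed from the path in the example above) and a 30-day retention threshold, expressed in hours as 30 × 24 = 720.

```sql
-- Keep 30 days of history: 30 days x 24 hours = 720 hours.
-- db0 is a hypothetical database name.
VACUUM db0.delta_table0 RETAIN 720 HOURS;
```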

System Response

You can view the result in driver logs or on the client.