VACUUM
Function
This command removes all files in the table directory that are not managed by Delta, and deletes data files that are no longer referenced in the latest state of the table transaction log and are older than the retention threshold. The default threshold is 7 days (168 hours).
Precautions
RETAIN num HOURS specifies the retention threshold in hours. A value of at least 168 hours (7 days) is recommended.
If you run VACUUM on a Delta table, you will no longer be able to view versions created before the specified data retention period.
Delta Lake has a safety check that prevents dangerous VACUUM commands: an error is reported when the specified retention threshold is less than 168 hours. If you are sure about the retention threshold you want to use for the vacuum operation, you can disable this safety check by setting the Spark configuration property spark.databricks.delta.retentionDurationCheck.enabled to false.
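The override described above can be sketched as follows. This is a minimal illustration, assuming the SQL session accepts a Spark SET statement for this property; in some environments the property may instead have to be supplied as a job or queue configuration. The table name delta_table0 and the 48-hour value are illustrative only.

```sql
-- Assumption: the session allows Spark properties to be changed with SET.
-- Disable the safety check that rejects thresholds below 168 hours.
SET spark.databricks.delta.retentionDurationCheck.enabled = false;

-- Vacuum with a threshold shorter than the 168-hour default.
-- Versions older than 48 hours can no longer be viewed afterwards.
VACUUM delta_table0 RETAIN 48 HOURS;

-- Restore the safety check so later commands are protected again.
SET spark.databricks.delta.retentionDurationCheck.enabled = true;
```

Re-enabling the check after the one-off operation limits the risk of a later VACUUM accidentally running with an unsafe threshold.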
Syntax
VACUUM [database_name.]table_name | DELTA.`obs://bucket_name/tbl_path` [RETAIN num HOURS];
You can simulate the vacuum operation with the DRY RUN parameter, which returns a list of the files that the vacuum would delete without deleting them:
VACUUM [database_name.]table_name | DELTA.`obs://bucket_name/tbl_path` [RETAIN num HOURS] [DRY RUN];
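One way to use the two syntax forms together is to preview first and delete second. The following is a sketch only; the table name delta_table0 is a placeholder, and the retention value is simply the 168-hour default.

```sql
-- Step 1: list the files that would be deleted; nothing is removed.
VACUUM delta_table0 RETAIN 168 HOURS DRY RUN;

-- Step 2: after reviewing the list, run the same command without DRY RUN
-- to actually delete the files.
VACUUM delta_table0 RETAIN 168 HOURS;
```

Keeping the RETAIN value identical in both statements means the preview applies the same threshold as the real run.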
Parameter Description
Parameter | Description |
---|---|
database_name | Name of the database, consisting of letters, numbers, and underscores (_) |
table_name | Name of the table in the database, consisting of letters, numbers, and underscores (_) |
bucket_name | OBS bucket name |
tbl_path | Storage location of the Delta table in the OBS bucket |
num | Retention period, in hours |
Required Permissions
- SQL permissions
Permission Description |
---|
UPDATE permission on a table |
- Fine-grained permission: dli:table:update
- Metadata services provided by LakeFormation. Refer to the LakeFormation documentation for details on permission configuration.
Examples
VACUUM delta_table0 RETAIN 168 HOURS;
VACUUM delta_table0 RETAIN 48 HOURS DRY RUN;
VACUUM delta.`obs://bucket_name0/db0/delta_table0` RETAIN 168 HOURS;
System Response
You can view the result in driver logs or on the client.