Updated on 2024-11-29 GMT+08:00

Cleaning

Function

Cleans a Hudi table by deleting data files of earlier versions. For details, see Cleaning.

Syntax

call run_clean(table => '[table]', clean_policy => '[clean_policy]', retain_commits => '[retain_commits]', hours_retained => '[hours_retained]', file_versions_retained => '[file_versions_retained]');

Parameter Description

Table 1 Parameters

| Parameter | Description | Mandatory |
| --- | --- | --- |
| table | Name of the table to be cleaned. The value can be in the database.tablename format. | Yes |
| clean_policy | Policy for deleting data files of earlier versions. The default value is KEEP_LATEST_COMMITS. | No |
| retain_commits | Number of commits to retain. This parameter takes effect only when clean_policy is set to KEEP_LATEST_COMMITS. | No |
| hours_retained | Number of hours for which commits are retained. This parameter takes effect only when clean_policy is set to KEEP_LATEST_BY_HOURS. | No |
| file_versions_retained | Number of file versions to retain. This parameter takes effect only when clean_policy is set to KEEP_LATEST_FILE_VERSIONS. | No |

Example

call run_clean(table => 'hudi_table1');

call run_clean(table => 'hudi_table1', retain_commits => 2);

call run_clean(table => 'hudi_table1', clean_policy => 'KEEP_LATEST_FILE_VERSIONS', file_versions_retained => 1);
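
The examples above do not cover the KEEP_LATEST_BY_HOURS policy or a table name in the database.tablename format. The following is a minimal sketch of both, using only the parameters listed in Table 1. The database name db1 and the 24-hour retention window are illustrative assumptions, not values from this document.

```sql
-- Assumption: the table hudi_table1 is in a database named db1 (hypothetical name).
call run_clean(table => 'db1.hudi_table1');

-- Assumption: 24 is an illustrative retention window in hours.
-- hours_retained takes effect only with the KEEP_LATEST_BY_HOURS policy.
call run_clean(table => 'db1.hudi_table1', clean_policy => 'KEEP_LATEST_BY_HOURS', hours_retained => 24);
```

Each retention parameter is paired with the clean_policy value it depends on, as described in Table 1.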

Precautions

The cleaning operation deletes data files of earlier versions in partitions only when the trigger conditions are met. If the trigger conditions are not met, no data files are deleted even if the command is executed successfully.

System Response

You can view query results on the client.