Updated on 2025-02-22 GMT+08:00

CLEAN

Function

Cleans instants on the timeline according to the clean configuration and deletes historical file versions, which reduces the storage footprint and read/write pressure of Hudi tables.

Syntax

RUN CLEAN ON tableIdentifier;

RUN CLEAN ON tablelocation;

Parameter Description

Table 1 Parameter descriptions

Parameter          Description
tableIdentifier    Name of the Hudi table
tablelocation      Storage path of the Hudi table

Example

run clean on h1;
run clean on "obs://bucket/path/h1";

Caveats

  • The clean operation can only be executed by the owner of the table.
  • To override the default parameters of the clean command, such as the number of commits to retain, configure them in the settings when you execute the SQL command. For details, see Typical Hudi Configuration Parameters.
  • This command does not support OBS paths when the metadata service provided by DLI is used.
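
As a rough illustration of the second caveat, the sketch below overrides the retention settings before running the command. The keys hoodie.cleaner.policy and hoodie.cleaner.commits.retained are standard Apache Hudi options. The value 10 is only illustrative, and it is an assumption that your environment accepts session-level SET statements. If it does not, enter the same key/value pairs in the job settings instead.

```sql
-- Assumption: session-level SET is accepted; otherwise put these
-- key/value pairs in the job settings before submitting the statement.
-- Retain file versions needed by the most recent commits (illustrative value).
SET hoodie.cleaner.policy=KEEP_LATEST_COMMITS;
SET hoodie.cleaner.commits.retained=10;

-- Run the clean operation on the example table h1 with the settings above.
RUN CLEAN ON h1;
```

A smaller retention value frees more storage but shortens the history available to incremental queries and rollback, so choose it according to how far back readers need to go.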

System Response

Check that the job status is successful, and review the job log to confirm that no exceptions occurred.