CLEAN_FILE
Function
Cleans invalid data files from the Hudi table directory.
Syntax
call clean_file(table => '[table_name]', mode=>'[op_type]', backup_path=>'[backup_path]', start_instant_time=>'[start_time]', end_instant_time=>'[end_time]');
Parameter Description
Parameter |
Description |
---|---|
table_name |
Mandatory. Name of the Hudi table from which invalid data files are to be deleted. |
op_type |
Optional. Command running mode. The default value is dry_run. Value options are dry_run, repair, undo, and query. dry_run: displays invalid data files to be cleaned. repair: displays and cleans invalid data files. undo: restores deleted data files. query: displays the backup directories that have been cleaned. |
backup_path |
Mandatory. Backup directory of the data files to be restored. This parameter is available only when the running mode is undo. |
start_time |
Optional. Start time for generating invalid data files. This parameter is available only when the running mode is dry_run or repair. The start time is not limited by default. |
end_time |
Optional. End time for generating invalid data files. This parameter is available only when the running mode is dry_run or repair. The end time is not limited by default. |
Example
call clean_file(table => 'h1', mode=>'repair'); call clean_file(table => 'h1', mode=>'dry_run'); call clean_file(table => 'h1', mode=>'query'); call clean_file(table => 'h1', mode=>'undo', backup_path=>'/tmp/hudi/h1/.hoodie/.cleanbackup/hoodie_repair_backup_20220222222222');
Precautions
The command cleans only invalid Parquet files.
System Response
You can view command execution results in the driver log or on the client.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot