OPTIMIZE
Function
Rewrites specified data in a target table.
Currently, this syntax is only supported for Iceberg tables.
Precautions
- If a sort key is specified, data within files will be ordered after rewriting, but global ordering is not guaranteed.
- When defining rewrite conditions, you are advised to use partition-based criteria for efficiency.
Syntax
OPTIMIZE table_name REWRITE DATA
[ WITH OPTIONS (option_key = option_value [, ...]) ]
[ WHERE condition ]
[ ORDER BY target_list [ ASC | DESC ] ]
Parameter Description
- table_name: name of the target table. Range: an existing table name.
- condition: an expression that returns a boolean value, used to determine which files need to be rewritten.
- target_list: sort keys for data rewriting.
- option_key, option_value: configuration options for data rewriting.
|
Parameter |
Value |
Description |
Default Value |
|---|---|---|---|
|
target-file-size-bytes |
0–Max(uint64) |
Size of rewritten files. |
Matches write.target-file-size-bytes. |
|
min-file-size-bytes |
0–Max(uint64) |
Original files smaller than this threshold are targeted. |
75% of target-file-size-bytes |
|
max-file-size-bytes |
0–Max(uint64) |
Original files larger than this threshold are targeted. |
180% of target-file-size-bytes |
|
min-input-files |
0–Max(uint32) |
Rewriting occurs if a partition has more target files than this value. |
5 |
|
delete-file-threshold |
0–Max(uint32) |
Rewriting occurs if a partition has more delete files than this value. |
INT_MAX |
|
rewrite-all |
true, false |
Ignores all other conditions and rewrites all matching files. |
false |
|
max-file-group-size-bytes |
0–Max(uint64) |
Splits target files into multiple tasks if their total size exceeds this. |
100 GB |
Examples
CREATE TABLE reason_t2
(
TABLE_SK INTEGER ,
TABLE_ID VARCHAR(20) ,
TABLE_NA VARCHAR(20)
) PARTITION BY (TABLE_SC VARCHAR(20)) STORE AS iceberg;
INSERT INTO reason_t2 VALUES (1, 'S01', 'StudentA', 'Xiangbei'),(2, 'T01', 'TeacherA', 'Lingnan'),(3, 'T02', 'TeacherB', 'Hainan');
INSERT INTO reason_t2 VALUES (4, 'S02', 'StudentX', 'Xiangbei'),(5, 'T03', 'TeacherX', 'Lingnan'),(6, 'T04', 'TeacherY', 'Hainan');
OPTIMIZE reason_t2 REWRITE DATA WITH OPTIONS ('rewrite-all' = 'true') WHERE TABLE_SC = 'Hainan';
Run the ORDER BY statement to view the data rewriting plan of the TABLE_SC = 'Lingnan' partition.
EXPLAIN VERBOSE OPTIMIZE reason_t2 REWRITE DATA WITH OPTIONS ('rewrite-all' = 'true') WHERE TABLE_SC = 'Lingnan' ORDER BY 1;
QUERY EXEC INFO
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
id | operation | E-rows | E-distinct | E-width | E-costs
----+-------------------------------------------------------------------------+--------+------------+---------+---------
1 | -> Row Adapter | 1000 | | 178 | 73.68
2 | -> Vector Insert on reason_t2 | 1000 | | 178 | 73.68
3 | -> Vector Sort | 1000 | | 178 | 53.68
4 | -> Partitioned Vector Foreign Scan on reason_t2 | 1000 | | 178 | 1.35
Predicate Information (identified by plan id)
---------------------------------------------------------------------------------------------------------------------------------------------------------------
4 --Partitioned Vector Foreign Scan on reason_t2
Server Type: lf
Pruning results: (Manifests total: 3, Manifests left: 2, Files total: 4, Files dynamic prune 0, Files dynamic RF prune: 0, Files left after pruning: 2)
DN read from: direct
replication: 2 files 0.00 MB
Configure min-file-size-bytes to rewrite all target files.
EXPLAIN VERBOSE OPTIMIZE reason_t2 REWRITE DATA WITH OPTIONS ('min-file-size-bytes' = '0') ORDER BY 1;
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot