OPTIMIZE
Function
This command optimizes the data layout in storage to improve query speed.
Precautions
- Since optimization is time-consuming, you need to determine the frequency of running OPTIMIZE based on the trade-off between better end-user query performance and the optimization computation time.
- To optimize partitioned tables, set spark.sql.forcePartitionPredicatesOnPartitionedTable.enabled to false.
Syntax
OPTIMIZE [database_name.]table_name
[ WHERE boolExpression]
Z-ordering:
OPTIMIZE [database_name.]table_name
[ WHERE boolExpression]
ZORDER BY (columnList);
Parameter Description
Parameter |
Description |
---|---|
database_name |
Name of the database, consisting of letters, numbers, and underscores (_) |
table_name |
Name of the table in the database, consisting of letters, numbers, and underscores (_) |
boolExpression |
Filter condition expression |
columnList |
List of fields specified for z-ordering, and the Z-order columns should be different from the partition columns. |
Required Permissions
- SQL permissions
Permission Description |
---|
UPDATE permission on a table |
- Fine-grained permission: dli:table:update
- Metadata services provided by LakeFormation. Refer to the LakeFormation documentation for details on permission configuration.
Examples
OPTIMIZE delta_table0; optimize delta_table0 where dt >= '2020-01-01'; OPTIMIZE delta_table0 WHERE dt >= current_timestamp() - INTERVAL 1 day ZORDER BY (price);
System Response
Display the success or failure of the command execution in the driver logs and the client.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot