Delta Cleansing and Optimization
Cleansing a Delta Table
You can run the VACUUM command on a Delta table to remove data files that the table no longer references and that are older than the retention threshold.
VACUUM delta_table0;
VACUUM delta_table0 RETAIN 168 HOURS; -- The unit can only be HOURS.
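The selection rule described above (a file is removed only if it is both unreferenced and older than the retention threshold) can be illustrated with a small Python model. This is a simplified sketch, not how the VACUUM command is implemented; the function `files_to_vacuum`, the file names, and the timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def files_to_vacuum(all_files, referenced, now, retain_hours=168):
    """Return names of files that are unreferenced and older than the threshold.

    all_files:  dict of file name -> last-modified datetime (hypothetical input)
    referenced: set of file names the table still references
    """
    cutoff = now - timedelta(hours=retain_hours)
    return sorted(
        name
        for name, modified in all_files.items()
        if name not in referenced and modified < cutoff
    )


now = datetime(2020, 1, 15)
all_files = {
    "part-0.parquet": datetime(2020, 1, 1),   # old and unreferenced: removable
    "part-1.parquet": datetime(2020, 1, 1),   # old but still referenced: kept
    "part-2.parquet": datetime(2020, 1, 14),  # unreferenced but recent: kept
}
removable = files_to_vacuum(all_files, {"part-1.parquet"}, now)
print(removable)  # ['part-0.parquet']
```

In this model, 168 hours (7 days) puts the cutoff at 2020-01-08, so only the file that fails both checks is listed.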
Optimizing a Delta Table
To improve query speed, Delta Lake can optimize the layout of data in storage by compacting many small files into larger ones. You can optimize the whole table or, with a WHERE clause, only the subset of data that matches the predicate.
OPTIMIZE delta_table0;
OPTIMIZE delta_table0 WHERE date >= '2020-01-01';
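To show why compaction reduces the number of files a query has to open, the following Python sketch groups small files into larger ones with a simple greedy rule. It is only an illustration under assumed values: the function `plan_compaction`, the 128 MB target size, and the file sizes are hypothetical, and the actual OPTIMIZE command may choose files differently.

```python
def plan_compaction(file_sizes_mb, target_mb=128):
    """Greedily group files so that each group totals at most target_mb.

    A single file that is already larger than target_mb gets a group of its own.
    """
    groups, current, current_size = [], [], 0
    for size in sorted(file_sizes_mb, reverse=True):
        # Start a new group when adding this file would exceed the target.
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups


plan = plan_compaction([10, 20, 30, 40, 50, 60], target_mb=128)
print(plan)  # [[60, 50], [40, 30, 20, 10]]
```

Here six small files become two larger ones, so a full scan opens two files instead of six.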
Z-Ordering
Z-ordering is another technique for speeding up queries. It reorganizes the data in storage so that rows with similar values in the chosen columns are stored in the same files. When data is sorted this way, queries that filter on those columns can skip more files and read less data, so they run faster. To Z-order a table, specify the columns to sort by in the ZORDER BY clause of OPTIMIZE.
OPTIMIZE delta_table0 ZORDER BY (price);
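The file-skipping effect is easiest to see with two columns. The Python sketch below builds a Z-order (Morton) key by interleaving the bits of two integer columns, splits the sorted rows into small "files", and counts how many files a filter must read based on each file's min/max range. It is a conceptual model only: the functions `z_value` and `files_touched`, the 4 x 4 data set, and the four-row file size are hypothetical and are not taken from the Delta Lake implementation.

```python
def z_value(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Morton code."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)      # bits of x go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)  # bits of y go to odd positions
    return z


def split_into_files(rows, rows_per_file=4):
    """Cut an ordered list of rows into fixed-size chunks ("files")."""
    return [rows[i:i + rows_per_file] for i in range(0, len(rows), rows_per_file)]


def files_touched(files, column, value):
    """Count files whose min/max range on `column` could contain `value`."""
    count = 0
    for rows in files:
        values = [row[column] for row in rows]
        if min(values) <= value <= max(values):
            count += 1
    return count


points = [(x, y) for x in range(4) for y in range(4)]

linear_files = split_into_files(sorted(points))                         # sort by x, then y
z_files = split_into_files(sorted(points, key=lambda p: z_value(*p)))   # sort by Z-order key

# Files read for a filter on the first column (x == 3) and on the second (y == 3).
print(files_touched(linear_files, 0, 3), files_touched(linear_files, 1, 3))  # 1 4
print(files_touched(z_files, 0, 3), files_touched(z_files, 1, 3))            # 2 2
```

With a plain sort, a filter on the second column has to read every file. With the Z-order key, a filter on either column reads half of them, which is the trade-off Z-ordering makes when queries filter on more than one column.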