Delta Cleansing and Optimization
Cleansing a Delta Table
You can run the VACUUM command on a Delta table to remove data files that are no longer referenced by the table and are older than the retention threshold.
VACUUM delta_table0;
VACUUM delta_table0 RETAIN 168 HOURS; -- The unit can only be HOURS.
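Because VACUUM deletes files permanently, it can help to preview what would be removed first. The following is a minimal sketch, assuming the engine accepts the optional DRY RUN clause from open-source Delta Lake. Support for this clause in your environment is an assumption and should be checked before relying on it.

```sql
-- Preview only: list the files VACUUM would delete, without deleting them.
-- DRY RUN is assumed to be available, as in open-source Delta Lake.
VACUUM delta_table0 RETAIN 168 HOURS DRY RUN;
```

If the listed files look as expected, rerun the same statement without DRY RUN to perform the cleanup.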
Optimizing a Delta Table
To improve query speed, Delta Lake can optimize the data layout in storage by compacting many small files into larger ones.
OPTIMIZE delta_table0;
OPTIMIZE delta_table0 WHERE date >= '2020-01-01';
Z-Ordering
Z-ordering is another technique to speed up queries. It reorganizes data in storage so that rows with related values are colocated in the same set of files. When the data is ordered appropriately, queries can skip more files and read less data, and so run faster. To Z-order a table, specify the columns to order by in the ZORDER BY clause.
OPTIMIZE delta_table0 ZORDER BY (price);
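Z-ordering can also be limited to part of a table and applied to more than one column. The following is a sketch, assuming open-source Delta Lake syntax. It also assumes that date is a partition column of delta_table0 and that the table has a second column named quantity, which is hypothetical and used only for illustration.

```sql
-- Z-order only the data from 2020-01-01 onward.
-- Assumption: date is a partition column of delta_table0.
-- Assumption: quantity is a hypothetical second column, used for illustration only.
OPTIMIZE delta_table0
WHERE date >= '2020-01-01'
ZORDER BY (price, quantity);
```

Choose columns that appear frequently in query filters. The locality benefit drops with each additional column, so a short column list generally works better than a long one.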