Help Center > > Developer Guide> Query Performance Optimization> Optimization Cases> Case: Adjusting the Distribution Key

Case: Adjusting the Distribution Key

Updated at: Jul 15, 2020 GMT+08:00


During a site test, the information is displayed after EXPLAIN ANALYZE is executed:

According to the execution information, HashJoin becomes the performance bottleneck of the whole plan. Based on the execution time of HashJoin [2657.406, 93339.924] (for details about the value, see SQL Execution Plan), it can be seen that severe skew occurs on different DNs during the HashJoin operation.

In the memory information (as shown in the following figure), it can be seen that the data skew occurs in the memory usage of each node.

Optimization Analysis

The preceding two symptoms indicate that this SQL statement has serious computing skew. The further lower-layer analysis on the HashJoin operator shows that serious computing skew [38.885,2940.983] occurs in Seq Scan on s_riskrate_setting. Based on the description of the Scan, we can infer that the performance problems of this plan lie in data skew occurred in the s_riskrate_setting table. Later, it is proved that serious data skew occurred in the s_riskrate_setting table. After performance optimization, the execution time is reduced from 94s to 50s.

Did you find this page helpful?

Submit successfully!

Thank you for your feedback. Your feedback helps make our documentation better.

Failed to submit the feedback. Please try again later.

Which of the following issues have you encountered?

Please complete at least one feedback item.

Content most length 200 character

Content is empty.

OK Cancel