Case: Adjusting Distribution Keys
Symptom
During a site test, the following information is displayed after EXPLAIN ANALYZE is run:
The execution information shows that Hash Join is the performance bottleneck of the whole plan. The execution time of Hash Join, [2657.406,93339.924] (for details about the value, see Description), shows that severe skew occurs across DNs during the Hash Join operation.
The memory information (as shown in the following figure) shows that the memory usage of each node is skewed as well.
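To make the size of the skew concrete, the bracketed pair can be turned into a ratio. The following is a minimal sketch, assuming the pair is the minimum and maximum execution time across DNs in milliseconds (the Description referenced above is the authority on this). The function name `skew_ratio` is hypothetical and is not part of any database tool.

```python
def skew_ratio(min_ms: float, max_ms: float) -> float:
    """Return how many times longer the slowest DN runs than the fastest DN."""
    if min_ms <= 0:
        raise ValueError("min_ms must be positive")
    return max_ms / min_ms


# Hash Join execution time range taken from the plan above.
# Assumption: the values are [min, max] across DNs, in milliseconds.
hash_join_range = (2657.406, 93339.924)

ratio = skew_ratio(*hash_join_range)
print(f"Hash Join skew ratio: {ratio:.1f}x")
```

A ratio close to 1 means that the DNs finish at about the same time. A ratio in the tens, as here, means that most DNs sit idle while one DN finishes the work.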
Optimization Analysis
The preceding two symptoms indicate that this SQL statement has a severe computing imbalance. Further analysis of the operators below Hash Join shows that serious computing skew [38.885,2940.983] occurs in Seq Scan on s_riskrate_setting. Because a Seq Scan on a DN only reads the rows stored on that DN, the description of the Scan suggests that the performance problem of this plan is caused by data skew in the s_riskrate_setting table. This was later confirmed: the data in the s_riskrate_setting table is seriously skewed. After performance optimization, the execution time is reduced from 94s to 50s.
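The effect of the distribution key on data skew can be illustrated with a toy model. The sketch below is not the database's actual hash function, and the source does not name the columns of s_riskrate_setting or the key that was chosen. It only assumes that rows are assigned to DNs by hashing the distribution key value. The function name `rows_per_dn`, the DN count, and the sample key values are all hypothetical.

```python
import hashlib


def rows_per_dn(keys, dn_count):
    """Count how many rows land on each DN when rows are hashed by key."""
    counts = [0] * dn_count
    for key in keys:
        # Stand-in hash: MD5 of the key text, reduced modulo the DN count.
        digest = hashlib.md5(str(key).encode()).digest()
        dn = int.from_bytes(digest[:4], "big") % dn_count
        counts[dn] += 1
    return counts


DN_COUNT = 8      # hypothetical cluster size
ROWS = 10_000     # hypothetical table size

# A key with only 3 distinct values can occupy at most 3 DNs.
low_cardinality_keys = [i % 3 for i in range(ROWS)]
# A key with unique values spreads rows across all DNs.
high_cardinality_keys = list(range(ROWS))

skewed = rows_per_dn(low_cardinality_keys, DN_COUNT)
balanced = rows_per_dn(high_cardinality_keys, DN_COUNT)

print("low-cardinality key: ", skewed)
print("high-cardinality key:", balanced)
```

In this model, the low-cardinality key leaves most DNs empty, while the unique key gives every DN a similar share. This is the reasoning behind choosing a distribution key with many distinct, evenly distributed values.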