Optimizing the Aggregate Algorithms
Scenarios
Spark SQL supports a hash aggregate algorithm that uses a fast aggregate hashmap as a cache to improve aggregation performance. The hashmap replaces the previous ColumnarBatch implementation and avoids the performance problems that occur when an aggregate table is wide, that is, when it has multiple key or value fields.
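The idea of putting a small, fast hashmap in front of a general-purpose one can be sketched in a few lines of Python. This is a toy illustration only, not Spark's implementation: the class name `TwoLevelSumAggregator`, the `fast_capacity` limit, and the choice of a sum aggregate are all assumptions made for the example.

```python
from collections import defaultdict


class TwoLevelSumAggregator:
    """Toy two-level hash aggregation (illustrative only, not Spark code).

    Keys are aggregated in a small, bounded first-level map while it has
    room; once it is full, new keys go to an unbounded fallback map.
    """

    def __init__(self, fast_capacity=4):
        self.fast_capacity = fast_capacity  # hypothetical size limit
        self.fast = {}                      # first level: small and bounded
        self.fallback = defaultdict(int)    # second level: general purpose

    def add(self, key, value):
        # Use the fast map if the key is already there or there is still room.
        if key in self.fast or len(self.fast) < self.fast_capacity:
            self.fast[key] = self.fast.get(key, 0) + value
        else:
            # The fast map is full and does not hold this key: fall back.
            self.fallback[key] += value

    def result(self):
        # The two maps never share a key, so a plain merge is enough.
        merged = dict(self.fallback)
        merged.update(self.fast)
        return merged


agg = TwoLevelSumAggregator(fast_capacity=2)
for k, v in [("a", 1), ("b", 2), ("c", 3), ("a", 4), ("c", 5)]:
    agg.add(k, v)
totals = agg.result()
```

In this sketch the first two distinct keys stay in the fast map and the third spills to the fallback map. The totals are the same as with a single map; only the storage location of each key differs.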
Procedure
- Install the Spark client.
For details, see Installing a Client.
- Log in to the Spark client node as the client installation user.
- Modify the following parameter in the {Client installation directory}/Spark/spark/conf/spark-defaults.conf file on the Spark client.
Table 1 Parameter description

| Parameter | Description | Example Value |
| --- | --- | --- |
| spark.sql.codegen.aggregate.map.twolevel.enabled | Specifies whether to enable aggregation algorithm optimization. true: Enable. false: Disable. | true |
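With the example value applied, the entry in spark-defaults.conf would read `spark.sql.codegen.aggregate.map.twolevel.enabled true`. If the file is edited by a script rather than by hand, a helper along the following lines could set the entry. This is a minimal sketch: the function name `set_conf` is hypothetical, and it assumes whitespace-separated `key value` lines and does not handle the `key=value` form.

```python
KEY = "spark.sql.codegen.aggregate.map.twolevel.enabled"


def set_conf(text, key, value):
    """Return spark-defaults.conf text with `key` set to `value`.

    Assumes whitespace-separated "key value" lines. An existing entry for
    `key` is replaced, duplicates are dropped, and comments are kept.
    """
    out = []
    found = False
    for line in text.splitlines():
        stripped = line.strip()
        parts = stripped.split(None, 1)
        is_entry = bool(stripped) and not stripped.startswith("#")
        if is_entry and parts[0] == key:
            if not found:
                out.append(f"{key} {value}")
                found = True
            # Later duplicates of the same key are skipped.
        else:
            out.append(line)
    if not found:
        # The key was not present, so append it at the end.
        out.append(f"{key} {value}")
    return "\n".join(out) + "\n"


updated = set_conf("spark.master yarn\n", KEY, "true")
```

The helper works on text rather than on a file path, so it can be tried on a copy of the configuration before anything on the client node is changed.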