Adjusting Timeout for Hive Metadata Loading
Scenario
When a large partitioned table contains too many partitions, tasks that access it may time out. Loading a large number of partitions and synchronizing them with the metadata cache also takes longer. For better performance with larger-scale storage, you are advised to increase both the maximum timeout interval for loading the metadata cache and the maximum waiting time of the metadata connection pool.
Procedure
- Log in to FusionInsight Manager as a HetuEngine administrator and choose Cluster > Services > HetuEngine.
- On the Dashboard tab page that is displayed, find the Basic Information area, and click the link next to HSConsole WebUI.
- Click Data Source, locate the row that contains the Hive data source, click Edit in the Operation column, and add the following custom parameters:
Table 1 Metadata timeout parameters

| Parameter | Default Value | Description |
| --- | --- | --- |
| hive.metastore-timeout | 10s | Maximum timeout interval (in seconds or minutes) for loading the metadata cache of the Hive data source in co-deployment scenarios. For operations on a large partitioned table, the value can be 60s or greater. Set this parameter based on the data volume. |
| hive.metastore.connection.pool.maxWaitMillis | 1000 | Maximum waiting time (in milliseconds) of the connection pool when the Hive data source loads metadata in co-deployment scenarios. If the connection pool is accessed frequently and holds only a small number of connections, the value can be 100000 or larger. Set this parameter based on the service volume. |
- Click OK.
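Before entering the two custom parameters in HSConsole, it can help to sanity-check the values you plan to use. The following is a minimal Python sketch, not part of HetuEngine: the helper names `duration_to_seconds` and `build_custom_parameters` are hypothetical, the duration syntax (`60s`, `2m`) is assumed from the "in seconds or minutes" wording in Table 1, and rejecting values below the defaults is a design choice that fits this scenario only.

```python
import re

# Parameter names and defaults are taken from Table 1.
# Everything else in this sketch is illustrative.
DEFAULTS = {
    "hive.metastore-timeout": "10s",
    "hive.metastore.connection.pool.maxWaitMillis": "1000",
}


def duration_to_seconds(value: str) -> int:
    """Convert a duration such as '60s' or '2m' to seconds (assumed syntax)."""
    match = re.fullmatch(r"(\d+)([sm])", value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return amount * 60 if unit == "m" else amount


def build_custom_parameters(timeout: str, max_wait_millis: int) -> dict:
    """Return the two custom parameters, rejecting values below the defaults."""
    default_timeout = duration_to_seconds(DEFAULTS["hive.metastore-timeout"])
    if duration_to_seconds(timeout) < default_timeout:
        raise ValueError("timeout is below the 10s default")
    default_wait = int(DEFAULTS["hive.metastore.connection.pool.maxWaitMillis"])
    if max_wait_millis < default_wait:
        raise ValueError("maxWaitMillis is below the 1000 ms default")
    return {
        "hive.metastore-timeout": timeout,
        "hive.metastore.connection.pool.maxWaitMillis": str(max_wait_millis),
    }


if __name__ == "__main__":
    # Values suggested in Table 1 for large partitioned tables and busy pools.
    for name, value in build_custom_parameters("60s", 100000).items():
        print(f"{name}={value}")
```

The printed `name=value` pairs correspond to the parameter name and value fields you fill in when editing the Hive data source. Choose the actual values based on your data and service volume.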