Updated on 2024-11-29 GMT+08:00

Adjusting Timeout for Hive Metadata Loading

Scenario

A large partitioned table may contain so many partitions that tasks accessing it time out. A large number of partitions also takes longer to load and synchronize with the metadata storage cache. For better performance with larger-scale storage, you are advised to adjust both the maximum timeout interval for loading the metadata cache and the maximum waiting time of the connection pool used for loading metadata.

Procedure

  1. Log in to FusionInsight Manager as a HetuEngine administrator and choose Cluster > Services > HetuEngine.
  2. On the Dashboard tab page that is displayed, find the Basic Information area, and click the link next to HSConsole WebUI.
  3. Click Data Source, locate the row that contains the Hive data source, click Edit in the Operation column, and add the following custom parameters:

    Table 1 Metadata timeout parameters

    | Parameter | Default Value | Description |
    | --- | --- | --- |
    | hive.metastore-timeout | 10s | Specifies the maximum timeout interval (in seconds or minutes) for caching metadata loaded by the Hive data source in co-deployment scenarios. For operations on a large partitioned table, the value can be 60s or greater. Set this parameter based on the data volume. |
    | hive.metastore.connection.pool.maxWaitMillis | 1000 | Specifies the maximum waiting time of the connection pool (in milliseconds) for loading metadata to the Hive data source in co-deployment scenarios. If the connection pool is frequently accessed and contains only a small number of connections, the value can be 100000 or larger. Set this parameter based on the service volume. |

  4. Click OK.
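
As an illustration of the values entered in step 3, the following Python sketch checks a pair of candidate settings and prints them as key-value pairs. It is not part of HetuEngine or HSConsole: the helper names duration_to_seconds and build_custom_params are hypothetical, and the duration syntax (an integer followed by "s" or "m") is an assumption based on the default value 10s and the "seconds or minutes" wording in Table 1.

```python
import re

# Assumed duration syntax: a whole number followed by "s" (seconds) or "m" (minutes).
_DURATION = re.compile(r"^(\d+)([sm])$")


def duration_to_seconds(value: str) -> int:
    """Convert a duration such as '60s' or '2m' to seconds (hypothetical helper)."""
    match = _DURATION.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported duration format: {value!r}")
    amount, unit = match.groups()
    return int(amount) * (60 if unit == "m" else 1)


def build_custom_params(timeout: str, max_wait_millis: int) -> dict:
    """Return the two custom parameters from Table 1 as strings (hypothetical helper)."""
    if duration_to_seconds(timeout) <= 0:
        raise ValueError("hive.metastore-timeout must be greater than zero")
    if max_wait_millis <= 0:
        raise ValueError("maxWaitMillis must be greater than zero")
    return {
        "hive.metastore-timeout": timeout,
        "hive.metastore.connection.pool.maxWaitMillis": str(max_wait_millis),
    }


# Example values suggested in Table 1 for large partitioned tables.
params = build_custom_params("60s", 100000)
for name, value in params.items():
    print(f"{name}={value}")
```

The printed pairs correspond to the parameter names and values that would be typed into the custom parameter fields; the appropriate values still depend on the data volume and service volume of the cluster.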