Updated on 2024-11-29 GMT+08:00

ALM-16003 Background Thread Usage Exceeds the Threshold

Alarm Description

The system checks the background thread usage every 30 seconds. This alarm is generated when the usage of the Hive background thread pool exceeds the threshold.

Alarm Attributes

  • Alarm ID: 16003
  • Alarm Severity: Critical (default threshold: 90%); Major (default threshold: 80%)
  • Alarm Type: Quality of service
  • Service Type: Hive
  • Auto Cleared: Yes
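
The severity thresholds above can be illustrated with a short Python sketch. It assumes that usage is the number of busy background threads divided by the pool size, which this page does not state explicitly. The function names are hypothetical and are not part of Hive or FusionInsight Manager.

```python
from typing import Optional

# Default thresholds from the Alarm Attributes above, in percent.
CRITICAL_THRESHOLD = 90.0
MAJOR_THRESHOLD = 80.0


def thread_usage_percent(active_threads: int, pool_size: int) -> float:
    """Return pool usage as a percentage (assumed formula: busy / capacity)."""
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    return 100.0 * active_threads / pool_size


def alarm_severity(usage_percent: float) -> Optional[str]:
    """Map a usage percentage to a severity, or None if no alarm applies."""
    # The alarm is described as firing when usage "exceeds" the threshold,
    # so a strict comparison is assumed here.
    if usage_percent > CRITICAL_THRESHOLD:
        return "Critical"
    if usage_percent > MAJOR_THRESHOLD:
        return "Major"
    return None
```

For example, under these assumptions 85 busy threads in a pool of 100 would map to a Major alarm, and 95 busy threads to a Critical alarm.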

Alarm Parameters

  • Location Information

    • Source: Specifies the cluster for which the alarm is generated.
    • ServiceName: Specifies the service for which the alarm is generated.
    • RoleName: Specifies the role for which the alarm is generated.
    • HostName: Specifies the host for which the alarm is generated.

  • Additional Information

    • Trigger condition: Specifies the threshold for triggering the alarm.

Impact on the System

Too many background threads are in use, so newly submitted tasks cannot run in time.

Possible Causes

The usage of the Hive background thread pool is excessively high when:
  • Many tasks are running in the background thread pool of HiveServer.
  • The capacity of the background thread pool of HiveServer is too small.

Handling Procedure

Check the number of tasks executed in the background thread pool of HiveServer.

  1. On the FusionInsight Manager portal, choose Cluster > Name of the desired cluster > Services > Hive. On the displayed page, click HiveServer Instance and check the values of Background Thread Count and Background Thread Usage.
  2. Check whether the number of background threads over the last half hour is excessively high. (By default, the maximum number of background threads is 100, and a thread count of 90 or more is considered high.)

    • If it is, go to 3.
    • If it is not, go to 5.

  3. Reduce the number of tasks submitted to the background thread pool. (For example, cancel some time-consuming, poorly performing tasks.)
  4. Check whether the values of Background Thread Count and Background Thread Usage decrease.

    • If they do, go to 7.
    • If they do not, go to 5.
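
The half-hour check in step 2 can be sketched in Python as follows. This is only an illustration: it assumes thread-count samples have already been read from the Background Thread Count chart as (timestamp, count) pairs, and the function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def high_in_last_half_hour(
    samples: Iterable[Tuple[datetime, int]],
    now: datetime,
    pool_size: int = 100,    # assumed default maximum number of threads
    high_percent: int = 90,  # a count at or above this share counts as high
) -> bool:
    """Return True if any sample in the last 30 minutes is considered high."""
    window_start = now - timedelta(minutes=30)
    # Integer comparison avoids floating-point rounding at the boundary.
    return any(
        window_start <= ts <= now and count * 100 >= pool_size * high_percent
        for ts, count in samples
    )
```

A True result corresponds to going to step 3; a False result corresponds to going to step 5.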

Check the capacity of the HiveServer background thread pool.

  5. On the FusionInsight Manager portal, choose Cluster > Name of the desired cluster > Services > Hive. On the displayed page, click HiveServer Instance and check the values of Background Thread Count and Background Thread Usage.
  6. Increase the value of hive.server2.async.exec.threads in the ${BIGDATA_HOME}/FusionInsight_HD_8.1.0.1/1_23_HiveServer/etc/hive-site.xml file. For example, increase the value by 20%.
  7. Save the modification.
  8. Check whether the alarm is cleared.

    • If it is, no further action is required.
    • If it is not, go to 9.
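
As an illustration of the 20% increase described above, the following Python sketch raises one property value in a Hadoop-style configuration document. The sample XML, the starting value of 100, and the function name are assumptions for illustration only; a real hive-site.xml file contains many other properties, and the sketch works on a string rather than touching the file.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment in the standard <configuration>/<property> layout.
SAMPLE = """<configuration>
  <property>
    <name>hive.server2.async.exec.threads</name>
    <value>100</value>
  </property>
</configuration>"""


def increase_property(xml_text: str, name: str, percent: int) -> str:
    """Return the XML with the named integer property raised by `percent`."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() != name:
            continue
        value = prop.find("value")
        if value is None or value.text is None:
            raise ValueError(f"property {name} has no value")
        old = int(value.text.strip())
        # Integer arithmetic, rounded up to the next whole thread.
        value.text = str((old * (100 + percent) + 99) // 100)
        return ET.tostring(root, encoding="unicode")
    raise KeyError(name)


updated = increase_property(SAMPLE, "hive.server2.async.exec.threads", 20)
```

With the sample above, `updated` carries a value of 120 for hive.server2.async.exec.threads.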

Collect fault information.

  9. On the FusionInsight Manager portal, choose O&M > Log > Download.
  10. In Service, select Hive for the required cluster.
  11. Click the edit icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
  12. Contact the O&M engineers and send the collected logs.
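
The log collection window described above can be worked out with a small Python sketch. The function name is hypothetical, and the alarm time in the usage note is an example value, not taken from a real alarm.

```python
from datetime import datetime, timedelta
from typing import Tuple


def log_window(
    alarm_time: datetime, margin_minutes: int = 10
) -> Tuple[datetime, datetime]:
    """Return (start, end) spanning the margin before and after the alarm."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin
```

For an alarm generated at 10:05, this gives a Start Date of 09:55 and an End Date of 10:15 on the same day.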

Alarm Clearance

After the fault is rectified, the system automatically clears this alarm.

Related Information

None.