Updated on 2024-09-23 GMT+08:00

ALM-45329 Presto Coordinator Resource Group Queuing Tasks Exceed the Threshold

Alarm Description

The system queries the number of queuing tasks in a resource group through the JMX interface. This alarm is generated when the system detects that the number of queuing tasks in a resource group exceeds the threshold.
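The check described above amounts to comparing a per-group queued count against a threshold. The following is a minimal Python sketch of that comparison, assuming the counts have already been read from JMX into a dictionary. The function name, group names, and numbers are illustrative assumptions and are not part of MRS or Presto.

```python
def groups_over_threshold(queued_by_group, thresholds, default_threshold):
    """Return the resource groups whose queued-task count exceeds its threshold.

    queued_by_group: dict mapping group name -> current queued-task count.
    thresholds: dict mapping group name -> alarm threshold (hypothetical layout).
    default_threshold: threshold used for groups missing from `thresholds`.
    """
    over = {}
    for group, queued in queued_by_group.items():
        limit = thresholds.get(group, default_threshold)
        # The alarm condition is "exceeds", so the comparison is strictly greater.
        if queued > limit:
            over[group] = (queued, limit)
    return over


# Hypothetical sample values; real counts come from the coordinator's JMX metrics.
sample = {"global": 12, "global.adhoc": 3}
print(groups_over_threshold(sample, {"global": 10}, default_threshold=5))
```

Each entry in the result pairs the observed count with the threshold that was exceeded, which is the information needed when deciding whether to raise the threshold or reduce the submitted load.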

Alarm Attributes

| Alarm ID | Alarm Severity | Auto Cleared |
|----------|----------------|--------------|
| 45329    | Minor          | Yes          |

Alarm Parameters

| Parameter         | Description                                             |
|-------------------|---------------------------------------------------------|
| ServiceName       | Specifies the service for which the alarm was generated. |
| RoleName          | Specifies the role for which the alarm was generated.    |
| HostName          | Specifies the host for which the alarm was generated.    |
| Trigger Condition | Specifies the threshold for triggering the alarm.        |

Impact on the System

A large number of tasks may remain queued and not be processed as expected. Once the number of queued tasks in a resource group exceeds its maxQueued threshold, new tasks cannot be executed.

Possible Causes

The resource group is improperly configured, or too many tasks have been submitted to the resource group.

Handling Procedure

  1. On FusionInsight Manager, choose Cluster, click the name of the desired cluster, and choose Services > Presto. On the page that is displayed, click Configurations, then All Configurations, and choose Coordinator > Customization. In the resource-groups parameter, modify the value of resourceGroupAlarm to adjust the threshold for each resource group.
  2. Collect fault information.

    1. Log in to the cluster node indicated by the host name in the fault information. On the Presto client, query the number of queued tasks for the resource group shown in Resource Group in the additional information.
    2. On the same node, open the /var/log/Bigdata/nodeagent/monitorlog/monitor.log file and search for resource group entries to view the monitoring data collected for the resource group.
    3. Contact O&M personnel and provide the collected logs.
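The log search in the collection step can be scripted. The following is a minimal Python sketch, assuming that relevant lines in monitor.log contain some spelling of "resource group" (for example resourceGroup or resource-group); the actual log format is not documented here, so the pattern and the helper name find_resource_group_lines are assumptions to adjust as needed.

```python
import re
from pathlib import Path

# Path taken from the handling procedure; adjust if the node differs.
MONITOR_LOG = Path("/var/log/Bigdata/nodeagent/monitorlog/monitor.log")


def find_resource_group_lines(log_path, group_name=None):
    """Return log lines that mention resource groups, optionally one group by name."""
    # Assumed pattern: matches "resource group", "resourceGroup", "resource-group", etc.
    pattern = re.compile(r"resource[\s_-]*group", re.IGNORECASE)
    matches = []
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if not pattern.search(line):
                continue
            if group_name is not None and group_name not in line:
                continue
            matches.append(line.rstrip("\n"))
    return matches


if __name__ == "__main__":
    # Only runs the search when the log file is present on this host.
    if MONITOR_LOG.exists():
        for hit in find_resource_group_lines(MONITOR_LOG):
            print(hit)
```

Passing the resource group name from the alarm's additional information as group_name narrows the output to that group, which keeps the extract provided to O&M personnel short.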

Alarm Clearance

This alarm is automatically cleared after the fault is rectified.

Related Information

None