Updated on 2025-08-09 GMT+08:00

ALM-12016 CPU Usage Exceeds the Threshold

Alarm Description

The system checks the CPU usage every 30 seconds and compares it with the threshold, which has a default value. This alarm is generated when the CPU usage exceeds the threshold a configurable number of consecutive times (Trigger Count, 10 by default).

The alarm is cleared in either of the following scenarios:

  • Trigger Count is 1 and the CPU usage is less than or equal to the threshold.
  • Trigger Count is greater than 1 and the CPU usage is less than or equal to 90% of the threshold.
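The trigger and clear rules described above can be sketched as a small state machine. This is only an illustration of the documented behavior, not the code FusionInsight Manager runs. The class name CpuAlarm, its fields, and the threshold of 80 used in the usage note are hypothetical, and the sketch assumes that a sample at or below the threshold resets the run of consecutive breaches.

```python
from dataclasses import dataclass


@dataclass
class CpuAlarm:
    """Illustrative model of the trigger/clear rule (hypothetical names)."""

    threshold: float         # alarm threshold in percent; the value is chosen by the caller
    trigger_count: int = 10  # consecutive breaches needed to raise the alarm
    breaches: int = 0        # current run of consecutive breaches
    active: bool = False     # whether the alarm is currently raised

    def observe(self, cpu_usage: float) -> bool:
        """Feed one 30-second sample and return whether the alarm is active."""
        if cpu_usage > self.threshold:
            self.breaches += 1
            if self.breaches >= self.trigger_count:
                self.active = True
        else:
            # Assumption: any sample at or below the threshold breaks the run.
            self.breaches = 0
            # Trigger Count 1 clears at the threshold itself; otherwise the
            # usage must fall to 90% of the threshold or lower.
            if self.trigger_count == 1:
                clear_level = self.threshold
            else:
                clear_level = 0.9 * self.threshold
            if self.active and cpu_usage <= clear_level:
                self.active = False
        return self.active
```

With a hypothetical threshold of 80 and a Trigger Count of 3, three consecutive samples of 85 raise the alarm, a following sample of 75 leaves it raised (75 is above 90% of 80), and a sample of 70 clears it. The gap between the trigger level and the clear level keeps the alarm from flapping when usage hovers around the threshold.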

Alarm Attributes

Alarm ID    Alarm Severity    Auto Cleared
12016       Major             Yes

Alarm Parameters

Parameter            Description
Source               Specifies the cluster or system for which the alarm is generated.
ServiceName          Specifies the service for which the alarm is generated.
RoleName             Specifies the role for which the alarm is generated.
HostName             Specifies the host for which the alarm is generated.
Trigger Condition    Specifies the threshold for triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.

Impact on the System

  • Latency: If the CPU usage of a host is too high, service processes may run slowly and services may be delayed.
  • Service failure: If the host CPU usage is too high, service processing may slow down, time out, or fail. As a result, jobs may fail to run.

Possible Causes

  • The alarm threshold or Trigger Count (alarm smoothing times) is set incorrectly.
  • The CPU configuration cannot meet service requirements, so the CPU usage reaches the upper limit, or the service is in peak hours and the CPU usage reaches the upper limit within a short period of time.

Handling Procedure

Check whether the alarm threshold and Trigger Count are set correctly.

  1. Change the alarm threshold and alarm Trigger Count based on CPU usage.

    On FusionInsight Manager, choose O&M > Alarm > Thresholds > Host > CPU > Host CPU Usage and change Trigger Count (the alarm smoothing times) based on CPU usage, as shown in Figure 1.

    Trigger Count indicates the number of consecutive times the threshold must be exceeded before the alarm is triggered.

    Figure 1 Setting alarm smoothing times

    On the Host CPU Usage page, click Modify in the Operation column to change the alarm threshold, as shown in Figure 2.

    Figure 2 Setting an alarm threshold

  2. After 2 minutes, check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to Step 3.

Check whether the CPU usage reaches the upper limit.

  3. In the alarm list on FusionInsight Manager, click the icon in the row that contains the alarm and view the alarm host address in the alarm details.
  4. On the Hosts page, click the node on which the alarm is reported.
  5. Observe the real-time data of the host CPU usage for about 5 minutes. If the CPU usage exceeds the threshold multiple times, contact the MRS cluster administrator to increase the CPU specifications.
  6. Check whether the alarm was generated during peak hours. If it was, expand the node capacity or contact the MRS cluster administrator to increase the CPU specifications.
  7. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to Step 8.
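As a cross-check on the real-time chart, CPU usage can also be estimated on the host from two readings of the aggregate cpu line in /proc/stat on Linux. The sketch below is an assumption about how to make such a measurement. The source does not state how FusionInsight Manager computes its Host CPU Usage metric, the function names are hypothetical, and treating iowait as idle time is one common convention rather than the only one.

```python
from typing import Tuple


def parse_cpu_line(line: str) -> Tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    fields = [int(value) for value in line.split()[1:]]
    # Assumption: count iowait as idle time when the kernel reports it.
    idle = fields[3] + (fields[4] if len(fields) > 4 else 0)
    # guest and guest_nice are already included in user and nice,
    # so only the first eight fields are summed.
    total = sum(fields[:8])
    return idle, total


def cpu_usage_percent(before: str, after: str) -> float:
    """Estimate CPU usage in percent between two 'cpu' lines read some time apart."""
    idle_before, total_before = parse_cpu_line(before)
    idle_after, total_after = parse_cpu_line(after)
    total_delta = total_after - total_before
    if total_delta <= 0:
        return 0.0  # no ticks elapsed between the two readings
    idle_delta = idle_after - idle_before
    return 100.0 * (1.0 - idle_delta / total_delta)
```

To use it, read the first line of /proc/stat, wait for an interval such as the 30-second check period mentioned above, read the line again, and pass both strings to cpu_usage_percent. Repeating this for about 5 minutes gives a series that can be compared with the alarm threshold.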

Collect fault information.

  8. On FusionInsight Manager of the active cluster, choose O&M > Log > Download.
  9. Expand the Service drop-down list, select OmmServer, and click OK.
  10. Click the icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
  11. Contact the O&M personnel and send the collected log information.
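The Start Date and End Date for log collection follow directly from the alarm generation time. A minimal sketch of that arithmetic is shown below. The function name log_window and its margin_minutes parameter are hypothetical helpers and are not part of FusionInsight Manager.

```python
from datetime import datetime, timedelta
from typing import Tuple


def log_window(alarm_time: datetime, margin_minutes: int = 10) -> Tuple[datetime, datetime]:
    """Return (start, end) covering margin_minutes before and after the alarm time."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin
```

For an alarm generated at 14:05 on a given day, the window runs from 13:55 to 14:15. Using datetime arithmetic rather than editing the minutes by hand also handles windows that cross midnight.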

Alarm Clearance

After the fault is rectified, the system automatically clears this alarm.

Related Information

None