Updated on 2024-01-17 GMT+08:00

ALM-26053 Slot Usage of Storm Exceeds the Threshold (For MRS 2.x or Earlier)

Description

The system checks the slot usage of Storm every 60 seconds and compares it with the threshold. This alarm is generated if the slot usage exceeds the threshold.

To modify the threshold, users can choose System > Threshold Configuration on MRS Manager.

This alarm is cleared if the slot usage is lower than or equal to the threshold.
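
The generate-and-clear rule above can be sketched in a few lines. This is only an illustration, not MRS Manager's implementation: it assumes slot usage means used slots divided by total slots, expressed as a percentage, and the function names and the threshold of 80 are hypothetical.

```python
def slot_usage_percent(used_slots, total_slots):
    """Return used slots as a percentage of total slots (assumed definition)."""
    if total_slots <= 0:
        raise ValueError("total_slots must be positive")
    return 100.0 * used_slots / total_slots


def alarm_active(used_slots, total_slots, threshold_percent):
    """True while usage is strictly above the threshold.

    The alarm clears once usage is lower than or equal to the threshold,
    so a value exactly at the threshold does not raise the alarm.
    """
    return slot_usage_percent(used_slots, total_slots) > threshold_percent


# Hypothetical threshold of 80%: 9 of 10 slots in use raises the alarm,
# 8 of 10 (exactly at the threshold) does not.
print(alarm_active(9, 10, 80.0))
print(alarm_active(8, 10, 80.0))
```

Note the asymmetry: the alarm is generated only when usage is strictly greater than the threshold, while equality already counts as cleared.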

Attribute

Alarm ID    Alarm Severity    Auto Clear
26053       Major             Yes

Parameters

Parameter           Description
ServiceName         Specifies the service for which the alarm is generated.
RoleName            Specifies the role for which the alarm is generated.
HostName            Specifies the host for which the alarm is generated.
Trigger condition   Generates an alarm when the actual indicator value exceeds the specified threshold.

Impact on the System

Users cannot run new Storm tasks.

Possible Causes

  • Supervisors are abnormal in the cluster.
  • Supervisors are normal but have poor processing capability.

Procedure

  1. Check the supervisor status.

    1. Go to the cluster details page and click Components.
    2. Choose Storm > Supervisor.
    3. In Role, check whether the cluster has supervisor instances that are in the Faulty or Recovering state.
      • If yes, go to 1.d.
      • If no, go to 2.a or 3.a.
    4. Select the supervisor instances that are in the Faulty or Recovering state and choose More > Restart Instance.
      • If the restart succeeds, go to 1.e.
      • If the restart fails, go to 4.
    5. Wait a moment and then check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 2.a or 3.a.

  2. Increase the number of slots for the supervisors.

    1. Go to the cluster details page and click Components.
    2. Choose Storm > Supervisor > Service Configuration, and set Type to All.
    3. Add ports to the value of supervisor.slots.ports to increase the number of slots for each supervisor (in Storm, each listed port provides one worker slot). Then restart the instances.
    4. Wait a moment and then check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 4.

  3. Expand the capacity of the supervisors.

    1. Add nodes.
    2. Wait a moment and then check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 4.

  4. Collect fault information.

    1. On MRS Manager, choose System > Export Log.
    2. Contact the O&M engineers and send the collected logs.
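
For step 2.c, a rough way to estimate how many slots are needed before editing supervisor.slots.ports is sketched below. It assumes the parameter value is a comma-separated list of ports (the format shown in MRS Manager may differ), that usage is used slots divided by total slots, and that new ports can simply follow the highest existing one. All function names and the sample numbers are hypothetical, and the chosen ports must be free on the supervisor hosts.

```python
import math


def parse_slot_ports(value):
    """Parse a comma-separated port list such as "6700,6701,6702"."""
    return [int(p) for p in value.split(",") if p.strip()]


def min_total_slots(used_slots, threshold_percent):
    """Smallest total slot count for which used/total*100 <= threshold."""
    return math.ceil(used_slots * 100 / threshold_percent)


def extend_ports(ports, extra):
    """Append `extra` consecutive ports after the highest existing port."""
    start = max(ports) + 1
    return ports + list(range(start, start + extra))


# Hypothetical cluster: 5 supervisors with 4 ports each, 18 slots in use,
# and a threshold of 80%.
ports = parse_slot_ports("6700,6701,6702,6703")
supervisors = 5
needed_total = min_total_slots(used_slots=18, threshold_percent=80)
needed_per_supervisor = math.ceil(needed_total / supervisors)
extra = max(0, needed_per_supervisor - len(ports))
print(",".join(str(p) for p in extend_ports(ports, extra)))
```

More slots per supervisor also means more worker processes per host, so check that each host has enough CPU and memory before raising the count. If it does not, adding nodes (step 3) is the safer option.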

Related Information

N/A