Updated on 2024-04-11 GMT+08:00

ALM-23001 Loader Service Unavailable (For MRS 2.x or Earlier)

Description

The system checks the Loader service availability every 60 seconds. This alarm is generated if the Loader service is unavailable and is cleared after the Loader service recovers.
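
The generate-and-clear behavior described above can be illustrated with a minimal sketch. It is not the MRS implementation: the function name `alarm_action` and the state labels are hypothetical, chosen only to show how one 60-second check cycle would map to an alarm action.

```shell
#!/bin/sh
# Hypothetical sketch: decide the alarm action for a single check cycle.
#   $1 = current alarm state:        "active" or "inactive"
#   $2 = result of this cycle's check: "available" or "unavailable"
alarm_action() {
    if [ "$2" = "unavailable" ] && [ "$1" = "inactive" ]; then
        echo "generate"   # Loader became unavailable: raise ALM-23001
    elif [ "$2" = "available" ] && [ "$1" = "active" ]; then
        echo "clear"      # Loader recovered: the alarm is cleared automatically
    else
        echo "none"       # no state change in this cycle
    fi
}

alarm_action inactive unavailable
alarm_action active available
```

Under these assumptions, the first call prints `generate` and the second prints `clear`; any cycle without a state change prints `none`.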

Attribute

Alarm ID    Alarm Severity    Auto Clear
23001       Critical          Yes

Parameters

Parameter      Description
ServiceName    Specifies the service for which the alarm is generated.
RoleName       Specifies the role for which the alarm is generated.
HostName       Specifies the host for which the alarm is generated.

Impact on the System

Data loading, import, and conversion are unavailable.

Possible Causes

  • The services that Loader depends on are abnormal.
    • ZooKeeper is abnormal.
    • HDFS is abnormal.
    • DBService is abnormal.
    • Yarn is abnormal.
    • MapReduce is abnormal.
  • The network is faulty. Loader cannot communicate with its dependent services.
  • Loader is running improperly.
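
As a quick reference, the procedure in this document checks the dependent services in a fixed order and, for some of them, points to a related alarm. The sketch below only restates that mapping; the function name `related_alarm` is hypothetical and is not an MRS command.

```shell
#!/bin/sh
# Hypothetical lookup: which related alarm the procedure checks for each
# service that Loader depends on. Alarm names are taken from this document.
related_alarm() {
    case "$1" in
        ZooKeeper) echo "ALM-12007 Process Fault" ;;
        HDFS)      echo "ALM-14000 HDFS Service Unavailable" ;;
        Yarn)      echo "ALM-18000 Yarn Service Unavailable" ;;
        DBService|MapReduce)
                   echo "none (check the health status on the Components page)" ;;
        *)         echo "unknown service: $1" >&2; return 1 ;;
    esac
}

# Print the services in the order the procedure checks them.
for svc in ZooKeeper HDFS DBService MapReduce Yarn; do
    printf '%-10s %s\n' "$svc" "$(related_alarm "$svc")"
done
```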

Procedure

  1. Check the ZooKeeper status.

    a. Go to the MRS cluster details page and click Components.
    b. Choose ZooKeeper and check whether the health status of ZooKeeper is normal.
      • If yes, go to 1.d.
      • If no, go to 1.c.
    c. Choose More > Restart Service to restart ZooKeeper. After ZooKeeper starts, check whether the "ALM-23001 Loader Service Unavailable" alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 1.d.
    d. On MRS Manager, check whether the "ALM-12007 Process Fault" alarm is reported.
      • If yes, go to 1.e.
      • If no, go to 2.a.
    e. In Alarm Details of the "ALM-12007 Process Fault" alarm, check whether ServiceName is ZooKeeper.
      • If yes, go to 1.f.
      • If no, go to 2.a.
    f. Clear the alarm according to the handling suggestions of "ALM-12007 Process Fault".
    g. Check whether the "ALM-23001 Loader Service Unavailable" alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 2.a.

  2. Check the HDFS status.

    a. Go to the MRS cluster details page and choose Alarms.
    b. On MRS Manager, check whether the "ALM-14000 HDFS Service Unavailable" alarm is reported.
      • If yes, go to 2.c.
      • If no, go to 3.a.
    c. Clear the alarm according to the handling suggestions of "ALM-14000 HDFS Service Unavailable".
    d. Check whether the "ALM-23001 Loader Service Unavailable" alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 3.a.

  3. Check the DBService status.

    a. Go to the MRS cluster details page and click Components.
    b. Choose DBService and check whether the health status of DBService is normal.
      • If yes, go to 4.a.
      • If no, go to 3.c.
    c. Choose More > Restart Service to restart DBService. After DBService starts, check whether the "ALM-23001 Loader Service Unavailable" alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 4.a.

  4. Check the MapReduce status.

    a. Go to the MRS cluster details page and click Components.
    b. Choose MapReduce and check whether the health status of MapReduce is normal.
      • If yes, go to 5.a.
      • If no, go to 4.c.
    c. Choose More > Restart Service to restart MapReduce. After MapReduce starts, check whether the "ALM-23001 Loader Service Unavailable" alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 5.a.

  5. Check the Yarn status.

    a. Go to the MRS cluster details page and click Components.
    b. Choose Yarn and check whether the health status of Yarn is normal.
      • If yes, go to 5.d.
      • If no, go to 5.c.
    c. Choose More > Restart Service to restart Yarn. After Yarn starts, check whether the "ALM-23001 Loader Service Unavailable" alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 5.d.
    d. On MRS Manager, check whether the "ALM-18000 Yarn Service Unavailable" alarm is reported.
      • If yes, go to 5.e.
      • If no, go to 6.a.
    e. Clear the alarm according to the handling suggestions of "ALM-18000 Yarn Service Unavailable".
    f. Check whether the "ALM-23001 Loader Service Unavailable" alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 6.a.

  6. Check the network connections between Loader and its dependent components.

    a. Go to the MRS cluster details page and click Components.
    b. Click Loader.
    c. Click Instance. The Sqoop instance list is displayed.
    d. Record the management IP addresses of all Sqoop instances.
    e. Log in to each host using the IP addresses obtained in 6.d, then run the following commands to switch to user omm:

      sudo su - root

      su - omm

    f. Run the ping command to check whether the hosts where the Sqoop instances reside can reach the dependent components (ZooKeeper, DBService, HDFS, MapReduce, and Yarn). Obtain the IP addresses of the dependent components in the same way as for the Sqoop instances.
      • If yes, go to 7.
      • If no, go to 6.g.
    g. Contact the network administrator to repair the network.
    h. Check whether the "ALM-23001 Loader Service Unavailable" alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 7.

  7. Collect fault information.

    a. On MRS Manager, choose System > Export Log.
    b. Contact the O&M engineers and send the collected logs.
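
The connectivity check in step 6 can be repeated for many hosts with a small loop. The following is a minimal sketch, not an MRS tool: the function name `check_hosts` and the `PING_CMD` variable are hypothetical, the default options (`-c 3 -W 2`) assume the Linux iputils `ping`, and the addresses in the usage comment are placeholders to be replaced with the management IP addresses recorded for the dependent components.

```shell
#!/bin/sh
# Hypothetical helper: ping each address once per call and report the result.
# PING_CMD can be overridden; the default sends 3 echo requests and waits up
# to 2 seconds for each reply (Linux iputils ping options).
PING_CMD=${PING_CMD:-"ping -c 3 -W 2"}

check_hosts() {
    failed=0
    for ip in "$@"; do
        # $PING_CMD is intentionally unquoted so its options are split into words.
        if $PING_CMD "$ip" >/dev/null 2>&1; then
            echo "OK   $ip"
        else
            echo "FAIL $ip"
            failed=1
        fi
    done
    # Returns 0 only if every address answered.
    return "$failed"
}

# Usage (placeholder addresses):
#   check_hosts 192.0.2.11 192.0.2.12 192.0.2.13
```

A nonzero return value means at least one address did not respond, which corresponds to the "If no" branch of the ping check in step 6.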