Updated on 2022-11-04 GMT+08:00

ALM-23001 Loader Service Unavailable

Description

The system checks the Loader service availability every 60 seconds. This alarm is generated when the system detects that the Loader service is unavailable. This alarm is cleared when the Loader service is available.
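The generate-and-clear behaviour described above can be pictured as a small state machine that is evaluated once per check. The following is a minimal sketch only, not FusionInsight's implementation: the names AlarmState and update_alarm are hypothetical, and the only value taken from this page is the 60-second interval.

```python
from dataclasses import dataclass

# Interval between availability checks, as stated in the description.
CHECK_INTERVAL_SECONDS = 60


@dataclass
class AlarmState:
    """Hypothetical holder for whether the alarm is currently raised."""
    raised: bool = False


def update_alarm(state: AlarmState, service_available: bool) -> str:
    """Apply one periodic check and return the action taken.

    Returns "raise" when the service has just become unavailable,
    "clear" when it has just recovered, and "none" otherwise.
    """
    if not service_available and not state.raised:
        state.raised = True
        return "raise"
    if service_available and state.raised:
        state.raised = False
        return "clear"
    return "none"
```

Under this model, repeated failed checks do not raise the alarm again, and the alarm clears on the first check that finds the service available.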

Attribute

Alarm ID    Alarm Severity    Automatically Cleared
23001       Critical          Yes

Parameters

Name           Meaning
Source         Specifies the cluster for which the alarm is generated.
ServiceName    Specifies the service for which the alarm is generated.
RoleName       Specifies the role for which the alarm is generated.
HostName       Specifies the host for which the alarm is generated.

Impact on the System

When the Loader service is unavailable, the data loading, import, and conversion functions are unavailable.

Possible Causes

  • An internal service on which the Loader service depends is abnormal.
    • The ZooKeeper service is abnormal.
    • The HDFS service is abnormal.
    • The DBService service is abnormal.
    • The Yarn service is abnormal.
    • The Mapreduce service is abnormal.
  • Environment fault: The network is abnormal, so the Loader service cannot communicate with the internal services it depends on and cannot provide services.
  • Software fault: The Loader service cannot run properly.
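As a triage aid for the dependency causes listed above, the sketch below picks out which dependent services are not in the Normal running status. It is an illustration under stated assumptions: the statuses dictionary is hypothetical input that an operator would fill in by hand from the Manager service pages, and abnormal_dependencies is not a FusionInsight API. The service order follows the order in which the procedure checks them.

```python
# Services the Loader service depends on, in the order the procedure checks them.
LOADER_DEPENDENCIES = ["ZooKeeper", "HDFS", "DBService", "Mapreduce", "Yarn"]


def abnormal_dependencies(statuses: dict) -> list:
    """Return dependent services whose running status is not "Normal".

    `statuses` maps a service name to the running status read from the
    Manager UI. A service missing from the mapping is treated as abnormal,
    because its state is unknown.
    """
    return [
        name for name in LOADER_DEPENDENCIES
        if statuses.get(name) != "Normal"
    ]
```

An empty result suggests moving on to the network and software checks rather than restarting dependent services.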

Procedure

Check the ZooKeeper service status.

  1. On the FusionInsight Manager home page, choose Cluster > Name of the desired cluster > Services > ZooKeeper to check whether the ZooKeeper running status is Normal.

    • If yes, go to 3.
    • If no, go to 2.

  2. Choose More > Restart Service to restart the ZooKeeper service. In the alarm list, check whether Loader Service Unavailable is cleared.

    • If yes, no further action is required.
    • If no, go to 3.

  3. On FusionInsight Manager, check whether the alarm list contains Process Fault.

    • If yes, go to 4.
    • If no, go to 7.

  4. In the Location area of Process Fault, check whether ServiceName is ZooKeeper.

    • If yes, go to 5.
    • If no, go to 7.

  5. Rectify the fault by following the steps provided in ALM-12007 Process Fault.
  6. In the alarm list, check whether Loader Service Unavailable is cleared.

    • If yes, no further action is required.
    • If no, go to 7.

Check the HDFS service status.

  7. On FusionInsight Manager, check whether the alarm list contains HDFS Service Unavailable.

    • If yes, go to 8.
    • If no, go to 10.

  8. Rectify the fault by following the steps provided in ALM-14000 HDFS Service Unavailable.
  9. In the alarm list, check whether Loader Service Unavailable is cleared.

    • If yes, no further action is required.
    • If no, go to 10.

Check the DBService status.

  10. On the FusionInsight Manager home page, choose Cluster > Name of the desired cluster > Services > DBService to check whether the DBService running status is Normal.

    • If yes, go to 12.
    • If no, go to 11.

  11. Choose More > Restart Service to restart the DBService service. In the alarm list, check whether Loader Service Unavailable is cleared.

    • If yes, no further action is required.
    • If no, go to 12.

Check the Mapreduce status.

  12. On the FusionInsight Manager home page, choose Cluster > Name of the desired cluster > Services > Mapreduce to check whether the Mapreduce running status is Normal.

    • If yes, go to 16.
    • If no, go to 13.

  13. Choose More > Restart Service to restart the Mapreduce service. In the alarm list, check whether Loader Service Unavailable is cleared.

    • If yes, no further action is required.
    • If no, go to 16.

Check the Yarn status.

  14. On the FusionInsight Manager home page, choose Cluster > Name of the desired cluster > Services > Yarn to check whether the Yarn running status is Normal.

    • If yes, go to 16.
    • If no, go to 15.

  15. Choose More > Restart Service to restart the Yarn service. In the alarm list, check whether Loader Service Unavailable is cleared.

    • If yes, no further action is required.
    • If no, go to 16.

  16. On FusionInsight Manager, check whether the alarm list contains Yarn Service Unavailable.

    • If yes, go to 17.
    • If no, go to 19.

  17. Rectify the fault by following the steps provided in ALM-18000 Yarn Service Unavailable.
  18. In the alarm list, check whether Loader Service Unavailable is cleared.

    • If yes, no further action is required.
    • If no, go to 19.

Check the network connection between Loader and dependent components.

  19. On FusionInsight Manager, choose Cluster > Name of the desired cluster > Services > Loader.
  20. Click Instance. The LoaderServer instance list is displayed.
  21. Record the Management IP Address in the row of LoaderServer(Active).
  22. Log in as user omm to the host where the active LoaderServer runs, using the IP address obtained in 21.
  23. Run the ping command to check whether communication is normal between the host that runs the active LoaderServer and the hosts that run the dependent components. (The dependent components are ZooKeeper, DBService, HDFS, Mapreduce, and Yarn. Obtain the IP addresses of the hosts that run these services in the same way as for the active LoaderServer.)

    • If yes, go to 26.
    • If no, go to 24.

  24. Contact the administrator to restore the network.
  25. In the alarm list, check whether Loader Service Unavailable is cleared.

    • If yes, no further action is required.
    • If no, go to 26.
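The ping check in this section can be scripted when several dependent hosts must be tested from the active LoaderServer host. The sketch below is a minimal example under stated assumptions: it assumes a Linux host whose ping command accepts the -c (packet count) and -W (reply timeout in seconds) options, and the function names is_reachable and unreachable_hosts are hypothetical helpers, not part of FusionInsight.

```python
import subprocess


def is_reachable(host: str, count: int = 3, timeout_s: int = 5) -> bool:
    """Return True if the system ping command reports a reply from `host`.

    Assumes Linux ping options: -c sets the number of packets and
    -W sets the time to wait for each reply, in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", str(count), "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def unreachable_hosts(hosts, check=is_reachable):
    """Return the hosts that fail `check`, preserving the input order."""
    return [host for host in hosts if not check(host)]
```

Calling unreachable_hosts with the management IP addresses recorded for the ZooKeeper, DBService, HDFS, Mapreduce, and Yarn hosts returns the addresses that did not answer. An empty list corresponds to the "yes" branch of the ping check, and a non-empty list is what to report to the administrator.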

Collect fault information.

  26. On FusionInsight Manager, choose O&M > Log > Download.
  27. From the Service drop-down list, select the following services in the required cluster:

    • ZooKeeper
    • HDFS
    • DBService
    • Yarn
    • Mapreduce
    • Loader

  28. Click the icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and after the alarm generation time, respectively. Then, click Download.
  29. On FusionInsight Manager, choose Cluster > Name of the desired cluster > Services > Loader.
  30. Choose More > Restart Service, and click OK.
  31. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 32.

  32. Contact the O&M personnel and send the collected logs.
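The log collection window used in this section (10 minutes on either side of the alarm generation time) is simple to compute by hand, but a small helper avoids off-by-one mistakes around hour or day boundaries. This is an illustrative sketch only: log_window is a hypothetical helper, and the alarm time must be read from the alarm list by the operator.

```python
from datetime import datetime, timedelta


def log_window(alarm_time: datetime, margin_minutes: int = 10):
    """Return (start, end) for log collection around `alarm_time`.

    The default margin of 10 minutes matches the log collection step.
    """
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin
```

The two returned values are the ones to enter as Start Date and End Date in the log download dialog.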

Alarm Clearing

After the fault is rectified, the system automatically clears this alarm.

Related Information

None