Updated on 2024-01-17 GMT+08:00

ALM-16004 Hive Service Unavailable (For MRS 2.x or Earlier)

Description

The system checks the Hive service status every 30 seconds. This alarm is generated when the Hive service is unavailable.

This alarm is cleared when the Hive service recovers.
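
As a rough illustration of the generate-and-clear behavior described above, the sketch below models the periodic check as a sequence of availability samples. The names `AlarmState` and `replay` and the event strings are hypothetical and are not part of MRS; the real check runs inside MRS Manager and may apply additional delays.

```python
CHECK_INTERVAL_SECONDS = 30  # check period stated in the description


class AlarmState:
    """Hypothetical model of ALM-16004: active while Hive is unavailable."""

    def __init__(self):
        self.active = False

    def observe(self, hive_available):
        """Process one periodic check and return the resulting event, if any."""
        if not hive_available and not self.active:
            self.active = True
            return "generated"
        if hive_available and self.active:
            self.active = False
            return "cleared"
        return None


def replay(samples):
    """Feed availability samples (one per check interval) through the model."""
    state = AlarmState()
    return [state.observe(sample) for sample in samples]
```

In this model, repeated failed checks do not generate the alarm again; only a change in availability produces an event.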

Attribute

| Alarm ID | Alarm Severity | Auto Clear |
|----------|----------------|------------|
| 16004    | Critical       | Yes        |

Parameters

| Parameter   | Description                                             |
|-------------|---------------------------------------------------------|
| ServiceName | Specifies the service for which the alarm is generated. |
| RoleName    | Specifies the role for which the alarm is generated.    |
| HostName    | Specifies the host for which the alarm is generated.    |

Impact on the System

The system cannot provide data loading, query, and extraction services.

Possible Causes

  • A basic service (ZooKeeper, HDFS, Yarn, or DBService) is not working properly, or the Hive process is faulty.
    • ZooKeeper is abnormal.
    • HDFS is abnormal.
    • Yarn is abnormal.
    • DBService is abnormal.
    • The Hive service process is faulty. If the alarm is caused by a Hive process fault, the alarm is reported with a delay of about 5 minutes.
  • The network communication between the Hive service and basic services is interrupted.
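
The dependency causes listed above correspond to alarms that the procedure checks in a fixed order: ZooKeeper, HDFS, Yarn, then DBService. The following is a minimal sketch of that triage order, assuming the active alarms have been copied by hand from the alarm list into dictionaries. The input format and the function name `first_dependency_fault` are hypothetical and are not an MRS API.

```python
# Order in which the procedure checks dependency alarms.
CHECK_ORDER = [
    ("ZooKeeper", "ALM-12007"),
    ("HDFS", "ALM-14000"),
    ("Yarn", "ALM-18000"),
    ("DBService", "ALM-27001"),
]


def first_dependency_fault(active_alarms):
    """Return (service, alarm_id) for the first dependency alarm found, else None.

    active_alarms: iterable of dicts such as
    {"id": "ALM-12007", "service_name": "ZooKeeper"} (hypothetical format).
    """
    active = list(active_alarms)
    for service, alarm_id in CHECK_ORDER:
        for alarm in active:
            if alarm["id"] != alarm_id:
                continue
            # ALM-12007 is a generic process alarm; it is relevant here
            # only when its ServiceName is ZooKeeper.
            if alarm_id == "ALM-12007" and alarm.get("service_name") != "ZooKeeper":
                continue
            return service, alarm_id
    return None
```

A result of `None` suggests that none of the dependency alarms is present, in which case the network check between Hive and its dependencies is the remaining cause to examine.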

Procedure

  1. Check the HiveServer/MetaStore process status.

    1. Go to the MRS cluster details page and click Components.
    2. Choose Hive > Instances. In the Hive instance list, check whether the status of all HiveServer/MetaStore instances is Unknown.
      • If yes, go to 1.3.
      • If no, go to 2.
    3. Above the Hive instance list, choose More > Restart Instance to restart the HiveServer/MetaStore process.
    4. In the alarm list, check whether ALM-16004 Hive Service Unavailable is cleared.
      • If yes, no further action is required.
      • If no, go to 2.

  2. Check the ZooKeeper status.

    1. Go to the cluster details page and choose Alarms.
    2. On MRS Manager, check whether the ALM-12007 Process Fault alarm is reported.
      • If yes, go to 2.3.
      • If no, go to 3.
    3. In the Alarm Details area of ALM-12007 Process Fault, check whether ServiceName is ZooKeeper.
      • If yes, go to 2.4.
      • If no, go to 3.
    4. Rectify the fault by following the steps provided in ALM-12007 Process Fault.
    5. In the alarm list, check whether ALM-16004 Hive Service Unavailable is cleared.
      • If yes, no further action is required.
      • If no, go to 3.

  3. Check the HDFS status.

    1. Go to the cluster details page and choose Alarms.
    2. In the alarm list, check whether the alarm ALM-14000 HDFS Service Unavailable exists.
      • If yes, go to 3.3.
      • If no, go to 4.
    3. Rectify the fault by following the steps provided in ALM-14000 HDFS Service Unavailable.
    4. In the alarm list, check whether ALM-16004 Hive Service Unavailable is cleared.
      • If yes, no further action is required.
      • If no, go to 4.

  4. Check the Yarn status.

    1. Go to the cluster details page and choose Alarms.
    2. In the alarm list on MRS Manager, check whether the alarm ALM-18000 Yarn Service Unavailable is generated.
      • If yes, go to 4.3.
      • If no, go to 5.
    3. Rectify the fault by following the steps provided in ALM-18000 Yarn Service Unavailable.
    4. In the alarm list, check whether ALM-16004 Hive Service Unavailable is cleared.
      • If yes, no further action is required.
      • If no, go to 5.

  5. Check the DBService status.

    1. Go to the cluster details page and choose Alarms.
    2. In the alarm list on MRS Manager, check whether ALM-27001 DBService Unavailable is generated.
      • If yes, go to 5.3.
      • If no, go to 6.
    3. Rectify the fault by following the handling procedure in ALM-27001 DBService Unavailable (For MRS 2.x or Earlier).
    4. In the alarm list, check whether ALM-16004 Hive Service Unavailable is cleared.
      • If yes, no further action is required.
      • If no, go to 6.

  6. Check the network connection between Hive and ZooKeeper, HDFS, Yarn, and DBService.

    1. Go to the MRS cluster details page and click Components.
    2. Click Hive.
    3. Click Instances.

      The HiveServer instance list is displayed.

    4. Click Host Name in the row of HiveServer.

      The HiveServer host status page is displayed.

    5. Record the IP address under Summary.
    6. Use the IP address obtained in 6.5 to log in to the host where HiveServer is located.
    7. Run the ping command to check whether the host that runs HiveServer can reach the hosts that run ZooKeeper, HDFS, Yarn, and DBService. Obtain the IP addresses of those hosts in the same way as the HiveServer IP address.
      • If yes, go to 7.
      • If no, go to 6.8.
    8. Contact the O&M personnel to restore the network.
    9. In the alarm list, check whether ALM-16004 Hive Service Unavailable is cleared.
      • If yes, no further action is required.
      • If no, go to 7.

  7. Collect fault information.

    1. On MRS Manager, choose System > Export Log.
    2. Contact the O&M engineers and send the collected logs.
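
The connectivity check in step 6 could be scripted roughly as follows and run on the HiveServer host. This is only a sketch: it assumes a Linux host with the standard `ping` utility, and the function names and the example addresses in the usage note are hypothetical.

```python
import subprocess


def ping_command(host, count=3, timeout_seconds=5):
    """Build a Linux ping command: -c is the packet count, -W the reply timeout in seconds."""
    return ["ping", "-c", str(count), "-W", str(timeout_seconds), host]


def is_reachable(host, run=subprocess.run):
    """Return True if ping exits with status 0 for the given host."""
    result = run(
        ping_command(host),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def unreachable_hosts(hosts, check=is_reachable):
    """Return the hosts that did not answer ping, preserving input order."""
    return [host for host in hosts if not check(host)]
```

For example, `unreachable_hosts(["192.0.2.11", "192.0.2.12"])` returns the subset of those placeholder addresses that did not respond. Replace them with the IP addresses recorded for the ZooKeeper, HDFS, Yarn, and DBService hosts. An empty list corresponds to the "yes" branch of the ping check.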

Reference

None