Updated on 2024-11-29 GMT+08:00

ALM-16001 Hive Warehouse Space Usage Exceeds the Threshold

Alarm Description

The system checks the Hive warehouse space usage every 30 seconds. You can view Percentage of HDFS Space Used by Hive to the Available Space on the Hive service monitoring page. This alarm is generated when the Hive warehouse space usage exceeds the specified threshold.

To change the threshold, choose O&M. In the navigation pane on the left, click Alarm > Thresholds > Name of the desired cluster > Hive > Percentage of HDFS Space Used by Hive to the Available Space.

When the number of trigger times is 1, this alarm is cleared if the Hive warehouse space usage is less than or equal to the threshold. When the number of trigger times is greater than 1, this alarm is cleared if the Hive warehouse space usage is less than or equal to 90% of the threshold.
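
The clearing rule above can be expressed as a small check. The following Python sketch is illustrative only: the function name and its parameters are hypothetical and are not part of any MRS or FusionInsight Manager API. It assumes that the usage and the threshold are both given as percentages.

```python
def alarm_should_clear(usage_pct: float, threshold_pct: float, trigger_count: int) -> bool:
    """Return True if the alarm would clear under the rule described above.

    Hypothetical helper for illustration; not an MRS API.
    """
    if trigger_count < 1:
        raise ValueError("trigger_count must be at least 1")
    # With a single trigger, the alarm clears at or below the threshold itself.
    if trigger_count == 1:
        return usage_pct <= threshold_pct
    # With more than one trigger, usage must fall to 90% of the threshold or lower.
    return usage_pct <= 0.9 * threshold_pct


# Example with an 85% threshold: 80% usage clears the alarm only when
# the number of trigger times is 1.
print(alarm_should_clear(80, 85, 1))
print(alarm_should_clear(80, 85, 3))
```

With a threshold of 85% and more than one trigger, the usage must drop to 76.5% or lower before the alarm clears.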

The MRS cluster administrator can reduce the Hive warehouse space usage by increasing the warehouse capacity or by releasing used space.

Alarm Attributes

Alarm ID: 16001

Alarm Severity:

  • MRS 3.3.0 and earlier: Minor (default threshold: 85%)
  • MRS 3.3.0 and later: Critical (default threshold: 95%) and Major (default threshold: 85%)

Alarm Type: Quality of service

Service Type: Hive

Auto Cleared: Yes

Alarm Parameters

Location Information parameters:

  • Source: Specifies the cluster for which the alarm was generated.
  • ServiceName: Specifies the service for which the alarm was generated.
  • RoleName: Specifies the role for which the alarm was generated.
  • HostName: Specifies the host for which the alarm was generated.

Additional Information parameters:

  • Trigger Condition: Specifies the alarm triggering condition.

Impact on the System

The system cannot write data properly. Some data may be lost.

Possible Causes

  • The upper limit of the HDFS capacity available for Hive is too small.
  • The HDFS space is insufficient.
  • Some DataNodes are faulty.

Handling Procedure

Increase the HDFS capacity available to Hive.

  1. Analyze the cluster HDFS space usage and increase the HDFS capacity for Hive.

    Log in to FusionInsight Manager, choose Cluster > Name of the desired cluster > Services > Hive > Configuration, select All Configurations, search for hive.metastore.warehouse.size.percent, and increase its value. Let A be the value of this configuration item, B the total HDFS storage space, C the alarm threshold, and D the HDFS space used by Hive. Set A so that A × B × C > D. You can view the total HDFS storage space on the HDFS NameNode page and the HDFS space used by Hive on the Hive monitoring page.

  2. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 3.
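
As a worked example of the inequality in step 1, the following Python sketch computes the value that A must exceed and checks a candidate setting. It is a minimal illustration under stated assumptions: the function names are hypothetical, A and C are treated as fractions between 0 and 1 (confirm how your cluster expresses hive.metastore.warehouse.size.percent and the threshold before applying a value), and B and D use the same unit.

```python
def min_warehouse_fraction(total_hdfs: float, threshold: float, hive_used: float) -> float:
    """Return the bound that A must exceed so that A x B x C > D.

    Hypothetical helper; assumes threshold is a fraction in (0, 1]
    and that total_hdfs and hive_used share one unit (for example, TB).
    """
    if total_hdfs <= 0:
        raise ValueError("total_hdfs must be positive")
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be a fraction in (0, 1]")
    return hive_used / (total_hdfs * threshold)


def satisfies_rule(a: float, total_hdfs: float, threshold: float, hive_used: float) -> bool:
    """Check the inequality A x B x C > D for a candidate value of A."""
    return a * total_hdfs * threshold > hive_used


# Assumed figures: B = 100 TB total HDFS space, C = 0.85, D = 30 TB used by Hive.
bound = min_warehouse_fraction(100, 0.85, 30)
print(f"A must exceed {bound:.3f}")
print(satisfies_rule(0.4, 100, 0.85, 30))
print(satisfies_rule(0.3, 100, 0.85, 30))
```

With these assumed figures, A = 0.4 gives 0.4 × 100 × 0.85 = 34, which is greater than 30, whereas A = 0.3 gives 25.5, which is not.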

Perform capacity expansion for the system.

  3. Expand the system.
  4. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 5.

Check whether the DataNodes are normal.

  5. Log in to FusionInsight Manager and choose O&M > Alarm > Alarms.
  6. Check whether any of the following alarms is reported: ALM-12006 NodeAgent Process Is Abnormal, ALM-12007 Process Fault, or ALM-14002 DataNode Disk Usage Exceeds the Threshold.

    • If yes, go to 7.
    • If no, go to 9.

  7. Clear the alarm by following the steps provided in ALM-12006 NodeAgent Process Is Abnormal, ALM-12007 Process Fault, or ALM-14002 DataNode Disk Usage Exceeds the Threshold.
  8. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 9.

Collect fault information.

  9. On FusionInsight Manager, choose O&M. In the navigation pane on the left, choose Log > Download.
  10. Expand the Service drop-down list, and select Hive for the target cluster.
  11. Click the edit icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
  12. Contact O&M engineers and provide the collected logs.
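
The log collection window described above can be derived from the alarm generation time. The following Python sketch is a convenience for working out the Start Date and End Date values; the function name is hypothetical, and the alarm time shown is an invented example rather than a value from a real cluster.

```python
from datetime import datetime, timedelta


def log_window(alarm_time: datetime, margin_minutes: int = 10) -> tuple[datetime, datetime]:
    """Return (start, end) covering margin_minutes before and after alarm_time.

    Hypothetical helper for illustration; not an MRS API.
    """
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Example alarm generation time (assumed for illustration).
start, end = log_window(datetime(2024, 11, 29, 10, 5))
print("Start Date:", start.isoformat(sep=" "))
print("End Date:", end.isoformat(sep=" "))
```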

Alarm Clearance

This alarm is automatically cleared after the fault is rectified.

Related Information

None.