
HDFS Capacity Usage Reaches 100%, Causing Unavailable Upper-layer Services Such as HBase and Spark

Issue

The HDFS capacity usage of the cluster reaches 100%, and the HDFS service status is read-only. As a result, upper-layer services such as HBase and Spark are unavailable.

Symptom

The HDFS capacity usage is 100%, the disk capacity usage is only about 85%, and the HDFS service status is read-only. As a result, upper-layer services such as HBase and Spark are unavailable.

Cause Analysis

Currently, NodeManager and DataNode share the same data disks. By default, MRS reserves 15% of each data disk's space for non-HDFS use. You can change this percentage by setting the HDFS parameter dfs.datanode.du.reserved.percentage.

If HDFS disk usage reaches 100%, you can set dfs.datanode.du.reserved.percentage to a smaller value to restore service temporarily, and then expand the disk capacity.
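
As a quick sketch of the effect, assuming for illustration a 1 TiB data disk on each DataNode (the disk size and resulting figures are examples only):

    # Assumed 1 TiB data disk; the sizes here are illustrative only.
    DISK_BYTES=$((1024 ** 4))
    USABLE_AT_15=$((DISK_BYTES * 85 / 100))  # HDFS-usable space with 15% reserved
    USABLE_AT_10=$((DISK_BYTES * 90 / 100))  # HDFS-usable space with 10% reserved
    echo "Extra headroom per disk: $(( (USABLE_AT_10 - USABLE_AT_15) / 1024**3 )) GiB"
    # Prints: Extra headroom per disk: 51 GiB

Lowering the reserved percentage only buys temporary headroom; the lasting fix is the disk expansion described at the end of the procedure.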

Procedure

  1. Log in to any Master node in the cluster.
  2. Run the source /opt/client/bigdata_env command to initialize environment variables.

    If the cluster is in security mode, run the kinit -kt <keytab file> <principal name> command for authentication.
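
    For example (the keytab path and principal below are placeholders for illustration; substitute the actual values for your cluster):

    # Hypothetical keytab file and principal, shown for illustration only.
    kinit -kt /opt/client/user.keytab hdfs_user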

  3. Run the hdfs dfs -put ./startDetail.log /tmp command to check whether HDFS can write files. An error similar to the following indicates that none of the running DataNodes can accept a new block because no HDFS space is left:

    19/05/12 10:07:32 WARN hdfs.DataStreamer: DataStreamer Exception
    org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/startDetail.log._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 3 datanode(s) running and no node(s) are excluded in this operation.

  4. Run the hdfs dfsadmin -report command to check the HDFS capacity usage. The command output shows that the usage has reached 100%.

    Configured Capacity: 5389790579100 (4.90 TB)
    Present Capacity: 5067618628404 (4.61 TB)
    DFS Remaining: 133350196 (127.17 MB)
    DFS Used: 5067485278208 (4.61 TB)
    DFS Used%: 100.00%
    Under replicated blocks: 10
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    Pending deletion blocks: 0
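
    Before changing any configuration, you can also check which directories are consuming the space with the standard hdfs dfs -du command, for example:

    # List the space used by each top-level HDFS directory, in human-readable units.
    hdfs dfs -du -h /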

  5. When the HDFS capacity usage reaches 100%, reduce the percentage of disk space reserved for non-HDFS use by setting the HDFS parameter dfs.datanode.du.reserved.percentage.

    1. Go to the service configuration page.
      • MRS Manager: Log in to MRS Manager and choose Services > HDFS > Configuration.
      • FusionInsight Manager: Log in to FusionInsight Manager and choose Cluster > Name of the target cluster > Services > HDFS > Configurations.
    2. Select All Configurations and search for dfs.datanode.du.reserved.percentage in the search box.
    3. Change the value of this parameter to 10, save the configuration, and restart the affected DataNode instances for the change to take effect.
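
    Once the updated configuration has been synchronized to the client, you can verify the value with the standard getconf command (it reads the client-side configuration files, so it reflects the change only after the client configuration is updated):

    hdfs getconf -confKey dfs.datanode.du.reserved.percentage
    # Expected output after the change: 10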

  6. After the modification restores the service, expand the storage capacity by adding data disks to the Core nodes.
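
    To confirm recovery, you can re-run the checks from steps 3 and 4:

    # The write test from step 3 should now succeed.
    hdfs dfs -put ./startDetail.log /tmp
    # In the report from step 4, DFS Used% should drop below 100%, because
    # lowering the reserved percentage increases the Configured Capacity.
    hdfs dfsadmin -report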