Updated on 2025-08-19 GMT+08:00

Unbalanced DataNode Disk Usages of a Node

Symptom

The disk usage across the DataNode data disks on a node is uneven.

Example:

189-39-235-71:~ # df -h
Filesystem  Size  Used Avail Use% Mounted on
/dev/xvda   360G   92G  250G  28% /
/dev/xvdb   700G  546G  154G  78% /srv/BigData/hadoop/data1
/dev/xvdc   700G  546G  154G  78% /srv/BigData/hadoop/data2
/dev/xvdd   700G  546G  154G  78% /srv/BigData/hadoop/data3
/dev/xvde   700G  546G  154G  78% /srv/BigData/hadoop/data4
/dev/xvdf   700G   14G  686G   2% /srv/BigData/hadoop/data5
189-39-235-71:~ #

Possible Causes

Some faulty disks were replaced with new ones, so the usage of the new disks is low.

Disks are added. For example, the original four data disks are expanded to five disks.

Cause Analysis

A DataNode chooses the disk (volume) to which each block is written according to one of two policies: round robin, or preferentially writing data to the disk with the most available space.

  • Parameter: dfs.datanode.fsdataset.volume.choosing.policy
  • Possible values:
    • org.apache.hadoop.hdfs.server.datanode.fsdataset.RoundRobinVolumeChoosingPolicy: round robin.
    • org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy: preferentially write data to the disk with the most available space.
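For reference, the available-space policy corresponds to the following hdfs-site.xml fragment. In MRS the change is made through FusionInsight Manager rather than by editing the file directly; the two tuning parameters shown below are standard HDFS settings, and the values given are the Hadoop defaults:

```xml
<!-- Prefer volumes with more free space when placing new blocks -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
<!-- Volumes whose free space differs by less than this many bytes
     (10 GB, the default) are considered balanced -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <value>10737418240</value>
</property>
<!-- Fraction of new block writes directed to the volumes with more
     free space (default 0.75) -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
  <value>0.75</value>
</property>
```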

Procedure

  1. Log in to FusionInsight Manager.

    For details about how to log in to FusionInsight Manager, see Accessing MRS FusionInsight Manager.

  2. Choose Cluster > Services > HDFS > Configurations > All Configurations.
  3. Search for the parameter dfs.datanode.fsdataset.volume.choosing.policy and change its value to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy.
  4. Save the modification and restart the affected services or instances. The DataNode then preferentially selects the disk with the most available space to store data blocks.

    • Data written to the DataNode is preferentially placed on the disks with more available space.
    • The high usage of some disks is gradually relieved as aging data is deleted from HDFS.
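To illustrate why this rebalances the node over time, the following is a minimal Python sketch of the idea behind AvailableSpaceVolumeChoosingPolicy (simplified: the real Hadoop implementation round-robins within each group; the function and variable names here are invented for illustration):

```python
import random

# Hadoop defaults for the available-space policy (see
# dfs.datanode.available-space-volume-choosing-policy.* settings).
BALANCED_SPACE_THRESHOLD = 10 * 1024**3  # 10 GB
PREFERENCE_FRACTION = 0.75               # share of writes sent to emptier disks

def choose_volume(free_space, rng=random):
    """Pick a volume to write a block to.

    free_space: dict mapping volume name -> available bytes.
    Volumes whose free space exceeds the fullest volume's by more than
    the threshold form the "high available space" group, which receives
    roughly PREFERENCE_FRACTION of all new block writes.
    """
    fullest = min(free_space.values())
    high = [v for v, f in free_space.items()
            if f - fullest > BALANCED_SPACE_THRESHOLD]
    low = [v for v in free_space if v not in high]
    if not high:
        # All volumes are roughly balanced: behave like round robin.
        return rng.choice(low)
    if rng.random() < PREFERENCE_FRACTION:
        return rng.choice(high)  # usually write to an emptier volume
    return rng.choice(low)
```

With the df output above, data5 (about 686 GB free versus about 154 GB on the other disks) lands in the high-space group and receives roughly three quarters of new block writes, so its usage catches up to the other disks over time.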