Uneven Data Distribution Due to HDFS Client Installation on the DataNode

Updated on 2024-12-18 GMT+08:00

Symptom

Data is unevenly distributed across HDFS DataNodes. Disk usage on one node is high, or even reaches 100%, while disks on other nodes still have plenty of free space.
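To confirm the symptom, the per-node usage can be read from the output of `hdfs dfsadmin -report`. The following is a minimal sketch, not an official tool: the helper name `datanode_usage` is hypothetical, and it assumes that each DataNode section of the report starts with a `Name:` line and contains a `DFS Used%:` line, and that the `hdfs` command is available in a configured client environment.

```shell
# Print "<datanode> <DFS Used%>" pairs from `hdfs dfsadmin -report` output on stdin.
# Assumption: each DataNode section starts with "Name:" and contains "DFS Used%:".
# The cluster-wide summary that precedes the first "Name:" line is skipped.
datanode_usage() {
  awk '
    /^Name:/ { name = $2 }
    /^DFS Used%:/ && name != "" {
      pct = $3
      sub(/%$/, "", pct)      # strip the trailing percent sign
      print name, pct
      name = ""               # wait for the next DataNode section
    }
  '
}

# Usage on a cluster node, highest usage first:
#   hdfs dfsadmin -report | datanode_usage | sort -k2 -nr
```

A node that appears at the top of the sorted list with a usage far above the others is the likely location of the HDFS client that writes the data.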

Cause Analysis

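Under the HDFS replica placement mechanism, when the client that writes the data runs on a DataNode, the first replica of each block is stored on that local node. As a result, if an HDFS client is installed on a DataNode and used to write data, the disks of that node fill up while disks on other nodes still have plenty of free space.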
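The effect of this placement rule can be illustrated with a small simulation. This is a deliberately simplified model, not the actual HDFS placement code: it ignores rack awareness and free-space checks, assumes the writing client always runs on node 1, and spreads the remaining replicas over the other nodes in turn. The function name `simulate_placement` and the node names `dn1`, `dn2`, and so on are hypothetical.

```shell
# simulate_placement <blocks> <replication> <node_count>
# Simplified model (assumption): the writer runs on node 1, which always
# receives the first replica; the remaining replicas rotate over the other
# nodes. Requires node_count >= replication.
# Prints "<node> <replica count>" for each node.
simulate_placement() {
  awk -v blocks="$1" -v repl="$2" -v nodes="$3" 'BEGIN {
    rr = 0
    for (b = 0; b < blocks; b++) {
      count[1]++                      # first replica: local DataNode
      for (r = 1; r < repl; r++) {    # remaining replicas: other nodes in turn
        n = 2 + (rr % (nodes - 1))
        count[n]++
        rr++
      }
    }
    for (n = 1; n <= nodes; n++) print "dn" n, count[n] + 0
  }'
}

# 100 blocks, replication factor 3, 5 DataNodes
simulate_placement 100 3 5
```

In this model, dn1 ends up with 100 replicas while each of the other four nodes holds 50, so the node that hosts the client fills up about twice as fast as the rest.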

Solution

  1. For existing data that is unevenly distributed, run the following command to rebalance it:

    /opt/client/HDFS/hadoop/sbin/start-balancer.sh -threshold 10

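    In this command, /opt/client is an example client installation directory; replace it with the actual one. The -threshold 10 option sets the balancing threshold to 10%: the balancer moves blocks until the disk usage of each DataNode is within 10 percentage points of the overall cluster usage.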

  2. For new data, install the client on a node that does not run a DataNode, and write data from that client.
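To judge whether the balancer still has work to do at a given threshold, each node's usage can be compared with the cluster-wide usage. The sketch below is only an illustration of that comparison, not the balancer's implementation: the function name `outside_threshold` is hypothetical, and its input is assumed to be lines of the form `<datanode> <used_pct>`, with the cluster-wide usage passed as an argument.

```shell
# outside_threshold <threshold> <cluster_used_pct>
# Reads "<datanode> <used_pct>" lines on stdin and prints the nodes whose usage
# differs from the cluster-wide usage by more than <threshold> percentage points.
outside_threshold() {
  awk -v t="$1" -v avg="$2" '{
    diff = $2 - avg
    if (diff > t)       print $1, "over-utilized"
    else if (diff < -t) print $1, "under-utilized"
  }'
}

# Example: threshold 10, cluster-wide usage 40%
printf 'dn1 95\ndn2 42\ndn3 20\n' | outside_threshold 10 40
```

With these sample values, dn1 is reported as over-utilized and dn3 as under-utilized, while dn2 is within the threshold and is not listed. An empty result suggests that the cluster is already balanced for the chosen threshold.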