Updated on 2022-12-08 GMT+08:00

Using CDM to Migrate Data to HDFS

Issue

A user failed to use CDM to migrate data from an old cluster to HDFS of a new cluster.

Symptom

When CDM is used to import data from the source HDFS to the destination HDFS, the destination MRS cluster becomes faulty and its NameNode cannot be started.

The logs show that a Java heap space error is reported during NameNode startup, which indicates that the JVM parameters of the NameNode need to be modified.

Figure 1 Fault logs
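
To confirm the symptom without reading the whole log, you can filter it for the heap error. The following is a minimal sketch in Python. It assumes the error appears as the standard JVM message java.lang.OutOfMemoryError: Java heap space. The function name and the sample lines are hypothetical and are not taken from a real NameNode log.

```python
from typing import Iterable, List

# Standard JVM message for heap exhaustion (assumed to be what the fault logs show).
HEAP_ERROR = "java.lang.OutOfMemoryError: Java heap space"


def find_heap_errors(lines: Iterable[str]) -> List[str]:
    """Return the log lines that report a Java heap space error."""
    return [line.rstrip("\n") for line in lines if HEAP_ERROR in line]


# Hypothetical sample lines, for illustration only.
sample_log = [
    "INFO starting NameNode\n",
    "ERROR java.lang.OutOfMemoryError: Java heap space\n",
    "INFO shutting down\n",
]

matches = find_heap_errors(sample_log)
print(len(matches))
```

To scan a real log, pass an open file object instead of the sample list. The log file location depends on the cluster and is not specified here.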

Cause Analysis

The volume of HDFS data migrated through CDM is too large for the configured NameNode heap. As a result, the NameNode runs out of Java heap space when it merges metadata during startup.

Procedure

  1. Search for the GC_OPTS parameter in HDFS->NameNode and increase the -Xms and -Xmx values (currently -Xms512M and -Xmx512M) based on service requirements.
  2. Save the configuration and restart the affected services or instances.
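
The change to GC_OPTS in step 1 can be sketched as a string rewrite. This is a minimal illustration in Python, not the tool the cluster manager uses. The helper name set_heap_size, the sample GC_OPTS value, and the target size 4G are assumptions; choose the actual size based on service requirements.

```python
import re

# Matches the -Xms and -Xmx flags with their current size, e.g. -Xms512M.
_HEAP_FLAG = re.compile(r"-(Xms|Xmx)\d+[kKmMgG]?")


def set_heap_size(gc_opts: str, size: str) -> str:
    """Return gc_opts with every -Xms/-Xmx value replaced by size (e.g. '4G')."""
    if not re.fullmatch(r"\d+[kKmMgG]?", size):
        raise ValueError(f"invalid heap size: {size!r}")
    # Keep the flag name, swap only the size; other options are left untouched.
    return _HEAP_FLAG.sub(lambda m: f"-{m.group(1)}{size}", gc_opts)


# Illustrative value only; the real GC_OPTS string contains more options.
current = "-Xms512M -Xmx512M -XX:+UseConcMarkSweepGC"
updated = set_heap_size(current, "4G")
print(updated)  # -Xms4G -Xmx4G -XX:+UseConcMarkSweepGC
```

The sketch sets -Xms and -Xmx to the same value, as in the original 512M setting, so the heap does not need to grow while the NameNode is running.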