Updated on 2024-10-11 GMT+08:00

Using CDM to Migrate Data to HDFS

Issue

A user fails to migrate data from an old cluster to HDFS in a new cluster using CDM.

Symptom

While CDM imports data from the source HDFS to the destination HDFS, the destination MRS cluster becomes faulty and its NameNode cannot be started.

The logs show that a Java heap space error is reported during NameNode startup, which indicates that the heap size set in the NameNode JVM parameters is too small and needs to be increased.

Figure 1 Fault logs
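
The symptom can be confirmed by searching the NameNode log text for the standard JVM message java.lang.OutOfMemoryError: Java heap space. The following is a minimal sketch, not an MRS tool: the function name find_heap_errors is hypothetical, and it assumes the log lines have already been read from wherever the NameNode log is stored in your cluster.

```python
# Standard JVM message written when the heap is exhausted.
OOM_MARKER = "java.lang.OutOfMemoryError: Java heap space"


def find_heap_errors(log_lines):
    """Return (line_number, text) pairs for lines reporting a Java heap space error.

    log_lines is any iterable of strings, for example an open log file.
    Line numbers start at 1.
    """
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(log_lines, start=1)
        if OOM_MARKER in line
    ]
```

To use it, open a local copy of the NameNode log, pass the file object to find_heap_errors, and print the returned pairs. An empty result suggests that the startup failure has a different cause.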

Cause Analysis

The volume of HDFS data migrated by CDM is too large. As a result, the NameNode runs out of Java heap memory when it merges metadata during startup.

Procedure

  1. Go to the HDFS service configuration page.

    • For versions earlier than MRS 2.0.1: Log in to MRS Manager, choose Services > HDFS > Service Configuration, and select All from the Basic drop-down list.
    • For MRS 2.0.1 or later: Click the cluster name on the MRS console, choose Components > HDFS > Service Configuration, and select All from the Basic drop-down list.
    • For MRS 3.x or later: Log in to FusionInsight Manager and choose Cluster. Click the name of the target cluster and choose Services > HDFS > Configurations > All Configurations.

  2. Search for the GC_OPTS parameter in HDFS->NameNode and increase the -Xms and -Xmx values (-Xms512M and -Xmx512M in this example) based on service requirements.
  3. Save the configuration and restart the affected services or instances.
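
The change made to GC_OPTS in step 2 can be sketched as a string substitution. This is only an illustration of the edit, assuming GC_OPTS is a space-separated list of JVM flags: the helper name set_heap_size and the 4G value are hypothetical, and the actual heap size must be chosen based on service requirements and the memory available on the NameNode host.

```python
import re


def set_heap_size(gc_opts, size):
    """Return gc_opts with the -Xms and -Xmx values replaced by size.

    size is a JVM memory size such as "4G" or "4096M" (hypothetical values).
    All other flags in gc_opts are left unchanged.
    """
    if not re.fullmatch(r"\d+[kKmMgG]?", size):
        raise ValueError(f"invalid heap size: {size!r}")
    # Match only the initial (-Xms) and maximum (-Xmx) heap flags and their values.
    return re.sub(
        r"-(Xms|Xmx)\d+[kKmMgG]?",
        lambda match: f"-{match.group(1)}{size}",
        gc_opts,
    )


# Example with the values named in step 2; 4G is an assumed target, not a recommendation.
print(set_heap_size("-Xms512M -Xmx512M", "4G"))
```

The sketch sets -Xms and -Xmx to the same value, which mirrors the original setting where both are 512M. Paste the resulting string into the GC_OPTS field on the configuration page, then continue with step 3.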