Updated on 2024-11-29 GMT+08:00

Failed to Import HBase Data Due to Oversized File Blocks

Symptom

The error message "NotServingRegionException" is displayed when data is imported to HBase.

Cause Analysis

When a block is larger than 2 GB, the HDFS seek operation fails with a read exception.

In this case, data is frequently written to the RegionServer, which causes a full GC. As a result, the heartbeat between the HMaster and the RegionServer becomes abnormal, the HMaster marks the RegionServer as dead, and the RegionServer is forcibly restarted. After the restart, the server crash recovery mechanism is triggered to split and replay the WALs. At this point, the WAL file to be split has reached 2.1 GB and consists of a single block, so the HDFS seek operation fails and the WAL file cannot be split. The RegionServer still detects that the WAL needs to be split and triggers WAL splitting again, which results in a loop of splitting attempts and failures. The regions on the RegionServer node therefore cannot be brought online, and any query to a region on that RegionServer throws an exception indicating that the region is not online.

Procedure

  1. Go to the HBase service page.

    Log in to FusionInsight Manager and choose Cluster. Click the name of the desired cluster, and choose Services > HBase.

  2. On the right of HMaster Web UI, click HMaster (Active) to go to the HBase Web UI page.
  3. On the Procedures page, identify the node where the problem occurs.
  4. Log in to the faulty node as user root and run the hdfs dfs -ls command to view all block information.
  5. Run the hdfs dfs -mkdir command to create a directory for storing faulty blocks.
  6. Run the hdfs dfs -mv command to move the faulty block to the new directory.
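
Steps 4 to 6 can be sketched as a small shell helper. This is a minimal sketch, not a verified procedure: the function names `oversized_files` and `quarantine_oversized` are hypothetical, the 2 GiB threshold (2147483648 bytes) is an assumed reading of "2 GB", and the sketch assumes the usual eight-column `hdfs dfs -ls` output with the size in column 5 and the path in column 8. Paths containing spaces are not handled.

```shell
#!/usr/bin/env bash
# Sketch only: the threshold and function names are assumptions.

# Assumed threshold: 2 GiB in bytes.
LIMIT_BYTES=2147483648

# Read `hdfs dfs -ls` style lines on stdin and print the path of every
# regular file whose size (column 5) exceeds LIMIT_BYTES.
oversized_files() {
  awk -v limit="$LIMIT_BYTES" '
    # Lines with fewer than 8 fields (such as the "Found N items" header)
    # and directory entries (mode string starting with d) are skipped.
    NF >= 8 && $1 !~ /^d/ && ($5 + 0) > (limit + 0) { print $8 }
  '
}

# List the directory given as $1, then move every oversized file
# into the directory given as $2 (created if it does not exist).
quarantine_oversized() {
  local wal_dir="$1" quarantine_dir="$2"
  hdfs dfs -mkdir -p "$quarantine_dir" || return 1
  hdfs dfs -ls "$wal_dir" | oversized_files | while read -r path; do
    echo "Moving $path to $quarantine_dir"
    hdfs dfs -mv "$path" "$quarantine_dir/"
  done
}
```

After the functions are loaded, a call would look like `quarantine_oversized <wal-dir> <quarantine-dir>`, where both arguments are placeholders for the directory found in step 4 and the directory created in step 5. Running `hdfs dfs -ls <wal-dir> | oversized_files` first shows which files would be moved without changing anything.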

Summary and Suggestions

The following is provided for your reference:

  • If you suspect that data blocks are corrupted, run the hdfs fsck /tmp -files -blocks -racks command to check the health of the data blocks.
  • NotServingRegionException is also thrown if you perform data operations while a region is being split.
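
The fsck check in the first item can be wrapped so that the result is easy to use in a script. The following is a minimal sketch: `fsck_status` is a hypothetical helper, and it assumes that the fsck report contains a summary line ending in "is HEALTHY" or "is CORRUPT", which may differ between Hadoop versions.

```shell
#!/usr/bin/env bash
# Sketch only: relies on the assumed fsck summary wording.

# Read an fsck report on stdin and print HEALTHY, CORRUPT, or UNKNOWN.
fsck_status() {
  local report
  report=$(cat)
  if printf '%s\n' "$report" | grep -q 'is CORRUPT'; then
    echo CORRUPT
  elif printf '%s\n' "$report" | grep -q 'is HEALTHY'; then
    echo HEALTHY
  else
    # No recognizable summary line was found.
    echo UNKNOWN
  fi
}
```

For example, `hdfs fsck /tmp -files -blocks -racks | fsck_status` prints a single word that a monitoring script can act on, while the full report remains available by running the fsck command on its own.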