Updated on 2024-10-11 GMT+08:00

High CPU Usage Caused by Zero-Loaded RegionServer

Symptom

The CPU usage of a RegionServer is high even though no service workload is running on it.

Cause Analysis

  1. Run the top command to obtain the CPU usage of RegionServer processes and check the IDs of processes with high CPU usage.
  2. Obtain the CPU usage of threads under these processes based on the RegionServer process IDs.

    Run the top -H -p <PID> command (replace <PID> with the actual RegionServer process ID). In thread mode, the PID column shows thread IDs. As shown in the following output, the CPU usage of several threads is between about 80% and 90%.

     PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
     75706 omm       20   0 6879444   1.0g  25612 S  90.4  1.6   0:00.00 java
     75716 omm       20   0 6879444   1.0g  25612 S  90.4  1.6   0:04.74 java
     75720 omm       20   0 6879444   1.0g  25612 S  88.6  1.6   0:01.93 java
     75721 omm       20   0 6879444   1.0g  25612 S  86.8  1.6   0:01.99 java
     75722 omm       20   0 6879444   1.0g  25612 S  86.8  1.6   0:01.94 java
     75723 omm       20   0 6879444   1.0g  25612 S  86.8  1.6   0:01.96 java
     75724 omm       20   0 6879444   1.0g  25612 S  86.8  1.6   0:01.97 java
     75725 omm       20   0 6879444   1.0g  25612 S  81.5  1.6   0:02.06 java
     75726 omm       20   0 6879444   1.0g  25612 S  79.7  1.6   0:02.01 java
     75727 omm       20   0 6879444   1.0g  25612 S  79.7  1.6   0:01.95 java
     75728 omm       20   0 6879444   1.0g  25612 S  78.0  1.6   0:01.99 java
  3. Obtain the thread stack information based on the ID of the RegionServer process.

    jstack 12345 > allstack.txt (Replace 12345 with the actual RegionServer process ID.)

  4. Convert the ID of a thread with high CPU usage into hexadecimal format. For example, for thread ID 30648:

    printf "%x\n" 30648

    The command output is 77b8, which is the hexadecimal thread ID (TID).

  5. Search the thread stack file for the hexadecimal TID (for example, 77b8). The matching stack shows that the thread is performing a compaction operation.

  6. Repeat the same operations for the other threads with high CPU usage. They are also compaction threads.
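
Steps 4 and 5 above can be sketched as two small shell functions that work on the saved allstack.txt file. This is only an illustrative sketch: the function names tid_to_nid and thread_stack are hypothetical, and it assumes that the JDK in use prints the native thread ID in the form nid=0x77b8 and separates thread entries with blank lines.

```shell
#!/usr/bin/env bash

# Convert a decimal thread ID (as shown by top -H) to hexadecimal with a 0x prefix.
tid_to_nid() {
  printf '0x%x\n' "$1"
}

# Print the complete stack entry of one thread from a saved jstack dump.
# $1: dump file (for example allstack.txt), $2: decimal thread ID
# Assumption: thread entries are separated by blank lines, so awk's
# paragraph mode (RS="") returns one whole entry per record.
thread_stack() {
  local nid
  nid=$(tid_to_nid "$2")
  # The trailing space keeps 0x77b8 from also matching 0x77b80.
  awk -v pat="nid=${nid} " 'BEGIN { RS = "" } index($0, pat) { print }' "$1"
}
```

For example, thread_stack allstack.txt 30648 searches the dump for nid=0x77b8 and prints that thread's entry, if one exists.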

Solution

This is normal behavior.

The threads that consume large amounts of CPU are compaction threads. Some of them are running the Snappy compression algorithm, and others are reading data from and writing data to HDFS. Each region holds a large volume of data spread across many data files, and the data is compressed with Snappy, so compaction operations are CPU-intensive.
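
To get a rough idea of how many threads in a saved dump are involved in compaction, the thread entries that mention a keyword can be counted. The following is a minimal sketch: the function name count_threads_matching and the keyword are assumptions, and the class and thread names that actually appear in the stacks depend on the HBase version.

```shell
#!/usr/bin/env bash

# Count the thread entries in a saved jstack dump whose text mentions a keyword.
# $1: dump file (for example allstack.txt)
# $2: keyword, matched case-insensitively (for example "compact")
count_threads_matching() {
  awk -v kw="$2" '
    BEGIN { RS = ""; kw = tolower(kw) }   # paragraph mode: one thread entry per record
    index(tolower($0), kw) { n++ }
    END { print n + 0 }
  ' "$1"
}
```

For example, count_threads_matching allstack.txt compact prints the number of thread entries that contain the text "compact" anywhere in their name or stack frames.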

Fault Locating Methods

  1. Run the top command to check the process with high CPU usage.
  2. Check the threads with high CPU usage in the process.

    Run the top -H -p <PID> command to print CPU usage of threads under the process.

    Obtain the ID of the thread with the highest CPU usage from the command output.

    Alternatively, run the following command and obtain the ID of the thread with the highest CPU usage from its output:

    ps -mp <PID> -o THREAD,tid,time | sort -rn

  3. Obtain the stack of the faulty thread.

    The jstack tool is an effective and reliable tool for locating Java problems. You can find it in the java/bin directory.

    Run the following command to obtain the stacks of all threads in the process and save them to a local file:

    jstack <PID> > allstack.txt

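  4. Convert the thread ID into hexadecimal format (replace <thread ID> with the decimal thread ID obtained in step 2, not the process ID):

    printf "%x\n" <thread ID>

    The command output is the hexadecimal thread ID, referred to below as the TID.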

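  5. Run the following command to obtain the stack of the thread identified by the TID and save it to a local file:

    jstack <PID> | grep -A 30 <TID> > Onestack.txt

    If you only want to view the thread stack in the CLI, run the following command:

    jstack <PID> | grep -A 30 <TID>

    -A 30 indicates that the 30 lines following each matching line are also displayed. Without this option, only the line that contains the TID is returned, not the stack below it.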
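
The steps above can be combined into a short script. The following is a sketch under stated assumptions rather than a supported tool: the function names hottest_nids and show_hot_stacks are hypothetical, the ps options are those of the Linux procps ps, and jstack is assumed to be on the PATH and to print thread IDs as nid=0x<hex>. Note that ps reports CPU usage averaged over the lifetime of each thread, whereas top shows recent usage, so the two rankings may differ.

```shell
#!/usr/bin/env bash

# Read "<decimal thread ID> <%CPU>" pairs on stdin and print the N busiest
# threads as "<hexadecimal thread ID> <%CPU>", busiest first.
# $1: number of threads to print (default 5)
hottest_nids() {
  LC_ALL=C sort -k2,2 -rn | head -n "${1:-5}" |
    while read -r tid cpu; do
      printf '0x%x %s\n' "$tid" "$cpu"
    done
}

# Print the first 30 stack lines of the busiest threads of a Java process.
# $1: process ID, $2: number of threads to show (default 5)
show_hot_stacks() {
  local pid=$1 count=${2:-5} dump
  dump=$(jstack "$pid") || return 1
  # -L lists one line per thread; "tid=" and "pcpu=" suppress the headers.
  ps -L -p "$pid" -o tid=,pcpu= | hottest_nids "$count" |
    while read -r nid cpu; do
      echo "== nid=$nid cpu=$cpu% =="
      printf '%s\n' "$dump" | grep -A 30 "nid=$nid "
    done
}
```

For example, show_hot_stacks <PID> 3 takes one stack dump of the process and prints the entries of the three threads that ps ranks highest by CPU usage.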