Updated on 2023-01-11 GMT+08:00

Spark Job Suspended Due to Insufficient Memory or Lack of JAR Packages

Issue

A Spark job is submitted with insufficient memory or without the required JAR package. As a result, the job stays in the pending state for a long time, or a memory overflow occurs while the job is running.

Symptom

After being submitted, the job stays in the pending state for a long time. After the job is executed repeatedly, the following error information is displayed:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: 
Aborting TaskSet 3.0 because task 0 (partition 0) cannot run anywhere due to node and executor blacklist. 
Blacklisting behavior can be configured via spark.blacklist.*. 

Cause Analysis

The job was submitted with insufficient memory, or the required JAR package was not added when the job was submitted.

Procedure

  1. Check whether the JAR package is added when the job is submitted.

    • If yes, go to 2.
    • If no, add the JAR package. If the job execution becomes normal, no further action is required. If the job is still in the pending state for a long time, go to 2.

  2. Log in to the MRS console, click a cluster name on the Active Clusters page, and view the node specifications of the cluster on the Nodes page to determine the physical memory of each node.
  3. Increase the resources available to the NodeManager process.

    Operations on MRS Manager:

    1. Log in to MRS Manager and choose Services > Yarn > Service Configuration.
    2. Set Type to All, and then search for yarn.nodemanager.resource.memory-mb in the search box to view the value of this parameter. You are advised to set the parameter value to 75% to 90% of the total physical memory of nodes.

    Operations on FusionInsight Manager:

    1. Log in to FusionInsight Manager. Choose Cluster > Services > Yarn.
    2. Choose Configurations > All Configurations. Search for yarn.nodemanager.resource.memory-mb in the search box and check the parameter value. You are advised to set the parameter value to 75% to 90% of the total physical memory of nodes.

  4. Modify the Spark service configuration.

    Operations on MRS Manager:

    1. Log in to MRS Manager and choose Services > Spark > Service Configuration.
    2. Set Type to All, and then search for spark.driver.memory and spark.executor.memory in the search box.

      Set these parameters to a larger or smaller value based on the complexity and memory requirements of the submitted Spark job. (Generally, the values need to be increased.)

    Operations on FusionInsight Manager:

    1. Log in to FusionInsight Manager. Choose Cluster > Services > Spark.
    2. Choose Configurations > All Configurations. Search for spark.driver.memory and spark.executor.memory in the search box and set them to a larger or smaller value based on the complexity and memory requirements of the submitted Spark job. (Generally, the values need to be increased.)
    • If a SparkJDBC job is used, search for SPARK_EXECUTOR_MEMORY and SPARK_DRIVER_MEMORY and change their values based on the complexity and memory requirements of the submitted Spark job. (Generally, the values need to be increased.)
    • If the number of cores needs to be specified, you can search for spark.driver.cores and spark.executor.cores and change their values.

  5. If the job still cannot obtain sufficient resources after the preceding steps, scale out the cluster. Spark relies on memory for computing, so a cluster with too little total memory cannot be fixed by configuration alone.
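
The sizing guidance in the procedure can be checked with a little arithmetic. The following Python sketch is illustrative only and is not part of any MRS tool: the function names are hypothetical, and the 64 GB node size is an assumed example. It computes the 75% to 90% range recommended for yarn.nodemanager.resource.memory-mb and estimates the container size that an executor requests on Yarn, assuming Spark's default memory overhead (10% of spark.executor.memory, with a 384 MB minimum). Yarn may also round requests up to its allocation increment, which this sketch ignores.

```python
def nodemanager_memory_range_mb(physical_memory_gb):
    """Return (low, high) in MB: 75% and 90% of a node's physical memory."""
    total_mb = physical_memory_gb * 1024
    # Integer arithmetic avoids floating-point rounding surprises.
    return total_mb * 75 // 100, total_mb * 90 // 100


def executor_container_mb(executor_memory_mb, overhead_percent=10, min_overhead_mb=384):
    """Estimate the Yarn container size requested for one executor.

    Assumes Spark's default overhead: max(384 MB, 10% of executor memory).
    """
    overhead_mb = max(min_overhead_mb, executor_memory_mb * overhead_percent // 100)
    return executor_memory_mb + overhead_mb


def executor_fits(executor_memory_mb, nodemanager_memory_mb):
    """True if one executor container fits within the NodeManager memory."""
    return executor_container_mb(executor_memory_mb) <= nodemanager_memory_mb


if __name__ == "__main__":
    # Assumed example: a node with 64 GB of physical memory.
    low, high = nodemanager_memory_range_mb(64)
    print("yarn.nodemanager.resource.memory-mb range:", low, "to", high)
    # Assumed example: spark.executor.memory set to 4 GB (4096 MB).
    print("Executor container size (MB):", executor_container_mb(4096))
    print("Fits on the node:", executor_fits(4096, low))
```

If executor_fits returns False for the configured values, no NodeManager can host the executor, which is consistent with a job that stays pending. In that case, either raise yarn.nodemanager.resource.memory-mb within the recommended range or lower spark.executor.memory.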