
What Should I Do If FileNotFoundException Occurs When spark-submit Is Used to Submit a Job in Spark on Yarn Client Mode?

Updated on 2022-09-14 GMT+08:00

Question

When user omm (not user root) uses spark-submit to submit a job in yarn-client mode, a FileNotFoundException is reported, although the job continues running. The logs of the Driver program, however, cannot be viewed. For example, the exception appears in the command output after running the following command:

spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client /opt/client/Spark/spark/examples/jars/spark-examples_2.11-2.2.1-mrs-1.7.0.jar

Answer

Possible Causes

When a job is executed in yarn-client mode, the Spark Driver runs locally. The Driver log file is configured using -Dlog4j.configuration=./log4j-executor.properties. In the log4j-executor.properties configuration file, Driver logs are output to the ${spark.yarn.app.container.log.dir}/stdout file. When the Driver runs locally, however, ${spark.yarn.app.container.log.dir} is not set, so the log output path degenerates to /stdout. Because non-root users do not have permission to create or modify a stdout file in the root directory, FileNotFoundException is reported.

In yarn-cluster mode, by contrast, the Spark Driver runs inside the ApplicationMaster, and when the ApplicationMaster starts it sets a log output directory using -D${spark.yarn.app.container.log.dir}. Therefore, FileNotFoundException is not reported when the job is executed in yarn-cluster mode.
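In log4j-executor.properties, the file appender path is built from that Yarn container variable. A minimal sketch of the relevant lines is shown below; the appender name and exact layout in the file shipped with MRS may differ:

```properties
# Appender path built from a Yarn container variable; when the Driver runs
# locally in yarn-client mode the variable is unset, so the path becomes /stdout
log4j.appender.sparklog=org.apache.log4j.RollingFileAppender
log4j.appender.sparklog.File=${spark.yarn.app.container.log.dir}/stdout
```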

Solution:

Note: In the following examples, the default value of $SPARK_HOME is /opt/client/Spark/spark.

Solution 1: Manually switch the log configuration file. In the $SPARK_HOME/conf/spark-defaults.conf file, change the -Dlog4j.configuration setting of spark.driver.extraJavaOptions (default: ./log4j-executor.properties) to match the mode. In yarn-client mode, change the value to -Dlog4j.configuration=./log4j.properties. In yarn-cluster mode, change the value back to -Dlog4j.configuration=./log4j-executor.properties.
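For example, in yarn-client mode the relevant line in spark-defaults.conf would change roughly as follows (the line shown is illustrative; the real entry may carry additional JVM options):

```properties
# Before (default, suitable for yarn-cluster mode):
spark.driver.extraJavaOptions -Dlog4j.configuration=./log4j-executor.properties
# After (for yarn-client mode):
spark.driver.extraJavaOptions -Dlog4j.configuration=./log4j.properties
```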

Solution 2: Modify the startup script $SPARK_HOME/bin/spark-class. In the spark-class script, add the following information below #!/usr/bin/env bash.

# Determine the deploy mode: client or cluster (default: client)
argv=`echo $@ | tr '[A-Z]' '[a-z]'`
if [[ "$argv" =~ "--master" ]]; then
    mode=`echo $argv | sed -e 's/.*--master //'`
    master=`echo $mode | awk '{print $1}'`
    case $master in
    "yarn")
        # Use the value after --deploy-mode if it is present; otherwise default to client
        if [[ "$argv" =~ "--deploy-mode" ]]; then
            deploy=`echo $argv | sed -e 's/.*--deploy-mode //' | awk '{print $1}'`
        else
            deploy="client"
        fi
    ;;
    "yarn-client"|"local")
        deploy="client"
    ;;
    "yarn-cluster")
        deploy="cluster"
    ;;
    esac
else
    deploy="client"
fi
# Modify spark-defaults.conf so -Dlog4j.configuration matches the deploy mode
number=`sed -n -e '/spark.driver.extraJavaOptions/=' $SPARK_HOME/conf/spark-defaults.conf`
if [ "$deploy"x = "client"x ]; then
    sed -i "${number}s/-Dlog4j.configuration=[^ ]*properties /-Dlog4j.configuration=.\/log4j.properties /g" $SPARK_HOME/conf/spark-defaults.conf
else
    sed -i "${number}s/-Dlog4j.configuration=[^ ]*properties /-Dlog4j.configuration=.\/log4j-executor.properties /g" $SPARK_HOME/conf/spark-defaults.conf
fi

These script lines achieve the same effect as solution 1: they adjust the -Dlog4j.configuration setting of spark.driver.extraJavaOptions (default: ./log4j-executor.properties) in the $SPARK_HOME/conf/spark-defaults.conf file to match the Yarn deploy mode, but do so automatically each time spark-class runs.
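The substitution the script performs can be sketched in isolation as follows. The sample configuration line is hypothetical; the real entry lives in $SPARK_HOME/conf/spark-defaults.conf:

```shell
# A hypothetical spark.driver.extraJavaOptions line (for illustration only)
line='spark.driver.extraJavaOptions -Dlog4j.configuration=./log4j-executor.properties -Xmx512m'

# The same sed substitution the script applies for yarn-client mode
fixed=$(echo "$line" | sed 's/-Dlog4j.configuration=[^ ]*properties /-Dlog4j.configuration=.\/log4j.properties /')
echo "$fixed"
```

Running this prints the line with the Driver pointed at ./log4j.properties, while the rest of the options are left untouched.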
