
Configuring Environment Variables in Yarn-Client and Yarn-Cluster Modes

Scenario

Some configuration parameters of the Spark client take different values depending on its deploy mode (Yarn-Client or Yarn-Cluster). If you switch the Spark client to the other mode without first updating these parameters, the client fails to submit jobs in the new mode.

To avoid this, configure parameters as described in Table 1.

  • In Yarn-Cluster mode, use the new parameters, which hold the path and parameters of the Spark server.
  • In Yarn-Client mode, use the original parameters.

    The original parameters are spark.driver.extraClassPath, spark.driver.extraJavaOptions, and spark.driver.extraLibraryPath.

If you do not add the parameters in Table 1, the Spark client still works correctly in either mode, but switching modes then requires you to change some of its configuration parameters manually.
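For reference, the sketch below shows how the separated settings might look in the client's spark-defaults.conf once the Table 1 parameters are added. The client-side values (the /opt/client/... paths) are illustrative assumptions, not values shipped with the product; the cluster-side values reuse (and abridge) the defaults listed in Table 1.

    # Original (client-side) parameters, read in Yarn-Client mode.
    # NOTE: the paths below are illustrative placeholders.
    spark.driver.extraClassPath      /opt/client/Spark2x/spark/conf
    spark.driver.extraJavaOptions    -Dlog4j.configuration=file:/opt/client/Spark2x/spark/conf/log4j.properties

    # New (server-side) parameters from Table 1, read in Yarn-Cluster mode.
    spark.yarn.cluster.driver.extraClassPath    ${BIGDATA_HOME}/common/runtime/security
    spark.yarn.cluster.driver.extraJavaOptions  -Dlog4j.configuration=./__spark_conf__/__hadoop_conf__/log4j-executor.properties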

Configuration Parameters

Navigation path for setting parameters:

On Manager, choose Cluster > Name of the desired cluster > Services > Spark2x > Configurations. Click All Configurations and enter a parameter name in the search box.

Table 1 Parameter description

Parameter: spark.yarn.cluster.driver.extraClassPath

Description: Specifies the extraClassPath of the driver in Yarn-Cluster mode. Set this parameter to the path and parameters of the Spark server.

The original parameter spark.driver.extraClassPath specifies the extraClassPath of the Spark client. Because the Spark server settings and the Spark client settings are kept in separate parameters, you can switch the Spark client between modes without changing parameter values.

Default Value: ${BIGDATA_HOME}/common/runtime/security

Parameter: spark.yarn.cluster.driver.extraJavaOptions

Description: Specifies the extraJavaOptions of the driver in Yarn-Cluster mode. Set this parameter to the path and parameters of the extraJavaOptions of the Spark server.

The original parameter spark.driver.extraJavaOptions specifies the extraJavaOptions of the Spark client. Because the Spark server settings and the Spark client settings are kept in separate parameters, you can switch the Spark client between modes without changing parameter values.

Default Value: -Xloggc:<LOG_DIR>/indexserver-%p-gc.log -XX:+PrintGCDetails -XX:-OmitStackTraceInFastThrow -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=20 -XX:GCLogFileSize=10M -Dlog4j.configuration=./__spark_conf__/__hadoop_conf__/log4j-executor.properties -Dlog4j.configuration.watch=true -Djava.security.auth.login.config=./__spark_conf__/__hadoop_conf__/jaas-zk.conf -Dzookeeper.server.principal=${ZOOKEEPER_SERVER_PRINCIPAL} -Djava.security.krb5.conf=./__spark_conf__/__hadoop_conf__/kdc.conf -Djetty.version=x.y.z -Dorg.xerial.snappy.tempdir=${BIGDATA_HOME}/tmp -Dcarbon.properties.filepath=./__spark_conf__/__hadoop_conf__/carbon.properties -Djdk.tls.ephemeralDHKeySize=2048 -Dspark.ssl.keyStore=./child.keystore #{java_stack_prefer}
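Once both sets of parameters are configured, switching the submission mode should not require editing the client configuration again. A minimal sketch follows; the application class, JAR path, and queue are placeholders, not part of the product defaults.

    # Yarn-Client mode: the driver starts on the client and reads the spark.driver.extra* settings.
    spark-submit --master yarn --deploy-mode client --class com.example.MyApp /opt/client/myapp.jar

    # Yarn-Cluster mode: the driver starts in the cluster and is expected to use the
    # spark.yarn.cluster.driver.extra* settings described in Table 1.
    spark-submit --master yarn --deploy-mode cluster --class com.example.MyApp /opt/client/myapp.jar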