Configuring Spark to Load Third-Party JAR Packages for UDF Registration or SparkSQL Extension
This section applies to MRS 3.5.0-LTS and later versions only.
Scenarios
Custom UDFs and third-party JAR packages are frequently used to extend Spark's capabilities. Before Spark can use such packages, you must add them to the class loading path before starting Spark.
Prerequisites
A custom JAR package has been uploaded to the client node. This section uses spark-test.jar, uploaded to the /tmp directory on the client node, as an example.
Configuring Parameters
- Log in to the node where the client is installed as the client installation user and load environment variables.
cd Client installation directory
source bigdata_env
If Kerberos authentication is enabled for the cluster (security mode), run the following command to authenticate the user. If Kerberos authentication is not enabled (normal mode), skip this step.
kinit Component service user
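The login step above can be sketched as follows. The installation directory /opt/hadoopclient and the service user sparkuser are assumed placeholders; substitute the values used in your environment.

```shell
# Assumed client installation directory; replace with your actual path.
cd /opt/hadoopclient
# Load the client environment variables.
source bigdata_env
# Security mode only: authenticate as a component service user (placeholder name).
kinit sparkuser
```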
- Upload the JAR package to HDFS, for example, to hdfs://hacluster/tmp/spark/JAR.
hdfs dfs -put /tmp/spark-test.jar /tmp/spark/JAR/
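A fuller sketch of the upload, using the example paths from this section: it creates the target directory if it does not yet exist, then lists it to confirm the package landed where spark.jars will expect it.

```shell
# Create the target HDFS directory if it does not already exist.
hdfs dfs -mkdir -p /tmp/spark/JAR
# Upload the package (-f overwrites an existing copy).
hdfs dfs -put -f /tmp/spark-test.jar /tmp/spark/JAR/
# Confirm the upload.
hdfs dfs -ls /tmp/spark/JAR/
```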
- Modify the following parameters in the Client installation directory/Spark/spark/conf/spark-defaults.conf file on the Spark client.
Parameter: spark.jars
Value: JAR package path, for example, hdfs://hacluster/tmp/spark/JAR/spark-test.jar.
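The edit above can be made from the command line as sketched below. The sketch uses /tmp/spark-defaults.conf as a local stand-in for Client installation directory/Spark/spark/conf/spark-defaults.conf, and the example HDFS path from this section.

```shell
# Stand-in path; on a real client this is
# <Client installation directory>/Spark/spark/conf/spark-defaults.conf.
CONF=/tmp/spark-defaults.conf
# Append the spark.jars entry pointing at the uploaded package.
echo "spark.jars hdfs://hacluster/tmp/spark/JAR/spark-test.jar" >> "$CONF"
# Confirm the entry is present.
grep "^spark.jars" "$CONF"
```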
- Log in to FusionInsight Manager, choose Cluster > Services > Spark, click Configurations, and click All Configurations. On the displayed page, click JDBCServer(Role) and then Custom. Add the following parameters in the custom area, and restart the JDBCServer service.
Parameter: spark.jars
Value: JAR package path, for example, hdfs://hacluster/tmp/spark/JAR/spark-test.jar.
- Verify that the JAR package has been loaded: run a job or SQL statement that uses classes from the package and confirm that the output does not contain "ClassNotFoundException".
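As a quick check, you can register and call a function from the uploaded JAR in spark-sql. The class name com.example.MyUDF below is a hypothetical placeholder; replace it with a UDF class actually contained in spark-test.jar.

```shell
# Hypothetical UDF class; replace com.example.MyUDF with a class from your JAR.
# If the JAR is loaded correctly, the query returns a result instead of
# failing with ClassNotFoundException.
spark-sql -e "CREATE TEMPORARY FUNCTION my_udf AS 'com.example.MyUDF'; SELECT my_udf('test');"
```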