Updated on 2024-12-13 GMT+08:00

Configuring Spark to Load Third-Party JAR Packages for UDF Registration or SparkSQL Extension

This section applies only to MRS 3.5.0-LTS and later versions.

Scenarios

Custom UDFs and third-party JAR packages are frequently used to extend Spark's capabilities. To use such a JAR package, you must specify its class loading path before starting Spark.

Prerequisites

The custom JAR package has been uploaded to the client node. This section uses spark-test.jar, uploaded to the /tmp directory of the client node, as an example.

Configuring Parameters

  1. Log in to the node where the client is installed as the client installation user and load environment variables.

    cd Client installation directory

    source bigdata_env

    If Kerberos authentication has been enabled for the cluster (in security mode), run the following command for user authentication. If Kerberos authentication is not enabled for the cluster (in normal mode), user authentication is not required.

    kinit Component service user
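As a concrete sketch of the commands above, assuming the client is installed in /opt/hadoopclient and the component service user is sparkuser (both are placeholder values; substitute your own):

```shell
# Assumption: client installed in /opt/hadoopclient; service user "sparkuser".
cd /opt/hadoopclient
source bigdata_env
# Required only in security mode (Kerberos authentication enabled).
kinit sparkuser
```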

  2. Upload the JAR package to HDFS, for example, to hdfs://hacluster/tmp/spark/JAR.

    hdfs dfs -put /tmp/spark-test.jar /tmp/spark/JAR/
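If the target HDFS directory does not exist yet, create it before the upload, and then confirm that the JAR package is present, for example:

```shell
# Create the target directory (no effect if it already exists).
hdfs dfs -mkdir -p /tmp/spark/JAR/
# Upload the JAR package from the client node to HDFS.
hdfs dfs -put /tmp/spark-test.jar /tmp/spark/JAR/
# Confirm that the JAR package is now in HDFS.
hdfs dfs -ls /tmp/spark/JAR/
```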

  3. Modify the following parameters in the Client installation directory/Spark/spark/conf/spark-defaults.conf file on the Spark client.

    Parameter: spark.jars

    Value: JAR package path, for example, hdfs://hacluster/tmp/spark/JAR/spark-test.jar
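The resulting entry in spark-defaults.conf is a plain key-value line; using the example path from the previous step, it would look like this (a sketch, assuming the default client layout):

```shell
# In Client installation directory/Spark/spark/conf/spark-defaults.conf:
# spark.jars points at the uploaded third-party JAR package in HDFS.
spark.jars hdfs://hacluster/tmp/spark/JAR/spark-test.jar
```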

  4. Log in to FusionInsight Manager, choose Cluster > Services > Spark, click Configurations, and click All Configurations. On the displayed page, click JDBCServer(Role) and then Custom. Add the following parameters in the custom area, and restart the JDBCServer service.

    Parameter: spark.jars

    Value: JAR package path, for example, hdfs://hacluster/tmp/spark/JAR/spark-test.jar

  5. Verify that the JAR package has been loaded: run a SparkSQL statement that references a class in the package and confirm that the execution result does not contain "ClassNotFoundException".
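As a sketch of such a verification, assuming spark-test.jar contains a UDF class com.example.MyUDF (a hypothetical class name used here for illustration), you could register and invoke the UDF from the spark-sql client:

```shell
# Hypothetical UDF class com.example.MyUDF inside spark-test.jar.
# Register the UDF and invoke it in a single statement batch.
spark-sql -e "CREATE TEMPORARY FUNCTION my_udf AS 'com.example.MyUDF'; SELECT my_udf('test');"
# If "ClassNotFoundException" appears in the output, the JAR package
# was not loaded; recheck the spark.jars path configured above.
```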