Updated on 2022-11-18 GMT+08:00

How Do I Submit a Spark Application Using Java Commands?

Question

How do I submit a Spark application using Java commands, in addition to using the spark-submit command?

Answer

Use the org.apache.spark.launcher.SparkLauncher class and run a Java command to submit the Spark application.

  1. Write a class that uses org.apache.spark.launcher.SparkLauncher. SparkLauncherJavaExample and SparkLauncherScalaExample are provided by default as example code. You can modify the input parameters of the example code as required.

    • If you use Java as the development language, write the SparkLauncher class by referring to the following code:
          public static void main(String[] args) throws Exception {
              System.out.println("com.huawei.bigdata.spark.examples.SparkLauncherExample <mode> <jarPath> <app_main_class> <appArgs>");
              SparkLauncher launcher = new SparkLauncher();
              launcher.setMaster(args[0])
                  .setAppResource(args[1]) // Specify user app jar path
                  .setMainClass(args[2]);
              if (args.length > 3) {
                  String[] list = new String[args.length - 3];
                  for (int i = 3; i < args.length; i++) {
                      list[i-3] = args[i];
                  }
                  // Set app args
                  launcher.addAppArgs(list);
              }
      
              // Launch the app
              Process process = launcher.launch();
              // Get Spark driver log
              new Thread(new ISRRunnable(process.getErrorStream())).start();
              int exitCode = process.waitFor();
              System.out.println("Finished! Exit code is "  + exitCode);
          }
    • If you use Scala as the development language, write the SparkLauncher class by referring to the following code:
        def main(args: Array[String]): Unit = {
          println("com.huawei.bigdata.spark.examples.SparkLauncherExample <mode> <jarPath> <app_main_class> <appArgs>")
          val launcher = new SparkLauncher()
          launcher.setMaster(args(0))
            .setAppResource(args(1)) // Specify user app jar path
            .setMainClass(args(2))
          if (args.length > 3) {
            // Set app args
            launcher.addAppArgs(args.drop(3): _*)
          }
      
          // Launch the app
          val process = launcher.launch()
          // Get Spark driver log
          new Thread(new ISRRunnable(process.getErrorStream)).start()
          val exitCode = process.waitFor()
          println(s"Finished! Exit code is $exitCode")
        }
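Both examples pass the child process's error stream to an ISRRunnable, which is not shown in this section. The following is a minimal sketch of what such a helper might look like, assuming it only needs to copy each line of the stream to the console. The class body and the second constructor are illustrative assumptions, not the code shipped with the example project.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the ISRRunnable helper referenced in the examples.
// It drains an InputStream line by line and echoes each line to a PrintStream.
class ISRRunnable implements Runnable {
    private final InputStream in;
    private final PrintStream out;

    // Matches the usage in the examples: new ISRRunnable(process.getErrorStream())
    ISRRunnable(InputStream in) {
        this(in, System.out);
    }

    // Assumption: an extra constructor so the output target can be swapped out.
    ISRRunnable(InputStream in, PrintStream out) {
        this.in = in;
        this.out = out;
    }

    @Override
    public void run() {
        // UTF-8 is assumed here; adjust if the client uses another encoding.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.println(line);
            }
        } catch (IOException e) {
            out.println("Failed to read process output: " + e.getMessage());
        }
    }
}
```

Reading the stream on a separate thread lets the main thread block in process.waitFor() while the log is still being printed. If the child process also writes a large amount to standard output, it may be worth draining process.getInputStream() in the same way so that the process does not stall on a full pipe buffer.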

  2. Develop the Spark application based on service requirements and configure constant values, such as the main class of the Spark application you wrote. For details about different scenarios, see Developing the Project.

    • In security mode, prepare the security authentication code, service application code, and related configurations according to the security requirements.

      In yarn-cluster mode, security authentication cannot be added to the Spark project. You must therefore add security authentication code separately or run commands to perform security authentication. The example code contains security authentication code, so modify that code before running the application in yarn-cluster mode.

    • In normal mode, prepare the service application code and related configurations.

  3. Call the org.apache.spark.launcher.SparkLauncher.launch() method to submit user applications.

    1. Generate jar packages from the SparkLauncher application and the user applications, and upload them to the node where the Spark application will run. For details about how to generate jar packages, see Compiling and Running the Application.
      • The compilation dependency package of SparkLauncher is spark-launcher_2.12-3.1.1-hw-ei-311001-SNAPSHOT.jar. Obtain it from the jars directory of FusionInsight_Spark2x_8.1.0.1.tar.gz in Software.
      • The compilation dependency packages of user applications vary with the code. Load the dependency packages that your code requires.
    2. Upload the dependency jar packages of the application to a directory on the node where the application will run, for example, $SPARK_HOME/jars.

      Upload the dependency packages of the SparkLauncher class and the application to the jars directory on the client. The dependency package of the example code already exists in the jars directory on the client.

      To use the SparkLauncher class, the node where the application runs must have the Spark client installed. Running the SparkLauncher class depends on the configured environment variables, runtime dependency packages, and configuration files.

    3. On the node where the Spark application will run, run the following command to submit the application. You can then check the running status on the Spark web UI and check the result by obtaining the specified files. For details, see Checking the Commissioning Result.

      java -cp $SPARK_HOME/conf:$SPARK_HOME/jars/*:SparkLauncherExample.jar com.huawei.bigdata.spark.examples.SparkLauncherExample yarn-client /opt/female/FemaleInfoCollection.jar com.huawei.bigdata.spark.examples.FemaleInfoCollection <inputPath>
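The command above has three parts: a classpath made of the client conf directory, the client jars directory, and the launcher jar; the launcher main class; and the arguments that the launcher reads as args[0] (mode), args[1] (application jar), args[2] (application main class), and any remaining application arguments. The sketch below shows that structure in Java and adds a simple check that the conf and jars directories exist. LaunchCommandSketch, missingClientDirs, and buildCommand are hypothetical names introduced for illustration; they are not part of Spark or of the example project.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that mirrors the structure of the java -cp command.
class LaunchCommandSketch {

    // Returns the names of the client subdirectories ("conf", "jars")
    // that are missing under the given Spark home directory.
    static List<String> missingClientDirs(Path sparkHome) {
        List<String> missing = new ArrayList<>();
        for (String name : new String[] {"conf", "jars"}) {
            if (!Files.isDirectory(sparkHome.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    // Assembles the command as a list of tokens. The ":" separator matches
    // the Linux-style classpath used in the command above.
    static List<String> buildCommand(String sparkHome, String launcherJar,
                                     String launcherMainClass, String master,
                                     String appJar, String appMainClass,
                                     String... appArgs) {
        String classpath = String.join(":",
                sparkHome + "/conf", sparkHome + "/jars/*", launcherJar);
        List<String> command = new ArrayList<>(Arrays.asList(
                "java", "-cp", classpath, launcherMainClass,
                master, appJar, appMainClass));
        command.addAll(Arrays.asList(appArgs));
        return command;
    }

    public static void main(String[] args) {
        String sparkHome = System.getenv("SPARK_HOME");
        if (sparkHome == null) {
            System.err.println("SPARK_HOME is not set");
            return;
        }
        List<String> missing = missingClientDirs(Paths.get(sparkHome));
        if (!missing.isEmpty()) {
            System.err.println("Missing under SPARK_HOME: " + missing);
            return;
        }
        // Any arguments given to this sketch are forwarded as application arguments.
        List<String> command = buildCommand(sparkHome, "SparkLauncherExample.jar",
                "com.huawei.bigdata.spark.examples.SparkLauncherExample",
                "yarn-client", "/opt/female/FemaleInfoCollection.jar",
                "com.huawei.bigdata.spark.examples.FemaleInfoCollection", args);
        System.out.println(String.join(" ", command));
    }
}
```

The sketch only prints the command. It does not verify environment variables other than SPARK_HOME or the configuration files that the SparkLauncher class also depends on.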