
Spark Application Development Process

Spark includes Spark Core, Spark SQL, and Spark Streaming, which share the same development process.

Figure 1 and Table 1 describe the stages in the development process.

Figure 1 Spark development process
Table 1 Description of Spark development process


Stage: Preparing the development environment

Description: Spark applications can be developed in Scala, Java, or Python. The IntelliJ IDEA tool is recommended; prepare the development environment for the chosen language based on the reference. The running environment of a Spark application is the Spark client, so install and configure the client based on the reference. A minimal build definition for a Scala project is sketched after the table.

Reference: Preparing a Local Application Development Environment

Preparing the configuration files for connecting to the cluster

During the development or a test run of the program, you need to use the cluster configuration files to connect to an MRS cluster. The configuration files usually contain the cluster component information file and user files used for security authentication. You can obtain the required information from the created MRS cluster.

Nodes used for program debugging or running must be able to communicate with the nodes within the MRS cluster, and the hosts domain name must be configured.

Preparing the Configuration File for Connecting Spark to the Cluster

Stage: Configuring and importing sample projects

Description: Spark provides sample projects for various scenarios. You can import a sample project to study it, or create a new Spark project based on the reference.

Reference: Importing and Configuring Spark Sample Projects; (Optional) Creating Spark Sample Projects

Stage: Writing program code for a service scenario

Description: Sample projects are provided in Scala, Java, and Python, covering scenarios such as Streaming, SQL, the JDBC client program, and Spark on HBase. They help you quickly learn the programming interfaces of all Spark components. A minimal Scala application is sketched after the table.

Reference: Developing Spark Applications

Stage: Compiling and running the project

Description: Compile the developed application and submit it for running based on the reference. An example spark-submit command is shown after the table.

Reference: Writing and Running the Spark Program in the Linux Environment
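
For the development-environment stage, the following is a minimal sketch of an sbt build definition for a Scala Spark project. The project name and the Spark and Scala versions are placeholders and should be changed to match the versions used by your MRS cluster.

    // build.sbt -- minimal sketch; the versions below are placeholders
    name := "spark-sample"
    version := "1.0"
    scalaVersion := "2.12.15"

    // Spark dependencies are marked "provided" because the Spark client on the
    // cluster supplies them at run time.
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "3.1.1" % "provided",
      "org.apache.spark" %% "spark-sql"  % "3.1.1" % "provided"
    )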
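The host name mapping mentioned in the configuration-file stage is normally added to the hosts file of the node that debugs or runs the program (for example, /etc/hosts on Linux). The IP addresses and host names below are placeholders; use the actual values of your MRS cluster nodes.

    # Example hosts entries (placeholder addresses and host names)
    192.168.0.11  node-master1-example
    192.168.0.12  node-core1-example
    192.168.0.13  node-core2-example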
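As an illustration of the programming interface used when writing program code, the following is a minimal Scala sketch of a word-count application. The package name, application name, and input path are placeholders and are not taken from the sample projects.

    package com.example

    import org.apache.spark.sql.SparkSession

    object WordCount {
      def main(args: Array[String]): Unit = {
        // The master and other settings are normally supplied by spark-submit
        // rather than hard-coded here.
        val spark = SparkSession.builder().appName("WordCount").getOrCreate()

        // Placeholder input path; replace it with a real HDFS path in the cluster.
        val lines = spark.sparkContext.textFile("hdfs:///tmp/input")

        val counts = lines
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1))
          .reduceByKey(_ + _)

        counts.collect().foreach { case (word, count) => println(s"$word: $count") }
        spark.stop()
      }
    }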
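After the project is compiled and packaged into a JAR, it is typically submitted on the Spark client with spark-submit, for example as follows. The class name and JAR file name are placeholders matching the sketch above.

    spark-submit \
      --master yarn \
      --deploy-mode client \
      --class com.example.WordCount \
      spark-sample-1.0.jar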