Updated on 2024-08-10 GMT+08:00

Spark Application Development Process

Spark includes Spark Core, Spark SQL, and Spark Streaming, all of which share the same development process.

Figure 1 and Table 1 describe the stages in the development process.

Figure 1 Spark development process
Table 1 Spark application development process

Stage: Understand basic concepts.
Description: Before developing an application, it is important to have a grasp of the basic concepts of Spark. The specific concepts to focus on depend on the scenario at hand, but generally include Spark Core, Spark SQL, and Spark Streaming.
Reference: Concepts
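As a first orientation to these components, the entry points can be sketched as follows. This is a minimal local-mode example for study only; the object name and data are illustrative:

```scala
import org.apache.spark.sql.SparkSession

object ConceptsSketch {
  def main(args: Array[String]): Unit = {
    // A SparkSession is the entry point for Spark SQL; its sparkContext
    // exposes the Spark Core RDD API.
    val spark = SparkSession.builder()
      .appName("ConceptsSketch")
      .master("local[*]") // local mode for experimentation only
      .getOrCreate()

    // Spark Core: a resilient distributed dataset (RDD)
    val rdd = spark.sparkContext.parallelize(Seq(1, 2, 3, 4))
    println(rdd.sum())

    // Spark SQL: a DataFrame queried declaratively
    import spark.implicits._
    val df = Seq(("a", 1), ("b", 2)).toDF("key", "value")
    df.filter($"value" > 1).show()

    spark.stop()
  }
}
```

Spark Streaming adds a third entry point (StreamingContext) on top of the same SparkContext, which is why the three components share one development process.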

Stage: Prepare the development and operating environment.
Description: Spark applications can be developed in Scala, Java, or Python. You are advised to use IntelliJ IDEA and configure the development environment for your language according to the guide. Spark applications run on the Spark client; install and configure the client as described in the reference.
Reference: Preparing a Local Application Development Environment
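For a Scala or Java project, the build configuration typically declares the Spark modules as provided dependencies, since the Spark client supplies them at run time. A build.sbt sketch (the version numbers are placeholders; match the Spark and Scala versions shipped with your cluster's client):

```scala
// build.sbt sketch -- versions below are illustrative placeholders.
// "provided" keeps the Spark jars out of the application package,
// because the Spark client supplies them when the job runs.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % "3.1.1" % "provided",
  "org.apache.spark" %% "spark-sql"       % "3.1.1" % "provided",
  "org.apache.spark" %% "spark-streaming" % "3.1.1" % "provided"
)
```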

Stage: Create a project.
Description: Spark offers sample projects for various scenarios, which you can import for study. Alternatively, you can create a Spark project from scratch according to the guide.
Reference: Importing and Configuring Spark Sample Projects; (Optional) Creating Spark Sample Projects

Stage: Write program code for a service scenario.
Description: Spark provides sample projects in Scala, Java, and Python, covering scenarios such as Streaming, SQL, JDBC client programs, and Spark on HBase. These samples are designed to help users quickly learn the programming interfaces of all Spark components.
Reference: Developing Spark Applications
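A typical service program in the style of the Spark Core samples is a word count. The sketch below is illustrative, not one of the shipped samples; the input and output paths are placeholders:

```scala
import org.apache.spark.sql.SparkSession

// Minimal word-count job: reads text from HDFS, counts words,
// and writes the results back to HDFS. Paths are illustrative.
object WordCount {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("WordCount").getOrCreate()
    val counts = spark.sparkContext
      .textFile("hdfs:///tmp/input.txt") // read lines from HDFS
      .flatMap(_.split("\\s+"))          // split each line into words
      .map(word => (word, 1))            // pair each word with a count of 1
      .reduceByKey(_ + _)                // sum the counts per word
    counts.saveAsTextFile("hdfs:///tmp/wordcount-output")
    spark.stop()
  }
}
```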

Stage: Compile and run the application.
Description: Compile the developed application and submit it for running as described in the reference.
Reference: Commissioning a Spark Application
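Compilation and submission usually look like the following on the Spark client. The class name, JAR name, and deploy mode are illustrative; use the values from your own project and cluster:

```shell
# Package the project, then submit the JAR from the Spark client.
# Class and JAR names below are placeholders.
mvn clean package
spark-submit \
  --master yarn \
  --deploy-mode client \
  --class com.example.WordCount \
  target/wordcount-1.0.jar
```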

Stage: View application running results.
Description: Application running results are stored in the specified directory. You can also view them on the Spark web UI.

Stage: Tune the application.
Description: Optimize the application based on its running status until it meets the requirements of the service scenario. After tuning, compile and run the application again.
Reference: Spark2x Performance Optimization
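Tuning often starts with the resource and shuffle settings passed at submission time. The values below are illustrative starting points only; appropriate settings depend on your cluster and workload:

```shell
# Illustrative resource and shuffle options; tune to your workload.
spark-submit \
  --master yarn \
  --executor-memory 4g \
  --executor-cores 2 \
  --num-executors 10 \
  --conf spark.sql.shuffle.partitions=200 \
  --class com.example.WordCount \
  target/wordcount-1.0.jar
```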