Updated on 2022-09-14 GMT+08:00

Development Process

Spark comprises Spark Core, Spark SQL, and Spark Streaming. All three follow the same development process.

Figure 1 and Table 1 describe the stages in the development process.

Figure 1 Spark development process

Table 1 Description of Spark development process

| Stage | Description | Reference |
| --- | --- | --- |
| Understand the basic concepts. | Before developing an application, learn the basic concepts of Spark. Choose the concepts to study based on your scenario: the basic concepts of Spark Core, Spark SQL, and Spark Streaming. | Basic Concepts |
| Prepare the development and operating environment. | Spark applications can be developed in Scala, Java, or Python. The IDEA tool is recommended; set up the development environment for your language as described in the reference. Spark applications run on the Spark client; install and configure the client as described in the reference. | Development and Operating Environment |
| Prepare projects. | Spark provides sample projects for various scenarios. Import a sample project to study it, or create a new Spark project as described in the reference. | Configuring and Importing Sample Projects; Creating a New Project (Optional) |
| Develop projects based on scenarios. | Sample projects are provided in Scala, Java, and Python, for scenarios including Streaming, SQL, the JDBC client program, and Spark on HBase. They help you quickly learn the programming interfaces of all Spark components. | Developing the Project |
| Compile and run the application. | Compile the developed application and submit it for running as described in the reference. | Compiling and Running the Application |
| Check the application running results. | Application running results are stored in the directory that you specify. You can also check the running results through the UI. | Checking the Commissioning Result |
| Tune the application. | Based on the application running results, tune the application to meet the requirements of the service scenario. After tuning, compile and run the application again. | Spark2x Performance Optimization |
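
The compile-and-run stage can be sketched as follows. This is a minimal illustration, not the procedure from the reference: it assumes the application is submitted from the Spark client with the standard `spark-submit` script and that the cluster uses YARN. The helper `build_spark_submit`, the jar path, the class name, and the input path are all hypothetical.

```python
import shlex
from typing import List, Optional


def build_spark_submit(
    app_path: str,
    main_class: Optional[str] = None,
    master: str = "yarn",          # assumption: YARN-managed cluster
    deploy_mode: str = "client",   # assumption: driver runs on the client node
    app_args: Optional[List[str]] = None,
) -> List[str]:
    """Return the argument list for a spark-submit invocation."""
    cmd = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    if main_class:
        # --class is only needed for JVM (Scala or Java) applications.
        cmd += ["--class", main_class]
    # The application file comes last, followed by its own arguments.
    cmd.append(app_path)
    cmd += list(app_args or [])
    return cmd


# Hypothetical jar, class, and input path, for illustration only.
command = build_spark_submit(
    "/opt/example/SparkExample.jar",
    main_class="com.example.WordCount",
    app_args=["/tmp/input"],
)
print(shlex.join(command))
```

For a Python application, omit `main_class` and pass the `.py` file as `app_path`. After tuning, the same command can be rerun against the rebuilt application.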