Updated on 2023-08-31 GMT+08:00

Spark2x Sample Project

To obtain an MRS sample project, visit https://github.com/huaweicloud/huaweicloud-mrs-example and switch to the branch that matches your MRS cluster version. Download the package to your local PC and decompress it to obtain the sample projects for each component.
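As an alternative to downloading the package from the web page, the repository can be cloned with git. The following is a minimal sketch that only builds the clone command. The helper name `clone_command`, the target directory, and the branch placeholder are assumptions for illustration; replace the placeholder with the branch that matches your MRS cluster version.

```python
from typing import List

# Repository URL taken from the text above.
REPO_URL = "https://github.com/huaweicloud/huaweicloud-mrs-example"


def clone_command(branch: str, target_dir: str = "huaweicloud-mrs-example") -> List[str]:
    """Build a git command that clones only the given branch (hypothetical helper)."""
    if not branch:
        raise ValueError("branch must be the one that matches your MRS cluster version")
    return ["git", "clone", "--branch", branch, "--single-branch", REPO_URL, target_dir]


if __name__ == "__main__":
    # Placeholder branch name (an assumption): look up the real branch in the repository.
    cmd = clone_command("<branch-for-your-MRS-version>")
    print(" ".join(cmd))
    # To run the clone (requires git and network access):
    #   import subprocess; subprocess.run(cmd, check=True)
```

After cloning, the Spark2x sample projects are in the locations listed in the table below.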

MRS provides the following Spark2x sample projects:
Table 1 Spark2x-related sample projects

Sample Project Location

Description

sparksecurity-examples/SparkHbasetoCarbonJavaExample

Java sample project in which Spark synchronizes HBase data to CarbonData.

In this sample project, the application writes data to HBase in real time for point query services. Data is synchronized to CarbonData tables in batches at a specified interval for analytical query services.

sparksecurity-examples/SparkHbasetoHbaseJavaExample

Java/Scala/Python programs that use Spark to read data from HBase and then write data to HBase.

In this sample project, the Spark applications analyze and summarize data in two HBase tables.

sparksecurity-examples/SparkHbasetoHbasePythonExample

sparksecurity-examples/SparkHbasetoHbaseScalaExample

sparksecurity-examples/SparkHivetoHbaseJavaExample

Java/Scala/Python program that uses Spark to read data from Hive and then write data to HBase.

In this sample project, the Spark applications analyze and process data in Hive tables and write the results to HBase tables.

sparksecurity-examples/SparkHivetoHbasePythonExample

sparksecurity-examples/SparkHivetoHbaseScalaExample

sparksecurity-examples/SparkJavaExample

Java/Python/Scala/R sample project of Spark Core tasks.

The applications of this project read text data from HDFS and then calculate and analyze the data.

SparkRExample is only available for clusters with Kerberos authentication enabled.

sparksecurity-examples/SparkPythonExample

sparksecurity-examples/SparkRExample

sparksecurity-examples/SparkScalaExample

sparksecurity-examples/SparkLauncherJavaExample

Java/Scala sample project that uses Spark Launcher to submit jobs.

This project calls the org.apache.spark.launcher.SparkLauncher class from Java or Scala code to submit Spark applications.

sparksecurity-examples/SparkLauncherScalaExample

sparksecurity-examples/SparkOnClickHouseJavaExample

Spark uses the native ClickHouse JDBC APIs and the Spark JDBC driver to create ClickHouse databases and tables, insert data into them, and query them.

sparksecurity-examples/SparkOnClickHousePythonExample

sparksecurity-examples/SparkOnClickHouseScalaExample

sparksecurity-examples/SparkOnHbaseJavaExample

Java/Scala/Python sample project in the Spark on HBase scenario.

Applications can use HBase as a data source. In this project, data is stored in HBase in Avro format, then read from HBase and filtered.

sparksecurity-examples/SparkOnHbasePythonExample

sparksecurity-examples/SparkOnHbaseScalaExample

sparksecurity-examples/SparkOnHudiJavaExample

Java/Scala/Python sample project in the Spark on Hudi scenario.

The applications of this project use Spark to perform operations such as data insertion, query, update, incremental query, query at a specific time point, and data deletion on Hudi.

sparksecurity-examples/SparkOnHudiPythonExample

sparksecurity-examples/SparkOnHudiScalaExample

sparksecurity-examples/SparkOnMultiHbaseScalaExample

Scala sample project in which Spark accesses HBase in two clusters at the same time.

sparksecurity-examples/SparkSQLJavaExample

Java/Python/Scala sample project of Spark SQL tasks.

The applications of this project read text data from HDFS and then calculate and analyze the data.

sparksecurity-examples/SparkSQLPythonExample

sparksecurity-examples/SparkSQLScalaExample

sparksecurity-examples/SparkStreamingKafka010JavaExample

Java/Scala sample project in which Spark Streaming receives data from Kafka and performs statistical analysis.

The applications of this project process the stream data from Kafka in real time and keep a running total of the number of records for each word.

sparksecurity-examples/SparkStreamingKafka010PythonExample

sparksecurity-examples/SparkStreamingtoHbaseJavaExample010

Java/Scala/Python sample project in which Spark Streaming reads data from Kafka and writes the data to HBase.

The applications of this project start a task every 5 seconds to read data from Kafka and update the data to a specified HBase table.

sparksecurity-examples/SparkStreamingtoHbasePythonExample010

sparksecurity-examples/SparkStreamingtoHbaseScalaExample010

sparksecurity-examples/SparkStructuredStreamingJavaExample

Spark applications use Structured Streaming to read word records from Kafka, then group the records by word to count the number of records for each word.

sparksecurity-examples/SparkStructuredStreamingPythonExample

sparksecurity-examples/SparkStructuredStreamingScalaExample

sparksecurity-examples/SparkThriftServerJavaExample

Java/Scala sample project for Spark SQL access through JDBC.

In this sample, a custom JDBCServer client uses JDBC connections to create tables, load data into them, query them, and delete them.

sparksecurity-examples/SparkThriftServerScalaExample

sparksecurity-examples/StructuredStreamingADScalaExample

Structured Streaming is used to read advertisement request data, display data, and click data from Kafka, obtain effective display statistics and click statistics in real time, and write the statistics to Kafka.

sparksecurity-examples/StructuredStreamingStateScalaExample

This Spark Structured Streaming program counts the events in each session, tracks the start and end timestamps of each session across batches, and outputs the sessions whose state was updated in the current batch.
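The per-batch session logic described for StructuredStreamingStateScalaExample can be modeled in a few lines of plain Python. This is only a sketch of the bookkeeping, not the sample project's code, and it uses no Spark API. The names `SessionInfo` and `update_sessions`, and the `(session_id, timestamp)` event shape, are assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List, Tuple


@dataclass
class SessionInfo:
    """State kept per session across batches (hypothetical structure)."""
    num_events: int
    start_ts: int
    end_ts: int


def update_sessions(
    state: Dict[str, SessionInfo],
    batch: Iterable[Tuple[str, int]],
) -> List[Tuple[str, SessionInfo]]:
    """Fold one batch of (session_id, timestamp) events into the state.

    Returns only the sessions touched by this batch, in first-seen order.
    """
    updated: Dict[str, None] = {}  # dict used as an ordered set
    for session_id, ts in batch:
        info = state.get(session_id)
        if info is None:
            # First event of a new session.
            state[session_id] = SessionInfo(num_events=1, start_ts=ts, end_ts=ts)
        else:
            info.num_events += 1
            info.start_ts = min(info.start_ts, ts)
            info.end_ts = max(info.end_ts, ts)
        updated[session_id] = None
    # Return copies so later batches do not mutate results already emitted.
    return [(sid, replace(state[sid])) for sid in updated]


if __name__ == "__main__":
    sessions: Dict[str, SessionInfo] = {}
    print(update_sessions(sessions, [("s1", 100), ("s2", 105), ("s1", 110)]))
    print(update_sessions(sessions, [("s2", 120)]))
```

In the second call only `s2` is returned, because `s1` received no events in that batch, although its state is still retained for later batches.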