Sample Projects of MRS Components
Obtain the sample projects by referring to Obtaining the MRS Application Development Sample Project: switch the branch to the version that matches your MRS cluster, download the package to a local directory, and decompress it to obtain the sample code project of each component.
The MRS sample code library provides sample projects covering the basic functions of each component. The following tables list the sample projects of the current version, grouped by component.
**ClickHouse**

| Sample Project | Description |
| --- | --- |
| clickhouse-examples | Java program that creates and deletes ClickHouse data tables and inserts and queries data in MRS clusters. The program establishes server connections, creates a database and data tables, inserts data, queries data, and deletes data tables. |
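To sketch the flow that clickhouse-examples implements, here is a minimal JDBC sequence, assuming the ClickHouse JDBC driver is on the classpath; the URL, credentials, and table schema are placeholders, not values from the sample project:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ClickHouseSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder URL; the real sample reads host/port from its configuration.
        String url = "jdbc:clickhouse://clickhouse-host:8123/default";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement()) {
            stmt.execute("CREATE DATABASE IF NOT EXISTS testdb");
            stmt.execute("CREATE TABLE IF NOT EXISTS testdb.testtb"
                    + " (id UInt32, name String) ENGINE = MergeTree() ORDER BY id");
            stmt.execute("INSERT INTO testdb.testtb VALUES (1, 'alice'), (2, 'bob')");
            try (ResultSet rs = stmt.executeQuery("SELECT id, name FROM testdb.testtb")) {
                while (rs.next()) {
                    System.out.println(rs.getInt("id") + " -> " + rs.getString("name"));
                }
            }
            stmt.execute("DROP TABLE testdb.testtb");  // clean up, as the sample does
        }
    }
}
```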
**Flink**

| Sample Project | Description |
| --- | --- |
| FlinkCheckpointJavaExample, FlinkCheckpointScalaExample | Java/Scala programs for Flink asynchronous checkpointing (see the sketch after this table). The program uses a custom operator to continuously generate data; each record is a quadruple of long, string, string, and integer values. The program collects the statistical results and displays them on the terminal. A checkpoint is triggered every 6 seconds, and the checkpoint result is stored in HDFS. |
| FlinkKafkaJavaExample, FlinkKafkaScalaExample | Java/Scala programs that use a Flink job to produce and consume Kafka data. In this project, a Flink service receives one message per second: the producer application sends data to Kafka, the consumer application receives it, and the program processes and prints the data. |
| FlinkPipelineJavaExample, FlinkPipelineScalaExample | Java/Scala programs for a Flink job pipeline. A publisher job generates 10,000 data records per second, and two other jobs each subscribe to the data. After receiving the data, the subscriber jobs convert the data format, sample the data, and output the samples. |
| FlinkSqlJavaExample | Java program that submits SQL jobs through JAR jobs on the client. |
| FlinkStreamJavaExample, FlinkStreamScalaExample | Java/Scala programs that construct a Flink DataStream. The program analyzes user log data based on service requirements: it reads text data, generates DataStreams, filters data that meets specified conditions, and obtains the results. |
| FlinkStreamSqlJoinExample | Flink SQL join program. The program calls APIs of the flink-connector-kafka module to produce and consume data, generates Table1 and Table2, uses Flink SQL to perform a join query on them, and displays the results. |
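As a rough sketch of the checkpoint configuration the FlinkCheckpoint samples describe (a 6-second trigger interval with results stored in HDFS), the following minimal Java program shows the relevant StreamExecutionEnvironment calls. The pipeline and the HDFS path are placeholders, and Flink 1.13 or later is assumed for setCheckpointStorage:

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Trigger a checkpoint every 6 seconds, matching the sample's description.
        env.enableCheckpointing(6000, CheckpointingMode.EXACTLY_ONCE);
        // Store checkpoint data in HDFS (placeholder path).
        env.getCheckpointConfig().setCheckpointStorage("hdfs:///flink/checkpoints");

        // Placeholder pipeline: the real sample uses a custom source that emits
        // (long, string, string, int) tuples and prints aggregated statistics.
        env.fromSequence(0, Long.MAX_VALUE)
           .map(i -> "record-" + i)
           .print();

        env.execute("CheckpointSketch");
    }
}
```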
**HBase** (sample projects under hbase-examples)

| Sample Project | Description |
| --- | --- |
| hbase-example | HBase data read and write. The program calls HBase APIs to create user tables, import user data, add and query user information, and create secondary indexes for user tables (see the sketch after this table). |
| hbase-rest-example | Development example for the HBase REST interfaces. The program uses REST APIs to query HBase cluster information, obtain tables, operate on namespaces, and manipulate tables. |
| hbase-thrift-example | Development example for accessing HBase ThriftServer. The program accesses ThriftServer to manipulate tables and to write data to and read data from them. |
| hbase-zk-example | Development example for HBase access to ZooKeeper. The same client process can access MRS ZooKeeper and third-party ZooKeeper at the same time: the HBase client accesses MRS ZooKeeper, while customer applications access third-party ZooKeeper. |
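For orientation, here is a minimal sketch of the HBase 2.x client flow that hbase-example builds on: create a table, write a cell, and read it back. The table and column names are invented for illustration, and the real sample additionally handles Kerberos login and secondary indexes:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();  // reads hbase-site.xml from classpath
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            TableName name = TableName.valueOf("user_table");  // placeholder table name
            if (!admin.tableExists(name)) {
                admin.createTable(TableDescriptorBuilder.newBuilder(name)
                        .setColumnFamily(ColumnFamilyDescriptorBuilder.of("info"))
                        .build());
            }
            try (Table table = conn.getTable(name)) {
                // Write one cell, then read it back.
                Put put = new Put(Bytes.toBytes("row1"));
                put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("alice"));
                table.put(put);

                Result result = table.get(new Get(Bytes.toBytes("row1")));
                System.out.println(Bytes.toString(
                        result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"))));
            }
        }
    }
}
```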
**HDFS**

| Sample Project | Description |
| --- | --- |
|  | Java program for HDFS file operations. The program creates HDFS folders, writes files, appends file content, reads files, and deletes files or folders (see the sketch after this table). |
| hdfs-c-example | C language development example for using HDFS. The program connects to the HDFS file system and implements file operations such as creating, reading, writing, appending, and deleting files. |
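A minimal sketch of the Hadoop FileSystem calls behind the Java HDFS sample, covering each of the operations listed above; the paths are placeholders, and append assumes the cluster permits it (the default in current HDFS):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.nio.charset.StandardCharsets;

public class HdfsSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();       // reads core-site.xml / hdfs-site.xml
        try (FileSystem fs = FileSystem.get(conf)) {
            Path dir = new Path("/tmp/hdfs-sketch");    // placeholder directory
            Path file = new Path(dir, "demo.txt");

            fs.mkdirs(dir);                             // create a folder
            try (FSDataOutputStream out = fs.create(file, true)) {      // write a file
                out.write("hello hdfs\n".getBytes(StandardCharsets.UTF_8));
            }
            try (FSDataOutputStream out = fs.append(file)) {            // append content
                out.write("appended line\n".getBytes(StandardCharsets.UTF_8));
            }
            try (FSDataInputStream in = fs.open(file)) {                // read it back
                byte[] buf = new byte[256];
                int n = in.read(buf);
                System.out.print(new String(buf, 0, n, StandardCharsets.UTF_8));
            }
            fs.delete(dir, true);                       // recursive delete
        }
    }
}
```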
**Hive**

| Sample Project | Description |
| --- | --- |
| hive-jdbc-example, hive-jdbc-example-multizk | Java programs that use Hive JDBC to process data (see the sketch after this table). JDBC APIs are used to connect to Hive and perform data operations: creating tables, loading data, and querying data. FusionInsight ZooKeeper and third-party ZooKeeper can be accessed in the same client process at the same time. |
| hcatalog-example | Java program that uses Hive HCatalog to process data. HCatalog APIs are used to define and query MRS Hive metadata with the Hive CLI. |
| python3-examples | Python 3 program that connects to Hive and executes SQL statements. The program uses Python 3 to connect to Hive and submit data analysis tasks. |
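A minimal sketch of the Hive JDBC flow used by hive-jdbc-example, with a simplified direct HiveServer2 URL; the actual sample builds a ZooKeeper discovery URL and performs security authentication first, and the table name and values here are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder URL pointing directly at HiveServer2; the MRS samples use a
        // ZooKeeper-based discovery URL and handle Kerberos authentication instead.
        String url = "jdbc:hive2://hiveserver-host:10000/default";
        try (Connection conn = DriverManager.getConnection(url, "", "");
             Statement stmt = conn.createStatement()) {
            stmt.execute("CREATE TABLE IF NOT EXISTS demo_tbl (id INT, name STRING)");
            stmt.execute("INSERT INTO demo_tbl VALUES (1, 'alice')");
            try (ResultSet rs = stmt.executeQuery("SELECT id, name FROM demo_tbl")) {
                while (rs.next()) {
                    System.out.println(rs.getInt(1) + " -> " + rs.getString(2));
                }
            }
        }
    }
}
```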
**Kafka**

| Sample Project | Description |
| --- | --- |
| kafka-examples | Java program for processing Kafka streaming data. The program is developed with Kafka Streams: it reads messages from the input topic, counts the words in each message, and writes the results as key-value pairs to the output topic, from which they can be consumed. |
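As an illustration of the Kafka Streams pattern described above, here is a minimal, self-contained word-count topology; the broker address and topic names are placeholders rather than values from kafka-examples:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

import java.util.Arrays;
import java.util.Locale;
import java.util.Properties;

public class WordCountSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-host:9092"); // placeholder
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> source = builder.stream("input-topic");   // placeholder topic
        KTable<String, Long> counts = source
                .flatMapValues(line -> Arrays.asList(line.toLowerCase(Locale.ROOT).split("\\W+")))
                .groupBy((key, word) -> word)
                .count();
        // Emit the running counts to the output topic as <word, count> pairs.
        counts.toStream().to("output-topic", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```

Setting default serdes in the properties keeps the topology terse; only the output, whose value type changes to Long, needs an explicit Produced.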
**Manager**

| Sample Project | Description |
| --- | --- |
| manager-examples | Program for calling FusionInsight Manager APIs. The program calls Manager APIs to create, modify, and delete cluster users. |
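manager-examples drives Manager over its REST APIs. Purely as a shape-of-the-call illustration, the sketch below posts a JSON body with Java's built-in HTTP client; the endpoint path, port, and payload are assumptions (consult the Manager API reference for the real ones), and TLS trust setup and session authentication are omitted:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ManagerApiSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint and payload for illustration only; the real paths,
        // login flow, and request bodies come from the Manager API reference.
        String endpoint = "https://manager-host:28443/web/api/v2/users"; // assumption
        String body = "{\"userName\":\"demo_user\",\"password\":\"***\"}"; // assumption

        HttpClient client = HttpClient.newHttpClient();  // no custom TLS trust configured here
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(endpoint))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + ": " + response.body());
    }
}
```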
**MapReduce**

| Sample Project | Description |
| --- | --- |
|  | Java program for submitting MapReduce jobs. The program runs a MapReduce statistics job to analyze and process data and output the data required by users. It also illustrates how to write MapReduce jobs that access multiple service components (HDFS, HBase, and Hive), helping you develop key operations such as authentication and configuration loading. |
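The MapReduce sample focuses on multi-component access and authentication, which need cluster-specific configuration. As a generic, self-contained illustration of the job-submission skeleton it is built on, here is a classic word-count job; class names and paths are placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;
import java.util.StringTokenizer;

public class MrSketch {
    public static class TokenMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();
        @Override
        protected void map(Object key, Text value, Context ctx)
                throws IOException, InterruptedException {
            StringTokenizer it = new StringTokenizer(value.toString());
            while (it.hasMoreTokens()) {
                word.set(it.nextToken());
                ctx.write(word, ONE);        // emit <word, 1> for each token
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum));   // total count per word
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "mr-sketch");
        job.setJarByClass(MrSketch.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```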
**Oozie**

| Sample Project | Description |
| --- | --- |
| OozieMapReduceExample | Program for submitting MapReduce jobs with Oozie. The program demonstrates how to use Java APIs to submit a MapReduce job, query the job status, and perform offline analysis on website log files (see the sketch after this table). |
| OozieSparkHBaseExample | Program that uses Oozie to schedule Spark jobs that access HBase. |
| OozieSparkHiveExample | Program that uses Oozie to schedule Spark jobs that access Hive. |
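For a feel of the Oozie Java client calls that OozieMapReduceExample wraps, here is a hedged sketch that submits a workflow and polls its status; the server URL, workflow path, and properties are placeholders:

```java
import java.util.Properties;

import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.WorkflowJob;

public class OozieSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder Oozie server URL.
        OozieClient client = new OozieClient("https://oozie-host:21003/oozie");

        Properties conf = client.createConfiguration();
        conf.setProperty(OozieClient.APP_PATH, "hdfs:///user/demo/workflow.xml"); // assumption
        conf.setProperty("nameNode", "hdfs://hacluster");   // typical workflow parameters
        conf.setProperty("queueName", "default");

        String jobId = client.run(conf);                    // submit and start the workflow
        System.out.println("Submitted workflow " + jobId);

        // Poll the job status, as the sample's status query does.
        while (client.getJobInfo(jobId).getStatus() == WorkflowJob.Status.RUNNING) {
            Thread.sleep(5_000);
        }
        System.out.println("Final status: " + client.getJobInfo(jobId).getStatus());
    }
}
```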
**Spark**

| Sample Project | Description |
| --- | --- |
| SparkHbasetoCarbonJavaExample | Java program for Spark to synchronize HBase data to CarbonData. The program writes data to HBase in real time for point queries; the data is synchronized to CarbonData tables in batches at a specified interval for analytical queries. |
| SparkHbasetoHbaseJavaExample, SparkHbasetoHbasePythonExample, SparkHbasetoHbaseScalaExample | Java/Scala/Python programs that use Spark to read data from HBase and then write data back to HBase. The programs use Spark jobs to analyze and summarize the data of two HBase tables. |
| SparkHivetoHbaseJavaExample, SparkHivetoHbasePythonExample, SparkHivetoHbaseScalaExample | Java/Scala/Python programs that use Spark to read data from Hive and then write it to HBase. The programs use Spark jobs to analyze and summarize the data of a Hive table and write the results to an HBase table. |
| SparkJavaExample, SparkPythonExample, SparkScalaExample, SparkRExample | Java/Python/Scala/R programs of Spark Core tasks (see the sketch after this table). The programs read text data from HDFS and then calculate and analyze it. SparkRExample is available only for clusters with Kerberos authentication enabled. |
| SparkLauncherJavaExample, SparkLauncherScalaExample | Java/Scala programs that use Spark Launcher to submit jobs. The programs use the org.apache.spark.launcher.SparkLauncher class to submit Spark jobs from Java/Scala code. |
| SparkOnHbaseJavaExample, SparkOnHbasePythonExample, SparkOnHbaseScalaExample | Java/Scala/Python programs for the Spark on HBase scenario. The programs use HBase as the data source: data is stored in HBase in Avro format, read from HBase, and then filtered. |
| SparkOnHudiJavaExample, SparkOnHudiPythonExample, SparkOnHudiScalaExample | Java/Scala/Python programs for the Spark on Hudi scenario. The programs use Spark jobs to perform operations on Hudi data, such as insertion, query, update, incremental query, point-in-time query, and deletion. |
| SparkOnMultiHbaseScalaExample | Scala program that uses Spark to access HBase in two clusters at the same time. This program is available only for clusters with Kerberos authentication enabled. |
| SparkSQLJavaExample, SparkSQLPythonExample, SparkSQLScalaExample | Java/Python/Scala programs of Spark SQL tasks. The programs read text data from HDFS and then calculate and analyze it. |
| SparkStreamingKafka010JavaExample, SparkStreamingKafka010ScalaExample | Java/Scala programs that use Spark Streaming to receive data from Kafka and perform statistical analysis. The programs accumulate and calculate the Kafka stream data in real time, counting the total number of records of each word. |
| SparkStreamingtoHbaseJavaExample010, SparkStreamingtoHbasePythonExample010, SparkStreamingtoHbaseScalaExample010 | Java/Scala/Python programs that use Spark Streaming to read Kafka data and write it to HBase. The program starts a task every 5 seconds to read data from Kafka and update it in a specified HBase table. |
| SparkStructuredStreamingJavaExample, SparkStructuredStreamingPythonExample, SparkStructuredStreamingScalaExample | Java/Python/Scala programs that use Structured Streaming in Spark jobs to call Kafka APIs and obtain word records. The word records are then classified to count the number of records of each word. |
| SparkThriftServerJavaExample, SparkThriftServerScalaExample | Java/Scala programs for Spark SQL access through JDBC. A custom JDBCServer client and JDBC connections are used to create tables, load data into them, query them, and delete them. |
| StructuredStreamingADScalaExample | Scala program that uses Structured Streaming to read advertisement request, display, and click data from Kafka, obtain effective display and click statistics in real time, and write the statistics to Kafka. |
| StructuredStreamingStateScalaExample | Scala Spark Structured Streaming program that collects statistics on the number of events in each session, together with the session start and end timestamps, across different batches, and outputs the sessions whose state was updated in the current batch. |
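As a minimal illustration of the Spark Core pattern in SparkJavaExample (read text data from HDFS, then compute over it), here is a self-contained Java word count; the input path is a placeholder and the real sample's analysis logic differs:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

import java.util.Arrays;

public class SparkCoreSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("SparkCoreSketch");
        try (JavaSparkContext jsc = new JavaSparkContext(conf)) {
            // Placeholder HDFS path; the sample takes its input location from arguments.
            JavaRDD<String> lines = jsc.textFile("hdfs:///tmp/input.txt");

            JavaPairRDD<String, Integer> counts = lines
                    .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
                    .mapToPair(word -> new Tuple2<>(word, 1))
                    .reduceByKey(Integer::sum);

            counts.collect().forEach(t -> System.out.println(t._1() + ": " + t._2()));
        }
    }
}
```

A job like this is typically packaged into a JAR and launched with spark-submit on the cluster client, which is also the submission path the SparkLauncher samples automate programmatically.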