Alluxio | alluxio-examples | Sample program that uses Alluxio's public interface to connect to storage systems. The example writes and reads files.
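A minimal sketch of that write/read flow with the Alluxio Java client — a hedged sketch, assuming a cluster configured via alluxio-site.properties; the path is a placeholder:

```java
import alluxio.AlluxioURI;
import alluxio.client.file.FileInStream;
import alluxio.client.file.FileOutStream;
import alluxio.client.file.FileSystem;
import java.nio.charset.StandardCharsets;

public class AlluxioReadWrite {
    public static void main(String[] args) throws Exception {
        // Client for the Alluxio cluster configured in alluxio-site.properties.
        FileSystem fs = FileSystem.Factory.get();
        AlluxioURI path = new AlluxioURI("/tmp/demo.txt"); // placeholder path

        // Write a file through the public FileSystem interface.
        try (FileOutStream out = fs.createFile(path)) {
            out.write("hello alluxio".getBytes(StandardCharsets.UTF_8));
        }
        // Read it back (a single read is enough for this tiny file).
        try (FileInStream in = fs.openFile(path)) {
            byte[] buf = new byte[(int) fs.getStatus(path).getLength()];
            in.read(buf);
            System.out.println(new String(buf, StandardCharsets.UTF_8));
        }
    }
}
```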
Flink | flink-examples | The following sample programs are provided:
- DataStream program
  Java/Scala program for constructing a DataStream with Flink. The project analyzes user log data based on service requirements: it reads text data, generates a DataStream, filters data that meets specified conditions, and obtains the results (a minimal sketch follows this list).
- Program that produces and consumes data in Kafka
  Java/Scala program that uses a Flink job to produce and consume data from Kafka. The project assumes that a Flink service receives one message per second. The Producer application sends data to Kafka, the Consumer application receives data from Kafka, and the program processes and prints the data.
- Asynchronous checkpointing
  Java/Scala program for Flink asynchronous checkpointing. The program uses a custom operator to continuously generate data; each record is a quadruple of long, string, string, and integer values. The program collects the statistical results and prints them to the terminal. A checkpoint is triggered every 6 seconds and its result is stored in HDFS.
- Stream SQL join
  Flink streaming SQL join program. The program calls APIs of the flink-connector-kafka module to produce and consume data. It generates Table1 and Table2, uses Flink SQL to run a join query on the two tables, and prints the results.
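A minimal sketch of the DataStream pattern above, assuming one log record per line in an HDFS text file; the path and the filter condition are illustrative placeholders, not the sample project's actual logic:

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FilterUserLogs {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Read text data and construct a DataStream of log lines (placeholder path).
        DataStream<String> logs = env.readTextFile("hdfs:///tmp/userlogs.txt");
        // Keep only the records that meet the (illustrative) condition, then print.
        logs.filter(line -> line.contains("ERROR"))
            .print();
        env.execute("Filter user logs");
    }
}
```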
HBase | hbase-examples | HBase data read and write. The program calls HBase APIs to create user tables, import user data, add and query user information, and create secondary indexes for user tables.
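A minimal sketch of that create/import/query flow with the HBase 2.x client API; the table and column-family names are placeholders, and the secondary-index step is omitted:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class UserTableDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        TableName name = TableName.valueOf("user_table"); // placeholder name
        try (Connection conn = ConnectionFactory.createConnection(conf)) {
            // Create the user table if it does not exist yet.
            try (Admin admin = conn.getAdmin()) {
                if (!admin.tableExists(name)) {
                    admin.createTable(TableDescriptorBuilder.newBuilder(name)
                            .setColumnFamily(ColumnFamilyDescriptorBuilder.of("info"))
                            .build());
                }
            }
            try (Table table = conn.getTable(name)) {
                // Import one user record.
                Put put = new Put(Bytes.toBytes("user001"));
                put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"),
                        Bytes.toBytes("Alice"));
                table.put(put);
                // Query the user information back.
                Result r = table.get(new Get(Bytes.toBytes("user001")));
                System.out.println(Bytes.toString(
                        r.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"))));
            }
        }
    }
}
```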
HDFS | hdfs-examples | Java program for HDFS file operations. The program creates HDFS folders, writes files, appends file content, reads files, and deletes files and folders.
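A minimal sketch of those file operations with the Hadoop FileSystem API; the paths are placeholders, and the append step assumes the cluster permits appends:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsFileOps {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path dir = new Path("/tmp/hdfs-examples");        // placeholder folder
        Path file = new Path(dir, "data.txt");

        fs.mkdirs(dir);                                   // create the folder
        try (FSDataOutputStream out = fs.create(file)) {  // write a file
            out.writeBytes("hello\n");
        }
        try (FSDataOutputStream out = fs.append(file)) {  // append content
            out.writeBytes("world\n");                    // (requires append support)
        }
        try (FSDataInputStream in = fs.open(file)) {      // read the file back
            IOUtils.copyBytes(in, System.out, 4096, false);
        }
        fs.delete(dir, true);                             // delete the folder recursively
    }
}
```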
Hive | hive-examples | The following JDBC/HCatalog sample programs are provided:
- Java program for Hive JDBC to process data
  The project uses JDBC APIs to connect to Hive and operate on data: it creates tables, loads data, and queries data (a minimal sketch follows this list).
- Java program for Hive HCatalog to process data
  HCatalog APIs are used to define and query MRS Hive metadata with the Hive CLI.
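A minimal sketch of the Hive JDBC flow, assuming an unsecured HiveServer2 (a Kerberos cluster needs additional principal settings in the URL); the host, path, and table names are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcDemo {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        String url = "jdbc:hive2://<hiveserver2-host>:10000/default"; // placeholder
        try (Connection conn = DriverManager.getConnection(url, "", "");
             Statement stmt = conn.createStatement()) {
            // Create a table, load data into it, then query it.
            stmt.execute("CREATE TABLE IF NOT EXISTS demo (id INT, name STRING)");
            stmt.execute("LOAD DATA INPATH '/tmp/demo.txt' INTO TABLE demo");
            try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM demo")) {
                while (rs.next()) {
                    System.out.println("rows: " + rs.getLong(1));
                }
            }
        }
    }
}
```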
Impala | impala-examples | Java program for Impala JDBC to process data. The project calls JDBC APIs to connect to Impala and operate on data: it creates tables, loads data, and queries data. The flow mirrors the Hive JDBC sketch above, with an Impala connection URL.
Kafka | kafka-examples | Java program for processing Kafka streaming data. The program is built on Kafka Streams: it reads messages from an input topic, counts the words in each message, and outputs the results as key-value pairs, which can be read back by consuming the output topic.
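A minimal sketch of that word-count topology with the Kafka Streams API; the broker address and topic names are placeholders:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

import java.util.Arrays;
import java.util.Properties;

public class WordCountDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "<broker>:9092"); // placeholder
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> source = builder.stream("input-topic");
        // Split each message into words and count occurrences per word.
        KTable<String, Long> counts = source
            .flatMapValues(value -> Arrays.asList(value.toLowerCase().split("\\W+")))
            .groupBy((key, word) -> word)
            .count();
        // Emit <word, count> pairs to the output topic.
        counts.toStream().to("output-topic",
            Produced.with(Serdes.String(), Serdes.Long()));

        new KafkaStreams(builder.build(), props).start();
    }
}
```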
MapReduce | mapreduce-examples | Java program for submitting MapReduce jobs. The program runs a MapReduce statistics job to analyze and process data and produce the output users need. It also illustrates how to write MapReduce jobs that access multiple service components (HDFS, HBase, and Hive), covering key operations such as authentication and configuration loading.
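A minimal word-count job as a stand-in for the sample's statistics logic (the actual job's analysis differs; input and output paths come from the command line):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountJob {
    // Map: emit <word, 1> for each word in the line.
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws java.io.IOException, InterruptedException {
            for (String w : value.toString().split("\\s+")) {
                ctx.write(new Text(w), ONE);
            }
        }
    }
    // Reduce: sum the counts for each word.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> vals, Context ctx)
                throws java.io.IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : vals) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCountJob.class);
        job.setMapperClass(TokenMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```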
Presto | presto-examples | The following JDBC/HCatalog sample programs are provided:
- Java program for Presto JDBC to process data
  The project calls JDBC APIs to connect to Presto and operate on data: it creates tables, loads data, and queries data. The flow mirrors the Hive JDBC sketch above, with a Presto connection URL.
- Java program for Presto HCatalog to process data
OpenTSDB | opentsdb-examples | OpenTSDB APIs are called to collect monitoring information from a large-scale cluster and query the data in seconds. The program can write, query, and delete data.
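A minimal sketch of a write through OpenTSDB's REST interface (the /api/put endpoint); the host, metric, and tag values are placeholders, and queries follow the same pattern against /api/query:

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class OpenTsdbPutDemo {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://<opentsdb-host>:4242/api/put"); // placeholder host
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        // One data point: metric name, Unix timestamp, value, and tags.
        String point = "{\"metric\":\"sys.cpu.user\",\"timestamp\":"
                + System.currentTimeMillis() / 1000
                + ",\"value\":42.5,\"tags\":{\"host\":\"node-1\"}}";
        try (OutputStream out = conn.getOutputStream()) {
            out.write(point.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP " + conn.getResponseCode());
    }
}
```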
Spark | spark-examples | The following sample projects are provided:
- SparkHbasetoHbaseJavaExample / SparkHbasetoHbaseScalaExample
  Java/Scala program that uses Spark to read data from and write data to HBase. Spark jobs analyze and summarize the data of two HBase tables.
- SparkHivetoHbaseJavaExample / SparkHivetoHbaseScalaExample
  Java/Scala program that uses Spark to read data from Hive and write data to HBase. Spark jobs analyze and summarize the data of a Hive table and write the result to an HBase table.
- SparkJavaExample / SparkPythonExample / SparkScalaExample
  Java/Python/Scala program of Spark Core tasks. The program reads text data from HDFS and then calculates and analyzes it (see the first sketch after this list).
- SparkLauncherJavaExample / SparkLauncherScalaExample
  Java/Scala program that uses Spark Launcher to submit jobs. The program uses the org.apache.spark.launcher.SparkLauncher class to submit Spark jobs from Java/Scala code.
- SparkOnHbaseJavaExample / SparkOnHbaseScalaExample
  Java/Scala program for the Spark on HBase scenario. The program uses HBase as its data source: data is stored in HBase in Avro format, read back from HBase, and then filtered.
- SparkSQLJavaExample / SparkSQLScalaExample
  Java/Scala program of Spark SQL tasks. The program reads text data from HDFS and then calculates and analyzes it.
- SparkStreamingJavaExample / SparkStreamingScalaExample
  Java/Scala program in which Spark Streaming receives data from Kafka and performs statistical analysis. The program analyzes user log data based on service requirements: it reads text data, generates DStreams, filters data that meets specified conditions, and obtains the results.
- SparkStreamingKafka010JavaExample / SparkStreamingKafka010ScalaExample
  Java/Scala program in which Spark Streaming receives data from Kafka through the 0.10 connector and performs statistical analysis. The program accumulates the stream data from Kafka in real time and calculates the total number of records for each word (see the second sketch after this list).
- SparkStreamingtoHbaseJavaExample / SparkStreamingtoHbaseScalaExample
  Java/Scala sample project in which Spark Streaming reads Kafka data and writes it into HBase. The program starts a task every 5 seconds to read data from Kafka and update a specified HBase table with it.
- SparkStructuredStreamingJavaExample / SparkStructuredStreamingScalaExample
  The program uses Structured Streaming in Spark jobs to call Kafka APIs to obtain word records, then classifies the records to count the occurrences of each word.
- SparkThriftServerJavaExample / SparkThriftServerScalaExample
  Java/Scala program for Spark SQL access through JDBC. The sample uses a custom JDBCServer client and JDBC connections to create tables, load data into them, query them, and delete them.
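A minimal sketch of the Spark Core pattern (the first sketch referenced above); the HDFS path and the filter condition are illustrative placeholders:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkCoreDemo {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("SparkCoreDemo");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Read text data from HDFS (placeholder path).
            JavaRDD<String> lines = sc.textFile("hdfs:///tmp/input.txt");
            // Illustrative analysis: count the records matching a condition.
            long matches = lines.filter(line -> line.contains("ERROR")).count();
            System.out.println("matching records: " + matches);
        }
    }
}
```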
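And a minimal sketch of the Kafka 0.10 direct-stream pattern (the second sketch referenced above), assuming the spark-streaming-kafka-0-10 connector; the broker, group, and topic names are placeholders. This version counts words per batch, whereas a running total per word would use mapWithState or updateStateByKey:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;
import scala.Tuple2;

import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class StreamingKafkaDemo {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("StreamingKafkaDemo");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "<broker>:9092");
        kafkaParams.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        kafkaParams.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        kafkaParams.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Collections.singletonList("input-topic"), kafkaParams));

        // Split each message into words and count occurrences per 5-second batch.
        stream.map(ConsumerRecord::value)
              .flatMap(line -> Arrays.asList(line.split("\\s+")).iterator())
              .mapToPair(word -> new Tuple2<>(word, 1))
              .reduceByKey(Integer::sum)
              .print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```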
Storm | storm-examples | The following sample projects are provided:
- storm-common-examples
  Construction of Storm topologies and Spouts/Bolts. The program can create Spouts, Bolts, and topologies (see the sketch after this list).
- storm-hbase-examples
  Interaction between Storm and HBase in MRS. The program submits a Storm topology and stores the data to the WordCount table of HBase.
- storm-hdfs-examples
  Interaction between Storm and HDFS in MRS. The program submits a Storm topology and stores the data to HDFS.
- storm-jdbc-examples
  Access to MRS Storm using JDBC. The program uses a Storm topology to insert data into a table.
- storm-kafka-examples
  Interaction between Storm and Kafka in MRS. The program uses a Storm topology to send data to Kafka and display it.
- storm-obs-examples
  Interaction between Storm and OBS in MRS. The program submits a Storm topology and stores the data to OBS.
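A minimal sketch of building and submitting a topology in the spirit of storm-common-examples, assuming Storm 2.x APIs; the spout, bolt, and topology names are illustrative:

```java
import java.util.Map;
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

public class TopologyDemo {

    // Spout: emits a fixed sentence once per second.
    public static class SentenceSpout extends BaseRichSpout {
        private SpoutOutputCollector collector;

        @Override
        public void open(Map<String, Object> conf, TopologyContext ctx,
                         SpoutOutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void nextTuple() {
            Utils.sleep(1000);
            collector.emit(new Values("hello storm hello mrs"));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("sentence"));
        }
    }

    // Bolt: splits each sentence into words.
    public static class SplitBolt extends BaseBasicBolt {
        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            for (String word : input.getStringByField("sentence").split(" ")) {
                collector.emit(new Values(word));
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word"));
        }
    }

    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("sentences", new SentenceSpout(), 1);
        builder.setBolt("split", new SplitBolt(), 2).shuffleGrouping("sentences");

        Config conf = new Config();
        conf.setNumWorkers(2);
        StormSubmitter.submitTopology("demo-topology", conf, builder.createTopology());
    }
}
```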