Scenario Description

Assume that table1 of HBase stores a user's consumption amount on the current day and table2 stores the user's history consumption amount data.

In table1, the key=1,cf:cid=100 record indicates that user1's consumption amount on the current day is 100 CNY.

In table2, the key=1,cf:cid=1000 record indicates that user1's history consumption amount is 1000 CNY.

Based on some service requirements, a Spark application must be developed to implement the following functions:

Calculate a user's total consumption amount based on the user name, that is, the user's total consumption amount = 100 (consumption amount of the current day) + 1000 (history consumption amount).

In the preceding example, the application writes the result to table2: the total consumption amount of user1 (key=1) becomes cf:cid=1100, that is, 1100 CNY.
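The calculation described above can be illustrated with a minimal sketch in plain Python. It does not use Spark or HBase: the two dictionaries are stand-ins for table1 and table2, and the function name merge_consumption is hypothetical. The sketch also assumes that a user without a history record starts from 0, which the scenario does not specify.

```python
# In-memory stand-ins for the two HBase tables: row key -> {column -> value}.
# The values are kept as strings, matching what the shell put commands write.
table1 = {"1": {"cf:cid": "100"}}    # consumption amount of the current day
table2 = {"1": {"cf:cid": "1000"}}   # history consumption amount


def merge_consumption(today, history, column="cf:cid"):
    """Add each user's current-day amount to the history amount, in place."""
    for row_key, cells in today.items():
        today_amount = int(cells[column])
        # Assumption of this sketch: a user with no history row starts from 0.
        history_amount = int(history.get(row_key, {}).get(column, "0"))
        history.setdefault(row_key, {})[column] = str(today_amount + history_amount)
    return history


merge_consumption(table1, table2)
print(table2["1"]["cf:cid"])  # prints 1100
```

In a real Spark application the same per-key addition would run over records read from table1, with the result written back to table2.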

Data Planning

Use the HBase shell tool to create HBase table1 and table2 and insert data into them.

Run the following command in the HBase shell to create a table named table1:

create 'table1', 'cf'

Run the following command in the HBase shell to insert data into table1:

put 'table1', '1', 'cf:cid', '100'

Run the following command in the HBase shell to create a table named table2:

create 'table2', 'cf'

Run the following command in the HBase shell to insert data into table2:

put 'table2', '1', 'cf:cid', '1000'

If Kerberos authentication is enabled, set spark.yarn.security.credentials.hbase.enabled to true both in the client configuration file spark-defaults.conf and on the sparkJDBC server.