Scenario Description

Users can use Spark to call HBase APIs to operate on HBase tables. In a Spark application, users can use HBase APIs to create a table, insert data into it, and read data from it.
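The following Scala sketch shows one way to do this from the driver of a Spark application using the HBase client API. It is a minimal sketch: the table name shb1, column family cf1, and the sample row are illustrative placeholders, not names defined by this scenario.

    import org.apache.hadoop.hbase.{HBaseConfiguration, HColumnDescriptor, HTableDescriptor, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.spark.sql.SparkSession

    object CreateAndWriteHBaseTable {
      def main(args: Array[String]): Unit = {
        // The SparkSession is created as usual; the HBase calls below run in the driver.
        val spark = SparkSession.builder().appName("CreateAndWriteHBaseTable").getOrCreate()

        // Build an HBase configuration from the client configuration files on the classpath.
        val hbConf = HBaseConfiguration.create()
        val connection = ConnectionFactory.createConnection(hbConf)
        val admin = connection.getAdmin

        // Create the table if it does not exist yet (table and column family names are placeholders).
        val tableName = TableName.valueOf("shb1")
        if (!admin.tableExists(tableName)) {
          val tableDesc = new HTableDescriptor(tableName)
          tableDesc.addFamily(new HColumnDescriptor("cf1"))
          admin.createTable(tableDesc)
        }

        // Insert one sample row.
        val table = connection.getTable(tableName)
        val put = new Put(Bytes.toBytes("row1"))
        put.addColumn(Bytes.toBytes("cf1"), Bytes.toBytes("cid"), Bytes.toBytes("20"))
        table.put(put)

        table.close()
        admin.close()
        connection.close()
        spark.stop()
      }
    }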

Data Planning

Save the original data file in HDFS.

  1. Create a text file named input_data1.txt on the local PC and copy the following content into it:
    20,30,40,xxx
  2. Create the /tmp/input directory in HDFS and run the following commands to upload input_data1.txt to it:
    1. On the HDFS client, run the following commands for authentication:

      cd /opt/client

      kinit -kt '/opt/client/Spark/spark/conf/user.keytab' <Service user for authentication>

      NOTE:

      Specify the path of the user.keytab file based on the site requirements.

    2. On the HDFS client running the Linux OS, run the hadoop fs -mkdir /tmp/input command (or the equivalent hdfs dfs -mkdir command) to create the directory.
    3. On the HDFS client running the Linux OS, run the hadoop fs -put input_data1.txt /tmp/input command to upload the data file. (A sketch that reads the uploaded file from a Spark application follows these steps.)
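Once the file is in HDFS, a Spark application can read it back. The following is a minimal sketch, assuming the path /tmp/input/input_data1.txt from the steps above; the application name and the way the parsed fields are printed are illustrative.

    import org.apache.spark.sql.SparkSession

    object ReadInputData {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("ReadInputData").getOrCreate()

        // Each line of input_data1.txt is comma-separated, for example "20,30,40,xxx".
        val records = spark.sparkContext
          .textFile("/tmp/input/input_data1.txt")
          .map(line => line.split(","))

        // Print the parsed fields on the driver.
        records.collect().foreach(fields => println(fields.mkString(" | ")))

        spark.stop()
      }
    }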

NOTE:

If Kerberos authentication is enabled, set spark.yarn.security.credentials.hbase.enabled in the client configuration file spark-defaults.conf to true.
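For example, the entry in spark-defaults.conf (under the client installation directory, /opt/client/Spark/spark/conf in the example above) would look like this:

    # <Client installation directory>/Spark/spark/conf/spark-defaults.conf
    spark.yarn.security.credentials.hbase.enabled true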

Development Guidelines

  1. Create an HBase table.
  2. Insert data into the HBase table.
  3. Use a Spark application to read data from the HBase table, as sketched below.
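The read step can be implemented with Spark's newAPIHadoopRDD and HBase's TableInputFormat. The sketch below assumes the illustrative table name shb1 used earlier; adapt the table name and the output handling to the actual application.

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.spark.sql.SparkSession

    object ReadHBaseTable {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("ReadHBaseTable").getOrCreate()

        // Point TableInputFormat at the table to scan (the table name is a placeholder).
        val hbConf = HBaseConfiguration.create()
        hbConf.set(TableInputFormat.INPUT_TABLE, "shb1")

        // Read the HBase table as an RDD of (row key, Result) pairs.
        val hbaseRDD = spark.sparkContext.newAPIHadoopRDD(
          hbConf,
          classOf[TableInputFormat],
          classOf[ImmutableBytesWritable],
          classOf[Result])

        // Convert each row to serializable values on the executors, then print on the driver.
        val rows = hbaseRDD.map { case (key, result) =>
          (Bytes.toString(key.get()), result.size())
        }
        rows.collect().foreach { case (rowKey, cellCount) =>
          println(s"row key: $rowKey, cells: $cellCount")
        }

        spark.stop()
      }
    }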