Scenario Description
Users can use Spark to call HBase APIs to operate HBase tables. In a Spark application, users can use the HBase APIs to create a table, read data from it, and insert data into it.
Data Planning
Save the original data file in HDFS.
- Create the input_data1.txt text file on the local PC and copy the following content to the input_data1.txt file.
20,30,40,xxx
- Create the /tmp/input directory in HDFS and upload input_data1.txt to it by running the following commands:
- On the HDFS client, run the following commands for authentication:
kinit -kt '/opt/client/Spark/spark/conf/user.keytab' <Service user for authentication>
Specify the path of the user.keytab file based on the site requirements.
- On the HDFS client running the Linux OS, run the hadoop fs -mkdir /tmp/input command (or the hdfs dfs command) to create a directory.
- On the HDFS client running the Linux OS, run the hadoop fs -put input_data1.txt /tmp/input command to upload the data file.
If Kerberos authentication is enabled, set spark.yarn.security.credentials.hbase.enabled to true in the client configuration file spark-defaults.conf.
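For reference, the same switch can also be set programmatically when the Spark configuration is built. The following is a minimal Java sketch (the class name is illustrative), equivalent to the spark-defaults.conf entry:

    import org.apache.spark.SparkConf;

    public class HBaseCredentialConfSketch {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("SparkOnHBase");
            // Equivalent to the spark-defaults.conf entry:
            //   spark.yarn.security.credentials.hbase.enabled  true
            conf.set("spark.yarn.security.credentials.hbase.enabled", "true");
        }
    }

Setting the parameter in spark-defaults.conf applies to every application submitted from that client, whereas setting it in code affects only the current application.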
Development Guidelines
- Create an HBase table.
- Insert data into the HBase table.
- Use a Spark application to read data from the HBase table. (A combined sketch of these steps follows.)
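The following is a minimal Java sketch covering these three steps. The table name (shb1), column family (cf), row key, qualifier, and application name are illustrative assumptions, not values prescribed by this guide; with Kerberos enabled, authenticate first as described in Data Planning.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SparkOnHBaseSketch {
        public static void main(String[] args) throws IOException {
            // Illustrative names; replace with site-specific values.
            TableName tableName = TableName.valueOf("shb1");
            byte[] family = Bytes.toBytes("cf");

            Configuration hbConf = HBaseConfiguration.create();

            // Step 1: create the HBase table if it does not exist.
            try (Connection conn = ConnectionFactory.createConnection(hbConf);
                 Admin admin = conn.getAdmin()) {
                if (!admin.tableExists(tableName)) {
                    HTableDescriptor desc = new HTableDescriptor(tableName);
                    desc.addFamily(new HColumnDescriptor(family));
                    admin.createTable(desc);
                }

                // Step 2: insert one row; the value mirrors input_data1.txt.
                try (Table table = conn.getTable(tableName)) {
                    Put put = new Put(Bytes.toBytes("row1"));
                    put.addColumn(family, Bytes.toBytes("q1"),
                            Bytes.toBytes("20,30,40,xxx"));
                    table.put(put);
                }
            }

            // Step 3: read the table from Spark via TableInputFormat.
            SparkConf sparkConf = new SparkConf().setAppName("SparkOnHBase");
            JavaSparkContext jsc = new JavaSparkContext(sparkConf);
            hbConf.set(TableInputFormat.INPUT_TABLE, tableName.getNameAsString());
            JavaPairRDD<ImmutableBytesWritable, Result> rdd =
                    jsc.newAPIHadoopRDD(hbConf, TableInputFormat.class,
                            ImmutableBytesWritable.class, Result.class);
            System.out.println("Row count: " + rdd.count());
            jsc.stop();
        }
    }

Package the class into a JAR and submit it with spark-submit on the client.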