Scenario
Scenario Description
Spark can access HBase in two clusters concurrently, provided that mutual trust has been configured between the two clusters.
Data Planning
- Add the IP addresses and host names of all ZooKeeper and HBase nodes in cluster2 to the /etc/hosts file on the client node of cluster1.
- In both cluster1 and cluster2, locate the hbase-site.xml file in the conf directory of the Spark2x client, and save the two files to the /opt/example/A and /opt/example/B directories, respectively.
- Run the following spark-submit command:
spark-submit --master yarn --deploy-mode client --files /opt/example/B/hbase-site.xml --keytab /opt/FIclient/user.keytab --principal sparkuser --class com.huawei.spark.examples.SparkOnMultiHbase /opt/example/SparkOnMultiHbase-1.0.jar
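Before submitting, it can help to confirm that the /etc/hosts entries from the first step are in place. The following is a minimal sketch, not part of the sample project: it reads the standard hbase.zookeeper.quorum property from the cluster2 hbase-site.xml file and tries to resolve each host name from the cluster1 client node. The class name QuorumHostCheck is hypothetical, the default path assumes the cluster2 file was saved under /opt/example/B, and the quorum is assumed to list host names or IPv4 addresses. It covers only the ZooKeeper quorum hosts, so the other HBase nodes still need to be checked separately.

```java
import java.io.File;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: checks that the ZooKeeper hosts named in an
// hbase-site.xml file can be resolved on the local node.
public class QuorumHostCheck {

    // Returns the hosts listed in hbase.zookeeper.quorum, with any ":port" suffix removed.
    static List<String> quorumHosts(File hbaseSite) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(hbaseSite);
        NodeList props = doc.getElementsByTagName("property");
        List<String> hosts = new ArrayList<>();
        for (int i = 0; i < props.getLength(); i++) {
            Element prop = (Element) props.item(i);
            if (!"hbase.zookeeper.quorum".equals(text(prop, "name"))) {
                continue;
            }
            for (String entry : text(prop, "value").split(",")) {
                String host = entry.trim();
                int colon = host.indexOf(':');
                if (colon >= 0) {
                    host = host.substring(0, colon);
                }
                if (!host.isEmpty()) {
                    hosts.add(host);
                }
            }
        }
        return hosts;
    }

    // Returns the trimmed text of the first child element with the given tag, or "" if absent.
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Assumed location of the cluster2 file; pass another path as the first argument.
        String path = args.length > 0 ? args[0] : "/opt/example/B/hbase-site.xml";
        for (String host : quorumHosts(new File(path))) {
            try {
                System.out.println(host + " -> " + InetAddress.getByName(host).getHostAddress());
            } catch (UnknownHostException e) {
                System.out.println(host + " does not resolve; check /etc/hosts");
            }
        }
    }
}
```

If any host is reported as unresolved, the corresponding entry is probably missing from /etc/hosts on the cluster1 client node.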
Development Approach
- To access HBase, create a Configuration object from the configuration file of the corresponding cluster, and then use it to create a Connection object.
- Use the Connection object to operate on HBase tables, for example, to create a table and to insert, view, and print data.
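The approach above can be sketched with the standard HBase client API as follows. This is an illustration under stated assumptions, not the code of the SparkOnMultiHbase sample: the class name MultiHBaseSketch, the table demo_table, and the column family cf are hypothetical, and the table is assumed to already exist in both clusters. The sketch also assumes that the HBase client libraries are on the classpath, that authentication has already been completed (for example through the --keytab and --principal options of spark-submit), and that the two hbase-site.xml files are in the directories described in Data Planning.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical sketch: one Configuration and one Connection per cluster.
public class MultiHBaseSketch {

    // Builds a Configuration whose settings are overridden by the given hbase-site.xml file.
    static Configuration confFor(String siteFile) {
        Configuration conf = HBaseConfiguration.create();
        conf.addResource(new Path(siteFile));
        return conf;
    }

    // Writes one cell to the assumed table and reads it back through a cluster-specific Connection.
    static void writeAndRead(Configuration conf, String label) throws IOException {
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("demo_table"))) { // assumed table
            byte[] row = Bytes.toBytes("row1");
            byte[] family = Bytes.toBytes("cf"); // assumed column family
            byte[] qualifier = Bytes.toBytes("q");
            table.put(new Put(row).addColumn(family, qualifier, Bytes.toBytes("from " + label)));
            Result result = table.get(new Get(row));
            System.out.println(label + ": " + Bytes.toString(result.getValue(family, qualifier)));
        }
    }

    public static void main(String[] args) throws IOException {
        // Each cluster gets its own Configuration, built from its own hbase-site.xml.
        writeAndRead(confFor("/opt/example/A/hbase-site.xml"), "cluster1");
        writeAndRead(confFor("/opt/example/B/hbase-site.xml"), "cluster2");
    }
}
```

Keeping a separate Configuration per cluster means that each Connection resolves its own ZooKeeper quorum, so the two clusters' settings do not overwrite each other. In code that runs on Spark executors, the file shipped with --files would typically be referenced by its file name rather than by the absolute path used here.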