Scenario

Scenario Description

Spark can access HBase in two clusters concurrently, provided that mutual trust has been configured between the two clusters.

Data Planning

Add the IP addresses and host names of all ZooKeeper and HBase nodes in cluster2 to the /etc/hosts file on the client node of cluster1.
In both cluster1 and cluster2, find the hbase-site.xml file in the conf directory of the Spark2x client. Save the file from cluster1 to the /opt/example/A directory and the file from cluster2 to the /opt/example/B directory.

Run the following spark-submit command:

spark-submit --master yarn --deploy-mode client --files /opt/example/B/hbase-site.xml --keytab /opt/FIclient/user.keytab --principal sparkuser --class com.huawei.spark.examples.SparkOnMultiHbase /opt/example/SparkOnMultiHbase-1.0.jar

Development Approach

To access HBase in a given cluster, load that cluster's configuration file into a Configuration object, and then use the Configuration object to create a Connection object.
Use the Connection object to perform operations on the HBase table, such as creating a table and inserting, viewing, and printing data.
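
The approach above can be sketched in Scala as follows. This is a minimal illustration rather than the shipped sample code: the object name MultiClusterSketch, the table name example_table, the column family cf, and the row and qualifier names are hypothetical, and the table is assumed to already exist in both clusters. The hbase-site.xml paths are the ones from Data Planning, and Kerberos authentication is assumed to be in place before the connections are opened.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{Connection, ConnectionFactory, Get, Put}
import org.apache.hadoop.hbase.util.Bytes

// Hypothetical object name; not the sample class referenced by spark-submit.
object MultiClusterSketch {

  // Build a Configuration from one cluster's hbase-site.xml.
  // Resources added later override earlier values unless marked final.
  def confFor(hbaseSitePath: String): Configuration = {
    val conf = HBaseConfiguration.create()
    conf.addResource(new Path(hbaseSitePath))
    conf
  }

  // Write one cell and read it back through the given connection.
  // Assumption: the table and column family "cf" already exist.
  def writeAndRead(conn: Connection, tableName: String, value: String): Unit = {
    val table = conn.getTable(TableName.valueOf(tableName))
    try {
      val put = new Put(Bytes.toBytes("row1"))
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(value))
      table.put(put)

      val result = table.get(new Get(Bytes.toBytes("row1")))
      println(Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q"))))
    } finally {
      table.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // One Configuration and one Connection per cluster.
    val conn1 = ConnectionFactory.createConnection(confFor("/opt/example/A/hbase-site.xml"))
    val conn2 = ConnectionFactory.createConnection(confFor("/opt/example/B/hbase-site.xml"))
    try {
      writeAndRead(conn1, "example_table", "from-cluster1")
      writeAndRead(conn2, "example_table", "from-cluster2")
    } finally {
      conn1.close()
      conn2.close()
    }
  }
}
```

Keeping a separate Configuration for each cluster means that each Connection resolves the ZooKeeper quorum from its own hbase-site.xml, so operations issued through conn1 and conn2 go to different clusters.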
