
Development Plan

Overview

Spark can access two HBase clusters concurrently, provided that mutual trust has been configured between them.

Preparing Data

  1. Add the IP addresses and host names of all ZooKeeper and HBase nodes in cluster2 to the /etc/hosts file on the client node of cluster1.
  2. Obtain the hbase-site.xml file from the conf directory of the Spark2x client in cluster1 and in cluster2, and save the two files to the /opt/example/A and /opt/example/B directories on the client node, respectively.
  3. Run the following spark-submit command:

    spark-submit --master yarn --deploy-mode client --files /opt/example/B/hbase-site.xml --keytab /opt/FIclient/user.keytab --principal sparkuser --class com.huawei.spark.examples.SparkOnMultiHbase /opt/example/SparkOnMultiHbase-1.0.jar

Development Guidelines

  1. To access HBase, create a Configuration object from the hbase-site.xml file of the corresponding cluster, and use that Configuration to create a Connection object.
  2. Use the Connection object to operate on the HBase table of that cluster, for example, to create a table and to insert, read, and print data, as shown in the sketch after this list.
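
The following Scala sketch illustrates these two steps, assuming the hbase-site.xml files were saved to /opt/example/A and /opt/example/B as described in Preparing Data. The object name MultiHBaseSketch, the table name sparkOnHbaseDemo, and the column family cf are hypothetical, the HBase 2.x client API is used, and Kerberos authentication is assumed to be handled by the --keytab and --principal options of spark-submit; the packaged SparkOnMultiHbase example may differ in detail.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ColumnFamilyDescriptorBuilder, Connection, ConnectionFactory, Get, Put, TableDescriptorBuilder}
    import org.apache.hadoop.hbase.util.Bytes

    object MultiHBaseSketch {

      // Step 1: build a Configuration from the hbase-site.xml saved for one cluster.
      def clusterConf(hbaseSitePath: String): Configuration = {
        val conf = HBaseConfiguration.create()
        conf.addResource(new Path(hbaseSitePath))
        conf
      }

      // Step 2: create a demo table if it does not exist, write one row, then read it back and print it.
      def tableDemo(conn: Connection): Unit = {
        val tableName = TableName.valueOf("sparkOnHbaseDemo")   // hypothetical table name
        val family    = Bytes.toBytes("cf")                     // hypothetical column family

        val admin = conn.getAdmin
        try {
          if (!admin.tableExists(tableName)) {
            val desc = TableDescriptorBuilder.newBuilder(tableName)
              .setColumnFamily(ColumnFamilyDescriptorBuilder.of("cf"))
              .build()
            admin.createTable(desc)
          }
        } finally {
          admin.close()
        }

        val table = conn.getTable(tableName)
        try {
          // Insert one row.
          val put = new Put(Bytes.toBytes("row1"))
          put.addColumn(family, Bytes.toBytes("qualifier"), Bytes.toBytes("value1"))
          table.put(put)

          // Read the row back and print it.
          val result = table.get(new Get(Bytes.toBytes("row1")))
          val value  = Bytes.toString(result.getValue(family, Bytes.toBytes("qualifier")))
          println(s"Read back from ${tableName.getNameAsString}: row1 -> $value")
        } finally {
          table.close()
        }
      }

      def main(args: Array[String]): Unit = {
        // One hbase-site.xml per cluster, matching the directories used in Preparing Data.
        // Kerberos login is assumed to be handled by spark-submit (--keytab/--principal).
        val hbaseSites = Seq("/opt/example/A/hbase-site.xml", "/opt/example/B/hbase-site.xml")
        hbaseSites.foreach { site =>
          val conn = ConnectionFactory.createConnection(clusterConf(site))
          try {
            tableDemo(conn)
          } finally {
            conn.close()
          }
        }
      }
    }

Because each Connection is built from its own Configuration, the same table logic runs against cluster1 and cluster2 in turn, without the two clusters sharing any HBase client configuration.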