Updated on 2023-01-11 GMT+08:00

Configuring an HBase Data Source

Scenario

This section describes how to add an HBase data source on HSConsole.

Prerequisites

  • The domain name of the data source cluster must be different from the domain name of the HetuEngine cluster.
  • The nodes of the data source cluster and the nodes of the HetuEngine cluster can communicate with each other.
  • In the /etc/hosts file on every node of the HetuEngine cluster, add the mappings between the host names and IP addresses of the data source cluster, and also add the entry 10.10.10.10 hadoop.<system domain name> (for example, 10.10.10.10 hadoop.hadoop.com). Otherwise, HetuEngine cannot resolve the host names of nodes outside its own cluster and cannot connect to them.
  • A HetuEngine compute instance has been created.
  • The SSL communication encryption configuration of ZooKeeper in the cluster where the data source is located must be the same as that of ZooKeeper in the cluster where HetuEngine is located.

    To check whether SSL communication encryption is enabled, log in to FusionInsight Manager, choose Cluster > Services > ZooKeeper > Configurations > All Configurations, and enter ssl.enabled in the search box. If the value of ssl.enabled is true, SSL communication encryption is enabled. If the value is false, SSL communication encryption is disabled.
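
The /etc/hosts prerequisite can be checked before any data source is added. The following is a minimal sketch, not part of the product: the function name missing_host_entries and the sample host names are hypothetical, and it only reports which required names have no entry in the text of a hosts file.

```python
def missing_host_entries(hosts_text, required_names):
    """Return the names in required_names that hosts_text does not map."""
    mapped = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        fields = entry.split()
        # fields[0] is the IP address; the rest are host names and aliases.
        mapped.update(name.lower() for name in fields[1:])
    return [name for name in required_names if name.lower() not in mapped]


# Hypothetical sample content; on a real node, read /etc/hosts instead.
sample = """
127.0.0.1    localhost
10.10.10.10  hadoop.hadoop.com
10.0.136.132 node-master1   # hypothetical data source node
"""
print(missing_host_entries(sample, ["hadoop.hadoop.com", "node-master1", "node-master2"]))
```

Running the check on every HetuEngine node with the host names of the data source cluster gives an empty list when the mappings are complete.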

Procedure

  1. Obtain the hbase-site.xml, hdfs-site.xml, and core-site.xml configuration files of the HBase data source.

    1. Log in to FusionInsight Manager of the cluster where the HBase data source is located.
    2. Choose Cluster > Dashboard.
    3. Choose More > Download Client and download the client file as prompted.
    4. Decompress the downloaded client file package and obtain the hbase-site.xml, core-site.xml, and hdfs-site.xml files in the FusionInsight_Cluster_1_Services_ClientConfig/HBase/config directory.
    5. If the hbase-site.xml file contains hbase.rpc.client.impl, change its value to org.apache.hadoop.hbase.ipc.RpcClientImpl:
      <property>
        <name>hbase.rpc.client.impl</name>
        <value>org.apache.hadoop.hbase.ipc.RpcClientImpl</value>
      </property>
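
The edit to hbase.rpc.client.impl can also be scripted. The sketch below is one possible approach using the Python standard library; the function name set_rpc_client_impl is hypothetical. It changes the value only if the property already exists, as the step requires, and leaves the file content otherwise equivalent. Note that ElementTree does not preserve XML comments, so editing by hand may be preferable if the comments matter.

```python
import xml.etree.ElementTree as ET

RPC_CLIENT_KEY = "hbase.rpc.client.impl"
RPC_CLIENT_VALUE = "org.apache.hadoop.hbase.ipc.RpcClientImpl"


def set_rpc_client_impl(site_xml):
    """Return (updated_xml, changed) for the text of an hbase-site.xml file."""
    root = ET.fromstring(site_xml)
    changed = False
    for prop in root.iter("property"):
        # Only touch the property whose <name> matches the key.
        if prop.findtext("name", default="").strip() != RPC_CLIENT_KEY:
            continue
        value = prop.find("value")
        if value is None:
            value = ET.SubElement(prop, "value")
        if value.text != RPC_CLIENT_VALUE:
            value.text = RPC_CLIENT_VALUE
            changed = True
    return ET.tostring(root, encoding="unicode"), changed
```

A caller would read hbase-site.xml into a string, pass it to the function, and write the result back only when changed is True.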

  2. Obtain the user.keytab and krb5.conf files of the proxy user of the HBase data source.

    1. Log in to FusionInsight Manager of the cluster where the HBase data source is located.
    2. Choose System > Permission > User.
    3. Locate the row that contains the target data source user, click More in the Operation column, and select Download Authentication Credential.
    4. Decompress the downloaded package to obtain the user.keytab and krb5.conf files.

    The proxy user of the data source must have the permission to perform HBase operations.
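
The Principal value entered later has the form user@REALM. As a rough sketch, and assuming the proxy user belongs to the default realm declared in the downloaded krb5.conf (an assumption, not something this section states), the realm can be read from that file. The function names default_realm and principal_for are hypothetical.

```python
import re


def default_realm(krb5_text):
    """Return the default_realm value from krb5.conf text, or None if absent."""
    for line in krb5_text.splitlines():
        stripped = line.strip()
        # Skip lines that start with '#' or ';', which are treated as comments.
        if stripped.startswith(("#", ";")):
            continue
        match = re.match(r"default_realm\s*=\s*(\S+)", stripped)
        if match:
            return match.group(1)
    return None


def principal_for(user, krb5_text):
    """Build user@REALM from a user name and krb5.conf text."""
    realm = default_realm(krb5_text)
    if realm is None:
        raise ValueError("default_realm not found in krb5.conf text")
    return f"{user}@{realm}"
```

If the result differs from the principal shown for the user on FusionInsight Manager, trust FusionInsight Manager.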

  3. Log in to FusionInsight Manager as a HetuEngine administrator and choose Cluster > Services > HetuEngine. The HetuEngine service page is displayed.
  4. In the Basic Information area on the Dashboard page, click the link next to HSConsole WebUI. The HSConsole page is displayed.
  5. Choose Data Source.
  6. Click Add Data Source. Configure parameters on the Add Data Source page.

    1. Configure Basic Information. For details, see Table 1.
      Table 1 Basic Information

      | Parameter | Description | Example Value |
      |---|---|---|
      | Name | Name of the data source to be connected. The value can contain only letters, digits, and underscores (_) and must start with a letter. | hbase_1 |
      | Data Source Type | Type of the data source to be connected. Select HBase. | HBase |
      | Description | Description of the data source. The value can contain only letters, digits, commas (,), periods (.), underscores (_), spaces, and line breaks. | - |

    2. Configure parameters in the HBase Configuration area. For details, see Table 2.
      Table 2 HBase Configuration

      | Parameter | Description | Example Value |
      |---|---|---|
      | Driver | The default value is hbase-connector. | hbase-connector |
      | ZooKeeper Quorum Address | Service IP addresses of all quorumpeer instances of the ZooKeeper service for the data source. If the ZooKeeper service of the data source uses IPv6, also specify the client port number in the ZooKeeper Quorum address. To view the IP addresses of all hosts housing the quorumpeer instances, log in to FusionInsight Manager and choose Cluster > Services > ZooKeeper > Instance. | IPv4: 10.0.136.132,10.0.136.133,10.0.136.134; IPv6: [0:0:0:0:0:0:0:0]:24002 |
      | ZooKeeper Client Port Number | Port number of the ZooKeeper client. To view it, log in to FusionInsight Manager, choose Cluster > Services > ZooKeeper, and check the value of clientPort on the Configurations tab page. | 2181 |
      | HBase RPC Communication Protection | Set this parameter based on the value of hbase.rpc.protection in the hbase-site.xml file obtained in step 1. If the value is authentication, set this parameter to No. If the value is privacy, set this parameter to Yes. | No |
      | Security Authentication Mechanism | After the security mode is enabled, the default value is KERBEROS. | KERBEROS |
      | Principal | Configure this parameter when the security authentication mechanism is enabled. Set it to the user to whom the user.keytab file obtained in step 2 belongs. | user_hbase@HADOOP2.COM |
      | Keytab File | Configure this parameter when the security mode is enabled. It specifies the security authentication key. Select the user.keytab file obtained in step 2. | user.keytab |
      | krb5 File | Configure this parameter when the security mode is enabled. It is the configuration file used for Kerberos authentication. Select the krb5.conf file obtained in step 2. | krb5.conf |
      | hbase-site File | Configure this parameter when the security mode is enabled. It is the configuration file required for connecting to HBase. Select the hbase-site.xml file obtained in step 1. | hbase-site.xml |
      | core-site File | Configure this parameter when the security mode is enabled. This file is required for connecting to HDFS. Select the core-site.xml file obtained in step 1. | core-site.xml |
      | hdfs-site File | Configure this parameter when the security mode is enabled. This file is required for connecting to HDFS. Select the hdfs-site.xml file obtained in step 1. | hdfs-site.xml |

    3. Modify custom configurations.
      • You can click Add to add custom configuration parameters.
      • You can click Delete to delete custom configuration parameters.
    4. Click OK.
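
Two of the rules in the tables above are mechanical and can be checked before the form is submitted: the naming rule for the data source, and the mapping from hbase.rpc.protection to the HBase RPC Communication Protection setting. The sketch below illustrates both. The function names are hypothetical, "letters" is assumed to mean ASCII letters, and any hbase.rpc.protection value other than the two listed is rejected rather than guessed.

```python
import re

# Starts with a letter, then letters, digits, or underscores (ASCII assumed).
_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_data_source_name(name):
    """Return True if name satisfies the data source naming rule."""
    return _NAME_RE.fullmatch(name) is not None


def rpc_protection_setting(hbase_rpc_protection):
    """Map an hbase.rpc.protection value to the Yes/No form setting."""
    mapping = {"authentication": "No", "privacy": "Yes"}
    key = hbase_rpc_protection.strip().lower()
    if key not in mapping:
        # Values not covered by the table are not mapped here.
        raise ValueError(f"unsupported hbase.rpc.protection value: {hbase_rpc_protection!r}")
    return mapping[key]


print(is_valid_data_source_name("hbase_1"))      # True
print(rpc_protection_setting("authentication"))  # No
```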

  7. Log in to the node where the cluster client is located and run the following commands to switch to the client installation directory and authenticate the user:

    cd /opt/client

    source bigdata_env

    kinit <user performing HetuEngine operations>

    If the cluster is in normal mode, skip the kinit command.

  8. Run the following command to log in to the catalog of the data source:

    hetu-cli --catalog <data source name> --schema default

    For example, run the following command:

    hetu-cli --catalog hbase_1 --schema default
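
When the login is driven from a script, the command line can be assembled from the data source name rather than typed by hand. The following sketch only builds and prints the command shown above; the function name hetu_cli_command is hypothetical, and it assumes the client environment (source bigdata_env and, in security mode, kinit) has already been set up in the shell that will run it.

```python
import shlex


def hetu_cli_command(catalog, schema="default"):
    """Return the hetu-cli argument list for a catalog and schema."""
    return ["hetu-cli", "--catalog", catalog, "--schema", schema]


# Prints: hetu-cli --catalog hbase_1 --schema default
print(shlex.join(hetu_cli_command("hbase_1")))
```

The returned list can be passed to subprocess.run as is, which avoids quoting problems if a name is ever taken from user input.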