Updated on 2024-12-13 GMT+08:00

Adding a GBase Data Source

This topic is available for MRS 3.5.0 and later versions only.

HetuEngine allows you to configure, access, and query the GBase data source. This topic describes how to add a GBase JDBC data source on the HSConsole page of the cluster.

Prerequisites

  • The data source and the HetuEngine cluster nodes can communicate with each other.
  • The mappings between the host names and IP addresses of the cluster where the data source is deployed have been added to the /etc/hosts file on all nodes of the cluster where HetuEngine is deployed.
  • If Kerberos authentication has been enabled for the cluster (security mode), create a HetuEngine administrator. If Kerberos authentication has been disabled for the cluster (normal mode), create a HetuEngine service user and grant the HDFS administrator permission to the user. That is, when you create the user, add the user to both the hadoop and hadoopmanager user groups. For details about how to create a user, see Creating a HetuEngine Permission Role.
  • A HetuEngine compute instance has been created. For details, see Creating a HetuEngine Compute Instance.
  • You have obtained the IP address, port number, username, and password for logging in to the GBase database.
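
The /etc/hosts prerequisite above can be scripted so that repeated runs do not add duplicate entries. The following is a minimal sketch, not an official tool: the function name add_host_mapping, the IP address 192.168.1.1, and the host name gbase-node1 are illustrative assumptions. The demo writes to a scratch file; on a real node the target would be /etc/hosts and root permission would be needed.

```shell
# Append "IP hostname" to a hosts file unless the host name is already mapped.
# Usage: add_host_mapping <hosts file> <IP address> <host name>
# (hypothetical helper, not part of MRS or HetuEngine)
add_host_mapping() {
    hm_file="$1"; hm_ip="$2"; hm_name="$3"
    # Look for the host name as a whole field on any non-comment line.
    if awk -v n="$hm_name" '!/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$hm_file"; then
        echo "skip: $hm_name already mapped"
    else
        printf '%s %s\n' "$hm_ip" "$hm_name" >> "$hm_file"
        echo "added: $hm_ip $hm_name"
    fi
}

# Demo against a scratch file instead of the real /etc/hosts.
scratch="$(mktemp)"
add_host_mapping "$scratch" 192.168.1.1 gbase-node1
add_host_mapping "$scratch" 192.168.1.1 gbase-node1   # second call is skipped
cat "$scratch"
```

The same call would be repeated for each data source node and run on every HetuEngine node.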

Constraints on the Interconnection with GBase Data Sources

  • HetuEngine supports the following SQL statements when interconnecting with GBase: SHOW CATALOGS, SHOW SCHEMAS, SHOW TABLES, SHOW COLUMNS, DESCRIBE, USE, and SELECT on tables and views.
  • The schema and table names of GBase data sources supported by HetuEngine are case-insensitive.
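
To make the supported statement list concrete, the sketch below prints one example of each statement type for a given catalog, schema, and table. It only generates SQL text that can be pasted into a hetu-cli session; it does not contact the cluster. The function name is hypothetical, the catalog gbase_1 and schema gbasedb are taken from the example later in this topic, and the table name t1 is an assumption.

```shell
# Print one example of each supported statement type.
# Usage: gbase_supported_statements <catalog> <schema> <table>
# (hypothetical helper for illustration)
gbase_supported_statements() {
    st_catalog="$1"; st_schema="$2"; st_table="$3"
    cat <<EOF
SHOW CATALOGS;
SHOW SCHEMAS FROM ${st_catalog};
USE ${st_catalog}.${st_schema};
SHOW TABLES FROM ${st_catalog}.${st_schema};
SHOW COLUMNS FROM ${st_catalog}.${st_schema}.${st_table};
DESCRIBE ${st_catalog}.${st_schema}.${st_table};
SELECT * FROM ${st_catalog}.${st_schema}.${st_table} LIMIT 10;
EOF
}

# t1 is an assumed table name.
gbase_supported_statements gbase_1 gbasedb t1
```

Statements outside this list, such as INSERT or CREATE TABLE, are not covered by the constraint above and should be expected to fail against a GBase catalog.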

Configuring the GBase Data Source

Installing a cluster client

  1. Install the cluster client that contains the HetuEngine service in the /opt/hadoopclient directory.

Preparing the GBase driver

  1. Obtain the GBase driver file in JAR format from GBase's official website. The version must be gbase-connector-java-9.5.0.1-build1-bin.jar or later.
  2. Upload the GBase driver file to the cluster where HetuEngine is deployed.

    You can use either of the following methods:

    • Upload the file to HDFS on FusionInsight Manager.
      1. Log in to FusionInsight Manager as a HetuEngine administrator and choose Cluster > Services > HDFS.
      2. In the Basic Information area on the Dashboard page, click the link next to NameNode Web UI.
      3. Choose Utilities > Browse the file system and click to create the /user/hetuserver/fiber/extra_file/driver/gbase directory.
      4. Go to the /user/hetuserver/fiber/extra_file/driver/gbase directory and click to upload the GBase driver file obtained in 2.
      5. Click the value in the Permission column in the row containing the driver file, select Read and Write in the User column, Read in the Group column, and Read in the Other column, and click Set.
    • Run HDFS commands to upload the file.
      1. Upload the obtained GBase driver file to any directory on the node where the HDFS service client is deployed.
      2. Log in to the node where the HDFS service client is deployed and switch to the client installation directory, for example, /opt/hadoopclient.

        cd /opt/hadoopclient

      3. Configure environment variables.

        source bigdata_env

      4. If the cluster is in security mode, run the following command to authenticate the user. In normal mode, user authentication is not required.

        kinit <HetuEngine administrator username>

        Enter the password as prompted.

      5. Create the /user/hetuserver/fiber/extra_file/driver/gbase directory, upload the GBase driver obtained in 2, and modify the permission.

        hdfs dfs -mkdir -p /user/hetuserver/fiber/extra_file/driver/gbase

        hdfs dfs -put <GBase driver file path> /user/hetuserver/fiber/extra_file/driver/gbase

        hdfs dfs -chmod -R 644 /user/hetuserver/fiber/extra_file/driver/gbase
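
The three HDFS commands above can be wrapped in a small script with a dry-run switch, so the exact commands can be reviewed before anything is written to HDFS. This is a sketch under assumptions: the helper names run and upload_gbase_driver, the DRY_RUN variable, and the local path /tmp/gbase-connector-java-9.5.0.1-build1-bin.jar are illustrative. The hdfs commands themselves mirror the ones in the step above.

```shell
# Target directory used in the steps above.
DRIVER_DIR=/user/hetuserver/fiber/extra_file/driver/gbase

# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: upload_gbase_driver <local path to the driver JAR>
upload_gbase_driver() {
    up_jar="$1"
    # Reject anything that is not a JAR file before touching HDFS.
    case "$up_jar" in
        *.jar) ;;
        *) echo "error: expected a .jar file, got: $up_jar" >&2; return 1 ;;
    esac
    run hdfs dfs -mkdir -p "$DRIVER_DIR" &&
    run hdfs dfs -put "$up_jar" "$DRIVER_DIR" &&
    run hdfs dfs -chmod -R 644 "$DRIVER_DIR"
}

# Dry run: prints the three commands without contacting HDFS.
DRY_RUN=1
upload_gbase_driver /tmp/gbase-connector-java-9.5.0.1-build1-bin.jar
```

To perform the upload, set DRY_RUN=0 after running source bigdata_env (and kinit in security mode) as described in the preceding steps.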

Configuring the GBase data source

  1. Log in to FusionInsight Manager as a HetuEngine administrator and choose Cluster > Services > HetuEngine.
  2. In the Basic Information area on the Dashboard page, click the link next to HSConsole WebUI.
  3. Choose Data Source and click Add Data Source. Configure parameters on the Add Data Source page.

    1. Configure the basic information, enter the data source name, and select JDBC > GBase as the data source type.
    2. In the GBase Configuration area, configure the parameters according to Table 1.
      Table 1 GBase configurations

      | Parameter | Description | Example Value |
      |---|---|---|
      | Driver Name | Select the GBase driver that has been uploaded in 2. The driver format is xxx.jar. | gbase-connector-java-9.5.0.1-build1-bin.jar |
      | JDBC URL | JDBC URL for connecting to the GBase database. Format: jdbc:gbase://<IP address of the GBase database>:<Port number>. The default port is 5258. | jdbc:gbase://192.168.1.1:5258 |
      | Username | GBase username for connecting to the GBase data source. | - |
      | Password | GBase password for connecting to the GBase data source. | - |

    3. (Optional) Customize the configuration.

      Click Add to add custom configuration parameters for the GBase data source. For details, see Table 2.

      Table 2 Custom parameters for the GBase data source

      | Parameter | Description | Example Value |
      |---|---|---|
      | GBase.auto-reconnect | Whether to reconnect automatically. true (default value): Enable automatic reconnection. false: Disable automatic reconnection. | true |
      | GBase.max-reconnects | Maximum number of reconnection attempts. The default value is 3. | 3 |
      | GBase.jdbc.use-information-schema | Whether the driver should use INFORMATION_SCHEMA to derive the information used by DatabaseMetaData. | true |
      | use-connection-pool | Whether to use the JDBC connection pool. The default value is true. | true |
      | jdbc.connection.pool.maxTotal | Maximum number of connections in the JDBC connection pool. The default value is 8. | 8 |
      | jdbc.connection.pool.maxIdle | Maximum number of idle connections in the JDBC connection pool. The default value is 8. | 8 |
      | jdbc.connection.pool.minIdle | Minimum number of idle connections in the JDBC connection pool. The default value is 0. | 0 |
      | unsupported-type-handling | How data types that are not supported by the connector will be processed. CONVERT_TO_VARCHAR: Convert unsupported types to VARCHAR and allow only read operations on them. IGNORE (default value): Do not display the unsupported types. | IGNORE |
      | join-pushdown.enabled | Whether join pushdown is enabled. true (default value): Enable join pushdown. false: Disable join pushdown. | true |

      You can click Delete to delete custom configuration parameters.

    4. Click OK.

  4. Log in to the node where the cluster client is deployed and run the following commands to switch to the client installation directory and authenticate the user:

    cd /opt/hadoopclient

    source bigdata_env

    kinit <user performing HetuEngine operations>

    If the cluster is in normal mode, skip the kinit command.

  5. Log in to the catalog of the data source.

    hetu-cli --catalog <data source name> --schema <database name>

    For example, run the following command:

    hetu-cli --catalog gbase_1 --schema gbasedb

  6. Run the following command. If the database table information can be viewed or no error is reported, the connection is successful.

    show tables;
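
When filling in the JDBC URL parameter in Table 1, it can help to assemble the value from the host and port and check the port before pasting it into HSConsole. The sketch below follows the Example Value in Table 1 (jdbc:gbase://192.168.1.1:5258) and uses 5258 as the default port stated there. The function name gbase_jdbc_url is an illustrative assumption, not part of HetuEngine.

```shell
# Build a GBase JDBC URL in the form shown in the Example Value column of Table 1.
# Usage: gbase_jdbc_url <host> [port]
# (hypothetical helper for illustration)
gbase_jdbc_url() {
    ju_host="$1"
    ju_port="${2:-5258}"   # default port stated in Table 1
    # Accept digits only.
    case "$ju_port" in
        ''|*[!0-9]*) echo "error: port must be numeric: $ju_port" >&2; return 1 ;;
    esac
    # Accept the valid TCP port range only.
    if [ "$ju_port" -lt 1 ] || [ "$ju_port" -gt 65535 ]; then
        echo "error: port out of range: $ju_port" >&2
        return 1
    fi
    printf 'jdbc:gbase://%s:%s\n' "$ju_host" "$ju_port"
}

gbase_jdbc_url 192.168.1.1    # prints jdbc:gbase://192.168.1.1:5258
```

If show tables; reports a connection error, a mistyped host or port in this URL is one of the first things to check, along with the /etc/hosts mappings listed in the prerequisites.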

Mapping Between GBase and HetuEngine Data Types

Table 3 Mapping Between GBase and HetuEngine Data Types

| GBase Type | HetuEngine Type |
|---|---|
| TINYINT | TINYINT |
| SMALLINT | SMALLINT |
| INTEGER | INTEGER |
| BIGINT | BIGINT |
| DOUBLE | DOUBLE |
| FLOAT | REAL |
| DECIMAL(p, s) | DECIMAL(p, s) |
| CHAR(n) | CHAR(n) |
| VARCHAR(n) | VARCHAR(n) |
| TEXT | VARCHAR(65535) |
| BLOB, LONGBLOB | VARBINARY |
| DATE | DATE |
| TIME | TIME |
| DATETIME | TIMESTAMP(6) |
| TIMESTAMP(n) | TIMESTAMP(n) |
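
The mapping in Table 3 can also be written as a small lookup, which is convenient when checking a GBase table definition column by column before querying it through HetuEngine. This is a sketch only: the function name gbase_to_hetu_type is hypothetical, and it encodes exactly the rows of Table 3 and nothing else. Types not listed in Table 3 return an error here; how HetuEngine treats them is governed by the unsupported-type-handling parameter in Table 2.

```shell
# Look up the HetuEngine type for a GBase type, following Table 3.
# Usage: gbase_to_hetu_type <GBase type>
# (hypothetical helper for illustration)
gbase_to_hetu_type() {
    # Upper-case the input so that "text" and "TEXT" are treated alike.
    gt_type="$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
    case "$gt_type" in
        TINYINT|SMALLINT|INTEGER|BIGINT|DOUBLE|DATE|TIME) echo "$gt_type" ;;
        FLOAT) echo "REAL" ;;
        TEXT) echo "VARCHAR(65535)" ;;
        BLOB|LONGBLOB) echo "VARBINARY" ;;
        DATETIME) echo "TIMESTAMP(6)" ;;
        # Parameterized types keep their precision, scale, or length.
        'DECIMAL('*')'|'CHAR('*')'|'VARCHAR('*')'|'TIMESTAMP('*')') echo "$gt_type" ;;
        *) echo "unsupported: $1" >&2; return 1 ;;
    esac
}

gbase_to_hetu_type FLOAT       # prints REAL
gbase_to_hetu_type datetime    # prints TIMESTAMP(6)
```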