Adding a GBase Data Source

This topic is available for MRS 3.5.0 and later versions only.

HetuEngine allows you to configure, access, and query GBase data sources. This topic describes how to add a GBase JDBC data source on the HSConsole page of the cluster.

Prerequisites

The data source and the HetuEngine cluster nodes can communicate with each other.
In the /etc/hosts file on every node of the cluster where HetuEngine is deployed, add the mappings between the host names and IP addresses of the cluster where the data source is deployed.
If Kerberos authentication has been enabled for the cluster (security mode), create a HetuEngine administrator. If Kerberos authentication has been disabled for the cluster (normal mode), create a HetuEngine service user and grant the HDFS administrator permission to the user. That is, when you create the user, add the user to both the hadoop and hadoopmanager user groups. For details about how to create a user, see Creating a HetuEngine Permission Role.
A HetuEngine compute instance has been created. For details, see Creating a HetuEngine Compute Instance.
You have obtained the IP address, port number, username, and password for logging in to the GBase database.
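
The host-name mapping prerequisite above can be checked before you continue. The following is a minimal sketch, not part of the product: the function name has_host_mapping and the host name gbase-node1 are hypothetical, and the check only confirms that a host name appears on a non-comment line of a hosts-format file.

```shell
# has_host_mapping HOSTS_FILE HOSTNAME
# Succeeds if a non-comment line in HOSTS_FILE lists HOSTNAME after the IP address.
# Hypothetical helper for illustration only.
has_host_mapping() {
    awk -v host="$2" '
        $1 ~ /^#/ { next }                  # skip lines that are entirely comments
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break        # ignore a trailing comment
                if ($i == host) found = 1   # host name or alias matches
            }
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Example usage on a HetuEngine node (gbase-node1 is an assumed host name):
#   has_host_mapping /etc/hosts gbase-node1 || echo "mapping missing" >&2
```

Run the check on each node of the HetuEngine cluster, once for every host name of the GBase cluster.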

Constraints on the Interconnection with GBase Data Sources

When interconnecting with GBase, HetuEngine supports the following SQL syntax: SHOW CATALOGS/SCHEMAS/TABLES/COLUMNS, DESCRIBE, USE, and SELECT on tables and views.
HetuEngine treats the schema and table names of GBase data sources as case-insensitive.

Configuring the GBase Data Source

Installing a cluster client

Install the cluster client that contains the HetuEngine service in the /opt/hadoopclient directory.

Preparing the GBase driver

Obtain the GBase JDBC driver (a JAR file) from the official GBase website. The driver must be gbase-connector-java-9.5.0.1-build1-bin.jar or a later version.
Upload the GBase driver file to the cluster where HetuEngine is deployed.

You can use either of the following methods:
- Upload the file to HDFS on FusionInsight Manager.
  1. Log in to FusionInsight Manager as a HetuEngine administrator and choose Cluster > Services > HDFS.
  2. In the Basic Information area on the Dashboard page, click the link next to NameNode Web UI.
  3. Choose Utilities > Browse the file system and create the /user/hetuserver/fiber/extra_file/driver/gbase directory.
  4. Go to the /user/hetuserver/fiber/extra_file/driver/gbase directory and upload the GBase driver file that you obtained.
  5. Click the value in the Permission column in the row containing the driver file, select Read and Write in the User column, Read in the Group column, and Read in the Other column, and click Set.
- Run HDFS commands to upload the file.
  1. Upload the obtained GBase driver file to any directory on the node where the HDFS service client is deployed.
  2. Log in to the node where the HDFS service client is deployed and switch to the client installation directory, for example, /opt/hadoopclient.
    cd /opt/hadoopclient
  3. Configure environment variables.
    source bigdata_env
  4. If the cluster is in security mode, run the following command to authenticate the user. In normal mode, user authentication is not required.
    kinit <HetuEngine administrator username>
    
    Enter the password as prompted.
  5. Create the /user/hetuserver/fiber/extra_file/driver/gbase directory, upload the GBase driver file that you obtained, and modify the permission.
    hdfs dfs -mkdir -p /user/hetuserver/fiber/extra_file/driver/gbase
    
    hdfs dfs -put <GBase driver file path> /user/hetuserver/fiber/extra_file/driver/gbase
    
    hdfs dfs -chmod -R 644 /user/hetuserver/fiber/extra_file/driver/gbase
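
Before uploading, you may want to confirm that the driver file meets the minimum version stated above. The following is a minimal sketch under stated assumptions: driver_version_ok is a hypothetical helper, the file is assumed to follow the naming pattern gbase-connector-java-<version>-build<N>-bin.jar (inferred from the single file name given in this topic), only the dotted version is compared, and sort -V (GNU coreutils) is assumed to be available.

```shell
# Minimum driver version named in this topic.
MIN_VERSION="9.5.0.1"

# driver_version_ok DRIVER_FILE
# Succeeds when the version embedded in the file name is at least MIN_VERSION.
# Hypothetical helper; the "-buildN" suffix is ignored, which is a simplification.
driver_version_ok() {
    name=${1##*/}    # strip any leading directory
    ver=$(printf '%s\n' "$name" |
        sed -n 's/^gbase-connector-java-\([0-9][0-9.]*\)-build[0-9]*-bin\.jar$/\1/p')
    [ -n "$ver" ] || return 1    # file name does not match the assumed pattern
    lowest=$(printf '%s\n%s\n' "$MIN_VERSION" "$ver" | sort -V | head -n 1)
    [ "$lowest" = "$MIN_VERSION" ]
}

# Example usage before running hdfs dfs -put:
#   driver_version_ok /tmp/gbase-connector-java-9.5.0.1-build1-bin.jar \
#       || echo "driver is older than $MIN_VERSION or has an unexpected name" >&2
```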

Configuring the GBase data source

Log in to FusionInsight Manager as a HetuEngine administrator and choose Cluster > Services > HetuEngine.
In the Basic Information area on the Dashboard page, click the link next to HSConsole WebUI.

Choose Data Source and click Add Data Source. Configure parameters on the Add Data Source page.

Configure the basic information, enter the data source name, and select JDBC > GBase as the data source type.

In the GBase Configuration area, configure the parameters according to Table 1.

**Table 1** GBase configurations
Parameter	Description	Example Value
Driver Name	Select the GBase driver that you uploaded when preparing the GBase driver. The driver file name has the format xxx.jar.	gbase-connector-java-9.5.0.1-build1-bin.jar
JDBC URL	JDBC URL for connecting to the GBase database. Format: jdbc:gbase://IP address of the GBase database:Port number. The default port is 5258.	jdbc:gbase://192.168.1.1:5258
Username	GBase username for connecting to the GBase data source.	-
Password	GBase password for connecting to the GBase data source.	-
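
The JDBC URL in Table 1 can be assembled from the database IP address and port. The following is a minimal sketch: gbase_jdbc_url is a hypothetical helper, it follows the jdbc:gbase:// form of the example value in Table 1, and it falls back to the default port 5258 when no port is given.

```shell
# gbase_jdbc_url HOST [PORT]
# Prints a JDBC URL in the form of the Table 1 example value.
# Hypothetical helper; PORT defaults to 5258, the default port named in Table 1.
gbase_jdbc_url() {
    host=$1
    port=${2:-5258}
    [ -n "$host" ] || { echo "host is required" >&2; return 1; }
    case $port in
        *[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;   # digits only
    esac
    printf 'jdbc:gbase://%s:%s\n' "$host" "$port"
}

# Example usage:
#   gbase_jdbc_url 192.168.1.1
```

Paste the printed value into the JDBC URL field on the Add Data Source page.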

(Optional) Customize the configuration.

Click Add to add custom configuration parameters for the GBase data source. For details, see Table 2.

**Table 2** Custom parameters for the GBase data source
Parameter	Description	Example Value
GBase.auto-reconnect	Whether to reconnect automatically. true (default value): Enable automatic reconnection. false: Disable automatic reconnection.	true
GBase.max-reconnects	Maximum number of reconnection attempts. The default value is 3.	3
GBase.jdbc.use-information-schema	Whether the driver should use INFORMATION_SCHEMA to derive the information used by DatabaseMetaData.	true
use-connection-pool	Whether to use the JDBC connection pool. The default value is true.	true
jdbc.connection.pool.maxTotal	Maximum number of connections in the JDBC connection pool. The default value is 8.	8
jdbc.connection.pool.maxIdle	Maximum number of idle connections in the JDBC connection pool. The default value is 8.	8
jdbc.connection.pool.minIdle	Minimum number of idle connections in the JDBC connection pool. The default value is 0.	0
unsupported-type-handling	How data types that are not supported by the connector will be processed. CONVERT_TO_VARCHAR: Convert unsupported types to VARCHAR and allow only read operations on them. IGNORE (default value): Do not display the unsupported types.	IGNORE
join-pushdown.enabled	Whether join pushdown is enabled. true (default value): Enable join pushdown. false: Disable join pushdown.	true

You can click Delete to delete custom configuration parameters.

Click OK.

Log in to the node where the cluster client is deployed and run the following commands to switch to the client installation directory and authenticate the user:

cd /opt/hadoopclient

source bigdata_env

kinit <user performing HetuEngine operations>

If the cluster is in normal mode, skip the kinit command.

Log in to the catalog of the data source.

hetu-cli --catalog <data source name> --schema <database name>

For example, run the following command:

hetu-cli --catalog gbase_1 --schema gbasedb
Run the following command. If the database table information can be viewed or no error is reported, the connection is successful.

show tables;

Mapping Between GBase and HetuEngine Data Types

**Table 3** Mapping Between GBase and HetuEngine Data Types
GBase Type	HetuEngine Type
TINYINT	TINYINT
SMALLINT	SMALLINT
INTEGER	INTEGER
BIGINT	BIGINT
DOUBLE	DOUBLE
FLOAT	REAL
DECIMAL(p, s)	DECIMAL(p, s)
CHAR(n)	CHAR(n)
VARCHAR(n)	VARCHAR(n)
TEXT	VARCHAR(65535)
BLOB, LONGBLOB	VARBINARY
DATE	DATE
TIME	TIME
DATETIME	TIMESTAMP(6)
TIMESTAMP(n)	TIMESTAMP(n)
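
When planning queries, it can help to look up how a GBase column type appears in HetuEngine. The following is a minimal sketch that encodes only the rows of Table 3: gbase_to_hetu_type is a hypothetical helper, and any type not listed in Table 3 is reported as unmapped rather than guessed.

```shell
# gbase_to_hetu_type GBASE_TYPE
# Prints the HetuEngine type that Table 3 lists for GBASE_TYPE.
# Hypothetical helper; input is upper-cased and spaces are removed before matching.
gbase_to_hetu_type() {
    t=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr -d ' ')
    case $t in
        TINYINT|SMALLINT|INTEGER|BIGINT|DOUBLE|DATE|TIME) echo "$t" ;;
        FLOAT) echo "REAL" ;;
        "DECIMAL("*")"|"CHAR("*")"|"VARCHAR("*")"|"TIMESTAMP("*")") echo "$t" ;;
        TEXT) echo "VARCHAR(65535)" ;;
        BLOB|LONGBLOB) echo "VARBINARY" ;;
        DATETIME) echo "TIMESTAMP(6)" ;;
        *) echo "type not listed in Table 3: $1" >&2; return 1 ;;
    esac
}

# Example usage:
#   gbase_to_hetu_type DATETIME
```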