Updated on 2024-05-29 GMT+08:00

Configuring a HetuEngine Data Source

Scenario

This section describes how to add another HetuEngine data source on the HSConsole page for a cluster in security mode.

Procedure

  1. Obtain the user.keytab file of the proxy user of the HetuEngine cluster in a remote domain.

    1. Log in to FusionInsight Manager of the HetuEngine cluster in the remote domain.
    2. Choose System > Permission > User.
    3. Locate the row that contains the target data source user, click More in the Operation column, and select Download Authentication Credential.
    4. Decompress the downloaded file. The extracted user.keytab file is the user credential file.

  2. Log in to FusionInsight Manager as a HetuEngine administrator and choose Cluster > Services > HetuEngine. The HetuEngine service page is displayed.
  3. In the Basic Information area on the Dashboard page, click the link next to HSConsole WebUI. The HSConsole page is displayed.
  4. Choose Data Source and click Add Data Source. Configure parameters on the Add Data Source page.

    1. In the Basic Configuration area, configure Name and choose HetuEngine for Data Source Type.
    2. Configure parameters in the HetuEngine Configuration area. For details, see Table 1.
      Table 1 HetuEngine Configuration

      | Parameter | Description | Example Value |
      | --- | --- | --- |
      | Driver | The default value is hsfabric-initial. | hsfabric-initial |
      | Username | Configure this parameter when security mode is enabled. It specifies the user who accesses the remote HetuEngine. Set it to the user to whom the user.keytab file obtained in step 1 belongs. | hetu_test |
      | Keytab File | Configure this parameter when security mode is enabled. It is the keytab file of the user who accesses the remote HetuEngine. Select the user.keytab file obtained in step 1. | user.keytab |
      | Two Way Transmission | Whether to enable two-way transmission for cross-domain data transmission. The default value is Yes. Yes: two-way transmission; requests are forwarded to the remote HSFabric through the local HSFabric, and the local HSFabric address (Local Configuration) must be configured. No: one-way transmission; requests are sent directly to the remote HSFabric. | Yes |
      | Local Configuration | Host IP address and port number of the HSFabric instance that handles external communication for the HetuEngine service in the local MRS cluster. To obtain them: (1) Log in to FusionInsight Manager of the local cluster, choose Cluster > Services > HetuEngine > Instance, and check the service IP address of HSFabric. (2) Click HSFabric, choose Instance Configuration, and check the value of server.port. The default value is 29900. | 192.162.157.32:29900 |
      | Remote Address | Host IP address and port number of the HSFabric instance that handles external communication for the HetuEngine service in the remote MRS cluster. To obtain them: (1) Log in to FusionInsight Manager of the remote cluster, choose Cluster > Services > HetuEngine > Instance, and check the service IP address of HSFabric. (2) Click HSFabric, choose Instance Configuration, and check the value of server.port. The default value is 29900. | 192.168.1.1:29900 |
      | Region | Region to which the current request initiator belongs. The value can contain only digits and underscores (_). | 0755_01 |
      | Receiving Data Timeout (s) | Timeout interval for receiving data, in seconds. | 60 |
      | Total Task Timeout (s) | Total timeout duration for executing a cross-domain task, in seconds. | 300 |
      | Tasks Used by Worker Nodes | Number of tasks used by each worker node to receive data. | 5 |
      | Data Compression | Whether to enable data compression. Yes: data compression is enabled. No: data compression is disabled. | Yes |

    3. (Optional) Customize the configuration.
      • Click Add to add custom configuration parameters for the HetuEngine data source. For details, see Table 2.
        Table 2 Custom parameters of the HetuEngine data source

        | Parameter | Description | Example Value |
        | --- | --- | --- |
        | hsfabric.health.check.time | Interval for checking the HSFabric instance status, in seconds. | 60 |
        | hsfabric.subquery.pushdown | Whether to enable cross-domain query pushdown. The function is enabled by default. true: enables cross-domain query pushdown. false: disables cross-domain query pushdown. | true |
        | hsfabric.local.tenant (available for MRS 3.3.0 or later) | Tenant queue used by the remote HetuEngine for computing. If this parameter is not set, the system randomly selects one of the tenants to which the configured user belongs. If this parameter is set, the specified tenant is used; this applies to scenarios where strict tenant verification is enabled. | - |


      • Click Delete to delete a custom configuration parameter.
    4. Click OK.

  5. Log in to the node where the cluster client is located and run the following commands to switch to the client installation directory and authenticate the user:

    cd /opt/client

    source bigdata_env

    kinit <user performing HetuEngine operations>

    If the cluster is in normal mode, skip the kinit command.

  6. Run the following command to log in to the catalog of the data source:

    hetu-cli --catalog <data source name> --schema <database name>

    For example, run the following command:

    hetu-cli --catalog hetuengine_1 --schema default

  7. Run the following command. If the database table information can be viewed or no error is reported, the connection is successful.

    show tables;
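Steps 5 to 7 can also be run as a single non-interactive check, for example from a scheduled job. The following is a minimal sketch, not an official tool. The helper names run and check_data_source and the DRY_RUN variable are hypothetical. The --execute option is an assumption about hetu-cli (confirm it with hetu-cli --help on your client), while kinit -kt is standard MIT Kerberos usage for logging in with a keytab.

```shell
#!/usr/bin/env bash
# Sketch of steps 5 to 7 as one non-interactive check.
CLIENT_DIR="${CLIENT_DIR:-/opt/client}"   # client installation directory (step 5)

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage: check_data_source <catalog> <schema> [principal] [keytab]
# Omit principal and keytab for a cluster in normal mode.
# The body runs in a subshell so cd and bigdata_env do not affect the caller.
check_data_source() (
  catalog="$1"; schema="$2"; principal="${3:-}"; keytab="${4:-}"
  run cd "$CLIENT_DIR" || exit 1
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "source bigdata_env"
  else
    set --                      # source with no positional parameters, as at a prompt
    source bigdata_env || exit 1
  fi
  if [ -n "$principal" ]; then
    if [ -n "$keytab" ]; then
      run kinit -kt "$keytab" "$principal" || exit 1   # keytab login, no password prompt
    else
      run kinit "$principal" || exit 1                 # prompts for the password
    fi
  fi
  # Assumption: hetu-cli accepts --execute to run one statement and exit.
  run hetu-cli --catalog "$catalog" --schema "$schema" --execute 'show tables;'
)
```

For example, DRY_RUN=1 followed by check_data_source hetuengine_1 default hetu_test user.keytab prints the commands without running them. Without DRY_RUN, the exit status of the function is that of the last command, so a non-zero status indicates that authentication or the query failed.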

Data Type Mapping

Currently, HetuEngine data sources support the following data types: BOOLEAN, TINYINT, SMALLINT, INT, BIGINT, REAL, DOUBLE, DECIMAL, VARCHAR, CHAR, DATE, TIMESTAMP, ARRAY, MAP, TIME WITH TIME ZONE, TIMESTAMP WITH TIME ZONE, and TIME.
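
Before querying a remote table, it can help to check its column types against this list. The sketch below is illustrative only: the function names supported_types and is_supported_type are hypothetical, and the check looks only at the outermost type name, so it does not inspect the element types inside ARRAY or MAP.

```shell
#!/usr/bin/env bash
# Data types supported by HetuEngine data sources, one per line.
supported_types() {
  cat <<'EOF'
BOOLEAN
TINYINT
SMALLINT
INT
BIGINT
REAL
DOUBLE
DECIMAL
VARCHAR
CHAR
DATE
TIMESTAMP
ARRAY
MAP
TIME WITH TIME ZONE
TIMESTAMP WITH TIME ZONE
TIME
EOF
}

# Return 0 if the outermost type name is in the list, 1 otherwise.
is_supported_type() {
  local t
  t=$(printf '%s' "$1" \
    | tr '[:lower:]' '[:upper:]' \
    | sed -e 's/([0-9, ]*)//g' \
          -e 's/[(<].*$//' \
          -e 's/[[:space:]]\{1,\}/ /g' \
          -e 's/^ //' -e 's/ $//')
  # The sed steps drop numeric parameters such as (10,2), drop element
  # types such as (VARCHAR), and normalise whitespace.
  supported_types | grep -Fxq -- "$t"
}
```

For example, is_supported_type "decimal(10,2)" succeeds, whereas is_supported_type "json" fails because JSON is not in the list.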

Performance Optimization

The query pushdown function is supported to improve query speed.

This function is enabled by default. To set it explicitly, add the custom parameter hsfabric.subquery.pushdown as described in step 4.3 (see Table 2).
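
Custom parameters are typed by hand on the Add Data Source page, so a quick sanity check before entering them can catch typos. The following sketch covers only the three parameters listed in Table 2. The function name validate_custom_param is hypothetical, and two rules are assumptions rather than documented behavior: that the health check interval must be a positive integer, and that any name outside Table 2 is rejected (the page may accept other parameters).

```shell
#!/usr/bin/env bash
# Usage: validate_custom_param <name> <value>
# Returns 0 if the value looks valid for the named parameter, 1 otherwise.
validate_custom_param() {
  local name="$1" value="$2"
  case "$name" in
    hsfabric.health.check.time)
      # Interval in seconds. Assumption: must be a positive integer.
      case "$value" in
        ''|*[!0-9]*) return 1 ;;
        *) [ "$value" -gt 0 ] ;;
      esac
      ;;
    hsfabric.subquery.pushdown)
      # Cross-domain query pushdown switch: true or false.
      [ "$value" = "true" ] || [ "$value" = "false" ]
      ;;
    hsfabric.local.tenant)
      # Tenant queue name (MRS 3.3.0 or later): any non-empty string.
      [ -n "$value" ]
      ;;
    *)
      return 1   # not one of the parameters listed in Table 2
      ;;
  esac
}
```

For example, validate_custom_param hsfabric.subquery.pushdown true succeeds, while validate_custom_param hsfabric.subquery.pushdown yes fails.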

Constraints

  • The following statements are not supported: CREATE, ALTER, DROP VIEW, INSERT OVERWRITE, UPDATE, and DELETE.
  • INSERT is not supported for cross-domain data sources.
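
Scripts that send SQL to a HetuEngine data source can reject unsupported statements up front instead of waiting for the engine to report an error. The sketch below is a simple prefix check under stated assumptions: the function name is_unsupported_statement is hypothetical, CREATE and ALTER are read as covering every statement that starts with those keywords, and every INSERT is blocked because the data source is treated as cross-domain. It does not parse SQL, so comments or leading parentheses before the keyword are not handled.

```shell
#!/usr/bin/env bash
# Return 0 if the statement starts with a keyword listed under Constraints,
# 1 otherwise.
is_unsupported_statement() {
  local sql
  # Upper-case the text, collapse whitespace runs, and trim the leading space.
  sql=$(printf '%s' "$1" \
    | tr '[:lower:]' '[:upper:]' \
    | tr -s '[:space:]' ' ' \
    | sed 's/^ //')
  case "$sql" in
    CREATE\ *|ALTER\ *|DROP\ VIEW\ *|INSERT\ *|UPDATE\ *|DELETE\ *)
      return 0 ;;   # INSERT also covers INSERT OVERWRITE
    *)
      return 1 ;;
  esac
}
```

For example, is_unsupported_statement "insert overwrite t select 1" succeeds (the statement is blocked), while is_unsupported_statement "select * from t" fails (the statement is allowed through).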