Updated on 2024-04-29 GMT+08:00

Configuring an MRS Hetu Connection

Table 1 MRS Hetu connection

Parameter

Mandatory

Description

Data Connection Type

Yes

MRS Hetu is selected by default and cannot be changed.

Name

Yes

Name of the data connection to create. Data connection names can contain a maximum of 100 characters. They can contain only letters, digits, underscores (_), and hyphens (-).

Tag

No

Attribute of the data connection to create. Tags make management easier.
NOTE:

The tag name can contain only letters, digits, and underscores (_) and cannot start with an underscore (_) or contain more than 100 characters.
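
The naming rules for Name and Tag can be checked before the form is submitted. The following is a minimal sketch, not the console's actual validation: the function names are hypothetical, "letters" is assumed to mean ASCII letters, and a minimum length of one character is assumed.

```python
import re

# Name: letters, digits, underscores, and hyphens; at most 100 characters.
# Assumption: ASCII letters only, and at least one character.
NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,100}")

# Tag: letters, digits, and underscores; must not start with an underscore;
# at most 100 characters.
TAG_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_]{0,99}")


def is_valid_connection_name(name: str) -> bool:
    """Return True if the name satisfies the Name rules described above."""
    return NAME_RE.fullmatch(name) is not None


def is_valid_tag(tag: str) -> bool:
    """Return True if the tag satisfies the Tag rules described above."""
    return TAG_RE.fullmatch(tag) is not None
```

For example, `is_valid_connection_name("mrs-hetu_01")` passes, while `is_valid_tag("_prod")` fails because the tag starts with an underscore.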

Applicable Modules

Yes

Select the modules for which this connection is available.

All modules are selected by default, which means this connection is available for all the modules that support the data source connected by this connection. For details about the data sources supported by each module, see Data Sources.

Basic and Network Connectivity Configuration

Manual

Yes

Select the connection mode. If you do not need to access MRS clusters in other projects or enterprise projects, select Cluster Name Mode.
  • Cluster Name Mode: Select an existing cluster. You can connect only to an MRS cluster in the same project and enterprise project.
  • Connection String Mode: Set Manager IP and enable communication between this connection's agent (CDM cluster) and the MRS cluster. In this mode, you can access an MRS cluster in another project or enterprise project.

Manager IP

Yes

This parameter is mandatory when Connection String Mode is selected for Manual.

Set this parameter to the floating IP address of MRS Manager. Only MRS clusters are supported. A Hadoop cluster can be connected only after it is managed by MRS.

NOTE:
  • MRS clusters of version 3.1.1 and later can be connected.
  • To connect to MRS clusters of version 3.2.1, add parameter protocol.v1.alternate-header-name with value Presto in the coordinator.config.properties and worker.config.properties files for the compute instance on the HetuEngine WebUI.
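
The version rules in this note can be expressed as a small check. This is only an illustrative sketch: the function names and the returned dictionary are hypothetical, the extra property is assumed to apply to version 3.2.1 exactly as the note reads, and the `key=value` rendering is for illustration (on the HetuEngine WebUI the parameter is added to the compute instance configuration).

```python
import re

MIN_SUPPORTED = (3, 1, 1)          # "version 3.1.1 and later can be connected"
NEEDS_HEADER_PROPERTY = (3, 2, 1)  # version that needs the extra parameter
HEADER_PROPERTY = ("protocol.v1.alternate-header-name", "Presto")
CONFIG_FILES = ("coordinator.config.properties", "worker.config.properties")


def parse_version(version: str) -> tuple:
    """Extract the leading major.minor.patch numbers, ignoring any suffix."""
    match = re.match(r"\s*(?:MRS\s+)?(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized MRS version: {version!r}")
    return tuple(int(part) for part in match.groups())


def hetu_connection_requirements(mrs_version: str) -> dict:
    """Report whether the version is supported and which properties to add."""
    version = parse_version(mrs_version)
    extra = {}
    if version == NEEDS_HEADER_PROPERTY:
        key, value = HEADER_PROPERTY
        # The same parameter goes into both configuration files.
        extra = {name: f"{key}={value}" for name in CONFIG_FILES}
    return {"supported": version >= MIN_SUPPORTED, "extra_properties": extra}
```

Calling `hetu_connection_requirements("3.2.1")` lists the parameter for both configuration files, while a 3.1.0 cluster is reported as unsupported.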

You can click Select next to the text box and select an MRS cluster in the same project and enterprise project. If you want to access an MRS cluster in another project or enterprise project, obtain and enter the floating IP address of MRS Manager and ensure that the connection's agent (CDM cluster) can communicate with the tenant-plane MRS cluster. To obtain the floating IP address of MRS Manager, log in to the active master node of the MRS cluster and run the ifconfig command. In the command output, the IP address of eth0:wsom is the floating IP address of MRS Manager. For details about how to log in to the master node of the MRS cluster, see Logging In to an ECS.

Enter one or three IP addresses, depending on the scenario. If you enter three, enter them in the order described below and separate them with commas (,). Examples: 127.0.0.1 or 127.0.0.1,127.0.0.2,127.0.0.3.
  • If you enter one IP address, enter the management-plane floating IP address of the MRS cluster.
  • If you enter three IP addresses, enter, in this order, the IP address of the active node on the MRS cluster service plane, the IP address of the standby node on the MRS cluster service plane, and the floating IP address of the MRS cluster management plane.
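
The Manager IP format can be validated with Python's standard `ipaddress` module. This is a sketch only: `parse_manager_ip` and the dictionary keys are hypothetical names chosen for illustration, and any count other than one or three addresses is rejected based on the two scenarios listed above.

```python
import ipaddress


def parse_manager_ip(value: str) -> dict:
    """Split a Manager IP value and label each address by its role."""
    parts = [part.strip() for part in value.split(",")]
    for part in parts:
        # Raises ValueError if the text is not a valid IP address.
        ipaddress.ip_address(part)

    if len(parts) == 1:
        return {"manager_floating_ip": parts[0]}
    if len(parts) == 3:
        # Order matters: active node, standby node, then the floating IP.
        return {
            "active_service_ip": parts[0],
            "standby_service_ip": parts[1],
            "manager_floating_ip": parts[2],
        }
    raise ValueError("enter either one or three IP addresses")
```

For example, `parse_manager_ip("127.0.0.1,127.0.0.2,127.0.0.3")` labels the third address as the Manager floating IP, and a value with two addresses raises `ValueError`.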

MRS Cluster Name

Yes

This parameter is mandatory when Cluster Name Mode is selected for Manual.

The name of the MRS cluster. Select the MRS cluster to which HetuEngine belongs. Only MRS clusters are supported. A Hadoop cluster can be selected only after it is managed by MRS. All MRS clusters in the same project and enterprise project are displayed.

NOTE:
  • MRS clusters of version 3.1.1 and later can be connected.
  • To connect to MRS clusters of version 3.2.1, add parameter protocol.v1.alternate-header-name with value Presto in the coordinator.config.properties and worker.config.properties files for the compute instance on the HetuEngine WebUI.
If the connection fails after you select a cluster, check whether the MRS cluster can communicate with the CDM cluster that functions as the agent. They can communicate with each other when the following conditions are met:
  • If the CDM cluster in the DataArts Studio instance and the MRS cluster are in different regions, a public network or a dedicated connection is required. If the Internet is used for communication, ensure that an EIP has been bound to the CDM cluster, the MRS cluster can access the Internet, and the port has been enabled in the firewall rules.
  • If the CDM cluster in the DataArts Studio instance and the MRS cluster are in the same region, VPC, subnet, and security group, they can communicate with each other by default. If they are in the same VPC but in different subnets or security groups, you must configure routing rules and security group rules. For details about how to configure routing rules, see configuring routes. For details about how to configure security group rules, see configuring security group rules.
  • The MRS cluster and the DataArts Studio workspace must belong to the same enterprise project. If they do not, modify the enterprise project of the workspace.

KMS Key

Yes

KMS key used to encrypt and decrypt the authentication information for the data source.

Agent

Yes

MRS is not a fully managed service and cannot be directly connected to DataArts Studio. A CDM cluster can provide an agent for DataArts Studio to communicate with non-fully-managed services. Therefore, you need to select a CDM cluster when creating an MRS data connection. If no CDM cluster is available, create one first.

As a network proxy, the CDM cluster must be able to communicate with the MRS cluster. To ensure network connectivity, the CDM cluster must be in the same region, AZ, VPC, and subnet as the MRS cluster. The security group rules must also allow the CDM cluster to communicate with the MRS cluster.

NOTE:
  • MRS Hetu connections are supported only in CDM 2.9.2 and later versions.
  • If a CDM cluster functions as the agent for a data connection in Management Center, the cluster cannot connect to multiple MRS security clusters. You are advised to plan multiple agents, one for each MRS security cluster.
  • If a CDM cluster functions as the agent for a data connection in Management Center, the cluster supports a maximum of 200 concurrent active threads. If multiple data connections share an agent, a maximum of 200 SQL, Shell, and Python scripts submitted through the connections can run concurrently. Excess tasks will be queued. You are advised to plan multiple agents based on the workload.
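
The 200-thread limit lends itself to a quick capacity estimate. The arithmetic below is a rough planning sketch, not an official sizing formula: the function names are hypothetical, and `agents_needed` assumes the workload can be split evenly across agents.

```python
import math

# Maximum concurrent active threads per agent, as stated in the note above.
MAX_CONCURRENT_PER_AGENT = 200


def queued_tasks(submitted: int) -> int:
    """Scripts that wait in the queue when all of them share one agent."""
    return max(0, submitted - MAX_CONCURRENT_PER_AGENT)


def agents_needed(peak_concurrent: int) -> int:
    """Agents required so that no script is queued at peak load.

    Assumption: the load is spread evenly across the agents.
    """
    return max(1, math.ceil(peak_concurrent / MAX_CONCURRENT_PER_AGENT))
```

With 250 scripts submitted through connections that share one agent, 50 are queued. A peak of 450 concurrent scripts calls for three agents under the even-split assumption.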

hsbroker IP Address List

Yes

IP addresses of the hsbroker nodes of the MRS Hetu component. Use commas (,) to separate multiple IP addresses.

To obtain the IP addresses, perform the following operations:

  1. Log in to MRS FusionInsight Manager.
  2. Choose Cluster > Services > HetuEngine > Role > HSBroker to obtain the service IP addresses of all HSBroker instances.

hsbroker Port

Yes

Port number of the hsbroker node of the MRS Hetu component.

To obtain the port number, perform the following operations:

  1. Log in to MRS FusionInsight Manager.
  2. Choose Cluster > Services > HetuEngine > Configurations > All Configurations and search for server.port on the right to obtain the port number of HSBroker.
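
Once the hsbroker IP address list and port are known, a basic TCP check can confirm that the endpoints are reachable before the connection is created. This is a minimal sketch using only the Python standard library. The function names are hypothetical, and the check is meaningful only when run from a host with the same network access as the CDM agent.

```python
import socket


def hsbroker_endpoints(ip_list: str, port: int) -> list:
    """Turn a comma-separated IP list and one port into (host, port) pairs."""
    hosts = [host.strip() for host in ip_list.split(",") if host.strip()]
    return [(host, port) for host in hosts]


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A typical use is to loop over `hsbroker_endpoints(ip_list, port)` with the values obtained from FusionInsight Manager and report any endpoint for which `is_reachable` returns `False`.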

Data Source Authentication and Other Function Configuration

Authentication Method

Yes

This parameter is mandatory when Connection String Mode is selected for Manual.

It specifies the authentication method used for accessing the MRS cluster. The following options are available:
  • SIMPLE: for non-security mode
  • KERBEROS: for security mode

Username

Yes

Username for accessing the MRS cluster. The user must have HetuEngine permissions.

To create a data connection for an MRS security cluster, do not use user admin. The admin user is the default management page user and cannot be used as the authentication user of the security cluster. You can create an MRS user whose password never expires by referring to Creating a Kerberos Authentication User for an MRS Security Cluster. When creating an MRS data connection, set Username and Password to the new MRS username and password.
NOTE:
  • For clusters of MRS 3.1.0 or later, the user must at least have permissions of the Manager_viewer role to create data connections in Management Center. To perform database, table, and data operations on components, the user must also have user group permissions of the components.
  • For clusters earlier than MRS 3.1.0, the user must have permissions of the Manager_administrator or System_administrator role to create data connections in Management Center.
  • A user with only the Manager_tenant or Manager_auditor permission cannot create connections.
  • You are advised to set a user password that never expires to prevent connection failures and service loss caused by password expiration.
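
The role requirements in this note can be summarized as a small decision function. This is an illustrative sketch, not how Management Center evaluates permissions: the function names are hypothetical, and it assumes that the administrator roles also satisfy the "at least Manager_viewer" requirement on MRS 3.1.0 or later.

```python
import re

ADMIN_ROLES = {"Manager_administrator", "System_administrator"}
# Assumption: administrator roles also count as "at least Manager_viewer".
VIEWER_OR_ABOVE = ADMIN_ROLES | {"Manager_viewer"}


def _version_tuple(version: str) -> tuple:
    """Extract the leading major.minor.patch numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized MRS version: {version!r}")
    return tuple(int(part) for part in match.groups())


def can_create_connection(mrs_version: str, roles) -> bool:
    """Check whether a user's roles allow creating a data connection."""
    if _version_tuple(mrs_version) >= (3, 1, 0):
        required = VIEWER_OR_ABOVE
    else:
        required = ADMIN_ROLES
    # Manager_tenant and Manager_auditor never appear in the required sets,
    # so a user holding only those roles is rejected.
    return bool(required & set(roles))
```

For example, a user with only `Manager_viewer` qualifies on MRS 3.1.0 but not on an earlier version, and a user with only `Manager_tenant` or `Manager_auditor` never qualifies.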
NOTICE:

After creating the HetuEngine user, you need to complete the configurations in Using HetuEngine from Scratch.

Password

Yes

Password for accessing the MRS cluster.

Creating a Kerberos Authentication User for an MRS Security Cluster

To create a data connection for an MRS security cluster, do not use user admin. The admin user is the default management page user and cannot be used as the authentication user of the security cluster. To create an MRS user, perform the following steps:

For clusters of MRS 3.x:

  1. Log in to MRS Manager as user admin.
  2. Choose System > Permission > Security Policy > Password Policy. Click Add Password Policy and add a policy under which the password never expires.
    • Set Password Policy Name to neverexp.
    • Set Password Validity Period (Days) to 0, indicating that the password never expires.
    • Set Password Expiration Notification (Days) to 0.
    • Retain the default values for other parameters.
  3. Choose System > Permission > User. On the page displayed, click Create to add a dedicated user as the Kerberos authentication user and set the password policy to neverexp. Select the user group superGroup for the user, and assign all roles to the user.
    • For clusters of MRS 3.1.0 or later, the user must at least have permissions of the Manager_viewer role to create data connections in Management Center. To perform database, table, and data operations on components, the user must also have user group permissions of the components.
    • For clusters earlier than MRS 3.1.0, the user must have permissions of the Manager_administrator or System_administrator role to create data connections in Management Center.
    • A user with only the Manager_tenant or Manager_auditor permission cannot create connections.
  4. Log in to MRS Manager as the new user and change the initial password. Otherwise, the connection fails to be created.
  5. Synchronize IAM users.
    1. Log in to the MRS console.
    2. Choose Clusters > Active Clusters, select a running cluster, and click its name to go to its details page.
    3. In the Basic Information area of the Dashboard page, click Synchronize on the right side of IAM User Sync to synchronize IAM users.
      • When the policy of the user group to which the IAM user belongs changes from MRS ReadOnlyAccess to MRS CommonOperations, MRS FullAccess, or MRS Administrator, wait 5 minutes after the synchronization is complete before you submit a job. The new policy takes effect only after the SSSD (System Security Services Daemon) cache of the cluster nodes is updated. If you submit a job earlier, it may fail to be submitted.
      • When the policy of the user group to which the IAM user belongs changes from MRS CommonOperations, MRS FullAccess, or MRS Administrator to MRS ReadOnlyAccess, wait 5 minutes after the synchronization is complete. The new policy takes effect only after the SSSD cache of the cluster nodes is updated.

For clusters of MRS 2.x or earlier:

  1. Log in to MRS Manager as user admin.
  2. On FusionInsight Manager, choose System Settings and click Configure Password Policy to modify the password policy.
    • Set Password Validity Period (Days) to 0, indicating that the password never expires.
    • Set Password Expiration Notification (Days) to 0.
    • Retain the default values for other parameters.
  3. Choose System > Manage User. On the page displayed, add a dedicated user as the Kerberos authentication user. Select the user group superGroup for the user, and assign all roles to the user.
    • For clusters of MRS 2.x or earlier, the user must have permissions of the Manager_administrator or System_administrator role to create data connections in Management Center.
    • A user with only the Manager_tenant or Manager_auditor permission cannot create connections.
  4. Log in to MRS Manager as the new user and change the initial password. Otherwise, the connection fails to be created.
  5. Synchronize IAM users.
    1. Log in to the MRS console.
    2. Choose Clusters > Active Clusters, select a running cluster, and click its name to go to its details page.
    3. In the Basic Information area of the Dashboard page, click Synchronize on the right side of IAM User Sync to synchronize IAM users.
      • When the policy of the user group to which the IAM user belongs changes from MRS ReadOnlyAccess to MRS CommonOperations, MRS FullAccess, or MRS Administrator, wait 5 minutes after the synchronization is complete before you submit a job. The new policy takes effect only after the SSSD (System Security Services Daemon) cache of the cluster nodes is updated. If you submit a job earlier, it may fail to be submitted.
      • When the policy of the user group to which the IAM user belongs changes from MRS CommonOperations, MRS FullAccess, or MRS Administrator to MRS ReadOnlyAccess, wait 5 minutes after the synchronization is complete. The new policy takes effect only after the SSSD cache of the cluster nodes is updated.