Updated on 2026-03-20 GMT+08:00

MRS Hive Connection Parameters

Table 1 MRS Hive connection

Parameter

Mandatory

Description

Data Connection Type

Yes

MRS Hive is selected by default and cannot be changed.

Name

Yes

Name of the data connection to create. Data connection names can contain a maximum of 100 characters. They can contain only letters, digits, underscores (_), and hyphens (-).

Description

No

A description that helps identify the data connection. It can contain a maximum of 100 characters.

Tag

No

Attribute of the data connection to create. Tags make management easier.
NOTE:

The tag name can contain only letters, digits, and underscores (_) and cannot start with an underscore (_) or contain more than 100 characters.
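
The naming rules for Name and Tag can be checked before the values are entered on the console. The following is a minimal sketch, assuming that "letters" means ASCII letters and that both values need at least one character. The function names are hypothetical and are not part of any DataArts Studio API.

```python
import re

# Assumption: "letters" means ASCII letters; the console may accept more.
# Name: 1 to 100 letters, digits, underscores (_), or hyphens (-).
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,100}")
# Tag: 1 to 100 letters, digits, or underscores (_), not starting with an underscore.
TAG_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9_]{0,99}")


def is_valid_connection_name(name: str) -> bool:
    """Return True if name satisfies the Name rule described above."""
    return NAME_PATTERN.fullmatch(name) is not None


def is_valid_tag(tag: str) -> bool:
    """Return True if tag satisfies the Tag rule described above."""
    return TAG_PATTERN.fullmatch(tag) is not None
```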

Applicable Modules

Yes

Select the modules for which this connection is available.

NOTE:
  • When offline or real-time data migration jobs are enabled, you can select the DataArts Migration module. Then you can select this data connection when creating a data migration job in DataArts Factory.
  • You can use offline or real-time data migration jobs only after you apply for the whitelist membership. To use this feature, contact customer service or technical support.

Basic and Network Connectivity Configuration

Connection Type

Yes

Connection type. Proxy connection is recommended.
  • Proxy connection: An agent (CDM cluster) is used to access MRS clusters. This method supports all versions of MRS clusters.
  • MRS API connection: MRS APIs are used to access MRS clusters. This method supports only MRS clusters of version 2.x or later.

    When you select MRS API connection, pay attention to the following restrictions:

    1. The MRS API connection is available only for DataArts Factory.
    2. In DataArts Factory, you cannot view or manage the databases, data tables, and fields of the connection in a visualized manner. If the connected MRS cluster is of version 3.2.1 or later, you can view, but not manage, the databases, data tables, and fields of the connection in a visualized manner.
    3. When the SQL editor of DataArts Factory is used to run SQL statements, the execution results can be displayed only in logs.
  • MRS tenant plane connection: MRS clusters are accessed through the MRS tenant plane. This type is not supported for DataArts Migration and DataArts Catalog.
NOTE:

If the connection is used by components other than DataArts Migration and DataArts Factory, the connection mode cannot be set to MRS API connection.

If you select MRS tenant plane connection and the cluster is managed by MRS, you must complete the configuration by referring to "Configuring OBS and IAM for JobGateway" in User Guide in MapReduce Service (MRS) x.x.x Usage Guide so that the data connection can be created.

Manual

Yes

This parameter is mandatory when Connection Type is set to Proxy connection.

Select the connection mode. If you do not need to access MRS clusters in other projects or enterprise projects, select Cluster Name Mode.
  • Cluster Name Mode: Select an existing cluster. You can only connect to an MRS cluster in the same project and enterprise project.
  • Connection String Mode: Set Manager IP and enable communication between this connection's agent (CDM cluster) and an MRS cluster in another project or enterprise project so that you can access that MRS cluster.

Manager IP

Yes

This parameter is mandatory when Connection String Mode is selected for Manual.

Set this parameter to the floating IP address of MRS Manager. Only MRS clusters are supported. A Hadoop cluster can be connected only after it is managed by MRS.
NOTE:
  • When connecting to a component of the MRS cluster, you need to enable the port of the component in the outbound rule of the security group of the MRS cluster. For details about the port of each component, see Common Ports for MRS Cluster Services.
  • DataArts Studio supports only MRS clusters whose Kerberos encryption type is aes256-sha1,aes128-sha1. MRS clusters whose Kerberos encryption type is aes256-sha2,aes128-sha2 are not supported.

You can click Select next to the text box and select an MRS cluster in the same project and enterprise project. If you want to access an MRS cluster in another project or enterprise project, obtain and enter the floating IP address of MRS Manager and ensure that the connection's agent (CDM cluster) can communicate with the tenant-plane MRS cluster. To obtain the floating IP address of MRS Manager, log in to the active master node of the MRS cluster and run the ifconfig command. In the command output, the IP address of eth0:wsom is the floating IP address of MRS Manager. For details about how to log in to the master node of the MRS cluster, see Logging In to an ECS.

Enter one or three IP addresses, depending on the scenario. If you enter three, enter them in the order given below and separate them with commas (,). Examples: 127.0.0.1 or 127.0.0.1,127.0.0.2,127.0.0.3.
  • If you enter one IP address, enter the management-plane floating IP address of the MRS cluster.
  • If you enter three IP addresses, enter the IP address of the active node on the MRS cluster service plane, IP address of the standby node on the MRS cluster service plane, and the floating IP address of the MRS cluster management plane.
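
The expected format of the Manager IP value (one address, or three addresses separated by commas) can be sketched as a small check. This is an illustration only: the function name is hypothetical, and stripping spaces around each address is an assumption, since the console may not accept spaces.

```python
import ipaddress


def parse_manager_ips(value: str) -> list[str]:
    """Split a comma-separated Manager IP value and check that it holds one or three valid addresses."""
    # Assumption: surrounding spaces are tolerated and removed.
    parts = [part.strip() for part in value.split(",")]
    if len(parts) not in (1, 3):
        raise ValueError("expected one or three IP addresses")
    for part in parts:
        ipaddress.ip_address(part)  # raises ValueError for a malformed address
    return parts
```

For three addresses, the order in the returned list follows the input: active service-plane node, standby service-plane node, then the management-plane floating IP address.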

MRS Cluster Name

Yes

This parameter is mandatory when MRS API connection is selected for Connection Type or Cluster Name Mode is selected for Manual.

The name of the MRS cluster. Select an MRS cluster that Hive belongs to. Only MRS clusters are supported. A Hadoop cluster can be selected only after it is managed by MRS. All the MRS clusters with the same project ID and enterprise project are displayed.
NOTE:
  • When connecting to a component of the MRS cluster, you need to enable the port of the component in the outbound rule of the security group of the MRS cluster. For details about the port of each component, see Common Ports for MRS Cluster Services.
  • DataArts Studio supports only MRS clusters whose Kerberos encryption type is aes256-sha1,aes128-sha1. MRS clusters whose Kerberos encryption type is aes256-sha2,aes128-sha2 are not supported.

If the connection fails after you select a cluster, check whether the MRS cluster can communicate with the CDM instance which functions as the agent. They can communicate with each other in the following scenarios:
  • If the CDM cluster in the DataArts Studio instance and the MRS cluster are in different regions, a public network or a dedicated connection is required. If the Internet is used for communication, ensure that an EIP has been bound to the CDM cluster, and the MRS cluster can access the Internet and the port has been enabled in the firewall rule.
  • If the CDM cluster in the DataArts Studio instance and the MRS cluster are in the same region, VPC, subnet, and security group, they can communicate with each other by default. If they are in the same VPC but in different subnets or security groups, you must configure routing rules and security group rules. For details about how to configure routing rules, see Configuring Routing Rules. For details about how to configure security group rules, see Configuring Security Group Rules.
  • The MRS cluster and the DataArts Studio workspace belong to the same enterprise project. If they do not, you can modify the enterprise project of the workspace.
NOTE:

If an agent is connected to multiple MRS clusters and one of the MRS clusters is deleted or abnormal, connections to the other MRS clusters will be affected. Therefore, you are advised to connect an agent to only one MRS cluster.
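
The region, VPC, subnet, and security group scenarios listed above can be summarized as a decision function. The sketch below is a reading aid, not a connectivity test: the Placement type, its field values, and the returned strings are hypothetical. It does not cover the enterprise project condition, and it reports cases that this section does not describe (such as different VPCs in the same region) as not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Network placement of a cluster (all field values are illustrative)."""
    region: str
    vpc: str
    subnet: str
    security_group: str


def connectivity_requirement(cdm: Placement, mrs: Placement) -> str:
    """Describe what the scenarios above require for the CDM and MRS clusters to communicate."""
    if cdm.region != mrs.region:
        return "public network or dedicated connection required"
    if cdm.vpc != mrs.vpc:
        return "not covered by the scenarios above"
    if cdm.subnet == mrs.subnet and cdm.security_group == mrs.security_group:
        return "reachable by default"
    # Same VPC, but a different subnet or security group.
    return "configure routing rules and security group rules"
```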

KMS encryption key

No

This parameter is mandatory when Connection Type is set to Proxy connection.

KMS key used to encrypt and decrypt data source authentication information. Select a default or custom key.
NOTE:
  • When you use KMS for encryption through DataArts Studio or KPS for the first time, the default key dlf/default or kps/default is automatically generated. For more information about default keys, see What Is a Default Master Key?.
  • Only symmetric keys are supported. Asymmetric keys are not supported.

jobGateway IP

Yes

  • This parameter is mandatory when Connection Type is set to MRS tenant plane connection. The management plane contains two nodes. Therefore, enter two IP addresses and separate them with commas (,). To obtain the IP addresses, perform the following steps:

    Log in to the MRS Manager management plane and choose Cluster > JobGateway. On the Instances tab page, obtain the management IP address of the JobBalancer role.

  • For a managed cluster, enter the service IP address.

Agent

Yes

This parameter is mandatory when Connection Type is set to Proxy connection or MRS tenant plane connection.

MRS is not a fully managed service and cannot be directly connected to DataArts Studio. A CDM cluster can provide an agent for DataArts Studio to communicate with non-fully-managed services. Therefore, you need to select a CDM cluster when creating an MRS data connection. If no CDM cluster is available, create one first by referring to Creating a CDM Cluster.

As a network proxy, the CDM cluster must be able to communicate with the MRS cluster. To ensure network connectivity, the CDM cluster must be in the same region and AZ and use the same VPC and subnet as the MRS cluster. The security group rule must also allow the CDM cluster to communicate with the MRS cluster.

NOTE:
  • If you use the same CDM cluster as the agent for multiple connections to MRS clusters with Kerberos authentication enabled, jobs will fail. You are advised to plan multiple CDM clusters based on service requirements.
  • If a CDM cluster functions as the agent for a data connection in Management Center, the cluster supports a maximum of 200 concurrent active threads. If multiple data connections share an agent, a maximum of 200 SQL, Shell, and Python scripts submitted through the connections can run concurrently. Excess tasks will be queued. You are advised to plan multiple agents based on the workload.
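
The effect of the 200-thread limit on a shared agent can be worked through with simple arithmetic. The sketch below assumes that every submitted script occupies exactly one active thread. The function and connection names are hypothetical.

```python
AGENT_MAX_ACTIVE = 200  # concurrent active threads per agent, as stated above


def running_and_queued(submitted_per_connection: dict[str, int]) -> tuple[int, int]:
    """Return (running, queued) script counts for connections that share one agent."""
    total = sum(submitted_per_connection.values())
    running = min(total, AGENT_MAX_ACTIVE)
    return running, total - running
```

For example, two connections that submit 150 and 120 scripts at the same time through one agent give 200 running scripts and 70 queued scripts, which is the situation that planning multiple agents avoids.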

Data Source Authentication and Other Function Configuration

Authentication Method

Yes

This parameter is mandatory when Connection String Mode is selected for Manual.

It specifies the authentication method used for accessing the MRS cluster. The following options are available:
  • SIMPLE: for non-security mode
  • KERBEROS: for security mode

Username

Yes

Human-machine user of the MRS cluster. This parameter is mandatory when Connection Type is set to Proxy connection. If a new MRS user is used for connection, you need to log in to Manager and change the initial password.

To create a data connection for an MRS security cluster, do not use user admin. The admin user is the default management page user and cannot be used as the authentication user of the security cluster. You can create an MRS user whose password never expires by referring to Creating a Kerberos Authentication User for an MRS Security Cluster. When creating an MRS data connection, set Username and Password to the new MRS username and password.
NOTE:
  • For clusters of MRS 3.1.0 or later, the user must at least have permissions of the Manager_viewer role to create data connections in Management Center. To perform database, table, and data operations on components, the user must also have user group permissions of the components.
  • For clusters earlier than MRS 3.1.0, the user must have permissions of the Manager_administrator or System_administrator role to create data connections in Management Center.
  • A user with only the Manager_tenant or Manager_auditor permission cannot create connections.
  • You are advised to set a user password that never expires to prevent connection failures and service loss caused by password expiration.
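
The role requirements in the preceding note can be expressed as a check. This sketch makes one assumption that the note does not state explicitly: on MRS 3.1.0 or later, the Manager_administrator and System_administrator roles are treated as including the Manager_viewer permissions. The function name is hypothetical.

```python
ADMIN_ROLES = {"Manager_administrator", "System_administrator"}


def can_create_connection(mrs_version: tuple[int, int, int], roles: set[str]) -> bool:
    """Return True if the roles allow creating data connections in Management Center."""
    if mrs_version >= (3, 1, 0):
        # Assumption: administrator roles also satisfy the Manager_viewer minimum.
        return bool(roles & (ADMIN_ROLES | {"Manager_viewer"}))
    return bool(roles & ADMIN_ROLES)
```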

Password

Yes

The password for accessing the MRS cluster. This parameter is mandatory when Connection Type is set to Proxy connection.

Enable ldap

No

This parameter is available when Connection Type is set to Proxy connection.

If LDAP authentication is enabled for an external LDAP server connected to MRS Hive, the LDAP username and password are required for authenticating the connection to MRS Hive. In this case, this option must be enabled. Otherwise, the connection will fail.

ldapUsername

Yes

This parameter is mandatory when Enable ldap is enabled.

Enter the username configured when LDAP authentication was enabled for MRS Hive.

ldapPassword

Yes

This parameter is mandatory when Enable ldap is enabled.

Enter the password configured when LDAP authentication was enabled for MRS Hive.

MRS Authentication Type

Yes

This parameter is mandatory when Connection Type is set to MRS tenant plane connection.

  • iam: A public agent or IAM account is used for authentication. If no public scheduling identity has been configured, the job execution user is used for authentication.
  • keytab: A username and a Keytab file are used for MRS authentication.
NOTE:

The token used for the IAM mode is valid for 24 hours. You need to periodically update the token by editing the data connection (save the settings without modification). The IAM mode is recommended only for test and verification. The keytab mode is recommended for scheduling tasks in the production environment.

Username

Yes

This parameter is mandatory when Connection Type is set to MRS tenant plane connection and MRS Authentication Type is set to keytab.

If you select an MRS cluster that uses Kerberos authentication, you cannot enter admin for this parameter. The username is the human-machine user or machine-machine user of the MRS cluster. This parameter is mandatory if the MRS tenant plane is used for connection.

To create a data connection for an MRS security cluster, do not use user admin. The admin user is the default management page user and cannot be used as the authentication user of the security cluster. You can create an MRS user whose password never expires by referring to Creating a Kerberos Authentication User for an MRS Security Cluster. When creating an MRS data connection, set Username to the new MRS username.

Keytab File

Yes

This parameter is mandatory when Connection Type is set to MRS tenant plane connection and MRS Authentication Type is set to keytab.

Click Select. In the displayed Driver File dialog box, select a .keytab file and click OK.

If no Keytab file is available, obtain one and upload it first. To obtain a Keytab file, perform the following operations:

Log in to the MRS console, click the name of a cluster on the Active Clusters page, and click Download Authentication Credential to download a Keytab file. Then decompress the downloaded package and upload the Keytab file to the system.

NOTE:

You can upload a Keytab file only if you have at least one of the following permissions:

  • DAYU Administrator or Tenant Administrator permissions
  • Workspace administrator permissions
  • Development and production custom role with the permissions to operate RDS driver packages

Real-time Metadata Synchronization

Yes

When real-time metadata synchronization is enabled, the metadata of the connected MRS cluster is synchronized to the Data Map component in real time. You are advised to enable this function.

NOTE:
  • MRS 3.3.0 and later versions and MRS 3.1.0.0.8 and later patch versions support real-time metadata synchronization. The real-time metadata synchronization function needs to be manually enabled in the MRS cluster and bound to an agent with the DAYU User permissions for metadata synchronization authentication. For details, see Configuring Real-Time Metadata Synchronization for an MRS Cluster.
  • Whether the real-time metadata synchronization function is enabled depends on whether this function is enabled for the latest connection to the same MRS cluster in the DataArts Studio instance. That is, if real-time synchronization is disabled or enabled for an MRS connection, this function is also disabled or enabled for all connections to the same MRS cluster.

    For example, if there are two MRS connections (connected to the same MRS cluster) in the same or different workspaces of a DataArts Studio instance, and real-time metadata synchronization is enabled for the connection that is created first and disabled for the connection that is created later, then real-time metadata synchronization is disabled for the MRS cluster. If real-time metadata synchronization is disabled for the connection that is created first and enabled for the connection that is created later, then real-time metadata synchronization is enabled for the MRS cluster.

Metadata Collection Scope

No

Databases and data tables whose metadata will be synchronized in real time. If this parameter is not set, all metadata will be synchronized.

The value can be in either of the following formats:

  • database_name: databases whose names contain database_name
  • database_name.table_name: databases whose names contain database_name and data tables whose names contain table_name

Examples:

  • If you enter datatest, the metadata of the tables in the databases whose names contain datatest will be synchronized in real time.
  • If you enter datatest.table1, the metadata of the tables whose names contain table1 in the databases whose names contain datatest will be synchronized in real time.
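
The "contains" matching described above can be sketched as follows. This is an illustration under stated assumptions, not the service's implementation: matching is treated as a case-sensitive substring test, only a single scope value is handled, and the function name is hypothetical.

```python
def in_collection_scope(scope: str, database: str, table: str) -> bool:
    """Return True if database.table falls inside a Metadata Collection Scope value."""
    if not scope:
        return True  # scope not set: all metadata is synchronized
    db_part, _, table_part = scope.partition(".")
    if db_part not in database:
        return False
    # Without a table part, every table in a matching database is in scope.
    return table_part in table if table_part else True
```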

OBS storage support

No

This parameter is displayed when DataArts Migration is selected for Applicable Modules.

The server must support OBS storage. When creating a Hive table, you can store the table in OBS.

Use Agency

No

This parameter is displayed when DataArts Migration is selected for Applicable Modules.

If you enable the agency function, you can create a data connection without having a permanent AK/SK and execute CDM jobs using the scheduling identity configured in DataArts Factory.

Public agency

No

This parameter is displayed when DataArts Migration is selected for Applicable Modules and Use Agency is enabled.

The agency is only used to check whether the connection agency function is normal. CDM jobs will be executed using the scheduling identity configured in DataArts Factory.

AK

N/A

This parameter is displayed when DataArts Migration is selected for Applicable Modules and OBS storage support is enabled.

AK and SK are used to log in to the OBS server.

You need to create an access key for the current account and obtain an AK/SK pair.

To obtain an access key, perform the following steps:
  1. Log in to the management console, move the cursor to the username in the upper right corner, and select My Credentials from the drop-down list.
  2. On the My Credentials page, choose Access Keys, and click Create Access Key. See Figure 1.
    Figure 1 Clicking Create Access Key
  3. Click OK and save the access key file as prompted. The access key file will be saved to your browser's configured download location. Open the credentials.csv file to view Access Key Id and Secret Access Key.
    NOTE:
    • Only two access keys can be added for each user.
    • To ensure access key security, the access key file is automatically downloaded only when the access key is generated for the first time and cannot be obtained from the management console later. Keep the file secure.
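
Once credentials.csv has been saved, the AK/SK pair can be read from it programmatically instead of being copied by hand. The sketch below assumes that the file has a header row with the column names Access Key Id and Secret Access Key, as named in step 3. The function name is hypothetical.

```python
import csv
import io


def read_access_key(csv_text: str) -> tuple[str, str]:
    """Return (AK, SK) from the first data row of credentials.csv content."""
    reader = csv.DictReader(io.StringIO(csv_text))
    row = next(reader)  # first data row after the header
    # Assumption: the header uses exactly these column names.
    return row["Access Key Id"].strip(), row["Secret Access Key"].strip()
```

For a downloaded file, the text could be read with open("credentials.csv", encoding="utf-8-sig").read(), which also tolerates a byte order mark if the file has one. Avoid printing or logging the returned SK.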

SK

N/A

DataArts Migration Configuration

HIVE Version

HIVE_3_X

This parameter is displayed when DataArts Migration is selected for Applicable Modules.

Hive version. Set it to the Hive version on the server.

NOTE:

Select HIVE_3_X for connections to Hive servers of version 3.x, and select HIVE_2_X for connections to Hive servers of version 2.x. If you select an unmatched connection version, the connection test may succeed, but the queried database table may be empty, or the job may fail.
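
The selection rule in the note can be sketched as a lookup on the major version of the Hive server. The function name is hypothetical, and versions other than 2.x and 3.x raise an error because this section names no option for them.

```python
def hive_version_option(server_version: str) -> str:
    """Map a Hive server version such as "3.1.0" to the HIVE Version option."""
    major = server_version.strip().split(".")[0]
    options = {"2": "HIVE_2_X", "3": "HIVE_3_X"}
    if major not in options:
        raise ValueError(f"no HIVE Version option documented for Hive {server_version!r}")
    return options[major]
```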

Process Type

EMBEDDED

This parameter is used only when the Hive version is HIVE_3_X. Possible values are:
  • EMBEDDED: The link instance runs with CDM. This mode delivers better performance.
  • STANDALONE: The link instance runs in an independent process. If CDM needs to connect to multiple Hadoop data sources (MRS, Hadoop, or CloudTable) with both Kerberos and Simple authentication modes, only STANDALONE can be used.
    NOTE:

    The STANDALONE mode is used to solve the version conflict problem. If the connector versions of the source and destination ends of the same link are different, a JAR file conflict occurs. In this case, you need to place the source or destination end in the STANDALONE process to prevent the migration failure caused by the conflict.

Check Hive JDBC Connectivity

No

This parameter is displayed when DataArts Migration is selected for Applicable Modules.

Whether to check the Hive JDBC connectivity.

Creating a Kerberos Authentication User for an MRS Security Cluster

To create a data connection for an MRS security cluster, do not use user admin. The admin user is the default management page user and cannot be used as the authentication user of the security cluster. To create an MRS user, perform the following steps:

For clusters of MRS 3.x:

  1. Log in to MRS Manager as user admin.
  2. Choose System > Permission > Security Policy > Password Policy. Click Add Password Policy and add a policy under which the password never expires.
    • Set Password Policy Name to neverexp.
    • Set Password Validity Period (Days) to 0, indicating that the password never expires.
    • Set Password Expiration Notification (Days) to 0.
    • Retain the default values for other parameters.
  3. Choose System > Permission > User. On the page displayed, click Create to add a dedicated human-machine user as the Kerberos authentication user and set the password policy to neverexp. Select the user group superGroup for the user, and assign all roles to the user.
    • For clusters of MRS 3.1.0 or later, the user must at least have permissions of the Manager_viewer role to create data connections in Management Center. To perform database, table, and data operations on components, the user must also have user group permissions of the components.
    • For clusters earlier than MRS 3.1.0, the user must have permissions of the Manager_administrator or System_administrator role to create data connections in Management Center.
    • A user with only the Manager_tenant or Manager_auditor permission cannot create connections.
  4. Log in to Manager as the new user and change the initial password. Otherwise, the connection fails to be created.
  5. Synchronize IAM users.
    1. Log in to the MRS console.
    2. Choose Clusters > Active Clusters, select a running cluster, and click its name to go to its details page.
    3. In the Basic Information area of the Dashboard page, click Synchronize on the right side of IAM User Sync to synchronize IAM users.
      • If the status is Synchronized, skip this step.
      • When the policy of the user group to which the IAM user belongs changes from MRS ReadOnlyAccess to MRS CommonOperations, MRS FullAccess, or MRS Administrator, wait for 5 minutes until the new policy takes effect after the synchronization is complete because the SSSD (System Security Services Daemon) cache of cluster nodes needs time to be updated. Then, submit a job. Otherwise, the job may fail to be submitted.
      • When the policy of the user group to which the IAM user belongs changes from MRS CommonOperations, MRS FullAccess, or MRS Administrator to MRS ReadOnlyAccess, wait for 5 minutes until the new policy takes effect after the synchronization is complete because the SSSD cache of cluster nodes needs time to be updated.

For clusters of MRS 2.x or earlier:

  1. Log in to MRS Manager as user admin.
  2. On FusionInsight Manager, choose System Settings and click Configure Password Policy to modify the password policy.
    • Set Password Validity Period (Days) to 0, indicating that the password never expires.
    • Set Password Expiration Notification (Days) to 0.
    • Retain the default values for other parameters.
  3. Choose System > Manage User. On the page displayed, add a dedicated human-machine user as the Kerberos authentication user. Select the user group superGroup for the user, and assign all roles to the user.
    • For clusters of MRS 2.x or earlier, the user must have permissions of the Manager_administrator or System_administrator role to create data connections in Management Center.
    • A user with only the Manager_tenant or Manager_auditor permission cannot create connections.
  4. Log in to MRS Manager as the new user and change the initial password. Otherwise, the connection fails to be created.
  5. Synchronize IAM users.
    1. Log in to the MRS console.
    2. Choose Clusters > Active Clusters, select a running cluster, and click its name to go to its details page.
    3. In the Basic Information area of the Dashboard page, click Synchronize on the right side of IAM User Sync to synchronize IAM users.
      • If the status is Synchronized, skip this step.
      • When the policy of the user group to which the IAM user belongs changes from MRS ReadOnlyAccess to MRS CommonOperations, MRS FullAccess, or MRS Administrator, wait for 5 minutes until the new policy takes effect after the synchronization is complete because the SSSD (System Security Services Daemon) cache of cluster nodes needs time to be updated. Then, submit a job. Otherwise, the job may fail to be submitted.
      • When the policy of the user group to which the IAM user belongs changes from MRS CommonOperations, MRS FullAccess, or MRS Administrator to MRS ReadOnlyAccess, wait for 5 minutes until the new policy takes effect after the synchronization is complete because the SSSD cache of cluster nodes needs time to be updated.

Configuring Real-Time Metadata Synchronization for an MRS Cluster

The real-time metadata synchronization function needs to be manually enabled in the MRS cluster and bound to an agent with the DAYU User permissions for metadata synchronization authentication.

Enable real-time metadata synchronization.

  1. Log in to MRS Manager as user admin.
  2. Choose Cluster > Services > Hive > Configuration > All Configurations, enter the parameter names in Table 2 in the search box, and set the parameters.

    Table 2 Configuration parameters

    Parameter

    Value

    Purpose

    Hive->MetaStore

    hive.metastore.customized.configs

    • Name: com.huawei.cloud.dataarts.endpoint
    • Value: Endpoint of the Data Map component of DataArts Studio

      An endpoint is the request address for calling an API. Endpoints vary depending on services and regions. You can obtain the endpoints of the service from Regions and Endpoints.

    (Mandatory) Enabling real-time metadata synchronization

    • Name: hive.metastore.event.listeners
    • Value: com.huawei.cloud.dii.catalog.agent.listener.MrsMetaStoreEventListener

    Hive->HiveServer

    hive.stats.autogather

    true

    (Optional) Enabling metadata statistics

    hive.security.authorization.sqlstd.confwhitelist

    Add ,hive.exec.pre.hooks to the end of the original value.

    (Optional) Enabling the last access time

    hive.server.customized.configs

    • Name: hive.exec.pre.hooks
    • Value: org.apache.hadoop.hive.ql.security.authorization.plugin.DisallowTransformHook,org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec
    Figure 2 Configuring the hive.metastore.customized.configs parameters

  3. After configuring all parameters in Table 2, click Save in the upper left corner and OK in the displayed dialog box.

    Figure 3 Saving the configuration

  4. After saving the configuration, switch to the Instances tab page, select the instances whose configuration has expired, click More, and select Instance Rolling Restart to make the configuration take effect.

    Figure 4 Performing a rolling instance restart
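
The whitelist change in Table 2 ("Add ,hive.exec.pre.hooks to the end of the original value") is made by hand in MRS Manager. The sketch below only illustrates the string edit and how to avoid applying it twice. It treats the existing value as an opaque string, and both the function name and the sample value in the usage note are hypothetical.

```python
def append_pre_hooks(original: str, suffix: str = ",hive.exec.pre.hooks") -> str:
    """Append the suffix from Table 2 to the original whitelist value, once."""
    if original.endswith(suffix):
        return original  # already applied; avoid appending it again
    return original + suffix
```

For a placeholder value such as "a,b", the function returns "a,b,hive.exec.pre.hooks", and calling it again on that result leaves the value unchanged.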

Authorize and bind an agency.

  1. Log in to the IAM console.
  2. Choose Agencies. In the agency list, locate the preset MRS_ECS_DEFAULT_AGENCY agency and click Authorize.

    If the preset MRS_ECS_DEFAULT_AGENCY agency is not found, you can buy an MRS cluster and select the MRS_ECS_DEFAULT_AGENCY agency in advanced settings. When the MRS cluster creation starts, the MRS_ECS_DEFAULT_AGENCY agency is automatically generated.

    Figure 5 Authorizing an agency

  3. On the authorization page, enter DAYU in the search box and select DAYU User.

    Figure 6 Selecting the permission

  4. After selecting the permission, click Next to set the authorization scope. In this example, retain the default settings and click OK to complete the authorization.
  5. On the MRS management console, choose Clusters > Active Clusters. Click the name of the target cluster to go to the cluster details page.
  6. On the Dashboard page, locate the O&M Management area and check that the cluster has been bound to the MRS_ECS_DEFAULT_AGENCY agency. If the cluster is not bound to the MRS_ECS_DEFAULT_AGENCY agency, you need to manually select the MRS_ECS_DEFAULT_AGENCY agency.

    Figure 7 Binding an agency