Updated on 2025-07-21 GMT+08:00

Apache Hive Connection Parameters (Internal Test)

Table 1 Apache Hive connection

Parameter

Mandatory

Description

Data Connection Type

Yes

The value is fixed at Apache Hive.

Name

Yes

Name of the data connection to create. Data connection names can contain a maximum of 100 characters. They can contain only letters, digits, underscores (_), and hyphens (-).
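
The naming rule above can be checked before a connection is created. The following is a minimal sketch, not part of the product: the function name is hypothetical, and it assumes that "letters" means ASCII letters.

```python
import re

# Assumption: "letters" means ASCII letters. The pattern allows 1 to 100
# characters, each a letter, digit, underscore, or hyphen.
_CONNECTION_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,100}")


def is_valid_connection_name(name: str) -> bool:
    """Return True if name satisfies the documented naming rule."""
    return _CONNECTION_NAME_RE.fullmatch(name) is not None
```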

Description

No

A description that helps identify the data connection. It can contain a maximum of 100 characters.

Tag

No

Attribute of the data connection to create. Tags make management easier.
NOTE:

A tag name can contain only letters, digits, and underscores (_). It cannot start with an underscore (_) or contain more than 100 characters.
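
Tag names follow a slightly different rule from connection names: hyphens are not allowed, and the first character cannot be an underscore. The following is a minimal sketch of that check. The function name is hypothetical, and ASCII letters are assumed.

```python
import re

# Assumption: "letters" means ASCII letters. The first character is a letter
# or digit (not an underscore), followed by up to 99 letters, digits, or
# underscores, for a total of at most 100 characters.
_TAG_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_]{0,99}")


def is_valid_tag_name(tag: str) -> bool:
    """Return True if tag satisfies the documented tag naming rule."""
    return _TAG_NAME_RE.fullmatch(tag) is not None
```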

Applicable Modules

Yes

Select the modules for which this connection is available.

NOTE:
  • When offline or real-time data migration jobs are enabled, you can select the DataArts Migration module. Then you can select this data connection when creating a data migration job in DataArts Factory.
  • You can use offline or real-time data migration jobs only after you apply for the whitelist membership. To use this feature, contact customer service or technical support.

Basic and Network Connectivity Configuration

Use Cluster Config

Yes

Select a cluster configuration you have created.

You can use the cluster configuration to simplify parameter settings for the Hadoop connection.

URI

Yes

This parameter is mandatory when Use Cluster Config is disabled.

NameNode URI, for example, hdfs://nn1_example.com/

Hive Metastore

Yes

This parameter is mandatory when Use Cluster Config is disabled.

Hive metadata address. For details, see the hive.metastore.uris configuration item. Example: thrift://host-192-168-1-212:9083
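
A value in this format can be sanity-checked before it is entered. The following is a minimal sketch with a hypothetical helper name. In Hive, hive.metastore.uris may hold a comma-separated list of thrift URIs. Whether this connection field accepts more than one URI is an assumption.

```python
from urllib.parse import urlsplit


def parse_metastore_uris(value: str) -> list[tuple[str, int]]:
    """Split a hive.metastore.uris style value into (host, port) pairs."""
    endpoints = []
    # Assumption: several URIs may be given, separated by commas.
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        parts = urlsplit(item)
        if parts.scheme != "thrift" or parts.hostname is None or parts.port is None:
            raise ValueError(f"not a thrift://host:port URI: {item!r}")
        endpoints.append((parts.hostname, parts.port))
    return endpoints
```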

IP and Host Name Mapping

No

This parameter is mandatory when Use Cluster Config is disabled.

If the Hadoop configuration file uses host names, configure the mapping between IP addresses and host names. Separate each IP address from its host name with a space, and separate mappings with semicolons (;), carriage returns, or line feeds.
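
The mapping format described above can be parsed as follows. This is a minimal sketch with a hypothetical helper name. It assumes each mapping is written as the IP address first and the host name second, and that each mapping contains exactly one of each.

```python
import ipaddress
import re


def parse_host_mappings(text: str) -> dict[str, str]:
    """Parse 'IP host' mappings into a {host name: IP address} dictionary."""
    mappings = {}
    # Mappings are separated by semicolons, carriage returns, or line feeds.
    for entry in re.split(r"[;\r\n]+", text):
        entry = entry.strip()
        if not entry:
            continue
        fields = entry.split()
        if len(fields) != 2:
            raise ValueError(f"expected 'IP host', got {entry!r}")
        ip, host = fields
        ipaddress.ip_address(ip)  # raises ValueError if ip is not a valid address
        mappings[host] = ip
    return mappings
```

For example, the text `192.168.1.212 host-192-168-1-212;192.168.1.213 host-192-168-1-213` yields two entries keyed by host name.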

KMS Key

No

This parameter is mandatory when Use Cluster Config is enabled.

KMS key used to encrypt and decrypt data source authentication information. Select a default or custom key.
NOTE:
  • When you use KMS for encryption through DataArts Studio or KPS for the first time, the default key dlf/default or kps/default is automatically generated. For more information about default keys, see What Is a Default Master Key?.
  • Only symmetric keys are supported. Asymmetric keys are not supported.

Agent

Yes

This parameter is mandatory when Use Cluster Config is enabled.

DataArts Studio cannot directly connect to non-fully managed services. An agent is required for DataArts Studio to communicate with non-fully managed services. A CDM cluster can function as an agent. If no CDM cluster is available, create one by referring to Creating a CDM Cluster.

DataArts Migration Configuration

HIVE Version

HIVE_3_X

This parameter is displayed when DataArts Migration is selected for Applicable Modules.

Hive version. Set it to the Hive version on the server.

NOTE:

Select HIVE_3_X for connections to Hive servers of version 3.x, and select HIVE_2_X for connections to Hive servers of version 2.x. If the selected version does not match the server version, the connection test may still succeed, but the queried database table may be empty, or the job may fail.
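
The selection rule in this note can be expressed as a small lookup. The following is a minimal sketch with a hypothetical function name. It assumes the server version is available as a dotted string such as 3.1.2.

```python
def hive_version_option(server_version: str) -> str:
    """Map a Hive server version string to the matching HIVE Version option."""
    # Only the major version matters for the choice between the two options.
    major = server_version.strip().split(".", 1)[0]
    options = {"2": "HIVE_2_X", "3": "HIVE_3_X"}
    if major not in options:
        raise ValueError(f"no matching option for Hive version {server_version!r}")
    return options[major]
```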

Hive Properties

hive.storeFormat=textfile

This parameter is displayed when DataArts Migration is selected for Applicable Modules.

(Optional) Click Add to add one or more JDBC connector attributes for the data source. For details, see the JDBC connector documentation of the corresponding database.

The following are some examples:
  • connectTimeout=360000 and socketTimeout=360000: When a large amount of data needs to be migrated or an entire table is retrieved using query statements, the migration may fail due to a connection timeout. In this case, you can customize the connection timeout interval (ms) and socket timeout interval (ms) to prevent such failures.
  • useCursorFetch=false: By default, useCursorFetch is enabled, indicating that the JDBC connector communicates with relational databases using a binary protocol. Some third-party systems may have compatibility issues with this protocol, causing time conversion errors during migration. In this case, you can disable this function. Open-source MySQL databases support the useCursorFetch parameter, so you do not need to set it for them.
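
The attributes above are written as key=value pairs. The following minimal sketch collects such pairs into a dictionary so they can be reviewed before being added one by one. The helper name is hypothetical, and the console itself may validate entries differently.

```python
def parse_properties(lines: list[str]) -> dict[str, str]:
    """Parse 'key=value' strings into a dictionary of attributes."""
    props = {}
    for line in lines:
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key:
            raise ValueError(f"expected key=value, got {line!r}")
        props[key] = value
    return props


# Example values taken from the list above.
example = parse_properties([
    "connectTimeout=360000",
    "socketTimeout=360000",
    "useCursorFetch=false",
])
```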

Hive JDBC URL

No

URL for connecting to Hive JDBC. By default, an anonymous user is used for the connection. To specify a user, add the hadoop.user.name configuration in advanced attributes.

Examples:
  • SIMPLE: jdbc:hive2://example:10000
  • KERBEROS: jdbc:hive2://example:10000;principal=${Principle}
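
The two URL forms differ only in the trailing principal attribute. The following is a minimal sketch of how such a URL could be assembled. The function name and the principal value in the usage note are hypothetical, and the default port 10000 is taken from the example above.

```python
from typing import Optional


def hive_jdbc_url(host: str, port: int = 10000, principal: Optional[str] = None) -> str:
    """Build a Hive JDBC URL; pass principal only for KERBEROS authentication."""
    url = f"jdbc:hive2://{host}:{port}"
    if principal is not None:
        # KERBEROS form: the principal is appended as a ';'-separated attribute.
        url += f";principal={principal}"
    return url
```

Calling `hive_jdbc_url("example")` gives the SIMPLE form, and `hive_jdbc_url("example", principal="hive/_HOST@EXAMPLE.COM")` gives the KERBEROS form with a placeholder principal.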

Data Source Authentication and Other Function Configuration

Authentication Method

Yes

Authentication type:
  • SIMPLE: for non-security mode
  • KERBEROS: for security mode

Enable ldap

No

If LDAP authentication is enabled for an external LDAP server connected to Apache Hive, the LDAP username and password are required for authenticating the connection to Apache Hive. In this case, this option must be enabled. Otherwise, the connection will fail.

ldapUsername

Yes

This parameter is mandatory when Enable ldap is enabled.

Enter the username configured when LDAP authentication was enabled for Apache Hive.

ldapPassword

Yes

This parameter is mandatory when Enable ldap is enabled.

Enter the password configured when LDAP authentication was enabled for Apache Hive.
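
The conditional requirement for the LDAP fields can be sketched as a simple check. This is an illustration only: the dictionary layout and the enableLdap key are assumptions, while ldapUsername and ldapPassword are the parameter names from the table.

```python
def missing_ldap_fields(config: dict) -> list[str]:
    """Return the LDAP fields that are required but empty or absent."""
    # Assumption: a truthy "enableLdap" entry means Enable ldap is turned on.
    if not config.get("enableLdap"):
        return []
    return [key for key in ("ldapUsername", "ldapPassword") if not config.get(key)]
```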