Updated on 2023-07-26 GMT+08:00

Connecting to an MRS Hive Data Source

Overview

ROMA Connect can use MRS Hive as a data source for data integration tasks. Before using the MRS Hive data source, you need to connect it to ROMA Connect.

If two data integration tasks use MRS data sources (MRS Hive, MRS HDFS, or MRS HBase) of different versions and Kerberos authentication is enabled for those data sources, do not run the two tasks at the same time. Otherwise, the integration tasks fail.
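
The restriction above can be checked before scheduling tasks. The following is a minimal sketch, not a ROMA Connect API: the MrsSource record, its field names, and the must_not_overlap helper are hypothetical, and the sketch assumes the rule applies when Kerberos is enabled on both data sources.

```python
from dataclasses import dataclass

# MRS data source types named in the restriction.
MRS_KINDS = {"hive", "hdfs", "hbase"}


@dataclass(frozen=True)
class MrsSource:
    """Hypothetical description of the data source used by one task."""
    kind: str         # "hive", "hdfs", or "hbase"
    mrs_version: str  # MRS cluster version, for example "2.1.0"
    kerberos: bool    # True if Kerberos authentication is enabled


def must_not_overlap(a: MrsSource, b: MrsSource) -> bool:
    """Return True if tasks using sources a and b should not run concurrently.

    Assumption: the rule applies only when both sources are MRS sources,
    both have Kerberos enabled, and their MRS versions differ.
    """
    both_mrs = a.kind in MRS_KINDS and b.kind in MRS_KINDS
    return both_mrs and a.kerberos and b.kerberos and a.mrs_version != b.mrs_version
```

If the helper returns True for a pair of tasks, give them schedules that do not overlap.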

Prerequisites

  • Each connected data source must belong to an integration application. Before connecting a data source, ensure that an integration application is available. Otherwise, create an integration application first.
  • The execution permission has been configured for the machine-to-machine user. For details, see "Hive Application Development > Environment Preparation > Preparing a Development User" in the MapReduce Service Developer Guide.

Procedure

  1. Log in to the ROMA Connect console. On the Instances page, click View Console next to a specific instance.
  2. In the navigation pane on the left, choose Data Sources. In the upper right corner of the page, click Access Data Source.
  3. On the Default tab page, select MRS Hive and click Next.
  4. Configure the data source connection information.
    Table 1 Data source connection information

    Parameter

    Description

    Name

    Enter a data source name. Use a name that follows your naming rules so that the data source is easy to find.

    Integration Application

    Select the integration application to which the data source belongs.

    Description

    Enter a description of the data source.

    HDFS URL

    • If the root directory is used, set this parameter to hdfs:///hacluster. This requires administrator rights.
    • If a planned directory is used, set this parameter to the planned directory.
    • If a user database directory such as /user/hive/testdb is used, the user must have permission on that directory.

    Machine-to-machine Username

    Enter the machine-to-machine username for connecting to MRS Hive.

    Configuration File

    Click Upload File to upload the MRS Hive configuration file (a .zip package). For details, see Obtaining the MRS Hive Configuration File.

    Obtaining the MRS Hive Configuration File

    1. Obtain the krb5.conf and user.keytab files.

      Download the user authentication file from MRS Manager by following the procedure described in "Downloading a User Authentication File" in the MapReduce Service User Guide, and decompress the file to obtain the krb5.conf and user.keytab files.

    2. Obtain the hiveclient.properties, core-site.xml, hdfs-site.xml, and hosts files.

      Download the client configuration file from the MRS console by following the procedure described in "Downloading a Client Configuration File > Using Hive from Scratch" in the MapReduce Service User Guide. After the file is decompressed:

      • Obtain the hosts file from xxx_Services_ClientConfig_ConfigFiles.
      • Obtain the hiveclient.properties file from xxx_Services_ClientConfig_ConfigFiles > Hive > config.
      • Obtain the core-site.xml and hdfs-site.xml files from xxx_Services_ClientConfig_ConfigFiles > HDFS > config.
    3. Create a version file.

      Create a text file named Version without an extension, and add version=MRS 2.1.0 to the file.

    4. Generate the MRS Hive configuration file.

      Save the obtained files to a new directory and compress them into a .zip package. All files must be in the root directory of the .zip package, not in a subdirectory.

      • The file name can contain a maximum of 255 characters and can include only letters and digits.
      • The file size cannot exceed 2 MB.
  5. Click Check Connectivity to check the connectivity between ROMA Connect and the data source.
    • If the test result is Data source connected successfully, go to the next step.
    • If the test result is Failed to connect to the data source, check the data source status and connection parameters, and then click Recheck. Repeat until the connection is successful.
  6. Click Create.
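
The packaging described in Obtaining the MRS Hive Configuration File can also be scripted. The following Python sketch is illustrative only and uses just the standard library: the function name build_hive_config_zip is hypothetical, and the sketch assumes that the seven files listed in that procedure have already been copied into one directory, that the naming rule applies to the package name without its .zip extension, and that 2 MB means 2 x 1024 x 1024 bytes.

```python
import re
import zipfile
from pathlib import Path

# Files named in "Obtaining the MRS Hive Configuration File".
REQUIRED_FILES = [
    "krb5.conf", "user.keytab", "hiveclient.properties",
    "core-site.xml", "hdfs-site.xml", "hosts", "Version",
]
VERSION_LINE = "version=MRS 2.1.0"  # content of the Version file, as given in the procedure
MAX_BYTES = 2 * 1024 * 1024         # assumption: 2 MB = 2 x 1024 x 1024 bytes
MAX_NAME_LEN = 255


def build_hive_config_zip(src_dir: str, zip_path: str) -> Path:
    """Package the collected files into a .zip with every file at the archive root."""
    src, out = Path(src_dir), Path(zip_path)

    # Assumption: the "letters and digits only" rule applies to the name without ".zip".
    if out.suffix != ".zip" or not re.fullmatch(r"[A-Za-z0-9]+", out.stem):
        raise ValueError("package name must be letters and digits plus .zip")
    if len(out.name) > MAX_NAME_LEN:
        raise ValueError("package name exceeds 255 characters")

    # Create the extension-less Version file if it does not exist yet.
    version_file = src / "Version"
    if not version_file.exists():
        version_file.write_text(VERSION_LINE + "\n", encoding="utf-8")

    missing = [name for name in REQUIRED_FILES if not (src / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing files: {missing}")

    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in REQUIRED_FILES:
            # arcname without a directory part keeps the file at the archive root.
            zf.write(src / name, arcname=name)

    if out.stat().st_size > MAX_BYTES:
        out.unlink()
        raise ValueError("package exceeds 2 MB")
    return out
```

For example, build_hive_config_zip("hive_conf", "hiveconf.zip") would produce a package to upload in the Configuration File field, where hive_conf is a hypothetical directory that holds the collected files.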