Updated on 2023-07-26 GMT+08:00

Connecting to an FI HDFS Data Source

Overview

ROMA Connect can use FI HDFS as a data source for data integration tasks. Before using the FI HDFS data source, you need to connect it to ROMA Connect.

Prerequisites

Each connected data source must belong to an integration application. Before connecting a data source, ensure that an integration application is available. Otherwise, create an integration application first.

Procedure

  1. Log in to the ROMA Connect console. On the Instances page, click View Console next to a specific instance.
  2. In the navigation pane on the left, choose Data Sources. In the upper right corner of the page, click Access Data Source.
  3. On the Default tab page, select FI HDFS and click Next.
  4. Configure the data source connection information.
    Table 1 Data source connection information

    | Parameter | Description |
    | --- | --- |
    | Name | Enter a data source name. Use a consistent naming convention so that the data source is easy to find. |
    | Integration Application | Select the integration application to which the data source belongs. |
    | Description | Enter a description of the data source. |
    | HDFS URL | If the root directory is used, set this parameter to hdfs:///hacluster. This operation requires administrator permissions. If a planned directory is used, set this parameter to the planned directory. If a user database directory is used, for example, /user/hdfs/testdb, the user must have permissions on that directory. |
    | Machine-machine Username | Enter the machine-machine username used for authentication, for example, eip_fdi_hdfs. |
    | Configuration File | Click Upload File to upload the FI HDFS configuration file. For details, see Obtaining the FusionInsight HDFS Configuration File. |

    Obtaining the FusionInsight HDFS Configuration File

    1. Obtain the krb5.conf and user.keytab files.

      Download the user authentication file from MRS Manager by following the procedure described in "Downloading a User Authentication File" in the MapReduce Service User Guide, and decompress the file to obtain the krb5.conf and user.keytab files.

    2. Obtain the core-site.xml, hdfs-site.xml, and hosts files.

      Download the client configuration file from the MRS console by following the procedure described in Updating a Client Configuration File > Using Hive from Scratch in the MapReduce Service User Guide. After the file is decompressed:

      • Obtain the hosts file from xxx_Services_ClientConfig_ConfigFiles.
      • Obtain the core-site.xml and hdfs-site.xml files from xxx_Services_ClientConfig_ConfigFiles > HDFS > config.
    3. Generate the FI HDFS configuration file.

      Save the obtained files to a new directory and compress them into a .zip package. All files must be in the root directory of the .zip package, not in a subfolder.

      • The file name can contain a maximum of 255 characters and only letters and digits.
      • The file size cannot exceed 2 MB.
  5. Click Check Connectivity to check the connectivity between ROMA Connect and the data source.
    • If the test result is Data source connected successfully, go to the next step.
    • If the test result is Failed to connect to the data source, check the data source status and connection parameters, and click Recheck until the connection is successful.
  6. Click Create.
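
The packaging step in Obtaining the FusionInsight HDFS Configuration File can also be scripted. The following is a minimal Python sketch, not an official ROMA Connect tool. It assumes the five files named in that section have already been copied into one directory. The function name build_config_package and the constants are hypothetical. The sketch also assumes that the 255-character, letters-and-digits rule applies to the package name without its .zip extension, which the procedure does not state explicitly.

```python
import re
import zipfile
from pathlib import Path

# Files listed in "Obtaining the FusionInsight HDFS Configuration File".
REQUIRED_FILES = ("krb5.conf", "user.keytab", "core-site.xml", "hdfs-site.xml", "hosts")
MAX_PACKAGE_BYTES = 2 * 1024 * 1024  # 2 MB upload limit from the procedure


def build_config_package(source_dir, package_path):
    """Zip the required files into the root of package_path and check the limits."""
    source = Path(source_dir)
    package = Path(package_path)

    if package.suffix != ".zip":
        raise ValueError("the package must be a .zip file")
    # Assumption: the naming rule is checked against the name without ".zip".
    if not re.fullmatch(r"[A-Za-z0-9]{1,255}", package.stem):
        raise ValueError("the package name must be 1 to 255 letters or digits")

    missing = [name for name in REQUIRED_FILES if not (source / name).is_file()]
    if missing:
        raise FileNotFoundError("missing files: " + ", ".join(missing))

    with zipfile.ZipFile(package, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name in REQUIRED_FILES:
            # An arcname without a directory part keeps the file in the zip root.
            archive.write(source / name, arcname=name)

    if package.stat().st_size > MAX_PACKAGE_BYTES:
        package.unlink()
        raise ValueError("the package exceeds 2 MB")
    return package
```

For example, build_config_package("fihdfs_files", "fihdfsconfig.zip") writes the package that is then uploaded in the Configuration File parameter. Both names in this call are placeholders.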